HEPcrawl

HEPcrawl is a harvesting library based on Scrapy (http://scrapy.org) for INSPIRE-HEP (http://inspirehep.net). It focuses on the automatic and semi-automatic retrieval of new content from all the sources the site aggregates, in particular content from major and minor publishers in the field of High-Energy Physics.

The project is currently in an early stage of development.

See the full documentation at http://pythonhosted.org/hepcrawl

User’s Guide

This part of the documentation shows you how to get started with HEPcrawl.

API Reference

If you are looking for information on a specific function, class or method, this part of the documentation is for you.

API
- Items
- Spiders

Additional Notes

Notes on how to contribute, legal information, and the list of changes are collected here for interested readers.

Happy crawling!

INSPIRE Development Team

Email: feedback@inspirehep.net
Twitter: http://twitter.com/inspirehep
GitHub: http://github.com/inspirehep
URL: http://inspirehep.net
