HEPcrawl
HEPcrawl is a harvesting library based on Scrapy (http://scrapy.org) for INSPIRE-HEP (http://inspirehep.net). It focuses on automatic and semi-automatic retrieval of new content from all the sources the site aggregates, in particular from major and minor publishers in the field of High-Energy Physics.
The project is currently in an early stage of development.
See the full documentation at http://pythonhosted.org/hepcrawl
User’s Guide
This part of the documentation shows you how to get started with HEPcrawl.
API Reference
If you are looking for information on a specific function, class or method, this part of the documentation is for you.
Additional Notes
Notes on how to contribute, legal information, and the list of changes are here for those interested.
Happy crawling!