inspire-crawler
Crawler integration with INSPIRE-HEP using the Scrapy project HEPCrawl.
This module schedules crawler jobs on a Scrapyd instance serving a Scrapy project. By default, that Scrapy project is HEPCrawl.
It integrates directly with the invenio-workflows module to create a workflow for every record harvested by the crawler.
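To illustrate what scheduling a job on a Scrapyd instance involves, here is a minimal sketch that talks to Scrapyd's `schedule.json` HTTP endpoint directly, using only the Python standard library. It is not inspire-crawler's own API. The helper names, the Scrapyd URL, the project name `hepcrawl`, the spider name `example_spider` and the `source_file` argument are all assumptions made for illustration.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen


def build_schedule_request(scrapyd_url, project, spider, **spider_args):
    """Build a POST request for Scrapyd's schedule.json endpoint.

    Hypothetical helper: extra keyword arguments are sent along as
    form fields, which Scrapyd passes on to the spider.
    """
    params = {"project": project, "spider": spider}
    params.update(spider_args)
    data = urlencode(params).encode("utf-8")
    url = scrapyd_url.rstrip("/") + "/schedule.json"
    return Request(url, data=data, method="POST")


def schedule_job(scrapyd_url, project, spider, **spider_args):
    """Send the request to Scrapyd and return the id of the new job."""
    request = build_schedule_request(scrapyd_url, project, spider, **spider_args)
    with urlopen(request) as response:
        payload = json.load(response)
    if payload.get("status") != "ok":
        raise RuntimeError("Scrapyd rejected the job: %r" % (payload,))
    return payload["jobid"]
```

With a Scrapyd instance running locally (assumed here to listen on port 6800), a call such as `schedule_job("http://localhost:6800", "hepcrawl", "example_spider", source_file="records.xml")` would queue one crawl and return its job id.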
This module is meant to be used only with the INSPIRE-HEP overlay. Use at your own risk.
Full documentation is hosted here: http://pythonhosted.org/inspire-crawler/
See also documentation of HEPCrawl: http://pythonhosted.org/hepcrawl/
User’s Guide
This part of the documentation shows you how to get started with inspire-crawler.
API Reference
If you are looking for information on a specific function, class or method, this part of the documentation is for you.
Additional Notes
Notes on how to contribute, legal information and changes are here for the interested.