API Docs
Invenio-OAIHarvester API to harvest items from OAI-PMH servers.
To schedule or run harvests from Python, use the API:
    from invenio_oaiharvester.api import get_records

    # Harvest a single record from arXiv by its OAI-PMH identifier.
    request, records = get_records(
        identifiers=["oai:arXiv.org:1207.7214"],
        url="http://export.arxiv.org/oai2",
    )
    for record in records:
        print(record.raw)
-
invenio_oaiharvester.api.get_info_by_oai_name(name)
Get basic OAI request data from the OAIHarvestConfig model.
Parameters: - name – The name of the source (OAIHarvestConfig.name).
Returns: (url, metadataprefix, lastrun as YYYY-MM-DD, setspecs)
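As a rough illustration, the tuple returned by get_info_by_oai_name can be unpacked and mapped onto the keyword arguments documented for list_records. The helper name kwargs_from_info and the sample values are hypothetical, and reusing lastrun as from_date is only an assumption about how a harvest might be resumed.

```python
def kwargs_from_info(info):
    """Map a (url, metadataprefix, lastrun, setspecs) tuple to keyword arguments.

    Hypothetical helper: the key names follow the documented list_records
    signature, nothing here is part of the library itself.
    """
    url, metadata_prefix, lastrun, setspecs = info
    return {
        "url": url,
        "metadata_prefix": metadata_prefix,
        "from_date": lastrun,  # assumption: resume from the last run date
        "setspecs": setspecs,
    }


# Sample tuple in the documented shape; the values are made up for illustration.
sample_info = ("http://export.arxiv.org/oai2", "oai_dc", "2016-01-31", "physics")
kwargs = kwargs_from_info(sample_info)
print(kwargs["from_date"])
```

With the library installed, the resulting dictionary could be passed on as list_records(**kwargs).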
-
invenio_oaiharvester.api.get_records(identifiers, metadata_prefix=None, url=None, name=None)
Harvest specific records from an OAI repository via their OAI-PMH identifiers.
Parameters: - identifiers – A list of unique identifiers of the records to be harvested.
- metadata_prefix – The metadata prefix of the format to request (defaults to ‘oai_dc’).
- url – The URL used to create the endpoint.
- name – The name of the OAIHarvestConfig to use instead of passing specific parameters.
Returns: The request object and a list of harvested records.
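One possible way to process the returned records is to write each record's raw XML to disk. The sketch below assumes only that every record exposes a raw attribute holding a string, as in the usage example at the top of this page. FakeRecord and save_raw_records are hypothetical names introduced for illustration.

```python
import os
import tempfile


class FakeRecord:
    """Stand-in for a harvested record; only the .raw attribute is assumed."""

    def __init__(self, raw):
        self.raw = raw


def save_raw_records(records, directory):
    """Write each record's raw XML to its own file and return the file paths."""
    paths = []
    for index, record in enumerate(records):
        path = os.path.join(directory, "record_{0}.xml".format(index))
        with open(path, "w", encoding="utf-8") as handle:
            handle.write(record.raw)
        paths.append(path)
    return paths


# Stand-in data; real records would come from get_records().
records = [FakeRecord("<record>1</record>"), FakeRecord("<record>2</record>")]
workdir = tempfile.mkdtemp()
paths = save_raw_records(records, workdir)
print(len(paths))
```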
-
invenio_oaiharvester.api.list_records(metadata_prefix=None, from_date=None, until_date=None, url=None, name=None, setspecs=None)
Harvest multiple records from an OAI repository.
Parameters: - metadata_prefix – The metadata prefix of the format to request (defaults to ‘oai_dc’).
- from_date – The lower bound date for the harvesting (optional).
- until_date – The upper bound date for the harvesting (optional).
- url – The URL used to create the endpoint.
- name – The name of the OAIHarvestConfig to use instead of passing specific parameters.
- setspecs – The ‘set’ criteria for the harvesting (optional).
Returns: The request object and a list of harvested records.
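The date bounds can be computed before calling list_records. The sketch below assumes the dates are passed as YYYY-MM-DD strings, the format documented above for lastrun and the day granularity defined by OAI-PMH. The helper name harvest_window is hypothetical.

```python
from datetime import date, timedelta


def harvest_window(days, today=None):
    """Return (from_date, until_date) strings covering the last `days` days."""
    today = today or date.today()
    start = today - timedelta(days=days)
    return start.isoformat(), today.isoformat()


# A fixed "today" keeps the example reproducible.
from_date, until_date = harvest_window(7, today=date(2016, 3, 15))
print(from_date, until_date)
```

The two strings could then be supplied as the from_date and until_date arguments.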
Models
OAI harvest database models.
Configuration
OAI harvest config.
-
invenio_oaiharvester.config.OAIHARVESTER_DEFAULT_NAMESPACE_MAP = {'OAI-PMH': 'http://www.openarchives.org/OAI/2.0/'}
The default namespace map used when handling OAI-PMH results.
-
invenio_oaiharvester.config.OAIHARVESTER_WORKDIR = None
Path to the directory for oaiharvester-related files (default: instance_path).
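A prefix-to-URI map of this shape is typically handed to an XML library when searching a namespaced response. The sketch below uses the standard library's ElementTree on a hand-written, heavily trimmed response. It is not necessarily how the package handles results internally, and the identifiers are made up.

```python
import xml.etree.ElementTree as ET

# Same value as the documented OAIHARVESTER_DEFAULT_NAMESPACE_MAP default.
NAMESPACE_MAP = {"OAI-PMH": "http://www.openarchives.org/OAI/2.0/"}

# Hand-written sample response for illustration only.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record><header><identifier>oai:example.org:1</identifier></header></record>
    <record><header><identifier>oai:example.org:2</identifier></header></record>
  </ListRecords>
</OAI-PMH>"""

root = ET.fromstring(SAMPLE_RESPONSE)
# The "OAI-PMH:" prefix in the path is resolved through NAMESPACE_MAP.
identifiers = [
    element.text
    for element in root.findall(".//OAI-PMH:identifier", NAMESPACE_MAP)
]
print(identifiers)
```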
Tasks
Celery tasks used by Invenio-OAIHarvester.
-
(task)
invenio_oaiharvester.tasks.get_specific_records(identifiers, metadata_prefix=None, url=None, name=None, signals=True, **kwargs)
Harvest specific records from an OAI repository via their OAI-PMH identifiers.
Parameters: - identifiers – A list of unique identifiers of the records to be harvested.
- metadata_prefix – The metadata prefix of the format to request (e.g. ‘oai_dc’).
- url – The URL used to create the endpoint.
- name – The name of the OAIHarvestConfig to use instead of passing specific parameters.
- signals – Whether signals should be emitted about results.
-
(task)
invenio_oaiharvester.tasks.list_records_from_dates(metadata_prefix=None, from_date=None, until_date=None, url=None, name=None, setspecs=None, signals=True, **kwargs)
Harvest multiple records from an OAI repository.
Parameters: - metadata_prefix – The metadata prefix of the format to request (e.g. ‘oai_dc’).
- from_date – The lower bound date for the harvesting (optional).
- until_date – The upper bound date for the harvesting (optional).
- url – The URL used to create the endpoint.
- name – The name of the OAIHarvestConfig to use instead of passing specific parameters.
- setspecs – The ‘set’ criteria for the harvesting (optional).
- signals – Whether signals should be emitted about results.
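Because these are Celery tasks, they would normally be queued with the standard delay() method rather than called directly. The sketch below takes the task as an argument so that it runs without a configured Celery application. queue_harvest and RecordingTask are hypothetical names, and the source name "arxiv" is made up.

```python
def queue_harvest(task, name, from_date=None, until_date=None, signals=True):
    """Queue a harvest through a Celery task and return whatever delay() returns.

    `task` is expected to be list_records_from_dates; it is passed in so the
    sketch does not depend on a configured Celery app.
    """
    return task.delay(
        name=name,
        from_date=from_date,
        until_date=until_date,
        signals=signals,
    )


class RecordingTask:
    """Stand-in task that records the keyword arguments given to delay()."""

    def __init__(self):
        self.calls = []

    def delay(self, **kwargs):
        self.calls.append(kwargs)
        return len(self.calls)  # placeholder for Celery's AsyncResult


fake_task = RecordingTask()
queue_harvest(fake_task, name="arxiv", from_date="2016-03-08")
print(fake_task.calls[0]["name"])
```

In a real deployment the first argument would be the imported list_records_from_dates task, and a worker would need to be running to execute it.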
Exceptions
OAI harvester errors.
-
exception
invenio_oaiharvester.errors.IdentifiersOrDates
Identifiers cannot be used in combination with dates.
-
exception
invenio_oaiharvester.errors.InvenioOAIHarvesterError
Base exception for invenio-oaiharvester.
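The relationship between the two exceptions can be sketched with local stand-in classes. This assumes that IdentifiersOrDates derives from the base exception, which the page implies but does not state, and the check_arguments helper is hypothetical. Where the package actually raises the error is not documented here.

```python
class InvenioOAIHarvesterError(Exception):
    """Local stand-in for the documented base exception."""


class IdentifiersOrDates(InvenioOAIHarvesterError):
    """Local stand-in: identifiers cannot be combined with dates."""


def check_arguments(identifiers=None, from_date=None, until_date=None):
    """Reject calls that mix identifiers with a date range (hypothetical check)."""
    if identifiers and (from_date or until_date):
        raise IdentifiersOrDates(
            "Identifiers cannot be used in combination with dates."
        )
    return True


caught = None
try:
    check_arguments(identifiers=["oai:arXiv.org:1207.7214"], from_date="2016-03-08")
except InvenioOAIHarvesterError as error:
    # Catching the base class also catches the more specific error.
    caught = type(error).__name__
print(caught)
```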