API Docs
Invenio-OAIHarvester API to harvest items from OAI-PMH servers.
If you need to schedule or run harvests from within Python, you can use the API:
    from invenio_oaiharvester.api import get_records

    request, records = get_records(
        identifiers=["oai:arXiv.org:1207.7214"],
        url="http://export.arxiv.org/oai2",
    )
    for record in records:
        # Print the raw payload of each harvested record.
        print(record.raw)
invenio_oaiharvester.api.get_info_by_oai_name(name)
Get basic OAI request data from the OAIHarvestConfig model.
Parameters:
- name – name of the source (OAIHarvestConfig.name).
Returns: (url, metadataprefix, lastrun as YYYY-MM-DD, setspecs)
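The returned 4-tuple is easy to mix up positionally, so a small wrapper can help. The following is a minimal sketch: the `HarvestInfo` container, the `unpack_info` helper, the source name "arxiv" and the sample values are all hypothetical and not part of the library. It assumes only what is documented above, namely the tuple order and the YYYY-MM-DD format of `lastrun`.

```python
from datetime import date, datetime
from typing import Any, NamedTuple, Optional


class HarvestInfo(NamedTuple):
    """Hypothetical container for the 4-tuple described above."""
    url: str
    metadata_prefix: str
    last_run: Optional[date]
    setspecs: Any  # type is not stated in the docs; kept as returned


def unpack_info(info):
    """Turn (url, metadataprefix, lastrun, setspecs) into a HarvestInfo."""
    url, metadata_prefix, last_run, setspecs = info
    # lastrun is documented as a YYYY-MM-DD string; tolerate a missing value.
    parsed = datetime.strptime(last_run, "%Y-%m-%d").date() if last_run else None
    return HarvestInfo(url, metadata_prefix, parsed, setspecs)


# Illustrative values only. A real call would look like
# unpack_info(get_info_by_oai_name("arxiv")), where "arxiv" is a
# hypothetical OAIHarvestConfig name.
sample = ("http://export.arxiv.org/oai2", "oai_dc", "2016-01-31", "physics")
info = unpack_info(sample)
print(info.url, info.last_run)
```

Because the function reads from the OAIHarvestConfig model, a real call presumably has to run where the database is reachable, for example inside an application context.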
invenio_oaiharvester.api.get_records(identifiers, metadata_prefix=None, url=None, name=None)
Harvest specific records from an OAI repository via OAI-PMH identifiers.
Parameters:
- metadata_prefix – The prefix of the metadata format to return (defaults to ‘oai_dc’).
- identifiers – List of unique identifiers of the records to be harvested.
- url – The URL used to create the endpoint.
- name – The name of the OAIHarvestConfig to use instead of passing specific parameters.
Returns: request object, list of harvested records
invenio_oaiharvester.api.list_records(metadata_prefix=None, from_date=None, until_date=None, url=None, name=None, setspecs=None)
Harvest multiple records from an OAI repository.
Parameters:
- metadata_prefix – The prefix of the metadata format to return (defaults to ‘oai_dc’).
- from_date – The lower bound date for the harvesting (optional).
- until_date – The upper bound date for the harvesting (optional).
- url – The URL used to create the endpoint.
- name – The name of the OAIHarvestConfig to use instead of passing specific parameters.
- setspecs – The ‘set’ criteria for the harvesting (optional).
Returns: request object, list of harvested records
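A common use of `list_records` is harvesting a rolling date window. The sketch below is one way to do that, under the assumption that `from_date` and `until_date` accept YYYY-MM-DD strings (the documentation above does not state the expected type). The helpers `date_window` and `harvest_recent` are hypothetical names introduced for illustration.

```python
from datetime import date, timedelta


def date_window(days, today=None):
    """Return (from_date, until_date) as YYYY-MM-DD strings for the last `days` days."""
    until = today or date.today()
    start = until - timedelta(days=days)
    return start.isoformat(), until.isoformat()


def harvest_recent(url, days=7, metadata_prefix="oai_dc"):
    """Harvest records from `url` within the last `days` days."""
    # Imported lazily so date_window() can be used without Invenio installed.
    from invenio_oaiharvester.api import list_records

    from_date, until_date = date_window(days)
    request, records = list_records(
        metadata_prefix=metadata_prefix,
        from_date=from_date,  # assumption: ISO date strings are accepted
        until_date=until_date,
        url=url,
    )
    return records


print(date_window(7, today=date(2020, 3, 10)))  # ('2020-03-03', '2020-03-10')
```

Calling `harvest_recent("http://export.arxiv.org/oai2")` would then issue the actual request; only the date helper runs in the snippet as written.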
Models
OAI harvest database models.
Configuration
OAI harvest config.
invenio_oaiharvester.config.OAIHARVESTER_DEFAULT_NAMESPACE_MAP = {'OAI-PMH': 'http://www.openarchives.org/OAI/2.0/'}
The default namespace used when handling OAI-PMH results.
invenio_oaiharvester.config.OAIHARVESTER_WORKDIR = None
Path to the directory for oaiharvester related files. Defaults to instance_path.
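To illustrate what a namespace map such as OAIHARVESTER_DEFAULT_NAMESPACE_MAP is for, the sketch below uses the same mapping to look up elements in an OAI-PMH response with the standard library's ElementTree. This is not how the harvester itself parses results (that is not described here); the sample XML is hand-written and trimmed down, and `extract_identifiers` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

# Same mapping as the documented default of OAIHARVESTER_DEFAULT_NAMESPACE_MAP.
NAMESPACE_MAP = {"OAI-PMH": "http://www.openarchives.org/OAI/2.0/"}

# Hand-written, minimal response for illustration only.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <GetRecord><record><header>
    <identifier>oai:arXiv.org:1207.7214</identifier>
  </header></record></GetRecord>
</OAI-PMH>"""


def extract_identifiers(xml_text, namespaces=NAMESPACE_MAP):
    """Return the text of every OAI-PMH <identifier> element in the document."""
    root = ET.fromstring(xml_text)
    # The "OAI-PMH:" prefix is resolved to the namespace URI through the map.
    return [el.text for el in root.findall(".//OAI-PMH:identifier", namespaces)]


print(extract_identifiers(SAMPLE_RESPONSE))
```

Without the map, the same lookup would need the full `{http://www.openarchives.org/OAI/2.0/}identifier` tag spelled out in every query.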
Tasks
Celery tasks used by Invenio-OAIHarvester.
(task) invenio_oaiharvester.tasks.get_specific_records(identifiers, metadata_prefix=None, url=None, name=None, signals=True, **kwargs)
Harvest specific records from an OAI repository via OAI-PMH identifiers.
Parameters:
- metadata_prefix – The prefix of the metadata format to return (e.g. ‘oai_dc’).
- identifiers – List of unique identifiers of the records to be harvested.
- url – The URL used to create the endpoint.
- name – The name of the OAIHarvestConfig to use instead of passing specific parameters.
- signals – Whether signals should be emitted about results.
(task) invenio_oaiharvester.tasks.list_records_from_dates(metadata_prefix=None, from_date=None, until_date=None, url=None, name=None, setspecs=None, signals=True, **kwargs)
Harvest multiple records from an OAI repository.
Parameters:
- metadata_prefix – The prefix of the metadata format to return (e.g. ‘oai_dc’).
- from_date – The lower bound date for the harvesting (optional).
- until_date – The upper bound date for the harvesting (optional).
- url – The URL used to create the endpoint.
- name – The name of the OAIHarvestConfig to use instead of passing specific parameters.
- setspecs – The ‘set’ criteria for the harvesting (optional).
- signals – Whether signals should be emitted about results.
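Since both entries are Celery tasks, they can be queued with the standard `.delay()` call. The sketch below shows one possible way to pick the right task from the arguments. The helpers `task_call_for` and `enqueue` are hypothetical, and the up-front `ValueError` is this sketch's own check that follows the rule documented under Exceptions (identifiers cannot be combined with dates), not the library's behaviour.

```python
def task_call_for(identifiers=None, from_date=None, until_date=None, **common):
    """Return (task_name, kwargs) for the task matching the given arguments."""
    if identifiers and (from_date or until_date):
        # Sketch-level check following the documented identifiers-or-dates rule.
        raise ValueError("identifiers cannot be combined with dates")
    if identifiers:
        return "get_specific_records", dict(identifiers=list(identifiers), **common)
    return "list_records_from_dates", dict(
        from_date=from_date, until_date=until_date, **common
    )


def enqueue(**kwargs):
    """Queue the matching task on a Celery worker and return its AsyncResult."""
    # Imported lazily: needs Invenio-OAIHarvester and a configured broker.
    from invenio_oaiharvester import tasks

    task_name, task_kwargs = task_call_for(**kwargs)
    return getattr(tasks, task_name).delay(**task_kwargs)


task_name, task_kwargs = task_call_for(
    identifiers=["oai:arXiv.org:1207.7214"],
    url="http://export.arxiv.org/oai2",
)
print(task_name, task_kwargs)
```

Calling `enqueue(...)` with the same arguments would hand the work to a worker instead of running it in the current process.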
Exceptions
OAI harvester errors.
exception invenio_oaiharvester.errors.IdentifiersOrDates
Identifiers cannot be used in combination with dates.
exception invenio_oaiharvester.errors.InvenioOAIHarvesterError
Base exception for invenio-oaiharvester.