API Docs

Configuration

Indexer for Invenio.

invenio_indexer.config.INDEXER_BEFORE_INDEX_HOOKS = []

List of automatically connected hooks (function or importable string).

invenio_indexer.config.INDEXER_BULK_REQUEST_TIMEOUT = 10

Request timeout to use in bulk indexing.

invenio_indexer.config.INDEXER_DEFAULT_DOC_TYPE = 'record-v1.0.0'

Default doc_type to use if no schema is defined.

invenio_indexer.config.INDEXER_DEFAULT_INDEX = 'records-record-v1.0.0'

Default index to use if no schema is defined.

invenio_indexer.config.INDEXER_MQ_EXCHANGE = <unbound Exchange indexer(direct)>

Default exchange for message queue.

invenio_indexer.config.INDEXER_MQ_QUEUE = <unbound Queue indexer -> <unbound Exchange indexer(direct)> -> indexer>

Default queue for message queue.

invenio_indexer.config.INDEXER_MQ_ROUTING_KEY = 'indexer'

Default routing key for message queue.

invenio_indexer.config.INDEXER_RECORD_TO_INDEX = 'invenio_indexer.utils.default_record_to_index'

Provide an implementation of the record_to_index function.

invenio_indexer.config.INDEXER_REPLACE_REFS = True

Whether to replace JSONRefs prior to indexing record.
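
As an illustration of how these variables might be overridden, here is a minimal sketch of an instance configuration module. The `mysite.indexing` import paths are hypothetical placeholders and are not part of Invenio-Indexer; the values are examples only.

```python
# Hypothetical overrides for an Invenio instance's configuration module.
# The 'mysite.indexing' paths are placeholders, not real Invenio modules.

# Hooks connected automatically; each entry is a function or an import string.
INDEXER_BEFORE_INDEX_HOOKS = ['mysite.indexing.add_indexed_flag']

# Allow slower bulk requests than the default of 10.
INDEXER_BULK_REQUEST_TIMEOUT = 30

# Fallbacks used when a record defines no schema.
INDEXER_DEFAULT_INDEX = 'records-record-v1.0.0'
INDEXER_DEFAULT_DOC_TYPE = 'record-v1.0.0'

# Import string of a custom record_to_index implementation.
INDEXER_RECORD_TO_INDEX = 'mysite.indexing.record_to_index'

# Index records as stored, without replacing JSONRefs first.
INDEXER_REPLACE_REFS = False
```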

Record Indexer

API for indexing of records.

class invenio_indexer.api.Producer(channel, exchange=None, routing_key=None, serializer=None, auto_declare=None, compression=None, on_return=None)

Producer validating published messages.

For more information visit kombu.Producer.

publish(data, **kwargs)

Validate operation type.

class invenio_indexer.api.RecordIndexer(search_client=None, exchange=None, queue=None, routing_key=None, version_type=None, record_to_index=None)

Provide an interface for indexing records in Elasticsearch.

Bulk indexing works by queuing requests for indexing records and processing these requests in bulk.

Initialize indexer.

Parameters:
  • search_client – Elasticsearch client. (Default: current_search_client)
  • exchange – A kombu.Exchange instance for message queue.
  • queue – A kombu.Queue instance for message queue.
  • routing_key – Routing key for message queue.
  • version_type – Elasticsearch version type. (Default: external_gte)
  • record_to_index – Function to extract the index and doc_type from the record.

bulk_delete(record_id_iterator)

Bulk delete records from index.

Parameters: record_id_iterator – Iterator yielding record UUIDs.

bulk_index(record_id_iterator)

Bulk index records.

Parameters: record_id_iterator – Iterator yielding record UUIDs.

create_producer(*args, **kwds)

Context manager that yields an instance of Producer.

delete(record)

Delete a record.

Parameters: record – Record instance.

delete_by_id(record_uuid)

Delete record from index by record identifier.

index(record)

Index a record.

The caller is responsible for ensuring that the record has already been committed to the database. If a newer version of a record has already been indexed then the provided record will not be indexed. This behavior can be controlled by providing a different version_type when initializing RecordIndexer.
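
The version check described above follows Elasticsearch's external versioning rules. The following is a plain-Python sketch of those rules for the `external_gte` and `external` version types. It does not call Elasticsearch, and the function name `write_is_accepted` is hypothetical.

```python
def write_is_accepted(stored_version, incoming_version,
                      version_type='external_gte'):
    """Mimic Elasticsearch's external version check (illustration only)."""
    if stored_version is None:
        # Nothing indexed yet: any version is accepted.
        return True
    if version_type == 'external_gte':
        # Accept the same or a newer version; reject older ones.
        return incoming_version >= stored_version
    if version_type == 'external':
        # Accept strictly newer versions only.
        return incoming_version > stored_version
    raise ValueError('unsupported version_type: %r' % version_type)
```

With the default `external_gte`, re-indexing the same revision of a record is accepted, while an older revision is rejected.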

Parameters: record – Record instance.

index_by_id(record_uuid)

Index a record by record identifier.

Parameters: record_uuid – Record identifier.

mq_exchange

Message Queue exchange.

Returns: The Message Queue exchange.

mq_queue

Message Queue queue.

Returns: The Message Queue queue.

mq_routing_key

Message Queue routing key.

Returns: The Message Queue routing key.

process_bulk_queue()

Process bulk indexing queue.
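
The two-step bulk workflow (queue requests, then process them in bulk) can be sketched as a small helper. This is a minimal sketch: `reindex_in_bulk` is a hypothetical name, and `indexer` is assumed to be a `RecordIndexer` or any object exposing the same `bulk_index` and `process_bulk_queue` methods.

```python
def reindex_in_bulk(indexer, record_ids):
    """Queue the given record UUIDs, then process the queue once.

    ``indexer`` is expected to behave like ``RecordIndexer``: it must
    provide ``bulk_index`` and ``process_bulk_queue``.
    """
    # Step 1: queue one indexing request per record UUID.
    indexer.bulk_index(iter(record_ids))
    # Step 2: process the queued requests in bulk.
    indexer.process_bulk_queue()
```

In an application, `indexer` would typically be created inside a Flask application context, and the second step is often left to the `process_bulk_queue` Celery task instead of being called inline.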

record_to_index(record)

Get index/doc_type given a record.

Parameters: record – The record from which to read the index information.
Returns: A tuple (index, doc_type).
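
A custom function honoring this contract could look like the sketch below. It is a simplified illustration, not the library's default implementation: it assumes a dict-like record with an optional `$schema` URL, and a naming convention in which the path after `/schemas/` maps to the index name.

```python
DEFAULT_INDEX = 'records-record-v1.0.0'   # mirrors INDEXER_DEFAULT_INDEX
DEFAULT_DOC_TYPE = 'record-v1.0.0'        # mirrors INDEXER_DEFAULT_DOC_TYPE


def record_to_index(record):
    """Return ``(index, doc_type)`` for a dict-like record."""
    schema = record.get('$schema')
    if not schema:
        # No schema defined: fall back to the defaults.
        return DEFAULT_INDEX, DEFAULT_DOC_TYPE
    # Assumed convention: '.../schemas/records/record-v1.0.0.json'
    path = schema.split('/schemas/', 1)[-1]
    if path.endswith('.json'):
        path = path[:-len('.json')]
    parts = path.split('/')
    # Index joins the path segments; doc_type is the last segment.
    return '-'.join(parts), parts[-1]
```

Such a function could be passed as the `record_to_index` argument of `RecordIndexer` or referenced by import string in `INDEXER_RECORD_TO_INDEX`.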

Flask Extension

Flask extension for Invenio-Indexer.

class invenio_indexer.ext.InvenioIndexer(app=None)

Invenio-Indexer extension.

Extension initialization.

Parameters: app – The Flask application. (Default: None)

init_app(app)

Flask application initialization.

Parameters: app – The Flask application.

init_config(app)

Initialize configuration.

Parameters: app – The Flask application.

record_to_index

Import the configurable ‘record_to_index’ function.

Celery tasks

Celery tasks to index records.

invenio_indexer.tasks.process_bulk_queue(version_type=None)

Process bulk indexing queue.

Parameters: version_type – Elasticsearch version type.

Note: You can start multiple instances of this task.

invenio_indexer.tasks.index_record(record_uuid)

Index a single record.

Parameters: record_uuid – The record UUID.

invenio_indexer.tasks.delete_record(record_uuid)

Delete a single record.

Parameters: record_uuid – The record UUID.
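
One common way to run `process_bulk_queue` periodically is a Celery beat schedule entry. The sketch below assumes the setting is named `CELERY_BEAT_SCHEDULE`, which depends on the Celery version and configuration namespace in use; the five-minute interval and the `'indexer'` entry name are arbitrary examples.

```python
from datetime import timedelta

# Assumed setting name; older Celery configurations use a different one.
CELERY_BEAT_SCHEDULE = {
    'indexer': {
        # Task path as documented above.
        'task': 'invenio_indexer.tasks.process_bulk_queue',
        # Example interval only; tune it to the expected queue volume.
        'schedule': timedelta(minutes=5),
    },
}
```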

Signals

Signals for indexer.

invenio_indexer.signals.before_record_index = <blinker.base.NamedSignal object at 0x0000000003749f30; 'before-record-index'>

Signal sent before a record is indexed.

The sender is the current Flask application, and four keyword arguments are provided:

  • json: The dumped record dictionary which can be modified.
  • record: The record being indexed.
  • index: The index in which the record will be indexed.
  • doc_type: The doc_type for the record.
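
A receiver for this signal accepts the sender plus the keyword arguments listed above. The following is a minimal sketch; the function name `add_indexed_flag` and the `_indexed_in` field are hypothetical and serve only to show that `json` can be modified in place.

```python
def add_indexed_flag(sender, json=None, record=None, index=None,
                     doc_type=None, **kwargs):
    """Annotate the document that is about to be sent to the index."""
    # ``json`` is the dumped record dictionary and may be modified in place;
    # ``record`` itself is left untouched.
    json['_indexed_in'] = index
```

Such a function could be connected with the signal's `connect` method (for example `before_record_index.connect(add_indexed_flag)`) or listed in `INDEXER_BEFORE_INDEX_HOOKS`.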