pyes.es

class pyes.es.ES(server, timeout=5.0, bulk_size=400, encoder=None, decoder=None, max_retries=3, autorefresh=False, default_indices=['_all'], dump_curl=False)

ES connection object.
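
A minimal sketch of opening a connection, assuming a node reachable at 127.0.0.1:9200 (the address and all index, type, and field names in the sketches below are illustrative); the resulting conn object is reused by the later examples:

    from pyes import ES

    # Connect to a single node; with the defaults shown, bulk operations are
    # flushed once 400 of them have been queued.
    conn = ES("127.0.0.1:9200", timeout=5.0, bulk_size=400)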

add_alias(alias, indices)

Add an alias to point to a set of indices.

analyze(text, index=None)

Performs the analysis process on a text and returns the token breakdown of the text.

change_aliases(commands)

Change the aliases stored.

commands is a list of 3-tuples; (command, index, alias), where command is one of “add” or “remove”, and index and alias are the index and alias to add or remove.
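
For example, a hedged sketch that atomically moves a hypothetical articles-current alias from one index to another:

    # Each command is a (command, index, alias) 3-tuple; names are illustrative.
    conn.change_aliases([
        ("remove", "articles-2010", "articles-current"),
        ("add", "articles-2011", "articles-current"),
    ])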

close_index(index)

Close an index.

cluster_health(indices=None, level='cluster', wait_for_status=None, wait_for_relocating_shards=None, timeout=30)

Check the current cluster health.

The cluster health API accepts the following request parameters:

Parameters:
  • level – Can be one of cluster, indices or shards. Controls the details level of the health information returned. Defaults to cluster.
  • wait_for_status – One of green, yellow or red. Will wait (until the timeout provided) until the status of the cluster changes to the one provided. By default, will not wait for any status.
  • wait_for_relocating_shards – A number controlling how many relocating shards to wait for. Usually 0, to indicate waiting until all relocations have finished. By default, does not wait.
  • timeout – A time based parameter controlling how long to wait if one of the wait_for_XXX parameters is provided. Defaults to 30s.
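
For instance, a sketch that waits up to 30 seconds for a hypothetical index to reach at least yellow health; reading the status field assumes the decoded Elasticsearch cluster-health response is returned:

    # Block until the cluster reports yellow (or green) for the index,
    # or until the timeout expires.
    health = conn.cluster_health(indices=["articles"],
                                 wait_for_status="yellow", timeout=30)
    print(health["status"])  # assumed field of the raw cluster-health payload
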
cluster_nodes(nodes=None)

The cluster nodes info API allows retrieval of information for one or more (or all) of the cluster nodes.

cluster_state(filter_nodes=None, filter_routing_table=None, filter_metadata=None, filter_blocks=None, filter_indices=None)

Retrieve the cluster state.

Parameters:
  • filter_nodes – set to true to filter out the nodes part of the response.
  • filter_routing_table – set to true to filter out the routing_table part of the response.
  • filter_metadata – set to true to filter out the metadata part of the response.
  • filter_blocks – set to true to filter out the blocks part of the response.
  • filter_indices – when not filtering metadata, a comma separated list of indices to include in the response.
cluster_stats(nodes=None)

The cluster nodes stats API allows retrieval of statistics for one or more (or all) of the cluster nodes.

collect_info()

Collect info about the connection and fill the info dictionary

count(query, indices=None, doc_types=None, **query_params)

Execute a query against one or more indices and get hits count.
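
A sketch counting matching documents with a raw query-DSL dictionary; the index, the field, and the count key read from the response are assumptions:

    result = conn.count({"term": {"status": "published"}}, indices=["articles"])
    print(result["count"])  # assumed field of the raw count response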

create_index(index, settings=None)

Creates an index with optional settings. Settings must be a dictionary which will be converted to JSON. Elasticsearch also accepts yaml, but we are only passing JSON.
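
For example, an illustrative settings dictionary using standard Elasticsearch index settings:

    conn.create_index("articles", settings={
        "index": {
            "number_of_shards": 2,
            "number_of_replicas": 1,
        },
    })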

create_index_if_missing(index, settings=None)

Creates an index if it doesn’t already exist.

If supplied, settings must be a dictionary.

create_percolator(index, name, query, **kwargs)

Create a percolator document

Any kwargs will be added to the document as extra properties.
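
A sketch registering a hypothetical percolator named breaking-news; TermQuery from pyes.query is assumed to be available:

    from pyes.query import TermQuery

    # Documents containing the tag "breaking" will match this percolator.
    conn.create_percolator("articles", "breaking-news", TermQuery("tags", "breaking"))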

create_river(river, river_name=None)

Create a river

delete(index, doc_type, id, bulk=False)

Delete a typed JSON document from a specific index based on its id. If bulk is True, the delete operation is put in bulk mode.

deleteByQuery(indices, doc_types, query, **request_params)

Delete documents from one or more indices and one or more types based on a query.

delete_alias(alias, indices)

Delete an alias.

The specified index or indices are deleted from the alias, if they are in it to start with. This won’t report an error even if the indices aren’t present in the alias.

delete_index(index)

Deletes an index.

delete_index_if_exists(index)

Deletes an index if it exists.

delete_mapping(index, doc_type)

Delete a typed JSON document type from a specific index.

delete_percolator(index, name)

Delete a percolator document

delete_river(river, river_name=None)

Delete a river

flush(indices=None, refresh=None)

Flushes one or more indices (clears memory).

flush_bulk(forced=False)

Wait to process all pending operations

force_bulk()

Force execution of all bulk data

gateway_snapshot(indices=None)

Perform a gateway snapshot of one or more indices

get(index, doc_type, id, fields=None, routing=None, **get_params)

Get a typed JSON document from an index based on its id.
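
An illustrative fetch of a single document, restricted to two fields:

    # Retrieve document 42 of type "article"; only title and author are returned.
    doc = conn.get("articles", "article", 42, fields=["title", "author"])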

get_alias(alias)

Get the index or indices pointed to by a given alias.

Raises IndexMissingException if the alias does not exist.

Otherwise, returns a list of index names.

get_file(index, doc_type, id=None)

Return the filename and memory data stream

get_indices(include_aliases=False)

Get a dict holding an entry for each index which exists.

If include_aliases is True, the dict will also contain entries for aliases.

The key for each entry in the dict is the index or alias name. The value is a dict holding the following properties:

  • num_docs: Number of documents in the index or alias.
  • alias_for: Only present for an alias: holds a list of indices which this is an alias for.
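
For example, a sketch that prints every index and alias with its document count, relying only on the keys documented above:

    for name, info in conn.get_indices(include_aliases=True).items():
        print(name, info["num_docs"], info.get("alias_for", []))
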
get_mapping(doc_type=None, indices=None)

Retrieve the mapping definition for a specific type against one or more indices.

index(doc, index, doc_type, id=None, parent=None, force_insert=False, bulk=False, version=None, querystring_args=None)

Index a typed JSON document into a specific index and make it searchable.
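
A sketch indexing an illustrative document; with bulk=True the call would instead be queued until bulk_size operations have accumulated:

    conn.index({"title": "Hello world", "tags": ["demo"]}, "articles", "article", id=1)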

index_raw_bulk(header, document)

Helper function for fast bulk inserting.

header and document must be strings ending with a newline ("\n").

morelikethis(index, doc_type, id, fields, **query_params)

Execute a “more like this” search query against one or more fields and get back search hits.
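
For instance, an illustrative lookup of documents similar to document 42, compared on two fields:

    similar = conn.morelikethis("articles", "article", 42, fields=["title", "body"])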

open_index(index)

Open an index.

optimize(indices=None, wait_for_merge=False, max_num_segments=None, only_expunge_deletes=False, refresh=True, flush=True)

Optimize one or more indices.

indices is the list of indices to optimize. If not supplied, or “_all”, all indices are optimized.

wait_for_merge (boolean): If True, the operation will not return until the merge has been completed. Defaults to False.

max_num_segments (integer): The number of segments to optimize to. To fully optimize the index, set it to 1. Defaults to half the number configured by the merge policy (which in turn defaults to 10).

only_expunge_deletes (boolean): Should the optimize process only expunge segments with deletes in them. In Lucene, a document is not deleted from a segment, just marked as deleted. During a merge process of segments, a new segment is created that does not have those deletes. This flag allows merging only those segments that have deletes. Defaults to false.

refresh (boolean): Should a refresh be performed after the optimize. Defaults to true.

flush (boolean): Should a flush be performed after the optimize. Defaults to true.
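
A sketch that fully optimizes a hypothetical index down to a single segment and waits for the merge to finish:

    conn.optimize(indices=["articles"], wait_for_merge=True, max_num_segments=1)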

percolate(index, doc_types, query)

Match a query with a document

put_file(filename, index, doc_type, id=None)

Store a file in an index

put_mapping(doc_type=None, mapping=None, indices=None)

Register specific mapping definition for a specific type against one or more indices.
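
A minimal sketch; the field definitions, and the exact shape expected for the mapping body (which varies across pyes and Elasticsearch versions), are assumptions:

    mapping = {"properties": {"title": {"type": "string", "store": "yes"}}}
    conn.put_mapping("article", mapping, indices=["articles"])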

refresh(indices=None, timesleep=1)

Refresh one or more indices

timesleep: seconds to wait

reindex(query, indices=None, doc_types=None, **query_params)

Execute a search query against one or more indices and reindex the hits. query must be a dictionary or a Query object that will be converted to the Query DSL. Note: reindex is only available in my ElasticSearch branch on github.

scan(query, indices=None, doc_types=None, scroll_timeout='10m', **query_params)

Return a generator which will scan against one or more indices and iterate over the search hits. (Currently supported only by ES master.)

query must be a Search object, a Query object, or a custom dictionary of search parameters using the query DSL to be passed directly.
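
A sketch that walks every document via scrolling; MatchAllQuery from pyes.query is assumed available, and depending on the pyes version the generator may yield individual hits or whole scroll pages:

    from pyes.query import MatchAllQuery

    for item in conn.scan(MatchAllQuery(), indices=["articles"], scroll_timeout="10m"):
        print(item)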

search(query, indices=None, doc_types=None, **query_params)

Execute a search against one or more indices to get the resultset.

query must be a Search object, a Query object, or a custom dictionary of search parameters using the query DSL to be passed directly.
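
For example, a sketch using a Query object; TermQuery from pyes.query is assumed available, and iterating directly over the result assumes a pyes version whose search returns an iterable result set:

    from pyes.query import TermQuery

    results = conn.search(TermQuery("title", "hello"),
                          indices=["articles"], doc_types=["article"])
    for hit in results:
        print(hit)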

search_raw(query, indices=None, doc_types=None, **query_params)

Execute a search against one or more indices to get the search hits.

query must be a Search object, a Query object, or a custom dictionary of search parameters using the query DSL to be passed directly.

search_scroll(scroll_id, scroll_timeout='10m')

Executes a scroll request given a scroll_id

set_alias(alias, indices)

Set an alias.

This handles removing the old list of indices pointed to by the alias.

Warning: there is a race condition in the implementation of this function - if another client modifies the indices which this alias points to during this call, the old value of the alias may not be correctly set.
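
An illustrative call that points the articles-current alias exclusively at one index, replacing whatever it pointed to before:

    conn.set_alias("articles-current", ["articles-2011"])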

status(indices=None)

Retrieve the status of one or more indices

update_settings(index, newvalues)

Update Settings of an index.

pyes.es.file_to_attachment(filename)

Convert a file to an attachment

pyes.es.decode_json(data)

Decode JSON data to a dict
