Reference

Data representation

Solr documents are simple collections of named fields; the fields may be multi-valued, depending on the Solr schema being used.

In Python, these documents are modeled as dictionaries with the field names as keys and field values (or sets of values) as values. When multiple values are presented to Solr, the value in the dictionary must be a list, tuple or set. (A frozenset is not accepted.)

Values may be strings (str or unicode), dates, datetimes, bools, or None. Field values of None are omitted from the values submitted to Solr.

datetime.datetime values are converted to UTC.

datetime.date values are converted to datetime.datetime values at 00:00:00 with an assumed timezone of UTC.

bool values are converted to the string values 'true' or 'false'.
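The conversion rules above can be sketched in a few lines. This is an illustration only, not solrpy's own code: the helper names to_solr_value and normalize_document are hypothetical, the ISO 8601 "Z" string form is an assumption about the wire format, and naive datetimes are assumed to already be in UTC. The sketch is written for Python 3, so unicode is simply str.

```python
import datetime


def to_solr_value(value):
    """Convert one Python value to a string, following the rules above."""
    # bool must be tested first: it would otherwise fall through to str().
    if isinstance(value, bool):
        return "true" if value else "false"
    # datetime must be tested before date, since datetime subclasses date.
    if isinstance(value, datetime.datetime):
        if value.tzinfo is not None:
            value = value.astimezone(datetime.timezone.utc)
        # Assumption: a naive datetime is treated as already being in UTC.
        return value.replace(tzinfo=None).isoformat(timespec="seconds") + "Z"
    if isinstance(value, datetime.date):
        midnight = datetime.datetime(value.year, value.month, value.day)
        return midnight.isoformat() + "Z"
    return str(value)


def normalize_document(doc):
    """Flatten a document dictionary into (field name, string value) pairs."""
    pairs = []
    for name, value in doc.items():
        if value is None:
            continue  # fields whose value is None are omitted
        # Only list, tuple and set are treated as multi-valued.
        values = value if isinstance(value, (list, tuple, set)) else [value]
        for item in values:
            if item is not None:  # assumption: None inside a sequence is skipped
                pairs.append((name, to_solr_value(item)))
    return pairs
```

A multi-valued field produces one pair per value, and a field set to None produces none.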

Exceptions

The solr module provides everything you’ll need. There’s an exception that might be raised by many operations:

exception solr.SolrException(httpcode, reason=None, body=None)

Bases: exceptions.Exception

An exception thrown by solr connections.

Detailed information is provided in attributes of the exception object.

httpcode

HTTP response code from Solr.

reason

Error message from the HTTP response sent by Solr.

body

Response body returned by Solr.

This can contain much more information about the error, including tracebacks from the Java runtime.

These exceptions, along with others, can be raised by the connection objects that are provided.
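A handler for this exception typically reads the three attributes described above. The sketch below uses a stand-in class with the documented signature so that it runs without a Solr server; it is not solrpy's implementation, and describe_failure is a hypothetical helper.

```python
class SolrException(Exception):
    """Stand-in with the documented signature; NOT solrpy's own class."""

    def __init__(self, httpcode, reason=None, body=None):
        super().__init__(httpcode, reason, body)
        self.httpcode = httpcode
        self.reason = reason
        self.body = body


def describe_failure(exc):
    """Build a one-line summary from the documented attributes."""
    summary = "Solr returned HTTP %s: %s" % (exc.httpcode, exc.reason)
    # The body can be long (it may hold a Java traceback), so only the
    # first non-empty line is kept here.
    lines = (exc.body or "").strip().splitlines()
    if lines:
        summary += " | " + lines[0]
    return summary


try:
    # Simulated failure; a real one would come from a connection method.
    raise SolrException(400, "Bad Request", "undefined field: colour\n(more detail)")
except SolrException as exc:
    message = describe_failure(exc)
```

With the real library, the except clause would name solr.SolrException instead of the stand-in.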

Connections

There are two flavors of connection objects; one is provided to support older applications, and the other provides more rational (and powerful) access to commit controls.

Both connection classes are designed to work with the 2.2 response format generated by Solr 1.2 and newer, but will likely work with the older 2.1 response format as well.

class solr.Solr(url)

Connect to the Solr instance at url. If the Solr instance provides multiple cores, url should point to a specific core.

Python must have SSL support installed for the https scheme to work. (Most pre-packaged Python builds include it.)

Many keyword arguments can be specified to tailor instances for the needs of specific applications:

persistent
Keep a persistent connection open. Defaults to True.
timeout
Timeout, in seconds, for the server to respond. By default, use the Python default timeout.
ssl_key, ssl_cert
If using client-side key files for SSL authentication, these should be, respectively, your PEM key file and certificate file.
http_user, http_pass
If given, include HTTP Basic authentication in all request headers.
post_headers
A dictionary of headers that should be included in all requests to Solr. This is a good way to provide the User-Agent or other specialized headers.
max_retries
Maximum number of retries to perform automatically. Retries are only attempted when socket errors, httplib.ImproperConnectionState, or httplib.BadStatusLine exceptions are raised from calls into httplib.
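As a rough illustration, the keyword arguments can be collected in a dictionary and passed to the constructor. Every value below is a placeholder, the core0 URL is hypothetical, and open_connection is a helper invented for this sketch; it needs solrpy installed when it is actually called.

```python
# Keyword arguments documented above; all values are placeholders.
connection_options = {
    "persistent": True,
    "timeout": 10,                # seconds
    "http_user": "indexer",
    "http_pass": "secret",
    "post_headers": {"User-Agent": "my-indexer/1.0"},
    "max_retries": 3,
}


def open_connection(url, **options):
    """Create a solr.Solr connection using the given keyword arguments."""
    import solr  # imported lazily so the sketch loads without solrpy
    return solr.Solr(url, **options)


# Usage (needs solrpy and a reachable server, so it is left commented out):
# conn = open_connection("https://solr.example.net/solr/core0", **connection_options)
```

Keeping the options in one dictionary makes it easy to share the same settings between several connections.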

Commit-control arguments

Some methods support optional Boolean arguments to control commits that may be made by the method. These arguments are always optional, and no commit will be performed if they are not given.

Methods that accept these arguments are identified as supporting commit-control arguments, but the arguments are not listed or described for the individual methods.

The following commit-control keyword arguments are defined:

commit
Indicates whether a commit should be performed before the method returns.
optimize
Indicates whether index optimization should be performed before the method returns. If true, implies a commit value of True.
wait_flush
Indicates whether the request should block until the commit has been flushed to disk on the server. If not specified, this defaults to True. (There’s some question about whether this is honored in recent versions of Solr.)
wait_searcher
Indicates whether the request should block until searcher objects have been warmed for use before returning. If not specified, this defaults to True. If true, implies a wait_flush value of True (a false wait_flush value will be ignored).

If wait_flush or wait_searcher is specified when neither commit nor optimize is true, a TypeError will be raised.

Whenever possible, the request to commit or optimize the index will be collapsed into an update request being performed by the method being called. This avoids a separate HTTP round-trip to commit changes.
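The way these arguments interact can be summarized in a small sketch. The function resolve_commit_args is hypothetical and is not part of solrpy; it only encodes the rules stated above.

```python
_UNSET = object()  # sentinel: distinguishes "not given" from False


def resolve_commit_args(commit=False, optimize=False,
                        wait_flush=_UNSET, wait_searcher=_UNSET):
    """Return the effective commit settings, or None if no commit is needed."""
    if optimize:
        commit = True  # optimize implies commit
    if not commit:
        if wait_flush is not _UNSET or wait_searcher is not _UNSET:
            raise TypeError(
                "wait_flush and wait_searcher require commit or optimize")
        return None
    # Both wait options default to True when not specified.
    wait_flush = True if wait_flush is _UNSET else bool(wait_flush)
    wait_searcher = True if wait_searcher is _UNSET else bool(wait_searcher)
    if wait_searcher:
        wait_flush = True  # a false wait_flush is ignored in this case
    return {
        "commit": True,
        "optimize": bool(optimize),
        "wait_flush": wait_flush,
        "wait_searcher": wait_searcher,
    }
```

For example, resolve_commit_args(commit=True, wait_flush=False) still reports wait_flush as true, because wait_searcher defaults to True.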

Methods common to connections

These methods are available on both connection classes.

Solr.delete(id=None, ids=None, queries=None)

Delete documents by ids or queries.

Any or all of id, ids, or queries may be given; all provided will be used. If none are provided, no request will be sent to Solr.

id is a single value for the schema’s unique id field. ids is an iterable of unique ids.

queries is an iterable of standard-syntax queries.

Supports commit-control arguments.
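To show how id, ids and queries can be combined into a single request, the sketch below builds a delete message in Solr's XML update format. Whether solrpy produces exactly this text is an assumption; build_delete_message is a hypothetical helper, not a library function.

```python
from xml.sax.saxutils import escape


def build_delete_message(id=None, ids=None, queries=None):
    """Combine all given ids and queries into one <delete> message.

    The parameter name id shadows the builtin on purpose, to match the
    documented signature. Returns None when there is nothing to delete.
    """
    parts = []
    if id is not None:
        parts.append("<id>%s</id>" % escape(str(id)))
    for one_id in ids or ():
        parts.append("<id>%s</id>" % escape(str(one_id)))
    for query in queries or ():
        parts.append("<query>%s</query>" % escape(str(query)))
    if not parts:
        return None  # nothing was provided, so no request would be sent
    return "<delete>%s</delete>" % "".join(parts)
```

Values are XML-escaped, so a query containing & or < does not break the message.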

Solr.delete_many(ids)

Delete documents using an iterable of ids.

This is equivalent to delete(ids=ids).

Supports commit-control arguments.

Solr.delete_query(query)

Delete all documents identified by a query.

This is equivalent to delete(queries=[query]).

Supports commit-control arguments.

Solr.commit(wait_flush=True, wait_searcher=True)

Issue a commit command to the Solr server.

wait_flush and wait_searcher have the same interpretations as the like-named commit-control arguments.

Solr.optimize(wait_flush=True, wait_searcher=True)

Issue an optimize command to the Solr server.

wait_flush and wait_searcher have the same interpretations as the like-named commit-control arguments.

Solr.close()

Close the underlying HTTP(S) connection.

Methods specific to Solr

These methods are specific to the Solr class; similarly-named methods on SolrConnection may exist with different signatures.

Solr.select

A SearchHandler instance for the commonly-defined select request handler on the server.

Solr.add(doc)

Add a document to the Solr server. Document fields should be given as the keys and values of the dictionary doc.

Example:

doc = {"id": "mydoc", "author": "Me"}
connection.add(doc)

Supports commit-control arguments.

Solr.add_many(docs)

Add several documents to the Solr server.

docs
An iterable of document dictionaries.

Supports commit-control arguments.

Compatibility support

class solr.SolrConnection(url)

This class is used by older applications of solrpy; newer applications should use solr.Solr.

The constructor arguments and most methods are the same as for solr.Solr; only these method signatures differ:

SolrConnection.add(_commit=False, **fields)

Add or update a single document with field values given by keyword arguments.

The _commit argument is treated specially, causing an immediate commit if present. It may be specified either positionally or as a keyword. If _commit is true, the commit will be issued as part of the same HTTP request to the Solr server.

Example:

connection.add(id="mydoc", author="Me")

This is equivalent to solr.Solr.add(fields, commit=_commit).

Unlike the same-named method of Solr, this does not support commit-control arguments.

SolrConnection.add_many(docs, _commit=False)

Add or update multiple documents, with field values for each given by dictionaries in the sequence docs.

The _commit argument is treated specially, causing an immediate commit if present. It may be specified either positionally or as a keyword. If _commit is true, the commit will be issued as part of the same HTTP request to the Solr server.

Example:

doc1 = {...}
doc2 = {...}
connection.add_many([doc1, doc2], _commit=True)

This is equivalent to solr.Solr.add_many(docs, commit=_commit).

Unlike the same-named method of Solr, this does not support commit-control arguments.

SolrConnection.query(q, fields=None, highlight=None, score=True, sort=None, sort_order="asc", **params)

Call the select search handler, returning the result of that call.

SolrConnection.raw_query(**params)

Call the raw method of the select search handler, returning the result of that call.

Search handlers

A search handler provides access to a named search on the Solr server. Most servers are configured with a search named select, but different searches may be defined that require different arguments or different default parameters.

The SearchHandler class provides access to a named search. Handlers are constructed simply, and can be saved and used as many times as needed.

class solr.SearchHandler(connection, path)

Construct a search handler for connection with the relative path given by path. For example, to use the commonly-defined select search, construct a handler like this:

import solr
conn = solr.Solr("http://solr.example.net/solr")
select = solr.SearchHandler(conn, "/select")

This is exactly how the select attribute of Solr instances is constructed. An alternate request handler can be used by providing an alternate path:

find_stuff = solr.SearchHandler(conn, "/find_stuff")

The slash at the beginning of the path value is required if the URL given to the connection constructor does not end with a slash.

SearchHandler.__call__(q=None, fields=None, highlight=None, score=True, sort=None, sort_order="asc", **params)

q is the query string in the format configured for the request handler in the Solr server.

fields is an optional list of fields to include. It can be either a string in the format that Solr expects, or an iterable of field names. Defaults to all fields ('*').

score indicates whether score should be included in the field list. Note that if you explicitly list “score” in your fields value, then score is effectively ignored. Defaults to True.
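The interaction between fields and score can be sketched as follows. build_field_list is a hypothetical helper; the exact string solrpy sends as the field list is an assumption.

```python
def build_field_list(fields=None, score=True):
    """Return a comma-separated field list for the query."""
    if fields is None:
        names = ["*"]  # default: all fields
    elif isinstance(fields, str):
        # Accept either commas or whitespace as separators.
        names = fields.replace(",", " ").split()
    else:
        names = list(fields)
    # score is appended only if it is wanted and not already listed, so an
    # explicit "score" in fields wins over score=False.
    if score and "score" not in names:
        names.append("score")
    return ",".join(names)
```

Calling build_field_list("id score", score=False) still includes score, which is the sense in which the score argument is ignored.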

highlight indicates whether highlighting should be included. highlight can be False, indicating “No” (the default); a list of fields in the same format as fields; or True, indicating that any fields included in fields should be highlighted. If highlight is True and no fields are given, a ValueError is raised.
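The three cases for highlight can be written out as a short sketch. resolve_highlight is a hypothetical helper, and the use of Solr's hl and hl.fl parameters here is an assumption about what is ultimately sent.

```python
def resolve_highlight(highlight=None, fields=None):
    """Return the highlighting parameters, or None when highlighting is off."""
    if not highlight:
        return None  # False or None: no highlighting
    if highlight is True:
        if not fields:
            raise ValueError("highlight=True requires fields to be given")
        chosen = fields  # highlight whatever fields were requested
    else:
        chosen = highlight  # an explicit list of fields to highlight
    if isinstance(chosen, str):
        names = chosen.replace(",", " ").split()
    else:
        names = list(chosen)
    return {"hl": "true", "hl.fl": ",".join(names)}
```

The ValueError case exists because True means "the same fields as fields", which has no meaning when fields is absent.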

sort is a list of fields to sort by. See fields for formatting. Each sort element can be in the form “fieldname asc|desc” as specified by Solr specs.

sort_order is the backward-compatible way to apply the same ordering to every sort field that does not specify one.
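One way to picture how sort and sort_order combine is the sketch below. build_sort is a hypothetical helper and only handles plain field names, not function queries.

```python
def build_sort(sort=None, sort_order="asc"):
    """Return a Solr sort string, or None if no sort was requested."""
    if not sort:
        return None
    if isinstance(sort, str):
        elements = sort.split(",")
    else:
        elements = list(sort)
    clauses = []
    for element in elements:
        parts = element.split()
        if not parts:
            continue  # skip empty elements such as a trailing comma
        if len(parts) == 1:
            parts.append(sort_order)  # no direction given: use sort_order
        clauses.append(" ".join(parts))
    return ",".join(clauses)
```

Elements that already carry asc or desc are left alone; only bare field names pick up sort_order.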

Optional parameters can also be passed in. Many Solr parameters are in a dotted notation (for example, hl.simple.post). For such parameters, replace the dots with underscores when calling this method:

r = conn.query('text:solrpy', hl_simple_post='</pre>')
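The underscore-to-dot translation can be pictured with the following sketch. to_solr_params is a hypothetical helper, and replacing every underscore is an assumption; a parameter whose real name contains an underscore would not survive this translation.

```python
from urllib.parse import urlencode


def to_solr_params(**params):
    """Map Python keyword names to Solr's dotted parameter names."""
    return {name.replace("_", "."): value for name, value in params.items()}


solr_params = to_solr_params(hl_simple_post="</pre>", rows=10)
# Encoded as form fields for transfer over HTTP.
encoded = urlencode(solr_params)
```

Here solr_params holds the keys hl.simple.post and rows, and encoded is the form-encoded string built from them.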

Returns a Response instance.

SearchHandler.raw(**params)

Issue a query against a Solr server. No logical interpretation of the parameters is performed, but encoding for transfer as form fields over HTTP is handled.

Return the raw result as text. No processing is performed on the response.