See correcting errors in user queries.
This module contains helper functions for correcting typos in user queries.
Base class for spelling correction objects. Concrete sub-classes should
implement the _suggestions method.
suggest(text, limit=5, maxdist=2, prefix=0)
- text – the text to check. This word will not be added to the
suggestions, even if it appears in the word graph.
- limit – only return up to this many suggestions. If there are not
enough terms in the field within maxdist of the given word, the
returned list will be shorter than this number.
- maxdist – the largest edit distance from the given word to look
at. Numbers higher than 2 are not very effective or efficient.
- prefix – require suggestions to share a prefix of this length
with the given word. This is often justifiable since most
misspellings do not involve the first letter of the word. Using a
prefix dramatically decreases the time it takes to generate the
list of words.
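The way limit, maxdist, and prefix interact can be sketched in plain Python. This is not Whoosh's implementation: the `edit_distance` and `suggest` functions and the `WORDS` list are hypothetical stand-ins that scan a small in-memory list, and counting an adjacent transposition as a single edit is an assumption made for the illustration.

```python
def edit_distance(a, b):
    """Edit distance where insertions, deletions, substitutions, and
    adjacent transpositions each cost 1 (optimal string alignment)."""
    prev2 = None                      # row i-2, needed for transpositions
    prev = list(range(len(b) + 1))    # row i-1
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cost = 0 if ca == cb else 1
            best = min(prev[j] + 1,          # deletion
                       cur[j - 1] + 1,       # insertion
                       prev[j - 1] + cost)   # substitution or match
            if (prev2 is not None and j > 1
                    and ca == b[j - 2] and a[i - 2] == cb):
                best = min(best, prev2[j - 2] + 1)  # transposition
            cur.append(best)
        prev2, prev = prev, cur
    return prev[-1]


def suggest(words, text, limit=5, maxdist=2, prefix=0):
    """Toy stand-in for a corrector's suggest() over a plain word list."""
    scored = []
    for word in set(words):
        if word == text:
            continue  # the checked word itself is never suggested
        if word[:prefix] != text[:prefix]:
            continue  # candidates must share the first `prefix` characters
        dist = edit_distance(text, word)
        if dist <= maxdist:
            scored.append((dist, word))
    scored.sort()  # closest first, ties broken alphabetically
    return [word for _, word in scored[:limit]]


# Hypothetical vocabulary standing in for the terms of an indexed field.
WORDS = ["render", "reader", "rendering", "tender", "renders", "lender"]
print(suggest(WORDS, "rendr", prefix=1))
```

With prefix=1, candidates such as "tender" and "lender" are rejected by a cheap string comparison before any edit distance is computed, which is where the time saving in this toy version comes from.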
class whoosh.spelling.ReaderCorrector(reader, fieldname)
Suggests corrections based on the content of a field in a reader.
Ranks suggestions by the edit distance, then by highest to lowest frequency.
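A minimal sketch of a two-level ordering of this kind, assuming the second sort key is how often each term occurs in the field. The `rank` helper and the sample tuples are hypothetical and are not part of whoosh.spelling.

```python
def rank(candidates):
    """Order (word, edit_distance, frequency) tuples: nearest first,
    then most frequent first among words at the same distance."""
    ordered = sorted(candidates, key=lambda c: (c[1], -c[2]))
    return [word for word, _dist, _freq in ordered]


# Hypothetical candidates: (word, edit distance, term frequency).
CANDIDATES = [
    ("reader", 2, 40),
    ("render", 1, 12),
    ("tender", 2, 95),
    ("renders", 2, 40),
]
ranked = rank(CANDIDATES)
print(ranked)
```

Because Python's sort is stable, words that tie on both distance and frequency keep their input order.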
Suggests corrections based on the content of a raw
By default ranks suggestions based on the edit distance.
Merges suggestions from a list of sub-correctors.
Base class for objects that correct words in a user query.
Returns a Correction object representing the corrected
form of the given query.
- q – the original whoosh.query.Query tree to be corrected.
- qstring – the original user query. This may be None if the
original query string is not available, in which case the
Correction.string attribute will also be None.
class whoosh.spelling.SimpleQueryCorrector(correctors, terms, prefix=0, maxdist=2)
A simple query corrector based on a mapping of field names to
Corrector objects, and a list of ("fieldname", "text") tuples
to correct. Any terms in the query that appear in the list of term tuples
are corrected using the appropriate corrector.
- correctors – a dictionary mapping field names to Corrector objects.
- terms – a sequence of ("fieldname", "text") tuples
representing terms to be corrected.
- prefix – suggested replacement words must share this number of
initial characters with the original word. Increasing this even to
just 1 can dramatically speed up suggestions, and may be
justifiable since spelling mistakes rarely involve the first
letter of a word.
- maxdist – the maximum number of “edits” (insertions, deletions,
substitutions, or transpositions of letters) allowed between the
original word and any suggestion. Values higher than 2 may be slow.
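The roles of the correctors mapping and the terms list can be illustrated with a simplified sketch. It assumes a query can be flattened to a list of ("fieldname", "text") pairs, which a real whoosh.query.Query tree is not, and it uses a hypothetical `ToyCorrector` with a hard-coded lookup table in place of a real Corrector. The `correct_terms` function is likewise an illustration, not the library's code.

```python
class ToyCorrector:
    """Hypothetical stand-in exposing a suggest() method."""

    def __init__(self, table):
        self.table = table  # maps a misspelling to a list of suggestions

    def suggest(self, text, limit=5, maxdist=2, prefix=0):
        # maxdist and prefix are accepted for signature parity but ignored
        # by this lookup-table toy.
        return self.table.get(text, [])[:limit]


def correct_terms(query_terms, correctors, terms, prefix=0, maxdist=2):
    """Replace each query term listed in `terms` with the top suggestion
    from the corrector registered for its field, if there is one."""
    to_fix = set(terms)
    corrected = []
    for fieldname, text in query_terms:
        if (fieldname, text) in to_fix and fieldname in correctors:
            sugs = correctors[fieldname].suggest(
                text, limit=1, maxdist=maxdist, prefix=prefix)
            if sugs:
                text = sugs[0]
        corrected.append((fieldname, text))
    return corrected


correctors = {"content": ToyCorrector({"rendr": ["render"]})}
query_terms = [("content", "rendr"), ("title", "rendr"), ("content", "shape")]
terms = [("content", "rendr"), ("title", "rendr")]
result = correct_terms(query_terms, correctors, terms, prefix=1)
print(result)
```

In this sketch the "title" term is left unchanged because no corrector is registered for that field, and "shape" is left unchanged because it is not in the list of terms to correct.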
class whoosh.spelling.Correction(q, qstring, corr_q, tokens)
Represents the corrected version of a user query string. Has the
following attributes:
- The corrected whoosh.query.Query object.
- The corrected user query string.
- The original whoosh.query.Query object that was corrected.
- The original user query string.
- A list of token objects representing the corrected words.
You can also use the Correction.format_string() method to reformat the
corrected query string using a whoosh.highlight.Formatter object.
For example, to display the corrected query string as HTML with the
changed words emphasized:
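from whoosh import highlight

# mysearcher is an open searcher, q is the parsed query, and qstring is
# the query string as the user typed it.
correction = mysearcher.correct_query(q, qstring)
# Mark the changed words with the CSS class "change".
hf = highlight.HtmlFormatter(classname="change")
html = correction.format_string(hf)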