Lang Analyzer

A set of analyzers, each aimed at text in a specific language. The following types are supported: arabic, armenian, basque, bulgarian, brazilian, catalan, chinese, cjk, czech, danish, dutch, english, finnish, french, galician, german, greek, persian, hindi, hungarian, indonesian, italian, norwegian, portuguese, romanian, russian, spanish, swedish, turkish, thai.

All analyzers support custom stopwords, set either inline in the config or in an external stopwords file referenced by stopwords_path.
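The two ways of supplying stopwords can be sketched as a small Python helper that builds the analyzer settings as a plain dictionary. This is a minimal sketch under stated assumptions: the helper name language_analyzer_settings, the analyzer names my_czech and my_russian, the sample stopwords, and the file path are all hypothetical, and the "analysis" / "analyzer" nesting and the inline "stopwords" key are assumed rather than taken from this page. Only stopwords_path is named in the text above.

```python
import json


def language_analyzer_settings(name, language, stopwords=None, stopwords_path=None):
    """Build a settings fragment for one language analyzer (hypothetical helper).

    `name` is the label the analyzer is registered under, `language` is one of
    the supported types listed above (for example "czech").
    """
    analyzer = {"type": language}
    if stopwords is not None:
        # Inline stopwords, kept directly in the config (assumed key name).
        analyzer["stopwords"] = list(stopwords)
    if stopwords_path is not None:
        # External stopwords file, referenced by path.
        analyzer["stopwords_path"] = stopwords_path
    # Assumed nesting of analyzer definitions inside the index settings.
    return {"analysis": {"analyzer": {name: analyzer}}}


if __name__ == "__main__":
    inline = language_analyzer_settings("my_czech", "czech", stopwords=["a", "i"])
    external = language_analyzer_settings(
        "my_russian", "russian", stopwords_path="stopwords/russian.txt"
    )
    print(json.dumps(inline, indent=2))
    print(json.dumps(external, indent=2))
```

Keeping the list inline suits short, rarely changed lists; a file referenced by stopwords_path keeps long lists out of the settings themselves.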

Arabic Analyzer

The arabic analyzer is built on the arabic_letter tokenizer with the lowercase, stop, arabic_normalizer and arabic_stem filters.

Brazilian Analyzer

The brazilian analyzer is built on the standard tokenizer with the lowercase, standard, stop and brazilian_stem filters.

Chinese Analyzer

The chinese analyzer is built on the chinese tokenizer with the chinese filter.

Cjk Analyzer

The cjk analyzer is built on the cjk tokenizer with the stop filter.

Czech Analyzer

The czech analyzer is built on the standard tokenizer with the standard, lowercase, stop and czech_stem filters. It comes with default stopwords, which can be overridden.

Dutch Analyzer

The dutch analyzer is built on the standard tokenizer with the standard, stop and dutch_stem filters.

French Analyzer

The french analyzer is built on the standard tokenizer with the standard, stop, french_stem and lowercase filters.

German Analyzer

The german analyzer is built on the standard tokenizer with the standard, lowercase, stop and german_stem filters.
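To illustrate how a tokenizer and a chain of filters combine, here is a toy Python sketch of the tokenizer, lowercase, stop sequence. It is not the analyzer's actual implementation: the regular-expression tokenizer is only a stand-in for the standard tokenizer, the standard and german_stem filters are left out, and the function names and the four-word stopword set are hypothetical.

```python
import re

# Tiny illustrative stopword set (hypothetical, not the analyzer's default list).
TOY_STOPWORDS = {"der", "die", "das", "und"}


def tokenize(text):
    # Stand-in for the standard tokenizer: split into runs of word characters.
    return re.findall(r"\w+", text)


def lowercase(tokens):
    # Lowercase filter: normalize case so later filters see uniform tokens.
    return [token.lower() for token in tokens]


def stop(tokens, stopwords):
    # Stop filter: drop every token that appears in the stopword set.
    return [token for token in tokens if token not in stopwords]


def analyze(text, stopwords=TOY_STOPWORDS):
    # Apply the tokenizer first, then each filter in the listed order.
    tokens = tokenize(text)
    tokens = lowercase(tokens)
    return stop(tokens, stopwords)


if __name__ == "__main__":
    print(analyze("Der Hund und die Katze"))  # ['hund', 'katze']
```

In this sketch the order matters: because lowercase runs before stop, the capitalized "Der" is matched against the lowercase stopword "der" and removed.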

Greek Analyzer

The greek analyzer is built on the standard tokenizer with the greek_lowercase and stop filters.

Persian Analyzer

The persian analyzer is built on the arabic_letter tokenizer with the lowercase, arabic_normalization, persian_normalization and stop filters.

Russian Analyzer

The russian analyzer is built on the russian_letter tokenizer with the lowercase, stop and russian_stem filters. It comes with default stopwords, which can be overridden.

Thai Analyzer

The thai analyzer is built on the standard tokenizer with the standard, thai_word and stop filters.
