public class MorfologikAnalyzer extends Analyzer
Analyzer using the Morfologik library.
Nested classes inherited from class Analyzer: Analyzer.ReuseStrategy, Analyzer.TokenStreamComponents
Fields inherited from class Analyzer: GLOBAL_REUSE_STRATEGY, PER_FIELD_REUSE_STRATEGY
Constructor and Description |
---|
MorfologikAnalyzer(): Builds an analyzer with Morfologik's default Polish dictionary. |
MorfologikAnalyzer(Dictionary dictionary): Builds an analyzer with an explicit Dictionary resource. |
Modifier and Type | Method and Description |
---|---|
protected Analyzer.TokenStreamComponents | createComponents(String field): Creates an Analyzer.TokenStreamComponents which tokenizes all the text in the provided Reader. |
protected TokenStream | normalize(String fieldName, TokenStream in): Wrap the given TokenStream in order to apply normalization filters. |
Methods inherited from class Analyzer: attributeFactory, close, getOffsetGap, getPositionIncrementGap, getReuseStrategy, getVersion, initReader, initReaderForNormalization, normalize, setVersion, tokenStream, tokenStream
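public MorfologikAnalyzer(Dictionary dictionary)
Builds an analyzer with an explicit Dictionary resource.
Parameters: dictionary - A prebuilt automaton with inflected and base word forms.
public MorfologikAnalyzer()
Builds an analyzer with Morfologik's default Polish dictionary.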
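As an illustration of the no-argument constructor, the following is a minimal sketch of running text through the analyzer and collecting the emitted terms. It assumes the Lucene Morfologik analysis module and its dependencies are on the classpath; the class name MorfologikAnalyzerDemo, the helper analyze, the field name "body", and the sample sentence are hypothetical choices made for this example. No output is shown because the exact terms depend on the bundled dictionary.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.morfologik.MorfologikAnalyzer;
import org.apache.lucene.analysis.tokenattributes.CharTermAttribute;

public class MorfologikAnalyzerDemo {

    // Hypothetical helper: collects every term the analyzer emits for the text.
    static List<String> analyze(Analyzer analyzer, String text) throws IOException {
        List<String> terms = new ArrayList<>();
        // The field name is arbitrary here; createComponents ignores it.
        try (TokenStream ts = analyzer.tokenStream("body", text)) {
            CharTermAttribute term = ts.addAttribute(CharTermAttribute.class);
            ts.reset();                       // required before the first incrementToken()
            while (ts.incrementToken()) {
                terms.add(term.toString());
            }
            ts.end();                         // finalizes end-of-stream state
        }
        return terms;
    }

    public static void main(String[] args) throws IOException {
        // Default constructor: uses the default Polish dictionary.
        try (Analyzer analyzer = new MorfologikAnalyzer()) {
            System.out.println(analyze(analyzer, "Ala ma kota"));
        }
    }
}
```

The same helper should work with an analyzer built from an explicit Dictionary, since only the constructor call differs.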
protected Analyzer.TokenStreamComponents createComponents(String field)
Creates an Analyzer.TokenStreamComponents which tokenizes all the text in the provided Reader.
Specified by: createComponents in class Analyzer
Parameters: field - ignored field name
Returns: Analyzer.TokenStreamComponents built from a StandardTokenizer filtered with StandardFilter and MorfologikFilter.
protected TokenStream normalize(String fieldName, TokenStream in)
Description copied from class: Analyzer
Wrap the given TokenStream in order to apply normalization filters. The default implementation returns the TokenStream as-is. This is used by Analyzer.normalize(String, String).
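The return description of createComponents names a chain: a tokenizer whose output is passed through filters, the last of which consults a dictionary of inflected and base word forms. The plain-Java sketch below only illustrates that idea and does not use any Lucene or Morfologik classes. Everything in it is an assumption made for the example: the class LemmaChainSketch, the two-entry LEMMAS map standing in for a dictionary, the regex-based tokenizer, and the choice to pass unknown tokens through unchanged.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class LemmaChainSketch {

    // Hypothetical stand-in for a dictionary: inflected form -> base forms.
    static final Map<String, List<String>> LEMMAS = new HashMap<>();
    static {
        LEMMAS.put("koty", Arrays.asList("kot"));
        LEMMAS.put("psy", Arrays.asList("pies"));
    }

    // Stage 1: split text into word tokens (a crude stand-in for a tokenizer).
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String piece : text.split("[^\\p{L}\\p{Nd}]+")) {
            if (!piece.isEmpty()) {
                tokens.add(piece);
            }
        }
        return tokens;
    }

    // Stage 2: replace each token with its base forms.
    // Assumption of this sketch: tokens missing from the map pass through unchanged.
    static List<String> lemmatize(List<String> tokens) {
        List<String> out = new ArrayList<>();
        for (String token : tokens) {
            out.addAll(LEMMAS.getOrDefault(token, Collections.singletonList(token)));
        }
        return out;
    }

    // The chain: tokenizer output feeds the dictionary-backed stage.
    static List<String> analyze(String text) {
        return lemmatize(tokenize(text));
    }

    public static void main(String[] args) {
        System.out.println(analyze("koty i psy")); // prints [kot, i, pies]
    }
}
```

Keeping each stage as a separate function mirrors the way an analyzer composes a tokenizer with filters, so a stage can be swapped without touching the others.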