RussianStemFilter
public final class RussianStemFilter extends TokenFilter
A filter that stems Russian words. The implementation was inspired by GermanStemFilter.
The input should be passed through RussianLowerCaseFilter before it reaches RussianStemFilter,
because RussianStemFilter only handles the lowercase part of any "russian" charset. |
Fields Summary |
---|
private Token | token
The actual token in the input stream.
| private RussianStemmer | stemmer |
Constructors Summary |
---|
public RussianStemFilter(TokenStream in, char[] charset)
super(in);
stemmer = new RussianStemmer(charset);
|
Methods Summary |
---|
public final org.apache.lucene.analysis.Token | next()
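// Pull the next token from the wrapped stream; null signals end of stream.
if ((token = input.next()) == null)
{
    return null;
}
else
{
    String s = stemmer.stem(token.termText());
    // Allocate a new Token only if stemming changed the term text;
    // offsets and type are carried over from the original token.
    if (!s.equals(token.termText()))
    {
        return new Token(s, token.startOffset(), token.endOffset(),
                         token.type());
    }
    return token;
}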
| public void | setStemmer(RussianStemmer stemmer)
Set an alternative/custom RussianStemmer for this filter. A null argument is ignored and the current stemmer is kept.
if (stemmer != null)
{
    this.stemmer = stemmer;
}
|
|
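The wrapping pattern used by next(), and the reason the input should be lowercased first, can be illustrated with a small self-contained sketch. It does not use Lucene: Tok, Stream, fromWords, lowerCase, stem, drain and toyStem are all hypothetical stand-ins, and toyStem is a deliberately trivial suffix stripper over ASCII words, not the RussianStemmer algorithm. The only properties it shares with the filter above are that it matches lowercase suffixes only and that a new token is created only when the text actually changes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.Locale;

// Hypothetical stand-ins for illustration; none of these are Lucene classes.
final class StemFilterSketch {

    /** Minimal token: term text plus character offsets into the input. */
    static final class Tok {
        final String text;
        final int start;
        final int end;

        Tok(String text, int start, int end) {
            this.text = text;
            this.start = start;
            this.end = end;
        }
    }

    /** Pull-style stream: next() returns null when the stream is exhausted. */
    interface Stream {
        Tok next();
    }

    /** Toy stemmer (assumption): strips a lowercase "ing" or "s" suffix. */
    static String toyStem(String word) {
        for (String suffix : new String[] {"ing", "s"}) {
            if (word.length() > suffix.length() + 1 && word.endsWith(suffix)) {
                return word.substring(0, word.length() - suffix.length());
            }
        }
        return word;
    }

    /** Source stream that emits the given words with running offsets. */
    static Stream fromWords(String... words) {
        Iterator<String> it = Arrays.asList(words).iterator();
        int[] offset = {0};
        return () -> {
            if (!it.hasNext()) {
                return null;
            }
            String w = it.next();
            Tok t = new Tok(w, offset[0], offset[0] + w.length());
            offset[0] += w.length() + 1; // assume one separator character
            return t;
        };
    }

    /** Filter that lowercases each token, keeping its offsets. */
    static Stream lowerCase(Stream in) {
        return () -> {
            Tok t = in.next();
            return t == null ? null : new Tok(t.text.toLowerCase(Locale.ROOT), t.start, t.end);
        };
    }

    /** Filter that stems each token, mirroring the shape of next() above. */
    static Stream stem(Stream in) {
        return () -> {
            Tok t = in.next();
            if (t == null) {
                return null;
            }
            String s = toyStem(t.text);
            // Reuse the token when nothing changed; otherwise keep the original offsets.
            return s.equals(t.text) ? t : new Tok(s, t.start, t.end);
        };
    }

    /** Collects the term text of every remaining token. */
    static List<String> drain(Stream in) {
        List<String> out = new ArrayList<>();
        for (Tok t = in.next(); t != null; t = in.next()) {
            out.add(t.text);
        }
        return out;
    }

    public static void main(String[] args) {
        // Lowercased first: prints [walk, dog, cat]
        System.out.println(drain(stem(lowerCase(fromWords("WALKING", "DOGS", "cat")))));
        // Not lowercased: the suffix rules never match, prints [WALKING, DOGS, cat]
        System.out.println(drain(stem(fromWords("WALKING", "DOGS", "cat"))));
    }
}
```

In the second call the uppercase tokens pass through unstemmed, which is the failure mode the class description warns about. With the real classes, the analogous chain would place RussianLowerCaseFilter between the tokenizer and RussianStemFilter.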