DutchStemFilter
public final class DutchStemFilter extends TokenFilter
A filter that stems Dutch words. It supports a table of words that should
not be stemmed at all. The stemmer used can be changed at runtime after the
filter object is created (as long as it is a DutchStemmer). |
Fields Summary |
---|
private Token | token
The actual token in the input stream.
| private DutchStemmer | stemmer
| private Set | exclusions |
Constructors Summary |
---|
public DutchStemFilter(TokenStream _in)
super(_in);
stemmer = new DutchStemmer();
| public DutchStemFilter(TokenStream _in, Set exclusiontable)
Builds a DutchStemFilter that uses an exclusion table.
this(_in);
exclusions = exclusiontable;
| public DutchStemFilter(TokenStream _in, Set exclusiontable, Map stemdictionary)
this(_in, exclusiontable);
stemmer.setStemDictionary(stemdictionary);
|
Methods Summary |
---|
public org.apache.lucene.analysis.Token | next()
if ((token = input.next()) == null) {
return null;
}
// Check the exclusion table
else if (exclusions != null && exclusions.contains(token.termText())) {
return token;
} else {
String s = stemmer.stem(token.termText());
// If not stemmed, don't waste the time creating a new token
if (!s.equals(token.termText())) {
return new Token(s, token.startOffset(),
token.endOffset(), token.type());
}
return token;
}
| public void | setExclusionTable(java.util.HashSet exclusiontable)
Set an alternative exclusion list for this filter.
exclusions = exclusiontable;
| public void | setStemDictionary(java.util.HashMap dict)
Set a dictionary for stemming. This dictionary overrules the algorithm,
so you can correct a particular unwanted word-stem pair.
if (stemmer != null)
stemmer.setStemDictionary(dict);
| public void | setStemmer(DutchStemmer stemmer)
Set an alternative/custom DutchStemmer for this filter.
if (stemmer != null) {
this.stemmer = stemmer;
}
|
|
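
The decision flow in next() (exclusion check first, then stemming, with the stem dictionary taking precedence over the algorithm) can be sketched without Lucene on the classpath. The following is a minimal illustration only: the class StemFlowSketch, its methods, the sample words, and the toy "strip a trailing en" rule are all hypothetical stand-ins, not the real DutchStemmer algorithm or API.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical, Lucene-free sketch of the branches taken in DutchStemFilter.next().
class StemFlowSketch {

    // Stand-in for DutchStemmer.stem(): an override dictionary is consulted
    // first, then a toy rule strips a trailing "en". The toy rule is an
    // assumption for illustration; the real Dutch stemming rules differ.
    static String stem(String term, Map<String, String> stemDictionary) {
        if (stemDictionary != null && stemDictionary.containsKey(term)) {
            return stemDictionary.get(term);
        }
        if (term.length() > 4 && term.endsWith("en")) {
            return term.substring(0, term.length() - 2);
        }
        return term;
    }

    // Mirrors the filter: excluded terms pass through unchanged,
    // everything else goes to the stemmer.
    static String process(String term, Set<String> exclusions,
                          Map<String, String> stemDictionary) {
        if (exclusions != null && exclusions.contains(term)) {
            return term;
        }
        return stem(term, stemDictionary);
    }

    public static void main(String[] args) {
        Set<String> exclusions = new HashSet<>();
        exclusions.add("lopen");            // never stemmed

        Map<String, String> dictionary = new HashMap<>();
        dictionary.put("wezen", "wees");    // hypothetical override entry

        for (String term : new String[] {"boeken", "lopen", "wezen", "huis"}) {
            System.out.println(term + " -> " + process(term, exclusions, dictionary));
        }
    }
}
```

In the real filter the same ordering means a word placed in the exclusion table is returned as-is even if the stem dictionary also contains an entry for it, because the stemmer is never consulted for excluded terms.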