File: DutchStemFilter.java
Doc: API Doc
Category: Apache Lucene 2.1.0
Size: 3372
Date: Wed Feb 14 10:46:32 GMT 2007
Package: org.apache.lucene.analysis.nl

DutchStemFilter

public final class DutchStemFilter extends TokenFilter
A filter that stems Dutch words. It supports a table of words that should not be stemmed at all. The stemmer used can be changed at runtime after the filter object is created (as long as it is a DutchStemmer).
author
Edwin de Jonge

Fields Summary
private Token token
The actual token in the input stream.
private DutchStemmer stemmer
private Set exclusions
Constructors Summary
public DutchStemFilter(TokenStream _in)


     
    super(_in);
    stemmer = new DutchStemmer();
  
public DutchStemFilter(TokenStream _in, Set exclusiontable)
Builds a DutchStemFilter that uses an exclusion table.

    this(_in);
    exclusions = exclusiontable;
  
public DutchStemFilter(TokenStream _in, Set exclusiontable, Map stemdictionary)

param
stemdictionary Dictionary of word-stem pairs that overrules the algorithm

    this(_in, exclusiontable);
    stemmer.setStemDictionary(stemdictionary);
  
Methods Summary
public org.apache.lucene.analysis.Token next()

return
Returns the next token in the stream, or null at EOS

    if ((token = input.next()) == null) {
      return null;
    }

    // Check the exclusiontable
    else if (exclusions != null && exclusions.contains(token.termText())) {
      return token;
    } else {
      String s = stemmer.stem(token.termText());
      // If not stemmed, don't waste the time creating a new token
      if (!s.equals(token.termText())) {
        return new Token(s, token.startOffset(),
            token.endOffset(), token.type());
      }
      return token;
    }
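
The three outcomes of next() (pass an excluded term through, reuse the token when stemming changes nothing, build a new token otherwise) can be sketched without Lucene on the classpath. Everything below is a stand-in written for illustration: StemDecisionSketch, SimpleToken, process and toyStem are hypothetical names, and toyStem is a deliberately crude suffix stripper, not the algorithm DutchStemmer uses.

```java
import java.util.Set;
import java.util.function.UnaryOperator;

/** Stand-in sketch of the decision made in next(); none of these names are Lucene API. */
final class StemDecisionSketch {

    /** Minimal stand-in for a token: term text plus character offsets. */
    static final class SimpleToken {
        final String text;
        final int start;
        final int end;

        SimpleToken(String text, int start, int end) {
            this.text = text;
            this.start = start;
            this.end = end;
        }
    }

    /**
     * Mirrors the branches of next(): null at end of stream, excluded terms
     * pass through, unchanged stems reuse the incoming token, and changed
     * stems produce a new token that keeps the original offsets.
     */
    static SimpleToken process(SimpleToken token, Set<String> exclusions,
                               UnaryOperator<String> stemmer) {
        if (token == null) {
            return null; // end of stream
        }
        if (exclusions != null && exclusions.contains(token.text)) {
            return token; // excluded: never stemmed
        }
        String stem = stemmer.apply(token.text);
        if (stem.equals(token.text)) {
            return token; // nothing changed: avoid allocating a new token
        }
        return new SimpleToken(stem, token.start, token.end);
    }

    /** Toy stemmer that strips a trailing "en" from longer words (an assumption for the demo). */
    static String toyStem(String word) {
        return word.endsWith("en") && word.length() > 4
                ? word.substring(0, word.length() - 2)
                : word;
    }
}
```

With an exclusion set containing "boeken", the sketch returns the "boeken" token untouched, turns "fietsen" into a new token with text "fiets" and the same offsets, and hands back the original "huis" token because the toy stemmer leaves it unchanged.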
  
public void setExclusionTable(java.util.HashSet exclusiontable)
Set an alternative exclusion list for this filter.

    exclusions = exclusiontable;
  
public void setStemDictionary(java.util.HashMap dict)
Sets the dictionary for stemming. This dictionary overrules the algorithm, so you can correct a particular unwanted word-stem pair.

    if (stemmer != null)
      stemmer.setStemDictionary(dict);
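
What "overrules the algorithm" means can be illustrated with a small stand-alone sketch. It assumes the dictionary is consulted first and the algorithm runs only when the word has no entry; the listing here does not show how DutchStemmer applies the dictionary internally, so treat that ordering as an assumption. DictionaryOverrideSketch and its stem method are hypothetical names, and the sketch uses generics where the original API takes a raw HashMap.

```java
import java.util.Map;
import java.util.function.UnaryOperator;

/** Stand-in sketch of a dictionary-first stemmer; not the DutchStemmer implementation. */
final class DictionaryOverrideSketch {

    /**
     * Returns the dictionary's stem when the word has an entry, otherwise
     * falls back to the supplied stemming algorithm.
     */
    static String stem(String word, Map<String, String> dictionary,
                       UnaryOperator<String> algorithm) {
        if (dictionary != null) {
            String override = dictionary.get(word);
            if (override != null) {
                return override; // hand-picked stem wins over the algorithm
            }
        }
        return algorithm.apply(word);
    }
}
```

A caller would add one entry per word whose algorithmic stem is unwanted and leave every other word to the algorithm.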
  
public void setStemmer(DutchStemmer stemmer)
Set an alternative/custom DutchStemmer for this filter.

    if (stemmer != null) {
      this.stemmer = stemmer;
    }