File: GermanStemFilter.java
Doc: API Doc
Category: Apache Lucene 2.1.0
Size: 3023
Date: Wed Feb 14 10:46:30 GMT 2007
Package: org.apache.lucene.analysis.de

GermanStemFilter

public final class GermanStemFilter extends TokenFilter
A filter that stems German words. It supports a table of words that should not be stemmed at all. The stemmer used can be changed at runtime after the filter object is created (as long as it is a GermanStemmer).
author
Gerhard Schwarz
version
$Id: GermanStemFilter.java 472959 2006-11-09 16:21:50Z yonik $

Fields Summary
private Token token
The actual token in the input stream.
private GermanStemmer stemmer
private Set exclusionSet
Constructors Summary
public GermanStemFilter(TokenStream in)
Builds a GermanStemFilter that uses the default GermanStemmer and no exclusion set.

      super(in);
      stemmer = new GermanStemmer();
    
public GermanStemFilter(TokenStream in, Set exclusionSet)
Builds a GermanStemFilter that uses an exclusion table.

      this( in );
      this.exclusionSet = exclusionSet;
    
Methods Summary
public final org.apache.lucene.analysis.Token next()

return
Returns the next token in the stream, or null at EOS

      if ( ( token = input.next() ) == null ) {
        return null;
      }
      // Check the exclusion table
      else if ( exclusionSet != null && exclusionSet.contains( token.termText() ) ) {
        return token;
      }
      else {
        String s = stemmer.stem( token.termText() );
        // If not stemmed, don't waste the time creating a new token
        if ( !s.equals( token.termText() ) ) {
          return new Token( s, token.startOffset(),
            token.endOffset(), token.type() );
        }
        return token;
      }
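
The branch order in next() can be sketched in isolation. This is a minimal, self-contained illustration and not the Lucene implementation: StemDecisionSketch, process, and TOY_STEMMER are hypothetical names, plain strings stand in for Token objects, and the toy stemmer only strips a trailing "en" in place of the real GermanStemmer rules.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;
import java.util.function.UnaryOperator;

class StemDecisionSketch {
    // Hypothetical stand-in for GermanStemmer.stem(): strips a trailing "en"
    // from words longer than four characters. The real stemmer does far more.
    static final UnaryOperator<String> TOY_STEMMER =
        term -> (term.length() > 4 && term.endsWith("en"))
            ? term.substring(0, term.length() - 2)
            : term;

    // Same branch order as next(): end of stream first, then the exclusion
    // check, and only then the stemmer.
    static String process(String term, Set<String> exclusionSet,
                          UnaryOperator<String> stemmer) {
        if (term == null) {
            return null;                       // end of stream
        }
        if (exclusionSet != null && exclusionSet.contains(term)) {
            return term;                       // excluded: passed through unchanged
        }
        return stemmer.apply(term);            // everything else is stemmed
    }

    public static void main(String[] args) {
        Set<String> exclusions = new HashSet<String>(Arrays.asList("garten"));
        System.out.println(process("laufen", exclusions, TOY_STEMMER)); // prints "lauf"
        System.out.println(process("garten", exclusions, TOY_STEMMER)); // prints "garten"
        System.out.println(process("garten", null, TOY_STEMMER));       // prints "gart"
    }
}
```

Note that the exclusion check compares the exact term text, so whether an entry matches depends on any case folding applied earlier in the token stream.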
    
public void setExclusionSet(java.util.Set exclusionSet)
Set an alternative exclusion list for this filter.

      this.exclusionSet = exclusionSet;
    
public void setStemmer(GermanStemmer stemmer)
Set an alternative/custom GermanStemmer for this filter.

      if ( stemmer != null ) {
        this.stemmer = stemmer;
      }
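
The two setters behave differently on null: setStemmer silently ignores a null argument, while setExclusionSet accepts it, which switches the exclusion check off. The sketch below illustrates this under stated assumptions. SwappableStemmerSketch and its apply method are hypothetical, a UnaryOperator stands in for GermanStemmer, and lower-casing is used as a placeholder "stemmer".

```java
import java.util.Collections;
import java.util.Locale;
import java.util.Set;
import java.util.function.UnaryOperator;

class SwappableStemmerSketch {
    // Hypothetical default: identity. The real filter starts with a GermanStemmer.
    private UnaryOperator<String> stemmer = term -> term;
    // null means "no exclusions", as in the filter above.
    private Set<String> exclusionSet;

    // Mirrors setStemmer: a null argument leaves the current stemmer in place.
    void setStemmer(UnaryOperator<String> stemmer) {
        if (stemmer != null) {
            this.stemmer = stemmer;
        }
    }

    // Mirrors setExclusionSet: assigned unconditionally, null included.
    void setExclusionSet(Set<String> exclusionSet) {
        this.exclusionSet = exclusionSet;
    }

    String apply(String term) {
        if (exclusionSet != null && exclusionSet.contains(term)) {
            return term;
        }
        return stemmer.apply(term);
    }

    public static void main(String[] args) {
        SwappableStemmerSketch filter = new SwappableStemmerSketch();
        filter.setStemmer(term -> term.toLowerCase(Locale.ROOT));
        filter.setStemmer(null);                                // ignored
        System.out.println(filter.apply("HAUS"));               // prints "haus"
        filter.setExclusionSet(Collections.singleton("HAUS"));
        System.out.println(filter.apply("HAUS"));               // prints "HAUS"
        filter.setExclusionSet(null);                           // exclusions disabled
        System.out.println(filter.apply("HAUS"));               // prints "haus"
    }
}
```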