File: FrenchStemFilter.java
Doc: API Doc
Category: Apache Lucene 2.1.0
Size: 2773
Date: Wed Feb 14 10:46:28 GMT 2007
Package: org.apache.lucene.analysis.fr

FrenchStemFilter

public final class FrenchStemFilter extends TokenFilter
A filter that stems French words. It supports a table of words that should not be stemmed at all. The stemmer can be replaced at runtime after the filter object is created, as long as the replacement is a FrenchStemmer.
author
Patrick Talbot (based on Gerhard Schwarz's work for German)

Fields Summary
private Token token
The actual token in the input stream.
private FrenchStemmer stemmer
private Set exclusions
Constructors Summary
public FrenchStemFilter(TokenStream in)

		super( in );
		stemmer = new FrenchStemmer();
	
public FrenchStemFilter(TokenStream in, Set exclusiontable)

		this( in );
		exclusions = exclusiontable;
	
Methods Summary
public final org.apache.lucene.analysis.Token next()

return
The next token in the stream, or null at the end of the stream (EOS).

		if ( ( token = input.next() ) == null ) {
			return null;
		}
		// Check the exclusion table
		else if ( exclusions != null && exclusions.contains( token.termText() ) ) {
			return token;
		}
		else {
			String s = stemmer.stem( token.termText() );
			// If the term was not changed by stemming, don't waste time creating a new token
			if ( !s.equals( token.termText() ) ) {
			   return new Token( s, token.startOffset(), token.endOffset(), token.type());
			}
			return token;
		}
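
The three branches of next() can be sketched without Lucene on the classpath. This is a minimal sketch, not the library's code: `Tok` is a hypothetical stand-in for Lucene's Token, `filter` is a hypothetical helper that handles one token at a time, and `toyStem` is a deliberately naive stemmer that only drops a trailing "s". It is not the FrenchStemmer algorithm.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;
import java.util.function.UnaryOperator;

class StemFilterSketch {
    /** Hypothetical stand-in for a Lucene Token: term text, offsets and type. */
    static final class Tok {
        final String text;
        final int start;
        final int end;
        final String type;

        Tok(String text, int start, int end, String type) {
            this.text = text;
            this.start = start;
            this.end = end;
            this.type = type;
        }
    }

    /** Applies the same three branches as next() to one input token. */
    static Tok filter(Tok in, Set<String> exclusions, UnaryOperator<String> stemmer) {
        if (in == null) {
            return null; // end of stream
        }
        if (exclusions != null && exclusions.contains(in.text)) {
            return in; // excluded terms pass through untouched
        }
        String s = stemmer.apply(in.text);
        if (!s.equals(in.text)) {
            // Only the term text changes; offsets and type are carried over.
            return new Tok(s, in.start, in.end, in.type);
        }
        return in; // stem equals the input, so the original token is reused
    }

    /** Toy stemmer that drops one trailing "s"; NOT the FrenchStemmer algorithm. */
    static String toyStem(String term) {
        if (term.length() > 1 && term.endsWith("s")) {
            return term.substring(0, term.length() - 1);
        }
        return term;
    }

    public static void main(String[] args) {
        Set<String> exclusions = new HashSet<>(Collections.singleton("paris"));
        String[] terms = {"maisons", "paris", "maison"};
        for (String term : terms) {
            Tok in = new Tok(term, 0, term.length(), "word");
            Tok out = filter(in, exclusions, StemFilterSketch::toyStem);
            System.out.println(term + " -> " + out.text
                    + (out == in ? " (same token)" : " (new token)"));
        }
    }
}
```

In this sketch a new token object is allocated only when the stem differs from the input term, which is the point of the comparison in the original method body.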
	
public void setExclusionTable(java.util.Hashtable exclusiontable)
Set an alternative exclusion list for this filter. Only the keys of the table are used; they are copied into a new HashSet.

		exclusions = new HashSet(exclusiontable.keySet());
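
The key-copying step can be illustrated with the standard collections alone. This is a small sketch under the assumption that the exclusion terms are strings; `ExclusionTableSketch` and `toExclusionSet` are hypothetical names, not part of Lucene. It shows that the values of the table are ignored and that the resulting set is a snapshot: entries added to the table afterwards do not appear in it.

```java
import java.util.HashSet;
import java.util.Hashtable;
import java.util.Set;

class ExclusionTableSketch {
    /** Copies the table's keys into a new set; the values are ignored. */
    static Set<String> toExclusionSet(Hashtable<String, ?> table) {
        return new HashSet<>(table.keySet());
    }

    public static void main(String[] args) {
        Hashtable<String, Object> table = new Hashtable<>();
        table.put("paris", "paris");
        table.put("lyon", Boolean.TRUE); // the value type does not matter

        Set<String> exclusions = toExclusionSet(table);

        // Added after the copy was taken, so it is not in the set.
        table.put("nice", "nice");

        System.out.println(exclusions.contains("paris"));
        System.out.println(exclusions.contains("nice"));
    }
}
```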
	
public void setStemmer(FrenchStemmer stemmer)
Set an alternative/custom FrenchStemmer for this filter. A null argument is ignored and the current stemmer is kept.

		if ( stemmer != null ) {
			this.stemmer = stemmer;
		}
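
The null guard means a caller cannot clear the stemmer by passing null. A minimal sketch of that behaviour follows, with a `UnaryOperator<String>` standing in for FrenchStemmer; the class `StemmerHolderSketch` and its `stem` method are hypothetical and exist only for illustration.

```java
import java.util.function.UnaryOperator;

class StemmerHolderSketch {
    // Stand-in for the filter's stemmer field; defaults to returning the term unchanged.
    private UnaryOperator<String> stemmer = term -> term;

    /** Replaces the stemmer unless the argument is null, mirroring the guard above. */
    void setStemmer(UnaryOperator<String> stemmer) {
        if (stemmer != null) {
            this.stemmer = stemmer;
        }
    }

    /** Runs the current stemmer on one term. */
    String stem(String term) {
        return stemmer.apply(term);
    }

    public static void main(String[] args) {
        StemmerHolderSketch holder = new StemmerHolderSketch();
        holder.setStemmer(term -> term.toUpperCase()); // custom stand-in stemmer
        holder.setStemmer(null);                       // ignored: previous stemmer is kept
        System.out.println(holder.stem("abc"));
    }
}
```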