File: RussianLowerCaseFilter.java
Doc: API Doc
Category: Apache Lucene 1.9
Size: 1774
Date: Mon Feb 20 09:18:52 GMT 2006
Package: org.apache.lucene.analysis.ru

RussianLowerCaseFilter

public final class RussianLowerCaseFilter extends TokenFilter
Normalizes token text to lower case according to the given ("Russian") charset.
author
Boris Okner, b.okner@rogers.com
version
$Id: RussianLowerCaseFilter.java 150998 2004-08-16 20:30:46Z dnaber $

Fields Summary
char[] charset
Constructors Summary
public RussianLowerCaseFilter(TokenStream in, char[] charset)

        super(in);
        this.charset = charset;
    
Methods Summary
public final org.apache.lucene.analysis.Token next() throws java.io.IOException

        Token t = input.next();

        if (t == null)
            return null;

        String txt = t.termText();

        char[] chArray = txt.toCharArray();
        for (int i = 0; i < chArray.length; i++)
        {
            chArray[i] = RussianCharsets.toLowerCase(chArray[i], charset);
        }

        String newTxt = new String(chArray);
        // create a new token with the lower-cased text and the original offsets
        Token newToken = new Token(newTxt, t.startOffset(), t.endOffset());

        return newToken;
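
The per-character loop in next() can be tried outside Lucene with a small standalone sketch. This is not the Lucene source: the class LowerCaseSketch and its methods are hypothetical names, and the helper toLowerCase(char) is only an assumed stand-in for RussianCharsets.toLowerCase(char, char[]) that handles plain Unicode text and ignores the charset argument.

```java
/**
 * Standalone sketch of per-character lower-casing (hypothetical, not Lucene code).
 */
final class LowerCaseSketch {

    // Assumed stand-in for RussianCharsets.toLowerCase(char, char[]);
    // it covers Unicode input only and takes no charset table.
    static char toLowerCase(char c) {
        // Basic Russian capitals U+0410..U+042F sit exactly 0x20 below their small forms.
        if (c >= '\u0410' && c <= '\u042F') {
            return (char) (c + 0x20);
        }
        // Everything else falls back to the JDK mapping.
        return Character.toLowerCase(c);
    }

    // Same shape as the loop in next(): copy to a char array, map each char, rebuild the string.
    static String lowerCase(String text) {
        char[] chars = text.toCharArray();
        for (int i = 0; i < chars.length; i++) {
            chars[i] = toLowerCase(chars[i]);
        }
        return new String(chars);
    }

    public static void main(String[] args) {
        // Six Russian capitals written as escapes to avoid source-encoding issues.
        String upper = "\u041F\u0420\u0418\u0412\u0415\u0422";
        String lower = lowerCase(upper);
        System.out.println(lower.equals("\u043F\u0440\u0438\u0432\u0435\u0442")); // prints true
    }
}
```

In the filter itself the same mapping is applied to t.termText(), and the result is wrapped in a new Token that reuses the start and end offsets of the input token.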