GermanAnalyzer
public class GermanAnalyzer extends Analyzer
Analyzer for the German language. Supports an external list of stopwords (words that
will not be indexed at all) and an external list of exclusions (words that will
be indexed but not stemmed).
A default set of stopwords is used unless an alternative list is specified. The
exclusion list is empty by default. |
Fields Summary |
---|
private String[] | GERMAN_STOP_WORDS
List of typical German stopwords.
| private Set | stopSet
Contains the stopwords used with the StopFilter.
| private Set | exclusionSet
Contains words that should be indexed but not stemmed. |
Constructors Summary |
---|
public GermanAnalyzer()
Builds an analyzer with the default stop words (GERMAN_STOP_WORDS).
stopSet = StopFilter.makeStopSet(GERMAN_STOP_WORDS);
| public GermanAnalyzer(String[] stopwords)
Builds an analyzer with the given stop words.
stopSet = StopFilter.makeStopSet(stopwords);
| public GermanAnalyzer(Hashtable stopwords)
Builds an analyzer with the given stop words, taken from the keys of the Hashtable.
stopSet = new HashSet(stopwords.keySet());
| public GermanAnalyzer(File stopwords)
Builds an analyzer with the stop words read from the given file.
stopSet = WordlistLoader.getWordSet(stopwords);
|
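The four constructors all end with the same thing: a Set of stop words. The following plain-Java sketch illustrates the three conversions without depending on Lucene. The class StopSetSketch and its methods are hypothetical names used only for illustration, and the file format (one word per line, surrounding whitespace trimmed, blank lines skipped) is an assumption about what WordlistLoader expects rather than something this page states.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Hashtable;
import java.util.Set;

// Hypothetical helper class for illustration; not part of Lucene.
class StopSetSketch {

    // Array of words -> set (the role StopFilter.makeStopSet plays above).
    static Set<String> fromArray(String[] words) {
        return new HashSet<>(Arrays.asList(words));
    }

    // Hashtable -> set: only the keys are used, the values are ignored.
    static Set<String> fromHashtable(Hashtable<String, ?> table) {
        return new HashSet<>(table.keySet());
    }

    // Word list -> set. ASSUMPTION: one word per line, whitespace trimmed,
    // blank lines skipped. A Reader stands in for the File argument.
    static Set<String> fromReader(Reader reader) throws IOException {
        Set<String> words = new HashSet<>();
        try (BufferedReader in = new BufferedReader(reader)) {
            String line;
            while ((line = in.readLine()) != null) {
                String word = line.trim();
                if (!word.isEmpty()) {
                    words.add(word);
                }
            }
        }
        return words;
    }

    public static void main(String[] args) throws IOException {
        Hashtable<String, String> table = new Hashtable<>();
        table.put("der", "der");
        table.put("und", "und");

        Set<String> a = fromArray(new String[] {"der", "und"});
        Set<String> b = fromHashtable(table);
        Set<String> c = fromReader(new StringReader("der\n  und \n"));

        // All three inputs describe the same stop words.
        System.out.println(a.equals(b) && b.equals(c));
    }
}
```

Whichever form the caller supplies, the analyzer only ever consults the resulting set, so the choice of constructor is a matter of where the word list happens to live.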
Methods Summary |
---|
public void | setStemExclusionTable(java.lang.String[] exclusionlist)
Builds an exclusion list from an array of Strings.
exclusionSet = StopFilter.makeStopSet(exclusionlist);
| public void | setStemExclusionTable(java.util.Hashtable exclusionlist)
Builds an exclusion list from the keys of a Hashtable.
exclusionSet = new HashSet(exclusionlist.keySet());
| public void | setStemExclusionTable(java.io.File exclusionlist)
Builds an exclusion list from the words contained in the given file.
exclusionSet = WordlistLoader.getWordSet(exclusionlist);
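| public org.apache.lucene.analysis.TokenStream | tokenStream(java.lang.String fieldName, java.io.Reader reader)
Creates a TokenStream which tokenizes all the text in the provided Reader. The stream is a StandardTokenizer filtered with StandardFilter, LowerCaseFilter, StopFilter, and GermanStemFilter, in that order.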
TokenStream result = new StandardTokenizer(reader);
result = new StandardFilter(result);
result = new LowerCaseFilter(result);
result = new StopFilter(result, stopSet);
result = new GermanStemFilter(result, exclusionSet);
return result;
|
|
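The order of the filters in tokenStream matters: terms are lowercased first, then stop words are dropped, and only the surviving terms reach the stemmer, which skips anything in the exclusion set. The plain-Java sketch below mimics that ordering without Lucene. FilterChainSketch, analyze, and toyStem are hypothetical names, the letter-based split is only a rough stand-in for StandardTokenizer, and toyStem is a deliberately crude suffix stripper, not the algorithm used by GermanStemFilter.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical stand-in for the filter chain; not Lucene code.
class FilterChainSketch {

    // Toy stemmer: strips one common suffix from longer terms.
    // This is NOT the GermanStemFilter algorithm, only a placeholder.
    static String toyStem(String term) {
        for (String suffix : new String[] {"en", "er", "e"}) {
            if (term.length() > suffix.length() + 2 && term.endsWith(suffix)) {
                return term.substring(0, term.length() - suffix.length());
            }
        }
        return term;
    }

    static List<String> analyze(String text, Set<String> stopSet, Set<String> exclusionSet) {
        List<String> result = new ArrayList<>();
        // Tokenize: split on anything that is not a letter.
        for (String token : text.split("[^\\p{L}]+")) {
            if (token.isEmpty()) {
                continue; // split() can yield a leading empty string
            }
            // Lowercase before any set lookup, as in the chain above.
            String term = token.toLowerCase(Locale.GERMAN);
            // Stop words are dropped entirely.
            if (stopSet.contains(term)) {
                continue;
            }
            // Excluded terms are kept as-is; everything else is stemmed.
            result.add(exclusionSet.contains(term) ? term : toyStem(term));
        }
        return result;
    }

    public static void main(String[] args) {
        Set<String> stop = new HashSet<>(Arrays.asList("die", "und"));
        Set<String> excl = new HashSet<>(Arrays.asList("kinder"));
        // Prints [kinder, bild]: "kinder" is excluded from stemming, "bilder" is stemmed.
        System.out.println(analyze("Die Kinder und die Bilder", stop, excl));
    }
}
```

Because lowercasing happens before both lookups in this sketch, entries in the stop set and the exclusion set need to be lowercase to match. The same ordering in tokenStream suggests the same holds for GermanAnalyzer, though that depends on how StopFilter and GermanStemFilter compare terms.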