File: RefinedSoundex.java
Doc: API Doc
Category: Android 1.5 API
Size: 6571
Date: Wed May 06 22:41:10 BST 2009
Package: org.apache.commons.codec.language

RefinedSoundex

public class RefinedSoundex extends Object implements StringEncoder
Encodes a string into a Refined Soundex value. A refined soundex code is optimized for spell checking words. The Soundex method was originally developed by Margaret Odell and Robert Russell.
author
Apache Software Foundation
version
$Id: RefinedSoundex.java,v 1.21 2004/06/05 18:32:04 ggregory Exp $

Fields Summary
public static final RefinedSoundex
US_ENGLISH
This static variable contains an instance of the RefinedSoundex using the US_ENGLISH mapping.
public static final char[]
US_ENGLISH_MAPPING
RefinedSoundex is *refined* for a number of reasons, one being that the mappings have been altered. This implementation contains default mappings for US English.
private char[]
soundexMapping
Every letter of the alphabet is "mapped" to a numerical value. This char array holds the values to which each letter is mapped. This implementation contains a default map for US_ENGLISH.
Constructors Summary
public RefinedSoundex()
Creates an instance of the RefinedSoundex object using the default US English mapping.


                      
      
        this(US_ENGLISH_MAPPING);
    
public RefinedSoundex(char[] mapping)
Creates a refined soundex instance using a custom mapping. This constructor can be used to customize the mapping, and/or possibly provide an internationalized mapping for a non-Western character set.

param
mapping Mapping array to use when finding the corresponding code for a given character

        this.soundexMapping = mapping;
    
Methods Summary
public int difference(java.lang.String s1, java.lang.String s2)
Returns the number of characters in the two encoded Strings that are the same. This return value ranges from 0 to the length of the shortest encoded String: 0 indicates little or no similarity, and 4 out of 4 (for example) indicates strong similarity or identical values. For refined Soundex, the return value can be greater than 4.

param
s1 A String that will be encoded and compared.
param
s2 A String that will be encoded and compared.
return
The number of characters in the two encoded Strings that are the same, from 0 to the length of the shortest encoded String.
see
MS T-SQL DIFFERENCE
throws
EncoderException if an error occurs encoding one of the strings
since
1.3

        return SoundexUtils.difference(this, s1, s2);
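The listing delegates to SoundexUtils.difference and does not show its body. As a rough sketch, assuming the comparison is positional (the character at index i of one code is checked against the character at index i of the other, up to the shorter length), the counting step could look like the following. The class and method names here are hypothetical and the inputs are codes that have already been encoded.

```java
// Hypothetical helper; illustrates the counting step only, not the library's code.
final class DifferenceSketch {

    // Counts positions at which two already-encoded strings hold the same character.
    // Assumption: the comparison is positional and stops at the shorter string.
    static int differenceEncoded(String es1, String es2) {
        if (es1 == null || es2 == null) {
            return 0;
        }
        int lengthToMatch = Math.min(es1.length(), es2.length());
        int same = 0;
        for (int i = 0; i < lengthToMatch; i++) {
            if (es1.charAt(i) == es2.charAt(i)) {
                same++;
            }
        }
        return same;
    }
}
```

Under that assumption, two identical five-character codes score 5, which is why the refined variant can return values greater than 4.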
    
public java.lang.Object encode(java.lang.Object pObject)
Encodes an Object using the refined soundex algorithm. This method is provided in order to satisfy the requirements of the Encoder interface, and will throw an EncoderException if the supplied object is not of type java.lang.String.

param
pObject Object to encode
return
An object (of type java.lang.String) containing the refined soundex code which corresponds to the String supplied.
throws
EncoderException if the parameter supplied is not of type java.lang.String

        if (!(pObject instanceof java.lang.String)) {
            throw new EncoderException("Parameter supplied to RefinedSoundex encode is not of type java.lang.String");
        }
        return soundex((String) pObject);
    
public java.lang.String encode(java.lang.String pString)
Encodes a String using the refined soundex algorithm.

param
pString A String object to encode
return
A Soundex code corresponding to the String supplied

        return soundex(pString);
    
char getMappingCode(char c)
Returns the mapping code for a given character. The mapping codes are maintained in an internal char array named soundexMapping, and the default values of these mappings are US English.

param
c char to get mapping for
return
A character (really a numeral) to return for the given char

        if (!Character.isLetter(c)) {
            return 0;
        }
        return this.soundexMapping[Character.toUpperCase(c) - 'A'];
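To make the lookup concrete, here is a small standalone sketch of the same index arithmetic. The mapping string is an assumption (the listing does not show the value of US_ENGLISH_MAPPING), and the bounds check is an addition of this sketch: Character.isLetter also accepts letters outside A to Z, which would index past a 26-entry array.

```java
// Hypothetical standalone version of the lookup; names are illustrative only.
final class MappingLookupSketch {

    // Assumed US English codes for A..Z; not taken from the listing.
    private static final char[] MAPPING = "01360240043788015936020505".toCharArray();

    static char mappingCode(char c) {
        if (!Character.isLetter(c)) {
            return 0;
        }
        int index = Character.toUpperCase(c) - 'A';
        // Guard added in this sketch: letters outside A..Z have no entry.
        if (index < 0 || index >= MAPPING.length) {
            return 0;
        }
        return MAPPING[index];
    }
}
```

With the assumed mapping, both 'd' and 'D' resolve to '6', while a digit or an accented letter yields the zero char.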
    
public java.lang.String soundex(java.lang.String str)
Retrieves the Refined Soundex code for a given String object.

param
str String to encode using the Refined Soundex algorithm
return
A soundex code for the String supplied

        if (str == null) {
            return null;
        }
        str = SoundexUtils.clean(str);
        if (str.length() == 0) {
            return str;
        }

        StringBuffer sBuf = new StringBuffer();
        sBuf.append(str.charAt(0));

        char last, current;
        last = '*';

        for (int i = 0; i < str.length(); i++) {

            current = getMappingCode(str.charAt(i));
            if (current == last) {
                continue;
            } else if (current != 0) {
                sBuf.append(current);
            }

            last = current;

        }

        return sBuf.toString();
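The loop above can be traced with a self-contained sketch. It is not the Commons Codec class: the mapping string is an assumed value for US_ENGLISH_MAPPING, and the cleaning step is a simplified stand-in for SoundexUtils.clean that keeps only ASCII letters and upper-cases them. Note that the loop starts at index 0, so the code of the first letter is emitted right after the letter itself.

```java
// Illustrative sketch of the encoding loop; class and method names are hypothetical.
final class RefinedSoundexSketch {

    // Assumed US English codes for A..Z; not taken from the listing.
    private static final char[] MAPPING = "01360240043788015936020505".toCharArray();

    // Simplified lookup: only ASCII letters are mapped, anything else yields 0.
    static char code(char c) {
        char upper = Character.toUpperCase(c);
        return (upper >= 'A' && upper <= 'Z') ? MAPPING[upper - 'A'] : 0;
    }

    static String encode(String str) {
        if (str == null) {
            return null;
        }
        // Simplified stand-in for SoundexUtils.clean: keep ASCII letters, upper-cased.
        StringBuilder cleaned = new StringBuilder();
        for (int i = 0; i < str.length(); i++) {
            char c = str.charAt(i);
            if ((c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z')) {
                cleaned.append(Character.toUpperCase(c));
            }
        }
        if (cleaned.length() == 0) {
            return "";
        }

        StringBuilder out = new StringBuilder();
        out.append(cleaned.charAt(0)); // the first letter is kept verbatim
        char last = '*';               // sentinel that matches no mapping code
        for (int i = 0; i < cleaned.length(); i++) {
            char current = code(cleaned.charAt(i));
            if (current == last) {
                continue;              // collapse runs of the same code
            }
            if (current != 0) {
                out.append(current);
            }
            last = current;
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(encode("dogs"));    // D6043
        System.out.println(encode("testing")); // T6036084
        System.out.println(encode("quick"));   // Q503
    }
}
```

In "quick", U and I share code 0 and C and K share code 3, so each run is emitted once, giving Q503.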