File Doc Category Size Date Package
CollateDemo.java API Doc Example 33401 Wed Apr 19 11:19:02 BST 2000 None

CollateDemo

java.lang.Object
- DemoApplet

public class CollateDemo extends DemoApplet

Concrete class for demonstrating language-sensitive collation. The following instructions explain how to run the collation demo.

===================

Customization

You can produce a new collation by adding to or changing an existing one.

To show...

You can modify an existing collation to show how this works. By adding items at the end of a collation you override earlier information. Watch how you can make the letter P sort at the end of the alphabet.

Do...

1. Scroll to the end of the Sequence field. After the Z, type "< p , P". This will put the letter P (with both of its case variants) at the end of the alphabet. Hit the Set Rule button. This creates a new collation with the name "Custom-1" (you could give it a different name by typing in the Collator Name field). When you now look at the Text field, you will see that you have changed the sequence to put Pat at the end. (If you did not have Sort Ascending on, click it now.)

Making P sort at the end may not seem terribly useful, but the same technique is used to modify the sorting sequence for different languages.
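The same override can be sketched outside the applet with java.text.RuleBasedCollator. This is a minimal sketch, not the demo's own code: the class name CustomOrderSketch and the helper alphabetRules() are hypothetical, and the generated a-to-z rule string is only an assumed stand-in for whatever the Sequence field contains.

```java
import java.text.ParseException;
import java.text.RuleBasedCollator;

class CustomOrderSketch {
    // Hypothetical stand-in for the Sequence field: "< a , A < b , B ... < z , Z".
    static String alphabetRules() {
        StringBuilder sb = new StringBuilder();
        for (char c = 'a'; c <= 'z'; c++) {
            sb.append("< ").append(c).append(" , ")
              .append(Character.toUpperCase(c)).append(' ');
        }
        return sb.toString();
    }

    // Compares two words under the given rules; a negative result means left sorts first.
    static int compare(String rules, String left, String right) {
        try {
            return new RuleBasedCollator(rules).compare(left, right);
        } catch (ParseException e) {
            throw new IllegalArgumentException("bad rules: " + rules, e);
        }
    }

    public static void main(String[] args) {
        String base = alphabetRules();
        String custom = base + "< p , P"; // the text typed after Z in the demo
        System.out.println("base:   " + compare(base, "Pat", "Zoe"));   // negative: Pat first
        System.out.println("custom: " + compare(custom, "Pat", "Zoe")); // positive: Pat last
    }
}
```

Appending "< p , P" re-declares both case variants of P after Z, so the later entry replaces the earlier one, which is the override described above.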

To show...

For example, you can add CH as a single letter after C, as in traditional Spanish sorting.

Do...

Enter the following after Z: "& C < ch , cH, Ch, CH". Hit the Set Rule button, type test words in the Text field (such as "czar", "churo" and "darn"), and select Sort Ascending to see the resulting sort order.
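As a rough illustration of the CH rule, the sketch below sorts the three test words with and without it. It assumes java.text.RuleBasedCollator and a generated a-to-z rule string in place of the demo's Sequence field; SpanishChSketch, alphabetRules() and sort() are hypothetical names introduced here.

```java
import java.text.ParseException;
import java.text.RuleBasedCollator;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class SpanishChSketch {
    // Hypothetical stand-in for the Sequence field: "< a , A < b , B ... < z , Z".
    static String alphabetRules() {
        StringBuilder sb = new StringBuilder();
        for (char c = 'a'; c <= 'z'; c++) {
            sb.append("< ").append(c).append(" , ")
              .append(Character.toUpperCase(c)).append(' ');
        }
        return sb.toString();
    }

    // Returns a copy of the words sorted ascending under the given rules.
    static List<String> sort(String rules, List<String> words) {
        try {
            List<String> sorted = new ArrayList<>(words);
            sorted.sort(new RuleBasedCollator(rules));
            return sorted;
        } catch (ParseException e) {
            throw new IllegalArgumentException("bad rules: " + rules, e);
        }
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("darn", "churo", "czar");
        String base = alphabetRules();
        String spanish = base + "& C < ch , cH , Ch , CH"; // the text typed after Z
        System.out.println(sort(base, words));    // [churo, czar, darn]
        System.out.println(sort(spanish, words)); // [czar, churo, darn]
    }
}
```

With the extra rule, "ch" behaves as one letter placed between C and D, so "churo" moves after every word that starts with a plain "c".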

To show...

You can also add other sequences to the collation rules, such as sorting symbols with their alphabetic equivalents.

Do...

1. Scroll to the end of the Sequence field and type the following list (you can select this text in your browser and paste it in to avoid typing). Now type lines containing these symbols in the Text field, and select Sort Ascending to see the resulting sort order.

& Asterisk ; *
& Question-mark ; ?
& Hash-mark ; #
& Exclamation-mark ; !
& Dollar-sign ; $
& Ampersand ; '&';
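A sketch of two of these symbol rules with java.text.RuleBasedCollator follows. It rests on some assumptions: the base rules are a generated a-to-z string rather than the demo's real sequence, SymbolRulesSketch and its helpers are hypothetical names, and the symbols are written in single quotes because the java.text API treats unquoted punctuation in rules as syntax characters.

```java
import java.text.ParseException;
import java.text.RuleBasedCollator;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class SymbolRulesSketch {
    // Hypothetical stand-in for the Sequence field: "< a , A < b , B ... < z , Z".
    static String alphabetRules() {
        StringBuilder sb = new StringBuilder();
        for (char c = 'a'; c <= 'z'; c++) {
            sb.append("< ").append(c).append(" , ")
              .append(Character.toUpperCase(c)).append(' ');
        }
        return sb.toString();
    }

    // Rules that sort '*' and '&' as secondary variants of their spelled-out names.
    static String symbolRules() {
        return alphabetRules() + "& Asterisk ; '*' & Ampersand ; '&'";
    }

    // Returns a copy of the words sorted ascending under the given rules.
    static List<String> sort(String rules, List<String> words) {
        try {
            List<String> sorted = new ArrayList<>(words);
            sorted.sort(new RuleBasedCollator(rules));
            return sorted;
        } catch (ParseException e) {
            throw new IllegalArgumentException("bad rules: " + rules, e);
        }
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("Astronomy", "*", "Asterisk", "Apple");
        // The symbol lands next to the word it is equated with.
        System.out.println(sort(symbolRules(), words));
    }
}
```

Because "Asterisk" is not itself an entry in the sequence, the reset falls back to its initial substring "A", and "*" is then treated as if it expanded to the remaining letters.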

Details

If you are an advanced user and interested in trying out more rules, here is a brief explanation of how they work. The sequence is a list of rules. Each rule takes one of three forms:

<modifier>
<relation> <text-argument>
<reset> <text-argument>

Modifier

@ Indicates that accents are sorted backwards, as in French

Text-argument

The text can be any number of characters (if you want to include special characters, such as space, use single-quotes around them).

Relation

The relations are the following:

< Greater, as a letter difference (primary)

; Greater, as an accent difference (secondary)

, Greater, as a case difference (tertiary)

= Equal

& Reset previous comparison.

Reset

The "&" is special in that it does not put the text-argument into the sorting sequence; instead, it indicates that the next rule is relative to the position where the text-argument would be sorted. This sounds more complicated than it is in practice. For example, the following are equivalent ways of expressing the same thing:

a < b < c
a < b & b < c
a < c & a < b

Notice that the order is important, since the subsequent item goes immediately after the text-argument. The following are not equivalent:

a < b & a < c
a < c & a < b
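The effect of reset order can be checked with a short sketch, assuming java.text.RuleBasedCollator and writing each rule with the leading "<" that a complete rule string needs. ResetOrderSketch and compare() are hypothetical names used only for this illustration.

```java
import java.text.ParseException;
import java.text.RuleBasedCollator;

class ResetOrderSketch {
    // Compares two strings under the given rules; a negative result means left sorts first.
    static int compare(String rules, String left, String right) {
        try {
            return new RuleBasedCollator(rules).compare(left, right);
        } catch (ParseException e) {
            throw new IllegalArgumentException("bad rules: " + rules, e);
        }
    }

    public static void main(String[] args) {
        String[] ruleSets = {
            "< a < b < c",       // plain sequence
            "< a < b & b < c",   // same order: c goes right after b
            "< a < c & a < b",   // same order: b is inserted right after a, before c
            "< a < b & a < c"    // different order: c is inserted right after a, before b
        };
        for (String rules : ruleSets) {
            // Negative means b sorts before c; positive means c sorts before b.
            System.out.println(rules + "  ->  " + compare(rules, "b", "c"));
        }
    }
}
```

Only the last rule set puts "c" ahead of "b", since the item following a reset is placed immediately after the reset text.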

The text-argument must already be present in the sequence, or some initial substring of the text-argument must be present (e.g. "a < b & ae < e" is valid since "a" is present in the sequence before "ae" is reset). In this latter case, "ae" is not entered and treated as a single character; instead, "e" is sorted as if it were expanded to two characters: "a" followed by an "e".
This difference appears in natural languages: in traditional Spanish "ch" is treated as though it contracts to a single character (expressed as "c < ch < d"), while in traditional German "ä" (a-umlaut) is treated as though it expands to two characters (expressed as "a & ae ; ä < b").

Ignorable Characters

The first rule must start with a relation (the examples we have used are fragments; "a < b" really should be "< a < b"). If, however, the first relation is not "<", then all text-arguments up to the first "<" are ignorable. For example, ", - < a < b" makes "-" an ignorable character, as we saw earlier in the word "black-birds".
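A minimal sketch of an ignorable hyphen with java.text.RuleBasedCollator is given below. Several things here are assumptions: the hyphen is single-quoted because the java.text API treats unquoted punctuation as rule syntax, the a-to-z rules are generated rather than taken from the demo, the comparison is made at primary strength so that only letter differences count, and IgnorableSketch with its helpers is a hypothetical name.

```java
import java.text.Collator;
import java.text.ParseException;
import java.text.RuleBasedCollator;

class IgnorableSketch {
    // Rules that begin with a tertiary relation, so '-' comes before the first "<".
    static String rulesWithIgnorableHyphen() {
        StringBuilder sb = new StringBuilder(", '-' ");
        for (char c = 'a'; c <= 'z'; c++) {
            sb.append("< ").append(c).append(" , ")
              .append(Character.toUpperCase(c)).append(' ');
        }
        return sb.toString();
    }

    // Compares at primary strength, where only letter differences are significant.
    static int comparePrimary(String left, String right) {
        try {
            RuleBasedCollator collator = new RuleBasedCollator(rulesWithIgnorableHyphen());
            collator.setStrength(Collator.PRIMARY);
            return collator.compare(left, right);
        } catch (ParseException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        // The hyphen carries no primary weight, so these two compare as equal.
        System.out.println(comparePrimary("black-birds", "blackbirds"));
        // The trailing "s" still decides the order here.
        System.out.println(comparePrimary("black-bird", "blackbirds"));
    }
}
```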

Accents

The Collator automatically normalizes text internally to separate accents from base characters where possible. So, if you type in an "ä" (a-umlaut), after you reset the collation you will see "a\u0308" in the sequence, where \u0308 is the Java Unicode escape for the combining umlaut (diaeresis). The demonstration program uses this syntax instead of just showing the umlaut, since many browsers are not yet able to display it.
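The decomposition itself can be reproduced with java.text.Normalizer, which is used here only to illustrate the same canonical decomposition and is not claimed to be what the demo calls internally. NormalizeSketch, decompose() and escape() are hypothetical names.

```java
import java.text.Normalizer;

class NormalizeSketch {
    // Splits precomposed characters into a base character plus combining marks (NFD).
    static String decompose(String text) {
        return Normalizer.normalize(text, Normalizer.Form.NFD);
    }

    // Renders non-ASCII characters as Java-style Unicode escapes for display.
    static String escape(String text) {
        StringBuilder sb = new StringBuilder();
        for (char c : text.toCharArray()) {
            if (c < 128) {
                sb.append(c);
            } else {
                sb.append(String.format("\\u%04x", (int) c));
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String umlaut = "\u00e4"; // precomposed a-umlaut
        // Prints the letter a followed by the escaped combining mark.
        System.out.println(escape(decompose(umlaut)));
    }
}
```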

Errors

The following are errors:

Two relations in a row (e.g. "a < , b")
Two text arguments in a row (e.g. "a < b c < d")
A reset where the text-argument is not already in the sequence (e.g. "a < b & e < f")

If you produce one of these errors, then the demonstration will beep at you, and select the offending text (note: on some browsers, the selection will not appear correctly).
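Outside the applet, malformed rules passed to java.text.RuleBasedCollator are reported as a ParseException rather than a beep. The sketch below checks only the first and third cases from the list, written with the leading "<" that a complete rule string needs; how the second case is handled by the java.text parser is not assumed here. RuleErrorSketch and isValid() are hypothetical names.

```java
import java.text.ParseException;
import java.text.RuleBasedCollator;

class RuleErrorSketch {
    // Returns true if the rules can be built into a collator, false on a parse error.
    static boolean isValid(String rules) {
        try {
            new RuleBasedCollator(rules);
            return true;
        } catch (ParseException e) {
            // getErrorOffset() points at the position the parser objected to.
            System.out.println("rejected \"" + rules + "\" at offset " + e.getErrorOffset());
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(isValid("< a < b"));          // well-formed
        System.out.println(isValid("< a < , b"));        // two relations in a row
        System.out.println(isValid("< a < b & e < f"));  // reset to text not in the sequence
    }
}
```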

version: 1.1 11/23/96
author: Kathleen Wilson, Helena Shih
see: java.text.Collator
see: java.text.RuleBasedCollator
see: java.demos.utilities.DemoApplet

Methods Summary
public java.awt.Frame createDemoFrame(DemoApplet applet)
This creates a CollateFrame for the demo applet.
    return new CollateFrame(applet);

public static void main(java.lang.String[] argv)
The main function, which defines the behavior of the CollateDemo applet when it is started.
    DemoApplet.showDemo(new CollateFrame(null));