File Doc Category Size Date Package
Sort.java API Doc Apache Lucene 1.4.3 7227 Mon Apr 05 19:23:38 BST 2004 org.apache.lucene.search

Sort

java.lang.Object

public class Sort extends Object implements Serializable

Encapsulates sort criteria for returned hits.

The fields used to determine sort order must be carefully chosen. Documents must contain a single term in such a field, and the value of the term should indicate the document's relative position in a given sort order. The field must be indexed, but should not be tokenized, and does not need to be stored (unless you happen to want it back with the rest of your document data). In other words:

document.add (new Field ("byNumber", Integer.toString(x), false, true, false));

Valid Types of Values

There are three possible kinds of term values which may be put into sorting fields: Integers, Floats, or Strings. Unless SortField objects are specified, the type of value in the field is determined by parsing the first term in the field.
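
The automatic detection described above can be pictured with a small sketch. This is not Lucene's code: the class `TermTypeGuess`, the enum `Kind`, and the method `guess` are hypothetical names, and the sketch only assumes that the narrowest numeric type is tried first.

```java
// Hypothetical sketch of inferring a sort type from the first term in a field.
// None of these names come from Lucene; they exist only for illustration.
final class TermTypeGuess {
    enum Kind { INT, FLOAT, STRING }

    // Try integer first, then float, and fall back to string.
    static Kind guess(String firstTerm) {
        try {
            Integer.parseInt(firstTerm);
            return Kind.INT;
        } catch (NumberFormatException notAnInt) {
            // Not an integer; try the next candidate type.
        }
        try {
            Float.parseFloat(firstTerm);
            return Kind.FLOAT;
        } catch (NumberFormatException notAFloat) {
            return Kind.STRING;
        }
    }

    public static void main(String[] args) {
        System.out.println(guess("42"));
        System.out.println(guess("3.14"));
        System.out.println(guess("apple"));
    }
}
```

Because only the first term is inspected, a field whose first term happens to look numeric would be treated as numeric under this scheme, which is one reason to specify the type explicitly through SortField when the values are mixed.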

Integer term values should contain only digits and an optional preceding negative sign. Values must be base 10 and lie between Integer.MIN_VALUE and Integer.MAX_VALUE inclusive. Documents which should appear first in the sort should have low integer values, later documents high values (e.g. the documents could be numbered 1..n, where 1 is the first and n the last).
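
A value can be checked against these rules before it is indexed. The following is a minimal sketch using only the JDK; `IntTermCheck` and `isValidIntTerm` are hypothetical names, not part of Lucene.

```java
// Hypothetical pre-indexing check for integer sort terms.
final class IntTermCheck {
    // Accepts base-10 digits with an optional leading minus sign,
    // and rejects anything outside the int range.
    static boolean isValidIntTerm(String term) {
        if (!term.matches("-?[0-9]+")) {
            return false; // wrong shape: empty, plus sign, letters, decimals
        }
        try {
            Integer.parseInt(term);
            return true;
        } catch (NumberFormatException outOfRange) {
            return false; // right shape, but too large or too small for int
        }
    }

    public static void main(String[] args) {
        System.out.println(isValidIntTerm("-2147483648"));
        System.out.println(isValidIntTerm("2147483648"));
        System.out.println(isValidIntTerm("12a"));
    }
}
```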

Float term values should conform to values accepted by Float.valueOf(String) (except that NaN and Infinity are not supported). Documents which should appear first in the sort should have low values, later documents high values.
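
Since Float.valueOf(String) itself accepts "NaN" and "Infinity", a check for float sort terms has to reject those separately. A minimal sketch, with `FloatTermCheck` and `isValidFloatTerm` as hypothetical names:

```java
// Hypothetical pre-indexing check for float sort terms.
final class FloatTermCheck {
    // Accepts whatever Float.valueOf(String) parses, except NaN and infinities.
    // Note that a finite-looking literal too large for a float (such as 1e40)
    // parses to Infinity and is therefore rejected as well.
    static boolean isValidFloatTerm(String term) {
        try {
            float value = Float.valueOf(term).floatValue();
            return !Float.isNaN(value) && !Float.isInfinite(value);
        } catch (NumberFormatException notAFloat) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(isValidFloatTerm("1.5"));
        System.out.println(isValidFloatTerm("NaN"));
        System.out.println(isValidFloatTerm("1e40"));
    }
}
```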

String term values can contain any valid String, but should not be tokenized. The values are sorted according to their natural order (see Comparable). Note that using this type of term value has higher memory requirements than the other two types.
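
Natural order for strings is the character-by-character comparison of String.compareTo, which can surprise when the terms look numeric. The sketch below uses only the JDK to show that ordering; `StringOrderDemo` and `sorted` are hypothetical names.

```java
import java.util.Arrays;

// Illustrates the natural (String.compareTo) ordering of string terms.
final class StringOrderDemo {
    // Returns a sorted copy, leaving the caller's array untouched.
    static String[] sorted(String... terms) {
        String[] copy = terms.clone();
        Arrays.sort(copy); // uses String's natural order
        return copy;
    }

    public static void main(String[] args) {
        // Digits compare as characters, so "10" sorts before "2".
        System.out.println(Arrays.toString(sorted("10", "9", "2"))); // [10, 2, 9]
        // Uppercase letters precede lowercase ones.
        System.out.println(Arrays.toString(sorted("apple", "Zebra"))); // [Zebra, apple]
    }
}
```

If numeric ordering is wanted, the values are better indexed as Integer or Float terms as described above.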

Object Reuse

One of these objects can be used multiple times and the sort order changed between usages.

This class is thread safe.

Memory Usage

Sorting uses caches of term values maintained by the internal HitQueue(s). The cache is static and contains an integer or float array of length IndexReader.maxDoc() for each field name for which a sort is performed. In other words, the size of the cache in bytes is:

4 * IndexReader.maxDoc() * (# of different fields actually used to sort)

For String fields, the cache is larger: in addition to the above array, the value of every term in the field is kept in memory. If there are many unique terms in the field, this could be quite large.
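
As a worked example of the formula above, the following sketch computes the size of the integer or float arrays only. It deliberately leaves out the extra per-term storage for String fields, which depends on the data. `SortCacheEstimate` and `cacheBytes` are hypothetical names.

```java
// Rough estimate of the sort cache size for integer and float fields.
final class SortCacheEstimate {
    // 4 bytes per document per sort field, as in the formula above.
    // Uses long arithmetic so large indexes do not overflow int.
    static long cacheBytes(int maxDoc, int sortFieldCount) {
        return 4L * maxDoc * sortFieldCount;
    }

    public static void main(String[] args) {
        // One million documents sorted on three different fields.
        System.out.println(cacheBytes(1000000, 3)); // 12000000
    }
}
```

For that index, roughly 12 MB is held regardless of how many other fields the index contains.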

Note that the size of the cache is not affected by how many fields are in the index and might be used to sort - only by the ones actually used to sort a result set.

The cache is cleared each time a new IndexReader is passed in, or if the value returned by maxDoc() changes for the current IndexReader. This class is not set up to be able to efficiently sort hits from more than one index simultaneously.

Created: Feb 12, 2004 10:53:57 AM

author: Tim Jones (Nacimiento Software)
since: lucene 1.4
version: $Id: Sort.java,v 1.7 2004/04/05 17:23:38 ehatcher Exp $

Fields Summary
public static final Sort
RELEVANCE
Represents sorting by computed relevance. Using these sort criteria returns the same results as calling Searcher.search(Query) without sort criteria, only with slightly more overhead.
public static final Sort
INDEXORDER
Represents sorting by index order.
SortField[]
fields
Constructors Summary
public Sort()
Sorts by computed relevance. This gives the same ordering as calling Searcher.search(Query) without sort criteria, only with slightly more overhead.
this (new SortField[]{SortField.FIELD_SCORE, SortField.FIELD_DOC});
public Sort(String field)
Sorts by the terms in field then by index order (document number). The type of value in field is determined automatically.
see: SortField#AUTO
setSort (field, false);
public Sort(String field, boolean reverse)
Sorts possibly in reverse by the terms in field then by index order (document number). The type of value in field is determined automatically.
see: SortField#AUTO
setSort (field, reverse);
public Sort(String[] fields)
Sorts in succession by the terms in each field. The type of value in each field is determined automatically.
see: SortField#AUTO
setSort (fields);
public Sort(SortField field)
Sorts by the criteria in the given SortField.
setSort (field);
public Sort(SortField[] fields)
Sorts in succession by the criteria in each SortField.
setSort (fields);
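
Sorting "in succession" means that each later criterion is consulted only when all earlier ones compare equal. The sketch below illustrates that idea with plain JDK comparators rather than Lucene classes; `SuccessiveSortDemo`, `Doc`, and `sortedIds` are hypothetical stand-ins, and the two sort keys (`category`, then `price`) are made up for the example.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Illustrates successive sort criteria with plain comparators (not Lucene).
final class SuccessiveSortDemo {
    // Hypothetical stand-in for an indexed document with two sort fields.
    static final class Doc {
        final int docId;
        final String category;
        final int price;

        Doc(int docId, String category, int price) {
            this.docId = docId;
            this.category = category;
            this.price = price;
        }
    }

    // Orders by category first; price decides only among equal categories.
    static List<Integer> sortedIds(List<Doc> docs) {
        Comparator<Doc> order = Comparator
                .comparing((Doc d) -> d.category)
                .thenComparingInt(d -> d.price);
        List<Doc> copy = new ArrayList<>(docs);
        copy.sort(order);
        List<Integer> ids = new ArrayList<>();
        for (Doc d : copy) {
            ids.add(d.docId);
        }
        return ids;
    }

    public static void main(String[] args) {
        List<Doc> docs = new ArrayList<>();
        docs.add(new Doc(0, "b", 5));
        docs.add(new Doc(1, "a", 9));
        docs.add(new Doc(2, "a", 3));
        System.out.println(sortedIds(docs)); // [2, 1, 0]
    }
}
```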
Methods Summary
public void setSort(org.apache.lucene.search.SortField field)
Sets the sort to the given criteria.
this.fields = new SortField[]{field};
public void setSort(org.apache.lucene.search.SortField[] fields)
Sets the sort to the given criteria in succession.
this.fields = fields;
public final void setSort(java.lang.String field)
Sets the sort to the terms in field then by index order (document number).
setSort (field, false);
public void setSort(java.lang.String field, boolean reverse)
Sets the sort to the terms in field possibly in reverse, then by index order (document number).
SortField[] nfields = new SortField[]{
    new SortField (field, SortField.AUTO, reverse),
    SortField.FIELD_DOC
};
fields = nfields;
public void setSort(java.lang.String[] fieldnames)
Sets the sort to the terms in each field in succession.
final int n = fieldnames.length;
SortField[] nfields = new SortField[n];
for (int i = 0; i < n; ++i) {
    nfields[i] = new SortField (fieldnames[i], SortField.AUTO);
}
fields = nfields;
public java.lang.String toString()
StringBuffer buffer = new StringBuffer();
for (int i = 0; i < fields.length; i++) {
    buffer.append(fields[i].toString());
    if ((i + 1) < fields.length)
        buffer.append(',');
}
return buffer.toString();