IndexModifier

public class IndexModifier extends Object

A class to modify an index, i.e. to delete and add documents. This
class hides {@link IndexReader} and {@link IndexWriter} so that you
do not need to care about implementation details such as that adding
documents is done via IndexWriter and deletion is done via IndexReader.
Note that you cannot create more than one IndexModifier object
on the same directory at the same time.
Example usage:
Analyzer analyzer = new StandardAnalyzer();
// create an index in /tmp/index, overwriting an existing one:
IndexModifier indexModifier = new IndexModifier("/tmp/index", analyzer, true);
Document doc = new Document();
doc.add(new Field("id", "1", Field.Store.YES, Field.Index.UN_TOKENIZED));
doc.add(new Field("body", "a simple test", Field.Store.YES, Field.Index.TOKENIZED));
indexModifier.addDocument(doc);
int deleted = indexModifier.delete(new Term("id", "1"));
System.out.println("Deleted " + deleted + " document");
indexModifier.flush();
System.out.println(indexModifier.docCount() + " docs in index");
indexModifier.close();
Not all methods of IndexReader and IndexWriter are offered by this
class. If you need access to additional methods, either use those classes
directly or implement your own class that extends IndexModifier .
Although an instance of this class can be used from more than one
thread, you will not get the best performance. You might want to use
IndexReader and IndexWriter directly for that (but you will need to
care about synchronization yourself then).
While you can freely mix calls to add() and delete() using this class,
you should batch your calls for best performance. For example, if you
want to update 20 documents, you should first delete all those documents,
then add all the new documents.
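
A minimal sketch of that batching pattern, assuming the documents carry a
unique "id" field as in the example above (the helper method and field name
are illustrative, not part of this class):

    // Sketch: batched update -- delete all old versions first, then add all new ones.
    void updateDocuments(IndexModifier indexModifier, Document[] newDocs) throws IOException {
        for (int i = 0; i < newDocs.length; i++) {
            // delete the previous version of each document by its unique id
            indexModifier.deleteDocuments(new Term("id", newDocs[i].get("id")));
        }
        for (int i = 0; i < newDocs.length; i++) {
            indexModifier.addDocument(newDocs[i]);
        }
        indexModifier.flush();
    }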
Fields Summary

protected IndexWriter indexWriter
protected IndexReader indexReader
protected Directory directory
protected Analyzer analyzer
protected boolean open
protected PrintStream infoStream
protected boolean useCompoundFile
protected int maxBufferedDocs
protected int maxFieldLength
protected int mergeFactor
Constructors Summary

public IndexModifier(Directory directory, Analyzer analyzer, boolean create)
Open an index with write access.
    init(directory, analyzer, create);

public IndexModifier(String dirName, Analyzer analyzer, boolean create)
Open an index with write access.
    Directory dir = FSDirectory.getDirectory(dirName, create);
    init(dir, analyzer, create);

public IndexModifier(File file, Analyzer analyzer, boolean create)
Open an index with write access.
    Directory dir = FSDirectory.getDirectory(file, create);
    init(dir, analyzer, create);
Methods Summary

public void addDocument(org.apache.lucene.document.Document doc)
Adds a document to this index. If the document contains more than
{@link #setMaxFieldLength(int)} terms for a given field, the remainder are
discarded.
addDocument(doc, null);

public void addDocument(org.apache.lucene.document.Document doc, org.apache.lucene.analysis.Analyzer docAnalyzer)
Adds a document to this index, using the provided analyzer instead of the
one specified in the constructor. If the document contains more than
{@link #setMaxFieldLength(int)} terms for a given field, the remainder are
discarded.
synchronized(directory) {
assureOpen();
createIndexWriter();
if (docAnalyzer != null)
indexWriter.addDocument(doc, docAnalyzer);
else
indexWriter.addDocument(doc);
}
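
For illustration, a sketch of indexing one document with a different analyzer
than the one given in the constructor (SimpleAnalyzer is only an example here;
any Analyzer implementation can be passed):

    // Sketch: override the default analyzer for a single document.
    Document doc = new Document();
    doc.add(new Field("body", "per-document analysis example", Field.Store.YES, Field.Index.TOKENIZED));
    indexModifier.addDocument(doc, new SimpleAnalyzer());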

protected void assureOpen()
Throw an IllegalStateException if the index is closed.
if (!open) {
throw new IllegalStateException("Index is closed");
}

public void close()
Close this index, writing all pending changes to disk.
synchronized(directory) {
if (!open)
throw new IllegalStateException("Index is closed already");
if (indexWriter != null) {
indexWriter.close();
indexWriter = null;
} else {
indexReader.close();
indexReader = null;
}
open = false;
}

protected void createIndexReader()
Close the IndexWriter and open an IndexReader.
if (indexReader == null) {
if (indexWriter != null) {
indexWriter.close();
indexWriter = null;
}
indexReader = IndexReader.open(directory);
}

protected void createIndexWriter()
Close the IndexReader and open an IndexWriter.
if (indexWriter == null) {
if (indexReader != null) {
indexReader.close();
indexReader = null;
}
indexWriter = new IndexWriter(directory, analyzer, false);
indexWriter.setInfoStream(infoStream);
indexWriter.setUseCompoundFile(useCompoundFile);
indexWriter.setMaxBufferedDocs(maxBufferedDocs);
indexWriter.setMaxFieldLength(maxFieldLength);
indexWriter.setMergeFactor(mergeFactor);
}

public void deleteDocument(int docNum)
Deletes the document numbered docNum.
synchronized(directory) {
assureOpen();
createIndexReader();
indexReader.deleteDocument(docNum);
}

public int deleteDocuments(org.apache.lucene.index.Term term)
Deletes all documents containing term.
This is useful if one uses a document field to hold a unique ID string for
the document. Then to delete such a document, one merely constructs a
term with the appropriate field and the unique ID string as its text and
passes it to this method. Returns the number of documents deleted.
synchronized(directory) {
assureOpen();
createIndexReader();
return indexReader.deleteDocuments(term);
}
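
A short sketch of the unique-ID pattern described above, reusing the "id"
field from the class example (the field name and value are illustrative):

    // Sketch: delete a document by its application-level unique ID.
    int deleted = indexModifier.deleteDocuments(new Term("id", "42"));
    System.out.println("Deleted " + deleted + " document(s) with id 42");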

public int docCount()
Returns the number of documents currently in this index.
synchronized(directory) {
assureOpen();
if (indexWriter != null) {
return indexWriter.docCount();
} else {
return indexReader.numDocs();
}
}

public void flush()
Make sure all changes are written to disk.
synchronized(directory) {
assureOpen();
if (indexWriter != null) {
indexWriter.close();
indexWriter = null;
createIndexWriter();
} else {
indexReader.close();
indexReader = null;
createIndexReader();
}
}

public java.io.PrintStream getInfoStream()
synchronized(directory) {
assureOpen();
createIndexWriter();
return indexWriter.getInfoStream();
}

public int getMaxBufferedDocs()
synchronized(directory) {
assureOpen();
createIndexWriter();
return indexWriter.getMaxBufferedDocs();
}

public int getMaxFieldLength()
synchronized(directory) {
assureOpen();
createIndexWriter();
return indexWriter.getMaxFieldLength();
}

public int getMergeFactor()
synchronized(directory) {
assureOpen();
createIndexWriter();
return indexWriter.getMergeFactor();
}

public boolean getUseCompoundFile()
synchronized(directory) {
assureOpen();
createIndexWriter();
return indexWriter.getUseCompoundFile();
}

protected void init(org.apache.lucene.store.Directory directory, org.apache.lucene.analysis.Analyzer analyzer, boolean create)
Initialize an IndexWriter.
this.directory = directory;
synchronized(this.directory) {
this.analyzer = analyzer;
indexWriter = new IndexWriter(directory, analyzer, create);
open = true;
}

public void optimize()
Merges all segments together into a single segment, optimizing an index
for search.
synchronized(directory) {
assureOpen();
createIndexWriter();
indexWriter.optimize();
}

public void setInfoStream(java.io.PrintStream infoStream)
If non-null, information about merges and a message when
{@link #getMaxFieldLength()} is reached will be printed to this.
Example: index.setInfoStream(System.err);
synchronized(directory) {
assureOpen();
if (indexWriter != null) {
indexWriter.setInfoStream(infoStream);
}
this.infoStream = infoStream;
}

public void setMaxBufferedDocs(int maxBufferedDocs)
Determines the minimal number of documents buffered in memory before they are
merged and a new segment is created. Larger values give faster indexing at the
cost of more RAM.
The default value is 10.
synchronized(directory) {
assureOpen();
if (indexWriter != null) {
indexWriter.setMaxBufferedDocs(maxBufferedDocs);
}
this.maxBufferedDocs = maxBufferedDocs;
}

public void setMaxFieldLength(int maxFieldLength)
The maximum number of terms that will be indexed for a single field in a
document. This limits the amount of memory required for indexing, so that
collections with very large files will not crash the indexing process by
running out of memory.
Note that this effectively truncates large documents, excluding from the
index terms that occur further in the document. If you know your source
documents are large, be sure to set this value high enough to accommodate
the expected size. If you set it to Integer.MAX_VALUE, then the only limit
is your memory, but you should anticipate an OutOfMemoryError.
By default, no more than 10,000 terms will be indexed for a field.
synchronized(directory) {
assureOpen();
if (indexWriter != null) {
indexWriter.setMaxFieldLength(maxFieldLength);
}
this.maxFieldLength = maxFieldLength;
}
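
For example, a sketch of raising the limit before indexing unusually large
documents (the value 100000 and the largeDoc variable are purely illustrative):

    // Sketch: allow up to 100,000 terms per field instead of the 10,000 default.
    indexModifier.setMaxFieldLength(100000);
    indexModifier.addDocument(largeDoc);   // largeDoc: a hypothetical large Document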

public void setMergeFactor(int mergeFactor)
Determines how often segment indices are merged by addDocument(). With
smaller values, less RAM is used while indexing, and searches on
unoptimized indices are faster, but indexing speed is slower. With larger
values, more RAM is used during indexing, and while searches on unoptimized
indices are slower, indexing is faster. Thus larger values (> 10) are best
for batch index creation, and smaller values (< 10) for indices that are
interactively maintained.
This must never be less than 2. The default value is 10.
synchronized(directory) {
assureOpen();
if (indexWriter != null) {
indexWriter.setMergeFactor(mergeFactor);
}
this.mergeFactor = mergeFactor;
}
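
A sketch of tuning these settings for batch index creation, as suggested
above (the concrete values are illustrative only, not recommendations):

    // Sketch: favor indexing throughput over search speed on the unoptimized index.
    indexModifier.setMergeFactor(50);        // larger than the default of 10
    indexModifier.setMaxBufferedDocs(100);   // buffer more documents in RAM per segment
    // ... add many documents here ...
    indexModifier.optimize();                // merge everything into one segment for searching
    indexModifier.close();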

public void setUseCompoundFile(boolean useCompoundFile)
Setting to turn on usage of a compound file. When on, multiple files
for each segment are merged into a single file once the segment creation
is finished. This is done regardless of what directory is in use.
synchronized(directory) {
assureOpen();
if (indexWriter != null) {
indexWriter.setUseCompoundFile(useCompoundFile);
}
this.useCompoundFile = useCompoundFile;
}

public java.lang.String toString()
return "Index@" + directory;