File | Doc | Category | Size | Date | Package
CharsetEncoder.java | API Doc | Android 1.5 API | 33442 | Wed May 06 22:41:04 BST 2009 | java.nio.charset

CharsetEncoder

public abstract class CharsetEncoder extends Object
A converter that converts a 16-bit Unicode character sequence to a byte sequence in some charset.

The input character sequence is wrapped by a {@link java.nio.CharBuffer CharBuffer} and the output byte sequence is a {@link java.nio.ByteBuffer ByteBuffer}. An encoder instance should be used in the following sequence, which is referred to as an encoding operation:

  1. invoking the {@link #reset() reset} method to reset the encoder if the encoder has been used;
  2. invoking the {@link #encode(CharBuffer, ByteBuffer, boolean) encode} method repeatedly until no additional input is needed; for these invocations the endOfInput parameter must be set to {@code false}, and between invocations the input buffer must be refilled and the output buffer must be flushed;
  3. invoking the {@link #encode(CharBuffer, ByteBuffer, boolean) encode} method one last time with the endOfInput parameter set to {@code true};
  4. invoking the {@link #flush(ByteBuffer) flush} method to flush the output.

The {@link #encode(CharBuffer, ByteBuffer, boolean) encode} method will convert as many characters as possible, and the process won't stop until the input characters have run out, the output buffer has been filled or some error has happened. A {@link CoderResult CoderResult} instance will be returned to indicate the stop reason, and the invoker can identify the result and choose further action, which includes filling the input buffer, flushing the output buffer or recovering from an error and trying again.
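The four steps above can be sketched as follows. This is a minimal illustration, not the library's own code: the class name StepwiseEncodeSketch and its helper methods are hypothetical, it assumes a Java 7 or later runtime where StandardCharsets exists (Charset.forName("UTF-8") would be the substitute elsewhere), and it assumes no chunk ends in the middle of a surrogate pair.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.CoderResult;
import java.nio.charset.StandardCharsets;

final class StepwiseEncodeSketch {

    /** Moves the bytes written so far into the sink and empties the buffer. */
    private static void drain(ByteBuffer out, ByteArrayOutputStream sink) {
        out.flip();
        while (out.hasRemaining()) {
            sink.write(out.get());
        }
        out.clear();
    }

    /**
     * Encodes the chunks as UTF-8 through one small, reused output buffer.
     * Assumption: no chunk ends in the middle of a surrogate pair.
     */
    static byte[] encodeChunks(String... chunks) {
        CharsetEncoder encoder = StandardCharsets.UTF_8.newEncoder();
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        ByteBuffer out = ByteBuffer.allocate(4); // tiny on purpose, to provoke OVERFLOW

        encoder.reset(); // step 1 (harmless on a fresh encoder)
        for (String chunk : chunks) {
            CharBuffer in = CharBuffer.wrap(chunk);
            CoderResult result;
            // step 2: more input may follow, so endOfInput is false
            while ((result = encoder.encode(in, out, false)).isOverflow()) {
                drain(out, sink);
            }
            if (result.isError()) {
                throw new IllegalArgumentException("cannot encode: " + result);
            }
        }
        CharBuffer noMoreInput = CharBuffer.allocate(0);
        // step 3: final invocation with endOfInput set to true
        while (encoder.encode(noMoreInput, out, true).isOverflow()) {
            drain(out, sink);
        }
        // step 4: flush, retrying while the output buffer is too small
        while (encoder.flush(out).isOverflow()) {
            drain(out, sink);
        }
        drain(out, sink);
        return sink.toByteArray();
    }

    public static void main(String[] args) {
        byte[] bytes = encodeChunks("h\u00e9llo ", "w\u00f6rld");
        System.out.println(bytes.length + " bytes");
    }
}
```

The deliberately small output buffer makes the OVERFLOW branch run often; a real caller would normally pick a larger buffer and drain it to a channel or stream.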

There are two common encoding errors. The first, malformed input, is reported when the input content is an illegal 16-bit Unicode character sequence. The second, unmappable character, occurs when the input is legal but cannot be mapped to a valid byte sequence in the specified charset.

Both errors can be handled in three ways. The default is to report the error to the invoker through a {@link CoderResult CoderResult} instance; the alternatives are to ignore it or to replace the erroneous input with the replacement byte array. The replacement byte array is '{@code ?}' by default and can be changed by invoking the {@link #replaceWith(byte[]) replaceWith} method. The invoker of this encoder can choose one way by specifying a {@link CodingErrorAction CodingErrorAction} instance for each error type via the {@link #onMalformedInput(CodingErrorAction) onMalformedInput} method and the {@link #onUnmappableCharacter(CodingErrorAction) onUnmappableCharacter} method.
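As a rough illustration of the three error actions, the sketch below encodes a string containing a character that US-ASCII cannot represent. The class ErrorActionSketch and the method encodeAscii are hypothetical names, and the code assumes a runtime that provides StandardCharsets.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

final class ErrorActionSketch {

    /** Encodes text as US-ASCII with the given action; returns null if an error is reported. */
    static String encodeAscii(String text, CodingErrorAction action) {
        CharsetEncoder encoder = StandardCharsets.US_ASCII.newEncoder()
                .onMalformedInput(action)
                .onUnmappableCharacter(action);
        try {
            ByteBuffer encoded = encoder.encode(CharBuffer.wrap(text));
            byte[] bytes = new byte[encoded.remaining()];
            encoded.get(bytes);
            return new String(bytes, StandardCharsets.US_ASCII);
        } catch (CharacterCodingException e) {
            return null; // REPORT surfaced the error as an exception
        }
    }

    public static void main(String[] args) {
        String text = "h\u00e9llo"; // U+00E9 has no US-ASCII mapping
        System.out.println(encodeAscii(text, CodingErrorAction.REPLACE)); // h?llo
        System.out.println(encodeAscii(text, CodingErrorAction.IGNORE));  // hllo
        System.out.println(encodeAscii(text, CodingErrorAction.REPORT));  // null
    }
}
```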

This class is abstract and encapsulates many common operations of the encoding process for all charsets. Encoders for a specific charset should extend this class and need only to implement the {@link #encodeLoop(CharBuffer, ByteBuffer) encodeLoop} method for basic encoding. If a subclass maintains an internal state, it should override the {@link #implFlush(ByteBuffer) implFlush} method and the {@link #implReset() implReset} method in addition.
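To make the subclassing contract concrete, here is a minimal sketch of an encoder that implements only encodeLoop. SevenBitEncoderSketch is a hypothetical class; it borrows the existing US-ASCII Charset object (via StandardCharsets, assumed to be available) so that no custom Charset is needed, and it keeps no internal state, so implFlush and implReset are left at their defaults. For simplicity it reports every char above 0x7F as unmappable with length 1, so surrogate pairs are handled one unit at a time.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.CoderResult;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

/** Hypothetical 7-bit encoder; the name and behaviour are illustrative only. */
final class SevenBitEncoderSketch extends CharsetEncoder {

    SevenBitEncoderSketch() {
        // One byte per char on average and at most; the default replacement is '?'.
        super(StandardCharsets.US_ASCII, 1.0f, 1.0f);
    }

    @Override
    protected CoderResult encodeLoop(CharBuffer in, ByteBuffer out) {
        while (in.hasRemaining()) {
            if (!out.hasRemaining()) {
                return CoderResult.OVERFLOW;
            }
            char c = in.get(in.position()); // peek without consuming
            if (c > 0x7F) {
                // Leave the position at the offending char so encode() can apply the error action.
                return CoderResult.unmappableForLength(1);
            }
            out.put((byte) c);
            in.position(in.position() + 1);
        }
        return CoderResult.UNDERFLOW;
    }

    /** Encodes text with REPLACE as the unmappable-character action. */
    static String encodeWithReplacement(String text) {
        CharsetEncoder encoder = new SevenBitEncoderSketch()
                .onUnmappableCharacter(CodingErrorAction.REPLACE);
        try {
            ByteBuffer encoded = encoder.encode(CharBuffer.wrap(text));
            byte[] bytes = new byte[encoded.remaining()];
            encoded.get(bytes);
            return new String(bytes, StandardCharsets.US_ASCII);
        } catch (CharacterCodingException e) {
            throw new IllegalStateException(e); // not expected with REPLACE
        }
    }

    public static void main(String[] args) {
        System.out.println(encodeWithReplacement("na\u00efve")); // na?ve
    }
}
```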

This class is not thread-safe.

see
java.nio.charset.Charset
see
java.nio.charset.CharsetDecoder
since
Android 1.0

Fields Summary
private static final int
INIT
private static final int
ONGOING
private static final int
END
private static final int
FLUSH
private Charset
cs
private float
averBytes
private float
maxBytes
private byte[]
replace
private int
status
private CodingErrorAction
malformAction
private CodingErrorAction
unmapAction
private CharsetDecoder
decoder
Constructors Summary
protected CharsetEncoder(Charset cs, float averageBytesPerChar, float maxBytesPerChar)
Constructs a new CharsetEncoder using the given Charset, average number and maximum number of bytes created by this encoder for one input character.

param
cs the Charset to be used by this encoder.
param
averageBytesPerChar average number of bytes created by this encoder for one input character, must be positive.
param
maxBytesPerChar maximum number of bytes which can be created by this encoder for one input character, must be positive.
throws
IllegalArgumentException if maxBytesPerChar or averageBytesPerChar is not positive, or if averageBytesPerChar is greater than maxBytesPerChar.
since
Android 1.0


    /*
     * --------------------------------------- Constructors
     * ---------------------------------------
     */

        this(cs, averageBytesPerChar, maxBytesPerChar,
                new byte[] { (byte) '?' });
    
protected CharsetEncoder(Charset cs, float averageBytesPerChar, float maxBytesPerChar, byte[] replacement)
Constructs a new CharsetEncoder using the given Charset, replacement byte array, average number and maximum number of bytes created by this encoder for one input character.

param
cs the Charset to be used by this encoder.
param
averageBytesPerChar average number of bytes created by this encoder for one single input character, must be positive.
param
maxBytesPerChar maximum number of bytes which can be created by this encoder for one single input character, must be positive.
param
replacement the replacement byte array; it cannot be null or empty, its length cannot be larger than maxBytesPerChar, and it must be a legal replacement, which can be verified by {@link #isLegalReplacement(byte[]) isLegalReplacement}.
throws
IllegalArgumentException if any parameters are invalid.
since
Android 1.0

        if (averageBytesPerChar <= 0 || maxBytesPerChar <= 0) {
            // niochar.02=Bytes number for one character must be positive.
            throw new IllegalArgumentException(Messages.getString("niochar.02")); //$NON-NLS-1$
        }
        if (averageBytesPerChar > maxBytesPerChar) {
            // niochar.03=averageBytesPerChar is greater than maxBytesPerChar.
            throw new IllegalArgumentException(Messages.getString("niochar.03")); //$NON-NLS-1$
        }
        this.cs = cs;
        averBytes = averageBytesPerChar;
        maxBytes = maxBytesPerChar;
        status = INIT;
        malformAction = CodingErrorAction.REPORT;
        unmapAction = CodingErrorAction.REPORT;
        replaceWith(replacement);
    
Methods Summary
private java.nio.ByteBuffer allocateMore(java.nio.ByteBuffer output)

        if (output.capacity() == 0) {
            return ByteBuffer.allocate(1);
        }
        ByteBuffer result = ByteBuffer.allocate(output.capacity() * 2);
        output.flip();
        result.put(output);
        return result;
    
public final float averageBytesPerChar()
Gets the average number of bytes created by this encoder for a single input character.

return
the average number of bytes created by this encoder for a single input character.
since
Android 1.0

        return averBytes;
    
public boolean canEncode(char c)
Checks if the given character can be encoded by this encoder.

Note that this method can change the internal status of this encoder, so it should not be called when another encoding process is ongoing, otherwise it will throw an IllegalStateException.

This method can be overridden for performance improvement.

param
c the given character.
return
true if given character can be encoded by this encoder.
throws
IllegalStateException if another encoding process is ongoing, that is, if the current internal status is neither INIT (reset) nor FLUSH.
since
Android 1.0

        return implCanEncode(CharBuffer.wrap(new char[] { c }));
    
public boolean canEncode(java.lang.CharSequence sequence)
Checks if a given CharSequence can be encoded by this encoder. Note that this method can change the internal status of this encoder, so it should not be called when another encode process is ongoing, otherwise it will throw an IllegalStateException. This method can be overridden for performance improvement.

param
sequence the given CharSequence.
return
true if the given CharSequence can be encoded by this encoder.
throws
IllegalStateException if the current internal status is neither INIT (reset) nor FLUSH.
since
Android 1.0

        CharBuffer cb;
        if (sequence instanceof CharBuffer) {
            cb = ((CharBuffer) sequence).duplicate();
        } else {
            cb = CharBuffer.wrap(sequence);
        }
        return implCanEncode(cb);
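
A short usage sketch for the two canEncode overloads follows. CanEncodeSketch and its methods are hypothetical names, and StandardCharsets is assumed to be available. A fresh encoder is created per call, so no other encoding process can be ongoing.

```java
import java.nio.charset.CharsetEncoder;
import java.nio.charset.StandardCharsets;

final class CanEncodeSketch {

    /** True if every character of text has a US-ASCII mapping. */
    static boolean asciiCanEncode(CharSequence text) {
        CharsetEncoder encoder = StandardCharsets.US_ASCII.newEncoder();
        return encoder.canEncode(text);
    }

    /** True if the single character has an ISO-8859-1 mapping. */
    static boolean latin1CanEncode(char c) {
        CharsetEncoder encoder = StandardCharsets.ISO_8859_1.newEncoder();
        return encoder.canEncode(c);
    }

    public static void main(String[] args) {
        System.out.println(asciiCanEncode("plain"));     // true
        System.out.println(asciiCanEncode("caf\u00e9")); // false
        System.out.println(latin1CanEncode('\u00e9'));   // true
        System.out.println(latin1CanEncode('\u20ac'));   // false
    }
}
```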
    
public final java.nio.charset.Charset charset()
Gets the Charset which this encoder uses.

return
the Charset which this encoder uses.
since
Android 1.0

        return cs;
    
private void checkCoderResult(java.nio.charset.CoderResult result)

        if (result.isMalformed() && malformAction == CodingErrorAction.REPORT) {
            throw new MalformedInputException(result.length());
        } else if (result.isUnmappable()
                && unmapAction == CodingErrorAction.REPORT) {
            throw new UnmappableCharacterException(result.length());
        }
    
public final java.nio.charset.CoderResult encode(java.nio.CharBuffer in, java.nio.ByteBuffer out, boolean endOfInput)
Encodes characters starting at the current position of the given input buffer, and writes the equivalent byte sequence into the given output buffer from its current position.

The buffers' positions will be changed by the reading and writing operations, but their limits and marks will be kept intact.

A CoderResult instance will be returned according to the following rules:

  • A {@link CoderResult#malformedForLength(int) malformed input} result indicates that a malformed input error was encountered; the erroneous characters start at the input buffer's position, and their number can be obtained from the result's {@link CoderResult#length() length}. This kind of result can be returned only if the malformed action is {@link CodingErrorAction#REPORT CodingErrorAction.REPORT}.
  • {@link CoderResult#UNDERFLOW CoderResult.UNDERFLOW} indicates that as many characters as possible in the input buffer have been encoded. If there is no further input and no characters left in the input buffer then this task is complete. If this is not the case then the client should call this method again supplying some more input characters.
  • {@link CoderResult#OVERFLOW CoderResult.OVERFLOW} indicates that the output buffer has been filled, while there are still some characters remaining in the input buffer. This method should be invoked again with a non-full output buffer.
  • A {@link CoderResult#unmappableForLength(int) unmappable character} result indicates that an unmappable character error was encountered; the erroneous characters start at the input buffer's position, and their number can be obtained from the result's {@link CoderResult#length() length}. This kind of result can be returned only if the unmappable character action is {@link CodingErrorAction#REPORT CodingErrorAction.REPORT}.

The endOfInput parameter indicates whether the invoker can provide further input. This parameter is true if and only if the characters in the current input buffer are all of the input for this encoding operation. Note that it is common, and won't cause an error, for the invoker to pass false and then have no more input available. Passing true in several consecutive invocations, however, may cause an error, because any remaining input will be treated as malformed input.

This method invokes the {@link #encodeLoop(CharBuffer, ByteBuffer) encodeLoop} method to implement the basic encode logic for a specific charset.

param
in the input buffer.
param
out the output buffer.
param
endOfInput true if all the input characters have been provided.
return
a CoderResult instance indicating the result.
throws
IllegalStateException if this encoder has already been flushed and not reset, or if endOfInput is false after a previous invocation in the same encoding operation passed true.
throws
CoderMalfunctionError if the {@link #encodeLoop(CharBuffer, ByteBuffer) encodeLoop} method threw a BufferOverflowException or BufferUnderflowException.
since
Android 1.0

        if ((status == FLUSH) || (!endOfInput && status == END)) {
            throw new IllegalStateException();
        }

        CoderResult result;
        while (true) {
            try {
                result = encodeLoop(in, out);
            } catch (BufferOverflowException e) {
                throw new CoderMalfunctionError(e);
            } catch (BufferUnderflowException e) {
                throw new CoderMalfunctionError(e);
            }
            if (result.isUnderflow()) {
                int remaining = in.remaining();
                status = endOfInput ? END : ONGOING;
                if (endOfInput && remaining > 0) {
                    result = CoderResult.malformedForLength(remaining);
                } else {
                    return result;
                }
            }
            if (result.isOverflow()) {
                status = endOfInput ? END : ONGOING;
                return result;
            }
            CodingErrorAction action = malformAction;
            if (result.isUnmappable()) {
                action = unmapAction;
            }
            // If the action is IGNORE or REPLACE, we should continue
            // encoding.
            if (action == CodingErrorAction.REPLACE) {
                if (out.remaining() < replace.length) {
                    return CoderResult.OVERFLOW;
                }
                out.put(replace);
            } else {
                if (action != CodingErrorAction.IGNORE) {
                    return result;
                }
            }
            in.position(in.position() + result.length());
        }
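
The effect of the endOfInput parameter can be seen with input that ends in an unpaired high surrogate. The sketch below is illustrative only: EndOfInputSketch is a hypothetical class, and it assumes the standard UTF-8 encoder obtained through StandardCharsets with the default REPORT actions. With endOfInput set to false the encoder waits for the low surrogate and returns UNDERFLOW with the character still in the buffer; with true, the leftover character is classified as malformed input, which mirrors the underflow branch in the method body above.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.CoderResult;
import java.nio.charset.StandardCharsets;

final class EndOfInputSketch {

    /** Encodes a lone high surrogate as UTF-8 and returns how encode() classified it. */
    static CoderResult encodeLoneHighSurrogate(boolean endOfInput) {
        CharsetEncoder encoder = StandardCharsets.UTF_8.newEncoder();
        CharBuffer in = CharBuffer.wrap("\uD83D"); // first half of a surrogate pair
        ByteBuffer out = ByteBuffer.allocate(8);
        return encoder.encode(in, out, endOfInput);
    }

    public static void main(String[] args) {
        System.out.println(encodeLoneHighSurrogate(false)); // waits for more input
        System.out.println(encodeLoneHighSurrogate(true));  // reported as malformed
    }
}
```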
    
public final java.nio.ByteBuffer encode(java.nio.CharBuffer in)
This is a facade method for the encoding operation.

This method encodes the remaining character sequence of the given character buffer into a new byte buffer. It performs a complete encoding operation: it first resets the encoder, then encodes, and finally flushes.

This method should not be invoked if another encode operation is ongoing.

param
in the input buffer.
return
a new ByteBuffer containing the bytes produced by this encoding operation. The buffer's limit will be the position of the last byte in the buffer, and the position will be zero.
throws
IllegalStateException if another encoding operation is ongoing.
throws
MalformedInputException if an illegal input character sequence for this charset is encountered, and the action for malformed error is {@link CodingErrorAction#REPORT CodingErrorAction.REPORT}.
throws
UnmappableCharacterException if a legal but unmappable input character sequence for this charset is encountered, and the action for unmappable character error is {@link CodingErrorAction#REPORT CodingErrorAction.REPORT}. Unmappable means the Unicode character sequence at the input buffer's current position cannot be mapped to an equivalent byte sequence.
throws
CharacterCodingException if another exception occurs during the encoding operation.
since
Android 1.0

        if (in.remaining() == 0) {
            return ByteBuffer.allocate(0);
        }
        reset();
        int length = (int) (in.remaining() * averBytes);
        ByteBuffer output = ByteBuffer.allocate(length);
        CoderResult result = null;
        while (true) {
            result = encode(in, output, false);
            checkCoderResult(result);
            if (result.isUnderflow()) {
                break;
            } else if (result.isOverflow()) {
                output = allocateMore(output);
            }
        }
        result = encode(in, output, true);
        checkCoderResult(result);

        while (true) {
            result = flush(output);
            checkCoderResult(result);
            if (result.isOverflow()) {
                output = allocateMore(output);
            } else {
                break;
            }
        }

        output.flip();
        if (result.isMalformed()) {
            throw new MalformedInputException(result.length());
        } else if (result.isUnmappable()) {
            throw new UnmappableCharacterException(result.length());
        }
        status = FLUSH;
        return output;
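
For callers that do not need incremental encoding, the facade can be used roughly as follows. FacadeEncodeSketch and toUtf8 are hypothetical names, and StandardCharsets is assumed to be available. The sketch relies on the default REPORT actions, so malformed input surfaces as a CharacterCodingException, which is rethrown here as an unchecked exception.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.StandardCharsets;

final class FacadeEncodeSketch {

    /** Encodes text as UTF-8 in one call; rejects input that cannot be encoded. */
    static byte[] toUtf8(String text) {
        CharsetEncoder encoder = StandardCharsets.UTF_8.newEncoder();
        try {
            ByteBuffer encoded = encoder.encode(CharBuffer.wrap(text));
            // The returned buffer is ready for reading: copy position..limit.
            byte[] bytes = new byte[encoded.remaining()];
            encoded.get(bytes);
            return bytes;
        } catch (CharacterCodingException e) {
            throw new IllegalArgumentException("input is not encodable", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(toUtf8("h\u00e9llo").length); // 6
    }
}
```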
    
protected abstract java.nio.charset.CoderResult encodeLoop(java.nio.CharBuffer in, java.nio.ByteBuffer out)
Encodes characters into bytes. This method is called by {@link #encode(CharBuffer, ByteBuffer, boolean) encode}.

This method will implement the essential encoding operation, and it won't stop encoding until either all the input characters are read, the output buffer is filled, or some exception is encountered. Then it will return a CoderResult object indicating the result of the current encoding operation. The rule to construct the CoderResult is the same as for {@link #encode(CharBuffer, ByteBuffer, boolean) encode}. When an exception is encountered in the encoding operation, most implementations of this method will return a relevant result object to the {@link #encode(CharBuffer, ByteBuffer, boolean) encode} method, and some performance optimized implementation may handle the exception and implement the error action itself.

The buffers are scanned from their current positions, and their positions will be modified accordingly, while their marks and limits will remain intact. At most {@link CharBuffer#remaining() in.remaining()} characters will be read, and at most {@link ByteBuffer#remaining() out.remaining()} bytes will be written.

Note that some implementations may pre-scan the input buffer and return CoderResult.UNDERFLOW until they receive sufficient input.

param
in the input buffer.
param
out the output buffer.
return
a CoderResult instance indicating the result.
since
Android 1.0

public final java.nio.charset.CoderResult flush(java.nio.ByteBuffer out)
Flushes this encoder.

This method will call {@link #implFlush(ByteBuffer) implFlush}. Some encoders may need to write some bytes to the output buffer when they have read all input characters; subclasses can override {@link #implFlush(ByteBuffer) implFlush} to perform the writing action.

The number of bytes written won't be larger than {@link ByteBuffer#remaining() out.remaining()}. If an encoder wants to write more bytes than the output buffer's remaining space, then CoderResult.OVERFLOW will be returned, and this method must be called again with a byte buffer that has free space. Otherwise this method will return CoderResult.UNDERFLOW, which means one encoding process has been completed successfully.

During the flush, the output buffer's position will be changed accordingly, while its mark and limit will remain intact.

param
out the given output buffer.
return
CoderResult.UNDERFLOW or CoderResult.OVERFLOW.
throws
IllegalStateException if this encoder hasn't read all input characters during one encoding process, that is, if this method is called neither after {@link #encode(CharBuffer) encode(CharBuffer)} nor after {@link #encode(CharBuffer, ByteBuffer, boolean) encode(CharBuffer, ByteBuffer, boolean)} with {@code true} for the last boolean parameter.
since
Android 1.0

        if (status != END && status != INIT) {
            throw new IllegalStateException();
        }
        CoderResult result = implFlush(out);
        if (result == CoderResult.UNDERFLOW) {
            status = FLUSH;
        }
        return result;
    
private boolean implCanEncode(java.nio.CharBuffer cb)

        if (status == FLUSH) {
            status = INIT;
        }
        if (status != INIT) {
            // niochar.0B=Another encoding process is ongoing\!
            throw new IllegalStateException(Messages.getString("niochar.0B")); //$NON-NLS-1$
        }
        CodingErrorAction malformBak = malformAction;
        CodingErrorAction unmapBak = unmapAction;
        onMalformedInput(CodingErrorAction.REPORT);
        onUnmappableCharacter(CodingErrorAction.REPORT);
        boolean result = true;
        try {
            this.encode(cb);
        } catch (CharacterCodingException e) {
            result = false;
        }
        onMalformedInput(malformBak);
        onUnmappableCharacter(unmapBak);
        reset();
        return result;
    
protected java.nio.charset.CoderResult implFlush(java.nio.ByteBuffer out)
Flushes this encoder. The default implementation does nothing and always returns CoderResult.UNDERFLOW; this method can be overridden if needed.

param
out the output buffer.
return
CoderResult.UNDERFLOW or CoderResult.OVERFLOW.
since
Android 1.0

        return CoderResult.UNDERFLOW;
    
protected void implOnMalformedInput(java.nio.charset.CodingErrorAction newAction)
Notifies that this encoder's CodingErrorAction specified for malformed input error has been changed. The default implementation does nothing; this method can be overridden if needed.

param
newAction the new action.
since
Android 1.0

        // default implementation is empty
    
protected void implOnUnmappableCharacter(java.nio.charset.CodingErrorAction newAction)
Notifies that this encoder's CodingErrorAction specified for unmappable character error has been changed. The default implementation does nothing; this method can be overridden if needed.

param
newAction the new action.
since
Android 1.0

        // default implementation is empty
    
protected void implReplaceWith(byte[] newReplacement)
Notifies that this encoder's replacement has been changed. The default implementation does nothing; this method can be overridden if needed.

param
newReplacement the new replacement byte array.
since
Android 1.0

        // default implementation is empty
    
protected void implReset()
Resets this encoder's charset related state. The default implementation does nothing; this method can be overridden if needed.

since
Android 1.0

        // default implementation is empty
    
public boolean isLegalReplacement(byte[] repl)
Checks if the given argument is legal as this encoder's replacement byte array. The given byte array is legal if and only if it can be decoded into 16-bit Unicode characters. This method can be overridden for performance improvement.

param
repl the given byte array to be checked.
return
true if the given argument is legal as this encoder's replacement byte array.
since
Android 1.0

        if (decoder == null) {
            decoder = cs.newDecoder();
        }

        CodingErrorAction malform = decoder.malformedInputAction();
        CodingErrorAction unmap = decoder.unmappableCharacterAction();
        decoder.onMalformedInput(CodingErrorAction.REPORT);
        decoder.onUnmappableCharacter(CodingErrorAction.REPORT);
        ByteBuffer in = ByteBuffer.wrap(repl);
        CharBuffer out = CharBuffer.allocate((int) (repl.length * decoder
                .maxCharsPerByte()));
        CoderResult result = decoder.decode(in, out, true);
        decoder.onMalformedInput(malform);
        decoder.onUnmappableCharacter(unmap);
        return !result.isError();
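
The decodability rule can be tried out with a small sketch. LegalReplacementSketch and legalForAscii are hypothetical names, and the code assumes the standard US-ASCII encoder from StandardCharsets: a single 7-bit byte decodes cleanly, whereas the byte 0x80 is not valid US-ASCII and is therefore rejected.

```java
import java.nio.charset.CharsetEncoder;
import java.nio.charset.StandardCharsets;

final class LegalReplacementSketch {

    /** Asks a US-ASCII encoder whether the candidate bytes are a legal replacement. */
    static boolean legalForAscii(byte... candidate) {
        CharsetEncoder encoder = StandardCharsets.US_ASCII.newEncoder();
        return encoder.isLegalReplacement(candidate);
    }

    public static void main(String[] args) {
        System.out.println(legalForAscii((byte) '_'));  // true
        System.out.println(legalForAscii((byte) 0x80)); // false
    }
}
```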
    
public java.nio.charset.CodingErrorAction malformedInputAction()
Gets this encoder's CodingErrorAction for malformed input errors that occur during the encoding process.

return
this encoder's CodingErrorAction for malformed input errors that occur during the encoding process.
since
Android 1.0

        return malformAction;
    
public final float maxBytesPerChar()
Gets the maximum number of bytes which can be created by this encoder for one input character. The value is always positive.

return
the maximum number of bytes which can be created by this encoder for one input character; always positive.
since
Android 1.0

        return maxBytes;
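
One common use of this value is sizing an output buffer for the worst case, so that a single encode invocation cannot return OVERFLOW. The sketch below is an illustration under that assumption. BufferSizingSketch and encodeInOnePass are hypothetical names, StandardCharsets is assumed to be available, and the final flush step is omitted for brevity.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.CoderResult;
import java.nio.charset.StandardCharsets;

final class BufferSizingSketch {

    /** Encodes text as UTF-8 into a buffer sized from maxBytesPerChar(). */
    static CoderResult encodeInOnePass(String text) {
        CharsetEncoder encoder = StandardCharsets.UTF_8.newEncoder();
        CharBuffer in = CharBuffer.wrap(text);
        // Worst case: every input char needs the maximum number of bytes.
        int worstCase = (int) Math.ceil(in.remaining() * (double) encoder.maxBytesPerChar());
        ByteBuffer out = ByteBuffer.allocate(worstCase);
        return encoder.encode(in, out, true);
    }

    public static void main(String[] args) {
        System.out.println(encodeInOnePass("h\u00e9llo \u20ac").isOverflow()); // false
    }
}
```

By contrast, averageBytesPerChar() only gives an estimate, which is why the facade method grows its buffer when that estimate turns out to be too small.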
    
public final java.nio.charset.CharsetEncoder onMalformedInput(java.nio.charset.CodingErrorAction newAction)
Sets this encoder's action on malformed input error. This method will call the {@link #implOnMalformedInput(CodingErrorAction) implOnMalformedInput} method with the given new action as argument.

param
newAction the new action on malformed input error.
return
this encoder.
throws
IllegalArgumentException if the given newAction is null.
since
Android 1.0

        if (null == newAction) {
            // niochar.0C=Action on malformed input error cannot be null\!
            throw new IllegalArgumentException(Messages.getString("niochar.0C")); //$NON-NLS-1$
        }
        malformAction = newAction;
        implOnMalformedInput(newAction);
        return this;
    
public final java.nio.charset.CharsetEncoder onUnmappableCharacter(java.nio.charset.CodingErrorAction newAction)
Sets this encoder's action on unmappable character error. This method will call the {@link #implOnUnmappableCharacter(CodingErrorAction) implOnUnmappableCharacter} method with the given new action as argument.

param
newAction the new action on unmappable character error.
return
this encoder.
throws
IllegalArgumentException if the given newAction is null.
since
Android 1.0

        if (null == newAction) {
            // niochar.0D=Action on unmappable character error cannot be null\!
            throw new IllegalArgumentException(Messages.getString("niochar.0D")); //$NON-NLS-1$
        }
        unmapAction = newAction;
        implOnUnmappableCharacter(newAction);
        return this;
    
public final java.nio.charset.CharsetEncoder replaceWith(byte[] replacement)
Sets the new replacement value. This method first checks the given replacement's validity, then changes the replacement value and finally calls the {@link #implReplaceWith(byte[]) implReplaceWith} method with the given new replacement as argument.

param
replacement the replacement byte array; it cannot be null or empty, its length cannot be larger than maxBytesPerChar, and it must be a legal replacement, which can be verified by calling isLegalReplacement(byte[] repl).
return
this encoder.
throws
IllegalArgumentException if the given replacement does not satisfy the requirements mentioned above.
since
Android 1.0

        if (null == replacement || 0 == replacement.length
                || maxBytes < replacement.length
                || !isLegalReplacement(replacement)) {
            // niochar.0E=Replacement is illegal
            throw new IllegalArgumentException(Messages.getString("niochar.0E")); //$NON-NLS-1$
        }
        replace = replacement;
        implReplaceWith(replacement);
        return this;
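
A brief sketch of replaceWith in use, including the length check against maxBytesPerChar, is given below. ReplaceWithSketch and encodeWithReplacement are hypothetical names, and the code assumes the standard US-ASCII encoder from StandardCharsets, whose maximum is one byte per character.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

final class ReplaceWithSketch {

    /** Encodes text as US-ASCII, substituting the given bytes for unmappable characters. */
    static String encodeWithReplacement(String text, byte[] replacement) {
        CharsetEncoder encoder = StandardCharsets.US_ASCII.newEncoder()
                .onUnmappableCharacter(CodingErrorAction.REPLACE)
                .replaceWith(replacement); // throws IllegalArgumentException if rejected
        try {
            ByteBuffer encoded = encoder.encode(CharBuffer.wrap(text));
            byte[] bytes = new byte[encoded.remaining()];
            encoded.get(bytes);
            return new String(bytes, StandardCharsets.US_ASCII);
        } catch (CharacterCodingException e) {
            throw new IllegalStateException(e); // e.g. malformed input, still on REPORT
        }
    }

    public static void main(String[] args) {
        System.out.println(encodeWithReplacement("caf\u00e9", new byte[] { '_' })); // caf_
        try {
            // Two bytes exceed the one byte per character this encoder allows.
            encodeWithReplacement("caf\u00e9", new byte[] { '?', '?' });
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: replacement too long");
        }
    }
}
```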
    
public final byte[] replacement()
Gets the replacement byte array, which is never null or empty.

return
the replacement byte array, cannot be null or empty.
since
Android 1.0

        return replace;
    
public final java.nio.charset.CharsetEncoder reset()
Resets this encoder. This method will reset the internal status and then call implReset() to reset any status related to the specific charset.

return
this encoder.
since
Android 1.0

        status = INIT;
        implReset();
        return this;
    
public java.nio.charset.CodingErrorAction unmappableCharacterAction()
Gets this encoder's CodingErrorAction for unmappable character errors that occur during the encoding process.

return
this encoder's CodingErrorAction for unmappable character errors that occur during the encoding process.
since
Android 1.0

        return unmapAction;