File: OutputFormat.java
Doc: API Doc
Category: Java SE 6 API
Size: 25898
Date: Tue Jun 10 00:23:06 BST 2008
Package: com.sun.org.apache.xml.internal.serialize

OutputFormat

public class OutputFormat extends Object
Specifies an output format to control the serializer. Based on the XSLT specification for output format, plus additional parameters. Used to select the suitable serializer and determine how the document should be formatted on output.

The two interesting constructors are:

  • {@link #OutputFormat(String,String,boolean)} creates a format for the specified method (XML, HTML, Text, etc.), encoding and indentation.
  • {@link #OutputFormat(Document,String,boolean)} creates a format compatible with the document type (XML, HTML, Text, etc.), with the specified encoding and indentation.
version
$Revision: 1.2.6.1 $ $Date: 2005/09/09 07:26:16 $
author
Assaf Arkin, Keith Visco
see
Serializer
see
Method
see
LineSeparator
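
Because this class lives in an internal package, it may not be accessible from application code. As a rough point of comparison only, the sketch below sets the same XSLT-style output options (method, encoding, indentation, omitting the XML declaration) through the public JAXP `Transformer` API. The class name `JaxpOutputSketch` and the sample document are made up for illustration; this is an analogue under those assumptions, not how OutputFormat itself is used.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical demo class: configures XSLT output properties via public JAXP.
final class JaxpOutputSketch {

    // Serializes a tiny in-memory document, optionally omitting the XML declaration.
    static String serialize(boolean omitDeclaration) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder().newDocument();
            Element root = doc.createElement("note");
            root.appendChild(doc.createTextNode("hello"));
            doc.appendChild(root);

            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.METHOD, "xml");
            transformer.setOutputProperty(OutputKeys.ENCODING, "UTF-8");
            transformer.setOutputProperty(OutputKeys.INDENT, "yes");
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION,
                    omitDeclaration ? "yes" : "no");

            StringWriter out = new StringWriter();
            transformer.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            // Checked JAXP exceptions are wrapped to keep the sketch short.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(serialize(false));
    }
}
```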

Fields Summary
private String
_method
Holds the output method specified for this document, or null if no method was specified.
private String
_version
Specifies the version of the output method.
private int
_indent
The indentation level, or zero if no indentation was requested.
private String
_encoding
The encoding to use, if an output stream is used. The default is always UTF-8.
private EncodingInfo
_encodingInfo
The EncodingInfo instance for _encoding.
private boolean
_allowJavaNames
private String
_mediaType
The specified media type or null.
private String
_doctypeSystem
The specified document type system identifier, or null.
private String
_doctypePublic
The specified document type public identifier, or null.
private boolean
_omitXmlDeclaration
True if the XML declaration should be omitted.
private boolean
_omitDoctype
True if the DOCTYPE declaration should be omitted.
private boolean
_omitComments
True if comments should be omitted.
private boolean
_stripComments
True if the comments should be omitted.
private boolean
_standalone
True if the document type should be marked as standalone.
private String[]
_cdataElements
List of element tag names whose text node children must be output as CDATA.
private String[]
_nonEscapingElements
List of element tag names whose text node children must be output unescaped.
private String
_lineSeparator
The selected line separator.
private int
_lineWidth
The line width at which to wrap long lines when indenting.
private boolean
_preserve
True if spaces should be preserved in elements that do not specify otherwise, or specify the default behavior.
private boolean
_preserveEmptyAttributes
If true, an empty string valued attribute is output as "". If false and we are using the HTMLSerializer, then only the attribute name is serialized. Defaults to false for backwards compatibility.
Constructors Summary
public OutputFormat()
Constructs a new output format with the default values.


                  
     
    
    
public OutputFormat(String method, String encoding, boolean indenting)
Constructs a new output format with the default values for the specified method and encoding. If indenting is true, the document will be pretty-printed with the default indentation level and default line wrapping.

param
method The specified output method
param
encoding The specified encoding
param
indenting True for pretty printing
see
#setEncoding
see
#setIndenting
see
#setMethod

        setMethod( method );
        setEncoding( encoding );
        setIndenting( indenting );
    
public OutputFormat(Document doc)
Constructs a new output format with the proper method, document type identifiers and media type for the specified document.

param
doc The document to output
see
#whichMethod

        setMethod( whichMethod( doc ) );
        setDoctype( whichDoctypePublic( doc ), whichDoctypeSystem( doc ) );
        setMediaType( whichMediaType( getMethod() ) );
    
public OutputFormat(Document doc, String encoding, boolean indenting)
Constructs a new output format with the proper method, document type identifiers and media type for the specified document, and with the specified encoding. If indenting is true, the document will be pretty-printed with the default indentation level and default line wrapping.

param
doc The document to output
param
encoding The specified encoding
param
indenting True for pretty printing
see
#setEncoding
see
#setIndenting
see
#whichMethod

        this( doc );
        setEncoding( encoding );
        setIndenting( indenting );
    
Methods Summary
public java.lang.String[] getCDataElements()
Returns a list of all the elements whose text node children should be output as CDATA, or null if no such elements were specified.

        return _cdataElements;
    
public java.lang.String getDoctypePublic()
Returns the specified document type public identifier, or null.

        return _doctypePublic;
    
public java.lang.String getDoctypeSystem()
Returns the specified document type system identifier, or null.

        return _doctypeSystem;
    
public java.lang.String getEncoding()
Returns the specified encoding. If no encoding was specified, the default is always "UTF-8".

return
The encoding

        return _encoding;
    
public com.sun.org.apache.xml.internal.serialize.EncodingInfo getEncodingInfo()
Returns an EncodingInfo instance for the encoding.

see
#setEncoding

        if (_encodingInfo == null)
            _encodingInfo = Encodings.getEncodingInfo(_encoding, _allowJavaNames);
        return _encodingInfo;
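
The getter above resolves the EncodingInfo lazily and caches it, and setEncoding(String) clears that cache. A minimal sketch of the same pattern follows, assuming `java.nio.charset.Charset` as a stand-in for EncodingInfo; the class `LazyEncodingSketch` is hypothetical and is not part of the serializer package.

```java
import java.nio.charset.Charset;

// Hypothetical stand-in for OutputFormat; Charset plays the role of EncodingInfo.
final class LazyEncodingSketch {
    private String encoding = "UTF-8";
    private Charset cached; // resolved on first use, like _encodingInfo

    void setEncoding(String encoding) {
        this.encoding = encoding;
        this.cached = null; // invalidate so the next lookup resolves the new name
    }

    Charset getCharset() {
        if (cached == null) {
            cached = Charset.forName(encoding);
        }
        return cached;
    }

    public static void main(String[] args) {
        LazyEncodingSketch format = new LazyEncodingSketch();
        System.out.println(format.getCharset());
        format.setEncoding("ISO-8859-1");
        System.out.println(format.getCharset());
    }
}
```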
    
public int getIndent()
Returns the indentation specified. If no indentation was specified, zero is returned and the document should not be indented.

return
The indentation or zero
see
#setIndenting

        return _indent;
    
public boolean getIndenting()
Returns true if indentation was specified.

        return ( _indent > 0 );
    
public char getLastPrintable()
Returns the last printable character based on the selected encoding. Control characters and non-printable characters are always printed as character references.

        if ( getEncoding() != null &&
             ( getEncoding().equalsIgnoreCase( "ASCII" ) ) )
            return 0xFF;
        else
            return 0xFFFF;
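
To illustrate how such a cutoff might be applied, the sketch below copies the rule from the source above (0xFF for "ASCII", 0xFFFF otherwise) and writes every character above the cutoff as a hexadecimal character reference. The helper `escapeAbove` and the class `LastPrintableSketch` are hypothetical; the real serializer's escaping code is not shown in this file, and control characters are not handled here.

```java
// Hypothetical demo of using a "last printable" cutoff when writing text.
final class LastPrintableSketch {

    // Same rule as the getLastPrintable() source: note that ASCII maps to 0xFF.
    static char lastPrintable(String encoding) {
        if (encoding != null && encoding.equalsIgnoreCase("ASCII")) {
            return 0xFF;
        }
        return 0xFFFF;
    }

    // Assumed helper: characters above the cutoff become &#xHHHH; references.
    static String escapeAbove(String text, char lastPrintable) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c > lastPrintable) {
                out.append("&#x")
                   .append(Integer.toHexString(c).toUpperCase())
                   .append(';');
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // U+00E9 is below the ASCII cutoff of 0xFF; U+20AC is above it.
        System.out.println(escapeAbove("\u00E9\u20AC", lastPrintable("ASCII")));
    }
}
```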
    
public java.lang.String getLineSeparator()
Returns a specific line separator to use. The default is the Web line separator (\n). A string is returned to support double codes (CR + LF).

return
The specified line separator

        return _lineSeparator;
    
public int getLineWidth()
Returns the selected line width for breaking up long lines. When indenting, and only when indenting, long lines will be broken at space boundaries based on this line width. No line wrapping occurs if this value is zero.

        return _lineWidth;
    
public java.lang.String getMediaType()
Returns the specified media type, or null. To determine the media type based on the document type, use {@link #whichMediaType}.

return
The specified media type, or null

        return _mediaType;
    
public java.lang.String getMethod()
Returns the method specified for this output format. Typically the method will be xml, html or text, but it might be other values. If no method was specified, null will be returned and the most suitable method will be determined for the document by calling {@link #whichMethod}.

return
The specified output method, or null

        return _method;
    
public java.lang.String[] getNonEscapingElements()
Returns a list of all the elements whose text node children should be output unescaped (no character references), or null if no such elements were specified.

        return _nonEscapingElements;
    
public boolean getOmitComments()
Returns true if comments should be omitted. The default is false.

        return _omitComments;
    
public boolean getOmitDocumentType()
Returns true if the DOCTYPE declaration should be omitted. The default is false.

        return _omitDoctype;
    
public boolean getOmitXMLDeclaration()
Returns true if the XML document declaration should be omitted. The default is false.

        return _omitXmlDeclaration;
    
public boolean getPreserveEmptyAttributes()
Returns the preserveEmptyAttributes flag. If the flag is false, attributes with empty string values are output as the attribute name only (in HTML mode).

return
The preserve flag

		return _preserveEmptyAttributes;	
public boolean getPreserveSpace()
Returns true if the default behavior for this format is to preserve spaces. All elements that do not specify otherwise or specify the default behavior will be formatted based on this rule. All elements that specify space preserving will always preserve space.

        return _preserve;
    
public boolean getStandalone()
Returns true if the document type is standalone. The default is false.

        return _standalone;
    
public java.lang.String getVersion()
Returns the version for this output method. If no version was specified, null is returned and the default version number will be used. If the serializer does not support that particular version, it should default to a supported version.

return
The specified method version, or null

        return _version;
    
public boolean isCDataElement(java.lang.String tagName)
Returns true if the text node children of the given element should be output as CDATA.

param
tagName The element's tag name
return
True if should serialize as CDATA

        int i;

        if ( _cdataElements == null )
            return false;
        for ( i = 0 ; i < _cdataElements.length ; ++i )
            if ( _cdataElements[ i ].equals( tagName ) )
                return true;
        return false;
    
public boolean isNonEscapingElement(java.lang.String tagName)
Returns true if the text node children of the given element should be output unescaped.

param
tagName The element's tag name
return
True if should serialize unescaped

        int i;

        if ( _nonEscapingElements == null ) {
            return false;
        }
        for ( i = 0 ; i < _nonEscapingElements.length ; ++i )
            if ( _nonEscapingElements[ i ].equals( tagName ) )
                return true;
        return false;
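
Both isCDataElement and isNonEscapingElement use the same lookup: a null list means "no elements", and tag names are compared with equals, so the match is case-sensitive. A minimal sketch of that lookup, with the hypothetical names `ElementListSketch` and `isListed`:

```java
// Hypothetical helper mirroring the linear lookup used by both methods.
final class ElementListSketch {

    // Returns true only if tagName exactly matches an entry; null list means none.
    static boolean isListed(String[] tagNames, String tagName) {
        if (tagNames == null) {
            return false;
        }
        for (String candidate : tagNames) {
            if (candidate.equals(tagName)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        String[] cdataElements = { "script", "style" };
        System.out.println(isListed(cdataElements, "script"));
        System.out.println(isListed(cdataElements, "SCRIPT"));
        System.out.println(isListed(null, "script"));
    }
}
```

If case-insensitive matching is wanted (for example, for HTML tag names), the caller would need to normalize the names before setting the list; the source above does not do this.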
    
public void setAllowJavaNames(boolean allow)
Sets whether Java encoding names are permitted.

        _allowJavaNames = allow;
    
public boolean setAllowJavaNames()
Returns whether Java encoding names are permitted. Despite the set prefix, this no-argument overload is a getter.

        return _allowJavaNames;
    
public void setCDataElements(java.lang.String[] cdataElements)
Sets the list of elements for which text node children should be output as CDATA.

param
cdataElements List of CDATA element tag names

        _cdataElements = cdataElements;
    
public void setDoctype(java.lang.String publicId, java.lang.String systemId)
Sets the document type public and system identifiers. Required only if the DOM Document or SAX events do not specify the document type, and one must be present in the serialized document. Any document type specified by the DOM Document or SAX events will override these values.

param
publicId The public identifier, or null
param
systemId The system identifier, or null

        _doctypePublic = publicId;
        _doctypeSystem = systemId;
    
public void setEncoding(java.lang.String encoding)
Sets the encoding for this output method. If no encoding was specified, the default is always "UTF-8". Make sure the encoding is compatible with the one used by the {@link java.io.Writer}.

see
#getEncoding
param
encoding The encoding, or null

        _encoding = encoding;
        _encodingInfo = null;
    
public void setEncoding(com.sun.org.apache.xml.internal.serialize.EncodingInfo encInfo)
Sets the encoding for this output method with an EncodingInfo instance.

        _encoding = encInfo.getIANAName();
        _encodingInfo = encInfo;
    
public void setIndent(int indent)
Sets the indentation. The document will not be indented if the indentation is set to zero. Calling {@link #setIndenting} will reset this value to zero (off) or the default (on).

param
indent The indentation, or zero

        if ( indent < 0 )
            _indent = 0;
        else
            _indent = indent;
    
public void setIndenting(boolean on)
Turns indentation on or off. When set on, the default indentation level and default line wrapping are used (see {@link Defaults#Indent} and {@link Defaults#LineWidth}). To specify a different indentation level or line wrapping, use {@link #setIndent} and {@link #setLineWidth}.

param
on True if indentation should be on

        if ( on ) {
            _indent = Defaults.Indent;
            _lineWidth = Defaults.LineWidth;
        } else {
            _indent = 0;
            _lineWidth = 0;
        }
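
Because setIndenting overwrites both the indent and the line width, the order of calls matters: custom values set before setIndenting(true) are lost. The sketch below reproduces the three setters in a hypothetical stand-in class, `IndentSketch`. The default values 4 and 72 are assumptions made for the example; the actual values of Defaults.Indent and Defaults.LineWidth are not shown in this file.

```java
// Hypothetical stand-in reproducing the setter logic shown in the source.
final class IndentSketch {
    static final int DEFAULT_INDENT = 4;      // assumed value of Defaults.Indent
    static final int DEFAULT_LINE_WIDTH = 72; // assumed value of Defaults.LineWidth

    private int indent;
    private int lineWidth;

    void setIndent(int indent) {
        this.indent = Math.max(indent, 0); // negative values are treated as zero
    }

    void setLineWidth(int lineWidth) {
        this.lineWidth = Math.max(lineWidth, 0); // zero means no wrapping
    }

    void setIndenting(boolean on) {
        // Resets BOTH values, discarding anything set earlier.
        indent = on ? DEFAULT_INDENT : 0;
        lineWidth = on ? DEFAULT_LINE_WIDTH : 0;
    }

    int getIndent() { return indent; }
    int getLineWidth() { return lineWidth; }
    boolean getIndenting() { return indent > 0; }

    public static void main(String[] args) {
        IndentSketch format = new IndentSketch();
        format.setIndent(2);
        format.setIndenting(true);  // overwrites the custom indent of 2
        System.out.println(format.getIndent());

        format.setIndent(2);        // customizing after setIndenting(true) sticks
        System.out.println(format.getIndent());
    }
}
```

The practical rule this suggests: call setIndenting(true) first, then adjust with setIndent and setLineWidth.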
    
public void setLineSeparator(java.lang.String lineSeparator)
Sets the line separator. The default is the Web line separator (\n). The machine's line separator can be obtained from the system property line.separator, but is only useful if the document is edited on machines of the same type. For general documents, use the Web line separator.

param
lineSeparator The specified line separator

        if ( lineSeparator == null )
            _lineSeparator =  LineSeparator.Web;
        else
            _lineSeparator = lineSeparator;
    
public void setLineWidth(int lineWidth)
Sets the line width. If zero then no line wrapping will occur. Calling {@link #setIndenting} will reset this value to zero (off) or the default (on).

param
lineWidth The line width to use, or zero (or a negative value) for no line wrapping
see
#getLineWidth
see
#setIndenting

        if ( lineWidth <= 0 )
            _lineWidth = 0;
        else
            _lineWidth = lineWidth;
    
public void setMediaType(java.lang.String mediaType)
Sets the media type.

see
#getMediaType
param
mediaType The specified media type

        _mediaType = mediaType;
    
public void setMethod(java.lang.String method)
Sets the method for this output format.

see
#getMethod
param
method The output method, or null

        _method = method;
    
public void setNonEscapingElements(java.lang.String[] nonEscapingElements)
Sets the list of elements for which text node children should be output unescaped (no character references).

param
nonEscapingElements List of unescaped element tag names

        _nonEscapingElements = nonEscapingElements;
    
public void setOmitComments(boolean omit)
Sets comment omitting on and off.

param
omit True if comments should be omitted

        _omitComments = omit;
    
public void setOmitDocumentType(boolean omit)
Sets DOCTYPE declaration omitting on and off.

param
omit True if DOCTYPE declaration should be omitted

        _omitDoctype = omit;
    
public void setOmitXMLDeclaration(boolean omit)
Sets XML declaration omitting on and off.

param
omit True if XML declaration should be omitted

        _omitXmlDeclaration = omit;
    
public void setPreserveEmptyAttributes(boolean preserve)
Sets the preserveEmptyAttributes flag. If the flag is false, attributes with empty string values are output as the attribute name only (in HTML mode).

param
preserve The preserve flag

		_preserveEmptyAttributes = preserve;	
public void setPreserveSpace(boolean preserve)
Sets space preserving as the default behavior. The default is space stripping and all elements that do not specify otherwise or use the default value will not preserve spaces.

param
preserve True if spaces should be preserved

        _preserve = preserve;
    
public void setStandalone(boolean standalone)
Sets document DTD standalone. The public and system identifiers must be null for the document to be serialized as standalone.

param
standalone True if document DTD is standalone

        _standalone = standalone;
    
public void setVersion(java.lang.String version)
Sets the version for this output method. For XML the value would be "1.0", for HTML it would be "4.0".

see
#getVersion
param
version The output method version, or null

        _version = version;
    
public static java.lang.String whichDoctypePublic(org.w3c.dom.Document doc)
Returns the document type public identifier specified for this document, or null.

        DocumentType doctype;

        /* DOM Level 2 was introduced into the code base */
        doctype = doc.getDoctype();
        if ( doctype != null ) {
            // Note on catch: DOM Level 1 does not specify this method
            // and the code will throw a NoSuchMethodError
            try {
                return doctype.getPublicId();
            } catch ( Error except ) {  }
        }
        
        if ( doc instanceof HTMLDocument )
            return DTD.XHTMLPublicId;
        return null;
    
public static java.lang.String whichDoctypeSystem(org.w3c.dom.Document doc)
Returns the document type system identifier specified for this document, or null.

        DocumentType doctype;

        /* DOM Level 2 was introduced into the code base */
        doctype = doc.getDoctype();
        if ( doctype != null ) {
            // Note on catch: DOM Level 1 does not specify this method
            // and the code will throw a NoSuchMethodError
            try {
                return doctype.getSystemId();
            } catch ( Error except ) { }
        }
        
        if ( doc instanceof HTMLDocument )
            return DTD.XHTMLSystemId;
        return null;
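
The doctype lookups above can be tried with the standard DOM API alone. The sketch below builds a document with a DOCTYPE in memory and reads the identifiers back; it deliberately leaves out the HTMLDocument fallback from the source. The class `DoctypeSketch` and its method names are hypothetical.

```java
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.DOMImplementation;
import org.w3c.dom.Document;
import org.w3c.dom.DocumentType;

// Hypothetical demo: reads doctype identifiers the way the source does,
// minus the HTMLDocument fallback.
final class DoctypeSketch {

    static DocumentBuilder newBuilder() {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // Builds an empty document whose DOCTYPE carries the given identifiers.
    static Document documentWithDoctype(String rootName, String publicId, String systemId) {
        DOMImplementation impl = newBuilder().getDOMImplementation();
        DocumentType doctype = impl.createDocumentType(rootName, publicId, systemId);
        return impl.createDocument(null, rootName, doctype);
    }

    static String publicIdOf(Document doc) {
        DocumentType doctype = doc.getDoctype();
        return doctype != null ? doctype.getPublicId() : null;
    }

    static String systemIdOf(Document doc) {
        DocumentType doctype = doc.getDoctype();
        return doctype != null ? doctype.getSystemId() : null;
    }

    public static void main(String[] args) {
        Document doc = documentWithDoctype("html",
                "-//W3C//DTD XHTML 1.0 Strict//EN",
                "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd");
        System.out.println(publicIdOf(doc));
        System.out.println(systemIdOf(doc));
    }
}
```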
    
public static java.lang.String whichMediaType(java.lang.String method)
Returns the suitable media format for a document output with the specified method.

        if ( method.equalsIgnoreCase( Method.XML ) )
            return "text/xml";
        if ( method.equalsIgnoreCase( Method.HTML ) )
            return "text/html";
        if ( method.equalsIgnoreCase( Method.XHTML ) )
            return "text/html";
        if ( method.equalsIgnoreCase( Method.TEXT ) )
            return "text/plain";
        if ( method.equalsIgnoreCase( Method.FOP ) )
            return "application/pdf";
        return null;
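
The chain of comparisons above amounts to a case-insensitive lookup table. A minimal sketch of the same mapping as a map follows, assuming the Method constants hold the lowercase strings "xml", "html", "xhtml", "text" and "fop" (those values are not shown in this file). Unlike the source, which would throw a NullPointerException for a null method, this hypothetical `MediaTypeSketch` returns null in that case.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical table-driven version of the method-to-media-type mapping.
final class MediaTypeSketch {
    private static final Map<String, String> MEDIA_TYPES = new HashMap<>();

    static {
        // Keys are assumed values of the Method constants.
        MEDIA_TYPES.put("xml", "text/xml");
        MEDIA_TYPES.put("html", "text/html");
        MEDIA_TYPES.put("xhtml", "text/html");
        MEDIA_TYPES.put("text", "text/plain");
        MEDIA_TYPES.put("fop", "application/pdf");
    }

    // Returns the media type for the method, or null if unknown or null.
    static String mediaTypeFor(String method) {
        if (method == null) {
            return null; // design choice of this sketch; the source does not guard null
        }
        return MEDIA_TYPES.get(method.toLowerCase(Locale.ROOT));
    }

    public static void main(String[] args) {
        System.out.println(mediaTypeFor("XML"));
        System.out.println(mediaTypeFor("fop"));
        System.out.println(mediaTypeFor("svg"));
    }
}
```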
    
public static java.lang.String whichMethod(org.w3c.dom.Document doc)
Determines the output method for the specified document. If the document is an instance of {@link org.w3c.dom.html.HTMLDocument}, the method is html. If the root element is 'html' and all text nodes preceding the root element contain only whitespace, the method is html. If the root element is 'root', the method is FOP. Otherwise the method is xml.

param
doc The document to check
return
The suitable method

        Node    node;
        String  value;
        int     i;

        // If document is derived from HTMLDocument then the default
        // method is html.
        if ( doc instanceof HTMLDocument )
            return Method.HTML;

        // Lookup the root element and the text nodes preceding it.
        // If root element is html and all text nodes contain whitespace
        // only, the method is html.

        // FIXME (SM) should we care about namespaces here?

        node = doc.getFirstChild();
        while (node != null) {
            // If the root element is html, the method is html.
            if ( node.getNodeType() == Node.ELEMENT_NODE ) {
                if ( node.getNodeName().equalsIgnoreCase( "html" ) ) {
                    return Method.HTML;
                } else if ( node.getNodeName().equalsIgnoreCase( "root" ) ) {
                    return Method.FOP;
                } else {
                    return Method.XML;
                }
            } else if ( node.getNodeType() == Node.TEXT_NODE ) {
                // If a text node preceding the root element contains
                // only whitespace, this might be html, otherwise it's
                // definitely xml.
                value = node.getNodeValue();
                for ( i = 0 ; i < value.length() ; ++i )
                    if ( value.charAt( i ) != 0x20 && value.charAt( i ) != 0x0A &&
                         value.charAt( i ) != 0x09 && value.charAt( i ) != 0x0D )
                        return Method.XML;
            }
            node = node.getNextSibling();
        }
        // Anything else, the method is xml.
        return Method.XML;
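
The detection logic above can be exercised against parsed documents using only the standard DOM and JAXP APIs. The sketch below repeats the root-element and whitespace checks but skips the HTMLDocument test, and it returns the plain strings "html", "fop" and "xml" as assumed stand-ins for the Method constants. The class `WhichMethodSketch` and its method names are hypothetical.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical re-implementation of the detection rules, without the
// HTMLDocument check; return values are assumed Method constant values.
final class WhichMethodSketch {

    static String methodFor(Document doc) {
        for (Node node = doc.getFirstChild(); node != null; node = node.getNextSibling()) {
            if (node.getNodeType() == Node.ELEMENT_NODE) {
                String name = node.getNodeName();
                if (name.equalsIgnoreCase("html")) return "html";
                if (name.equalsIgnoreCase("root")) return "fop";
                return "xml";
            }
            if (node.getNodeType() == Node.TEXT_NODE && !isWhitespace(node.getNodeValue())) {
                return "xml"; // non-whitespace text before the root element
            }
        }
        return "xml";
    }

    // Same four characters the source treats as whitespace.
    static boolean isWhitespace(String value) {
        for (int i = 0; i < value.length(); i++) {
            char c = value.charAt(i);
            if (c != 0x20 && c != 0x0A && c != 0x09 && c != 0x0D) return false;
        }
        return true;
    }

    // Convenience wrapper: parses the XML text and applies methodFor.
    static String methodForXml(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            return methodFor(doc);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(methodForXml("<html><body/></html>"));
        System.out.println(methodForXml("<root/>"));
        System.out.println(methodForXml("<note/>"));
    }
}
```

Note that any document whose root element happens to be named "root" is classified as FOP under these rules, which may be surprising for ordinary XML.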