File Doc Category Size Date Package
OutputFormat.java API Doc Java SE 5 API 27962 Fri Aug 26 14:56:02 BST 2005 com.sun.org.apache.xml.internal.serialize

OutputFormat

java.lang.Object

public class OutputFormat extends Object

Specifies an output format to control the serializer. Based on the XSLT specification for output format, plus additional parameters. Used to select the suitable serializer and determine how the document should be formatted on output.

The two interesting constructors are:

{@link #OutputFormat(String,String,boolean)} creates a format for the specified method (XML, HTML, Text, etc), encoding and indentation
{@link #OutputFormat(Document,String,boolean)} creates a format compatible with the document type (XML, HTML, Text, etc), encoding and indentation

version: $Revision: 1.20 $ $Date: 2003/12/10 17:14:17 $
author: Assaf Arkin Keith Visco
see: Serializer
see: Method
see: LineSeparator

Fields Summary
private String
_method
Holds the output method specified for this document, or null if no method was specified.
private String
_version
Specifies the version of the output method.
private int
_indent
The indentation level, or zero if no indentation was requested.
private String
_encoding
The encoding to use, if an input stream is used. The default is always UTF-8.
private EncodingInfo
_encodingInfo
The EncodingInfo instance for _encoding.
private boolean
_allowJavaNames
private String
_mediaType
The specified media type or null.
private String
_doctypeSystem
The specified document type system identifier, or null.
private String
_doctypePublic
The specified document type public identifier, or null.
private boolean
_omitXmlDeclaration
True if the XML declaration should be omitted.
private boolean
_omitDoctype
True if the DOCTYPE declaration should be omitted.
private boolean
_omitComments
True if comments should be omitted.
private boolean
_stripComments
True if the comments should be omitted.
private boolean
_standalone
True if the document type should be marked as standalone.
private String[]
_cdataElements
List of element tag names whose text node children must be output as CDATA.
private String[]
_nonEscapingElements
List of element tag names whose text node children must be output unescaped.
private String
_lineSeparator
The selected line separator.
private int
_lineWidth
The line width at which to wrap long lines when indenting.
private boolean
_preserve
True if spaces should be preserved in elements that do not specify otherwise, or specify the default behavior.
private boolean
_preserveEmptyAttributes
If true, an attribute whose value is the empty string is output as "". If false and the HTMLSerializer is in use, only the attribute name is serialized. Defaults to false for backwards compatibility.
Constructors Summary
public OutputFormat()
Constructs a new output format with the default values.
public OutputFormat(String method, String encoding, boolean indenting)
Constructs a new output format with the default values for the specified method and encoding. If indenting is true, the document will be pretty printed with the default indentation level and default line wrapping.
param
method The specified output method
param
encoding The specified encoding
param
indenting True for pretty printing
see
#setEncoding
see
#setIndenting
see
#setMethod
setMethod( method ); setEncoding( encoding ); setIndenting( indenting );
public OutputFormat(Document doc)
Constructs a new output format with the proper method, document type identifiers and media type for the specified document.
param
doc The document to output
see
#whichMethod
setMethod( whichMethod( doc ) ); setDoctype( whichDoctypePublic( doc ), whichDoctypeSystem( doc ) ); setMediaType( whichMediaType( getMethod() ) );
public OutputFormat(Document doc, String encoding, boolean indenting)
Constructs a new output format with the proper method, document type identifiers and media type for the specified document, and with the specified encoding. If indenting is true, the document will be pretty printed with the default indentation level and default line wrapping.
param
doc The document to output
param
encoding The specified encoding
param
indenting True for pretty printing
see
#setEncoding
see
#setIndenting
see
#whichMethod
this( doc ); setEncoding( encoding ); setIndenting( indenting );
Methods Summary
public java.lang.String[] getCDataElements()
Returns a list of all the elements whose text node children should be output as CDATA, or null if no such elements were specified.
return _cdataElements;
public java.lang.String getDoctypePublic()
Returns the specified document type public identifier, or null.
return _doctypePublic;
public java.lang.String getDoctypeSystem()
Returns the specified document type system identifier, or null.
return _doctypeSystem;
public java.lang.String getEncoding()
Returns the specified encoding. If no encoding was specified, the default is always "UTF-8".
return
The encoding
return _encoding;
public com.sun.org.apache.xml.internal.serialize.EncodingInfo getEncodingInfo()
Returns an EncodingInfo instance for the encoding.
see
#setEncoding
if (_encodingInfo == null) _encodingInfo = Encodings.getEncodingInfo(_encoding, _allowJavaNames); return _encodingInfo;
public int getIndent()
Returns the indentation specified. If no indentation was specified, zero is returned and the document should not be indented.
return
The indentation or zero
see
#setIndenting
return _indent;
public boolean getIndenting()
Returns true if indentation was specified.
return ( _indent > 0 );
public char getLastPrintable()
Returns the last printable character based on the selected encoding. Control characters and non-printable characters are always printed as character references.
if ( getEncoding() != null && ( getEncoding().equalsIgnoreCase( "ASCII" ) ) ) return 0xFF; else return 0xFFFF;
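The encoding check above can be read in isolation as the following minimal sketch. The class name `LastPrintableSketch` is hypothetical and the encoding is passed as a parameter rather than read from the format object. Note that only the literal name "ASCII" (compared case-insensitively) selects the lower limit; any other name, including aliases such as "US-ASCII", falls through to 0xFFFF.

```java
final class LastPrintableSketch {

    // Mirrors the body shown above: "ASCII" (any case) limits the
    // printable range to 0xFF, every other encoding name gives 0xFFFF.
    static char lastPrintable(String encoding) {
        if (encoding != null && encoding.equalsIgnoreCase("ASCII")) {
            return 0xFF;
        }
        return 0xFFFF;
    }

    public static void main(String[] args) {
        System.out.println(Integer.toHexString(lastPrintable("ascii")));
        System.out.println(Integer.toHexString(lastPrintable("UTF-8")));
        System.out.println(Integer.toHexString(lastPrintable(null)));
    }
}
```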
public java.lang.String getLineSeparator()
Returns a specific line separator to use. The default is the Web line separator (\n). A string is returned to support double codes (CR + LF).
return
The specified line separator
return _lineSeparator;
public int getLineWidth()
Returns the selected line width for breaking up long lines. When indenting, and only when indenting, long lines will be broken at space boundaries based on this line width. No line wrapping occurs if this value is zero.
return _lineWidth;
public java.lang.String getMediaType()
Returns the specified media type, or null. To determine the media type based on the document type, use {@link #whichMediaType}.
return
The specified media type, or null
return _mediaType;
public java.lang.String getMethod()
Returns the method specified for this output format. Typically the method will be xml, html or text, but it might be other values. If no method was specified, null will be returned and the most suitable method will be determined for the document by calling {@link #whichMethod}.
return
The specified output method, or null
return _method;
public java.lang.String[] getNonEscapingElements()
Returns a list of all the elements whose text node children should be output unescaped (no character references), or null if no such elements were specified.
return _nonEscapingElements;
public boolean getOmitComments()
Returns true if comments should be omitted. The default is false.
return _omitComments;
public boolean getOmitDocumentType()
Returns true if the DOCTYPE declaration should be omitted. The default is false.
return _omitDoctype;
public boolean getOmitXMLDeclaration()
Returns true if the XML document declaration should be omitted. The default is false.
return _omitXmlDeclaration;
public boolean getPreserveEmptyAttributes()
Returns the preserveEmptyAttributes flag. If the flag is false, attributes with empty string values are output as the attribute name only (in HTML mode).
return
The preserve flag
return _preserveEmptyAttributes;
public boolean getPreserveSpace()
Returns true if the default behavior for this format is to preserve spaces. All elements that do not specify otherwise or specify the default behavior will be formatted based on this rule. All elements that specify space preserving will always preserve space.
return _preserve;
public boolean getStandalone()
Returns true if the document type is standalone. The default is false.
return _standalone;
public java.lang.String getVersion()
Returns the version for this output method. If no version was specified, null is returned and the default version number will be used. If the serializer does not support that particular version, it should default to a supported version.
return
The specified method version, or null
return _version;
public boolean isCDataElement(java.lang.String tagName)
Returns true if the text node children of the given element should be output as CDATA.
param
tagName The element's tag name
return
True if should serialize as CDATA
int i; if ( _cdataElements == null ) return false; for ( i = 0 ; i < _cdataElements.length ; ++i ) if ( _cdataElements[ i ].equals( tagName ) ) return true; return false;
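The lookup above is a plain linear scan using `String.equals`, so tag names are matched case-sensitively and an unset list simply yields false. A minimal standalone sketch of the same check follows; the class and method names are hypothetical, and the tag list is passed in rather than held in a field.

```java
final class TagListSketch {

    // Returns true only if tagName matches one of the configured names
    // exactly; a null list means nothing was configured.
    static boolean contains(String[] tagNames, String tagName) {
        if (tagNames == null) {
            return false;
        }
        for (String candidate : tagNames) {
            if (candidate.equals(tagName)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        String[] cdataElements = { "script", "style" };
        System.out.println(contains(cdataElements, "script"));
        System.out.println(contains(cdataElements, "SCRIPT"));
        System.out.println(contains(null, "script"));
    }
}
```

Because the comparison is exact, a caller serializing documents with mixed-case tag names would need to list each spelling it expects.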
public boolean isNonEscapingElement(java.lang.String tagName)
Returns true if the text node children of the given element should be output unescaped.
param
tagName The element's tag name
return
True if should serialize unescaped
int i; if ( _nonEscapingElements == null ) { return false; } for ( i = 0 ; i < _nonEscapingElements.length ; ++i ) if ( _nonEscapingElements[ i ].equals( tagName ) ) return true; return false;
public void setAllowJavaNames(boolean allow)
Sets whether Java encoding names are permitted.
_allowJavaNames = allow;
public boolean setAllowJavaNames()
Returns whether Java encoding names are permitted.
return _allowJavaNames;
public void setCDataElements(java.lang.String[] cdataElements)
Sets the list of elements for which text node children should be output as CDATA.
param
cdataElements List of CDATA element tag names
_cdataElements = cdataElements;
public void setDoctype(java.lang.String publicId, java.lang.String systemId)
Sets the document type public and system identifiers. Required only if the DOM Document or SAX events do not specify the document type, and one must be present in the serialized document. Any document type specified by the DOM Document or SAX events will override these values.
param
publicId The public identifier, or null
param
systemId The system identifier, or null
_doctypePublic = publicId; _doctypeSystem = systemId;
public void setEncoding(java.lang.String encoding)
Sets the encoding for this output method. If no encoding was specified, the default is always "UTF-8". Make sure the encoding is compatible with the one used by the {@link java.io.Writer}.
see
#getEncoding
param
encoding The encoding, or null
_encoding = encoding; _encodingInfo = null;
public void setEncoding(com.sun.org.apache.xml.internal.serialize.EncodingInfo encInfo)
Sets the encoding for this output method with an EncodingInfo instance.
_encoding = encInfo.getIANAName(); _encodingInfo = encInfo;
public void setIndent(int indent)
Sets the indentation. The document will not be indented if the indentation is set to zero. Calling {@link #setIndenting} will reset this value to zero (off) or the default (on).
param
indent The indentation, or zero
if ( indent < 0 ) _indent = 0; else _indent = indent;
public void setIndenting(boolean on)
Sets the indentation on and off. When set on, the default indentation level and default line wrapping is used (see {@link Defaults#Indent} and {@link Defaults#LineWidth}). To specify a different indentation level or line wrapping, use {@link #setIndent} and {@link #setLineWidth}.
param
on True if indentation should be on
if ( on ) { _indent = Defaults.Indent; _lineWidth = Defaults.LineWidth; } else { _indent = 0; _lineWidth = 0; }
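The interaction between `setIndenting`, `setIndent` and `setLineWidth` is easy to miss: switching indentation on or off overwrites both values. The sketch below models only that state. The class name is hypothetical, the default values 4 and 72 are assumptions standing in for `Defaults.Indent` and `Defaults.LineWidth` (which are not shown in this listing), and both fields start at zero for simplicity.

```java
final class IndentStateSketch {

    // Assumed stand-ins for Defaults.Indent and Defaults.LineWidth.
    static final int DEFAULT_INDENT = 4;
    static final int DEFAULT_LINE_WIDTH = 72;

    private int indent;
    private int lineWidth;

    // Turning indenting on or off resets BOTH indent and line width.
    void setIndenting(boolean on) {
        if (on) {
            indent = DEFAULT_INDENT;
            lineWidth = DEFAULT_LINE_WIDTH;
        } else {
            indent = 0;
            lineWidth = 0;
        }
    }

    // Negative values are clamped to zero, as in the setters shown here.
    void setIndent(int value) { indent = Math.max(value, 0); }

    void setLineWidth(int value) { lineWidth = Math.max(value, 0); }

    int getIndent() { return indent; }

    int getLineWidth() { return lineWidth; }

    boolean getIndenting() { return indent > 0; }

    public static void main(String[] args) {
        IndentStateSketch format = new IndentStateSketch();
        format.setIndenting(true);   // defaults first
        format.setIndent(2);         // then the custom values
        format.setLineWidth(120);
        System.out.println(format.getIndent() + " / " + format.getLineWidth());
    }
}
```

As the `main` method suggests, custom values survive only if they are set after the call to `setIndenting`, not before it.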
public void setLineSeparator(java.lang.String lineSeparator)
Sets the line separator. The default is the Web line separator (\n). The machine's line separator can be obtained from the system property line.separator, but is only useful if the document is edited on machines of the same type. For general documents, use the Web line separator.
param
lineSeparator The specified line separator
if ( lineSeparator == null ) _lineSeparator = LineSeparator.Web; else _lineSeparator = lineSeparator;
public void setLineWidth(int lineWidth)
Sets the line width. If zero then no line wrapping will occur. Calling {@link #setIndenting} will reset this value to zero (off) or the default (on).
param
lineWidth The line width to use, or zero for no line wrapping
see
#getLineWidth
see
#setIndenting
if ( lineWidth <= 0 ) _lineWidth = 0; else _lineWidth = lineWidth;
public void setMediaType(java.lang.String mediaType)
Sets the media type.
see
#getMediaType
param
mediaType The specified media type
_mediaType = mediaType;
public void setMethod(java.lang.String method)
Sets the method for this output format.
see
#getMethod
param
method The output method, or null
_method = method;
public void setNonEscapingElements(java.lang.String[] nonEscapingElements)
Sets the list of elements for which text node children should be output unescaped (no character references).
param
nonEscapingElements List of unescaped element tag names
_nonEscapingElements = nonEscapingElements;
public void setOmitComments(boolean omit)
Sets comment omitting on and off.
param
omit True if comments should be omitted
_omitComments = omit;
public void setOmitDocumentType(boolean omit)
Sets DOCTYPE declaration omitting on and off.
param
omit True if DOCTYPE declaration should be omitted
_omitDoctype = omit;
public void setOmitXMLDeclaration(boolean omit)
Sets XML declaration omitting on and off.
param
omit True if XML declaration should be omitted
_omitXmlDeclaration = omit;
public void setPreserveEmptyAttributes(boolean preserve)
Sets the preserveEmptyAttributes flag. If the flag is false, attributes with empty string values are output as the attribute name only (in HTML mode).
param
preserve the preserve flag
_preserveEmptyAttributes = preserve;
public void setPreserveSpace(boolean preserve)
Sets space preserving as the default behavior. The default is space stripping and all elements that do not specify otherwise or use the default value will not preserve spaces.
param
preserve True if spaces should be preserved
_preserve = preserve;
public void setStandalone(boolean standalone)
Sets document DTD standalone. The public and system identifiers must be null for the document to be serialized as standalone.
param
standalone True if document DTD is standalone
_standalone = standalone;
public void setVersion(java.lang.String version)
Sets the version for this output method. For XML the value would be "1.0", for HTML it would be "4.0".
see
#getVersion
param
version The output method version, or null
_version = version;
public static java.lang.String whichDoctypePublic(org.w3c.dom.Document doc)
Returns the document type public identifier specified for this document, or null.
DocumentType doctype;

/* DOM Level 2 was introduced into the code base */
doctype = doc.getDoctype();
if ( doctype != null ) {
    // Note on catch: DOM Level 1 does not specify this method
    // and the code will throw a NoSuchMethodError
    try {
        return doctype.getPublicId();
    } catch ( Error except ) { }
}
if ( doc instanceof HTMLDocument )
    return DTD.XHTMLPublicId;
return null;
public static java.lang.String whichDoctypeSystem(org.w3c.dom.Document doc)
Returns the document type system identifier specified for this document, or null.
DocumentType doctype;

/* DOM Level 2 was introduced into the code base */
doctype = doc.getDoctype();
if ( doctype != null ) {
    // Note on catch: DOM Level 1 does not specify this method
    // and the code will throw a NoSuchMethodError
    try {
        return doctype.getSystemId();
    } catch ( Error except ) { }
}
if ( doc instanceof HTMLDocument )
    return DTD.XHTMLSystemId;
return null;
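The two doctype lookups above can be tried out with the standard DOM API alone. The following is a sketch under stated assumptions: the class and helper names are hypothetical, the documents are built in memory through `DOMImplementation` so nothing is fetched from the network, and the `HTMLDocument` fallback to the XHTML identifiers is left out.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.DOMImplementation;
import org.w3c.dom.Document;
import org.w3c.dom.DocumentType;

final class DoctypeLookupSketch {

    // Builds an in-memory document; a doctype node is attached only
    // when at least one identifier is supplied.
    static Document newDocument(String root, String publicId, String systemId) {
        try {
            DOMImplementation impl = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .getDOMImplementation();
            DocumentType doctype = (publicId == null && systemId == null)
                    ? null
                    : impl.createDocumentType(root, publicId, systemId);
            return impl.createDocument(null, root, doctype);
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    // Public identifier from the document's doctype, or null if absent.
    static String publicId(Document doc) {
        DocumentType doctype = doc.getDoctype();
        return doctype != null ? doctype.getPublicId() : null;
    }

    // System identifier from the document's doctype, or null if absent.
    static String systemId(Document doc) {
        DocumentType doctype = doc.getDoctype();
        return doctype != null ? doctype.getSystemId() : null;
    }

    public static void main(String[] args) {
        Document doc = newDocument("html",
                "-//W3C//DTD XHTML 1.0 Strict//EN",
                "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd");
        System.out.println(publicId(doc));
        System.out.println(systemId(doc));
    }
}
```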
public static java.lang.String whichMediaType(java.lang.String method)
Returns the suitable media format for a document output with the specified method.
if ( method.equalsIgnoreCase( Method.XML ) ) return "text/xml"; if ( method.equalsIgnoreCase( Method.HTML ) ) return "text/html"; if ( method.equalsIgnoreCase( Method.XHTML ) ) return "text/html"; if ( method.equalsIgnoreCase( Method.TEXT ) ) return "text/plain"; if ( method.equalsIgnoreCase( Method.FOP ) ) return "application/pdf"; return null;
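The mapping above can be summarized in a standalone sketch. The class name is hypothetical, and the literal method names ("xml", "html", "xhtml", "text", "fop") are assumptions standing in for the `Method` constants, whose values are not shown in this listing. As in the original body, an unrecognized method yields null, and a null argument would raise a `NullPointerException` because `equalsIgnoreCase` is invoked on it directly.

```java
final class MediaTypeSketch {

    // Method names are compared case-insensitively; the literals are
    // assumed values for Method.XML, Method.HTML, and so on.
    static String whichMediaType(String method) {
        if (method.equalsIgnoreCase("xml")) return "text/xml";
        if (method.equalsIgnoreCase("html")) return "text/html";
        if (method.equalsIgnoreCase("xhtml")) return "text/html";
        if (method.equalsIgnoreCase("text")) return "text/plain";
        if (method.equalsIgnoreCase("fop")) return "application/pdf";
        return null; // unknown method: no media type
    }

    public static void main(String[] args) {
        System.out.println(whichMediaType("XML"));
        System.out.println(whichMediaType("xhtml"));
        System.out.println(whichMediaType("svg"));
    }
}
```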
public static java.lang.String whichMethod(org.w3c.dom.Document doc)
Determines the output method for the specified document. If the document is an instance of {@link org.w3c.dom.html.HTMLDocument} then the method is said to be html. If the root element is 'html' and all text nodes preceding the root element contain only whitespace, then the method is said to be html. The implementation shown below also returns the FOP method when the root element is named 'root'. Otherwise the method is xml.
param
doc The document to check
return
The suitable method
Node node;
String value;
int i;

// If document is derived from HTMLDocument then the default
// method is html.
if ( doc instanceof HTMLDocument )
    return Method.HTML;

// Lookup the root element and the text nodes preceding it.
// If root element is html and all text nodes contain whitespace
// only, the method is html.
// FIXME (SM) should we care about namespaces here?
node = doc.getFirstChild();
while (node != null) {
    // If the root element is html, the method is html.
    if ( node.getNodeType() == Node.ELEMENT_NODE ) {
        if ( node.getNodeName().equalsIgnoreCase( "html" ) ) {
            return Method.HTML;
        } else if ( node.getNodeName().equalsIgnoreCase( "root" ) ) {
            return Method.FOP;
        } else {
            return Method.XML;
        }
    } else if ( node.getNodeType() == Node.TEXT_NODE ) {
        // If a text node preceding the root element contains
        // only whitespace, this might be html, otherwise it's
        // definitely xml.
        value = node.getNodeValue();
        for ( i = 0 ; i < value.length() ; ++i )
            if ( value.charAt( i ) != 0x20 && value.charAt( i ) != 0x0A &&
                 value.charAt( i ) != 0x09 && value.charAt( i ) != 0x0D )
                return Method.XML;
    }
    node = node.getNextSibling();
}
// Anything else, the method is xml.
return Method.XML;
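The detection walk above can be exercised against the standard DOM API. What follows is a minimal sketch, not the class's own code: the class and helper names are hypothetical, the returned strings "html", "fop" and "xml" are assumed values for the `Method` constants, and the `HTMLDocument` instance check is omitted so the example depends only on `org.w3c.dom` and `javax.xml.parsers`.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Node;

final class MethodDetectionSketch {

    // Creates an empty in-memory document, optionally with a root element.
    static Document newDocument(String rootName) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .newDocument();
            if (rootName != null) {
                doc.appendChild(doc.createElement(rootName));
            }
            return doc;
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    // True if the text consists only of space, LF, tab or CR.
    static boolean isWhitespace(String value) {
        for (int i = 0; i < value.length(); ++i) {
            char c = value.charAt(i);
            if (c != 0x20 && c != 0x0A && c != 0x09 && c != 0x0D) {
                return false;
            }
        }
        return true;
    }

    // Walks the document's children until the first element decides the
    // method; a non-whitespace text node before it forces "xml".
    static String whichMethod(Document doc) {
        for (Node node = doc.getFirstChild(); node != null; node = node.getNextSibling()) {
            if (node.getNodeType() == Node.ELEMENT_NODE) {
                String name = node.getNodeName();
                if (name.equalsIgnoreCase("html")) return "html";
                if (name.equalsIgnoreCase("root")) return "fop";
                return "xml";
            }
            if (node.getNodeType() == Node.TEXT_NODE && !isWhitespace(node.getNodeValue())) {
                return "xml";
            }
        }
        return "xml"; // no root element found
    }

    public static void main(String[] args) {
        System.out.println(whichMethod(newDocument("HTML")));
        System.out.println(whichMethod(newDocument("root")));
        System.out.println(whichMethod(newDocument("invoice")));
    }
}
```

One consequence worth noting is that any document whose root element happens to be named "root" is classified as FOP, regardless of its actual content.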