URL
public final class URL extends Object implements Serializable
Class URL represents a Uniform Resource
Locator, a pointer to a "resource" on the World
Wide Web. A resource can be something as simple as a file or a
directory, or it can be a reference to a more complicated object,
such as a query to a database or to a search engine. More
information on the types of URLs and their formats can be found at:
http://archive.ncsa.uiuc.edu/SDG/Software/Mosaic/Demo/url-primer.html
In general, a URL can be broken into several parts. The previous
example of a URL indicates that the protocol to use is
http (HyperText Transfer Protocol) and that the
information resides on a host machine named
archive.ncsa.uiuc.edu. The information on that host
machine is named /SDG/Software/Mosaic/Demo/url-primer.html. The exact
meaning of this name on the host machine is both protocol
dependent and host dependent. The information normally resides in
a file, but it could be generated on the fly. This component of
the URL is called the path component.
A URL can optionally specify a "port", which is the
port number to which the TCP connection is made on the remote host
machine. If the port is not specified, the default port for
the protocol is used instead. For example, the default port for
http is 80. The port can also be
specified explicitly, as in:
http://archive.ncsa.uiuc.edu:80/SDG/Software/Mosaic/Demo/url-primer.html
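The parts described above can be read back through the accessor methods. The following is a minimal sketch, not part of the class itself; the class name UrlPartsDemo and the helper parts are illustrative names only. It shows that getPort() reports -1 when no port is written in the URL, while getDefaultPort() reports the protocol default.

```java
import java.net.MalformedURLException;
import java.net.URL;

@SuppressWarnings("deprecation") // URL constructors are deprecated in recent JDKs
class UrlPartsDemo {
    /** Returns "protocol|host|port|defaultPort|path" for the given URL string. */
    static String parts(String spec) {
        try {
            URL u = new URL(spec);
            return u.getProtocol() + "|" + u.getHost() + "|" + u.getPort()
                    + "|" + u.getDefaultPort() + "|" + u.getPath();
        } catch (MalformedURLException e) {
            // Wrapped so callers do not have to handle a checked exception.
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(parts(
            "http://archive.ncsa.uiuc.edu/SDG/Software/Mosaic/Demo/url-primer.html"));
        System.out.println(parts(
            "http://archive.ncsa.uiuc.edu:80/SDG/Software/Mosaic/Demo/url-primer.html"));
    }
}
```

Constructing the URL only parses the string; no connection is made.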
The syntax of URL is defined by RFC 2396: Uniform
Resource Identifiers (URI): Generic Syntax, amended by RFC 2732: Format for
Literal IPv6 Addresses in URLs. The Literal IPv6 address format
also supports scope_ids. The syntax and usage of scope_ids is described
here.
A URL may have appended to it a "fragment", also known
as a "ref" or a "reference". The fragment is indicated by the sharp
sign character "#" followed by more characters. For example,
http://java.sun.com/index.html#chapter1
This fragment is not technically part of the URL. Rather, it
indicates that after the specified resource is retrieved, the
application is specifically interested in that part of the
document that has the tag chapter1 attached to it. The
meaning of a tag is resource specific.
An application can also specify a "relative URL",
which contains only enough information to reach the resource
relative to another URL. Relative URLs are frequently used within
HTML pages. For example, if the contents of the URL:
http://java.sun.com/index.html
contained within it the relative URL:
FAQ.html
it would be a shorthand for:
http://java.sun.com/FAQ.html
The relative URL need not specify all the components of a URL. If
the protocol, host name, or port number is missing, the value is
inherited from the fully specified URL. The file component must be
specified. The optional fragment is not inherited.
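The FAQ.html example above can be reproduced with the two-argument constructor. This is a small sketch under the assumption that the default http handler is available; RelativeUrlDemo and resolve are illustrative names, not part of the API.

```java
import java.net.MalformedURLException;
import java.net.URL;

@SuppressWarnings("deprecation") // URL constructors are deprecated in recent JDKs
class RelativeUrlDemo {
    /** Resolves a relative URL against a base URL and returns the result as text. */
    static String resolve(String base, String relative) {
        try {
            URL context = new URL(base);
            return new URL(context, relative).toString();
        } catch (MalformedURLException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        // Protocol and host are inherited from the base URL.
        System.out.println(resolve("http://java.sun.com/index.html", "FAQ.html"));
        // The fragment of the base URL is not inherited.
        System.out.println(resolve("http://java.sun.com/index.html#chapter1", "FAQ.html"));
    }
}
```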
The URL class does not itself encode or decode any URL components
according to the escaping mechanism defined in RFC2396. It is the
responsibility of the caller to encode any fields that need to be
escaped prior to calling URL, and also to decode any escaped fields
that are returned from URL. Furthermore, because URL has no knowledge
of URL escaping, it does not recognise equivalence between the encoded
or decoded form of the same URL. For example, the two URLs:
http://foo.com/hello world/ and http://foo.com/hello%20world
would be considered not equal to each other.
Note, the java.net.URI class does perform escaping of its
component fields in certain circumstances. The recommended way
to manage the encoding and decoding of URLs is to use java.net.URI,
and to convert between these two classes using toURI() and
URI.toURL().
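One way to follow this recommendation is to let a multi-argument URI constructor quote the raw path and then convert the result to a URL. The sketch below assumes the four-argument URI(scheme, host, path, fragment) constructor is an acceptable fit for the caller's data; UriEscapingDemo and its helper names are illustrative only.

```java
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URISyntaxException;
import java.net.URL;

class UriEscapingDemo {
    /** Builds a URL whose path has been escaped by java.net.URI. */
    static URL encodedUrl(String scheme, String host, String rawPath) {
        try {
            // The multi-argument URI constructors quote illegal characters.
            URI uri = new URI(scheme, host, rawPath, null);
            return uri.toURL();
        } catch (URISyntaxException | MalformedURLException e) {
            throw new IllegalArgumentException(e);
        }
    }

    /** Returns the decoded path of a URL by going through URI. */
    static String decodedPath(URL url) {
        try {
            return url.toURI().getPath();
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        URL url = encodedUrl("http", "foo.com", "/hello world/");
        System.out.println(url);              // escaped external form
        System.out.println(url.getPath());    // URL returns the path as stored, still escaped
        System.out.println(decodedPath(url)); // URI decodes it
    }
}
```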
The URLEncoder and URLDecoder classes can also be
used, but only for HTML form encoding, which is not the same
as the encoding scheme defined in RFC2396. |
Fields Summary |
---|
static final long | serialVersionUID |
private static final String | protocolPathProp: The property which specifies the package prefix list to be scanned
for protocol handlers. The value of this property (if any) should
be a vertical bar delimited list of package names to search through
for a protocol handler to load. The policy of this class is that
all protocol handlers will be in a class called <protocol>.Handler,
and each package in the list is examined in turn for a matching
handler. If none are found (or the property is not specified), the
default package prefix, sun.net.www.protocol, is used. The search
proceeds from the first package in the list to the last and stops
when a match is found. |
private String | protocol: The protocol to use (ftp, http, nntp, etc.). |
private String | host: The host name to connect to. |
private int | port: The protocol port to connect to. |
private String | file: The specified file name on that host. file is
defined as path[?query] |
private transient String | query: The query part of this URL. |
private String | authority: The authority part of this URL. |
private transient String | path: The path part of this URL. |
private transient String | userInfo: The userinfo part of this URL. |
private String | ref: # reference. |
transient InetAddress | hostAddress: The host's IP address, used in equals and hashCode.
Computed on demand. An uninitialized or unknown hostAddress is null. |
transient URLStreamHandler | handler: The URLStreamHandler for this URL. |
private int | hashCode |
static URLStreamHandlerFactory | factory: The URLStreamHandler factory. |
static Hashtable | handlers: A table of protocol handlers. |
private static Object | streamHandlerLock |
Constructors Summary |
---|
public URL(String protocol, String host, int port, String file)
Creates a URL object from the specified
protocol, host, port
number, and file.
host can be expressed as a host name or a literal
IP address. If an IPv6 literal address is used, it should be
enclosed in square brackets ('[' and ']'), as
specified by RFC 2732.
However, the literal IPv6 address format defined in RFC 2373: IP
Version 6 Addressing Architecture is also accepted.
Specifying a port number of -1
indicates that the URL should use the default port for the
protocol.
If this is the first URL object being created with the specified
protocol, a stream protocol handler object, an instance of
class URLStreamHandler , is created for that protocol:
- If the application has previously set up an instance of
URLStreamHandlerFactory as the stream handler factory,
then the createURLStreamHandler method of that instance
is called with the protocol string as an argument to create the
stream protocol handler.
- If no
URLStreamHandlerFactory has yet been set up,
or if the factory's createURLStreamHandler method
returns null , then the constructor finds the
value of the system property:
java.protocol.handler.pkgs
If the value of that system property is not null ,
it is interpreted as a list of packages separated by a vertical
bar character '|'. The constructor tries to load
the class named:
<package>.<protocol>.Handler
where <package> is replaced by the name of the package
and <protocol> is replaced by the name of the protocol.
If this class does not exist, or if the class exists but it is not
a subclass of URLStreamHandler , then the next package
in the list is tried.
- If the previous step fails to find a protocol handler, then the
constructor tries to load from a system default package.
<system default package>.<protocol>.Handler
If this class does not exist, or if the class exists but it is not a
subclass of URLStreamHandler , then a
MalformedURLException is thrown.
Protocol handlers for the following protocols are guaranteed
to exist on the search path:
http, https, ftp, file, and jar
Protocol handlers for additional protocols may also be
available.
No validation of the inputs is performed by this constructor.
this(protocol, host, port, file, null);
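The behaviour described for this constructor can be observed with a short sketch. The host example.com and the protocol name nosuchprotocol are placeholders chosen for illustration, and the sketch assumes no handler for that made-up protocol has been registered; FourArgConstructorDemo and its helpers are illustrative names.

```java
import java.net.MalformedURLException;
import java.net.URL;

@SuppressWarnings("deprecation") // URL constructors are deprecated in recent JDKs
class FourArgConstructorDemo {
    /** Builds a URL from its components, wrapping the checked exception. */
    static URL build(String protocol, String host, int port, String file) {
        try {
            return new URL(protocol, host, port, file);
        } catch (MalformedURLException e) {
            throw new IllegalArgumentException(e);
        }
    }

    /** Returns true if the constructor throws MalformedURLException. */
    static boolean isRejected(String protocol, String host, int port, String file) {
        try {
            new URL(protocol, host, port, file);
            return false;
        } catch (MalformedURLException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        // A bare IPv6 literal is wrapped in square brackets.
        System.out.println(build("http", "::1", 8080, "/index.html"));
        // A port of -1 means "use the protocol default".
        System.out.println(build("http", "example.com", -1, "/").getPort());
        // No handler can be found for a made-up protocol.
        System.out.println(isRejected("nosuchprotocol", "example.com", -1, "/"));
    }
}
```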
| public URL(String protocol, String host, String file)
Creates a URL from the specified protocol
name, host name, and file name. The
default port for the specified protocol is used.
This method is equivalent to calling the four-argument
constructor with the arguments being protocol,
host, -1, and file.
No validation of the inputs is performed by this constructor.
this(protocol, host, -1, file);
| public URL(String protocol, String host, int port, String file, URLStreamHandler handler)
Creates a URL object from the specified
protocol, host, port
number, file, and handler. Specifying
a port number of -1 indicates that
the URL should use the default port for the protocol. Specifying
a handler of null indicates that the URL
should use a default stream handler for the protocol, as outlined
for the four-argument constructor
URL(String protocol, String host, int port, String file).
If the handler is not null and there is a security manager,
the security manager's checkPermission
method is called with a
NetPermission("specifyStreamHandler") permission.
This may result in a SecurityException.
No validation of the inputs is performed by this constructor.
if (handler != null) {
SecurityManager sm = System.getSecurityManager();
if (sm != null) {
// check for permission to specify a handler
checkSpecifyHandler(sm);
}
}
protocol = protocol.toLowerCase();
this.protocol = protocol;
if (host != null) {
/**
* if host is a literal IPv6 address,
* we will make it conform to RFC 2732
*/
if (host != null && host.indexOf(':') >= 0
&& !host.startsWith("[")) {
host = "["+host+"]";
}
this.host = host;
if (port < -1) {
throw new MalformedURLException("Invalid port number :" +
port);
}
this.port = port;
authority = (port == -1) ? host : host + ":" + port;
}
Parts parts = new Parts(file);
path = parts.getPath();
query = parts.getQuery();
if (query != null) {
this.file = path + "?" + query;
} else {
this.file = path;
}
ref = parts.getRef();
// Note: we don't do validation of the URL here. Too risky to change
// right now, but worth considering for future reference. -br
if (handler == null &&
(handler = getURLStreamHandler(protocol)) == null) {
throw new MalformedURLException("unknown protocol: " + protocol);
}
this.handler = handler;
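Passing an explicit handler lets a URL use a protocol for which no handler is installed. The sketch below is a hypothetical example: the "demo" protocol, the DemoHandler class, the port 4242 and the host example.com are all invented for illustration. It also shows how the file argument is split into path, query and ref, as the constructor body above does.

```java
import java.io.IOException;
import java.net.MalformedURLException;
import java.net.URL;
import java.net.URLConnection;
import java.net.URLStreamHandler;

@SuppressWarnings("deprecation") // URL constructors are deprecated in recent JDKs
class CustomHandlerDemo {
    /** Hypothetical handler for a made-up "demo" protocol; it never connects. */
    static final class DemoHandler extends URLStreamHandler {
        @Override
        protected URLConnection openConnection(URL u) throws IOException {
            throw new IOException("the demo protocol has no transport");
        }

        @Override
        protected int getDefaultPort() {
            return 4242; // arbitrary value chosen for this sketch
        }
    }

    /** Builds a demo: URL with an explicit handler and no explicit port. */
    static URL demoUrl(String host, String file) {
        try {
            return new URL("demo", host, -1, file, new DemoHandler());
        } catch (MalformedURLException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        URL u = demoUrl("example.com", "/a?b=1#c");
        System.out.println(u.getPath() + " " + u.getQuery() + " " + u.getRef());
        System.out.println(u.getPort() + " " + u.getDefaultPort());
        System.out.println(u);
    }
}
```

If a security manager were installed, this call would additionally require the NetPermission("specifyStreamHandler") permission mentioned above.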
| public URL(String spec)
Creates a URL object from the String
representation.
This constructor is equivalent to a call to the two-argument
constructor with a null first argument.
this(null, spec);
| public URL(URL context, String spec)
Creates a URL by parsing the given spec within a specified context.
The new URL is created from the given context URL and the spec
argument as described in
RFC2396 "Uniform Resource Identifiers: Generic Syntax":
<scheme>://<authority><path>?<query>#<fragment>
The reference is parsed into the scheme, authority, path, query and
fragment parts. If the path component is empty and the scheme,
authority, and query components are undefined, then the new URL is a
reference to the current document. Otherwise, the fragment and query
parts present in the spec are used in the new URL.
If the scheme component is defined in the given spec and does not match
the scheme of the context, then the new URL is created as an absolute
URL based on the spec alone. Otherwise the scheme component is inherited
from the context URL.
If the authority component is present in the spec then the spec is
treated as absolute and the spec authority and path will replace the
context authority and path. If the authority component is absent in the
spec then the authority of the new URL will be inherited from the
context.
If the spec's path component begins with a slash character
"/" then the
path is treated as absolute and the spec path replaces the context path.
Otherwise, the path is treated as a relative path and is appended to the
context path, as described in RFC2396. Also, in this case,
the path is canonicalized through the removal of directory
changes made by occurrences of ".." and ".".
For a more detailed description of URL parsing, refer to RFC2396.
this(context, spec, null);
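The resolution rules above can be tried out with a few specs against one context. This is a sketch assuming the default http and ftp handlers; the host names other than java.sun.com are placeholders, and ContextResolutionDemo and resolve are illustrative names.

```java
import java.net.MalformedURLException;
import java.net.URL;

@SuppressWarnings("deprecation") // URL constructors are deprecated in recent JDKs
class ContextResolutionDemo {
    static final String CONTEXT = "http://java.sun.com/docs/guide/index.html";

    /** Resolves spec against CONTEXT and returns the external form. */
    static String resolve(String spec) {
        try {
            return new URL(new URL(CONTEXT), spec).toString();
        } catch (MalformedURLException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(resolve("/FAQ.html"));          // absolute path replaces the context path
        System.out.println(resolve("../api/index.html"));  // ".." is removed during canonicalization
        System.out.println(resolve("#chapter1"));          // empty path: reference to the current document
        System.out.println(resolve("//other.example.org/x")); // authority present: replaces host and path
        System.out.println(resolve("ftp://ftp.example.org/pub/readme.txt")); // different scheme: absolute
    }
}
```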
| public URL(URL context, String spec, URLStreamHandler handler)
Creates a URL by parsing the given spec with the specified handler
within a specified context. If the handler is null, the parsing
occurs as with the two argument constructor.
String original = spec;
int i, limit, c;
int start = 0;
String newProtocol = null;
boolean aRef=false;
boolean isRelative = false;
// Check for permission to specify a handler
if (handler != null) {
SecurityManager sm = System.getSecurityManager();
if (sm != null) {
checkSpecifyHandler(sm);
}
}
try {
limit = spec.length();
while ((limit > 0) && (spec.charAt(limit - 1) <= ' ')) {
limit--; //eliminate trailing whitespace
}
while ((start < limit) && (spec.charAt(start) <= ' ')) {
start++; // eliminate leading whitespace
}
if (spec.regionMatches(true, start, "url:", 0, 4)) {
start += 4;
}
if (start < spec.length() && spec.charAt(start) == '#') {
/* we're assuming this is a ref relative to the context URL.
* This means protocols cannot start w/ '#', but we must parse
* ref URL's like: "hello:there" w/ a ':' in them.
*/
aRef=true;
}
for (i = start ; !aRef && (i < limit) &&
((c = spec.charAt(i)) != '/') ; i++) {
if (c == ':') {
String s = spec.substring(start, i).toLowerCase();
if (isValidProtocol(s)) {
newProtocol = s;
start = i + 1;
}
break;
}
}
// Only use our context if the protocols match.
protocol = newProtocol;
if ((context != null) && ((newProtocol == null) ||
newProtocol.equalsIgnoreCase(context.protocol))) {
// inherit the protocol handler from the context
// if not specified to the constructor
if (handler == null) {
handler = context.handler;
}
// If the context is a hierarchical URL scheme and the spec
// contains a matching scheme then maintain backwards
// compatibility and treat it as if the spec didn't contain
// the scheme; see 5.2.3 of RFC2396
if (context.path != null && context.path.startsWith("/"))
newProtocol = null;
if (newProtocol == null) {
protocol = context.protocol;
authority = context.authority;
userInfo = context.userInfo;
host = context.host;
port = context.port;
file = context.file;
path = context.path;
isRelative = true;
}
}
if (protocol == null) {
throw new MalformedURLException("no protocol: "+original);
}
// Get the protocol handler if not specified or the protocol
// of the context could not be used
if (handler == null &&
(handler = getURLStreamHandler(protocol)) == null) {
throw new MalformedURLException("unknown protocol: "+protocol);
}
this.handler = handler;
i = spec.indexOf('#', start);
if (i >= 0) {
ref = spec.substring(i + 1, limit);
limit = i;
}
/*
* Handle special case inheritance of query and fragment
* implied by RFC2396 section 5.2.2.
*/
if (isRelative && start == limit) {
query = context.query;
if (ref == null) {
ref = context.ref;
}
}
handler.parseURL(this, spec, start, limit);
} catch(MalformedURLException e) {
throw e;
} catch(Exception e) {
throw new MalformedURLException(e.getMessage());
}
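Three details of the constructor body above are easy to observe from the outside: surrounding whitespace is trimmed, a leading "url:" prefix is skipped regardless of case, and a spec without a protocol and without a context is rejected. The sketch below assumes the default http handler; example.com is a placeholder host and SpecParsingDemo is an illustrative name.

```java
import java.net.MalformedURLException;
import java.net.URL;

@SuppressWarnings("deprecation") // URL constructors are deprecated in recent JDKs
class SpecParsingDemo {
    /** Parses spec and returns its external form, or "error: <message>" on failure. */
    static String parse(String spec) {
        try {
            return new URL(spec).toString();
        } catch (MalformedURLException e) {
            return "error: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(parse("  http://example.com/a  ")); // whitespace is trimmed
        System.out.println(parse("URL:http://example.com/a")); // "url:" prefix is skipped
        System.out.println(parse("FAQ.html"));                 // no protocol and no context
    }
}
```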
|
Methods Summary |
---|
private void | checkSpecifyHandler(java.lang.SecurityManager sm)
sm.checkPermission(SecurityConstants.SPECIFY_HANDLER_PERMISSION);
| public boolean | equals(java.lang.Object obj)
Compares this URL for equality with another object.
If the given object is not a URL then this method immediately returns
false .
Two URL objects are equal if they have the same protocol, reference
equivalent hosts, have the same port number on the host, and the same
file and fragment of the file.
Two hosts are considered equivalent if both host names can be resolved
into the same IP addresses; otherwise, if either host name cannot be
resolved, the host names must be equal without regard to case, or both
host names must be null.
Since hosts comparison requires name resolution, this operation is a
blocking operation.
Note: The defined behavior for equals is known to
be inconsistent with virtual hosting in HTTP.
if (!(obj instanceof URL))
return false;
URL u2 = (URL)obj;
return handler.equals(this, u2);
| public java.lang.String | getAuthority()
Gets the authority part of this URL.
return authority;
| public final java.lang.Object | getContent()
Gets the contents of this URL. This method is a shorthand for:
openConnection().getContent()
return openConnection().getContent();
| public final java.lang.Object | getContent(java.lang.Class[] classes)
Gets the contents of this URL. This method is a shorthand for:
openConnection().getContent(Class[])
return openConnection().getContent(classes);
| public int | getDefaultPort()
Gets the default port number of the protocol associated
with this URL. If the URL scheme or the URLStreamHandler
for the URL do not define a default port number,
then -1 is returned.
return handler.getDefaultPort();
| public java.lang.String | getFile()
Gets the file name of this URL.
The returned file portion is the same as getPath(), followed by
"?" and the value of getQuery(), if any. If there is
no query portion, this method and getPath() will
return identical results.
return file;
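The relationship between getFile(), getPath() and getQuery() can be checked with a short sketch. The URL below is a placeholder, and FileVsPathDemo and parse are illustrative names.

```java
import java.net.MalformedURLException;
import java.net.URL;

@SuppressWarnings("deprecation") // URL constructors are deprecated in recent JDKs
class FileVsPathDemo {
    /** Parses a URL string, wrapping the checked exception. */
    static URL parse(String spec) {
        try {
            return new URL(spec);
        } catch (MalformedURLException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        URL withQuery = parse("http://example.com/docs/search?q=url#top");
        System.out.println(withQuery.getPath());  // path only
        System.out.println(withQuery.getQuery()); // query only
        System.out.println(withQuery.getFile());  // path, "?", query; the fragment is excluded

        URL withoutQuery = parse("http://example.com/docs/index.html");
        // With no query, getFile() and getPath() agree.
        System.out.println(withoutQuery.getFile().equals(withoutQuery.getPath()));
    }
}
```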
| public java.lang.String | getHost()
Gets the host name of this URL, if applicable.
The format of the host conforms to RFC 2732, i.e. for a
literal IPv6 address, this method will return the IPv6 address
enclosed in square brackets ('[' and ']').
return host;
| public java.lang.String | getPath()
Gets the path part of this URL.
return path;
| public int | getPort()
Gets the port number of this URL.
return port;
| public java.lang.String | getProtocol()
Gets the protocol name of this URL.
return protocol;
| public java.lang.String | getQuery()
Gets the query part of this URL.
return query;
| public java.lang.String | getRef()
Gets the anchor (also known as the "reference") of this
URL.
return ref;
| static java.net.URLStreamHandler | getURLStreamHandler(java.lang.String protocol)
Returns the Stream Handler.
URLStreamHandler handler = (URLStreamHandler)handlers.get(protocol);
if (handler == null) {
boolean checkedWithFactory = false;
// Use the factory (if any)
if (factory != null) {
handler = factory.createURLStreamHandler(protocol);
checkedWithFactory = true;
}
// Try java protocol handler
if (handler == null) {
String packagePrefixList = null;
packagePrefixList
= (String) java.security.AccessController.doPrivileged(
new sun.security.action.GetPropertyAction(
protocolPathProp,""));
if (packagePrefixList != "") {
packagePrefixList += "|";
}
// REMIND: decide whether to allow the "null" class prefix
// or not.
packagePrefixList += "sun.net.www.protocol";
StringTokenizer packagePrefixIter =
new StringTokenizer(packagePrefixList, "|");
while (handler == null &&
packagePrefixIter.hasMoreTokens()) {
String packagePrefix =
packagePrefixIter.nextToken().trim();
try {
String clsName = packagePrefix + "." + protocol +
".Handler";
Class cls = null;
try {
cls = Class.forName(clsName);
} catch (ClassNotFoundException e) {
ClassLoader cl = ClassLoader.getSystemClassLoader();
if (cl != null) {
cls = cl.loadClass(clsName);
}
}
if (cls != null) {
handler =
(URLStreamHandler)cls.newInstance();
}
} catch (Exception e) {
// any number of exceptions can get thrown here
}
}
}
synchronized (streamHandlerLock) {
URLStreamHandler handler2 = null;
// Check again with hashtable just in case another
// thread created a handler since we last checked
handler2 = (URLStreamHandler)handlers.get(protocol);
if (handler2 != null) {
return handler2;
}
// Check with factory if another thread set a
// factory since our last check
if (!checkedWithFactory && factory != null) {
handler2 = factory.createURLStreamHandler(protocol);
}
if (handler2 != null) {
// The handler from the factory must be given more
// importance. Discard the default handler that
// this thread created.
handler = handler2;
}
// Insert this handler into the hashtable
if (handler != null) {
handlers.put(protocol, handler);
}
}
}
return handler;
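The order in which the method above tries handler classes can be made visible with a small standalone sketch. It only builds the candidate class names and does not load anything; HandlerSearchSketch, candidateClassNames and the package names passed in are hypothetical, and the property value is supplied as a parameter rather than read from java.protocol.handler.pkgs.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

class HandlerSearchSketch {
    /** Default package prefix named in the listing above. */
    static final String DEFAULT_PREFIX = "sun.net.www.protocol";

    /**
     * Returns the class names that would be tried, in order, for the given
     * property value (may be null) and protocol.
     */
    static List<String> candidateClassNames(String pkgsProperty, String protocol) {
        String prefixList = (pkgsProperty == null) ? "" : pkgsProperty;
        if (!prefixList.isEmpty()) {
            prefixList += "|";
        }
        prefixList += DEFAULT_PREFIX; // the default prefix is always searched last

        List<String> names = new ArrayList<>();
        StringTokenizer prefixes = new StringTokenizer(prefixList, "|");
        while (prefixes.hasMoreTokens()) {
            String prefix = prefixes.nextToken().trim();
            names.add(prefix + "." + protocol + ".Handler");
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println(candidateClassNames("com.example.handlers | org.example", "http"));
        System.out.println(candidateClassNames(null, "ftp"));
    }
}
```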
| public java.lang.String | getUserInfo()
Gets the userInfo part of this URL.
return userInfo;
| public synchronized int | hashCode()
Creates an integer suitable for hash table indexing.
The hash code is based upon all the URL components relevant for URL
comparison. As such, this operation is a blocking operation.
if (hashCode != -1)
return hashCode;
hashCode = handler.hashCode(this);
return hashCode;
| private boolean | isValidProtocol(java.lang.String protocol)
int len = protocol.length();
if (len < 1)
return false;
char c = protocol.charAt(0);
if (!Character.isLetter(c))
return false;
for (int i = 1; i < len; i++) {
c = protocol.charAt(i);
if (!Character.isLetterOrDigit(c) && c != '.' && c != '+' &&
c != '-') {
return false;
}
}
return true;
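Because isValidProtocol is private, its rule can only be exercised through a copy. The sketch below restates the same check in a standalone class (ProtocolNameCheck is an illustrative name) so the accepted and rejected names are easy to see: a letter first, then letters, digits, '.', '+' or '-'.

```java
class ProtocolNameCheck {
    /** Standalone restatement of the protocol-name rule shown above. */
    static boolean isValidProtocol(String protocol) {
        int len = protocol.length();
        if (len < 1) {
            return false; // empty names are rejected
        }
        if (!Character.isLetter(protocol.charAt(0))) {
            return false; // the first character must be a letter
        }
        for (int i = 1; i < len; i++) {
            char c = protocol.charAt(i);
            if (!Character.isLetterOrDigit(c) && c != '.' && c != '+' && c != '-') {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        for (String name : new String[] {"http", "svn+ssh", "a.b-c", "1http", "my_proto", ""}) {
            System.out.println("\"" + name + "\" -> " + isValidProtocol(name));
        }
    }
}
```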
| public java.net.URLConnection | openConnection()
Returns a URLConnection object that represents a
connection to the remote object referred to by the URL.
A new connection is opened every time by calling the
openConnection method of the protocol handler for
this URL.
If for the URL's protocol (such as HTTP or JAR), there
exists a public, specialized URLConnection subclass belonging
to one of the following packages or one of their subpackages:
java.lang, java.io, java.util, java.net, the connection
returned will be of that subclass. For example, for HTTP an
HttpURLConnection will be returned, and for JAR a
JarURLConnection will be returned.
return handler.openConnection(this);
| public java.net.URLConnection | openConnection(java.net.Proxy proxy)
Same as openConnection(), except that the connection will be
made through the specified proxy. Protocol handlers that do not
support proxying will ignore the proxy parameter and make a
normal connection.
Calling this method preempts the system's default ProxySelector
settings.
if (proxy == null) {
throw new IllegalArgumentException("proxy can not be null");
}
SecurityManager sm = System.getSecurityManager();
if (proxy.type() != Proxy.Type.DIRECT && sm != null) {
InetSocketAddress epoint = (InetSocketAddress) proxy.address();
if (epoint.isUnresolved())
sm.checkConnect(epoint.getHostName(), epoint.getPort());
else
sm.checkConnect(epoint.getAddress().getHostAddress(),
epoint.getPort());
}
return handler.openConnection(this, proxy);
| public final java.io.InputStream | openStream()
Opens a connection to this URL and returns an
InputStream for reading from that connection. This
method is a shorthand for:
openConnection().getInputStream()
return openConnection().getInputStream();
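openStream() can be tried without network access by pointing a file: URL at a temporary file. This is a sketch that assumes the process may create files in the default temporary directory; OpenStreamDemo and roundTrip are illustrative names.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

class OpenStreamDemo {
    /** Writes text to a temporary file and reads it back through URL.openStream(). */
    static String roundTrip(String text) {
        try {
            Path tmp = Files.createTempFile("url-demo", ".txt");
            try {
                Files.write(tmp, text.getBytes(StandardCharsets.UTF_8));
                URL url = tmp.toUri().toURL(); // a file: URL for the temporary file
                try (InputStream in = url.openStream()) {
                    ByteArrayOutputStream out = new ByteArrayOutputStream();
                    byte[] buf = new byte[4096];
                    int n;
                    while ((n = in.read(buf)) != -1) {
                        out.write(buf, 0, n);
                    }
                    return new String(out.toByteArray(), StandardCharsets.UTF_8);
                }
            } finally {
                Files.deleteIfExists(tmp);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip("hello URL"));
    }
}
```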
| private synchronized void | readObject(java.io.ObjectInputStream s)
readObject is called to restore the state of the URL from the
stream. It reads the components of the URL and finds the local
stream handler.
s.defaultReadObject(); // read the fields
if ((handler = getURLStreamHandler(protocol)) == null) {
throw new IOException("unknown protocol: " + protocol);
}
// Construct authority part
if (authority == null &&
((host != null && host.length() > 0) || port != -1)) {
if (host == null)
host = "";
authority = (port == -1) ? host : host + ":" + port;
// Handle hosts with userInfo in them
int at = host.lastIndexOf('@');
if (at != -1) {
userInfo = host.substring(0, at);
host = host.substring(at+1);
}
} else if (authority != null) {
// Construct user info part
int ind = authority.indexOf('@');
if (ind != -1)
userInfo = authority.substring(0, ind);
}
// Construct path and query part
path = null;
query = null;
if (file != null) {
// Fix: only do this if hierarchical?
int q = file.lastIndexOf('?');
if (q != -1) {
query = file.substring(q+1);
path = file.substring(0, q);
} else
path = file;
}
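The effect of restoring the transient path and query fields can be seen by serializing a URL to a byte array and reading it back. The sketch below assumes the default http handler is available in the reading JVM; the URL is a placeholder and UrlSerializationDemo is an illustrative name. It compares components rather than calling equals, which may resolve host names.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.net.MalformedURLException;
import java.net.URL;

@SuppressWarnings("deprecation") // URL constructors are deprecated in recent JDKs
class UrlSerializationDemo {
    /** Parses a URL string, wrapping the checked exception. */
    static URL parse(String spec) {
        try {
            return new URL(spec);
        } catch (MalformedURLException e) {
            throw new IllegalArgumentException(e);
        }
    }

    /** Serializes the URL in memory and deserializes a copy. */
    static URL roundTrip(URL original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(original);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (URL) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        URL copy = roundTrip(parse("http://example.com/a/b?x=1#frag"));
        // path and query are transient, yet available again after deserialization
        System.out.println(copy.getPath() + " " + copy.getQuery() + " " + copy.getRef());
    }
}
```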
| public boolean | sameFile(java.net.URL other)
Compares two URLs, excluding the fragment component.
Returns true if this URL and the
other argument are equal without taking the
fragment component into consideration.
return handler.sameFile(this, other);
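The difference between sameFile and equals comes down to the fragment. The sketch below uses file: URLs with an empty host on the assumption that comparing them does not trigger a host name lookup; the path /tmp/report.txt is a placeholder and is never opened. SameFileDemo and parse are illustrative names.

```java
import java.net.MalformedURLException;
import java.net.URL;

@SuppressWarnings("deprecation") // URL constructors are deprecated in recent JDKs
class SameFileDemo {
    /** Parses a URL string, wrapping the checked exception. */
    static URL parse(String spec) {
        try {
            return new URL(spec);
        } catch (MalformedURLException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        URL intro = parse("file:/tmp/report.txt#intro");
        URL summary = parse("file:/tmp/report.txt#summary");

        // Same protocol, host, port and file; only the fragment differs.
        System.out.println(intro.sameFile(summary)); // fragment ignored
        System.out.println(intro.equals(summary));   // fragment compared
    }
}
```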
| protected void | set(java.lang.String protocol, java.lang.String host, int port, java.lang.String authority, java.lang.String userInfo, java.lang.String path, java.lang.String query, java.lang.String ref)
Sets the specified 8 fields of the URL. This is not a public method so
that only URLStreamHandlers can modify URL fields. URLs are otherwise
constant.
synchronized (this) {
this.protocol = protocol;
this.host = host;
this.port = port;
this.file = query == null ? path : path + "?" + query;
this.userInfo = userInfo;
this.path = path;
this.ref = ref;
/* This is very important. We must recompute this after the
* URL has been changed. */
hashCode = -1;
hostAddress = null;
this.query = query;
this.authority = authority;
}
| protected void | set(java.lang.String protocol, java.lang.String host, int port, java.lang.String file, java.lang.String ref)
Sets the fields of the URL. This is not a public method so that
only URLStreamHandlers can modify URL fields. URLs are
otherwise constant.
synchronized (this) {
this.protocol = protocol;
this.host = host;
authority = port == -1 ? host : host + ":" + port;
this.port = port;
this.file = file;
this.ref = ref;
/* This is very important. We must recompute this after the
* URL has been changed. */
hashCode = -1;
hostAddress = null;
int q = file.lastIndexOf('?');
if (q != -1) {
query = file.substring(q+1);
path = file.substring(0, q);
} else
path = file;
}
| public static void | setURLStreamHandlerFactory(java.net.URLStreamHandlerFactory fac)
Sets an application's URLStreamHandlerFactory.
This method can be called at most once in a given Java Virtual
Machine.
The URLStreamHandlerFactory instance is used to
construct a stream protocol handler from a protocol name.
If there is a security manager, this method first calls
the security manager's checkSetFactory method
to ensure the operation is allowed.
This could result in a SecurityException.
synchronized (streamHandlerLock) {
if (factory != null) {
throw new Error("factory already defined");
}
SecurityManager security = System.getSecurityManager();
if (security != null) {
security.checkSetFactory();
}
handlers.clear();
factory = fac;
}
| public java.lang.String | toExternalForm()
Constructs a string representation of this URL. The
string is created by calling the toExternalForm
method of the stream protocol handler for this object.
return handler.toExternalForm(this);
| public java.lang.String | toString()
Constructs a string representation of this URL. The
string is created by calling the toExternalForm
method of the stream protocol handler for this object.
return toExternalForm();
| public java.net.URI | toURI()
Returns a java.net.URI equivalent to this URL.
This method functions in the same way as new URI(this.toString()).
Note, any URL instance that complies with RFC 2396 can be converted
to a URI. However, some URLs that are not strictly in compliance
cannot be converted to a URI.
return new URI (toString());
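A URL with an unescaped space is one case of a URL that is accepted by the URL class but cannot be converted. The sketch below checks both forms of the foo.com example; ToUriDemo and convertsToUri are illustrative names.

```java
import java.net.MalformedURLException;
import java.net.URISyntaxException;
import java.net.URL;

@SuppressWarnings("deprecation") // URL constructors are deprecated in recent JDKs
class ToUriDemo {
    /** Returns true if the URL parses and can also be converted to a URI. */
    static boolean convertsToUri(String spec) {
        try {
            new URL(spec).toURI();
            return true;
        } catch (URISyntaxException e) {
            return false; // accepted by URL, rejected by URI
        } catch (MalformedURLException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(convertsToUri("http://foo.com/hello%20world"));
        System.out.println(convertsToUri("http://foo.com/hello world/"));
    }
}
```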
| private synchronized void | writeObject(java.io.ObjectOutputStream s)
writeObject is called to save the state of the URL to an
ObjectOutputStream. The handler is not saved since it is
specific to this system.
s.defaultWriteObject(); // write the fields
|
|