CGI-invoking servlet for web applications, used to execute scripts which
comply with the Common Gateway Interface (CGI) specification and are named
in the path-info used to invoke this servlet.
Note: This code compiles and even works for simple CGI cases.
Exhaustive testing has not been done. Please consider it beta
quality. Feedback to the author (see below) is appreciated.
Example:
If an instance of this servlet were mapped (using
<web-app>/WEB-INF/web.xml) to:
<web-app>/cgi-bin/*
then the following request:
http://localhost:8080/<web-app>/cgi-bin/dir1/script/pathinfo1
would result in the execution of the script
<web-app-root>/WEB-INF/cgi/dir1/script
with the script's PATH_INFO set to /pathinfo1.
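The mapping in the example above can be sketched as a walk over the path segments. This is only an illustration, not this servlet's actual code: the class CgiPathSplitter, the method split, and the isScript predicate are hypothetical names, and the predicate stands in for a "does this file exist under the CGI directory" check.

```java
import java.util.StringTokenizer;
import java.util.function.Predicate;

/** Hypothetical helper: splits a request path-info into script path and PATH_INFO. */
final class CgiPathSplitter {

    /**
     * Walks the segments of requestPathInfo until isScript accepts the
     * accumulated path. Returns {scriptPath, pathInfo}, or null if no
     * prefix names a script or a "." / ".." segment is met before the
     * script is found.
     */
    static String[] split(String requestPathInfo, Predicate<String> isScript) {
        StringTokenizer segments = new StringTokenizer(requestPathInfo, "/");
        StringBuilder scriptPath = new StringBuilder();
        while (segments.hasMoreTokens()) {
            String segment = segments.nextToken();
            if (segment.equals(".") || segment.equals("..")) {
                return null; // refuse relative segments outright
            }
            scriptPath.append('/').append(segment);
            if (isScript.test(scriptPath.toString())) {
                // Everything after the script name becomes PATH_INFO.
                StringBuilder pathInfo = new StringBuilder();
                while (segments.hasMoreTokens()) {
                    pathInfo.append('/').append(segments.nextToken());
                }
                return new String[] { scriptPath.toString(), pathInfo.toString() };
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // Pretend that only /dir1/script exists under the CGI directory.
        String[] parts = split("/dir1/script/pathinfo1", "/dir1/script"::equals);
        System.out.println("script=" + parts[0] + " PATH_INFO=" + parts[1]);
    }
}
```

In a real deployment the predicate would more likely be something like p -> new File(cgiRoot, p).isFile(), where cgiRoot (also a hypothetical name) is the directory the search starts from.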
Recommendation: House all your CGI scripts under
<web-app>/WEB-INF/cgi. This ensures that you do not
accidentally expose your CGI scripts' code to the outside world and that
your CGIs are cleanly ensconced underneath the WEB-INF (i.e.,
non-content) area.
The default CGI location is mentioned above. You have the flexibility to
put CGIs wherever you want, however:
The CGI search path will start at
webAppRootDir + File.separator + cgiPathPrefix
(or webAppRootDir alone if cgiPathPrefix is
null).
cgiPathPrefix is defined by setting
this servlet's cgiPathPrefix init parameter.
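The search-path rule above amounts to a single conditional. The following is a minimal sketch of that rule; the class name CgiSearchRoot and the method searchRoot are hypothetical and not part of this servlet.

```java
import java.io.File;

/** Hypothetical helper illustrating where the CGI search path starts. */
final class CgiSearchRoot {

    /**
     * Returns webAppRootDir + File.separator + cgiPathPrefix, or
     * webAppRootDir alone when cgiPathPrefix is null.
     */
    static String searchRoot(String webAppRootDir, String cgiPathPrefix) {
        if (cgiPathPrefix == null) {
            return webAppRootDir;
        }
        return webAppRootDir + File.separator + cgiPathPrefix;
    }

    public static void main(String[] args) {
        // The directory names below are made up for the example.
        System.out.println(searchRoot("/opt/webapps/demo", "WEB-INF/cgi"));
        System.out.println(searchRoot("/opt/webapps/demo", null));
    }
}
```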
CGI Specification: derived from
http://cgi-spec.golux.com,
a work-in-progress and expired Internet Draft. Note that, at the time of writing,
no actual RFC describing the CGI specification existed. Where the behavior of
this servlet differs
from the specification cited above, it is either documented here, a bug,
or an instance where the specification cited differs from Best
Community Practice (BCP).
Such instances should be well-documented here. Please email the
Jakarta Tomcat group [tomcat-dev@jakarta.apache.org]
with amendments.
Canonical metavariables:
The CGI specification defines the following canonical metavariables:
[excerpt from CGI specification]
AUTH_TYPE
CONTENT_LENGTH
CONTENT_TYPE
GATEWAY_INTERFACE
PATH_INFO
PATH_TRANSLATED
QUERY_STRING
REMOTE_ADDR
REMOTE_HOST
REMOTE_IDENT
REMOTE_USER
REQUEST_METHOD
SCRIPT_NAME
SERVER_NAME
SERVER_PORT
SERVER_PROTOCOL
SERVER_SOFTWARE
Metavariables with names beginning with the protocol name (e.g.,
"HTTP_ACCEPT") are also canonical in their description of request header
fields. The number and meaning of these fields may change independently
of this specification. (See also section 6.1.5 [of the CGI specification].)
[end excerpt]
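As an illustration of the protocol-prefixed metavariables mentioned in the excerpt, the sketch below derives a metavariable name from an HTTP request header name. It assumes the common CGI convention of upper-casing the header name and replacing hyphens with underscores; the class and method names are hypothetical, and this is not necessarily how this servlet does it.

```java
import java.util.Locale;

/** Hypothetical helper: maps an HTTP header name to a CGI metavariable name. */
final class HeaderMetaVariables {

    /** For example, "Accept" becomes "HTTP_ACCEPT". */
    static String toMetaVariable(String headerName) {
        // Locale.ROOT avoids locale-specific upper-casing surprises.
        return "HTTP_" + headerName.toUpperCase(Locale.ROOT).replace('-', '_');
    }

    public static void main(String[] args) {
        System.out.println(toMetaVariable("Accept"));
        System.out.println(toMetaVariable("User-Agent"));
    }
}
```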
Implementation notes
Standard input handling: If your script accepts standard input,
then the client must start sending input within a certain timeout period;
otherwise the servlet will assume no input is coming and carry on running
the script. The script's standard input will be closed, and handling of
any further input from the client is undefined. Most likely it will be
ignored. If this behavior becomes undesirable, then this servlet needs
to be enhanced to handle threading of the spawned process's stdin, stdout,
and stderr (which should not be too hard).
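The timeout behavior described above might be pictured roughly as follows. This is a sketch under assumptions, not the servlet's actual implementation: the class StdinWaitSketch, the method inputArrivedWithin, the polling interval, and the use of InputStream.available() as the readiness check are all illustrative choices. Note that available() may report 0 for some streams even when data would eventually arrive.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

/** Hypothetical sketch: wait a bounded time for the client to start sending input. */
final class StdinWaitSketch {

    /**
     * Polls the stream until at least one byte is available or the
     * timeout elapses. Returns true if input arrived in time; a caller
     * would close the script's standard input when this returns false.
     */
    static boolean inputArrivedWithin(InputStream in, long timeoutMillis) {
        long start = System.nanoTime();
        long timeoutNanos = timeoutMillis * 1_000_000L;
        try {
            while (in.available() <= 0) {
                if (System.nanoTime() - start >= timeoutNanos) {
                    return false; // assume no input is coming
                }
                Thread.sleep(10); // assumed polling interval
            }
            return true;
        } catch (IOException e) {
            return false; // treat a broken stream as "no input"
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        InputStream withData = new ByteArrayInputStream(new byte[] { 1, 2, 3 });
        InputStream empty = new ByteArrayInputStream(new byte[0]);
        System.out.println(inputArrivedWithin(withData, 100));
        System.out.println(inputArrivedWithin(empty, 100));
    }
}
```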
If you find your cgi scripts are timing out receiving input, you can set
the init parameter of your webapps' cgi-handling servlet
to be
Metavariable Values: According to the CGI specification,
implementations may choose to represent null or missing values in an
implementation-specific manner, but must define that manner. This
implementation chooses to always define all required metavariables, but
sets the value to "" for all metavariables whose value is either null or
undefined. PATH_TRANSLATED is the sole exception to this rule, as per the
CGI Specification.
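The rule above could be sketched as a small helper that populates the script's environment. The class and method names are hypothetical, and the treatment of PATH_TRANSLATED (leaving it undefined when there is no value) is an assumption about what "exception" means here, not something the text above spells out.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch of the null-to-empty-string rule for metavariables. */
final class MetaVariableDefaults {

    /**
     * Stores name=value in env, substituting "" for a null value.
     * PATH_TRANSLATED is assumed to be left undefined when it is null.
     */
    static void put(Map<String, String> env, String name, String value) {
        if (value == null) {
            if (name.equals("PATH_TRANSLATED")) {
                return; // assumed: the one variable that is not defaulted to ""
            }
            value = "";
        }
        env.put(name, value);
    }

    public static void main(String[] args) {
        Map<String, String> env = new LinkedHashMap<>();
        put(env, "REQUEST_METHOD", "GET");
        put(env, "REMOTE_USER", null);     // stored as ""
        put(env, "PATH_TRANSLATED", null); // not stored at all
        System.out.println(env);
    }
}
```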
NPH -- Non-parsed-header implementation: This implementation does
not support the CGI NPH concept, whereby the server ensures that the data
supplied to the script are precisely as supplied by the client and
unaltered by the server.
A servlet container (including Tomcat) is specifically
designed to parse and possibly alter CGI-specific variables, and as
such makes NPH functionality difficult to support.
The CGI specification states that compliant servers MAY support NPH output.
It does not state servers MUST support NPH output to be unconditionally
compliant. Thus, this implementation maintains unconditional compliance
with the specification though NPH support is not present.
The CGI specification is located at
http://cgi-spec.golux.com.
TODO:
- Support for setting headers (for example, Location headers don't work)
- Support for collapsing multiple header lines (per RFC 2616)
- Ensure handling of POST method does not interfere with 2.3 Filters
- Refactor some debug code out of core
- Ensure header handling preserves encoding
- Possibly rewrite CGIRunner.run()?
- Possibly refactor CGIRunner and CGIEnvironment as non-inner classes?
- Document handling of cgi stdin when there is no stdin
- Revisit IOException handling in CGIRunner.run()
- Better documentation
- Confirm use of ServletInputStream.available() in CGIRunner.run() is
not needed
- Make checking for "." and ".." in servlet & cgi PATH_INFO less
draconian
- [add more to this TODO list]