This first chapter answers the question "What is a Servlet?", shows typical
uses for Servlets, compares Servlets to CGI programs and explains the basics
of the Servlet architecture and the Servlet lifecycle. It also gives a quick
introduction to HTTP and its implementation in the HttpServlet class.
Servlets are modules of Java code that run in a server application (hence the
name "Servlets", similar to "Applets" on the client side) to answer client
requests. Servlets are not tied to a specific client-server protocol, but they
are most commonly used with HTTP, and the word "Servlet" is often used to
mean "HTTP Servlet".
Servlets make use of the Java standard extension classes in the packages javax.servlet
(the basic Servlet framework) and javax.servlet.http (extensions of the Servlet framework
for Servlets that answer HTTP requests). Since Servlets are written in the
highly portable Java language and follow a standard framework, they provide a
means to create sophisticated server extensions that are independent of
both the server and the operating system.
Typical uses for HTTP Servlets include:
- Processing and/or storing data submitted by an HTML form.
- Providing dynamic content, e.g. returning the results of a database query
to the client.
- Managing state information on top of the stateless HTTP, e.g. for an
online shopping cart system which manages shopping carts for many
concurrent customers and maps every request to the right customer.
The traditional way of adding functionality to a Web Server is the Common
Gateway Interface (CGI), a language-independent interface that allows a
server to start an
external process which gets information about a request through environment
variables, the command line and its standard input stream and writes response
data to its standard output stream. Each request is answered in a separate
process by a separate instance of the CGI program, or CGI script (as it is
often called because CGI programs are usually written in interpreted languages
like Perl).
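The CGI model described above can be sketched as a small program. This is only an illustration, assuming the server passes the standard CGI variables REQUEST_METHOD and QUERY_STRING in the environment; the class name HelloCgi and its respond helper are hypothetical.

```java
import java.util.Map;

// Hypothetical CGI program: the Web Server starts a new process (here a new
// JVM) for every single request and reads the response from its stdout.
class HelloCgi {

    // Builds the complete CGI response (header line, blank line, body)
    // from the environment variables the server passes to the process.
    static String respond(Map<String, String> env) {
        String method = env.getOrDefault("REQUEST_METHOD", "GET");
        String query = env.getOrDefault("QUERY_STRING", "");
        return "Content-Type: text/plain\r\n"
             + "\r\n"
             + "method=" + method + "\n"
             + "query=" + query + "\n";
    }

    public static void main(String[] args) {
        // Request data arrives via the environment; the response goes to stdout.
        System.out.print(respond(System.getenv()));
    }
}
```

Starting a fresh process, and in this case a whole Java runtime, for each request is exactly the overhead that Servlets avoid.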
Servlets have several advantages over CGI:
- A Servlet does not run in a separate process. This removes the overhead of
creating a new process for each request.
- A Servlet stays in memory between requests. A CGI program (and probably
also an extensive runtime system or interpreter) needs to be loaded and
started for each CGI request.
- There is only a single instance which answers all requests concurrently.
This saves memory and allows a Servlet to easily manage persistent data.
- A Servlet can be run by a Servlet Engine in a restrictive Sandbox
(just like an Applet runs in a Web Browser's Sandbox) which allows secure
use of untrusted and potentially harmful Servlets.
A Servlet, in its most general form, is an instance of a class which
implements the javax.servlet.Servlet interface. Most Servlets, however, extend one of the
standard implementations of that interface, namely javax.servlet.GenericServlet and javax.servlet.http.HttpServlet. In this
tutorial we'll be discussing only HTTP Servlets which extend the javax.servlet.http.HttpServlet class.
In order to initialize a Servlet, a server application loads the Servlet class
(and possibly other classes which are referenced by the Servlet) and creates
an instance by calling the no-args constructor. Then it calls the Servlet's
init(ServletConfig config) method. The Servlet should perform one-time setup procedures in this
method and store the ServletConfig object so that it can be retrieved later by
calling the Servlet's getServletConfig() method. This is handled by GenericServlet. Servlets which
extend GenericServlet (or its subclass HttpServlet) should call super.init(config) at the beginning of the
init method to make use of this feature. The ServletConfig object contains Servlet
parameters and a reference to the Servlet's ServletContext. The init method is
guaranteed to be called only once during the Servlet's lifecycle. It does not
need to be thread-safe because the service method will not be called until the
call to init returns.
Once the Servlet is initialized, its service(ServletRequest req,
ServletResponse res) method is called for every request
to the Servlet. The method is called concurrently (i.e. multiple threads may
call this method at the same time), so it should be implemented in a
thread-safe manner. For the cases where a thread-safe implementation is not
possible, techniques for ensuring that the service method is not called
concurrently are described in section 4.1.
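To illustrate what thread-safe means here, the following sketch uses plain Java rather than the Servlet API. The class HitCounter is a hypothetical stand-in for a Servlet: a single instance whose service method is called from several threads at once, with its shared state kept in an atomic counter.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Plain-Java stand-in for a Servlet: one instance, many request threads.
class HitCounter {
    // Shared state lives in an instance field, so access must be thread-safe.
    private final AtomicInteger hits = new AtomicInteger();

    // Plays the role of service(): may run on several threads at once.
    int service() {
        return hits.incrementAndGet(); // atomic read-modify-write
    }

    int total() {
        return hits.get();
    }

    // Sends `requests` simulated requests through a pool of `threads` threads
    // to a single shared instance and returns the final counter value.
    static int simulate(int threads, int requests) {
        HitCounter servlet = new HitCounter();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int i = 0; i < requests; i++) {
            pool.execute(servlet::service);
        }
        pool.shutdown();
        try {
            pool.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return servlet.total();
    }

    public static void main(String[] args) {
        System.out.println(simulate(8, 10_000));
    }
}
```

With a plain int field and hits++ instead of the atomic counter, concurrent updates could be lost and the total could come out too low.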
When the Servlet needs to be unloaded (e.g. because a new version should be
loaded or the server is shutting down) the destroy() method is called. There may
still be threads that execute the service method when destroy is called, so destroy
has to be thread-safe. All resources which were allocated in init should be
released in destroy. This method is guaranteed to be called only once during the
Servlet's lifecycle.
[Figure: A typical Servlet lifecycle]
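The lifecycle described above can be modelled in a few lines of plain Java. This is a simplified sketch, not the real API: MiniServlet is a hypothetical stand-in for javax.servlet.Servlet, a String takes the place of ServletConfig, and LifecycleDemo.run plays the role of the server.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified, hypothetical stand-in for javax.servlet.Servlet.
interface MiniServlet {
    void init(String config);        // called once, before any request
    String service(String request);  // called for every request
    void destroy();                  // called once, when unloading
}

class LifecycleDemo {
    // Hypothetical Servlet that records the lifecycle calls it receives.
    static class EchoServlet implements MiniServlet {
        final List<String> events = new ArrayList<>();
        private String greeting; // set up in init, released in destroy

        public void init(String config) {
            greeting = config;
            events.add("init");
        }

        public String service(String request) {
            events.add("service");
            return greeting + ", " + request;
        }

        public void destroy() {
            greeting = null;
            events.add("destroy");
        }
    }

    // Plays the server's role: init once, service per request, destroy once.
    // Requests are handled sequentially here, unlike in a real server.
    static List<String> run(String... requests) {
        EchoServlet servlet = new EchoServlet(); // no-args constructor
        servlet.init("Hello");
        for (String request : requests) {
            servlet.service(request);
        }
        servlet.destroy();
        return servlet.events;
    }

    public static void main(String[] args) {
        System.out.println(run("a", "b")); // prints [init, service, service, destroy]
    }
}
```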
Before we can start writing the first Servlet, we need to know some basics of
HTTP ("HyperText Transfer Protocol"), the protocol which is used by a WWW
client (e.g. a browser) to send a request to a Web Server.
HTTP is a request-response oriented protocol. An HTTP request consists of a
request method, a URI, header fields and a body (which can be empty). An HTTP
response consists of a status (result) code, header fields and a body.
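To make the parts of a request concrete, here is a minimal sketch that splits a raw HTTP request into its method, URI, header fields and body. The class RawHttpRequest is hypothetical and does no validation; a real server or Servlet Engine does this parsing for you.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper that splits a raw HTTP request into its parts.
class RawHttpRequest {
    final String method;
    final String uri;
    final Map<String, String> headers = new LinkedHashMap<>();
    final String body;

    RawHttpRequest(String raw) {
        // The header section and the body are separated by an empty line.
        int split = raw.indexOf("\r\n\r\n");
        String head = split >= 0 ? raw.substring(0, split) : raw;
        body = split >= 0 ? raw.substring(split + 4) : "";

        String[] lines = head.split("\r\n");
        // First line is the request line, e.g. "POST /cart HTTP/1.0".
        String[] requestLine = lines[0].split(" ");
        method = requestLine[0];
        uri = requestLine[1];

        // Remaining lines are header fields of the form "Name: value".
        for (int i = 1; i < lines.length; i++) {
            int colon = lines[i].indexOf(':');
            if (colon > 0) {
                headers.put(lines[i].substring(0, colon).trim(),
                            lines[i].substring(colon + 1).trim());
            }
        }
    }

    public static void main(String[] args) {
        RawHttpRequest request = new RawHttpRequest(
            "POST /cart HTTP/1.0\r\n"
            + "Content-Type: application/x-www-form-urlencoded\r\n"
            + "Content-Length: 9\r\n"
            + "\r\n"
            + "item=book");
        System.out.println(request.method + " " + request.uri);
        System.out.println(request.headers);
        System.out.println(request.body);
    }
}
```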
The service method of HttpServlet dispatches a request to different Java methods for
different HTTP request methods. It recognizes the standard HTTP/1.1 methods
and should not be overridden in subclasses unless you need to implement
additional methods. The recognized methods are GET, HEAD, PUT, POST, DELETE,
OPTIONS and TRACE. Other methods are answered with a Bad Request HTTP
error. An HTTP method XXX is dispatched to a Java method doXxx, e.g.
GET -> doGet. All these methods expect the parameters "(HttpServletRequest req,
HttpServletResponse res)". The
methods doOptions and doTrace have suitable default implementations and are usually
not overridden. The HEAD method (which is supposed to return the same header
lines that a GET method would return, but doesn't include a body) is performed
by calling doGet and ignoring any output that is written by this method. That
leaves us with the methods doGet, doPut, doPost and doDelete whose default
implementations in HttpServlet return a Bad Request HTTP error. A subclass
of HttpServlet overrides one or more of these methods to provide a meaningful
implementation.
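The dispatch described above can be modelled roughly as follows. This is a simplified plain-Java sketch and not the real HttpServlet: the class names, the String parameters and the Response type are hypothetical, OPTIONS and TRACE are left out, and the 400 status simply mirrors the text above (the exact code a real HttpServlet returns may differ between API versions).

```java
// Simplified model of the method dispatch; not the real HttpServlet class.
abstract class MiniHttpServlet {

    // Minimal response: a status code plus a body.
    static final class Response {
        final int status;
        final String body;

        Response(int status, String body) {
            this.status = status;
            this.body = body;
        }
    }

    // Maps the HTTP request method to the matching doXxx method.
    final Response service(String method, String uri) {
        switch (method) {
            case "GET":    return doGet(uri);
            case "POST":   return doPost(uri);
            case "PUT":    return doPut(uri);
            case "DELETE": return doDelete(uri);
            case "HEAD":
                // Same status as GET, but any body written by doGet is dropped.
                return new Response(doGet(uri).status, "");
            default:
                return badRequest(); // unrecognized method
        }
    }

    // Default implementations report an error until a subclass overrides them.
    protected Response doGet(String uri)    { return badRequest(); }
    protected Response doPost(String uri)   { return badRequest(); }
    protected Response doPut(String uri)    { return badRequest(); }
    protected Response doDelete(String uri) { return badRequest(); }

    private static Response badRequest() {
        return new Response(400, "Bad Request");
    }
}

// A subclass overrides only the methods it wants to support.
class HelloServlet extends MiniHttpServlet {
    @Override
    protected Response doGet(String uri) {
        return new Response(200, "Hello from " + uri);
    }

    public static void main(String[] args) {
        MiniHttpServlet servlet = new HelloServlet();
        for (String method : new String[] {"GET", "HEAD", "POST"}) {
            Response r = servlet.service(method, "/hello");
            System.out.println(method + " -> " + r.status + " [" + r.body + "]");
        }
    }
}
```

Because HelloServlet overrides only doGet, GET and HEAD succeed in this model while POST falls through to the default error response.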
The request data is passed to all methods through the first argument of type
HttpServletRequest (an interface which extends the more general ServletRequest interface). The response can
be created with methods of the second argument of type HttpServletResponse (an interface which extends
ServletResponse).
When you request a URL in a Web Browser, the GET method is used for the
request. A GET request does not have a body (i.e. the body is empty). The
response should contain a body with the response data and header fields which
describe the body (especially Content-Type and Content-Encoding). When you send an HTML form,
either GET or POST can be used. With a GET request the parameters are encoded
in the URL; with a POST request they are transmitted in the body. HTML editors
and upload tools use PUT requests to upload resources to a Web Server and
DELETE requests to delete resources.
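Form parameters use the same URL encoding whether they travel in the query string of a GET request or in the body of a POST request. The sketch below decodes such a string with the JDK's URLDecoder; the class FormParams is hypothetical, it assumes UTF-8 and Java 10 or later, and it keeps only the last value when a name is repeated.

```java
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for decoding URL-encoded form parameters.
class FormParams {

    // Decodes "name=value&name2=value2" as found in a GET query string
    // or in the body of a form POST.
    static Map<String, String> parse(String encoded) {
        Map<String, String> params = new LinkedHashMap<>();
        if (encoded == null || encoded.isEmpty()) {
            return params;
        }
        for (String pair : encoded.split("&")) {
            int eq = pair.indexOf('=');
            String name = eq >= 0 ? pair.substring(0, eq) : pair;
            String value = eq >= 0 ? pair.substring(eq + 1) : "";
            params.put(decode(name), decode(value));
        }
        return params;
    }

    private static String decode(String s) {
        // Turns "+" into a space and "%XX" escapes into characters (Java 10+).
        return URLDecoder.decode(s, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // GET: parameters follow the "?" in the URL.
        String url = "/search?q=java+servlets&lang=en";
        System.out.println(parse(url.substring(url.indexOf('?') + 1)));

        // POST: the same encoding, but sent as the request body.
        String body = "q=java+servlets&lang=en";
        System.out.println(parse(body));
    }
}
```

In a real Servlet this decoding is done by the Servlet Engine, and the values are read through the request object instead.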
The complete HTTP specifications can be found in RFCs 1945 (HTTP/1.0) and
2068 (HTTP/1.1).