Java Servlets (and the associated Java Server Pages) are a server-side web application technology. They require no special client-side support beyond a web browser, although they can take advantage of a number of client-side technologies. Java Servlet technology is becoming the method of database access most favoured by major database management system vendors: Oracle have announced that they are dropping their command line and client-server (Oracle Forms) interfaces in favour of servlet-based web interfaces. Servlets are fast, relatively easy to develop once the initial hurdles are overcome, and scalable to the largest of web application needs.
Servlets are Java objects that implement the javax.servlet.Servlet interface. They are loaded by a servlet server such as Tomcat, Jetty, Resin, or BEA's WebLogic (to name but a few), which maps them to some location in the web server's web space. Any web access to that address is picked up by the servlet server and directed to the corresponding servlet. The servlet analyses the request and responds, usually with an HTML page containing the answer to the request, although more sophisticated response mechanisms are possible.
While most servlet servers can act as full HTTP daemons on their own, serving both servlets and standard web pages, they rarely have the range of web serving facilities and tuned efficiency of a full web server like Apache, Netscape Server or Microsoft's Internet Information Server. On the other hand, these web servers cannot provide the full servlet serving facilities that the servlet servers are dedicated to. Hence the usual configuration is to run both a web server and a servlet server, and to configure the web server to forward servlet requests to the servlet server. The servlet server then does not act as an HTTP daemon itself.
Typically, rather than directly implementing the javax.servlet.Servlet interface, servlets extend javax.servlet.http.HttpServlet, which itself implements the Servlet interface. Much of the "boilerplate" work of extracting information from a request and putting together a response is already implemented for you in HttpServlet.
A Web Application directory structure is a single directory that contains all the Java class files, libraries, HTML pages, JSP pages, data files and configuration information in the correct structure and formats. If your web application is maintained in this structure, any standards-conforming servlet server can load and run the servlets. Furthermore, a web application in this structure can be packaged up into a single ".war" file, which is all that is needed to install the application on a servlet server. You can drop such a war file into the appropriate server's webapps directory and the server will unpack it, load the servlets and start serving them without any further interaction. This is a very convenient way to distribute and install binary web applications. The convenience comes at a slight cost in that you have to be able to compile and build this structure. However, tool support, particularly Ant, can make this easy.
The directory structure required is simply one which matches the web address structure of the application but with one extra directory at the top level called WEB-INF. This WEB-INF is itself protected from web access but contains the servlet classes, servlet configuration information and libraries that must be loaded and available for the servlets to work and any other data files or information required for the execution of the servlets. The structure is as follows:
Application Root
    WEB-INF
        classes
            (any .class files: compiled Java servlets)
        lib
            (any .jar files that are not normally available but are needed by servlets and JSP pages)
        web.xml (contains the connections between class file names and abstract servlet names, and between servlet names and web address locations)
        (any other data files that need to be accessible to the servlets)
    (any JSP pages, HTML pages, images or other files, in a directory structure that puts the files where you want them to appear in the web space relative to the root)
In order to maintain the sources and corresponding data files, I recommend the following source directory structure:
(the Java servlet source files)
(the .jar files that have to be copied into the WEB-INF/lib directory)
(the files - JSP, HTML, Images etc. - and subdirectories that should be copied into the Application Root directory)
(the files - web.xml and other application specific data files - that need to be copied into the WEB-INF directory)
(a working directory used to collect the whole web application structure in order to package it up into a .war file)
(where the final .war file is put, ready to be copied into the servlet server's webapps directory at installation time)
Given the directory structure above, to add an extra servlet to the application, you would typically need to:
add a new .java servlet file to src.
modify etc/web.xml in the pattern for other servlets already shown there: namely to associate an abstract name to an object name (the object name is, of course, the name of the file without the .java or .class extension) and to associate the abstract name with a location in the web address space relative to the root of the application. Note that this mapping need have no relationship with the actual underlying directory structure of the web application: you can map a servlet to address /x/y/z even if there is no /x and no /x/y page, servlet or directory.
quit the servlet server and run "ant install" to install the application.
start the servlet server again.
It should not really be necessary to stop and restart the servlet server, but complexities with caching and with dynamic loading and unloading of classes mean that you can get unexpected behaviour if you don't do so.
You may already be familiar with the program make. Just like make, ant is a program that reads a set of rules on how to build some software system from its sources, figures out the least amount of work it has to do in order to achieve that (basing its decisions on file modification dates) and builds the system requested. Ant's advantage over make is that it is fully platform independent, so it works as well on Windows as on Unix/Linux. It is written entirely in Java and its rule file is an XML file, conventionally called build.xml.
It is typically run with a single argument: the name of the target to build. The build.xml file in the sample servlet web application directory has the following targets:
prepare: create the deploy and dist directories if necessary, and copy all the files that do not need to be compiled to their appropriate place in the deploy directory.
compile: compile all .java files in src, putting the compiled versions in the appropriate places in the deploy directory.
dist: build a .war file from the deploy directory.
install: copy the .war file to the servlet server's webapps directory, taking care to remove any previous versions first.
clean: remove all generated files and directories (except for the installed versions in the servlet server's webapps directory).
A servlet receives a request from a web browser over the HTTP protocol and must construct some kind of response. The incoming request is usually either a GET, in which any parameters of the request are encoded as part of the URL used to access the servlet, or a POST, in which the parameters are carried in the body of the message and are not visible or obvious to the end user. The servlet has to extract the parameters, whichever way they were encoded, deal with them, and generate a response by writing out and sending back the HTML text that should be displayed in the browser.
In fact the HttpServlet class that we (usually) derive our servlet from does most of the hard work for us: It encapsulates the request and response mechanisms into objects. It provides get methods, to conveniently extract parameters from the request object, and a getWriter method for the response object that returns a PrintWriter object which we can write the HTML response to.
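To make the GET encoding concrete, the following is a minimal sketch of roughly the kind of work the servlet server does for you before a parameter can be read from the request object: it splits the query string of a URL on '&' and '=' and URL-decodes each part. The class name QueryStringDemo and the sample query string are hypothetical, and a real servlet server also copes with repeated parameters, POST bodies and different character encodings, so treat this as an illustration rather than as how any particular server is implemented.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, for illustration only: not part of the servlet API.
final class QueryStringDemo {
    /** Parses e.g. "a=1&b=two+words" into a name-to-value map (last value wins). */
    static Map<String, String> parse(String query) {
        Map<String, String> params = new LinkedHashMap<>();
        if (query == null || query.isEmpty()) {
            return params;
        }
        for (String pair : query.split("&")) {
            if (pair.isEmpty()) {
                continue; // tolerate "a=1&&b=2"
            }
            int eq = pair.indexOf('=');
            String name = eq < 0 ? pair : pair.substring(0, eq);
            String value = eq < 0 ? "" : pair.substring(eq + 1);
            params.put(decode(name), decode(value));
        }
        return params;
    }

    /** Turns "%27" back into "'" and "+" back into a space. */
    private static String decode(String s) {
        try {
            return URLDecoder.decode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always supported", e);
        }
    }

    public static void main(String[] args) {
        // The part of a GET URL after the '?'
        System.out.println(parse("prefix=D%27Arc&page=2")); // prints {prefix=D'Arc, page=2}
    }
}
```

In a servlet you never need to write this yourself: request.getParameter("prefix") gives the decoded value whether the request was a GET or a POST.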
The servlet application is viewable when you have the servlet server (in this case Resin) running, have set up the RHCShop ODBC connection to Shop.mdb, and have built and installed the application at Basic_Server. Note that we are using the loopback web address to view the web application: you must have Resin running on the physical machine you run the browser on for this to work. Obviously, you cannot run two different instances of Resin on the same machine as they would clash on port numbers. You can, of course, run the servlet server on any machine and change the web address to point at that machine. However, as we'll see, when developing servlets you need the ability to start and stop the server easily, so I strongly recommend developing with your own personal server instance.
Source code for the sample servlet, stored in the source directory structure described above, is here.
Looking first at the DateTime servlet, we see the basic structure of a servlet: to get a working servlet, the only code we really need to provide is the derivation from HttpServlet and a doGet method (note that the API documentation for servlets is in the J2EE installation and not the J2SDK). Try running it and refreshing the browser page multiple times.
import java.io.IOException;
import java.io.PrintWriter;
import java.sql.Timestamp;
import javax.servlet.ServletException;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

public final class DateTimeServlet extends HttpServlet
{
    public void doGet(HttpServletRequest request, HttpServletResponse response)
        throws IOException, ServletException
    {
        // Set the content type before obtaining the writer so that any
        // character encoding it specifies takes effect.
        response.setContentType("text/html");
        PrintWriter writer = response.getWriter();
        writer.println("<html>");
        writer.println("<head>");
        writer.println();
        writer.println("</head>");
        writer.println("<body>");
        writer.println("<table border=\"0\">");
        // ... many more writer.println statements
        writer.println("<p>");
        // The current time, formatted by java.sql.Timestamp
        long millisecs = System.currentTimeMillis();
        Timestamp ts = new Timestamp(millisecs);
        writer.println("The date and time is now: " + ts);
        writer.println("</p>");
        writer.println("</body>");
        writer.println("</html>");
    }
}
Servlets often have to generate a great deal of HTML output to include in the response. This means that the servlet code often contains a little bit of program logic and a large number of output statements that, having to produce HTML, must encode it with lots of escape characters. This makes the necessary page design work unpleasant, and means that the artistic designers either have to work on Java code themselves or have to request changes to the code from the Java programmers.
Java Server Pages (or JSP) is an attempt to overcome this problem. Rather than embedding HTML output statements in Java code, JSPs embed Java statements in HTML code. The first time a servlet server is asked to load a JSP page, it automatically "inverts" the JSP page, translating it into a standard servlet with embedded HTML output statements and all the necessary escaping of characters required. It then compiles the resulting servlet and serves it as if it were a normal servlet.
This means that now the artistic designers can work on the presentation of the web pages directly, using standard web page editing tools, while the Java programmers only need to embed their logic in the JSP.
While this has many advantages, it has disadvantages as well: cluttering up the HTML page with Java tags can mean that the artistic designers make changes that damage the program logic without realising it. A whole host of technologies have been developed to assist here (e.g. Tag libraries, Bean libraries, Web Macros, Templates etc.) by minimising the amount of Java code necessary in a JSP page or by adding a layer of protection so that it becomes difficult to break the embedded Java code. However, no single technology in this area has yet become the accepted best practice. In this course we will concentrate on Servlets since, even if you wish to use JSPs, you still need a basic mastery of Servlets before moving on to JSPs.
The sample web application provides an example of a JSP page that prints out the Customers from the Shop database.
The two servlets above have not taken any input from the user other than the request for a response. The FindCustomers servlet handles a text input parameter from a form in the query HTML page, and uses it in a "LIKE" condition of an SQL query to find the customers whose last names begin with the string that was input. Getting a parameter is easy given the support provided by the HttpServlet and HttpServletRequest classes:
String prefix = request.getParameter("prefix");
writer.println("<p>Prefix specified was: " + prefix + "</p>");
Warning
Actually, here I have oversimplified a significant problem (and ignored it completely previously as well): the user could have entered any string at all in the browser. If we simply echo it, as we do above, it could break the page display if it contained characters like '<', or even full HTML tags. The problem is actually worse: the table cells being output from the database are essentially dumped straight into the HTML response page.
What if one of the fields in the database had been entered with a '<' or with something that happened to be an HTML tag? For this reason, all output to a web page that comes from a source that is not explicitly coded by the servlet programmer should first pass through a filter that sanitises the text for HTML display. There are a number of such class libraries available free, or it is easy to write your own.
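As an illustration of how small such a filter can be, here is a minimal sketch that replaces the five characters with special meaning in HTML by their entities. The class name HtmlEscaper is hypothetical (it is not part of the sample application or of the servlet API), and it assumes the text ends up in ordinary element content or in a quoted attribute value; text placed inside scripts, style sheets or URLs needs different treatment.

```java
// Hypothetical helper, for illustration only.
final class HtmlEscaper {
    private HtmlEscaper() {}

    /** Returns text with the HTML-significant characters replaced by entities. */
    static String escape(String text) {
        if (text == null) {
            return ""; // e.g. a missing request parameter or a NULL database field
        }
        StringBuilder out = new StringBuilder(text.length() + 16);
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            switch (c) {
                case '&':  out.append("&amp;");  break;
                case '<':  out.append("&lt;");   break;
                case '>':  out.append("&gt;");   break;
                case '"':  out.append("&quot;"); break;
                case '\'': out.append("&#39;");  break;
                default:   out.append(c);
            }
        }
        return out.toString();
    }
}
```

With this in place, the echo of the parameter would become writer.println("<p>Prefix specified was: " + HtmlEscaper.escape(prefix) + "</p>"), and the same call would wrap every database field written into a table cell.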
Warning
There is a further related problem: in the section where I put the string prefix into the SQL query, I made no check or transformation on the string.
Here again problems could occur innocently (e.g. the user enters the string "D'Arc") or less innocently (can you come up with a scenario where some string that the user enters could modify the query in such a way that the resulting query actually causes the database to be modified?). The solution is to program much more careful control over what happens to strings before they are appended to queries for execution.
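One common way to get that control, sketched below, is to stop appending the string to the query at all and pass it as a parameter of a JDBC PreparedStatement, so that the driver treats it purely as data. This is a sketch under assumptions, not the code of the sample application: the class name CustomerQueries is hypothetical, and the table name Customers and column name CustomerLastName are guesses at the Shop schema that you would need to adjust.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, for illustration only.
final class CustomerQueries {
    // Assumed table and column names; the '?' is filled in by the driver.
    private static final String FIND_BY_PREFIX =
        "SELECT CustomerLastName FROM Customers WHERE CustomerLastName LIKE ?";

    private CustomerQueries() {}

    /** Builds the LIKE pattern for "last name begins with prefix". */
    static String prefixPattern(String prefix) {
        return (prefix == null ? "" : prefix) + "%";
    }

    /** Returns the last names that begin with the given prefix. */
    static List<String> findLastNames(Connection con, String prefix) throws SQLException {
        List<String> names = new ArrayList<>();
        try (PreparedStatement ps = con.prepareStatement(FIND_BY_PREFIX)) {
            // The value is bound as a parameter, never spliced into the SQL text,
            // so a quote in "D'Arc" cannot terminate the string literal.
            ps.setString(1, prefixPattern(prefix));
            try (ResultSet rs = ps.executeQuery()) {
                while (rs.next()) {
                    names.add(rs.getString(1));
                }
            }
        }
        return names;
    }
}
```

Note that '%' and '_' typed by the user still act as LIKE wildcards here. That cannot change what the statement does, only which rows it matches, but you may want to escape them too if your database supports it.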
So far we have handled POST requests. We need to handle GET requests as well. This is simple: there is a doGet method that looks exactly like the doPost. Further the parameters are handled with the same interfaces. Therefore the following bit of boilerplate code will sort out the GET requests:
public void doGet(HttpServletRequest request, HttpServletResponse response)
    throws IOException, ServletException
{
    doPost(request, response);
}
A servlet is just a class, or a set of classes, loaded by the servlet server. Servlets are (usually) not processes in their own right but run inside a servlet server process (srun, in the case of Resin). This means that the current working directory of the process the servlet is running in may be outside the control of the servlet. Furthermore, this directory may not be predictable before the servlet starts running. Thus static relative pathnames should not be used in a servlet. One could use absolute pathnames, but this causes maintenance problems. Instead you can use the following mechanism to turn a pathname relative to the root of the installed web application into an absolute pathname:
ServletContext app = getServletContext();
String path = app.getRealPath("/WEB-INF/Database.Properties");
FileInputStream in = new FileInputStream(path);
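Once you have the absolute pathname, the file still has to be read and the stream closed again, even if reading fails. The following is a minimal sketch of that step using java.util.Properties and try-with-resources. The class name PropertiesLoader is hypothetical, and the keys stored in Database.Properties are not shown in this document, so none are assumed here. The null check is there because getRealPath may return null when the server cannot translate the path to a file, for instance if it serves the application straight out of the .war file.

```java
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.Properties;

// Hypothetical helper, for illustration only.
final class PropertiesLoader {
    private PropertiesLoader() {}

    /** Loads a .properties file from an absolute path, always closing the stream. */
    static Properties load(String absolutePath) throws IOException {
        if (absolutePath == null) {
            throw new IOException("No file system path available for this resource");
        }
        Properties props = new Properties();
        // try-with-resources closes the stream even if load() throws
        try (InputStream in = new FileInputStream(absolutePath)) {
            props.load(in);
        }
        return props;
    }
}
```

In the servlet this would be called as PropertiesLoader.load(app.getRealPath("/WEB-INF/Database.Properties")), after which individual settings are read with getProperty.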
All the servlets shown do their work entirely within a doGet or doPost method. They create a database connection, do any queries, generate the response and tear down the database connection for every page access to the servlet. Since creating a database connection is very expensive, this slows down execution and reduces the page serving rate very considerably. We would like to avoid all this unnecessary setup work and set up the connection only once, reusing it for every page access. In order to understand how to do this, we have to understand the servlet life cycle.
When the servlet server starts, it waits for page accesses that request servlets. When one comes in, it loads the servlet class into memory and initialises it if it was not already loaded, and creates an object (an instance) of the servlet class. This servlet instance then stays alive and in memory for as long as the servlet server likes: typically the server will kill it off if it has remained idle for a certain amount of time. This servlet instance will typically handle all requests to the servlet's web address. [Actually, you get one servlet instance created for each different registered servlet name in the web.xml files, even if the same servlet class file is registered to the different servlet names.] As far as the servlet instance is concerned, each page request arrives on a different thread. This immediately means that synchronisation issues need to be taken into account.
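To see why this matters, here is a minimal sketch of one shared object being called from several threads at once, which is the situation a servlet instance is in. It uses no servlet classes so that it can be run on its own: the hypothetical class HitCounter stands in for the servlet, and handleRequest stands in for doGet or doPost. The plain int field can lose updates when two threads increment it at the same moment, while the AtomicInteger cannot.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for a servlet instance, for illustration only.
final class HitCounter {
    private int unsafeHits;                                      // plain field: updates can be lost
    private final AtomicInteger safeHits = new AtomicInteger();  // atomic: no lost updates

    /** Stands in for doGet/doPost: called concurrently on the one shared instance. */
    void handleRequest() {
        unsafeHits++;               // read-modify-write, not atomic
        safeHits.incrementAndGet(); // atomic increment
    }

    int getSafeHits() { return safeHits.get(); }
    int getUnsafeHits() { return unsafeHits; }

    public static void main(String[] args) throws InterruptedException {
        HitCounter servlet = new HitCounter();
        Thread[] requests = new Thread[8];
        for (int i = 0; i < requests.length; i++) {
            requests[i] = new Thread(() -> {
                for (int n = 0; n < 10_000; n++) {
                    servlet.handleRequest();
                }
            });
            requests[i].start();
        }
        for (Thread t : requests) {
            t.join();
        }
        System.out.println("atomic count: " + servlet.getSafeHits());   // always 80000
        System.out.println("plain count:  " + servlet.getUnsafeHits()); // may be lower
    }
}
```

Local variables of doGet or doPost are never shared between threads, so the problem only arises for instance and class variables.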
Consider what kind of variable you could use for holding the database connection object. There are three cases:
Local variable of doGet/doPost:
Here there is no sharing of the connection between different calls to the servlet web page
Instance variable of servlet class:
Here the connection can be shared across different web access calls, but we now have the problem that when the servlet server decides to kill the servlet instance, we have to make sure that the connection is shut down properly. We can do this because the servlet class has another pair of methods, init and destroy, which are invoked respectively when the servlet has just been loaded and an instance initialised, and when the servlet is about to be unloaded. Therefore you can set up the connection (and/or load any data that should be persistent across multiple accesses to the servlet) in init, and close it down (and/or save any such data) in destroy.
Class (i.e. static) variable of servlet class:
This is basically the same as the instance variable case, as there is usually only one instance anyway and only under slightly unusual circumstances would you have more than one. You should decide, though, for any data that must persist, whether there should be different persistent data sets for the different registered names of the servlet (in which case instance variables should be used and every init/destroy has to load/save) or a single persistent set across all registered names (in which case class variables should be used, and care should be taken that only the first init loads and the last destroy saves).
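The "only the first init loads and the last destroy saves" rule for class variables can be sketched with a simple reference count, as below. This is an illustration under assumptions rather than code from the sample application: the class name SharedResource is hypothetical, and the String field merely stands in for the shared database connection or persistent data. Each servlet instance would call acquire from its init method and release from its destroy method.

```java
// Hypothetical helper, for illustration only.
final class SharedResource {
    private static int users;        // number of servlet instances currently initialised
    private static String resource;  // stands in for the shared connection or data

    private SharedResource() {}

    /** Call from init(): only the first caller actually loads. */
    static synchronized String acquire() {
        if (users++ == 0) {
            resource = "loaded";     // placeholder for opening the connection / loading data
        }
        return resource;
    }

    /** Call from destroy(): returns true if this was the last user and the resource was closed. */
    static synchronized boolean release() {
        if (users == 0) {
            throw new IllegalStateException("release without matching acquire");
        }
        if (--users == 0) {
            resource = null;         // placeholder for saving data / closing the connection
            return true;
        }
        return false;
    }
}
```

Both methods are synchronized because the servlet server may initialise or destroy different servlet instances from different threads.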
Write a servlet that takes a text entry field value and returns the length of the string that was entered in that field, together with a text entry field so that the user can immediately enter the next value.
Modify the FindCustomer servlet to provide, on the response page, a form with a text entry field (with a default value of the currently chosen prefix) and a submit button so that a new FindCustomer request can be made without having to return to the previous page first.
Modify the FindCustomer servlet so that searches can be made on the CustomerFirstName and on the CustomerLastName simultaneously.
Write a servlet that allows creation of a new customer (entry of the first and last names and address). The servlet should not allow two different customers to be created who have the same values for all three of these attributes.
Write a servlet that, on entry of a CustomerID, displays the full details of that customer's open orders.
Modify the FindCustomer servlet so that it displays a button beside each row of the resulting table and, if one of these buttons is pressed, displays that customer's orders together with their order detail information.
Write a servlet that chooses a random number between 1 and 100 and allows the user to make a series of guesses to find it. On each guess the servlet should report whether the guess was higher, lower or equal, and report the number of guesses so far.
There are many topics on servlets that we could discuss but did not have sufficient time to cover:
Synchronisation: Connection is thread safe, so you don't have to synchronise it if you have a proper database and a decent JDBC driver for it. Statements and ResultSets are a different matter. If you want scalability and performance, try to design things so that you minimise the use of synchronisation and avoid the use of SingleThreadModel. Don't forget that the servlet server can crash, so make sure that critical persistent data is robust: while destroy will be called when the servlet server tries to unload your servlet, it may never get the chance if the servlet server crashes.
Sessions: there are excellent session support facilities that handle cookies and URL rewriting.
Security: never depend on Form Authentication without at least going over HTTPS.
Connection Pooling: Database Connections are expensive and limited in number. Connection pooling is a necessity on any, even half-way serious, web database application. There are public domain libraries (some atrocious, some very good) as well as many commercial offerings. There are other resource pooling packages out there as well that are worth looking at.
Enterprise level servlets: For thorough scalability you might want to look at Enterprise Java Beans (the open-source JBoss is actually very good). For a lighter weight (and easier to learn) alternative, check out what Spring has to offer.
Presentation issues: JSP on its own is not very usable. You really have to add other technologies to make it feasible. Tag libraries (particularly the Standard Tag Library from Apache and Java Server Faces), Java Beans, Web Macros, Templating Toolkits etc. are all contenders, but the jury is still out on which the eventual winner will be. Relatively new kids on the block are Tapestry and Echo/EchoPoint. These have completely different models of how the front end should work. Currently I prefer either of them to any of the others, but new ideas and systems are coming out all the time.