<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE article PUBLIC "-//OASIS//DTD DocBook XML V4.2//EN" "http://www.oasis-open.org/docbook/xml/4.2/docbookx.dtd">
<article id="UniDocBook">
  <articleinfo>
    <title>Getting Up to Speed With DocBook</title>
    <author>
      <firstname>Ashley</firstname>
      <othername>J.S</othername>
      <surname>Mills</surname>
      <affiliation>
        <address><email>ashley@ashleymills.com</email></address>
      </affiliation>
    </author>

    <copyright>
      <year>2002</year>
      <holder role="mailto:ashley@ashleymills.com">The University Of Birmingham</holder>
    </copyright>
  </articleinfo>

  <sect1 id="UniDocBook-Install"><title>Installation</title>
    <para>
      This section details how to install the tools required to validate and process <acronym>XML</acronym> DocBook documents. The tools to be installed are the collection of tools and libraries known as 'libxml', <command>Saxon</command>, and <command>FOP</command>. The first will be used here to validate <acronym>XML</acronym> files, and the latter two will be used to process <acronym>XML</acronym> files to produce other types of output. Another tool, called a resolver, will be installed so that the tools can map references to files external to the computer being worked on to local copies of those files. This allows one to use the tools without an Internet connection and speeds up their execution.
    </para>

    <para>
      The documentation for the installation is written under the assumption that the reader has some experience of installing software on computers and knows how to change the operating environment of the particular operating system they are using. The documents entitled <ulink url="../winenvars/winenvarshome.html"><citetitle>Configuring A Windows Working Environment</citetitle></ulink> and <ulink url="../unixenvars/unixenvarshome.html"><citetitle>Configuring A Unix Working Environment</citetitle></ulink> are of use to people who need to know more.
    </para>

    <sect2 id="UniDocBook-Install-libxml"><title><command>libxml</command></title>
      <para>
        Within this tutorial the primary purpose of installing the libxml C library is to gain access to the tools that come with it. The tools provide the means to validate and transform <acronym>XML</acronym> files. In this tutorial, the program <command>xmllint</command> will be used to validate <acronym>XML</acronym> DocBook files before processing. The program <command>xsltproc</command>, which uses <acronym>XSLT</acronym>, can be used to transform <acronym>XML</acronym> files.
      </para>

      <sect3 id="UniDocBook-Install-libxml-Windows"><title>Windows</title>
        <para>
          To install <emphasis role="strong">libxml</emphasis> on a Windows machine one needs to download the Windows binaries and libraries. These can be obtained from <ulink url="http://www.zlatkovic.com/pub/libxml/">http://www.zlatkovic.com/pub/libxml/</ulink>. Download the following:
        </para>
  
        <itemizedlist>
          <listitem><para><ulink url="http://www.zlatkovic.com/pub/libxml/libxml2-2.6.2+.win32.zip">libxml2-2.6.2+.win32.zip</ulink></para></listitem>
          <listitem><para><ulink url="http://www.zlatkovic.com/pub/libxml/libxslt-1.1.0.win32.zip">libxslt-1.1.0.win32.zip</ulink></para></listitem>
          <listitem><para><ulink url="http://www.zlatkovic.com/pub/libxml/iconv-1.9.1.win32.zip">iconv-1.9.1.win32.zip</ulink></para></listitem>
        </itemizedlist>
  
        <note>
          <para>
            The three links shown immediately above may be broken, since it is common practice to remove old versions from a download page when they become obsolete. If so, go to <ulink url="http://www.zlatkovic.com/pub/libxml/">http://www.zlatkovic.com/pub/libxml/</ulink> instead and download the <filename>libxml2...</filename>, <filename>libxslt...</filename>, and <filename>iconv...</filename> files with the highest version numbers. Some older versions are available in the directory <filename>oldreleases</filename> on that server, should one desire them.
          </para>
        </note>
  
        <para>
          It is not necessary to extract the entire content of these zips; only the required files will be extracted. Create a suitable directory to hold them. For example, on my Windows machine I have a directory called <filename>c:\tools</filename> which contains all the tools I install, and within <filename>c:\tools</filename> I have a directory called <filename>libxml</filename> that contains the files I want from these zips.
        </para>
  
        <para>
          Extract the following files from the <filename>libxml</filename> archive into the directory.
        </para>
  
        <itemizedlist>
          <listitem><para><filename>libxml2.dll</filename></para></listitem>
          <listitem><para><filename>xmllint.exe</filename></para></listitem>
        </itemizedlist>
  
        <para>Extract the following files from the <filename>libxslt</filename> archive into the directory.</para>
  
        <itemizedlist>
          <listitem><para><filename>libexslt.dll</filename></para></listitem>
          <listitem><para><filename>libxslt.dll</filename></para></listitem>
          <listitem><para><filename>xsltproc.exe</filename></para></listitem>
        </itemizedlist>
  
        <para>Extract the following files from the <filename>iconv</filename> archive into the directory.</para>
  
        <itemizedlist>
          <listitem><para><filename>iconv.dll</filename></para></listitem>
          <listitem><para><filename>iconv.exe</filename></para></listitem>
        </itemizedlist>
  
        <para>
          Append <filename>\directory\you\just\unzipped\everything\to</filename> to the <envar>PATH</envar> environment variable.
        </para>
  
        <para>
          You might not use <emphasis>all</emphasis> the tools but they are worth having around in case you decide you need them.
        </para>
      </sect3> 

      <sect3 id="UniDocBook-Install-libxml-Unix"><title>Unix/Linux/BSD</title>
        <para>
          These files are probably already installed on your system, as most modern distributions of these operating systems use <acronym>XML</acronym> processing for some of the more popular components. You may nevertheless wish to get the latest versions, in which case go to <ulink url="ftp://xmlsoft.org/">ftp://xmlsoft.org/</ulink> and get the latest <emphasis role="strong">libxml2</emphasis> and <emphasis role="strong">libxslt</emphasis>. Gzipped tars and <acronym>RPM</acronym>s are available; download whichever you prefer. A list of the latest files at the time of writing is shown below:
        </para>

        <programlisting>
          <filename>libxml2-2.4.25.tar.gz</filename>
          <filename>libxml2-2.4.25-1.i386.rpm</filename>
          <filename>libxml2-2.4.25-1.src.rpm</filename>
          <filename>libxslt-1.0.21.tar.gz</filename>
          <filename>libxslt-1.0.21-1.i386.rpm</filename>
          <filename>libxslt-1.0.21-1.src.rpm</filename>
        </programlisting>

        <para>
          The ftp directory also contains <emphasis>devel</emphasis> versions of the software. These are for people who want to develop with libxml.
        </para>
      </sect3>
    </sect2>

    <sect2 id="UniDocBook-Install-Fop"><title><acronym>FOP</acronym></title>
      <para>
        <acronym>FOP</acronym> (Formatting Objects Processor) is used to transform <acronym>FO</acronym> files into files of other formats. In this tutorial it is used to transform <acronym>FO</acronym> output produced by <command>xsltproc</command> into <acronym>PDF</acronym>, a well known format considered by many to be aesthetically pleasing. The Unix and Windows installation paths are very similar; the differences will be mentioned where appropriate.
      </para>

      <para>
        Download the latest version of <acronym>FOP</acronym> from <ulink url="http://ftp.plig.org/pub/apache/dist/xml/fop/">http://ftp.plig.org/pub/apache/dist/xml/fop/</ulink>. Choose the zip or tar with <emphasis>bin</emphasis> as a substring of its name and unpack it to some suitable location.
      </para>

      <para>
        On Windows, append <filename>/directory/where/you/unzipped/fop</filename>, the directory that contains <filename>fop.bat</filename>, to the <envar>PATH</envar> environment variable.
      </para>

      <para>
        On Unix, append <filename>/directory/where/you/unzipped/fop</filename>, the directory that contains <filename>fop.sh</filename>, to the <envar>PATH</envar> environment variable.
      </para>

      <sect3 id="UniDocBook-Install-Fop-Jimi"><title>Install Jimi</title>
        <para>
          Jimi is needed if you want to use <acronym>PNG</acronym> images with <acronym>FOP</acronym>. Download it from <ulink url="http://java.sun.com/products/jimi/#">http://java.sun.com/products/jimi/#</ulink> and open the archive.
        </para>

        <para>
          On Windows, rename <filename>JimiProClasses.zip</filename> from the archive to <filename>jimi-1.0.jar</filename> and place it in the <filename>/directory/where/you/unzipped/fop/lib</filename> directory.
        </para>

        <para>
          On Unix, rename <filename>JimiProClasses.zip</filename> from the archive to <filename>JimiProClasses.jar</filename> and place it in the <filename>/directory/where/you/unzipped/fop/lib</filename> directory.
        </para>
      </sect3>
    </sect2>

    <sect2 id="UniDocBook-Install-DocBook"><title>DocBook <acronym>DTD</acronym></title>
      <para>
        The DocBook <acronym>DTD</acronym> (Document Type Definition) contains rules which specify the structure of a valid DocBook document, for example the order in which elements may appear and which attributes are valid. A document that claims to be written in DocBook is not DocBook unless it conforms to the DocBook <acronym>DTD</acronym>. The <acronym>DTD</acronym> is used by tools like <command>xsltproc</command> in transforming DocBook documents.
      </para>
        
      <para>
        <acronym>DTD</acronym>s are especially useful when one wants to validate a document, that is, to check that it conforms to the <acronym>DTD</acronym> it claims to conform to. Validation is beneficial because a valid document is less likely to break processing tools (if a valid document does break a processing tool, it is likely that the processing tool is broken and not the document). Hence, the DocBook <acronym>DTD</acronym> can be used to check that a purported DocBook document <emphasis>really is</emphasis> a DocBook document.
      </para>

      <para>
        The latest version, at the time of writing, is called <emphasis role="strong">DocBook XML 4.2</emphasis>. It is distributed from <ulink url="http://www.docbook.org/xml/4.2/index.html">http://www.docbook.org/xml/4.2/index.html</ulink>.
      </para>

      <para>
        Download the zipped archive, <ulink url="http://www.docbook.org/xml/4.2/docbook-xml-4.2.zip">http://www.docbook.org/xml/4.2/docbook-xml-4.2.zip</ulink>, and unzip it to some suitable location. For example, if I were running the Windows operating system I would unzip it to a directory like <filename>c:\lib\docbook-xml-4.2</filename>, which matches the name of the zip file. If I were running a Unix, Linux, or BSD operating system I would unzip it to a directory like <filename>/lib/docbook/docbook-xml-4.2</filename>. You might have noticed other files apart from <acronym>DTD</acronym> files on the webpage or in the zip; these are auxiliary files and are necessary.
      </para>

      <sect3 id="UniDocBook-Install-DocBook-Catalog"><title>Catalog Files</title>
        <para>
          When one writes a DocBook document in <acronym>XML</acronym> one inserts a DocType declaration at the top of the document to specify that the document is written in DocBook. A <emphasis role="strong">PUBLIC</emphasis> ID in the declaration specifies the version of DocBook being used, and a <emphasis role="strong">SYSTEM</emphasis> ID in the declaration specifies where the <acronym>DTD</acronym> for DocBook can be found. In the case of DocBook, this <emphasis role="strong">SYSTEM</emphasis> ID usually points to some location on the <acronym>OASIS</acronym> (Organization for the Advancement of Structured Information Standards) website, because this is DocBook's official home. Some tools used for processing DocBook fetch the <acronym>DTD</acronym> from this location, which is a problem when one wants to process a DocBook document on a computer that does not have Internet access or where accessing the Internet is undesirable.
        </para>
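
        <para>
          For illustration, the DocType declaration for a DocBook <acronym>XML</acronym> 4.2 article is reproduced below, using the same identifiers that appear in the article template later in this document. The first quoted string is the <emphasis role="strong">PUBLIC</emphasis> ID and the second is the <emphasis role="strong">SYSTEM</emphasis> ID:
        </para>

        <programlisting>
&lt;!DOCTYPE article PUBLIC &quot;-//OASIS//DTD DocBook XML V4.2//EN&quot;
  &quot;http://www.oasis-open.org/docbook/xml/4.2/docbookx.dtd&quot;&gt;
        </programlisting>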

        <para>
          To overcome the necessity to access the Internet to process DocBook documents one can use a <emphasis>catalog</emphasis> file. A <emphasis>catalog</emphasis> file maps <emphasis role="strong">PUBLIC</emphasis> or <emphasis role="strong">SYSTEM</emphasis> IDs to alternate locations. Taking the example from above, one would process the DocBook document that contained the <emphasis role="strong">SYSTEM</emphasis> ID pointing to the Internet with a tool and specify a catalog file to use. The catalog file would map the <emphasis role="strong">SYSTEM</emphasis> ID pointing to the Internet to a copy of the <acronym>DTD</acronym> on the local system, thus circumventing the need for any Internet access.
        </para>

        <para>
          The DocBook zip that was just downloaded does actually contain its own catalog file (<filename>catalog.xml</filename>). This does not seem to provide the desired functionality without modification. Instead of modifying that catalog file, create a new one called <filename>catalog</filename> in the <filename>docbook-xml-4.2</filename> directory. Put the following content into it:
        </para>

        <programlisting>
&lt;catalog xmlns=&quot;urn:oasis:names:tc:entity:xmlns:xml:catalog&quot;&gt;
  &lt;group xml:base=&quot;.&quot; prefer=&quot;public&quot;&gt;
    &lt;public publicId=&quot;-//OASIS//DTD DocBook XML V4.2//EN&quot; uri=&quot;file:///c:/lib/docbook-xml-4.2/docbookx.dtd&quot;/&gt;
  &lt;/group&gt;
&lt;/catalog&gt;
        </programlisting>

        <para>
          This maps the <emphasis role="strong">PUBLIC</emphasis> ID for DocBook <acronym>XML</acronym> V4.2 to a local copy of its <acronym>DTD</acronym>. The example above was taken from a Windows system; modify the value of the <emphasis>uri</emphasis> attribute to point to the location of the <acronym>DTD</acronym> on your system. On a Unix system this could be <filename>file:///lib/docbook/docbook-xml-4.2/docbookx.dtd</filename>.
        </para>

        <para>
          The processing tools must know where this catalog file is in order to use the functionality it provides. This is achieved via an environment variable called <envar>XML_CATALOG_FILES</envar>. Create this environment variable and make it point to the catalog file you just created. You could add similar entries to the catalog file shown above to map any other <acronym>DTD</acronym>s you wish to use to local copies.
        </para>
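
        <para>
          As a sketch of what such additional entries might look like, the catalog below also maps the <emphasis role="strong">SYSTEM</emphasis> ID of the 4.2 <acronym>DTD</acronym> and redirects any other <emphasis role="strong">SYSTEM</emphasis> ID beneath the same <acronym>URL</acronym> prefix to the local directory. The local path <filename>file:///c:/lib/docbook-xml-4.2/</filename> is the example Windows location used above and is an assumption about your system. The <emphasis>system</emphasis> and <emphasis>rewriteSystem</emphasis> elements come from the <acronym>OASIS</acronym> <acronym>XML</acronym> Catalogs specification; whether a given tool honours them should be checked against that tool's documentation.
        </para>

        <programlisting>
&lt;catalog xmlns=&quot;urn:oasis:names:tc:entity:xmlns:xml:catalog&quot;&gt;
  &lt;group prefer=&quot;public&quot;&gt;
    &lt;!-- PUBLIC ID of the DocBook XML 4.2 DTD, as in the catalog above --&gt;
    &lt;public publicId=&quot;-//OASIS//DTD DocBook XML V4.2//EN&quot;
            uri=&quot;file:///c:/lib/docbook-xml-4.2/docbookx.dtd&quot;/&gt;
    &lt;!-- SYSTEM ID of the same DTD, for tools that resolve by SYSTEM ID --&gt;
    &lt;system systemId=&quot;http://www.oasis-open.org/docbook/xml/4.2/docbookx.dtd&quot;
            uri=&quot;file:///c:/lib/docbook-xml-4.2/docbookx.dtd&quot;/&gt;
    &lt;!-- Any other file requested beneath the same URL prefix (assumed local path) --&gt;
    &lt;rewriteSystem systemIdStartString=&quot;http://www.oasis-open.org/docbook/xml/4.2/&quot;
                   rewritePrefix=&quot;file:///c:/lib/docbook-xml-4.2/&quot;/&gt;
  &lt;/group&gt;
&lt;/catalog&gt;
        </programlisting>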
      </sect3>
    </sect2>

    <sect2 id="UniDocBook-Install-StyleSheets"><title><acronym>XSL</acronym> StyleSheets</title>
      <para>
        <acronym>XSL</acronym> stylesheets dictate how a document written in <acronym>XML</acronym> should be transformed using <acronym>XSLT</acronym> to a particular output format. In the case of DocBook, Norman Walsh has already written, and regularly maintains, stylesheets that provide rules for transforming an <acronym>XML</acronym> DocBook document into the most commonly desired output formats, such as <acronym>XHTML</acronym> and <acronym>PDF</acronym>. The installation for Unix and Windows machines is the same.
      </para>

      <para>
        Download the latest stylesheets from <ulink url="http://sourceforge.net/project/showfiles.php?group_id=21935">http://sourceforge.net/project/showfiles.php?group_id=21935</ulink> and unzip the zip or gzipped tar to some suitable location. If I were running a Windows system I would use <filename>c:\lib\docbook-xsl\</filename>; if I were using a Unix system I would use <filename>/lib/docbook/docbook-xsl</filename>. The stylesheets are now ready to use.
      </para>

      <sect3 id="UniDocBook-Install-StyleSheets-Custom"><title>Custom StyleSheets</title>
        <para>
          The output produced by the stylesheets mentioned above is reasonable, but they are a standard distribution and are consequently designed to cater for the needs of the many. This is sensible, but it means the default output may not suit one's own tastes. One may modify the stylesheets directly, but more often one creates a customisation layer which imports the standard stylesheets and then overrides specific aspects of them, or adds extra functionality, according to one's tastes. I have created a customisation layer which looks good enough for standard applications and am offering it for download.
        </para>
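
        <para>
          To give an idea of what a customisation layer looks like, here is a minimal sketch for <acronym>FO</acronym> output. It is not the layer offered for download. The import path is the example Windows location used in this tutorial and must be adjusted for your system, and the two parameters, <emphasis>paper.type</emphasis> and <emphasis>section.autolabel</emphasis>, are parameters of the standard DocBook <acronym>XSL</acronym> stylesheets chosen purely for illustration.
        </para>

        <programlisting>
&lt;?xml version=&quot;1.0&quot; encoding=&quot;UTF-8&quot;?&gt;
&lt;xsl:stylesheet xmlns:xsl=&quot;http://www.w3.org/1999/XSL/Transform&quot; version=&quot;1.0&quot;&gt;
  &lt;!-- Import the standard FO stylesheet; adjust the path for your system --&gt;
  &lt;xsl:import href=&quot;file:///c:/lib/docbook-xsl/fo/docbook.xsl&quot;/&gt;

  &lt;!-- Override: use A4 paper --&gt;
  &lt;xsl:param name=&quot;paper.type&quot; select=&quot;'A4'&quot;/&gt;

  &lt;!-- Override: number sections automatically --&gt;
  &lt;xsl:param name=&quot;section.autolabel&quot; select=&quot;1&quot;/&gt;
&lt;/xsl:stylesheet&gt;
        </programlisting>

        <para>
          Because the import comes first, anything declared after it in the customisation layer takes precedence over the corresponding declaration in the standard stylesheets.
        </para>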

        <para>
          This is particularly pertinent if you study at The University Of Birmingham, because any documentation I create there in DocBook uses this customisation layer; all the tutorials I have written conform to these stylesheets. If you want your documents to have the same style as the tutorials then use this customisation layer. It is probably worth downloading the customisation layer anyway so you can see how one goes about creating one. Here is the zipped customisation layer: <ulink url="files/custom-stylesheets.zip">custom-stylesheets.zip</ulink>.
        </para>

        <para>
          Unzip the zip to where you want the customisation layer to be situated. This could be within the stylesheets directory or in a separate directory. If you unzip it to the stylesheets directory the customisation layer will unzip into the existing directories <filename>common</filename>, <filename>fo</filename> and <filename>xhtml</filename>. If you unzip to a separate directory these directories will be created.
        </para>

        <para>
          Wherever you unzip the zip, it is important to change the import references in the files so that they reflect the state of your system. The files <filename>fo/customfo.xsl</filename>, <filename>xhtml/customxhtml.xsl</filename> and <filename>xhtml/customchunk.xsl</filename> all have references that may need to be modified. For example, the file <filename>fo/customfo.xsl</filename> has the import line:
        </para>

        <programlisting>
&lt;!-- Import standard fo style-sheet --&gt;
&lt;xsl:import href=&quot;file:///c:/lib/docbook-xsl/fo/docbook.xsl&quot;/&gt;
        </programlisting>

        <para>Change this to point to <filename>/where/you/put/the/stylesheets/fo/docbook.xsl</filename>.</para>

        <para>
          Similarly, change the entry in <filename>customchunk.xsl</filename> to point to <filename>/where/you/put/the/stylesheets/xhtml/chunk.xsl</filename> and the entry in <filename>customxhtml.xsl</filename> to point to <filename>/where/you/put/the/stylesheets/xhtml/docbook.xsl</filename>. The advantage of unzipping the zip in the same location as the standard stylesheets is that the import links may be relative (the import links can always be relative assuming the stylesheets are on the same machine, but for clarity if I am using a different directory for the customisation stylesheets I will make the import references absolute).
        </para>
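
        <para>
          For instance, assuming <filename>customfo.xsl</filename> has been unzipped into the <filename>fo</filename> directory of the standard stylesheets, so that it sits beside <filename>docbook.xsl</filename>, the import could be written relatively along these lines:
        </para>

        <programlisting>
&lt;!-- Relative import: docbook.xsl is assumed to be in the same directory as customfo.xsl --&gt;
&lt;xsl:import href=&quot;docbook.xsl&quot;/&gt;
        </programlisting>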

        <para>
          I have only provided customisations for <acronym>FO</acronym> and <acronym>XHTML</acronym>. It will become apparent how to use the customisation layer in the section on using the tools later. The provided customisations are listed below:
        </para>

        <itemizedlist>
          <listitem><para><filename>fo/customfo.xsl</filename> - Use this to generate custom <acronym>FO</acronym></para></listitem>
          <listitem><para><filename>xhtml/customxhtml.xsl</filename> - Use this to generate custom <acronym>XHTML</acronym> (single file)</para></listitem>
          <listitem><para><filename>xhtml/customchunk.xsl</filename> - Use this to generate custom <acronym>XHTML</acronym> (chunked)</para></listitem>
        </itemizedlist>

        <para>
          More information about customising stylesheets can be found at <ulink url="http://www.sagehill.net/xml/docbookxsl/">http://www.sagehill.net/xml/docbookxsl/</ulink>.
        </para>
      </sect3>
    </sect2>
  </sect1>

  <sect1 id="UniDocBook-ToolUsage"><title>Using the tools to validate and transform DocBook documents</title>
    <sect2 id="UniDocBook-ToolUsage-xmllint"><title>Using <command>xmllint</command> to validate an <acronym>XML</acronym> DocBook document</title>
      <para>
        To check that a DocBook document conforms to the DocBook <acronym>DTD</acronym> one may use <command>xmllint</command>.
      </para>

      <screen><userinput><command>xmllint</command> <option>--valid</option> <option>--noout</option> <filename>in.xml</filename></userinput></screen>

      <para>
        The <option>--valid</option> option specifies that <command>xmllint</command> should validate the document against the <acronym>DTD</acronym>, and the <option>--noout</option> option suppresses the normal output of the parsed document. Hence, if the document being validated is valid, <command>xmllint</command> will exit silently. If the document is invalid, <command>xmllint</command> will output an error similar to this:
      </para>

      <screen>
        docbook.xml:1: error: Start tag expected, '&lt;' not found
        ?xml version=&quot;1.0&quot; encoding='ISO-8859-15'?&gt;
        ^
      </screen>

      <para>This indicates that a start tag was expected on line one but the opening '&lt;' was not found.</para>

      <note>
        <para>
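           One can use the <option>--dtdvalid</option> option to specify an external <acronym>DTD</acronym> to validate the file against, instead of the one named in the document; the <option>--loaddtd</option> option merely makes <command>xmllint</command> fetch the external <acronym>DTD</acronym>. Also, the <option>--nonet</option> option can be useful to suppress fetching of <acronym>DTD</acronym> files from the web if you find that your version does this by default and you don't want it to.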
        </para>
      </note>
    </sect2>

    <sect2 id="UniDocBook-ToolUsage-XMLtoXHTMLSing"><title>Using <command>xsltproc</command> to generate <acronym>XHTML</acronym> (Single file) output from an <acronym>XML</acronym> DocBook document</title>

      <screen><userinput><command>xsltproc</command> <filename>file:///path/to/docbook-xsl/xhtml/docbook.xsl</filename> <filename>in.xml</filename> &gt; <filename>out.html</filename></userinput></screen>

      <para>This will produce a single <acronym>XHTML</acronym> file according to the <acronym>XSL</acronym> stylesheet specifications.</para>
    </sect2>

    <sect2 id="UniDocBook-ToolUsage-XMLtoXHTMLSep"><title>Using <command>xsltproc</command> to generate <acronym>XHTML</acronym> (Segmented) output from an <acronym>XML</acronym> DocBook document</title>

      <screen><userinput><command>xsltproc</command> <filename>file:///path/to/docbook-xsl/xhtml/chunk.xsl</filename> <filename>in.xml</filename></userinput></screen>

      <para>
        This will produce a set of <acronym>XHTML</acronym> files in which each section of the document is on a separate <acronym>HTML</acronym> page. The layout will follow the <acronym>XSL</acronym> stylesheet specified. The separate files will be given unique names that correspond to the different sections of the document, e.g. <filename>index.html</filename>, <filename>ar01s02.html</filename> and <filename>ar01s03.html</filename>.
      </para>
    </sect2>

    <sect2 id="UniDocBook-ToolUsage-ToolUse-XMLtoFO"><title>Using <command>xsltproc</command> to generate <acronym>FO</acronym> output from an <acronym>XML</acronym> DocBook document</title>

      <screen><userinput><command>xsltproc</command> <filename>file:///path/to/docbook-xsl/fo/docbook.xsl</filename> <filename>in.xml</filename> &gt; <filename>out.fo</filename></userinput></screen>

      <para>
        This will produce output as an <acronym>XSL</acronym> <acronym>FO</acronym> (Formatting Object) file. This is an intermediate file type that can be used by other programs to generate other types of output, such as <acronym>PDF</acronym>.
      </para>
    </sect2>

    <sect2 id="UniDocBook-ToolUsage-XSLFOtoPDFFOP"><title>Using <command>FOP</command> to generate <acronym>PDF</acronym> output from <acronym>XSL</acronym> <acronym>FO</acronym> input</title>

      <para>
        In order to execute this conversion you will need to have generated <acronym>XSL</acronym> <acronym>FO</acronym> output by using <command>xsltproc</command> or some other tool capable of doing so.
      </para>

      <screen><userinput><command>fop.bat</command> <filename>in.fo</filename> <filename>out.pdf</filename></userinput></screen>

      <para>
         On Unix derivatives, substitute <filename>fop.sh</filename> for <filename>fop.bat</filename>. This will generate a <acronym>PDF</acronym> file with the name provided as the second argument. <acronym>FOP</acronym> will probably generate lots of warnings about unimplemented features whilst generating this output; this is normal and the warnings can be ignored.
      </para>
    </sect2>

    <sect2 id="UniDocBook-General"><title>General Usage</title>
      <para>
        Assume that a file called <filename>test.xml</filename> has been created in <acronym>XML</acronym> DocBook, and that the stylesheets are located in a directory called <filename>/lib/docbook-xsl/</filename>. One would create <acronym>XHTML</acronym> output like this:
      </para>

      <screen><userinput><command>xsltproc</command> <filename>file:///lib/docbook-xsl/xhtml/docbook.xsl</filename> <filename>test.xml</filename> &gt; <filename>test.html</filename></userinput></screen>

      <para>
        One would create <acronym>PDF</acronym> output in two steps, first create the <acronym>FO</acronym> output using <command>xsltproc</command>:
      </para>

      <screen><userinput><command>xsltproc</command> <filename>file:///lib/docbook-xsl/fo/docbook.xsl</filename> <filename>test.xml</filename> &gt; <filename>test.fo</filename></userinput></screen>

      <para>
        Next, process the <acronym>FO</acronym> output with <acronym>FOP</acronym> to produce the <acronym>PDF</acronym> file:
      </para>

      <screen><userinput><command>java</command> org.apache.fop.apps.Fop <filename>test.fo</filename> <filename>test.pdf</filename></userinput></screen>

      <para>
        If you want to use the custom stylesheets you simply modify the stylesheet parameter so that it points to the custom stylesheet you want to use. Assuming an install of the customisation layer mentioned above in the same location as the standard stylesheets one could generate <acronym>XHTML</acronym> output that conformed to the custom stylesheet for <acronym>XHTML</acronym> like this:
      </para>

      <screen><userinput><command>xsltproc</command> <filename>file:///lib/docbook-xsl/xhtml/customxhtml.xsl</filename> <filename>test.xml</filename> &gt; <filename>test.html</filename></userinput></screen>

      <para>Similarly, <acronym>FO</acronym> output could be produced.</para>
    </sect2>
  </sect1>

  <sect1 id="UniDocBook-Creating"><title>Creating an <acronym>XML</acronym> DocBook document</title>
    <para>
      For the ultimate reference guide see <ulink url="http://www.docbook.org/tdg/en/html/docbook.html"><citetitle>DocBook: The Definitive Guide</citetitle></ulink>. A template for a DocBook <emphasis>article</emphasis> is shown below:
    </para>

    <programlisting>
&lt;?xml version=&quot;1.0&quot; encoding='UTF-8'?&gt;
&lt;!DOCTYPE article PUBLIC &quot;-//OASIS//DTD DocBook XML V4.2//EN&quot;
  &quot;http://www.oasis-open.org/docbook/xml/4.2/docbookx.dtd&quot;&gt;
&lt;article&gt;
  &lt;articleinfo&gt;
    &lt;title&gt;Your title here&lt;/title&gt;
        
    &lt;author&gt;
      &lt;firstname&gt;Your first name&lt;/firstname&gt;
      &lt;surname&gt;Your surname&lt;/surname&gt;
      &lt;affiliation&gt;
        &lt;address&gt;&lt;email&gt;Your e-mail address&lt;/email&gt;&lt;/address&gt;
      &lt;/affiliation&gt;
    &lt;/author&gt;
    
    &lt;copyright&gt;
      &lt;year&gt;2002&lt;/year&gt;
      &lt;holder role=&quot;mailto:your e-mail address&quot;&gt;Your name&lt;/holder&gt;
    &lt;/copyright&gt;
  
    &lt;abstract&gt;
      &lt;para&gt;Include an abstract of the article's contents here.&lt;/para&gt;
    &lt;/abstract&gt;
  &lt;/articleinfo&gt;

  &lt;sect1&gt;&lt;title&gt;Section 1&lt;/title&gt;
    &lt;para&gt;
      blah blah blah
    &lt;/para&gt;
  &lt;/sect1&gt;

  &lt;sect1&gt;&lt;title&gt;Section 2&lt;/title&gt;
    &lt;para&gt;
      blah blah blah
    &lt;/para&gt;
  &lt;/sect1&gt;  
&lt;/article&gt;
    </programlisting>

    <note>
      <para>The output shown in the following examples was produced using a customisation of the stylesheets, hence output on systems not implementing the same customisations may differ.</para>
    </note>

    <sect2 id="UniDocBook-Creating-Common-Elements"><title>Common DocBook Elements</title>
      <sect3 id="UniDocBook-Creating-Common-Elements-Para"><title><emphasis>&lt;para&gt;</emphasis></title>
        <para>
          The reference page for the <emphasis>para</emphasis> element can be found here: <ulink url="http://www.docbook.org/tdg/en/html/para.html">http://www.docbook.org/tdg/en/html/para.html</ulink>. <emphasis>para</emphasis> is one of the most commonly used DocBook elements. A <emphasis>para</emphasis> can contain block elements such as <emphasis>itemizedlist</emphasis> and <emphasis>mediaobject</emphasis>, and can contain almost all inline elements. There is some debate about whether it is best to keep block elements separate from <emphasis>para</emphasis> elements. It is probably better to do so, because some processing systems have problems processing block elements within <emphasis>para</emphasis> elements. An example of a <emphasis>para</emphasis> element containing some inline elements is shown below:
        </para>

        <programlisting>
          &lt;para&gt;
            &lt;quote&gt;Behold the superfluous. They are always sick. They vomit their gall and call it a newspaper.&lt;/quote&gt;
            - Friedrich Wilhelm Nietzsche, &lt;citetitle&gt;Twilight of the Idols&lt;/citetitle&gt;
          &lt;/para&gt;
        </programlisting>

        <para>This looks like the following when rendered:</para>

        <para>
          <quote>Behold the superfluous. They are always sick. They vomit their gall and call it a newspaper.</quote>
          - Friedrich Wilhelm Nietzsche, <citetitle>Twilight of the Idols</citetitle>
        </para>
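
        <para>
          The two styles of handling block elements discussed at the start of this section can be sketched as follows; the list contents are made up for illustration. The first fragment nests an <emphasis>itemizedlist</emphasis> inside a <emphasis>para</emphasis>, while the second keeps the list as a sibling of the paragraph, which is the form suggested here:
        </para>

        <programlisting>
&lt;!-- Block element nested inside a para --&gt;
&lt;para&gt;
  The tools are:
  &lt;itemizedlist&gt;
    &lt;listitem&gt;&lt;para&gt;xmllint&lt;/para&gt;&lt;/listitem&gt;
    &lt;listitem&gt;&lt;para&gt;xsltproc&lt;/para&gt;&lt;/listitem&gt;
  &lt;/itemizedlist&gt;
&lt;/para&gt;

&lt;!-- Block element kept separate from the para --&gt;
&lt;para&gt;The tools are:&lt;/para&gt;
&lt;itemizedlist&gt;
  &lt;listitem&gt;&lt;para&gt;xmllint&lt;/para&gt;&lt;/listitem&gt;
  &lt;listitem&gt;&lt;para&gt;xsltproc&lt;/para&gt;&lt;/listitem&gt;
&lt;/itemizedlist&gt;
        </programlisting>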
      </sect3>

      <sect3 id="UniDocBook-Creating-Common-Elements-Programlisting"><title><emphasis>&lt;programlisting&gt;</emphasis></title>
        <para>
          The reference page for the <emphasis>programlisting</emphasis> element can be found here: <ulink url="http://www.docbook.org/tdg/en/html/programlisting.html">http://www.docbook.org/tdg/en/html/programlisting.html</ulink>. The <emphasis>programlisting</emphasis> element is used to display information that should be output verbatim, that is, white space is significant. An example is shown below:
        </para>

        <programlisting>
  &lt;programlisting&gt;
public class HelloWorld {
   public static void main(String args[]) {
      System.out.println(&amp;quot;Hello World!&amp;quot;);
   }
}
  &lt;/programlisting&gt;
        </programlisting>

        <para>This is output as:</para>

        <programlisting>
public class HelloWorld {
   public static void main(String args[]) {
      System.out.println(&quot;Hello World!&quot;);
   }
}
        </programlisting>

        <para>
          Notice the use of (&amp;quot;) to represent the (&quot;) character. This is known as an entity reference and is used to represent a character that has a special meaning in <acronym>XML</acronym> markup and so cannot always be written directly in the document. Text in which the parser recognises markup is known as <emphasis role="strong">PCDATA</emphasis>, standing for <emphasis>Parsed Character DATA</emphasis>, as opposed to <emphasis role="strong">CDATA</emphasis>, which the parser passes through untouched. If one wants to use lots of special characters in one place then one can wrap the text in a <emphasis>CDATA</emphasis> section like this:
        </para>

        <programlisting>
  &lt;programlisting&gt;
    &lt;![CDATA[
      One can get away with using lots of &amp;&amp;&amp; &quot;&quot;&quot; ''' &lt;&lt;&lt; &gt;&gt;&gt; 
      characters that would normally have to be marked up as entities.
    ]]&gt;
  &lt;/programlisting&gt;
        </programlisting>

        <para>Is displayed as:</para>

        <programlisting>
          One can get away with using lots of &amp;&amp;&amp; &quot;&quot;&quot; ''' &lt;&lt;&lt; &gt;&gt;&gt; 
          characters that would normally have to be marked up as entities.
        </programlisting>

        <para>For more information about the available entities see the next section.</para>
      </sect3>

      <sect3 id="UniDocBook-Creating-Common-Elements-Entities"><title>Entities for special characters</title>
        <para>
          The following entities are provided for special characters. They should be used wherever the character would otherwise be interpreted as markup, unless the text is inside a CDATA section. Using the entities is nevertheless preferred to using CDATA sections:
        </para>

        <informaltable frame="all">
          <tgroup cols="2">
            <thead>
              <row><entry>Character</entry><entry>Entity</entry></row>
            </thead>
  
            <tbody>
              <row><entry>&lt;</entry><entry>&amp;lt;</entry></row>
              <row><entry>&gt;</entry><entry>&amp;gt;</entry></row>
              <row><entry>&amp;</entry><entry>&amp;amp;</entry></row>
              <row><entry>&quot;</entry><entry>&amp;quot;</entry></row>
              <row><entry>'</entry><entry>&amp;apos;</entry></row>
            </tbody>
          </tgroup>
        </informaltable>
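
        <para>
          As a brief illustration, a sentence containing a comparison might be marked up as sketched below; the variable names are purely illustrative:
        </para>

        <programlisting>
          &lt;para&gt;The loop runs while i &amp;lt; n &amp;amp;&amp;amp; !done is true.&lt;/para&gt;
        </programlisting>

        <para>Which would be expected to display as:</para>

        <para>The loop runs while i &lt; n &amp;&amp; !done is true.</para>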
      </sect3>

      <sect3 id="UniDocBook-Creating-Common-Elements-Screen"><title><emphasis>&lt;screen&gt;</emphasis></title>
        <para>
          The reference page for the <emphasis>screen</emphasis> element can be found here: <ulink url="http://www.docbook.org/tdg/en/html/screen.html">http://www.docbook.org/tdg/en/html/screen.html</ulink>. Often one wants to illustrate the use of a program or a command line; the <emphasis>screen</emphasis> element is intended to mark up text that a user would see on a computer screen. An example is shown below:
        </para>

        <programlisting>
          &lt;screen&gt;
            &lt;userinput&gt;&lt;command&gt;java&lt;/command&gt; org.apache.fop.apps.Fop &lt;replaceable&gt;in.fo&lt;/replaceable&gt; &lt;replaceable&gt;out.pdf&lt;/replaceable&gt;&lt;/userinput&gt;
          &lt;/screen&gt;
        </programlisting>

        <para>Is displayed as:</para>

        <screen>
          <userinput><command>java</command> org.apache.fop.apps.Fop <replaceable>in.fo</replaceable> <replaceable>out.pdf</replaceable></userinput>
        </screen>
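
        <para>
          A <emphasis>screen</emphasis> can also show a prompt and the program's response alongside the typed command, using the <emphasis>prompt</emphasis> and <emphasis>computeroutput</emphasis> elements. The following is only a sketch: the prompt string and the output line are invented for illustration and are not real <command>FOP</command> output:
        </para>

        <programlisting>
&lt;screen&gt;
&lt;prompt&gt;$&lt;/prompt&gt; &lt;userinput&gt;&lt;command&gt;java&lt;/command&gt; org.apache.fop.apps.Fop &lt;replaceable&gt;in.fo&lt;/replaceable&gt; &lt;replaceable&gt;out.pdf&lt;/replaceable&gt;&lt;/userinput&gt;
&lt;computeroutput&gt;(illustrative program output)&lt;/computeroutput&gt;
&lt;/screen&gt;
        </programlisting>

        <para>
          Because white space is significant inside <emphasis>screen</emphasis>, the lines of this sketch are deliberately not indented.
        </para>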
      </sect3>

      <sect3 id="UniDocBook-Creating-Common-Elements-Ulink"><title><emphasis>&lt;ulink&gt;</emphasis></title>
        <para>
          The reference page for the <emphasis>ulink</emphasis> element can be found here: <ulink url="http://www.docbook.org/tdg/en/html/ulink.html">http://www.docbook.org/tdg/en/html/ulink.html</ulink>. <emphasis>ulink</emphasis> is the DocBook equivalent of <acronym>HTML</acronym>'s &quot;&lt;a href=&quot;...&quot;&gt;blah blah&lt;/a&gt;&quot;. An example is shown below:
        </para>

        <programlisting>
          &lt;para&gt;
            &lt;ulink url=&quot;http://www.oasis-open.org/committees/docbook/&quot;&gt;http://www.oasis-open.org/committees/docbook/&lt;/ulink&gt;
          &lt;/para&gt;
        </programlisting>

        <para>Displays as:</para>

        <para>
            <ulink url="http://www.oasis-open.org/committees/docbook/">http://www.oasis-open.org/committees/docbook/</ulink>
        </para>
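
        <para>
          The content of a <emphasis>ulink</emphasis> need not repeat the <acronym>URL</acronym>; descriptive link text can be used instead, as in this sketch (the surrounding sentence is just an example):
        </para>

        <programlisting>
          &lt;para&gt;
            See &lt;ulink url=&quot;http://www.docbook.org/tdg/en/html/docbook.html&quot;&gt;&lt;citetitle&gt;DocBook: The Definitive Guide&lt;/citetitle&gt;&lt;/ulink&gt; for details.
          &lt;/para&gt;
        </programlisting>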
      </sect3>

      <sect3 id="UniDocBook-Creating-Common-Elements-Lists"><title>Lists</title>
        <sect4 id="UniDocBook-Creating-Common-Elements-Lists-ItemizedLists"><title><emphasis>&lt;itemizedlist&gt;</emphasis></title>
          <para>
            The reference page for <emphasis>itemizedlist</emphasis> is here: <ulink url="http://www.docbook.org/tdg/en/html/itemizedlist.html">http://www.docbook.org/tdg/en/html/itemizedlist.html</ulink>. Itemized lists are standard bulleted lists and should be used where the order of the items is not significant; ordered lists should be used where the order of the items is significant. An example use of an itemized list is shown below:
          </para>
  
          <programlisting>
             &lt;itemizedlist&gt;
               &lt;listitem&gt;&lt;para&gt;Books&lt;/para&gt;
                 &lt;itemizedlist&gt;
                   &lt;listitem&gt;&lt;para&gt;Donald E. Knuth - The Art Of Computer Programming&lt;/para&gt;&lt;/listitem&gt;
                   &lt;listitem&gt;&lt;para&gt;Nils J. Nilsson - Artificial Intelligence: A New Synthesis&lt;/para&gt;&lt;/listitem&gt;
                   &lt;listitem&gt;&lt;para&gt;Pure Mathematics 2 - Geoff Mannall, Michael Kenwood&lt;/para&gt;&lt;/listitem&gt;
                 &lt;/itemizedlist&gt;
               &lt;/listitem&gt;
     
               &lt;listitem&gt;&lt;para&gt;Games&lt;/para&gt;
                 &lt;itemizedlist&gt;
                   &lt;listitem&gt;&lt;para&gt;Chess&lt;/para&gt;&lt;/listitem&gt;
                   &lt;listitem&gt;&lt;para&gt;Backgammon&lt;/para&gt;&lt;/listitem&gt;
                   &lt;listitem&gt;&lt;para&gt;Noughts And Crosses&lt;/para&gt;&lt;/listitem&gt;
                 &lt;/itemizedlist&gt;
               &lt;/listitem&gt;
             &lt;/itemizedlist&gt;
          </programlisting>
  
          <para>Which looks like this:</para>
  
          <itemizedlist>
            <listitem><para>Books</para>
              <itemizedlist>
                <listitem><para>Donald E. Knuth - The Art Of Computer Programming</para></listitem>
                <listitem><para>Nils J. Nilsson - Artificial Intelligence: A New Synthesis</para></listitem>
                <listitem><para>Pure Mathematics 2 - Geoff Mannall, Michael Kenwood</para></listitem>
              </itemizedlist>
            </listitem>
  
            <listitem><para>Games</para>
              <itemizedlist>
                <listitem><para>Chess</para></listitem>
                <listitem><para>Backgammon</para></listitem>
                <listitem><para>Noughts And Crosses</para></listitem>
              </itemizedlist>
            </listitem>
          </itemizedlist>
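
          <para>
            An <emphasis>itemizedlist</emphasis> may also carry a <emphasis>title</emphasis>, and its <emphasis>spacing</emphasis> attribute can be set to <emphasis role="strong">compact</emphasis> to request tighter vertical spacing. The sketch below reuses items from the example above; how, or whether, the spacing hint is honoured depends on the stylesheets used:
          </para>

          <programlisting>
             &lt;itemizedlist spacing=&quot;compact&quot;&gt;
               &lt;title&gt;Games&lt;/title&gt;
               &lt;listitem&gt;&lt;para&gt;Chess&lt;/para&gt;&lt;/listitem&gt;
               &lt;listitem&gt;&lt;para&gt;Backgammon&lt;/para&gt;&lt;/listitem&gt;
             &lt;/itemizedlist&gt;
          </programlisting>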
        </sect4>
  
        <sect4 id="UniDocBook-Creating-Common-Elements-Lists-OrderedLists"><title><emphasis>&lt;orderedlist&gt;</emphasis></title>
          <para>
            The reference page for <emphasis>orderedlist</emphasis> is here: <ulink url="http://www.docbook.org/tdg/en/html/orderedlist.html">http://www.docbook.org/tdg/en/html/orderedlist.html</ulink>. Ordered lists are used to specify a sequence of steps of which the order of evaluation is significant. The general form of an ordered list is like this:
          </para>
  
          <programlisting>
            &lt;orderedlist&gt;
              &lt;listitem&gt;&lt;para&gt;Action A&lt;/para&gt;&lt;/listitem&gt;
              &lt;listitem&gt;&lt;para&gt;Action B&lt;/para&gt;&lt;/listitem&gt;
            &lt;/orderedlist&gt;
          </programlisting>
  
          <para>Which would look like this:</para>
  
          <orderedlist>
            <listitem><para>Action A</para></listitem>
            <listitem><para>Action B</para></listitem>
          </orderedlist>
  
          <para>
            One may also specify the type of enumeration that the list will display. There are five types: arabic, loweralpha, lowerroman, upperalpha and upperroman. The type of enumeration is specified via the <emphasis role="strong">numeration</emphasis> attribute like this:
          </para>
  
          <programlisting>
            &lt;orderedlist numeration=&quot;arabic&quot;&gt;
              &lt;listitem&gt;...&lt;/listitem&gt;
                               .
                               .
                               .
            &lt;/orderedlist&gt;
          </programlisting>
  
          <para>The types of enumeration are shown below:</para>
  
          <para>Arabic:</para>
          <orderedlist numeration="arabic">
            <listitem><para>arabic</para></listitem>
            <listitem><para>arabic</para></listitem>
            <listitem><para>arabic</para></listitem>
          </orderedlist>
  
          <para>Loweralpha:</para>
          <orderedlist numeration="loweralpha">
            <listitem><para>loweralpha</para></listitem>
            <listitem><para>loweralpha</para></listitem>
            <listitem><para>loweralpha</para></listitem>
          </orderedlist>
  
          <para>Lowerroman:</para>
          <orderedlist numeration="lowerroman">
            <listitem><para>lowerroman</para></listitem>
            <listitem><para>lowerroman</para></listitem>
            <listitem><para>lowerroman</para></listitem>
          </orderedlist>
  
          <para>Upperalpha:</para>
          <orderedlist numeration="upperalpha">
            <listitem><para>upperalpha</para></listitem>
            <listitem><para>upperalpha</para></listitem>
            <listitem><para>upperalpha</para></listitem>
          </orderedlist>
  
          <para>Upperroman:</para>
          <orderedlist numeration="upperroman">
            <listitem><para>upperroman</para></listitem>
            <listitem><para>upperroman</para></listitem>
            <listitem><para>upperroman</para></listitem>
          </orderedlist>
  
          <para>These can be combined to make nested enumeration clearer:</para>
  
          <programlisting>
             &lt;orderedlist numeration=&quot;upperroman&quot;&gt;
               &lt;listitem&gt;
                 &lt;para&gt;Preparation&lt;/para&gt;
                 &lt;orderedlist numeration=&quot;upperalpha&quot;&gt;
                   &lt;listitem&gt;&lt;para&gt;Chop tomatoes&lt;/para&gt;&lt;/listitem&gt;
                   &lt;listitem&gt;&lt;para&gt;Peel onions&lt;/para&gt;&lt;/listitem&gt;
                   &lt;listitem&gt;&lt;para&gt;Mash potatoes&lt;/para&gt;&lt;/listitem&gt;
                 &lt;/orderedlist&gt;
               &lt;/listitem&gt;
               &lt;listitem&gt;
                 &lt;para&gt;Cooking&lt;/para&gt;
                 &lt;orderedlist numeration=&quot;upperalpha&quot;&gt;
                   &lt;listitem&gt;&lt;para&gt;Boil water&lt;/para&gt;&lt;/listitem&gt;
                   &lt;listitem&gt;&lt;para&gt;Put tomatoes and onions in&lt;/para&gt;&lt;/listitem&gt;
                   &lt;listitem&gt;&lt;para&gt;Blanch for 5 minutes&lt;/para&gt;&lt;/listitem&gt;
                 &lt;/orderedlist&gt;
               &lt;/listitem&gt;
               &lt;listitem&gt;
                 &lt;para&gt;Cleanup&lt;/para&gt;
                 &lt;orderedlist numeration=&quot;upperalpha&quot;&gt;
                   &lt;listitem&gt;&lt;para&gt;Throw away scraps&lt;/para&gt;&lt;/listitem&gt;
                   &lt;listitem&gt;&lt;para&gt;Clean side&lt;/para&gt;&lt;/listitem&gt;
                   &lt;listitem&gt;&lt;para&gt;Wash hands&lt;/para&gt;&lt;/listitem&gt;
                 &lt;/orderedlist&gt;
               &lt;/listitem&gt;
             &lt;/orderedlist&gt;
          </programlisting>
  
          <para>Which looks like this:</para>
  
          <orderedlist numeration="upperroman">
            <listitem>
              <para>Preparation</para>
              <orderedlist numeration="upperalpha">
                <listitem><para>Chop tomatoes</para></listitem>
                <listitem><para>Peel onions</para></listitem>
                <listitem><para>Mash potatoes</para></listitem>
              </orderedlist>
            </listitem>
            <listitem>
              <para>Cooking</para>
              <orderedlist numeration="upperalpha">
                <listitem><para>Boil water</para></listitem>
                <listitem><para>Put tomatoes and onions in</para></listitem>
                <listitem><para>Blanch for 5 minutes</para></listitem>
              </orderedlist>
            </listitem>
            <listitem>
              <para>Cleanup</para>
              <orderedlist numeration="upperalpha">
                <listitem><para>Throw away scraps</para></listitem>
                <listitem><para>Clean side</para></listitem>
                <listitem><para>Wash hands</para></listitem>
              </orderedlist>
            </listitem>
          </orderedlist>
  
          <para>One may also make the enumeration of a nested list carry on from the preceding list, rather than restart, by setting the <emphasis role="strong">continuation</emphasis> attribute to <emphasis role="strong">continues</emphasis>:</para>
  
          <programlisting>
             &lt;orderedlist&gt;
               &lt;listitem&gt;
                 &lt;para&gt;Do this&lt;/para&gt;
                 &lt;orderedlist numeration=&quot;arabic&quot;&gt;
                   &lt;listitem&gt;&lt;para&gt;And this&lt;/para&gt;&lt;/listitem&gt;
                   &lt;listitem&gt;&lt;para&gt;And this&lt;/para&gt;&lt;/listitem&gt;
                   &lt;listitem&gt;&lt;para&gt;And this&lt;/para&gt;&lt;/listitem&gt;
                 &lt;/orderedlist&gt;
               &lt;/listitem&gt;
               &lt;listitem&gt;
                 &lt;para&gt;And this&lt;/para&gt;
                 &lt;orderedlist numeration=&quot;arabic&quot; continuation=&quot;continues&quot;&gt;
                   &lt;listitem&gt;&lt;para&gt;And this&lt;/para&gt;&lt;/listitem&gt;
                   &lt;listitem&gt;&lt;para&gt;And this&lt;/para&gt;&lt;/listitem&gt;
                   &lt;listitem&gt;&lt;para&gt;And this&lt;/para&gt;&lt;/listitem&gt;
                 &lt;/orderedlist&gt;
               &lt;/listitem&gt;
             &lt;/orderedlist&gt;
          </programlisting>
  
          <para>Which looks like this:</para>
  
          <orderedlist>
            <listitem>
              <para>Do this</para>
              <orderedlist numeration="arabic">
                <listitem><para>And this</para></listitem>
                <listitem><para>And this</para></listitem>
                <listitem><para>And this</para></listitem>
              </orderedlist>
            </listitem>
            <listitem>
              <para>And this</para>
              <orderedlist numeration="arabic" continuation="continues">
                <listitem><para>And this</para></listitem>
                <listitem><para>And this</para></listitem>
                <listitem><para>And this</para></listitem>
              </orderedlist>
            </listitem>
          </orderedlist>
  
          <note><para>Some stylesheets may define that nested lists are of a different numeration by default.</para></note>
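
          <para>
            A nested list can also be asked to include the number of the enclosing item in its own numbering (for example 1.1, 1.2) by setting the <emphasis role="strong">inheritnum</emphasis> attribute of the nested list to <emphasis role="strong">inherit</emphasis>. The following is a sketch only; not all stylesheets should be expected to honour this attribute, so the rendered result may not differ from an ordinary nested list:
          </para>

          <programlisting>
             &lt;orderedlist numeration=&quot;arabic&quot;&gt;
               &lt;listitem&gt;
                 &lt;para&gt;Preparation&lt;/para&gt;
                 &lt;orderedlist numeration=&quot;arabic&quot; inheritnum=&quot;inherit&quot;&gt;
                   &lt;listitem&gt;&lt;para&gt;Chop tomatoes&lt;/para&gt;&lt;/listitem&gt;
                   &lt;listitem&gt;&lt;para&gt;Peel onions&lt;/para&gt;&lt;/listitem&gt;
                 &lt;/orderedlist&gt;
               &lt;/listitem&gt;
             &lt;/orderedlist&gt;
          </programlisting>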
        </sect4>
      </sect3>

      <sect3 id="UniDocBook-Creating-Common-Elements-Inline"><title>Some common inline elements</title>
        <para>Some common inline elements and their output are shown below:</para>

        <informaltable frame="all">
          <tgroup cols="2">
            <thead>
              <row><entry>Example</entry><entry>Displays as</entry></row>
            </thead>
  
            <tbody>
              <row><entry>&lt;emphasis&gt;Emphasised Text&lt;/emphasis&gt;</entry><entry><emphasis>Emphasised Text</emphasis></entry></row>
              <row><entry>&lt;emphasis role=&quot;strong&quot;&gt;A different type of emphasis&lt;/emphasis&gt;</entry><entry><emphasis role="strong">A different type of emphasis</emphasis></entry></row>
              <row><entry>&lt;filename&gt;blahblah.txt&lt;/filename&gt;</entry><entry><filename>blahblah.txt</filename></entry></row>
              <row><entry>&lt;acronym&gt;XML&lt;/acronym&gt;</entry><entry><acronym>XML</acronym></entry></row>
              <row><entry>&lt;quote&gt;blahblahblah&lt;/quote&gt;</entry><entry><quote>blahblahblah</quote></entry></row>
            </tbody>
          </tgroup>
        </informaltable>
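
        <para>
          These inline elements are normally mixed freely within running text. A minimal sketch of a paragraph combining several of them follows; the file name <filename>article.xml</filename> is hypothetical:
        </para>

        <programlisting>
          &lt;para&gt;
            Save the &lt;acronym&gt;XML&lt;/acronym&gt; source as &lt;filename&gt;article.xml&lt;/filename&gt; and &lt;emphasis&gt;validate&lt;/emphasis&gt; it before processing.
          &lt;/para&gt;
        </programlisting>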
      </sect3>
    </sect2>

    <sect2 id="UniDocBook-Creating-Images"><title>Including Images</title>
      <para>Images are included in DocBook documents as illustrated below:</para>
  
      <programlisting>
        &lt;figure&gt;&lt;title&gt;blah&lt;/title&gt;
          &lt;mediaobject&gt;
            &lt;imageobject&gt;&lt;imagedata fileref=&quot;blah.jpg&quot; format=&quot;JPEG&quot;/&gt;&lt;/imageobject&gt;
            &lt;textobject&gt;&lt;phrase&gt;Image description&lt;/phrase&gt;&lt;/textobject&gt;
          &lt;/mediaobject&gt;
        &lt;/figure&gt;
      </programlisting>
  
      <para>
        The overall encapsulating element is <emphasis>figure</emphasis>, the reference page for which can be found at <ulink url="http://www.docbook.org/tdg/en/html/figure.html">http://www.docbook.org/tdg/en/html/figure.html</ulink>. The <emphasis>figure</emphasis> contains a <emphasis>mediaobject</emphasis> element, which can also occur on its own and may contain <emphasis>audioobject</emphasis>, <emphasis>caption</emphasis>, <emphasis>imageobject</emphasis>, <emphasis>objectinfo</emphasis>, <emphasis>textobject</emphasis> and <emphasis>videoobject</emphasis> elements. The reference page for <emphasis>mediaobject</emphasis> is at <ulink url="http://www.docbook.org/tdg/en/html/mediaobject.html">http://www.docbook.org/tdg/en/html/mediaobject.html</ulink>.
      </para>
  
      <para>
        <emphasis>imageobject</emphasis> is the part of a <emphasis>mediaobject</emphasis> used to include an image; its reference page can be found at <ulink url="http://www.docbook.org/tdg/en/html/imageobject.html">http://www.docbook.org/tdg/en/html/imageobject.html</ulink>. The item within the <emphasis>imageobject</emphasis> that handles the image is <emphasis>imagedata</emphasis>; its reference page is at <ulink url="http://www.docbook.org/tdg/en/html/imagedata.html">http://www.docbook.org/tdg/en/html/imagedata.html</ulink>.
      </para>
  
      <para>
        The idea behind <emphasis>mediaobject</emphasis> is to provide a way to include media in many formats. It becomes the document processor's job to decide which of the formats specified in the <emphasis>mediaobject</emphasis> to use in the particular output medium chosen. For example, the <emphasis>mediaobject</emphasis> element may contain a <acronym>PNG</acronym> format <emphasis>imageobject</emphasis> for <acronym>HTML</acronym> output and a <acronym>TIFF</acronym> format <emphasis>imageobject</emphasis> for print output. There may also be a <emphasis>textobject</emphasis> providing a description of the image for an output format that cannot display images; perhaps the document will be output in an audio format for people with sight problems.
      </para>
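
      <para>
        A sketch of such a multi-format <emphasis>mediaobject</emphasis> is given below. The file names <filename>diagram.png</filename> and <filename>diagram.tif</filename> and the description are hypothetical, and which <emphasis>imageobject</emphasis> is actually chosen for a given output depends on the processing tools and stylesheets in use:
      </para>

      <programlisting>
        &lt;mediaobject&gt;
          &lt;imageobject&gt;&lt;imagedata fileref=&quot;diagram.png&quot; format=&quot;PNG&quot;/&gt;&lt;/imageobject&gt;
          &lt;imageobject&gt;&lt;imagedata fileref=&quot;diagram.tif&quot; format=&quot;TIFF&quot;/&gt;&lt;/imageobject&gt;
          &lt;textobject&gt;&lt;phrase&gt;A diagram of the processing chain&lt;/phrase&gt;&lt;/textobject&gt;
        &lt;/mediaobject&gt;
      </programlisting>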

      <para>
        One does not have to encapsulate the <emphasis>mediaobject</emphasis> in a <emphasis>figure</emphasis> object but doing so allows one to provide a title and be able to have the <emphasis>figure</emphasis> listed in a list of figures at the beginning of the document.
      </para>
  
      <para><emphasis>imagedata</emphasis> may be of the following formats:</para>
  
      <mediaobject>
        <imageobject><imagedata fileref="files/images/imageformats.png" format="PNG"/></imageobject>
        <textobject><phrase><emphasis>imageobject</emphasis> image formats</phrase></textobject>
      </mediaobject>
  
      <para>
        The attribute <emphasis>format</emphasis> is thus required along with either <emphasis>fileref</emphasis> or <emphasis>entityref</emphasis> to reference the image:
      </para>
  
      <programlisting>
        &lt;mediaobject&gt;
          &lt;imageobject&gt;&lt;imagedata fileref=&quot;frog.png&quot; format=&quot;PNG&quot;/&gt;&lt;/imageobject&gt;
          &lt;textobject&gt;&lt;phrase&gt;A frog&lt;/phrase&gt;&lt;/textobject&gt;
        &lt;/mediaobject&gt;
      </programlisting>

      <para>Is displayed as:</para>
  
      <mediaobject>
        <imageobject><imagedata fileref="files/images/frog.png" format="PNG"/></imageobject>
        <textobject><phrase>A frog</phrase></textobject>
      </mediaobject>
  
      <para>
        One could use stylesheets such that, in <acronym>HTML</acronym> rendered output, the <emphasis>phrase</emphasis> used in the <emphasis>textobject</emphasis> would become the alternative text of the image. One can also supply multiple <emphasis>imageobject</emphasis>s, one for each of the desired output formats; for instance, one may include an <emphasis>eps</emphasis> version of the image so that output can be generated with a processing chain that requires the image to be in this form.
      </para>
  
      <para>
        The <emphasis>imagedata</emphasis> element has the useful attributes <emphasis>align</emphasis> and <emphasis>valign</emphasis>. <emphasis>align</emphasis> specifies how the image should be aligned horizontally and can be set to the values <emphasis role="strong">center</emphasis>, <emphasis role="strong">left</emphasis> and <emphasis role="strong">right</emphasis>. <emphasis>valign</emphasis> specifies how the image should be aligned vertically and can be set to the values <emphasis role="strong">bottom</emphasis>, <emphasis role="strong">middle</emphasis> and <emphasis role="strong">top</emphasis>.
      </para>
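
      <para>
        For instance, the frog image used earlier might be centred horizontally and vertically as sketched below; whether the alignment is visible in the result depends on the output format and the stylesheets:
      </para>

      <programlisting>
        &lt;mediaobject&gt;
          &lt;imageobject&gt;&lt;imagedata fileref=&quot;frog.png&quot; format=&quot;PNG&quot; align=&quot;center&quot; valign=&quot;middle&quot;/&gt;&lt;/imageobject&gt;
          &lt;textobject&gt;&lt;phrase&gt;A frog&lt;/phrase&gt;&lt;/textobject&gt;
        &lt;/mediaobject&gt;
      </programlisting>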
    </sect2>

    <sect2 id="UniDocBook-Creating-Tables"><title>Tables</title>
      <para>
        There are two elements used for placing tables inside a DocBook document: <emphasis>table</emphasis> and <emphasis>informaltable</emphasis>. The only difference between them is that the former requires a <emphasis>title</emphasis> and the latter does not.
      </para>
  
      <programlisting>
        &lt;table&gt;&lt;title&gt;title&lt;/title&gt;         &lt;informaltable&gt;
          .                                        .
          .                          or            .
          .                                        .
        &lt;/table&gt;                            &lt;/informaltable&gt;
      </programlisting>
  
      <para>
        The <emphasis>table</emphasis> element has an attribute called <emphasis>frame</emphasis> which specifies how the table should be framed:
      </para>
  
      <programlisting>
        &lt;table frame=&quot;frametype&quot;&gt;&lt;title&gt;frame=&quot;frametype&quot;&lt;/title&gt;
          &lt;tgroup cols=&quot;3&quot;&gt;
            &lt;thead&gt;
              &lt;row&gt;&lt;entry&gt;a1&lt;/entry&gt;&lt;entry&gt;b1&lt;/entry&gt;&lt;entry&gt;c1&lt;/entry&gt;&lt;/row&gt;
            &lt;/thead&gt;
            &lt;tbody&gt;
              &lt;row&gt;&lt;entry&gt;a2&lt;/entry&gt;&lt;entry&gt;b2&lt;/entry&gt;&lt;entry&gt;c2&lt;/entry&gt;&lt;/row&gt;
              &lt;row&gt;&lt;entry&gt;a3&lt;/entry&gt;&lt;entry&gt;b3&lt;/entry&gt;&lt;entry&gt;c3&lt;/entry&gt;&lt;/row&gt;
            &lt;/tbody&gt;
          &lt;/tgroup&gt;
        &lt;/table&gt;
      </programlisting>
  
      <para>
        Where <emphasis>frametype</emphasis> is replaced with one of <emphasis>all</emphasis>, <emphasis>bottom</emphasis>, <emphasis>none</emphasis>, <emphasis>sides</emphasis>, <emphasis>top</emphasis> or <emphasis>topbot</emphasis>:
      </para>
  
      <figure><title>Table frame types</title>
        <mediaobject>
          <imageobject><imagedata fileref="files/images/tableframetypes.png" format="PNG"/></imageobject>
        </mediaobject>
      </figure>
  
      <para>
        The output above is <acronym>PDF</acronym>. With <acronym>HTML</acronym>, all the tables look the same as the one with the value <emphasis>all</emphasis>, apart from the one with the value <emphasis>none</emphasis>, which has no frame at all. The attributes <emphasis>colsep</emphasis> and <emphasis>rowsep</emphasis> are used to control whether lines should be drawn between columns and rows respectively:
      </para>
  
      <programlisting>
  &lt;table colsep=&quot;0&quot; rowsep=&quot;0&quot;&gt; ... &lt;/table&gt;
  &lt;table colsep=&quot;0&quot; rowsep=&quot;1&quot;&gt; ... &lt;/table&gt;
  &lt;table colsep=&quot;1&quot; rowsep=&quot;0&quot;&gt; ... &lt;/table&gt;
  &lt;table colsep=&quot;1&quot; rowsep=&quot;1&quot;&gt; ... &lt;/table&gt;
      </programlisting>
  
      <para>
        Unfortunately, at the time of writing, the tools used to convert <acronym>FO</acronym> to <acronym>PDF</acronym> either did not yet implement this feature or were in a broken state with regard to it, so no pictorial examples can be provided. Other <emphasis>table</emphasis> attributes are discussed at <ulink url="http://www.docbook.org/tdg/en/html/table.html">http://www.docbook.org/tdg/en/html/table.html</ulink>.
      </para>
  
      <para>The generic layout for a table is as follows:</para>
  
      <programlisting>
        &lt;table&gt;&lt;title&gt;title&lt;/title&gt; 
          &lt;tgroup cols=&quot;3&quot;&gt;
            &lt;thead&gt;
              &lt;row&gt;&lt;entry&gt;blah&lt;/entry&gt;&lt;/row&gt;
            &lt;/thead&gt;
  
            &lt;tfoot&gt;
              &lt;row&gt;&lt;entry&gt;blah&lt;/entry&gt;&lt;/row&gt;
            &lt;/tfoot&gt;
  
            &lt;tbody&gt;
              &lt;row&gt;&lt;entry&gt;blah&lt;/entry&gt;&lt;/row&gt;
            &lt;/tbody&gt;
          &lt;/tgroup&gt;
        &lt;/table&gt;
      </programlisting>
  
      <para>
        <emphasis>tgroup</emphasis> contains the rest of the table and must contain a <emphasis>tbody</emphasis> element, which holds the data in the body of the table. The data could instead be placed in <emphasis>thead</emphasis> or <emphasis>tfoot</emphasis>, but this is not the intention. The reason for the <emphasis>thead</emphasis> and <emphasis>tfoot</emphasis> elements is so that different layouts can be applied by the stylesheets for the header and the footer of the table respectively, so usually the first row would be wrapped in a <emphasis>thead</emphasis> element. Note that in the markup <emphasis>tfoot</emphasis>, when present, comes after <emphasis>thead</emphasis> and before <emphasis>tbody</emphasis>, even though the footer is rendered at the bottom of the table. <emphasis>tgroup</emphasis> has the mandatory attribute <emphasis>cols</emphasis> which specifies the number of columns the table has.
      </para>
  
      <para>
        <emphasis>tgroup</emphasis> may also specify alignment of content via the <emphasis>align</emphasis> attribute, where <emphasis>alignment</emphasis> is either <emphasis>left</emphasis>, <emphasis>center</emphasis> or <emphasis>right</emphasis>:
      </para>
  
      <programlisting>
      &lt;table frame=&quot;all&quot;&gt;&lt;title&gt;align=&quot;alignment&quot;&lt;/title&gt;
        &lt;tgroup cols=&quot;3&quot; align=&quot;alignment&quot;&gt;
          &lt;tbody&gt;
            &lt;row&gt;&lt;entry&gt;a2&lt;/entry&gt;&lt;entry&gt;b2&lt;/entry&gt;&lt;entry&gt;c2&lt;/entry&gt;&lt;/row&gt;
          &lt;/tbody&gt;
        &lt;/tgroup&gt;
      &lt;/table&gt;
      </programlisting>
  
      <figure><title>Table alignment types</title>
        <mediaobject>
          <imageobject><imagedata fileref="files/images/tablealign.png" format="PNG"/></imageobject>
        </mediaobject>
      </figure>
  
      <para>
        For more information about the <emphasis>tgroup</emphasis> element see <ulink url="http://www.docbook.org/tdg/en/html/tgroup.html">http://www.docbook.org/tdg/en/html/tgroup.html</ulink>.
      </para>
  
      <para>
        A <emphasis>row</emphasis> consists of a number of <emphasis>entry</emphasis> elements, entered in the sequence in which they should appear in the table row. For more information about the <emphasis>row</emphasis> element see <ulink url="http://www.docbook.org/tdg/en/html/row.html">http://www.docbook.org/tdg/en/html/row.html</ulink>.
      </para>
        
      <para>
        The <emphasis>entry</emphasis> element has some interesting attributes which allow an entry to span more than one column or row: <emphasis>namest</emphasis> and <emphasis>nameend</emphasis> for columns, and <emphasis>morerows</emphasis> for rows. The <emphasis>morerows</emphasis> attribute specifies how many more rows the entry it is applied to should span:
      </para>
  
      <programlisting>
      &lt;table frame=&quot;all&quot;&gt;&lt;title&gt;&lt;emphasis&gt;morerows&lt;/emphasis&gt; example&lt;/title&gt;
        &lt;tgroup cols=&quot;3&quot;&gt;
          &lt;tbody&gt;
            &lt;row&gt;&lt;entry morerows=&quot;2&quot;&gt;a1&lt;/entry&gt;&lt;entry&gt;b1&lt;/entry&gt;&lt;entry&gt;c1&lt;/entry&gt;&lt;/row&gt;
            &lt;row&gt;&lt;entry&gt;b2&lt;/entry&gt;&lt;entry&gt;c2&lt;/entry&gt;&lt;/row&gt;
            &lt;row&gt;&lt;entry&gt;b3&lt;/entry&gt;&lt;entry&gt;c3&lt;/entry&gt;&lt;/row&gt;
          &lt;/tbody&gt;
        &lt;/tgroup&gt;
      &lt;/table&gt;
      </programlisting>
  
      <figure><title><emphasis>morerows</emphasis> example</title>
        <mediaobject>
          <imageobject><imagedata fileref="files/images/morerowsexample.png" format="PNG"/></imageobject>
        </mediaobject>
      </figure>
  
      <para>
        Unfortunately there is no <emphasis>morecolumns</emphasis> attribute; instead one has to use <emphasis>namest</emphasis> to specify the starting column of the <emphasis>entry</emphasis> and <emphasis>nameend</emphasis> to specify the ending column of the <emphasis>entry</emphasis>. The values given to these attributes are column names. Columns are named using the <emphasis>colspec</emphasis> element; <emphasis>colspec</emphasis> elements are inserted inside <emphasis>tgroup</emphasis> but before <emphasis>thead</emphasis>, <emphasis>tfoot</emphasis> and <emphasis>tbody</emphasis>:
      </para>
  
      <programlisting>
      &lt;table frame=&quot;all&quot;&gt;&lt;title&gt;column spanning&lt;/title&gt;
        &lt;tgroup cols=&quot;3&quot;&gt;
          &lt;colspec colname=&quot;col1&quot;/&gt;
          &lt;colspec colname=&quot;col2&quot;/&gt;
          &lt;colspec colname=&quot;col3&quot;/&gt;
          &lt;tbody&gt;
            &lt;row&gt;&lt;entry namest=&quot;col1&quot; nameend=&quot;col3&quot;&gt;a1&lt;/entry&gt;&lt;/row&gt;
            &lt;row&gt;&lt;entry&gt;a2&lt;/entry&gt;&lt;entry&gt;b1&lt;/entry&gt;&lt;entry&gt;c1&lt;/entry&gt;&lt;/row&gt;
            &lt;row&gt;&lt;entry&gt;a3&lt;/entry&gt;&lt;entry&gt;b2&lt;/entry&gt;&lt;entry&gt;c2&lt;/entry&gt;&lt;/row&gt;
          &lt;/tbody&gt;
        &lt;/tgroup&gt;
      &lt;/table&gt;
      </programlisting>
  
      <figure><title>Column spanning example</title>
        <mediaobject>
          <imageobject><imagedata fileref="files/images/columnspanexample.png" format="PNG"/></imageobject>
        </mediaobject>
      </figure>
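
      <para>
        Row and column spanning can be combined in one table. In the following sketch, which reuses the column names from above with placeholder cell contents, <emphasis>a1</emphasis> spans two rows while <emphasis>b1</emphasis> spans the second and third columns; as with the other table features, how faithfully this is rendered depends on the processing tools:
      </para>

      <programlisting>
      &lt;informaltable frame=&quot;all&quot;&gt;
        &lt;tgroup cols=&quot;3&quot;&gt;
          &lt;colspec colname=&quot;col1&quot;/&gt;
          &lt;colspec colname=&quot;col2&quot;/&gt;
          &lt;colspec colname=&quot;col3&quot;/&gt;
          &lt;tbody&gt;
            &lt;row&gt;&lt;entry morerows=&quot;1&quot;&gt;a1&lt;/entry&gt;&lt;entry namest=&quot;col2&quot; nameend=&quot;col3&quot;&gt;b1&lt;/entry&gt;&lt;/row&gt;
            &lt;row&gt;&lt;entry&gt;b2&lt;/entry&gt;&lt;entry&gt;c2&lt;/entry&gt;&lt;/row&gt;
          &lt;/tbody&gt;
        &lt;/tgroup&gt;
      &lt;/informaltable&gt;
      </programlisting>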
  
      <para>
        More information about the <emphasis>entry</emphasis> element can be found at <ulink url="http://www.docbook.org/tdg/en/html/entry.html">http://www.docbook.org/tdg/en/html/entry.html</ulink>. Tables may be nested to a level of one, see <ulink url="http://www.docbook.org/tdg/en/html/entrytbl.html">http://www.docbook.org/tdg/en/html/entrytbl.html</ulink>. For the entire source and output pertaining to the examples discussed in this section see <ulink url="files/tableexampleshome.html">Table Examples</ulink>.
      </para>
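
      <para>
        As a rough sketch of such nesting, an <emphasis>entrytbl</emphasis> takes the place of an <emphasis>entry</emphasis> within a <emphasis>row</emphasis> and carries its own <emphasis>cols</emphasis> attribute and <emphasis>tbody</emphasis>. The cell contents below are placeholders, and support for <emphasis>entrytbl</emphasis> may vary between processing tools:
      </para>

      <programlisting>
      &lt;informaltable frame=&quot;all&quot;&gt;
        &lt;tgroup cols=&quot;2&quot;&gt;
          &lt;tbody&gt;
            &lt;row&gt;
              &lt;entry&gt;a1&lt;/entry&gt;
              &lt;entrytbl cols=&quot;2&quot;&gt;
                &lt;tbody&gt;
                  &lt;row&gt;&lt;entry&gt;x1&lt;/entry&gt;&lt;entry&gt;y1&lt;/entry&gt;&lt;/row&gt;
                  &lt;row&gt;&lt;entry&gt;x2&lt;/entry&gt;&lt;entry&gt;y2&lt;/entry&gt;&lt;/row&gt;
                &lt;/tbody&gt;
              &lt;/entrytbl&gt;
            &lt;/row&gt;
          &lt;/tbody&gt;
        &lt;/tgroup&gt;
      &lt;/informaltable&gt;
      </programlisting>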
    </sect2>
  </sect1>

  <sect1 id="UniDocBook-References"><title>References (And links you may find useful)</title>
    <itemizedlist>
      <listitem>
        <para>DocBook: The Definitive Guide</para>
        <para><ulink url="http://www.docbook.org/tdg/en/html/docbook.html">http://www.docbook.org/tdg/en/html/docbook.html</ulink></para>
      </listitem>

      <listitem>
        <para>Using the DocBook XSL Stylesheets</para>
        <para><ulink url="http://www.sagehill.net/xml/docbookxsl/">http://www.sagehill.net/xml/docbookxsl/</ulink></para>
      </listitem>

      <listitem>
        <para>
          <ulink url="http://supportweb.cs.bham.ac.uk/documentation/tutorials/docsystem/build/tutorials/docbooksys/docbooksyshome.html">Setting Up A Free XML/SGML DocBook Editing Suite For Windows And Unix</ulink>
        </para>
      </listitem>
    </itemizedlist>
  </sect1>
</article>

