[exim-cvs] cvs commit: exim/exim-doc/doc-docbook AdMarkup.tx…

Author: Philip Hazel
Date:  
To: exim-cvs
Subject: [exim-cvs] cvs commit: exim/exim-doc/doc-docbook AdMarkup.txt HowItWorks.txt Makefile Markup.txt MyAsciidoc.conf MyStyle-fo.xsl MyStyle-spec-fo.xsl MyStyle.xsl MyTitleStyle.xsl MyTitlepage.template
ph10 2006/02/01 11:01:02 GMT

  Modified files:
    exim-doc/doc-docbook HowItWorks.txt Makefile MyStyle-fo.xsl 
                         MyStyle-spec-fo.xsl MyStyle.xsl 
                         MyTitlepage.templates.xml Pre-xml 
                         TidyHTML-filter TidyHTML-spec Tidytxt 
                         x2man 
  Added files:
    exim-doc/doc-docbook Markup.txt MyTitleStyle.xsl PageLabelPDF 
                         filter.xfpt spec.xfpt 
  Removed files:
    exim-doc/doc-docbook AdMarkup.txt MyAsciidoc.conf filter.ascd 
                         spec.ascd 
  Log:
  Remove Asciidoc versions of the documentation and building apparatus;
  replace with xfpt versions.


  Revision  Changes    Path
  1.3       +0 -438    exim/exim-doc/doc-docbook/AdMarkup.txt (dead)
  1.3       +155 -118  exim/exim-doc/doc-docbook/HowItWorks.txt
  1.7       +54 -33    exim/exim-doc/doc-docbook/Makefile
  1.1       +354 -0    exim/exim-doc/doc-docbook/Markup.txt (new)
  1.3       +0 -212    exim/exim-doc/doc-docbook/MyAsciidoc.conf (dead)
  1.3       +61 -31    exim/exim-doc/doc-docbook/MyStyle-fo.xsl
  1.3       +14 -1     exim/exim-doc/doc-docbook/MyStyle-spec-fo.xsl
  1.3       +14 -8     exim/exim-doc/doc-docbook/MyStyle.xsl
  1.1       +223 -0    exim/exim-doc/doc-docbook/MyTitleStyle.xsl (new)
  1.2       +8 -5      exim/exim-doc/doc-docbook/MyTitlepage.templates.xml
  1.1       +61 -0     exim/exim-doc/doc-docbook/PageLabelPDF (new)
  1.3       +37 -30    exim/exim-doc/doc-docbook/Pre-xml
  1.3       +28 -21    exim/exim-doc/doc-docbook/TidyHTML-filter
  1.3       +1 -1      exim/exim-doc/doc-docbook/TidyHTML-spec
  1.2       +54 -9     exim/exim-doc/doc-docbook/Tidytxt
  1.3       +0 -1758   exim/exim-doc/doc-docbook/filter.ascd (dead)
  1.1       +1688 -0   exim/exim-doc/doc-docbook/filter.xfpt (new)
  1.5       +0 -35031  exim/exim-doc/doc-docbook/spec.ascd (dead)
  1.1       +32338 -0  exim/exim-doc/doc-docbook/spec.xfpt (new)
  1.2       +4 -6      exim/exim-doc/doc-docbook/x2man


Index: Markup.txt
====================================================================
$Cambridge: exim/exim-doc/doc-docbook/Markup.txt,v 1.1 2006/02/01 11:01:01 ph10 Exp $

XFPT MARKUP USED IN THE EXIM DOCUMENTATION
------------------------------------------

This file contains a summary of the xfpt markup that is used in the source
files of the Exim documentation. The source files are in plain text that can be
edited by any text editor. They are converted by the xfpt application into
DocBook XML for subsequent processing into the various output formats.

The advantage of using xfpt format as a "front end" is that it uses relatively
simple markup in the majority of the text, making it easier to read and edit.
The disadvantage is that it is tricky to deal with complicated formatting,
though that is probably true of any markup language, and is certainly true of
XML itself.

The Exim documentation uses standard xfpt DocBook markup with a few additions.
The definitions of the additions that are used in spec.xfpt and filter.xfpt,
respectively, appear at the start of each of those files. In this file we
describe all the markup briefly, both the standard and additional items. See
the xfpt specification for more details.

Markup in xfpt is indicated in one of two ways: lines that start with a dot are
interpreted specially ("directive lines"), and ampersand characters within the
text always introduce a markup item. Recognized sequences that start with an
ampersand are called "flags". Some of these have "partners" that do not
necessarily start with an ampersand, but these must always appear after a flag
that starts with an ampersand. There are no other forms of markup.

There are two text characters that are not printed as their Ascii graphics.
These are the grave accent and the single quote. They are automatically
converted into opening and closing typographic quote characters in non-literal
text. Other input characters that are not part of some markup always stand for
themselves.


CONTINUATION LINES

Any line of input can be continued onto the next line by ending the first line
with the sequence &&&. The line break and any leading spaces at the start of
the following line are ignored. This processing happens as the lines are read
in, before any other processing.
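
The continuation rule can be illustrated with a minimal Python sketch. This is
not xfpt's own code; the function name join_continuations is hypothetical, and
the sketch assumes that only space characters (not tabs) count as "leading
spaces".

```python
def join_continuations(lines):
    """Join every line that ends in '&&&' with the line that follows it."""
    out = []
    pending = None  # text of a line still waiting for its continuation
    for line in lines:
        line = line.rstrip("\n")
        if pending is not None:
            # Drop the line break and the leading spaces of the next line.
            line = pending + line.lstrip(" ")
            pending = None
        if line.endswith("&&&"):
            pending = line[:-3]
        else:
            out.append(line)
    if pending is not None:
        # A trailing '&&&' on the last line has nothing to join with.
        out.append(pending)
    return out
```

Because the joining happens as the lines are read, a directive that has been
split over two input lines reaches the rest of the processing as one line.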


SPECIAL CHARACTERS IN TEXT

The following flag sequences are translated to non-Ascii characters:

    &--     en-dash (generates &ndash;)
    &~      hard space (generates &nbsp;)


The following two flags are for use on Exim option definitions. They are
designed for use within italic text; however, they terminate and restart the
italic so that the daggers themselves are roman. These flags do not work
outside italic text.

    &!!     dagger         (generates </emphasis>&dagger;<emphasis>)
    &!?     double dagger  (generates </emphasis>&Dagger;<emphasis>)


Any Unicode character can be accessed by giving its name or code point in the
normal XML fashion. For example, &dagger; gives a dagger and &copy; gives a
copyright symbol.


AMPERSANDS AS DATA

If you really do want an ampersand character in the text, you must type two
ampersands. This is a flag that expands to &amp; in the output. Of course, you
could also just type &amp; yourself; the flag is just for convenience.


PAIRED FLAGS

There are several sequences that use pairs of markup flags, surrounding some
enclosed text, which is represented as ... in the following list:

    &'...'&    italic: maps to <emphasis>...</emphasis>
               used for email addresses, domains, local
               parts, header names, user names


    &*...*&    bold: maps to <emphasis role="bold">...</emphasis>
               used for things like &*Note:*&


    &`...`&    monospaced text: maps to <literal>...</literal>
               used for literal quoting in mixed-font text


    &$...$&    Exim variable: maps to <varname>$...</varname>
               note that the leading dollar is automatically inserted


    &%...%&    Exim option, command line option: maps to <option>...</option>


    &(...)&    Exim driver name, Unix command name, filter command name:
               maps to <command>...</command>


    &[...]&    C function: maps to <function>...</function>


    &_..._&    file name: maps to <filename>...</filename>


    &"..."&    put word in quotation marks: maps to <quote>...</quote>


For example,

    This is an &'italic phrase'&. This is a &_filename_& and a &$variable$&.
    This &"word"& is in quote marks.


It is important to use &"..."& rather than literal quotes so that different
renditions can be used for different forms of output.

These markup items can be nested, but not overlapped. However, the resulting
XML from nested constructions is not always valid, so nesting is best avoided
if possible. For example, &`xxx&'yyy'&xxx`& generates an <emphasis> item within
a <literal> item, and the DocBook DTD does not allow that. However, a
combination that does work is <literal> within an <emphasis>, so that is what
you have to use if you want an italic or boldface monospaced font. For example,
you have to use &*&`bold mono`&*& and not &`&*bold mono*&`&.
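
The mapping from paired flags to DocBook elements can be sketched in Python as
follows. This is only an illustration of the table above, not the way xfpt is
implemented: the names PAIRS, CLOSERS and convert_flags are hypothetical, the
sketch ignores every other kind of markup (including && and the unpaired
flags), and it does not check whether the nesting is valid DocBook.

```python
# Opening flag character -> (opening XML, closing XML), as listed above.
PAIRS = {
    "'": ("<emphasis>", "</emphasis>"),
    "*": ('<emphasis role="bold">', "</emphasis>"),
    "`": ("<literal>", "</literal>"),
    "$": ("<varname>$", "</varname>"),
    "%": ("<option>", "</option>"),
    "(": ("<command>", "</command>"),
    "[": ("<function>", "</function>"),
    "_": ("<filename>", "</filename>"),
    '"': ("<quote>", "</quote>"),
}
# Flags whose closing character differs from the opening one.
CLOSERS = {"(": ")", "[": "]"}


def convert_flags(text):
    """Replace paired flags with XML; flags may nest but not overlap."""
    out, stack, i = [], [], 0
    while i < len(text):
        two = text[i:i + 2]
        if stack and two == CLOSERS.get(stack[-1], stack[-1]) + "&":
            # Closing partner of the most recently opened flag.
            out.append(PAIRS[stack.pop()][1])
            i += 2
        elif len(two) == 2 and two[0] == "&" and two[1] in PAIRS:
            stack.append(two[1])
            out.append(PAIRS[two[1]][0])
            i += 2
        else:
            out.append(text[i])
            i += 1
    if stack:
        raise ValueError("unclosed flag: &" + stack[-1])
    return "".join(out)
```

Using a stack means that a closing partner is recognized only when it matches
the innermost open flag, which is one way of expressing "nested, but not
overlapped".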


LITERAL XML

You can include blocks of literal XML between these two directive lines:

    .literal xml
    ...
    .literal off


There are some examples at the start of the Exim specification. You can also
include individual XML elements by enclosing them in &<...>& but at the time of
writing there are no examples of this usage in the Exim documentation.


COMMENTS

You can include comments that will not be copied to the XML document by
starting a line with a dot followed by a space. You can include comments that
are copied to the XML by either of the literal XML methods just described.


URL REFERENCES

To refer to a URL, use &url, followed by parentheses that can enclose one or
two arguments, comma separated. The second, if present, is used as the
displayed text. If there is only one argument, it is used both as the displayed
text and as the URL. For example, here is a reference to
&url(http://www.exim.org/,the exim home page). In HTML output, all you see is
the display text; in printed output you see something like "the exim home page
[http://www.exim.org/]". The URL is printed in a bold font.
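
The argument handling for &url can be sketched in Python. The text above does
not say which XML is generated, so the use of the DocBook <ulink> element here
is an assumption, as are the names URL_FLAG and expand_url. The sketch does not
handle URLs that themselves contain commas or parentheses.

```python
import re
from xml.sax.saxutils import escape, quoteattr

# &url(URL) or &url(URL,display text)
URL_FLAG = re.compile(r"&url\(([^,()]+)(?:,([^()]*))?\)")


def expand_url(text):
    """Expand each &url(...) item into an (assumed) <ulink> element."""
    def repl(match):
        url = match.group(1).strip()
        # With only one argument, the URL doubles as the displayed text.
        label = (match.group(2) or url).strip()
        return "<ulink url=%s>%s</ulink>" % (quoteattr(url), escape(label))
    return URL_FLAG.sub(repl, text)
```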


CHAPTERS AND SECTIONS

The directives .chapter and .section mark the beginnings of chapters and
sections. They are followed by a title in quotes, and optionally by up to two
more arguments. Either single or double quotes can be used, and if you need a
quote of the type being used as a delimiter within the string, it must be
doubled. (Quotes are not in fact needed if the title contains no white space,
but this is rare.)

The second argument, if present and not an empty string, is an id for
cross-references. For example:

    .chapter "Environment for running local transports" "CHAPenvironment"


To refer to a cross-reference point, enclose the name in &<<...>>&. For
example:

    See section &<<SECTexample>>&.


Chapter titles are used for running feet in the PostScript and PDF forms of the
manual. Sometimes they are too long, causing them to be split in an ugly way.
The solution to this is to define a short title for the running feet as the
third argument for .chapter or .section, like this:

    .chapter "Environment for running local transports" "CHAPenvironment" &&&
             "Environment for local transports"


Note the use of &&& in this example to continue the logical input line. If you
need to specify a third argument without a second argument, the second argument
must be given as an empty string in quotes.
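
The quoting rules for directive arguments can be sketched in Python. This is an
illustration of the rules just described, not xfpt's parser; the function name
split_args is hypothetical, and it expects the text that follows the directive
name on the (already joined) logical line.

```python
def split_args(rest):
    """Split directive arguments, honouring single or double quotes."""
    args, i, n = [], 0, len(rest)
    while i < n:
        if rest[i].isspace():
            i += 1
            continue
        if rest[i] in "\"'":
            quote = rest[i]
            i += 1
            buf = []
            while i < n:
                if rest[i] == quote:
                    if i + 1 < n and rest[i + 1] == quote:
                        # A doubled delimiter stands for one literal quote.
                        buf.append(quote)
                        i += 2
                        continue
                    i += 1
                    break
                buf.append(rest[i])
                i += 1
            else:
                raise ValueError("unterminated quoted argument")
            args.append("".join(buf))
        else:
            # Unquoted argument: runs up to the next white space.
            j = i
            while j < n and not rest[j].isspace():
                j += 1
            args.append(rest[i:j])
            i = j
    return args
```

For a .chapter line with a short title but no id, the second element of the
result is the empty string, which is how the "third argument without a second
argument" case shows up.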


DISPLAYS

There are two forms of text display. Displayed blocks of literal text are
started by .code and terminated by .endd:

    .code
    # Exim filter
    deliver baggins@???
    .endd


No flags are recognized in such blocks, which are displayed in a monospaced
font.

Blocks of text between .display and .endd are displayed in the current font,
with the lines processed for flags as in normal paragraphs, but keeping the
line layout. Flags can be used in the block to specify different fonts or
special characters. For example:

    .display
    &`\n`&   is replaced by a newline
    &`\r`&   is replaced by a carriage return
    &`\t`&   is replaced by a tab
    .endd



BLOCK QUOTES

Text between .blockquote and .endblockquote is forced to start a new paragraph
and is wrapped in a <blockquote> element.


INDEX ENTRIES

To create an index entry, include a line like one of these:

    .cindex "primary text" "secondary text"
    .oindex "primary text" "secondary text"


The first is for the "concept index" and the second is for the "options index".
Not all forms of output distinguish between these - sometimes there is just one
index.

The index for the Exim reference manual has a number of "see also" entries.
These are coded in raw XML at the start of the source file.


LISTS

Bulleted (itemized) lists are started by .ilist, and ordered (numbered) lists
are started by .olist, which can be followed by "arabic", "loweralpha",
"lowerroman", "upperalpha", or "upperroman" to indicate the type of numeration
that is wanted. Each new item is started by .next, and the whole list is
terminated by .endlist. Lists can be nested. For example:

    .ilist
    The first item in the itemized list.
    .olist lowerroman
    The first item in the nested, numbered list
    .next
    The next item in the nested, numbered list.
    .endlist
    Continuing with the first item in the itemized list.
    .next
    The next item in the itemized list.
    .endlist


Variable lists are used for Exim command line options and similar things. They
map into XML <variablelist> items. Start the list with .vlist and end it with
.endlist. Each item starts with a .vitem line, followed by its description. The
argument to .vitem must be quoted if it contains spaces. For example:

    .vlist
    .vitem &*--*&
    This is a pseudo-option whose only purpose is to terminate the options and
    therefore to cause subsequent command line items to be treated as arguments
    rather than options, even if they begin with hyphens.


    .vitem &*--help*&
    This option causes Exim to output a few sentences stating what it is.
    The same output is generated if the Exim binary is called with no options and
    no arguments.
    ...
    .endlist



TABLES

The .itable macro directive in xfpt can be used to specify an informal table.
See the specification for details. The Exim specification uses this directly in
one place, but most of its tables contain only two columns, for which a
cut-down macro called .table2 has been defined. Its arguments are the widths of
the columns, defaulting to 190pt and 300pt, which are suitable for the many
tables that appear at the start of the global options definition chapter. Each
row in a table is defined by a .row macro, and the table ends with .endtable.
For example:

    .table2 100pt
    .row &_OptionLists.txt_&   "list of all options in alphabetical order"
    .row &_dbm.discuss.txt_&   "discussion about DBM libraries"
    ...
    .endtable


This example overrides the width of the first column. The first arguments of
the .row macro do not need quotes, because they contain no white space, but
quotes could have been used.
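
The way the column-width defaults are filled in can be sketched in Python. The
.itable line used here is the body of the .table2 macro as defined at the head
of filter.xfpt in this commit, and the defaults are the 190pt and 300pt
mentioned above; the function name expand_table2 is hypothetical, and the real
substitution is of course done by xfpt's macro processor.

```python
def expand_table2(args, defaults=("190pt", "300pt")):
    """Return the .itable line that a .table2 call is assumed to expand to."""
    given = list(args[:2])
    # Any width that is not supplied falls back to the macro default.
    widths = given + list(defaults[len(given):])
    return ".itable none 0 0 2 %s left %s left" % tuple(widths)
```

With this sketch, expand_table2(["100pt"]) overrides only the first column
width, which corresponds to the ".table2 100pt" example.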


EXIM CONFIGURATION OPTION HEADINGS

Each Exim configuration option is formatted with its name, usage, type, and
default value on a single output line, spread over the line so as to fill it
completely. The only way I know of aligning text using DocBook is to use a
table. The .option macro defines such a table and inserts its four arguments
into the cells. For example:

    .option acl_not_smtp_mime main string&!! unset
    This option causes...


The macro contains the font definitions and the heading words "Use", "Type",
and "Default", so you do not have to supply them. Notice the use of the &!!
flag to put a dagger after the word "string".


CHANGE BARS

I have not yet found a way of producing change bars in the PostScript and PDF
versions of the documents. However, it is possible to put a green background
behind changed text in the HTML version, so the appropriate markup should be
used in the source. There is a facility in xfpt for setting the "revisionflag"
attribute on appropriate XML elements. There is also a macro called .new which
packages this up for use in three different ways. One or more large text items
can be placed between .new and .wen ("wen" is "new" backwards). For example:

    This paragraph is not flagged as new.
    .new
    This is a new paragraph that contains a display:
    .display
    whatever
    .endd
    This is the next paragraph.
    .wen
    Here is the next, unmarked, paragraph.


When called without an argument, .new terminates the current paragraph, and
.wen always does so. Therefore, even though there are no blank lines before
.new or .wen above, the marked text will be in a paragraph of its own. You
can, of course, put in blank lines if you wish, and it is probably clearer to
do so.

If you want to mark just a few words inside a paragraph as new, you can call the
.new macro with an argument. The macro can be called either as a directive or
as an inline macro call, which takes the form of an ampersand followed by the
name, with the argument in parentheses. For example:

    This is a paragraph that has
    .new "a few marked words"
    within it. Here are &new(some more) marked words.


The effect of this is to generate a <phrase> XML element with the revisionflag
attribute set. The .wen macro is not used in this case.

If you want to mark a whole table as new, .new and .wen can be used to surround
it as described above. However, current DocBook processors do not seem to
recognize the "revisionflag" attribute on individual rows and table entries.
You can, nevertheless, mark the contents of individual table entries as changed
by using an inline macro call. For example:

    .row "&new(some text)" "...."


Each such entry must be separately marked. If there are more than one or two,
it may be easier just to mark the entire table.

Philip Hazel
Last updated: 25 January 2006

Index: MyTitleStyle.xsl
====================================================================
<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">

<!-- This stylesheet was created by template/titlepage.xsl; do not edit it by hand. -->

  <xsl:template name="book.titlepage.recto">
    <xsl:choose>
      <xsl:when test="bookinfo/title">
        <xsl:apply-templates mode="book.titlepage.recto.auto.mode" select="bookinfo/title"/>
      </xsl:when>
      <xsl:when test="info/title">
        <xsl:apply-templates mode="book.titlepage.recto.auto.mode" select="info/title"/>
      </xsl:when>
      <xsl:when test="title">
        <xsl:apply-templates mode="book.titlepage.recto.auto.mode" select="title"/>
      </xsl:when>
    </xsl:choose>


    <xsl:choose>
      <xsl:when test="bookinfo/subtitle">
        <xsl:apply-templates mode="book.titlepage.recto.auto.mode" select="bookinfo/subtitle"/>
      </xsl:when>
      <xsl:when test="info/subtitle">
        <xsl:apply-templates mode="book.titlepage.recto.auto.mode" select="info/subtitle"/>
      </xsl:when>
      <xsl:when test="subtitle">
        <xsl:apply-templates mode="book.titlepage.recto.auto.mode" select="subtitle"/>
      </xsl:when>
    </xsl:choose>


    <xsl:apply-templates mode="book.titlepage.recto.auto.mode" select="bookinfo/corpauthor"/>
    <xsl:apply-templates mode="book.titlepage.recto.auto.mode" select="info/corpauthor"/>
    <xsl:apply-templates mode="book.titlepage.recto.auto.mode" select="bookinfo/authorgroup"/>
    <xsl:apply-templates mode="book.titlepage.recto.auto.mode" select="info/authorgroup"/>
    <xsl:apply-templates mode="book.titlepage.recto.auto.mode" select="bookinfo/author"/>
    <xsl:apply-templates mode="book.titlepage.recto.auto.mode" select="info/author"/>
    <xsl:apply-templates mode="book.titlepage.recto.auto.mode" select="bookinfo/affiliation"/>
    <xsl:apply-templates mode="book.titlepage.recto.auto.mode" select="info/affiliation"/>
  </xsl:template>


  <xsl:template name="book.titlepage.verso">
    <xsl:choose>
      <xsl:when test="bookinfo/title">
        <xsl:apply-templates mode="book.titlepage.verso.auto.mode" select="bookinfo/title"/>
      </xsl:when>
      <xsl:when test="info/title">
        <xsl:apply-templates mode="book.titlepage.verso.auto.mode" select="info/title"/>
      </xsl:when>
      <xsl:when test="title">
        <xsl:apply-templates mode="book.titlepage.verso.auto.mode" select="title"/>
      </xsl:when>
    </xsl:choose>


    <xsl:apply-templates mode="book.titlepage.verso.auto.mode" select="bookinfo/corpauthor"/>
    <xsl:apply-templates mode="book.titlepage.verso.auto.mode" select="info/corpauthor"/>
    <xsl:apply-templates mode="book.titlepage.verso.auto.mode" select="bookinfo/authorgroup"/>
    <xsl:apply-templates mode="book.titlepage.verso.auto.mode" select="info/authorgroup"/>
    <xsl:apply-templates mode="book.titlepage.verso.auto.mode" select="bookinfo/author"/>
    <xsl:apply-templates mode="book.titlepage.verso.auto.mode" select="info/author"/>
    <xsl:apply-templates mode="book.titlepage.verso.auto.mode" select="bookinfo/affiliation"/>
    <xsl:apply-templates mode="book.titlepage.verso.auto.mode" select="info/affiliation"/>
    <xsl:apply-templates mode="book.titlepage.verso.auto.mode" select="bookinfo/address"/>
    <xsl:apply-templates mode="book.titlepage.verso.auto.mode" select="info/address"/>
    <xsl:apply-templates mode="book.titlepage.verso.auto.mode" select="bookinfo/pubdate"/>
    <xsl:apply-templates mode="book.titlepage.verso.auto.mode" select="info/pubdate"/>
    <xsl:apply-templates mode="book.titlepage.verso.auto.mode" select="bookinfo/abstract"/>
    <xsl:apply-templates mode="book.titlepage.verso.auto.mode" select="info/abstract"/>
    <xsl:apply-templates mode="book.titlepage.verso.auto.mode" select="bookinfo/copyright"/>
    <xsl:apply-templates mode="book.titlepage.verso.auto.mode" select="info/copyright"/>
    <xsl:apply-templates mode="book.titlepage.verso.auto.mode" select="bookinfo/revhistory"/>
    <xsl:apply-templates mode="book.titlepage.verso.auto.mode" select="info/revhistory"/>
    <xsl:apply-templates mode="book.titlepage.verso.auto.mode" select="bookinfo/legalnotice"/>
    <xsl:apply-templates mode="book.titlepage.verso.auto.mode" select="info/legalnotice"/>
  </xsl:template>


<xsl:template name="book.titlepage.separator">
</xsl:template>

<xsl:template name="book.titlepage.before.recto">
</xsl:template>

<xsl:template name="book.titlepage.before.verso"><fo:block xmlns:fo="http://www.w3.org/1999/XSL/Format" break-after="page"/>
</xsl:template>

  <xsl:template name="book.titlepage">
    <fo:block xmlns:fo="http://www.w3.org/1999/XSL/Format">
      <xsl:variable name="recto.content">
        <xsl:call-template name="book.titlepage.before.recto"/>
        <xsl:call-template name="book.titlepage.recto"/>
      </xsl:variable>
      <xsl:if test="normalize-space($recto.content) != ''">
        <fo:block><xsl:copy-of select="$recto.content"/></fo:block>
      </xsl:if>
      <xsl:variable name="verso.content">
        <xsl:call-template name="book.titlepage.before.verso"/>
        <xsl:call-template name="book.titlepage.verso"/>
      </xsl:variable>
      <xsl:if test="normalize-space($verso.content) != ''">
        <fo:block><xsl:copy-of select="$verso.content"/></fo:block>
      </xsl:if>
      <xsl:call-template name="book.titlepage.separator"/>
    </fo:block>
  </xsl:template>


  <xsl:template match="*" mode="book.titlepage.recto.mode">
    <!-- if an element isn't found in this mode, -->
    <!-- try the generic titlepage.mode -->
    <xsl:apply-templates select="." mode="titlepage.mode"/>
  </xsl:template>


  <xsl:template match="*" mode="book.titlepage.verso.mode">
    <!-- if an element isn't found in this mode, -->
    <!-- try the generic titlepage.mode -->
    <xsl:apply-templates select="." mode="titlepage.mode"/>
  </xsl:template>


<xsl:template match="title" mode="book.titlepage.recto.auto.mode">
<fo:block xmlns:fo="http://www.w3.org/1999/XSL/Format" xsl:use-attribute-sets="book.titlepage.recto.style" text-align="center" font-size="24.8832pt" space-before="5em" font-weight="bold" font-family="{$title.fontset}">
<xsl:call-template name="division.title">
<xsl:with-param name="node" select="ancestor-or-self::book[1]"/>
</xsl:call-template>
</fo:block>
</xsl:template>

<xsl:template match="subtitle" mode="book.titlepage.recto.auto.mode">
<fo:block xmlns:fo="http://www.w3.org/1999/XSL/Format" xsl:use-attribute-sets="book.titlepage.recto.style" text-align="center" font-size="20.736pt" space-before="15.552pt" font-family="{$title.fontset}">
<xsl:apply-templates select="." mode="book.titlepage.recto.mode"/>
</fo:block>
</xsl:template>

<xsl:template match="corpauthor" mode="book.titlepage.recto.auto.mode">
<fo:block xmlns:fo="http://www.w3.org/1999/XSL/Format" xsl:use-attribute-sets="book.titlepage.recto.style" font-size="17.28pt" keep-with-next="always" space-before="2in">
<xsl:apply-templates select="." mode="book.titlepage.recto.mode"/>
</fo:block>
</xsl:template>

<xsl:template match="authorgroup" mode="book.titlepage.recto.auto.mode">
<fo:block xmlns:fo="http://www.w3.org/1999/XSL/Format" xsl:use-attribute-sets="book.titlepage.recto.style" space-before="2in">
<xsl:apply-templates select="." mode="book.titlepage.recto.mode"/>
</fo:block>
</xsl:template>

<xsl:template match="author" mode="book.titlepage.recto.auto.mode">
<fo:block xmlns:fo="http://www.w3.org/1999/XSL/Format" xsl:use-attribute-sets="book.titlepage.recto.style" font-size="17.28pt" space-before="10.8pt" keep-with-next="always">
<xsl:apply-templates select="." mode="book.titlepage.recto.mode"/>
</fo:block>
</xsl:template>

<xsl:template match="affiliation" mode="book.titlepage.recto.auto.mode">
<fo:block xmlns:fo="http://www.w3.org/1999/XSL/Format" xsl:use-attribute-sets="book.titlepage.recto.style" space-before="1em">
<xsl:apply-templates select="." mode="book.titlepage.recto.mode"/>
</fo:block>
</xsl:template>

<xsl:template match="title" mode="book.titlepage.verso.auto.mode">
<fo:block xmlns:fo="http://www.w3.org/1999/XSL/Format" xsl:use-attribute-sets="book.titlepage.verso.style" font-size="14.4pt" font-weight="bold" font-family="{$title.fontset}">
<xsl:call-template name="book.verso.title">
</xsl:call-template>
</fo:block>
</xsl:template>

<xsl:template match="corpauthor" mode="book.titlepage.verso.auto.mode">
<fo:block xmlns:fo="http://www.w3.org/1999/XSL/Format" xsl:use-attribute-sets="book.titlepage.verso.style">
<xsl:apply-templates select="." mode="book.titlepage.verso.mode"/>
</fo:block>
</xsl:template>

<xsl:template match="authorgroup" mode="book.titlepage.verso.auto.mode">
<fo:block xmlns:fo="http://www.w3.org/1999/XSL/Format" xsl:use-attribute-sets="book.titlepage.verso.style">
<xsl:call-template name="verso.authorgroup">
</xsl:call-template>
</fo:block>
</xsl:template>

<xsl:template match="author" mode="book.titlepage.verso.auto.mode">
<fo:block xmlns:fo="http://www.w3.org/1999/XSL/Format" xsl:use-attribute-sets="book.titlepage.verso.style" space-before="1em">
<xsl:apply-templates select="." mode="book.titlepage.verso.mode"/>
</fo:block>
</xsl:template>

<xsl:template match="affiliation" mode="book.titlepage.verso.auto.mode">
<fo:block xmlns:fo="http://www.w3.org/1999/XSL/Format" xsl:use-attribute-sets="book.titlepage.verso.style" space-before="1em">
<xsl:apply-templates select="." mode="book.titlepage.verso.mode"/>
</fo:block>
</xsl:template>

<xsl:template match="address" mode="book.titlepage.verso.auto.mode">
<fo:block xmlns:fo="http://www.w3.org/1999/XSL/Format" xsl:use-attribute-sets="book.titlepage.verso.style">
<xsl:apply-templates select="." mode="book.titlepage.verso.mode"/>
</fo:block>
</xsl:template>

<xsl:template match="pubdate" mode="book.titlepage.verso.auto.mode">
<fo:block xmlns:fo="http://www.w3.org/1999/XSL/Format" xsl:use-attribute-sets="book.titlepage.verso.style" space-before="1em">
<xsl:apply-templates select="." mode="book.titlepage.verso.mode"/>
</fo:block>
</xsl:template>

<xsl:template match="abstract" mode="book.titlepage.verso.auto.mode">
<fo:block xmlns:fo="http://www.w3.org/1999/XSL/Format" xsl:use-attribute-sets="book.titlepage.verso.style">
<xsl:apply-templates select="." mode="book.titlepage.verso.mode"/>
</fo:block>
</xsl:template>

<xsl:template match="copyright" mode="book.titlepage.verso.auto.mode">
<fo:block xmlns:fo="http://www.w3.org/1999/XSL/Format" xsl:use-attribute-sets="book.titlepage.verso.style" space-before="1em">
<xsl:apply-templates select="." mode="book.titlepage.verso.mode"/>
</fo:block>
</xsl:template>

<xsl:template match="revhistory" mode="book.titlepage.verso.auto.mode">
<fo:block xmlns:fo="http://www.w3.org/1999/XSL/Format" xsl:use-attribute-sets="book.titlepage.verso.style" space-before="1em">
<xsl:apply-templates select="." mode="book.titlepage.verso.mode"/>
</fo:block>
</xsl:template>

<xsl:template match="legalnotice" mode="book.titlepage.verso.auto.mode">
<fo:block xmlns:fo="http://www.w3.org/1999/XSL/Format" xsl:use-attribute-sets="book.titlepage.verso.style" font-size="8pt">
<xsl:apply-templates select="." mode="book.titlepage.verso.mode"/>
</fo:block>
</xsl:template>

</xsl:stylesheet>

Index: PageLabelPDF
====================================================================
#! /usr/bin/perl -w

# $Cambridge: exim/exim-doc/doc-docbook/PageLabelPDF,v 1.1 2006/02/01 11:01:01 ph10 Exp $

# Program to add page label information to the PDF output file. I have not
# found a way of automatically discovering the number of frontmatter pages
# in the document. It is therefore screwed in as 12 in the next statement.

$add = "/PageLabels << /Nums [ 0 << /S /r >>\n" .
       "                      12 << /S /D >>\n" .
       "                     ]\n" .
       "            >>\n";

$extra = length $add;

$before = 0;
while (<>)
  {
  print;
  $before += length($_);
  last if $_ =~ "^<< /Type /Catalog";
  }

print $add;

while (<>)
  {
  print;
  last if $_ =~ /^xref$/;
  }

while (<>)
  {
  if (/^(\d{10}) (.*)/)
    {
    my($was) = $1;
    my($rest) = $2;
    printf "%010d $rest\n", $was + (($was > $before)? $extra : 0);
    }
  elsif (/^startxref/)
    {
    print;
    $_ = <>;
    if (/^(\d+)/)
      {
      print $1 + $extra, "\n";
      }
    else
      {
      print;
      }
    }
  else
    {
    print;
    }
  }


# End
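
The offset adjustment in the final loop of the script above can be restated as
a small Python sketch. It is an illustration of the same arithmetic, not a
replacement for the Perl: the function name shift_offset is hypothetical,
"before" is the number of bytes read up to and including the catalog line, and
"extra" is the length of the inserted /PageLabels text.

```python
import re


def shift_offset(line, before, extra):
    """Adjust one cross-reference table entry for the inserted text."""
    m = re.match(r"(\d{10}) (.*)", line)
    if not m:
        return line  # not an xref entry; pass it through unchanged
    was = int(m.group(1))
    if was > before:
        # Objects that follow the insertion point move down the file.
        was += extra
    return "%010d %s" % (was, m.group(2))
```

Entries for objects that precede the insertion point keep their offsets; only
those beyond it are moved by the length of the added text.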



Index: filter.xfpt
====================================================================
. $Cambridge: exim/exim-doc/doc-docbook/filter.xfpt,v 1.1 2006/02/01 11:01:01 ph10 Exp $

. /////////////////////////////////////////////////////////////////////////////
. This is the primary source of the document that describes Exim's filtering
. facilities. It is an xfpt document that is converted into DocBook XML for
. subsequent conversion into printing and online formats. The markup used
. herein is "standard" xfpt markup, with some extras. The markup is summarized
. in a file called Markup.txt.
. /////////////////////////////////////////////////////////////////////////////

.include stdflags
.include stdmacs
.docbook
.book

. ===========================================================================
. Additional xfpt markup used by this document, over and above the default
. provided in the xfpt library.

. Override the &$ flag to automatically insert a $ with the variable name

.flag &$ $& "<varname>$" "</varname>"

. A macro for the common 2-column tables

.macro table2 100pt 300pt
.itable none 0 0 2 $1 left $2 left
.endmacro
. ===========================================================================


. This preliminary stuff creates a <bookinfo> entry in the XML. This is removed
. when creating the PostScript/PDF output, because we do not want a full-blown
. title page created for those versions. The stylesheet fudges up a title line
. to replace the text "Table of contents". However, for the other forms of
. output, the <bookinfo> element is retained and used.

.literal xml
<bookinfo>
<title>Exim's interfaces to mail filtering</title>
<titleabbrev>Exim filtering</titleabbrev>
<date>30 January 2006</date>
<author><firstname>Philip</firstname><surname>Hazel</surname></author>
<authorinitials>PH</authorinitials>
<revhistory><revision>
  <revnumber>4.60-1</revnumber>
  <date>30 January 2006</date>
  <authorinitials>PH</authorinitials>
</revision></revhistory>
<copyright><year>2006</year><holder>University of Cambridge</holder></copyright>
</bookinfo>
.literal off



.chapter "Forwarding and filtering in Exim"
This document describes the user interfaces to Exim's in-built mail filtering
facilities, and is copyright &copy; University of Cambridge 2006. It
corresponds to Exim version 4.60.



.section "Introduction"
Most Unix mail transfer agents (programs that deliver mail) permit individual
users to specify automatic forwarding of their mail, usually by placing a list
of forwarding addresses in a file called &_.forward_& in their home
directories. Exim extends this facility by allowing the forwarding instructions
to be a set of rules rather than just a list of addresses, in effect providing
&"&_.forward_& with conditions"&. Operating the set of rules is called
&'filtering'&, and the file that contains them is called a &'filter file'&.

Exim supports two different kinds of filter file. An &'Exim filter'& contains
instructions in a format that is unique to Exim. A &'Sieve filter'& contains
instructions in the Sieve format that is defined by RFC 3028. As this is a
standard format, Sieve filter files may already be familiar to some users.
Sieve files should also be portable between different environments. However,
the Exim filtering facility contains more features (such as variable
expansion), and better integration with the host environment (such as the use
of external processes and pipes).

The choice of which kind of filter to use can be left to the end-user, provided
that the system administrator has configured Exim appropriately for both kinds
of filter. However, if interoperability is important, Sieve is the only
choice.

The ability to use filtering or traditional forwarding has to be enabled by the
system administrator, and some of the individual facilities can be separately
enabled or disabled. A local document should be provided to describe exactly
what has been enabled. In the absence of this, consult your system
administrator.

This document describes how to use a filter file and the format of its
contents. It is intended for use by end-users. Both Sieve filters and Exim
filters are covered. However, for Sieve filters, only issues that relate to the
Exim implementation are discussed, since Sieve itself is described elsewhere.

The contents of traditional &_.forward_& files are not described here. They
normally contain just a list of addresses, file names, or pipe commands,
separated by commas or newlines, but other types of item are also available.
The full details can be found in the chapter on the &(redirect)& router in the
Exim specification, which also describes how the system administrator can set
up and control the use of filtering.



.section "Filter operation"
It is important to realize that, in Exim, no deliveries are actually made while
a filter or traditional &_.forward_& file is being processed. Running a filter
or processing a traditional &_.forward_& file sets up future delivery
operations, but does not carry them out.

The result of filter or &_.forward_& file processing is a list of destinations
to which a message should be delivered. The deliveries themselves take place
later, along with all other deliveries for the message. This means that it is
not possible to test for successful deliveries while filtering. It also means
that any duplicate addresses that are generated are dropped, because Exim never
delivers the same message to the same address more than once.




.section "Testing a new filter file" "SECTtesting"
Filter files, especially the more complicated ones, should always be tested, as
it is easy to make mistakes. Exim provides a facility for preliminary testing
of a filter file before installing it. This tests the syntax of the file and
its basic operation, and can also be used with traditional &_.forward_& files.

Because a filter can do tests on the content of messages, a test message is
required. Suppose you have a new filter file called &_myfilter_& and a test
message in a file called &_test-message_&. Assuming that Exim is installed with
the conventional path name &_/usr/sbin/sendmail_& (some operating systems use
&_/usr/lib/sendmail_&), the following command can be used:
.code
/usr/sbin/sendmail -bf myfilter <test-message
.endd
The &%-bf%& option tells Exim that the following item on the command line is
the name of a filter file that is to be tested. There is also a &%-bF%& option,
which is similar, but which is used for testing system filter files, as opposed
to user filter files, and which is therefore of use only to the system
administrator.

The test message is supplied on the standard input. If there are no
message-dependent tests in the filter, an empty file (&_/dev/null_&) can be
used. A supplied message must start with header lines or the &"From&~"& message
separator line that is found in many multi-message folder files. Note that
blank lines at the start terminate the header lines. A warning is given if no
header lines are read.

The result of running this command, provided no errors are detected in the
filter file, is a list of the actions that Exim would try to take if presented
with the message for real. For example, for an Exim filter, the output
.code
Deliver message to: gulliver@???
Save message to: /home/lemuel/mail/archive
.endd
means that one copy of the message would be sent to
&'gulliver@???'&, and another would be added to the file
&_/home/lemuel/mail/archive_&, if all went well.

The actions themselves are not attempted while testing a filter file in this
way; there is no check, for example, that any forwarding addresses are valid.
For an Exim filter, if you want to know why a particular action is being taken,
add the &%-v%& option to the command. This causes Exim to output the results of
any conditional tests and to indent its output according to the depth of
nesting of &(if)& commands. Further additional output from a filter test can be
generated by the &(testprint)& command, which is described below.

When Exim is outputting a list of the actions it would take, if any text
strings are included in the output, non-printing characters therein are
converted to escape sequences. In particular, if any text string contains a
newline character, this is shown as &"\n"& in the testing output.

When testing a filter in this way, Exim makes up an &"envelope"& for the
message. The recipient is by default the user running the command, and so is
the sender, but the command can be run with the &%-f%& option to supply a
different sender. For example,
.code
/usr/sbin/sendmail -bf myfilter \
   -f islington@??? <test-message
.endd
Alternatively, if the &%-f%& option is not used, but the first line of the
supplied message is a &"From&~"& separator from a message folder file (not the
same thing as a &'From:'& header line), the sender is taken from there. If
&%-f%& is present, the contents of any &"From&~"& line are ignored.


The &"return path"& is the same as the envelope sender, unless the message
contains a &'Return-path:'& header, in which case it is taken from there. You
need not worry about any of this unless you want to test out features of a
filter file that rely on the sender address or the return path.

It is possible to change the envelope recipient by specifying further options.
The &%-bfd%& option changes the domain of the recipient address, while the
&%-bfl%& option changes the &"local part"&, that is, the part before the @
sign. An adviser could make use of these to test someone else's filter file.

The &%-bfp%& and &%-bfs%& options specify the prefix or suffix for the local
part. These are relevant only when support for multiple personal mailboxes is
implemented; see the description in section &<<SECTmbox>>& below.


.section "Installing a filter file"
A filter file is normally installed under the name &_.forward_& in your home
directory &-- it is distinguished from a conventional &_.forward_& file by its
first line (described below). However, the file name is configurable, and some
system administrators may choose to use some different name or location for
filter files.


.section "Testing an installed filter file"
Testing a filter file before installation cannot find every potential problem;
for example, it does not actually run commands to which messages are piped.
Some &"live"& tests should therefore also be done once a filter is installed.

If at all possible, test your filter file by sending messages from some other
account. If you send a message to yourself from the filtered account, and
delivery fails, the error message will be sent back to the same account, which
may cause another delivery failure. It won't cause an infinite sequence of such
messages, because delivery failure messages do not themselves generate further
messages. However, it does mean that the failure won't be returned to you, and
also that the postmaster will have to investigate the stuck message.

If you have to test an Exim filter from the same account, a sensible precaution
is to include the line
.code
if error_message then finish endif
.endd
as the first filter command, at least while testing. This causes filtering to
be abandoned for a delivery failure message, and since no destinations are
generated, the message goes on to be delivered to the original address. Unless
there is a good reason for not doing so, it is recommended that the above test
be left in all Exim filter files. (This does not apply to Sieve files.)



.section "Details of filtering commands"
The filtering commands for Sieve and Exim filters are completely different in
syntax and semantics. The Sieve mechanism is defined in RFC 3028; in the next
chapter we describe how it is integrated into Exim. The subsequent chapter
covers Exim filtering commands in detail.



.chapter "Sieve filter files" "CHAPsievefilter"
The code for Sieve filtering in Exim was contributed by Michael Haardt, and
most of the content of this chapter is taken from the notes he provided. Since
Sieve is an extensible language, it is important to understand &"Sieve"& in
this context as &"the specific implementation of Sieve for Exim"&.

This chapter does not contain a description of Sieve, since that can be found
in RFC 3028, which should be read in conjunction with these notes.

The Exim Sieve implementation offers the core as defined by RFC 3028,
comparison tests, the &*copy*&, &*envelope*&, &*fileinto*&, and &*vacation*&
extensions, but not the &*reject*& extension. Exim does not support message
disposition notifications (MDNs), so adding them just to the Sieve filter (as
required for &*reject*&) makes little sense.

In order for Sieve to work properly in Exim, the system administrator needs to
make some adjustments to the Exim configuration. These are described in the
chapter on the &(redirect)& router in the full Exim specification.


.section "Recognition of Sieve filters"
A filter file is interpreted as a Sieve filter if its first line is
.code
# Sieve filter
.endd
This is what distinguishes it from a conventional &_.forward_& file or an Exim
filter file.



.section "Saving to specified folders"
If the system administrator has set things up as suggested in the Exim
specification, and you use &(keep)& or &(fileinto)& to save a message into a
folder, absolute file names are used as specified, relative file names are
taken relative to &$home$&, and &_inbox_& goes to the standard mailbox location.



.section "Strings containing header names"
RFC 3028 does not specify what happens if a string denoting a header field does
not contain a valid header name, for example, because it contains a colon. This
implementation generates an error instead of ignoring the header field, in
order to ease script debugging, which fits in with the common picture of Sieve.



.section "Exists test with empty list of headers"
The &*exists*& test succeeds only if all the specified headers exist. RFC 3028
does not explicitly specify what happens on an empty list of headers. This
implementation evaluates that condition as true, interpreting the RFC in a
strict sense.



.section "Header test with invalid MIME encoding in header"
Some MUAs process invalid base64 encoded data, generating junk. Others ignore
junk after seeing an equal sign in base64 encoded data. RFC 2047 does not
specify how to react in this case, other than stating that a client must not
refuse to process a message for that reason. RFC 2045 specifies that invalid
data should be ignored (apparently looking at end of line characters). It also
specifies that invalid data may lead to rejecting messages containing them (and
there it appears to talk about true encoding violations), which clearly
contradicts ignoring them.

RFC 3028 does not specify how to process incorrect MIME words. This
implementation treats them literally, as it does if the word is correct but its
character set cannot be converted to UTF-8.



.section "Address test for multiple addresses per header"
A header may contain multiple addresses. RFC 3028 does not explicitly specify
how to deal with them, but since the address test checks if anything matches
anything else, matching one address suffices to satisfy the condition. That
makes it impossible to test if a header contains a certain set of addresses and
no more, but it is more logical than letting the test fail if the header
contains an additional address besides the one the test checks for.



.section "Semantics of keep"
The &(keep)& command is equivalent to
.code
fileinto "inbox";
.endd
It saves the message and resets the implicit keep flag. It does not set the
implicit keep flag; there is no command to set it once it has been reset.



.section "Semantics of fileinto"
RFC 3028 does not specify whether &(fileinto)& should try to create a mail
folder if it does not exist. This implementation allows the sysadmin to
configure that aspect using the &(appendfile)& transport options
&%create_directory%&, &%create_file%&, and &%file_must_exist%&. See the
&(appendfile)& transport in the Exim specification for details.



.section "Semantics of redirect"
Sieve scripts are supposed to be interoperable between servers, so this
implementation does not allow mail to be redirected to unqualified addresses,
because the domain would depend on the system being used. On systems with
virtual mail domains, the default domain is probably not what the user expects
it to be.



.section "String arguments"
There has been confusion about whether the string arguments to &(require)& are
to be matched case-sensitively or not. This implementation matches them with the
match type &(:is)& (default, see section 2.7.1 of the RFC) and the comparator
&(i;ascii-casemap)& (default, see section 2.7.3 of the RFC). The RFC defines
the command defaults clearly, so any implementation that behaves differently
violates RFC 3028. The same applies to comparator names, which are also
specified as strings.



.section "Number units"
There is a mistake in RFC 3028: the suffix G denotes gibi-, not tebi-.
The mistake is obvious, because RFC 3028 specifies G to denote 2^30
(which is gibi, not tebi), and that is what this implementation uses as
the scaling factor for the suffix G.



.section "RFC compliance"
Exim requires the first line of a Sieve filter to be
.code
# Sieve filter
.endd
Of course the RFC does not specify that line. Do not expect examples to work
without adding it, though.

RFC 3028 requires the use of CRLF to terminate a line. The rationale was that
CRLF is universally used in network protocols to mark the end of the line. This
implementation does not embed Sieve in a network protocol, but uses Sieve
scripts as part of the Exim MTA. Since all parts of Exim use LF as the newline
character, this implementation does, too, by default, though the system
administrator may choose (at Exim compile time) to use CRLF instead.

Exim violates RFC 2822, section 3.6.8, by accepting 8-bit header names, so this
implementation repeats this violation to stay consistent with Exim. This is in
preparation for UTF-8 data.

Sieve scripts cannot contain NUL characters in strings, but mail headers could
contain MIME encoded NUL characters, which could never be matched by Sieve
scripts using exact comparisons. For that reason, this implementation extends
the Sieve quoted string syntax with \0 to describe a NUL character, which
violates RFC 3028, where \0 is the same as 0. Even without using \0, the
following tests are all true in this implementation. Implementations that use
C-style strings will evaluate only the first test as true.
.code
Subject: =?iso-8859-1?q?abc=00def

header :contains "Subject" ["abc"]
header :contains "Subject" ["def"]
header :matches "Subject" ["abc?def"]
.endd
Note that by considering Sieve to be an MUA, RFC 2047 can be interpreted in
such a way that truncating strings at NUL characters is allowed for Sieve
implementations, although it is not recommended. Using encoded NUL characters
in headers is also allowed, but that is not recommended either. The above
example shows why.

RFC 3028 states that if an implementation fails to convert a character set to
UTF-8, two strings cannot be equal if one contains octets greater than 127.
Assuming that all unknown character sets are one-byte character sets with the
lower 128 octets being US-ASCII is not sound, so this implementation violates
RFC 3028 and treats such MIME words literally. That way at least something
could be matched.

The folder specified by &(fileinto)& must not contain the character sequence
&".."& to avoid security problems. RFC 3028 does not specify the syntax of
folders apart from &(keep)& being equivalent to
.code
fileinto "INBOX";
.endd
This implementation uses &_inbox_& instead.

Sieve script errors currently cause messages to be silently filed into
&_inbox_&. RFC 3028 requires that the user is notified of that condition.
This may be implemented in the future by adding a header line to mails that
are filed into &_inbox_& due to an error in the filter.



.chapter "Exim filter files" "CHAPeximfilter"
This chapter contains a full description of the contents of Exim filter files.


.section "Format of Exim filter files"
Apart from leading white space, the first text in an Exim filter file must be
.code
# Exim filter
.endd
This is what distinguishes it from a conventional &_.forward_& file or a Sieve
filter file. If the file does not have this initial line (or the equivalent for
a Sieve filter), it is treated as a conventional &_.forward_& file, both when
delivering mail and when using the &%-bf%& testing mechanism. The white space
in the line is optional, and any capitalization may be used. Further text on
the same line is treated as a comment. For example, you could have
.code
# Exim filter <<== do not edit or remove this line!
.endd
The remainder of the file is a sequence of filtering commands, which consist of
keywords and data values. For example, in the command
.code
deliver gulliver@???
.endd
the keyword is &`deliver`& and the data value is
&`gulliver@???`&. White space or line breaks separate the
components of a command, except in the case of conditions for the &(if)&
command, where round brackets (parentheses) also act as separators. Complete
commands are separated from each other by white space or line breaks; there are
no special terminators. Thus, several commands may appear on one line, or one
command may be spread over a number of lines.

If the character # follows a separator anywhere in a command, everything from
# up to the next newline is ignored. This provides a way of including comments
in a filter file.
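
The following fragment is a minimal sketch of where comments may appear. The
subject text and the folder name are hypothetical, and the &(if)& and &(save)&
commands that it uses are described later in this chapter.
.code
# Exim filter
# A comment may occupy a whole line.
if $h_subject: contains "holiday"     # or follow a separator within a command
then
  save $home/mail/holiday             # everything up to the newline is ignored
endif
.endd
Because # starts a comment only when it follows a separator, a # that appears
in the middle of an unquoted word is part of that word.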


.section "Data values in filter commands"
There are two ways in which a data value can be input:

.ilist
If the text contains no white space, it can be typed verbatim. However, if it
is part of a condition, it must also be free of round brackets (parentheses),
as these are used for grouping in conditions.
.next
Otherwise, text must be enclosed in double quotation marks. In this case, the
character \ (backslash) is treated as an &"escape character"& within the
string, causing the following character or characters to be treated specially:
.display
&`\n`& is replaced by a newline
&`\r`& is replaced by a carriage return
&`\t`& is replaced by a tab
.endd
.endlist

Backslash followed by up to three octal digits is replaced by the character
specified by those digits, and &`\x`& followed by up to two hexadecimal digits
is treated similarly. Backslash followed by any other character is replaced by
the second character, so that in particular, &`\"`& becomes &`"`& and &`\\`&
becomes &`\`&. A data item enclosed in double quotes can be continued onto the
next line by ending the first line with a backslash. Any leading white space at
the start of the continuation line is ignored.

In addition to the escape character processing that occurs when strings are
enclosed in quotes, most data values are also subject to &'string expansion'&
(as described in the next section), in which case the characters &`$`& and
&`\`& are also significant. This means that if a single backslash is actually
required in such a string, and the string is also quoted, &`\\\\`& has to be
entered.
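
As an illustration of these rules, the following sketch shows unquoted and
quoted data values side by side. The address and the text strings are
hypothetical, and the &(mail)& and &(testprint)& commands are described later
in this chapter.
.code
# Exim filter
# No white space in the value, so quotes are not needed
deliver gulliver@elsewhere.example
# White space in the value, so quotes are required
mail subject "Re: your message"
# Inside quotes, \n becomes a newline and \" becomes a double quote
testprint "first line\nsecond \"quoted\" line"
# Quoted and also expanded: four backslashes are needed to obtain one
testprint "one backslash: \\\\"
.endd
In the last command, quote processing reduces the four backslashes to two, and
string expansion then reduces those two to one.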

The maximum permitted length of a data string, before expansion, is 1024
characters.


.section "String expansion" "SECTfilterstringexpansion"
Most data values are expanded before use. Expansion consists of replacing
substrings beginning with &`$`& with other text. The full expansion facilities
available in Exim are extensive. If you want to know everything that Exim can
do with strings, you should consult the chapter on string expansion in the Exim
documentation.

In filter files, by far the most common use of string expansion is the
substitution of the contents of a variable. For example, the substring
.code
$reply_address
.endd
is replaced by the address to which replies to the message should be sent. If
such a variable name is followed by a letter or digit or underscore, it must be
enclosed in curly brackets (braces), for example,
.code
${reply_address}
.endd
If a &`$`& character is actually required in an expanded string, it must be
escaped with a backslash, and because backslash is also an escape character in
quoted input strings, it must be doubled in that case. The following two
examples illustrate two different ways of testing for a &`$`& character in a
message:
.code
if $message_body contains \$ then ...
if $message_body contains "\\$" then ...
.endd
You can prevent part of a string from being expanded by enclosing it between
two occurrences of &`\N`&. For example,
.code
if $message_body contains \N$$$$\N then ...
.endd
tests for a run of four dollar characters.


.section "Some useful general variables"
A complete list of the available variables is given in the Exim documentation.
This shortened list contains the ones that are most likely to be useful in
personal filter files:

&$body_linecount$&: The number of lines in the body of the message.

&$body_zerocount$&: The number of binary zero characters in the body of the
message.

&$home$&: In conventional configurations, this variable normally contains the
user's home directory. The system administrator can, however, change this.

&$local_part$&: The part of the email address that precedes the @ sign &--
normally the user's login name. If support for multiple personal mailboxes is
enabled (see section &<<SECTmbox>>& below) and a prefix or suffix for the local
part was recognized, it is removed from the string in this variable.

&$local_part_prefix$&: If support for multiple personal mailboxes is enabled
(see section &<<SECTmbox>>& below), and a local part prefix was recognized,
this variable contains the prefix. Otherwise it contains an empty string.

&$local_part_suffix$&: If support for multiple personal mailboxes is enabled
(see section &<<SECTmbox>>& below), and a local part suffix was recognized,
this variable contains the suffix. Otherwise it contains an empty string.

&$message_body$&: The initial portion of the body of the message. By default,
up to 500 characters are read into this variable, but the system administrator
can configure this to some other value. Newlines in the body are converted into
single spaces.

&$message_body_end$&: The final portion of the body of the message, formatted
and limited in the same way as &$message_body$&.

&$message_body_size$&: The size of the body of the message, in bytes.

&$message_exim_id$&: The message's local identification string, which is unique
for each message handled by a single host.

&$message_headers$&: The header lines of the message, concatenated into a
single string, with newline characters between them.

&$message_size$&: The size of the entire message, in bytes.

&$original_local_part$&: When an address that arrived with the message is
being processed, this contains the same value as the variable &$local_part$&.
However, if an address generated by an alias, forward, or filter file is being
processed, this variable contains the local part of the original address.

&$reply_address$&: The contents of the &'Reply-to:'& header, if the message
has one; otherwise the contents of the &'From:'& header. It is the address to
which normal replies to the message should be sent.

&$return_path$&: The return path &-- that is, the sender field that will be
transmitted as part of the message's envelope if the message is sent to another
host. This is the address to which delivery errors are sent. In many cases,
this variable has the same value as &$sender_address$&, but if, for example,
an incoming message to a mailing list has been expanded, &$return_path$& may
have been changed to contain the address of the list maintainer.

&$sender_address$&: The sender address that was received in the envelope of
the message. This is not necessarily the same as the contents of the &'From:'&
or &'Sender:'& header lines. For delivery error messages (&"bounce messages"&)
there is no sender address, and this variable is empty.

&$tod_full$&: A full version of the time and date, for example: Wed, 18 Oct
1995 09:51:40 +0100. The timezone is always given as a numerical offset from
GMT.

&$tod_log$&: The time and date in the format used for writing Exim's log files,
without the timezone, for example: 1995-10-12 15:32:29.

&$tod_zone$&: The local timezone offset, for example: +0100.
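
When a filter is being tested with &%-bf%&, printing a few of these variables
can be a quick way to see what they contain for a given test message. The
following is only a sketch: the size threshold and the folder name are
arbitrary, and the &(testprint)& command and the numeric test are described
later in this chapter.
.code
# Exim filter
testprint "sender: $sender_address"
testprint "reply to: $reply_address"
testprint "size: $message_size bytes, $body_linecount body lines"
# Put large messages in a separate folder
if $message_size is above 100000 then
  save $home/mail/large
endif
.endd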



.section "Header variables" "SECTheadervariables"
There is a special set of expansion variables containing the header lines of
the message being processed. These variables have names beginning with
&$header_$& followed by the name of the header line, terminated by a colon.
For example,
.code
$header_from:
$header_subject:
.endd
The whole item, including the terminating colon, is replaced by the contents of
the message header line. If there is more than one header line with the same
name, their contents are concatenated. For header lines whose data consists of
a list of addresses (for example, &'From:'& and &'To:'&), a comma and newline
is inserted between each set of data. For all other header lines, just a
newline is used.

Leading and trailing white space is removed from header line data, and if there
are any MIME &"words"& that are encoded as defined by RFC 2047 (because they
contain non-ASCII characters), they are decoded and translated, if possible, to
a local character set. Translation is attempted only on operating systems that
have the &[iconv()]& function. This makes the header line look the same as it
would when displayed by an MUA. The default character set is ISO-8859-1, but
this can be changed by means of the &(headers)& command (see below).

If you want to see the actual characters that make up a header line, you can
specify &$rheader_$& instead of &$header_$&. This inserts the &"raw"&
header line, unmodified.

There is also an intermediate form, requested by &$bheader_$&, which removes
leading and trailing space and decodes MIME &"words"&, but does not do any
character translation. If an attempt to decode what looks superficially like a
MIME &"word"& fails, the raw string is returned. If decoding produces a binary
zero character, it is replaced by a question mark.

The capitalization of the name following &$header_$& is not significant.
Because any printing character except colon may appear in the name of a
message's header (this is a requirement of RFC 2822, the document that
describes the format of a mail message) curly brackets must &'not'& be used in
this case, as they will be taken as part of the header name. Two shortcuts are
allowed in naming header variables:

.ilist
The initiating &$header_$&, &$rheader_$&, or &$bheader_$& can be
abbreviated to &$h_$&, &$rh_$&, or &$bh_$&, respectively.
.next
The terminating colon can be omitted if the next character is white space. The
white space character is retained in the expanded string. However, this is not
recommended, because it makes it easy to forget the colon when it really is
needed.
.endlist

If the message does not contain a header of the given name, an empty string is
substituted. Thus it is important to spell the names of headers correctly. Do
not use &$header_Reply_to$& when you really mean &$header_Reply-to$&.
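
The three forms of header variable can be compared with a sketch such as the
following, which is intended for use with the &%-bf%& testing option. The
subject text and the folder name are hypothetical, and the &(testprint)&
command is described later in this chapter.
.code
# Exim filter
# Decoded and translated form, abbreviated name, explicit colon
if $h_subject: contains "urgent" then
  save $home/mail/urgent
endif
# Raw, unmodified header line
testprint "raw subject: $rh_subject:"
# MIME words decoded, but no character translation
testprint "decoded subject: $bh_subject:"
# The header name contains a hyphen, not an underscore
testprint "reply-to: $h_reply-to:"
.endd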


.section "User variables"
There are ten user variables with names &$n0$& &-- &$n9$& that can be
incremented by the &(add)& command (see section &<<SECTadd>>&). These can be
used for &"scoring"& messages in various ways. If Exim is configured to run a
&"system filter"& on every message, the values left in these variables are
copied into the variables &$sn0$& &-- &$sn9$& at the end of the system filter,
thus making them available to users' filter files. How these values are used is
entirely up to the individual installation.
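
For example, on a hypothetical installation whose system filter leaves a score
in &$n1$&, a user's filter could act on the copied value in &$sn1$& as sketched
below. The meaning of the score, the threshold, and the folder name are all
assumptions made for the sake of illustration; the numeric test is described
later in this chapter.
.code
# Exim filter
# sn1 contains whatever the system filter left in its n1 variable
if $sn1 is above 5 then
  save $home/mail/suspect
endif
.endd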


.section "Current directory"
The contents of your filter file should not make any assumptions about the
current directory. It is best to use absolute paths for file names; you can
normally make use of the &$home$& variable to refer to your home directory. The
&(save)& command automatically inserts &$home$& at the start of non-absolute
paths.
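
A short sketch of the difference follows; the folder name is hypothetical, and
the script name is the one used in the examples elsewhere in this document.
.code
# Exim filter
# save: a non-absolute path has $home inserted at its start, so this
# refers to mail/archive within your home directory
unseen save mail/archive
# pipe: no such adjustment is described, so give the full path
pipe $home/bin/mymailscript
.endd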




.section "Significant deliveries" "SECTsigdel"
When in the course of delivery a message is processed by a filter file, what
happens next, that is, after the filter file has been processed, depends on
whether or not the filter sets up any &'significant deliveries'&. If at least
one significant delivery is set up, the filter is considered to have handled
the entire delivery arrangements for the current address, and no further
processing of the address takes place. If, however, no significant deliveries
are set up, Exim continues processing the current address as if there were no
filter file, and typically sets up a delivery of a copy of the message into a
local mailbox. In particular, this happens in the special case of a filter file
containing only comments.

The delivery commands &(deliver)&, &(save)&, and &(pipe)& are by default
significant. However, if such a command is preceded by the word &"unseen"&, its
delivery is not considered to be significant. In contrast, other commands such
as &(mail)& and &(vacation)& do not set up significant deliveries unless
preceded by the word &"seen"&. The following example commands set up
significant deliveries:
.code
deliver jack@???
pipe $home/bin/mymailscript
seen mail subject "message discarded"
seen finish
.endd
The following example commands do not set up significant deliveries:
.code
unseen deliver jack@???
unseen pipe $home/bin/mymailscript
mail subject "message discarded"
finish
.endd



.section "Filter commands"
The filter commands that are described in subsequent sections are listed
below, with the section in which they are described in brackets:

.table2
.row &(add)&        "&~&~increment a user variable (section &<<SECTadd>>&)"
.row &(deliver)&    "&~&~deliver to an email address (section &<<SECTdeliver>>&)"
.row &(fail)&       "&~&~force delivery failure (sysadmin use) (section &<<SECTfail>>&)"
.row &(finish)&     "&~&~end processing (section &<<SECTfinish>>&)"
.row &(freeze)&     "&~&~freeze message (sysadmin use) (section &<<SECTfreeze>>&)"
.row &(headers)&    "&~&~set the header character set (section &<<SECTheaders>>&)"
.row &(if)&         "&~&~test condition(s) (section &<<SECTif>>&)"
.row &(logfile)&    "&~&~define log file (section &<<SECTlog>>&)"
.row &(logwrite)&   "&~&~write to log file (section &<<SECTlog>>&)"
.row &(mail)&       "&~&~send a reply message (section &<<SECTmail>>&)"
.row &(pipe)&       "&~&~pipe to a command (section &<<SECTpipe>>&)"
.row &(save)&       "&~&~save to a file (section &<<SECTsave>>&)"
.row &(testprint)&  "&~&~print while testing (section &<<SECTtestprint>>&)"
.row &(vacation)&   "&~&~tailored form of &(mail)& (section &<<SECTmail>>&)"
.endtable


The &(headers)& command has additional parameters that can be used only in a
system filter. The &(fail)& and &(freeze)& commands are available only when
Exim's filtering facilities are being used as a system filter, and are
therefore usable only by the system administrator and not by ordinary users.
They are mentioned only briefly in this document; for more information, see the
main Exim specification.



.section "The add command" "SECTadd"
.display
&`     add `&<&'number'&>&` to `&<&'user variable'&>
&`e.g. add 2 to n3`&
.endd


There are 10 user variables of this type, with names &$n0$& &-- &$n9$&. Their
values can be obtained by the normal expansion syntax (for example &$n3$&) in
other commands. At the start of filtering, these variables all contain zero.
Both arguments of the &(add)& command are expanded before use, making it
possible to add variables to each other. Subtraction can be obtained by adding
negative numbers.
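
As a brief illustration of these points, the following sketch counts in
&$n0$&, adds one variable to another, and then subtracts by adding a negative
number. The choice of variables is arbitrary:
.code
add 1 to n0
add $n0 to n1
add -1 to n1
.endd
If these are the first commands obeyed in a filter, &$n0$& contains 1 and
&$n1$& is back to zero afterwards.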



.section "The deliver command" "SECTdeliver"
.display
&`     deliver`& <&'mail address'&>
&`e.g. deliver "Dr Livingstone <David@???>"`&
.endd


This command provides a forwarding operation. The delivery that it sets up is
significant unless the command is preceded by &"unseen"& (see section
&<<SECTsigdel>>&). The message is sent on to the given address, exactly as
would happen if the address had appeared in a traditional &_.forward_& file. If you
want to deliver the message to a number of different addresses, you can use
more than one &(deliver)& command (each one may have only one address).
However, duplicate addresses are discarded.

To deliver a copy of the message to your normal mailbox, your login name can be
given as the address. Once an address has been processed by the filtering
mechanism, an identical generated address will not be so processed again, so
doing this does not cause a loop.

However, if you have a mail alias, you should &'not'& refer to it here. For
example, if the mail address &'L.Gulliver'& is aliased to &'lg303'& then all
references in Gulliver's &_.forward_& file should be to &'lg303'&. A reference
to the alias will not work for messages that are addressed to that alias,
since, like &_.forward_& file processing, aliasing is performed only once on an
address, in order to avoid looping.
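
A minimal sketch of forwarding a copy while keeping the message in the normal
mailbox, for the user &'lg303'& of the example above. The forwarding address is
purely hypothetical:
.code
unseen deliver gulliver@brobdingnag.example
deliver lg303
.endd
The first command forwards a copy without setting up a significant delivery;
the second delivers to the normal mailbox by giving the login name rather than
the alias &'L.Gulliver'&.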

Following the new address, an optional second address, preceded by
&"errors_to"& may appear. This changes the address to which delivery errors on
the forwarded message will be sent. Instead of going to the message's original
sender, they go to this new address. For ordinary users, the only value that is
permitted for this address is the user whose filter file is being processed.
For example, the user &'lg303'& whose mailbox is in the domain
&'lilliput.example'& could have a filter file that contains
.code
deliver jon@??? errors_to lg303@???
.endd
Clearly, using this feature makes sense only in situations where not all
messages are being forwarded. In particular, bounce messages must not be
forwarded in this way, as this is likely to create a mail loop if something
goes wrong.



.section "The save command" "SECTsave"
.display
&`     save `&<&'file name'&>
&`e.g. save $home/mail/bookfolder`&
.endd


This command specifies that a copy of the message is to be appended to the
given file (that is, the file is to be used as a mail folder). The delivery
that &(save)& sets up is significant unless the command is preceded by
&"unseen"& (see section &<<SECTsigdel>>&).

More than one &(save)& command may be obeyed; each one causes a copy of the
message to be written to its argument file, provided they are different
(duplicate &(save)& commands are ignored).

If the file name does not start with a / character, the contents of the
&$home$& variable are prepended, unless it is empty. In conventional
configurations, this variable is normally set in a user filter to the user's
home directory, but the system administrator may set it to some other path. In
some configurations, &$home$& may be unset, in which case a non-absolute path
name may be generated. Such configurations convert this to an absolute path
when the delivery takes place. In a system filter, &$home$& is never set.
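
For instance, in a user filter where &$home$& is set to the user's home
directory, the two commands below name the same file. The folder name is only
an illustration:
.code
save mail/lists
save $home/mail/lists
.endd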

The user must of course have permission to write to the file, and the writing
of the file takes place in a process that is running as the user, under the
user's primary group. Any secondary groups to which the user may belong are not
normally taken into account, though the system administrator can configure Exim
to set them up. In addition, the ability to use this command at all is
controlled by the system administrator &-- it may be forbidden on some systems.

An optional mode value may be given after the file name. The value for the mode
is interpreted as an octal number, even if it does not begin with a zero. For
example:
.code
save /some/folder 640
.endd
This makes it possible for users to override the system-wide mode setting for
file deliveries, which is normally 600. If an existing file does not have the
correct mode, it is changed.

An alternative form of delivery may be enabled on your system, in which each
message is delivered into a new file in a given directory. If this is the case,
this functionality can be requested by giving the directory name terminated by
a slash after the &(save)& command, for example
.code
save separated/messages/
.endd
There are several different formats for such deliveries; check with your system
administrator or local documentation to find out which (if any) are available
on your system. If this functionality is not enabled, the use of a path name
ending in a slash causes an error.



.section "The pipe command" "SECTpipe"
.display
&`     pipe `&<&'command'&>
&`e.g. pipe "$home/bin/countmail $sender_address"`&
.endd


This command specifies that the message is to be delivered to the specified
command using a pipe. The delivery that it sets up is significant unless the
command is preceded by &"unseen"& (see section &<<SECTsigdel>>&). Remember,
however, that no deliveries are done while the filter is being processed. All
deliveries happen later on. Therefore, the result of running the pipe is not
available to the filter.

When the deliveries are done, a separate process is run, and a copy of the
message is passed on its standard input. The process runs as the user, under
the user's primary group. Any secondary groups to which the user may belong are
not normally taken into account, though the system administrator can configure
Exim to set them up. More than one &(pipe)& command may appear; each one causes
a copy of the message to be written to its argument pipe, provided they are
different (duplicate &(pipe)& commands are ignored).

When the time comes to transport the message, the command supplied to &(pipe)&
is split up by Exim into a command name and a number of arguments. These are
delimited by white space except for arguments enclosed in double quotes, in
which case backslash is interpreted as an escape, or in single quotes, in which
case no escaping is recognized. Note that as the whole command is normally
supplied in double quotes, a second level of quoting is required for internal
double quotes. For example:
.code
pipe "$home/myscript \"size is $message_size\""
.endd
String expansion is performed on the separate components after the line has
been split up, and the command is then run directly by Exim; it is not run
under a shell. Therefore, substitution cannot change the number of arguments,
nor can quotes, backslashes or other shell metacharacters in variables cause
confusion.

Documentation for some programs that are normally run via this kind of pipe
often suggests that the command should start with
.code
IFS=" "
.endd
This is a shell command, and should &'not'& be present in Exim filter files,
since it does not normally run the command under a shell.

However, there is an option that the administrator can set to cause a shell to
be used. In this case, the entire command is expanded as a single string and
passed to the shell for interpretation. It is recommended that this be avoided
if at all possible, since it can lead to problems when inserted variables
contain shell metacharacters.

The default PATH set up for the command is determined by the system
administrator, usually containing at least &_/bin_& and &_/usr/bin_& so that
common commands are available without having to specify an absolute file name.
However, it is possible for the system administrator to restrict the pipe
facility so that the command name must not contain any / characters, and must
be found in one of the directories in the configured PATH. It is also possible
for the system administrator to lock out the use of the &(pipe)& command
altogether.

When the command is run, a number of environment variables are set up. The
complete list for pipe deliveries may be found in the Exim reference manual.
Those that may be useful for pipe deliveries from user filter files are:

.display
&`DOMAIN            `&   the domain of the address
&`HOME              `&   your home directory
&`LOCAL_PART        `&   see below
&`LOCAL_PART_PREFIX `&   see below
&`LOCAL_PART_SUFFIX `&   see below
&`LOGNAME           `&   your login name
&`MESSAGE_ID        `&   the unique id of the message
&`PATH              `&   the command search path
&`RECIPIENT         `&   the complete recipient address
&`SENDER            `&   the sender of the message
&`SHELL             `&   &`/bin/sh`&
&`USER              `&   see below
.endd


LOCAL_PART, LOGNAME, and USER are all set to the same value, namely, your login
id. LOCAL_PART_PREFIX and LOCAL_PART_SUFFIX may be set if Exim is configured to
recognize prefixes or suffixes in the local parts of addresses. For example, a
message addressed to &'pat-suf2@???'& may cause the filter for user
&'pat'& to be run. If this sets up a pipe delivery, LOCAL_PART_SUFFIX is
&`-suf2`& when the pipe command runs. The system administrator has to configure
Exim specially for this feature to be available.

If you run a command that is a shell script, be very careful in your use of
data from the incoming message in the commands in your script. RFC 2822 is very
generous in the characters that are permitted to appear in mail addresses, and
in particular, an address may begin with a vertical bar or a slash. For this
reason you should always use quotes round any arguments that involve data from
the message, like this:
.code
/some/command '$SENDER'
.endd
so that inserted shell meta-characters do not cause unwanted effects.

Remember that, as was explained earlier, the pipe command is not run at the
time the filter file is interpreted. The filter just defines what deliveries
are required for one particular addressee of a message. The deliveries
themselves happen later, once Exim has decided everything that needs to be done
for the message.

A consequence of this is that you cannot inspect the return code from the pipe
command from within the filter. Nevertheless, the code returned by the command
is important, because Exim uses it to decide whether the delivery has succeeded
or failed.

The command should return a zero completion code if all has gone well. Most
non-zero codes are treated by Exim as indicating a failure of the pipe. This is
treated as a delivery failure, causing the message to be returned to its
sender. However, there are some completion codes that are treated as temporary
errors. The message remains on Exim's spool disk, and the delivery is tried
again later, though it will ultimately time out if the delivery failures go on
too long. The completion codes to which this applies can be specified by the
system administrator; the default values are 73 and 75.

The pipe command should not normally write anything to its standard output or
standard error file descriptors. If it does, whatever is written is normally
returned to the sender of the message as a delivery error, though this action
can be varied by the system administrator.



.section "Mail commands" "SECTmail"
There are two commands that cause the creation of a new mail message, neither
of which counts as a significant delivery unless the command is preceded by the
word &"seen"& (see section &<<SECTsigdel>>&). This is a powerful facility, but
it should be used with care, because of the danger of creating infinite
sequences of messages. The system administrator can forbid the use of these
commands altogether.

To help prevent runaway message sequences, these commands have no effect when
the incoming message is a bounce (delivery error) message, and messages sent by
this means are treated as if they were reporting delivery errors. Thus, they
should never themselves cause a bounce message to be returned. The basic
mail-sending command is
.display
&`mail [to `&<&'address-list'&>&`]`&
&`     [cc `&<&'address-list'&>&`]`&
&`     [bcc `&<&'address-list'&>&`]`&
&`     [from `&<&'address'&>&`]`&
&`     [reply_to `&<&'address'&>&`]`&
&`     [subject `&<&'text'&>&`]`&
&`     [extra_headers `&<&'text'&>&`]`&
&`     [text `&<&'text'&>&`]`&
&`     [[expand] file `&<&'filename'&>&`]`&
&`     [return message]`&
&`     [log `&<&'log file name'&>&`]`&
&`     [once `&<&'note file name'&>&`]`&
&`     [once_repeat `&<&'time interval'&>&`]`&


&`e.g. mail text "Your message about $h_subject: has been received"`&
.endd
Each <&'address-list'&> can contain a number of addresses, separated by commas,
in the format of a &'To:'& or &'Cc:'& header line. In fact, the text you supply
here is copied exactly into the appropriate header line. It may contain
additional information as well as email addresses. For example:
.code
mail to "Julius Caesar <jc@???>, \
         <ma@???> (Mark A.)"
.endd
Similarly, the texts supplied for &%from%& and &%reply_to%& are copied into
their respective header lines.


As a convenience for use in one common case, there is also a command called
&(vacation)&. It behaves in the same way as &(mail)&, except that the defaults
for the &%subject%&, &%file%&, &%log%&, &%once%&, and &%once_repeat%& options
are
.code
subject "On vacation"
expand file .vacation.msg
log .vacation.log
once .vacation
once_repeat 7d
.endd
respectively. These are the same file names and repeat period used by the
traditional Unix &(vacation)& command. The defaults can be overridden by
explicit settings, but if a file name is given its contents are expanded only
if explicitly requested.

&*Warning*&: The &(vacation)& command should always be used conditionally,
subject to at least the &(personal)& condition (see section &<<SECTpersonal>>&
below) so as not to send automatic replies to non-personal messages from
mailing lists or elsewhere. Sending an automatic response to a mailing list or
a mailing list manager is an Internet Sin.
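
A minimal sketch of such a conditional use, relying on the defaults listed
above except for the subject, whose text is only an illustration:
.code
if personal then
  vacation subject "Away until Monday"
endif
.endd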

For both commands, the key/value argument pairs can appear in any order. At
least one of &%text%& or &%file%& must appear (except with &(vacation)&, where
there is a default for &%file%&); if both are present, the text string appears
first in the message. If &%expand%& precedes &%file%&, each line of the file is
subject to string expansion before it is included in the message.

Several lines of text can be supplied to &%text%& by including the escape
sequence &"\n"& in the string wherever a newline is required. If the command is
output during filter file testing, newlines in the text are shown as &"\n"&.

Note that the keyword for creating a &'Reply-To:'& header is &%reply_to%&,
because Exim keywords may contain underscores, but not hyphens. If the &%from%&
keyword is present and the given address does not match the user who owns the
forward file, Exim normally adds a &'Sender:'& header to the message, though it
can be configured not to do this.

The &%extra_headers%& keyword allows you to add custom header lines to the
message. The text supplied must be one or more syntactically valid RFC 2822
header lines. You can use &"\n"& within quoted text to specify newlines between
headers, and also to define continued header lines. For example:
.code
extra_headers "h1: first\nh2: second\n continued\nh3: third"
.endd
No newline should appear at the end of the final header line.

If no &%to%& argument appears, the message is sent to the address in the
&$reply_address$& variable (see section &<<SECTfilterstringexpansion>>& above).
An &'In-Reply-To:'& header is automatically included in the created message,
giving a reference to the message identification of the incoming message.

If &%return message%& is specified, the incoming message that caused the filter
file to be run is added to the end of the message, subject to a maximum size
limitation.

If a log file is specified, a line is added to it for each message sent.

If a &%once%& file is specified, it is used to hold a database for remembering
who has received a message, and no more than one message is ever sent to any
particular address, unless &%once_repeat%& is set. This specifies a time
interval after which another copy of the message is sent. The interval is
specified as a sequence of numbers, each followed by the initial letter of one
of &"seconds"&, &"minutes"&, &"hours"&, &"days"&, or &"weeks"&. For example,
.code
once_repeat 5d4h
.endd
causes a new message to be sent if at least 5 days and 4 hours have elapsed
since the last one was sent. There must be no white space in a time interval.

Commonly, the file name specified for &%once%& is used as the base name for
direct-access (DBM) file operations. There are a number of different DBM
libraries in existence. Some operating systems provide one as a default, but
even in this case a different one may have been used when building Exim. With
some DBM libraries, specifying &%once%& results in two files being created,
with the suffixes &_.dir_& and &_.pag_& being added to the given name. With
some others a single file with the suffix &_.db_& is used, or the name is used
unchanged.

Using a DBM file for implementing the &%once%& feature means that the file
grows as large as necessary. This is not usually a problem, but some system
administrators want to put a limit on it. The facility can be configured not to
use a DBM file, but instead, to use a regular file with a maximum size. The
data in such a file is searched sequentially, and if the file fills up, the
oldest entry is deleted to make way for a new one. This means that some
correspondents may receive a second copy of the message after an unpredictable
interval. Consult your local information to see if your system is configured
this way.

More than one &(mail)& or &(vacation)& command may be obeyed in a single filter
run; they are all honoured, even when they are to the same recipient.



.section "Logging commands" "SECTlog"
A log can be kept of actions taken by a filter file. This facility is normally
available in conventional configurations, but there are some situations where
it might not be. Also, the system administrator may choose to disable it. Check
your local information if in doubt.

Logging takes place while the filter file is being interpreted. It does not
queue up for later like the delivery commands. The reason for this is so that a
log file need be opened only once for several write operations. There are two
commands, neither of which constitutes a significant delivery. The first
defines a file to which logging output is subsequently written:
.display
&`     logfile `&<&'file name'&>
&`e.g. logfile $home/filter.log`&
.endd
The file name must be fully qualified. You can use &$home$&, as in this
example, to refer to your home directory. The file name may optionally be
followed by a mode for the file, which is used if the file has to be created.
For example,
.code
logfile $home/filter.log 0644
.endd
The number is interpreted as octal, even if it does not begin with a zero.
The default for the mode is 600. It is suggested that the &(logfile)& command
normally appear as the first command in a filter file. Once a &(logfile)&
command has been obeyed, the &(logwrite)& command can be used to write to the
log file:
.display
&`     logwrite "`&<&'some text string'&>&`"`&
&`e.g. logwrite "$tod_log $message_id processed"`&
.endd
It is possible to have more than one &(logfile)& command, to specify writing to
different log files in different circumstances. Writing takes place at the end
of the file, and a newline character is added to the end of each string if
there isn't one already there. Newlines can be put in the middle of the string
by using the &"\n"& escape sequence. Lines from simultaneous deliveries may get
interleaved in the file, as there is no interlocking, so you should plan your
logging with this in mind. However, data should not get lost.
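
Putting the two commands together, a filter might begin as follows. This is
only a sketch; the log file name, the folder name, and the logged text are
illustrative:
.code
logfile $home/filter.log 0600
logwrite "$tod_log $message_id from $sender_address"
if $h_subject: contains "urgent" then
  logwrite "$tod_log $message_id saved in urgent folder"
  save $home/mail/urgent
endif
.endd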




.section "The finish command" "SECTfinish"
The command &(finish)&, which has no arguments, causes Exim to stop
interpreting the filter file. This is not a significant action unless preceded
by &"seen"&. A filter file containing only &"seen finish"& is a black hole.


.section "The testprint command" "SECTtestprint"
It is sometimes helpful to be able to print out the values of variables when
testing filter files. The command
.display
&`     testprint `&<&'text'&>
&`e.g. testprint "home=$home reply_address=$reply_address"`&
.endd
does nothing when mail is being delivered. However, when the filtering code is
being tested by means of the &%-bf%& option (see section &<<SECTtesting>>&
above), the value of the string is written to the standard output.



.section "The fail command" "SECTfail"
When Exim's filtering facilities are being used as a system filter, the
&(fail)& command is available, to force delivery failure. Because this command
is normally usable only by the system administrator, and not enabled for use by
ordinary users, it is described in more detail in the main Exim specification
rather than in this document.


.section "The freeze command" "SECTfreeze"
When Exim's filtering facilities are being used as a system filter, the
&(freeze)& command is available, to freeze a message on the queue. Because this
command is normally usable only by the system administrator, and not enabled
for use by ordinary users, it is described in more detail in the main Exim
specification rather than in this document.



.section "The headers command" "SECTheaders"
The &(headers)& command can be used to change the target character set that is
used when translating the contents of encoded header lines for insertion by the
&$header_$& mechanism (see section &<<SECTheadervariables>>& above). The
default can be set in the Exim configuration; if not specified, ISO-8859-1 is
used. The only currently supported format for the &(headers)& command in user
filters is as in this example:
.code
headers charset "UTF-8"
.endd
That is, &(headers)& is followed by the word &"charset"& and then the name of a
character set. This particular example would be useful if you wanted to compare
the contents of a header to a UTF-8 string.

In system filter files, the &(headers)& command can be used to add or remove
header lines from the message. These features are described in the main Exim
specification.



.section "Obeying commands conditionally" "SECTif"
Most of the power of filtering comes from the ability to test conditions and
obey different commands depending on the outcome. The &(if)& command is used to
specify conditional execution, and its general form is
.display
&`if    `&<&'condition'&>
&`then  `&<&'commands'&>
&`elif  `&<&'condition'&>
&`then  `&<&'commands'&>
&`else  `&<&'commands'&>
&`endif`&
.endd
There may be any number of &(elif)& and &(then)& sections (including none) and
the &(else)& section is also optional. Any number of commands, including nested
&(if)& commands, may appear in any of the <&'commands'&> sections.
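
As a sketch of the general form, the following command files messages
according to their subject. The strings and folder names are invented for
illustration:
.code
if $h_subject: contains "invoice" then
  save $home/mail/invoices
elif $h_subject: contains "meeting" then
  save $home/mail/meetings
else
  save $home/mail/other
endif
.endd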


Conditions can be combined by using the words &(and)& and &(or)&, and round
brackets (parentheses) can be used to specify how several conditions are to
combine. Without brackets, &(and)& is more binding than &(or)&. For example:
.code
if
$h_subject: contains "Make money" or
$h_precedence: is "junk" or
($h_sender: matches ^\\d{8}@ and not personal) or
$message_body contains "this is not spam"
then
seen finish
endif
.endd
A condition can be preceded by &(not)& to negate it, and there are also some
negative forms of condition that are more English-like.



.section "String testing conditions"
There are a number of conditions that operate on text strings, using the words
&"begins"&, &"ends"&, &"is"&, &"contains"& and &"matches"&. If you want to
apply the same test to more than one header line, you can easily concatenate
them into a single string for testing, as in this example:
.code
if "$h_to:, $h_cc:" contains me@??? then ...
.endd
If a string-testing condition name is written in lower case, the testing
of letters is done without regard to case; if it is written in upper case
(for example, &"CONTAINS"&), the case of letters is taken into account.
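
For example, the first test below succeeds for a subject containing
&"urgent"&, &"Urgent"&, or &"URGENT"&, whereas the second succeeds only for the
all-upper-case spelling. The strings are illustrative:
.code
if $h_subject: contains "urgent" then ...
if $h_subject: CONTAINS "URGENT" then ...
.endd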

.display
&`     `&<&'text1'&>&` begins `&<&'text2'&>
&`     `&<&'text1'&>&` does not begin `&<&'text2'&>
&`e.g. $header_from: begins "Friend@"`&
.endd


A &"begins"& test checks for the presence of the second string at the start of
the first, both strings having been expanded.

.display
&`     `&<&'text1'&>&` ends `&<&'text2'&>
&`     `&<&'text1'&>&` does not end `&<&'text2'&>
&`e.g. $header_from: ends "public.com.example"`&
.endd


An &"ends"& test checks for the presence of the second string at the end of
the first, both strings having been expanded.

.display
&`     `&<&'text1'&>&` is `&<&'text2'&>
&`     `&<&'text1'&>&` is not `&<&'text2'&>
&`e.g. $local_part_suffix is "-foo"`&
.endd


An &"is"& test does an exact match between the strings, having first expanded
both strings.

.display
&`     `&<&'text1'&>&` contains `&<&'text2'&>
&`     `&<&'text1'&>&` does not contain `&<&'text2'&>
&`e.g. $header_subject: contains "evolution"`&
.endd


A &"contains"& test does a partial string match, having expanded both strings.

.display
&`     `&<&'text1'&>&` matches `&<&'text2'&>
&`     `&<&'text1'&>&` does not match `&<&'text2'&>
&`e.g. $sender_address matches "(bill|john)@"`&
.endd


For a &"matches"& test, after expansion of both strings, the second one is
interpreted as a regular expression. Exim uses the PCRE regular expression
library, which provides regular expressions that are compatible with Perl.

The match succeeds if the regular expression matches any part of the first
string. If you want a regular expression to match only at the start or end of
the subject string, you must encode that requirement explicitly, using the
&`^`& or &`$`& metacharacters. The above example, which is not so constrained,
matches all these addresses:
.code
bill@???
john@???
spoonbill@???
littlejohn@???
.endd
To match only the first two, you could use this:
.code
if $sender_address matches "^(bill|john)@" then ...
.endd
Care must be taken if you need a backslash in a regular expression, because
backslashes are interpreted as escape characters both by the string expansion
code and by Exim's normal processing of strings in quotes. For example, if you
want to test the sender address for a domain ending in &'.com'& the regular
expression is
.code
\.com$
.endd
The backslash and dollar sign in that expression have to be escaped when used
in a filter command, as otherwise they would be interpreted by the expansion
code. Thus, what you actually write is
.code
if $sender_address matches \\.com\$
.endd
An alternative way of handling this is to make use of the &`\N`& expansion
flag for suppressing expansion:
.code
if $sender_address matches \N\.com$\N
.endd
Everything between the two occurrences of &`\N`& is copied without change by
the string expander (and in fact you do not need the final one, because it is
at the end of the string). If the regular expression is given in quotes
(mandatory only if it contains white space) you have to write either
.code
if $sender_address matches "\\\\.com\\$"
.endd
or
.code
if $sender_address matches "\\N\\.com$\\N"
.endd

If the regular expression contains bracketed sub-expressions, numeric
variable substitutions such as &$1$& can be used in the subsequent actions
after a successful match. If the match fails, the values of the numeric
variables remain unchanged. Previous values are not restored after &(endif)&.
In other words, only one set of values is ever available. If the condition
contains several sub-conditions connected by &(and)& or &(or)&, it is the
strings extracted from the last successful match that are available in
subsequent actions. Numeric variables from any one sub-condition are also
available for use in subsequent sub-conditions, because string expansion of a
condition occurs just before it is tested.
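
As an illustration, the following sketch builds on the earlier &"matches"&
example and uses &$1$& to choose a folder. The folder layout is hypothetical:
.code
if $sender_address matches "^(bill|john)@" then
  save $home/mail/$1
endif
.endd
Because the lower-case &"matches"& ignores the case of letters, &$1$& holds the
matched name as it was actually written in the sender address.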


.section "Numeric testing conditions"
The following conditions are available for performing numerical tests:

.display
&`     `&<&'number1'&>&` is above `&<&'number2'&>
&`     `&<&'number1'&>&` is not above `&<&'number2'&>
&`     `&<&'number1'&>&` is below `&<&'number2'&>
&`     `&<&'number1'&>&` is not below `&<&'number2'&>
&`e.g. $message_size is not above 10k`&
.endd


The <&'number'&> arguments must expand to strings of digits, optionally
followed by one of the letters K or M (upper case or lower case) which cause
multiplication by 1024 and 1024x1024 respectively.
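
For example, assuming a hypothetical folder for large messages:
.code
if $message_size is above 5M then
  save $home/mail/large
endif
.endd
Here 5M stands for 5x1024x1024, that is, 5242880.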


.section "Testing for significant deliveries"
You can use the &(delivered)& condition to test whether or not any previously
obeyed filter commands have set up a significant delivery. For example:
.code
if not delivered then save mail/anomalous endif
.endd
&"Delivered"& is perhaps a poor choice of name for this condition, because the
message has not actually been delivered; rather, a delivery has been set up for
later processing.


.section "Testing for error messages"
The condition &(error_message)& is true if the incoming message is a bounce
(mail delivery error) message. Putting the command
.code
if error_message then finish endif
.endd
at the head of your filter file is a useful insurance against things going
wrong in such a way that you cannot receive delivery error reports. &*Note*&:
&(error_message)& is a condition, not an expansion variable, and therefore is
not preceded by &`$`&.


.section "Testing a list of addresses"
There is a facility for looping through a list of addresses and applying a
condition to each of them. It takes the form
.display
&`foranyaddress `&<&'string'&>&` (`&<&'condition'&>&`)`&
.endd
where <&'string'&> is interpreted as a list of RFC 2822 addresses, as in a
typical header line, and <&'condition'&> is any valid filter condition or
combination of conditions. The &"group"& syntax that is defined for certain
header lines that contain addresses is supported.

The parentheses surrounding the condition are mandatory, to delimit it from
possible further sub-conditions of the enclosing &(if)& command. Within the
condition, the expansion variable &$thisaddress$& is set to the non-comment
portion of each of the addresses in the string in turn. For example, if the
string is
.code
B.Simpson <bart@???>, lisa@??? (his sister)
.endd
then &$thisaddress$& would take on the values &`bart@???`& and
&`lisa@???`& in turn.

If there are no valid addresses in the list, the whole condition is false. If
the internal condition is true for any one address, the overall condition is
true and the loop ends. If the internal condition is false for all addresses in
the list, the overall condition is false. This example tests for the presence
of an eight-digit local part in any address in a &'To:'& header:
.code
if foranyaddress $h_to: ( $thisaddress matches ^\\d{8}@ ) then ...
.endd
When the overall condition is true, the value of &$thisaddress$& in the
commands that follow &(then)& is the last value it took on inside the loop. At
the end of the &(if)& command, the value of &$thisaddress$& is reset to what it
was before. It is best to avoid the use of multiple occurrences of
&(foranyaddress)&, nested or otherwise, in a single &(if)& command, if the
value of &$thisaddress$& is to be used afterwards, because it isn't always
clear what the value will be. Nested &(if)& commands should be used instead.

Header lines can be joined together if a check is to be applied to more than
one of them. For example:
.code
if foranyaddress $h_to:,$h_cc: ....
.endd
This scans through the addresses in both the &'To:'& and the &'Cc:'& headers.
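
The loop semantics described in this section can be sketched outside Exim.
What follows is an illustrative Python model, not Exim's implementation: the
function names are hypothetical, the example.com addresses are placeholders,
and Python's standard address parser is assumed to be close enough to Exim's
RFC 2822 parsing for the purpose of the illustration.

```python
import re
from email.utils import getaddresses


def for_any_address(header_value, condition):
    """Model of foranyaddress: return (matched, thisaddress).

    The condition is applied to the non-comment part of each address in
    turn. The loop stops at the first address for which it is true; if no
    address satisfies it (or the list is empty), the result is false.
    """
    for _name, address in getaddresses([header_value]):
        if not address:
            continue  # nothing usable was parsed for this entry
        if condition(address):
            return True, address  # the value seen by commands after "then"
    return False, None


def has_eight_digit_local_part(address):
    # Python counterpart of the filter test: $thisaddress matches ^\d{8}@
    return re.match(r"\d{8}@", address) is not None


matched, this_address = for_any_address(
    "B.Simpson <bart@example.com>, 12345678@example.com (a comment)",
    has_eight_digit_local_part,
)
```

Joining several header lines, as in the last example above, corresponds to
passing the concatenated header values as one comma-separated string.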


.section "Testing for personal mail" "SECTpersonal"
A common requirement is to distinguish between incoming personal mail and mail
from a mailing list, or from a robot or other automatic process (for example, a
bounce message). In particular, this test is normally required for &"vacation
messages"&.

The &(personal)& condition checks that the message is not a bounce message and
that the current user's email address appears in the &'To:'& header. It also
checks that the sender is not the current user or one of a number of common
daemons, and that there are no header lines starting &'List-'& in the message.
Finally, it checks the content of the &'Precedence:'& header line, if there is
one.

You should always use the &(personal)& condition when generating automatic
responses. This example shows the use of &(personal)& in a filter file that is
sending out vacation messages:
.code
if personal then
  mail to $reply_address
  subject "I am on holiday"
  file $home/vacation/message
  once $home/vacation/once
  once_repeat 10d
endif
.endd
It is tempting, when writing commands like the above, to quote the original
subject in the reply. For example:
.code
subject "Re: $h_subject:"
.endd
There is a danger in doing this, however. It may allow a third party to
subscribe you to an opt-in mailing list, provided that the list accepts bounce
messages as subscription confirmations. (Messages sent from filters are always
sent as bounce messages.) Well-managed lists require a non-bounce message to
confirm a subscription, so the danger is relatively small.

If prefixes or suffixes are in use for local parts &-- something which depends
on the configuration of Exim (see section &<<SECTmbox>>& below) &-- the tests
for the current user are done with the full address (including the prefix and
suffix, if any) as well as with the prefix and suffix removed. If the system is
configured to rewrite local parts of mail addresses, for example, to rewrite
&`dag46`& as &`Dirk.Gently`&, the rewritten form of the address is also used in
the tests.



.section "Alias addresses for the personal condition"
It is quite common for people who have mail accounts on a number of different
systems to forward all their mail to one system, and in this case a check for
personal mail should test all their various mail addresses. To allow for this,
the &(personal)& condition keyword can be followed by
.display
&`alias `&<&'address'&>
.endd
any number of times, for example:
.code
if personal alias smith@???
            alias jones@???
then ...
.endd
The alias addresses are treated as alternatives to the current user's email
address when testing the contents of header lines.



.section "Details of the personal condition"
The basic &(personal)& test is roughly equivalent to the following:
.code
not error_message and
$message_headers does not contain "\nList-Id:" and
$message_headers does not contain "\nList-Help:" and
$message_headers does not contain "\nList-Subscribe:" and
$message_headers does not contain "\nList-Unsubscribe:" and
$message_headers does not contain "\nList-Post:" and
$message_headers does not contain "\nList-Owner:" and
$message_headers does not contain "\nList-Archive:" and
(
  "${if def:h_auto-submitted:{present}{absent}}" is "absent" or
  $header_auto-submitted: is "no"
) and
$header_precedence: does not contain "bulk" and
$header_precedence: does not contain "list" and
$header_precedence: does not contain "junk" and
foranyaddress $header_to:
  ( $thisaddress contains "$local_part@$domain" ) and
not foranyaddress $header_from:
  (
  $thisaddress contains "$local_part@$domain" or
  $thisaddress contains "server@" or
  $thisaddress contains "daemon@" or
  $thisaddress contains "root@" or
  $thisaddress contains "listserv@" or
  $thisaddress contains "majordomo@" or
  $thisaddress contains "-request@" or
  $thisaddress matches "^owner-[^@]+@"
  )
.endd
The variable &$local_part$& contains the local part of the mail address of
the user whose filter file is being run &-- it is normally your login id. The
&$domain$& variable contains the mail domain. As explained above, if aliases
or rewriting are defined, or if prefixes or suffixes are in use, the tests for
the current user are also done with alternative addresses.




.section "Testing delivery status"
There are two conditions that are intended mainly for use in system filter
files, but which are available in users' filter files as well. The condition
&(first_delivery)& is true if this is the first process that is attempting to
deliver the message, and false otherwise. This indicator is not reset until the
first delivery process successfully terminates; if there is a crash or a power
failure (for example), the next delivery attempt is also a &"first delivery"&.

In a user filter file &(first_delivery)& will be false if there was previously
an error in the filter, or if a delivery for the user failed owing to, for
example, a quota error, or if forwarding to a remote address was deferred for
some reason.

The condition &(manually_thawed)& is true if the message was &"frozen"& for
some reason, and was subsequently released by the system administrator. It is
unlikely to be of use in users' filter files.


.section "Multiple personal mailboxes" "SECTmbox"
The system administrator can configure Exim so that users can set up variants
on their email addresses and handle them separately. Consult your system
administrator or local documentation to see if this facility is enabled on your
system, and if so, what the details are.

The facility involves the use of a prefix or a suffix on an email address. For
example, all mail addressed to &'lg303-'&<&'something'&> would be the property
of user &'lg303'&, who could determine how it was to be handled, depending on
the value of <&'something'&>.

There are two possible ways in which this can be set up. The first possibility
is the use of multiple &_.forward_& files. In this case, mail to &'lg303-foo'&,
for example, is handled by looking for a file called &_.forward-foo_& in
&'lg303'&'s home directory. If such a file does not exist, delivery fails
and the message is returned to its sender.

The alternative approach is to pass all messages through a single &_.forward_&
file, which must be a filter file so that it can distinguish between the
different cases by referencing the variables &$local_part_prefix$& or
&$local_part_suffix$&, as in the final example in section &<<SECTex>>& below.

It is possible to configure Exim to support both schemes at once. In this case,
a specific &_.forward-foo_& file is first sought; if it is not found, the basic
&_.forward_& file is used.
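
The lookup order for the combined scheme can be modelled as follows. This is
only a sketch in Python under stated assumptions: the function name and the
idea of passing in the set of file names present in the home directory are
illustrative, and the real behaviour depends on how Exim is configured.

```python
def choose_forward_file(existing_files, suffix):
    """Pick the file that handles mail for a suffixed address.

    existing_files: names of files present in the user's home directory
    suffix: the local part suffix, for example "-foo", or "" if none
    """
    if suffix:
        specific = ".forward" + suffix
        if specific in existing_files:
            return specific  # a dedicated .forward-foo file is sought first
    return ".forward"        # otherwise the basic file is used


chosen = choose_forward_file({".forward", ".forward-foo"}, "-foo")
fallback = choose_forward_file({".forward", ".forward-foo"}, "-bar")
```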

The &(personal)& test (see section &<<SECTpersonal>>&) includes prefixes and
suffixes in its checking.



.section "Ignoring delivery errors"
As was explained above, filtering just sets up addresses for delivery &-- no
deliveries are actually done while a filter file is active. If any of the
generated addresses subsequently suffers a delivery failure, an error message
is generated in the normal way. However, if a filter command that sets up a
delivery is preceded by the word &"noerror"&, errors for that delivery,
and any deliveries consequent on it (that is, from alias, forwarding, or
filter files it invokes) are ignored.



.section "Examples of Exim filter commands" "SECTex"
Simple forwarding:

.code
# Exim filter
deliver baggins@???
.endd

Vacation handling using traditional means, assuming that the &_.vacation.msg_&
and other files have been set up in your home directory:

.code
# Exim filter
unseen pipe "/usr/ucb/vacation \"$local_part\""
.endd

Vacation handling inside Exim, having first created a file called
&_.vacation.msg_& in your home directory:

.code
# Exim filter
if personal then vacation endif
.endd

File some messages by subject:

.code
# Exim filter
if $header_subject: contains "empire" or
   $header_subject: contains "foundation"
then
   save $home/mail/f+e
endif
.endd

Save all non-urgent messages by weekday:

.code
# Exim filter
if $header_subject: does not contain "urgent" and
   $tod_full matches "^(...),"
then
   save $home/mail/$1
endif
.endd

Throw away all mail from one site, except from postmaster:

.code
# Exim filter
if $reply_address contains "@spam.site.example" and
   $reply_address does not contain "postmaster@"
then
   seen finish
endif
.endd

Handle multiple personal mailboxes:

.code
# Exim filter
if $local_part_suffix is "-foo"
then
   save $home/mail/foo
elif $local_part_suffix is "-bar"
then
   save $home/mail/bar
endif
.endd


File/diff for spec.xfpt is too large (1451816 bytes > 1024000 bytes)!

  Index: HowItWorks.txt
  ===================================================================
  RCS file: /home/cvs/exim/exim-doc/doc-docbook/HowItWorks.txt,v
  retrieving revision 1.2
  retrieving revision 1.3
  diff -u -r1.2 -r1.3
  --- HowItWorks.txt    10 Nov 2005 12:30:13 -0000    1.2
  +++ HowItWorks.txt    1 Feb 2006 11:01:01 -0000    1.3
  @@ -1,4 +1,4 @@
  -$Cambridge: exim/exim-doc/doc-docbook/HowItWorks.txt,v 1.2 2005/11/10 12:30:13 ph10 Exp $
  +$Cambridge: exim/exim-doc/doc-docbook/HowItWorks.txt,v 1.3 2006/02/01 11:01:01 ph10 Exp $


CREATING THE EXIM DOCUMENTATION

@@ -22,7 +22,7 @@
A demand for a version in "info" format led me to write a Perl script that
converted the SGCAL input into a Texinfo file. Because of the somewhat
restrictive requirements of Texinfo, this script has always needed a lot of
-maintenance, and has never been 100% satisfactory.
+maintenance, and was never totally satisfactory.

The HTML version of the documentation was originally produced from the Texinfo
version, but later I wrote another Perl script that produced it directly from
@@ -54,8 +54,16 @@
Maintaining an XML document by hand editing is a tedious, verbose, and
error-prone process. A number of specialized XML text editors were available,
but all the free ones were at a very primitive stage. I therefore decided to
-keep the master source in AsciiDoc format (described below), from which a
-secondary XML master could be automatically generated.
+keep the master source in AsciiDoc format, from which a secondary XML master
+could be automatically generated.
+
+The first "new" versions of the documents, for the 4.60 release, were generated
+this way. However, there were a number of problems with using AsciiDoc for a
+document as large and as complex as the Exim manual. As a result, I wrote a new
+application called xfpt ("XML From Plain Text") which creates XML from a
+relatively simple and consistent markup language. This application has been
+released for general use, and the master sources for the Exim documentation are
+now in xfpt format.

All the output formats are generated from the XML file. If, in the future, a
better way of maintaining the XML source becomes available, this can be adopted
@@ -64,14 +72,14 @@
adopted without affecting the source maintenance.

A number of issues arose while setting this all up, which are best summed up by
-the statement that a lot of the technology is (in 2005) still very immature. It
+the statement that a lot of the technology is (in 2006) still very immature. It
is probable that trying to do this conversion any earlier would not have been
anywhere near as successful. The main problems that still bother me are
described in the penultimate section of this document.

-The following sections describe the processes by which the AsciiDoc files are
+The following sections describe the processes by which the xfpt files are
transformed into the final output documents. In practice, the details are coded
-into a makefile that specifies the chain of commands for each output format.
+into a Makefile that specifies the chain of commands for each output format.


REQUIRED SOFTWARE
@@ -81,10 +89,9 @@
I am not fully aware of. This is what I know about (version numbers are current
at the time of writing):

-. AsciiDoc 6.0.3
+. xfpt 0.00

- This converts the master source file into a DocBook XML file, using a
- customized AsciiDoc configuration file.
+ This converts the master source file into a DocBook XML file.

. xmlto 0.0.18

  @@ -94,26 +101,27 @@
     things that I have not figured out, to apply the DocBook XSLT stylesheets.


. libxml 1.8.17
- libxml2 2.6.17
- libxslt 1.1.12
+ libxml2 2.6.22
+ libxslt 1.1.15

     These are all installed on my box; I do not know which of libxml or libxml2
     the various scripts are actually using.


-. xsl-stylesheets-1.66.1
+. xsl-stylesheets-1.68.1

     These are the standard DocBook XSL stylesheets.


. fop 0.20.5

     FOP is a processor for "formatted objects". It is written in Java. The fop
  -  command is a shell script that drives it.
  +  command is a shell script that drives it. It is used to generate PostScript
  +  and PDF output.


. w3m 0.5.1

     This is a text-oriented web browser. It is used to produce the Ascii form of
  -  the Exim documentation from a specially-created HTML format. It seems to do a
  -  better job than lynx.
  +  the Exim documentation (spec.txt) from a specially-created HTML format. It
  +  seems to do a better job than lynx.


. docbook2texi (part of docbook2X 0.8.5)

@@ -130,27 +138,8 @@

     This is used to make a set of "info" files from a Texinfo file.


-In addition, there are some locally written Perl scripts. These are described
-below.
-
-
-ASCIIDOC
-
-AsciiDoc (http://www.methods.co.nz/asciidoc/) is a Python script that converts
-an input document in a more-or-less human-readable format into DocBook XML.
-For a document as complex as the Exim specification, the markup is quite
-complex - probably no simpler than the original SGCAL markup - but it is
-definitely easier to work with than XML itself.
-
-AsciiDoc is highly configurable. It comes with a default configuration, but I
-have extended this with an additional configuration file that must be used when
-processing the Exim documents. There is a separate document called AdMarkup.txt
-that describes the markup that is used in these documents. This includes the
-default AsciiDoc markup and the local additions.
-
-The author of AsciiDoc uses the extension .txt for input documents. I find
-this confusing, especially as some of the output files have .txt extensions.
-Therefore, I have used the extension .ascd for the sources.
+In addition, there are a number of locally written Perl scripts. These are
+described below.


   THE MAKEFILE
  @@ -163,13 +152,13 @@
     make spec.pdf


This runs the necessary tools in order to create the file spec.pdf from the
-original source spec.ascd. A number of intermediate files are created during
+original source spec.xfpt. A number of intermediate files are created during
this process, including the master DocBook source, called spec.xml. Of course,
the usual features of "make" ensure that if this already exists and is
up-to-date, it is not needlessly rebuilt.

The "test" series of targets were created so that small tests could easily be
-run fairly quickly, because processing even the shortish filter document takes
+run fairly quickly, because processing even the shortish XML document takes
a bit of time, and processing the main specification takes ages.

Another target is "exim.8". This runs a locally written Perl script called
@@ -180,11 +169,12 @@
There is also a "clean" target that deletes all the generated files.


-CREATING DOCBOOK XML FROM ASCIIDOC
+CREATING DOCBOOK XML FROM XFPT INPUT

-There is a single local AsciiDoc configuration file called MyAsciidoc.conf.
-Using this, one run of the asciidoc command creates a .xml file from a .ascd
-file. When this succeeds, there is no output.
+The small amount of local configuration for xfpt is included at the start of
+the two .xfpt files; there are no separate local xfpt configuration files.
+Running the xfpt command creates a .xml file from a .xfpt file. When this
+succeeds, there is no output.


DOCBOOK PROCESSING
@@ -213,32 +203,28 @@
The Pre-xml script copies a .xml file, making certain changes according to the
options it is given. The currently available options are as follows:

--abstract
-
- This option causes the <abstract> element to be removed from the XML. The
- source abuses the <abstract> element by using it to contain the author's
- address so that it appears on the title page verso in the printed renditions.
- This just gets in the way for the non-PostScript/PDF renditions.
-
-ascii

     This option is used for Ascii output formats. It makes the following
     character replacements:


  -    &8230;    =>  ...       (sic, no #x)
       &#x2019;  =>  '         apostrophe
  -    &#x201C;  =>  "         opening double quote
  -    &#x201D;  =>  "         closing double quote
  -    &#x2013;  =>  -         en dash
  -    &#x2020;  =>  *         dagger
  -    &#x2021;  =>  **        double dagger
  -    &#x00a0;  =>  a space   hard space
  -    &#x00a9;  =>  (c)       copyright
  -
  -  In addition, this option causes quotes to be put round <literal> text items,
  -  and <quote> and </quote> to be replaced by Ascii quote marks. You would think
  -  the stylesheet would cope with the latter, but it seems to generate non-Ascii
  -  characters that w3m then turns into question marks.
  +    &copy;    =>  (c)       copyright
  +    &dagger;  =>  *         dagger
  +    &Dagger;  =>  **        double dagger
  +    &nbsp;    =>  a space   hard space
  +    &ndash;   =>  -         en dash
  +
  +  The apostrophe is specified numerically because that is what xfpt generates
  +  from an Ascii single quote character. Non-Ascii characters that are not in
  +  this list should not be used without thinking about how they might be
  +  converted for the Ascii formats.
  +
  +  In addition to the character replacements, this option causes quotes to be
  +  put round <literal> text items, and <quote> and </quote> to be replaced by
  +  Ascii quote marks. You would think the stylesheet would cope with the latter,
  +  but it seems to generate non-Ascii characters that w3m then turns into
  +  question marks.
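
As an illustration of the character replacements listed for -ascii, here is a
minimal Python sketch. It is not the Pre-xml script (which is written in
Perl); the names ASCII_REPLACEMENTS and to_ascii are hypothetical, and the
sketch covers only the replacement table, not the quoting of <literal> and
<quote> items.

```python
# Replacement table taken from the -ascii option description above.
ASCII_REPLACEMENTS = {
    "&#x2019;": "'",    # apostrophe, as generated by xfpt
    "&copy;": "(c)",    # copyright
    "&dagger;": "*",    # dagger
    "&Dagger;": "**",   # double dagger (entity names are case-sensitive)
    "&nbsp;": " ",      # hard space
    "&ndash;": "-",     # en dash
}


def to_ascii(xml_text):
    """Replace each listed entity with its Ascii stand-in."""
    for entity, replacement in ASCII_REPLACEMENTS.items():
        xml_text = xml_text.replace(entity, replacement)
    return xml_text


sample = to_ascii("Exim&#x2019;s docs &copy; 1995&ndash;2006&Dagger;")
```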


-bookinfo

@@ -259,28 +245,36 @@

-noindex

- Remove the XML to generate a Concept Index and an Options index.
+ Remove the XML to generate a Concept Index and an Options index. The source
+ document has two types of index entry, for a concept and an options index.
+ However, no index is required for the .txt and .texinfo outputs.

-oneindex

     Remove the XML to generate a Concept and an Options Index, and add XML to
  -  generate a single index.
  -
  -The source document has two types of index entry, for a concept and an options
  -index. However, no index is required for the .txt and .texinfo outputs.
  -Furthermore, the only output processor that supports multiple indexes is the
  -processor that produces "formatted objects" for PostScript and PDF output. The
  -HTML processor ignores the XML settings for multiple indexes and just makes one
  -unified index. Specifying two indexes gets you two copies of the same index, so
  -this has to be changed.
  +  generate a single index. The only output processor that supports multiple
  +  indexes is the processor that produces "formatted objects" for PostScript and
  +  PDF output. The HTML processor ignores the XML settings for multiple indexes
  +  and just makes one unified index. Specifying two indexes gets you two copies
  +  of the same index, so this has to be changed.
  +
  +-optbreak
  +
  +  Look for items of the form <option>...</option> and <varname>...</varname> in
  +  ordinary paragraphs, and insert &#x200B; after each underscore in the
  +  enclosed text. The same is done for any word containing four or more upper
  +  case letters (compile-time options in the Exim specification). The character
  +  &#x200B; is a zero-width space. This means that the line may be split after
  +  one of these underscores, but no hyphen is inserted.
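
The -optbreak transformation can be sketched like this. It is a rough Python
model, not the Perl code in Pre-xml: the function names are hypothetical, and
the sketch applies to any text it is given rather than restricting itself to
ordinary paragraphs.

```python
import re

ZWSP = "&#x200B;"  # zero-width space, written as a character reference

# Either a whole <option>/<varname> element, or a single word.
_PATTERN = re.compile(r"<(option|varname)>(.*?)</\1>|\w+", re.DOTALL)


def _break_after_underscores(text):
    return text.replace("_", "_" + ZWSP)


def add_option_breaks(text):
    """Insert a zero-width space after underscores where splitting is allowed."""
    def fix(match):
        if match.group(1):  # an <option> or <varname> element
            tag, body = match.group(1), match.group(2)
            return f"<{tag}>{_break_after_underscores(body)}</{tag}>"
        word = match.group(0)
        # Words with four or more upper case letters (compile-time options)
        if sum(ch.isupper() for ch in word) >= 4:
            return _break_after_underscores(word)
        return word

    return _PATTERN.sub(fix, text)


result = add_option_breaks(
    "Set <option>smtp_accept_max</option> or SUPPORT_TLS, not a_b."
)
```

Handling both cases in one pass means that an upper case name inside an
<option> element is not processed twice.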



CREATING POSTSCRIPT AND PDF

-These two output formats are created in three stages. First, the XML is
-pre-processed. For the filter document, the <bookinfo> element is removed so
-that no title page is generated, but for the main specification, no changes are
-currently made.
+These two output formats are created in three stages, with an additional fourth
+stage for PDF. First, the XML is pre-processed by the Pre-xml script. For the
+filter document, the <bookinfo> element is removed so that no title page is
+generated. For the main specification, the only change is to insert line
+breakpoints via -optbreak.

Second, the xmlto command is used to produce a "formatted objects" (.fo) file.
This process uses the following stylesheets:
@@ -300,10 +294,14 @@
All this apparatus is appallingly heavyweight. The processing is also very slow
in the case of the specification document. However, there should be no errors.

-In the third and final part of the processing, the .fo file that is produced by
-the xmlto command is processed by the fop command to generate either PostScript
-or PDF. This is also very slow, and you get a whole slew of errors, of which
-these are a sample:
+The reference book that saved my life while I was trying to get all this to
+work is "DocBook XSL, The Complete Guide", third edition (2005), by Bob
+Stayton, published by Sagehill Enterprises.
+
+In the third part of the processing, the .fo file that is produced by the xmlto
+command is processed by the fop command to generate either PostScript or PDF.
+This is also very slow, and you get a whole slew of errors, of which these are
+a sample:

     [ERROR] property - "background-position-horizontal" is not implemented yet.


@@ -330,14 +328,39 @@
warnings are repeated many times. Nevertheless, it does eventually produce
usable output, though I have a number of issues with it (see a later section of
this document). Maybe one day there will be a new release of fop that does
-better. Maybe there will be some other means of producing PostScript and PDF
-from DocBook XML. Maybe porcine aeronautics will really happen.
+better (there are now signs - February 2006 - that this may be happening).
+Maybe there will be some other means of producing PostScript and PDF from
+DocBook XML. Maybe porcine aeronautics will really happen.
+
+The PDF file that is produced by this process has one problem: the pages, as
+shown by acroread in its thumbnail display, are numbered sequentially from one
+to the end. Those numbers do not correspond with the page numbers of the body
+of the document, which makes finding a page from the index awkward. There is a
+facility in the PDF format to give pages appropriate "labels", but I cannot
+find a way of persuading fop to generate these. Fortunately, it is possible to
+fix up the PDF to add page labels. I wrote a script called PageLabelPDF which
+does this. They are shown correctly by acroread, but not by GhostScript (gv).
+
+
+THE PAGELABELPDF SCRIPT
+
+This script reads the standard input and writes the standard output. It
+searches for the PDF object that sets data in its "Catalog", and adds
+appropriate information about page labels. The number of front-matter pages
+(those before chapter 1) is hard-wired into this script as 12 because I could
+not find a way of determining it automatically. As the current table of
+contents finishes near the top of the 11th page, there is plenty of room for
+expansion, so it is unlikely to be a problem.
+
+Having added data to the PDF file, the script then finds the xref table at the
+end of the file, and adjusts its entries to allow for the added text. This
+simple processing seems to be enough to generate a new, valid, PDF file.


CREATING HTML

   Only two stages are needed to produce HTML, but the main specification is
  -subsequently postprocessed. The Pre-xml script is called with the -abstract and
  +subsequently postprocessed. The Pre-xml script is called with the -optbreak and
   -oneindex options to preprocess the XML. Then the xmlto command creates the
   HTML output directly. For the specification document, a directory of files is
   created, whereas the filter document is output as a single HTML page. The
  @@ -347,9 +370,13 @@
     (2) MyStyle-html.xsl
     (3) MyStyle.xsl


-The first stylesheet references the chunking or non-chunking standard
+The first stylesheet references the chunking or non-chunking standard DocBook
stylesheet, as appropriate.

+You may see a number of these errors when creating HTML: "Revisionflag on
+unexpected element: literallayout (Assuming block)". They seem to be harmless;
+the output appears to be what is intended.
+
The original HTML that I produced from the SGCAL input had hyperlinks back from
chapter and section titles to the table of contents. These links are not
generated by xmlto. One of the testers pointed out that the lack of these
@@ -387,10 +414,11 @@

CREATING TEXT FILES

-This happens in four stages. The Pre-xml script is called with the -abstract,
--ascii and -noindex options to remove the <abstract> element, convert the input
-to Ascii characters, and to disable the production of an index. Then the xmlto
-command converts the XML to a single HTML document, using these stylesheets:
+This happens in four stages. The Pre-xml script is called with the -ascii,
+-optbreak, and -noindex options to convert the input to Ascii characters,
+insert line break points, and disable the production of an index. Then the
+xmlto command converts the XML to a single HTML document, using these
+stylesheets:

     (1) MyStyle-txt-html.xsl
     (2) MyStyle-html.xsl
  @@ -404,21 +432,25 @@


The w3m command is used with the -dump option to turn the HTML file into Ascii
text, but this contains multiple sequences of blank lines that make it look
-awkward, so, finally, a local Perl script called Tidytxt is used to convert
-sequences of blank lines into a single blank line.
+awkward. Furthermore, chapter and section titles do not stand out very well. A
+local Perl script called Tidytxt is used to post-process the output. First, it
+converts sequences of blank lines into a single blank line. Then it searches
+for chapter and section headings. Each chapter heading is uppercased, and
+preceded by an extra two blank lines and a line of equals characters. An extra
+newline is inserted before each section heading, and they are underlined with
+hyphens.


CREATING INFO FILES

-This process starts with the same Pre-xml call as for text files. The
-<abstract> element is deleted, non-ascii characters in the source are
-transliterated, and the <index> elements are removed. The docbook2texi script
-is then called to convert the XML file into a Texinfo file. However, this is
-not quite enough. The converted file ends up with "conceptindex" and
-"optionindex" items, which are not recognized by the makeinfo command. An
-in-line call to Perl in the Makefile changes these to "cindex" and "findex"
-respectively in the final .texinfo file. Finally, a call of makeinfo creates a
-set of .info files.
+This process starts with the same Pre-xml call as for text files. Non-ascii
+characters in the source are transliterated, and the <index> elements are
+removed. The docbook2texi script is then called to convert the XML file into a
+Texinfo file. However, this is not quite enough. The converted file ends up
+with "conceptindex" and "optionindex" items, which are not recognized by the
+makeinfo command. An in-line call to Perl in the Makefile changes these to
+"cindex" and "findex" respectively in the final .texinfo file. Finally, a call
+of makeinfo creates a set of .info files.

There is one apparently unconfigurable feature of docbook2texi: it does not
seem possible to give it a file name for its output. It chooses a name based on
@@ -431,14 +463,14 @@
CREATING THE MAN PAGE

I wrote a Perl script called x2man to create the exim.8 man page from the
-DocBook XML source. I deliberately did NOT start from the AsciiDoc source,
+DocBook XML source. I deliberately did NOT start from the xfpt source,
because it is the DocBook source that is the "standard". This comment line in
the DocBook source marks the start of the command line options:

     <!-- === Start of command line options === -->


A similar line marks the end. If at some time in the future another way other
-than AsciiDoc is used to maintain the DocBook source, it needs to be capable of
+than xfpt is used to maintain the DocBook source, it needs to be capable of
maintaining these comments.


@@ -448,18 +480,16 @@
in the manner described above. I will describe them here in the hope that in
future some way round them can be found.

  -(1)  Errors in the toolchain
  -
  -     When a whole chain of tools is processing a file, an error somewhere in
  -     the middle is often very hard to debug. For instance, an error in the
  -     AsciiDoc might not show up until an XML processor throws a wobbly because
  +(1)  When a whole chain of tools is processing a file, an error somewhere
  +     in the middle is often very hard to debug. For instance, an error in the
  +     xfpt file might not show up until an XML processor throws a wobbly because
        the generated XML is bad. You have to be able to read XML and figure out
        what generated what. One of the reasons for creating the "test" series of
        targets was to help in checking out these kinds of problem.


   (2)  There is a mechanism in XML for marking parts of the document as
  -     "revised", and I have arranged for AsciiDoc markup to use it. However, at
  -     the moment, the only output format that pays attention to this is the HTML
  +     "revised", and I have arranged for xfpt markup to use it. However, at the
  +     moment, the only output format that pays attention to this is the HTML
        output, which sets a green background. There are therefore no revision
        marks (change bars) in the PostScript, PDF, or text output formats as
        there used to be. (There never were for Texinfo.)
  @@ -502,13 +532,23 @@
   (9)  The fop processor does not support "fi" ligatures, not even if you put the
        appropriate Unicode character into the source by hand.


  -(10) There are no diagrams in the new documentation. This is something I could
  -     work on. The previously-used Aspic command for creating line art from a
  +(10) There are no diagrams in the new documentation. This is something I hope
  +     to work on. The previously used Aspic command for creating line art from a
        textual description can output Encapsulated PostScript or Scalable Vector
        Graphics, which are two standard diagram representations. Aspic could be
        formally released and used to generate output that could be included in at
        least some of the output formats.


  +(11) The use of a "zero-width space" works well as a way of specifying that
  +     Exim option names can be split, without hyphens, over line breaks.
  +     However, when an option is not split, if the line is very "loose", the
  +     zero-width space is expanded, along with other spaces. This is a totally
  +     crazy thing to do, but unfortunately it is suggested by the Unicode
  +     definition of the zero-width space, which says "its presence between two
  +     characters does not prevent increased letter spacing in justification".
  +     It seems that the implementors of fop have understood "letter spacing"
  +     also to include "word spacing". Sigh.
  +
   The consequence of (7), (8), and (9) is that the PostScript/PDF output looks as
   if it comes from some of the very early attempts at text formatting of around
   20 years ago. We can only hope that 20 years' progress is not going to get
  @@ -517,10 +557,9 @@


LIST OF FILES

  -AdMarkup.txt                   Describes the AsciiDoc markup that is used
  +Markup.txt                     Describes the xfpt markup that is used
   HowItWorks.txt                 This document
   Makefile                       The makefile
  -MyAsciidoc.conf                Localized AsciiDoc configuration
   MyStyle-chunk-html.xsl         Stylesheet for chunked HTML output
   MyStyle-filter-fo.xsl          Stylesheet for filter fo output
   MyStyle-fo.xsl                 Stylesheet for any fo output
  @@ -532,17 +571,15 @@
   MyTitleStyle.xsl               Stylesheet for spec title page
   MyTitlepage.templates.xml      Template for creating MyTitleStyle.xsl
   Myhtml.css                     Experimental css stylesheet for HTML output
  +PageLabelPDF                   Script to postprocess PDF
   Pre-xml                        Script to preprocess XML
   TidyHTML-filter                Script to tidy up the filter HTML output
   TidyHTML-spec                  Script to tidy up the spec HTML output
   Tidytxt                        Script to compact multiple blank lines
  -filter.ascd                    AsciiDoc source of the filter document
  -spec.ascd                      AsciiDoc source of the specification document
  +filter.xfpt                    xfpt source of the filter document
  +spec.xfpt                      xfpt source of the specification document
   x2man                          Script to make the Exim man page from the XML


-The file Myhtml.css was an experiment that was not followed through. It is
-mentioned in a comment in MyStyle-html.xsl, but is not at present in use.
-

Philip Hazel
-Last updated: 10 June 2005
+Last updated: 31 January 2006

  Index: Makefile
  ===================================================================
  RCS file: /home/cvs/exim/exim-doc/doc-docbook/Makefile,v
  retrieving revision 1.6
  retrieving revision 1.7
  diff -u -r1.6 -r1.7
  --- Makefile    20 Dec 2005 15:45:02 -0000    1.6
  +++ Makefile    1 Feb 2006 11:01:01 -0000    1.7
  @@ -1,6 +1,6 @@
  -# $Cambridge: exim/exim-doc/doc-docbook/Makefile,v 1.6 2005/12/20 15:45:02 ph10 Exp $
  +# $Cambridge: exim/exim-doc/doc-docbook/Makefile,v 1.7 2006/02/01 11:01:01 ph10 Exp $


-# Make file for Exim documentation from Asciidoc source.
+# Make file for Exim documentation from xfpt source.

   notarget:;    @echo "** You must specify a target, in the form x.y, where x is 'filter', 'spec',"
             @echo "** or 'test', and y is 'xml', 'fo', 'ps', 'pdf', 'html', 'txt', or 'info'."
  @@ -10,7 +10,7 @@


############################## MAN PAGE ################################

  -exim.8: spec.xml
  +exim.8:       spec.xml x2man
             ./x2man


########################################################################
@@ -18,8 +18,8 @@

############################### FILTER #################################

  -filter.xml:   filter.ascd MyAsciidoc.conf
  -          asciidoc -d book -b docbook -f MyAsciidoc.conf filter.ascd
  +filter.xml:   filter.xfpt
  +          xfpt filter.xfpt


   filter-fo.xml: filter.xml Pre-xml
             ./Pre-xml -bookinfo <filter.xml >filter-fo.xml
  @@ -28,7 +28,10 @@
             ./Pre-xml -html <filter.xml >filter-html.xml


   filter-txt.xml: filter.xml Pre-xml
  -          ./Pre-xml -ascii -html <filter.xml >filter-txt.xml
  +          ./Pre-xml -ascii -html -quoteliteral <filter.xml >filter-txt.xml
  +
  +filter-info.xml: filter.xml Pre-xml
  +          ./Pre-xml -ascii -html <filter.xml >filter-info.xml


   filter.fo:    filter-fo.xml MyStyle-filter-fo.xsl MyStyle-fo.xsl MyStyle.xsl
             /bin/rm -rf filter.fo filter-fo.fo
  @@ -48,23 +51,25 @@
             fop filter.fo -pdf filter-tmp.pdf
             mv filter-tmp.pdf filter.pdf


  -filter.html:  filter-html.xml TidyHTML-filter MyStyle-nochunk-html.xsl MyStyle-html.xsl MyStyle.xsl
  +filter.html:  filter-html.xml TidyHTML-filter MyStyle-nochunk-html.xsl \
  +                MyStyle-html.xsl MyStyle.xsl
             /bin/rm -rf filter.html filter-html.html
             xmlto -x MyStyle-nochunk-html.xsl html-nochunks filter-html.xml
             /bin/mv -f filter-html.html filter.html
             ./TidyHTML-filter


  -filter.txt:   filter-txt.xml Tidytxt MyStyle-txt-html.xsl MyStyle-html.xsl MyStyle.xsl
  +filter.txt:   filter-txt.xml Tidytxt MyStyle-txt-html.xsl MyStyle-html.xsl \
  +                MyStyle.xsl
             /bin/rm -rf filter-txt.html
             xmlto -x MyStyle-txt-html.xsl html-nochunks filter-txt.xml
  -          w3m -dump filter-txt.html >filter.txt
  +          w3m -dump filter-txt.html | ./Tidytxt >filter.txt


# I have not found a way of making docbook2texi write its output anywhere
# other than the file name that it makes up. The --to-stdout option does not
# work.

  -filter.info:  filter-txt.xml
  -          docbook2texi filter-txt.xml
  +filter.info:  filter-info.xml
  +          docbook2texi filter-info.xml
             perl -ne 's/conceptindex/cindex/;s/optionindex/findex/;print;' \
           <exim_filtering.texi | Tidytxt >filter.texinfo
             /bin/rm -rf exim_filtering.texi
  @@ -75,19 +80,25 @@


################################ SPEC ##################################

  -spec.xml:     spec.ascd MyAsciidoc.conf
  -          asciidoc -d book -b docbook -f MyAsciidoc.conf spec.ascd
  +spec.xml:     spec.xfpt
  +          xfpt spec.xfpt


   spec-fo.xml:  spec.xml Pre-xml
  -          ./Pre-xml <spec.xml >spec-fo.xml
  +          ./Pre-xml -optbreak <spec.xml >spec-fo.xml


   spec-html.xml: spec.xml Pre-xml
  -          ./Pre-xml -abstract -html -oneindex <spec.xml >spec-html.xml
  +          ./Pre-xml -html -oneindex \
  +        <spec.xml >spec-html.xml


   spec-txt.xml: spec.xml Pre-xml
  -          ./Pre-xml -abstract -ascii -html -noindex <spec.xml >spec-txt.xml
  +          ./Pre-xml -ascii -html -noindex -quoteliteral \
  +        <spec.xml >spec-txt.xml
  +
  +spec-info.xml: spec.xml Pre-xml
  +          ./Pre-xml -ascii -html -noindex <spec.xml >spec-info.xml


  -spec.fo:      spec-fo.xml MyStyle-spec-fo.xsl MyStyle-fo.xsl MyStyle.xsl MyTitleStyle.xsl
  +spec.fo:      spec-fo.xml MyStyle-spec-fo.xsl MyStyle-fo.xsl MyStyle.xsl \
  +              MyTitleStyle.xsl
             /bin/rm -rf spec.fo spec-fo.fo
             xmlto -x MyStyle-spec-fo.xsl fo spec-fo.xml
             /bin/mv -f spec-fo.fo spec.fo
  @@ -99,28 +110,31 @@
             mv spec-tmp.ps spec.ps


# Do not use ps2pdf from the PS version; better PDF is generated directly. It
-# contains cross links etc.
+# contains cross links etc. We post-process it to add page label information
+# so that the page identifiers shown by acroread are the correct page numbers.

  -spec.pdf:     spec.fo
  +spec.pdf:     spec.fo PageLabelPDF
             FOP_OPTS=-Xmx512m fop spec.fo -pdf spec-tmp.pdf
  -          mv spec-tmp.pdf spec.pdf
  +          ./PageLabelPDF <spec-tmp.pdf >spec.pdf


  -spec.html:    spec-html.xml TidyHTML-spec MyStyle-chunk-html.xsl MyStyle-html.xsl MyStyle.xsl
  +spec.html:    spec-html.xml TidyHTML-spec MyStyle-chunk-html.xsl \
  +                MyStyle-html.xsl MyStyle.xsl
             /bin/rm -rf spec.html
             xmlto -x MyStyle-chunk-html.xsl -o spec.html html spec-html.xml
             ./TidyHTML-spec


  -spec.txt:     spec-txt.xml Tidytxt MyStyle-txt-html.xsl MyStyle-html.xsl MyStyle.xsl
  +spec.txt:     spec-txt.xml Tidytxt MyStyle-txt-html.xsl MyStyle-html.xsl \
  +                MyStyle.xsl
             /bin/rm -rf spec-txt.html
             xmlto -x MyStyle-txt-html.xsl html-nochunks spec-txt.xml
  -          w3m -dump spec-txt.html | Tidytxt >spec.txt
  +          w3m -dump spec-txt.html | ./Tidytxt >spec.txt


# I have not found a way of making docbook2texi write its output anywhere
# other than the file name that it makes up. The --to-stdout option does not
# work.

  -spec.info:    spec-txt.xml
  -          docbook2texi spec-txt.xml
  +spec.info:    spec-info.xml
  +          docbook2texi spec-info.xml
             perl -ne 's/conceptindex/cindex/;s/optionindex/findex/;print;' \
           <the_exim_mta.texi >spec.texinfo
             /bin/rm -rf the_exim_mta.texi
  @@ -133,19 +147,24 @@


# These targets (similar to the above) are for running little tests.

  -test.xml:     test.ascd MyAsciidoc.conf
  -          asciidoc -d book -b docbook -f MyAsciidoc.conf test.ascd
  +test.xml:     test.xfpt
  +          xfpt test.xfpt


   test-fo.xml:  test.xml Pre-xml
             ./Pre-xml <test.xml >test-fo.xml


   test-html.xml: test.xml Pre-xml
  -          ./Pre-xml -abstract -html -oneindex <test.xml >test-html.xml
  +          ./Pre-xml -html -oneindex <test.xml >test-html.xml


   test-txt.xml: test.xml Pre-xml
  -          ./Pre-xml -abstract -ascii -html -noindex <test.xml >test-txt.xml
  +          ./Pre-xml -ascii -html -noindex -quoteliteral \
  +        <test.xml >test-txt.xml
  +
  +test-info.xml: test.xml Pre-xml
  +          ./Pre-xml -ascii -html -noindex <test.xml >test-info.xml


  -test.fo:      test-fo.xml MyStyle-spec-fo.xsl MyStyle-fo.xsl MyStyle.xsl MyTitleStyle.xsl
  +test.fo:      test-fo.xml MyStyle-spec-fo.xsl MyStyle-fo.xsl MyStyle.xsl \
  +                MyTitleStyle.xsl
             /bin/rm -rf test.fo test-fo.fo
             xmlto -x MyStyle-spec-fo.xsl fo test-fo.xml
             /bin/mv -f test-fo.fo test.fo
  @@ -163,12 +182,14 @@
             fop test.fo -pdf test-tmp.pdf
             mv test-tmp.pdf test.pdf


  -test.html:    test-html.xml MyStyle-nochunk-html.xsl MyStyle-html.xsl MyStyle.xsl
  +test.html:    test-html.xml MyStyle-nochunk-html.xsl MyStyle-html.xsl \
  +                MyStyle.xsl
             /bin/rm -rf test.html test-html.html
             xmlto -x MyStyle-nochunk-html.xsl html-nochunks test-html.xml
             /bin/mv -f test-html.html test.html


  -test.txt:     test-txt.xml Tidytxt MyStyle-txt-html.xsl MyStyle-html.xsl MyStyle.xsl
  +test.txt:     test-txt.xml Tidytxt MyStyle-txt-html.xsl MyStyle-html.xsl \
  +                MyStyle.xsl
             /bin/rm -rf test-txt.html
             xmlto -x MyStyle-txt-html.xsl html-nochunks test-txt.xml
             w3m -dump test-txt.html | Tidytxt >test.txt
  @@ -177,8 +198,8 @@
   # other than the file name that it makes up. The --to-stdout option does not
   # work.


  -test.info:    test-txt.xml
  -          docbook2texi test-txt.xml
  +test.info:    test-info.xml
  +          docbook2texi test-info.xml
             perl -ne 's/conceptindex/cindex/;s/optionindex/findex/;print;' \
           <short_title.texi >test.texinfo
             /bin/rm -rf short_title.texi


  Index: MyStyle-fo.xsl
  ===================================================================
  RCS file: /home/cvs/exim/exim-doc/doc-docbook/MyStyle-fo.xsl,v
  retrieving revision 1.2
  retrieving revision 1.3
  diff -u -r1.2 -r1.3
  --- MyStyle-fo.xsl    10 Nov 2005 12:30:13 -0000    1.2
  +++ MyStyle-fo.xsl    1 Feb 2006 11:01:01 -0000    1.3
  @@ -1,4 +1,4 @@
  -<!-- $Cambridge: exim/exim-doc/doc-docbook/MyStyle-fo.xsl,v 1.2 2005/11/10 12:30:13 ph10 Exp $ -->
  +<!-- $Cambridge: exim/exim-doc/doc-docbook/MyStyle-fo.xsl,v 1.3 2006/02/01 11:01:01 ph10 Exp $ -->


   <xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                   xmlns:fo="http://www.w3.org/1999/XSL/Format"
  @@ -21,6 +21,10 @@
   <xsl:param name="double.sided" select="1"></xsl:param>
   -->


+<!-- Let's have whatever fop extensions there are -->
+
+<xsl:param name="fop.extensions" select="1"></xsl:param>
+
<!-- Allow for typed index entries. The "role" setting works with DocBook
version 4.2 or earlier. Later versions (which we are not currently using)
need "type". -->
@@ -28,7 +32,6 @@
<xsl:param name="index.on.type" select="1"></xsl:param>
<xsl:param name="index.on.role" select="1"></xsl:param>

-
<!-- The default uses short chapter titles in the TOC! I want them only for
use in footer lines. So we have to modify this template. I changed
"titleabbrev.markup" to "title.markup". While I'm here, I also made chapter
@@ -135,7 +138,6 @@
http://www.sagehill.net/docbookxsl/PrintHeaders.html
-->

  -
   <xsl:attribute-set name="footer.content.properties">
     <!-- <xsl:attribute name="font-family">serif</xsl:attribute> -->
     <!-- <xsl:attribute name="font-size">9pt</xsl:attribute> -->
  @@ -143,40 +145,16 @@
   </xsl:attribute-set>



-<!-- Things that can be inserted into the footer are:
-
-<fo:page-number/>
-Inserts the current page number.
-
-<xsl:apply-templates select="." mode="title.markup"/>
-Inserts the title of the current chapter, appendix, or other component.
-
-<xsl:apply-templates select="." mode="titleabbrev.markup"/>
-Inserts the titleabbrev of the current chapter, appendix, or other component,
-if it is available. Otherwise it inserts the regular title.
-
-<xsl:apply-templates select="." mode="object.title.markup"/>
-Inserts the chapter title with chapter number label. Likewise for appendices.
+<!-- The default cell widths make the centre one too large -->

  -<fo:retrieve-marker ... />      Used to retrieve the current section name.
  +<xsl:param name="footer.column.widths">4 1 4</xsl:param>


-<xsl:apply-templates select="//corpauthor[1]"/>
-Inserts the value of the first corpauthor element found anywhere in the
-document.

-<xsl:call-template name="datetime.format">
- <xsl:with-param ...
-Inserts a date timestamp.
-
-<xsl:call-template name="draft.text"/>
-Inserts the Draft message if draft.mode is currently on.
-
-<fo:external-graphic ... />
-Inserts a graphical image.
-See the section Graphic in header or footer for details.
+<!-- Put the abbreviated chapter titles in running feet, and add the chapter
+number afterwards in parentheses. I changed title.markup to titleabbrev.markup,
+and added some lines.
-->

  -
   <xsl:template name="footer.content">
     <xsl:param name="pageclass" select="''"/>
     <xsl:param name="sequence" select="''"/>
  @@ -206,6 +184,15 @@
           <fo:page-number/>
         </xsl:when>


  +      <!-- This clause added by PH -->
  +      <xsl:when test="$double.sided = 0 and $position='right' and $pageclass='body'">
  +        <xsl:apply-templates select="." mode="titleabbrev.markup"/>
  +          <xsl:text> (</xsl:text>
  +          <xsl:apply-templates select="." mode="label.markup"/>
  +          <xsl:text>)</xsl:text>
  +      </xsl:when>
  +
  +      <!-- Changed title.markup to titleabbrev.markup for TOC -->
         <xsl:when test="$double.sided = 0 and $position='right'">
           <xsl:apply-templates select="." mode="titleabbrev.markup"/>
         </xsl:when>
  @@ -229,6 +216,49 @@
         </xsl:otherwise>
       </xsl:choose>
     </fo:block>
  +</xsl:template>
  +
  +
  +<!-- Arrange for ordered list numbers to be in parentheses instead of just
  +followed by a dot, which I don't like. Unfortunately, this styling is
  +output-specific, so we have to do it separately for FO and HTML output. -->
  +
  +<xsl:template match="orderedlist/listitem" mode="item-number">
  +  <xsl:variable name="numeration">
  +    <xsl:call-template name="list.numeration">
  +      <xsl:with-param name="node" select="parent::orderedlist"/>
  +    </xsl:call-template>
  +  </xsl:variable>
  +
  +  <xsl:variable name="type">
  +    <xsl:choose>
  +      <xsl:when test="$numeration='arabic'">(1)</xsl:when>
  +      <xsl:when test="$numeration='loweralpha'">(a)</xsl:when>
  +      <xsl:when test="$numeration='lowerroman'">(i)</xsl:when>
  +      <xsl:when test="$numeration='upperalpha'">(A)</xsl:when>
  +      <xsl:when test="$numeration='upperroman'">(I)</xsl:when>
  +      <!-- What!? This should never happen -->
  +      <xsl:otherwise>
  +        <xsl:message>
  +          <xsl:text>Unexpected numeration: </xsl:text>
  +          <xsl:value-of select="$numeration"/>
  +        </xsl:message>
  +        <xsl:value-of select="1."/>
  +      </xsl:otherwise>
  +    </xsl:choose>
  +  </xsl:variable>
  +
  +  <xsl:variable name="item-number">
  +    <xsl:call-template name="orderedlist-item-number"/>
  +  </xsl:variable>
  +
  +  <xsl:if test="parent::orderedlist/@inheritnum='inherit'
  +                and ancestor::listitem[parent::orderedlist]">
  +    <xsl:apply-templates select="ancestor::listitem[parent::orderedlist][1]"
  +                         mode="item-number"/>
  +  </xsl:if>
  +
  +  <xsl:number value="$item-number" format="{$type}"/>
   </xsl:template>


</xsl:stylesheet>
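
The item-number template added above hands a format such as "(1)", "(a)" or "(i)" to xsl:number, so that list labels come out in parentheses. As a rough illustration of the labels this is meant to produce, here is a minimal Python sketch. It is not part of the stylesheet, and the function names are hypothetical.

```python
def to_alpha(n: int) -> str:
    """1 -> a, 26 -> z, 27 -> aa (bijective base 26)."""
    letters = ""
    while n > 0:
        n, rem = divmod(n - 1, 26)
        letters = chr(ord("a") + rem) + letters
    return letters


def to_roman(n: int) -> str:
    """Lower-case Roman numeral for a positive integer."""
    pairs = [
        (1000, "m"), (900, "cm"), (500, "d"), (400, "cd"),
        (100, "c"), (90, "xc"), (50, "l"), (40, "xl"),
        (10, "x"), (9, "ix"), (5, "v"), (4, "iv"), (1, "i"),
    ]
    out = []
    for value, numeral in pairs:
        count, n = divmod(n, value)
        out.append(numeral * count)
    return "".join(out)


def item_label(n: int, numeration: str) -> str:
    """Return the parenthesised label for item n in the given numeration."""
    if numeration == "arabic":
        body = str(n)
    elif numeration in ("loweralpha", "upperalpha"):
        body = to_alpha(n)
    elif numeration in ("lowerroman", "upperroman"):
        body = to_roman(n)
    else:
        # Mirrors the "should never happen" branch of the template.
        raise ValueError(f"Unexpected numeration: {numeration}")
    if numeration.startswith("upper"):
        body = body.upper()
    return f"({body})"
```

For example, item_label(4, "lowerroman") gives "(iv)" rather than the default "iv." style.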

  Index: MyStyle-spec-fo.xsl
  ===================================================================
  RCS file: /home/cvs/exim/exim-doc/doc-docbook/MyStyle-spec-fo.xsl,v
  retrieving revision 1.2
  retrieving revision 1.3
  diff -u -r1.2 -r1.3
  --- MyStyle-spec-fo.xsl    5 Aug 2005 10:57:41 -0000    1.2
  +++ MyStyle-spec-fo.xsl    1 Feb 2006 11:01:01 -0000    1.3
  @@ -1,4 +1,4 @@
  -<!-- $Cambridge: exim/exim-doc/doc-docbook/MyStyle-spec-fo.xsl,v 1.2 2005/08/05 10:57:41 ph10 Exp $ -->
  +<!-- $Cambridge: exim/exim-doc/doc-docbook/MyStyle-spec-fo.xsl,v 1.3 2006/02/01 11:01:01 ph10 Exp $ -->


<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version='1.0'>

@@ -12,6 +12,19 @@
<xsl:import href="MyStyle.xsl"/>
<xsl:import href="MyStyle-fo.xsl"/>

  -<!-- Nothing special for the full spec document yet -->
  +<!-- Special for the spec document -->
  +
  +<!-- Arrange for the table of contents to be an even number of pages. The name
  +"lot" includes all pages that contain a "list of titles", which in our case is
  +only the TOC. -->
  +
  +<xsl:template name="force.page.count">
  +  <xsl:param name="element" select="local-name(.)"/>
  +  <xsl:param name="master-reference" select="''"/>
  +  <xsl:choose>
  +    <xsl:when test="$master-reference = 'lot'">end-on-even</xsl:when>
  +    <xsl:otherwise>no-force</xsl:otherwise>
  +  </xsl:choose>
  +</xsl:template>


</xsl:stylesheet>

  Index: MyStyle.xsl
  ===================================================================
  RCS file: /home/cvs/exim/exim-doc/doc-docbook/MyStyle.xsl,v
  retrieving revision 1.2
  retrieving revision 1.3
  diff -u -r1.2 -r1.3
  --- MyStyle.xsl    10 Nov 2005 12:30:13 -0000    1.2
  +++ MyStyle.xsl    1 Feb 2006 11:01:01 -0000    1.3
  @@ -1,4 +1,4 @@
  -<!-- $Cambridge: exim/exim-doc/doc-docbook/MyStyle.xsl,v 1.2 2005/11/10 12:30:13 ph10 Exp $ -->
  +<!-- $Cambridge: exim/exim-doc/doc-docbook/MyStyle.xsl,v 1.3 2006/02/01 11:01:01 ph10 Exp $ -->


<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version='1.0'>

@@ -51,18 +51,11 @@
<xsl:param name="hyphenate">false</xsl:param>


-<!--
-Generate only numbers, no titles, in cross references.
--->
+<!-- Generate only numbers, no titles, in cross references. -->

<xsl:param name="xref.with.number.and.title">0</xsl:param>


-<!-- Hopefully this might do something useful? It doesn't seem to. -->
-
-<xsl:param name="fop.extensions" select="1"></xsl:param>
-
-
<!-- Output variable names in italic rather than the default monospace. -->

<xsl:template match="varname">
@@ -77,6 +70,13 @@
</xsl:template>


+<!-- Output function names in italic rather than the default boldface. -->
+
+<xsl:template match="function">
+ <xsl:call-template name="inline.italicseq"/>
+</xsl:template>
+
+
<!-- Output options in bold rather than the default monospace. -->

   <xsl:template match="option">
  @@ -92,6 +92,12 @@
   <xsl:param name="local.l10n.xml" select="document('')"/>
   <l:i18n xmlns:l="http://docbook.sourceforge.net/xmlns/l10n/1.0">
     <l:l10n language="en">
  +
  +    <!-- Turn the text "Revision History" into nothing, because we only have
  +    the info for the latest revision in the file. -->
  +
  +    <l:gentext key="revhistory" text=""/>
  +    <l:gentext key="RevHistory" text=""/>


       <!-- The default (as modified above) gives us "Chapter xxx" or "Section
       xxx", with a capital letter at the start. So we have to make an more


  Index: MyTitlepage.templates.xml
  ===================================================================
  RCS file: /home/cvs/exim/exim-doc/doc-docbook/MyTitlepage.templates.xml,v
  retrieving revision 1.1
  retrieving revision 1.2
  diff -u -r1.1 -r1.2
  --- MyTitlepage.templates.xml    16 Jun 2005 10:32:31 -0000    1.1
  +++ MyTitlepage.templates.xml    1 Feb 2006 11:01:01 -0000    1.2
  @@ -13,14 +13,14 @@
   <!ENTITY hsize5space "18.6624pt"> <!-- 0.75 * hsize5 -->
   ]>


-<!-- $Cambridge: exim/exim-doc/doc-docbook/MyTitlepage.templates.xml,v 1.1 2005/06/16 10:32:31 ph10 Exp $ -->
+<!-- $Cambridge: exim/exim-doc/doc-docbook/MyTitlepage.templates.xml,v 1.2 2006/02/01 11:01:01 ph10 Exp $ -->

<!-- This document is copied from the DocBook XSL stylesheets, and modified to
do what I want it to do for the Exim reference manual. Process this document
with:

   xsltproc -output MyTitleStyle.xsl \
  -  /usr/share/sgml/docbook/xsl-stylesheets-1.66.1/template/titlepage.xsl \
  +  /usr/share/sgml/docbook/xsl-stylesheets-1.68.1/template/titlepage.xsl \
     MyTitlepage.templates.xml


   in order to generate a style sheet called MyTitleStyle.xsl. That is then
  @@ -52,7 +52,7 @@
                param:node="ancestor-or-self::book[1]"
                text-align="center"
                font-size="&hsize5;"
  -             space-before="&hsize5space;"
  +             space-before="5em"
                font-weight="bold"
                font-family="{$title.fontset}"/>
         <subtitle
  @@ -67,6 +67,7 @@
         <author font-size="&hsize3;"
                 space-before="&hsize2space;"
                 keep-with-next="always"/>
  +      <affiliation space-before="1em"/>
       </t:titlepage-content>


     <t:titlepage-content t:side="verso">
  @@ -77,11 +78,13 @@
                font-family="{$title.fontset}"/>
         <corpauthor/>
         <authorgroup t:named-template="verso.authorgroup"/>
  -      <author/>
  -      <othercredit/>
  +      <author space-before="1em"/>
  +      <affiliation space-before="1em"/>
  +      <address/>
         <pubdate space-before="1em"/>
         <abstract/>
  -      <copyright/>
  +      <copyright space-before="1em"/>
  +      <revhistory space-before="1em"/>
         <legalnotice font-size="8pt"/>
     </t:titlepage-content>



  Index: Pre-xml
  ===================================================================
  RCS file: /home/cvs/exim/exim-doc/doc-docbook/Pre-xml,v
  retrieving revision 1.2
  retrieving revision 1.3
  diff -u -r1.2 -r1.3
  --- Pre-xml    10 Nov 2005 12:30:13 -0000    1.2
  +++ Pre-xml    1 Feb 2006 11:01:01 -0000    1.3
  @@ -1,6 +1,6 @@
   #! /usr/bin/perl


-# $Cambridge: exim/exim-doc/doc-docbook/Pre-xml,v 1.2 2005/11/10 12:30:13 ph10 Exp $
+# $Cambridge: exim/exim-doc/doc-docbook/Pre-xml,v 1.3 2006/02/01 11:01:01 ph10 Exp $

# Script to pre-process XML input before processing it for various purposes.
# Options specify which transformations are to be done. Monospaced literal
@@ -8,20 +8,17 @@

# Changes:

  -# -abstract: Remove the <abstract> element
  -
  -# -ascii:    Replace &8230;   (sic, no x) with ...
  -#            Replace &#x2019; by '
  -#            Replace &#x201C; by "
  -#            Replace &#x201D; by "
  -#            Replace &#x2013; by -
  -#            Replace &#x2020; by *
  -#            Replace &#x2021; by **
  -#            Replace &#x00a0; by a space
  -#            Replace &#169;   by (c)
  -#            Put quotes round <literal> text
  +# -ascii:    Replace &#x2019; by '
  +#            Replace &copy;   by (c)
  +#            Replace &dagger; by *
  +#            Replace &Dagger; by **
  +#            Replace &nbsp;   by a space
  +#            Replace &ndash;  by -
   #            Put quotes round <quote> text
   #
  +# -quoteliteral:
  +#            Put quotes round <literal> text
  +#
   # -bookinfo: Remove the <bookinfo> element from the file
   #
   # -fi:       Replace "fi" by &#xFB01; except when it is in an XML element, or
  @@ -29,16 +26,22 @@
   #
   # -html:     Certain things are done only for HTML output:
   #
  -#            If <literallayout> is followed by optional # space and then a
  +#            If <literallayout> is followed by optional space and then a
   #            newline, the space and newline are removed, because otherwise you
   #            get a blank line in the HTML output.
   #
   # -noindex   Remove the XML to generate a Concept and an Options index.
   # -oneindex  Ditto, but add XML to generate a single index.
  +#
  +# -optbreak  Insert an optional line break (zero-width space, &#x200B;) after
  +#            every underscore in text within <option> and <varname> elements,
  +#            except when preceded by <entry> (i.e. not in tables). The same is
  +#            also done within a word of four or more upper-case letters (for
  +#            compile-time options).




-# The function that processes non-literal monospaced text
+# The function that processes non-literal, non-monospaced text

sub process()
{
@@ -46,17 +49,23 @@

$s =~ s/fi(?![^<>]*>)/&#xFB01;/g if $ligatures;

  +if ($optbreak)
  +  {
  +  $s =~ s%(?<!<entry>)(<option>|<varname>)([^<]+)%
  +    my($x,$y) = ($1,$2); $y =~ s/_/_&#x200B;/g; "$x"."$y"%gex;
  +
  +  $s =~ s?\b([A-Z_]{4,})\b?
  +    my($x) = $1; $x =~ s/_/_&#x200B;/g; "$x"?gex;
  +  }
  +
   if ($ascii)
     {
  -  $s =~ s/&#8230;/.../g;
     $s =~ s/&#x2019;/'/g;
  -  $s =~ s/&#x201C;/"/g;
  -  $s =~ s/&#x201D;/"/g;
  -  $s =~ s/&#x2013;/-/g;
  -  $s =~ s/&#x2020;/*/g;
  -  $s =~ s/&#x2021;/**/g;
  -  $s =~ s/&#x00a0;/ /g;
  -  $s =~ s/&#169;/(c)/g;
  +  $s =~ s/&copy;/(c)/g;
  +  $s =~ s/&dagger;/*/g;
  +  $s =~ s/&Dagger;/**/g;
  +  $s =~ s/&nbsp;/ /g;
  +  $s =~ s/&ndash;/-/g;
     $s =~ s/<quote>/"/g;
     $s =~ s/<\/quote>/"/g;
     }
  @@ -67,7 +76,6 @@


# The main program

  -$abstract  = 0;
   $ascii     = 0;
   $bookinfo  = 0;
   $html      = 0;
  @@ -77,25 +85,24 @@
   $madeindex = 0;
   $noindex   = 0;
   $oneindex  = 0;
  +$optbreak  = 0;
  +$quoteliteral = 0;


   foreach $arg (@ARGV)
     {
     if    ($arg eq "-fi")       { $ligatures = 1; }
  -  elsif ($arg eq "-abstract") { $abstract = 1; }
     elsif ($arg eq "-ascii")    { $ascii = 1; }
     elsif ($arg eq "-bookinfo") { $bookinfo = 1; }
     elsif ($arg eq "-html")     { $html = 1; }
     elsif ($arg eq "-noindex")  { $noindex = 1; }
     elsif ($arg eq "-oneindex") { $oneindex = 1; }
  +  elsif ($arg eq "-optbreak") { $optbreak = 1; }
  +  elsif ($arg eq "-quoteliteral") { $quoteliteral = 1; }
     else  { die "** Pre-xml: Unknown option \"$arg\"\n"; }
     }


   while (<STDIN>)
     {
  -  # Remove <abstract> if required
  -
  -  next if ($abstract && /^\s*<abstract>/);
  -
     # Remove <bookinfo> if required


     if ($bookinfo && /^<bookinfo/)
  @@ -152,7 +159,7 @@
         if (/^(.*?)<\/literal>(?!<\/quote>)(.*)$/)
           {
           print $1;
  -        print "\"" if $ascii && !$inliterallayout;
  +        print "\"" if $quoteliteral && !$inliterallayout;
           print "</literal>";
           $inliteral = 0;
           $_ = "$2\n";
  @@ -172,7 +179,7 @@
           {
           print &process($1);
           print "<literal>";
  -        print "\"" if $ascii && !$inliterallayout;
  +        print "\"" if $quoteliteral && !$inliterallayout;
           $inliteral = 1;
           $_ = "$2\n";
           }
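
The -optbreak processing added to Pre-xml above can be sketched outside Perl as follows. This is a minimal Python illustration of the same two substitutions, not the script itself: the function name add_option_breaks and the constant ZWSP are hypothetical, and the input is assumed to be a single chunk of non-literal text.

```python
import re

ZWSP = "&#x200B;"  # zero-width space, written as a character reference


def add_option_breaks(text: str) -> str:
    """Insert an optional line break after each underscore in option names."""

    def break_underscores(name: str) -> str:
        return name.replace("_", "_" + ZWSP)

    # Text inside <option> or <varname>, unless the tag directly follows
    # <entry> (that is, not inside a table cell).
    text = re.sub(
        r"(?<!<entry>)(<option>|<varname>)([^<]+)",
        lambda m: m.group(1) + break_underscores(m.group(2)),
        text,
    )

    # Words of four or more upper-case letters or underscores
    # (compile-time options).
    text = re.sub(
        r"\b([A-Z_]{4,})\b",
        lambda m: break_underscores(m.group(1)),
        text,
    )
    return text
```

With this sketch, "<option>smtp_accept_max</option>" gains a zero-width space after each underscore, while the same element placed directly after <entry> is left alone.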


  Index: TidyHTML-filter
  ===================================================================
  RCS file: /home/cvs/exim/exim-doc/doc-docbook/TidyHTML-filter,v
  retrieving revision 1.2
  retrieving revision 1.3
  diff -u -r1.2 -r1.3
  --- TidyHTML-filter    10 Nov 2005 12:30:13 -0000    1.2
  +++ TidyHTML-filter    1 Feb 2006 11:01:01 -0000    1.3
  @@ -1,6 +1,6 @@
   #! /usr/bin/perl


-# $Cambridge: exim/exim-doc/doc-docbook/TidyHTML-filter,v 1.2 2005/11/10 12:30:13 ph10 Exp $
+# $Cambridge: exim/exim-doc/doc-docbook/TidyHTML-filter,v 1.3 2006/02/01 11:01:01 ph10 Exp $

# Script to tidy up the filter HTML file that is generated by xmlto. The
# following changes are made:
@@ -21,13 +21,21 @@
@text = <IN>;
close(IN);

-# Insert a newline after every > because the whole toc is generated as one
-# humungous line that is hard to check. Then split the lines so that each one
-# is a separate element in the vector.
+# Insert a newline after every > in the toc, because the whole toc is generated
+# as one humungous line that is hard to check. Indeed, the start of the first
+# chapter is also on the line, so we have to split it off first. Having
+# inserted newlines, we split the toc into separate items in the vector.

  -foreach $line (@text) { $line =~ s/>\s*/>\n/g; }
   for ($i = 0; $i < scalar(@text); $i++)
  -  { splice @text, $i, 1, (split /(?<=\n)/, $text[$i]); }
  +  {
  +  if ($text[$i] =~ ?<title>Exim's interfaces to mail filtering</title>?)
  +    {
  +    splice @text, $i, 1, (split /(?=<div class="chapter")/, $text[$i]);
  +    $text[$i] =~ s/>\s*/>\n/g;
  +    splice @text, $i, 1, (split /(?<=\n)/, $text[$i]);
  +    last;
  +    }
  +  }


# We want to create reverse links from each chapter and section title back to
# the relevant place in the TOC. Scan the TOC for the relevant entries. Add
@@ -60,26 +68,25 @@

   for (; $i < scalar(@text); $i++)
     {
  -  if ($text[$i] eq "<div class=\"literallayout\">\n" && $text[$i+1] eq "<p>\n")
  +  while ($text[$i] =~
  +      /^(.*)<a( xmlns="[^"]+")? id="([^"]+)"><\/a>(.*?)<\/h(.*)/)
       {
  -    $text[++$i] = "";
  -    $thisdiv = 1;
  +    my($ref) = $backref{"#$3"};
  +    $text[$i] = "$1<a$2 href=\"#$ref\" id=\"$3\">$4</a></h$5";
       }
  -  elsif ($thisdiv && $text[$i] eq "</p>\n" && $text[$i+1] eq "</div>\n")
  -    {
  -    $text[$i] = "";
  -    $thisdiv = 0;
  -    }
  -  elsif ($text[$i] =~ /^<h[23] /)
  +
  +  if ($text[$i] =~ /^(.*)<div class="literallayout"><p>(?:<br \/>)?(.*)/)
       {
  -    $i++;
  -    if ($text[$i] =~ /^<a( xmlns="[^"]+")? id="([^"]+)">$/)
  +    my($j);
  +    $text[$i] = "$1<div class=\"literallayout\">$2";
  +
  +    for ($j = $i + 1; $j < scalar(@text); $j++)
         {
  -      my($ref) = $backref{"#$2"};
  -      $text[$i++] = "<a$1 href=\"#$ref\" id=\"$2\">\n";
  -      my($temp) = $text[$i];
  -      $text[$i] = $text[$i+1];
  -      $text[++$i] = $temp;
  +      if ($text[$j] =~ /^<\/p><\/div>/)
  +        {
  +        $text[$j] =~ s/<\/p>//;
  +        last;
  +        }
         }
       }
     }
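
The heading rewrite in TidyHTML-filter above turns an empty anchor in front of a chapter or section title into a link back to the TOC. Here is a minimal Python sketch of that idea. It assumes a dictionary backref keyed by "#id" whose values are TOC anchor names, and it uses one substitution pass instead of the script's while loop. The names link_headings_to_toc and EMPTY_ANCHOR are hypothetical.

```python
import re

# An empty anchor followed by the heading text, up to the closing </h...> tag.
EMPTY_ANCHOR = re.compile(r'<a( xmlns="[^"]+")? id="([^"]+)"></a>(.*?)</h')


def link_headings_to_toc(line: str, backref: dict) -> str:
    """Wrap heading titles in a link that points back to the TOC entry."""

    def rewrite(m: re.Match) -> str:
        xmlns = m.group(1) or ""      # optional namespace attribute
        anchor_id = m.group(2)        # id of the chapter or section
        title = m.group(3)            # heading text
        toc_id = backref.get("#" + anchor_id, "")
        return f'<a{xmlns} href="#{toc_id}" id="{anchor_id}">{title}</a></h'

    return EMPTY_ANCHOR.sub(rewrite, line)
```

A single pass avoids looping forever on a heading whose title is empty, a case in which the rewritten text would still match the pattern.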


  Index: TidyHTML-spec
  ===================================================================
  RCS file: /home/cvs/exim/exim-doc/doc-docbook/TidyHTML-spec,v
  retrieving revision 1.2
  retrieving revision 1.3
  diff -u -r1.2 -r1.3
  --- TidyHTML-spec    10 Nov 2005 12:30:13 -0000    1.2
  +++ TidyHTML-spec    1 Feb 2006 11:01:01 -0000    1.3
  @@ -1,6 +1,6 @@
   #! /usr/bin/perl


-# $Cambridge: exim/exim-doc/doc-docbook/TidyHTML-spec,v 1.2 2005/11/10 12:30:13 ph10 Exp $
+# $Cambridge: exim/exim-doc/doc-docbook/TidyHTML-spec,v 1.3 2006/02/01 11:01:01 ph10 Exp $

   # Script to tidy up the spec HTML files that are generated by xmlto. The
   # following changes are made:
  @@ -101,7 +101,7 @@
         $text[$i]= "$pre<a$opt href=\"index.html#$ref\" id=\"$id\">$title</a></h$post";
         }


  -    elsif ($text[$i] eq "<div class=\"literallayout\">\n" && $text[$i+1] eq "<p>\n")
  +    elsif ($text[$i] =~ /^<div [^>]*?class="literallayout">$/ && $text[$i+1] eq "<p>\n")
         {
         $text[++$i] = "";
         $thisdiv = 1;


  Index: Tidytxt
  ===================================================================
  RCS file: /home/cvs/exim/exim-doc/doc-docbook/Tidytxt,v
  retrieving revision 1.1
  retrieving revision 1.2
  diff -u -r1.1 -r1.2
  --- Tidytxt    16 Jun 2005 10:32:31 -0000    1.1
  +++ Tidytxt    1 Feb 2006 11:01:01 -0000    1.2
  @@ -1,21 +1,66 @@
   #! /usr/bin/perl

  -# $Cambridge: exim/exim-doc/doc-docbook/Tidytxt,v 1.1 2005/06/16 10:32:31 ph10 Exp $
  +# $Cambridge: exim/exim-doc/doc-docbook/Tidytxt,v 1.2 2006/02/01 11:01:01 ph10 Exp $

  -# Script to tidy up the output of w3m when it makes a text file. We convert
  -# sequences of blank lines into a single blank line.
  +# Script to tidy up the output of w3m when it makes a text file. First we
  +# convert sequences of blank lines into a single blank line, to get everything
  +# uniform. Then we go through and insert blank lines before chapter and
  +# sections, also converting chapter titles to uppercase.

  -$blanks = 0;
  -while (<>)
  +@lines = <>;
  +
  +$lastwasblank = 0;
  +foreach $line (@lines)
     {
  -  if (/^\s*$/)
  +  if ($line =~ /^\s*$/)
       {
  -    $blanks++;
  +    $line = "" if $lastwasblank;
  +    $lastwasblank = 1;
       next;
       }
  -  print "\n" if $blanks > 0;
  -  $blanks = 0;
  -  print;
  +  $lastwasblank = 0;
  +  }
  +
  +# Find start of TOC, uppercasing its title
  +
  +for ($i = 0; $i < scalar @lines; $i++)
  +  {
  +  $lines[$i] = "TABLE OF CONTENTS\n" if $lines[$i] =~ /^Table of Contents/;
  +  last if $lines[$i] =~ /^1. /;
  +  }
  +
  +# Find start of first chapter
  +
  +for ($i++; $i < scalar @lines; $i++)
  +  { last if $lines[$i] =~ /^1. /; }
  +
  +# Process the body. We can detect the starts of chapters and sections by
  +# looking for preceding and following blank lines, and then matching against
  +# the numbers.
  +
  +$chapter = 0;
  +for (; $i < scalar @lines; $i++)
  +  {
  +  next if $lines[$i-1] !~ /^$/ || $lines[$i+1] !~ /^$/;
  +
  +  # Start of chapter
  +
  +  if ($lines[$i] =~ /^(\d+)\. / && $1 == $chapter + 1)
  +    {
  +    $chapter++;
  +    $section = 0;
  +    $lines[$i] = "\n\n" . ("=" x 79) . "\n" . uc($lines[$i]);
  +    }
  +
  +  # Start of next section
  +
  +  elsif ($lines[$i] =~ /^(\d+)\.(\d+) / && $1 == $chapter && $2 == $section + 1)
  +    {
  +    $section++;
  +    $lines[$i] = "\n$lines[$i]" . "-" x (length($lines[$i]) - 1) . "\n";
  +    }
     }
  +
  +print @lines;

   # End

  Index: x2man
  ===================================================================
  RCS file: /home/cvs/exim/exim-doc/doc-docbook/x2man,v
  retrieving revision 1.1
  retrieving revision 1.2
  diff -u -r1.1 -r1.2
  --- x2man    16 Jun 2005 10:32:31 -0000    1.1
  +++ x2man    1 Feb 2006 11:01:02 -0000    1.2
  @@ -1,6 +1,6 @@
   #! /usr/bin/perl -w

  -# $Cambridge: exim/exim-doc/doc-docbook/x2man,v 1.1 2005/06/16 10:32:31 ph10 Exp $
  +# $Cambridge: exim/exim-doc/doc-docbook/x2man,v 1.2 2006/02/01 11:01:02 ph10 Exp $

   # Script to find the command line options in the DocBook source of the Exim
   # spec, and turn them into a man page, because people like that.
  @@ -96,11 +96,11 @@

     # Start of new option

  -  if (/^<term>$/)
  +  if (/^<term>(<emphasis role="bold">-.*?)<\/term>$/)
       {
       print OUT ".TP 10\n";
  +    $_ = "$1\n";
       $optstart = 1;
  -    next;
       }

     # If a line contains text that is not in <>, read subsequent lines of the
  @@ -192,11 +192,9 @@
     s/&lt;/</g;
     s/&gt;/>/g;

  -  s/&#x002d;/-/g;
  -  s/&#x00a0;/ /g;
  -  s/&#x2013;/-/g;
  +  s/&nbsp;/ /g;
  +  s/&ndash;/-/g;
     s/&#x2019;/'/g;
  -  s/&#8230;/.../g;    # Sic - no x

     # Escape hyphens to prevent unwanted hyphenation