Re: [exim-dev] PDF spec is outdated?

Author: Phil Pennock
Date:  
To: Nigel Metheringham
CC: exim-dev
Subject: Re: [exim-dev] PDF spec is outdated?
On 2013-04-19 at 09:30 +0100, Nigel Metheringham wrote:
> Was wondering about our mechanisms for distributing docs. It makes
> sense to have a tarball of the HTML in the "ftp" area (quoted, because I
> suspect most distribution is other than by ftp nowadays) - as the HTML
> is a pile of small files all connected. PDFs, and for that matter ebook
> versions etc, are single files (OK, spec & filter) and reasonably
> compressed to start with. Would we do better to just unpack them.
> In that case the website links become links direct to the distribution
> area PDFs.


I think the reason for the compression of single files is that it was
set up when PostScript was still dominant. With PDF, I agree that it
makes much less sense.

epub files are zip files with a different extension and a common file
hierarchy, so if they can be compressed better it means we're using a
bad zip compressor. I'm definitely in favour of not having to .bz2/.gz
the epubs.
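
One way to check that claim on a built file is sketched below. It is only an illustration, not part of any existing Exim tooling: the function names are made up here, and it assumes the epub is an ordinary zip readable by Python's zipfile module. It reports how each member is stored and how much a further gzip pass would save. Note that the "mimetype" entry in an epub is normally stored uncompressed, so one "stored" member is expected.

```python
import gzip
import io
import zipfile


def member_methods(epub_bytes):
    """Map each archive member name to 'stored', 'deflated' or 'other'."""
    names = {zipfile.ZIP_STORED: "stored", zipfile.ZIP_DEFLATED: "deflated"}
    with zipfile.ZipFile(io.BytesIO(epub_bytes)) as zf:
        return {info.filename: names.get(info.compress_type, "other")
                for info in zf.infolist()}


def gzip_ratio(epub_bytes):
    """Size after gzip divided by size before; close to 1.0 means no gain."""
    return len(gzip.compress(epub_bytes)) / len(epub_bytes)
```

If gzip_ratio() comes out well below 1.0, or most members show up as "stored", the zip step in the build is the thing to fix, rather than adding a .gz or .bz2 wrapper.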

> Maybe we would need to change directory structure a little - a directory
> per version with all the files in rather than flat, although we prune
> the online versions into an old directory every so often anyhow... so
> maybe this doesn't need doing.
>
> Hopefully that makes things simpler rather than more complex (OK, the
> scripting needs fixing, but it needs fixing now).
>
> Comments?


With links to the files from the web area, we stop being able to move
old versions into sub-directories without breaking the links from the
old versions of the files, so we're better off having something like:

eximdoc/pdf/spec-4.80.pdf
eximdoc/pdf/spec-4.80.1.pdf
eximdoc/pdf/spec-current.pdf -> spec-4.80.1.pdf

eximdoc/epub/spec-4.80.epub
eximdoc/epub/spec-4.80.1.epub
eximdoc/epub/spec-current.epub -> spec-4.80.1.epub
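
A rough sketch of how a release script might maintain that layout follows. The publish() helper and its parameters are hypothetical, not taken from the current release scripts, and it assumes a POSIX filesystem where renaming over an existing symlink is atomic.

```python
import os
import shutil


def publish(src, doc_dir, stem, version, ext):
    """Copy src to <stem>-<version>.<ext> and repoint <stem>-current.<ext>."""
    os.makedirs(doc_dir, exist_ok=True)
    versioned = f"{stem}-{version}.{ext}"
    target = os.path.join(doc_dir, versioned)
    if os.path.exists(target):
        # A versioned file, once published, is never overwritten.
        raise FileExistsError(target)
    shutil.copyfile(src, target)

    # Create the new symlink under a temporary name, then rename it over the
    # old one, so there is no moment at which the -current link is missing.
    current = os.path.join(doc_dir, f"{stem}-current.{ext}")
    tmp = current + ".tmp"
    if os.path.lexists(tmp):
        os.remove(tmp)
    os.symlink(versioned, tmp)  # relative target, as in the layout above
    os.replace(tmp, current)
    return target
```

For example, publish("build/spec.pdf", "eximdoc/pdf", "spec", "4.80.1", "pdf") would add spec-4.80.1.pdf and leave spec-current.pdf pointing at it, while spec-4.80.pdf stays where it is.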

For the source files, it makes sense to move old versions aside so that
those who want to have infrastructure that locks in a particular version
are responsible for providing access to it and _we_ can move versions
with security holes into old/ ASAP and break direct download links.

For documentation we don't want that. Barring a developer machine
compromise that produces a trojanned document with an exploit in it,
which should be handled as an exceptional clean-up case, we should just
make sure that a versioned link stays good for all time.

Something else to consider, far less urgent, is metadata in HTTP
responses such as "Link: <...>; rel=latest-version". It seems feasible
that search engines would detect that and use it as input, similar to how
canonical links in web pages help de-dup, and so ensure that folks who
search for "exim spec.pdf" get link ranking that points them to the
latest version, with a version number, updating fairly promptly without
losing ranking.

Eg:
curl -I http://people.spodhuis.org/phil.pennock/software/sieve-connect-0.85.tar.bz2
(and old versions 0.81, 0.83, 0.84 exist for comparing/contrasting).
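
For the docs, the header value could be generated along these lines. This is a sketch only: the function name and the example.org base URL are placeholders, and it assumes the versioned file naming proposed above. The relation name is written quoted here; the unquoted form shown earlier is equally valid.

```python
def latest_version_link(base_url, stem, version, ext):
    """Build a Link header value that points at the newest versioned file."""
    url = f"{base_url.rstrip('/')}/{stem}-{version}.{ext}"
    return f'<{url}>; rel="latest-version"'


# Header to send with every response for an older spec-*.pdf (placeholder URL).
header = ("Link", latest_version_link(
    "https://example.org/eximdoc/pdf/", "spec", "4.80.1", "pdf"))
```

The web server would then attach that one header to responses for all older versions, so only a single value needs updating at release time.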

-Phil