Re: [Exim] What to do about non-monotonic process ids

Author: Philip Hazel
Date:
To: exim-users
Subject: Re: [Exim] What to do about non-monotonic process ids

Thanks to everybody who has replied, both on and off list.

On Thu, 30 Jan 2003, Jim Knoble wrote:

> Is this facility used anywhere else for constructing unique filenames?
> In particular, is it used anywhere the filenames should be unpredictable?

I don't think filenames need to be unpredictable in Exim.

> This is my recommendation, with a change (described below). Exim v4 is
> still in the relatively early adoption stage, and changing it now,
> once, is liable to be easier than changing things slightly now, finding
> out that it doesn't work the way we expected anyway, and having to
> change it again later.

I wondered about this, but in practice, I fear that it is already too
late. And certainly not something I can "just change for the next point
release". As has been pointed out, if the length of the message id
changes, upgrading (and possibly subsequently downgrading) is going to
be very tricky. In fact, it probably needs a two-stage process. First, a
version of Exim that still generates old-length IDs, but understands
new-length ones. Then, when that is firmly established, the next version
can generate new ones, but must still understand old ones.

My conclusion is that, yes, something of this kind will have to be done,
but it needs some long-term planning, and is perhaps something for "Exim
5". In that connection, however, thanks for your ideas. I agree that
both attoseconds and 64-bit PIDs should be supported.

So, what is to be done right now? For Exim 4.14? I think we have to
remain within the 16-character field at this stage.

On Thu, 30 Jan 2003, Nico Erfurth wrote:

> For maildir, just add some more noise (micro or maybe millisecs),

Indeed, maildir is not a big problem. I will add microseconds and a
delay if required.

> The question is, does all Unices provide a way to retrieve the
> milliseconds in a useable resolution? (I just know gettimeofday() so
> far, to get the microsecs)

They all seem to have gettimeofday(). So if you want milliseconds, you
just divide by 1000 :-)
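
As a rough illustration of the above, here is a minimal sketch that asks
gettimeofday() for the time and splits it into seconds and milliseconds.
The function name now_sec_msec is made up for this example and is not
taken from the Exim source.

```c
#include <stddef.h>
#include <sys/time.h>

/* Split the current time into whole seconds and milliseconds.
   Returns 0 on success, -1 if gettimeofday() fails.
   (Hypothetical helper, for illustration only.) */
int now_sec_msec(long *sec, int *msec)
{
    struct timeval tv;

    if (gettimeofday(&tv, NULL) != 0)
        return -1;

    *sec = (long)tv.tv_sec;
    *msec = (int)(tv.tv_usec / 1000);   /* microseconds -> milliseconds */
    return 0;
}
```

The millisecond value is always in the range 0-999, because tv_usec is
less than one million.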

> And whoever came up with random PIDs should be shot and buried on a moon
> far far away ;)

There's no problem with the randomness itself, just the possibility of
re-use within the same second. If the implementation guaranteed not to
re-use the same PID until the clock had ticked, there would be no
problem.

On Thu, 30 Jan 2003, James P. Roberts wrote:

> I support the following idea, taken from Philip's posted discussion:
>
> >IDEA 3: USE THOSE HYPHENS

> What I think is, the ID's are not human-readable anyway (at least, not
> to THIS human!). Whenever I want to do something with one, I never type
> it by hand, but use whatever "cut & paste" feature is available. Go
> ahead and make maximum use of the 16 characters you have, without
> suffering the "major upheaval" problem.

The only problem with this is that it makes it harder to recognize that
something *is* an Exim message id (for example, for pattern matching in
scripts that are reading log files).
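
To make the concern concrete, here is a sketch of the kind of check a
log-reading tool might do today. It assumes the current 16-character
layout is six, six and two base-62 characters separated by hyphens; the
function name looks_like_exim_id is hypothetical. If the hyphen
positions were reused for data, a pattern like this would no longer
work.

```c
#include <regex.h>
#include <stddef.h>

/* Return 1 if s has the assumed shape of an Exim 4 message id
   (6 + 6 + 2 alphanumeric characters separated by hyphens),
   0 if it does not, and -1 if the pattern fails to compile.
   (Hypothetical helper, for illustration only.) */
int looks_like_exim_id(const char *s)
{
    regex_t re;
    int rc;

    if (regcomp(&re,
                "^[0-9A-Za-z]{6}-[0-9A-Za-z]{6}-[0-9A-Za-z]{2}$",
                REG_EXTENDED | REG_NOSUB) != 0)
        return -1;

    rc = regexec(&re, s, 0, NULL, 0);
    regfree(&re);
    return rc == 0;
}
```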

> Another possibility is to move up to a larger base system.

I thought of that, but I decided that there weren't many extra
characters available that wouldn't break something. You would have to
exclude chars that are problems in file names, chars that are shell
metacharacters (for convenience), chars that can't appear on the LHS of
a Message-ID: header line (because Exim uses its ID to construct
Message-ID for messages that don't have one). I think this leaves at
most about 6 characters. I don't think the gain is worth it.

I received one other new suggestion: to change the epoch in the
timestamp. Exim didn't exist before 1995, so if I started a new epoch in
2003, it would be 25 years before there is the possibility of any
clashing IDs (and then only with messages that by then would be over 30
years old).

This is of course only a delaying scheme, because eventually time will
catch up (though I will be long retired by then :-). It might be
something that could last until we do the Major Upheaval. However,
I had a Bright Idea (tm)...

WHAT I PROBABLY WILL DO
-----------------------

The final -xx of the ID is not much used. For hosts that do not set
localhost_number, it contains a sequence for multiple messages received
by one process in one second. In that situation (localhost_number
unset), I could use these digits to hold the millisecond time instead.
The receiving process could ensure that it doesn't exit until at least 1
millisecond after the timestamp of the final message.
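
Two base-62 digits can hold 62 * 62 = 3844 distinct values, so a
millisecond count of 0-999 fits. A minimal sketch of the encoding
follows; the digit ordering (0-9, then A-Z, then a-z) and the function
name encode_base62_2 are assumptions made for this example.

```c
/* Assumed digit ordering: 0-9, A-Z, a-z. */
static const char base62_digits[] =
    "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz";

/* Write value as exactly two base-62 digits plus a terminating NUL.
   Returns 0 on success, -1 if value does not fit in two digits.
   (Hypothetical helper, for illustration only.) */
int encode_base62_2(unsigned value, char out[3])
{
    if (value >= 62u * 62u)
        return -1;

    out[0] = base62_digits[value / 62];   /* most significant digit */
    out[1] = base62_digits[value % 62];   /* least significant digit */
    out[2] = '\0';
    return 0;
}
```

With this ordering, a message stamped at millisecond 0 would still end
in "00", the same suffix as the first message under the sequence-number
scheme.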

Note: this doesn't slow down your mail; it just (perhaps) delays the
termination of a process that has already completed its work. In most
cases, certainly on current hardware, there would be no delay, because
it takes more than a millisecond to receive a message. (I've just
realized that the very crudest approach of all would be to change
nothing, and just delay receiving processes until the 1-second clock had
ticked.)
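
The delay before exit might look something like the sketch below: the
process polls the clock until it shows a later millisecond than the
timestamp of the final message. The function names and the 0.2 ms nap
are choices made for this illustration, not anything from Exim itself.

```c
#include <stddef.h>
#include <sys/time.h>
#include <time.h>

/* Convert a timeval to a count of whole milliseconds. */
static long long msec_of(const struct timeval *tv)
{
    return (long long)tv->tv_sec * 1000 + tv->tv_usec / 1000;
}

/* Block until the clock shows a later millisecond than stamp.
   If the clock has already moved on, this returns at once.
   (Hypothetical helper; it does not guard against the system clock
   being stepped backwards while it waits.) */
void wait_past_millisecond(const struct timeval *stamp)
{
    const struct timespec nap = { 0, 200000 };   /* 0.2 ms */
    struct timeval now;

    for (;;) {
        if (gettimeofday(&now, NULL) != 0)
            return;                     /* give up rather than spin */
        if (msec_of(&now) > msec_of(stamp))
            return;
        nanosleep(&nap, NULL);
    }
}
```

A receiving process would call this once, just before exiting, with the
timestamp it used for its last message id.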

What to do when localhost_number *is* set? Currently, the number is
permitted to be in the range 0-255. I don't know how much it is used,
but it seems plausible that this could be reduced to 0-50, in which case
it could be stuffed into the most significant base-62 digit of the
pid part of the message id. That still allows for 32-bit pids.

Philip

--
Philip Hazel            University of Cambridge Computing Service,
ph10@???      Cambridge, England. Phone: +44 1223 334714.
