Re: [exim] slightly OT - reconstructing mbox files?

Author: Kjetil Torgrim Homme
Date:  
To: Chris Lightfoot
CC: exim-users, Marcus Barczak
Subject: Re: [exim] slightly OT - reconstructing mbox files?
On Thu, 2006-08-24 at 08:54 +0100, Chris Lightfoot wrote:
> You need a heuristic to detect a block of message headers,
> and you need to be sure that you don't mistake lines in
> message bodies for headers (e.g. quoted headers in a
> bounce) or the headers of a MIME part for the headers of a
> message.


[snipped lots of good points]

one suggestion for a heuristic is to consider each block of candidate
headers: is the _first_ Received header one which your archiving
server would add? if so, it's a new message. any bounces and
forwarded messages should have a different host in their first
Received line.
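
a minimal sketch of that check in Python, untested against real archives.
the host name archive.example.org and the function names are made up for
illustration, and it assumes the archiving server identifies itself in the
"by <host>" clause of the Received line it adds:

```python
import re

# Hypothetical name of the archiving server; replace with your own host.
ARCHIVE_HOST = "archive.example.org"


def first_received(header_lines):
    """Return the first Received header value, unfolded, or None."""
    value = None
    for line in header_lines:
        if value is None:
            if line.lower().startswith("received:"):
                value = line[len("received:"):].strip()
        elif line[:1] in (" ", "\t"):
            # Continuation line of a folded header.
            value += " " + line.strip()
        else:
            break
    return value


def starts_new_message(header_lines, archive_host=ARCHIVE_HOST):
    """Guess whether this header block begins a message we archived."""
    received = first_received(header_lines)
    if received is None:
        return False
    # Assumption: the receiving host appears as "by <host>".
    match = re.search(r"\bby\s+([^\s;]+)", received, re.IGNORECASE)
    return match is not None and match.group(1).lower() == archive_host.lower()
```

a block whose first Received line names some other host (a quoted bounce,
say) is then treated as part of the enclosing message body.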

to find candidate header blocks, look for empty lines. if the first
line after the blank line matches /^[A-Za-z0-9-]+:/, start
collecting lines until you see another blank line.
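
the scan for candidate blocks could look roughly like this in Python. the
function name is made up, and two details are my own assumptions rather
than part of the description above: the start of the file counts as
following a blank line, and a block still open at end of file is kept:

```python
import re

# First line of a candidate block must look like "Name:".
HEADER_START = re.compile(r"^[A-Za-z0-9-]+:")


def candidate_header_blocks(lines):
    """Return a list of candidate header blocks, each a list of lines."""
    blocks = []
    block = None        # lines of the block being collected, if any
    prev_blank = True   # assumption: start of file acts like a blank line
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line closes the current block, if there is one.
            if block is not None:
                blocks.append(block)
                block = None
            prev_blank = True
            continue
        if block is not None:
            block.append(line)
        elif prev_blank and HEADER_START.match(line):
            block = [line]
        prev_blank = False
    if block is not None:
        # Assumption: keep a block that runs to end of file.
        blocks.append(block)
    return blocks
```

each returned block is only a candidate; quoted headers and MIME part
headers will show up here too and still need to be filtered out.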

I haven't tried this myself, of course :)
--
Kjetil T.