Re: [exim] Remove header lines matching a specific pattern?

Author: Heiko Schlittermann
Date:  
To: Karl Fischer, exim-users
Subject: Re: [exim] Remove header lines matching a specific pattern?
Phil Pennock <exim-users@???> (Mon 13 Jul 2009 23:44:15 CEST):
> On 2009-07-13 at 22:54 +0200, Karl Fischer wrote:
> > I followed this thread with interest and I'm still a little puzzled with the
> > specific exim syntax, but in terms of regex and just extracting the header
> > names, this perl regex should be more efficient: s/:.*?\n(\s+.*?\n)*/:/g
> >
> > This saves looping through map/extract by getting rid of the unwanted 1st.
>
> Good point.
>
> However, you're also not stripping out space between the header name and
> the following colon, which is valid. This email could validly be
> constructed with:
> ----------------------------8< cut here >8------------------------------
> From: Phil ....
> To : Karl ...
> Cc : exim-users ....
> ----------------------------8< cut here >8------------------------------


Ah. Thanks for the hint.
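
To illustrate the point about whitespace before the colon, here is a minimal Python sketch. It assumes Python's `re` module behaves like Perl/PCRE for this pattern; the function name `strip_values` and the sample headers are made up for illustration. It applies Karl's substitution to the three example headers quoted above:

```python
import re

# Karl's substitution s/:.*?\n(\s+.*?\n)*/:/g, transcribed for Python's re.
KARL = re.compile(r':.*?\n(\s+.*?\n)*')

def strip_values(raw_headers: str) -> str:
    """Replace each header value (plus folded continuation lines) with ':'."""
    return KARL.sub(':', raw_headers)

# Hypothetical sample modelled on the quoted example headers.
sample = "From: Phil\nTo : Karl\nCc : exim-users\n"
print(repr(strip_values(sample)))  # 'From:To :Cc :'
```

The space between the header name and the colon survives, so `To :` and `To:` would not compare equal afterwards.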

> With a little further optimisation, we get:
>
> s/(?>\s*:.*?\n)(?>\s+.*?\n)*/:/g
>
> although actually I'm not sure there would be any backtracking needed
> for your s///g and it's probably only the \s*: that benefits from the
> protection. (I can't be bothered to benchmark it).
>
> > In exim syntax I'd assume this to be (not tested yet):
> >
> > MESSAGE_HEADERS = ${lc:${sg {$message_headers_raw}{\N:.*?\n(\s+.*?\n)*\N}{:}}}
>
> ${lc:${sg{$message_headers_raw}{\N(?>\s*:.*?\n)(?>\s+.*?\n)*\N}{:}}}
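
A rough Python emulation of what this expansion is meant to compute, as a sketch only: it is not Exim syntax, the atomic groups `(?>...)` are swapped for plain non-capturing groups because they only affect backtracking and older Python versions do not support them, and the name `header_names` and the sample input are hypothetical.

```python
import re

# Phil's pattern with (?>...) replaced by (?:...); this changes backtracking
# behaviour only, not what the pattern matches on the sample below.
PATTERN = re.compile(r'\s*:.*?\n(?:\s+.*?\n)*')

def header_names(raw_headers: str) -> str:
    """Mimic ${lc:${sg{...}{PATTERN}{:}}}: global substitute, then lowercase."""
    return PATTERN.sub(':', raw_headers).lower()

# Hypothetical sample with a folded Subject and spaces before some colons.
sample = "From: Phil\nTo : Karl\nSubject: a\n\tfolded\nCc : exim-users\n"
print(header_names(sample))  # from:to:subject:cc:
```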


I'm sticking with my version: instead of cutting away the tail, I
select the head of each logical header line:

${lc:${sg {$message_headers_raw}{\N(?m)(^\S+(?=\s*):)?.*?\n\N}{\$1}}}

But I'm not sure about its efficiency or readability.
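
For comparison, a small Python sketch of the head-selecting approach. It assumes Python's `re` matches PCRE here, and the names `heads` and `heads_variant` and the sample are hypothetical. Because the lookahead `(?=\s*)` consumes nothing, the `:` still has to follow the name directly, so a header written as `To :` is dropped entirely. The second pattern is one possible variant (not from the thread) that tolerates the space.

```python
import re

# The pattern from the expansion above, with \$1 written as \1 for Python.
HEAD = re.compile(r'(?m)(^\S+(?=\s*):)?.*?\n')

# Hypothetical variant: capture the name and the colon separately so that
# whitespace between them is skipped.
HEAD_VARIANT = re.compile(r'(?m)(?:^([^\s:]+)\s*(:))?.*?\n')

def heads(raw_headers: str) -> str:
    """Keep 'Name:' from each logical header line, drop the rest, lowercase."""
    return HEAD.sub(r'\1', raw_headers).lower()

def heads_variant(raw_headers: str) -> str:
    """Same idea, but also accepts 'Name :' and normalises it to 'name:'."""
    return HEAD_VARIANT.sub(r'\1\2', raw_headers).lower()

sample = "From: Phil\nSubject: a\n b\nTo : Karl\n"
print(heads(sample))          # from:subject:
print(heads_variant(sample))  # from:subject:to:
```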

--
Heiko