[exim] 8-bit in headers

Author: Phil Chambers
Date:  
To: exim-users
Subject: [exim] 8-bit in headers
This message follows on from work I did in identifying invalid characters in
header field names in an ACL. That thread had a subject of "non-address header
syntax checking". I am using Exim 4.62.

If I try to identify 8-bit characters in the non-field-name part of header
lines, it appears that $message_headers is not the raw data from the message
header. It appears that RFC 2047 encoded words (the =?char-set?...?= forms)
have already been decoded to 8-bit.

That means that if I use match{$message_headers}{\N(?m)[\x80-\xFF]\N} I pick
out all the messages that are properly encoded as well as those that are not,
because the decoded text contains 8-bit characters in both cases.
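
To illustrate the effect outside Exim, here is a rough Python sketch. It is
not how Exim works internally; the function names are made up for this
example, and Python's email.header.decode_header merely stands in for the
decoding that $message_headers appears to apply. The same byte-range test is
run against the raw header line and against a decoded copy of it.

```python
import re
from email.header import decode_header

# Same character class as in the Exim regex above, applied to bytes.
EIGHT_BIT = re.compile(rb"[\x80-\xff]")


def has_8bit(data: bytes) -> bool:
    """Return True if any byte in the range 0x80-0xFF is present."""
    return EIGHT_BIT.search(data) is not None


def decoded_view(raw_header: bytes) -> bytes:
    """Decode RFC 2047 encoded words (hypothetical stand-in for the
    decoded form that $message_headers seems to give)."""
    parts = decode_header(raw_header.decode("latin-1"))
    out = b""
    for data, _charset in parts:
        # decode_header returns str when there is nothing to decode.
        out += data if isinstance(data, bytes) else data.encode("latin-1")
    return out


properly_encoded = b"Subject: =?iso-8859-1?Q?caf=E9?="
raw_8bit = b"Subject: caf\xe9"

for line in (properly_encoded, raw_8bit):
    print(has_8bit(line), has_8bit(decoded_view(line)))
```

Only the test on the raw bytes tells the two header lines apart; after
decoding, both contain the byte 0xE9 and both match.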

Is there any way to work on the raw headers without any decoding?

What I have found is that the great majority of messages which have raw 8-bit
characters in the headers are spam, so it would be a useful feature to take
into account when identifying spam.

Phil.
---------------------------------------
Phil Chambers (postmaster@???)
University of Exeter