[exim] Date and From header regular expressions (was: RegEx …

Author: Schramm, Dominik
Date:  
To: exim-users
Subject: [exim] Date and From header regular expressions (was: RegEx longer than maximum string length in system filter)
Ted Cooper wrote on Friday, July 18, 2008 2:03 AM:

> Schramm, Dominik wrote:
>
>> I decided to check for this in the DATA ACL. It was meant
>> as a means of rejecting offending messages at SMTP time anyway.
>> I just wanted to log the offending portions for some time
>> before activating it, but I can do that in the ACL, too.
>
> There was talk on exim-dev regarding an RFC 2822 date parsing
> patch to the exim code.
>
> http://lists.exim.org/lurker/message/20080604.133713.1e67dcec.en.html
>
> I can't find any news regarding this since.


This is slightly off-topic, but why? Is there still only an
RFC 822 parser? Or does the parser you mention serve a different
purpose? Does Exim parse Date headers at all so far?

Here's my expression, by the way, in case it's useful to anyone
else. The regular expression syntax (named subpatterns) is valid
from Perl version 5.10 upwards.

DATA ACL syntax:

\N^(?:(?:(?:(?:(?&FWS)?)(Mon|Tue|Wed|Thu|Fri|Sat|Sun))|(?:(?:(?&FWS)?)(Mon|Tue|Wed|Thu|Fri|Sat|Sun)(?:(?&CFWS)?))),)?(?:(?:(?&FWS)?)([0-9]{1,2})|(?:(?&CFWS)?)([0-9]{1,2})(?:(?&CFWS)?))(?:(?:(?:(?&FWS))(Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)(?:(?&FWS)))|(?:(?&CFWS))(Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)(?:(?&CFWS)))(?:([0-9]{4})|(?:(?&CFWS)?)([0-9]{2})(?:(?&CFWS)?))(?:(?:(?&CFWS)?)([0-9]{2})(?:(?&CFWS)?):(?:(?&CFWS)?)([0-9]{2})(?:(?&CFWS)?)(?::(?:(?&CFWS)?)([0-9]{2})(?:(?&CFWS)?))?)(?:(?&FWS))((?:(?:\+|-)[0-9]{4})|UT|GMT|EST|EDT|CST|CDT|MST|MDT|PST|PDT|[a-ik-zA-IK-Z])(?:(?&CFWS))$(?:(?(DEFINE)(?'CFWS'(?:(?:(?:\s*(?:(?![\r\n])\s)+)?(?'comment'\((?:(?:\s*(?:(?![\r\n])\s)+)?(?:(?:(?!(?:\s|[()\\]))[[:cntrl:][:graph:]])+|\\[\x01-\x09\x0B\x0C\x0E-\x7F]|\\\n*\r*(?:[\x01-\x09\x0B\x0C\x0E-\x7F]\n*\r*)*|\\[\x00-\x7F]|(?:(?&comment))))*(?:\s*(?:(?![\r\n])\s)+)?\)))*(?:\s*(?:(?![\r\n])\s)+)?))(?'FWS'(?:\s*(?:(?![\r\n])\s)+))))\N

This includes all the "obs" syntax elements from RFC 2822, btw.
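
For the log-only phase mentioned above, a minimal sketch of how the
expression might be wired into the DATA ACL. The macro name HDR_DATE_RE
and the log text are made up for illustration, and the "..." stands for
the complete expression, which is not repeated here. It also assumes
that the PCRE library Exim is linked against accepts the (?&name) and
(?(DEFINE)...) constructs.

```
# Main configuration section (before the first "begin" line).
# HDR_DATE_RE is a hypothetical macro name; replace the "..." with the
# complete Date expression, keeping the \N markers at both ends.
HDR_DATE_RE = \N^...$\N

# ACL section, inside the ACL named by acl_smtp_data
# (acl_check_data in the default configuration).
# Log only, do not reject: the condition is true when the match fails.
  warn    condition   = ${if match{$h_date:}{HDR_DATE_RE}{no}{yes}}
          log_message = Date header fails RFC 2822 check: $h_date:
```

Note that a message without any Date header also triggers the log
line, because $h_date: is then empty and the expression does not match
an empty string.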

And here is the expression for the From header, this time without
the "obs" parts, again in ACL syntax:

\N^(?&mailbox)(?:,(?&mailbox))*$(?:(?(DEFINE)(?'mailbox'(?:(?:(?&a)|(?&quoted_string))+(?&CFWS)?<(?&addr_spec)>(?&CFWS)?)|(?&addr_spec))(?'comment'\((?:(?:\s*(?:(?![\r\n])\s)+)?(?:(?:(?!(?:\s|[()\\]))[[:cntrl:][:graph:]])+|\\[\x01-\x09\x0B\x0C\x0E-\x7F]|\\\n*\r*(?:[\x01-\x09\x0B\x0C\x0E-\x7F]\n*\r*)*|\\[\x00-\x7F]|(?&comment)))*(?:\s*(?:(?![\r\n])\s)+)?\))(?'CFWS'(?:(?:(?:\s*(?:(?![\r\n])\s)+)?(?&comment))*(?:\s*(?:(?![\r\n])\s)+)?))(?'FWS'(?:\s*(?:(?![\r\n])\s)+))(?'no_ws_ctl'[\x01-\x08\x0b\x0c\x0e-\x1f\x7f])(?'text'[\x01-\x09\x0b\x0c\x0e-\x7f])(?'specials'[()<>:;@\\,.\"\[\]])(?'quoted_pair'\\(?&text))(?'atext'[A-Za-z0-9!#%&'*+/=?^_`{|}~$-])(?'dot_atom_text'(?&atext)+(?:\.(?&atext)+)*)(?'dot_atom'(?&CFWS)?(?&dot_atom_text)(?&CFWS)?)(?'a'(?&CFWS)?(?&atext)(?&CFWS)?)(?'qtext'(?:(?&no_ws_ctl)|[\x21\x23-\x5b\x5d-\x7e]))(?'qcontent'(?:(?&qtext)|(?&quoted_pair)))(?'quoted_string'(?&CFWS)?\"(?:(?&FWS)?(?&qcontent))*(?&FWS)?\"(?&CFWS)?)(?'dtext'((?&no_ws_ctl)|[\x21-\x5a\x5e-\x7e]))(?'dcontent'((?&dtext)|(?&quoted_pair)))(?'domain'(?&dot_atom))(?'local_part'((?&dot_atom)|(?&quoted_string)))(?'addr_spec'(?&local_part)@(?&domain))))\N
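
Once the logs look clean, rejecting at SMTP time could look roughly
like this. Again only a sketch: HDR_FROM_RE is a hypothetical macro
name, the "..." stands for the complete From expression, and the
rejection text is an arbitrary choice.

```
# Main configuration section (before the first "begin" line).
# HDR_FROM_RE is a hypothetical macro name; replace the "..." with the
# complete From expression, keeping the \N markers at both ends.
HDR_FROM_RE = \N^...$\N

# ACL section, inside the ACL named by acl_smtp_data.
# Reject after DATA when the From header does not match.
  deny    message   = Malformed From header
          condition = ${if match{$h_from:}{HDR_FROM_RE}{no}{yes}}
```

Since this expression leaves out the "obs" forms, it is stricter than
what RFC 2822 asks receivers to accept, so a period of logging with a
warn statement before switching to deny seems advisable.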

---
Dominik