[pcre-dev] [Bug 1372] We have OR (alternates). How about AND…

Author: Philip Hazel
Date:  
To: pcre-dev
Subject: [pcre-dev] [Bug 1372] We have OR (alternates). How about AND and NOT?

http://bugs.exim.org/show_bug.cgi?id=1372




--- Comment #4 from Philip Hazel <ph10@???> 2013-07-26 10:28:59 ---
Thank you for taking the time to explain your problem in detail. My first
reaction, however, is that this is a major extension to the syntax of regular
expressions. PCRE originated as a library that supported only patterns that are
compatible with Perl. It is true that there have been some extensions that were
not Perl-compatible in the past, though in some cases, Perl has implemented the
features later. I would be wary about adding a major new regex feature without
consulting the Perl people.

As you have noted, the existing features do make your work possible.
Incidentally, your second example below would be more efficient like this:

(?:(?=.*expr1).*(?:expr2|expr3)|(?=.*expr2).*expr3)

because it scans for expr1 just once. I don't see that any new syntax can much
reduce the work that has to be done. It would be interesting to see whether an
alternative pattern such as

expr1.*?(expr2|expr3)|expr2.*?(expr1|expr3)|expr3.*?(expr1|expr2)

is slower or faster. If you have access to pcretest you can check this.
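
For anyone without pcretest to hand, a rough comparison can be sketched in
Python. This is only an illustration under assumptions: the literals alpha,
beta and gamma are hypothetical stand-ins for expr1, expr2 and expr3, the
helper names are invented here, and Python's re module is a different engine
from PCRE, so its timings say nothing definite about PCRE itself. The
lookahead pattern is written with its outer group closed.

```python
import re
import timeit

# Hypothetical literal stand-ins for expr1, expr2 and expr3.
E1, E2, E3 = "alpha", "beta", "gamma"

# Lookahead form: succeeds when at least two of the three occur, in any order.
LOOKAHEAD = re.compile(rf"(?:(?=.*{E1}).*(?:{E2}|{E3})|(?=.*{E2}).*{E3})")

# Explicit-ordering form: one alternative per possible leading expression.
ORDERED = re.compile(
    rf"{E1}.*?({E2}|{E3})|{E2}.*?({E1}|{E3})|{E3}.*?({E1}|{E2})"
)

# Single-line subjects only, because "." does not match a newline by default.
SUBJECTS = [
    "alpha then beta",
    "gamma then alpha",
    "gamma beta",
    "beta only",
    "alpha alpha",
    "",
]


def has_two(pattern, subject):
    """Return True if the pattern finds a match anywhere in the subject."""
    return pattern.search(subject) is not None


def compare_timings(number=1000):
    """Time both patterns over SUBJECTS; returns seconds for each form."""
    return {
        "lookahead": timeit.timeit(
            lambda: [LOOKAHEAD.search(s) for s in SUBJECTS], number=number
        ),
        "ordered": timeit.timeit(
            lambda: [ORDERED.search(s) for s in SUBJECTS], number=number
        ),
    }


if __name__ == "__main__":
    for s in SUBJECTS:
        print(repr(s), has_two(LOOKAHEAD, s), has_two(ORDERED, s))
    print(compare_timings())
```

The two forms agree on these subjects, but they are not equivalent in
general: the lookahead form can accept overlapping occurrences of two
expressions, whereas the explicit-ordering form requires one to end before
the other begins.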

We are currently in the early stages of a discussion for a new API for PCRE.
Major changes of any other sort won't happen until the new API is implemented,
and this will take some time. This suggestion will, therefore, sit on the wish
list for now.

