Author: admin
Date:  
To: pcre-dev
Subject: [pcre-dev] [Bug 2106] Please add support for parsing POSIX basic & extended regular expressions
https://bugs.exim.org/show_bug.cgi?id=2106

--- Comment #4 from Philip Hazel <ph10@???> ---
(In reply to Kyle J. McKay from comment #3)
>
> I'm coming at this from the point of view of wanting a drop-in replacement
> (at the source code level -- in other words a recompilation is required but
> no source code changes) for system-level POSIX-compatible regex.h functions
> (preferably including support for the *BSDism extensions REG_STARTEND,
> REG_PEND and REG_NOSPEC which are mostly trivial to do with PCRE2;
> REG_NOSPEC makes the pattern "fixed" like fgrep and I'm not sure there's
> something for that in PCRE2).


REG_STARTEND is already there. As you say, REG_PEND is probably easy to do.
There is no PCRE2 equivalent for REG_NOSPEC. Using a pattern matching engine to
search for fixed strings is horribly inefficient. There are nice fast
algorithms for doing that. Somebody should write a library if there isn't one
already.
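
For illustration, one of the well-known fast fixed-string algorithms is
Boyer-Moore-Horspool. The following is only a minimal sketch of it in C; the
function name horspool_find is hypothetical and is not part of PCRE2 or of any
existing library mentioned in this thread.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <string.h>

/*
 * Boyer-Moore-Horspool search for a fixed byte string.
 * Returns a pointer to the first occurrence of needle in haystack,
 * or NULL if there is none. No character in needle is special.
 */
const char *horspool_find(const char *haystack, size_t hlen,
                          const char *needle, size_t nlen)
{
    size_t shift[UCHAR_MAX + 1];
    size_t i, pos;

    assert(haystack != NULL && needle != NULL);

    if (nlen == 0) return haystack;   /* empty needle matches at the start */
    if (nlen > hlen) return NULL;

    /* By default, a mismatch lets us skip the whole needle length. */
    for (i = 0; i <= UCHAR_MAX; i++) shift[i] = nlen;

    /* Bytes that occur in the needle (except its last position) allow
       only a shorter skip, so that no alignment is missed. */
    for (i = 0; i + 1 < nlen; i++)
        shift[(unsigned char)needle[i]] = nlen - 1 - i;

    pos = 0;
    while (pos <= hlen - nlen) {
        if (memcmp(haystack + pos, needle, nlen) == 0)
            return haystack + pos;
        /* Skip according to the haystack byte under the needle's last byte. */
        pos += shift[(unsigned char)haystack[pos + nlen - 1]];
    }
    return NULL;
}
```

Called as horspool_find(s, strlen(s), ".*", 2), this looks for the two literal
characters ".*", which is the kind of "fixed" behaviour REG_NOSPEC asks for.
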
>
> I was thinking that overhauling pcreposix.h might be the way to go, but
> perhaps not. If pcreposix stays the way it is now -- providing exclusively
> PCREs via a POSIX-similar regcomp/regexec API, with the release of the new
> pattern translation functions it would be relatively easy to create a new
> "pcre/regex.h" header and associated pcreregex.c file to make a new
> libpcreregex.a library that provides a 100% POSIX-compatible drop-in
> replacement for the system's regex.h with POSIX-compatible extensions such
> as the *BSDisms and REG_PCRE/REG_JAVASCPT options.


...with one difference. PCRE2, like Perl, returns "first match" (at a given
point in the subject string), whereas POSIX mandates "longest match". However,
I have just in the last few days (while thinking about all this POSIX stuff)
realized that the newly-refactored matching function (not yet released, but
coming soon) probably makes it possible to provide "longest match"
functionality, but at a computational expense, of course. I intend to look into
this once the flurry of existing issues is past.
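
To make the difference concrete, here is a small sketch that uses the system's
own <regex.h> (not pcre2posix); the helper name posix_match_length is
hypothetical. With the extended pattern "a|ab" and the subject "ab", a
POSIX-conforming regexec() should report the longest match, "ab", whereas a
Perl-style engine such as PCRE2 stops at the first alternative that matches,
"a".

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/*
 * Compile an extended POSIX pattern with the system regex library and
 * return the length of the overall match in subject, or -1 if the
 * pattern does not compile or does not match.
 */
long posix_match_length(const char *pattern, const char *subject)
{
    regex_t re;
    regmatch_t m[1];
    long len = -1;

    assert(pattern != NULL && subject != NULL);

    if (regcomp(&re, pattern, REG_EXTENDED) != 0)
        return -1;

    /* m[0] receives the offsets of the whole match. */
    if (regexec(&re, subject, 1, m, 0) == 0)
        len = (long)(m[0].rm_eo - m[0].rm_so);

    regfree(&re);
    return len;
}
```

On a system whose regexec() follows the POSIX rule,
posix_match_length("a|ab", "ab") gives 2; a first-match engine given the same
pattern and subject would match only one character.
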
>
> Eliminating/reducing the learning curve is a good thing because it seems
> that no matter what the circumstance, when adding new features one thing
> always remains the same, there is never enough time.


That is certainly true!
