http://bugs.exim.org/show_bug.cgi?id=891
--- Comment #1 from Philip Hazel <ph10@???> 2009-09-23 13:45:47 ---
On Tue, 22 Sep 2009, Alan Lehotsky wrote:
> Apparently one or more implementations (including possibly Henry Spencer's UCB
> regex code) support these as synonyms for the beginning of a word and the end
> of a word respectively.
>
> It would be handy for compatibility to recognize these two also in PCRE.
Are you sure about that? The patterns [[:<:]] and [[:>:]] look like a
modification of the POSIX character class syntax - and a character class
always matches a character. What would be the meaning of [abc[:<:]def]
for example?
I searched Google for documentation about this, and could not find
any. What I did find was that several engines use \< and \> for
beginning and end of word. This is incompatible with Perl, and so could
not be added to PCRE. (In Perl, and PCRE, backslash followed by a non-
alphanumeric character always matches a literal character. That is a
nice, clean rule, and I would not want to violate it, even with a
special option.)
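
To illustrate that rule, here is a minimal sketch. It uses Python's re
module rather than Perl or PCRE, on the assumption (true for ASCII
punctuation) that it follows the same convention. The helper name
is_literal_escape is made up for this example.

```python
import re

def is_literal_escape(ch):
    """Return True if backslash + ch matches exactly the character ch."""
    return re.fullmatch("\\" + ch, ch) is not None

# Backslash followed by a non-alphanumeric character is just that character.
print(is_literal_escape("<"), is_literal_escape(">"))

# So \< cannot double as "start of word": it needs a real '<' in the subject.
print(re.search(r"\<foo", "a foo"))    # no literal '<' present, so no match
print(re.search(r"\<foo", "a <foo"))   # matches the text "<foo"
```

An engine that treats \< as a word-start assertion would match the first
subject string, which is exactly the incompatibility described above.
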
If you can point me at some documentation that specifies what [[:<:]]
and [[:>:]] actually mean in some other regex engine, I will think about
it. They are awfully long sequences, though the Perl and PCRE
equivalents are one or two characters longer still:
\b(?=\w) start of word
\b(?<=\w) end of word
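
As a quick check of those two assertions, here is a small sketch, again
in Python's re module on the assumption that it handles \b, \w and
lookaround the way Perl and PCRE do for ASCII text. The names
START_OF_WORD, END_OF_WORD and word_edges are invented for the example.

```python
import re

# Zero-width assertions: each matches a position, not a character.
START_OF_WORD = re.compile(r"\b(?=\w)")    # boundary with a word char after it
END_OF_WORD = re.compile(r"\b(?<=\w)")     # boundary with a word char before it

def word_edges(text):
    """Return the offsets where words start and where they end."""
    starts = [m.start() for m in START_OF_WORD.finditer(text)]
    ends = [m.start() for m in END_OF_WORD.finditer(text)]
    return starts, ends

starts, ends = word_edges("foo, bar baz")
print(starts)  # [0, 5, 9]
print(ends)    # [3, 8, 12]
```

Both patterns always match the empty string, which is the point of the
objection to [[:<:]]: a bracket expression is expected to consume a
character.
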
Regards,
Philip