Author: Philip Hazel
Date:
To: Zoltán Herczeg
CC: pcre-dev
Subject: Re: [pcre-dev] \u in JavaScript compat mode
On Fri, 11 Nov 2011, Zoltán Herczeg wrote:
> I recently got a notification that PCRE does not support \u in
> PCRE_JAVASCRIPT_COMPAT mode.
\u in Perl means "upper case the next character". PCRE does not support
it at all. The documentation (pcrecompat) says this:

    5. The following Perl escape sequences are not supported: \l, \u, \L,
    \U, and \N when followed by a character name or Unicode value. (\N on
    its own, matching a non-newline character, is supported.) In fact
    these are implemented by Perl's general string-handling and are not
    part of its pattern matching engine. If any of these are encountered
    by PCRE, an error is generated.

> So \u is simply converted to u, and \x as x if not followed by enough
> hex characters.
>
> Philip, could we follow the standard here?
The JAVASCRIPT_COMPAT mode was never meant to make PCRE fully JAVASCRIPT
compatible, as I suspect that is probably impossible. However, it looks
as if adding support for \u in that mode won't break anything else, so I
guess it could be done.
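
For illustration only, the JavaScript rule quoted above (\u followed by
four hex digits, or \x followed by two, is a character code; otherwise
the escape is just the literal letter) might be sketched as follows.
This is not PCRE code: the function name decode_js_hex_escape and its
return convention are hypothetical, and it assumes the caller has
already found a backslash followed by u or x.

```python
from typing import Tuple

_HEX_DIGITS = "0123456789abcdefABCDEF"

# Hypothetical helper, not part of PCRE: number of hex digits each escape needs.
_WIDTH = {"u": 4, "x": 2}


def decode_js_hex_escape(pattern: str, pos: int) -> Tuple[str, int]:
    """Decode the \\u or \\x escape whose backslash is at pattern[pos].

    Returns the character the escape stands for and the index just past
    the text consumed. Assumes pattern[pos + 1] is 'u' or 'x'.
    """
    kind = pattern[pos + 1]
    width = _WIDTH[kind]
    digits = pattern[pos + 2 : pos + 2 + width]

    if len(digits) == width and all(c in _HEX_DIGITS for c in digits):
        # Enough hex digits: the escape is a character code.
        return chr(int(digits, 16)), pos + 2 + width

    # Not enough hex digits: the escape is just the literal letter,
    # and any following characters are left for the caller to handle.
    return kind, pos + 2
```

With this sketch, \u0041 and \x41 both decode to "A", while \u00 and \x4
decode to "u" and "x" and leave the remaining characters to be matched
literally.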