On Tue, Jul 11, 2017 at 03:39:21PM +0530, Saylee via Pcre-dev wrote:
> I have been using PCRE library to validate strings against regular
> expressions. Now I wish to upgrade to PCRE2 but I am facing issue with below
> given regular expression.
>
> Regular expression: ([1-9][0-9]{2,5}[[:space:]-]{1})([0-9]{2,6}[[:space:]-]{1})([0-9]{2,6})([[:space:]-]{1}[0-9]{1,3})?
>
> This expression gets compiled with PCRE, however compiling it with PCRE2
> gives error.
> The error ‘ERR50’ occurs in function ‘parse_regex’.
>
ERR50 means "invalid range in character class". The problematic part is
[[:space:]-]. PCRE2 treats the hyphen as the start of a range and then finds
no right-hand side for that range. If you want a literal hyphen in the set,
either list it as the first character of the set, as in [-[:space:]], or
escape it with a backslash, as in [[:space:]\-].
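If you have many stored patterns to migrate, the backslash fix can be applied mechanically. The following is only a rough sketch in Python, not anything PCRE2 provides: the helper name escape_hyphen_after_posix_class is made up for illustration, and it handles only the one case seen here, a hyphen sitting directly between a POSIX class and the closing bracket of the set.

```python
import re

# Hypothetical helper, not part of PCRE2. It matches a POSIX class such as
# [:space:] (or a negated one such as [:^space:]) followed by a hyphen, but
# only when that hyphen is immediately followed by the closing "]" of the set.
_TRAILING_HYPHEN = re.compile(r"(\[:\^?[a-z]+:\])-(?=\])")


def escape_hyphen_after_posix_class(pattern: str) -> str:
    """Rewrite '[:name:]-]' as '[:name:]\\-]' everywhere in pattern."""
    # A callable replacement avoids backslash processing in the result.
    return _TRAILING_HYPHEN.sub(lambda m: m.group(1) + r"\-", pattern)


original = (
    r"([1-9][0-9]{2,5}[[:space:]-]{1})([0-9]{2,6}[[:space:]-]{1})"
    r"([0-9]{2,6})([[:space:]-]{1}[0-9]{1,3})?"
)
fixed = escape_hyphen_after_posix_class(original)
print(fixed)
```

Ordinary ranges such as [1-9] are left alone because the hyphen there does not follow a POSIX class. Running the helper a second time changes nothing, since the escaped hyphen no longer matches. Hyphens next to escapes such as \d are not covered by this sketch and would still need to be fixed by hand.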
Please read the SQUARE BRACKETS AND CHARACTER CLASSES section of
pcre2pattern(3), especially:
    Perl treats a hyphen as a literal if it appears before or after a POSIX
    class (see below) or a character type escape such as \d, but gives
    a warning in its warning mode, as this is most likely a user error.
    As PCRE2 has no facility for warning, an error is given in these cases.
So I think this is documented behaviour, although it is a little unintuitive,
because the paragraph just above it reads:
    If a minus character is required in a class, it must be escaped with
    a backslash or appear in a position where it cannot be interpreted as
    indicating a range, typically as the first or last character in the
    class, or immediately after a range.
-- Petr