Re: [pcre-dev] \s and VT (vertical tab)

Author: Jean-Christophe Deschamps
Date:  
To: pcre-dev
Subject: Re: [pcre-dev] \s and VT (vertical tab)
At 13:22 12/01/2015, you wrote:
>On Sun, 11 Jan 2015, Jean-Christophe Deschamps wrote:
>
> > Dear list,
> >
> > Am I missing something? The docs say:
> >
> > "For compatibility with Perl, \s does not match the VT character (code 11).
> > This makes it different from the POSIX "space" class. The \s characters
> > are HT (9), LF (10), FF (12), CR (13), and space (32). If "use locale;" is
> > included in a Perl script, \s may match the VT character. In PCRE, it never
> > does."
> >
> > But AFAICT PCRE \s does match VT.
>
>Which docs? Here is an extract from the ChangeLog for PCRE 8.34:
>
>18. The character VT has been added to the default ("C" locale) set of
>     characters that match \s and are generally treated as white space,
>     following this same change in Perl 5.18. There is now no difference
>     between "Perl space" and "POSIX space". Whether VT is treated as
>     white space in other locales depends on the locale.

>
>The current (8.36) version of pcrepattern says this:
>
> For compatibility with Perl, \s did not used to match the VT
> character (code 11), which made it different from the POSIX
> "space" class. However, Perl added VT at release 5.18, and PCRE
> followed suit at release 8.34. The default \s characters are now HT
> (9), LF (10), VT (11), FF (12), CR (13), and space (32), which are
> defined as white space in the "C" locale. This list may vary if
> locale-specific matching is taking place. For example, in some locales
> the "non-breaking space" character (\xA0) is recognized as white
> space, and in others the VT character is not.


Ah, sorry for the noise. I've seen this reported but my copy of the doc
was simply out of date.
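
For anyone who wants to check the "C" locale list quoted above without
relying on a possibly stale copy of the docs, here is a minimal sketch in
plain C. It does not call PCRE at all: it only asks the standard isspace()
which codes it treats as white space in the "C" locale, which is the set
the 8.36 pcrepattern text says \s now follows by default. The helper name
c_locale_space_codes is made up for this illustration.

```c
#include <ctype.h>
#include <locale.h>
#include <stddef.h>

/* Hypothetical helper, not part of PCRE: stores every code in 0..127 that
   isspace() accepts in the "C" locale into out[] (at most cap entries),
   in ascending order, and returns how many were stored. */
size_t c_locale_space_codes(int *out, size_t cap)
{
    size_t n = 0;

    /* Pin LC_CTYPE to "C" so the result does not depend on the environment.
       Note that this changes the locale for the whole program. */
    setlocale(LC_CTYPE, "C");

    for (int c = 0; c < 128; c++) {
        if (isspace(c) && n < cap)
            out[n++] = c;
    }
    return n;
}
```

Called with a buffer of at least six ints, this should fill in 9, 10, 11,
12, 13 and 32, i.e. HT, LF, VT, FF, CR and space. Whether a given PCRE build
agrees depends on its version (8.34 or later, per the ChangeLog entry) and
on the character tables it was built with.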