[pcre-dev] [Bug 1412] \s Randomly Matches xA0

Author: azaozz
Date:  
To: pcre-dev
Subject: [pcre-dev] [Bug 1412] \s Randomly Matches xA0

http://bugs.exim.org/show_bug.cgi?id=1412

azaozz <admin@???> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |admin@???





--- Comment #5 from azaozz <admin@???> 2013-11-10 16:22:07 ---
There seems to be a documentation inconsistency. As far as I can find, the PCRE
man pages say that \s matches only "HT (9), LF (10), FF (12), CR (13), and
space (32)", even in UTF mode. The exception is UTF mode with PCRE_UCP set, in
which case \s matches "any character that \p{Z} matches, plus HT, LF, FF, CR".

However, \s also matches \xA0 (160) when a locale is set and it is anything
other than ASCII. Assuming this is the intended behaviour, could you amend the
documentation to reflect it?

This makes \s quite inconsistent in PHP, as the locale setting depends on the
web server settings and can be changed dynamically. In some circumstances
(multithreaded servers like Apache or IIS on Windows), changing the locale in
one thread affects all other threads.

