Revision: 643
http://vcs.pcre.org/viewvc?view=rev&revision=643
Author: ph10
Date: 2011-07-29 16:56:39 +0100 (Fri, 29 Jul 2011)
Log Message:
-----------
Allow all characters except closing parens in MARK:NAME etc.
Modified Paths:
--------------
code/trunk/ChangeLog
code/trunk/doc/pcrepattern.3
code/trunk/pcre_compile.c
code/trunk/testdata/testinput11
code/trunk/testdata/testoutput11
Modified: code/trunk/ChangeLog
===================================================================
--- code/trunk/ChangeLog 2011-07-28 18:59:40 UTC (rev 642)
+++ code/trunk/ChangeLog 2011-07-29 15:56:39 UTC (rev 643)
@@ -208,6 +208,10 @@
pattern in sufficient detail. The compile time test no longer happens when
PCRE is compiling a conditional subpattern, but actual runaway loops are
now caught at runtime (see 39 above).
+
+41. It seems that Perl allows any characters other than a closing parenthesis
+ to be part of the NAME in (*MARK:NAME) and other backtracking verbs. PCRE
+ has been changed to be the same.
Version 8.12 15-Jan-2011
Modified: code/trunk/doc/pcrepattern.3
===================================================================
--- code/trunk/doc/pcrepattern.3 2011-07-28 18:59:40 UTC (rev 642)
+++ code/trunk/doc/pcrepattern.3 2011-07-29 15:56:39 UTC (rev 643)
@@ -1835,9 +1835,10 @@
capturing is carried out only for positive assertions, because it does not make
sense for negative assertions.
.P
-For compatibility with Perl, assertion subpatterns may be repeated, even though
-it makes no sense to assert the same thing several times. In practice, there
-only three cases:
+For compatibility with Perl, assertion subpatterns may be repeated; though
+it makes no sense to assert the same thing several times, the side effect of
+capturing parentheses may occasionally be useful. In practice, there only three
+cases:
.sp
(1) If the quantifier is {0}, the assertion is never obeyed during matching.
However, it may contain internal capturing parenthesized groups that are called
Modified: code/trunk/pcre_compile.c
===================================================================
--- code/trunk/pcre_compile.c 2011-07-28 18:59:40 UTC (rev 642)
+++ code/trunk/pcre_compile.c 2011-07-29 15:56:39 UTC (rev 643)
@@ -4997,15 +4997,18 @@
previous = NULL;
while ((cd->ctypes[*++ptr] & ctype_letter) != 0) {};
namelen = (int)(ptr - name);
+
+ /* It appears that Perl allows any characters whatsoever, other than
+ a closing parenthesis, to appear in arguments, so we no longer insist on
+ letters, digits, and underscores. */
if (*ptr == CHAR_COLON)
{
arg = ++ptr;
- while ((cd->ctypes[*ptr] & (ctype_letter|ctype_digit)) != 0
- || *ptr == '_') ptr++;
+ while (*ptr != 0 && *ptr != CHAR_RIGHT_PARENTHESIS) ptr++;
arglen = (int)(ptr - arg);
}
-
+
if (*ptr != CHAR_RIGHT_PARENTHESIS)
{
*errorcodeptr = ERR60;
Modified: code/trunk/testdata/testinput11
===================================================================
--- code/trunk/testdata/testinput11 2011-07-28 18:59:40 UTC (rev 642)
+++ code/trunk/testdata/testinput11 2011-07-29 15:56:39 UTC (rev 643)
@@ -666,4 +666,14 @@
/((?(R1)a+|(?1)b))/
aaaabcde
+/a(*:any
+name)/K
+ abc
+
+/a(*:a\x{1234}b)/8K
+ abc
+
+/a(*:a£b)/8K
+ abc
+
/-- End of testinput11 --/
Modified: code/trunk/testdata/testoutput11
===================================================================
--- code/trunk/testdata/testoutput11 2011-07-28 18:59:40 UTC (rev 642)
+++ code/trunk/testdata/testoutput11 2011-07-29 15:56:39 UTC (rev 643)
@@ -1252,4 +1252,21 @@
0: aaaab
1: aaaab
+/a(*:any
+name)/K
+ abc
+ 0: a
+MK: any
+name
+
+/a(*:a\x{1234}b)/8K
+ abc
+ 0: a
+MK: a\x{1234}b
+
+/a(*:a£b)/8K
+ abc
+ 0: a
+MK: a£b
+
/-- End of testinput11 --/
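
The change to the argument scan in pcre_compile.c can be tried in isolation with the sketch below. It is only an illustration under stated assumptions: scan_arg_old, scan_arg_new, arg_is_closed and check_examples are hypothetical helpers that do not exist in PCRE, a literal ')' stands in for CHAR_RIGHT_PARENTHESIS, and the old rule is approximated with isalnum() rather than PCRE's cd->ctypes table.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper (not PCRE code): approximates the old rule, where an
   argument could contain only letters, digits, and underscores. isalnum()
   stands in for the cd->ctypes lookup used by the real compiler. */
static size_t scan_arg_old(const char *ptr)
{
  const char *start = ptr;
  while (isalnum((unsigned char)*ptr) || *ptr == '_') ptr++;
  return (size_t)(ptr - start);
}

/* Hypothetical helper (not PCRE code): mirrors the new loop, which accepts
   every character up to the end of the pattern or a closing parenthesis. */
static size_t scan_arg_new(const char *ptr)
{
  const char *start = ptr;
  while (*ptr != 0 && *ptr != ')') ptr++;
  return (size_t)(ptr - start);
}

/* After the scan, the compiler requires a closing parenthesis; anything else
   is reported as an error (ERR60 in the hunk above). */
static int arg_is_closed(const char *arg, size_t len)
{
  return arg[len] == ')';
}

/* Self-check using the argument from the first new test case, "any\nname". */
static void check_examples(void)
{
  const char *arg = "any\nname)";

  /* Old rule: the scan stops at the newline, so no ')' follows the name. */
  assert(scan_arg_old(arg) == 3);
  assert(!arg_is_closed(arg, scan_arg_old(arg)));

  /* New rule: the whole name is consumed and the ')' is found. */
  assert(scan_arg_new(arg) == 8);
  assert(arg_is_closed(arg, scan_arg_new(arg)));
}
```

Calling check_examples() from a small main() exercises both rules. Note that the new scan also stops at a terminating zero, so an unterminated argument still fails the closing-parenthesis check instead of running off the end of the pattern.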