Revision: 1363
http://vcs.pcre.org/viewvc?view=rev&revision=1363
Author: ph10
Date: 2013-10-01 17:54:40 +0100 (Tue, 01 Oct 2013)
Log Message:
-----------
Refactored auto-possessification code.
Modified Paths:
--------------
code/trunk/ChangeLog
code/trunk/doc/pcre_compile.3
code/trunk/doc/pcre_compile2.3
code/trunk/doc/pcreapi.3
code/trunk/doc/pcrematching.3
code/trunk/doc/pcretest.1
code/trunk/pcre.h.in
code/trunk/pcre_compile.c
code/trunk/pcre_dfa_exec.c
code/trunk/pcre_internal.h
code/trunk/pcretest.c
code/trunk/testdata/saved16BE-1
code/trunk/testdata/saved16LE-1
code/trunk/testdata/saved32BE-1
code/trunk/testdata/saved32LE-1
code/trunk/testdata/testinput1
code/trunk/testdata/testinput10
code/trunk/testdata/testinput15
code/trunk/testdata/testinput2
code/trunk/testdata/testinput7
code/trunk/testdata/testinput8
code/trunk/testdata/testinput9
code/trunk/testdata/testoutput1
code/trunk/testdata/testoutput10
code/trunk/testdata/testoutput11-16
code/trunk/testdata/testoutput11-32
code/trunk/testdata/testoutput11-8
code/trunk/testdata/testoutput15
code/trunk/testdata/testoutput17
code/trunk/testdata/testoutput18-16
code/trunk/testdata/testoutput18-32
code/trunk/testdata/testoutput2
code/trunk/testdata/testoutput20
code/trunk/testdata/testoutput5
code/trunk/testdata/testoutput7
code/trunk/testdata/testoutput8
code/trunk/testdata/testoutput9
code/trunk/ucp.h
Modified: code/trunk/ChangeLog
===================================================================
--- code/trunk/ChangeLog 2013-10-01 11:27:04 UTC (rev 1362)
+++ code/trunk/ChangeLog 2013-10-01 16:54:40 UTC (rev 1363)
@@ -82,6 +82,13 @@
16. Unicode character properties were updated from Unicode 6.3.0.
+17. The compile-time code for auto-possessification has been refactored, based
+ on a patch by Zoltan Herczeg. It now happens after instead of during
+ compilation. The code is cleaner, and more cases are handled. The option
+ PCRE_NO_AUTO_POSSESSIFY is added for testing purposes, and the -O and /O
+ options in pcretest are provided to set it.
+
+
Version 8.33 28-May-2013
------------------------
Modified: code/trunk/doc/pcre_compile.3
===================================================================
--- code/trunk/doc/pcre_compile.3 2013-10-01 11:27:04 UTC (rev 1362)
+++ code/trunk/doc/pcre_compile.3 2013-10-01 16:54:40 UTC (rev 1363)
@@ -1,4 +1,4 @@
-.TH PCRE_COMPILE 3 "24 June 2012" "PCRE 8.30"
+.TH PCRE_COMPILE 3 "01 October 2013" "PCRE 8.34"
.SH NAME
PCRE - Perl-compatible regular expressions
.SH SYNOPSIS
@@ -51,6 +51,7 @@
PCRE_FIRSTLINE Force matching to be before newline
PCRE_JAVASCRIPT_COMPAT JavaScript compatibility
PCRE_MULTILINE ^ and $ match newlines within data
+ PCRE_NEVER_UTF Lock out UTF, e.g. via (*UTF)
PCRE_NEWLINE_ANY Recognize any Unicode newline sequence
PCRE_NEWLINE_ANYCRLF Recognize CR, LF, and CRLF as newline
sequences
@@ -59,6 +60,8 @@
PCRE_NEWLINE_LF Set LF as the newline sequence
PCRE_NO_AUTO_CAPTURE Disable numbered capturing paren-
theses (named ones available)
+ PCRE_NO_AUTO_POSSESSIFY Disable auto-possessification
+ PCRE_NO_START_OPTIMIZE Disable match-time start optimizations
PCRE_NO_UTF16_CHECK Do not check the pattern for UTF-16
validity (only relevant if
PCRE_UTF16 is set)
Modified: code/trunk/doc/pcre_compile2.3
===================================================================
--- code/trunk/doc/pcre_compile2.3 2013-10-01 11:27:04 UTC (rev 1362)
+++ code/trunk/doc/pcre_compile2.3 2013-10-01 16:54:40 UTC (rev 1363)
@@ -1,4 +1,4 @@
-.TH PCRE_COMPILE2 3 "24 June 2012" "PCRE 8.30"
+.TH PCRE_COMPILE2 3 "01 October 2013" "PCRE 8.34"
.SH NAME
PCRE - Perl-compatible regular expressions
.SH SYNOPSIS
@@ -56,6 +56,7 @@
PCRE_FIRSTLINE Force matching to be before newline
PCRE_JAVASCRIPT_COMPAT JavaScript compatibility
PCRE_MULTILINE ^ and $ match newlines within data
+ PCRE_NEVER_UTF Lock out UTF, e.g. via (*UTF)
PCRE_NEWLINE_ANY Recognize any Unicode newline sequence
PCRE_NEWLINE_ANYCRLF Recognize CR, LF, and CRLF as newline
sequences
@@ -64,6 +65,8 @@
PCRE_NEWLINE_LF Set LF as the newline sequence
PCRE_NO_AUTO_CAPTURE Disable numbered capturing paren-
theses (named ones available)
+ PCRE_NO_AUTO_POSSESSIFY Disable auto-possessification
+ PCRE_NO_START_OPTIMIZE Disable match-time start optimizations
PCRE_NO_UTF16_CHECK Do not check the pattern for UTF-16
validity (only relevant if
PCRE_UTF16 is set)
Modified: code/trunk/doc/pcreapi.3
===================================================================
--- code/trunk/doc/pcreapi.3 2013-10-01 11:27:04 UTC (rev 1362)
+++ code/trunk/doc/pcreapi.3 2013-10-01 16:54:40 UTC (rev 1363)
@@ -1,4 +1,4 @@
-.TH PCREAPI 3 "03 September 2013" "PCRE 8.34"
+.TH PCREAPI 3 "01 October 2013" "PCRE 8.34"
.SH NAME
PCRE - Perl-compatible regular expressions
.sp
@@ -795,6 +795,12 @@
they acquire numbers in the usual way). There is no equivalent of this option
in Perl.
.sp
+ PCRE_NO_AUTO_POSSESSIFY
+.sp
+If this option is set, it disables "auto-possessification". This is an
+optimization that, for example, turns a+b into a++b in order to avoid
+backtracks into a+ that can never be successful.
+.sp
PCRE_NO_START_OPTIMIZE
.sp
This is an option that acts at matching time; that is, it is really an option
@@ -860,10 +866,10 @@
error. If you already know that your pattern is valid, and you want to skip
this check for performance reasons, you can set the PCRE_NO_UTF8_CHECK option.
When it is set, the effect of passing an invalid UTF-8 string as a pattern is
-undefined. It may cause your program to crash. Note that this option can also
-be passed to \fBpcre_exec()\fP and \fBpcre_dfa_exec()\fP, to suppress the
-validity checking of subject strings only. If the same string is being matched
-many times, the option can be safely set for the second and subsequent
+undefined. It may cause your program to crash or loop. Note that this option
+can also be passed to \fBpcre_exec()\fP and \fBpcre_dfa_exec()\fP, to suppress
+the validity checking of subject strings only. If the same string is being
+matched many times, the option can be safely set for the second and subsequent
matchings to improve performance.
.
.
@@ -1931,7 +1937,7 @@
the value of \fIstartoffset\fP points to the start of a character (or the end
of the subject). When PCRE_NO_UTF8_CHECK is set, the effect of passing an
invalid string as a subject or an invalid value of \fIstartoffset\fP is
-undefined. Your program may crash.
+undefined. Your program may crash or loop.
.sp
PCRE_PARTIAL_HARD
PCRE_PARTIAL_SOFT
@@ -2773,6 +2779,14 @@
\fIovector\fP, the yield of the function is zero, and the vector is filled with
the longest matches. Unlike \fBpcre_exec()\fP, \fBpcre_dfa_exec()\fP can use
the entire \fIovector\fP for returning matched strings.
+
+NOTE: PCRE's "auto-possessification" optimization usually applies to character
+repeats at the end of a pattern (as well as internally). For example, the
+pattern "a\ed+" is compiled as if it were "a\ed++" because there is no point
+even considering the possibility of backtracking into the repeated digits. For
+DFA matching, this means that only one possible match is found. If you really
+do want multiple matches in such cases, either use an ungreedy repeat
+("a\ed+?") or set the PCRE_NO_AUTO_POSSESSIFY option when compiling.
.
.
.SS "Error returns from \fBpcre_dfa_exec()\fP"
@@ -2849,6 +2863,6 @@
.rs
.sp
.nf
-Last updated: 03 September 2013
+Last updated: 01 October 2013
Copyright (c) 1997-2013 University of Cambridge.
.fi
Modified: code/trunk/doc/pcrematching.3
===================================================================
--- code/trunk/doc/pcrematching.3 2013-10-01 11:27:04 UTC (rev 1362)
+++ code/trunk/doc/pcrematching.3 2013-10-01 16:54:40 UTC (rev 1363)
@@ -1,4 +1,4 @@
-.TH PCREMATCHING 3 "08 January 2012" "PCRE 8.30"
+.TH PCREMATCHING 3 "01 October 2013" "PCRE 8.34"
.SH NAME
PCRE - Perl-compatible regular expressions
.SH "PCRE MATCHING ALGORITHMS"
@@ -106,6 +106,14 @@
character of the subject. The algorithm does not automatically move on to find
matches that start at later positions.
.P
+PCRE's "auto-possessification" optimization usually applies to character
+repeats at the end of a pattern (as well as internally). For example, the
+pattern "a\ed+" is compiled as if it were "a\ed++" because there is no point
+even considering the possibility of backtracking into the repeated digits. For
+DFA matching, this means that only one possible match is found. If you really
+do want multiple matches in such cases, either use an ungreedy repeat
+("a\ed+?") or set the PCRE_NO_AUTO_POSSESSIFY option when compiling.
+.P
There are a number of features of PCRE regular expressions that are not
supported by the alternative matching algorithm. They are as follows:
.P
@@ -201,6 +209,6 @@
.rs
.sp
.nf
-Last updated: 08 January 2012
+Last updated: 01 October 2013
Copyright (c) 1997-2012 University of Cambridge.
.fi
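The condition behind examples such as a+b becoming a++b can be sketched in a few lines of C. This is an illustration only, not PCRE's code: the charset type and both helper names are hypothetical, and it assumes single-byte characters. The idea is that a greedy repeat can safely be made possessive when no character it can match could also begin the item that follows it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical 256-bit set of byte values; this is not a PCRE type. */
typedef struct { unsigned char bits[32]; } charset;

static void charset_clear(charset *s)
{
  memset(s->bits, 0, sizeof s->bits);
}

/* Add every byte value in the inclusive range lo..hi to the set. */
static void charset_add_range(charset *s, int lo, int hi)
{
  assert(lo >= 0 && hi <= 255 && lo <= hi);
  for (int c = lo; c <= hi; c++)
    s->bits[c >> 3] |= (unsigned char)(1u << (c & 7));
}

/* Returns true if the repeated set and the following set share no character.
   In that case giving back characters from the repeat can never help the
   following item to match, so the repeat may be treated as possessive. */
static bool can_possessify(const charset *repeated, const charset *next)
{
  for (size_t i = 0; i < sizeof repeated->bits; i++)
    if ((repeated->bits[i] & next->bits[i]) != 0) return false;
  return true;
}
```

With a set of the digits 0-9 as the repeated item, a following "b" passes the check (as in \d+b), whereas a following "5" does not, because the repeat might have to give back a digit for the "5" to match.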
Modified: code/trunk/doc/pcretest.1
===================================================================
--- code/trunk/doc/pcretest.1 2013-10-01 11:27:04 UTC (rev 1362)
+++ code/trunk/doc/pcretest.1 2013-10-01 16:54:40 UTC (rev 1363)
@@ -1,4 +1,4 @@
-.TH PCRETEST 1 "27 August 2013" "PCRE 8.34"
+.TH PCRETEST 1 "01 October 2013" "PCRE 8.34"
.SH NAME
pcretest - a program for testing Perl-compatible regular expressions.
.SH SYNOPSIS
@@ -155,6 +155,10 @@
equivalent to adding \fB/M\fP to each regular expression. The size is given in
bytes for both libraries.
.TP 10
+\fB-O\fP
+Behave as if each pattern has the \fB/O\fP modifier, that is, disable
+auto-possessification for all patterns.
+.TP 10
\fB-o\fP \fIosize\fP
Set the number of elements in the output vector that is used when calling
\fBpcre[16|32]_exec()\fP or \fBpcre[16|32]_dfa_exec()\fP to be \fIosize\fP. The
@@ -324,6 +328,7 @@
\fB/M\fP show compiled memory size
\fB/m\fP set PCRE_MULTILINE
\fB/N\fP set PCRE_NO_AUTO_CAPTURE
+ \fB/O\fP set PCRE_NO_AUTO_POSSESSIFY
\fB/P\fP use the POSIX wrapper
\fB/S\fP study the pattern after compilation
\fB/s\fP set PCRE_DOTALL
@@ -380,6 +385,7 @@
\fB/f\fP PCRE_FIRSTLINE
\fB/J\fP PCRE_DUPNAMES
\fB/N\fP PCRE_NO_AUTO_CAPTURE
+ \fB/O\fP PCRE_NO_AUTO_POSSESSIFY
\fB/U\fP PCRE_UNGREEDY
\fB/W\fP PCRE_UCP
\fB/X\fP PCRE_EXTRA
@@ -512,8 +518,8 @@
matched. There are a number of qualifying characters that may follow \fB/S\fP.
They may appear in any order.
.P
-If \fBS\fP is followed by an exclamation mark, \fBpcre[16|32]_study()\fP is called
-with the PCRE_STUDY_EXTRA_NEEDED option, causing it always to return a
+If \fB/S\fP is followed by an exclamation mark, \fBpcre[16|32]_study()\fP is
+called with the PCRE_STUDY_EXTRA_NEEDED option, causing it always to return a
\fBpcre_extra\fP block, even when studying discovers no useful information.
.P
If \fB/S\fP is followed by a second S character, it suppresses studying, even
@@ -1098,6 +1104,6 @@
.rs
.sp
.nf
-Last updated: 27 August 2013
+Last updated: 01 October 2013
Copyright (c) 1997-2013 University of Cambridge.
.fi
Modified: code/trunk/pcre.h.in
===================================================================
--- code/trunk/pcre.h.in 2013-10-01 11:27:04 UTC (rev 1362)
+++ code/trunk/pcre.h.in 2013-10-01 16:54:40 UTC (rev 1363)
@@ -150,7 +150,10 @@
#define PCRE_NEVER_UTF 0x00010000 /* C1 ) Overlaid */
#define PCRE_DFA_SHORTEST 0x00010000 /* D ) Overlaid */
-#define PCRE_DFA_RESTART 0x00020000 /* D */
+/* This pair use the same bit. */
+#define PCRE_NO_AUTO_POSSESSIFY 0x00020000 /* C1 ) Overlaid */
+#define PCRE_DFA_RESTART 0x00020000 /* D ) Overlaid */
+
#define PCRE_FIRSTLINE 0x00040000 /* C3 */
#define PCRE_DUPNAMES 0x00080000 /* C1 */
#define PCRE_NEWLINE_CR 0x00100000 /* C3 E D */
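The overlay in this hunk is workable because the two options are never read by the same function: the "C1" flag is a compile-time option and the "D" flag is a pcre_dfa_exec() option. A minimal sketch of that idea follows. The two values are copied from the hunk; the two reader functions are hypothetical and exist only to show that each side interprets the shared bit in its own way.

```c
#include <assert.h>
#include <stdbool.h>

/* Values copied from the hunk above; both names map to the same bit. */
#define PCRE_NO_AUTO_POSSESSIFY 0x00020000
#define PCRE_DFA_RESTART        0x00020000

static_assert(PCRE_NO_AUTO_POSSESSIFY == PCRE_DFA_RESTART,
              "overlaid options share one bit");

/* Hypothetical compile-side reader: the bit means "skip auto-possessify". */
static bool compile_wants_auto_possessify(int compile_options)
{
  return (compile_options & PCRE_NO_AUTO_POSSESSIFY) == 0;
}

/* Hypothetical DFA-side reader: the same bit means "this is a restart". */
static bool dfa_is_restart(int exec_options)
{
  return (exec_options & PCRE_DFA_RESTART) != 0;
}
```

The cost of sharing a bit is that an option word is only meaningful together with the function it is passed to, so a compile option word must not be forwarded unchanged to the DFA matcher.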
Modified: code/trunk/pcre_compile.c
===================================================================
--- code/trunk/pcre_compile.c 2013-10-01 11:27:04 UTC (rev 1362)
+++ code/trunk/pcre_compile.c 2013-10-01 16:54:40 UTC (rev 1363)
@@ -115,9 +115,9 @@
#define COMPILE_WORK_SIZE (2048*LINK_SIZE)
#define COMPILE_WORK_SIZE_MAX (100*COMPILE_WORK_SIZE)
-/* This value determines the size of the initial vector that is used for
-remembering named groups during the pre-compile. It is allocated on the stack,
-but if it is too small, it is expanded using malloc(), in a similar way to the
+/* This value determines the size of the initial vector that is used for
+remembering named groups during the pre-compile. It is allocated on the stack,
+but if it is too small, it is expanded using malloc(), in a similar way to the
workspace. The value is the number of slots in the list. */
#define NAMED_GROUP_LIST_SIZE 20
@@ -655,7 +655,126 @@
#endif
+/* This table is used to check whether auto-possessification is possible
+between adjacent character-type opcodes. The left-hand (repeated) opcode is
+used to select the row, and the right-hand opcode is used to select the column.
+A value of 1 means that auto-possessification is OK. For example, the second
+value in the first row means that \D+\d can be turned into \D++\d.
+The Unicode property types (\P and \p) have to be present to fill out the table
+because of what their opcode values are, but the table values should always be
+zero because property types are handled separately in the code. The last four
+columns apply to items that cannot be repeated, so there is no need to have
+rows for them. Note that OP_DIGIT etc. are generated only when PCRE_UCP is
+*not* set. When it is set, \d etc. are converted into OP_(NOT_)PROP codes. */
+
+#define APTROWS (LAST_AUTOTAB_LEFT_OP - FIRST_AUTOTAB_OP + 1)
+#define APTCOLS (LAST_AUTOTAB_RIGHT_OP - FIRST_AUTOTAB_OP + 1)
+
+static const pcre_uint8 autoposstab[APTROWS][APTCOLS] = {
+/* \D \d \S \s \W \w . .+ \C \P \p \R \H \h \V \v \X \Z \z $ $M */
+ { 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0 }, /* \D */
+ { 1, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 1, 0, 1, 0, 1, 0, 1, 1, 1, 1 }, /* \d */
+ { 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 1, 0, 1, 0, 1, 0, 1, 1, 1, 1 }, /* \S */
+ { 0, 1, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0 }, /* \s */
+ { 0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0 }, /* \W */
+ { 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 1, 0, 1, 0, 1, 0, 1, 1, 1, 1 }, /* \w */
+ { 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0 }, /* . */
+ { 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0 }, /* .+ */
+ { 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0 }, /* \C */
+ { 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 }, /* \P */
+ { 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 }, /* \p */
+ { 0, 1, 0, 1, 0, 1, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0 }, /* \R */
+ { 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0 }, /* \H */
+ { 0, 1, 1, 0, 0, 1, 0, 0, 0, 0, 0, 1, 1, 0, 0, 1, 0, 0, 1, 0, 0 }, /* \h */
+ { 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 1, 0, 0 }, /* \V */
+ { 0, 1, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 1, 0, 0 }, /* \v */
+ { 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0 } /* \X */
+};
+
+
+/* This table is used to check whether auto-possessification is possible
+between adjacent Unicode property opcodes (OP_PROP and OP_NOTPROP). The
+left-hand (repeated) opcode is used to select the row, and the right-hand
+opcode is used to select the column. The values are as follows:
+
+ 0 Always return FALSE (never auto-possessify)
+ 1 Character groups are distinct (possessify if both are OP_PROP)
+ 2 Check character categories in the same group (general or particular)
+ 3 TRUE if the two opcodes are not the same (PROP vs NOTPROP)
+
+ 4 Check left general category vs right particular category
+ 5 Check right general category vs left particular category
+
+ 6 Left alphanum vs right general category
+ 7 Left space vs right general category
+ 8 Left word vs right general category
+
+ 9 Right alphanum vs left general category
+ 10 Right space vs left general category
+ 11 Right word vs left general category
+
+ 12 Left alphanum vs right particular category
+ 13 Left space vs right particular category
+ 14 Left word vs right particular category
+
+ 15 Right alphanum vs left particular category
+ 16 Right space vs left particular category
+ 17 Right word vs left particular category
+*/
+
+static const pcre_uint8 propposstab[PT_TABSIZE][PT_TABSIZE] = {
+/* ANY LAMP GC PC SC ALNUM SPACE PXSPACE WORD CLIST UCNC */
+ { 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 }, /* PT_ANY */
+ { 0, 3, 0, 0, 0, 3, 1, 1, 0, 0, 0 }, /* PT_LAMP */
+ { 0, 0, 2, 4, 0, 9, 10, 10, 11, 0, 0 }, /* PT_GC */
+ { 0, 0, 5, 2, 0, 15, 16, 16, 17, 0, 0 }, /* PT_PC */
+ { 0, 0, 0, 0, 2, 0, 0, 0, 0, 0, 0 }, /* PT_SC */
+ { 0, 3, 6, 12, 0, 3, 1, 1, 0, 0, 0 }, /* PT_ALNUM */
+ { 0, 1, 7, 13, 0, 1, 3, 3, 1, 0, 0 }, /* PT_SPACE */
+ { 0, 1, 7, 13, 0, 1, 3, 3, 1, 0, 0 }, /* PT_PXSPACE */
+ { 0, 0, 8, 14, 0, 0, 1, 1, 3, 0, 0 }, /* PT_WORD */
+ { 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 }, /* PT_CLIST */
+ { 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 3 } /* PT_UCNC */
+};
+
+/* This table is used to check whether auto-possessification is possible
+between adjacent Unicode property opcodes (OP_PROP and OP_NOTPROP) when one
+specifies a general category and the other specifies a particular category. The
+row is selected by the general category and the column by the particular
+category. The value is 1 if the particular category is not part of the general
+category. */
+
+static const pcre_uint8 catposstab[7][30] = {
+/* Cc Cf Cn Co Cs Ll Lm Lo Lt Lu Mc Me Mn Nd Nl No Pc Pd Pe Pf Pi Po Ps Sc Sk Sm So Zl Zp Zs */
+ { 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1 }, /* C */
+ { 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1 }, /* L */
+ { 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1 }, /* M */
+ { 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1 }, /* N */
+ { 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1 }, /* P */
+ { 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 1, 1, 1 }, /* S */
+ { 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0 } /* Z */
+};
+
+/* This table is used when checking ALNUM, (PX)SPACE, SPACE, and WORD against
+a general or particular category. The properties in each row are those
+that apply to the character set in question. Duplication means that a little
+unnecessary work is done when checking, but this keeps things much simpler
+because they can all use the same code. For more details see the comment where
+this table is used.
+
+Note: SPACE and PXSPACE used to be different because Perl excluded VT from
+"space", but from Perl 5.18 it's included, so both categories are treated the
+same here. */
+
+static const pcre_uint8 posspropstab[3][4] = {
+ { ucp_L, ucp_N, ucp_N, ucp_Nl }, /* ALNUM, 3rd and 4th values redundant */
+ { ucp_Z, ucp_Z, ucp_C, ucp_Cc }, /* SPACE and PXSPACE, 2nd value redundant */
+ { ucp_L, ucp_N, ucp_P, ucp_Po } /* WORD */
+};
+
+
+
/*************************************************
* Find an error text *
*************************************************/
@@ -682,6 +801,7 @@
}
+
/*************************************************
* Expand the workspace *
*************************************************/
@@ -1199,6 +1319,8 @@
return escape;
}
+
+
#ifdef SUPPORT_UCP
/*************************************************
* Handle \P and \p *
@@ -1296,7 +1418,6 @@
-
/*************************************************
* Read repeat counts *
*************************************************/
@@ -1419,7 +1540,6 @@
-
/*************************************************
* Find the fixed length of a branch *
*************************************************/
@@ -1760,7 +1880,6 @@
-
/*************************************************
* Scan compiled regex for specific bracket *
*************************************************/
@@ -2462,6 +2581,851 @@
/*************************************************
+* Base opcode of repeated opcodes *
+*************************************************/
+
+/* Returns the base opcode for repeated single character type opcodes. If the
+opcode is not a repeated character type, it returns with the original value.
+
+Arguments: c opcode
+Returns: base opcode for the type
+*/
+
+static pcre_uchar
+get_repeat_base(pcre_uchar c)
+{
+return (c > OP_TYPEPOSUPTO)? c :
+ (c >= OP_TYPESTAR)? OP_TYPESTAR :
+ (c >= OP_NOTSTARI)? OP_NOTSTARI :
+ (c >= OP_NOTSTAR)? OP_NOTSTAR :
+ (c >= OP_STARI)? OP_STARI :
+ OP_STAR;
+}
+
+
+
+#ifdef SUPPORT_UCP
+/*************************************************
+* Check a character and a property *
+*************************************************/
+
+/* This function is called by check_auto_possessive() when a property item
+is adjacent to a fixed character.
+
+Arguments:
+ c the character
+ ptype the property type
+ pdata the data for the type
+ negated TRUE if it's a negated property (\P or \p{^)
+
+Returns: TRUE if auto-possessifying is OK
+*/
+
+static BOOL
+check_char_prop(pcre_uint32 c, unsigned int ptype, unsigned int pdata,
+ BOOL negated)
+{
+const pcre_uint32 *p;
+const ucd_record *prop = GET_UCD(c);
+
+switch(ptype)
+ {
+ case PT_LAMP:
+ return (prop->chartype == ucp_Lu ||
+ prop->chartype == ucp_Ll ||
+ prop->chartype == ucp_Lt) == negated;
+
+ case PT_GC:
+ return (pdata == PRIV(ucp_gentype)[prop->chartype]) == negated;
+
+ case PT_PC:
+ return (pdata == prop->chartype) == negated;
+
+ case PT_SC:
+ return (pdata == prop->script) == negated;
+
+ /* These are specials */
+
+ case PT_ALNUM:
+ return (PRIV(ucp_gentype)[prop->chartype] == ucp_L ||
+ PRIV(ucp_gentype)[prop->chartype] == ucp_N) == negated;
+
+ case PT_SPACE: /* Perl space */
+ return (PRIV(ucp_gentype)[prop->chartype] == ucp_Z ||
+ c == CHAR_HT || c == CHAR_NL || c == CHAR_FF || c == CHAR_CR)
+ == negated;
+
+ case PT_PXSPACE: /* POSIX space */
+ return (PRIV(ucp_gentype)[prop->chartype] == ucp_Z ||
+ c == CHAR_HT || c == CHAR_NL || c == CHAR_VT ||
+ c == CHAR_FF || c == CHAR_CR)
+ == negated;
+
+ case PT_WORD:
+ return (PRIV(ucp_gentype)[prop->chartype] == ucp_L ||
+ PRIV(ucp_gentype)[prop->chartype] == ucp_N ||
+ c == CHAR_UNDERSCORE) == negated;
+
+ case PT_CLIST:
+ p = PRIV(ucd_caseless_sets) + prop->caseset;
+ for (;;)
+ {
+ if (c < *p) return !negated;
+ if (c == *p++) return negated;
+ }
+ break; /* Control never reaches here */
+ }
+
+return FALSE;
+}
+#endif /* SUPPORT_UCP */
+
+
+
+/*************************************************
+* Fill the character property list *
+*************************************************/
+
+/* Checks whether the code points to an opcode that can take part in auto-
+possessification, and if so, fills a list with its properties.
+
+Arguments:
+ code points to start of expression
+ utf TRUE if in UTF-8 / UTF-16 / UTF-32 mode
+ fcc points to case-flipping table
+ list points to output list
+ list[0] will be filled with the opcode
+ list[1] will be non-zero if this opcode
+ can match an empty character string
+ list[2..7] depends on the opcode
+
+Returns: points to the start of the next opcode if *code is accepted
+ NULL if *code is not accepted
+*/
+
+static const pcre_uchar *
+get_chr_property_list(const pcre_uchar *code, BOOL utf,
+ const pcre_uint8 *fcc, pcre_uint32 *list)
+{
+pcre_uchar c = *code;
+const pcre_uchar *end;
+const pcre_uint32 *clist_src;
+pcre_uint32 *clist_dest;
+pcre_uint32 chr;
+pcre_uchar base;
+
+list[0] = c;
+list[1] = FALSE;
+code++;
+
+if (c >= OP_STAR && c <= OP_TYPEPOSUPTO)
+ {
+ base = get_repeat_base(c);
+ c -= (base - OP_STAR);
+
+ if (c == OP_UPTO || c == OP_MINUPTO || c == OP_EXACT || c == OP_POSUPTO)
+ code += IMM2_SIZE;
+
+ list[1] = (c != OP_PLUS && c != OP_MINPLUS && c != OP_EXACT && c != OP_POSPLUS);
+
+ switch(base)
+ {
+ case OP_STAR:
+ list[0] = OP_CHAR;
+ break;
+
+ case OP_STARI:
+ list[0] = OP_CHARI;
+ break;
+
+ case OP_NOTSTAR:
+ list[0] = OP_NOT;
+ break;
+
+ case OP_NOTSTARI:
+ list[0] = OP_NOTI;
+ break;
+
+ case OP_TYPESTAR:
+ list[0] = *code;
+ code++;
+ break;
+ }
+ c = list[0];
+ }
+
+switch(c)
+ {
+ case OP_NOT_DIGIT:
+ case OP_DIGIT:
+ case OP_NOT_WHITESPACE:
+ case OP_WHITESPACE:
+ case OP_NOT_WORDCHAR:
+ case OP_WORDCHAR:
+ case OP_ANY:
+ case OP_ALLANY:
+ case OP_ANYNL:
+ case OP_NOT_HSPACE:
+ case OP_HSPACE:
+ case OP_NOT_VSPACE:
+ case OP_VSPACE:
+ case OP_EXTUNI:
+ case OP_EODN:
+ case OP_EOD:
+ case OP_DOLL:
+ case OP_DOLLM:
+ return code;
+
+ case OP_CHAR:
+ case OP_NOT:
+ GETCHARINCTEST(chr, code);
+ list[2] = chr;
+ list[3] = NOTACHAR;
+ return code;
+
+ case OP_CHARI:
+ case OP_NOTI:
+ list[0] = (c == OP_CHARI) ? OP_CHAR : OP_NOT;
+ GETCHARINCTEST(chr, code);
+ list[2] = chr;
+
+#ifdef SUPPORT_UCP
+ if (chr < 128 || (chr < 256 && !utf))
+ list[3] = fcc[chr];
+ else
+ list[3] = UCD_OTHERCASE(chr);
+#elif defined SUPPORT_UTF || !defined COMPILE_PCRE8
+ list[3] = (chr < 256) ? fcc[chr] : chr;
+#else
+ list[3] = fcc[chr];
+#endif
+
+ /* The othercase might be the same value. */
+
+ if (chr == list[3])
+ list[3] = NOTACHAR;
+ else
+ list[4] = NOTACHAR;
+ return code;
+
+#ifdef SUPPORT_UCP
+ case OP_PROP:
+ case OP_NOTPROP:
+ if (code[0] != PT_CLIST)
+ {
+ list[2] = code[0];
+ list[3] = code[1];
+ return code + 2;
+ }
+
+ /* Convert only if we have enough space. */
+
+ clist_src = PRIV(ucd_caseless_sets) + code[1];
+ clist_dest = list + 2;
+ code += 2;
+
+ do {
+ /* Early return if there is not enough space. */
+ if (clist_dest >= list + 8)
+ {
+ list[2] = code[0];
+ list[3] = code[1];
+ return code;
+ }
+ *clist_dest++ = *clist_src;
+ }
+ while(*clist_src++ != NOTACHAR);
+
+ /* Enough space to store all characters. */
+
+ list[0] = (c == OP_PROP) ? OP_CHAR : OP_NOT;
+ return code;
+#endif
+
+ case OP_NCLASS:
+ case OP_CLASS:
+#if defined SUPPORT_UTF || !defined COMPILE_PCRE8
+ case OP_XCLASS:
+
+ if (c == OP_XCLASS)
+ end = code + GET(code, 0);
+ else
+#endif
+ end = code + 32 / sizeof(pcre_uchar);
+
+ switch(*end)
+ {
+ case OP_CRSTAR:
+ case OP_CRMINSTAR:
+ case OP_CRQUERY:
+ case OP_CRMINQUERY:
+ list[1] = TRUE;
+ end++;
+ break;
+
+ case OP_CRRANGE:
+ case OP_CRMINRANGE:
+ list[1] = (GET2(end, 1) == 0);
+ end += 1 + 2 * IMM2_SIZE;
+ break;
+ }
+ list[2] = end - code;
+ return end;
+ }
+return NULL; /* Opcode not accepted */
+}
+
+
+
+/*************************************************
+* Scan further character sets for match *
+*************************************************/
+
+/* Checks whether the base and the current opcode have a common character, in
+which case the base cannot be possessified.
+
+Arguments:
+ code points to the byte code
+ utf TRUE in UTF-8 / UTF-16 / UTF-32 mode
+ cd static compile data
+ base_list the data list of the base opcode
+
+Returns: TRUE if the auto-possessification is possible
+*/
+
+static BOOL
+compare_opcodes(const pcre_uchar *code, BOOL utf, const compile_data *cd,
+ const pcre_uint32* base_list)
+{
+pcre_uchar c;
+pcre_uint32 list[8];
+const pcre_uint32* chr_ptr;
+const pcre_uint32* ochr_ptr;
+const pcre_uint32* list_ptr;
+pcre_uint32 chr;
+
+for(;;)
+ {
+ c = *code;
+
+ /* Skip over callouts */
+
+ if (c == OP_CALLOUT)
+ {
+ code += PRIV(OP_lengths)[c];
+ continue;
+ }
+
+ if (c == OP_ALT)
+ {
+ do code += GET(code, 1); while (*code == OP_ALT);
+ c = *code;
+ }
+
+ switch(c)
+ {
+ case OP_END:
+ /* TRUE only in greedy case. The non-greedy case could be replaced by an
+ OP_EXACT, but it is probably not worth it. (And note that OP_EXACT uses
+ more memory, which we cannot get at this stage.) */
+
+ return base_list[1] != 0;
+
+ case OP_KET:
+ /* If the bracket is capturing, and referenced by an OP_RECURSE, the
+ non-greedy case cannot be converted to a possessive form. We do not test
+ the bracket type at the moment, but we might do it in the future to improve
+ this condition. (But note that recursive calls are always atomic.) */
+
+ if (base_list[1] == 0) return FALSE;
+ code += PRIV(OP_lengths)[c];
+ continue;
+ }
+
+ /* Check for a supported opcode, and load its properties. */
+
+ code = get_chr_property_list(code, utf, cd->fcc, list);
+ if (code == NULL) return FALSE; /* Unsupported */
+
+ /* If either opcode is a small character list, set pointers for comparing
+ characters from that list with another list, or with a property. */
+
+ if (base_list[0] == OP_CHAR)
+ {
+ chr_ptr = base_list + 2;
+ list_ptr = list;
+ }
+ else if (list[0] == OP_CHAR)
+ {
+ chr_ptr = list + 2;
+ list_ptr = base_list;
+ }
+
+ /* Some property combinations are also acceptable. Unicode property opcodes are
+ processed specially; the rest can be handled with a lookup table. */
+
+ else
+ {
+ pcre_uint32 leftop, rightop;
+
+ if (list[1] != 0) return FALSE; /* Must match at least one character */
+ leftop = base_list[0];
+ rightop = list[0];
+
+#ifdef SUPPORT_UCP
+ if (leftop == OP_PROP || leftop == OP_NOTPROP)
+ {
+ if (rightop == OP_EOD) return TRUE;
+ if (rightop == OP_PROP || rightop == OP_NOTPROP)
+ {
+ int n;
+ const pcre_uint8 *p;
+ BOOL same = leftop == rightop;
+ BOOL lisprop = leftop == OP_PROP;
+ BOOL risprop = rightop == OP_PROP;
+ BOOL bothprop = lisprop && risprop;
+
+ /* There's a table that specifies how each combination is to be
+ processed:
+ 0 Always return FALSE (never auto-possessify)
+ 1 Character groups are distinct (possessify if both are OP_PROP)
+ 2 Check character categories in the same group (general or particular)
+ 3 Return TRUE if the two opcodes are not the same
+ ... see comments below
+ */
+
+ n = propposstab[base_list[2]][list[2]];
+ switch(n)
+ {
+ case 0: return FALSE;
+ case 1: return bothprop;
+ case 2: return (base_list[3] == list[3]) != same;
+ case 3: return !same;
+
+ case 4: /* Left general category, right particular category */
+ return risprop && catposstab[base_list[3]][list[3]] == same;
+
+ case 5: /* Right general category, left particular category */
+ return lisprop && catposstab[list[3]][base_list[3]] == same;
+
+ /* This code is logically tricky. Think hard before fiddling with it.
+ The posspropstab table has four entries per row. Each row relates to
+ one of PCRE's special properties such as ALNUM or SPACE or WORD.
+ Only WORD actually needs all four entries, but using repeats for the
+ others means they can all use the same code below.
+
+ The first two entries in each row are Unicode general categories, and
+ apply always, because all the characters they include are part of the
+ PCRE character set. The third and fourth entries are a general and a
+ particular category, respectively, that include one or more relevant
+ characters. One or the other is used, depending on whether the check
+ is for a general or a particular category. However, in both cases the
+ category contains more characters than the specials that are defined
+ for the property being tested against. Therefore, it cannot be used
+ in a NOTPROP case.
+
+ Example: the row for WORD contains ucp_L, ucp_N, ucp_P, ucp_Po.
+ Underscore is covered by ucp_P or ucp_Po. */
+
+ case 6: /* Left alphanum vs right general category */
+ case 7: /* Left space vs right general category */
+ case 8: /* Left word vs right general category */
+ p = posspropstab[n-6];
+ return risprop && lisprop ==
+ (list[3] != p[0] &&
+ list[3] != p[1] &&
+ (list[3] != p[2] || !lisprop));
+
+ case 9: /* Right alphanum vs left general category */
+ case 10: /* Right space vs left general category */
+ case 11: /* Right word vs left general category */
+ p = posspropstab[n-9];
+ return lisprop && risprop ==
+ (base_list[3] != p[0] &&
+ base_list[3] != p[1] &&
+ (base_list[3] != p[2] || !risprop));
+
+ case 12: /* Left alphanum vs right particular category */
+ case 13: /* Left space vs right particular category */
+ case 14: /* Left word vs right particular category */
+ p = posspropstab[n-12];
+ return risprop && lisprop ==
+ (catposstab[p[0]][list[3]] &&
+ catposstab[p[1]][list[3]] &&
+ (list[3] != p[3] || !lisprop));
+
+ case 15: /* Right alphanum vs left particular category */
+ case 16: /* Right space vs left particular category */
+ case 17: /* Right word vs left particular category */
+ p = posspropstab[n-15];
+ return lisprop && risprop ==
+ (catposstab[p[0]][base_list[3]] &&
+ catposstab[p[1]][base_list[3]] &&
+ (base_list[3] != p[3] || !risprop));
+ }
+ }
+ return FALSE;
+ }
+
+ else
+#endif /* SUPPORT_UCP */
+
+ return leftop >= FIRST_AUTOTAB_OP && leftop <= LAST_AUTOTAB_LEFT_OP &&
+ rightop >= FIRST_AUTOTAB_OP && rightop <= LAST_AUTOTAB_RIGHT_OP &&
+ autoposstab[leftop - FIRST_AUTOTAB_OP][rightop - FIRST_AUTOTAB_OP];
+ }
+
+ /* Control reaches here only if one of the items is a small character list.
+ All characters are checked against the other side. */
+
+ do
+ {
+ chr = *chr_ptr;
+
+ switch(list_ptr[0])
+ {
+ case OP_CHAR:
+ ochr_ptr = list_ptr + 2;
+ do
+ {
+ if (chr == *ochr_ptr) return FALSE;
+ ochr_ptr++;
+ }
+ while(*ochr_ptr != NOTACHAR);
+ break;
+
+ case OP_NOT:
+ ochr_ptr = list_ptr + 2;
+ do
+ {
+ if (chr == *ochr_ptr)
+ break;
+ ochr_ptr++;
+ }
+ while(*ochr_ptr != NOTACHAR);
+ if (*ochr_ptr == NOTACHAR) return FALSE; /* Not found */
+ break;
+
+ /* Note that OP_DIGIT etc. are generated only when PCRE_UCP is *not*
+ set. When it is set, \d etc. are converted into OP_(NOT_)PROP codes. */
+
+ case OP_DIGIT:
+ if (chr < 256 && (cd->ctypes[chr] & ctype_digit) != 0) return FALSE;
+ break;
+
+ case OP_NOT_DIGIT:
+ if (chr > 255 || (cd->ctypes[chr] & ctype_digit) == 0) return FALSE;
+ break;
+
+ case OP_WHITESPACE:
+ if (chr < 256 && (cd->ctypes[chr] & ctype_space) != 0) return FALSE;
+ break;
+
+ case OP_NOT_WHITESPACE:
+ if (chr > 255 || (cd->ctypes[chr] & ctype_space) == 0) return FALSE;
+ break;
+
+ case OP_WORDCHAR:
+ if (chr < 256 && (cd->ctypes[chr] & ctype_word) != 0) return FALSE;
+ break;
+
+ case OP_NOT_WORDCHAR:
+ if (chr > 255 || (cd->ctypes[chr] & ctype_word) == 0) return FALSE;
+ break;
+
+ case OP_HSPACE:
+ switch(chr)
+ {
+ HSPACE_CASES: return FALSE;
+ default: break;
+ }
+ break;
+
+ case OP_NOT_HSPACE:
+ switch(chr)
+ {
+ HSPACE_CASES: break;
+ default: return FALSE;
+ }
+ break;
+
+ case OP_ANYNL:
+ case OP_VSPACE:
+ switch(chr)
+ {
+ VSPACE_CASES: return FALSE;
+ default: break;
+ }
+ break;
+
+ case OP_NOT_VSPACE:
+ switch(chr)
+ {
+ VSPACE_CASES: break;
+ default: return FALSE;
+ }
+ break;
+
+ case OP_DOLL:
+ case OP_EODN:
+ switch (chr)
+ {
+ case CHAR_CR:
+ case CHAR_LF:
+ case CHAR_VT:
+ case CHAR_FF:
+ case CHAR_NEL:
+#ifndef EBCDIC
+ case 0x2028:
+ case 0x2029:
+#endif /* Not EBCDIC */
+ return FALSE;
+ }
+ break;
+
+ case OP_EOD: /* Can always possessify before \z */
+ break;
+
+ case OP_PROP:
+ case OP_NOTPROP:
+ if (!check_char_prop(chr, list_ptr[2], list_ptr[3],
+ list_ptr[0] == OP_NOTPROP))
+ return FALSE;
+ break;
+
+ /* The class comparisons work only when the class is the second item
+ of the pair, because there are at present no possessive forms of the
+ class opcodes. Note also that the "code" variable that is used below
+ points after the second item, and that the pointer for the first item
+ is not available, so even if there were possessive forms of the class
+ opcodes, the correct comparison could not be done. */
+
+ case OP_NCLASS:
+ if (chr > 255) return FALSE;
+ /* Fall through */
+
+ case OP_CLASS:
+ if (list_ptr != list) return FALSE; /* Class is first opcode */
+ if (chr > 255) break;
+ if ((((pcre_uint8 *)(code - list_ptr[2] + 1))[chr >> 3] & (1 << (chr & 7))) != 0)
+ return FALSE;
+ break;
+
+#if defined SUPPORT_UTF || !defined COMPILE_PCRE8
+ case OP_XCLASS:
+ if (list_ptr != list) return FALSE; /* Class is first opcode */
+ if (PRIV(xclass)(chr, code - list_ptr[2] + 1 + LINK_SIZE, utf))
+ return FALSE;
+ break;
+#endif
+
+ default:
+ return FALSE;
+ }
+
+ chr_ptr++;
+ }
+ while(*chr_ptr != NOTACHAR);
+
+ /* At least one character must be matched from this opcode. */
+
+ if (list[1] == 0) return TRUE;
+ }
+
+return FALSE;
+}
+
+
+
+/*************************************************
+* Scan compiled regex for auto-possession *
+*************************************************/
+
+/* Replaces single character iterations with their possessive alternatives
+if appropriate. This function modifies the compiled opcode!
+
+Arguments:
+ code points to start of the byte code
+ utf TRUE in UTF-8 / UTF-16 / UTF-32 mode
+ cd static compile data
+
+Returns: nothing
+*/
+
+static void
+auto_possessify(pcre_uchar *code, BOOL utf, const compile_data *cd)
+{
+register pcre_uchar c;
+const pcre_uchar *end;
+pcre_uint32 list[8];
+
+for (;;)
+ {
+ c = *code;
+
+ if (c >= OP_STAR && c <= OP_TYPEPOSUPTO)
+ {
+ c -= get_repeat_base(c) - OP_STAR;
+ end = (c <= OP_MINUPTO) ?
+ get_chr_property_list(code, utf, cd->fcc, list) : NULL;
+ list[1] = c == OP_STAR || c == OP_PLUS || c == OP_QUERY || c == OP_UPTO;
+
+ if (end != NULL && compare_opcodes(end, utf, cd, list))
+ {
+ switch(c)
+ {
+ case OP_STAR:
+ *code += OP_POSSTAR - OP_STAR;
+ break;
+
+ case OP_MINSTAR:
+ *code += OP_POSSTAR - OP_MINSTAR;
+ break;
+
+ case OP_PLUS:
+ *code += OP_POSPLUS - OP_PLUS;
+ break;
+
+ case OP_MINPLUS:
+ *code += OP_POSPLUS - OP_MINPLUS;
+ break;
+
+ case OP_QUERY:
+ *code += OP_POSQUERY - OP_QUERY;
+ break;
+
+ case OP_MINQUERY:
+ *code += OP_POSQUERY - OP_MINQUERY;
+ break;
+
+ case OP_UPTO:
+ *code += OP_POSUPTO - OP_UPTO;
+ break;
+
+ case OP_MINUPTO:
+ *code += OP_POSUPTO - OP_MINUPTO;
+ break;
+ }
+ }
+ c = *code;
+ }
+
+ switch(c)
+ {
+ case OP_END:
+ return;
+
+ case OP_TYPESTAR:
+ case OP_TYPEMINSTAR:
+ case OP_TYPEPLUS:
+ case OP_TYPEMINPLUS:
+ case OP_TYPEQUERY:
+ case OP_TYPEMINQUERY:
+ case OP_TYPEPOSSTAR:
+ case OP_TYPEPOSPLUS:
+ case OP_TYPEPOSQUERY:
+ if (code[1] == OP_PROP || code[1] == OP_NOTPROP) code += 2;
+ break;
+
+ case OP_TYPEUPTO:
+ case OP_TYPEMINUPTO:
+ case OP_TYPEEXACT:
+ case OP_TYPEPOSUPTO:
+ if (code[1 + IMM2_SIZE] == OP_PROP || code[1 + IMM2_SIZE] == OP_NOTPROP)
+ code += 2;
+ break;
+
+ case OP_XCLASS:
+ code += GET(code, 1);
+ break;
+
+ case OP_MARK:
+ case OP_PRUNE_ARG:
+ case OP_SKIP_ARG:
+ case OP_THEN_ARG:
+ code += code[1];
+ break;
+ }
+
+ /* Add in the fixed length from the table */
+
+ code += PRIV(OP_lengths)[c];
+
+ /* In UTF-8 or UTF-16 mode, opcodes that are followed by a character may be
+ followed by a multi-unit character. The length in the table is a minimum, so
+ we have to arrange to skip the extra code units. */
+
+#if defined SUPPORT_UTF && !defined COMPILE_PCRE32
+ if (utf) switch(c)
+ {
+ case OP_CHAR:
+ case OP_CHARI:
+ case OP_NOT:
+ case OP_NOTI:
+ case OP_STAR:
+ case OP_MINSTAR:
+ case OP_PLUS:
+ case OP_MINPLUS:
+ case OP_QUERY:
+ case OP_MINQUERY:
+ case OP_UPTO:
+ case OP_MINUPTO:
+ case OP_EXACT:
+ case OP_POSSTAR:
+ case OP_POSPLUS:
+ case OP_POSQUERY:
+ case OP_POSUPTO:
+ case OP_STARI:
+ case OP_MINSTARI:
+ case OP_PLUSI:
+ case OP_MINPLUSI:
+ case OP_QUERYI:
+ case OP_MINQUERYI:
+ case OP_UPTOI:
+ case OP_MINUPTOI:
+ case OP_EXACTI:
+ case OP_POSSTARI:
+ case OP_POSPLUSI:
+ case OP_POSQUERYI:
+ case OP_POSUPTOI:
+ case OP_NOTSTAR:
+ case OP_NOTMINSTAR:
+ case OP_NOTPLUS:
+ case OP_NOTMINPLUS:
+ case OP_NOTQUERY:
+ case OP_NOTMINQUERY:
+ case OP_NOTUPTO:
+ case OP_NOTMINUPTO:
+ case OP_NOTEXACT:
+ case OP_NOTPOSSTAR:
+ case OP_NOTPOSPLUS:
+ case OP_NOTPOSQUERY:
+ case OP_NOTPOSUPTO:
+ case OP_NOTSTARI:
+ case OP_NOTMINSTARI:
+ case OP_NOTPLUSI:
+ case OP_NOTMINPLUSI:
+ case OP_NOTQUERYI:
+ case OP_NOTMINQUERYI:
+ case OP_NOTUPTOI:
+ case OP_NOTMINUPTOI:
+ case OP_NOTEXACTI:
+ case OP_NOTPOSSTARI:
+ case OP_NOTPOSPLUSI:
+ case OP_NOTPOSQUERYI:
+ case OP_NOTPOSUPTOI:
+ if (HAS_EXTRALEN(code[-1])) code += GET_EXTRALEN(code[-1]);
+ break;
+ }
+#else
+ (void)(utf); /* Keep compiler happy by referencing function argument */
+#endif
+ }
+}
+
+
+
+/*************************************************
* Check for POSIX class syntax *
*************************************************/
@@ -2744,477 +3708,11 @@
*cptr = c; /* Rest of input range */
return 0;
}
-
-
-
-/*************************************************
-* Check a character and a property *
-*************************************************/
-
-/* This function is called by check_auto_possessive() when a property item
-is adjacent to a fixed character.
-
-Arguments:
- c the character
- ptype the property type
- pdata the data for the type
- negated TRUE if it's a negated property (\P or \p{^)
-
-Returns: TRUE if auto-possessifying is OK
-*/
-
-static BOOL
-check_char_prop(pcre_uint32 c, unsigned int ptype, unsigned int pdata, BOOL negated)
-{
-#ifdef SUPPORT_UCP
-const pcre_uint32 *p;
-#endif
-
-const ucd_record *prop = GET_UCD(c);
-
-switch(ptype)
- {
- case PT_LAMP:
- return (prop->chartype == ucp_Lu ||
- prop->chartype == ucp_Ll ||
- prop->chartype == ucp_Lt) == negated;
-
- case PT_GC:
- return (pdata == PRIV(ucp_gentype)[prop->chartype]) == negated;
-
- case PT_PC:
- return (pdata == prop->chartype) == negated;
-
- case PT_SC:
- return (pdata == prop->script) == negated;
-
- /* These are specials */
-
- case PT_ALNUM:
- return (PRIV(ucp_gentype)[prop->chartype] == ucp_L ||
- PRIV(ucp_gentype)[prop->chartype] == ucp_N) == negated;
-
- case PT_SPACE: /* Perl space */
- return (PRIV(ucp_gentype)[prop->chartype] == ucp_Z ||
- c == CHAR_HT || c == CHAR_NL || c == CHAR_FF || c == CHAR_CR)
- == negated;
-
- case PT_PXSPACE: /* POSIX space */
- return (PRIV(ucp_gentype)[prop->chartype] == ucp_Z ||
- c == CHAR_HT || c == CHAR_NL || c == CHAR_VT ||
- c == CHAR_FF || c == CHAR_CR)
- == negated;
-
- case PT_WORD:
- return (PRIV(ucp_gentype)[prop->chartype] == ucp_L ||
- PRIV(ucp_gentype)[prop->chartype] == ucp_N ||
- c == CHAR_UNDERSCORE) == negated;
-
-#ifdef SUPPORT_UCP
- case PT_CLIST:
- p = PRIV(ucd_caseless_sets) + prop->caseset;
- for (;;)
- {
- if (c < *p) return !negated;
- if (c == *p++) return negated;
- }
- break; /* Control never reaches here */
-#endif
- }
-
-return FALSE;
-}
#endif /* SUPPORT_UCP */
/*************************************************
-* Check if auto-possessifying is possible *
-*************************************************/
-
-/* This function is called for unlimited repeats of certain items, to see
-whether the next thing could possibly match the repeated item. If not, it makes
-sense to automatically possessify the repeated item.
-
-Arguments:
- previous pointer to the repeated opcode
- utf TRUE in UTF-8 / UTF-16 / UTF-32 mode
- ptr next character in pattern
- options options bits
- cd contains pointers to tables etc.
-
-Returns: TRUE if possessifying is wanted
-*/
-
-static BOOL
-check_auto_possessive(const pcre_uchar *previous, BOOL utf,
- const pcre_uchar *ptr, int options, compile_data *cd)
-{
-pcre_uint32 c = NOTACHAR;
-pcre_uint32 next;
-int escape;
-pcre_uchar op_code = *previous++;
-
-/* Skip whitespace and comments in extended mode */
-
-if ((options & PCRE_EXTENDED) != 0)
- {
- for (;;)
- {
- while (MAX_255(*ptr) && (cd->ctypes[*ptr] & ctype_space) != 0) ptr++;
- if (*ptr == CHAR_NUMBER_SIGN)
- {
- ptr++;
- while (*ptr != CHAR_NULL)
- {
- if (IS_NEWLINE(ptr)) { ptr += cd->nllen; break; }
- ptr++;
-#ifdef SUPPORT_UTF
- if (utf) FORWARDCHAR(ptr);
-#endif
- }
- }
- else break;
- }
- }
-
-/* If the next item is one that we can handle, get its value. A non-negative
-value is a character, a negative value is an escape value. */
-
-if (*ptr == CHAR_BACKSLASH)
- {
- int temperrorcode = 0;
- escape = check_escape(&ptr, &next, &temperrorcode, cd->bracount, options,
- FALSE);
- if (temperrorcode != 0) return FALSE;
- ptr++; /* Point after the escape sequence */
- }
-else if (!MAX_255(*ptr) || (cd->ctypes[*ptr] & ctype_meta) == 0)
- {
- escape = 0;
-#ifdef SUPPORT_UTF
- if (utf) { GETCHARINC(next, ptr); } else
-#endif
- next = *ptr++;
- }
-else return FALSE;
-
-/* Skip whitespace and comments in extended mode */
-
-if ((options & PCRE_EXTENDED) != 0)
- {
- for (;;)
- {
- while (MAX_255(*ptr) && (cd->ctypes[*ptr] & ctype_space) != 0) ptr++;
- if (*ptr == CHAR_NUMBER_SIGN)
- {
- ptr++;
- while (*ptr != CHAR_NULL)
- {
- if (IS_NEWLINE(ptr)) { ptr += cd->nllen; break; }
- ptr++;
-#ifdef SUPPORT_UTF
- if (utf) FORWARDCHAR(ptr);
-#endif
- }
- }
- else break;
- }
- }
-
-/* If the next thing is itself optional, we have to give up. */
-
-if (*ptr == CHAR_ASTERISK || *ptr == CHAR_QUESTION_MARK ||
- STRNCMP_UC_C8(ptr, STR_LEFT_CURLY_BRACKET STR_0 STR_COMMA, 3) == 0)
- return FALSE;
-
-/* If the previous item is a character, get its value. */
-
-if (op_code == OP_CHAR || op_code == OP_CHARI ||
- op_code == OP_NOT || op_code == OP_NOTI)
- {
-#ifdef SUPPORT_UTF
- GETCHARTEST(c, previous);
-#else
- c = *previous;
-#endif
- }
-
-/* Now compare the next item with the previous opcode. First, handle cases when
-the next item is a character. */
-
-if (escape == 0)
- {
- /* For a caseless UTF match, the next character may have more than one other
- case, which maps to the special PT_CLIST property. Check this first. */
-
-#ifdef SUPPORT_UCP
- if (utf && c != NOTACHAR && (options & PCRE_CASELESS) != 0)
- {
- unsigned int ocs = UCD_CASESET(next);
- if (ocs > 0) return check_char_prop(c, PT_CLIST, ocs, op_code >= OP_NOT);
- }
-#endif
-
- switch(op_code)
- {
- case OP_CHAR:
- return c != next;
-
- /* For CHARI (caseless character) we must check the other case. If we have
- Unicode property support, we can use it to test the other case of
- high-valued characters. We know that next can have only one other case,
- because multi-other-case characters are dealt with above. */
-
- case OP_CHARI:
- if (c == next) return FALSE;
-#ifdef SUPPORT_UTF
- if (utf)
- {
- pcre_uint32 othercase;
- if (next < 128) othercase = cd->fcc[next]; else
-#ifdef SUPPORT_UCP
- othercase = UCD_OTHERCASE(next);
-#else
- othercase = NOTACHAR;
-#endif
- return c != othercase;
- }
- else
-#endif /* SUPPORT_UTF */
- return (c != TABLE_GET(next, cd->fcc, next)); /* Not UTF */
-
- case OP_NOT:
- return c == next;
-
- case OP_NOTI:
- if (c == next) return TRUE;
-#ifdef SUPPORT_UTF
- if (utf)
- {
- pcre_uint32 othercase;
- if (next < 128) othercase = cd->fcc[next]; else
-#ifdef SUPPORT_UCP
- othercase = UCD_OTHERCASE(next);
-#else
- othercase = NOTACHAR;
-#endif
- return c == othercase;
- }
- else
-#endif /* SUPPORT_UTF */
- return (c == TABLE_GET(next, cd->fcc, next)); /* Not UTF */
-
- /* Note that OP_DIGIT etc. are generated only when PCRE_UCP is *not* set.
- When it is set, \d etc. are converted into OP_(NOT_)PROP codes. */
-
- case OP_DIGIT:
- return next > 255 || (cd->ctypes[next] & ctype_digit) == 0;
-
- case OP_NOT_DIGIT:
- return next <= 255 && (cd->ctypes[next] & ctype_digit) != 0;
-
- case OP_WHITESPACE:
- return next > 255 || (cd->ctypes[next] & ctype_space) == 0;
-
- case OP_NOT_WHITESPACE:
- return next <= 255 && (cd->ctypes[next] & ctype_space) != 0;
-
- case OP_WORDCHAR:
- return next > 255 || (cd->ctypes[next] & ctype_word) == 0;
-
- case OP_NOT_WORDCHAR:
- return next <= 255 && (cd->ctypes[next] & ctype_word) != 0;
-
- case OP_HSPACE:
- case OP_NOT_HSPACE:
- switch(next)
- {
- HSPACE_CASES:
- return op_code == OP_NOT_HSPACE;
-
- default:
- return op_code != OP_NOT_HSPACE;
- }
-
- case OP_ANYNL:
- case OP_VSPACE:
- case OP_NOT_VSPACE:
- switch(next)
- {
- VSPACE_CASES:
- return op_code == OP_NOT_VSPACE;
-
- default:
- return op_code != OP_NOT_VSPACE;
- }
-
-#ifdef SUPPORT_UCP
- case OP_PROP:
- return check_char_prop(next, previous[0], previous[1], FALSE);
-
- case OP_NOTPROP:
- return check_char_prop(next, previous[0], previous[1], TRUE);
-#endif
-
- default:
- return FALSE;
- }
- }
-
-/* Handle the case when the next item is \d, \s, etc. Note that when PCRE_UCP
-is set, \d turns into ESC_du rather than ESC_d, etc., so ESC_d etc. are
-generated only when PCRE_UCP is *not* set, that is, when only ASCII
-characteristics are recognized. Similarly, the opcodes OP_DIGIT etc. are
-replaced by OP_PROP codes when PCRE_UCP is set. */
-
-switch(op_code)
- {
- case OP_CHAR:
- case OP_CHARI:
- switch(escape)
- {
- case ESC_d:
- return c > 255 || (cd->ctypes[c] & ctype_digit) == 0;
-
- case ESC_D:
- return c <= 255 && (cd->ctypes[c] & ctype_digit) != 0;
-
- case ESC_s:
- return c > 255 || (cd->ctypes[c] & ctype_space) == 0;
-
- case ESC_S:
- return c <= 255 && (cd->ctypes[c] & ctype_space) != 0;
-
- case ESC_w:
- return c > 255 || (cd->ctypes[c] & ctype_word) == 0;
-
- case ESC_W:
- return c <= 255 && (cd->ctypes[c] & ctype_word) != 0;
-
- case ESC_h:
- case ESC_H:
- switch(c)
- {
- HSPACE_CASES:
- return escape != ESC_h;
-
- default:
- return escape == ESC_h;
- }
-
- case ESC_v:
- case ESC_V:
- switch(c)
- {
- VSPACE_CASES:
- return escape != ESC_v;
-
- default:
- return escape == ESC_v;
- }
-
- /* When PCRE_UCP is set, these values get generated for \d etc. Find
- their substitutions and process them. The result will always be either
- ESC_p or ESC_P. Then fall through to process those values. */
-
-#ifdef SUPPORT_UCP
- case ESC_du:
- case ESC_DU:
- case ESC_wu:
- case ESC_WU:
- case ESC_su:
- case ESC_SU:
- {
- int temperrorcode = 0;
- ptr = substitutes[escape - ESC_DU];
- escape = check_escape(&ptr, &next, &temperrorcode, 0, options, FALSE);
- if (temperrorcode != 0) return FALSE;
- ptr++; /* For compatibility */
- }
- /* Fall through */
-
- case ESC_p:
- case ESC_P:
- {
- unsigned int ptype = 0, pdata = 0;
- int errorcodeptr;
- BOOL negated;
-
- ptr--; /* Make ptr point at the p or P */
- if (!get_ucp(&ptr, &negated, &ptype, &pdata, &errorcodeptr))
- return FALSE;
- ptr++; /* Point past the final curly ket */
-
- /* If the property item is optional, we have to give up. (When generated
- from \d etc by PCRE_UCP, this test will have been applied much earlier,
- to the original \d etc. At this point, ptr will point to a zero byte. */
-
- if (*ptr == CHAR_ASTERISK || *ptr == CHAR_QUESTION_MARK ||
- STRNCMP_UC_C8(ptr, STR_LEFT_CURLY_BRACKET STR_0 STR_COMMA, 3) == 0)
- return FALSE;
-
- /* Do the property check. */
-
- return check_char_prop(c, ptype, pdata, (escape == ESC_P) != negated);
- }
-#endif
-
- default:
- return FALSE;
- }
-
- /* In principle, support for Unicode properties should be integrated here as
- well. It means re-organizing the above code so as to get hold of the property
- values before switching on the op-code. However, I wonder how many patterns
- combine ASCII \d etc with Unicode properties? (Note that if PCRE_UCP is set,
- these op-codes are never generated.) */
-
- case OP_DIGIT:
- return escape == ESC_D || escape == ESC_s || escape == ESC_W ||
- escape == ESC_h || escape == ESC_v || escape == ESC_R;
-
- case OP_NOT_DIGIT:
- return escape == ESC_d;
-
- case OP_WHITESPACE:
- return escape == ESC_S || escape == ESC_d || escape == ESC_w;
-
- case OP_NOT_WHITESPACE:
- return escape == ESC_s || escape == ESC_h || escape == ESC_v || escape == ESC_R;
-
- case OP_HSPACE:
- return escape == ESC_S || escape == ESC_H || escape == ESC_d ||
- escape == ESC_w || escape == ESC_v || escape == ESC_R;
-
- case OP_NOT_HSPACE:
- return escape == ESC_h;
-
- /* Can't have \S in here because VT matches \S (Perl anomaly) */
- case OP_ANYNL:
- case OP_VSPACE:
- return escape == ESC_V || escape == ESC_d || escape == ESC_w;
-
- case OP_NOT_VSPACE:
- return escape == ESC_v || escape == ESC_R;
-
- case OP_WORDCHAR:
- return escape == ESC_W || escape == ESC_s || escape == ESC_h ||
- escape == ESC_v || escape == ESC_R;
-
- case OP_NOT_WORDCHAR:
- return escape == ESC_w || escape == ESC_d;
-
- default:
- return FALSE;
- }
-
-/* Control does not reach here */
-}
-
-
-
-/*************************************************
* Add a character or range to a class *
*************************************************/
@@ -4642,19 +5140,6 @@
}
}
- /* If the repetition is unlimited, it pays to see if the next thing on
- the line is something that cannot possibly match this character. If so,
- automatically possessifying this item gains some performance in the case
- where the match fails. */
-
- if (!possessive_quantifier &&
- repeat_max < 0 &&
- check_auto_possessive(previous, utf, ptr + 1, options, cd))
- {
- repeat_type = 0; /* Force greedy */
- possessive_quantifier = TRUE;
- }
-
goto OUTPUT_SINGLE_REPEAT; /* Code shared with single character types */
}
@@ -4672,14 +5157,6 @@
op_type = OP_TYPESTAR - OP_STAR; /* Use type opcodes */
c = *previous;
- if (!possessive_quantifier &&
- repeat_max < 0 &&
- check_auto_possessive(previous, utf, ptr + 1, options, cd))
- {
- repeat_type = 0; /* Force greedy */
- possessive_quantifier = TRUE;
- }
-
OUTPUT_SINGLE_REPEAT:
if (*previous == OP_PROP || *previous == OP_NOTPROP)
{
@@ -5838,27 +6315,27 @@
/* In the pre-compile phase, do a syntax check, remember the longest
name, and then remember the group in a vector, expanding it if
- necessary. Duplicates for the same number are skipped; other duplicates
+ necessary. Duplicates for the same number are skipped; other duplicates
are checked for validity. In the actual compile, there is nothing to
do. */
if (lengthptr != NULL)
{
- named_group *ng;
+ named_group *ng;
pcre_uint32 number = cd->bracount + 1;
-
+
if (*ptr != (pcre_uchar)terminator)
{
*errorcodeptr = ERR42;
goto FAILED;
}
-
+
if (cd->names_found >= MAX_NAME_COUNT)
{
*errorcodeptr = ERR49;
goto FAILED;
}
-
+
if (namelen + IMM2_SIZE + 1 > cd->name_entry_size)
{
cd->name_entry_size = namelen + IMM2_SIZE + 1;
@@ -5868,59 +6345,59 @@
goto FAILED;
}
}
-
+
/* Scan the list to check for duplicates. For duplicate names, if the
number is the same, break the loop, which causes the name to be
discarded; otherwise, if DUPNAMES is not set, give an error.
If it is set, allow the name with a different number, but continue
scanning in case this is a duplicate with the same number. For
non-duplicate names, give an error if the number is duplicated. */
-
+
ng = cd->named_groups;
for (i = 0; i < cd->names_found; i++, ng++)
{
if (namelen == ng->length &&
STRNCMP_UC_UC(name, ng->name, namelen) == 0)
- {
+ {
if (ng->number == number) break;
if ((options & PCRE_DUPNAMES) == 0)
{
*errorcodeptr = ERR43;
- goto FAILED;
+ goto FAILED;
}
- cd->dupnames = TRUE; /* Duplicate names exist */
- }
+ cd->dupnames = TRUE; /* Duplicate names exist */
+ }
else if (ng->number == number)
{
*errorcodeptr = ERR65;
goto FAILED;
- }
- }
+ }
+ }
if (i >= cd->names_found) /* Not a duplicate with same number */
- {
+ {
/* Increase the list size if necessary */
-
+
if (cd->names_found >= cd->named_group_list_size)
{
int newsize = cd->named_group_list_size * 2;
named_group *newspace = (PUBL(malloc))
(newsize * sizeof(named_group));
-
- if (newspace == NULL)
+
+ if (newspace == NULL)
{
*errorcodeptr = ERR21;
- goto FAILED;
- }
-
- memcpy(newspace, cd->named_groups,
+ goto FAILED;
+ }
+
+ memcpy(newspace, cd->named_groups,
cd->named_group_list_size * sizeof(named_group));
if (cd->named_group_list_size > NAMED_GROUP_LIST_SIZE)
(PUBL(free))((void *)cd->named_groups);
cd->named_groups = newspace;
cd->named_group_list_size = newsize;
- }
-
+ }
+
cd->named_groups[cd->names_found].name = name;
cd->named_groups[cd->names_found].length = namelen;
cd->named_groups[cd->names_found].number = number;
@@ -5959,7 +6436,7 @@
if (lengthptr != NULL)
{
named_group *ng;
-
+
if (namelen == 0)
{
*errorcodeptr = ERR62;
@@ -5977,7 +6454,7 @@
}
/* The name table does not exist in the first pass; instead we must
- scan the list of names encountered so far in order to get the
+ scan the list of names encountered so far in order to get the
number. If the name is not found, set the value to 0 for a forward
reference. */
@@ -5987,12 +6464,12 @@
if (namelen == ng->length &&
STRNCMP_UC_UC(name, ng->name, namelen) == 0)
break;
- }
+ }
recno = (i < cd->names_found)? ng->number : 0;
-
+
/* Count named back references. */
-
- if (!is_recurse) cd->namedrefcount++;
+
+ if (!is_recurse) cd->namedrefcount++;
}
/* In the real compile, search the name table. We check the name
@@ -6015,7 +6492,7 @@
{
recno = GET2(slot, 0);
}
- else
+ else
{
*errorcodeptr = ERR15;
goto FAILED;
@@ -6026,44 +6503,44 @@
handles numerical recursion. */
if (is_recurse) goto HANDLE_RECURSION;
-
+
/* In the second pass we must see if the name is duplicated. If so, we
generate a different opcode. */
-
+
if (lengthptr == NULL && cd->dupnames)
{
- int count = 1;
+ int count = 1;
unsigned int index = i;
pcre_uchar *cslot = slot + cd->name_entry_size;
-
- for (i++; i < cd->names_found; i++)
+
+ for (i++; i < cd->names_found; i++)
{
if (STRCMP_UC_UC(slot + IMM2_SIZE, cslot + IMM2_SIZE) != 0) break;
- count++;
+ count++;
cslot += cd->name_entry_size;
- }
+ }
if (count > 1)
- {
+ {
if (firstcharflags == REQ_UNSET) firstcharflags = REQ_NONE;
previous = code;
*code++ = ((options & PCRE_CASELESS) != 0)? OP_DNREFI : OP_DNREF;
PUT2INC(code, 0, index);
- PUT2INC(code, 0, count);
-
+ PUT2INC(code, 0, count);
+
/* Process each potentially referenced group. */
-
+
for (; slot < cslot; slot += cd->name_entry_size)
{
open_capitem *oc;
- recno = GET2(slot, 0);
+ recno = GET2(slot, 0);
cd->backref_map |= (recno < 32)? (1 << recno) : 1;
if (recno > cd->top_backref) cd->top_backref = recno;
-
+
Check to see if this back reference is recursive, that is, it
is inside the group that it references. A flag is set so that the
group can be made atomic. */
-
+
for (oc = cd->open_caps; oc != NULL; oc = oc->next)
{
if (oc->number == recno)
@@ -6072,17 +6549,17 @@
break;
}
}
- }
-
+ }
+
continue; /* End of back ref handling */
- }
+ }
}
-
+
/* First pass, or a non-duplicated name. */
-
+
goto HANDLE_REFERENCE;
-
+
/* ------------------------------------------------------------ */
case CHAR_R: /* Recursion */
ptr++; /* Same as (?0) */
@@ -6662,8 +7139,8 @@
{
open_capitem *oc;
recno = -escape;
-
- /* Come here from named backref handling when the reference is to a
+
+ /* Come here from named backref handling when the reference is to a
single group (i.e. not to a duplicated name). */
HANDLE_REFERENCE:
@@ -7531,20 +8008,20 @@
*************************************************/
/* This function is called between compiling passes to add an entry to the
-name/number table, maintaining alphabetical order. Checking for permitted
+name/number table, maintaining alphabetical order. Checking for permitted
and forbidden duplicates has already been done.
Arguments:
cd the compile data block
name the name to add
length the length of the name
- groupno the group number
+ groupno the group number
Returns: nothing
*/
static void
-add_name(compile_data *cd, const pcre_uchar *name, int length,
+add_name(compile_data *cd, const pcre_uchar *name, int length,
unsigned int groupno)
{
int i;
@@ -7553,7 +8030,7 @@
for (i = 0; i < cd->names_found; i++)
{
int crc = memcmp(name, slot+IMM2_SIZE, IN_UCHARS(length));
- if (crc == 0 && slot[IMM2_SIZE+length] != 0)
+ if (crc == 0 && slot[IMM2_SIZE+length] != 0)
crc = -1; /* Current name is a substring */
/* Make space in the table and break the loop for an earlier name. For a
@@ -7668,7 +8145,7 @@
pcre_uchar cworkspace[COMPILE_WORK_SIZE];
-/* This vector is used for remembering name groups during the pre-compile. In a
+/* This vector is used for remembering name groups during the pre-compile. In a
similar way to cworkspace, it can be expanded using malloc() if necessary. */
named_group named_groups[NAMED_GROUP_LIST_SIZE];
@@ -7980,16 +8457,16 @@
if (cd->dupnames && cd->namedrefcount > 0)
length += cd->namedrefcount * IMM2_SIZE * sizeof(pcre_uchar);
-
+
/* Compute the size of the data block for storing the compiled pattern. Integer
overflow should no longer be possible because nowadays we limit the maximum
value of cd->names_found and cd->name_entry_size. */
-size = sizeof(REAL_PCRE) +
+size = sizeof(REAL_PCRE) +
(length + cd->names_found * cd->name_entry_size) * sizeof(pcre_uchar);
-
-/* Get the memory. */
+/* Get the memory. */
+
re = (REAL_PCRE *)(PUBL(malloc))(size);
if (re == NULL)
{
@@ -8044,19 +8521,19 @@
cd->check_lookbehind = FALSE;
cd->open_caps = NULL;
-/* If any named groups were found, create the name/number table from the list
+/* If any named groups were found, create the name/number table from the list
created in the first pass. */
if (cd->names_found > 0)
{
int i = cd->names_found;
- named_group *ng = cd->named_groups;
+ named_group *ng = cd->named_groups;
cd->names_found = 0;
for (; i > 0; i--, ng++)
- add_name(cd, ng->name, ng->length, ng->number);
+ add_name(cd, ng->name, ng->length, ng->number);
if (cd->named_group_list_size > NAMED_GROUP_LIST_SIZE)
(PUBL(free))((void *)cd->named_groups);
- }
+ }
/* Set up a starting, non-extracting bracket, then compile the expression. On
error, errorcode will be set non-zero, so we don't need to look at the result
@@ -8133,6 +8610,12 @@
if (errorcode == 0 && re->top_backref > re->top_bracket) errorcode = ERR15;
+/* Unless disabled, check whether single character iterators can be
+auto-possessified. The function overwrites the appropriate opcode values. */
+
+if ((options & PCRE_NO_AUTO_POSSESSIFY) == 0)
+ auto_possessify((pcre_uchar *)codestart, utf, cd);
+
/* If there were any lookbehind assertions that contained OP_RECURSE
(recursions or subroutine calls), a flag is set for them to be checked here,
because they may contain forward references. Actual recursions cannot be fixed
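The rewrite performed by auto_possessify() in the pcre_compile.c changes above adds an offset to *code rather than storing a fixed opcode, so one switch on the normalized opcode can serve every repeat group (caseful, caseless, negated, type). Each lazy form ends up at the same possessive opcode as its greedy partner. The following is a minimal sketch of that arithmetic, assuming the numbering given in the opcode comments of pcre_internal.h (33 to 45 for the caseful group, 46 onwards for the caseless one). The helper names possessive_delta() and possessify() are hypothetical and are not part of PCRE.

```c
/* Opcode values as annotated in pcre_internal.h; treat them as illustrative.
   Only the relative layout inside a group of 13 opcodes matters. */
enum {
  OP_STAR = 33, OP_MINSTAR, OP_PLUS, OP_MINPLUS, OP_QUERY, OP_MINQUERY,
  OP_UPTO, OP_MINUPTO, OP_EXACT,
  OP_POSSTAR, OP_POSPLUS, OP_POSQUERY, OP_POSUPTO,
  OP_STARI   /* 46: first opcode of the caseless group */
};

/* Hypothetical helper: distance from a base-group repeat opcode to its
   possessive form, or 0 when there is none (for example OP_EXACT). */
static int possessive_delta(int base_op)
{
  switch (base_op)
    {
    case OP_STAR:     return OP_POSSTAR  - OP_STAR;
    case OP_MINSTAR:  return OP_POSSTAR  - OP_MINSTAR;
    case OP_PLUS:     return OP_POSPLUS  - OP_PLUS;
    case OP_MINPLUS:  return OP_POSPLUS  - OP_MINPLUS;
    case OP_QUERY:    return OP_POSQUERY - OP_QUERY;
    case OP_MINQUERY: return OP_POSQUERY - OP_MINQUERY;
    case OP_UPTO:     return OP_POSUPTO  - OP_UPTO;
    case OP_MINUPTO:  return OP_POSUPTO  - OP_MINUPTO;
    default:          return 0;
    }
}

/* Hypothetical helper: apply the delta to an opcode from any group.
   group_base is the first opcode of that group (OP_STAR, OP_STARI, ...). */
static int possessify(int op, int group_base)
{
  int base_op = op - (group_base - OP_STAR);  /* normalize to the base group */
  return op + possessive_delta(base_op);      /* stay inside the same group */
}
```

For example, possessify(OP_STARI + 7, OP_STARI) takes the caseless lazy "upto" opcode (53 under the numbering above) to 58, which the opcode list annotates as OP_POSUPTOI.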
Modified: code/trunk/pcre_dfa_exec.c
===================================================================
--- code/trunk/pcre_dfa_exec.c 2013-10-01 11:27:04 UTC (rev 1362)
+++ code/trunk/pcre_dfa_exec.c 2013-10-01 16:54:40 UTC (rev 1363)
@@ -120,7 +120,7 @@
0, 0, /* \P, \p */
0, 0, 0, 0, 0, /* \R, \H, \h, \V, \v */
0, /* \X */
- 0, 0, 0, 0, 0, 0, /* \Z, \z, ^, ^M, $, $M */
+ 0, 0, 0, 0, 0, 0, /* \Z, \z, $, $M, ^, ^M */
1, /* Char */
1, /* Chari */
1, /* not */
@@ -196,7 +196,7 @@
1, 1, /* \P, \p */
1, 1, 1, 1, 1, /* \R, \H, \h, \V, \v */
1, /* \X */
- 0, 0, 0, 0, 0, 0, /* \Z, \z, ^, ^M, $, $M */
+ 0, 0, 0, 0, 0, 0, /* \Z, \z, $, $M, ^, ^M */
1, /* Char */
1, /* Chari */
1, /* not */
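The comment changes above reflect the reordering of the line-end opcodes, so that $ and $M now directly follow \z. Judging from the FIRST_AUTOTAB_OP, LAST_AUTOTAB_LEFT_OP and LAST_AUTOTAB_RIGHT_OP macros that this revision adds to pcre_internal.h, one apparent benefit is that a contiguous opcode range can index a rectangular table directly. Below is a minimal sketch of such a range-checked lookup. It uses a toy opcode set and toy table values: none of the T_* names or table entries are taken from PCRE's autoposstab.

```c
#include <stdbool.h>

/* Toy opcodes, ordered so that everything allowed on the right-hand side
   forms one contiguous range. T_CIRC deliberately falls outside it. */
enum { T_END, T_NOT_DIGIT, T_DIGIT, T_ANY, T_EOD, T_DOLL, T_CIRC };

#define FIRST_OP       T_NOT_DIGIT
#define LAST_LEFT_OP   T_ANY     /* last opcode that may be repeated */
#define LAST_RIGHT_OP  T_DOLL    /* last opcode that may follow a repeat */

/* Rows are the repeated (left) item, columns the following (right) item.
   1 means the right item can never match what the left item matches.
   The values are illustrative assumptions only. */
static const unsigned char toytab[LAST_LEFT_OP - FIRST_OP + 1]
                                 [LAST_RIGHT_OP - FIRST_OP + 1] = {
/*           \D \d  .  \z  $  */
/* \D */   { 0, 1, 0, 1, 0 },
/* \d */   { 1, 0, 0, 1, 1 },
/* .  */   { 0, 0, 0, 1, 0 }
};

/* Range-check both opcodes before indexing, so that anything outside the
   table simply yields "do not possessify". */
static bool table_allows(int leftop, int rightop)
{
  return leftop >= FIRST_OP && leftop <= LAST_LEFT_OP &&
         rightop >= FIRST_OP && rightop <= LAST_RIGHT_OP &&
         toytab[leftop - FIRST_OP][rightop - FIRST_OP] != 0;
}
```

Because the bounds tests come first in the && chain, an out-of-range opcode such as T_CIRC never reaches the array access.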
Modified: code/trunk/pcre_internal.h
===================================================================
--- code/trunk/pcre_internal.h 2013-10-01 11:27:04 UTC (rev 1362)
+++ code/trunk/pcre_internal.h 2013-10-01 16:54:40 UTC (rev 1363)
@@ -1174,7 +1174,8 @@
#define PUBLIC_COMPILE_OPTIONS \
(PCRE_CASELESS|PCRE_EXTENDED|PCRE_ANCHORED|PCRE_MULTILINE| \
PCRE_DOTALL|PCRE_DOLLAR_ENDONLY|PCRE_EXTRA|PCRE_UNGREEDY|PCRE_UTF8| \
- PCRE_NO_AUTO_CAPTURE|PCRE_NO_UTF8_CHECK|PCRE_AUTO_CALLOUT|PCRE_FIRSTLINE| \
+ PCRE_NO_AUTO_CAPTURE|PCRE_NO_AUTO_POSSESSIFY| \
+ PCRE_NO_UTF8_CHECK|PCRE_AUTO_CALLOUT|PCRE_FIRSTLINE| \
PCRE_DUPNAMES|PCRE_NEWLINE_BITS|PCRE_BSR_ANYCRLF|PCRE_BSR_UNICODE| \
PCRE_JAVASCRIPT_COMPAT|PCRE_UCP|PCRE_NO_START_OPTIMIZE|PCRE_NEVER_UTF)
@@ -1852,6 +1853,7 @@
#define PT_WORD 8 /* Word - L plus N plus underscore */
#define PT_CLIST 9 /* Pseudo-property: match character list */
#define PT_UCNC 10 /* Universal Character nameable character */
+#define PT_TABSIZE 11 /* Size of square table for autopossessify tests */
/* Flag bits and data types for the extended class (OP_XCLASS) for classes that
contain characters with values greater than 255. */
@@ -1891,13 +1893,31 @@
ESC_E, ESC_Q, ESC_g, ESC_k,
ESC_DU, ESC_du, ESC_SU, ESC_su, ESC_WU, ESC_wu };
-/* Opcode table: Starting from 1 (i.e. after OP_END), the values up to
-OP_EOD must correspond in order to the list of escapes immediately above.
+/* Opcode table */
-*** NOTE NOTE NOTE *** Whenever this list is updated, the two macro definitions
-that follow must also be updated to match. There are also tables called
-"coptable" and "poptable" in pcre_dfa_exec.c that must be updated. */
+/****** NOTE NOTE NOTE ******
+
+Starting from 1 (i.e. after OP_END), the values up to OP_EOD must correspond in
+order to the list of escapes immediately above. Furthermore, values up to
+OP_DOLLM must not be changed without adjusting the table called autoposstab in
+pcre_compile.c.
+
+Whenever this list is updated, the two macro definitions that follow must also
+be updated to match. There are also tables called "coptable" and "poptable" in
+pcre_dfa_exec.c that must be updated.
+
+****** NOTE NOTE NOTE ******/
+
+
+/* The values between FIRST_AUTOTAB_OP and LAST_AUTOTAB_RIGHT_OP, inclusive,
+are used in a table for deciding whether a repeated character type can be
+auto-possessified. */
+
+#define FIRST_AUTOTAB_OP OP_NOT_DIGIT
+#define LAST_AUTOTAB_LEFT_OP OP_EXTUNI
+#define LAST_AUTOTAB_RIGHT_OP OP_DOLLM
+
enum {
OP_END, /* 0 End of pattern */
@@ -1928,11 +1948,16 @@
OP_EXTUNI, /* 22 \X (extended Unicode sequence */
OP_EODN, /* 23 End of data or \n at end of data (\Z) */
OP_EOD, /* 24 End of data (\z) */
+
+ /* Line end assertions */
- OP_CIRC, /* 25 Start of line - not multiline */
- OP_CIRCM, /* 26 Start of line - multiline */
- OP_DOLL, /* 27 End of line - not multiline */
- OP_DOLLM, /* 28 End of line - multiline */
+ OP_DOLL, /* 25 End of line - not multiline */
+ OP_DOLLM, /* 26 End of line - multiline */
+ OP_CIRC, /* 27 Start of line - not multiline */
+ OP_CIRCM, /* 28 Start of line - multiline */
+
+ /* Single characters; caseful must precede the caseless ones */
+
OP_CHAR, /* 29 Match one character, casefully */
OP_CHARI, /* 30 Match one character, caselessly */
OP_NOT, /* 31 Match one character, not the given one, casefully */
@@ -1941,7 +1966,7 @@
/* The following sets of 13 opcodes must always be kept in step because
the offset from the first one is used to generate the others. */
- /**** Single characters, caseful, must precede the caseless ones ****/
+ /* Repeated characters; caseful must precede the caseless ones */
OP_STAR, /* 33 The maximizing and minimizing versions of */
OP_MINSTAR, /* 34 these six opcodes must come in pairs, with */
@@ -1959,7 +1984,7 @@
OP_POSQUERY, /* 44 Posesssified query, caseful */
OP_POSUPTO, /* 45 Possessified upto, caseful */
- /**** Single characters, caseless, must follow the caseful ones */
+ /* Repeated characters; caseless must follow the caseful ones */
OP_STARI, /* 46 */
OP_MINSTARI, /* 47 */
@@ -1977,8 +2002,8 @@
OP_POSQUERYI, /* 57 Posesssified query, caseless */
OP_POSUPTOI, /* 58 Possessified upto, caseless */
- /**** The negated ones must follow the non-negated ones, and match them ****/
- /**** Negated single character, caseful; must precede the caseless ones ****/
+ /* The negated ones must follow the non-negated ones, and match them */
+ /* Negated repeated character, caseful; must precede the caseless ones */
OP_NOTSTAR, /* 59 The maximizing and minimizing versions of */
OP_NOTMINSTAR, /* 60 these six opcodes must come in pairs, with */
@@ -1996,7 +2021,7 @@
OP_NOTPOSQUERY, /* 70 */
OP_NOTPOSUPTO, /* 71 */
- /**** Negated single character, caseless; must follow the caseful ones ****/
+ /* Negated repeated character, caseless; must follow the caseful ones */
OP_NOTSTARI, /* 72 */
OP_NOTMINSTARI, /* 73 */
@@ -2014,7 +2039,7 @@
OP_NOTPOSQUERYI, /* 83 */
OP_NOTPOSUPTOI, /* 84 */
- /**** Character types ****/
+ /* Character types */
OP_TYPESTAR, /* 85 The maximizing and minimizing versions of */
OP_TYPEMINSTAR, /* 86 these six opcodes must come in pairs, with */
@@ -2055,8 +2080,8 @@
class. This does both positive and negative. */
OP_REF, /* 109 Match a back reference, casefully */
OP_REFI, /* 110 Match a back reference, caselessly */
- OP_DNREF, /* 111 Match a duplicate name backref, casefully */
- OP_DNREFI, /* 112 Match a duplicate name backref, caselessly */
+ OP_DNREF, /* 111 Match a duplicate name backref, casefully */
+ OP_DNREFI, /* 112 Match a duplicate name backref, caselessly */
OP_RECURSE, /* 113 Match a numbered subpattern (possibly recursive) */
OP_CALLOUT, /* 114 Call out to external function if provided */
@@ -2153,7 +2178,7 @@
"\\S", "\\s", "\\W", "\\w", "Any", "AllAny", "Anybyte", \
"notprop", "prop", "\\R", "\\H", "\\h", "\\V", "\\v", \
"extuni", "\\Z", "\\z", \
- "^", "^", "$", "$", "char", "chari", "not", "noti", \
+ "$", "$", "^", "^", "char", "chari", "not", "noti", \
"*", "*?", "+", "+?", "?", "??", \
"{", "{", "{", \
"*+","++", "?+", "{", \
@@ -2203,7 +2228,7 @@
3, 3, /* \P, \p */ \
1, 1, 1, 1, 1, /* \R, \H, \h, \V, \v */ \
1, /* \X */ \
- 1, 1, 1, 1, 1, 1, /* \Z, \z, ^, ^M, $, $M */ \
+ 1, 1, 1, 1, 1, 1, /* \Z, \z, $, $M ^, ^M */ \
2, /* Char - the minimum length */ \
2, /* Chari - the minimum length */ \
2, /* not */ \
@@ -2411,14 +2436,14 @@
pcre_uint16 flag; /* Set TRUE if recursive back ref */
} open_capitem;
-/* Structure for building a list of named groups during the first pass of
+/* Structure for building a list of named groups during the first pass of
compiling. */
typedef struct named_group {
const pcre_uchar *name; /* Points to the name in the pattern */
int length; /* Length of the name */
pcre_uint32 number; /* Group number */
-} named_group;
+} named_group;
/* Structure for passing "static" information around between the functions
doing the compiling, so that they are thread-safe. */
@@ -2438,14 +2463,14 @@
pcre_uchar *name_table; /* The name/number table */
int names_found; /* Number of entries so far */
int name_entry_size; /* Size of each entry */
- int named_group_list_size; /* Number of entries in the list */
+ int named_group_list_size; /* Number of entries in the list */
int workspace_size; /* Size of workspace */
unsigned int bracount; /* Count of capturing parens as we compile */
int final_bracount; /* Saved value after first pass */
int max_lookbehind; /* Maximum lookbehind (characters) */
int top_backref; /* Maximum back reference */
unsigned int backref_map; /* Bitmap of low back refs */
- unsigned int namedrefcount; /* Number of backreferences by name */
+ unsigned int namedrefcount; /* Number of backreferences by name */
int assert_depth; /* Depth of nested assertions */
pcre_uint32 external_options; /* External (initial) options */
pcre_uint32 external_flags; /* External flag bits to be set */
Modified: code/trunk/pcretest.c
===================================================================
--- code/trunk/pcretest.c 2013-10-01 11:27:04 UTC (rev 1362)
+++ code/trunk/pcretest.c 2013-10-01 16:54:40 UTC (rev 1363)
@@ -2893,6 +2893,7 @@
printf(" -i show information about compiled patterns\n"
" -M find MATCH_LIMIT minimum for each subject\n"
" -m output memory used information\n"
+ " -O set PCRE_NO_AUTO_POSSESSIFY on each pattern\n"
" -o <n> set size of offsets vector to <n>\n");
#if !defined NOPOSIX
printf(" -p use POSIX interface\n");
@@ -2930,6 +2931,7 @@
int options = 0;
int study_options = 0;
int default_find_match_limit = FALSE;
+pcre_uint32 default_options = 0;
int op = 1;
int timeit = 0;
int timeitm = 0;
@@ -3075,6 +3077,7 @@
else if (strcmp(arg, "-i") == 0) showinfo = 1;
else if (strcmp(arg, "-d") == 0) showinfo = debug = 1;
else if (strcmp(arg, "-M") == 0) default_find_match_limit = TRUE;
+ else if (strcmp(arg, "-O") == 0) default_options = PCRE_NO_AUTO_POSSESSIFY;
#if !defined NODFA
else if (strcmp(arg, "-dfa") == 0) all_use_dfa = 1;
#endif
@@ -3615,7 +3618,7 @@
/* Look for options after final delimiter */
- options = 0;
+ options = default_options;
study_options = force_study_options;
log_store = showstore; /* default from command line */
@@ -3647,6 +3650,7 @@
case 'K': do_mark = 1; break;
case 'M': log_store = 1; break;
case 'N': options |= PCRE_NO_AUTO_CAPTURE; break;
+ case 'O': options |= PCRE_NO_AUTO_POSSESSIFY; break;
#if !defined NOPOSIX
case 'P': do_posix = 1; break;
@@ -4087,7 +4091,7 @@
if (do_flip) all_options = swap_uint32(all_options);
if (get_options == 0) fprintf(outfile, "No options\n");
- else fprintf(outfile, "Options:%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s\n",
+ else fprintf(outfile, "Options:%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s\n",
((get_options & PCRE_ANCHORED) != 0)? " anchored" : "",
((get_options & PCRE_CASELESS) != 0)? " caseless" : "",
((get_options & PCRE_EXTENDED) != 0)? " extended" : "",
@@ -4100,6 +4104,7 @@
((get_options & PCRE_EXTRA) != 0)? " extra" : "",
((get_options & PCRE_UNGREEDY) != 0)? " ungreedy" : "",
((get_options & PCRE_NO_AUTO_CAPTURE) != 0)? " no_auto_capture" : "",
+ ((get_options & PCRE_NO_AUTO_POSSESSIFY) != 0)? " no_auto_possessify" : "",
((get_options & PCRE_UTF8) != 0)? " utf" : "",
((get_options & PCRE_UCP) != 0)? " ucp" : "",
((get_options & PCRE_NO_UTF8_CHECK) != 0)? " no_utf_check" : "",
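The pcretest.c changes above combine a command-line default (-O) with a per-pattern modifier (/O). The following is a minimal, self-contained sketch of that flow, not the pcretest code itself. The DEMO_ constants and the two demo_ functions are hypothetical names introduced for illustration, and the bit values are placeholders rather than the values defined in pcre.h.

```c
#include <string.h>

/* Placeholder bit values, for illustration only. Real code should include
pcre.h and use PCRE_NO_AUTO_CAPTURE and PCRE_NO_AUTO_POSSESSIFY. */

#define DEMO_NO_AUTO_CAPTURE     0x00001000u
#define DEMO_NO_AUTO_POSSESSIFY  0x00020000u

/* Hypothetical helper modelled on the -O command-line switch: scan the
arguments and return the option bits that apply to every pattern. */

static unsigned int
demo_default_options(int argc, const char **argv)
{
unsigned int defaults = 0;
int i;
for (i = 1; i < argc; i++)
  {
  if (strcmp(argv[i], "-O") == 0) defaults = DEMO_NO_AUTO_POSSESSIFY;
  }
return defaults;
}

/* Hypothetical helper modelled on the per-pattern modifier loop: start from
the defaults rather than zero, then OR in the bit for each modifier letter.
Letters this sketch does not know about are ignored. */

static unsigned int
demo_pattern_options(unsigned int defaults, const char *modifiers)
{
unsigned int options = defaults;
for (; *modifiers != '\0'; modifiers++)
  {
  switch (*modifiers)
    {
    case 'N': options |= DEMO_NO_AUTO_CAPTURE; break;
    case 'O': options |= DEMO_NO_AUTO_POSSESSIFY; break;
    default: break;
    }
  }
return options;
}
```

Under these assumptions, a run started with -O yields the no-auto-possessify bit for every pattern, while a run without it sets the bit only for patterns that carry the O modifier. Starting from the defaults instead of zero is what lets the command-line switch reach each pattern.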
Modified: code/trunk/testdata/saved16BE-1
===================================================================
(Binary files differ)
Modified: code/trunk/testdata/saved16LE-1
===================================================================
(Binary files differ)
Modified: code/trunk/testdata/saved32BE-1
===================================================================
(Binary files differ)
Modified: code/trunk/testdata/saved32LE-1
===================================================================
(Binary files differ)
Modified: code/trunk/testdata/testinput1
===================================================================
--- code/trunk/testdata/testinput1 2013-10-01 11:27:04 UTC (rev 1362)
+++ code/trunk/testdata/testinput1 2013-10-01 16:54:40 UTC (rev 1363)
@@ -5620,4 +5620,7 @@
Afoofoo
Abarbar
+/^(\d+)\s+IN\s+SOA\s+(\S+)\s+(\S+)\s*\(\s*$/
+ 1 IN SOA non-sp1 non-sp2(
+
/-- End of testinput1 --/
Modified: code/trunk/testdata/testinput10
===================================================================
--- code/trunk/testdata/testinput10 2013-10-01 11:27:04 UTC (rev 1362)
+++ code/trunk/testdata/testinput10 2013-10-01 16:54:40 UTC (rev 1363)
@@ -210,7 +210,7 @@
X
\x{903}
-/^\p{Nd}+/8
+/^\p{Nd}+/8O
0123456789\x{660}\x{661}\x{662}\x{663}\x{664}\x{665}\x{666}\x{667}\x{668}\x{669}\x{66a}
\x{6f0}\x{6f1}\x{6f2}\x{6f3}\x{6f4}\x{6f5}\x{6f6}\x{6f7}\x{6f8}\x{6f9}\x{6fa}
\x{966}\x{967}\x{968}\x{969}\x{96a}\x{96b}\x{96c}\x{96d}\x{96e}\x{96f}\x{970}
@@ -433,23 +433,23 @@
** Failers
1234
-/\D+/8
+/\D+/8O
11111111111111111111111111111111111111111111111111111111111111111111111
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
-/\P{Nd}+/8
+/\P{Nd}+/8O
11111111111111111111111111111111111111111111111111111111111111111111111
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
-/[\D]+/8
+/[\D]+/8O
11111111111111111111111111111111111111111111111111111111111111111111111
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
-/[\P{Nd}]+/8
+/[\P{Nd}]+/8O
11111111111111111111111111111111111111111111111111111111111111111111111
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
-/[\D\P{Nd}]+/8
+/[\D\P{Nd}]+/8O
11111111111111111111111111111111111111111111111111111111111111111111111
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
@@ -896,19 +896,19 @@
** Failers
\x{0b}
-/^>\p{Xsp}+/8
+/^>\p{Xsp}+/8O
> \x{09}\x{0a}\x{0c}\x{0d}\x{a0}\x{1680}\x{2028}\x{0b}
-/^>\p{Xsp}*/8
+/^>\p{Xsp}*/8O
> \x{09}\x{0a}\x{0c}\x{0d}\x{a0}\x{1680}\x{2028}\x{0b}
-/^>\p{Xsp}{2,9}/8
+/^>\p{Xsp}{2,9}/8O
> \x{09}\x{0a}\x{0c}\x{0d}\x{a0}\x{1680}\x{2028}\x{0b}
-/^>[\p{Xsp}]/8
+/^>[\p{Xsp}]/8O
>\x{2028}\x{0b}
-/^>[\p{Xsp}]+/8
+/^>[\p{Xsp}]+/8O
> \x{09}\x{0a}\x{0c}\x{0d}\x{a0}\x{1680}\x{2028}\x{0b}
/^>\p{Xps}/8
Modified: code/trunk/testdata/testinput15
===================================================================
--- code/trunk/testdata/testinput15 2013-10-01 11:27:04 UTC (rev 1362)
+++ code/trunk/testdata/testinput15 2013-10-01 16:54:40 UTC (rev 1363)
@@ -47,7 +47,7 @@
/\xC3\xC3\xC3xxx/8
-/\xC3\xC3\xC3xxx/8?DZSS
+/\xC3\xC3\xC3xxx/8?DZSSO
/badutf/8
\xdf
Modified: code/trunk/testdata/testinput2
===================================================================
--- code/trunk/testdata/testinput2 2013-10-01 11:27:04 UTC (rev 1362)
+++ code/trunk/testdata/testinput2 2013-10-01 16:54:40 UTC (rev 1363)
@@ -3848,4 +3848,48 @@
/(?<a>abc)(?<a>z)\k<a>()/JDZS
+/a*[bcd]/BZ
+
+/[bcd]*a/BZ
+
+/-- A complete set of tests for auto-possessification of character types --/
+
+/\D+\D \D+\d \D+\S \D+\s \D+\W \D+\w \D+. \D+\C \D+\R \D+\H \D+\h \D+\V \D+\v \D+\X \D+\Z \D+\z \D+$/BZx
+
+/\d+\D \d+\d \d+\S \d+\s \d+\W \d+\w \d+. \d+\C \d+\R \d+\H \d+\h \d+\V \d+\v \d+\X \d+\Z \d+\z \d+$/BZx
+
+/\S+\D \S+\d \S+\S \S+\s \S+\W \S+\w \S+. \S+\C \S+\R \S+\H \S+\h \S+\V \S+\v \S+\X \S+\Z \S+\z \S+$/BZx
+
+/\s+\D \s+\d \s+\S \s+\s \s+\W \s+\w \s+. \s+\C \s+\R \s+\H \s+\h \s+\V \s+\v \s+\X \s+\Z \s+\z \s+$/BZx
+
+/\W+\D \W+\d \W+\S \W+\s \W+\W \W+\w \W+. \W+\C \W+\R \W+\H \W+\h \W+\V \W+\v \W+\X \W+\Z \W+\z \W+$/BZx
+
+/\w+\D \w+\d \w+\S \w+\s \w+\W \w+\w \w+. \w+\C \w+\R \w+\H \w+\h \w+\V \w+\v \w+\X \w+\Z \w+\z \w+$/BZx
+
+/\C+\D \C+\d \C+\S \C+\s \C+\W \C+\w \C+. \C+\C \C+\R \C+\H \C+\h \C+\V \C+\v \C+\X \C+\Z \C+\z \C+$/BZx
+
+/\R+\D \R+\d \R+\S \R+\s \R+\W \R+\w \R+. \R+\C \R+\R \R+\H \R+\h \R+\V \R+\v \R+\X \R+\Z \R+\z \R+$/BZx
+
+/\H+\D \H+\d \H+\S \H+\s \H+\W \H+\w \H+. \H+\C \H+\R \H+\H \H+\h \H+\V \H+\v \H+\X \H+\Z \H+\z \H+$/BZx
+
+/\h+\D \h+\d \h+\S \h+\s \h+\W \h+\w \h+. \h+\C \h+\R \h+\H \h+\h \h+\V \h+\v \h+\X \h+\Z \h+\z \h+$/BZx
+
+/\V+\D \V+\d \V+\S \V+\s \V+\W \V+\w \V+. \V+\C \V+\R \V+\H \V+\h \V+\V \V+\v \V+\X \V+\Z \V+\z \V+$/BZx
+
+/\v+\D \v+\d \v+\S \v+\s \v+\W \v+\w \v+. \v+\C \v+\R \v+\H \v+\h \v+\V \v+\v \v+\X \v+\Z \v+\z \v+$/BZx
+
+/\X+\D \X+\d \X+\S \X+\s \X+\W \X+\w \X+. \X+\C \X+\R \X+\H \X+\h \X+\V \X+\v \X+\X \X+\Z \X+\z \X+$/BZx
+
+/ a+\D a+\d a+\S a+\s a+\W a+\w a+. a+\C a+\R a+\H a+\h a+\V a+\v a+\X a+\Z a+\z a+$/BZx
+
+/\n+\D \n+\d \n+\S \n+\s \n+\W \n+\w \n+. \n+\C \n+\R \n+\H \n+\h \n+\V \n+\v \n+\X \n+\Z \n+\z \n+$/BZx
+
+/ .+\D .+\d .+\S .+\s .+\W .+\w .+. .+\C .+\R .+\H .+\h .+\V .+\v .+\X .+\Z .+\z .+$/BZx
+
+/ .+\D .+\d .+\S .+\s .+\W .+\w .+. .+\C .+\R .+\H .+\h .+\V .+\v .+\X .+\Z .+\z .+$/BZxs
+
+/\D+$ \d+$ \S+$ \s+$ \W+$ \w+$ \C+$ \R+$ \H+$ \h+$ \V+$ \v+$ \X+$ a+$ \n+$ .+$ .+$/BZxm
+
+/-- End of special auto-possessive tests --/
+
/-- End of testinput2 --/
Modified: code/trunk/testdata/testinput7
===================================================================
--- code/trunk/testdata/testinput7 2013-10-01 11:27:04 UTC (rev 1362)
+++ code/trunk/testdata/testinput7 2013-10-01 16:54:40 UTC (rev 1363)
@@ -759,5 +759,57 @@
@abc
`abc
\x{1234}abc
+
+/-- Some auto-possessification tests --/
+/\pN+\z/BZ
+
+/\PN+\z/BZ
+
+/\pN+/BZ
+
+/\PN+/BZ
+
+/\p{Any}+\p{Any} \p{Any}+\P{Any} \p{Any}+\p{L&} \p{Any}+\p{L} \p{Any}+\p{Lu} \p{Any}+\p{Han} \p{Any}+\p{Xan} \p{Any}+\p{Xsp} \p{Any}+\p{Xps} \p{Xwd}+\p{Any} \p{Any}+\p{Xuc}/BWZx
+
+/\p{L&}+\p{Any} \p{L&}+\p{L&} \P{L&}+\p{L&} \p{L&}+\p{L} \p{L&}+\p{Lu} \p{L&}+\p{Han} \p{L&}+\p{Xan} \p{L&}+\P{Xan} \p{L&}+\p{Xsp} \p{L&}+\p{Xps} \p{Xwd}+\p{L&} \p{L&}+\p{Xuc}/BWZx
+
+/\p{N}+\p{Any} \p{N}+\p{L&} \p{N}+\p{L} \p{N}+\P{L} \p{N}+\P{N} \p{N}+\p{Lu} \p{N}+\p{Han} \p{N}+\p{Xan} \p{N}+\p{Xsp} \p{N}+\p{Xps} \p{Xwd}+\p{N} \p{N}+\p{Xuc}/BWZx
+
+/\p{Lu}+\p{Any} \p{Lu}+\p{L&} \p{Lu}+\p{L} \p{Lu}+\p{Lu} \P{Lu}+\p{Lu} \p{Lu}+\p{Nd} \p{Lu}+\P{Nd} \p{Lu}+\p{Han} \p{Lu}+\p{Xan} \p{Lu}+\p{Xsp} \p{Lu}+\p{Xps} \p{Xwd}+\p{Lu} \p{Lu}+\p{Xuc}/BWZx
+
+/\p{Han}+\p{Lu} \p{Han}+\p{L&} \p{Han}+\p{L} \p{Han}+\p{Lu} \p{Han}+\p{Arabic} \p{Arabic}+\p{Arabic} \p{Han}+\p{Xan} \p{Han}+\p{Xsp} \p{Han}+\p{Xps} \p{Xwd}+\p{Han} \p{Han}+\p{Xuc}/BWZx
+
+/\p{Xan}+\p{Any} \p{Xan}+\p{L&} \P{Xan}+\p{L&} \p{Xan}+\p{L} \p{Xan}+\p{Lu} \p{Xan}+\p{Han} \p{Xan}+\p{Xan} \p{Xan}+\P{Xan} \p{Xan}+\p{Xsp} \p{Xan}+\p{Xps} \p{Xwd}+\p{Xan} \p{Xan}+\p{Xuc}/BWZx
+
+/\p{Xsp}+\p{Any} \p{Xsp}+\p{L&} \p{Xsp}+\p{L} \p{Xsp}+\p{Lu} \p{Xsp}+\p{Han} \p{Xsp}+\p{Xan} \p{Xsp}+\p{Xsp} \P{Xsp}+\p{Xsp} \p{Xsp}+\p{Xps} \p{Xwd}+\p{Xsp} \p{Xsp}+\p{Xuc}/BWZx
+
+/\p{Xwd}+\p{Any} \p{Xwd}+\p{L&} \p{Xwd}+\p{L} \p{Xwd}+\p{Lu} \p{Xwd}+\p{Han} \p{Xwd}+\p{Xan} \p{Xwd}+\p{Xsp} \p{Xwd}+\p{Xps} \p{Xwd}+\p{Xwd} \p{Xwd}+\P{Xwd} \p{Xwd}+\p{Xuc}/BWZx
+
+/\p{Xuc}+\p{Any} \p{Xuc}+\p{L&} \p{Xuc}+\p{L} \p{Xuc}+\p{Lu} \p{Xuc}+\p{Han} \p{Xuc}+\p{Xan} \p{Xuc}+\p{Xsp} \p{Xuc}+\p{Xps} \p{Xwd}+\p{Xuc} \p{Xuc}+\p{Xuc} \p{Xuc}+\P{Xuc}/BWZx
+
+/\p{N}+\p{Ll} \p{N}+\p{Nd} \p{N}+\P{Nd}/BWZx
+
+/\p{Xan}+\p{L} \p{Xan}+\p{N} \p{Xan}+\p{C} \p{Xan}+\P{L} \P{Xan}+\p{N} \p{Xan}+\P{C}/BWZx
+
+/\p{L}+\p{Xan} \p{N}+\p{Xan} \p{C}+\p{Xan} \P{L}+\p{Xan} \p{N}+\p{Xan} \P{C}+\p{Xan} \p{L}+\P{Xan}/BWZx
+
+/\p{Xan}+\p{Lu} \p{Xan}+\p{Nd} \p{Xan}+\p{Cc} \p{Xan}+\P{Ll} \P{Xan}+\p{No} \p{Xan}+\P{Cf}/BWZx
+
+/\p{Lu}+\p{Xan} \p{Nd}+\p{Xan} \p{Cs}+\p{Xan} \P{Lt}+\p{Xan} \p{Nl}+\p{Xan} \P{Cc}+\p{Xan} \p{Lt}+\P{Xan}/BWZx
+
+/\w+\p{P} \w+\p{Po} \w+\s \p{Xan}+\s \s+\p{Xan} \s+\w/BWZx
+
+/\w+\P{P} \W+\p{Po} \w+\S \P{Xan}+\s \s+\P{Xan} \s+\W/BWZx
+
+/\w+\p{Po} \w+\p{Pc} \W+\p{Po} \W+\p{Pc} \w+\P{Po} \w+\P{Pc}/BWZx
+
+/\p{Nl}+\p{Xan} \P{Nl}+\p{Xan} \p{Nl}+\P{Xan} \P{Nl}+\P{Xan}/BWZx
+
+/\p{Xan}+\p{Nl} \P{Xan}+\p{Nl} \p{Xan}+\P{Nl} \P{Xan}+\P{Nl}/BWZx
+
+/\p{Xan}+\p{Nd} \P{Xan}+\p{Nd} \p{Xan}+\P{Nd} \P{Xan}+\P{Nd}/BWZx
+
+/-- End auto-possessification tests --/
+
/-- End of testinput7 --/
Modified: code/trunk/testdata/testinput8
===================================================================
--- code/trunk/testdata/testinput8 2013-10-01 11:27:04 UTC (rev 1362)
+++ code/trunk/testdata/testinput8 2013-10-01 16:54:40 UTC (rev 1363)
@@ -16,7 +16,7 @@
ac
ab
-/a*/
+/a*/O
a
aaaaaaaaaaaaaaaaa
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
@@ -177,19 +177,19 @@
ayzq
axyzq
-/[^a]+/
+/[^a]+/O
bac
bcdefax
*** Failers
aaaaa
-/[^a]*/
+/[^a]*/O
bac
bcdefax
*** Failers
aaaaa
-/[^a]{3,5}/
+/[^a]{3,5}/O
xyz
awxyza
abcdefa
@@ -937,16 +937,16 @@
*** Failers
the abc
-/^[ab]{1,3}(ab*|b)/
+/^[ab]{1,3}(ab*|b)/O
aabbbbb
-/^[ab]{1,3}?(ab*|b)/
+/^[ab]{1,3}?(ab*|b)/O
aabbbbb
-/^[ab]{1,3}?(ab*?|b)/
+/^[ab]{1,3}?(ab*?|b)/O
aabbbbb
-/^[ab]{1,3}(ab*?|b)/
+/^[ab]{1,3}(ab*?|b)/O
aabbbbb
/ (?: [\040\t] | \(
@@ -2049,13 +2049,13 @@
/foo(.*?)bar/
The food is under the bar in the barn.
-/(.*)(\d*)/
+/(.*)(\d*)/O
I have 2 numbers: 53147
/(.*)(\d+)/
I have 2 numbers: 53147
-/(.*?)(\d*)/
+/(.*?)(\d*)/O
I have 2 numbers: 53147
/(.*?)(\d+)/
@@ -4699,7 +4699,7 @@
/(?(R)a*(?1)|((?R))b)/
aaaabcde
-/(a+)/
+/(a+)/O
\O6aaaa
\O8aaaa
Modified: code/trunk/testdata/testinput9
===================================================================
--- code/trunk/testdata/testinput9 2013-10-01 11:27:04 UTC (rev 1362)
+++ code/trunk/testdata/testinput9 2013-10-01 16:54:40 UTC (rev 1363)
@@ -239,16 +239,16 @@
/\x{100}{3,5}/8
abcd\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}XX
-/\x{100}{3,}/8
+/\x{100}{3,}/8O
abcd\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}XX
/(?<=a\x{100}{2}b)X/8
Xyyya\x{100}\x{100}bXzzz
-/\D*/8
+/\D*/8O
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
-/\D*/8
+/\D*/8O
\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}
/\D/8
@@ -584,16 +584,16 @@
a\n\n\n\rb
a\r
-/\h+\V?\v{3,4}/8
+/\h+\V?\v{3,4}/8O
\x09\x20\x{a0}X\x0a\x0b\x0c\x0d\x0a
-/\V?\v{3,4}/8
+/\V?\v{3,4}/8O
\x20\x{a0}X\x0a\x0b\x0c\x0d\x0a
-/\h+\V?\v{3,4}/8
+/\h+\V?\v{3,4}/8O
>\x09\x20\x{a0}X\x0a\x0a\x0a<
-/\V?\v{3,4}/8
+/\V?\v{3,4}/8O
>\x09\x20\x{a0}X\x0a\x0a\x0a<
/\H\h\V\v/8
@@ -602,7 +602,7 @@
** Failers
\x{a0} X\x0a
-/\H*\h+\V?\v{3,4}/8
+/\H*\h+\V?\v{3,4}/8O
\x09\x20\x{a0}X\x0a\x0b\x0c\x0d\x0a
\x09\x20\x{a0}\x0a\x0b\x0c\x0d\x0a
\x09\x20\x{a0}\x0a\x0b\x0c
@@ -615,7 +615,7 @@
** Failers
\x{2009} X\x0a
-/\H*\h+\V?\v{3,4}/8
+/\H*\h+\V?\v{3,4}/8O
\x{1680}\x{180e}\x{2007}X\x{2028}\x{2029}\x0c\x0d\x0a
\x09\x{205f}\x{a0}\x0a\x{2029}\x0c\x{2028}\x0a
\x09\x20\x{202f}\x0a\x0b\x0c
Modified: code/trunk/testdata/testoutput1
===================================================================
--- code/trunk/testdata/testoutput1 2013-10-01 11:27:04 UTC (rev 1362)
+++ code/trunk/testdata/testoutput1 2013-10-01 16:54:40 UTC (rev 1363)
@@ -9230,4 +9230,11 @@
Abarbar
No match
+/^(\d+)\s+IN\s+SOA\s+(\S+)\s+(\S+)\s*\(\s*$/
+ 1 IN SOA non-sp1 non-sp2(
+ 0: 1 IN SOA non-sp1 non-sp2(
+ 1: 1
+ 2: non-sp1
+ 3: non-sp2
+
/-- End of testinput1 --/
Modified: code/trunk/testdata/testoutput10
===================================================================
--- code/trunk/testdata/testoutput10 2013-10-01 11:27:04 UTC (rev 1362)
+++ code/trunk/testdata/testoutput10 2013-10-01 16:54:40 UTC (rev 1363)
@@ -39,9 +39,6 @@
/^\pL+/8
abcd
0: abcd
- 1: abc
- 2: ab
- 3: a
a
0: a
*** Failers
@@ -50,45 +47,24 @@
/^\PL+/8
1234
0: 1234
- 1: 123
- 2: 12
- 3: 1
=
0: =
*** Failers
0: ***
- 1: ***
- 2: **
- 3: *
abcd
No match
/^\X+/8
abcdA\x{300}\x{301}\x{302}
0: abcdA\x{300}\x{301}\x{302}
- 1: abcd
- 2: abc
- 3: ab
- 4: a
A\x{300}\x{301}\x{302}
0: A\x{300}\x{301}\x{302}
A\x{300}\x{301}\x{302}A\x{300}\x{301}\x{302}
0: A\x{300}\x{301}\x{302}A\x{300}\x{301}\x{302}
- 1: A\x{300}\x{301}\x{302}
a
0: a
*** Failers
0: *** Failers
- 1: *** Failer
- 2: *** Faile
- 3: *** Fail
- 4: *** Fai
- 5: *** Fa
- 6: *** F
- 7: ***
- 8: ***
- 9: **
-10: *
\x{300}\x{301}\x{302}
0: \x{300}\x{301}\x{302}
@@ -374,7 +350,7 @@
\x{903}
No match
-/^\p{Nd}+/8
+/^\p{Nd}+/8O
0123456789\x{660}\x{661}\x{662}\x{663}\x{664}\x{665}\x{666}\x{667}\x{668}\x{669}\x{66a}
0: 0123456789\x{660}\x{661}\x{662}\x{663}\x{664}\x{665}\x{666}\x{667}\x{668}\x{669}
1: 0123456789\x{660}\x{661}\x{662}\x{663}\x{664}\x{665}\x{666}\x{667}\x{668}
@@ -564,10 +540,6 @@
/^\p{Sc}+/8
$\x{a2}\x{a3}\x{a4}\x{a5}\x{a6}
0: $\x{a2}\x{a3}\x{a4}\x{a5}
- 1: $\x{a2}\x{a3}\x{a4}
- 2: $\x{a2}\x{a3}
- 3: $\x{a2}
- 4: $
\x{9f2}
0: \x{9f2}
** Failers
@@ -590,11 +562,6 @@
/^\p{Sm}+/8
+<|~\x{ac}\x{2044}
0: +<|~\x{ac}\x{2044}
- 1: +<|~\x{ac}
- 2: +<|~
- 3: +<|
- 4: +<
- 5: +
** Failers
No match
X
@@ -829,7 +796,7 @@
1234
No match
-/\D+/8
+/\D+/8O
11111111111111111111111111111111111111111111111111111111111111111111111
No match
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
@@ -857,7 +824,7 @@
20: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
21: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
-/\P{Nd}+/8
+/\P{Nd}+/8O
11111111111111111111111111111111111111111111111111111111111111111111111
No match
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
@@ -885,7 +852,7 @@
20: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
21: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
-/[\D]+/8
+/[\D]+/8O
11111111111111111111111111111111111111111111111111111111111111111111111
No match
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
@@ -913,7 +880,7 @@
20: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
21: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
-/[\P{Nd}]+/8
+/[\P{Nd}]+/8O
11111111111111111111111111111111111111111111111111111111111111111111111
No match
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
@@ -941,7 +908,7 @@
20: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
21: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
-/[\D\P{Nd}]+/8
+/[\D\P{Nd}]+/8O
11111111111111111111111111111111111111111111111111111111111111111111111
No match
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
@@ -1066,10 +1033,6 @@
/\x{391}+/8i
\x{391}\x{3b1}\x{3b1}\x{3b1}\x{391}
0: \x{391}\x{3b1}\x{3b1}\x{3b1}\x{391}
- 1: \x{391}\x{3b1}\x{3b1}\x{3b1}
- 2: \x{391}\x{3b1}\x{3b1}
- 3: \x{391}\x{3b1}
- 4: \x{391}
/\x{391}{3,5}(.)/8i
\x{391}\x{3b1}\x{3b1}\x{3b1}\x{391}X
@@ -1256,8 +1219,6 @@
/^\p{Han}+/8
\x{2e81}\x{3007}\x{2f804}\x{31a0}
0: \x{2e81}\x{3007}\x{2f804}
- 1: \x{2e81}\x{3007}
- 2: \x{2e81}
** Failers
No match
\x{2e7f}
@@ -1268,15 +1229,6 @@
0: \x{3105}
** Failers
0: ** Failers
- 1: ** Failer
- 2: ** Faile
- 3: ** Fail
- 4: ** Fai
- 5: ** Fa
- 6: ** F
- 7: **
- 8: **
- 9: *
\x{30ff}
No match
@@ -1489,12 +1441,8 @@
/^\p{Any}{3,5}/8
abcdefgh
0: abcde
- 1: abcd
- 2: abc
\x{1234}\n\r\x{3456}xyz
0: \x{1234}\x{0a}\x{0d}\x{3456}x
- 1: \x{1234}\x{0a}\x{0d}\x{3456}
- 2: \x{1234}\x{0a}\x{0d}
/^\P{Any}{3,5}?/8
** Failers
@@ -1659,7 +1607,6 @@
/\x{c0}+\x{116}+/8i
\x{c0}\x{e0}\x{116}\x{117}
0: \x{c0}\x{e0}\x{116}\x{117}
- 1: \x{c0}\x{e0}\x{116}
/[\x{c0}\x{116}]+/8i
\x{c0}\x{e0}\x{116}\x{117}
@@ -1713,16 +1660,6 @@
/^\p{Xan}+/8
ABCD1234\x{6ca}\x{a6c}\x{10a7}_
0: ABCD1234\x{6ca}\x{a6c}\x{10a7}
- 1: ABCD1234\x{6ca}\x{a6c}
- 2: ABCD1234\x{6ca}
- 3: ABCD1234
- 4: ABCD123
- 5: ABCD12
- 6: ABCD1
- 7: ABCD
- 8: ABC
- 9: AB
-10: A
** Failers
No match
_ABC
@@ -1731,28 +1668,10 @@
/^\p{Xan}*/8
ABCD1234\x{6ca}\x{a6c}\x{10a7}_
0: ABCD1234\x{6ca}\x{a6c}\x{10a7}
- 1: ABCD1234\x{6ca}\x{a6c}
- 2: ABCD1234\x{6ca}
- 3: ABCD1234
- 4: ABCD123
- 5: ABCD12
- 6: ABCD1
- 7: ABCD
- 8: ABC
- 9: AB
-10: A
-11:
/^\p{Xan}{2,9}/8
ABCD1234\x{6ca}\x{a6c}\x{10a7}_
0: ABCD1234\x{6ca}
- 1: ABCD1234
- 2: ABCD123
- 3: ABCD12
- 4: ABCD1
- 5: ABCD
- 6: ABC
- 7: AB
/^[\p{Xan}]/8
ABCD1234_
@@ -1796,7 +1715,7 @@
\x{0b}
No match
-/^>\p{Xsp}+/8
+/^>\p{Xsp}+/8O
> \x{09}\x{0a}\x{0c}\x{0d}\x{a0}\x{1680}\x{2028}\x{0b}
0: > \x{09}\x{0a}\x{0c}\x{0d}\x{a0}\x{1680}\x{2028}
1: > \x{09}\x{0a}\x{0c}\x{0d}\x{a0}\x{1680}
@@ -1807,7 +1726,7 @@
6: > \x{09}
7: >
-/^>\p{Xsp}*/8
+/^>\p{Xsp}*/8O
> \x{09}\x{0a}\x{0c}\x{0d}\x{a0}\x{1680}\x{2028}\x{0b}
0: > \x{09}\x{0a}\x{0c}\x{0d}\x{a0}\x{1680}\x{2028}
1: > \x{09}\x{0a}\x{0c}\x{0d}\x{a0}\x{1680}
@@ -1819,7 +1738,7 @@
7: >
8: >
-/^>\p{Xsp}{2,9}/8
+/^>\p{Xsp}{2,9}/8O
> \x{09}\x{0a}\x{0c}\x{0d}\x{a0}\x{1680}\x{2028}\x{0b}
0: > \x{09}\x{0a}\x{0c}\x{0d}\x{a0}\x{1680}\x{2028}
1: > \x{09}\x{0a}\x{0c}\x{0d}\x{a0}\x{1680}
@@ -1829,11 +1748,11 @@
5: > \x{09}\x{0a}
6: > \x{09}
-/^>[\p{Xsp}]/8
+/^>[\p{Xsp}]/8O
>\x{2028}\x{0b}
0: >\x{2028}
-/^>[\p{Xsp}]+/8
+/^>[\p{Xsp}]+/8O
> \x{09}\x{0a}\x{0c}\x{0d}\x{a0}\x{1680}\x{2028}\x{0b}
0: > \x{09}\x{0a}\x{0c}\x{0d}\x{a0}\x{1680}\x{2028}
1: > \x{09}\x{0a}\x{0c}\x{0d}\x{a0}\x{1680}
@@ -1857,14 +1776,6 @@
/^>\p{Xps}+/8
> \x{09}\x{0a}\x{0c}\x{0d}\x{a0}\x{1680}\x{2028}\x{0b}
0: > \x{09}\x{0a}\x{0c}\x{0d}\x{a0}\x{1680}\x{2028}\x{0b}
- 1: > \x{09}\x{0a}\x{0c}\x{0d}\x{a0}\x{1680}\x{2028}
- 2: > \x{09}\x{0a}\x{0c}\x{0d}\x{a0}\x{1680}
- 3: > \x{09}\x{0a}\x{0c}\x{0d}\x{a0}
- 4: > \x{09}\x{0a}\x{0c}\x{0d}
- 5: > \x{09}\x{0a}\x{0c}
- 6: > \x{09}\x{0a}
- 7: > \x{09}
- 8: >
/^>\p{Xps}+?/8
>\x{1680}\x{2028}\x{0b}
@@ -1875,26 +1786,10 @@
/^>\p{Xps}*/8
> \x{09}\x{0a}\x{0c}\x{0d}\x{a0}\x{1680}\x{2028}\x{0b}
0: > \x{09}\x{0a}\x{0c}\x{0d}\x{a0}\x{1680}\x{2028}\x{0b}
- 1: > \x{09}\x{0a}\x{0c}\x{0d}\x{a0}\x{1680}\x{2028}
- 2: > \x{09}\x{0a}\x{0c}\x{0d}\x{a0}\x{1680}
- 3: > \x{09}\x{0a}\x{0c}\x{0d}\x{a0}
- 4: > \x{09}\x{0a}\x{0c}\x{0d}
- 5: > \x{09}\x{0a}\x{0c}
- 6: > \x{09}\x{0a}
- 7: > \x{09}
- 8: >
- 9: >
/^>\p{Xps}{2,9}/8
> \x{09}\x{0a}\x{0c}\x{0d}\x{a0}\x{1680}\x{2028}\x{0b}
0: > \x{09}\x{0a}\x{0c}\x{0d}\x{a0}\x{1680}\x{2028}\x{0b}
- 1: > \x{09}\x{0a}\x{0c}\x{0d}\x{a0}\x{1680}\x{2028}
- 2: > \x{09}\x{0a}\x{0c}\x{0d}\x{a0}\x{1680}
- 3: > \x{09}\x{0a}\x{0c}\x{0d}\x{a0}
- 4: > \x{09}\x{0a}\x{0c}\x{0d}
- 5: > \x{09}\x{0a}\x{0c}
- 6: > \x{09}\x{0a}
- 7: > \x{09}
/^>\p{Xps}{2,9}?/8
> \x{09}\x{0a}\x{0c}\x{0d}\x{a0}\x{1680}\x{2028}\x{0b}
@@ -1944,42 +1839,14 @@
/^\p{Xwd}+/8
ABCD1234\x{6ca}\x{a6c}\x{10a7}_
0: ABCD1234\x{6ca}\x{a6c}\x{10a7}_
- 1: ABCD1234\x{6ca}\x{a6c}\x{10a7}
- 2: ABCD1234\x{6ca}\x{a6c}
- 3: ABCD1234\x{6ca}
- 4: ABCD1234
- 5: ABCD123
- 6: ABCD12
- 7: ABCD1
- 8: ABCD
- 9: ABC
-10: AB
-11: A
/^\p{Xwd}*/8
ABCD1234\x{6ca}\x{a6c}\x{10a7}_
0: ABCD1234\x{6ca}\x{a6c}\x{10a7}_
- 1: ABCD1234\x{6ca}\x{a6c}\x{10a7}
- 2: ABCD1234\x{6ca}\x{a6c}
- 3: ABCD1234\x{6ca}
- 4: ABCD1234
- 5: ABCD123
- 6: ABCD12
- 7: ABCD1
- 8: ABCD
- 9: ABC
-10: AB
-11: A
-12:
/^\p{Xwd}{2,9}/8
A_12\x{6ca}\x{a6c}\x{10a7}
0: A_12\x{6ca}\x{a6c}\x{10a7}
- 1: A_12\x{6ca}\x{a6c}
- 2: A_12\x{6ca}
- 3: A_12
- 4: A_1
- 5: A_
/^[\p{Xwd}]/8
ABCD1234_
@@ -2063,7 +1930,6 @@
/[^\x{100}]+/8i
\x{100}\x{101}XX
0: XX
- 1: X
/^\X/8
A\P
@@ -2110,7 +1976,6 @@
/^\X+/8
AA\P
0: AA
- 1: A
AA\P\P
Partial match: AA
@@ -2291,7 +2156,6 @@
/\x{1e9e}+/8i
\x{1e9e}\x{00df}
0: \x{1e9e}\x{df}
- 1: \x{1e9e}
/[z\x{1e9e}]+/8i
\x{1e9e}\x{00df}
@@ -2301,7 +2165,6 @@
/\x{00df}+/8i
\x{1e9e}\x{00df}
0: \x{1e9e}\x{df}
- 1: \x{1e9e}
/[z\x{00df}]+/8i
\x{1e9e}\x{00df}
@@ -2311,7 +2174,6 @@
/\x{1f88}+/8i
\x{1f88}\x{1f80}
0: \x{1f88}\x{1f80}
- 1: \x{1f88}
/[z\x{1f88}]+/8i
\x{1f88}\x{1f80}
@@ -2323,403 +2185,273 @@
/\x{00b5}+/8i
\x{00b5}\x{039c}\x{03bc}
0: \x{b5}\x{39c}\x{3bc}
- 1: \x{b5}\x{39c}
- 2: \x{b5}
/\x{039c}+/8i
\x{00b5}\x{039c}\x{03bc}
0: \x{b5}\x{39c}\x{3bc}
- 1: \x{b5}\x{39c}
- 2: \x{b5}
/\x{03bc}+/8i
\x{00b5}\x{039c}\x{03bc}
0: \x{b5}\x{39c}\x{3bc}
- 1: \x{b5}\x{39c}
- 2: \x{b5}
/\x{00c5}+/8i
\x{00c5}\x{00e5}\x{212b}
0: \x{c5}\x{e5}\x{212b}
- 1: \x{c5}\x{e5}
- 2: \x{c5}
/\x{00e5}+/8i
\x{00c5}\x{00e5}\x{212b}
0: \x{c5}\x{e5}\x{212b}
- 1: \x{c5}\x{e5}
- 2: \x{c5}
/\x{212b}+/8i
\x{00c5}\x{00e5}\x{212b}
0: \x{c5}\x{e5}\x{212b}
- 1: \x{c5}\x{e5}
- 2: \x{c5}
/\x{01c4}+/8i
\x{01c4}\x{01c5}\x{01c6}
0: \x{1c4}\x{1c5}\x{1c6}
- 1: \x{1c4}\x{1c5}
- 2: \x{1c4}
/\x{01c5}+/8i
\x{01c4}\x{01c5}\x{01c6}
0: \x{1c4}\x{1c5}\x{1c6}
- 1: \x{1c4}\x{1c5}
- 2: \x{1c4}
/\x{01c6}+/8i
\x{01c4}\x{01c5}\x{01c6}
0: \x{1c4}\x{1c5}\x{1c6}
- 1: \x{1c4}\x{1c5}
- 2: \x{1c4}
/\x{01c7}+/8i
\x{01c7}\x{01c8}\x{01c9}
0: \x{1c7}\x{1c8}\x{1c9}
- 1: \x{1c7}\x{1c8}
- 2: \x{1c7}
/\x{01c8}+/8i
\x{01c7}\x{01c8}\x{01c9}
0: \x{1c7}\x{1c8}\x{1c9}
- 1: \x{1c7}\x{1c8}
- 2: \x{1c7}
/\x{01c9}+/8i
\x{01c7}\x{01c8}\x{01c9}
0: \x{1c7}\x{1c8}\x{1c9}
- 1: \x{1c7}\x{1c8}
- 2: \x{1c7}
/\x{01ca}+/8i
\x{01ca}\x{01cb}\x{01cc}
0: \x{1ca}\x{1cb}\x{1cc}
- 1: \x{1ca}\x{1cb}
- 2: \x{1ca}
/\x{01cb}+/8i
\x{01ca}\x{01cb}\x{01cc}
0: \x{1ca}\x{1cb}\x{1cc}
- 1: \x{1ca}\x{1cb}
- 2: \x{1ca}
/\x{01cc}+/8i
\x{01ca}\x{01cb}\x{01cc}
0: \x{1ca}\x{1cb}\x{1cc}
- 1: \x{1ca}\x{1cb}
- 2: \x{1ca}
/\x{01f1}+/8i
\x{01f1}\x{01f2}\x{01f3}
0: \x{1f1}\x{1f2}\x{1f3}
- 1: \x{1f1}\x{1f2}
- 2: \x{1f1}
/\x{01f2}+/8i
\x{01f1}\x{01f2}\x{01f3}
0: \x{1f1}\x{1f2}\x{1f3}
- 1: \x{1f1}\x{1f2}
- 2: \x{1f1}
/\x{01f3}+/8i
\x{01f1}\x{01f2}\x{01f3}
0: \x{1f1}\x{1f2}\x{1f3}
- 1: \x{1f1}\x{1f2}
- 2: \x{1f1}
/\x{0345}+/8i
\x{0345}\x{0399}\x{03b9}\x{1fbe}
0: \x{345}\x{399}\x{3b9}\x{1fbe}
- 1: \x{345}\x{399}\x{3b9}
- 2: \x{345}\x{399}
- 3: \x{345}
/\x{0399}+/8i
\x{0345}\x{0399}\x{03b9}\x{1fbe}
0: \x{345}\x{399}\x{3b9}\x{1fbe}
- 1: \x{345}\x{399}\x{3b9}
- 2: \x{345}\x{399}
- 3: \x{345}
/\x{03b9}+/8i
\x{0345}\x{0399}\x{03b9}\x{1fbe}
0: \x{345}\x{399}\x{3b9}\x{1fbe}
- 1: \x{345}\x{399}\x{3b9}
- 2: \x{345}\x{399}
- 3: \x{345}
/\x{1fbe}+/8i
\x{0345}\x{0399}\x{03b9}\x{1fbe}
0: \x{345}\x{399}\x{3b9}\x{1fbe}
- 1: \x{345}\x{399}\x{3b9}
- 2: \x{345}\x{399}
- 3: \x{345}
/\x{0392}+/8i
\x{0392}\x{03b2}\x{03d0}
0: \x{392}\x{3b2}\x{3d0}
- 1: \x{392}\x{3b2}
- 2: \x{392}
/\x{03b2}+/8i
\x{0392}\x{03b2}\x{03d0}
0: \x{392}\x{3b2}\x{3d0}
- 1: \x{392}\x{3b2}
- 2: \x{392}
/\x{03d0}+/8i
\x{0392}\x{03b2}\x{03d0}
0: \x{392}\x{3b2}\x{3d0}
- 1: \x{392}\x{3b2}
- 2: \x{392}
/\x{0395}+/8i
\x{0395}\x{03b5}\x{03f5}
0: \x{395}\x{3b5}\x{3f5}
- 1: \x{395}\x{3b5}
- 2: \x{395}
/\x{03b5}+/8i
\x{0395}\x{03b5}\x{03f5}
0: \x{395}\x{3b5}\x{3f5}
- 1: \x{395}\x{3b5}
- 2: \x{395}
/\x{03f5}+/8i
\x{0395}\x{03b5}\x{03f5}
0: \x{395}\x{3b5}\x{3f5}
- 1: \x{395}\x{3b5}
- 2: \x{395}
/\x{0398}+/8i
\x{0398}\x{03b8}\x{03d1}\x{03f4}
0: \x{398}\x{3b8}\x{3d1}\x{3f4}
- 1: \x{398}\x{3b8}\x{3d1}
- 2: \x{398}\x{3b8}
- 3: \x{398}
/\x{03b8}+/8i
\x{0398}\x{03b8}\x{03d1}\x{03f4}
0: \x{398}\x{3b8}\x{3d1}\x{3f4}
- 1: \x{398}\x{3b8}\x{3d1}
- 2: \x{398}\x{3b8}
- 3: \x{398}
/\x{03d1}+/8i
\x{0398}\x{03b8}\x{03d1}\x{03f4}
0: \x{398}\x{3b8}\x{3d1}\x{3f4}
- 1: \x{398}\x{3b8}\x{3d1}
- 2: \x{398}\x{3b8}
- 3: \x{398}
/\x{03f4}+/8i
\x{0398}\x{03b8}\x{03d1}\x{03f4}
0: \x{398}\x{3b8}\x{3d1}\x{3f4}
- 1: \x{398}\x{3b8}\x{3d1}
- 2: \x{398}\x{3b8}
- 3: \x{398}
/\x{039a}+/8i
\x{039a}\x{03ba}\x{03f0}
0: \x{39a}\x{3ba}\x{3f0}
- 1: \x{39a}\x{3ba}
- 2: \x{39a}
/\x{03ba}+/8i
\x{039a}\x{03ba}\x{03f0}
0: \x{39a}\x{3ba}\x{3f0}
- 1: \x{39a}\x{3ba}
- 2: \x{39a}
/\x{03f0}+/8i
\x{039a}\x{03ba}\x{03f0}
0: \x{39a}\x{3ba}\x{3f0}
- 1: \x{39a}\x{3ba}
- 2: \x{39a}
/\x{03a0}+/8i
\x{03a0}\x{03c0}\x{03d6}
0: \x{3a0}\x{3c0}\x{3d6}
- 1: \x{3a0}\x{3c0}
- 2: \x{3a0}
/\x{03c0}+/8i
\x{03a0}\x{03c0}\x{03d6}
0: \x{3a0}\x{3c0}\x{3d6}
- 1: \x{3a0}\x{3c0}
- 2: \x{3a0}
/\x{03d6}+/8i
\x{03a0}\x{03c0}\x{03d6}
0: \x{3a0}\x{3c0}\x{3d6}
- 1: \x{3a0}\x{3c0}
- 2: \x{3a0}
/\x{03a1}+/8i
\x{03a1}\x{03c1}\x{03f1}
0: \x{3a1}\x{3c1}\x{3f1}
- 1: \x{3a1}\x{3c1}
- 2: \x{3a1}
/\x{03c1}+/8i
\x{03a1}\x{03c1}\x{03f1}
0: \x{3a1}\x{3c1}\x{3f1}
- 1: \x{3a1}\x{3c1}
- 2: \x{3a1}
/\x{03f1}+/8i
\x{03a1}\x{03c1}\x{03f1}
0: \x{3a1}\x{3c1}\x{3f1}
- 1: \x{3a1}\x{3c1}
- 2: \x{3a1}
/\x{03a3}+/8i
\x{03A3}\x{03C2}\x{03C3}
0: \x{3a3}\x{3c2}\x{3c3}
- 1: \x{3a3}\x{3c2}
- 2: \x{3a3}
/\x{03c2}+/8i
\x{03A3}\x{03C2}\x{03C3}
0: \x{3a3}\x{3c2}\x{3c3}
- 1: \x{3a3}\x{3c2}
- 2: \x{3a3}
/\x{03c3}+/8i
\x{03A3}\x{03C2}\x{03C3}
0: \x{3a3}\x{3c2}\x{3c3}
- 1: \x{3a3}\x{3c2}
- 2: \x{3a3}
/\x{03a6}+/8i
\x{03a6}\x{03c6}\x{03d5}
0: \x{3a6}\x{3c6}\x{3d5}
- 1: \x{3a6}\x{3c6}
- 2: \x{3a6}
/\x{03c6}+/8i
\x{03a6}\x{03c6}\x{03d5}
0: \x{3a6}\x{3c6}\x{3d5}
- 1: \x{3a6}\x{3c6}
- 2: \x{3a6}
/\x{03d5}+/8i
\x{03a6}\x{03c6}\x{03d5}
0: \x{3a6}\x{3c6}\x{3d5}
- 1: \x{3a6}\x{3c6}
- 2: \x{3a6}
/\x{03c9}+/8i
\x{03c9}\x{03a9}\x{2126}
0: \x{3c9}\x{3a9}\x{2126}
- 1: \x{3c9}\x{3a9}
- 2: \x{3c9}
/\x{03a9}+/8i
\x{03c9}\x{03a9}\x{2126}
0: \x{3c9}\x{3a9}\x{2126}
- 1: \x{3c9}\x{3a9}
- 2: \x{3c9}
/\x{2126}+/8i
\x{03c9}\x{03a9}\x{2126}
0: \x{3c9}\x{3a9}\x{2126}
- 1: \x{3c9}\x{3a9}
- 2: \x{3c9}
/\x{1e60}+/8i
\x{1e60}\x{1e61}\x{1e9b}
0: \x{1e60}\x{1e61}\x{1e9b}
- 1: \x{1e60}\x{1e61}
- 2: \x{1e60}
/\x{1e61}+/8i
\x{1e60}\x{1e61}\x{1e9b}
0: \x{1e60}\x{1e61}\x{1e9b}
- 1: \x{1e60}\x{1e61}
- 2: \x{1e60}
/\x{1e9b}+/8i
\x{1e60}\x{1e61}\x{1e9b}
0: \x{1e60}\x{1e61}\x{1e9b}
- 1: \x{1e60}\x{1e61}
- 2: \x{1e60}
/\x{1e9e}+/8i
\x{1e9e}\x{00df}
0: \x{1e9e}\x{df}
- 1: \x{1e9e}
/\x{00df}+/8i
\x{1e9e}\x{00df}
0: \x{1e9e}\x{df}
- 1: \x{1e9e}
/\x{1f88}+/8i
\x{1f88}\x{1f80}
0: \x{1f88}\x{1f80}
- 1: \x{1f88}
/\x{1f80}+/8i
\x{1f88}\x{1f80}
0: \x{1f88}\x{1f80}
- 1: \x{1f88}
/\x{004b}+/8i
\x{004b}\x{006b}\x{212a}
0: Kk\x{212a}
- 1: Kk
- 2: K
/\x{006b}+/8i
\x{004b}\x{006b}\x{212a}
0: Kk\x{212a}
- 1: Kk
- 2: K
/\x{212a}+/8i
\x{004b}\x{006b}\x{212a}
0: Kk\x{212a}
- 1: Kk
- 2: K
/\x{0053}+/8i
\x{0053}\x{0073}\x{017f}
0: Ss\x{17f}
- 1: Ss
- 2: S
/\x{0073}+/8i
\x{0053}\x{0073}\x{017f}
0: Ss\x{17f}
- 1: Ss
- 2: S
/\x{017f}+/8i
\x{0053}\x{0073}\x{017f}
0: Ss\x{17f}
- 1: Ss
- 2: S
/ist/8i
ikt
@@ -2760,11 +2492,6 @@
/^\p{Xuc}+/8
$@`\x{a0}\x{1234}\x{e000}**
0: $@`\x{a0}\x{1234}\x{e000}
- 1: $@`\x{a0}\x{1234}
- 2: $@`\x{a0}
- 3: $@`
- 4: $@
- 5: $
** Failers
No match
\x{9f}
@@ -2802,8 +2529,6 @@
/^\p{Xuc}{3,5}/8
$@`\x{a0}\x{1234}\x{e000}**
0: $@`\x{a0}\x{1234}
- 1: $@`\x{a0}
- 2: $@`
** Failers
No match
\x{9f}
Modified: code/trunk/testdata/testoutput11-16
===================================================================
--- code/trunk/testdata/testoutput11-16 2013-10-01 11:27:04 UTC (rev 1362)
+++ code/trunk/testdata/testoutput11-16 2013-10-01 16:54:40 UTC (rev 1363)
@@ -105,7 +105,7 @@
0 11 Bra
2 7 Once
4 x
- 6 x{0,2}
+ 6 x{0,2}+
9 7 Ket
11 11 Ket
13 End
@@ -139,7 +139,7 @@
39 [bc]+
57 21 Ket
59 5 CBra 5
- 62 \w*
+ 62 \w*+
64 5 Ket
66 63 Ket
68 68 Ket
Modified: code/trunk/testdata/testoutput11-32
===================================================================
--- code/trunk/testdata/testoutput11-32 2013-10-01 11:27:04 UTC (rev 1362)
+++ code/trunk/testdata/testoutput11-32 2013-10-01 16:54:40 UTC (rev 1363)
@@ -105,7 +105,7 @@
0 11 Bra
2 7 Once
4 x
- 6 x{0,2}
+ 6 x{0,2}+
9 7 Ket
11 11 Ket
13 End
@@ -139,7 +139,7 @@
31 [bc]+
41 13 Ket
43 5 CBra 5
- 46 \w*
+ 46 \w*+
48 5 Ket
50 47 Ket
52 52 Ket
Modified: code/trunk/testdata/testoutput11-8
===================================================================
--- code/trunk/testdata/testoutput11-8 2013-10-01 11:27:04 UTC (rev 1362)
+++ code/trunk/testdata/testoutput11-8 2013-10-01 16:54:40 UTC (rev 1363)
@@ -105,7 +105,7 @@
0 15 Bra
3 9 Once
6 x
- 8 x{0,2}
+ 8 x{0,2}+
12 9 Ket
15 15 Ket
18 End
@@ -139,7 +139,7 @@
66 [bc]+
100 39 Ket
103 7 CBra 5
-108 \w*
+108 \w*+
110 7 Ket
113 109 Ket
116 116 Ket
Modified: code/trunk/testdata/testoutput15
===================================================================
--- code/trunk/testdata/testoutput15 2013-10-01 11:27:04 UTC (rev 1362)
+++ code/trunk/testdata/testoutput15 2013-10-01 16:54:40 UTC (rev 1363)
@@ -72,7 +72,7 @@
/\xC3\xC3\xC3xxx/8
Failed: invalid UTF-8 string at offset 0
-/\xC3\xC3\xC3xxx/8?DZSS
+/\xC3\xC3\xC3xxx/8?DZSSO
------------------------------------------------------------------
Bra
\X{c0}\X{c0}\X{c0}xxx
@@ -80,7 +80,7 @@
End
------------------------------------------------------------------
Capturing subpattern count = 0
-Options: utf no_utf_check
+Options: no_auto_possessify utf no_utf_check
First char = \x{c3}
Need char = 'x'
@@ -508,7 +508,7 @@
------------------------------------------------------------------
Bra
\x{100}{3}
- \x{100}?
+ \x{100}?+
Ket
End
------------------------------------------------------------------
@@ -525,7 +525,7 @@
------------------------------------------------------------------
Bra
CBra 1
- \x{100}+
+ \x{100}++
Alt
x
Ket
@@ -562,7 +562,7 @@
------------------------------------------------------------------
Bra
CBra 1
- \x{100}{0,2}
+ \x{100}{0,2}+
a
Alt
x
@@ -582,7 +582,7 @@
Bra
CBra 1
\x{100}
- \x{100}{0,1}
+ \x{100}{0,1}+
a
Alt
x
@@ -613,7 +613,7 @@
------------------------------------------------------------------
Bra
a\x{100}
- \x{101}*
+ \x{101}*+
Ket
End
------------------------------------------------------------------
@@ -626,7 +626,7 @@
------------------------------------------------------------------
Bra
a\x{100}
- \x{101}+
+ \x{101}++
Ket
End
------------------------------------------------------------------
Modified: code/trunk/testdata/testoutput17
===================================================================
--- code/trunk/testdata/testoutput17 2013-10-01 11:27:04 UTC (rev 1362)
+++ code/trunk/testdata/testoutput17 2013-10-01 16:54:40 UTC (rev 1363)
@@ -442,7 +442,7 @@
/i [^\x{7fff}]{0,7}?
Once
/i [^\x{100}]{5}
- /i [^\x{100}]?
+ /i [^\x{100}]?+
Ket
Ket
End
Modified: code/trunk/testdata/testoutput18-16
===================================================================
--- code/trunk/testdata/testoutput18-16 2013-10-01 11:27:04 UTC (rev 1362)
+++ code/trunk/testdata/testoutput18-16 2013-10-01 16:54:40 UTC (rev 1363)
@@ -367,7 +367,7 @@
------------------------------------------------------------------
Bra
\x{100}{3}
- \x{100}?
+ \x{100}?+
Ket
End
------------------------------------------------------------------
@@ -384,7 +384,7 @@
------------------------------------------------------------------
Bra
CBra 1
- \x{100}+
+ \x{100}++
Alt
x
Ket
@@ -421,7 +421,7 @@
------------------------------------------------------------------
Bra
CBra 1
- \x{100}{0,2}
+ \x{100}{0,2}+
a
Alt
x
@@ -441,7 +441,7 @@
Bra
CBra 1
\x{100}
- \x{100}{0,1}
+ \x{100}{0,1}+
a
Alt
x
@@ -472,7 +472,7 @@
------------------------------------------------------------------
Bra
a\x{100}
- \x{101}*
+ \x{101}*+
Ket
End
------------------------------------------------------------------
@@ -485,7 +485,7 @@
------------------------------------------------------------------
Bra
a\x{100}
- \x{101}+
+ \x{101}++
Ket
End
------------------------------------------------------------------
Modified: code/trunk/testdata/testoutput18-32
===================================================================
--- code/trunk/testdata/testoutput18-32 2013-10-01 11:27:04 UTC (rev 1362)
+++ code/trunk/testdata/testoutput18-32 2013-10-01 16:54:40 UTC (rev 1363)
@@ -365,7 +365,7 @@
------------------------------------------------------------------
Bra
\x{100}{3}
- \x{100}?
+ \x{100}?+
Ket
End
------------------------------------------------------------------
@@ -382,7 +382,7 @@
------------------------------------------------------------------
Bra
CBra 1
- \x{100}+
+ \x{100}++
Alt
x
Ket
@@ -419,7 +419,7 @@
------------------------------------------------------------------
Bra
CBra 1
- \x{100}{0,2}
+ \x{100}{0,2}+
a
Alt
x
@@ -439,7 +439,7 @@
Bra
CBra 1
\x{100}
- \x{100}{0,1}
+ \x{100}{0,1}+
a
Alt
x
@@ -470,7 +470,7 @@
------------------------------------------------------------------
Bra
a\x{100}
- \x{101}*
+ \x{101}*+
Ket
End
------------------------------------------------------------------
@@ -483,7 +483,7 @@
------------------------------------------------------------------
Bra
a\x{100}
- \x{101}+
+ \x{101}++
Ket
End
------------------------------------------------------------------
Modified: code/trunk/testdata/testoutput2
===================================================================
--- code/trunk/testdata/testoutput2 2013-10-01 11:27:04 UTC (rev 1362)
+++ code/trunk/testdata/testoutput2 2013-10-01 16:54:40 UTC (rev 1363)
@@ -2898,7 +2898,7 @@
Bra
Once
x
- x{0,2}
+ x{0,2}+
Ket
Ket
End
@@ -3088,7 +3088,7 @@
[bc]+
Ket
CBra 5
- \w*
+ \w*+
Ket
Ket
Ket
@@ -3788,12 +3788,6 @@
--->abbbbbccc
1 ^ ^
Callout data = 1
- 1 ^ ^
-Callout data = 1
- 1 ^ ^
-Callout data = 1
- 1 ^ ^
-Callout data = 1
1 ^ ^
Callout data = 1
1 ^ ^
@@ -7516,7 +7510,7 @@
/a*[^a]/BZ
------------------------------------------------------------------
Bra
- a*
+ a*+
[^a]
Ket
End
@@ -8596,7 +8590,7 @@
\d
\v++
\w
- \v+
+ \v++
\S
\v++
\V
@@ -8694,27 +8688,18 @@
+6 ^ ^ (*FAIL)
+6 ^ ^ (*FAIL)
+6 ^ ^ (*FAIL)
- +4 ^ ^ c+
- +2 ^ ^ b?
- +4 ^ ^ c+
- +2 ^^ b?
- +4 ^^ c+
+0 ^ a+
+2 ^ ^ b?
+4 ^ ^ c+
+6 ^ ^ (*FAIL)
+6 ^ ^ (*FAIL)
+6 ^ ^ (*FAIL)
- +4 ^ ^ c+
- +2 ^^ b?
- +4 ^^ c+
+0 ^ a+
+2 ^^ b?
+4 ^ ^ c+
+6 ^ ^ (*FAIL)
+6 ^ ^ (*FAIL)
+6 ^ ^ (*FAIL)
- +4 ^^ c+
No match
/a+b?(*PRUNE)c+(*FAIL)/C
@@ -11577,7 +11562,7 @@
Assert not
a
Ket
- \w+
+ \w++
Ket
End
------------------------------------------------------------------
@@ -11837,14 +11822,14 @@
/^(?>a+)(?>b+)(?>c+)(?>d+)(?>e+)/
\Maabbccddee
-Minimum match() limit = 12
+Minimum match() limit = 11
Minimum match() recursion limit = 3
0: aabbccddee
/^(?>(a+))(?>(b+))(?>(c+))(?>(d+))(?>(e+))/
\Maabbccddee
-Minimum match() limit = 22
-Minimum match() recursion limit = 21
+Minimum match() limit = 21
+Minimum match() recursion limit = 20
0: aabbccddee
1: aa
2: bb
@@ -11854,8 +11839,8 @@
/^(?>(a+))(?>b+)(?>(c+))(?>d+)(?>(e+))/
\Maabbccddee
-Minimum match() limit = 18
-Minimum match() recursion limit = 13
+Minimum match() limit = 17
+Minimum match() recursion limit = 12
0: aabbccddee
1: aa
2: cc
@@ -12768,4 +12753,764 @@
Subject length lower bound = 5
No set of starting bytes
+/a*[bcd]/BZ
+------------------------------------------------------------------
+ Bra
+ a*+
+ [b-d]
+ Ket
+ End
+------------------------------------------------------------------
+
+/[bcd]*a/BZ
+------------------------------------------------------------------
+ Bra
+ [b-d]*
+ a
+ Ket
+ End
+------------------------------------------------------------------
+
+/-- A complete set of tests for auto-possessification of character types --/
+
+/\D+\D \D+\d \D+\S \D+\s \D+\W \D+\w \D+. \D+\C \D+\R \D+\H \D+\h \D+\V \D+\v \D+\X \D+\Z \D+\z \D+$/BZx
+------------------------------------------------------------------
+ Bra
+ \D+
+ \D
+ \D++
+ \d
+ \D+
+ \S
+ \D+
+ \s
+ \D+
+ \W
+ \D+
+ \w
+ \D+
+ Any
+ \D+
+ AllAny
+ \D+
+ \R
+ \D+
+ \H
+ \D+
+ \h
+ \D+
+ \V
+ \D+
+ \v
+ \D+
+ extuni
+ \D+
+ \Z
+ \D++
+ \z
+ \D+
+ $
+ Ket
+ End
+------------------------------------------------------------------
+
+/\d+\D \d+\d \d+\S \d+\s \d+\W \d+\w \d+. \d+\C \d+\R \d+\H \d+\h \d+\V \d+\v \d+\X \d+\Z \d+\z \d+$/BZx
+------------------------------------------------------------------
+ Bra
+ \d++
+ \D
+ \d+
+ \d
+ \d+
+ \S
+ \d++
+ \s
+ \d++
+ \W
+ \d+
+ \w
+ \d+
+ Any
+ \d+
+ AllAny
+ \d++
+ \R
+ \d+
+ \H
+ \d++
+ \h
+ \d+
+ \V
+ \d++
+ \v
+ \d+
+ extuni
+ \d++
+ \Z
+ \d++
+ \z
+ \d++
+ $
+ Ket
+ End
+------------------------------------------------------------------
+
+/\S+\D \S+\d \S+\S \S+\s \S+\W \S+\w \S+. \S+\C \S+\R \S+\H \S+\h \S+\V \S+\v \S+\X \S+\Z \S+\z \S+$/BZx
+------------------------------------------------------------------
+ Bra
+ \S+
+ \D
+ \S+
+ \d
+ \S+
+ \S
+ \S++
+ \s
+ \S+
+ \W
+ \S+
+ \w
+ \S+
+ Any
+ \S+
+ AllAny
+ \S++
+ \R
+ \S+
+ \H
+ \S++
+ \h
+ \S+
+ \V
+ \S++
+ \v
+ \S+
+ extuni
+ \S++
+ \Z
+ \S++
+ \z
+ \S++
+ $
+ Ket
+ End
+------------------------------------------------------------------
+
+/\s+\D \s+\d \s+\S \s+\s \s+\W \s+\w \s+. \s+\C \s+\R \s+\H \s+\h \s+\V \s+\v \s+\X \s+\Z \s+\z \s+$/BZx
+------------------------------------------------------------------
+ Bra
+ \s+
+ \D
+ \s++
+ \d
+ \s++
+ \S
+ \s+
+ \s
+ \s+
+ \W
+ \s++
+ \w
+ \s+
+ Any
+ \s+
+ AllAny
+ \s+
+ \R
+ \s+
+ \H
+ \s+
+ \h
+ \s+
+ \V
+ \s+
+ \v
+ \s+
+ extuni
+ \s+
+ \Z
+ \s++
+ \z
+ \s+
+ $
+ Ket
+ End
+------------------------------------------------------------------
+
+/\W+\D \W+\d \W+\S \W+\s \W+\W \W+\w \W+. \W+\C \W+\R \W+\H \W+\h \W+\V \W+\v \W+\X \W+\Z \W+\z \W+$/BZx
+------------------------------------------------------------------
+ Bra
+ \W+
+ \D
+ \W++
+ \d
+ \W+
+ \S
+ \W+
+ \s
+ \W+
+ \W
+ \W++
+ \w
+ \W+
+ Any
+ \W+
+ AllAny
+ \W+
+ \R
+ \W+
+ \H
+ \W+
+ \h
+ \W+
+ \V
+ \W+
+ \v
+ \W+
+ extuni
+ \W+
+ \Z
+ \W++
+ \z
+ \W+
+ $
+ Ket
+ End
+------------------------------------------------------------------
+
+/\w+\D \w+\d \w+\S \w+\s \w+\W \w+\w \w+. \w+\C \w+\R \w+\H \w+\h \w+\V \w+\v \w+\X \w+\Z \w+\z \w+$/BZx
+------------------------------------------------------------------
+ Bra
+ \w+
+ \D
+ \w+
+ \d
+ \w+
+ \S
+ \w++
+ \s
+ \w++
+ \W
+ \w+
+ \w
+ \w+
+ Any
+ \w+
+ AllAny
+ \w++
+ \R
+ \w+
+ \H
+ \w++
+ \h
+ \w+
+ \V
+ \w++
+ \v
+ \w+
+ extuni
+ \w++
+ \Z
+ \w++
+ \z
+ \w++
+ $
+ Ket
+ End
+------------------------------------------------------------------
+
+/\C+\D \C+\d \C+\S \C+\s \C+\W \C+\w \C+. \C+\C \C+\R \C+\H \C+\h \C+\V \C+\v \C+\X \C+\Z \C+\z \C+$/BZx
+------------------------------------------------------------------
+ Bra
+ AllAny+
+ \D
+ AllAny+
+ \d
+ AllAny+
+ \S
+ AllAny+
+ \s
+ AllAny+
+ \W
+ AllAny+
+ \w
+ AllAny+
+ Any
+ AllAny+
+ AllAny
+ AllAny+
+ \R
+ AllAny+
+ \H
+ AllAny+
+ \h
+ AllAny+
+ \V
+ AllAny+
+ \v
+ AllAny+
+ extuni
+ AllAny+
+ \Z
+ AllAny++
+ \z
+ AllAny+
+ $
+ Ket
+ End
+------------------------------------------------------------------
+
+/\R+\D \R+\d \R+\S \R+\s \R+\W \R+\w \R+. \R+\C \R+\R \R+\H \R+\h \R+\V \R+\v \R+\X \R+\Z \R+\z \R+$/BZx
+------------------------------------------------------------------
+ Bra
+ \R+
+ \D
+ \R++
+ \d
+ \R+
+ \S
+ \R++
+ \s
+ \R+
+ \W
+ \R++
+ \w
+ \R++
+ Any
+ \R+
+ AllAny
+ \R+
+ \R
+ \R+
+ \H
+ \R++
+ \h
+ \R+
+ \V
+ \R+
+ \v
+ \R+
+ extuni
+ \R+
+ \Z
+ \R++
+ \z
+ \R+
+ $
+ Ket
+ End
+------------------------------------------------------------------
+
+/\H+\D \H+\d \H+\S \H+\s \H+\W \H+\w \H+. \H+\C \H+\R \H+\H \H+\h \H+\V \H+\v \H+\X \H+\Z \H+\z \H+$/BZx
+------------------------------------------------------------------
+ Bra
+ \H+
+ \D
+ \H+
+ \d
+ \H+
+ \S
+ \H+
+ \s
+ \H+
+ \W
+ \H+
+ \w
+ \H+
+ Any
+ \H+
+ AllAny
+ \H+
+ \R
+ \H+
+ \H
+ \H++
+ \h
+ \H+
+ \V
+ \H+
+ \v
+ \H+
+ extuni
+ \H+
+ \Z
+ \H++
+ \z
+ \H+
+ $
+ Ket
+ End
+------------------------------------------------------------------
+
+/\h+\D \h+\d \h+\S \h+\s \h+\W \h+\w \h+. \h+\C \h+\R \h+\H \h+\h \h+\V \h+\v \h+\X \h+\Z \h+\z \h+$/BZx
+------------------------------------------------------------------
+ Bra
+ \h+
+ \D
+ \h++
+ \d
+ \h++
+ \S
+ \h+
+ \s
+ \h+
+ \W
+ \h++
+ \w
+ \h+
+ Any
+ \h+
+ AllAny
+ \h++
+ \R
+ \h++
+ \H
+ \h+
+ \h
+ \h+
+ \V
+ \h++
+ \v
+ \h+
+ extuni
+ \h+
+ \Z
+ \h++
+ \z
+ \h+
+ $
+ Ket
+ End
+------------------------------------------------------------------
+
+/\V+\D \V+\d \V+\S \V+\s \V+\W \V+\w \V+. \V+\C \V+\R \V+\H \V+\h \V+\V \V+\v \V+\X \V+\Z \V+\z \V+$/BZx
+------------------------------------------------------------------
+ Bra
+ \V+
+ \D
+ \V+
+ \d
+ \V+
+ \S
+ \V+
+ \s
+ \V+
+ \W
+ \V+
+ \w
+ \V+
+ Any
+ \V+
+ AllAny
+ \V++
+ \R
+ \V+
+ \H
+ \V+
+ \h
+ \V+
+ \V
+ \V++
+ \v
+ \V+
+ extuni
+ \V+
+ \Z
+ \V++
+ \z
+ \V+
+ $
+ Ket
+ End
+------------------------------------------------------------------
+
+/\v+\D \v+\d \v+\S \v+\s \v+\W \v+\w \v+. \v+\C \v+\R \v+\H \v+\h \v+\V \v+\v \v+\X \v+\Z \v+\z \v+$/BZx
+------------------------------------------------------------------
+ Bra
+ \v+
+ \D
+ \v++
+ \d
+ \v++
+ \S
+ \v+
+ \s
+ \v+
+ \W
+ \v++
+ \w
+ \v+
+ Any
+ \v+
+ AllAny
+ \v+
+ \R
+ \v+
+ \H
+ \v++
+ \h
+ \v++
+ \V
+ \v+
+ \v
+ \v+
+ extuni
+ \v+
+ \Z
+ \v++
+ \z
+ \v+
+ $
+ Ket
+ End
+------------------------------------------------------------------
+
+/\X+\D \X+\d \X+\S \X+\s \X+\W \X+\w \X+. \X+\C \X+\R \X+\H \X+\h \X+\V \X+\v \X+\X \X+\Z \X+\z \X+$/BZx
+------------------------------------------------------------------
+ Bra
+ extuni+
+ \D
+ extuni+
+ \d
+ extuni+
+ \S
+ extuni+
+ \s
+ extuni+
+ \W
+ extuni+
+ \w
+ extuni+
+ Any
+ extuni+
+ AllAny
+ extuni+
+ \R
+ extuni+
+ \H
+ extuni+
+ \h
+ extuni+
+ \V
+ extuni+
+ \v
+ extuni+
+ extuni
+ extuni+
+ \Z
+ extuni++
+ \z
+ extuni+
+ $
+ Ket
+ End
+------------------------------------------------------------------
+
+/ a+\D a+\d a+\S a+\s a+\W a+\w a+. a+\C a+\R a+\H a+\h a+\V a+\v a+\X a+\Z a+\z a+$/BZx
+------------------------------------------------------------------
+ Bra
+ a+
+ \D
+ a++
+ \d
+ a+
+ \S
+ a++
+ \s
+ a++
+ \W
+ a+
+ \w
+ a+
+ Any
+ a+
+ AllAny
+ a++
+ \R
+ a+
+ \H
+ a++
+ \h
+ a+
+ \V
+ a++
+ \v
+ a+
+ extuni
+ a++
+ \Z
+ a++
+ \z
+ a++
+ $
+ Ket
+ End
+------------------------------------------------------------------
+
+/\n+\D \n+\d \n+\S \n+\s \n+\W \n+\w \n+. \n+\C \n+\R \n+\H \n+\h \n+\V \n+\v \n+\X \n+\Z \n+\z \n+$/BZx
+------------------------------------------------------------------
+ Bra
+ \x0a+
+ \D
+ \x0a++
+ \d
+ \x0a++
+ \S
+ \x0a+
+ \s
+ \x0a+
+ \W
+ \x0a++
+ \w
+ \x0a+
+ Any
+ \x0a+
+ AllAny
+ \x0a+
+ \R
+ \x0a+
+ \H
+ \x0a++
+ \h
+ \x0a++
+ \V
+ \x0a+
+ \v
+ \x0a+
+ extuni
+ \x0a+
+ \Z
+ \x0a++
+ \z
+ \x0a+
+ $
+ Ket
+ End
+------------------------------------------------------------------
+
+/ .+\D .+\d .+\S .+\s .+\W .+\w .+. .+\C .+\R .+\H .+\h .+\V .+\v .+\X .+\Z .+\z .+$/BZx
+------------------------------------------------------------------
+ Bra
+ Any+
+ \D
+ Any+
+ \d
+ Any+
+ \S
+ Any+
+ \s
+ Any+
+ \W
+ Any+
+ \w
+ Any+
+ Any
+ Any+
+ AllAny
+ Any++
+ \R
+ Any+
+ \H
+ Any+
+ \h
+ Any+
+ \V
+ Any+
+ \v
+ Any+
+ extuni
+ Any+
+ \Z
+ Any++
+ \z
+ Any+
+ $
+ Ket
+ End
+------------------------------------------------------------------
+
+/ .+\D .+\d .+\S .+\s .+\W .+\w .+. .+\C .+\R .+\H .+\h .+\V .+\v .+\X .+\Z .+\z .+$/BZxs
+------------------------------------------------------------------
+ Bra
+ AllAny+
+ \D
+ AllAny+
+ \d
+ AllAny+
+ \S
+ AllAny+
+ \s
+ AllAny+
+ \W
+ AllAny+
+ \w
+ AllAny+
+ AllAny
+ AllAny+
+ AllAny
+ AllAny+
+ \R
+ AllAny+
+ \H
+ AllAny+
+ \h
+ AllAny+
+ \V
+ AllAny+
+ \v
+ AllAny+
+ extuni
+ AllAny+
+ \Z
+ AllAny++
+ \z
+ AllAny+
+ $
+ Ket
+ End
+------------------------------------------------------------------
+
+/\D+$ \d+$ \S+$ \s+$ \W+$ \w+$ \C+$ \R+$ \H+$ \h+$ \V+$ \v+$ \X+$ a+$ \n+$ .+$ .+$/BZxm
+------------------------------------------------------------------
+ Bra
+ \D+
+ /m $
+ \d++
+ /m $
+ \S++
+ /m $
+ \s+
+ /m $
+ \W+
+ /m $
+ \w++
+ /m $
+ AllAny+
+ /m $
+ \R+
+ /m $
+ \H+
+ /m $
+ \h+
+ /m $
+ \V+
+ /m $
+ \v+
+ /m $
+ extuni+
+ /m $
+ a+
+ /m $
+ \x0a+
+ /m $
+ Any+
+ /m $
+ Any+
+ /m $
+ Ket
+ End
+------------------------------------------------------------------
+
+/-- End of special auto-possessive tests --/
+
/-- End of testinput2 --/
Modified: code/trunk/testdata/testoutput20
===================================================================
--- code/trunk/testdata/testoutput20 2013-10-01 11:27:04 UTC (rev 1362)
+++ code/trunk/testdata/testoutput20 2013-10-01 16:54:40 UTC (rev 1363)
@@ -8,12 +8,10 @@
/^\x{ffff}?/i
\x{ffff}
0: \x{ffff}
- 1:
/^\x{ffff}*/i
\x{ffff}
0: \x{ffff}
- 1:
/^\x{ffff}{3}/i
\x{ffff}\x{ffff}\x{ffff}
@@ -22,6 +20,5 @@
/^\x{ffff}{0,3}/i
\x{ffff}
0: \x{ffff}
- 1:
/-- End of testinput20 --/
Modified: code/trunk/testdata/testoutput5
===================================================================
--- code/trunk/testdata/testoutput5 2013-10-01 11:27:04 UTC (rev 1362)
+++ code/trunk/testdata/testoutput5 2013-10-01 16:54:40 UTC (rev 1363)
@@ -146,7 +146,7 @@
/\x{100}*/8DZ
------------------------------------------------------------------
Bra
- \x{100}*
+ \x{100}*+
Ket
End
------------------------------------------------------------------
@@ -160,7 +160,7 @@
------------------------------------------------------------------
Bra
a
- \x{100}*
+ \x{100}*+
Ket
End
------------------------------------------------------------------
@@ -173,7 +173,7 @@
------------------------------------------------------------------
Bra
ab
- \x{100}*
+ \x{100}*+
Ket
End
------------------------------------------------------------------
@@ -1817,7 +1817,7 @@
/i [^\x{7fff}]{0,7}?
Once
/i [^\x{fffff}]{5}
- /i [^\x{fffff}]?
+ /i [^\x{fffff}]?+
Ket
Ket
End
Modified: code/trunk/testdata/testoutput7
===================================================================
--- code/trunk/testdata/testoutput7 2013-10-01 11:27:04 UTC (rev 1362)
+++ code/trunk/testdata/testoutput7 2013-10-01 16:54:40 UTC (rev 1363)
@@ -1109,8 +1109,8 @@
prop Nd
B+
prop N *+
- B+
- prop Nd *
+ B++
+ prop Nd *+
Ket
End
------------------------------------------------------------------
@@ -1386,7 +1386,7 @@
/[\x{3a3}]+/8iBZ
------------------------------------------------------------------
Bra
- clist 03a3 03c2 03c3 +
+ clist 03a3 03c2 03c3 ++
Ket
End
------------------------------------------------------------------
@@ -1394,7 +1394,7 @@
/[^\x{3a3}]+/8iBZ
------------------------------------------------------------------
Bra
- not clist 03a3 03c2 03c3 +
+ not clist 03a3 03c2 03c3 ++
Ket
End
------------------------------------------------------------------
@@ -1616,5 +1616,509 @@
No match
\x{1234}abc
No match
+
+/-- Some auto-possessification tests --/
+/\pN+\z/BZ
+------------------------------------------------------------------
+ Bra
+ prop N ++
+ \z
+ Ket
+ End
+------------------------------------------------------------------
+
+/\PN+\z/BZ
+------------------------------------------------------------------
+ Bra
+ notprop N ++
+ \z
+ Ket
+ End
+------------------------------------------------------------------
+
+/\pN+/BZ
+------------------------------------------------------------------
+ Bra
+ prop N ++
+ Ket
+ End
+------------------------------------------------------------------
+
+/\PN+/BZ
+------------------------------------------------------------------
+ Bra
+ notprop N ++
+ Ket
+ End
+------------------------------------------------------------------
+
+/\p{Any}+\p{Any} \p{Any}+\P{Any} \p{Any}+\p{L&} \p{Any}+\p{L} \p{Any}+\p{Lu} \p{Any}+\p{Han} \p{Any}+\p{Xan} \p{Any}+\p{Xsp} \p{Any}+\p{Xps} \p{Xwd}+\p{Any} \p{Any}+\p{Xuc}/BWZx
+------------------------------------------------------------------
+ Bra
+ prop Any +
+ prop Any
+ prop Any +
+ notprop Any
+ prop Any +
+ prop L&
+ prop Any +
+ prop L
+ prop Any +
+ prop Lu
+ prop Any +
+ prop Han
+ prop Any +
+ prop Xan
+ prop Any +
+ prop Xsp
+ prop Any +
+ prop Xps
+ prop Xwd +
+ prop Any
+ prop Any +
+ prop Xuc
+ Ket
+ End
+------------------------------------------------------------------
+
+/\p{L&}+\p{Any} \p{L&}+\p{L&} \P{L&}+\p{L&} \p{L&}+\p{L} \p{L&}+\p{Lu} \p{L&}+\p{Han} \p{L&}+\p{Xan} \p{L&}+\P{Xan} \p{L&}+\p{Xsp} \p{L&}+\p{Xps} \p{Xwd}+\p{L&} \p{L&}+\p{Xuc}/BWZx
+------------------------------------------------------------------
+ Bra
+ prop L& +
+ prop Any
+ prop L& +
+ prop L&
+ notprop L& ++
+ prop L&
+ prop L& +
+ prop L
+ prop L& +
+ prop Lu
+ prop L& +
+ prop Han
+ prop L& +
+ prop Xan
+ prop L& ++
+ notprop Xan
+ prop L& ++
+ prop Xsp
+ prop L& ++
+ prop Xps
+ prop Xwd +
+ prop L&
+ prop L& +
+ prop Xuc
+ Ket
+ End
+------------------------------------------------------------------
+
+/\p{N}+\p{Any} \p{N}+\p{L&} \p{N}+\p{L} \p{N}+\P{L} \p{N}+\P{N} \p{N}+\p{Lu} \p{N}+\p{Han} \p{N}+\p{Xan} \p{N}+\p{Xsp} \p{N}+\p{Xps} \p{Xwd}+\p{N} \p{N}+\p{Xuc}/BWZx
+------------------------------------------------------------------
+ Bra
+ prop N +
+ prop Any
+ prop N +
+ prop L&
+ prop N ++
+ prop L
+ prop N +
+ notprop L
+ prop N ++
+ notprop N
+ prop N ++
+ prop Lu
+ prop N +
+ prop Han
+ prop N +
+ prop Xan
+ prop N ++
+ prop Xsp
+ prop N ++
+ prop Xps
+ prop Xwd +
+ prop N
+ prop N +
+ prop Xuc
+ Ket
+ End
+------------------------------------------------------------------
+
+/\p{Lu}+\p{Any} \p{Lu}+\p{L&} \p{Lu}+\p{L} \p{Lu}+\p{Lu} \P{Lu}+\p{Lu} \p{Lu}+\p{Nd} \p{Lu}+\P{Nd} \p{Lu}+\p{Han} \p{Lu}+\p{Xan} \p{Lu}+\p{Xsp} \p{Lu}+\p{Xps} \p{Xwd}+\p{Lu} \p{Lu}+\p{Xuc}/BWZx
+------------------------------------------------------------------
+ Bra
+ prop Lu +
+ prop Any
+ prop Lu +
+ prop L&
+ prop Lu +
+ prop L
+ prop Lu +
+ prop Lu
+ notprop Lu ++
+ prop Lu
+ prop Lu ++
+ prop Nd
+ prop Lu +
+ notprop Nd
+ prop Lu +
+ prop Han
+ prop Lu +
+ prop Xan
+ prop Lu ++
+ prop Xsp
+ prop Lu ++
+ prop Xps
+ prop Xwd +
+ prop Lu
+ prop Lu +
+ prop Xuc
+ Ket
+ End
+------------------------------------------------------------------
+
+/\p{Han}+\p{Lu} \p{Han}+\p{L&} \p{Han}+\p{L} \p{Han}+\p{Lu} \p{Han}+\p{Arabic} \p{Arabic}+\p{Arabic} \p{Han}+\p{Xan} \p{Han}+\p{Xsp} \p{Han}+\p{Xps} \p{Xwd}+\p{Han} \p{Han}+\p{Xuc}/BWZx
+------------------------------------------------------------------
+ Bra
+ prop Han +
+ prop Lu
+ prop Han +
+ prop L&
+ prop Han +
+ prop L
+ prop Han +
+ prop Lu
+ prop Han ++
+ prop Arabic
+ prop Arabic +
+ prop Arabic
+ prop Han +
+ prop Xan
+ prop Han +
+ prop Xsp
+ prop Han +
+ prop Xps
+ prop Xwd +
+ prop Han
+ prop Han +
+ prop Xuc
+ Ket
+ End
+------------------------------------------------------------------
+
+/\p{Xan}+\p{Any} \p{Xan}+\p{L&} \P{Xan}+\p{L&} \p{Xan}+\p{L} \p{Xan}+\p{Lu} \p{Xan}+\p{Han} \p{Xan}+\p{Xan} \p{Xan}+\P{Xan} \p{Xan}+\p{Xsp} \p{Xan}+\p{Xps} \p{Xwd}+\p{Xan} \p{Xan}+\p{Xuc}/BWZx
+------------------------------------------------------------------
+ Bra
+ prop Xan +
+ prop Any
+ prop Xan +
+ prop L&
+ notprop Xan ++
+ prop L&
+ prop Xan +
+ prop L
+ prop Xan +
+ prop Lu
+ prop Xan +
+ prop Han
+ prop Xan +
+ prop Xan
+ prop Xan ++
+ notprop Xan
+ prop Xan ++
+ prop Xsp
+ prop Xan ++
+ prop Xps
+ prop Xwd +
+ prop Xan
+ prop Xan +
+ prop Xuc
+ Ket
+ End
+------------------------------------------------------------------
+
+/\p{Xsp}+\p{Any} \p{Xsp}+\p{L&} \p{Xsp}+\p{L} \p{Xsp}+\p{Lu} \p{Xsp}+\p{Han} \p{Xsp}+\p{Xan} \p{Xsp}+\p{Xsp} \P{Xsp}+\p{Xsp} \p{Xsp}+\p{Xps} \p{Xwd}+\p{Xsp} \p{Xsp}+\p{Xuc}/BWZx
+------------------------------------------------------------------
+ Bra
+ prop Xsp +
+ prop Any
+ prop Xsp ++
+ prop L&
+ prop Xsp ++
+ prop L
+ prop Xsp ++
+ prop Lu
+ prop Xsp +
+ prop Han
+ prop Xsp ++
+ prop Xan
+ prop Xsp +
+ prop Xsp
+ notprop Xsp ++
+ prop Xsp
+ prop Xsp +
+ prop Xps
+ prop Xwd ++
+ prop Xsp
+ prop Xsp +
+ prop Xuc
+ Ket
+ End
+------------------------------------------------------------------
+
+/\p{Xwd}+\p{Any} \p{Xwd}+\p{L&} \p{Xwd}+\p{L} \p{Xwd}+\p{Lu} \p{Xwd}+\p{Han} \p{Xwd}+\p{Xan} \p{Xwd}+\p{Xsp} \p{Xwd}+\p{Xps} \p{Xwd}+\p{Xwd} \p{Xwd}+\P{Xwd} \p{Xwd}+\p{Xuc}/BWZx
+------------------------------------------------------------------
+ Bra
+ prop Xwd +
+ prop Any
+ prop Xwd +
+ prop L&
+ prop Xwd +
+ prop L
+ prop Xwd +
+ prop Lu
+ prop Xwd +
+ prop Han
+ prop Xwd +
+ prop Xan
+ prop Xwd ++
+ prop Xsp
+ prop Xwd ++
+ prop Xps
+ prop Xwd +
+ prop Xwd
+ prop Xwd ++
+ notprop Xwd
+ prop Xwd +
+ prop Xuc
+ Ket
+ End
+------------------------------------------------------------------
+
+/\p{Xuc}+\p{Any} \p{Xuc}+\p{L&} \p{Xuc}+\p{L} \p{Xuc}+\p{Lu} \p{Xuc}+\p{Han} \p{Xuc}+\p{Xan} \p{Xuc}+\p{Xsp} \p{Xuc}+\p{Xps} \p{Xwd}+\p{Xuc} \p{Xuc}+\p{Xuc} \p{Xuc}+\P{Xuc}/BWZx
+------------------------------------------------------------------
+ Bra
+ prop Xuc +
+ prop Any
+ prop Xuc +
+ prop L&
+ prop Xuc +
+ prop L
+ prop Xuc +
+ prop Lu
+ prop Xuc +
+ prop Han
+ prop Xuc +
+ prop Xan
+ prop Xuc +
+ prop Xsp
+ prop Xuc +
+ prop Xps
+ prop Xwd +
+ prop Xuc
+ prop Xuc +
+ prop Xuc
+ prop Xuc ++
+ notprop Xuc
+ Ket
+ End
+------------------------------------------------------------------
+
+/\p{N}+\p{Ll} \p{N}+\p{Nd} \p{N}+\P{Nd}/BWZx
+------------------------------------------------------------------
+ Bra
+ prop N ++
+ prop Ll
+ prop N +
+ prop Nd
+ prop N +
+ notprop Nd
+ Ket
+ End
+------------------------------------------------------------------
+
+/\p{Xan}+\p{L} \p{Xan}+\p{N} \p{Xan}+\p{C} \p{Xan}+\P{L} \P{Xan}+\p{N} \p{Xan}+\P{C}/BWZx
+------------------------------------------------------------------
+ Bra
+ prop Xan +
+ prop L
+ prop Xan +
+ prop N
+ prop Xan ++
+ prop C
+ prop Xan +
+ notprop L
+ notprop Xan ++
+ prop N
+ prop Xan +
+ notprop C
+ Ket
+ End
+------------------------------------------------------------------
+
+/\p{L}+\p{Xan} \p{N}+\p{Xan} \p{C}+\p{Xan} \P{L}+\p{Xan} \p{N}+\p{Xan} \P{C}+\p{Xan} \p{L}+\P{Xan}/BWZx
+------------------------------------------------------------------
+ Bra
+ prop L +
+ prop Xan
+ prop N +
+ prop Xan
+ prop C ++
+ prop Xan
+ notprop L +
+ prop Xan
+ prop N +
+ prop Xan
+ notprop C +
+ prop Xan
+ prop L ++
+ notprop Xan
+ Ket
+ End
+------------------------------------------------------------------
+
+/\p{Xan}+\p{Lu} \p{Xan}+\p{Nd} \p{Xan}+\p{Cc} \p{Xan}+\P{Ll} \P{Xan}+\p{No} \p{Xan}+\P{Cf}/BWZx
+------------------------------------------------------------------
+ Bra
+ prop Xan +
+ prop Lu
+ prop Xan +
+ prop Nd
+ prop Xan ++
+ prop Cc
+ prop Xan +
+ notprop Ll
+ notprop Xan ++
+ prop No
+ prop Xan +
+ notprop Cf
+ Ket
+ End
+------------------------------------------------------------------
+
+/\p{Lu}+\p{Xan} \p{Nd}+\p{Xan} \p{Cs}+\p{Xan} \P{Lt}+\p{Xan} \p{Nl}+\p{Xan} \P{Cc}+\p{Xan} \p{Lt}+\P{Xan}/BWZx
+------------------------------------------------------------------
+ Bra
+ prop Lu +
+ prop Xan
+ prop Nd +
+ prop Xan
+ prop Cs ++
+ prop Xan
+ notprop Lt +
+ prop Xan
+ prop Nl +
+ prop Xan
+ notprop Cc +
+ prop Xan
+ prop Lt ++
+ notprop Xan
+ Ket
+ End
+------------------------------------------------------------------
+
+/\w+\p{P} \w+\p{Po} \w+\s \p{Xan}+\s \s+\p{Xan} \s+\w/BWZx
+------------------------------------------------------------------
+ Bra
+ prop Xwd +
+ prop P
+ prop Xwd +
+ prop Po
+ prop Xwd ++
+ prop Xsp
+ prop Xan ++
+ prop Xsp
+ prop Xsp ++
+ prop Xan
+ prop Xsp ++
+ prop Xwd
+ Ket
+ End
+------------------------------------------------------------------
+
+/\w+\P{P} \W+\p{Po} \w+\S \P{Xan}+\s \s+\P{Xan} \s+\W/BWZx
+------------------------------------------------------------------
+ Bra
+ prop Xwd +
+ notprop P
+ notprop Xwd +
+ prop Po
+ prop Xwd +
+ notprop Xsp
+ notprop Xan +
+ prop Xsp
+ prop Xsp +
+ notprop Xan
+ prop Xsp +
+ notprop Xwd
+ Ket
+ End
+------------------------------------------------------------------
+
+/\w+\p{Po} \w+\p{Pc} \W+\p{Po} \W+\p{Pc} \w+\P{Po} \w+\P{Pc}/BWZx
+------------------------------------------------------------------
+ Bra
+ prop Xwd +
+ prop Po
+ prop Xwd ++
+ prop Pc
+ notprop Xwd +
+ prop Po
+ notprop Xwd +
+ prop Pc
+ prop Xwd +
+ notprop Po
+ prop Xwd +
+ notprop Pc
+ Ket
+ End
+------------------------------------------------------------------
+
+/\p{Nl}+\p{Xan} \P{Nl}+\p{Xan} \p{Nl}+\P{Xan} \P{Nl}+\P{Xan}/BWZx
+------------------------------------------------------------------
+ Bra
+ prop Nl +
+ prop Xan
+ notprop Nl +
+ prop Xan
+ prop Nl ++
+ notprop Xan
+ notprop Nl +
+ notprop Xan
+ Ket
+ End
+------------------------------------------------------------------
+
+/\p{Xan}+\p{Nl} \P{Xan}+\p{Nl} \p{Xan}+\P{Nl} \P{Xan}+\P{Nl}/BWZx
+------------------------------------------------------------------
+ Bra
+ prop Xan +
+ prop Nl
+ notprop Xan ++
+ prop Nl
+ prop Xan +
+ notprop Nl
+ notprop Xan +
+ notprop Nl
+ Ket
+ End
+------------------------------------------------------------------
+
+/\p{Xan}+\p{Nd} \P{Xan}+\p{Nd} \p{Xan}+\P{Nd} \P{Xan}+\P{Nd}/BWZx
+------------------------------------------------------------------
+ Bra
+ prop Xan +
+ prop Nd
+ notprop Xan ++
+ prop Nd
+ prop Xan +
+ notprop Nd
+ notprop Xan +
+ notprop Nd
+ Ket
+ End
+------------------------------------------------------------------
+
+/-- End auto-possessification tests --/
+
/-- End of testinput7 --/
Modified: code/trunk/testdata/testoutput8
===================================================================
--- code/trunk/testdata/testoutput8 2013-10-01 11:27:04 UTC (rev 1362)
+++ code/trunk/testdata/testoutput8 2013-10-01 16:54:40 UTC (rev 1363)
@@ -25,7 +25,7 @@
ab
No match
-/a*/
+/a*/O
a
0: a
1:
@@ -341,7 +341,7 @@
axyzq
No match
-/[^a]+/
+/[^a]+/O
bac
0: b
bcdefax
@@ -359,7 +359,7 @@
aaaaa
No match
-/[^a]*/
+/[^a]*/O
bac
0: b
1:
@@ -380,7 +380,7 @@
aaaaa
0:
-/[^a]{3,5}/
+/[^a]{3,5}/O
xyz
0: xyz
awxyza
@@ -408,29 +408,18 @@
/\d*/
1234b567
0: 1234
- 1: 123
- 2: 12
- 3: 1
- 4:
xyz
0:
/\D*/
a1234b567
0: a
- 1:
xyz
0: xyz
- 1: xy
- 2: x
- 3:
/\d+/
ab1234c56
0: 1234
- 1: 123
- 2: 12
- 3: 1
*** Failers
No match
xyz
@@ -439,19 +428,8 @@
/\D+/
ab123c56
0: ab
- 1: a
*** Failers
0: *** Failers
- 1: *** Failer
- 2: *** Faile
- 3: *** Fail
- 4: *** Fai
- 5: *** Fa
- 6: *** F
- 7: ***
- 8: ***
- 9: **
-10: *
789
No match
@@ -478,9 +456,6 @@
/a+/
aaaa
0: aaaa
- 1: aaa
- 2: aa
- 3: a
/^.*xyz/
xyz
@@ -886,9 +861,6 @@
0:
aaabcd
0: aaa
- 1: aa
- 2: a
- 3:
xyz
0: xyz
1:
@@ -1744,38 +1716,16 @@
/foo(?!bar)(.*)/
foobar is foolish see?
0: foolish see?
- 1: foolish see
- 2: foolish se
- 3: foolish s
- 4: foolish
- 5: foolish
- 6: foolis
- 7: fooli
- 8: fool
- 9: foo
/(?:(?!foo)...|^.{0,2})bar(.*)/
foobar crowbar etc
0: rowbar etc
- 1: rowbar et
- 2: rowbar e
- 3: rowbar
- 4: rowbar
barrel
0: barrel
- 1: barre
- 2: barr
- 3: bar
2barrel
0: 2barrel
- 1: 2barre
- 2: 2barr
- 3: 2bar
A barrel
0: A barrel
- 1: A barre
- 2: A barr
- 3: A bar
/^(\D*)(?=\d)(?!123)/
abc456
@@ -1820,7 +1770,7 @@
the abc
No match
-/^[ab]{1,3}(ab*|b)/
+/^[ab]{1,3}(ab*|b)/O
aabbbbb
0: aabbbbb
1: aabbbb
@@ -1829,7 +1779,7 @@
4: aab
5: aa
-/^[ab]{1,3}?(ab*|b)/
+/^[ab]{1,3}?(ab*|b)/O
aabbbbb
0: aabbbbb
1: aabbbb
@@ -1838,7 +1788,7 @@
4: aab
5: aa
-/^[ab]{1,3}?(ab*?|b)/
+/^[ab]{1,3}?(ab*?|b)/O
aabbbbb
0: aabbbbb
1: aabbbb
@@ -1847,7 +1797,7 @@
4: aab
5: aa
-/^[ab]{1,3}(ab*?|b)/
+/^[ab]{1,3}(ab*?|b)/O
aabbbbb
0: aabbbbb
1: aabbbb
@@ -2705,10 +2655,6 @@
/\0*/
\0\0\0\0
0: \x00\x00\x00\x00
- 1: \x00\x00\x00
- 2: \x00\x00
- 3: \x00
- 4:
/A\x0{2,3}Z/
The A\x0\x0Z
@@ -2760,56 +2706,14 @@
/([^.]*)\.([^:]*):[T ]+(.*)/
track1.title:TBlah blah blah
0: track1.title:TBlah blah blah
- 1: track1.title:TBlah blah bla
- 2: track1.title:TBlah blah bl
- 3: track1.title:TBlah blah b
- 4: track1.title:TBlah blah
- 5: track1.title:TBlah blah
- 6: track1.title:TBlah bla
- 7: track1.title:TBlah bl
- 8: track1.title:TBlah b
- 9: track1.title:TBlah
-10: track1.title:TBlah
-11: track1.title:TBla
-12: track1.title:TBl
-13: track1.title:TB
-14: track1.title:T
/([^.]*)\.([^:]*):[T ]+(.*)/i
track1.title:TBlah blah blah
0: track1.title:TBlah blah blah
- 1: track1.title:TBlah blah bla
- 2: track1.title:TBlah blah bl
- 3: track1.title:TBlah blah b
- 4: track1.title:TBlah blah
- 5: track1.title:TBlah blah
- 6: track1.title:TBlah bla
- 7: track1.title:TBlah bl
- 8: track1.title:TBlah b
- 9: track1.title:TBlah
-10: track1.title:TBlah
-11: track1.title:TBla
-12: track1.title:TBl
-13: track1.title:TB
-14: track1.title:T
/([^.]*)\.([^:]*):[t ]+(.*)/i
track1.title:TBlah blah blah
0: track1.title:TBlah blah blah
- 1: track1.title:TBlah blah bla
- 2: track1.title:TBlah blah bl
- 3: track1.title:TBlah blah b
- 4: track1.title:TBlah blah
- 5: track1.title:TBlah blah
- 6: track1.title:TBlah bla
- 7: track1.title:TBlah bl
- 8: track1.title:TBlah b
- 9: track1.title:TBlah
-10: track1.title:TBlah
-11: track1.title:TBla
-12: track1.title:TBl
-13: track1.title:TB
-14: track1.title:T
/^[W-c]+$/
WXY_^abc
@@ -2882,7 +2786,6 @@
0: b
c::b
0: ::
- 1: :
/[-az]+/
az-
@@ -3077,16 +2980,13 @@
1: baNOTccc
2: baNOTcc
3: baNOTc
- 4: baNOT
baNOTcccd
0: baNOTccc
1: baNOTcc
2: baNOTc
- 3: baNOT
baNOTccd
0: baNOTcc
1: baNOTc
- 2: baNOT
bacccd
0: baccc
*** Failers
@@ -3096,7 +2996,6 @@
3: *** Fail
4: *** Fai
5: *** Fa
- 6: *** F
anything
No match
b\bc
@@ -3115,23 +3014,14 @@
/[^a]+/
AAAaAbc
0: AAA
- 1: AA
- 2: A
/[^a]+/i
AAAaAbc
0: bc
- 1: b
/[^a]+/
bbb\nccc
0: bbb\x0accc
- 1: bbb\x0acc
- 2: bbb\x0ac
- 3: bbb\x0a
- 4: bbb
- 5: bb
- 6: b
/[^k]$/
abc
@@ -3208,20 +3098,8 @@
/(\.\d\d[1-9]?)\d+/
1.230003938
0: .230003938
- 1: .23000393
- 2: .2300039
- 3: .230003
- 4: .23000
- 5: .2300
- 6: .230
1.875000282
0: .875000282
- 1: .87500028
- 2: .8750002
- 3: .875000
- 4: .87500
- 5: .8750
- 6: .875
1.235
0: .235
@@ -3243,10 +3121,6 @@
/\b(foo)\s+(\w+)/i
Food is on the foo table
0: foo table
- 1: foo tabl
- 2: foo tab
- 3: foo ta
- 4: foo t
/foo(.*)bar/
The food is under the bar in the barn.
@@ -3258,7 +3132,7 @@
0: food is under the bar in the bar
1: food is under the bar
-/(.*)(\d*)/
+/(.*)(\d*)/O
I have 2 numbers: 53147
Matched, but offsets vector is too small to show all matches
0: I have 2 numbers: 53147
@@ -3287,13 +3161,9 @@
/(.*)(\d+)/
I have 2 numbers: 53147
0: I have 2 numbers: 53147
- 1: I have 2 numbers: 5314
- 2: I have 2 numbers: 531
- 3: I have 2 numbers: 53
- 4: I have 2 numbers: 5
- 5: I have 2
+ 1: I have 2
-/(.*?)(\d*)/
+/(.*?)(\d*)/O
I have 2 numbers: 53147
Matched, but offsets vector is too small to show all matches
0: I have 2 numbers: 53147
@@ -3322,11 +3192,7 @@
/(.*?)(\d+)/
I have 2 numbers: 53147
0: I have 2 numbers: 53147
- 1: I have 2 numbers: 5314
- 2: I have 2 numbers: 531
- 3: I have 2 numbers: 53
- 4: I have 2 numbers: 5
- 5: I have 2
+ 1: I have 2
/(.*)(\d+)$/
I have 2 numbers: 53147
@@ -3738,13 +3604,8 @@
0: a
ab
0: ab
- 1: a
abbbb
0: abbbb
- 1: abbb
- 2: abb
- 3: ab
- 4: a
*** Failers
0: a
bbbbb
@@ -3930,19 +3791,8 @@
/(?>(\.\d\d[1-9]?))\d+/
1.230003938
0: .230003938
- 1: .23000393
- 2: .2300039
- 3: .230003
- 4: .23000
- 5: .2300
- 6: .230
1.875000282
0: .875000282
- 1: .87500028
- 2: .8750002
- 3: .875000
- 4: .87500
- 5: .8750
*** Failers
No match
1.235
@@ -4561,7 +4411,6 @@
/.{3,4}/
abbbbc
0: abbb
- 1: abb
/ab{0,}bc/
abbbbc
@@ -4966,7 +4815,6 @@
/ab*/
xabyabbbz
0: ab
- 1: a
xayabbbz
0: a
@@ -4995,8 +4843,7 @@
/a([bc]*)c*/
abc
0: abc
- 1: ab
- 2: a
+ 1: a
/a([bc]*)(c*d)/
abcd
@@ -5079,8 +4926,6 @@
/(.*)c(.*)/
abcde
0: abcde
- 1: abcd
- 2: abc
/\((.*), (.*)\)/
(a, b)
@@ -5427,7 +5272,6 @@
/ab*/i
XABYABBBZ
0: AB
- 1: A
XAYABBBZ
0: A
@@ -5458,8 +5302,7 @@
/a([bc]*)c*/i
ABC
0: ABC
- 1: AB
- 2: A
+ 1: A
/a([bc]*)(c*d)/i
ABCD
@@ -5546,8 +5389,6 @@
/(.*)c(.*)/i
ABCDE
0: ABCDE
- 1: ABCD
- 2: ABC
/\((.*), (.*)\)/i
(A, B)
@@ -6196,11 +6037,9 @@
/a*/g
abbab
0: a
- 1:
0:
0:
0: a
- 1:
0:
0:
@@ -6253,10 +6092,6 @@
/\s+/
> \x09\x0a\x0c\x0d\x0b<
0: \x09\x0a\x0c\x0d
- 1: \x09\x0a\x0c
- 2: \x09\x0a
- 3: \x09
- 4:
/a?b/x
ab
@@ -6661,66 +6496,22 @@
/.*/<lf>
abc\ndef
0: abc
- 1: ab
- 2: a
- 3:
abc\rdef
0: abc\x0ddef
- 1: abc\x0dde
- 2: abc\x0dd
- 3: abc\x0d
- 4: abc
- 5: ab
- 6: a
- 7:
abc\r\ndef
0: abc\x0d
- 1: abc
- 2: ab
- 3: a
- 4:
\<cr>abc\ndef
0: abc\x0adef
- 1: abc\x0ade
- 2: abc\x0ad
- 3: abc\x0a
- 4: abc
- 5: ab
- 6: a
- 7:
\<cr>abc\rdef
0: abc
- 1: ab
- 2: a
- 3:
\<cr>abc\r\ndef
0: abc
- 1: ab
- 2: a
- 3:
\<crlf>abc\ndef
0: abc\x0adef
- 1: abc\x0ade
- 2: abc\x0ad
- 3: abc\x0a
- 4: abc
- 5: ab
- 6: a
- 7:
\<crlf>abc\rdef
0: abc\x0ddef
- 1: abc\x0dde
- 2: abc\x0dd
- 3: abc\x0d
- 4: abc
- 5: ab
- 6: a
- 7:
\<crlf>abc\r\ndef
0: abc
- 1: ab
- 2: a
- 3:
/\w+(.)(.)?def/s
abc\ndef
@@ -7033,10 +6824,8 @@
/\H*\h+\V?\v{3,4}/
\x09\x20\xa0X\x0a\x0b\x0c\x0d\x0a
0: \x09 \xa0X\x0a\x0b\x0c\x0d
- 1: \x09 \xa0X\x0a\x0b\x0c
\x09\x20\xa0\x0a\x0b\x0c\x0d\x0a
0: \x09 \xa0\x0a\x0b\x0c\x0d
- 1: \x09 \xa0\x0a\x0b\x0c
\x09\x20\xa0\x0a\x0b\x0c
0: \x09 \xa0\x0a\x0b\x0c
** Failers
@@ -7047,7 +6836,6 @@
/\H{3,4}/
XY ABCDE
0: ABCD
- 1: ABC
XY PQR ST
0: PQR
@@ -7531,15 +7319,11 @@
xxxxabcd\P
0: abcd
0+
- 1: abc
xxxxabcd\P\P
Partial match: abcd
dddxxx\R
0: ddd
0+ xxx
- 1: dd
- 2: d
- 3:
xxxxabcd\P\P
Partial match: abcd
xxx\R
@@ -7549,19 +7333,16 @@
/abcd*/i
xxxxabcd\P
0: abcd
- 1: abc
xxxxabcd\P\P
Partial match: abcd
XXXXABCD\P
0: ABCD
- 1: ABC
XXXXABCD\P\P
Partial match: ABCD
/abc\d*/
xxxxabc1\P
0: abc1
- 1: abc
xxxxabc1\P\P
Partial match: abc1
@@ -7684,11 +7465,8 @@
/.+/
abc\>0
0: abc
- 1: ab
- 2: a
abc\>1
0: bc
- 1: b
abc\>2
0: c
abc\>3
@@ -7811,10 +7589,6 @@
/^(?!a){0}\w+/
aaaaa
0: aaaaa
- 1: aaaa
- 2: aaa
- 3: aa
- 4: a
/(?<=(abc))?xyz/
abcxyz
@@ -7846,7 +7620,7 @@
aaaabcde
Error -26 (nested recursion at the same subject position)
-/(a+)/
+/(a+)/O
\O6aaaa
Matched, but offsets vector is too small to show all matches
0: aaaa
@@ -7971,7 +7745,6 @@
Partial match: \x0d\x0d
\r\r\r\P
0: \x0d\x0d\x0d
- 1: \x0d\x0d
\r\r\r\P\P
Partial match: \x0d\x0d\x0d
Modified: code/trunk/testdata/testoutput9
===================================================================
--- code/trunk/testdata/testoutput9 2013-10-01 11:27:04 UTC (rev 1362)
+++ code/trunk/testdata/testoutput9 2013-10-01 16:54:40 UTC (rev 1363)
@@ -313,13 +313,9 @@
/[^a]+/8g
bcd
0: bcd
- 1: bc
- 2: b
\x{100}aY\x{256}Z
0: \x{100}
0: Y\x{256}Z
- 1: Y\x{256}
- 2: Y
/^[^a]{2}/8
\x{100}bc
@@ -328,8 +324,6 @@
/^[^a]{2,}/8
\x{100}bcAa
0: \x{100}bcA
- 1: \x{100}bc
- 2: \x{100}b
/^[^a]{2,}?/8
\x{100}bca
@@ -339,13 +333,9 @@
/[^a]+/8ig
bcd
0: bcd
- 1: bc
- 2: b
\x{100}aY\x{256}Z
0: \x{100}
0: Y\x{256}Z
- 1: Y\x{256}
- 2: Y
/^[^a]{2}/8i
\x{100}bc
@@ -354,7 +344,6 @@
/^[^a]{2,}/8i
\x{100}bcAa
0: \x{100}bc
- 1: \x{100}b
/^[^a]{2,}?/8i
\x{100}bca
@@ -370,28 +359,18 @@
0:
\x{100}\x{100}
0: \x{100}
- 1:
/\x{100}{0,3}/8
\x{100}\x{100}
0: \x{100}\x{100}
- 1: \x{100}
- 2:
\x{100}\x{100}\x{100}\x{100}
0: \x{100}\x{100}\x{100}
- 1: \x{100}\x{100}
- 2: \x{100}
- 3:
/\x{100}*/8
abce
0:
\x{100}\x{100}\x{100}\x{100}
0: \x{100}\x{100}\x{100}\x{100}
- 1: \x{100}\x{100}\x{100}
- 2: \x{100}\x{100}
- 3: \x{100}
- 4:
/\x{100}{1,1}/8
abcd\x{100}\x{100}\x{100}\x{100}
@@ -400,15 +379,10 @@
/\x{100}{1,3}/8
abcd\x{100}\x{100}\x{100}\x{100}
0: \x{100}\x{100}\x{100}
- 1: \x{100}\x{100}
- 2: \x{100}
/\x{100}+/8
abcd\x{100}\x{100}\x{100}\x{100}
0: \x{100}\x{100}\x{100}\x{100}
- 1: \x{100}\x{100}\x{100}
- 2: \x{100}\x{100}
- 3: \x{100}
/\x{100}{3}/8
abcd\x{100}\x{100}\x{100}XX
@@ -417,10 +391,8 @@
/\x{100}{3,5}/8
abcd\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}XX
0: \x{100}\x{100}\x{100}\x{100}\x{100}
- 1: \x{100}\x{100}\x{100}\x{100}
- 2: \x{100}\x{100}\x{100}
-/\x{100}{3,}/8
+/\x{100}{3,}/8O
abcd\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}XX
0: \x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}
1: \x{100}\x{100}\x{100}\x{100}\x{100}\x{100}
@@ -432,7 +404,7 @@
Xyyya\x{100}\x{100}bXzzz
0: X
-/\D*/8
+/\D*/8O
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
Matched, but offsets vector is too small to show all matches
0: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
@@ -458,7 +430,7 @@
20: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
21: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
-/\D*/8
+/\D*/8O
\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}
Matched, but offsets vector is too small to show all matches
0: \x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}
@@ -507,33 +479,18 @@
/\D+/8
12abcd34
0: abcd
- 1: abc
- 2: ab
- 3: a
*** Failers
0: *** Failers
- 1: *** Failer
- 2: *** Faile
- 3: *** Fail
- 4: *** Fai
- 5: *** Fa
- 6: *** F
- 7: ***
- 8: ***
- 9: **
-10: *
1234
No match
/\D{2,3}/8
12abcd34
0: abc
- 1: ab
12ab34
0: ab
*** Failers
0: ***
- 1: **
1234
No match
12a34
@@ -556,7 +513,6 @@
/\d+/8
12abcd34
0: 12
- 1: 1
*** Failers
No match
@@ -565,7 +521,6 @@
0: 12
1234abcd
0: 123
- 1: 12
*** Failers
No match
1.4
@@ -585,30 +540,18 @@
/\S+/8
12abcd34
0: 12abcd34
- 1: 12abcd3
- 2: 12abcd
- 3: 12abc
- 4: 12ab
- 5: 12a
- 6: 12
- 7: 1
*** Failers
0: ***
- 1: **
- 2: *
\ \
No match
/\S{2,3}/8
12abcd34
0: 12a
- 1: 12
1234abcd
0: 123
- 1: 12
*** Failers
0: ***
- 1: **
\ \
No match
@@ -654,15 +597,8 @@
/\w+/8
12 34
0: 12
- 1: 1
*** Failers
0: Failers
- 1: Failer
- 2: Faile
- 3: Fail
- 4: Fai
- 5: Fa
- 6: F
+++=*!
No match
@@ -671,10 +607,8 @@
0: ab
abcd ce
0: abc
- 1: ab
*** Failers
0: Fai
- 1: Fa
a.b.c
No match
@@ -693,26 +627,18 @@
/\W+/8
12====34
0: ====
- 1: ===
- 2: ==
- 3: =
*** Failers
0: ***
- 1: ***
- 2: **
- 3: *
abcd
No match
/\W{2,3}/8
ab====cd
0: ===
- 1: ==
ab==cd
0: ==
*** Failers
0: ***
- 1: **
a.b.c
No match
@@ -1126,21 +1052,21 @@
a\r
No match
-/\h+\V?\v{3,4}/8
+/\h+\V?\v{3,4}/8O
\x09\x20\x{a0}X\x0a\x0b\x0c\x0d\x0a
0: \x{09} \x{a0}X\x{0a}\x{0b}\x{0c}\x{0d}
1: \x{09} \x{a0}X\x{0a}\x{0b}\x{0c}
-/\V?\v{3,4}/8
+/\V?\v{3,4}/8O
\x20\x{a0}X\x0a\x0b\x0c\x0d\x0a
0: X\x{0a}\x{0b}\x{0c}\x{0d}
1: X\x{0a}\x{0b}\x{0c}
-/\h+\V?\v{3,4}/8
+/\h+\V?\v{3,4}/8O
>\x09\x20\x{a0}X\x0a\x0a\x0a<
0: \x{09} \x{a0}X\x{0a}\x{0a}\x{0a}
-/\V?\v{3,4}/8
+/\V?\v{3,4}/8O
>\x09\x20\x{a0}X\x0a\x0a\x0a<
0: X\x{0a}\x{0a}\x{0a}
@@ -1154,7 +1080,7 @@
\x{a0} X\x0a
No match
-/\H*\h+\V?\v{3,4}/8
+/\H*\h+\V?\v{3,4}/8O
\x09\x20\x{a0}X\x0a\x0b\x0c\x0d\x0a
0: \x{09} \x{a0}X\x{0a}\x{0b}\x{0c}\x{0d}
1: \x{09} \x{a0}X\x{0a}\x{0b}\x{0c}
@@ -1178,7 +1104,7 @@
\x{2009} X\x0a
No match
-/\H*\h+\V?\v{3,4}/8
+/\H*\h+\V?\v{3,4}/8O
\x{1680}\x{180e}\x{2007}X\x{2028}\x{2029}\x0c\x0d\x0a
0: \x{1680}\x{180e}\x{2007}X\x{2028}\x{2029}\x{0c}\x{0d}
1: \x{1680}\x{180e}\x{2007}X\x{2028}\x{2029}\x{0c}
@@ -1279,26 +1205,22 @@
/abcd*/8
xxxxabcd\P
0: abcd
- 1: abc
xxxxabcd\P\P
Partial match: abcd
/abcd*/i8
xxxxabcd\P
0: abcd
- 1: abc
xxxxabcd\P\P
Partial match: abcd
XXXXABCD\P
0: ABCD
- 1: ABC
XXXXABCD\P\P
Partial match: ABCD
/abc\d*/8
xxxxabc1\P
0: abc1
- 1: abc
xxxxabc1\P\P
Partial match: abc1
@@ -1340,7 +1262,6 @@
Partial match: \x{0d}\x{0d}
\r\r\r\P
0: \x{0d}\x{0d}\x{0d}
- 1: \x{0d}\x{0d}
\r\r\r\P\P
Partial match: \x{0d}\x{0d}\x{0d}
@@ -1366,6 +1287,5 @@
/[^\x{100}]+/8
\x{100}\x{101}X
0: \x{101}X
- 1: \x{101}
/-- End of testinput9 --/
Modified: code/trunk/ucp.h
===================================================================
--- code/trunk/ucp.h 2013-10-01 11:27:04 UTC (rev 1362)
+++ code/trunk/ucp.h 2013-10-01 16:54:40 UTC (rev 1363)
@@ -11,8 +11,11 @@
IMPORTANT: Note also that the specific numeric values of the enums have to be
the same as the values that are generated by the maint/MultiStage2.py script,
-where the equivalent property descriptive names are listed in vectors. */
+where the equivalent property descriptive names are listed in vectors.
+ALSO: The specific values of the first two enums are assumed for the table
+called catposstab in pcre_compile.c. */
+
/* These are the general character categories. */
enum {