Revision: 1433
http://vcs.pcre.org/viewvc?view=rev&revision=1433
Author: ph10
Date: 2014-01-03 15:15:00 +0000 (Fri, 03 Jan 2014)
Log Message:
-----------
Reword pcretest messages and clarify "first char" meaning.
Modified Paths:
--------------
code/trunk/ChangeLog
code/trunk/doc/pcreapi.3
code/trunk/doc/pcretest.1
code/trunk/pcretest.c
code/trunk/testdata/testoutput12
code/trunk/testdata/testoutput13
code/trunk/testdata/testoutput14
code/trunk/testdata/testoutput15
code/trunk/testdata/testoutput16
code/trunk/testdata/testoutput17
code/trunk/testdata/testoutput18-16
code/trunk/testdata/testoutput18-32
code/trunk/testdata/testoutput19
code/trunk/testdata/testoutput2
code/trunk/testdata/testoutput21-16
code/trunk/testdata/testoutput21-32
code/trunk/testdata/testoutput22-16
code/trunk/testdata/testoutput22-32
code/trunk/testdata/testoutput23
code/trunk/testdata/testoutput25
code/trunk/testdata/testoutput3
code/trunk/testdata/testoutput5
code/trunk/testdata/testoutput8
Modified: code/trunk/ChangeLog
===================================================================
--- code/trunk/ChangeLog 2014-01-02 17:50:25 UTC (rev 1432)
+++ code/trunk/ChangeLog 2014-01-03 15:15:00 UTC (rev 1433)
@@ -40,6 +40,12 @@
8. Add missing (new) files sljitNativeTILEGX.c and sljitNativeTILEGX-encoder.c
to the export list in Makefile.am (they were accidentally omitted from the
8.34 tarball).
+
+9. The informational output from pcretest used the phrase "starting byte set"
+ which is inappropriate for the 16-bit and 32-bit libraries. As the output
+ for "first char" and "need char" really means "non-UTF-char", I've changed
+ "byte" to "char", and slightly reworded the output. The documentation about
+ these values has also been (I hope) clarified.
Version 8.34 15-December-2013
Modified: code/trunk/doc/pcreapi.3
===================================================================
--- code/trunk/doc/pcreapi.3 2014-01-02 17:50:25 UTC (rev 1432)
+++ code/trunk/doc/pcreapi.3 2014-01-03 15:15:00 UTC (rev 1433)
@@ -1,4 +1,4 @@
-.TH PCREAPI 3 "12 November 2013" "PCRE 8.34"
+.TH PCREAPI 3 "03 January 2014" "PCRE 8.35"
.SH NAME
PCRE - Perl-compatible regular expressions
.sp
@@ -1248,12 +1248,15 @@
function. External callers can cause PCRE to use its internal tables by passing
a NULL table pointer.
.sp
- PCRE_INFO_FIRSTBYTE
+ PCRE_INFO_FIRSTBYTE (deprecated)
.sp
Return information about the first data unit of any matched string, for a
-non-anchored pattern. (The name of this option refers to the 8-bit library,
-where data units are bytes.) The fourth argument should point to an \fBint\fP
-variable.
+non-anchored pattern. The name of this option refers to the 8-bit library,
+where data units are bytes. The fourth argument should point to an \fBint\fP
+variable. Negative values are used for special cases. However, this means that
+when the 32-bit library is in non-UTF-32 mode, the full 32-bit range of
+characters cannot be returned. For this reason, this value is deprecated; use
+PCRE_INFO_FIRSTCHARACTERFLAGS and PCRE_INFO_FIRSTCHARACTER instead.
.P
If there is a fixed first value, for example, the letter "c" from a pattern
such as (cat|cow|coyote), its value is returned. In the 8-bit library, the
@@ -1271,12 +1274,39 @@
-1 is returned, indicating that the pattern matches only at the start of a
subject string or after any newline within the string. Otherwise -2 is
returned. For anchored patterns, -2 is returned.
+.sp
+ PCRE_INFO_FIRSTCHARACTER
+.sp
+Return the value of the first data unit (non-UTF character) of any matched
+string in the situation where PCRE_INFO_FIRSTCHARACTERFLAGS returns 1;
+otherwise return 0. The fourth argument should point to an \fBuint_t\fP
+variable.
.P
-Since for the 32-bit library using the non-UTF-32 mode, this function is unable
-to return the full 32-bit range of the character, this value is deprecated;
-instead the PCRE_INFO_FIRSTCHARACTERFLAGS and PCRE_INFO_FIRSTCHARACTER values
-should be used.
+In the 8-bit library, the value is always less than 256. In the 16-bit library
+the value can be up to 0xffff. In the 32-bit library in UTF-32 mode the value
+can be up to 0x10ffff, and up to 0xffffffff when not using UTF-32 mode.
.sp
+ PCRE_INFO_FIRSTCHARACTERFLAGS
+.sp
+Return information about the first data unit of any matched string, for a
+non-anchored pattern. The fourth argument should point to an \fBint\fP
+variable.
+.P
+If there is a fixed first value, for example, the letter "c" from a pattern
+such as (cat|cow|coyote), 1 is returned, and the character value can be
+retrieved using PCRE_INFO_FIRSTCHARACTER. If there is no fixed first value, and
+if either
+.sp
+(a) the pattern was compiled with the PCRE_MULTILINE option, and every branch
+starts with "^", or
+.sp
+(b) every branch of the pattern starts with ".*" and PCRE_DOTALL is not set
+(if it were set, the pattern would be anchored),
+.sp
+2 is returned, indicating that the pattern matches only at the start of a
+subject string or after any newline within the string. Otherwise 0 is
+returned. For anchored patterns, 0 is returned.
+.sp
PCRE_INFO_FIRSTTABLE
.sp
If the pattern was studied, and this resulted in the construction of a 256-bit
@@ -1499,38 +1529,6 @@
.\"
documentation for details).
.sp
- PCRE_INFO_FIRSTCHARACTERFLAGS
-.sp
-Return information about the first data unit of any matched string, for a
-non-anchored pattern. The fourth argument should point to an \fBint\fP
-variable.
-.P
-If there is a fixed first value, for example, the letter "c" from a pattern
-such as (cat|cow|coyote), 1 is returned, and the character value can be
-retrieved using PCRE_INFO_FIRSTCHARACTER.
-.P
-If there is no fixed first value, and if either
-.sp
-(a) the pattern was compiled with the PCRE_MULTILINE option, and every branch
-starts with "^", or
-.sp
-(b) every branch of the pattern starts with ".*" and PCRE_DOTALL is not set
-(if it were set, the pattern would be anchored),
-.sp
-2 is returned, indicating that the pattern matches only at the start of a
-subject string or after any newline within the string. Otherwise 0 is
-returned. For anchored patterns, 0 is returned.
-.sp
- PCRE_INFO_FIRSTCHARACTER
-.sp
-Return the fixed first character value in the situation where
-PCRE_INFO_FIRSTCHARACTERFLAGS returns 1; otherwise return 0. The fourth
-argument should point to an \fBuint_t\fP variable.
-.P
-In the 8-bit library, the value is always less than 256. In the 16-bit library
-the value can be up to 0xffff. In the 32-bit library in UTF-32 mode the value
-can be up to 0x10ffff, and up to 0xffffffff when not using UTF-32 mode.
-.sp
PCRE_INFO_REQUIREDCHARFLAGS
.sp
Returns 1 if there is a rightmost literal data unit that must exist in any
@@ -2900,6 +2898,6 @@
.rs
.sp
.nf
-Last updated: 12 November 2013
-Copyright (c) 1997-2013 University of Cambridge.
+Last updated: 03 January 2014
+Copyright (c) 1997-2014 University of Cambridge.
.fi
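The pcreapi.3 wording above implies a direct correspondence between a deprecated PCRE_INFO_FIRSTBYTE result and the PCRE_INFO_FIRSTCHARACTERFLAGS / PCRE_INFO_FIRSTCHARACTER pair: a non-negative value becomes flags 1 plus that value, -1 becomes flags 2, and -2 becomes flags 0. A minimal sketch of that mapping in C follows. The names first_unit_info and from_firstbyte are hypothetical, chosen for illustration; they are not part of PCRE, and the sketch does not call the library.

```c
#include <stdint.h>

/* Hypothetical container for the two newer values; not a PCRE type. */
typedef struct {
  int flags;       /* 0 = nothing known, 1 = fixed first data unit,
                      2 = matches only at start of subject or after a newline */
  uint32_t value;  /* meaningful only when flags == 1, otherwise 0 */
} first_unit_info;

/* Translate a legacy PCRE_INFO_FIRSTBYTE result into the flags/value form
   described for PCRE_INFO_FIRSTCHARACTERFLAGS and PCRE_INFO_FIRSTCHARACTER. */
first_unit_info from_firstbyte(int firstbyte)
{
  first_unit_info info = { 0, 0 };
  if (firstbyte >= 0) {
    info.flags = 1;
    info.value = (uint32_t)firstbyte;
  } else if (firstbyte == -1) {
    info.flags = 2;
  }
  /* -2 leaves flags 0 and value 0 */
  return info;
}
```

The reverse direction is lossy in the 32-bit library outside UTF-32 mode, because an int cannot carry both the negative special cases and the full 32-bit range. That is the reason the text gives for deprecating PCRE_INFO_FIRSTBYTE.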
Modified: code/trunk/doc/pcretest.1
===================================================================
--- code/trunk/doc/pcretest.1 2014-01-02 17:50:25 UTC (rev 1432)
+++ code/trunk/doc/pcretest.1 2014-01-03 15:15:00 UTC (rev 1433)
@@ -1,4 +1,4 @@
-.TH PCRETEST 1 "12 November 2013" "PCRE 8.34"
+.TH PCRETEST 1 "03 January 2014" "PCRE 8.35"
.SH NAME
pcretest - a program for testing Perl-compatible regular expressions.
.SH SYNOPSIS
@@ -483,7 +483,10 @@
The \fB/I\fP modifier requests that \fBpcretest\fP output information about the
compiled pattern (whether it is anchored, has a fixed first character, and
so on). It does this by calling \fBpcre[16|32]_fullinfo()\fP after compiling a
-pattern. If the pattern is studied, the results of that are also output.
+pattern. If the pattern is studied, the results of that are also output. In
+this output, the word "char" means a non-UTF character, that is, the value of a
+single data item (8-bit, 16-bit, or 32-bit, depending on the library that is
+being tested).
.P
The \fB/K\fP modifier requests \fBpcretest\fP to show names from backtracking
control verbs that are returned from calls to \fBpcre[16|32]_exec()\fP. It causes
@@ -1135,6 +1138,6 @@
.rs
.sp
.nf
-Last updated: 12 November 2013
-Copyright (c) 1997-2013 University of Cambridge.
+Last updated: 03 January 2014
+Copyright (c) 1997-2014 University of Cambridge.
.fi
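Since "char" in the /I output now means one data unit of the library under test, the largest value that can be reported depends on the unit width. The limits stated in pcreapi.3 (less than 256, up to 0xffff, up to 0x10ffff in UTF-32 mode, up to 0xffffffff otherwise) can be sketched as a small C function. The name max_first_char and its parameters are hypothetical and exist only for this illustration.

```c
#include <stdint.h>

/* Largest value a "first char" can take, per the limits documented for
   PCRE_INFO_FIRSTCHARACTER. width is 8, 16 or 32; utf is nonzero when a
   UTF mode is in use. Returns 0 for an unrecognised width (a convention
   of this sketch, not of PCRE). */
uint32_t max_first_char(int width, int utf)
{
  switch (width) {
    case 8:  return 0xffu;
    case 16: return 0xffffu;
    case 32: return utf ? 0x10ffffu : 0xffffffffu;
    default: return 0;
  }
}
```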
Modified: code/trunk/pcretest.c
===================================================================
--- code/trunk/pcretest.c 2014-01-02 17:50:25 UTC (rev 1432)
+++ code/trunk/pcretest.c 2014-01-03 15:15:00 UTC (rev 1433)
@@ -4282,12 +4282,12 @@
if (new_info(re, extra, PCRE_INFO_FIRSTTABLE, &start_bits) == 0)
{
if (start_bits == NULL)
- fprintf(outfile, "No set of starting bytes\n");
+ fprintf(outfile, "No starting char list\n");
else
{
int i;
int c = 24;
- fprintf(outfile, "Starting byte set: ");
+ fprintf(outfile, "Starting chars: ");
for (i = 0; i < 256; i++)
{
if ((start_bits[i/8] & (1<<(i&7))) != 0)
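The loop in this hunk walks the 256-bit table returned for PCRE_INFO_FIRSTTABLE, testing bit (i & 7) of byte i/8 for each value i from 0 to 255. A self-contained sketch of the same bit test is given below. It assumes the table is addressed as 32 unsigned char bytes; the helpers count_starting_chars and set_starting_char are hypothetical and are not taken from pcretest.c.

```c
#include <stddef.h>

/* Count the values 0..255 whose bit is set in a 256-bit (32-byte) table,
   using the same test as the pcretest loop. Returns -1 when there is no
   table (a convention of this sketch only). */
int count_starting_chars(const unsigned char *start_bits)
{
  int i, count = 0;
  if (start_bits == NULL) return -1;
  for (i = 0; i < 256; i++)
    if ((start_bits[i/8] & (1 << (i & 7))) != 0) count++;
  return count;
}

/* Mark value c (0..255) as a possible starting value; hypothetical helper
   used only to build a table for the example. */
void set_starting_char(unsigned char *start_bits, int c)
{
  start_bits[c/8] |= (unsigned char)(1 << (c & 7));
}
```

Building a table with 'c', 'd' and 'e' set and passing it to count_starting_chars gives 3, which corresponds to the "Starting chars: c d e" line in testoutput2. Because the table has only 256 bits, it can describe only values below 256 in any library.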
Modified: code/trunk/testdata/testoutput12
===================================================================
--- code/trunk/testdata/testoutput12 2014-01-02 17:50:25 UTC (rev 1432)
+++ code/trunk/testdata/testoutput12 2014-01-03 15:15:00 UTC (rev 1433)
@@ -8,7 +8,7 @@
First char = 'a'
Need char = 'c'
Subject length lower bound = 3
-No set of starting bytes
+No starting char list
JIT study was successful
/(?(?C1)(?=a)a)/S+I
@@ -27,7 +27,7 @@
No first char
No need char
Subject length lower bound = -1
-No set of starting bytes
+No starting char list
JIT study was not successful
/abc/S+I>testsavedregex
@@ -36,7 +36,7 @@
First char = 'a'
Need char = 'c'
Subject length lower bound = 3
-No set of starting bytes
+No starting char list
JIT study was successful
Compiled pattern written to testsavedregex
Study data written to testsavedregex
@@ -165,7 +165,7 @@
First char = 'a'
Need char = 'd'
Subject length lower bound = 4
-No set of starting bytes
+No starting char list
JIT study was successful
/(*NO_START_OPT)a(*:m)b/KS++
Modified: code/trunk/testdata/testoutput13
===================================================================
--- code/trunk/testdata/testoutput13 2014-01-02 17:50:25 UTC (rev 1432)
+++ code/trunk/testdata/testoutput13 2014-01-03 15:15:00 UTC (rev 1433)
@@ -8,7 +8,7 @@
First char = 'a'
Need char = 'c'
Subject length lower bound = 3
-No set of starting bytes
+No starting char list
JIT support is not available in this version of PCRE
/a*/SI
Modified: code/trunk/testdata/testoutput14
===================================================================
--- code/trunk/testdata/testoutput14 2014-01-02 17:50:25 UTC (rev 1432)
+++ code/trunk/testdata/testoutput14 2014-01-03 15:15:00 UTC (rev 1433)
@@ -361,7 +361,7 @@
No first char
No need char
Subject length lower bound = 3
-Starting byte set: \x09 \x20 ! " # $ % & ' ( * + - / 0 1 2 3 4 5 6 7 8
+Starting chars: \x09 \x20 ! " # $ % & ' ( * + - / 0 1 2 3 4 5 6 7 8
9 = ? A B C D E F G H I J K L M N O P Q R S T U V W X Y Z ^ _ ` a b c d e
f g h i j k l m n o p q r s t u v w x y z { | } ~ \x7f
@@ -388,7 +388,7 @@
No first char
No need char
Subject length lower bound = 1
-Starting byte set: \x09 \x20 \xa0
+Starting chars: \x09 \x20 \xa0
/\H/SI
Capturing subpattern count = 0
@@ -396,7 +396,7 @@
No first char
No need char
Subject length lower bound = 1
-No set of starting bytes
+No starting char list
/\v/SI
Capturing subpattern count = 0
@@ -404,7 +404,7 @@
No first char
No need char
Subject length lower bound = 1
-Starting byte set: \x0a \x0b \x0c \x0d \x85
+Starting chars: \x0a \x0b \x0c \x0d \x85
/\V/SI
Capturing subpattern count = 0
@@ -412,7 +412,7 @@
No first char
No need char
Subject length lower bound = 1
-No set of starting bytes
+No starting char list
/\R/SI
Capturing subpattern count = 0
@@ -420,7 +420,7 @@
No first char
No need char
Subject length lower bound = 1
-Starting byte set: \x0a \x0b \x0c \x0d \x85
+Starting chars: \x0a \x0b \x0c \x0d \x85
/[\h]/BZ
------------------------------------------------------------------
Modified: code/trunk/testdata/testoutput15
===================================================================
--- code/trunk/testdata/testoutput15 2014-01-02 17:50:25 UTC (rev 1432)
+++ code/trunk/testdata/testoutput15 2014-01-03 15:15:00 UTC (rev 1433)
@@ -481,7 +481,7 @@
No first char
No need char
Subject length lower bound = 1
-Starting byte set: \x00 \x01 \x02 \x03 \x04 \x05 \x06 \x07 \x08 \x09 \x0a
+Starting chars: \x00 \x01 \x02 \x03 \x04 \x05 \x06 \x07 \x08 \x09 \x0a
\x0b \x0c \x0d \x0e \x0f \x10 \x11 \x12 \x13 \x14 \x15 \x16 \x17 \x18 \x19
\x1a \x1b \x1c \x1d \x1e \x1f \x20 ! " # $ % & ' ( ) * + , - . / 0 1 2 3 4
5 6 7 8 9 : ; < = > ? @ A B C D E F G H I J K L M N O P Q R S T U V W X Y
@@ -519,7 +519,7 @@
First char = \x{c4}
Need char = \x{80}
Subject length lower bound = 3
-No set of starting bytes
+No starting char list
\x{100}\x{100}\x{100}\x{100\x{100}
0: \x{100}\x{100}\x{100}
@@ -539,7 +539,7 @@
No first char
No need char
Subject length lower bound = 1
-Starting byte set: x \xc4
+Starting chars: x \xc4
/(\x{100}*a|x)/8SDZ
------------------------------------------------------------------
@@ -558,7 +558,7 @@
No first char
No need char
Subject length lower bound = 1
-Starting byte set: a x \xc4
+Starting chars: a x \xc4
/(\x{100}{0,2}a|x)/8SDZ
------------------------------------------------------------------
@@ -577,7 +577,7 @@
No first char
No need char
Subject length lower bound = 1
-Starting byte set: a x \xc4
+Starting chars: a x \xc4
/(\x{100}{1,2}a|x)/8SDZ
------------------------------------------------------------------
@@ -597,7 +597,7 @@
No first char
No need char
Subject length lower bound = 1
-Starting byte set: x \xc4
+Starting chars: x \xc4
/\x{100}/8DZ
------------------------------------------------------------------
@@ -799,7 +799,7 @@
No first char
No need char
Subject length lower bound = 1
-Starting byte set: \x09 \x20 \xc2 \xe1 \xe2 \xe3
+Starting chars: \x09 \x20 \xc2 \xe1 \xe2 \xe3
ABC\x{09}
0: \x{09}
ABC\x{20}
@@ -825,7 +825,7 @@
No first char
No need char
Subject length lower bound = 1
-Starting byte set: \x0a \x0b \x0c \x0d \xc2 \xe2
+Starting chars: \x0a \x0b \x0c \x0d \xc2 \xe2
ABC\x{0a}
0: \x{0a}
ABC\x{0b}
@@ -845,7 +845,7 @@
No first char
Need char = 'A'
Subject length lower bound = 1
-Starting byte set: \x09 \x20 A \xc2 \xe1 \xe2 \xe3
+Starting chars: \x09 \x20 A \xc2 \xe1 \xe2 \xe3
CDBABC
0: A
@@ -855,7 +855,7 @@
No first char
Need char = 'A'
Subject length lower bound = 2
-Starting byte set: \x0a \x0b \x0c \x0d \xc2 \xe2
+Starting chars: \x0a \x0b \x0c \x0d \xc2 \xe2
/\s?xxx\s/8SI
Capturing subpattern count = 0
@@ -863,7 +863,7 @@
No first char
Need char = 'x'
Subject length lower bound = 4
-Starting byte set: \x09 \x0a \x0b \x0c \x0d \x20 x
+Starting chars: \x09 \x0a \x0b \x0c \x0d \x20 x
/\sxxx\s/I8ST1
Capturing subpattern count = 0
@@ -871,7 +871,7 @@
No first char
Need char = 'x'
Subject length lower bound = 5
-Starting byte set: \x09 \x0a \x0c \x0d \x20 \xc2
+Starting chars: \x09 \x0a \x0c \x0d \x20 \xc2
AB\x{85}xxx\x{a0}XYZ
0: \x{85}xxx\x{a0}
AB\x{a0}xxx\x{85}XYZ
@@ -883,7 +883,7 @@
No first char
Need char = ' '
Subject length lower bound = 3
-Starting byte set: \x00 \x01 \x02 \x03 \x04 \x05 \x06 \x07 \x08 \x0b \x0e
+Starting chars: \x00 \x01 \x02 \x03 \x04 \x05 \x06 \x07 \x08 \x0b \x0e
\x0f \x10 \x11 \x12 \x13 \x14 \x15 \x16 \x17 \x18 \x19 \x1a \x1b \x1c \x1d
\x1e \x1f ! " # $ % & ' ( ) * + , - . / 0 1 2 3 4 5 6 7 8 9 : ; < = > ? @
A B C D E F G H I J K L M N O P Q R S T U V W X Y Z [ \ ] ^ _ ` a b c d e
@@ -917,7 +917,7 @@
No first char
No need char
Subject length lower bound = 1
-Starting byte set: \xe1
+Starting chars: \xe1
/\x{1234}+?/iS8I
Capturing subpattern count = 0
@@ -925,7 +925,7 @@
No first char
No need char
Subject length lower bound = 1
-Starting byte set: \xe1
+Starting chars: \xe1
/\x{1234}++/iS8I
Capturing subpattern count = 0
@@ -933,7 +933,7 @@
No first char
No need char
Subject length lower bound = 1
-Starting byte set: \xe1
+Starting chars: \xe1
/\x{1234}{2}/iS8I
Capturing subpattern count = 0
@@ -941,7 +941,7 @@
No first char
No need char
Subject length lower bound = 2
-Starting byte set: \xe1
+Starting chars: \xe1
/[^\x{c4}]/8DZ
------------------------------------------------------------------
@@ -974,7 +974,7 @@
No first char
No need char
Subject length lower bound = 1
-Starting byte set: \x0a \x0b \x0c \x0d \xc2 \xe2
+Starting chars: \x0a \x0b \x0c \x0d \xc2 \xe2
/\777/8DZ
------------------------------------------------------------------
Modified: code/trunk/testdata/testoutput16
===================================================================
--- code/trunk/testdata/testoutput16 2014-01-02 17:50:25 UTC (rev 1432)
+++ code/trunk/testdata/testoutput16 2014-01-03 15:15:00 UTC (rev 1433)
@@ -64,7 +64,7 @@
No first char
No need char
Subject length lower bound = 17
-Starting byte set: \xd0 \xd1
+Starting chars: \xd0 \xd1
\x{401}\x{420}\x{421}\x{422}\x{423}\x{424}\x{425}\x{426}\x{427}\x{428}\x{429}\x{42a}\x{42b}\x{42c}\x{42d}\x{42e}\x{42f}
0: \x{401}\x{420}\x{421}\x{422}\x{423}\x{424}\x{425}\x{426}\x{427}\x{428}\x{429}\x{42a}\x{42b}\x{42c}\x{42d}\x{42e}\x{42f}
\x{451}\x{440}\x{441}\x{442}\x{443}\x{444}\x{445}\x{446}\x{447}\x{448}\x{449}\x{44a}\x{44b}\x{44c}\x{44d}\x{44e}\x{44f}
@@ -92,7 +92,7 @@
No first char
No need char
Subject length lower bound = 1
-Starting byte set: \x09 \x20 \xa0
+Starting chars: \x09 \x20 \xa0
/\v/SI
Capturing subpattern count = 0
@@ -100,7 +100,7 @@
No first char
No need char
Subject length lower bound = 1
-Starting byte set: \x0a \x0b \x0c \x0d \x85
+Starting chars: \x0a \x0b \x0c \x0d \x85
/\R/SI
Capturing subpattern count = 0
@@ -108,7 +108,7 @@
No first char
No need char
Subject length lower bound = 1
-Starting byte set: \x0a \x0b \x0c \x0d \x85
+Starting chars: \x0a \x0b \x0c \x0d \x85
/[[:blank:]]/WBZ
------------------------------------------------------------------
Modified: code/trunk/testdata/testoutput17
===================================================================
--- code/trunk/testdata/testoutput17 2014-01-02 17:50:25 UTC (rev 1432)
+++ code/trunk/testdata/testoutput17 2014-01-03 15:15:00 UTC (rev 1433)
@@ -228,7 +228,7 @@
No first char
No need char
Subject length lower bound = 3
-Starting byte set: \x09 \x20 ! " # $ % & ' ( * + - / 0 1 2 3 4 5 6 7 8
+Starting chars: \x09 \x20 ! " # $ % & ' ( * + - / 0 1 2 3 4 5 6 7 8
9 = ? A B C D E F G H I J K L M N O P Q R S T U V W X Y Z ^ _ ` a b c d e
f g h i j k l m n o p q r s t u v w x y z { | } ~ \x7f \xff
@@ -274,7 +274,7 @@
No first char
No need char
Subject length lower bound = 1
-Starting byte set: \x09 \x20 \xa0 \xff
+Starting chars: \x09 \x20 \xa0 \xff
\x{1681}\x{200b}\x{1680}\x{2000}\x{202f}\x{3000}
0: \x{1680}\x{2000}\x{202f}\x{3000}
\x{3001}\x{2fff}\x{200a}\xa0\x{2000}
@@ -292,7 +292,7 @@
No first char
No need char
Subject length lower bound = 1
-Starting byte set: \x09 \x20 \xa0 \xff
+Starting chars: \x09 \x20 \xa0 \xff
\x{1681}\x{200b}\x{1680}\x{2000}\x{202f}\x{3000}
0: \x{1680}\x{2000}\x{202f}\x{3000}
\x{3001}\x{2fff}\x{200a}\xa0\x{2000}
@@ -304,7 +304,7 @@
No first char
No need char
Subject length lower bound = 1
-No set of starting bytes
+No starting char list
\x{1680}\x{180e}\x{167f}\x{1681}\x{180d}\x{180f}
0: \x{167f}\x{1681}\x{180d}\x{180f}
\x{2000}\x{200a}\x{1fff}\x{200b}
@@ -330,7 +330,7 @@
No first char
No need char
Subject length lower bound = 1
-Starting byte set: \x0a \x0b \x0c \x0d \x85 \xff
+Starting chars: \x0a \x0b \x0c \x0d \x85 \xff
\x{2027}\x{2030}\x{2028}\x{2029}
0: \x{2028}\x{2029}
\x09\x0e\x84\x86\x85\x0a\x0b\x0c\x0d
@@ -348,7 +348,7 @@
No first char
No need char
Subject length lower bound = 1
-Starting byte set: \x0a \x0b \x0c \x0d \x85 \xff
+Starting chars: \x0a \x0b \x0c \x0d \x85 \xff
\x{2027}\x{2030}\x{2028}\x{2029}
0: \x{2028}\x{2029}
\x09\x0e\x84\x86\x85\x0a\x0b\x0c\x0d
@@ -360,7 +360,7 @@
No first char
No need char
Subject length lower bound = 1
-No set of starting bytes
+No starting char list
\x{2028}\x{2029}\x{2027}\x{2030}
0: \x{2027}\x{2030}
\x85\x0a\x0b\x0c\x0d\x09\x0e\x84\x86
@@ -378,7 +378,7 @@
No first char
No need char
Subject length lower bound = 1
-Starting byte set: \x0a \x0b \x0c \x0d \x85 \xff
+Starting chars: \x0a \x0b \x0c \x0d \x85 \xff
\x{2027}\x{2030}\x{2028}\x{2029}
0: \x{2028}\x{2029}
\x09\x0e\x84\x86\x85\x0a\x0b\x0c\x0d
Modified: code/trunk/testdata/testoutput18-16
===================================================================
--- code/trunk/testdata/testoutput18-16 2014-01-02 17:50:25 UTC (rev 1432)
+++ code/trunk/testdata/testoutput18-16 2014-01-03 15:15:00 UTC (rev 1433)
@@ -339,7 +339,7 @@
No first char
No need char
Subject length lower bound = 1
-Starting byte set: \x00 \x01 \x02 \x03 \x04 \x05 \x06 \x07 \x08 \x09 \x0a
+Starting chars: \x00 \x01 \x02 \x03 \x04 \x05 \x06 \x07 \x08 \x09 \x0a
\x0b \x0c \x0d \x0e \x0f \x10 \x11 \x12 \x13 \x14 \x15 \x16 \x17 \x18 \x19
\x1a \x1b \x1c \x1d \x1e \x1f \x20 ! " # $ % & ' ( ) * + , - . / 0 1 2 3 4
5 6 7 8 9 : ; < = > ? @ A B C D E F G H I J K L M N O P Q R S T U V W X Y
@@ -378,7 +378,7 @@
First char = \x{100}
Need char = \x{100}
Subject length lower bound = 3
-No set of starting bytes
+No starting char list
\x{100}\x{100}\x{100}\x{100\x{100}
0: \x{100}\x{100}\x{100}
@@ -398,7 +398,7 @@
No first char
No need char
Subject length lower bound = 1
-Starting byte set: x \xff
+Starting chars: x \xff
/(\x{100}*a|x)/8SDZ
------------------------------------------------------------------
@@ -417,7 +417,7 @@
No first char
No need char
Subject length lower bound = 1
-Starting byte set: a x \xff
+Starting chars: a x \xff
/(\x{100}{0,2}a|x)/8SDZ
------------------------------------------------------------------
@@ -436,7 +436,7 @@
No first char
No need char
Subject length lower bound = 1
-Starting byte set: a x \xff
+Starting chars: a x \xff
/(\x{100}{1,2}a|x)/8SDZ
------------------------------------------------------------------
@@ -456,7 +456,7 @@
No first char
No need char
Subject length lower bound = 1
-Starting byte set: x \xff
+Starting chars: x \xff
/\x{100}/8DZ
------------------------------------------------------------------
@@ -666,7 +666,7 @@
No first char
No need char
Subject length lower bound = 1
-Starting byte set: \x09 \x20 \xa0 \xff
+Starting chars: \x09 \x20 \xa0 \xff
ABC\x{09}
0: \x{09}
ABC\x{20}
@@ -692,7 +692,7 @@
No first char
No need char
Subject length lower bound = 1
-Starting byte set: \x0a \x0b \x0c \x0d \x85 \xff
+Starting chars: \x0a \x0b \x0c \x0d \x85 \xff
ABC\x{0a}
0: \x{0a}
ABC\x{0b}
@@ -712,7 +712,7 @@
No first char
Need char = 'A'
Subject length lower bound = 1
-Starting byte set: \x09 \x20 A \xa0 \xff
+Starting chars: \x09 \x20 A \xa0 \xff
CDBABC
0: A
\x{2000}ABC
@@ -724,7 +724,7 @@
No first char
Need char = 'A'
Subject length lower bound = 1
-Starting byte set: \x0a \x0b \x0c \x0d A \x85 \xff
+Starting chars: \x0a \x0b \x0c \x0d A \x85 \xff
CDBABC
0: A
\x{2028}A
@@ -736,7 +736,7 @@
No first char
Need char = 'A'
Subject length lower bound = 2
-Starting byte set: \x0a \x0b \x0c \x0d \x85 \xff
+Starting chars: \x0a \x0b \x0c \x0d \x85 \xff
/\s?xxx\s/8SI
Capturing subpattern count = 0
@@ -744,7 +744,7 @@
No first char
Need char = 'x'
Subject length lower bound = 4
-Starting byte set: \x09 \x0a \x0b \x0c \x0d \x20 x
+Starting chars: \x09 \x0a \x0b \x0c \x0d \x20 x
/\sxxx\s/I8ST1
Capturing subpattern count = 0
@@ -752,7 +752,7 @@
No first char
Need char = 'x'
Subject length lower bound = 5
-Starting byte set: \x09 \x0a \x0c \x0d \x20 \x85 \xa0
+Starting chars: \x09 \x0a \x0c \x0d \x20 \x85 \xa0
AB\x{85}xxx\x{a0}XYZ
0: \x{85}xxx\x{a0}
AB\x{a0}xxx\x{85}XYZ
@@ -764,7 +764,7 @@
No first char
Need char = ' '
Subject length lower bound = 3
-Starting byte set: \x00 \x01 \x02 \x03 \x04 \x05 \x06 \x07 \x08 \x0b \x0e
+Starting chars: \x00 \x01 \x02 \x03 \x04 \x05 \x06 \x07 \x08 \x0b \x0e
\x0f \x10 \x11 \x12 \x13 \x14 \x15 \x16 \x17 \x18 \x19 \x1a \x1b \x1c \x1d
\x1e \x1f ! " # $ % & ' ( ) * + , - . / 0 1 2 3 4 5 6 7 8 9 : ; < = > ? @
A B C D E F G H I J K L M N O P Q R S T U V W X Y Z [ \ ] ^ _ ` a b c d e
@@ -803,7 +803,7 @@
First char = \x{1234}
No need char
Subject length lower bound = 1
-No set of starting bytes
+No starting char list
/\x{1234}+?/iS8I
Capturing subpattern count = 0
@@ -811,7 +811,7 @@
First char = \x{1234}
No need char
Subject length lower bound = 1
-No set of starting bytes
+No starting char list
/\x{1234}++/iS8I
Capturing subpattern count = 0
@@ -819,7 +819,7 @@
First char = \x{1234}
No need char
Subject length lower bound = 1
-No set of starting bytes
+No starting char list
/\x{1234}{2}/iS8I
Capturing subpattern count = 0
@@ -827,7 +827,7 @@
First char = \x{1234}
Need char = \x{1234}
Subject length lower bound = 2
-No set of starting bytes
+No starting char list
/[^\x{c4}]/8DZ
------------------------------------------------------------------
@@ -860,7 +860,7 @@
No first char
No need char
Subject length lower bound = 1
-Starting byte set: \x0a \x0b \x0c \x0d \x85 \xff
+Starting chars: \x0a \x0b \x0c \x0d \x85 \xff
/-- Check bad offset --/
Modified: code/trunk/testdata/testoutput18-32
===================================================================
--- code/trunk/testdata/testoutput18-32 2014-01-02 17:50:25 UTC (rev 1432)
+++ code/trunk/testdata/testoutput18-32 2014-01-03 15:15:00 UTC (rev 1433)
@@ -337,7 +337,7 @@
No first char
No need char
Subject length lower bound = 1
-Starting byte set: \x00 \x01 \x02 \x03 \x04 \x05 \x06 \x07 \x08 \x09 \x0a
+Starting chars: \x00 \x01 \x02 \x03 \x04 \x05 \x06 \x07 \x08 \x09 \x0a
\x0b \x0c \x0d \x0e \x0f \x10 \x11 \x12 \x13 \x14 \x15 \x16 \x17 \x18 \x19
\x1a \x1b \x1c \x1d \x1e \x1f \x20 ! " # $ % & ' ( ) * + , - . / 0 1 2 3 4
5 6 7 8 9 : ; < = > ? @ A B C D E F G H I J K L M N O P Q R S T U V W X Y
@@ -376,7 +376,7 @@
First char = \x{100}
Need char = \x{100}
Subject length lower bound = 3
-No set of starting bytes
+No starting char list
\x{100}\x{100}\x{100}\x{100\x{100}
0: \x{100}\x{100}\x{100}
@@ -396,7 +396,7 @@
No first char
No need char
Subject length lower bound = 1
-Starting byte set: x \xff
+Starting chars: x \xff
/(\x{100}*a|x)/8SDZ
------------------------------------------------------------------
@@ -415,7 +415,7 @@
No first char
No need char
Subject length lower bound = 1
-Starting byte set: a x \xff
+Starting chars: a x \xff
/(\x{100}{0,2}a|x)/8SDZ
------------------------------------------------------------------
@@ -434,7 +434,7 @@
No first char
No need char
Subject length lower bound = 1
-Starting byte set: a x \xff
+Starting chars: a x \xff
/(\x{100}{1,2}a|x)/8SDZ
------------------------------------------------------------------
@@ -454,7 +454,7 @@
No first char
No need char
Subject length lower bound = 1
-Starting byte set: x \xff
+Starting chars: x \xff
/\x{100}/8DZ
------------------------------------------------------------------
@@ -663,7 +663,7 @@
No first char
No need char
Subject length lower bound = 1
-Starting byte set: \x09 \x20 \xa0 \xff
+Starting chars: \x09 \x20 \xa0 \xff
ABC\x{09}
0: \x{09}
ABC\x{20}
@@ -689,7 +689,7 @@
No first char
No need char
Subject length lower bound = 1
-Starting byte set: \x0a \x0b \x0c \x0d \x85 \xff
+Starting chars: \x0a \x0b \x0c \x0d \x85 \xff
ABC\x{0a}
0: \x{0a}
ABC\x{0b}
@@ -709,7 +709,7 @@
No first char
Need char = 'A'
Subject length lower bound = 1
-Starting byte set: \x09 \x20 A \xa0 \xff
+Starting chars: \x09 \x20 A \xa0 \xff
CDBABC
0: A
\x{2000}ABC
@@ -721,7 +721,7 @@
No first char
Need char = 'A'
Subject length lower bound = 1
-Starting byte set: \x0a \x0b \x0c \x0d A \x85 \xff
+Starting chars: \x0a \x0b \x0c \x0d A \x85 \xff
CDBABC
0: A
\x{2028}A
@@ -733,7 +733,7 @@
No first char
Need char = 'A'
Subject length lower bound = 2
-Starting byte set: \x0a \x0b \x0c \x0d \x85 \xff
+Starting chars: \x0a \x0b \x0c \x0d \x85 \xff
/\s?xxx\s/8SI
Capturing subpattern count = 0
@@ -741,7 +741,7 @@
No first char
Need char = 'x'
Subject length lower bound = 4
-Starting byte set: \x09 \x0a \x0b \x0c \x0d \x20 x
+Starting chars: \x09 \x0a \x0b \x0c \x0d \x20 x
/\sxxx\s/I8ST1
Capturing subpattern count = 0
@@ -749,7 +749,7 @@
No first char
Need char = 'x'
Subject length lower bound = 5
-Starting byte set: \x09 \x0a \x0c \x0d \x20 \x85 \xa0
+Starting chars: \x09 \x0a \x0c \x0d \x20 \x85 \xa0
AB\x{85}xxx\x{a0}XYZ
0: \x{85}xxx\x{a0}
AB\x{a0}xxx\x{85}XYZ
@@ -761,7 +761,7 @@
No first char
Need char = ' '
Subject length lower bound = 3
-Starting byte set: \x00 \x01 \x02 \x03 \x04 \x05 \x06 \x07 \x08 \x0b \x0e
+Starting chars: \x00 \x01 \x02 \x03 \x04 \x05 \x06 \x07 \x08 \x0b \x0e
\x0f \x10 \x11 \x12 \x13 \x14 \x15 \x16 \x17 \x18 \x19 \x1a \x1b \x1c \x1d
\x1e \x1f ! " # $ % & ' ( ) * + , - . / 0 1 2 3 4 5 6 7 8 9 : ; < = > ? @
A B C D E F G H I J K L M N O P Q R S T U V W X Y Z [ \ ] ^ _ ` a b c d e
@@ -800,7 +800,7 @@
First char = \x{1234}
No need char
Subject length lower bound = 1
-No set of starting bytes
+No starting char list
/\x{1234}+?/iS8I
Capturing subpattern count = 0
@@ -808,7 +808,7 @@
First char = \x{1234}
No need char
Subject length lower bound = 1
-No set of starting bytes
+No starting char list
/\x{1234}++/iS8I
Capturing subpattern count = 0
@@ -816,7 +816,7 @@
First char = \x{1234}
No need char
Subject length lower bound = 1
-No set of starting bytes
+No starting char list
/\x{1234}{2}/iS8I
Capturing subpattern count = 0
@@ -824,7 +824,7 @@
First char = \x{1234}
Need char = \x{1234}
Subject length lower bound = 2
-No set of starting bytes
+No starting char list
/[^\x{c4}]/8DZ
------------------------------------------------------------------
@@ -857,7 +857,7 @@
No first char
No need char
Subject length lower bound = 1
-Starting byte set: \x0a \x0b \x0c \x0d \x85 \xff
+Starting chars: \x0a \x0b \x0c \x0d \x85 \xff
/-- Check bad offset --/
Modified: code/trunk/testdata/testoutput19
===================================================================
--- code/trunk/testdata/testoutput19 2014-01-02 17:50:25 UTC (rev 1432)
+++ code/trunk/testdata/testoutput19 2014-01-03 15:15:00 UTC (rev 1433)
@@ -55,7 +55,7 @@
First char = \x{401} (caseless)
Need char = \x{42f} (caseless)
Subject length lower bound = 17
-No set of starting bytes
+No starting char list
\x{401}\x{420}\x{421}\x{422}\x{423}\x{424}\x{425}\x{426}\x{427}\x{428}\x{429}\x{42a}\x{42b}\x{42c}\x{42d}\x{42e}\x{42f}
0: \x{401}\x{420}\x{421}\x{422}\x{423}\x{424}\x{425}\x{426}\x{427}\x{428}\x{429}\x{42a}\x{42b}\x{42c}\x{42d}\x{42e}\x{42f}
\x{451}\x{440}\x{441}\x{442}\x{443}\x{444}\x{445}\x{446}\x{447}\x{448}\x{449}\x{44a}\x{44b}\x{44c}\x{44d}\x{44e}\x{44f}
Modified: code/trunk/testdata/testoutput2
===================================================================
--- code/trunk/testdata/testoutput2 2014-01-02 17:50:25 UTC (rev 1432)
+++ code/trunk/testdata/testoutput2 2014-01-03 15:15:00 UTC (rev 1433)
@@ -178,7 +178,7 @@
No first char
No need char
Subject length lower bound = 3
-Starting byte set: c d e
+Starting chars: c d e
this sentence eventually mentions a cat
0: cat
this sentences rambles on and on for a while and then reaches elephant
@@ -190,7 +190,7 @@
No first char
No need char
Subject length lower bound = 3
-Starting byte set: C D E c d e
+Starting chars: C D E c d e
this sentence eventually mentions a CAT cat
0: CAT
this sentences rambles on and on for a while to elephant ElePhant
@@ -202,7 +202,7 @@
No first char
No need char
Subject length lower bound = 1
-Starting byte set: a b c d
+Starting chars: a b c d
/(a|[^\dZ])/IS
Capturing subpattern count = 1
@@ -210,7 +210,7 @@
No first char
No need char
Subject length lower bound = 1
-Starting byte set: \x00 \x01 \x02 \x03 \x04 \x05 \x06 \x07 \x08 \x09 \x0a
+Starting chars: \x00 \x01 \x02 \x03 \x04 \x05 \x06 \x07 \x08 \x09 \x0a
\x0b \x0c \x0d \x0e \x0f \x10 \x11 \x12 \x13 \x14 \x15 \x16 \x17 \x18 \x19
\x1a \x1b \x1c \x1d \x1e \x1f \x20 ! " # $ % & ' ( ) * + , - . / : ; < = >
? @ A B C D E F G H I J K L M N O P Q R S T U V W X Y [ \ ] ^ _ ` a b c d
@@ -231,7 +231,7 @@
No first char
No need char
Subject length lower bound = 1
-Starting byte set: \x09 \x0a \x0b \x0c \x0d \x20 a b
+Starting chars: \x09 \x0a \x0b \x0c \x0d \x20 a b
/(ab\2)/
Failed: reference to non-existent subpattern at offset 6
@@ -512,7 +512,7 @@
No first char
No need char
Subject length lower bound = 1
-Starting byte set: a b c d
+Starting chars: a b c d
/(?i)[abcd]/IS
Capturing subpattern count = 0
@@ -520,7 +520,7 @@
No first char
No need char
Subject length lower bound = 1
-Starting byte set: A B C D a b c d
+Starting chars: A B C D a b c d
/(?m)[xy]|(b|c)/IS
Capturing subpattern count = 1
@@ -528,7 +528,7 @@
No first char
No need char
Subject length lower bound = 1
-Starting byte set: b c x y
+Starting chars: b c x y
/(^a|^b)/Im
Capturing subpattern count = 1
@@ -591,7 +591,7 @@
First char = 'b' (caseless)
No need char
Subject length lower bound = 1
-No set of starting bytes
+No starting char list
/(a*b|(?i:c*(?-i)d))/IS
Capturing subpattern count = 1
@@ -599,7 +599,7 @@
No first char
No need char
Subject length lower bound = 1
-Starting byte set: C a b c d
+Starting chars: C a b c d
/a$/I
Capturing subpattern count = 0
@@ -666,7 +666,7 @@
No first char
No need char
Subject length lower bound = 1
-Starting byte set: a b
+Starting chars: a b
/(?<!foo)(alpha|omega)/IS
Capturing subpattern count = 1
@@ -675,7 +675,7 @@
No first char
Need char = 'a'
Subject length lower bound = 5
-Starting byte set: a o
+Starting chars: a o
/(?!alphabet)[ab]/IS
Capturing subpattern count = 0
@@ -683,7 +683,7 @@
No first char
No need char
Subject length lower bound = 1
-Starting byte set: a b
+Starting chars: a b
/(?<=foo\n)^bar/Im
Capturing subpattern count = 0
@@ -1642,7 +1642,7 @@
No first char
Need char = 'd'
Subject length lower bound = 4
-No set of starting bytes
+No starting char list
/\( # ( at start
(?: # Non-capturing bracket
@@ -1875,7 +1875,7 @@
No first char
No need char
Subject length lower bound = 1
-Starting byte set: A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
+Starting chars: A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
_ a b c d e f g h i j k l m n o p q r s t u v w x y z
/^[[:ascii:]]/DZ
@@ -1937,7 +1937,7 @@
No first char
No need char
Subject length lower bound = 1
-Starting byte set: \x09 \x0a \x0b \x0c \x0d \x20
+Starting chars: \x09 \x0a \x0b \x0c \x0d \x20
/^[[:cntrl:]]/DZ
------------------------------------------------------------------
@@ -3434,7 +3434,7 @@
No first char
No need char
Subject length lower bound = 1
-Starting byte set: a b
+Starting chars: a b
/[^a]/I
Capturing subpattern count = 0
@@ -3454,7 +3454,7 @@
No first char
Need char = '6'
Subject length lower bound = 4
-Starting byte set: 0 1 2 3 4 5 6 7 8 9
+Starting chars: 0 1 2 3 4 5 6 7 8 9
/a^b/I
Capturing subpattern count = 0
@@ -3488,7 +3488,7 @@
No first char
No need char
Subject length lower bound = 1
-Starting byte set: A B a b
+Starting chars: A B a b
/[ab](?i)cd/IS
Capturing subpattern count = 0
@@ -3496,7 +3496,7 @@
No first char
Need char = 'd' (caseless)
Subject length lower bound = 3
-Starting byte set: a b
+Starting chars: a b
/abc(?C)def/I
Capturing subpattern count = 0
@@ -3537,7 +3537,7 @@
No first char
Need char = 'f'
Subject length lower bound = 7
-Starting byte set: 0 1 2 3 4 5 6 7 8 9
+Starting chars: 0 1 2 3 4 5 6 7 8 9
1234abcdef
--->1234abcdef
1 ^ \d
@@ -3856,7 +3856,7 @@
No first char
No need char
Subject length lower bound = 1
-Starting byte set: a b
+Starting chars: a b
/(?R)/I
Failed: recursive call could loop indefinitely at offset 3
@@ -4637,7 +4637,7 @@
No first char
Need char = 'g' (caseless)
Subject length lower bound = 8
-No set of starting bytes
+No starting char list
Baby Bjorn Active Carrier - With free SHIPPING!!
0: Baby Bjorn Active Carrier - With free SHIPPING!!
1: Baby Bjorn Active Carrier - With free SHIPPING!!
@@ -4656,7 +4656,7 @@
No first char
Need char = 'b'
Subject length lower bound = 1
-No set of starting bytes
+No starting char list
/(a|b)*.?c/ISDZ
------------------------------------------------------------------
@@ -4677,7 +4677,7 @@
No first char
Need char = 'c'
Subject length lower bound = 1
-No set of starting bytes
+No starting char list
/abc(?C255)de(?C)f/DZ
------------------------------------------------------------------
@@ -4750,7 +4750,7 @@
No first char
Need char = 'b'
Subject length lower bound = 1
-Starting byte set: a b
+Starting chars: a b
ab
--->ab
+0 ^ a*
@@ -4893,7 +4893,7 @@
No first char
Need char = 'x'
Subject length lower bound = 4
-Starting byte set: a d
+Starting chars: a d
abcx
--->abcx
+0 ^ (abc|def)
@@ -5127,7 +5127,7 @@
No first char
No need char
Subject length lower bound = 2
-Starting byte set: a b x
+Starting chars: a b x
Note: that { does NOT introduce a quantifier
--->Note: that { does NOT introduce a quantifier
+0 ^ ([ab]{,4}c|xy)
@@ -5607,7 +5607,7 @@
First char = 'a'
Need char = 'c'
Subject length lower bound = 3
-No set of starting bytes
+No starting char list
Compiled pattern written to testsavedregex
Study data written to testsavedregex
<testsavedregex
@@ -5642,7 +5642,7 @@
First char = 'a'
Need char = 'c'
Subject length lower bound = 3
-No set of starting bytes
+No starting char list
Compiled pattern written to testsavedregex
Study data written to testsavedregex
<testsavedregex
@@ -5677,7 +5677,7 @@
No first char
No need char
Subject length lower bound = 1
-Starting byte set: a b
+Starting chars: a b
Compiled pattern written to testsavedregex
Study data written to testsavedregex
<testsavedregex
@@ -5716,7 +5716,7 @@
No first char
No need char
Subject length lower bound = 1
-Starting byte set: a b
+Starting chars: a b
Compiled pattern written to testsavedregex
Study data written to testsavedregex
<testsavedregex
@@ -6431,7 +6431,7 @@
No first char
Need char = ','
Subject length lower bound = 1
-Starting byte set: \x09 \x0a \x0b \x0c \x0d \x20 ,
+Starting chars: \x09 \x0a \x0b \x0c \x0d \x20 ,
\x0b,\x0b
0: \x0b,\x0b
\x0c,\x0d
@@ -6738,7 +6738,7 @@
No first char
No need char
Subject length lower bound = 1
-Starting byte set: C a b c d
+Starting chars: C a b c d
/()[ab]xyz/IS
Capturing subpattern count = 1
@@ -6746,7 +6746,7 @@
No first char
Need char = 'z'
Subject length lower bound = 4
-Starting byte set: a b
+Starting chars: a b
/(|)[ab]xyz/IS
Capturing subpattern count = 1
@@ -6754,7 +6754,7 @@
No first char
Need char = 'z'
Subject length lower bound = 4
-Starting byte set: a b
+Starting chars: a b
/(|c)[ab]xyz/IS
Capturing subpattern count = 1
@@ -6762,7 +6762,7 @@
No first char
Need char = 'z'
Subject length lower bound = 4
-Starting byte set: a b c
+Starting chars: a b c
/(|c?)[ab]xyz/IS
Capturing subpattern count = 1
@@ -6770,7 +6770,7 @@
No first char
Need char = 'z'
Subject length lower bound = 4
-Starting byte set: a b c
+Starting chars: a b c
/(d?|c?)[ab]xyz/IS
Capturing subpattern count = 1
@@ -6778,7 +6778,7 @@
No first char
Need char = 'z'
Subject length lower bound = 4
-Starting byte set: a b c d
+Starting chars: a b c d
/(d?|c)[ab]xyz/IS
Capturing subpattern count = 1
@@ -6786,7 +6786,7 @@
No first char
Need char = 'z'
Subject length lower bound = 4
-Starting byte set: a b c d
+Starting chars: a b c d
/^a*b\d/DZ
------------------------------------------------------------------
@@ -6879,7 +6879,7 @@
No first char
No need char
Subject length lower bound = 1
-Starting byte set: a b c d
+Starting chars: a b c d
/(a+|b*)[cd]/IS
Capturing subpattern count = 1
@@ -6887,7 +6887,7 @@
No first char
No need char
Subject length lower bound = 1
-Starting byte set: a b c d
+Starting chars: a b c d
/(a*|b+)[cd]/IS
Capturing subpattern count = 1
@@ -6895,7 +6895,7 @@
No first char
No need char
Subject length lower bound = 1
-Starting byte set: a b c d
+Starting chars: a b c d
/(a+|b+)[cd]/IS
Capturing subpattern count = 1
@@ -6903,7 +6903,7 @@
No first char
No need char
Subject length lower bound = 2
-Starting byte set: a b
+Starting chars: a b
/((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((
((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((
@@ -9307,7 +9307,7 @@
No first char
No need char
Subject length lower bound = 1
-Starting byte set: x y z
+Starting chars: x y z
/(?(?=.*b)b|^)/CI
Capturing subpattern count = 0
@@ -10096,7 +10096,7 @@
No first char
No need char
Subject length lower bound = 2
-Starting byte set: a b
+Starting chars: a b
/(a|bc)\1{2,3}/SI
Capturing subpattern count = 1
@@ -10105,7 +10105,7 @@
No first char
No need char
Subject length lower bound = 3
-Starting byte set: a b
+Starting chars: a b
/(a|bc)(?1)/SI
Capturing subpattern count = 1
@@ -10113,7 +10113,7 @@
No first char
No need char
Subject length lower bound = 2
-Starting byte set: a b
+Starting chars: a b
/(a|b\1)(a|b\1)/SI
Capturing subpattern count = 2
@@ -10122,7 +10122,7 @@
No first char
No need char
Subject length lower bound = 2
-Starting byte set: a b
+Starting chars: a b
/(a|b\1){2}/SI
Capturing subpattern count = 1
@@ -10131,7 +10131,7 @@
No first char
No need char
Subject length lower bound = 2
-Starting byte set: a b
+Starting chars: a b
/(a|bbbb\1)(a|bbbb\1)/SI
Capturing subpattern count = 2
@@ -10140,7 +10140,7 @@
No first char
No need char
Subject length lower bound = 2
-Starting byte set: a b
+Starting chars: a b
/(a|bbbb\1){2}/SI
Capturing subpattern count = 1
@@ -10149,7 +10149,7 @@
No first char
No need char
Subject length lower bound = 2
-Starting byte set: a b
+Starting chars: a b
/^From +([^ ]+) +[a-zA-Z][a-zA-Z][a-zA-Z] +[a-zA-Z][a-zA-Z][a-zA-Z] +[0-9]?[0-9] +[0-9][0-9]:[0-9][0-9]/SI
Capturing subpattern count = 1
@@ -10157,7 +10157,7 @@
No first char
Need char = ':'
Subject length lower bound = 22
-No set of starting bytes
+No starting char list
/<tr([\w\W\s\d][^<>]{0,})><TD([\w\W\s\d][^<>]{0,})>([\d]{0,}\.)(.*)((<BR>([\w\W\s\d][^<>]{0,})|[\s]{0,}))<\/a><\/TD><TD([\w\W\s\d][^<>]{0,})>([\w\W\s\d][^<>]{0,})<\/TD><TD([\w\W\s\d][^<>]{0,})>([\w\W\s\d][^<>]{0,})<\/TD><\/TR>/isIS
Capturing subpattern count = 11
@@ -10165,7 +10165,7 @@
First char = '<'
Need char = '>'
Subject length lower bound = 47
-No set of starting bytes
+No starting char list
"(?>.*/)foo"SI
Capturing subpattern count = 0
@@ -10173,7 +10173,7 @@
No first char
Need char = 'o'
Subject length lower bound = 4
-No set of starting bytes
+No starting char list
/(?(?=[^a-z]+[a-z]) \d{2}-[a-z]{3}-\d{2} | \d{2}-\d{2}-\d{2} ) /xSI
Capturing subpattern count = 0
@@ -10181,7 +10181,7 @@
No first char
Need char = '-'
Subject length lower bound = 8
-No set of starting bytes
+No starting char list
/(?:(?:(?:(?:(?:(?:(?:(?:(?:(a|b|c))))))))))/iSI
Capturing subpattern count = 1
@@ -10189,7 +10189,7 @@
No first char
No need char
Subject length lower bound = 1
-Starting byte set: A B C a b c
+Starting chars: A B C a b c
/(?:c|d)(?:)(?:aaaaaaaa(?:)(?:bbbbbbbb)(?:bbbbbbbb(?:))(?:bbbbbbbb(?:)(?:bbbbbbbb)))/SI
Capturing subpattern count = 0
@@ -10197,7 +10197,7 @@
No first char
Need char = 'b'
Subject length lower bound = 41
-Starting byte set: c d
+Starting chars: c d
/<a[\s]+href[\s]*=[\s]* # find <a href=
([\"\'])? # find single or double quote
@@ -10210,7 +10210,7 @@
First char = '<'
Need char = '='
Subject length lower bound = 9
-No set of starting bytes
+No starting char list
/^(?!:) # colon disallowed at start
(?: # start of item
@@ -10226,7 +10226,7 @@
No first char
Need char = ':'
Subject length lower bound = 2
-No set of starting bytes
+No starting char list
/(?|(?<a>A)|(?<a>B))/I
Capturing subpattern count = 1
@@ -10450,7 +10450,7 @@
No first char
Need char = 'a'
Subject length lower bound = 1
-No set of starting bytes
+No starting char list
cat
0: a
1:
@@ -10464,7 +10464,7 @@
No first char
Need char = 'a'
Subject length lower bound = 3
-No set of starting bytes
+No starting char list
cat
No match
@@ -10476,7 +10476,7 @@
First char = 'i'
No need char
Subject length lower bound = 1
-No set of starting bytes
+No starting char list
i
0: i
@@ -10486,7 +10486,7 @@
No first char
Need char = 'i'
Subject length lower bound = 1
-Starting byte set: i
+Starting chars: i
ia
0: ia
1:
@@ -11080,7 +11080,7 @@
First char = 'a'
Need char = '4'
Subject length lower bound = 5
-No set of starting bytes
+No starting char list
/([abc])++1234/SI
Capturing subpattern count = 1
@@ -11088,7 +11088,7 @@
No first char
Need char = '4'
Subject length lower bound = 5
-Starting byte set: a b c
+Starting chars: a b c
/(?<=(abc)+)X/
Failed: lookbehind assertion is not fixed length at offset 10
@@ -11369,7 +11369,7 @@
No first char
No need char
Subject length lower bound = 1
-No set of starting bytes
+No starting char list
/(a(?2)|b)(b(?1)|a)(?:(?1)|(?2))/SI
Capturing subpattern count = 2
@@ -11377,7 +11377,7 @@
No first char
No need char
Subject length lower bound = 3
-Starting byte set: a b
+Starting chars: a b
/(a(?2)|b)(b(?1)|a)(?1)(?2)/SI
Capturing subpattern count = 2
@@ -11385,7 +11385,7 @@
No first char
No need char
Subject length lower bound = 4
-Starting byte set: a b
+Starting chars: a b
/(abc)(?1)/SI
Capturing subpattern count = 1
@@ -11393,7 +11393,7 @@
First char = 'a'
Need char = 'c'
Subject length lower bound = 6
-No set of starting bytes
+No starting char list
/^(?>a)++/
aa\M
@@ -11711,7 +11711,7 @@
First char = 't'
Need char = 't'
Subject length lower bound = 18
-No set of starting bytes
+No starting char list
/\btype\b\W*?\btext\b\W*?\bjavascript\b|\burl\b\W*?\bshell:|<input\b.*?\btype\b\W*?\bimage\b|\bonkeyup\b\W*?\=/IS
Capturing subpattern count = 0
@@ -11720,7 +11720,7 @@
No first char
No need char
Subject length lower bound = 8
-Starting byte set: < o t u
+Starting chars: < o t u
/a(*SKIP)c|b(*ACCEPT)|/+S!I
Capturing subpattern count = 0
@@ -11729,7 +11729,7 @@
No first char
No need char
Subject length lower bound = -1
-No set of starting bytes
+No starting char list
a
0:
0+
@@ -11740,7 +11740,7 @@
No first char
No need char
Subject length lower bound = -1
-Starting byte set: a b x
+Starting chars: a b x
ax
0: x
@@ -12436,7 +12436,7 @@
No first char
No need char
Subject length lower bound = -1
-No set of starting bytes
+No starting char list
/(?:(a)+(?C1)bb|aa(?C2)b)/
aab\C+
@@ -12722,7 +12722,7 @@
No first char
Need char = 'z'
Subject length lower bound = 2
-Starting byte set: a z
+Starting chars: a z
aaaaaaaaaaaaaz
Error -21 (recursion limit exceeded)
aaaaaaaaaaaaaz\Q1000
@@ -12735,7 +12735,7 @@
No first char
Need char = 'z'
Subject length lower bound = 2
-Starting byte set: a z
+Starting chars: a z
aaaaaaaaaaaaaz
Error -21 (recursion limit exceeded)
@@ -12746,7 +12746,7 @@
No first char
Need char = 'z'
Subject length lower bound = 2
-Starting byte set: a z
+Starting chars: a z
aaaaaaaaaaaaaz
No match
aaaaaaaaaaaaaz\Q10
@@ -12790,7 +12790,7 @@
First char = 'a'
Need char = 'z'
Subject length lower bound = 5
-No set of starting bytes
+No starting char list
/a*[bcd]/BZ
------------------------------------------------------------------
@@ -13902,7 +13902,7 @@
No first char
Need char = 'd'
Subject length lower bound = 1
-Starting byte set: a b c d
+Starting chars: a b c d
/[a-c]+d/DZS
------------------------------------------------------------------
@@ -13917,7 +13917,7 @@
No first char
Need char = 'd'
Subject length lower bound = 2
-Starting byte set: a b c
+Starting chars: a b c
/[a-c]?d/DZS
------------------------------------------------------------------
@@ -13932,7 +13932,7 @@
No first char
Need char = 'd'
Subject length lower bound = 1
-Starting byte set: a b c d
+Starting chars: a b c d
/[a-c]{4,6}d/DZS
------------------------------------------------------------------
@@ -13947,7 +13947,7 @@
No first char
Need char = 'd'
Subject length lower bound = 5
-Starting byte set: a b c
+Starting chars: a b c
/[a-c]{0,6}d/DZS
------------------------------------------------------------------
@@ -13962,7 +13962,7 @@
No first char
Need char = 'd'
Subject length lower bound = 1
-Starting byte set: a b c d
+Starting chars: a b c d
/-- End of special auto-possessive tests --/
Modified: code/trunk/testdata/testoutput21-16
===================================================================
--- code/trunk/testdata/testoutput21-16 2014-01-02 17:50:25 UTC (rev 1432)
+++ code/trunk/testdata/testoutput21-16 2014-01-03 15:15:00 UTC (rev 1433)
@@ -50,7 +50,7 @@
No first char
No need char
Subject length lower bound = 6
-No set of starting bytes
+No starting char list
<!testsaved16BE-1
Compiled pattern loaded from testsaved16BE-1
@@ -83,7 +83,7 @@
No first char
No need char
Subject length lower bound = 6
-No set of starting bytes
+No starting char list
<!testsaved32LE-1
Compiled pattern loaded from testsaved32LE-1
Modified: code/trunk/testdata/testoutput21-32
===================================================================
--- code/trunk/testdata/testoutput21-32 2014-01-02 17:50:25 UTC (rev 1432)
+++ code/trunk/testdata/testoutput21-32 2014-01-03 15:15:00 UTC (rev 1433)
@@ -62,7 +62,7 @@
No first char
No need char
Subject length lower bound = 6
-No set of starting bytes
+No starting char list
<!testsaved32BE-1
Compiled pattern loaded from testsaved32BE-1
@@ -95,6 +95,6 @@
No first char
No need char
Subject length lower bound = 6
-No set of starting bytes
+No starting char list
/-- End of testinput21 --/
Modified: code/trunk/testdata/testoutput22-16
===================================================================
--- code/trunk/testdata/testoutput22-16 2014-01-02 17:50:25 UTC (rev 1432)
+++ code/trunk/testdata/testoutput22-16 2014-01-03 15:15:00 UTC (rev 1433)
@@ -37,7 +37,7 @@
No first char
No need char
Subject length lower bound = 2
-No set of starting bytes
+No starting char list
<!testsaved16BE-2
Compiled pattern loaded from testsaved16BE-2
@@ -64,7 +64,7 @@
No first char
No need char
Subject length lower bound = 2
-No set of starting bytes
+No starting char list
<!testsaved32LE-2
Compiled pattern loaded from testsaved32LE-2
Modified: code/trunk/testdata/testoutput22-32
===================================================================
--- code/trunk/testdata/testoutput22-32 2014-01-02 17:50:25 UTC (rev 1432)
+++ code/trunk/testdata/testoutput22-32 2014-01-03 15:15:00 UTC (rev 1433)
@@ -49,7 +49,7 @@
No first char
No need char
Subject length lower bound = 2
-No set of starting bytes
+No starting char list
<!testsaved32BE-2
Compiled pattern loaded from testsaved32BE-2
@@ -76,6 +76,6 @@
No first char
No need char
Subject length lower bound = 2
-No set of starting bytes
+No starting char list
/-- End of testinput22 --/
Modified: code/trunk/testdata/testoutput23
===================================================================
--- code/trunk/testdata/testoutput23 2014-01-02 17:50:25 UTC (rev 1432)
+++ code/trunk/testdata/testoutput23 2014-01-03 15:15:00 UTC (rev 1433)
@@ -27,7 +27,7 @@
No first char
No need char
Subject length lower bound = 1
-Starting byte set: \x00 \x01 \x02 \x03 \x04 \x05 \x06 \x07 \x08 \x0a \x0b
+Starting chars: \x00 \x01 \x02 \x03 \x04 \x05 \x06 \x07 \x08 \x0a \x0b
\x0c \x0d \x0e \x0f \x10 \x11 \x12 \x13 \x14 \x15 \x16 \x17 \x18 \x19 \x1a
\x1b \x1c \x1d \x1e \x1f ! " # $ % & ' ( ) * + , - . / 0 1 2 3 4 5 6 7 8 9
: ; < = > ? @ A B C D E F G H I J K L M N O P Q R S T U V W X Y Z [ \ ] ^
@@ -54,7 +54,7 @@
No first char
No need char
Subject length lower bound = 1
-Starting byte set: \x00 \x01 \x02 \x03 \x04 \x05 \x06 \x07 \x08 \x09 \x0e
+Starting chars: \x00 \x01 \x02 \x03 \x04 \x05 \x06 \x07 \x08 \x09 \x0e
\x0f \x10 \x11 \x12 \x13 \x14 \x15 \x16 \x17 \x18 \x19 \x1a \x1b \x1c \x1d
\x1e \x1f \x20 ! " # $ % & ' ( ) * + , - . / 0 1 2 3 4 5 6 7 8 9 : ; < = >
? @ A B C D E F G H I J K L M N O P Q R S T U V W X Y Z [ \ ] ^ _ ` a b c
Modified: code/trunk/testdata/testoutput25
===================================================================
--- code/trunk/testdata/testoutput25 2014-01-02 17:50:25 UTC (rev 1432)
+++ code/trunk/testdata/testoutput25 2014-01-03 15:15:00 UTC (rev 1433)
@@ -74,7 +74,7 @@
No first char
No need char
Subject length lower bound = 1
-Starting byte set: \x00 \x01 \x02 \x03 \x04 \x05 \x06 \x07 \x08 \x0a \x0b
+Starting chars: \x00 \x01 \x02 \x03 \x04 \x05 \x06 \x07 \x08 \x0a \x0b
\x0c \x0d \x0e \x0f \x10 \x11 \x12 \x13 \x14 \x15 \x16 \x17 \x18 \x19 \x1a
\x1b \x1c \x1d \x1e \x1f ! " # $ % & ' ( ) * + , - . / 0 1 2 3 4 5 6 7 8 9
: ; < = > ? @ A B C D E F G H I J K L M N O P Q R S T U V W X Y Z [ \ ] ^
@@ -101,7 +101,7 @@
No first char
No need char
Subject length lower bound = 1
-Starting byte set: \x00 \x01 \x02 \x03 \x04 \x05 \x06 \x07 \x08 \x09 \x0e
+Starting chars: \x00 \x01 \x02 \x03 \x04 \x05 \x06 \x07 \x08 \x09 \x0e
\x0f \x10 \x11 \x12 \x13 \x14 \x15 \x16 \x17 \x18 \x19 \x1a \x1b \x1c \x1d
\x1e \x1f \x20 ! " # $ % & ' ( ) * + , - . / 0 1 2 3 4 5 6 7 8 9 : ; < = >
? @ A B C D E F G H I J K L M N O P Q R S T U V W X Y Z [ \ ] ^ _ ` a b c
Modified: code/trunk/testdata/testoutput3
===================================================================
--- code/trunk/testdata/testoutput3 2014-01-02 17:50:25 UTC (rev 1432)
+++ code/trunk/testdata/testoutput3 2014-01-03 15:15:00 UTC (rev 1433)
@@ -90,7 +90,7 @@
No first char
No need char
Subject length lower bound = 1
-Starting byte set: 0 1 2 3 4 5 6 7 8 9 A B C D E F G H I J K L M N O P
+Starting chars: 0 1 2 3 4 5 6 7 8 9 A B C D E F G H I J K L M N O P
Q R S T U V W X Y Z _ a b c d e f g h i j k l m n o p q r s t u v w x y z
/\w/ISLfr_FR
@@ -99,7 +99,7 @@
No first char
No need char
Subject length lower bound = 1
-Starting byte set: 0 1 2 3 4 5 6 7 8 9 A B C D E F G H I J K L M N O P
+Starting chars: 0 1 2 3 4 5 6 7 8 9 A B C D E F G H I J K L M N O P
Q R S T U V W X Y Z _ a b c d e f g h i j k l m n o p q r s t u v w x y z
\xAA \xB5 \xBA \xC0 \xC1 \xC2 \xC3 \xC4 \xC5 \xC6 \xC7 \xC8 \xC9 \xCA \xCB \xCC \xCD \xCE \xCF \xD0 \xD1 \xD2 \xD3 \xD4 \xD5 \xD6 \xD8 \xD9 \xDA \xDB \xDC \xDD \xDE \xDF \xE0 \xE1 \xE2
\xE3 \xE4 \xE5 \xE6 \xE7 \xE8 \xE9 \xEA \xEB \xEC \xED \xEE \xEF \xF0 \xF1 \xF2 \xF3 \xF4 \xF5 \xF6 \xF8 \xF9 \xFA \xFB \xFC \xFD \xFE \xFF
Modified: code/trunk/testdata/testoutput5
===================================================================
--- code/trunk/testdata/testoutput5 2014-01-02 17:50:25 UTC (rev 1432)
+++ code/trunk/testdata/testoutput5 2014-01-03 15:15:00 UTC (rev 1433)
@@ -1538,7 +1538,7 @@
No first char
No need char
Subject length lower bound = 1
-No set of starting bytes
+No starting char list
/[^\x{1234}]+?/iS8I
Capturing subpattern count = 0
@@ -1546,7 +1546,7 @@
No first char
No need char
Subject length lower bound = 1
-No set of starting bytes
+No starting char list
/[^\x{1234}]++/iS8I
Capturing subpattern count = 0
@@ -1554,7 +1554,7 @@
No first char
No need char
Subject length lower bound = 1
-No set of starting bytes
+No starting char list
/[^\x{1234}]{2}/iS8I
Capturing subpattern count = 0
@@ -1562,7 +1562,7 @@
No first char
No need char
Subject length lower bound = 2
-No set of starting bytes
+No starting char list
//<bsr_anycrlf><bsr_unicode>
Failed: inconsistent NEWLINE options at offset 0
Modified: code/trunk/testdata/testoutput8
===================================================================
--- code/trunk/testdata/testoutput8 2014-01-02 17:50:25 UTC (rev 1432)
+++ code/trunk/testdata/testoutput8 2014-01-03 15:15:00 UTC (rev 1433)
@@ -7232,7 +7232,7 @@
No first char
No need char
Subject length lower bound = 3
-Starting byte set: a d x
+Starting chars: a d x
terhjk;abcdaadsfe
0: abc
the quick xyz brown fox