[Pcre-svn] [836] code/trunk: Merging all the changes from the pcre16 branch into the trunk.

Author: Subversion repository
Date:
To: pcre-svn
Subject: [Pcre-svn] [836] code/trunk: Merging all the changes from the pcre16 branch into the trunk.

Revision: 836
          http://vcs.pcre.org/viewvc?view=rev&revision=836
Author:   ph10
Date:     2011-12-28 17:16:11 +0000 (Wed, 28 Dec 2011)

Log Message:
-----------
Merging all the changes from the pcre16 branch into the trunk.

Modified Paths:
--------------
    code/trunk/AUTHORS
    code/trunk/CMakeLists.txt
    code/trunk/ChangeLog
    code/trunk/LICENCE
    code/trunk/Makefile.am
    code/trunk/NEWS
    code/trunk/NON-UNIX-USE
    code/trunk/PrepareRelease
    code/trunk/README
    code/trunk/RunGrepTest
    code/trunk/RunTest
    code/trunk/RunTest.bat
    code/trunk/configure.ac
    code/trunk/dftables.c
    code/trunk/doc/html/pcreapi.html
    code/trunk/doc/html/pcrecallout.html
    code/trunk/doc/html/pcrecompat.html
    code/trunk/doc/html/pcrejit.html
    code/trunk/doc/html/pcrelimits.html
    code/trunk/doc/html/pcrematching.html
    code/trunk/doc/html/pcrepattern.html
    code/trunk/doc/html/pcretest.html
    code/trunk/doc/pcre.txt
    code/trunk/doc/pcreapi.3
    code/trunk/doc/pcrecallout.3
    code/trunk/doc/pcrecompat.3
    code/trunk/doc/pcrejit.3
    code/trunk/doc/pcrelimits.3
    code/trunk/doc/pcrepattern.3
    code/trunk/doc/pcretest.1
    code/trunk/doc/pcretest.txt
    code/trunk/libpcre.a.dev
    code/trunk/libpcre.pc.in
    code/trunk/maint/ManyConfigTests
    code/trunk/makevp_c.txt
    code/trunk/makevp_l.txt
    code/trunk/pcre-config.in
    code/trunk/pcre.h.in
    code/trunk/pcre_chartables.c.dist
    code/trunk/pcre_compile.c
    code/trunk/pcre_config.c
    code/trunk/pcre_dfa_exec.c
    code/trunk/pcre_exec.c
    code/trunk/pcre_fullinfo.c
    code/trunk/pcre_get.c
    code/trunk/pcre_globals.c
    code/trunk/pcre_internal.h
    code/trunk/pcre_jit_compile.c
    code/trunk/pcre_jit_test.c
    code/trunk/pcre_maketables.c
    code/trunk/pcre_newline.c
    code/trunk/pcre_ord2utf8.c
    code/trunk/pcre_refcount.c
    code/trunk/pcre_study.c
    code/trunk/pcre_tables.c
    code/trunk/pcre_ucd.c
    code/trunk/pcre_ucp_searchfuncs.c
    code/trunk/pcre_valid_utf8.c
    code/trunk/pcre_version.c
    code/trunk/pcre_xclass.c
    code/trunk/pcregrep.c
    code/trunk/pcreposix.c
    code/trunk/pcreposix.h
    code/trunk/pcretest.c
    code/trunk/perltest.pl
    code/trunk/sljit/sljitConfig.h
    code/trunk/sljit/sljitConfigInternal.h
    code/trunk/sljit/sljitExecAllocator.c
    code/trunk/sljit/sljitLir.c
    code/trunk/sljit/sljitLir.h
    code/trunk/sljit/sljitNativeARM_Thumb2.c
    code/trunk/sljit/sljitNativeARM_v5.c
    code/trunk/sljit/sljitNativeMIPS_32.c
    code/trunk/sljit/sljitNativeMIPS_common.c
    code/trunk/sljit/sljitNativePPC_32.c
    code/trunk/sljit/sljitNativePPC_64.c
    code/trunk/sljit/sljitNativePPC_common.c
    code/trunk/sljit/sljitNativeX86_32.c
    code/trunk/sljit/sljitNativeX86_64.c
    code/trunk/sljit/sljitNativeX86_common.c
    code/trunk/sljit/sljitUtils.c
    code/trunk/testdata/testinput1
    code/trunk/testdata/testinput10
    code/trunk/testdata/testinput11
    code/trunk/testdata/testinput12
    code/trunk/testdata/testinput13
    code/trunk/testdata/testinput14
    code/trunk/testdata/testinput15
    code/trunk/testdata/testinput2
    code/trunk/testdata/testinput4
    code/trunk/testdata/testinput5
    code/trunk/testdata/testinput6
    code/trunk/testdata/testinput7
    code/trunk/testdata/testinput8
    code/trunk/testdata/testinput9
    code/trunk/testdata/testoutput1
    code/trunk/testdata/testoutput10
    code/trunk/testdata/testoutput12
    code/trunk/testdata/testoutput13
    code/trunk/testdata/testoutput14
    code/trunk/testdata/testoutput15
    code/trunk/testdata/testoutput2
    code/trunk/testdata/testoutput4
    code/trunk/testdata/testoutput5
    code/trunk/testdata/testoutput6
    code/trunk/testdata/testoutput7
    code/trunk/testdata/testoutput8
    code/trunk/testdata/testoutput9

Added Paths:
-----------
    code/trunk/libpcre16.pc.in
    code/trunk/pcre16_byte_order.c
    code/trunk/pcre16_chartables.c
    code/trunk/pcre16_compile.c
    code/trunk/pcre16_config.c
    code/trunk/pcre16_dfa_exec.c
    code/trunk/pcre16_exec.c
    code/trunk/pcre16_fullinfo.c
    code/trunk/pcre16_get.c
    code/trunk/pcre16_globals.c
    code/trunk/pcre16_jit_compile.c
    code/trunk/pcre16_maketables.c
    code/trunk/pcre16_newline.c
    code/trunk/pcre16_ord2utf16.c
    code/trunk/pcre16_printint.c
    code/trunk/pcre16_refcount.c
    code/trunk/pcre16_string_utils.c
    code/trunk/pcre16_study.c
    code/trunk/pcre16_tables.c
    code/trunk/pcre16_ucd.c
    code/trunk/pcre16_utf16_utils.c
    code/trunk/pcre16_valid_utf16.c
    code/trunk/pcre16_version.c
    code/trunk/pcre16_xclass.c
    code/trunk/pcre_byte_order.c
    code/trunk/pcre_printint.c
    code/trunk/pcre_string_utils.c
    code/trunk/testdata/saved16
    code/trunk/testdata/saved8
    code/trunk/testdata/testinput16
    code/trunk/testdata/testinput17
    code/trunk/testdata/testinput18
    code/trunk/testdata/testinput19
    code/trunk/testdata/testinput20
    code/trunk/testdata/testoutput11-16
    code/trunk/testdata/testoutput11-8
    code/trunk/testdata/testoutput16
    code/trunk/testdata/testoutput17
    code/trunk/testdata/testoutput18
    code/trunk/testdata/testoutput19
    code/trunk/testdata/testoutput20

Removed Paths:
-------------
    code/trunk/pcre_info.c
    code/trunk/pcre_printint.src
    code/trunk/pcre_try_flipped.c
    code/trunk/testdata/testoutput11

Property Changed:
----------------
    code/trunk/

Property changes on: code/trunk
___________________________________________________________________
Name: svn:ignore
- .deps
.libs
CMakeCache.txt
CMakeFiles
DartTestfile.txt
INSTALL
Makefile
Makefile.in
Testing
aclocal.m4
autom4te.cache
cmake_install.cmake
config.guess
config.h
config.h.generic
config.h.in
config.log
config.status
config.sub
configure
depcomp
dftables
install-sh
libpcre.pc
libpcre.so
libpcrecpp.pc
libpcrecpp.so
libpcreposix.pc
libpcreposix.so
libtool
ltmain.sh
m4
missing
pcre.h
pcre.h.generic
pcre_chartables.c
pcre-config
pcre_jit_test
pcre_scanner_unittest
pcre_stringpiece.h
pcre_stringpiece_unittest
pcrecpparg.h
pcrecpp_unittest
pcredemo
pcregrep
pcretest
progress.make
stamp-h1
test3input
test3output
testNinput
testsavedregex
teststderr
teststdout
testtry

+ .deps
.libs
CMakeCache.txt
CMakeFiles
DartTestfile.txt
INSTALL
Makefile
Makefile.in
Testing
aclocal.m4
autom4te.cache
cmake_install.cmake
config.guess
config.h
config.h.generic
config.h.in
config.log
config.status
config.sub
configure
depcomp
dftables
install-sh
libpcre.pc
libpcre16.pc
libpcre.so
libpcrecpp.pc
libpcrecpp.so
libpcreposix.pc
libpcreposix.so
libtool
ltmain.sh
m4
missing
pcre.h
pcre.h.generic
pcre_chartables.c
pcre-config
pcre_jit_test
pcre_scanner_unittest
pcre_stringpiece.h
pcre_stringpiece_unittest
pcrecpparg.h
pcrecpp_unittest
pcredemo
pcregrep
pcretest
progress.make
stamp-h1
test3input
test3output
testNinput
testsavedregex
teststderr
teststdout
testtry

Modified: code/trunk/AUTHORS
===================================================================
--- code/trunk/AUTHORS    2011-12-28 16:10:09 UTC (rev 835)
+++ code/trunk/AUTHORS    2011-12-28 17:16:11 UTC (rev 836)
@@ -8,7 +8,7 @@
 University of Cambridge Computing Service,
 Cambridge, England.

@@ -19,7 +19,7 @@
 Email local part: hzmester
 Emain domain:     freemail.hu

@@ -30,7 +30,7 @@
 Email local part: hzmester
 Emain domain:     freemail.hu

-Copyright(c) 2009-2011 Zoltan Herczeg
+Copyright(c) 2009-2012 Zoltan Herczeg
All rights reserved.

@@ -39,7 +39,7 @@

 Written by:       Google Inc.

-Copyright (c) 2007-2011 Google Inc
+Copyright (c) 2007-2012 Google Inc
All rights reserved

####

Modified: code/trunk/CMakeLists.txt
===================================================================
--- code/trunk/CMakeLists.txt    2011-12-28 16:10:09 UTC (rev 835)
+++ code/trunk/CMakeLists.txt    2011-12-28 17:16:11 UTC (rev 836)
@@ -333,7 +333,8 @@
 SET(PCRE_HEADERS ${PROJECT_BINARY_DIR}/pcre.h)

SET(PCRE_SOURCES
- ${PROJECT_BINARY_DIR}/pcre_chartables.c
+ ${PROJECT_BINARY_DIR}/pcre_byte_order.c
+ pcre_chartables.c
pcre_compile.c
pcre_config.c
pcre_dfa_exec.c
@@ -349,7 +350,6 @@
pcre_refcount.c
pcre_study.c
pcre_tables.c
- pcre_try_flipped.c
pcre_ucd.c
pcre_valid_utf8.c
pcre_version.c

Modified: code/trunk/ChangeLog
===================================================================
--- code/trunk/ChangeLog    2011-12-28 16:10:09 UTC (rev 835)
+++ code/trunk/ChangeLog    2011-12-28 17:16:11 UTC (rev 836)
@@ -1,9 +1,35 @@
 ChangeLog for PCRE
 ------------------

-Version 8.21
+Version 8.30
------------

+1.  Renamed "isnumber" as "is_a_number" because in some Mac environments this
+    name is defined in ctype.h.
+    
+2.  Fixed a bug in the code for calculating the fixed length of lookbehind
+    assertions.
+    
+3.  Removed the function pcre_info(), which has been obsolete and deprecated
+    since it was replaced by pcre_fullinfo() in February 2000. 
+    
+4.  For a non-anchored pattern, if (*SKIP) was given with a name that did not
+    match a (*MARK), and the match failed at the start of the subject, a 
+    reference to memory before the start of the subject could occur. This bug 
+    was introduced by fix 17 of release 8.21.
+    
+5.  A reference to an unset group with zero minimum repetition was giving
+    totally wrong answers (in non-JavaScript-compatibility mode). For example,
+    /(another)?(\1?)test/ matched against "hello world test". This bug was 
+    introduced in release 8.13.
+    
+6.  Add support for 16-bit character strings (a large amount of work involving 
+    many changes and refactorings). 
+    
+
+Version 8.21 12-Dec-2011
+------------------------
+
 1.  Updating the JIT compiler.

 2.  JIT compiler now supports OP_NCREF, OP_RREF and OP_NRREF. New test cases
@@ -13,7 +39,7 @@
     PCRE_EXTRA_TABLES is not suported by JIT, and should be checked before
     calling _pcre_jit_exec. Some extra comments are added.

-4.  Mark settings inside atomic groups that do not contain any capturing
+4.  (*MARK) settings inside atomic groups that do not contain any capturing
     parentheses, for example, (?>a(*:m)), were not being passed out. This bug
     was introduced by change 18 for 8.20.

@@ -22,37 +48,101 @@

 6.  Lookbehinds such as (?<=a{2}b) that contained a fixed repetition were
     erroneously being rejected as "not fixed length" if PCRE_CASELESS was set.
-    This bug was probably introduced by change 9 of 8.13. 
-    
+    This bug was probably introduced by change 9 of 8.13.
+
 7.  While fixing 6 above, I noticed that a number of other items were being
-    incorrectly rejected as "not fixed length". This arose partly because newer 
+    incorrectly rejected as "not fixed length". This arose partly because newer
     opcodes had not been added to the fixed-length checking code. I have (a)
     corrected the bug and added tests for these items, and (b) arranged for an
     error to occur if an unknown opcode is encountered while checking for fixed
-    length instead of just assuming "not fixed length". The items that were 
-    rejected were: (*ACCEPT), (*COMMIT), (*FAIL), (*MARK), (*PRUNE), (*SKIP), 
-    (*THEN), \h, \H, \v, \V, and single character negative classes with fixed 
+    length instead of just assuming "not fixed length". The items that were
+    rejected were: (*ACCEPT), (*COMMIT), (*FAIL), (*MARK), (*PRUNE), (*SKIP),
+    (*THEN), \h, \H, \v, \V, and single character negative classes with fixed
     repetitions, e.g. [^a]{3}, with and without PCRE_CASELESS.
-    
+
 8.  A possessively repeated conditional subpattern such as (?(?=c)c|d)++ was
-    being incorrectly compiled and would have given unpredicatble results. 
-    
-9.  A possessively repeated subpattern with minimum repeat count greater than 
+    being incorrectly compiled and would have given unpredicatble results.
+
+9.  A possessively repeated subpattern with minimum repeat count greater than
     one behaved incorrectly. For example, (A){2,}+ behaved as if it was
-    (A)(A)++ which meant that, after a subsequent mismatch, backtracking into 
-    the first (A) could occur when it should not. 
-    
-10. Add a cast and remove a redundant test from the code. 
+    (A)(A)++ which meant that, after a subsequent mismatch, backtracking into
+    the first (A) could occur when it should not.

+10. Add a cast and remove a redundant test from the code.
+
11. JIT should use pcre_malloc/pcre_free for allocation.

 12. Updated pcre-config so that it no longer shows -L/usr/lib, which seems
-    best practice nowadays, and helps with cross-compiling. (If the exec_prefix 
-    is anything other than /usr, -L is still shown). 
-    
+    best practice nowadays, and helps with cross-compiling. (If the exec_prefix
+    is anything other than /usr, -L is still shown).
+
 13. In non-UTF-8 mode, \C is now supported in lookbehinds and DFA matching.

+14. Perl does not support \N without a following name in a [] class; PCRE now
+    also gives an error.

+15. If a forward reference was repeated with an upper limit of around 2000,
+    it caused the error "internal error: overran compiling workspace". The
+    maximum number of forward references (including repeats) was limited by the
+    internal workspace, and dependent on the LINK_SIZE. The code has been
+    rewritten so that the workspace expands (via pcre_malloc) if necessary, and
+    the default depends on LINK_SIZE. There is a new upper limit (for safety)
+    of around 200,000 forward references. While doing this, I also speeded up
+    the filling in of repeated forward references.
+
+16. A repeated forward reference in a pattern such as (a)(?2){2}(.) was
+    incorrectly expecting the subject to contain another "a" after the start.
+
+17. When (*SKIP:name) is activated without a corresponding (*MARK:name) earlier
+    in the match, the SKIP should be ignored. This was not happening; instead
+    the SKIP was being treated as NOMATCH. For patterns such as
+    /A(*MARK:A)A+(*SKIP:B)Z|AAC/ this meant that the AAC branch was never
+    tested.
+
+18. The behaviour of (*MARK), (*PRUNE), and (*THEN) has been reworked and is
+    now much more compatible with Perl, in particular in cases where the result
+    is a non-match for a non-anchored pattern. For example, if
+    /b(*:m)f|a(*:n)w/ is matched against "abc", the non-match returns the name
+    "m", where previously it did not return a name. A side effect of this
+    change is that for partial matches, the last encountered mark name is
+    returned, as for non matches. A number of tests that were previously not
+    Perl-compatible have been moved into the Perl-compatible test files. The
+    refactoring has had the pleasing side effect of removing one argument from
+    the match() function, thus reducing its stack requirements.
+
+19. If the /S+ option was used in pcretest to study a pattern using JIT,
+    subsequent uses of /S (without +) incorrectly behaved like /S+.
+
+21. Retrieve executable code size support for the JIT compiler and fixing
+    some warnings.
+
+22. A caseless match of a UTF-8 character whose other case uses fewer bytes did
+    not work when the shorter character appeared right at the end of the
+    subject string.
+
+23. Added some (int) casts to non-JIT modules to reduce warnings on 64-bit
+    systems.
+
+24. Added PCRE_INFO_JITSIZE to pass on the value from (21) above, and also
+    output it when the /M option is used in pcretest.
+
+25. The CheckMan script was not being included in the distribution. Also, added
+    an explicit "perl" to run Perl scripts from the PrepareRelease script
+    because this is reportedly needed in Windows.
+
+26. If study data was being save in a file and studying had not found a set of
+    "starts with" bytes for the pattern, the data written to the file (though
+    never used) was taken from uninitialized memory and so caused valgrind to
+    complain.
+
+27. Updated RunTest.bat as provided by Sheri Pierce.
+
+28. Fixed a possible uninitialized memory bug in pcre_jit_compile.c.
+
+29. Computation of memory usage for the table of capturing group names was
+    giving an unnecessarily large value.
+
+
 Version 8.20 21-Oct-2011
 ------------------------

Modified: code/trunk/LICENCE
===================================================================
--- code/trunk/LICENCE    2011-12-28 16:10:09 UTC (rev 835)
+++ code/trunk/LICENCE    2011-12-28 17:16:11 UTC (rev 836)
@@ -24,7 +24,7 @@
 University of Cambridge Computing Service,
 Cambridge, England.

@@ -35,7 +35,7 @@
 Email local part: hzmester
 Emain domain:     freemail.hu

@@ -46,7 +46,7 @@
 Email local part: hzmester
 Emain domain:     freemail.hu

-Copyright(c) 2009-2011 Zoltan Herczeg
+Copyright(c) 2009-2012 Zoltan Herczeg
All rights reserved.

@@ -55,7 +55,7 @@

Contributed by: Google Inc.

-Copyright (c) 2007-2011, Google Inc.
+Copyright (c) 2007-2012, Google Inc.
All rights reserved.

Modified: code/trunk/Makefile.am
===================================================================
--- code/trunk/Makefile.am    2011-12-28 16:10:09 UTC (rev 835)
+++ code/trunk/Makefile.am    2011-12-28 17:16:11 UTC (rev 836)
@@ -35,7 +35,6 @@
   doc/html/pcre_get_stringtable_entries.html \
   doc/html/pcre_get_substring.html \
   doc/html/pcre_get_substring_list.html \
-  doc/html/pcre_info.html \
   doc/html/pcre_jit_stack_alloc.html \
   doc/html/pcre_jit_stack_free.html \
   doc/html/pcre_maketables.html \
@@ -79,7 +78,7 @@
 dist_noinst_SCRIPTS =

# Some of the binaries we make are to be installed, and others are
-# (non-user-visible) helper programs needed to build libpcre.
+# (non-user-visible) helper programs needed to build libpcre or libpcre16.
bin_PROGRAMS =
noinst_PROGRAMS =

@@ -100,6 +99,7 @@
# These files are used in the preparation of a release
EXTRA_DIST += \
PrepareRelease \
+ CheckMan \
CleanTxt \
Detrail \
132html \
@@ -168,10 +168,15 @@

endif # WITH_REBUILD_CHARTABLES

+BUILT_SOURCES = pcre_chartables.c

## The main pcre library
+
+# Build the 8 bit library if it is enabled.
+if WITH_PCRE8
lib_LTLIBRARIES += libpcre.la
libpcre_la_SOURCES = \
+ pcre_byte_order.c \
pcre_compile.c \
pcre_config.c \
pcre_dfa_exec.c \
@@ -179,16 +184,15 @@
pcre_fullinfo.c \
pcre_get.c \
pcre_globals.c \
- pcre_info.c \
pcre_internal.h \
pcre_jit_compile.c \
pcre_maketables.c \
pcre_newline.c \
pcre_ord2utf8.c \
pcre_refcount.c \
+ pcre_string_utils.c \
pcre_study.c \
pcre_tables.c \
- pcre_try_flipped.c \
pcre_ucd.c \
pcre_valid_utf8.c \
pcre_version.c \
@@ -199,11 +203,45 @@
nodist_libpcre_la_SOURCES = \
pcre_chartables.c

-# The pcre_printint.src file is #included by some source files, so it must be
-# distributed. The pcre_chartables.c.dist file is the default version of
-# pcre_chartables.c, used unless --enable-rebuild-chartables is specified.
-EXTRA_DIST += pcre_printint.src pcre_chartables.c.dist
+endif # WITH_PCRE8

+# Build the 16 bit library if it is enabled.
+if WITH_PCRE16
+lib_LTLIBRARIES += libpcre16.la
+libpcre16_la_SOURCES = \
+ pcre16_byte_order.c \
+ pcre16_chartables.c \
+ pcre16_compile.c \
+ pcre16_config.c \
+ pcre16_dfa_exec.c \
+ pcre16_exec.c \
+ pcre16_fullinfo.c \
+ pcre16_get.c \
+ pcre16_globals.c \
+ pcre16_jit_compile.c \
+ pcre16_maketables.c \
+ pcre16_newline.c \
+ pcre16_ord2utf16.c \
+ pcre16_refcount.c \
+ pcre16_string_utils.c \
+ pcre16_study.c \
+ pcre16_tables.c \
+ pcre16_ucd.c \
+ pcre16_utf16_utils.c \
+ pcre16_valid_utf16.c \
+ pcre16_version.c \
+ pcre16_xclass.c
+
+## This file is generated as part of the building process, so don't distribute.
+nodist_libpcre16_la_SOURCES = \
+ pcre_chartables.c
+
+endif # WITH_PCRE16
+
+# The pcre_chartables.c.dist file is the default version of pcre_chartables.c,
+# used unless --enable-rebuild-chartables is specified.
+EXTRA_DIST += pcre_chartables.c.dist
+
# The JIT compiler lives in a separate directory, but its files are #included
# when pcre_jit_compile.c is processed, so they must be distributed.
EXTRA_DIST += \
@@ -224,7 +262,12 @@
sljit/sljitNativeX86_common.c \
sljit/sljitUtils.c

+if WITH_PCRE8
libpcre_la_LDFLAGS = $(EXTRA_LIBPCRE_LDFLAGS)
+endif # WITH_PCRE8
+if WITH_PCRE16
+libpcre16_la_LDFLAGS = $(EXTRA_LIBPCRE_LDFLAGS)
+endif # WITH_PCRE16

CLEANFILES += pcre_chartables.c

@@ -233,15 +276,23 @@
TESTS += pcre_jit_test
noinst_PROGRAMS += pcre_jit_test
pcre_jit_test_SOURCES = pcre_jit_test.c
-pcre_jit_test_LDADD = libpcre.la
+pcre_jit_test_LDADD =
+if WITH_PCRE8
+pcre_jit_test_LDADD += libpcre.la
+endif # WITH_PCRE8
+if WITH_PCRE16
+pcre_jit_test_LDADD += libpcre16.la
+endif # WITH_PCRE16
endif # WITH_JIT

## A version of the main pcre library that has a posix re API.
+if WITH_PCRE8
lib_LTLIBRARIES += libpcreposix.la
libpcreposix_la_SOURCES = \
pcreposix.c
libpcreposix_la_LDFLAGS = $(EXTRA_LIBPCREPOSIX_LDFLAGS)
libpcreposix_la_LIBADD = libpcre.la
+endif # WITH_PCRE8

## There's a C++ library as well.
if WITH_PCRE_CPP
@@ -282,13 +333,24 @@
EXTRA_DIST += RunTest.bat
bin_PROGRAMS += pcretest
pcretest_SOURCES = pcretest.c
-pcretest_LDADD = libpcreposix.la $(LIBREADLINE)
+pcretest_LDADD = $(LIBREADLINE)
+if WITH_PCRE8
+pcretest_SOURCES += pcre_printint.c
+pcretest_LDADD += libpcreposix.la
+endif # WITH_PCRE8
+if WITH_PCRE16
+pcretest_SOURCES += pcre16_printint.c
+pcretest_LDADD += libpcre16.la
+endif # WITH_PCRE16

+if WITH_PCRE8
TESTS += RunGrepTest
dist_noinst_SCRIPTS += RunGrepTest
bin_PROGRAMS += pcregrep
pcregrep_SOURCES = pcregrep.c
-pcregrep_LDADD = libpcreposix.la $(LIBZ) $(LIBBZ2)
+pcregrep_LDADD = $(LIBZ) $(LIBBZ2)
+pcregrep_LDADD += libpcreposix.la
+endif # WITH_PCRE8

 EXTRA_DIST += \
   testdata/grepinput \
@@ -315,6 +377,10 @@
   testdata/testinput13 \
   testdata/testinput14 \
   testdata/testinput15 \
+  testdata/testinput16 \
+  testdata/testinput17 \
+  testdata/testinput18 \
+  testdata/testinput19 \
   testdata/testoutput1 \
   testdata/testoutput2 \
   testdata/testoutput3 \
@@ -325,11 +391,16 @@
   testdata/testoutput8 \
   testdata/testoutput9 \
   testdata/testoutput10 \
-  testdata/testoutput11 \
+  testdata/testoutput11-8 \
+  testdata/testoutput11-16 \
   testdata/testoutput12 \
   testdata/testoutput13 \
   testdata/testoutput14 \
   testdata/testoutput15 \
+  testdata/testoutput16 \
+  testdata/testoutput17 \
+  testdata/testoutput18 \
+  testdata/testoutput19 \
   testdata/wintestinput3 \
   testdata/wintestoutput3 \
   perltest.pl
@@ -359,13 +430,12 @@
 # A PCRE user submitted the following addition, saying that it "will allow
 # anyone using the 'mingw32' compiler to simply type 'make pcre.dll' and get a
 # nice DLL for Windows use". (It is used by the pcre.dll target.)
-DLL_OBJS= pcre_compile.o pcre_config.o \
+DLL_OBJS= pcre_byte_order.o pcre_compile.o pcre_config.o \
     pcre_dfa_exec.o pcre_exec.o pcre_fullinfo.o pcre_get.o \
-    pcre_globals.o pcre_info.o pcre_jit_compile.o pcre_maketables.o \
+    pcre_globals.o pcre_jit_compile.o pcre_maketables.o \
     pcre_newline.o pcre_ord2utf8.o pcre_refcount.o \
-    pcre_study.o pcre_tables.o pcre_try_flipped.o \
-    pcre_ucd.o pcre_valid_utf8.o pcre_version.o \
-    pcre_chartables.o \
+    pcre_study.o pcre_tables.o pcre_ucd.o \
+    pcre_valid_utf8.o pcre_version.o pcre_chartables.o \
     pcre_xclass.o

# A PCRE user submitted the following addition, saying that it "will allow
@@ -378,6 +448,9 @@
# We have .pc files for pkg-config users.
pkgconfigdir = $(libdir)/pkgconfig
pkgconfig_DATA = libpcre.pc libpcreposix.pc
+if WITH_PCRE16
+pkgconfig_DATA += libpcre16.pc
+endif
if WITH_PCRE_CPP
pkgconfig_DATA += libpcrecpp.pc
endif
@@ -402,7 +475,6 @@
doc/pcre_get_stringtable_entries.3 \
doc/pcre_get_substring.3 \
doc/pcre_get_substring_list.3 \
- doc/pcre_info.3 \
doc/pcre_jit_stack_alloc.3 \
doc/pcre_jit_stack_free.3 \
doc/pcre_maketables.3 \

Modified: code/trunk/NEWS
===================================================================
--- code/trunk/NEWS    2011-12-28 16:10:09 UTC (rev 835)
+++ code/trunk/NEWS    2011-12-28 17:16:11 UTC (rev 836)
@@ -1,6 +1,13 @@
 News about PCRE releases
 ------------------------

+Release 8.21 12-Dec-2011
+------------------------
+
+This is almost entirely a bug-fix release. The only new feature is the ability
+to obtain the size of the memory used by the JIT compiler.
+
+
Release 8.20 21-Oct-2011
------------------------

Modified: code/trunk/NON-UNIX-USE
===================================================================
--- code/trunk/NON-UNIX-USE    2011-12-28 16:10:09 UTC (rev 835)
+++ code/trunk/NON-UNIX-USE    2011-12-28 17:16:11 UTC (rev 836)
@@ -97,6 +97,7 @@
      option if you have set up config.h with your configuration, or else use
      other -D settings to change the configuration as required.

+       pcre_byte_order.c
        pcre_chartables.c
        pcre_compile.c
        pcre_config.c
@@ -112,7 +113,6 @@
        pcre_refcount.c
        pcre_study.c
        pcre_tables.c
-       pcre_try_flipped.c
        pcre_ucd.c
        pcre_valid_utf8.c
        pcre_version.c

Modified: code/trunk/PrepareRelease
===================================================================
--- code/trunk/PrepareRelease    2011-12-28 16:10:09 UTC (rev 835)
+++ code/trunk/PrepareRelease    2011-12-28 17:16:11 UTC (rev 836)
@@ -37,7 +37,7 @@

# Check the remaining man pages

-../CheckMan *.1 *.3
+perl ../CheckMan *.1 *.3
if [ $? != 0 ] ; then exit 1; fi

 # Make Text form of the documentation. It needs some mangling to make it
@@ -64,7 +64,7 @@
             pcrelimits pcrestack ; do
   echo "  Processing $file.3"
   nroff -c -man $file.3 >$file.rawtxt
-  ../CleanTxt <$file.rawtxt >>pcre.txt
+  perl ../CleanTxt <$file.rawtxt >>pcre.txt
   /bin/rm $file.rawtxt
   echo "------------------------------------------------------------------------------" >>pcre.txt
   if [ "$file" != "pcresample" ] ; then
@@ -77,7 +77,7 @@
 for file in pcretest pcregrep pcre-config ; do
   echo Making $file.txt
   nroff -c -man $file.1 >$file.rawtxt
-  ../CleanTxt <$file.rawtxt >$file.txt
+  perl ../CleanTxt <$file.rawtxt >$file.txt
   /bin/rm $file.rawtxt
 done

@@ -126,7 +126,7 @@
for file in *.1 ; do
base=`basename $file .1`
echo " Making $base.html"
- ../132html -toc $base <$file >html/$base.html
+ perl ../132html -toc $base <$file >html/$base.html
done

 # Exclude table of contents for function summaries. It seems that expr
@@ -146,7 +146,7 @@
     toc=""
   fi
   echo "  Making $base.html"
-  ../132html $toc $base <$file >html/$base.html
+  perl ../132html $toc $base <$file >html/$base.html
   if [ $? != 0 ] ; then exit 1; fi
 done

@@ -194,6 +194,7 @@
pcreposix.h \
pcre.h.in \
pcre_internal.h
+ pcre_byte_order.c \
pcre_compile.c \
pcre_config.c \
pcre_dfa_exec.c \
@@ -210,7 +211,6 @@
pcre_refcount.c \
pcre_study.c \
pcre_tables.c \
- pcre_try_flipped.c \
pcre_ucp_searchfuncs.c \
pcre_valid_utf8.c \
pcre_version.c \
@@ -235,7 +235,7 @@
libpcreposix.def"

echo Detrailing
-./Detrail $files doc/p* doc/html/*
+perl ./Detrail $files doc/p* doc/html/*

echo Doing basic configure to get default pcre.h and config.h
# This is in case the caller has set aliases (as I do - PH)

Modified: code/trunk/README
===================================================================
--- code/trunk/README    2011-12-28 16:10:09 UTC (rev 835)
+++ code/trunk/README    2011-12-28 17:16:11 UTC (rev 836)
@@ -713,6 +713,7 @@
                             specified, by copying to pcre_chartables.c

   pcreposix.c             )
+  pcre_byte_order.c       )
   pcre_compile.c          )
   pcre_config.c           )
   pcre_dfa_exec.c         )
@@ -728,7 +729,6 @@
   pcre_refcount.c         )
   pcre_study.c            )
   pcre_tables.c           )
-  pcre_try_flipped.c      )
   pcre_ucd.c              )
   pcre_valid_utf8.c       )
   pcre_version.c          )

Modified: code/trunk/RunGrepTest
===================================================================
--- code/trunk/RunGrepTest    2011-12-28 16:10:09 UTC (rev 835)
+++ code/trunk/RunGrepTest    2011-12-28 17:16:11 UTC (rev 836)
@@ -66,7 +66,7 @@

# Check for the availability of UTF-8 support

-./pcretest -C | ./pcregrep "No UTF-8 support" >/dev/null
+./pcretest -C utf >/dev/null
utf8=$?

echo "---------------------------- Test 1 ------------------------------" >testtry
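
Note on the change above: the UTF check no longer greps the text printed by "pcretest -C"; it reads the exit status of "pcretest -C utf" instead. The old pipeline left 0 in $? when grep found "No UTF-8 support", so the sketch below assumes the new form keeps that meaning (0 = feature absent, non-zero = present). The probe() function is a hypothetical stand-in for pcretest so the fragment runs on its own; the "jit" option name and the message variables are illustrative only.

```shell
#!/bin/sh
# Sketch only: probe() is a hypothetical stand-in for "./pcretest -C <option>".
# Assumed convention: the exit status is the answer (0 = feature not built in,
# non-zero = built in), the reverse of the usual "0 means success" reading.
probe() {
  case "$1" in
    utf) return 1 ;;   # pretend UTF support is compiled in
    *)   return 0 ;;   # pretend every other feature is absent
  esac
}

# Capture the status straight away; any later command would overwrite $?.
# The "|| var=$?" form also keeps working if "set -e" is in effect.
utf8=0
probe utf >/dev/null || utf8=$?

jit=0
probe jit >/dev/null || jit=$?

if [ "$utf8" -eq 0 ] ; then
  utf_msg="UTF tests skipped"
else
  utf_msg="UTF tests enabled"
fi

if [ "$jit" -eq 0 ] ; then
  jit_msg="JIT tests skipped"
else
  jit_msg="JIT tests enabled"
fi

echo "$utf_msg"
echo "$jit_msg"
```

Reading the status directly avoids depending on the exact wording of the "-C" report and on pcregrep being built, which matters once an 8-bit-only tool such as pcregrep may be absent from a 16-bit-only build.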

Modified: code/trunk/RunTest
===================================================================
--- code/trunk/RunTest    2011-12-28 16:10:09 UTC (rev 835)
+++ code/trunk/RunTest    2011-12-28 17:16:11 UTC (rev 836)
@@ -18,7 +18,10 @@
 # two tests for JIT-specific features, one to be run when JIT support is
 # available, and one when it is not.

-# The arguments for this script can be individual test numbers, or the word
+# Whichever of the 8-bit and 16-bit libraries exist are tested. It is also
+# possible to select which to test by the arguments -8 or -16.
+
+# Other arguments for this script can be individual test numbers, or the word
# "valgrind", or "sim" followed by an argument to run cross-compiled
# executables under a simulator, for example:
#
@@ -26,6 +29,8 @@

valgrind=
sim=
+arg8=
+arg16=

# Select which tests to run; for those that are explicitly requested, check
# that the necessary optional facilities are available.
@@ -45,6 +50,11 @@
do13=no
do14=no
do15=no
+do16=no
+do17=no
+do18=no
+do19=no
+do20=no

 while [ $# -gt 0 ] ; do
   case $1 in
@@ -63,6 +73,13 @@
    13) do13=yes;;
    14) do14=yes;;
    15) do15=yes;;
+   16) do16=yes;;
+   17) do17=yes;;
+   18) do18=yes;;
+   19) do19=yes;;
+   20) do20=yes;; 
+   -8) arg8=yes;;
+  -16) arg16=yes;;
    valgrind) valgrind="valgrind -q --smc-check=all";;
    sim) shift; sim=$1;;
     *) echo "Unknown test number $1"; exit 1;;
@@ -93,44 +110,87 @@
 # strips only linefeeds from the output of a `backquoted` command. Hence the
 # alternative patterns.

-case `$sim ./pcretest -C | $sim ./pcregrep 'Internal link size'` in
- *2|*2[[:space:]]) link_size=2;;
- *3|*3[[:space:]]) link_size=3;;
- *4|*4[[:space:]]) link_size=4;;
- *) echo "Failed to find internal link size"; exit 1;;
-esac
+$sim ./pcretest -C linksize >/dev/null
+link_size=$?
+if [ $link_size -lt 2 ] ; then
+ echo "Failed to find internal link size"
+ exit 1
+fi
+if [ $link_size -gt 4 ] ; then
+ echo "Failed to find internal link size"
+ exit 1
+fi

-$sim ./pcretest -C | $sim ./pcregrep 'No UTF-8 support' >/dev/null
-utf8=$?
+# Both 8-bit and 16-bit character strings may be supported, but only one
+# need be.

-$sim ./pcretest -C | $sim ./pcregrep 'No Unicode properties support' >/dev/null
+$sim ./pcretest -C pcre8 >/dev/null
+support8=$?
+$sim ./pcretest -C pcre16 >/dev/null
+support16=$?
+if [ $(( $support8 + $support16 )) -eq 2 ] ; then
+  test8=
+  test16=-16
+  if [ "$arg8" = yes -a "$arg16" != yes ] ; then
+    test16=skip
+  fi
+  if [ "$arg16" = yes -a "$arg8" != yes ] ; then
+    test8=skip
+  fi
+else
+  if [ $support8 -ne 0 ] ; then
+    if [ "$arg16" = yes ] ; then
+      echo "Cannot run 16-bit library tests: 16-bit library not compiled"
+      exit 1
+    fi
+    test8=
+    test16=skip
+  else
+    if [ "$arg8" = yes ] ; then
+      echo "Cannot run 8-bit library tests: 8-bit library not compiled"
+      exit 1
+    fi
+    test8=skip
+    test16=-16
+  fi
+fi
+
+# UTF support always applies to both bit sizes if both are supported; we can't
+# have UTF-8 support without UTF-16 support (for example).
+
+$sim ./pcretest -C utf >/dev/null
+utf=$?
+
+$sim ./pcretest -C ucp >/dev/null
 ucp=$?

jitopt=
-$sim ./pcretest -C | $sim ./pcregrep 'No just-in-time compiler support' \
- >/dev/null
+$sim ./pcretest -C jit >/dev/null
jit=$?
if [ $jit -ne 0 ] ; then
jitopt=-s+
fi

-if [ $utf8 -eq 0 ] ; then
+if [ $utf -eq 0 ] ; then
   if [ $do4 = yes ] ; then
-    echo "Can't run test 4 because UTF-8 support is not configured"
+    echo "Can't run test 4 because UTF support is not configured"
     exit 1
   fi
   if [ $do5 = yes ] ; then
-    echo "Can't run test 5 because UTF-8 support is not configured"
+    echo "Can't run test 5 because UTF support is not configured"
     exit 1
   fi
-  if [ $do8 = yes ] ; then
-    echo "Can't run test 8 because UTF-8 support is not configured"
+  if [ $do9 = yes ] ; then
+    echo "Can't run test 9 because UTF support is not configured"
     exit 1
   fi
-  if [ $do12 = yes ] ; then
-    echo "Can't run test 12 because UTF-8 support is not configured"
+  if [ $do15 = yes ] ; then
+    echo "Can't run test 15 because UTF support is not configured"
     exit 1
   fi
+  if [ $do18 = yes ] ; then
+    echo "Can't run test 18 because UTF support is not configured"
+    exit 1
+  fi
 fi

 if [ $ucp -eq 0 ] ; then
@@ -138,35 +198,39 @@
     echo "Can't run test 6 because Unicode property support is not configured"
     exit 1
   fi
-  if [ $do9 = yes ] ; then
-    echo "Can't run test 9 because Unicode property support is not configured"
+  if [ $do7 = yes ] ; then
+    echo "Can't run test 7 because Unicode property support is not configured"
     exit 1
   fi
   if [ $do10 = yes ] ; then
     echo "Can't run test 10 because Unicode property support is not configured"
     exit 1
   fi
-  if [ $do13 = yes ] ; then
-    echo "Can't run test 12 because Unicode property support is not configured"
+  if [ $do16 = yes ] ; then
+    echo "Can't run test 16 because Unicode property support is not configured"
     exit 1
   fi
+  if [ $do19 = yes ] ; then
+    echo "Can't run test 19 because Unicode property support is not configured"
+    exit 1
+  fi
 fi

 if [ $link_size -ne 2 ] ; then
-  if [ $do10 = yes ] ; then
-    echo "Can't run test 10 because the link size ($link_size) is not 2"
+  if [ $do11 = yes ] ; then
+    echo "Can't run test 11 because the link size ($link_size) is not 2"
     exit 1
   fi
 fi

 if [ $jit -eq 0 ] ; then
-  if [ $do14 = "yes" ] ; then
-    echo "Can't run test 14 because JIT support is not configured"
+  if [ $do12 = "yes" ] ; then
+    echo "Can't run test 12 because JIT support is not configured"
     exit 1
   fi
 else
-  if [ $do15 = "yes" ] ; then
-    echo "Can't run test 15 because JIT support is configured"
+  if [ $do13 = "yes" ] ; then
+    echo "Can't run test 13 because JIT support is configured"
     exit 1
   fi
 fi
@@ -177,7 +241,8 @@
 if [ $do1  = no -a $do2  = no -a $do3  = no -a $do4  = no -a \
      $do5  = no -a $do6  = no -a $do7  = no -a $do8  = no -a \
      $do9  = no -a $do10 = no -a $do11 = no -a $do12 = no -a \
-     $do13 = no -a $do14 = no -a $do15 = no ] ; then
+     $do13 = no -a $do14 = no -a $do15 = no -a $do16 = no -a \
+     $do17 = no -a $do18 = no -a $do19 = no -a $do20 = no ] ; then
   do1=yes
   do2=yes
   do3=yes
@@ -193,6 +258,11 @@
   do13=yes
   do14=yes
   do15=yes
+  do16=yes
+  do17=yes
+  do18=yes
+  do19=yes
+  do20=yes 
 fi

# Show which release and which test data
@@ -201,12 +271,20 @@
echo PCRE C library tests using test data from $testdata
$sim ./pcretest /dev/null

+for bmode in "$test8" "$test16"; do
+  case "$bmode" in
+    skip) continue;;
+    -16)  if [ "$test8" != "skip" ] ; then echo ""; fi
+          bits=16; echo "---- Testing 16-bit library ----"; echo "";;
+    *)    bits=8; echo "---- Testing 8-bit library ----"; echo "";;
+  esac      
+   
 # Primary test, compatible with JIT and all versions of Perl >= 5.8

 if [ $do1 = yes ] ; then
-  echo "Test 1: main functionality (Compatible with Perl >= 5.8)"
+  echo "Test 1: main functionality (Compatible with Perl >= 5.10)"
   for opt in "" "-s" $jitopt; do
-    $sim $valgrind ./pcretest -q $opt $testdata/testinput1 testtry
+    $sim $valgrind ./pcretest -q $bmode $opt $testdata/testinput1 testtry
     if [ $? = 0 ] ; then
       $cf $testdata/testoutput1 testtry
       if [ $? != 0 ] ; then exit 1; fi
@@ -222,9 +300,9 @@
 # PCRE tests that are not JIT or Perl-compatible: API, errors, internals

 if [ $do2 = yes ] ; then
-  echo "Test 2: API, errors, internals, and non-Perl stuff"
+  echo "Test 2: API, errors, internals, and non-Perl stuff (not UTF-$bits)"
   for opt in "" "-s" $jitopt; do
-    $sim $valgrind ./pcretest -q $opt $testdata/testinput2 testtry
+    $sim $valgrind ./pcretest -q $bmode $opt $testdata/testinput2 testtry
     if [ $? = 0 ] ; then
       $cf $testdata/testoutput2 testtry
       if [ $? != 0 ] ; then exit 1; fi
@@ -278,7 +356,7 @@
   if [ "$locale" != "" ] ; then
     echo "Test 3: locale-specific features (using '$locale' locale)"
     for opt in "" "-s" $jitopt; do
-      $sim $valgrind ./pcretest -q $opt $infile testtry
+      $sim $valgrind ./pcretest -q $bmode $opt $infile testtry
       if [ $? = 0 ] ; then
         $cf $outfile testtry
         if [ $? != 0 ] ; then
@@ -304,15 +382,15 @@
   fi
 fi

-# Additional tests for UTF8 support
+# Additional tests for UTF support

 if [ $do4 = yes ] ; then
-  echo "Test 4: UTF-8 support (Compatible with Perl >= 5.8)"
-  if [ $utf8 -eq 0 ] ; then
-    echo "  Skipped because UTF-8 support is not available"
+  echo "Test 4: UTF-$bits support (Compatible with Perl >= 5.10)"
+  if [ $utf -eq 0 ] ; then
+    echo "  Skipped because UTF-$bits support is not available"
   else
     for opt in "" "-s" $jitopt; do
-      $sim $valgrind ./pcretest -q $opt $testdata/testinput4 testtry
+      $sim $valgrind ./pcretest -q $bmode $opt $testdata/testinput4 testtry
       if [ $? = 0 ] ; then
         $cf $testdata/testoutput4 testtry
         if [ $? != 0 ] ; then exit 1; fi
@@ -327,12 +405,12 @@
 fi

 if [ $do5 = yes ] ; then
-  echo "Test 5: API, internals, and non-Perl stuff for UTF-8 support"
-  if [ $utf8 -eq 0 ] ; then
-    echo "  Skipped because UTF-8 support is not available"
+  echo "Test 5: API, internals, and non-Perl stuff for UTF-$bits support"
+  if [ $utf -eq 0 ] ; then
+    echo "  Skipped because UTF-$bits support is not available"
   else
     for opt in "" "-s" $jitopt; do
-      $sim $valgrind ./pcretest -q $opt $testdata/testinput5 testtry
+      $sim $valgrind ./pcretest -q $bmode $opt $testdata/testinput5 testtry
       if [ $? = 0 ] ; then
         $cf $testdata/testoutput5 testtry
         if [ $? != 0 ] ; then exit 1; fi
@@ -348,11 +426,11 @@

 if [ $do6 = yes ] ; then
   echo "Test 6: Unicode property support (Compatible with Perl >= 5.10)"
-  if [ $utf8 -eq 0 -o $ucp -eq 0 ] ; then
+  if [ $utf -eq 0 -o $ucp -eq 0 ] ; then
     echo "  Skipped because Unicode property support is not available"
   else
     for opt in "" "-s" $jitopt; do
-      $sim $valgrind ./pcretest -q $opt $testdata/testinput6 testtry
+      $sim $valgrind ./pcretest -q $bmode $opt $testdata/testinput6 testtry
       if [ $? = 0 ] ; then
         $cf $testdata/testoutput6 testtry
         if [ $? != 0 ] ; then exit 1; fi
@@ -366,14 +444,36 @@
   fi
 fi

+# Test non-Perl-compatible Unicode property support
+
+if [ $do7 = yes ] ; then
+  echo "Test 7: API, internals, and non-Perl stuff for Unicode property support"
+  if [ $utf -eq 0 -o $ucp -eq 0 ] ; then
+    echo "  Skipped because Unicode property support is not available"
+  else
+    for opt in "" "-s" $jitopt; do
+      $sim $valgrind ./pcretest -q $bmode $opt $testdata/testinput7 testtry
+      if [ $? = 0 ] ; then
+        $cf $testdata/testoutput7 testtry
+        if [ $? != 0 ] ; then exit 1; fi
+      else exit 1
+      fi
+      if [ "$opt" = "-s" ] ; then echo "  OK with study"
+      elif [ "$opt" = "-s+" ] ; then echo "  OK with JIT study"
+      else echo "  OK"
+      fi
+    done
+  fi
+fi
+
 # Tests for DFA matching support

-if [ $do7 = yes ] ; then
-  echo "Test 7: DFA matching"
+if [ $do8 = yes ] ; then
+  echo "Test 8: DFA matching main functionality"
   for opt in "" "-s"; do
-    $sim $valgrind ./pcretest -q $opt -dfa $testdata/testinput7 testtry
+    $sim $valgrind ./pcretest -q $bmode $opt -dfa $testdata/testinput8 testtry
     if [ $? = 0 ] ; then
-      $cf $testdata/testoutput7 testtry
+      $cf $testdata/testoutput8 testtry
       if [ $? != 0 ] ; then exit 1; fi
     else exit 1
     fi
@@ -381,15 +481,15 @@
   done
 fi

-if [ $do8 = yes ] ; then
-  echo "Test 8: DFA matching with UTF-8"
-  if [ $utf8 -eq 0 ] ; then
-    echo "  Skipped because UTF-8 support is not available"
+if [ $do9 = yes ] ; then
+  echo "Test 9: DFA matching with UTF-$bits"
+  if [ $utf -eq 0 ] ; then
+    echo "  Skipped because UTF-$bits support is not available"
   else
     for opt in "" "-s"; do
-      $sim $valgrind ./pcretest -q $opt -dfa $testdata/testinput8 testtry
+      $sim $valgrind ./pcretest -q $bmode $opt -dfa $testdata/testinput9 testtry
       if [ $? = 0 ] ; then
-        $cf $testdata/testoutput8 testtry
+        $cf $testdata/testoutput9 testtry
         if [ $? != 0 ] ; then exit 1; fi
       else exit 1
       fi
@@ -398,15 +498,15 @@
   fi
 fi

-if [ $do9 = yes ] ; then
-  echo "Test 9: DFA matching with Unicode properties"
-  if [ $utf8 -eq 0 -o $ucp -eq 0 ] ; then
+if [ $do10 = yes ] ; then
+  echo "Test 10: DFA matching with Unicode properties"
+  if [ $utf -eq 0 -o $ucp -eq 0 ] ; then
     echo "  Skipped because Unicode property support is not available"
   else
     for opt in "" "-s"; do
-      $sim $valgrind ./pcretest -q $opt -dfa $testdata/testinput9 testtry
+      $sim $valgrind ./pcretest -q $bmode $opt -dfa $testdata/testinput10 testtry
       if [ $? = 0 ] ; then
-        $cf $testdata/testoutput9 testtry
+        $cf $testdata/testoutput10 testtry
         if [ $? != 0 ] ; then exit 1; fi
       else exit 1
       fi
@@ -419,19 +519,20 @@
 # is Unicode property support and the link size is 2. The actual tests are
 # mostly the same as in some of the above, but in this test we inspect some
 # offsets and sizes that require a known link size. This is a doublecheck for
-# the maintainer, just in case something changes unexpectely.
+# the maintainer, just in case something changes unexpectedly. The output from
+# this test is not the same in 8-bit and 16-bit modes.

-if [ $do10 = yes ] ; then
-  echo "Test 10: Internal offsets and code size tests"
+if [ $do11 = yes ] ; then
+  echo "Test 11: Internal offsets and code size tests"
   if [ $link_size -ne 2 ] ; then
     echo "  Skipped because link size is not 2"
   elif [ $ucp -eq 0 ] ; then
     echo "  Skipped because Unicode property support is not available"
   else
     for opt in "" "-s"; do
-      $sim $valgrind ./pcretest -q $opt $testdata/testinput10 testtry
+      $sim $valgrind ./pcretest -q $bmode $opt $testdata/testinput11 testtry
       if [ $? = 0 ] ; then
-        $cf $testdata/testoutput10 testtry
+        $cf $testdata/testoutput11-$bits testtry
         if [ $? != 0 ] ; then exit 1; fi
       else exit 1
       fi
@@ -440,35 +541,53 @@
   fi
 fi

-# Test of Perl >= 5.10 features without UTF8 support
+# Test JIT-specific features when JIT is available

-if [ $do11 = yes ] ; then
-  echo "Test 11: Features from Perl >= 5.10 without UTF8 support"
-  for opt in "" "-s" $jitopt; do
-    $sim $valgrind ./pcretest -q $opt $testdata/testinput11 testtry
+if [ $do12 = yes ] ; then
+  echo "Test 12: JIT-specific features (JIT available)"
+  if [ $jit -eq 0 ] ; then
+    echo "  Skipped because JIT is not available or not usable"
+  else
+    $sim $valgrind ./pcretest -q $bmode $testdata/testinput12 testtry
     if [ $? = 0 ] ; then
-      $cf $testdata/testoutput11 testtry
+      $cf $testdata/testoutput12 testtry
       if [ $? != 0 ] ; then exit 1; fi
     else exit 1
     fi
-    if [ "$opt" = "-s" ] ; then echo "  OK with study"
-    elif [ "$opt" = "-s+" ] ; then echo "  OK with JIT study"
-    else echo "  OK"
+    echo "  OK"
+  fi
+fi
+
+# Test JIT-specific features when JIT is not available
+
+if [ $do13 = yes ] ; then
+  echo "Test 13: JIT-specific features (JIT not available)"
+  if [ $jit -ne 0 ] ; then
+    echo "  Skipped because JIT is available"
+  else
+    $sim $valgrind ./pcretest -q $bmode $testdata/testinput13 testtry
+    if [ $? = 0 ] ; then
+      $cf $testdata/testoutput13 testtry
+      if [ $? != 0 ] ; then exit 1; fi
+    else exit 1
     fi
-  done
+    echo "  OK"
+  fi
 fi

-# Test of Perl >= 5.10 features with UTF8 support
+# Tests for 8-bit-specific features

-if [ $do12 = yes ] ; then
-  echo "Test 12: Features from Perl >= 5.10 with UTF8 support"
-  if [ $utf8 -eq 0 ] ; then
-    echo "  Skipped because UTF-8 support is not available"
-  else
+if [ "$do14" = yes ] ; then
+  echo "Test 14: specials for the basic 8-bit library"
+  if [ "$bits" = "16" ] ; then
+    echo "  Skipped when running 16-bit tests"
+  elif [ $utf -eq 0 ] ; then
+    echo "  Skipped because UTF-$bits support is not available"
+  else   
     for opt in "" "-s" $jitopt; do
-      $sim $valgrind ./pcretest -q $opt $testdata/testinput12 testtry
+      $sim $valgrind ./pcretest -q $bmode $opt $testdata/testinput14 testtry
       if [ $? = 0 ] ; then
-        $cf $testdata/testoutput12 testtry
+        $cf $testdata/testoutput14 testtry
         if [ $? != 0 ] ; then exit 1; fi
       else exit 1
       fi
@@ -480,17 +599,43 @@
   fi
 fi

-# Test non-Perl-compatible Unicode property support
+# Tests for 8-bit-specific features (needs UTF-8 support)

-if [ $do13 = yes ] ; then
-  echo "Test 13: API, internals, and non-Perl stuff for Unicode property support"
-  if [ $utf8 -eq 0 -o $ucp -eq 0 ] ; then
+if [ "$do15" = yes ] ; then
+  echo "Test 15: specials for the 8-bit library with UTF-8 support"
+  if [ "$bits" = "16" ] ; then
+    echo "  Skipped when running 16-bit tests"
+  elif [ $utf -eq 0 ] ; then
+    echo "  Skipped because UTF-$bits support is not available"
+  else   
+    for opt in "" "-s" $jitopt; do
+      $sim $valgrind ./pcretest -q $bmode $opt $testdata/testinput15 testtry
+      if [ $? = 0 ] ; then
+        $cf $testdata/testoutput15 testtry
+        if [ $? != 0 ] ; then exit 1; fi
+      else exit 1
+      fi
+      if [ "$opt" = "-s" ] ; then echo "  OK with study"
+      elif [ "$opt" = "-s+" ] ; then echo "  OK with JIT study"
+      else echo "  OK"
+      fi
+    done
+  fi
+fi
+
+# Tests for 8-bit-specific features (Unicode property support)
+
+if [ $do16 = yes ] ; then
+  echo "Test 16: specials for the 8-bit library with Unicode property support"
+  if [ "$bits" = "16" ] ; then
+    echo "  Skipped when running 16-bit tests"
+  elif [ $ucp -eq 0 ] ; then
     echo "  Skipped because Unicode property support is not available"
-  else
+  else   
     for opt in "" "-s" $jitopt; do
-      $sim $valgrind ./pcretest -q $opt $testdata/testinput13 testtry
+      $sim $valgrind ./pcretest -q $bmode $opt $testdata/testinput16 testtry
       if [ $? = 0 ] ; then
-        $cf $testdata/testoutput13 testtry
+        $cf $testdata/testoutput16 testtry
         if [ $? != 0 ] ; then exit 1; fi
       else exit 1
       fi
@@ -502,38 +647,98 @@
   fi
 fi

-# Test JIT-specific features when JIT is available
+# Tests for 16-bit-specific features

-if [ $do14 = yes ] ; then
-  echo "Test 14: JIT-specific features (JIT available)"
-  if [ $jit -eq 0 ] ; then
-    echo "  Skipped because JIT is not available or not usable"
-  else
-    $sim $valgrind ./pcretest -q $testdata/testinput14 testtry
-    if [ $? = 0 ] ; then
-      $cf $testdata/testoutput14 testtry
-      if [ $? != 0 ] ; then exit 1; fi
-    else exit 1
-    fi
-    echo "  OK"
+if [ $do17 = yes ] ; then
+  echo "Test 17: specials for the basic 16-bit library"
+  if [ "$bits" = "8" ] ; then
+    echo "  Skipped when running 8-bit tests"
+  else   
+    for opt in "" "-s" $jitopt; do
+      $sim $valgrind ./pcretest -q $bmode $opt $testdata/testinput17 testtry
+      if [ $? = 0 ] ; then
+        $cf $testdata/testoutput17 testtry
+        if [ $? != 0 ] ; then exit 1; fi
+      else exit 1
+      fi
+      if [ "$opt" = "-s" ] ; then echo "  OK with study"
+      elif [ "$opt" = "-s+" ] ; then echo "  OK with JIT study"
+      else echo "  OK"
+      fi
+    done
   fi
 fi

-# Test JIT-specific features when JIT is not available
+# Tests for 16-bit-specific features (UTF-16 support)

-if [ $do15 = yes ] ; then
-  echo "Test 15: JIT-specific features (JIT not available)"
-  if [ $jit -ne 0 ] ; then
-    echo "  Skipped because JIT is available"
-  else
-    $sim $valgrind ./pcretest -q $testdata/testinput15 testtry
-    if [ $? = 0 ] ; then
-      $cf $testdata/testoutput15 testtry
-      if [ $? != 0 ] ; then exit 1; fi
-    else exit 1
-    fi
-    echo "  OK"
+if [ $do18 = yes ] ; then
+  echo "Test 18: specials for the 16-bit library with UTF-16 support"
+  if [ "$bits" = "8" ] ; then
+    echo "  Skipped when running 8-bit tests"
+  elif [ $utf -eq 0 ] ; then
+    echo "  Skipped because UTF-$bits support is not available"
+  else   
+    for opt in "" "-s" $jitopt; do
+      $sim $valgrind ./pcretest -q $bmode $opt $testdata/testinput18 testtry
+      if [ $? = 0 ] ; then
+        $cf $testdata/testoutput18 testtry
+        if [ $? != 0 ] ; then exit 1; fi
+      else exit 1
+      fi
+      if [ "$opt" = "-s" ] ; then echo "  OK with study"
+      elif [ "$opt" = "-s+" ] ; then echo "  OK with JIT study"
+      else echo "  OK"
+      fi
+    done
   fi
 fi

+# Tests for 16-bit-specific features (Unicode property support)
+
+if [ $do19 = yes ] ; then
+  echo "Test 19: specials for the 16-bit library with Unicode property support"
+  if [ "$bits" = "8" ] ; then
+    echo "  Skipped when running 8-bit tests"
+  elif [ $ucp -eq 0 ] ; then
+    echo "  Skipped because Unicode property support is not available"
+  else   
+    for opt in "" "-s" $jitopt; do
+      $sim $valgrind ./pcretest -q $bmode $opt $testdata/testinput19 testtry
+      if [ $? = 0 ] ; then
+        $cf $testdata/testoutput19 testtry
+        if [ $? != 0 ] ; then exit 1; fi
+      else exit 1
+      fi
+      if [ "$opt" = "-s" ] ; then echo "  OK with study"
+      elif [ "$opt" = "-s+" ] ; then echo "  OK with JIT study"
+      else echo "  OK"
+      fi
+    done
+  fi
+fi
+
+# Tests for 16-bit-specific features in DFA non-UTF-16 mode
+
+if [ $do20 = yes ] ; then
+  echo "Test 20: DFA specials for the basic 16-bit library"
+  if [ "$bits" = "8" ] ; then
+    echo "  Skipped when running 8-bit tests"
+  else   
+    for opt in "" "-s"; do
+      $sim $valgrind ./pcretest -q $bmode $opt $testdata/testinput20 testtry
+      if [ $? = 0 ] ; then
+        $cf $testdata/testoutput20 testtry
+        if [ $? != 0 ] ; then exit 1; fi
+      else exit 1
+      fi
+      if [ "$opt" = "-s" ] ; then echo "  OK with study"
+      else echo "  OK"
+      fi
+    done
+  fi
+fi
+
+# End of loop for 8-bit/16-bit tests
+done
+
 # End
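
Editorial note: the RunTest changes above stop grepping the text printed by "pcretest -C" and read the exit status of "pcretest -C <feature>" instead. A non-zero status means the feature is present, and for "linksize" the status is the link size itself. The sketch below illustrates that idea without needing a PCRE build. probe is a hypothetical stand-in for ./pcretest -C with made-up return values, select_modes is a hypothetical helper, and the arg8/arg16 command-line overrides are left out.

```shell
#!/bin/sh
# Hypothetical stand-in for "./pcretest -C <feature>". The answer is carried
# in the exit status; the values here are invented for illustration only.
probe() {
  case "$1" in
    linksize) return 2 ;;   # link size reported directly as the status
    pcre8)    return 1 ;;   # 1 = 8-bit library compiled in
    pcre16)   return 0 ;;   # 0 = 16-bit library not compiled in
    *)        return 0 ;;
  esac
}

# Hypothetical helper mirroring the mode selection in the diff (without the
# arg8/arg16 overrides). Prints "test8=<value> test16=<value>".
select_modes() {
  support8=$1; support16=$2
  if [ $((support8 + support16)) -eq 2 ]; then
    test8=; test16=-16          # both libraries: run 8-bit, then 16-bit
  elif [ "$support8" -ne 0 ]; then
    test8=; test16=skip         # 8-bit only
  else
    test8=skip; test16=-16      # otherwise assume 16-bit only
  fi
  echo "test8=$test8 test16=$test16"
}

probe linksize; link_size=$?
probe pcre8;    support8=$?
probe pcre16;   support16=$?
modes=$(select_modes "$support8" "$support16")
echo "link_size=$link_size $modes"
```

An empty test8 value means "run with no extra option", which is why the script's main loop iterates over the quoted values "$test8" "$test16" and skips only the literal word skip.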

Modified: code/trunk/RunTest.bat
===================================================================
--- code/trunk/RunTest.bat    2011-12-28 16:10:09 UTC (rev 835)
+++ code/trunk/RunTest.bat    2011-12-28 17:16:11 UTC (rev 836)
@@ -18,6 +18,8 @@
 @rem 14 requires presense of jit support
 @rem 15 requires absence of jit support
 @rem Sheri P also added override tests for study and jit testing
+@rem JIT testing n/a for tests 7-10, removed JIT override test for them
+@rem removed override tests for 14-15

setlocal enabledelayedexpansion
if [%srcdir%]==[] (
@@ -27,7 +29,7 @@
if [%srcdir%]==[] (
if exist ..\..\testdata\ set srcdir=..\..)
if NOT exist "%srcdir%\testdata\" (
-Error: echo distribution testdata folder not found.
+echo Error: distribution testdata folder not found!
call :conferror
exit /b 1
goto :eof
@@ -41,14 +43,14 @@
echo pcregrep=%pcregrep%

if NOT exist "%pcregrep%" (
-echo Error: "%pcregrep%" not found.
+echo Error: "%pcregrep%" not found!
echo.
call :conferror
exit /b 1
)

if NOT exist "%pcretest%" (
-echo Error: "%pcretest%" not found.
+echo Error: "%pcretest%" not found!
echo.
call :conferror
exit /b 1
@@ -219,7 +221,7 @@
goto :eof

:do2
- call :runsub 2 testout "API, errors, internals, and non-Perl stuff" -q
+ call :runsub 2 testout "API, errors, internals, and non-Perl stuff (not UTF-8)" -q
call :runsub 2 testoutstudy "Test with Study Override" -q -s
if %jit% EQU 1 call :runsub 2 testoutjit "Test with JIT Override" -q -s+
goto :eof
@@ -263,7 +265,6 @@
:do7
call :runsub 7 testout "DFA matching" -q -dfa
call :runsub 7 testoutstudy "Test with Study Override" -q -dfa -s
- if %jit% EQU 1 call :runsub 7 testoutjit "Test with JIT Override" -q -dfa -s+
goto :eof

:do8
@@ -273,7 +274,6 @@
)
call :runsub 8 testout "DFA matching with UTF-8" -q -dfa
call :runsub 8 testoutstudy "Test with Study Override" -q -dfa -s
- if %jit% EQU 1 call :runsub 8 testoutjit "Test with JIT Override" -q -dfa -s+
goto :eof

:do9
@@ -283,7 +283,6 @@
)
call :runsub 9 testout "DFA matching with Unicode properties" -q -dfa
call :runsub 9 testoutstudy "Test with Study Override" -q -dfa -s
- if %jit% EQU 1 call :runsub 9 testoutjit "Test with JIT Override" -q -dfa -s+
goto :eof

:do10
@@ -293,7 +292,6 @@
)
call :runsub 10 testout "Internal offsets and code size tests" -q
call :runsub 10 testoutstudy "Test with Study Override" -q -s
- if %jit% EQU 1 call :runsub 10 testoutjit "Test with JIT Override" -q -s+
goto :eof

:do11
@@ -328,8 +326,6 @@
goto :eof
)
call :runsub 14 testout "JIT-specific features - have JIT" -q
- call :runsub 14 testoutstudy "Test with Study Override" -q -s
- call :runsub 14 testoutjit "Test with JIT Override" -q -s+
goto :eof

:do15
@@ -338,12 +334,12 @@
goto :eof
)
call :runsub 15 testout "JIT-specific features - no JIT" -q
- call :runsub 15 testoutstudy "Test with Study Override" -q -s
goto :eof

:conferror
-@echo Configuration error.
@echo.
+@echo Either your build is incomplete or you have a configuration error.
+@echo.
@echo If configured with cmake and executed via "make test" or the MSVC "RUN_TESTS"
@echo project, pcre_test.bat defines variables and automatically calls RunTest.bat.
@echo For manual testing of all available features, after configuring with cmake

Modified: code/trunk/configure.ac
===================================================================
--- code/trunk/configure.ac    2011-12-28 16:10:09 UTC (rev 835)
+++ code/trunk/configure.ac    2011-12-28 17:16:11 UTC (rev 836)
@@ -9,9 +9,9 @@
 dnl be defined as -RC2, for example. For real releases, it should be empty.

m4_define(pcre_major, [8])
-m4_define(pcre_minor, [21])
-m4_define(pcre_prerelease, [-RC1])
-m4_define(pcre_date, [2011-11-14])
+m4_define(pcre_minor, [30])
+m4_define(pcre_prerelease, [-PT1])
+m4_define(pcre_date, [2012-01-01])

# Libtool shared library interface versions (current:revision:age)
m4_define(libpcre_version, [0:1:0])
@@ -104,12 +104,24 @@
htmldir='${docdir}/html'
fi

+# Handle --disable-pcre8 (enabled by default)
+AC_ARG_ENABLE(pcre8,
+              AS_HELP_STRING([--disable-pcre8],
+                             [disable 8 bit character support]),
+              , enable_pcre8=unset)
+
+# Handle --enable-pcre16 (disabled by default)
+AC_ARG_ENABLE(pcre16,
+              AS_HELP_STRING([--enable-pcre16],
+                             [enable 16 bit character support]),
+              , enable_pcre16=unset)
+
 # Handle --disable-cpp. The substitution of enable_cpp is needed for use in
 # pcre-config.
 AC_ARG_ENABLE(cpp,
               AS_HELP_STRING([--disable-cpp],
                              [disable C++ support]),
-              , enable_cpp=yes)
+              , enable_cpp=unset)
 AC_SUBST(enable_cpp)

 # Handle --enable-jit (disabled by default)
@@ -133,13 +145,19 @@
 # Handle --enable-utf8 (disabled by default)
 AC_ARG_ENABLE(utf8,
               AS_HELP_STRING([--enable-utf8],
-                             [enable UTF-8 support (incompatible with --enable-ebcdic)]),
+                             [another name for --enable-utf. Kept only for compatibility reasons]),
               , enable_utf8=unset)

+# Handle --enable-utf (disabled by default)
+AC_ARG_ENABLE(utf,
+              AS_HELP_STRING([--enable-utf],
+                             [enable UTF-8/16 support (incompatible with --enable-ebcdic)]),
+              , enable_utf=unset)
+
 # Handle --enable-unicode-properties
 AC_ARG_ENABLE(unicode-properties,
               AS_HELP_STRING([--enable-unicode-properties],
-                             [enable Unicode properties support (implies --enable-utf8)]),
+                             [enable Unicode properties support (implies --enable-utf)]),
               , enable_unicode_properties=no)

 # Handle --enable-newline=NL
@@ -181,7 +199,7 @@
 # Handle --enable-ebcdic
 AC_ARG_ENABLE(ebcdic,
               AS_HELP_STRING([--enable-ebcdic],
-                             [assume EBCDIC coding rather than ASCII; incompatible with --enable-utf8; use only in (uncommon) EBCDIC environments; it implies --enable-rebuild-chartables]),
+                             [assume EBCDIC coding rather than ASCII; incompatible with --enable-utf; use only in (uncommon) EBCDIC environments; it implies --enable-rebuild-chartables]),
               , enable_ebcdic=no)

 # Handle --disable-stack-for-recursion
@@ -245,34 +263,76 @@
                            [default limit on internal recursion (default=MATCH_LIMIT)]),
             , with_match_limit_recursion=MATCH_LIMIT)

-# Make sure that if enable_unicode_properties was set, that UTF-8 support
-# is enabled.
-#
+# Copy enable_utf8 value to enable_utf for compatibility reasons
+if test "x$enable_utf8" != "xunset"
+then
+  if test "x$enable_utf" != "xunset"
+  then
+    AC_MSG_ERROR([--enable/disable-utf8 is kept only for compatibility reasons and its value is copied to --enable/disable-utf. Newer code must use --enable/disable-utf alone.])
+  fi
+  enable_utf=$enable_utf8
+fi
+
+# Set the default value for pcre8
+if test "x$enable_pcre8" = "xunset"
+then
+  enable_pcre8=yes
+fi
+
+# Set the default value for pcre16
+if test "x$enable_pcre16" = "xunset"
+then
+  enable_pcre16=no
+fi
+
+# Make sure enable_pcre8 or enable_pcre16 was set
+if test "x$enable_pcre8$enable_pcre16" = "xnono"
+then
+  AC_MSG_ERROR([Either 8 or 16 bit (or both) pcre library must be enabled])
+fi
+
+# Make sure that if enable_unicode_properties was set, that UTF support is enabled.
 if test "x$enable_unicode_properties" = "xyes"
 then
-  if test "x$enable_utf8" = "xno"
+  if test "x$enable_utf" = "xno"
   then
-    AC_MSG_ERROR([support for Unicode properties requires UTF-8 support])
+    AC_MSG_ERROR([support for Unicode properties requires UTF-8/16 support])
   fi
-  enable_utf8=yes
+  enable_utf=yes
 fi

-if test "x$enable_utf8" = "xunset"
+# enable_utf is disabled by default.
+if test "x$enable_utf" = "xunset"
then
- enable_utf8=no
+ enable_utf=no
fi

+# enable_cpp copies the value of enable_pcre8 by default
+if test "x$enable_cpp" = "xunset"
+then
+  enable_cpp=$enable_pcre8
+fi
+
+# Make sure that if enable_cpp was set, that enable_pcre8 support is enabled
+if test "x$enable_cpp" = "xyes"
+then
+  if test "x$enable_pcre8" = "xno"
+  then
+    AC_MSG_ERROR([C++ library requires pcre library with 8 bit characters])
+  fi
+fi
+
 # Make sure that if enable_ebcdic is set, rebuild_chartables is also enabled.
-# Also check that UTF-8 support is not requested, because PCRE cannot handle
-# EBCDIC and UTF-8 in the same build. To do so it would need to use different
+# Also check that UTF support is not requested, because PCRE cannot handle
+# EBCDIC and UTF in the same build. To do so it would need to use different
 # character constants depending on the mode.
 #
 if test "x$enable_ebcdic" = "xyes"
 then
   enable_rebuild_chartables=yes
-  if test "x$enable_utf8" = "xyes"
+  if test "x$enable_utf" = "xyes"
   then
-    AC_MSG_ERROR([support for EBCDIC and UTF-8 cannot be enabled at the same time])
+    AC_MSG_ERROR([support for EBCDIC and UTF-8/16 cannot be enabled at the same time])
   fi
 fi

@@ -410,10 +470,12 @@
AC_SUBST(pcre_have_bits_type_traits)

# Conditional compilation
+AM_CONDITIONAL(WITH_PCRE8, test "x$enable_pcre8" = "xyes")
+AM_CONDITIONAL(WITH_PCRE16, test "x$enable_pcre16" = "xyes")
AM_CONDITIONAL(WITH_PCRE_CPP, test "x$enable_cpp" = "xyes")
AM_CONDITIONAL(WITH_REBUILD_CHARTABLES, test "x$enable_rebuild_chartables" = "xyes")
AM_CONDITIONAL(WITH_JIT, test "x$enable_jit" = "xyes")
-AM_CONDITIONAL(WITH_UTF8, test "x$enable_utf8" = "xyes")
+AM_CONDITIONAL(WITH_UTF, test "x$enable_utf" = "xyes")

# Checks for typedefs, structures, and compiler characteristics.

@@ -482,6 +544,16 @@

# Here is where pcre specific defines are handled

+if test "$enable_pcre8" = "yes"; then
+  AC_DEFINE([SUPPORT_PCRE8], [], [
+    Define to enable the 8 bit PCRE library.])
+fi
+
+if test "$enable_pcre16" = "yes"; then
+  AC_DEFINE([SUPPORT_PCRE16], [], [
+    Define to enable the 16 bit PCRE library.])
+fi
+
 if test "$enable_jit" = "yes"; then
   AC_DEFINE([SUPPORT_JIT], [], [
     Define to enable support for Just-In-Time compiling.])
@@ -494,12 +566,12 @@
     Define to enable JIT support in pcregrep.])
 fi

-if test "$enable_utf8" = "yes"; then
-  AC_DEFINE([SUPPORT_UTF8], [], [
-    Define to enable support for the UTF-8 Unicode encoding. This will
-    work even in an EBCDIC environment, but it is incompatible with
-    the EBCDIC macro. That is, PCRE can support *either* EBCDIC code
-    *or* ASCII/UTF-8, but not both at once.])
+if test "$enable_utf" = "yes"; then
+  AC_DEFINE([SUPPORT_UTF], [], [
+    Define to enable support for the UTF-8/16 Unicode encoding. This
+    will work even in an EBCDIC environment, but it is incompatible
+    with the EBCDIC macro. That is, PCRE can support *either* EBCDIC
+    code *or* ASCII/UTF-8/16, but not both at once.])
 fi

 if test "$enable_unicode_properties" = "yes"; then
@@ -634,9 +706,9 @@
     character codes, define this macro as 1. On systems that can use
     "configure", this can be done via --enable-ebcdic. PCRE will then
     assume that all input strings are in EBCDIC. If you do not define
-    this macro, PCRE will assume input strings are ASCII or UTF-8 Unicode.
-    It is not possible to build a version of PCRE that supports both
-    EBCDIC and UTF-8.])
+    this macro, PCRE will assume input strings are ASCII or UTF-8/16
+    Unicode. It is not possible to build a version of PCRE that
+    supports both EBCDIC and UTF-8/16.])
 fi

 # Platform specific issues
@@ -720,7 +792,8 @@
 AC_CONFIG_FILES(
     Makefile
     libpcre.pc
-        libpcreposix.pc
+    libpcre16.pc
+    libpcreposix.pc
     libpcrecpp.pc
     pcre-config
     pcre.h
@@ -756,9 +829,11 @@
     Linker flags .................... : ${LDFLAGS}
     Extra libraries ................. : ${LIBS}

+    Build 8 bit pcre library ........ : ${enable_pcre8}
+    Build 16 bit pcre library ....... : ${enable_pcre16}
     Build C++ library ............... : ${enable_cpp}
     Enable JIT compiling support .... : ${enable_jit}
-    Enable UTF-8 support ............ : ${enable_utf8}
+    Enable UTF-8/16 support ......... : ${enable_utf}
     Unicode properties .............. : ${enable_unicode_properties}
     Newline char/sequence ........... : ${enable_newline}
     \R matches only ANYCRLF ......... : ${enable_bsr_anycrlf}
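
Editorial note: the configure.ac hunks above resolve three inputs into a single enable_utf value: the legacy --enable-utf8 switch, the new --enable-utf switch, and --enable-unicode-properties. The sketch below restates that resolution order as a plain shell function so it can be tried outside Autoconf. resolve_utf is a hypothetical name, the error texts are shortened, and "unset" stands for an option that was not given on the command line, as in the diff.

```shell
#!/bin/sh
# Hypothetical helper, not part of configure.ac.
# Usage: resolve_utf <enable_utf8> <enable_utf> <enable_unicode_properties>
# Prints the resolved enable_utf value, or an error and returns 1.
resolve_utf() {
  enable_utf8=$1; enable_utf=$2; enable_ucp=$3

  # Legacy switch: copy its value, but reject giving both switches at once.
  if [ "x$enable_utf8" != "xunset" ]; then
    if [ "x$enable_utf" != "xunset" ]; then
      echo "error: use --enable/disable-utf alone"
      return 1
    fi
    enable_utf=$enable_utf8
  fi

  # Unicode properties imply UTF support and conflict with an explicit "no".
  if [ "x$enable_ucp" = "xyes" ]; then
    if [ "x$enable_utf" = "xno" ]; then
      echo "error: Unicode properties require UTF support"
      return 1
    fi
    enable_utf=yes
  fi

  # UTF support is disabled by default.
  if [ "x$enable_utf" = "xunset" ]; then
    enable_utf=no
  fi
  echo "$enable_utf"
}

resolve_utf unset unset no    # nothing requested
resolve_utf yes   unset no    # legacy --enable-utf8 only
resolve_utf unset unset yes   # implied by --enable-unicode-properties
```

The order matters: the legacy value is copied first, so an explicit --disable-utf8 combined with --enable-unicode-properties is reported as a conflict rather than silently overridden.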

Modified: code/trunk/dftables.c
===================================================================
--- code/trunk/dftables.c    2011-12-28 16:10:09 UTC (rev 835)
+++ code/trunk/dftables.c    2011-12-28 17:16:11 UTC (rev 836)
@@ -6,7 +6,7 @@
 and semantics are as close as possible to those of the Perl 5 language.

                        Written by Philip Hazel
-           Copyright (c) 1997-2008 University of Cambridge
+           Copyright (c) 1997-2012 University of Cambridge

-----------------------------------------------------------------------------
Redistribution and use in source and binary forms, with or without
@@ -114,7 +114,7 @@
"#endif\n\n"
"#include \"pcre_internal.h\"\n\n");
fprintf(f,
- "const unsigned char _pcre_default_tables[] = {\n\n"
+ "const pcre_uint8 PRIV(default_tables)[] = {\n\n"
"/* This table is a lower casing table. */\n\n");

fprintf(f, " ");

Modified: code/trunk/doc/html/pcreapi.html
===================================================================
--- code/trunk/doc/html/pcreapi.html    2011-12-28 16:10:09 UTC (rev 835)
+++ code/trunk/doc/html/pcreapi.html    2011-12-28 17:16:11 UTC (rev 836)
@@ -649,6 +649,23 @@
 string (by default this causes the current matching alternative to fail). A
 pattern such as (\1)(a) succeeds when this option is set (assuming it can find
 an "a" in the subject), whereas it fails by default, for Perl compatibility.
+</P>
+<P>
+(3) \U matches an upper case "U" character; by default \U causes a compile
+time error (Perl uses \U to upper case subsequent characters).
+</P>
+<P>
+(4) \u matches a lower case "u" character unless it is followed by four
+hexadecimal digits, in which case the hexadecimal number defines the code point
+to match. By default, \u causes a compile time error (Perl uses it to upper
+case the following character).
+</P>
+<P>
+(5) \x matches a lower case "x" character unless it is followed by two
+hexadecimal digits, in which case the hexadecimal number defines the code point
+to match. By default, as in Perl, a hexadecimal number is always expected after
+\x, but it may have zero, one, or two digits (so, for example, \xz matches a
+binary zero character followed by z).
 <pre>
   PCRE_MULTILINE
 </pre>
@@ -1127,6 +1144,12 @@
 <a href="pcrejit.html"><b>pcrejit</b></a>
 documentation for details of what can and cannot be handled.
 <pre>
+  PCRE_INFO_JITSIZE
+</pre>
+If the pattern was successfully studied with the PCRE_STUDY_JIT_COMPILE option,
+return the size of the JIT compiled code, otherwise return zero. The fourth
+argument should point to a <b>size_t</b> variable.
+<pre>
   PCRE_INFO_LASTLITERAL
 </pre>
 Return the value of the rightmost literal byte that must exist in any matched
@@ -1235,10 +1258,13 @@
 <pre>
   PCRE_INFO_SIZE
 </pre>
-Return the size of the compiled pattern, that is, the value that was passed as
-the argument to <b>pcre_malloc()</b> when PCRE was getting memory in which to
-place the compiled data. The fourth argument should point to a <b>size_t</b>
-variable.
+Return the size of the compiled pattern. The fourth argument should point to a
+<b>size_t</b> variable. This value does not include the size of the <b>pcre</b>
+structure that is returned by <b>pcre_compile()</b>. The value that is passed as
+the argument to <b>pcre_malloc()</b> when <b>pcre_compile()</b> is getting memory
+in which to place the compiled data is the value returned by this option plus
+the size of the <b>pcre</b> structure. Studying a compiled pattern, with or
+without JIT, does not alter the value returned by this option.
 <pre>
   PCRE_INFO_STUDYSIZE
 </pre>
@@ -2486,7 +2512,7 @@
 </P>
 <br><a name="SEC24" href="#TOC1">REVISION</a><br>
 <P>
-Last updated: 23 September 2011
+Last updated: 02 December 2011
 <br>
 Copyright &copy; 1997-2011 University of Cambridge.
 <br>
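
Note on the pcreapi.html hunk above: the PCRE_JAVASCRIPT_COMPAT rule for \u
(four hexadecimal digits define a code point, otherwise \u is a literal "u")
can be sketched in C as below. This is only an illustration of the documented
rule and is not taken from pcre_compile.c; the helper name js_u_escape and its
return convention are assumptions made for the example.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper, not part of PCRE. `p` points just past a "\u" in a
   pattern. If four hexadecimal digits follow, their value is stored in *cp
   and 4 is returned (the number of pattern characters consumed). Otherwise
   'u' is stored and 0 is returned: the escape stands for a literal "u". */
static int js_u_escape(const char *p, unsigned int *cp)
{
    unsigned int value = 0;
    int i;

    assert(p != NULL && cp != NULL);
    for (i = 0; i < 4; i++) {
        unsigned char c = (unsigned char)p[i];
        if (!isxdigit(c)) {            /* also stops at the terminating NUL */
            *cp = 'u';
            return 0;
        }
        value = value * 16 +
                (unsigned int)(isdigit(c) ? c - '0' : tolower(c) - 'a' + 10);
    }
    *cp = value;
    return 4;
}
```

With this sketch, the text "00dc" yields code point 0xdc, while "12" (too few
digits) falls back to the literal "u" and consumes nothing.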

Modified: code/trunk/doc/html/pcrecallout.html
===================================================================
--- code/trunk/doc/html/pcrecallout.html    2011-12-28 16:10:09 UTC (rev 835)
+++ code/trunk/doc/html/pcrecallout.html    2011-12-28 17:16:11 UTC (rev 836)
@@ -189,9 +189,10 @@
 <P>
 The <i>mark</i> field is present from version 2 of the <i>pcre_callout</i>
 structure. In callouts from <b>pcre_exec()</b> it contains a pointer to the
-zero-terminated name of the most recently passed (*MARK) item in the match, or
-NULL if there are no (*MARK)s in the current matching path. In callouts from
-<b>pcre_dfa_exec()</b> this field always contains NULL.
+zero-terminated name of the most recently passed (*MARK), (*PRUNE), or (*THEN)
+item in the match, or NULL if no such items have been passed. Instances of
+(*PRUNE) or (*THEN) without a name do not obliterate a previous (*MARK). In
+callouts from <b>pcre_dfa_exec()</b> this field always contains NULL.
 </P>
 <br><a name="SEC4" href="#TOC1">RETURN VALUES</a><br>
 <P>
@@ -219,7 +220,7 @@
 </P>
 <br><a name="SEC6" href="#TOC1">REVISION</a><br>
 <P>
-Last updated: 26 August 2011
+Last updated: 30 November 2011
 <br>
 Copyright &copy; 1997-2011 University of Cambridge.
 <br>

Modified: code/trunk/doc/html/pcrecompat.html
===================================================================
--- code/trunk/doc/html/pcrecompat.html    2011-12-28 16:10:09 UTC (rev 835)
+++ code/trunk/doc/html/pcrecompat.html    2011-12-28 17:16:11 UTC (rev 836)
@@ -53,7 +53,8 @@
 own, matching a non-newline character, is supported.) In fact these are
 implemented by Perl's general string-handling and are not part of its pattern
 matching engine. If any of these are encountered by PCRE, an error is
-generated.
+generated by default. However, if the PCRE_JAVASCRIPT_COMPAT option is set,
+\U and \u are interpreted as JavaScript interprets them.
 </P>
 <P>
 6. The Perl escape sequences \p, \P, and \X are supported only if PCRE is
@@ -202,7 +203,7 @@
 REVISION
 </b><br>
 <P>
-Last updated: 09 October 2011
+Last updated: 14 November 2011
 <br>
 Copyright &copy; 1997-2011 University of Cambridge.
 <br>

Modified: code/trunk/doc/html/pcrejit.html
===================================================================
--- code/trunk/doc/html/pcrejit.html    2011-12-28 16:10:09 UTC (rev 835)
+++ code/trunk/doc/html/pcrejit.html    2011-12-28 17:16:11 UTC (rev 836)
@@ -20,10 +20,11 @@
 <li><a name="TOC5" href="#SEC5">RETURN VALUES FROM JIT EXECUTION</a>
 <li><a name="TOC6" href="#SEC6">SAVING AND RESTORING COMPILED PATTERNS</a>
 <li><a name="TOC7" href="#SEC7">CONTROLLING THE JIT STACK</a>
-<li><a name="TOC8" href="#SEC8">EXAMPLE CODE</a>
-<li><a name="TOC9" href="#SEC9">SEE ALSO</a>
-<li><a name="TOC10" href="#SEC10">AUTHOR</a>
-<li><a name="TOC11" href="#SEC11">REVISION</a>
+<li><a name="TOC8" href="#SEC8">JIT STACK FAQ</a>
+<li><a name="TOC9" href="#SEC9">EXAMPLE CODE</a>
+<li><a name="TOC10" href="#SEC10">SEE ALSO</a>
+<li><a name="TOC11" href="#SEC11">AUTHOR</a>
+<li><a name="TOC12" href="#SEC12">REVISION</a>
 </ul>
 <br><a name="SEC1" href="#TOC1">PCRE JUST-IN-TIME COMPILER SUPPORT</a><br>
 <P>
@@ -57,12 +58,18 @@
 fails.
 </P>
 <P>
-A program can tell if JIT support is available by calling <b>pcre_config()</b>
-with the PCRE_CONFIG_JIT option. The result is 1 when JIT is available, and 0
-otherwise. However, a simple program does not need to check this in order to
-use JIT. The API is implemented in a way that falls back to the ordinary PCRE
-code if JIT is not available.
+A program that is linked with PCRE 8.20 or later can tell if JIT support is
+available by calling <b>pcre_config()</b> with the PCRE_CONFIG_JIT option. The
+result is 1 when JIT is available, and 0 otherwise. However, a simple program
+does not need to check this in order to use JIT. The API is implemented in a
+way that falls back to the ordinary PCRE code if JIT is not available.
 </P>
+<P>
+If your program may sometimes be linked with versions of PCRE that are older
+than 8.20, but you want to use JIT when it is available, you can test
+the values of PCRE_MAJOR and PCRE_MINOR, or the existence of a JIT macro such
+as PCRE_CONFIG_JIT, for compile-time control of your code.
+</P>
 <br><a name="SEC3" href="#TOC1">SIMPLE USE OF JIT</a><br>
 <P>
 You have to do two things to make use of the JIT support in the simplest way:
@@ -75,6 +82,21 @@
       no longer needed instead of just freeing it yourself. This
       ensures that any JIT data is also freed.
 </pre>
+For a program that may be linked with pre-8.20 versions of PCRE, you can insert
+<pre>
+  #ifndef PCRE_STUDY_JIT_COMPILE
+  #define PCRE_STUDY_JIT_COMPILE 0
+  #endif
+</pre>
+so that no option is passed to <b>pcre_study()</b>, and then use something like
+this to free the study data:
+<pre>
+  #ifdef PCRE_CONFIG_JIT
+      pcre_free_study(study_ptr);
+  #else
+      pcre_free(study_ptr);
+  #endif
+</pre>
 In some circumstances you may need to call additional functions. These are
 described in the section entitled
 <a href="#stackcontrol">"Controlling the JIT stack"</a>
@@ -116,12 +138,8 @@
 <P>
 The unsupported pattern items are:
 <pre>
-  \C            match a single byte; not supported in UTF-8 mode
+  \C             match a single byte; not supported in UTF-8 mode
   (?Cn)          callouts
-  (?(&#60;name&#62;)...  conditional test on setting of a named subpattern
-  (?(R)...       conditional test on whole pattern recursion
-  (?(Rn)...      conditional test on recursion, by number
-  (?(R&name)...  conditional test on recursion, by name
   (*COMMIT)      )
   (*MARK)        )
   (*PRUNE)       ) the backtracking control verbs
@@ -167,7 +185,10 @@
 By default, it uses 32K on the machine stack. However, some large or
 complicated patterns need more than this. The error PCRE_ERROR_JIT_STACKLIMIT
 is given when there is not enough stack. Three functions are provided for
-managing blocks of memory for use as JIT stacks.
+managing blocks of memory for use as JIT stacks. There is further discussion
+about the use of JIT stacks in the section entitled
+<a href="#stackfaq">"JIT stack FAQ"</a>
+below.
 </P>
 <P>
 The <b>pcre_jit_stack_alloc()</b> function creates a JIT stack. Its arguments
@@ -234,9 +255,87 @@
 and <b>pcre_assign_jit_stack()</b> does nothing unless the <b>extra</b> argument
 is non-NULL and points to a <b>pcre_extra</b> block that is the result of a
 successful study with PCRE_STUDY_JIT_COMPILE.
+<a name="stackfaq"></a></P>
+<br><a name="SEC8" href="#TOC1">JIT STACK FAQ</a><br>
+<P>
+(1) Why do we need JIT stacks?
+<br>
+<br>
+PCRE (and JIT) is a recursive, depth-first engine, so it needs a stack where
+the local data of the current node is pushed before checking its child nodes.
+Allocating real machine stack on some platforms is difficult. For example, on
+PowerPC the stack chain needs to be updated every time we extend the stack.
+Although this is possible, the overhead of updating it decreases performance. So
+we do the recursion in memory.
 </P>
-<br><a name="SEC8" href="#TOC1">EXAMPLE CODE</a><br>
 <P>
+(2) Why don't we simply allocate blocks of memory with <b>malloc()</b>?
+<br>
+<br>
+Modern operating systems have a nice feature: they can reserve an address space
+instead of allocating memory. We can safely allocate memory pages inside this
+address space, so the stack could grow without moving memory data (this is
+important because of pointers). Thus we can allocate 1M address space, and use
+only a single memory page (usually 4K) if that is enough. However, we can still
+grow up to 1M anytime if needed.
+</P>
+<P>
+(3) Who "owns" a JIT stack?
+<br>
+<br>
+The owner of the stack is the user program, not the JIT studied pattern or
+anything else. The user program must ensure that while a stack is in use by
+<b>pcre_exec()</b> (that is, while it is assigned to the pattern currently
+running), it is not used by any other thread (to avoid overwriting the same
+memory area). The best practice for multithreaded programs is to allocate a
+stack for each thread, and return this stack through the JIT callback function.
+</P>
+<P>
+(4) When should a JIT stack be freed?
+<br>
+<br>
+You can free a JIT stack at any time, as long as it will not be used by
+<b>pcre_exec()</b> again. When you assign the stack to a pattern, only a pointer
+is set. There is no reference counting or any other magic. You can free the
+patterns and stacks in any order, anytime. Just <i>do not</i> call
+<b>pcre_exec()</b> with a pattern pointing to an already freed stack, as that
+will cause a segmentation fault. (Also, do not free a stack currently used by
+<b>pcre_exec()</b> in another thread). You can also replace the stack for a
+pattern at any time. You can even free the previous stack before assigning a
+replacement.
+</P>
+<P>
+(5) Should I allocate/free a stack every time before/after calling
+<b>pcre_exec()</b>?
+<br>
+<br>
+No, because this is too costly in terms of resources. However, you could
+implement a scheme that releases the stack if it has not been used for, say,
+two minutes. The JIT callback can help to achieve this without keeping a
+list of the currently JIT studied patterns.
+</P>
+<P>
+(6) OK, the stack is for long term memory allocation. But what happens if a
+pattern causes stack overflow with a stack of 1M? Is that 1M kept until the
+stack is freed?
+<br>
+<br>
+Especially on embedded systems, it might be a good idea to release
+memory sometimes without freeing the stack. There is no API for this at the
+moment. Probably a function call which returns with the currently allocated
+memory for any stack and another which allows releasing memory (shrinking the
+stack) would be a good idea if someone needs this.
+</P>
+<P>
+(7) This is too much of a headache. Isn't there any better solution for JIT
+stack handling?
+<br>
+<br>
+No, thanks to Windows. If POSIX threads were used everywhere, we could throw
+out this complicated API.
+</P>
+<br><a name="SEC9" href="#TOC1">EXAMPLE CODE</a><br>
+<P>
 This is a single-threaded example that specifies a JIT stack without using a
 callback.
 <pre>
@@ -260,22 +359,22 @@

</PRE>

- <a name="SEC9" href="#TOC1">SEE ALSO</a> 
+ <a name="SEC10" href="#TOC1">SEE ALSO</a> 

pcreapi(3)

- <a name="SEC10" href="#TOC1">AUTHOR</a> 
+ <a name="SEC11" href="#TOC1">AUTHOR</a> 

-Philip Hazel
+Philip Hazel (FAQ by Zoltan Herczeg)
 
University Computing Service
 
Cambridge CB2 3QH, England.
 

- <a name="SEC11" href="#TOC1">REVISION</a> 
+ <a name="SEC12" href="#TOC1">REVISION</a> 

-Last updated: 19 October 2011
+Last updated: 26 November 2011
 
Copyright © 1997-2011 University of Cambridge.
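
Note on FAQ item (2) in the pcrejit.html hunk above: the "reserve address
space, then allocate pages inside it" idea can be sketched on a POSIX system
with mmap() and mprotect(). This is a minimal sketch of the general technique
under that assumption, not the code used by PCRE's JIT; the region type and
the region_* function names are hypothetical.

```c
#define _DEFAULT_SOURCE            /* for MAP_ANONYMOUS on glibc */
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical growable region; not a PCRE data structure. */
typedef struct {
    unsigned char *base;    /* start of the reserved address range */
    size_t reserved;        /* bytes of address space reserved */
    size_t committed;       /* bytes currently usable, counted from base */
} region;

static size_t round_up(size_t n, size_t page)
{
    return (n + page - 1) / page * page;
}

/* Reserve `max` bytes of address space and make the first `start` usable. */
static int region_init(region *r, size_t start, size_t max)
{
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    void *p;

    assert(r != NULL && start <= max);
    start = round_up(start, page);
    max = round_up(max, page);
    /* PROT_NONE: the range is reserved but not yet readable or writable. */
    p = mmap(NULL, max, PROT_NONE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED) return -1;
    if (start > 0 && mprotect(p, start, PROT_READ | PROT_WRITE) != 0) {
        munmap(p, max);
        return -1;
    }
    r->base = p;
    r->reserved = max;
    r->committed = start;
    return 0;
}

/* Make at least `want` bytes usable. The base address never changes, so
   pointers into the region stay valid, which is the point of the FAQ item. */
static int region_grow(region *r, size_t want)
{
    size_t page = (size_t)sysconf(_SC_PAGESIZE);

    want = round_up(want, page);
    if (want <= r->committed) return 0;
    if (want > r->reserved) return -1;     /* cannot exceed the reservation */
    if (mprotect(r->base + r->committed, want - r->committed,
                 PROT_READ | PROT_WRITE) != 0)
        return -1;
    r->committed = want;
    return 0;
}

static void region_free(region *r)
{
    munmap(r->base, r->reserved);
    r->base = NULL;
    r->reserved = r->committed = 0;
}
```

A caller would reserve, for example, 1M with one usable page and call
region_grow() only when more is needed. On Windows the equivalent calls would
be VirtualAlloc() with MEM_RESERVE and later MEM_COMMIT.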

Modified: code/trunk/doc/html/pcrelimits.html
===================================================================
--- code/trunk/doc/html/pcrelimits.html    2011-12-28 16:10:09 UTC (rev 835)
+++ code/trunk/doc/html/pcrelimits.html    2011-12-28 17:16:11 UTC (rev 836)
@@ -37,6 +37,12 @@
 no more than 65535 capturing subpatterns.
 </P>
 <P>
+There is a limit to the number of forward references to subsequent subpatterns
+of around 200,000. Repeated forward references with fixed upper limits, for
+example, (?2){0,100} when subpattern number 2 is to the right, are included in
+the count. There is no limit to the number of backward references.
+</P>
+<P>
 The maximum length of name for a named subpattern is 32 characters, and the
 maximum number of named subpatterns is 10000.
 </P>
@@ -65,7 +71,7 @@
 REVISION
 </b><br>
 <P>
-Last updated: 24 August 2011
+Last updated: 30 November 2011
 <br>
 Copyright &copy; 1997-2011 University of Cambridge.
 <br>

Modified: code/trunk/doc/html/pcrematching.html
===================================================================
--- code/trunk/doc/html/pcrematching.html    2011-12-28 16:10:09 UTC (rev 835)
+++ code/trunk/doc/html/pcrematching.html    2011-12-28 17:16:11 UTC (rev 836)
@@ -164,9 +164,9 @@
 </P>
 <P>
 7. The \C escape sequence, which (in the standard algorithm) matches a single
-byte, even in UTF-8 mode, is not supported because the alternative algorithm
-moves through the subject string one character at a time, for all active paths
-through the tree.
+byte, even in UTF-8 mode, is not supported in UTF-8 mode, because the
+alternative algorithm moves through the subject string one character at a time,
+for all active paths through the tree.
 </P>
 <P>
 8. Except for (*FAIL), the backtracking control verbs such as (*PRUNE) are not
@@ -220,7 +220,7 @@
 </P>
 <br><a name="SEC8" href="#TOC1">REVISION</a><br>
 <P>
-Last updated: 17 November 2010
+Last updated: 19 November 2011
 <br>
 Copyright &copy; 1997-2010 University of Cambridge.
 <br>

Modified: code/trunk/doc/html/pcrepattern.html
===================================================================
--- code/trunk/doc/html/pcrepattern.html    2011-12-28 16:10:09 UTC (rev 835)
+++ code/trunk/doc/html/pcrepattern.html    2011-12-28 17:16:11 UTC (rev 836)
@@ -268,7 +268,8 @@
   \t        tab (hex 09)
   \ddd      character with octal code ddd, or back reference
   \xhh      character with hex code hh
-  \x{hhh..} character with hex code hhh..
+  \x{hhh..} character with hex code hhh.. (non-JavaScript mode)
+  \uhhhh    character with hex code hhhh (JavaScript mode only)
 </pre>
 The precise effect of \cx is as follows: if x is a lower case letter, it
 is converted to upper case. Then bit 6 of the character (hex 40) is inverted.
@@ -280,12 +281,12 @@
 0xc0 bits are flipped.)
 </P>
 <P>
-After \x, from zero to two hexadecimal digits are read (letters can be in
-upper or lower case). Any number of hexadecimal digits may appear between \x{
-and }, but the value of the character code must be less than 256 in non-UTF-8
-mode, and less than 2**31 in UTF-8 mode. That is, the maximum value in
-hexadecimal is 7FFFFFFF. Note that this is bigger than the largest Unicode code
-point, which is 10FFFF.
+By default, after \x, from zero to two hexadecimal digits are read (letters
+can be in upper or lower case). Any number of hexadecimal digits may appear
+between \x{ and }, but the value of the character code must be less than 256
+in non-UTF-8 mode, and less than 2**31 in UTF-8 mode. That is, the maximum
+value in hexadecimal is 7FFFFFFF. Note that this is bigger than the largest
+Unicode code point, which is 10FFFF.
 </P>
 <P>
 If characters other than hexadecimal digits appear between \x{ and }, or if
@@ -294,9 +295,17 @@
 following digits, giving a character whose value is zero.
 </P>
 <P>
+If the PCRE_JAVASCRIPT_COMPAT option is set, the interpretation of \x is
+as just described only when it is followed by two hexadecimal digits.
+Otherwise, it matches a literal "x" character. In JavaScript mode, support for
+code points greater than 256 is provided by \u, which must be followed by
+four hexadecimal digits; otherwise it matches a literal "u" character.
+</P>
+<P>
 Characters whose value is less than 256 can be defined by either of the two
-syntaxes for \x. There is no difference in the way they are handled. For
-example, \xdc is exactly the same as \x{dc}.
+syntaxes for \x (or by \u in JavaScript mode). There is no difference in the
+way they are handled. For example, \xdc is exactly the same as \x{dc} (or
+\u00dc in JavaScript mode).
 </P>
 <P>
 After \0 up to two further octal digits are read. If there are fewer than two
@@ -338,14 +347,27 @@
 </P>
 <P>
 All the sequences that define a single character value can be used both inside
-and outside character classes. In addition, inside a character class, the
-sequence \b is interpreted as the backspace character (hex 08). The sequences
-\B, \N, \R, and \X are not special inside a character class. Like any other
-unrecognized escape sequences, they are treated as the literal characters "B",
-"N", "R", and "X" by default, but cause an error if the PCRE_EXTRA option is
-set. Outside a character class, these sequences have different meanings.
+and outside character classes. In addition, inside a character class, \b is
+interpreted as the backspace character (hex 08).
 </P>
+<P>
+\N is not allowed in a character class. \B, \R, and \X are not special
+inside a character class. Like other unrecognized escape sequences, they are
+treated as the literal characters "B", "R", and "X" by default, but cause an
+error if the PCRE_EXTRA option is set. Outside a character class, these
+sequences have different meanings.
+</P>
 <br><b>
+Unsupported escape sequences
+</b><br>
+<P>
+In Perl, the sequences \l, \L, \u, and \U are recognized by its string
+handler and used to modify the case of following characters. By default, PCRE
+does not support these escape sequences. However, if the PCRE_JAVASCRIPT_COMPAT
+option is set, \U matches a "U" character, and \u can be used to define a
+character by code point, as described in the previous section.
+</P>
+<br><b>
 Absolute and relative back references
 </b><br>
 <P>
@@ -389,7 +411,8 @@
 There is also the single sequence \N, which matches a non-newline character.
 This is the same as
 <a href="#fullstopdot">the "." metacharacter</a>
-when PCRE_DOTALL is not set.
+when PCRE_DOTALL is not set. Perl also uses \N to match characters by name;
+PCRE does not support this.
 </P>
 <P>
 Each pair of lower and upper case escape sequences partitions the complete set
@@ -963,7 +986,8 @@
 <P>
 The escape sequence \N behaves like a dot, except that it is not affected by
 the PCRE_DOTALL option. In other words, it matches any character except one
-that signifies the end of a line.
+that signifies the end of a line. Perl also uses \N to match characters by
+name; PCRE does not support this.
 </P>
 <br><a name="SEC7" href="#TOC1">MATCHING A SINGLE BYTE</a><br>
 <P>
@@ -979,8 +1003,8 @@
 </P>
 <P>
 PCRE does not allow \C to appear in lookbehind assertions
-<a href="#lookbehind">(described below),</a>
-because in UTF-8 mode this would make it impossible to calculate the length of
+<a href="#lookbehind">(described below)</a>
+in UTF-8 mode, because this would make it impossible to calculate the length of
 the lookbehind.
 </P>
 <P>
@@ -1926,10 +1950,10 @@
 assertion fails.
 </P>
 <P>
-PCRE does not allow the \C escape (which matches a single byte in UTF-8 mode)
-to appear in lookbehind assertions, because it makes it impossible to calculate
-the length of the lookbehind. The \X and \R escapes, which can match
-different numbers of bytes, are also not permitted.
+In UTF-8 mode, PCRE does not allow the \C escape (which matches a single byte,
+even in UTF-8 mode) to appear in lookbehind assertions, because it makes it
+impossible to calculate the length of the lookbehind. The \X and \R escapes,
+which can match different numbers of bytes, are also not permitted.
 </P>
 <P>
 <a href="#subpatternsassubroutines">"Subroutine"</a>
@@ -2511,10 +2535,11 @@
 If any of these verbs are used in an assertion or in a subpattern that is
 called as a subroutine (whether or not recursively), their effect is confined
 to that subpattern; it does not extend to the surrounding pattern, with one
-exception: a *MARK that is encountered in a positive assertion <i>is</i> passed
-back (compare capturing parentheses in assertions). Note that such subpatterns
-are processed as anchored at the point where they are tested. Note also that
-Perl's treatment of subroutines is different in some cases.
+exception: the name from a (*MARK), (*PRUNE), or (*THEN) that is encountered in
+a successful positive assertion <i>is</i> passed back when a match succeeds
+(compare capturing parentheses in assertions). Note that such subpatterns are
+processed as anchored at the point where they are tested. Note also that Perl's
+treatment of subroutines is different in some cases.
 </P>
 <P>
 The new verbs make use of what was previously invalid syntax: an opening
@@ -2536,6 +2561,10 @@
 when calling <b>pcre_compile()</b> or <b>pcre_exec()</b>, or by starting the
 pattern with (*NO_START_OPT).
 </P>
+<P>
+Experiments with Perl suggest that it too has similar optimizations, sometimes
+leading to anomalous results.
+</P>
 <br><b>
 Verbs that act immediately
 </b><br>
@@ -2583,17 +2612,17 @@
 (*MARK) as you like in a pattern, and their names do not have to be unique.
 </P>
 <P>
-When a match succeeds, the name of the last-encountered (*MARK) is passed back
-to the caller via the <i>pcre_extra</i> data structure, as described in the
+When a match succeeds, the name of the last-encountered (*MARK) on the matching
+path is passed back to the caller via the <i>pcre_extra</i> data structure, as
+described in the
 <a href="pcreapi.html#extradata">section on <i>pcre_extra</i></a>
 in the
 <a href="pcreapi.html"><b>pcreapi</b></a>
-documentation. No data is returned for a partial match. Here is an example of
-<b>pcretest</b> output, where the /K modifier requests the retrieval and
-outputting of (*MARK) data:
+documentation. Here is an example of <b>pcretest</b> output, where the /K
+modifier requests the retrieval and outputting of (*MARK) data:
 <pre>
-  /X(*MARK:A)Y|X(*MARK:B)Z/K
-  XY
+    re&#62; /X(*MARK:A)Y|X(*MARK:B)Z/K
+  data&#62; XY
    0: XY
   MK: A
   XZ
@@ -2611,33 +2640,18 @@
 assertions.
 </P>
 <P>
-A name may also be returned after a failed match if the final path through the
-pattern involves (*MARK). However, unless (*MARK) used in conjunction with
-(*COMMIT), this is unlikely to happen for an unanchored pattern because, as the
-starting point for matching is advanced, the final check is often with an empty
-string, causing a failure before (*MARK) is reached. For example:
+After a partial match or a failed match, the name of the last encountered
+(*MARK) in the entire match process is returned. For example:
 <pre>
-  /X(*MARK:A)Y|X(*MARK:B)Z/K
-  XP
-  No match
-</pre>
-There are three potential starting points for this match (starting with X,
-starting with P, and with an empty string). If the pattern is anchored, the
-result is different:
-<pre>
-  /^X(*MARK:A)Y|^X(*MARK:B)Z/K
-  XP
+    re&#62; /X(*MARK:A)Y|X(*MARK:B)Z/K
+  data&#62; XP
   No match, mark = B
 </pre>
-PCRE's start-of-match optimizations can also interfere with this. For example,
-if, as a result of a call to <b>pcre_study()</b>, it knows the minimum
-subject length for a match, a shorter subject will not be scanned at all.
+Note that in this unanchored example the mark is retained from the match
+attempt that started at the letter "X". Subsequent match attempts starting at
+"P" and then with an empty string do not get as far as the (*MARK) item, but
+nevertheless do not reset it.
 </P>
-<P>
-Note that similar anomalies (though different in detail) exist in Perl, no
-doubt for the same reasons. The use of (*MARK) data after a failed match of an
-unanchored pattern is not recommended, unless (*COMMIT) is involved.
-</P>
 <br><b>
 Verbs that act after backtracking
 </b><br>
@@ -2675,8 +2689,8 @@
 unless PCRE's start-of-match optimizations are turned off, as shown in this
 <b>pcretest</b> example:
 <pre>
-  /(*COMMIT)abc/
-  xyzabc
+    re&#62; /(*COMMIT)abc/
+  data&#62; xyzabc
    0: abc
   xyzabc\Y
   No match
@@ -2697,10 +2711,8 @@
 the right, backtracking cannot cross (*PRUNE). In simple cases, the use of
 (*PRUNE) is just an alternative to an atomic group or possessive quantifier,
 but there are some uses of (*PRUNE) that cannot be expressed in any other way.
-The behaviour of (*PRUNE:NAME) is the same as (*MARK:NAME)(*PRUNE) when the
-match fails completely; the name is passed back if this is the final attempt.
-(*PRUNE:NAME) does not pass back a name if the match succeeds. In an anchored
-pattern (*PRUNE) has the same effect as (*COMMIT).
+The behaviour of (*PRUNE:NAME) is the same as (*MARK:NAME)(*PRUNE). In an
+anchored pattern (*PRUNE) has the same effect as (*COMMIT).
 <pre>
   (*SKIP)
 </pre>
@@ -2726,8 +2738,7 @@
 searched for the most recent (*MARK) that has the same name. If one is found,
 the "bumpalong" advance is to the subject position that corresponds to that
 (*MARK) instead of to where (*SKIP) was encountered. If no (*MARK) with a
-matching name is found, normal "bumpalong" of one character happens (that is,
-the (*SKIP) is ignored).
+matching name is found, the (*SKIP) is ignored.
 <pre>
   (*THEN) or (*THEN:NAME)
 </pre>
@@ -2741,9 +2752,8 @@
 If the COND1 pattern matches, FOO is tried (and possibly further items after
 the end of the group if FOO succeeds); on failure, the matcher skips to the
 second alternative and tries COND2, without backtracking into COND1. The
-behaviour of (*THEN:NAME) is exactly the same as (*MARK:NAME)(*THEN) if the
-overall match fails. If (*THEN) is not inside an alternation, it acts like
-(*PRUNE).
+behaviour of (*THEN:NAME) is exactly the same as (*MARK:NAME)(*THEN).
+If (*THEN) is not inside an alternation, it acts like (*PRUNE).
 </P>
 <P>
 Note that a subpattern that does not contain a | character is just a part of
@@ -2819,7 +2829,7 @@
 </P>
 <br><a name="SEC28" href="#TOC1">REVISION</a><br>
 <P>
-Last updated: 19 October 2011
+Last updated: 29 November 2011
 <br>
 Copyright &copy; 1997-2011 University of Cambridge.
 <br>
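
Note on the \x changes in the pcrepattern.html hunks above: the two
interpretations of \x without braces (default mode reads zero to two
hexadecimal digits; JavaScript mode requires exactly two, otherwise a literal
"x") can be sketched as below. This is an illustration of the documented
behaviour only; the function name x_escape, the js_compat flag and the return
convention are assumptions for the example, and the \x{hhh..} form is not
handled.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Convert one hexadecimal digit (already checked with isxdigit) to 0..15. */
static unsigned int hex_value(unsigned char c)
{
    return (unsigned int)(isdigit(c) ? c - '0' : tolower(c) - 'a' + 10);
}

/* Hypothetical helper, not part of PCRE. `p` points just past "\x".
   Stores the character the escape stands for in *value and returns the
   number of pattern characters consumed after "\x" (0, 1 or 2). */
static int x_escape(const char *p, int js_compat, unsigned int *value)
{
    unsigned int v = 0;
    int n = 0;

    assert(p != NULL && value != NULL);
    while (n < 2 && isxdigit((unsigned char)p[n])) {
        v = v * 16 + hex_value((unsigned char)p[n]);
        n++;
    }
    if (js_compat && n != 2) {
        *value = 'x';      /* JavaScript mode: not two digits, literal "x" */
        return 0;
    }
    *value = v;            /* default mode: zero digits give a binary zero */
    return n;
}
```

So for the documented example \xz, default mode yields a binary zero and
leaves "z" to be matched next, whereas JavaScript mode yields a literal "x".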

Modified: code/trunk/doc/html/pcretest.html
===================================================================
--- code/trunk/doc/html/pcretest.html    2011-12-28 16:10:09 UTC (rev 835)
+++ code/trunk/doc/html/pcretest.html    2011-12-28 17:16:11 UTC (rev 836)
@@ -364,7 +364,10 @@
 </P>
 <P>
 The <b>/M</b> modifier causes the size of memory block used to hold the compiled
-pattern to be output.
+pattern to be output. This does not include the size of the <b>pcre</b> block;
+it is just the actual compiled data. If the pattern is successfully studied
+with the PCRE_STUDY_JIT_COMPILE option, the size of the JIT compiled code is
+also output.
 </P>
 <P>
 If the <b>/S</b> modifier appears once, it causes <b>pcre_study()</b> to be
@@ -856,7 +859,7 @@
 </P>
 <br><a name="SEC15" href="#TOC1">REVISION</a><br>
 <P>
-Last updated: 26 August 2011
+Last updated: 02 December 2011
 <br>
 Copyright &copy; 1997-2011 University of Cambridge.
 <br>

Modified: code/trunk/doc/pcre.txt
===================================================================
--- code/trunk/doc/pcre.txt    2011-12-28 16:10:09 UTC (rev 835)
+++ code/trunk/doc/pcre.txt    2011-12-28 17:16:11 UTC (rev 836)
@@ -633,9 +633,9 @@
        always 1, and the value of the capture_last field is always -1.

        7.  The \C escape sequence, which (in the standard algorithm) matches a
-       single byte, even in UTF-8 mode, is not supported because the  alterna-
-       tive  algorithm  moves  through  the  subject string one character at a
-       time, for all active paths through the tree.
+       single byte, even in UTF-8  mode,  is  not  supported  in  UTF-8  mode,
+       because  the alternative algorithm moves through the subject string one
+       character at a time, for all active paths through the tree.

        8. Except for (*FAIL), the backtracking control verbs such as  (*PRUNE)
        are  not  supported.  (*FAIL)  is supported, and behaves like a failing
@@ -685,7 +685,7 @@

REVISION

-       Last updated: 17 November 2010
+       Last updated: 19 November 2011
        Copyright (c) 1997-2010 University of Cambridge.
 ------------------------------------------------------------------------------

@@ -1256,6 +1256,20 @@
        set  (assuming  it can find an "a" in the subject), whereas it fails by
        default, for Perl compatibility.

+       (3) \U matches an upper case "U" character; by default \U causes a com-
+       pile time error (Perl uses \U to upper case subsequent characters).
+
+       (4) \u matches a lower case "u" character unless it is followed by four
+       hexadecimal digits, in which case the hexadecimal  number  defines  the
+       code  point  to match. By default, \u causes a compile time error (Perl
+       uses it to upper case the following character).
+
+       (5) \x matches a lower case "x" character unless it is followed by  two
+       hexadecimal  digits,  in  which case the hexadecimal number defines the
+       code point to match. By default, as in Perl, a  hexadecimal  number  is
+       always expected after \x, but it may have zero, one, or two digits (so,
+       for example, \xz matches a binary zero character followed by z).
+
          PCRE_MULTILINE

        By default, PCRE treats the subject string as consisting  of  a  single
@@ -1710,6 +1724,12 @@
        compiler could not handle this particular pattern. See the pcrejit doc-
        umentation for details of what can and cannot be handled.

+         PCRE_INFO_JITSIZE
+
+       If the pattern was successfully studied with the PCRE_STUDY_JIT_COMPILE
+       option, return the size of the  JIT  compiled  code,  otherwise  return
+       zero. The fourth argument should point to a size_t variable.
+
          PCRE_INFO_LASTLITERAL

        Return  the  value of the rightmost literal byte that must exist in any
@@ -1818,10 +1838,14 @@

          PCRE_INFO_SIZE

-       Return  the  size  of the compiled pattern, that is, the value that was
-       passed as the argument to pcre_malloc() when PCRE was getting memory in
-       which to place the compiled data. The fourth argument should point to a
-       size_t variable.
+       Return  the  size  of  the compiled pattern. The fourth argument should
+       point to a size_t variable. This value does not include the size of the
+       pcre  structure  that  is returned by pcre_compile(). The value that is
+       passed as the argument to pcre_malloc() when pcre_compile() is  getting
+       memory  in  which  to  place the compiled data is the value returned by
+       this option plus the size of the pcre structure.  Studying  a  compiled
+       pattern, with or without JIT, does not alter the value returned by this
+       option.

          PCRE_INFO_STUDYSIZE

@@ -2980,7 +3004,7 @@

REVISION

-       Last updated: 23 September 2011
+       Last updated: 02 December 2011
        Copyright (c) 1997-2011 University of Cambridge.
 ------------------------------------------------------------------------------

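The JavaScript-compatibility rules for \x and \u documented in the pcreapi
hunks above can be sketched in a few lines of C. This is only an illustration
of the documented behaviour, not code taken from pcre_compile.c:
decode_escape(), read_hex(), and hex_value() are hypothetical names, the
\x{hhh..} form is left out, and a return value of -1 merely stands in for the
compile-time error.

```c
#include <assert.h>
#include <ctype.h>

/* Hypothetical helper (not a PCRE function): value of one hex digit.
   The caller must already have checked the character with isxdigit(). */
static int hex_value(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    return tolower((unsigned char)c) - 'a' + 10;
}

/* Hypothetical helper: read at most max_digits hexadecimal digits from s.
   Returns how many digits were read and stores their value in *value. */
static int read_hex(const char *s, int max_digits, unsigned *value)
{
    int n = 0;
    *value = 0;
    while (n < max_digits && isxdigit((unsigned char)s[n])) {
        *value = *value * 16 + (unsigned)hex_value(s[n]);
        n++;
    }
    return n;
}

/* Sketch of the documented rules for what follows "\x" or "\u".
   kind      : 'x' or 'u' (the escape letter)
   rest      : pattern text immediately after the escape letter
   js_compat : non-zero models PCRE_JAVASCRIPT_COMPAT being set
   consumed  : receives the number of characters of rest that were used
   Returns the code point to match, or -1 for a compile-time error. */
static long decode_escape(char kind, const char *rest, int js_compat,
                          int *consumed)
{
    unsigned value;
    int n;

    assert(kind == 'x' || kind == 'u');
    *consumed = 0;

    if (kind == 'x') {
        n = read_hex(rest, 2, &value);
        if (js_compat) {
            /* JavaScript mode: exactly two digits, else a literal "x". */
            if (n == 2) { *consumed = 2; return (long)value; }
            return 'x';
        }
        /* Default mode: zero, one, or two digits; no digits gives 0. */
        *consumed = n;
        return (long)value;
    }

    /* kind == 'u' */
    if (!js_compat) return -1;          /* default: compile-time error */
    n = read_hex(rest, 4, &value);
    if (n == 4) { *consumed = 4; return (long)value; }
    return 'u';                         /* fewer than four digits */
}
```

Under these assumptions, decode_escape('x', "z", 0, &n) yields 0 with n == 0,
which is the default-mode \xz case from the text (a binary zero followed by
z), while the same call with js_compat set yields the literal 'x'.
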
@@ -3143,9 +3167,11 @@

        The mark field is present from version 2 of the pcre_callout structure.
        In  callouts  from pcre_exec() it contains a pointer to the zero-termi-
-       nated name of the most recently passed (*MARK) item in  the  match,  or
-       NULL if there are no (*MARK)s in the current matching path. In callouts
-       from pcre_dfa_exec() this field always contains NULL.
+       nated name of the most recently passed (*MARK),  (*PRUNE),  or  (*THEN)
+       item in the match, or NULL if no such items have been passed. Instances
+       of (*PRUNE) or (*THEN) without a name  do  not  obliterate  a  previous
+       (*MARK).  In  callouts  from pcre_dfa_exec() this field always contains
+       NULL.

RETURN VALUES
@@ -3173,7 +3199,7 @@

REVISION

-       Last updated: 26 August 2011
+       Last updated: 30 November 2011
        Copyright (c) 1997-2011 University of Cambridge.
 ------------------------------------------------------------------------------

@@ -3218,7 +3244,9 @@
        its own, matching a non-newline character, is supported.) In fact these
        are implemented by Perl's general string-handling and are not  part  of
        its  pattern  matching engine. If any of these are encountered by PCRE,
-       an error is generated.
+       an error is generated by default. However, if the  PCRE_JAVASCRIPT_COM-
+       PAT  option  is set, \U and \u are interpreted as JavaScript interprets
+       them.

        6. The Perl escape sequences \p, \P, and \X are supported only if  PCRE
        is  built  with Unicode character property support. The properties that
@@ -3345,7 +3373,7 @@

REVISION

-       Last updated: 09 October 2011
+       Last updated: 14 November 2011
        Copyright (c) 1997-2011 University of Cambridge.
 ------------------------------------------------------------------------------

@@ -3572,7 +3600,8 @@
          \t        tab (hex 09)
          \ddd      character with octal code ddd, or back reference
          \xhh      character with hex code hh
-         \x{hhh..} character with hex code hhh..
+         \x{hhh..} character with hex code hhh.. (non-JavaScript mode)
+         \uhhhh    character with hex code hhhh (JavaScript mode only)

        The precise effect of \cx is as follows: if x is a lower  case  letter,
        it  is converted to upper case. Then bit 6 of the character (hex 40) is
@@ -3583,12 +3612,12 @@
        is compiled in EBCDIC mode, all byte values are  valid.  A  lower  case
        letter is converted to upper case, and then the 0xc0 bits are flipped.)

-       After  \x, from zero to two hexadecimal digits are read (letters can be
-       in upper or lower case). Any number of hexadecimal  digits  may  appear
-       between  \x{  and  },  but the value of the character code must be less
-       than 256 in non-UTF-8 mode, and less than 2**31 in UTF-8 mode. That is,
-       the  maximum value in hexadecimal is 7FFFFFFF. Note that this is bigger
-       than the largest Unicode code point, which is 10FFFF.
+       By  default,  after  \x,  from  zero to two hexadecimal digits are read
+       (letters can be in upper or lower case). Any number of hexadecimal dig-
+       its  may  appear between \x{ and }, but the value of the character code
+       must be less than 256 in non-UTF-8 mode, and less than 2**31  in  UTF-8
+       mode.  That is, the maximum value in hexadecimal is 7FFFFFFF. Note that
+       this is bigger than the largest Unicode code point, which is 10FFFF.

        If characters other than hexadecimal digits appear between \x{  and  },
        or if there is no terminating }, this form of escape is not recognized.
@@ -3596,9 +3625,17 @@
        escape,  with  no  following  digits, giving a character whose value is
        zero.

+       If the PCRE_JAVASCRIPT_COMPAT option is set, the interpretation  of  \x
+       is  as  just described only when it is followed by two hexadecimal dig-
+       its.  Otherwise, it matches a  literal  "x"  character.  In  JavaScript
+       mode, support for code points greater than 256 is provided by \u, which
+       must be followed by four hexadecimal digits;  otherwise  it  matches  a
+       literal "u" character.
+
        Characters whose value is less than 256 can be defined by either of the
-       two  syntaxes  for  \x. There is no difference in the way they are han-
-       dled. For example, \xdc is exactly the same as \x{dc}.
+       two syntaxes for \x (or by \u in JavaScript mode). There is no  differ-
+       ence in the way they are handled. For example, \xdc is exactly the same
+       as \x{dc} (or \u00dc in JavaScript mode).

        After \0 up to two further octal digits are read. If  there  are  fewer
        than  two  digits,  just  those  that  are  present  are used. Thus the
@@ -3642,13 +3679,23 @@

        All the sequences that define a single character value can be used both
        inside and outside character classes. In addition, inside  a  character
-       class,  the  sequence \b is interpreted as the backspace character (hex
-       08). The sequences \B, \N, \R, and \X are not special inside a  charac-
-       ter  class.  Like  any  other  unrecognized  escape sequences, they are
-       treated as the literal characters "B", "N", "R", and  "X"  by  default,
-       but cause an error if the PCRE_EXTRA option is set. Outside a character
-       class, these sequences have different meanings.
+       class, \b is interpreted as the backspace character (hex 08).

+       \N  is not allowed in a character class. \B, \R, and \X are not special
+       inside a character class. Like  other  unrecognized  escape  sequences,
+       they  are  treated  as  the  literal  characters  "B",  "R", and "X" by
+       default, but cause an error if the PCRE_EXTRA option is set. Outside  a
+       character class, these sequences have different meanings.
+
+   Unsupported escape sequences
+
+       In  Perl, the sequences \l, \L, \u, and \U are recognized by its string
+       handler and used  to  modify  the  case  of  following  characters.  By
+       default,  PCRE does not support these escape sequences. However, if the
+       PCRE_JAVASCRIPT_COMPAT option is set, \U matches a "U"  character,  and
+       \u can be used to define a character by code point, as described in the
+       previous section.
+
    Absolute and relative back references

        The sequence \g followed by an unsigned or a negative  number,  option-
@@ -3682,53 +3729,54 @@

        There is also the single sequence \N, which matches a non-newline char-
        acter.   This  is the same as the "." metacharacter when PCRE_DOTALL is
-       not set.
+       not set. Perl also uses \N to match characters by name; PCRE  does  not
+       support this.

-       Each pair of lower and upper case escape sequences partitions the  com-
-       plete  set  of  characters  into two disjoint sets. Any given character
-       matches one, and only one, of each pair. The sequences can appear  both
-       inside  and outside character classes. They each match one character of
-       the appropriate type. If the current matching point is at  the  end  of
-       the  subject string, all of them fail, because there is no character to
+       Each  pair of lower and upper case escape sequences partitions the com-
+       plete set of characters into two disjoint  sets.  Any  given  character
+       matches  one, and only one, of each pair. The sequences can appear both
+       inside and outside character classes. They each match one character  of
+       the  appropriate  type.  If the current matching point is at the end of
+       the subject string, all of them fail, because there is no character  to
        match.

-       For compatibility with Perl, \s does not match the VT  character  (code
-       11).   This makes it different from the the POSIX "space" class. The \s
-       characters are HT (9), LF (10), FF (12), CR (13), and  space  (32).  If
+       For  compatibility  with Perl, \s does not match the VT character (code
+       11).  This makes it different from the POSIX "space" class. The  \s
+       characters  are  HT  (9), LF (10), FF (12), CR (13), and space (32). If
        "use locale;" is included in a Perl script, \s may match the VT charac-
        ter. In PCRE, it never does.

-       A "word" character is an underscore or any character that is  a  letter
-       or  digit.   By  default,  the definition of letters and digits is con-
-       trolled by PCRE's low-valued character tables, and may vary if  locale-
-       specific  matching is taking place (see "Locale support" in the pcreapi
-       page). For example, in a French locale such  as  "fr_FR"  in  Unix-like
-       systems,  or "french" in Windows, some character codes greater than 128
-       are used for accented letters, and these are then matched  by  \w.  The
+       A  "word"  character is an underscore or any character that is a letter
+       or digit.  By default, the definition of letters  and  digits  is  con-
+       trolled  by PCRE's low-valued character tables, and may vary if locale-
+       specific matching is taking place (see "Locale support" in the  pcreapi
+       page).  For  example,  in  a French locale such as "fr_FR" in Unix-like
+       systems, or "french" in Windows, some character codes greater than  128
+       are  used  for  accented letters, and these are then matched by \w. The
        use of locales with Unicode is discouraged.

-       By  default,  in  UTF-8  mode,  characters with values greater than 128
-       never match \d, \s, or \w, and always  match  \D,  \S,  and  \W.  These
-       sequences  retain their original meanings from before UTF-8 support was
-       available, mainly for efficiency reasons. However, if PCRE is  compiled
-       with  Unicode property support, and the PCRE_UCP option is set, the be-
-       haviour is changed so that Unicode properties  are  used  to  determine
+       By default, in UTF-8 mode, characters  with  values  greater  than  128
+       never  match  \d,  \s,  or  \w,  and always match \D, \S, and \W. These
+       sequences retain their original meanings from before UTF-8 support  was
+       available,  mainly for efficiency reasons. However, if PCRE is compiled
+       with Unicode property support, and the PCRE_UCP option is set, the  be-
+       haviour  is  changed  so  that Unicode properties are used to determine
        character types, as follows:

          \d  any character that \p{Nd} matches (decimal digit)
          \s  any character that \p{Z} matches, plus HT, LF, FF, CR
          \w  any character that \p{L} or \p{N} matches, plus underscore

-       The  upper case escapes match the inverse sets of characters. Note that
-       \d matches only decimal digits, whereas \w matches any  Unicode  digit,
-       as  well as any Unicode letter, and underscore. Note also that PCRE_UCP
-       affects \b, and \B because they are defined in  terms  of  \w  and  \W.
+       The upper case escapes match the inverse sets of characters. Note  that
+       \d  matches  only decimal digits, whereas \w matches any Unicode digit,
+       as well as any Unicode letter, and underscore. Note also that  PCRE_UCP
+       affects  \b,  and  \B  because  they are defined in terms of \w and \W.
        Matching these sequences is noticeably slower when PCRE_UCP is set.

-       The  sequences  \h, \H, \v, and \V are features that were added to Perl
-       at release 5.10. In contrast to the other sequences, which  match  only
-       ASCII  characters  by  default,  these always match certain high-valued
-       codepoints in UTF-8 mode, whether or not PCRE_UCP is set. The  horizon-
+       The sequences \h, \H, \v, and \V are features that were added  to  Perl
+       at  release  5.10. In contrast to the other sequences, which match only
+       ASCII characters by default, these  always  match  certain  high-valued
+       codepoints  in UTF-8 mode, whether or not PCRE_UCP is set. The horizon-
        tal space characters are:

          U+0009     Horizontal tab
@@ -3763,104 +3811,104 @@

    Newline sequences

-       Outside  a  character class, by default, the escape sequence \R matches
+       Outside a character class, by default, the escape sequence  \R  matches
        any Unicode newline sequence. In non-UTF-8 mode \R is equivalent to the
        following:

          (?>\r\n|\n|\x0b|\f|\r|\x85)

-       This  is  an  example  of an "atomic group", details of which are given
+       This is an example of an "atomic group", details  of  which  are  given
        below.  This particular group matches either the two-character sequence
-       CR  followed  by  LF,  or  one  of  the single characters LF (linefeed,
+       CR followed by LF, or  one  of  the  single  characters  LF  (linefeed,
        U+000A), VT (vertical tab, U+000B), FF (formfeed, U+000C), CR (carriage
        return, U+000D), or NEL (next line, U+0085). The two-character sequence
        is treated as a single unit that cannot be split.

-       In UTF-8 mode, two additional characters whose codepoints  are  greater
+       In  UTF-8  mode, two additional characters whose codepoints are greater
        than 255 are added: LS (line separator, U+2028) and PS (paragraph sepa-
-       rator, U+2029).  Unicode character property support is not  needed  for
+       rator,  U+2029).   Unicode character property support is not needed for
        these characters to be recognized.

        It is possible to restrict \R to match only CR, LF, or CRLF (instead of
-       the complete set  of  Unicode  line  endings)  by  setting  the  option
+       the  complete  set  of  Unicode  line  endings)  by  setting the option
        PCRE_BSR_ANYCRLF either at compile time or when the pattern is matched.
        (BSR is an abbreviation for "backslash R".) This can be made the default
-       when  PCRE  is  built;  if this is the case, the other behaviour can be
-       requested via the PCRE_BSR_UNICODE option.   It  is  also  possible  to
-       specify  these  settings  by  starting a pattern string with one of the
+       when PCRE is built; if this is the case, the  other  behaviour  can  be
+       requested  via  the  PCRE_BSR_UNICODE  option.   It is also possible to
+       specify these settings by starting a pattern string  with  one  of  the
        following sequences:

          (*BSR_ANYCRLF)   CR, LF, or CRLF only
          (*BSR_UNICODE)   any Unicode newline sequence

-       These override the default and the options given to  pcre_compile()  or
-       pcre_compile2(),  but  they  can  be  overridden  by  options  given to
+       These  override  the default and the options given to pcre_compile() or
+       pcre_compile2(), but  they  can  be  overridden  by  options  given  to
        pcre_exec() or pcre_dfa_exec(). Note that these special settings, which
-       are  not  Perl-compatible,  are  recognized only at the very start of a
-       pattern, and that they must be in upper case. If more than one of  them
+       are not Perl-compatible, are recognized only at the  very  start  of  a
+       pattern,  and that they must be in upper case. If more than one of them
        is present, the last one is used. They can be combined with a change of
        newline convention; for example, a pattern can start with:

          (*ANY)(*BSR_ANYCRLF)

        They can also be combined with the (*UTF8) or (*UCP) special sequences.
-       Inside  a  character  class,  \R  is  treated as an unrecognized escape
+       Inside a character class, \R  is  treated  as  an  unrecognized  escape
        sequence, and so matches the letter "R" by default, but causes an error
        if PCRE_EXTRA is set.

    Unicode character properties

        When PCRE is built with Unicode character property support, three addi-
-       tional escape sequences that match characters with specific  properties
-       are  available.   When not in UTF-8 mode, these sequences are of course
-       limited to testing characters whose codepoints are less than  256,  but
+       tional  escape sequences that match characters with specific properties
+       are available.  When not in UTF-8 mode, these sequences are  of  course
+       limited  to  testing characters whose codepoints are less than 256, but
        they do work in this mode.  The extra escape sequences are:

          \p{xx}   a character with the xx property
          \P{xx}   a character without the xx property
          \X       an extended Unicode sequence

-       The  property  names represented by xx above are limited to the Unicode
+       The property names represented by xx above are limited to  the  Unicode
        script names, the general category properties, "Any", which matches any
-       character   (including  newline),  and  some  special  PCRE  properties
-       (described in the next section).  Other Perl properties such as  "InMu-
-       sicalSymbols"  are  not  currently supported by PCRE. Note that \P{Any}
+       character  (including  newline),  and  some  special  PCRE   properties
+       (described  in the next section).  Other Perl properties such as "InMu-
+       sicalSymbols" are not currently supported by PCRE.  Note  that  \P{Any}
        does not match any characters, so always causes a match failure.

        Sets of Unicode characters are defined as belonging to certain scripts.
-       A  character from one of these sets can be matched using a script name.
+       A character from one of these sets can be matched using a script  name.
        For example:

          \p{Greek}
          \P{Han}

-       Those that are not part of an identified script are lumped together  as
+       Those  that are not part of an identified script are lumped together as
        "Common". The current list of scripts is:

        Arabic, Armenian, Avestan, Balinese, Bamum, Bengali, Bopomofo, Braille,
-       Buginese, Buhid, Canadian_Aboriginal, Carian, Cham,  Cherokee,  Common,
-       Coptic,   Cuneiform,  Cypriot,  Cyrillic,  Deseret,  Devanagari,  Egyp-
-       tian_Hieroglyphs,  Ethiopic,  Georgian,  Glagolitic,   Gothic,   Greek,
-       Gujarati,  Gurmukhi,  Han,  Hangul,  Hanunoo,  Hebrew,  Hiragana, Impe-
+       Buginese,  Buhid,  Canadian_Aboriginal, Carian, Cham, Cherokee, Common,
+       Coptic,  Cuneiform,  Cypriot,  Cyrillic,  Deseret,  Devanagari,   Egyp-
+       tian_Hieroglyphs,   Ethiopic,   Georgian,  Glagolitic,  Gothic,  Greek,
+       Gujarati, Gurmukhi,  Han,  Hangul,  Hanunoo,  Hebrew,  Hiragana,  Impe-
        rial_Aramaic, Inherited, Inscriptional_Pahlavi, Inscriptional_Parthian,
-       Javanese,  Kaithi, Kannada, Katakana, Kayah_Li, Kharoshthi, Khmer, Lao,
+       Javanese, Kaithi, Kannada, Katakana, Kayah_Li, Kharoshthi, Khmer,  Lao,
        Latin,  Lepcha,  Limbu,  Linear_B,  Lisu,  Lycian,  Lydian,  Malayalam,
-       Meetei_Mayek,  Mongolian, Myanmar, New_Tai_Lue, Nko, Ogham, Old_Italic,
-       Old_Persian, Old_South_Arabian, Old_Turkic, Ol_Chiki,  Oriya,  Osmanya,
-       Phags_Pa,  Phoenician,  Rejang,  Runic, Samaritan, Saurashtra, Shavian,
-       Sinhala, Sundanese, Syloti_Nagri, Syriac,  Tagalog,  Tagbanwa,  Tai_Le,
-       Tai_Tham,  Tai_Viet,  Tamil,  Telugu,  Thaana, Thai, Tibetan, Tifinagh,
+       Meetei_Mayek, Mongolian, Myanmar, New_Tai_Lue, Nko, Ogham,  Old_Italic,
+       Old_Persian,  Old_South_Arabian,  Old_Turkic, Ol_Chiki, Oriya, Osmanya,
+       Phags_Pa, Phoenician, Rejang, Runic,  Samaritan,  Saurashtra,  Shavian,
+       Sinhala,  Sundanese,  Syloti_Nagri,  Syriac, Tagalog, Tagbanwa, Tai_Le,
+       Tai_Tham, Tai_Viet, Tamil, Telugu,  Thaana,  Thai,  Tibetan,  Tifinagh,
        Ugaritic, Vai, Yi.

        Each character has exactly one Unicode general category property, spec-
-       ified  by a two-letter abbreviation. For compatibility with Perl, nega-
-       tion can be specified by including a  circumflex  between  the  opening
-       brace  and  the  property  name.  For  example,  \p{^Lu} is the same as
+       ified by a two-letter abbreviation. For compatibility with Perl,  nega-
+       tion  can  be  specified  by including a circumflex between the opening
+       brace and the property name.  For  example,  \p{^Lu}  is  the  same  as
        \P{Lu}.

        If only one letter is specified with \p or \P, it includes all the gen-
-       eral  category properties that start with that letter. In this case, in
-       the absence of negation, the curly brackets in the escape sequence  are
+       eral category properties that start with that letter. In this case,  in
+       the  absence of negation, the curly brackets in the escape sequence are
        optional; these two examples have the same effect:

          \p{L}
@@ -3912,54 +3960,54 @@
          Zp    Paragraph separator
          Zs    Space separator

-       The  special property L& is also supported: it matches a character that
-       has the Lu, Ll, or Lt property, in other words, a letter  that  is  not
+       The special property L& is also supported: it matches a character  that
+       has  the  Lu,  Ll, or Lt property, in other words, a letter that is not
        classified as a modifier or "other".

-       The  Cs  (Surrogate)  property  applies only to characters in the range
-       U+D800 to U+DFFF. Such characters are not valid in UTF-8  strings  (see
+       The Cs (Surrogate) property applies only to  characters  in  the  range
+       U+D800  to  U+DFFF. Such characters are not valid in UTF-8 strings (see
        RFC 3629) and so cannot be tested by PCRE, unless UTF-8 validity check-
-       ing has been turned off (see the discussion  of  PCRE_NO_UTF8_CHECK  in
+       ing  has  been  turned off (see the discussion of PCRE_NO_UTF8_CHECK in
        the pcreapi page). Perl does not support the Cs property.

-       The  long  synonyms  for  property  names  that  Perl supports (such as
-       \p{Letter}) are not supported by PCRE, nor is it  permitted  to  prefix
+       The long synonyms for  property  names  that  Perl  supports  (such  as
+       \p{Letter})  are  not  supported by PCRE, nor is it permitted to prefix
        any of these properties with "Is".

        No character that is in the Unicode table has the Cn (unassigned) prop-
        erty.  Instead, this property is assumed for any code point that is not
        in the Unicode table.

-       Specifying  caseless  matching  does not affect these escape sequences.
+       Specifying caseless matching does not affect  these  escape  sequences.
        For example, \p{Lu} always matches only upper case letters.

-       The \X escape matches any number of Unicode  characters  that  form  an
+       The  \X  escape  matches  any number of Unicode characters that form an
        extended Unicode sequence. \X is equivalent to

          (?>\PM\pM*)

-       That  is,  it matches a character without the "mark" property, followed
-       by zero or more characters with the "mark"  property,  and  treats  the
-       sequence  as  an  atomic group (see below).  Characters with the "mark"
-       property are typically accents that  affect  the  preceding  character.
-       None  of  them  have  codepoints less than 256, so in non-UTF-8 mode \X
+       That is, it matches a character without the "mark"  property,  followed
+       by  zero  or  more  characters with the "mark" property, and treats the
+       sequence as an atomic group (see below).  Characters  with  the  "mark"
+       property  are  typically  accents  that affect the preceding character.
+       None of them have codepoints less than 256, so  in  non-UTF-8  mode  \X
        matches any one character.

        Note that recent versions of Perl have changed \X to match what Unicode
        calls an "extended grapheme cluster", which has a more complicated def-
        inition.

-       Matching characters by Unicode property is not fast, because  PCRE  has
-       to  search  a  structure  that  contains data for over fifteen thousand
+       Matching  characters  by Unicode property is not fast, because PCRE has
+       to search a structure that contains  data  for  over  fifteen  thousand
        characters. That is why the traditional escape sequences such as \d and
-       \w  do  not  use  Unicode properties in PCRE by default, though you can
+       \w do not use Unicode properties in PCRE by  default,  though  you  can
        make them do so by setting the PCRE_UCP option for pcre_compile() or by
        starting the pattern with (*UCP).

    PCRE's additional properties

-       As  well  as  the standard Unicode properties described in the previous
-       section, PCRE supports four more that make it possible to convert  tra-
+       As well as the standard Unicode properties described  in  the  previous
+       section,  PCRE supports four more that make it possible to convert tra-
        ditional escape sequences such as \w and \s and POSIX character classes
        to use Unicode properties. PCRE uses these non-standard, non-Perl prop-
        erties internally when PCRE_UCP is set. They are:
@@ -3969,40 +4017,40 @@
          Xsp   Any Perl space character
          Xwd   Any Perl "word" character

-       Xan  matches  characters that have either the L (letter) or the N (num-
-       ber) property. Xps matches the characters tab, linefeed, vertical  tab,
-       formfeed,  or  carriage  return, and any other character that has the Z
+       Xan matches characters that have either the L (letter) or the  N  (num-
+       ber)  property. Xps matches the characters tab, linefeed, vertical tab,
+       formfeed, or carriage return, and any other character that  has  the  Z
        (separator) property.  Xsp is the same as Xps, except that vertical tab
        is excluded. Xwd matches the same characters as Xan, plus underscore.

    Resetting the match start

-       The  escape sequence \K causes any previously matched characters not to
+       The escape sequence \K causes any previously matched characters not  to
        be included in the final matched sequence. For example, the pattern:

          foo\Kbar

-       matches "foobar", but reports that it has matched "bar".  This  feature
-       is  similar  to  a lookbehind assertion (described below).  However, in
-       this case, the part of the subject before the real match does not  have
-       to  be of fixed length, as lookbehind assertions do. The use of \K does
-       not interfere with the setting of captured  substrings.   For  example,
+       matches  "foobar",  but reports that it has matched "bar". This feature
+       is similar to a lookbehind assertion (described  below).   However,  in
+       this  case, the part of the subject before the real match does not have
+       to be of fixed length, as lookbehind assertions do. The use of \K  does
+       not  interfere  with  the setting of captured substrings.  For example,
        when the pattern

          (foo)\Kbar

        matches "foobar", the first substring is still set to "foo".

-       Perl  documents  that  the  use  of  \K  within assertions is "not well
-       defined". In PCRE, \K is acted upon  when  it  occurs  inside  positive
+       Perl documents that the use  of  \K  within  assertions  is  "not  well
+       defined".  In  PCRE,  \K  is  acted upon when it occurs inside positive
        assertions, but is ignored in negative assertions.

    Simple assertions

-       The  final use of backslash is for certain simple assertions. An asser-
-       tion specifies a condition that has to be met at a particular point  in
-       a  match, without consuming any characters from the subject string. The
-       use of subpatterns for more complicated assertions is described  below.
+       The final use of backslash is for certain simple assertions. An  asser-
+       tion  specifies a condition that has to be met at a particular point in
+       a match, without consuming any characters from the subject string.  The
+       use  of subpatterns for more complicated assertions is described below.
        The backslashed assertions are:

          \b     matches at a word boundary
@@ -4013,49 +4061,49 @@
          \z     matches only at the end of the subject
          \G     matches at the first matching position in the subject

-       Inside  a  character  class, \b has a different meaning; it matches the
-       backspace character. If any other of  these  assertions  appears  in  a
-       character  class, by default it matches the corresponding literal char-
+       Inside a character class, \b has a different meaning;  it  matches  the
+       backspace  character.  If  any  other  of these assertions appears in a
+       character class, by default it matches the corresponding literal  char-
        acter  (for  example,  \B  matches  the  letter  B).  However,  if  the
-       PCRE_EXTRA  option is set, an "invalid escape sequence" error is gener-
+       PCRE_EXTRA option is set, an "invalid escape sequence" error is  gener-
        ated instead.

-       A word boundary is a position in the subject string where  the  current
-       character  and  the previous character do not both match \w or \W (i.e.
-       one matches \w and the other matches \W), or the start or  end  of  the
-       string  if  the  first  or  last character matches \w, respectively. In
-       UTF-8 mode, the meanings of \w and \W can be  changed  by  setting  the
-       PCRE_UCP  option. When this is done, it also affects \b and \B. Neither
-       PCRE nor Perl has a separate "start of word" or "end of  word"  metase-
-       quence.  However,  whatever follows \b normally determines which it is.
+       A  word  boundary is a position in the subject string where the current
+       character and the previous character do not both match \w or  \W  (i.e.
+       one  matches  \w  and the other matches \W), or the start or end of the
+       string if the first or last  character  matches  \w,  respectively.  In
+       UTF-8  mode,  the  meanings  of \w and \W can be changed by setting the
+       PCRE_UCP option. When this is done, it also affects \b and \B.  Neither
+       PCRE  nor  Perl has a separate "start of word" or "end of word" metase-
+       quence. However, whatever follows \b normally determines which  it  is.
        For example, the fragment \ba matches "a" at the start of a word.

-       The \A, \Z, and \z assertions differ from  the  traditional  circumflex
+       The  \A,  \Z,  and \z assertions differ from the traditional circumflex
        and dollar (described in the next section) in that they only ever match
-       at the very start and end of the subject string, whatever  options  are
-       set.  Thus,  they are independent of multiline mode. These three asser-
+       at  the  very start and end of the subject string, whatever options are
+       set. Thus, they are independent of multiline mode. These  three  asser-
        tions are not affected by the PCRE_NOTBOL or PCRE_NOTEOL options, which
-       affect  only the behaviour of the circumflex and dollar metacharacters.
-       However, if the startoffset argument of pcre_exec() is non-zero,  indi-
+       affect only the behaviour of the circumflex and dollar  metacharacters.
+       However,  if the startoffset argument of pcre_exec() is non-zero, indi-
        cating that matching is to start at a point other than the beginning of
-       the subject, \A can never match. The difference between \Z  and  \z  is
+       the  subject,  \A  can never match. The difference between \Z and \z is
        that \Z matches before a newline at the end of the string as well as at
        the very end, whereas \z matches only at the end.

-       The \G assertion is true only when the current matching position is  at
-       the  start point of the match, as specified by the startoffset argument
-       of pcre_exec(). It differs from \A when the  value  of  startoffset  is
-       non-zero.  By calling pcre_exec() multiple times with appropriate argu-
+       The  \G assertion is true only when the current matching position is at
+       the start point of the match, as specified by the startoffset  argument
+       of  pcre_exec().  It  differs  from \A when the value of startoffset is
+       non-zero. By calling pcre_exec() multiple times with appropriate  argu-
        ments, you can mimic Perl's /g option, and it is in this kind of imple-
        mentation where \G can be useful.

-       Note,  however,  that  PCRE's interpretation of \G, as the start of the
+       Note, however, that PCRE's interpretation of \G, as the  start  of  the
        current match, is subtly different from Perl's, which defines it as the
-       end  of  the  previous  match. In Perl, these can be different when the
-       previously matched string was empty. Because PCRE does just  one  match
+       end of the previous match. In Perl, these can  be  different  when  the
+       previously  matched  string was empty. Because PCRE does just one match
        at a time, it cannot reproduce this behaviour.

-       If  all  the alternatives of a pattern begin with \G, the expression is
+       If all the alternatives of a pattern begin with \G, the  expression  is
        anchored to the starting match position, and the "anchored" flag is set
        in the compiled regular expression.

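The repeated-call technique described for \G above can be sketched as follows. This is only an illustration, and it uses Python's re module rather than PCRE: Python has no \G, but the pos argument of a compiled pattern's match() method anchors each attempt at the given offset, which is roughly the role \G plays when pcre_exec() is given a non-zero startoffset. The function name tokens and the token pattern are hypothetical.

```python
import re

def tokens(subject):
    """Split subject into tokens by matching repeatedly at a moving offset."""
    # Hypothetical token pattern: digits, lower case letters, or white space.
    token = re.compile(r"\d+|[a-z]+|\s+")
    pos = 0
    out = []
    while pos < len(subject):
        # match() succeeds only if a token starts exactly at pos, much as a
        # pattern beginning with \G does when startoffset is set to pos.
        m = token.match(subject, pos)
        if m is None:
            break  # nothing matches at the current start point
        out.append(m.group())
        pos = m.end()  # the next attempt starts where this match ended
    return out, pos
```

The loop stops at the first offset where no token starts, so the returned offset shows how far the scan got.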
@@ -4063,80 +4111,81 @@
CIRCUMFLEX AND DOLLAR

        Outside a character class, in the default matching mode, the circumflex
-       character is an assertion that is true only  if  the  current  matching
-       point  is  at the start of the subject string. If the startoffset argu-
-       ment of pcre_exec() is non-zero, circumflex  can  never  match  if  the
-       PCRE_MULTILINE  option  is  unset. Inside a character class, circumflex
+       character  is  an  assertion  that is true only if the current matching
+       point is at the start of the subject string. If the  startoffset  argu-
+       ment  of  pcre_exec()  is  non-zero,  circumflex can never match if the
+       PCRE_MULTILINE option is unset. Inside a  character  class,  circumflex
        has an entirely different meaning (see below).

-       Circumflex need not be the first character of the pattern if  a  number
-       of  alternatives are involved, but it should be the first thing in each
-       alternative in which it appears if the pattern is ever  to  match  that
-       branch.  If all possible alternatives start with a circumflex, that is,
-       if the pattern is constrained to match only at the start  of  the  sub-
-       ject,  it  is  said  to be an "anchored" pattern. (There are also other
+       Circumflex  need  not be the first character of the pattern if a number
+       of alternatives are involved, but it should be the first thing in  each
+       alternative  in  which  it appears if the pattern is ever to match that
+       branch. If all possible alternatives start with a circumflex, that  is,
+       if  the  pattern  is constrained to match only at the start of the sub-
+       ject, it is said to be an "anchored" pattern.  (There  are  also  other
        constructs that can cause a pattern to be anchored.)

-       A dollar character is an assertion that is true  only  if  the  current
-       matching  point  is  at  the  end of the subject string, or immediately
+       A  dollar  character  is  an assertion that is true only if the current
+       matching point is at the end of  the  subject  string,  or  immediately
        before a newline at the end of the string (by default). Dollar need not
-       be  the  last  character of the pattern if a number of alternatives are
-       involved, but it should be the last item in  any  branch  in  which  it
+       be the last character of the pattern if a number  of  alternatives  are
+       involved,  but  it  should  be  the last item in any branch in which it
        appears. Dollar has no special meaning in a character class.

-       The  meaning  of  dollar  can be changed so that it matches only at the
-       very end of the string, by setting the  PCRE_DOLLAR_ENDONLY  option  at
+       The meaning of dollar can be changed so that it  matches  only  at  the
+       very  end  of  the string, by setting the PCRE_DOLLAR_ENDONLY option at
        compile time. This does not affect the \Z assertion.

        The meanings of the circumflex and dollar characters are changed if the
-       PCRE_MULTILINE option is set. When  this  is  the  case,  a  circumflex
-       matches  immediately after internal newlines as well as at the start of
-       the subject string. It does not match after a  newline  that  ends  the
-       string.  A dollar matches before any newlines in the string, as well as
-       at the very end, when PCRE_MULTILINE is set. When newline is  specified
-       as  the  two-character  sequence CRLF, isolated CR and LF characters do
+       PCRE_MULTILINE  option  is  set.  When  this  is the case, a circumflex
+       matches immediately after internal newlines as well as at the start  of
+       the  subject  string.  It  does not match after a newline that ends the
+       string. A dollar matches before any newlines in the string, as well  as
+       at  the very end, when PCRE_MULTILINE is set. When newline is specified
+       as the two-character sequence CRLF, isolated CR and  LF  characters  do
        not indicate newlines.

-       For example, the pattern /^abc$/ matches the subject string  "def\nabc"
-       (where  \n  represents a newline) in multiline mode, but not otherwise.
-       Consequently, patterns that are anchored in single  line  mode  because
-       all  branches  start  with  ^ are not anchored in multiline mode, and a
-       match for circumflex is  possible  when  the  startoffset  argument  of
-       pcre_exec()  is  non-zero. The PCRE_DOLLAR_ENDONLY option is ignored if
+       For  example, the pattern /^abc$/ matches the subject string "def\nabc"
+       (where \n represents a newline) in multiline mode, but  not  otherwise.
+       Consequently,  patterns  that  are anchored in single line mode because
+       all branches start with ^ are not anchored in  multiline  mode,  and  a
+       match  for  circumflex  is  possible  when  the startoffset argument of
+       pcre_exec() is non-zero. The PCRE_DOLLAR_ENDONLY option is  ignored  if
        PCRE_MULTILINE is set.

-       Note that the sequences \A, \Z, and \z can be used to match  the  start
-       and  end of the subject in both modes, and if all branches of a pattern
-       start with \A it is always anchored, whether or not  PCRE_MULTILINE  is
+       Note  that  the sequences \A, \Z, and \z can be used to match the start
+       and end of the subject in both modes, and if all branches of a  pattern
+       start  with  \A it is always anchored, whether or not PCRE_MULTILINE is
        set.

FULL STOP (PERIOD, DOT) AND \N

        Outside a character class, a dot in the pattern matches any one charac-
-       ter in the subject string except (by default) a character  that  signi-
-       fies  the  end  of  a line. In UTF-8 mode, the matched character may be
+       ter  in  the subject string except (by default) a character that signi-
+       fies the end of a line. In UTF-8 mode, the  matched  character  may  be
        more than one byte long.

-       When a line ending is defined as a single character, dot never  matches
-       that  character; when the two-character sequence CRLF is used, dot does
-       not match CR if it is immediately followed  by  LF,  but  otherwise  it
-       matches  all characters (including isolated CRs and LFs). When any Uni-
-       code line endings are being recognized, dot does not match CR or LF  or
+       When  a line ending is defined as a single character, dot never matches
+       that character; when the two-character sequence CRLF is used, dot  does
+       not  match  CR  if  it  is immediately followed by LF, but otherwise it
+       matches all characters (including isolated CRs and LFs). When any  Uni-
+       code  line endings are being recognized, dot does not match CR or LF or
        any of the other line ending characters.

-       The  behaviour  of  dot  with regard to newlines can be changed. If the
-       PCRE_DOTALL option is set, a dot matches  any  one  character,  without
+       The behaviour of dot with regard to newlines can  be  changed.  If  the
+       PCRE_DOTALL  option  is  set,  a dot matches any one character, without
        exception. If the two-character sequence CRLF is present in the subject
        string, it takes two dots to match it.

-       The handling of dot is entirely independent of the handling of  circum-
-       flex  and  dollar,  the  only relationship being that they both involve
+       The  handling of dot is entirely independent of the handling of circum-
+       flex and dollar, the only relationship being  that  they  both  involve
        newlines. Dot has no special meaning in a character class.

-       The escape sequence \N behaves like  a  dot,  except  that  it  is  not
-       affected  by  the  PCRE_DOTALL  option.  In other words, it matches any
-       character except one that signifies the end of a line.
+       The  escape  sequence  \N  behaves  like  a  dot, except that it is not
+       affected by the PCRE_DOTALL option. In  other  words,  it  matches  any
+       character  except  one that signifies the end of a line. Perl also uses
+       \N to match characters by name; PCRE does not support this.

 MATCHING A SINGLE BYTE
@@ -4153,7 +4202,7 @@
        PCRE_NO_UTF8_CHECK option is used).

        PCRE  does  not  allow \C to appear in lookbehind assertions (described
-       below), because in UTF-8 mode this would make it impossible  to  calcu-
+       below) in UTF-8 mode, because this would make it impossible  to  calcu-
        late the length of the lookbehind.

        In  general, the \C escape sequence is best avoided in UTF-8 mode. How-
@@ -5060,40 +5109,41 @@
        then try to match. If there are insufficient characters before the cur-
        rent position, the assertion fails.

-       PCRE does not allow the \C escape (which matches a single byte in UTF-8
-       mode) to appear in lookbehind assertions, because it makes it  impossi-
-       ble  to  calculate the length of the lookbehind. The \X and \R escapes,
-       which can match different numbers of bytes, are also not permitted.
+       In  UTF-8 mode, PCRE does not allow the \C escape (which matches a sin-
+       gle byte, even in UTF-8  mode)  to  appear  in  lookbehind  assertions,
+       because  it  makes it impossible to calculate the length of the lookbe-
+       hind. The \X and \R escapes,  which  can  match  different  numbers  of
+       bytes, are also not permitted.

-       "Subroutine" calls (see below) such as (?2) or (?&X) are  permitted  in
-       lookbehinds,  as  long as the subpattern matches a fixed-length string.
+       "Subroutine"  calls  (see below) such as (?2) or (?&X) are permitted in
+       lookbehinds, as long as the subpattern matches a  fixed-length  string.
        Recursion, however, is not supported.

-       Possessive quantifiers can  be  used  in  conjunction  with  lookbehind
+       Possessive  quantifiers  can  be  used  in  conjunction with lookbehind
        assertions to specify efficient matching of fixed-length strings at the
        end of subject strings. Consider a simple pattern such as

          abcd$

-       when applied to a long string that does  not  match.  Because  matching
+       when  applied  to  a  long string that does not match. Because matching
        proceeds from left to right, PCRE will look for each "a" in the subject
-       and then see if what follows matches the rest of the  pattern.  If  the
+       and  then  see  if what follows matches the rest of the pattern. If the
        pattern is specified as

          ^.*abcd$

-       the  initial .* matches the entire string at first, but when this fails
+       the initial .* matches the entire string at first, but when this  fails
        (because there is no following "a"), it backtracks to match all but the
-       last  character,  then all but the last two characters, and so on. Once
-       again the search for "a" covers the entire string, from right to  left,
+       last character, then all but the last two characters, and so  on.  Once
+       again  the search for "a" covers the entire string, from right to left,
        so we are no better off. However, if the pattern is written as

          ^.*+(?<=abcd)

-       there  can  be  no backtracking for the .*+ item; it can match only the
-       entire string. The subsequent lookbehind assertion does a  single  test
-       on  the last four characters. If it fails, the match fails immediately.
-       For long strings, this approach makes a significant difference  to  the
+       there can be no backtracking for the .*+ item; it can  match  only  the
+       entire  string.  The subsequent lookbehind assertion does a single test
+       on the last four characters. If it fails, the match fails  immediately.
+       For  long  strings, this approach makes a significant difference to the
        processing time.

    Using multiple assertions
@@ -5102,18 +5152,18 @@

          (?<=\d{3})(?<!999)foo

-       matches  "foo" preceded by three digits that are not "999". Notice that
-       each of the assertions is applied independently at the  same  point  in
-       the  subject  string.  First  there  is a check that the previous three
-       characters are all digits, and then there is  a  check  that  the  same
+       matches "foo" preceded by three digits that are not "999". Notice  that
+       each  of  the  assertions is applied independently at the same point in
+       the subject string. First there is a  check  that  the  previous  three
+       characters  are  all  digits,  and  then there is a check that the same
        three characters are not "999".  This pattern does not match "foo" pre-
-       ceded by six characters, the first of which are  digits  and  the  last
-       three  of  which  are not "999". For example, it doesn't match "123abc-
+       ceded  by  six  characters,  the first of which are digits and the last
+       three of which are not "999". For example, it  doesn't  match  "123abc-
        foo". A pattern to do that is

          (?<=\d{3}...)(?<!999)foo

-       This time the first assertion looks at the  preceding  six  characters,
+       This  time  the  first assertion looks at the preceding six characters,
        checking that the first three are digits, and then the second assertion
        checks that the preceding three characters are not "999".

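The two patterns above can be tried out with a short script. This is a sketch that assumes Python's re module, whose fixed-length lookbehind syntax happens to coincide with PCRE's for these patterns; the names same_point, six_back and found are hypothetical.

```python
import re

# Both assertions are applied at the same point, just before "foo".
same_point = re.compile(r"(?<=\d{3})(?<!999)foo")

# The first assertion looks back six characters, the second only three.
six_back = re.compile(r"(?<=\d{3}...)(?<!999)foo")

def found(pattern, subject):
    """Return True if pattern matches anywhere in subject."""
    return pattern.search(subject) is not None
```

With these definitions, found(same_point, "123abcfoo") is false because the three characters before "foo" are not digits, whereas found(six_back, "123abcfoo") is true.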
@@ -5121,29 +5171,29 @@

          (?<=(?<!foo)bar)baz

-       matches an occurrence of "baz" that is preceded by "bar" which in  turn
+       matches  an occurrence of "baz" that is preceded by "bar" which in turn
        is not preceded by "foo", while

          (?<=\d{3}(?!999)...)foo

-       is  another pattern that matches "foo" preceded by three digits and any
+       is another pattern that matches "foo" preceded by three digits and  any
        three characters that are not "999".

CONDITIONAL SUBPATTERNS

-       It is possible to cause the matching process to obey a subpattern  con-
-       ditionally  or to choose between two alternative subpatterns, depending
-       on the result of an assertion, or whether a specific capturing  subpat-
-       tern  has  already  been matched. The two possible forms of conditional
+       It  is possible to cause the matching process to obey a subpattern con-
+       ditionally or to choose between two alternative subpatterns,  depending
+       on  the result of an assertion, or whether a specific capturing subpat-
+       tern has already been matched. The two possible  forms  of  conditional
        subpattern are:

          (?(condition)yes-pattern)
          (?(condition)yes-pattern|no-pattern)

-       If the condition is satisfied, the yes-pattern is used;  otherwise  the
-       no-pattern  (if  present)  is used. If there are more than two alterna-
-       tives in the subpattern, a compile-time error occurs. Each of  the  two
+       If  the  condition is satisfied, the yes-pattern is used; otherwise the
+       no-pattern (if present) is used. If there are more  than  two  alterna-
+       tives  in  the subpattern, a compile-time error occurs. Each of the two
        alternatives may itself contain nested subpatterns of any form, includ-
        ing  conditional  subpatterns;  the  restriction  to  two  alternatives
        applies only at the level of the condition. This pattern fragment is an
@@ -5152,73 +5202,73 @@
          (?(1) (A|B|C) | (D | (?(2)E|F) | E) )

-       There are four kinds of condition: references  to  subpatterns,  refer-
+       There  are  four  kinds of condition: references to subpatterns, refer-
        ences to recursion, a pseudo-condition called DEFINE, and assertions.

    Checking for a used subpattern by number

-       If  the  text between the parentheses consists of a sequence of digits,
+       If the text between the parentheses consists of a sequence  of  digits,
        the condition is true if a capturing subpattern of that number has pre-
-       viously  matched.  If  there is more than one capturing subpattern with
-       the same number (see the earlier  section  about  duplicate  subpattern
-       numbers),  the condition is true if any of them have matched. An alter-
-       native notation is to precede the digits with a plus or minus sign.  In
-       this  case, the subpattern number is relative rather than absolute. The
-       most recently opened parentheses can be referenced by (?(-1), the  next
-       most  recent  by (?(-2), and so on. Inside loops it can also make sense
+       viously matched. If there is more than one  capturing  subpattern  with
+       the  same  number  (see  the earlier section about duplicate subpattern
+       numbers), the condition is true if any of them have matched. An  alter-
+       native  notation is to precede the digits with a plus or minus sign. In
+       this case, the subpattern number is relative rather than absolute.  The
+       most  recently opened parentheses can be referenced by (?(-1), the next
+       most recent by (?(-2), and so on. Inside loops it can also  make  sense
        to refer to subsequent groups. The next parentheses to be opened can be
-       referenced  as (?(+1), and so on. (The value zero in any of these forms
+       referenced as (?(+1), and so on. (The value zero in any of these  forms
        is not used; it provokes a compile-time error.)

-       Consider the following pattern, which  contains  non-significant  white
+       Consider  the  following  pattern, which contains non-significant white
        space to make it more readable (assume the PCRE_EXTENDED option) and to
        divide it into three parts for ease of discussion:

          ( \( )?    [^()]+    (?(1) \) )

-       The first part matches an optional opening  parenthesis,  and  if  that
+       The  first  part  matches  an optional opening parenthesis, and if that
        character is present, sets it as the first captured substring. The sec-
-       ond part matches one or more characters that are not  parentheses.  The
-       third  part  is  a conditional subpattern that tests whether or not the
-       first set of parentheses matched. If they  did,  that  is,  if  subject
-       started  with an opening parenthesis, the condition is true, and so the
-       yes-pattern is executed and a closing parenthesis is  required.  Other-
-       wise,  since no-pattern is not present, the subpattern matches nothing.
-       In other words, this pattern matches  a  sequence  of  non-parentheses,
+       ond  part  matches one or more characters that are not parentheses. The
+       third part is a conditional subpattern that tests whether  or  not  the
+       first  set of parentheses matched. If they did, that is, if the subject
+       started with an opening parenthesis, the condition is true, and so  the
+       yes-pattern  is  executed and a closing parenthesis is required. Other-
+       wise, since no-pattern is not present, the subpattern matches  nothing.
+       In  other  words,  this  pattern matches a sequence of non-parentheses,
        optionally enclosed in parentheses.

-       If  you  were  embedding  this pattern in a larger one, you could use a
+       If you were embedding this pattern in a larger one,  you  could  use  a
        relative reference:

          ...other stuff... ( \( )?    [^()]+    (?(-1) \) ) ...

-       This makes the fragment independent of the parentheses  in  the  larger
+       This  makes  the  fragment independent of the parentheses in the larger
        pattern.

    Checking for a used subpattern by name

-       Perl  uses  the  syntax  (?(<name>)...) or (?('name')...) to test for a
-       used subpattern by name. For compatibility  with  earlier  versions  of
-       PCRE,  which  had this facility before Perl, the syntax (?(name)...) is
-       also recognized. However, there is a possible ambiguity with this  syn-
-       tax,  because  subpattern  names  may  consist entirely of digits. PCRE
-       looks first for a named subpattern; if it cannot find one and the  name
-       consists  entirely  of digits, PCRE looks for a subpattern of that num-
-       ber, which must be greater than zero. Using subpattern names that  con-
+       Perl uses the syntax (?(<name>)...) or (?('name')...)  to  test  for  a
+       used  subpattern  by  name.  For compatibility with earlier versions of
+       PCRE, which had this facility before Perl, the syntax  (?(name)...)  is
+       also  recognized. However, there is a possible ambiguity with this syn-
+       tax, because subpattern names may  consist  entirely  of  digits.  PCRE
+       looks  first for a named subpattern; if it cannot find one and the name
+       consists entirely of digits, PCRE looks for a subpattern of  that  num-
+       ber,  which must be greater than zero. Using subpattern names that con-
        sist entirely of digits is not recommended.

        Rewriting the above example to use a named subpattern gives this:

          (?<OPEN> \( )?    [^()]+    (?(<OPEN>) \) )

-       If  the  name used in a condition of this kind is a duplicate, the test
-       is applied to all subpatterns of the same name, and is true if any  one
+       If the name used in a condition of this kind is a duplicate,  the  test
+       is  applied to all subpatterns of the same name, and is true if any one
        of them has matched.

    Checking for pattern recursion

        If the condition is the string (R), and there is no subpattern with the
-       name R, the condition is true if a recursive call to the whole  pattern
+       name  R, the condition is true if a recursive call to the whole pattern
        or any subpattern has been made. If digits or a name preceded by amper-
        sand follow the letter R, for example:

@@ -5226,51 +5276,51 @@

        the condition is true if the most recent recursion is into a subpattern
        whose number or name is given. This condition does not check the entire
-       recursion stack. If the name used in a condition  of  this  kind  is  a
+       recursion  stack.  If  the  name  used in a condition of this kind is a
        duplicate, the test is applied to all subpatterns of the same name, and
        is true if any one of them is the most recent recursion.

-       At "top level", all these recursion test  conditions  are  false.   The
+       At  "top  level",  all  these recursion test conditions are false.  The
        syntax for recursive patterns is described below.

    Defining subpatterns for use by reference only

-       If  the  condition  is  the string (DEFINE), and there is no subpattern
-       with the name DEFINE, the condition is  always  false.  In  this  case,
-       there  may  be  only  one  alternative  in the subpattern. It is always
-       skipped if control reaches this point  in  the  pattern;  the  idea  of
-       DEFINE  is that it can be used to define subroutines that can be refer-
-       enced from elsewhere. (The use of subroutines is described below.)  For
-       example,  a  pattern  to match an IPv4 address such as "192.168.23.245"
+       If the condition is the string (DEFINE), and  there  is  no  subpattern
+       with  the  name  DEFINE,  the  condition is always false. In this case,
+       there may be only one alternative  in  the  subpattern.  It  is  always
+       skipped  if  control  reaches  this  point  in the pattern; the idea of
+       DEFINE is that it can be used to define subroutines that can be  refer-
+       enced  from elsewhere. (The use of subroutines is described below.) For
+       example, a pattern to match an IPv4 address  such  as  "192.168.23.245"
        could be written like this (ignore whitespace and line breaks):

          (?(DEFINE) (?<byte> 2[0-4]\d | 25[0-5] | 1\d\d | [1-9]?\d) )
          \b (?&byte) (\.(?&byte)){3} \b

-       The first part of the pattern is a DEFINE group inside which a  another
-       group  named "byte" is defined. This matches an individual component of
-       an IPv4 address (a number less than 256). When  matching  takes  place,
-       this  part  of  the pattern is skipped because DEFINE acts like a false
-       condition. The rest of the pattern uses references to the  named  group
-       to  match the four dot-separated components of an IPv4 address, insist-
+       The  first  part  of the pattern is a DEFINE group inside which another
+       group named "byte" is defined. This matches an individual component  of
+       an  IPv4  address  (a number less than 256). When matching takes place,
+       this part of the pattern is skipped because DEFINE acts  like  a  false
+       condition.  The  rest of the pattern uses references to the named group
+       to match the four dot-separated components of an IPv4 address,  insist-
        ing on a word boundary at each end.

    Assertion conditions

-       If the condition is not in any of the above  formats,  it  must  be  an
-       assertion.   This may be a positive or negative lookahead or lookbehind
-       assertion. Consider  this  pattern,  again  containing  non-significant
+       If  the  condition  is  not  in any of the above formats, it must be an
+       assertion.  This may be a positive or negative lookahead or  lookbehind
+       assertion.  Consider  this  pattern,  again  containing non-significant
        white space, and with the two alternatives on the second line:

          (?(?=[^a-z]*[a-z])
          \d{2}-[a-z]{3}-\d{2}  |  \d{2}-\d{2}-\d{2} )

-       The  condition  is  a  positive  lookahead  assertion  that  matches an
-       optional sequence of non-letters followed by a letter. In other  words,
-       it  tests  for the presence of at least one letter in the subject. If a
-       letter is found, the subject is matched against the first  alternative;
-       otherwise  it  is  matched  against  the  second.  This pattern matches
-       strings in one of the two forms dd-aaa-dd or dd-dd-dd,  where  aaa  are
+       The condition  is  a  positive  lookahead  assertion  that  matches  an
+       optional  sequence of non-letters followed by a letter. In other words,
+       it tests for the presence of at least one letter in the subject.  If  a
+       letter  is found, the subject is matched against the first alternative;
+       otherwise it is  matched  against  the  second.  This  pattern  matches
+       strings  in  one  of the two forms dd-aaa-dd or dd-dd-dd, where aaa are
        letters and dd are digits.

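The effect of the assertion condition above can be reproduced without conditional syntax. The sketch below assumes Python's re module, which does not accept an assertion as a condition, so the conditional (?(?=A)yes|no) is rewritten as the equivalent alternation (?=A)yes|(?!A)no. The names date_like and matches_at_start are hypothetical.

```python
import re

# Rewrite of the conditional: the lookahead guards the first alternative,
# and its negation guards the second.
date_like = re.compile(
    r"""
    (?: (?=[^a-z]*[a-z])  \d{2}-[a-z]{3}-\d{2}   # a letter lies ahead
      | (?![^a-z]*[a-z])  \d{2}-\d{2}-\d{2}      # no letter lies ahead
    )
    """,
    re.VERBOSE,
)

def matches_at_start(subject):
    """Return True if the pattern matches at the start of subject."""
    return date_like.match(subject) is not None
```

Note that the condition inspects the whole of the remaining subject, so a letter anywhere after the match point selects the first alternative. For that reason "12-34-56 x" does not match at the start, although "12-34-56" does.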
@@ -5279,41 +5329,41 @@
        There are two ways of including comments in patterns that are processed
        by PCRE. In both cases, the start of the comment must not be in a char-
        acter class, nor in the middle of any other sequence of related charac-
-       ters such as (?: or a subpattern name or number.  The  characters  that
+       ters  such  as  (?: or a subpattern name or number. The characters that
        make up a comment play no part in the pattern matching.

-       The  sequence (?# marks the start of a comment that continues up to the
-       next closing parenthesis. Nested parentheses are not permitted. If  the
+       The sequence (?# marks the start of a comment that continues up to  the
+       next  closing parenthesis. Nested parentheses are not permitted. If the
        PCRE_EXTENDED option is set, an unescaped # character also introduces a
-       comment, which in this case continues to  immediately  after  the  next
-       newline  character  or character sequence in the pattern. Which charac-
+       comment,  which  in  this  case continues to immediately after the next
+       newline character or character sequence in the pattern.  Which  charac-
        ters are interpreted as newlines is controlled by the options passed to
        pcre_compile() or by a special sequence at the start of the pattern, as
-       described in the section entitled  "Newline  conventions"  above.  Note
-       that  the  end of this type of comment is a literal newline sequence in
+       described  in  the  section  entitled "Newline conventions" above. Note
+       that the end of this type of comment is a literal newline  sequence  in
        the pattern; escape sequences that happen to represent a newline do not
-       count.  For  example,  consider this pattern when PCRE_EXTENDED is set,
+       count. For example, consider this pattern when  PCRE_EXTENDED  is  set,
        and the default newline convention is in force:

          abc #comment \n still comment

-       On encountering the # character, pcre_compile()  skips  along,  looking
-       for  a newline in the pattern. The sequence \n is still literal at this
-       stage, so it does not terminate the comment. Only an  actual  character
+       On  encountering  the  # character, pcre_compile() skips along, looking
+       for a newline in the pattern. The sequence \n is still literal at  this
+       stage,  so  it does not terminate the comment. Only an actual character
        with the code value 0x0a (the default newline) does so.

RECURSIVE PATTERNS

-       Consider  the problem of matching a string in parentheses, allowing for
-       unlimited nested parentheses. Without the use of  recursion,  the  best
-       that  can  be  done  is  to use a pattern that matches up to some fixed
-       depth of nesting. It is not possible to  handle  an  arbitrary  nesting
+       Consider the problem of matching a string in parentheses, allowing  for
+       unlimited  nested  parentheses.  Without the use of recursion, the best
+       that can be done is to use a pattern that  matches  up  to  some  fixed
+       depth  of  nesting.  It  is not possible to handle an arbitrary nesting
        depth.

        For some time, Perl has provided a facility that allows regular expres-
-       sions to recurse (amongst other things). It does this by  interpolating
-       Perl  code in the expression at run time, and the code can refer to the
+       sions  to recurse (amongst other things). It does this by interpolating
+       Perl code in the expression at run time, and the code can refer to  the
        expression itself. A Perl pattern using code interpolation to solve the
        parentheses problem can be created like this:

@@ -5323,201 +5373,201 @@
        refers recursively to the pattern in which it appears.

        Obviously, PCRE cannot support the interpolation of Perl code. Instead,
-       it  supports  special  syntax  for recursion of the entire pattern, and
-       also for individual subpattern recursion.  After  its  introduction  in
-       PCRE  and  Python,  this  kind of recursion was subsequently introduced
+       it supports special syntax for recursion of  the  entire  pattern,  and
+       also  for  individual  subpattern  recursion. After its introduction in
+       PCRE and Python, this kind of  recursion  was  subsequently  introduced
        into Perl at release 5.10.

-       A special item that consists of (? followed by a  number  greater  than
-       zero  and  a  closing parenthesis is a recursive subroutine call of the
-       subpattern of the given number, provided that  it  occurs  inside  that
-       subpattern.  (If  not,  it is a non-recursive subroutine call, which is
-       described in the next section.) The special item  (?R)  or  (?0)  is  a
+       A  special  item  that consists of (? followed by a number greater than
+       zero and a closing parenthesis is a recursive subroutine  call  of  the
+       subpattern  of  the  given  number, provided that it occurs inside that
+       subpattern. (If not, it is a non-recursive subroutine  call,  which  is
+       described  in  the  next  section.)  The special item (?R) or (?0) is a
        recursive call of the entire regular expression.

-       This  PCRE  pattern  solves  the nested parentheses problem (assume the
+       This PCRE pattern solves the nested  parentheses  problem  (assume  the
        PCRE_EXTENDED option is set so that white space is ignored):

          \( ( [^()]++ | (?R) )* \)

-       First it matches an opening parenthesis. Then it matches any number  of
-       substrings  which  can  either  be  a sequence of non-parentheses, or a
-       recursive match of the pattern itself (that is, a  correctly  parenthe-
+       First  it matches an opening parenthesis. Then it matches any number of
+       substrings which can either be a  sequence  of  non-parentheses,  or  a
+       recursive  match  of the pattern itself (that is, a correctly parenthe-
        sized substring).  Finally there is a closing parenthesis. Note the use
        of a possessive quantifier to avoid backtracking into sequences of non-
        parentheses.

-       If  this  were  part of a larger pattern, you would not want to recurse
+       If this were part of a larger pattern, you would not  want  to  recurse
        the entire pattern, so instead you could use this:

          ( \( ( [^()]++ | (?1) )* \) )

-       We have put the pattern into parentheses, and caused the  recursion  to
+       We  have  put the pattern into parentheses, and caused the recursion to
        refer to them instead of the whole pattern.

-       In  a  larger  pattern,  keeping  track  of  parenthesis numbers can be
-       tricky. This is made easier by the use of relative references.  Instead
+       In a larger pattern,  keeping  track  of  parenthesis  numbers  can  be
+       tricky.  This is made easier by the use of relative references. Instead
        of (?1) in the pattern above you can write (?-2) to refer to the second
-       most recently opened parentheses  preceding  the  recursion.  In  other
-       words,  a  negative  number counts capturing parentheses leftwards from
+       most  recently  opened  parentheses  preceding  the recursion. In other
+       words, a negative number counts capturing  parentheses  leftwards  from
        the point at which it is encountered.

-       It is also possible to refer to  subsequently  opened  parentheses,  by
-       writing  references  such  as (?+2). However, these cannot be recursive
-       because the reference is not inside the  parentheses  that  are  refer-
-       enced.  They are always non-recursive subroutine calls, as described in
+       It  is  also  possible  to refer to subsequently opened parentheses, by
+       writing references such as (?+2). However, these  cannot  be  recursive
+       because  the  reference  is  not inside the parentheses that are refer-
+       enced. They are always non-recursive subroutine calls, as described  in
        the next section.

-       An alternative approach is to use named parentheses instead.  The  Perl
-       syntax  for  this  is (?&name); PCRE's earlier syntax (?P>name) is also
+       An  alternative  approach is to use named parentheses instead. The Perl
+       syntax for this is (?&name); PCRE's earlier syntax  (?P>name)  is  also
        supported. We could rewrite the above example as follows:

          (?<pn> \( ( [^()]++ | (?&pn) )* \) )

-       If there is more than one subpattern with the same name,  the  earliest
+       If  there  is more than one subpattern with the same name, the earliest
        one is used.

-       This  particular  example pattern that we have been looking at contains
+       This particular example pattern that we have been looking  at  contains
        nested unlimited repeats, and so the use of a possessive quantifier for
        matching strings of non-parentheses is important when applying the pat-
-       tern to strings that do not match. For example, when  this  pattern  is
+       tern  to  strings  that do not match. For example, when this pattern is
        applied to

          (aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa()

-       it  yields  "no  match" quickly. However, if a possessive quantifier is
-       not used, the match runs for a very long time indeed because there  are
-       so  many  different  ways the + and * repeats can carve up the subject,
+       it yields "no match" quickly. However, if a  possessive  quantifier  is
+       not  used, the match runs for a very long time indeed because there are
+       so many different ways the + and * repeats can carve  up  the  subject,
        and all have to be tested before failure can be reported.

-       At the end of a match, the values of capturing  parentheses  are  those
-       from  the outermost level. If you want to obtain intermediate values, a
-       callout function can be used (see below and the pcrecallout  documenta-
+       At  the  end  of a match, the values of capturing parentheses are those
+       from the outermost level. If you want to obtain intermediate values,  a
+       callout  function can be used (see below and the pcrecallout documenta-
        tion). If the pattern above is matched against

          (ab(cd)ef)

-       the  value  for  the  inner capturing parentheses (numbered 2) is "ef",
-       which is the last value taken on at the top level. If a capturing  sub-
-       pattern  is  not  matched at the top level, its final captured value is
-       unset, even if it was (temporarily) set at a deeper  level  during  the
+       the value for the inner capturing parentheses  (numbered  2)  is  "ef",
+       which  is the last value taken on at the top level. If a capturing sub-
+       pattern is not matched at the top level, its final  captured  value  is
+       unset,  even  if  it was (temporarily) set at a deeper level during the
        matching process.

-       If  there are more than 15 capturing parentheses in a pattern, PCRE has
-       to obtain extra memory to store data during a recursion, which it  does
+       If there are more than 15 capturing parentheses in a pattern, PCRE  has
+       to  obtain extra memory to store data during a recursion, which it does
        by using pcre_malloc, freeing it via pcre_free afterwards. If no memory
        can be obtained, the match fails with the PCRE_ERROR_NOMEMORY error.

-       Do not confuse the (?R) item with the condition (R),  which  tests  for
-       recursion.   Consider  this pattern, which matches text in angle brack-
-       ets, allowing for arbitrary nesting. Only digits are allowed in  nested
-       brackets  (that is, when recursing), whereas any characters are permit-
+       Do  not  confuse  the (?R) item with the condition (R), which tests for
+       recursion.  Consider this pattern, which matches text in  angle  brack-
+       ets,  allowing for arbitrary nesting. Only digits are allowed in nested
+       brackets (that is, when recursing), whereas any characters are  permit-
        ted at the outer level.

          < (?: (?(R) \d++  | [^<>]*+) | (?R)) * >

-       In this pattern, (?(R) is the start of a conditional  subpattern,  with
-       two  different  alternatives for the recursive and non-recursive cases.
+       In  this  pattern, (?(R) is the start of a conditional subpattern, with
+       two different alternatives for the recursive and  non-recursive  cases.
        The (?R) item is the actual recursive call.

    Differences in recursion processing between PCRE and Perl

-       Recursion processing in PCRE differs from Perl in two  important  ways.
-       In  PCRE (like Python, but unlike Perl), a recursive subpattern call is
+       Recursion  processing  in PCRE differs from Perl in two important ways.
+       In PCRE (like Python, but unlike Perl), a recursive subpattern call  is
        always treated as an atomic group. That is, once it has matched some of
        the subject string, it is never re-entered, even if it contains untried
-       alternatives and there is a subsequent matching failure.  This  can  be
-       illustrated  by the following pattern, which purports to match a palin-
-       dromic string that contains an odd number of characters  (for  example,
+       alternatives  and  there  is a subsequent matching failure. This can be
+       illustrated by the following pattern, which purports to match a  palin-
+       dromic  string  that contains an odd number of characters (for example,
        "a", "aba", "abcba", "abcdcba"):

          ^(.|(.)(?1)\2)$

        The idea is that it either matches a single character, or two identical
-       characters surrounding a sub-palindrome. In Perl, this  pattern  works;
-       in  PCRE  it  does  not if the pattern is longer than three characters.
+       characters  surrounding  a sub-palindrome. In Perl, this pattern works;
+       in PCRE it does not if the subject is  longer  than  three  characters.
        Consider the subject string "abcba":

-       At the top level, the first character is matched, but as it is  not  at
+       At  the  top level, the first character is matched, but as it is not at
        the end of the string, the first alternative fails; the second alterna-
        tive is taken and the recursion kicks in. The recursive call to subpat-
-       tern  1  successfully  matches the next character ("b"). (Note that the
+       tern 1 successfully matches the next character ("b").  (Note  that  the
        beginning and end of line tests are not part of the recursion).

-       Back at the top level, the next character ("c") is compared  with  what
-       subpattern  2 matched, which was "a". This fails. Because the recursion
-       is treated as an atomic group, there are now  no  backtracking  points,
-       and  so  the  entire  match fails. (Perl is able, at this point, to re-
-       enter the recursion and try the second alternative.)  However,  if  the
+       Back  at  the top level, the next character ("c") is compared with what
+       subpattern 2 matched, which was "a". This fails. Because the  recursion
+       is  treated  as  an atomic group, there are now no backtracking points,
+       and so the entire match fails. (Perl is able, at  this  point,  to  re-
+       enter  the  recursion  and try the second alternative.) However, if the
        pattern is written with the alternatives in the other order, things are
        different:

          ^((.)(?1)\2|.)$

-       This time, the recursing alternative is tried first, and  continues  to
-       recurse  until  it runs out of characters, at which point the recursion
-       fails. But this time we do have  another  alternative  to  try  at  the
-       higher  level.  That  is  the  big difference: in the previous case the
+       This  time,  the recursing alternative is tried first, and continues to
+       recurse until it runs out of characters, at which point  the  recursion
+       fails.  But  this  time  we  do  have another alternative to try at the
+       higher level. That is the big difference:  in  the  previous  case  the
        remaining alternative is at a deeper recursion level, which PCRE cannot
        use.

-       To  change  the pattern so that it matches all palindromic strings, not
-       just those with an odd number of characters, it is tempting  to  change
+       To change the pattern so that it matches all palindromic  strings,  not
+       just  those  with an odd number of characters, it is tempting to change
        the pattern to this:

          ^((.)(?1)\2|.?)$

-       Again,  this  works  in Perl, but not in PCRE, and for the same reason.
-       When a deeper recursion has matched a single character,  it  cannot  be
-       entered  again  in  order  to match an empty string. The solution is to
-       separate the two cases, and write out the odd and even cases as  alter-
+       Again, this works in Perl, but not in PCRE, and for  the  same  reason.
+       When  a  deeper  recursion has matched a single character, it cannot be
+       entered again in order to match an empty string.  The  solution  is  to
+       separate  the two cases, and write out the odd and even cases as alter-
        natives at the higher level:

          ^(?:((.)(?1)\2|)|((.)(?3)\4|.))

-       If  you  want  to match typical palindromic phrases, the pattern has to
+       If you want to match typical palindromic phrases, the  pattern  has  to
        ignore all non-word characters, which can be done like this:

          ^\W*+(?:((.)\W*+(?1)\W*+\2|)|((.)\W*+(?3)\W*+\4|\W*+.\W*+))\W*+$

        If run with the PCRE_CASELESS option, this pattern matches phrases such
        as "A man, a plan, a canal: Panama!" and it works well in both PCRE and
-       Perl. Note the use of the possessive quantifier *+ to avoid  backtrack-
-       ing  into  sequences of non-word characters. Without this, PCRE takes a
-       great deal longer (ten times or more) to  match  typical  phrases,  and
+       Perl.  Note the use of the possessive quantifier *+ to avoid backtrack-
+       ing into sequences of non-word characters. Without this, PCRE  takes  a
+       great  deal  longer  (ten  times or more) to match typical phrases, and
        Perl takes so long that you think it has gone into a loop.

-       WARNING:  The  palindrome-matching patterns above work only if the sub-
-       ject string does not start with a palindrome that is shorter  than  the
-       entire  string.  For example, although "abcba" is correctly matched, if
-       the subject is "ababa", PCRE finds the palindrome "aba" at  the  start,
-       then  fails at top level because the end of the string does not follow.
-       Once again, it cannot jump back into the recursion to try other  alter-
+       WARNING: The palindrome-matching patterns above work only if  the  sub-
+       ject  string  does not start with a palindrome that is shorter than the
+       entire string.  For example, although "abcba" is correctly matched,  if
+       the  subject  is "ababa", PCRE finds the palindrome "aba" at the start,
+       then fails at top level because the end of the string does not  follow.
+       Once  again, it cannot jump back into the recursion to try other alter-
        natives, so the entire match fails.

-       The  second  way  in which PCRE and Perl differ in their recursion pro-
-       cessing is in the handling of captured values. In Perl, when a  subpat-
-       tern  is  called recursively or as a subpattern (see the next section),
-       it has no access to any values that were captured  outside  the  recur-
-       sion,  whereas  in  PCRE  these values can be referenced. Consider this
+       The second way in which PCRE and Perl differ in  their  recursion  pro-
+       cessing  is in the handling of captured values. In Perl, when a subpat-
+       tern is called recursively or as a subroutine (see the  next  section),
+       it  has  no  access to any values that were captured outside the recur-
+       sion, whereas in PCRE these values can  be  referenced.  Consider  this
        pattern:

          ^(.)(\1|a(?2))

-       In PCRE, this pattern matches "bab". The  first  capturing  parentheses
-       match  "b",  then in the second group, when the back reference \1 fails
-       to match "b", the second alternative matches "a" and then recurses.  In
-       the  recursion,  \1 does now match "b" and so the whole match succeeds.
-       In Perl, the pattern fails to match because inside the  recursive  call
+       In  PCRE,  this  pattern matches "bab". The first capturing parentheses
+       match "b", then in the second group, when the back reference  \1  fails
+       to  match "b", the second alternative matches "a" and then recurses. In
+       the recursion, \1 does now match "b" and so the whole  match  succeeds.
+       In  Perl,  the pattern fails to match because inside the recursive call
        \1 cannot access the externally set value.

SUBPATTERNS AS SUBROUTINES

-       If  the  syntax for a recursive subpattern call (either by number or by
-       name) is used outside the parentheses to which it refers,  it  operates
-       like  a subroutine in a programming language. The called subpattern may
-       be defined before or after the reference. A numbered reference  can  be
+       If the syntax for a recursive subpattern call (either by number  or  by
+       name)  is  used outside the parentheses to which it refers, it operates
+       like a subroutine in a programming language. The called subpattern  may
+       be  defined  before or after the reference. A numbered reference can be
        absolute or relative, as in these examples:

          (...(absolute)...)...(?2)...
@@ -5528,108 +5578,109 @@

          (sens|respons)e and \1ibility

-       matches  "sense and sensibility" and "response and responsibility", but
+       matches "sense and sensibility" and "response and responsibility",  but
        not "sense and responsibility". If instead the pattern

          (sens|respons)e and (?1)ibility

-       is used, it does match "sense and responsibility" as well as the  other
-       two  strings.  Another  example  is  given  in the discussion of DEFINE
+       is  used, it does match "sense and responsibility" as well as the other
+       two strings. Another example is  given  in  the  discussion  of  DEFINE
        above.

-       All subroutine calls, whether recursive or not, are always  treated  as
-       atomic  groups. That is, once a subroutine has matched some of the sub-
+       All  subroutine  calls, whether recursive or not, are always treated as
+       atomic groups. That is, once a subroutine has matched some of the  sub-
        ject string, it is never re-entered, even if it contains untried alter-
-       natives  and  there  is  a  subsequent  matching failure. Any capturing
-       parentheses that are set during the subroutine  call  revert  to  their
+       natives and there is  a  subsequent  matching  failure.  Any  capturing
+       parentheses  that  are  set  during the subroutine call revert to their
        previous values afterwards.

-       Processing  options  such as case-independence are fixed when a subpat-
-       tern is defined, so if it is used as a subroutine, such options  cannot
+       Processing options such as case-independence are fixed when  a  subpat-
+       tern  is defined, so if it is used as a subroutine, such options cannot
        be changed for different calls. For example, consider this pattern:

          (abc)(?i:(?-1))

-       It  matches  "abcabc". It does not match "abcABC" because the change of
+       It matches "abcabc". It does not match "abcABC" because the  change  of
        processing option does not affect the called subpattern.

ONIGURUMA SUBROUTINE SYNTAX

-       For compatibility with Oniguruma, the non-Perl syntax \g followed by  a
+       For  compatibility with Oniguruma, the non-Perl syntax \g followed by a
        name or a number enclosed either in angle brackets or single quotes, is
-       an alternative syntax for referencing a  subpattern  as  a  subroutine,
-       possibly  recursively. Here are two of the examples used above, rewrit-
+       an  alternative  syntax  for  referencing a subpattern as a subroutine,
+       possibly recursively. Here are two of the examples used above,  rewrit-
        ten using this syntax:

          (?<pn> \( ( (?>[^()]+) | \g<pn> )* \) )
          (sens|respons)e and \g'1'ibility

-       PCRE supports an extension to Oniguruma: if a number is preceded  by  a
+       PCRE  supports  an extension to Oniguruma: if a number is preceded by a
        plus or a minus sign it is taken as a relative reference. For example:

          (abc)(?i:\g<-1>)

-       Note  that \g{...} (Perl syntax) and \g<...> (Oniguruma syntax) are not
-       synonymous. The former is a back reference; the latter is a  subroutine
+       Note that \g{...} (Perl syntax) and \g<...> (Oniguruma syntax) are  not
+       synonymous.  The former is a back reference; the latter is a subroutine
        call.

CALLOUTS

        Perl has a feature whereby using the sequence (?{...}) causes arbitrary
-       Perl code to be obeyed in the middle of matching a regular  expression.
+       Perl  code to be obeyed in the middle of matching a regular expression.
        This makes it possible, amongst other things, to extract different sub-
        strings that match the same pair of parentheses when there is a repeti-
        tion.

        PCRE provides a similar feature, but of course it cannot obey arbitrary
        Perl code. The feature is called "callout". The caller of PCRE provides
-       an  external function by putting its entry point in the global variable
-       pcre_callout.  By default, this variable contains NULL, which  disables
+       an external function by putting its entry point in the global  variable
+       pcre_callout.   By default, this variable contains NULL, which disables
        all calling out.

-       Within  a  regular  expression,  (?C) indicates the points at which the
-       external function is to be called. If you want  to  identify  different
-       callout  points, you can put a number less than 256 after the letter C.
-       The default value is zero.  For example, this pattern has  two  callout
+       Within a regular expression, (?C) indicates the  points  at  which  the
+       external  function  is  to be called. If you want to identify different
+       callout points, you can put a number less than 256 after the letter  C.
+       The  default  value is zero.  For example, this pattern has two callout
        points:

          (?C1)abc(?C2)def

        If the PCRE_AUTO_CALLOUT flag is passed to pcre_compile(), callouts are
-       automatically installed before each item in the pattern. They  are  all
+       automatically  installed  before each item in the pattern. They are all
        numbered 255.

        During matching, when PCRE reaches a callout point (and pcre_callout is
-       set), the external function is called. It is provided with  the  number
-       of  the callout, the position in the pattern, and, optionally, one item
-       of data originally supplied by the caller of pcre_exec().  The  callout
-       function  may cause matching to proceed, to backtrack, or to fail alto-
+       set),  the  external function is called. It is provided with the number
+       of the callout, the position in the pattern, and, optionally, one  item
+       of  data  originally supplied by the caller of pcre_exec(). The callout
+       function may cause matching to proceed, to backtrack, or to fail  alto-
        gether. A complete description of the interface to the callout function
        is given in the pcrecallout documentation.

BACKTRACKING CONTROL

-       Perl  5.10 introduced a number of "Special Backtracking Control Verbs",
+       Perl 5.10 introduced a number of "Special Backtracking Control  Verbs",
        which are described in the Perl documentation as "experimental and sub-
-       ject  to  change or removal in a future version of Perl". It goes on to
-       say: "Their usage in production code should be noted to avoid  problems
+       ject to change or removal in a future version of Perl". It goes  on  to
+       say:  "Their usage in production code should be noted to avoid problems
        during upgrades." The same remarks apply to the PCRE features described
        in this section.

-       Since these verbs are specifically related  to  backtracking,  most  of
-       them  can  be  used  only  when  the  pattern  is  to  be matched using
+       Since  these  verbs  are  specifically related to backtracking, most of
+       them can be  used  only  when  the  pattern  is  to  be  matched  using
        pcre_exec(), which uses a backtracking algorithm. With the exception of
        (*FAIL), which behaves like a failing negative assertion, they cause an
        error if encountered by pcre_dfa_exec().

-       If any of these verbs are used in an assertion or in a subpattern  that
+       If  any of these verbs are used in an assertion or in a subpattern that
        is called as a subroutine (whether or not recursively), their effect is
        confined to that subpattern; it does not extend to the surrounding pat-
-       tern,  with  one  exception:  a *MARK that is encountered in a positive
-       assertion is passed back (compare capturing parentheses in assertions).
+       tern, with one exception: the name from a (*MARK), (*PRUNE), or (*THEN)
+       that  is  encountered in a successful positive assertion is passed back
+       when a match succeeds (compare capturing  parentheses  in  assertions).
        Note that such subpatterns are processed as anchored at the point where
        they are tested. Note also that Perl's treatment of subroutines is dif-
        ferent in some cases.
@@ -5652,59 +5703,61 @@
        by setting the PCRE_NO_START_OPTIMIZE  option  when  calling  pcre_com-
        pile() or pcre_exec(), or by starting the pattern with (*NO_START_OPT).

+       Experiments  with  Perl  suggest that it too has similar optimizations,
+       sometimes leading to anomalous results.
+
    Verbs that act immediately

-       The  following  verbs act as soon as they are encountered. They may not
+       The following verbs act as soon as they are encountered. They  may  not
        be followed by a name.

           (*ACCEPT)

-       This verb causes the match to end successfully, skipping the  remainder
-       of  the pattern. However, when it is inside a subpattern that is called
-       as a subroutine, only that subpattern is ended  successfully.  Matching
-       then  continues  at  the  outer level. If (*ACCEPT) is inside capturing
+       This  verb causes the match to end successfully, skipping the remainder
+       of the pattern. However, when it is inside a subpattern that is  called
+       as  a  subroutine, only that subpattern is ended successfully. Matching
+       then continues at the outer level. If  (*ACCEPT)  is  inside  capturing
        parentheses, the data so far is captured. For example:

          A((?:A|B(*ACCEPT)|C)D)

-       This matches "AB", "AAD", or "ACD"; when it matches "AB", "B"  is  cap-
+       This  matches  "AB", "AAD", or "ACD"; when it matches "AB", "B" is cap-
        tured by the outer parentheses.

          (*FAIL) or (*F)

-       This  verb causes a matching failure, forcing backtracking to occur. It
-       is equivalent to (?!) but easier to read. The Perl documentation  notes
-       that  it  is  probably  useful only when combined with (?{}) or (??{}).
-       Those are, of course, Perl features that are not present in  PCRE.  The
-       nearest  equivalent is the callout feature, as for example in this pat-
+       This verb causes a matching failure, forcing backtracking to occur.  It
+       is  equivalent to (?!) but easier to read. The Perl documentation notes
+       that it is probably useful only when combined  with  (?{})  or  (??{}).
+       Those  are,  of course, Perl features that are not present in PCRE. The
+       nearest equivalent is the callout feature, as for example in this  pat-
        tern:

          a+(?C)(*FAIL)

-       A match with the string "aaaa" always fails, but the callout  is  taken
+       A  match  with the string "aaaa" always fails, but the callout is taken
        before each backtrack happens (in this example, 10 times).

    Recording which path was taken

-       There  is  one  verb  whose  main  purpose  is to track how a match was
-       arrived at, though it also has a  secondary  use  in  conjunction  with
+       There is one verb whose main purpose  is  to  track  how  a  match  was
+       arrived  at,  though  it  also  has a secondary use in conjunction with
        advancing the match starting point (see (*SKIP) below).

          (*MARK:NAME) or (*:NAME)

-       A  name  is  always  required  with  this  verb.  There  may be as many
-       instances of (*MARK) as you like in a pattern, and their names  do  not
+       A name is always  required  with  this  verb.  There  may  be  as  many
+       instances  of  (*MARK) as you like in a pattern, and their names do not
        have to be unique.

-       When  a  match  succeeds,  the  name of the last-encountered (*MARK) is
-       passed back to  the  caller  via  the  pcre_extra  data  structure,  as
-       described in the section on pcre_extra in the pcreapi documentation. No
-       data is returned for a partial match. Here is an  example  of  pcretest
-       output,  where the /K modifier requests the retrieval and outputting of
-       (*MARK) data:
+       When a match succeeds, the name of the last-encountered (*MARK) on  the
+       matching  path  is  passed  back  to the caller via the pcre_extra data
+       structure, as described in the section on  pcre_extra  in  the  pcreapi
+       documentation. Here is an example of pcretest output, where the /K mod-
+       ifier requests the retrieval and outputting of (*MARK) data:

-         /X(*MARK:A)Y|X(*MARK:B)Z/K
-         XY
+           re> /X(*MARK:A)Y|X(*MARK:B)Z/K
+         data> XY
           0: XY
          MK: A
          XZ
@@ -5720,98 +5773,78 @@
        and passed back if it is the last-encountered. This does not happen for
        negative assertions.

-       A  name  may  also  be  returned after a failed match if the final path
-       through the pattern involves (*MARK). However, unless (*MARK)  used  in
-       conjunction  with  (*COMMIT),  this  is unlikely to happen for an unan-
-       chored pattern because, as the starting point for matching is advanced,
-       the final check is often with an empty string, causing a failure before
-       (*MARK) is reached. For example:
+       After  a  partial match or a failed match, the name of the last encoun-
+       tered (*MARK) in the entire match process is returned. For example:

-         /X(*MARK:A)Y|X(*MARK:B)Z/K
-         XP
-         No match
-
-       There are three potential starting points for this match (starting with
-       X,  starting  with  P,  and  with  an  empty string). If the pattern is
-       anchored, the result is different:
-
-         /^X(*MARK:A)Y|^X(*MARK:B)Z/K
-         XP
+           re> /X(*MARK:A)Y|X(*MARK:B)Z/K
+         data> XP
          No match, mark = B

-       PCRE's start-of-match optimizations can also interfere with  this.  For
-       example,  if, as a result of a call to pcre_study(), it knows the mini-
-       mum subject length for a match, a shorter subject will not  be  scanned
-       at all.
+       Note that in this unanchored example the  mark  is  retained  from  the
+       match attempt that started at the letter "X". Subsequent match attempts
+       starting at "P" and then with an empty string do not get as far as  the
+       (*MARK) item, but nevertheless do not reset it.

-       Note that similar anomalies (though different in detail) exist in Perl,
-       no doubt for the same reasons. The use of (*MARK) data after  a  failed
-       match  of an unanchored pattern is not recommended, unless (*COMMIT) is
-       involved.
-
    Verbs that act after backtracking

        The following verbs do nothing when they are encountered. Matching con-
-       tinues  with what follows, but if there is no subsequent match, causing
-       a backtrack to the verb, a failure is  forced.  That  is,  backtracking
-       cannot  pass  to the left of the verb. However, when one of these verbs
-       appears inside an atomic group, its effect is confined to  that  group,
-       because  once the group has been matched, there is never any backtrack-
-       ing into it. In this situation, backtracking can  "jump  back"  to  the
-       left  of the entire atomic group. (Remember also, as stated above, that
+       tinues with what follows, but if there is no subsequent match,  causing
+       a  backtrack  to  the  verb, a failure is forced. That is, backtracking
+       cannot pass to the left of the verb. However, when one of  these  verbs
+       appears  inside  an atomic group, its effect is confined to that group,
+       because once the group has been matched, there is never any  backtrack-
+       ing  into  it.  In  this situation, backtracking can "jump back" to the
+       left of the entire atomic group. (Remember also, as stated above,  that
        this localization also applies in subroutine calls and assertions.)

-       These verbs differ in exactly what kind of failure  occurs  when  back-
+       These  verbs  differ  in exactly what kind of failure occurs when back-
        tracking reaches them.

          (*COMMIT)

-       This  verb, which may not be followed by a name, causes the whole match
+       This verb, which may not be followed by a name, causes the whole  match
        to fail outright if the rest of the pattern does not match. Even if the
        pattern is unanchored, no further attempts to find a match by advancing
        the  starting  point  take  place.  Once  (*COMMIT)  has  been  passed,
-       pcre_exec()  is  committed  to  finding a match at the current starting
+       pcre_exec() is committed to finding a match  at  the  current  starting
        point, or not at all. For example:

          a+(*COMMIT)b

-       This matches "xxaab" but not "aacaab". It can be thought of as  a  kind
+       This  matches  "xxaab" but not "aacaab". It can be thought of as a kind
        of dynamic anchor, or "I've started, so I must finish." The name of the
-       most recently passed (*MARK) in the path is passed back when  (*COMMIT)
+       most  recently passed (*MARK) in the path is passed back when (*COMMIT)
        forces a match failure.

-       Note  that  (*COMMIT)  at  the start of a pattern is not the same as an
-       anchor, unless PCRE's start-of-match optimizations are turned  off,  as
+       Note that (*COMMIT) at the start of a pattern is not  the  same  as  an
+       anchor,  unless  PCRE's start-of-match optimizations are turned off, as
        shown in this pcretest example:

-         /(*COMMIT)abc/
-         xyzabc
+           re> /(*COMMIT)abc/
+         data> xyzabc
           0: abc
          xyzabc\Y
          No match

-       PCRE  knows  that  any  match  must start with "a", so the optimization
-       skips along the subject to "a" before running the first match  attempt,
-       which  succeeds.  When the optimization is disabled by the \Y escape in
+       PCRE knows that any match must start  with  "a",  so  the  optimization
+       skips  along the subject to "a" before running the first match attempt,
+       which succeeds. When the optimization is disabled by the \Y  escape  in
        the second subject, the match starts at "x" and so the (*COMMIT) causes
        it to fail without trying any other starting points.

          (*PRUNE) or (*PRUNE:NAME)

-       This  verb causes the match to fail at the current starting position in
-       the subject if the rest of the pattern does not match. If  the  pattern
-       is  unanchored,  the  normal  "bumpalong"  advance to the next starting
-       character then happens. Backtracking can occur as usual to the left  of
-       (*PRUNE),  before  it  is  reached,  or  when  matching to the right of
-       (*PRUNE), but if there is no match to the  right,  backtracking  cannot
-       cross  (*PRUNE). In simple cases, the use of (*PRUNE) is just an alter-
-       native to an atomic group or possessive quantifier, but there are  some
+       This verb causes the match to fail at the current starting position  in
+       the  subject  if the rest of the pattern does not match. If the pattern
+       is unanchored, the normal "bumpalong"  advance  to  the  next  starting
+       character  then happens. Backtracking can occur as usual to the left of
+       (*PRUNE), before it is reached,  or  when  matching  to  the  right  of
+       (*PRUNE),  but  if  there is no match to the right, backtracking cannot
+       cross (*PRUNE). In simple cases, the use of (*PRUNE) is just an  alter-
+       native  to an atomic group or possessive quantifier, but there are some
        uses of (*PRUNE) that cannot be expressed in any other way.  The behav-
-       iour of (*PRUNE:NAME) is the  same  as  (*MARK:NAME)(*PRUNE)  when  the
-       match  fails  completely;  the name is passed back if this is the final
-       attempt.  (*PRUNE:NAME) does not pass back a name  if  the  match  suc-
-       ceeds.  In  an  anchored pattern (*PRUNE) has the same effect as (*COM-
-       MIT).
+       iour  of  (*PRUNE:NAME)  is  the  same  as  (*MARK:NAME)(*PRUNE). In an
+       anchored pattern (*PRUNE) has the same effect as (*COMMIT).

          (*SKIP)

@@ -5838,67 +5871,66 @@
        is searched for the most recent (*MARK) that has the same name. If  one
        is  found, the "bumpalong" advance is to the subject position that cor-
        responds to that (*MARK) instead of to where (*SKIP)  was  encountered.
-       If  no (*MARK) with a matching name is found, normal "bumpalong" of one
-       character happens (that is, the (*SKIP) is ignored).
+       If no (*MARK) with a matching name is found, the (*SKIP) is ignored.

          (*THEN) or (*THEN:NAME)

-       This verb causes a skip to the next innermost alternative if  the  rest
-       of  the  pattern does not match. That is, it cancels pending backtrack-
-       ing, but only within the current alternative. Its name comes  from  the
+       This  verb  causes a skip to the next innermost alternative if the rest
+       of the pattern does not match. That is, it cancels  pending  backtrack-
+       ing,  but  only within the current alternative. Its name comes from the
        observation that it can be used for a pattern-based if-then-else block:

          ( COND1 (*THEN) FOO | COND2 (*THEN) BAR | COND3 (*THEN) BAZ ) ...

-       If  the COND1 pattern matches, FOO is tried (and possibly further items
-       after the end of the group if FOO succeeds); on  failure,  the  matcher
-       skips  to  the second alternative and tries COND2, without backtracking
-       into COND1. The behaviour  of  (*THEN:NAME)  is  exactly  the  same  as
-       (*MARK:NAME)(*THEN)  if  the  overall  match  fails.  If (*THEN) is not
-       inside an alternation, it acts like (*PRUNE).
+       If the COND1 pattern matches, FOO is tried (and possibly further  items
+       after  the  end  of the group if FOO succeeds); on failure, the matcher
+       skips to the second alternative and tries COND2,  without  backtracking
+       into  COND1.  The  behaviour  of  (*THEN:NAME)  is  exactly the same as
+       (*MARK:NAME)(*THEN).  If (*THEN) is not inside an alternation, it  acts
+       like (*PRUNE).

-       Note that a subpattern that does not contain a | character  is  just  a
-       part  of the enclosing alternative; it is not a nested alternation with
-       only one alternative. The effect of (*THEN) extends beyond such a  sub-
-       pattern  to  the enclosing alternative. Consider this pattern, where A,
+       Note  that  a  subpattern that does not contain a | character is just a
+       part of the enclosing alternative; it is not a nested alternation  with
+       only  one alternative. The effect of (*THEN) extends beyond such a sub-
+       pattern to the enclosing alternative. Consider this pattern,  where  A,
        B, etc. are complex pattern fragments that do not contain any | charac-
        ters at this level:

          A (B(*THEN)C) | D

-       If  A and B are matched, but there is a failure in C, matching does not
+       If A and B are matched, but there is a failure in C, matching does  not
        backtrack into A; instead it moves to the next alternative, that is, D.
-       However,  if the subpattern containing (*THEN) is given an alternative,
+       However, if the subpattern containing (*THEN) is given an  alternative,
        it behaves differently:

          A (B(*THEN)C | (*FAIL)) | D

-       The effect of (*THEN) is now confined to the inner subpattern. After  a
+       The  effect of (*THEN) is now confined to the inner subpattern. After a
        failure in C, matching moves to (*FAIL), which causes the whole subpat-
-       tern to fail because there are no more alternatives  to  try.  In  this
+       tern  to  fail  because  there are no more alternatives to try. In this
        case, matching does now backtrack into A.

        Note also that a conditional subpattern is not considered as having two
-       alternatives, because only one is ever used.  In  other  words,  the  |
+       alternatives,  because  only  one  is  ever used. In other words, the |
        character in a conditional subpattern has a different meaning. Ignoring
        white space, consider:

          ^.*? (?(?=a) a | b(*THEN)c )

-       If the subject is "ba", this pattern does not  match.  Because  .*?  is
-       ungreedy,  it  initially  matches  zero characters. The condition (?=a)
-       then fails, the character "b" is matched,  but  "c"  is  not.  At  this
-       point,  matching does not backtrack to .*? as might perhaps be expected
-       from the presence of the | character.  The  conditional  subpattern  is
+       If  the  subject  is  "ba", this pattern does not match. Because .*? is
+       ungreedy, it initially matches zero  characters.  The  condition  (?=a)
+       then  fails,  the  character  "b"  is  matched, but "c" is not. At this
+       point, matching does not backtrack to .*? as might perhaps be  expected
+       from  the  presence  of  the | character. The conditional subpattern is
        part of the single alternative that comprises the whole pattern, and so
-       the match fails. (If there was a backtrack into  .*?,  allowing  it  to
+       the  match  fails.  (If  there was a backtrack into .*?, allowing it to
        match "b", the match would succeed.)

-       The  verbs just described provide four different "strengths" of control
+       The verbs just described provide four different "strengths" of  control
        when subsequent matching fails. (*THEN) is the weakest, carrying on the
-       match  at  the next alternative. (*PRUNE) comes next, failing the match
-       at the current starting position, but allowing an advance to  the  next
-       character  (for an unanchored pattern). (*SKIP) is similar, except that
+       match at the next alternative. (*PRUNE) comes next, failing  the  match
+       at  the  current starting position, but allowing an advance to the next
+       character (for an unanchored pattern). (*SKIP) is similar, except  that
        the advance may be more than one character. (*COMMIT) is the strongest,
        causing the entire match to fail.

@@ -5908,8 +5940,8 @@

          (A(*COMMIT)B(*THEN)C|D)

-       Once  A  has  matched,  PCRE is committed to this match, at the current
-       starting position. If subsequently B matches, but C does not, the  nor-
+       Once A has matched, PCRE is committed to this  match,  at  the  current
+       starting  position. If subsequently B matches, but C does not, the nor-
        mal (*THEN) action of trying the next alternative (that is, D) does not
        happen because (*COMMIT) overrides.

@@ -5928,7 +5960,7 @@

REVISION

-       Last updated: 19 October 2011
+       Last updated: 29 November 2011
        Copyright (c) 1997-2011 University of Cambridge.
 ------------------------------------------------------------------------------

@@ -6497,13 +6529,19 @@
        been  fully  tested. If --enable-jit is set on an unsupported platform,
        compilation fails.

-       A program can tell if JIT support is available by calling pcre_config()
-       with the PCRE_CONFIG_JIT option. The result is 1 when JIT is available,
-       and 0 otherwise. However, a simple program does not need to check  this
-       in order to use JIT. The API is implemented in a way that falls back to
-       the ordinary PCRE code if JIT is not available.
+       A program that is linked with PCRE 8.20 or later can tell if  JIT  sup-
+       port  is  available  by  calling pcre_config() with the PCRE_CONFIG_JIT
+       option. The result is 1 when JIT is available, and  0  otherwise.  How-
+       ever, a simple program does not need to check this in order to use JIT.
+       The API is implemented in a way that falls back to  the  ordinary  PCRE
+       code if JIT is not available.

+       If  your program may sometimes be linked with versions of PCRE that are
+       older than 8.20, but you want to use JIT when it is available, you  can
+       test the values of PCRE_MAJOR and PCRE_MINOR, or the existence of a JIT
+       macro such as PCRE_CONFIG_JIT, for compile-time control of your code.

+
SIMPLE USE OF JIT

        You have to do two things to make use of the JIT support  in  the  sim-
@@ -6517,6 +6555,22 @@
              no longer needed instead of just freeing it yourself. This
              ensures that any JIT data is also freed.

+       For  a  program  that may be linked with pre-8.20 versions of PCRE, you
+       can insert
+
+         #ifndef PCRE_STUDY_JIT_COMPILE
+         #define PCRE_STUDY_JIT_COMPILE 0
+         #endif
+
+       so that no option is passed to pcre_study(),  and  then  use  something
+       like this to free the study data:
+
+         #ifdef PCRE_CONFIG_JIT
+             pcre_free_study(study_ptr);
+         #else
+             pcre_free(study_ptr);
+         #endif
+
        In  some circumstances you may need to call additional functions. These
        are described in the  section  entitled  "Controlling  the  JIT  stack"
        below.
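
The compile-time checks described above can be sketched in C. This is only an
illustration under stated assumptions: it deliberately does not include pcre.h,
so PCRE_STUDY_JIT_COMPILE is undefined and the fallback branch is taken, much
as it would be with a pre-8.20 header. The helper names version_has_jit_api()
and study_options() are hypothetical and are not part of the PCRE API.

```c
#include <stdbool.h>

/* Fallback from the text above: with a pre-8.20 pcre.h the macro is missing,
   so define it as "no option". pcre.h is not included in this sketch, so
   this branch is the one that is taken here. */
#ifndef PCRE_STUDY_JIT_COMPILE
#define PCRE_STUDY_JIT_COMPILE 0
#endif

/* Hypothetical helper: true when major.minor is 8.20 or later, the release
   that the text names as the first one with JIT support. */
bool version_has_jit_api(int major, int minor)
{
    return major > 8 || (major == 8 && minor >= 20);
}

/* Hypothetical helper: the option word a program would hand to pcre_study().
   It is 0 when the header does not know about JIT. */
int study_options(void)
{
    return PCRE_STUDY_JIT_COMPILE;
}
```

In a real program the arguments would come from PCRE_MAJOR and PCRE_MINOR in
pcre.h, and the option word would be passed as the second argument of
pcre_study().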
@@ -6555,12 +6609,8 @@

        The unsupported pattern items are:

-         \C            match a single byte; not supported in UTF-8 mode
+         \C             match a single byte; not supported in UTF-8 mode
          (?Cn)          callouts
-         (?(<name>)...  conditional test on setting of a named subpattern
-         (?(R)...       conditional test on whole pattern recursion
-         (?(Rn)...      conditional test on recursion, by number
-         (?(R&name)...  conditional test on recursion, by name
          (*COMMIT)      )
          (*MARK)        )
          (*PRUNE)       ) the backtracking control verbs
@@ -6609,28 +6659,29 @@
        large  or  complicated  patterns  need  more  than  this.   The   error
        PCRE_ERROR_JIT_STACKLIMIT  is  given  when  there  is not enough stack.
        Three functions are provided for managing blocks of memory for  use  as
-       JIT stacks.
+       JIT  stacks. There is further discussion about the use of JIT stacks in
+       the section entitled "JIT stack FAQ" below.

-       The  pcre_jit_stack_alloc() function creates a JIT stack. Its arguments
-       are a starting size and a maximum size, and it returns a pointer to  an
-       opaque  structure of type pcre_jit_stack, or NULL if there is an error.
-       The pcre_jit_stack_free() function can be used to free a stack that  is
-       no  longer  needed.  (For  the technically minded: the address space is
+       The pcre_jit_stack_alloc() function creates a JIT stack. Its  arguments
+       are  a starting size and a maximum size, and it returns a pointer to an
+       opaque structure of type pcre_jit_stack, or NULL if there is an  error.
+       The  pcre_jit_stack_free() function can be used to free a stack that is
+       no longer needed. (For the technically minded:  the  address  space  is
        allocated by mmap or VirtualAlloc.)

-       JIT uses far less memory for recursion than the interpretive code,  and
-       a  maximum  stack size of 512K to 1M should be more than enough for any
+       JIT  uses far less memory for recursion than the interpretive code, and
+       a maximum stack size of 512K to 1M should be more than enough  for  any
        pattern.

-       The pcre_assign_jit_stack() function specifies  which  stack  JIT  code
+       The  pcre_assign_jit_stack()  function  specifies  which stack JIT code
        should use. Its arguments are as follows:

          pcre_extra         *extra
          pcre_jit_callback  callback
          void               *data

-       The  extra  argument  must  be  the  result  of studying a pattern with
-       PCRE_STUDY_JIT_COMPILE. There are three cases for  the  values  of  the
+       The extra argument must be  the  result  of  studying  a  pattern  with
+       PCRE_STUDY_JIT_COMPILE.  There  are  three  cases for the values of the
        other two options:

          (1) If callback is NULL and data is NULL, an internal 32K block
@@ -6645,18 +6696,18 @@
              is used; otherwise the return value must be a valid JIT stack,
              the result of calling pcre_jit_stack_alloc().

-       You  may  safely assign the same JIT stack to more than one pattern, as
+       You may safely assign the same JIT stack to more than one  pattern,  as
        long as they are all matched sequentially in the same thread. In a mul-
        tithread application, each thread must use its own JIT stack.

-       Strictly  speaking, even more is allowed. You can assign the same stack
-       to any number of patterns as long as they are not used for matching  by
+       Strictly speaking, even more is allowed. You can assign the same  stack
+       to  any number of patterns as long as they are not used for matching by
        multiple threads at the same time. For example, you can assign the same
-       stack to all compiled patterns, and use a global mutex in the  callback
+       stack  to all compiled patterns, and use a global mutex in the callback
        to wait until the stack is available for use. However, this is an inef-
        ficient solution, and not recommended.

-       This is a suggestion for how  a  typical  multithreaded  program  might
+       This  is  a  suggestion  for  how a typical multithreaded program might
        operate:

          During thread initialization
@@ -6668,12 +6719,80 @@
          Use a one-line callback function
            return thread_local_var

-       All  the  functions  described in this section do nothing if JIT is not
-       available, and pcre_assign_jit_stack() does nothing  unless  the  extra
-       argument  is  non-NULL  and  points  to  a pcre_extra block that is the
+       All the functions described in this section do nothing if  JIT  is  not
+       available,  and  pcre_assign_jit_stack()  does nothing unless the extra
+       argument is non-NULL and points to  a  pcre_extra  block  that  is  the
        result of a successful study with PCRE_STUDY_JIT_COMPILE.

+JIT STACK FAQ
+
+       (1) Why do we need JIT stacks?
+
+       PCRE  (and JIT) is a recursive, depth-first engine, so it needs a stack
+       where the local data of the current node is pushed before checking  its
+       child nodes. Allocating real machine stack on some platforms is
+       difficult. For example, on PowerPC the stack chain must be updated every
+       time the stack is extended. Although this is possible, the overhead of
+       updating it decreases performance. So we do the recursion in memory.
+
+       (2) Why don't we simply allocate blocks of memory with malloc()?
+
+       Modern operating systems have a  nice  feature:  they  can  reserve  an
+       address space instead of allocating memory. We can safely allocate mem-
+       ory pages inside this address space, so the stack  could  grow  without
+       moving memory data (this is important because of pointers). Thus we can
+       allocate 1M address space, and use only a single memory  page  (usually
+       4K)  if  that is enough. However, we can still grow up to 1M anytime if
+       needed.
+
+       (3) Who "owns" a JIT stack?
+
+       The owner of the stack is the user program, not the JIT studied pattern
+       or anything else. The user program must ensure that while a stack is in
+       use by pcre_exec() (that is, it is assigned to the pattern currently
+       running), it is not used by any other thread (to avoid overwriting the
+       same memory area). The best practice for multithreaded programs is to
+       allocate a stack for each thread, and return this stack through the JIT
+       callback function.
+
+       (4) When should a JIT stack be freed?
+
+       You can free a JIT stack at any time, as long as it will not be used by
+       pcre_exec()  again.  When  you  assign  the  stack to a pattern, only a
+       pointer is set. There is no reference counting or any other magic.  You
+       can  free  the  patterns  and stacks in any order, anytime. Just do not
+       call pcre_exec() with a pattern pointing to an already freed stack,  as
+       that will cause a segfault. (Also, do not free a stack currently used by
+       pcre_exec() in another thread.) You can also replace the  stack  for  a
+       pattern  at  any  time.  You  can  even  free the previous stack before
+       assigning a replacement.
+
+       (5) Should I allocate/free a  stack  every  time  before/after  calling
+       pcre_exec()?
+
+       No,  because  this  is  too  costly in terms of resources. However, you
+       could implement some clever idea which releases the stack if it is  not
+       used in let's say two minutes. The JIT callback can help to achieve this
+       without keeping a list of the currently JIT studied patterns.
+
+       (6) OK, the stack is for long term memory allocation. But what  happens
+       if  a pattern causes stack overflow with a stack of 1M? Is that 1M kept
+       until the stack is freed?
+
+       Especially on embedded systems, it might be a good idea to release
+       memory sometimes without freeing the stack. There is no API for this at
+       the moment. Probably a function call which returns the currently
+       allocated memory for any stack and another which allows releasing
+       memory (shrinking the stack) would be a good idea if someone needs this.
+
+       (7) This is too much of a headache. Isn't there any better solution for
+       JIT stack handling?
+
+       No,  thanks to Windows. If POSIX threads were used everywhere, we could
+       throw out this complicated API.
+
+
 EXAMPLE CODE

        This is a single-threaded example that specifies a  JIT  stack  without
@@ -6705,14 +6824,14 @@

AUTHOR

-       Philip Hazel
+       Philip Hazel (FAQ by Zoltan Herczeg)
        University Computing Service
        Cambridge CB2 3QH, England.

REVISION

-       Last updated: 19 October 2011
+       Last updated: 26 November 2011
        Copyright (c) 1997-2011 University of Cambridge.
 ------------------------------------------------------------------------------
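
FAQ item (2) above contrasts reserving address space with allocating memory.
On POSIX systems that idea can be sketched with mmap() and mprotect(), as
below. This is not PCRE's implementation (the text only says that the address
space is allocated by mmap or VirtualAlloc). The struct and function names
(region, region_reserve, region_grow, region_release) are hypothetical, and
Windows is not covered.

```c
#define _DEFAULT_SOURCE
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical growable region: a reserved address range of which only the
   first "committed" bytes are readable and writable. */
struct region {
    unsigned char *base;
    size_t reserved;
    size_t committed;
};

/* Reserve max_bytes of address space without making any of it accessible. */
int region_reserve(struct region *r, size_t max_bytes)
{
    void *p = mmap(NULL, max_bytes, PROT_NONE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -1;
    r->base = p;
    r->reserved = max_bytes;
    r->committed = 0;
    return 0;
}

/* Make at least new_bytes usable, rounded up to whole pages. The base
   address never changes, so pointers into the region stay valid. */
int region_grow(struct region *r, size_t new_bytes)
{
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    size_t want = (new_bytes + page - 1) / page * page;

    if (want > r->reserved)
        return -1;              /* request exceeds the reserved range */
    if (want <= r->committed)
        return 0;               /* already large enough */
    if (mprotect(r->base, want, PROT_READ | PROT_WRITE) != 0)
        return -1;
    r->committed = want;
    return 0;
}

/* Give the whole reserved range back to the system. */
void region_release(struct region *r)
{
    munmap(r->base, r->reserved);
    r->base = NULL;
    r->reserved = 0;
    r->committed = 0;
}
```

A caller would reserve, say, 1M once and call region_grow() as more space is
needed. Growing never moves existing data, which is the property the FAQ
singles out as important because of pointers.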

@@ -8153,6 +8272,12 @@
        There is no limit to the number of parenthesized subpatterns, but there
        can be no more than 65535 capturing subpatterns.

+       There is a limit to the number of forward references to subsequent sub-
+       patterns of around 200,000.  Repeated  forward  references  with  fixed
+       upper  limits,  for example, (?2){0,100} when subpattern number 2 is to
+       the right, are included in the count. There is no limit to  the  number
+       of backward references.
+
        The maximum length of name for a named subpattern is 32 characters, and
        the maximum number of named subpatterns is 10000.

@@ -8173,7 +8298,7 @@

REVISION

-       Last updated: 24 August 2011
+       Last updated: 30 November 2011
        Copyright (c) 1997-2011 University of Cambridge.
 ------------------------------------------------------------------------------

Modified: code/trunk/doc/pcreapi.3
===================================================================
--- code/trunk/doc/pcreapi.3    2011-12-28 16:10:09 UTC (rev 835)
+++ code/trunk/doc/pcreapi.3    2011-12-28 17:16:11 UTC (rev 836)
@@ -644,18 +644,18 @@
 pattern such as (\e1)(a) succeeds when this option is set (assuming it can find
 an "a" in the subject), whereas it fails by default, for Perl compatibility.
 .P
-(3) \eU matches an upper case "U" character; by default \eU causes a compile 
+(3) \eU matches an upper case "U" character; by default \eU causes a compile
 time error (Perl uses \eU to upper case subsequent characters).
 .P
-(4) \eu matches a lower case "u" character unless it is followed by four 
-hexadecimal digits, in which case the hexadecimal number defines the code point 
-to match. By default, \eu causes a compile time error (Perl uses it to upper 
+(4) \eu matches a lower case "u" character unless it is followed by four
+hexadecimal digits, in which case the hexadecimal number defines the code point
+to match. By default, \eu causes a compile time error (Perl uses it to upper
 case the following character).
 .P
-(5) \ex matches a lower case "x" character unless it is followed by two 
-hexadecimal digits, in which case the hexadecimal number defines the code point 
-to match. By default, as in Perl, a hexadecimal number is always expected after 
-\ex, but it may have zero, one, or two digits (so, for example, \exz matches a 
+(5) \ex matches a lower case "x" character unless it is followed by two
+hexadecimal digits, in which case the hexadecimal number defines the code point
+to match. By default, as in Perl, a hexadecimal number is always expected after
+\ex, but it may have zero, one, or two digits (so, for example, \exz matches a
 binary zero character followed by z).
 .sp
   PCRE_MULTILINE
@@ -1147,6 +1147,12 @@
 .\"
 documentation for details of what can and cannot be handled.
 .sp
+  PCRE_INFO_JITSIZE
+.sp
+If the pattern was successfully studied with the PCRE_STUDY_JIT_COMPILE option,
+return the size of the JIT compiled code, otherwise return zero. The fourth
+argument should point to a \fBsize_t\fP variable.
+.sp
   PCRE_INFO_LASTLITERAL
 .sp
 Return the value of the rightmost literal byte that must exist in any matched
@@ -1262,10 +1268,13 @@
 .sp
   PCRE_INFO_SIZE
 .sp
-Return the size of the compiled pattern, that is, the value that was passed as
-the argument to \fBpcre_malloc()\fP when PCRE was getting memory in which to
-place the compiled data. The fourth argument should point to a \fBsize_t\fP
-variable.
+Return the size of the compiled pattern. The fourth argument should point to a
+\fBsize_t\fP variable. This value does not include the size of the \fBpcre\fP
+structure that is returned by \fBpcre_compile()\fP. The value that is passed as
+the argument to \fBpcre_malloc()\fP when \fBpcre_compile()\fP is getting memory
+in which to place the compiled data is the value returned by this option plus
+the size of the \fBpcre\fP structure. Studying a compiled pattern, with or
+without JIT, does not alter the value returned by this option.
 .sp
   PCRE_INFO_STUDYSIZE
 .sp
@@ -2544,6 +2553,6 @@
 .rs
 .sp
 .nf
-Last updated: 14 November 2011
+Last updated: 02 December 2011
 Copyright (c) 1997-2011 University of Cambridge.
 .fi

Modified: code/trunk/doc/pcrecallout.3
===================================================================
--- code/trunk/doc/pcrecallout.3    2011-12-28 16:10:09 UTC (rev 835)
+++ code/trunk/doc/pcrecallout.3    2011-12-28 17:16:11 UTC (rev 836)
@@ -160,9 +160,10 @@
 .P
 The \fImark\fP field is present from version 2 of the \fIpcre_callout\fP
 structure. In callouts from \fBpcre_exec()\fP it contains a pointer to the
-zero-terminated name of the most recently passed (*MARK) item in the match, or
-NULL if there are no (*MARK)s in the current matching path. In callouts from
-\fBpcre_dfa_exec()\fP this field always contains NULL.
+zero-terminated name of the most recently passed (*MARK), (*PRUNE), or (*THEN)
+item in the match, or NULL if no such items have been passed. Instances of
+(*PRUNE) or (*THEN) without a name do not obliterate a previous (*MARK). In
+callouts from \fBpcre_dfa_exec()\fP this field always contains NULL.
 .
 .
 .SH "RETURN VALUES"
@@ -195,6 +196,6 @@
 .rs
 .sp
 .nf
-Last updated: 26 August 2011
+Last updated: 30 November 2011
 Copyright (c) 1997-2011 University of Cambridge.
 .fi

Modified: code/trunk/doc/pcrecompat.3
===================================================================
--- code/trunk/doc/pcrecompat.3    2011-12-28 16:10:09 UTC (rev 835)
+++ code/trunk/doc/pcrecompat.3    2011-12-28 17:16:11 UTC (rev 836)
@@ -38,7 +38,7 @@
 own, matching a non-newline character, is supported.) In fact these are
 implemented by Perl's general string-handling and are not part of its pattern
 matching engine. If any of these are encountered by PCRE, an error is
-generated by default. However, if the PCRE_JAVASCRIPT_COMPAT option is set, 
+generated by default. However, if the PCRE_JAVASCRIPT_COMPAT option is set,
 \eU and \eu are interpreted as JavaScript interprets them.
 .P
 6. The Perl escape sequences \ep, \eP, and \eX are supported only if PCRE is

Modified: code/trunk/doc/pcrejit.3
===================================================================
--- code/trunk/doc/pcrejit.3    2011-12-28 16:10:09 UTC (rev 835)
+++ code/trunk/doc/pcrejit.3    2011-12-28 17:16:11 UTC (rev 836)
@@ -34,11 +34,16 @@
 fully tested. If --enable-jit is set on an unsupported platform, compilation
 fails.
 .P
-A program can tell if JIT support is available by calling \fBpcre_config()\fP
-with the PCRE_CONFIG_JIT option. The result is 1 when JIT is available, and 0
-otherwise. However, a simple program does not need to check this in order to
-use JIT. The API is implemented in a way that falls back to the ordinary PCRE
-code if JIT is not available.
+A program that is linked with PCRE 8.20 or later can tell if JIT support is
+available by calling \fBpcre_config()\fP with the PCRE_CONFIG_JIT option. The
+result is 1 when JIT is available, and 0 otherwise. However, a simple program
+does not need to check this in order to use JIT. The API is implemented in a
+way that falls back to the ordinary PCRE code if JIT is not available.
+.P
+If your program may sometimes be linked with versions of PCRE that are older
+than 8.20, but you want to use JIT when it is available, you can test
+the values of PCRE_MAJOR and PCRE_MINOR, or the existence of a JIT macro such
+as PCRE_CONFIG_JIT, for compile-time control of your code.
 .
 .
 .SH "SIMPLE USE OF JIT"
@@ -54,6 +59,21 @@
       no longer needed instead of just freeing it yourself. This
       ensures that any JIT data is also freed.
 .sp
+For a program that may be linked with pre-8.20 versions of PCRE, you can insert
+.sp
+  #ifndef PCRE_STUDY_JIT_COMPILE
+  #define PCRE_STUDY_JIT_COMPILE 0
+  #endif
+.sp
+so that no option is passed to \fBpcre_study()\fP, and then use something like
+this to free the study data:
+.sp
+  #ifdef PCRE_CONFIG_JIT
+      pcre_free_study(study_ptr);
+  #else
+      pcre_free(study_ptr);
+  #endif
+.sp
 In some circumstances you may need to call additional functions. These are
 described in the section entitled
 .\" HTML <a href="#stackcontrol">
@@ -95,7 +115,7 @@
 .P
 The unsupported pattern items are:
 .sp
-  \eC            match a single byte; not supported in UTF-8 mode
+  \eC             match a single byte; not supported in UTF-8 mode
   (?Cn)          callouts
   (*COMMIT)      )
   (*MARK)        )
@@ -153,7 +173,13 @@
 By default, it uses 32K on the machine stack. However, some large or
 complicated patterns need more than this. The error PCRE_ERROR_JIT_STACKLIMIT
 is given when there is not enough stack. Three functions are provided for
-managing blocks of memory for use as JIT stacks.
+managing blocks of memory for use as JIT stacks. There is further discussion
+about the use of JIT stacks in the section entitled
+.\" HTML <a href="#stackfaq">
+.\" </a>
+"JIT stack FAQ"
+.\"
+below.
 .P
 The \fBpcre_jit_stack_alloc()\fP function creates a JIT stack. Its arguments
 are a starting size and a maximum size, and it returns a pointer to an opaque
@@ -217,6 +243,74 @@
 successful study with PCRE_STUDY_JIT_COMPILE.
 .
 .
+.\" HTML <a name="stackfaq"></a>
+.SH "JIT STACK FAQ"
+.rs
+.sp
+(1) Why do we need JIT stacks?
+.sp
+PCRE (and JIT) is a recursive, depth-first engine, so it needs a stack where
+the local data of the current node is pushed before checking its child nodes.
+Allocating real machine stack on some platforms is difficult. For example, on
+PowerPC the stack chain must be updated every time the stack is extended.
+Although this is possible, the cost of the updates reduces performance. So we
+do the recursion in memory.
+.P
+(2) Why don't we simply allocate blocks of memory with \fBmalloc()\fP?
+.sp
+Modern operating systems have a nice feature: they can reserve an address space
+instead of allocating memory. We can safely allocate memory pages inside this
+address space, so the stack could grow without moving memory data (this is
+important because of pointers). Thus we can allocate 1M address space, and use
+only a single memory page (usually 4K) if that is enough. However, we can still
+grow up to 1M anytime if needed.
+.P
+(3) Who "owns" a JIT stack?
+.sp
+The owner of the stack is the user program, not the JIT studied pattern or
+anything else. The user program must ensure that while a stack is in use by
+\fBpcre_exec()\fP (that is, it is assigned to the pattern currently running),
+it is not used by any other thread (to avoid overwriting the same memory
+area). The best practice for multithreaded programs is to allocate a stack
+for each thread, and return this stack through the JIT callback function.
+.P
+(4) When should a JIT stack be freed?
+.sp
+You can free a JIT stack at any time, as long as it will not be used by
+\fBpcre_exec()\fP again. When you assign the stack to a pattern, only a pointer
+is set. There is no reference counting or any other magic. You can free the
+patterns and stacks in any order, anytime. Just \fIdo not\fP call
+\fBpcre_exec()\fP with a pattern pointing to an already freed stack, as that
+will cause a segmentation fault. (Also, do not free a stack currently used by
+\fBpcre_exec()\fP in another thread.) You can also replace the stack for a
+pattern at any time. You can even free the previous stack before assigning a
+replacement.
+.P
+(5) Should I allocate/free a stack every time before/after calling
+\fBpcre_exec()\fP?
+.sp
+No, because this is too costly in terms of resources. However, you could
+implement some clever idea which releases the stack if it has not been used
+for, say, two minutes. The JIT callback can help to achieve this without
+keeping a list of the currently JIT studied patterns.
+.P
+(6) OK, the stack is for long term memory allocation. But what happens if a
+pattern causes stack overflow with a stack of 1M? Is that 1M kept until the
+stack is freed?
+.sp
+Especially on embedded systems, it might be a good idea to release
+memory sometimes without freeing the stack. There is no API for this at the
+moment. Probably a function call which returns the amount of memory currently
+allocated for a stack, and another which allows releasing memory (shrinking
+the stack), would be a good idea if someone needs this.
+.P
+(7) This is too much of a headache. Isn't there any better solution for JIT
+stack handling?
+.sp
+No, thanks to Windows. If POSIX threads were used everywhere, we could throw
+out this complicated API.
+.
+.
 .SH "EXAMPLE CODE"
 .rs
 .sp
@@ -253,7 +347,7 @@
 .rs
 .sp
 .nf
-Philip Hazel
+Philip Hazel (FAQ by Zoltan Herczeg)
 University Computing Service
 Cambridge CB2 3QH, England.
 .fi
@@ -263,6 +357,6 @@
 .rs
 .sp
 .nf
-Last updated: 15 November 2011
+Last updated: 26 November 2011
 Copyright (c) 1997-2011 University of Cambridge.
 .fi
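
Illustration for item (2) of the JIT stack FAQ added to pcrejit.3 above. This
is a minimal sketch of the "reserve address space, commit pages later" idea,
not PCRE's own code. It assumes a POSIX system that provides mmap() and
mprotect(); the demo_stack type and the demo_* function names are hypothetical
and do not exist in PCRE.

```c
#define _DEFAULT_SOURCE          /* exposes MAP_ANONYMOUS on glibc */
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical growable stack. 'base' never changes after reservation. */
typedef struct {
    unsigned char *base;   /* start of the reserved address range */
    size_t reserved;       /* bytes of address space reserved */
    size_t committed;      /* bytes currently readable and writable */
} demo_stack;

/* Round a byte count up to a whole number of memory pages. */
static size_t demo_round_up(size_t n)
{
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    return (n + page - 1) / page * page;
}

/* Reserve 'max' bytes of address space and make the first 'start' usable. */
static int demo_stack_reserve(demo_stack *s, size_t start, size_t max)
{
    void *p;

    if (start == 0 || start > max) return -1;
    start = demo_round_up(start);
    max = demo_round_up(max);

    /* PROT_NONE: the range is reserved but no page can be touched yet. */
    p = mmap(NULL, max, PROT_NONE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED) return -1;

    if (mprotect(p, start, PROT_READ | PROT_WRITE) != 0) {
        munmap(p, max);
        return -1;
    }
    s->base = p;
    s->reserved = max;
    s->committed = start;
    return 0;
}

/* Make at least 'needed' bytes usable. Existing data stays where it is. */
static int demo_stack_grow(demo_stack *s, size_t needed)
{
    assert(s->base != NULL);
    needed = demo_round_up(needed);
    if (needed <= s->committed) return 0;    /* already large enough */
    if (needed > s->reserved) return -1;     /* beyond the reservation */
    if (mprotect(s->base + s->committed, needed - s->committed,
                 PROT_READ | PROT_WRITE) != 0) return -1;
    s->committed = needed;
    return 0;
}

/* Give the whole reservation back to the operating system. */
static void demo_stack_release(demo_stack *s)
{
    munmap(s->base, s->reserved);
    s->base = NULL;
    s->reserved = s->committed = 0;
}
```

Because the base address is fixed at reservation time, pointers into the
stack remain valid when it grows, which is the property the FAQ says a
malloc()-based block that has to be moved cannot offer.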

Modified: code/trunk/doc/pcrelimits.3
===================================================================
--- code/trunk/doc/pcrelimits.3    2011-12-28 16:10:09 UTC (rev 835)
+++ code/trunk/doc/pcrelimits.3    2011-12-28 17:16:11 UTC (rev 836)
@@ -23,6 +23,11 @@
 There is no limit to the number of parenthesized subpatterns, but there can be
 no more than 65535 capturing subpatterns.
 .P
+There is a limit to the number of forward references to subsequent subpatterns
+of around 200,000. Repeated forward references with fixed upper limits, for
+example, (?2){0,100} when subpattern number 2 is to the right, are included in
+the count. There is no limit to the number of backward references.
+.P
 The maximum length of name for a named subpattern is 32 characters, and the
 maximum number of named subpatterns is 10000.
 .P
@@ -52,6 +57,6 @@
 .rs
 .sp
 .nf
-Last updated: 24 August 2011
+Last updated: 30 November 2011
 Copyright (c) 1997-2011 University of Cambridge.
 .fi

Modified: code/trunk/doc/pcrepattern.3
===================================================================
--- code/trunk/doc/pcrepattern.3    2011-12-28 16:10:09 UTC (rev 835)
+++ code/trunk/doc/pcrepattern.3    2011-12-28 17:16:11 UTC (rev 836)
@@ -242,7 +242,7 @@
   \eddd      character with octal code ddd, or back reference
   \exhh      character with hex code hh
   \ex{hhh..} character with hex code hhh.. (non-JavaScript mode)
-  \euhhhh    character with hex code hhhh (JavaScript mode only) 
+  \euhhhh    character with hex code hhhh (JavaScript mode only)
 .sp
 The precise effect of \ecx is as follows: if x is a lower case letter, it
 is converted to upper case. Then bit 6 of the character (hex 40) is inverted.
@@ -265,15 +265,15 @@
 initial \ex will be interpreted as a basic hexadecimal escape, with no
 following digits, giving a character whose value is zero.
 .P
-If the PCRE_JAVASCRIPT_COMPAT option is set, the interpretation of \ex is 
-as just described only when it is followed by two hexadecimal digits. 
+If the PCRE_JAVASCRIPT_COMPAT option is set, the interpretation of \ex is
+as just described only when it is followed by two hexadecimal digits.
 Otherwise, it matches a literal "x" character. In JavaScript mode, support for
-code points greater than 256 is provided by \eu, which must be followed by 
+code points greater than 256 is provided by \eu, which must be followed by
 four hexadecimal digits; otherwise it matches a literal "u" character.
 .P
 Characters whose value is less than 256 can be defined by either of the two
 syntaxes for \ex (or by \eu in JavaScript mode). There is no difference in the
-way they are handled. For example, \exdc is exactly the same as \ex{dc} (or 
+way they are handled. For example, \exdc is exactly the same as \ex{dc} (or
 \eu00dc in JavaScript mode).
 .P
 After \e0 up to two further octal digits are read. If there are fewer than two
@@ -328,12 +328,14 @@
 zero, because no more than three octal digits are ever read.
 .P
 All the sequences that define a single character value can be used both inside
-and outside character classes. In addition, inside a character class, the
-sequence \eb is interpreted as the backspace character (hex 08). The sequences
-\eB, \eN, \eR, and \eX are not special inside a character class. Like any other
-unrecognized escape sequences, they are treated as the literal characters "B",
-"N", "R", and "X" by default, but cause an error if the PCRE_EXTRA option is
-set. Outside a character class, these sequences have different meanings.
+and outside character classes. In addition, inside a character class, \eb is
+interpreted as the backspace character (hex 08).
+.P
+\eN is not allowed in a character class. \eB, \eR, and \eX are not special
+inside a character class. Like other unrecognized escape sequences, they are
+treated as the literal characters "B", "R", and "X" by default, but cause an
+error if the PCRE_EXTRA option is set. Outside a character class, these
+sequences have different meanings.
 .
 .
 .SS "Unsupported escape sequences"
@@ -405,7 +407,7 @@
 .\" </a>
 the "." metacharacter
 .\"
-when PCRE_DOTALL is not set. Perl also uses \eN to match characters by name; 
+when PCRE_DOTALL is not set. Perl also uses \eN to match characters by name;
 PCRE does not support this.
 .P
 Each pair of lower and upper case escape sequences partitions the complete set
@@ -2567,10 +2569,11 @@
 If any of these verbs are used in an assertion or in a subpattern that is
 called as a subroutine (whether or not recursively), their effect is confined
 to that subpattern; it does not extend to the surrounding pattern, with one
-exception: a *MARK that is encountered in a positive assertion \fIis\fP passed
-back (compare capturing parentheses in assertions). Note that such subpatterns
-are processed as anchored at the point where they are tested. Note also that
-Perl's treatment of subroutines is different in some cases.
+exception: the name from a (*MARK), (*PRUNE), or (*THEN) that is encountered in
+a successful positive assertion \fIis\fP passed back when a match succeeds
+(compare capturing parentheses in assertions). Note that such subpatterns are
+processed as anchored at the point where they are tested. Note also that Perl's
+treatment of subroutines is different in some cases.
 .P
 The new verbs make use of what was previously invalid syntax: an opening
 parenthesis followed by an asterisk. They are generally of the form
@@ -2589,6 +2592,9 @@
 the start-of-match optimizations by setting the PCRE_NO_START_OPTIMIZE option
 when calling \fBpcre_compile()\fP or \fBpcre_exec()\fP, or by starting the
 pattern with (*NO_START_OPT).
+.P
+Experiments with Perl suggest that it too has similar optimizations, sometimes
+leading to anomalous results.
 .
 .
 .SS "Verbs that act immediately"
@@ -2636,8 +2642,9 @@
 A name is always required with this verb. There may be as many instances of
 (*MARK) as you like in a pattern, and their names do not have to be unique.
 .P
-When a match succeeds, the name of the last-encountered (*MARK) is passed back
-to the caller via the \fIpcre_extra\fP data structure, as described in the
+When a match succeeds, the name of the last-encountered (*MARK) on the matching
+path is passed back to the caller via the \fIpcre_extra\fP data structure, as
+described in the
 .\" HTML <a href="pcreapi.html#extradata">
 .\" </a>
 section on \fIpcre_extra\fP
@@ -2646,12 +2653,11 @@
 .\" HREF
 \fBpcreapi\fP
 .\"
-documentation. No data is returned for a partial match. Here is an example of
-\fBpcretest\fP output, where the /K modifier requests the retrieval and
-outputting of (*MARK) data:
+documentation. Here is an example of \fBpcretest\fP output, where the /K
+modifier requests the retrieval and outputting of (*MARK) data:
 .sp
-  /X(*MARK:A)Y|X(*MARK:B)Z/K
-  XY
+    re> /X(*MARK:A)Y|X(*MARK:B)Z/K
+  data> XY
    0: XY
   MK: A
   XZ
@@ -2667,31 +2673,17 @@
 passed back if it is the last-encountered. This does not happen for negative
 assertions.
 .P
-A name may also be returned after a failed match if the final path through the
-pattern involves (*MARK). However, unless (*MARK) used in conjunction with
-(*COMMIT), this is unlikely to happen for an unanchored pattern because, as the
-starting point for matching is advanced, the final check is often with an empty
-string, causing a failure before (*MARK) is reached. For example:
+After a partial match or a failed match, the name of the last encountered
+(*MARK) in the entire match process is returned. For example:
 .sp
-  /X(*MARK:A)Y|X(*MARK:B)Z/K
-  XP
-  No match
-.sp
-There are three potential starting points for this match (starting with X,
-starting with P, and with an empty string). If the pattern is anchored, the
-result is different:
-.sp
-  /^X(*MARK:A)Y|^X(*MARK:B)Z/K
-  XP
+    re> /X(*MARK:A)Y|X(*MARK:B)Z/K
+  data> XP
   No match, mark = B
 .sp
-PCRE's start-of-match optimizations can also interfere with this. For example,
-if, as a result of a call to \fBpcre_study()\fP, it knows the minimum
-subject length for a match, a shorter subject will not be scanned at all.
-.P
-Note that similar anomalies (though different in detail) exist in Perl, no
-doubt for the same reasons. The use of (*MARK) data after a failed match of an
-unanchored pattern is not recommended, unless (*COMMIT) is involved.
+Note that in this unanchored example the mark is retained from the match
+attempt that started at the letter "X". Subsequent match attempts starting at
+"P" and then with an empty string do not get as far as the (*MARK) item, but
+nevertheless do not reset it.
 .
 .
 .SS "Verbs that act after backtracking"
@@ -2728,8 +2720,8 @@
 unless PCRE's start-of-match optimizations are turned off, as shown in this
 \fBpcretest\fP example:
 .sp
-  /(*COMMIT)abc/
-  xyzabc
+    re> /(*COMMIT)abc/
+  data> xyzabc
    0: abc
   xyzabc\eY
   No match
@@ -2750,10 +2742,8 @@
 the right, backtracking cannot cross (*PRUNE). In simple cases, the use of
 (*PRUNE) is just an alternative to an atomic group or possessive quantifier,
 but there are some uses of (*PRUNE) that cannot be expressed in any other way.
-The behaviour of (*PRUNE:NAME) is the same as (*MARK:NAME)(*PRUNE) when the
-match fails completely; the name is passed back if this is the final attempt.
-(*PRUNE:NAME) does not pass back a name if the match succeeds. In an anchored
-pattern (*PRUNE) has the same effect as (*COMMIT).
+The behaviour of (*PRUNE:NAME) is the same as (*MARK:NAME)(*PRUNE). In an
+anchored pattern (*PRUNE) has the same effect as (*COMMIT).
 .sp
   (*SKIP)
 .sp
@@ -2779,8 +2769,7 @@
 searched for the most recent (*MARK) that has the same name. If one is found,
 the "bumpalong" advance is to the subject position that corresponds to that
 (*MARK) instead of to where (*SKIP) was encountered. If no (*MARK) with a
-matching name is found, normal "bumpalong" of one character happens (that is,
-the (*SKIP) is ignored).
+matching name is found, the (*SKIP) is ignored.
 .sp
   (*THEN) or (*THEN:NAME)
 .sp
@@ -2794,9 +2783,8 @@
 If the COND1 pattern matches, FOO is tried (and possibly further items after
 the end of the group if FOO succeeds); on failure, the matcher skips to the
 second alternative and tries COND2, without backtracking into COND1. The
-behaviour of (*THEN:NAME) is exactly the same as (*MARK:NAME)(*THEN) if the
-overall match fails. If (*THEN) is not inside an alternation, it acts like
-(*PRUNE).
+behaviour of (*THEN:NAME) is exactly the same as (*MARK:NAME)(*THEN).
+If (*THEN) is not inside an alternation, it acts like (*PRUNE).
 .P
 Note that a subpattern that does not contain a | character is just a part of
 the enclosing alternative; it is not a nested alternation with only one
@@ -2874,6 +2862,6 @@
 .rs
 .sp
 .nf
-Last updated: 19 November 2011
+Last updated: 29 November 2011
 Copyright (c) 1997-2011 University of Cambridge.
 .fi
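
Illustration for the \cx rule that appears as context in the pcrepattern.3
change above (a lower case letter is converted to upper case, then bit 6, hex
40, is inverted). This is a small sketch of that arithmetic only, assuming an
ASCII character set; control_escape_value() is a hypothetical helper name and
is not part of the PCRE API.

```c
/* Value that \cx stands for, following the rule stated in pcrepattern:
   fold a lower case ASCII letter to upper case, then invert bit 6. */
static unsigned char control_escape_value(unsigned char x)
{
    if (x >= 'a' && x <= 'z')
        x = (unsigned char)(x - 'a' + 'A');   /* lower case -> upper case */
    return (unsigned char)(x ^ 0x40);         /* invert bit 6 (hex 40) */
}
```

Under this rule \cz gives hex 1A (z is hex 7A and is first folded to Z, hex
5A), \c{ gives hex 3B ({ is hex 7B), and \c; gives hex 7B (; is hex 3B).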

Modified: code/trunk/doc/pcretest.1
===================================================================
--- code/trunk/doc/pcretest.1    2011-12-28 16:10:09 UTC (rev 835)
+++ code/trunk/doc/pcretest.1    2011-12-28 17:16:11 UTC (rev 836)
@@ -319,7 +319,10 @@
 which it appears.
 .P
 The \fB/M\fP modifier causes the size of memory block used to hold the compiled
-pattern to be output.
+pattern to be output. This does not include the size of the \fBpcre\fP block;
+it is just the actual compiled data. If the pattern is successfully studied
+with the PCRE_STUDY_JIT_COMPILE option, the size of the JIT compiled code is
+also output.
 .P
 If the \fB/S\fP modifier appears once, it causes \fBpcre_study()\fP to be
 called after the expression has been compiled, and the results used when the
@@ -875,6 +878,6 @@
 .rs
 .sp
 .nf
-Last updated: 26 August 2011
+Last updated: 02 December 2011
 Copyright (c) 1997-2011 University of Cambridge.
 .fi

Modified: code/trunk/doc/pcretest.txt
===================================================================
--- code/trunk/doc/pcretest.txt    2011-12-28 16:10:09 UTC (rev 835)
+++ code/trunk/doc/pcretest.txt    2011-12-28 17:16:11 UTC (rev 836)
@@ -305,30 +305,33 @@
        it appears.

        The /M modifier causes the size of memory block used to hold  the  com-
-       piled pattern to be output.
+       piled  pattern to be output. This does not include the size of the pcre
+       block; it is just the actual compiled data. If the pattern is  success-
+       fully  studied  with the PCRE_STUDY_JIT_COMPILE option, the size of the
+       JIT compiled code is also output.

-       If  the  /S  modifier appears once, it causes pcre_study() to be called
-       after the expression has been compiled, and the results used  when  the
-       expression  is  matched.  If  /S appears twice, it suppresses studying,
+       If the /S modifier appears once, it causes pcre_study()  to  be  called
+       after  the  expression has been compiled, and the results used when the
+       expression is matched. If /S appears  twice,  it  suppresses  studying,
        even if it was requested externally by the -s command line option. This
-       makes  it possible to specify that certain patterns are always studied,
+       makes it possible to specify that certain patterns are always  studied,
        and others are never studied, independently of -s. This feature is used
        in the test files in a few cases where the output is different when the
        pattern is studied.

-       If the /S modifier is immediately followed by a + character,  the  call
-       to   pcre_study()  is  made  with  the  PCRE_STUDY_JIT_COMPILE  option,
-       requesting just-in-time optimization support if it is  available.  Note
-       that  there  is  also  a  /+ modifier; it must not be given immediately
-       after /S because this will be misinterpreted. If JIT studying  is  suc-
-       cessful,  it will automatically be used when pcre_exec() is run, except
-       when incompatible run-time options are  specified.  These  include  the
+       If  the  /S modifier is immediately followed by a + character, the call
+       to  pcre_study()  is  made  with  the  PCRE_STUDY_JIT_COMPILE   option,
+       requesting  just-in-time  optimization support if it is available. Note
+       that there is also a /+ modifier; it  must  not  be  given  immediately
+       after  /S  because this will be misinterpreted. If JIT studying is suc-
+       cessful, it will automatically be used when pcre_exec() is run,  except
+       when  incompatible  run-time  options  are specified. These include the
        partial matching options; a complete list is given in the pcrejit docu-
-       mentation. See also the \J escape sequence below for a way  of  setting
+       mentation.  See  also the \J escape sequence below for a way of setting
        the size of the JIT stack.

-       The  /T  modifier  must be followed by a single digit. It causes a spe-
-       cific set of built-in character tables to be passed to  pcre_compile().
+       The /T modifier must be followed by a single digit. It  causes  a  spe-
+       cific  set of built-in character tables to be passed to pcre_compile().
        It is used in the standard PCRE tests to check behaviour with different
        character tables. The digit specifies the tables as follows:

@@ -336,12 +339,12 @@
                pcre_chartables.c.dist
          1   a set of tables defining ISO 8859 characters

-       In table 1, some characters whose codes are greater than 128 are  iden-
+       In  table 1, some characters whose codes are greater than 128 are iden-
        tified as letters, digits, spaces, etc.

    Using the POSIX wrapper API

-       The  /P modifier causes pcretest to call PCRE via the POSIX wrapper API
+       The /P modifier causes pcretest to call PCRE via the POSIX wrapper  API
        rather than its native API. When /P is set, the following modifiers set
        options for the regcomp() function:

@@ -353,17 +356,17 @@
          /W    REG_UCP        )   the POSIX standard
          /8    REG_UTF8       )

-       The  /+  modifier  works  as  described  above. All other modifiers are
+       The /+ modifier works as  described  above.  All  other  modifiers  are
        ignored.

DATA LINES

-       Before each data line is passed to pcre_exec(),  leading  and  trailing
-       white  space  is removed, and it is then scanned for \ escapes. Some of
-       these are pretty esoteric features, intended for checking out  some  of
-       the  more  complicated features of PCRE. If you are just testing "ordi-
-       nary" regular expressions, you probably don't need any  of  these.  The
+       Before  each  data  line is passed to pcre_exec(), leading and trailing
+       white space is removed, and it is then scanned for \ escapes.  Some  of
+       these  are  pretty esoteric features, intended for checking out some of
+       the more complicated features of PCRE. If you are just  testing  "ordi-
+       nary"  regular  expressions,  you probably don't need any of these. The
        following escapes are recognized:

          \a         alarm (BEL, \x07)
@@ -444,95 +447,95 @@
          \<any>     pass the PCRE_NEWLINE_ANY option to pcre_exec()
                       or pcre_dfa_exec()

-       Note  that  \xhh  always  specifies  one byte, even in UTF-8 mode; this
+       Note that \xhh always specifies one byte,  even  in  UTF-8  mode;  this
        makes it possible to construct invalid UTF-8 sequences for testing pur-
        poses. On the other hand, \x{hh} is interpreted as a UTF-8 character in
-       UTF-8 mode, generating more than one byte if the value is greater  than
+       UTF-8  mode, generating more than one byte if the value is greater than
        127. When not in UTF-8 mode, it generates one byte for values less than
        256, and causes an error for greater values.

-       The escapes that specify line ending  sequences  are  literal  strings,
+       The  escapes  that  specify  line ending sequences are literal strings,
        exactly as shown. No more than one newline setting should be present in
        any data line.

-       A backslash followed by anything else just escapes the  anything  else.
-       If  the very last character is a backslash, it is ignored. This gives a
-       way of passing an empty line as data, since a real  empty  line  termi-
+       A  backslash  followed by anything else just escapes the anything else.
+       If the very last character is a backslash, it is ignored. This gives  a
+       way  of  passing  an empty line as data, since a real empty line termi-
        nates the data input.

-       The  \J escape provides a way of setting the maximum stack size that is
-       used by the just-in-time optimization code. It is ignored if JIT  opti-
-       mization  is  not being used. Providing a stack that is larger than the
+       The \J escape provides a way of setting the maximum stack size that  is
+       used  by the just-in-time optimization code. It is ignored if JIT opti-
+       mization is not being used. Providing a stack that is larger  than  the
        default 32K is necessary only for very complicated patterns.

-       If \M is present, pcretest calls pcre_exec() several times,  with  dif-
-       ferent  values  in  the match_limit and match_limit_recursion fields of
-       the pcre_extra data structure, until it finds the minimum  numbers  for
-       each  parameter  that  allow  pcre_exec()  to  complete  without error.
-       Because this is testing a specific feature of the  normal  interpretive
-       pcre_exec()  execution, the use of any JIT optimization that might have
+       If  \M  is present, pcretest calls pcre_exec() several times, with dif-
+       ferent values in the match_limit and  match_limit_recursion  fields  of
+       the  pcre_extra  data structure, until it finds the minimum numbers for
+       each parameter  that  allow  pcre_exec()  to  complete  without  error.
+       Because  this  is testing a specific feature of the normal interpretive
+       pcre_exec() execution, the use of any JIT optimization that might  have
        been set up by the /S+ qualifier of -s+ option is disabled.

-       The match_limit number is a measure of the amount of backtracking  that
-       takes  place,  and  checking it out can be instructive. For most simple
-       matches, the number is quite small, but for patterns  with  very  large
-       numbers  of  matching  possibilities,  it can become large very quickly
-       with increasing length of  subject  string.  The  match_limit_recursion
-       number  is  a  measure  of how much stack (or, if PCRE is compiled with
-       NO_RECURSE, how much heap) memory  is  needed  to  complete  the  match
+       The  match_limit number is a measure of the amount of backtracking that
+       takes place, and checking it out can be instructive.  For  most  simple
+       matches,  the  number  is quite small, but for patterns with very large
+       numbers of matching possibilities, it can  become  large  very  quickly
+       with  increasing  length  of  subject string. The match_limit_recursion
+       number is a measure of how much stack (or, if  PCRE  is  compiled  with
+       NO_RECURSE,  how  much  heap)  memory  is  needed to complete the match
        attempt.

-       When  \O  is  used, the value specified may be higher or lower than the
+       When \O is used, the value specified may be higher or  lower  than  the
        size set by the -O command line option (or defaulted to 45); \O applies
        only to the call of pcre_exec() for the line in which it appears.

-       If  the /P modifier was present on the pattern, causing the POSIX wrap-
-       per API to be used, the only option-setting  sequences  that  have  any
-       effect  are  \B,  \N,  and  \Z,  causing  REG_NOTBOL, REG_NOTEMPTY, and
+       If the /P modifier was present on the pattern, causing the POSIX  wrap-
+       per  API  to  be  used, the only option-setting sequences that have any
+       effect are \B,  \N,  and  \Z,  causing  REG_NOTBOL,  REG_NOTEMPTY,  and
        REG_NOTEOL, respectively, to be passed to regexec().

-       The use of \x{hh...} to represent UTF-8 characters is not dependent  on
-       the  use  of  the  /8 modifier on the pattern. It is recognized always.
-       There may be any number of hexadecimal digits inside  the  braces.  The
-       result  is  from  one  to  six bytes, encoded according to the original
-       UTF-8 rules of RFC 2279. This allows for  values  in  the  range  0  to
-       0x7FFFFFFF.  Note  that not all of those are valid Unicode code points,
-       or indeed valid UTF-8 characters according to the later  rules  in  RFC
+       The  use of \x{hh...} to represent UTF-8 characters is not dependent on
+       the use of the /8 modifier on the pattern.  It  is  recognized  always.
+       There  may  be  any number of hexadecimal digits inside the braces. The
+       result is from one to six bytes,  encoded  according  to  the  original
+       UTF-8  rules  of  RFC  2279.  This  allows for values in the range 0 to
+       0x7FFFFFFF. Note that not all of those are valid Unicode  code  points,
+       or  indeed  valid  UTF-8 characters according to the later rules in RFC
        3629.

THE ALTERNATIVE MATCHING FUNCTION

-       By   default,  pcretest  uses  the  standard  PCRE  matching  function,
+       By  default,  pcretest  uses  the  standard  PCRE  matching   function,
        pcre_exec() to match each data line. From release 6.0, PCRE supports an
-       alternative  matching  function,  pcre_dfa_test(),  which operates in a
-       different way, and has some restrictions. The differences  between  the
+       alternative matching function, pcre_dfa_test(),  which  operates  in  a
+       different  way,  and has some restrictions. The differences between the
        two functions are described in the pcrematching documentation.

-       If  a data line contains the \D escape sequence, or if the command line
-       contains the -dfa option, the alternative matching function is  called.
+       If a data line contains the \D escape sequence, or if the command  line
+       contains  the -dfa option, the alternative matching function is called.
        This function finds all possible matches at a given point. If, however,
-       the \F escape sequence is present in the data line, it stops after  the
+       the  \F escape sequence is present in the data line, it stops after the
        first match is found. This is always the shortest possible match.

DEFAULT OUTPUT FROM PCRETEST

-       This  section  describes  the output when the normal matching function,
+       This section describes the output when the  normal  matching  function,
        pcre_exec(), is being used.

        When a match succeeds, pcretest outputs the list of captured substrings
-       that  pcre_exec()  returns,  starting with number 0 for the string that
-       matched the whole pattern. Otherwise, it outputs "No  match"  when  the
+       that pcre_exec() returns, starting with number 0 for  the  string  that
+       matched  the  whole  pattern. Otherwise, it outputs "No match" when the
        return is PCRE_ERROR_NOMATCH, and "Partial match:" followed by the par-
-       tially matching substring when pcre_exec() returns  PCRE_ERROR_PARTIAL.
-       (Note  that  this is the entire substring that was inspected during the
-       partial match; it may include characters before the actual match  start
-       if  a  lookbehind assertion, \K, \b, or \B was involved.) For any other
-       return, pcretest outputs the PCRE negative error  number  and  a  short
-       descriptive  phrase.  If  the error is a failed UTF-8 string check, the
-       byte offset of the start of the failing character and the  reason  code
-       are  also  output,  provided  that  the size of the output vector is at
+       tially  matching substring when pcre_exec() returns PCRE_ERROR_PARTIAL.
+       (Note that this is the entire substring that was inspected  during  the
+       partial  match; it may include characters before the actual match start
+       if a lookbehind assertion, \K, \b, or \B was involved.) For  any  other
+       return,  pcretest  outputs  the  PCRE negative error number and a short
+       descriptive phrase. If the error is a failed UTF-8  string  check,  the
+       byte  offset  of the start of the failing character and the reason code
+       are also output, provided that the size of  the  output  vector  is  at
        least two. Here is an example of an interactive pcretest run.

          $ pcretest
@@ -547,9 +550,9 @@

        Unset capturing substrings that are not followed by one that is set are
        not returned by pcre_exec(), and are not shown by pcretest. In the fol-
-       lowing example, there are two capturing substrings, but when the  first
-       data  line  is  matched,  the  second, unset substring is not shown. An
-       "internal" unset substring is shown as "<unset>",  as  for  the  second
+       lowing  example, there are two capturing substrings, but when the first
+       data line is matched, the second, unset  substring  is  not  shown.  An
+       "internal"  unset  substring  is  shown as "<unset>", as for the second
        data line.

            re> /(a)|(b)/
@@ -561,11 +564,11 @@
           1: <unset>
           2: b

-       If  the strings contain any non-printing characters, they are output as
-       \0x escapes, or as \x{...} escapes if the /8 modifier  was  present  on
-       the  pattern.  See below for the definition of non-printing characters.
-       If the pattern has the /+ modifier, the output for substring 0 is  fol-
-       lowed  by  the  the rest of the subject string, identified by "0+" like
+       If the strings contain any non-printing characters, they are output  as
+       \0x  escapes,  or  as \x{...} escapes if the /8 modifier was present on
+       the pattern. See below for the definition of  non-printing  characters.
+       If the pattern has the /+ modifier, the output for substring 0 is
+       followed by the rest of the subject string, identified by "0+" like
        this:

            re> /cat/+
@@ -573,7 +576,7 @@
           0: cat
           0+ aract

-       If the pattern has the /g or /G modifier,  the  results  of  successive
+       If  the  pattern  has  the /g or /G modifier, the results of successive
        matching attempts are output in sequence, like this:

            re> /\Bi(\w\w)/g
@@ -585,32 +588,32 @@
           0: ipp
           1: pp

-       "No  match" is output only if the first match attempt fails. Here is an
-       example of a failure message (the offset 4 that is specified by \>4  is
+       "No match" is output only if the first match attempt fails. Here is  an
+       example  of a failure message (the offset 4 that is specified by \>4 is
        past the end of the subject string):

            re> /xyz/
          data> xyz\>4
          Error -24 (bad offset value)

-       If  any  of the sequences \C, \G, or \L are present in a data line that
-       is successfully matched, the substrings extracted  by  the  convenience
+       If any of the sequences \C, \G, or \L are present in a data  line  that
+       is  successfully  matched,  the substrings extracted by the convenience
        functions are output with C, G, or L after the string number instead of
        a colon. This is in addition to the normal full list. The string length
-       (that  is,  the return from the extraction function) is given in paren-
+       (that is, the return from the extraction function) is given  in  paren-
        theses after each string for \C and \G.

        Note that whereas patterns can be continued over several lines (a plain
        ">" prompt is used for continuations), data lines may not. However new-
-       lines can be included in data by means of the \n escape (or  \r,  \r\n,
+       lines  can  be included in data by means of the \n escape (or \r, \r\n,
        etc., depending on the newline sequence setting).

OUTPUT FROM THE ALTERNATIVE MATCHING FUNCTION

-       When  the  alternative  matching function, pcre_dfa_exec(), is used (by
-       means of the \D escape sequence or the -dfa command line  option),  the
-       output  consists  of  a list of all the matches that start at the first
+       When the alternative matching function, pcre_dfa_exec(),  is  used  (by
+       means  of  the \D escape sequence or the -dfa command line option), the
+       output consists of a list of all the matches that start  at  the  first
        point in the subject where there is at least one match. For example:

            re> /(tang|tangerine|tan)/
@@ -619,11 +622,11 @@
           1: tang
           2: tan

-       (Using the normal matching function on this data  finds  only  "tang".)
-       The  longest matching string is always given first (and numbered zero).
+       (Using  the  normal  matching function on this data finds only "tang".)
+       The longest matching string is always given first (and numbered  zero).
        After a PCRE_ERROR_PARTIAL return, the output is "Partial match:", fol-
-       lowed  by  the  partially  matching  substring.  (Note that this is the
-       entire substring that was inspected during the partial  match;  it  may
+       lowed by the partially matching  substring.  (Note  that  this  is  the
+       entire  substring  that  was inspected during the partial match; it may
        include characters before the actual match start if a lookbehind asser-
        tion, \K, \b, or \B was involved.)

@@ -639,16 +642,16 @@
           1: tan
           0: tan

-       Since  the  matching  function  does not support substring capture, the
-       escape sequences that are concerned with captured  substrings  are  not
+       Since the matching function does not  support  substring  capture,  the
+       escape  sequences  that  are concerned with captured substrings are not
        relevant.

RESTARTING AFTER A PARTIAL MATCH

        When the alternative matching function has given the PCRE_ERROR_PARTIAL
-       return, indicating that the subject partially matched the pattern,  you
-       can  restart  the match with additional subject data by means of the \R
+       return,  indicating that the subject partially matched the pattern, you
+       can restart the match with additional subject data by means of  the  \R
        escape sequence. For example:

            re> /^\d?\d(jan|feb|mar|apr|may|jun|jul|aug|sep|oct|nov|dec)\d\d$/
@@ -657,30 +660,30 @@
          data> n05\R\D
           0: n05

-       For further information about partial  matching,  see  the  pcrepartial
+       For  further  information  about  partial matching, see the pcrepartial
        documentation.

CALLOUTS

-       If  the pattern contains any callout requests, pcretest's callout func-
-       tion is called during matching. This works  with  both  matching  func-
+       If the pattern contains any callout requests, pcretest's callout  func-
+       tion  is  called  during  matching. This works with both matching func-
        tions. By default, the called function displays the callout number, the
-       start and current positions in the text at the callout  time,  and  the
+       start  and  current  positions in the text at the callout time, and the
        next pattern item to be tested. For example, the output

          --->pqrabcdef
            0    ^  ^     \d

-       indicates  that  callout number 0 occurred for a match attempt starting
-       at the fourth character of the subject string, when the pointer was  at
-       the  seventh  character of the data, and when the next pattern item was
-       \d. Just one circumflex is output if the start  and  current  positions
+       indicates that callout number 0 occurred for a match  attempt  starting
+       at  the fourth character of the subject string, when the pointer was at
+       the seventh character of the data, and when the next pattern  item  was
+       \d.  Just  one  circumflex is output if the start and current positions
        are the same.

        Callouts numbered 255 are assumed to be automatic callouts, inserted as
-       a result of the /C pattern modifier. In this case, instead  of  showing
-       the  callout  number, the offset in the pattern, preceded by a plus, is
+       a  result  of the /C pattern modifier. In this case, instead of showing
+       the callout number, the offset in the pattern, preceded by a  plus,  is
        output. For example:

            re> /\d?[A-E]\*/C
@@ -693,7 +696,7 @@
           0: E*

        If a pattern contains (*MARK) items, an additional line is output when-
-       ever  a  change  of  latest mark is passed to the callout function. For
+       ever a change of latest mark is passed to  the  callout  function.  For
        example:

            re> /a(*MARK:X)bc/C
@@ -707,59 +710,59 @@
          +12 ^  ^
           0: abc

-       The mark changes between matching "a" and "b", but stays the  same  for
-       the  rest  of  the match, so nothing more is output. If, as a result of
-       backtracking, the mark reverts to being unset, the  text  "<unset>"  is
+       The  mark  changes between matching "a" and "b", but stays the same for
+       the rest of the match, so nothing more is output. If, as  a  result  of
+       backtracking,  the  mark  reverts to being unset, the text "<unset>" is
        output.

-       The  callout  function  in pcretest returns zero (carry on matching) by
-       default, but you can use a \C item in a data line (as described  above)
+       The callout function in pcretest returns zero (carry  on  matching)  by
+       default,  but you can use a \C item in a data line (as described above)
        to change this and other parameters of the callout.

-       Inserting  callouts can be helpful when using pcretest to check compli-
-       cated regular expressions. For further information about callouts,  see
+       Inserting callouts can be helpful when using pcretest to check  compli-
+       cated  regular expressions. For further information about callouts, see
        the pcrecallout documentation.

NON-PRINTING CHARACTERS

-       When  pcretest is outputting text in the compiled version of a pattern,
-       bytes other than 32-126 are always treated as  non-printing  characters
+       When pcretest is outputting text in the compiled version of a  pattern,
+       bytes  other  than 32-126 are always treated as non-printing characters
        and are therefore shown as hex escapes.

-       When  pcretest  is  outputting text that is a matched part of a subject
-       string, it behaves in the same way, unless a different locale has  been
-       set  for  the  pattern  (using  the  /L  modifier).  In  this case, the
+       When pcretest is outputting text that is a matched part  of  a  subject
+       string,  it behaves in the same way, unless a different locale has been
+       set for the  pattern  (using  the  /L  modifier).  In  this  case,  the
        isprint() function is used to distinguish printing and non-printing characters.

SAVING AND RELOADING COMPILED PATTERNS

-       The facilities described in this section are  not  available  when  the
-       POSIX  interface  to  PCRE  is being used, that is, when the /P pattern
+       The  facilities  described  in  this section are not available when the
+       POSIX interface to PCRE is being used, that is,  when  the  /P  pattern
        modifier is specified.

        When the POSIX interface is not in use, you can cause pcretest to write
-       a  compiled  pattern to a file, by following the modifiers with > and a
+       a compiled pattern to a file, by following the modifiers with >  and  a
        file name.  For example:

          /pattern/im >/some/file

-       See the pcreprecompile documentation for a discussion about saving  and
-       re-using  compiled patterns.  Note that if the pattern was successfully
+       See  the pcreprecompile documentation for a discussion about saving and
+       re-using compiled patterns.  Note that if the pattern was  successfully
        studied with JIT optimization, the JIT data cannot be saved.

-       The data that is written is binary.  The  first  eight  bytes  are  the
-       length  of  the  compiled  pattern  data  followed by the length of the
-       optional study data, each written as four  bytes  in  big-endian  order
-       (most  significant  byte  first). If there is no study data (either the
+       The  data  that  is  written  is  binary. The first eight bytes are the
+       length of the compiled pattern data  followed  by  the  length  of  the
+       optional  study  data,  each  written as four bytes in big-endian order
+       (most significant byte first). If there is no study  data  (either  the
        pattern was not studied, or studying did not return any data), the sec-
-       ond  length  is  zero. The lengths are followed by an exact copy of the
-       compiled pattern. If there is additional study  data,  this  (excluding
-       any  JIT  data)  follows  immediately after the compiled pattern. After
+       ond length is zero. The lengths are followed by an exact  copy  of  the
+       compiled  pattern.  If  there is additional study data, this (excluding
+       any JIT data) follows immediately after  the  compiled  pattern.  After
        writing the file, pcretest expects to read a new pattern.

-       A saved pattern can be reloaded into pcretest by  specifying  <  and  a
+       A  saved  pattern  can  be reloaded into pcretest by specifying < and a
        file name instead of a pattern. The name of the file must not contain a
        < character, as otherwise pcretest will interpret the line as a pattern
        delimited by < characters.  For example:
@@ -768,27 +771,27 @@
          Compiled pattern loaded from /some/file
          No study data

-       If  the  pattern  was previously studied with the JIT optimization, the
-       JIT information cannot be saved and restored, and so is lost. When  the
-       pattern  has  been  loaded, pcretest proceeds to read data lines in the
+       If the pattern was previously studied with the  JIT  optimization,  the
+       JIT  information cannot be saved and restored, and so is lost. When the
+       pattern has been loaded, pcretest proceeds to read data  lines  in  the
        usual way.

-       You can copy a file written by pcretest to a different host and  reload
-       it  there,  even  if the new host has opposite endianness to the one on
-       which the pattern was compiled. For example, you can compile on an  i86
+       You  can copy a file written by pcretest to a different host and reload
+       it there, even if the new host has opposite endianness to  the  one  on
+       which  the pattern was compiled. For example, you can compile on an i86
        machine and run on a SPARC machine.

-       File  names  for  saving and reloading can be absolute or relative, but
-       note that the shell facility of expanding a file name that starts  with
+       File names for saving and reloading can be absolute  or  relative,  but
+       note  that the shell facility of expanding a file name that starts with
        a tilde (~) is not available.

-       The  ability to save and reload files in pcretest is intended for test-
-       ing and experimentation. It is not intended for production use  because
-       only  a  single pattern can be written to a file. Furthermore, there is
-       no facility for supplying  custom  character  tables  for  use  with  a
-       reloaded  pattern.  If  the  original  pattern was compiled with custom
-       tables, an attempt to match a subject string using a  reloaded  pattern
-       is  likely to cause pcretest to crash.  Finally, if you attempt to load
+       The ability to save and reload files in pcretest is intended for  test-
+       ing  and experimentation. It is not intended for production use because
+       only a single pattern can be written to a file. Furthermore,  there  is
+       no  facility  for  supplying  custom  character  tables  for use with a
+       reloaded pattern. If the original  pattern  was  compiled  with  custom
+       tables,  an  attempt to match a subject string using a reloaded pattern
+       is likely to cause pcretest to crash.  Finally, if you attempt to  load
        a file that is not in the correct format, the result is undefined.

@@ -807,5 +810,5 @@

REVISION

-       Last updated: 26 August 2011
+       Last updated: 02 December 2011
        Copyright (c) 1997-2011 University of Cambridge.
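
Editorial note on the saved-file layout described under SAVING AND
RELOADING COMPILED PATTERNS above: the text says the first eight bytes
hold two lengths (compiled pattern, then optional study data), each
written as four bytes in big-endian order. A minimal sketch of decoding
that header in C might look like the following. This is not code from
PCRE or pcretest; the names read_be32, saved_header and
parse_saved_header are hypothetical and exist only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: decode one four-byte big-endian length field
   (most significant byte first), independent of host endianness. */
uint32_t read_be32(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Hypothetical container for the two lengths at the start of the file. */
typedef struct {
    uint32_t pattern_len;  /* length of the compiled pattern data */
    uint32_t study_len;    /* length of the study data; 0 if there is none */
} saved_header;

/* Decode the eight-byte header from buf.
   Returns 0 on success, or -1 if fewer than eight bytes are available. */
int parse_saved_header(const unsigned char *buf, size_t len,
                       saved_header *out)
{
    assert(buf != NULL && out != NULL);
    if (len < 8)
        return -1;
    out->pattern_len = read_be32(buf);      /* bytes 0..3 */
    out->study_len   = read_be32(buf + 4);  /* bytes 4..7 */
    return 0;
}
```

Decoding byte by byte, rather than casting the buffer to an integer
pointer, gives the same result on little-endian and big-endian hosts,
which matches the statement above that a saved file can be moved
between hosts of opposite endianness. Under the layout described, the
compiled pattern would start at offset 8 and any study data at offset
8 + pattern_len.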

Modified: code/trunk/libpcre.a.dev
===================================================================
--- code/trunk/libpcre.a.dev    2011-12-28 16:10:09 UTC (rev 835)
+++ code/trunk/libpcre.a.dev    2011-12-28 17:16:11 UTC (rev 836)
@@ -190,7 +190,7 @@
 BuildCmd=

 [Unit17]
-FileName=pcre_try_flipped.c
+FileName=pcre_byte_order.c
 CompileCpp=0
 Folder=libpcre.a
 Compile=1

Modified: code/trunk/libpcre.pc.in
===================================================================
--- code/trunk/libpcre.pc.in    2011-12-28 16:10:09 UTC (rev 835)
+++ code/trunk/libpcre.pc.in    2011-12-28 17:16:11 UTC (rev 836)
@@ -6,7 +6,7 @@
 includedir=@includedir@

 Name: libpcre
-Description: PCRE - Perl compatible regular expressions C library
+Description: PCRE - Perl compatible regular expressions C library with 8 bit character support
 Version: @PACKAGE_VERSION@
 Libs: -L${libdir} -lpcre
 Cflags: -I${includedir} @PCRE_STATIC_CFLAG@

Copied: code/trunk/libpcre16.pc.in (from rev 835, code/branches/pcre16/libpcre16.pc.in)
===================================================================
--- code/trunk/libpcre16.pc.in                            (rev 0)
+++ code/trunk/libpcre16.pc.in    2011-12-28 17:16:11 UTC (rev 836)
@@ -0,0 +1,12 @@
+# Package Information for pkg-config
+
+prefix=@prefix@
+exec_prefix=@exec_prefix@
+libdir=@libdir@
+includedir=@includedir@
+
+Name: libpcre16
+Description: PCRE - Perl compatible regular expressions C library with 16 bit character support
+Version: @PACKAGE_VERSION@
+Libs: -L${libdir} -lpcre
+Cflags: -I${includedir} @PCRE_STATIC_CFLAG@

Modified: code/trunk/maint/ManyConfigTests
===================================================================
--- code/trunk/maint/ManyConfigTests    2011-12-28 16:10:09 UTC (rev 835)
+++ code/trunk/maint/ManyConfigTests    2011-12-28 17:16:11 UTC (rev 836)
@@ -141,6 +141,10 @@
 # builds both) to save a bit of time by building only one version of the
 # library for the subsequent tests.

+valgrind=
+cvalgrind=
+withvalgrind=
+
echo "Tests in the current directory"
srcdir=.
for opts in \
@@ -159,7 +163,14 @@
"--enable-utf8 --enable-newline-is-any --enable-unicode-properties --disable-stack-for-recursion --disable-static --disable-cpp" \
"--enable-jit --disable-shared" \
"--enable-jit --enable-unicode-properties --disable-shared" \
- "--enable-jit --enable-unicode-properties --with-link-size=3 --disable-shared"
+ "--enable-jit --enable-unicode-properties --with-link-size=3 --disable-shared" \
+ "--enable-pcre16" \
+ "--enable-pcre16 --enable-jit --enable-utf --disable-shared" \
+ "--enable-pcre16 --enable-jit --enable-unicode-properties --disable-shared" \
+ "--enable-pcre16 --enable-jit --disable-pcre8 --disable-shared" \
+ "--enable-pcre16 --enable-jit --disable-pcre8 --enable-utf --disable-shared" \
+ "--enable-pcre16 --disable-stack-for-recursion --disable-shared" \
+ "--enable-pcre16 --enable-unicode-properties --disable-stack-for-recursion --disable-shared"
do
runtest
done
@@ -174,15 +185,19 @@
for opts in \
"--enable-unicode-properties --disable-stack-for-recursion --disable-shared" \
"--enable-unicode-properties --with-link-size=3 --disable-shared" \
- "--enable-jit --enable-unicode-properties --disable-shared"
+ "--enable-jit --enable-unicode-properties --disable-shared" \
+ "--enable-pcre16 --enable-jit --enable-unicode-properties --disable-shared"
do
runtest
done

+valgrind=
+cvalgrind=
+withvalgrind=
+
# Clean up the distribution and then do at least one build and test in a
# directory other than the source directory. It doesn't work unless the
-# source directory is cleaned up first - and anyway, it's best to leave it
-# in a clean state after all this reconfiguring.
+# source directory is cleaned up first.

if [ -f Makefile ]; then
echo "Running 'make distclean'"

Modified: code/trunk/makevp_c.txt
===================================================================
--- code/trunk/makevp_c.txt    2011-12-28 16:10:09 UTC (rev 835)
+++ code/trunk/makevp_c.txt    2011-12-28 17:16:11 UTC (rev 836)
@@ -1,3 +1,4 @@
+pcre_byte_order.c
 pcre_chartables.c
 pcre_compile.c
 pcre_config.c
@@ -13,7 +14,6 @@
 pcre_refcount.c
 pcre_study.c
 pcre_tables.c
-pcre_try_flipped.c
 pcre_ucd.c
 pcre_valid_utf8.c
 pcre_version.c

Modified: code/trunk/makevp_l.txt
===================================================================
--- code/trunk/makevp_l.txt    2011-12-28 16:10:09 UTC (rev 835)
+++ code/trunk/makevp_l.txt    2011-12-28 17:16:11 UTC (rev 836)
@@ -1,3 +1,4 @@
++pcre_byte_order.obj        &
 +pcre_chartables.obj        &
 +pcre_compile.obj           &
 +pcre_config.obj        &
@@ -13,7 +14,6 @@
 +pcre_refcount.obj          &
 +pcre_study.obj            &
 +pcre_tables.obj        &
-+pcre_try_flipped.obj       &
 +pcre_ucd.obj               &
 +pcre_valid_utf8.obj        &
 +pcre_version.obj           &

Modified: code/trunk/pcre-config.in
===================================================================
--- code/trunk/pcre-config.in    2011-12-28 16:10:09 UTC (rev 835)
+++ code/trunk/pcre-config.in    2011-12-28 17:16:11 UTC (rev 836)
@@ -70,7 +70,7 @@
       ;;
     --libs-cpp)
       if test @enable_cpp@ = yes ; then
-        echo -L@libdir@$libR -lpcrecpp -lpcre
+        echo $libS$libR -lpcrecpp -lpcre
       else
         echo "${usage}" 1>&2
       fi

Modified: code/trunk/pcre.h.in
===================================================================
--- code/trunk/pcre.h.in    2011-12-28 16:10:09 UTC (rev 835)
+++ code/trunk/pcre.h.in    2011-12-28 17:16:11 UTC (rev 836)
@@ -5,7 +5,7 @@
 /* This is the public header file for the PCRE library, to be #included by
 applications that call the PCRE functions.

-           Copyright (c) 1997-2011 University of Cambridge
+           Copyright (c) 1997-2012 University of Cambridge

-----------------------------------------------------------------------------
Redistribution and use in source and binary forms, with or without
@@ -98,28 +98,37 @@
/* Options. Some are compile-time only, some are run-time only, and some are
both, so we keep them all distinct. However, almost all the bits in the options
word are now used. In the long run, we may have to re-use some of the
-compile-time only bits for runtime options, or vice versa. */
+compile-time only bits for runtime options, or vice versa. In the comments
+below, "compile", "exec", and "DFA exec" mean that the option is permitted to
+be set for those functions; "used in" means that an option may be set only for
+compile, but is subsequently referenced in exec and/or DFA exec. Any of the
+compile-time options may be inspected during studying (and therefore JIT
+compiling). */

 #define PCRE_CASELESS           0x00000001  /* Compile */
 #define PCRE_MULTILINE          0x00000002  /* Compile */
 #define PCRE_DOTALL             0x00000004  /* Compile */
 #define PCRE_EXTENDED           0x00000008  /* Compile */
 #define PCRE_ANCHORED           0x00000010  /* Compile, exec, DFA exec */
-#define PCRE_DOLLAR_ENDONLY     0x00000020  /* Compile */
+#define PCRE_DOLLAR_ENDONLY     0x00000020  /* Compile, used in exec, DFA exec */
 #define PCRE_EXTRA              0x00000040  /* Compile */
 #define PCRE_NOTBOL             0x00000080  /* Exec, DFA exec */
 #define PCRE_NOTEOL             0x00000100  /* Exec, DFA exec */
 #define PCRE_UNGREEDY           0x00000200  /* Compile */
 #define PCRE_NOTEMPTY           0x00000400  /* Exec, DFA exec */
-#define PCRE_UTF8               0x00000800  /* Compile */
+/* The next two are also used in exec and DFA exec */
+#define PCRE_UTF8               0x00000800  /* Compile (same as PCRE_UTF16) */
+#define PCRE_UTF16              0x00000800  /* Compile (same as PCRE_UTF8) */
 #define PCRE_NO_AUTO_CAPTURE    0x00001000  /* Compile */
-#define PCRE_NO_UTF8_CHECK      0x00002000  /* Compile, exec, DFA exec */
+/* The next two are also used in exec and DFA exec */
+#define PCRE_NO_UTF8_CHECK      0x00002000  /* Compile (same as PCRE_NO_UTF16_CHECK) */
+#define PCRE_NO_UTF16_CHECK     0x00002000  /* Compile (same as PCRE_NO_UTF8_CHECK) */
 #define PCRE_AUTO_CALLOUT       0x00004000  /* Compile */
 #define PCRE_PARTIAL_SOFT       0x00008000  /* Exec, DFA exec */
 #define PCRE_PARTIAL            0x00008000  /* Backwards compatible synonym */
 #define PCRE_DFA_SHORTEST       0x00010000  /* DFA exec */
 #define PCRE_DFA_RESTART        0x00020000  /* DFA exec */
-#define PCRE_FIRSTLINE          0x00040000  /* Compile */
+#define PCRE_FIRSTLINE          0x00040000  /* Compile, used in exec, DFA exec */
 #define PCRE_DUPNAMES           0x00080000  /* Compile */
 #define PCRE_NEWLINE_CR         0x00100000  /* Compile, exec, DFA exec */
 #define PCRE_NEWLINE_LF         0x00200000  /* Compile, exec, DFA exec */
@@ -128,43 +137,48 @@
 #define PCRE_NEWLINE_ANYCRLF    0x00500000  /* Compile, exec, DFA exec */
 #define PCRE_BSR_ANYCRLF        0x00800000  /* Compile, exec, DFA exec */
 #define PCRE_BSR_UNICODE        0x01000000  /* Compile, exec, DFA exec */
-#define PCRE_JAVASCRIPT_COMPAT  0x02000000  /* Compile */
+#define PCRE_JAVASCRIPT_COMPAT  0x02000000  /* Compile, used in exec */
 #define PCRE_NO_START_OPTIMIZE  0x04000000  /* Compile, exec, DFA exec */
 #define PCRE_NO_START_OPTIMISE  0x04000000  /* Synonym */
 #define PCRE_PARTIAL_HARD       0x08000000  /* Exec, DFA exec */
 #define PCRE_NOTEMPTY_ATSTART   0x10000000  /* Exec, DFA exec */
-#define PCRE_UCP                0x20000000  /* Compile */
+#define PCRE_UCP                0x20000000  /* Compile, used in exec, DFA exec */

/* Exec-time and get/set-time error codes */

-#define PCRE_ERROR_NOMATCH         (-1)
-#define PCRE_ERROR_NULL            (-2)
-#define PCRE_ERROR_BADOPTION       (-3)
-#define PCRE_ERROR_BADMAGIC        (-4)
-#define PCRE_ERROR_UNKNOWN_OPCODE  (-5)
-#define PCRE_ERROR_UNKNOWN_NODE    (-5)  /* For backward compatibility */
-#define PCRE_ERROR_NOMEMORY        (-6)
-#define PCRE_ERROR_NOSUBSTRING     (-7)
-#define PCRE_ERROR_MATCHLIMIT      (-8)
-#define PCRE_ERROR_CALLOUT         (-9)  /* Never used by PCRE itself */
-#define PCRE_ERROR_BADUTF8        (-10)
-#define PCRE_ERROR_BADUTF8_OFFSET (-11)
-#define PCRE_ERROR_PARTIAL        (-12)
-#define PCRE_ERROR_BADPARTIAL     (-13)
-#define PCRE_ERROR_INTERNAL       (-14)
-#define PCRE_ERROR_BADCOUNT       (-15)
-#define PCRE_ERROR_DFA_UITEM      (-16)
-#define PCRE_ERROR_DFA_UCOND      (-17)
-#define PCRE_ERROR_DFA_UMLIMIT    (-18)
-#define PCRE_ERROR_DFA_WSSIZE     (-19)
-#define PCRE_ERROR_DFA_RECURSE    (-20)
-#define PCRE_ERROR_RECURSIONLIMIT (-21)
-#define PCRE_ERROR_NULLWSLIMIT    (-22)  /* No longer actually used */
-#define PCRE_ERROR_BADNEWLINE     (-23)
-#define PCRE_ERROR_BADOFFSET      (-24)
-#define PCRE_ERROR_SHORTUTF8      (-25)
-#define PCRE_ERROR_RECURSELOOP    (-26)
-#define PCRE_ERROR_JIT_STACKLIMIT (-27)
+#define PCRE_ERROR_NOMATCH          (-1)
+#define PCRE_ERROR_NULL             (-2)
+#define PCRE_ERROR_BADOPTION        (-3)
+#define PCRE_ERROR_BADMAGIC         (-4)
+#define PCRE_ERROR_UNKNOWN_OPCODE   (-5)
+#define PCRE_ERROR_UNKNOWN_NODE     (-5)  /* For backward compatibility */
+#define PCRE_ERROR_NOMEMORY         (-6)
+#define PCRE_ERROR_NOSUBSTRING      (-7)
+#define PCRE_ERROR_MATCHLIMIT       (-8)
+#define PCRE_ERROR_CALLOUT          (-9)  /* Never used by PCRE itself */
+#define PCRE_ERROR_BADUTF8         (-10)  /* Same for 8/16 */
+#define PCRE_ERROR_BADUTF16        (-10)  /* Same for 8/16 */
+#define PCRE_ERROR_BADUTF8_OFFSET  (-11)  /* Same for 8/16 */
+#define PCRE_ERROR_BADUTF16_OFFSET (-11)  /* Same for 8/16 */
+#define PCRE_ERROR_PARTIAL         (-12)
+#define PCRE_ERROR_BADPARTIAL      (-13)
+#define PCRE_ERROR_INTERNAL        (-14)
+#define PCRE_ERROR_BADCOUNT        (-15)
+#define PCRE_ERROR_DFA_UITEM       (-16)
+#define PCRE_ERROR_DFA_UCOND       (-17)
+#define PCRE_ERROR_DFA_UMLIMIT     (-18)
+#define PCRE_ERROR_DFA_WSSIZE      (-19)
+#define PCRE_ERROR_DFA_RECURSE     (-20)
+#define PCRE_ERROR_RECURSIONLIMIT  (-21)
+#define PCRE_ERROR_NULLWSLIMIT     (-22)  /* No longer actually used */
+#define PCRE_ERROR_BADNEWLINE      (-23)
+#define PCRE_ERROR_BADOFFSET       (-24)
+#define PCRE_ERROR_SHORTUTF8       (-25)
+#define PCRE_ERROR_SHORTUTF16      (-25)  /* Same for 8/16 */
+#define PCRE_ERROR_RECURSELOOP     (-26)
+#define PCRE_ERROR_JIT_STACKLIMIT  (-27)
+#define PCRE_ERROR_BADMODE         (-28)
+#define PCRE_ERROR_BADENDIANNESS   (-29)

/* Specific error codes for UTF-8 validity checks */

@@ -191,6 +205,14 @@
 #define PCRE_UTF8_ERR20             20
 #define PCRE_UTF8_ERR21             21

+/* Specific error codes for UTF-16 validity checks */
+
+#define PCRE_UTF16_ERR0              0
+#define PCRE_UTF16_ERR1              1
+#define PCRE_UTF16_ERR2              2
+#define PCRE_UTF16_ERR3              3
+#define PCRE_UTF16_ERR4              4
+
 /* Request types for pcre_fullinfo() */

 #define PCRE_INFO_OPTIONS            0
@@ -211,6 +233,7 @@
 #define PCRE_INFO_HASCRORLF         14
 #define PCRE_INFO_MINLENGTH         15
 #define PCRE_INFO_JIT               16
+#define PCRE_INFO_JITSIZE           17

 /* Request types for pcre_config(). Do not re-arrange, in order to remain
 compatible. */
@@ -225,6 +248,7 @@
 #define PCRE_CONFIG_MATCH_LIMIT_RECURSION   7
 #define PCRE_CONFIG_BSR                     8
 #define PCRE_CONFIG_JIT                     9
+#define PCRE_CONFIG_UTF16                  10

 /* Request types for pcre_study(). Do not re-arrange, in order to remain
 compatible. */
@@ -250,6 +274,17 @@
 struct real_pcre_jit_stack;       /* declaration; the definition is private  */
 typedef struct real_pcre_jit_stack pcre_jit_stack;

+/* If PCRE is compiled with 16 bit character support, PCRE_SCHAR16 must contain
+a 16 bit wide signed data type. Otherwise it can be a dummy data type since
+pcre16 functions are not implemented. There is a check for this in pcre_internal.h. */
+#ifndef PCRE_SCHAR16
+#define PCRE_SCHAR16 short
+#endif
+
+#ifndef PCRE_SPTR16
+#define PCRE_SPTR16 const PCRE_SCHAR16 *
+#endif
+
 /* When PCRE is compiled as a C++ library, the subject pointer type can be
 replaced with a custom type. For conventional use, the public interface is a
 const char *. */
@@ -294,7 +329,7 @@
   int          pattern_position;  /* Offset to next item in the pattern */
   int          next_item_length;  /* Length of next item in the pattern */
   /* ------------------- Added for Version 2 -------------------------- */
-  const unsigned char *mark;      /* Pointer to current mark or NULL    */
+  const void  *mark;              /* Pointer to current mark or NULL    */
   /* ------------------------------------------------------------------ */
 } pcre_callout_block;

@@ -310,12 +345,24 @@
PCRE_EXP_DECL void *(*pcre_stack_malloc)(size_t);
PCRE_EXP_DECL void (*pcre_stack_free)(void *);
PCRE_EXP_DECL int (*pcre_callout)(pcre_callout_block *);
+
+PCRE_EXP_DECL void *(*pcre16_malloc)(size_t);
+PCRE_EXP_DECL void (*pcre16_free)(void *);
+PCRE_EXP_DECL void *(*pcre16_stack_malloc)(size_t);
+PCRE_EXP_DECL void (*pcre16_stack_free)(void *);
+PCRE_EXP_DECL int (*pcre16_callout)(pcre_callout_block *);
 #else /* VPCOMPAT */
 PCRE_EXP_DECL void *pcre_malloc(size_t);
 PCRE_EXP_DECL void pcre_free(void *);
 PCRE_EXP_DECL void *pcre_stack_malloc(size_t);
 PCRE_EXP_DECL void pcre_stack_free(void *);
 PCRE_EXP_DECL int pcre_callout(pcre_callout_block *);
+
+PCRE_EXP_DECL void *pcre16_malloc(size_t);
+PCRE_EXP_DECL void pcre16_free(void *);
+PCRE_EXP_DECL void *pcre16_stack_malloc(size_t);
+PCRE_EXP_DECL void pcre16_stack_free(void *);
+PCRE_EXP_DECL int pcre16_callout(pcre_callout_block *);
 #endif /* VPCOMPAT */

 /* User defined callback which provides a stack just before the match starts. */
@@ -326,42 +373,87 @@

 PCRE_EXP_DECL pcre *pcre_compile(const char *, int, const char **, int *,
                   const unsigned char *);
+PCRE_EXP_DECL pcre *pcre16_compile(PCRE_SPTR16, int, const char **, int *,
+                  const unsigned char *);
 PCRE_EXP_DECL pcre *pcre_compile2(const char *, int, int *, const char **,
                   int *, const unsigned char *);
+PCRE_EXP_DECL pcre *pcre16_compile2(PCRE_SPTR16, int, int *, const char **,
+                  int *, const unsigned char *);
 PCRE_EXP_DECL int  pcre_config(int, void *);
+PCRE_EXP_DECL int  pcre16_config(int, void *);
 PCRE_EXP_DECL int  pcre_copy_named_substring(const pcre *, const char *,
                   int *, int, const char *, char *, int);
-PCRE_EXP_DECL int  pcre_copy_substring(const char *, int *, int, int, char *,
-                  int);
+PCRE_EXP_DECL int  pcre16_copy_named_substring(const pcre *, PCRE_SPTR16,
+                  int *, int, PCRE_SPTR16, PCRE_SCHAR16 *, int);
+PCRE_EXP_DECL int  pcre_copy_substring(const char *, int *, int, int,
+                  char *, int);
+PCRE_EXP_DECL int  pcre16_copy_substring(PCRE_SPTR16, int *, int, int,
+                  PCRE_SCHAR16 *, int);
 PCRE_EXP_DECL int  pcre_dfa_exec(const pcre *, const pcre_extra *,
                   const char *, int, int, int, int *, int , int *, int);
+PCRE_EXP_DECL int  pcre16_dfa_exec(const pcre *, const pcre_extra *,
+                  PCRE_SPTR16, int, int, int, int *, int , int *, int);
 PCRE_EXP_DECL int  pcre_exec(const pcre *, const pcre_extra *, PCRE_SPTR,
                    int, int, int, int *, int);
+PCRE_EXP_DECL int  pcre16_exec(const pcre *, const pcre_extra *, PCRE_SPTR16,
+                   int, int, int, int *, int);
 PCRE_EXP_DECL void pcre_free_substring(const char *);
+PCRE_EXP_DECL void pcre16_free_substring(PCRE_SPTR16);
 PCRE_EXP_DECL void pcre_free_substring_list(const char **);
+PCRE_EXP_DECL void pcre16_free_substring_list(PCRE_SPTR16 *);
 PCRE_EXP_DECL int  pcre_fullinfo(const pcre *, const pcre_extra *, int,
                   void *);
+PCRE_EXP_DECL int  pcre16_fullinfo(const pcre *, const pcre_extra *, int,
+                  void *);
 PCRE_EXP_DECL int  pcre_get_named_substring(const pcre *, const char *,
                   int *, int, const char *, const char **);
+PCRE_EXP_DECL int  pcre16_get_named_substring(const pcre *, PCRE_SPTR16,
+                  int *, int, PCRE_SPTR16, PCRE_SPTR16 *);
 PCRE_EXP_DECL int  pcre_get_stringnumber(const pcre *, const char *);
+PCRE_EXP_DECL int  pcre16_get_stringnumber(const pcre *, PCRE_SPTR16);
 PCRE_EXP_DECL int  pcre_get_stringtable_entries(const pcre *, const char *,
                   char **, char **);
+PCRE_EXP_DECL int  pcre16_get_stringtable_entries(const pcre *, PCRE_SPTR16,
+                  PCRE_SCHAR16 **, PCRE_SCHAR16 **);
 PCRE_EXP_DECL int  pcre_get_substring(const char *, int *, int, int,
                   const char **);
+PCRE_EXP_DECL int  pcre16_get_substring(PCRE_SPTR16, int *, int, int,
+                  PCRE_SPTR16 *);
 PCRE_EXP_DECL int  pcre_get_substring_list(const char *, int *, int,
                   const char ***);
+PCRE_EXP_DECL int  pcre16_get_substring_list(PCRE_SPTR16, int *, int,
+                  PCRE_SPTR16 **);
 PCRE_EXP_DECL int  pcre_info(const pcre *, int *, int *);
+PCRE_EXP_DECL int  pcre16_info(const pcre *, int *, int *);
 PCRE_EXP_DECL const unsigned char *pcre_maketables(void);
+PCRE_EXP_DECL const unsigned char *pcre16_maketables(void);
 PCRE_EXP_DECL int  pcre_refcount(pcre *, int);
+PCRE_EXP_DECL int  pcre16_refcount(pcre *, int);
 PCRE_EXP_DECL pcre_extra *pcre_study(const pcre *, int, const char **);
+PCRE_EXP_DECL pcre_extra *pcre16_study(const pcre *, int, const char **);
 PCRE_EXP_DECL void pcre_free_study(pcre_extra *);
+PCRE_EXP_DECL void pcre16_free_study(pcre_extra *);
 PCRE_EXP_DECL const char *pcre_version(void);
+PCRE_EXP_DECL const char *pcre16_version(void);

+/* Utility functions for byte order swaps. */
+PCRE_EXP_DECL int  pcre_pattern_to_host_byte_order(pcre *, pcre_extra *,
+                  const unsigned char *);
+PCRE_EXP_DECL int  pcre16_pattern_to_host_byte_order(pcre *, pcre_extra *,
+                  const unsigned char *);
+PCRE_EXP_DECL int  pcre16_utf16_to_host_byte_order(PCRE_SCHAR16 *,
+                  PCRE_SPTR16, int, int *, int);
+
 /* JIT compiler related functions. */

 PCRE_EXP_DECL pcre_jit_stack *pcre_jit_stack_alloc(int, int);
+PCRE_EXP_DECL pcre_jit_stack *pcre16_jit_stack_alloc(int, int);
 PCRE_EXP_DECL void pcre_jit_stack_free(pcre_jit_stack *);
-PCRE_EXP_DECL void pcre_assign_jit_stack(pcre_extra *, pcre_jit_callback, void *);
+PCRE_EXP_DECL void pcre16_jit_stack_free(pcre_jit_stack *);
+PCRE_EXP_DECL void pcre_assign_jit_stack(pcre_extra *,
+                  pcre_jit_callback, void *);
+PCRE_EXP_DECL void pcre16_assign_jit_stack(pcre_extra *,
+                  pcre_jit_callback, void *);

 #ifdef __cplusplus
 } /* extern "C" */

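The pcre.h.in hunk above only declares the byte-order utilities; their bodies are not part of this section. As a rough illustration of what converting UTF-16 code units to host byte order can involve, here is a minimal self-contained sketch. The names `swap16` and `swap_utf16_to_host` and the parameter layout are hypothetical and are not PCRE's implementation. The sketch assumes that an optional leading byte-order mark (0xFEFF) selects the order and that reading it as 0xFFFE means the input has the opposite endianness.

```c
#include <stddef.h>
#include <stdint.h>

#define BOM_NATIVE   0xFEFFu  /* byte-order mark as read in host order   */
#define BOM_SWAPPED  0xFFFEu  /* the same mark read with bytes reversed  */

/* Swap the two bytes of one 16-bit code unit. */
static uint16_t swap16(uint16_t u)
{
  return (uint16_t)((u << 8) | (u >> 8));
}

/* Hypothetical helper (not a PCRE function): copy `length` code units from
   `input` to `output`, converting them to host byte order. A leading BOM,
   if present, selects the order and is dropped from the output. Without a
   BOM the input is assumed to be in host order already. Returns the number
   of code units written. */
static size_t swap_utf16_to_host(uint16_t *output, const uint16_t *input,
                                 size_t length)
{
  int swapped = 0;
  size_t i = 0, n = 0;

  if (length > 0 && (input[0] == BOM_NATIVE || input[0] == BOM_SWAPPED)) {
    swapped = (input[0] == BOM_SWAPPED);
    i = 1;
  }
  for (; i < length; i++)
    output[n++] = swapped ? swap16(input[i]) : input[i];
  return n;
}
```

The declared `pcre16_utf16_to_host_byte_order()` takes more arguments than this sketch (two trailing `int *` and `int` parameters); their meaning is not stated in this diff, so they are not modelled here.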
Copied: code/trunk/pcre16_byte_order.c (from rev 835, code/branches/pcre16/pcre16_byte_order.c)
===================================================================
--- code/trunk/pcre16_byte_order.c                            (rev 0)
+++ code/trunk/pcre16_byte_order.c    2011-12-28 17:16:11 UTC (rev 836)
@@ -0,0 +1,45 @@
+/*************************************************
+*      Perl-Compatible Regular Expressions       *
+*************************************************/
+
+/* PCRE is a library of functions to support regular expressions whose syntax
+and semantics are as close as possible to those of the Perl 5 language.
+
+                       Written by Philip Hazel
+           Copyright (c) 1997-2012 University of Cambridge
+
+-----------------------------------------------------------------------------
+Redistribution and use in source and binary forms, with or without
+modification, are permitted provided that the following conditions are met:
+
+    * Redistributions of source code must retain the above copyright notice,
+      this list of conditions and the following disclaimer.
+
+    * Redistributions in binary form must reproduce the above copyright
+      notice, this list of conditions and the following disclaimer in the
+      documentation and/or other materials provided with the distribution.
+
+    * Neither the name of the University of Cambridge nor the names of its
+      contributors may be used to endorse or promote products derived from
+      this software without specific prior written permission.
+
+THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
+AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE
+LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
+CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
+SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
+INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
+CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
+ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
+POSSIBILITY OF SUCH DAMAGE.
+-----------------------------------------------------------------------------
+*/
+
+/* Generate code with 16 bit character support. */
+#define COMPILE_PCRE16
+
+#include "pcre_byte_order.c"
+
+/* End of pcre16_byte_order.c */

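Each pcre16_*.c wrapper in this commit has the same shape: it defines COMPILE_PCRE16 and then includes the corresponding shared source file, so one function body is compiled once per character width. The real mechanism needs two files, so the following single-file sketch approximates it with a macro that stamps out the same body for two character types. All names here (`DEFINE_STRLEN`, `demo_strlen8`, `demo_strlen16`, `demo_check16`) are hypothetical, and the use of `unsigned short` as the 16-bit type is an assumption that the sketch checks at compile time.

```c
#include <limits.h>
#include <stddef.h>

/* Compile-time check, assumed for this sketch only: the 16-bit build needs
   a type that is exactly 16 bits wide. A negative array size stops the
   build if unsigned short is some other width. */
typedef char demo_check16[(sizeof(unsigned short) * CHAR_BIT == 16) ? 1 : -1];

/* Stand-in for a shared source file: a function body written once against
   an abstract function name FN and character type CHAR_T. */
#define DEFINE_STRLEN(FN, CHAR_T)   \
  static size_t FN(const CHAR_T *s) \
  {                                 \
    size_t n = 0;                   \
    while (s[n] != 0) n++;          \
    return n;                       \
  }

/* Stand-in for the 8-bit build of the shared source. */
DEFINE_STRLEN(demo_strlen8, unsigned char)

/* Stand-in for a 16-bit wrapper: same body, 16-bit code units. */
DEFINE_STRLEN(demo_strlen16, unsigned short)
```

Both instantiations count code units rather than bytes, which is the property the 8-bit and 16-bit builds share: the logic is identical and only the unit width differs.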
Copied: code/trunk/pcre16_chartables.c (from rev 835, code/branches/pcre16/pcre16_chartables.c)
===================================================================
--- code/trunk/pcre16_chartables.c                            (rev 0)
+++ code/trunk/pcre16_chartables.c    2011-12-28 17:16:11 UTC (rev 836)
@@ -0,0 +1,45 @@
+/*************************************************
+*      Perl-Compatible Regular Expressions       *
+*************************************************/
+
+/* PCRE is a library of functions to support regular expressions whose syntax
+and semantics are as close as possible to those of the Perl 5 language.
+
+                       Written by Philip Hazel
+           Copyright (c) 1997-2012 University of Cambridge
+
+-----------------------------------------------------------------------------
+Redistribution and use in source and binary forms, with or without
+modification, are permitted provided that the following conditions are met:
+
+    * Redistributions of source code must retain the above copyright notice,
+      this list of conditions and the following disclaimer.
+
+    * Redistributions in binary form must reproduce the above copyright
+      notice, this list of conditions and the following disclaimer in the
+      documentation and/or other materials provided with the distribution.
+
+    * Neither the name of the University of Cambridge nor the names of its
+      contributors may be used to endorse or promote products derived from
+      this software without specific prior written permission.
+
+THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
+AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE
+LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
+CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
+SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
+INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
+CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
+ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
+POSSIBILITY OF SUCH DAMAGE.
+-----------------------------------------------------------------------------
+*/
+
+/* Generate code with 16 bit character support. */
+#define COMPILE_PCRE16
+
+#include "pcre_chartables.c"
+
+/* End of pcre16_chartables.c */

Copied: code/trunk/pcre16_compile.c (from rev 835, code/branches/pcre16/pcre16_compile.c)
===================================================================
--- code/trunk/pcre16_compile.c                            (rev 0)
+++ code/trunk/pcre16_compile.c    2011-12-28 17:16:11 UTC (rev 836)
@@ -0,0 +1,45 @@
+/*************************************************
+*      Perl-Compatible Regular Expressions       *
+*************************************************/
+
+/* PCRE is a library of functions to support regular expressions whose syntax
+and semantics are as close as possible to those of the Perl 5 language.
+
+                       Written by Philip Hazel
+           Copyright (c) 1997-2012 University of Cambridge
+
+-----------------------------------------------------------------------------
+Redistribution and use in source and binary forms, with or without
+modification, are permitted provided that the following conditions are met:
+
+    * Redistributions of source code must retain the above copyright notice,
+      this list of conditions and the following disclaimer.
+
+    * Redistributions in binary form must reproduce the above copyright
+      notice, this list of conditions and the following disclaimer in the
+      documentation and/or other materials provided with the distribution.
+
+    * Neither the name of the University of Cambridge nor the names of its
+      contributors may be used to endorse or promote products derived from
+      this software without specific prior written permission.
+
+THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
+AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE
+LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
+CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
+SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
+INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
+CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
+ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
+POSSIBILITY OF SUCH DAMAGE.
+-----------------------------------------------------------------------------
+*/
+
+/* Generate code with 16 bit character support. */
+#define COMPILE_PCRE16
+
+#include "pcre_compile.c"
+
+/* End of pcre16_compile.c */

Copied: code/trunk/pcre16_config.c (from rev 835, code/branches/pcre16/pcre16_config.c)
===================================================================
--- code/trunk/pcre16_config.c                            (rev 0)
+++ code/trunk/pcre16_config.c    2011-12-28 17:16:11 UTC (rev 836)
@@ -0,0 +1,45 @@
+/*************************************************
+*      Perl-Compatible Regular Expressions       *
+*************************************************/
+
+/* PCRE is a library of functions to support regular expressions whose syntax
+and semantics are as close as possible to those of the Perl 5 language.
+
+                       Written by Philip Hazel
+           Copyright (c) 1997-2012 University of Cambridge
+
+-----------------------------------------------------------------------------
+Redistribution and use in source and binary forms, with or without
+modification, are permitted provided that the following conditions are met:
+
+    * Redistributions of source code must retain the above copyright notice,
+      this list of conditions and the following disclaimer.
+
+    * Redistributions in binary form must reproduce the above copyright
+      notice, this list of conditions and the following disclaimer in the
+      documentation and/or other materials provided with the distribution.
+
+    * Neither the name of the University of Cambridge nor the names of its
+      contributors may be used to endorse or promote products derived from
+      this software without specific prior written permission.
+
+THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
+AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE
+LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
+CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
+SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
+INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
+CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
+ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
+POSSIBILITY OF SUCH DAMAGE.
+-----------------------------------------------------------------------------
+*/
+
+/* Generate code with 16 bit character support. */
+#define COMPILE_PCRE16
+
+#include "pcre_config.c"
+
+/* End of pcre16_config.c */

Copied: code/trunk/pcre16_dfa_exec.c (from rev 835, code/branches/pcre16/pcre16_dfa_exec.c)
===================================================================
--- code/trunk/pcre16_dfa_exec.c                            (rev 0)
+++ code/trunk/pcre16_dfa_exec.c    2011-12-28 17:16:11 UTC (rev 836)
@@ -0,0 +1,45 @@
+/*************************************************
+*      Perl-Compatible Regular Expressions       *
+*************************************************/
+
+/* PCRE is a library of functions to support regular expressions whose syntax
+and semantics are as close as possible to those of the Perl 5 language.
+
+                       Written by Philip Hazel
+           Copyright (c) 1997-2012 University of Cambridge
+
+-----------------------------------------------------------------------------
+Redistribution and use in source and binary forms, with or without
+modification, are permitted provided that the following conditions are met:
+
+    * Redistributions of source code must retain the above copyright notice,
+      this list of conditions and the following disclaimer.
+
+    * Redistributions in binary form must reproduce the above copyright
+      notice, this list of conditions and the following disclaimer in the
+      documentation and/or other materials provided with the distribution.
+
+    * Neither the name of the University of Cambridge nor the names of its
+      contributors may be used to endorse or promote products derived from
+      this software without specific prior written permission.
+
+THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
+AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE
+LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
+CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
+SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
+INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
+CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
+ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
+POSSIBILITY OF SUCH DAMAGE.
+-----------------------------------------------------------------------------
+*/
+
+/* Generate code with 16 bit character support. */
+#define COMPILE_PCRE16
+
+#include "pcre_dfa_exec.c"
+
+/* End of pcre16_dfa_exec.c */

Copied: code/trunk/pcre16_exec.c (from rev 835, code/branches/pcre16/pcre16_exec.c)
===================================================================
--- code/trunk/pcre16_exec.c                            (rev 0)
+++ code/trunk/pcre16_exec.c    2011-12-28 17:16:11 UTC (rev 836)
@@ -0,0 +1,45 @@
+/*************************************************
+*      Perl-Compatible Regular Expressions       *
+*************************************************/
+
+/* PCRE is a library of functions to support regular expressions whose syntax
+and semantics are as close as possible to those of the Perl 5 language.
+
+                       Written by Philip Hazel
+           Copyright (c) 1997-2012 University of Cambridge
+
+-----------------------------------------------------------------------------
+Redistribution and use in source and binary forms, with or without
+modification, are permitted provided that the following conditions are met:
+
+    * Redistributions of source code must retain the above copyright notice,
+      this list of conditions and the following disclaimer.
+
+    * Redistributions in binary form must reproduce the above copyright
+      notice, this list of conditions and the following disclaimer in the
+      documentation and/or other materials provided with the distribution.
+
+    * Neither the name of the University of Cambridge nor the names of its
+      contributors may be used to endorse or promote products derived from
+      this software without specific prior written permission.
+
+THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
+AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE
+LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
+CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
+SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
+INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
+CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
+ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
+POSSIBILITY OF SUCH DAMAGE.
+-----------------------------------------------------------------------------
+*/
+
+/* Generate code with 16 bit character support. */
+#define COMPILE_PCRE16
+
+#include "pcre_exec.c"
+
+/* End of pcre16_exec.c */

Copied: code/trunk/pcre16_fullinfo.c (from rev 835, code/branches/pcre16/pcre16_fullinfo.c)
===================================================================
--- code/trunk/pcre16_fullinfo.c                            (rev 0)
+++ code/trunk/pcre16_fullinfo.c    2011-12-28 17:16:11 UTC (rev 836)
@@ -0,0 +1,45 @@
+/*************************************************
+*      Perl-Compatible Regular Expressions       *
+*************************************************/
+
+/* PCRE is a library of functions to support regular expressions whose syntax
+and semantics are as close as possible to those of the Perl 5 language.
+
+                       Written by Philip Hazel
+           Copyright (c) 1997-2012 University of Cambridge
+
+-----------------------------------------------------------------------------
+Redistribution and use in source and binary forms, with or without
+modification, are permitted provided that the following conditions are met:
+
+    * Redistributions of source code must retain the above copyright notice,
+      this list of conditions and the following disclaimer.
+
+    * Redistributions in binary form must reproduce the above copyright
+      notice, this list of conditions and the following disclaimer in the
+      documentation and/or other materials provided with the distribution.
+
+    * Neither the name of the University of Cambridge nor the names of its
+      contributors may be used to endorse or promote products derived from
+      this software without specific prior written permission.
+
+THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
+AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE
+LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
+CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
+SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
+INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
+CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
+ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
+POSSIBILITY OF SUCH DAMAGE.
+-----------------------------------------------------------------------------
+*/
+
+/* Generate code with 16 bit character support. */
+#define COMPILE_PCRE16
+
+#include "pcre_fullinfo.c"
+
+/* End of pcre16_fullinfo.c */

Copied: code/trunk/pcre16_get.c (from rev 835, code/branches/pcre16/pcre16_get.c)
===================================================================
--- code/trunk/pcre16_get.c                            (rev 0)
+++ code/trunk/pcre16_get.c    2011-12-28 17:16:11 UTC (rev 836)
@@ -0,0 +1,45 @@
+/*************************************************
+*      Perl-Compatible Regular Expressions       *
+*************************************************/
+
+/* PCRE is a library of functions to support regular expressions whose syntax
+and semantics are as close as possible to those of the Perl 5 language.
+
+                       Written by Philip Hazel
+           Copyright (c) 1997-2012 University of Cambridge
+
+-----------------------------------------------------------------------------
+Redistribution and use in source and binary forms, with or without
+modification, are permitted provided that the following conditions are met:
+
+    * Redistributions of source code must retain the above copyright notice,
+      this list of conditions and the following disclaimer.
+
+    * Redistributions in binary form must reproduce the above copyright
+      notice, this list of conditions and the following disclaimer in the
+      documentation and/or other materials provided with the distribution.
+
+    * Neither the name of the University of Cambridge nor the names of its
+      contributors may be used to endorse or promote products derived from
+      this software without specific prior written permission.
+
+THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
+AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE
+LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
+CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
+SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
+INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
+CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
+ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
+POSSIBILITY OF SUCH DAMAGE.
+-----------------------------------------------------------------------------
+*/
+
+/* Generate code with 16 bit character support. */
+#define COMPILE_PCRE16
+
+#include "pcre_get.c"
+
+/* End of pcre16_get.c */

Copied: code/trunk/pcre16_globals.c (from rev 835, code/branches/pcre16/pcre16_globals.c)
===================================================================
--- code/trunk/pcre16_globals.c                            (rev 0)
+++ code/trunk/pcre16_globals.c    2011-12-28 17:16:11 UTC (rev 836)
@@ -0,0 +1,45 @@
+/*************************************************
+*      Perl-Compatible Regular Expressions       *
+*************************************************/
+
+/* PCRE is a library of functions to support regular expressions whose syntax
+and semantics are as close as possible to those of the Perl 5 language.
+
+                       Written by Philip Hazel
+           Copyright (c) 1997-2012 University of Cambridge
+
+-----------------------------------------------------------------------------
+Redistribution and use in source and binary forms, with or without
+modification, are permitted provided that the following conditions are met:
+
+    * Redistributions of source code must retain the above copyright notice,
+      this list of conditions and the following disclaimer.
+
+    * Redistributions in binary form must reproduce the above copyright
+      notice, this list of conditions and the following disclaimer in the
+      documentation and/or other materials provided with the distribution.
+
+    * Neither the name of the University of Cambridge nor the names of its
+      contributors may be used to endorse or promote products derived from
+      this software without specific prior written permission.
+
+THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
+AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE
+LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
+CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
+SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
+INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
+CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
+ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
+POSSIBILITY OF SUCH DAMAGE.
+-----------------------------------------------------------------------------
+*/
+
+/* Generate code with 16 bit character support. */
+#define COMPILE_PCRE16
+
+#include "pcre_globals.c"
+
+/* End of pcre16_globals.c */

Copied: code/trunk/pcre16_jit_compile.c (from rev 835, code/branches/pcre16/pcre16_jit_compile.c)
===================================================================
--- code/trunk/pcre16_jit_compile.c                            (rev 0)
+++ code/trunk/pcre16_jit_compile.c    2011-12-28 17:16:11 UTC (rev 836)
@@ -0,0 +1,45 @@
+/*************************************************
+*      Perl-Compatible Regular Expressions       *
+*************************************************/
+
+/* PCRE is a library of functions to support regular expressions whose syntax
+and semantics are as close as possible to those of the Perl 5 language.
+
+                       Written by Philip Hazel
+           Copyright (c) 1997-2012 University of Cambridge
+
+-----------------------------------------------------------------------------
+Redistribution and use in source and binary forms, with or without
+modification, are permitted provided that the following conditions are met:
+
+    * Redistributions of source code must retain the above copyright notice,
+      this list of conditions and the following disclaimer.
+
+    * Redistributions in binary form must reproduce the above copyright
+      notice, this list of conditions and the following disclaimer in the
+      documentation and/or other materials provided with the distribution.
+
+    * Neither the name of the University of Cambridge nor the names of its
+      contributors may be used to endorse or promote products derived from
+      this software without specific prior written permission.
+
+THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
+AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE
+LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
+CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
+SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
+INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
+CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
+ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
+POSSIBILITY OF SUCH DAMAGE.
+-----------------------------------------------------------------------------
+*/
+
+/* Generate code with 16 bit character support. */
+#define COMPILE_PCRE16
+
+#include "pcre_jit_compile.c"
+
+/* End of pcre16_jit_compile.c */

Copied: code/trunk/pcre16_maketables.c (from rev 835, code/branches/pcre16/pcre16_maketables.c)
===================================================================
--- code/trunk/pcre16_maketables.c                            (rev 0)
+++ code/trunk/pcre16_maketables.c    2011-12-28 17:16:11 UTC (rev 836)
@@ -0,0 +1,45 @@
+/*************************************************
+*      Perl-Compatible Regular Expressions       *
+*************************************************/
+
+/* PCRE is a library of functions to support regular expressions whose syntax
+and semantics are as close as possible to those of the Perl 5 language.
+
+                       Written by Philip Hazel
+           Copyright (c) 1997-2012 University of Cambridge
+
+-----------------------------------------------------------------------------
+Redistribution and use in source and binary forms, with or without
+modification, are permitted provided that the following conditions are met:
+
+    * Redistributions of source code must retain the above copyright notice,
+      this list of conditions and the following disclaimer.
+
+    * Redistributions in binary form must reproduce the above copyright
+      notice, this list of conditions and the following disclaimer in the
+      documentation and/or other materials provided with the distribution.
+
+    * Neither the name of the University of Cambridge nor the names of its
+      contributors may be used to endorse or promote products derived from
+      this software without specific prior written permission.
+
+THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
+AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE
+LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
+CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
+SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
+INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
+CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
+ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
+POSSIBILITY OF SUCH DAMAGE.
+-----------------------------------------------------------------------------
+*/
+
+/* Generate code with 16 bit character support. */
+#define COMPILE_PCRE16
+
+#include "pcre_maketables.c"
+
+/* End of pcre16_maketables.c */

Copied: code/trunk/pcre16_newline.c (from rev 835, code/branches/pcre16/pcre16_newline.c)
===================================================================
--- code/trunk/pcre16_newline.c                            (rev 0)
+++ code/trunk/pcre16_newline.c    2011-12-28 17:16:11 UTC (rev 836)
@@ -0,0 +1,45 @@
+/*************************************************
+*      Perl-Compatible Regular Expressions       *
+*************************************************/
+
+/* PCRE is a library of functions to support regular expressions whose syntax
+and semantics are as close as possible to those of the Perl 5 language.
+
+                       Written by Philip Hazel
+           Copyright (c) 1997-2012 University of Cambridge
+
+-----------------------------------------------------------------------------
+Redistribution and use in source and binary forms, with or without
+modification, are permitted provided that the following conditions are met:
+
+    * Redistributions of source code must retain the above copyright notice,
+      this list of conditions and the following disclaimer.
+
+    * Redistributions in binary form must reproduce the above copyright
+      notice, this list of conditions and the following disclaimer in the
+      documentation and/or other materials provided with the distribution.
+
+    * Neither the name of the University of Cambridge nor the names of its
+      contributors may be used to endorse or promote products derived from
+      this software without specific prior written permission.
+
+THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
+AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE
+LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
+CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
+SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
+INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
+CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
+ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
+POSSIBILITY OF SUCH DAMAGE.
+-----------------------------------------------------------------------------
+*/
+
+/* Generate code with 16 bit character support. */
+#define COMPILE_PCRE16
+
+#include "pcre_newline.c"
+
+/* End of pcre16_newline.c */

Copied: code/trunk/pcre16_ord2utf16.c (from rev 835, code/branches/pcre16/pcre16_ord2utf16.c)
===================================================================
--- code/trunk/pcre16_ord2utf16.c                            (rev 0)
+++ code/trunk/pcre16_ord2utf16.c    2011-12-28 17:16:11 UTC (rev 836)
@@ -0,0 +1,95 @@
+/*************************************************
+*      Perl-Compatible Regular Expressions       *
+*************************************************/
+
+/* PCRE is a library of functions to support regular expressions whose syntax
+and semantics are as close as possible to those of the Perl 5 language.
+
+                       Written by Philip Hazel
+           Copyright (c) 1997-2012 University of Cambridge
+
+-----------------------------------------------------------------------------
+Redistribution and use in source and binary forms, with or without
+modification, are permitted provided that the following conditions are met:
+
+    * Redistributions of source code must retain the above copyright notice,
+      this list of conditions and the following disclaimer.
+
+    * Redistributions in binary form must reproduce the above copyright
+      notice, this list of conditions and the following disclaimer in the
+      documentation and/or other materials provided with the distribution.
+
+    * Neither the name of the University of Cambridge nor the names of its
+      contributors may be used to endorse or promote products derived from
+      this software without specific prior written permission.
+
+THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
+AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE
+LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
+CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
+SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
+INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
+CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
+ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
+POSSIBILITY OF SUCH DAMAGE.
+-----------------------------------------------------------------------------
+*/
+
+
+/* This file contains a private PCRE function that converts an ordinal
+character value into a UTF-16 string. */
+
+#ifdef HAVE_CONFIG_H
+#include "config.h"
+#endif
+
+/* Generate code with 16 bit character support. */
+#define COMPILE_PCRE16
+
+#include "pcre_internal.h"
+
+/*************************************************
+*       Convert character value to UTF-16         *
+*************************************************/
+
+/* This function takes an integer value in the range 0 - 0x10ffff
+and encodes it as a UTF-16 character in 1 or 2 pcre_uchars.
+
+Arguments:
+  cvalue     the character value
+  buffer     pointer to buffer for result - at least 2 pcre_uchars long
+
+Returns:     number of pcre_uchars placed in the buffer
+*/
+
+int
+PRIV(ord2utf)(pcre_uint32 cvalue, pcre_uchar *buffer)
+{
+#ifdef SUPPORT_UTF
+
+/* An invalid cvalue (a surrogate, or a value above 0x10ffff) is replaced by
+0xfffe, which is not a valid character. Should never happen in practice. */
+if ((cvalue & 0xf800) == 0xd800 || cvalue >= 0x110000)
+  cvalue = 0xfffe;
+
+if (cvalue <= 0xffff)
+  {
+  *buffer = (pcre_uchar)cvalue;
+  return 1;
+  }
+
+cvalue -= 0x10000;
+*buffer++ = 0xd800 | (cvalue >> 10);
+*buffer = 0xdc00 | (cvalue & 0x3ff);
+return 2;
+
+#else /* SUPPORT_UTF */
+(void)(cvalue);  /* Keep compiler happy; this function won't ever be */
+(void)(buffer);  /* called when SUPPORT_UTF is not defined. */
+return 0;
+#endif /* SUPPORT_UTF */
+}
+
+/* End of pcre16_ord2utf16.c */

Copied: code/trunk/pcre16_printint.c (from rev 835, code/branches/pcre16/pcre16_printint.c)
===================================================================
--- code/trunk/pcre16_printint.c                            (rev 0)
+++ code/trunk/pcre16_printint.c    2011-12-28 17:16:11 UTC (rev 836)
@@ -0,0 +1,45 @@
+/*************************************************
+*      Perl-Compatible Regular Expressions       *
+*************************************************/
+
+/* PCRE is a library of functions to support regular expressions whose syntax
+and semantics are as close as possible to those of the Perl 5 language.
+
+                       Written by Philip Hazel
+           Copyright (c) 1997-2012 University of Cambridge
+
+-----------------------------------------------------------------------------
+Redistribution and use in source and binary forms, with or without
+modification, are permitted provided that the following conditions are met:
+
+    * Redistributions of source code must retain the above copyright notice,
+      this list of conditions and the following disclaimer.
+
+    * Redistributions in binary form must reproduce the above copyright
+      notice, this list of conditions and the following disclaimer in the
+      documentation and/or other materials provided with the distribution.
+
+    * Neither the name of the University of Cambridge nor the names of its
+      contributors may be used to endorse or promote products derived from
+      this software without specific prior written permission.
+
+THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
+AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE
+LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
+CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
+SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
+INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
+CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
+ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
+POSSIBILITY OF SUCH DAMAGE.
+-----------------------------------------------------------------------------
+*/
+
+/* Generate code with 16 bit character support. */
+#define COMPILE_PCRE16
+
+#include "pcre_printint.c"
+
+/* End of pcre16_printint.c */

Copied: code/trunk/pcre16_refcount.c (from rev 835, code/branches/pcre16/pcre16_refcount.c)
===================================================================
--- code/trunk/pcre16_refcount.c                            (rev 0)
+++ code/trunk/pcre16_refcount.c    2011-12-28 17:16:11 UTC (rev 836)
@@ -0,0 +1,45 @@
+/*************************************************
+*      Perl-Compatible Regular Expressions       *
+*************************************************/
+
+/* PCRE is a library of functions to support regular expressions whose syntax
+and semantics are as close as possible to those of the Perl 5 language.
+
+                       Written by Philip Hazel
+           Copyright (c) 1997-2012 University of Cambridge
+
+-----------------------------------------------------------------------------
+Redistribution and use in source and binary forms, with or without
+modification, are permitted provided that the following conditions are met:
+
+    * Redistributions of source code must retain the above copyright notice,
+      this list of conditions and the following disclaimer.
+
+    * Redistributions in binary form must reproduce the above copyright
+      notice, this list of conditions and the following disclaimer in the
+      documentation and/or other materials provided with the distribution.
+
+    * Neither the name of the University of Cambridge nor the names of its
+      contributors may be used to endorse or promote products derived from
+      this software without specific prior written permission.
+
+THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
+AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE
+LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
+CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
+SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
+INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
+CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
+ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
+POSSIBILITY OF SUCH DAMAGE.
+-----------------------------------------------------------------------------
+*/
+
+/* Generate code with 16 bit character support. */
+#define COMPILE_PCRE16
+
+#include "pcre_refcount.c"
+
+/* End of pcre16_refcount.c */

Copied: code/trunk/pcre16_string_utils.c (from rev 835, code/branches/pcre16/pcre16_string_utils.c)
===================================================================
--- code/trunk/pcre16_string_utils.c                            (rev 0)
+++ code/trunk/pcre16_string_utils.c    2011-12-28 17:16:11 UTC (rev 836)
@@ -0,0 +1,45 @@
+/*************************************************
+*      Perl-Compatible Regular Expressions       *
+*************************************************/
+
+/* PCRE is a library of functions to support regular expressions whose syntax
+and semantics are as close as possible to those of the Perl 5 language.
+
+                       Written by Philip Hazel
+           Copyright (c) 1997-2012 University of Cambridge
+
+-----------------------------------------------------------------------------
+Redistribution and use in source and binary forms, with or without
+modification, are permitted provided that the following conditions are met:
+
+    * Redistributions of source code must retain the above copyright notice,
+      this list of conditions and the following disclaimer.
+
+    * Redistributions in binary form must reproduce the above copyright
+      notice, this list of conditions and the following disclaimer in the
+      documentation and/or other materials provided with the distribution.
+
+    * Neither the name of the University of Cambridge nor the names of its
+      contributors may be used to endorse or promote products derived from
+      this software without specific prior written permission.
+
+THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
+AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE
+LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
+CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
+SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
+INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
+CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
+ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
+POSSIBILITY OF SUCH DAMAGE.
+-----------------------------------------------------------------------------
+*/
+
+/* Generate code with 16 bit character support. */
+#define COMPILE_PCRE16
+
+#include "pcre_string_utils.c"
+
+/* End of pcre16_string_utils.c */

Copied: code/trunk/pcre16_study.c (from rev 835, code/branches/pcre16/pcre16_study.c)
===================================================================
--- code/trunk/pcre16_study.c                            (rev 0)
+++ code/trunk/pcre16_study.c    2011-12-28 17:16:11 UTC (rev 836)
@@ -0,0 +1,45 @@
+/*************************************************
+*      Perl-Compatible Regular Expressions       *
+*************************************************/
+
+/* PCRE is a library of functions to support regular expressions whose syntax
+and semantics are as close as possible to those of the Perl 5 language.
+
+                       Written by Philip Hazel
+           Copyright (c) 1997-2012 University of Cambridge
+
+-----------------------------------------------------------------------------
+Redistribution and use in source and binary forms, with or without
+modification, are permitted provided that the following conditions are met:
+
+    * Redistributions of source code must retain the above copyright notice,
+      this list of conditions and the following disclaimer.
+
+    * Redistributions in binary form must reproduce the above copyright
+      notice, this list of conditions and the following disclaimer in the
+      documentation and/or other materials provided with the distribution.
+
+    * Neither the name of the University of Cambridge nor the names of its
+      contributors may be used to endorse or promote products derived from
+      this software without specific prior written permission.
+
+THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
+AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE
+LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
+CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
+SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
+INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
+CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
+ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
+POSSIBILITY OF SUCH DAMAGE.
+-----------------------------------------------------------------------------
+*/
+
+/* Generate code with 16 bit character support. */
+#define COMPILE_PCRE16
+
+#include "pcre_study.c"
+
+/* End of pcre16_study.c */

Copied: code/trunk/pcre16_tables.c (from rev 835, code/branches/pcre16/pcre16_tables.c)
===================================================================
--- code/trunk/pcre16_tables.c                            (rev 0)
+++ code/trunk/pcre16_tables.c    2011-12-28 17:16:11 UTC (rev 836)
@@ -0,0 +1,45 @@
+/*************************************************
+*      Perl-Compatible Regular Expressions       *
+*************************************************/
+
+/* PCRE is a library of functions to support regular expressions whose syntax
+and semantics are as close as possible to those of the Perl 5 language.
+
+                       Written by Philip Hazel
+           Copyright (c) 1997-2012 University of Cambridge
+
+-----------------------------------------------------------------------------
+Redistribution and use in source and binary forms, with or without
+modification, are permitted provided that the following conditions are met:
+
+    * Redistributions of source code must retain the above copyright notice,
+      this list of conditions and the following disclaimer.
+
+    * Redistributions in binary form must reproduce the above copyright
+      notice, this list of conditions and the following disclaimer in the
+      documentation and/or other materials provided with the distribution.
+
+    * Neither the name of the University of Cambridge nor the names of its
+      contributors may be used to endorse or promote products derived from
+      this software without specific prior written permission.
+
+THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
+AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE
+LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
+CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
+SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
+INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
+CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
+ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
+POSSIBILITY OF SUCH DAMAGE.
+-----------------------------------------------------------------------------
+*/
+
+/* Generate code with 16 bit character support. */
+#define COMPILE_PCRE16
+
+#include "pcre_tables.c"
+
+/* End of pcre16_tables.c */

Copied: code/trunk/pcre16_ucd.c (from rev 835, code/branches/pcre16/pcre16_ucd.c)
===================================================================
--- code/trunk/pcre16_ucd.c                            (rev 0)
+++ code/trunk/pcre16_ucd.c    2011-12-28 17:16:11 UTC (rev 836)
@@ -0,0 +1,45 @@
+/*************************************************
+*      Perl-Compatible Regular Expressions       *
+*************************************************/
+
+/* PCRE is a library of functions to support regular expressions whose syntax
+and semantics are as close as possible to those of the Perl 5 language.
+
+                       Written by Philip Hazel
+           Copyright (c) 1997-2012 University of Cambridge
+
+-----------------------------------------------------------------------------
+Redistribution and use in source and binary forms, with or without
+modification, are permitted provided that the following conditions are met:
+
+    * Redistributions of source code must retain the above copyright notice,
+      this list of conditions and the following disclaimer.
+
+    * Redistributions in binary form must reproduce the above copyright
+      notice, this list of conditions and the following disclaimer in the
+      documentation and/or other materials provided with the distribution.
+
+    * Neither the name of the University of Cambridge nor the names of its
+      contributors may be used to endorse or promote products derived from
+      this software without specific prior written permission.
+
+THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
+AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE
+LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
+CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
+SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
+INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
+CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
+ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
+POSSIBILITY OF SUCH DAMAGE.
+-----------------------------------------------------------------------------
+*/
+
+/* Generate code with 16 bit character support. */
+#define COMPILE_PCRE16
+
+#include "pcre_ucd.c"
+
+/* End of pcre16_ucd.c */

Copied: code/trunk/pcre16_utf16_utils.c (from rev 835, code/branches/pcre16/pcre16_utf16_utils.c)
===================================================================
--- code/trunk/pcre16_utf16_utils.c                            (rev 0)
+++ code/trunk/pcre16_utf16_utils.c    2011-12-28 17:16:11 UTC (rev 836)
@@ -0,0 +1,129 @@
+/*************************************************
+*      Perl-Compatible Regular Expressions       *
+*************************************************/
+
+/* PCRE is a library of functions to support regular expressions whose syntax
+and semantics are as close as possible to those of the Perl 5 language.
+
+                       Written by Philip Hazel
+           Copyright (c) 1997-2012 University of Cambridge
+
+-----------------------------------------------------------------------------
+Redistribution and use in source and binary forms, with or without
+modification, are permitted provided that the following conditions are met:
+
+    * Redistributions of source code must retain the above copyright notice,
+      this list of conditions and the following disclaimer.
+
+    * Redistributions in binary form must reproduce the above copyright
+      notice, this list of conditions and the following disclaimer in the
+      documentation and/or other materials provided with the distribution.
+
+    * Neither the name of the University of Cambridge nor the names of its
+      contributors may be used to endorse or promote products derived from
+      this software without specific prior written permission.
+
+THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
+AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE
+LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
+CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
+SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
+INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
+CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
+ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
+POSSIBILITY OF SUCH DAMAGE.
+-----------------------------------------------------------------------------
+*/
+
+
+/* This module contains a function for converting any UTF-16 character
+string to host byte order. */
+
+
+#ifdef HAVE_CONFIG_H
+#include "config.h"
+#endif
+
+/* Generate code with 16 bit character support. */
+#define COMPILE_PCRE16
+
+#include "pcre_internal.h"
+
+/*************************************************
+*  Convert any UTF-16 string to host byte order  *
+*************************************************/
+
+/* This function takes a UTF-16 string and converts
+it to host byte order. The length can be set explicitly,
+or detected automatically for zero-terminated strings.
+BOMs can be kept or discarded during the conversion.
+Conversion can be done in place (output == input).
+
+Arguments:
+  output     the output buffer; it must be at least as
+             large as the input string
+  input      any UTF-16 string
+  length     the number of characters in the input string;
+             can be less than zero for zero-terminated strings
+  host_byte_order
+             A non-zero value means the input is in host byte
+             order, which can be changed dynamically by later BOMs.
+             On entry it holds the starting byte order; on return it
+             holds the last byte order seen, so it can be used for
+             stream processing. It can be NULL, in which case host
+             byte order is assumed initially.
+  keep_boms  if non-zero, the BOM (0xfeff) characters
+             are copied as well
+
+Returns:     the number of characters placed into the output buffer
+             (this includes the zero terminator when length is negative)
+*/
+
+int
+pcre16_utf16_to_host_byte_order(PCRE_SCHAR16 *output, PCRE_SPTR16 input,
+  int length, int *host_byte_order, int keep_boms)
+{
+#ifdef SUPPORT_UTF
+/* This function converts any UTF-16 string to host byte order and optionally
+removes any Byte Order Marks (BOMs). It returns the remaining length. */
+int host_bo = host_byte_order != NULL ? *host_byte_order : 1;
+pcre_uchar *optr = (pcre_uchar *)output;
+const pcre_uchar *iptr = (const pcre_uchar *)input;
+const pcre_uchar *end;
+/* The c variable must be unsigned. */
+register pcre_uchar c;
+
+if (length < 0)
+  length = STRLEN_UC(iptr) + 1;
+end = iptr + length;
+
+while (iptr < end)
+  {
+  c = *iptr++;
+  if (c == 0xfeff || c == 0xfffe)
+    {
+    /* Detecting the byte order of the machine is unnecessary; it is
+    enough to know whether the UTF-16 string has the same byte order. */
+    host_bo = c == 0xfeff;
+    if (keep_boms != 0)
+      *optr++ = 0xfeff;
+    else
+      length--;
+    }
+  else
+    *optr++ = host_bo ? c : ((c >> 8) | (c << 8)); /* Flip bytes if needed. */
+  }
+if (host_byte_order != NULL)
+  *host_byte_order = host_bo;
+
+#else /* SUPPORT_UTF */
+(void)(output);  /* Keep picky compilers happy */
+(void)(input);
+(void)(keep_boms);
+#endif /* SUPPORT_UTF */
+return length;
+}
+
+/* End of pcre16_utf16_utils.c */
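
The BOM-driven byte swapping in pcre16_utf16_utils.c can be sketched as a
standalone function. The name to_host_order and the <stdint.h> types are
assumptions made for illustration, and the sketch counts written units
directly instead of decrementing length; it is not the library's code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for pcre16_utf16_to_host_byte_order(); not PCRE code.
Returns the number of 16-bit units written to output. */
int to_host_order(uint16_t *output, const uint16_t *input, int length,
                  int *host_byte_order, int keep_boms)
{
  int host_bo = (host_byte_order != NULL) ? *host_byte_order : 1;
  int i, written = 0;

  assert(output != NULL && input != NULL);

  /* A negative length means "zero-terminated": count the terminator too. */
  if (length < 0) {
    length = 0;
    while (input[length] != 0) length++;
    length++;
  }

  for (i = 0; i < length; i++) {
    uint16_t c = input[i];
    if (c == 0xfeff || c == 0xfffe) {
      /* 0xfeff reads correctly only when the data is in host order. */
      host_bo = (c == 0xfeff);
      if (keep_boms) output[written++] = 0xfeff;
    } else {
      /* Swap the two bytes when the data is in the opposite order. */
      output[written++] = host_bo ? c : (uint16_t)((c >> 8) | (c << 8));
    }
  }

  if (host_byte_order != NULL) *host_byte_order = host_bo;
  return written;
}
```

Given the four units 0xfffe 0x4100 0x4200 0x0000 with keep_boms set to 0, the
sketch writes 0x0041 0x0042 0x0000, returns 3, and leaves the byte-order flag
at 0. Because each unit is read before anything is written at or beyond its
position, output may be the same buffer as input.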

Copied: code/trunk/pcre16_valid_utf16.c (from rev 835, code/branches/pcre16/pcre16_valid_utf16.c)
===================================================================
--- code/trunk/pcre16_valid_utf16.c                            (rev 0)
+++ code/trunk/pcre16_valid_utf16.c    2011-12-28 17:16:11 UTC (rev 836)
@@ -0,0 +1,146 @@
+/*************************************************
+*      Perl-Compatible Regular Expressions       *
+*************************************************/
+
+/* PCRE is a library of functions to support regular expressions whose syntax
+and semantics are as close as possible to those of the Perl 5 language.
+
+                       Written by Philip Hazel
+           Copyright (c) 1997-2012 University of Cambridge
+
+-----------------------------------------------------------------------------
+Redistribution and use in source and binary forms, with or without
+modification, are permitted provided that the following conditions are met:
+
+    * Redistributions of source code must retain the above copyright notice,
+      this list of conditions and the following disclaimer.
+
+    * Redistributions in binary form must reproduce the above copyright
+      notice, this list of conditions and the following disclaimer in the
+      documentation and/or other materials provided with the distribution.
+
+    * Neither the name of the University of Cambridge nor the names of its
+      contributors may be used to endorse or promote products derived from
+      this software without specific prior written permission.
+
+THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
+AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE
+LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
+CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
+SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
+INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
+CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
+ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
+POSSIBILITY OF SUCH DAMAGE.
+-----------------------------------------------------------------------------
+*/
+
+
+/* This module contains an internal function for validating UTF-16 character
+strings. */
+
+
+#ifdef HAVE_CONFIG_H
+#include "config.h"
+#endif
+
+/* Generate code with 16 bit character support. */
+#define COMPILE_PCRE16
+
+#include "pcre_internal.h"
+
+
+/*************************************************
+*         Validate a UTF-16 string                *
+*************************************************/
+
+/* This function is called (optionally) at the start of compile or match, to
+check that a supposed UTF-16 string is actually valid. The early check means
+that subsequent code can assume it is dealing with a valid string. The check
+can be turned off for maximum performance, but the consequences of supplying an
+invalid string are then undefined.
+
+From release 8.21 more information about the details of the error is passed
+back in the returned value:
+
+PCRE_UTF16_ERR0  No error
+PCRE_UTF16_ERR1  Missing low surrogate at the end of the string
+PCRE_UTF16_ERR2  Invalid low surrogate
+PCRE_UTF16_ERR3  Isolated low surrogate
+PCRE_UTF16_ERR4  Disallowed character
+
+Arguments:
+  string       points to the string
+  length       length of string, or -1 if the string is zero-terminated
+  erroroffset  pointer to an error position offset variable
+
+Returns:       = 0    if the string is a valid UTF-16 string
+               > 0    otherwise, setting the offset of the bad character
+*/
+
+int
+PRIV(valid_utf)(PCRE_PUCHAR string, int length, int *erroroffset)
+{
+#ifdef SUPPORT_UTF
+register PCRE_PUCHAR p;
+register pcre_uchar c;
+
+if (length < 0)
+  {
+  for (p = string; *p != 0; p++);
+  length = p - string;
+  }
+
+for (p = string; length-- > 0; p++)
+  {
+  c = *p;
+
+  if ((c & 0xf800) != 0xd800)
+    {
+    /* Normal UTF-16 code point. Neither high nor low surrogate. */
+
+    /* This is probably a BOM from a different byte-order.
+    Regardless, the string is rejected. */
+    if (c == 0xfffe)
+      {
+      *erroroffset = p - string;
+      return PCRE_UTF16_ERR4;
+      }
+    }
+  else if ((c & 0x0400) == 0)
+    {
+    /* High surrogate. */
+
+    /* Must be followed by a low surrogate. */
+    if (length == 0)
+      {
+      *erroroffset = p - string;
+      return PCRE_UTF16_ERR1;
+      }
+    p++;
+    length--;
+    if ((*p & 0xfc00) != 0xdc00)
+      {
+      *erroroffset = p - string;
+      return PCRE_UTF16_ERR2;
+      }
+    }
+  else
+    {
+    /* Isolated low surrogate. Always an error. */
+    *erroroffset = p - string;
+    return PCRE_UTF16_ERR3;
+    }
+  }
+
+#else  /* SUPPORT_UTF */
+(void)(string);  /* Keep picky compilers happy */
+(void)(length);
+#endif /* SUPPORT_UTF */
+
+return PCRE_UTF16_ERR0;   /* This indicates success */
+}
+
+/* End of pcre16_valid_utf16.c */
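
The surrogate checks in PRIV(valid_utf) above can be tried outside the library with a small standalone sketch. Everything named demo_* below is hypothetical and only mirrors the meaning of the PCRE_UTF16_ERRn codes; unlike the original it takes an explicit length and does not handle the -1 (zero-terminated) case.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical error codes for this sketch only. */
enum {
  DEMO_UTF16_OK = 0,            /* no error */
  DEMO_UTF16_TRUNCATED = 1,     /* high surrogate at the end of the string */
  DEMO_UTF16_BAD_LOW = 2,       /* high surrogate not followed by a low one */
  DEMO_UTF16_ISOLATED_LOW = 3,  /* low surrogate with no preceding high one */
  DEMO_UTF16_DISALLOWED = 4     /* 0xfffe, probably a byte-swapped BOM */
};

/* Validate `length` 16-bit units. On failure, store the index of the
   offending unit in *erroroffset and return a non-zero code. */
static int
demo_valid_utf16(const uint16_t *s, size_t length, size_t *erroroffset)
{
  size_t i;

  assert(s != NULL && erroroffset != NULL);

  for (i = 0; i < length; i++)
    {
    uint16_t c = s[i];

    if ((c & 0xf800) != 0xd800)        /* not a surrogate at all */
      {
      if (c == 0xfffe)
        {
        *erroroffset = i;
        return DEMO_UTF16_DISALLOWED;
        }
      }
    else if ((c & 0x0400) == 0)        /* high surrogate: 0xd800-0xdbff */
      {
      if (i + 1 == length)
        {
        *erroroffset = i;              /* points at the high surrogate */
        return DEMO_UTF16_TRUNCATED;
        }
      i++;
      if ((s[i] & 0xfc00) != 0xdc00)
        {
        *erroroffset = i;              /* points at the unit that follows */
        return DEMO_UTF16_BAD_LOW;
        }
      }
    else                               /* low surrogate: 0xdc00-0xdfff */
      {
      *erroroffset = i;
      return DEMO_UTF16_ISOLATED_LOW;
      }
    }
  return DEMO_UTF16_OK;
}
```

The mask 0xf800 selects the whole surrogate block in one comparison; bit 0x0400 then separates high from low surrogates.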

Copied: code/trunk/pcre16_version.c (from rev 835, code/branches/pcre16/pcre16_version.c)
===================================================================
--- code/trunk/pcre16_version.c                            (rev 0)
+++ code/trunk/pcre16_version.c    2011-12-28 17:16:11 UTC (rev 836)
@@ -0,0 +1,45 @@
+/*************************************************
+*      Perl-Compatible Regular Expressions       *
+*************************************************/
+
+/* PCRE is a library of functions to support regular expressions whose syntax
+and semantics are as close as possible to those of the Perl 5 language.
+
+                       Written by Philip Hazel
+           Copyright (c) 1997-2012 University of Cambridge
+
+-----------------------------------------------------------------------------
+Redistribution and use in source and binary forms, with or without
+modification, are permitted provided that the following conditions are met:
+
+    * Redistributions of source code must retain the above copyright notice,
+      this list of conditions and the following disclaimer.
+
+    * Redistributions in binary form must reproduce the above copyright
+      notice, this list of conditions and the following disclaimer in the
+      documentation and/or other materials provided with the distribution.
+
+    * Neither the name of the University of Cambridge nor the names of its
+      contributors may be used to endorse or promote products derived from
+      this software without specific prior written permission.
+
+THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
+AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE
+LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
+CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
+SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
+INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
+CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
+ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
+POSSIBILITY OF SUCH DAMAGE.
+-----------------------------------------------------------------------------
+*/
+
+/* Generate code with 16 bit character support. */
+#define COMPILE_PCRE16
+
+#include "pcre_version.c"
+
+/* End of pcre16_version.c */

Copied: code/trunk/pcre16_xclass.c (from rev 835, code/branches/pcre16/pcre16_xclass.c)
===================================================================
--- code/trunk/pcre16_xclass.c                            (rev 0)
+++ code/trunk/pcre16_xclass.c    2011-12-28 17:16:11 UTC (rev 836)
@@ -0,0 +1,45 @@
+/*************************************************
+*      Perl-Compatible Regular Expressions       *
+*************************************************/
+
+/* PCRE is a library of functions to support regular expressions whose syntax
+and semantics are as close as possible to those of the Perl 5 language.
+
+                       Written by Philip Hazel
+           Copyright (c) 1997-2012 University of Cambridge
+
+-----------------------------------------------------------------------------
+Redistribution and use in source and binary forms, with or without
+modification, are permitted provided that the following conditions are met:
+
+    * Redistributions of source code must retain the above copyright notice,
+      this list of conditions and the following disclaimer.
+
+    * Redistributions in binary form must reproduce the above copyright
+      notice, this list of conditions and the following disclaimer in the
+      documentation and/or other materials provided with the distribution.
+
+    * Neither the name of the University of Cambridge nor the names of its
+      contributors may be used to endorse or promote products derived from
+      this software without specific prior written permission.
+
+THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
+AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE
+LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
+CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
+SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
+INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
+CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
+ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
+POSSIBILITY OF SUCH DAMAGE.
+-----------------------------------------------------------------------------
+*/
+
+/* Generate code with 16 bit character support. */
+#define COMPILE_PCRE16
+
+#include "pcre_xclass.c"
+
+/* End of pcre16_xclass.c */

Copied: code/trunk/pcre_byte_order.c (from rev 835, code/branches/pcre16/pcre_byte_order.c)
===================================================================
--- code/trunk/pcre_byte_order.c                            (rev 0)
+++ code/trunk/pcre_byte_order.c    2011-12-28 17:16:11 UTC (rev 836)
@@ -0,0 +1,288 @@
+/*************************************************
+*      Perl-Compatible Regular Expressions       *
+*************************************************/
+
+/* PCRE is a library of functions to support regular expressions whose syntax
+and semantics are as close as possible to those of the Perl 5 language.
+
+                       Written by Philip Hazel
+           Copyright (c) 1997-2012 University of Cambridge
+
+-----------------------------------------------------------------------------
+Redistribution and use in source and binary forms, with or without
+modification, are permitted provided that the following conditions are met:
+
+    * Redistributions of source code must retain the above copyright notice,
+      this list of conditions and the following disclaimer.
+
+    * Redistributions in binary form must reproduce the above copyright
+      notice, this list of conditions and the following disclaimer in the
+      documentation and/or other materials provided with the distribution.
+
+    * Neither the name of the University of Cambridge nor the names of its
+      contributors may be used to endorse or promote products derived from
+      this software without specific prior written permission.
+
+THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
+AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE
+LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
+CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
+SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
+INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
+CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
+ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
+POSSIBILITY OF SUCH DAMAGE.
+-----------------------------------------------------------------------------
+*/
+
+
+/* This module contains an internal function that tests a compiled pattern to
+see if it was compiled with the opposite endianness. If so, it uses an
+auxiliary local function to flip the appropriate bytes. */
+
+
+#ifdef HAVE_CONFIG_H
+#include "config.h"
+#endif
+
+#include "pcre_internal.h"
+
+
+/*************************************************
+*             Swap byte functions                *
+*************************************************/
+
+/* The following functions swap the bytes of a pcre_uint16
+and pcre_uint32 value.
+
+Arguments:
+  value        any number
+
+Returns:       the byte swapped value
+*/
+
+static pcre_uint32
+swap_uint32(pcre_uint32 value)
+{
+return ((value & 0x000000ff) << 24) |
+       ((value & 0x0000ff00) <<  8) |
+       ((value & 0x00ff0000) >>  8) |
+       (value >> 24);
+}
+
+static pcre_uint16
+swap_uint16(pcre_uint16 value)
+{
+return (value >> 8) | (value << 8);
+}
+
+
+/*************************************************
+*       Test for a byte-flipped compiled regex   *
+*************************************************/
+
+/* This function swaps the bytes of a compiled pattern, usually one
+loaded from disk. It also sets the tables pointer, which is likely
+to be invalid after a reload.
+
+Arguments:
+  argument_re     points to the compiled expression
+  extra_data      points to extra data or is NULL
+  tables          points to the character tables or NULL
+
+Returns:          0 if the swap is successful, negative on error
+*/
+
+#ifdef COMPILE_PCRE8
+PCRE_EXP_DECL int pcre_pattern_to_host_byte_order(pcre *argument_re,
+  pcre_extra *extra_data, const unsigned char *tables)
+#else
+PCRE_EXP_DECL int pcre16_pattern_to_host_byte_order(pcre *argument_re,
+  pcre_extra *extra_data, const unsigned char *tables)
+#endif
+{
+real_pcre *re = (real_pcre *)argument_re;
+pcre_study_data *study;
+#ifndef COMPILE_PCRE8
+pcre_uchar *ptr;
+int length;
+#ifdef SUPPORT_UTF
+BOOL utf;
+BOOL utf16_char;
+#endif /* SUPPORT_UTF */
+#endif /* !COMPILE_PCRE8 */
+
+if (re == NULL) return PCRE_ERROR_NULL;
+if (re->magic_number == MAGIC_NUMBER)
+  {
+  if ((re->flags & PCRE_MODE) == 0) return PCRE_ERROR_BADMODE;
+  re->tables = tables;
+  return 0;
+  }
+
+if (re->magic_number != REVERSED_MAGIC_NUMBER) return PCRE_ERROR_BADMAGIC;
+if ((swap_uint16(re->flags) & PCRE_MODE) == 0) return PCRE_ERROR_BADMODE;
+
+re->magic_number = MAGIC_NUMBER;
+re->size = swap_uint32(re->size);
+re->options = swap_uint32(re->options);
+re->flags = swap_uint16(re->flags);
+re->top_bracket = swap_uint16(re->top_bracket);
+re->top_backref = swap_uint16(re->top_backref);
+re->first_char = swap_uint16(re->first_char);
+re->req_char = swap_uint16(re->req_char);
+re->name_table_offset = swap_uint16(re->name_table_offset);
+re->name_entry_size = swap_uint16(re->name_entry_size);
+re->name_count = swap_uint16(re->name_count);
+re->ref_count = swap_uint16(re->ref_count);
+re->tables = tables;
+
+if (extra_data != NULL && (re->flags & PCRE_EXTRA_STUDY_DATA) != 0)
+  {
+  study = (pcre_study_data *)extra_data->study_data;
+  study->size = swap_uint32(study->size);
+  study->flags = swap_uint32(study->flags);
+  study->minlength = swap_uint32(study->minlength);
+  }
+
+#ifndef COMPILE_PCRE8
+ptr = (pcre_uchar *)re + re->name_table_offset;
+length = re->name_count * re->name_entry_size;
+#ifdef SUPPORT_UTF
+utf = (re->options & PCRE_UTF16) != 0;
+utf16_char = FALSE;
+#endif
+
+while(TRUE)
+  {
+  /* Swap previous characters. */
+  while (length-- > 0)
+    {
+    *ptr = swap_uint16(*ptr);
+    ptr++;
+    }
+#ifdef SUPPORT_UTF
+  if (utf16_char)
+    {
+    if (HAS_EXTRALEN(ptr[-1]))
+      {
+      /* We know that there is only one extra character in UTF-16. */
+      *ptr = swap_uint16(*ptr);
+      ptr++;
+      }
+    }
+  utf16_char = FALSE;
+#endif /* SUPPORT_UTF */
+
+  /* Get next opcode. */
+  length = 0;
+  *ptr = swap_uint16(*ptr);
+  switch (*ptr)
+    {
+    case OP_END:
+    return 0;
+
+#ifdef SUPPORT_UTF
+    case OP_CHAR:
+    case OP_CHARI:
+    case OP_NOT:
+    case OP_NOTI:
+    case OP_STAR:
+    case OP_MINSTAR:
+    case OP_PLUS:
+    case OP_MINPLUS:
+    case OP_QUERY:
+    case OP_MINQUERY:
+    case OP_UPTO:
+    case OP_MINUPTO:
+    case OP_EXACT:
+    case OP_POSSTAR:
+    case OP_POSPLUS:
+    case OP_POSQUERY:
+    case OP_POSUPTO:
+    case OP_STARI:
+    case OP_MINSTARI:
+    case OP_PLUSI:
+    case OP_MINPLUSI:
+    case OP_QUERYI:
+    case OP_MINQUERYI:
+    case OP_UPTOI:
+    case OP_MINUPTOI:
+    case OP_EXACTI:
+    case OP_POSSTARI:
+    case OP_POSPLUSI:
+    case OP_POSQUERYI:
+    case OP_POSUPTOI:
+    case OP_NOTSTAR:
+    case OP_NOTMINSTAR:
+    case OP_NOTPLUS:
+    case OP_NOTMINPLUS:
+    case OP_NOTQUERY:
+    case OP_NOTMINQUERY:
+    case OP_NOTUPTO:
+    case OP_NOTMINUPTO:
+    case OP_NOTEXACT:
+    case OP_NOTPOSSTAR:
+    case OP_NOTPOSPLUS:
+    case OP_NOTPOSQUERY:
+    case OP_NOTPOSUPTO:
+    case OP_NOTSTARI:
+    case OP_NOTMINSTARI:
+    case OP_NOTPLUSI:
+    case OP_NOTMINPLUSI:
+    case OP_NOTQUERYI:
+    case OP_NOTMINQUERYI:
+    case OP_NOTUPTOI:
+    case OP_NOTMINUPTOI:
+    case OP_NOTEXACTI:
+    case OP_NOTPOSSTARI:
+    case OP_NOTPOSPLUSI:
+    case OP_NOTPOSQUERYI:
+    case OP_NOTPOSUPTOI:
+    if (utf) utf16_char = TRUE;
+#endif
+    /* Fall through. */
+
+    default:
+    length = PRIV(OP_lengths)[*ptr] - 1;
+    break;
+
+    case OP_CLASS:
+    case OP_NCLASS:
+    /* Skip the character bit map. */
+    ptr += 32/sizeof(pcre_uchar);
+    length = 0;
+    break;
+
+    case OP_XCLASS:
+    /* Reverse the size of the XCLASS instance. */
+    ptr++;
+    *ptr = swap_uint16(*ptr);
+    if (LINK_SIZE > 1)
+      {
+      /* LINK_SIZE can be 1 or 2 in 16 bit mode. */
+      ptr++;
+      *ptr = swap_uint16(*ptr);
+      }
+    ptr++;
+    length = (GET(ptr, -LINK_SIZE)) - (1 + LINK_SIZE + 1);
+    *ptr = swap_uint16(*ptr);
+    if ((*ptr & XCL_MAP) != 0)
+      {
+      /* Skip the character bit map. */
+      ptr += 32/sizeof(pcre_uchar);
+      length -= 32/sizeof(pcre_uchar);
+      }
+    break;
+    }
+  ptr++;
+  }
+/* Control should never reach here in 16 bit mode. */
+#endif /* !COMPILE_PCRE8 */
+
+return 0;
+}
+
+/* End of pcre_byte_order.c */
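
The detection scheme used by pcre_byte_order.c above (compare against a magic number, then against its byte-reversed form, then swap each header field) can be sketched in isolation. The demo_header layout and the DEMO_* constants are assumptions made for illustration; the real real_pcre structure has more fields, and the 16-bit code additionally walks the compiled opcodes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_MAGIC          0x50435245u  /* illustrative value */
#define DEMO_REVERSED_MAGIC 0x45524350u  /* the same four bytes, reversed */

static uint32_t demo_swap32(uint32_t v)
{
  return ((v & 0x000000ffu) << 24) | ((v & 0x0000ff00u) << 8) |
         ((v & 0x00ff0000u) >> 8)  | (v >> 24);
}

static uint16_t demo_swap16(uint16_t v)
{
  /* The cast discards the bits that integer promotion shifts above bit 15. */
  return (uint16_t)((v >> 8) | (v << 8));
}

/* Hypothetical, much reduced header of a saved pattern. */
typedef struct {
  uint32_t magic;
  uint32_t size;
  uint16_t flags;
  uint16_t name_count;
} demo_header;

/* Returns 0 on success (swapping fields if needed), -1 for a bad magic. */
static int demo_to_host_order(demo_header *h)
{
  assert(h != NULL);
  if (h->magic == DEMO_MAGIC) return 0;           /* already host order */
  if (h->magic != DEMO_REVERSED_MAGIC) return -1; /* not a saved pattern */
  h->magic = DEMO_MAGIC;
  h->size = demo_swap32(h->size);
  h->flags = demo_swap16(h->flags);
  h->name_count = demo_swap16(h->name_count);
  return 0;
}
```

Because the magic number is not a palindrome at byte level, a single stored word is enough to tell "same order", "opposite order" and "not a pattern" apart.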

Modified: code/trunk/pcre_chartables.c.dist
===================================================================
--- code/trunk/pcre_chartables.c.dist    2011-12-28 16:10:09 UTC (rev 835)
+++ code/trunk/pcre_chartables.c.dist    2011-12-28 17:16:11 UTC (rev 836)
@@ -26,7 +26,7 @@

#include "pcre_internal.h"

-const unsigned char _pcre_default_tables[] = {
+const pcre_uint8 PRIV(default_tables)[] = {

/* This table is a lower casing table. */

Modified: code/trunk/pcre_compile.c
===================================================================
--- code/trunk/pcre_compile.c    2011-12-28 16:10:09 UTC (rev 835)
+++ code/trunk/pcre_compile.c    2011-12-28 17:16:11 UTC (rev 836)
@@ -6,7 +6,7 @@
 and semantics are as close as possible to those of the Perl 5 language.

                        Written by Philip Hazel
-           Copyright (c) 1997-2011 University of Cambridge
+           Copyright (c) 1997-2012 University of Cambridge

-----------------------------------------------------------------------------
Redistribution and use in source and binary forms, with or without
@@ -53,12 +53,16 @@
#include "pcre_internal.h"

-/* When PCRE_DEBUG is defined, we need the pcre_printint() function, which is
-also used by pcretest. PCRE_DEBUG is not defined when building a production
-library. */
+/* When PCRE_DEBUG is defined, we need the pcre(16)_printint() function, which
+is also used by pcretest. PCRE_DEBUG is not defined when building a production
+library. We do not need to select pcre16_printint.c specially, because the
+COMPILE_PCREx macro will already be appropriately set. */

#ifdef PCRE_DEBUG
-#include "pcre_printint.src"
+/* pcre_printint.c should not include any headers */
+#define PCRE_INCLUDED
+#include "pcre_printint.c"
+#undef PCRE_INCLUDED
#endif

@@ -88,16 +92,31 @@
The same workspace is used during the second, actual compile phase for
remembering forward references to groups so that they can be filled in at the
end. Each entry in this list occupies LINK_SIZE bytes, so even when LINK_SIZE
-is 4 there is plenty of room. */
+is 4 there is plenty of room for most patterns. However, the memory can get
+filled up by repetitions of forward references, for example patterns like
+/(?1){0,1999}(b)/, and one user did hit the limit. The code has been changed so
+that the workspace is expanded using malloc() in this situation. The value
+below is therefore a minimum, and we put a maximum on it for safety. The
+minimum is now also defined in terms of LINK_SIZE so that the use of malloc()
+kicks in at the same number of forward references in all cases. */

-#define COMPILE_WORK_SIZE (4096)
+#define COMPILE_WORK_SIZE (2048*LINK_SIZE)
+#define COMPILE_WORK_SIZE_MAX (100*COMPILE_WORK_SIZE)

/* The overrun tests check for a slightly smaller size so that they detect the
overrun before it actually does run off the end of the data block. */

-#define WORK_SIZE_CHECK (COMPILE_WORK_SIZE - 100)
+#define WORK_SIZE_SAFETY_MARGIN (100)

+/* Private flags added to firstchar and reqchar. */

+#define REQ_CASELESS   0x10000000l      /* Indicates caselessness */
+#define REQ_VARY       0x20000000l      /* Reqchar followed non-literal item */
+
+/* Repeated character flags. */
+
+#define UTF_LENGTH     0x10000000l      /* The char contains its length. */
+
 /* Table for handling escaped characters in the range '0'-'z'. Positive returns
 are simple data values; negative values are for special things like \d and so
 on. Zero means further processing is needed (for things like \x), or the escape
@@ -231,7 +250,7 @@
   STRING_graph0 STRING_print0 STRING_punct0 STRING_space0
   STRING_word0  STRING_xdigit;

-static const uschar posix_name_lengths[] = {
+static const pcre_uint8 posix_name_lengths[] = {
5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 4, 6, 0 };

/* Table of class bit maps for each POSIX class. Each class is formed from a
@@ -266,47 +285,101 @@
both positive and negative cases. NULL means no substitute. */

 #ifdef SUPPORT_UCP
-static const uschar *substitutes[] = {
-  (uschar *)"\\P{Nd}",    /* \D */
-  (uschar *)"\\p{Nd}",    /* \d */
-  (uschar *)"\\P{Xsp}",   /* \S */       /* NOTE: Xsp is Perl space */
-  (uschar *)"\\p{Xsp}",   /* \s */
-  (uschar *)"\\P{Xwd}",   /* \W */
-  (uschar *)"\\p{Xwd}"    /* \w */
+static const pcre_uchar string_PNd[]  = {
+  CHAR_BACKSLASH, CHAR_P, CHAR_LEFT_CURLY_BRACKET,
+  CHAR_N, CHAR_d, CHAR_RIGHT_CURLY_BRACKET, '\0' };
+static const pcre_uchar string_pNd[]  = {
+  CHAR_BACKSLASH, CHAR_p, CHAR_LEFT_CURLY_BRACKET,
+  CHAR_N, CHAR_d, CHAR_RIGHT_CURLY_BRACKET, '\0' };
+static const pcre_uchar string_PXsp[] = {
+  CHAR_BACKSLASH, CHAR_P, CHAR_LEFT_CURLY_BRACKET,
+  CHAR_X, CHAR_s, CHAR_p, CHAR_RIGHT_CURLY_BRACKET, '\0' };
+static const pcre_uchar string_pXsp[] = {
+  CHAR_BACKSLASH, CHAR_p, CHAR_LEFT_CURLY_BRACKET,
+  CHAR_X, CHAR_s, CHAR_p, CHAR_RIGHT_CURLY_BRACKET, '\0' };
+static const pcre_uchar string_PXwd[] = {
+  CHAR_BACKSLASH, CHAR_P, CHAR_LEFT_CURLY_BRACKET,
+  CHAR_X, CHAR_w, CHAR_d, CHAR_RIGHT_CURLY_BRACKET, '\0' };
+static const pcre_uchar string_pXwd[] = {
+  CHAR_BACKSLASH, CHAR_p, CHAR_LEFT_CURLY_BRACKET,
+  CHAR_X, CHAR_w, CHAR_d, CHAR_RIGHT_CURLY_BRACKET, '\0' };
+
+static const pcre_uchar *substitutes[] = {
+  string_PNd,           /* \D */
+  string_pNd,           /* \d */
+  string_PXsp,          /* \S */       /* NOTE: Xsp is Perl space */
+  string_pXsp,          /* \s */
+  string_PXwd,          /* \W */
+  string_pXwd           /* \w */
 };

-static const uschar *posix_substitutes[] = {
-  (uschar *)"\\p{L}",     /* alpha */
-  (uschar *)"\\p{Ll}",    /* lower */
-  (uschar *)"\\p{Lu}",    /* upper */
-  (uschar *)"\\p{Xan}",   /* alnum */
-  NULL,                   /* ascii */
-  (uschar *)"\\h",        /* blank */
-  NULL,                   /* cntrl */
-  (uschar *)"\\p{Nd}",    /* digit */
-  NULL,                   /* graph */
-  NULL,                   /* print */
-  NULL,                   /* punct */
-  (uschar *)"\\p{Xps}",   /* space */    /* NOTE: Xps is POSIX space */
-  (uschar *)"\\p{Xwd}",   /* word */
-  NULL,                   /* xdigit */
+static const pcre_uchar string_pL[] =   {
+  CHAR_BACKSLASH, CHAR_p, CHAR_LEFT_CURLY_BRACKET,
+  CHAR_L, CHAR_RIGHT_CURLY_BRACKET, '\0' };
+static const pcre_uchar string_pLl[] =  {
+  CHAR_BACKSLASH, CHAR_p, CHAR_LEFT_CURLY_BRACKET,
+  CHAR_L, CHAR_l, CHAR_RIGHT_CURLY_BRACKET, '\0' };
+static const pcre_uchar string_pLu[] =  {
+  CHAR_BACKSLASH, CHAR_p, CHAR_LEFT_CURLY_BRACKET,
+  CHAR_L, CHAR_u, CHAR_RIGHT_CURLY_BRACKET, '\0' };
+static const pcre_uchar string_pXan[] = {
+  CHAR_BACKSLASH, CHAR_p, CHAR_LEFT_CURLY_BRACKET,
+  CHAR_X, CHAR_a, CHAR_n, CHAR_RIGHT_CURLY_BRACKET, '\0' };
+static const pcre_uchar string_h[] =    {
+  CHAR_BACKSLASH, CHAR_h, '\0' };
+static const pcre_uchar string_pXps[] = {
+  CHAR_BACKSLASH, CHAR_p, CHAR_LEFT_CURLY_BRACKET,
+  CHAR_X, CHAR_p, CHAR_s, CHAR_RIGHT_CURLY_BRACKET, '\0' };
+static const pcre_uchar string_PL[] =   {
+  CHAR_BACKSLASH, CHAR_P, CHAR_LEFT_CURLY_BRACKET,
+  CHAR_L, CHAR_RIGHT_CURLY_BRACKET, '\0' };
+static const pcre_uchar string_PLl[] =  {
+  CHAR_BACKSLASH, CHAR_P, CHAR_LEFT_CURLY_BRACKET,
+  CHAR_L, CHAR_l, CHAR_RIGHT_CURLY_BRACKET, '\0' };
+static const pcre_uchar string_PLu[] =  {
+  CHAR_BACKSLASH, CHAR_P, CHAR_LEFT_CURLY_BRACKET,
+  CHAR_L, CHAR_u, CHAR_RIGHT_CURLY_BRACKET, '\0' };
+static const pcre_uchar string_PXan[] = {
+  CHAR_BACKSLASH, CHAR_P, CHAR_LEFT_CURLY_BRACKET,
+  CHAR_X, CHAR_a, CHAR_n, CHAR_RIGHT_CURLY_BRACKET, '\0' };
+static const pcre_uchar string_H[] =    {
+  CHAR_BACKSLASH, CHAR_H, '\0' };
+static const pcre_uchar string_PXps[] = {
+  CHAR_BACKSLASH, CHAR_P, CHAR_LEFT_CURLY_BRACKET,
+  CHAR_X, CHAR_p, CHAR_s, CHAR_RIGHT_CURLY_BRACKET, '\0' };
+
+static const pcre_uchar *posix_substitutes[] = {
+  string_pL,            /* alpha */
+  string_pLl,           /* lower */
+  string_pLu,           /* upper */
+  string_pXan,          /* alnum */
+  NULL,                 /* ascii */
+  string_h,             /* blank */
+  NULL,                 /* cntrl */
+  string_pNd,           /* digit */
+  NULL,                 /* graph */
+  NULL,                 /* print */
+  NULL,                 /* punct */
+  string_pXps,          /* space */    /* NOTE: Xps is POSIX space */
+  string_pXwd,          /* word */
+  NULL,                 /* xdigit */
   /* Negated cases */
-  (uschar *)"\\P{L}",     /* ^alpha */
-  (uschar *)"\\P{Ll}",    /* ^lower */
-  (uschar *)"\\P{Lu}",    /* ^upper */
-  (uschar *)"\\P{Xan}",   /* ^alnum */
-  NULL,                   /* ^ascii */
-  (uschar *)"\\H",        /* ^blank */
-  NULL,                   /* ^cntrl */
-  (uschar *)"\\P{Nd}",    /* ^digit */
-  NULL,                   /* ^graph */
-  NULL,                   /* ^print */
-  NULL,                   /* ^punct */
-  (uschar *)"\\P{Xps}",   /* ^space */   /* NOTE: Xps is POSIX space */
-  (uschar *)"\\P{Xwd}",   /* ^word */
-  NULL                    /* ^xdigit */
+  string_PL,            /* ^alpha */
+  string_PLl,           /* ^lower */
+  string_PLu,           /* ^upper */
+  string_PXan,          /* ^alnum */
+  NULL,                 /* ^ascii */
+  string_H,             /* ^blank */
+  NULL,                 /* ^cntrl */
+  string_PNd,           /* ^digit */
+  NULL,                 /* ^graph */
+  NULL,                 /* ^print */
+  NULL,                 /* ^punct */
+  string_PXps,          /* ^space */   /* NOTE: Xps is POSIX space */
+  string_PXwd,          /* ^word */
+  NULL                  /* ^xdigit */
 };
-#define POSIX_SUBSIZE (sizeof(posix_substitutes)/sizeof(uschar *))
+#define POSIX_SUBSIZE (sizeof(posix_substitutes) / sizeof(pcre_uchar *))
 #endif

#define STRING(a) # a
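
The hunk above replaces (uschar *)"..." literals with explicit pcre_uchar arrays. A plausible reason, shown in the sketch below, is that a narrow string literal always has 8-bit elements, so it cannot stand in for a string of 16-bit units. The demo_uchar type and helper are hypothetical, and plain character constants are used here where the original uses the CHAR_* macros.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint16_t demo_uchar;   /* stand-in for a 16-bit pcre_uchar */

/* "\P{Nd}" spelled out unit by unit so that every element is 16 bits wide. */
static const demo_uchar demo_string_PNd[] = {
  '\\', 'P', '{', 'N', 'd', '}', '\0' };

/* Count code units up to, but not including, the terminating zero. */
static size_t demo_strlen_uc(const demo_uchar *s)
{
  size_t n = 0;
  assert(s != NULL);
  while (s[n] != 0) n++;
  return n;
}
```

Casting a char literal to a 16-bit pointer type would instead read two bytes per unit and yield the wrong characters, which is why the arrays are written out element by element.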
@@ -412,6 +485,9 @@
"\\k is not followed by a braced, angle-bracketed, or quoted name\0"
/* 70 */
"internal error: unknown opcode in find_fixedlength()\0"
+ "\\N is not supported in a class\0"
+ "too many forward references\0"
+ "disallowed UTF-8/16 code point (>= 0xd800 && <= 0xdfff)\0"
;

/* Table to identify digits and hex digits. This is used when compiling
@@ -430,12 +506,18 @@

Then we can use ctype_digit and ctype_xdigit in the code. */

+/* Using a simple comparison for decimal numbers rather than a memory read
+is much faster, and the resulting code is simpler (the compiler turns it
+into a subtraction and unsigned comparison). */
+
+#define IS_DIGIT(x) ((x) >= CHAR_0 && (x) <= CHAR_9)
+
#ifndef EBCDIC

/* This is the "normal" case, for ASCII systems, and EBCDIC systems running in
UTF-8 mode. */

-static const unsigned char digitab[] =
+static const pcre_uint8 digitab[] =
{
0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00, /* 0- 7 */
0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00, /* 8- 15 */
@@ -474,7 +556,7 @@

/* This is the "abnormal" case, for EBCDIC systems not running in UTF-8 mode. */

-static const unsigned char digitab[] =
+static const pcre_uint8 digitab[] =
   {
   0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00, /*   0-  7  0 */
   0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00, /*   8- 15    */
@@ -509,7 +591,7 @@
   0x0c,0x0c,0x0c,0x0c,0x0c,0x0c,0x0c,0x0c, /*  0 - 7  F0 */
   0x0c,0x0c,0x00,0x00,0x00,0x00,0x00,0x00};/*  8 -255    */

-static const unsigned char ebcdic_chartab[] = { /* chartable partial dup */
+static const pcre_uint8 ebcdic_chartab[] = { /* chartable partial dup */
0x80,0x00,0x00,0x00,0x00,0x01,0x00,0x00, /* 0- 7 */
0x00,0x00,0x00,0x00,0x01,0x01,0x00,0x00, /* 8- 15 */
0x00,0x00,0x00,0x00,0x00,0x01,0x00,0x00, /* 16- 23 */
@@ -548,7 +630,7 @@
/* Definition to allow mutual recursion */

 static BOOL
-  compile_regex(int, uschar **, const uschar **, int *, BOOL, BOOL, int, int,
+  compile_regex(int, pcre_uchar **, const pcre_uchar **, int *, BOOL, BOOL, int, int,
     int *, int *, branch_chain *, compile_data *, int *);

@@ -580,6 +662,43 @@

 /*************************************************
+*           Expand the workspace                 *
+*************************************************/
+
+/* This function is called during the second compiling phase, if the number of
+forward references fills the existing workspace, which is originally a block on
+the stack. A larger block is obtained from malloc() unless the ultimate limit
+has been reached or the increase will be rather small.
+
+Argument: pointer to the compile data block
+Returns:  0 if all went well, else an error number
+*/
+
+static int
+expand_workspace(compile_data *cd)
+{
+pcre_uchar *newspace;
+int newsize = cd->workspace_size * 2;
+
+if (newsize > COMPILE_WORK_SIZE_MAX) newsize = COMPILE_WORK_SIZE_MAX;
+if (cd->workspace_size >= COMPILE_WORK_SIZE_MAX ||
+    newsize - cd->workspace_size < WORK_SIZE_SAFETY_MARGIN)
+ return ERR72;
+
+newspace = (PUBL(malloc))(IN_UCHARS(newsize));
+if (newspace == NULL) return ERR21;
+memcpy(newspace, cd->start_workspace, cd->workspace_size * sizeof(pcre_uchar));
+cd->hwm = (pcre_uchar *)newspace + (cd->hwm - cd->start_workspace);
+if (cd->workspace_size > COMPILE_WORK_SIZE)
+  (PUBL(free))((void *)cd->start_workspace);
+cd->start_workspace = newspace;
+cd->workspace_size = newsize;
+return 0;
+}
+
+
+
+/*************************************************
 *            Check for counted repeat            *
 *************************************************/
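
The growth policy of the expand_workspace() function added in this hunk (double the size, clamp to a maximum, and free the old block only if it came from malloc()) can be exercised with a small sketch. The demo_workspace structure, the tiny sizes and the return codes 1 and 2 are assumptions for illustration; they stand in for compile_data, COMPILE_WORK_SIZE, ERR72 and ERR21.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define DEMO_WORK_SIZE      16                    /* initial, caller-owned */
#define DEMO_WORK_SIZE_MAX  (4 * DEMO_WORK_SIZE)  /* hard upper limit */
#define DEMO_SAFETY_MARGIN  4

typedef struct {
  unsigned char *start;   /* current workspace */
  unsigned char *hwm;     /* high water mark inside the workspace */
  int size;               /* current size in bytes */
} demo_workspace;

/* Returns 0 on success, 1 if the limit is reached, 2 if malloc() fails. */
static int demo_expand(demo_workspace *w)
{
  unsigned char *newspace;
  int newsize;

  assert(w != NULL && w->start != NULL);
  newsize = w->size * 2;
  if (newsize > DEMO_WORK_SIZE_MAX) newsize = DEMO_WORK_SIZE_MAX;
  if (w->size >= DEMO_WORK_SIZE_MAX ||
      newsize - w->size < DEMO_SAFETY_MARGIN)
    return 1;

  newspace = malloc((size_t)newsize);
  if (newspace == NULL) return 2;
  memcpy(newspace, w->start, (size_t)w->size);
  w->hwm = newspace + (w->hwm - w->start);   /* keep the same offset */

  /* Only a block obtained from malloc() may be freed. The first block is
     owned by the caller (on the stack in the original code). */
  if (w->size > DEMO_WORK_SIZE) free(w->start);
  w->start = newspace;
  w->size = newsize;
  return 0;
}
```

The caller remains responsible for freeing the final block when its size exceeds the initial size.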

@@ -595,17 +714,19 @@
*/

static BOOL
-is_counted_repeat(const uschar *p)
+is_counted_repeat(const pcre_uchar *p)
{
-if ((digitab[*p++] & ctype_digit) == 0) return FALSE;
-while ((digitab[*p] & ctype_digit) != 0) p++;
+if (!IS_DIGIT(*p)) return FALSE;
+p++;
+while (IS_DIGIT(*p)) p++;
if (*p == CHAR_RIGHT_CURLY_BRACKET) return TRUE;

if (*p++ != CHAR_COMMA) return FALSE;
if (*p == CHAR_RIGHT_CURLY_BRACKET) return TRUE;

-if ((digitab[*p++] & ctype_digit) == 0) return FALSE;
-while ((digitab[*p] & ctype_digit) != 0) p++;
+if (!IS_DIGIT(*p)) return FALSE;
+p++;
+while (IS_DIGIT(*p)) p++;

return (*p == CHAR_RIGHT_CURLY_BRACKET);
}
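
The rewritten is_counted_repeat() above can be tried on 16-bit input with the sketch below. DEMO_IS_DIGIT mirrors the IS_DIGIT macro added in this revision; the function name and the use of plain character constants in place of the CHAR_* macros are assumptions. Note that the macro evaluates its argument twice, which is presumably why the increment was moved out of the test and onto its own line.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* A range comparison needs no table, so any 16-bit value is safe to test. */
#define DEMO_IS_DIGIT(x) ((x) >= '0' && (x) <= '9')

/* p points just past the '{'. Accepts "n}", "n,}" and "n,m}". */
static int demo_is_counted_repeat(const uint16_t *p)
{
  assert(p != NULL);
  if (!DEMO_IS_DIGIT(*p)) return 0;
  p++;                               /* not *p++ inside the macro */
  while (DEMO_IS_DIGIT(*p)) p++;
  if (*p == '}') return 1;

  if (*p++ != ',') return 0;
  if (*p == '}') return 1;

  if (!DEMO_IS_DIGIT(*p)) return 0;
  p++;
  while (DEMO_IS_DIGIT(*p)) p++;

  return *p == '}';
}
```

A value such as 0x0663 (ARABIC-INDIC DIGIT THREE) is simply rejected here, whereas indexing a 256-entry table with it would read out of bounds.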
@@ -637,12 +758,14 @@
*/

static int
-check_escape(const uschar **ptrptr, int *errorcodeptr, int bracount,
+check_escape(const pcre_uchar **ptrptr, int *errorcodeptr, int bracount,
int options, BOOL isclass)
{
-BOOL utf8 = (options & PCRE_UTF8) != 0;
-const uschar *ptr = *ptrptr + 1;
-int c, i;
+/* PCRE_UTF16 has the same value as PCRE_UTF8. */
+BOOL utf = (options & PCRE_UTF8) != 0;
+const pcre_uchar *ptr = *ptrptr + 1;
+pcre_int32 c;
+int i;

 GETCHARINCTEST(c, ptr);           /* Get character value, increment pointer */
 ptr--;                            /* Set pointer back to the last byte */
@@ -656,11 +779,13 @@
 Otherwise further processing may be required. */

 #ifndef EBCDIC  /* ASCII/UTF-8 coding */
-else if (c < CHAR_0 || c > CHAR_z) {}                     /* Not alphanumeric */
+/* Not alphanumeric */
+else if (c < CHAR_0 || c > CHAR_z) {}
 else if ((i = escapes[c - CHAR_0]) != 0) c = i;

 #else           /* EBCDIC coding */
-else if (c < 'a' || (ebcdic_chartab[c] & 0x0E) == 0) {}   /* Not alphanumeric */
+/* Not alphanumeric */
+else if (c < 'a' || (!MAX_255(c) || (ebcdic_chartab[c] & 0x0E) == 0)) {}
 else if ((i = escapes[c - 0x48]) != 0)  c = i;
 #endif
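
The MAX_255() guard that this revision places in front of lookups in 256-entry tables such as ebcdic_chartab and digitab can be illustrated as follows. The macro body, the flag value and the table contents are assumptions for this sketch, not copies of the PCRE definitions.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_MAX_255(c)   ((c) <= 255u)   /* assumed shape of MAX_255 */
#define DEMO_CTYPE_XDIGIT 0x08            /* hypothetical flag bit */

/* 256-entry table; only the hex digits are flagged, the rest default to 0. */
static const uint8_t demo_tab[256] = {
  ['0'] = 0x08, ['1'] = 0x08, ['2'] = 0x08, ['3'] = 0x08, ['4'] = 0x08,
  ['5'] = 0x08, ['6'] = 0x08, ['7'] = 0x08, ['8'] = 0x08, ['9'] = 0x08,
  ['a'] = 0x08, ['b'] = 0x08, ['c'] = 0x08, ['d'] = 0x08, ['e'] = 0x08,
  ['f'] = 0x08,
  ['A'] = 0x08, ['B'] = 0x08, ['C'] = 0x08, ['D'] = 0x08, ['E'] = 0x08,
  ['F'] = 0x08
};

static int demo_is_xdigit(uint16_t c)
{
  if (!DEMO_MAX_255(c)) return 0;    /* outside the table: never a digit */
  assert(c < sizeof demo_tab);       /* the index is now inside the table */
  return (demo_tab[c] & DEMO_CTYPE_XDIGIT) != 0;
}
```

Without the guard, a unit such as 0xff41 would either index past the end of the table or, if truncated to 8 bits, be mistaken for 'A'.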

@@ -668,7 +793,7 @@

else
{
- const uschar *oldptr;
+ const pcre_uchar *oldptr;
BOOL braced, negated;

   switch (c)
@@ -686,8 +811,10 @@
       {
       /* In JavaScript, \u must be followed by four hexadecimal numbers.
       Otherwise it is a lowercase u letter. */
-      if ((digitab[ptr[1]] & ctype_xdigit) != 0 && (digitab[ptr[2]] & ctype_xdigit) != 0
-           && (digitab[ptr[3]] & ctype_xdigit) != 0 && (digitab[ptr[4]] & ctype_xdigit) != 0)
+      if (MAX_255(ptr[1]) && (digitab[ptr[1]] & ctype_xdigit) != 0
+        && MAX_255(ptr[2]) && (digitab[ptr[2]] & ctype_xdigit) != 0
+        && MAX_255(ptr[3]) && (digitab[ptr[3]] & ctype_xdigit) != 0
+        && MAX_255(ptr[4]) && (digitab[ptr[4]] & ctype_xdigit) != 0)
         {
         c = 0;
         for (i = 0; i < 4; ++i)
@@ -741,9 +868,9 @@

     if (ptr[1] == CHAR_LEFT_CURLY_BRACKET)
       {
-      const uschar *p;
+      const pcre_uchar *p;
       for (p = ptr+2; *p != 0 && *p != CHAR_RIGHT_CURLY_BRACKET; p++)
-        if (*p != CHAR_MINUS && (digitab[*p] & ctype_digit) == 0) break;
+        if (*p != CHAR_MINUS && !IS_DIGIT(*p)) break;
       if (*p != 0 && *p != CHAR_RIGHT_CURLY_BRACKET)
         {
         c = -ESC_k;
@@ -761,12 +888,21 @@
       }
     else negated = FALSE;

+    /* The integer range is limited by the machine's int representation. */
     c = 0;
-    while ((digitab[ptr[1]] & ctype_digit) != 0)
+    while (IS_DIGIT(ptr[1]))
+      {
+      if (((unsigned int)c) > INT_MAX / 10) /* Integer overflow */
+        {
+        c = -1;
+        break;
+        }
       c = c * 10 + *(++ptr) - CHAR_0;
-
-    if (c < 0)   /* Integer overflow */
+      }
+    if (((unsigned int)c) > INT_MAX) /* Integer overflow */
       {
+      while (IS_DIGIT(ptr[1]))
+        ptr++;
       *errorcodeptr = ERR61;
       break;
       }
@@ -814,11 +950,21 @@
     if (!isclass)
       {
       oldptr = ptr;
+      /* The integer range is limited by the machine's int representation. */
       c -= CHAR_0;
-      while ((digitab[ptr[1]] & ctype_digit) != 0)
+      while (IS_DIGIT(ptr[1]))
+        {
+        if (((unsigned int)c) > INT_MAX / 10) /* Integer overflow */
+          {
+          c = -1;
+          break;
+          }
         c = c * 10 + *(++ptr) - CHAR_0;
-      if (c < 0)    /* Integer overflow */
+        }
+      if (((unsigned int)c) > INT_MAX) /* Integer overflow */
         {
+        while (IS_DIGIT(ptr[1]))
+          ptr++;
         *errorcodeptr = ERR61;
         break;
         }
@@ -851,19 +997,20 @@
     c -= CHAR_0;
     while(i++ < 2 && ptr[1] >= CHAR_0 && ptr[1] <= CHAR_7)
         c = c * 8 + *(++ptr) - CHAR_0;
-    if (!utf8 && c > 255) *errorcodeptr = ERR51;
+    if (!utf && c > 0xff) *errorcodeptr = ERR51;
     break;

     /* \x is complicated. \x{ddd} is a character number which can be greater
-    than 0xff in utf8 mode, but only if the ddd are hex digits. If not, { is
-    treated as a data character. */
+    than 0xff in utf or non-8bit mode, but only if the ddd are hex digits.
+    If not, { is treated as a data character. */

     case CHAR_x:
     if ((options & PCRE_JAVASCRIPT_COMPAT) != 0)
       {
       /* In JavaScript, \x must be followed by two hexadecimal numbers.
       Otherwise it is a lowercase x letter. */
-      if ((digitab[ptr[1]] & ctype_xdigit) != 0 && (digitab[ptr[2]] & ctype_xdigit) != 0)
+      if (MAX_255(ptr[1]) && (digitab[ptr[1]] & ctype_xdigit) != 0
+        && MAX_255(ptr[2]) && (digitab[ptr[2]] & ctype_xdigit) != 0)
         {
         c = 0;
         for (i = 0; i < 2; ++i)
@@ -883,15 +1030,13 @@

     if (ptr[1] == CHAR_LEFT_CURLY_BRACKET)
       {
-      const uschar *pt = ptr + 2;
-      int count = 0;
+      const pcre_uchar *pt = ptr + 2;

       c = 0;
-      while ((digitab[*pt] & ctype_xdigit) != 0)
+      while (MAX_255(*pt) && (digitab[*pt] & ctype_xdigit) != 0)
         {
         register int cc = *pt++;
         if (c == 0 && cc == CHAR_0) continue;     /* Leading zeroes */
-        count++;

 #ifndef EBCDIC  /* ASCII/UTF-8 coding */
         if (cc >= CHAR_a) cc -= 32;               /* Convert to upper case */
@@ -900,11 +1045,25 @@
         if (cc >= CHAR_a && cc <= CHAR_z) cc += 64;  /* Convert to upper case */
         c = (c << 4) + cc - ((cc >= CHAR_0)? CHAR_0 : (CHAR_A - 10));
 #endif
+
+#ifdef COMPILE_PCRE8
+        if (c > (utf ? 0x10ffff : 0xff)) { c = -1; break; }
+#else
+#ifdef COMPILE_PCRE16
+        if (c > (utf ? 0x10ffff : 0xffff)) { c = -1; break; }
+#endif
+#endif
         }

+      if (c < 0)
+        {
+        while (MAX_255(*pt) && (digitab[*pt] & ctype_xdigit) != 0) pt++;
+        *errorcodeptr = ERR34;
+        }
+
       if (*pt == CHAR_RIGHT_CURLY_BRACKET)
         {
-        if (c < 0 || count > (utf8? 8 : 2)) *errorcodeptr = ERR34;
+        if (utf && c >= 0xd800 && c <= 0xdfff) *errorcodeptr = ERR73;
         ptr = pt;
         break;
         }
@@ -916,7 +1075,7 @@
     /* Read just a single-byte hex-defined char */

     c = 0;
-    while (i++ < 2 && (digitab[ptr[1]] & ctype_xdigit) != 0)
+    while (i++ < 2 && MAX_255(ptr[1]) && (digitab[ptr[1]] & ctype_xdigit) != 0)
       {
       int cc;                                  /* Some compilers don't like */
       cc = *(++ptr);                           /* ++ in initializers */
@@ -1014,11 +1173,11 @@
 */

 static int
-get_ucp(const uschar **ptrptr, BOOL *negptr, int *dptr, int *errorcodeptr)
+get_ucp(const pcre_uchar **ptrptr, BOOL *negptr, int *dptr, int *errorcodeptr)
 {
 int c, i, bot, top;
-const uschar *ptr = *ptrptr;
-char name[32];
+const pcre_uchar *ptr = *ptrptr;
+pcre_uchar name[32];

 c = *(++ptr);
 if (c == 0) goto ERROR_RETURN;
@@ -1035,7 +1194,7 @@
     *negptr = TRUE;
     ptr++;
     }
-  for (i = 0; i < (int)sizeof(name) - 1; i++)
+  for (i = 0; i < (int)(sizeof(name) / sizeof(pcre_uchar)) - 1; i++)
     {
     c = *(++ptr);
     if (c == 0) goto ERROR_RETURN;
@@ -1059,16 +1218,16 @@
 /* Search for a recognized property name using binary chop */

 bot = 0;
-top = _pcre_utt_size;
+top = PRIV(utt_size);

 while (bot < top)
   {
   i = (bot + top) >> 1;
-  c = strcmp(name, _pcre_utt_names + _pcre_utt[i].name_offset);
+  c = STRCMP_UC_C8(name, PRIV(utt_names) + PRIV(utt)[i].name_offset);
   if (c == 0)
     {
-    *dptr = _pcre_utt[i].value;
-    return _pcre_utt[i].type;
+    *dptr = PRIV(utt)[i].value;
+    return PRIV(utt)[i].type;
     }
   if (c > 0) bot = i + 1; else top = i;
   }
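The binary chop in this hunk now compares a name held in pcre_uchar units against a table of ordinary 8-bit strings, which is why strcmp() gives way to STRCMP_UC_C8(). A minimal standalone sketch of that combination follows. The table contents, cmp_uc_c8 and find_name are hypothetical and only illustrate the technique. How the real comparison is implemented is not shown in this part of the diff.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical name table, sorted in strcmp() order. */
static const char *const sketch_names[] = {
  "Any", "Greek", "L", "Latin", "Lu", "N"
};
#define SKETCH_COUNT ((int)(sizeof(sketch_names) / sizeof(sketch_names[0])))

/* Compare a zero-terminated 16-bit string with an 8-bit C string, unit by
   unit. Returns <0, 0 or >0 like strcmp(). */
static int cmp_uc_c8(const uint16_t *a, const char *b)
{
for (;; a++, b++)
  {
  unsigned int ca = *a;
  unsigned int cb = (unsigned char)*b;
  if (ca != cb) return (ca < cb) ? -1 : 1;
  if (ca == 0) return 0;
  }
}

/* Binary chop over the sorted table; returns the index or -1. */
static int find_name(const uint16_t *name)
{
int bot = 0;
int top = SKETCH_COUNT;

assert(name != NULL);
while (bot < top)
  {
  int i = (bot + top) >> 1;
  int c = cmp_uc_c8(name, sketch_names[i]);
  if (c == 0) return i;
  if (c > 0) bot = i + 1; else top = i;
  }
return -1;
}
```

The loop shape is the same as in the hunk: only the comparison changes when the subject string is no longer made of bytes.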
@@ -1106,8 +1265,8 @@
                  current ptr on error, with errorcodeptr set non-zero
 */

-static const uschar *
-read_repeat_counts(const uschar *p, int *minp, int *maxp, int *errorcodeptr)
+static const pcre_uchar *
+read_repeat_counts(const pcre_uchar *p, int *minp, int *maxp, int *errorcodeptr)
 {
 int min = 0;
 int max = -1;
@@ -1115,7 +1274,7 @@
 /* Read the minimum value and do a paranoid check: a negative value indicates
 an integer overflow. */

-while ((digitab[*p] & ctype_digit) != 0) min = min * 10 + *p++ - CHAR_0;
+while (IS_DIGIT(*p)) min = min * 10 + *p++ - CHAR_0;
 if (min < 0 || min > 65535)
   {
   *errorcodeptr = ERR5;
@@ -1130,7 +1289,7 @@
   if (*(++p) != CHAR_RIGHT_CURLY_BRACKET)
     {
     max = 0;
-    while((digitab[*p] & ctype_digit) != 0) max = max * 10 + *p++ - CHAR_0;
+    while(IS_DIGIT(*p)) max = max * 10 + *p++ - CHAR_0;
     if (max < 0 || max > 65535)
       {
       *errorcodeptr = ERR5;
@@ -1185,17 +1344,17 @@
   name         name to seek, or NULL if seeking a numbered subpattern
   lorn         name length, or subpattern number if name is NULL
   xmode        TRUE if we are in /x mode
-  utf8         TRUE if we are in UTF-8 mode
+  utf          TRUE if we are in UTF-8 / UTF-16 mode
   count        pointer to the current capturing subpattern number (updated)

 Returns:       the number of the named subpattern, or -1 if not found
 */

 static int
-find_parens_sub(uschar **ptrptr, compile_data *cd, const uschar *name, int lorn,
-  BOOL xmode, BOOL utf8, int *count)
+find_parens_sub(pcre_uchar **ptrptr, compile_data *cd, const pcre_uchar *name, int lorn,
+  BOOL xmode, BOOL utf, int *count)
 {
-uschar *ptr = *ptrptr;
+pcre_uchar *ptr = *ptrptr;
 int start_count = *count;
 int hwm_count = start_count;
 BOOL dup_parens = FALSE;
@@ -1262,7 +1421,7 @@
         ptr[1] != CHAR_EQUALS_SIGN) || *ptr == CHAR_APOSTROPHE)
       {
       int term;
-      const uschar *thisname;
+      const pcre_uchar *thisname;
       *count += 1;
       if (name == NULL && *count == lorn) return *count;
       term = *ptr++;
@@ -1270,7 +1429,7 @@
       thisname = ptr;
       while (*ptr != term) ptr++;
       if (name != NULL && lorn == ptr - thisname &&
-          strncmp((const char *)name, (const char *)thisname, lorn) == 0)
+          STRNCMP_UC_UC(name, thisname, lorn) == 0)
         return *count;
       term++;
       }
@@ -1313,7 +1472,7 @@
         {
         if (ptr[2] == CHAR_E)
           ptr+= 2;
-        else if (strncmp((const char *)ptr+2,
+        else if (STRNCMP_UC_C8(ptr + 2,
                  STR_Q STR_BACKSLASH STR_E, 3) == 0)
           ptr += 4;
         else
@@ -1361,8 +1520,8 @@
       {
       if (IS_NEWLINE(ptr)) { ptr += cd->nllen - 1; break; }
       ptr++;
-#ifdef SUPPORT_UTF8
-      if (utf8) while ((*ptr & 0xc0) == 0x80) ptr++;
+#ifdef SUPPORT_UTF
+      if (utf) FORWARDCHAR(ptr);
 #endif
       }
     if (*ptr == 0) goto FAIL_EXIT;
@@ -1373,7 +1532,7 @@

   if (*ptr == CHAR_LEFT_PARENTHESIS)
     {
-    int rc = find_parens_sub(&ptr, cd, name, lorn, xmode, utf8, count);
+    int rc = find_parens_sub(&ptr, cd, name, lorn, xmode, utf, count);
     if (rc > 0) return rc;
     if (*ptr == 0) goto FAIL_EXIT;
     }
@@ -1419,16 +1578,16 @@
   name         name to seek, or NULL if seeking a numbered subpattern
   lorn         name length, or subpattern number if name is NULL
   xmode        TRUE if we are in /x mode
-  utf8         TRUE if we are in UTF-8 mode
+  utf          TRUE if we are in UTF-8 / UTF-16 mode

 Returns:       the number of the found subpattern, or -1 if not found
 */

 static int
-find_parens(compile_data *cd, const uschar *name, int lorn, BOOL xmode,
-  BOOL utf8)
+find_parens(compile_data *cd, const pcre_uchar *name, int lorn, BOOL xmode,
+  BOOL utf)
 {
-uschar *ptr = (uschar *)cd->start_pattern;
+pcre_uchar *ptr = (pcre_uchar *)cd->start_pattern;
 int count = 0;
 int rc;

@@ -1439,7 +1598,7 @@

 for (;;)
   {
-  rc = find_parens_sub(&ptr, cd, name, lorn, xmode, utf8, &count);
+  rc = find_parens_sub(&ptr, cd, name, lorn, xmode, utf, &count);
   if (rc > 0 || *ptr++ == 0) break;
   }

@@ -1466,8 +1625,8 @@
 Returns:       pointer to the first significant opcode
 */

-static const uschar*
-first_significant_code(const uschar *code, BOOL skipassert)
+static const pcre_uchar*
+first_significant_code(const pcre_uchar *code, BOOL skipassert)
 {
 for (;;)
   {
@@ -1478,7 +1637,7 @@
     case OP_ASSERTBACK_NOT:
     if (!skipassert) return code;
     do code += GET(code, 1); while (*code == OP_ALT);
-    code += _pcre_OP_lengths[*code];
+    code += PRIV(OP_lengths)[*code];
     break;

     case OP_WORD_BOUNDARY:
@@ -1492,7 +1651,7 @@
     case OP_RREF:
     case OP_NRREF:
     case OP_DEF:
-    code += _pcre_OP_lengths[*code];
+    code += PRIV(OP_lengths)[*code];
     break;

     default:
@@ -1522,7 +1681,7 @@

 Arguments:
   code     points to the start of the pattern (the bracket)
-  utf8     TRUE in UTF-8 mode
+  utf      TRUE in UTF-8 / UTF-16 mode
   atend    TRUE if called when the pattern is complete
   cd       the "compile data" structure

@@ -1534,12 +1693,12 @@
 */

 static int
-find_fixedlength(uschar *code, BOOL utf8, BOOL atend, compile_data *cd)
+find_fixedlength(pcre_uchar *code, BOOL utf, BOOL atend, compile_data *cd)
 {
 int length = -1;

 register int branchlength = 0;
-register uschar *cc = code + 1 + LINK_SIZE;
+register pcre_uchar *cc = code + 1 + LINK_SIZE;

 /* Scan along the opcodes for this branch. If we get to the end of the
 branch, check the length against that of the other branches. */
@@ -1547,8 +1706,9 @@
 for (;;)
   {
   int d;
-  uschar *ce, *cs;
+  pcre_uchar *ce, *cs;
   register int op = *cc;
+  
   switch (op)
     {
     /* We only need to continue for OP_CBRA (normal capturing bracket) and
@@ -1561,7 +1721,7 @@
     case OP_ONCE:
     case OP_ONCE_NC:
     case OP_COND:
-    d = find_fixedlength(cc + ((op == OP_CBRA)? 2:0), utf8, atend, cd);
+    d = find_fixedlength(cc + ((op == OP_CBRA)? IMM2_SIZE : 0), utf, atend, cd);
     if (d < 0) return d;
     branchlength += d;
     do cc += GET(cc, 1); while (*cc == OP_ALT);
@@ -1592,10 +1752,10 @@

     case OP_RECURSE:
     if (!atend) return -3;
-    cs = ce = (uschar *)cd->start_code + GET(cc, 1);  /* Start subpattern */
-    do ce += GET(ce, 1); while (*ce == OP_ALT);       /* End subpattern */
-    if (cc > cs && cc < ce) return -1;                /* Recursion */
-    d = find_fixedlength(cs + 2, utf8, atend, cd);
+    cs = ce = (pcre_uchar *)cd->start_code + GET(cc, 1);  /* Start subpattern */
+    do ce += GET(ce, 1); while (*ce == OP_ALT);           /* End subpattern */
+    if (cc > cs && cc < ce) return -1;                    /* Recursion */
+    d = find_fixedlength(cs + IMM2_SIZE, utf, atend, cd);
     if (d < 0) return d;
     branchlength += d;
     cc += 1 + LINK_SIZE;
@@ -1608,7 +1768,8 @@
     case OP_ASSERTBACK:
     case OP_ASSERTBACK_NOT:
     do cc += GET(cc, 1); while (*cc == OP_ALT);
-    /* Fall through */
+    cc += PRIV(OP_lengths)[*cc];
+    break;

     /* Skip over things that don't match chars */

@@ -1616,7 +1777,7 @@
     case OP_PRUNE_ARG:
     case OP_SKIP_ARG:
     case OP_THEN_ARG:
-    cc += cc[1] + _pcre_OP_lengths[*cc];
+    cc += cc[1] + PRIV(OP_lengths)[*cc];
     break;

     case OP_CALLOUT:
@@ -1643,7 +1804,7 @@
     case OP_SOM:
     case OP_THEN:
     case OP_WORD_BOUNDARY:
-    cc += _pcre_OP_lengths[*cc];
+    cc += PRIV(OP_lengths)[*cc];
     break;

     /* Handle literal characters */
@@ -1654,8 +1815,8 @@
     case OP_NOTI:
     branchlength++;
     cc += 2;
-#ifdef SUPPORT_UTF8
-    if (utf8 && cc[-1] >= 0xc0) cc += _pcre_utf8_table4[cc[-1] & 0x3f];
+#ifdef SUPPORT_UTF
+    if (utf && HAS_EXTRALEN(cc[-1])) cc += GET_EXTRALEN(cc[-1]);
 #endif
     break;

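The old code in this hunk tested for a UTF-8 lead byte (>= 0xc0) and looked up the number of trailing bytes in a table. The new HAS_EXTRALEN() / GET_EXTRALEN() pair hides that detail so the same source can serve UTF-16. A minimal standalone sketch of what "extra length" means in each encoding follows. The function names are hypothetical, the UTF-8 version is limited to the modern 4-byte maximum, and the definitions of the commit's macros are not shown in this hunk.

```c
#include <assert.h>
#include <stdint.h>

/* Number of continuation bytes that follow a UTF-8 lead byte. The caller
   must pass a lead byte, not a continuation byte; forms longer than four
   bytes are outside the scope of this sketch. */
static int utf8_extra_len(uint8_t lead)
{
assert(lead < 0x80 || (lead >= 0xc0 && lead < 0xf8));
if (lead < 0xc0) return 0;   /* ASCII: single byte */
if (lead < 0xe0) return 1;   /* 110xxxxx */
if (lead < 0xf0) return 2;   /* 1110xxxx */
return 3;                    /* 11110xxx */
}

/* Number of extra 16-bit units that follow a UTF-16 code unit: one when the
   unit is a high surrogate (0xd800..0xdbff), otherwise none. */
static int utf16_extra_len(uint16_t unit)
{
return ((unit & 0xfc00) == 0xd800) ? 1 : 0;
}
```

In both encodings the first unit alone determines how far to advance, which is what lets the compiler skip a whole character after reading cc[-1].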
@@ -1667,16 +1828,16 @@
     case OP_NOTEXACT:
     case OP_NOTEXACTI:
     branchlength += GET2(cc,1);
-    cc += 4;
-#ifdef SUPPORT_UTF8
-    if (utf8 && cc[-1] >= 0xc0) cc += _pcre_utf8_table4[cc[-1] & 0x3f];
+    cc += 2 + IMM2_SIZE;
+#ifdef SUPPORT_UTF
+    if (utf && HAS_EXTRALEN(cc[-1])) cc += GET_EXTRALEN(cc[-1]);
 #endif
     break;

     case OP_TYPEEXACT:
     branchlength += GET2(cc,1);
-    if (cc[3] == OP_PROP || cc[3] == OP_NOTPROP) cc += 2;
-    cc += 4;
+    if (cc[1 + IMM2_SIZE] == OP_PROP || cc[1 + IMM2_SIZE] == OP_NOTPROP) cc += 2;
+    cc += 1 + IMM2_SIZE + 1;
     break;

     /* Handle single-char matchers */
@@ -1702,7 +1863,7 @@
     cc++;
     break;

-    /* The single-byte matcher isn't allowed. This only happens in UTF-8 mode; 
+    /* The single-byte matcher isn't allowed. This only happens in UTF-8 mode;
     otherwise \C is coded as OP_ALLANY. */

     case OP_ANYBYTE:
@@ -1710,15 +1871,15 @@

     /* Check a class for variable quantification */

-#ifdef SUPPORT_UTF8
+#if defined SUPPORT_UTF || defined COMPILE_PCRE16
     case OP_XCLASS:
-    cc += GET(cc, 1) - 33;
+    cc += GET(cc, 1) - PRIV(OP_lengths)[OP_CLASS];
     /* Fall through */
 #endif

     case OP_CLASS:
     case OP_NCLASS:
-    cc += 33;
+    cc += PRIV(OP_lengths)[OP_CLASS];

     switch (*cc)
       {
@@ -1732,9 +1893,9 @@

       case OP_CRRANGE:
       case OP_CRMINRANGE:
-      if (GET2(cc,1) != GET2(cc,3)) return -1;
+      if (GET2(cc,1) != GET2(cc,1+IMM2_SIZE)) return -1;
       branchlength += GET2(cc,1);
-      cc += 5;
+      cc += 1 + 2 * IMM2_SIZE;
       break;

       default:
@@ -1849,14 +2010,14 @@

 Arguments:
   code        points to start of expression
-  utf8        TRUE in UTF-8 mode
+  utf         TRUE in UTF-8 / UTF-16 mode
   number      the required bracket number or negative to find a lookbehind

 Returns:      pointer to the opcode for the bracket, or NULL if not found
 */

-const uschar *
-_pcre_find_bracket(const uschar *code, BOOL utf8, int number)
+const pcre_uchar *
+PRIV(find_bracket)(const pcre_uchar *code, BOOL utf, int number)
 {
 for (;;)
   {
@@ -1874,8 +2035,8 @@

   else if (c == OP_REVERSE)
     {
-    if (number < 0) return (uschar *)code;
-    code += _pcre_OP_lengths[c];
+    if (number < 0) return (pcre_uchar *)code;
+    code += PRIV(OP_lengths)[c];
     }

   /* Handle capturing bracket */
@@ -1884,8 +2045,8 @@
            c == OP_CBRAPOS || c == OP_SCBRAPOS)
     {
     int n = GET2(code, 1+LINK_SIZE);
-    if (n == number) return (uschar *)code;
-    code += _pcre_OP_lengths[c];
+    if (n == number) return (pcre_uchar *)code;
+    code += PRIV(OP_lengths)[c];
     }

   /* Otherwise, we can get the item's length from the table, except that for
@@ -1913,7 +2074,8 @@
       case OP_TYPEMINUPTO:
       case OP_TYPEEXACT:
       case OP_TYPEPOSUPTO:
-      if (code[3] == OP_PROP || code[3] == OP_NOTPROP) code += 2;
+      if (code[1 + IMM2_SIZE] == OP_PROP
+        || code[1 + IMM2_SIZE] == OP_NOTPROP) code += 2;
       break;

       case OP_MARK:
@@ -1929,14 +2091,14 @@

     /* Add in the fixed length from the table */

-    code += _pcre_OP_lengths[c];
+    code += PRIV(OP_lengths)[c];

     /* In UTF-8 mode, opcodes that are followed by a character may be followed by
     a multi-byte character. The length in the table is a minimum, so we have to
     arrange to skip the extra bytes. */

-#ifdef SUPPORT_UTF8
-    if (utf8) switch(c)
+#ifdef SUPPORT_UTF
+    if (utf) switch(c)
       {
       case OP_CHAR:
       case OP_CHARI:
@@ -1966,11 +2128,11 @@
       case OP_MINQUERYI:
       case OP_POSQUERY:
       case OP_POSQUERYI:
-      if (code[-1] >= 0xc0) code += _pcre_utf8_table4[code[-1] & 0x3f];
+      if (HAS_EXTRALEN(code[-1])) code += GET_EXTRALEN(code[-1]);
       break;
       }
 #else
-    (void)(utf8);  /* Keep compiler happy by referencing function argument */
+    (void)(utf);  /* Keep compiler happy by referencing function argument */
 #endif
     }
   }
@@ -1987,13 +2149,13 @@

 Arguments:
   code        points to start of expression
-  utf8        TRUE in UTF-8 mode
+  utf         TRUE in UTF-8 / UTF-16 mode

 Returns:      pointer to the opcode for OP_RECURSE, or NULL if not found
 */

-static const uschar *
-find_recurse(const uschar *code, BOOL utf8)
+static const pcre_uchar *
+find_recurse(const pcre_uchar *code, BOOL utf)
 {
 for (;;)
   {
@@ -2032,7 +2194,8 @@
       case OP_TYPEUPTO:
       case OP_TYPEMINUPTO:
       case OP_TYPEEXACT:
-      if (code[3] == OP_PROP || code[3] == OP_NOTPROP) code += 2;
+      if (code[1 + IMM2_SIZE] == OP_PROP
+        || code[1 + IMM2_SIZE] == OP_NOTPROP) code += 2;
       break;

       case OP_MARK:
@@ -2048,14 +2211,14 @@

     /* Add in the fixed length from the table */

-    code += _pcre_OP_lengths[c];
+    code += PRIV(OP_lengths)[c];

     /* In UTF-8 mode, opcodes that are followed by a character may be followed
     by a multi-byte character. The length in the table is a minimum, so we have
     to arrange to skip the extra bytes. */

-#ifdef SUPPORT_UTF8
-    if (utf8) switch(c)
+#ifdef SUPPORT_UTF
+    if (utf) switch(c)
       {
       case OP_CHAR:
       case OP_CHARI:
@@ -2085,11 +2248,11 @@
       case OP_MINQUERYI:
       case OP_POSQUERY:
       case OP_POSQUERYI:
-      if (code[-1] >= 0xc0) code += _pcre_utf8_table4[code[-1] & 0x3f];
+      if (HAS_EXTRALEN(code[-1])) code += GET_EXTRALEN(code[-1]);
       break;
       }
 #else
-    (void)(utf8);  /* Keep compiler happy by referencing function argument */
+    (void)(utf);  /* Keep compiler happy by referencing function argument */
 #endif
     }
   }
@@ -2112,22 +2275,22 @@
 Arguments:
   code        points to start of search
   endcode     points to where to stop
-  utf8        TRUE if in UTF8 mode
+  utf         TRUE if in UTF-8 / UTF-16 mode
   cd          contains pointers to tables etc.

 Returns:      TRUE if what is matched could be empty
 */

 static BOOL
-could_be_empty_branch(const uschar *code, const uschar *endcode, BOOL utf8,
-  compile_data *cd)
+could_be_empty_branch(const pcre_uchar *code, const pcre_uchar *endcode,
+  BOOL utf, compile_data *cd)
 {
 register int c;
-for (code = first_significant_code(code + _pcre_OP_lengths[*code], TRUE);
+for (code = first_significant_code(code + PRIV(OP_lengths)[*code], TRUE);
      code < endcode;
-     code = first_significant_code(code + _pcre_OP_lengths[c], TRUE))
+     code = first_significant_code(code + PRIV(OP_lengths)[c], TRUE))
   {
-  const uschar *ccode;
+  const pcre_uchar *ccode;

   c = *code;

@@ -2150,7 +2313,7 @@

   if (c == OP_RECURSE)
     {
-    const uschar *scode;
+    const pcre_uchar *scode;
     BOOL empty_branch;

     /* Test for forward reference */
@@ -2168,7 +2331,7 @@

     do
       {
-      if (could_be_empty_branch(scode, endcode, utf8, cd))
+      if (could_be_empty_branch(scode, endcode, utf, cd))
         {
         empty_branch = TRUE;
         break;
@@ -2186,7 +2349,7 @@
   if (c == OP_BRAZERO || c == OP_BRAMINZERO || c == OP_SKIPZERO ||
       c == OP_BRAPOSZERO)
     {
-    code += _pcre_OP_lengths[c];
+    code += PRIV(OP_lengths)[c];
     do code += GET(code, 1); while (*code == OP_ALT);
     c = *code;
     continue;
@@ -2224,7 +2387,7 @@
       empty_branch = FALSE;
       do
         {
-        if (!empty_branch && could_be_empty_branch(code, endcode, utf8, cd))
+        if (!empty_branch && could_be_empty_branch(code, endcode, utf, cd))
           empty_branch = TRUE;
         code += GET(code, 1);
         }
@@ -2242,11 +2405,11 @@
     {
     /* Check for quantifiers after a class. XCLASS is used for classes that
     cannot be represented just by a bit map. This includes negated single
-    high-valued characters. The length in _pcre_OP_lengths[] is zero; the
+    high-valued characters. The length in PRIV(OP_lengths)[] is zero; the
     actual length is stored in the compiled code, so we must update "code"
     here. */

-#ifdef SUPPORT_UTF8
+#if defined SUPPORT_UTF || !defined COMPILE_PCRE8
     case OP_XCLASS:
     ccode = code += GET(code, 1);
     goto CHECK_CLASS_REPEAT;
@@ -2254,9 +2417,9 @@

     case OP_CLASS:
     case OP_NCLASS:
-    ccode = code + 33;
+    ccode = code + PRIV(OP_lengths)[OP_CLASS];

-#ifdef SUPPORT_UTF8
+#if defined SUPPORT_UTF || !defined COMPILE_PCRE8
     CHECK_CLASS_REPEAT:
 #endif

@@ -2329,7 +2492,8 @@
     case OP_TYPEUPTO:
     case OP_TYPEMINUPTO:
     case OP_TYPEPOSUPTO:
-    if (code[3] == OP_PROP || code[3] == OP_NOTPROP) code += 2;
+    if (code[1 + IMM2_SIZE] == OP_PROP
+      || code[1 + IMM2_SIZE] == OP_NOTPROP) code += 2;
     break;

     /* End of branch */
@@ -2344,7 +2508,7 @@
     /* In UTF-8 mode, STAR, MINSTAR, POSSTAR, QUERY, MINQUERY, POSQUERY, UPTO,
     MINUPTO, and POSUPTO may be followed by a multibyte character */

-#ifdef SUPPORT_UTF8
+#ifdef SUPPORT_UTF
     case OP_STAR:
     case OP_STARI:
     case OP_MINSTAR:
@@ -2357,7 +2521,7 @@
     case OP_MINQUERYI:
     case OP_POSQUERY:
     case OP_POSQUERYI:
-    if (utf8 && code[1] >= 0xc0) code += _pcre_utf8_table4[code[1] & 0x3f];
+    if (utf && HAS_EXTRALEN(code[1])) code += GET_EXTRALEN(code[1]);
     break;

     case OP_UPTO:
@@ -2366,7 +2530,7 @@
     case OP_MINUPTOI:
     case OP_POSUPTO:
     case OP_POSUPTOI:
-    if (utf8 && code[3] >= 0xc0) code += _pcre_utf8_table4[code[3] & 0x3f];
+    if (utf && HAS_EXTRALEN(code[1 + IMM2_SIZE])) code += GET_EXTRALEN(code[1 + IMM2_SIZE]);
     break;
 #endif

@@ -2410,19 +2574,19 @@
   code        points to start of the recursion
   endcode     points to where to stop (current RECURSE item)
   bcptr       points to the chain of current (unclosed) branch starts
-  utf8        TRUE if in UTF-8 mode
+  utf         TRUE if in UTF-8 / UTF-16 mode
   cd          pointers to tables etc

 Returns:      TRUE if what is matched could be empty
 */

 static BOOL
-could_be_empty(const uschar *code, const uschar *endcode, branch_chain *bcptr,
-  BOOL utf8, compile_data *cd)
+could_be_empty(const pcre_uchar *code, const pcre_uchar *endcode,
+  branch_chain *bcptr, BOOL utf, compile_data *cd)
 {
 while (bcptr != NULL && bcptr->current_branch >= code)
   {
-  if (!could_be_empty_branch(bcptr->current_branch, endcode, utf8, cd))
+  if (!could_be_empty_branch(bcptr->current_branch, endcode, utf, cd))
     return FALSE;
   bcptr = bcptr->outer;
   }
@@ -2474,7 +2638,7 @@
 */

 static BOOL
-check_posix_syntax(const uschar *ptr, const uschar **endptr)
+check_posix_syntax(const pcre_uchar *ptr, const pcre_uchar **endptr)
 {
 int terminator;          /* Don't combine these lines; the Solaris cc */
 terminator = *(++ptr);   /* compiler warns about "non-constant" initializer. */
@@ -2518,14 +2682,14 @@
 */

 static int
-check_posix_name(const uschar *ptr, int len)
+check_posix_name(const pcre_uchar *ptr, int len)
 {
 const char *pn = posix_names;
 register int yield = 0;
 while (posix_name_lengths[yield] != 0)
   {
   if (len == posix_name_lengths[yield] &&
-    strncmp((const char *)ptr, pn, len) == 0) return yield;
+    STRNCMP_UC_C8(ptr, pn, len) == 0) return yield;
   pn += posix_name_lengths[yield] + 1;
   yield++;
   }
@@ -2557,7 +2721,7 @@
 Arguments:
   group      points to the start of the group
   adjust     the amount by which the group is to be moved
-  utf8       TRUE in UTF-8 mode
+  utf        TRUE in UTF-8 / UTF-16 mode
   cd         contains pointers to tables etc.
   save_hwm   the hwm forward reference pointer at the start of the group

@@ -2565,15 +2729,15 @@
 */

 static void
-adjust_recurse(uschar *group, int adjust, BOOL utf8, compile_data *cd,
-  uschar *save_hwm)
+adjust_recurse(pcre_uchar *group, int adjust, BOOL utf, compile_data *cd,
+  pcre_uchar *save_hwm)
 {
-uschar *ptr = group;
+pcre_uchar *ptr = group;

-while ((ptr = (uschar *)find_recurse(ptr, utf8)) != NULL)
+while ((ptr = (pcre_uchar *)find_recurse(ptr, utf)) != NULL)
   {
   int offset;
-  uschar *hc;
+  pcre_uchar *hc;

   /* See if this recursion is on the forward reference list. If so, adjust the
   reference. */
@@ -2618,14 +2782,14 @@
 Returns:         new code pointer
 */

-static uschar *
-auto_callout(uschar *code, const uschar *ptr, compile_data *cd)
+static pcre_uchar *
+auto_callout(pcre_uchar *code, const pcre_uchar *ptr, compile_data *cd)
 {
 *code++ = OP_CALLOUT;
 *code++ = 255;
 PUT(code, 0, (int)(ptr - cd->start_pattern));  /* Pattern offset */
 PUT(code, LINK_SIZE, 0);                       /* Default length */
-return code + 2*LINK_SIZE;
+return code + 2 * LINK_SIZE;
 }

@@ -2647,7 +2811,7 @@
 */

 static void
-complete_callout(uschar *previous_callout, const uschar *ptr, compile_data *cd)
+complete_callout(pcre_uchar *previous_callout, const pcre_uchar *ptr, compile_data *cd)
 {
 int length = (int)(ptr - cd->start_pattern - GET(previous_callout, 2));
 PUT(previous_callout, 2 + LINK_SIZE, length);
@@ -2730,7 +2894,7 @@
           prop->chartype == ucp_Lt) == negated;

   case PT_GC:
-  return (pdata == _pcre_ucp_gentype[prop->chartype]) == negated;
+  return (pdata == PRIV(ucp_gentype)[prop->chartype]) == negated;

   case PT_PC:
   return (pdata == prop->chartype) == negated;
@@ -2741,23 +2905,23 @@
   /* These are specials */

   case PT_ALNUM:
-  return (_pcre_ucp_gentype[prop->chartype] == ucp_L ||
-          _pcre_ucp_gentype[prop->chartype] == ucp_N) == negated;
+  return (PRIV(ucp_gentype)[prop->chartype] == ucp_L ||
+          PRIV(ucp_gentype)[prop->chartype] == ucp_N) == negated;

   case PT_SPACE:    /* Perl space */
-  return (_pcre_ucp_gentype[prop->chartype] == ucp_Z ||
+  return (PRIV(ucp_gentype)[prop->chartype] == ucp_Z ||
           c == CHAR_HT || c == CHAR_NL || c == CHAR_FF || c == CHAR_CR)
           == negated;

   case PT_PXSPACE:  /* POSIX space */
-  return (_pcre_ucp_gentype[prop->chartype] == ucp_Z ||
+  return (PRIV(ucp_gentype)[prop->chartype] == ucp_Z ||
           c == CHAR_HT || c == CHAR_NL || c == CHAR_VT ||
           c == CHAR_FF || c == CHAR_CR)
           == negated;

   case PT_WORD:
-  return (_pcre_ucp_gentype[prop->chartype] == ucp_L ||
-          _pcre_ucp_gentype[prop->chartype] == ucp_N ||
+  return (PRIV(ucp_gentype)[prop->chartype] == ucp_L ||
+          PRIV(ucp_gentype)[prop->chartype] == ucp_N ||
           c == CHAR_UNDERSCORE) == negated;
   }
 return FALSE;
@@ -2776,7 +2940,7 @@

 Arguments:
   previous      pointer to the repeated opcode
-  utf8          TRUE in UTF-8 mode
+  utf           TRUE in UTF-8 / UTF-16 mode
   ptr           next character in pattern
   options       options bits
   cd            contains pointers to tables etc.
@@ -2785,10 +2949,10 @@
 */

 static BOOL
-check_auto_possessive(const uschar *previous, BOOL utf8, const uschar *ptr,
-  int options, compile_data *cd)
+check_auto_possessive(const pcre_uchar *previous, BOOL utf,
+  const pcre_uchar *ptr, int options, compile_data *cd)
 {
-int c, next;
+pcre_int32 c, next;
 int op_code = *previous++;

 /* Skip whitespace and comments in extended mode */
@@ -2797,7 +2961,7 @@
   {
   for (;;)
     {
-    while ((cd->ctypes[*ptr] & ctype_space) != 0) ptr++;
+    while (MAX_255(*ptr) && (cd->ctypes[*ptr] & ctype_space) != 0) ptr++;
     if (*ptr == CHAR_NUMBER_SIGN)
       {
       ptr++;
@@ -2805,8 +2969,8 @@
         {
         if (IS_NEWLINE(ptr)) { ptr += cd->nllen; break; }
         ptr++;
-#ifdef SUPPORT_UTF8
-        if (utf8) while ((*ptr & 0xc0) == 0x80) ptr++;
+#ifdef SUPPORT_UTF
+        if (utf) FORWARDCHAR(ptr);
 #endif
         }
       }
@@ -2824,15 +2988,13 @@
   if (temperrorcode != 0) return FALSE;
   ptr++;    /* Point after the escape sequence */
   }
-
-else if ((cd->ctypes[*ptr] & ctype_meta) == 0)
+else if (!MAX_255(*ptr) || (cd->ctypes[*ptr] & ctype_meta) == 0)
   {
-#ifdef SUPPORT_UTF8
-  if (utf8) { GETCHARINC(next, ptr); } else
+#ifdef SUPPORT_UTF
+  if (utf) { GETCHARINC(next, ptr); } else
 #endif
   next = *ptr++;
   }
-
 else return FALSE;

 /* Skip whitespace and comments in extended mode */
@@ -2841,7 +3003,7 @@
   {
   for (;;)
     {
-    while ((cd->ctypes[*ptr] & ctype_space) != 0) ptr++;
+    while (MAX_255(*ptr) && (cd->ctypes[*ptr] & ctype_space) != 0) ptr++;
     if (*ptr == CHAR_NUMBER_SIGN)
       {
       ptr++;
@@ -2849,8 +3011,8 @@
         {
         if (IS_NEWLINE(ptr)) { ptr += cd->nllen; break; }
         ptr++;
-#ifdef SUPPORT_UTF8
-        if (utf8) while ((*ptr & 0xc0) == 0x80) ptr++;
+#ifdef SUPPORT_UTF
+        if (utf) FORWARDCHAR(ptr);
 #endif
         }
       }
@@ -2861,7 +3023,7 @@
 /* If the next thing is itself optional, we have to give up. */

 if (*ptr == CHAR_ASTERISK || *ptr == CHAR_QUESTION_MARK ||
-  strncmp((char *)ptr, STR_LEFT_CURLY_BRACKET STR_0 STR_COMMA, 3) == 0)
+  STRNCMP_UC_C8(ptr, STR_LEFT_CURLY_BRACKET STR_0 STR_COMMA, 3) == 0)
     return FALSE;

 /* Now compare the next item with the previous opcode. First, handle cases when
@@ -2870,7 +3032,7 @@
 if (next >= 0) switch(op_code)
   {
   case OP_CHAR:
-#ifdef SUPPORT_UTF8
+#ifdef SUPPORT_UTF
   GETCHARTEST(c, previous);
 #else
   c = *previous;
@@ -2882,14 +3044,14 @@
   high-valued characters. */

   case OP_CHARI:
-#ifdef SUPPORT_UTF8
+#ifdef SUPPORT_UTF
   GETCHARTEST(c, previous);
 #else
   c = *previous;
 #endif
   if (c == next) return FALSE;
-#ifdef SUPPORT_UTF8
-  if (utf8)
+#ifdef SUPPORT_UTF
+  if (utf)
     {
     unsigned int othercase;
     if (next < 128) othercase = cd->fcc[next]; else
@@ -2901,8 +3063,8 @@
     return (unsigned int)c != othercase;
     }
   else
-#endif  /* SUPPORT_UTF8 */
-  return (c != cd->fcc[next]);  /* Non-UTF-8 mode */
+#endif  /* SUPPORT_UTF */
+  return (c != TABLE_GET(next, cd->fcc, next));  /* Non-UTF-8 mode */

   /* For OP_NOT and OP_NOTI, the data is always a single-byte character. These
   opcodes are not used for multi-byte characters, because they are coded using
@@ -2913,8 +3075,8 @@

   case OP_NOTI:
   if ((c = *previous) == next) return TRUE;
-#ifdef SUPPORT_UTF8
-  if (utf8)
+#ifdef SUPPORT_UTF
+  if (utf)
     {
     unsigned int othercase;
     if (next < 128) othercase = cd->fcc[next]; else
@@ -2926,8 +3088,8 @@
     return (unsigned int)c == othercase;
     }
   else
-#endif  /* SUPPORT_UTF8 */
-  return (c == cd->fcc[next]);  /* Non-UTF-8 mode */
+#endif  /* SUPPORT_UTF */
+  return (c == TABLE_GET(next, cd->fcc, next));  /* Non-UTF-8 mode */

   /* Note that OP_DIGIT etc. are generated only when PCRE_UCP is *not* set.
   When it is set, \d etc. are converted into OP_(NOT_)PROP codes. */
@@ -3018,7 +3180,7 @@
   {
   case OP_CHAR:
   case OP_CHARI:
-#ifdef SUPPORT_UTF8
+#ifdef SUPPORT_UTF
   GETCHARTEST(c, previous);
 #else
   c = *previous;
@@ -3123,7 +3285,7 @@
       to the original \d etc. At this point, ptr will point to a zero byte. */

       if (*ptr == CHAR_ASTERISK || *ptr == CHAR_QUESTION_MARK ||
-        strncmp((char *)ptr, STR_LEFT_CURLY_BRACKET STR_0 STR_COMMA, 3) == 0)
+        STRNCMP_UC_C8(ptr, STR_LEFT_CURLY_BRACKET STR_0 STR_COMMA, 3) == 0)
           return FALSE;

       /* Do the property check. */
@@ -3201,8 +3363,8 @@
   codeptr        points to the pointer to the current code point
   ptrptr         points to the current pattern pointer
   errorcodeptr   points to error code variable
-  firstbyteptr   set to initial literal character, or < 0 (REQ_UNSET, REQ_NONE)
-  reqbyteptr     set to the last literal character required, else < 0
+  firstcharptr   set to initial literal character, or < 0 (REQ_UNSET, REQ_NONE)
+  reqcharptr     set to the last literal character required, else < 0
   bcptr          points to current branch chain
   cond_depth     conditional nesting depth
   cd             contains pointers to tables etc.
@@ -3214,49 +3376,56 @@
 */

 static BOOL
-compile_branch(int *optionsptr, uschar **codeptr, const uschar **ptrptr,
-  int *errorcodeptr, int *firstbyteptr, int *reqbyteptr, branch_chain *bcptr,
-  int cond_depth, compile_data *cd, int *lengthptr)
+compile_branch(int *optionsptr, pcre_uchar **codeptr,
+  const pcre_uchar **ptrptr, int *errorcodeptr, pcre_int32 *firstcharptr,
+  pcre_int32 *reqcharptr, branch_chain *bcptr, int cond_depth,
+  compile_data *cd, int *lengthptr)
 {
 int repeat_type, op_type;
 int repeat_min = 0, repeat_max = 0;      /* To please picky compilers */
 int bravalue = 0;
 int greedy_default, greedy_non_default;
-int firstbyte, reqbyte;
-int zeroreqbyte, zerofirstbyte;
-int req_caseopt, reqvary, tempreqvary;
+pcre_int32 firstchar, reqchar;
+pcre_int32 zeroreqchar, zerofirstchar;
+pcre_int32 req_caseopt, reqvary, tempreqvary;
 int options = *optionsptr;               /* May change dynamically */
 int after_manual_callout = 0;
 int length_prevgroup = 0;
 register int c;
-register uschar *code = *codeptr;
-uschar *last_code = code;
-uschar *orig_code = code;
-uschar *tempcode;
+register pcre_uchar *code = *codeptr;
+pcre_uchar *last_code = code;
+pcre_uchar *orig_code = code;
+pcre_uchar *tempcode;
 BOOL inescq = FALSE;
-BOOL groupsetfirstbyte = FALSE;
-const uschar *ptr = *ptrptr;
-const uschar *tempptr;
-const uschar *nestptr = NULL;
-uschar *previous = NULL;
-uschar *previous_callout = NULL;
-uschar *save_hwm = NULL;
-uschar classbits[32];
+BOOL groupsetfirstchar = FALSE;
+const pcre_uchar *ptr = *ptrptr;
+const pcre_uchar *tempptr;
+const pcre_uchar *nestptr = NULL;
+pcre_uchar *previous = NULL;
+pcre_uchar *previous_callout = NULL;
+pcre_uchar *save_hwm = NULL;
+pcre_uint8 classbits[32];

/* We can fish out the UTF-8 setting once and for all into a BOOL, but we
must not do this for other options (e.g. PCRE_EXTENDED) because they may change
dynamically as we process the pattern. */

-#ifdef SUPPORT_UTF8
-BOOL class_utf8;
-BOOL utf8 = (options & PCRE_UTF8) != 0;
-uschar *class_utf8data;
-uschar *class_utf8data_base;
-uschar utf8_char[6];
+#ifdef SUPPORT_UTF
+/* PCRE_UTF16 has the same value as PCRE_UTF8. */
+BOOL utf = (options & PCRE_UTF8) != 0;
+pcre_uchar utf_chars[6];
#else
-BOOL utf8 = FALSE;
+BOOL utf = FALSE;
#endif

+/* Helper variables for OP_XCLASS opcode (for characters > 255). */
+
+#if defined SUPPORT_UTF || !defined COMPILE_PCRE8
+BOOL xclass;
+pcre_uchar *class_uchardata;
+pcre_uchar *class_uchardata_base;
+#endif
+
#ifdef PCRE_DEBUG
if (lengthptr != NULL) DPRINTF((">> start branch\n"));
#endif
@@ -3268,22 +3437,23 @@

/* Initialize no first byte, no required byte. REQ_UNSET means "no char
matching encountered yet". It gets changed to REQ_NONE if we hit something that
-matches a non-fixed char first char; reqbyte just remains unset if we never
+matches a non-fixed char first char; reqchar just remains unset if we never
find one.

When we hit a repeat whose minimum is zero, we may have to adjust these values
to take the zero repeat into account. This is implemented by setting them to
-zerofirstbyte and zeroreqbyte when such a repeat is encountered. The individual
+zerofirstchar and zeroreqchar when such a repeat is encountered. The individual
item types that can be repeated set these backoff variables appropriately. */

-firstbyte = reqbyte = zerofirstbyte = zeroreqbyte = REQ_UNSET;
+firstchar = reqchar = zerofirstchar = zeroreqchar = REQ_UNSET;

-/* The variable req_caseopt contains either the REQ_CASELESS value or zero,
-according to the current setting of the caseless flag. REQ_CASELESS is a bit
-value > 255. It is added into the firstbyte or reqbyte variables to record the
-case status of the value. This is used only for ASCII characters. */
+/* The variable req_caseopt contains either the REQ_CASELESS value
+or zero, according to the current setting of the caseless flag. The
+REQ_CASELESS value leaves the lower 28 bits empty. It is added into the
+firstchar or reqchar variables to record the case status of the
+value. This is used only for ASCII characters. */

-req_caseopt = ((options & PCRE_CASELESS) != 0)? REQ_CASELESS : 0;
+req_caseopt = ((options & PCRE_CASELESS) != 0)? REQ_CASELESS:0;

/* Switch on next character until the end of the branch */

@@ -3295,20 +3465,20 @@
BOOL is_quantifier;
BOOL is_recurse;
BOOL reset_bracount;
- int class_charcount;
- int class_lastchar;
+ int class_has_8bitchar;
+ int class_single_char;
int newoptions;
int recno;
int refsign;
int skipbytes;
- int subreqbyte;
- int subfirstbyte;
+ int subreqchar;
+ int subfirstchar;
int terminator;
int mclength;
int tempbracount;
- uschar mcbuffer[8];
+ pcre_uchar mcbuffer[8];

- /* Get next byte in the pattern */
+ /* Get next character in the pattern */

c = *ptr;

@@ -3330,7 +3500,8 @@
 #ifdef PCRE_DEBUG
     if (code > cd->hwm) cd->hwm = code;                 /* High water info */
 #endif
-    if (code > cd->start_workspace + WORK_SIZE_CHECK)   /* Check for overrun */
+    if (code > cd->start_workspace + cd->workspace_size -
+        WORK_SIZE_SAFETY_MARGIN)                       /* Check for overrun */
       {
       *errorcodeptr = ERR52;
       goto FAILED;
@@ -3353,9 +3524,9 @@
       }

     *lengthptr += (int)(code - last_code);
-    DPRINTF(("length=%d added %d c=%c\n", *lengthptr, (int)(code - last_code),
-      c));
-
+    DPRINTF(("length=%d added %d c=%c (0x%x)\n", *lengthptr,
+      (int)(code - last_code), c, c));
+      
     /* If "previous" is set and it is not at the start of the work space, move
     it back to there, in order to avoid filling up the work space. Otherwise,
     if "previous" is NULL, reset the current code pointer to the start. */
@@ -3364,7 +3535,7 @@
       {
       if (previous > orig_code)
         {
-        memmove(orig_code, previous, code - previous);
+        memmove(orig_code, previous, IN_UCHARS(code - previous));
         code -= previous - orig_code;
         previous = orig_code;
         }
@@ -3380,7 +3551,8 @@
   /* In the real compile phase, just check the workspace used by the forward
   reference list. */

-  else if (cd->hwm > cd->start_workspace + WORK_SIZE_CHECK)
+  else if (cd->hwm > cd->start_workspace + cd->workspace_size -
+           WORK_SIZE_SAFETY_MARGIN)
     {
     *errorcodeptr = ERR52;
     goto FAILED;
@@ -3432,7 +3604,7 @@

   if ((options & PCRE_EXTENDED) != 0)
     {
-    if ((cd->ctypes[c] & ctype_space) != 0) continue;
+    if (MAX_255(*ptr) && (cd->ctypes[c] & ctype_space) != 0) continue;
     if (c == CHAR_NUMBER_SIGN)
       {
       ptr++;
@@ -3440,8 +3612,8 @@
         {
         if (IS_NEWLINE(ptr)) { ptr += cd->nllen - 1; break; }
         ptr++;
-#ifdef SUPPORT_UTF8
-        if (utf8) while ((*ptr & 0xc0) == 0x80) ptr++;
+#ifdef SUPPORT_UTF
+        if (utf) FORWARDCHAR(ptr);
 #endif
         }
       if (*ptr != 0) continue;
@@ -3465,8 +3637,8 @@
     case 0:                        /* The branch terminates at string end */
     case CHAR_VERTICAL_LINE:       /* or | or ) */
     case CHAR_RIGHT_PARENTHESIS:
-    *firstbyteptr = firstbyte;
-    *reqbyteptr = reqbyte;
+    *firstcharptr = firstchar;
+    *reqcharptr = reqchar;
     *codeptr = code;
     *ptrptr = ptr;
     if (lengthptr != NULL)
@@ -3490,7 +3662,7 @@
     previous = NULL;
     if ((options & PCRE_MULTILINE) != 0)
       {
-      if (firstbyte == REQ_UNSET) firstbyte = REQ_NONE;
+      if (firstchar == REQ_UNSET) firstchar = REQ_NONE;
       *code++ = OP_CIRCM;
       }
     else *code++ = OP_CIRC;
@@ -3502,12 +3674,12 @@
     break;

     /* There can never be a first char if '.' is first, whatever happens about
-    repeats. The value of reqbyte doesn't change either. */
+    repeats. The value of reqchar doesn't change either. */

     case CHAR_DOT:
-    if (firstbyte == REQ_UNSET) firstbyte = REQ_NONE;
-    zerofirstbyte = firstbyte;
-    zeroreqbyte = reqbyte;
+    if (firstchar == REQ_UNSET) firstchar = REQ_NONE;
+    zerofirstchar = firstchar;
+    zeroreqchar = reqchar;
     previous = code;
     *code++ = ((options & PCRE_DOTALL) != 0)? OP_ALLANY: OP_ANY;
     break;
@@ -3562,8 +3734,7 @@
         {
         if (ptr[1] == CHAR_E)
           ptr++;
-        else if (strncmp((const char *)ptr+1,
-                          STR_Q STR_BACKSLASH STR_E, 3) == 0)
+        else if (STRNCMP_UC_C8(ptr + 1, STR_Q STR_BACKSLASH STR_E, 3) == 0)
           ptr += 3;
         else
           break;
@@ -3582,8 +3753,8 @@
         (cd->external_options & PCRE_JAVASCRIPT_COMPAT) != 0)
       {
       *code++ = negate_class? OP_ALLANY : OP_FAIL;
-      if (firstbyte == REQ_UNSET) firstbyte = REQ_NONE;
-      zerofirstbyte = firstbyte;
+      if (firstchar == REQ_UNSET) firstchar = REQ_NONE;
+      zerofirstchar = firstchar;
       break;
       }

@@ -3593,24 +3764,25 @@

     should_flip_negation = FALSE;

-    /* Keep a count of chars with values < 256 so that we can optimize the case
-    of just a single character (as long as it's < 256). However, For higher
-    valued UTF-8 characters, we don't yet do any optimization. */
+    /* For optimization purposes, we track some properties of the class.
+    class_has_8bitchar will be non-zero, if the class contains at least one
+    < 256 character. class_single_char will be 1 if the class contains only
+    a single character. */

-    class_charcount = 0;
-    class_lastchar = -1;
+    class_has_8bitchar = 0;
+    class_single_char = 0;

     /* Initialize the 32-char bit map to all zeros. We build the map in a
     temporary bit of memory, in case the class contains only 1 character (less
     than 256), because in that case the compiled code doesn't use the bit map.
     */

-    memset(classbits, 0, 32 * sizeof(uschar));
+    memset(classbits, 0, 32 * sizeof(pcre_uint8));

-#ifdef SUPPORT_UTF8
-    class_utf8 = FALSE;                       /* No chars >= 256 */
-    class_utf8data = code + LINK_SIZE + 2;    /* For UTF-8 items */
-    class_utf8data_base = class_utf8data;     /* For resetting in pass 1 */
+#if defined SUPPORT_UTF || !defined COMPILE_PCRE8
+    xclass = FALSE;                           /* No chars >= 256 */
+    class_uchardata = code + LINK_SIZE + 2;   /* For XCLASS items */
+    class_uchardata_base = class_uchardata;   /* For resetting in pass 1 */
 #endif

     /* Process characters until ] is reached. By writing this as a "do" it
@@ -3619,25 +3791,26 @@

     if (c != 0) do
       {
-      const uschar *oldptr;
+      const pcre_uchar *oldptr;

-#ifdef SUPPORT_UTF8
-      if (utf8 && c > 127)
+#ifdef SUPPORT_UTF
+      if (utf && HAS_EXTRALEN(c))
         {                           /* Braces are required because the */
         GETCHARLEN(c, ptr, ptr);    /* macro generates multiple statements */
         }
+#endif

-      /* In the pre-compile phase, accumulate the length of any UTF-8 extra
+#if defined SUPPORT_UTF || !defined COMPILE_PCRE8
+      /* In the pre-compile phase, accumulate the length of any extra
       data and reset the pointer. This is so that very large classes that
-      contain a zillion UTF-8 characters no longer overwrite the work space
+      contain a zillion > 255 characters no longer overwrite the work space
       (which is on the stack). */

       if (lengthptr != NULL)
         {
-        *lengthptr += class_utf8data - class_utf8data_base;
-        class_utf8data = class_utf8data_base;
+        *lengthptr += class_uchardata - class_uchardata_base;
+        class_uchardata = class_uchardata_base;
         }
-
 #endif

       /* Inside \Q...\E everything is literal except \E */
@@ -3665,8 +3838,8 @@
         {
         BOOL local_negate = FALSE;
         int posix_class, taboffset, tabopt;
-        register const uschar *cbits = cd->cbits;
-        uschar pbits[32];
+        register const pcre_uint8 *cbits = cd->cbits;
+        pcre_uint8 pbits[32];

         if (ptr[1] != CHAR_COLON)
           {
@@ -3721,7 +3894,7 @@
         /* Copy in the first table (always present) */

         memcpy(pbits, cbits + posix_class_maps[posix_class],
-          32 * sizeof(uschar));
+          32 * sizeof(pcre_uint8));

         /* If there is a second table, add or remove it as required. */

@@ -3752,16 +3925,20 @@
           for (c = 0; c < 32; c++) classbits[c] |= pbits[c];

         ptr = tempptr + 1;
-        class_charcount = 10;  /* Set > 1; assumes more than 1 per class */
+        /* Every class contains at least one < 256 character. */
+        class_has_8bitchar = 1;
+        /* Every class contains at least two characters. */
+        class_single_char = 2;
         continue;    /* End of POSIX syntax handling */
         }

       /* Backslash may introduce a single character, or it may introduce one
       of the specials, which just set a flag. The sequence \b is a special
       case. Inside a class (and only there) it is treated as backspace. We
-      assume that other escapes have more than one character in them, so set
-      class_charcount bigger than one. Unrecognized escapes fall through and
-      are either treated as literal characters (by default), or are faulted if
+      assume that other escapes have more than one character in them, so
+      speculatively set both class_has_8bitchar and class_single_char bigger
+      than one. Unrecognized escapes fall through and are either treated
+      as literal characters (by default), or are faulted if
       PCRE_EXTRA is set. */

       if (c == CHAR_BACKSLASH)
@@ -3770,6 +3947,11 @@
         if (*errorcodeptr != 0) goto FAILED;

         if (-c == ESC_b) c = CHAR_BS;    /* \b is backspace in a class */
+        else if (-c == ESC_N)            /* \N is not supported in a class */
+          {
+          *errorcodeptr = ERR71;
+          goto FAILED;
+          }
         else if (-c == ESC_Q)            /* Handle start of quoted string */
           {
           if (ptr[1] == CHAR_BACKSLASH && ptr[2] == CHAR_E)
@@ -3783,8 +3965,11 @@

         if (c < 0)
           {
-          register const uschar *cbits = cd->cbits;
-          class_charcount += 2;     /* Greater than 1 is what matters */
+          register const pcre_uint8 *cbits = cd->cbits;
+          /* Every class contains at least two < 256 characters. */
+          class_has_8bitchar++;
+          /* Every class contains at least two characters. */
+          class_single_char += 2;

           switch (-c)
             {
@@ -3797,7 +3982,7 @@
             case ESC_SU:
             nestptr = ptr;
             ptr = substitutes[-c - ESC_DU] - 1;  /* Just before substitute */
-            class_charcount -= 2;                /* Undo! */
+            class_has_8bitchar--;                /* Undo! */
             continue;
 #endif
             case ESC_d:
@@ -3838,23 +4023,38 @@
             SETBIT(classbits, 0x09); /* VT */
             SETBIT(classbits, 0x20); /* SPACE */
             SETBIT(classbits, 0xa0); /* NSBP */
-#ifdef SUPPORT_UTF8
-            if (utf8)
+#ifndef COMPILE_PCRE8
+            xclass = TRUE;
+            *class_uchardata++ = XCL_SINGLE;
+            *class_uchardata++ = 0x1680;
+            *class_uchardata++ = XCL_SINGLE;
+            *class_uchardata++ = 0x180e;
+            *class_uchardata++ = XCL_RANGE;
+            *class_uchardata++ = 0x2000;
+            *class_uchardata++ = 0x200a;
+            *class_uchardata++ = XCL_SINGLE;
+            *class_uchardata++ = 0x202f;
+            *class_uchardata++ = XCL_SINGLE;
+            *class_uchardata++ = 0x205f;
+            *class_uchardata++ = XCL_SINGLE;
+            *class_uchardata++ = 0x3000;
+#elif defined SUPPORT_UTF
+            if (utf)
               {
-              class_utf8 = TRUE;
-              *class_utf8data++ = XCL_SINGLE;
-              class_utf8data += _pcre_ord2utf8(0x1680, class_utf8data);
-              *class_utf8data++ = XCL_SINGLE;
-              class_utf8data += _pcre_ord2utf8(0x180e, class_utf8data);
-              *class_utf8data++ = XCL_RANGE;
-              class_utf8data += _pcre_ord2utf8(0x2000, class_utf8data);
-              class_utf8data += _pcre_ord2utf8(0x200A, class_utf8data);
-              *class_utf8data++ = XCL_SINGLE;
-              class_utf8data += _pcre_ord2utf8(0x202f, class_utf8data);
-              *class_utf8data++ = XCL_SINGLE;
-              class_utf8data += _pcre_ord2utf8(0x205f, class_utf8data);
-              *class_utf8data++ = XCL_SINGLE;
-              class_utf8data += _pcre_ord2utf8(0x3000, class_utf8data);
+              xclass = TRUE;
+              *class_uchardata++ = XCL_SINGLE;
+              class_uchardata += PRIV(ord2utf)(0x1680, class_uchardata);
+              *class_uchardata++ = XCL_SINGLE;
+              class_uchardata += PRIV(ord2utf)(0x180e, class_uchardata);
+              *class_uchardata++ = XCL_RANGE;
+              class_uchardata += PRIV(ord2utf)(0x2000, class_uchardata);
+              class_uchardata += PRIV(ord2utf)(0x200a, class_uchardata);
+              *class_uchardata++ = XCL_SINGLE;
+              class_uchardata += PRIV(ord2utf)(0x202f, class_uchardata);
+              *class_uchardata++ = XCL_SINGLE;
+              class_uchardata += PRIV(ord2utf)(0x205f, class_uchardata);
+              *class_uchardata++ = XCL_SINGLE;
+              class_uchardata += PRIV(ord2utf)(0x3000, class_uchardata);
               }
 #endif
             continue;
@@ -3872,32 +4072,59 @@
                 }
               classbits[c] |= x;
               }
-
-#ifdef SUPPORT_UTF8
-            if (utf8)
+#ifndef COMPILE_PCRE8
+            xclass = TRUE;
+            *class_uchardata++ = XCL_RANGE;
+            *class_uchardata++ = 0x0100;
+            *class_uchardata++ = 0x167f;
+            *class_uchardata++ = XCL_RANGE;
+            *class_uchardata++ = 0x1681;
+            *class_uchardata++ = 0x180d;
+            *class_uchardata++ = XCL_RANGE;
+            *class_uchardata++ = 0x180f;
+            *class_uchardata++ = 0x1fff;
+            *class_uchardata++ = XCL_RANGE;
+            *class_uchardata++ = 0x200b;
+            *class_uchardata++ = 0x202e;
+            *class_uchardata++ = XCL_RANGE;
+            *class_uchardata++ = 0x2030;
+            *class_uchardata++ = 0x205e;
+            *class_uchardata++ = XCL_RANGE;
+            *class_uchardata++ = 0x2060;
+            *class_uchardata++ = 0x2fff;
+            *class_uchardata++ = XCL_RANGE;
+            *class_uchardata++ = 0x3001;
+#ifdef SUPPORT_UTF
+            if (utf)
+              class_uchardata += PRIV(ord2utf)(0x10ffff, class_uchardata);
+            else
+#endif
+              *class_uchardata++ = 0xffff;
+#elif defined SUPPORT_UTF
+            if (utf)
               {
-              class_utf8 = TRUE;
-              *class_utf8data++ = XCL_RANGE;
-              class_utf8data += _pcre_ord2utf8(0x0100, class_utf8data);
-              class_utf8data += _pcre_ord2utf8(0x167f, class_utf8data);
-              *class_utf8data++ = XCL_RANGE;
-              class_utf8data += _pcre_ord2utf8(0x1681, class_utf8data);
-              class_utf8data += _pcre_ord2utf8(0x180d, class_utf8data);
-              *class_utf8data++ = XCL_RANGE;
-              class_utf8data += _pcre_ord2utf8(0x180f, class_utf8data);
-              class_utf8data += _pcre_ord2utf8(0x1fff, class_utf8data);
-              *class_utf8data++ = XCL_RANGE;
-              class_utf8data += _pcre_ord2utf8(0x200B, class_utf8data);
-              class_utf8data += _pcre_ord2utf8(0x202e, class_utf8data);
-              *class_utf8data++ = XCL_RANGE;
-              class_utf8data += _pcre_ord2utf8(0x2030, class_utf8data);
-              class_utf8data += _pcre_ord2utf8(0x205e, class_utf8data);
-              *class_utf8data++ = XCL_RANGE;
-              class_utf8data += _pcre_ord2utf8(0x2060, class_utf8data);
-              class_utf8data += _pcre_ord2utf8(0x2fff, class_utf8data);
-              *class_utf8data++ = XCL_RANGE;
-              class_utf8data += _pcre_ord2utf8(0x3001, class_utf8data);
-              class_utf8data += _pcre_ord2utf8(0x7fffffff, class_utf8data);
+              xclass = TRUE;
+              *class_uchardata++ = XCL_RANGE;
+              class_uchardata += PRIV(ord2utf)(0x0100, class_uchardata);
+              class_uchardata += PRIV(ord2utf)(0x167f, class_uchardata);
+              *class_uchardata++ = XCL_RANGE;
+              class_uchardata += PRIV(ord2utf)(0x1681, class_uchardata);
+              class_uchardata += PRIV(ord2utf)(0x180d, class_uchardata);
+              *class_uchardata++ = XCL_RANGE;
+              class_uchardata += PRIV(ord2utf)(0x180f, class_uchardata);
+              class_uchardata += PRIV(ord2utf)(0x1fff, class_uchardata);
+              *class_uchardata++ = XCL_RANGE;
+              class_uchardata += PRIV(ord2utf)(0x200b, class_uchardata);
+              class_uchardata += PRIV(ord2utf)(0x202e, class_uchardata);
+              *class_uchardata++ = XCL_RANGE;
+              class_uchardata += PRIV(ord2utf)(0x2030, class_uchardata);
+              class_uchardata += PRIV(ord2utf)(0x205e, class_uchardata);
+              *class_uchardata++ = XCL_RANGE;
+              class_uchardata += PRIV(ord2utf)(0x2060, class_uchardata);
+              class_uchardata += PRIV(ord2utf)(0x2fff, class_uchardata);
+              *class_uchardata++ = XCL_RANGE;
+              class_uchardata += PRIV(ord2utf)(0x3001, class_uchardata);
+              class_uchardata += PRIV(ord2utf)(0x10ffff, class_uchardata);
               }
 #endif
             continue;
@@ -3908,13 +4135,18 @@
             SETBIT(classbits, 0x0c); /* FF */
             SETBIT(classbits, 0x0d); /* CR */
             SETBIT(classbits, 0x85); /* NEL */
-#ifdef SUPPORT_UTF8
-            if (utf8)
+#ifndef COMPILE_PCRE8
+            xclass = TRUE;
+            *class_uchardata++ = XCL_RANGE;
+            *class_uchardata++ = 0x2028;
+            *class_uchardata++ = 0x2029;
+#elif defined SUPPORT_UTF
+            if (utf)
               {
-              class_utf8 = TRUE;
-              *class_utf8data++ = XCL_RANGE;
-              class_utf8data += _pcre_ord2utf8(0x2028, class_utf8data);
-              class_utf8data += _pcre_ord2utf8(0x2029, class_utf8data);
+              xclass = TRUE;
+              *class_uchardata++ = XCL_RANGE;
+              class_uchardata += PRIV(ord2utf)(0x2028, class_uchardata);
+              class_uchardata += PRIV(ord2utf)(0x2029, class_uchardata);
               }
 #endif
             continue;
@@ -3936,16 +4168,29 @@
               classbits[c] |= x;
               }

-#ifdef SUPPORT_UTF8
-            if (utf8)
+#ifndef COMPILE_PCRE8
+            xclass = TRUE;
+            *class_uchardata++ = XCL_RANGE;
+            *class_uchardata++ = 0x0100;
+            *class_uchardata++ = 0x2027;
+            *class_uchardata++ = XCL_RANGE;
+            *class_uchardata++ = 0x202a;
+#ifdef SUPPORT_UTF
+            if (utf)
+              class_uchardata += PRIV(ord2utf)(0x10ffff, class_uchardata);
+            else
+#endif
+              *class_uchardata++ = 0xffff;
+#elif defined SUPPORT_UTF
+            if (utf)
               {
-              class_utf8 = TRUE;
-              *class_utf8data++ = XCL_RANGE;
-              class_utf8data += _pcre_ord2utf8(0x0100, class_utf8data);
-              class_utf8data += _pcre_ord2utf8(0x2027, class_utf8data);
-              *class_utf8data++ = XCL_RANGE;
-              class_utf8data += _pcre_ord2utf8(0x2029, class_utf8data);
-              class_utf8data += _pcre_ord2utf8(0x7fffffff, class_utf8data);
+              xclass = TRUE;
+              *class_uchardata++ = XCL_RANGE;
+              class_uchardata += PRIV(ord2utf)(0x0100, class_uchardata);
+              class_uchardata += PRIV(ord2utf)(0x2027, class_uchardata);
+              *class_uchardata++ = XCL_RANGE;
+              class_uchardata += PRIV(ord2utf)(0x202a, class_uchardata);
+              class_uchardata += PRIV(ord2utf)(0x10ffff, class_uchardata);
               }
 #endif
             continue;
@@ -3958,12 +4203,12 @@
               int pdata;
               int ptype = get_ucp(&ptr, &negated, &pdata, errorcodeptr);
               if (ptype < 0) goto FAILED;
-              class_utf8 = TRUE;
-              *class_utf8data++ = ((-c == ESC_p) != negated)?
+              xclass = TRUE;
+              *class_uchardata++ = ((-c == ESC_p) != negated)?
                 XCL_PROP : XCL_NOTPROP;
-              *class_utf8data++ = ptype;
-              *class_utf8data++ = pdata;
-              class_charcount -= 2;   /* Not a < 256 character */
+              *class_uchardata++ = ptype;
+              *class_uchardata++ = pdata;
+              class_has_8bitchar--;                /* Undo! */
               continue;
               }
 #endif
@@ -3977,14 +4222,15 @@
               *errorcodeptr = ERR7;
               goto FAILED;
               }
-            class_charcount -= 2;  /* Undo the default count from above */
-            c = *ptr;              /* Get the final character and fall through */
+            class_has_8bitchar--;    /* Undo the speculative increase. */
+            class_single_char -= 2;  /* Undo the speculative increase. */
+            c = *ptr;                /* Get the final character and fall through */
             break;
             }
           }

         /* Fall through if we have a single character (c >= 0). This may be
-        greater than 256 in UTF-8 mode. */
+        greater than 256. */

         }   /* End of backslash handling */

@@ -4032,8 +4278,8 @@
           goto LONE_SINGLE_CHARACTER;
           }

-#ifdef SUPPORT_UTF8
-        if (utf8)
+#ifdef SUPPORT_UTF
+        if (utf)
           {                           /* Braces are required because the */
           GETCHARLEN(d, ptr, ptr);    /* macro generates multiple statements */
           }
@@ -4077,22 +4323,36 @@

         if (d == CHAR_CR || d == CHAR_NL) cd->external_flags |= PCRE_HASCRORLF;

+        /* Since we found a character range, single character optimizations
+        cannot be done anymore. */
+        class_single_char = 2;
+
         /* In UTF-8 mode, if the upper limit is > 255, or > 127 for caseless
         matching, we have to use an XCLASS with extra data items. Caseless
         matching for characters > 127 is available only if UCP support is
         available. */

-#ifdef SUPPORT_UTF8
-        if (utf8 && (d > 255 || ((options & PCRE_CASELESS) != 0 && d > 127)))
+#if defined SUPPORT_UTF && !(defined COMPILE_PCRE8)
+        if ((d > 255) || (utf && ((options & PCRE_CASELESS) != 0 && d > 127)))
+#elif defined  SUPPORT_UTF
+        if (utf && (d > 255 || ((options & PCRE_CASELESS) != 0 && d > 127)))
+#elif !(defined COMPILE_PCRE8)
+        if (d > 255)
+#endif
+#if defined SUPPORT_UTF || !(defined COMPILE_PCRE8)
           {
-          class_utf8 = TRUE;
+          xclass = TRUE;

           /* With UCP support, we can find the other case equivalents of
           the relevant characters. There may be several ranges. Optimize how
           they fit with the basic range. */

 #ifdef SUPPORT_UCP
+#ifndef COMPILE_PCRE8
+          if (utf && (options & PCRE_CASELESS) != 0)
+#else
           if ((options & PCRE_CASELESS) != 0)
+#endif
             {
             unsigned int occ, ocd;
             unsigned int cc = c;
@@ -4118,14 +4378,14 @@

               if (occ == ocd)
                 {
-                *class_utf8data++ = XCL_SINGLE;
+                *class_uchardata++ = XCL_SINGLE;
                 }
               else
                 {
-                *class_utf8data++ = XCL_RANGE;
-                class_utf8data += _pcre_ord2utf8(occ, class_utf8data);
+                *class_uchardata++ = XCL_RANGE;
+                class_uchardata += PRIV(ord2utf)(occ, class_uchardata);
                 }
-              class_utf8data += _pcre_ord2utf8(ocd, class_utf8data);
+              class_uchardata += PRIV(ord2utf)(ocd, class_uchardata);
               }
             }
 #endif  /* SUPPORT_UCP */
@@ -4133,33 +4393,69 @@
           /* Now record the original range, possibly modified for UCP caseless
           overlapping ranges. */

-          *class_utf8data++ = XCL_RANGE;
-          class_utf8data += _pcre_ord2utf8(c, class_utf8data);
-          class_utf8data += _pcre_ord2utf8(d, class_utf8data);
+          *class_uchardata++ = XCL_RANGE;
+#ifdef SUPPORT_UTF
+#ifndef COMPILE_PCRE8
+          if (utf)
+            {
+            class_uchardata += PRIV(ord2utf)(c, class_uchardata);
+            class_uchardata += PRIV(ord2utf)(d, class_uchardata);
+            }
+          else
+            {
+            *class_uchardata++ = c;
+            *class_uchardata++ = d;
+            }
+#else
+          class_uchardata += PRIV(ord2utf)(c, class_uchardata);
+          class_uchardata += PRIV(ord2utf)(d, class_uchardata);
+#endif
+#else /* SUPPORT_UTF */
+          *class_uchardata++ = c;
+          *class_uchardata++ = d;
+#endif /* SUPPORT_UTF */

           /* With UCP support, we are done. Without UCP support, there is no
-          caseless matching for UTF-8 characters > 127; we can use the bit map
-          for the smaller ones. */
+          caseless matching for UTF characters > 127; we can use the bit map
+          for the smaller ones. As for 16 bit characters without UTF, we
+          can still use the bit map for the part of the range below 256. */

 #ifdef SUPPORT_UCP
-          continue;    /* With next character in the class */
-#else
+#ifndef COMPILE_PCRE8
+          if (utf)
+#endif
+            continue;    /* With next character in the class */
+#endif  /* SUPPORT_UCP */
+
+#if defined SUPPORT_UTF && !defined(SUPPORT_UCP) && !(defined COMPILE_PCRE8)
+          if (utf)
+            {
+            if ((options & PCRE_CASELESS) == 0 || c > 127) continue;
+            /* Adjust upper limit and fall through to set up the map */
+            d = 127;
+            }
+          else
+            {
+            if (c > 255) continue;
+            /* Adjust upper limit and fall through to set up the map */
+            d = 255;
+            }
+#elif defined SUPPORT_UTF && !defined(SUPPORT_UCP)
           if ((options & PCRE_CASELESS) == 0 || c > 127) continue;
-
           /* Adjust upper limit and fall through to set up the map */
-
           d = 127;
-
-#endif  /* SUPPORT_UCP */
+#else
+          if (c > 255) continue;
+          /* Adjust upper limit and fall through to set up the map */
+          d = 255;
+#endif  /* SUPPORT_UTF && !SUPPORT_UCP && !COMPILE_PCRE8 */
           }
-#endif  /* SUPPORT_UTF8 */
+#endif  /* SUPPORT_UTF || !COMPILE_PCRE8 */

-        /* We use the bit map for all cases when not in UTF-8 mode; else
-        ranges that lie entirely within 0-127 when there is UCP support; else
-        for partial ranges without UCP support. */
+        /* We use the bit map for 8 bit mode, or when the characters fall
+        partially or entirely within the [0-255] ([0-127] for UCP) ranges. */

-        class_charcount += d - c + 1;
-        class_lastchar = d;
+        class_has_8bitchar = 1;

         /* We can save a bit of time by skipping this in the pre-compile. */

@@ -4168,7 +4464,7 @@
           classbits[c/8] |= (1 << (c&7));
           if ((options & PCRE_CASELESS) != 0)
             {
-            int uc = cd->fcc[c];           /* flip case */
+            int uc = cd->fcc[c]; /* flip case */
             classbits[uc/8] |= (1 << (uc&7));
             }
           }
@@ -4182,41 +4478,117 @@

       LONE_SINGLE_CHARACTER:

-      /* Handle a character that cannot go in the bit map */
+      /* Only the value of 1 matters for class_single_char. */
+      if (class_single_char < 2) class_single_char++;

-#ifdef SUPPORT_UTF8
-      if (utf8 && (c > 255 || ((options & PCRE_CASELESS) != 0 && c > 127)))
+      /* If class_single_char is 1, we saw precisely one character. As long as
+      there were no negated characters >= 128 and there was no use of \p or \P,
+      in other words, no use of any XCLASS features, we can optimize.
+
+      In UTF-8 mode, we can optimize the negative case only if there were no
+      characters >= 128 because OP_NOT and the related opcodes like OP_NOTSTAR
+      operate on single-byte characters only. This is an historical hangover.
+      Maybe one day we can tidy these opcodes to handle multi-byte characters.
+
+      The optimization throws away the bit map. We turn the item into a
+      1-character OP_CHAR[I] if it's positive, or OP_NOT[I] if it's negative.
+      Note that OP_NOT[I] does not support multibyte characters. In the positive
+      case, it can cause firstchar to be set. Otherwise, there can be no first
+      char if this item is first, whatever repeat count may follow. In the case
+      of reqchar, save the previous value for reinstating. */
+
+#ifdef SUPPORT_UTF
+      if (class_single_char == 1 && ptr[1] == CHAR_RIGHT_SQUARE_BRACKET
+        && (!utf || !negate_class || c < (MAX_VALUE_FOR_SINGLE_CHAR + 1)))
+#else
+      if (class_single_char == 1 && ptr[1] == CHAR_RIGHT_SQUARE_BRACKET)
+#endif
         {
-        class_utf8 = TRUE;
-        *class_utf8data++ = XCL_SINGLE;
-        class_utf8data += _pcre_ord2utf8(c, class_utf8data);
+        ptr++;
+        zeroreqchar = reqchar;

+        /* The OP_NOT[I] opcodes work on single characters only. */
+
+        if (negate_class)
+          {
+          if (firstchar == REQ_UNSET) firstchar = REQ_NONE;
+          zerofirstchar = firstchar;
+          *code++ = ((options & PCRE_CASELESS) != 0)? OP_NOTI: OP_NOT;
+          *code++ = c;
+          goto NOT_CHAR;
+          }
+
+        /* For a single, positive character, get the value into mcbuffer, and
+        then we can handle this with the normal one-character code. */
+
+#ifdef SUPPORT_UTF
+        if (utf && c > MAX_VALUE_FOR_SINGLE_CHAR)
+          mclength = PRIV(ord2utf)(c, mcbuffer);
+        else
+#endif
+          {
+          mcbuffer[0] = c;
+          mclength = 1;
+          }
+        goto ONE_CHAR;
+        }       /* End of 1-char optimization */
+
+      /* Handle a character that cannot go in the bit map. */
+
+#if defined SUPPORT_UTF && !(defined COMPILE_PCRE8)
+      if ((c > 255) || (utf && ((options & PCRE_CASELESS) != 0 && c > 127)))
+#elif defined SUPPORT_UTF
+      if (utf && (c > 255 || ((options & PCRE_CASELESS) != 0 && c > 127)))
+#elif !(defined COMPILE_PCRE8)
+      if (c > 255)
+#endif
+
+#if defined SUPPORT_UTF || !(defined COMPILE_PCRE8)
+        {
+        xclass = TRUE;
+        *class_uchardata++ = XCL_SINGLE;
+#ifdef SUPPORT_UTF
+#ifndef COMPILE_PCRE8
+        /* In non 8 bit mode, we can get here even if we are not in UTF mode. */
+        if (!utf) 
+          *class_uchardata++ = c;
+        else
+#endif
+          class_uchardata += PRIV(ord2utf)(c, class_uchardata);
+#else /* SUPPORT_UTF */
+        *class_uchardata++ = c;
+#endif /* SUPPORT_UTF */
+
 #ifdef SUPPORT_UCP
+#ifdef COMPILE_PCRE8
         if ((options & PCRE_CASELESS) != 0)
+#else
+        /* In non 8 bit mode, we can get here even if we are not in UTF mode. */
+        if (utf && (options & PCRE_CASELESS) != 0)
+#endif
           {
           unsigned int othercase;
           if ((othercase = UCD_OTHERCASE(c)) != c)
             {
-            *class_utf8data++ = XCL_SINGLE;
-            class_utf8data += _pcre_ord2utf8(othercase, class_utf8data);
+            *class_uchardata++ = XCL_SINGLE;
+            class_uchardata += PRIV(ord2utf)(othercase, class_uchardata);
             }
           }
 #endif  /* SUPPORT_UCP */

         }
       else
-#endif  /* SUPPORT_UTF8 */
+#endif  /* SUPPORT_UTF || COMPILE_PCRE16 */

       /* Handle a single-byte character */
         {
+        class_has_8bitchar = 1;
         classbits[c/8] |= (1 << (c&7));
         if ((options & PCRE_CASELESS) != 0)
           {
-          c = cd->fcc[c];   /* flip case */
+          c = cd->fcc[c]; /* flip case */
           classbits[c/8] |= (1 << (c&7));
           }
-        class_charcount++;
-        class_lastchar = c;
         }
       }

@@ -4237,67 +4609,14 @@
       goto FAILED;
       }

-    /* If class_charcount is 1, we saw precisely one character whose value is
-    less than 256. As long as there were no characters >= 128 and there was no
-    use of \p or \P, in other words, no use of any XCLASS features, we can
-    optimize.
+    /* If this is the first thing in the branch, there can be no first char
+    setting, whatever the repeat count. Any reqchar setting must remain
+    unchanged after any kind of repeat. */

-    In UTF-8 mode, we can optimize the negative case only if there were no
-    characters >= 128 because OP_NOT and the related opcodes like OP_NOTSTAR
-    operate on single-bytes characters only. This is an historical hangover.
-    Maybe one day we can tidy these opcodes to handle multi-byte characters.
+    if (firstchar == REQ_UNSET) firstchar = REQ_NONE;
+    zerofirstchar = firstchar;
+    zeroreqchar = reqchar;

-    The optimization throws away the bit map. We turn the item into a
-    1-character OP_CHAR[I] if it's positive, or OP_NOT[I] if it's negative.
-    Note that OP_NOT[I] does not support multibyte characters. In the positive
-    case, it can cause firstbyte to be set. Otherwise, there can be no first
-    char if this item is first, whatever repeat count may follow. In the case
-    of reqbyte, save the previous value for reinstating. */
-
-#ifdef SUPPORT_UTF8
-    if (class_charcount == 1 && !class_utf8 &&
-      (!utf8 || !negate_class || class_lastchar < 128))
-#else
-    if (class_charcount == 1)
-#endif
-      {
-      zeroreqbyte = reqbyte;
-
-      /* The OP_NOT[I] opcodes work on one-byte characters only. */
-
-      if (negate_class)
-        {
-        if (firstbyte == REQ_UNSET) firstbyte = REQ_NONE;
-        zerofirstbyte = firstbyte;
-        *code++ = ((options & PCRE_CASELESS) != 0)? OP_NOTI: OP_NOT;
-        *code++ = class_lastchar;
-        break;
-        }
-
-      /* For a single, positive character, get the value into mcbuffer, and
-      then we can handle this with the normal one-character code. */
-
-#ifdef SUPPORT_UTF8
-      if (utf8 && class_lastchar > 127)
-        mclength = _pcre_ord2utf8(class_lastchar, mcbuffer);
-      else
-#endif
-        {
-        mcbuffer[0] = class_lastchar;
-        mclength = 1;
-        }
-      goto ONE_CHAR;
-      }       /* End of 1-char optimization */
-
-    /* The general case - not the one-char optimization. If this is the first
-    thing in the branch, there can be no first char setting, whatever the
-    repeat count. Any reqbyte setting must remain unchanged after any kind of
-    repeat. */
-
-    if (firstbyte == REQ_UNSET) firstbyte = REQ_NONE;
-    zerofirstbyte = firstbyte;
-    zeroreqbyte = reqbyte;
-
     /* If there are characters with values > 255, we have to compile an
     extended class, with its own opcode, unless there was a negated special
     such as \S in the class, and PCRE_UCP is not set, because in that case all
@@ -4306,29 +4625,34 @@
     be listed) there are no characters < 256, we can omit the bitmap in the
     actual compiled code. */

-#ifdef SUPPORT_UTF8
-    if (class_utf8 && (!should_flip_negation || (options & PCRE_UCP) != 0))
+#ifdef SUPPORT_UTF
+    if (xclass && (!should_flip_negation || (options & PCRE_UCP) != 0))
+#elif !defined COMPILE_PCRE8
+    if (xclass && !should_flip_negation)
+#endif
+#if defined SUPPORT_UTF || !defined COMPILE_PCRE8
       {
-      *class_utf8data++ = XCL_END;    /* Marks the end of extra data */
+      *class_uchardata++ = XCL_END;    /* Marks the end of extra data */
       *code++ = OP_XCLASS;
       code += LINK_SIZE;
-      *code = negate_class? XCL_NOT : 0;
+      *code = negate_class? XCL_NOT:0;

       /* If the map is required, move up the extra data to make room for it;
       otherwise just move the code pointer to the end of the extra data. */

-      if (class_charcount > 0)
+      if (class_has_8bitchar > 0)
         {
         *code++ |= XCL_MAP;
-        memmove(code + 32, code, class_utf8data - code);
+        memmove(code + (32 / sizeof(pcre_uchar)), code,
+          IN_UCHARS(class_uchardata - code));
         memcpy(code, classbits, 32);
-        code = class_utf8data + 32;
+        code = class_uchardata + (32 / sizeof(pcre_uchar));
         }
-      else code = class_utf8data;
+      else code = class_uchardata;

       /* Now fill in the complete length of the item */

-      PUT(previous, 1, code - previous);
+      PUT(previous, 1, (int)(code - previous));
       break;   /* End of class handling */
       }
 #endif
@@ -4340,16 +4664,14 @@
     negating it if necessary. */

     *code++ = (negate_class == should_flip_negation) ? OP_CLASS : OP_NCLASS;
-    if (negate_class)
+    if (lengthptr == NULL)    /* Save time in the pre-compile phase */
       {
-      if (lengthptr == NULL)    /* Save time in the pre-compile phase */
-        for (c = 0; c < 32; c++) code[c] = ~classbits[c];
-      }
-    else
-      {
+      if (negate_class)
+        for (c = 0; c < 32; c++) classbits[c] = ~classbits[c];
       memcpy(code, classbits, 32);
       }
-    code += 32;
+    code += 32 / sizeof(pcre_uchar);
+    NOT_CHAR:
     break;

@@ -4386,8 +4708,8 @@

     if (repeat_min == 0)
       {
-      firstbyte = zerofirstbyte;    /* Adjust for zero repeat */
-      reqbyte = zeroreqbyte;        /* Ditto */
+      firstchar = zerofirstchar;    /* Adjust for zero repeat */
+      reqchar = zeroreqchar;        /* Ditto */
       }

     /* Remember whether this is a variable length repeat */
@@ -4426,10 +4748,10 @@
     past, but it no longer happens for non-repeated recursions. In fact, the
     repeated ones could be re-implemented independently so as not to need this,
     but for the moment we rely on the code for repeating groups. */
-    
+
     if (*previous == OP_RECURSE)
       {
-      memmove(previous + 1 + LINK_SIZE, previous, 1 + LINK_SIZE);
+      memmove(previous + 1 + LINK_SIZE, previous, IN_UCHARS(1 + LINK_SIZE));
       *previous = OP_ONCE;
       PUT(previous, 1, 2 + 2*LINK_SIZE);
       previous[2 + 2*LINK_SIZE] = OP_KET;
@@ -4452,37 +4774,36 @@

     /* If previous was a character match, abolish the item and generate a
     repeat item instead. If a char item has a minimum of more than one, ensure
-    that it is set in reqbyte - it might not be if a sequence such as x{3} is
-    the first thing in a branch because the x will have gone into firstbyte
+    that it is set in reqchar - it might not be if a sequence such as x{3} is
+    the first thing in a branch because the x will have gone into firstchar
     instead.  */

     if (*previous == OP_CHAR || *previous == OP_CHARI)
       {
       op_type = (*previous == OP_CHAR)? 0 : OP_STARI - OP_STAR;

-      /* Deal with UTF-8 characters that take up more than one byte. It's
+      /* Deal with UTF characters that take up more than one code unit. It's
       easier to write this out separately than try to macrify it. Use c to
-      hold the length of the character in bytes, plus 0x80 to flag that it's a
-      length rather than a small character. */
+      hold the length of the character in code units, plus UTF_LENGTH to flag
+      that it's a length rather than a small character. */

-#ifdef SUPPORT_UTF8
-      if (utf8 && (code[-1] & 0x80) != 0)
+#ifdef SUPPORT_UTF
+      if (utf && NOT_FIRSTCHAR(code[-1]))
         {
-        uschar *lastchar = code - 1;
-        while((*lastchar & 0xc0) == 0x80) lastchar--;
-        c = code - lastchar;            /* Length of UTF-8 character */
-        memcpy(utf8_char, lastchar, c); /* Save the char */
-        c |= 0x80;                      /* Flag c as a length */
+        pcre_uchar *lastchar = code - 1;
+        BACKCHAR(lastchar);
+        c = (int)(code - lastchar);     /* Length of UTF character */
+        memcpy(utf_chars, lastchar, IN_UCHARS(c)); /* Save the char */
+        c |= UTF_LENGTH;                /* Flag c as a length */
         }
       else
-#endif
+#endif /* SUPPORT_UTF */

-      /* Handle the case of a single byte - either with no UTF8 support, or
-      with UTF-8 disabled, or for a UTF-8 character < 128. */
-
+      /* Handle the case of a single character - either with no UTF support, or
+      with UTF disabled, or for a UTF character of one code unit. */
         {
         c = code[-1];
-        if (repeat_min > 1) reqbyte = c | req_caseopt | cd->req_varyopt;
+        if (repeat_min > 1) reqchar = c | req_caseopt | cd->req_varyopt;
         }

       /* If the repetition is unlimited, it pays to see if the next thing on
@@ -4492,7 +4813,7 @@

       if (!possessive_quantifier &&
           repeat_max < 0 &&
-          check_auto_possessive(previous, utf8, ptr + 1, options, cd))
+          check_auto_possessive(previous, utf, ptr + 1, options, cd))
         {
         repeat_type = 0;    /* Force greedy */
         possessive_quantifier = TRUE;
@@ -4513,7 +4834,7 @@
       c = previous[1];
       if (!possessive_quantifier &&
           repeat_max < 0 &&
-          check_auto_possessive(previous, utf8, ptr + 1, options, cd))
+          check_auto_possessive(previous, utf, ptr + 1, options, cd))
         {
         repeat_type = 0;    /* Force greedy */
         possessive_quantifier = TRUE;
@@ -4530,14 +4851,14 @@

     else if (*previous < OP_EODN)
       {
-      uschar *oldcode;
+      pcre_uchar *oldcode;
       int prop_type, prop_value;
       op_type = OP_TYPESTAR - OP_STAR;  /* Use type opcodes */
       c = *previous;

       if (!possessive_quantifier &&
           repeat_max < 0 &&
-          check_auto_possessive(previous, utf8, ptr + 1, options, cd))
+          check_auto_possessive(previous, utf, ptr + 1, options, cd))
         {
         repeat_type = 0;    /* Force greedy */
         possessive_quantifier = TRUE;
@@ -4617,14 +4938,14 @@
         we have to insert the character for the previous code. For a repeated
         Unicode property match, there are two extra bytes that define the
         required property. In UTF-8 mode, long characters have their length in
-        c, with the 0x80 bit as a flag. */
+        c, with the UTF_LENGTH bit as a flag. */

         if (repeat_max < 0)
           {
-#ifdef SUPPORT_UTF8
-          if (utf8 && c >= 128)
+#ifdef SUPPORT_UTF
+          if (utf && (c & UTF_LENGTH) != 0)
             {
-            memcpy(code, utf8_char, c & 7);
+            memcpy(code, utf_chars, IN_UCHARS(c & 7));
             code += c & 7;
             }
           else
@@ -4646,10 +4967,10 @@

         else if (repeat_max != repeat_min)
           {
-#ifdef SUPPORT_UTF8
-          if (utf8 && c >= 128)
+#ifdef SUPPORT_UTF
+          if (utf && (c & UTF_LENGTH) != 0)
             {
-            memcpy(code, utf8_char, c & 7);
+            memcpy(code, utf_chars, IN_UCHARS(c & 7));
             code += c & 7;
             }
           else
@@ -4676,10 +4997,10 @@

       /* The character or character type itself comes last in all cases. */

-#ifdef SUPPORT_UTF8
-      if (utf8 && c >= 128)
+#ifdef SUPPORT_UTF
+      if (utf && (c & UTF_LENGTH) != 0)
         {
-        memcpy(code, utf8_char, c & 7);
+        memcpy(code, utf_chars, IN_UCHARS(c & 7));
         code += c & 7;
         }
       else
@@ -4703,7 +5024,7 @@

     else if (*previous == OP_CLASS ||
              *previous == OP_NCLASS ||
-#ifdef SUPPORT_UTF8
+#if defined SUPPORT_UTF || !defined COMPILE_PCRE8
              *previous == OP_XCLASS ||
 #endif
              *previous == OP_REF ||
@@ -4752,8 +5073,8 @@
       {
       register int i;
       int len = (int)(code - previous);
-      uschar *bralink = NULL;
-      uschar *brazeroptr = NULL;
+      pcre_uchar *bralink = NULL;
+      pcre_uchar *brazeroptr = NULL;

       /* Repeating a DEFINE group is pointless, but Perl allows the syntax, so
       we just ignore the repeat. */
@@ -4806,8 +5127,8 @@
         if (repeat_max <= 1)    /* Covers 0, 1, and unlimited */
           {
           *code = OP_END;
-          adjust_recurse(previous, 1, utf8, cd, save_hwm);
-          memmove(previous+1, previous, len);
+          adjust_recurse(previous, 1, utf, cd, save_hwm);
+          memmove(previous + 1, previous, IN_UCHARS(len));
           code++;
           if (repeat_max == 0)
             {
@@ -4830,8 +5151,8 @@
           {
           int offset;
           *code = OP_END;
-          adjust_recurse(previous, 2 + LINK_SIZE, utf8, cd, save_hwm);
-          memmove(previous + 2 + LINK_SIZE, previous, len);
+          adjust_recurse(previous, 2 + LINK_SIZE, utf, cd, save_hwm);
+          memmove(previous + 2 + LINK_SIZE, previous, IN_UCHARS(len));
           code += 2 + LINK_SIZE;
           *previous++ = OP_BRAZERO + repeat_type;
           *previous++ = OP_BRA;
@@ -4877,16 +5198,32 @@
             *lengthptr += delta;
             }

-          /* This is compiling for real */
+          /* This is compiling for real. If there is a set first byte for
+          the group, and we have not yet set a "required byte", set it. Make
+          sure there is enough workspace for copying forward references before
+          doing the copy. */

           else
             {
-            if (groupsetfirstbyte && reqbyte < 0) reqbyte = firstbyte;
+            if (groupsetfirstchar && reqchar < 0) reqchar = firstchar;
+
             for (i = 1; i < repeat_min; i++)
               {
-              uschar *hc;
-              uschar *this_hwm = cd->hwm;
-              memcpy(code, previous, len);
+              pcre_uchar *hc;
+              pcre_uchar *this_hwm = cd->hwm;
+              memcpy(code, previous, IN_UCHARS(len));
+
+              while (cd->hwm > cd->start_workspace + cd->workspace_size -
+                     WORK_SIZE_SAFETY_MARGIN - (this_hwm - save_hwm))
+                {
+                int save_offset = save_hwm - cd->start_workspace;
+                int this_offset = this_hwm - cd->start_workspace;
+                *errorcodeptr = expand_workspace(cd);
+                if (*errorcodeptr != 0) goto FAILED;
+                save_hwm = (pcre_uchar *)cd->start_workspace + save_offset;
+                this_hwm = (pcre_uchar *)cd->start_workspace + this_offset;
+                }
+
               for (hc = save_hwm; hc < this_hwm; hc += LINK_SIZE)
                 {
                 PUT(cd->hwm, 0, GET(hc, 0) + len);
@@ -4936,8 +5273,8 @@

         else for (i = repeat_max - 1; i >= 0; i--)
           {
-          uschar *hc;
-          uschar *this_hwm = cd->hwm;
+          pcre_uchar *hc;
+          pcre_uchar *this_hwm = cd->hwm;

           *code++ = OP_BRAZERO + repeat_type;

@@ -4953,7 +5290,22 @@
             PUTINC(code, 0, offset);
             }

-          memcpy(code, previous, len);
+          memcpy(code, previous, IN_UCHARS(len));
+
+          /* Ensure there is enough workspace for forward references before
+          copying them. */
+
+          while (cd->hwm > cd->start_workspace + cd->workspace_size -
+                 WORK_SIZE_SAFETY_MARGIN - (this_hwm - save_hwm))
+            {
+            int save_offset = save_hwm - cd->start_workspace;
+            int this_offset = this_hwm - cd->start_workspace;
+            *errorcodeptr = expand_workspace(cd);
+            if (*errorcodeptr != 0) goto FAILED;
+            save_hwm = (pcre_uchar *)cd->start_workspace + save_offset;
+            this_hwm = (pcre_uchar *)cd->start_workspace + this_offset;
+            }
+
           for (hc = save_hwm; hc < this_hwm; hc += LINK_SIZE)
             {
             PUT(cd->hwm, 0, GET(hc, 0) + len + ((i != 0)? 2+LINK_SIZE : 1));
@@ -4970,7 +5322,7 @@
           {
           int oldlinkoffset;
           int offset = (int)(code - bralink + 1);
-          uschar *bra = code - offset;
+          pcre_uchar *bra = code - offset;
           oldlinkoffset = GET(bra, 1);
           bralink = (oldlinkoffset == 0)? NULL : bralink - oldlinkoffset;
           *code++ = OP_KET;
@@ -4984,55 +5336,55 @@
       ONCE brackets can be converted into non-capturing brackets, as the
       behaviour of (?:xx)++ is the same as (?>xx)++ and this saves having to
       deal with possessive ONCEs specially.
-      
+
       Otherwise, when we are doing the actual compile phase, check to see
       whether this group is one that could match an empty string. If so,
       convert the initial operator to the S form (e.g. OP_BRA -> OP_SBRA) so
       that runtime checking can be done. [This check is also applied to ONCE
       groups at runtime, but in a different way.]

-      Then, if the quantifier was possessive and the bracket is not a 
+      Then, if the quantifier was possessive and the bracket is not a
       conditional, we convert the BRA code to the POS form, and the KET code to
       KETRPOS. (It turns out to be convenient at runtime to detect this kind of
       subpattern at both the start and at the end.) The use of special opcodes
       makes it possible to reduce greatly the stack usage in pcre_exec(). If
-      the group is preceded by OP_BRAZERO, convert this to OP_BRAPOSZERO. 
-       
+      the group is preceded by OP_BRAZERO, convert this to OP_BRAPOSZERO.
+
       Then, if the minimum number of matches is 1 or 0, cancel the possessive
       flag so that the default action below, of wrapping everything inside
       atomic brackets, does not happen. When the minimum is greater than 1,
-      there will be earlier copies of the group, and so we still have to wrap 
+      there will be earlier copies of the group, and so we still have to wrap
       the whole thing. */

       else
         {
-        uschar *ketcode = code - 1 - LINK_SIZE;
-        uschar *bracode = ketcode - GET(ketcode, 1);
+        pcre_uchar *ketcode = code - 1 - LINK_SIZE;
+        pcre_uchar *bracode = ketcode - GET(ketcode, 1);

         /* Convert possessive ONCE brackets to non-capturing */
-         
+
         if ((*bracode == OP_ONCE || *bracode == OP_ONCE_NC) &&
             possessive_quantifier) *bracode = OP_BRA;

         /* For non-possessive ONCE brackets, all we need to do is to
         set the KET. */
-          
+
         if (*bracode == OP_ONCE || *bracode == OP_ONCE_NC)
           *ketcode = OP_KETRMAX + repeat_type;
-        
+
         /* Handle non-ONCE brackets and possessive ONCEs (which have been
-        converted to non-capturing above). */ 
-   
+        converted to non-capturing above). */
+
         else
           {
           /* In the compile phase, check for empty string matching. */
-             
+
           if (lengthptr == NULL)
             {
-            uschar *scode = bracode;
+            pcre_uchar *scode = bracode;
             do
               {
-              if (could_be_empty_branch(scode, ketcode, utf8, cd))
+              if (could_be_empty_branch(scode, ketcode, utf, cd))
                 {
                 *bracode += OP_SBRA - OP_BRA;
                 break;
@@ -5041,7 +5393,7 @@
               }
             while (*scode == OP_ALT);
             }
-          
+
           /* Handle possessive quantifiers. */

           if (possessive_quantifier)
@@ -5050,38 +5402,38 @@
             repeated non-capturing bracket, because we have not invented POS
             versions of the COND opcodes. Because we are moving code along, we
             must ensure that any pending recursive references are updated. */
-   
+
             if (*bracode == OP_COND || *bracode == OP_SCOND)
               {
               int nlen = (int)(code - bracode);
               *code = OP_END;
-              adjust_recurse(bracode, 1 + LINK_SIZE, utf8, cd, save_hwm);
-              memmove(bracode + 1+LINK_SIZE, bracode, nlen);
+              adjust_recurse(bracode, 1 + LINK_SIZE, utf, cd, save_hwm);
+              memmove(bracode + 1 + LINK_SIZE, bracode, IN_UCHARS(nlen));
               code += 1 + LINK_SIZE;
               nlen += 1 + LINK_SIZE;
               *bracode = OP_BRAPOS;
               *code++ = OP_KETRPOS;
               PUTINC(code, 0, nlen);
               PUT(bracode, 1, nlen);
-              }  
- 
+              }
+
             /* For non-COND brackets, we modify the BRA code and use KETRPOS. */
-             
-            else 
+
+            else
               {
               *bracode += 1;              /* Switch to xxxPOS opcodes */
               *ketcode = OP_KETRPOS;
               }
-            
-            /* If the minimum is zero, mark it as possessive, then unset the 
+
+            /* If the minimum is zero, mark it as possessive, then unset the
             possessive flag when the minimum is 0 or 1. */
-             
+
             if (brazeroptr != NULL) *brazeroptr = OP_BRAPOSZERO;
             if (repeat_min < 2) possessive_quantifier = FALSE;
             }
-            
+
           /* Non-possessive quantifier */
-           
+
           else *ketcode = OP_KETRMAX + repeat_type;
           }
         }
@@ -5125,15 +5477,16 @@
       int len;

       if (*tempcode == OP_TYPEEXACT)
-        tempcode += _pcre_OP_lengths[*tempcode] +
-          ((tempcode[3] == OP_PROP || tempcode[3] == OP_NOTPROP)? 2 : 0);
+        tempcode += PRIV(OP_lengths)[*tempcode] +
+          ((tempcode[1 + IMM2_SIZE] == OP_PROP
+          || tempcode[1 + IMM2_SIZE] == OP_NOTPROP)? 2 : 0);

       else if (*tempcode == OP_EXACT || *tempcode == OP_NOTEXACT)
         {
-        tempcode += _pcre_OP_lengths[*tempcode];
-#ifdef SUPPORT_UTF8
-        if (utf8 && tempcode[-1] >= 0xc0)
-          tempcode += _pcre_utf8_table4[tempcode[-1] & 0x3f];
+        tempcode += PRIV(OP_lengths)[*tempcode];
+#ifdef SUPPORT_UTF
+        if (utf && HAS_EXTRALEN(tempcode[-1]))
+          tempcode += GET_EXTRALEN(tempcode[-1]);
 #endif
         }

@@ -5170,8 +5523,8 @@

         default:
         *code = OP_END;
-        adjust_recurse(tempcode, 1 + LINK_SIZE, utf8, cd, save_hwm);
-        memmove(tempcode + 1+LINK_SIZE, tempcode, len);
+        adjust_recurse(tempcode, 1 + LINK_SIZE, utf, cd, save_hwm);
+        memmove(tempcode + 1 + LINK_SIZE, tempcode, IN_UCHARS(len));
         code += 1 + LINK_SIZE;
         len += 1 + LINK_SIZE;
         tempcode[0] = OP_ONCE;
@@ -5183,7 +5536,7 @@
       }

     /* In all cases we no longer have a previous item. We also set the
-    "follows varying string" flag for subsequently encountered reqbytes if
+    "follows varying string" flag for subsequently encountered reqchars if
     it isn't already set and we have just passed a varying length item. */

     END_REPEAT:
@@ -5206,16 +5559,18 @@

     /* First deal with various "verbs" that can be introduced by '*'. */

-    if (*(++ptr) == CHAR_ASTERISK &&
-         ((cd->ctypes[ptr[1]] & ctype_letter) != 0 || ptr[1] == ':'))
+    ptr++;
+    if (ptr[0] == CHAR_ASTERISK && (ptr[1] == ':'
+         || (MAX_255(ptr[1]) && ((cd->ctypes[ptr[1]] & ctype_letter) != 0))))
       {
       int i, namelen;
       int arglen = 0;
       const char *vn = verbnames;
-      const uschar *name = ptr + 1;
-      const uschar *arg = NULL;
+      const pcre_uchar *name = ptr + 1;
+      const pcre_uchar *arg = NULL;
       previous = NULL;
-      while ((cd->ctypes[*++ptr] & ctype_letter) != 0) {};
+      ptr++;
+      while (MAX_255(*ptr) && (cd->ctypes[*ptr] & ctype_letter) != 0) ptr++;
       namelen = (int)(ptr - name);

       /* It appears that Perl allows any characters whatsoever, other than
@@ -5240,7 +5595,7 @@
       for (i = 0; i < verbcount; i++)
         {
         if (namelen == verbs[i].len &&
-            strncmp((char *)name, vn, namelen) == 0)
+            STRNCMP_UC_C8(name, vn, namelen) == 0)
           {
           /* Check for open captures before ACCEPT and convert it to
           ASSERT_ACCEPT if in an assertion. */
@@ -5261,8 +5616,8 @@
               }
             *code++ = (cd->assert_depth > 0)? OP_ASSERT_ACCEPT : OP_ACCEPT;

-            /* Do not set firstbyte after *ACCEPT */
-            if (firstbyte == REQ_UNSET) firstbyte = REQ_NONE;
+            /* Do not set firstchar after *ACCEPT */
+            if (firstchar == REQ_UNSET) firstchar = REQ_NONE;
             }

           /* Handle other cases with/without an argument */
@@ -5288,7 +5643,7 @@
             *code = verbs[i].op_arg;
             if (*code++ == OP_THEN_ARG) cd->external_flags |= PCRE_HASTHEN;
             *code++ = arglen;
-            memcpy(code, arg, arglen);
+            memcpy(code, arg, IN_UCHARS(arglen));
             code += arglen;
             *code++ = 0;
             }
@@ -5311,8 +5666,8 @@
       {
       int i, set, unset, namelen;
       int *optset;
-      const uschar *name;
-      uschar *slot;
+      const pcre_uchar *name;
+      pcre_uchar *slot;

       switch (*(++ptr))
         {
@@ -5365,10 +5720,10 @@
           break;

         /* Most other conditions use OP_CREF (a couple change to OP_RREF
-        below), and all need to skip 3 bytes at the start of the group. */
+        below), and all need to skip 1+IMM2_SIZE bytes at the start of the group. */

         code[1+LINK_SIZE] = OP_CREF;
-        skipbytes = 3;
+        skipbytes = 1+IMM2_SIZE;
         refsign = -1;

         /* Check for a test for recursion in a named group. */
@@ -5401,7 +5756,7 @@

         /* We now expect to read a name; any thing else is an error */

-        if ((cd->ctypes[ptr[1]] & ctype_word) == 0)
+        if (!MAX_255(ptr[1]) || (cd->ctypes[ptr[1]] & ctype_word) == 0)
           {
           ptr += 1;  /* To get the right offset */
           *errorcodeptr = ERR28;
@@ -5412,11 +5767,10 @@

         recno = 0;
         name = ++ptr;
-        while ((cd->ctypes[*ptr] & ctype_word) != 0)
+        while (MAX_255(*ptr) && (cd->ctypes[*ptr] & ctype_word) != 0)
           {
           if (recno >= 0)
-            recno = ((digitab[*ptr] & ctype_digit) != 0)?
-              recno * 10 + *ptr - CHAR_0 : -1;
+            recno = (IS_DIGIT(*ptr))? recno * 10 + *ptr - CHAR_0 : -1;
           ptr++;
           }
         namelen = (int)(ptr - name);
@@ -5464,7 +5818,7 @@
         slot = cd->name_table;
         for (i = 0; i < cd->names_found; i++)
           {
-          if (strncmp((char *)name, (char *)slot+2, namelen) == 0) break;
+          if (STRNCMP_UC_UC(name, slot+IMM2_SIZE, namelen) == 0) break;
           slot += cd->name_entry_size;
           }

@@ -5480,7 +5834,7 @@
         /* Search the pattern for a forward reference */

         else if ((i = find_parens(cd, name, namelen,
-                        (options & PCRE_EXTENDED) != 0, utf8)) > 0)
+                        (options & PCRE_EXTENDED) != 0, utf)) > 0)
           {
           PUT2(code, 2+LINK_SIZE, i);
           code[1+LINK_SIZE]++;
@@ -5506,7 +5860,7 @@
           recno = 0;
           for (i = 1; i < namelen; i++)
             {
-            if ((digitab[name[i]] & ctype_digit) == 0)
+            if (!IS_DIGIT(name[i]))
               {
               *errorcodeptr = ERR15;
               goto FAILED;
@@ -5521,7 +5875,7 @@
         /* Similarly, check for the (?(DEFINE) "condition", which is always
         false. */

-        else if (namelen == 6 && strncmp((char *)name, STRING_DEFINE, 6) == 0)
+        else if (namelen == 6 && STRNCMP_UC_C8(name, STRING_DEFINE, 6) == 0)
           {
           code[1+LINK_SIZE] = OP_DEF;
           skipbytes = 1;
@@ -5584,7 +5938,8 @@
           break;

           default:                /* Could be name define, else bad */
-          if ((cd->ctypes[ptr[1]] & ctype_word) != 0) goto DEFINE_NAME;
+          if (MAX_255(ptr[1]) && (cd->ctypes[ptr[1]] & ctype_word) != 0)
+            goto DEFINE_NAME;
           ptr++;                  /* Correct offset for error */
           *errorcodeptr = ERR24;
           goto FAILED;
@@ -5606,8 +5961,9 @@
         *code++ = OP_CALLOUT;
           {
           int n = 0;
-          while ((digitab[*(++ptr)] & ctype_digit) != 0)
-            n = n * 10 + *ptr - CHAR_0;
+          ptr++;
+          while(IS_DIGIT(*ptr))
+            n = n * 10 + *ptr++ - CHAR_0;
           if (*ptr != CHAR_RIGHT_PARENTHESIS)
             {
             *errorcodeptr = ERR39;
@@ -5652,7 +6008,7 @@
             CHAR_GREATER_THAN_SIGN : CHAR_APOSTROPHE;
           name = ++ptr;

-          while ((cd->ctypes[*ptr] & ctype_word) != 0) ptr++;
+          while (MAX_255(*ptr) && (cd->ctypes[*ptr] & ctype_word) != 0) ptr++;
           namelen = (int)(ptr - name);

           /* In the pre-compile phase, just do a syntax check. */
@@ -5669,9 +6025,9 @@
               *errorcodeptr = ERR49;
               goto FAILED;
               }
-            if (namelen + 3 > cd->name_entry_size)
+            if (namelen + IMM2_SIZE + 1 > cd->name_entry_size)
               {
-              cd->name_entry_size = namelen + 3;
+              cd->name_entry_size = namelen + IMM2_SIZE + 1;
               if (namelen > MAX_NAME_SIZE)
                 {
                 *errorcodeptr = ERR48;
@@ -5700,10 +6056,10 @@

             for (i = 0; i < cd->names_found; i++)
               {
-              int crc = memcmp(name, slot+2, namelen);
+              int crc = memcmp(name, slot+IMM2_SIZE, IN_UCHARS(namelen));
               if (crc == 0)
                 {
-                if (slot[2+namelen] == 0)
+                if (slot[IMM2_SIZE+namelen] == 0)
                   {
                   if (GET2(slot, 0) != cd->bracount + 1 &&
                       (options & PCRE_DUPNAMES) == 0)
@@ -5724,7 +6080,7 @@
               if (crc < 0)
                 {
                 memmove(slot + cd->name_entry_size, slot,
-                  (cd->names_found - i) * cd->name_entry_size);
+                  IN_UCHARS((cd->names_found - i) * cd->name_entry_size));
                 break;
                 }

@@ -5738,7 +6094,7 @@

             if (!dupname)
               {
-              uschar *cslot = cd->name_table;
+              pcre_uchar *cslot = cd->name_table;
               for (i = 0; i < cd->names_found; i++)
                 {
                 if (cslot != slot)
@@ -5755,8 +6111,8 @@
               }

             PUT2(slot, 0, cd->bracount + 1);
-            memcpy(slot + 2, name, namelen);
-            slot[2+namelen] = 0;
+            memcpy(slot + IMM2_SIZE, name, IN_UCHARS(namelen));
+            slot[IMM2_SIZE + namelen] = 0;
             }
           }

@@ -5782,7 +6138,7 @@

         NAMED_REF_OR_RECURSE:
         name = ++ptr;
-        while ((cd->ctypes[*ptr] & ctype_word) != 0) ptr++;
+        while (MAX_255(*ptr) && (cd->ctypes[*ptr] & ctype_word) != 0) ptr++;
         namelen = (int)(ptr - name);

         /* In the pre-compile phase, do a syntax check. We used to just set
@@ -5794,7 +6150,7 @@

         if (lengthptr != NULL)
           {
-          const uschar *temp;
+          const pcre_uchar *temp;

           if (namelen == 0)
             {
@@ -5824,7 +6180,7 @@
           temp = cd->end_pattern;
           cd->end_pattern = ptr;
           recno = find_parens(cd, name, namelen,
-            (options & PCRE_EXTENDED) != 0, utf8);
+            (options & PCRE_EXTENDED) != 0, utf);
           cd->end_pattern = temp;
           if (recno < 0) recno = 0;    /* Forward ref; set dummy number */
           }
@@ -5839,8 +6195,8 @@
           slot = cd->name_table;
           for (i = 0; i < cd->names_found; i++)
             {
-            if (strncmp((char *)name, (char *)slot+2, namelen) == 0 &&
-                slot[2+namelen] == 0)
+            if (STRNCMP_UC_UC(name, slot+IMM2_SIZE, namelen) == 0 &&
+                slot[IMM2_SIZE+namelen] == 0)
               break;
             slot += cd->name_entry_size;
             }
@@ -5851,7 +6207,7 @@
             }
           else if ((recno =                /* Forward back reference */
                     find_parens(cd, name, namelen,
-                      (options & PCRE_EXTENDED) != 0, utf8)) <= 0)
+                      (options & PCRE_EXTENDED) != 0, utf)) <= 0)
             {
             *errorcodeptr = ERR15;
             goto FAILED;
@@ -5876,7 +6232,7 @@
         case CHAR_0: case CHAR_1: case CHAR_2: case CHAR_3: case CHAR_4:
         case CHAR_5: case CHAR_6: case CHAR_7: case CHAR_8: case CHAR_9:
           {
-          const uschar *called;
+          const pcre_uchar *called;
           terminator = CHAR_RIGHT_PARENTHESIS;

           /* Come here from the \g<...> and \g'...' code (Oniguruma
@@ -5890,7 +6246,7 @@
           if ((refsign = *ptr) == CHAR_PLUS)
             {
             ptr++;
-            if ((digitab[*ptr] & ctype_digit) == 0)
+            if (!IS_DIGIT(*ptr))
               {
               *errorcodeptr = ERR63;
               goto FAILED;
@@ -5898,13 +6254,13 @@
             }
           else if (refsign == CHAR_MINUS)
             {
-            if ((digitab[ptr[1]] & ctype_digit) == 0)
+            if (!IS_DIGIT(ptr[1]))
               goto OTHER_CHAR_AFTER_QUERY;
             ptr++;
             }

           recno = 0;
-          while((digitab[*ptr] & ctype_digit) != 0)
+          while(IS_DIGIT(*ptr))
             recno = recno * 10 + *ptr++ - CHAR_0;

           if (*ptr != terminator)
@@ -5955,14 +6311,14 @@
             {
             *code = OP_END;
             if (recno != 0)
-              called = _pcre_find_bracket(cd->start_code, utf8, recno);
+              called = PRIV(find_bracket)(cd->start_code, utf, recno);

             /* Forward reference */

             if (called == NULL)
               {
               if (find_parens(cd, NULL, recno,
-                    (options & PCRE_EXTENDED) != 0, utf8) < 0)
+                    (options & PCRE_EXTENDED) != 0, utf) < 0)
                 {
                 *errorcodeptr = ERR15;
                 goto FAILED;
@@ -5973,6 +6329,12 @@
               of the group. Then remember the forward reference. */

               called = cd->start_code + recno;
+              if (cd->hwm >= cd->start_workspace + cd->workspace_size -
+                  WORK_SIZE_SAFETY_MARGIN)
+                {
+                *errorcodeptr = expand_workspace(cd);
+                if (*errorcodeptr != 0) goto FAILED;
+                }
               PUTINC(cd->hwm, 0, (int)(code + 1 - cd->start_code));
               }

@@ -5986,23 +6348,26 @@
             conditional subpatterns will be picked up then. */

             else if (GET(called, 1) == 0 && cond_depth <= 0 &&
-                     could_be_empty(called, code, bcptr, utf8, cd))
+                     could_be_empty(called, code, bcptr, utf, cd))
               {
               *errorcodeptr = ERR40;
               goto FAILED;
               }
             }

-          /* Insert the recursion/subroutine item. */
+          /* Insert the recursion/subroutine item. It does not have a set first
+          character (relevant if it is repeated, because it will then be
+          wrapped with ONCE brackets). */

           *code = OP_RECURSE;
           PUT(code, 1, (int)(called - cd->start_code));
           code += 1 + LINK_SIZE;
+          groupsetfirstchar = FALSE;
           }

         /* Can't determine a first byte now */

-        if (firstbyte == REQ_UNSET) firstbyte = REQ_NONE;
+        if (firstchar == REQ_UNSET) firstchar = REQ_NONE;
         continue;

@@ -6059,7 +6424,7 @@
         both phases.

         If we are not at the pattern start, reset the greedy defaults and the
-        case value for firstbyte and reqbyte. */
+        case value for firstchar and reqchar. */

         if (*ptr == CHAR_RIGHT_PARENTHESIS)
           {
@@ -6072,7 +6437,7 @@
             {
             greedy_default = ((newoptions & PCRE_UNGREEDY) != 0);
             greedy_non_default = greedy_default ^ 1;
-            req_caseopt = ((newoptions & PCRE_CASELESS) != 0)? REQ_CASELESS : 0;
+            req_caseopt = ((newoptions & PCRE_CASELESS) != 0)? REQ_CASELESS:0;
             }

           /* Change options at this level, and pass them back for use
@@ -6109,7 +6474,7 @@
       NUMBERED_GROUP:
       cd->bracount += 1;
       PUT2(code, 1+LINK_SIZE, cd->bracount);
-      skipbytes = 2;
+      skipbytes = IMM2_SIZE;
       }

     /* Process nested bracketed regex. Assertions used not to be repeatable,
@@ -6135,8 +6500,8 @@
          skipbytes,                       /* Skip over bracket number */
          cond_depth +
            ((bravalue == OP_COND)?1:0),   /* Depth of condition subpatterns */
-         &subfirstbyte,                   /* For possible first char */
-         &subreqbyte,                     /* For possible last char */
+         &subfirstchar,                   /* For possible first char */
+         &subreqchar,                     /* For possible last char */
          bcptr,                           /* Current branch chain */
          cd,                              /* Tables block */
          (lengthptr == NULL)? NULL :      /* Actual compile phase */
@@ -6164,7 +6529,7 @@

     if (bravalue == OP_COND && lengthptr == NULL)
       {
-      uschar *tc = code;
+      pcre_uchar *tc = code;
       int condcount = 0;

       do {
@@ -6187,7 +6552,7 @@
         }

       /* A "normal" conditional group. If there is just one branch, we must not
-      make use of its firstbyte or reqbyte, because this is equivalent to an
+      make use of its firstchar or reqchar, because this is equivalent to an
       empty second branch. */

       else
@@ -6197,7 +6562,7 @@
           *errorcodeptr = ERR27;
           goto FAILED;
           }
-        if (condcount == 1) subfirstbyte = subreqbyte = REQ_NONE;
+        if (condcount == 1) subfirstchar = subreqchar = REQ_NONE;
         }
       }

@@ -6241,55 +6606,55 @@
     /* Handle updating of the required and first characters for other types of
     group. Update for normal brackets of all kinds, and conditions with two
     branches (see code above). If the bracket is followed by a quantifier with
-    zero repeat, we have to back off. Hence the definition of zeroreqbyte and
-    zerofirstbyte outside the main loop so that they can be accessed for the
+    zero repeat, we have to back off. Hence the definition of zeroreqchar and
+    zerofirstchar outside the main loop so that they can be accessed for the
     back off. */

-    zeroreqbyte = reqbyte;
-    zerofirstbyte = firstbyte;
-    groupsetfirstbyte = FALSE;
+    zeroreqchar = reqchar;
+    zerofirstchar = firstchar;
+    groupsetfirstchar = FALSE;

     if (bravalue >= OP_ONCE)
       {
-      /* If we have not yet set a firstbyte in this branch, take it from the
+      /* If we have not yet set a firstchar in this branch, take it from the
       subpattern, remembering that it was set here so that a repeat of more
-      than one can replicate it as reqbyte if necessary. If the subpattern has
-      no firstbyte, set "none" for the whole branch. In both cases, a zero
-      repeat forces firstbyte to "none". */
+      than one can replicate it as reqchar if necessary. If the subpattern has
+      no firstchar, set "none" for the whole branch. In both cases, a zero
+      repeat forces firstchar to "none". */

-      if (firstbyte == REQ_UNSET)
+      if (firstchar == REQ_UNSET)
         {
-        if (subfirstbyte >= 0)
+        if (subfirstchar >= 0)
           {
-          firstbyte = subfirstbyte;
-          groupsetfirstbyte = TRUE;
+          firstchar = subfirstchar;
+          groupsetfirstchar = TRUE;
           }
-        else firstbyte = REQ_NONE;
-        zerofirstbyte = REQ_NONE;
+        else firstchar = REQ_NONE;
+        zerofirstchar = REQ_NONE;
         }

-      /* If firstbyte was previously set, convert the subpattern's firstbyte
-      into reqbyte if there wasn't one, using the vary flag that was in
+      /* If firstchar was previously set, convert the subpattern's firstchar
+      into reqchar if there wasn't one, using the vary flag that was in
       existence beforehand. */

-      else if (subfirstbyte >= 0 && subreqbyte < 0)
-        subreqbyte = subfirstbyte | tempreqvary;
+      else if (subfirstchar >= 0 && subreqchar < 0)
+        subreqchar = subfirstchar | tempreqvary;

       /* If the subpattern set a required byte (or set a first byte that isn't
       really the first byte - see above), set it. */

-      if (subreqbyte >= 0) reqbyte = subreqbyte;
+      if (subreqchar >= 0) reqchar = subreqchar;
       }

-    /* For a forward assertion, we take the reqbyte, if set. This can be
+    /* For a forward assertion, we take the reqchar, if set. This can be
     helpful if the pattern that follows the assertion doesn't set a different
-    char. For example, it's useful for /(?=abcde).+/. We can't set firstbyte
+    char. For example, it's useful for /(?=abcde).+/. We can't set firstchar
     for an assertion, however because it leads to incorrect effect for patterns
-    such as /(?=a)a.+/ when the "real" "a" would then become a reqbyte instead
-    of a firstbyte. This is overcome by a scan at the end if there's no
-    firstbyte, looking for an asserted first char. */
+    such as /(?=a)a.+/ when the "real" "a" would then become a reqchar instead
+    of a firstchar. This is overcome by a scan at the end if there's no
+    firstchar, looking for an asserted first char. */

-    else if (bravalue == OP_ASSERT && subreqbyte >= 0) reqbyte = subreqbyte;
+    else if (bravalue == OP_ASSERT && subreqchar >= 0) reqchar = subreqchar;
     break;     /* End of processing '(' */

@@ -6322,13 +6687,13 @@
       /* For metasequences that actually match a character, we disable the
       setting of a first character if it hasn't already been set. */

-      if (firstbyte == REQ_UNSET && -c > ESC_b && -c < ESC_Z)
-        firstbyte = REQ_NONE;
+      if (firstchar == REQ_UNSET && -c > ESC_b && -c < ESC_Z)
+        firstchar = REQ_NONE;

       /* Set values to reset to if this is followed by a zero repeat. */

-      zerofirstbyte = firstbyte;
-      zeroreqbyte = reqbyte;
+      zerofirstchar = firstchar;
+      zeroreqchar = reqchar;

       /* \g<name> or \g'name' is a subroutine call by name and \g<n> or \g'n'
       is a subroutine call by number (Oniguruma syntax). In fact, the value
@@ -6339,7 +6704,7 @@

       if (-c == ESC_g)
         {
-        const uschar *p;
+        const pcre_uchar *p;
         save_hwm = cd->hwm;   /* Normally this is set when '(' is read */
         terminator = (*(++ptr) == CHAR_LESS_THAN_SIGN)?
           CHAR_GREATER_THAN_SIGN : CHAR_APOSTROPHE;
@@ -6356,10 +6721,11 @@

         if (ptr[1] != CHAR_PLUS && ptr[1] != CHAR_MINUS)
           {
-          BOOL isnumber = TRUE;
+          BOOL is_a_number = TRUE;
           for (p = ptr + 1; *p != 0 && *p != terminator; p++)
             {
-            if ((cd->ctypes[*p] & ctype_digit) == 0) isnumber = FALSE;
+            if (!MAX_255(*p)) { is_a_number = FALSE; break; }
+            if ((cd->ctypes[*p] & ctype_digit) == 0) is_a_number = FALSE;
             if ((cd->ctypes[*p] & ctype_word) == 0) break;
             }
           if (*p != terminator)
@@ -6367,7 +6733,7 @@
             *errorcodeptr = ERR57;
             break;
             }
-          if (isnumber)
+          if (is_a_number)
             {
             ptr++;
             goto HANDLE_NUMERICAL_RECURSION;
@@ -6379,7 +6745,7 @@
         /* Test a signed number in angle brackets or quotes. */

         p = ptr + 2;
-        while ((digitab[*p] & ctype_digit) != 0) p++;
+        while (IS_DIGIT(*p)) p++;
         if (*p != terminator)
           {
           *errorcodeptr = ERR57;
@@ -6407,7 +6773,7 @@
         goto NAMED_REF_OR_RECURSE;
         }

-      /* Back references are handled specially; must disable firstbyte if
+      /* Back references are handled specially; must disable firstchar if
       not set to cope with cases like (?=(\w+))\1: which would otherwise set
       ':' later. */

@@ -6417,7 +6783,7 @@
         recno = -c - ESC_REF;

         HANDLE_REFERENCE:    /* Come here from named backref handling */
-        if (firstbyte == REQ_UNSET) firstbyte = REQ_NONE;
+        if (firstchar == REQ_UNSET) firstchar = REQ_NONE;
         previous = code;
         *code++ = ((options & PCRE_CASELESS) != 0)? OP_REFI : OP_REF;
         PUT2INC(code, 0, recno);
@@ -6481,10 +6847,10 @@
 #endif
         /* In non-UTF-8 mode, we turn \C into OP_ALLANY instead of OP_ANYBYTE
         so that it works in DFA mode and in lookbehinds. */
-         
-          {  
+
+          {
           previous = (-c > ESC_b && -c < ESC_Z)? code : NULL;
-          *code++ = (!utf8 && c == -ESC_C)? OP_ALLANY : -c;
+          *code++ = (!utf && c == -ESC_C)? OP_ALLANY : -c;
           }
         }
       continue;
@@ -6494,9 +6860,9 @@
     a value > 127. We set its representation in the length/buffer, and then
     handle it as a data character. */

-#ifdef SUPPORT_UTF8
-    if (utf8 && c > 127)
-      mclength = _pcre_ord2utf8(c, mcbuffer);
+#ifdef SUPPORT_UTF
+    if (utf && c > MAX_VALUE_FOR_SINGLE_CHAR)
+      mclength = PRIV(ord2utf)(c, mcbuffer);
     else
 #endif

@@ -6517,12 +6883,9 @@
     mclength = 1;
     mcbuffer[0] = c;

-#ifdef SUPPORT_UTF8
-    if (utf8 && c >= 0xc0)
-      {
-      while ((ptr[1] & 0xc0) == 0x80)
-        mcbuffer[mclength++] = *(++ptr);
-      }
+#ifdef SUPPORT_UTF
+    if (utf && HAS_EXTRALEN(c))
+      ACROSSCHAR(TRUE, ptr[1], mcbuffer[mclength++] = *(++ptr));
 #endif

     /* At this point we have the character's bytes in mcbuffer, and the length
@@ -6540,34 +6903,34 @@

     /* Set the first and required bytes appropriately. If no previous first
     byte, set it from this character, but revert to none on a zero repeat.
-    Otherwise, leave the firstbyte value alone, and don't change it on a zero
+    Otherwise, leave the firstchar value alone, and don't change it on a zero
     repeat. */

-    if (firstbyte == REQ_UNSET)
+    if (firstchar == REQ_UNSET)
       {
-      zerofirstbyte = REQ_NONE;
-      zeroreqbyte = reqbyte;
+      zerofirstchar = REQ_NONE;
+      zeroreqchar = reqchar;

-      /* If the character is more than one byte long, we can set firstbyte
+      /* If the character is more than one byte long, we can set firstchar
       only if it is not to be matched caselessly. */

       if (mclength == 1 || req_caseopt == 0)
         {
-        firstbyte = mcbuffer[0] | req_caseopt;
-        if (mclength != 1) reqbyte = code[-1] | cd->req_varyopt;
+        firstchar = mcbuffer[0] | req_caseopt;
+        if (mclength != 1) reqchar = code[-1] | cd->req_varyopt;
         }
-      else firstbyte = reqbyte = REQ_NONE;
+      else firstchar = reqchar = REQ_NONE;
       }

-    /* firstbyte was previously set; we can set reqbyte only if the length is
+    /* firstchar was previously set; we can set reqchar only if the length is
     1 or the matching is caseful. */

     else
       {
-      zerofirstbyte = firstbyte;
-      zeroreqbyte = reqbyte;
+      zerofirstchar = firstchar;
+      zeroreqchar = reqchar;
       if (mclength == 1 || req_caseopt == 0)
-        reqbyte = code[-1] | req_caseopt | cd->req_varyopt;
+        reqchar = code[-1] | req_caseopt | cd->req_varyopt;
       }

     break;            /* End of literal character handling */
@@ -6607,8 +6970,8 @@
   reset_bracount TRUE to reset the count for each branch
   skipbytes      skip this many bytes at start (for brackets and OP_COND)
   cond_depth     depth of nesting for conditional subpatterns
-  firstbyteptr   place to put the first required character, or a negative number
-  reqbyteptr     place to put the last required character, or a negative number
+  firstcharptr   place to put the first required character, or a negative number
+  reqcharptr     place to put the last required character, or a negative number
   bcptr          pointer to the chain of currently open branches
   cd             points to the data block with tables pointers etc.
   lengthptr      NULL during the real compile phase
@@ -6618,20 +6981,20 @@
 */

 static BOOL
-compile_regex(int options, uschar **codeptr, const uschar **ptrptr,
+compile_regex(int options, pcre_uchar **codeptr, const pcre_uchar **ptrptr,
   int *errorcodeptr, BOOL lookbehind, BOOL reset_bracount, int skipbytes,
-  int cond_depth, int *firstbyteptr, int *reqbyteptr, branch_chain *bcptr,
-  compile_data *cd, int *lengthptr)
+  int cond_depth, pcre_int32 *firstcharptr, pcre_int32 *reqcharptr,
+  branch_chain *bcptr, compile_data *cd, int *lengthptr)
 {
-const uschar *ptr = *ptrptr;
-uschar *code = *codeptr;
-uschar *last_branch = code;
-uschar *start_bracket = code;
-uschar *reverse_count = NULL;
+const pcre_uchar *ptr = *ptrptr;
+pcre_uchar *code = *codeptr;
+pcre_uchar *last_branch = code;
+pcre_uchar *start_bracket = code;
+pcre_uchar *reverse_count = NULL;
 open_capitem capitem;
 int capnumber = 0;
-int firstbyte, reqbyte;
-int branchfirstbyte, branchreqbyte;
+pcre_int32 firstchar, reqchar;
+pcre_int32 branchfirstchar, branchreqchar;
 int length;
 int orig_bracount;
 int max_bracount;
@@ -6640,7 +7003,7 @@
 bc.outer = bcptr;
 bc.current_branch = code;

-firstbyte = reqbyte = REQ_UNSET;
+firstchar = reqchar = REQ_UNSET;

 /* Accumulate the length for use in the pre-compile phase. Start with the
 length of the BRA and KET and any extra bytes that are required at the
@@ -6699,8 +7062,8 @@
   /* Now compile the branch; in the pre-compile phase its length gets added
   into the length. */

-  if (!compile_branch(&options, &code, &ptr, errorcodeptr, &branchfirstbyte,
-        &branchreqbyte, &bc, cond_depth, cd,
+  if (!compile_branch(&options, &code, &ptr, errorcodeptr, &branchfirstchar,
+        &branchreqchar, &bc, cond_depth, cd,
         (lengthptr == NULL)? NULL : &length))
     {
     *ptrptr = ptr;
@@ -6716,43 +7079,43 @@

   if (lengthptr == NULL)
     {
-    /* If this is the first branch, the firstbyte and reqbyte values for the
+    /* If this is the first branch, the firstchar and reqchar values for the
     branch become the values for the regex. */

     if (*last_branch != OP_ALT)
       {
-      firstbyte = branchfirstbyte;
-      reqbyte = branchreqbyte;
+      firstchar = branchfirstchar;
+      reqchar = branchreqchar;
       }

-    /* If this is not the first branch, the first char and reqbyte have to
+    /* If this is not the first branch, the first char and reqchar have to
     match the values from all the previous branches, except that if the
-    previous value for reqbyte didn't have REQ_VARY set, it can still match,
+    previous value for reqchar didn't have REQ_VARY set, it can still match,
     and we set REQ_VARY for the regex. */

     else
       {
-      /* If we previously had a firstbyte, but it doesn't match the new branch,
-      we have to abandon the firstbyte for the regex, but if there was
-      previously no reqbyte, it takes on the value of the old firstbyte. */
+      /* If we previously had a firstchar, but it doesn't match the new branch,
+      we have to abandon the firstchar for the regex, but if there was
+      previously no reqchar, it takes on the value of the old firstchar. */

-      if (firstbyte >= 0 && firstbyte != branchfirstbyte)
+      if (firstchar >= 0 && firstchar != branchfirstchar)
         {
-        if (reqbyte < 0) reqbyte = firstbyte;
-        firstbyte = REQ_NONE;
+        if (reqchar < 0) reqchar = firstchar;
+        firstchar = REQ_NONE;
         }

-      /* If we (now or from before) have no firstbyte, a firstbyte from the
-      branch becomes a reqbyte if there isn't a branch reqbyte. */
+      /* If we (now or from before) have no firstchar, a firstchar from the
+      branch becomes a reqchar if there isn't a branch reqchar. */

-      if (firstbyte < 0 && branchfirstbyte >= 0 && branchreqbyte < 0)
-          branchreqbyte = branchfirstbyte;
+      if (firstchar < 0 && branchfirstchar >= 0 && branchreqchar < 0)
+          branchreqchar = branchfirstchar;

-      /* Now ensure that the reqbytes match */
+      /* Now ensure that the reqchars match */

-      if ((reqbyte & ~REQ_VARY) != (branchreqbyte & ~REQ_VARY))
-        reqbyte = REQ_NONE;
-      else reqbyte |= branchreqbyte;   /* To "or" REQ_VARY */
+      if ((reqchar & ~REQ_VARY) != (branchreqchar & ~REQ_VARY))
+        reqchar = REQ_NONE;
+      else reqchar |= branchreqchar;   /* To "or" REQ_VARY */
       }

     /* If lookbehind, check that this branch matches a fixed-length string, and
@@ -6822,7 +7185,7 @@
       if (cd->open_caps->flag)
         {
         memmove(start_bracket + 1 + LINK_SIZE, start_bracket,
-          code - start_bracket);
+          IN_UCHARS(code - start_bracket));
         *start_bracket = OP_ONCE;
         code += 1 + LINK_SIZE;
         PUT(start_bracket, 1, (int)(code - start_bracket));
@@ -6842,8 +7205,8 @@

     *codeptr = code;
     *ptrptr = ptr;
-    *firstbyteptr = firstbyte;
-    *reqbyteptr = reqbyte;
+    *firstcharptr = firstchar;
+    *reqcharptr = reqchar;
     if (lengthptr != NULL)
       {
       if (OFLOW_MAX - *lengthptr < length)
@@ -6924,12 +7287,12 @@
 */

 static BOOL
-is_anchored(register const uschar *code, unsigned int bracket_map,
+is_anchored(register const pcre_uchar *code, unsigned int bracket_map,
   unsigned int backref_map)
 {
 do {
-   const uschar *scode = first_significant_code(code + _pcre_OP_lengths[*code],
-     FALSE);
+   const pcre_uchar *scode = first_significant_code(
+     code + PRIV(OP_lengths)[*code], FALSE);
    register int op = *scode;

    /* Non-capturing brackets */
@@ -7001,12 +7364,12 @@
 */

 static BOOL
-is_startline(const uschar *code, unsigned int bracket_map,
+is_startline(const pcre_uchar *code, unsigned int bracket_map,
   unsigned int backref_map)
 {
 do {
-   const uschar *scode = first_significant_code(code + _pcre_OP_lengths[*code],
-     FALSE);
+   const pcre_uchar *scode = first_significant_code(
+     code + PRIV(OP_lengths)[*code], FALSE);
    register int op = *scode;

    /* If we are at the start of a conditional assertion group, *both* the
@@ -7017,7 +7380,7 @@
    if (op == OP_COND)
      {
      scode += 1 + LINK_SIZE;
-     if (*scode == OP_CALLOUT) scode += _pcre_OP_lengths[OP_CALLOUT];
+     if (*scode == OP_CALLOUT) scode += PRIV(OP_lengths)[OP_CALLOUT];
      switch (*scode)
        {
        case OP_CREF:
@@ -7104,14 +7467,15 @@
 */

 static int
-find_firstassertedchar(const uschar *code, BOOL inassert)
+find_firstassertedchar(const pcre_uchar *code, BOOL inassert)
 {
 register int c = -1;
 do {
    int d;
    int xl = (*code == OP_CBRA || *code == OP_SCBRA ||
-             *code == OP_CBRAPOS || *code == OP_SCBRAPOS)? 2:0;
-   const uschar *scode = first_significant_code(code + 1+LINK_SIZE + xl, TRUE);
+             *code == OP_CBRAPOS || *code == OP_SCBRAPOS)? IMM2_SIZE:0;
+   const pcre_uchar *scode = first_significant_code(code + 1+LINK_SIZE + xl,
+     TRUE);
    register int op = *scode;

    switch(op)
@@ -7135,7 +7499,7 @@
      break;

      case OP_EXACT:
-     scode += 2;
+     scode += IMM2_SIZE;
      /* Fall through */

      case OP_CHAR:
@@ -7148,7 +7512,7 @@
      break;

      case OP_EXACTI:
-     scode += 2;
+     scode += IMM2_SIZE;
      /* Fall through */

      case OP_CHARI:
@@ -7191,28 +7555,45 @@
                 with errorptr and erroroffset set
 */

+#ifdef COMPILE_PCRE8
 PCRE_EXP_DEFN pcre * PCRE_CALL_CONVENTION
 pcre_compile(const char *pattern, int options, const char **errorptr,
   int *erroroffset, const unsigned char *tables)
+#else
+PCRE_EXP_DEFN pcre * PCRE_CALL_CONVENTION
+pcre16_compile(PCRE_SPTR16 pattern, int options, const char **errorptr,
+  int *erroroffset, const unsigned char *tables)
+#endif
 {
+#ifdef COMPILE_PCRE8
 return pcre_compile2(pattern, options, NULL, errorptr, erroroffset, tables);
+#else
+return pcre16_compile2(pattern, options, NULL, errorptr, erroroffset, tables);
+#endif
 }

+#ifdef COMPILE_PCRE8
 PCRE_EXP_DEFN pcre * PCRE_CALL_CONVENTION
 pcre_compile2(const char *pattern, int options, int *errorcodeptr,
   const char **errorptr, int *erroroffset, const unsigned char *tables)
+#else
+PCRE_EXP_DEFN pcre * PCRE_CALL_CONVENTION
+pcre16_compile2(PCRE_SPTR16 pattern, int options, int *errorcodeptr,
+  const char **errorptr, int *erroroffset, const unsigned char *tables)
+#endif
 {
 real_pcre *re;
 int length = 1; /* For final END opcode */
-int firstbyte, reqbyte, newline;
+pcre_int32 firstchar, reqchar;
+int newline;
 int errorcode = 0;
 int skipatstart = 0;
-BOOL utf8;
+BOOL utf;
 size_t size;
-uschar *code;
-const uschar *codestart;
-const uschar *ptr;
+pcre_uchar *code;
+const pcre_uchar *codestart;
+const pcre_uchar *ptr;
 compile_data compile_block;
 compile_data *cd = &compile_block;

@@ -7220,13 +7601,14 @@
 computing the amount of memory that is needed. Compiled items are thrown away
 as soon as possible, so that a fairly large buffer should be sufficient for
 this purpose. The same space is used in the second phase for remembering where
-to fill in forward references to subpatterns. */
+to fill in forward references to subpatterns. That may overflow, in which case
+new memory is obtained from malloc(). */

-uschar cworkspace[COMPILE_WORK_SIZE];
+pcre_uchar cworkspace[COMPILE_WORK_SIZE];

 /* Set this early so that early errors get offset 0. */

-ptr = (const uschar *)pattern;
+ptr = (const pcre_uchar *)pattern;

 /* We can't pass back an error message if errorptr is NULL; I guess the best we
 can do is just return NULL, but we can set a code value if there is a code
@@ -7253,7 +7635,7 @@

 /* Set up pointers to the individual character tables */

-if (tables == NULL) tables = _pcre_default_tables;
+if (tables == NULL) tables = PRIV(default_tables);
 cd->lcc = tables + lcc_offset;
 cd->fcc = tables + fcc_offset;
 cd->cbits = tables + cbits_offset;
@@ -7276,27 +7658,33 @@
   int newnl = 0;
   int newbsr = 0;

-  if (strncmp((char *)(ptr+skipatstart+2), STRING_UTF8_RIGHTPAR, 5) == 0)
+#ifdef COMPILE_PCRE8
+  if (STRNCMP_UC_C8(ptr+skipatstart+2, STRING_UTF_RIGHTPAR, 5) == 0)
     { skipatstart += 7; options |= PCRE_UTF8; continue; }
-  else if (strncmp((char *)(ptr+skipatstart+2), STRING_UCP_RIGHTPAR, 4) == 0)
+#endif
+#ifdef COMPILE_PCRE16
+  if (STRNCMP_UC_C8(ptr+skipatstart+2, STRING_UTF_RIGHTPAR, 6) == 0)
+    { skipatstart += 8; options |= PCRE_UTF16; continue; }
+#endif
+  else if (STRNCMP_UC_C8(ptr+skipatstart+2, STRING_UCP_RIGHTPAR, 4) == 0)
     { skipatstart += 6; options |= PCRE_UCP; continue; }
-  else if (strncmp((char *)(ptr+skipatstart+2), STRING_NO_START_OPT_RIGHTPAR, 13) == 0)
+  else if (STRNCMP_UC_C8(ptr+skipatstart+2, STRING_NO_START_OPT_RIGHTPAR, 13) == 0)
     { skipatstart += 15; options |= PCRE_NO_START_OPTIMIZE; continue; }

-  if (strncmp((char *)(ptr+skipatstart+2), STRING_CR_RIGHTPAR, 3) == 0)
+  if (STRNCMP_UC_C8(ptr+skipatstart+2, STRING_CR_RIGHTPAR, 3) == 0)
     { skipatstart += 5; newnl = PCRE_NEWLINE_CR; }
-  else if (strncmp((char *)(ptr+skipatstart+2), STRING_LF_RIGHTPAR, 3)  == 0)
+  else if (STRNCMP_UC_C8(ptr+skipatstart+2, STRING_LF_RIGHTPAR, 3)  == 0)
     { skipatstart += 5; newnl = PCRE_NEWLINE_LF; }
-  else if (strncmp((char *)(ptr+skipatstart+2), STRING_CRLF_RIGHTPAR, 5)  == 0)
+  else if (STRNCMP_UC_C8(ptr+skipatstart+2, STRING_CRLF_RIGHTPAR, 5)  == 0)
     { skipatstart += 7; newnl = PCRE_NEWLINE_CR + PCRE_NEWLINE_LF; }
-  else if (strncmp((char *)(ptr+skipatstart+2), STRING_ANY_RIGHTPAR, 4) == 0)
+  else if (STRNCMP_UC_C8(ptr+skipatstart+2, STRING_ANY_RIGHTPAR, 4) == 0)
     { skipatstart += 6; newnl = PCRE_NEWLINE_ANY; }
-  else if (strncmp((char *)(ptr+skipatstart+2), STRING_ANYCRLF_RIGHTPAR, 8) == 0)
+  else if (STRNCMP_UC_C8(ptr+skipatstart+2, STRING_ANYCRLF_RIGHTPAR, 8) == 0)
     { skipatstart += 10; newnl = PCRE_NEWLINE_ANYCRLF; }

-  else if (strncmp((char *)(ptr+skipatstart+2), STRING_BSR_ANYCRLF_RIGHTPAR, 12) == 0)
+  else if (STRNCMP_UC_C8(ptr+skipatstart+2, STRING_BSR_ANYCRLF_RIGHTPAR, 12) == 0)
     { skipatstart += 14; newbsr = PCRE_BSR_ANYCRLF; }
-  else if (strncmp((char *)(ptr+skipatstart+2), STRING_BSR_UNICODE_RIGHTPAR, 12) == 0)
+  else if (STRNCMP_UC_C8(ptr+skipatstart+2, STRING_BSR_UNICODE_RIGHTPAR, 12) == 0)
     { skipatstart += 14; newbsr = PCRE_BSR_UNICODE; }

if (newnl != 0)
@@ -7306,22 +7694,23 @@
else break;
}

-utf8 = (options & PCRE_UTF8) != 0;
+/* PCRE_UTF16 has the same value as PCRE_UTF8. */
+utf = (options & PCRE_UTF8) != 0;

-/* Can't support UTF8 unless PCRE has been compiled to include the code. The
-return of an error code from _pcre_valid_utf8() is a new feature, introduced in
+/* Can't support UTF unless PCRE has been compiled to include the code. The
+return of an error code from PRIV(valid_utf)() is a new feature, introduced in
release 8.13. It is passed back from pcre_[dfa_]exec(), but at the moment is
not used here. */

-#ifdef SUPPORT_UTF8
-if (utf8 && (options & PCRE_NO_UTF8_CHECK) == 0 &&
-     (errorcode = _pcre_valid_utf8((USPTR)pattern, -1, erroroffset)) != 0)
+#ifdef SUPPORT_UTF
+if (utf && (options & PCRE_NO_UTF8_CHECK) == 0 &&
+     (errorcode = PRIV(valid_utf)((PCRE_PUCHAR)pattern, -1, erroroffset)) != 0)
   {
   errorcode = ERR44;
   goto PCRE_EARLY_ERROR_RETURN2;
   }
 #else
-if (utf8)
+if (utf)
   {
   errorcode = ERR32;
   goto PCRE_EARLY_ERROR_RETURN;
@@ -7397,7 +7786,10 @@
 /* Reflect pattern for debugging output */

DPRINTF(("------------------------------------------------------------------\n"));
-DPRINTF(("%s\n", pattern));
+#ifdef PCRE_DEBUG
+print_puchar(stdout, (PCRE_PUCHAR)pattern);
+#endif
+DPRINTF(("\n"));

/* Pretend to compile the pattern while actually just accumulating the length
of memory required. This behaviour is triggered by passing a non-NULL final
@@ -7410,12 +7802,14 @@
cd->names_found = 0;
cd->name_entry_size = 0;
cd->name_table = NULL;
-cd->start_workspace = cworkspace;
cd->start_code = cworkspace;
cd->hwm = cworkspace;
-cd->start_pattern = (const uschar *)pattern;
-cd->end_pattern = (const uschar *)(pattern + strlen(pattern));
+cd->start_workspace = cworkspace;
+cd->workspace_size = COMPILE_WORK_SIZE;
+cd->start_pattern = (const pcre_uchar *)pattern;
+cd->end_pattern = (const pcre_uchar *)(pattern + STRLEN_UC((const pcre_uchar *)pattern));
cd->req_varyopt = 0;
+cd->assert_depth = 0;
cd->external_options = options;
cd->external_flags = 0;
cd->open_caps = NULL;
@@ -7430,11 +7824,11 @@
code = cworkspace;
*code = OP_BRA;
(void)compile_regex(cd->external_options, &code, &ptr, &errorcode, FALSE,
- FALSE, 0, 0, &firstbyte, &reqbyte, NULL, cd, &length);
+ FALSE, 0, 0, &firstchar, &reqchar, NULL, cd, &length);
if (errorcode != 0) goto PCRE_EARLY_ERROR_RETURN;

DPRINTF(("end pre-compile: length=%d workspace=%d\n", length,
- cd->hwm - cworkspace));
+ (int)(cd->hwm - cworkspace)));

if (length > MAX_PATTERN_SIZE)
{
@@ -7447,8 +7841,8 @@
because nowadays we limit the maximum value of cd->names_found and
cd->name_entry_size. */

-size = length + sizeof(real_pcre) + cd->names_found * (cd->name_entry_size + 3);
-re = (real_pcre *)(pcre_malloc)(size);
+size = sizeof(real_pcre) + (length + cd->names_found * cd->name_entry_size) * sizeof(pcre_uchar);
+re = (real_pcre *)(PUBL(malloc))(size);

if (re == NULL)
{
@@ -7467,13 +7861,13 @@
re->options = cd->external_options;
re->flags = cd->external_flags;
re->dummy1 = 0;
-re->first_byte = 0;
-re->req_byte = 0;
-re->name_table_offset = sizeof(real_pcre);
+re->first_char = 0;
+re->req_char = 0;
+re->name_table_offset = sizeof(real_pcre) / sizeof(pcre_uchar);
re->name_entry_size = cd->name_entry_size;
re->name_count = cd->names_found;
re->ref_count = 0;
-re->tables = (tables == _pcre_default_tables)? NULL : tables;
+re->tables = (tables == PRIV(default_tables))? NULL : tables;
re->nullpad = NULL;

/* The starting points of the name/number translation table and of the code are
@@ -7487,10 +7881,10 @@
cd->assert_depth = 0;
cd->bracount = 0;
cd->names_found = 0;
-cd->name_table = (uschar *)re + re->name_table_offset;
+cd->name_table = (pcre_uchar *)re + re->name_table_offset;
codestart = cd->name_table + re->name_entry_size * re->name_count;
cd->start_code = codestart;
-cd->hwm = cworkspace;
+cd->hwm = (pcre_uchar *)(cd->start_workspace);
cd->req_varyopt = 0;
cd->had_accept = FALSE;
cd->check_lookbehind = FALSE;
@@ -7500,16 +7894,16 @@
error, errorcode will be set non-zero, so we don't need to look at the result
of the function here. */

-ptr = (const uschar *)pattern + skipatstart;
-code = (uschar *)codestart;
+ptr = (const pcre_uchar *)pattern + skipatstart;
+code = (pcre_uchar *)codestart;
*code = OP_BRA;
(void)compile_regex(re->options, &code, &ptr, &errorcode, FALSE, FALSE, 0, 0,
- &firstbyte, &reqbyte, NULL, cd, NULL);
+ &firstchar, &reqchar, NULL, cd, NULL);
re->top_bracket = cd->bracount;
re->top_backref = cd->top_backref;
-re->flags = cd->external_flags;
+re->flags = cd->external_flags | PCRE_MODE;

-if (cd->had_accept) reqbyte = REQ_NONE; /* Must disable after (*ACCEPT) */
+if (cd->had_accept) reqchar = REQ_NONE; /* Must disable after (*ACCEPT) */

/* If not reached end of pattern on success, there's an excess bracket. */

@@ -7524,20 +7918,34 @@
if (code - codestart > length) errorcode = ERR23;
#endif

-/* Fill in any forward references that are required. */
+/* Fill in any forward references that are required. There may be repeated
+references; optimize for them, as searching a large regex takes time. */

-while (errorcode == 0 && cd->hwm > cworkspace)
+if (cd->hwm > cd->start_workspace)
   {
-  int offset, recno;
-  const uschar *groupptr;
-  cd->hwm -= LINK_SIZE;
-  offset = GET(cd->hwm, 0);
-  recno = GET(codestart, offset);
-  groupptr = _pcre_find_bracket(codestart, utf8, recno);
-  if (groupptr == NULL) errorcode = ERR53;
-    else PUT(((uschar *)codestart), offset, (int)(groupptr - codestart));
+  int prev_recno = -1;
+  const pcre_uchar *groupptr = NULL;
+  while (errorcode == 0 && cd->hwm > cd->start_workspace)
+    {
+    int offset, recno;
+    cd->hwm -= LINK_SIZE;
+    offset = GET(cd->hwm, 0);
+    recno = GET(codestart, offset);
+    if (recno != prev_recno)
+      {
+      groupptr = PRIV(find_bracket)(codestart, utf, recno);
+      prev_recno = recno;
+      }
+    if (groupptr == NULL) errorcode = ERR53;
+      else PUT(((pcre_uchar *)codestart), offset, (int)(groupptr - codestart));
+    }
   }

+/* If the workspace had to be expanded, free the new memory. */
+
+if (cd->workspace_size > COMPILE_WORK_SIZE)
+ (PUBL(free))((void *)cd->start_workspace);
+
/* Give an error if there's back reference to a non-existent capturing
subpattern. */

@@ -7553,21 +7961,21 @@

if (cd->check_lookbehind)
{
- uschar *cc = (uschar *)codestart;
+ pcre_uchar *cc = (pcre_uchar *)codestart;

/* Loop, searching for OP_REVERSE items, and process those that do not have
their length set. (Actually, it will also re-process any that have a length
of zero, but that is a pathological case, and it does no harm.) When we find
one, we temporarily terminate the branch it is in while we scan it. */

-  for (cc = (uschar *)_pcre_find_bracket(codestart, utf8, -1);
+  for (cc = (pcre_uchar *)PRIV(find_bracket)(codestart, utf, -1);
        cc != NULL;
-       cc = (uschar *)_pcre_find_bracket(cc, utf8, -1))
+       cc = (pcre_uchar *)PRIV(find_bracket)(cc, utf, -1))
     {
     if (GET(cc, 1) == 0)
       {
       int fixed_length;
-      uschar *be = cc - 1 - LINK_SIZE + GET(cc, -LINK_SIZE);
+      pcre_uchar *be = cc - 1 - LINK_SIZE + GET(cc, -LINK_SIZE);
       int end_op = *be;
       *be = OP_END;
       fixed_length = find_fixedlength(cc, (re->options & PCRE_UTF8) != 0, TRUE,
@@ -7590,9 +7998,9 @@

 if (errorcode != 0)
   {
-  (pcre_free)(re);
+  (PUBL(free))(re);
   PCRE_EARLY_ERROR_RETURN:
-  *erroroffset = (int)(ptr - (const uschar *)pattern);
+  *erroroffset = (int)(ptr - (const pcre_uchar *)pattern);
   PCRE_EARLY_ERROR_RETURN2:
   *errorptr = find_error_text(errorcode);
   if (errorcodeptr != NULL) *errorcodeptr = errorcode;
@@ -7615,13 +8023,38 @@
     re->options |= PCRE_ANCHORED;
   else
     {
-    if (firstbyte < 0)
-      firstbyte = find_firstassertedchar(codestart, FALSE);
-    if (firstbyte >= 0)   /* Remove caseless flag for non-caseable chars */
+    if (firstchar < 0)
+      firstchar = find_firstassertedchar(codestart, FALSE);
+    if (firstchar >= 0)   /* Remove caseless flag for non-caseable chars */
       {
-      int ch = firstbyte & 255;
-      re->first_byte = ((firstbyte & REQ_CASELESS) != 0 &&
-         cd->fcc[ch] == ch)? ch : firstbyte;
+#ifdef COMPILE_PCRE8
+      re->first_char = firstchar & 0xff;
+#else
+#ifdef COMPILE_PCRE16
+      re->first_char = firstchar & 0xffff;
+#endif
+#endif
+      if ((firstchar & REQ_CASELESS) != 0)
+        {
+#if defined SUPPORT_UCP && !(defined COMPILE_PCRE8)
+        /* We ignore non-ASCII first chars in 8 bit mode. */
+        if (utf)
+          {
+          if (re->first_char < 128)
+            {
+            if (cd->fcc[re->first_char] != re->first_char)
+              re->flags |= PCRE_FCH_CASELESS;
+            }
+          else if (UCD_OTHERCASE(re->first_char) != re->first_char)
+            re->flags |= PCRE_FCH_CASELESS;
+          }
+        else
+#endif
+        if (MAX_255(re->first_char)
+            && cd->fcc[re->first_char] != re->first_char)
+          re->flags |= PCRE_FCH_CASELESS;
+        }
+
       re->flags |= PCRE_FIRSTSET;
       }
     else if (is_startline(codestart, 0, cd->backref_map))
@@ -7633,12 +8066,36 @@
 variable length item in the regex. Remove the caseless flag for non-caseable
 bytes. */

-if (reqbyte >= 0 &&
-     ((re->options & PCRE_ANCHORED) == 0 || (reqbyte & REQ_VARY) != 0))
+if (reqchar >= 0 &&
+     ((re->options & PCRE_ANCHORED) == 0 || (reqchar & REQ_VARY) != 0))
   {
-  int ch = reqbyte & 255;
-  re->req_byte = ((reqbyte & REQ_CASELESS) != 0 &&
-    cd->fcc[ch] == ch)? (reqbyte & ~REQ_CASELESS) : reqbyte;
+#ifdef COMPILE_PCRE8
+  re->req_char = reqchar & 0xff;
+#else
+#ifdef COMPILE_PCRE16
+  re->req_char = reqchar & 0xffff;
+#endif
+#endif
+  if ((reqchar & REQ_CASELESS) != 0)
+    {
+#if defined SUPPORT_UCP && !(defined COMPILE_PCRE8)
+    /* We ignore non-ASCII first chars in 8 bit mode. */
+    if (utf)
+      {
+      if (re->req_char < 128)
+        {
+        if (cd->fcc[re->req_char] != re->req_char)
+          re->flags |= PCRE_RCH_CASELESS;
+        }
+      else if (UCD_OTHERCASE(re->req_char) != re->req_char)
+        re->flags |= PCRE_RCH_CASELESS;
+      }
+    else
+#endif
+    if (MAX_255(re->req_char) && cd->fcc[re->req_char] != re->req_char)
+      re->flags |= PCRE_RCH_CASELESS;
+    }
+
   re->flags |= PCRE_REQCHSET;
   }

@@ -7653,32 +8110,36 @@

 if ((re->flags & PCRE_FIRSTSET) != 0)
   {
-  int ch = re->first_byte & 255;
-  const char *caseless = ((re->first_byte & REQ_CASELESS) == 0)?
-    "" : " (caseless)";
-  if (isprint(ch)) printf("First char = %c%s\n", ch, caseless);
+  pcre_uchar ch = re->first_char;
+  const char *caseless =
+    ((re->flags & PCRE_FCH_CASELESS) == 0)? "" : " (caseless)";
+  if (PRINTABLE(ch)) printf("First char = %c%s\n", ch, caseless);
     else printf("First char = \\x%02x%s\n", ch, caseless);
   }

 if ((re->flags & PCRE_REQCHSET) != 0)
   {
-  int ch = re->req_byte & 255;
-  const char *caseless = ((re->req_byte & REQ_CASELESS) == 0)?
-    "" : " (caseless)";
-  if (isprint(ch)) printf("Req char = %c%s\n", ch, caseless);
+  pcre_uchar ch = re->req_char;
+  const char *caseless =
+    ((re->flags & PCRE_RCH_CASELESS) == 0)? "" : " (caseless)";
+  if (PRINTABLE(ch)) printf("Req char = %c%s\n", ch, caseless);
     else printf("Req char = \\x%02x%s\n", ch, caseless);
   }

+#ifdef COMPILE_PCRE8
pcre_printint(re, stdout, TRUE);
+#else
+pcre16_printint(re, stdout, TRUE);
+#endif

/* This check is done here in the debugging case so that the code that
was compiled can be seen. */

if (code - codestart > length)
{
- (pcre_free)(re);
+ (PUBL(free))(re);
*errorptr = find_error_text(ERR23);
- *erroroffset = ptr - (uschar *)pattern;
+ *erroroffset = ptr - (pcre_uchar *)pattern;
if (errorcodeptr != NULL) *errorcodeptr = ERR23;
return NULL;
}
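
Editorial note on the forward-reference hunk above: the rewritten loop keeps the last group number it looked up (prev_recno) and calls PRIV(find_bracket)() again only when the number changes, so runs of identical references cost one search. The sketch below illustrates that idea only. It is not PCRE code: find_group(), resolve_all(), the lookups counter and the flat int arrays are hypothetical stand-ins chosen for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Counts how often the costly search runs (illustration only). */
int lookups = 0;

/* Hypothetical stand-in for a costly search such as PRIV(find_bracket)():
   returns the index of recno in table, or -1 if it is absent. */
static int find_group(const int *table, int n, int recno)
{
  int i;
  lookups++;
  for (i = 0; i < n; i++)
    if (table[i] == recno) return i;
  return -1;
}

/* Resolve every reference in refs[], repeating the search only when the
   wanted group number differs from the previous one.
   Returns 0 on success, -1 if a reference names a missing group. */
int resolve_all(const int *refs, int nrefs, const int *table, int n, int *out)
{
  int prev_recno = -1;   /* no group looked up yet */
  int pos = -1;          /* cached result of the last search */
  int i;

  assert(refs != NULL && table != NULL && out != NULL);

  for (i = 0; i < nrefs; i++)
    {
    if (refs[i] != prev_recno)
      {
      pos = find_group(table, n, refs[i]);
      prev_recno = refs[i];
      }
    if (pos < 0) return -1;   /* analogous to setting ERR53 */
    out[i] = pos;
    }
  return 0;
}
```

With references 3, 3, 3, 1, 1 the search runs twice rather than five times. The cache only helps when equal references are adjacent, which matches the comment in the patch about repeated references.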

Modified: code/trunk/pcre_config.c
===================================================================
--- code/trunk/pcre_config.c    2011-12-28 16:10:09 UTC (rev 835)
+++ code/trunk/pcre_config.c    2011-12-28 17:16:11 UTC (rev 836)
@@ -6,7 +6,7 @@
 and semantics are as close as possible to those of the Perl 5 language.

                        Written by Philip Hazel
-           Copyright (c) 1997-2011 University of Cambridge
+           Copyright (c) 1997-2012 University of Cambridge

-----------------------------------------------------------------------------
Redistribution and use in source and binary forms, with or without
@@ -45,6 +45,9 @@
#include "config.h"
#endif

+/* Keep the original link size. */
+static int real_link_size = LINK_SIZE;
+
#include "pcre_internal.h"

@@ -62,19 +65,40 @@
 Returns:           0 if data returned, negative on error
 */

+#ifdef COMPILE_PCRE8
PCRE_EXP_DEFN int PCRE_CALL_CONVENTION
pcre_config(int what, void *where)
+#else
+PCRE_EXP_DEFN int PCRE_CALL_CONVENTION
+pcre16_config(int what, void *where)
+#endif
{
switch (what)
{
case PCRE_CONFIG_UTF8:
-#ifdef SUPPORT_UTF8
+#if defined COMPILE_PCRE16
+ return PCRE_ERROR_BADOPTION;
+#else
+#if defined SUPPORT_UTF
*((int *)where) = 1;
#else
*((int *)where) = 0;
#endif
break;
+#endif

+ case PCRE_CONFIG_UTF16:
+#if defined COMPILE_PCRE8
+ return PCRE_ERROR_BADOPTION;
+#else
+#if defined SUPPORT_UTF
+ *((int *)where) = 1;
+#else
+ *((int *)where) = 0;
+#endif
+ break;
+#endif
+
case PCRE_CONFIG_UNICODE_PROPERTIES:
#ifdef SUPPORT_UCP
*((int *)where) = 1;
@@ -104,7 +128,7 @@
break;

case PCRE_CONFIG_LINK_SIZE:
- *((int *)where) = LINK_SIZE;
+ *((int *)where) = real_link_size;
break;

case PCRE_CONFIG_POSIX_MALLOC_THRESHOLD:
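
Editorial note on the real_link_size change above: the static variable is initialised from LINK_SIZE before pcre_internal.h is included, so it keeps the value that came from config.h even if the macro is redefined later. This excerpt does not show pcre_internal.h; the sketch below assumes, for illustration, that the internal header rescales LINK_SIZE for a build with wider code units. The value 2, the redefinition to 1 and the two accessor functions are hypothetical.

```c
#include <assert.h>

/* Configured value, as it might arrive from config.h (assumed here). */
#define LINK_SIZE 2

/* Captured while the macro still has its configured value,
   mirroring "static int real_link_size = LINK_SIZE;" in the patch. */
static int real_link_size = LINK_SIZE;

/* Hypothetical stand-in for what an internal header might do later:
   re-express the link size in units of the wider code unit. */
#undef LINK_SIZE
#define LINK_SIZE 1

/* What a configuration query would report to the caller. */
int configured_link_size(void)
{
  assert(real_link_size > 0);
  return real_link_size;
}

/* What the rest of the translation unit sees after the redefinition. */
int internal_link_size(void)
{
  return LINK_SIZE;
}
```

The point of the ordering is that a macro is expanded where it is used, so an initialiser placed before the redefinition records the earlier value, while any code after it picks up the new one.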

Modified: code/trunk/pcre_dfa_exec.c
===================================================================
--- code/trunk/pcre_dfa_exec.c    2011-12-28 16:10:09 UTC (rev 835)
+++ code/trunk/pcre_dfa_exec.c    2011-12-28 17:16:11 UTC (rev 836)
@@ -7,7 +7,7 @@
 below for why this module is different).

                        Written by Philip Hazel
-           Copyright (c) 1997-2011 University of Cambridge
+           Copyright (c) 1997-2012 University of Cambridge

-----------------------------------------------------------------------------
Redistribution and use in source and binary forms, with or without
@@ -113,7 +113,7 @@
the character is to be found. ***NOTE*** If the start of this table is
modified, the three tables that follow must also be modified. */

-static const uschar coptable[] = {
+static const pcre_uint8 coptable[] = {
   0,                             /* End                                    */
   0, 0, 0, 0, 0,                 /* \A, \G, \K, \B, \b                     */
   0, 0, 0, 0, 0, 0,              /* \D, \d, \S, \s, \W, \w                 */
@@ -128,22 +128,27 @@
   1,                             /* noti                                   */
   /* Positive single-char repeats                                          */
   1, 1, 1, 1, 1, 1,              /* *, *?, +, +?, ?, ??                    */
-  3, 3, 3,                       /* upto, minupto, exact                   */
-  1, 1, 1, 3,                    /* *+, ++, ?+, upto+                      */
+  1+IMM2_SIZE, 1+IMM2_SIZE,      /* upto, minupto                          */
+  1+IMM2_SIZE,                   /* exact                                  */
+  1, 1, 1, 1+IMM2_SIZE,          /* *+, ++, ?+, upto+                      */
   1, 1, 1, 1, 1, 1,              /* *I, *?I, +I, +?I, ?I, ??I              */
-  3, 3, 3,                       /* upto I, minupto I, exact I             */
-  1, 1, 1, 3,                    /* *+I, ++I, ?+I, upto+I                  */
+  1+IMM2_SIZE, 1+IMM2_SIZE,      /* upto I, minupto I                      */
+  1+IMM2_SIZE,                   /* exact I                                */
+  1, 1, 1, 1+IMM2_SIZE,          /* *+I, ++I, ?+I, upto+I                  */
   /* Negative single-char repeats - only for chars < 256                   */
   1, 1, 1, 1, 1, 1,              /* NOT *, *?, +, +?, ?, ??                */
-  3, 3, 3,                       /* NOT upto, minupto, exact               */
-  1, 1, 1, 3,                    /* NOT *+, ++, ?+, upto+                  */
+  1+IMM2_SIZE, 1+IMM2_SIZE,      /* NOT upto, minupto                      */
+  1+IMM2_SIZE,                   /* NOT exact                              */
+  1, 1, 1, 1+IMM2_SIZE,          /* NOT *+, ++, ?+, upto+                  */
   1, 1, 1, 1, 1, 1,              /* NOT *I, *?I, +I, +?I, ?I, ??I          */
-  3, 3, 3,                       /* NOT upto I, minupto I, exact I         */
-  1, 1, 1, 3,                    /* NOT *+I, ++I, ?+I, upto+I              */
+  1+IMM2_SIZE, 1+IMM2_SIZE,      /* NOT upto I, minupto I                  */
+  1+IMM2_SIZE,                   /* NOT exact I                            */
+  1, 1, 1, 1+IMM2_SIZE,          /* NOT *+I, ++I, ?+I, upto+I              */
   /* Positive type repeats                                                 */
   1, 1, 1, 1, 1, 1,              /* Type *, *?, +, +?, ?, ??               */
-  3, 3, 3,                       /* Type upto, minupto, exact              */
-  1, 1, 1, 3,                    /* Type *+, ++, ?+, upto+                 */
+  1+IMM2_SIZE, 1+IMM2_SIZE,      /* Type upto, minupto                     */
+  1+IMM2_SIZE,                   /* Type exact                             */
+  1, 1, 1, 1+IMM2_SIZE,          /* Type *+, ++, ?+, upto+                 */
   /* Character class & ref repeats                                         */
   0, 0, 0, 0, 0, 0,              /* *, *?, +, +?, ?, ??                    */
   0, 0,                          /* CRRANGE, CRMINRANGE                    */
@@ -182,7 +187,7 @@
 the subject is reached. ***NOTE*** If the start of this table is modified, the
 two tables that follow must also be modified. */

-static const uschar poptable[] = {
+static const pcre_uint8 poptable[] = {
   0,                             /* End                                    */
   0, 0, 0, 1, 1,                 /* \A, \G, \K, \B, \b                     */
   1, 1, 1, 1, 1, 1,              /* \D, \d, \S, \s, \W, \w                 */
@@ -249,7 +254,7 @@
 /* These 2 tables allow for compact code for testing for \D, \d, \S, \s, \W,
 and \w */

-static const uschar toptable1[] = {
+static const pcre_uint8 toptable1[] = {
   0, 0, 0, 0, 0, 0,
   ctype_digit, ctype_digit,
   ctype_space, ctype_space,
@@ -257,7 +262,7 @@
   0, 0                            /* OP_ANY, OP_ALLANY */
 };

-static const uschar toptable2[] = {
+static const pcre_uint8 toptable2[] = {
0, 0, 0, 0, 0, 0,
ctype_digit, 0,
ctype_space, 0,
@@ -296,7 +301,7 @@
*/

static void
-pchars(unsigned char *p, int length, FILE *f)
+pchars(const pcre_uchar *p, int length, FILE *f)
{
int c;
while (length-- > 0)
@@ -386,8 +391,8 @@
static int
internal_dfa_exec(
dfa_match_data *md,
- const uschar *this_start_code,
- const uschar *current_subject,
+ const pcre_uchar *this_start_code,
+ const pcre_uchar *current_subject,
int start_offset,
int *offsets,
int offsetcount,
@@ -398,9 +403,9 @@
stateblock *active_states, *new_states, *temp_states;
stateblock *next_active_state, *next_new_state;

-const uschar *ctypes, *lcc, *fcc;
-const uschar *ptr;
-const uschar *end_code, *first_op;
+const pcre_uint8 *ctypes, *lcc, *fcc;
+const pcre_uchar *ptr;
+const pcre_uchar *end_code, *first_op;

dfa_recursion_info new_recursive;

@@ -409,14 +414,14 @@
/* Some fields in the md block are frequently referenced, so we load them into
independent variables in the hope that this will perform better. */

-const uschar *start_subject = md->start_subject;
-const uschar *end_subject = md->end_subject;
-const uschar *start_code = md->start_code;
+const pcre_uchar *start_subject = md->start_subject;
+const pcre_uchar *end_subject = md->end_subject;
+const pcre_uchar *start_code = md->start_code;

-#ifdef SUPPORT_UTF8
-BOOL utf8 = (md->poptions & PCRE_UTF8) != 0;
+#ifdef SUPPORT_UTF
+BOOL utf = (md->poptions & PCRE_UTF8) != 0;
#else
-BOOL utf8 = FALSE;
+BOOL utf = FALSE;
#endif

rlevel++;
@@ -442,7 +447,8 @@

 first_op = this_start_code + 1 + LINK_SIZE +
   ((*this_start_code == OP_CBRA || *this_start_code == OP_SCBRA ||
-    *this_start_code == OP_CBRAPOS || *this_start_code == OP_SCBRAPOS)? 2:0);
+    *this_start_code == OP_CBRAPOS || *this_start_code == OP_SCBRAPOS)
+    ? IMM2_SIZE:0);

/* The first thing in any (sub) pattern is a bracket of some sort. Push all
the alternative states onto the list, and find out where the end is. This
@@ -470,18 +476,16 @@
/* If we can't go back the amount required for the longest lookbehind
pattern, go back as far as we can; some alternatives may still be viable. */

-#ifdef SUPPORT_UTF8
+#ifdef SUPPORT_UTF
/* In character mode we have to step back character by character */

-  if (utf8)
+  if (utf)
     {
     for (gone_back = 0; gone_back < max_back; gone_back++)
       {
       if (current_subject <= start_subject) break;
       current_subject--;
-      while (current_subject > start_subject &&
-             (*current_subject & 0xc0) == 0x80)
-        current_subject--;
+      ACROSSCHAR(current_subject > start_subject, *current_subject, current_subject--);
       }
     }
   else
@@ -542,8 +546,8 @@
     {
     int length = 1 + LINK_SIZE +
       ((*this_start_code == OP_CBRA || *this_start_code == OP_SCBRA ||
-        *this_start_code == OP_CBRAPOS || *this_start_code == OP_SCBRAPOS)?
-        2:0);
+        *this_start_code == OP_CBRAPOS || *this_start_code == OP_SCBRAPOS)
+        ? IMM2_SIZE:0);
     do
       {
       ADD_NEW((int)(end_code - start_code + length), 0);
@@ -556,7 +560,7 @@

 workspace[0] = 0;    /* Bit indicating which vector is current */

-DPRINTF(("%.*sEnd state = %d\n", rlevel*2-2, SP, end_code - start_code));
+DPRINTF(("%.*sEnd state = %d\n", rlevel*2-2, SP, (int)(end_code - start_code)));

/* Loop for scanning the subject */

@@ -583,7 +587,7 @@

#ifdef PCRE_DEBUG
printf("%.*sNext character: rest of subject = \"", rlevel*2-2, SP);
- pchars((uschar *)ptr, strlen((char *)ptr), stdout);
+ pchars(ptr, STRLEN_UC(ptr), stdout);
printf("\"\n");

   printf("%.*sActive states: ", rlevel*2-2, SP);
@@ -604,9 +608,9 @@
   if (ptr < end_subject)
     {
     clen = 1;        /* Number of bytes in the character */
-#ifdef SUPPORT_UTF8
-    if (utf8) { GETCHARLEN(c, ptr, clen); } else
-#endif  /* SUPPORT_UTF8 */
+#ifdef SUPPORT_UTF
+    if (utf) { GETCHARLEN(c, ptr, clen); } else
+#endif  /* SUPPORT_UTF */
     c = *ptr;
     }
   else
@@ -624,7 +628,7 @@
     {
     stateblock *current_state = active_states + i;
     BOOL caseless = FALSE;
-    const uschar *code;
+    const pcre_uchar *code;
     int state_offset = current_state->offset;
     int count, codevalue, rrc;

@@ -693,9 +697,9 @@
     if (coptable[codevalue] > 0)
       {
       dlen = 1;
-#ifdef SUPPORT_UTF8
-      if (utf8) { GETCHARLEN(d, (code + coptable[codevalue]), dlen); } else
-#endif  /* SUPPORT_UTF8 */
+#ifdef SUPPORT_UTF
+      if (utf) { GETCHARLEN(d, (code + coptable[codevalue]), dlen); } else
+#endif  /* SUPPORT_UTF */
       d = code[coptable[codevalue]];
       if (codevalue >= OP_TYPESTAR)
         {
@@ -816,7 +820,7 @@
       /*-----------------------------------------------------------------*/
       case OP_CBRA:
       case OP_SCBRA:
-      ADD_ACTIVE((int)(code - start_code + 3 + LINK_SIZE),  0);
+      ADD_ACTIVE((int)(code - start_code + 1 + LINK_SIZE + IMM2_SIZE),  0);
       code += GET(code, 1);
       while (*code == OP_ALT)
         {
@@ -956,10 +960,10 @@

         if (ptr > start_subject)
           {
-          const uschar *temp = ptr - 1;
+          const pcre_uchar *temp = ptr - 1;
           if (temp < md->start_used_ptr) md->start_used_ptr = temp;
-#ifdef SUPPORT_UTF8
-          if (utf8) BACKCHAR(temp);
+#ifdef SUPPORT_UTF
+          if (utf) { BACKCHAR(temp); }
 #endif
           GETCHARTEST(d, temp);
 #ifdef SUPPORT_UCP
@@ -1024,7 +1028,7 @@
           break;

           case PT_GC:
-          OK = _pcre_ucp_gentype[prop->chartype] == code[2];
+          OK = PRIV(ucp_gentype)[prop->chartype] == code[2];
           break;

           case PT_PC:
@@ -1038,24 +1042,24 @@
           /* These are specials for combination cases. */

           case PT_ALNUM:
-          OK = _pcre_ucp_gentype[prop->chartype] == ucp_L ||
-               _pcre_ucp_gentype[prop->chartype] == ucp_N;
+          OK = PRIV(ucp_gentype)[prop->chartype] == ucp_L ||
+               PRIV(ucp_gentype)[prop->chartype] == ucp_N;
           break;

           case PT_SPACE:    /* Perl space */
-          OK = _pcre_ucp_gentype[prop->chartype] == ucp_Z ||
+          OK = PRIV(ucp_gentype)[prop->chartype] == ucp_Z ||
                c == CHAR_HT || c == CHAR_NL || c == CHAR_FF || c == CHAR_CR;
           break;

           case PT_PXSPACE:  /* POSIX space */
-          OK = _pcre_ucp_gentype[prop->chartype] == ucp_Z ||
+          OK = PRIV(ucp_gentype)[prop->chartype] == ucp_Z ||
                c == CHAR_HT || c == CHAR_NL || c == CHAR_VT ||
                c == CHAR_FF || c == CHAR_CR;
           break;

           case PT_WORD:
-          OK = _pcre_ucp_gentype[prop->chartype] == ucp_L ||
-               _pcre_ucp_gentype[prop->chartype] == ucp_N ||
+          OK = PRIV(ucp_gentype)[prop->chartype] == ucp_L ||
+               PRIV(ucp_gentype)[prop->chartype] == ucp_N ||
                c == CHAR_UNDERSCORE;
           break;

@@ -1157,7 +1161,7 @@
               ((ctypes[c] & toptable1[d]) ^ toptable2[d]) != 0))
           {
           if (++count >= GET2(code, 1))
-            { ADD_NEW(state_offset + 4, 0); }
+            { ADD_NEW(state_offset + 1 + IMM2_SIZE + 1, 0); }
           else
             { ADD_NEW(state_offset, count); }
           }
@@ -1168,7 +1172,7 @@
       case OP_TYPEUPTO:
       case OP_TYPEMINUPTO:
       case OP_TYPEPOSUPTO:
-      ADD_ACTIVE(state_offset + 4, 0);
+      ADD_ACTIVE(state_offset + 2 + IMM2_SIZE, 0);
       count = current_state->count;  /* Number already matched */
       if (clen > 0)
         {
@@ -1183,7 +1187,7 @@
             next_active_state--;
             }
           if (++count >= GET2(code, 1))
-            { ADD_NEW(state_offset + 4, 0); }
+            { ADD_NEW(state_offset + 2 + IMM2_SIZE, 0); }
           else
             { ADD_NEW(state_offset, count); }
           }
@@ -1218,7 +1222,7 @@
           break;

           case PT_GC:
-          OK = _pcre_ucp_gentype[prop->chartype] == code[3];
+          OK = PRIV(ucp_gentype)[prop->chartype] == code[3];
           break;

           case PT_PC:
@@ -1232,24 +1236,24 @@
           /* These are specials for combination cases. */

           case PT_ALNUM:
-          OK = _pcre_ucp_gentype[prop->chartype] == ucp_L ||
-               _pcre_ucp_gentype[prop->chartype] == ucp_N;
+          OK = PRIV(ucp_gentype)[prop->chartype] == ucp_L ||
+               PRIV(ucp_gentype)[prop->chartype] == ucp_N;
           break;

           case PT_SPACE:    /* Perl space */
-          OK = _pcre_ucp_gentype[prop->chartype] == ucp_Z ||
+          OK = PRIV(ucp_gentype)[prop->chartype] == ucp_Z ||
                c == CHAR_HT || c == CHAR_NL || c == CHAR_FF || c == CHAR_CR;
           break;

           case PT_PXSPACE:  /* POSIX space */
-          OK = _pcre_ucp_gentype[prop->chartype] == ucp_Z ||
+          OK = PRIV(ucp_gentype)[prop->chartype] == ucp_Z ||
                c == CHAR_HT || c == CHAR_NL || c == CHAR_VT ||
                c == CHAR_FF || c == CHAR_CR;
           break;

           case PT_WORD:
-          OK = _pcre_ucp_gentype[prop->chartype] == ucp_L ||
-               _pcre_ucp_gentype[prop->chartype] == ucp_N ||
+          OK = PRIV(ucp_gentype)[prop->chartype] == ucp_L ||
+               PRIV(ucp_gentype)[prop->chartype] == ucp_N ||
                c == CHAR_UNDERSCORE;
           break;

@@ -1281,7 +1285,7 @@
       if (count > 0) { ADD_ACTIVE(state_offset + 2, 0); }
       if (clen > 0 && UCD_CATEGORY(c) != ucp_M)
         {
-        const uschar *nptr = ptr + clen;
+        const pcre_uchar *nptr = ptr + clen;
         int ncount = 0;
         if (count > 0 && codevalue == OP_EXTUNI_EXTRA + OP_TYPEPOSPLUS)
           {
@@ -1465,7 +1469,7 @@
           break;

           case PT_GC:
-          OK = _pcre_ucp_gentype[prop->chartype] == code[3];
+          OK = PRIV(ucp_gentype)[prop->chartype] == code[3];
           break;

           case PT_PC:
@@ -1479,24 +1483,24 @@
           /* These are specials for combination cases. */

           case PT_ALNUM:
-          OK = _pcre_ucp_gentype[prop->chartype] == ucp_L ||
-               _pcre_ucp_gentype[prop->chartype] == ucp_N;
+          OK = PRIV(ucp_gentype)[prop->chartype] == ucp_L ||
+               PRIV(ucp_gentype)[prop->chartype] == ucp_N;
           break;

           case PT_SPACE:    /* Perl space */
-          OK = _pcre_ucp_gentype[prop->chartype] == ucp_Z ||
+          OK = PRIV(ucp_gentype)[prop->chartype] == ucp_Z ||
                c == CHAR_HT || c == CHAR_NL || c == CHAR_FF || c == CHAR_CR;
           break;

           case PT_PXSPACE:  /* POSIX space */
-          OK = _pcre_ucp_gentype[prop->chartype] == ucp_Z ||
+          OK = PRIV(ucp_gentype)[prop->chartype] == ucp_Z ||
                c == CHAR_HT || c == CHAR_NL || c == CHAR_VT ||
                c == CHAR_FF || c == CHAR_CR;
           break;

           case PT_WORD:
-          OK = _pcre_ucp_gentype[prop->chartype] == ucp_L ||
-               _pcre_ucp_gentype[prop->chartype] == ucp_N ||
+          OK = PRIV(ucp_gentype)[prop->chartype] == ucp_L ||
+               PRIV(ucp_gentype)[prop->chartype] == ucp_N ||
                c == CHAR_UNDERSCORE;
           break;

@@ -1537,7 +1541,7 @@
       ADD_ACTIVE(state_offset + 2, 0);
       if (clen > 0 && UCD_CATEGORY(c) != ucp_M)
         {
-        const uschar *nptr = ptr + clen;
+        const pcre_uchar *nptr = ptr + clen;
         int ncount = 0;
         if (codevalue == OP_EXTUNI_EXTRA + OP_TYPEPOSSTAR ||
             codevalue == OP_EXTUNI_EXTRA + OP_TYPEPOSQUERY)
@@ -1719,13 +1723,13 @@
       case OP_PROP_EXTRA + OP_TYPEMINUPTO:
       case OP_PROP_EXTRA + OP_TYPEPOSUPTO:
       if (codevalue != OP_PROP_EXTRA + OP_TYPEEXACT)
-        { ADD_ACTIVE(state_offset + 6, 0); }
+        { ADD_ACTIVE(state_offset + 1 + IMM2_SIZE + 3, 0); }
       count = current_state->count;  /* Number already matched */
       if (clen > 0)
         {
         BOOL OK;
         const ucd_record * prop = GET_UCD(c);
-        switch(code[4])
+        switch(code[1 + IMM2_SIZE + 1])
           {
           case PT_ANY:
           OK = TRUE;
@@ -1737,38 +1741,38 @@
           break;

           case PT_GC:
-          OK = _pcre_ucp_gentype[prop->chartype] == code[5];
+          OK = PRIV(ucp_gentype)[prop->chartype] == code[1 + IMM2_SIZE + 2];
           break;

           case PT_PC:
-          OK = prop->chartype == code[5];
+          OK = prop->chartype == code[1 + IMM2_SIZE + 2];
           break;

           case PT_SC:
-          OK = prop->script == code[5];
+          OK = prop->script == code[1 + IMM2_SIZE + 2];
           break;

           /* These are specials for combination cases. */

           case PT_ALNUM:
-          OK = _pcre_ucp_gentype[prop->chartype] == ucp_L ||
-               _pcre_ucp_gentype[prop->chartype] == ucp_N;
+          OK = PRIV(ucp_gentype)[prop->chartype] == ucp_L ||
+               PRIV(ucp_gentype)[prop->chartype] == ucp_N;
           break;

           case PT_SPACE:    /* Perl space */
-          OK = _pcre_ucp_gentype[prop->chartype] == ucp_Z ||
+          OK = PRIV(ucp_gentype)[prop->chartype] == ucp_Z ||
                c == CHAR_HT || c == CHAR_NL || c == CHAR_FF || c == CHAR_CR;
           break;

           case PT_PXSPACE:  /* POSIX space */
-          OK = _pcre_ucp_gentype[prop->chartype] == ucp_Z ||
+          OK = PRIV(ucp_gentype)[prop->chartype] == ucp_Z ||
                c == CHAR_HT || c == CHAR_NL || c == CHAR_VT ||
                c == CHAR_FF || c == CHAR_CR;
           break;

           case PT_WORD:
-          OK = _pcre_ucp_gentype[prop->chartype] == ucp_L ||
-               _pcre_ucp_gentype[prop->chartype] == ucp_N ||
+          OK = PRIV(ucp_gentype)[prop->chartype] == ucp_L ||
+               PRIV(ucp_gentype)[prop->chartype] == ucp_N ||
                c == CHAR_UNDERSCORE;
           break;

@@ -1787,7 +1791,7 @@
             next_active_state--;
             }
           if (++count >= GET2(code, 1))
-            { ADD_NEW(state_offset + 6, 0); }
+            { ADD_NEW(state_offset + 1 + IMM2_SIZE + 3, 0); }
           else
             { ADD_NEW(state_offset, count); }
           }
@@ -1800,11 +1804,11 @@
       case OP_EXTUNI_EXTRA + OP_TYPEMINUPTO:
       case OP_EXTUNI_EXTRA + OP_TYPEPOSUPTO:
       if (codevalue != OP_EXTUNI_EXTRA + OP_TYPEEXACT)
-        { ADD_ACTIVE(state_offset + 4, 0); }
+        { ADD_ACTIVE(state_offset + 2 + IMM2_SIZE, 0); }
       count = current_state->count;  /* Number already matched */
       if (clen > 0 && UCD_CATEGORY(c) != ucp_M)
         {
-        const uschar *nptr = ptr + clen;
+        const pcre_uchar *nptr = ptr + clen;
         int ncount = 0;
         if (codevalue == OP_EXTUNI_EXTRA + OP_TYPEPOSUPTO)
           {
@@ -1821,7 +1825,7 @@
           nptr += ndlen;
           }
         if (++count >= GET2(code, 1))
-          { ADD_NEW_DATA(-(state_offset + 4), 0, ncount); }
+          { ADD_NEW_DATA(-(state_offset + 2 + IMM2_SIZE), 0, ncount); }
         else
           { ADD_NEW_DATA(-state_offset, count, ncount); }
         }
@@ -1834,7 +1838,7 @@
       case OP_ANYNL_EXTRA + OP_TYPEMINUPTO:
       case OP_ANYNL_EXTRA + OP_TYPEPOSUPTO:
       if (codevalue != OP_ANYNL_EXTRA + OP_TYPEEXACT)
-        { ADD_ACTIVE(state_offset + 4, 0); }
+        { ADD_ACTIVE(state_offset + 2 + IMM2_SIZE, 0); }
       count = current_state->count;  /* Number already matched */
       if (clen > 0)
         {
@@ -1861,7 +1865,7 @@
             next_active_state--;
             }
           if (++count >= GET2(code, 1))
-            { ADD_NEW_DATA(-(state_offset + 4), 0, ncount); }
+            { ADD_NEW_DATA(-(state_offset + 2 + IMM2_SIZE), 0, ncount); }
           else
             { ADD_NEW_DATA(-state_offset, count, ncount); }
           break;
@@ -1878,7 +1882,7 @@
       case OP_VSPACE_EXTRA + OP_TYPEMINUPTO:
       case OP_VSPACE_EXTRA + OP_TYPEPOSUPTO:
       if (codevalue != OP_VSPACE_EXTRA + OP_TYPEEXACT)
-        { ADD_ACTIVE(state_offset + 4, 0); }
+        { ADD_ACTIVE(state_offset + 2 + IMM2_SIZE, 0); }
       count = current_state->count;  /* Number already matched */
       if (clen > 0)
         {
@@ -1907,7 +1911,7 @@
             next_active_state--;
             }
           if (++count >= GET2(code, 1))
-            { ADD_NEW_DATA(-(state_offset + 4), 0, 0); }
+            { ADD_NEW_DATA(-(state_offset + 2 + IMM2_SIZE), 0, 0); }
           else
             { ADD_NEW_DATA(-state_offset, count, 0); }
           }
@@ -1920,7 +1924,7 @@
       case OP_HSPACE_EXTRA + OP_TYPEMINUPTO:
       case OP_HSPACE_EXTRA + OP_TYPEPOSUPTO:
       if (codevalue != OP_HSPACE_EXTRA + OP_TYPEEXACT)
-        { ADD_ACTIVE(state_offset + 4, 0); }
+        { ADD_ACTIVE(state_offset + 2 + IMM2_SIZE, 0); }
       count = current_state->count;  /* Number already matched */
       if (clen > 0)
         {
@@ -1962,7 +1966,7 @@
             next_active_state--;
             }
           if (++count >= GET2(code, 1))
-            { ADD_NEW_DATA(-(state_offset + 4), 0, 0); }
+            { ADD_NEW_DATA(-(state_offset + 2 + IMM2_SIZE), 0, 0); }
           else
             { ADD_NEW_DATA(-state_offset, count, 0); }
           }
@@ -1984,32 +1988,32 @@
       case OP_CHARI:
       if (clen == 0) break;

-#ifdef SUPPORT_UTF8
-      if (utf8)
+#ifdef SUPPORT_UTF
+      if (utf)
         {
         if (c == d) { ADD_NEW(state_offset + dlen + 1, 0); } else
           {
           unsigned int othercase;
-          if (c < 128) othercase = fcc[c]; else
-
-          /* If we have Unicode property support, we can use it to test the
-          other case of the character. */
-
+          if (c < 128)
+            othercase = fcc[c];
+          else
+            /* If we have Unicode property support, we can use it to test the
+            other case of the character. */
 #ifdef SUPPORT_UCP
-          othercase = UCD_OTHERCASE(c);
+            othercase = UCD_OTHERCASE(c);
 #else
-          othercase = NOTACHAR;
+            othercase = NOTACHAR;
 #endif

           if (d == othercase) { ADD_NEW(state_offset + dlen + 1, 0); }
           }
         }
       else
-#endif  /* SUPPORT_UTF8 */
-
-      /* Non-UTF-8 mode */
+#endif  /* SUPPORT_UTF */
+      /* Not UTF mode */
         {
-        if (lcc[c] == lcc[d]) { ADD_NEW(state_offset + 2, 0); }
+        if (TABLE_GET(c, lcc, c) == TABLE_GET(d, lcc, d))
+          { ADD_NEW(state_offset + 2, 0); }
         }
       break;

@@ -2023,7 +2027,7 @@
       case OP_EXTUNI:
       if (clen > 0 && UCD_CATEGORY(c) != ucp_M)
         {
-        const uschar *nptr = ptr + clen;
+        const pcre_uchar *nptr = ptr + clen;
         int ncount = 0;
         while (nptr < end_subject)
           {
@@ -2209,16 +2213,16 @@
         unsigned int otherd = NOTACHAR;
         if (caseless)
           {
-#ifdef SUPPORT_UTF8
-          if (utf8 && d >= 128)
+#ifdef SUPPORT_UTF
+          if (utf && d >= 128)
             {
 #ifdef SUPPORT_UCP
             otherd = UCD_OTHERCASE(d);
 #endif  /* SUPPORT_UCP */
             }
           else
-#endif  /* SUPPORT_UTF8 */
-          otherd = fcc[d];
+#endif  /* SUPPORT_UTF */
+          otherd = TABLE_GET(d, fcc, d);
           }
         if ((c == d || c == otherd) == (codevalue < OP_NOTSTAR))
           {
@@ -2256,16 +2260,16 @@
         unsigned int otherd = NOTACHAR;
         if (caseless)
           {
-#ifdef SUPPORT_UTF8
-          if (utf8 && d >= 128)
+#ifdef SUPPORT_UTF
+          if (utf && d >= 128)
             {
 #ifdef SUPPORT_UCP
             otherd = UCD_OTHERCASE(d);
 #endif  /* SUPPORT_UCP */
             }
           else
-#endif  /* SUPPORT_UTF8 */
-          otherd = fcc[d];
+#endif  /* SUPPORT_UTF */
+          otherd = TABLE_GET(d, fcc, d);
           }
         if ((c == d || c == otherd) == (codevalue < OP_NOTSTAR))
           {
@@ -2301,16 +2305,16 @@
         unsigned int otherd = NOTACHAR;
         if (caseless)
           {
-#ifdef SUPPORT_UTF8
-          if (utf8 && d >= 128)
+#ifdef SUPPORT_UTF
+          if (utf && d >= 128)
             {
 #ifdef SUPPORT_UCP
             otherd = UCD_OTHERCASE(d);
 #endif  /* SUPPORT_UCP */
             }
           else
-#endif  /* SUPPORT_UTF8 */
-          otherd = fcc[d];
+#endif  /* SUPPORT_UTF */
+          otherd = TABLE_GET(d, fcc, d);
           }
         if ((c == d || c == otherd) == (codevalue < OP_NOTSTAR))
           {
@@ -2338,21 +2342,21 @@
         unsigned int otherd = NOTACHAR;
         if (caseless)
           {
-#ifdef SUPPORT_UTF8
-          if (utf8 && d >= 128)
+#ifdef SUPPORT_UTF
+          if (utf && d >= 128)
             {
 #ifdef SUPPORT_UCP
             otherd = UCD_OTHERCASE(d);
 #endif  /* SUPPORT_UCP */
             }
           else
-#endif  /* SUPPORT_UTF8 */
-          otherd = fcc[d];
+#endif  /* SUPPORT_UTF */
+          otherd = TABLE_GET(d, fcc, d);
           }
         if ((c == d || c == otherd) == (codevalue < OP_NOTSTAR))
           {
           if (++count >= GET2(code, 1))
-            { ADD_NEW(state_offset + dlen + 3, 0); }
+            { ADD_NEW(state_offset + dlen + 1 + IMM2_SIZE, 0); }
           else
             { ADD_NEW(state_offset, count); }
           }
@@ -2375,23 +2379,23 @@
       case OP_NOTUPTO:
       case OP_NOTMINUPTO:
       case OP_NOTPOSUPTO:
-      ADD_ACTIVE(state_offset + dlen + 3, 0);
+      ADD_ACTIVE(state_offset + dlen + 1 + IMM2_SIZE, 0);
       count = current_state->count;  /* Number already matched */
       if (clen > 0)
         {
         unsigned int otherd = NOTACHAR;
         if (caseless)
           {
-#ifdef SUPPORT_UTF8
-          if (utf8 && d >= 128)
+#ifdef SUPPORT_UTF
+          if (utf && d >= 128)
             {
 #ifdef SUPPORT_UCP
             otherd = UCD_OTHERCASE(d);
 #endif  /* SUPPORT_UCP */
             }
           else
-#endif  /* SUPPORT_UTF8 */
-          otherd = fcc[d];
+#endif  /* SUPPORT_UTF */
+          otherd = TABLE_GET(d, fcc, d);
           }
         if ((c == d || c == otherd) == (codevalue < OP_NOTSTAR))
           {
@@ -2401,7 +2405,7 @@
             next_active_state--;
             }
           if (++count >= GET2(code, 1))
-            { ADD_NEW(state_offset + dlen + 3, 0); }
+            { ADD_NEW(state_offset + dlen + 1 + IMM2_SIZE, 0); }
           else
             { ADD_NEW(state_offset, count); }
           }
@@ -2418,18 +2422,18 @@
         {
         BOOL isinclass = FALSE;
         int next_state_offset;
-        const uschar *ecode;
+        const pcre_uchar *ecode;

         /* For a simple class, there is always just a 32-byte table, and we
         can set isinclass from it. */

         if (codevalue != OP_XCLASS)
           {
-          ecode = code + 33;
+          ecode = code + 1 + (32 / sizeof(pcre_uchar));
           if (clen > 0)
             {
             isinclass = (c > 255)? (codevalue == OP_NCLASS) :
-              ((code[1 + c/8] & (1 << (c&7))) != 0);
+              ((((pcre_uint8 *)(code + 1))[c/8] & (1 << (c&7))) != 0);
             }
           }

@@ -2440,7 +2444,7 @@
         else
          {
          ecode = code + GET(code, 1);
-         if (clen > 0) isinclass = _pcre_xclass(c, code + 1 + LINK_SIZE);
+         if (clen > 0) isinclass = PRIV(xclass)(c, code + 1 + LINK_SIZE, utf);
          }

         /* At this point, isinclass is set for all kinds of class, and ecode
@@ -2474,12 +2478,12 @@
           case OP_CRMINRANGE:
           count = current_state->count;  /* Already matched */
           if (count >= GET2(ecode, 1))
-            { ADD_ACTIVE(next_state_offset + 5, 0); }
+            { ADD_ACTIVE(next_state_offset + 1 + 2 * IMM2_SIZE, 0); }
           if (isinclass)
             {
-            int max = GET2(ecode, 3);
+            int max = GET2(ecode, 1 + IMM2_SIZE);
             if (++count >= max && max != 0)   /* Max 0 => no limit */
-              { ADD_NEW(next_state_offset + 5, 0); }
+              { ADD_NEW(next_state_offset + 1 + 2 * IMM2_SIZE, 0); }
             else
               { ADD_NEW(state_offset, count); }
             }
@@ -2510,7 +2514,7 @@
         int rc;
         int local_offsets[2];
         int local_workspace[1000];
-        const uschar *endasscode = code + GET(code, 1);
+        const pcre_uchar *endasscode = code + GET(code, 1);

         while (*endasscode == OP_ALT) endasscode += GET(endasscode, 1);

@@ -2547,7 +2551,7 @@
         if (code[LINK_SIZE+1] == OP_CALLOUT)
           {
           rrc = 0;
-          if (pcre_callout != NULL)
+          if (PUBL(callout) != NULL)
             {
             pcre_callout_block cb;
             cb.version          = 1;   /* Version 1 of the callout block */
@@ -2563,10 +2567,10 @@
             cb.capture_last     = -1;
             cb.callout_data     = md->callout_data;
             cb.mark             = NULL;   /* No (*MARK) support */
-            if ((rrc = (*pcre_callout)(&cb)) < 0) return rrc;   /* Abandon */
+            if ((rrc = (*PUBL(callout))(&cb)) < 0) return rrc;   /* Abandon */
             }
           if (rrc > 0) break;                      /* Fail this thread */
-          code += _pcre_OP_lengths[OP_CALLOUT];    /* Skip callout data */
+          code += PRIV(OP_lengths)[OP_CALLOUT];    /* Skip callout data */
           }

         condcode = code[LINK_SIZE+1];
@@ -2587,10 +2591,10 @@

         else if (condcode == OP_RREF || condcode == OP_NRREF)
           {
-          int value = GET2(code, LINK_SIZE+2);
+          int value = GET2(code, LINK_SIZE + 2);
           if (value != RREF_ANY) return PCRE_ERROR_DFA_UCOND;
           if (md->recursive != NULL)
-            { ADD_ACTIVE(state_offset + LINK_SIZE + 4, 0); }
+            { ADD_ACTIVE(state_offset + LINK_SIZE + 2 + IMM2_SIZE, 0); }
           else { ADD_ACTIVE(state_offset + codelink + LINK_SIZE + 1, 0); }
           }

@@ -2599,8 +2603,8 @@
         else
           {
           int rc;
-          const uschar *asscode = code + LINK_SIZE + 1;
-          const uschar *endasscode = asscode + GET(asscode, 1);
+          const pcre_uchar *asscode = code + LINK_SIZE + 1;
+          const pcre_uchar *endasscode = asscode + GET(asscode, 1);

           while (*endasscode == OP_ALT) endasscode += GET(endasscode, 1);

@@ -2631,7 +2635,7 @@
         dfa_recursion_info *ri;
         int local_offsets[1000];
         int local_workspace[1000];
-        const uschar *callpat = start_code + GET(code, 1);
+        const pcre_uchar *callpat = start_code + GET(code, 1);
         int recno = (callpat == md->start_code)? 0 :
           GET2(callpat, 1 + LINK_SIZE);
         int rc;
@@ -2682,10 +2686,12 @@
           {
           for (rc = rc*2 - 2; rc >= 0; rc -= 2)
             {
-            const uschar *p = start_subject + local_offsets[rc];
-            const uschar *pp = start_subject + local_offsets[rc+1];
+            const pcre_uchar *p = start_subject + local_offsets[rc];
+            const pcre_uchar *pp = start_subject + local_offsets[rc+1];
             int charcount = local_offsets[rc+1] - local_offsets[rc];
-            while (p < pp) if ((*p++ & 0xc0) == 0x80) charcount--;
+#ifdef SUPPORT_UTF
+            while (p < pp) if (NOT_FIRSTCHAR(*p++)) charcount--;
+#endif
             if (charcount > 0)
               {
               ADD_NEW_DATA(-(state_offset + LINK_SIZE + 1), 0, (charcount - 1));
@@ -2708,7 +2714,7 @@
       case OP_BRAPOSZERO:
         {
         int charcount, matched_count;
-        const uschar *local_ptr = ptr;
+        const pcre_uchar *local_ptr = ptr;
         BOOL allow_zero;

         if (codevalue == OP_BRAPOSZERO)
@@ -2758,7 +2764,7 @@

         if (matched_count > 0 || allow_zero)
           {
-          const uschar *end_subpattern = code;
+          const pcre_uchar *end_subpattern = code;
           int next_state_offset;

           do { end_subpattern += GET(end_subpattern, 1); }
@@ -2779,10 +2785,12 @@
             }
           else
             {
-            const uschar *p = ptr;
-            const uschar *pp = local_ptr;
-            charcount = pp - p;
-            while (p < pp) if ((*p++ & 0xc0) == 0x80) charcount--;
+            const pcre_uchar *p = ptr;
+            const pcre_uchar *pp = local_ptr;
+            charcount = (int)(pp - p);
+#ifdef SUPPORT_UTF
+            while (p < pp) if (NOT_FIRSTCHAR(*p++)) charcount--;
+#endif
             ADD_NEW_DATA(-next_state_offset, 0, (charcount - 1));
             }
           }
@@ -2809,7 +2817,7 @@

         if (rc >= 0)
           {
-          const uschar *end_subpattern = code;
+          const pcre_uchar *end_subpattern = code;
           int charcount = local_offsets[1] - local_offsets[0];
           int next_state_offset, repeat_state_offset;

@@ -2862,9 +2870,11 @@
             }
           else
             {
-            const uschar *p = start_subject + local_offsets[0];
-            const uschar *pp = start_subject + local_offsets[1];
-            while (p < pp) if ((*p++ & 0xc0) == 0x80) charcount--;
+#ifdef SUPPORT_UTF
+            const pcre_uchar *p = start_subject + local_offsets[0];
+            const pcre_uchar *pp = start_subject + local_offsets[1];
+            while (p < pp) if (NOT_FIRSTCHAR(*p++)) charcount--;
+#endif
             ADD_NEW_DATA(-next_state_offset, 0, (charcount - 1));
             if (repeat_state_offset >= 0)
               { ADD_NEW_DATA(-repeat_state_offset, 0, (charcount - 1)); }
@@ -2880,7 +2890,7 @@

       case OP_CALLOUT:
       rrc = 0;
-      if (pcre_callout != NULL)
+      if (PUBL(callout) != NULL)
         {
         pcre_callout_block cb;
         cb.version          = 1;   /* Version 1 of the callout block */
@@ -2896,10 +2906,10 @@
         cb.capture_last     = -1;
         cb.callout_data     = md->callout_data;
         cb.mark             = NULL;   /* No (*MARK) support */
-        if ((rrc = (*pcre_callout)(&cb)) < 0) return rrc;   /* Abandon */
+        if ((rrc = (*PUBL(callout))(&cb)) < 0) return rrc;   /* Abandon */
         }
       if (rrc == 0)
-        { ADD_ACTIVE(state_offset + _pcre_OP_lengths[OP_CALLOUT], 0); }
+        { ADD_ACTIVE(state_offset + PRIV(OP_lengths)[OP_CALLOUT], 0); }
       break;

@@ -2996,28 +3006,35 @@
                  < -1 => some kind of unexpected problem
 */

+#ifdef COMPILE_PCRE8
PCRE_EXP_DEFN int PCRE_CALL_CONVENTION
pcre_dfa_exec(const pcre *argument_re, const pcre_extra *extra_data,
const char *subject, int length, int start_offset, int options, int *offsets,
int offsetcount, int *workspace, int wscount)
+#else
+PCRE_EXP_DEFN int PCRE_CALL_CONVENTION
+pcre16_dfa_exec(const pcre *argument_re, const pcre_extra *extra_data,
+ PCRE_SPTR16 subject, int length, int start_offset, int options, int *offsets,
+ int offsetcount, int *workspace, int wscount)
+#endif
{
real_pcre *re = (real_pcre *)argument_re;
dfa_match_data match_block;
dfa_match_data *md = &match_block;
-BOOL utf8, anchored, startline, firstline;
-const uschar *current_subject, *end_subject, *lcc;
+BOOL utf, anchored, startline, firstline;
+const pcre_uchar *current_subject, *end_subject;
+const pcre_uint8 *lcc;

-pcre_study_data internal_study;
const pcre_study_data *study = NULL;
-real_pcre internal_re;

-const uschar *req_byte_ptr;
-const uschar *start_bits = NULL;
-BOOL first_byte_caseless = FALSE;
-BOOL req_byte_caseless = FALSE;
-int first_byte = -1;
-int req_byte = -1;
-int req_byte2 = -1;
+const pcre_uchar *req_char_ptr;
+const pcre_uint8 *start_bits = NULL;
+BOOL has_first_char = FALSE;
+BOOL has_req_char = FALSE;
+pcre_uchar first_char = 0;
+pcre_uchar first_char2 = 0;
+pcre_uchar req_char = 0;
+pcre_uchar req_char2 = 0;
int newline;

/* Plausibility checks */
@@ -3052,27 +3069,26 @@
}

/* Check that the first field in the block is the magic number. If it is not,
-test for a regex that was compiled on a host of opposite endianness. If this is
-the case, flipped values are put in internal_re and internal_study if there was
-study data too. */
+return with PCRE_ERROR_BADMAGIC. However, if the magic number is equal to
+REVERSED_MAGIC_NUMBER we return with PCRE_ERROR_BADENDIANNESS, which
+means that the pattern is likely compiled with different endianness. */

 if (re->magic_number != MAGIC_NUMBER)
-  {
-  re = _pcre_try_flipped(re, &internal_re, study, &internal_study);
-  if (re == NULL) return PCRE_ERROR_BADMAGIC;
-  if (study != NULL) study = &internal_study;
-  }
+  return re->magic_number == REVERSED_MAGIC_NUMBER?
+    PCRE_ERROR_BADENDIANNESS:PCRE_ERROR_BADMAGIC;
+if ((re->flags & PCRE_MODE) == 0) return PCRE_ERROR_BADMODE;

/* Set some local values */

-current_subject = (const unsigned char *)subject + start_offset;
-end_subject = (const unsigned char *)subject + length;
-req_byte_ptr = current_subject - 1;
+current_subject = (const pcre_uchar *)subject + start_offset;
+end_subject = (const pcre_uchar *)subject + length;
+req_char_ptr = current_subject - 1;

-#ifdef SUPPORT_UTF8
-utf8 = (re->options & PCRE_UTF8) != 0;
+#ifdef SUPPORT_UTF
+/* PCRE_UTF16 has the same value as PCRE_UTF8. */
+utf = (re->options & PCRE_UTF8) != 0;
#else
-utf8 = FALSE;
+utf = FALSE;
#endif

anchored = (options & (PCRE_ANCHORED|PCRE_DFA_RESTART)) != 0 ||
@@ -3080,9 +3096,9 @@

/* The remaining fixed data for passing around. */

-md->start_code = (const uschar *)argument_re +
+md->start_code = (const pcre_uchar *)argument_re +
     re->name_table_offset + re->name_count * re->name_entry_size;
-md->start_subject = (const unsigned char *)subject;
+md->start_subject = (const pcre_uchar *)subject;
 md->end_subject = end_subject;
 md->start_offset = start_offset;
 md->moptions = options;
@@ -3143,11 +3159,11 @@
 /* Check a UTF-8 string if required. Unfortunately there's no way of passing
 back the character offset. */

-#ifdef SUPPORT_UTF8
-if (utf8 && (options & PCRE_NO_UTF8_CHECK) == 0)
+#ifdef SUPPORT_UTF
+if (utf && (options & PCRE_NO_UTF8_CHECK) == 0)
   {
   int erroroffset;
-  int errorcode = _pcre_valid_utf8((uschar *)subject, length, &erroroffset);
+  int errorcode = PRIV(valid_utf)((pcre_uchar *)subject, length, &erroroffset);
   if (errorcode != 0)
     {
     if (offsetcount >= 2)
@@ -3159,7 +3175,7 @@
       PCRE_ERROR_SHORTUTF8 : PCRE_ERROR_BADUTF8;
     }
   if (start_offset > 0 && start_offset < length &&
-        (((USPTR)subject)[start_offset] & 0xc0) == 0x80)
+        NOT_FIRSTCHAR(((PCRE_PUCHAR)subject)[start_offset]))
     return PCRE_ERROR_BADUTF8_OFFSET;
   }
 #endif
@@ -3168,7 +3184,7 @@
 is a feature that makes it possible to save compiled regex and re-use them
 in other programs later. */

-if (md->tables == NULL) md->tables = _pcre_default_tables;
+if (md->tables == NULL) md->tables = PRIV(default_tables);

 /* The lower casing table and the "must be at the start of a line" flag are
 used in a loop when finding where to start. */
@@ -3187,9 +3203,16 @@
   {
   if ((re->flags & PCRE_FIRSTSET) != 0)
     {
-    first_byte = re->first_byte & 255;
-    if ((first_byte_caseless = ((re->first_byte & REQ_CASELESS) != 0)) == TRUE)
-      first_byte = lcc[first_byte];
+    has_first_char = TRUE;
+    first_char = first_char2 = re->first_char;
+    if ((re->flags & PCRE_FCH_CASELESS) != 0)
+      {
+      first_char2 = TABLE_GET(first_char, md->tables + fcc_offset, first_char);
+#if defined SUPPORT_UCP && !(defined COMPILE_PCRE8)
+      if (utf && first_char > 127)
+        first_char2 = UCD_OTHERCASE(first_char);
+#endif
+      }
     }
   else
     {
@@ -3204,9 +3227,16 @@

 if ((re->flags & PCRE_REQCHSET) != 0)
   {
-  req_byte = re->req_byte & 255;
-  req_byte_caseless = (re->req_byte & REQ_CASELESS) != 0;
-  req_byte2 = (md->tables + fcc_offset)[req_byte];  /* case flipped */
+  has_req_char = TRUE;
+  req_char = req_char2 = re->req_char;
+  if ((re->flags & PCRE_RCH_CASELESS) != 0)
+    {
+    req_char2 = TABLE_GET(req_char, md->tables + fcc_offset, req_char);
+#if defined SUPPORT_UCP && !(defined COMPILE_PCRE8)
+    if (utf && req_char > 127)
+      req_char2 = UCD_OTHERCASE(req_char);
+#endif
+    }
   }

/* Call the main matching function, looping for a non-anchored regex after a
@@ -3219,7 +3249,7 @@

   if ((options & PCRE_DFA_RESTART) == 0)
     {
-    const uschar *save_end_subject = end_subject;
+    const pcre_uchar *save_end_subject = end_subject;

     /* If firstline is TRUE, the start of the match is constrained to the first
     line of a multiline string. Implement this by temporarily adjusting
@@ -3228,14 +3258,14 @@

     if (firstline)
       {
-      USPTR t = current_subject;
-#ifdef SUPPORT_UTF8
-      if (utf8)
+      PCRE_PUCHAR t = current_subject;
+#ifdef SUPPORT_UTF
+      if (utf)
         {
         while (t < md->end_subject && !IS_NEWLINE(t))
           {
           t++;
-          while (t < end_subject && (*t & 0xc0) == 0x80) t++;
+          ACROSSCHAR(t < end_subject, *t, t++);
           }
         }
       else
@@ -3252,17 +3282,17 @@

     if (((options | re->options) & PCRE_NO_START_OPTIMIZE) == 0)
       {
-      /* Advance to a known first byte. */
+      /* Advance to a known first char. */

-      if (first_byte >= 0)
+      if (has_first_char)
         {
-        if (first_byte_caseless)
+        if (first_char != first_char2)
           while (current_subject < end_subject &&
-                 lcc[*current_subject] != first_byte)
+              *current_subject != first_char && *current_subject != first_char2)
             current_subject++;
         else
           while (current_subject < end_subject &&
-                 *current_subject != first_byte)
+                 *current_subject != first_char)
             current_subject++;
         }

@@ -3272,16 +3302,15 @@
         {
         if (current_subject > md->start_subject + start_offset)
           {
-#ifdef SUPPORT_UTF8
-          if (utf8)
+#ifdef SUPPORT_UTF
+          if (utf)
             {
             while (current_subject < end_subject &&
                    !WAS_NEWLINE(current_subject))
               {
               current_subject++;
-              while(current_subject < end_subject &&
-                    (*current_subject & 0xc0) == 0x80)
-                current_subject++;
+              ACROSSCHAR(current_subject < end_subject, *current_subject,
+                current_subject++);
               }
             }
           else
@@ -3308,13 +3337,18 @@
         while (current_subject < end_subject)
           {
           register unsigned int c = *current_subject;
+#ifndef COMPILE_PCRE8
+          if (c > 255) c = 255;
+#endif
           if ((start_bits[c/8] & (1 << (c&7))) == 0)
             {
             current_subject++;
-#ifdef SUPPORT_UTF8
-            if (utf8)
-              while(current_subject < end_subject &&
-                    (*current_subject & 0xc0) == 0x80) current_subject++;
+#if defined SUPPORT_UTF && defined COMPILE_PCRE8
+            /* In non 8-bit mode, the iteration will stop for
+            characters > 255 at the beginning or not stop at all. */
+            if (utf)
+              ACROSSCHAR(current_subject < end_subject, *current_subject,
+                current_subject++);
 #endif
             }
           else break;
@@ -3342,8 +3376,8 @@
           (pcre_uint32)(end_subject - current_subject) < study->minlength)
         return PCRE_ERROR_NOMATCH;

-      /* If req_byte is set, we know that that character must appear in the
-      subject for the match to succeed. If the first character is set, req_byte
+      /* If req_char is set, we know that that character must appear in the
+      subject for the match to succeed. If the first character is set, req_char
       must be later in the subject; otherwise the test starts at the match
       point. This optimization can save a huge amount of work in patterns with
       nested unlimited repeats that aren't going to match. Writing separate
@@ -3355,28 +3389,28 @@
       patterns. This showed up when somebody was matching /^C/ on a 32-megabyte
       string... so we don't do this when the string is sufficiently long. */

-      if (req_byte >= 0 && end_subject - current_subject < REQ_BYTE_MAX)
+      if (has_req_char && end_subject - current_subject < REQ_BYTE_MAX)
         {
-        register const uschar *p = current_subject + ((first_byte >= 0)? 1 : 0);
+        register PCRE_PUCHAR p = current_subject + (has_first_char? 1:0);

         /* We don't need to repeat the search if we haven't yet reached the
         place we found it at last time. */

-        if (p > req_byte_ptr)
+        if (p > req_char_ptr)
           {
-          if (req_byte_caseless)
+          if (req_char != req_char2)
             {
             while (p < end_subject)
               {
               register int pp = *p++;
-              if (pp == req_byte || pp == req_byte2) { p--; break; }
+              if (pp == req_char || pp == req_char2) { p--; break; }
               }
             }
           else
             {
             while (p < end_subject)
               {
-              if (*p++ == req_byte) { p--; break; }
+              if (*p++ == req_char) { p--; break; }
               }
             }

@@ -3389,7 +3423,7 @@
           found it, so that we don't search again next time round the loop if
           the start hasn't passed this character yet. */

-          req_byte_ptr = p;
+          req_char_ptr = p;
           }
         }
       }
@@ -3421,11 +3455,13 @@

   if (firstline && IS_NEWLINE(current_subject)) break;
   current_subject++;
-  if (utf8)
+#ifdef SUPPORT_UTF
+  if (utf)
     {
-    while (current_subject < end_subject && (*current_subject & 0xc0) == 0x80)
-      current_subject++;
+    ACROSSCHAR(current_subject < end_subject, *current_subject,
+      current_subject++);
     }
+#endif
   if (current_subject > end_subject) break;

/* If we have just passed a CR and we are now at a LF, and the pattern does
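
The charcount loops in the pcre_dfa_exec.c changes above replace the
hard-coded UTF-8 test (*p++ & 0xc0) == 0x80 with NOT_FIRSTCHAR(*p++), so that
one source file can count characters for both the 8-bit and the 16-bit
library. The sketch below illustrates the idea outside PCRE. It assumes that
the 16-bit form of NOT_FIRSTCHAR() tests for a UTF-16 low surrogate
(0xdc00-0xdfff); the real definition is in pcre_internal.h and is not shown in
this part of the diff. The names NOT_FIRSTCHAR8, NOT_FIRSTCHAR16, count_chars8
and count_chars16 are invented for this example and are not PCRE identifiers.

```c
#include <stddef.h>
#include <stdint.h>

/* ASSUMED definitions, for illustration only. A code unit that is not the
   first unit of a character is a UTF-8 continuation byte (10xxxxxx) or a
   UTF-16 low surrogate (0xdc00-0xdfff). */
#define NOT_FIRSTCHAR8(c)  (((c) & 0xc0) == 0x80)
#define NOT_FIRSTCHAR16(c) (((c) & 0xfc00) == 0xdc00)

/* Count characters in [p, pp): start from the number of code units and
   subtract every unit that merely continues a character, as the diff's
   charcount loops do. */
int count_chars8(const uint8_t *p, const uint8_t *pp)
{
  int charcount = (int)(pp - p);
  while (p < pp) if (NOT_FIRSTCHAR8(*p++)) charcount--;
  return charcount;
}

/* The same loop over 16-bit code units. */
int count_chars16(const uint16_t *p, const uint16_t *pp)
{
  int charcount = (int)(pp - p);
  while (p < pp) if (NOT_FIRSTCHAR16(*p++)) charcount--;
  return charcount;
}
```

For the six UTF-8 bytes 61 c3 a9 e2 82 ac, count_chars8() starts at 6 and
subtracts the 3 continuation bytes, giving 3 characters. For the four UTF-16
units 0061 d83d de00 20ac, count_chars16() starts at 4 and subtracts the one
low surrogate, also giving 3.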

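Most of the offset changes in the pcre_dfa_exec.c diff above replace a
literal opcode length with an expression in IMM2_SIZE, the number of code
units taken by a 16-bit immediate value such as a repeat count. The old
literals imply that IMM2_SIZE is 2 when a code unit is one byte (6 became
1 + IMM2_SIZE + 3, and 4 became 2 + IMM2_SIZE). The sketch below works
through that arithmetic, assuming that IMM2_SIZE is 1 when a code unit is 16
bits wide, which this part of the diff does not show. All function names are
hypothetical and exist only for this example.

```c
#include <stddef.h>

/* Code units occupied by one 16-bit immediate. ASSUMPTION: 2 units when a
   code unit is one byte, 1 unit when a code unit is two bytes. */
size_t imm2_size(size_t unit_bytes)
{
  return (unit_bytes == 1)? 2 : 1;
}

/* OP_PROP_EXTRA + OP_TYPEUPTO and friends: 1 + IMM2_SIZE + 3 */
size_t prop_repeat_skip(size_t unit_bytes)
{
  return 1 + imm2_size(unit_bytes) + 3;
}

/* OP_EXTUNI_EXTRA + OP_TYPEUPTO and friends: 2 + IMM2_SIZE */
size_t type_repeat_skip(size_t unit_bytes)
{
  return 2 + imm2_size(unit_bytes);
}

/* OP_CRRANGE / OP_CRMINRANGE: 1 + 2 * IMM2_SIZE (min and max counts) */
size_t crrange_skip(size_t unit_bytes)
{
  return 1 + 2 * imm2_size(unit_bytes);
}

/* Simple class: opcode plus a 32-byte bitmap, 1 + (32 / sizeof(pcre_uchar)) */
size_t class_skip(size_t unit_bytes)
{
  return 1 + 32 / unit_bytes;
}
```

With one-byte code units these give 6, 4, 5 and 33, which are the literals
removed by the diff. With two-byte code units, under the assumption stated
above, they give 5, 3, 3 and 17.
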
Modified: code/trunk/pcre_exec.c
===================================================================
--- code/trunk/pcre_exec.c    2011-12-28 16:10:09 UTC (rev 835)
+++ code/trunk/pcre_exec.c    2011-12-28 17:16:11 UTC (rev 836)
@@ -6,7 +6,7 @@
 and semantics are as close as possible to those of the Perl 5 language.

                        Written by Philip Hazel
-           Copyright (c) 1997-2011 University of Cambridge
+           Copyright (c) 1997-2012 University of Cambridge

 -----------------------------------------------------------------------------
 Redistribution and use in source and binary forms, with or without
@@ -82,14 +82,6 @@
 #define MATCH_SKIP_ARG     (-993)
 #define MATCH_THEN         (-992)

-/* This is a convenience macro for code that occurs many times. */
-
-#define MRRETURN(ra) \
- { \
- md->mark = markptr; \
- RRETURN(ra); \
- }
-
/* Maximum number of ints of offset to save on the stack for recursive calls.
If the offset vector is bigger, malloc is used. This should be a multiple of 3,
because the offset vector is always a multiple of 3 long. */
@@ -121,7 +113,7 @@
*/

static void
-pchars(const uschar *p, int length, BOOL is_subject, match_data *md)
+pchars(const pcre_uchar *p, int length, BOOL is_subject, match_data *md)
{
unsigned int c;
if (is_subject && length > md->end_subject - p) length = md->end_subject - p;
@@ -152,11 +144,11 @@
*/

static int
-match_ref(int offset, register USPTR eptr, int length, match_data *md,
+match_ref(int offset, register PCRE_PUCHAR eptr, int length, match_data *md,
BOOL caseless)
{
-USPTR eptr_start = eptr;
-register USPTR p = md->start_subject + md->offset_vector[offset];
+PCRE_PUCHAR eptr_start = eptr;
+register PCRE_PUCHAR p = md->start_subject + md->offset_vector[offset];

#ifdef PCRE_DEBUG
if (eptr >= md->end_subject)
@@ -181,9 +173,9 @@

 if (caseless)
   {
-#ifdef SUPPORT_UTF8
+#ifdef SUPPORT_UTF
 #ifdef SUPPORT_UCP
-  if (md->utf8)
+  if (md->utf)
     {
     /* Match characters up to the end of the reference. NOTE: the number of
     bytes matched may differ, because there are some characters whose upper and
@@ -193,7 +185,7 @@
     the latter. It is important, therefore, to check the length along the
     reference, not along the subject (earlier code did this wrong). */

-    USPTR endptr = p + length;
+    PCRE_PUCHAR endptr = p + length;
     while (p < endptr)
       {
       int c, d;
@@ -212,7 +204,11 @@
     {
     if (eptr + length > md->end_subject) return -1;
     while (length-- > 0)
-      { if (md->lcc[*p++] != md->lcc[*eptr++]) return -1; }
+      {
+      if (TABLE_GET(*p, md->lcc, *p) != TABLE_GET(*eptr, md->lcc, *eptr)) return -1;
+      p++;
+      eptr++;
+      }
     }
   }

@@ -225,7 +221,7 @@
while (length-- > 0) if (*p++ != *eptr++) return -1;
}

-return eptr - eptr_start;
+return (int)(eptr - eptr_start);
}

@@ -290,7 +286,7 @@
#define RMATCH(ra,rb,rc,rd,re,rw) \
{ \
printf("match() called in line %d\n", __LINE__); \
- rrc = match(ra,rb,mstart,markptr,rc,rd,re,rdepth+1); \
+ rrc = match(ra,rb,mstart,rc,rd,re,rdepth+1); \
printf("to line %d\n", __LINE__); \
}
#define RRETURN(ra) \
@@ -300,7 +296,7 @@
}
#else
#define RMATCH(ra,rb,rc,rd,re,rw) \
- rrc = match(ra,rb,mstart,markptr,rc,rd,re,rdepth+1)
+ rrc = match(ra,rb,mstart,rc,rd,re,rdepth+1)
#define RRETURN(ra) return ra
#endif

@@ -315,13 +311,12 @@

 #define RMATCH(ra,rb,rc,rd,re,rw)\
   {\
-  heapframe *newframe = (heapframe *)(pcre_stack_malloc)(sizeof(heapframe));\
+  heapframe *newframe = (heapframe *)(PUBL(stack_malloc))(sizeof(heapframe));\
   if (newframe == NULL) RRETURN(PCRE_ERROR_NOMEMORY);\
   frame->Xwhere = rw; \
   newframe->Xeptr = ra;\
   newframe->Xecode = rb;\
   newframe->Xmstart = mstart;\
-  newframe->Xmarkptr = markptr;\
   newframe->Xoffset_top = rc;\
   newframe->Xeptrb = re;\
   newframe->Xrdepth = frame->Xrdepth + 1;\
@@ -337,7 +332,7 @@
   {\
   heapframe *oldframe = frame;\
   frame = oldframe->Xprevframe;\
-  (pcre_stack_free)(oldframe);\
+  (PUBL(stack_free))(oldframe);\
   if (frame != NULL)\
     {\
     rrc = ra;\
@@ -354,25 +349,24 @@

/* Function arguments that may change */

- USPTR Xeptr;
- const uschar *Xecode;
- USPTR Xmstart;
- USPTR Xmarkptr;
+ PCRE_PUCHAR Xeptr;
+ const pcre_uchar *Xecode;
+ PCRE_PUCHAR Xmstart;
int Xoffset_top;
eptrblock *Xeptrb;
unsigned int Xrdepth;

/* Function local variables */

- USPTR Xcallpat;
-#ifdef SUPPORT_UTF8
- USPTR Xcharptr;
+ PCRE_PUCHAR Xcallpat;
+#ifdef SUPPORT_UTF
+ PCRE_PUCHAR Xcharptr;
#endif
- USPTR Xdata;
- USPTR Xnext;
- USPTR Xpp;
- USPTR Xprev;
- USPTR Xsaved_eptr;
+ PCRE_PUCHAR Xdata;
+ PCRE_PUCHAR Xnext;
+ PCRE_PUCHAR Xpp;
+ PCRE_PUCHAR Xprev;
+ PCRE_PUCHAR Xsaved_eptr;

recursion_info Xnew_recursive;

@@ -385,7 +379,7 @@
int Xprop_value;
int Xprop_fail_result;
int Xoclength;
- uschar Xocchars[8];
+ pcre_uchar Xocchars[6];
#endif

int Xcodelink;
@@ -427,7 +421,7 @@
same response. */

 /* These macros pack up tests that are used for partial matching, and which
-appears several times in the code. We set the "hit end" flag if the pointer is
+appear several times in the code. We set the "hit end" flag if the pointer is
 at the end of the subject and also past the start of the subject (i.e.
 something has been matched). For hard partial matching, we then return
 immediately. The second one is used when we already know we are past the end of
@@ -438,19 +432,19 @@
       eptr > md->start_used_ptr) \
     { \
     md->hitend = TRUE; \
-    if (md->partial > 1) MRRETURN(PCRE_ERROR_PARTIAL); \
+    if (md->partial > 1) RRETURN(PCRE_ERROR_PARTIAL); \
     }

 #define SCHECK_PARTIAL()\
   if (md->partial != 0 && eptr > md->start_used_ptr) \
     { \
     md->hitend = TRUE; \
-    if (md->partial > 1) MRRETURN(PCRE_ERROR_PARTIAL); \
+    if (md->partial > 1) RRETURN(PCRE_ERROR_PARTIAL); \
     }

/* Performance note: It might be tempting to extract commonly used fields from
-the md structure (e.g. utf8, end_subject) into individual variables to improve
+the md structure (e.g. utf, end_subject) into individual variables to improve
performance. Tests using gcc on a SPARC disproved this; in the first case, it
made performance worse.

@@ -459,7 +453,6 @@
    ecode       pointer to current position in compiled code
    mstart      pointer to the current match start position (can be modified
                  by encountering \K)
-   markptr     pointer to the most recent MARK name, or NULL
    offset_top  current top pointer
    md          pointer to "static" info for the match
    eptrb       pointer to chain of blocks containing eptr at start of
@@ -474,8 +467,8 @@
 */

 static int
-match(REGISTER USPTR eptr, REGISTER const uschar *ecode, USPTR mstart,
-  const uschar *markptr, int offset_top, match_data *md, eptrblock *eptrb,
+match(REGISTER PCRE_PUCHAR eptr, REGISTER const pcre_uchar *ecode,
+  PCRE_PUCHAR mstart, int offset_top, match_data *md, eptrblock *eptrb,
   unsigned int rdepth)
 {
 /* These variables do not need to be preserved over recursion in this function,
@@ -485,7 +478,7 @@
 register int  rrc;         /* Returns from recursive calls */
 register int  i;           /* Used for loops not involving calls to RMATCH() */
 register unsigned int c;   /* Character values not kept over RMATCH() calls */
-register BOOL utf8;        /* Local copy of UTF-8 flag for speed */
+register BOOL utf;         /* Local copy of UTF flag for speed */

BOOL minimize, possessive; /* Quantifier options */
BOOL caseless;
@@ -497,7 +490,7 @@
heap whenever RMATCH() does a "recursion". See the macro definitions above. */

 #ifdef NO_RECURSE
-heapframe *frame = (heapframe *)(pcre_stack_malloc)(sizeof(heapframe));
+heapframe *frame = (heapframe *)(PUBL(stack_malloc))(sizeof(heapframe));
 if (frame == NULL) RRETURN(PCRE_ERROR_NOMEMORY);
 frame->Xprevframe = NULL;            /* Marks the top level */

@@ -506,7 +499,6 @@
 frame->Xeptr = eptr;
 frame->Xecode = ecode;
 frame->Xmstart = mstart;
-frame->Xmarkptr = markptr;
 frame->Xoffset_top = offset_top;
 frame->Xeptrb = eptrb;
 frame->Xrdepth = rdepth;
@@ -520,14 +512,13 @@
 #define eptr               frame->Xeptr
 #define ecode              frame->Xecode
 #define mstart             frame->Xmstart
-#define markptr            frame->Xmarkptr
 #define offset_top         frame->Xoffset_top
 #define eptrb              frame->Xeptrb
 #define rdepth             frame->Xrdepth

/* Ditto for the local variables */

-#ifdef SUPPORT_UTF8
+#ifdef SUPPORT_UTF
 #define charptr            frame->Xcharptr
 #endif
 #define callpat            frame->Xcallpat
@@ -585,15 +576,15 @@
 below are for variables that do not have to be preserved over a recursive call
 to RMATCH(). */

-#ifdef SUPPORT_UTF8
-const uschar *charptr;
+#ifdef SUPPORT_UTF
+const pcre_uchar *charptr;
 #endif
-const uschar *callpat;
-const uschar *data;
-const uschar *next;
-USPTR         pp;
-const uschar *prev;
-USPTR         saved_eptr;
+const pcre_uchar *callpat;
+const pcre_uchar *data;
+const pcre_uchar *next;
+PCRE_PUCHAR       pp;
+const pcre_uchar *prev;
+PCRE_PUCHAR       saved_eptr;

recursion_info new_recursive;

@@ -606,7 +597,7 @@
int prop_value;
int prop_fail_result;
int oclength;
-uschar occhars[8];
+pcre_uchar occhars[6];
#endif

 int codelink;
@@ -634,6 +625,7 @@
 #define code_offset   codelink
 #define condassert    condition
 #define matched_once  prev_is_word
+#define foc           number

/* These statements are here to stop the compiler complaining about uninitialized
variables. */
@@ -659,10 +651,10 @@
complicated macro. It has to be used in one particular way. This shouldn't,
however, impact performance when true recursion is being used. */

-#ifdef SUPPORT_UTF8
-utf8 = md->utf8;       /* Local copy of the flag */
+#ifdef SUPPORT_UTF
+utf = md->utf;       /* Local copy of the flag */
 #else
-utf8 = FALSE;
+utf = FALSE;
 #endif

 /* First check that we haven't called match() too many times, or that we
@@ -701,9 +693,12 @@
   switch(op)
     {
     case OP_MARK:
-    markptr = ecode + 2;
-    RMATCH(eptr, ecode + _pcre_OP_lengths[*ecode] + ecode[1], offset_top, md,
+    md->nomatch_mark = ecode + 2;
+    md->mark = NULL;    /* In case previously set by assertion */
+    RMATCH(eptr, ecode + PRIV(OP_lengths)[*ecode] + ecode[1], offset_top, md,
       eptrb, RM55);
+    if ((rrc == MATCH_MATCH || rrc == MATCH_ACCEPT) &&
+         md->mark == NULL) md->mark = ecode + 2;

     /* A return of MATCH_SKIP_ARG means that matching failed at SKIP with an
     argument, and we must check whether that argument matches this MARK's
@@ -712,65 +707,75 @@
     position and return MATCH_SKIP. Otherwise, pass back the return code
     unaltered. */

-    if (rrc == MATCH_SKIP_ARG &&
-        strcmp((char *)markptr, (char *)(md->start_match_ptr)) == 0)
+    else if (rrc == MATCH_SKIP_ARG &&
+        STRCMP_UC_UC(ecode + 2, md->start_match_ptr) == 0)
       {
       md->start_match_ptr = eptr;
       RRETURN(MATCH_SKIP);
       }
-
-    if (md->mark == NULL) md->mark = markptr;
     RRETURN(rrc);

     case OP_FAIL:
-    MRRETURN(MATCH_NOMATCH);
+    RRETURN(MATCH_NOMATCH);

     /* COMMIT overrides PRUNE, SKIP, and THEN */

     case OP_COMMIT:
-    RMATCH(eptr, ecode + _pcre_OP_lengths[*ecode], offset_top, md,
+    RMATCH(eptr, ecode + PRIV(OP_lengths)[*ecode], offset_top, md,
       eptrb, RM52);
     if (rrc != MATCH_NOMATCH && rrc != MATCH_PRUNE &&
         rrc != MATCH_SKIP && rrc != MATCH_SKIP_ARG &&
         rrc != MATCH_THEN)
       RRETURN(rrc);
-    MRRETURN(MATCH_COMMIT);
+    RRETURN(MATCH_COMMIT);

     /* PRUNE overrides THEN */

     case OP_PRUNE:
-    RMATCH(eptr, ecode + _pcre_OP_lengths[*ecode], offset_top, md,
+    RMATCH(eptr, ecode + PRIV(OP_lengths)[*ecode], offset_top, md,
       eptrb, RM51);
     if (rrc != MATCH_NOMATCH && rrc != MATCH_THEN) RRETURN(rrc);
-    MRRETURN(MATCH_PRUNE);
+    RRETURN(MATCH_PRUNE);

     case OP_PRUNE_ARG:
-    RMATCH(eptr, ecode + _pcre_OP_lengths[*ecode] + ecode[1], offset_top, md,
+    md->nomatch_mark = ecode + 2;
+    md->mark = NULL;    /* In case previously set by assertion */
+    RMATCH(eptr, ecode + PRIV(OP_lengths)[*ecode] + ecode[1], offset_top, md,
       eptrb, RM56);
+    if ((rrc == MATCH_MATCH || rrc == MATCH_ACCEPT) &&
+         md->mark == NULL) md->mark = ecode + 2;
     if (rrc != MATCH_NOMATCH && rrc != MATCH_THEN) RRETURN(rrc);
-    md->mark = ecode + 2;
     RRETURN(MATCH_PRUNE);

     /* SKIP overrides PRUNE and THEN */

     case OP_SKIP:
-    RMATCH(eptr, ecode + _pcre_OP_lengths[*ecode], offset_top, md,
+    RMATCH(eptr, ecode + PRIV(OP_lengths)[*ecode], offset_top, md,
       eptrb, RM53);
     if (rrc != MATCH_NOMATCH && rrc != MATCH_PRUNE && rrc != MATCH_THEN)
       RRETURN(rrc);
     md->start_match_ptr = eptr;   /* Pass back current position */
-    MRRETURN(MATCH_SKIP);
+    RRETURN(MATCH_SKIP);

+    /* Note that, for Perl compatibility, SKIP with an argument does NOT set
+    nomatch_mark. There is a flag that disables this opcode when re-matching a
+    pattern that ended with a SKIP for which there was not a matching MARK. */
+
     case OP_SKIP_ARG:
-    RMATCH(eptr, ecode + _pcre_OP_lengths[*ecode] + ecode[1], offset_top, md,
+    if (md->ignore_skip_arg)
+      {
+      ecode += PRIV(OP_lengths)[*ecode] + ecode[1];
+      break;
+      }
+    RMATCH(eptr, ecode + PRIV(OP_lengths)[*ecode] + ecode[1], offset_top, md,
       eptrb, RM57);
     if (rrc != MATCH_NOMATCH && rrc != MATCH_PRUNE && rrc != MATCH_THEN)
       RRETURN(rrc);

     /* Pass back the current skip name by overloading md->start_match_ptr and
     returning the special MATCH_SKIP_ARG return code. This will either be
-    caught by a matching MARK, or get to the top, where it is treated the same
-    as PRUNE. */
+    caught by a matching MARK, or get to the top, where it causes a rematch
+    with the md->ignore_skip_arg flag set. */

     md->start_match_ptr = ecode + 2;
     RRETURN(MATCH_SKIP_ARG);
@@ -780,18 +785,21 @@
     match pointer to do this. */

     case OP_THEN:
-    RMATCH(eptr, ecode + _pcre_OP_lengths[*ecode], offset_top, md,
+    RMATCH(eptr, ecode + PRIV(OP_lengths)[*ecode], offset_top, md,
       eptrb, RM54);
     if (rrc != MATCH_NOMATCH) RRETURN(rrc);
     md->start_match_ptr = ecode;
-    MRRETURN(MATCH_THEN);
+    RRETURN(MATCH_THEN);

     case OP_THEN_ARG:
-    RMATCH(eptr, ecode + _pcre_OP_lengths[*ecode] + ecode[1], offset_top,
+    md->nomatch_mark = ecode + 2;
+    md->mark = NULL;    /* In case previously set by assertion */
+    RMATCH(eptr, ecode + PRIV(OP_lengths)[*ecode] + ecode[1], offset_top,
       md, eptrb, RM58);
+    if ((rrc == MATCH_MATCH || rrc == MATCH_ACCEPT) &&
+         md->mark == NULL) md->mark = ecode + 2;
     if (rrc != MATCH_NOMATCH) RRETURN(rrc);
     md->start_match_ptr = ecode;
-    md->mark = ecode + 2;
     RRETURN(MATCH_THEN);

     /* Handle an atomic group that does not contain any capturing parentheses.
@@ -816,7 +824,6 @@
       if (rrc == MATCH_MATCH)  /* Note: _not_ MATCH_ACCEPT */
         {
         mstart = md->start_match_ptr;
-        markptr = md->mark;
         break;
         }
       if (rrc == MATCH_THEN)
@@ -916,7 +923,7 @@
       for (;;)
         {
         if (op >= OP_SBRA) md->match_function_type = MATCH_CBEGROUP;
-        RMATCH(eptr, ecode + _pcre_OP_lengths[*ecode], offset_top, md,
+        RMATCH(eptr, ecode + PRIV(OP_lengths)[*ecode], offset_top, md,
           eptrb, RM1);
         if (rrc == MATCH_ONCE) break;  /* Backing up through an atomic group */

@@ -954,7 +961,6 @@

       /* At this point, rrc will be one of MATCH_ONCE or MATCH_NOMATCH. */

-      if (md->mark == NULL) md->mark = markptr;
       RRETURN(rrc);
       }

@@ -1004,13 +1010,13 @@

       else if (!md->hasthen && ecode[GET(ecode, 1)] != OP_ALT)
         {
-        ecode += _pcre_OP_lengths[*ecode];
+        ecode += PRIV(OP_lengths)[*ecode];
         goto TAIL_RECURSE;
         }

       /* In all other cases, we have to make another call to match(). */

-      RMATCH(eptr, ecode + _pcre_OP_lengths[*ecode], offset_top, md, eptrb,
+      RMATCH(eptr, ecode + PRIV(OP_lengths)[*ecode], offset_top, md, eptrb,
         RM2);

       /* See comment in the code for capturing groups above about handling
@@ -1028,7 +1034,7 @@
         {
         if (rrc == MATCH_ONCE)
           {
-          const uschar *scode = ecode;
+          const pcre_uchar *scode = ecode;
           if (*scode != OP_ONCE)           /* If not at start, find it */
             {
             while (*scode == OP_ALT) scode += GET(scode, 1);
@@ -1042,7 +1048,6 @@
       if (*ecode != OP_ALT) break;
       }

-    if (md->mark == NULL) md->mark = markptr;
     RRETURN(MATCH_NOMATCH);

     /* Handle possessive capturing brackets with an unlimited repeat. We come
@@ -1071,7 +1076,7 @@
     if (offset < md->offset_max)
       {
       matched_once = FALSE;
-      code_offset = ecode - md->start_code;
+      code_offset = (int)(ecode - md->start_code);

       save_offset1 = md->offset_vector[offset];
       save_offset2 = md->offset_vector[offset+1];
@@ -1094,7 +1099,7 @@
         md->offset_vector[md->offset_end - number] =
           (int)(eptr - md->start_subject);
         if (op >= OP_SBRA) md->match_function_type = MATCH_CBEGROUP;
-        RMATCH(eptr, ecode + _pcre_OP_lengths[*ecode], offset_top, md,
+        RMATCH(eptr, ecode + PRIV(OP_lengths)[*ecode], offset_top, md,
           eptrb, RM63);
         if (rrc == MATCH_KETRPOS)
           {
@@ -1130,7 +1135,6 @@
         md->offset_vector[md->offset_end - number] = save_offset3;
         }

-      if (md->mark == NULL) md->mark = markptr;
       if (allow_zero || matched_once)
         {
         ecode += 1 + LINK_SIZE;
@@ -1162,12 +1166,12 @@

     POSSESSIVE_NON_CAPTURE:
     matched_once = FALSE;
-    code_offset = ecode - md->start_code;
+    code_offset = (int)(ecode - md->start_code);

     for (;;)
       {
       if (op >= OP_SBRA) md->match_function_type = MATCH_CBEGROUP;
-      RMATCH(eptr, ecode + _pcre_OP_lengths[*ecode], offset_top, md,
+      RMATCH(eptr, ecode + PRIV(OP_lengths)[*ecode], offset_top, md,
         eptrb, RM48);
       if (rrc == MATCH_KETRPOS)
         {
@@ -1217,7 +1221,7 @@

     if (ecode[LINK_SIZE+1] == OP_CALLOUT)
       {
-      if (pcre_callout != NULL)
+      if (PUBL(callout) != NULL)
         {
         pcre_callout_block cb;
         cb.version          = 2;   /* Version 1 of the callout block */
@@ -1232,11 +1236,11 @@
         cb.capture_top      = offset_top/2;
         cb.capture_last     = md->capture_last;
         cb.callout_data     = md->callout_data;
-        cb.mark             = markptr;
-        if ((rrc = (*pcre_callout)(&cb)) > 0) MRRETURN(MATCH_NOMATCH);
+        cb.mark             = md->nomatch_mark;
+        if ((rrc = (*PUBL(callout))(&cb)) > 0) RRETURN(MATCH_NOMATCH);
         if (rrc < 0) RRETURN(rrc);
         }
-      ecode += _pcre_OP_lengths[OP_CALLOUT];
+      ecode += PRIV(OP_lengths)[OP_CALLOUT];
       }

     condcode = ecode[LINK_SIZE+1];
@@ -1262,7 +1266,7 @@

         if (!condition && condcode == OP_NRREF)
           {
-          uschar *slotA = md->name_table;
+          pcre_uchar *slotA = md->name_table;
           for (i = 0; i < md->name_count; i++)
             {
             if (GET2(slotA, 0) == recno) break;
@@ -1275,11 +1279,11 @@

           if (i < md->name_count)
             {
-            uschar *slotB = slotA;
+            pcre_uchar *slotB = slotA;
             while (slotB > md->name_table)
               {
               slotB -= md->name_entry_size;
-              if (strcmp((char *)slotA + 2, (char *)slotB + 2) == 0)
+              if (STRCMP_UC_UC(slotA + IMM2_SIZE, slotB + IMM2_SIZE) == 0)
                 {
                 condition = GET2(slotB, 0) == md->recursive->group_num;
                 if (condition) break;
@@ -1295,7 +1299,7 @@
               for (i++; i < md->name_count; i++)
                 {
                 slotB += md->name_entry_size;
-                if (strcmp((char *)slotA + 2, (char *)slotB + 2) == 0)
+                if (STRCMP_UC_UC(slotA + IMM2_SIZE, slotB + IMM2_SIZE) == 0)
                   {
                   condition = GET2(slotB, 0) == md->recursive->group_num;
                   if (condition) break;
@@ -1308,7 +1312,7 @@

         /* Choose branch according to the condition */

-        ecode += condition? 3 : GET(ecode, 1);
+        ecode += condition? 1 + IMM2_SIZE : GET(ecode, 1);
         }
       }

@@ -1325,7 +1329,7 @@
       if (!condition && condcode == OP_NCREF)
         {
         int refno = offset >> 1;
-        uschar *slotA = md->name_table;
+        pcre_uchar *slotA = md->name_table;

         for (i = 0; i < md->name_count; i++)
           {
@@ -1339,11 +1343,11 @@

         if (i < md->name_count)
           {
-          uschar *slotB = slotA;
+          pcre_uchar *slotB = slotA;
           while (slotB > md->name_table)
             {
             slotB -= md->name_entry_size;
-            if (strcmp((char *)slotA + 2, (char *)slotB + 2) == 0)
+            if (STRCMP_UC_UC(slotA + IMM2_SIZE, slotB + IMM2_SIZE) == 0)
               {
               offset = GET2(slotB, 0) << 1;
               condition = offset < offset_top &&
@@ -1361,7 +1365,7 @@
             for (i++; i < md->name_count; i++)
               {
               slotB += md->name_entry_size;
-              if (strcmp((char *)slotA + 2, (char *)slotB + 2) == 0)
+              if (STRCMP_UC_UC(slotA + IMM2_SIZE, slotB + IMM2_SIZE) == 0)
                 {
                 offset = GET2(slotB, 0) << 1;
                 condition = offset < offset_top &&
@@ -1376,7 +1380,7 @@

       /* Choose branch according to the condition */

-      ecode += condition? 3 : GET(ecode, 1);
+      ecode += condition? 1 + IMM2_SIZE : GET(ecode, 1);
       }

     else if (condcode == OP_DEF)     /* DEFINE - always false */
@@ -1468,7 +1472,7 @@
       md->offset_vector[offset+1] = (int)(eptr - md->start_subject);
       if (offset_top <= offset) offset_top = offset + 2;
       }
-    ecode += 3;
+    ecode += 1 + IMM2_SIZE;
     break;

@@ -1488,7 +1492,7 @@
          (md->notempty ||
            (md->notempty_atstart &&
              mstart == md->start_subject + md->start_offset)))
-      MRRETURN(MATCH_NOMATCH);
+      RRETURN(MATCH_NOMATCH);

     /* Otherwise, we have a match. */

@@ -1497,10 +1501,10 @@
     md->start_match_ptr = mstart;       /* and the start (\K can modify) */

     /* For some reason, the macros don't work properly if an expression is
-    given as the argument to MRRETURN when the heap is in use. */
+    given as the argument to RRETURN when the heap is in use. */

     rrc = (op == OP_END)? MATCH_MATCH : MATCH_ACCEPT;
-    MRRETURN(rrc);
+    RRETURN(rrc);

     /* Assertion brackets. Check the alternative branches in turn - the
     matching won't pass the KET for an assertion. If any one branch matches,
@@ -1528,7 +1532,6 @@
       if (rrc == MATCH_MATCH || rrc == MATCH_ACCEPT)
         {
         mstart = md->start_match_ptr;   /* In case \K reset it */
-        markptr = md->mark;
         break;
         }

@@ -1540,7 +1543,7 @@
       }
     while (*ecode == OP_ALT);

-    if (*ecode == OP_KET) MRRETURN(MATCH_NOMATCH);
+    if (*ecode == OP_KET) RRETURN(MATCH_NOMATCH);

     /* If checking an assertion for a condition, return MATCH_MATCH. */

@@ -1570,7 +1573,7 @@
     do
       {
       RMATCH(eptr, ecode + 1 + LINK_SIZE, offset_top, md, NULL, RM5);
-      if (rrc == MATCH_MATCH || rrc == MATCH_ACCEPT) MRRETURN(MATCH_NOMATCH);
+      if (rrc == MATCH_MATCH || rrc == MATCH_ACCEPT) RRETURN(MATCH_NOMATCH);
       if (rrc == MATCH_SKIP || rrc == MATCH_PRUNE || rrc == MATCH_COMMIT)
         {
         do ecode += GET(ecode,1); while (*ecode == OP_ALT);
@@ -1596,14 +1599,14 @@
     back a number of characters, not bytes. */

     case OP_REVERSE:
-#ifdef SUPPORT_UTF8
-    if (utf8)
+#ifdef SUPPORT_UTF
+    if (utf)
       {
       i = GET(ecode, 1);
       while (i-- > 0)
         {
         eptr--;
-        if (eptr < md->start_subject) MRRETURN(MATCH_NOMATCH);
+        if (eptr < md->start_subject) RRETURN(MATCH_NOMATCH);
         BACKCHAR(eptr);
         }
       }
@@ -1614,7 +1617,7 @@

       {
       eptr -= GET(ecode, 1);
-      if (eptr < md->start_subject) MRRETURN(MATCH_NOMATCH);
+      if (eptr < md->start_subject) RRETURN(MATCH_NOMATCH);
       }

     /* Save the earliest consulted character, then skip to next op code */
@@ -1628,7 +1631,7 @@
     function is able to force a failure. */

     case OP_CALLOUT:
-    if (pcre_callout != NULL)
+    if (PUBL(callout) != NULL)
       {
       pcre_callout_block cb;
       cb.version          = 2;   /* Version 1 of the callout block */
@@ -1643,8 +1646,8 @@
       cb.capture_top      = offset_top/2;
       cb.capture_last     = md->capture_last;
       cb.callout_data     = md->callout_data;
-      cb.mark             = markptr;
-      if ((rrc = (*pcre_callout)(&cb)) > 0) MRRETURN(MATCH_NOMATCH);
+      cb.mark             = md->nomatch_mark;
+      if ((rrc = (*PUBL(callout))(&cb)) > 0) RRETURN(MATCH_NOMATCH);
       if (rrc < 0) RRETURN(rrc);
       }
     ecode += 2 + 2*LINK_SIZE;
@@ -1703,7 +1706,7 @@
       else
         {
         new_recursive.offset_save =
-          (int *)(pcre_malloc)(new_recursive.saved_max * sizeof(int));
+          (int *)(PUBL(malloc))(new_recursive.saved_max * sizeof(int));
         if (new_recursive.offset_save == NULL) RRETURN(PCRE_ERROR_NOMEMORY);
         }
       memcpy(new_recursive.offset_save, md->offset_vector,
@@ -1718,7 +1721,7 @@
       do
         {
         if (cbegroup) md->match_function_type = MATCH_CBEGROUP;
-        RMATCH(eptr, callpat + _pcre_OP_lengths[*callpat], offset_top,
+        RMATCH(eptr, callpat + PRIV(OP_lengths)[*callpat], offset_top,
           md, eptrb, RM6);
         memcpy(md->offset_vector, new_recursive.offset_save,
             new_recursive.saved_max * sizeof(int));
@@ -1727,7 +1730,7 @@
           {
           DPRINTF(("Recursion matched\n"));
           if (new_recursive.offset_save != stacksave)
-            (pcre_free)(new_recursive.offset_save);
+            (PUBL(free))(new_recursive.offset_save);

           /* Set where we got to in the subject, and reset the start in case
           it was changed by \K. This *is* propagated back out of a recursion,
@@ -1745,7 +1748,7 @@
           {
           DPRINTF(("Recursion gave error %d\n", rrc));
           if (new_recursive.offset_save != stacksave)
-            (pcre_free)(new_recursive.offset_save);
+            (PUBL(free))(new_recursive.offset_save);
           RRETURN(rrc);
           }

@@ -1757,8 +1760,8 @@
       DPRINTF(("Recursion didn't match\n"));
       md->recursive = new_recursive.prevrec;
       if (new_recursive.offset_save != stacksave)
-        (pcre_free)(new_recursive.offset_save);
-      MRRETURN(MATCH_NOMATCH);
+        (PUBL(free))(new_recursive.offset_save);
+      RRETURN(MATCH_NOMATCH);
       }

     RECURSION_MATCHED:
@@ -1838,7 +1841,7 @@
       md->end_match_ptr = eptr;      /* For ONCE_NC */
       md->end_offset_top = offset_top;
       md->start_match_ptr = mstart;
-      MRRETURN(MATCH_MATCH);         /* Sets md->mark */
+      RRETURN(MATCH_MATCH);         /* Sets md->mark */
       }

     /* For capturing groups we have to check the group number back at the start
@@ -1980,29 +1983,29 @@
     /* Not multiline mode: start of subject assertion, unless notbol. */

     case OP_CIRC:
-    if (md->notbol && eptr == md->start_subject) MRRETURN(MATCH_NOMATCH);
+    if (md->notbol && eptr == md->start_subject) RRETURN(MATCH_NOMATCH);

     /* Start of subject assertion */

     case OP_SOD:
-    if (eptr != md->start_subject) MRRETURN(MATCH_NOMATCH);
+    if (eptr != md->start_subject) RRETURN(MATCH_NOMATCH);
     ecode++;
     break;

     /* Multiline mode: start of subject unless notbol, or after any newline. */

     case OP_CIRCM:
-    if (md->notbol && eptr == md->start_subject) MRRETURN(MATCH_NOMATCH);
+    if (md->notbol && eptr == md->start_subject) RRETURN(MATCH_NOMATCH);
     if (eptr != md->start_subject &&
         (eptr == md->end_subject || !WAS_NEWLINE(eptr)))
-      MRRETURN(MATCH_NOMATCH);
+      RRETURN(MATCH_NOMATCH);
     ecode++;
     break;

     /* Start of match assertion */

     case OP_SOM:
-    if (eptr != md->start_subject + md->start_offset) MRRETURN(MATCH_NOMATCH);
+    if (eptr != md->start_subject + md->start_offset) RRETURN(MATCH_NOMATCH);
     ecode++;
     break;

@@ -2018,10 +2021,10 @@

     case OP_DOLLM:
     if (eptr < md->end_subject)
-      { if (!IS_NEWLINE(eptr)) MRRETURN(MATCH_NOMATCH); }
+      { if (!IS_NEWLINE(eptr)) RRETURN(MATCH_NOMATCH); }
     else
       {
-      if (md->noteol) MRRETURN(MATCH_NOMATCH);
+      if (md->noteol) RRETURN(MATCH_NOMATCH);
       SCHECK_PARTIAL();
       }
     ecode++;
@@ -2031,7 +2034,7 @@
     subject unless noteol is set. */

     case OP_DOLL:
-    if (md->noteol) MRRETURN(MATCH_NOMATCH);
+    if (md->noteol) RRETURN(MATCH_NOMATCH);
     if (!md->endonly) goto ASSERT_NL_OR_EOS;

     /* ... else fall through for endonly */
@@ -2039,7 +2042,7 @@
     /* End of subject assertion (\z) */

     case OP_EOD:
-    if (eptr < md->end_subject) MRRETURN(MATCH_NOMATCH);
+    if (eptr < md->end_subject) RRETURN(MATCH_NOMATCH);
     SCHECK_PARTIAL();
     ecode++;
     break;
@@ -2050,7 +2053,7 @@
     ASSERT_NL_OR_EOS:
     if (eptr < md->end_subject &&
         (!IS_NEWLINE(eptr) || eptr != md->end_subject - md->nllen))
-      MRRETURN(MATCH_NOMATCH);
+      RRETURN(MATCH_NOMATCH);

     /* Either at end of string or \n before end. */

@@ -2069,15 +2072,15 @@
       be "non-word" characters. Remember the earliest consulted character for
       partial matching. */

-#ifdef SUPPORT_UTF8
-      if (utf8)
+#ifdef SUPPORT_UTF
+      if (utf)
         {
         /* Get status of previous character */

         if (eptr == md->start_subject) prev_is_word = FALSE; else
           {
-          USPTR lastptr = eptr - 1;
-          while((*lastptr & 0xc0) == 0x80) lastptr--;
+          PCRE_PUCHAR lastptr = eptr - 1;
+          BACKCHAR(lastptr);
           if (lastptr < md->start_used_ptr) md->start_used_ptr = lastptr;
           GETCHAR(c, lastptr);
 #ifdef SUPPORT_UCP
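
Note on the hunk above: the hand-written loop that stepped back over UTF-8 continuation bytes is replaced by a macro call, which lets the same source line serve other code unit widths. The sketch below illustrates what stepping back to the start of a character can look like for 8-bit and 16-bit units. The function names are hypothetical and the bodies are an assumption about the general technique, not the macro's actual definition; both assume the pointer is inside a valid string and not before its start.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* UTF-8: continuation bytes have the bit pattern 10xxxxxx, so keep
   stepping back until the byte under the pointer is not one of them. */
static const uint8_t *back_to_char_start_utf8(const uint8_t *p)
{
  assert(p != NULL);
  while ((*p & 0xc0) == 0x80) p--;
  return p;
}

/* UTF-16: a character is at most two units long. If the unit under the
   pointer is a low surrogate (0xDC00..0xDFFF), the character starts one
   unit earlier. */
static const uint16_t *back_to_char_start_utf16(const uint16_t *p)
{
  assert(p != NULL);
  if ((*p & 0xfc00) == 0xdc00) p--;
  return p;
}
```

For the bytes 'a', 0xE2, 0x82, 0xAC (the letter a followed by a three-byte character), calling the UTF-8 helper on the last byte returns a pointer to the 0xE2 byte.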
@@ -2142,7 +2145,8 @@
             }
           else
 #endif
-          prev_is_word = ((md->ctypes[eptr[-1]] & ctype_word) != 0);
+          prev_is_word = MAX_255(eptr[-1])
+            && ((md->ctypes[eptr[-1]] & ctype_word) != 0);
           }

         /* Get status of next character */
@@ -2165,31 +2169,34 @@
           }
         else
 #endif
-        cur_is_word = ((md->ctypes[*eptr] & ctype_word) != 0);
+        cur_is_word = MAX_255(*eptr)
+          && ((md->ctypes[*eptr] & ctype_word) != 0);
         }

       /* Now see if the situation is what we want */

       if ((*ecode++ == OP_WORD_BOUNDARY)?
            cur_is_word == prev_is_word : cur_is_word != prev_is_word)
-        MRRETURN(MATCH_NOMATCH);
+        RRETURN(MATCH_NOMATCH);
       }
     break;

     /* Match a single character type; inline for speed */

     case OP_ANY:
-    if (IS_NEWLINE(eptr)) MRRETURN(MATCH_NOMATCH);
+    if (IS_NEWLINE(eptr)) RRETURN(MATCH_NOMATCH);
     /* Fall through */

     case OP_ALLANY:
     if (eptr >= md->end_subject)   /* DO NOT merge the eptr++ here; it must */
       {                            /* not be updated before SCHECK_PARTIAL. */
       SCHECK_PARTIAL();
-      MRRETURN(MATCH_NOMATCH);
+      RRETURN(MATCH_NOMATCH);
       }
     eptr++;
-    if (utf8) while (eptr < md->end_subject && (*eptr & 0xc0) == 0x80) eptr++;
+#ifdef SUPPORT_UTF
+    if (utf) ACROSSCHAR(eptr < md->end_subject, *eptr, eptr++);
+#endif
     ecode++;
     break;
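
Note on the OP_ALLANY change in the hunk above: after advancing by one code unit, the matcher still has to skip the remaining units of a multi-unit character, and it must not run past the end of the subject. The sketch below shows that bounded forward skip for 8-bit and 16-bit units. The function names are hypothetical and the bodies are an assumption about the general technique rather than the macro used in the commit.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* UTF-8: skip any continuation bytes (10xxxxxx) that follow the first
   byte of a character, stopping at the end of the subject. */
static const uint8_t *skip_trailing_utf8(const uint8_t *p, const uint8_t *end)
{
  assert(p != NULL && end != NULL);
  while (p < end && (*p & 0xc0) == 0x80) p++;
  return p;
}

/* UTF-16: skip a single low surrogate (0xDC00..0xDFFF) if one follows,
   again stopping at the end of the subject. */
static const uint16_t *skip_trailing_utf16(const uint16_t *p,
                                           const uint16_t *end)
{
  assert(p != NULL && end != NULL);
  if (p < end && (*p & 0xfc00) == 0xdc00) p++;
  return p;
}
```

The end-of-subject test comes first in each condition so that the pointer is never dereferenced at or beyond the end, which matters when the subject is truncated in the middle of a character.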

@@ -2200,7 +2207,7 @@
     if (eptr >= md->end_subject)   /* DO NOT merge the eptr++ here; it must */
       {                            /* not be updated before SCHECK_PARTIAL. */
       SCHECK_PARTIAL();
-      MRRETURN(MATCH_NOMATCH);
+      RRETURN(MATCH_NOMATCH);
       }
     eptr++;
     ecode++;
@@ -2210,16 +2217,16 @@
     if (eptr >= md->end_subject)
       {
       SCHECK_PARTIAL();
-      MRRETURN(MATCH_NOMATCH);
+      RRETURN(MATCH_NOMATCH);
       }
     GETCHARINCTEST(c, eptr);
     if (
-#ifdef SUPPORT_UTF8
+#if defined SUPPORT_UTF || !(defined COMPILE_PCRE8)
        c < 256 &&
 #endif
        (md->ctypes[c] & ctype_digit) != 0
        )
-      MRRETURN(MATCH_NOMATCH);
+      RRETURN(MATCH_NOMATCH);
     ecode++;
     break;

@@ -2227,16 +2234,16 @@
     if (eptr >= md->end_subject)
       {
       SCHECK_PARTIAL();
-      MRRETURN(MATCH_NOMATCH);
+      RRETURN(MATCH_NOMATCH);
       }
     GETCHARINCTEST(c, eptr);
     if (
-#ifdef SUPPORT_UTF8
-       c >= 256 ||
+#if defined SUPPORT_UTF || !(defined COMPILE_PCRE8)
+       c > 255 ||
 #endif
        (md->ctypes[c] & ctype_digit) == 0
        )
-      MRRETURN(MATCH_NOMATCH);
+      RRETURN(MATCH_NOMATCH);
     ecode++;
     break;

@@ -2244,16 +2251,16 @@
     if (eptr >= md->end_subject)
       {
       SCHECK_PARTIAL();
-      MRRETURN(MATCH_NOMATCH);
+      RRETURN(MATCH_NOMATCH);
       }
     GETCHARINCTEST(c, eptr);
     if (
-#ifdef SUPPORT_UTF8
+#if defined SUPPORT_UTF || !(defined COMPILE_PCRE8)
        c < 256 &&
 #endif
        (md->ctypes[c] & ctype_space) != 0
        )
-      MRRETURN(MATCH_NOMATCH);
+      RRETURN(MATCH_NOMATCH);
     ecode++;
     break;

@@ -2261,16 +2268,16 @@
     if (eptr >= md->end_subject)
       {
       SCHECK_PARTIAL();
-      MRRETURN(MATCH_NOMATCH);
+      RRETURN(MATCH_NOMATCH);
       }
     GETCHARINCTEST(c, eptr);
     if (
-#ifdef SUPPORT_UTF8
-       c >= 256 ||
+#if defined SUPPORT_UTF || !(defined COMPILE_PCRE8)
+       c > 255 ||
 #endif
        (md->ctypes[c] & ctype_space) == 0
        )
-      MRRETURN(MATCH_NOMATCH);
+      RRETURN(MATCH_NOMATCH);
     ecode++;
     break;

@@ -2278,16 +2285,16 @@
     if (eptr >= md->end_subject)
       {
       SCHECK_PARTIAL();
-      MRRETURN(MATCH_NOMATCH);
+      RRETURN(MATCH_NOMATCH);
       }
     GETCHARINCTEST(c, eptr);
     if (
-#ifdef SUPPORT_UTF8
+#if defined SUPPORT_UTF || !(defined COMPILE_PCRE8)
        c < 256 &&
 #endif
        (md->ctypes[c] & ctype_word) != 0
        )
-      MRRETURN(MATCH_NOMATCH);
+      RRETURN(MATCH_NOMATCH);
     ecode++;
     break;

@@ -2295,16 +2302,16 @@
     if (eptr >= md->end_subject)
       {
       SCHECK_PARTIAL();
-      MRRETURN(MATCH_NOMATCH);
+      RRETURN(MATCH_NOMATCH);
       }
     GETCHARINCTEST(c, eptr);
     if (
-#ifdef SUPPORT_UTF8
-       c >= 256 ||
+#if defined SUPPORT_UTF || !(defined COMPILE_PCRE8)
+       c > 255 ||
 #endif
        (md->ctypes[c] & ctype_word) == 0
        )
-      MRRETURN(MATCH_NOMATCH);
+      RRETURN(MATCH_NOMATCH);
     ecode++;
     break;

@@ -2312,12 +2319,12 @@
     if (eptr >= md->end_subject)
       {
       SCHECK_PARTIAL();
-      MRRETURN(MATCH_NOMATCH);
+      RRETURN(MATCH_NOMATCH);
       }
     GETCHARINCTEST(c, eptr);
     switch(c)
       {
-      default: MRRETURN(MATCH_NOMATCH);
+      default: RRETURN(MATCH_NOMATCH);

       case 0x000d:
       if (eptr < md->end_subject && *eptr == 0x0a) eptr++;
@@ -2331,7 +2338,7 @@
       case 0x0085:
       case 0x2028:
       case 0x2029:
-      if (md->bsr_anycrlf) MRRETURN(MATCH_NOMATCH);
+      if (md->bsr_anycrlf) RRETURN(MATCH_NOMATCH);
       break;
       }
     ecode++;
@@ -2341,7 +2348,7 @@
     if (eptr >= md->end_subject)
       {
       SCHECK_PARTIAL();
-      MRRETURN(MATCH_NOMATCH);
+      RRETURN(MATCH_NOMATCH);
       }
     GETCHARINCTEST(c, eptr);
     switch(c)
@@ -2366,7 +2373,7 @@
       case 0x202f:    /* NARROW NO-BREAK SPACE */
       case 0x205f:    /* MEDIUM MATHEMATICAL SPACE */
       case 0x3000:    /* IDEOGRAPHIC SPACE */
-      MRRETURN(MATCH_NOMATCH);
+      RRETURN(MATCH_NOMATCH);
       }
     ecode++;
     break;
@@ -2375,12 +2382,12 @@
     if (eptr >= md->end_subject)
       {
       SCHECK_PARTIAL();
-      MRRETURN(MATCH_NOMATCH);
+      RRETURN(MATCH_NOMATCH);
       }
     GETCHARINCTEST(c, eptr);
     switch(c)
       {
-      default: MRRETURN(MATCH_NOMATCH);
+      default: RRETURN(MATCH_NOMATCH);
       case 0x09:      /* HT */
       case 0x20:      /* SPACE */
       case 0xa0:      /* NBSP */
@@ -2409,7 +2416,7 @@
     if (eptr >= md->end_subject)
       {
       SCHECK_PARTIAL();
-      MRRETURN(MATCH_NOMATCH);
+      RRETURN(MATCH_NOMATCH);
       }
     GETCHARINCTEST(c, eptr);
     switch(c)
@@ -2422,7 +2429,7 @@
       case 0x85:      /* NEL */
       case 0x2028:    /* LINE SEPARATOR */
       case 0x2029:    /* PARAGRAPH SEPARATOR */
-      MRRETURN(MATCH_NOMATCH);
+      RRETURN(MATCH_NOMATCH);
       }
     ecode++;
     break;
@@ -2431,12 +2438,12 @@
     if (eptr >= md->end_subject)
       {
       SCHECK_PARTIAL();
-      MRRETURN(MATCH_NOMATCH);
+      RRETURN(MATCH_NOMATCH);
       }
     GETCHARINCTEST(c, eptr);
     switch(c)
       {
-      default: MRRETURN(MATCH_NOMATCH);
+      default: RRETURN(MATCH_NOMATCH);
       case 0x0a:      /* LF */
       case 0x0b:      /* VT */
       case 0x0c:      /* FF */
@@ -2458,7 +2465,7 @@
     if (eptr >= md->end_subject)
       {
       SCHECK_PARTIAL();
-      MRRETURN(MATCH_NOMATCH);
+      RRETURN(MATCH_NOMATCH);
       }
     GETCHARINCTEST(c, eptr);
       {
@@ -2467,59 +2474,59 @@
       switch(ecode[1])
         {
         case PT_ANY:
-        if (op == OP_NOTPROP) MRRETURN(MATCH_NOMATCH);
+        if (op == OP_NOTPROP) RRETURN(MATCH_NOMATCH);
         break;

         case PT_LAMP:
         if ((prop->chartype == ucp_Lu ||
              prop->chartype == ucp_Ll ||
              prop->chartype == ucp_Lt) == (op == OP_NOTPROP))
-          MRRETURN(MATCH_NOMATCH);
+          RRETURN(MATCH_NOMATCH);
         break;

         case PT_GC:
-        if ((ecode[2] != _pcre_ucp_gentype[prop->chartype]) == (op == OP_PROP))
-          MRRETURN(MATCH_NOMATCH);
+        if ((ecode[2] != PRIV(ucp_gentype)[prop->chartype]) == (op == OP_PROP))
+          RRETURN(MATCH_NOMATCH);
         break;

         case PT_PC:
         if ((ecode[2] != prop->chartype) == (op == OP_PROP))
-          MRRETURN(MATCH_NOMATCH);
+          RRETURN(MATCH_NOMATCH);
         break;

         case PT_SC:
         if ((ecode[2] != prop->script) == (op == OP_PROP))
-          MRRETURN(MATCH_NOMATCH);
+          RRETURN(MATCH_NOMATCH);
         break;

         /* These are specials */

         case PT_ALNUM:
-        if ((_pcre_ucp_gentype[prop->chartype] == ucp_L ||
-             _pcre_ucp_gentype[prop->chartype] == ucp_N) == (op == OP_NOTPROP))
-          MRRETURN(MATCH_NOMATCH);
+        if ((PRIV(ucp_gentype)[prop->chartype] == ucp_L ||
+             PRIV(ucp_gentype)[prop->chartype] == ucp_N) == (op == OP_NOTPROP))
+          RRETURN(MATCH_NOMATCH);
         break;

         case PT_SPACE:    /* Perl space */
-        if ((_pcre_ucp_gentype[prop->chartype] == ucp_Z ||
+        if ((PRIV(ucp_gentype)[prop->chartype] == ucp_Z ||
              c == CHAR_HT || c == CHAR_NL || c == CHAR_FF || c == CHAR_CR)
                == (op == OP_NOTPROP))
-          MRRETURN(MATCH_NOMATCH);
+          RRETURN(MATCH_NOMATCH);
         break;

         case PT_PXSPACE:  /* POSIX space */
-        if ((_pcre_ucp_gentype[prop->chartype] == ucp_Z ||
+        if ((PRIV(ucp_gentype)[prop->chartype] == ucp_Z ||
              c == CHAR_HT || c == CHAR_NL || c == CHAR_VT ||
              c == CHAR_FF || c == CHAR_CR)
                == (op == OP_NOTPROP))
-          MRRETURN(MATCH_NOMATCH);
+          RRETURN(MATCH_NOMATCH);
         break;

         case PT_WORD:
-        if ((_pcre_ucp_gentype[prop->chartype] == ucp_L ||
-             _pcre_ucp_gentype[prop->chartype] == ucp_N ||
+        if ((PRIV(ucp_gentype)[prop->chartype] == ucp_L ||
+             PRIV(ucp_gentype)[prop->chartype] == ucp_N ||
              c == CHAR_UNDERSCORE) == (op == OP_NOTPROP))
-          MRRETURN(MATCH_NOMATCH);
+          RRETURN(MATCH_NOMATCH);
         break;

         /* This should never occur */
@@ -2539,14 +2546,14 @@
     if (eptr >= md->end_subject)
       {
       SCHECK_PARTIAL();
-      MRRETURN(MATCH_NOMATCH);
+      RRETURN(MATCH_NOMATCH);
       }
     GETCHARINCTEST(c, eptr);
-    if (UCD_CATEGORY(c) == ucp_M) MRRETURN(MATCH_NOMATCH);
+    if (UCD_CATEGORY(c) == ucp_M) RRETURN(MATCH_NOMATCH);
     while (eptr < md->end_subject)
       {
       int len = 1;
-      if (!utf8) c = *eptr; else { GETCHARLEN(c, eptr, len); }
+      if (!utf) c = *eptr; else { GETCHARLEN(c, eptr, len); }
       if (UCD_CATEGORY(c) != ucp_M) break;
       eptr += len;
       }
@@ -2567,7 +2574,7 @@
     case OP_REFI:
     caseless = op == OP_REFI;
     offset = GET2(ecode, 1) << 1;               /* Doubled ref number */
-    ecode += 3;
+    ecode += 1 + IMM2_SIZE;

     /* If the reference is unset, there are two possibilities:

@@ -2607,25 +2614,29 @@
       case OP_CRMINRANGE:
       minimize = (*ecode == OP_CRMINRANGE);
       min = GET2(ecode, 1);
-      max = GET2(ecode, 3);
+      max = GET2(ecode, 1 + IMM2_SIZE);
       if (max == 0) max = INT_MAX;
-      ecode += 5;
+      ecode += 1 + 2 * IMM2_SIZE;
       break;

       default:               /* No repeat follows */
       if ((length = match_ref(offset, eptr, length, md, caseless)) < 0)
         {
         CHECK_PARTIAL();
-        MRRETURN(MATCH_NOMATCH);
+        RRETURN(MATCH_NOMATCH);
         }
       eptr += length;
       continue;              /* With the main loop */
       }

     /* Handle repeated back references. If the length of the reference is
-    zero, just continue with the main loop. */
+    zero, just continue with the main loop. If the length is negative, it
+    means the reference is unset in non-Java-compatible mode. If the minimum is 
+    zero, we can continue at the same level without recursion. For any other 
+    minimum, carrying on will result in NOMATCH. */

     if (length == 0) continue;
+    if (length < 0 && min == 0) continue;

     /* First, ensure the minimum number of matches are present. We get back
     the length of the reference string explicitly rather than passing the
@@ -2637,7 +2648,7 @@
       if ((slength = match_ref(offset, eptr, length, md, caseless)) < 0)
         {
         CHECK_PARTIAL();
-        MRRETURN(MATCH_NOMATCH);
+        RRETURN(MATCH_NOMATCH);
         }
       eptr += slength;
       }
@@ -2656,11 +2667,11 @@
         int slength;
         RMATCH(eptr, ecode, offset_top, md, eptrb, RM14);
         if (rrc != MATCH_NOMATCH) RRETURN(rrc);
-        if (fi >= max) MRRETURN(MATCH_NOMATCH);
+        if (fi >= max) RRETURN(MATCH_NOMATCH);
         if ((slength = match_ref(offset, eptr, length, md, caseless)) < 0)
           {
           CHECK_PARTIAL();
-          MRRETURN(MATCH_NOMATCH);
+          RRETURN(MATCH_NOMATCH);
           }
         eptr += slength;
         }
@@ -2688,7 +2699,7 @@
         if (rrc != MATCH_NOMATCH) RRETURN(rrc);
         eptr -= length;
         }
-      MRRETURN(MATCH_NOMATCH);
+      RRETURN(MATCH_NOMATCH);
       }
     /* Control never gets here */

@@ -2706,8 +2717,11 @@
     case OP_NCLASS:
     case OP_CLASS:
       {
+      /* The data variable is saved across frames, so the byte map needs to
+      be stored there. */
+#define BYTE_MAP ((pcre_uint8 *)data)
       data = ecode + 1;                /* Save for matching */
-      ecode += 33;                     /* Advance past the item */
+      ecode += 1 + (32 / sizeof(pcre_uchar)); /* Advance past the item */

       switch (*ecode)
         {
@@ -2728,9 +2742,9 @@
         case OP_CRMINRANGE:
         minimize = (*ecode == OP_CRMINRANGE);
         min = GET2(ecode, 1);
-        max = GET2(ecode, 3);
+        max = GET2(ecode, 1 + IMM2_SIZE);
         if (max == 0) max = INT_MAX;
-        ecode += 5;
+        ecode += 1 + 2 * IMM2_SIZE;
         break;

         default:               /* No repeat follows */
@@ -2740,41 +2754,45 @@

       /* First, ensure the minimum number of matches are present. */

-#ifdef SUPPORT_UTF8
-      /* UTF-8 mode */
-      if (utf8)
+#ifdef SUPPORT_UTF
+      if (utf)
         {
         for (i = 1; i <= min; i++)
           {
           if (eptr >= md->end_subject)
             {
             SCHECK_PARTIAL();
-            MRRETURN(MATCH_NOMATCH);
+            RRETURN(MATCH_NOMATCH);
             }
           GETCHARINC(c, eptr);
           if (c > 255)
             {
-            if (op == OP_CLASS) MRRETURN(MATCH_NOMATCH);
+            if (op == OP_CLASS) RRETURN(MATCH_NOMATCH);
             }
           else
-            {
-            if ((data[c/8] & (1 << (c&7))) == 0) MRRETURN(MATCH_NOMATCH);
-            }
+            if ((BYTE_MAP[c/8] & (1 << (c&7))) == 0) RRETURN(MATCH_NOMATCH);
           }
         }
       else
 #endif
-      /* Not UTF-8 mode */
+      /* Not UTF mode */
         {
         for (i = 1; i <= min; i++)
           {
           if (eptr >= md->end_subject)
             {
             SCHECK_PARTIAL();
-            MRRETURN(MATCH_NOMATCH);
+            RRETURN(MATCH_NOMATCH);
             }
           c = *eptr++;
-          if ((data[c/8] & (1 << (c&7))) == 0) MRRETURN(MATCH_NOMATCH);
+#ifndef COMPILE_PCRE8
+          if (c > 255)
+            {
+            if (op == OP_CLASS) RRETURN(MATCH_NOMATCH);
+            }
+          else
+#endif
+            if ((BYTE_MAP[c/8] & (1 << (c&7))) == 0) RRETURN(MATCH_NOMATCH);
           }
         }

@@ -2788,47 +2806,51 @@

       if (minimize)
         {
-#ifdef SUPPORT_UTF8
-        /* UTF-8 mode */
-        if (utf8)
+#ifdef SUPPORT_UTF
+        if (utf)
           {
           for (fi = min;; fi++)
             {
             RMATCH(eptr, ecode, offset_top, md, eptrb, RM16);
             if (rrc != MATCH_NOMATCH) RRETURN(rrc);
-            if (fi >= max) MRRETURN(MATCH_NOMATCH);
+            if (fi >= max) RRETURN(MATCH_NOMATCH);
             if (eptr >= md->end_subject)
               {
               SCHECK_PARTIAL();
-              MRRETURN(MATCH_NOMATCH);
+              RRETURN(MATCH_NOMATCH);
               }
             GETCHARINC(c, eptr);
             if (c > 255)
               {
-              if (op == OP_CLASS) MRRETURN(MATCH_NOMATCH);
+              if (op == OP_CLASS) RRETURN(MATCH_NOMATCH);
               }
             else
-              {
-              if ((data[c/8] & (1 << (c&7))) == 0) MRRETURN(MATCH_NOMATCH);
-              }
+              if ((BYTE_MAP[c/8] & (1 << (c&7))) == 0) RRETURN(MATCH_NOMATCH);
             }
           }
         else
 #endif
-        /* Not UTF-8 mode */
+        /* Not UTF mode */
           {
           for (fi = min;; fi++)
             {
             RMATCH(eptr, ecode, offset_top, md, eptrb, RM17);
             if (rrc != MATCH_NOMATCH) RRETURN(rrc);
-            if (fi >= max) MRRETURN(MATCH_NOMATCH);
+            if (fi >= max) RRETURN(MATCH_NOMATCH);
             if (eptr >= md->end_subject)
               {
               SCHECK_PARTIAL();
-              MRRETURN(MATCH_NOMATCH);
+              RRETURN(MATCH_NOMATCH);
               }
             c = *eptr++;
-            if ((data[c/8] & (1 << (c&7))) == 0) MRRETURN(MATCH_NOMATCH);
+#ifndef COMPILE_PCRE8
+            if (c > 255)
+              {
+              if (op == OP_CLASS) RRETURN(MATCH_NOMATCH);
+              }
+            else
+#endif
+              if ((BYTE_MAP[c/8] & (1 << (c&7))) == 0) RRETURN(MATCH_NOMATCH);
             }
           }
         /* Control never gets here */
@@ -2840,9 +2862,8 @@
         {
         pp = eptr;

-#ifdef SUPPORT_UTF8
-        /* UTF-8 mode */
-        if (utf8)
+#ifdef SUPPORT_UTF
+        if (utf)
           {
           for (i = min; i < max; i++)
             {
@@ -2858,9 +2879,7 @@
               if (op == OP_CLASS) break;
               }
             else
-              {
-              if ((data[c/8] & (1 << (c&7))) == 0) break;
-              }
+              if ((BYTE_MAP[c/8] & (1 << (c&7))) == 0) break;
             eptr += len;
             }
           for (;;)
@@ -2873,7 +2892,7 @@
           }
         else
 #endif
-          /* Not UTF-8 mode */
+          /* Not UTF mode */
           {
           for (i = min; i < max; i++)
             {
@@ -2883,7 +2902,14 @@
               break;
               }
             c = *eptr;
-            if ((data[c/8] & (1 << (c&7))) == 0) break;
+#ifndef COMPILE_PCRE8
+            if (c > 255)
+              {
+              if (op == OP_CLASS) break;
+              }
+            else
+#endif
+              if ((BYTE_MAP[c/8] & (1 << (c&7))) == 0) break;
             eptr++;
             }
           while (eptr >= pp)
@@ -2894,8 +2920,9 @@
             }
           }

-        MRRETURN(MATCH_NOMATCH);
+        RRETURN(MATCH_NOMATCH);
         }
+#undef BYTE_MAP
       }
     /* Control never gets here */

@@ -2904,7 +2931,7 @@
     when UTF-8 mode mode is supported. Nevertheless, we may not be in UTF-8
     mode, because Unicode properties are supported in non-UTF-8 mode. */

-#ifdef SUPPORT_UTF8
+#if defined SUPPORT_UTF || !defined COMPILE_PCRE8
     case OP_XCLASS:
       {
       data = ecode + 1 + LINK_SIZE;                /* Save for matching */
@@ -2929,9 +2956,9 @@
         case OP_CRMINRANGE:
         minimize = (*ecode == OP_CRMINRANGE);
         min = GET2(ecode, 1);
-        max = GET2(ecode, 3);
+        max = GET2(ecode, 1 + IMM2_SIZE);
         if (max == 0) max = INT_MAX;
-        ecode += 5;
+        ecode += 1 + 2 * IMM2_SIZE;
         break;

         default:               /* No repeat follows */
@@ -2946,10 +2973,10 @@
         if (eptr >= md->end_subject)
           {
           SCHECK_PARTIAL();
-          MRRETURN(MATCH_NOMATCH);
+          RRETURN(MATCH_NOMATCH);
           }
         GETCHARINCTEST(c, eptr);
-        if (!_pcre_xclass(c, data)) MRRETURN(MATCH_NOMATCH);
+        if (!PRIV(xclass)(c, data, utf)) RRETURN(MATCH_NOMATCH);
         }

       /* If max == min we can continue with the main loop without the
@@ -2966,14 +2993,14 @@
           {
           RMATCH(eptr, ecode, offset_top, md, eptrb, RM20);
           if (rrc != MATCH_NOMATCH) RRETURN(rrc);
-          if (fi >= max) MRRETURN(MATCH_NOMATCH);
+          if (fi >= max) RRETURN(MATCH_NOMATCH);
           if (eptr >= md->end_subject)
             {
             SCHECK_PARTIAL();
-            MRRETURN(MATCH_NOMATCH);
+            RRETURN(MATCH_NOMATCH);
             }
           GETCHARINCTEST(c, eptr);
-          if (!_pcre_xclass(c, data)) MRRETURN(MATCH_NOMATCH);
+          if (!PRIV(xclass)(c, data, utf)) RRETURN(MATCH_NOMATCH);
           }
         /* Control never gets here */
         }
@@ -2991,8 +3018,12 @@
             SCHECK_PARTIAL();
             break;
             }
+#ifdef SUPPORT_UTF
           GETCHARLENTEST(c, eptr, len);
-          if (!_pcre_xclass(c, data)) break;
+#else
+          c = *eptr;
+#endif
+          if (!PRIV(xclass)(c, data, utf)) break;
           eptr += len;
           }
         for(;;)
@@ -3000,9 +3031,11 @@
           RMATCH(eptr, ecode, offset_top, md, eptrb, RM21);
           if (rrc != MATCH_NOMATCH) RRETURN(rrc);
           if (eptr-- == pp) break;        /* Stop if tried at original pos */
-          if (utf8) BACKCHAR(eptr);
+#ifdef SUPPORT_UTF
+          if (utf) BACKCHAR(eptr);
+#endif
           }
-        MRRETURN(MATCH_NOMATCH);
+        RRETURN(MATCH_NOMATCH);
         }

       /* Control never gets here */
@@ -3012,8 +3045,8 @@
     /* Match a single character, casefully */

     case OP_CHAR:
-#ifdef SUPPORT_UTF8
-    if (utf8)
+#ifdef SUPPORT_UTF
+    if (utf)
       {
       length = 1;
       ecode++;
@@ -3021,50 +3054,57 @@
       if (length > md->end_subject - eptr)
         {
         CHECK_PARTIAL();             /* Not SCHECK_PARTIAL() */
-        MRRETURN(MATCH_NOMATCH);
+        RRETURN(MATCH_NOMATCH);
         }
-      while (length-- > 0) if (*ecode++ != *eptr++) MRRETURN(MATCH_NOMATCH);
+      while (length-- > 0) if (*ecode++ != *eptr++) RRETURN(MATCH_NOMATCH);
       }
     else
 #endif
-
-    /* Non-UTF-8 mode */
+    /* Not UTF mode */
       {
       if (md->end_subject - eptr < 1)
         {
         SCHECK_PARTIAL();            /* This one can use SCHECK_PARTIAL() */
-        MRRETURN(MATCH_NOMATCH);
+        RRETURN(MATCH_NOMATCH);
         }
-      if (ecode[1] != *eptr++) MRRETURN(MATCH_NOMATCH);
+      if (ecode[1] != *eptr++) RRETURN(MATCH_NOMATCH);
       ecode += 2;
       }
     break;

-    /* Match a single character, caselessly */
+    /* Match a single character, caselessly. If we are at the end of the
+    subject, give up immediately. */

     case OP_CHARI:
-#ifdef SUPPORT_UTF8
-    if (utf8)
+    if (eptr >= md->end_subject)
       {
+      SCHECK_PARTIAL();
+      RRETURN(MATCH_NOMATCH);
+      }
+
+#ifdef SUPPORT_UTF
+    if (utf)
+      {
       length = 1;
       ecode++;
       GETCHARLEN(fc, ecode, length);

-      if (length > md->end_subject - eptr)
-        {
-        CHECK_PARTIAL();             /* Not SCHECK_PARTIAL() */
-        MRRETURN(MATCH_NOMATCH);
-        }
-
       /* If the pattern character's value is < 128, we have only one byte, and
-      can use the fast lookup table. */
+      we know that its other case must also be one byte long, so we can use the
+      fast lookup table. We know that there is at least one byte left in the
+      subject. */

       if (fc < 128)
         {
-        if (md->lcc[*ecode++] != md->lcc[*eptr++]) MRRETURN(MATCH_NOMATCH);
+        if (md->lcc[fc]
+            != TABLE_GET(*eptr, md->lcc, *eptr)) RRETURN(MATCH_NOMATCH);
+        ecode++;
+        eptr++;
         }

-      /* Otherwise we must pick up the subject character */
+      /* Otherwise we must pick up the subject character. Note that we cannot
+      use the value of "length" to check for sufficient bytes left, because the
+      other case of the character may have more or fewer bytes.  */

       else
         {
@@ -3080,21 +3120,18 @@
 #ifdef SUPPORT_UCP
           if (dc != UCD_OTHERCASE(fc))
 #endif
-            MRRETURN(MATCH_NOMATCH);
+            RRETURN(MATCH_NOMATCH);
           }
         }
       }
     else
-#endif   /* SUPPORT_UTF8 */
+#endif   /* SUPPORT_UTF */

-    /* Non-UTF-8 mode */
+    /* Not UTF mode */
       {
-      if (md->end_subject - eptr < 1)
-        {
-        SCHECK_PARTIAL();            /* This one can use SCHECK_PARTIAL() */
-        MRRETURN(MATCH_NOMATCH);
-        }
-      if (md->lcc[ecode[1]] != md->lcc[*eptr++]) MRRETURN(MATCH_NOMATCH);
+      if (TABLE_GET(ecode[1], md->lcc, ecode[1])
+          != TABLE_GET(*eptr, md->lcc, *eptr)) RRETURN(MATCH_NOMATCH);
+      eptr++;
       ecode += 2;
       }
     break;
@@ -3104,7 +3141,7 @@
     case OP_EXACT:
     case OP_EXACTI:
     min = max = GET2(ecode, 1);
-    ecode += 3;
+    ecode += 1 + IMM2_SIZE;
     goto REPEATCHAR;

     case OP_POSUPTO:
@@ -3119,7 +3156,7 @@
     min = 0;
     max = GET2(ecode, 1);
     minimize = *ecode == OP_MINUPTO || *ecode == OP_MINUPTOI;
-    ecode += 3;
+    ecode += 1 + IMM2_SIZE;
     goto REPEATCHAR;

     case OP_POSSTAR:
@@ -3167,8 +3204,8 @@
     /* Common code for all repeated single-character matches. */

     REPEATCHAR:
-#ifdef SUPPORT_UTF8
-    if (utf8)
+#ifdef SUPPORT_UTF
+    if (utf)
       {
       length = 1;
       charptr = ecode;
@@ -3184,23 +3221,23 @@
         unsigned int othercase;
         if (op >= OP_STARI &&     /* Caseless */
             (othercase = UCD_OTHERCASE(fc)) != fc)
-          oclength = _pcre_ord2utf8(othercase, occhars);
+          oclength = PRIV(ord2utf)(othercase, occhars);
         else oclength = 0;
 #endif  /* SUPPORT_UCP */

         for (i = 1; i <= min; i++)
           {
           if (eptr <= md->end_subject - length &&
-            memcmp(eptr, charptr, length) == 0) eptr += length;
+            memcmp(eptr, charptr, IN_UCHARS(length)) == 0) eptr += length;
 #ifdef SUPPORT_UCP
           else if (oclength > 0 &&
                    eptr <= md->end_subject - oclength &&
-                   memcmp(eptr, occhars, oclength) == 0) eptr += oclength;
+                   memcmp(eptr, occhars, IN_UCHARS(oclength)) == 0) eptr += oclength;
 #endif  /* SUPPORT_UCP */
           else
             {
             CHECK_PARTIAL();
-            MRRETURN(MATCH_NOMATCH);
+            RRETURN(MATCH_NOMATCH);
             }
           }

@@ -3212,18 +3249,18 @@
             {
             RMATCH(eptr, ecode, offset_top, md, eptrb, RM22);
             if (rrc != MATCH_NOMATCH) RRETURN(rrc);
-            if (fi >= max) MRRETURN(MATCH_NOMATCH);
+            if (fi >= max) RRETURN(MATCH_NOMATCH);
             if (eptr <= md->end_subject - length &&
-              memcmp(eptr, charptr, length) == 0) eptr += length;
+              memcmp(eptr, charptr, IN_UCHARS(length)) == 0) eptr += length;
 #ifdef SUPPORT_UCP
             else if (oclength > 0 &&
                      eptr <= md->end_subject - oclength &&
-                     memcmp(eptr, occhars, oclength) == 0) eptr += oclength;
+                     memcmp(eptr, occhars, IN_UCHARS(oclength)) == 0) eptr += oclength;
 #endif  /* SUPPORT_UCP */
             else
               {
               CHECK_PARTIAL();
-              MRRETURN(MATCH_NOMATCH);
+              RRETURN(MATCH_NOMATCH);
               }
             }
           /* Control never gets here */
@@ -3235,11 +3272,11 @@
           for (i = min; i < max; i++)
             {
             if (eptr <= md->end_subject - length &&
-                memcmp(eptr, charptr, length) == 0) eptr += length;
+                memcmp(eptr, charptr, IN_UCHARS(length)) == 0) eptr += length;
 #ifdef SUPPORT_UCP
             else if (oclength > 0 &&
                      eptr <= md->end_subject - oclength &&
-                     memcmp(eptr, occhars, oclength) == 0) eptr += oclength;
+                     memcmp(eptr, occhars, IN_UCHARS(oclength)) == 0) eptr += oclength;
 #endif  /* SUPPORT_UCP */
             else
               {
@@ -3254,7 +3291,7 @@
             {
             RMATCH(eptr, ecode, offset_top, md, eptrb, RM23);
             if (rrc != MATCH_NOMATCH) RRETURN(rrc);
-            if (eptr == pp) { MRRETURN(MATCH_NOMATCH); }
+            if (eptr == pp) { RRETURN(MATCH_NOMATCH); }
 #ifdef SUPPORT_UCP
             eptr--;
             BACKCHAR(eptr);
@@ -3271,14 +3308,12 @@
       value of fc will always be < 128. */
       }
     else
-#endif  /* SUPPORT_UTF8 */
+#endif  /* SUPPORT_UTF */
+      /* When not in UTF-8 mode, load a single-byte character. */
+      fc = *ecode++;

-    /* When not in UTF-8 mode, load a single-byte character. */
-
-    fc = *ecode++;
-
-    /* The value of fc at this point is always less than 256, though we may or
-    may not be in UTF-8 mode. The code is duplicated for the caseless and
+    /* The value of fc at this point is always one character, though we may
+    or may not be in UTF mode. The code is duplicated for the caseless and
     caseful cases, for speed, since matching characters is likely to be quite
     common. First, ensure the minimum number of matches are present. If min =
     max, continue at the same level without recursing. Otherwise, if
@@ -3291,15 +3326,32 @@

     if (op >= OP_STARI)  /* Caseless */
       {
-      fc = md->lcc[fc];
+#ifdef COMPILE_PCRE8
+      /* fc must be < 128 if UTF is enabled. */
+      foc = md->fcc[fc];
+#else
+#ifdef SUPPORT_UTF
+#ifdef SUPPORT_UCP
+      if (utf && fc > 127)
+        foc = UCD_OTHERCASE(fc);
+#else
+      if (utf && fc > 127)
+        foc = fc;
+#endif /* SUPPORT_UCP */
+      else
+#endif /* SUPPORT_UTF */
+        foc = TABLE_GET(fc, md->fcc, fc);
+#endif /* COMPILE_PCRE8 */
+
       for (i = 1; i <= min; i++)
         {
         if (eptr >= md->end_subject)
           {
           SCHECK_PARTIAL();
-          MRRETURN(MATCH_NOMATCH);
+          RRETURN(MATCH_NOMATCH);
           }
-        if (fc != md->lcc[*eptr++]) MRRETURN(MATCH_NOMATCH);
+        if (fc != *eptr && foc != *eptr) RRETURN(MATCH_NOMATCH);
+        eptr++;
         }
       if (min == max) continue;
       if (minimize)
@@ -3308,13 +3360,14 @@
           {
           RMATCH(eptr, ecode, offset_top, md, eptrb, RM24);
           if (rrc != MATCH_NOMATCH) RRETURN(rrc);
-          if (fi >= max) MRRETURN(MATCH_NOMATCH);
+          if (fi >= max) RRETURN(MATCH_NOMATCH);
           if (eptr >= md->end_subject)
             {
             SCHECK_PARTIAL();
-            MRRETURN(MATCH_NOMATCH);
+            RRETURN(MATCH_NOMATCH);
             }
-          if (fc != md->lcc[*eptr++]) MRRETURN(MATCH_NOMATCH);
+          if (fc != *eptr && foc != *eptr) RRETURN(MATCH_NOMATCH);
+          eptr++;
           }
         /* Control never gets here */
         }
@@ -3328,7 +3381,7 @@
             SCHECK_PARTIAL();
             break;
             }
-          if (fc != md->lcc[*eptr]) break;
+          if (fc != *eptr && foc != *eptr) break;
           eptr++;
           }

@@ -3340,7 +3393,7 @@
           eptr--;
           if (rrc != MATCH_NOMATCH) RRETURN(rrc);
           }
-        MRRETURN(MATCH_NOMATCH);
+        RRETURN(MATCH_NOMATCH);
         }
       /* Control never gets here */
       }
@@ -3354,9 +3407,9 @@
         if (eptr >= md->end_subject)
           {
           SCHECK_PARTIAL();
-          MRRETURN(MATCH_NOMATCH);
+          RRETURN(MATCH_NOMATCH);
           }
-        if (fc != *eptr++) MRRETURN(MATCH_NOMATCH);
+        if (fc != *eptr++) RRETURN(MATCH_NOMATCH);
         }

       if (min == max) continue;
@@ -3367,13 +3420,13 @@
           {
           RMATCH(eptr, ecode, offset_top, md, eptrb, RM26);
           if (rrc != MATCH_NOMATCH) RRETURN(rrc);
-          if (fi >= max) MRRETURN(MATCH_NOMATCH);
+          if (fi >= max) RRETURN(MATCH_NOMATCH);
           if (eptr >= md->end_subject)
             {
             SCHECK_PARTIAL();
-            MRRETURN(MATCH_NOMATCH);
+            RRETURN(MATCH_NOMATCH);
             }
-          if (fc != *eptr++) MRRETURN(MATCH_NOMATCH);
+          if (fc != *eptr++) RRETURN(MATCH_NOMATCH);
           }
         /* Control never gets here */
         }
@@ -3398,7 +3451,7 @@
           eptr--;
           if (rrc != MATCH_NOMATCH) RRETURN(rrc);
           }
-        MRRETURN(MATCH_NOMATCH);
+        RRETURN(MATCH_NOMATCH);
         }
       }
     /* Control never gets here */
@@ -3411,21 +3464,35 @@
     if (eptr >= md->end_subject)
       {
       SCHECK_PARTIAL();
-      MRRETURN(MATCH_NOMATCH);
+      RRETURN(MATCH_NOMATCH);
       }
     ecode++;
     GETCHARINCTEST(c, eptr);
     if (op == OP_NOTI)         /* The caseless case */
       {
-#ifdef SUPPORT_UTF8
-      if (c < 256)
-#endif
-      c = md->lcc[c];
-      if (md->lcc[*ecode++] == c) MRRETURN(MATCH_NOMATCH);
+      register int ch, och;
+      ch = *ecode++;
+#ifdef COMPILE_PCRE8
+      /* ch must be < 128 if UTF is enabled. */
+      och = md->fcc[ch];
+#else
+#ifdef SUPPORT_UTF
+#ifdef SUPPORT_UCP
+      if (utf && ch > 127)
+        och = UCD_OTHERCASE(ch);
+#else
+      if (utf && ch > 127)
+        och = ch;
+#endif /* SUPPORT_UCP */
+      else
+#endif /* SUPPORT_UTF */
+        och = TABLE_GET(ch, md->fcc, ch);
+#endif /* COMPILE_PCRE8 */
+      if (ch == c || och == c) RRETURN(MATCH_NOMATCH);
       }
     else    /* Caseful */
       {
-      if (*ecode++ == c) MRRETURN(MATCH_NOMATCH);
+      if (*ecode++ == c) RRETURN(MATCH_NOMATCH);
       }
     break;

@@ -3439,7 +3506,7 @@
     case OP_NOTEXACT:
     case OP_NOTEXACTI:
     min = max = GET2(ecode, 1);
-    ecode += 3;
+    ecode += 1 + IMM2_SIZE;
     goto REPEATNOTCHAR;

     case OP_NOTUPTO:
@@ -3449,7 +3516,7 @@
     min = 0;
     max = GET2(ecode, 1);
     minimize = *ecode == OP_NOTMINUPTO || *ecode == OP_NOTMINUPTOI;
-    ecode += 3;
+    ecode += 1 + IMM2_SIZE;
     goto REPEATNOTCHAR;

     case OP_NOTPOSSTAR:
@@ -3481,7 +3548,7 @@
     possessive = TRUE;
     min = 0;
     max = GET2(ecode, 1);
-    ecode += 3;
+    ecode += 1 + IMM2_SIZE;
     goto REPEATNOTCHAR;

     case OP_NOTSTAR:
@@ -3520,11 +3587,25 @@

     if (op >= OP_NOTSTARI)     /* Caseless */
       {
-      fc = md->lcc[fc];
+#ifdef COMPILE_PCRE8
+      /* fc must be < 128 if UTF is enabled. */
+      foc = md->fcc[fc];
+#else
+#ifdef SUPPORT_UTF
+#ifdef SUPPORT_UCP
+      if (utf && fc > 127)
+        foc = UCD_OTHERCASE(fc);
+#else
+      if (utf && fc > 127)
+        foc = fc;
+#endif /* SUPPORT_UCP */
+      else
+#endif /* SUPPORT_UTF */
+        foc = TABLE_GET(fc, md->fcc, fc);
+#endif /* COMPILE_PCRE8 */

-#ifdef SUPPORT_UTF8
-      /* UTF-8 mode */
-      if (utf8)
+#ifdef SUPPORT_UTF
+      if (utf)
         {
         register unsigned int d;
         for (i = 1; i <= min; i++)
@@ -3532,26 +3613,25 @@
           if (eptr >= md->end_subject)
             {
             SCHECK_PARTIAL();
-            MRRETURN(MATCH_NOMATCH);
+            RRETURN(MATCH_NOMATCH);
             }
           GETCHARINC(d, eptr);
-          if (d < 256) d = md->lcc[d];
-          if (fc == d) MRRETURN(MATCH_NOMATCH);
+          if (fc == d || foc == d) RRETURN(MATCH_NOMATCH);
           }
         }
       else
 #endif
-
-      /* Not UTF-8 mode */
+      /* Not UTF mode */
         {
         for (i = 1; i <= min; i++)
           {
           if (eptr >= md->end_subject)
             {
             SCHECK_PARTIAL();
-            MRRETURN(MATCH_NOMATCH);
+            RRETURN(MATCH_NOMATCH);
             }
-          if (fc == md->lcc[*eptr++]) MRRETURN(MATCH_NOMATCH);
+          if (fc == *eptr || foc == *eptr) RRETURN(MATCH_NOMATCH);
+          eptr++;
           }
         }

@@ -3559,41 +3639,40 @@

       if (minimize)
         {
-#ifdef SUPPORT_UTF8
-        /* UTF-8 mode */
-        if (utf8)
+#ifdef SUPPORT_UTF
+        if (utf)
           {
           register unsigned int d;
           for (fi = min;; fi++)
             {
             RMATCH(eptr, ecode, offset_top, md, eptrb, RM28);
             if (rrc != MATCH_NOMATCH) RRETURN(rrc);
-            if (fi >= max) MRRETURN(MATCH_NOMATCH);
+            if (fi >= max) RRETURN(MATCH_NOMATCH);
             if (eptr >= md->end_subject)
               {
               SCHECK_PARTIAL();
-              MRRETURN(MATCH_NOMATCH);
+              RRETURN(MATCH_NOMATCH);
               }
             GETCHARINC(d, eptr);
-            if (d < 256) d = md->lcc[d];
-            if (fc == d) MRRETURN(MATCH_NOMATCH);
+            if (fc == d || foc == d) RRETURN(MATCH_NOMATCH);
             }
           }
         else
 #endif
-        /* Not UTF-8 mode */
+        /* Not UTF mode */
           {
           for (fi = min;; fi++)
             {
             RMATCH(eptr, ecode, offset_top, md, eptrb, RM29);
             if (rrc != MATCH_NOMATCH) RRETURN(rrc);
-            if (fi >= max) MRRETURN(MATCH_NOMATCH);
+            if (fi >= max) RRETURN(MATCH_NOMATCH);
             if (eptr >= md->end_subject)
               {
               SCHECK_PARTIAL();
-              MRRETURN(MATCH_NOMATCH);
+              RRETURN(MATCH_NOMATCH);
               }
-            if (fc == md->lcc[*eptr++]) MRRETURN(MATCH_NOMATCH);
+            if (fc == *eptr || foc == *eptr) RRETURN(MATCH_NOMATCH);
+            eptr++;
             }
           }
         /* Control never gets here */
@@ -3605,9 +3684,8 @@
         {
         pp = eptr;

-#ifdef SUPPORT_UTF8
-        /* UTF-8 mode */
-        if (utf8)
+#ifdef SUPPORT_UTF
+        if (utf)
           {
           register unsigned int d;
           for (i = min; i < max; i++)
@@ -3619,8 +3697,7 @@
               break;
               }
             GETCHARLEN(d, eptr, len);
-            if (d < 256) d = md->lcc[d];
-            if (fc == d) break;
+            if (fc == d || foc == d) break;
             eptr += len;
             }
         if (possessive) continue;
@@ -3634,7 +3711,7 @@
           }
         else
 #endif
-        /* Not UTF-8 mode */
+        /* Not UTF mode */
           {
           for (i = min; i < max; i++)
             {
@@ -3643,7 +3720,7 @@
               SCHECK_PARTIAL();
               break;
               }
-            if (fc == md->lcc[*eptr]) break;
+            if (fc == *eptr || foc == *eptr) break;
             eptr++;
             }
           if (possessive) continue;
@@ -3655,7 +3732,7 @@
             }
           }

-        MRRETURN(MATCH_NOMATCH);
+        RRETURN(MATCH_NOMATCH);
         }
       /* Control never gets here */
       }
@@ -3664,9 +3741,8 @@

     else
       {
-#ifdef SUPPORT_UTF8
-      /* UTF-8 mode */
-      if (utf8)
+#ifdef SUPPORT_UTF
+      if (utf)
         {
         register unsigned int d;
         for (i = 1; i <= min; i++)
@@ -3674,24 +3750,24 @@
           if (eptr >= md->end_subject)
             {
             SCHECK_PARTIAL();
-            MRRETURN(MATCH_NOMATCH);
+            RRETURN(MATCH_NOMATCH);
             }
           GETCHARINC(d, eptr);
-          if (fc == d) MRRETURN(MATCH_NOMATCH);
+          if (fc == d) RRETURN(MATCH_NOMATCH);
           }
         }
       else
 #endif
-      /* Not UTF-8 mode */
+      /* Not UTF mode */
         {
         for (i = 1; i <= min; i++)
           {
           if (eptr >= md->end_subject)
             {
             SCHECK_PARTIAL();
-            MRRETURN(MATCH_NOMATCH);
+            RRETURN(MATCH_NOMATCH);
             }
-          if (fc == *eptr++) MRRETURN(MATCH_NOMATCH);
+          if (fc == *eptr++) RRETURN(MATCH_NOMATCH);
           }
         }

@@ -3699,40 +3775,39 @@

       if (minimize)
         {
-#ifdef SUPPORT_UTF8
-        /* UTF-8 mode */
-        if (utf8)
+#ifdef SUPPORT_UTF
+        if (utf)
           {
           register unsigned int d;
           for (fi = min;; fi++)
             {
             RMATCH(eptr, ecode, offset_top, md, eptrb, RM32);
             if (rrc != MATCH_NOMATCH) RRETURN(rrc);
-            if (fi >= max) MRRETURN(MATCH_NOMATCH);
+            if (fi >= max) RRETURN(MATCH_NOMATCH);
             if (eptr >= md->end_subject)
               {
               SCHECK_PARTIAL();
-              MRRETURN(MATCH_NOMATCH);
+              RRETURN(MATCH_NOMATCH);
               }
             GETCHARINC(d, eptr);
-            if (fc == d) MRRETURN(MATCH_NOMATCH);
+            if (fc == d) RRETURN(MATCH_NOMATCH);
             }
           }
         else
 #endif
-        /* Not UTF-8 mode */
+        /* Not UTF mode */
           {
           for (fi = min;; fi++)
             {
             RMATCH(eptr, ecode, offset_top, md, eptrb, RM33);
             if (rrc != MATCH_NOMATCH) RRETURN(rrc);
-            if (fi >= max) MRRETURN(MATCH_NOMATCH);
+            if (fi >= max) RRETURN(MATCH_NOMATCH);
             if (eptr >= md->end_subject)
               {
               SCHECK_PARTIAL();
-              MRRETURN(MATCH_NOMATCH);
+              RRETURN(MATCH_NOMATCH);
               }
-            if (fc == *eptr++) MRRETURN(MATCH_NOMATCH);
+            if (fc == *eptr++) RRETURN(MATCH_NOMATCH);
             }
           }
         /* Control never gets here */
@@ -3744,9 +3819,8 @@
         {
         pp = eptr;

-#ifdef SUPPORT_UTF8
-        /* UTF-8 mode */
-        if (utf8)
+#ifdef SUPPORT_UTF
+        if (utf)
           {
           register unsigned int d;
           for (i = min; i < max; i++)
@@ -3772,7 +3846,7 @@
           }
         else
 #endif
-        /* Not UTF-8 mode */
+        /* Not UTF mode */
           {
           for (i = min; i < max; i++)
             {
@@ -3793,7 +3867,7 @@
             }
           }

-        MRRETURN(MATCH_NOMATCH);
+        RRETURN(MATCH_NOMATCH);
         }
       }
     /* Control never gets here */
@@ -3805,7 +3879,7 @@
     case OP_TYPEEXACT:
     min = max = GET2(ecode, 1);
     minimize = TRUE;
-    ecode += 3;
+    ecode += 1 + IMM2_SIZE;
     goto REPEATTYPE;

     case OP_TYPEUPTO:
@@ -3813,7 +3887,7 @@
     min = 0;
     max = GET2(ecode, 1);
     minimize = *ecode == OP_TYPEMINUPTO;
-    ecode += 3;
+    ecode += 1 + IMM2_SIZE;
     goto REPEATTYPE;

     case OP_TYPEPOSSTAR:
@@ -3841,7 +3915,7 @@
     possessive = TRUE;
     min = 0;
     max = GET2(ecode, 1);
-    ecode += 3;
+    ecode += 1 + IMM2_SIZE;
     goto REPEATTYPE;

     case OP_TYPESTAR:
@@ -3887,13 +3961,13 @@
         switch(prop_type)
           {
           case PT_ANY:
-          if (prop_fail_result) MRRETURN(MATCH_NOMATCH);
+          if (prop_fail_result) RRETURN(MATCH_NOMATCH);
           for (i = 1; i <= min; i++)
             {
             if (eptr >= md->end_subject)
               {
               SCHECK_PARTIAL();
-              MRRETURN(MATCH_NOMATCH);
+              RRETURN(MATCH_NOMATCH);
               }
             GETCHARINCTEST(c, eptr);
             }
@@ -3906,14 +3980,14 @@
             if (eptr >= md->end_subject)
               {
               SCHECK_PARTIAL();
-              MRRETURN(MATCH_NOMATCH);
+              RRETURN(MATCH_NOMATCH);
               }
             GETCHARINCTEST(c, eptr);
             chartype = UCD_CHARTYPE(c);
             if ((chartype == ucp_Lu ||
                  chartype == ucp_Ll ||
                  chartype == ucp_Lt) == prop_fail_result)
-              MRRETURN(MATCH_NOMATCH);
+              RRETURN(MATCH_NOMATCH);
             }
           break;

@@ -3923,11 +3997,11 @@
             if (eptr >= md->end_subject)
               {
               SCHECK_PARTIAL();
-              MRRETURN(MATCH_NOMATCH);
+              RRETURN(MATCH_NOMATCH);
               }
             GETCHARINCTEST(c, eptr);
             if ((UCD_CATEGORY(c) == prop_value) == prop_fail_result)
-              MRRETURN(MATCH_NOMATCH);
+              RRETURN(MATCH_NOMATCH);
             }
           break;

@@ -3937,11 +4011,11 @@
             if (eptr >= md->end_subject)
               {
               SCHECK_PARTIAL();
-              MRRETURN(MATCH_NOMATCH);
+              RRETURN(MATCH_NOMATCH);
               }
             GETCHARINCTEST(c, eptr);
             if ((UCD_CHARTYPE(c) == prop_value) == prop_fail_result)
-              MRRETURN(MATCH_NOMATCH);
+              RRETURN(MATCH_NOMATCH);
             }
           break;

@@ -3951,11 +4025,11 @@
             if (eptr >= md->end_subject)
               {
               SCHECK_PARTIAL();
-              MRRETURN(MATCH_NOMATCH);
+              RRETURN(MATCH_NOMATCH);
               }
             GETCHARINCTEST(c, eptr);
             if ((UCD_SCRIPT(c) == prop_value) == prop_fail_result)
-              MRRETURN(MATCH_NOMATCH);
+              RRETURN(MATCH_NOMATCH);
             }
           break;

@@ -3966,12 +4040,12 @@
             if (eptr >= md->end_subject)
               {
               SCHECK_PARTIAL();
-              MRRETURN(MATCH_NOMATCH);
+              RRETURN(MATCH_NOMATCH);
               }
             GETCHARINCTEST(c, eptr);
             category = UCD_CATEGORY(c);
             if ((category == ucp_L || category == ucp_N) == prop_fail_result)
-              MRRETURN(MATCH_NOMATCH);
+              RRETURN(MATCH_NOMATCH);
             }
           break;

@@ -3981,13 +4055,13 @@
             if (eptr >= md->end_subject)
               {
               SCHECK_PARTIAL();
-              MRRETURN(MATCH_NOMATCH);
+              RRETURN(MATCH_NOMATCH);
               }
             GETCHARINCTEST(c, eptr);
             if ((UCD_CATEGORY(c) == ucp_Z || c == CHAR_HT || c == CHAR_NL ||
                  c == CHAR_FF || c == CHAR_CR)
                    == prop_fail_result)
-              MRRETURN(MATCH_NOMATCH);
+              RRETURN(MATCH_NOMATCH);
             }
           break;

@@ -3997,13 +4071,13 @@
             if (eptr >= md->end_subject)
               {
               SCHECK_PARTIAL();
-              MRRETURN(MATCH_NOMATCH);
+              RRETURN(MATCH_NOMATCH);
               }
             GETCHARINCTEST(c, eptr);
             if ((UCD_CATEGORY(c) == ucp_Z || c == CHAR_HT || c == CHAR_NL ||
                  c == CHAR_VT || c == CHAR_FF || c == CHAR_CR)
                    == prop_fail_result)
-              MRRETURN(MATCH_NOMATCH);
+              RRETURN(MATCH_NOMATCH);
             }
           break;

@@ -4014,13 +4088,13 @@
             if (eptr >= md->end_subject)
               {
               SCHECK_PARTIAL();
-              MRRETURN(MATCH_NOMATCH);
+              RRETURN(MATCH_NOMATCH);
               }
             GETCHARINCTEST(c, eptr);
             category = UCD_CATEGORY(c);
             if ((category == ucp_L || category == ucp_N || c == CHAR_UNDERSCORE)
                    == prop_fail_result)
-              MRRETURN(MATCH_NOMATCH);
+              RRETURN(MATCH_NOMATCH);
             }
           break;

@@ -4041,14 +4115,14 @@
           if (eptr >= md->end_subject)
             {
             SCHECK_PARTIAL();
-            MRRETURN(MATCH_NOMATCH);
+            RRETURN(MATCH_NOMATCH);
             }
           GETCHARINCTEST(c, eptr);
-          if (UCD_CATEGORY(c) == ucp_M) MRRETURN(MATCH_NOMATCH);
+          if (UCD_CATEGORY(c) == ucp_M) RRETURN(MATCH_NOMATCH);
           while (eptr < md->end_subject)
             {
             int len = 1;
-            if (!utf8) c = *eptr; else { GETCHARLEN(c, eptr, len); }
+            if (!utf) c = *eptr; else { GETCHARLEN(c, eptr, len); }
             if (UCD_CATEGORY(c) != ucp_M) break;
             eptr += len;
             }
@@ -4060,8 +4134,8 @@

/* Handle all other cases when the coding is UTF-8 */

-#ifdef SUPPORT_UTF8
-      if (utf8) switch(ctype)
+#ifdef SUPPORT_UTF
+      if (utf) switch(ctype)
         {
         case OP_ANY:
         for (i = 1; i <= min; i++)
@@ -4069,11 +4143,11 @@
           if (eptr >= md->end_subject)
             {
             SCHECK_PARTIAL();
-            MRRETURN(MATCH_NOMATCH);
+            RRETURN(MATCH_NOMATCH);
             }
-          if (IS_NEWLINE(eptr)) MRRETURN(MATCH_NOMATCH);
+          if (IS_NEWLINE(eptr)) RRETURN(MATCH_NOMATCH);
           eptr++;
-          while (eptr < md->end_subject && (*eptr & 0xc0) == 0x80) eptr++;
+          ACROSSCHAR(eptr < md->end_subject, *eptr, eptr++);
           }
         break;

@@ -4083,15 +4157,15 @@
           if (eptr >= md->end_subject)
             {
             SCHECK_PARTIAL();
-            MRRETURN(MATCH_NOMATCH);
+            RRETURN(MATCH_NOMATCH);
             }
           eptr++;
-          while (eptr < md->end_subject && (*eptr & 0xc0) == 0x80) eptr++;
+          ACROSSCHAR(eptr < md->end_subject, *eptr, eptr++);
           }
         break;

         case OP_ANYBYTE:
-        if (eptr > md->end_subject - min) MRRETURN(MATCH_NOMATCH);
+        if (eptr > md->end_subject - min) RRETURN(MATCH_NOMATCH);
         eptr += min;
         break;

@@ -4101,12 +4175,12 @@
           if (eptr >= md->end_subject)
             {
             SCHECK_PARTIAL();
-            MRRETURN(MATCH_NOMATCH);
+            RRETURN(MATCH_NOMATCH);
             }
           GETCHARINC(c, eptr);
           switch(c)
             {
-            default: MRRETURN(MATCH_NOMATCH);
+            default: RRETURN(MATCH_NOMATCH);

             case 0x000d:
             if (eptr < md->end_subject && *eptr == 0x0a) eptr++;
@@ -4120,7 +4194,7 @@
             case 0x0085:
             case 0x2028:
             case 0x2029:
-            if (md->bsr_anycrlf) MRRETURN(MATCH_NOMATCH);
+            if (md->bsr_anycrlf) RRETURN(MATCH_NOMATCH);
             break;
             }
           }
@@ -4132,7 +4206,7 @@
           if (eptr >= md->end_subject)
             {
             SCHECK_PARTIAL();
-            MRRETURN(MATCH_NOMATCH);
+            RRETURN(MATCH_NOMATCH);
             }
           GETCHARINC(c, eptr);
           switch(c)
@@ -4157,7 +4231,7 @@
             case 0x202f:    /* NARROW NO-BREAK SPACE */
             case 0x205f:    /* MEDIUM MATHEMATICAL SPACE */
             case 0x3000:    /* IDEOGRAPHIC SPACE */
-            MRRETURN(MATCH_NOMATCH);
+            RRETURN(MATCH_NOMATCH);
             }
           }
         break;
@@ -4168,12 +4242,12 @@
           if (eptr >= md->end_subject)
             {
             SCHECK_PARTIAL();
-            MRRETURN(MATCH_NOMATCH);
+            RRETURN(MATCH_NOMATCH);
             }
           GETCHARINC(c, eptr);
           switch(c)
             {
-            default: MRRETURN(MATCH_NOMATCH);
+            default: RRETURN(MATCH_NOMATCH);
             case 0x09:      /* HT */
             case 0x20:      /* SPACE */
             case 0xa0:      /* NBSP */
@@ -4204,7 +4278,7 @@
           if (eptr >= md->end_subject)
             {
             SCHECK_PARTIAL();
-            MRRETURN(MATCH_NOMATCH);
+            RRETURN(MATCH_NOMATCH);
             }
           GETCHARINC(c, eptr);
           switch(c)
@@ -4217,7 +4291,7 @@
             case 0x85:      /* NEL */
             case 0x2028:    /* LINE SEPARATOR */
             case 0x2029:    /* PARAGRAPH SEPARATOR */
-            MRRETURN(MATCH_NOMATCH);
+            RRETURN(MATCH_NOMATCH);
             }
           }
         break;
@@ -4228,12 +4302,12 @@
           if (eptr >= md->end_subject)
             {
             SCHECK_PARTIAL();
-            MRRETURN(MATCH_NOMATCH);
+            RRETURN(MATCH_NOMATCH);
             }
           GETCHARINC(c, eptr);
           switch(c)
             {
-            default: MRRETURN(MATCH_NOMATCH);
+            default: RRETURN(MATCH_NOMATCH);
             case 0x0a:      /* LF */
             case 0x0b:      /* VT */
             case 0x0c:      /* FF */
@@ -4252,11 +4326,11 @@
           if (eptr >= md->end_subject)
             {
             SCHECK_PARTIAL();
-            MRRETURN(MATCH_NOMATCH);
+            RRETURN(MATCH_NOMATCH);
             }
           GETCHARINC(c, eptr);
           if (c < 128 && (md->ctypes[c] & ctype_digit) != 0)
-            MRRETURN(MATCH_NOMATCH);
+            RRETURN(MATCH_NOMATCH);
           }
         break;

@@ -4266,10 +4340,11 @@
           if (eptr >= md->end_subject)
             {
             SCHECK_PARTIAL();
-            MRRETURN(MATCH_NOMATCH);
+            RRETURN(MATCH_NOMATCH);
             }
-          if (*eptr >= 128 || (md->ctypes[*eptr++] & ctype_digit) == 0)
-            MRRETURN(MATCH_NOMATCH);
+          if (*eptr >= 128 || (md->ctypes[*eptr] & ctype_digit) == 0)
+            RRETURN(MATCH_NOMATCH);
+          eptr++;
           /* No need to skip more bytes - we know it's a 1-byte character */
           }
         break;
@@ -4280,11 +4355,12 @@
           if (eptr >= md->end_subject)
             {
             SCHECK_PARTIAL();
-            MRRETURN(MATCH_NOMATCH);
+            RRETURN(MATCH_NOMATCH);
             }
           if (*eptr < 128 && (md->ctypes[*eptr] & ctype_space) != 0)
-            MRRETURN(MATCH_NOMATCH);
-          while (++eptr < md->end_subject && (*eptr & 0xc0) == 0x80);
+            RRETURN(MATCH_NOMATCH);
+          eptr++;
+          ACROSSCHAR(eptr < md->end_subject, *eptr, eptr++);
           }
         break;

@@ -4294,10 +4370,11 @@
           if (eptr >= md->end_subject)
             {
             SCHECK_PARTIAL();
-            MRRETURN(MATCH_NOMATCH);
+            RRETURN(MATCH_NOMATCH);
             }
-          if (*eptr >= 128 || (md->ctypes[*eptr++] & ctype_space) == 0)
-            MRRETURN(MATCH_NOMATCH);
+          if (*eptr >= 128 || (md->ctypes[*eptr] & ctype_space) == 0)
+            RRETURN(MATCH_NOMATCH);
+          eptr++;
           /* No need to skip more bytes - we know it's a 1-byte character */
           }
         break;
@@ -4308,11 +4385,12 @@
           if (eptr >= md->end_subject)
             {
             SCHECK_PARTIAL();
-            MRRETURN(MATCH_NOMATCH);
+            RRETURN(MATCH_NOMATCH);
             }
           if (*eptr < 128 && (md->ctypes[*eptr] & ctype_word) != 0)
-            MRRETURN(MATCH_NOMATCH);
-          while (++eptr < md->end_subject && (*eptr & 0xc0) == 0x80);
+            RRETURN(MATCH_NOMATCH);
+          eptr++;
+          ACROSSCHAR(eptr < md->end_subject, *eptr, eptr++);
           }
         break;

@@ -4322,10 +4400,11 @@
           if (eptr >= md->end_subject)
             {
             SCHECK_PARTIAL();
-            MRRETURN(MATCH_NOMATCH);
+            RRETURN(MATCH_NOMATCH);
             }
-          if (*eptr >= 128 || (md->ctypes[*eptr++] & ctype_word) == 0)
-            MRRETURN(MATCH_NOMATCH);
+          if (*eptr >= 128 || (md->ctypes[*eptr] & ctype_word) == 0)
+            RRETURN(MATCH_NOMATCH);
+          eptr++;
           /* No need to skip more bytes - we know it's a 1-byte character */
           }
         break;
@@ -4335,7 +4414,7 @@
         }  /* End switch(ctype) */

       else
-#endif     /* SUPPORT_UTF8 */
+#endif     /* SUPPORT_UTF */

       /* Code for the non-UTF-8 case for minimum matching of operators other
       than OP_PROP and OP_NOTPROP. */
@@ -4348,9 +4427,9 @@
           if (eptr >= md->end_subject)
             {
             SCHECK_PARTIAL();
-            MRRETURN(MATCH_NOMATCH);
+            RRETURN(MATCH_NOMATCH);
             }
-          if (IS_NEWLINE(eptr)) MRRETURN(MATCH_NOMATCH);
+          if (IS_NEWLINE(eptr)) RRETURN(MATCH_NOMATCH);
           eptr++;
           }
         break;
@@ -4359,7 +4438,7 @@
         if (eptr > md->end_subject - min)
           {
           SCHECK_PARTIAL();
-          MRRETURN(MATCH_NOMATCH);
+          RRETURN(MATCH_NOMATCH);
           }
         eptr += min;
         break;
@@ -4368,7 +4447,7 @@
         if (eptr > md->end_subject - min)
           {
           SCHECK_PARTIAL();
-          MRRETURN(MATCH_NOMATCH);
+          RRETURN(MATCH_NOMATCH);
           }
         eptr += min;
         break;
@@ -4379,11 +4458,11 @@
           if (eptr >= md->end_subject)
             {
             SCHECK_PARTIAL();
-            MRRETURN(MATCH_NOMATCH);
+            RRETURN(MATCH_NOMATCH);
             }
           switch(*eptr++)
             {
-            default: MRRETURN(MATCH_NOMATCH);
+            default: RRETURN(MATCH_NOMATCH);

             case 0x000d:
             if (eptr < md->end_subject && *eptr == 0x0a) eptr++;
@@ -4395,7 +4474,11 @@
             case 0x000b:
             case 0x000c:
             case 0x0085:
-            if (md->bsr_anycrlf) MRRETURN(MATCH_NOMATCH);
+#ifdef COMPILE_PCRE16
+            case 0x2028:
+            case 0x2029:
+#endif
+            if (md->bsr_anycrlf) RRETURN(MATCH_NOMATCH);
             break;
             }
           }
@@ -4407,7 +4490,7 @@
           if (eptr >= md->end_subject)
             {
             SCHECK_PARTIAL();
-            MRRETURN(MATCH_NOMATCH);
+            RRETURN(MATCH_NOMATCH);
             }
           switch(*eptr++)
             {
@@ -4415,7 +4498,25 @@
             case 0x09:      /* HT */
             case 0x20:      /* SPACE */
             case 0xa0:      /* NBSP */
-            MRRETURN(MATCH_NOMATCH);
+#ifdef COMPILE_PCRE16
+            case 0x1680:    /* OGHAM SPACE MARK */
+            case 0x180e:    /* MONGOLIAN VOWEL SEPARATOR */
+            case 0x2000:    /* EN QUAD */
+            case 0x2001:    /* EM QUAD */
+            case 0x2002:    /* EN SPACE */
+            case 0x2003:    /* EM SPACE */
+            case 0x2004:    /* THREE-PER-EM SPACE */
+            case 0x2005:    /* FOUR-PER-EM SPACE */
+            case 0x2006:    /* SIX-PER-EM SPACE */
+            case 0x2007:    /* FIGURE SPACE */
+            case 0x2008:    /* PUNCTUATION SPACE */
+            case 0x2009:    /* THIN SPACE */
+            case 0x200A:    /* HAIR SPACE */
+            case 0x202f:    /* NARROW NO-BREAK SPACE */
+            case 0x205f:    /* MEDIUM MATHEMATICAL SPACE */
+            case 0x3000:    /* IDEOGRAPHIC SPACE */
+#endif
+            RRETURN(MATCH_NOMATCH);
             }
           }
         break;
@@ -4426,14 +4527,32 @@
           if (eptr >= md->end_subject)
             {
             SCHECK_PARTIAL();
-            MRRETURN(MATCH_NOMATCH);
+            RRETURN(MATCH_NOMATCH);
             }
           switch(*eptr++)
             {
-            default: MRRETURN(MATCH_NOMATCH);
+            default: RRETURN(MATCH_NOMATCH);
             case 0x09:      /* HT */
             case 0x20:      /* SPACE */
             case 0xa0:      /* NBSP */
+#ifdef COMPILE_PCRE16
+            case 0x1680:    /* OGHAM SPACE MARK */
+            case 0x180e:    /* MONGOLIAN VOWEL SEPARATOR */
+            case 0x2000:    /* EN QUAD */
+            case 0x2001:    /* EM QUAD */
+            case 0x2002:    /* EN SPACE */
+            case 0x2003:    /* EM SPACE */
+            case 0x2004:    /* THREE-PER-EM SPACE */
+            case 0x2005:    /* FOUR-PER-EM SPACE */
+            case 0x2006:    /* SIX-PER-EM SPACE */
+            case 0x2007:    /* FIGURE SPACE */
+            case 0x2008:    /* PUNCTUATION SPACE */
+            case 0x2009:    /* THIN SPACE */
+            case 0x200A:    /* HAIR SPACE */
+            case 0x202f:    /* NARROW NO-BREAK SPACE */
+            case 0x205f:    /* MEDIUM MATHEMATICAL SPACE */
+            case 0x3000:    /* IDEOGRAPHIC SPACE */
+#endif
             break;
             }
           }
@@ -4445,7 +4564,7 @@
           if (eptr >= md->end_subject)
             {
             SCHECK_PARTIAL();
-            MRRETURN(MATCH_NOMATCH);
+            RRETURN(MATCH_NOMATCH);
             }
           switch(*eptr++)
             {
@@ -4455,7 +4574,11 @@
             case 0x0c:      /* FF */
             case 0x0d:      /* CR */
             case 0x85:      /* NEL */
-            MRRETURN(MATCH_NOMATCH);
+#ifdef COMPILE_PCRE16
+            case 0x2028:    /* LINE SEPARATOR */
+            case 0x2029:    /* PARAGRAPH SEPARATOR */
+#endif
+            RRETURN(MATCH_NOMATCH);
             }
           }
         break;
@@ -4466,16 +4589,20 @@
           if (eptr >= md->end_subject)
             {
             SCHECK_PARTIAL();
-            MRRETURN(MATCH_NOMATCH);
+            RRETURN(MATCH_NOMATCH);
             }
           switch(*eptr++)
             {
-            default: MRRETURN(MATCH_NOMATCH);
+            default: RRETURN(MATCH_NOMATCH);
             case 0x0a:      /* LF */
             case 0x0b:      /* VT */
             case 0x0c:      /* FF */
             case 0x0d:      /* CR */
             case 0x85:      /* NEL */
+#ifdef COMPILE_PCRE16
+            case 0x2028:    /* LINE SEPARATOR */
+            case 0x2029:    /* PARAGRAPH SEPARATOR */
+#endif
             break;
             }
           }
@@ -4487,9 +4614,11 @@
           if (eptr >= md->end_subject)
             {
             SCHECK_PARTIAL();
-            MRRETURN(MATCH_NOMATCH);
+            RRETURN(MATCH_NOMATCH);
             }
-          if ((md->ctypes[*eptr++] & ctype_digit) != 0) MRRETURN(MATCH_NOMATCH);
+          if (MAX_255(*eptr) && (md->ctypes[*eptr] & ctype_digit) != 0)
+            RRETURN(MATCH_NOMATCH);
+          eptr++;
           }
         break;

@@ -4499,9 +4628,11 @@
           if (eptr >= md->end_subject)
             {
             SCHECK_PARTIAL();
-            MRRETURN(MATCH_NOMATCH);
+            RRETURN(MATCH_NOMATCH);
             }
-          if ((md->ctypes[*eptr++] & ctype_digit) == 0) MRRETURN(MATCH_NOMATCH);
+          if (!MAX_255(*eptr) || (md->ctypes[*eptr] & ctype_digit) == 0)
+            RRETURN(MATCH_NOMATCH);
+          eptr++;
           }
         break;

@@ -4511,9 +4642,11 @@
           if (eptr >= md->end_subject)
             {
             SCHECK_PARTIAL();
-            MRRETURN(MATCH_NOMATCH);
+            RRETURN(MATCH_NOMATCH);
             }
-          if ((md->ctypes[*eptr++] & ctype_space) != 0) MRRETURN(MATCH_NOMATCH);
+          if (MAX_255(*eptr) && (md->ctypes[*eptr] & ctype_space) != 0)
+            RRETURN(MATCH_NOMATCH);
+          eptr++;
           }
         break;

@@ -4523,9 +4656,11 @@
           if (eptr >= md->end_subject)
             {
             SCHECK_PARTIAL();
-            MRRETURN(MATCH_NOMATCH);
+            RRETURN(MATCH_NOMATCH);
             }
-          if ((md->ctypes[*eptr++] & ctype_space) == 0) MRRETURN(MATCH_NOMATCH);
+          if (!MAX_255(*eptr) || (md->ctypes[*eptr] & ctype_space) == 0)
+            RRETURN(MATCH_NOMATCH);
+          eptr++;
           }
         break;

@@ -4535,10 +4670,11 @@
           if (eptr >= md->end_subject)
             {
             SCHECK_PARTIAL();
-            MRRETURN(MATCH_NOMATCH);
+            RRETURN(MATCH_NOMATCH);
             }
-          if ((md->ctypes[*eptr++] & ctype_word) != 0)
-            MRRETURN(MATCH_NOMATCH);
+          if (MAX_255(*eptr) && (md->ctypes[*eptr] & ctype_word) != 0)
+            RRETURN(MATCH_NOMATCH);
+          eptr++;
           }
         break;

@@ -4548,10 +4684,11 @@
           if (eptr >= md->end_subject)
             {
             SCHECK_PARTIAL();
-            MRRETURN(MATCH_NOMATCH);
+            RRETURN(MATCH_NOMATCH);
             }
-          if ((md->ctypes[*eptr++] & ctype_word) == 0)
-            MRRETURN(MATCH_NOMATCH);
+          if (!MAX_255(*eptr) || (md->ctypes[*eptr] & ctype_word) == 0)
+            RRETURN(MATCH_NOMATCH);
+          eptr++;
           }
         break;

@@ -4580,14 +4717,14 @@
             {
             RMATCH(eptr, ecode, offset_top, md, eptrb, RM36);
             if (rrc != MATCH_NOMATCH) RRETURN(rrc);
-            if (fi >= max) MRRETURN(MATCH_NOMATCH);
+            if (fi >= max) RRETURN(MATCH_NOMATCH);
             if (eptr >= md->end_subject)
               {
               SCHECK_PARTIAL();
-              MRRETURN(MATCH_NOMATCH);
+              RRETURN(MATCH_NOMATCH);
               }
             GETCHARINCTEST(c, eptr);
-            if (prop_fail_result) MRRETURN(MATCH_NOMATCH);
+            if (prop_fail_result) RRETURN(MATCH_NOMATCH);
             }
           /* Control never gets here */

@@ -4597,18 +4734,18 @@
             int chartype;
             RMATCH(eptr, ecode, offset_top, md, eptrb, RM37);
             if (rrc != MATCH_NOMATCH) RRETURN(rrc);
-            if (fi >= max) MRRETURN(MATCH_NOMATCH);
+            if (fi >= max) RRETURN(MATCH_NOMATCH);
             if (eptr >= md->end_subject)
               {
               SCHECK_PARTIAL();
-              MRRETURN(MATCH_NOMATCH);
+              RRETURN(MATCH_NOMATCH);
               }
             GETCHARINCTEST(c, eptr);
             chartype = UCD_CHARTYPE(c);
             if ((chartype == ucp_Lu ||
                  chartype == ucp_Ll ||
                  chartype == ucp_Lt) == prop_fail_result)
-              MRRETURN(MATCH_NOMATCH);
+              RRETURN(MATCH_NOMATCH);
             }
           /* Control never gets here */

@@ -4617,15 +4754,15 @@
             {
             RMATCH(eptr, ecode, offset_top, md, eptrb, RM38);
             if (rrc != MATCH_NOMATCH) RRETURN(rrc);
-            if (fi >= max) MRRETURN(MATCH_NOMATCH);
+            if (fi >= max) RRETURN(MATCH_NOMATCH);
             if (eptr >= md->end_subject)
               {
               SCHECK_PARTIAL();
-              MRRETURN(MATCH_NOMATCH);
+              RRETURN(MATCH_NOMATCH);
               }
             GETCHARINCTEST(c, eptr);
             if ((UCD_CATEGORY(c) == prop_value) == prop_fail_result)
-              MRRETURN(MATCH_NOMATCH);
+              RRETURN(MATCH_NOMATCH);
             }
           /* Control never gets here */

@@ -4634,15 +4771,15 @@
             {
             RMATCH(eptr, ecode, offset_top, md, eptrb, RM39);
             if (rrc != MATCH_NOMATCH) RRETURN(rrc);
-            if (fi >= max) MRRETURN(MATCH_NOMATCH);
+            if (fi >= max) RRETURN(MATCH_NOMATCH);
             if (eptr >= md->end_subject)
               {
               SCHECK_PARTIAL();
-              MRRETURN(MATCH_NOMATCH);
+              RRETURN(MATCH_NOMATCH);
               }
             GETCHARINCTEST(c, eptr);
             if ((UCD_CHARTYPE(c) == prop_value) == prop_fail_result)
-              MRRETURN(MATCH_NOMATCH);
+              RRETURN(MATCH_NOMATCH);
             }
           /* Control never gets here */

@@ -4651,15 +4788,15 @@
             {
             RMATCH(eptr, ecode, offset_top, md, eptrb, RM40);
             if (rrc != MATCH_NOMATCH) RRETURN(rrc);
-            if (fi >= max) MRRETURN(MATCH_NOMATCH);
+            if (fi >= max) RRETURN(MATCH_NOMATCH);
             if (eptr >= md->end_subject)
               {
               SCHECK_PARTIAL();
-              MRRETURN(MATCH_NOMATCH);
+              RRETURN(MATCH_NOMATCH);
               }
             GETCHARINCTEST(c, eptr);
             if ((UCD_SCRIPT(c) == prop_value) == prop_fail_result)
-              MRRETURN(MATCH_NOMATCH);
+              RRETURN(MATCH_NOMATCH);
             }
           /* Control never gets here */

@@ -4669,16 +4806,16 @@
             int category;
             RMATCH(eptr, ecode, offset_top, md, eptrb, RM59);
             if (rrc != MATCH_NOMATCH) RRETURN(rrc);
-            if (fi >= max) MRRETURN(MATCH_NOMATCH);
+            if (fi >= max) RRETURN(MATCH_NOMATCH);
             if (eptr >= md->end_subject)
               {
               SCHECK_PARTIAL();
-              MRRETURN(MATCH_NOMATCH);
+              RRETURN(MATCH_NOMATCH);
               }
             GETCHARINCTEST(c, eptr);
             category = UCD_CATEGORY(c);
             if ((category == ucp_L || category == ucp_N) == prop_fail_result)
-              MRRETURN(MATCH_NOMATCH);
+              RRETURN(MATCH_NOMATCH);
             }
           /* Control never gets here */

@@ -4687,17 +4824,17 @@
             {
             RMATCH(eptr, ecode, offset_top, md, eptrb, RM60);
             if (rrc != MATCH_NOMATCH) RRETURN(rrc);
-            if (fi >= max) MRRETURN(MATCH_NOMATCH);
+            if (fi >= max) RRETURN(MATCH_NOMATCH);
             if (eptr >= md->end_subject)
               {
               SCHECK_PARTIAL();
-              MRRETURN(MATCH_NOMATCH);
+              RRETURN(MATCH_NOMATCH);
               }
             GETCHARINCTEST(c, eptr);
             if ((UCD_CATEGORY(c) == ucp_Z || c == CHAR_HT || c == CHAR_NL ||
                  c == CHAR_FF || c == CHAR_CR)
                    == prop_fail_result)
-              MRRETURN(MATCH_NOMATCH);
+              RRETURN(MATCH_NOMATCH);
             }
           /* Control never gets here */

@@ -4706,17 +4843,17 @@
             {
             RMATCH(eptr, ecode, offset_top, md, eptrb, RM61);
             if (rrc != MATCH_NOMATCH) RRETURN(rrc);
-            if (fi >= max) MRRETURN(MATCH_NOMATCH);
+            if (fi >= max) RRETURN(MATCH_NOMATCH);
             if (eptr >= md->end_subject)
               {
               SCHECK_PARTIAL();
-              MRRETURN(MATCH_NOMATCH);
+              RRETURN(MATCH_NOMATCH);
               }
             GETCHARINCTEST(c, eptr);
             if ((UCD_CATEGORY(c) == ucp_Z || c == CHAR_HT || c == CHAR_NL ||
                  c == CHAR_VT || c == CHAR_FF || c == CHAR_CR)
                    == prop_fail_result)
-              MRRETURN(MATCH_NOMATCH);
+              RRETURN(MATCH_NOMATCH);
             }
           /* Control never gets here */

@@ -4726,11 +4863,11 @@
             int category;
             RMATCH(eptr, ecode, offset_top, md, eptrb, RM62);
             if (rrc != MATCH_NOMATCH) RRETURN(rrc);
-            if (fi >= max) MRRETURN(MATCH_NOMATCH);
+            if (fi >= max) RRETURN(MATCH_NOMATCH);
             if (eptr >= md->end_subject)
               {
               SCHECK_PARTIAL();
-              MRRETURN(MATCH_NOMATCH);
+              RRETURN(MATCH_NOMATCH);
               }
             GETCHARINCTEST(c, eptr);
             category = UCD_CATEGORY(c);
@@ -4738,7 +4875,7 @@
                  category == ucp_N ||
                  c == CHAR_UNDERSCORE)
                    == prop_fail_result)
-              MRRETURN(MATCH_NOMATCH);
+              RRETURN(MATCH_NOMATCH);
             }
           /* Control never gets here */

@@ -4758,18 +4895,18 @@
           {
           RMATCH(eptr, ecode, offset_top, md, eptrb, RM41);
           if (rrc != MATCH_NOMATCH) RRETURN(rrc);
-          if (fi >= max) MRRETURN(MATCH_NOMATCH);
+          if (fi >= max) RRETURN(MATCH_NOMATCH);
           if (eptr >= md->end_subject)
             {
             SCHECK_PARTIAL();
-            MRRETURN(MATCH_NOMATCH);
+            RRETURN(MATCH_NOMATCH);
             }
           GETCHARINCTEST(c, eptr);
-          if (UCD_CATEGORY(c) == ucp_M) MRRETURN(MATCH_NOMATCH);
+          if (UCD_CATEGORY(c) == ucp_M) RRETURN(MATCH_NOMATCH);
           while (eptr < md->end_subject)
             {
             int len = 1;
-            if (!utf8) c = *eptr; else { GETCHARLEN(c, eptr, len); }
+            if (!utf) c = *eptr; else { GETCHARLEN(c, eptr, len); }
             if (UCD_CATEGORY(c) != ucp_M) break;
             eptr += len;
             }
@@ -4778,22 +4915,21 @@
       else
 #endif     /* SUPPORT_UCP */

-#ifdef SUPPORT_UTF8
-      /* UTF-8 mode */
-      if (utf8)
+#ifdef SUPPORT_UTF
+      if (utf)
         {
         for (fi = min;; fi++)
           {
           RMATCH(eptr, ecode, offset_top, md, eptrb, RM42);
           if (rrc != MATCH_NOMATCH) RRETURN(rrc);
-          if (fi >= max) MRRETURN(MATCH_NOMATCH);
+          if (fi >= max) RRETURN(MATCH_NOMATCH);
           if (eptr >= md->end_subject)
             {
             SCHECK_PARTIAL();
-            MRRETURN(MATCH_NOMATCH);
+            RRETURN(MATCH_NOMATCH);
             }
           if (ctype == OP_ANY && IS_NEWLINE(eptr))
-            MRRETURN(MATCH_NOMATCH);
+            RRETURN(MATCH_NOMATCH);
           GETCHARINC(c, eptr);
           switch(ctype)
             {
@@ -4805,7 +4941,7 @@
             case OP_ANYNL:
             switch(c)
               {
-              default: MRRETURN(MATCH_NOMATCH);
+              default: RRETURN(MATCH_NOMATCH);
               case 0x000d:
               if (eptr < md->end_subject && *eptr == 0x0a) eptr++;
               break;
@@ -4817,7 +4953,7 @@
               case 0x0085:
               case 0x2028:
               case 0x2029:
-              if (md->bsr_anycrlf) MRRETURN(MATCH_NOMATCH);
+              if (md->bsr_anycrlf) RRETURN(MATCH_NOMATCH);
               break;
               }
             break;
@@ -4845,14 +4981,14 @@
               case 0x202f:    /* NARROW NO-BREAK SPACE */
               case 0x205f:    /* MEDIUM MATHEMATICAL SPACE */
               case 0x3000:    /* IDEOGRAPHIC SPACE */
-              MRRETURN(MATCH_NOMATCH);
+              RRETURN(MATCH_NOMATCH);
               }
             break;

             case OP_HSPACE:
             switch(c)
               {
-              default: MRRETURN(MATCH_NOMATCH);
+              default: RRETURN(MATCH_NOMATCH);
               case 0x09:      /* HT */
               case 0x20:      /* SPACE */
               case 0xa0:      /* NBSP */
@@ -4887,14 +5023,14 @@
               case 0x85:      /* NEL */
               case 0x2028:    /* LINE SEPARATOR */
               case 0x2029:    /* PARAGRAPH SEPARATOR */
-              MRRETURN(MATCH_NOMATCH);
+              RRETURN(MATCH_NOMATCH);
               }
             break;

             case OP_VSPACE:
             switch(c)
               {
-              default: MRRETURN(MATCH_NOMATCH);
+              default: RRETURN(MATCH_NOMATCH);
               case 0x0a:      /* LF */
               case 0x0b:      /* VT */
               case 0x0c:      /* FF */
@@ -4908,32 +5044,32 @@

             case OP_NOT_DIGIT:
             if (c < 256 && (md->ctypes[c] & ctype_digit) != 0)
-              MRRETURN(MATCH_NOMATCH);
+              RRETURN(MATCH_NOMATCH);
             break;

             case OP_DIGIT:
             if (c >= 256 || (md->ctypes[c] & ctype_digit) == 0)
-              MRRETURN(MATCH_NOMATCH);
+              RRETURN(MATCH_NOMATCH);
             break;

             case OP_NOT_WHITESPACE:
             if (c < 256 && (md->ctypes[c] & ctype_space) != 0)
-              MRRETURN(MATCH_NOMATCH);
+              RRETURN(MATCH_NOMATCH);
             break;

             case OP_WHITESPACE:
-            if  (c >= 256 || (md->ctypes[c] & ctype_space) == 0)
-              MRRETURN(MATCH_NOMATCH);
+            if (c >= 256 || (md->ctypes[c] & ctype_space) == 0)
+              RRETURN(MATCH_NOMATCH);
             break;

             case OP_NOT_WORDCHAR:
             if (c < 256 && (md->ctypes[c] & ctype_word) != 0)
-              MRRETURN(MATCH_NOMATCH);
+              RRETURN(MATCH_NOMATCH);
             break;

             case OP_WORDCHAR:
             if (c >= 256 || (md->ctypes[c] & ctype_word) == 0)
-              MRRETURN(MATCH_NOMATCH);
+              RRETURN(MATCH_NOMATCH);
             break;

             default:
@@ -4943,20 +5079,20 @@
         }
       else
 #endif
-      /* Not UTF-8 mode */
+      /* Not UTF mode */
         {
         for (fi = min;; fi++)
           {
           RMATCH(eptr, ecode, offset_top, md, eptrb, RM43);
           if (rrc != MATCH_NOMATCH) RRETURN(rrc);
-          if (fi >= max) MRRETURN(MATCH_NOMATCH);
+          if (fi >= max) RRETURN(MATCH_NOMATCH);
           if (eptr >= md->end_subject)
             {
             SCHECK_PARTIAL();
-            MRRETURN(MATCH_NOMATCH);
+            RRETURN(MATCH_NOMATCH);
             }
           if (ctype == OP_ANY && IS_NEWLINE(eptr))
-            MRRETURN(MATCH_NOMATCH);
+            RRETURN(MATCH_NOMATCH);
           c = *eptr++;
           switch(ctype)
             {
@@ -4968,7 +5104,7 @@
             case OP_ANYNL:
             switch(c)
               {
-              default: MRRETURN(MATCH_NOMATCH);
+              default: RRETURN(MATCH_NOMATCH);
               case 0x000d:
               if (eptr < md->end_subject && *eptr == 0x0a) eptr++;
               break;
@@ -4979,7 +5115,11 @@
               case 0x000b:
               case 0x000c:
               case 0x0085:
-              if (md->bsr_anycrlf) MRRETURN(MATCH_NOMATCH);
+#ifdef COMPILE_PCRE16
+              case 0x2028:
+              case 0x2029:
+#endif
+              if (md->bsr_anycrlf) RRETURN(MATCH_NOMATCH);
               break;
               }
             break;
@@ -4991,17 +5131,53 @@
               case 0x09:      /* HT */
               case 0x20:      /* SPACE */
               case 0xa0:      /* NBSP */
-              MRRETURN(MATCH_NOMATCH);
+#ifdef COMPILE_PCRE16
+              case 0x1680:    /* OGHAM SPACE MARK */
+              case 0x180e:    /* MONGOLIAN VOWEL SEPARATOR */
+              case 0x2000:    /* EN QUAD */
+              case 0x2001:    /* EM QUAD */
+              case 0x2002:    /* EN SPACE */
+              case 0x2003:    /* EM SPACE */
+              case 0x2004:    /* THREE-PER-EM SPACE */
+              case 0x2005:    /* FOUR-PER-EM SPACE */
+              case 0x2006:    /* SIX-PER-EM SPACE */
+              case 0x2007:    /* FIGURE SPACE */
+              case 0x2008:    /* PUNCTUATION SPACE */
+              case 0x2009:    /* THIN SPACE */
+              case 0x200A:    /* HAIR SPACE */
+              case 0x202f:    /* NARROW NO-BREAK SPACE */
+              case 0x205f:    /* MEDIUM MATHEMATICAL SPACE */
+              case 0x3000:    /* IDEOGRAPHIC SPACE */
+#endif
+              RRETURN(MATCH_NOMATCH);
               }
             break;

             case OP_HSPACE:
             switch(c)
               {
-              default: MRRETURN(MATCH_NOMATCH);
+              default: RRETURN(MATCH_NOMATCH);
               case 0x09:      /* HT */
               case 0x20:      /* SPACE */
               case 0xa0:      /* NBSP */
+#ifdef COMPILE_PCRE16
+              case 0x1680:    /* OGHAM SPACE MARK */
+              case 0x180e:    /* MONGOLIAN VOWEL SEPARATOR */
+              case 0x2000:    /* EN QUAD */
+              case 0x2001:    /* EM QUAD */
+              case 0x2002:    /* EN SPACE */
+              case 0x2003:    /* EM SPACE */
+              case 0x2004:    /* THREE-PER-EM SPACE */
+              case 0x2005:    /* FOUR-PER-EM SPACE */
+              case 0x2006:    /* SIX-PER-EM SPACE */
+              case 0x2007:    /* FIGURE SPACE */
+              case 0x2008:    /* PUNCTUATION SPACE */
+              case 0x2009:    /* THIN SPACE */
+              case 0x200A:    /* HAIR SPACE */
+              case 0x202f:    /* NARROW NO-BREAK SPACE */
+              case 0x205f:    /* MEDIUM MATHEMATICAL SPACE */
+              case 0x3000:    /* IDEOGRAPHIC SPACE */
+#endif
               break;
               }
             break;
@@ -5015,45 +5191,53 @@
               case 0x0c:      /* FF */
               case 0x0d:      /* CR */
               case 0x85:      /* NEL */
-              MRRETURN(MATCH_NOMATCH);
+#ifdef COMPILE_PCRE16
+              case 0x2028:    /* LINE SEPARATOR */
+              case 0x2029:    /* PARAGRAPH SEPARATOR */
+#endif
+              RRETURN(MATCH_NOMATCH);
               }
             break;

             case OP_VSPACE:
             switch(c)
               {
-              default: MRRETURN(MATCH_NOMATCH);
+              default: RRETURN(MATCH_NOMATCH);
               case 0x0a:      /* LF */
               case 0x0b:      /* VT */
               case 0x0c:      /* FF */
               case 0x0d:      /* CR */
               case 0x85:      /* NEL */
+#ifdef COMPILE_PCRE16
+              case 0x2028:    /* LINE SEPARATOR */
+              case 0x2029:    /* PARAGRAPH SEPARATOR */
+#endif
               break;
               }
             break;

             case OP_NOT_DIGIT:
-            if ((md->ctypes[c] & ctype_digit) != 0) MRRETURN(MATCH_NOMATCH);
+            if (MAX_255(c) && (md->ctypes[c] & ctype_digit) != 0) RRETURN(MATCH_NOMATCH);
             break;

             case OP_DIGIT:
-            if ((md->ctypes[c] & ctype_digit) == 0) MRRETURN(MATCH_NOMATCH);
+            if (!MAX_255(c) || (md->ctypes[c] & ctype_digit) == 0) RRETURN(MATCH_NOMATCH);
             break;

             case OP_NOT_WHITESPACE:
-            if ((md->ctypes[c] & ctype_space) != 0) MRRETURN(MATCH_NOMATCH);
+            if (MAX_255(c) && (md->ctypes[c] & ctype_space) != 0) RRETURN(MATCH_NOMATCH);
             break;

             case OP_WHITESPACE:
-            if  ((md->ctypes[c] & ctype_space) == 0) MRRETURN(MATCH_NOMATCH);
+            if (!MAX_255(c) || (md->ctypes[c] & ctype_space) == 0) RRETURN(MATCH_NOMATCH);
             break;

             case OP_NOT_WORDCHAR:
-            if ((md->ctypes[c] & ctype_word) != 0) MRRETURN(MATCH_NOMATCH);
+            if (MAX_255(c) && (md->ctypes[c] & ctype_word) != 0) RRETURN(MATCH_NOMATCH);
             break;

             case OP_WORDCHAR:
-            if ((md->ctypes[c] & ctype_word) == 0) MRRETURN(MATCH_NOMATCH);
+            if (!MAX_255(c) || (md->ctypes[c] & ctype_word) == 0) RRETURN(MATCH_NOMATCH);
             break;

             default:
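
Note on the hunk above: in the non-UTF branch, every lookup in md->ctypes is now guarded by MAX_255(c). The likely reason is that a 16-bit code unit can exceed the range of the 256-entry character-type table. A minimal sketch of that guard follows. The names unit16, IN_TABLE_RANGE, CTYPE_DIGIT, is_digit_unit and fill_digit_table are hypothetical stand-ins for illustration and are not PCRE's own definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; not PCRE's definitions. */
typedef unsigned short unit16;            /* one 16-bit code unit */
#define IN_TABLE_RANGE(c) ((c) <= 255u)   /* plays the role of MAX_255 */
#define CTYPE_DIGIT 0x04                  /* illustrative flag bit */

/* Returns 1 if c is flagged as a digit in a 256-entry table, else 0.
   The range test runs first, so the table is never indexed past 255. */
static int is_digit_unit(const unsigned char *ctypes, unit16 c)
{
  assert(ctypes != NULL);
  return IN_TABLE_RANGE(c) && (ctypes[c] & CTYPE_DIGIT) != 0;
}

/* Fills a table in which only ASCII '0'..'9' carry the digit flag. */
static void fill_digit_table(unsigned char *ctypes)
{
  int i;
  for (i = 0; i < 256; i++)
    ctypes[i] = (i >= '0' && i <= '9') ? CTYPE_DIGIT : 0;
}
```

Without the range test, a unit such as 0x0135 would index past the end of the table. With it, any unit above 255 is simply treated as not a digit.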
@@ -5242,7 +5426,7 @@
           RMATCH(eptr, ecode, offset_top, md, eptrb, RM44);
           if (rrc != MATCH_NOMATCH) RRETURN(rrc);
           if (eptr-- == pp) break;        /* Stop if tried at original pos */
-          if (utf8) BACKCHAR(eptr);
+          if (utf) BACKCHAR(eptr);
           }
         }

@@ -5259,13 +5443,13 @@
             SCHECK_PARTIAL();
             break;
             }
-          if (!utf8) c = *eptr; else { GETCHARLEN(c, eptr, len); }
+          if (!utf) c = *eptr; else { GETCHARLEN(c, eptr, len); }
           if (UCD_CATEGORY(c) == ucp_M) break;
           eptr += len;
           while (eptr < md->end_subject)
             {
             len = 1;
-            if (!utf8) c = *eptr; else { GETCHARLEN(c, eptr, len); }
+            if (!utf) c = *eptr; else { GETCHARLEN(c, eptr, len); }
             if (UCD_CATEGORY(c) != ucp_M) break;
             eptr += len;
             }
@@ -5282,7 +5466,7 @@
           if (eptr-- == pp) break;        /* Stop if tried at original pos */
           for (;;)                        /* Move back over one extended */
             {
-            if (!utf8) c = *eptr; else
+            if (!utf) c = *eptr; else
               {
               BACKCHAR(eptr);
               GETCHAR(c, eptr);
@@ -5296,10 +5480,8 @@
       else
 #endif   /* SUPPORT_UCP */

-#ifdef SUPPORT_UTF8
-      /* UTF-8 mode */
-
-      if (utf8)
+#ifdef SUPPORT_UTF
+      if (utf)
         {
         switch(ctype)
           {
@@ -5315,7 +5497,7 @@
                 }
               if (IS_NEWLINE(eptr)) break;
               eptr++;
-              while (eptr < md->end_subject && (*eptr & 0xc0) == 0x80) eptr++;
+              ACROSSCHAR(eptr < md->end_subject, *eptr, eptr++);
               }
             }

@@ -5332,7 +5514,7 @@
                 }
               if (IS_NEWLINE(eptr)) break;
               eptr++;
-              while (eptr < md->end_subject && (*eptr & 0xc0) == 0x80) eptr++;
+              ACROSSCHAR(eptr < md->end_subject, *eptr, eptr++);
               }
             }
           break;
@@ -5348,7 +5530,7 @@
                 break;
                 }
               eptr++;
-              while (eptr < md->end_subject && (*eptr & 0xc0) == 0x80) eptr++;
+              ACROSSCHAR(eptr < md->end_subject, *eptr, eptr++);
               }
             }
           else
@@ -5581,9 +5763,8 @@
           }
         }
       else
-#endif  /* SUPPORT_UTF8 */
-
-      /* Not UTF-8 mode */
+#endif  /* SUPPORT_UTF */
+      /* Not UTF mode */
         {
         switch(ctype)
           {
@@ -5627,10 +5808,12 @@
               }
             else
               {
-              if (c != 0x000a &&
-                  (md->bsr_anycrlf ||
-                    (c != 0x000b && c != 0x000c && c != 0x0085)))
-                break;
+              if (c != 0x000a && (md->bsr_anycrlf ||
+                (c != 0x000b && c != 0x000c && c != 0x0085
+#ifdef COMPILE_PCRE16
+                && c != 0x2028 && c != 0x2029
+#endif
+                ))) break;
               eptr++;
               }
             }
@@ -5645,7 +5828,12 @@
               break;
               }
             c = *eptr;
-            if (c == 0x09 || c == 0x20 || c == 0xa0) break;
+            if (c == 0x09 || c == 0x20 || c == 0xa0
+#ifdef COMPILE_PCRE16
+              || c == 0x1680 || c == 0x180e || (c >= 0x2000 && c <= 0x200A)
+              || c == 0x202f || c == 0x205f || c == 0x3000
+#endif
+              ) break;
             eptr++;
             }
           break;
@@ -5659,7 +5847,12 @@
               break;
               }
             c = *eptr;
-            if (c != 0x09 && c != 0x20 && c != 0xa0) break;
+            if (c != 0x09 && c != 0x20 && c != 0xa0
+#ifdef COMPILE_PCRE16
+              && c != 0x1680 && c != 0x180e && (c < 0x2000 || c > 0x200A)
+              && c != 0x202f && c != 0x205f && c != 0x3000
+#endif
+              ) break;
             eptr++;
             }
           break;
@@ -5673,8 +5866,11 @@
               break;
               }
             c = *eptr;
-            if (c == 0x0a || c == 0x0b || c == 0x0c || c == 0x0d || c == 0x85)
-              break;
+            if (c == 0x0a || c == 0x0b || c == 0x0c || c == 0x0d || c == 0x85
+#ifdef COMPILE_PCRE16
+              || c == 0x2028 || c == 0x2029
+#endif
+              ) break;
             eptr++;
             }
           break;
@@ -5688,8 +5884,11 @@
               break;
               }
             c = *eptr;
-            if (c != 0x0a && c != 0x0b && c != 0x0c && c != 0x0d && c != 0x85)
-              break;
+            if (c != 0x0a && c != 0x0b && c != 0x0c && c != 0x0d && c != 0x85
+#ifdef COMPILE_PCRE16
+              && c != 0x2028 && c != 0x2029
+#endif
+              ) break;
             eptr++;
             }
           break;
@@ -5702,7 +5901,7 @@
               SCHECK_PARTIAL();
               break;
               }
-            if ((md->ctypes[*eptr] & ctype_digit) != 0) break;
+            if (MAX_255(*eptr) && (md->ctypes[*eptr] & ctype_digit) != 0) break;
             eptr++;
             }
           break;
@@ -5715,7 +5914,7 @@
               SCHECK_PARTIAL();
               break;
               }
-            if ((md->ctypes[*eptr] & ctype_digit) == 0) break;
+            if (!MAX_255(*eptr) || (md->ctypes[*eptr] & ctype_digit) == 0) break;
             eptr++;
             }
           break;
@@ -5728,7 +5927,7 @@
               SCHECK_PARTIAL();
               break;
               }
-            if ((md->ctypes[*eptr] & ctype_space) != 0) break;
+            if (MAX_255(*eptr) && (md->ctypes[*eptr] & ctype_space) != 0) break;
             eptr++;
             }
           break;
@@ -5741,7 +5940,7 @@
               SCHECK_PARTIAL();
               break;
               }
-            if ((md->ctypes[*eptr] & ctype_space) == 0) break;
+            if (!MAX_255(*eptr) || (md->ctypes[*eptr] & ctype_space) == 0) break;
             eptr++;
             }
           break;
@@ -5754,7 +5953,7 @@
               SCHECK_PARTIAL();
               break;
               }
-            if ((md->ctypes[*eptr] & ctype_word) != 0) break;
+            if (MAX_255(*eptr) && (md->ctypes[*eptr] & ctype_word) != 0) break;
             eptr++;
             }
           break;
@@ -5767,7 +5966,7 @@
               SCHECK_PARTIAL();
               break;
               }
-            if ((md->ctypes[*eptr] & ctype_word) == 0) break;
+            if (!MAX_255(*eptr) || (md->ctypes[*eptr] & ctype_word) == 0) break;
             eptr++;
             }
           break;
@@ -5795,7 +5994,7 @@

       /* Get here if we can't make it match with any permitted repetitions */

-      MRRETURN(MATCH_NOMATCH);
+      RRETURN(MATCH_NOMATCH);
       }
     /* Control never gets here */

@@ -5830,16 +6029,23 @@
   LBL(35) LBL(43) LBL(47) LBL(48) LBL(49) LBL(50) LBL(51) LBL(52)
   LBL(53) LBL(54) LBL(55) LBL(56) LBL(57) LBL(58) LBL(63) LBL(64)
   LBL(65) LBL(66)
-#ifdef SUPPORT_UTF8
-  LBL(16) LBL(18) LBL(20) LBL(21) LBL(22) LBL(23) LBL(28) LBL(30)
+#if defined SUPPORT_UTF || !defined COMPILE_PCRE8
+  LBL(21) 
+#endif
+#ifdef SUPPORT_UTF   
+  LBL(16) LBL(18) LBL(20) 
+  LBL(22) LBL(23) LBL(28) LBL(30)
   LBL(32) LBL(34) LBL(42) LBL(46)
 #ifdef SUPPORT_UCP
   LBL(36) LBL(37) LBL(38) LBL(39) LBL(40) LBL(41) LBL(44) LBL(45)
   LBL(59) LBL(60) LBL(61) LBL(62)
 #endif  /* SUPPORT_UCP */
-#endif  /* SUPPORT_UTF8 */
+#endif  /* SUPPORT_UTF */
   default:
   DPRINTF(("jump error in pcre match: label %d non-existent\n", frame->Xwhere));
+  
+printf("+++jump error in pcre match: label %d non-existent\n", frame->Xwhere);
+
   return PCRE_ERROR_INTERNAL;
   }
 #undef LBL
@@ -5926,36 +6132,41 @@
                  < -1 => some kind of unexpected problem
 */

+#ifdef COMPILE_PCRE8
 PCRE_EXP_DEFN int PCRE_CALL_CONVENTION
 pcre_exec(const pcre *argument_re, const pcre_extra *extra_data,
   PCRE_SPTR subject, int length, int start_offset, int options, int *offsets,
   int offsetcount)
+#else
+PCRE_EXP_DEFN int PCRE_CALL_CONVENTION
+pcre16_exec(const pcre *argument_re, const pcre_extra *extra_data,
+  PCRE_SPTR16 subject, int length, int start_offset, int options, int *offsets,
+  int offsetcount)
+#endif
 {
 int rc, ocount, arg_offset_max;
-int first_byte = -1;
-int req_byte = -1;
-int req_byte2 = -1;
 int newline;
 BOOL using_temporary_offsets = FALSE;
 BOOL anchored;
 BOOL startline;
 BOOL firstline;
-BOOL first_byte_caseless = FALSE;
-BOOL req_byte_caseless = FALSE;
-BOOL utf8;
+BOOL utf;
+BOOL has_first_char = FALSE;
+BOOL has_req_char = FALSE;
+pcre_uchar first_char = 0;
+pcre_uchar first_char2 = 0;
+pcre_uchar req_char = 0;
+pcre_uchar req_char2 = 0;
 match_data match_block;
 match_data *md = &match_block;
-const uschar *tables;
-const uschar *start_bits = NULL;
-USPTR start_match = (USPTR)subject + start_offset;
-USPTR end_subject;
-USPTR start_partial = NULL;
-USPTR req_byte_ptr = start_match - 1;
+const pcre_uint8 *tables;
+const pcre_uint8 *start_bits = NULL;
+PCRE_PUCHAR start_match = (PCRE_PUCHAR)subject + start_offset;
+PCRE_PUCHAR end_subject;
+PCRE_PUCHAR start_partial = NULL;
+PCRE_PUCHAR req_char_ptr = start_match - 1;

-pcre_study_data internal_study;
 const pcre_study_data *study;
-
-real_pcre internal_re;
 const real_pcre *external_re = (const real_pcre *)argument_re;
 const real_pcre *re = external_re;

@@ -5972,18 +6183,19 @@
 during "normal" pcre_exec() processing, not when the JIT support is in use,
 so they are set up later. */

-utf8 = md->utf8 = (re->options & PCRE_UTF8) != 0;
+/* PCRE_UTF16 has the same value as PCRE_UTF8. */
+utf = md->utf = (re->options & PCRE_UTF8) != 0;
 md->partial = ((options & PCRE_PARTIAL_HARD) != 0)? 2 :
               ((options & PCRE_PARTIAL_SOFT) != 0)? 1 : 0;

 /* Check a UTF-8 string if required. Pass back the character offset and error
 code for an invalid string if a results vector is available. */

-#ifdef SUPPORT_UTF8
-if (utf8 && (options & PCRE_NO_UTF8_CHECK) == 0)
+#ifdef SUPPORT_UTF
+if (utf && (options & PCRE_NO_UTF8_CHECK) == 0)
   {
   int erroroffset;
-  int errorcode = _pcre_valid_utf8((USPTR)subject, length, &erroroffset);
+  int errorcode = PRIV(valid_utf)((PCRE_PUCHAR)subject, length, &erroroffset);
   if (errorcode != 0)
     {
     if (offsetcount >= 2)
@@ -5991,13 +6203,18 @@
       offsets[0] = erroroffset;
       offsets[1] = errorcode;
       }
+#ifdef COMPILE_PCRE16
+    return (errorcode <= PCRE_UTF16_ERR1 && md->partial > 1)?
+      PCRE_ERROR_SHORTUTF16 : PCRE_ERROR_BADUTF16;
+#else
     return (errorcode <= PCRE_UTF8_ERR5 && md->partial > 1)?
       PCRE_ERROR_SHORTUTF8 : PCRE_ERROR_BADUTF8;
+#endif       
     }

-  /* Check that a start_offset points to the start of a UTF-8 character. */
+  /* Check that a start_offset points to the start of a UTF character. */
   if (start_offset > 0 && start_offset < length &&
-      (((USPTR)subject)[start_offset] & 0xc0) == 0x80)
+      NOT_FIRSTCHAR(((PCRE_PUCHAR)subject)[start_offset]))
     return PCRE_ERROR_BADUTF8_OFFSET;
   }
 #endif
@@ -6015,15 +6232,16 @@
     && (extra_data->flags & PCRE_EXTRA_TABLES) == 0
     && (options & ~(PCRE_NO_UTF8_CHECK | PCRE_NOTBOL | PCRE_NOTEOL |
                     PCRE_NOTEMPTY | PCRE_NOTEMPTY_ATSTART)) == 0)
-  return _pcre_jit_exec(re, extra_data->executable_jit, subject, length,
-    start_offset, options, ((extra_data->flags & PCRE_EXTRA_MATCH_LIMIT) == 0)
+  return PRIV(jit_exec)(re, extra_data->executable_jit,
+    (const pcre_uchar *)subject, length, start_offset, options,
+    ((extra_data->flags & PCRE_EXTRA_MATCH_LIMIT) == 0)
     ? MATCH_LIMIT : extra_data->match_limit, offsets, offsetcount);
 #endif

 /* Carry on with non-JIT matching. This information is for finding all the
 numbers associated with a given name, for condition testing. */

-md->name_table = (uschar *)re + re->name_table_offset;
+md->name_table = (pcre_uchar *)re + re->name_table_offset;
 md->name_count = re->name_count;
 md->name_entry_size = re->name_entry_size;

@@ -6057,19 +6275,17 @@
 is a feature that makes it possible to save compiled regex and re-use them
 in other programs later. */

-if (tables == NULL) tables = _pcre_default_tables;
+if (tables == NULL) tables = PRIV(default_tables);

 /* Check that the first field in the block is the magic number. If it is not,
-test for a regex that was compiled on a host of opposite endianness. If this is
-the case, flipped values are put in internal_re and internal_study if there was
-study data too. */
+return with PCRE_ERROR_BADMAGIC. However, if the magic number is equal to
+REVERSED_MAGIC_NUMBER we return with PCRE_ERROR_BADENDIANNESS, which
+means that the pattern is likely compiled with different endianness. */

 if (re->magic_number != MAGIC_NUMBER)
-  {
-  re = _pcre_try_flipped(re, &internal_re, study, &internal_study);
-  if (re == NULL) return PCRE_ERROR_BADMAGIC;
-  if (study != NULL) study = &internal_study;
-  }
+  return re->magic_number == REVERSED_MAGIC_NUMBER?
+    PCRE_ERROR_BADENDIANNESS:PCRE_ERROR_BADMAGIC;
+if ((re->flags & PCRE_MODE) == 0) return PCRE_ERROR_BADMODE;

/* Set up other data */
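
Note on the hunk above: the old code tried to byte-flip a pattern compiled on a host of opposite endianness. The new code only detects that case and returns a distinct error. A minimal sketch of that classification follows, assuming a 32-bit magic value. DEMO_MAGIC, the BLOCK_* codes, swap32 and classify_magic are hypothetical names used for illustration and are not PCRE identifiers.

```c
#include <assert.h>

/* Illustrative constant and result codes; hypothetical, not PCRE's. */
#define DEMO_MAGIC 0x50435245UL
enum { BLOCK_OK = 0, BLOCK_BAD_MAGIC = -1, BLOCK_BAD_ENDIANNESS = -2 };

/* Reverses the byte order of the low 32 bits of v. */
static unsigned long swap32(unsigned long v)
{
  v &= 0xffffffffUL;
  return ((v & 0x000000ffUL) << 24) | ((v & 0x0000ff00UL) << 8) |
         ((v & 0x00ff0000UL) >> 8)  | ((v & 0xff000000UL) >> 24);
}

/* Classifies a magic field: correct, byte-reversed, or unrecognized. */
static int classify_magic(unsigned long magic)
{
  assert(swap32(swap32(DEMO_MAGIC)) == DEMO_MAGIC);  /* swap is its own inverse */
  if (magic == DEMO_MAGIC) return BLOCK_OK;
  return (magic == swap32(DEMO_MAGIC)) ? BLOCK_BAD_ENDIANNESS : BLOCK_BAD_MAGIC;
}
```

Reporting the reversed case separately lets a caller tell a pattern saved on a different architecture apart from a block that is not a compiled pattern at all.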

@@ -6079,10 +6295,10 @@

 /* The code starts after the real_pcre block and the capture name table. */

-md->start_code = (const uschar *)external_re + re->name_table_offset +
+md->start_code = (const pcre_uchar *)external_re + re->name_table_offset +
   re->name_count * re->name_entry_size;

-md->start_subject = (USPTR)subject;
+md->start_subject = (PCRE_PUCHAR)subject;
 md->start_offset = start_offset;
 md->end_subject = md->start_subject + length;
 end_subject = md->end_subject;
@@ -6090,6 +6306,7 @@
 md->endonly = (re->options & PCRE_DOLLAR_ENDONLY) != 0;
 md->use_ucp = (re->options & PCRE_UCP) != 0;
 md->jscript_compat = (re->options & PCRE_JAVASCRIPT_COMPAT) != 0;
+md->ignore_skip_arg = FALSE;

 /* Some options are unpacked into BOOL variables in the hope that testing
 them will be faster than individual option bits. */
@@ -6100,12 +6317,13 @@
 md->notempty_atstart = (options & PCRE_NOTEMPTY_ATSTART) != 0;

 md->hitend = FALSE;
-md->mark = NULL;                        /* In case never set */
+md->mark = md->nomatch_mark = NULL;     /* In case never set */

 md->recursive = NULL;                   /* No recursion at top level */
 md->hasthen = (re->flags & PCRE_HASTHEN) != 0;

 md->lcc = tables + lcc_offset;
+md->fcc = tables + fcc_offset;
 md->ctypes = tables + ctypes_offset;

 /* Handle different \R options. */
@@ -6192,7 +6410,7 @@
 if (re->top_backref > 0 && re->top_backref >= ocount/3)
   {
   ocount = re->top_backref * 3 + 3;
-  md->offset_vector = (int *)(pcre_malloc)(ocount * sizeof(int));
+  md->offset_vector = (int *)(PUBL(malloc))(ocount * sizeof(int));
   if (md->offset_vector == NULL) return PCRE_ERROR_NOMEMORY;
   using_temporary_offsets = TRUE;
   DPRINTF(("Got memory to hold back references\n"));
@@ -6219,7 +6437,7 @@
   md->offset_vector[0] = md->offset_vector[1] = -1;
   }

-/* Set up the first character to match, if available. The first_byte value is
+/* Set up the first character to match, if available. The first_char value is
 never set for an anchored regular expression, but the anchoring may be forced
 at run time, so we have to test for anchoring. The first char may be unset for
 an unanchored pattern, of course. If there's no first char and the pattern was
@@ -6229,9 +6447,16 @@
   {
   if ((re->flags & PCRE_FIRSTSET) != 0)
     {
-    first_byte = re->first_byte & 255;
-    if ((first_byte_caseless = ((re->first_byte & REQ_CASELESS) != 0)) == TRUE)
-      first_byte = md->lcc[first_byte];
+    has_first_char = TRUE;
+    first_char = first_char2 = re->first_char;
+    if ((re->flags & PCRE_FCH_CASELESS) != 0)
+      {
+      first_char2 = TABLE_GET(first_char, md->fcc, first_char);
+#if defined SUPPORT_UCP && !(defined COMPILE_PCRE8)
+      if (utf && first_char > 127)
+        first_char2 = UCD_OTHERCASE(first_char);
+#endif
+      }
     }
   else
     if (!startline && study != NULL &&
@@ -6244,14 +6469,19 @@

 if ((re->flags & PCRE_REQCHSET) != 0)
   {
-  req_byte = re->req_byte & 255;
-  req_byte_caseless = (re->req_byte & REQ_CASELESS) != 0;
-  req_byte2 = (tables + fcc_offset)[req_byte];  /* case flipped */
+  has_req_char = TRUE;
+  req_char = req_char2 = re->req_char;
+  if ((re->flags & PCRE_RCH_CASELESS) != 0)
+    {
+    req_char2 = TABLE_GET(req_char, md->fcc, req_char);
+#if defined SUPPORT_UCP && !(defined COMPILE_PCRE8)
+    if (utf && req_char > 127)
+      req_char2 = UCD_OTHERCASE(req_char);
+#endif
+    }
   }

-
-
 /* ==========================================================================*/

 /* Loop for handling unanchored repeated matching attempts; for anchored regexs
@@ -6259,8 +6489,8 @@

 for(;;)
   {
-  USPTR save_end_subject = end_subject;
-  USPTR new_start_match;
+  PCRE_PUCHAR save_end_subject = end_subject;
+  PCRE_PUCHAR new_start_match;

   /* If firstline is TRUE, the start of the match is constrained to the first
   line of a multiline string. That is, the match must be before or at the first
@@ -6270,14 +6500,14 @@

   if (firstline)
     {
-    USPTR t = start_match;
-#ifdef SUPPORT_UTF8
-    if (utf8)
+    PCRE_PUCHAR t = start_match;
+#ifdef SUPPORT_UTF
+    if (utf)
       {
       while (t < md->end_subject && !IS_NEWLINE(t))
         {
         t++;
-        while (t < end_subject && (*t & 0xc0) == 0x80) t++;
+        ACROSSCHAR(t < end_subject, *t, t++);
         }
       }
     else
@@ -6294,15 +6524,16 @@

   if (((options | re->options) & PCRE_NO_START_OPTIMIZE) == 0)
     {
-    /* Advance to a unique first byte if there is one. */
+    /* Advance to a unique first char if there is one. */

-    if (first_byte >= 0)
+    if (has_first_char)
       {
-      if (first_byte_caseless)
-        while (start_match < end_subject && md->lcc[*start_match] != first_byte)
+      if (first_char != first_char2)
+        while (start_match < end_subject &&
+            *start_match != first_char && *start_match != first_char2)
           start_match++;
       else
-        while (start_match < end_subject && *start_match != first_byte)
+        while (start_match < end_subject && *start_match != first_char)
           start_match++;
       }
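
Note on the hunk above: the caseless first-character scan no longer folds each subject unit through md->lcc. It compares each unit against two precomputed values, first_char and first_char2. This also works for units above 255, which a 256-entry fold table cannot cover. A minimal sketch of that scan follows. The names unit16 and skip_to_first are hypothetical and used for illustration only.

```c
#include <assert.h>
#include <stddef.h>

typedef unsigned short unit16;  /* hypothetical 16-bit code unit type */

/* Advances from p to the first unit equal to c1 or c2, or to end if
   neither occurs. Passing c1 == c2 gives a case-sensitive search. */
static const unit16 *skip_to_first(const unit16 *p, const unit16 *end,
                                   unit16 c1, unit16 c2)
{
  assert(p != NULL && end != NULL && p <= end);
  if (c1 != c2)
    while (p < end && *p != c1 && *p != c2) p++;
  else
    while (p < end && *p != c1) p++;
  return p;
}
```

The caseless flag is no longer stored separately. As in the patch, the comparison c1 != c2 is what selects the two-way scan.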

@@ -6312,14 +6543,14 @@
       {
       if (start_match > md->start_subject + start_offset)
         {
-#ifdef SUPPORT_UTF8
-        if (utf8)
+#ifdef SUPPORT_UTF
+        if (utf)
           {
           while (start_match < end_subject && !WAS_NEWLINE(start_match))
             {
             start_match++;
-            while(start_match < end_subject && (*start_match & 0xc0) == 0x80)
-              start_match++;
+            ACROSSCHAR(start_match < end_subject, *start_match,
+              start_match++);
             }
           }
         else
@@ -6346,13 +6577,18 @@
       while (start_match < end_subject)
         {
         register unsigned int c = *start_match;
+#ifndef COMPILE_PCRE8
+        if (c > 255) c = 255;
+#endif
         if ((start_bits[c/8] & (1 << (c&7))) == 0)
           {
           start_match++;
-#ifdef SUPPORT_UTF8
-          if (utf8)
-            while(start_match < end_subject && (*start_match & 0xc0) == 0x80)
-              start_match++;
+#if defined SUPPORT_UTF && defined COMPILE_PCRE8
+          /* In non 8-bit mode, the iteration will stop for
+          characters > 255 at the beginning or not stop at all. */
+          if (utf)
+            ACROSSCHAR(start_match < end_subject, *start_match,
+              start_match++);
 #endif
           }
         else break;
@@ -6381,8 +6617,8 @@
       break;
       }

-    /* If req_byte is set, we know that that character must appear in the
-    subject for the match to succeed. If the first character is set, req_byte
+    /* If req_char is set, we know that that character must appear in the
+    subject for the match to succeed. If the first character is set, req_char
     must be later in the subject; otherwise the test starts at the match point.
     This optimization can save a huge amount of backtracking in patterns with
     nested unlimited repeats that aren't going to match. Writing separate code
@@ -6395,28 +6631,28 @@
     32-megabyte string... so we don't do this when the string is sufficiently
     long. */

-    if (req_byte >= 0 && end_subject - start_match < REQ_BYTE_MAX)
+    if (has_req_char && end_subject - start_match < REQ_BYTE_MAX)
       {
-      register USPTR p = start_match + ((first_byte >= 0)? 1 : 0);
+      register PCRE_PUCHAR p = start_match + (has_first_char? 1:0);

       /* We don't need to repeat the search if we haven't yet reached the
       place we found it at last time. */

-      if (p > req_byte_ptr)
+      if (p > req_char_ptr)
         {
-        if (req_byte_caseless)
+        if (req_char != req_char2)
           {
           while (p < end_subject)
             {
             register int pp = *p++;
-            if (pp == req_byte || pp == req_byte2) { p--; break; }
+            if (pp == req_char || pp == req_char2) { p--; break; }
             }
           }
         else
           {
           while (p < end_subject)
             {
-            if (*p++ == req_byte) { p--; break; }
+            if (*p++ == req_char) { p--; break; }
             }
           }

@@ -6433,7 +6669,7 @@
         found it, so that we don't search again next time round the loop if
         the start hasn't passed this character yet. */

-        req_byte_ptr = p;
+        req_char_ptr = p;
         }
       }
     }
@@ -6452,11 +6688,23 @@
   md->match_call_count = 0;
   md->match_function_type = 0;
   md->end_offset_top = 0;
-  rc = match(start_match, md->start_code, start_match, NULL, 2, md, NULL, 0);
+  rc = match(start_match, md->start_code, start_match, 2, md, NULL, 0);
   if (md->hitend && start_partial == NULL) start_partial = md->start_used_ptr;

   switch(rc)
     {
+    /* If MATCH_SKIP_ARG reaches this level it means that a MARK that matched
+    the SKIP's arg was not found. In this circumstance, Perl ignores the SKIP
+    entirely. The only way we can do that is to re-do the match at the same
+    point, with a flag to force SKIP with an argument to be ignored. Just
+    treating this case as NOMATCH does not work because it does not check other
+    alternatives in patterns such as A(*SKIP:A)B|AC when the subject is AC. */
+
+    case MATCH_SKIP_ARG:
+    new_start_match = start_match;
+    md->ignore_skip_arg = TRUE;
+    break;
+
     /* SKIP passes back the next starting point explicitly, but if it is the
     same as the match we have just done, treat it as NOMATCH. */

@@ -6468,23 +6716,18 @@
       }
     /* Fall through */

-    /* If MATCH_SKIP_ARG reaches this level it means that a MARK that matched
-    the SKIP's arg was not found. We also treat this as NOMATCH. */
-
-    case MATCH_SKIP_ARG:
-    /* Fall through */
-
     /* NOMATCH and PRUNE advance by one character. THEN at this level acts
-    exactly like PRUNE. */
+    exactly like PRUNE. Unset the ignore SKIP-with-argument flag. */

     case MATCH_NOMATCH:
     case MATCH_PRUNE:
     case MATCH_THEN:
+    md->ignore_skip_arg = FALSE;
     new_start_match = start_match + 1;
-#ifdef SUPPORT_UTF8
-    if (utf8)
-      while(new_start_match < end_subject && (*new_start_match & 0xc0) == 0x80)
-        new_start_match++;
+#ifdef SUPPORT_UTF
+    if (utf)
+      ACROSSCHAR(new_start_match < end_subject, *new_start_match,
+        new_start_match++);
 #endif
     break;

@@ -6522,9 +6765,13 @@

/* If we have just passed a CR and we are now at a LF, and the pattern does
not contain any explicit matches for \r or \n, and the newline option is CRLF
- or ANY or ANYCRLF, advance the match position by one more character. */
+ or ANY or ANYCRLF, advance the match position by one more character. In
+ normal matching start_match will always be greater than the first position at
+ this stage, but a failed *SKIP can cause a return at the same point, which is
+ why the first test exists. */

-  if (start_match[-1] == CHAR_CR &&
+  if (start_match > (PCRE_PUCHAR)subject + start_offset &&
+      start_match[-1] == CHAR_CR &&
       start_match < end_subject &&
       *start_match == CHAR_NL &&
       (re->flags & PCRE_HASCRORLF) == 0 &&
@@ -6570,7 +6817,7 @@
       }
     if (md->end_offset_top > arg_offset_max) md->offset_overflow = TRUE;
     DPRINTF(("Freeing temporary memory\n"));
-    (pcre_free)(md->offset_vector);
+    (PUBL(free))(md->offset_vector);
     }

   /* Set the return code to the number of captured strings, or 0 if there were
@@ -6608,8 +6855,12 @@
     offsets[1] = (int)(md->end_match_ptr - md->start_subject);
     }

+  /* Return MARK data if requested */
+
+  if (extra_data != NULL && (extra_data->flags & PCRE_EXTRA_MARK) != 0)
+    *(extra_data->mark) = (unsigned char *)(md->mark);
   DPRINTF((">>>> returning %d\n", rc));
-  goto RETURN_MARK;
+  return rc;
   }

/* Control gets here if there has been an error, or if the overall match
@@ -6618,7 +6869,7 @@
if (using_temporary_offsets)
{
DPRINTF(("Freeing temporary memory\n"));
- (pcre_free)(md->offset_vector);
+ (PUBL(free))(md->offset_vector);
}

 /* For anything other than nomatch or partial match, just return the code. */
@@ -6637,8 +6888,8 @@
   md->mark = NULL;
   if (offsetcount > 1)
     {
-    offsets[0] = (int)(start_partial - (USPTR)subject);
-    offsets[1] = (int)(end_subject - (USPTR)subject);
+    offsets[0] = (int)(start_partial - (PCRE_PUCHAR)subject);
+    offsets[1] = (int)(end_subject - (PCRE_PUCHAR)subject);
     }
   rc = PCRE_ERROR_PARTIAL;
   }
@@ -6653,10 +6904,8 @@

/* Return the MARK data if it has been requested. */

-RETURN_MARK:
-
if (extra_data != NULL && (extra_data->flags & PCRE_EXTRA_MARK) != 0)
- *(extra_data->mark) = (unsigned char *)(md->mark);
+ *(extra_data->mark) = (unsigned char *)(md->nomatch_mark);
return rc;
}
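
The required-character test in the pcre_exec.c hunks above can be read as a cached forward scan: look for req_char (or its other-case form req_char2) ahead of the current start point, and remember where it was found so that later start points do not rescan the same stretch. The following is only a minimal sketch of that idea under stated assumptions. find_req_char and count_req_scans are hypothetical helpers, not PCRE functions. The sketch assumes a first character is always set (so the scan begins one position after the start point) and it leaves out the REQ_BYTE_MAX length cut-off.

```c
#include <stddef.h>

/* Hypothetical helper, not part of PCRE: scan [p, end) for either form of the
   required character. Returns a pointer to the first hit, or end if absent. */
static const unsigned char *
find_req_char(const unsigned char *p, const unsigned char *end,
              unsigned int req_char, unsigned int req_char2)
{
  while (p < end)
    {
    unsigned int c = *p;
    if (c == req_char || c == req_char2) break;
    p++;
    }
  return p;
}

/* Hypothetical driver: walk every start offset as a matcher would, scanning
   only when the start point has moved past the cached hit. Returns the number
   of scans performed and stores the offset at which the search gave up
   because the required character no longer lies ahead. */
static int
count_req_scans(const unsigned char *subject, size_t length,
                unsigned int req_char, unsigned int req_char2,
                size_t *gave_up_at)
{
  const unsigned char *end = subject + length;
  const unsigned char *req_char_ptr = NULL;  /* where it was last found */
  const unsigned char *start;
  int scans = 0;

  for (start = subject; start < end; start++)
    {
    const unsigned char *p = start + 1;      /* assumes a first char is set */
    if (req_char_ptr == NULL || p > req_char_ptr)
      {
      p = find_req_char(p, end, req_char, req_char2);
      scans++;
      if (p >= end) break;                   /* no match is possible */
      req_char_ptr = p;                      /* cache the hit */
      }
    /* A real matcher would attempt a full match at this start point. */
    }

  *gave_up_at = (size_t)(start - subject);
  return scans;
}
```

With the subject "aaaaXbbbb" and the pair 'x'/'X', the first scan finds the X at offset 4, start offsets 1 to 3 reuse that cached position, and the second scan (from offset 5) reaches the end of the subject, so the search stops after two scans instead of one per start offset.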

Modified: code/trunk/pcre_fullinfo.c
===================================================================
--- code/trunk/pcre_fullinfo.c    2011-12-28 16:10:09 UTC (rev 835)
+++ code/trunk/pcre_fullinfo.c    2011-12-28 17:16:11 UTC (rev 836)
@@ -6,7 +6,7 @@
 and semantics are as close as possible to those of the Perl 5 language.

                        Written by Philip Hazel
-           Copyright (c) 1997-2011 University of Cambridge
+           Copyright (c) 1997-2012 University of Cambridge

 -----------------------------------------------------------------------------
 Redistribution and use in source and binary forms, with or without
@@ -65,12 +65,16 @@
 Returns:           0 if data returned, negative on error
 */

+#ifdef COMPILE_PCRE8
PCRE_EXP_DEFN int PCRE_CALL_CONVENTION
pcre_fullinfo(const pcre *argument_re, const pcre_extra *extra_data, int what,
void *where)
+#else
+PCRE_EXP_DEFN int PCRE_CALL_CONVENTION
+pcre16_fullinfo(const pcre *argument_re, const pcre_extra *extra_data, int what,
+ void *where)
+#endif
{
-real_pcre internal_re;
-pcre_study_data internal_study;
const real_pcre *re = (const real_pcre *)argument_re;
const pcre_study_data *study = NULL;

@@ -79,12 +83,18 @@
if (extra_data != NULL && (extra_data->flags & PCRE_EXTRA_STUDY_DATA) != 0)
study = (const pcre_study_data *)extra_data->study_data;

+/* Check that the first field in the block is the magic number. If it is not,
+return with PCRE_ERROR_BADMAGIC. However, if the magic number is equal to
+REVERSED_MAGIC_NUMBER we return with PCRE_ERROR_BADENDIANNESS, which
+means that the pattern is likely compiled with different endianness. */
+
 if (re->magic_number != MAGIC_NUMBER)
-  {
-  re = _pcre_try_flipped(re, &internal_re, study, &internal_study);
-  if (re == NULL) return PCRE_ERROR_BADMAGIC;
-  if (study != NULL) study = &internal_study;
-  }
+  return re->magic_number == REVERSED_MAGIC_NUMBER?
+    PCRE_ERROR_BADENDIANNESS:PCRE_ERROR_BADMAGIC;
+    
+/* Check that this pattern was compiled in the correct bit mode */
+ 
+if ((re->flags & PCRE_MODE) == 0) return PCRE_ERROR_BADMODE;

switch (what)
{
@@ -100,6 +110,18 @@
*((size_t *)where) = (study == NULL)? 0 : study->size;
break;

+  case PCRE_INFO_JITSIZE:
+#ifdef SUPPORT_JIT
+  *((size_t *)where) =
+      (extra_data != NULL &&
+      (extra_data->flags & PCRE_EXTRA_EXECUTABLE_JIT) != 0 &&
+      extra_data->executable_jit != NULL)?
+    PRIV(jit_get_size)(extra_data->executable_jit) : 0;
+#else
+  *((size_t *)where) = 0;
+#endif
+  break;
+
   case PCRE_INFO_CAPTURECOUNT:
   *((int *)where) = re->top_bracket;
   break;
@@ -110,7 +132,7 @@

   case PCRE_INFO_FIRSTBYTE:
   *((int *)where) =
-    ((re->flags & PCRE_FIRSTSET) != 0)? re->first_byte :
+    ((re->flags & PCRE_FIRSTSET) != 0)? re->first_char :
     ((re->flags & PCRE_STARTLINE) != 0)? -1 : -2;
   break;

@@ -118,7 +140,7 @@
block, not the internal copy (with flipped integer fields). */

   case PCRE_INFO_FIRSTTABLE:
-  *((const uschar **)where) =
+  *((const pcre_uint8 **)where) =
     (study != NULL && (study->flags & PCRE_STUDY_MAPPED) != 0)?
       ((const pcre_study_data *)extra_data->study_data)->start_bits : NULL;
   break;
@@ -137,7 +159,7 @@

   case PCRE_INFO_LASTLITERAL:
   *((int *)where) =
-    ((re->flags & PCRE_REQCHSET) != 0)? re->req_byte : -1;
+    ((re->flags & PCRE_REQCHSET) != 0)? re->req_char : -1;
   break;

case PCRE_INFO_NAMEENTRYSIZE:
@@ -149,11 +171,11 @@
break;

case PCRE_INFO_NAMETABLE:
- *((const uschar **)where) = (const uschar *)re + re->name_table_offset;
+ *((const pcre_uchar **)where) = (const pcre_uchar *)re + re->name_table_offset;
break;

case PCRE_INFO_DEFAULT_TABLES:
- *((const uschar **)where) = (const uschar *)(_pcre_default_tables);
+ *((const pcre_uint8 **)where) = (const pcre_uint8 *)(PRIV(default_tables));
break;

/* From release 8.00 this will always return TRUE because NOPARTIAL is
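
The new magic-number check in pcre_fullinfo() no longer tries to byte-flip the pattern. It only distinguishes "wrong endianness" from "not a compiled pattern at all". A minimal sketch of that classification follows, assuming a 32-bit magic field. DEMO_MAGIC, the demo_status values and both functions are hypothetical stand-ins chosen for illustration, not the values or names used in pcre_internal.h or pcre.h.

```c
#include <stdint.h>

/* Hypothetical stand-ins for MAGIC_NUMBER and the PCRE error codes. */
#define DEMO_MAGIC 0x50435245u

enum demo_status
  {
  DEMO_OK = 0,
  DEMO_BAD_MAGIC = -1,
  DEMO_BAD_ENDIANNESS = -2
  };

/* Reverse the byte order of a 32-bit value. */
static uint32_t
demo_swap32(uint32_t v)
{
  return ((v & 0x000000ffu) << 24) | ((v & 0x0000ff00u) << 8) |
         ((v & 0x00ff0000u) >> 8)  | ((v & 0xff000000u) >> 24);
}

/* Classify the first field of a compiled-pattern block. */
static int
demo_check_magic(uint32_t magic_field)
{
  if (magic_field == DEMO_MAGIC) return DEMO_OK;
  return (magic_field == demo_swap32(DEMO_MAGIC))?
    DEMO_BAD_ENDIANNESS : DEMO_BAD_MAGIC;
}
```

A caller that saves compiled patterns to disk would, under this scheme, treat the endianness result as a signal to recompile or convert the pattern on the current host rather than as data corruption.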

Modified: code/trunk/pcre_get.c
===================================================================
--- code/trunk/pcre_get.c    2011-12-28 16:10:09 UTC (rev 835)
+++ code/trunk/pcre_get.c    2011-12-28 17:16:11 UTC (rev 836)
@@ -6,7 +6,7 @@
 and semantics are as close as possible to those of the Perl 5 language.

                        Written by Philip Hazel
-           Copyright (c) 1997-2008 University of Cambridge
+           Copyright (c) 1997-2012 University of Cambridge

 -----------------------------------------------------------------------------
 Redistribution and use in source and binary forms, with or without
@@ -65,14 +65,20 @@
                 (PCRE_ERROR_NOSUBSTRING) if not found
 */

+#ifdef COMPILE_PCRE8
PCRE_EXP_DEFN int PCRE_CALL_CONVENTION
pcre_get_stringnumber(const pcre *code, const char *stringname)
+#else
+PCRE_EXP_DEFN int PCRE_CALL_CONVENTION
+pcre16_get_stringnumber(const pcre *code, PCRE_SPTR16 stringname)
+#endif
{
int rc;
int entrysize;
int top, bot;
-uschar *nametable;
+pcre_uchar *nametable;

+#ifdef COMPILE_PCRE8
if ((rc = pcre_fullinfo(code, NULL, PCRE_INFO_NAMECOUNT, &top)) != 0)
return rc;
if (top <= 0) return PCRE_ERROR_NOSUBSTRING;
@@ -81,14 +87,26 @@
return rc;
if ((rc = pcre_fullinfo(code, NULL, PCRE_INFO_NAMETABLE, &nametable)) != 0)
return rc;
+#endif
+#ifdef COMPILE_PCRE16
+if ((rc = pcre16_fullinfo(code, NULL, PCRE_INFO_NAMECOUNT, &top)) != 0)
+ return rc;
+if (top <= 0) return PCRE_ERROR_NOSUBSTRING;

+if ((rc = pcre16_fullinfo(code, NULL, PCRE_INFO_NAMEENTRYSIZE, &entrysize)) != 0)
+  return rc;
+if ((rc = pcre16_fullinfo(code, NULL, PCRE_INFO_NAMETABLE, &nametable)) != 0)
+  return rc;
+#endif
+
 bot = 0;
 while (top > bot)
   {
   int mid = (top + bot) / 2;
-  uschar *entry = nametable + entrysize*mid;
-  int c = strcmp(stringname, (char *)(entry + 2));
-  if (c == 0) return (entry[0] << 8) + entry[1];
+  pcre_uchar *entry = nametable + entrysize*mid;
+  int c = STRCMP_UC_UC((pcre_uchar *)stringname,
+    (pcre_uchar *)(entry + IMM2_SIZE));
+  if (c == 0) return GET2(entry, 0);
   if (c > 0) bot = mid + 1; else top = mid;
   }

@@ -114,15 +132,22 @@
                 (PCRE_ERROR_NOSUBSTRING) if not found
 */

+#ifdef COMPILE_PCRE8
PCRE_EXP_DEFN int PCRE_CALL_CONVENTION
pcre_get_stringtable_entries(const pcre *code, const char *stringname,
char **firstptr, char **lastptr)
+#else
+PCRE_EXP_DEFN int PCRE_CALL_CONVENTION
+pcre16_get_stringtable_entries(const pcre *code, PCRE_SPTR16 stringname,
+ PCRE_SCHAR16 **firstptr, PCRE_SCHAR16 **lastptr)
+#endif
{
int rc;
int entrysize;
int top, bot;
-uschar *nametable, *lastentry;
+pcre_uchar *nametable, *lastentry;

+#ifdef COMPILE_PCRE8
if ((rc = pcre_fullinfo(code, NULL, PCRE_INFO_NAMECOUNT, &top)) != 0)
return rc;
if (top <= 0) return PCRE_ERROR_NOSUBSTRING;
@@ -131,30 +156,49 @@
return rc;
if ((rc = pcre_fullinfo(code, NULL, PCRE_INFO_NAMETABLE, &nametable)) != 0)
return rc;
+#endif
+#ifdef COMPILE_PCRE16
+if ((rc = pcre16_fullinfo(code, NULL, PCRE_INFO_NAMECOUNT, &top)) != 0)
+ return rc;
+if (top <= 0) return PCRE_ERROR_NOSUBSTRING;

+if ((rc = pcre16_fullinfo(code, NULL, PCRE_INFO_NAMEENTRYSIZE, &entrysize)) != 0)
+  return rc;
+if ((rc = pcre16_fullinfo(code, NULL, PCRE_INFO_NAMETABLE, &nametable)) != 0)
+  return rc;
+#endif
+
 lastentry = nametable + entrysize * (top - 1);
 bot = 0;
 while (top > bot)
   {
   int mid = (top + bot) / 2;
-  uschar *entry = nametable + entrysize*mid;
-  int c = strcmp(stringname, (char *)(entry + 2));
+  pcre_uchar *entry = nametable + entrysize*mid;
+  int c = STRCMP_UC_UC((pcre_uchar *)stringname,
+    (pcre_uchar *)(entry + IMM2_SIZE));
   if (c == 0)
     {
-    uschar *first = entry;
-    uschar *last = entry;
+    pcre_uchar *first = entry;
+    pcre_uchar *last = entry;
     while (first > nametable)
       {
-      if (strcmp(stringname, (char *)(first - entrysize + 2)) != 0) break;
+      if (STRCMP_UC_UC((pcre_uchar *)stringname,
+        (pcre_uchar *)(first - entrysize + IMM2_SIZE)) != 0) break;
       first -= entrysize;
       }
     while (last < lastentry)
       {
-      if (strcmp(stringname, (char *)(last + entrysize + 2)) != 0) break;
+      if (STRCMP_UC_UC((pcre_uchar *)stringname,
+        (pcre_uchar *)(last + entrysize + IMM2_SIZE)) != 0) break;
       last += entrysize;
       }
+#ifdef COMPILE_PCRE8
     *firstptr = (char *)first;
     *lastptr = (char *)last;
+#else
+    *firstptr = (PCRE_SCHAR16 *)first;
+    *lastptr = (PCRE_SCHAR16 *)last;
+#endif
     return entrysize;
     }
   if (c > 0) bot = mid + 1; else top = mid;
@@ -182,23 +226,36 @@
                or a negative number on error
 */

+#ifdef COMPILE_PCRE8
static int
get_first_set(const pcre *code, const char *stringname, int *ovector)
+#else
+static int
+get_first_set(const pcre *code, PCRE_SPTR16 stringname, int *ovector)
+#endif
{
const real_pcre *re = (const real_pcre *)code;
int entrysize;
-char *first, *last;
-uschar *entry;
+pcre_uchar *first, *last;
+pcre_uchar *entry;
+#ifdef COMPILE_PCRE8
if ((re->options & PCRE_DUPNAMES) == 0 && (re->flags & PCRE_JCHANGED) == 0)
return pcre_get_stringnumber(code, stringname);
-entrysize = pcre_get_stringtable_entries(code, stringname, &first, &last);
+entrysize = pcre_get_stringtable_entries(code, stringname,
+ (char **)&first, (char **)&last);
+#else
+if ((re->options & PCRE_DUPNAMES) == 0 && (re->flags & PCRE_JCHANGED) == 0)
+ return pcre16_get_stringnumber(code, stringname);
+entrysize = pcre16_get_stringtable_entries(code, stringname,
+ (PCRE_SCHAR16 **)&first, (PCRE_SCHAR16 **)&last);
+#endif
if (entrysize <= 0) return entrysize;
-for (entry = (uschar *)first; entry <= (uschar *)last; entry += entrysize)
+for (entry = (pcre_uchar *)first; entry <= (pcre_uchar *)last; entry += entrysize)
{
- int n = (entry[0] << 8) + entry[1];
+ int n = GET2(entry, 0);
if (ovector[n*2] >= 0) return n;
}
-return (first[0] << 8) + first[1];
+return GET2(entry, 0);
}

@@ -231,9 +288,15 @@
                    PCRE_ERROR_NOSUBSTRING (-7) no such captured substring
 */

+#ifdef COMPILE_PCRE8
 PCRE_EXP_DEFN int PCRE_CALL_CONVENTION
 pcre_copy_substring(const char *subject, int *ovector, int stringcount,
   int stringnumber, char *buffer, int size)
+#else
+PCRE_EXP_DEFN int PCRE_CALL_CONVENTION
+pcre16_copy_substring(PCRE_SPTR16 subject, int *ovector, int stringcount,
+  int stringnumber, PCRE_SCHAR16 *buffer, int size)
+#endif
 {
 int yield;
 if (stringnumber < 0 || stringnumber >= stringcount)
@@ -241,7 +304,7 @@
 stringnumber *= 2;
 yield = ovector[stringnumber+1] - ovector[stringnumber];
 if (size < yield + 1) return PCRE_ERROR_NOMEMORY;
-memcpy(buffer, subject + ovector[stringnumber], yield);
+memcpy(buffer, subject + ovector[stringnumber], IN_UCHARS(yield));
 buffer[yield] = 0;
 return yield;
 }
@@ -276,13 +339,23 @@
                    PCRE_ERROR_NOSUBSTRING (-7) no such captured substring
 */

+#ifdef COMPILE_PCRE8
PCRE_EXP_DEFN int PCRE_CALL_CONVENTION
pcre_copy_named_substring(const pcre *code, const char *subject, int *ovector,
int stringcount, const char *stringname, char *buffer, int size)
+#else
+PCRE_EXP_DEFN int PCRE_CALL_CONVENTION
+pcre16_copy_named_substring(const pcre *code, PCRE_SPTR16 subject, int *ovector,
+ int stringcount, PCRE_SPTR16 stringname, PCRE_SCHAR16 *buffer, int size)
+#endif
{
int n = get_first_set(code, stringname, ovector);
if (n <= 0) return n;
+#ifdef COMPILE_PCRE8
return pcre_copy_substring(subject, ovector, stringcount, n, buffer, size);
+#else
+return pcre16_copy_substring(subject, ovector, stringcount, n, buffer, size);
+#endif
}

@@ -308,29 +381,39 @@
                    PCRE_ERROR_NOMEMORY (-6) failed to get store
 */

+#ifdef COMPILE_PCRE8
PCRE_EXP_DEFN int PCRE_CALL_CONVENTION
pcre_get_substring_list(const char *subject, int *ovector, int stringcount,
const char ***listptr)
+#else
+PCRE_EXP_DEFN int PCRE_CALL_CONVENTION
+pcre16_get_substring_list(PCRE_SPTR16 subject, int *ovector, int stringcount,
+ PCRE_SPTR16 **listptr)
+#endif
{
int i;
-int size = sizeof(char *);
+int size = sizeof(pcre_uchar *);
int double_count = stringcount * 2;
-char **stringlist;
-char *p;
+pcre_uchar **stringlist;
+pcre_uchar *p;

for (i = 0; i < double_count; i += 2)
- size += sizeof(char *) + ovector[i+1] - ovector[i] + 1;
+ size += sizeof(pcre_uchar *) + IN_UCHARS(ovector[i+1] - ovector[i] + 1);

-stringlist = (char **)(pcre_malloc)(size);
+stringlist = (pcre_uchar **)(PUBL(malloc))(size);
if (stringlist == NULL) return PCRE_ERROR_NOMEMORY;

+#ifdef COMPILE_PCRE8
*listptr = (const char **)stringlist;
-p = (char *)(stringlist + stringcount + 1);
+#else
+*listptr = (PCRE_SPTR16 *)stringlist;
+#endif
+p = (pcre_uchar *)(stringlist + stringcount + 1);

for (i = 0; i < double_count; i += 2)
{
int len = ovector[i+1] - ovector[i];
- memcpy(p, subject + ovector[i], len);
+ memcpy(p, subject + ovector[i], IN_UCHARS(len));
*stringlist++ = p;
p += len;
*p++ = 0;
@@ -347,16 +430,22 @@
*************************************************/

/* This function exists for the benefit of people calling PCRE from non-C
-programs that can call its functions, but not free() or (pcre_free)() directly.
+programs that can call its functions, but not free() or (PUBL(free))()
+directly.

 Argument:   the result of a previous pcre_get_substring_list()
 Returns:    nothing
 */

+#ifdef COMPILE_PCRE8
PCRE_EXP_DEFN void PCRE_CALL_CONVENTION
pcre_free_substring_list(const char **pointer)
+#else
+PCRE_EXP_DEFN void PCRE_CALL_CONVENTION
+pcre16_free_substring_list(PCRE_SPTR16 *pointer)
+#endif
{
-(pcre_free)((void *)pointer);
+(PUBL(free))((void *)pointer);
}

@@ -386,21 +475,31 @@
                    PCRE_ERROR_NOSUBSTRING (-7) substring not present
 */

+#ifdef COMPILE_PCRE8
PCRE_EXP_DEFN int PCRE_CALL_CONVENTION
pcre_get_substring(const char *subject, int *ovector, int stringcount,
int stringnumber, const char **stringptr)
+#else
+PCRE_EXP_DEFN int PCRE_CALL_CONVENTION
+pcre16_get_substring(PCRE_SPTR16 subject, int *ovector, int stringcount,
+ int stringnumber, PCRE_SPTR16 *stringptr)
+#endif
{
int yield;
-char *substring;
+pcre_uchar *substring;
if (stringnumber < 0 || stringnumber >= stringcount)
return PCRE_ERROR_NOSUBSTRING;
stringnumber *= 2;
yield = ovector[stringnumber+1] - ovector[stringnumber];
-substring = (char *)(pcre_malloc)(yield + 1);
+substring = (pcre_uchar *)(PUBL(malloc))(IN_UCHARS(yield + 1));
if (substring == NULL) return PCRE_ERROR_NOMEMORY;
-memcpy(substring, subject + ovector[stringnumber], yield);
+memcpy(substring, subject + ovector[stringnumber], IN_UCHARS(yield));
substring[yield] = 0;
-*stringptr = substring;
+#ifdef COMPILE_PCRE8
+*stringptr = (const char *)substring;
+#else
+*stringptr = (PCRE_SPTR16)substring;
+#endif
return yield;
}

@@ -433,13 +532,23 @@
                    PCRE_ERROR_NOSUBSTRING (-7) no such captured substring
 */

+#ifdef COMPILE_PCRE8
PCRE_EXP_DEFN int PCRE_CALL_CONVENTION
pcre_get_named_substring(const pcre *code, const char *subject, int *ovector,
int stringcount, const char *stringname, const char **stringptr)
+#else
+PCRE_EXP_DEFN int PCRE_CALL_CONVENTION
+pcre16_get_named_substring(const pcre *code, PCRE_SPTR16 subject, int *ovector,
+ int stringcount, PCRE_SPTR16 stringname, PCRE_SPTR16 *stringptr)
+#endif
{
int n = get_first_set(code, stringname, ovector);
if (n <= 0) return n;
+#ifdef COMPILE_PCRE8
return pcre_get_substring(subject, ovector, stringcount, n, stringptr);
+#else
+return pcre16_get_substring(subject, ovector, stringcount, n, stringptr);
+#endif
}

@@ -450,16 +559,22 @@
*************************************************/

/* This function exists for the benefit of people calling PCRE from non-C
-programs that can call its functions, but not free() or (pcre_free)() directly.
+programs that can call its functions, but not free() or (PUBL(free))()
+directly.

 Argument:   the result of a previous pcre_get_substring()
 Returns:    nothing
 */

+#ifdef COMPILE_PCRE8
PCRE_EXP_DEFN void PCRE_CALL_CONVENTION
pcre_free_substring(const char *pointer)
+#else
+PCRE_EXP_DEFN void PCRE_CALL_CONVENTION
+pcre16_free_substring(PCRE_SPTR16 pointer)
+#endif
{
-(pcre_free)((void *)pointer);
+(PUBL(free))((void *)pointer);
}

/* End of pcre_get.c */
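
Most of the pcre_get.c changes above come down to one point: ovector offsets and lengths are counted in code units, while memcpy() and the allocator count bytes, so lengths are wrapped in IN_UCHARS before being passed on. The sketch below restates pcre16_copy_substring() in a standalone form to show this. demo_uchar, DEMO_IN_UCHARS and demo_copy_substring are hypothetical names, and the two error values are the ones quoted in the function comments in the diff.

```c
#include <stdint.h>
#include <string.h>

typedef uint16_t demo_uchar;                    /* one 16-bit code unit */

/* Convert a length in code units to a length in bytes. */
#define DEMO_IN_UCHARS(x) ((x) * sizeof(demo_uchar))

#define DEMO_ERROR_NOMEMORY    (-6)
#define DEMO_ERROR_NOSUBSTRING (-7)

/* Copy captured substring number stringnumber into buffer, which has room for
   size code units. Returns the substring length in code units, or a negative
   error code. */
static int
demo_copy_substring(const demo_uchar *subject, const int *ovector,
                    int stringcount, int stringnumber,
                    demo_uchar *buffer, int size)
{
  int yield;
  if (stringnumber < 0 || stringnumber >= stringcount)
    return DEMO_ERROR_NOSUBSTRING;
  stringnumber *= 2;
  yield = ovector[stringnumber+1] - ovector[stringnumber];
  if (size < yield + 1) return DEMO_ERROR_NOMEMORY;

  /* Pointer arithmetic on subject is already in code units; only the byte
     count given to memcpy needs scaling. */
  memcpy(buffer, subject + ovector[stringnumber], DEMO_IN_UCHARS(yield));
  buffer[yield] = 0;
  return yield;
}
```

Without the scaling, a 16-bit build would copy only half of each substring, which is why the same wrapper appears around the malloc and memcpy sizes in pcre_get_substring() and pcre_get_substring_list().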

Modified: code/trunk/pcre_globals.c
===================================================================
--- code/trunk/pcre_globals.c    2011-12-28 16:10:09 UTC (rev 835)
+++ code/trunk/pcre_globals.c    2011-12-28 17:16:11 UTC (rev 836)
@@ -6,7 +6,7 @@
 and semantics are as close as possible to those of the Perl 5 language.

                        Written by Philip Hazel
-           Copyright (c) 1997-2008 University of Cambridge
+           Copyright (c) 1997-2012 University of Cambridge

-----------------------------------------------------------------------------
Redistribution and use in source and binary forms, with or without
@@ -67,18 +67,18 @@
{
free(aPtr);
}
-PCRE_EXP_DATA_DEFN void *(*pcre_malloc)(size_t) = LocalPcreMalloc;
-PCRE_EXP_DATA_DEFN void (*pcre_free)(void *) = LocalPcreFree;
-PCRE_EXP_DATA_DEFN void *(*pcre_stack_malloc)(size_t) = LocalPcreMalloc;
-PCRE_EXP_DATA_DEFN void (*pcre_stack_free)(void *) = LocalPcreFree;
-PCRE_EXP_DATA_DEFN int (*pcre_callout)(pcre_callout_block *) = NULL;
+PCRE_EXP_DATA_DEFN void *(*PUBL(malloc))(size_t) = LocalPcreMalloc;
+PCRE_EXP_DATA_DEFN void (*PUBL(free))(void *) = LocalPcreFree;
+PCRE_EXP_DATA_DEFN void *(*PUBL(stack_malloc))(size_t) = LocalPcreMalloc;
+PCRE_EXP_DATA_DEFN void (*PUBL(stack_free))(void *) = LocalPcreFree;
+PCRE_EXP_DATA_DEFN int (*PUBL(callout))(pcre_callout_block *) = NULL;

#elif !defined VPCOMPAT
-PCRE_EXP_DATA_DEFN void *(*pcre_malloc)(size_t) = malloc;
-PCRE_EXP_DATA_DEFN void (*pcre_free)(void *) = free;
-PCRE_EXP_DATA_DEFN void *(*pcre_stack_malloc)(size_t) = malloc;
-PCRE_EXP_DATA_DEFN void (*pcre_stack_free)(void *) = free;
-PCRE_EXP_DATA_DEFN int (*pcre_callout)(pcre_callout_block *) = NULL;
+PCRE_EXP_DATA_DEFN void *(*PUBL(malloc))(size_t) = malloc;
+PCRE_EXP_DATA_DEFN void (*PUBL(free))(void *) = free;
+PCRE_EXP_DATA_DEFN void *(*PUBL(stack_malloc))(size_t) = malloc;
+PCRE_EXP_DATA_DEFN void (*PUBL(stack_free))(void *) = free;
+PCRE_EXP_DATA_DEFN int (*PUBL(callout))(pcre_callout_block *) = NULL;
#endif

/* End of pcre_globals.c */
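
The PUBL() wrapper used above is not defined in this part of the diff. The sketch below assumes it is a token-pasting macro that selects a pcre_ or pcre16_ prefix depending on the compile mode, which is consistent with the pcre16_ function names seen elsewhere in this revision. DEMO_PUBL, DEMO_COMPILE_16 and demo_roundtrip are hypothetical names used only for illustration.

```c
#include <stdlib.h>

/* Assumed shape of the macro: paste a mode-specific prefix onto the name. */
#ifdef DEMO_COMPILE_16
#define DEMO_PUBL(name) demo16_##name
#else
#define DEMO_PUBL(name) demo_##name
#endif

/* One set of definitions yields demo_malloc/demo_free or
   demo16_malloc/demo16_free, depending on DEMO_COMPILE_16. */
void *(*DEMO_PUBL(malloc))(size_t) = malloc;
void  (*DEMO_PUBL(free))(void *)   = free;

/* Allocate and release a block through the indirection. Returns 1 on success,
   0 if the allocation failed. */
static int
demo_roundtrip(size_t n)
{
  void *block = (DEMO_PUBL(malloc))(n);
  if (block == NULL) return 0;
  (DEMO_PUBL(free))(block);
  return 1;
}
```

The parentheses around the call target, as in (PUBL(free))(...), follow the style of the surrounding code. One practical effect is that the call always goes through the function-pointer variable even if a function-like macro of the same name happens to be defined.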

Deleted: code/trunk/pcre_info.c
===================================================================
--- code/trunk/pcre_info.c    2011-12-28 16:10:09 UTC (rev 835)
+++ code/trunk/pcre_info.c    2011-12-28 17:16:11 UTC (rev 836)
@@ -1,93 +0,0 @@
-/*************************************************
-*      Perl-Compatible Regular Expressions       *
-*************************************************/
-
-/* PCRE is a library of functions to support regular expressions whose syntax
-and semantics are as close as possible to those of the Perl 5 language.
-
-                       Written by Philip Hazel
-           Copyright (c) 1997-2009 University of Cambridge
-
------------------------------------------------------------------------------
-Redistribution and use in source and binary forms, with or without
-modification, are permitted provided that the following conditions are met:
-
-    * Redistributions of source code must retain the above copyright notice,
-      this list of conditions and the following disclaimer.
-
-    * Redistributions in binary form must reproduce the above copyright
-      notice, this list of conditions and the following disclaimer in the
-      documentation and/or other materials provided with the distribution.
-
-    * Neither the name of the University of Cambridge nor the names of its
-      contributors may be used to endorse or promote products derived from
-      this software without specific prior written permission.
-
-THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
-AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
-IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
-ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE
-LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
-CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
-SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
-INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
-CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
-ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
-POSSIBILITY OF SUCH DAMAGE.
------------------------------------------------------------------------------
-*/
-
-
-/* This module contains the external function pcre_info(), which gives some
-information about a compiled pattern. However, use of this function is now
-deprecated, as it has been superseded by pcre_fullinfo(). */
-
-
-#ifdef HAVE_CONFIG_H
-#include "config.h"
-#endif
-
-#include "pcre_internal.h"
-
-
-/*************************************************
-* (Obsolete) Return info about compiled pattern  *
-*************************************************/
-
-/* This is the original "info" function. It picks potentially useful data out
-of the private structure, but its interface was too rigid. It remains for
-backwards compatibility. The public options are passed back in an int - though
-the re->options field has been expanded to a long int, all the public options
-at the low end of it, and so even on 16-bit systems this will still be OK.
-Therefore, I haven't changed the API for pcre_info().
-
-Arguments:
-  argument_re   points to compiled code
-  optptr        where to pass back the options
-  first_byte    where to pass back the first character,
-                or -1 if multiline and all branches start ^,
-                or -2 otherwise
-
-Returns:        number of capturing subpatterns
-                or negative values on error
-*/
-
-PCRE_EXP_DEFN int PCRE_CALL_CONVENTION
-pcre_info(const pcre *argument_re, int *optptr, int *first_byte)
-{
-real_pcre internal_re;
-const real_pcre *re = (const real_pcre *)argument_re;
-if (re == NULL) return PCRE_ERROR_NULL;
-if (re->magic_number != MAGIC_NUMBER)
-  {
-  re = _pcre_try_flipped(re, &internal_re, NULL, NULL);
-  if (re == NULL) return PCRE_ERROR_BADMAGIC;
-  }
-if (optptr != NULL) *optptr = (int)(re->options & PUBLIC_COMPILE_OPTIONS);
-if (first_byte != NULL)
-  *first_byte = ((re->flags & PCRE_FIRSTSET) != 0)? re->first_byte :
-     ((re->flags & PCRE_STARTLINE) != 0)? -1 : -2;
-return re->top_bracket;
-}
-
-/* End of pcre_info.c */

Modified: code/trunk/pcre_internal.h
===================================================================
--- code/trunk/pcre_internal.h    2011-12-28 16:10:09 UTC (rev 835)
+++ code/trunk/pcre_internal.h    2011-12-28 17:16:11 UTC (rev 836)
@@ -7,7 +7,7 @@
 and semantics are as close as possible to those of the Perl 5 language.

                        Written by Philip Hazel
-           Copyright (c) 1997-2011 University of Cambridge
+           Copyright (c) 1997-2012 University of Cambridge

-----------------------------------------------------------------------------
Redistribution and use in source and binary forms, with or without
@@ -40,7 +40,8 @@

/* This header contains definitions that are shared between the different
modules, but which are not relevant to the exported API. This includes some
-functions whose names all begin with "_pcre_". */
+functions whose names all begin with "_pcre_" or "_pcre16_" depending on
+the PRIV macro. */

#ifndef PCRE_INTERNAL_H
#define PCRE_INTERNAL_H
@@ -51,20 +52,39 @@
#define PCRE_DEBUG
#endif

-/* We do not support both EBCDIC and UTF-8 at the same time. The "configure"
-script prevents both being selected, but not everybody uses "configure". */
-
-#if defined EBCDIC && defined SUPPORT_UTF8
-#error The use of both EBCDIC and SUPPORT_UTF8 is not supported.
+/* PCRE is compiled as an 8 bit library if it is not requested otherwise. */
+#ifndef COMPILE_PCRE16
+#define COMPILE_PCRE8
#endif

-/* If SUPPORT_UCP is defined, SUPPORT_UTF8 must also be defined. The
+/* If SUPPORT_UCP is defined, SUPPORT_UTF must also be defined. The
"configure" script ensures this, but not everybody uses "configure". */

-#if defined SUPPORT_UCP && !defined SUPPORT_UTF8
+#if defined SUPPORT_UCP && !(defined SUPPORT_UTF)
+#define SUPPORT_UTF 1
+#endif
+
+/* We define SUPPORT_UTF if SUPPORT_UTF8 is enabled for compatibility
+reasons with existing code. */
+
+#if defined SUPPORT_UTF8 && !(defined SUPPORT_UTF)
+#define SUPPORT_UTF 1
+#endif
+
+/* Fixme: SUPPORT_UTF8 should eventually disappear from the code.
+Until then we define it if SUPPORT_UTF is defined. */
+
+#if defined SUPPORT_UTF && !(defined SUPPORT_UTF8)
#define SUPPORT_UTF8 1
#endif

+/* We do not support both EBCDIC and UTF-8/16 at the same time. The "configure"
+script prevents both being selected, but not everybody uses "configure". */
+
+#if defined EBCDIC && defined SUPPORT_UTF
+#error The use of both EBCDIC and SUPPORT_UTF8/16 is not supported.
+#endif
+
/* Use a macro for debugging printing, 'cause that eliminates the use of #ifdef
inline, and there are *still* stupid compilers about that don't like indented
pre-processor statements, or at least there were when I first wrote this. After
@@ -158,12 +178,14 @@
#define PCRE_CALL_CONVENTION
#endif

-/* We need to have types that specify unsigned 16-bit and 32-bit integers. We
+/* We need to have types that specify unsigned 8, 16 and 32-bit integers. We
cannot determine these outside the compilation (e.g. by running a program as
part of "configure") because PCRE is often cross-compiled for use on other
systems. Instead we make use of the maximum sizes that are available at
preprocessor time in standard C environments. */

+typedef unsigned char pcre_uint8;
+
#if USHRT_MAX == 65535
typedef unsigned short pcre_uint16;
typedef short pcre_int16;
@@ -206,13 +228,48 @@

/* All character handling must be done as unsigned characters. Otherwise there
are problems with top-bit-set characters and functions such as isspace().
-However, we leave the interface to the outside world as char *, because that
-should make things easier for callers. We define a short type for unsigned char
-to save lots of typing. I tried "uchar", but it causes problems on Digital
-Unix, where it is defined in sys/types, so use "uschar" instead. */
+However, we leave the interface to the outside world as char * or short *,
+because that should make things easier for callers. This character type is
+called pcre_uchar.

-typedef unsigned char uschar;
+The IN_UCHARS macro multiplies its argument by the byte size of the current
+pcre_uchar type. It is useful for memcpy and similar operations, which require
+the byte size of their input/output buffers.

+The MAX_255 macro checks whether its pcre_uchar input is less than 256.
+
+The TABLE_GET macro is designed for accessing elements of tables that contain
+exactly 256 items. When the character type can hold more than 256 distinct
+values, a check is needed before accessing these tables.
+*/
+
+#ifdef COMPILE_PCRE8
+
+typedef unsigned char pcre_uchar;
+#define IN_UCHARS(x) (x)
+#define MAX_255(c) 1
+#define TABLE_GET(c, table, default) ((table)[c])
+
+#else
+
+#ifdef COMPILE_PCRE16
+#if USHRT_MAX != 65535
+/* This is a warning message. Change PCRE_SCHAR16 to a 16 bit data type in
+pcre.h(.in) and disable (comment out) this message. */
+#error Warning: PCRE_SCHAR16 is not a 16 bit data type.
+#endif
+
+typedef pcre_uint16 pcre_uchar;
+#define IN_UCHARS(x) ((x) << 1)
+#define MAX_255(c) ((c) <= 255u)
+#define TABLE_GET(c, table, default) (MAX_255(c)? ((table)[c]):(default))
+
+#else
+#error Unsupported compiling mode
+#endif /* COMPILE_PCRE16 */
+
+#endif /* COMPILE_PCRE8 */
+
 /* This is an unsigned int value that no character can ever have. UTF-8
 characters only go up to 0x7fffffff (though Unicode doesn't go beyond
 0x0010ffff). */
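
A minimal sketch of how the 16-bit variants of IN_UCHARS, MAX_255 and TABLE_GET from the hunk above behave. The names demo_uchar, demo_lookup and demo_copy are hypothetical and exist only for this illustration; the sketch assumes a platform where unsigned short is exactly 16 bits.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <string.h>

#if USHRT_MAX != 65535
#error This sketch assumes a 16-bit unsigned short.
#endif

/* Hypothetical stand-in for pcre_uchar in a 16-bit build. */
typedef unsigned short demo_uchar;

/* The three macros as they appear in the COMPILE_PCRE16 branch. */
#define IN_UCHARS(x) ((x) << 1)
#define MAX_255(c) ((c) <= 255u)
#define TABLE_GET(c, table, default) (MAX_255(c)? ((table)[c]):(default))

/* Look up c in a 256-entry table; values above 255 fall back to c itself
   instead of indexing past the end of the table. */
static unsigned int demo_lookup(const unsigned char *table, demo_uchar c)
{
  assert(table != NULL);
  return TABLE_GET(c, table, c);
}

/* Copy n code units; memcpy wants a byte count, which IN_UCHARS supplies. */
static void demo_copy(demo_uchar *dst, const demo_uchar *src, size_t n)
{
  memcpy(dst, src, IN_UCHARS(n));
}
```

In the 8-bit branch the same macros collapse to a plain index and an unscaled count, so code written against them should compile unchanged in both modes.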
@@ -234,8 +291,8 @@
 #define IS_NEWLINE(p) \
   ((NLBLOCK->nltype != NLTYPE_FIXED)? \
     ((p) < NLBLOCK->PSEND && \
-     _pcre_is_newline((p), NLBLOCK->nltype, NLBLOCK->PSEND, &(NLBLOCK->nllen),\
-       utf8)) \
+     PRIV(is_newline)((p), NLBLOCK->nltype, NLBLOCK->PSEND, \
+       &(NLBLOCK->nllen), utf)) \
     : \
     ((p) <= NLBLOCK->PSEND - NLBLOCK->nllen && \
      (p)[0] == NLBLOCK->nl[0] && \
@@ -248,8 +305,8 @@
 #define WAS_NEWLINE(p) \
   ((NLBLOCK->nltype != NLTYPE_FIXED)? \
     ((p) > NLBLOCK->PSSTART && \
-     _pcre_was_newline((p), NLBLOCK->nltype, NLBLOCK->PSSTART, \
-       &(NLBLOCK->nllen), utf8)) \
+     PRIV(was_newline)((p), NLBLOCK->nltype, NLBLOCK->PSSTART, \
+       &(NLBLOCK->nllen), utf)) \
     : \
     ((p) >= NLBLOCK->PSSTART + NLBLOCK->nllen && \
      (p)[-NLBLOCK->nllen] == NLBLOCK->nl[0] && \
@@ -267,15 +324,11 @@
 must begin with PCRE_. */

#ifdef CUSTOM_SUBJECT_PTR
-#define PCRE_SPTR CUSTOM_SUBJECT_PTR
-#define USPTR CUSTOM_SUBJECT_PTR
+#define PCRE_PUCHAR CUSTOM_SUBJECT_PTR
#else
-#define PCRE_SPTR const char *
-#define USPTR const unsigned char *
+#define PCRE_PUCHAR const pcre_uchar *
#endif

-
-
/* Include the public PCRE header and the definitions of UCP character property
values. */

@@ -343,6 +396,8 @@
the config.h file, but can be overridden by using -D on the command line. This
is automated on Unix systems via the "configure" command. */

+#ifdef COMPILE_PCRE8
+
#if LINK_SIZE == 2

#define PUT(a,n,d) \
@@ -379,14 +434,55 @@
#define GET(a,n) \
(((a)[n] << 24) | ((a)[(n)+1] << 16) | ((a)[(n)+2] << 8) | (a)[(n)+3])

-#define MAX_PATTERN_SIZE (1 << 30) /* Keep it positive */
+/* Keep it positive */
+#define MAX_PATTERN_SIZE (1 << 30)

+#else
+#error LINK_SIZE must be either 2, 3, or 4
+#endif

+#else /* COMPILE_PCRE8 */
+
+#ifdef COMPILE_PCRE16
+
+#if LINK_SIZE == 2
+
+#undef LINK_SIZE
+#define LINK_SIZE 1
+
+#define PUT(a,n,d) \
+ (a[n] = (d))
+
+#define GET(a,n) \
+ (a[n])
+
+#define MAX_PATTERN_SIZE (1 << 16)
+
+#elif LINK_SIZE == 3 || LINK_SIZE == 4
+
+#undef LINK_SIZE
+#define LINK_SIZE 2
+
+#define PUT(a,n,d) \
+ (a[n] = (d) >> 16), \
+ (a[(n)+1] = (d) & 65535)
+
+#define GET(a,n) \
+ (((a)[n] << 16) | (a)[(n)+1])
+
+/* Keep it positive */
+#define MAX_PATTERN_SIZE (1 << 30)
+
#else
#error LINK_SIZE must be either 2, 3, or 4
#endif

+#else
+#error Unsupported compiling mode
+#endif /* COMPILE_PCRE16 */

+#endif /* COMPILE_PCRE8 */
+
/* Convenience macro defined in terms of the others */

#define PUTINC(a,n,d) PUT(a,n,d), a += LINK_SIZE
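
As a rough illustration of the 16-bit PUT/GET pair used when LINK_SIZE is configured as 3 or 4 (and therefore becomes two 16-bit units), the sketch below stores a link value as a high and a low half. The function names demo_put and demo_get are hypothetical; they are written as functions with explicit casts rather than as the macros in the diff, which is an assumption made for readability.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Store a link value in two consecutive 16-bit code units, most
   significant half first, mirroring PUT in the 16-bit branch. */
static void demo_put(uint16_t *code, size_t n, uint32_t d)
{
  assert(code != NULL);
  code[n]     = (uint16_t)(d >> 16);
  code[n + 1] = (uint16_t)(d & 65535u);
}

/* Reassemble the value, mirroring GET in the 16-bit branch. */
static uint32_t demo_get(const uint16_t *code, size_t n)
{
  assert(code != NULL);
  return ((uint32_t)code[n] << 16) | code[n + 1];
}
```

With a configured LINK_SIZE of 2 the 16-bit branch needs only one unit per link, which is why MAX_PATTERN_SIZE drops to (1 << 16) in that case.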
@@ -396,6 +492,10 @@
offsets changes. These are used for repeat counts and for other things such as
capturing parenthesis numbers in back references. */

+#ifdef COMPILE_PCRE8
+
+#define IMM2_SIZE 2
+
#define PUT2(a,n,d) \
a[n] = (d) >> 8; \
a[(n)+1] = (d) & 255
@@ -403,17 +503,39 @@
#define GET2(a,n) \
(((a)[n] << 8) | (a)[(n)+1])

-#define PUT2INC(a,n,d) PUT2(a,n,d), a += 2
+#else /* COMPILE_PCRE8 */

+#ifdef COMPILE_PCRE16

-/* When UTF-8 encoding is being used, a character is no longer just a single
-byte. The macros for character handling generate simple sequences when used in
-byte-mode, and more complicated ones for UTF-8 characters. GETCHARLENTEST is
-not used when UTF-8 is not supported, so it is not defined, and BACKCHAR should
-never be called in byte mode. To make sure they can never even appear when
-UTF-8 support is omitted, we don't even define them. */
+#define IMM2_SIZE 1

-#ifndef SUPPORT_UTF8
+#define PUT2(a,n,d) \
+ a[n] = d
+
+#define GET2(a,n) \
+ a[n]
+
+#else
+#error Unsupported compiling mode
+#endif /* COMPILE_PCRE16 */
+
+#endif /* COMPILE_PCRE8 */
+
+#define PUT2INC(a,n,d) PUT2(a,n,d), a += IMM2_SIZE
+
+/* When UTF encoding is being used, a character is no longer just a single
+code unit. The macros for character handling generate simple sequences when
+used in character-mode, and more complicated ones for UTF characters.
+GETCHARLENTEST and some other macros are never used when UTF is not supported.
+To make sure they cannot even appear when UTF support is omitted, we do not
+define them. */
+
+#ifndef SUPPORT_UTF
+
+/* #define MAX_VALUE_FOR_SINGLE_CHAR */
+/* #define HAS_EXTRALEN(c) */
+/* #define GET_EXTRALEN(c) */
+/* #define NOT_FIRSTCHAR(c) */
#define GETCHAR(c, eptr) c = *eptr;
#define GETCHARTEST(c, eptr) c = *eptr;
#define GETCHARINC(c, eptr) c = *eptr++;
@@ -421,14 +543,36 @@
#define GETCHARLEN(c, eptr, len) c = *eptr;
/* #define GETCHARLENTEST(c, eptr, len) */
/* #define BACKCHAR(eptr) */
+/* #define FORWARDCHAR(eptr) */
+/* #define ACROSSCHAR(condition, eptr, action) */

-#else /* SUPPORT_UTF8 */
+#else /* SUPPORT_UTF */

+#ifdef COMPILE_PCRE8
+
/* These macros were originally written in the form of loops that used data
-from the tables whose names start with _pcre_utf8_table. They were rewritten by
+from the tables whose names start with PRIV(utf8_table). They were rewritten by
a user so as not to use loops, because in some environments this gives a
significant performance advantage, and it seems never to do any harm. */

+/* Tells the biggest code point which can be encoded as a single character. */
+
+#define MAX_VALUE_FOR_SINGLE_CHAR 127
+
+/* Tests whether the code point needs extra characters to decode. */
+
+#define HAS_EXTRALEN(c) ((c) >= 0xc0)
+
+/* Returns the number of additional characters if HAS_EXTRALEN(c) is TRUE.
+Otherwise the behaviour is undefined. */
+
+#define GET_EXTRALEN(c) (PRIV(utf8_table4)[(c) & 0x3f])
+
+/* Returns TRUE, if the given character is not the first character
+of a UTF sequence. */
+
+#define NOT_FIRSTCHAR(c) (((c) & 0xc0) == 0x80)
+
/* Base macro to pick up the remaining bytes of a UTF-8 character, not
advancing the pointer. */

@@ -463,7 +607,7 @@

#define GETCHARTEST(c, eptr) \
c = *eptr; \
- if (utf8 && c >= 0xc0) GETUTF8(c, eptr);
+ if (utf && c >= 0xc0) GETUTF8(c, eptr);

/* Base macro to pick up the remaining bytes of a UTF-8 character, advancing
the pointer. */
@@ -511,7 +655,7 @@

#define GETCHARINCTEST(c, eptr) \
c = *eptr++; \
- if (utf8 && c >= 0xc0) GETUTF8INC(c, eptr);
+ if (utf && c >= 0xc0) GETUTF8INC(c, eptr);

/* Base macro to pick up the remaining bytes of a UTF-8 character, not
advancing the pointer, incrementing the length. */
@@ -563,7 +707,7 @@

#define GETCHARLENTEST(c, eptr, len) \
c = *eptr; \
- if (utf8 && c >= 0xc0) GETUTF8LEN(c, eptr, len);
+ if (utf && c >= 0xc0) GETUTF8LEN(c, eptr, len);

/* If the pointer is not at the start of a character, move it back until
it is. This is called only in UTF-8 mode - we don't put a test within the macro
@@ -571,9 +715,118 @@

#define BACKCHAR(eptr) while((*eptr & 0xc0) == 0x80) eptr--

-#endif /* SUPPORT_UTF8 */
+/* Same as above, just in the other direction. */
+#define FORWARDCHAR(eptr) while((*eptr & 0xc0) == 0x80) eptr++

+/* Same as above, but it allows a fully customizable form. */
+#define ACROSSCHAR(condition, eptr, action) \
+ while((condition) && ((eptr) & 0xc0) == 0x80) action

+#else /* COMPILE_PCRE8 */
+
+#ifdef COMPILE_PCRE16
+
+/* Tells the biggest code point which can be encoded as a single character. */
+
+#define MAX_VALUE_FOR_SINGLE_CHAR 65535
+
+/* Tests whether the code point needs extra characters to decode. */
+
+#define HAS_EXTRALEN(c) (((c) & 0xfc00) == 0xd800)
+
+/* Returns the number of additional characters if HAS_EXTRALEN(c) is TRUE.
+Otherwise the behaviour is undefined. */
+
+#define GET_EXTRALEN(c) 1
+
+/* Returns TRUE, if the given character is not the first character
+of a UTF sequence. */
+
+#define NOT_FIRSTCHAR(c) (((c) & 0xfc00) == 0xdc00)
+
+/* Base macro to pick up the low surrogate of a UTF-16 character, not
+advancing the pointer. */
+
+#define GETUTF16(c, eptr) \
+ { c = (((c & 0x3ff) << 10) | (eptr[1] & 0x3ff)) + 0x10000; }
+
+/* Get the next UTF-16 character, not advancing the pointer. This is called when
+we know we are in UTF-16 mode. */
+
+#define GETCHAR(c, eptr) \
+ c = *eptr; \
+ if ((c & 0xfc00) == 0xd800) GETUTF16(c, eptr);
+
+/* Get the next UTF-16 character, testing for UTF-16 mode, and not advancing the
+pointer. */
+
+#define GETCHARTEST(c, eptr) \
+ c = *eptr; \
+ if (utf && (c & 0xfc00) == 0xd800) GETUTF16(c, eptr);
+
+/* Base macro to pick up the low surrogate of a UTF-16 character, advancing
+the pointer. */
+
+#define GETUTF16INC(c, eptr) \
+ { c = (((c & 0x3ff) << 10) | (*eptr++ & 0x3ff)) + 0x10000; }
+
+/* Get the next UTF-16 character, advancing the pointer. This is called when we
+know we are in UTF-16 mode. */
+
+#define GETCHARINC(c, eptr) \
+ c = *eptr++; \
+ if ((c & 0xfc00) == 0xd800) GETUTF16INC(c, eptr);
+
+/* Get the next character, testing for UTF-16 mode, and advancing the pointer.
+This is called when we don't know if we are in UTF-16 mode. */
+
+#define GETCHARINCTEST(c, eptr) \
+ c = *eptr++; \
+ if (utf && (c & 0xfc00) == 0xd800) GETUTF16INC(c, eptr);
+
+/* Base macro to pick up the low surrogate of a UTF-16 character, not
+advancing the pointer, incrementing the length. */
+
+#define GETUTF16LEN(c, eptr, len) \
+ { c = (((c & 0x3ff) << 10) | (eptr[1] & 0x3ff)) + 0x10000; len++; }
+
+/* Get the next UTF-16 character, not advancing the pointer, incrementing
+length if there is a low surrogate. This is called when we know we are in
+UTF-16 mode. */
+
+#define GETCHARLEN(c, eptr, len) \
+ c = *eptr; \
+ if ((c & 0xfc00) == 0xd800) GETUTF16LEN(c, eptr, len);
+
+/* Get the next UTF-16 character, testing for UTF-16 mode, not advancing the
+pointer, incrementing length if there is a low surrogate. This is called when
+we do not know if we are in UTF-16 mode. */
+
+#define GETCHARLENTEST(c, eptr, len) \
+ c = *eptr; \
+ if (utf && (c & 0xfc00) == 0xd800) GETUTF16LEN(c, eptr, len);
+
+/* If the pointer is not at the start of a character, move it back until
+it is. This is called only in UTF-16 mode - we don't put a test within the
+macro because almost all calls are already within a block of UTF-16 only
+code. */
+
+#define BACKCHAR(eptr) if ((*eptr & 0xfc00) == 0xdc00) eptr--
+
+/* Same as above, just in the other direction. */
+#define FORWARDCHAR(eptr) if ((*eptr & 0xfc00) == 0xdc00) eptr++
+
+/* Same as above, but it allows a fully customizable form. */
+#define ACROSSCHAR(condition, eptr, action) \
+ if ((condition) && ((eptr) & 0xfc00) == 0xdc00) action
+
+#endif
+
+#endif /* COMPILE_PCRE8 */
+
+#endif /* SUPPORT_UTF */
+
+
/* In case there is no definition of offsetof() provided - though any proper
Standard C system should have one. */
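
The UTF-16 macros added in the hunk above combine a high and a low surrogate into one code point. The sketch below restates GETCHARINC and BACKCHAR as functions to show the arithmetic; demo_getcharinc and demo_backchar are hypothetical names, and the input is assumed to be well-formed UTF-16 (the macros themselves do no validation either).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Read one code point and advance *pp past it. A unit in 0xd800..0xdbff
   is a high surrogate and is combined with the unit that follows. */
static uint32_t demo_getcharinc(const uint16_t **pp)
{
  const uint16_t *p;
  uint32_t c;
  assert(pp != NULL && *pp != NULL);
  p = *pp;
  c = *p++;
  if ((c & 0xfc00) == 0xd800)
    c = (((c & 0x3ff) << 10) | (*p++ & 0x3ff)) + 0x10000;
  *pp = p;
  return c;
}

/* If p points at a low surrogate (0xdc00..0xdfff), step back one unit to
   the start of the character; otherwise return p unchanged. */
static const uint16_t *demo_backchar(const uint16_t *p)
{
  assert(p != NULL);
  if ((*p & 0xfc00) == 0xdc00) p--;
  return p;
}
```

Because a UTF-16 character is at most two units long, BACKCHAR and FORWARDCHAR need a single `if` here, whereas the UTF-8 versions need a `while` loop.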

@@ -588,13 +841,21 @@
the restrictions on partial matching have been lifted. It remains for backwards
compatibility. */

-#define PCRE_NOPARTIAL     0x0001  /* can't use partial with this regex */
-#define PCRE_FIRSTSET      0x0002  /* first_byte is set */
-#define PCRE_REQCHSET      0x0004  /* req_byte is set */
-#define PCRE_STARTLINE     0x0008  /* start after \n for multiline */
-#define PCRE_JCHANGED      0x0010  /* j option used in regex */
-#define PCRE_HASCRORLF     0x0020  /* explicit \r or \n in pattern */
-#define PCRE_HASTHEN       0x0040  /* pattern contains (*THEN) */
+#ifdef COMPILE_PCRE8
+#define PCRE_MODE          0x0001  /* compiled in 8 bit mode */
+#endif
+#ifdef COMPILE_PCRE16
+#define PCRE_MODE          0x0002  /* compiled in 16 bit mode */
+#endif
+#define PCRE_FIRSTSET      0x0010  /* first_char is set */
+#define PCRE_FCH_CASELESS  0x0020  /* caseless first char */
+#define PCRE_REQCHSET      0x0040  /* req_char is set */
+#define PCRE_RCH_CASELESS  0x0080  /* caseless requested char */
+#define PCRE_STARTLINE     0x0100  /* start after \n for multiline */
+#define PCRE_NOPARTIAL     0x0200  /* can't use partial with this regex */
+#define PCRE_JCHANGED      0x0400  /* j option used in regex */
+#define PCRE_HASCRORLF     0x0800  /* explicit \r or \n in pattern */
+#define PCRE_HASTHEN       0x1000  /* pattern contains (*THEN) */

/* Flags for the "extra" block produced by pcre_study(). */

@@ -628,11 +889,15 @@
 #define PUBLIC_STUDY_OPTIONS \
    PCRE_STUDY_JIT_COMPILE

-/* Magic number to provide a small check against being handed junk. Also used
-to detect whether a pattern was compiled on a host of different endianness. */
+/* Magic number to provide a small check against being handed junk. */

#define MAGIC_NUMBER 0x50435245UL /* 'PCRE' */

+/* This value is used to detect a loaded regular expression that was
+compiled on a host of different endianness. */
+
+#define REVERSED_MAGIC_NUMBER 0x45524350UL /* 'ERCP' */
+
/* Negative values for the firstchar and reqchar variables */

#define REQ_UNSET (-2)
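
A small sketch of why REVERSED_MAGIC_NUMBER is simply MAGIC_NUMBER with its four bytes swapped, and how a loader might tell the three cases apart. The helpers demo_swap32 and demo_classify_magic are hypothetical and are not taken from the PCRE sources; the return codes are likewise invented for this example.

```c
#include <assert.h>
#include <stdint.h>

#define MAGIC_NUMBER          0x50435245UL  /* 'PCRE' */
#define REVERSED_MAGIC_NUMBER 0x45524350UL  /* 'ERCP' */

/* Reverse the byte order of a 32-bit value. */
static uint32_t demo_swap32(uint32_t v)
{
  return ((v & 0x000000ffu) << 24) |
         ((v & 0x0000ff00u) << 8)  |
         ((v & 0x00ff0000u) >> 8)  |
         ((v & 0xff000000u) >> 24);
}

/* Hypothetical classifier: 0 = native byte order, 1 = compiled on a host
   of the opposite endianness, -1 = not a compiled pattern at all. */
static int demo_classify_magic(uint32_t magic)
{
  if (magic == MAGIC_NUMBER) return 0;
  if (magic == REVERSED_MAGIC_NUMBER) return 1;
  return -1;
}
```

Having a dedicated constant lets the check be a plain comparison instead of a byte swap at every call.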
@@ -643,12 +908,6 @@

#define REQ_BYTE_MAX 1000

-/* Flags added to firstbyte or reqbyte; a "non-literal" item is either a
-variable-length repeat, or a anything other than literal characters. */
-
-#define REQ_CASELESS 0x0100    /* indicates caselessness */
-#define REQ_VARY     0x0200    /* reqbyte followed non-literal item */
-
 /* Miscellaneous definitions. The #ifndef is to pacify compiler warnings in
 environments where these macros are defined elsewhere. Unfortunately, there
 is no way to do the same for the typedef. */
@@ -677,7 +936,7 @@
 application that did need both could compile two versions of the library, using
 macros to give the functions distinct names. */

-#ifndef SUPPORT_UTF8
+#ifndef SUPPORT_UTF

 /* UTF-8 support is not enabled; use the platform-dependent character literals
 so that PCRE works on both ASCII and EBCDIC platforms, in non-UTF-mode only. */
@@ -937,11 +1196,16 @@
 #define STRING_ANYCRLF_RIGHTPAR        "ANYCRLF)"
 #define STRING_BSR_ANYCRLF_RIGHTPAR    "BSR_ANYCRLF)"
 #define STRING_BSR_UNICODE_RIGHTPAR    "BSR_UNICODE)"
-#define STRING_UTF8_RIGHTPAR           "UTF8)"
+#ifdef COMPILE_PCRE8
+#define STRING_UTF_RIGHTPAR            "UTF8)"
+#endif
+#ifdef COMPILE_PCRE16
+#define STRING_UTF_RIGHTPAR            "UTF16)"
+#endif
 #define STRING_UCP_RIGHTPAR            "UCP)"
 #define STRING_NO_START_OPT_RIGHTPAR   "NO_START_OPT)"

-#else /* SUPPORT_UTF8 */
+#else /* SUPPORT_UTF */

 /* UTF-8 support is enabled; always use UTF-8 (=ASCII) character codes. This
 works in both modes on non-EBCDIC platforms, and on EBCDIC platforms in UTF-8 mode
@@ -1192,11 +1456,16 @@
 #define STRING_ANYCRLF_RIGHTPAR        STR_A STR_N STR_Y STR_C STR_R STR_L STR_F STR_RIGHT_PARENTHESIS
 #define STRING_BSR_ANYCRLF_RIGHTPAR    STR_B STR_S STR_R STR_UNDERSCORE STR_A STR_N STR_Y STR_C STR_R STR_L STR_F STR_RIGHT_PARENTHESIS
 #define STRING_BSR_UNICODE_RIGHTPAR    STR_B STR_S STR_R STR_UNDERSCORE STR_U STR_N STR_I STR_C STR_O STR_D STR_E STR_RIGHT_PARENTHESIS
-#define STRING_UTF8_RIGHTPAR           STR_U STR_T STR_F STR_8 STR_RIGHT_PARENTHESIS
+#ifdef COMPILE_PCRE8
+#define STRING_UTF_RIGHTPAR            STR_U STR_T STR_F STR_8 STR_RIGHT_PARENTHESIS
+#endif
+#ifdef COMPILE_PCRE16
+#define STRING_UTF_RIGHTPAR            STR_U STR_T STR_F STR_1 STR_6 STR_RIGHT_PARENTHESIS
+#endif
 #define STRING_UCP_RIGHTPAR            STR_U STR_C STR_P STR_RIGHT_PARENTHESIS
 #define STRING_NO_START_OPT_RIGHTPAR   STR_N STR_O STR_UNDERSCORE STR_S STR_T STR_A STR_R STR_T STR_UNDERSCORE STR_O STR_P STR_T STR_RIGHT_PARENTHESIS

-#endif /* SUPPORT_UTF8 */
+#endif /* SUPPORT_UTF */

/* Escape items that are just an encoding of a particular data value. */

@@ -1236,7 +1505,7 @@
 #define PT_WORD       8    /* Word - L plus N plus underscore */

/* Flag bits and data types for the extended class (OP_XCLASS) for classes that
-contain UTF-8 characters with values greater than 255. */
+contain characters with values greater than 255. */

 #define XCL_NOT    0x01    /* Flag: this is a negative class */
 #define XCL_MAP    0x02    /* Flag: a 32-byte map is present */
@@ -1252,7 +1521,7 @@
 their negation. Also, they must appear in the same order as in the opcode
 definitions below, up to ESC_z. There's a dummy for OP_ALLANY because it
 corresponds to "." in DOTALL mode rather than an escape sequence. It is also
-used for [^] in JavaScript compatibility mode, and for \C in non-utf8 mode. In
+used for [^] in JavaScript compatibility mode, and for \C in non-utf mode. In
 non-DOTALL mode, "." behaves like \N.

 The special values ESC_DU, ESC_du, etc. are used instead of ESC_D, ESC_d, etc.
@@ -1433,8 +1702,8 @@
   OP_CLASS,          /* 106 Match a character class, chars < 256 only */
   OP_NCLASS,         /* 107 Same, but the bitmap was created from a negative
                               class - the difference is relevant only when a
-                              UTF-8 character > 255 is encountered. */
-  OP_XCLASS,         /* 108 Extended class for handling UTF-8 chars within the
+                              character > 255 is encountered. */
+  OP_XCLASS,         /* 108 Extended class for handling > 255 chars within the
                               class. This does both positive and negative. */
   OP_REF,            /* 109 Match a back reference, casefully */
   OP_REFI,           /* 110 Match a back reference, caselessly */
@@ -1591,30 +1860,35 @@
   2,                             /* noti                                   */ \
   /* Positive single-char repeats                             ** These are */ \
   2, 2, 2, 2, 2, 2,              /* *, *?, +, +?, ?, ??       ** minima in */ \
-  4, 4, 4,                       /* upto, minupto, exact      ** mode      */ \
-  2, 2, 2, 4,                    /* *+, ++, ?+, upto+                      */ \
+  2+IMM2_SIZE, 2+IMM2_SIZE,      /* upto, minupto             ** mode      */ \
+  2+IMM2_SIZE,                   /* exact                                  */ \
+  2, 2, 2, 2+IMM2_SIZE,          /* *+, ++, ?+, upto+                      */ \
   2, 2, 2, 2, 2, 2,              /* *I, *?I, +I, +?I, ?I, ??I ** UTF-8     */ \
-  4, 4, 4,                       /* upto I, minupto I, exact I             */ \
-  2, 2, 2, 4,                    /* *+I, ++I, ?+I, upto+I                  */ \
+  2+IMM2_SIZE, 2+IMM2_SIZE,      /* upto I, minupto I                      */ \
+  2+IMM2_SIZE,                   /* exact I                                */ \
+  2, 2, 2, 2+IMM2_SIZE,          /* *+I, ++I, ?+I, upto+I                  */ \
   /* Negative single-char repeats - only for chars < 256                   */ \
   2, 2, 2, 2, 2, 2,              /* NOT *, *?, +, +?, ?, ??                */ \
-  4, 4, 4,                       /* NOT upto, minupto, exact               */ \
-  2, 2, 2, 4,                    /* Possessive NOT *, +, ?, upto           */ \
+  2+IMM2_SIZE, 2+IMM2_SIZE,      /* NOT upto, minupto                      */ \
+  2+IMM2_SIZE,                   /* NOT exact                              */ \
+  2, 2, 2, 2+IMM2_SIZE,          /* Possessive NOT *, +, ?, upto           */ \
   2, 2, 2, 2, 2, 2,              /* NOT *I, *?I, +I, +?I, ?I, ??I          */ \
-  4, 4, 4,                       /* NOT upto I, minupto I, exact I         */ \
-  2, 2, 2, 4,                    /* Possessive NOT *I, +I, ?I, upto I      */ \
+  2+IMM2_SIZE, 2+IMM2_SIZE,      /* NOT upto I, minupto I                  */ \
+  2+IMM2_SIZE,                   /* NOT exact I                            */ \
+  2, 2, 2, 2+IMM2_SIZE,          /* Possessive NOT *I, +I, ?I, upto I      */ \
   /* Positive type repeats                                                 */ \
   2, 2, 2, 2, 2, 2,              /* Type *, *?, +, +?, ?, ??               */ \
-  4, 4, 4,                       /* Type upto, minupto, exact              */ \
-  2, 2, 2, 4,                    /* Possessive *+, ++, ?+, upto+           */ \
+  2+IMM2_SIZE, 2+IMM2_SIZE,      /* Type upto, minupto                     */ \
+  2+IMM2_SIZE,                   /* Type exact                             */ \
+  2, 2, 2, 2+IMM2_SIZE,          /* Possessive *+, ++, ?+, upto+           */ \
   /* Character class & ref repeats                                         */ \
   1, 1, 1, 1, 1, 1,              /* *, *?, +, +?, ?, ??                    */ \
-  5, 5,                          /* CRRANGE, CRMINRANGE                    */ \
- 33,                             /* CLASS                                  */ \
- 33,                             /* NCLASS                                 */ \
+  1+2*IMM2_SIZE, 1+2*IMM2_SIZE,  /* CRRANGE, CRMINRANGE                    */ \
+  1+(32/sizeof(pcre_uchar)),     /* CLASS                                  */ \
+  1+(32/sizeof(pcre_uchar)),     /* NCLASS                                 */ \
   0,                             /* XCLASS - variable length               */ \
-  3,                             /* REF                                    */ \
-  3,                             /* REFI                                   */ \
+  1+IMM2_SIZE,                   /* REF                                    */ \
+  1+IMM2_SIZE,                   /* REFI                                   */ \
   1+LINK_SIZE,                   /* RECURSE                                */ \
   2+2*LINK_SIZE,                 /* CALLOUT                                */ \
   1+LINK_SIZE,                   /* Alt                                    */ \
@@ -1631,23 +1905,23 @@
   1+LINK_SIZE,                   /* ONCE_NC                                */ \
   1+LINK_SIZE,                   /* BRA                                    */ \
   1+LINK_SIZE,                   /* BRAPOS                                 */ \
-  3+LINK_SIZE,                   /* CBRA                                   */ \
-  3+LINK_SIZE,                   /* CBRAPOS                                */ \
+  1+LINK_SIZE+IMM2_SIZE,         /* CBRA                                   */ \
+  1+LINK_SIZE+IMM2_SIZE,         /* CBRAPOS                                */ \
   1+LINK_SIZE,                   /* COND                                   */ \
   1+LINK_SIZE,                   /* SBRA                                   */ \
   1+LINK_SIZE,                   /* SBRAPOS                                */ \
-  3+LINK_SIZE,                   /* SCBRA                                  */ \
-  3+LINK_SIZE,                   /* SCBRAPOS                               */ \
+  1+LINK_SIZE+IMM2_SIZE,         /* SCBRA                                  */ \
+  1+LINK_SIZE+IMM2_SIZE,         /* SCBRAPOS                               */ \
   1+LINK_SIZE,                   /* SCOND                                  */ \
-  3, 3,                          /* CREF, NCREF                            */ \
-  3, 3,                          /* RREF, NRREF                            */ \
+  1+IMM2_SIZE, 1+IMM2_SIZE,      /* CREF, NCREF                            */ \
+  1+IMM2_SIZE, 1+IMM2_SIZE,      /* RREF, NRREF                            */ \
   1,                             /* DEF                                    */ \
   1, 1, 1,                       /* BRAZERO, BRAMINZERO, BRAPOSZERO        */ \
   3, 1, 3,                       /* MARK, PRUNE, PRUNE_ARG                 */ \
   1, 3,                          /* SKIP, SKIP_ARG                         */ \
   1, 3,                          /* THEN, THEN_ARG                         */ \
   1, 1, 1, 1,                    /* COMMIT, FAIL, ACCEPT, ASSERT_ACCEPT    */ \
-  3, 1                           /* CLOSE, SKIPZERO  */
+  1+IMM2_SIZE, 1                 /* CLOSE, SKIPZERO                        */

 /* A magic value for OP_RREF and OP_NRREF to indicate the "any recursion"
 condition. */
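
The opcode length table above is now expressed in code units rather than bytes. As a rough check of the arithmetic, the sketch below evaluates three of the new expressions for a given unit size and IMM2_SIZE; the function names are hypothetical and only restate formulas that appear in the table.

```c
#include <assert.h>
#include <stddef.h>

/* OP_CLASS / OP_NCLASS: one opcode unit plus a 32-byte bitmap, measured
   in code units of unit_size bytes (1 + 32/sizeof(pcre_uchar)). */
static size_t demo_class_length(size_t unit_size)
{
  assert(unit_size == 1 || unit_size == 2);
  return 1 + 32 / unit_size;
}

/* upto, minupto, exact: opcode, one character, one 2-byte immediate
   (2 + IMM2_SIZE). */
static size_t demo_upto_length(size_t imm2_size)
{
  return 2 + imm2_size;
}

/* CRRANGE, CRMINRANGE: opcode plus two immediates (1 + 2*IMM2_SIZE). */
static size_t demo_crrange_length(size_t imm2_size)
{
  return 1 + 2 * imm2_size;
}
```

For the 8-bit library (unit size 1, IMM2_SIZE 2) these give 33, 4 and 5, matching the literal values that the diff removes; for the 16-bit library (unit size 2, IMM2_SIZE 1) they give 17, 3 and 3.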
@@ -1665,7 +1939,7 @@
        ERR40, ERR41, ERR42, ERR43, ERR44, ERR45, ERR46, ERR47, ERR48, ERR49,
        ERR50, ERR51, ERR52, ERR53, ERR54, ERR55, ERR56, ERR57, ERR58, ERR59,
        ERR60, ERR61, ERR62, ERR63, ERR64, ERR65, ERR66, ERR67, ERR68, ERR69,
-       ERR70, ERRCOUNT };
+       ERR70, ERR71, ERR72, ERR73, ERRCOUNT };

 /* The real format of the start of the pcre block; the index of names and the
 code vector run on as long as necessary after the end. We store an explicit
@@ -1692,15 +1966,15 @@
   pcre_uint16 dummy1;             /* For future use */
   pcre_uint16 top_bracket;
   pcre_uint16 top_backref;
-  pcre_uint16 first_byte;
-  pcre_uint16 req_byte;
+  pcre_uint16 first_char;         /* Starting character */
+  pcre_uint16 req_char;           /* This character must be seen */
   pcre_uint16 name_table_offset;  /* Offset to name table that follows */
   pcre_uint16 name_entry_size;    /* Size of any name items */
   pcre_uint16 name_count;         /* Number of name items */
   pcre_uint16 ref_count;          /* Reference count */

-  const unsigned char *tables;    /* Pointer to tables or NULL for std */
-  const unsigned char *nullpad;   /* NULL padding */
+  const pcre_uint8 *tables;       /* Pointer to tables or NULL for std */
+  const pcre_uint8 *nullpad;      /* NULL padding */
 } real_pcre;

 /* The format of the block used to store data from pcre_study(). The same
@@ -1709,7 +1983,7 @@
 typedef struct pcre_study_data {
   pcre_uint32 size;               /* Total that was malloced */
   pcre_uint32 flags;              /* Private flags */
-  uschar start_bits[32];          /* Starting char bits */
+  pcre_uint8 start_bits[32];      /* Starting char bits */
   pcre_uint32 minlength;          /* Minimum subject length */
 } pcre_study_data;

@@ -1728,32 +2002,33 @@
doing the compiling, so that they are thread-safe. */

 typedef struct compile_data {
-  const uschar *lcc;            /* Points to lower casing table */
-  const uschar *fcc;            /* Points to case-flipping table */
-  const uschar *cbits;          /* Points to character type table */
-  const uschar *ctypes;         /* Points to table of type maps */
-  const uschar *start_workspace;/* The start of working space */
-  const uschar *start_code;     /* The start of the compiled code */
-  const uschar *start_pattern;  /* The start of the pattern */
-  const uschar *end_pattern;    /* The end of the pattern */
-  open_capitem *open_caps;      /* Chain of open capture items */
-  uschar *hwm;                  /* High watermark of workspace */
-  uschar *name_table;           /* The name/number table */
-  int  names_found;             /* Number of entries so far */
-  int  name_entry_size;         /* Size of each entry */
-  int  bracount;                /* Count of capturing parens as we compile */
-  int  final_bracount;          /* Saved value after first pass */
-  int  top_backref;             /* Maximum back reference */
-  unsigned int backref_map;     /* Bitmap of low back refs */
-  int  assert_depth;            /* Depth of nested assertions */
-  int  external_options;        /* External (initial) options */
-  int  external_flags;          /* External flag bits to be set */
-  int  req_varyopt;             /* "After variable item" flag for reqbyte */
-  BOOL had_accept;              /* (*ACCEPT) encountered */
-  BOOL check_lookbehind;        /* Lookbehinds need later checking */
-  int  nltype;                  /* Newline type */
-  int  nllen;                   /* Newline string length */
-  uschar nl[4];                 /* Newline string when fixed length */
+  const pcre_uint8 *lcc;            /* Points to lower casing table */
+  const pcre_uint8 *fcc;            /* Points to case-flipping table */
+  const pcre_uint8 *cbits;          /* Points to character type table */
+  const pcre_uint8 *ctypes;         /* Points to table of type maps */
+  const pcre_uchar *start_workspace;/* The start of working space */
+  const pcre_uchar *start_code;     /* The start of the compiled code */
+  const pcre_uchar *start_pattern;  /* The start of the pattern */
+  const pcre_uchar *end_pattern;    /* The end of the pattern */
+  open_capitem *open_caps;          /* Chain of open capture items */
+  pcre_uchar *hwm;                  /* High watermark of workspace */
+  pcre_uchar *name_table;           /* The name/number table */
+  int  names_found;                 /* Number of entries so far */
+  int  name_entry_size;             /* Size of each entry */
+  int  workspace_size;              /* Size of workspace */
+  int  bracount;                    /* Count of capturing parens as we compile */
+  int  final_bracount;              /* Saved value after first pass */
+  int  top_backref;                 /* Maximum back reference */
+  unsigned int backref_map;         /* Bitmap of low back refs */
+  int  assert_depth;                /* Depth of nested assertions */
+  int  external_options;            /* External (initial) options */
+  int  external_flags;              /* External flag bits to be set */
+  int  req_varyopt;                 /* "After variable item" flag for reqbyte */
+  BOOL had_accept;                  /* (*ACCEPT) encountered */
+  BOOL check_lookbehind;            /* Lookbehinds need later checking */
+  int  nltype;                      /* Newline type */
+  int  nllen;                       /* Newline string length */
+  pcre_uchar nl[4];                 /* Newline string when fixed length */
 } compile_data;

/* Structure for maintaining a chain of pointers to the currently incomplete
@@ -1761,7 +2036,7 @@

typedef struct branch_chain {
struct branch_chain *outer;
- uschar *current_branch;
+ pcre_uchar *current_branch;
} branch_chain;

 /* Structure for items in a linked list that represents an explicit recursive
@@ -1772,7 +2047,7 @@
   int group_num;                  /* Number of group that was called */
   int *offset_save;               /* Pointer to start of saved offsets */
   int saved_max;                  /* Number of saved offsets */
-  USPTR subject_position;         /* Position at start of recursion */
+  PCRE_PUCHAR subject_position;   /* Position at start of recursion */
 } recursion_info;

/* A similar structure for pcre_dfa_exec(). */
@@ -1780,7 +2055,7 @@
typedef struct dfa_recursion_info {
struct dfa_recursion_info *prevrec;
int group_num;
- USPTR subject_position;
+ PCRE_PUCHAR subject_position;
} dfa_recursion_info;

/* Structure for building a chain of data for holding the values of the subject
@@ -1790,7 +2065,7 @@

typedef struct eptrblock {
struct eptrblock *epb_prev;
- USPTR epb_saved_eptr;
+ PCRE_PUCHAR epb_saved_eptr;
} eptrblock;

@@ -1801,65 +2076,68 @@
   unsigned long int match_call_count;      /* As it says */
   unsigned long int match_limit;           /* As it says */
   unsigned long int match_limit_recursion; /* As it says */
-  int   *offset_vector;         /* Offset vector */
-  int    offset_end;            /* One past the end */
-  int    offset_max;            /* The maximum usable for return data */
-  int    nltype;                /* Newline type */
-  int    nllen;                 /* Newline string length */
-  int    name_count;            /* Number of names in name table */
-  int    name_entry_size;       /* Size of entry in names table */
-  uschar *name_table;           /* Table of names */
-  uschar nl[4];                 /* Newline string when fixed */
-  const  uschar *lcc;           /* Points to lower casing table */
-  const  uschar *ctypes;        /* Points to table of type maps */
-  BOOL   offset_overflow;       /* Set if too many extractions */
-  BOOL   notbol;                /* NOTBOL flag */
-  BOOL   noteol;                /* NOTEOL flag */
-  BOOL   utf8;                  /* UTF8 flag */
-  BOOL   jscript_compat;        /* JAVASCRIPT_COMPAT flag */
-  BOOL   use_ucp;               /* PCRE_UCP flag */
-  BOOL   endonly;               /* Dollar not before final \n */
-  BOOL   notempty;              /* Empty string match not wanted */
-  BOOL   notempty_atstart;      /* Empty string match at start not wanted */
-  BOOL   hitend;                /* Hit the end of the subject at some point */
-  BOOL   bsr_anycrlf;           /* \R is just any CRLF, not full Unicode */
-  BOOL   hasthen;               /* Pattern contains (*THEN) */
-  const  uschar *start_code;    /* For use when recursing */
-  USPTR  start_subject;         /* Start of the subject string */
-  USPTR  end_subject;           /* End of the subject string */
-  USPTR  start_match_ptr;       /* Start of matched string */
-  USPTR  end_match_ptr;         /* Subject position at end match */
-  USPTR  start_used_ptr;        /* Earliest consulted character */
-  int    partial;               /* PARTIAL options */
-  int    end_offset_top;        /* Highwater mark at end of match */
-  int    capture_last;          /* Most recent capture number */
-  int    start_offset;          /* The start offset value */
-  int    match_function_type;   /* Set for certain special calls of MATCH() */
-  eptrblock *eptrchain;         /* Chain of eptrblocks for tail recursions */
-  int    eptrn;                 /* Next free eptrblock */
-  recursion_info *recursive;    /* Linked list of recursion data */
-  void  *callout_data;          /* To pass back to callouts */
-  const  uschar *mark;          /* Mark pointer to pass back */
-  const  uschar *once_target;   /* Where to back up to for atomic groups */
+  int   *offset_vector;           /* Offset vector */
+  int    offset_end;              /* One past the end */
+  int    offset_max;              /* The maximum usable for return data */
+  int    nltype;                  /* Newline type */
+  int    nllen;                   /* Newline string length */
+  int    name_count;              /* Number of names in name table */
+  int    name_entry_size;         /* Size of entry in names table */
+  pcre_uchar *name_table;         /* Table of names */
+  pcre_uchar nl[4];               /* Newline string when fixed */
+  const  pcre_uint8 *lcc;         /* Points to lower casing table */
+  const  pcre_uint8 *fcc;         /* Points to case-flipping table */
+  const  pcre_uint8 *ctypes;      /* Points to table of type maps */
+  BOOL   offset_overflow;         /* Set if too many extractions */
+  BOOL   notbol;                  /* NOTBOL flag */
+  BOOL   noteol;                  /* NOTEOL flag */
+  BOOL   utf;                     /* UTF-8 / UTF-16 flag */
+  BOOL   jscript_compat;          /* JAVASCRIPT_COMPAT flag */
+  BOOL   use_ucp;                 /* PCRE_UCP flag */
+  BOOL   endonly;                 /* Dollar not before final \n */
+  BOOL   notempty;                /* Empty string match not wanted */
+  BOOL   notempty_atstart;        /* Empty string match at start not wanted */
+  BOOL   hitend;                  /* Hit the end of the subject at some point */
+  BOOL   bsr_anycrlf;             /* \R is just any CRLF, not full Unicode */
+  BOOL   hasthen;                 /* Pattern contains (*THEN) */
+  BOOL   ignore_skip_arg;         /* For re-run when SKIP name not found */
+  const  pcre_uchar *start_code;  /* For use when recursing */
+  PCRE_PUCHAR start_subject;      /* Start of the subject string */
+  PCRE_PUCHAR end_subject;        /* End of the subject string */
+  PCRE_PUCHAR start_match_ptr;    /* Start of matched string */
+  PCRE_PUCHAR end_match_ptr;      /* Subject position at end match */
+  PCRE_PUCHAR start_used_ptr;     /* Earliest consulted character */
+  int    partial;                 /* PARTIAL options */
+  int    end_offset_top;          /* Highwater mark at end of match */
+  int    capture_last;            /* Most recent capture number */
+  int    start_offset;            /* The start offset value */
+  int    match_function_type;     /* Set for certain special calls of MATCH() */
+  eptrblock *eptrchain;           /* Chain of eptrblocks for tail recursions */
+  int    eptrn;                   /* Next free eptrblock */
+  recursion_info *recursive;      /* Linked list of recursion data */
+  void  *callout_data;            /* To pass back to callouts */
+  const  pcre_uchar *mark;        /* Mark pointer to pass back on success */
+  const  pcre_uchar *nomatch_mark;/* Mark pointer to pass back on failure */
+  const  pcre_uchar *once_target; /* Where to back up to for atomic groups */
 } match_data;

/* A similar structure is used for the same purpose by the DFA matching
functions. */

 typedef struct dfa_match_data {
-  const uschar *start_code;      /* Start of the compiled pattern */
-  const uschar *start_subject;   /* Start of the subject string */
-  const uschar *end_subject;     /* End of subject string */
-  const uschar *start_used_ptr;  /* Earliest consulted character */
-  const uschar *tables;          /* Character tables */
-  int   start_offset;            /* The start offset value */
-  int   moptions;                /* Match options */
-  int   poptions;                /* Pattern options */
-  int    nltype;                 /* Newline type */
-  int    nllen;                  /* Newline string length */
-  uschar nl[4];                  /* Newline string when fixed */
-  void  *callout_data;           /* To pass back to callouts */
-  dfa_recursion_info *recursive; /* Linked list of recursion data */
+  const pcre_uchar *start_code;     /* Start of the compiled pattern */
+  const pcre_uchar *start_subject ; /* Start of the subject string */
+  const pcre_uchar *end_subject;    /* End of subject string */
+  const pcre_uchar *start_used_ptr; /* Earliest consulted character */
+  const pcre_uint8 *tables;         /* Character tables */
+  int   start_offset;               /* The start offset value */
+  int   moptions;                   /* Match options */
+  int   poptions;                   /* Pattern options */
+  int   nltype;                     /* Newline type */
+  int   nllen;                      /* Newline string length */
+  pcre_uchar nl[4];                 /* Newline string when fixed */
+  void *callout_data;               /* To pass back to callouts */
+  dfa_recursion_info *recursive;    /* Linked list of recursion data */
 } dfa_match_data;

/* Bit definitions for entries in the pcre_ctypes table. */
@@ -1895,6 +2173,20 @@
#define ctypes_offset (cbits_offset + cbit_length)
#define tables_length (ctypes_offset + 256)

+/* Internal function prefix */
+
+#ifdef COMPILE_PCRE8
+#define PUBL(name) pcre_##name
+#define PRIV(name) _pcre_##name
+#else
+#ifdef COMPILE_PCRE16
+#define PUBL(name) pcre16_##name
+#define PRIV(name) _pcre16_##name
+#else
+#error Unsupported compiling mode
+#endif /* COMPILE_PCRE16 */
+#endif /* COMPILE_PCRE8 */
+
/* Layout of the UCP type table that translates property names into types and
codes. Each entry used to point directly to a name, but to reduce the number of
relocations in shared libraries, it now has an offset into a single string
@@ -1912,74 +2204,114 @@
but are not part of the PCRE public API. The data for these tables is in the
pcre_tables.c module. */

-extern const int    _pcre_utf8_table1[];
-extern const int    _pcre_utf8_table2[];
-extern const int    _pcre_utf8_table3[];
-extern const uschar _pcre_utf8_table4[];
+#ifdef COMPILE_PCRE8

-#ifdef SUPPORT_JIT
-extern const uschar _pcre_utf8_char_sizes[];
-#endif
+extern const int            PRIV(utf8_table1)[];
+extern const int            PRIV(utf8_table1_size);
+extern const int            PRIV(utf8_table2)[];
+extern const int            PRIV(utf8_table3)[];
+extern const pcre_uint8     PRIV(utf8_table4)[];

-extern const int    _pcre_utf8_table1_size;
+#endif /* COMPILE_PCRE8 */

-extern const char   _pcre_utt_names[];
-extern const ucp_type_table _pcre_utt[];
-extern const int _pcre_utt_size;
+extern const char           PRIV(utt_names)[];
+extern const ucp_type_table PRIV(utt)[];
+extern const int            PRIV(utt_size);

-extern const uschar _pcre_default_tables[];
+extern const pcre_uint8     PRIV(default_tables)[];

-extern const uschar _pcre_OP_lengths[];
+extern const pcre_uint8     PRIV(OP_lengths)[];

/* Internal shared functions. These are functions that are used by more than
one of the exported public functions. They have to be "external" in the C
sense, but are not part of the PCRE public API. */

-extern const uschar *_pcre_find_bracket(const uschar *, BOOL, int);
-extern BOOL          _pcre_is_newline(USPTR, int, USPTR, int *, BOOL);
-extern int           _pcre_ord2utf8(int, uschar *);
-extern real_pcre    *_pcre_try_flipped(const real_pcre *, real_pcre *,
-                       const pcre_study_data *, pcre_study_data *);
-extern int           _pcre_valid_utf8(USPTR, int, int *);
-extern BOOL          _pcre_was_newline(USPTR, int, USPTR, int *, BOOL);
-extern BOOL          _pcre_xclass(int, const uschar *);
+/* String comparison functions. */
+#ifdef COMPILE_PCRE8

+#define STRCMP_UC_UC(str1, str2) \
+  strcmp((char *)(str1), (char *)(str2))
+#define STRCMP_UC_C8(str1, str2) \
+  strcmp((char *)(str1), (str2))
+#define STRNCMP_UC_UC(str1, str2, num) \
+  strncmp((char *)(str1), (char *)(str2), (num))
+#define STRNCMP_UC_C8(str1, str2, num) \
+  strncmp((char *)(str1), (str2), (num))
+#define STRLEN_UC(str) strlen((const char *)str)
+
+#else
+
+extern int               PRIV(strcmp_uc_uc)(const pcre_uchar *,
+                           const pcre_uchar *);
+extern int               PRIV(strcmp_uc_c8)(const pcre_uchar *,
+                           const char *);
+extern int               PRIV(strncmp_uc_uc)(const pcre_uchar *,
+                           const pcre_uchar *, unsigned int num);
+extern int               PRIV(strncmp_uc_c8)(const pcre_uchar *,
+                           const char *, unsigned int num);
+extern unsigned int      PRIV(strlen_uc)(const pcre_uchar *str);
+
+#define STRCMP_UC_UC(str1, str2) \
+  PRIV(strcmp_uc_uc)((str1), (str2))
+#define STRCMP_UC_C8(str1, str2) \
+  PRIV(strcmp_uc_c8)((str1), (str2))
+#define STRNCMP_UC_UC(str1, str2, num) \
+  PRIV(strncmp_uc_uc)((str1), (str2), (num))
+#define STRNCMP_UC_C8(str1, str2, num) \
+  PRIV(strncmp_uc_c8)((str1), (str2), (num))
+#define STRLEN_UC(str) PRIV(strlen_uc)(str)
+
+#endif /* COMPILE_PCRE8 */
+
+extern const pcre_uchar *PRIV(find_bracket)(const pcre_uchar *, BOOL, int);
+extern BOOL              PRIV(is_newline)(PCRE_PUCHAR, int, PCRE_PUCHAR,
+                           int *, BOOL);
+extern int               PRIV(ord2utf)(pcre_uint32, pcre_uchar *);
+extern int               PRIV(valid_utf)(PCRE_PUCHAR, int, int *);
+extern BOOL              PRIV(was_newline)(PCRE_PUCHAR, int, PCRE_PUCHAR,
+                           int *, BOOL);
+extern BOOL              PRIV(xclass)(int, const pcre_uchar *, BOOL);
+
 #ifdef SUPPORT_JIT
-extern void          _pcre_jit_compile(const real_pcre *, pcre_extra *);
-extern int           _pcre_jit_exec(const real_pcre *, void *, PCRE_SPTR,
-                        int, int, int, int, int *, int);
-extern void          _pcre_jit_free(void *);
+extern void              PRIV(jit_compile)(const real_pcre *, pcre_extra *);
+extern int               PRIV(jit_exec)(const real_pcre *, void *,
+                           const pcre_uchar *, int, int, int, int, int *, int);
+extern void              PRIV(jit_free)(void *);
+extern int               PRIV(jit_get_size)(void *);
 #endif

/* Unicode character database (UCD) */

typedef struct {
- uschar script;
- uschar chartype;
+ pcre_uint8 script;
+ pcre_uint8 chartype;
pcre_int32 other_case;
} ucd_record;

-extern const ucd_record  _pcre_ucd_records[];
-extern const uschar      _pcre_ucd_stage1[];
-extern const pcre_uint16 _pcre_ucd_stage2[];
-extern const int         _pcre_ucp_gentype[];
+extern const ucd_record  PRIV(ucd_records)[];
+extern const pcre_uint8  PRIV(ucd_stage1)[];
+extern const pcre_uint16 PRIV(ucd_stage2)[];
+extern const int         PRIV(ucp_gentype)[];
 #ifdef SUPPORT_JIT
-extern const int         _pcre_ucp_typerange[];
+extern const int         PRIV(ucp_typerange)[];
 #endif

+#ifdef SUPPORT_UCP
/* UCD access macros */

 #define UCD_BLOCK_SIZE 128
-#define GET_UCD(ch) (_pcre_ucd_records + \
-        _pcre_ucd_stage2[_pcre_ucd_stage1[(ch) / UCD_BLOCK_SIZE] * \
+#define GET_UCD(ch) (PRIV(ucd_records) + \
+        PRIV(ucd_stage2)[PRIV(ucd_stage1)[(ch) / UCD_BLOCK_SIZE] * \
         UCD_BLOCK_SIZE + (ch) % UCD_BLOCK_SIZE])

 #define UCD_CHARTYPE(ch)  GET_UCD(ch)->chartype
 #define UCD_SCRIPT(ch)    GET_UCD(ch)->script
-#define UCD_CATEGORY(ch)  _pcre_ucp_gentype[UCD_CHARTYPE(ch)]
+#define UCD_CATEGORY(ch)  PRIV(ucp_gentype)[UCD_CHARTYPE(ch)]
 #define UCD_OTHERCASE(ch) (ch + GET_UCD(ch)->other_case)

+#endif /* SUPPORT_UCP */
+
#endif

/* End of pcre_internal.h */
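
Editorial note on the pcre_internal.h changes above: the PUBL()/PRIV() macros use preprocessor token pasting, so the same source file produces differently named symbols depending on whether COMPILE_PCRE8 or COMPILE_PCRE16 is defined. The sketch below illustrates only that mechanism. It is not code from this revision. All DEMO_* and demo_* names are hypothetical, and the flag is defaulted to the 16-bit case purely so the fragment compiles on its own.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the build flag; a real build would define
   COMPILE_PCRE8 or COMPILE_PCRE16 on the compiler command line. */
#if !defined COMPILE_PCRE8 && !defined COMPILE_PCRE16
#define COMPILE_PCRE16
#endif

#ifdef COMPILE_PCRE8
typedef unsigned char demo_uchar;                 /* 8-bit code unit */
#define DEMO_PRIV(name) demo8_priv_##name
#else
typedef unsigned short demo_uchar;                /* 16-bit code unit */
#define DEMO_PRIV(name) demo16_priv_##name
#endif

/* Expands to demo8_priv_strlen_uc or demo16_priv_strlen_uc.
   Returns the length in code units, not in bytes. */
static unsigned int DEMO_PRIV(strlen_uc)(const demo_uchar *str)
{
unsigned int len = 0;
assert(str != NULL);
while (*str++ != 0) len++;
return len;
}

/* Compares a code-unit string with a plain 8-bit C string. */
static int DEMO_PRIV(strcmp_uc_c8)(const demo_uchar *str1, const char *str2)
{
assert(str1 != NULL && str2 != NULL);
for (;;)
  {
  demo_uchar c1 = *str1++;
  demo_uchar c2 = (demo_uchar)(unsigned char)*str2++;
  if (c1 != c2) return (c1 > c2) ? 1 : -1;
  if (c1 == 0) return 0;
  }
}

/* Callers use width-neutral wrappers, in the spirit of STRLEN_UC and
   STRCMP_UC_C8 in the diff above. */
#define DEMO_STRLEN_UC(str) DEMO_PRIV(strlen_uc)(str)
#define DEMO_STRCMP_UC_C8(s1, s2) DEMO_PRIV(strcmp_uc_c8)((s1), (s2))
```

Calling DEMO_STRLEN_UC() on a zero-terminated demo_uchar array returns its length in code units for either width. Only the wrapper macros appear at the call site, so shared code does not need to know which library flavour it is being compiled for.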

Modified: code/trunk/pcre_jit_compile.c
===================================================================
--- code/trunk/pcre_jit_compile.c    2011-12-28 16:10:09 UTC (rev 835)
+++ code/trunk/pcre_jit_compile.c    2011-12-28 17:16:11 UTC (rev 836)
@@ -6,10 +6,10 @@
 and semantics are as close as possible to those of the Perl 5 language.

                        Written by Philip Hazel
-           Copyright (c) 1997-2008 University of Cambridge
+           Copyright (c) 1997-2012 University of Cambridge

   The machine code generator part (this module) was written by Zoltan Herczeg
-                      Copyright (c) 2010-2011
+                      Copyright (c) 2010-2012

-----------------------------------------------------------------------------
Redistribution and use in source and binary forms, with or without
@@ -52,8 +52,8 @@
we just include it. This way we don't need to touch the build
system files. */

-#define SLJIT_MALLOC(size) (pcre_malloc)(size)
-#define SLJIT_FREE(ptr) (pcre_free)(ptr)
+#define SLJIT_MALLOC(size) (PUBL(malloc))(size)
+#define SLJIT_FREE(ptr) (PUBL(free))(ptr)
#define SLJIT_CONFIG_AUTO 1
#define SLJIT_CONFIG_STATIC 1
#define SLJIT_VERBOSE 0
@@ -62,7 +62,7 @@
#include "sljit/sljitLir.c"

#if defined SLJIT_CONFIG_UNSUPPORTED && SLJIT_CONFIG_UNSUPPORTED
-#error "Unsupported architecture"
+#error Unsupported architecture
#endif

/* Allocate memory on the stack. Fast, but limited size. */
@@ -148,24 +148,25 @@
typedef struct jit_arguments {
/* Pointers first. */
struct sljit_stack *stack;
- PCRE_SPTR str;
- PCRE_SPTR begin;
- PCRE_SPTR end;
+ const pcre_uchar *str;
+ const pcre_uchar *begin;
+ const pcre_uchar *end;
int *offsets;
- uschar *ptr;
+ pcre_uchar *ptr;
/* Everything else after. */
int offsetcount;
int calllimit;
- uschar notbol;
- uschar noteol;
- uschar notempty;
- uschar notempty_atstart;
+ pcre_uint8 notbol;
+ pcre_uint8 noteol;
+ pcre_uint8 notempty;
+ pcre_uint8 notempty_atstart;
} jit_arguments;

typedef struct executable_function {
void *executable_func;
pcre_jit_callback callback;
void *userdata;
+ sljit_uw executable_size;
} executable_function;

typedef struct jump_list {
@@ -197,7 +198,7 @@
struct fallback_common *top;
jump_list *topfallbacks;
/* Opcode pointer. */
- uschar *cc;
+ pcre_uchar *cc;
} fallback_common;

typedef struct assert_fallback {
@@ -268,10 +269,10 @@

 typedef struct compiler_common {
   struct sljit_compiler *compiler;
-  uschar *start;
+  pcre_uchar *start;
   int localsize;
   int *localptrs;
-  const uschar *fcc;
+  const pcre_uint8 *fcc;
   sljit_w lcc;
   int cbraptr;
   int nltype;
@@ -297,14 +298,16 @@
   jump_list *casefulcmp;
   jump_list *caselesscmp;
   BOOL jscript_compat;
-#ifdef SUPPORT_UTF8
-  BOOL utf8;
+#ifdef SUPPORT_UTF
+  BOOL utf;
 #ifdef SUPPORT_UCP
-  BOOL useucp;
+  BOOL use_ucp;
 #endif
-  jump_list *utf8readchar;
-  jump_list *utf8readtype8;
+  jump_list *utfreadchar;
+#ifdef COMPILE_PCRE8
+  jump_list *utfreadtype8;
 #endif
+#endif /* SUPPORT_UTF */
 #ifdef SUPPORT_UCP
   jump_list *getucd;
 #endif
@@ -316,18 +319,30 @@
   int length;
   int sourcereg;
 #if defined SLJIT_UNALIGNED && SLJIT_UNALIGNED
-  int byteptr;
+  int ucharptr;
   union {
-    int asint;
-    short asshort;
+    sljit_i asint;
+    sljit_h asshort;
+#ifdef COMPILE_PCRE8
     sljit_ub asbyte;
-    sljit_ub asbytes[4];
+    sljit_ub asuchars[4];
+#else
+#ifdef COMPILE_PCRE16
+    sljit_uh asuchars[2];
+#endif
+#endif
   } c;
   union {
-    int asint;
-    short asshort;
+    sljit_i asint;
+    sljit_h asshort;
+#ifdef COMPILE_PCRE8
     sljit_ub asbyte;
-    sljit_ub asbytes[4];
+    sljit_ub asuchars[4];
+#else
+#ifdef COMPILE_PCRE16
+    sljit_uh asuchars[2];
+#endif
+#endif
   } oc;
 #endif
 } compare_context;
@@ -363,7 +378,7 @@
 /* Max limit of recursions. */
 #define CALL_LIMIT       (5 * sizeof(sljit_w))
 /* Last known position of the requested byte. */
-#define REQ_BYTE_PTR     (6 * sizeof(sljit_w))
+#define REQ_CHAR_PTR     (6 * sizeof(sljit_w))
 /* End pointer of the first line. */
 #define FIRSTLINE_END    (7 * sizeof(sljit_w))
 /* The output vector is stored on the stack, and contains pointers
@@ -373,8 +388,20 @@
 #define OVECTOR_START    (8 * sizeof(sljit_w))
 #define OVECTOR(i)       (OVECTOR_START + (i) * sizeof(sljit_w))
 #define OVECTOR_PRIV(i)  (common->cbraptr + (i) * sizeof(sljit_w))
-#define PRIV(cc)         (common->localptrs[(cc) - common->start])
+#define PRIV_DATA(cc)    (common->localptrs[(cc) - common->start])

+#ifdef COMPILE_PCRE8
+#define MOV_UCHAR SLJIT_MOV_UB
+#define MOVU_UCHAR SLJIT_MOVU_UB
+#else
+#ifdef COMPILE_PCRE16
+#define MOV_UCHAR SLJIT_MOV_UH
+#define MOVU_UCHAR SLJIT_MOVU_UH
+#else
+#error Unsupported compiling mode
+#endif
+#endif
+
/* Shortcuts. */
#define DEFINE_COMPILER \
struct sljit_compiler *compiler = common->compiler
@@ -397,7 +424,7 @@
#define COND_VALUE(op, dst, dstw, type) \
sljit_emit_cond_value(compiler, (op), (dst), (dstw), (type))

-static uschar* bracketend(uschar* cc)
+static pcre_uchar* bracketend(pcre_uchar* cc)
{
SLJIT_ASSERT((*cc >= OP_ASSERT && *cc <= OP_ASSERTBACK_NOT) || (*cc >= OP_ONCE && *cc <= OP_SCOND));
do cc += GET(cc, 1); while (*cc == OP_ALT);
@@ -418,7 +445,7 @@
compile_fallbackpath
*/

-static uschar *next_opcode(compiler_common *common, uschar *cc)
+static pcre_uchar *next_opcode(compiler_common *common, pcre_uchar *cc)
{
SLJIT_UNUSED_ARG(common);
switch(*cc)
@@ -474,8 +501,8 @@
return cc + 1;

case OP_ANYBYTE:
-#ifdef SUPPORT_UTF8
- if (common->utf8) return NULL;
+#ifdef SUPPORT_UTF
+ if (common->utf) return NULL;
#endif
return cc + 1;

@@ -483,7 +510,6 @@
case OP_CHARI:
case OP_NOT:
case OP_NOTI:
-
case OP_STAR:
case OP_MINSTAR:
case OP_PLUS:
@@ -521,8 +547,8 @@
case OP_NOTPOSPLUSI:
case OP_NOTPOSQUERYI:
cc += 2;
-#ifdef SUPPORT_UTF8
- if (common->utf8 && cc[-1] >= 0xc0) cc += _pcre_utf8_table4[cc[-1] & 0x3f];
+#ifdef SUPPORT_UTF
+ if (common->utf && HAS_EXTRALEN(cc[-1])) cc += GET_EXTRALEN(cc[-1]);
#endif
return cc;

@@ -542,14 +568,16 @@
case OP_NOTMINUPTOI:
case OP_NOTEXACTI:
case OP_NOTPOSUPTOI:
- cc += 4;
-#ifdef SUPPORT_UTF8
- if (common->utf8 && cc[-1] >= 0xc0) cc += _pcre_utf8_table4[cc[-1] & 0x3f];
+ cc += 2 + IMM2_SIZE;
+#ifdef SUPPORT_UTF
+ if (common->utf && HAS_EXTRALEN(cc[-1])) cc += GET_EXTRALEN(cc[-1]);
#endif
return cc;

case OP_NOTPROP:
case OP_PROP:
+ return cc + 1 + 2;
+
case OP_TYPEUPTO:
case OP_TYPEMINUPTO:
case OP_TYPEEXACT:
@@ -561,18 +589,18 @@
case OP_RREF:
case OP_NRREF:
case OP_CLOSE:
- cc += 3;
+ cc += 1 + IMM2_SIZE;
return cc;

case OP_CRRANGE:
case OP_CRMINRANGE:
- return cc + 5;
+ return cc + 1 + 2 * IMM2_SIZE;

case OP_CLASS:
case OP_NCLASS:
- return cc + 33;
+ return cc + 1 + 32 / sizeof(pcre_uchar);

-#ifdef SUPPORT_UTF8
+#if defined SUPPORT_UTF || !defined COMPILE_PCRE8
case OP_XCLASS:
return cc + GET(cc, 1);
#endif
@@ -602,17 +630,17 @@
case OP_CBRAPOS:
case OP_SCBRA:
case OP_SCBRAPOS:
- return cc + 1 + LINK_SIZE + 2;
+ return cc + 1 + LINK_SIZE + IMM2_SIZE;

default:
return NULL;
}
}

-static int get_localspace(compiler_common *common, uschar *cc, uschar *ccend)
+static int get_localspace(compiler_common *common, pcre_uchar *cc, pcre_uchar *ccend)
 {
 int localspace = 0;
-uschar *alternative;
+pcre_uchar *alternative;
 /* Calculate important variables (like stack size) and checks whether all opcodes are supported. */
 while (cc < ccend)
   {
@@ -635,7 +663,7 @@
     case OP_CBRAPOS:
     case OP_SCBRAPOS:
     localspace += sizeof(sljit_w);
-    cc += 1 + LINK_SIZE + 2;
+    cc += 1 + LINK_SIZE + IMM2_SIZE;
     break;

     case OP_COND:
@@ -656,10 +684,10 @@
 return localspace;
 }

-static void set_localptrs(compiler_common *common, int localptr, uschar *ccend)
+static void set_localptrs(compiler_common *common, int localptr, pcre_uchar *ccend)
 {
-uschar *cc = common->start;
-uschar *alternative;
+pcre_uchar *cc = common->start;
+pcre_uchar *alternative;
 while (cc < ccend)
   {
   switch(*cc)
@@ -683,7 +711,7 @@
     case OP_SCBRAPOS:
     common->localptrs[cc - common->start] = localptr;
     localptr += sizeof(sljit_w);
-    cc += 1 + LINK_SIZE + 2;
+    cc += 1 + LINK_SIZE + IMM2_SIZE;
     break;

     case OP_COND:
@@ -706,9 +734,9 @@
 }

 /* Returns with -1 if no need for frame. */
-static int get_framesize(compiler_common *common, uschar *cc, BOOL recursive)
+static int get_framesize(compiler_common *common, pcre_uchar *cc, BOOL recursive)
 {
-uschar *ccend = bracketend(cc);
+pcre_uchar *ccend = bracketend(cc);
 int length = 0;
 BOOL possessive = FALSE;
 BOOL setsom_found = FALSE;
@@ -739,7 +767,7 @@
     case OP_SCBRA:
     case OP_SCBRAPOS:
     length += 3;
-    cc += 1 + LINK_SIZE + 2;
+    cc += 1 + LINK_SIZE + IMM2_SIZE;
     break;

     default:
@@ -757,10 +785,10 @@
 return -1;
 }

-static void init_frame(compiler_common *common, uschar *cc, int stackpos, int stacktop, BOOL recursive)
+static void init_frame(compiler_common *common, pcre_uchar *cc, int stackpos, int stacktop, BOOL recursive)
{
DEFINE_COMPILER;
-uschar *ccend = bracketend(cc);
+pcre_uchar *ccend = bracketend(cc);
BOOL setsom_found = FALSE;
int offset;

@@ -802,7 +830,7 @@
     OP1(SLJIT_MOV, SLJIT_MEM1(STACK_TOP), stackpos, TMP2, 0);
     stackpos += (int)sizeof(sljit_w);

-    cc += 1 + LINK_SIZE + 2;
+    cc += 1 + LINK_SIZE + IMM2_SIZE;
     break;

     default:
@@ -815,10 +843,10 @@
 SLJIT_ASSERT(stackpos == STACK(stacktop));
 }

-static SLJIT_INLINE int get_localsize(compiler_common *common, uschar *cc, uschar *ccend)
+static SLJIT_INLINE int get_localsize(compiler_common *common, pcre_uchar *cc, pcre_uchar *ccend)
 {
 int localsize = 2;
-uschar *alternative;
+pcre_uchar *alternative;
 /* Calculate the sum of the local variables. */
 while (cc < ccend)
   {
@@ -841,13 +869,13 @@
     case OP_CBRA:
     case OP_SCBRA:
     localsize++;
-    cc += 1 + LINK_SIZE + 2;
+    cc += 1 + LINK_SIZE + IMM2_SIZE;
     break;

     case OP_CBRAPOS:
     case OP_SCBRAPOS:
     localsize += 2;
-    cc += 1 + LINK_SIZE + 2;
+    cc += 1 + LINK_SIZE + IMM2_SIZE;
     break;

     case OP_COND:
@@ -868,7 +896,7 @@
 return localsize;
 }

-static void copy_locals(compiler_common *common, uschar *cc, uschar *ccend,
+static void copy_locals(compiler_common *common, pcre_uchar *cc, pcre_uchar *ccend,
   BOOL save, int stackptr, int stacktop)
 {
 DEFINE_COMPILER;
@@ -877,7 +905,7 @@
 BOOL tmp1next = TRUE;
 BOOL tmp1empty = TRUE;
 BOOL tmp2empty = TRUE;
-uschar *alternative;
+pcre_uchar *alternative;
 enum {
   start,
   loop,
@@ -938,7 +966,7 @@
       case OP_SBRAPOS:
       case OP_SCOND:
       count = 1;
-      srcw[0] = PRIV(cc);
+      srcw[0] = PRIV_DATA(cc);
       SLJIT_ASSERT(srcw[0] != 0);
       cc += 1 + LINK_SIZE;
       break;
@@ -947,16 +975,16 @@
       case OP_SCBRA:
       count = 1;
       srcw[0] = OVECTOR_PRIV(GET2(cc, 1 + LINK_SIZE));
-      cc += 1 + LINK_SIZE + 2;
+      cc += 1 + LINK_SIZE + IMM2_SIZE;
       break;

       case OP_CBRAPOS:
       case OP_SCBRAPOS:
       count = 2;
       srcw[1] = OVECTOR_PRIV(GET2(cc, 1 + LINK_SIZE));
-      srcw[0] = PRIV(cc);
+      srcw[0] = PRIV_DATA(cc);
       SLJIT_ASSERT(srcw[0] != 0);
-      cc += 1 + LINK_SIZE + 2;
+      cc += 1 + LINK_SIZE + IMM2_SIZE;
       break;

       case OP_COND:
@@ -965,7 +993,7 @@
       if (*alternative == OP_KETRMAX || *alternative == OP_KETRMIN)
         {
         count = 1;
-        srcw[0] = PRIV(cc);
+        srcw[0] = PRIV_DATA(cc);
         SLJIT_ASSERT(srcw[0] != 0);
         }
       cc += 1 + LINK_SIZE;
@@ -1173,7 +1201,7 @@
 int i;
 /* At this point we can freely use all temporary registers. */
 /* TMP1 returns with begin - 1. */
-OP2(SLJIT_SUB, SLJIT_TEMPORARY_REG1, 0, SLJIT_MEM1(SLJIT_GENERAL_REG1), SLJIT_OFFSETOF(jit_arguments, begin), SLJIT_IMM, 1);
+OP2(SLJIT_SUB, SLJIT_TEMPORARY_REG1, 0, SLJIT_MEM1(SLJIT_GENERAL_REG1), SLJIT_OFFSETOF(jit_arguments, begin), SLJIT_IMM, IN_UCHARS(1));
 if (length < 8)
   {
   for (i = 0; i < length; i++)
@@ -1211,6 +1239,9 @@
 OP2(SLJIT_SUB, SLJIT_GENERAL_REG2, 0, SLJIT_MEM1(SLJIT_GENERAL_REG1), 0, SLJIT_TEMPORARY_REG1, 0);
 OP2(SLJIT_ADD, SLJIT_GENERAL_REG1, 0, SLJIT_GENERAL_REG1, 0, SLJIT_IMM, sizeof(sljit_w));
 /* Copy the integer value to the output buffer */
+#ifdef COMPILE_PCRE16
+OP2(SLJIT_ASHR, SLJIT_GENERAL_REG2, 0, SLJIT_GENERAL_REG2, 0, SLJIT_IMM, 1);
+#endif
 OP1(SLJIT_MOVU_SI, SLJIT_MEM1(SLJIT_TEMPORARY_REG3), sizeof(int), SLJIT_GENERAL_REG2, 0);
 OP2(SLJIT_SUB | SLJIT_SET_E, SLJIT_TEMPORARY_REG2, 0, SLJIT_TEMPORARY_REG2, 0, SLJIT_IMM, 1);
 JUMPTO(SLJIT_C_NOT_ZERO, loop);
@@ -1233,13 +1264,13 @@
   OP1(SLJIT_MOV, SLJIT_RETURN_REG, 0, SLJIT_IMM, 1);
 }

-static SLJIT_INLINE BOOL char_has_othercase(compiler_common *common, uschar* cc)
+static SLJIT_INLINE BOOL char_has_othercase(compiler_common *common, pcre_uchar* cc)
{
/* Detects if the character has an othercase. */
unsigned int c;

-#ifdef SUPPORT_UTF8
-if (common->utf8)
+#ifdef SUPPORT_UTF
+if (common->utf)
   {
   GETCHAR(c, cc);
   if (c > 127)
@@ -1250,18 +1281,21 @@
     return FALSE;
 #endif
     }
+#ifndef COMPILE_PCRE8
+  return common->fcc[c] != c;
+#endif
   }
 else
 #endif
   c = *cc;
-return common->fcc[c] != c;
+return MAX_255(c) ? common->fcc[c] != c : FALSE;
 }

static SLJIT_INLINE unsigned int char_othercase(compiler_common *common, unsigned int c)
{
/* Returns with the othercase. */
-#ifdef SUPPORT_UTF8
-if (common->utf8 && c > 127)
+#ifdef SUPPORT_UTF
+if (common->utf && c > 127)
{
#ifdef SUPPORT_UCP
return UCD_OTHERCASE(c);
@@ -1270,19 +1304,19 @@
#endif
}
#endif
-return common->fcc[c];
+return TABLE_GET(c, common->fcc, c);
}

-static unsigned int char_get_othercase_bit(compiler_common *common, uschar* cc)
+static unsigned int char_get_othercase_bit(compiler_common *common, pcre_uchar* cc)
{
/* Detects if the character and its othercase has only 1 bit difference. */
unsigned int c, oc, bit;
-#ifdef SUPPORT_UTF8
+#if defined SUPPORT_UTF && defined COMPILE_PCRE8
int n;
#endif

-#ifdef SUPPORT_UTF8
-if (common->utf8)
+#ifdef SUPPORT_UTF
+if (common->utf)
{
GETCHAR(c, cc);
if (c <= 127)
@@ -1299,11 +1333,11 @@
else
{
c = *cc;
- oc = common->fcc[c];
+ oc = TABLE_GET(c, common->fcc, c);
}
#else
c = *cc;
-oc = common->fcc[c];
+oc = TABLE_GET(c, common->fcc, c);
#endif

SLJIT_ASSERT(c != oc);
@@ -1317,10 +1351,12 @@
if (!ispowerof2(bit))
return 0;

-#ifdef SUPPORT_UTF8
-if (common->utf8 && c > 127)
+#ifdef COMPILE_PCRE8
+
+#ifdef SUPPORT_UTF
+if (common->utf && c > 127)
   {
-  n = _pcre_utf8_table4[*cc & 0x3f];
+  n = GET_EXTRALEN(*cc);
   while ((bit & 0x3f) == 0)
     {
     n--;
@@ -1328,8 +1364,25 @@
     }
   return (n << 8) | bit;
   }
-#endif
+#endif /* SUPPORT_UTF */
 return (0 << 8) | bit;
+
+#else /* COMPILE_PCRE8 */
+
+#ifdef COMPILE_PCRE16
+#ifdef SUPPORT_UTF
+if (common->utf && c > 65535)
+  {
+  if (bit >= (1 << 10))
+    bit >>= 10;
+  else
+    return (bit < 256) ? ((2 << 8) | bit) : ((3 << 8) | (bit >> 8));
+  }
+#endif /* SUPPORT_UTF */
+return (bit < 256) ? ((0 << 8) | bit) : ((1 << 8) | (bit >> 8));
+#endif /* COMPILE_PCRE16 */
+
+#endif /* COMPILE_PCRE8 */
 }

static SLJIT_INLINE void check_input_end(compiler_common *common, jump_list **fallbacks)
@@ -1343,20 +1396,26 @@
/* Reads the character into TMP1, updates STR_PTR.
Does not check STR_END. TMP2 Destroyed. */
DEFINE_COMPILER;
-#ifdef SUPPORT_UTF8
+#ifdef SUPPORT_UTF
struct sljit_jump *jump;
#endif

-OP1(SLJIT_MOV_UB, TMP1, 0, SLJIT_MEM1(STR_PTR), 0);
-#ifdef SUPPORT_UTF8
-if (common->utf8)
+OP1(MOV_UCHAR, TMP1, 0, SLJIT_MEM1(STR_PTR), 0);
+#ifdef SUPPORT_UTF
+if (common->utf)
{
+#ifdef COMPILE_PCRE8
jump = CMP(SLJIT_C_LESS, TMP1, 0, SLJIT_IMM, 0xc0);
- add_jump(compiler, &common->utf8readchar, JUMP(SLJIT_FAST_CALL));
+#else
+#ifdef COMPILE_PCRE16
+ jump = CMP(SLJIT_C_LESS, TMP1, 0, SLJIT_IMM, 0xd800);
+#endif
+#endif /* COMPILE_PCRE8 */
+ add_jump(compiler, &common->utfreadchar, JUMP(SLJIT_FAST_CALL));
JUMPHERE(jump);
}
#endif
-OP2(SLJIT_ADD, STR_PTR, 0, STR_PTR, 0, SLJIT_IMM, 1);
+OP2(SLJIT_ADD, STR_PTR, 0, STR_PTR, 0, SLJIT_IMM, IN_UCHARS(1));
}

static void peek_char(compiler_common *common)
@@ -1364,16 +1423,22 @@
/* Reads the character into TMP1, keeps STR_PTR.
Does not check STR_END. TMP2 Destroyed. */
DEFINE_COMPILER;
-#ifdef SUPPORT_UTF8
+#ifdef SUPPORT_UTF
struct sljit_jump *jump;
#endif

-OP1(SLJIT_MOV_UB, TMP1, 0, SLJIT_MEM1(STR_PTR), 0);
-#ifdef SUPPORT_UTF8
-if (common->utf8)
+OP1(MOV_UCHAR, TMP1, 0, SLJIT_MEM1(STR_PTR), 0);
+#ifdef SUPPORT_UTF
+if (common->utf)
{
+#ifdef COMPILE_PCRE8
jump = CMP(SLJIT_C_LESS, TMP1, 0, SLJIT_IMM, 0xc0);
- add_jump(compiler, &common->utf8readchar, JUMP(SLJIT_FAST_CALL));
+#else
+#ifdef COMPILE_PCRE16
+ jump = CMP(SLJIT_C_LESS, TMP1, 0, SLJIT_IMM, 0xd800);
+#endif
+#endif /* COMPILE_PCRE8 */
+ add_jump(compiler, &common->utfreadchar, JUMP(SLJIT_FAST_CALL));
OP2(SLJIT_SUB, STR_PTR, 0, STR_PTR, 0, TMP2, 0);
JUMPHERE(jump);
}
@@ -1384,47 +1449,84 @@
{
/* Reads the character type into TMP1, updates STR_PTR. Does not check STR_END. */
DEFINE_COMPILER;
-#ifdef SUPPORT_UTF8
+#if defined SUPPORT_UTF || defined COMPILE_PCRE16
struct sljit_jump *jump;
#endif

-#ifdef SUPPORT_UTF8
-if (common->utf8)
+#ifdef SUPPORT_UTF
+if (common->utf)
{
- OP1(SLJIT_MOV_UB, TMP2, 0, SLJIT_MEM1(STR_PTR), 0);
- OP2(SLJIT_ADD, STR_PTR, 0, STR_PTR, 0, SLJIT_IMM, 1);
+ OP1(MOV_UCHAR, TMP2, 0, SLJIT_MEM1(STR_PTR), 0);
+ OP2(SLJIT_ADD, STR_PTR, 0, STR_PTR, 0, SLJIT_IMM, IN_UCHARS(1));
+#ifdef COMPILE_PCRE8
/* This can be an extra read in some situations, but hopefully
- it is a clever early read in most cases. */
+ it is needed in most cases. */
OP1(SLJIT_MOV_UB, TMP1, 0, SLJIT_MEM1(TMP2), common->ctypes);
jump = CMP(SLJIT_C_LESS, TMP2, 0, SLJIT_IMM, 0xc0);
- add_jump(compiler, &common->utf8readtype8, JUMP(SLJIT_FAST_CALL));
+ add_jump(compiler, &common->utfreadtype8, JUMP(SLJIT_FAST_CALL));
JUMPHERE(jump);
+#else
+#ifdef COMPILE_PCRE16
+ OP1(SLJIT_MOV, TMP1, 0, SLJIT_IMM, 0);
+ jump = CMP(SLJIT_C_GREATER, TMP2, 0, SLJIT_IMM, 255);
+ OP1(SLJIT_MOV_UB, TMP1, 0, SLJIT_MEM1(TMP2), common->ctypes);
+ JUMPHERE(jump);
+ /* Skip low surrogate if necessary. */
+ OP2(SLJIT_AND, TMP2, 0, TMP2, 0, SLJIT_IMM, 0xfc00);
+ OP2(SLJIT_SUB | SLJIT_SET_E, SLJIT_UNUSED, 0, TMP2, 0, SLJIT_IMM, 0xd800);
+ COND_VALUE(SLJIT_MOV, TMP2, 0, SLJIT_C_EQUAL);
+ OP2(SLJIT_SHL, TMP2, 0, TMP2, 0, SLJIT_IMM, 1);
+ OP2(SLJIT_ADD, STR_PTR, 0, STR_PTR, 0, TMP2, 0);
+#endif
+#endif /* COMPILE_PCRE8 */
return;
}
#endif
-OP1(SLJIT_MOV_UB, TMP1, 0, SLJIT_MEM1(STR_PTR), 0);
-OP2(SLJIT_ADD, STR_PTR, 0, STR_PTR, 0, SLJIT_IMM, 1);
-OP1(SLJIT_MOV_UB, TMP1, 0, SLJIT_MEM1(TMP1), common->ctypes);
+OP1(MOV_UCHAR, TMP2, 0, SLJIT_MEM1(STR_PTR), 0);
+OP2(SLJIT_ADD, STR_PTR, 0, STR_PTR, 0, SLJIT_IMM, IN_UCHARS(1));
+#ifdef COMPILE_PCRE16
+/* The ctypes array contains only 256 values. */
+OP1(SLJIT_MOV, TMP1, 0, SLJIT_IMM, 0);
+jump = CMP(SLJIT_C_GREATER, TMP2, 0, SLJIT_IMM, 255);
+#endif
+OP1(SLJIT_MOV_UB, TMP1, 0, SLJIT_MEM1(TMP2), common->ctypes);
+#ifdef COMPILE_PCRE16
+JUMPHERE(jump);
+#endif
}

static void skip_char_back(compiler_common *common)
{
-/* Goes one character back. Only affects STR_PTR. Does not check begin. */
+/* Goes one character back. Affects STR_PTR and TMP1. Does not check begin. */
DEFINE_COMPILER;
-#ifdef SUPPORT_UTF8
+#if defined SUPPORT_UTF && defined COMPILE_PCRE8
struct sljit_label *label;

-if (common->utf8)
+if (common->utf)
{
label = LABEL();
- OP2(SLJIT_SUB, STR_PTR, 0, STR_PTR, 0, SLJIT_IMM, 1);
- OP1(SLJIT_MOV_UB, TMP1, 0, SLJIT_MEM1(STR_PTR), 0);
+ OP1(MOV_UCHAR, TMP1, 0, SLJIT_MEM1(STR_PTR), -IN_UCHARS(1));
+ OP2(SLJIT_SUB, STR_PTR, 0, STR_PTR, 0, SLJIT_IMM, IN_UCHARS(1));
OP2(SLJIT_AND, TMP1, 0, TMP1, 0, SLJIT_IMM, 0xc0);
CMPTO(SLJIT_C_EQUAL, TMP1, 0, SLJIT_IMM, 0x80, label);
return;
}
#endif
-OP2(SLJIT_SUB, STR_PTR, 0, STR_PTR, 0, SLJIT_IMM, 1);
+#if defined SUPPORT_UTF && defined COMPILE_PCRE16
+if (common->utf)
+ {
+ OP1(MOV_UCHAR, TMP1, 0, SLJIT_MEM1(STR_PTR), -IN_UCHARS(1));
+ OP2(SLJIT_SUB, STR_PTR, 0, STR_PTR, 0, SLJIT_IMM, IN_UCHARS(1));
+ /* Skip low surrogate if necessary. */
+ OP2(SLJIT_AND, TMP1, 0, TMP1, 0, SLJIT_IMM, 0xfc00);
+ OP2(SLJIT_SUB | SLJIT_SET_E, SLJIT_UNUSED, 0, TMP1, 0, SLJIT_IMM, 0xdc00);
+ COND_VALUE(SLJIT_MOV, TMP1, 0, SLJIT_C_EQUAL);
+ OP2(SLJIT_SHL, TMP1, 0, TMP1, 0, SLJIT_IMM, 1);
+ OP2(SLJIT_SUB, STR_PTR, 0, STR_PTR, 0, TMP1, 0);
+ return;
+ }
+#endif
+OP2(SLJIT_SUB, STR_PTR, 0, STR_PTR, 0, SLJIT_IMM, IN_UCHARS(1));
}

static void check_newlinechar(compiler_common *common, int nltype, jump_list **fallbacks, BOOL jumpiftrue)
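
Note on the skip_char_back hunk above: the emitted code steps back over UTF-8 continuation bytes in the 8-bit library and over a low surrogate in the 16-bit library. A minimal plain-C sketch of the same idea follows. The helper names utf8_back and utf16_back are hypothetical and are not part of PCRE; like the JIT code, they do not check for the start of the buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: step back one UTF-8 character from p.
   The caller guarantees that a complete character precedes p. */
static const uint8_t *utf8_back(const uint8_t *p)
{
  assert(p != NULL);
  do
    p--;
  while ((*p & 0xc0) == 0x80);   /* 10xxxxxx is a continuation byte */
  return p;
}

/* Hypothetical helper: step back one UTF-16 character from p. */
static const uint16_t *utf16_back(const uint16_t *p)
{
  assert(p != NULL);
  p--;
  if ((*p & 0xfc00) == 0xdc00)   /* low surrogate: its lead unit precedes it */
    p--;
  return p;
}
```

Both helpers mirror the masks used in the hunk (0xc0/0x80 for UTF-8, 0xfc00/0xdc00 for UTF-16).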
@@ -1447,15 +1549,17 @@
}
else
{
- SLJIT_ASSERT(nltype == NLTYPE_FIXED && common->newline <= 255);
+ SLJIT_ASSERT(nltype == NLTYPE_FIXED && common->newline < 256);
add_jump(compiler, fallbacks, CMP(jumpiftrue ? SLJIT_C_EQUAL : SLJIT_C_NOT_EQUAL, TMP1, 0, SLJIT_IMM, common->newline));
}
}

-#ifdef SUPPORT_UTF8
-static void do_utf8readchar(compiler_common *common)
+#ifdef SUPPORT_UTF
+
+#ifdef COMPILE_PCRE8
+static void do_utfreadchar(compiler_common *common)
{
-/* Fast decoding an utf8 character. TMP1 contains the first byte
+/* Fast decoding a UTF-8 character. TMP1 contains the first byte
of the character (>= 0xc0). Return char value in TMP1, length - 1 in TMP2. */
DEFINE_COMPILER;
struct sljit_jump *jump;
@@ -1464,82 +1568,57 @@
/* Searching for the first zero. */
OP2(SLJIT_AND | SLJIT_SET_E, SLJIT_UNUSED, 0, TMP1, 0, SLJIT_IMM, 0x20);
jump = JUMP(SLJIT_C_NOT_ZERO);
-/* 2 byte sequence */
-OP1(SLJIT_MOV_UB, TMP2, 0, SLJIT_MEM1(STR_PTR), 1);
-OP2(SLJIT_ADD, STR_PTR, 0, STR_PTR, 0, SLJIT_IMM, 1);
+/* Two byte sequence. */
+OP1(MOV_UCHAR, TMP2, 0, SLJIT_MEM1(STR_PTR), IN_UCHARS(1));
+OP2(SLJIT_ADD, STR_PTR, 0, STR_PTR, 0, SLJIT_IMM, IN_UCHARS(1));
OP2(SLJIT_AND, TMP1, 0, TMP1, 0, SLJIT_IMM, 0x1f);
OP2(SLJIT_SHL, TMP1, 0, TMP1, 0, SLJIT_IMM, 6);
OP2(SLJIT_AND, TMP2, 0, TMP2, 0, SLJIT_IMM, 0x3f);
OP2(SLJIT_OR, TMP1, 0, TMP1, 0, TMP2, 0);
-OP1(SLJIT_MOV, TMP2, 0, SLJIT_IMM, 1);
+OP1(SLJIT_MOV, TMP2, 0, SLJIT_IMM, IN_UCHARS(1));
sljit_emit_fast_return(compiler, RETURN_ADDR, 0);
JUMPHERE(jump);

OP2(SLJIT_AND | SLJIT_SET_E, SLJIT_UNUSED, 0, TMP1, 0, SLJIT_IMM, 0x10);
jump = JUMP(SLJIT_C_NOT_ZERO);
-/* 3 byte sequence */
-OP1(SLJIT_MOV_UB, TMP2, 0, SLJIT_MEM1(STR_PTR), 1);
+/* Three byte sequence. */
+OP1(MOV_UCHAR, TMP2, 0, SLJIT_MEM1(STR_PTR), IN_UCHARS(1));
OP2(SLJIT_AND, TMP1, 0, TMP1, 0, SLJIT_IMM, 0x0f);
OP2(SLJIT_SHL, TMP1, 0, TMP1, 0, SLJIT_IMM, 12);
OP2(SLJIT_AND, TMP2, 0, TMP2, 0, SLJIT_IMM, 0x3f);
OP2(SLJIT_SHL, TMP2, 0, TMP2, 0, SLJIT_IMM, 6);
OP2(SLJIT_OR, TMP1, 0, TMP1, 0, TMP2, 0);
-OP1(SLJIT_MOV_UB, TMP2, 0, SLJIT_MEM1(STR_PTR), 2);
-OP2(SLJIT_ADD, STR_PTR, 0, STR_PTR, 0, SLJIT_IMM, 2);
+OP1(MOV_UCHAR, TMP2, 0, SLJIT_MEM1(STR_PTR), IN_UCHARS(2));
+OP2(SLJIT_ADD, STR_PTR, 0, STR_PTR, 0, SLJIT_IMM, IN_UCHARS(2));
OP2(SLJIT_AND, TMP2, 0, TMP2, 0, SLJIT_IMM, 0x3f);
OP2(SLJIT_OR, TMP1, 0, TMP1, 0, TMP2, 0);
-OP1(SLJIT_MOV, TMP2, 0, SLJIT_IMM, 2);
+OP1(SLJIT_MOV, TMP2, 0, SLJIT_IMM, IN_UCHARS(2));
sljit_emit_fast_return(compiler, RETURN_ADDR, 0);
JUMPHERE(jump);

-OP2(SLJIT_AND | SLJIT_SET_E, SLJIT_UNUSED, 0, TMP1, 0, SLJIT_IMM, 0x08);
-jump = JUMP(SLJIT_C_NOT_ZERO);
-/* 4 byte sequence */
-OP1(SLJIT_MOV_UB, TMP2, 0, SLJIT_MEM1(STR_PTR), 1);
+/* Four byte sequence. */
+OP1(MOV_UCHAR, TMP2, 0, SLJIT_MEM1(STR_PTR), IN_UCHARS(1));
OP2(SLJIT_AND, TMP1, 0, TMP1, 0, SLJIT_IMM, 0x07);
OP2(SLJIT_SHL, TMP1, 0, TMP1, 0, SLJIT_IMM, 18);
OP2(SLJIT_AND, TMP2, 0, TMP2, 0, SLJIT_IMM, 0x3f);
OP2(SLJIT_SHL, TMP2, 0, TMP2, 0, SLJIT_IMM, 12);
OP2(SLJIT_OR, TMP1, 0, TMP1, 0, TMP2, 0);
-OP1(SLJIT_MOV_UB, TMP2, 0, SLJIT_MEM1(STR_PTR), 2);
+OP1(MOV_UCHAR, TMP2, 0, SLJIT_MEM1(STR_PTR), IN_UCHARS(2));
OP2(SLJIT_AND, TMP2, 0, TMP2, 0, SLJIT_IMM, 0x3f);
OP2(SLJIT_SHL, TMP2, 0, TMP2, 0, SLJIT_IMM, 6);
OP2(SLJIT_OR, TMP1, 0, TMP1, 0, TMP2, 0);
-OP1(SLJIT_MOV_UB, TMP2, 0, SLJIT_MEM1(STR_PTR), 3);
-OP2(SLJIT_ADD, STR_PTR, 0, STR_PTR, 0, SLJIT_IMM, 3);
+OP1(MOV_UCHAR, TMP2, 0, SLJIT_MEM1(STR_PTR), IN_UCHARS(3));
+OP2(SLJIT_ADD, STR_PTR, 0, STR_PTR, 0, SLJIT_IMM, IN_UCHARS(3));
OP2(SLJIT_AND, TMP2, 0, TMP2, 0, SLJIT_IMM, 0x3f);
OP2(SLJIT_OR, TMP1, 0, TMP1, 0, TMP2, 0);
-OP1(SLJIT_MOV, TMP2, 0, SLJIT_IMM, 3);
+OP1(SLJIT_MOV, TMP2, 0, SLJIT_IMM, IN_UCHARS(3));
sljit_emit_fast_return(compiler, RETURN_ADDR, 0);
-JUMPHERE(jump);
-
-/* 5 byte sequence */
-OP1(SLJIT_MOV_UB, TMP2, 0, SLJIT_MEM1(STR_PTR), 1);
-OP2(SLJIT_AND, TMP1, 0, TMP1, 0, SLJIT_IMM, 0x03);
-OP2(SLJIT_SHL, TMP1, 0, TMP1, 0, SLJIT_IMM, 24);
-OP2(SLJIT_AND, TMP2, 0, TMP2, 0, SLJIT_IMM, 0x3f);
-OP2(SLJIT_SHL, TMP2, 0, TMP2, 0, SLJIT_IMM, 18);
-OP2(SLJIT_OR, TMP1, 0, TMP1, 0, TMP2, 0);
-OP1(SLJIT_MOV_UB, TMP2, 0, SLJIT_MEM1(STR_PTR), 2);
-OP2(SLJIT_AND, TMP2, 0, TMP2, 0, SLJIT_IMM, 0x3f);
-OP2(SLJIT_SHL, TMP2, 0, TMP2, 0, SLJIT_IMM, 12);
-OP2(SLJIT_OR, TMP1, 0, TMP1, 0, TMP2, 0);
-OP1(SLJIT_MOV_UB, TMP2, 0, SLJIT_MEM1(STR_PTR), 3);
-OP2(SLJIT_AND, TMP2, 0, TMP2, 0, SLJIT_IMM, 0x3f);
-OP2(SLJIT_SHL, TMP2, 0, TMP2, 0, SLJIT_IMM, 6);
-OP2(SLJIT_OR, TMP1, 0, TMP1, 0, TMP2, 0);
-OP1(SLJIT_MOV_UB, TMP2, 0, SLJIT_MEM1(STR_PTR), 4);
-OP2(SLJIT_ADD, STR_PTR, 0, STR_PTR, 0, SLJIT_IMM, 4);
-OP2(SLJIT_AND, TMP2, 0, TMP2, 0, SLJIT_IMM, 0x3f);
-OP2(SLJIT_OR, TMP1, 0, TMP1, 0, TMP2, 0);
-OP1(SLJIT_MOV, TMP2, 0, SLJIT_IMM, 4);
-sljit_emit_fast_return(compiler, RETURN_ADDR, 0);
}

-static void do_utf8readtype8(compiler_common *common)
+static void do_utfreadtype8(compiler_common *common)
{
-/* Fast decoding an utf8 character type. TMP2 contains the first byte
-of the character (>= 0xc0) and TMP1 is destroyed. Return value in TMP1. */
+/* Fast decoding a UTF-8 character type. TMP2 contains the first byte
+of the character (>= 0xc0). Return value in TMP1. */
DEFINE_COMPILER;
struct sljit_jump *jump;
struct sljit_jump *compare;
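
Note on do_utfreadchar (COMPILE_PCRE8) in the hunk above: after this change the routine handles two-, three- and four-byte sequences only, since the five-byte branch is removed. The sketch below shows the same bit manipulation in plain C. The function name utf8_decode and its signature are assumptions made for illustration, and the input is assumed to be a valid sequence.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: decode one multi-byte UTF-8 character at p.
   Precondition: p[0] >= 0xc0 and the sequence is valid.
   Stores the number of extra bytes (length - 1) in *extra. */
static uint32_t utf8_decode(const uint8_t *p, int *extra)
{
  uint32_t c = p[0];
  assert(c >= 0xc0);
  if ((c & 0x20) == 0)            /* 110xxxxx: two bytes */
  {
    *extra = 1;
    return ((c & 0x1f) << 6) | (p[1] & 0x3f);
  }
  if ((c & 0x10) == 0)            /* 1110xxxx: three bytes */
  {
    *extra = 2;
    return ((c & 0x0f) << 12) | ((uint32_t)(p[1] & 0x3f) << 6) | (p[2] & 0x3f);
  }
  *extra = 3;                     /* 11110xxx: four bytes */
  return ((c & 0x07) << 18) | ((uint32_t)(p[1] & 0x3f) << 12) |
         ((uint32_t)(p[2] & 0x3f) << 6) | (p[3] & 0x3f);
}
```

As in the JIT routine, the result is the character value plus "length - 1", which the caller can add to its string pointer.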
@@ -1548,9 +1627,9 @@

OP2(SLJIT_AND | SLJIT_SET_E, SLJIT_UNUSED, 0, TMP2, 0, SLJIT_IMM, 0x20);
jump = JUMP(SLJIT_C_NOT_ZERO);
-/* 2 byte sequence */
-OP1(SLJIT_MOV_UB, TMP1, 0, SLJIT_MEM1(STR_PTR), 0);
-OP2(SLJIT_ADD, STR_PTR, 0, STR_PTR, 0, SLJIT_IMM, 1);
+/* Two byte sequence. */
+OP1(MOV_UCHAR, TMP1, 0, SLJIT_MEM1(STR_PTR), IN_UCHARS(0));
+OP2(SLJIT_ADD, STR_PTR, 0, STR_PTR, 0, SLJIT_IMM, IN_UCHARS(1));
OP2(SLJIT_AND, TMP2, 0, TMP2, 0, SLJIT_IMM, 0x1f);
OP2(SLJIT_SHL, TMP2, 0, TMP2, 0, SLJIT_IMM, 6);
OP2(SLJIT_AND, TMP1, 0, TMP1, 0, SLJIT_IMM, 0x3f);
@@ -1565,14 +1644,45 @@
JUMPHERE(jump);

/* We only have types for characters less than 256. */
-OP1(SLJIT_MOV_UB, TMP1, 0, SLJIT_MEM1(TMP2), (sljit_w)_pcre_utf8_char_sizes - 0xc0);
+OP1(SLJIT_MOV_UB, TMP1, 0, SLJIT_MEM1(TMP2), (sljit_w)PRIV(utf8_table4) - 0xc0);
OP2(SLJIT_ADD, STR_PTR, 0, STR_PTR, 0, TMP1, 0);
OP1(SLJIT_MOV, TMP1, 0, SLJIT_IMM, 0);
sljit_emit_fast_return(compiler, RETURN_ADDR, 0);
}

-#endif
+#else /* COMPILE_PCRE8 */

+#ifdef COMPILE_PCRE16
+static void do_utfreadchar(compiler_common *common)
+{
+/* Fast decoding a UTF-16 character. TMP1 contains the first 16 bit char
+of the character (>= 0xd800). Return char value in TMP1, length - 1 in TMP2. */
+DEFINE_COMPILER;
+struct sljit_jump *jump;
+
+sljit_emit_fast_enter(compiler, RETURN_ADDR, 0, 1, 5, 5, common->localsize);
+jump = CMP(SLJIT_C_LESS, TMP1, 0, SLJIT_IMM, 0xdc00);
+/* Do nothing, only return. */
+sljit_emit_fast_return(compiler, RETURN_ADDR, 0);
+
+JUMPHERE(jump);
+/* Combine two 16 bit characters. */
+OP1(MOV_UCHAR, TMP2, 0, SLJIT_MEM1(STR_PTR), IN_UCHARS(1));
+OP2(SLJIT_ADD, STR_PTR, 0, STR_PTR, 0, SLJIT_IMM, IN_UCHARS(1));
+OP2(SLJIT_AND, TMP1, 0, TMP1, 0, SLJIT_IMM, 0x3ff);
+OP2(SLJIT_SHL, TMP1, 0, TMP1, 0, SLJIT_IMM, 10);
+OP2(SLJIT_AND, TMP2, 0, TMP2, 0, SLJIT_IMM, 0x3ff);
+OP2(SLJIT_OR, TMP1, 0, TMP1, 0, TMP2, 0);
+OP1(SLJIT_MOV, TMP2, 0, SLJIT_IMM, IN_UCHARS(1));
+OP2(SLJIT_ADD, TMP1, 0, TMP1, 0, SLJIT_IMM, 0x10000);
+sljit_emit_fast_return(compiler, RETURN_ADDR, 0);
+}
+#endif /* COMPILE_PCRE16 */
+
+#endif /* COMPILE_PCRE8 */
+
+#endif /* SUPPORT_UTF */
+
#ifdef SUPPORT_UCP

/* UCD_BLOCK_SIZE must be 128 (see the assert below). */
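
Note on do_utfreadchar (COMPILE_PCRE16) in the hunk above: the routine combines a high and a low surrogate into one code point by keeping the low ten bits of each unit and adding 0x10000. A minimal plain-C sketch, assuming well-formed input; utf16_decode is a hypothetical name and is not a PCRE function.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: decode one UTF-16 character at p.
   Stores the number of extra 16-bit units (0 or 1) in *extra. */
static uint32_t utf16_decode(const uint16_t *p, int *extra)
{
  uint32_t c = p[0];
  assert(extra != 0);
  *extra = 0;
  if ((c & 0xfc00) != 0xd800)     /* not a high surrogate: single unit */
    return c;
  *extra = 1;
  /* Ten bits from each unit, then the supplementary-plane offset. */
  return (((c & 0x3ff) << 10) | (p[1] & 0x3ff)) + 0x10000;
}
```

The JIT version is only entered for units >= 0xd800, so it tests "less than 0xdc00" instead of masking; the sketch masks so that it is safe to call on any unit.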
@@ -1589,13 +1699,13 @@

sljit_emit_fast_enter(compiler, RETURN_ADDR, 0, 1, 5, 5, common->localsize);
OP2(SLJIT_LSHR, TMP2, 0, TMP1, 0, SLJIT_IMM, UCD_BLOCK_SHIFT);
-OP1(SLJIT_MOV_UB, TMP2, 0, SLJIT_MEM1(TMP2), (sljit_w)_pcre_ucd_stage1);
+OP1(SLJIT_MOV_UB, TMP2, 0, SLJIT_MEM1(TMP2), (sljit_w)PRIV(ucd_stage1));
OP2(SLJIT_AND, TMP1, 0, TMP1, 0, SLJIT_IMM, UCD_BLOCK_MASK);
OP2(SLJIT_SHL, TMP2, 0, TMP2, 0, SLJIT_IMM, UCD_BLOCK_SHIFT);
OP2(SLJIT_ADD, TMP1, 0, TMP1, 0, TMP2, 0);
-OP1(SLJIT_MOV, TMP2, 0, SLJIT_IMM, (sljit_w)_pcre_ucd_stage2);
+OP1(SLJIT_MOV, TMP2, 0, SLJIT_IMM, (sljit_w)PRIV(ucd_stage2));
OP1(SLJIT_MOV_UH, TMP2, 0, SLJIT_MEM2(TMP2, TMP1), 1);
-OP1(SLJIT_MOV, TMP1, 0, SLJIT_IMM, (sljit_w)_pcre_ucd_records + SLJIT_OFFSETOF(ucd_record, chartype));
+OP1(SLJIT_MOV, TMP1, 0, SLJIT_IMM, (sljit_w)PRIV(ucd_records) + SLJIT_OFFSETOF(ucd_record, chartype));
OP1(SLJIT_MOV_UB, TMP1, 0, SLJIT_MEM2(TMP1, TMP2), 3);
sljit_emit_fast_return(compiler, RETURN_ADDR, 0);
}
@@ -1609,12 +1719,12 @@
struct sljit_jump *start;
struct sljit_jump *end = NULL;
struct sljit_jump *nl = NULL;
-#ifdef SUPPORT_UTF8
-struct sljit_jump *singlebyte;
+#ifdef SUPPORT_UTF
+struct sljit_jump *singlechar;
#endif
jump_list *newline = NULL;
BOOL newlinecheck = FALSE;
-BOOL readbyte = FALSE;
+BOOL readuchar = FALSE;

 if (!(hascrorlf || firstline) && (common->nltype == NLTYPE_ANY ||
     common->nltype == NLTYPE_ANYCRLF || common->newline > 255))
@@ -1629,13 +1739,13 @@
   if (common->nltype == NLTYPE_FIXED && common->newline > 255)
     {
     mainloop = LABEL();
-    OP2(SLJIT_ADD, STR_PTR, 0, STR_PTR, 0, SLJIT_IMM, 1);
+    OP2(SLJIT_ADD, STR_PTR, 0, STR_PTR, 0, SLJIT_IMM, IN_UCHARS(1));
     end = CMP(SLJIT_C_GREATER_EQUAL, STR_PTR, 0, STR_END, 0);
-    OP1(SLJIT_MOV_UB, TMP1, 0, SLJIT_MEM1(STR_PTR), -1);
-    OP1(SLJIT_MOV_UB, TMP2, 0, SLJIT_MEM1(STR_PTR), 0);
+    OP1(MOV_UCHAR, TMP1, 0, SLJIT_MEM1(STR_PTR), IN_UCHARS(-1));
+    OP1(MOV_UCHAR, TMP2, 0, SLJIT_MEM1(STR_PTR), IN_UCHARS(0));
     CMPTO(SLJIT_C_NOT_EQUAL, TMP1, 0, SLJIT_IMM, (common->newline >> 8) & 0xff, mainloop);
     CMPTO(SLJIT_C_NOT_EQUAL, TMP2, 0, SLJIT_IMM, common->newline & 0xff, mainloop);
-    OP2(SLJIT_SUB, SLJIT_MEM1(SLJIT_LOCALS_REG), FIRSTLINE_END, STR_PTR, 0, SLJIT_IMM, 1);
+    OP2(SLJIT_SUB, SLJIT_MEM1(SLJIT_LOCALS_REG), FIRSTLINE_END, STR_PTR, 0, SLJIT_IMM, IN_UCHARS(1));
     }
   else
     {
@@ -1659,11 +1769,14 @@
 if (newlinecheck)
   {
   newlinelabel = LABEL();
-  OP2(SLJIT_ADD, STR_PTR, 0, STR_PTR, 0, SLJIT_IMM, 1);
+  OP2(SLJIT_ADD, STR_PTR, 0, STR_PTR, 0, SLJIT_IMM, IN_UCHARS(1));
   end = CMP(SLJIT_C_GREATER_EQUAL, STR_PTR, 0, STR_END, 0);
-  OP1(SLJIT_MOV_UB, TMP1, 0, SLJIT_MEM1(STR_PTR), 0);
+  OP1(MOV_UCHAR, TMP1, 0, SLJIT_MEM1(STR_PTR), 0);
   OP2(SLJIT_SUB | SLJIT_SET_E, SLJIT_UNUSED, 0, TMP1, 0, SLJIT_IMM, common->newline & 0xff);
   COND_VALUE(SLJIT_MOV, TMP1, 0, SLJIT_C_EQUAL);
+#ifdef COMPILE_PCRE16
+  OP2(SLJIT_SHL, TMP1, 0, TMP1, 0, SLJIT_IMM, 1);
+#endif
   OP2(SLJIT_ADD, STR_PTR, 0, STR_PTR, 0, TMP1, 0);
   nl = JUMP(SLJIT_JUMP);
   }
@@ -1671,27 +1784,39 @@
 mainloop = LABEL();

/* Increasing the STR_PTR here requires one less jump in the most common case. */
-#ifdef SUPPORT_UTF8
-if (common->utf8) readbyte = TRUE;
+#ifdef SUPPORT_UTF
+if (common->utf) readuchar = TRUE;
#endif
-if (newlinecheck) readbyte = TRUE;
+if (newlinecheck) readuchar = TRUE;

-if (readbyte)
- OP1(SLJIT_MOV_UB, TMP1, 0, SLJIT_MEM1(STR_PTR), 0);
+if (readuchar)
+ OP1(MOV_UCHAR, TMP1, 0, SLJIT_MEM1(STR_PTR), 0);

if (newlinecheck)
CMPTO(SLJIT_C_EQUAL, TMP1, 0, SLJIT_IMM, (common->newline >> 8) & 0xff, newlinelabel);

-OP2(SLJIT_ADD, STR_PTR, 0, STR_PTR, 0, SLJIT_IMM, 1);
-#ifdef SUPPORT_UTF8
-if (common->utf8)
+OP2(SLJIT_ADD, STR_PTR, 0, STR_PTR, 0, SLJIT_IMM, IN_UCHARS(1));
+#if defined SUPPORT_UTF && defined COMPILE_PCRE8
+if (common->utf)
{
- singlebyte = CMP(SLJIT_C_LESS, TMP1, 0, SLJIT_IMM, 0xc0);
- OP1(SLJIT_MOV_UB, TMP1, 0, SLJIT_MEM1(TMP1), (sljit_w)_pcre_utf8_char_sizes - 0xc0);
+ singlechar = CMP(SLJIT_C_LESS, TMP1, 0, SLJIT_IMM, 0xc0);
+ OP1(SLJIT_MOV_UB, TMP1, 0, SLJIT_MEM1(TMP1), (sljit_w)PRIV(utf8_table4) - 0xc0);
OP2(SLJIT_ADD, STR_PTR, 0, STR_PTR, 0, TMP1, 0);
- JUMPHERE(singlebyte);
+ JUMPHERE(singlechar);
}
#endif
+#if defined SUPPORT_UTF && defined COMPILE_PCRE16
+if (common->utf)
+ {
+ singlechar = CMP(SLJIT_C_LESS, TMP1, 0, SLJIT_IMM, 0xd800);
+ OP2(SLJIT_AND, TMP1, 0, TMP1, 0, SLJIT_IMM, 0xfc00);
+ OP2(SLJIT_SUB | SLJIT_SET_E, SLJIT_UNUSED, 0, TMP1, 0, SLJIT_IMM, 0xd800);
+ COND_VALUE(SLJIT_MOV, TMP1, 0, SLJIT_C_EQUAL);
+ OP2(SLJIT_SHL, TMP1, 0, TMP1, 0, SLJIT_IMM, 1);
+ OP2(SLJIT_ADD, STR_PTR, 0, STR_PTR, 0, TMP1, 0);
+ JUMPHERE(singlechar);
+ }
+#endif
JUMPHERE(start);

if (newlinecheck)
@@ -1703,13 +1828,13 @@
return mainloop;
}

-static SLJIT_INLINE void fast_forward_first_byte(compiler_common *common, pcre_uint16 firstbyte, BOOL firstline)
+static SLJIT_INLINE void fast_forward_first_char(compiler_common *common, pcre_uchar first_char, BOOL caseless, BOOL firstline)
{
DEFINE_COMPILER;
struct sljit_label *start;
struct sljit_jump *leave;
struct sljit_jump *found;
-pcre_uint16 oc, bit;
+pcre_uchar oc, bit;

if (firstline)
{
@@ -1719,23 +1844,30 @@

start = LABEL();
leave = CMP(SLJIT_C_GREATER_EQUAL, STR_PTR, 0, STR_END, 0);
-OP1(SLJIT_MOV_UB, TMP1, 0, SLJIT_MEM1(STR_PTR), 0);
+OP1(MOV_UCHAR, TMP1, 0, SLJIT_MEM1(STR_PTR), 0);

-if ((firstbyte & REQ_CASELESS) == 0)
-  found = CMP(SLJIT_C_EQUAL, TMP1, 0, SLJIT_IMM, firstbyte & 0xff);
+oc = first_char;
+if (caseless)
+  {
+  oc = TABLE_GET(first_char, common->fcc, first_char);
+#if defined SUPPORT_UCP && !(defined COMPILE_PCRE8)
+  if (first_char > 127 && common->utf)
+    oc = UCD_OTHERCASE(first_char);
+#endif
+  }
+if (first_char == oc)
+  found = CMP(SLJIT_C_EQUAL, TMP1, 0, SLJIT_IMM, first_char);
 else
   {
-  firstbyte &= 0xff;
-  oc = common->fcc[firstbyte];
-  bit = firstbyte ^ oc;
+  bit = first_char ^ oc;
   if (ispowerof2(bit))
     {
     OP2(SLJIT_OR, TMP2, 0, TMP1, 0, SLJIT_IMM, bit);
-    found = CMP(SLJIT_C_EQUAL, TMP2, 0, SLJIT_IMM, firstbyte | bit);
+    found = CMP(SLJIT_C_EQUAL, TMP2, 0, SLJIT_IMM, first_char | bit);
     }
   else
     {
-    OP2(SLJIT_SUB | SLJIT_SET_E, SLJIT_UNUSED, 0, TMP1, 0, SLJIT_IMM, firstbyte);
+    OP2(SLJIT_SUB | SLJIT_SET_E, SLJIT_UNUSED, 0, TMP1, 0, SLJIT_IMM, first_char);
     COND_VALUE(SLJIT_MOV, TMP2, 0, SLJIT_C_EQUAL);
     OP2(SLJIT_SUB | SLJIT_SET_E, SLJIT_UNUSED, 0, TMP1, 0, SLJIT_IMM, oc);
     COND_VALUE(SLJIT_OR | SLJIT_SET_E, TMP2, 0, SLJIT_C_EQUAL);
@@ -1743,15 +1875,26 @@
     }
   }

-OP2(SLJIT_ADD, STR_PTR, 0, STR_PTR, 0, SLJIT_IMM, 1);
-#ifdef SUPPORT_UTF8
-if (common->utf8)
+OP2(SLJIT_ADD, STR_PTR, 0, STR_PTR, 0, SLJIT_IMM, IN_UCHARS(1));
+#if defined SUPPORT_UTF && defined COMPILE_PCRE8
+if (common->utf)
{
CMPTO(SLJIT_C_LESS, TMP1, 0, SLJIT_IMM, 0xc0, start);
- OP1(SLJIT_MOV_UB, TMP1, 0, SLJIT_MEM1(TMP1), (sljit_w)_pcre_utf8_char_sizes - 0xc0);
+ OP1(SLJIT_MOV_UB, TMP1, 0, SLJIT_MEM1(TMP1), (sljit_w)PRIV(utf8_table4) - 0xc0);
OP2(SLJIT_ADD, STR_PTR, 0, STR_PTR, 0, TMP1, 0);
}
#endif
+#if defined SUPPORT_UTF && defined COMPILE_PCRE16
+if (common->utf)
+ {
+ CMPTO(SLJIT_C_LESS, TMP1, 0, SLJIT_IMM, 0xd800, start);
+ OP2(SLJIT_AND, TMP1, 0, TMP1, 0, SLJIT_IMM, 0xfc00);
+ OP2(SLJIT_SUB | SLJIT_SET_E, SLJIT_UNUSED, 0, TMP1, 0, SLJIT_IMM, 0xd800);
+ COND_VALUE(SLJIT_MOV, TMP1, 0, SLJIT_C_EQUAL);
+ OP2(SLJIT_SHL, TMP1, 0, TMP1, 0, SLJIT_IMM, 1);
+ OP2(SLJIT_ADD, STR_PTR, 0, STR_PTR, 0, TMP1, 0);
+ }
+#endif
JUMPTO(SLJIT_JUMP, start);
JUMPHERE(found);
JUMPHERE(leave);
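
Note on fast_forward_first_char in the hunk above: when a character and its other case differ in exactly one bit, the emitted code ORs that bit into the subject character and does a single comparison; otherwise it compares against both values. The sketch below restates that decision in plain C. The names is_power_of_2 and matches_either_case are hypothetical stand-ins and do not claim to match the helpers in pcre_jit_compile.c.

```c
#include <assert.h>

/* Hypothetical helper: true if exactly one bit is set in v. */
static int is_power_of_2(unsigned int v)
{
  return v != 0 && (v & (v - 1)) == 0;
}

/* Hypothetical helper: returns 1 if c equals ch or its other case oc. */
static int matches_either_case(unsigned int c, unsigned int ch, unsigned int oc)
{
  unsigned int bit = ch ^ oc;
  if (bit == 0)                   /* no other case: plain comparison */
    return c == ch;
  if (is_power_of_2(bit))         /* one differing bit: OR it in, compare once */
    return (c | bit) == (ch | bit);
  return c == ch || c == oc;      /* general case: two comparisons */
}
```

For ASCII letters the differing bit is 0x20, so the single-comparison path applies.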
@@ -1785,16 +1928,19 @@
OP1(SLJIT_MOV, TMP1, 0, SLJIT_MEM1(TMP1), SLJIT_OFFSETOF(jit_arguments, begin));
firstchar = CMP(SLJIT_C_LESS_EQUAL, STR_PTR, 0, TMP2, 0);

- OP2(SLJIT_ADD, TMP1, 0, TMP1, 0, SLJIT_IMM, 2);
+ OP2(SLJIT_ADD, TMP1, 0, TMP1, 0, SLJIT_IMM, IN_UCHARS(2));
OP2(SLJIT_SUB | SLJIT_SET_U, SLJIT_UNUSED, 0, STR_PTR, 0, TMP1, 0);
COND_VALUE(SLJIT_MOV, TMP2, 0, SLJIT_C_GREATER_EQUAL);
+#ifdef COMPILE_PCRE16
+ OP2(SLJIT_SHL, TMP2, 0, TMP2, 0, SLJIT_IMM, 1);
+#endif
OP2(SLJIT_SUB, STR_PTR, 0, STR_PTR, 0, TMP2, 0);

loop = LABEL();
- OP2(SLJIT_ADD, STR_PTR, 0, STR_PTR, 0, SLJIT_IMM, 1);
+ OP2(SLJIT_ADD, STR_PTR, 0, STR_PTR, 0, SLJIT_IMM, IN_UCHARS(1));
leave = CMP(SLJIT_C_GREATER_EQUAL, STR_PTR, 0, STR_END, 0);
- OP1(SLJIT_MOV_UB, TMP1, 0, SLJIT_MEM1(STR_PTR), -2);
- OP1(SLJIT_MOV_UB, TMP2, 0, SLJIT_MEM1(STR_PTR), -1);
+ OP1(MOV_UCHAR, TMP1, 0, SLJIT_MEM1(STR_PTR), IN_UCHARS(-2));
+ OP1(MOV_UCHAR, TMP2, 0, SLJIT_MEM1(STR_PTR), IN_UCHARS(-1));
CMPTO(SLJIT_C_NOT_EQUAL, TMP1, 0, SLJIT_IMM, (common->newline >> 8) & 0xff, loop);
CMPTO(SLJIT_C_NOT_EQUAL, TMP2, 0, SLJIT_IMM, common->newline & 0xff, loop);

@@ -1825,9 +1971,12 @@
leave = JUMP(SLJIT_JUMP);
JUMPHERE(foundcr);
notfoundnl = CMP(SLJIT_C_GREATER_EQUAL, STR_PTR, 0, STR_END, 0);
- OP1(SLJIT_MOV_UB, TMP1, 0, SLJIT_MEM1(STR_PTR), 0);
+ OP1(MOV_UCHAR, TMP1, 0, SLJIT_MEM1(STR_PTR), 0);
OP2(SLJIT_SUB | SLJIT_SET_E, SLJIT_UNUSED, 0, TMP1, 0, SLJIT_IMM, CHAR_NL);
COND_VALUE(SLJIT_MOV, TMP1, 0, SLJIT_C_EQUAL);
+#ifdef COMPILE_PCRE16
+ OP2(SLJIT_SHL, TMP1, 0, TMP1, 0, SLJIT_IMM, 1);
+#endif
OP2(SLJIT_ADD, STR_PTR, 0, STR_PTR, 0, TMP1, 0);
JUMPHERE(notfoundnl);
JUMPHERE(leave);
@@ -1845,6 +1994,9 @@
struct sljit_label *start;
struct sljit_jump *leave;
struct sljit_jump *found;
+#ifndef COMPILE_PCRE8
+struct sljit_jump *jump;
+#endif

if (firstline)
{
@@ -1854,11 +2006,16 @@

start = LABEL();
leave = CMP(SLJIT_C_GREATER_EQUAL, STR_PTR, 0, STR_END, 0);
-OP1(SLJIT_MOV_UB, TMP1, 0, SLJIT_MEM1(STR_PTR), 0);
-#ifdef SUPPORT_UTF8
-if (common->utf8)
+OP1(MOV_UCHAR, TMP1, 0, SLJIT_MEM1(STR_PTR), 0);
+#ifdef SUPPORT_UTF
+if (common->utf)
OP1(SLJIT_MOV, TMP3, 0, TMP1, 0);
#endif
+#ifndef COMPILE_PCRE8
+jump = CMP(SLJIT_C_LESS, TMP1, 0, SLJIT_IMM, 255);
+OP1(SLJIT_MOV, TMP1, 0, SLJIT_IMM, 255);
+JUMPHERE(jump);
+#endif
OP2(SLJIT_AND, TMP2, 0, TMP1, 0, SLJIT_IMM, 0x7);
OP2(SLJIT_LSHR, TMP1, 0, TMP1, 0, SLJIT_IMM, 3);
OP1(SLJIT_MOV_UB, TMP1, 0, SLJIT_MEM1(TMP1), start_bits);
@@ -1866,19 +2023,30 @@
OP2(SLJIT_AND | SLJIT_SET_E, SLJIT_UNUSED, 0, TMP1, 0, TMP2, 0);
found = JUMP(SLJIT_C_NOT_ZERO);

-#ifdef SUPPORT_UTF8
-if (common->utf8)
+#ifdef SUPPORT_UTF
+if (common->utf)
OP1(SLJIT_MOV, TMP1, 0, TMP3, 0);
#endif
-OP2(SLJIT_ADD, STR_PTR, 0, STR_PTR, 0, SLJIT_IMM, 1);
-#ifdef SUPPORT_UTF8
-if (common->utf8)
+OP2(SLJIT_ADD, STR_PTR, 0, STR_PTR, 0, SLJIT_IMM, IN_UCHARS(1));
+#if defined SUPPORT_UTF && defined COMPILE_PCRE8
+if (common->utf)
{
CMPTO(SLJIT_C_LESS, TMP1, 0, SLJIT_IMM, 0xc0, start);
- OP1(SLJIT_MOV_UB, TMP1, 0, SLJIT_MEM1(TMP1), (sljit_w)_pcre_utf8_char_sizes - 0xc0);
+ OP1(SLJIT_MOV_UB, TMP1, 0, SLJIT_MEM1(TMP1), (sljit_w)PRIV(utf8_table4) - 0xc0);
OP2(SLJIT_ADD, STR_PTR, 0, STR_PTR, 0, TMP1, 0);
}
#endif
+#if defined SUPPORT_UTF && defined COMPILE_PCRE16
+if (common->utf)
+ {
+ CMPTO(SLJIT_C_LESS, TMP1, 0, SLJIT_IMM, 0xd800, start);
+ OP2(SLJIT_AND, TMP1, 0, TMP1, 0, SLJIT_IMM, 0xfc00);
+ OP2(SLJIT_SUB | SLJIT_SET_E, SLJIT_UNUSED, 0, TMP1, 0, SLJIT_IMM, 0xd800);
+ COND_VALUE(SLJIT_MOV, TMP1, 0, SLJIT_C_EQUAL);
+ OP2(SLJIT_SHL, TMP1, 0, TMP1, 0, SLJIT_IMM, 1);
+ OP2(SLJIT_ADD, STR_PTR, 0, STR_PTR, 0, TMP1, 0);
+ }
+#endif
JUMPTO(SLJIT_JUMP, start);
JUMPHERE(found);
JUMPHERE(leave);
@@ -1887,7 +2055,7 @@
OP1(SLJIT_MOV, STR_END, 0, SLJIT_MEM1(SLJIT_LOCALS_REG), POSSESSIVE0);
}

-static SLJIT_INLINE struct sljit_jump *search_requested_char(compiler_common *common, pcre_uint16 reqbyte, BOOL has_firstbyte)
+static SLJIT_INLINE struct sljit_jump *search_requested_char(compiler_common *common, pcre_uchar req_char, BOOL caseless, BOOL has_firstchar)
{
DEFINE_COMPILER;
struct sljit_label *loop;
@@ -1896,47 +2064,54 @@
struct sljit_jump *found;
struct sljit_jump *foundoc = NULL;
struct sljit_jump *notfound;
-pcre_uint16 oc, bit;
+pcre_uchar oc, bit;

-OP1(SLJIT_MOV, TMP2, 0, SLJIT_MEM1(SLJIT_LOCALS_REG), REQ_BYTE_PTR);
+OP1(SLJIT_MOV, TMP2, 0, SLJIT_MEM1(SLJIT_LOCALS_REG), REQ_CHAR_PTR);
OP2(SLJIT_ADD, TMP1, 0, STR_PTR, 0, SLJIT_IMM, REQ_BYTE_MAX);
toolong = CMP(SLJIT_C_LESS, TMP1, 0, STR_END, 0);
alreadyfound = CMP(SLJIT_C_LESS, STR_PTR, 0, TMP2, 0);

-if (has_firstbyte)
- OP2(SLJIT_ADD, TMP1, 0, STR_PTR, 0, SLJIT_IMM, 1);
+if (has_firstchar)
+ OP2(SLJIT_ADD, TMP1, 0, STR_PTR, 0, SLJIT_IMM, IN_UCHARS(1));
else
OP1(SLJIT_MOV, TMP1, 0, STR_PTR, 0);

loop = LABEL();
notfound = CMP(SLJIT_C_GREATER_EQUAL, TMP1, 0, STR_END, 0);

-OP1(SLJIT_MOV_UB, TMP2, 0, SLJIT_MEM1(TMP1), 0);
-if ((reqbyte & REQ_CASELESS) == 0)
-  found = CMP(SLJIT_C_EQUAL, TMP2, 0, SLJIT_IMM, reqbyte & 0xff);
+OP1(MOV_UCHAR, TMP2, 0, SLJIT_MEM1(TMP1), 0);
+oc = req_char;
+if (caseless)
+  {
+  oc = TABLE_GET(req_char, common->fcc, req_char);
+#if defined SUPPORT_UCP && !(defined COMPILE_PCRE8)
+  if (req_char > 127 && common->utf)
+    oc = UCD_OTHERCASE(req_char);
+#endif
+  }
+if (req_char == oc)
+  found = CMP(SLJIT_C_EQUAL, TMP2, 0, SLJIT_IMM, req_char);
 else
   {
-  reqbyte &= 0xff;
-  oc = common->fcc[reqbyte];
-  bit = reqbyte ^ oc;
+  bit = req_char ^ oc;
   if (ispowerof2(bit))
     {
     OP2(SLJIT_OR, TMP2, 0, TMP2, 0, SLJIT_IMM, bit);
-    found = CMP(SLJIT_C_EQUAL, TMP2, 0, SLJIT_IMM, reqbyte | bit);
+    found = CMP(SLJIT_C_EQUAL, TMP2, 0, SLJIT_IMM, req_char | bit);
     }
   else
     {
-    found = CMP(SLJIT_C_EQUAL, TMP2, 0, SLJIT_IMM, reqbyte);
+    found = CMP(SLJIT_C_EQUAL, TMP2, 0, SLJIT_IMM, req_char);
     foundoc = CMP(SLJIT_C_EQUAL, TMP2, 0, SLJIT_IMM, oc);
     }
   }
-OP2(SLJIT_ADD, TMP1, 0, TMP1, 0, SLJIT_IMM, 1);
+OP2(SLJIT_ADD, TMP1, 0, TMP1, 0, SLJIT_IMM, IN_UCHARS(1));
 JUMPTO(SLJIT_JUMP, loop);

JUMPHERE(found);
if (foundoc)
JUMPHERE(foundoc);
-OP1(SLJIT_MOV, SLJIT_MEM1(SLJIT_LOCALS_REG), REQ_BYTE_PTR, TMP1, 0);
+OP1(SLJIT_MOV, SLJIT_MEM1(SLJIT_LOCALS_REG), REQ_CHAR_PTR, TMP1, 0);
JUMPHERE(alreadyfound);
JUMPHERE(toolong);
return notfound;
@@ -1984,7 +2159,7 @@
{
DEFINE_COMPILER;
struct sljit_jump *beginend;
-#ifdef SUPPORT_UTF8
+#if !(defined COMPILE_PCRE8) || defined SUPPORT_UTF
struct sljit_jump *jump;
#endif

@@ -2001,7 +2176,7 @@

 /* Testing char type. */
 #ifdef SUPPORT_UCP
-if (common->useucp)
+if (common->use_ucp)
   {
   OP1(SLJIT_MOV, TMP2, 0, SLJIT_IMM, 1);
   jump = CMP(SLJIT_C_EQUAL, TMP1, 0, SLJIT_IMM, CHAR_UNDERSCORE);
@@ -2018,20 +2193,24 @@
 else
 #endif
   {
-#ifdef SUPPORT_UTF8
+#ifndef COMPILE_PCRE8
+  jump = CMP(SLJIT_C_GREATER, TMP1, 0, SLJIT_IMM, 255);
+#elif defined SUPPORT_UTF
   /* Here LOCALS1 has already been zeroed. */
   jump = NULL;
-  if (common->utf8)
+  if (common->utf)
     jump = CMP(SLJIT_C_GREATER, TMP1, 0, SLJIT_IMM, 255);
-#endif
+#endif /* COMPILE_PCRE8 */
   OP1(SLJIT_MOV_UB, TMP1, 0, SLJIT_MEM1(TMP1), common->ctypes);
   OP2(SLJIT_LSHR, TMP1, 0, TMP1, 0, SLJIT_IMM, 4 /* ctype_word */);
   OP2(SLJIT_AND, TMP1, 0, TMP1, 0, SLJIT_IMM, 1);
   OP1(SLJIT_MOV, SLJIT_MEM1(SLJIT_LOCALS_REG), LOCALS1, TMP1, 0);
-#ifdef SUPPORT_UTF8
+#ifndef COMPILE_PCRE8
+  JUMPHERE(jump);
+#elif defined SUPPORT_UTF
   if (jump != NULL)
     JUMPHERE(jump);
-#endif
+#endif /* COMPILE_PCRE8 */
   }
 JUMPHERE(beginend);

@@ -2041,7 +2220,7 @@

 /* Testing char type. This is a code duplication. */
 #ifdef SUPPORT_UCP
-if (common->useucp)
+if (common->use_ucp)
   {
   OP1(SLJIT_MOV, TMP2, 0, SLJIT_IMM, 1);
   jump = CMP(SLJIT_C_EQUAL, TMP1, 0, SLJIT_IMM, CHAR_UNDERSCORE);
@@ -2057,19 +2236,25 @@
 else
 #endif
   {
-#ifdef SUPPORT_UTF8
+#ifndef COMPILE_PCRE8
+  /* TMP2 may be destroyed by peek_char. */
   OP1(SLJIT_MOV, TMP2, 0, SLJIT_IMM, 0);
+  jump = CMP(SLJIT_C_GREATER, TMP1, 0, SLJIT_IMM, 255);
+#elif defined SUPPORT_UTF
+  OP1(SLJIT_MOV, TMP2, 0, SLJIT_IMM, 0);
   jump = NULL;
-  if (common->utf8)
+  if (common->utf)
     jump = CMP(SLJIT_C_GREATER, TMP1, 0, SLJIT_IMM, 255);
 #endif
   OP1(SLJIT_MOV_UB, TMP2, 0, SLJIT_MEM1(TMP1), common->ctypes);
   OP2(SLJIT_LSHR, TMP2, 0, TMP2, 0, SLJIT_IMM, 4 /* ctype_word */);
   OP2(SLJIT_AND, TMP2, 0, TMP2, 0, SLJIT_IMM, 1);
-#ifdef SUPPORT_UTF8
+#ifndef COMPILE_PCRE8
+  JUMPHERE(jump);
+#elif defined SUPPORT_UTF
   if (jump != NULL)
     JUMPHERE(jump);
-#endif
+#endif /* COMPILE_PCRE8 */
   }
 JUMPHERE(beginend);

@@ -2088,14 +2273,18 @@
OP2(SLJIT_SUB | SLJIT_SET_U, SLJIT_UNUSED, 0, TMP1, 0, SLJIT_IMM, 0x0d - 0x0a);
COND_VALUE(SLJIT_MOV, TMP2, 0, SLJIT_C_LESS_EQUAL);
OP2(SLJIT_SUB | SLJIT_SET_E, SLJIT_UNUSED, 0, TMP1, 0, SLJIT_IMM, 0x85 - 0x0a);
-#ifdef SUPPORT_UTF8
-if (common->utf8)
+#if defined SUPPORT_UTF || defined COMPILE_PCRE16
+#ifdef COMPILE_PCRE8
+if (common->utf)
{
+#endif
COND_VALUE(SLJIT_OR, TMP2, 0, SLJIT_C_EQUAL);
OP2(SLJIT_OR, TMP1, 0, TMP1, 0, SLJIT_IMM, 0x1);
OP2(SLJIT_SUB | SLJIT_SET_E, SLJIT_UNUSED, 0, TMP1, 0, SLJIT_IMM, 0x2029 - 0x0a);
+#ifdef COMPILE_PCRE8
}
#endif
+#endif /* SUPPORT_UTF || COMPILE_PCRE16 */
COND_VALUE(SLJIT_OR | SLJIT_SET_E, TMP2, 0, SLJIT_C_EQUAL);
sljit_emit_fast_return(compiler, RETURN_ADDR, 0);
}
@@ -2112,9 +2301,11 @@
OP2(SLJIT_SUB | SLJIT_SET_E, SLJIT_UNUSED, 0, TMP1, 0, SLJIT_IMM, 0x20);
COND_VALUE(SLJIT_OR, TMP2, 0, SLJIT_C_EQUAL);
OP2(SLJIT_SUB | SLJIT_SET_E, SLJIT_UNUSED, 0, TMP1, 0, SLJIT_IMM, 0xa0);
-#ifdef SUPPORT_UTF8
-if (common->utf8)
+#if defined SUPPORT_UTF || defined COMPILE_PCRE16
+#ifdef COMPILE_PCRE8
+if (common->utf)
{
+#endif
COND_VALUE(SLJIT_OR, TMP2, 0, SLJIT_C_EQUAL);
OP2(SLJIT_SUB | SLJIT_SET_E, SLJIT_UNUSED, 0, TMP1, 0, SLJIT_IMM, 0x1680);
COND_VALUE(SLJIT_OR, TMP2, 0, SLJIT_C_EQUAL);
@@ -2128,8 +2319,10 @@
OP2(SLJIT_SUB | SLJIT_SET_E, SLJIT_UNUSED, 0, TMP1, 0, SLJIT_IMM, 0x205f - 0x2000);
COND_VALUE(SLJIT_OR, TMP2, 0, SLJIT_C_EQUAL);
OP2(SLJIT_SUB | SLJIT_SET_E, SLJIT_UNUSED, 0, TMP1, 0, SLJIT_IMM, 0x3000 - 0x2000);
+#ifdef COMPILE_PCRE8
}
#endif
+#endif /* SUPPORT_UTF || COMPILE_PCRE16 */
COND_VALUE(SLJIT_OR | SLJIT_SET_E, TMP2, 0, SLJIT_C_EQUAL);

sljit_emit_fast_return(compiler, RETURN_ADDR, 0);
@@ -2146,14 +2339,18 @@
OP2(SLJIT_SUB | SLJIT_SET_U, SLJIT_UNUSED, 0, TMP1, 0, SLJIT_IMM, 0x0d - 0x0a);
COND_VALUE(SLJIT_MOV, TMP2, 0, SLJIT_C_LESS_EQUAL);
OP2(SLJIT_SUB | SLJIT_SET_E, SLJIT_UNUSED, 0, TMP1, 0, SLJIT_IMM, 0x85 - 0x0a);
-#ifdef SUPPORT_UTF8
-if (common->utf8)
+#if defined SUPPORT_UTF || defined COMPILE_PCRE16
+#ifdef COMPILE_PCRE8
+if (common->utf)
{
+#endif
COND_VALUE(SLJIT_OR | SLJIT_SET_E, TMP2, 0, SLJIT_C_EQUAL);
OP2(SLJIT_OR, TMP1, 0, TMP1, 0, SLJIT_IMM, 0x1);
OP2(SLJIT_SUB | SLJIT_SET_E, SLJIT_UNUSED, 0, TMP1, 0, SLJIT_IMM, 0x2029 - 0x0a);
+#ifdef COMPILE_PCRE8
}
#endif
+#endif /* SUPPORT_UTF || COMPILE_PCRE16 */
COND_VALUE(SLJIT_OR | SLJIT_SET_E, TMP2, 0, SLJIT_C_EQUAL);

sljit_emit_fast_return(compiler, RETURN_ADDR, 0);
@@ -2172,18 +2369,18 @@
OP2(SLJIT_SUB, STR_PTR, 0, STR_PTR, 0, TMP2, 0);
OP1(SLJIT_MOV, TMP3, 0, CHAR1, 0);
OP1(SLJIT_MOV, SLJIT_MEM1(SLJIT_LOCALS_REG), LOCALS0, CHAR2, 0);
-OP2(SLJIT_SUB, TMP1, 0, TMP1, 0, SLJIT_IMM, 1);
-OP2(SLJIT_SUB, STR_PTR, 0, STR_PTR, 0, SLJIT_IMM, 1);
+OP2(SLJIT_SUB, TMP1, 0, TMP1, 0, SLJIT_IMM, IN_UCHARS(1));
+OP2(SLJIT_SUB, STR_PTR, 0, STR_PTR, 0, SLJIT_IMM, IN_UCHARS(1));

label = LABEL();
-OP1(SLJIT_MOVU_UB, CHAR1, 0, SLJIT_MEM1(TMP1), 1);
-OP1(SLJIT_MOVU_UB, CHAR2, 0, SLJIT_MEM1(STR_PTR), 1);
+OP1(MOVU_UCHAR, CHAR1, 0, SLJIT_MEM1(TMP1), IN_UCHARS(1));
+OP1(MOVU_UCHAR, CHAR2, 0, SLJIT_MEM1(STR_PTR), IN_UCHARS(1));
jump = CMP(SLJIT_C_NOT_EQUAL, CHAR1, 0, CHAR2, 0);
-OP2(SLJIT_SUB | SLJIT_SET_E, TMP2, 0, TMP2, 0, SLJIT_IMM, 1);
+OP2(SLJIT_SUB | SLJIT_SET_E, TMP2, 0, TMP2, 0, SLJIT_IMM, IN_UCHARS(1));
JUMPTO(SLJIT_C_NOT_ZERO, label);

JUMPHERE(jump);
-OP2(SLJIT_ADD, STR_PTR, 0, STR_PTR, 0, SLJIT_IMM, 1);
+OP2(SLJIT_ADD, STR_PTR, 0, STR_PTR, 0, SLJIT_IMM, IN_UCHARS(1));
OP1(SLJIT_MOV, CHAR1, 0, TMP3, 0);
OP1(SLJIT_MOV, CHAR2, 0, SLJIT_MEM1(SLJIT_LOCALS_REG), LOCALS0);
sljit_emit_fast_return(compiler, RETURN_ADDR, 0);
@@ -2204,20 +2401,30 @@
OP1(SLJIT_MOV, SLJIT_MEM1(SLJIT_LOCALS_REG), LOCALS0, CHAR1, 0);
OP1(SLJIT_MOV, SLJIT_MEM1(SLJIT_LOCALS_REG), LOCALS1, CHAR2, 0);
OP1(SLJIT_MOV, LCC_TABLE, 0, SLJIT_IMM, common->lcc);
-OP2(SLJIT_SUB, TMP1, 0, TMP1, 0, SLJIT_IMM, 1);
-OP2(SLJIT_SUB, STR_PTR, 0, STR_PTR, 0, SLJIT_IMM, 1);
+OP2(SLJIT_SUB, TMP1, 0, TMP1, 0, SLJIT_IMM, IN_UCHARS(1));
+OP2(SLJIT_SUB, STR_PTR, 0, STR_PTR, 0, SLJIT_IMM, IN_UCHARS(1));

label = LABEL();
-OP1(SLJIT_MOVU_UB, CHAR1, 0, SLJIT_MEM1(TMP1), 1);
-OP1(SLJIT_MOVU_UB, CHAR2, 0, SLJIT_MEM1(STR_PTR), 1);
+OP1(MOVU_UCHAR, CHAR1, 0, SLJIT_MEM1(TMP1), IN_UCHARS(1));
+OP1(MOVU_UCHAR, CHAR2, 0, SLJIT_MEM1(STR_PTR), IN_UCHARS(1));
+#ifndef COMPILE_PCRE8
+jump = CMP(SLJIT_C_GREATER, CHAR1, 0, SLJIT_IMM, 255);
+#endif
OP1(SLJIT_MOV_UB, CHAR1, 0, SLJIT_MEM2(LCC_TABLE, CHAR1), 0);
+#ifndef COMPILE_PCRE8
+JUMPHERE(jump);
+jump = CMP(SLJIT_C_GREATER, CHAR2, 0, SLJIT_IMM, 255);
+#endif
OP1(SLJIT_MOV_UB, CHAR2, 0, SLJIT_MEM2(LCC_TABLE, CHAR2), 0);
+#ifndef COMPILE_PCRE8
+JUMPHERE(jump);
+#endif
jump = CMP(SLJIT_C_NOT_EQUAL, CHAR1, 0, CHAR2, 0);
-OP2(SLJIT_SUB | SLJIT_SET_E, TMP2, 0, TMP2, 0, SLJIT_IMM, 1);
+OP2(SLJIT_SUB | SLJIT_SET_E, TMP2, 0, TMP2, 0, SLJIT_IMM, IN_UCHARS(1));
JUMPTO(SLJIT_C_NOT_ZERO, label);

JUMPHERE(jump);
-OP2(SLJIT_ADD, STR_PTR, 0, STR_PTR, 0, SLJIT_IMM, 1);
+OP2(SLJIT_ADD, STR_PTR, 0, STR_PTR, 0, SLJIT_IMM, IN_UCHARS(1));
OP1(SLJIT_MOV, LCC_TABLE, 0, TMP3, 0);
OP1(SLJIT_MOV, CHAR1, 0, SLJIT_MEM1(SLJIT_LOCALS_REG), LOCALS0);
OP1(SLJIT_MOV, CHAR2, 0, SLJIT_MEM1(SLJIT_LOCALS_REG), LOCALS1);
@@ -2228,15 +2435,14 @@
#undef CHAR1
#undef CHAR2

-#ifdef SUPPORT_UTF8
-#ifdef SUPPORT_UCP
+#if defined SUPPORT_UTF && defined SUPPORT_UCP

-static uschar * SLJIT_CALL do_utf8caselesscmp(uschar *src1, jit_arguments *args, uschar *end1)
+static const pcre_uchar *SLJIT_CALL do_utf_caselesscmp(pcre_uchar *src1, jit_arguments *args, pcre_uchar *end1)
{
/* This function would be ineffective to do in JIT level. */
int c1, c2;
-uschar *src2 = args->ptr;
-uschar *end2 = (uschar*)args->end;
+const pcre_uchar *src2 = args->ptr;
+const pcre_uchar *end2 = args->end;

while (src1 < end1)
{
@@ -2249,17 +2455,16 @@
return src2;
}

-#endif
-#endif
+#endif /* SUPPORT_UTF && SUPPORT_UCP */

-static uschar *byte_sequence_compare(compiler_common *common, BOOL caseless, uschar *cc,
+static pcre_uchar *byte_sequence_compare(compiler_common *common, BOOL caseless, pcre_uchar *cc,
     compare_context* context, jump_list **fallbacks)
 {
 DEFINE_COMPILER;
 unsigned int othercasebit = 0;
-uschar *othercasebyte = NULL;
-#ifdef SUPPORT_UTF8
-int utf8length;
+pcre_uchar *othercasechar = NULL;
+#ifdef SUPPORT_UTF
+int utflength;
 #endif

 if (caseless && char_has_othercase(common, cc))
@@ -2267,12 +2472,23 @@
   othercasebit = char_get_othercase_bit(common, cc);
   SLJIT_ASSERT(othercasebit);
   /* Extracting bit difference info. */
-  othercasebyte = cc + (othercasebit >> 8);
+#ifdef COMPILE_PCRE8
+  othercasechar = cc + (othercasebit >> 8);
   othercasebit &= 0xff;
+#else
+#ifdef COMPILE_PCRE16
+  othercasechar = cc + (othercasebit >> 9);
+  if ((othercasebit & 0x100) != 0)
+    othercasebit = (othercasebit & 0xff) << 8;
+  else
+    othercasebit &= 0xff;
+#endif
+#endif
   }

 if (context->sourcereg == -1)
   {
+#ifdef COMPILE_PCRE8
 #if defined SLJIT_UNALIGNED && SLJIT_UNALIGNED
   if (context->length >= 4)
     OP1(SLJIT_MOV_SI, TMP1, 0, SLJIT_MEM1(STR_PTR), -context->length);
@@ -2281,79 +2497,105 @@
   else
 #endif
     OP1(SLJIT_MOV_UB, TMP1, 0, SLJIT_MEM1(STR_PTR), -context->length);
+#else
+#ifdef COMPILE_PCRE16
+#if defined SLJIT_UNALIGNED && SLJIT_UNALIGNED
+  if (context->length >= 4)
+    OP1(SLJIT_MOV_SI, TMP1, 0, SLJIT_MEM1(STR_PTR), -context->length);
+  else
+#endif
+    OP1(SLJIT_MOV_SH, TMP1, 0, SLJIT_MEM1(STR_PTR), -context->length);
+#endif
+#endif /* COMPILE_PCRE8 */
   context->sourcereg = TMP2;
   }

-#ifdef SUPPORT_UTF8
-utf8length = 1;
-if (common->utf8 && *cc >= 0xc0)
-  utf8length += _pcre_utf8_table4[*cc & 0x3f];
+#ifdef SUPPORT_UTF
+utflength = 1;
+if (common->utf && HAS_EXTRALEN(*cc))
+  utflength += GET_EXTRALEN(*cc);

 do
   {
 #endif

-  context->length--;
+  context->length -= IN_UCHARS(1);
 #if defined SLJIT_UNALIGNED && SLJIT_UNALIGNED

   /* Unaligned read is supported. */
-  if (othercasebit != 0 && othercasebyte == cc)
+  if (othercasebit != 0 && othercasechar == cc)
     {
-    context->c.asbytes[context->byteptr] = *cc | othercasebit;
-    context->oc.asbytes[context->byteptr] = othercasebit;
+    context->c.asuchars[context->ucharptr] = *cc | othercasebit;
+    context->oc.asuchars[context->ucharptr] = othercasebit;
     }
   else
     {
-    context->c.asbytes[context->byteptr] = *cc;
-    context->oc.asbytes[context->byteptr] = 0;
+    context->c.asuchars[context->ucharptr] = *cc;
+    context->oc.asuchars[context->ucharptr] = 0;
     }
-  context->byteptr++;
+  context->ucharptr++;

-  if (context->byteptr >= 4 || context->length == 0 || (context->byteptr == 2 && context->length == 1))
+#ifdef COMPILE_PCRE8
+  if (context->ucharptr >= 4 || context->length == 0 || (context->ucharptr == 2 && context->length == 1))
+#else
+  if (context->ucharptr >= 2 || context->length == 0)
+#endif
     {
     if (context->length >= 4)
       OP1(SLJIT_MOV_SI, context->sourcereg, 0, SLJIT_MEM1(STR_PTR), -context->length);
+#ifdef COMPILE_PCRE8
     else if (context->length >= 2)
       OP1(SLJIT_MOV_SH, context->sourcereg, 0, SLJIT_MEM1(STR_PTR), -context->length);
     else if (context->length >= 1)
       OP1(SLJIT_MOV_UB, context->sourcereg, 0, SLJIT_MEM1(STR_PTR), -context->length);
+#else
+    else if (context->length >= 2)
+      OP1(SLJIT_MOV_SH, context->sourcereg, 0, SLJIT_MEM1(STR_PTR), -context->length);
+#endif
     context->sourcereg = context->sourcereg == TMP1 ? TMP2 : TMP1;

-    switch(context->byteptr)
+    switch(context->ucharptr)
       {
-      case 4:
+      case 4 / sizeof(pcre_uchar):
       if (context->oc.asint != 0)
         OP2(SLJIT_OR, context->sourcereg, 0, context->sourcereg, 0, SLJIT_IMM, context->oc.asint);
       add_jump(compiler, fallbacks, CMP(SLJIT_C_NOT_EQUAL, context->sourcereg, 0, SLJIT_IMM, context->c.asint | context->oc.asint));
       break;

-      case 2:
+      case 2 / sizeof(pcre_uchar):
       if (context->oc.asshort != 0)
         OP2(SLJIT_OR, context->sourcereg, 0, context->sourcereg, 0, SLJIT_IMM, context->oc.asshort);
       add_jump(compiler, fallbacks, CMP(SLJIT_C_NOT_EQUAL, context->sourcereg, 0, SLJIT_IMM, context->c.asshort | context->oc.asshort));
       break;

+#ifdef COMPILE_PCRE8
       case 1:
       if (context->oc.asbyte != 0)
         OP2(SLJIT_OR, context->sourcereg, 0, context->sourcereg, 0, SLJIT_IMM, context->oc.asbyte);
       add_jump(compiler, fallbacks, CMP(SLJIT_C_NOT_EQUAL, context->sourcereg, 0, SLJIT_IMM, context->c.asbyte | context->oc.asbyte));
       break;
+#endif

       default:
       SLJIT_ASSERT_STOP();
       break;
       }
-    context->byteptr = 0;
+    context->ucharptr = 0;
     }

 #else

   /* Unaligned read is unsupported. */
+#ifdef COMPILE_PCRE8
   if (context->length > 0)
     OP1(SLJIT_MOV_UB, context->sourcereg, 0, SLJIT_MEM1(STR_PTR), -context->length);
+#else
+  if (context->length > 0)
+    OP1(SLJIT_MOV_UH, context->sourcereg, 0, SLJIT_MEM1(STR_PTR), -context->length);
+#endif
   context->sourcereg = context->sourcereg == TMP1 ? TMP2 : TMP1;

-  if (othercasebit != 0 && othercasebyte == cc)
+  if (othercasebit != 0 && othercasechar == cc)
     {
     OP2(SLJIT_OR, context->sourcereg, 0, context->sourcereg, 0, SLJIT_IMM, othercasebit);
     add_jump(compiler, fallbacks, CMP(SLJIT_C_NOT_EQUAL, context->sourcereg, 0, SLJIT_IMM, *cc | othercasebit));
@@ -2364,16 +2606,16 @@
 #endif

   cc++;
-#ifdef SUPPORT_UTF8
-  utf8length--;
+#ifdef SUPPORT_UTF
+  utflength--;
   }
-while (utf8length > 0);
+while (utflength > 0);
 #endif

 return cc;
 }

-#ifdef SUPPORT_UTF8
+#if defined SUPPORT_UTF || !defined COMPILE_PCRE8

 #define SET_TYPE_OFFSET(value) \
   if ((value) != typeoffset) \
@@ -2395,7 +2637,7 @@
     } \
   charoffset = (value);

-static void compile_xclass_hotpath(compiler_common *common, uschar *cc, jump_list **fallbacks)
+static void compile_xclass_hotpath(compiler_common *common, pcre_uchar *cc, jump_list **fallbacks)
 {
 DEFINE_COMPILER;
 jump_list *found = NULL;
@@ -2403,7 +2645,7 @@
 unsigned int c;
 int compares;
 struct sljit_jump *jump = NULL;
-uschar *ccbegin;
+pcre_uchar *ccbegin;
 #ifdef SUPPORT_UCP
 BOOL needstype = FALSE, needsscript = FALSE, needschar = FALSE;
 BOOL charsaved = FALSE;
@@ -2413,15 +2655,19 @@
 int invertcmp, numberofcmps;
 unsigned int charoffset;

-/* Although SUPPORT_UTF8 must be defined, we are not necessary in utf8 mode. */
+/* Although SUPPORT_UTF must be defined, we are not necessary in utf mode. */
 check_input_end(common, fallbacks);
 read_char(common);

 if ((*cc++ & XCL_MAP) != 0)
   {
   OP1(SLJIT_MOV, TMP3, 0, TMP1, 0);
-  if (common->utf8)
+#ifndef COMPILE_PCRE8
+  jump = CMP(SLJIT_C_GREATER, TMP1, 0, SLJIT_IMM, 255);
+#elif defined SUPPORT_UTF
+  if (common->utf)
     jump = CMP(SLJIT_C_GREATER, TMP1, 0, SLJIT_IMM, 255);
+#endif

   OP2(SLJIT_AND, TMP2, 0, TMP1, 0, SLJIT_IMM, 0x7);
   OP2(SLJIT_LSHR, TMP1, 0, TMP1, 0, SLJIT_IMM, 3);
@@ -2430,13 +2676,17 @@
   OP2(SLJIT_AND | SLJIT_SET_E, SLJIT_UNUSED, 0, TMP1, 0, TMP2, 0);
   add_jump(compiler, list, JUMP(SLJIT_C_NOT_ZERO));

-  if (common->utf8)
+#ifndef COMPILE_PCRE8
+  JUMPHERE(jump);
+#elif defined SUPPORT_UTF
+  if (common->utf)
     JUMPHERE(jump);
+#endif
   OP1(SLJIT_MOV, TMP1, 0, TMP3, 0);
 #ifdef SUPPORT_UCP
   charsaved = TRUE;
 #endif
-  cc += 32;
+  cc += 32 / sizeof(pcre_uchar);
   }

 /* Scanning the necessary info. */
@@ -2448,8 +2698,8 @@
   if (*cc == XCL_SINGLE)
     {
     cc += 2;
-#ifdef SUPPORT_UTF8
-    if (common->utf8 && cc[-1] >= 0xc0) cc += _pcre_utf8_table4[cc[-1] & 0x3f];
+#ifdef SUPPORT_UTF
+    if (common->utf && HAS_EXTRALEN(cc[-1])) cc += GET_EXTRALEN(cc[-1]);
 #endif
 #ifdef SUPPORT_UCP
     needschar = TRUE;
@@ -2458,12 +2708,12 @@
   else if (*cc == XCL_RANGE)
     {
     cc += 2;
-#ifdef SUPPORT_UTF8
-    if (common->utf8 && cc[-1] >= 0xc0) cc += _pcre_utf8_table4[cc[-1] & 0x3f];
+#ifdef SUPPORT_UTF
+    if (common->utf && HAS_EXTRALEN(cc[-1])) cc += GET_EXTRALEN(cc[-1]);
 #endif
     cc++;
-#ifdef SUPPORT_UTF8
-    if (common->utf8 && cc[-1] >= 0xc0) cc += _pcre_utf8_table4[cc[-1] & 0x3f];
+#ifdef SUPPORT_UTF
+    if (common->utf && HAS_EXTRALEN(cc[-1])) cc += GET_EXTRALEN(cc[-1]);
 #endif
 #ifdef SUPPORT_UCP
     needschar = TRUE;
@@ -2533,13 +2783,13 @@
     {
     if (scriptreg == TMP1)
       {
-      OP1(SLJIT_MOV, scriptreg, 0, SLJIT_IMM, (sljit_w)_pcre_ucd_records + SLJIT_OFFSETOF(ucd_record, script));
+      OP1(SLJIT_MOV, scriptreg, 0, SLJIT_IMM, (sljit_w)PRIV(ucd_records) + SLJIT_OFFSETOF(ucd_record, script));
       OP1(SLJIT_MOV_UB, scriptreg, 0, SLJIT_MEM2(scriptreg, TMP2), 3);
       }
     else
       {
       OP2(SLJIT_SHL, TMP2, 0, TMP2, 0, SLJIT_IMM, 3);
-      OP2(SLJIT_ADD, TMP2, 0, TMP2, 0, SLJIT_IMM, (sljit_w)_pcre_ucd_records + SLJIT_OFFSETOF(ucd_record, script));
+      OP2(SLJIT_ADD, TMP2, 0, TMP2, 0, SLJIT_IMM, (sljit_w)PRIV(ucd_records) + SLJIT_OFFSETOF(ucd_record, script));
       OP1(SLJIT_MOV_UB, scriptreg, 0, SLJIT_MEM1(TMP2), 0);
       }
     }
@@ -2563,8 +2813,8 @@
   if (*cc == XCL_SINGLE)
     {
     cc ++;
-#ifdef SUPPORT_UTF8
-    if (common->utf8)
+#ifdef SUPPORT_UTF
+    if (common->utf)
       {
       GETCHARINC(c, cc);
       }
@@ -2594,8 +2844,8 @@
   else if (*cc == XCL_RANGE)
     {
     cc ++;
-#ifdef SUPPORT_UTF8
-    if (common->utf8)
+#ifdef SUPPORT_UTF
+    if (common->utf)
       {
       GETCHARINC(c, cc);
       }
@@ -2603,8 +2853,8 @@
 #endif
       c = *cc++;
     SET_CHAR_OFFSET(c);
-#ifdef SUPPORT_UTF8
-    if (common->utf8)
+#ifdef SUPPORT_UTF
+    if (common->utf)
       {
       GETCHARINC(c, cc);
       }
@@ -2660,9 +2910,9 @@
       break;

       case PT_GC:
-      c = _pcre_ucp_typerange[(int)cc[1] * 2];
+      c = PRIV(ucp_typerange)[(int)cc[1] * 2];
       SET_TYPE_OFFSET(c);
-      jump = CMP(SLJIT_C_LESS_EQUAL ^ invertcmp, typereg, 0, SLJIT_IMM, _pcre_ucp_typerange[(int)cc[1] * 2 + 1] - c);
+      jump = CMP(SLJIT_C_LESS_EQUAL ^ invertcmp, typereg, 0, SLJIT_IMM, PRIV(ucp_typerange)[(int)cc[1] * 2 + 1] - c);
       break;

       case PT_PC:
@@ -2724,17 +2974,17 @@

 #endif

-static uschar *compile_char1_hotpath(compiler_common *common, uschar type, uschar *cc, jump_list **fallbacks)
+static pcre_uchar *compile_char1_hotpath(compiler_common *common, pcre_uchar type, pcre_uchar *cc, jump_list **fallbacks)
 {
 DEFINE_COMPILER;
 int length;
 unsigned int c, oc, bit;
 compare_context context;
 struct sljit_jump *jump[4];
-#ifdef SUPPORT_UTF8
+#ifdef SUPPORT_UTF
 struct sljit_label *label;
 #ifdef SUPPORT_UCP
-uschar propdata[5];
+pcre_uchar propdata[5];
 #endif
 #endif

@@ -2789,7 +3039,7 @@
     {
     jump[0] = CMP(SLJIT_C_NOT_EQUAL, TMP1, 0, SLJIT_IMM, (common->newline >> 8) & 0xff);
     jump[1] = CMP(SLJIT_C_GREATER_EQUAL, STR_PTR, 0, STR_END, 0);
-    OP1(SLJIT_MOV_UB, TMP1, 0, SLJIT_MEM1(STR_PTR), 0);
+    OP1(MOV_UCHAR, TMP1, 0, SLJIT_MEM1(STR_PTR), 0);
     add_jump(compiler, fallbacks, CMP(SLJIT_C_EQUAL, TMP1, 0, SLJIT_IMM, common->newline & 0xff));
     JUMPHERE(jump[1]);
     JUMPHERE(jump[0]);
@@ -2800,27 +3050,38 @@

   case OP_ALLANY:
   check_input_end(common, fallbacks);
-#ifdef SUPPORT_UTF8
-  if (common->utf8)
+#ifdef SUPPORT_UTF
+  if (common->utf)
     {
-    OP1(SLJIT_MOV_UB, TMP1, 0, SLJIT_MEM1(STR_PTR), 0);
-    OP2(SLJIT_ADD, STR_PTR, 0, STR_PTR, 0, SLJIT_IMM, 1);
+    OP1(MOV_UCHAR, TMP1, 0, SLJIT_MEM1(STR_PTR), 0);
+    OP2(SLJIT_ADD, STR_PTR, 0, STR_PTR, 0, SLJIT_IMM, IN_UCHARS(1));
+#ifdef COMPILE_PCRE8
     jump[0] = CMP(SLJIT_C_LESS, TMP1, 0, SLJIT_IMM, 0xc0);
-    OP1(SLJIT_MOV_UB, TMP1, 0, SLJIT_MEM1(TMP1), (sljit_w)_pcre_utf8_char_sizes - 0xc0);
+    OP1(SLJIT_MOV_UB, TMP1, 0, SLJIT_MEM1(TMP1), (sljit_w)PRIV(utf8_table4) - 0xc0);
     OP2(SLJIT_ADD, STR_PTR, 0, STR_PTR, 0, TMP1, 0);
+#else /* COMPILE_PCRE8 */
+#ifdef COMPILE_PCRE16
+    jump[0] = CMP(SLJIT_C_LESS, TMP1, 0, SLJIT_IMM, 0xd800);
+    OP2(SLJIT_AND, TMP1, 0, TMP1, 0, SLJIT_IMM, 0xfc00);
+    OP2(SLJIT_SUB | SLJIT_SET_E, SLJIT_UNUSED, 0, TMP1, 0, SLJIT_IMM, 0xd800);
+    COND_VALUE(SLJIT_MOV, TMP1, 0, SLJIT_C_EQUAL);
+    OP2(SLJIT_SHL, TMP1, 0, TMP1, 0, SLJIT_IMM, 1);
+    OP2(SLJIT_ADD, STR_PTR, 0, STR_PTR, 0, TMP1, 0);
+#endif /* COMPILE_PCRE16 */
+#endif /* COMPILE_PCRE8 */
     JUMPHERE(jump[0]);
     return cc;
     }
 #endif
-  OP2(SLJIT_ADD, STR_PTR, 0, STR_PTR, 0, SLJIT_IMM, 1);
+  OP2(SLJIT_ADD, STR_PTR, 0, STR_PTR, 0, SLJIT_IMM, IN_UCHARS(1));
   return cc;

   case OP_ANYBYTE:
   check_input_end(common, fallbacks);
-  OP2(SLJIT_ADD, STR_PTR, 0, STR_PTR, 0, SLJIT_IMM, 1);
+  OP2(SLJIT_ADD, STR_PTR, 0, STR_PTR, 0, SLJIT_IMM, IN_UCHARS(1));
   return cc;

-#ifdef SUPPORT_UTF8
+#ifdef SUPPORT_UTF
 #ifdef SUPPORT_UCP
   case OP_NOTPROP:
   case OP_PROP:
@@ -2839,9 +3100,9 @@
   read_char(common);
   jump[0] = CMP(SLJIT_C_NOT_EQUAL, TMP1, 0, SLJIT_IMM, CHAR_CR);
   jump[1] = CMP(SLJIT_C_GREATER_EQUAL, STR_PTR, 0, STR_END, 0);
-  OP1(SLJIT_MOV_UB, TMP1, 0, SLJIT_MEM1(STR_PTR), 0);
+  OP1(MOV_UCHAR, TMP1, 0, SLJIT_MEM1(STR_PTR), 0);
   jump[2] = CMP(SLJIT_C_NOT_EQUAL, TMP1, 0, SLJIT_IMM, CHAR_NL);
-  OP2(SLJIT_ADD, STR_PTR, 0, STR_PTR, 0, SLJIT_IMM, 1);
+  OP2(SLJIT_ADD, STR_PTR, 0, STR_PTR, 0, SLJIT_IMM, IN_UCHARS(1));
   jump[3] = JUMP(SLJIT_JUMP);
   JUMPHERE(jump[0]);
   check_newlinechar(common, common->bsr_nltype, fallbacks, FALSE);
@@ -2891,36 +3152,37 @@
   jump[0] = CMP(SLJIT_C_GREATER_EQUAL, STR_PTR, 0, STR_END, 0);
   if (common->nltype == NLTYPE_FIXED && common->newline > 255)
     {
-    OP2(SLJIT_ADD, TMP2, 0, STR_PTR, 0, SLJIT_IMM, 2);
-    OP1(SLJIT_MOV_UB, TMP1, 0, SLJIT_MEM1(STR_PTR), 0);
+    OP2(SLJIT_ADD, TMP2, 0, STR_PTR, 0, SLJIT_IMM, IN_UCHARS(2));
+    OP1(MOV_UCHAR, TMP1, 0, SLJIT_MEM1(STR_PTR), IN_UCHARS(0));
     add_jump(compiler, fallbacks, CMP(SLJIT_C_NOT_EQUAL, TMP2, 0, STR_END, 0));
-    OP1(SLJIT_MOV_UB, TMP2, 0, SLJIT_MEM1(STR_PTR), 1);
+    OP1(MOV_UCHAR, TMP2, 0, SLJIT_MEM1(STR_PTR), IN_UCHARS(1));
     add_jump(compiler, fallbacks, CMP(SLJIT_C_NOT_EQUAL, TMP1, 0, SLJIT_IMM, (common->newline >> 8) & 0xff));
     add_jump(compiler, fallbacks, CMP(SLJIT_C_NOT_EQUAL, TMP2, 0, SLJIT_IMM, common->newline & 0xff));
     }
   else if (common->nltype == NLTYPE_FIXED)
     {
-    OP2(SLJIT_ADD, TMP2, 0, STR_PTR, 0, SLJIT_IMM, 1);
-    OP1(SLJIT_MOV_UB, TMP1, 0, SLJIT_MEM1(STR_PTR), 0);
+    OP2(SLJIT_ADD, TMP2, 0, STR_PTR, 0, SLJIT_IMM, IN_UCHARS(1));
+    OP1(MOV_UCHAR, TMP1, 0, SLJIT_MEM1(STR_PTR), IN_UCHARS(0));
     add_jump(compiler, fallbacks, CMP(SLJIT_C_NOT_EQUAL, TMP2, 0, STR_END, 0));
     add_jump(compiler, fallbacks, CMP(SLJIT_C_NOT_EQUAL, TMP1, 0, SLJIT_IMM, common->newline));
     }
   else
     {
-    OP1(SLJIT_MOV_UB, TMP1, 0, SLJIT_MEM1(STR_PTR), 0);
+    OP1(MOV_UCHAR, TMP1, 0, SLJIT_MEM1(STR_PTR), IN_UCHARS(0));
     jump[1] = CMP(SLJIT_C_NOT_EQUAL, TMP1, 0, SLJIT_IMM, CHAR_CR);
-    OP2(SLJIT_ADD, TMP2, 0, STR_PTR, 0, SLJIT_IMM, 2);
+    OP2(SLJIT_ADD, TMP2, 0, STR_PTR, 0, SLJIT_IMM, IN_UCHARS(2));
     OP2(SLJIT_SUB | SLJIT_SET_U, SLJIT_UNUSED, 0, TMP2, 0, STR_END, 0);
     jump[2] = JUMP(SLJIT_C_GREATER);
     add_jump(compiler, fallbacks, JUMP(SLJIT_C_LESS));
-    OP1(SLJIT_MOV_UB, TMP1, 0, SLJIT_MEM1(STR_PTR), 1);
+    /* Equal. */
+    OP1(MOV_UCHAR, TMP1, 0, SLJIT_MEM1(STR_PTR), IN_UCHARS(1));
     jump[3] = CMP(SLJIT_C_EQUAL, TMP1, 0, SLJIT_IMM, CHAR_NL);
     add_jump(compiler, fallbacks, JUMP(SLJIT_JUMP));

     JUMPHERE(jump[1]);
     if (common->nltype == NLTYPE_ANYCRLF)
       {
-      OP2(SLJIT_ADD, TMP2, 0, STR_PTR, 0, SLJIT_IMM, 1);
+      OP2(SLJIT_ADD, TMP2, 0, STR_PTR, 0, SLJIT_IMM, IN_UCHARS(1));
       add_jump(compiler, fallbacks, CMP(SLJIT_C_LESS, TMP2, 0, STR_END, 0));
       add_jump(compiler, fallbacks, CMP(SLJIT_C_NOT_EQUAL, TMP1, 0, SLJIT_IMM, CHAR_NL));
       }
@@ -2960,15 +3222,13 @@
   jump[0] = JUMP(SLJIT_JUMP);
   JUMPHERE(jump[1]);

-  OP1(SLJIT_MOV, TMP2, 0, SLJIT_MEM1(TMP2), SLJIT_OFFSETOF(jit_arguments, end));
-  add_jump(compiler, fallbacks, CMP(SLJIT_C_EQUAL, TMP2, 0, STR_PTR, 0));
-
+  add_jump(compiler, fallbacks, CMP(SLJIT_C_EQUAL, STR_PTR, 0, STR_END, 0));
   if (common->nltype == NLTYPE_FIXED && common->newline > 255)
     {
-    OP2(SLJIT_SUB, TMP2, 0, STR_PTR, 0, SLJIT_IMM, 2);
+    OP2(SLJIT_SUB, TMP2, 0, STR_PTR, 0, SLJIT_IMM, IN_UCHARS(2));
     add_jump(compiler, fallbacks, CMP(SLJIT_C_LESS, TMP2, 0, TMP1, 0));
-    OP1(SLJIT_MOV_UB, TMP1, 0, SLJIT_MEM1(STR_PTR), -2);
-    OP1(SLJIT_MOV_UB, TMP2, 0, SLJIT_MEM1(STR_PTR), -1);
+    OP1(MOV_UCHAR, TMP1, 0, SLJIT_MEM1(STR_PTR), IN_UCHARS(-2));
+    OP1(MOV_UCHAR, TMP2, 0, SLJIT_MEM1(STR_PTR), IN_UCHARS(-1));
     add_jump(compiler, fallbacks, CMP(SLJIT_C_NOT_EQUAL, TMP1, 0, SLJIT_IMM, (common->newline >> 8) & 0xff));
     add_jump(compiler, fallbacks, CMP(SLJIT_C_NOT_EQUAL, TMP2, 0, SLJIT_IMM, common->newline & 0xff));
     }
@@ -3002,10 +3262,10 @@

   if (common->nltype == NLTYPE_FIXED && common->newline > 255)
     {
-    OP2(SLJIT_ADD, TMP2, 0, STR_PTR, 0, SLJIT_IMM, 2);
+    OP2(SLJIT_ADD, TMP2, 0, STR_PTR, 0, SLJIT_IMM, IN_UCHARS(2));
     add_jump(compiler, fallbacks, CMP(SLJIT_C_GREATER, TMP2, 0, STR_END, 0));
-    OP1(SLJIT_MOV_UB, TMP1, 0, SLJIT_MEM1(STR_PTR), 0);
-    OP1(SLJIT_MOV_UB, TMP2, 0, SLJIT_MEM1(STR_PTR), 1);
+    OP1(MOV_UCHAR, TMP1, 0, SLJIT_MEM1(STR_PTR), IN_UCHARS(0));
+    OP1(MOV_UCHAR, TMP2, 0, SLJIT_MEM1(STR_PTR), IN_UCHARS(1));
     add_jump(compiler, fallbacks, CMP(SLJIT_C_NOT_EQUAL, TMP1, 0, SLJIT_IMM, (common->newline >> 8) & 0xff));
     add_jump(compiler, fallbacks, CMP(SLJIT_C_NOT_EQUAL, TMP2, 0, SLJIT_IMM, common->newline & 0xff));
     }
@@ -3020,25 +3280,25 @@
   case OP_CHAR:
   case OP_CHARI:
   length = 1;
-#ifdef SUPPORT_UTF8
-  if (common->utf8 && *cc >= 0xc0) length += _pcre_utf8_table4[*cc & 0x3f];
+#ifdef SUPPORT_UTF
+  if (common->utf && HAS_EXTRALEN(*cc)) length += GET_EXTRALEN(*cc);
 #endif
   if (type == OP_CHAR || !char_has_othercase(common, cc) || char_get_othercase_bit(common, cc) != 0)
     {
-    OP2(SLJIT_ADD, STR_PTR, 0, STR_PTR, 0, SLJIT_IMM, length);
+    OP2(SLJIT_ADD, STR_PTR, 0, STR_PTR, 0, SLJIT_IMM, IN_UCHARS(length));
     add_jump(compiler, fallbacks, CMP(SLJIT_C_GREATER, STR_PTR, 0, STR_END, 0));

-    context.length = length;
+    context.length = IN_UCHARS(length);
     context.sourcereg = -1;
 #if defined SLJIT_UNALIGNED && SLJIT_UNALIGNED
-    context.byteptr = 0;
+    context.ucharptr = 0;
 #endif
     return byte_sequence_compare(common, type == OP_CHARI, cc, &context, fallbacks);
     }
-  add_jump(compiler, fallbacks, CMP(SLJIT_C_GREATER_EQUAL, STR_PTR, 0, STR_END, 0));
+  check_input_end(common, fallbacks);
   read_char(common);
-#ifdef SUPPORT_UTF8
-  if (common->utf8)
+#ifdef SUPPORT_UTF
+  if (common->utf)
     {
     GETCHAR(c, cc);
     }
@@ -3054,16 +3314,14 @@

   case OP_NOT:
   case OP_NOTI:
+  check_input_end(common, fallbacks);
   length = 1;
-#ifdef SUPPORT_UTF8
-  if (common->utf8)
+#ifdef SUPPORT_UTF
+  if (common->utf)
     {
-    if (*cc >= 0xc0) length += _pcre_utf8_table4[*cc & 0x3f];
-
-    check_input_end(common, fallbacks);
-    GETCHAR(c, cc);
-
-    if (c <= 127)
+#ifdef COMPILE_PCRE8
+    c = *cc;
+    if (c < 128)
       {
       OP1(SLJIT_MOV_UB, TMP1, 0, SLJIT_MEM1(STR_PTR), 0);
       if (type == OP_NOT || !char_has_othercase(common, cc))
@@ -3075,22 +3333,24 @@
         add_jump(compiler, fallbacks, CMP(SLJIT_C_EQUAL, TMP2, 0, SLJIT_IMM, c | 0x20));
         }
       /* Skip the variable-length character. */
-      OP2(SLJIT_ADD, STR_PTR, 0, STR_PTR, 0, SLJIT_IMM, 1);
+      OP2(SLJIT_ADD, STR_PTR, 0, STR_PTR, 0, SLJIT_IMM, IN_UCHARS(1));
       jump[0] = CMP(SLJIT_C_LESS, TMP1, 0, SLJIT_IMM, 0xc0);
-      OP1(SLJIT_MOV_UB, TMP1, 0, SLJIT_MEM1(TMP1), (sljit_w)_pcre_utf8_char_sizes - 0xc0);
+      OP1(MOV_UCHAR, TMP1, 0, SLJIT_MEM1(TMP1), (sljit_w)PRIV(utf8_table4) - 0xc0);
       OP2(SLJIT_ADD, STR_PTR, 0, STR_PTR, 0, TMP1, 0);
       JUMPHERE(jump[0]);
-      return cc + length;
+      return cc + 1;
       }
     else
+#endif /* COMPILE_PCRE8 */
+      {
+      GETCHARLEN(c, cc, length);
       read_char(common);
+      }
     }
   else
-#endif
+#endif /* SUPPORT_UTF */
     {
-    OP2(SLJIT_ADD, STR_PTR, 0, STR_PTR, 0, SLJIT_IMM, 1);
-    add_jump(compiler, fallbacks, CMP(SLJIT_C_GREATER, STR_PTR, 0, STR_END, 0));
-    OP1(SLJIT_MOV_UB, TMP1, 0, SLJIT_MEM1(STR_PTR), -1);
+    read_char(common);
     c = *cc;
     }

@@ -3111,15 +3371,19 @@
       add_jump(compiler, fallbacks, CMP(SLJIT_C_EQUAL, TMP1, 0, SLJIT_IMM, oc));
       }
     }
-  return cc + length;
+  return cc + 1;

   case OP_CLASS:
   case OP_NCLASS:
   check_input_end(common, fallbacks);
   read_char(common);
-#ifdef SUPPORT_UTF8
+#if defined SUPPORT_UTF || !defined COMPILE_PCRE8
   jump[0] = NULL;
-  if (common->utf8)
+#ifdef COMPILE_PCRE8
+  /* This check only affects 8 bit mode. In other modes, we
+  always need to compare the value with 255. */
+  if (common->utf)
+#endif /* COMPILE_PCRE8 */
     {
     jump[0] = CMP(SLJIT_C_GREATER, TMP1, 0, SLJIT_IMM, 255);
     if (type == OP_CLASS)
@@ -3128,20 +3392,20 @@
       jump[0] = NULL;
       }
     }
-#endif
+#endif /* SUPPORT_UTF || !COMPILE_PCRE8 */
   OP2(SLJIT_AND, TMP2, 0, TMP1, 0, SLJIT_IMM, 0x7);
   OP2(SLJIT_LSHR, TMP1, 0, TMP1, 0, SLJIT_IMM, 3);
   OP1(SLJIT_MOV_UB, TMP1, 0, SLJIT_MEM1(TMP1), (sljit_w)cc);
   OP2(SLJIT_SHL, TMP2, 0, SLJIT_IMM, 1, TMP2, 0);
   OP2(SLJIT_AND | SLJIT_SET_E, SLJIT_UNUSED, 0, TMP1, 0, TMP2, 0);
   add_jump(compiler, fallbacks, JUMP(SLJIT_C_ZERO));
-#ifdef SUPPORT_UTF8
+#if defined SUPPORT_UTF || !defined COMPILE_PCRE8
   if (jump[0] != NULL)
     JUMPHERE(jump[0]);
-#endif
-  return cc + 32;
+#endif /* SUPPORT_UTF || !COMPILE_PCRE8 */
+  return cc + 32 / sizeof(pcre_uchar);

-#ifdef SUPPORT_UTF8
+#if defined SUPPORT_UTF || defined COMPILE_PCRE16
   case OP_XCLASS:
   compile_xclass_hotpath(common, cc + LINK_SIZE, fallbacks);
   return cc + GET(cc, 0) - 1;
@@ -3151,20 +3415,21 @@
   length = GET(cc, 0);
   SLJIT_ASSERT(length > 0);
   OP1(SLJIT_MOV, TMP1, 0, ARGUMENTS, 0);
-  OP1(SLJIT_MOV, TMP1, 0, SLJIT_MEM1(TMP1), SLJIT_OFFSETOF(jit_arguments, begin));
-#ifdef SUPPORT_UTF8
-  if (common->utf8)
+#ifdef SUPPORT_UTF
+  if (common->utf)
     {
+    OP1(SLJIT_MOV, TMP3, 0, SLJIT_MEM1(TMP1), SLJIT_OFFSETOF(jit_arguments, begin));
     OP1(SLJIT_MOV, TMP2, 0, SLJIT_IMM, length);
     label = LABEL();
-    add_jump(compiler, fallbacks, CMP(SLJIT_C_LESS_EQUAL, STR_PTR, 0, TMP1, 0));
+    add_jump(compiler, fallbacks, CMP(SLJIT_C_LESS_EQUAL, STR_PTR, 0, TMP3, 0));
     skip_char_back(common);
     OP2(SLJIT_SUB | SLJIT_SET_E, TMP2, 0, TMP2, 0, SLJIT_IMM, 1);
     JUMPTO(SLJIT_C_NOT_ZERO, label);
     return cc + LINK_SIZE;
     }
 #endif
-  OP2(SLJIT_SUB, STR_PTR, 0, STR_PTR, 0, SLJIT_IMM, length);
+  OP1(SLJIT_MOV, TMP1, 0, SLJIT_MEM1(TMP1), SLJIT_OFFSETOF(jit_arguments, begin));
+  OP2(SLJIT_SUB, STR_PTR, 0, STR_PTR, 0, SLJIT_IMM, IN_UCHARS(length));
   add_jump(compiler, fallbacks, CMP(SLJIT_C_LESS, STR_PTR, 0, TMP1, 0));
   return cc + LINK_SIZE;
   }
@@ -3172,12 +3437,12 @@
 return cc;
 }

-static SLJIT_INLINE uschar *compile_charn_hotpath(compiler_common *common, uschar *cc, uschar *ccend, jump_list **fallbacks)
+static SLJIT_INLINE pcre_uchar *compile_charn_hotpath(compiler_common *common, pcre_uchar *cc, pcre_uchar *ccend, jump_list **fallbacks)
 {
 /* This function consumes at least one input character. */
 /* To decrease the number of length checks, we try to concatenate the fixed length character sequences. */
 DEFINE_COMPILER;
-uschar *ccbegin = cc;
+pcre_uchar *ccbegin = cc;
 compare_context context;
 int size;
@@ -3190,21 +3455,21 @@
   if (*cc == OP_CHAR)
     {
     size = 1;
-#ifdef SUPPORT_UTF8
-    if (common->utf8 && cc[1] >= 0xc0)
-      size += _pcre_utf8_table4[cc[1] & 0x3f];
+#ifdef SUPPORT_UTF
+    if (common->utf && HAS_EXTRALEN(cc[1]))
+      size += GET_EXTRALEN(cc[1]);
 #endif
     }
   else if (*cc == OP_CHARI)
     {
     size = 1;
-#ifdef SUPPORT_UTF8
-    if (common->utf8)
+#ifdef SUPPORT_UTF
+    if (common->utf)
       {
       if (char_has_othercase(common, cc + 1) && char_get_othercase_bit(common, cc + 1) == 0)
         size = 0;
-      else if (cc[1] >= 0xc0)
-        size += _pcre_utf8_table4[cc[1] & 0x3f];
+      else if (HAS_EXTRALEN(cc[1]))
+        size += GET_EXTRALEN(cc[1]);
       }
     else
 #endif
@@ -3215,7 +3480,7 @@
     size = 0;

   cc += 1 + size;
-  context.length += size;
+  context.length += IN_UCHARS(size);
   }
 while (size > 0 && context.length <= 128);

@@ -3228,7 +3493,7 @@

   context.sourcereg = -1;
 #if defined SLJIT_UNALIGNED && SLJIT_UNALIGNED
-  context.byteptr = 0;
+  context.ucharptr = 0;
 #endif
   do cc = byte_sequence_compare(common, *cc == OP_CHARI, cc + 1, &context, fallbacks); while (context.length > 0);
   return cc;
@@ -3238,7 +3503,7 @@
 return compile_char1_hotpath(common, *cc, cc + 1, fallbacks);
 }

-static struct sljit_jump *compile_ref_checks(compiler_common *common, uschar *cc, jump_list **fallbacks)
+static struct sljit_jump *compile_ref_checks(compiler_common *common, pcre_uchar *cc, jump_list **fallbacks)
 {
 DEFINE_COMPILER;
 int offset = GET2(cc, 1) << 1;
@@ -3260,7 +3525,7 @@
 }

 /* Forward definitions. */
-static void compile_hotpath(compiler_common *, uschar *, uschar *, fallback_common *);
+static void compile_hotpath(compiler_common *, pcre_uchar *, pcre_uchar *, fallback_common *);
 static void compile_fallbackpath(compiler_common *, struct fallback_common *);

 #define PUSH_FALLBACK(size, ccstart, error) \
@@ -3291,7 +3556,7 @@

 #define FALLBACK_AS(type) ((type*)fallback)

-static uschar *compile_ref_hotpath(compiler_common *common, uschar *cc, jump_list **fallbacks, BOOL withchecks, BOOL emptyfail)
+static pcre_uchar *compile_ref_hotpath(compiler_common *common, pcre_uchar *cc, jump_list **fallbacks, BOOL withchecks, BOOL emptyfail)
 {
 DEFINE_COMPILER;
 int offset = GET2(cc, 1) << 1;
@@ -3301,9 +3566,8 @@
 if (withchecks && !common->jscript_compat)
   add_jump(compiler, fallbacks, CMP(SLJIT_C_EQUAL, TMP1, 0, SLJIT_MEM1(SLJIT_LOCALS_REG), OVECTOR(1)));

-#ifdef SUPPORT_UTF8
-#ifdef SUPPORT_UCP
-if (common->utf8 && *cc == OP_REFI)
+#if defined SUPPORT_UTF && defined SUPPORT_UCP
+if (common->utf && *cc == OP_REFI)
   {
   SLJIT_ASSERT(TMP1 == SLJIT_TEMPORARY_REG1 && STACK_TOP == SLJIT_TEMPORARY_REG2 && TMP2 == SLJIT_TEMPORARY_REG3);
   OP1(SLJIT_MOV, TMP2, 0, SLJIT_MEM1(SLJIT_LOCALS_REG), OVECTOR(offset + 1));
@@ -3314,14 +3578,13 @@
   OP1(SLJIT_MOV, SLJIT_MEM1(SLJIT_LOCALS_REG), LOCALS0, STACK_TOP, 0);
   OP1(SLJIT_MOV, SLJIT_TEMPORARY_REG2, 0, ARGUMENTS, 0);
   OP1(SLJIT_MOV, SLJIT_MEM1(SLJIT_TEMPORARY_REG2), SLJIT_OFFSETOF(jit_arguments, ptr), STR_PTR, 0);
-  sljit_emit_ijump(compiler, SLJIT_CALL3, SLJIT_IMM, SLJIT_FUNC_OFFSET(do_utf8caselesscmp));
+  sljit_emit_ijump(compiler, SLJIT_CALL3, SLJIT_IMM, SLJIT_FUNC_OFFSET(do_utf_caselesscmp));
   OP1(SLJIT_MOV, STACK_TOP, 0, SLJIT_MEM1(SLJIT_LOCALS_REG), LOCALS0);
   add_jump(compiler, fallbacks, CMP(SLJIT_C_EQUAL, SLJIT_RETURN_REG, 0, SLJIT_IMM, 0));
   OP1(SLJIT_MOV, STR_PTR, 0, SLJIT_RETURN_REG, 0);
   }
 else
-#endif
-#endif
+#endif /* SUPPORT_UTF && SUPPORT_UCP */
   {
   OP2(SLJIT_SUB | SLJIT_SET_E, TMP2, 0, SLJIT_MEM1(SLJIT_LOCALS_REG), OVECTOR(offset + 1), TMP1, 0);
   if (withchecks)
@@ -3340,24 +3603,24 @@
   else
     JUMPHERE(jump);
   }
-return cc + 3;
+return cc + 1 + IMM2_SIZE;
 }

-static SLJIT_INLINE uschar *compile_ref_iterator_hotpath(compiler_common *common, uschar *cc, fallback_common *parent)
+static SLJIT_INLINE pcre_uchar *compile_ref_iterator_hotpath(compiler_common *common, pcre_uchar *cc, fallback_common *parent)
 {
 DEFINE_COMPILER;
 fallback_common *fallback;
-uschar type;
+pcre_uchar type;
 struct sljit_label *label;
 struct sljit_jump *zerolength;
 struct sljit_jump *jump = NULL;
-uschar *ccbegin = cc;
+pcre_uchar *ccbegin = cc;
 int min = 0, max = 0;
 BOOL minimize;

 PUSH_FALLBACK(sizeof(iterator_fallback), cc, NULL);

-type = cc[3];
+type = cc[1 + IMM2_SIZE];
 minimize = (type & 0x1) != 0;
 switch(type)
   {
@@ -3365,25 +3628,25 @@
   case OP_CRMINSTAR:
   min = 0;
   max = 0;
-  cc += 4;
+  cc += 1 + IMM2_SIZE + 1;
   break;
   case OP_CRPLUS:
   case OP_CRMINPLUS:
   min = 1;
   max = 0;
-  cc += 4;
+  cc += 1 + IMM2_SIZE + 1;
   break;
   case OP_CRQUERY:
   case OP_CRMINQUERY:
   min = 0;
   max = 1;
-  cc += 4;
+  cc += 1 + IMM2_SIZE + 1;
   break;
   case OP_CRRANGE:
   case OP_CRMINRANGE:
-  min = GET2(cc, 3 + 1);
-  max = GET2(cc, 3 + 3);
-  cc += 8;
+  min = GET2(cc, 1 + IMM2_SIZE + 1);
+  max = GET2(cc, 1 + IMM2_SIZE + 1 + IMM2_SIZE);
+  cc += 1 + IMM2_SIZE + 1 + 2 * IMM2_SIZE;
   break;
   default:
   SLJIT_ASSERT_STOP();
@@ -3487,7 +3750,7 @@
 return cc;
 }

-static SLJIT_INLINE uschar *compile_recurse_hotpath(compiler_common *common, uschar *cc, fallback_common *parent)
+static SLJIT_INLINE pcre_uchar *compile_recurse_hotpath(compiler_common *common, pcre_uchar *cc, fallback_common *parent)
 {
 DEFINE_COMPILER;
 fallback_common *fallback;
@@ -3533,15 +3796,15 @@
 return cc + 1 + LINK_SIZE;
 }

-static uschar *compile_assert_hotpath(compiler_common *common, uschar *cc, assert_fallback *fallback, BOOL conditional)
+static pcre_uchar *compile_assert_hotpath(compiler_common *common, pcre_uchar *cc, assert_fallback *fallback, BOOL conditional)
 {
 DEFINE_COMPILER;
 int framesize;
 int localptr;
 fallback_common altfallback;
-uschar *ccbegin;
-uschar opcode;
-uschar bra = OP_BRA;
+pcre_uchar *ccbegin;
+pcre_uchar opcode;
+pcre_uchar bra = OP_BRA;
 jump_list *tmp = NULL;
 jump_list **target = (conditional) ? &fallback->condfailed : &fallback->common.topfallbacks;
 jump_list **found;
@@ -3557,7 +3820,7 @@
   bra = *cc;
   cc++;
   }
-localptr = PRIV(cc);
+localptr = PRIV_DATA(cc);
 SLJIT_ASSERT(localptr != 0);
 framesize = get_framesize(common, cc, FALSE);
 fallback->framesize = framesize;
@@ -3803,11 +4066,11 @@
 return cc + 1 + LINK_SIZE;
 }

-static sljit_w SLJIT_CALL do_searchovector(sljit_w refno, sljit_w* locals, uschar *name_table)
+static sljit_w SLJIT_CALL do_searchovector(sljit_w refno, sljit_w* locals, pcre_uchar *name_table)
 {
 int condition = FALSE;
-uschar *slotA = name_table;
-uschar *slotB;
+pcre_uchar *slotA = name_table;
+pcre_uchar *slotB;
 sljit_w name_count = locals[LOCALS0 / sizeof(sljit_w)];
 sljit_w name_entry_size = locals[LOCALS1 / sizeof(sljit_w)];
 sljit_w no_capture;
@@ -3832,7 +4095,7 @@
   while (slotB > name_table)
     {
     slotB -= name_entry_size;
-    if (strcmp((char *)slotA + 2, (char *)slotB + 2) == 0)
+    if (STRCMP_UC_UC(slotA + IMM2_SIZE, slotB + IMM2_SIZE) == 0)
       {
       condition = locals[GET2(slotB, 0) << 1] != no_capture;
       if (condition) break;
@@ -3847,7 +4110,7 @@
     for (i++; i < name_count; i++)
       {
       slotB += name_entry_size;
-      if (strcmp((char *)slotA + 2, (char *)slotB + 2) == 0)
+      if (STRCMP_UC_UC(slotA + IMM2_SIZE, slotB + IMM2_SIZE) == 0)
         {
         condition = locals[GET2(slotB, 0) << 1] != no_capture;
         if (condition) break;
@@ -3859,11 +4122,11 @@
 return condition;
 }

-static sljit_w SLJIT_CALL do_searchgroups(sljit_w recno, sljit_w* locals, uschar *name_table)
+static sljit_w SLJIT_CALL do_searchgroups(sljit_w recno, sljit_w* locals, pcre_uchar *name_table)
 {
 int condition = FALSE;
-uschar *slotA = name_table;
-uschar *slotB;
+pcre_uchar *slotA = name_table;
+pcre_uchar *slotB;
 sljit_w name_count = locals[LOCALS0 / sizeof(sljit_w)];
 sljit_w name_entry_size = locals[LOCALS1 / sizeof(sljit_w)];
 sljit_w group_num = locals[POSSESSIVE0 / sizeof(sljit_w)];
@@ -3885,7 +4148,7 @@
   while (slotB > name_table)
     {
     slotB -= name_entry_size;
-    if (strcmp((char *)slotA + 2, (char *)slotB + 2) == 0)
+    if (STRCMP_UC_UC(slotA + IMM2_SIZE, slotB + IMM2_SIZE) == 0)
       {
       condition = GET2(slotB, 0) == group_num;
       if (condition) break;
@@ -3900,7 +4163,7 @@
     for (i++; i < name_count; i++)
       {
       slotB += name_entry_size;
-      if (strcmp((char *)slotA + 2, (char *)slotB + 2) == 0)
+      if (STRCMP_UC_UC(slotA + IMM2_SIZE, slotB + IMM2_SIZE) == 0)
         {
         condition = GET2(slotB, 0) == group_num;
         if (condition) break;
@@ -3966,18 +4229,18 @@
                                           Or nothing, if trace is unnecessary
 */

-static uschar *compile_bracket_hotpath(compiler_common *common, uschar *cc, fallback_common *parent)
+static pcre_uchar *compile_bracket_hotpath(compiler_common *common, pcre_uchar *cc, fallback_common *parent)
 {
 DEFINE_COMPILER;
 fallback_common *fallback;
-uschar opcode;
+pcre_uchar opcode;
 int localptr = 0;
 int offset = 0;
 int stacksize;
-uschar *ccbegin;
-uschar *hotpath;
-uschar bra = OP_BRA;
-uschar ket;
+pcre_uchar *ccbegin;
+pcre_uchar *hotpath;
+pcre_uchar bra = OP_BRA;
+pcre_uchar ket;
 assert_fallback *assert;
 BOOL has_alternatives;
 struct sljit_jump *jump;
@@ -4038,12 +4301,12 @@
   localptr = OVECTOR_PRIV(offset);
   offset <<= 1;
   FALLBACK_AS(bracket_fallback)->localptr = localptr;
-  hotpath += 2;
+  hotpath += IMM2_SIZE;
   }
 else if (opcode == OP_ONCE || opcode == OP_SBRA || opcode == OP_SCOND)
   {
   /* Other brackets simply allocate the next entry. */
-  localptr = PRIV(ccbegin);
+  localptr = PRIV_DATA(ccbegin);
   SLJIT_ASSERT(localptr != 0);
   FALLBACK_AS(bracket_fallback)->localptr = localptr;
   if (opcode == OP_ONCE)
@@ -4202,7 +4465,7 @@
     SLJIT_ASSERT(has_alternatives);
     add_jump(compiler, &(FALLBACK_AS(bracket_fallback)->u.condfailed),
       CMP(SLJIT_C_EQUAL, SLJIT_MEM1(SLJIT_LOCALS_REG), OVECTOR(GET2(hotpath, 1) << 1), SLJIT_MEM1(SLJIT_LOCALS_REG), OVECTOR(1)));
-    hotpath += 3;
+    hotpath += 1 + IMM2_SIZE;
     }
   else if (*hotpath == OP_NCREF)
     {
@@ -4221,7 +4484,7 @@
     add_jump(compiler, &(FALLBACK_AS(bracket_fallback)->u.condfailed), CMP(SLJIT_C_EQUAL, SLJIT_TEMPORARY_REG1, 0, SLJIT_IMM, 0));

     JUMPHERE(jump);
-    hotpath += 3;
+    hotpath += 1 + IMM2_SIZE;
     }
   else if (*hotpath == OP_RREF || *hotpath == OP_NRREF)
     {
@@ -4242,7 +4505,7 @@
       {
       SLJIT_ASSERT(!has_alternatives);
       if (stacksize != 0)
-        hotpath += 3;
+        hotpath += 1 + IMM2_SIZE;
       else
         {
         if (*cc == OP_ALT)
@@ -4269,7 +4532,7 @@
       sljit_emit_ijump(compiler, SLJIT_CALL3, SLJIT_IMM, SLJIT_FUNC_OFFSET(do_searchgroups));
       OP1(SLJIT_MOV, STACK_TOP, 0, SLJIT_MEM1(SLJIT_LOCALS_REG), POSSESSIVE1);
       add_jump(compiler, &(FALLBACK_AS(bracket_fallback)->u.condfailed), CMP(SLJIT_C_EQUAL, SLJIT_TEMPORARY_REG1, 0, SLJIT_IMM, 0));
-      hotpath += 3;
+      hotpath += 1 + IMM2_SIZE;
       }
     }
   else
@@ -4405,18 +4668,18 @@
 return cc;
 }

-static uschar *compile_bracketpos_hotpath(compiler_common *common, uschar *cc, fallback_common *parent)
+static pcre_uchar *compile_bracketpos_hotpath(compiler_common *common, pcre_uchar *cc, fallback_common *parent)
{
DEFINE_COMPILER;
fallback_common *fallback;
-uschar opcode;
+pcre_uchar opcode;
int localptr;
int cbraprivptr = 0;
int framesize;
int stacksize;
int offset = 0;
BOOL zero = FALSE;
-uschar *ccbegin = NULL;
+pcre_uchar *ccbegin = NULL;
int stack;
struct sljit_label *loop = NULL;
struct jump_list *emptymatch = NULL;
@@ -4429,7 +4692,7 @@
}

opcode = *cc;
-localptr = PRIV(cc);
+localptr = PRIV_DATA(cc);
SLJIT_ASSERT(localptr != 0);
FALLBACK_AS(bracketpos_fallback)->localptr = localptr;
switch(opcode)
@@ -4444,7 +4707,7 @@
offset = GET2(cc, 1 + LINK_SIZE);
cbraprivptr = OVECTOR_PRIV(offset);
offset <<= 1;
- ccbegin = cc + 1 + LINK_SIZE + 2;
+ ccbegin = cc + 1 + LINK_SIZE + IMM2_SIZE;
break;

default:
@@ -4623,7 +4886,7 @@
return cc + 1 + LINK_SIZE;
}

-static SLJIT_INLINE uschar *get_iterator_parameters(compiler_common *common, uschar *cc, uschar *opcode, uschar *type, int *arg1, int *arg2, uschar **end)
+static SLJIT_INLINE pcre_uchar *get_iterator_parameters(compiler_common *common, pcre_uchar *cc, pcre_uchar *opcode, pcre_uchar *type, int *arg1, int *arg2, pcre_uchar **end)
{
int class_len;

@@ -4662,7 +4925,7 @@
   SLJIT_ASSERT(*opcode >= OP_CLASS || *opcode <= OP_XCLASS);
   *type = *opcode;
   cc++;
-  class_len = (*type < OP_XCLASS) ? 33 : GET(cc, 0);
+  class_len = (*type < OP_XCLASS) ? (1 + (32 / sizeof(pcre_uchar))) : GET(cc, 0);
   *opcode = cc[class_len - 1];
   if (*opcode >= OP_CRSTAR && *opcode <= OP_CRMINQUERY)
     {
@@ -4673,7 +4936,7 @@
   else
     {
     SLJIT_ASSERT(*opcode == OP_CRRANGE || *opcode == OP_CRMINRANGE);
-    *arg1 = GET2(cc, (class_len + 2));
+    *arg1 = GET2(cc, (class_len + IMM2_SIZE));
     *arg2 = GET2(cc, class_len);

     if (*arg2 == 0)
@@ -4685,7 +4948,7 @@
       *opcode = OP_EXACT;

     if (end != NULL)
-      *end = cc + class_len + 4;
+      *end = cc + class_len + 2 * IMM2_SIZE;
     }
   return cc;
   }
@@ -4693,7 +4956,7 @@
 if (*opcode == OP_UPTO || *opcode == OP_MINUPTO || *opcode == OP_EXACT || *opcode == OP_POSUPTO)
   {
   *arg1 = GET2(cc, 0);
-  cc += 2;
+  cc += IMM2_SIZE;
   }

if (*type == 0)
@@ -4708,21 +4971,21 @@
if (end != NULL)
{
*end = cc + 1;
-#ifdef SUPPORT_UTF8
- if (common->utf8 && *cc >= 0xc0) *end += _pcre_utf8_table4[*cc & 0x3f];
+#ifdef SUPPORT_UTF
+ if (common->utf && HAS_EXTRALEN(*cc)) *end += GET_EXTRALEN(*cc);
#endif
}
return cc;
}

-static uschar *compile_iterator_hotpath(compiler_common *common, uschar *cc, fallback_common *parent)
+static pcre_uchar *compile_iterator_hotpath(compiler_common *common, pcre_uchar *cc, fallback_common *parent)
{
DEFINE_COMPILER;
fallback_common *fallback;
-uschar opcode;
-uschar type;
+pcre_uchar opcode;
+pcre_uchar type;
int arg1 = -1, arg2 = -1;
-uschar* end;
+pcre_uchar* end;
jump_list *nomatch = NULL;
struct sljit_jump *jump = NULL;
struct sljit_label *label;
@@ -4884,7 +5147,7 @@
return end;
}

-static SLJIT_INLINE uschar *compile_fail_accept_hotpath(compiler_common *common, uschar *cc, fallback_common *parent)
+static SLJIT_INLINE pcre_uchar *compile_fail_accept_hotpath(compiler_common *common, pcre_uchar *cc, fallback_common *parent)
{
DEFINE_COMPILER;
fallback_common *fallback;
@@ -4928,23 +5191,23 @@
return cc + 1;
}

-static SLJIT_INLINE uschar *compile_close_hotpath(compiler_common *common, uschar *cc)
+static SLJIT_INLINE pcre_uchar *compile_close_hotpath(compiler_common *common, pcre_uchar *cc)
{
DEFINE_COMPILER;
int offset = GET2(cc, 1);

/* Data will be discarded anyway... */
if (common->currententry != NULL)
- return cc + 3;
+ return cc + 1 + IMM2_SIZE;

OP1(SLJIT_MOV, TMP1, 0, SLJIT_MEM1(SLJIT_LOCALS_REG), OVECTOR_PRIV(offset));
offset <<= 1;
OP1(SLJIT_MOV, SLJIT_MEM1(SLJIT_LOCALS_REG), OVECTOR(offset + 1), STR_PTR, 0);
OP1(SLJIT_MOV, SLJIT_MEM1(SLJIT_LOCALS_REG), OVECTOR(offset), TMP1, 0);
-return cc + 3;
+return cc + 1 + IMM2_SIZE;
}

-static void compile_hotpath(compiler_common *common, uschar *cc, uschar *ccend, fallback_common *parent)
+static void compile_hotpath(compiler_common *common, pcre_uchar *cc, pcre_uchar *ccend, fallback_common *parent)
{
DEFINE_COMPILER;
fallback_common *fallback;
@@ -5070,13 +5333,13 @@

     case OP_CLASS:
     case OP_NCLASS:
-    if (cc[33] >= OP_CRSTAR && cc[33] <= OP_CRMINRANGE)
+    if (cc[1 + (32 / sizeof(pcre_uchar))] >= OP_CRSTAR && cc[1 + (32 / sizeof(pcre_uchar))] <= OP_CRMINRANGE)
       cc = compile_iterator_hotpath(common, cc, parent);
     else
       cc = compile_char1_hotpath(common, *cc, cc + 1, parent->top != NULL ? &parent->top->nextfallbacks : &parent->topfallbacks);
     break;

-#ifdef SUPPORT_UTF8
+#if defined SUPPORT_UTF || defined COMPILE_PCRE16
     case OP_XCLASS:
     if (*(cc + GET(cc, 1)) >= OP_CRSTAR && *(cc + GET(cc, 1)) <= OP_CRMINRANGE)
       cc = compile_iterator_hotpath(common, cc, parent);
@@ -5087,7 +5350,7 @@

     case OP_REF:
     case OP_REFI:
-    if (cc[3] >= OP_CRSTAR && cc[3] <= OP_CRMINRANGE)
+    if (cc[1 + IMM2_SIZE] >= OP_CRSTAR && cc[1 + IMM2_SIZE] <= OP_CRMINRANGE)
       cc = compile_ref_iterator_hotpath(common, cc, parent);
     else
       cc = compile_ref_hotpath(common, cc, parent->top != NULL ? &parent->top->nextfallbacks : &parent->topfallbacks, TRUE, FALSE);
@@ -5195,9 +5458,9 @@
 static void compile_iterator_fallbackpath(compiler_common *common, struct fallback_common *current)
 {
 DEFINE_COMPILER;
-uschar *cc = current->cc;
-uschar opcode;
-uschar type;
+pcre_uchar *cc = current->cc;
+pcre_uchar opcode;
+pcre_uchar type;
 int arg1 = -1, arg2 = -1;
 struct sljit_label *label = NULL;
 struct sljit_jump *jump = NULL;
@@ -5322,10 +5585,10 @@
 static void compile_ref_iterator_fallbackpath(compiler_common *common, struct fallback_common *current)
 {
 DEFINE_COMPILER;
-uschar *cc = current->cc;
-uschar type;
+pcre_uchar *cc = current->cc;
+pcre_uchar type;

-type = cc[3];
+type = cc[1 + IMM2_SIZE];
if ((type & 0x1) == 0)
{
set_jumps(current->topfallbacks, LABEL());
@@ -5354,8 +5617,8 @@
static void compile_assert_fallbackpath(compiler_common *common, struct fallback_common *current)
{
DEFINE_COMPILER;
-uschar *cc = current->cc;
-uschar bra = OP_BRA;
+pcre_uchar *cc = current->cc;
+pcre_uchar bra = OP_BRA;
struct sljit_jump *brajump = NULL;

 SLJIT_ASSERT(*cc != OP_BRAMINZERO);
@@ -5426,13 +5689,13 @@
 int localptr = CURRENT_AS(bracket_fallback)->localptr;
 int stacksize;
 int count;
-uschar *cc = current->cc;
-uschar *ccbegin;
-uschar *ccprev;
+pcre_uchar *cc = current->cc;
+pcre_uchar *ccbegin;
+pcre_uchar *ccprev;
 jump_list *jumplist = NULL;
 jump_list *jumplistitem = NULL;
-uschar bra = OP_BRA;
-uschar ket;
+pcre_uchar bra = OP_BRA;
+pcre_uchar ket;
 assert_fallback *assert;
 BOOL has_alternatives;
 struct sljit_jump *brazero = NULL;
@@ -5697,7 +5960,8 @@
     {
     SLJIT_ASSERT(opcode == OP_COND || opcode == OP_SCOND);
     assert = CURRENT_AS(bracket_fallback)->u.assert;
-    if (assert->framesize >= 0 && (ccbegin[1 + LINK_SIZE] == OP_ASSERT_NOT || ccbegin[1 + LINK_SIZE] == OP_ASSERTBACK_NOT))
+    if ((ccbegin[1 + LINK_SIZE] == OP_ASSERT_NOT || ccbegin[1 + LINK_SIZE] == OP_ASSERTBACK_NOT) && assert->framesize >= 0)
+
       {
       OP1(SLJIT_MOV, STACK_TOP, 0, SLJIT_MEM1(SLJIT_LOCALS_REG), assert->localptr);
       add_jump(compiler, &common->revertframes, JUMP(SLJIT_FAST_CALL));
@@ -5931,7 +6195,9 @@
     case OP_TYPEPOSUPTO:
     case OP_CLASS:
     case OP_NCLASS:
+#if defined SUPPORT_UTF || !defined COMPILE_PCRE8
     case OP_XCLASS:
+#endif
     compile_iterator_fallbackpath(common, current);
     break;

@@ -5998,9 +6264,9 @@
static SLJIT_INLINE void compile_recurse(compiler_common *common)
{
DEFINE_COMPILER;
-uschar *cc = common->start + common->currententry->start;
-uschar *ccbegin = cc + 1 + LINK_SIZE + (*cc == OP_BRA ? 0 : 2);
-uschar *ccend = bracketend(cc);
+pcre_uchar *cc = common->start + common->currententry->start;
+pcre_uchar *ccbegin = cc + 1 + LINK_SIZE + (*cc == OP_BRA ? 0 : IMM2_SIZE);
+pcre_uchar *ccend = bracketend(cc);
int localsize = get_localsize(common, ccbegin, ccend);
int framesize = get_framesize(common, cc, TRUE);
int alternativesize;
@@ -6088,17 +6354,18 @@
#undef CURRENT_AS

void
-_pcre_jit_compile(const real_pcre *re, pcre_extra *extra)
+PRIV(jit_compile)(const real_pcre *re, pcre_extra *extra)
{
struct sljit_compiler *compiler;
fallback_common rootfallback;
compiler_common common_data;
compiler_common *common = &common_data;
-const uschar *tables = re->tables;
+const pcre_uint8 *tables = re->tables;
pcre_study_data *study;
-uschar *ccend;
+pcre_uchar *ccend;
executable_function *function;
void *executable_func;
+sljit_uw executable_size;
struct sljit_label *leave;
struct sljit_label *mainloop = NULL;
struct sljit_label *empty_match_found;
@@ -6111,10 +6378,10 @@
study = extra->study_data;

if (!tables)
- tables = _pcre_default_tables;
+ tables = PRIV(default_tables);

memset(&rootfallback, 0, sizeof(fallback_common));
-rootfallback.cc = (uschar *)re + re->name_table_offset + re->name_count * re->name_entry_size;
+rootfallback.cc = (pcre_uchar *)re + re->name_table_offset + re->name_count * re->name_entry_size;

common->compiler = NULL;
common->start = rootfallback.cc;
@@ -6155,7 +6422,7 @@
}
common->endonly = (re->options & PCRE_DOLLAR_ENDONLY) != 0;
common->ctypes = (sljit_w)(tables + ctypes_offset);
-common->name_table = (sljit_w)re + re->name_table_offset;
+common->name_table = (sljit_w)((pcre_uchar *)re + re->name_table_offset);
common->name_count = re->name_count;
common->name_entry_size = re->name_entry_size;
common->acceptlabel = NULL;
@@ -6173,14 +6440,17 @@
common->casefulcmp = NULL;
common->caselesscmp = NULL;
common->jscript_compat = (re->options & PCRE_JAVASCRIPT_COMPAT) != 0;
-#ifdef SUPPORT_UTF8
-common->utf8 = (re->options & PCRE_UTF8) != 0;
+#ifdef SUPPORT_UTF
+/* PCRE_UTF16 has the same value as PCRE_UTF8. */
+common->utf = (re->options & PCRE_UTF8) != 0;
#ifdef SUPPORT_UCP
-common->useucp = (re->options & PCRE_UCP) != 0;
+common->use_ucp = (re->options & PCRE_UCP) != 0;
#endif
-common->utf8readchar = NULL;
-common->utf8readtype8 = NULL;
+common->utfreadchar = NULL;
+#ifdef COMPILE_PCRE8
+common->utfreadtype8 = NULL;
#endif
+#endif /* SUPPORT_UTF */
#ifdef SUPPORT_UCP
common->getucd = NULL;
#endif
@@ -6212,7 +6482,7 @@
/* Register init. */
reset_ovector(common, (re->top_bracket + 1) * 2);
if ((re->flags & PCRE_REQCHSET) != 0)
- OP1(SLJIT_MOV, SLJIT_MEM1(SLJIT_LOCALS_REG), REQ_BYTE_PTR, SLJIT_TEMPORARY_REG1, 0);
+ OP1(SLJIT_MOV, SLJIT_MEM1(SLJIT_LOCALS_REG), REQ_CHAR_PTR, SLJIT_TEMPORARY_REG1, 0);

 OP1(SLJIT_MOV, ARGUMENTS, 0, SLJIT_GENERAL_REG1, 0);
 OP1(SLJIT_MOV, TMP1, 0, SLJIT_GENERAL_REG1, 0);
@@ -6230,14 +6500,14 @@
   mainloop = mainloop_entry(common, (re->flags & PCRE_HASCRORLF) != 0, (re->options & PCRE_FIRSTLINE) != 0);
   /* Forward search if possible. */
   if ((re->flags & PCRE_FIRSTSET) != 0)
-    fast_forward_first_byte(common, re->first_byte, (re->options & PCRE_FIRSTLINE) != 0);
+    fast_forward_first_char(common, re->first_char, (re->flags & PCRE_FCH_CASELESS) != 0, (re->options & PCRE_FIRSTLINE) != 0);
   else if ((re->flags & PCRE_STARTLINE) != 0)
     fast_forward_newline(common, (re->options & PCRE_FIRSTLINE) != 0);
   else if ((re->flags & PCRE_STARTLINE) == 0 && study != NULL && (study->flags & PCRE_STUDY_MAPPED) != 0)
     fast_forward_start_bits(common, (sljit_uw)study->start_bits, (re->options & PCRE_FIRSTLINE) != 0);
   }
 if ((re->flags & PCRE_REQCHSET) != 0)
-  reqbyte_notfound = search_requested_char(common, re->req_byte, (re->flags & PCRE_FIRSTSET) != 0);
+  reqbyte_notfound = search_requested_char(common, re->req_char, (re->flags & PCRE_RCH_CASELESS) != 0, (re->flags & PCRE_FIRSTSET) != 0);

 /* Store the current STR_PTR in OVECTOR(0). */
 OP1(SLJIT_MOV, SLJIT_MEM1(SLJIT_LOCALS_REG), OVECTOR(0), STR_PTR, 0);
@@ -6284,7 +6554,7 @@
     {
     if (study != NULL && study->minlength > 1)
       {
-      OP2(SLJIT_ADD, TMP1, 0, STR_PTR, 0, SLJIT_IMM, study->minlength);
+      OP2(SLJIT_ADD, TMP1, 0, STR_PTR, 0, SLJIT_IMM, IN_UCHARS(study->minlength));
       CMPTO(SLJIT_C_LESS_EQUAL, TMP1, 0, STR_END, 0, mainloop);
       }
     else
@@ -6294,7 +6564,7 @@
     {
     if (study != NULL && study->minlength > 1)
       {
-      OP2(SLJIT_ADD, TMP1, 0, STR_PTR, 0, SLJIT_IMM, study->minlength);
+      OP2(SLJIT_ADD, TMP1, 0, STR_PTR, 0, SLJIT_IMM, IN_UCHARS(study->minlength));
       OP2(SLJIT_SUB | SLJIT_SET_U, SLJIT_UNUSED, 0, TMP1, 0, STR_END, 0);
       COND_VALUE(SLJIT_MOV, TMP2, 0, SLJIT_C_GREATER);
       OP2(SLJIT_SUB | SLJIT_SET_U, SLJIT_UNUSED, 0, STR_PTR, 0, SLJIT_MEM1(SLJIT_LOCALS_REG), FIRSTLINE_END);
@@ -6406,18 +6676,20 @@
   set_jumps(common->caselesscmp, LABEL());
   do_caselesscmp(common);
   }
-#ifdef SUPPORT_UTF8
-if (common->utf8readchar != NULL)
+#ifdef SUPPORT_UTF
+if (common->utfreadchar != NULL)
   {
-  set_jumps(common->utf8readchar, LABEL());
-  do_utf8readchar(common);
+  set_jumps(common->utfreadchar, LABEL());
+  do_utfreadchar(common);
   }
-if (common->utf8readtype8 != NULL)
+#ifdef COMPILE_PCRE8
+if (common->utfreadtype8 != NULL)
   {
-  set_jumps(common->utf8readtype8, LABEL());
-  do_utf8readtype8(common);
+  set_jumps(common->utfreadtype8, LABEL());
+  do_utfreadtype8(common);
   }
 #endif
+#endif /* SUPPORT_UTF */
 #ifdef SUPPORT_UCP
 if (common->getucd != NULL)
   {
@@ -6428,6 +6700,7 @@

SLJIT_FREE(common->localptrs);
executable_func = sljit_generate_code(compiler);
+executable_size = sljit_get_generated_code_size(compiler);
sljit_free_compiler(compiler);
if (executable_func == NULL)
return;
@@ -6442,6 +6715,7 @@
}

 function->executable_func = executable_func;
+function->executable_size = executable_size;
 function->callback = NULL;
 function->userdata = NULL;
 extra->executable_jit = function;
@@ -6454,7 +6728,7 @@
    void* executable_func;
    jit_function call_executable_func;
 } convert_executable_func;
-uschar local_area[LOCAL_SPACE_SIZE];
+pcre_uint8 local_area[LOCAL_SPACE_SIZE];
 struct sljit_stack local_stack;

local_stack.top = (sljit_w)&local_area;
@@ -6467,8 +6741,8 @@
}

int
-_pcre_jit_exec(const real_pcre *re, void *executable_func,
- PCRE_SPTR subject, int length, int start_offset, int options,
+PRIV(jit_exec)(const real_pcre *re, void *executable_func,
+ const pcre_uchar *subject, int length, int start_offset, int options,
int match_limit, int *offsets, int offsetcount)
{
executable_function *function = (executable_function*)executable_func;
@@ -6523,15 +6797,26 @@
}

void
-_pcre_jit_free(void *executable_func)
+PRIV(jit_free)(void *executable_func)
{
executable_function *function = (executable_function*)executable_func;
sljit_free_code(function->executable_func);
SLJIT_FREE(function);
}

+int
+PRIV(jit_get_size)(void *executable_func)
+{
+return ((executable_function*)executable_func)->executable_size;
+}
+
+#ifdef COMPILE_PCRE8
PCRE_EXP_DECL pcre_jit_stack *
pcre_jit_stack_alloc(int startsize, int maxsize)
+#else
+PCRE_EXP_DECL pcre_jit_stack *
+pcre16_jit_stack_alloc(int startsize, int maxsize)
+#endif
{
if (startsize < 1 || maxsize < 1)
return NULL;
@@ -6542,14 +6827,24 @@
return (pcre_jit_stack*)sljit_allocate_stack(startsize, maxsize);
}

+#ifdef COMPILE_PCRE8
PCRE_EXP_DECL void
pcre_jit_stack_free(pcre_jit_stack *stack)
+#else
+PCRE_EXP_DECL void
+pcre16_jit_stack_free(pcre_jit_stack *stack)
+#endif
{
sljit_free_stack((struct sljit_stack*)stack);
}

+#ifdef COMPILE_PCRE8
PCRE_EXP_DECL void
pcre_assign_jit_stack(pcre_extra *extra, pcre_jit_callback callback, void *userdata)
+#else
+PCRE_EXP_DECL void
+pcre16_assign_jit_stack(pcre_extra *extra, pcre_jit_callback callback, void *userdata)
+#endif
{
executable_function *function;
if (extra != NULL &&
@@ -6567,22 +6862,37 @@
/* These are dummy functions to avoid linking errors when JIT support is not
being compiled. */

+#ifdef COMPILE_PCRE8
PCRE_EXP_DECL pcre_jit_stack *
pcre_jit_stack_alloc(int startsize, int maxsize)
+#else
+PCRE_EXP_DECL pcre_jit_stack *
+pcre16_jit_stack_alloc(int startsize, int maxsize)
+#endif
{
(void)startsize;
(void)maxsize;
return NULL;
}

+#ifdef COMPILE_PCRE8
PCRE_EXP_DECL void
pcre_jit_stack_free(pcre_jit_stack *stack)
+#else
+PCRE_EXP_DECL void
+pcre16_jit_stack_free(pcre_jit_stack *stack)
+#endif
{
(void)stack;
}

+#ifdef COMPILE_PCRE8
PCRE_EXP_DECL void
pcre_assign_jit_stack(pcre_extra *extra, pcre_jit_callback callback, void *userdata)
+#else
+PCRE_EXP_DECL void
+pcre16_assign_jit_stack(pcre_extra *extra, pcre_jit_callback callback, void *userdata)
+#endif
{
(void)extra;
(void)callback;
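
Note on the pcre_jit_compile.c hunks above: the recurring substitutions (2 becoming IMM2_SIZE, 3 becoming 1 + IMM2_SIZE, 33 becoming 1 + (32 / sizeof(pcre_uchar)), and study->minlength wrapped in IN_UCHARS) all express sizes in code units rather than bytes. A minimal sketch of that arithmetic follows. The helper names are hypothetical and not part of PCRE, and the 16-bit values (one code unit per 2-byte immediate, byte offset equal to twice the length) are assumptions inferred from the diff rather than quoted from pcre_internal.h.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helpers, not part of PCRE. They mirror the arithmetic that
   IMM2_SIZE, IN_UCHARS and the class-length expression in the diff appear to
   perform, with the code-unit size passed at run time instead of being fixed
   at build time by COMPILE_PCRE8 / COMPILE_PCRE16. */

/* Code units occupied by a 2-byte immediate operand (assumed meaning of
   IMM2_SIZE): two 1-byte units, or one 2-byte unit. */
size_t imm2_size(size_t unit_size)
{
assert(unit_size == 1 || unit_size == 2);
return 2 / unit_size;
}

/* Opcode plus the 32-byte class bitmap, measured in code units. This is the
   expression 1 + (32 / sizeof(pcre_uchar)) that replaces the literal 33. */
size_t class_len(size_t unit_size)
{
assert(unit_size == 1 || unit_size == 2);
return 1 + 32 / unit_size;
}

/* Byte offset covering `length` code units (assumed meaning of IN_UCHARS). */
size_t in_uchars(size_t length, size_t unit_size)
{
assert(unit_size == 1 || unit_size == 2);
return length * unit_size;
}
```

With unit_size 1 the helpers reproduce the old literals (2 and 33); with unit_size 2 they give 1 and 17, which is why the hard-coded offsets could not be kept for the 16-bit build.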

Modified: code/trunk/pcre_jit_test.c
===================================================================
--- code/trunk/pcre_jit_test.c    2011-12-28 16:10:09 UTC (rev 835)
+++ code/trunk/pcre_jit_test.c    2011-12-28 17:16:11 UTC (rev 836)
@@ -6,10 +6,10 @@
 and semantics are as close as possible to those of the Perl 5 language.

                   Main Library written by Philip Hazel
-           Copyright (c) 1997-2011 University of Cambridge
+           Copyright (c) 1997-2012 University of Cambridge

   This JIT compiler regression test program was written by Zoltan Herczeg
-                      Copyright (c) 2010-2011
+                      Copyright (c) 2010-2012

-----------------------------------------------------------------------------
Redistribution and use in source and binary forms, with or without
@@ -51,24 +51,47 @@
#define PCRE_BUG 0x80000000

 /*
- Hungarian utf8 characters
- \xc3\xa9 = 0xe9 = 233 (e') \xc3\x89 = 0xc9 = 201 (E')
- \xc3\xa1 = 0xe1 = 225 (a') \xc3\x81 = 0xc1 = 193 (A')
- \xe6\x92\xad = 0x64ad = 25773 (a valid kanji)
- \xc2\x85 = 0x85 (NExt Line = NEL)
- \xc2\xa1 = 0xa1 (Inverted Exclamation Mark)
- \xe2\x80\xa8 = 0x2028 (Line Separator)
- \xc8\xba = 570 \xe2\xb1\xa5 = 11365 (lowercase length != uppercase length)
- \xcc\x8d = 781 (Something with Mark property)
+ Letter characters:
+   \xe6\x92\xad = 0x64ad = 25773 (kanji)
+ Non-letter characters:
+   \xc2\xa1 = 0xa1 = 161 (Inverted Exclamation Mark)
+   \xf3\xa9\xb7\x80 = 0xe9dc0 = 957888
+   \xed\xa0\x80 = 55296 = 0xd800 (Invalid UTF character)
+   \xed\xb0\x80 = 56320 = 0xdc00 (Invalid UTF character)
+ Newlines:
+   \xc2\x85 = 0x85 = 133 (NExt Line = NEL)
+   \xe2\x80\xa8 = 0x2028 = 8232 (Line Separator)
+ Othercase pairs:
+   \xc3\xa9 = 0xe9 = 233 (e')
+      \xc3\x89 = 0xc9 = 201 (E')
+   \xc3\xa1 = 0xe1 = 225 (a')
+      \xc3\x81 = 0xc1 = 193 (A')
+   \xc8\xba = 0x23a = 570
+      \xe2\xb1\xa5 = 0x2c65 = 11365
+   \xe1\xbd\xb8 = 0x1f78 = 8056
+      \xe1\xbf\xb8 = 0x1ff8 = 8184
+   \xf0\x90\x90\x80 = 0x10400 = 66560
+      \xf0\x90\x90\xa8 = 0x10428 = 66600
+ Mark property:
+   \xcc\x8d = 0x30d = 781
+ Special:
+   \xdf\xbf = 0x7ff = 2047 (highest 2 byte character)
+   \xe0\xa0\x80 = 0x800 = 2048 (lowest 3 byte character)
+   \xef\xbf\xbf = 0xffff = 65535 (highest 3 byte character)
+   \xf0\x90\x80\x80 = 0x10000 = 65536 (lowest 4 byte character)
+   \xf4\x8f\xbf\xbf = 0x10ffff = 1114111 (highest allowed utf character)
 */

-static void setstack(pcre_extra *extra);
static int regression_tests(void);

 int main(void)
 {
     int jit = 0;
+#ifdef SUPPORT_PCRE8
     pcre_config(PCRE_CONFIG_JIT, &jit);
+#else
+    pcre16_config(PCRE_CONFIG_JIT, &jit);
+#endif
     if (!jit) {
         printf("JIT must be enabled to run pcre_jit_test\n");
         return 1;
@@ -76,28 +99,27 @@
     return regression_tests();
 }

-static pcre_jit_stack* callback(void *arg)
-{
-    return (pcre_jit_stack *)arg;
-}
+/* --------------------------------------------------------------------------------------- */

-static void setstack(pcre_extra *extra)
-{
-    static pcre_jit_stack *stack;
-    if (stack) pcre_jit_stack_free(stack);
-    stack = pcre_jit_stack_alloc(1, 1024 * 1024);
-    pcre_assign_jit_stack(extra, callback, stack);
-}
+#if !(defined SUPPORT_PCRE8) && !(defined SUPPORT_PCRE16)
+#error SUPPORT_PCRE8 or SUPPORT_PCRE16 must be defined
+#endif

-/* --------------------------------------------------------------------------------------- */
+#define MUA    (PCRE_MULTILINE | PCRE_UTF8 | PCRE_NEWLINE_ANYCRLF)
+#define MUAP    (PCRE_MULTILINE | PCRE_UTF8 | PCRE_NEWLINE_ANYCRLF | PCRE_UCP)
+#define CMUA    (PCRE_CASELESS | PCRE_MULTILINE | PCRE_UTF8 | PCRE_NEWLINE_ANYCRLF)
+#define CMUAP    (PCRE_CASELESS | PCRE_MULTILINE | PCRE_UTF8 | PCRE_NEWLINE_ANYCRLF | PCRE_UCP)
+#define MA    (PCRE_MULTILINE | PCRE_NEWLINE_ANYCRLF)
+#define MAP    (PCRE_MULTILINE | PCRE_NEWLINE_ANYCRLF | PCRE_UCP)
+#define CMA    (PCRE_CASELESS | PCRE_MULTILINE | PCRE_NEWLINE_ANYCRLF)

-#define MUA     (PCRE_MULTILINE | PCRE_UTF8 | PCRE_NEWLINE_ANYCRLF)
-#define MUAP    (PCRE_MULTILINE | PCRE_UTF8 | PCRE_NEWLINE_ANYCRLF | PCRE_UCP)
-#define CMUA    (PCRE_CASELESS | PCRE_MULTILINE | PCRE_UTF8 | PCRE_NEWLINE_ANYCRLF)
-#define CMUAP   (PCRE_CASELESS | PCRE_MULTILINE | PCRE_UTF8 | PCRE_NEWLINE_ANYCRLF | PCRE_UCP)
-#define MA      (PCRE_MULTILINE | PCRE_NEWLINE_ANYCRLF)
-#define MAP     (PCRE_MULTILINE | PCRE_NEWLINE_ANYCRLF | PCRE_UCP)
-#define CMA     (PCRE_CASELESS | PCRE_MULTILINE | PCRE_NEWLINE_ANYCRLF)
+#define OFFSET_MASK    0x00ffff
+#define F_NO8        0x010000
+#define F_NO16        0x020000
+#define F_NOMATCH    0x040000
+#define F_DIFF        0x080000
+#define F_FORCECONV    0x100000
+#define F_PROPERTY    0x200000

 struct regression_test_case {
     int flags;
@@ -124,7 +146,7 @@
     { MUA, 0, "\\W(\\W)?\\w", "\n\n+bc" },
     { MUA, 0, "[axd]", "sAXd" },
     { CMUA, 0, "[axd]", "sAXd" },
-    { CMUA, 0, "[^axd]", "DxA" },
+    { CMUA, 0 | F_NOMATCH, "[^axd]", "DxA" },
     { MUA, 0, "[a-dA-C]", "\xe6\x92\xad\xc3\xa9.B" },
     { MUA, 0, "[^a-dA-C]", "\xe6\x92\xad\xc3\xa9" },
     { CMUA, 0, "[^\xc3\xa9]", "\xc3\xa9\xc3\x89." },
@@ -137,12 +159,16 @@
     { PCRE_CASELESS, 0, "a1", "Aa1" },
     { MA, 0, "\\Ca", "cda" },
     { CMA, 0, "\\Ca", "CDA" },
-    { MA, 0, "\\Cx", "cda" },
-    { CMA, 0, "\\Cx", "CDA" },
+    { MA, 0 | F_NOMATCH, "\\Cx", "cda" },
+    { CMA, 0 | F_NOMATCH, "\\Cx", "CDA" },
+    { CMUAP, 0, "\xf0\x90\x90\x80\xf0\x90\x90\xa8", "\xf0\x90\x90\xa8\xf0\x90\x90\x80" },
+    { CMUAP, 0, "\xf0\x90\x90\x80{2}", "\xf0\x90\x90\x80#\xf0\x90\x90\xa8\xf0\x90\x90\x80" },
+    { CMUAP, 0, "\xf0\x90\x90\xa8{2}", "\xf0\x90\x90\x80#\xf0\x90\x90\xa8\xf0\x90\x90\x80" },
+    { CMUAP, 0, "\xe1\xbd\xb8\xe1\xbf\xb8", "\xe1\xbf\xb8\xe1\xbd\xb8" },

     /* Assertions. */
     { MUA, 0, "\\b[^A]", "A_B#" },
-    { MA, 0, "\\b\\W", "\n*" },
+    { MA, 0 | F_NOMATCH, "\\b\\W", "\n*" },
     { MUA, 0, "\\B[^,]\\b[^s]\\b", "#X" },
     { MAP, 0, "\\B", "_\xa1" },
     { MAP, 0, "\\b_\\b[,A]\\B", "_," },
@@ -150,27 +176,28 @@
     { MUAP, 0, "\\B", "_\xc2\xa1\xc3\xa1\xc2\x85" },
     { MUAP, 0, "\\b[^A]\\B[^c]\\b[^_]\\B", "_\xc3\xa1\xe2\x80\xa8" },
     { MUAP, 0, "\\b\\w+\\B", "\xc3\x89\xc2\xa1\xe6\x92\xad\xc3\x81\xc3\xa1" },
-    { MUA, 0, "\\b.", "\xcd\xbe" },
-    { MA, 0, "\\R^", "\n" },
-    { MA, 1, "^", "\n" },
+    { MUA, 0 | F_NOMATCH, "\\b.", "\xcd\xbe" },
+    { CMUAP, 0, "\\By", "\xf0\x90\x90\xa8y" },
+    { MA, 0 | F_NOMATCH, "\\R^", "\n" },
+    { MA, 1 | F_NOMATCH, "^", "\n" },
     { 0, 0, "^ab", "ab" },
-    { 0, 0, "^ab", "aab" },
+    { 0, 0 | F_NOMATCH, "^ab", "aab" },
     { PCRE_MULTILINE | PCRE_NEWLINE_CRLF, 0, "^a", "\r\raa\n\naa\r\naa" },
     { PCRE_MULTILINE | PCRE_UTF8 | PCRE_NEWLINE_ANYCRLF, 0, "^-", "\xe2\x80\xa8--\xc2\x85-\r\n-" },
     { PCRE_MULTILINE | PCRE_NEWLINE_ANY, 0, "^-", "a--b--\x85--" },
     { PCRE_MULTILINE | PCRE_UTF8 | PCRE_NEWLINE_ANY, 0, "^-", "a--\xe2\x80\xa8--" },
     { PCRE_MULTILINE | PCRE_UTF8 | PCRE_NEWLINE_ANY, 0, "^-", "a--\xc2\x85--" },
     { 0, 0, "ab$", "ab" },
-    { 0, 0, "ab$", "ab\r\n" },
+    { 0, 0 | F_NOMATCH, "ab$", "ab\r\n" },
     { PCRE_MULTILINE | PCRE_NEWLINE_CRLF, 0, "a$", "\r\raa\n\naa\r\naa" },
     { PCRE_MULTILINE | PCRE_NEWLINE_ANY, 0, "a$", "aaa" },
     { PCRE_MULTILINE | PCRE_UTF8 | PCRE_NEWLINE_ANYCRLF, 0, "#$", "#\xc2\x85###\r#" },
     { PCRE_MULTILINE | PCRE_UTF8 | PCRE_NEWLINE_ANY, 0, "#$", "#\xe2\x80\xa9" },
-    { PCRE_NOTBOL | PCRE_NEWLINE_ANY, 0, "^a", "aa\naa" },
+    { PCRE_NOTBOL | PCRE_NEWLINE_ANY, 0 | F_NOMATCH, "^a", "aa\naa" },
     { PCRE_NOTBOL | PCRE_MULTILINE | PCRE_NEWLINE_ANY, 0, "^a", "aa\naa" },
-    { PCRE_NOTEOL | PCRE_NEWLINE_ANY, 0, "a$", "aa\naa" },
-    { PCRE_NOTEOL | PCRE_NEWLINE_ANY, 0, "a$", "aa\r\n" },
-    { PCRE_UTF8 | PCRE_DOLLAR_ENDONLY | PCRE_NEWLINE_ANY, 0, "\\p{Any}{2,}$", "aa\r\n" },
+    { PCRE_NOTEOL | PCRE_NEWLINE_ANY, 0 | F_NOMATCH, "a$", "aa\naa" },
+    { PCRE_NOTEOL | PCRE_NEWLINE_ANY, 0 | F_NOMATCH, "a$", "aa\r\n" },
+    { PCRE_UTF8 | PCRE_DOLLAR_ENDONLY | PCRE_NEWLINE_ANY, 0 | F_PROPERTY, "\\p{Any}{2,}$", "aa\r\n" },
     { PCRE_NOTEOL | PCRE_MULTILINE | PCRE_NEWLINE_ANY, 0, "a$", "aa\naa" },
     { PCRE_NEWLINE_CR, 0, ".\\Z", "aaa" },
     { PCRE_NEWLINE_CR | PCRE_UTF8, 0, "a\\Z", "aaa\r" },
@@ -190,11 +217,11 @@
     { PCRE_NEWLINE_ANY | PCRE_UTF8, 0, ".\\Z", "aaa\xc2\x85" },
     { PCRE_NEWLINE_ANY | PCRE_UTF8, 0, ".\\Z", "aaa\xe2\x80\xa8" },
     { MA, 0, "\\Aa", "aaa" },
-    { MA, 1, "\\Aa", "aaa" },
+    { MA, 1 | F_NOMATCH, "\\Aa", "aaa" },
     { MA, 1, "\\Ga", "aaa" },
-    { MA, 1, "\\Ga", "aba" },
+    { MA, 1 | F_NOMATCH, "\\Ga", "aba" },
     { MA, 0, "a\\z", "aaa" },
-    { MA, 0, "a\\z", "aab" },
+    { MA, 0 | F_NOMATCH, "a\\z", "aab" },

     /* Brackets. */
     { MUA, 0, "(ab|bb|cd)", "bacde" },
@@ -267,6 +294,11 @@
     { MUA, 0, "\\b\\w+\\B", "x,a_cd" },
     { MUAP, 0, "\\b[^\xc2\xa1]+\\B", "\xc3\x89\xc2\xa1\xe6\x92\xad\xc3\x81\xc3\xa1" },
     { CMUA, 0, "[^b]+(a*)([^c]?d{3})", "aaaaddd" },
+    { CMUAP, 0, "\xe1\xbd\xb8{2}", "\xe1\xbf\xb8#\xe1\xbf\xb8\xe1\xbd\xb8" },
+    { CMUA, 0, "[^\xf0\x90\x90\x80]{2,4}@", "\xf0\x90\x90\xa8\xf0\x90\x90\x80###\xf0\x90\x90\x80@@@" },
+    { CMUA, 0, "[^\xe1\xbd\xb8][^\xc3\xa9]", "\xe1\xbd\xb8\xe1\xbf\xb8\xc3\xa9\xc3\x89#" },
+    { MUA, 0, "[^\xe1\xbd\xb8][^\xc3\xa9]", "\xe1\xbd\xb8\xe1\xbf\xb8\xc3\xa9\xc3\x89#" },
+    { MUA, 0, "[^\xe1\xbd\xb8]{3,}?", "##\xe1\xbd\xb8#\xe1\xbd\xb8#\xc3\x89#\xe1\xbd\xb8" },

     /* Basic character sets. */
     { MUA, 0, "(?:\\s)+(?:\\S)+", "ab \t\xc3\xa9\xe6\x92\xad " },
@@ -278,24 +310,24 @@

     /* Unicode properties. */
     { MUAP, 0, "[1-5\xc3\xa9\\w]", "\xc3\xa1_" },
-    { MUAP, 0, "[\xc3\x81\\p{Ll}]", "A_\xc3\x89\xc3\xa1" },
+    { MUAP, 0 | F_PROPERTY, "[\xc3\x81\\p{Ll}]", "A_\xc3\x89\xc3\xa1" },
     { MUAP, 0, "[\\Wd-h_x-z]+", "a\xc2\xa1#_yhzdxi" },
-    { MUAP, 0, "[\\P{Any}]", "abc" },
-    { MUAP, 0, "[^\\p{Any}]", "abc" },
-    { MUAP, 0, "[\\P{Any}\xc3\xa1-\xc3\xa8]", "abc" },
-    { MUAP, 0, "[^\\p{Any}\xc3\xa1-\xc3\xa8]", "abc" },
-    { MUAP, 0, "[\xc3\xa1-\xc3\xa8\\P{Any}]", "abc" },
-    { MUAP, 0, "[^\xc3\xa1-\xc3\xa8\\p{Any}]", "abc" },
-    { MUAP, 0, "[\xc3\xa1-\xc3\xa8\\p{Any}]", "abc" },
-    { MUAP, 0, "[^\xc3\xa1-\xc3\xa8\\P{Any}]", "abc" },
+    { MUAP, 0 | F_NOMATCH | F_PROPERTY, "[\\P{Any}]", "abc" },
+    { MUAP, 0 | F_NOMATCH | F_PROPERTY, "[^\\p{Any}]", "abc" },
+    { MUAP, 0 | F_NOMATCH | F_PROPERTY, "[\\P{Any}\xc3\xa1-\xc3\xa8]", "abc" },
+    { MUAP, 0 | F_NOMATCH | F_PROPERTY, "[^\\p{Any}\xc3\xa1-\xc3\xa8]", "abc" },
+    { MUAP, 0 | F_NOMATCH | F_PROPERTY, "[\xc3\xa1-\xc3\xa8\\P{Any}]", "abc" },
+    { MUAP, 0 | F_NOMATCH | F_PROPERTY, "[^\xc3\xa1-\xc3\xa8\\p{Any}]", "abc" },
+    { MUAP, 0 | F_PROPERTY, "[\xc3\xa1-\xc3\xa8\\p{Any}]", "abc" },
+    { MUAP, 0 | F_PROPERTY, "[^\xc3\xa1-\xc3\xa8\\P{Any}]", "abc" },
     { MUAP, 0, "[b-\xc3\xa9\\s]", "a\xc\xe6\x92\xad" },
     { CMUAP, 0, "[\xc2\x85-\xc2\x89\xc3\x89]", "\xc2\x84\xc3\xa9" },
     { MUAP, 0, "[^b-d^&\\s]{3,}", "db^ !a\xe2\x80\xa8_ae" },
-    { MUAP, 0, "[^\\S\\P{Any}][\\sN]{1,3}[\\P{N}]{4}", "\xe2\x80\xaa\xa N\x9\xc3\xa9_0" },
-    { MUA, 0, "[^\\P{L}\x9!D-F\xa]{2,3}", "\x9,.DF\xa.CG\xc3\x81" },
+    { MUAP, 0 | F_PROPERTY, "[^\\S\\P{Any}][\\sN]{1,3}[\\P{N}]{4}", "\xe2\x80\xaa\xa N\x9\xc3\xa9_0" },
+    { MUA, 0 | F_PROPERTY, "[^\\P{L}\x9!D-F\xa]{2,3}", "\x9,.DF\xa.CG\xc3\x81" },
     { CMUAP, 0, "[\xc3\xa1-\xc3\xa9_\xe2\x80\xa0-\xe2\x80\xaf]{1,5}[^\xe2\x80\xa0-\xe2\x80\xaf]", "\xc2\xa1\xc3\x89\xc3\x89\xe2\x80\xaf_\xe2\x80\xa0" },
-    { MUAP, 0, "[\xc3\xa2-\xc3\xa6\xc3\x81-\xc3\x84\xe2\x80\xa8-\xe2\x80\xa9\xe6\x92\xad\\p{Zs}]{2,}", "\xe2\x80\xa7\xe2\x80\xa9\xe6\x92\xad \xe6\x92\xae" },
-    { MUAP, 0, "[\\P{L&}]{2}[^\xc2\x85-\xc2\x89\\p{Ll}\\p{Lu}]{2}", "\xc3\xa9\xe6\x92\xad.a\xe6\x92\xad|\xc2\x8a#" },
+    { MUAP, 0 | F_PROPERTY, "[\xc3\xa2-\xc3\xa6\xc3\x81-\xc3\x84\xe2\x80\xa8-\xe2\x80\xa9\xe6\x92\xad\\p{Zs}]{2,}", "\xe2\x80\xa7\xe2\x80\xa9\xe6\x92\xad \xe6\x92\xae" },
+    { MUAP, 0 | F_PROPERTY, "[\\P{L&}]{2}[^\xc2\x85-\xc2\x89\\p{Ll}\\p{Lu}]{2}", "\xc3\xa9\xe6\x92\xad.a\xe6\x92\xad|\xc2\x8a#" },
     { PCRE_UCP, 0, "[a-b\\s]{2,5}[^a]", "AB  baaa" },

     /* Possible empty brackets. */
@@ -312,8 +344,8 @@

     /* Start offset. */
     { MUA, 3, "(\\d|(?:\\w)*\\w)+", "0ac01Hb" },
-    { MUA, 4, "(\\w\\W\\w)+", "ab#d" },
-    { MUA, 2, "(\\w\\W\\w)+", "ab#d" },
+    { MUA, 4 | F_NOMATCH, "(\\w\\W\\w)+", "ab#d" },
+    { MUA, 2 | F_NOMATCH, "(\\w\\W\\w)+", "ab#d" },
     { MUA, 1, "(\\w\\W\\w)+", "ab#d" },

     /* Newline. */
@@ -327,7 +359,7 @@
     { PCRE_NEWLINE_ANYCRLF, 0, ".(.)", "a\rb\nc\r\n\xc2\x85\xe2\x80\xa8" },
     { PCRE_NEWLINE_ANYCRLF | PCRE_UTF8, 0, ".(.)", "a\rb\nc\r\n\xc2\x85\xe2\x80\xa8" },
     { PCRE_NEWLINE_ANY | PCRE_UTF8, 0, "(.).", "a\rb\nc\r\n\xc2\x85\xe2\x80\xa9$de" },
-    { PCRE_NEWLINE_ANYCRLF | PCRE_UTF8, 0, ".(.).", "\xe2\x80\xa8\nb\r" },
+    { PCRE_NEWLINE_ANYCRLF | PCRE_UTF8, 0 | F_NOMATCH, ".(.).", "\xe2\x80\xa8\nb\r" },
     { PCRE_NEWLINE_ANY, 0, "(.)(.)", "#\x85#\r#\n#\r\n#\x84" },
     { PCRE_NEWLINE_ANY | PCRE_UTF8, 0, "(.+)#", "#\rMn\xc2\x85#\n###" },
     { PCRE_BSR_ANYCRLF, 0, "\\R", "\r" },
@@ -335,7 +367,7 @@
     { PCRE_BSR_UNICODE | PCRE_UTF8, 0, "\\R", "ab\xe2\x80\xa8#c" },
     { PCRE_BSR_UNICODE | PCRE_UTF8, 0, "\\R", "ab\r\nc" },
     { PCRE_NEWLINE_CRLF | PCRE_BSR_UNICODE | PCRE_UTF8, 0, "(\\R.)+", "\xc2\x85\r\n#\xe2\x80\xa8\n\r\n\r" },
-    { MUA, 0, "\\R+", "ab" },
+    { MUA, 0 | F_NOMATCH, "\\R+", "ab" },
     { MUA, 0, "\\R+", "ab\r\n\r" },
     { MUA, 0, "\\R*", "ab\r\n\r" },
     { MUA, 0, "\\R*", "\r\n\r" },
@@ -343,15 +375,15 @@
     { MUA, 0, "\\R{2,4}", "\r\nab\n\n\n\r\r\r" },
     { MUA, 0, "\\R{2,}", "\r\nab\n\n\n\r\r\r" },
     { MUA, 0, "\\R{0,3}", "\r\n\r\n\r\n\r\n\r\n" },
-    { MUA, 0, "\\R+\\R\\R", "\r\n\r\n" },
+    { MUA, 0 | F_NOMATCH, "\\R+\\R\\R", "\r\n\r\n" },
     { MUA, 0, "\\R+\\R\\R", "\r\r\r" },
     { MUA, 0, "\\R*\\R\\R", "\n\r" },
-    { MUA, 0, "\\R{2,4}\\R\\R", "\r\r\r" },
+    { MUA, 0 | F_NOMATCH, "\\R{2,4}\\R\\R", "\r\r\r" },
     { MUA, 0, "\\R{2,4}\\R\\R", "\r\r\r\r" },

     /* Atomic groups (no fallback from "next" direction). */
-    { MUA, 0, "(?>ab)ab", "bab" },
-    { MUA, 0, "(?>(ab))ab", "bab" },
+    { MUA, 0 | F_NOMATCH, "(?>ab)ab", "bab" },
+    { MUA, 0 | F_NOMATCH, "(?>(ab))ab", "bab" },
     { MUA, 0, "(?>ab)+abc(?>de)*def(?>gh)?ghe(?>ij)+?k(?>lm)*?n(?>op)?\?op",
             "bababcdedefgheijijklmlmnop" },
     { MUA, 0, "(?>a(b)+a|(ab)?\?(b))an", "abban" },
@@ -379,13 +411,13 @@
     { CMA, 0, "(?>((?>a{32}|b+|(a*))?(?>c+|d*)?\?)+e)+?f", "aaccebbdde bbdaaaccebbdee bbdaaaccebbdeef" },
     { MUA, 0, "(?>(?:(?>aa|a||x)+?b|(?>aa|a||(x))+?c)?(?>[ad]{0,2})*?d)+d", "aaacdbaabdcabdbaaacd aacaabdbdcdcaaaadaabcbaadd" },
     { MUA, 0, "(?>(?:(?>aa|a||(x))+?b|(?>aa|a||x)+?c)?(?>[ad]{0,2})*?d)+d", "aaacdbaabdcabdbaaacd aacaabdbdcdcaaaadaabcbaadd" },
-    { MUA, 0, "\\X", "\xcc\x8d\xcc\x8d" },
-    { MUA, 0, "\\X", "\xcc\x8d\xcc\x8d#\xcc\x8d\xcc\x8d" },
-    { MUA, 0, "\\X+..", "\xcc\x8d#\xcc\x8d#\xcc\x8d\xcc\x8d" },
-    { MUA, 0, "\\X{2,4}", "abcdef" },
-    { MUA, 0, "\\X{2,4}?", "abcdef" },
-    { MUA, 0, "\\X{2,4}..", "#\xcc\x8d##" },
-    { MUA, 0, "\\X{2,4}..", "#\xcc\x8d#\xcc\x8d##" },
+    { MUA, 0 | F_NOMATCH | F_PROPERTY, "\\X", "\xcc\x8d\xcc\x8d" },
+    { MUA, 0 | F_PROPERTY, "\\X", "\xcc\x8d\xcc\x8d#\xcc\x8d\xcc\x8d" },
+    { MUA, 0 | F_PROPERTY, "\\X+..", "\xcc\x8d#\xcc\x8d#\xcc\x8d\xcc\x8d" },
+    { MUA, 0 | F_PROPERTY, "\\X{2,4}", "abcdef" },
+    { MUA, 0 | F_PROPERTY, "\\X{2,4}?", "abcdef" },
+    { MUA, 0 | F_NOMATCH | F_PROPERTY, "\\X{2,4}..", "#\xcc\x8d##" },
+    { MUA, 0 | F_PROPERTY, "\\X{2,4}..", "#\xcc\x8d#\xcc\x8d##" },
     { MUA, 0, "(c(ab)?+ab)+", "cabcababcab" },
     { MUA, 0, "(?>(a+)b)+aabab", "aaaabaaabaabab" },

@@ -420,7 +452,7 @@
     { MUA, 0, "((b*))++m", "bxbbxbbbxbbm" },
     { MUA, 0, "((b*))*+m", "bxbbxbbbxm" },
     { MUA, 0, "((b*))*+m", "bxbbxbbbxbbm" },
-    { MUA, 0, "(?>(b{2,4}))(?:(?:(aa|c))++m|(?:(aa|c))+n)", "bbaacaaccaaaacxbbbmbn" },
+    { MUA, 0 | F_NOMATCH, "(?>(b{2,4}))(?:(?:(aa|c))++m|(?:(aa|c))+n)", "bbaacaaccaaaacxbbbmbn" },
     { MUA, 0, "((?:b)++a)+(cd)*+m", "bbababbacdcdnbbababbacdcdm" },
     { MUA, 0, "((?:(b))++a)+((c)d)*+m", "bbababbacdcdnbbababbacdcdm" },
     { MUA, 0, "(?:(?:(?:ab)*+k)++(?:n(?:cd)++)*+)*+m", "ababkkXababkkabkncXababkkabkncdcdncdXababkkabkncdcdncdkkabkncdXababkkabkncdcdncdkkabkncdm" },
@@ -444,11 +476,12 @@
     { MUA, 0, "(?:(aa|bb)(\\1{0,3}?)){2}(dd|)(\\3{0,3}?)b(\\1{0,3}?)(\\1{0,3})", "aaaaaaaaaaaaaaabaaaaa" },
     { MUA, 0, "(a(?:\\1|)a){3}b", "aaaaaaaaaaab" },
     { MA, 0, "(a?)b(\\1\\1*\\1+\\1?\\1*?\\1+?\\1??\\1*+\\1++\\1?+\\1{4}\\1{3,5}\\1{4,}\\1{0,5}\\1{3,5}?\\1{4,}?\\1{0,5}?\\1{3,5}+\\1{4,}+\\1{0,5}+#){2}d", "bb#b##d" },
-    { MUAP, 0, "(\\P{N})\\1{2,}", ".www." },
-    { MUAP, 0, "(\\P{N})\\1{0,2}", "wwwww." },
-    { MUAP, 0, "(\\P{N})\\1{1,2}ww", "wwww" },
-    { MUAP, 0, "(\\P{N})\\1{1,2}ww", "wwwww" },
-    { PCRE_UCP, 0, "(\\P{N})\\1{2,}", ".www." },
+    { MUAP, 0 | F_PROPERTY, "(\\P{N})\\1{2,}", ".www." },
+    { MUAP, 0 | F_PROPERTY, "(\\P{N})\\1{0,2}", "wwwww." },
+    { MUAP, 0 | F_PROPERTY, "(\\P{N})\\1{1,2}ww", "wwww" },
+    { MUAP, 0 | F_PROPERTY, "(\\P{N})\\1{1,2}ww", "wwwww" },
+    { PCRE_UCP, 0 | F_PROPERTY, "(\\P{N})\\1{2,}", ".www." },
+    { CMUAP, 0, "(\xf0\x90\x90\x80)\\1", "\xf0\x90\x90\xa8\xf0\x90\x90\xa8" },

     /* Assertions. */
     { MUA, 0, "(?=xx|yy|zz)\\w{4}", "abczzdefg" },
@@ -464,9 +497,9 @@
     { MUA, 0, "(?>a(?>(b+))a(?=(..)))*?k", "acabbcabbaabacabaabbakk" },
     { MUA, 0, "((?(?=(a))a)+k)", "bbak" },
     { MUA, 0, "((?(?=a)a)+k)", "bbak" },
-    { MUA, 0, "(?=(?>(a))m)amk", "a k" },
-    { MUA, 0, "(?!(?>(a))m)amk", "a k" },
-    { MUA, 0, "(?>(?=(a))am)amk", "a k" },
+    { MUA, 0 | F_NOMATCH, "(?=(?>(a))m)amk", "a k" },
+    { MUA, 0 | F_NOMATCH, "(?!(?>(a))m)amk", "a k" },
+    { MUA, 0 | F_NOMATCH, "(?>(?=(a))am)amk", "a k" },
     { MUA, 0, "(?=(?>a|(?=(?>(b+))a|c)[a-c]+)*?m)[a-cm]+k", "aaam bbam baaambaam abbabba baaambaamk" },
     { MUA, 0, "(?> ?\?\\b(?(?=\\w{1,4}(a))m)\\w{0,8}bc){2,}?", "bca ssbc mabd ssbc mabc" },
     { MUA, 0, "(?:(?=ab)?[^n][^n])+m", "ababcdabcdcdabnababcdabcdcdabm" },
@@ -477,19 +510,19 @@
     { MUA, 0, "((?!a)?\?(?!([^a]))?\?)+$", "acbab" },

     /* Not empty, ACCEPT, FAIL */
-    { MUA | PCRE_NOTEMPTY, 0, "a*", "bcx" },
+    { MUA | PCRE_NOTEMPTY, 0 | F_NOMATCH, "a*", "bcx" },
     { MUA | PCRE_NOTEMPTY, 0, "a*", "bcaad" },
     { MUA | PCRE_NOTEMPTY, 0, "a*?", "bcaad" },
     { MUA | PCRE_NOTEMPTY_ATSTART, 0, "a*", "bcaad" },
     { MUA, 0, "a(*ACCEPT)b", "ab" },
-    { MUA | PCRE_NOTEMPTY, 0, "a*(*ACCEPT)b", "bcx" },
+    { MUA | PCRE_NOTEMPTY, 0 | F_NOMATCH, "a*(*ACCEPT)b", "bcx" },
     { MUA | PCRE_NOTEMPTY, 0, "a*(*ACCEPT)b", "bcaad" },
     { MUA | PCRE_NOTEMPTY, 0, "a*?(*ACCEPT)b", "bcaad" },
-    { MUA | PCRE_NOTEMPTY, 0, "(?:z|a*(*ACCEPT)b)", "bcx" },
+    { MUA | PCRE_NOTEMPTY, 0 | F_NOMATCH, "(?:z|a*(*ACCEPT)b)", "bcx" },
     { MUA | PCRE_NOTEMPTY, 0, "(?:z|a*(*ACCEPT)b)", "bcaad" },
     { MUA | PCRE_NOTEMPTY, 0, "(?:z|a*?(*ACCEPT)b)", "bcaad" },
     { MUA | PCRE_NOTEMPTY_ATSTART, 0, "a*(*ACCEPT)b", "bcx" },
-    { MUA | PCRE_NOTEMPTY_ATSTART, 0, "a*(*ACCEPT)b", "" },
+    { MUA | PCRE_NOTEMPTY_ATSTART, 0 | F_NOMATCH, "a*(*ACCEPT)b", "" },
     { MUA, 0, "((a(*ACCEPT)b))", "ab" },
     { MUA, 0, "(a(*FAIL)a|a)", "aaa" },
     { MUA, 0, "(?=ab(*ACCEPT)b)a", "ab" },
@@ -506,7 +539,7 @@
     { MUA, 0, "(?(?!(b))a*|b*)+k", "ababbalbbadabak" },
     { MUA, 0, "(?(?!(b))(?:aaaaaa|a)|(?:bbbbbb|b))+aaaak", "aaaaaaaaaaaaaa bbbbbbbbbbbbbbb aaaaaaak" },
     { MUA, 0, "(?(?!b)(?:aaaaaa|a)|(?:bbbbbb|b))+aaaak", "aaaaaaaaaaaaaa bbbbbbbbbbbbbbb aaaaaaak" },
-    { MUA | PCRE_BUG, 0, "(?(?!(b))(?:aaaaaa|a)|(?:bbbbbb|b))+bbbbk", "aaaaaaaaaaaaaa bbbbbbbbbbbbbbb bbbbbbbk" },
+    { MUA, 0 | F_DIFF, "(?(?!(b))(?:aaaaaa|a)|(?:bbbbbb|b))+bbbbk", "aaaaaaaaaaaaaa bbbbbbbbbbbbbbb bbbbbbbk" },
     { MUA, 0, "(?(?!b)(?:aaaaaa|a)|(?:bbbbbb|b))+bbbbk", "aaaaaaaaaaaaaa bbbbbbbbbbbbbbb bbbbbbbk" },
     { MUA, 0, "(?(?=a)a*|b*)+k", "ababbalbbadabak" },
     { MUA, 0, "(?(?!b)a*|b*)+k", "ababbalbbadabak" },
@@ -520,11 +553,11 @@
     { MUA, 0, "((?=\\w{5})\\w(?(?=\\w*k)\\d|[a-f_])*\\w\\s)+", "mol m10kk m088k _f_a_ mbkkl" },
     { MUA, 0, "(c)?\?(?(1)a|b)", "cdcaa" },
     { MUA, 0, "(c)?\?(?(1)a|b)", "cbb" },
-    { MUA | PCRE_BUG, 0, "(?(?=(a))(aaaa|a?))+aak", "aaaaab aaaaak" },
+    { MUA, 0 | F_DIFF, "(?(?=(a))(aaaa|a?))+aak", "aaaaab aaaaak" },
     { MUA, 0, "(?(?=a)(aaaa|a?))+aak", "aaaaab aaaaak" },
     { MUA, 0, "(?(?!(b))(aaaa|a?))+aak", "aaaaab aaaaak" },
     { MUA, 0, "(?(?!b)(aaaa|a?))+aak", "aaaaab aaaaak" },
-    { MUA | PCRE_BUG, 0, "(?(?=(a))a*)+aak", "aaaaab aaaaak" },
+    { MUA, 0 | F_DIFF, "(?(?=(a))a*)+aak", "aaaaab aaaaak" },
     { MUA, 0, "(?(?=a)a*)+aak", "aaaaab aaaaak" },
     { MUA, 0, "(?(?!(b))a*)+aak", "aaaaab aaaaak" },
     { MUA, 0, "(?(?!b)a*)+aak", "aaaaab aaaaak" },
@@ -537,26 +570,26 @@
     { MUA, 0, "(?:\\Ka)*aaaab", "aaaaaaaa aaaaaaabb" },
     { MUA, 0, "(?>\\Ka\\Ka)*aaaab", "aaaaaaaa aaaaaaaaaabb" },
     { MUA, 0, "a+\\K(?<=\\Gaa)a", "aaaaaa" },
-    { MUA | PCRE_NOTEMPTY, 0, "a\\K(*ACCEPT)b", "aa" },
+    { MUA | PCRE_NOTEMPTY, 0 | F_NOMATCH, "a\\K(*ACCEPT)b", "aa" },
     { MUA | PCRE_NOTEMPTY_ATSTART, 0, "a\\K(*ACCEPT)b", "aa" },

     /* First line. */
-    { MUA | PCRE_FIRSTLINE, 0, "\\p{Any}a", "bb\naaa" },
-    { MUA | PCRE_FIRSTLINE, 0, "\\p{Any}a", "bb\r\naaa" },
+    { MUA | PCRE_FIRSTLINE, 0 | F_PROPERTY, "\\p{Any}a", "bb\naaa" },
+    { MUA | PCRE_FIRSTLINE, 0 | F_NOMATCH | F_PROPERTY, "\\p{Any}a", "bb\r\naaa" },
     { MUA | PCRE_FIRSTLINE, 0, "(?<=a)", "a" },
-    { MUA | PCRE_FIRSTLINE, 0, "[^a][^b]", "ab" },
-    { MUA | PCRE_FIRSTLINE, 0, "a", "\na" },
-    { MUA | PCRE_FIRSTLINE, 0, "[abc]", "\na" },
-    { MUA | PCRE_FIRSTLINE, 0, "^a", "\na" },
-    { MUA | PCRE_FIRSTLINE, 0, "^(?<=\n)", "\na" },
-    { PCRE_MULTILINE | PCRE_UTF8 | PCRE_NEWLINE_ANY | PCRE_FIRSTLINE, 0, "#", "\xc2\x85#" },
-    { PCRE_MULTILINE | PCRE_NEWLINE_ANY | PCRE_FIRSTLINE, 0, "#", "\x85#" },
-    { PCRE_MULTILINE | PCRE_UTF8 | PCRE_NEWLINE_ANY | PCRE_FIRSTLINE, 0, "^#", "\xe2\x80\xa8#" },
-    { PCRE_MULTILINE | PCRE_UTF8 | PCRE_NEWLINE_CRLF | PCRE_FIRSTLINE, 0, "\\p{Any}", "\r\na" },
+    { MUA | PCRE_FIRSTLINE, 0 | F_NOMATCH, "[^a][^b]", "ab" },
+    { MUA | PCRE_FIRSTLINE, 0 | F_NOMATCH, "a", "\na" },
+    { MUA | PCRE_FIRSTLINE, 0 | F_NOMATCH, "[abc]", "\na" },
+    { MUA | PCRE_FIRSTLINE, 0 | F_NOMATCH, "^a", "\na" },
+    { MUA | PCRE_FIRSTLINE, 0 | F_NOMATCH, "^(?<=\n)", "\na" },
+    { PCRE_MULTILINE | PCRE_UTF8 | PCRE_NEWLINE_ANY | PCRE_FIRSTLINE, 0 | F_NOMATCH, "#", "\xc2\x85#" },
+    { PCRE_MULTILINE | PCRE_NEWLINE_ANY | PCRE_FIRSTLINE, 0 | F_NOMATCH, "#", "\x85#" },
+    { PCRE_MULTILINE | PCRE_UTF8 | PCRE_NEWLINE_ANY | PCRE_FIRSTLINE, 0 | F_NOMATCH, "^#", "\xe2\x80\xa8#" },
+    { PCRE_MULTILINE | PCRE_UTF8 | PCRE_NEWLINE_CRLF | PCRE_FIRSTLINE, 0 | F_PROPERTY, "\\p{Any}", "\r\na" },
     { PCRE_MULTILINE | PCRE_UTF8 | PCRE_NEWLINE_CRLF | PCRE_FIRSTLINE, 0, ".", "\r" },
     { PCRE_MULTILINE | PCRE_UTF8 | PCRE_NEWLINE_CRLF | PCRE_FIRSTLINE, 0, "a", "\ra" },
-    { PCRE_MULTILINE | PCRE_UTF8 | PCRE_NEWLINE_CRLF | PCRE_FIRSTLINE, 0, "ba", "bbb\r\nba" },
-    { PCRE_MULTILINE | PCRE_UTF8 | PCRE_NEWLINE_CRLF | PCRE_FIRSTLINE, 0, "\\p{Any}{4}|a", "\r\na" },
+    { PCRE_MULTILINE | PCRE_UTF8 | PCRE_NEWLINE_CRLF | PCRE_FIRSTLINE, 0 | F_NOMATCH, "ba", "bbb\r\nba" },
+    { PCRE_MULTILINE | PCRE_UTF8 | PCRE_NEWLINE_CRLF | PCRE_FIRSTLINE, 0 | F_NOMATCH | F_PROPERTY, "\\p{Any}{4}|a", "\r\na" },
     { PCRE_MULTILINE | PCRE_UTF8 | PCRE_NEWLINE_CRLF | PCRE_FIRSTLINE, 1, ".", "\r\n" },

     /* Recurse. */
@@ -564,7 +597,7 @@
     { MUA, 0, "((a))(?1)", "aa" },
     { MUA, 0, "(b|a)(?1)", "aa" },
     { MUA, 0, "(b|(a))(?1)", "aa" },
-    { MUA, 0, "((a)(b)(?:a*))(?1)", "aba" },
+    { MUA, 0 | F_NOMATCH, "((a)(b)(?:a*))(?1)", "aba" },
     { MUA, 0, "((a)(b)(?:a*))(?1)", "abab" },
     { MUA, 0, "((a+)c(?2))b(?1)", "aacaabaca" },
     { MUA, 0, "((?2)b|(a)){2}(?1)", "aabab" },
@@ -572,10 +605,10 @@
     { MUA, 0, "(?1)(((a(*ACCEPT)))b)", "axaa" },
     { MUA, 0, "(?1)(?(DEFINE) (((ac(*ACCEPT)))b) )", "akaac" },
     { MUA, 0, "(a+)b(?1)b\\1", "abaaabaaaaa" },
-    { MUA, 0, "(?(DEFINE)(aa|a))(?1)ab", "aab" },
+    { MUA, 0 | F_NOMATCH, "(?(DEFINE)(aa|a))(?1)ab", "aab" },
     { MUA, 0, "(?(DEFINE)(a\\Kb))(?1)+ababc", "abababxabababc" },
     { MUA, 0, "(a\\Kb)(?1)+ababc", "abababxababababc" },
-    { MUA, 0, "(a\\Kb)(?1)+ababc", "abababxababababxc" },
+    { MUA, 0 | F_NOMATCH, "(a\\Kb)(?1)+ababc", "abababxababababxc" },
     { MUA, 0, "b|<(?R)*>", "<<b>" },
     { MUA, 0, "(a\\K){0}(?:(?1)b|ac)", "ac" },
     { MUA, 0, "(?(DEFINE)(a(?2)|b)(b(?1)|(a)))(?:(?1)|(?2))m", "ababababnababababaam" },
@@ -586,133 +619,541 @@
     { MUA, 0, "((?(R)(?:aaaa|a)|(?:(aaaa)|(a)))+)(?1)$", "aaaaaaaaaa aaaa" },
     { MUA, 0, "(?P<Name>a(?(R&Name)a|b))(?1)", "aab abb abaa" },

+    /* 16 bit specific tests. */
+    { CMA, 0 | F_FORCECONV, "\xc3\xa1", "\xc3\x81\xc3\xa1" },
+    { CMA, 0 | F_FORCECONV, "\xe1\xbd\xb8", "\xe1\xbf\xb8\xe1\xbd\xb8" },
+    { CMA, 0 | F_FORCECONV, "[\xc3\xa1]", "\xc3\x81\xc3\xa1" },
+    { CMA, 0 | F_FORCECONV, "[\xe1\xbd\xb8]", "\xe1\xbf\xb8\xe1\xbd\xb8" },
+    { CMA, 0 | F_FORCECONV, "[a-\xed\xb0\x80]", "A" },
+    { CMA, 0 | F_NO8 | F_FORCECONV, "[a-\\x{dc00}]", "B" },
+    { CMA, 0 | F_NO8 | F_NOMATCH | F_FORCECONV, "[b-\\x{dc00}]", "a" },
+    { CMA, 0 | F_NO8 | F_FORCECONV, "\xed\xa0\x80\\x{d800}\xed\xb0\x80\\x{dc00}", "\xed\xa0\x80\xed\xa0\x80\xed\xb0\x80\xed\xb0\x80" },
+    { CMA, 0 | F_NO8 | F_FORCECONV, "[\xed\xa0\x80\\x{d800}]{1,2}?[\xed\xb0\x80\\x{dc00}]{1,2}?#", "\xed\xa0\x80\xed\xa0\x80\xed\xb0\x80\xed\xb0\x80#" },
+    { CMA, 0 | F_FORCECONV, "[\xed\xa0\x80\xed\xb0\x80#]{0,3}(?<=\xed\xb0\x80.)", "\xed\xa0\x80#\xed\xa0\x80##\xed\xb0\x80\xed\xa0\x80" },
+    { CMA, 0 | F_FORCECONV, "[\xed\xa0\x80-\xed\xb3\xbf]", "\xed\x9f\xbf\xed\xa0\x83" },
+    { CMA, 0 | F_FORCECONV, "[\xed\xa0\x80-\xed\xb3\xbf]", "\xed\xb4\x80\xed\xb3\xb0" },
+    { CMA, 0 | F_NO8 | F_FORCECONV, "[\\x{d800}-\\x{dcff}]", "\xed\x9f\xbf\xed\xa0\x83" },
+    { CMA, 0 | F_NO8 | F_FORCECONV, "[\\x{d800}-\\x{dcff}]", "\xed\xb4\x80\xed\xb3\xb0" },
+    { CMA, 0 | F_FORCECONV, "[\xed\xa0\x80-\xef\xbf\xbf]+[\x1-\xed\xb0\x80]+#", "\xed\xa0\x85\xc3\x81\xed\xa0\x85\xef\xbf\xb0\xc2\x85\xed\xa9\x89#" },
+    { CMA, 0 | F_FORCECONV, "[\xed\xa0\x80][\xed\xb0\x80]{2,}", "\xed\xa0\x80\xed\xb0\x80\xed\xa0\x80\xed\xb0\x80\xed\xb0\x80\xed\xb0\x80" },
+    { MA, 0 | F_FORCECONV, "[^\xed\xb0\x80]{3,}?", "##\xed\xb0\x80#\xed\xb0\x80#\xc3\x89#\xed\xb0\x80" },
+    { MA, 0 | F_NO8 | F_FORCECONV, "[^\\x{dc00}]{3,}?", "##\xed\xb0\x80#\xed\xb0\x80#\xc3\x89#\xed\xb0\x80" },
+    { CMA, 0 | F_FORCECONV, ".\\B.", "\xed\xa0\x80\xed\xb0\x80" },
+    { CMA, 0 | F_FORCECONV, "\\D+(?:\\d+|.)\\S+(?:\\s+|.)\\W+(?:\\w+|.)\xed\xa0\x80\xed\xa0\x80", "\xed\xa0\x80\xed\xa0\x80\xed\xa0\x80\xed\xa0\x80\xed\xa0\x80\xed\xa0\x80\xed\xa0\x80\xed\xa0\x80" },
+    { CMA, 0 | F_FORCECONV, "\\d*\\s*\\w*\xed\xa0\x80\xed\xa0\x80", "\xed\xa0\x80\xed\xa0\x80" },
+    { CMA, 0 | F_FORCECONV | F_NOMATCH, "\\d*?\\D*?\\s*?\\S*?\\w*?\\W*?##", "\xed\xa0\x80\xed\xa0\x80\xed\xa0\x80\xed\xa0\x80#" },
+    { CMA | PCRE_EXTENDED, 0 | F_FORCECONV, "\xed\xa0\x80 \xed\xb0\x80 !", "\xed\xa0\x80\xed\xb0\x80!" },
+    { CMA, 0 | F_FORCECONV, "\xed\xa0\x80+#[^#]+\xed\xa0\x80", "\xed\xa0\x80#a\xed\xa0\x80" },
+    { CMA, 0 | F_FORCECONV, "(\xed\xa0\x80+)#\\1", "\xed\xa0\x80\xed\xa0\x80#\xed\xa0\x80\xed\xa0\x80" },
+    { PCRE_MULTILINE | PCRE_NEWLINE_ANY, 0 | F_NO8 | F_FORCECONV, "^-", "a--\xe2\x80\xa8--" },
+    { PCRE_BSR_UNICODE, 0 | F_NO8 | F_FORCECONV, "\\R", "ab\xe2\x80\xa8" },
+    { 0, 0 | F_NO8 | F_FORCECONV, "\\v", "ab\xe2\x80\xa9" },
+    { 0, 0 | F_NO8 | F_FORCECONV, "\\h", "ab\xe1\xa0\x8e" },
+    { 0, 0 | F_NO8 | F_FORCECONV, "\\v+?\\V+?#", "\xe2\x80\xa9\xe2\x80\xa9\xef\xbf\xbf\xef\xbf\xbf#" },
+    { 0, 0 | F_NO8 | F_FORCECONV, "\\h+?\\H+?#", "\xe1\xa0\x8e\xe1\xa0\x8e\xef\xbf\xbf\xef\xbf\xbf#" },
+
     /* Deep recursion. */
     { MUA, 0, "((((?:(?:(?:\\w)+)?)*|(?>\\w)+?)+|(?>\\w)?\?)*)?\\s", "aaaaa+ " },
     { MUA, 0, "(?:((?:(?:(?:\\w*?)+)??|(?>\\w)?|\\w*+)*)+)+?\\s", "aa+ " },
     { MUA, 0, "((a?)+)+b", "aaaaaaaaaaaaa b" },

     /* Deep recursion: Stack limit reached. */
-    { MA, 0, "a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?aaaaaaaaaaaaaaaaaaaaaaa", "aaaaaaaaaaaaaaaaaaaaaaa" },
-    { MA, 0, "(?:a+)+b", "aaaaaaaaaaaaaaaaaaaaaaaa b" },
-    { MA, 0, "(?:a+?)+?b", "aaaaaaaaaaaaaaaaaaaaaaaa b" },
-    { MA, 0, "(?:a*)*b", "aaaaaaaaaaaaaaaaaaaaaaaa b" },
-    { MA, 0, "(?:a*?)*?b", "aaaaaaaaaaaaaaaaaaaaaaaa b" },
+    { MA, 0 | F_NOMATCH, "a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?aaaaaaaaaaaaaaaaaaaaaaa", "aaaaaaaaaaaaaaaaaaaaaaa" },
+    { MA, 0 | F_NOMATCH, "(?:a+)+b", "aaaaaaaaaaaaaaaaaaaaaaaa b" },
+    { MA, 0 | F_NOMATCH, "(?:a+?)+?b", "aaaaaaaaaaaaaaaaaaaaaaaa b" },
+    { MA, 0 | F_NOMATCH, "(?:a*)*b", "aaaaaaaaaaaaaaaaaaaaaaaa b" },
+    { MA, 0 | F_NOMATCH, "(?:a*?)*?b", "aaaaaaaaaaaaaaaaaaaaaaaa b" },

     { 0, 0, NULL, NULL }
 };

+static const unsigned char *tables(int mode)
+{
+    /* The purpose of this function is to allow valgrind
+    to report invalid reads and writes. */
+    static unsigned char *tables_copy;
+    pcre *regex;
+    const char *errorptr;
+    int erroroffset;
+    const unsigned char *default_tables;
+#ifdef SUPPORT_PCRE8
+    char null_str[1] = { 0 };
+#else
+    PCRE_SCHAR16 null_str[1] = { 0 };
+#endif
+
+    if (mode) {
+        if (tables_copy)
+            free(tables_copy);
+        tables_copy = NULL;
+        return NULL;
+    }
+
+    if (tables_copy)
+        return tables_copy;
+
+    default_tables = NULL;
+#ifdef SUPPORT_PCRE8
+    regex = pcre_compile(null_str, 0, &errorptr, &erroroffset, NULL);
+    if (regex) {
+        pcre_fullinfo(regex, NULL, PCRE_INFO_DEFAULT_TABLES, &default_tables);
+        pcre_free(regex);
+    }
+#else
+    regex = pcre16_compile(null_str, 0, &errorptr, &erroroffset, NULL);
+    if (regex) {
+        pcre16_fullinfo(regex, NULL, PCRE_INFO_DEFAULT_TABLES, &default_tables);
+        pcre16_free(regex);
+    }
+#endif
+    /* Shouldn't ever happen. */
+    if (!default_tables)
+        return NULL;
+
+    /* Unfortunately the size of the tables cannot be obtained from pcre_fullinfo.
+    Since this is a test program, this is acceptable at the moment. */
+    tables_copy = (unsigned char *)malloc(1088);
+    if (!tables_copy)
+        return NULL;
+
+    memcpy(tables_copy, default_tables, 1088);
+    return tables_copy;
+}
+
+static pcre_jit_stack* callback(void *arg)
+{
+    return (pcre_jit_stack *)arg;
+}
+
+#ifdef SUPPORT_PCRE8
+static void setstack8(pcre_extra *extra)
+{
+    static pcre_jit_stack *stack;
+
+    if (!extra) {
+        if (stack)
+            pcre_jit_stack_free(stack);
+        stack = NULL;
+        return;
+    }
+
+    if (!stack)
+        stack = pcre_jit_stack_alloc(1, 1024 * 1024);
+    /* Extra can be NULL. */
+    pcre_assign_jit_stack(extra, callback, stack);
+}
+#endif /* SUPPORT_PCRE8 */
+
+#ifdef SUPPORT_PCRE16
+static void setstack16(pcre_extra *extra)
+{
+    static pcre_jit_stack *stack;
+
+    if (!extra) {
+        if (stack)
+            pcre16_jit_stack_free(stack);
+        stack = NULL;
+        return;
+    }
+
+    if (!stack)
+        stack = pcre16_jit_stack_alloc(1, 1024 * 1024);
+    /* Extra can be NULL. */
+    pcre16_assign_jit_stack(extra, callback, stack);
+}
+#endif /* SUPPORT_PCRE16 */
+
+#ifdef SUPPORT_PCRE16
+
+static int convert_utf8_to_utf16(const char *input, PCRE_SCHAR16 *output, int *offsetmap, int max_length)
+{
+    unsigned char *iptr = (unsigned char*)input;
+    unsigned short *optr = (unsigned short *)output;
+    unsigned int c;
+
+    if (max_length == 0)
+        return 0;
+
+    while (*iptr && max_length > 1) {
+        c = 0;
+        if (offsetmap)
+            *offsetmap++ = (int)(iptr - (unsigned char*)input);
+
+        if (!(*iptr & 0x80))
+            c = *iptr++;
+        else if (!(*iptr & 0x20)) {
+            c = ((iptr[0] & 0x1f) << 6) | (iptr[1] & 0x3f);
+            iptr += 2;
+        } else if (!(*iptr & 0x10)) {
+            c = ((iptr[0] & 0x0f) << 12) | ((iptr[1] & 0x3f) << 6) | (iptr[2] & 0x3f);
+            iptr += 3;
+        } else if (!(*iptr & 0x08)) {
+            c = ((iptr[0] & 0x07) << 18) | ((iptr[1] & 0x3f) << 12) | ((iptr[2] & 0x3f) << 6) | (iptr[3] & 0x3f);
+            iptr += 4;
+        }
+
+        if (c < 65536) {
+            *optr++ = c;
+            max_length--;
+        } else if (max_length <= 2) {
+            *optr = '\0';
+            return (int)(optr - (unsigned short *)output);
+        } else {
+            c -= 0x10000;
+            *optr++ = 0xd800 | ((c >> 10) & 0x3ff);
+            *optr++ = 0xdc00 | (c & 0x3ff);
+            max_length -= 2;
+            if (offsetmap)
+                offsetmap++;
+        }
+    }
+    if (offsetmap)
+        *offsetmap = (int)(iptr - (unsigned char*)input);
+    *optr = '\0';
+    return (int)(optr - (unsigned short *)output);
+}
+
+static int copy_char8_to_char16(const char *input, PCRE_SCHAR16 *output, int max_length)
+{
+    unsigned char *iptr = (unsigned char*)input;
+    unsigned short *optr = (unsigned short *)output;
+
+    if (max_length == 0)
+        return 0;
+
+    while (*iptr && max_length > 1) {
+        *optr++ = *iptr++;
+        max_length--;
+    }
+    *optr = '\0';
+    return (int)(optr - (unsigned short *)output);
+}
+
+#define REGTEST_MAX_LENGTH 4096
+static PCRE_SCHAR16 regtest_buf[REGTEST_MAX_LENGTH];
+static int regtest_offsetmap[REGTEST_MAX_LENGTH];
+
+#endif /* SUPPORT_PCRE16 */
+
+static int check_ascii(const char *input)
+{
+    const unsigned char *ptr = (unsigned char *)input;
+    while (*ptr) {
+        if (*ptr > 127)
+            return 0;
+        ptr++;
+    }
+    return 1;
+}
+
 static int regression_tests(void)
 {
-    pcre *re;
     struct regression_test_case *current = regression_test_cases;
     const char *error;
-    pcre_extra *extra;
-    int utf8 = 0, ucp = 0;
-    int ovector1[32];
-    int ovector2[32];
-    int return_value1, return_value2;
     int i, err_offs;
-    int total = 0, succesful = 0;
+    int is_successful, is_ascii_pattern, is_ascii_input;
+    int total = 0;
+    int successful = 0;
     int counter = 0;
-    int disabled_flags = PCRE_BUG;
+#ifdef SUPPORT_PCRE8
+    pcre *re8;
+    pcre_extra *extra8;
+    int ovector8_1[32];
+    int ovector8_2[32];
+    int return_value8_1, return_value8_2;
+    int utf8 = 0, ucp8 = 0;
+    int disabled_flags8 = 0;
+#endif
+#ifdef SUPPORT_PCRE16
+    pcre *re16;
+    pcre_extra *extra16;
+    int ovector16_1[32];
+    int ovector16_2[32];
+    int return_value16_1, return_value16_2;
+    int utf16 = 0, ucp16 = 0;
+    int disabled_flags16 = 0;
+    int length16;
+#endif

     /* This test compares the behaviour of interpreter and JIT. Although disabling
-    utf8 or ucp may make tests fail, if the pcre_exec result is the SAME, it is
+    utf or ucp may make tests fail, if the pcre_exec result is the SAME, it is
     still considered successful from pcre_jit_test point of view. */

+    printf("Running JIT regression\n");
+
+#ifdef SUPPORT_PCRE8
     pcre_config(PCRE_CONFIG_UTF8, &utf8);
-    pcre_config(PCRE_CONFIG_UNICODE_PROPERTIES, &ucp);
+    pcre_config(PCRE_CONFIG_UNICODE_PROPERTIES, &ucp8);
     if (!utf8)
-        disabled_flags |= PCRE_UTF8;
-    if (!ucp)
-        disabled_flags |= PCRE_UCP;
+        disabled_flags8 |= PCRE_UTF8;
+    if (!ucp8)
+        disabled_flags8 |= PCRE_UCP;
+    printf(" in  8 bit mode with utf8  %s and ucp %s:\n", utf8 ? "enabled" : "disabled", ucp8 ? "enabled" : "disabled");
+#endif
+#ifdef SUPPORT_PCRE16
+    pcre16_config(PCRE_CONFIG_UTF16, &utf16);
+    pcre16_config(PCRE_CONFIG_UNICODE_PROPERTIES, &ucp16);
+    if (!utf16)
+        disabled_flags16 |= PCRE_UTF8;
+    if (!ucp16)
+        disabled_flags16 |= PCRE_UCP;
+    printf(" in 16 bit mode with utf16 %s and ucp %s:\n", utf16 ? "enabled" : "disabled", ucp16 ? "enabled" : "disabled");
+#endif

-    printf("Running JIT regression tests with utf8 %s and ucp %s:\n", utf8 ? "enabled" : "disabled", ucp ? "enabled" : "disabled");
     while (current->pattern) {
         /* printf("\nPattern: %s :\n", current->pattern); */
         total++;
+        if (current->start_offset & F_PROPERTY) {
+            is_ascii_pattern = 0;
+            is_ascii_input = 0;
+        } else {
+            is_ascii_pattern = check_ascii(current->pattern);
+            is_ascii_input = check_ascii(current->input);
+        }

         error = NULL;
-        re = pcre_compile(current->pattern, current->flags & ~(PCRE_NOTBOL | PCRE_NOTEOL | PCRE_NOTEMPTY | PCRE_NOTEMPTY_ATSTART | disabled_flags), &error, &err_offs, NULL);
+#ifdef SUPPORT_PCRE8
+        re8 = NULL;
+        if (!(current->start_offset & F_NO8))
+            re8 = pcre_compile(current->pattern,
+                current->flags & ~(PCRE_NOTBOL | PCRE_NOTEOL | PCRE_NOTEMPTY | PCRE_NOTEMPTY_ATSTART | disabled_flags8),
+                &error, &err_offs, tables(0));

-        if (!re) {
-            if (utf8 && ucp)
-                printf("\nCannot compile pattern: %s\n", current->pattern);
-            else {
-                /* Some patterns cannot be compiled when either of utf8
-                or ucp is disabled. We just skip them. */
-                printf(".");
-                succesful++;
+        extra8 = NULL;
+        if (re8) {
+            error = NULL;
+            extra8 = pcre_study(re8, PCRE_STUDY_JIT_COMPILE, &error);
+            if (!extra8) {
+                printf("\n8 bit: Cannot study pattern: %s\n", current->pattern);
+                pcre_free(re8);
+                re8 = NULL;
             }
-            current++;
-            continue;
-        }
+            else if (!(extra8->flags & PCRE_EXTRA_EXECUTABLE_JIT)) {
+                printf("\n8 bit: JIT compiler does not support: %s\n", current->pattern);
+                pcre_free_study(extra8);
+                pcre_free(re8);
+                re8 = NULL;
+            }
+        } else if (((utf8 && ucp8) || is_ascii_pattern) && !(current->start_offset & F_NO8))
+            printf("\n8 bit: Cannot compile pattern: %s\n", current->pattern);
+#endif
+#ifdef SUPPORT_PCRE16
+        if ((current->flags & PCRE_UTF8) || (current->start_offset & F_FORCECONV))
+            convert_utf8_to_utf16(current->pattern, regtest_buf, NULL, REGTEST_MAX_LENGTH);
+        else
+            copy_char8_to_char16(current->pattern, regtest_buf, REGTEST_MAX_LENGTH);

-        error = NULL;
-        extra = pcre_study(re, PCRE_STUDY_JIT_COMPILE, &error);
-        if (!extra) {
-            printf("\nCannot study pattern: %s\n", current->pattern);
-            current++;
-            continue;
-        }
+        re16 = NULL;
+        if (!(current->start_offset & F_NO16))
+            re16 = pcre16_compile(regtest_buf,
+                current->flags & ~(PCRE_NOTBOL | PCRE_NOTEOL | PCRE_NOTEMPTY | PCRE_NOTEMPTY_ATSTART | disabled_flags16),
+                &error, &err_offs, tables(0));

-        if (!(extra->flags & PCRE_EXTRA_EXECUTABLE_JIT)) {
-            printf("\nJIT compiler does not support: %s\n", current->pattern);
-            current++;
-            continue;
-        }
+        extra16 = NULL;
+        if (re16) {
+            error = NULL;
+            extra16 = pcre16_study(re16, PCRE_STUDY_JIT_COMPILE, &error);
+            if (!extra16) {
+                printf("\n16 bit: Cannot study pattern: %s\n", current->pattern);
+                pcre16_free(re16);
+                re16 = NULL;
+            }
+            else if (!(extra16->flags & PCRE_EXTRA_EXECUTABLE_JIT)) {
+                printf("\n16 bit: JIT compiler does not support: %s\n", current->pattern);
+                pcre16_free_study(extra16);
+                pcre16_free(re16);
+                re16 = NULL;
+            }
+        } else if (((utf16 && ucp16) || is_ascii_pattern) && !(current->start_offset & F_NO16))
+            printf("\n16 bit: Cannot compile pattern: %s\n", current->pattern);
+#endif

         counter++;
-        if ((counter & 0x3) != 0)
-            setstack(extra);
+        if ((counter & 0x3) != 0) {
+#ifdef SUPPORT_PCRE8
+            setstack8(NULL);
+#endif
+#ifdef SUPPORT_PCRE16
+            setstack16(NULL);
+#endif
+        }

+#ifdef SUPPORT_PCRE8
+        return_value8_1 = -1000;
+        return_value8_2 = -1000;
         for (i = 0; i < 32; ++i)
-            ovector1[i] = -2;
-        return_value1 = pcre_exec(re, extra, current->input, strlen(current->input), current->start_offset, current->flags & (PCRE_NOTBOL | PCRE_NOTEOL | PCRE_NOTEMPTY | PCRE_NOTEMPTY_ATSTART), ovector1, 32);
+            ovector8_1[i] = -2;
+        for (i = 0; i < 32; ++i)
+            ovector8_2[i] = -2;
+        if (re8) {
+            setstack8(extra8);
+            return_value8_1 = pcre_exec(re8, extra8, current->input, strlen(current->input), current->start_offset & OFFSET_MASK,
+                current->flags & (PCRE_NOTBOL | PCRE_NOTEOL | PCRE_NOTEMPTY | PCRE_NOTEMPTY_ATSTART), ovector8_1, 32);
+            return_value8_2 = pcre_exec(re8, NULL, current->input, strlen(current->input), current->start_offset & OFFSET_MASK,
+                current->flags & (PCRE_NOTBOL | PCRE_NOTEOL | PCRE_NOTEMPTY | PCRE_NOTEMPTY_ATSTART), ovector8_2, 32);
+        }
+#endif

+#ifdef SUPPORT_PCRE16
+        return_value16_1 = -1000;
+        return_value16_2 = -1000;
         for (i = 0; i < 32; ++i)
-            ovector2[i] = -2;
-        return_value2 = pcre_exec(re, NULL, current->input, strlen(current->input), current->start_offset, current->flags & (PCRE_NOTBOL | PCRE_NOTEOL | PCRE_NOTEMPTY | PCRE_NOTEMPTY_ATSTART), ovector2, 32);
+            ovector16_1[i] = -2;
+        for (i = 0; i < 32; ++i)
+            ovector16_2[i] = -2;
+        if (re16) {
+            setstack16(extra16);
+            if ((current->flags & PCRE_UTF8) || (current->start_offset & F_FORCECONV))
+                length16 = convert_utf8_to_utf16(current->input, regtest_buf, regtest_offsetmap, REGTEST_MAX_LENGTH);
+            else
+                length16 = copy_char8_to_char16(current->input, regtest_buf, REGTEST_MAX_LENGTH);
+            return_value16_1 = pcre16_exec(re16, extra16, regtest_buf, length16, current->start_offset & OFFSET_MASK,
+                current->flags & (PCRE_NOTBOL | PCRE_NOTEOL | PCRE_NOTEMPTY | PCRE_NOTEMPTY_ATSTART), ovector16_1, 32);
+            return_value16_2 = pcre16_exec(re16, NULL, regtest_buf, length16, current->start_offset & OFFSET_MASK,
+                current->flags & (PCRE_NOTBOL | PCRE_NOTEOL | PCRE_NOTEMPTY | PCRE_NOTEMPTY_ATSTART), ovector16_2, 32);
+        }
+#endif

-        /* If PCRE_BUG is set, just run the test, but do not compare the results.
+        /* If F_DIFF is set, just run the test, but do not compare the results.
         Segfaults can still be captured. */
-        if (!(current->flags & PCRE_BUG)) {
-            if (return_value1 != return_value2) {
-                printf("\nReturn value differs(%d:%d): '%s' @ '%s'\n", return_value1, return_value2, current->pattern, current->input);
-                current++;
-                continue;
-            }

-            if (return_value1 >= 0) {
-                return_value1 *= 2;
-                err_offs = 0;
-                for (i = 0; i < return_value1; ++i)
-                    if (ovector1[i] != ovector2[i]) {
-                        printf("\nOvector[%d] value differs(%d:%d): '%s' @ '%s' \n", i, ovector1[i], ovector2[i], current->pattern, current->input);
-                        err_offs = 1;
+        is_successful = 1;
+        if (!(current->start_offset & F_DIFF)) {
+#if defined SUPPORT_PCRE8 && defined SUPPORT_PCRE16
+            if (utf8 == utf16 && !(current->start_offset & F_FORCECONV)) {
+                /* All results must be the same. */
+                if (return_value8_1 != return_value8_2 || return_value8_1 != return_value16_1 || return_value8_1 != return_value16_2) {
+                    printf("\n8 and 16 bit: Return value differs(%d:%d:%d:%d): [%d] '%s' @ '%s'\n",
+                        return_value8_1, return_value8_2, return_value16_1, return_value16_2,
+                        total, current->pattern, current->input);
+                    is_successful = 0;
+                } else if (return_value8_1 >= 0) {
+                    return_value8_1 *= 2;
+                    /* Map the 16 bit offsets back to 8 bit offsets. */
+                    if (current->flags & PCRE_UTF8) {
+                        for (i = 0; i < return_value8_1; ++i) {
+                            if (ovector16_1[i] >= 0)
+                                ovector16_1[i] = regtest_offsetmap[ovector16_1[i]];
+                            if (ovector16_2[i] >= 0)
+                                ovector16_2[i] = regtest_offsetmap[ovector16_2[i]];
+                        }
                     }
-                if (err_offs) {
-                    current++;
-                    continue;
+
+                    for (i = 0; i < return_value8_1; ++i)
+                        if (ovector8_1[i] != ovector8_2[i] || ovector8_1[i] != ovector16_1[i] || ovector8_1[i] != ovector16_2[i]) {
+                            printf("\n8 and 16 bit: Ovector[%d] value differs(%d:%d:%d:%d): [%d] '%s' @ '%s' \n",
+                                i, ovector8_1[i], ovector8_2[i], ovector16_1[i], ovector16_2[i],
+                                total, current->pattern, current->input);
+                            is_successful = 0;
+                        }
                 }
+            } else {
+#endif /* SUPPORT_PCRE8 && SUPPORT_PCRE16 */
+                /* Only the JIT and interpreter results of the same mode (8 bit or 16 bit) must be equal. */
+#ifdef SUPPORT_PCRE8
+                if (return_value8_1 != return_value8_2) {
+                    printf("\n8 bit: Return value differs(%d:%d): [%d] '%s' @ '%s'\n",
+                        return_value8_1, return_value8_2, total, current->pattern, current->input);
+                    is_successful = 0;
+                } else if (return_value8_1 >= 0) {
+                    return_value8_1 *= 2;
+                    for (i = 0; i < return_value8_1; ++i)
+                        if (ovector8_1[i] != ovector8_2[i]) {
+                            printf("\n8 bit: Ovector[%d] value differs(%d:%d): [%d] '%s' @ '%s'\n",
+                                i, ovector8_1[i], ovector8_2[i], total, current->pattern, current->input);
+                            is_successful = 0;
+                        }
+                }
+#endif
+
+#ifdef SUPPORT_PCRE16
+                if (return_value16_1 != return_value16_2) {
+                    printf("\n16 bit: Return value differs(%d:%d): [%d] '%s' @ '%s'\n",
+                        return_value16_1, return_value16_2, total, current->pattern, current->input);
+                    is_successful = 0;
+                } else if (return_value16_1 >= 0) {
+                    return_value16_1 *= 2;
+                    for (i = 0; i < return_value16_1; ++i)
+                        if (ovector16_1[i] != ovector16_2[i]) {
+                            printf("\n16 bit: Ovector[%d] value differs(%d:%d): [%d] '%s' @ '%s'\n",
+                                i, ovector16_1[i], ovector16_2[i], total, current->pattern, current->input);
+                            is_successful = 0;
+                        }
+                }
+#endif
+
+#if defined SUPPORT_PCRE8 && defined SUPPORT_PCRE16
             }
+#endif /* SUPPORT_PCRE8 && SUPPORT_PCRE16 */
         }

-        pcre_free_study(extra);
-        pcre_free(re);
+        if (is_successful) {
+#ifdef SUPPORT_PCRE8
+            if (!(current->start_offset & F_NO8) && ((utf8 && ucp8) || is_ascii_input)) {
+                if (return_value8_1 < 0 && !(current->start_offset & F_NOMATCH)) {
+                    printf("8 bit: Test should match: [%d] '%s' @ '%s'\n",
+                        total, current->pattern, current->input);
+                    is_successful = 0;
+                }

-        /* printf("[%d-%d]%s", ovector1[0], ovector1[1], (current->flags & PCRE_CASELESS) ? "C" : ""); */
+                if (return_value8_1 >= 0 && (current->start_offset & F_NOMATCH)) {
+                    printf("8 bit: Test should not match: [%d] '%s' @ '%s'\n",
+                        total, current->pattern, current->input);
+                    is_successful = 0;
+                }
+            }
+#endif
+#ifdef SUPPORT_PCRE16
+            if (!(current->start_offset & F_NO16) && ((utf16 && ucp16) || is_ascii_input)) {
+                if (return_value16_1 < 0 && !(current->start_offset & F_NOMATCH)) {
+                    printf("16 bit: Test should match: [%d] '%s' @ '%s'\n",
+                        total, current->pattern, current->input);
+                    is_successful = 0;
+                }
+
+                if (return_value16_1 >= 0 && (current->start_offset & F_NOMATCH)) {
+                    printf("16 bit: Test should not match: [%d] '%s' @ '%s'\n",
+                        total, current->pattern, current->input);
+                    is_successful = 0;
+                }
+            }
+#endif
+        }
+
+        if (is_successful)
+            successful++;
+
+#ifdef SUPPORT_PCRE8
+        if (re8) {
+            pcre_free_study(extra8);
+            pcre_free(re8);
+        }
+#endif
+#ifdef SUPPORT_PCRE16
+        if (re16) {
+            pcre16_free_study(extra16);
+            pcre16_free(re16);
+        }
+#endif
+
+        /* printf("[%d-%d|%d-%d]%s", ovector8_1[0], ovector8_1[1], ovector16_1[0], ovector16_1[1], (current->flags & PCRE_CASELESS) ? "C" : ""); */
         printf(".");
         fflush(stdout);
         current++;
-        succesful++;
     }
+    tables(1);
+#ifdef SUPPORT_PCRE8
+    setstack8(NULL);
+#endif
+#ifdef SUPPORT_PCRE16
+    setstack16(NULL);
+#endif

-    if (total == succesful) {
+    if (total == successful) {
         printf("\nAll JIT regression tests are successfully passed.\n");
         return 0;
     } else {
-        printf("\nSuccessful test ratio: %d%%\n", succesful * 100 / total);
+        printf("\nSuccessful test ratio: %d%% (%d failed)\n", successful * 100 / total, total - successful);
         return 1;
     }
 }
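
The UTF branch of the test above maps each 16-bit ovector entry through regtest_offsetmap before comparing it with the 8-bit result, because the two libraries report offsets in different code units. The following is only a sketch of how such a map could be built; build_offset_map and its parameters are hypothetical names, not taken from pcre_jit_test.c, and it assumes valid, zero-terminated UTF-8 input.

```c
#include <stddef.h>

/* Hypothetical helper (not from pcre_jit_test.c): fill map[] so that map[k]
   is the UTF-8 byte offset of the character that starts at UTF-16 code unit
   k. Assumes s is valid, zero-terminated UTF-8. Returns the number of UTF-16
   code units, or -1 if map[] is too small. */
static int build_offset_map(const unsigned char *s, int *map, int map_size)
{
    int byte_off = 0, unit_off = 0;

    while (s[byte_off] != 0) {
        unsigned char c = s[byte_off];
        /* Sequence length from the lead byte (valid input assumed). */
        int len = (c < 0x80) ? 1 : (c < 0xe0) ? 2 : (c < 0xf0) ? 3 : 4;
        /* Four-byte sequences encode code points above U+FFFF, which need a
           surrogate pair (two code units) in UTF-16. */
        int units = (len == 4) ? 2 : 1;

        /* Keep one slot free for the end-of-string entry. */
        if (unit_off + units >= map_size) return -1;

        map[unit_off] = byte_off;
        if (units == 2) map[unit_off + 1] = byte_off;  /* low surrogate */

        byte_off += len;
        unit_off += units;
    }

    map[unit_off] = byte_off;  /* offset just past the last character */
    return unit_off;
}
```

With a map like this, a 16-bit match offset k becomes map[k], which can then be compared directly with the offset reported by the 8-bit library.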

Modified: code/trunk/pcre_maketables.c
===================================================================
--- code/trunk/pcre_maketables.c    2011-12-28 16:10:09 UTC (rev 835)
+++ code/trunk/pcre_maketables.c    2011-12-28 17:16:11 UTC (rev 836)
@@ -6,7 +6,7 @@
 and semantics are as close as possible to those of the Perl 5 language.

                        Written by Philip Hazel
-           Copyright (c) 1997-2008 University of Cambridge
+           Copyright (c) 1997-2012 University of Cambridge

 -----------------------------------------------------------------------------
 Redistribution and use in source and binary forms, with or without
@@ -59,21 +59,26 @@
/* This function builds a set of character tables for use by PCRE and returns
a pointer to them. They are build using the ctype functions, and consequently
their contents will depend upon the current locale setting. When compiled as
-part of the library, the store is obtained via pcre_malloc(), but when compiled
-inside dftables, use malloc().
+part of the library, the store is obtained via PUBL(malloc)(), but when
+compiled inside dftables, use malloc().

 Arguments:   none
 Returns:     pointer to the contiguous block of data
 */

+#ifdef COMPILE_PCRE8
const unsigned char *
pcre_maketables(void)
+#else
+const unsigned char *
+pcre16_maketables(void)
+#endif
{
unsigned char *yield, *p;
int i;

#ifndef DFTABLES
-yield = (unsigned char*)(pcre_malloc)(tables_length);
+yield = (unsigned char*)(PUBL(malloc))(tables_length);
#else
yield = (unsigned char*)malloc(tables_length);
#endif
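
The hunk above replaces the hard-coded pcre_malloc with PUBL(malloc), so that one source file can serve both libraries. As a rough illustration only, and assuming PUBL() is a token-pasting macro that selects the pcre_ or pcre16_ prefix (its definition is not shown in this part of the diff), the idea can be sketched as follows. alloc_tables is a hypothetical stand-in for the allocation step, and the function pointer below stands in for the library's allocator hook.

```c
#include <stdlib.h>

/* Assumed shape of the macro: pick the public prefix for the library being
   compiled. The real definition lives in pcre_internal.h and is not shown
   in this part of the diff. */
#ifdef COMPILE_PCRE16
#define PUBL(name) pcre16_##name
#else
#define PUBL(name) pcre_##name
#endif

/* Stand-in for the library's replaceable allocator hook; expands to
   pcre_malloc or pcre16_malloc depending on the build. */
static void *(*PUBL(malloc))(size_t) = malloc;

/* Hypothetical helper mirroring the allocation line in the hunk above. */
static unsigned char *alloc_tables(size_t tables_length)
{
    return (unsigned char *)(PUBL(malloc))(tables_length);
}
```

The parentheses around PUBL(malloc) in the call mirror the diff; they keep the call working even if malloc-like names are ever defined as function-like macros.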

Modified: code/trunk/pcre_newline.c
===================================================================
--- code/trunk/pcre_newline.c    2011-12-28 16:10:09 UTC (rev 835)
+++ code/trunk/pcre_newline.c    2011-12-28 17:16:11 UTC (rev 836)
@@ -6,7 +6,7 @@
 and semantics are as close as possible to those of the Perl 5 language.

                        Written by Philip Hazel
-           Copyright (c) 1997-2009 University of Cambridge
+           Copyright (c) 1997-2012 University of Cambridge

 -----------------------------------------------------------------------------
 Redistribution and use in source and binary forms, with or without
@@ -67,16 +67,25 @@
   type         the newline type
   endptr       pointer to the end of the string
   lenptr       where to return the length
-  utf8         TRUE if in utf8 mode
+  utf          TRUE if in utf mode

 Returns:       TRUE or FALSE
 */

BOOL
-_pcre_is_newline(USPTR ptr, int type, USPTR endptr, int *lenptr, BOOL utf8)
+PRIV(is_newline)(PCRE_PUCHAR ptr, int type, PCRE_PUCHAR endptr, int *lenptr,
+ BOOL utf)
{
int c;
-if (utf8) { GETCHAR(c, ptr); } else c = *ptr;
+(void)utf;
+#ifdef SUPPORT_UTF
+if (utf)
+ {
+ GETCHAR(c, ptr);
+ }
+else
+#endif /* SUPPORT_UTF */
+ c = *ptr;

 if (type == NLTYPE_ANYCRLF) switch(c)
   {
@@ -95,9 +104,15 @@
   case 0x000c: *lenptr = 1; return TRUE;             /* FF */
   case 0x000d: *lenptr = (ptr < endptr - 1 && ptr[1] == 0x0a)? 2 : 1;
                return TRUE;                          /* CR */
-  case 0x0085: *lenptr = utf8? 2 : 1; return TRUE;   /* NEL */
+#ifdef COMPILE_PCRE8
+  case 0x0085: *lenptr = utf? 2 : 1; return TRUE;    /* NEL */
   case 0x2028:                                       /* LS */
   case 0x2029: *lenptr = 3; return TRUE;             /* PS */
+#else
+  case 0x0085:                                       /* NEL */
+  case 0x2028:                                       /* LS */
+  case 0x2029: *lenptr = 1; return TRUE;             /* PS */
+#endif /* COMPILE_PCRE8 */
   default: return FALSE;
   }
 }
@@ -116,26 +131,27 @@
   type         the newline type
   startptr     pointer to the start of the string
   lenptr       where to return the length
-  utf8         TRUE if in utf8 mode
+  utf          TRUE if in utf mode

 Returns:       TRUE or FALSE
 */

BOOL
-_pcre_was_newline(USPTR ptr, int type, USPTR startptr, int *lenptr, BOOL utf8)
+PRIV(was_newline)(PCRE_PUCHAR ptr, int type, PCRE_PUCHAR startptr, int *lenptr,
+ BOOL utf)
{
int c;
+(void)utf;
ptr--;
-#ifdef SUPPORT_UTF8
-if (utf8)
+#ifdef SUPPORT_UTF
+if (utf)
{
BACKCHAR(ptr);
GETCHAR(c, ptr);
}
-else c = *ptr;
-#else /* no UTF-8 support */
-c = *ptr;
-#endif /* SUPPORT_UTF8 */
+else
+#endif /* SUPPORT_UTF */
+ c = *ptr;

 if (type == NLTYPE_ANYCRLF) switch(c)
   {
@@ -152,9 +168,15 @@
   case 0x000b:                                      /* VT */
   case 0x000c:                                      /* FF */
   case 0x000d: *lenptr = 1; return TRUE;            /* CR */
-  case 0x0085: *lenptr = utf8? 2 : 1; return TRUE;  /* NEL */
+#ifdef COMPILE_PCRE8
+  case 0x0085: *lenptr = utf? 2 : 1; return TRUE;   /* NEL */
   case 0x2028:                                      /* LS */
   case 0x2029: *lenptr = 3; return TRUE;            /* PS */
+#else
+  case 0x0085:                                       /* NEL */
+  case 0x2028:                                       /* LS */
+  case 0x2029: *lenptr = 1; return TRUE;             /* PS */
+#endif /* COMPILE_PCRE8 */
   default: return FALSE;
   }
 }
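
The two newline hunks above change what *lenptr means: it is now a count of code units rather than bytes. NEL, LS and PS each fit in a single 16-bit unit, while in the 8-bit library NEL takes 2 bytes in UTF mode and LS/PS take 3. The sketch below restates that rule for single-character newlines; newline_units is a hypothetical helper, not a PCRE function, and the two-unit CR LF case handled by the real code is left out.

```c
/* Hypothetical helper, not part of PCRE: length in code units of a
   single-character newline. unit_bits is 8 or 16; utf is nonzero in UTF
   mode. Returns 0 if c is not a newline character. */
static int newline_units(unsigned int c, int unit_bits, int utf)
{
    switch (c) {
    case 0x000a:                  /* LF */
    case 0x000b:                  /* VT */
    case 0x000c:                  /* FF */
    case 0x000d:                  /* CR (on its own) */
        return 1;
    case 0x0085:                  /* NEL: two bytes only in 8-bit UTF mode */
        return (unit_bits == 8 && utf) ? 2 : 1;
    case 0x2028:                  /* LS */
    case 0x2029:                  /* PS */
        return (unit_bits == 8) ? 3 : 1;
    default:
        return 0;
    }
}
```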

Modified: code/trunk/pcre_ord2utf8.c
===================================================================
--- code/trunk/pcre_ord2utf8.c    2011-12-28 16:10:09 UTC (rev 835)
+++ code/trunk/pcre_ord2utf8.c    2011-12-28 17:16:11 UTC (rev 836)
@@ -6,7 +6,7 @@
 and semantics are as close as possible to those of the Perl 5 language.

                        Written by Philip Hazel
-           Copyright (c) 1997-2008 University of Cambridge
+           Copyright (c) 1997-2012 University of Cambridge

 -----------------------------------------------------------------------------
 Redistribution and use in source and binary forms, with or without
@@ -52,35 +52,45 @@
 *       Convert character value to UTF-8         *
 *************************************************/

-/* This function takes an integer value in the range 0 - 0x7fffffff
-and encodes it as a UTF-8 character in 0 to 6 bytes.
+/* This function takes an integer value in the range 0 - 0x10ffff
+and encodes it as a UTF-8 character in 1 to 6 pcre_uchars.

 Arguments:
   cvalue     the character value
-  buffer     pointer to buffer for result - at least 6 bytes long
+  buffer     pointer to buffer for result - at least 6 pcre_uchars long

 Returns:     number of characters placed in the buffer
 */

int
-_pcre_ord2utf8(int cvalue, uschar *buffer)
+PRIV(ord2utf)(pcre_uint32 cvalue, pcre_uchar *buffer)
{
-#ifdef SUPPORT_UTF8
+#ifdef SUPPORT_UTF
+
register int i, j;
-for (i = 0; i < _pcre_utf8_table1_size; i++)
- if (cvalue <= _pcre_utf8_table1[i]) break;
+
+/* Checking invalid cvalue character, encoded as invalid UTF-16 character.
+Should never happen in practice. */
+if ((cvalue & 0xf800) == 0xd800 || cvalue >= 0x110000)
+ cvalue = 0xfffe;
+
+for (i = 0; i < PRIV(utf8_table1_size); i++)
+ if (cvalue <= PRIV(utf8_table1)[i]) break;
buffer += i;
for (j = i; j > 0; j--)
{
*buffer-- = 0x80 | (cvalue & 0x3f);
cvalue >>= 6;
}
-*buffer = _pcre_utf8_table2[i] | cvalue;
+*buffer = PRIV(utf8_table2)[i] | cvalue;
return i + 1;
+
#else
+
(void)(cvalue); /* Keep compiler happy; this function won't ever be */
-(void)(buffer); /* called when SUPPORT_UTF8 is not defined. */
+(void)(buffer); /* called when SUPPORT_UTF is not defined. */
return 0;
+
#endif
}
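
For readers who want to try the encoding loop outside the library, here is a self-contained sketch of the same technique. ord_to_utf8, utf8_limit and utf8_lead are hypothetical names standing in for PRIV(ord2utf), PRIV(utf8_table1) and PRIV(utf8_table2); the table values are the usual UTF-8 range limits and lead-byte markers. One deliberate difference: the sketch tests the surrogate range with explicit bounds, whereas the mask test in the diff, (cvalue & 0xf800) == 0xd800, would also match values such as 0x1d800.

```c
/* Largest value encodable in 1..6 bytes, and the lead-byte marker for each
   length (stand-ins for the library's internal tables). */
static const unsigned int utf8_limit[] =
    { 0x7f, 0x7ff, 0xffff, 0x1fffff, 0x3ffffff, 0x7fffffff };
static const unsigned char utf8_lead[] =
    { 0x00, 0xc0, 0xe0, 0xf0, 0xf8, 0xfc };

/* Hypothetical stand-in for PRIV(ord2utf) in the 8-bit build: encode cvalue
   as UTF-8 into buffer and return the number of bytes written. */
static int ord_to_utf8(unsigned int cvalue, unsigned char *buffer)
{
    int i, j;

    /* Surrogates and out-of-range values are replaced by 0xfffe, as in the
       diff, but tested here with explicit bounds. */
    if ((cvalue >= 0xd800 && cvalue <= 0xdfff) || cvalue >= 0x110000)
        cvalue = 0xfffe;

    /* Find the shortest encoding; i is the number of continuation bytes. */
    for (i = 0; i < 6; i++)
        if (cvalue <= utf8_limit[i]) break;

    /* Fill the continuation bytes from the end, 6 bits at a time. */
    buffer += i;
    for (j = i; j > 0; j--) {
        *buffer-- = (unsigned char)(0x80 | (cvalue & 0x3f));
        cvalue >>= 6;
    }

    /* What is left of cvalue goes into the lead byte. */
    *buffer = (unsigned char)(utf8_lead[i] | cvalue);
    return i + 1;
}
```

Because the value is clamped to at most 0x10ffff first, the loop never selects more than three continuation bytes, so the sketch writes at most 4 bytes.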

Copied: code/trunk/pcre_printint.c (from rev 835, code/branches/pcre16/pcre_printint.c)
===================================================================
--- code/trunk/pcre_printint.c                            (rev 0)
+++ code/trunk/pcre_printint.c    2011-12-28 17:16:11 UTC (rev 836)
@@ -0,0 +1,711 @@
+/*************************************************
+*      Perl-Compatible Regular Expressions       *
+*************************************************/
+
+/* PCRE is a library of functions to support regular expressions whose syntax
+and semantics are as close as possible to those of the Perl 5 language.
+
+                       Written by Philip Hazel
+           Copyright (c) 1997-2012 University of Cambridge
+
+-----------------------------------------------------------------------------
+Redistribution and use in source and binary forms, with or without
+modification, are permitted provided that the following conditions are met:
+
+    * Redistributions of source code must retain the above copyright notice,
+      this list of conditions and the following disclaimer.
+
+    * Redistributions in binary form must reproduce the above copyright
+      notice, this list of conditions and the following disclaimer in the
+      documentation and/or other materials provided with the distribution.
+
+    * Neither the name of the University of Cambridge nor the names of its
+      contributors may be used to endorse or promote products derived from
+      this software without specific prior written permission.
+
+THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
+AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE
+LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
+CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
+SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
+INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
+CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
+ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
+POSSIBILITY OF SUCH DAMAGE.
+-----------------------------------------------------------------------------
+*/
+
+
+/* This module contains a PCRE private debugging function for printing out the
+internal form of a compiled regular expression, along with some supporting
+local functions. This source file is used in two places:
+
+(1) It is #included by pcre_compile.c when it is compiled in debugging mode
+(PCRE_DEBUG defined in pcre_internal.h). It is not included in production
+compiles. In this case PCRE_INCLUDED is defined.
+
+(2) It is also compiled separately and linked with pcretest.c, which can be
+asked to print out a compiled regex for debugging purposes. */
+
+#ifndef PCRE_INCLUDED
+
+#ifdef HAVE_CONFIG_H
+#include "config.h"
+#endif
+
+/* We have to include pcre_internal.h because we need the internal info for
+displaying the results of pcre_study() and we also need to know about the
+internal macros, structures, and other internal data values; pcretest has
+"inside information" compared to a program that strictly follows the PCRE API.
+
+Although pcre_internal.h does itself include pcre.h, we explicitly include it
+here before pcre_internal.h so that the PCRE_EXP_xxx macros get set
+appropriately for an application, not for building PCRE. */
+
+#include "pcre.h"
+#include "pcre_internal.h"
+
+/* These are the functions that are contained within. It doesn't seem worth
+having a separate .h file just for this. */
+
+#endif /* PCRE_INCLUDED */
+
+#ifdef PCRE_INCLUDED
+static /* Keep the following function as private. */
+#endif
+#ifdef COMPILE_PCRE8
+void pcre_printint(pcre *external_re, FILE *f, BOOL print_lengths);
+#else
+void pcre16_printint(pcre *external_re, FILE *f, BOOL print_lengths);
+#endif
+
+/* Macro that decides whether a character should be output as a literal or in
+hexadecimal. We don't use isprint() because that can vary from system to system
+(even without the use of locales) and we want the output always to be the same,
+for testing purposes. */
+
+#ifdef EBCDIC
+#define PRINTABLE(c) ((c) >= 64 && (c) < 255)
+#else
+#define PRINTABLE(c) ((c) >= 32 && (c) < 127)
+#endif
+
+/* The table of operator names. */
+
+static const char *OP_names[] = { OP_NAME_LIST };
+
+/* This table of operator lengths is not actually used by the working code,
+but its size is needed for a check that ensures it is the correct size for the
+number of opcodes (thus catching update omissions). */
+
+static const pcre_uint8 OP_lengths[] = { OP_LENGTHS };
+
+
+
+/*************************************************
+*       Print single- or multi-byte character    *
+*************************************************/
+
+static int
+print_char(FILE *f, pcre_uchar *ptr, BOOL utf)
+{
+int c = *ptr;
+
+#ifndef SUPPORT_UTF
+
+(void)utf;  /* Avoid compiler warning */
+if (PRINTABLE(c)) fprintf(f, "%c", c);
+else if (c <= 0xff) fprintf(f, "\\x%02x", c);
+else fprintf(f, "\\x{%x}", c);
+return 0;
+
+#else
+
+#ifdef COMPILE_PCRE8
+
+if (!utf || (c & 0xc0) != 0xc0)
+  {
+  if (PRINTABLE(c)) fprintf(f, "%c", c); else fprintf(f, "\\x%02x", c);
+  return 0;
+  }
+else
+  {
+  int i;
+  int a = PRIV(utf8_table4)[c & 0x3f];  /* Number of additional bytes */
+  int s = 6*a;
+  c = (c & PRIV(utf8_table3)[a]) << s;
+  for (i = 1; i <= a; i++)
+    {
+    /* This is a check for malformed UTF-8; it should only occur if the sanity
+    check has been turned off. Rather than swallow random bytes, just stop if
+    we hit a bad one. Print it with \X instead of \x as an indication. */
+
+    if ((ptr[i] & 0xc0) != 0x80)
+      {
+      fprintf(f, "\\X{%x}", c);
+      return i - 1;
+      }
+
+    /* The byte is OK */
+
+    s -= 6;
+    c |= (ptr[i] & 0x3f) << s;
+    }
+  fprintf(f, "\\x{%x}", c);
+  return a;
+  }
+
+#else
+
+#ifdef COMPILE_PCRE16
+
+if (!utf || (c & 0xfc00) != 0xd800)
+  {
+  if (PRINTABLE(c)) fprintf(f, "%c", c);
+  else if (c <= 0xff) fprintf(f, "\\x%02x", c);
+  else fprintf(f, "\\x{%x}", c);
+  return 0;
+  }
+else
+  {
+  /* This is a check for malformed UTF-16; it should only occur if the sanity
+  check has been turned off. Rather than swallow a low surrogate, just stop if
+  we hit a bad one. Print it with \X instead of \x as an indication. */
+
+  if ((ptr[1] & 0xfc00) != 0xdc00)
+    {
+    fprintf(f, "\\X{%x}", c);
+    return 0;
+    }
+
+  c = (((c & 0x3ff) << 10) | (ptr[1] & 0x3ff)) + 0x10000;
+  fprintf(f, "\\x{%x}", c);
+  return 1;
+  }
+
+#endif /* COMPILE_PCRE16 */
+
+#endif /* COMPILE_PCRE8 */
+
+#endif /* SUPPORT_UTF */
+}
+
+/*************************************************
+*  Print uchar string (regardless of utf)        *
+*************************************************/
+
+static void
+print_puchar(FILE *f, PCRE_PUCHAR ptr)
+{
+while (*ptr != '\0')
+  {
+  register int c = *ptr++;
+  if (PRINTABLE(c)) fprintf(f, "%c", c); else fprintf(f, "\\x{%x}", c);
+  }
+}
+
+/*************************************************
+*          Find Unicode property name            *
+*************************************************/
+
+static const char *
+get_ucpname(int ptype, int pvalue)
+{
+#ifdef SUPPORT_UCP
+int i;
+for (i = PRIV(utt_size) - 1; i >= 0; i--)
+  {
+  if (ptype == PRIV(utt)[i].type && pvalue == PRIV(utt)[i].value) break;
+  }
+return (i >= 0)? PRIV(utt_names) + PRIV(utt)[i].name_offset : "??";
+#else
+/* It gets harder and harder to shut off unwanted compiler warnings. */
+ptype = ptype * pvalue;
+return (ptype == pvalue)? "??" : "??";
+#endif
+}
+
+
+
+/*************************************************
+*         Print compiled regex                   *
+*************************************************/
+
+/* Make this function work for a regex with integers in either byte order.
+However, we assume that what we are passed is a compiled regex. The
+print_lengths flag controls whether offsets and lengths of items are printed.
+They can be turned off from pcretest so that automatic tests on bytecode can be
+written that do not depend on the value of LINK_SIZE. */
+
+#ifdef PCRE_INCLUDED
+static /* Keep the following function as private. */
+#endif
+#ifdef COMPILE_PCRE8
+void
+pcre_printint(pcre *external_re, FILE *f, BOOL print_lengths)
+#else
+void
+pcre16_printint(pcre *external_re, FILE *f, BOOL print_lengths)
+#endif
+{
+real_pcre *re = (real_pcre *)external_re;
+pcre_uchar *codestart, *code;
+BOOL utf;
+
+unsigned int options = re->options;
+int offset = re->name_table_offset;
+int count = re->name_count;
+int size = re->name_entry_size;
+
+if (re->magic_number != MAGIC_NUMBER)
+  {
+  offset = ((offset << 8) & 0xff00) | ((offset >> 8) & 0xff);
+  count = ((count << 8) & 0xff00) | ((count >> 8) & 0xff);
+  size = ((size << 8) & 0xff00) | ((size >> 8) & 0xff);
+  options = ((options << 24) & 0xff000000) |
+            ((options <<  8) & 0x00ff0000) |
+            ((options >>  8) & 0x0000ff00) |
+            ((options >> 24) & 0x000000ff);
+  }
+
+code = codestart = (pcre_uchar *)re + offset + count * size;
+/* PCRE_UTF16 has the same value as PCRE_UTF8. */
+utf = (options & PCRE_UTF8) != 0;
+
+for(;;)
+  {
+  pcre_uchar *ccode;
+  const char *flag = "  ";
+  int c;
+  int extra = 0;
+
+  if (print_lengths)
+    fprintf(f, "%3d ", (int)(code - codestart));
+  else
+    fprintf(f, "    ");
+
+  switch(*code)
+    {
+/* ========================================================================== */
+      /* These cases are never obeyed. This is a fudge that causes a compile-
+      time error if the vectors OP_names or OP_lengths, which are indexed
+      by opcode, are not the correct length. It seems to be the only way to do
+      such a check at compile time, as the sizeof() operator does not work in
+      the C preprocessor. */
+
+      case OP_TABLE_LENGTH:
+      case OP_TABLE_LENGTH +
+        ((sizeof(OP_names)/sizeof(const char *) == OP_TABLE_LENGTH) &&
+        (sizeof(OP_lengths) == OP_TABLE_LENGTH)):
+      break;
+/* ========================================================================== */
+
+    case OP_END:
+    fprintf(f, "    %s\n", OP_names[*code]);
+    fprintf(f, "------------------------------------------------------------------\n");
+    return;
+
+    case OP_CHAR:
+    fprintf(f, "    ");
+    do
+      {
+      code++;
+      code += 1 + print_char(f, code, utf);
+      }
+    while (*code == OP_CHAR);
+    fprintf(f, "\n");
+    continue;
+
+    case OP_CHARI:
+    fprintf(f, " /i ");
+    do
+      {
+      code++;
+      code += 1 + print_char(f, code, utf);
+      }
+    while (*code == OP_CHARI);
+    fprintf(f, "\n");
+    continue;
+
+    case OP_CBRA:
+    case OP_CBRAPOS:
+    case OP_SCBRA:
+    case OP_SCBRAPOS:
+    if (print_lengths) fprintf(f, "%3d ", GET(code, 1));
+      else fprintf(f, "    ");
+    fprintf(f, "%s %d", OP_names[*code], GET2(code, 1+LINK_SIZE));
+    break;
+
+    case OP_BRA:
+    case OP_BRAPOS:
+    case OP_SBRA:
+    case OP_SBRAPOS:
+    case OP_KETRMAX:
+    case OP_KETRMIN:
+    case OP_KETRPOS:
+    case OP_ALT:
+    case OP_KET:
+    case OP_ASSERT:
+    case OP_ASSERT_NOT:
+    case OP_ASSERTBACK:
+    case OP_ASSERTBACK_NOT:
+    case OP_ONCE:
+    case OP_ONCE_NC:
+    case OP_COND:
+    case OP_SCOND:
+    case OP_REVERSE:
+    if (print_lengths) fprintf(f, "%3d ", GET(code, 1));
+      else fprintf(f, "    ");
+    fprintf(f, "%s", OP_names[*code]);
+    break;
+
+    case OP_CLOSE:
+    fprintf(f, "    %s %d", OP_names[*code], GET2(code, 1));
+    break;
+
+    case OP_CREF:
+    case OP_NCREF:
+    fprintf(f, "%3d %s", GET2(code,1), OP_names[*code]);
+    break;
+
+    case OP_RREF:
+    c = GET2(code, 1);
+    if (c == RREF_ANY)
+      fprintf(f, "    Cond recurse any");
+    else
+      fprintf(f, "    Cond recurse %d", c);
+    break;
+
+    case OP_NRREF:
+    c = GET2(code, 1);
+    if (c == RREF_ANY)
+      fprintf(f, "    Cond nrecurse any");
+    else
+      fprintf(f, "    Cond nrecurse %d", c);
+    break;
+
+    case OP_DEF:
+    fprintf(f, "    Cond def");
+    break;
+
+    case OP_STARI:
+    case OP_MINSTARI:
+    case OP_POSSTARI:
+    case OP_PLUSI:
+    case OP_MINPLUSI:
+    case OP_POSPLUSI:
+    case OP_QUERYI:
+    case OP_MINQUERYI:
+    case OP_POSQUERYI:
+    flag = "/i";
+    /* Fall through */
+    case OP_STAR:
+    case OP_MINSTAR:
+    case OP_POSSTAR:
+    case OP_PLUS:
+    case OP_MINPLUS:
+    case OP_POSPLUS:
+    case OP_QUERY:
+    case OP_MINQUERY:
+    case OP_POSQUERY:
+    case OP_TYPESTAR:
+    case OP_TYPEMINSTAR:
+    case OP_TYPEPOSSTAR:
+    case OP_TYPEPLUS:
+    case OP_TYPEMINPLUS:
+    case OP_TYPEPOSPLUS:
+    case OP_TYPEQUERY:
+    case OP_TYPEMINQUERY:
+    case OP_TYPEPOSQUERY:
+    fprintf(f, " %s ", flag);
+    if (*code >= OP_TYPESTAR)
+      {
+      fprintf(f, "%s", OP_names[code[1]]);
+      if (code[1] == OP_PROP || code[1] == OP_NOTPROP)
+        {
+        fprintf(f, " %s ", get_ucpname(code[2], code[3]));
+        extra = 2;
+        }
+      }
+    else extra = print_char(f, code+1, utf);
+    fprintf(f, "%s", OP_names[*code]);
+    break;
+
+    case OP_EXACTI:
+    case OP_UPTOI:
+    case OP_MINUPTOI:
+    case OP_POSUPTOI:
+    flag = "/i";
+    /* Fall through */
+    case OP_EXACT:
+    case OP_UPTO:
+    case OP_MINUPTO:
+    case OP_POSUPTO:
+    fprintf(f, " %s ", flag);
+    extra = print_char(f, code + 1 + IMM2_SIZE, utf);
+    fprintf(f, "{");
+    if (*code != OP_EXACT && *code != OP_EXACTI) fprintf(f, "0,");
+    fprintf(f, "%d}", GET2(code,1));
+    if (*code == OP_MINUPTO || *code == OP_MINUPTOI) fprintf(f, "?");
+      else if (*code == OP_POSUPTO || *code == OP_POSUPTOI) fprintf(f, "+");
+    break;
+
+    case OP_TYPEEXACT:
+    case OP_TYPEUPTO:
+    case OP_TYPEMINUPTO:
+    case OP_TYPEPOSUPTO:
+    fprintf(f, "    %s", OP_names[code[1 + IMM2_SIZE]]);
+    if (code[1 + IMM2_SIZE] == OP_PROP || code[1 + IMM2_SIZE] == OP_NOTPROP)
+      {
+      fprintf(f, " %s ", get_ucpname(code[1 + IMM2_SIZE + 1],
+        code[1 + IMM2_SIZE + 2]));
+      extra = 2;
+      }
+    fprintf(f, "{");
+    if (*code != OP_TYPEEXACT) fprintf(f, "0,");
+    fprintf(f, "%d}", GET2(code,1));
+    if (*code == OP_TYPEMINUPTO) fprintf(f, "?");
+      else if (*code == OP_TYPEPOSUPTO) fprintf(f, "+");
+    break;
+
+    case OP_NOTI:
+    flag = "/i";
+    /* Fall through */
+    case OP_NOT:
+    c = code[1];
+    if (PRINTABLE(c)) fprintf(f, " %s [^%c]", flag, c);
+    else if (utf || c > 0xff)
+      fprintf(f, " %s [^\\x{%02x}]", flag, c);
+    else
+      fprintf(f, " %s [^\\x%02x]", flag, c);
+    break;
+
+    case OP_NOTSTARI:
+    case OP_NOTMINSTARI:
+    case OP_NOTPOSSTARI:
+    case OP_NOTPLUSI:
+    case OP_NOTMINPLUSI:
+    case OP_NOTPOSPLUSI:
+    case OP_NOTQUERYI:
+    case OP_NOTMINQUERYI:
+    case OP_NOTPOSQUERYI:
+    flag = "/i";
+    /* Fall through */
+
+    case OP_NOTSTAR:
+    case OP_NOTMINSTAR:
+    case OP_NOTPOSSTAR:
+    case OP_NOTPLUS:
+    case OP_NOTMINPLUS:
+    case OP_NOTPOSPLUS:
+    case OP_NOTQUERY:
+    case OP_NOTMINQUERY:
+    case OP_NOTPOSQUERY:
+    c = code[1];
+    if (PRINTABLE(c)) fprintf(f, " %s [^%c]", flag, c);
+      else fprintf(f, " %s [^\\x%02x]", flag, c);
+    fprintf(f, "%s", OP_names[*code]);
+    break;
+
+    case OP_NOTEXACTI:
+    case OP_NOTUPTOI:
+    case OP_NOTMINUPTOI:
+    case OP_NOTPOSUPTOI:
+    flag = "/i";
+    /* Fall through */
+
+    case OP_NOTEXACT:
+    case OP_NOTUPTO:
+    case OP_NOTMINUPTO:
+    case OP_NOTPOSUPTO:
+    c = code[1 + IMM2_SIZE];
+    if (PRINTABLE(c)) fprintf(f, " %s [^%c]{", flag, c);
+      else fprintf(f, " %s [^\\x%02x]{", flag, c);
+    if (*code != OP_NOTEXACT && *code != OP_NOTEXACTI) fprintf(f, "0,");
+    fprintf(f, "%d}", GET2(code,1));
+    if (*code == OP_NOTMINUPTO || *code == OP_NOTMINUPTOI) fprintf(f, "?");
+      else
+    if (*code == OP_NOTPOSUPTO || *code == OP_NOTPOSUPTOI) fprintf(f, "+");
+    break;
+
+    case OP_RECURSE:
+    if (print_lengths) fprintf(f, "%3d ", GET(code, 1));
+      else fprintf(f, "    ");
+    fprintf(f, "%s", OP_names[*code]);
+    break;
+
+    case OP_REFI:
+    flag = "/i";
+    /* Fall through */
+    case OP_REF:
+    fprintf(f, " %s \\%d", flag, GET2(code,1));
+    ccode = code + PRIV(OP_lengths)[*code];
+    goto CLASS_REF_REPEAT;
+
+    case OP_CALLOUT:
+    fprintf(f, "    %s %d %d %d", OP_names[*code], code[1], GET(code,2),
+      GET(code, 2 + LINK_SIZE));
+    break;
+
+    case OP_PROP:
+    case OP_NOTPROP:
+    fprintf(f, "    %s %s", OP_names[*code], get_ucpname(code[1], code[2]));
+    break;
+
+    /* OP_XCLASS can only occur in UTF or PCRE16 modes. However, there's no
+    harm in having this code always here, and it makes it less messy without
+    all those #ifdefs. */
+
+    case OP_CLASS:
+    case OP_NCLASS:
+    case OP_XCLASS:
+      {
+      int i, min, max;
+      BOOL printmap;
+      pcre_uint8 *map;
+
+      fprintf(f, "    [");
+
+      if (*code == OP_XCLASS)
+        {
+        extra = GET(code, 1);
+        ccode = code + LINK_SIZE + 1;
+        printmap = (*ccode & XCL_MAP) != 0;
+        if ((*ccode++ & XCL_NOT) != 0) fprintf(f, "^");
+        }
+      else
+        {
+        printmap = TRUE;
+        ccode = code + 1;
+        }
+
+      /* Print a bit map */
+
+      if (printmap)
+        {
+        map = (pcre_uint8 *)ccode;
+        for (i = 0; i < 256; i++)
+          {
+          if ((map[i/8] & (1 << (i&7))) != 0)
+            {
+            int j;
+            for (j = i+1; j < 256; j++)
+              if ((map[j/8] & (1 << (j&7))) == 0) break;
+            if (i == '-' || i == ']') fprintf(f, "\\");
+            if (PRINTABLE(i)) fprintf(f, "%c", i);
+              else fprintf(f, "\\x%02x", i);
+            if (--j > i)
+              {
+              if (j != i + 1) fprintf(f, "-");
+              if (j == '-' || j == ']') fprintf(f, "\\");
+              if (PRINTABLE(j)) fprintf(f, "%c", j);
+                else fprintf(f, "\\x%02x", j);
+              }
+            i = j;
+            }
+          }
+        ccode += 32 / sizeof(pcre_uchar);
+        }
+
+      /* For an XCLASS there is always some additional data */
+
+      if (*code == OP_XCLASS)
+        {
+        int ch;
+        while ((ch = *ccode++) != XCL_END)
+          {
+          if (ch == XCL_PROP)
+            {
+            int ptype = *ccode++;
+            int pvalue = *ccode++;
+            fprintf(f, "\\p{%s}", get_ucpname(ptype, pvalue));
+            }
+          else if (ch == XCL_NOTPROP)
+            {
+            int ptype = *ccode++;
+            int pvalue = *ccode++;
+            fprintf(f, "\\P{%s}", get_ucpname(ptype, pvalue));
+            }
+          else
+            {
+            ccode += 1 + print_char(f, ccode, utf);
+            if (ch == XCL_RANGE)
+              {
+              fprintf(f, "-");
+              ccode += 1 + print_char(f, ccode, utf);
+              }
+            }
+          }
+        }
+
+      /* Indicate a non-UTF class which was created by negation */
+
+      fprintf(f, "]%s", (*code == OP_NCLASS)? " (neg)" : "");
+
+      /* Handle repeats after a class or a back reference */
+
+      CLASS_REF_REPEAT:
+      switch(*ccode)
+        {
+        case OP_CRSTAR:
+        case OP_CRMINSTAR:
+        case OP_CRPLUS:
+        case OP_CRMINPLUS:
+        case OP_CRQUERY:
+        case OP_CRMINQUERY:
+        fprintf(f, "%s", OP_names[*ccode]);
+        extra += PRIV(OP_lengths)[*ccode];
+        break;
+
+        case OP_CRRANGE:
+        case OP_CRMINRANGE:
+        min = GET2(ccode,1);
+        max = GET2(ccode,1 + IMM2_SIZE);
+        if (max == 0) fprintf(f, "{%d,}", min);
+        else fprintf(f, "{%d,%d}", min, max);
+        if (*ccode == OP_CRMINRANGE) fprintf(f, "?");
+        extra += PRIV(OP_lengths)[*ccode];
+        break;
+
+        /* Do nothing if it's not a repeat; this code stops picky compilers
+        warning about the lack of a default code path. */
+
+        default:
+        break;
+        }
+      }
+    break;
+
+    case OP_MARK:
+    case OP_PRUNE_ARG:
+    case OP_SKIP_ARG:
+    case OP_THEN_ARG:
+    fprintf(f, "    %s ", OP_names[*code]);
+    print_puchar(f, code + 2);
+    extra += code[1];
+    break;
+
+    case OP_THEN:
+    fprintf(f, "    %s", OP_names[*code]);
+    break;
+
+    case OP_CIRCM:
+    case OP_DOLLM:
+    flag = "/m";
+    /* Fall through */
+
+    /* Anything else is just an item with no data, but possibly a flag. */
+
+    default:
+    fprintf(f, " %s %s", flag, OP_names[*code]);
+    break;
+    }
+
+  code += PRIV(OP_lengths)[*code] + extra;
+  fprintf(f, "\n");
+  }
+}
+
+/* End of pcre_printint.src */
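
Note on the OP_CRRANGE / OP_CRMINRANGE case above: a maximum of 0 is printed as an open-ended range, and the minimizing form gets a trailing "?". The following is a minimal stand-alone sketch of that formatting rule only. The name format_range and its buffer-based interface are assumptions made for illustration; they are not part of PCRE.

```c
#include <stdio.h>

/* Hypothetical helper: writes "{min,}" when max is 0, otherwise "{min,max}",
   and appends '?' for the minimizing (lazy) variant when there is room.
   Returns the number of characters written, excluding the terminator. */
static int format_range(char *buf, size_t size, int min, int max, int lazy)
{
  int n = (max == 0) ? snprintf(buf, size, "{%d,}", min)
                     : snprintf(buf, size, "{%d,%d}", min, max);
  if (lazy && n > 0 && (size_t)n + 1 < size)
    {
    buf[n++] = '?';
    buf[n] = '\0';
    }
  return n;
}
```

Called with (2, 0, 0) this yields "{2,}", and with (1, 3, 1) it yields "{1,3}?".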

Deleted: code/trunk/pcre_printint.src
===================================================================
--- code/trunk/pcre_printint.src    2011-12-28 16:10:09 UTC (rev 835)
+++ code/trunk/pcre_printint.src    2011-12-28 17:16:11 UTC (rev 836)
@@ -1,616 +0,0 @@
-/*************************************************
-*      Perl-Compatible Regular Expressions       *
-*************************************************/
-
-/* PCRE is a library of functions to support regular expressions whose syntax
-and semantics are as close as possible to those of the Perl 5 language.
-
-                       Written by Philip Hazel
-           Copyright (c) 1997-2010 University of Cambridge
-
------------------------------------------------------------------------------
-Redistribution and use in source and binary forms, with or without
-modification, are permitted provided that the following conditions are met:
-
-    * Redistributions of source code must retain the above copyright notice,
-      this list of conditions and the following disclaimer.
-
-    * Redistributions in binary form must reproduce the above copyright
-      notice, this list of conditions and the following disclaimer in the
-      documentation and/or other materials provided with the distribution.
-
-    * Neither the name of the University of Cambridge nor the names of its
-      contributors may be used to endorse or promote products derived from
-      this software without specific prior written permission.
-
-THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
-AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
-IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
-ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE
-LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
-CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
-SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
-INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
-CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
-ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
-POSSIBILITY OF SUCH DAMAGE.
------------------------------------------------------------------------------
-*/
-
-
-/* This module contains a PCRE private debugging function for printing out the
-internal form of a compiled regular expression, along with some supporting
-local functions. This source file is used in two places:
-
-(1) It is #included by pcre_compile.c when it is compiled in debugging mode
-(PCRE_DEBUG defined in pcre_internal.h). It is not included in production
-compiles.
-
-(2) It is always #included by pcretest.c, which can be asked to print out a
-compiled regex for debugging purposes. */
-
-
-/* Macro that decides whether a character should be output as a literal or in
-hexadecimal. We don't use isprint() because that can vary from system to system
-(even without the use of locales) and we want the output always to be the same,
-for testing purposes. This macro is used in pcretest as well as in this file. */
-
-#ifdef EBCDIC
-#define PRINTABLE(c) ((c) >= 64 && (c) < 255)
-#else
-#define PRINTABLE(c) ((c) >= 32 && (c) < 127)
-#endif
-
-/* The table of operator names. */
-
-static const char *OP_names[] = { OP_NAME_LIST };
-
-
-
-/*************************************************
-*       Print single- or multi-byte character    *
-*************************************************/
-
-static int
-print_char(FILE *f, uschar *ptr, BOOL utf8)
-{
-int c = *ptr;
-
-#ifndef SUPPORT_UTF8
-utf8 = utf8;  /* Avoid compiler warning */
-if (PRINTABLE(c)) fprintf(f, "%c", c); else fprintf(f, "\\x%02x", c);
-return 0;
-
-#else
-if (!utf8 || (c & 0xc0) != 0xc0)
-  {
-  if (PRINTABLE(c)) fprintf(f, "%c", c); else fprintf(f, "\\x%02x", c);
-  return 0;
-  }
-else
-  {
-  int i;
-  int a = _pcre_utf8_table4[c & 0x3f];  /* Number of additional bytes */
-  int s = 6*a;
-  c = (c & _pcre_utf8_table3[a]) << s;
-  for (i = 1; i <= a; i++)
-    {
-    /* This is a check for malformed UTF-8; it should only occur if the sanity
-    check has been turned off. Rather than swallow random bytes, just stop if
-    we hit a bad one. Print it with \X instead of \x as an indication. */
-
-    if ((ptr[i] & 0xc0) != 0x80)
-      {
-      fprintf(f, "\\X{%x}", c);
-      return i - 1;
-      }
-
-    /* The byte is OK */
-
-    s -= 6;
-    c |= (ptr[i] & 0x3f) << s;
-    }
-  if (c < 128) fprintf(f, "\\x%02x", c); else fprintf(f, "\\x{%x}", c);
-  return a;
-  }
-#endif
-}
-
-
-
-/*************************************************
-*          Find Unicode property name            *
-*************************************************/
-
-static const char *
-get_ucpname(int ptype, int pvalue)
-{
-#ifdef SUPPORT_UCP
-int i;
-for (i = _pcre_utt_size - 1; i >= 0; i--)
-  {
-  if (ptype == _pcre_utt[i].type && pvalue == _pcre_utt[i].value) break;
-  }
-return (i >= 0)? _pcre_utt_names + _pcre_utt[i].name_offset : "??";
-#else
-/* It gets harder and harder to shut off unwanted compiler warnings. */
-ptype = ptype * pvalue;
-return (ptype == pvalue)? "??" : "??";
-#endif
-}
-
-
-
-/*************************************************
-*         Print compiled regex                   *
-*************************************************/
-
-/* Make this function work for a regex with integers either byte order.
-However, we assume that what we are passed is a compiled regex. The
-print_lengths flag controls whether offsets and lengths of items are printed.
-They can be turned off from pcretest so that automatic tests on bytecode can be
-written that do not depend on the value of LINK_SIZE. */
-
-static void
-pcre_printint(pcre *external_re, FILE *f, BOOL print_lengths)
-{
-real_pcre *re = (real_pcre *)external_re;
-uschar *codestart, *code;
-BOOL utf8;
-
-unsigned int options = re->options;
-int offset = re->name_table_offset;
-int count = re->name_count;
-int size = re->name_entry_size;
-
-if (re->magic_number != MAGIC_NUMBER)
-  {
-  offset = ((offset << 8) & 0xff00) | ((offset >> 8) & 0xff);
-  count = ((count << 8) & 0xff00) | ((count >> 8) & 0xff);
-  size = ((size << 8) & 0xff00) | ((size >> 8) & 0xff);
-  options = ((options << 24) & 0xff000000) |
-            ((options <<  8) & 0x00ff0000) |
-            ((options >>  8) & 0x0000ff00) |
-            ((options >> 24) & 0x000000ff);
-  }
-
-code = codestart = (uschar *)re + offset + count * size;
-utf8 = (options & PCRE_UTF8) != 0;
-
-for(;;)
-  {
-  uschar *ccode;
-  const char *flag = "  ";
-  int c;
-  int extra = 0;
-
-  if (print_lengths)
-    fprintf(f, "%3d ", (int)(code - codestart));
-  else
-    fprintf(f, "    ");
-
-  switch(*code)
-    {
-/* ========================================================================== */
-      /* These cases are never obeyed. This is a fudge that causes a compile-
-      time error if the vectors OP_names or _pcre_OP_lengths, which are indexed
-      by opcode, are not the correct length. It seems to be the only way to do
-      such a check at compile time, as the sizeof() operator does not work in
-      the C preprocessor. We do this while compiling pcretest, because that
-      #includes pcre_tables.c, which holds _pcre_OP_lengths. We can't do this
-      when building pcre_compile.c with PCRE_DEBUG set, because it doesn't then
-      know the size of _pcre_OP_lengths. */
-
-#ifdef COMPILING_PCRETEST
-      case OP_TABLE_LENGTH:
-      case OP_TABLE_LENGTH +
-        ((sizeof(OP_names)/sizeof(const char *) == OP_TABLE_LENGTH) &&
-        (sizeof(_pcre_OP_lengths) == OP_TABLE_LENGTH)):
-      break;
-#endif
-/* ========================================================================== */
-
-    case OP_END:
-    fprintf(f, "    %s\n", OP_names[*code]);
-    fprintf(f, "------------------------------------------------------------------\n");
-    return;
-
-    case OP_CHAR:
-    fprintf(f, "    ");
-    do
-      {
-      code++;
-      code += 1 + print_char(f, code, utf8);
-      }
-    while (*code == OP_CHAR);
-    fprintf(f, "\n");
-    continue;
-
-    case OP_CHARI:
-    fprintf(f, " /i ");
-    do
-      {
-      code++;
-      code += 1 + print_char(f, code, utf8);
-      }
-    while (*code == OP_CHARI);
-    fprintf(f, "\n");
-    continue;
-
-    case OP_CBRA:
-    case OP_CBRAPOS:
-    case OP_SCBRA:
-    case OP_SCBRAPOS:
-    if (print_lengths) fprintf(f, "%3d ", GET(code, 1));
-      else fprintf(f, "    ");
-    fprintf(f, "%s %d", OP_names[*code], GET2(code, 1+LINK_SIZE));
-    break;
-
-    case OP_BRA:
-    case OP_BRAPOS:
-    case OP_SBRA:
-    case OP_SBRAPOS:
-    case OP_KETRMAX:
-    case OP_KETRMIN:
-    case OP_KETRPOS:
-    case OP_ALT:
-    case OP_KET:
-    case OP_ASSERT:
-    case OP_ASSERT_NOT:
-    case OP_ASSERTBACK:
-    case OP_ASSERTBACK_NOT:
-    case OP_ONCE:
-    case OP_ONCE_NC:
-    case OP_COND:
-    case OP_SCOND:
-    case OP_REVERSE:
-    if (print_lengths) fprintf(f, "%3d ", GET(code, 1));
-      else fprintf(f, "    ");
-    fprintf(f, "%s", OP_names[*code]);
-    break;
-
-    case OP_CLOSE:
-    fprintf(f, "    %s %d", OP_names[*code], GET2(code, 1));
-    break;
-
-    case OP_CREF:
-    case OP_NCREF:
-    fprintf(f, "%3d %s", GET2(code,1), OP_names[*code]);
-    break;
-
-    case OP_RREF:
-    c = GET2(code, 1);
-    if (c == RREF_ANY)
-      fprintf(f, "    Cond recurse any");
-    else
-      fprintf(f, "    Cond recurse %d", c);
-    break;
-
-    case OP_NRREF:
-    c = GET2(code, 1);
-    if (c == RREF_ANY)
-      fprintf(f, "    Cond nrecurse any");
-    else
-      fprintf(f, "    Cond nrecurse %d", c);
-    break;
-
-    case OP_DEF:
-    fprintf(f, "    Cond def");
-    break;
-
-    case OP_STARI:
-    case OP_MINSTARI:
-    case OP_POSSTARI:
-    case OP_PLUSI:
-    case OP_MINPLUSI:
-    case OP_POSPLUSI:
-    case OP_QUERYI:
-    case OP_MINQUERYI:
-    case OP_POSQUERYI:
-    flag = "/i";
-    /* Fall through */
-    case OP_STAR:
-    case OP_MINSTAR:
-    case OP_POSSTAR:
-    case OP_PLUS:
-    case OP_MINPLUS:
-    case OP_POSPLUS:
-    case OP_QUERY:
-    case OP_MINQUERY:
-    case OP_POSQUERY:
-    case OP_TYPESTAR:
-    case OP_TYPEMINSTAR:
-    case OP_TYPEPOSSTAR:
-    case OP_TYPEPLUS:
-    case OP_TYPEMINPLUS:
-    case OP_TYPEPOSPLUS:
-    case OP_TYPEQUERY:
-    case OP_TYPEMINQUERY:
-    case OP_TYPEPOSQUERY:
-    fprintf(f, " %s ", flag);
-    if (*code >= OP_TYPESTAR)
-      {
-      fprintf(f, "%s", OP_names[code[1]]);
-      if (code[1] == OP_PROP || code[1] == OP_NOTPROP)
-        {
-        fprintf(f, " %s ", get_ucpname(code[2], code[3]));
-        extra = 2;
-        }
-      }
-    else extra = print_char(f, code+1, utf8);
-    fprintf(f, "%s", OP_names[*code]);
-    break;
-
-    case OP_EXACTI:
-    case OP_UPTOI:
-    case OP_MINUPTOI:
-    case OP_POSUPTOI:
-    flag = "/i";
-    /* Fall through */
-    case OP_EXACT:
-    case OP_UPTO:
-    case OP_MINUPTO:
-    case OP_POSUPTO:
-    fprintf(f, " %s ", flag);
-    extra = print_char(f, code+3, utf8);
-    fprintf(f, "{");
-    if (*code != OP_EXACT && *code != OP_EXACTI) fprintf(f, "0,");
-    fprintf(f, "%d}", GET2(code,1));
-    if (*code == OP_MINUPTO || *code == OP_MINUPTOI) fprintf(f, "?");
-      else if (*code == OP_POSUPTO || *code == OP_POSUPTOI) fprintf(f, "+");
-    break;
-
-    case OP_TYPEEXACT:
-    case OP_TYPEUPTO:
-    case OP_TYPEMINUPTO:
-    case OP_TYPEPOSUPTO:
-    fprintf(f, "    %s", OP_names[code[3]]);
-    if (code[3] == OP_PROP || code[3] == OP_NOTPROP)
-      {
-      fprintf(f, " %s ", get_ucpname(code[4], code[5]));
-      extra = 2;
-      }
-    fprintf(f, "{");
-    if (*code != OP_TYPEEXACT) fprintf(f, "0,");
-    fprintf(f, "%d}", GET2(code,1));
-    if (*code == OP_TYPEMINUPTO) fprintf(f, "?");
-      else if (*code == OP_TYPEPOSUPTO) fprintf(f, "+");
-    break;
-
-    case OP_NOTI:
-    flag = "/i";
-    /* Fall through */
-    case OP_NOT:
-    c = code[1];
-    if (PRINTABLE(c)) fprintf(f, " %s [^%c]", flag, c);
-      else fprintf(f, " %s [^\\x%02x]", flag, c);
-    break;
-
-    case OP_NOTSTARI:
-    case OP_NOTMINSTARI:
-    case OP_NOTPOSSTARI:
-    case OP_NOTPLUSI:
-    case OP_NOTMINPLUSI:
-    case OP_NOTPOSPLUSI:
-    case OP_NOTQUERYI:
-    case OP_NOTMINQUERYI:
-    case OP_NOTPOSQUERYI:
-    flag = "/i";
-    /* Fall through */
-
-    case OP_NOTSTAR:
-    case OP_NOTMINSTAR:
-    case OP_NOTPOSSTAR:
-    case OP_NOTPLUS:
-    case OP_NOTMINPLUS:
-    case OP_NOTPOSPLUS:
-    case OP_NOTQUERY:
-    case OP_NOTMINQUERY:
-    case OP_NOTPOSQUERY:
-    c = code[1];
-    if (PRINTABLE(c)) fprintf(f, " %s [^%c]", flag, c);
-      else fprintf(f, " %s [^\\x%02x]", flag, c);
-    fprintf(f, "%s", OP_names[*code]);
-    break;
-
-    case OP_NOTEXACTI:
-    case OP_NOTUPTOI:
-    case OP_NOTMINUPTOI:
-    case OP_NOTPOSUPTOI:
-    flag = "/i";
-    /* Fall through */
-
-    case OP_NOTEXACT:
-    case OP_NOTUPTO:
-    case OP_NOTMINUPTO:
-    case OP_NOTPOSUPTO:
-    c = code[3];
-    if (PRINTABLE(c)) fprintf(f, " %s [^%c]{", flag, c);
-      else fprintf(f, " %s [^\\x%02x]{", flag, c);
-    if (*code != OP_NOTEXACT && *code != OP_NOTEXACTI) fprintf(f, "0,");
-    fprintf(f, "%d}", GET2(code,1));
-    if (*code == OP_NOTMINUPTO || *code == OP_NOTMINUPTOI) fprintf(f, "?");
-      else
-    if (*code == OP_NOTPOSUPTO || *code == OP_NOTPOSUPTOI) fprintf(f, "+");
-    break;
-
-    case OP_RECURSE:
-    if (print_lengths) fprintf(f, "%3d ", GET(code, 1));
-      else fprintf(f, "    ");
-    fprintf(f, "%s", OP_names[*code]);
-    break;
-
-    case OP_REFI:
-    flag = "/i";
-    /* Fall through */
-    case OP_REF:
-    fprintf(f, " %s \\%d", flag, GET2(code,1));
-    ccode = code + _pcre_OP_lengths[*code];
-    goto CLASS_REF_REPEAT;
-
-    case OP_CALLOUT:
-    fprintf(f, "    %s %d %d %d", OP_names[*code], code[1], GET(code,2),
-      GET(code, 2 + LINK_SIZE));
-    break;
-
-    case OP_PROP:
-    case OP_NOTPROP:
-    fprintf(f, "    %s %s", OP_names[*code], get_ucpname(code[1], code[2]));
-    break;
-
-    /* OP_XCLASS can only occur in UTF-8 mode. However, there's no harm in
-    having this code always here, and it makes it less messy without all those
-    #ifdefs. */
-
-    case OP_CLASS:
-    case OP_NCLASS:
-    case OP_XCLASS:
-      {
-      int i, min, max;
-      BOOL printmap;
-
-      fprintf(f, "    [");
-
-      if (*code == OP_XCLASS)
-        {
-        extra = GET(code, 1);
-        ccode = code + LINK_SIZE + 1;
-        printmap = (*ccode & XCL_MAP) != 0;
-        if ((*ccode++ & XCL_NOT) != 0) fprintf(f, "^");
-        }
-      else
-        {
-        printmap = TRUE;
-        ccode = code + 1;
-        }
-
-      /* Print a bit map */
-
-      if (printmap)
-        {
-        for (i = 0; i < 256; i++)
-          {
-          if ((ccode[i/8] & (1 << (i&7))) != 0)
-            {
-            int j;
-            for (j = i+1; j < 256; j++)
-              if ((ccode[j/8] & (1 << (j&7))) == 0) break;
-            if (i == '-' || i == ']') fprintf(f, "\\");
-            if (PRINTABLE(i)) fprintf(f, "%c", i);
-              else fprintf(f, "\\x%02x", i);
-            if (--j > i)
-              {
-              if (j != i + 1) fprintf(f, "-");
-              if (j == '-' || j == ']') fprintf(f, "\\");
-              if (PRINTABLE(j)) fprintf(f, "%c", j);
-                else fprintf(f, "\\x%02x", j);
-              }
-            i = j;
-            }
-          }
-        ccode += 32;
-        }
-
-      /* For an XCLASS there is always some additional data */
-
-      if (*code == OP_XCLASS)
-        {
-        int ch;
-        while ((ch = *ccode++) != XCL_END)
-          {
-          if (ch == XCL_PROP)
-            {
-            int ptype = *ccode++;
-            int pvalue = *ccode++;
-            fprintf(f, "\\p{%s}", get_ucpname(ptype, pvalue));
-            }
-          else if (ch == XCL_NOTPROP)
-            {
-            int ptype = *ccode++;
-            int pvalue = *ccode++;
-            fprintf(f, "\\P{%s}", get_ucpname(ptype, pvalue));
-            }
-          else
-            {
-            ccode += 1 + print_char(f, ccode, TRUE);
-            if (ch == XCL_RANGE)
-              {
-              fprintf(f, "-");
-              ccode += 1 + print_char(f, ccode, TRUE);
-              }
-            }
-          }
-        }
-
-      /* Indicate a non-UTF8 class which was created by negation */
-
-      fprintf(f, "]%s", (*code == OP_NCLASS)? " (neg)" : "");
-
-      /* Handle repeats after a class or a back reference */
-
-      CLASS_REF_REPEAT:
-      switch(*ccode)
-        {
-        case OP_CRSTAR:
-        case OP_CRMINSTAR:
-        case OP_CRPLUS:
-        case OP_CRMINPLUS:
-        case OP_CRQUERY:
-        case OP_CRMINQUERY:
-        fprintf(f, "%s", OP_names[*ccode]);
-        extra += _pcre_OP_lengths[*ccode];
-        break;
-
-        case OP_CRRANGE:
-        case OP_CRMINRANGE:
-        min = GET2(ccode,1);
-        max = GET2(ccode,3);
-        if (max == 0) fprintf(f, "{%d,}", min);
-        else fprintf(f, "{%d,%d}", min, max);
-        if (*ccode == OP_CRMINRANGE) fprintf(f, "?");
-        extra += _pcre_OP_lengths[*ccode];
-        break;
-
-        /* Do nothing if it's not a repeat; this code stops picky compilers
-        warning about the lack of a default code path. */
-
-        default:
-        break;
-        }
-      }
-    break;
-
-    case OP_MARK:
-    case OP_PRUNE_ARG:
-    case OP_SKIP_ARG:
-    fprintf(f, "    %s %s", OP_names[*code], code + 2);
-    extra += code[1];
-    break;
-
-    case OP_THEN:
-    fprintf(f, "    %s", OP_names[*code]);
-    break;
-
-    case OP_THEN_ARG:
-    fprintf(f, "    %s %s", OP_names[*code], code + 2);
-    extra += code[1];
-    break;
-
-    case OP_CIRCM:
-    case OP_DOLLM:
-    flag = "/m";
-    /* Fall through */
-
-    /* Anything else is just an item with no data, but possibly a flag. */
-
-    default:
-    fprintf(f, " %s %s", flag, OP_names[*code]);
-    break;
-    }
-
-  code += _pcre_OP_lengths[*code] + extra;
-  fprintf(f, "\n");
-  }
-}
-
-/* End of pcre_printint.src */
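
Note on the removed print_char(): it decodes one UTF-8 character and reports how many additional bytes were consumed, stopping early at a malformed continuation byte. The sketch below illustrates the same idea without the internal _pcre_utf8_table3 and _pcre_utf8_table4 lookup tables. The name decode_utf8 is an assumption, and the sketch handles sequences of at most four bytes, whereas the original tables also cover the older five- and six-byte forms.

```c
/* Hypothetical helper: decodes the character at p into *out and returns the
   number of additional bytes used (0 for a single-byte character). */
static int decode_utf8(const unsigned char *p, unsigned int *out)
{
  unsigned int c = p[0];
  int extra, i;

  /* Not a lead byte: treat it as a single-byte character. */
  if ((c & 0xc0) != 0xc0) { *out = c; return 0; }

  /* Derive the number of extra bytes and the payload bits from the lead byte. */
  if ((c & 0xe0) == 0xc0)      { extra = 1; c &= 0x1f; }
  else if ((c & 0xf0) == 0xe0) { extra = 2; c &= 0x0f; }
  else if ((c & 0xf8) == 0xf0) { extra = 3; c &= 0x07; }
  else { *out = c; return 0; }  /* longer forms are not handled in this sketch */

  for (i = 1; i <= extra; i++)
    {
    /* Malformed continuation byte: stop rather than swallow random bytes. */
    if ((p[i] & 0xc0) != 0x80) { *out = c; return i - 1; }
    c = (c << 6) | (p[i] & 0x3f);
    }

  *out = c;
  return extra;
}
```

The return value follows the convention of the original, so a caller can advance by 1 plus the result.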

Modified: code/trunk/pcre_refcount.c
===================================================================
--- code/trunk/pcre_refcount.c    2011-12-28 16:10:09 UTC (rev 835)
+++ code/trunk/pcre_refcount.c    2011-12-28 17:16:11 UTC (rev 836)
@@ -6,7 +6,7 @@
 and semantics are as close as possible to those of the Perl 5 language.

                        Written by Philip Hazel
-           Copyright (c) 1997-2008 University of Cambridge
+           Copyright (c) 1997-2012 University of Cambridge

 -----------------------------------------------------------------------------
 Redistribution and use in source and binary forms, with or without
@@ -68,11 +68,18 @@
                 a negative error number
 */

+#ifdef COMPILE_PCRE8
 PCRE_EXP_DEFN int PCRE_CALL_CONVENTION
 pcre_refcount(pcre *argument_re, int adjust)
+#else
+PCRE_EXP_DEFN int PCRE_CALL_CONVENTION
+pcre16_refcount(pcre16 *argument_re, int adjust)
+#endif
 {
 real_pcre *re = (real_pcre *)argument_re;
 if (re == NULL) return PCRE_ERROR_NULL;
+if (re->magic_number != MAGIC_NUMBER) return PCRE_ERROR_BADMAGIC;
+if ((re->flags & PCRE_MODE) == 0) return PCRE_ERROR_BADMODE;
 re->ref_count = (-adjust > re->ref_count)? 0 :
                 (adjust + re->ref_count > 65535)? 65535 :
                 re->ref_count + adjust;
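
Note on the reference count update above: the nested conditional saturates the count at both ends, so it never goes below 0 or above 65535. A minimal sketch of just that arithmetic follows. The function name adjust_refcount and the plain int parameter standing in for re->ref_count are assumptions made for illustration.

```c
/* Hypothetical stand-alone version of the clamping expression in the diff:
   the result is kept within 0..65535 whatever the adjustment is. */
static int adjust_refcount(int ref_count, int adjust)
{
  return (-adjust > ref_count) ? 0 :
         (adjust + ref_count > 65535) ? 65535 :
         ref_count + adjust;
}
```

An adjustment of 0 leaves the count unchanged, which gives callers a way to read the current value.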

Copied: code/trunk/pcre_string_utils.c (from rev 835, code/branches/pcre16/pcre_string_utils.c)
===================================================================
--- code/trunk/pcre_string_utils.c                            (rev 0)
+++ code/trunk/pcre_string_utils.c    2011-12-28 17:16:11 UTC (rev 836)
@@ -0,0 +1,168 @@
+/*************************************************
+*      Perl-Compatible Regular Expressions       *
+*************************************************/
+
+/* PCRE is a library of functions to support regular expressions whose syntax
+and semantics are as close as possible to those of the Perl 5 language.
+
+                       Written by Philip Hazel
+           Copyright (c) 1997-2012 University of Cambridge
+
+-----------------------------------------------------------------------------
+Redistribution and use in source and binary forms, with or without
+modification, are permitted provided that the following conditions are met:
+
+    * Redistributions of source code must retain the above copyright notice,
+      this list of conditions and the following disclaimer.
+
+    * Redistributions in binary form must reproduce the above copyright
+      notice, this list of conditions and the following disclaimer in the
+      documentation and/or other materials provided with the distribution.
+
+    * Neither the name of the University of Cambridge nor the names of its
+      contributors may be used to endorse or promote products derived from
+      this software without specific prior written permission.
+
+THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
+AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE
+LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
+CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
+SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
+INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
+CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
+ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
+POSSIBILITY OF SUCH DAMAGE.
+-----------------------------------------------------------------------------
+*/
+
+
+/* This module contains internal functions for comparing and finding the length
+of strings for different data item sizes. */
+
+
+#ifdef HAVE_CONFIG_H
+#include "config.h"
+#endif
+
+#include "pcre_internal.h"
+
+#ifndef COMPILE_PCRE8
+
+/*************************************************
+*           Compare string utilities             *
+*************************************************/
+
+/* The following two functions compare two zero-terminated strings. They are
+basically strcmp() for non-8-bit characters.
+
+Arguments:
+  str1        first string
+  str2        second string
+
+Returns:      0 if the strings are equal, -1 if str1 < str2, 1 if str1 > str2
+*/
+
+int
+PRIV(strcmp_uc_uc)(const pcre_uchar *str1, const pcre_uchar *str2)
+{
+pcre_uchar c1;
+pcre_uchar c2;
+
+while (*str1 != '\0' || *str2 != '\0')
+  {
+  c1 = *str1++;
+  c2 = *str2++;
+  if (c1 != c2)
+    return ((c1 > c2) << 1) - 1;
+  }
+/* Both length and characters must be equal. */
+return 0;
+}
+
+int
+PRIV(strcmp_uc_c8)(const pcre_uchar *str1, const char *str2)
+{
+const pcre_uint8 *ustr2 = (pcre_uint8 *)str2;
+pcre_uchar c1;
+pcre_uchar c2;
+
+while (*str1 != '\0' || *ustr2 != '\0')
+  {
+  c1 = *str1++;
+  c2 = (pcre_uchar)*ustr2++;
+  if (c1 != c2)
+    return ((c1 > c2) << 1) - 1;
+  }
+/* Both length and characters must be equal. */
+return 0;
+}
+
+/* The following two functions compare two fixed-length strings. They are
+basically strncmp() for non-8-bit characters.
+
+Arguments:
+  str1        first string
+  str2        second string
+  num         number of characters to compare
+
+Returns:      0 if equal over num characters, -1 if str1 < str2, 1 otherwise
+*/
+
+int
+PRIV(strncmp_uc_uc)(const pcre_uchar *str1, const pcre_uchar *str2, unsigned int num)
+{
+pcre_uchar c1;
+pcre_uchar c2;
+
+while (num-- > 0)
+  {
+  c1 = *str1++;
+  c2 = *str2++;
+  if (c1 != c2)
+    return ((c1 > c2) << 1) - 1;
+  }
+/* Both length and characters must be equal. */
+return 0;
+}
+
+int
+PRIV(strncmp_uc_c8)(const pcre_uchar *str1, const char *str2, unsigned int num)
+{
+const pcre_uint8 *ustr2 = (pcre_uint8 *)str2;
+pcre_uchar c1;
+pcre_uchar c2;
+
+while (num-- > 0)
+  {
+  c1 = *str1++;
+  c2 = (pcre_uchar)*ustr2++;
+  if (c1 != c2)
+    return ((c1 > c2) << 1) - 1;
+  }
+/* Both length and characters must be equal. */
+return 0;
+}
+
+/* The following function returns the length of a zero-terminated string. It
+is basically strlen() for non-8-bit characters.
+
+Arguments:
+  str         string
+
+Returns:      length of the string
+*/
+
+unsigned int
+PRIV(strlen_uc)(const pcre_uchar *str)
+{
+unsigned int len = 0;
+while (*str++ != 0)
+  len++;
+return len;
+}
+
+#endif /* COMPILE_PCRE8 */
+
+/* End of pcre_string_utils.c */
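
Note on the comparison functions above: the expression ((c1 > c2) << 1) - 1 maps the first differing pair of characters to 1 or -1, so the result is three-valued like strcmp(). A minimal sketch for 16-bit units follows. The typedef uchar16 is an assumed stand-in for pcre_uchar in a 16-bit build, and the unprefixed function names are illustrative only.

```c
#include <stdint.h>

typedef uint16_t uchar16;  /* assumption: stands in for pcre_uchar (16-bit) */

/* Compare two zero-terminated 16-bit strings; returns -1, 0 or 1. */
static int strcmp_uc_uc(const uchar16 *str1, const uchar16 *str2)
{
  while (*str1 != 0 || *str2 != 0)
    {
    uchar16 c1 = *str1++;
    uchar16 c2 = *str2++;
    /* (c1 > c2) is 0 or 1, so the expression yields -1 or 1. */
    if (c1 != c2) return ((c1 > c2) << 1) - 1;
    }
  return 0;  /* both length and characters are equal */
}

/* Length of a zero-terminated 16-bit string, in code units. */
static unsigned int strlen_uc(const uchar16 *str)
{
  unsigned int len = 0;
  while (*str++ != 0) len++;
  return len;
}
```

A shorter string that is a prefix of a longer one compares as less, because its terminating zero is smaller than any other code unit.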

Modified: code/trunk/pcre_study.c
===================================================================
--- code/trunk/pcre_study.c    2011-12-28 16:10:09 UTC (rev 835)
+++ code/trunk/pcre_study.c    2011-12-28 17:16:11 UTC (rev 836)
@@ -6,7 +6,7 @@
 and semantics are as close as possible to those of the Perl 5 language.

                        Written by Philip Hazel
-           Copyright (c) 1997-2010 University of Cambridge
+           Copyright (c) 1997-2012 University of Cambridge

 -----------------------------------------------------------------------------
 Redistribution and use in source and binary forms, with or without
@@ -78,17 +78,18 @@
 */

 static int
-find_minlength(const uschar *code, const uschar *startcode, int options,
+find_minlength(const pcre_uchar *code, const pcre_uchar *startcode, int options,
   int recurse_depth)
 {
 int length = -1;
-BOOL utf8 = (options & PCRE_UTF8) != 0;
+/* PCRE_UTF16 has the same value as PCRE_UTF8. */
+BOOL utf = (options & PCRE_UTF8) != 0;
 BOOL had_recurse = FALSE;
 register int branchlength = 0;
-register uschar *cc = (uschar *)code + 1 + LINK_SIZE;
+register pcre_uchar *cc = (pcre_uchar *)code + 1 + LINK_SIZE;

 if (*code == OP_CBRA || *code == OP_SCBRA ||
-    *code == OP_CBRAPOS || *code == OP_SCBRAPOS) cc += 2;
+    *code == OP_CBRAPOS || *code == OP_SCBRAPOS) cc += IMM2_SIZE;

 /* Scan along the opcodes for this branch. If we get to the end of the
 branch, check the length against that of the other branches. */
@@ -96,7 +97,7 @@
 for (;;)
   {
   int d, min;
-  uschar *cs, *ce;
+  pcre_uchar *cs, *ce;
   register int op = *cc;

   switch (op)
@@ -189,7 +190,7 @@
     case OP_DOLLM:
     case OP_NOT_WORD_BOUNDARY:
     case OP_WORD_BOUNDARY:
-    cc += _pcre_OP_lengths[*cc];
+    cc += PRIV(OP_lengths)[*cc];
     break;

     /* Skip over a subpattern that has a {0} or {0,x} quantifier */
@@ -198,7 +199,7 @@
     case OP_BRAMINZERO:
     case OP_BRAPOSZERO:
     case OP_SKIPZERO:
-    cc += _pcre_OP_lengths[*cc];
+    cc += PRIV(OP_lengths)[*cc];
     do cc += GET(cc, 1); while (*cc == OP_ALT);
     cc += 1 + LINK_SIZE;
     break;
@@ -223,8 +224,8 @@
     case OP_NOTPOSPLUSI:
     branchlength++;
     cc += 2;
-#ifdef SUPPORT_UTF8
-    if (utf8 && cc[-1] >= 0xc0) cc += _pcre_utf8_table4[cc[-1] & 0x3f];
+#ifdef SUPPORT_UTF
+    if (utf && HAS_EXTRALEN(cc[-1])) cc += GET_EXTRALEN(cc[-1]);
 #endif
     break;

@@ -243,15 +244,16 @@
     case OP_NOTEXACT:
     case OP_NOTEXACTI:
     branchlength += GET2(cc,1);
-    cc += 4;
-#ifdef SUPPORT_UTF8
-    if (utf8 && cc[-1] >= 0xc0) cc += _pcre_utf8_table4[cc[-1] & 0x3f];
+    cc += 2 + IMM2_SIZE;
+#ifdef SUPPORT_UTF
+    if (utf && HAS_EXTRALEN(cc[-1])) cc += GET_EXTRALEN(cc[-1]);
 #endif
     break;

     case OP_TYPEEXACT:
     branchlength += GET2(cc,1);
-    cc += (cc[3] == OP_PROP || cc[3] == OP_NOTPROP)? 6 : 4;
+    cc += 2 + IMM2_SIZE + ((cc[1 + IMM2_SIZE] == OP_PROP
+      || cc[1 + IMM2_SIZE] == OP_NOTPROP)? 2 : 0);
     break;

     /* Handle single-char non-literal matchers */
@@ -286,13 +288,13 @@
     cc++;
     break;

-    /* The single-byte matcher means we can't proceed in UTF-8 mode. (In 
-    non-UTF-8 mode \C will actually be turned into OP_ALLANY, so won't ever 
+    /* The single-byte matcher means we can't proceed in UTF-8 mode. (In
+    non-UTF-8 mode \C will actually be turned into OP_ALLANY, so won't ever
     appear, but leave the code, just in case.) */

     case OP_ANYBYTE:
-#ifdef SUPPORT_UTF8
-    if (utf8) return -1;
+#ifdef SUPPORT_UTF
+    if (utf) return -1;
 #endif
     branchlength++;
     cc++;
@@ -308,27 +310,28 @@
     case OP_TYPEPOSSTAR:
     case OP_TYPEPOSQUERY:
     if (cc[1] == OP_PROP || cc[1] == OP_NOTPROP) cc += 2;
-    cc += _pcre_OP_lengths[op];
+    cc += PRIV(OP_lengths)[op];
     break;

     case OP_TYPEUPTO:
     case OP_TYPEMINUPTO:
     case OP_TYPEPOSUPTO:
-    if (cc[3] == OP_PROP || cc[3] == OP_NOTPROP) cc += 2;
-    cc += _pcre_OP_lengths[op];
+    if (cc[1 + IMM2_SIZE] == OP_PROP
+      || cc[1 + IMM2_SIZE] == OP_NOTPROP) cc += 2;
+    cc += PRIV(OP_lengths)[op];
     break;

     /* Check a class for variable quantification */

-#ifdef SUPPORT_UTF8
+#if defined SUPPORT_UTF || !defined COMPILE_PCRE8
     case OP_XCLASS:
-    cc += GET(cc, 1) - 33;
+    cc += GET(cc, 1) - PRIV(OP_lengths)[OP_CLASS];
     /* Fall through */
 #endif

     case OP_CLASS:
     case OP_NCLASS:
-    cc += 33;
+    cc += PRIV(OP_lengths)[OP_CLASS];

     switch (*cc)
       {
@@ -347,7 +350,7 @@
       case OP_CRRANGE:
       case OP_CRMINRANGE:
       branchlength += GET2(cc,1);
-      cc += 5;
+      cc += 1 + 2 * IMM2_SIZE;
       break;

       default:
@@ -372,7 +375,7 @@
     case OP_REFI:
     if ((options & PCRE_JAVASCRIPT_COMPAT) == 0)
       {
-      ce = cs = (uschar *)_pcre_find_bracket(startcode, utf8, GET2(cc, 1));
+      ce = cs = (pcre_uchar *)PRIV(find_bracket)(startcode, utf, GET2(cc, 1));
       if (cs == NULL) return -2;
       do ce += GET(ce, 1); while (*ce == OP_ALT);
       if (cc > cs && cc < ce)
@@ -386,7 +389,7 @@
         }
       }
     else d = 0;
-    cc += 3;
+    cc += 1 + IMM2_SIZE;

     /* Handle repeated back references */

@@ -409,7 +412,7 @@
       case OP_CRRANGE:
       case OP_CRMINRANGE:
       min = GET2(cc, 1);
-      cc += 5;
+      cc += 1 + 2 * IMM2_SIZE;
       break;

       default:
@@ -424,7 +427,7 @@
     caught by a recursion depth count. */

     case OP_RECURSE:
-    cs = ce = (uschar *)startcode + GET(cc, 1);
+    cs = ce = (pcre_uchar *)startcode + GET(cc, 1);
     do ce += GET(ce, 1); while (*ce == OP_ALT);
     if ((cc > cs && cc < ce) || recurse_depth > 10)
       had_recurse = TRUE;
@@ -482,9 +485,9 @@
     case OP_NOTPOSQUERY:
     case OP_NOTPOSQUERYI:

-    cc += _pcre_OP_lengths[op];
-#ifdef SUPPORT_UTF8
-    if (utf8 && cc[-1] >= 0xc0) cc += _pcre_utf8_table4[cc[-1] & 0x3f];
+    cc += PRIV(OP_lengths)[op];
+#ifdef SUPPORT_UTF
+    if (utf && HAS_EXTRALEN(cc[-1])) cc += GET_EXTRALEN(cc[-1]);
 #endif
     break;

@@ -494,7 +497,7 @@
     case OP_PRUNE_ARG:
     case OP_SKIP_ARG:
     case OP_THEN_ARG:
-    cc += _pcre_OP_lengths[op] + cc[1];
+    cc += PRIV(OP_lengths)[op] + cc[1];
     break;

     /* The remaining opcodes are just skipped over. */
@@ -506,7 +509,7 @@
     case OP_SET_SOM:
     case OP_SKIP:
     case OP_THEN:
-    cc += _pcre_OP_lengths[op];
+    cc += PRIV(OP_lengths)[op];
     break;

     /* This should not occur: we list all opcodes explicitly so that when
@@ -535,29 +538,30 @@
   p             points to the character
   caseless      the caseless flag
   cd            the block with char table pointers
-  utf8          TRUE for UTF-8 mode
+  utf           TRUE for UTF-8 / UTF-16 mode

 Returns:        pointer after the character
 */

-static const uschar *
-set_table_bit(uschar *start_bits, const uschar *p, BOOL caseless,
-  compile_data *cd, BOOL utf8)
+static const pcre_uchar *
+set_table_bit(pcre_uint8 *start_bits, const pcre_uchar *p, BOOL caseless,
+  compile_data *cd, BOOL utf)
 {
 unsigned int c = *p;

+#ifdef COMPILE_PCRE8
 SET_BIT(c);

-#ifdef SUPPORT_UTF8
-if (utf8 && c > 127)
+#ifdef SUPPORT_UTF
+if (utf && c > 127)
   {
   GETCHARINC(c, p);
 #ifdef SUPPORT_UCP
   if (caseless)
     {
-    uschar buff[8];
+    pcre_uchar buff[6];
     c = UCD_OTHERCASE(c);
-    (void)_pcre_ord2utf8(c, buff);
+    (void)PRIV(ord2utf)(c, buff);
     SET_BIT(buff[0]);
     }
 #endif
@@ -569,6 +573,36 @@

 if (caseless && (cd->ctypes[c] & ctype_letter) != 0) SET_BIT(cd->fcc[c]);
 return p + 1;
+#endif
+
+#ifdef COMPILE_PCRE16
+if (c > 0xff)
+  {
+  c = 0xff;
+  caseless = FALSE;
+  }
+SET_BIT(c);
+
+#ifdef SUPPORT_UTF
+if (utf && c > 127)
+  {
+  GETCHARINC(c, p);
+#ifdef SUPPORT_UCP
+  if (caseless)
+    {
+    c = UCD_OTHERCASE(c);
+    if (c > 0xff)
+      c = 0xff;
+    SET_BIT(c);
+    }
+#endif
+  return p;
+  }
+#endif
+
+if (caseless && (cd->ctypes[c] & ctype_letter) != 0) SET_BIT(cd->fcc[c]);
+return p + 1;
+#endif
 }

@@ -594,21 +628,23 @@
*/

 static void
-set_type_bits(uschar *start_bits, int cbit_type, int table_limit,
+set_type_bits(pcre_uint8 *start_bits, int cbit_type, int table_limit,
   compile_data *cd)
 {
 register int c;
 for (c = 0; c < table_limit; c++) start_bits[c] |= cd->cbits[c+cbit_type];
+#if defined SUPPORT_UTF && defined COMPILE_PCRE8
 if (table_limit == 32) return;
 for (c = 128; c < 256; c++)
   {
   if ((cd->cbits[c/8] & (1 << (c&7))) != 0)
     {
-    uschar buff[8];
-    (void)_pcre_ord2utf8(c, buff);
+    pcre_uchar buff[6];
+    (void)PRIV(ord2utf)(c, buff);
     SET_BIT(buff[0]);
     }
   }
+#endif
 }

@@ -634,12 +670,14 @@
*/

static void
-set_nottype_bits(uschar *start_bits, int cbit_type, int table_limit,
+set_nottype_bits(pcre_uint8 *start_bits, int cbit_type, int table_limit,
compile_data *cd)
{
register int c;
for (c = 0; c < table_limit; c++) start_bits[c] |= ~cd->cbits[c+cbit_type];
+#if defined SUPPORT_UTF && defined COMPILE_PCRE8
if (table_limit != 32) for (c = 24; c < 32; c++) start_bits[c] = 0xff;
+#endif
}

@@ -659,7 +697,7 @@
 Arguments:
   code         points to an expression
   start_bits   points to a 32-byte table, initialized to 0
-  utf8         TRUE if in UTF-8 mode
+  utf          TRUE if in UTF-8 / UTF-16 mode
   cd           the block with char table pointers

 Returns:       SSB_FAIL     => Failed to find any starting bytes
@@ -669,12 +707,16 @@
 */

static int
-set_start_bits(const uschar *code, uschar *start_bits, BOOL utf8,
+set_start_bits(const pcre_uchar *code, pcre_uint8 *start_bits, BOOL utf,
compile_data *cd)
{
register int c;
int yield = SSB_DONE;
-int table_limit = utf8? 16:32;
+#if defined SUPPORT_UTF && defined COMPILE_PCRE8
+int table_limit = utf? 16:32;
+#else
+int table_limit = 32;
+#endif

#if 0
/* ========================================================================= */
@@ -696,10 +738,10 @@
do
{
BOOL try_next = TRUE;
- const uschar *tcode = code + 1 + LINK_SIZE;
+ const pcre_uchar *tcode = code + 1 + LINK_SIZE;

   if (*code == OP_CBRA || *code == OP_SCBRA ||
-      *code == OP_CBRAPOS || *code == OP_SCBRAPOS) tcode += 2;
+      *code == OP_CBRAPOS || *code == OP_SCBRAPOS) tcode += IMM2_SIZE;

   while (try_next)    /* Loop for items in this branch */
     {
@@ -785,7 +827,9 @@
       case OP_SOM:
       case OP_THEN:
       case OP_THEN_ARG:
+#if defined SUPPORT_UTF || !defined COMPILE_PCRE8
       case OP_XCLASS:
+#endif
       return SSB_FAIL;

       /* We can ignore word boundary tests. */
@@ -811,7 +855,7 @@
       case OP_ONCE:
       case OP_ONCE_NC:
       case OP_ASSERT:
-      rc = set_start_bits(tcode, start_bits, utf8, cd);
+      rc = set_start_bits(tcode, start_bits, utf, cd);
       if (rc == SSB_FAIL || rc == SSB_UNKNOWN) return rc;
       if (rc == SSB_DONE) try_next = FALSE; else
         {
@@ -858,7 +902,7 @@
       case OP_BRAZERO:
       case OP_BRAMINZERO:
       case OP_BRAPOSZERO:
-      rc = set_start_bits(++tcode, start_bits, utf8, cd);
+      rc = set_start_bits(++tcode, start_bits, utf, cd);
       if (rc == SSB_FAIL || rc == SSB_UNKNOWN) return rc;
 /* =========================================================================
       See the comment at the head of this function concerning the next line,
@@ -885,7 +929,7 @@
       case OP_QUERY:
       case OP_MINQUERY:
       case OP_POSQUERY:
-      tcode = set_table_bit(start_bits, tcode + 1, FALSE, cd, utf8);
+      tcode = set_table_bit(start_bits, tcode + 1, FALSE, cd, utf);
       break;

       case OP_STARI:
@@ -894,7 +938,7 @@
       case OP_QUERYI:
       case OP_MINQUERYI:
       case OP_POSQUERYI:
-      tcode = set_table_bit(start_bits, tcode + 1, TRUE, cd, utf8);
+      tcode = set_table_bit(start_bits, tcode + 1, TRUE, cd, utf);
       break;

       /* Single-char upto sets the bit and tries the next */
@@ -902,36 +946,36 @@
       case OP_UPTO:
       case OP_MINUPTO:
       case OP_POSUPTO:
-      tcode = set_table_bit(start_bits, tcode + 3, FALSE, cd, utf8);
+      tcode = set_table_bit(start_bits, tcode + 1 + IMM2_SIZE, FALSE, cd, utf);
       break;

       case OP_UPTOI:
       case OP_MINUPTOI:
       case OP_POSUPTOI:
-      tcode = set_table_bit(start_bits, tcode + 3, TRUE, cd, utf8);
+      tcode = set_table_bit(start_bits, tcode + 1 + IMM2_SIZE, TRUE, cd, utf);
       break;

       /* At least one single char sets the bit and stops */

       case OP_EXACT:
-      tcode += 2;
+      tcode += IMM2_SIZE;
       /* Fall through */
       case OP_CHAR:
       case OP_PLUS:
       case OP_MINPLUS:
       case OP_POSPLUS:
-      (void)set_table_bit(start_bits, tcode + 1, FALSE, cd, utf8);
+      (void)set_table_bit(start_bits, tcode + 1, FALSE, cd, utf);
       try_next = FALSE;
       break;

       case OP_EXACTI:
-      tcode += 2;
+      tcode += IMM2_SIZE;
       /* Fall through */
       case OP_CHARI:
       case OP_PLUSI:
       case OP_MINPLUSI:
       case OP_POSPLUSI:
-      (void)set_table_bit(start_bits, tcode + 1, TRUE, cd, utf8);
+      (void)set_table_bit(start_bits, tcode + 1, TRUE, cd, utf);
       try_next = FALSE;
       break;

@@ -944,14 +988,28 @@
       case OP_HSPACE:
       SET_BIT(0x09);
       SET_BIT(0x20);
-      if (utf8)
+#ifdef SUPPORT_UTF
+      if (utf)
         {
+#ifdef COMPILE_PCRE8
         SET_BIT(0xC2);  /* For U+00A0 */
         SET_BIT(0xE1);  /* For U+1680, U+180E */
         SET_BIT(0xE2);  /* For U+2000 - U+200A, U+202F, U+205F */
         SET_BIT(0xE3);  /* For U+3000 */
+#endif
+#ifdef COMPILE_PCRE16
+        SET_BIT(0xA0);
+        SET_BIT(0xFF);  /* For characters > 255 */
+#endif
         }
-      else SET_BIT(0xA0);
+      else
+#endif /* SUPPORT_UTF */
+        {
+        SET_BIT(0xA0);
+#ifdef COMPILE_PCRE16
+        SET_BIT(0xFF);  /* For characters > 255 */
+#endif
+        }
       try_next = FALSE;
       break;

@@ -961,12 +1019,26 @@
       SET_BIT(0x0B);
       SET_BIT(0x0C);
       SET_BIT(0x0D);
-      if (utf8)
+#ifdef SUPPORT_UTF
+      if (utf)
         {
+#ifdef COMPILE_PCRE8
         SET_BIT(0xC2);  /* For U+0085 */
         SET_BIT(0xE2);  /* For U+2028, U+2029 */
+#endif
+#ifdef COMPILE_PCRE16
+        SET_BIT(0x85);
+        SET_BIT(0xFF);  /* For characters > 255 */
+#endif
         }
-      else SET_BIT(0x85);
+      else
+#endif /* SUPPORT_UTF */
+        {
+        SET_BIT(0x85);
+#ifdef COMPILE_PCRE16
+        SET_BIT(0xFF);  /* For characters > 255 */
+#endif
+        }
       try_next = FALSE;
       break;

@@ -1024,7 +1096,7 @@
       break;

       case OP_TYPEEXACT:
-      tcode += 3;
+      tcode += 1 + IMM2_SIZE;
       break;

       /* Zero or more repeats of character types set the bits and then
@@ -1033,7 +1105,7 @@
       case OP_TYPEUPTO:
       case OP_TYPEMINUPTO:
       case OP_TYPEPOSUPTO:
-      tcode += 2;               /* Fall through */
+      tcode += IMM2_SIZE;  /* Fall through */

       case OP_TYPESTAR:
       case OP_TYPEMINSTAR:
@@ -1051,14 +1123,23 @@
         case OP_HSPACE:
         SET_BIT(0x09);
         SET_BIT(0x20);
-        if (utf8)
+#ifdef SUPPORT_UTF
+        if (utf)
           {
+#ifdef COMPILE_PCRE8
           SET_BIT(0xC2);  /* For U+00A0 */
           SET_BIT(0xE1);  /* For U+1680, U+180E */
           SET_BIT(0xE2);  /* For U+2000 - U+200A, U+202F, U+205F */
           SET_BIT(0xE3);  /* For U+3000 */
+#endif
+#ifdef COMPILE_PCRE16
+          SET_BIT(0xA0);
+          SET_BIT(0xFF);  /* For characters > 255 */
+#endif
           }
-        else SET_BIT(0xA0);
+        else
+#endif /* SUPPORT_UTF */
+          SET_BIT(0xA0);
         break;

         case OP_ANYNL:
@@ -1067,12 +1148,21 @@
         SET_BIT(0x0B);
         SET_BIT(0x0C);
         SET_BIT(0x0D);
-        if (utf8)
+#ifdef SUPPORT_UTF
+        if (utf)
           {
+#ifdef COMPILE_PCRE8
           SET_BIT(0xC2);  /* For U+0085 */
           SET_BIT(0xE2);  /* For U+2028, U+2029 */
+#endif
+#ifdef COMPILE_PCRE16
+          SET_BIT(0x85);
+          SET_BIT(0xFF);  /* For characters > 255 */
+#endif
           }
-        else SET_BIT(0x85);
+        else
+#endif /* SUPPORT_UTF */
+          SET_BIT(0x85);
         break;

         case OP_NOT_DIGIT:
@@ -1119,18 +1209,23 @@
       character with a value > 255. */

       case OP_NCLASS:
-#ifdef SUPPORT_UTF8
-      if (utf8)
+#if defined SUPPORT_UTF && defined COMPILE_PCRE8
+      if (utf)
         {
         start_bits[24] |= 0xf0;              /* Bits for 0xc4 - 0xc7 */
         memset(start_bits+25, 0xff, 7);      /* Bits for 0xc8 - 0xff */
         }
 #endif
+#ifdef COMPILE_PCRE16
+      SET_BIT(0xFF);                         /* For characters > 255 */
+#endif
       /* Fall through */

       case OP_CLASS:
         {
+        pcre_uint8 *map;
         tcode++;
+        map = (pcre_uint8 *)tcode;

         /* In UTF-8 mode, the bits in a bit map correspond to character
         values, not to byte values. However, the bit map we are constructing is
@@ -1138,13 +1233,13 @@
         value is > 127. In fact, there are only two possible starting bytes for
         characters in the range 128 - 255. */

-#ifdef SUPPORT_UTF8
-        if (utf8)
+#if defined SUPPORT_UTF && defined COMPILE_PCRE8
+        if (utf)
           {
-          for (c = 0; c < 16; c++) start_bits[c] |= tcode[c];
+          for (c = 0; c < 16; c++) start_bits[c] |= map[c];
           for (c = 128; c < 256; c++)
             {
-            if ((tcode[c/8] && (1 << (c&7))) != 0)
+            if ((map[c/8] & (1 << (c&7))) != 0)
               {
               int d = (c >> 6) | 0xc0;            /* Set bit for this starter */
               start_bits[d/8] |= (1 << (d&7));    /* and then skip on to the */
@@ -1152,19 +1247,17 @@
               }
             }
           }
-
-        /* In non-UTF-8 mode, the two bit maps are completely compatible. */
-
         else
 #endif
           {
-          for (c = 0; c < 32; c++) start_bits[c] |= tcode[c];
+          /* In non-UTF-8 mode, the two bit maps are completely compatible. */
+          for (c = 0; c < 32; c++) start_bits[c] |= map[c];
           }

         /* Advance past the bit map, and act on what follows. For a zero
         minimum repeat, continue; otherwise stop processing. */

-        tcode += 32;
+        tcode += 32 / sizeof(pcre_uchar);
         switch (*tcode)
           {
           case OP_CRSTAR:
@@ -1176,7 +1269,7 @@

           case OP_CRRANGE:
           case OP_CRMINRANGE:
-          if (((tcode[1] << 8) + tcode[2]) == 0) tcode += 5;
+          if (GET2(tcode, 1) == 0) tcode += 1 + 2 * IMM2_SIZE;
             else try_next = FALSE;
           break;

@@ -1219,16 +1312,21 @@
             NULL on error or if no optimization possible
 */

+#ifdef COMPILE_PCRE8
PCRE_EXP_DEFN pcre_extra * PCRE_CALL_CONVENTION
pcre_study(const pcre *external_re, int options, const char **errorptr)
+#else
+PCRE_EXP_DEFN pcre_extra * PCRE_CALL_CONVENTION
+pcre16_study(const pcre *external_re, int options, const char **errorptr)
+#endif
{
int min;
BOOL bits_set = FALSE;
-uschar start_bits[32];
+pcre_uint8 start_bits[32];
pcre_extra *extra = NULL;
pcre_study_data *study;
-const uschar *tables;
-uschar *code;
+const pcre_uint8 *tables;
+pcre_uchar *code;
compile_data compile_block;
const real_pcre *re = (const real_pcre *)external_re;

@@ -1240,13 +1338,23 @@
return NULL;
}

+if ((re->flags & PCRE_MODE) == 0)
+ {
+#ifdef COMPILE_PCRE8
+ *errorptr = "argument is compiled in 16 bit mode";
+#else
+ *errorptr = "argument is compiled in 8 bit mode";
+#endif
+ return NULL;
+ }
+
if ((options & ~PUBLIC_STUDY_OPTIONS) != 0)
{
*errorptr = "unknown or incorrect option bit(s) set";
return NULL;
}

-code = (uschar *)re + re->name_table_offset +
+code = (pcre_uchar *)re + re->name_table_offset +
(re->name_count * re->name_entry_size);

/* For an anchored pattern, or an unanchored pattern that has a first char, or
@@ -1261,9 +1369,16 @@
/* Set the character tables in the block that is passed around */

   tables = re->tables;
+
+#ifdef COMPILE_PCRE8
   if (tables == NULL)
     (void)pcre_fullinfo(external_re, NULL, PCRE_INFO_DEFAULT_TABLES,
     (void *)(&tables));
+#else
+  if (tables == NULL)
+    (void)pcre16_fullinfo(external_re, NULL, PCRE_INFO_DEFAULT_TABLES,
+    (void *)(&tables));
+#endif

compile_block.lcc = tables + lcc_offset;
compile_block.fcc = tables + fcc_offset;
@@ -1272,7 +1387,7 @@

/* See if we can find a fixed set of initial characters for the pattern. */

-  memset(start_bits, 0, 32 * sizeof(uschar));
+  memset(start_bits, 0, 32 * sizeof(pcre_uint8));
   rc = set_start_bits(code, start_bits, (re->options & PCRE_UTF8) != 0,
     &compile_block);
   bits_set = rc == SSB_DONE;
@@ -1307,7 +1422,7 @@
 #endif
   )
   {
-  extra = (pcre_extra *)(pcre_malloc)
+  extra = (pcre_extra *)(PUBL(malloc))
     (sizeof(pcre_extra) + sizeof(pcre_study_data));
   if (extra == NULL)
     {
@@ -1322,12 +1437,29 @@
   study->size = sizeof(pcre_study_data);
   study->flags = 0;

+  /* Set the start bits always, to avoid unset memory errors if the
+  study data is written to a file, but set the flag only if any of the bits
+  are set, to save time looking when none are. */
+
   if (bits_set)
     {
     study->flags |= PCRE_STUDY_MAPPED;
     memcpy(study->start_bits, start_bits, sizeof(start_bits));
     }
+  else memset(study->start_bits, 0, 32 * sizeof(pcre_uint8));

+#ifdef PCRE_DEBUG
+  if (bits_set)
+    {
+    pcre_uint8 *ptr = (pcre_uint8 *)start_bits;
+    int i;
+
+    printf("Start bits:\n");
+    for (i = 0; i < 32; i++)
+      printf("%3d: %02x%s", i * 8, *ptr++, ((i + 1) & 0x7) != 0? " " : "\n");
+    }
+#endif
+
   /* Always set the minlength value in the block, because the JIT compiler
   makes use of it. However, don't set the bit unless the length is greater than
   zero - the interpretive pcre_exec() and pcre_dfa_exec() needn't waste time
@@ -1346,10 +1478,15 @@

 #ifdef SUPPORT_JIT
   extra->executable_jit = NULL;
-  if ((options & PCRE_STUDY_JIT_COMPILE) != 0) _pcre_jit_compile(re, extra);
+  if ((options & PCRE_STUDY_JIT_COMPILE) != 0) PRIV(jit_compile)(re, extra);
   if (study->flags == 0 && (extra->flags & PCRE_EXTRA_EXECUTABLE_JIT) == 0)
     {
+#ifdef COMPILE_PCRE8
     pcre_free_study(extra);
+#endif
+#ifdef COMPILE_PCRE16
+    pcre16_free_study(extra);
+#endif
     extra = NULL;
     }
 #endif
@@ -1369,15 +1506,22 @@
 Returns:    nothing
 */

+#ifdef COMPILE_PCRE8
 PCRE_EXP_DEFN void
 pcre_free_study(pcre_extra *extra)
+#else
+PCRE_EXP_DEFN void
+pcre16_free_study(pcre_extra *extra)
+#endif
 {
+if (extra == NULL)
+  return;
 #ifdef SUPPORT_JIT
 if ((extra->flags & PCRE_EXTRA_EXECUTABLE_JIT) != 0 &&
      extra->executable_jit != NULL)
-  _pcre_jit_free(extra->executable_jit);
+  PRIV(jit_free)(extra->executable_jit);
 #endif
-pcre_free(extra);
+PUBL(free)(extra);
 }

/* End of pcre_study.c */
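
Note on the IMM2_SIZE substitutions in the pcre_study.c hunks above: literal offsets such as 5, 3 and 2 become 1 + 2 * IMM2_SIZE, 1 + IMM2_SIZE and IMM2_SIZE, and the 32-byte class bit map is skipped with 32 / sizeof(pcre_uchar). The following is a minimal sketch of that arithmetic, not PCRE code. The helper names are hypothetical, and the per-width values (a 2-byte immediate takes 2 code units in the 8-bit library and 1 in the 16-bit library) are an assumption that is consistent with the old literals shown in the diff.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: code units occupied by one 2-byte immediate value.
   Assumption: 2 units when code units are 1 byte wide, 1 unit when they
   are 2 bytes wide. */
static size_t imm2_size(size_t code_unit_bytes)
{
  assert(code_unit_bytes == 1 || code_unit_bytes == 2);
  return 2 / code_unit_bytes;
}

/* Length of an OP_CRRANGE-style item: opcode, then min and max counts. */
static size_t crrange_length(size_t code_unit_bytes)
{
  return 1 + 2 * imm2_size(code_unit_bytes);
}

/* Offset of the character in an OP_UPTO-style item: opcode, then count. */
static size_t upto_char_offset(size_t code_unit_bytes)
{
  return 1 + imm2_size(code_unit_bytes);
}

/* Code units occupied by a 32-byte class bit map. */
static size_t class_map_units(size_t code_unit_bytes)
{
  return 32 / code_unit_bytes;
}
```

With 1-byte code units these helpers give back the old literals (5, 3 and 32), which is why the 8-bit build should behave as before.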

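Note on the start-bit map handled by set_start_bits(): the map is 32 bytes, one bit per possible first code unit value 0..255. The sketch below illustrates the two mappings the hunks above rely on. It is an illustration under stated assumptions, not the library's implementation, and all function names are hypothetical. In 8-bit UTF-8 mode a character in the range 128..255 is represented by its lead byte, computed as in the diff by (c >> 6) | 0xc0. In 16-bit mode every code unit above 0xff shares the single bit 0xff.

```c
#include <assert.h>
#include <stdint.h>

/* Set the bit for value c (0..255) in a 32-byte start-bit map. */
static void set_start_bit(uint8_t map[32], unsigned c)
{
  assert(c < 256);
  map[c / 8] |= (uint8_t)(1u << (c & 7));
}

/* Test one bit. The operator must be the bitwise '&'; a logical '&&'
   would be true whenever the containing byte is non-zero. */
static int test_start_bit(const uint8_t map[32], unsigned c)
{
  assert(c < 256);
  return (map[c / 8] & (1u << (c & 7))) != 0;
}

/* 8-bit UTF-8 mode: lead byte for a character in 128..255 (0xC2 or 0xC3). */
static unsigned utf8_lead_byte_latin1(unsigned c)
{
  assert(c >= 128 && c < 256);
  return (c >> 6) | 0xc0u;
}

/* 16-bit mode: code units above 0xff are all represented by bit 0xff. */
static unsigned clamp_code_unit16(unsigned c)
{
  return c > 0xff ? 0xffu : c;
}
```

For example, U+00A0 maps to lead byte 0xC2 in the 8-bit UTF-8 case, matching the SET_BIT(0xC2) lines for OP_HSPACE, while U+1680 maps to bit 0xFF in the 16-bit case.
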
Modified: code/trunk/pcre_tables.c
===================================================================
--- code/trunk/pcre_tables.c    2011-12-28 16:10:09 UTC (rev 835)
+++ code/trunk/pcre_tables.c    2011-12-28 17:16:11 UTC (rev 836)
@@ -6,7 +6,7 @@
 and semantics are as close as possible to those of the Perl 5 language.

                        Written by Philip Hazel
-           Copyright (c) 1997-2009 University of Cambridge
+           Copyright (c) 1997-2012 University of Cambridge

-----------------------------------------------------------------------------
Redistribution and use in source and binary forms, with or without
@@ -37,6 +37,7 @@
-----------------------------------------------------------------------------
*/

+#ifndef PCRE_INCLUDED

/* This module contains some fixed tables that are used by more than one of the
PCRE code modules. The tables are also #included by the pcretest program, which
@@ -50,11 +51,12 @@

#include "pcre_internal.h"

+#endif /* PCRE_INCLUDED */

/* Table of sizes for the fixed-length opcodes. It's defined in a macro so that
the definition is next to the definition of the opcodes in pcre_internal.h. */

-const uschar _pcre_OP_lengths[] = { OP_LENGTHS };
+const pcre_uint8 PRIV(OP_lengths)[] = { OP_LENGTHS };

@@ -65,44 +67,38 @@
/* These are the breakpoints for different numbers of bytes in a UTF-8
character. */

-#ifdef SUPPORT_UTF8
+#if (defined SUPPORT_UTF && defined COMPILE_PCRE8) \
+ || (defined PCRE_INCLUDED && defined SUPPORT_PCRE16)

-const int _pcre_utf8_table1[] =
+/* These tables are also required by pcretest in 16 bit mode. */
+
+const int PRIV(utf8_table1)[] =
{ 0x7f, 0x7ff, 0xffff, 0x1fffff, 0x3ffffff, 0x7fffffff};

-const int _pcre_utf8_table1_size = sizeof(_pcre_utf8_table1)/sizeof(int);
+const int PRIV(utf8_table1_size) = sizeof(PRIV(utf8_table1)) / sizeof(int);

/* These are the indicator bits and the mask for the data bits to set in the
first byte of a character, indexed by the number of additional bytes. */

-const int _pcre_utf8_table2[] = { 0,    0xc0, 0xe0, 0xf0, 0xf8, 0xfc};
-const int _pcre_utf8_table3[] = { 0xff, 0x1f, 0x0f, 0x07, 0x03, 0x01};
+const int PRIV(utf8_table2)[] = { 0,    0xc0, 0xe0, 0xf0, 0xf8, 0xfc};
+const int PRIV(utf8_table3)[] = { 0xff, 0x1f, 0x0f, 0x07, 0x03, 0x01};

/* Table of the number of extra bytes, indexed by the first byte masked with
0x3f. The highest number for a valid UTF-8 first byte is in fact 0x3d. */

-const uschar _pcre_utf8_table4[] = {
+const pcre_uint8 PRIV(utf8_table4)[] = {
1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,
1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,
2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,
3,3,3,3,3,3,3,3,4,4,4,4,5,5,5,5 };

-#ifdef SUPPORT_JIT
-/* Full table of the number of extra bytes when the
-character code is greater or equal than 0xc0.
-See _pcre_utf8_table4 above. */
+#endif /* (SUPPORT_UTF && COMPILE_PCRE8) || (PCRE_INCLUDED && SUPPORT_PCRE16)*/

-const uschar _pcre_utf8_char_sizes[] = {
- 1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,
- 1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,
- 2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,
- 3,3,3,3,3,3,3,3,4,4,4,4,4,4,4,4,
-};
-#endif
+#ifdef SUPPORT_UTF

/* Table to translate from particular type value to the general value. */

-const int _pcre_ucp_gentype[] = {
+const int PRIV(ucp_gentype)[] = {
   ucp_C, ucp_C, ucp_C, ucp_C, ucp_C,  /* Cc, Cf, Cn, Co, Cs */
   ucp_L, ucp_L, ucp_L, ucp_L, ucp_L,  /* Ll, Lu, Lm, Lo, Lt */
   ucp_M, ucp_M, ucp_M,                /* Mc, Me, Mn */
@@ -114,10 +110,10 @@
 };

#ifdef SUPPORT_JIT
-/* This table reverses _pcre_ucp_gentype. We can save the cost
+/* This table reverses PRIV(ucp_gentype). We can save the cost
of a memory load. */

-const int _pcre_ucp_typerange[] = {
+const int PRIV(ucp_typerange)[] = {
ucp_Cc, ucp_Cs,
ucp_Ll, ucp_Lu,
ucp_Mc, ucp_Mn,
@@ -126,7 +122,7 @@
ucp_Sc, ucp_So,
ucp_Zl, ucp_Zs,
};
-#endif
+#endif /* SUPPORT_JIT */

/* The pcre_utt[] table below translates Unicode property names into type and
code values. It is searched by binary chop, so must be in collating sequence of
@@ -284,7 +280,7 @@
#define STRING_Zp0 STR_Z STR_p "\0"
#define STRING_Zs0 STR_Z STR_s "\0"

-const char _pcre_utt_names[] =
+const char PRIV(utt_names)[] =
STRING_Any0
STRING_Arabic0
STRING_Armenian0
@@ -424,7 +420,7 @@
STRING_Zp0
STRING_Zs0;

-const ucp_type_table _pcre_utt[] = {
+const ucp_type_table PRIV(utt)[] = {
{ 0, PT_ANY, 0 },
{ 4, PT_SC, ucp_Arabic },
{ 11, PT_SC, ucp_Armenian },
@@ -565,8 +561,8 @@
{ 961, PT_PC, ucp_Zs }
};

-const int _pcre_utt_size = sizeof(_pcre_utt)/sizeof(ucp_type_table);
+const int PRIV(utt_size) = sizeof(PRIV(utt)) / sizeof(ucp_type_table);

-#endif /* SUPPORT_UTF8 */
+#endif /* SUPPORT_UTF */

/* End of pcre_tables.c */
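
Note on the UTF-8 tables renamed above: the sketch below shows one way tables with these values can drive encoding and length lookup. The table contents are copied from the diff, but the function names and the encoding routine are illustrative assumptions and are not taken from the PCRE sources.

```c
#include <assert.h>
#include <stdint.h>

/* Largest code point for each encoded length (1..6 bytes). */
static const int table1[] =
  { 0x7f, 0x7ff, 0xffff, 0x1fffff, 0x3ffffff, 0x7fffffff };
static const int table1_size = sizeof(table1) / sizeof(int);

/* Indicator bits for the first byte, indexed by number of extra bytes. */
static const int table2[] = { 0, 0xc0, 0xe0, 0xf0, 0xf8, 0xfc };

/* Number of extra bytes, indexed by the first byte masked with 0x3f. */
static const uint8_t table4[] = {
  1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,
  1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,
  2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,
  3,3,3,3,3,3,3,3,4,4,4,4,5,5,5,5 };

/* Hypothetical encoder: writes cvalue into buffer (at least 6 bytes)
   and returns the number of bytes written. */
static int encode_utf8(uint32_t cvalue, uint8_t *buffer)
{
  int i, j;
  assert(cvalue <= 0x7fffffffu);
  for (i = 0; i < table1_size; i++)
    if ((int)cvalue <= table1[i]) break;
  buffer += i;                       /* fill trailing bytes back to front */
  for (j = i; j > 0; j--)
    {
    *buffer-- = (uint8_t)(0x80 | (cvalue & 0x3f));
    cvalue >>= 6;
    }
  *buffer = (uint8_t)(table2[i] | cvalue);
  return i + 1;
}

/* Hypothetical helper: extra bytes that follow a given first byte. */
static int utf8_extra_bytes(uint8_t first)
{
  return first >= 0xc0 ? table4[first & 0x3f] : 0;
}
```

Encoding U+20AC with this sketch yields the three bytes E2 82 AC, and looking up the lead byte 0xE2 reports two extra bytes.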

Deleted: code/trunk/pcre_try_flipped.c
===================================================================
--- code/trunk/pcre_try_flipped.c    2011-12-28 16:10:09 UTC (rev 835)
+++ code/trunk/pcre_try_flipped.c    2011-12-28 17:16:11 UTC (rev 836)
@@ -1,139 +0,0 @@
-/*************************************************
-*      Perl-Compatible Regular Expressions       *
-*************************************************/
-
-/* PCRE is a library of functions to support regular expressions whose syntax
-and semantics are as close as possible to those of the Perl 5 language.
-
-                       Written by Philip Hazel
-           Copyright (c) 1997-2009 University of Cambridge
-
------------------------------------------------------------------------------
-Redistribution and use in source and binary forms, with or without
-modification, are permitted provided that the following conditions are met:
-
-    * Redistributions of source code must retain the above copyright notice,
-      this list of conditions and the following disclaimer.
-
-    * Redistributions in binary form must reproduce the above copyright
-      notice, this list of conditions and the following disclaimer in the
-      documentation and/or other materials provided with the distribution.
-
-    * Neither the name of the University of Cambridge nor the names of its
-      contributors may be used to endorse or promote products derived from
-      this software without specific prior written permission.
-
-THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
-AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
-IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
-ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE
-LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
-CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
-SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
-INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
-CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
-ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
-POSSIBILITY OF SUCH DAMAGE.
------------------------------------------------------------------------------
-*/
-
-
-/* This module contains an internal function that tests a compiled pattern to
-see if it was compiled with the opposite endianness. If so, it uses an
-auxiliary local function to flip the appropriate bytes. */
-
-
-#ifdef HAVE_CONFIG_H
-#include "config.h"
-#endif
-
-#include "pcre_internal.h"
-
-
-/*************************************************
-*         Flip bytes in an integer               *
-*************************************************/
-
-/* This function is called when the magic number in a regex doesn't match, in
-order to flip its bytes to see if we are dealing with a pattern that was
-compiled on a host of different endianness. If so, this function is used to
-flip other byte values.
-
-Arguments:
-  value        the number to flip
-  n            the number of bytes to flip (assumed to be 2 or 4)
-
-Returns:       the flipped value
-*/
-
-static unsigned long int
-byteflip(unsigned long int value, int n)
-{
-if (n == 2) return ((value & 0x00ff) << 8) | ((value & 0xff00) >> 8);
-return ((value & 0x000000ff) << 24) |
-       ((value & 0x0000ff00) <<  8) |
-       ((value & 0x00ff0000) >>  8) |
-       ((value & 0xff000000) >> 24);
-}
-
-
-
-/*************************************************
-*       Test for a byte-flipped compiled regex   *
-*************************************************/
-
-/* This function is called from pcre_exec(), pcre_dfa_exec(), and also from
-pcre_fullinfo(). Its job is to test whether the regex is byte-flipped - that
-is, it was compiled on a system of opposite endianness. The function is called
-only when the native MAGIC_NUMBER test fails. If the regex is indeed flipped,
-we flip all the relevant values into a different data block, and return it.
-
-Arguments:
-  re               points to the regex
-  study            points to study data, or NULL
-  internal_re      points to a new regex block
-  internal_study   points to a new study block
-
-Returns:           the new block if it is indeed a byte-flipped regex
-                   NULL if it is not
-*/
-
-real_pcre *
-_pcre_try_flipped(const real_pcre *re, real_pcre *internal_re,
-  const pcre_study_data *study, pcre_study_data *internal_study)
-{
-if (byteflip(re->magic_number, sizeof(re->magic_number)) != MAGIC_NUMBER)
-  return NULL;
-
-*internal_re = *re;           /* To copy other fields */
-internal_re->size = byteflip(re->size, sizeof(re->size));
-internal_re->options = byteflip(re->options, sizeof(re->options));
-internal_re->flags = (pcre_uint16)byteflip(re->flags, sizeof(re->flags));
-internal_re->top_bracket =
-  (pcre_uint16)byteflip(re->top_bracket, sizeof(re->top_bracket));
-internal_re->top_backref =
-  (pcre_uint16)byteflip(re->top_backref, sizeof(re->top_backref));
-internal_re->first_byte =
-  (pcre_uint16)byteflip(re->first_byte, sizeof(re->first_byte));
-internal_re->req_byte =
-  (pcre_uint16)byteflip(re->req_byte, sizeof(re->req_byte));
-internal_re->name_table_offset =
-  (pcre_uint16)byteflip(re->name_table_offset, sizeof(re->name_table_offset));
-internal_re->name_entry_size =
-  (pcre_uint16)byteflip(re->name_entry_size, sizeof(re->name_entry_size));
-internal_re->name_count =
-  (pcre_uint16)byteflip(re->name_count, sizeof(re->name_count));
-
-if (study != NULL)
-  {
-  *internal_study = *study;   /* To copy other fields */
-  internal_study->size = byteflip(study->size, sizeof(study->size));
-  internal_study->flags = byteflip(study->flags, sizeof(study->flags));
-  internal_study->minlength = byteflip(study->minlength,
-    sizeof(study->minlength));
-  }
-
-return internal_re;
-}
-
-/* End of pcre_tryflipped.c */
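
Note on the deleted byteflip() helper: it took an unsigned long and a byte count of 2 or 4. The sketch below shows the same swap with fixed-width types, plus a magic-number check in the style described by the deleted comments. It is a minimal illustration, not a replacement taken from the PCRE sources. The names flip16, flip32, looks_byte_flipped and EXAMPLE_MAGIC are hypothetical, and the magic value is an assumed example constant.

```c
#include <assert.h>
#include <stdint.h>

/* Swap the two bytes of a 16-bit value. */
static uint16_t flip16(uint16_t v)
{
  return (uint16_t)((v << 8) | (v >> 8));
}

/* Reverse the four bytes of a 32-bit value. */
static uint32_t flip32(uint32_t v)
{
  return ((v & 0x000000ffu) << 24) |
         ((v & 0x0000ff00u) <<  8) |
         ((v & 0x00ff0000u) >>  8) |
         ((v & 0xff000000u) >> 24);
}

/* Assumed example value; stands in for the library's MAGIC_NUMBER. */
#define EXAMPLE_MAGIC 0x50435245u

/* True if the stored magic number only matches after a byte swap, which
   suggests the data was written on a host of opposite endianness. */
static int looks_byte_flipped(uint32_t stored_magic)
{
  return stored_magic != EXAMPLE_MAGIC &&
         flip32(stored_magic) == EXAMPLE_MAGIC;
}
```

Fixed-width types avoid the dependence on the size of unsigned long that the original helper handled through its explicit byte-count argument.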

Modified: code/trunk/pcre_ucd.c
===================================================================
--- code/trunk/pcre_ucd.c    2011-12-28 16:10:09 UTC (rev 835)
+++ code/trunk/pcre_ucd.c    2011-12-28 17:16:11 UTC (rev 836)
@@ -18,21 +18,21 @@
 /* Instead, just supply small dummy tables. */

#ifndef SUPPORT_UCP
-const ucd_record _pcre_ucd_records[] = {{0,0,0 }};
-const uschar _pcre_ucd_stage1[] = {0};
-const pcre_uint16 _pcre_ucd_stage2[] = {0};
+const ucd_record PRIV(ucd_records)[] = {{0,0,0 }};
+const pcre_uint8 PRIV(ucd_stage1)[] = {0};
+const pcre_uint16 PRIV(ucd_stage2)[] = {0};
#else

/* When recompiling tables with a new Unicode version,
please check types in the structure definition from pcre_internal.h:
typedef struct {
-uschar property_0;
-uschar property_1;
+pcre_uint8 property_0;
+pcre_uint8 property_1;
pcre_int32 property_2;
} ucd_record; */

-const ucd_record _pcre_ucd_records[] = { /* 4320 bytes, record size 8 */
+const ucd_record PRIV(ucd_records)[] = { /* 4320 bytes, record size 8 */
   {     9,      0,      0, }, /*   0 */
   {     9,     29,      0, }, /*   1 */
   {     9,     21,      0, }, /*   2 */
@@ -575,7 +575,7 @@
   {    26,     26,      0, }, /* 539 */
 };

-const uschar _pcre_ucd_stage1[] = { /* 8704 bytes */
+const pcre_uint8 PRIV(ucd_stage1)[] = { /* 8704 bytes */
0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, /* U+0000 */
16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, /* U+0800 */
32, 33, 34, 34, 35, 36, 37, 38, 39, 40, 40, 40, 41, 42, 43, 44, /* U+1000 */
@@ -1122,7 +1122,7 @@
114,114,114,114,114,114,114,114,114,114,114,114,114,114,114,184, /* U+10F800 */
};

-const pcre_uint16 _pcre_ucd_stage2[] = { /* 47360 bytes, block = 128 */
+const pcre_uint16 PRIV(ucd_stage2)[] = { /* 47360 bytes, block = 128 */
/* block 0 */
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
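
Note on the two-stage tables above: stage1 has one entry per block of 128 code points (8704 entries cover U+0000 to U+10FFFF), and stage2 holds a record index for each code point within each distinct block. The sketch below shows how such a pair of tables is typically indexed, using toy tables with a block size of 4. The lookup formula is an assumption here, since the diff shows only the table declarations, and the record field names and table contents are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define BLOCK_SIZE 4   /* toy value; the generated tables use 128 */

/* Hypothetical record layout: two small properties and a case offset. */
typedef struct {
  uint8_t script;
  uint8_t chartype;
  int32_t other_case;
} record;

static const record records[] = {
  { 0, 0,   0 },   /* 0: default record */
  { 1, 5,  32 },   /* 1: made-up "upper-case" record */
  { 1, 6, -32 },   /* 2: made-up "lower-case" record */
};

/* stage1: for each block of code points, the number of a stage2 block.
   Identical blocks share one stage2 block, which is where space is saved. */
static const uint8_t stage1[] = { 0, 1, 0, 2 };   /* code points 0..15 */

/* stage2: record index for each code point inside each distinct block. */
static const uint16_t stage2[] = {
  0, 0, 0, 0,   /* block 0 */
  1, 1, 0, 0,   /* block 1 */
  2, 2, 0, 0,   /* block 2 */
};

/* Two table reads and one multiply find the record for code point c. */
static const record *lookup(unsigned c)
{
  assert(c < sizeof(stage1) * BLOCK_SIZE);
  return &records[stage2[stage1[c / BLOCK_SIZE] * BLOCK_SIZE + c % BLOCK_SIZE]];
}
```

In this toy layout code points 0..3 and 8..11 share stage2 block 0, so both ranges resolve to the default record without storing it twice.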

Modified: code/trunk/pcre_ucp_searchfuncs.c
===================================================================
--- code/trunk/pcre_ucp_searchfuncs.c    2011-12-28 16:10:09 UTC (rev 835)
+++ code/trunk/pcre_ucp_searchfuncs.c    2011-12-28 17:16:11 UTC (rev 836)
@@ -92,7 +92,7 @@
 */

int
-_pcre_ucp_findprop(const unsigned int c, int *type_ptr, int *script_ptr)
+PRIV(ucp_findprop)(const unsigned int c, int *type_ptr, int *script_ptr)
{
int bot = 0;
int top = sizeof(ucp_table)/sizeof(cnode);
@@ -148,7 +148,7 @@
*/

unsigned int
-_pcre_ucp_othercase(const unsigned int c)
+PRIV(ucp_othercase)(const unsigned int c)
{
int bot = 0;
int top = sizeof(ucp_table)/sizeof(cnode);
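
Note on the search functions above: both start from bot = 0 and top = the number of table entries, which is the setup for a binary chop over a sorted table. The sketch below shows that search pattern on a small made-up table. The node layout, the table contents and the function name find_type are hypothetical and are not taken from pcre_ucp_searchfuncs.c.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical table entry: a code point and an associated type value. */
typedef struct {
  unsigned int code;
  int type;
} node;

/* Must be sorted by code for the binary chop to work. */
static const node table[] = {
  { 0x41, 1 }, { 0x61, 2 }, { 0xC0, 1 }, { 0xE0, 2 }, { 0x391, 1 },
};

/* Returns the type for code point c, or -1 if c is not in the table. */
static int find_type(unsigned int c)
{
  int bot = 0;
  int top = (int)(sizeof(table) / sizeof(node));
  while (bot < top)
    {
    int mid = bot + (top - bot) / 2;
    if (c == table[mid].code) return table[mid].type;
    if (c < table[mid].code) top = mid; else bot = mid + 1;
    }
  return -1;
}
```

The half-open interval [bot, top) keeps the loop free of off-by-one adjustments: top is always one past the last candidate.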

Modified: code/trunk/pcre_valid_utf8.c
===================================================================
--- code/trunk/pcre_valid_utf8.c    2011-12-28 16:10:09 UTC (rev 835)
+++ code/trunk/pcre_valid_utf8.c    2011-12-28 17:16:11 UTC (rev 836)
@@ -6,7 +6,7 @@
 and semantics are as close as possible to those of the Perl 5 language.

                        Written by Philip Hazel
-           Copyright (c) 1997-2009 University of Cambridge
+           Copyright (c) 1997-2012 University of Cambridge

-----------------------------------------------------------------------------
Redistribution and use in source and binary forms, with or without
@@ -103,15 +103,15 @@
*/

int
-_pcre_valid_utf8(USPTR string, int length, int *erroroffset)
+PRIV(valid_utf)(PCRE_PUCHAR string, int length, int *erroroffset)
{
-#ifdef SUPPORT_UTF8
-register USPTR p;
+#ifdef SUPPORT_UTF
+register PCRE_PUCHAR p;

if (length < 0)
{
for (p = string; *p != 0; p++);
- length = p - string;
+ length = (int)(p - string);
}

for (p = string; length-- > 0; p++)
@@ -123,20 +123,20 @@

   if (c < 0xc0)                         /* Isolated 10xx xxxx byte */
     {
-    *erroroffset = p - string;
+    *erroroffset = (int)(p - string);
     return PCRE_UTF8_ERR20;
     }

   if (c >= 0xfe)                        /* Invalid 0xfe or 0xff bytes */
     {
-    *erroroffset = p - string;
+    *erroroffset = (int)(p - string);
     return PCRE_UTF8_ERR21;
     }

-  ab = _pcre_utf8_table4[c & 0x3f];     /* Number of additional bytes */
+  ab = PRIV(utf8_table4)[c & 0x3f];     /* Number of additional bytes */
   if (length < ab)
     {
-    *erroroffset = p - string;          /* Missing bytes */
+    *erroroffset = (int)(p - string);          /* Missing bytes */
     return ab - length;                 /* Codes ERR1 to ERR5 */
     }
   length -= ab;                         /* Length remaining */
@@ -145,7 +145,7 @@

   if (((d = *(++p)) & 0xc0) != 0x80)
     {
-    *erroroffset = p - string - 1;
+    *erroroffset = (int)(p - string) - 1;
     return PCRE_UTF8_ERR6;
     }

@@ -160,7 +160,7 @@

     case 1: if ((c & 0x3e) == 0)
       {
-      *erroroffset = p - string - 1;
+      *erroroffset = (int)(p - string) - 1;
       return PCRE_UTF8_ERR15;
       }
     break;
@@ -172,17 +172,17 @@
     case 2:
     if ((*(++p) & 0xc0) != 0x80)     /* Third byte */
       {
-      *erroroffset = p - string - 2;
+      *erroroffset = (int)(p - string) - 2;
       return PCRE_UTF8_ERR7;
       }
     if (c == 0xe0 && (d & 0x20) == 0)
       {
-      *erroroffset = p - string - 2;
+      *erroroffset = (int)(p - string) - 2;
       return PCRE_UTF8_ERR16;
       }
     if (c == 0xed && d >= 0xa0)
       {
-      *erroroffset = p - string - 2;
+      *erroroffset = (int)(p - string) - 2;
       return PCRE_UTF8_ERR14;
       }
     break;
@@ -194,22 +194,22 @@
     case 3:
     if ((*(++p) & 0xc0) != 0x80)     /* Third byte */
       {
-      *erroroffset = p - string - 2;
+      *erroroffset = (int)(p - string) - 2;
       return PCRE_UTF8_ERR7;
       }
     if ((*(++p) & 0xc0) != 0x80)     /* Fourth byte */
       {
-      *erroroffset = p - string - 3;
+      *erroroffset = (int)(p - string) - 3;
       return PCRE_UTF8_ERR8;
       }
     if (c == 0xf0 && (d & 0x30) == 0)
       {
-      *erroroffset = p - string - 3;
+      *erroroffset = (int)(p - string) - 3;
       return PCRE_UTF8_ERR17;
       }
     if (c > 0xf4 || (c == 0xf4 && d > 0x8f))
       {
-      *erroroffset = p - string - 3;
+      *erroroffset = (int)(p - string) - 3;
       return PCRE_UTF8_ERR13;
       }
     break;
@@ -225,22 +225,22 @@
     case 4:
     if ((*(++p) & 0xc0) != 0x80)     /* Third byte */
       {
-      *erroroffset = p - string - 2;
+      *erroroffset = (int)(p - string) - 2;
       return PCRE_UTF8_ERR7;
       }
     if ((*(++p) & 0xc0) != 0x80)     /* Fourth byte */
       {
-      *erroroffset = p - string - 3;
+      *erroroffset = (int)(p - string) - 3;
       return PCRE_UTF8_ERR8;
       }
     if ((*(++p) & 0xc0) != 0x80)     /* Fifth byte */
       {
-      *erroroffset = p - string - 4;
+      *erroroffset = (int)(p - string) - 4;
       return PCRE_UTF8_ERR9;
       }
     if (c == 0xf8 && (d & 0x38) == 0)
       {
-      *erroroffset = p - string - 4;
+      *erroroffset = (int)(p - string) - 4;
       return PCRE_UTF8_ERR18;
       }
     break;
@@ -251,27 +251,27 @@
     case 5:
     if ((*(++p) & 0xc0) != 0x80)     /* Third byte */
       {
-      *erroroffset = p - string - 2;
+      *erroroffset = (int)(p - string) - 2;
       return PCRE_UTF8_ERR7;
       }
     if ((*(++p) & 0xc0) != 0x80)     /* Fourth byte */
       {
-      *erroroffset = p - string - 3;
+      *erroroffset = (int)(p - string) - 3;
       return PCRE_UTF8_ERR8;
       }
     if ((*(++p) & 0xc0) != 0x80)     /* Fifth byte */
       {
-      *erroroffset = p - string - 4;
+      *erroroffset = (int)(p - string) - 4;
       return PCRE_UTF8_ERR9;
       }
     if ((*(++p) & 0xc0) != 0x80)     /* Sixth byte */
       {
-      *erroroffset = p - string - 5;
+      *erroroffset = (int)(p - string) - 5;
       return PCRE_UTF8_ERR10;
       }
     if (c == 0xfc && (d & 0x3c) == 0)
       {
-      *erroroffset = p - string - 5;
+      *erroroffset = (int)(p - string) - 5;
       return PCRE_UTF8_ERR19;
       }
     break;
@@ -283,12 +283,12 @@

   if (ab > 3)
     {
-    *erroroffset = p - string - ab;
+    *erroroffset = (int)(p - string) - ab;
     return (ab == 4)? PCRE_UTF8_ERR11 : PCRE_UTF8_ERR12;
     }
   }

-#else /* SUPPORT_UTF8 */
+#else /* SUPPORT_UTF */
(void)(string); /* Keep picky compilers happy */
(void)(length);
#endif
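
The casts in the hunks above make one narrowing explicit: p - string has type
ptrdiff_t, which is wider than int on common 64-bit platforms, while
*erroroffset is an int. A minimal sketch of the same idea follows. The helper
sketch_check_utf8() is hypothetical (it is not a PCRE function) and checks far
less than the real validator; it only shows where the cast goes.

```c
#include <stddef.h>

/* Hypothetical helper, not part of PCRE. Returns 0 when every multi-byte
   sequence has the expected continuation bytes; otherwise returns 1 and
   stores the offset of the offending lead byte in *erroroffset. */
static int sketch_check_utf8(const unsigned char *string, int length,
                             int *erroroffset)
{
  const unsigned char *p = string;
  const unsigned char *end = string + length;

  while (p < end)
    {
    const unsigned char *start = p;
    unsigned int c = *p++;
    int ab;                                /* additional bytes expected */

    if (c < 0x80) continue;                /* plain ASCII */

    if (c < 0xc0 || c >= 0xf8)             /* stray continuation or bad lead */
      {
      *erroroffset = (int)(start - string);  /* ptrdiff_t narrowed on purpose */
      return 1;
      }

    ab = (c >= 0xf0)? 3 : (c >= 0xe0)? 2 : 1;

    if (end - p < ab)                      /* sequence runs off the end */
      {
      *erroroffset = (int)(start - string);
      return 1;
      }

    for (; ab > 0; ab--)
      {
      if ((*p++ & 0xc0) != 0x80)           /* not a 10xxxxxx byte */
        {
        *erroroffset = (int)(start - string);
        return 1;
        }
      }
    }

  return 0;
}
```

The cast is safe here only because the length argument is itself an int, so
the difference cannot exceed INT_MAX.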

Modified: code/trunk/pcre_version.c
===================================================================
--- code/trunk/pcre_version.c    2011-12-28 16:10:09 UTC (rev 835)
+++ code/trunk/pcre_version.c    2011-12-28 17:16:11 UTC (rev 836)
@@ -6,7 +6,7 @@
 and semantics are as close as possible to those of the Perl 5 language.

                        Written by Philip Hazel
-           Copyright (c) 1997-2008 University of Cambridge
+           Copyright (c) 1997-2012 University of Cambridge

-----------------------------------------------------------------------------
Redistribution and use in source and binary forms, with or without
@@ -79,8 +79,13 @@
pre-processor time. This hack uses a standard trick for avoiding calling
the STRING macro with an empty argument when doing the test. */

+#ifdef COMPILE_PCRE8
PCRE_EXP_DEFN const char * PCRE_CALL_CONVENTION
pcre_version(void)
+#else
+PCRE_EXP_DEFN const char * PCRE_CALL_CONVENTION
+pcre16_version(void)
+#endif
{
return (XSTRING(Z PCRE_PRERELEASE)[1] == 0)?
XSTRING(PCRE_MAJOR.PCRE_MINOR PCRE_DATE) :
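
The hunk above selects the exported name with #ifdef COMPILE_PCRE8. Other
parts of this commit use the PRIV() and PUBL() macros for the same purpose;
their definitions are not shown in this excerpt. A rough sketch of how a
token-pasting macro can pick a per-mode name is given below. All identifiers
in it (SKETCH_MODE16, SKETCH_PUBL, sketch16_version) are hypothetical.

```c
#include <string.h>

/* Pretend this translation unit is being compiled for 16-bit mode. */
#define SKETCH_MODE16 1

/* Build the exported name from a common stem; the prefix depends on the
   mode the file is compiled in. */
#if SKETCH_MODE16
#define SKETCH_PUBL(name) sketch16_##name
#else
#define SKETCH_PUBL(name) sketch_##name
#endif

/* Expands to sketch16_version() here, or to sketch_version() in the
   8-bit configuration. */
static const char *SKETCH_PUBL(version)(void)
{
  return "0.0 sketch";
}
```

With this arrangement one source file compiled twice yields two distinct
symbols, so both libraries can be linked into the same program.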

Modified: code/trunk/pcre_xclass.c
===================================================================
--- code/trunk/pcre_xclass.c    2011-12-28 16:10:09 UTC (rev 835)
+++ code/trunk/pcre_xclass.c    2011-12-28 17:16:11 UTC (rev 836)
@@ -6,7 +6,7 @@
 and semantics are as close as possible to those of the Perl 5 language.

                        Written by Philip Hazel
-           Copyright (c) 1997-2010 University of Cambridge
+           Copyright (c) 1997-2012 University of Cambridge

-----------------------------------------------------------------------------
Redistribution and use in source and binary forms, with or without
@@ -64,39 +64,63 @@
*/

BOOL
-_pcre_xclass(int c, const uschar *data)
+PRIV(xclass)(int c, const pcre_uchar *data, BOOL utf)
{
int t;
BOOL negated = (*data & XCL_NOT) != 0;

+(void)utf;
+#ifdef COMPILE_PCRE8
+/* In 8 bit mode, this must always be TRUE. Help the compiler to know that. */
+utf = TRUE;
+#endif
+
/* Character values < 256 are matched against a bitmap, if one is present. If
not, we still carry on, because there may be ranges that start below 256 in the
additional data. */

 if (c < 256)
   {
-  if ((*data & XCL_MAP) != 0 && (data[1 + c/8] & (1 << (c&7))) != 0)
-    return !negated;   /* char found */
+  if ((*data & XCL_MAP) != 0 &&
+    (((pcre_uint8 *)(data + 1))[c/8] & (1 << (c&7))) != 0)
+    return !negated; /* char found */
   }

/* First skip the bit map if present. Then match against the list of Unicode
properties or large chars or ranges that end with a large char. We won't ever
encounter XCL_PROP or XCL_NOTPROP when UCP support is not compiled. */

-if ((*data++ & XCL_MAP) != 0) data += 32;
+if ((*data++ & XCL_MAP) != 0) data += 32 / sizeof(pcre_uchar);

 while ((t = *data++) != XCL_END)
   {
   int x, y;
   if (t == XCL_SINGLE)
     {
-    GETCHARINC(x, data);
+#ifdef SUPPORT_UTF
+    if (utf)
+      {
+      GETCHARINC(x, data); /* macro generates multiple statements */
+      }
+    else
+#endif
+      x = *data++;
     if (c == x) return !negated;
     }
   else if (t == XCL_RANGE)
     {
-    GETCHARINC(x, data);
-    GETCHARINC(y, data);
+#ifdef SUPPORT_UTF
+    if (utf)
+      {
+      GETCHARINC(x, data); /* macro generates multiple statements */
+      GETCHARINC(y, data); /* macro generates multiple statements */
+      }
+    else
+#endif
+      {
+      x = *data++;
+      y = *data++;
+      }
     if (c >= x && c <= y) return !negated;
     }

@@ -117,7 +141,7 @@
       break;

       case PT_GC:
-      if ((data[1] == _pcre_ucp_gentype[prop->chartype]) == (t == XCL_PROP))
+      if ((data[1] == PRIV(ucp_gentype)[prop->chartype]) == (t == XCL_PROP))
         return !negated;
       break;

@@ -130,28 +154,28 @@
       break;

       case PT_ALNUM:
-      if ((_pcre_ucp_gentype[prop->chartype] == ucp_L ||
-           _pcre_ucp_gentype[prop->chartype] == ucp_N) == (t == XCL_PROP))
+      if ((PRIV(ucp_gentype)[prop->chartype] == ucp_L ||
+           PRIV(ucp_gentype)[prop->chartype] == ucp_N) == (t == XCL_PROP))
         return !negated;
       break;

       case PT_SPACE:    /* Perl space */
-      if ((_pcre_ucp_gentype[prop->chartype] == ucp_Z ||
+      if ((PRIV(ucp_gentype)[prop->chartype] == ucp_Z ||
            c == CHAR_HT || c == CHAR_NL || c == CHAR_FF || c == CHAR_CR)
              == (t == XCL_PROP))
         return !negated;
       break;

       case PT_PXSPACE:  /* POSIX space */
-      if ((_pcre_ucp_gentype[prop->chartype] == ucp_Z ||
+      if ((PRIV(ucp_gentype)[prop->chartype] == ucp_Z ||
            c == CHAR_HT || c == CHAR_NL || c == CHAR_VT ||
            c == CHAR_FF || c == CHAR_CR) == (t == XCL_PROP))
         return !negated;
       break;

       case PT_WORD:
-      if ((_pcre_ucp_gentype[prop->chartype] == ucp_L ||
-           _pcre_ucp_gentype[prop->chartype] == ucp_N || c == CHAR_UNDERSCORE)
+      if ((PRIV(ucp_gentype)[prop->chartype] == ucp_L ||
+           PRIV(ucp_gentype)[prop->chartype] == ucp_N || c == CHAR_UNDERSCORE)
              == (t == XCL_PROP))
         return !negated;
       break;
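
The bitmap changes in PRIV(xclass) follow from the code unit width: the map is
always 32 bytes (256 bits), so it occupies 32 code units in 8-bit mode but only
16 in 16-bit mode, and it has to be indexed through a byte pointer. A small
sketch under those assumptions, with sketch_uchar standing in for pcre_uchar
in a 16-bit build (all names here are hypothetical):

```c
#include <stdint.h>
#include <string.h>

typedef uint16_t sketch_uchar;   /* stand-in for a 16-bit pcre_uchar */

/* Test bit c (0..255) of the 32-byte bitmap that starts at data[1]. The map
   is viewed as bytes regardless of the code unit width. */
static int sketch_in_map(const sketch_uchar *data, int c)
{
  const uint8_t *map = (const uint8_t *)(data + 1);
  return (map[c/8] & (1 << (c&7))) != 0;
}

/* Number of code units to skip in order to step over the bitmap. */
static size_t sketch_map_units(void)
{
  return 32 / sizeof(sketch_uchar);
}
```

Indexing data[1 + c/8] directly, as the old 8-bit code did, would read 16-bit
units and test the wrong bits.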

Modified: code/trunk/pcregrep.c
===================================================================
--- code/trunk/pcregrep.c    2011-12-28 16:10:09 UTC (rev 835)
+++ code/trunk/pcregrep.c    2011-12-28 17:16:11 UTC (rev 836)
@@ -6,7 +6,7 @@
 its pattern matching. On a Unix or Win32 system it can recurse into
 directories.

-           Copyright (c) 1997-2011 University of Cambridge
+           Copyright (c) 1997-2012 University of Cambridge

 -----------------------------------------------------------------------------
 Redistribution and use in source and binary forms, with or without
@@ -1410,7 +1410,7 @@
         and its line-ending characters (if they matched the pattern), so there
         may be no more to print. */

-        plength = (linelength + endlinelength) - startoffset;
+        plength = (int)((linelength + endlinelength) - startoffset);
         if (plength > 0) FWRITE(ptr + startoffset, 1, plength, stdout);
         }

@@ -1462,7 +1462,7 @@

   if (input_line_buffered && bufflength < (size_t)bufsize)
     {
-    int add = read_one_line(ptr, bufsize - (ptr - main_buffer), in);
+    int add = read_one_line(ptr, bufsize - (int)(ptr - main_buffer), in);
     bufflength += add;
     endptr += add;
     }

Modified: code/trunk/pcreposix.c
===================================================================
--- code/trunk/pcreposix.c    2011-12-28 16:10:09 UTC (rev 835)
+++ code/trunk/pcreposix.c    2011-12-28 17:16:11 UTC (rev 836)
@@ -6,7 +6,7 @@
 and semantics are as close as possible to those of the Perl 5 language.

                        Written by Philip Hazel
-           Copyright (c) 1997-2010 University of Cambridge
+           Copyright (c) 1997-2012 University of Cambridge

-----------------------------------------------------------------------------
Redistribution and use in source and binary forms, with or without
@@ -154,7 +154,10 @@
REG_BADPAT, /* \c must be followed by an ASCII character */
REG_BADPAT, /* \k is not followed by a braced, angle-bracketed, or quoted name */
/* 70 */
- REG_BADPAT, /* internal error: unknown opcode in find_fixedlength() */
+ REG_BADPAT, /* internal error: unknown opcode in find_fixedlength() */
+ REG_BADPAT, /* \N is not supported in a class */
+ REG_BADPAT, /* too many forward references */
+ REG_BADPAT, /* disallowed UTF-8/16 code point (>= 0xd800 && <= 0xdfff) */
};

/* Table of texts corresponding to POSIX error codes */
@@ -225,7 +228,7 @@
PCREPOSIX_EXP_DEFN void PCRE_CALL_CONVENTION
regfree(regex_t *preg)
{
-(pcre_free)(preg->re_pcre);
+(PUBL(free))(preg->re_pcre);
}

@@ -274,7 +277,8 @@
     eint[errorcode] : REG_BADPAT;
   }

-preg->re_nsub = pcre_info((const pcre *)preg->re_pcre, NULL, NULL);
+(void)pcre_fullinfo((const pcre *)preg->re_pcre, NULL, PCRE_INFO_CAPTURECOUNT,
+ &(preg->re_nsub));
return 0;
}

@@ -400,6 +404,7 @@
case PCRE_ERROR_MATCHLIMIT: return REG_ESPACE;
case PCRE_ERROR_BADUTF8: return REG_INVARG;
case PCRE_ERROR_BADUTF8_OFFSET: return REG_INVARG;
+ case PCRE_ERROR_BADMODE: return REG_INVARG;
default: return REG_ASSERT;
}
}
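
One point to watch in the regcomp() change above: the PCRE documentation says
that PCRE_INFO_CAPTURECOUNT stores its result in an int, while POSIX declares
re_nsub as size_t (the pcreposix.h declaration is not shown in this excerpt,
so that part is an assumption). Where the two types differ in width, going
through an int temporary avoids a partial write. The sketch below uses a
stand-in query function rather than the real library, and every name in it is
hypothetical.

```c
#include <stddef.h>

/* Stand-in for a query that stores an int through a void pointer, in the
   style of an "info" call. */
static int sketch_query_count(void *where)
{
  *(int *)where = 3;
  return 0;
}

struct sketch_regex { size_t re_nsub; };

/* Fetch the count into an int first, then widen it explicitly. */
static int sketch_fill_nsub(struct sketch_regex *preg)
{
  int count = 0;
  int rc = sketch_query_count(&count);
  preg->re_nsub = (size_t)count;
  return rc;
}
```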

Modified: code/trunk/pcreposix.h
===================================================================
--- code/trunk/pcreposix.h    2011-12-28 16:10:09 UTC (rev 835)
+++ code/trunk/pcreposix.h    2011-12-28 17:16:11 UTC (rev 836)
@@ -9,7 +9,7 @@
 Compatible Regular Expression library. It defines the things POSIX says should
 be there. I hope.

-            Copyright (c) 1997-2009 University of Cambridge
+            Copyright (c) 1997-2012 University of Cambridge

-----------------------------------------------------------------------------
Redistribution and use in source and binary forms, with or without

Modified: code/trunk/pcretest.c
===================================================================
--- code/trunk/pcretest.c    2011-12-28 16:10:09 UTC (rev 835)
+++ code/trunk/pcretest.c    2011-12-28 17:16:11 UTC (rev 836)
@@ -4,7 +4,8 @@

/* This program was hacked up as a tester for PCRE. I really should have
written it more tidily in the first place. Will I ever learn? It has grown and
-been extended and consequently is now rather, er, *very* untidy in places.
+been extended and consequently is now rather, er, *very* untidy in places. The
+addition of 16-bit support has made it even worse. :-(

-----------------------------------------------------------------------------
Redistribution and use in source and binary forms, with or without
@@ -35,7 +36,17 @@
-----------------------------------------------------------------------------
*/

+/* This program now supports the testing of both the 8-bit and 16-bit PCRE
+libraries in a single program. This is different from the modules such as
+pcre_compile.c in the library itself, which are compiled separately for each
+mode. If both modes are enabled, for example, pcre_compile.c is compiled twice
+(the second time with COMPILE_PCRE16 defined). By contrast, pcretest.c is
+compiled only once. Therefore, it must not make use of any of the macros from
+pcre_internal.h that depend on COMPILE_PCRE8 or COMPILE_PCRE16. It does,
+however, make use of SUPPORT_PCRE8 and SUPPORT_PCRE16 to ensure that it calls
+only supported library functions. */

+
#ifdef HAVE_CONFIG_H
#include "config.h"
#endif
@@ -105,43 +116,55 @@
appropriately for an application, not for building PCRE. */

#include "pcre.h"
+
+#if defined SUPPORT_PCRE16 && !defined SUPPORT_PCRE8
+/* Configure internal macros to 16 bit mode. */
+#define COMPILE_PCRE16
+#endif
+
#include "pcre_internal.h"

+/* The pcre_printint() function, which prints the internal form of a compiled
+regex, is held in a separate file so that (a) it can be compiled in either
+8-bit or 16-bit mode, and (b) it can be #included directly in pcre_compile.c
+when that is compiled in debug mode. */
+
+#ifdef SUPPORT_PCRE8
+void pcre_printint(pcre *external_re, FILE *f, BOOL print_lengths);
+#endif
+#ifdef SUPPORT_PCRE16
+void pcre16_printint(pcre *external_re, FILE *f, BOOL print_lengths);
+#endif
+
/* We need access to some of the data tables that PCRE uses. So as not to have
to keep two copies, we include the source file here, changing the names of the
external symbols to prevent clashes. */

-#define _pcre_ucp_gentype      ucp_gentype
-#define _pcre_ucp_typerange    ucp_typerange
-#define _pcre_utf8_table1      utf8_table1
-#define _pcre_utf8_table1_size utf8_table1_size
-#define _pcre_utf8_table2      utf8_table2
-#define _pcre_utf8_table3      utf8_table3
-#define _pcre_utf8_table4      utf8_table4
-#define _pcre_utf8_char_sizes  utf8_char_sizes
-#define _pcre_utt              utt
-#define _pcre_utt_size         utt_size
-#define _pcre_utt_names        utt_names
-#define _pcre_OP_lengths       OP_lengths
+#define PCRE_INCLUDED
+#undef PRIV
+#define PRIV(name) name

#include "pcre_tables.c"

-/* We also need the pcre_printint() function for printing out compiled
-patterns. This function is in a separate file so that it can be included in
-pcre_compile.c when that module is compiled with debugging enabled. It needs to
-know which case is being compiled. */
-
-#define COMPILING_PCRETEST
-#include "pcre_printint.src"
-
/* The definition of the macro PRINTABLE, which determines whether to print an
output character as-is or as a hex value when showing compiled patterns, is
-contained in the printint.src file. We uses it here also, in cases when the
-locale has not been explicitly changed, so as to get consistent output from
-systems that differ in their output from isprint() even in the "C" locale. */
+the same as in the printint.src file. We use it here in cases when the locale
+has not been explicitly changed, so as to get consistent output from systems
+that differ in their output from isprint() even in the "C" locale. */

-#define PRINTHEX(c) (locale_set? isprint(c) : PRINTABLE(c))
+#ifdef EBCDIC
+#define PRINTABLE(c) ((c) >= 64 && (c) < 255)
+#else
+#define PRINTABLE(c) ((c) >= 32 && (c) < 127)
+#endif

+#define PRINTOK(c) (locale_set? isprint(c) : PRINTABLE(c))
+
+/* Posix support is disabled in 16 bit only mode. */
+#if defined SUPPORT_PCRE16 && !defined SUPPORT_PCRE8 && !defined NOPOSIX
+#define NOPOSIX
+#endif
+
/* It is possible to compile this test program without including support for
testing the POSIX interface, though this is not available via the standard
Makefile. */
@@ -150,19 +173,391 @@
#include "pcreposix.h"
#endif

-/* It is also possible, for the benefit of the version currently imported into
-Exim, to build pcretest without support for UTF8 (define NOUTF8), without the
-interface to the DFA matcher (NODFA), and without the doublecheck of the old
-"info" function (define NOINFOCHECK). In fact, we automatically cut out the
-UTF8 support if PCRE is built without it. */
+/* It is also possible, originally for the benefit of a version that was
+imported into Exim, to build pcretest without support for UTF8 or UTF16 (define
+NOUTF) or without the interface to the DFA matcher (define NODFA). In fact, we
+automatically cut out the UTF support if PCRE is built without it. */

-#ifndef SUPPORT_UTF8
-#ifndef NOUTF8
-#define NOUTF8
+#ifndef SUPPORT_UTF
+#ifndef NOUTF
+#define NOUTF
#endif
#endif

+/* To make the code a bit tidier for 8-bit and 16-bit support, we define macros
+for all the pcre[16]_xxx functions (except pcre16_fullinfo, which is called
+only from one place and is handled differently). I couldn't dream up any way of
+using a single macro to do this in a generic way, because of the many different
+argument requirements. We know that at least one of SUPPORT_PCRE8 and
+SUPPORT_PCRE16 must be set. First define macros for each individual mode; then
+use these in the definitions of generic macros.

+**** Special note about the PCHARSxxx macros: the address of the string to be
+printed is always given as two arguments: a base address followed by an offset.
+The base address is cast to the correct data size for 8 or 16 bit data; the
+offset is in units of this size. If the string were given as base+offset in one
+argument, the casting might be incorrectly applied. */
+
+#ifdef SUPPORT_PCRE8
+
+#define PCHARS8(lv, p, offset, len, f) \
+  lv = pchars((pcre_uint8 *)(p) + offset, len, f)
+
+#define PCHARSV8(p, offset, len, f) \
+  (void)pchars((pcre_uint8 *)(p) + offset, len, f)
+
+#define READ_CAPTURE_NAME8(p, cn8, cn16, re) \
+  p = read_capture_name8(p, cn8, re)
+
+#define SET_PCRE_CALLOUT8(callout) \
+  pcre_callout = callout
+
+#define STRLEN8(p) ((int)strlen((char *)p))
+
+
+#define PCRE_COMPILE8(re, pat, options, error, erroffset, tables) \
+  re = pcre_compile((char *)pat, options, error, erroffset, tables)
+
+#define PCRE_COPY_NAMED_SUBSTRING8(rc, re, bptr, offsets, count, \
+    namesptr, cbuffer, size) \
+  rc = pcre_copy_named_substring(re, (char *)bptr, offsets, count, \
+    (char *)namesptr, cbuffer, size)
+
+#define PCRE_COPY_SUBSTRING8(rc, bptr, offsets, count, i, cbuffer, size) \
+  rc = pcre_copy_substring((char *)bptr, offsets, count, i, cbuffer, size)
+
+#define PCRE_DFA_EXEC8(count, re, extra, bptr, len, start_offset, options, \
+    offsets, size_offsets, workspace, size_workspace) \
+  count = pcre_dfa_exec(re, extra, (char *)bptr, len, start_offset, options, \
+    offsets, size_offsets, workspace, size_workspace)
+
+#define PCRE_EXEC8(count, re, extra, bptr, len, start_offset, options, \
+    offsets, size_offsets) \
+  count = pcre_exec(re, extra, (char *)bptr, len, start_offset, options, \
+    offsets, size_offsets)
+
+#define PCRE_FREE_STUDY8(extra) \
+  pcre_free_study(extra)
+
+#define PCRE_FREE_SUBSTRING8(substring) \
+  pcre_free_substring(substring)
+
+#define PCRE_FREE_SUBSTRING_LIST8(listptr) \
+  pcre_free_substring_list(listptr)
+
+#define PCRE_GET_NAMED_SUBSTRING8(rc, re, bptr, offsets, count, \
+    getnamesptr, subsptr) \
+  rc = pcre_get_named_substring(re, (char *)bptr, offsets, count, \
+    (char *)getnamesptr, subsptr)
+
+#define PCRE_GET_STRINGNUMBER8(n, rc, ptr) \
+  n = pcre_get_stringnumber(re, (char *)ptr)
+
+#define PCRE_GET_SUBSTRING8(rc, bptr, offsets, count, i, subsptr) \
+  rc = pcre_get_substring((char *)bptr, offsets, count, i, subsptr)
+
+#define PCRE_GET_SUBSTRING_LIST8(rc, bptr, offsets, count, listptr) \
+  rc = pcre_get_substring_list((const char *)bptr, offsets, count, listptr)
+
+#define PCRE_PATTERN_TO_HOST_BYTE_ORDER8(re, extra, tables) \
+  pcre_pattern_to_host_byte_order(re, extra, tables)
+
+#define PCRE_PRINTINT8(re, outfile, debug_lengths) \
+  pcre_printint(re, outfile, debug_lengths)
+
+#define PCRE_STUDY8(extra, re, options, error) \
+  extra = pcre_study(re, options, error)
+
+#endif /* SUPPORT_PCRE8 */
+
+/* -----------------------------------------------------------*/
+
+#ifdef SUPPORT_PCRE16
+
+#define PCHARS16(lv, p, offset, len, f) \
+  lv = pchars16((PCRE_SPTR16)(p) + offset, len, f)
+
+#define PCHARSV16(p, offset, len, f) \
+  (void)pchars16((PCRE_SPTR16)(p) + offset, len, f)
+
+#define READ_CAPTURE_NAME16(p, cn8, cn16, re) \
+  p = read_capture_name16(p, cn16, re)
+
+#define STRLEN16(p) ((int)strlen16((PCRE_SPTR16)p))
+
+#define SET_PCRE_CALLOUT16(callout) \
+  pcre16_callout = callout
+
+
+#define PCRE_COMPILE16(re, pat, options, error, erroffset, tables) \
+  re = pcre16_compile((PCRE_SPTR16)pat, options, error, erroffset, tables)
+
+#define PCRE_COPY_NAMED_SUBSTRING16(rc, re, bptr, offsets, count, \
+    namesptr, cbuffer, size) \
+  rc = pcre16_copy_named_substring(re, (PCRE_SPTR16)bptr, offsets, count, \
+    (PCRE_SPTR16)namesptr, (PCRE_SCHAR16 *)cbuffer, size/2)
+
+#define PCRE_COPY_SUBSTRING16(rc, bptr, offsets, count, i, cbuffer, size) \
+  rc = pcre16_copy_substring((PCRE_SPTR16)bptr, offsets, count, i, \
+    (PCRE_SCHAR16 *)cbuffer, size/2)
+
+#define PCRE_DFA_EXEC16(count, re, extra, bptr, len, start_offset, options, \
+    offsets, size_offsets, workspace, size_workspace) \
+  count = pcre16_dfa_exec(re, extra, (PCRE_SPTR16)bptr, len, start_offset, \
+    options, offsets, size_offsets, workspace, size_workspace)
+
+#define PCRE_EXEC16(count, re, extra, bptr, len, start_offset, options, \
+    offsets, size_offsets) \
+  count = pcre16_exec(re, extra, (PCRE_SPTR16)bptr, len, start_offset, \
+    options, offsets, size_offsets)
+
+#define PCRE_FREE_STUDY16(extra) \
+  pcre16_free_study(extra)
+
+#define PCRE_FREE_SUBSTRING16(substring) \
+  pcre16_free_substring((PCRE_SPTR16)substring)
+
+#define PCRE_FREE_SUBSTRING_LIST16(listptr) \
+  pcre16_free_substring_list((PCRE_SPTR16 *)listptr)
+
+#define PCRE_GET_NAMED_SUBSTRING16(rc, re, bptr, offsets, count, \
+    getnamesptr, subsptr) \
+  rc = pcre16_get_named_substring(re, (PCRE_SPTR16)bptr, offsets, count, \
+    (PCRE_SPTR16)getnamesptr, (PCRE_SPTR16 *)(void*)subsptr)
+
+#define PCRE_GET_STRINGNUMBER16(n, rc, ptr) \
+  n = pcre16_get_stringnumber(re, (PCRE_SPTR16)ptr)
+
+#define PCRE_GET_SUBSTRING16(rc, bptr, offsets, count, i, subsptr) \
+  rc = pcre16_get_substring((PCRE_SPTR16)bptr, offsets, count, i, \
+    (PCRE_SPTR16 *)(void*)subsptr)
+
+#define PCRE_GET_SUBSTRING_LIST16(rc, bptr, offsets, count, listptr) \
+  rc = pcre16_get_substring_list((PCRE_SPTR16)bptr, offsets, count, \
+    (PCRE_SPTR16 **)(void*)listptr)
+
+#define PCRE_PATTERN_TO_HOST_BYTE_ORDER16(re, extra, tables) \
+  pcre16_pattern_to_host_byte_order(re, extra, tables)
+
+#define PCRE_PRINTINT16(re, outfile, debug_lengths) \
+  pcre16_printint(re, outfile, debug_lengths)
+
+#define PCRE_STUDY16(extra, re, options, error) \
+  extra = pcre16_study(re, options, error)
+
+#endif /* SUPPORT_PCRE16 */
+
+
+/* ----- Both modes are supported; a runtime test is needed, except for
+pcre_config(), and the JIT stack functions, when it doesn't matter which
+version is called. ----- */
+
+#if defined SUPPORT_PCRE8 && defined SUPPORT_PCRE16
+
+#define CHAR_SIZE (use_pcre16? 2:1)
+
+#define PCHARS(lv, p, offset, len, f) \
+  if (use_pcre16) \
+    PCHARS16(lv, p, offset, len, f); \
+  else \
+    PCHARS8(lv, p, offset, len, f)
+
+#define PCHARSV(p, offset, len, f) \
+  if (use_pcre16) \
+    PCHARSV16(p, offset, len, f); \
+  else \
+    PCHARSV8(p, offset, len, f)
+
+#define READ_CAPTURE_NAME(p, cn8, cn16, re) \
+  if (use_pcre16) \
+    READ_CAPTURE_NAME16(p, cn8, cn16, re); \
+  else \
+    READ_CAPTURE_NAME8(p, cn8, cn16, re)
+
+#define SET_PCRE_CALLOUT(callout) \
+  if (use_pcre16) \
+    SET_PCRE_CALLOUT16(callout); \
+  else \
+    SET_PCRE_CALLOUT8(callout)
+
+#define STRLEN(p) (use_pcre16? STRLEN16(p) : STRLEN8(p))
+
+#define PCRE_ASSIGN_JIT_STACK pcre_assign_jit_stack
+
+#define PCRE_COMPILE(re, pat, options, error, erroffset, tables) \
+  if (use_pcre16) \
+    PCRE_COMPILE16(re, pat, options, error, erroffset, tables); \
+  else \
+    PCRE_COMPILE8(re, pat, options, error, erroffset, tables)
+
+#define PCRE_CONFIG pcre_config
+
+#define PCRE_COPY_NAMED_SUBSTRING(rc, re, bptr, offsets, count, \
+    namesptr, cbuffer, size) \
+  if (use_pcre16) \
+    PCRE_COPY_NAMED_SUBSTRING16(rc, re, bptr, offsets, count, \
+      namesptr, cbuffer, size); \
+  else \
+    PCRE_COPY_NAMED_SUBSTRING8(rc, re, bptr, offsets, count, \
+      namesptr, cbuffer, size)
+
+#define PCRE_COPY_SUBSTRING(rc, bptr, offsets, count, i, cbuffer, size) \
+  if (use_pcre16) \
+    PCRE_COPY_SUBSTRING16(rc, bptr, offsets, count, i, cbuffer, size); \
+  else \
+    PCRE_COPY_SUBSTRING8(rc, bptr, offsets, count, i, cbuffer, size)
+
+#define PCRE_DFA_EXEC(count, re, extra, bptr, len, start_offset, options, \
+    offsets, size_offsets, workspace, size_workspace) \
+  if (use_pcre16) \
+    PCRE_DFA_EXEC16(count, re, extra, bptr, len, start_offset, options, \
+      offsets, size_offsets, workspace, size_workspace); \
+  else \
+    PCRE_DFA_EXEC8(count, re, extra, bptr, len, start_offset, options, \
+      offsets, size_offsets, workspace, size_workspace)
+
+#define PCRE_EXEC(count, re, extra, bptr, len, start_offset, options, \
+    offsets, size_offsets) \
+  if (use_pcre16) \
+    PCRE_EXEC16(count, re, extra, bptr, len, start_offset, options, \
+      offsets, size_offsets); \
+  else \
+    PCRE_EXEC8(count, re, extra, bptr, len, start_offset, options, \
+      offsets, size_offsets)
+
+#define PCRE_FREE_STUDY(extra) \
+  if (use_pcre16) \
+    PCRE_FREE_STUDY16(extra); \
+  else \
+    PCRE_FREE_STUDY8(extra)
+
+#define PCRE_FREE_SUBSTRING(substring) \
+  if (use_pcre16) \
+    PCRE_FREE_SUBSTRING16(substring); \
+  else \
+    PCRE_FREE_SUBSTRING8(substring)
+
+#define PCRE_FREE_SUBSTRING_LIST(listptr) \
+  if (use_pcre16) \
+    PCRE_FREE_SUBSTRING_LIST16(listptr); \
+  else \
+    PCRE_FREE_SUBSTRING_LIST8(listptr)
+
+#define PCRE_GET_NAMED_SUBSTRING(rc, re, bptr, offsets, count, \
+    getnamesptr, subsptr) \
+  if (use_pcre16) \
+    PCRE_GET_NAMED_SUBSTRING16(rc, re, bptr, offsets, count, \
+      getnamesptr, subsptr); \
+  else \
+    PCRE_GET_NAMED_SUBSTRING8(rc, re, bptr, offsets, count, \
+      getnamesptr, subsptr)
+
+#define PCRE_GET_STRINGNUMBER(n, rc, ptr) \
+  if (use_pcre16) \
+    PCRE_GET_STRINGNUMBER16(n, rc, ptr); \
+  else \
+    PCRE_GET_STRINGNUMBER8(n, rc, ptr)
+
+#define PCRE_GET_SUBSTRING(rc, bptr, use_offsets, count, i, subsptr) \
+  if (use_pcre16) \
+    PCRE_GET_SUBSTRING16(rc, bptr, use_offsets, count, i, subsptr); \
+  else \
+    PCRE_GET_SUBSTRING8(rc, bptr, use_offsets, count, i, subsptr)
+
+#define PCRE_GET_SUBSTRING_LIST(rc, bptr, offsets, count, listptr) \
+  if (use_pcre16) \
+    PCRE_GET_SUBSTRING_LIST16(rc, bptr, offsets, count, listptr); \
+  else \
+    PCRE_GET_SUBSTRING_LIST8(rc, bptr, offsets, count, listptr)
+
+#define PCRE_JIT_STACK_ALLOC pcre_jit_stack_alloc
+#define PCRE_JIT_STACK_FREE pcre_jit_stack_free
+
+#define PCRE_MAKETABLES \
+  (use_pcre16? pcre16_maketables() : pcre_maketables())
+
+#define PCRE_PATTERN_TO_HOST_BYTE_ORDER(re, extra, tables) \
+  if (use_pcre16) \
+    PCRE_PATTERN_TO_HOST_BYTE_ORDER16(re, extra, tables); \
+  else \
+    PCRE_PATTERN_TO_HOST_BYTE_ORDER8(re, extra, tables)
+
+#define PCRE_PRINTINT(re, outfile, debug_lengths) \
+  if (use_pcre16) \
+    PCRE_PRINTINT16(re, outfile, debug_lengths); \
+  else \
+    PCRE_PRINTINT8(re, outfile, debug_lengths)
+
+#define PCRE_STUDY(extra, re, options, error) \
+  if (use_pcre16) \
+    PCRE_STUDY16(extra, re, options, error); \
+  else \
+    PCRE_STUDY8(extra, re, options, error)
+
+/* ----- Only 8-bit mode is supported ----- */
+
+#elif defined SUPPORT_PCRE8
+#define CHAR_SIZE                 1
+#define PCHARS                    PCHARS8
+#define PCHARSV                   PCHARSV8
+#define READ_CAPTURE_NAME         READ_CAPTURE_NAME8
+#define SET_PCRE_CALLOUT          SET_PCRE_CALLOUT8
+#define STRLEN                    STRLEN8
+#define PCRE_ASSIGN_JIT_STACK     pcre_assign_jit_stack
+#define PCRE_COMPILE              PCRE_COMPILE8
+#define PCRE_CONFIG               pcre_config
+#define PCRE_COPY_NAMED_SUBSTRING PCRE_COPY_NAMED_SUBSTRING8
+#define PCRE_COPY_SUBSTRING       PCRE_COPY_SUBSTRING8
+#define PCRE_DFA_EXEC             PCRE_DFA_EXEC8
+#define PCRE_EXEC                 PCRE_EXEC8
+#define PCRE_FREE_STUDY           PCRE_FREE_STUDY8
+#define PCRE_FREE_SUBSTRING       PCRE_FREE_SUBSTRING8
+#define PCRE_FREE_SUBSTRING_LIST  PCRE_FREE_SUBSTRING_LIST8
+#define PCRE_GET_NAMED_SUBSTRING  PCRE_GET_NAMED_SUBSTRING8
+#define PCRE_GET_STRINGNUMBER     PCRE_GET_STRINGNUMBER8
+#define PCRE_GET_SUBSTRING        PCRE_GET_SUBSTRING8
+#define PCRE_GET_SUBSTRING_LIST   PCRE_GET_SUBSTRING_LIST8
+#define PCRE_JIT_STACK_ALLOC      pcre_jit_stack_alloc
+#define PCRE_JIT_STACK_FREE       pcre_jit_stack_free
+#define PCRE_MAKETABLES           pcre_maketables()
+#define PCRE_PATTERN_TO_HOST_BYTE_ORDER PCRE_PATTERN_TO_HOST_BYTE_ORDER8
+#define PCRE_PRINTINT             PCRE_PRINTINT8
+#define PCRE_STUDY                PCRE_STUDY8
+
+/* ----- Only 16-bit mode is supported ----- */
+
+#else
+#define CHAR_SIZE                 2
+#define PCHARS                    PCHARS16
+#define PCHARSV                   PCHARSV16
+#define READ_CAPTURE_NAME         READ_CAPTURE_NAME16
+#define SET_PCRE_CALLOUT          SET_PCRE_CALLOUT16
+#define STRLEN                    STRLEN16
+#define PCRE_ASSIGN_JIT_STACK     pcre16_assign_jit_stack
+#define PCRE_COMPILE              PCRE_COMPILE16
+#define PCRE_CONFIG               pcre16_config
+#define PCRE_COPY_NAMED_SUBSTRING PCRE_COPY_NAMED_SUBSTRING16
+#define PCRE_COPY_SUBSTRING       PCRE_COPY_SUBSTRING16
+#define PCRE_DFA_EXEC             PCRE_DFA_EXEC16
+#define PCRE_EXEC                 PCRE_EXEC16
+#define PCRE_FREE_STUDY           PCRE_FREE_STUDY16
+#define PCRE_FREE_SUBSTRING       PCRE_FREE_SUBSTRING16
+#define PCRE_FREE_SUBSTRING_LIST  PCRE_FREE_SUBSTRING_LIST16
+#define PCRE_GET_NAMED_SUBSTRING  PCRE_GET_NAMED_SUBSTRING16
+#define PCRE_GET_STRINGNUMBER     PCRE_GET_STRINGNUMBER16
+#define PCRE_GET_SUBSTRING        PCRE_GET_SUBSTRING16
+#define PCRE_GET_SUBSTRING_LIST   PCRE_GET_SUBSTRING_LIST16
+#define PCRE_JIT_STACK_ALLOC      pcre16_jit_stack_alloc
+#define PCRE_JIT_STACK_FREE       pcre16_jit_stack_free
+#define PCRE_MAKETABLES           pcre16_maketables()
+#define PCRE_PATTERN_TO_HOST_BYTE_ORDER PCRE_PATTERN_TO_HOST_BYTE_ORDER16
+#define PCRE_PRINTINT             PCRE_PRINTINT16
+#define PCRE_STUDY                PCRE_STUDY16
+#endif
+
+/* ----- End of mode-specific function call macros ----- */
+
+
 /* Other parameters */

#ifndef CLOCKS_PER_SEC
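
Two details of the macros in the hunk above can be shown in miniature. First,
the PCHARS note: the base pointer is cast before the offset is added, so the
offset counts code units of the selected width rather than bytes. Second, the
generic macros expand to a bare if/else, which is fine when each use is a
statement of its own but can pair with the wrong else inside an unbraced if.
The sketch below wraps the dispatch in do { ... } while (0); that wrapper is a
substitute technique, not what the commit does, and all names are
hypothetical.

```c
#include <stdint.h>

static int sketch_use16 = 0;             /* stands in for use_pcre16 */

static unsigned int sketch_unit8(const uint8_t *p)   { return *p; }
static unsigned int sketch_unit16(const uint16_t *p) { return *p; }

/* Cast the base first, then add the offset, so that the offset is scaled
   by the width of the target type. */
#define SKETCH_UNIT8(lv, p, offset) \
  lv = sketch_unit8((const uint8_t *)(p) + (offset))

#define SKETCH_UNIT16(lv, p, offset) \
  lv = sketch_unit16((const uint16_t *)(p) + (offset))

/* Runtime dispatch on the mode flag. The do/while wrapper makes the
   expansion a single statement, so it is safe inside an unbraced if. */
#define SKETCH_UNIT(lv, p, offset) \
  do { \
    if (sketch_use16) SKETCH_UNIT16(lv, p, offset); \
    else SKETCH_UNIT8(lv, p, offset); \
  } while (0)
```

Had the 16-bit macro been written as (const uint16_t *)((p) + (offset)) with a
byte pointer for p, an offset of 2 would select the second 16-bit unit instead
of the third.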
@@ -189,17 +584,62 @@
static int first_callout;
static int locale_set = 0;
static int show_malloc;
-static int use_utf8;
+static int use_utf;
static size_t gotten_store;
+static size_t first_gotten_store = 0;
static const unsigned char *last_callout_mark = NULL;

/* The buffers grow automatically if very long input lines are encountered. */

static int buffer_size = 50000;
-static uschar *buffer = NULL;
-static uschar *dbuffer = NULL;
-static uschar *pbuffer = NULL;
+static pcre_uint8 *buffer = NULL;
+static pcre_uint8 *dbuffer = NULL;
+static pcre_uint8 *pbuffer = NULL;

+/* Another buffer is needed for translation to 16-bit character strings. It
+will be obtained and extended as required. */
+
+#ifdef SUPPORT_PCRE16
+static int buffer16_size = 0;
+static pcre_uint16 *buffer16 = NULL;
+
+#ifdef SUPPORT_PCRE8
+
+/* We need the table of operator lengths that is used for 16-bit compiling, in
+order to swap bytes in a pattern for saving/reloading testing. Luckily, the
+data is defined as a macro. However, we must ensure that LINK_SIZE is adjusted
+appropriately for the 16-bit world. Just as a safety check, make sure that
+COMPILE_PCRE16 is *not* set. */
+
+#ifdef COMPILE_PCRE16
+#error COMPILE_PCRE16 must not be set when compiling pcretest.c
+#endif
+
+#if LINK_SIZE == 2
+#undef LINK_SIZE
+#define LINK_SIZE 1
+#elif LINK_SIZE == 3 || LINK_SIZE == 4
+#undef LINK_SIZE
+#define LINK_SIZE 2
+#else
+#error LINK_SIZE must be either 2, 3, or 4
+#endif
+
+#endif /* SUPPORT_PCRE8 */
+
+static const pcre_uint16 OP_lengths16[] = { OP_LENGTHS };
+#endif /* SUPPORT_PCRE16 */
+
+/* If we have 8-bit support, default use_pcre16 to false; if there is also
+16-bit support, it can be changed by an option. If there is no 8-bit support,
+there must be 16-bit support, so default it to 1. */
+
+#ifdef SUPPORT_PCRE8
+static int use_pcre16 = 0;
+#else
+static int use_pcre16 = 1;
+#endif
+
/* Textual explanations for runtime error codes */

static const char *errtexts[] = {
@@ -213,8 +653,8 @@
NULL, /* never returned by pcre_exec() or pcre_dfa_exec() */
"match limit exceeded",
"callout error code",
- NULL, /* BADUTF8 is handled specially */
- "bad UTF-8 offset",
+ NULL, /* BADUTF8/16 is handled specially */
+ NULL, /* BADUTF8/16 offset is handled specially */
NULL, /* PARTIAL is handled specially */
"not used - internal error",
"internal error - pattern overwritten?",
@@ -228,9 +668,10 @@
"not used - internal error",
"invalid combination of newline options",
"bad offset value",
- NULL, /* SHORTUTF8 is handled specially */
+ NULL, /* SHORTUTF8/16 is handled specially */
"nested recursion at the same subject position",
- "JIT stack limit reached"
+ "JIT stack limit reached",
+ "pattern compiled in wrong mode: 8-bit/16-bit error"
};

@@ -246,7 +687,7 @@
/* This is the set of tables distributed as default with PCRE. It recognizes
only ASCII characters. */

-static const unsigned char tables0[] = {
+static const pcre_uint8 tables0[] = {

/* This table is a lower casing table. */

@@ -419,7 +860,7 @@
be at least an approximation of ISO 8859. In particular, there are characters
greater than 128 that are marked as spaces, letters, etc. */

-static const unsigned char tables1[] = {
+static const pcre_uint8 tables1[] = {
0,1,2,3,4,5,6,7,
8,9,10,11,12,13,14,15,
16,17,18,19,20,21,22,23,
@@ -592,7 +1033,181 @@
}

+#if !defined NOUTF || defined SUPPORT_PCRE16
 /*************************************************
+*            Convert UTF-8 string to value       *
+*************************************************/
+
+/* This function takes one or more bytes that represent a UTF-8 character,
+and returns the value of the character.
+
+Argument:
+  utf8bytes   a pointer to the byte vector
+  vptr        a pointer to an int to receive the value
+
+Returns:      >  0 => the number of bytes consumed
+              -6 to 0 => malformed UTF-8 character at offset = (-return)
+*/
+
+static int
+utf82ord(pcre_uint8 *utf8bytes, int *vptr)
+{
+int c = *utf8bytes++;
+int d = c;
+int i, j, s;
+
+for (i = -1; i < 6; i++)               /* i is number of additional bytes */
+  {
+  if ((d & 0x80) == 0) break;
+  d <<= 1;
+  }
+
+if (i == -1) { *vptr = c; return 1; }  /* ascii character */
+if (i == 0 || i == 6) return 0;        /* invalid UTF-8 */
+
+/* i now has a value in the range 1-5 */
+
+s = 6*i;
+d = (c & utf8_table3[i]) << s;
+
+for (j = 0; j < i; j++)
+  {
+  c = *utf8bytes++;
+  if ((c & 0xc0) != 0x80) return -(j+1);
+  s -= 6;
+  d |= (c & 0x3f) << s;
+  }
+
+/* Check that encoding was the correct unique one */
+
+for (j = 0; j < utf8_table1_size; j++)
+  if (d <= utf8_table1[j]) break;
+if (j != i) return -(i+1);
+
+/* Valid value */
+
+*vptr = d;
+return i+1;
+}
+#endif /* NOUTF || SUPPORT_PCRE16 */
+
+
+
+#if !defined NOUTF || defined SUPPORT_PCRE16
+/*************************************************
+*       Convert character value to UTF-8         *
+*************************************************/
+
+/* This function takes an integer value in the range 0 - 0x7fffffff
+and encodes it as a UTF-8 character in 1 to 6 bytes.
+
+Arguments:
+  cvalue     the character value
+  utf8bytes  pointer to buffer for result - at least 6 bytes long
+
+Returns:     number of bytes placed in the buffer
+*/
+
+static int
+ord2utf8(int cvalue, pcre_uint8 *utf8bytes)
+{
+register int i, j;
+for (i = 0; i < utf8_table1_size; i++)
+  if (cvalue <= utf8_table1[i]) break;
+utf8bytes += i;
+for (j = i; j > 0; j--)
+ {
+ *utf8bytes-- = 0x80 | (cvalue & 0x3f);
+ cvalue >>= 6;
+ }
+*utf8bytes = utf8_table2[i] | cvalue;
+return i + 1;
+}
+#endif /* NOUTF || SUPPORT_PCRE16 */
+
+
+
+#ifdef SUPPORT_PCRE16
+/*************************************************
+*         Convert a string to 16-bit             *
+*************************************************/
+
+/* In non-UTF mode, the space needed for a 16-bit string is exactly double the
+8-bit size. For a UTF-8 string, the size needed for UTF-16 is no more than
+double: a value up to 0xffff takes 1 to 3 bytes in UTF-8 and 2 bytes in UTF-16,
+and a higher value takes 4 bytes in UTF-8 and 4 bytes in UTF-16. The result is
+always left in buffer16.
+
+Note that this function does not object to surrogate values. This is
+deliberate; it makes it possible to construct UTF-16 strings that are invalid,
+for the purpose of testing that they are correctly faulted.
+
+Patterns to be converted are either plain ASCII or UTF-8; data lines are always 
+in UTF-8 so that values greater than 255 can be handled.
+
+Arguments:
+  data       TRUE if converting a data line; FALSE for a regex
+  p          points to a byte string
+  utf        true if UTF-8 (to be converted to UTF-16)
+  len        number of bytes in the string (excluding trailing zero)
+
+Returns:     number of 16-bit data items used (excluding trailing zero)
+             OR -1 if a UTF-8 string is malformed
+             OR -2 if a value > 0x10ffff is encountered
+             OR -3 if a value > 0xffff is encountered when not in UTF mode 
+*/
+
+static int
+to16(int data, pcre_uint8 *p, int utf, int len)
+{
+pcre_uint16 *pp;
+
+if (buffer16_size < 2*len + 2)
+  {
+  if (buffer16 != NULL) free(buffer16);
+  buffer16_size = 2*len + 2;
+  buffer16 = (pcre_uint16 *)malloc(buffer16_size);
+  if (buffer16 == NULL)
+    {
+    fprintf(stderr, "pcretest: malloc(%d) failed for buffer16\n", buffer16_size);
+    exit(1);
+    }
+  }
+
+pp = buffer16;
+
+if (!utf && !data)
+  {
+  while (len-- > 0) *pp++ = *p++;
+  }
+
+else
+  {
+  int c = 0;
+  while (len > 0)
+    {
+    int chlen = utf82ord(p, &c);
+    if (chlen <= 0) return -1;
+    if (c > 0x10ffff) return -2;
+    p += chlen;
+    len -= chlen;
+    if (c < 0x10000) *pp++ = c; else
+      {
+      if (!utf) return -3;
+      c -= 0x10000;
+      *pp++ = 0xD800 | (c >> 10);
+      *pp++ = 0xDC00 | (c & 0x3ff);
+      }
+    }
+  }
+
+*pp = 0;
+return pp - buffer16;
+}
+#endif
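
The surrogate arithmetic used by to16() above can be tried in isolation. The
following is a minimal sketch, not code from this commit: encode_utf16() and
decode_surrogates() are hypothetical names, and the <stdint.h> types stand in
for pcre_uint16. Like to16(), the encoder does not reject values that fall in
the surrogate range.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not part of pcretest.c): encode one code point as
   UTF-16. Returns the number of 16-bit units written (1 or 2), or 0 if the
   value is above 0x10ffff. Surrogate values are passed through unchecked. */
static int encode_utf16(uint32_t c, uint16_t *out)
{
if (c > 0x10ffff) return 0;
if (c < 0x10000) { out[0] = (uint16_t)c; return 1; }
c -= 0x10000;
out[0] = (uint16_t)(0xD800 | (c >> 10));     /* high surrogate */
out[1] = (uint16_t)(0xDC00 | (c & 0x3ff));   /* low surrogate */
return 2;
}

/* Hypothetical inverse: combine a high and a low surrogate into the code
   point they represent. The assert documents the expected input ranges. */
static uint32_t decode_surrogates(uint16_t hi, uint16_t lo)
{
assert((hi & 0xfc00) == 0xD800 && (lo & 0xfc00) == 0xDC00);
return (((uint32_t)hi & 0x3ff) << 10) + ((uint32_t)lo & 0x3ff) + 0x10000;
}
```

Note that the largest code point, 0x10ffff, produces a low surrogate of
exactly 0xDFFF, so a decoder has to treat the low-surrogate range as
inclusive at both ends.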
+
+
+/*************************************************
 *        Read or extend an input line            *
 *************************************************/

@@ -615,10 +1230,10 @@
                NULL if no data read and EOF reached
 */

-static uschar *
-extend_inputline(FILE *f, uschar *start, const char *prompt)
+static pcre_uint8 *
+extend_inputline(FILE *f, pcre_uint8 *start, const char *prompt)
{
-uschar *here = start;
+pcre_uint8 *here = start;

 for (;;)
   {
@@ -665,9 +1280,9 @@
   else
     {
     int new_buffer_size = 2*buffer_size;
-    uschar *new_buffer = (unsigned char *)malloc(new_buffer_size);
-    uschar *new_dbuffer = (unsigned char *)malloc(new_buffer_size);
-    uschar *new_pbuffer = (unsigned char *)malloc(new_buffer_size);
+    pcre_uint8 *new_buffer = (pcre_uint8 *)malloc(new_buffer_size);
+    pcre_uint8 *new_dbuffer = (pcre_uint8 *)malloc(new_buffer_size);
+    pcre_uint8 *new_pbuffer = (pcre_uint8 *)malloc(new_buffer_size);

     if (new_buffer == NULL || new_dbuffer == NULL || new_pbuffer == NULL)
       {
@@ -698,10 +1313,6 @@

-
-
-
-
 /*************************************************
 *          Read number from string               *
 *************************************************/
@@ -718,7 +1329,7 @@
 */

static int
-get_value(unsigned char *str, unsigned char **endptr)
+get_value(pcre_uint8 *str, pcre_uint8 **endptr)
{
int result = 0;
while(*str != 0 && isspace(*str)) str++;
@@ -729,169 +1340,191 @@

-
 /*************************************************
-*            Convert UTF-8 string to value       *
+*             Print one character                *
 *************************************************/

-/* This function takes one or more bytes that represents a UTF-8 character,
-and returns the value of the character.
+/* Print a single character either literally, or as a hex escape. */

-Argument:
-  utf8bytes   a pointer to the byte vector
-  vptr        a pointer to an int to receive the value
-
-Returns:      >  0 => the number of bytes consumed
-              -6 to 0 => malformed UTF-8 character at offset = (-return)
-*/
-
-#if !defined NOUTF8
-
-static int
-utf82ord(unsigned char *utf8bytes, int *vptr)
+static int pchar(int c, FILE *f)
 {
-int c = *utf8bytes++;
-int d = c;
-int i, j, s;
+if (PRINTOK(c))
+  {
+  if (f != NULL) fprintf(f, "%c", c);
+  return 1;
+  }

-for (i = -1; i < 6; i++)               /* i is number of additional bytes */
+if (c < 0x100)
   {
-  if ((d & 0x80) == 0) break;
-  d <<= 1;
+  if (use_utf)
+    {
+    if (f != NULL) fprintf(f, "\\x{%02x}", c);
+    return 6;
+    }
+  else
+    {
+    if (f != NULL) fprintf(f, "\\x%02x", c);
+    return 4;
+    }
   }

-if (i == -1) { *vptr = c; return 1; }  /* ascii character */
-if (i == 0 || i == 6) return 0;        /* invalid UTF-8 */
+if (f != NULL) fprintf(f, "\\x{%02x}", c);
+return (c <= 0x000000ff)? 6 :
+       (c <= 0x00000fff)? 7 :
+       (c <= 0x0000ffff)? 8 :
+       (c <= 0x000fffff)? 9 : 10;
+}

-/* i now has a value in the range 1-5 */

-s = 6*i;
-d = (c & utf8_table3[i]) << s;

-for (j = 0; j < i; j++)
-  {
-  c = *utf8bytes++;
-  if ((c & 0xc0) != 0x80) return -(j+1);
-  s -= 6;
-  d |= (c & 0x3f) << s;
-  }
+#ifdef SUPPORT_PCRE8
+/*************************************************
+*         Print 8-bit character string           *
+*************************************************/

-/* Check that encoding was the correct unique one */
+/* Must handle UTF-8 strings in utf8 mode. Yields number of characters printed.
+If handed a NULL file, just counts chars without printing. */

-for (j = 0; j < utf8_table1_size; j++)
- if (d <= utf8_table1[j]) break;
-if (j != i) return -(i+1);
+static int pchars(pcre_uint8 *p, int length, FILE *f)
+{
+int c = 0;
+int yield = 0;

-/* Valid value */
+if (length < 0)
+ length = strlen((char *)p);

-*vptr = d;
-return i+1;
+while (length-- > 0)
+  {
+#if !defined NOUTF
+  if (use_utf)
+    {
+    int rc = utf82ord(p, &c);
+    if (rc > 0 && rc <= length + 1)   /* Mustn't run over the end */
+      {
+      length -= rc - 1;
+      p += rc;
+      yield += pchar(c, f);
+      continue;
+      }
+    }
+#endif
+  c = *p++;
+  yield += pchar(c, f);
+  }
+
+return yield;
 }
-
 #endif

+#ifdef SUPPORT_PCRE16
 /*************************************************
-*       Convert character value to UTF-8         *
+*    Find length of 0-terminated 16-bit string   *
 *************************************************/

-/* This function takes an integer value in the range 0 - 0x7fffffff
-and encodes it as a UTF-8 character in 0 to 6 bytes.
-
-Arguments:
-  cvalue     the character value
-  utf8bytes  pointer to buffer for result - at least 6 bytes long
-
-Returns:     number of characters placed in the buffer
-*/
-
-#if !defined NOUTF8
-
-static int
-ord2utf8(int cvalue, uschar *utf8bytes)
+static int strlen16(PCRE_SPTR16 p)
 {
-register int i, j;
-for (i = 0; i < utf8_table1_size; i++)
-  if (cvalue <= utf8_table1[i]) break;
-utf8bytes += i;
-for (j = i; j > 0; j--)
- {
- *utf8bytes-- = 0x80 | (cvalue & 0x3f);
- cvalue >>= 6;
- }
-*utf8bytes = utf8_table2[i] | cvalue;
-return i + 1;
+int len = 0;
+while (*p++ != 0) len++;
+return len;
 }
+#endif  /* SUPPORT_PCRE16 */

-#endif

-
-
+#ifdef SUPPORT_PCRE16
 /*************************************************
-*             Print character string             *
+*           Print 16-bit character string        *
 *************************************************/

-/* Character string printing function. Must handle UTF-8 strings in utf8
-mode. Yields number of characters printed. If handed a NULL file, just counts
-chars without printing. */
+/* Must handle UTF-16 strings in utf mode. Yields number of characters printed.
+If handed a NULL file, just counts chars without printing. */

-static int pchars(unsigned char *p, int length, FILE *f)
+static int pchars16(PCRE_SPTR16 p, int length, FILE *f)
{
-int c = 0;
int yield = 0;

+if (length < 0)
+  length = strlen16(p);
+
 while (length-- > 0)
   {
-#if !defined NOUTF8
-  if (use_utf8)
+  int c = *p++ & 0xffff;
+#if !defined NOUTF
+  if (use_utf && c >= 0xD800 && c < 0xDC00 && length > 0)
     {
-    int rc = utf82ord(p, &c);
-
-    if (rc > 0 && rc <= length + 1)   /* Mustn't run over the end */
+    int d = *p & 0xffff;
+    if (d >= 0xDC00 && d <= 0xDFFF)
       {
-      length -= rc - 1;
-      p += rc;
-      if (PRINTHEX(c))
-        {
-        if (f != NULL) fprintf(f, "%c", c);
-        yield++;
-        }
-      else
-        {
-        int n = 4;
-        if (f != NULL) fprintf(f, "\\x{%02x}", c);
-        yield += (n <= 0x000000ff)? 2 :
-                 (n <= 0x00000fff)? 3 :
-                 (n <= 0x0000ffff)? 4 :
-                 (n <= 0x000fffff)? 5 : 6;
-        }
-      continue;
+      c = ((c & 0x3ff) << 10) + (d & 0x3ff) + 0x10000;
+      length--;
+      p++;
       }
     }
 #endif
+  yield += pchar(c, f);
+  }

- /* Not UTF-8, or malformed UTF-8 */
+return yield;
+}
+#endif /* SUPPORT_PCRE16 */

-  c = *p++;
-  if (PRINTHEX(c))
-    {
-    if (f != NULL) fprintf(f, "%c", c);
-    yield++;
-    }
-  else
-    {
-    if (f != NULL) fprintf(f, "\\x%02x", c);
-    yield += 4;
-    }
+
+
+#ifdef SUPPORT_PCRE8
+/*************************************************
+*     Read a capture name (8-bit) and check it   *
+*************************************************/
+
+static pcre_uint8 *
+read_capture_name8(pcre_uint8 *p, pcre_uint8 **pp, pcre *re)
+{
+pcre_uint8 *npp = *pp;
+while (isalnum(*p)) *npp++ = *p++;
+*npp++ = 0;
+*npp = 0;
+if (pcre_get_stringnumber(re, (char *)(*pp)) < 0)
+  {
+  fprintf(outfile, "no parentheses with name \"");
+  PCHARSV(*pp, 0, -1, outfile);
+  fprintf(outfile, "\"\n");
   }

-return yield;
+*pp = npp;
+return p;
}
+#endif /* SUPPORT_PCRE8 */

+#ifdef SUPPORT_PCRE16
 /*************************************************
+*     Read a capture name (16-bit) and check it  *
+*************************************************/
+
+/* Note that the text being read is 8-bit. */
+
+static pcre_uint8 *
+read_capture_name16(pcre_uint8 *p, pcre_uint16 **pp, pcre *re)
+{
+pcre_uint16 *npp = *pp;
+while (isalnum(*p)) *npp++ = *p++;
+*npp++ = 0;
+*npp = 0;
+if (pcre16_get_stringnumber(re, (PCRE_SPTR16)(*pp)) < 0)
+  {
+  fprintf(outfile, "no parentheses with name \"");
+  PCHARSV(*pp, 0, -1, outfile);
+  fprintf(outfile, "\"\n");
+  }
+*pp = npp;
+return p;
+}
+#endif  /* SUPPORT_PCRE16 */
+
+
+
+/*************************************************
 *              Callout function                  *
 *************************************************/

@@ -916,7 +1549,7 @@
     else
       {
       fprintf(f, "%2d: ", i/2);
-      (void)pchars((unsigned char *)cb->subject + cb->offset_vector[i],
+      PCHARSV(cb->subject, cb->offset_vector[i],
         cb->offset_vector[i+1] - cb->offset_vector[i], f);
       fprintf(f, "\n");
       }
@@ -929,13 +1562,13 @@

if (f != NULL) fprintf(f, "--->");

-pre_start = pchars((unsigned char *)cb->subject, cb->start_match, f);
-post_start = pchars((unsigned char *)(cb->subject + cb->start_match),
+PCHARS(pre_start, cb->subject, 0, cb->start_match, f);
+PCHARS(post_start, cb->subject, cb->start_match,
cb->current_position - cb->start_match, f);

-subject_length = pchars((unsigned char *)cb->subject, cb->subject_length, NULL);
+PCHARS(subject_length, cb->subject, 0, cb->subject_length, NULL);

-(void)pchars((unsigned char *)(cb->subject + cb->current_position),
+PCHARSV(cb->subject, cb->current_position,
cb->subject_length - cb->current_position, f);

if (f != NULL) fprintf(f, "\n");
@@ -974,8 +1607,14 @@

 if (cb->mark != last_callout_mark)
   {
-  fprintf(outfile, "Latest Mark: %s\n",
-    (cb->mark == NULL)? "<unset>" : (char *)(cb->mark));
+  if (cb->mark == NULL)
+    fprintf(outfile, "Latest Mark: <unset>\n");
+  else
+    {
+    fprintf(outfile, "Latest Mark: ");
+    PCHARSV(cb->mark, 0, -1, outfile);
+    putc('\n', outfile);
+    }
   last_callout_mark = cb->mark;
   }

@@ -999,12 +1638,14 @@
*************************************************/

/* Alternative malloc function, to test functionality and save the size of a
-compiled re. The show_malloc variable is set only during matching. */
+compiled re, which is the first store request that pcre_compile() makes. The
+show_malloc variable is set only during matching. */

 static void *new_malloc(size_t size)
 {
 void *block = malloc(size);
 gotten_store = size;
+if (first_gotten_store == 0) first_gotten_store = size;
 if (show_malloc)
   fprintf(outfile, "malloc       %3d %p\n", (int)size, block);
 return block;
@@ -1039,40 +1680,279 @@
 *          Call pcre_fullinfo()                  *
 *************************************************/

-/* Get one piece of information from the pcre_fullinfo() function */
+/* Get one piece of information from the pcre_fullinfo() function. When only
+one of 8-bit or 16-bit is supported, use_pcre16 should always have the correct
+value, but the code is defensive.

-static void new_info(pcre *re, pcre_extra *study, int option, void *ptr)
+Arguments:
+  re        compiled regex
+  study     study data
+  option    PCRE_INFO_xxx option
+  ptr       where to put the data
+
+Returns:    0 when OK, < 0 on error
+*/
+
+static int
+new_info(pcre *re, pcre_extra *study, int option, void *ptr)
 {
 int rc;
-if ((rc = pcre_fullinfo(re, study, option, ptr)) < 0)
-  fprintf(outfile, "Error %d from pcre_fullinfo(%d)\n", rc, option);
+
+if (use_pcre16)
+#ifdef SUPPORT_PCRE16
+  rc = pcre16_fullinfo(re, study, option, ptr);
+#else
+  rc = PCRE_ERROR_BADMODE;
+#endif
+else
+#ifdef SUPPORT_PCRE8
+  rc = pcre_fullinfo(re, study, option, ptr);
+#else
+  rc = PCRE_ERROR_BADMODE;
+#endif
+
+if (rc < 0)
+  {
+  fprintf(outfile, "Error %d from pcre%s_fullinfo(%d)\n", rc,
+    use_pcre16? "16" : "", option);
+  if (rc == PCRE_ERROR_BADMODE)
+    fprintf(outfile, "Running in %s-bit mode but pattern was compiled in "
+      "%s-bit mode\n", use_pcre16? "16":"8", use_pcre16? "8":"16");
+  }
+
+return rc;
 }

 /*************************************************
-*         Byte flipping function                 *
+*             Swap byte functions                *
 *************************************************/

-static unsigned long int
-byteflip(unsigned long int value, int n)
+/* The following functions swap the bytes of a pcre_uint16 and pcre_uint32
+value, respectively.
+
+Arguments:
+  value        any number
+
+Returns:       the byte swapped value
+*/
+
+static pcre_uint32
+swap_uint32(pcre_uint32 value)
 {
-if (n == 2) return ((value & 0x00ff) << 8) | ((value & 0xff00) >> 8);
 return ((value & 0x000000ff) << 24) |
        ((value & 0x0000ff00) <<  8) |
        ((value & 0x00ff0000) >>  8) |
-       ((value & 0xff000000) >> 24);
+       (value >> 24);
 }

+static pcre_uint16
+swap_uint16(pcre_uint16 value)
+{
+return (value >> 8) | (value << 8);
+}
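
The two swap functions above are each their own inverse, which is what lets a
flipped pattern be restored by flipping it again. A minimal standalone sketch
of the same expressions follows; swap32() and swap16() are hypothetical names,
and the <stdint.h> types are assumed in place of pcre_uint32 and pcre_uint16.

```c
#include <stdint.h>

/* Same expression as swap_uint32() in the patch: reverse the four bytes. */
static uint32_t swap32(uint32_t v)
{
return ((v & 0x000000ffu) << 24) |
       ((v & 0x0000ff00u) <<  8) |
       ((v & 0x00ff0000u) >>  8) |
       (v >> 24);
}

/* Same expression as swap_uint16(): exchange the two bytes. The cast
   discards the bits that integer promotion shifts above bit 15. */
static uint16_t swap16(uint16_t v)
{
return (uint16_t)((v >> 8) | (v << 8));
}
```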

 /*************************************************
+*        Flip bytes in a compiled pattern        *
+*************************************************/
+
+/* This function is called if the 'F' option was present on a pattern that is
+to be written to a file. We flip the bytes of all the integer fields in the
+regex data block and the study block. In 16-bit mode this also flips relevant
+bytes in the pattern itself. This is to make it possible to test PCRE's
+ability to reload byte-flipped patterns, e.g. those compiled on a different
+architecture. */
+
+static void
+regexflip(pcre *ere, pcre_extra *extra)
+{
+real_pcre *re = (real_pcre *)ere;
+#ifdef SUPPORT_PCRE16
+int op;
+pcre_uint16 *ptr = (pcre_uint16 *)re + re->name_table_offset;
+int length = re->name_count * re->name_entry_size;
+#ifdef SUPPORT_UTF
+BOOL utf = (re->options & PCRE_UTF16) != 0;
+BOOL utf16_char = FALSE;
+#endif /* SUPPORT_UTF */
+#endif /* SUPPORT_PCRE16 */
+
+/* Always flip the bytes in the main data block and study blocks. */
+
+re->magic_number = REVERSED_MAGIC_NUMBER;
+re->size = swap_uint32(re->size);
+re->options = swap_uint32(re->options);
+re->flags = swap_uint16(re->flags);
+re->top_bracket = swap_uint16(re->top_bracket);
+re->top_backref = swap_uint16(re->top_backref);
+re->first_char = swap_uint16(re->first_char);
+re->req_char = swap_uint16(re->req_char);
+re->name_table_offset = swap_uint16(re->name_table_offset);
+re->name_entry_size = swap_uint16(re->name_entry_size);
+re->name_count = swap_uint16(re->name_count);
+
+if (extra != NULL)
+  {
+  pcre_study_data *rsd = (pcre_study_data *)(extra->study_data);
+  rsd->size = swap_uint32(rsd->size);
+  rsd->flags = swap_uint32(rsd->flags);
+  rsd->minlength = swap_uint32(rsd->minlength);
+  }
+
+/* In 8-bit mode, that is all we need to do. In 16-bit mode we must swap bytes
+in the name table, if present, and then in the pattern itself. */
+
+#ifdef SUPPORT_PCRE16
+if (!use_pcre16) return;
+
+while(TRUE)
+  {
+  /* Swap previous characters. */
+  while (length-- > 0)
+    {
+    *ptr = swap_uint16(*ptr);
+    ptr++;
+    }
+#ifdef SUPPORT_UTF
+  if (utf16_char)
+    {
+    if ((ptr[-1] & 0xfc00) == 0xd800)
+      {
+      /* We know that there is only one extra character in UTF-16. */
+      *ptr = swap_uint16(*ptr);
+      ptr++;
+      }
+    }
+  utf16_char = FALSE;
+#endif /* SUPPORT_UTF */
+
+  /* Get next opcode. */
+
+  length = 0;
+  op = *ptr;
+  *ptr++ = swap_uint16(op);
+
+  switch (op)
+    {
+    case OP_END:
+    return;
+
+#ifdef SUPPORT_UTF
+    case OP_CHAR:
+    case OP_CHARI:
+    case OP_NOT:
+    case OP_NOTI:
+    case OP_STAR:
+    case OP_MINSTAR:
+    case OP_PLUS:
+    case OP_MINPLUS:
+    case OP_QUERY:
+    case OP_MINQUERY:
+    case OP_UPTO:
+    case OP_MINUPTO:
+    case OP_EXACT:
+    case OP_POSSTAR:
+    case OP_POSPLUS:
+    case OP_POSQUERY:
+    case OP_POSUPTO:
+    case OP_STARI:
+    case OP_MINSTARI:
+    case OP_PLUSI:
+    case OP_MINPLUSI:
+    case OP_QUERYI:
+    case OP_MINQUERYI:
+    case OP_UPTOI:
+    case OP_MINUPTOI:
+    case OP_EXACTI:
+    case OP_POSSTARI:
+    case OP_POSPLUSI:
+    case OP_POSQUERYI:
+    case OP_POSUPTOI:
+    case OP_NOTSTAR:
+    case OP_NOTMINSTAR:
+    case OP_NOTPLUS:
+    case OP_NOTMINPLUS:
+    case OP_NOTQUERY:
+    case OP_NOTMINQUERY:
+    case OP_NOTUPTO:
+    case OP_NOTMINUPTO:
+    case OP_NOTEXACT:
+    case OP_NOTPOSSTAR:
+    case OP_NOTPOSPLUS:
+    case OP_NOTPOSQUERY:
+    case OP_NOTPOSUPTO:
+    case OP_NOTSTARI:
+    case OP_NOTMINSTARI:
+    case OP_NOTPLUSI:
+    case OP_NOTMINPLUSI:
+    case OP_NOTQUERYI:
+    case OP_NOTMINQUERYI:
+    case OP_NOTUPTOI:
+    case OP_NOTMINUPTOI:
+    case OP_NOTEXACTI:
+    case OP_NOTPOSSTARI:
+    case OP_NOTPOSPLUSI:
+    case OP_NOTPOSQUERYI:
+    case OP_NOTPOSUPTOI:
+    if (utf) utf16_char = TRUE;
+#endif
+    /* Fall through. */
+
+    default:
+    length = OP_lengths16[op] - 1;
+    break;
+
+    case OP_CLASS:
+    case OP_NCLASS:
+    /* Skip the character bit map. */
+    ptr += 32/sizeof(pcre_uint16);
+    length = 0;
+    break;
+
+    case OP_XCLASS:
+    /* Reverse the size of the XCLASS instance. */
+    ptr++;
+    *ptr = swap_uint16(*ptr);
+    if (LINK_SIZE > 1)
+      {
+      /* LINK_SIZE can be 1 or 2 in 16 bit mode. */
+      ptr++;
+      *ptr = swap_uint16(*ptr);
+      }
+    ptr++;
+
+    if (LINK_SIZE > 1)
+      length = ((ptr[-LINK_SIZE] << 16) | ptr[-LINK_SIZE + 1]) -
+        (1 + LINK_SIZE + 1);
+    else
+      length = ptr[-LINK_SIZE] - (1 + LINK_SIZE + 1);
+
+    op = *ptr;
+    *ptr = swap_uint16(op);
+    if ((op & XCL_MAP) != 0)
+      {
+      /* Skip the character bit map. */
+      ptr += 32/sizeof(pcre_uint16);
+      length -= 32/sizeof(pcre_uint16);
+      }
+    break;
+    }
+  }
+/* Control should never reach here in 16 bit mode. */
+#endif /* SUPPORT_PCRE16 */
+}
+
+
+
+/*************************************************
 *        Check match or recursion limit          *
 *************************************************/

static int
-check_match_limit(pcre *re, pcre_extra *extra, uschar *bptr, int len,
+check_match_limit(pcre *re, pcre_extra *extra, pcre_uint8 *bptr, int len,
int start_offset, int options, int *use_offsets, int use_size_offsets,
int flag, unsigned long int *limit, int errnumber, const char *msg)
{
@@ -1087,7 +1967,7 @@
{
*limit = mid;

-  count = pcre_exec(re, extra, (char *)bptr, len, start_offset, options,
+  PCRE_EXEC(count, re, extra, bptr, len, start_offset, options,
     use_offsets, use_size_offsets);

if (count == errnumber)
@@ -1132,7 +2012,7 @@
*/

static int
-strncmpic(uschar *s, uschar *t, int n)
+strncmpic(pcre_uint8 *s, pcre_uint8 *t, int n)
{
while (n--)
{
@@ -1159,15 +2039,15 @@
*/

 static int
-check_newline(uschar *p, FILE *f)
+check_newline(pcre_uint8 *p, FILE *f)
 {
-if (strncmpic(p, (uschar *)"cr>", 3) == 0) return PCRE_NEWLINE_CR;
-if (strncmpic(p, (uschar *)"lf>", 3) == 0) return PCRE_NEWLINE_LF;
-if (strncmpic(p, (uschar *)"crlf>", 5) == 0) return PCRE_NEWLINE_CRLF;
-if (strncmpic(p, (uschar *)"anycrlf>", 8) == 0) return PCRE_NEWLINE_ANYCRLF;
-if (strncmpic(p, (uschar *)"any>", 4) == 0) return PCRE_NEWLINE_ANY;
-if (strncmpic(p, (uschar *)"bsr_anycrlf>", 12) == 0) return PCRE_BSR_ANYCRLF;
-if (strncmpic(p, (uschar *)"bsr_unicode>", 12) == 0) return PCRE_BSR_UNICODE;
+if (strncmpic(p, (pcre_uint8 *)"cr>", 3) == 0) return PCRE_NEWLINE_CR;
+if (strncmpic(p, (pcre_uint8 *)"lf>", 3) == 0) return PCRE_NEWLINE_LF;
+if (strncmpic(p, (pcre_uint8 *)"crlf>", 5) == 0) return PCRE_NEWLINE_CRLF;
+if (strncmpic(p, (pcre_uint8 *)"anycrlf>", 8) == 0) return PCRE_NEWLINE_ANYCRLF;
+if (strncmpic(p, (pcre_uint8 *)"any>", 4) == 0) return PCRE_NEWLINE_ANY;
+if (strncmpic(p, (pcre_uint8 *)"bsr_anycrlf>", 12) == 0) return PCRE_BSR_ANYCRLF;
+if (strncmpic(p, (pcre_uint8 *)"bsr_unicode>", 12) == 0) return PCRE_BSR_UNICODE;
 fprintf(f, "Unknown newline type at: <%s\n", p);
 return 0;
 }
@@ -1189,8 +2069,19 @@
 printf("This version of pcretest is not linked with readline().\n");
 #endif
 printf("\nOptions:\n");
+#ifdef SUPPORT_PCRE16
+printf("  -16      use 16-bit interface\n");
+#endif
 printf("  -b       show compiled code (bytecode)\n");
 printf("  -C       show PCRE compile-time options and exit\n");
+printf("  -C arg   show a specific compile-time option\n");
+printf("           and exit with its value. The arg can be:\n");
+printf("     linksize     internal link size [2, 3, 4]\n");
+printf("     pcre8        8 bit library support enabled [0, 1]\n");
+printf("     pcre16       16 bit library support enabled [0, 1]\n");
+printf("     utf          Unicode Transformation Format supported [0, 1]\n");
+printf("     ucp          Unicode Properties supported [0, 1]\n");
+printf("     jit          Just-in-time compiler supported [0, 1]\n");
 printf("  -d       debug: show compiled code and information (-b and -i)\n");
 #if !defined NODFA
 printf("  -dfa     force DFA matching for all subjects\n");
@@ -1226,6 +2117,7 @@
 int main(int argc, char **argv)
 {
 FILE *infile = stdin;
+const char *version;
 int options = 0;
 int study_options = 0;
 int default_find_match_limit = FALSE;
@@ -1251,22 +2143,32 @@

pcre_jit_stack *jit_stack = NULL;

+/* These vectors store, end-to-end, a list of zero-terminated captured
+substring names, each list itself being terminated by an empty name. Assume
+that 1024 is plenty long enough for the few names we'll be testing. It is
+easiest to keep separate 8-bit and 16-bit versions, using the 16-bit version
+for the actual memory, to ensure alignment. By defining these variables always
+(whether or not 8-bit or 16-bit is supported), we avoid too much mess with
+#ifdefs in the code. */

-/* These vectors store, end-to-end, a list of captured substring names. Assume
-that 1024 is plenty long enough for the few names we'll be testing. */
+pcre_uint16 copynames[1024];
+pcre_uint16 getnames[1024];

-uschar copynames[1024];
-uschar getnames[1024];
+pcre_uint16 *cn16ptr;
+pcre_uint16 *gn16ptr;

-uschar *copynamesptr;
-uschar *getnamesptr;
+pcre_uint8 *copynames8 = (pcre_uint8 *)copynames;
+pcre_uint8 *getnames8 = (pcre_uint8 *)getnames;
+pcre_uint8 *cn8ptr;
+pcre_uint8 *gn8ptr;

-/* Get buffers from malloc() so that Electric Fence will check their misuse
-when I am debugging. They grow automatically when very long lines are read. */
+/* Get buffers from malloc() so that valgrind will check their misuse when
+debugging. They grow automatically when very long lines are read. The 16-bit
+buffer (buffer16) is obtained only if needed. */

-buffer = (unsigned char *)malloc(buffer_size);
-dbuffer = (unsigned char *)malloc(buffer_size);
-pbuffer = (unsigned char *)malloc(buffer_size);
+buffer = (pcre_uint8 *)malloc(buffer_size);
+dbuffer = (pcre_uint8 *)malloc(buffer_size);
+pbuffer = (pcre_uint8 *)malloc(buffer_size);

/* The outfile variable is static so that new_malloc can use it. */

@@ -1281,11 +2183,20 @@
_setmode( _fileno( stdout ), _O_BINARY );
#endif

+/* Get the version number: both pcre_version() and pcre16_version() give the
+same answer. We just need to ensure that we call one that is available. */
+
+#ifdef SUPPORT_PCRE8
+version = pcre_version();
+#else
+version = pcre16_version();
+#endif
+
/* Scan options */

while (argc > 1 && argv[op][0] == '-')
{
- unsigned char *endptr;
+ pcre_uint8 *endptr;

   if (strcmp(argv[op], "-m") == 0) showstore = 1;
   else if (strcmp(argv[op], "-s") == 0) force_study = 0;
@@ -1294,6 +2205,15 @@
     force_study = 1;
     force_study_options = PCRE_STUDY_JIT_COMPILE;
     }
+  else if (strcmp(argv[op], "-16") == 0)
+    {
+#ifdef SUPPORT_PCRE16
+    use_pcre16 = 1;
+#else
+    printf("** This version of PCRE was built without 16-bit support\n");
+    exit(1);
+#endif
+    }
   else if (strcmp(argv[op], "-q") == 0) quiet = 1;
   else if (strcmp(argv[op], "-b") == 0) debug = 1;
   else if (strcmp(argv[op], "-i") == 0) showinfo = 1;
@@ -1303,7 +2223,7 @@
   else if (strcmp(argv[op], "-dfa") == 0) all_use_dfa = 1;
 #endif
   else if (strcmp(argv[op], "-o") == 0 && argc > 2 &&
-      ((size_offsets = get_value((unsigned char *)argv[op+1], &endptr)),
+      ((size_offsets = get_value((pcre_uint8 *)argv[op+1], &endptr)),
         *endptr == 0))
     {
     op++;
@@ -1313,7 +2233,7 @@
     {
     int both = argv[op][2] == 0;
     int temp;
-    if (argc > 2 && (temp = get_value((unsigned char *)argv[op+1], &endptr),
+    if (argc > 2 && (temp = get_value((pcre_uint8 *)argv[op+1], &endptr),
                      *endptr == 0))
       {
       timeitm = temp;
@@ -1324,7 +2244,7 @@
     if (both) timeit = timeitm;
     }
   else if (strcmp(argv[op], "-S") == 0 && argc > 2 &&
-      ((stack_size = get_value((unsigned char *)argv[op+1], &endptr)),
+      ((stack_size = get_value((pcre_uint8 *)argv[op+1], &endptr)),
         *endptr == 0))
     {
 #if defined(_WIN32) || defined(WIN32) || defined(__minix)
@@ -1352,36 +2272,118 @@
     {
     int rc;
     unsigned long int lrc;
-    printf("PCRE version %s\n", pcre_version());
+
+    if (argc > 2)
+      {
+      if (strcmp(argv[op + 1], "linksize") == 0)
+        {
+        (void)PCRE_CONFIG(PCRE_CONFIG_LINK_SIZE, &rc);
+        printf("%d\n", rc);
+        yield = rc;
+        goto EXIT;
+        }
+      if (strcmp(argv[op + 1], "pcre8") == 0)
+        {
+#ifdef SUPPORT_PCRE8
+        printf("1\n");
+        yield = 1;
+#else
+        printf("0\n");
+        yield = 0;
+#endif
+        goto EXIT;
+        }
+      if (strcmp(argv[op + 1], "pcre16") == 0)
+        {
+#ifdef SUPPORT_PCRE16
+        printf("1\n");
+        yield = 1;
+#else
+        printf("0\n");
+        yield = 0;
+#endif
+        goto EXIT;
+        }
+      if (strcmp(argv[op + 1], "utf") == 0)
+        {
+#ifdef SUPPORT_PCRE8
+        (void)pcre_config(PCRE_CONFIG_UTF8, &rc);
+        printf("%d\n", rc);
+        yield = rc;
+#else
+        (void)pcre16_config(PCRE_CONFIG_UTF16, &rc);
+        printf("%d\n", rc);
+        yield = rc;
+#endif
+        goto EXIT;
+        }
+      if (strcmp(argv[op + 1], "ucp") == 0)
+        {
+        (void)PCRE_CONFIG(PCRE_CONFIG_UNICODE_PROPERTIES, &rc);
+        printf("%d\n", rc);
+        yield = rc;
+        goto EXIT;
+        }
+      if (strcmp(argv[op + 1], "jit") == 0)
+        {
+        (void)PCRE_CONFIG(PCRE_CONFIG_JIT, &rc);
+        printf("%d\n", rc);
+        yield = rc;
+        goto EXIT;
+        }
+      printf("Unknown option: %s\n", argv[op + 1]);
+      goto EXIT;
+      }
+
+    printf("PCRE version %s\n", version);
     printf("Compiled with\n");
+
+/* At least one of SUPPORT_PCRE8 and SUPPORT_PCRE16 will be set. If both
+are set, either both UTFs are supported or both are not supported. */
+
+#if defined SUPPORT_PCRE8 && defined SUPPORT_PCRE16
+    printf("  8-bit and 16-bit support\n");
     (void)pcre_config(PCRE_CONFIG_UTF8, &rc);
+    if (rc)
+      printf("  UTF-8 and UTF-16 support\n");
+    else
+      printf("  No UTF-8 or UTF-16 support\n");
+#elif defined SUPPORT_PCRE8
+    printf("  8-bit support only\n");
+    (void)pcre_config(PCRE_CONFIG_UTF8, &rc);
     printf("  %sUTF-8 support\n", rc? "" : "No ");
-    (void)pcre_config(PCRE_CONFIG_UNICODE_PROPERTIES, &rc);
+#else
+    printf("  16-bit support only\n");
+    (void)pcre16_config(PCRE_CONFIG_UTF16, &rc);
+    printf("  %sUTF-16 support\n", rc? "" : "No ");
+#endif
+
+    (void)PCRE_CONFIG(PCRE_CONFIG_UNICODE_PROPERTIES, &rc);
     printf("  %sUnicode properties support\n", rc? "" : "No ");
-    (void)pcre_config(PCRE_CONFIG_JIT, &rc);
+    (void)PCRE_CONFIG(PCRE_CONFIG_JIT, &rc);
     if (rc)
       printf("  Just-in-time compiler support\n");
     else
       printf("  No just-in-time compiler support\n");
-    (void)pcre_config(PCRE_CONFIG_NEWLINE, &rc);
+    (void)PCRE_CONFIG(PCRE_CONFIG_NEWLINE, &rc);
     /* Note that these values are always the ASCII values, even
     in EBCDIC environments. CR is 13 and NL is 10. */
     printf("  Newline sequence is %s\n", (rc == 13)? "CR" :
       (rc == 10)? "LF" : (rc == (13<<8 | 10))? "CRLF" :
       (rc == -2)? "ANYCRLF" :
       (rc == -1)? "ANY" : "???");
-    (void)pcre_config(PCRE_CONFIG_BSR, &rc);
+    (void)PCRE_CONFIG(PCRE_CONFIG_BSR, &rc);
     printf("  \\R matches %s\n", rc? "CR, LF, or CRLF only" :
                                      "all Unicode newlines");
-    (void)pcre_config(PCRE_CONFIG_LINK_SIZE, &rc);
+    (void)PCRE_CONFIG(PCRE_CONFIG_LINK_SIZE, &rc);
     printf("  Internal link size = %d\n", rc);
-    (void)pcre_config(PCRE_CONFIG_POSIX_MALLOC_THRESHOLD, &rc);
+    (void)PCRE_CONFIG(PCRE_CONFIG_POSIX_MALLOC_THRESHOLD, &rc);
     printf("  POSIX malloc threshold = %d\n", rc);
-    (void)pcre_config(PCRE_CONFIG_MATCH_LIMIT, &lrc);
+    (void)PCRE_CONFIG(PCRE_CONFIG_MATCH_LIMIT, &lrc);
     printf("  Default match limit = %ld\n", lrc);
-    (void)pcre_config(PCRE_CONFIG_MATCH_LIMIT_RECURSION, &lrc);
+    (void)PCRE_CONFIG(PCRE_CONFIG_MATCH_LIMIT_RECURSION, &lrc);
     printf("  Default recursion depth limit = %ld\n", lrc);
-    (void)pcre_config(PCRE_CONFIG_STACKRECURSE, &rc);
+    (void)PCRE_CONFIG(PCRE_CONFIG_STACKRECURSE, &rc);
     printf("  Match recursion uses %s\n", rc? "stack" : "heap");
     goto EXIT;
     }
@@ -1440,14 +2442,23 @@

 /* Set alternative malloc function */

+#ifdef SUPPORT_PCRE8
 pcre_malloc = new_malloc;
 pcre_free = new_free;
 pcre_stack_malloc = stack_malloc;
 pcre_stack_free = stack_free;
+#endif

+#ifdef SUPPORT_PCRE16
+pcre16_malloc = new_malloc;
+pcre16_free = new_free;
+pcre16_stack_malloc = stack_malloc;
+pcre16_stack_free = stack_free;
+#endif
+
 /* Heading line unless quiet, then prompt for first regex if stdin */

-if (!quiet) fprintf(outfile, "PCRE version %s\n\n", pcre_version());
+if (!quiet) fprintf(outfile, "PCRE version %s\n\n", version);

 /* Main loop */

@@ -1462,10 +2473,10 @@
 #endif

   const char *error;
-  unsigned char *markptr;
-  unsigned char *p, *pp, *ppp;
-  unsigned char *to_file = NULL;
-  const unsigned char *tables = NULL;
+  pcre_uint8 *markptr;
+  pcre_uint8 *p, *pp, *ppp;
+  pcre_uint8 *to_file = NULL;
+  const pcre_uint8 *tables = NULL;
   unsigned long int true_size, true_study_size = 0;
   size_t size, regex_gotten_store;
   int do_allcaps = 0;
@@ -1481,7 +2492,7 @@
   int do_flip = 0;
   int erroroffset, len, delimiter, poffset;

-  use_utf8 = 0;
+  use_utf = 0;
   debug_lengths = 1;

   if (extend_inputline(infile, buffer, "  re> ") == NULL) break;
@@ -1497,7 +2508,7 @@
   if (*p == '<' && strchr((char *)(p+1), '<') == NULL)
     {
     unsigned long int magic, get_options;
-    uschar sbuf[8];
+    pcre_uint8 sbuf[8];
     FILE *f;

     p++;
@@ -1520,14 +2531,14 @@
       (sbuf[4] << 24) | (sbuf[5] << 16) | (sbuf[6] << 8) | sbuf[7];

     re = (real_pcre *)new_malloc(true_size);
-    regex_gotten_store = gotten_store;
+    regex_gotten_store = first_gotten_store;

     if (fread(re, 1, true_size, f) != true_size) goto FAIL_READ;

     magic = ((real_pcre *)re)->magic_number;
     if (magic != MAGIC_NUMBER)
       {
-      if (byteflip(magic, sizeof(magic)) == MAGIC_NUMBER)
+      if (swap_uint32(magic) == MAGIC_NUMBER)
         {
         do_flip = 1;
         }
@@ -1542,11 +2553,6 @@
     fprintf(outfile, "Compiled pattern%s loaded from %s\n",
       do_flip? " (byte-inverted)" : "", p);

-    /* Need to know if UTF-8 for printing data strings */
-
-    new_info(re, NULL, PCRE_INFO_OPTIONS, &get_options);
-    use_utf8 = (get_options & PCRE_UTF8) != 0;
-
     /* Now see if there is any following study data. */

     if (true_study_size != 0)
@@ -1563,7 +2569,10 @@
         {
         FAIL_READ:
         fprintf(outfile, "Failed to read data from %s\n", p);
-        if (extra != NULL) pcre_free_study(extra);
+        if (extra != NULL)
+          {
+          PCRE_FREE_STUDY(extra);
+          }
         if (re != NULL) new_free(re);
         fclose(f);
         continue;
@@ -1573,12 +2582,23 @@
       }
     else fprintf(outfile, "No study data\n");

+    /* Flip the necessary bytes. */
+    if (do_flip)
+      {
+      PCRE_PATTERN_TO_HOST_BYTE_ORDER(re, extra, NULL);
+      }
+
+    /* Need to know if UTF-8 for printing data strings. */
+
+    if (new_info(re, NULL, PCRE_INFO_OPTIONS, &get_options) < 0) continue;
+    use_utf = (get_options & PCRE_UTF8) != 0;
+
     fclose(f);
     goto SHOW_INFO;
     }

   /* In-line pattern (the usual case). Get the delimiter and seek the end of
-  the pattern; if is isn't complete, read more. */
+  the pattern; if it isn't complete, read more. */

   delimiter = *p++;

@@ -1629,6 +2649,7 @@
   /* Look for options after final delimiter */

   options = 0;
+  study_options = 0;
   log_store = showstore; /* default from command line */

   while (*pp != 0)
@@ -1686,7 +2707,7 @@
       case 'X': options |= PCRE_EXTRA; break;
       case 'Y': options |= PCRE_NO_START_OPTIMISE; break;
       case 'Z': debug_lengths = 0; break;
-      case '8': options |= PCRE_UTF8; use_utf8 = 1; break;
+      case '8': options |= PCRE_UTF8; use_utf = 1; break;
       case '?': options |= PCRE_NO_UTF8_CHECK; break;

       case 'T':
@@ -1720,7 +2741,7 @@
         goto SKIP_DATA;
         }
       locale_set = 1;
-      tables = pcre_maketables();
+      tables = PCRE_MAKETABLES;
       pp = ppp;
       break;

@@ -1733,7 +2754,7 @@

       case '<':
         {
-        if (strncmpic(pp, (uschar *)"JS>", 3) == 0)
+        if (strncmpic(pp, (pcre_uint8 *)"JS>", 3) == 0)
           {
           options |= PCRE_JAVASCRIPT_COMPAT;
           pp += 3;
@@ -1761,7 +2782,7 @@

   /* Handle compiling via the POSIX interface, which doesn't support the
   timing, showing, or debugging options, nor the ability to pass over
-  local character tables. */
+  local character tables. Neither does it have 16-bit support. */

 #if !defined NOPOSIX
   if (posix || do_posix)
@@ -1777,6 +2798,7 @@
     if ((options & PCRE_UCP) != 0) cflags |= REG_UCP;
     if ((options & PCRE_UNGREEDY) != 0) cflags |= REG_UNGREEDY;

+    first_gotten_store = 0;
     rc = regcomp(&preg, (char *)p, cflags);

     /* Compilation failed; go back for another re, skipping to blank line
@@ -1798,6 +2820,37 @@
     {
     unsigned long int get_options;

+    /* In 16-bit mode, convert the input. */
+
+#ifdef SUPPORT_PCRE16
+    if (use_pcre16)
+      {
+      switch(to16(FALSE, p, options & PCRE_UTF8, (int)strlen((char *)p)))
+        {
+        case -1:
+        fprintf(outfile, "**Failed: invalid UTF-8 string cannot be "
+          "converted to UTF-16\n");
+        goto SKIP_DATA;
+
+        case -2:
+        fprintf(outfile, "**Failed: character value greater than 0x10ffff "
+          "cannot be converted to UTF-16\n");
+        goto SKIP_DATA;
+        
+        case -3: /* "Impossible error" when to16 is called arg1 FALSE */
+        fprintf(outfile, "**Failed: character value greater than 0xffff "
+          "cannot be converted to 16-bit in non-UTF mode\n");
+        goto SKIP_DATA;   
+
+        default:
+        break;
+        }
+      p = (pcre_uint8 *)buffer16;
+      }
+#endif
+
+    /* Compile many times when timing */
+
     if (timeit > 0)
       {
       register int i;
@@ -1805,7 +2858,7 @@
       clock_t start_time = clock();
       for (i = 0; i < timeit; i++)
         {
-        re = pcre_compile((char *)p, options, &error, &erroroffset, tables);
+        PCRE_COMPILE(re, p, options, &error, &erroroffset, tables);
         if (re != NULL) free(re);
         }
       time_taken = clock() - start_time;
@@ -1814,7 +2867,8 @@
           (double)CLOCKS_PER_SEC);
       }

-    re = pcre_compile((char *)p, options, &error, &erroroffset, tables);
+    first_gotten_store = 0;
+    PCRE_COMPILE(re, p, options, &error, &erroroffset, tables);

     /* Compilation failed; go back for another re, skipping to blank line
     if non-interactive. */
@@ -1845,25 +2899,24 @@
     within the regex; check for this so that we know how to process the data
     lines. */

-    new_info(re, NULL, PCRE_INFO_OPTIONS, &get_options);
-    if ((get_options & PCRE_UTF8) != 0) use_utf8 = 1;
+    if (new_info(re, NULL, PCRE_INFO_OPTIONS, &get_options) < 0)
+      goto SKIP_DATA;
+    if ((get_options & PCRE_UTF8) != 0) use_utf = 1;

-    /* Print information if required. There are now two info-returning
-    functions. The old one has a limited interface and returns only limited
-    data. Check that it agrees with the newer one. */
+    /* Extract the size for possible writing before possibly flipping it,
+    and remember the store that was got. */

+    true_size = ((real_pcre *)re)->size;
+    regex_gotten_store = first_gotten_store;
+
+    /* Output code size information if requested */
+
     if (log_store)
       fprintf(outfile, "Memory allocation (code space): %d\n",
-        (int)(gotten_store -
+        (int)(first_gotten_store -
               sizeof(real_pcre) -
               ((real_pcre *)re)->name_count * ((real_pcre *)re)->name_entry_size));

-    /* Extract the size for possible writing before possibly flipping it,
-    and remember the store that was got. */
-
-    true_size = ((real_pcre *)re)->size;
-    regex_gotten_store = gotten_store;
-
     /* If -s or /S was present, study the regex to generate additional info to
     help with the matching, unless the pattern has the SS option, which
     suppresses the effect of /S (used for a few test patterns where studying is
@@ -1877,18 +2930,32 @@
         clock_t time_taken;
         clock_t start_time = clock();
         for (i = 0; i < timeit; i++)
-          extra = pcre_study(re, study_options | force_study_options, &error);
+          {
+          PCRE_STUDY(extra, re, study_options | force_study_options, &error);
+          }
         time_taken = clock() - start_time;
-        if (extra != NULL) pcre_free_study(extra);
+        if (extra != NULL)
+          {
+          PCRE_FREE_STUDY(extra);
+          }
         fprintf(outfile, "  Study time %.4f milliseconds\n",
           (((double)time_taken * 1000.0) / (double)timeit) /
             (double)CLOCKS_PER_SEC);
         }
-      extra = pcre_study(re, study_options | force_study_options, &error);
+      PCRE_STUDY(extra, re, study_options | force_study_options, &error);
       if (error != NULL)
         fprintf(outfile, "Failed to study: %s\n", error);
       else if (extra != NULL)
+        {
         true_study_size = ((pcre_study_data *)(extra->study_data))->size;
+        if (log_store)
+          {
+          size_t jitsize;
+          if (new_info(re, extra, PCRE_INFO_JITSIZE, &jitsize) == 0 &&
+              jitsize != 0)
+            fprintf(outfile, "Memory allocation (JIT code): %d\n", (int)jitsize);
+          }
+        }
       }

     /* If /K was present, we set up for handling MARK data. */
@@ -1904,51 +2971,14 @@
       extra->flags |= PCRE_EXTRA_MARK;
       }

-    /* If the 'F' option was present, we flip the bytes of all the integer
-    fields in the regex data block and the study block. This is to make it
-    possible to test PCRE's handling of byte-flipped patterns, e.g. those
-    compiled on a different architecture. */
+    /* Extract and display information from the compiled data if required. */

-    if (do_flip)
-      {
-      real_pcre *rre = (real_pcre *)re;
-      rre->magic_number =
-        byteflip(rre->magic_number, sizeof(rre->magic_number));
-      rre->size = byteflip(rre->size, sizeof(rre->size));
-      rre->options = byteflip(rre->options, sizeof(rre->options));
-      rre->flags = (pcre_uint16)byteflip(rre->flags, sizeof(rre->flags));
-      rre->top_bracket =
-        (pcre_uint16)byteflip(rre->top_bracket, sizeof(rre->top_bracket));
-      rre->top_backref =
-        (pcre_uint16)byteflip(rre->top_backref, sizeof(rre->top_backref));
-      rre->first_byte =
-        (pcre_uint16)byteflip(rre->first_byte, sizeof(rre->first_byte));
-      rre->req_byte =
-        (pcre_uint16)byteflip(rre->req_byte, sizeof(rre->req_byte));
-      rre->name_table_offset = (pcre_uint16)byteflip(rre->name_table_offset,
-        sizeof(rre->name_table_offset));
-      rre->name_entry_size = (pcre_uint16)byteflip(rre->name_entry_size,
-        sizeof(rre->name_entry_size));
-      rre->name_count = (pcre_uint16)byteflip(rre->name_count,
-        sizeof(rre->name_count));
-
-      if (extra != NULL)
-        {
-        pcre_study_data *rsd = (pcre_study_data *)(extra->study_data);
-        rsd->size = byteflip(rsd->size, sizeof(rsd->size));
-        rsd->flags = byteflip(rsd->flags, sizeof(rsd->flags));
-        rsd->minlength = byteflip(rsd->minlength, sizeof(rsd->minlength));
-        }
-      }
-
-    /* Extract information from the compiled data if required */
-
     SHOW_INFO:

     if (do_debug)
       {
       fprintf(outfile, "------------------------------------------------------------------\n");
-      pcre_printint(re, outfile, debug_lengths);
+      PCRE_PRINTINT(re, outfile, debug_lengths);
       }

     /* We already have the options in get_options (see above) */
@@ -1956,46 +2986,25 @@
     if (do_showinfo)
       {
       unsigned long int all_options;
-#if !defined NOINFOCHECK
-      int old_first_char, old_options, old_count;
-#endif
       int count, backrefmax, first_char, need_char, okpartial, jchanged,
         hascrorlf;
       int nameentrysize, namecount;
-      const uschar *nametable;
+      const pcre_uint8 *nametable;

-      new_info(re, NULL, PCRE_INFO_SIZE, &size);
-      new_info(re, NULL, PCRE_INFO_CAPTURECOUNT, &count);
-      new_info(re, NULL, PCRE_INFO_BACKREFMAX, &backrefmax);
-      new_info(re, NULL, PCRE_INFO_FIRSTBYTE, &first_char);
-      new_info(re, NULL, PCRE_INFO_LASTLITERAL, &need_char);
-      new_info(re, NULL, PCRE_INFO_NAMEENTRYSIZE, &nameentrysize);
-      new_info(re, NULL, PCRE_INFO_NAMECOUNT, &namecount);
-      new_info(re, NULL, PCRE_INFO_NAMETABLE, (void *)&nametable);
-      new_info(re, NULL, PCRE_INFO_OKPARTIAL, &okpartial);
-      new_info(re, NULL, PCRE_INFO_JCHANGED, &jchanged);
-      new_info(re, NULL, PCRE_INFO_HASCRORLF, &hascrorlf);
+      if (new_info(re, NULL, PCRE_INFO_SIZE, &size) +
+          new_info(re, NULL, PCRE_INFO_CAPTURECOUNT, &count) +
+          new_info(re, NULL, PCRE_INFO_BACKREFMAX, &backrefmax) +
+          new_info(re, NULL, PCRE_INFO_FIRSTBYTE, &first_char) +
+          new_info(re, NULL, PCRE_INFO_LASTLITERAL, &need_char) +
+          new_info(re, NULL, PCRE_INFO_NAMEENTRYSIZE, &nameentrysize) +
+          new_info(re, NULL, PCRE_INFO_NAMECOUNT, &namecount) +
+          new_info(re, NULL, PCRE_INFO_NAMETABLE, (void *)&nametable) +
+          new_info(re, NULL, PCRE_INFO_OKPARTIAL, &okpartial) +
+          new_info(re, NULL, PCRE_INFO_JCHANGED, &jchanged) +
+          new_info(re, NULL, PCRE_INFO_HASCRORLF, &hascrorlf)
+          != 0)
+        goto SKIP_DATA;

-#if !defined NOINFOCHECK
-      old_count = pcre_info(re, &old_options, &old_first_char);
-      if (count < 0) fprintf(outfile,
-        "Error %d from pcre_info()\n", count);
-      else
-        {
-        if (old_count != count) fprintf(outfile,
-          "Count disagreement: pcre_fullinfo=%d pcre_info=%d\n", count,
-            old_count);
-
-        if (old_first_char != first_char) fprintf(outfile,
-          "First char disagreement: pcre_fullinfo=%d pcre_info=%d\n",
-            first_char, old_first_char);
-
-        if (old_options != (int)get_options) fprintf(outfile,
-          "Options disagreement: pcre_fullinfo=%ld pcre_info=%d\n",
-            get_options, old_options);
-        }
-#endif
-
       if (size != regex_gotten_store) fprintf(outfile,
         "Size disagreement: pcre_fullinfo=%d call to malloc for %d\n",
         (int)size, (int)regex_gotten_store);
@@ -2009,10 +3018,28 @@
         fprintf(outfile, "Named capturing subpatterns:\n");
         while (namecount-- > 0)
           {
-          fprintf(outfile, "  %s %*s%3d\n", nametable + 2,
-            nameentrysize - 3 - (int)strlen((char *)nametable + 2), "",
-            GET2(nametable, 0));
+#if defined SUPPORT_PCRE8 && defined SUPPORT_PCRE16
+          int imm2_size = use_pcre16 ? 1 : 2;
+#else
+          int imm2_size = IMM2_SIZE;
+#endif
+          int length = (int)STRLEN(nametable + imm2_size);
+          fprintf(outfile, "  ");
+          PCHARSV(nametable, imm2_size, length, outfile);
+          while (length++ < nameentrysize - imm2_size) putc(' ', outfile);
+#if defined SUPPORT_PCRE8 && defined SUPPORT_PCRE16
+          fprintf(outfile, "%3d\n", use_pcre16?
+             (int)(((PCRE_SPTR16)nametable)[0])
+            :((int)nametable[0] << 8) | (int)nametable[1]);
+          nametable += nameentrysize * (use_pcre16 ? 2 : 1);
+#else
+          fprintf(outfile, "%3d\n", GET2(nametable, 0));
+#ifdef SUPPORT_PCRE8
           nametable += nameentrysize;
+#else
+          nametable += nameentrysize * 2;
+#endif
+#endif
           }
         }

@@ -2020,7 +3047,7 @@
       if (hascrorlf) fprintf(outfile, "Contains explicit CR or LF match\n");

       all_options = ((real_pcre *)re)->options;
-      if (do_flip) all_options = byteflip(all_options, sizeof(all_options));
+      if (do_flip) all_options = swap_uint32(all_options);

       if (get_options == 0) fprintf(outfile, "No options\n");
         else fprintf(outfile, "Options:%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s\n",
@@ -2036,9 +3063,9 @@
           ((get_options & PCRE_EXTRA) != 0)? " extra" : "",
           ((get_options & PCRE_UNGREEDY) != 0)? " ungreedy" : "",
           ((get_options & PCRE_NO_AUTO_CAPTURE) != 0)? " no_auto_capture" : "",
-          ((get_options & PCRE_UTF8) != 0)? " utf8" : "",
+          ((get_options & PCRE_UTF8) != 0)? " utf" : "",
           ((get_options & PCRE_UCP) != 0)? " ucp" : "",
-          ((get_options & PCRE_NO_UTF8_CHECK) != 0)? " no_utf8_check" : "",
+          ((get_options & PCRE_NO_UTF8_CHECK) != 0)? " no_utf_check" : "",
           ((get_options & PCRE_NO_START_OPTIMIZE) != 0)? " no_start_optimize" : "",
           ((get_options & PCRE_DUPNAMES) != 0)? " dupnames" : "");

@@ -2080,13 +3107,18 @@
         }
       else
         {
-        int ch = first_char & 255;
-        const char *caseless = ((first_char & REQ_CASELESS) == 0)?
+        const char *caseless =
+          ((((real_pcre *)re)->flags & PCRE_FCH_CASELESS) == 0)?
           "" : " (caseless)";
-        if (PRINTHEX(ch))
-          fprintf(outfile, "First char = \'%c\'%s\n", ch, caseless);
+
+        if (PRINTOK(first_char))
+          fprintf(outfile, "First char = \'%c\'%s\n", first_char, caseless);
         else
-          fprintf(outfile, "First char = %d%s\n", ch, caseless);
+          {
+          fprintf(outfile, "First char = ");
+          pchar(first_char, outfile);
+          fprintf(outfile, "%s\n", caseless);
+          }
         }

       if (need_char < 0)
@@ -2095,13 +3127,18 @@
         }
       else
         {
-        int ch = need_char & 255;
-        const char *caseless = ((need_char & REQ_CASELESS) == 0)?
+        const char *caseless =
+          ((((real_pcre *)re)->flags & PCRE_RCH_CASELESS) == 0)?
           "" : " (caseless)";
-        if (PRINTHEX(ch))
-          fprintf(outfile, "Need char = \'%c\'%s\n", ch, caseless);
+
+        if (PRINTOK(need_char))
+          fprintf(outfile, "Need char = \'%c\'%s\n", need_char, caseless);
         else
-          fprintf(outfile, "Need char = %d%s\n", ch, caseless);
+          {
+          fprintf(outfile, "Need char = ");
+          pchar(need_char, outfile);
+          fprintf(outfile, "%s\n", caseless);
+          }
         }

       /* Don't output study size; at present it is in any case a fixed
@@ -2118,42 +3155,44 @@
           fprintf(outfile, "Study returned NULL\n");
         else
           {
-          uschar *start_bits = NULL;
+          pcre_uint8 *start_bits = NULL;
           int minlength;

-          new_info(re, extra, PCRE_INFO_MINLENGTH, &minlength);
-          fprintf(outfile, "Subject length lower bound = %d\n", minlength);
+          if (new_info(re, extra, PCRE_INFO_MINLENGTH, &minlength) == 0)
+            fprintf(outfile, "Subject length lower bound = %d\n", minlength);

-          new_info(re, extra, PCRE_INFO_FIRSTTABLE, &start_bits);
-          if (start_bits == NULL)
-            fprintf(outfile, "No set of starting bytes\n");
-          else
+          if (new_info(re, extra, PCRE_INFO_FIRSTTABLE, &start_bits) == 0)
             {
-            int i;
-            int c = 24;
-            fprintf(outfile, "Starting byte set: ");
-            for (i = 0; i < 256; i++)
+            if (start_bits == NULL)
+              fprintf(outfile, "No set of starting bytes\n");
+            else
               {
-              if ((start_bits[i/8] & (1<<(i&7))) != 0)
+              int i;
+              int c = 24;
+              fprintf(outfile, "Starting byte set: ");
+              for (i = 0; i < 256; i++)
                 {
-                if (c > 75)
+                if ((start_bits[i/8] & (1<<(i&7))) != 0)
                   {
-                  fprintf(outfile, "\n  ");
-                  c = 2;
+                  if (c > 75)
+                    {
+                    fprintf(outfile, "\n  ");
+                    c = 2;
+                    }
+                  if (PRINTOK(i) && i != ' ')
+                    {
+                    fprintf(outfile, "%c ", i);
+                    c += 2;
+                    }
+                  else
+                    {
+                    fprintf(outfile, "\\x%02x ", i);
+                    c += 5;
+                    }
                   }
-                if (PRINTHEX(i) && i != ' ')
-                  {
-                  fprintf(outfile, "%c ", i);
-                  c += 2;
-                  }
-                else
-                  {
-                  fprintf(outfile, "\\x%02x ", i);
-                  c += 5;
-                  }
                 }
+              fprintf(outfile, "\n");
               }
-            fprintf(outfile, "\n");
             }
           }

@@ -2162,15 +3201,17 @@
         if ((study_options & PCRE_STUDY_JIT_COMPILE) != 0)
           {
           int jit;
-          new_info(re, extra, PCRE_INFO_JIT, &jit);
-          if (jit)
-            fprintf(outfile, "JIT study was successful\n");
-          else
+          if (new_info(re, extra, PCRE_INFO_JIT, &jit) == 0)
+            {
+            if (jit)
+              fprintf(outfile, "JIT study was successful\n");
+            else
 #ifdef SUPPORT_JIT
-            fprintf(outfile, "JIT study was not successful\n");
+              fprintf(outfile, "JIT study was not successful\n");
 #else
-            fprintf(outfile, "JIT support is not available in this version of PCRE\n");
+              fprintf(outfile, "JIT support is not available in this version of PCRE\n");
 #endif
+            }
           }
         }
       }
@@ -2188,16 +3229,17 @@
         }
       else
         {
-        uschar sbuf[8];
-        sbuf[0] = (uschar)((true_size >> 24) & 255);
-        sbuf[1] = (uschar)((true_size >> 16) & 255);
-        sbuf[2] = (uschar)((true_size >>  8) & 255);
-        sbuf[3] = (uschar)((true_size) & 255);
+        pcre_uint8 sbuf[8];

-        sbuf[4] = (uschar)((true_study_size >> 24) & 255);
-        sbuf[5] = (uschar)((true_study_size >> 16) & 255);
-        sbuf[6] = (uschar)((true_study_size >>  8) & 255);
-        sbuf[7] = (uschar)((true_study_size) & 255);
+        if (do_flip) regexflip(re, extra);
+        sbuf[0] = (pcre_uint8)((true_size >> 24) & 255);
+        sbuf[1] = (pcre_uint8)((true_size >> 16) & 255);
+        sbuf[2] = (pcre_uint8)((true_size >>  8) & 255);
+        sbuf[3] = (pcre_uint8)((true_size) & 255);
+        sbuf[4] = (pcre_uint8)((true_study_size >> 24) & 255);
+        sbuf[5] = (pcre_uint8)((true_study_size >> 16) & 255);
+        sbuf[6] = (pcre_uint8)((true_study_size >>  8) & 255);
+        sbuf[7] = (pcre_uint8)((true_study_size) & 255);

         if (fwrite(sbuf, 1, 8, f) < 8 ||
             fwrite(re, 1, true_size, f) < true_size)
@@ -2225,7 +3267,10 @@
         }

       new_free(re);
-      if (extra != NULL) pcre_free_study(extra);
+      if (extra != NULL)
+        {
+        PCRE_FREE_STUDY(extra);
+        }
       if (locale_set)
         {
         new_free((void *)tables);
@@ -2240,8 +3285,8 @@

   for (;;)
     {
-    uschar *q;
-    uschar *bptr;
+    pcre_uint8 *q;
+    pcre_uint8 *bptr;
     int *use_offsets = offsets;
     int use_size_offsets = size_offsets;
     int callout_data = 0;
@@ -2257,15 +3302,15 @@
     int g_notempty = 0;
     int use_dfa = 0;

-    options = 0;
-
     *copynames = 0;
     *getnames = 0;

-    copynamesptr = copynames;
-    getnamesptr = getnames;
+    cn16ptr = copynames;
+    gn16ptr = getnames;
+    cn8ptr = copynames8;
+    gn8ptr = getnames8;

-    pcre_callout = callout;
+    SET_PCRE_CALLOUT(callout);
     first_callout = 1;
     last_callout_mark = NULL;
     callout_extra = 0;
@@ -2273,6 +3318,7 @@
     callout_fail_count = 999999;
     callout_fail_id = -1;
     show_malloc = 0;
+    options = 0;

     if (extra != NULL) extra->flags &=
       ~(PCRE_EXTRA_MATCH_LIMIT|PCRE_EXTRA_MATCH_LIMIT_RECURSION);
@@ -2307,9 +3353,25 @@
       {
       int i = 0;
       int n = 0;
-
-      if (c == '\\') switch ((c = *p++))
+      
+      /* In UTF mode, input can be UTF-8, so just copy all non-backslash bytes.
+      In non-UTF mode, allow the value of the byte to fall through to later,
+      where values greater than 127 are turned into UTF-8 when running in
+      16-bit mode. */
+      
+      if (c != '\\')
         {
+        if (use_utf)
+          {
+          *q++ = c;
+          continue;
+          }    
+        }  
+ 
+      /* Handle backslash escapes */
+       
+      else switch ((c = *p++))
+        {
         case 'a': c =    7; break;
         case 'b': c = '\b'; break;
         case 'e': c =   27; break;
@@ -2324,27 +3386,12 @@
         c -= '0';
         while (i++ < 2 && isdigit(*p) && *p != '8' && *p != '9')
           c = c * 8 + *p++ - '0';
-
-#if !defined NOUTF8
-        if (use_utf8 && c > 255)
-          {
-          unsigned char buff8[8];
-          int ii, utn;
-          utn = ord2utf8(c, buff8);
-          for (ii = 0; ii < utn - 1; ii++) *q++ = buff8[ii];
-          c = buff8[ii];   /* Last byte */
-          }
-#endif
         break;

         case 'x':
-
-        /* Handle \x{..} specially - new Perl thing for utf8 */
-
-#if !defined NOUTF8
         if (*p == '{')
           {
-          unsigned char *pt = p;
+          pcre_uint8 *pt = p;
           c = 0;

           /* We used to have "while (isxdigit(*(++pt)))" here, but it fails
@@ -2356,29 +3403,17 @@
             c = c * 16 + tolower(*pt) - ((isdigit(*pt))? '0' : 'a' - 10);
           if (*pt == '}')
             {
-            unsigned char buff8[8];
-            int ii, utn;
-            if (use_utf8)
-              {
-              utn = ord2utf8(c, buff8);
-              for (ii = 0; ii < utn - 1; ii++) *q++ = buff8[ii];
-              c = buff8[ii];   /* Last byte */
-              }
-            else
-             {
-             if (c > 255)
-               fprintf(outfile, "** Character \\x{%x} is greater than 255 and "
-                 "UTF-8 mode is not enabled.\n"
-                 "** Truncation will probably give the wrong result.\n", c);
-             }
             p = pt + 1;
             break;
             }
-          /* Not correct form; fall through */
+          /* Not correct form for \x{...}; fall through */
           }
-#endif

-        /* Ordinary \x */
+        /* \x without {} always defines just one byte in 8-bit mode. This 
+        allows UTF-8 characters to be constructed byte by byte, and also allows 
+        invalid UTF-8 sequences to be made. Just copy the byte in UTF mode. 
+        Otherwise, pass it down to later code so that it can be turned into 
+        UTF-8 when running in 16-bit mode. */

         c = 0;
         while (i++ < 2 && isxdigit(*p))
@@ -2386,6 +3421,11 @@
           c = c * 16 + tolower(*p) - ((isdigit(*p))? '0' : 'a' - 10);
           p++;
           }
+        if (use_utf)
+          { 
+          *q++ = c;
+          continue;    
+          } 
         break;

         case 0:   /* \ followed by EOF allows for an empty line */
@@ -2418,14 +3458,7 @@
           }
         else if (isalnum(*p))
           {
-          uschar *npp = copynamesptr;
-          while (isalnum(*p)) *npp++ = *p++;
-          *npp++ = 0;
-          *npp = 0;
-          n = pcre_get_stringnumber(re, (char *)copynamesptr);
-          if (n < 0)
-            fprintf(outfile, "no parentheses with name \"%s\"\n", copynamesptr);
-          copynamesptr = npp;
+          READ_CAPTURE_NAME(p, &cn8ptr, &cn16ptr, re);
           }
         else if (*p == '+')
           {
@@ -2434,7 +3467,7 @@
           }
         else if (*p == '-')
           {
-          pcre_callout = NULL;
+          SET_PCRE_CALLOUT(NULL);
           p++;
           }
         else if (*p == '!')
@@ -2488,14 +3521,7 @@
           }
         else if (isalnum(*p))
           {
-          uschar *npp = getnamesptr;
-          while (isalnum(*p)) *npp++ = *p++;
-          *npp++ = 0;
-          *npp = 0;
-          n = pcre_get_stringnumber(re, (char *)getnamesptr);
-          if (n < 0)
-            fprintf(outfile, "no parentheses with name \"%s\"\n", getnamesptr);
-          getnamesptr = npp;
+          READ_CAPTURE_NAME(p, &gn8ptr, &gn16ptr, re);
           }
         continue;

@@ -2505,9 +3531,9 @@
             && (extra->flags & PCRE_EXTRA_EXECUTABLE_JIT) != 0
             && extra->executable_jit != NULL)
           {
-      if (jit_stack != NULL) pcre_jit_stack_free(jit_stack);
-      jit_stack = pcre_jit_stack_alloc(1, n * 1024);
-      pcre_assign_jit_stack(extra, jit_callback, jit_stack);
+          if (jit_stack != NULL) PCRE_JIT_STACK_FREE(jit_stack);
+          jit_stack = PCRE_JIT_STACK_ALLOC(1, n * 1024);
+          PCRE_ASSIGN_JIT_STACK(extra, jit_callback, jit_stack);
           }
         continue;

@@ -2603,8 +3629,36 @@
           }
         continue;
         }
-      *q++ = c;
+
+      /* We now have a character value in c that may be greater than 255. In 
+      16-bit mode, we always convert characters to UTF-8 so that values greater 
+      than 255 can be passed to non-UTF 16-bit strings. In 8-bit mode we
+      convert to UTF-8 if we are in UTF mode. Values greater than 127 in UTF 
+      mode must have come from \x{...} or octal constructs because values from
+      \x.. get this far only in non-UTF mode. */
+
+      if (use_pcre16 || use_utf)
+        {
+        pcre_uint8 buff8[8];
+        int ii, utn;
+        utn = ord2utf8(c, buff8);
+        for (ii = 0; ii < utn; ii++) *q++ = buff8[ii];
+        }
+      else
+        {
+        if (c > 255)
+          {
+          fprintf(outfile, "** Character \\x{%x} is greater than 255 "
+            "and UTF-8 mode is not enabled.\n", c);
+          fprintf(outfile, "** Truncation will probably give the wrong "
+            "result.\n");
+          }
+        *q++ = c;
+        }
       }
+      
+    /* Reached end of subject string */
+       
     *q = 0;
     len = (int)(q - dbuffer);

@@ -2666,13 +3720,13 @@
           if (pmatch[i].rm_so >= 0)
             {
             fprintf(outfile, "%2d: ", (int)i);
-            (void)pchars(dbuffer + pmatch[i].rm_so,
+            PCHARSV(dbuffer, pmatch[i].rm_so,
               pmatch[i].rm_eo - pmatch[i].rm_so, outfile);
             fprintf(outfile, "\n");
             if (do_showcaprest || (i == 0 && do_showrest))
               {
               fprintf(outfile, "%2d+ ", (int)i);
-              (void)pchars(dbuffer + pmatch[i].rm_eo, len - pmatch[i].rm_eo,
+              PCHARSV(dbuffer, pmatch[i].rm_eo, len - pmatch[i].rm_eo,
                 outfile);
               fprintf(outfile, "\n");
               }
@@ -2680,13 +3734,41 @@
           }
         }
       free(pmatch);
+      goto NEXT_DATA;
       }

+#endif  /* !defined NOPOSIX */
+
     /* Handle matching via the native interface - repeats for /g and /G */

-    else
-#endif  /* !defined NOPOSIX */
+#ifdef SUPPORT_PCRE16
+    if (use_pcre16)
+      {
+      len = to16(TRUE, bptr, (((real_pcre *)re)->options) & PCRE_UTF8, len);
+      switch(len)
+        {
+        case -1:
+        fprintf(outfile, "**Failed: invalid UTF-8 string cannot be "
+          "converted to UTF-16\n");
+        goto NEXT_DATA;

+        case -2:
+        fprintf(outfile, "**Failed: character value greater than 0x10ffff "
+          "cannot be converted to UTF-16\n");
+        goto NEXT_DATA;
+
+        case -3:
+        fprintf(outfile, "**Failed: character value greater than 0xffff "
+          "cannot be converted to 16-bit in non-UTF mode\n");
+        goto NEXT_DATA;   
+
+        default:
+        break;
+        }
+      bptr = (pcre_uint8 *)buffer16;
+      }
+#endif
+
     for (;; gmatched++)    /* Loop for /g or /G */
       {
       markptr = NULL;
@@ -2702,17 +3784,20 @@
           {
           int workspace[1000];
           for (i = 0; i < timeitm; i++)
-            count = pcre_dfa_exec(re, extra, (char *)bptr, len, start_offset,
-              options | g_notempty, use_offsets, use_size_offsets, workspace,
-              sizeof(workspace)/sizeof(int));
+            {
+            PCRE_DFA_EXEC(count, re, extra, bptr, len, start_offset,
+              (options | g_notempty), use_offsets, use_size_offsets, workspace,
+              (sizeof(workspace)/sizeof(int)));
+            }
           }
         else
 #endif

         for (i = 0; i < timeitm; i++)
-          count = pcre_exec(re, extra, (char *)bptr, len,
-            start_offset, options | g_notempty, use_offsets, use_size_offsets);
-
+          {
+          PCRE_EXEC(count, re, extra, bptr, len, start_offset,
+            (options | g_notempty), use_offsets, use_size_offsets);
+          }
         time_taken = clock() - start_time;
         fprintf(outfile, "Execute time %.4f milliseconds\n",
           (((double)time_taken * 1000.0) / (double)timeitm) /
@@ -2757,7 +3842,7 @@
           }
         extra->flags |= PCRE_EXTRA_CALLOUT_DATA;
         extra->callout_data = &callout_data;
-        count = pcre_exec(re, extra, (char *)bptr, len, start_offset,
+        PCRE_EXEC(count, re, extra, bptr, len, start_offset,
           options | g_notempty, use_offsets, use_size_offsets);
         extra->flags &= ~PCRE_EXTRA_CALLOUT_DATA;
         }
@@ -2769,9 +3854,9 @@
       else if (all_use_dfa || use_dfa)
         {
         int workspace[1000];
-        count = pcre_dfa_exec(re, extra, (char *)bptr, len, start_offset,
-          options | g_notempty, use_offsets, use_size_offsets, workspace,
-          sizeof(workspace)/sizeof(int));
+        PCRE_DFA_EXEC(count, re, extra, bptr, len, start_offset,
+          (options | g_notempty), use_offsets, use_size_offsets, workspace,
+          (sizeof(workspace)/sizeof(int)));
         if (count == 0)
           {
           fprintf(outfile, "Matched, but too many subsidiary matches\n");
@@ -2782,8 +3867,8 @@

       else
         {
-        count = pcre_exec(re, extra, (char *)bptr, len,
-          start_offset, options | g_notempty, use_offsets, use_size_offsets);
+        PCRE_EXEC(count, re, extra, bptr, len, start_offset,
+          options | g_notempty, use_offsets, use_size_offsets);
         if (count == 0)
           {
           fprintf(outfile, "Matched, but too many substrings\n");
@@ -2796,6 +3881,7 @@
       if (count >= 0)
         {
         int i, maxcount;
+        void *cnptr, *gnptr;

 #if !defined NODFA
         if (all_use_dfa || use_dfa) maxcount = use_size_offsets/2; else
@@ -2822,7 +3908,8 @@

         if (do_allcaps)
           {
-          new_info(re, NULL, PCRE_INFO_CAPTURECOUNT, &count);
+          if (new_info(re, NULL, PCRE_INFO_CAPTURECOUNT, &count) < 0)
+            goto SKIP_DATA;
           count++;   /* Allow for full match */
           if (count * 2 > use_size_offsets) count = use_size_offsets/2;
           }
@@ -2844,95 +3931,154 @@
           else
             {
             fprintf(outfile, "%2d: ", i/2);
-            (void)pchars(bptr + use_offsets[i],
+            PCHARSV(bptr, use_offsets[i],
               use_offsets[i+1] - use_offsets[i], outfile);
             fprintf(outfile, "\n");
             if (do_showcaprest || (i == 0 && do_showrest))
               {
               fprintf(outfile, "%2d+ ", i/2);
-              (void)pchars(bptr + use_offsets[i+1], len - use_offsets[i+1],
+              PCHARSV(bptr, use_offsets[i+1], len - use_offsets[i+1],
                 outfile);
               fprintf(outfile, "\n");
               }
             }
           }

-        if (markptr != NULL) fprintf(outfile, "MK: %s\n", markptr);
+        if (markptr != NULL)
+          {
+          fprintf(outfile, "MK: ");
+          PCHARSV(markptr, 0, -1, outfile);
+          fprintf(outfile, "\n");
+          }

         for (i = 0; i < 32; i++)
           {
           if ((copystrings & (1 << i)) != 0)
             {
+            int rc;
             char copybuffer[256];
-            int rc = pcre_copy_substring((char *)bptr, use_offsets, count,
-              i, copybuffer, sizeof(copybuffer));
+            PCRE_COPY_SUBSTRING(rc, bptr, use_offsets, count, i,
+              copybuffer, sizeof(copybuffer));
             if (rc < 0)
               fprintf(outfile, "copy substring %d failed %d\n", i, rc);
             else
-              fprintf(outfile, "%2dC %s (%d)\n", i, copybuffer, rc);
+              {
+              fprintf(outfile, "%2dC ", i);
+              PCHARSV(copybuffer, 0, rc, outfile);
+              fprintf(outfile, " (%d)\n", rc);
+              }
             }
           }

-        for (copynamesptr = copynames;
-             *copynamesptr != 0;
-             copynamesptr += (int)strlen((char*)copynamesptr) + 1)
+        cnptr = copynames;
+        for (;;)
           {
+          int rc;
           char copybuffer[256];
-          int rc = pcre_copy_named_substring(re, (char *)bptr, use_offsets,
-            count, (char *)copynamesptr, copybuffer, sizeof(copybuffer));
+
+          if (use_pcre16)
+            {
+            if (*(pcre_uint16 *)cnptr == 0) break;
+            }
+          else
+            {
+            if (*(pcre_uint8 *)cnptr == 0) break;
+            }
+
+          PCRE_COPY_NAMED_SUBSTRING(rc, re, bptr, use_offsets, count,
+            cnptr, copybuffer, sizeof(copybuffer));
+
           if (rc < 0)
-            fprintf(outfile, "copy substring %s failed %d\n", copynamesptr, rc);
+            {
+            fprintf(outfile, "copy substring ");
+            PCHARSV(cnptr, 0, -1, outfile);
+            fprintf(outfile, " failed %d\n", rc);
+            }
           else
-            fprintf(outfile, "  C %s (%d) %s\n", copybuffer, rc, copynamesptr);
+            {
+            fprintf(outfile, "  C ");
+            PCHARSV(copybuffer, 0, rc, outfile);
+            fprintf(outfile, " (%d) ", rc);
+            PCHARSV(cnptr, 0, -1, outfile);
+            putc('\n', outfile);
+            }
+
+          cnptr = (char *)cnptr + (STRLEN(cnptr) + 1) * CHAR_SIZE;
           }

         for (i = 0; i < 32; i++)
           {
           if ((getstrings & (1 << i)) != 0)
             {
+            int rc;
             const char *substring;
-            int rc = pcre_get_substring((char *)bptr, use_offsets, count,
-              i, &substring);
+            PCRE_GET_SUBSTRING(rc, bptr, use_offsets, count, i, &substring);
             if (rc < 0)
               fprintf(outfile, "get substring %d failed %d\n", i, rc);
             else
               {
-              fprintf(outfile, "%2dG %s (%d)\n", i, substring, rc);
-              pcre_free_substring(substring);
+              fprintf(outfile, "%2dG ", i);
+              PCHARSV(substring, 0, rc, outfile);
+              fprintf(outfile, " (%d)\n", rc);
+              PCRE_FREE_SUBSTRING(substring);
               }
             }
           }

-        for (getnamesptr = getnames;
-             *getnamesptr != 0;
-             getnamesptr += (int)strlen((char*)getnamesptr) + 1)
+        gnptr = getnames;
+        for (;;)
           {
+          int rc;
           const char *substring;
-          int rc = pcre_get_named_substring(re, (char *)bptr, use_offsets,
-            count, (char *)getnamesptr, &substring);
+
+          if (use_pcre16)
+            {
+            if (*(pcre_uint16 *)gnptr == 0) break;
+            }
+          else
+            {
+            if (*(pcre_uint8 *)gnptr == 0) break;
+            }
+
+          PCRE_GET_NAMED_SUBSTRING(rc, re, bptr, use_offsets, count,
+            gnptr, &substring);
           if (rc < 0)
-            fprintf(outfile, "copy substring %s failed %d\n", getnamesptr, rc);
+            {
+            fprintf(outfile, "get substring ");
+            PCHARSV(gnptr, 0, -1, outfile);
+            fprintf(outfile, " failed %d\n", rc);
+            }
           else
             {
-            fprintf(outfile, "  G %s (%d) %s\n", substring, rc, getnamesptr);
-            pcre_free_substring(substring);
+            fprintf(outfile, "  G ");
+            PCHARSV(substring, 0, rc, outfile);
+            fprintf(outfile, " (%d) ", rc);
+            PCHARSV(gnptr, 0, -1, outfile);
+            PCRE_FREE_SUBSTRING(substring);
+            putc('\n', outfile);
             }
+
+          gnptr = (char *)gnptr + (STRLEN(gnptr) + 1) * CHAR_SIZE;
           }

         if (getlist)
           {
+          int rc;
           const char **stringlist;
-          int rc = pcre_get_substring_list((char *)bptr, use_offsets, count,
-            &stringlist);
+          PCRE_GET_SUBSTRING_LIST(rc, bptr, use_offsets, count, &stringlist);
           if (rc < 0)
             fprintf(outfile, "get substring list failed %d\n", rc);
           else
             {
             for (i = 0; i < count; i++)
-              fprintf(outfile, "%2dL %s\n", i, stringlist[i]);
+              {
+              fprintf(outfile, "%2dL ", i);
+              PCHARSV(stringlist[i], 0, -1, outfile);
+              putc('\n', outfile);
+              }
             if (stringlist[i] != NULL)
               fprintf(outfile, "string list not terminated by NULL\n");
-            pcre_free_substring_list(stringlist);
+            PCRE_FREE_SUBSTRING_LIST(stringlist);
             }
           }
         }
@@ -2942,11 +4088,15 @@
       else if (count == PCRE_ERROR_PARTIAL)
         {
         if (markptr == NULL) fprintf(outfile, "Partial match");
-          else fprintf(outfile, "Partial match, mark=%s", markptr);
+        else
+          {
+          fprintf(outfile, "Partial match, mark=");
+          PCHARSV(markptr, 0, -1, outfile);
+          }
         if (use_size_offsets > 1)
           {
           fprintf(outfile, ": ");
-          pchars(bptr + use_offsets[0], use_offsets[1] - use_offsets[0],
+          PCHARSV(bptr, use_offsets[0], use_offsets[1] - use_offsets[0],
             outfile);
           }
         fprintf(outfile, "\n");
@@ -2963,7 +4113,7 @@
       terminated by CRLF, an advance of one character just passes the \r,
       whereas we should prefer the longer newline sequence, as does the code in
       pcre_exec(). Fudge the offset value to achieve this. We check for a
-      newline setting in the pattern; if none was set, use pcre_config() to
+      newline setting in the pattern; if none was set, use PCRE_CONFIG() to
       find the default.

       Otherwise, in the case of UTF-8 matching, the advance must be one
@@ -2979,7 +4129,7 @@
           if ((obits & PCRE_NEWLINE_BITS) == 0)
             {
             int d;
-            (void)pcre_config(PCRE_CONFIG_NEWLINE, &d);
+            (void)PCRE_CONFIG(PCRE_CONFIG_NEWLINE, &d);
             /* Note that these values are always the ASCII ones, even in
             EBCDIC environments. CR = 13, NL = 10. */
             obits = (d == 13)? PCRE_NEWLINE_CR :
@@ -2993,10 +4143,23 @@
                (obits & PCRE_NEWLINE_BITS) == PCRE_NEWLINE_ANYCRLF)
               &&
               start_offset < len - 1 &&
-              bptr[start_offset] == '\r' &&
-              bptr[start_offset+1] == '\n')
+#if defined SUPPORT_PCRE8 && defined SUPPORT_PCRE16
+              (use_pcre16?
+                   ((PCRE_SPTR16)bptr)[start_offset] == '\r'
+                && ((PCRE_SPTR16)bptr)[start_offset + 1] == '\n'
+              :
+                   bptr[start_offset] == '\r'
+                && bptr[start_offset + 1] == '\n')
+#elif defined SUPPORT_PCRE16
+                 ((PCRE_SPTR16)bptr)[start_offset] == '\r'
+              && ((PCRE_SPTR16)bptr)[start_offset + 1] == '\n'
+#else
+                 bptr[start_offset] == '\r'
+              && bptr[start_offset + 1] == '\n'
+#endif
+              )
             onechar++;
-          else if (use_utf8)
+          else if (use_utf)
             {
             while (start_offset + onechar < len)
               {
@@ -3013,21 +4176,35 @@
             case PCRE_ERROR_NOMATCH:
             if (gmatched == 0)
               {
-              if (markptr == NULL) fprintf(outfile, "No match\n");
-                else fprintf(outfile, "No match, mark = %s\n", markptr);
+              if (markptr == NULL)
+                {
+                fprintf(outfile, "No match\n");
+                }
+              else
+                {
+                fprintf(outfile, "No match, mark = ");
+                PCHARSV(markptr, 0, -1, outfile);
+                putc('\n', outfile);
+                }
               }
             break;

             case PCRE_ERROR_BADUTF8:
             case PCRE_ERROR_SHORTUTF8:
-            fprintf(outfile, "Error %d (%s UTF-8 string)", count,
-              (count == PCRE_ERROR_BADUTF8)? "bad" : "short");
+            fprintf(outfile, "Error %d (%s UTF-%s string)", count,
+              (count == PCRE_ERROR_BADUTF8)? "bad" : "short",
+              use_pcre16? "16" : "8");
             if (use_size_offsets >= 2)
               fprintf(outfile, " offset=%d reason=%d", use_offsets[0],
                 use_offsets[1]);
             fprintf(outfile, "\n");
             break;

+            case PCRE_ERROR_BADUTF8_OFFSET:
+            fprintf(outfile, "Error %d (bad UTF-%s offset)\n", count,
+              use_pcre16? "16" : "8");
+            break;
+
             default:
             if (count < 0 && (-count) < sizeof(errtexts)/sizeof(const char *))
               fprintf(outfile, "Error %d (%s)\n", count, errtexts[-count]);
@@ -3067,7 +4244,7 @@

       else
         {
-        bptr += use_offsets[1];
+        bptr += use_offsets[1] * CHAR_SIZE;
         len -= use_offsets[1];
         }
       }  /* End of loop for /g and /G */
@@ -3082,7 +4259,10 @@
 #endif

   if (re != NULL) new_free(re);
-  if (extra != NULL) pcre_free_study(extra);
+  if (extra != NULL)
+    {
+    PCRE_FREE_STUDY(extra);
+    }
   if (locale_set)
     {
     new_free((void *)tables);
@@ -3091,7 +4271,7 @@
     }
   if (jit_stack != NULL)
     {
-    pcre_jit_stack_free(jit_stack);
+    PCRE_JIT_STACK_FREE(jit_stack);
     jit_stack = NULL;
     }
   }
@@ -3108,6 +4288,10 @@
 free(pbuffer);
 free(offsets);

+#ifdef SUPPORT_PCRE16
+if (buffer16 != NULL) free(buffer16);
+#endif
+
 return yield;
 }
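
Note on the pcretest.c changes above: the data-line code now expands every
character value to UTF-8 when use_pcre16 or use_utf is set, and the 16-bit path
later converts that buffer and reports -2 or -3 for values that do not fit. The
following is a minimal sketch of those two steps for a single code point. The
helper names encode_utf8 and encode_16bit are hypothetical and are not the
ord2utf8() or to16() functions from pcretest.c; only the meaning of the -2 and
-3 results is taken from the error messages in the diff, and the 4-byte UTF-8
limit is an assumption of this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not pcretest's ord2utf8): encode one code point in
   the range 0..0x10FFFF as UTF-8. buf must have room for 4 bytes.
   Returns the number of bytes written. */
static int encode_utf8(uint32_t cp, uint8_t *buf)
{
  assert(buf != NULL && cp <= 0x10FFFFu);
  if (cp < 0x80u)
    {
    buf[0] = (uint8_t)cp;
    return 1;
    }
  if (cp < 0x800u)
    {
    buf[0] = (uint8_t)(0xC0u | (cp >> 6));
    buf[1] = (uint8_t)(0x80u | (cp & 0x3Fu));
    return 2;
    }
  if (cp < 0x10000u)
    {
    buf[0] = (uint8_t)(0xE0u | (cp >> 12));
    buf[1] = (uint8_t)(0x80u | ((cp >> 6) & 0x3Fu));
    buf[2] = (uint8_t)(0x80u | (cp & 0x3Fu));
    return 3;
    }
  buf[0] = (uint8_t)(0xF0u | (cp >> 18));
  buf[1] = (uint8_t)(0x80u | ((cp >> 12) & 0x3Fu));
  buf[2] = (uint8_t)(0x80u | ((cp >> 6) & 0x3Fu));
  buf[3] = (uint8_t)(0x80u | (cp & 0x3Fu));
  return 4;
}

/* Hypothetical helper (not pcretest's to16): store one code point as 16-bit
   units. out must have room for 2 units. Returns the number of units
   written, -3 for a value above 0xFFFF in non-UTF mode, or -2 for a value
   above 0x10FFFF in UTF mode. Surrogate code points are not rejected in
   this sketch. */
static int encode_16bit(uint32_t cp, int utf_mode, uint16_t *out)
{
  assert(out != NULL);
  if (cp <= 0xFFFFu)
    {
    out[0] = (uint16_t)cp;
    return 1;
    }
  if (!utf_mode) return -3;
  if (cp > 0x10FFFFu) return -2;
  cp -= 0x10000u;
  out[0] = (uint16_t)(0xD800u | (cp >> 10));    /* high surrogate */
  out[1] = (uint16_t)(0xDC00u | (cp & 0x3FFu)); /* low surrogate */
  return 2;
}
```

In this sketch a value such as 0x20AC takes three bytes as UTF-8 but a single
16-bit unit, which is why a non-UTF 16-bit subject can carry values above 255.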

Modified: code/trunk/perltest.pl
===================================================================
--- code/trunk/perltest.pl    2011-12-28 16:10:09 UTC (rev 835)
+++ code/trunk/perltest.pl    2011-12-28 17:16:11 UTC (rev 836)
@@ -111,6 +111,10 @@

$pattern =~ s/S(?=[a-zA-Z]*$)//g;

+ # Remove /Y from a pattern (asks pcretest to disable PCRE optimization)
+
+ $pattern =~ s/Y(?=[a-zA-Z]*$)//;
+
# Check that the pattern is valid

eval "\$_ =~ ${pattern}";
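
Note on the perltest.pl change above: the new substitution
s/Y(?=[a-zA-Z]*$)// deletes the first Y that is followed only by letters up to
the end of the pattern string, that is, a Y inside the trailing run of modifier
letters. A rough C rendering of the same rule is sketched below. The function
name strip_Y_modifier is hypothetical, the sketch assumes the string carries no
trailing newline, and isalpha() matches [a-zA-Z] only in the "C" locale.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Hypothetical C rendering of s/Y(?=[a-zA-Z]*$)//. Deletes, in place, the
   first 'Y' found in the trailing run of letters. Returns 1 if a 'Y' was
   removed and 0 otherwise. */
static int strip_Y_modifier(char *pattern)
{
  size_t len, start, i;

  assert(pattern != NULL);
  len = strlen(pattern);

  /* Find where the trailing run of letters begins. */
  start = len;
  while (start > 0 && isalpha((unsigned char)pattern[start - 1])) start--;

  /* Remove the leftmost 'Y' inside that run, if there is one. */
  for (i = start; i < len; i++)
    {
    if (pattern[i] == 'Y')
      {
      /* len - i bytes: the rest of the string plus the terminating NUL. */
      memmove(pattern + i, pattern + i + 1, len - i);
      return 1;
      }
    }
  return 0;
}
```

A Y inside the pattern body is left alone because the closing delimiter
interrupts the run of letters before the end of the string.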

Modified: code/trunk/sljit/sljitConfig.h
===================================================================
--- code/trunk/sljit/sljitConfig.h    2011-12-28 16:10:09 UTC (rev 835)
+++ code/trunk/sljit/sljitConfig.h    2011-12-28 17:16:11 UTC (rev 836)
@@ -1,7 +1,7 @@
 /*
  *    Stack-less Just-In-Time compiler
  *
- *    Copyright 2009-2010 Zoltan Herczeg (hzmester@???). All rights reserved.
+ *    Copyright 2009-2012 Zoltan Herczeg (hzmester@???). All rights reserved.
  *
  * Redistribution and use in source and binary forms, with or without modification, are
  * permitted provided that the following conditions are met:

Modified: code/trunk/sljit/sljitConfigInternal.h
===================================================================
--- code/trunk/sljit/sljitConfigInternal.h    2011-12-28 16:10:09 UTC (rev 835)
+++ code/trunk/sljit/sljitConfigInternal.h    2011-12-28 17:16:11 UTC (rev 836)
@@ -1,7 +1,7 @@
 /*
  *    Stack-less Just-In-Time compiler
  *
- *    Copyright 2009-2010 Zoltan Herczeg (hzmester@???). All rights reserved.
+ *    Copyright 2009-2012 Zoltan Herczeg (hzmester@???). All rights reserved.
  *
  * Redistribution and use in source and binary forms, with or without modification, are
  * permitted provided that the following conditions are met:
@@ -354,8 +354,8 @@
 #endif /* !SLJIT_UNALIGNED */

 #if (defined SLJIT_EXECUTABLE_ALLOCATOR && SLJIT_EXECUTABLE_ALLOCATOR)
-void* sljit_malloc_exec(sljit_uw size);
-void sljit_free_exec(void* ptr);
+SLJIT_API_FUNC_ATTRIBUTE void* sljit_malloc_exec(sljit_uw size);
+SLJIT_API_FUNC_ATTRIBUTE void sljit_free_exec(void* ptr);
 #define SLJIT_MALLOC_EXEC(size) sljit_malloc_exec(size)
 #define SLJIT_FREE_EXEC(ptr) sljit_free_exec(ptr)
 #endif

Modified: code/trunk/sljit/sljitExecAllocator.c
===================================================================
--- code/trunk/sljit/sljitExecAllocator.c    2011-12-28 16:10:09 UTC (rev 835)
+++ code/trunk/sljit/sljitExecAllocator.c    2011-12-28 17:16:11 UTC (rev 836)
@@ -1,7 +1,7 @@
 /*
  *    Stack-less Just-In-Time compiler
  *
- *    Copyright 2009-2010 Zoltan Herczeg (hzmester@???). All rights reserved.
+ *    Copyright 2009-2012 Zoltan Herczeg (hzmester@???). All rights reserved.
  *
  * Redistribution and use in source and binary forms, with or without modification, are
  * permitted provided that the following conditions are met:
@@ -163,7 +163,7 @@
     }
 }

-void* sljit_malloc_exec(sljit_uw size)
+SLJIT_API_FUNC_ATTRIBUTE void* sljit_malloc_exec(sljit_uw size)
 {
     struct block_header *header;
     struct block_header *next_header;
@@ -231,7 +231,7 @@
     return MEM_START(header);
 }

-void sljit_free_exec(void* ptr)
+SLJIT_API_FUNC_ATTRIBUTE void sljit_free_exec(void* ptr)
 {
     struct block_header *header;
     struct free_block* free_block;
@@ -263,8 +263,11 @@
         header->prev_size = free_block->size;
     }

+    /* The whole chunk is free. */
     if (SLJIT_UNLIKELY(!free_block->header.prev_size && header->size == 1)) {
+        /* If this block is freed, we still have (allocated_size / 2) free space. */
         if (total_size - free_block->size > (allocated_size * 3 / 2)) {
+            total_size -= free_block->size;
             sljit_remove_free_block(free_block);
             free_chunk(free_block, free_block->size + sizeof(struct block_header));
         }
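
Note on the sljitExecAllocator.c change above: the added line subtracts the
released chunk from total_size, so the counter no longer overstates the memory
still held and later release decisions use an accurate figure. The sketch below
models only that bookkeeping. The struct pool_stats and the function
maybe_release_chunk are hypothetical names for illustration; the threshold
expression is copied from the diff, and the real allocator also unlinks and
frees the chunk.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of the two counters used by the release check. */
struct pool_stats
{
  size_t allocated_size;  /* bytes currently handed out to callers */
  size_t total_size;      /* bytes currently held by the allocator */
};

/* Decide whether a completely free chunk may be returned to the system.
   The chunk is released only if the remaining space still exceeds
   allocated_size * 3 / 2. Returns 1 if released, 0 if kept. */
static int maybe_release_chunk(struct pool_stats *s, size_t chunk_size)
{
  assert(s != NULL && chunk_size <= s->total_size);
  if (s->total_size - chunk_size > s->allocated_size * 3 / 2)
    {
    s->total_size -= chunk_size;  /* the bookkeeping step added in r836 */
    return 1;
    }
  return 0;
}
```

With allocated_size 100 and total_size 400, releasing a 200-byte chunk
succeeds and leaves total_size at 200, after which a further 100-byte chunk is
kept. Without the subtraction the second check would still see 400 and release
the chunk anyway.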

Modified: code/trunk/sljit/sljitLir.c
===================================================================
--- code/trunk/sljit/sljitLir.c    2011-12-28 16:10:09 UTC (rev 835)
+++ code/trunk/sljit/sljitLir.c    2011-12-28 17:16:11 UTC (rev 836)
@@ -1,7 +1,7 @@
 /*
  *    Stack-less Just-In-Time compiler
  *
- *    Copyright 2009-2010 Zoltan Herczeg (hzmester@???). All rights reserved.
+ *    Copyright 2009-2012 Zoltan Herczeg (hzmester@???). All rights reserved.
  *
  * Redistribution and use in source and binary forms, with or without modification, are
  * permitted provided that the following conditions are met:

Modified: code/trunk/sljit/sljitLir.h
===================================================================
--- code/trunk/sljit/sljitLir.h    2011-12-28 16:10:09 UTC (rev 835)
+++ code/trunk/sljit/sljitLir.h    2011-12-28 17:16:11 UTC (rev 836)
@@ -1,7 +1,7 @@
 /*
  *    Stack-less Just-In-Time compiler
  *
- *    Copyright 2009-2010 Zoltan Herczeg (hzmester@???). All rights reserved.
+ *    Copyright 2009-2012 Zoltan Herczeg (hzmester@???). All rights reserved.
  *
  * Redistribution and use in source and binary forms, with or without modification, are
  * permitted provided that the following conditions are met:
@@ -195,6 +195,8 @@
     int local_size;
     /* Code size. */
     sljit_uw size;
+    /* For statistical purposes. */
+    sljit_uw executable_size;

 #if (defined SLJIT_CONFIG_X86_32 && SLJIT_CONFIG_X86_32)
     int args;
@@ -291,6 +293,15 @@
 SLJIT_API_FUNC_ATTRIBUTE void* sljit_generate_code(struct sljit_compiler *compiler);
 SLJIT_API_FUNC_ATTRIBUTE void sljit_free_code(void* code);

+/*
+ After the code generation we can retrieve the allocated executable memory size,
+ although this area may not be fully filled with instructions depending on some
+ optimizations. This function is useful only for statistical purposes.
+
+ Before a successful code generation, this function returns with 0.
+*/
+static SLJIT_INLINE sljit_uw sljit_get_generated_code_size(struct sljit_compiler *compiler) { return compiler->executable_size; }
+
 /* Instruction generation. Returns with error code. */

 /*
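
Note on the sljitLir.h change above: executable_size is a plain field that
stays 0 until a backend's code generator succeeds and stores the number of
bytes it allocated. The backend hunks in this revision compute that as the
instruction count times the instruction unit size, or as the byte count
directly on x86. The toy model below sketches that contract. Every toy_* name
is hypothetical, and nothing here calls the real sljit API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct sljit_compiler, reduced to two fields. */
struct toy_compiler
{
  size_t size;             /* number of instruction units emitted so far */
  size_t executable_size;  /* bytes of executable memory, 0 until generated */
};

static void toy_init(struct toy_compiler *c)
{
  assert(c != NULL);
  c->size = 0;
  c->executable_size = 0;
}

/* Record that 'units' more instruction units were emitted. */
static void toy_emit(struct toy_compiler *c, size_t units)
{
  assert(c != NULL);
  c->size += units;
}

/* Model a successful code generation on a backend whose instruction unit is
   unit_bytes wide (for example 2 for a 16-bit unit, 1 for a byte stream). */
static void toy_generate(struct toy_compiler *c, size_t unit_bytes)
{
  assert(c != NULL && unit_bytes > 0);
  c->executable_size = c->size * unit_bytes;
}

/* Accessor with the same shape as the one added in the diff. */
static size_t toy_get_generated_code_size(const struct toy_compiler *c)
{
  assert(c != NULL);
  return c->executable_size;
}
```

Keeping the value in the compiler structure means a caller can read it for
statistics after generation without asking the allocator.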

Modified: code/trunk/sljit/sljitNativeARM_Thumb2.c
===================================================================
--- code/trunk/sljit/sljitNativeARM_Thumb2.c    2011-12-28 16:10:09 UTC (rev 835)
+++ code/trunk/sljit/sljitNativeARM_Thumb2.c    2011-12-28 17:16:11 UTC (rev 836)
@@ -1,7 +1,7 @@
 /*
  *    Stack-less Just-In-Time compiler
  *
- *    Copyright 2009-2010 Zoltan Herczeg (hzmester@???). All rights reserved.
+ *    Copyright 2009-2012 Zoltan Herczeg (hzmester@???). All rights reserved.
  *
  * Redistribution and use in source and binary forms, with or without modification, are
  * permitted provided that the following conditions are met:
@@ -416,6 +416,7 @@

     SLJIT_CACHE_FLUSH(code, code_ptr);
     compiler->error = SLJIT_ERR_COMPILED;
+    compiler->executable_size = compiler->size * sizeof(sljit_uh);
     /* Set thumb mode flag. */
     return (void*)((sljit_uw)code | 0x1);
 }

Modified: code/trunk/sljit/sljitNativeARM_v5.c
===================================================================
--- code/trunk/sljit/sljitNativeARM_v5.c    2011-12-28 16:10:09 UTC (rev 835)
+++ code/trunk/sljit/sljitNativeARM_v5.c    2011-12-28 17:16:11 UTC (rev 836)
@@ -1,7 +1,7 @@
 /*
  *    Stack-less Just-In-Time compiler
  *
- *    Copyright 2009-2010 Zoltan Herczeg (hzmester@???). All rights reserved.
+ *    Copyright 2009-2012 Zoltan Herczeg (hzmester@???). All rights reserved.
  *
  * Redistribution and use in source and binary forms, with or without modification, are
  * permitted provided that the following conditions are met:
@@ -788,6 +788,7 @@

     SLJIT_CACHE_FLUSH(code, code_ptr);
     compiler->error = SLJIT_ERR_COMPILED;
+    compiler->executable_size = size * sizeof(sljit_uw);
     return code;
 }

Modified: code/trunk/sljit/sljitNativeMIPS_32.c
===================================================================
--- code/trunk/sljit/sljitNativeMIPS_32.c    2011-12-28 16:10:09 UTC (rev 835)
+++ code/trunk/sljit/sljitNativeMIPS_32.c    2011-12-28 17:16:11 UTC (rev 836)
@@ -1,7 +1,7 @@
 /*
  *    Stack-less Just-In-Time compiler
  *
- *    Copyright 2009-2010 Zoltan Herczeg (hzmester@???). All rights reserved.
+ *    Copyright 2009-2012 Zoltan Herczeg (hzmester@???). All rights reserved.
  *
  * Redistribution and use in source and binary forms, with or without modification, are
  * permitted provided that the following conditions are met:

Modified: code/trunk/sljit/sljitNativeMIPS_common.c
===================================================================
--- code/trunk/sljit/sljitNativeMIPS_common.c    2011-12-28 16:10:09 UTC (rev 835)
+++ code/trunk/sljit/sljitNativeMIPS_common.c    2011-12-28 17:16:11 UTC (rev 836)
@@ -1,7 +1,7 @@
 /*
  *    Stack-less Just-In-Time compiler
  *
- *    Copyright 2009-2010 Zoltan Herczeg (hzmester@???). All rights reserved.
+ *    Copyright 2009-2012 Zoltan Herczeg (hzmester@???). All rights reserved.
  *
  * Redistribution and use in source and binary forms, with or without modification, are
  * permitted provided that the following conditions are met:
@@ -397,6 +397,7 @@
     }

     compiler->error = SLJIT_ERR_COMPILED;
+    compiler->executable_size = compiler->size * sizeof(sljit_ins);
 #ifndef __GNUC__
     SLJIT_CACHE_FLUSH(code, code_ptr);
 #else

Modified: code/trunk/sljit/sljitNativePPC_32.c
===================================================================
--- code/trunk/sljit/sljitNativePPC_32.c    2011-12-28 16:10:09 UTC (rev 835)
+++ code/trunk/sljit/sljitNativePPC_32.c    2011-12-28 17:16:11 UTC (rev 836)
@@ -1,7 +1,7 @@
 /*
  *    Stack-less Just-In-Time compiler
  *
- *    Copyright 2009-2010 Zoltan Herczeg (hzmester@???). All rights reserved.
+ *    Copyright 2009-2012 Zoltan Herczeg (hzmester@???). All rights reserved.
  *
  * Redistribution and use in source and binary forms, with or without modification, are
  * permitted provided that the following conditions are met:

Modified: code/trunk/sljit/sljitNativePPC_64.c
===================================================================
--- code/trunk/sljit/sljitNativePPC_64.c    2011-12-28 16:10:09 UTC (rev 835)
+++ code/trunk/sljit/sljitNativePPC_64.c    2011-12-28 17:16:11 UTC (rev 836)
@@ -1,7 +1,7 @@
 /*
  *    Stack-less Just-In-Time compiler
  *
- *    Copyright 2009-2010 Zoltan Herczeg (hzmester@???). All rights reserved.
+ *    Copyright 2009-2012 Zoltan Herczeg (hzmester@???). All rights reserved.
  *
  * Redistribution and use in source and binary forms, with or without modification, are
  * permitted provided that the following conditions are met:

Modified: code/trunk/sljit/sljitNativePPC_common.c
===================================================================
--- code/trunk/sljit/sljitNativePPC_common.c    2011-12-28 16:10:09 UTC (rev 835)
+++ code/trunk/sljit/sljitNativePPC_common.c    2011-12-28 17:16:11 UTC (rev 836)
@@ -1,7 +1,7 @@
 /*
  *    Stack-less Just-In-Time compiler
  *
- *    Copyright 2009-2010 Zoltan Herczeg (hzmester@???). All rights reserved.
+ *    Copyright 2009-2012 Zoltan Herczeg (hzmester@???). All rights reserved.
  *
  * Redistribution and use in source and binary forms, with or without modification, are
  * permitted provided that the following conditions are met:
@@ -354,6 +354,7 @@

     SLJIT_CACHE_FLUSH(code, code_ptr);
     compiler->error = SLJIT_ERR_COMPILED;
+    compiler->executable_size = compiler->size * sizeof(sljit_ins);

 #if (defined SLJIT_CONFIG_PPC_64 && SLJIT_CONFIG_PPC_64)
     if (((sljit_w)code_ptr) & 0x4)

Modified: code/trunk/sljit/sljitNativeX86_32.c
===================================================================
--- code/trunk/sljit/sljitNativeX86_32.c    2011-12-28 16:10:09 UTC (rev 835)
+++ code/trunk/sljit/sljitNativeX86_32.c    2011-12-28 17:16:11 UTC (rev 836)
@@ -1,7 +1,7 @@
 /*
  *    Stack-less Just-In-Time compiler
  *
- *    Copyright 2009-2010 Zoltan Herczeg (hzmester@???). All rights reserved.
+ *    Copyright 2009-2012 Zoltan Herczeg (hzmester@???). All rights reserved.
  *
  * Redistribution and use in source and binary forms, with or without modification, are
  * permitted provided that the following conditions are met:

Modified: code/trunk/sljit/sljitNativeX86_64.c
===================================================================
--- code/trunk/sljit/sljitNativeX86_64.c    2011-12-28 16:10:09 UTC (rev 835)
+++ code/trunk/sljit/sljitNativeX86_64.c    2011-12-28 17:16:11 UTC (rev 836)
@@ -1,7 +1,7 @@
 /*
  *    Stack-less Just-In-Time compiler
  *
- *    Copyright 2009-2010 Zoltan Herczeg (hzmester@???). All rights reserved.
+ *    Copyright 2009-2012 Zoltan Herczeg (hzmester@???). All rights reserved.
  *
  * Redistribution and use in source and binary forms, with or without modification, are
  * permitted provided that the following conditions are met:

Modified: code/trunk/sljit/sljitNativeX86_common.c
===================================================================
--- code/trunk/sljit/sljitNativeX86_common.c    2011-12-28 16:10:09 UTC (rev 835)
+++ code/trunk/sljit/sljitNativeX86_common.c    2011-12-28 17:16:11 UTC (rev 836)
@@ -1,7 +1,7 @@
 /*
  *    Stack-less Just-In-Time compiler
  *
- *    Copyright 2009-2010 Zoltan Herczeg (hzmester@???). All rights reserved.
+ *    Copyright 2009-2012 Zoltan Herczeg (hzmester@???). All rights reserved.
  *
  * Redistribution and use in source and binary forms, with or without modification, are
  * permitted provided that the following conditions are met:
@@ -357,22 +357,22 @@
     while (jump) {
         if (jump->flags & PATCH_MB) {
             SLJIT_ASSERT((sljit_w)(jump->u.label->addr - (jump->addr + sizeof(sljit_b))) >= -128 && (sljit_w)(jump->u.label->addr - (jump->addr + sizeof(sljit_b))) <= 127);
-            *(sljit_ub*)jump->addr = jump->u.label->addr - (jump->addr + sizeof(sljit_b));
+            *(sljit_ub*)jump->addr = (sljit_ub)(jump->u.label->addr - (jump->addr + sizeof(sljit_b)));
         } else if (jump->flags & PATCH_MW) {
             if (jump->flags & JUMP_LABEL) {
 #if (defined SLJIT_CONFIG_X86_32 && SLJIT_CONFIG_X86_32)
-                *(sljit_w*)jump->addr = jump->u.label->addr - (jump->addr + sizeof(sljit_w));
+                *(sljit_w*)jump->addr = (sljit_w)(jump->u.label->addr - (jump->addr + sizeof(sljit_w)));
 #else
                 SLJIT_ASSERT((sljit_w)(jump->u.label->addr - (jump->addr + sizeof(sljit_hw))) >= -0x80000000ll && (sljit_w)(jump->u.label->addr - (jump->addr + sizeof(sljit_hw))) <= 0x7fffffffll);
-                *(sljit_hw*)jump->addr = jump->u.label->addr - (jump->addr + sizeof(sljit_hw));
+                *(sljit_hw*)jump->addr = (sljit_hw)(jump->u.label->addr - (jump->addr + sizeof(sljit_hw)));
 #endif
             }
             else {
 #if (defined SLJIT_CONFIG_X86_32 && SLJIT_CONFIG_X86_32)
-                *(sljit_w*)jump->addr = jump->u.target - (jump->addr + sizeof(sljit_w));
+                *(sljit_w*)jump->addr = (sljit_w)(jump->u.target - (jump->addr + sizeof(sljit_w)));
 #else
                 SLJIT_ASSERT((sljit_w)(jump->u.target - (jump->addr + sizeof(sljit_hw))) >= -0x80000000ll && (sljit_w)(jump->u.target - (jump->addr + sizeof(sljit_hw))) <= 0x7fffffffll);
-                *(sljit_hw*)jump->addr = jump->u.target - (jump->addr + sizeof(sljit_hw));
+                *(sljit_hw*)jump->addr = (sljit_hw)(jump->u.target - (jump->addr + sizeof(sljit_hw)));
 #endif
             }
         }
@@ -387,6 +387,7 @@
     /* Maybe we waste some space because of short jumps. */
     SLJIT_ASSERT(code_ptr <= code + compiler->size);
     compiler->error = SLJIT_ERR_COMPILED;
+    compiler->executable_size = compiler->size;
     return (void*)code;
 }

@@ -1360,7 +1361,7 @@
             code = (sljit_ub*)ensure_buf(compiler, 1 + 4);
             FAIL_IF(!code);
             INC_CSIZE(4);
-            *(sljit_hw*)code = src1w;
+            *(sljit_hw*)code = (sljit_hw)src1w;
         }
         else {
             EMIT_MOV(compiler, TMP_REG2, 0, SLJIT_IMM, src1w);
@@ -1403,7 +1404,7 @@
             code = (sljit_ub*)ensure_buf(compiler, 1 + 4);
             FAIL_IF(!code);
             INC_CSIZE(4);
-            *(sljit_hw*)code = src2w;
+            *(sljit_hw*)code = (sljit_hw)src2w;
         }
         else {
             EMIT_MOV(compiler, TMP_REG2, 0, SLJIT_IMM, src1w);

Modified: code/trunk/sljit/sljitUtils.c
===================================================================
--- code/trunk/sljit/sljitUtils.c    2011-12-28 16:10:09 UTC (rev 835)
+++ code/trunk/sljit/sljitUtils.c    2011-12-28 17:16:11 UTC (rev 836)
@@ -1,7 +1,7 @@
 /*
  *    Stack-less Just-In-Time compiler
  *
- *    Copyright 2009-2010 Zoltan Herczeg (hzmester@???). All rights reserved.
+ *    Copyright 2009-2012 Zoltan Herczeg (hzmester@???). All rights reserved.
  *
  * Redistribution and use in source and binary forms, with or without modification, are
  * permitted provided that the following conditions are met:

Copied: code/trunk/testdata/saved16 (from rev 835, code/branches/pcre16/testdata/saved16)
===================================================================
(Binary files differ)

Copied: code/trunk/testdata/saved8 (from rev 835, code/branches/pcre16/testdata/saved8)
===================================================================
(Binary files differ)

Modified: code/trunk/testdata/testinput1
===================================================================
--- code/trunk/testdata/testinput1    2011-12-28 16:10:09 UTC (rev 835)
+++ code/trunk/testdata/testinput1    2011-12-28 17:16:11 UTC (rev 836)
@@ -1,5 +1,6 @@
 /-- This set of tests is for features that are compatible with all versions of
-    Perl 5, in non-UTF-8 mode. --/
+    Perl >= 5.10, in non-UTF-8 mode. It should run clean for both the 8-bit and
+    16-bit PCRE libraries. --/

 /the quick brown fox/
     the quick brown fox
@@ -4318,4 +4319,935 @@
 /a[\C-X]b/
     aJb

+/\H\h\V\v/
+    X X\x0a
+    X\x09X\x0b
+    ** Failers
+    \xa0 X\x0a   
+    
+/\H*\h+\V?\v{3,4}/ 
+    \x09\x20\xa0X\x0a\x0b\x0c\x0d\x0a
+    \x09\x20\xa0\x0a\x0b\x0c\x0d\x0a
+    \x09\x20\xa0\x0a\x0b\x0c
+    ** Failers 
+    \x09\x20\xa0\x0a\x0b
+     
+/\H{3,4}/
+    XY  ABCDE
+    XY  PQR ST 
+    
+/.\h{3,4}./
+    XY  AB    PQRS
+
+/\h*X\h?\H+Y\H?Z/
+    >XNNNYZ
+    >  X NYQZ
+    ** Failers
+    >XYZ   
+    >  X NY Z
+
+/\v*X\v?Y\v+Z\V*\x0a\V+\x0b\V{2,3}\x0c/
+    >XY\x0aZ\x0aA\x0bNN\x0c
+    >\x0a\x0dX\x0aY\x0a\x0bZZZ\x0aAAA\x0bNNN\x0c
+
+/(foo)\Kbar/
+    foobar
+   
+/(foo)(\Kbar|baz)/
+    foobar
+    foobaz 
+
+/(foo\Kbar)baz/
+    foobarbaz
+
+/abc\K|def\K/g+
+    Xabcdefghi
+
+/ab\Kc|de\Kf/g+
+    Xabcdefghi
+    
+/(?=C)/g+
+    ABCDECBA
+    
+/^abc\K/+
+    abcdef
+    ** Failers
+    defabcxyz   
+
+/^(a(b))\1\g1\g{1}\g-1\g{-1}\g{-02}Z/
+    ababababbbabZXXXX
+
+/(?<A>tom|bon)-\g{A}/
+    tom-tom
+    bon-bon 
+    
+/(^(a|b\g{-1}))/
+    bacxxx
+
+/(?|(abc)|(xyz))\1/
+    abcabc
+    xyzxyz 
+    ** Failers
+    abcxyz
+    xyzabc   
+    
+/(?|(abc)|(xyz))(?1)/
+    abcabc
+    xyzabc 
+    ** Failers 
+    xyzxyz 
+ 
+/^X(?5)(a)(?|(b)|(q))(c)(d)(Y)/
+    XYabcdY
+
+/^X(?7)(a)(?|(b|(r)(s))|(q))(c)(d)(Y)/
+    XYabcdY
+
+/^X(?7)(a)(?|(b|(?|(r)|(t))(s))|(q))(c)(d)(Y)/
+    XYabcdY
+
+/(?'abc'\w+):\k<abc>{2}/
+    a:aaxyz
+    ab:ababxyz
+    ** Failers
+    a:axyz
+    ab:abxyz
+
+/(?'abc'\w+):\g{abc}{2}/
+    a:aaxyz
+    ab:ababxyz
+    ** Failers
+    a:axyz
+    ab:abxyz
+
+/^(?<ab>a)? (?(<ab>)b|c) (?('ab')d|e)/x
+    abd
+    ce
+
+/^(a.)\g-1Z/
+    aXaXZ
+
+/^(a.)\g{-1}Z/
+    aXaXZ
+
+/^(?(DEFINE) (?<A> a) (?<B> b) )  (?&A) (?&B) /x
+    abcd
+
+/(?<NAME>(?&NAME_PAT))\s+(?<ADDR>(?&ADDRESS_PAT))
+  (?(DEFINE)
+  (?<NAME_PAT>[a-z]+)
+  (?<ADDRESS_PAT>\d+)
+  )/x
+    metcalfe 33
+
+/(?(DEFINE)(?<byte>2[0-4]\d|25[0-5]|1\d\d|[1-9]?\d))\b(?&byte)(\.(?&byte)){3}/
+    1.2.3.4
+    131.111.10.206
+    10.0.0.0
+    ** Failers
+    10.6
+    455.3.4.5
+
+/\b(?&byte)(\.(?&byte)){3}(?(DEFINE)(?<byte>2[0-4]\d|25[0-5]|1\d\d|[1-9]?\d))/
+    1.2.3.4
+    131.111.10.206
+    10.0.0.0
+    ** Failers
+    10.6
+    455.3.4.5
+
+/^(\w++|\s++)*$/
+    now is the time for all good men to come to the aid of the party
+    *** Failers
+    this is not a line with only words and spaces!
+
+/(\d++)(\w)/
+    12345a
+    *** Failers
+    12345+
+
+/a++b/
+    aaab
+
+/(a++b)/
+    aaab
+
+/(a++)b/
+    aaab
+
+/([^()]++|\([^()]*\))+/
+    ((abc(ade)ufh()()x
+
+/\(([^()]++|\([^()]+\))+\)/
+    (abc)
+    (abc(def)xyz)
+    *** Failers
+    ((()aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
+
+/^([^()]|\((?1)*\))*$/
+    abc
+    a(b)c
+    a(b(c))d
+    *** Failers)
+    a(b(c)d
+
+/^>abc>([^()]|\((?1)*\))*<xyz<$/
+   >abc>123<xyz<
+   >abc>1(2)3<xyz<
+   >abc>(1(2)3)<xyz<
+
+/^(?:((.)(?1)\2|)|((.)(?3)\4|.))$/i
+    1221
+    Satanoscillatemymetallicsonatas
+    AmanaplanacanalPanama
+    AblewasIereIsawElba
+    *** Failers
+    Thequickbrownfox
+
+/^(\d+|\((?1)([+*-])(?1)\)|-(?1))$/
+    12
+    (((2+2)*-3)-7)
+    -12
+    *** Failers
+    ((2+2)*-3)-7)
+
+/^(x(y|(?1){2})z)/
+    xyz
+    xxyzxyzz
+    *** Failers
+    xxyzz
+    xxyzxyzxyzz
+
+/((< (?: (?(R) \d++  | [^<>]*+) | (?2)) * >))/x
+    <>
+    <abcd>
+    <abc <123> hij>
+    <abc <def> hij>
+    <abc<>def>
+    <abc<>
+    *** Failers
+    <abc
+
+/^a+(*FAIL)/
+    aaaaaa
+    
+/a+b?c+(*FAIL)/
+    aaabccc
+
+/a+b?(*PRUNE)c+(*FAIL)/
+    aaabccc
+
+/a+b?(*COMMIT)c+(*FAIL)/
+    aaabccc
+    
+/a+b?(*SKIP)c+(*FAIL)/
+    aaabcccaaabccc
+
+/^(?:aaa(*THEN)\w{6}|bbb(*THEN)\w{5}|ccc(*THEN)\w{4}|\w{3})/
+    aaaxxxxxx
+    aaa++++++ 
+    bbbxxxxx
+    bbb+++++ 
+    cccxxxx
+    ccc++++ 
+    dddddddd   
+
+/^(aaa(*THEN)\w{6}|bbb(*THEN)\w{5}|ccc(*THEN)\w{4}|\w{3})/
+    aaaxxxxxx
+    aaa++++++ 
+    bbbxxxxx
+    bbb+++++ 
+    cccxxxx
+    ccc++++ 
+    dddddddd   
+
+/a+b?(*THEN)c+(*FAIL)/
+    aaabccc
+
+/(A (A|B(*ACCEPT)|C) D)(E)/x
+    AB
+    ABX
+    AADE
+    ACDE
+    ** Failers
+    AD 
+        
+/^\W*+(?:((.)\W*+(?1)\W*+\2|)|((.)\W*+(?3)\W*+\4|\W*+.\W*+))\W*+$/i
+    1221
+    Satan, oscillate my metallic sonatas!
+    A man, a plan, a canal: Panama!
+    Able was I ere I saw Elba.
+    *** Failers
+    The quick brown fox
+
+/^((.)(?1)\2|.)$/
+    a
+    aba
+    aabaa  
+    abcdcba 
+    pqaabaaqp  
+    ablewasiereisawelba
+    rhubarb
+    the quick brown fox  
+
+/(a)(?<=b(?1))/
+    baz
+    ** Failers
+    caz  
+    
+/(?<=b(?1))(a)/
+    zbaaz
+    ** Failers
+    aaa  
+    
+/(?<X>a)(?<=b(?&X))/
+    baz
+
+/^(?|(abc)|(def))\1/
+    abcabc
+    defdef 
+    ** Failers
+    abcdef
+    defabc   
+    
+/^(?|(abc)|(def))(?1)/
+    abcabc
+    defabc
+    ** Failers
+    defdef
+    abcdef    
+
+/(?:a(?<quote> (?<apostrophe>')|(?<realquote>")) |b(?<quote> (?<apostrophe>')|(?<realquote>")) ) (?('quote')[a-z]+|[0-9]+)/xJ
+    a\"aaaaa
+    b\"aaaaa 
+    ** Failers 
+    b\"11111
+
+/(?:(?1)|B)(A(*F)|C)/
+    ABCD
+    CCD
+    ** Failers
+    CAD   
+
+/^(?:(?1)|B)(A(*F)|C)/
+    CCD
+    BCD 
+    ** Failers
+    ABCD
+    CAD
+    BAD    
+
+/(?:(?1)|B)(A(*ACCEPT)XX|C)D/
+    AAD
+    ACD
+    BAD
+    BCD
+    BAX  
+    ** Failers
+    ACX
+    ABC   
+
+/(?(DEFINE)(A))B(?1)C/
+    BAC
+
+/(?(DEFINE)((A)\2))B(?1)C/
+    BAAC
+
+/(?<pn> \( ( [^()]++ | (?&pn) )* \) )/x
+    (ab(cd)ef)
+
+/^(?!a(*SKIP)b)/
+    ac
+    
+/^(?=a(*SKIP)b|ac)/
+    ** Failers
+    ac
+    
+/^(?=a(*THEN)b|ac)/
+    ac
+    
+/^(?=a(*PRUNE)b)/
+    ab  
+    ** Failers 
+    ac
+
+/^(?=a(*ACCEPT)b)/
+    ac
+
+/^(?(?!a(*SKIP)b))/
+    ac
+
+/(?>a\Kb)/
+    ab
+
+/((?>a\Kb))/
+    ab
+
+/(a\Kb)/
+    ab
+    
+/^a\Kcz|ac/
+    ac
+    
+/(?>a\Kbz|ab)/
+    ab 
+
+/^(?&t)(?(DEFINE)(?<t>a\Kb))$/
+    ab
+
+/^([^()]|\((?1)*\))*$/
+    a(b)c
+    a(b(c)d)e 
+
+/(?P<L1>(?P<L2>0)(?P>L1)|(?P>L2))/
+    0
+    00
+    0000  
+
+/(?P<L1>(?P<L2>0)|(?P>L2)(?P>L1))/
+    0
+    00
+    0000  
+
+/--- This one does fail, as expected, in Perl. It needs the complex item at the
+     end of the pattern. A single letter instead of (B|D) makes it not fail,
+     which I think is a Perl bug. --- /
+
+/A(*COMMIT)(B|D)/
+    ACABX
+
+/--- Check the use of names for failure ---/
+
+/^(A(*PRUNE:A)B|C(*PRUNE:B)D)/K
+    ** Failers
+    AC
+    CB    
+    
+/--- Force no study, otherwise mark is not seen. The studied version is in
+     test 2 because it isn't Perl-compatible. ---/
+
+/(*MARK:A)(*SKIP:B)(C|X)/KSS
+    C
+    D
+     
+/^(A(*THEN:A)B|C(*THEN:B)D)/K
+    ** Failers
+    CB    
+
+/^(?:A(*THEN:A)B|C(*THEN:B)D)/K
+    CB    
+    
+/^(?>A(*THEN:A)B|C(*THEN:B)D)/K
+    CB    
+    
+/--- This should succeed, as the skip causes bump to offset 1 (the mark). Note
+that we have to have something complicated such as (B|Z) at the end because,
+for Perl, a simple character somehow causes an unwanted optimization to mess
+with the handling of backtracking verbs. ---/
+
+/A(*MARK:A)A+(*SKIP:A)(B|Z) | AC/xK
+    AAAC
+    
+/--- Test skipping over a non-matching mark. ---/
+
+/A(*MARK:A)A+(*MARK:B)(*SKIP:A)(B|Z) | AC/xK
+    AAAC
+    
+/--- Check shorthand for MARK ---/
+
+/A(*:A)A+(*SKIP:A)(B|Z) | AC/xK
+    AAAC
+
+/--- Don't loop! Force no study, otherwise mark is not seen. ---/
+
+/(*:A)A+(*SKIP:A)(B|Z)/KSS
+    AAAC
+
+/--- This should succeed, as a non-existent skip name disables the skip ---/ 
+
+/A(*MARK:A)A+(*SKIP:B)(B|Z) | AC/xK
+    AAAC
+
+/A(*MARK:A)A+(*SKIP:B)(B|Z) | AC(*:B)/xK
+    AAAC
+
+/--- COMMIT at the start of a pattern should act like an anchor. Again, 
+however, we need the complication for Perl. ---/
+
+/(*COMMIT)(A|P)(B|P)(C|P)/
+    ABCDEFG
+    ** Failers
+    DEFGABC  
+
+/--- COMMIT inside an atomic group can't stop backtracking over the group. ---/
+
+/(\w+)(?>b(*COMMIT))\w{2}/
+    abbb
+
+/(\w+)b(*COMMIT)\w{2}/
+    abbb
+
+/--- Check opening parens in comment when seeking forward reference. ---/ 
+
+/(?&t)(?#()(?(DEFINE)(?<t>a))/
+    bac
+
+/--- COMMIT should override THEN ---/
+
+/(?>(*COMMIT)(?>yes|no)(*THEN)(*F))?/
+  yes
+
+/(?>(*COMMIT)(yes|no)(*THEN)(*F))?/
+  yes
+
+/b?(*SKIP)c/
+    bc
+    abc
+   
+/(*SKIP)bc/
+    a
+
+/(*SKIP)b/
+    a 
+
+/(?P<abn>(?P=abn)xxx|)+/
+    xxx
+
+/(?i:([^b]))(?1)/
+    aa
+    aA     
+    ** Failers
+    ab
+    aB
+    Ba
+    ba
+
+/^(?&t)*+(?(DEFINE)(?<t>a))\w$/
+    aaaaaaX
+    ** Failers 
+    aaaaaa 
+
+/^(?&t)*(?(DEFINE)(?<t>a))\w$/
+    aaaaaaX
+    aaaaaa 
+
+/^(a)*+(\w)/
+    aaaaX
+    YZ 
+    ** Failers 
+    aaaa
+
+/^(?:a)*+(\w)/
+    aaaaX
+    YZ 
+    ** Failers 
+    aaaa
+
+/^(a)++(\w)/
+    aaaaX
+    ** Failers 
+    aaaa
+    YZ 
+
+/^(?:a)++(\w)/
+    aaaaX
+    ** Failers 
+    aaaa
+    YZ 
+
+/^(a)?+(\w)/
+    aaaaX
+    YZ 
+
+/^(?:a)?+(\w)/
+    aaaaX
+    YZ 
+
+/^(a){2,}+(\w)/
+    aaaaX
+    ** Failers
+    aaa
+    YZ 
+
+/^(?:a){2,}+(\w)/
+    aaaaX
+    ** Failers
+    aaa
+    YZ 
+
+/(a|)*(?1)b/
+    b
+    ab
+    aab  
+
+/(a)++(?1)b/
+    ** Failers
+    ab 
+    aab
+
+/(a)*+(?1)b/
+    ** Failers
+    ab
+    aab  
+
+/(?1)(?:(b)){0}/
+    b
+
+/(foo ( \( ((?:(?> [^()]+ )|(?2))*) \) ) )/x
+    foo(bar(baz)+baz(bop))
+
+/(A (A|B(*ACCEPT)|C) D)(E)/x
+    AB
+
+/\A.*?(?:a|b(*THEN)c)/
+    ba
+
+/\A.*?(?:a|bc)/
+    ba
+
+/\A.*?(a|b(*THEN)c)/
+    ba
+
+/\A.*?(a|bc)/
+    ba
+
+/\A.*?(?:a|b(*THEN)c)++/
+    ba
+
+/\A.*?(?:a|bc)++/
+    ba
+
+/\A.*?(a|b(*THEN)c)++/
+    ba
+
+/\A.*?(a|bc)++/
+    ba
+
+/\A.*?(?:a|b(*THEN)c|d)/
+    ba
+
+/\A.*?(?:a|bc|d)/
+    ba
+
+/(?:(b))++/
+    beetle
+
+/(?(?=(a(*ACCEPT)z))a)/
+    a
+
+/^(a)(?1)+ab/
+    aaaab
+    
+/^(a)(?1)++ab/
+    aaaab
+
+/^(?=a(*:M))aZ/K
+    aZbc
+
+/^(?!(*:M)b)aZ/K
+    aZbc
+
+/(?(DEFINE)(a))?b(?1)/
+    backgammon
+
+/^\N+/
+    abc\ndef
+    
+/^\N{1,}/
+    abc\ndef 
+
+/(?(R)a+|(?R)b)/
+    aaaabcde
+
+/(?(R)a+|((?R))b)/
+    aaaabcde
+
+/((?(R)a+|(?1)b))/
+    aaaabcde
+
+/((?(R1)a+|(?1)b))/
+    aaaabcde
+
+/a(*:any 
+name)/K
+    abc
+    
+/(?>(?&t)c|(?&t))(?(DEFINE)(?<t>a|b(*PRUNE)c))/
+    a
+    ba
+    bba 
+    
+/--- Checking revised (*THEN) handling ---/ 
+
+/--- Capture ---/
+
+/^.*? (a(*THEN)b) c/x
+    aabc
+
+/^.*? (a(*THEN)b|(*F)) c/x
+    aabc
+
+/^.*? ( (a(*THEN)b) | (*F) ) c/x
+    aabc
+
+/^.*? ( (a(*THEN)b) ) c/x
+    aabc
+
+/--- Non-capture ---/
+
+/^.*? (?:a(*THEN)b) c/x
+    aabc
+
+/^.*? (?:a(*THEN)b|(*F)) c/x
+    aabc
+
+/^.*? (?: (?:a(*THEN)b) | (*F) ) c/x
+    aabc
+
+/^.*? (?: (?:a(*THEN)b) ) c/x
+    aabc
+
+/--- Atomic ---/
+
+/^.*? (?>a(*THEN)b) c/x
+    aabc
+
+/^.*? (?>a(*THEN)b|(*F)) c/x
+    aabc
+
+/^.*? (?> (?>a(*THEN)b) | (*F) ) c/x
+    aabc
+
+/^.*? (?> (?>a(*THEN)b) ) c/x
+    aabc
+
+/--- Possessive capture ---/
+
+/^.*? (a(*THEN)b)++ c/x
+    aabc
+
+/^.*? (a(*THEN)b|(*F))++ c/x
+    aabc
+
+/^.*? ( (a(*THEN)b)++ | (*F) )++ c/x
+    aabc
+
+/^.*? ( (a(*THEN)b)++ )++ c/x
+    aabc
+
+/--- Possessive non-capture ---/
+
+/^.*? (?:a(*THEN)b)++ c/x
+    aabc
+
+/^.*? (?:a(*THEN)b|(*F))++ c/x
+    aabc
+
+/^.*? (?: (?:a(*THEN)b)++ | (*F) )++ c/x
+    aabc
+
+/^.*? (?: (?:a(*THEN)b)++ )++ c/x
+    aabc
+    
+/--- Condition assertion ---/
+
+/^(?(?=a(*THEN)b)ab|ac)/
+    ac
+ 
+/--- Condition ---/
+
+/^.*?(?(?=a)a|b(*THEN)c)/
+    ba
+
+/^.*?(?:(?(?=a)a|b(*THEN)c)|d)/
+    ba
+
+/^.*?(?(?=a)a(*THEN)b|c)/
+    ac
+
+/--- Assertion ---/
+
+/^.*(?=a(*THEN)b)/ 
+    aabc
+
+/------------------------------/
+
+/(?>a(*:m))/imsxSK 
+    a
+
+/(?>(a)(*:m))/imsxSK 
+    a
+
+/(?<=a(*ACCEPT)b)c/
+    xacd
+
+/(?<=(a(*ACCEPT)b))c/
+    xacd
+
+/(?<=(a(*COMMIT)b))c/
+    xabcd
+    ** Failers 
+    xacd
+    
+/(?<!a(*FAIL)b)c/
+    xcd
+    acd 
+
+/(?<=a(*:N)b)c/K
+    xabcd
+    
+/(?<=a(*PRUNE)b)c/
+    xabcd 
+
+/(?<=a(*SKIP)b)c/
+    xabcd 
+
+/(?<=a(*THEN)b)c/
+    xabcd 
+
+/(a)(?2){2}(.)/
+    abcd
+
+/(*MARK:A)(*PRUNE:B)(C|X)/KS
+    C
+    D 
+
+/(*MARK:A)(*PRUNE:B)(C|X)/KSS
+    C
+    D 
+
+/(*MARK:A)(*THEN:B)(C|X)/KS
+    C
+    D 
+
+/(*MARK:A)(*THEN:B)(C|X)/KSY
+    C
+    D 
+
+/(*MARK:A)(*THEN:B)(C|X)/KSS
+    C
+    D 
+
+/--- This should fail, as the skip causes a bump to offset 3 (the skip) ---/
+
+/A(*MARK:A)A+(*SKIP)(B|Z) | AC/xK
+    AAAC
+
+/--- Same --/
+
+/A(*MARK:A)A+(*MARK:B)(*SKIP:B)(B|Z) | AC/xK
+    AAAC
+
+/A(*:A)A+(*SKIP)(B|Z) | AC/xK
+    AAAC
+
+/--- This should fail, as a null name is the same as no name ---/
+
+/A(*MARK:A)A+(*SKIP:)(B|Z) | AC/xK
+    AAAC
+
+/--- A check on what happens after hitting a mark and them bumping along to
+something that does not even start. Perl reports tags after the failures here, 
+though it does not when the individual letters are made into something 
+more complicated. ---/
+
+/A(*:A)B|XX(*:B)Y/K
+    AABC
+    XXYZ 
+    ** Failers
+    XAQQ  
+    XAQQXZZ  
+    AXQQQ 
+    AXXQQQ 
+    
+/^(A(*THEN:A)B|C(*THEN:B)D)/K
+    AB
+    CD
+    ** Failers
+    AC
+    CB    
+    
+/^(A(*PRUNE:A)B|C(*PRUNE:B)D)/K
+    AB
+    CD
+    ** Failers
+    AC
+    CB    
+    
+/--- An empty name does not pass back an empty string. It is the same as if no
+name were given. ---/ 
+
+/^(A(*PRUNE:)B|C(*PRUNE:B)D)/K
+    AB
+    CD 
+
+/--- PRUNE goes to next bumpalong; COMMIT does not. ---/
+    
+/A(*PRUNE:A)B/K
+    ACAB
+
+/--- Mark names can be duplicated ---/
+
+/A(*:A)B|X(*:A)Y/K
+    AABC
+    XXYZ 
+    
+/b(*:m)f|a(*:n)w/K
+    aw 
+    ** Failers 
+    abc
+
+/b(*:m)f|aw/K
+    abaw
+    ** Failers 
+    abc
+    abax 
+
+/A(*MARK:A)A+(*SKIP:B)(B|Z) | AAC/xK
+    AAAC
+
+/a(*PRUNE:X)bc|qq/KY
+    ** Failers
+    axy
+
+/a(*THEN:X)bc|qq/KY
+    ** Failers
+    axy
+
+/(?=a(*MARK:A)b)..x/K
+    abxy
+    ** Failers
+    abpq  
+
+/(?=a(*MARK:A)b)..(*:Y)x/K
+    abxy
+    ** Failers
+    abpq  
+
+/(?=a(*PRUNE:A)b)..x/K
+    abxy
+    ** Failers
+    abpq  
+
+/(?=a(*PRUNE:A)b)..(*:Y)x/K
+    abxy
+    ** Failers
+    abpq  
+
+/(?=a(*THEN:A)b)..x/K
+    abxy
+    ** Failers
+    abpq  
+
+/(?=a(*THEN:A)b)..(*:Y)x/K
+    abxy
+    ** Failers
+    abpq  
+
+/(another)?(\1?)test/
+    hello world test
+
+/(another)?(\1+)test/
+    hello world test
+
 /-- End of testinput1 --/

Modified: code/trunk/testdata/testinput10
===================================================================
--- code/trunk/testdata/testinput10    2011-12-28 16:10:09 UTC (rev 835)
+++ code/trunk/testdata/testinput10    2011-12-28 17:16:11 UTC (rev 836)
@@ -1,137 +1,989 @@
-/-- These are a few representative patterns whose lengths and offsets are to be 
-shown when the link size is 2. This is just a doublecheck test to ensure the 
-sizes don't go horribly wrong when something is changed. The pattern contents 
-are all themselves checked in other tests. Unicode, including property support, 
-is required for these tests. --/
+/-- This set of tests check Unicode property support with the DFA matching 
+    functionality of pcre_dfa_exec(). The -dfa flag must be used with pcretest
+    when running it. --/

-/((?i)b)/BM
+/\pL\P{Nd}/8
+    AB
+    *** Failers
+    A0
+    00

-/(?s)(.*X|^B)/BM
+/\X./8
+    AB
+    A\x{300}BC 
+    A\x{300}\x{301}\x{302}BC 
+    *** Failers
+    \x{300}

-/(?s:.*X|^B)/BM
+/\X\X/8
+    ABC
+    A\x{300}B\x{300}\x{301}C 
+    A\x{300}\x{301}\x{302}BC 
+    *** Failers
+    \x{300}

-/^[[:alnum:]]/BM
+/^\pL+/8
+    abcd
+    a 
+    *** Failers

-/#/IxMD
+/^\PL+/8
+    1234
+    = 
+    *** Failers 
+    abcd

-/a#/IxMD
+/^\X+/8
+    abcdA\x{300}\x{301}\x{302}
+    A\x{300}\x{301}\x{302}
+    A\x{300}\x{301}\x{302}A\x{300}\x{301}\x{302}
+    a 
+    *** Failers 
+    \x{300}\x{301}\x{302}

-/x?+/BM
+/\X?abc/8
+    abc
+    A\x{300}abc
+    A\x{300}\x{301}\x{302}A\x{300}A\x{300}A\x{300}abcxyz
+    \x{300}abc  
+    *** Failers

-/x++/BM
+/^\X?abc/8
+    abc
+    A\x{300}abc
+    *** Failers
+    A\x{300}\x{301}\x{302}A\x{300}A\x{300}A\x{300}abcxyz
+    \x{300}abc

-/x{1,3}+/BM 
+/\X*abc/8
+    abc
+    A\x{300}abc
+    A\x{300}\x{301}\x{302}A\x{300}A\x{300}A\x{300}abcxyz
+    \x{300}abc  
+    *** Failers

-/(x)*+/BM
+/^\X*abc/8
+    abc
+    A\x{300}abc
+    A\x{300}\x{301}\x{302}A\x{300}A\x{300}A\x{300}abcxyz
+    *** Failers
+    \x{300}abc

-/^((a+)(?U)([ab]+)(?-U)([bc]+)(\w*))/BM
+/^\pL?=./8
+    A=b
+    =c 
+    *** Failers
+    1=2 
+    AAAA=b

-|8J\$WE\<\.rX\+ix\[d1b\!H\#\?vV0vrK\:ZH1\=2M\>iV\;\?aPhFB\<\*vW\@QW\@sO9\}cfZA\-i\'w\%hKd6gt1UJP\,15_\#QY\$M\^Mss_U\/\]\&LK9\[5vQub\^w\[KDD\<EjmhUZ\?\.akp2dF\>qmj\;2\}YWFdYx\.Ap\]hjCPTP\(n28k\+3\;o\&WXqs\/gOXdr\$\:r\'do0\;b4c\(f_Gr\=\"\\4\)\[01T7ajQJvL\$W\~mL_sS\/4h\:x\*\[ZN\=KLs\&L5zX\/\/\>it\,o\:aU\(\;Z\>pW\&T7oP\'2K\^E\:x9\'c\[\%z\-\,64JQ5AeH_G\#KijUKghQw\^\\vea3a\?kka_G\$8\#\`\*kynsxzBLru\'\]k_\[7FrVx\}\^\=\$blx\>s\-N\%j\;D\*aZDnsw\:YKZ\%Q\.Kne9\#hP\?\+b3\(SOvL\,\^\;\&u5\@\?5C5Bhb\=m\-vEh_L15Jl\]U\)0RP6\{q\%L\^_z5E\'Dw6X\b|BM
+/^\pL*=./8
+    AAAA=b
+    =c 
+    *** Failers
+    1=2

-|\$\<\.X\+ix\[d1b\!H\#\?vV0vrK\:ZH1\=2M\>iV\;\?aPhFB\<\*vW\@QW\@sO9\}cfZA\-i\'w\%hKd6gt1UJP\,15_\#QY\$M\^Mss_U\/\]\&LK9\[5vQub\^w\[KDD\<EjmhUZ\?\.akp2dF\>qmj\;2\}YWFdYx\.Ap\]hjCPTP\(n28k\+3\;o\&WXqs\/gOXdr\$\:r\'do0\;b4c\(f_Gr\=\"\\4\)\[01T7ajQJvL\$W\~mL_sS\/4h\:x\*\[ZN\=KLs\&L5zX\/\/\>it\,o\:aU\(\;Z\>pW\&T7oP\'2K\^E\:x9\'c\[\%z\-\,64JQ5AeH_G\#KijUKghQw\^\\vea3a\?kka_G\$8\#\`\*kynsxzBLru\'\]k_\[7FrVx\}\^\=\$blx\>s\-N\%j\;D\*aZDnsw\:YKZ\%Q\.Kne9\#hP\?\+b3\(SOvL\,\^\;\&u5\@\?5C5Bhb\=m\-vEh_L15Jl\]U\)0RP6\{q\%L\^_z5E\'Dw6X\b|BM
+/^\X{2,3}X/8
+    A\x{300}\x{301}\x{302}A\x{300}\x{301}\x{302}X
+    A\x{300}\x{301}\x{302}A\x{300}\x{301}\x{302}A\x{300}\x{301}\x{302}X 
+    *** Failers
+    X
+    A\x{300}\x{301}\x{302}X
+    A\x{300}\x{301}\x{302}A\x{300}\x{301}\x{302}A\x{300}\x{301}\x{302}A\x{300}\x{301}\x{302}X

-/(a(?1)b)/BM
+/^\pC\pL\pM\pN\pP\pS\pZ</8
+    \x7f\x{c0}\x{30f}\x{660}\x{66c}\x{f01}\x{1680}<
+    \np\x{300}9!\$ < 
+    ** Failers 
+    ap\x{300}9!\$ < 
+  
+/^\PC/8
+    X
+    ** Failers 
+    \x7f
+  
+/^\PL/8
+    9
+    ** Failers 
+    \x{c0}
+  
+/^\PM/8
+    X
+    ** Failers 
+    \x{30f}
+  
+/^\PN/8
+    X
+    ** Failers 
+    \x{660}
+  
+/^\PP/8
+    X
+    ** Failers 
+    \x{66c}
+  
+/^\PS/8
+    X
+    ** Failers 
+    \x{f01}
+  
+/^\PZ/8
+    X
+    ** Failers 
+    \x{1680}
+    
+/^\p{Cc}/8
+    \x{017}
+    \x{09f} 
+    ** Failers
+    \x{0600} 
+  
+/^\p{Cf}/8
+    \x{601}
+    ** Failers
+    \x{09f} 
+  
+/^\p{Cn}/8
+    ** Failers
+    \x{09f} 
+  
+/^\p{Co}/8
+    \x{f8ff}
+    ** Failers
+    \x{09f} 
+  
+/^\p{Cs}/8
+    \?\x{dfff}
+    ** Failers
+    \x{09f} 
+  
+/^\p{Ll}/8
+    a
+    ** Failers 
+    Z
+    \x{e000}  
+  
+/^\p{Lm}/8
+    \x{2b0}
+    ** Failers
+    a 
+  
+/^\p{Lo}/8
+    \x{1bb}
+    ** Failers
+    a 
+    \x{2b0}
+  
+/^\p{Lt}/8
+    \x{1c5}
+    ** Failers
+    a 
+    \x{2b0}
+  
+/^\p{Lu}/8
+    A
+    ** Failers
+    \x{2b0}
+  
+/^\p{Mc}/8
+    \x{903}
+    ** Failers
+    X
+    \x{300}
+       
+/^\p{Me}/8
+    \x{488}
+    ** Failers
+    X
+    \x{903}
+    \x{300}
+  
+/^\p{Mn}/8
+    \x{300}
+    ** Failers
+    X
+    \x{903}
+  
+/^\p{Nd}+/8
+    0123456789\x{660}\x{661}\x{662}\x{663}\x{664}\x{665}\x{666}\x{667}\x{668}\x{669}\x{66a}
+    \x{6f0}\x{6f1}\x{6f2}\x{6f3}\x{6f4}\x{6f5}\x{6f6}\x{6f7}\x{6f8}\x{6f9}\x{6fa}
+    \x{966}\x{967}\x{968}\x{969}\x{96a}\x{96b}\x{96c}\x{96d}\x{96e}\x{96f}\x{970}
+    ** Failers
+    X
+  
+/^\p{Nl}/8
+    \x{16ee}
+    ** Failers
+    X
+    \x{966}
+  
+/^\p{No}/8
+    \x{b2}
+    \x{b3}
+    ** Failers
+    X
+    \x{16ee}
+  
+/^\p{Pc}/8
+    \x5f
+    \x{203f}
+    ** Failers
+    X
+    -
+    \x{58a}
+  
+/^\p{Pd}/8
+    -
+    \x{58a}
+    ** Failers
+    X
+    \x{203f}
+  
+/^\p{Pe}/8
+    )
+    ]
+    }
+    \x{f3b}
+    ** Failers
+    X
+    \x{203f}
+    (
+    [
+    {
+    \x{f3c}
+  
+/^\p{Pf}/8
+    \x{bb}
+    \x{2019}
+    ** Failers
+    X
+    \x{203f}
+  
+/^\p{Pi}/8
+    \x{ab}
+    \x{2018}
+    ** Failers
+    X
+    \x{203f}
+  
+/^\p{Po}/8
+    !
+    \x{37e}
+    ** Failers
+    X
+    \x{203f}
+  
+/^\p{Ps}/8
+    (
+    [
+    {
+    \x{f3c}
+    ** Failers
+    X
+    )
+    ]
+    }
+    \x{f3b}
+  
+/^\p{Sc}+/8
+    $\x{a2}\x{a3}\x{a4}\x{a5}\x{a6}
+    \x{9f2}
+    ** Failers
+    X
+    \x{2c2}
+  
+/^\p{Sk}/8
+    \x{2c2}
+    ** Failers
+    X
+    \x{9f2}
+  
+/^\p{Sm}+/8
+    +<|~\x{ac}\x{2044}
+    ** Failers
+    X
+    \x{9f2}
+  
+/^\p{So}/8
+    \x{a6}
+    \x{482} 
+    ** Failers
+    X
+    \x{9f2}
+  
+/^\p{Zl}/8
+    \x{2028}
+    ** Failers
+    X
+    \x{2029}
+  
+/^\p{Zp}/8
+    \x{2029}
+    ** Failers
+    X
+    \x{2028}
+  
+/^\p{Zs}/8
+    \ \
+    \x{a0}
+    \x{1680}
+    \x{180e}
+    \x{2000}
+    \x{2001}     
+    ** Failers
+    \x{2028}
+    \x{200d} 
+  
+/\p{Nd}+(..)/8
+      \x{660}\x{661}\x{662}ABC
+  
+/\p{Nd}+?(..)/8
+      \x{660}\x{661}\x{662}ABC
+  
+/\p{Nd}{2,}(..)/8
+      \x{660}\x{661}\x{662}ABC
+  
+/\p{Nd}{2,}?(..)/8
+      \x{660}\x{661}\x{662}ABC
+  
+/\p{Nd}*(..)/8
+      \x{660}\x{661}\x{662}ABC
+  
+/\p{Nd}*?(..)/8
+      \x{660}\x{661}\x{662}ABC
+  
+/\p{Nd}{2}(..)/8
+      \x{660}\x{661}\x{662}ABC
+  
+/\p{Nd}{2,3}(..)/8
+      \x{660}\x{661}\x{662}ABC
+  
+/\p{Nd}{2,3}?(..)/8
+      \x{660}\x{661}\x{662}ABC
+  
+/\p{Nd}?(..)/8
+      \x{660}\x{661}\x{662}ABC
+  
+/\p{Nd}??(..)/8
+      \x{660}\x{661}\x{662}ABC
+  
+/\p{Nd}*+(..)/8
+      \x{660}\x{661}\x{662}ABC
+  
+/\p{Nd}*+(...)/8
+      \x{660}\x{661}\x{662}ABC
+  
+/\p{Nd}*+(....)/8
+      ** Failers
+      \x{660}\x{661}\x{662}ABC
+  
+/\p{Lu}/8i
+    A
+    a\x{10a0}B 
+    ** Failers 
+    a
+    \x{1d00}

-/(a(?1)+b)/BM
+/\p{^Lu}/8i
+    1234
+    ** Failers
+    ABC

-/a(?P<name1>b|c)d(?P<longername2>e)/BM
+/\P{Lu}/8i
+    1234
+    ** Failers
+    ABC

-/(?:a(?P<c>c(?P<d>d)))(?P<a>a)/BM
+/(?<=A\p{Nd})XYZ/8
+    A2XYZ
+    123A5XYZPQR
+    ABA\x{660}XYZpqr
+    ** Failers
+    AXYZ
+    XYZ     
+    
+/(?<!\pL)XYZ/8
+    1XYZ
+    AB=XYZ.. 
+    XYZ 
+    ** Failers
+    WXYZ

-/(?P<a>a)...(?P=a)bbb(?P>a)d/BM
+/[\p{Nd}]/8
+    1234

-/abc(?C255)de(?C)f/BM
+/[\p{Nd}+-]+/8
+    1234
+    12-34
+    12+\x{661}-34  
+    ** Failers
+    abcd

-/abcde/CBM
+/[\P{Nd}]+/8
+    abcd
+    ** Failers
+    1234

-/\x{100}/8BM
+/\D+/8
+    11111111111111111111111111111111111111111111111111111111111111111111111
+    aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
+     
+/\P{Nd}+/8
+    11111111111111111111111111111111111111111111111111111111111111111111111
+    aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa

-/\x{1000}/8BM
+/[\D]+/8
+    11111111111111111111111111111111111111111111111111111111111111111111111
+    aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa

-/\x{10000}/8BM
+/[\P{Nd}]+/8
+    11111111111111111111111111111111111111111111111111111111111111111111111
+    aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa

-/\x{100000}/8BM
+/[\D\P{Nd}]+/8
+    11111111111111111111111111111111111111111111111111111111111111111111111
+    aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa

-/\x{1000000}/8BM
+/\pL/8
+    a
+    A

-/\x{4000000}/8BM
+/\pL/8i
+    a
+    A 
+    
+/\p{Lu}/8 
+    A
+    aZ
+    ** Failers
+    abc

-/\x{7fffFFFF}/8BM
+/\p{Lu}/8i
+    A
+    aZ
+    ** Failers
+    abc

-/[\x{ff}]/8BM
+/\p{Ll}/8 
+    a
+    Az
+    ** Failers
+    ABC

-/[\x{100}]/8BM
+/\p{Ll}/8i 
+    a
+    Az
+    ** Failers
+    ABC

-/\x80/8BM
+/^\x{c0}$/8i
+    \x{c0}
+    \x{e0}

-/\xff/8BM
+/^\x{e0}$/8i
+    \x{c0}
+    \x{e0}

-/\x{0041}\x{2262}\x{0391}\x{002e}/D8M
+/A\x{391}\x{10427}\x{ff3a}\x{1fb0}/8
+    A\x{391}\x{10427}\x{ff3a}\x{1fb0}
+    ** Failers
+    a\x{391}\x{10427}\x{ff3a}\x{1fb0}   
+    A\x{3b1}\x{10427}\x{ff3a}\x{1fb0}
+    A\x{391}\x{1044F}\x{ff3a}\x{1fb0}
+    A\x{391}\x{10427}\x{ff5a}\x{1fb0}
+    A\x{391}\x{10427}\x{ff3a}\x{1fb8}
+
+/A\x{391}\x{10427}\x{ff3a}\x{1fb0}/8i
+    A\x{391}\x{10427}\x{ff3a}\x{1fb0}
+    a\x{391}\x{10427}\x{ff3a}\x{1fb0}   
+    A\x{3b1}\x{10427}\x{ff3a}\x{1fb0}
+    A\x{391}\x{1044F}\x{ff3a}\x{1fb0}
+    A\x{391}\x{10427}\x{ff5a}\x{1fb0}
+    A\x{391}\x{10427}\x{ff3a}\x{1fb8}
+
+/\x{391}+/8i
+    \x{391}\x{3b1}\x{3b1}\x{3b1}\x{391}
+
+/\x{391}{3,5}(.)/8i
+    \x{391}\x{3b1}\x{3b1}\x{3b1}\x{391}X
+
+/\x{391}{3,5}?(.)/8i
+    \x{391}\x{3b1}\x{3b1}\x{3b1}\x{391}X
+
+/[\x{391}\x{ff3a}]/8i
+    \x{391}
+    \x{ff3a}
+    \x{3b1}
+    \x{ff5a}

-/\x{D55c}\x{ad6d}\x{C5B4}/D8M 
+/[\x{c0}\x{391}]/8i
+    \x{c0}
+    \x{e0}

-/\x{65e5}\x{672c}\x{8a9e}/D8M
+/[\x{105}-\x{109}]/8i
+    \x{104}
+    \x{105}
+    \x{109}  
+    ** Failers
+    \x{100}
+    \x{10a} 
+    
+/[z-\x{100}]/8i 
+    Z
+    z
+    \x{39c}
+    \x{178}
+    |
+    \x{80}
+    \x{ff}
+    \x{100}
+    \x{101} 
+    ** Failers
+    \x{102}
+    Y
+    y

-/[\x{100}]/8BM
+/[z-\x{100}]/8i

-/[Z\x{100}]/8BM
+/^\X/8
+    A
+    A\x{300}BC 
+    A\x{300}\x{301}\x{302}BC 
+    *** Failers
+    \x{300}

-/^[\x{100}\E-\Q\E\x{150}]/B8M
+/^[\X]/8
+    X123
+    *** Failers
+    AXYZ

-/^[\QĀ\E-\QŐ\E]/B8M
+/^(\X*)C/8
+    A\x{300}\x{301}\x{302}BCA\x{300}\x{301} 
+    A\x{300}\x{301}\x{302}BCA\x{300}\x{301}C

-/^[\QĀ\E-\QŐ\E/B8M
+/^(\X*?)C/8
+    A\x{300}\x{301}\x{302}BCA\x{300}\x{301} 
+    A\x{300}\x{301}\x{302}BCA\x{300}\x{301}C

-/[\p{L}]/BM
+/^(\X*)(.)/8
+    A\x{300}\x{301}\x{302}BCA\x{300}\x{301} 
+    A\x{300}\x{301}\x{302}BCA\x{300}\x{301}C

-/[\p{^L}]/BM
+/^(\X*?)(.)/8
+    A\x{300}\x{301}\x{302}BCA\x{300}\x{301} 
+    A\x{300}\x{301}\x{302}BCA\x{300}\x{301}C

-/[\P{L}]/BM
+/^\X(.)/8
+    *** Failers
+    A\x{300}\x{301}\x{302}

-/[\P{^L}]/BM
+/^\X{2,3}(.)/8
+    A\x{300}\x{301}B\x{300}X
+    A\x{300}\x{301}B\x{300}C\x{300}\x{301}
+    A\x{300}\x{301}B\x{300}C\x{300}\x{301}X
+    A\x{300}\x{301}B\x{300}C\x{300}\x{301}DA\x{300}X
+    
+/^\X{2,3}?(.)/8
+    A\x{300}\x{301}B\x{300}X
+    A\x{300}\x{301}B\x{300}C\x{300}\x{301}
+    A\x{300}\x{301}B\x{300}C\x{300}\x{301}X
+    A\x{300}\x{301}B\x{300}C\x{300}\x{301}DA\x{300}X

-/[abc\p{L}\x{0660}]/8BM
+/^\pN{2,3}X/
+    12X
+    123X
+    *** Failers
+    X
+    1X
+    1234X

-/[\p{Nd}]/8BM
+/\x{100}/i8
+    \x{100}   
+    \x{101} 
+    
+/^\p{Han}+/8
+    \x{2e81}\x{3007}\x{2f804}\x{31a0}
+    ** Failers
+    \x{2e7f}

-/[\p{Nd}+-]+/8BM
+/^\P{Katakana}+/8
+    \x{3105}
+    ** Failers
+    \x{30ff}

-/A\x{391}\x{10427}\x{ff3a}\x{1fb0}/8iBM
+/^[\p{Arabic}]/8
+    \x{06e9}
+    \x{060b}
+    ** Failers
+    X\x{06e9}

-/A\x{391}\x{10427}\x{ff3a}\x{1fb0}/8BM
+/^[\P{Yi}]/8
+    \x{2f800}
+    ** Failers
+    \x{a014}
+    \x{a4c6}

-/[\x{105}-\x{109}]/8iBM
+/^\p{Any}X/8
+    AXYZ
+    \x{1234}XYZ 
+    ** Failers
+    X  
+    
+/^\P{Any}X/8
+    ** Failers
+    AX
+    
+/^\p{Any}?X/8
+    XYZ
+    AXYZ
+    \x{1234}XYZ 
+    ** Failers
+    ABXYZ

-/( ( (?(1)0|) )*   )/xBM
+/^\P{Any}?X/8
+    XYZ
+    ** Failers
+    AXYZ
+    \x{1234}XYZ 
+    ABXYZ

-/(  (?(1)0|)*   )/xBM
+/^\p{Any}+X/8
+    AXYZ
+    \x{1234}XYZ
+    A\x{1234}XYZ
+    ** Failers
+    XYZ

-/[a]/BM
+/^\P{Any}+X/8
+    ** Failers
+    AXYZ
+    \x{1234}XYZ
+    A\x{1234}XYZ
+    XYZ

-/[a]/8BM
+/^\p{Any}*X/8
+    XYZ
+    AXYZ
+    \x{1234}XYZ
+    A\x{1234}XYZ
+    ** Failers

-/[\xaa]/BM
+/^\P{Any}*X/8
+    XYZ
+    ** Failers
+    AXYZ
+    \x{1234}XYZ
+    A\x{1234}XYZ

-/[\xaa]/8BM
+/^[\p{Any}]X/8
+    AXYZ
+    \x{1234}XYZ 
+    ** Failers
+    X  
+    
+/^[\P{Any}]X/8
+    ** Failers
+    AX
+    
+/^[\p{Any}]?X/8
+    XYZ
+    AXYZ
+    \x{1234}XYZ 
+    ** Failers
+    ABXYZ

-/[^a]/BM
+/^[\P{Any}]?X/8
+    XYZ
+    ** Failers
+    AXYZ
+    \x{1234}XYZ 
+    ABXYZ

-/[^a]/8BM
+/^[\p{Any}]+X/8
+    AXYZ
+    \x{1234}XYZ
+    A\x{1234}XYZ
+    ** Failers
+    XYZ

-/[^\xaa]/BM
+/^[\P{Any}]+X/8
+    ** Failers
+    AXYZ
+    \x{1234}XYZ
+    A\x{1234}XYZ
+    XYZ

-/[^\xaa]/8BM
+/^[\p{Any}]*X/8
+    XYZ
+    AXYZ
+    \x{1234}XYZ
+    A\x{1234}XYZ
+    ** Failers

-/[^\d]/8WB
+/^[\P{Any}]*X/8
+    XYZ
+    ** Failers
+    AXYZ
+    \x{1234}XYZ
+    A\x{1234}XYZ

-/[[:^alpha:][:^cntrl:]]+/8WB
+/^\p{Any}{3,5}?/8
+    abcdefgh
+    \x{1234}\n\r\x{3456}xyz

-/[[:^cntrl:][:^alpha:]]+/8WB
+/^\p{Any}{3,5}/8
+    abcdefgh
+    \x{1234}\n\r\x{3456}xyz

-/[[:alpha:]]+/8WB
+/^\P{Any}{3,5}?/8
+    ** Failers
+    abcdefgh
+    \x{1234}\n\r\x{3456}xyz

-/[[:^alpha:]\S]+/8WB
+/^\p{L&}X/8
+     AXY
+     aXY
+     \x{1c5}XY
+     ** Failers
+     \x{1bb}XY
+     \x{2b0}XY
+     !XY

-/abc(d|e)(*THEN)x(123(*THEN)4|567(b|q)(*THEN)xx)/B
+/^[\p{L&}]X/8
+     AXY
+     aXY
+     \x{1c5}XY
+     ** Failers
+     \x{1bb}XY
+     \x{2b0}XY
+     !XY

-/-- End of testinput10 --/
+/^\p{L&}+X/8
+     AXY
+     aXY
+     AbcdeXyz 
+     \x{1c5}AbXY
+     abcDEXypqreXlmn 
+     ** Failers
+     \x{1bb}XY
+     \x{2b0}XY
+     !XY      
+
+/^[\p{L&}]+X/8
+     AXY
+     aXY
+     AbcdeXyz 
+     \x{1c5}AbXY
+     abcDEXypqreXlmn 
+     ** Failers
+     \x{1bb}XY
+     \x{2b0}XY
+     !XY      
+
+/^\p{L&}+?X/8
+     AXY
+     aXY
+     AbcdeXyz 
+     \x{1c5}AbXY
+     abcDEXypqreXlmn 
+     ** Failers
+     \x{1bb}XY
+     \x{2b0}XY
+     !XY      
+
+/^[\p{L&}]+?X/8
+     AXY
+     aXY
+     AbcdeXyz 
+     \x{1c5}AbXY
+     abcDEXypqreXlmn 
+     ** Failers
+     \x{1bb}XY
+     \x{2b0}XY
+     !XY      
+
+/^\P{L&}X/8
+     !XY
+     \x{1bb}XY
+     \x{2b0}XY
+     ** Failers
+     \x{1c5}XY
+     AXY      
+
+/^[\P{L&}]X/8
+     !XY
+     \x{1bb}XY
+     \x{2b0}XY
+     ** Failers
+     \x{1c5}XY
+     AXY      
+
+/^\x{023a}+?(\x{0130}+)/8i
+  \x{023a}\x{2c65}\x{0130}
+  
+/^\x{023a}+([^X])/8i
+  \x{023a}\x{2c65}X
+ 
+/\x{c0}+\x{116}+/8i
+    \x{c0}\x{e0}\x{116}\x{117}
+
+/[\x{c0}\x{116}]+/8i
+    \x{c0}\x{e0}\x{116}\x{117}
+
+/Check property support in non-UTF-8 mode/
+ 
+/\p{L}{4}/
+    123abcdefg
+    123abc\xc4\xc5zz
+
+/\p{Carian}\p{Cham}\p{Kayah_Li}\p{Lepcha}\p{Lycian}\p{Lydian}\p{Ol_Chiki}\p{Rejang}\p{Saurashtra}\p{Sundanese}\p{Vai}/8
+    \x{102A4}\x{AA52}\x{A91D}\x{1C46}\x{10283}\x{1092E}\x{1C6B}\x{A93B}\x{A8BF}\x{1BA0}\x{A50A}====
+
+/\x{a77d}\x{1d79}/8i
+    \x{a77d}\x{1d79}
+    \x{1d79}\x{a77d} 
+
+/\x{a77d}\x{1d79}/8
+    \x{a77d}\x{1d79}
+    ** Failers 
+    \x{1d79}\x{a77d} 
+
+/^\p{Xan}/8
+    ABCD
+    1234
+    \x{6ca}
+    \x{a6c}
+    \x{10a7}   
+    ** Failers
+    _ABC   
+
+/^\p{Xan}+/8
+    ABCD1234\x{6ca}\x{a6c}\x{10a7}_
+    ** Failers
+    _ABC   
+
+/^\p{Xan}*/8
+    ABCD1234\x{6ca}\x{a6c}\x{10a7}_
+    
+/^\p{Xan}{2,9}/8
+    ABCD1234\x{6ca}\x{a6c}\x{10a7}_
+    
+/^[\p{Xan}]/8
+    ABCD1234_
+    1234abcd_
+    \x{6ca}
+    \x{a6c}
+    \x{10a7}   
+    ** Failers
+    _ABC   
+ 
+/^[\p{Xan}]+/8
+    ABCD1234\x{6ca}\x{a6c}\x{10a7}_
+    ** Failers
+    _ABC   
+
+/^>\p{Xsp}/8
+    >\x{1680}\x{2028}\x{0b}
+    ** Failers
+    \x{0b} 
+
+/^>\p{Xsp}+/8
+    > \x{09}\x{0a}\x{0c}\x{0d}\x{a0}\x{1680}\x{2028}\x{0b}
+
+/^>\p{Xsp}*/8
+    > \x{09}\x{0a}\x{0c}\x{0d}\x{a0}\x{1680}\x{2028}\x{0b}
+    
+/^>\p{Xsp}{2,9}/8
+    > \x{09}\x{0a}\x{0c}\x{0d}\x{a0}\x{1680}\x{2028}\x{0b}
+    
+/^>[\p{Xsp}]/8
+    >\x{2028}\x{0b}
+ 
+/^>[\p{Xsp}]+/8
+    > \x{09}\x{0a}\x{0c}\x{0d}\x{a0}\x{1680}\x{2028}\x{0b}
+
+/^>\p{Xps}/8
+    >\x{1680}\x{2028}\x{0b}
+    >\x{a0} 
+    ** Failers
+    \x{0b} 
+
+/^>\p{Xps}+/8
+    > \x{09}\x{0a}\x{0c}\x{0d}\x{a0}\x{1680}\x{2028}\x{0b}
+
+/^>\p{Xps}+?/8
+    >\x{1680}\x{2028}\x{0b}
+
+/^>\p{Xps}*/8
+    > \x{09}\x{0a}\x{0c}\x{0d}\x{a0}\x{1680}\x{2028}\x{0b}
+    
+/^>\p{Xps}{2,9}/8
+    > \x{09}\x{0a}\x{0c}\x{0d}\x{a0}\x{1680}\x{2028}\x{0b}
+    
+/^>\p{Xps}{2,9}?/8
+    > \x{09}\x{0a}\x{0c}\x{0d}\x{a0}\x{1680}\x{2028}\x{0b}
+    
+/^>[\p{Xps}]/8
+    >\x{2028}\x{0b}
+ 
+/^>[\p{Xps}]+/8
+    > \x{09}\x{0a}\x{0c}\x{0d}\x{a0}\x{1680}\x{2028}\x{0b}
+
+/^\p{Xwd}/8
+    ABCD
+    1234
+    \x{6ca}
+    \x{a6c}
+    \x{10a7}
+    _ABC    
+    ** Failers
+    [] 
+
+/^\p{Xwd}+/8
+    ABCD1234\x{6ca}\x{a6c}\x{10a7}_
+
+/^\p{Xwd}*/8
+    ABCD1234\x{6ca}\x{a6c}\x{10a7}_
+    
+/^\p{Xwd}{2,9}/8
+    A_12\x{6ca}\x{a6c}\x{10a7}
+    
+/^[\p{Xwd}]/8
+    ABCD1234_
+    1234abcd_
+    \x{6ca}
+    \x{a6c}
+    \x{10a7}   
+    _ABC 
+    ** Failers
+    []   
+ 
+/^[\p{Xwd}]+/8
+    ABCD1234\x{6ca}\x{a6c}\x{10a7}_
+
+/-- Unicode properties for \b abd \B --/
+
+/\b...\B/8W
+    abc_
+    \x{37e}abc\x{376} 
+    \x{37e}\x{376}\x{371}\x{393}\x{394} 
+    !\x{c0}++\x{c1}\x{c2} 
+    !\x{c0}+++++ 
+
+/-- Without PCRE_UCP, non-ASCII always fail, even if < 256  --/
+
+/\b...\B/8
+    abc_
+    ** Failers 
+    \x{37e}abc\x{376} 
+    \x{37e}\x{376}\x{371}\x{393}\x{394} 
+    !\x{c0}++\x{c1}\x{c2} 
+    !\x{c0}+++++ 
+
+/-- With PCRE_UCP, non-UTF8 chars that are < 256 still check properties  --/
+
+/\b...\B/W
+    abc_
+    !\x{c0}++\x{c1}\x{c2} 
+    !\x{c0}+++++ 
+
+/-- End of testinput10 --/

Modified: code/trunk/testdata/testinput11
===================================================================
--- code/trunk/testdata/testinput11    2011-12-28 16:10:09 UTC (rev 835)
+++ code/trunk/testdata/testinput11    2011-12-28 17:16:11 UTC (rev 836)
@@ -1,803 +1,135 @@
-/-- These tests are for the Perl >= 5.10 features that PCRE supports. --/
+/-- These are a few representative patterns whose lengths and offsets are to be 
+shown when the link size is 2. This is just a doublecheck test to ensure the 
+sizes don't go horribly wrong when something is changed. The pattern contents 
+are all themselves checked in other tests. Unicode, including property support, 
+is required for these tests. --/

-/\H\h\V\v/
-    X X\x0a
-    X\x09X\x0b
-    ** Failers
-    \xa0 X\x0a   
-    
-/\H*\h+\V?\v{3,4}/ 
-    \x09\x20\xa0X\x0a\x0b\x0c\x0d\x0a
-    \x09\x20\xa0\x0a\x0b\x0c\x0d\x0a
-    \x09\x20\xa0\x0a\x0b\x0c
-    ** Failers 
-    \x09\x20\xa0\x0a\x0b
-     
-/\H{3,4}/
-    XY  ABCDE
-    XY  PQR ST 
-    
-/.\h{3,4}./
-    XY  AB    PQRS
+/((?i)b)/BM

-/\h*X\h?\H+Y\H?Z/
-    >XNNNYZ
-    >  X NYQZ
-    ** Failers
-    >XYZ   
-    >  X NY Z
+/(?s)(.*X|^B)/BM

-/\v*X\v?Y\v+Z\V*\x0a\V+\x0b\V{2,3}\x0c/
-    >XY\x0aZ\x0aA\x0bNN\x0c
-    >\x0a\x0dX\x0aY\x0a\x0bZZZ\x0aAAA\x0bNNN\x0c
+/(?s:.*X|^B)/BM

-/(foo)\Kbar/
-    foobar
-   
-/(foo)(\Kbar|baz)/
-    foobar
-    foobaz 
+/^[[:alnum:]]/BM

-/(foo\Kbar)baz/
-    foobarbaz
+/#/IxMD

-/abc\K|def\K/g+
-    Xabcdefghi
+/a#/IxMD

-/ab\Kc|de\Kf/g+
-    Xabcdefghi
-    
-/(?=C)/g+
-    ABCDECBA
-    
-/^abc\K/+
-    abcdef
-    ** Failers
-    defabcxyz   
+/x?+/BM

-/^(a(b))\1\g1\g{1}\g-1\g{-1}\g{-02}Z/
-    ababababbbabZXXXX
+/x++/BM

-/(?<A>tom|bon)-\g{A}/
-    tom-tom
-    bon-bon 
-    
-/(^(a|b\g{-1}))/
-    bacxxx
+/x{1,3}+/BM

-/(?|(abc)|(xyz))\1/
-    abcabc
-    xyzxyz 
-    ** Failers
-    abcxyz
-    xyzabc   
-    
-/(?|(abc)|(xyz))(?1)/
-    abcabc
-    xyzabc 
-    ** Failers 
-    xyzxyz 
- 
-/^X(?5)(a)(?|(b)|(q))(c)(d)(Y)/
-    XYabcdY
+/(x)*+/BM

-/^X(?7)(a)(?|(b|(r)(s))|(q))(c)(d)(Y)/
-    XYabcdY
+/^((a+)(?U)([ab]+)(?-U)([bc]+)(\w*))/BM

-/^X(?7)(a)(?|(b|(?|(r)|(t))(s))|(q))(c)(d)(Y)/
-    XYabcdY
+|8J\$WE\<\.rX\+ix\[d1b\!H\#\?vV0vrK\:ZH1\=2M\>iV\;\?aPhFB\<\*vW\@QW\@sO9\}cfZA\-i\'w\%hKd6gt1UJP\,15_\#QY\$M\^Mss_U\/\]\&LK9\[5vQub\^w\[KDD\<EjmhUZ\?\.akp2dF\>qmj\;2\}YWFdYx\.Ap\]hjCPTP\(n28k\+3\;o\&WXqs\/gOXdr\$\:r\'do0\;b4c\(f_Gr\=\"\\4\)\[01T7ajQJvL\$W\~mL_sS\/4h\:x\*\[ZN\=KLs\&L5zX\/\/\>it\,o\:aU\(\;Z\>pW\&T7oP\'2K\^E\:x9\'c\[\%z\-\,64JQ5AeH_G\#KijUKghQw\^\\vea3a\?kka_G\$8\#\`\*kynsxzBLru\'\]k_\[7FrVx\}\^\=\$blx\>s\-N\%j\;D\*aZDnsw\:YKZ\%Q\.Kne9\#hP\?\+b3\(SOvL\,\^\;\&u5\@\?5C5Bhb\=m\-vEh_L15Jl\]U\)0RP6\{q\%L\^_z5E\'Dw6X\b|BM

-/(?'abc'\w+):\k<abc>{2}/
-    a:aaxyz
-    ab:ababxyz
-    ** Failers
-    a:axyz
-    ab:abxyz
+|\$\<\.X\+ix\[d1b\!H\#\?vV0vrK\:ZH1\=2M\>iV\;\?aPhFB\<\*vW\@QW\@sO9\}cfZA\-i\'w\%hKd6gt1UJP\,15_\#QY\$M\^Mss_U\/\]\&LK9\[5vQub\^w\[KDD\<EjmhUZ\?\.akp2dF\>qmj\;2\}YWFdYx\.Ap\]hjCPTP\(n28k\+3\;o\&WXqs\/gOXdr\$\:r\'do0\;b4c\(f_Gr\=\"\\4\)\[01T7ajQJvL\$W\~mL_sS\/4h\:x\*\[ZN\=KLs\&L5zX\/\/\>it\,o\:aU\(\;Z\>pW\&T7oP\'2K\^E\:x9\'c\[\%z\-\,64JQ5AeH_G\#KijUKghQw\^\\vea3a\?kka_G\$8\#\`\*kynsxzBLru\'\]k_\[7FrVx\}\^\=\$blx\>s\-N\%j\;D\*aZDnsw\:YKZ\%Q\.Kne9\#hP\?\+b3\(SOvL\,\^\;\&u5\@\?5C5Bhb\=m\-vEh_L15Jl\]U\)0RP6\{q\%L\^_z5E\'Dw6X\b|BM

-/(?'abc'\w+):\g{abc}{2}/
-    a:aaxyz
-    ab:ababxyz
-    ** Failers
-    a:axyz
-    ab:abxyz
+/(a(?1)b)/BM

-/^(?<ab>a)? (?(<ab>)b|c) (?('ab')d|e)/x
-    abd
-    ce
+/(a(?1)+b)/BM

-/^(a.)\g-1Z/
-    aXaXZ
+/a(?P<name1>b|c)d(?P<longername2>e)/BM

-/^(a.)\g{-1}Z/
-    aXaXZ
+/(?:a(?P<c>c(?P<d>d)))(?P<a>a)/BM

-/^(?(DEFINE) (?<A> a) (?<B> b) )  (?&A) (?&B) /x
-    abcd
+/(?P<a>a)...(?P=a)bbb(?P>a)d/BM

-/(?<NAME>(?&NAME_PAT))\s+(?<ADDR>(?&ADDRESS_PAT))
-  (?(DEFINE)
-  (?<NAME_PAT>[a-z]+)
-  (?<ADDRESS_PAT>\d+)
-  )/x
-    metcalfe 33
+/abc(?C255)de(?C)f/BM

-/(?(DEFINE)(?<byte>2[0-4]\d|25[0-5]|1\d\d|[1-9]?\d))\b(?&byte)(\.(?&byte)){3}/
-    1.2.3.4
-    131.111.10.206
-    10.0.0.0
-    ** Failers
-    10.6
-    455.3.4.5
+/abcde/CBM

-/\b(?&byte)(\.(?&byte)){3}(?(DEFINE)(?<byte>2[0-4]\d|25[0-5]|1\d\d|[1-9]?\d))/
-    1.2.3.4
-    131.111.10.206
-    10.0.0.0
-    ** Failers
-    10.6
-    455.3.4.5
+/\x{100}/8BM

-/^(\w++|\s++)*$/
-    now is the time for all good men to come to the aid of the party
-    *** Failers
-    this is not a line with only words and spaces!
+/\x{1000}/8BM

-/(\d++)(\w)/
-    12345a
-    *** Failers
-    12345+
+/\x{10000}/8BM

-/a++b/
-    aaab
+/\x{100000}/8BM

-/(a++b)/
-    aaab
+/\x{10ffff}/8BM

-/(a++)b/
-    aaab
+/\x{110000}/8BM

-/([^()]++|\([^()]*\))+/
-    ((abc(ade)ufh()()x
+/[\x{ff}]/8BM

-/\(([^()]++|\([^()]+\))+\)/
-    (abc)
-    (abc(def)xyz)
-    *** Failers
-    ((()aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
+/[\x{100}]/8BM

-/^([^()]|\((?1)*\))*$/
-    abc
-    a(b)c
-    a(b(c))d
-    *** Failers)
-    a(b(c)d
+/\x80/8BM

-/^>abc>([^()]|$(?1)*$)*<xyz<$/
- >abc>123<xyz<
- >abc>1(2)3<xyz<
- >abc>(1(2)3)<xyz<
+/\xff/8BM

-/^(?:((.)(?1)\2|)|((.)(?3)\4|.))$/i
-    1221
-    Satanoscillatemymetallicsonatas
-    AmanaplanacanalPanama
-    AblewasIereIsawElba
-    *** Failers
-    Thequickbrownfox
-
-/^(\d+|\((?1)([+*-])(?1)\)|-(?1))$/
-    12
-    (((2+2)*-3)-7)
-    -12
-    *** Failers
-    ((2+2)*-3)-7)
-
-/^(x(y|(?1){2})z)/
-    xyz
-    xxyzxyzz
-    *** Failers
-    xxyzz
-    xxyzxyzxyzz
-
-/((< (?: (?(R) \d++  | [^<>]*+) | (?2)) * >))/x
-    <>
-    <abcd>
-    <abc <123> hij>
-    <abc <def> hij>
-    <abc<>def>
-    <abc<>
-    *** Failers
-    <abc
-
-/^a+(*FAIL)/
-    aaaaaa
+/\x{0041}\x{2262}\x{0391}\x{002e}/D8M

-/a+b?c+(*FAIL)/
-    aaabccc
+/\x{D55c}\x{ad6d}\x{C5B4}/D8M

-/a+b?(*PRUNE)c+(*FAIL)/
-    aaabccc
+/\x{65e5}\x{672c}\x{8a9e}/D8M

-/a+b?(*COMMIT)c+(*FAIL)/
-    aaabccc
-    
-/a+b?(*SKIP)c+(*FAIL)/
-    aaabcccaaabccc
+/[\x{100}]/8BM

-/^(?:aaa(*THEN)\w{6}|bbb(*THEN)\w{5}|ccc(*THEN)\w{4}|\w{3})/
-    aaaxxxxxx
-    aaa++++++ 
-    bbbxxxxx
-    bbb+++++ 
-    cccxxxx
-    ccc++++ 
-    dddddddd   
+/[Z\x{100}]/8BM

-/^(aaa(*THEN)\w{6}|bbb(*THEN)\w{5}|ccc(*THEN)\w{4}|\w{3})/
-    aaaxxxxxx
-    aaa++++++ 
-    bbbxxxxx
-    bbb+++++ 
-    cccxxxx
-    ccc++++ 
-    dddddddd   
+/^[\x{100}\E-\Q\E\x{150}]/B8M

-/a+b?(*THEN)c+(*FAIL)/
-    aaabccc
+/^[\QĀ\E-\QŐ\E]/B8M

-/(A (A|B(*ACCEPT)|C) D)(E)/x
-    AB
-    ABX
-    AADE
-    ACDE
-    ** Failers
-    AD 
-        
-/^\W*+(?:((.)\W*+(?1)\W*+\2|)|((.)\W*+(?3)\W*+\4|\W*+.\W*+))\W*+$/i
-    1221
-    Satan, oscillate my metallic sonatas!
-    A man, a plan, a canal: Panama!
-    Able was I ere I saw Elba.
-    *** Failers
-    The quick brown fox
+/^[\QĀ\E-\QŐ\E/B8M

-/^((.)(?1)\2|.)$/
-    a
-    aba
-    aabaa  
-    abcdcba 
-    pqaabaaqp  
-    ablewasiereisawelba
-    rhubarb
-    the quick brown fox  
+/[\p{L}]/BM

-/(a)(?<=b(?1))/
-    baz
-    ** Failers
-    caz  
-    
-/(?<=b(?1))(a)/
-    zbaaz
-    ** Failers
-    aaa  
-    
-/(?<X>a)(?<=b(?&X))/
-    baz
+/[\p{^L}]/BM

-/^(?|(abc)|(def))\1/
-    abcabc
-    defdef 
-    ** Failers
-    abcdef
-    defabc   
-    
-/^(?|(abc)|(def))(?1)/
-    abcabc
-    defabc
-    ** Failers
-    defdef
-    abcdef    
+/[\P{L}]/BM

-/(?:a(?<quote> (?<apostrophe>')|(?<realquote>")) |b(?<quote> (?<apostrophe>')|(?<realquote>")) ) (?('quote')[a-z]+|[0-9]+)/xJ
-    a\"aaaaa
-    b\"aaaaa 
-    ** Failers 
-    b\"11111
+/[\P{^L}]/BM

-/(?:(?1)|B)(A(*F)|C)/
-    ABCD
-    CCD
-    ** Failers
-    CAD   
+/[abc\p{L}\x{0660}]/8BM

-/^(?:(?1)|B)(A(*F)|C)/
-    CCD
-    BCD 
-    ** Failers
-    ABCD
-    CAD
-    BAD    
+/[\p{Nd}]/8BM

-/(?:(?1)|B)(A(*ACCEPT)XX|C)D/
-    AAD
-    ACD
-    BAD
-    BCD
-    BAX  
-    ** Failers
-    ACX
-    ABC   
+/[\p{Nd}+-]+/8BM

-/(?(DEFINE)(A))B(?1)C/
-    BAC
+/A\x{391}\x{10427}\x{ff3a}\x{1fb0}/8iBM

-/(?(DEFINE)((A)\2))B(?1)C/
-    BAAC
+/A\x{391}\x{10427}\x{ff3a}\x{1fb0}/8BM

-/(?<pn> \( ( [^()]++ | (?&pn) )* \) )/x
-    (ab(cd)ef)
+/[\x{105}-\x{109}]/8iBM

-/^(?!a(*SKIP)b)/
-    ac
-    
-/^(?=a(*SKIP)b|ac)/
-    ** Failers
-    ac
-    
-/^(?=a(*THEN)b|ac)/
-    ac
-    
-/^(?=a(*PRUNE)b)/
-    ab  
-    ** Failers 
-    ac
+/( ( (?(1)0|) )*   )/xBM

-/^(?=a(*ACCEPT)b)/
-    ac
+/(  (?(1)0|)*   )/xBM

-/^(?(?!a(*SKIP)b))/
-    ac
+/[a]/BM

-/(?>a\Kb)/
-    ab
+/[a]/8BM

-/((?>a\Kb))/
-    ab
+/[\xaa]/BM

-/(a\Kb)/
-    ab
-    
-/^a\Kcz|ac/
-    ac
-    
-/(?>a\Kbz|ab)/
-    ab 
+/[\xaa]/8BM

-/^(?&t)(?(DEFINE)(?<t>a\Kb))$/
-    ab
+/[^a]/BM

-/^([^()]|\((?1)*\))*$/
-    a(b)c
-    a(b(c)d)e 
+/[^a]/8BM

-/(?P<L1>(?P<L2>0)(?P>L1)|(?P>L2))/
-    0
-    00
-    0000  
+/[^\xaa]/BM

-/(?P<L1>(?P<L2>0)|(?P>L2)(?P>L1))/
-    0
-    00
-    0000  
+/[^\xaa]/8BM

-/--- This one does fail, as expected, in Perl. It needs the complex item at the
-     end of the pattern. A single letter instead of (B|D) makes it not fail,
-     which I think is a Perl bug. --- /
+/[^\d]/8WB

-/A(*COMMIT)(B|D)/
-    ACABX
+/[[:^alpha:][:^cntrl:]]+/8WB

-/--- Check the use of names for failure ---/
+/[[:^cntrl:][:^alpha:]]+/8WB

-/^(A(*PRUNE:A)B|C(*PRUNE:B)D)/K
-    ** Failers
-    AC
-    CB    
-    
-/--- Force no study, otherwise mark is not seen. The studied version is in
-     test 2 because it isn't Perl-compatible. ---/
+/[[:alpha:]]+/8WB

-/(*MARK:A)(*SKIP:B)(C|X)/KSS
-    C
-    D
-     
-/^(A(*THEN:A)B|C(*THEN:B)D)/K
-    ** Failers
-    CB    
+/[[:^alpha:]\S]+/8WB

-/^(?:A(*THEN:A)B|C(*THEN:B)D)/K
-    CB    
-    
-/^(?>A(*THEN:A)B|C(*THEN:B)D)/K
-    CB    
-    
-/--- This should succeed, as the skip causes bump to offset 1 (the mark). Note
-that we have to have something complicated such as (B|Z) at the end because,
-for Perl, a simple character somehow causes an unwanted optimization to mess
-with the handling of backtracking verbs. ---/
+/abc(d|e)(*THEN)x(123(*THEN)4|567(b|q)(*THEN)xx)/B

-/A(*MARK:A)A+(*SKIP:A)(B|Z) | AC/xK
-    AAAC
-    
-/--- Test skipping over a non-matching mark. ---/
-
-/A(*MARK:A)A+(*MARK:B)(*SKIP:A)(B|Z) | AC/xK
-    AAAC
-    
-/--- Check shorthand for MARK ---/
-
-/A(*:A)A+(*SKIP:A)(B|Z) | AC/xK
-    AAAC
-
-/--- Don't loop! Force no study, otherwise mark is not seen. ---/
-
-/(*:A)A+(*SKIP:A)(B|Z)/KSS
-    AAAC
-
-/--- This should succeed, as a non-existent skip name disables the skip ---/ 
-
-/A(*MARK:A)A+(*SKIP:B)(B|Z) | AC/xK
-    AAAC
-
-/A(*MARK:A)A+(*SKIP:B)(B|Z) | AC(*:B)/xK
-    AAAC
-
-/--- We use something more complicated than individual letters here, because
-that causes different behaviour in Perl. Perhaps it disables some optimization;
-anyway, the result now matches PCRE in that no tag is passed back for the 
-failures. ---/
-    
-/(A|P)(*:A)(B|P) | (X|P)(X|P)(*:B)(Y|P)/xK
-    AABC
-    XXYZ 
-    ** Failers
-    XAQQ  
-    XAQQXZZ  
-    AXQQQ 
-    AXXQQQ 
-    
-/--- COMMIT at the start of a pattern should act like an anchor. Again, 
-however, we need the complication for Perl. ---/
-
-/(*COMMIT)(A|P)(B|P)(C|P)/
-    ABCDEFG
-    ** Failers
-    DEFGABC  
-
-/--- COMMIT inside an atomic group can't stop backtracking over the group. ---/
-
-/(\w+)(?>b(*COMMIT))\w{2}/
-    abbb
-
-/(\w+)b(*COMMIT)\w{2}/
-    abbb
-
-/--- Check opening parens in comment when seeking forward reference. ---/ 
-
-/(?&t)(?#()(?(DEFINE)(?<t>a))/
-    bac
-
-/--- COMMIT should override THEN ---/
-
-/(?>(*COMMIT)(?>yes|no)(*THEN)(*F))?/
-  yes
-
-/(?>(*COMMIT)(yes|no)(*THEN)(*F))?/
-  yes
-
-/b?(*SKIP)c/
-    bc
-    abc
-   
-/(*SKIP)bc/
-    a
-
-/(*SKIP)b/
-    a 
-
-/(?P<abn>(?P=abn)xxx|)+/
-    xxx
-
-/(?i:([^b]))(?1)/
-    aa
-    aA     
-    ** Failers
-    ab
-    aB
-    Ba
-    ba
-
-/^(?&t)*+(?(DEFINE)(?<t>a))\w$/
-    aaaaaaX
-    ** Failers 
-    aaaaaa 
-
-/^(?&t)*(?(DEFINE)(?<t>a))\w$/
-    aaaaaaX
-    aaaaaa 
-
-/^(a)*+(\w)/
-    aaaaX
-    YZ 
-    ** Failers 
-    aaaa
-
-/^(?:a)*+(\w)/
-    aaaaX
-    YZ 
-    ** Failers 
-    aaaa
-
-/^(a)++(\w)/
-    aaaaX
-    ** Failers 
-    aaaa
-    YZ 
-
-/^(?:a)++(\w)/
-    aaaaX
-    ** Failers 
-    aaaa
-    YZ 
-
-/^(a)?+(\w)/
-    aaaaX
-    YZ 
-
-/^(?:a)?+(\w)/
-    aaaaX
-    YZ 
-
-/^(a){2,}+(\w)/
-    aaaaX
-    ** Failers
-    aaa
-    YZ 
-
-/^(?:a){2,}+(\w)/
-    aaaaX
-    ** Failers
-    aaa
-    YZ 
-
-/(a|)*(?1)b/
-    b
-    ab
-    aab  
-
-/(a)++(?1)b/
-    ** Failers
-    ab 
-    aab
-
-/(a)*+(?1)b/
-    ** Failers
-    ab
-    aab  
-
-/(?1)(?:(b)){0}/
-    b
-
-/(foo ( \( ((?:(?> [^()]+ )|(?2))*) \) ) )/x
-    foo(bar(baz)+baz(bop))
-
-/(A (A|B(*ACCEPT)|C) D)(E)/x
-    AB
-
-/\A.*?(?:a|b(*THEN)c)/
-    ba
-
-/\A.*?(?:a|bc)/
-    ba
-
-/\A.*?(a|b(*THEN)c)/
-    ba
-
-/\A.*?(a|bc)/
-    ba
-
-/\A.*?(?:a|b(*THEN)c)++/
-    ba
-
-/\A.*?(?:a|bc)++/
-    ba
-
-/\A.*?(a|b(*THEN)c)++/
-    ba
-
-/\A.*?(a|bc)++/
-    ba
-
-/\A.*?(?:a|b(*THEN)c|d)/
-    ba
-
-/\A.*?(?:a|bc|d)/
-    ba
-
-/(?:(b))++/
-    beetle
-
-/(?(?=(a(*ACCEPT)z))a)/
-    a
-
-/^(a)(?1)+ab/
-    aaaab
-    
-/^(a)(?1)++ab/
-    aaaab
-
-/^(?=a(*:M))aZ/K
-    aZbc
-
-/^(?!(*:M)b)aZ/K
-    aZbc
-
-/(?(DEFINE)(a))?b(?1)/
-    backgammon
-
-/^\N+/
-    abc\ndef
-    
-/^\N{1,}/
-    abc\ndef 
-
-/(?(R)a+|(?R)b)/
-    aaaabcde
-
-/(?(R)a+|((?R))b)/
-    aaaabcde
-
-/((?(R)a+|(?1)b))/
-    aaaabcde
-
-/((?(R1)a+|(?1)b))/
-    aaaabcde
-
-/a(*:any 
-name)/K
-    abc
-    
-/(?>(?&t)c|(?&t))(?(DEFINE)(?<t>a|b(*PRUNE)c))/
-    a
-    ba
-    bba 
-    
-/--- Checking revised (*THEN) handling ---/ 
-
-/--- Capture ---/
-
-/^.*? (a(*THEN)b) c/x
-    aabc
-
-/^.*? (a(*THEN)b|(*F)) c/x
-    aabc
-
-/^.*? ( (a(*THEN)b) | (*F) ) c/x
-    aabc
-
-/^.*? ( (a(*THEN)b) ) c/x
-    aabc
-
-/--- Non-capture ---/
-
-/^.*? (?:a(*THEN)b) c/x
-    aabc
-
-/^.*? (?:a(*THEN)b|(*F)) c/x
-    aabc
-
-/^.*? (?: (?:a(*THEN)b) | (*F) ) c/x
-    aabc
-
-/^.*? (?: (?:a(*THEN)b) ) c/x
-    aabc
-
-/--- Atomic ---/
-
-/^.*? (?>a(*THEN)b) c/x
-    aabc
-
-/^.*? (?>a(*THEN)b|(*F)) c/x
-    aabc
-
-/^.*? (?> (?>a(*THEN)b) | (*F) ) c/x
-    aabc
-
-/^.*? (?> (?>a(*THEN)b) ) c/x
-    aabc
-
-/--- Possessive capture ---/
-
-/^.*? (a(*THEN)b)++ c/x
-    aabc
-
-/^.*? (a(*THEN)b|(*F))++ c/x
-    aabc
-
-/^.*? ( (a(*THEN)b)++ | (*F) )++ c/x
-    aabc
-
-/^.*? ( (a(*THEN)b)++ )++ c/x
-    aabc
-
-/--- Possessive non-capture ---/
-
-/^.*? (?:a(*THEN)b)++ c/x
-    aabc
-
-/^.*? (?:a(*THEN)b|(*F))++ c/x
-    aabc
-
-/^.*? (?: (?:a(*THEN)b)++ | (*F) )++ c/x
-    aabc
-
-/^.*? (?: (?:a(*THEN)b)++ )++ c/x
-    aabc
-    
-/--- Condition assertion ---/
-
-/^(?(?=a(*THEN)b)ab|ac)/
-    ac
- 
-/--- Condition ---/
-
-/^.*?(?(?=a)a|b(*THEN)c)/
-    ba
-
-/^.*?(?:(?(?=a)a|b(*THEN)c)|d)/
-    ba
-
-/^.*?(?(?=a)a(*THEN)b|c)/
-    ac
-
-/--- Assertion ---/
-
-/^.*(?=a(*THEN)b)/ 
-    aabc
-
-/------------------------------/
-
-/(?>a(*:m))/imsxSK 
-    a
-
-/(?>(a)(*:m))/imsxSK 
-    a
-
-/(?<=a(*ACCEPT)b)c/
-    xacd
-
-/(?<=(a(*ACCEPT)b))c/
-    xacd
-
-/(?<=(a(*COMMIT)b))c/
-    xabcd
-    ** Failers 
-    xacd
-    
-/(?<!a(*FAIL)b)c/
-    xcd
-    acd 
-
-/(?<=a(*:N)b)c/K
-    xabcd
-    
-/(?<=a(*PRUNE)b)c/
-    xabcd 
-
-/(?<=a(*SKIP)b)c/
-    xabcd 
-
-/(?<=a(*THEN)b)c/
-    xabcd 
-
 /-- End of testinput11 --/

Modified: code/trunk/testdata/testinput12
===================================================================
--- code/trunk/testdata/testinput12    2011-12-28 16:10:09 UTC (rev 835)
+++ code/trunk/testdata/testinput12    2011-12-28 17:16:11 UTC (rev 836)
@@ -1,7 +1,19 @@
-/a(*:a\x{1234}b)/8K
-    abc
+/-- This test is run only when JIT support is available. It checks for a
+successful and an unsuccessful JIT compile and save and restore behaviour,
+and a couple of things that are different with JIT. --/

-/a(*:a£b)/8K 
+/abc/S+I
+
+/ab(*COMMIT)/S+I
+
+/abc/S+I>testsavedregex
+
+<testsavedregex
     abc

+/a*/SI
+
+/(?(R)a*(?1)|((?R))b)/S+
+    aaaabcde
+
 /-- End of testinput12 --/

Modified: code/trunk/testdata/testinput13
===================================================================
--- code/trunk/testdata/testinput13    2011-12-28 16:10:09 UTC (rev 835)
+++ code/trunk/testdata/testinput13    2011-12-28 17:16:11 UTC (rev 836)
@@ -1,576 +1,9 @@
-/-- These tests for Unicode property support test PCRE's API and show some of
-    the compiled code. They are not Perl-compatible. --/
+/-- This test is run only when JIT support is not available. It checks that an 
+attempt to use it has the expected behaviour. It also tests things that
+are different without JIT. --/
+   
+/abc/S+I

-/[\p{L}]/DZ
+/a*/SI

-/[\p{^L}]/DZ
-
-/[\P{L}]/DZ
-
-/[\P{^L}]/DZ
-
-/[abc\p{L}\x{0660}]/8DZ
-
-/[\p{Nd}]/8DZ
-    1234
-
-/[\p{Nd}+-]+/8DZ
-    1234
-    12-34
-    12+\x{661}-34  
-    ** Failers
-    abcd  
-
-/A\x{391}\x{10427}\x{ff3a}\x{1fb0}/8iDZ
-
-/A\x{391}\x{10427}\x{ff3a}\x{1fb0}/8DZ
-
-/AB\x{1fb0}/8DZ
-
-/AB\x{1fb0}/8DZi
-
-/[\x{105}-\x{109}]/8iDZ
-    \x{104}
-    \x{105}
-    \x{109}  
-    ** Failers
-    \x{100}
-    \x{10a} 
-    
-/[z-\x{100}]/8iDZ 
-    Z
-    z
-    \x{39c}
-    \x{178}
-    |
-    \x{80}
-    \x{ff}
-    \x{100}
-    \x{101} 
-    ** Failers
-    \x{102}
-    Y
-    y           
-
-/[z-\x{100}]/8DZi
-
-/(?:[\PPa*]*){8,}/
-
-/[\P{Any}]/BZ
-
-/[\P{Any}\E]/BZ
-
-/(\P{Yi}+\277)/
-
-/(\P{Yi}+\277)?/
-
-/(?<=\P{Yi}{3}A)X/
-
-/\p{Yi}+(\P{Yi}+)(?1)/
-
-/(\P{Yi}{2}\277)?/
-
-/[\P{Yi}A]/
-
-/[\P{Yi}\P{Yi}\P{Yi}A]/
-
-/[^\P{Yi}A]/
-
-/[^\P{Yi}\P{Yi}\P{Yi}A]/
-
-/(\P{Yi}*\277)*/
-
-/(\P{Yi}*?\277)*/
-
-/(\p{Yi}*+\277)*/
-
-/(\P{Yi}?\277)*/
-
-/(\P{Yi}??\277)*/
-
-/(\p{Yi}?+\277)*/
-
-/(\P{Yi}{0,3}\277)*/
-
-/(\P{Yi}{0,3}?\277)*/
-
-/(\p{Yi}{0,3}+\277)*/
-
-/\p{Zl}{2,3}+/8BZ
-    \xe2\x80\xa8\xe2\x80\xa8
-    \x{2028}\x{2028}\x{2028}
-    
-/\p{Zl}/8BZ
-
-/\p{Lu}{3}+/8BZ
-
-/\pL{2}+/8BZ
-
-/\p{Cc}{2}+/8BZ
-
-/^\p{Cs}/8
-    \?\x{dfff}
-    ** Failers
-    \x{09f} 
-  
-/^\p{Sc}+/8
-    $\x{a2}\x{a3}\x{a4}\x{a5}\x{a6}
-    \x{9f2}
-    ** Failers
-    X
-    \x{2c2}
-  
-/^\p{Zs}/8
-    \ \
-    \x{a0}
-    \x{1680}
-    \x{180e}
-    \x{2000}
-    \x{2001}     
-    ** Failers
-    \x{2028}
-    \x{200d} 
-  
-/-- These four are here rather than in test 6 because Perl has problems with
-    the negative versions of the properties. --/
-      
-/\p{^Lu}/8i
-    1234
-    ** Failers
-    ABC 
-
-/\P{Lu}/8i
-    1234
-    ** Failers
-    ABC 
-
-/\p{Ll}/8i 
-    a
-    Az
-    ** Failers
-    ABC   
-
-/\p{Lu}/8i
-    A
-    a\x{10a0}B 
-    ** Failers 
-    a
-    \x{1d00}  
-
-/[\x{c0}\x{391}]/8i
-    \x{c0}
-    \x{e0} 
-
-/-- The next two are special cases where the lengths of the different cases of
-the same character differ. The first went wrong with heap frame storage; the
-second was broken in all cases. --/
-
-/^\x{023a}+?(\x{0130}+)/8i
-  \x{023a}\x{2c65}\x{0130}
-  
-/^\x{023a}+([^X])/8i
-  \x{023a}\x{2c65}X
-
-/\x{c0}+\x{116}+/8i
-    \x{c0}\x{e0}\x{116}\x{117}
-
-/[\x{c0}\x{116}]+/8i
-    \x{c0}\x{e0}\x{116}\x{117}
-
-/(\x{de})\1/8i
-    \x{de}\x{de}
-    \x{de}\x{fe}
-    \x{fe}\x{fe}
-    \x{fe}\x{de}
-
-/^\x{c0}$/8i
-    \x{c0}
-    \x{e0} 
-
-/^\x{e0}$/8i
-    \x{c0}
-    \x{e0} 
-
-/-- The next two should be Perl-compatible, but it fails to match \x{e0}. PCRE
-will match it only with UCP support, because without that it has no notion
-of case for anything other than the ASCII letters. --/ 
-
-/((?i)[\x{c0}])/8
-    \x{c0}
-    \x{e0} 
-
-/(?i:[\x{c0}])/8
-    \x{c0}
-    \x{e0} 
-
-/-- This should be Perl-compatible but Perl 5.11 gets \x{300} wrong. --/8
-    
-/^\X/8
-    A
-    A\x{300}BC 
-    A\x{300}\x{301}\x{302}BC 
-    *** Failers
-    \x{300}  
-    
-/-- These are PCRE's extra properties to help with Unicodizing \d etc. --/
-
-/^\p{Xan}/8
-    ABCD
-    1234
-    \x{6ca}
-    \x{a6c}
-    \x{10a7}   
-    ** Failers
-    _ABC   
-
-/^\p{Xan}+/8
-    ABCD1234\x{6ca}\x{a6c}\x{10a7}_
-    ** Failers
-    _ABC   
-
-/^\p{Xan}+?/8
-    \x{6ca}\x{a6c}\x{10a7}_
-
-/^\p{Xan}*/8
-    ABCD1234\x{6ca}\x{a6c}\x{10a7}_
-    
-/^\p{Xan}{2,9}/8
-    ABCD1234\x{6ca}\x{a6c}\x{10a7}_
-    
-/^\p{Xan}{2,9}?/8
-    \x{6ca}\x{a6c}\x{10a7}_
-    
-/^[\p{Xan}]/8
-    ABCD1234_
-    1234abcd_
-    \x{6ca}
-    \x{a6c}
-    \x{10a7}   
-    ** Failers
-    _ABC   
- 
-/^[\p{Xan}]+/8
-    ABCD1234\x{6ca}\x{a6c}\x{10a7}_
-    ** Failers
-    _ABC   
-
-/^>\p{Xsp}/8
-    >\x{1680}\x{2028}\x{0b}
-    >\x{a0} 
-    ** Failers
-    \x{0b} 
-
-/^>\p{Xsp}+/8
-    > \x{09}\x{0a}\x{0c}\x{0d}\x{a0}\x{1680}\x{2028}\x{0b}
-
-/^>\p{Xsp}+?/8
-    >\x{1680}\x{2028}\x{0b}
-
-/^>\p{Xsp}*/8
-    > \x{09}\x{0a}\x{0c}\x{0d}\x{a0}\x{1680}\x{2028}\x{0b}
-    
-/^>\p{Xsp}{2,9}/8
-    > \x{09}\x{0a}\x{0c}\x{0d}\x{a0}\x{1680}\x{2028}\x{0b}
-    
-/^>\p{Xsp}{2,9}?/8
-    > \x{09}\x{0a}\x{0c}\x{0d}\x{a0}\x{1680}\x{2028}\x{0b}
-    
-/^>[\p{Xsp}]/8
-    >\x{2028}\x{0b}
- 
-/^>[\p{Xsp}]+/8
-    > \x{09}\x{0a}\x{0c}\x{0d}\x{a0}\x{1680}\x{2028}\x{0b}
-
-/^>\p{Xps}/8
-    >\x{1680}\x{2028}\x{0b}
-    >\x{a0} 
-    ** Failers
-    \x{0b} 
-
-/^>\p{Xps}+/8
-    > \x{09}\x{0a}\x{0c}\x{0d}\x{a0}\x{1680}\x{2028}\x{0b}
-
-/^>\p{Xps}+?/8
-    >\x{1680}\x{2028}\x{0b}
-
-/^>\p{Xps}*/8
-    > \x{09}\x{0a}\x{0c}\x{0d}\x{a0}\x{1680}\x{2028}\x{0b}
-    
-/^>\p{Xps}{2,9}/8
-    > \x{09}\x{0a}\x{0c}\x{0d}\x{a0}\x{1680}\x{2028}\x{0b}
-    
-/^>\p{Xps}{2,9}?/8
-    > \x{09}\x{0a}\x{0c}\x{0d}\x{a0}\x{1680}\x{2028}\x{0b}
-    
-/^>[\p{Xps}]/8
-    >\x{2028}\x{0b}
- 
-/^>[\p{Xps}]+/8
-    > \x{09}\x{0a}\x{0c}\x{0d}\x{a0}\x{1680}\x{2028}\x{0b}
-
-/^\p{Xwd}/8
-    ABCD
-    1234
-    \x{6ca}
-    \x{a6c}
-    \x{10a7}
-    _ABC    
-    ** Failers
-    [] 
-
-/^\p{Xwd}+/8
-    ABCD1234\x{6ca}\x{a6c}\x{10a7}_
-
-/^\p{Xwd}+?/8
-    \x{6ca}\x{a6c}\x{10a7}_
-
-/^\p{Xwd}*/8
-    ABCD1234\x{6ca}\x{a6c}\x{10a7}_
-    
-/^\p{Xwd}{2,9}/8
-    A_B12\x{6ca}\x{a6c}\x{10a7}
-    
-/^\p{Xwd}{2,9}?/8
-    \x{6ca}\x{a6c}\x{10a7}_
-    
-/^[\p{Xwd}]/8
-    ABCD1234_
-    1234abcd_
-    \x{6ca}
-    \x{a6c}
-    \x{10a7}   
-    _ABC 
-    ** Failers
-    []   
- 
-/^[\p{Xwd}]+/8
-    ABCD1234\x{6ca}\x{a6c}\x{10a7}_
-
-/-- A check not in UTF-8 mode --/
-
-/^[\p{Xwd}]+/
-    ABCD1234_
-    
-/-- Some negative checks --/
-
-/^[\P{Xwd}]+/8
-    !.+\x{019}\x{35a}AB
-
-/^[\p{^Xwd}]+/8
-    !.+\x{019}\x{35a}AB
-
-/[\D]/WBZ8
-    1\x{3c8}2
-
-/[\d]/WBZ8
-    >\x{6f4}<
-
-/[\S]/WBZ8
-    \x{1680}\x{6f4}\x{1680}
-
-/[\s]/WBZ8
-    >\x{1680}<
-
-/[\W]/WBZ8
-    A\x{1712}B
-
-/[\w]/WBZ8
-    >\x{1723}<
-
-/\D/WBZ8
-    1\x{3c8}2
-
-/\d/WBZ8
-    >\x{6f4}<
-
-/\S/WBZ8
-    \x{1680}\x{6f4}\x{1680}
-
-/\s/WBZ8
-    >\x{1680}>
-
-/\W/WBZ8
-    A\x{1712}B
-
-/\w/WBZ8
-    >\x{1723}<
-
-/[[:alpha:]]/WBZ
-
-/[[:lower:]]/WBZ
-
-/[[:upper:]]/WBZ
-
-/[[:alnum:]]/WBZ
-
-/[[:ascii:]]/WBZ
-
-/[[:blank:]]/WBZ
-
-/[[:cntrl:]]/WBZ
-
-/[[:digit:]]/WBZ
-
-/[[:graph:]]/WBZ
-
-/[[:print:]]/WBZ
-
-/[[:punct:]]/WBZ
-
-/[[:space:]]/WBZ
-
-/[[:word:]]/WBZ
-
-/[[:xdigit:]]/WBZ
-
-/-- Unicode properties for \b abd \B --/
-
-/\b...\B/8W
-    abc_
-    \x{37e}abc\x{376} 
-    \x{37e}\x{376}\x{371}\x{393}\x{394} 
-    !\x{c0}++\x{c1}\x{c2} 
-    !\x{c0}+++++ 
-
-/-- Without PCRE_UCP, non-ASCII always fail, even if < 256  --/
-
-/\b...\B/8
-    abc_
-    ** Failers 
-    \x{37e}abc\x{376} 
-    \x{37e}\x{376}\x{371}\x{393}\x{394} 
-    !\x{c0}++\x{c1}\x{c2} 
-    !\x{c0}+++++ 
-
-/-- With PCRE_UCP, non-UTF8 chars that are < 256 still check properties  --/
-
-/\b...\B/W
-    abc_
-    !\x{c0}++\x{c1}\x{c2} 
-    !\x{c0}+++++ 
-
-/-- POSIX interface --/
-
-/\w/P
-    +++\x{c2}
-
-/\w/WP
-    +++\x{c2}
-    
-/-- Some of these are silly, but they check various combinations --/
-
-/[[:^alpha:][:^cntrl:]]+/8WBZ
-    123
-    abc 
-
-/[[:^cntrl:][:^alpha:]]+/8WBZ
-    123
-    abc 
-
-/[[:alpha:]]+/8WBZ
-    abc
-
-/[[:^alpha:]\S]+/8WBZ
-    123
-    abc 
-
-/[^\d]+/8WBZ
-    abc123
-    abc\x{123}
-    \x{660}abc   
-
-/\x{401}\x{420}\x{421}\x{422}\x{423}\x{424}\x{425}\x{426}\x{427}\x{428}\x{429}\x{42a}\x{42b}\x{42c}\x{42d}\x{42e}\x{42f}/8iSI
-    \x{401}\x{420}\x{421}\x{422}\x{423}\x{424}\x{425}\x{426}\x{427}\x{428}\x{429}\x{42a}\x{42b}\x{42c}\x{42d}\x{42e}\x{42f}
-    \x{451}\x{440}\x{441}\x{442}\x{443}\x{444}\x{445}\x{446}\x{447}\x{448}\x{449}\x{44a}\x{44b}\x{44c}\x{44d}\x{44e}\x{44f}
-
-/\p{Lu}+9\p{Lu}+B\p{Lu}+b/BZ
-
-/\p{^Lu}+9\p{^Lu}+B\p{^Lu}+b/BZ
-
-/\P{Lu}+9\P{Lu}+B\P{Lu}+b/BZ
-
-/\p{Han}+X\p{Greek}+\x{370}/BZ8
-
-/\p{Xan}+!\p{Xan}+A/BZ
-
-/\p{Xsp}+!\p{Xsp}\t/BZ
-
-/\p{Xps}+!\p{Xps}\t/BZ
-
-/\p{Xwd}+!\p{Xwd}_/BZ
-
-/A+\p{N}A+\dB+\p{N}*B+\d*/WBZ
-
-/-- These behaved oddly in Perl, so they are kept in this test --/
-
-/(\x{23a}\x{23a}\x{23a})?\1/8i
-    \x{23a}\x{23a}\x{23a}\x{2c65}\x{2c65}
-
-/(ȺȺȺ)?\1/8i
-    ȺȺȺⱥⱥ
-
-/(\x{23a}\x{23a}\x{23a})?\1/8i
-    \x{23a}\x{23a}\x{23a}\x{2c65}\x{2c65}\x{2c65}
-
-/(ȺȺȺ)?\1/8i
-    ȺȺȺⱥⱥⱥ
-
-/(\x{23a}\x{23a}\x{23a})\1/8i
-    \x{23a}\x{23a}\x{23a}\x{2c65}\x{2c65}
-
-/(ȺȺȺ)\1/8i
-    ȺȺȺⱥⱥ
-
-/(\x{23a}\x{23a}\x{23a})\1/8i
-    \x{23a}\x{23a}\x{23a}\x{2c65}\x{2c65}\x{2c65}
-
-/(ȺȺȺ)\1/8i
-    ȺȺȺⱥⱥⱥ
-
-/(\x{2c65}\x{2c65})\1/8i
-    \x{2c65}\x{2c65}\x{23a}\x{23a}
-    
-/(ⱥⱥ)\1/8i
-    ⱥⱥȺȺ 
-    
-/(\x{23a}\x{23a}\x{23a})\1Y/8i
-    X\x{23a}\x{23a}\x{23a}\x{2c65}\x{2c65}\x{2c65}YZ
-
-/(\x{2c65}\x{2c65})\1Y/8i
-    X\x{2c65}\x{2c65}\x{23a}\x{23a}YZ
-
-/-- --/ 
-
-/-- These scripts weren't yet in Perl when I added Unicode 6.0.0 to PCRE --/
-
-/^[\p{Batak}]/8
-    \x{1bc0}
-    \x{1bff}
-    ** Failers
-    \x{1bf4}
-    
-/^[\p{Brahmi}]/8
-    \x{11000}
-    \x{1106f}
-    ** Failers
-    \x{1104e}
-    
-/^[\p{Mandaic}]/8
-    \x{840}
-    \x{85e}
-    ** Failers
-    \x{85c}
-    \x{85d}    
-
-/-- --/ 
-
-/(\X*)(.)/s8
-    A\x{300}
-
-/^S(\X*)e(\X*)$/8
-    Stéréo
-    
-/^\X/8 
-    ́réo
-
 /-- End of testinput13 --/

Modified: code/trunk/testdata/testinput14
===================================================================
--- code/trunk/testdata/testinput14    2011-12-28 16:10:09 UTC (rev 835)
+++ code/trunk/testdata/testinput14    2011-12-28 17:16:11 UTC (rev 836)
@@ -1,19 +1,310 @@
-/-- This test is run only when JIT support is available. It checks for a
-successful and an unsuccessful JIT compile and save and restore behaviour,
-and a couple of things that are different with JIT. --/
+/-- This set of tests is run only with the 8-bit library. It starts with all
+    the tests of the POSIX interface, because that is supported only with the
+    8-bit library. --/

-/abc/S+I
+/abc/P
+    abc
+    *** Failers

-/ab(*COMMIT)/S+I
+/^abc|def/P
+    abcdef
+    abcdef\B

-/abc/S+I>testsavedregex
+/.*((abc)$|(def))/P
+    defabc
+    \Zdefabc

-<testsavedregex
+/the quick brown fox/P
+    the quick brown fox
+    *** Failers
+    The Quick Brown Fox
+
+/the quick brown fox/Pi
+    the quick brown fox
+    The Quick Brown Fox
+
+/abc.def/P
+    *** Failers
+    abc\ndef
+
+/abc$/P
     abc
+    abc\n

-/a*/SI
+/(abc)\2/P

-/(?(R)a*(?1)|((?R))b)/S+
-    aaaabcde
+/(abc\1)/P
+    abc

+/a*(b+)(z)(z)/P
+    aaaabbbbzzzz
+    aaaabbbbzzzz\O0
+    aaaabbbbzzzz\O1
+    aaaabbbbzzzz\O2
+    aaaabbbbzzzz\O3
+    aaaabbbbzzzz\O4
+    aaaabbbbzzzz\O5
+
+/ab.cd/P
+    ab-cd
+    ab=cd
+    ** Failers
+    ab\ncd
+
+/ab.cd/Ps
+    ab-cd
+    ab=cd
+    ab\ncd
+
+/a(b)c/PN
+    abc
+
+/a(?P<name>b)c/PN
+    abc
+
+/a?|b?/P
+    abc
+    ** Failers
+    ddd\N   
+
+/\w+A/P
+   CDAAAAB 
+
+/\w+A/PU
+   CDAAAAB 
+   
+/\Biss\B/I+P
+    Mississippi
+
+/abc/\P
+
+/-- End of POSIX tests --/ 
+
+/a\Cb/
+    aXb
+    a\nb
+    ** Failers (too big char) 
+    A\x{123}B 
+  
+/\x{100}/I
+
+/  (?: [\040\t] |  \(
+(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
+\)  )*                          # optional leading comment
+(?:    (?:
+[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+    # some number of atom characters...
+(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
+|
+" (?:                      # opening quote...
+[^\\\x80-\xff\n\015"]                #   Anything except backslash and quote
+|                     #    or
+\\ [^\x80-\xff]           #   Escaped something (something != CR)
+)* "  # closing quote
+)                    # initial word
+(?:  (?: [\040\t] |  \(
+(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
+\)  )*  \.  (?: [\040\t] |  \(
+(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
+\)  )*   (?:
+[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+    # some number of atom characters...
+(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
+|
+" (?:                      # opening quote...
+[^\\\x80-\xff\n\015"]                #   Anything except backslash and quote
+|                     #    or
+\\ [^\x80-\xff]           #   Escaped something (something != CR)
+)* "  # closing quote
+)  )* # further okay, if led by a period
+(?: [\040\t] |  \(
+(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
+\)  )*  @  (?: [\040\t] |  \(
+(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
+\)  )*    (?:
+[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+    # some number of atom characters...
+(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
+|   \[                         # [
+(?: [^\\\x80-\xff\n\015\[\]] |  \\ [^\x80-\xff]  )*    #    stuff
+\]                        #           ]
+)                           # initial subdomain
+(?:                                  #
+(?: [\040\t] |  \(
+(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
+\)  )*  \.                        # if led by a period...
+(?: [\040\t] |  \(
+(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
+\)  )*   (?:
+[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+    # some number of atom characters...
+(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
+|   \[                         # [
+(?: [^\\\x80-\xff\n\015\[\]] |  \\ [^\x80-\xff]  )*    #    stuff
+\]                        #           ]
+)                     #   ...further okay
+)*
+# address
+|                     #  or
+(?:
+[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+    # some number of atom characters...
+(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
+|
+" (?:                      # opening quote...
+[^\\\x80-\xff\n\015"]                #   Anything except backslash and quote
+|                     #    or
+\\ [^\x80-\xff]           #   Escaped something (something != CR)
+)* "  # closing quote
+)             # one word, optionally followed by....
+(?:
+[^()<>@,;:".\\\[\]\x80-\xff\000-\010\012-\037]  |  # atom and space parts, or...
+\(
+(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
+\)       |  # comments, or...
+
+" (?:                      # opening quote...
+[^\\\x80-\xff\n\015"]                #   Anything except backslash and quote
+|                     #    or
+\\ [^\x80-\xff]           #   Escaped something (something != CR)
+)* "  # closing quote
+# quoted strings
+)*
+<  (?: [\040\t] |  \(
+(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
+\)  )*                     # leading <
+(?:  @  (?: [\040\t] |  \(
+(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
+\)  )*    (?:
+[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+    # some number of atom characters...
+(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
+|   \[                         # [
+(?: [^\\\x80-\xff\n\015\[\]] |  \\ [^\x80-\xff]  )*    #    stuff
+\]                        #           ]
+)                           # initial subdomain
+(?:                                  #
+(?: [\040\t] |  \(
+(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
+\)  )*  \.                        # if led by a period...
+(?: [\040\t] |  \(
+(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
+\)  )*   (?:
+[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+    # some number of atom characters...
+(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
+|   \[                         # [
+(?: [^\\\x80-\xff\n\015\[\]] |  \\ [^\x80-\xff]  )*    #    stuff
+\]                        #           ]
+)                     #   ...further okay
+)*
+
+(?:  (?: [\040\t] |  \(
+(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
+\)  )*  ,  (?: [\040\t] |  \(
+(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
+\)  )*  @  (?: [\040\t] |  \(
+(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
+\)  )*    (?:
+[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+    # some number of atom characters...
+(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
+|   \[                         # [
+(?: [^\\\x80-\xff\n\015\[\]] |  \\ [^\x80-\xff]  )*    #    stuff
+\]                        #           ]
+)                           # initial subdomain
+(?:                                  #
+(?: [\040\t] |  \(
+(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
+\)  )*  \.                        # if led by a period...
+(?: [\040\t] |  \(
+(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
+\)  )*   (?:
+[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+    # some number of atom characters...
+(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
+|   \[                         # [
+(?: [^\\\x80-\xff\n\015\[\]] |  \\ [^\x80-\xff]  )*    #    stuff
+\]                        #           ]
+)                     #   ...further okay
+)*
+)* # further okay, if led by comma
+:                                # closing colon
+(?: [\040\t] |  \(
+(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
+\)  )*  )? #       optional route
+(?:
+[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+    # some number of atom characters...
+(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
+|
+" (?:                      # opening quote...
+[^\\\x80-\xff\n\015"]                #   Anything except backslash and quote
+|                     #    or
+\\ [^\x80-\xff]           #   Escaped something (something != CR)
+)* "  # closing quote
+)                    # initial word
+(?:  (?: [\040\t] |  \(
+(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
+\)  )*  \.  (?: [\040\t] |  \(
+(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
+\)  )*   (?:
+[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+    # some number of atom characters...
+(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
+|
+" (?:                      # opening quote...
+[^\\\x80-\xff\n\015"]                #   Anything except backslash and quote
+|                     #    or
+\\ [^\x80-\xff]           #   Escaped something (something != CR)
+)* "  # closing quote
+)  )* # further okay, if led by a period
+(?: [\040\t] |  \(
+(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
+\)  )*  @  (?: [\040\t] |  \(
+(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
+\)  )*    (?:
+[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+    # some number of atom characters...
+(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
+|   \[                         # [
+(?: [^\\\x80-\xff\n\015\[\]] |  \\ [^\x80-\xff]  )*    #    stuff
+\]                        #           ]
+)                           # initial subdomain
+(?:                                  #
+(?: [\040\t] |  \(
+(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
+\)  )*  \.                        # if led by a period...
+(?: [\040\t] |  \(
+(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
+\)  )*   (?:
+[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+    # some number of atom characters...
+(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
+|   \[                         # [
+(?: [^\\\x80-\xff\n\015\[\]] |  \\ [^\x80-\xff]  )*    #    stuff
+\]                        #           ]
+)                     #   ...further okay
+)*
+#       address spec
+(?: [\040\t] |  \(
+(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
+\)  )*  > #                  trailing >
+# name and address
+)  (?: [\040\t] |  \(
+(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
+\)  )*                       # optional trailing comment
+/xSI
+
+<testdata/saved16
+
+/\h/SI
+
+/\v/SI
+
+/\R/SI
+
+/[\h]/BZ
+    >\x09<
+
+/[\h]+/BZ
+    >\x09\x20\xa0<
+
+/[\v]/BZ
+
+/[\H]/BZ
+
+/[^\h]/BZ
+
+/[\V]/BZ
+
+/[\x0a\V]/BZ
+
 /-- End of testinput14 --/

Modified: code/trunk/testdata/testinput15
===================================================================
--- code/trunk/testdata/testinput15    2011-12-28 16:10:09 UTC (rev 835)
+++ code/trunk/testdata/testinput15    2011-12-28 17:16:11 UTC (rev 836)
@@ -1,9 +1,278 @@
-/-- This test is run only when JIT support is not available. It checks that an 
-attempt to use it has the expected behaviour. It also tests things that
-are different without JIT. --/
-   
-/abc/S+I
+/-- This set of tests is for UTF-8 support, and is relevant only to the 8-bit 
+    library. --/

-/a*/SI
+/X(\C{3})/8
+    X\x{1234}

+/X(\C{4})/8
+    X\x{1234}YZ
+    
+/X\C*/8
+    XYZabcdce
+    
+/X\C*?/8
+    XYZabcde
+    
+/X\C{3,5}/8
+    Xabcdefg   
+    X\x{1234} 
+    X\x{1234}YZ
+    X\x{1234}\x{512}  
+    X\x{1234}\x{512}YZ
+
+/X\C{3,5}?/8
+    Xabcdefg   
+    X\x{1234} 
+    X\x{1234}YZ
+    X\x{1234}\x{512}  
+
+/a\Cb/8
+    aXb
+    a\nb
+    
+/a\C\Cb/8 
+    a\x{100}b 
+
+/ab\Cde/8
+    abXde
+
+/a\C\Cb/8 
+    a\x{100}b
+    ** Failers 
+    a\x{12257}b
+
+/[\xC3]/8
+
+/\xC3/8
+
+/\xC3\xC3\xC3xxx/8
+
+/\xC3\xC3\xC3xxx/8?DZSS
+
+/abc/8
+    \xC3]
+    \xC3
+    \xC3\xC3\xC3
+    \xC3\xC3\xC3\?
+    \xe1\x88 
+    \P\xe1\x88 
+    \P\P\xe1\x88 
+    XX\xea
+    \O0XX\xea
+    \O1XX\xea
+    \O2XX\xea
+    XX\xf1
+    XX\xf8  
+    XX\xfc
+    ZZ\xea\xaf\x20YY
+    ZZ\xfd\xbf\xbf\x2f\xbf\xbfYY  
+    ZZ\xfd\xbf\xbf\xbf\x2f\xbfYY  
+    ZZ\xfd\xbf\xbf\xbf\xbf\x2fYY  
+    ZZ\xffYY
+    ZZ\xfeYY  
+
+/anything/8
+    \xc0\x80
+    \xc1\x8f 
+    \xe0\x9f\x80
+    \xf0\x8f\x80\x80 
+    \xf8\x87\x80\x80\x80  
+    \xfc\x83\x80\x80\x80\x80
+    \xfe\x80\x80\x80\x80\x80  
+    \xff\x80\x80\x80\x80\x80  
+    \xc3\x8f
+    \xe0\xaf\x80
+    \xe1\x80\x80
+    \xf0\x9f\x80\x80 
+    \xf1\x8f\x80\x80 
+    \xf8\x88\x80\x80\x80  
+    \xf9\x87\x80\x80\x80  
+    \xfc\x84\x80\x80\x80\x80
+    \xfd\x83\x80\x80\x80\x80
+    \?\xf8\x88\x80\x80\x80  
+    \?\xf9\x87\x80\x80\x80  
+    \?\xfc\x84\x80\x80\x80\x80
+    \?\xfd\x83\x80\x80\x80\x80
+
+/\x{100}/8DZ
+
+/\x{1000}/8DZ
+
+/\x{10000}/8DZ
+
+/\x{100000}/8DZ
+
+/\x{10ffff}/8DZ
+
+/[\x{ff}]/8DZ
+
+/[\x{100}]/8DZ
+
+/\x80/8DZ
+
+/\xff/8DZ
+
+/\x{D55c}\x{ad6d}\x{C5B4}/DZ8 
+    \x{D55c}\x{ad6d}\x{C5B4} 
+
+/\x{65e5}\x{672c}\x{8a9e}/DZ8
+    \x{65e5}\x{672c}\x{8a9e}
+
+/\x{80}/DZ8
+
+/\x{084}/DZ8
+
+/\x{104}/DZ8
+
+/\x{861}/DZ8
+
+/\x{212ab}/DZ8
+
+/-- This one is here not because it's different to Perl, but because the way
+the captured single-byte is displayed. (In Perl it becomes a character, and you
+can't tell the difference.) --/
+    
+/X(\C)(.*)/8
+    X\x{1234}
+    X\nabc 
+
+/-- This one is here because Perl gives out a grumbly error message (quite 
+correctly, but that messes up comparisons). --/
+    
+/a\Cb/8
+    *** Failers 
+    a\x{100}b 
+    
+/[^ab\xC0-\xF0]/8SDZ
+    \x{f1}
+    \x{bf}
+    \x{100}
+    \x{1000}   
+    *** Failers
+    \x{c0} 
+    \x{f0} 
+
+/Ā{3,4}/8SDZ
+  \x{100}\x{100}\x{100}\x{100\x{100}
+
+/(\x{100}+|x)/8SDZ
+
+/(\x{100}*a|x)/8SDZ
+
+/(\x{100}{0,2}a|x)/8SDZ
+
+/(\x{100}{1,2}a|x)/8SDZ
+
+/\x{100}/8DZ
+
+/a\x{100}\x{101}*/8DZ
+
+/a\x{100}\x{101}+/8DZ
+
+/[^\x{c4}]/DZ
+
+/[\x{100}]/8DZ
+    \x{100}
+    Z\x{100}
+    \x{100}Z
+    *** Failers 
+
+/[\xff]/DZ8
+    >\x{ff}<
+
+/[^\xff]/8DZ
+
+/\x{100}abc(xyz(?1))/8DZ
+
+/a\x{1234}b/P8
+    a\x{1234}b
+
+/\777/8I
+  \x{1ff}
+  \777 
+  
+/\x{100}+\x{200}/8DZ
+
+/\x{100}+X/8DZ
+
+/^[\QĀ\E-\QŐ\E/BZ8
+
+/-- This tests the stricter UTF-8 check according to RFC 3629. --/ 
+    
+/X/8
+    \x{0}\x{d7ff}\x{e000}\x{10ffff}
+    \x{d800}
+    \x{d800}\?
+    \x{da00}
+    \x{da00}\?
+    \x{dfff}
+    \x{dfff}\?
+    \x{110000}    
+    \x{110000}\?    
+    \x{2000000} 
+    \x{2000000}\? 
+    \x{7fffffff} 
+    \x{7fffffff}\? 
+
+/(*UTF8)\x{1234}/
+  abcd\x{1234}pqr
+
+/(*CRLF)(*UTF8)(*BSR_UNICODE)a\Rb/I
+
+/\h/SI8
+    ABC\x{09}
+    ABC\x{20}
+    ABC\x{a0}
+    ABC\x{1680}
+    ABC\x{180e}
+    ABC\x{2000}
+    ABC\x{202f} 
+    ABC\x{205f} 
+    ABC\x{3000} 
+
+/\v/SI8
+    ABC\x{0a}
+    ABC\x{0b}
+    ABC\x{0c}
+    ABC\x{0d}
+    ABC\x{85}
+    ABC\x{2028}
+
+/\h*A/SI8
+    CDBABC
+    
+/\v+A/SI8
+
+/\s?xxx\s/8SI
+
+/\sxxx\s/I8ST1
+    AB\x{85}xxx\x{a0}XYZ
+    AB\x{a0}xxx\x{85}XYZ
+
+/\S \S/I8ST1
+    \x{a2} \x{84} 
+    A Z 
+
+/a+/8
+    a\x{123}aa\>1
+    a\x{123}aa\>2
+    a\x{123}aa\>3
+    a\x{123}aa\>4
+    a\x{123}aa\>5
+    a\x{123}aa\>6
+
+/\x{1234}+/iS8I
+
+/\x{1234}+?/iS8I
+
+/\x{1234}++/iS8I
+
+/\x{1234}{2}/iS8I
+
+/[^\x{c4}]/8DZ
+
+/X+\x{200}/8DZ
+
+/\R/SI8
+
 /-- End of testinput15 --/
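The RFC 3629 subjects in testinput15 can be read against a small Python sketch. It is an illustration only: it leans on Python's built-in strict UTF-8 decoder as a stand-in and is not PCRE's own validator. The helper names `is_rfc3629_scalar` and `is_valid_utf8` are invented for this example.

```python
def is_rfc3629_scalar(cp: int) -> bool:
    """Return True if cp may be encoded as UTF-8 under RFC 3629."""
    # RFC 3629 caps UTF-8 at U+10FFFF and excludes the surrogate block.
    return 0 <= cp <= 0x10FFFF and not (0xD800 <= cp <= 0xDFFF)


def is_valid_utf8(data: bytes) -> bool:
    """Return True if data decodes under Python's strict UTF-8 decoder."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


# Code points taken from the /X/8 subjects in the test file.
in_range = [0x0, 0xD7FF, 0xE000, 0x10FFFF]
out_of_range = [0xD800, 0xDA00, 0xDFFF, 0x110000, 0x2000000, 0x7FFFFFFF]

# Byte strings taken from the /abc/8 and /anything/8 subjects.
malformed = [b"\xc0\x80", b"\xe0\x9f\x80", b"\xf0\x8f\x80\x80", b"\xe1\x88"]
well_formed = [b"\xc3\x8f", b"\xe0\xaf\x80", b"\xe1\x80\x80", b"\xf0\x9f\x80\x80"]

# U+1234 occupies three bytes in UTF-8, which is why X(\C{3}) can
# capture it whole when \C matches one byte at a time.
u1234_length = len(chr(0x1234).encode("utf-8"))
```

The first three `malformed` entries are overlong encodings and the last is a truncated three-byte sequence; the `\?` data escape in the subjects is left out of the sketch.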

Copied: code/trunk/testdata/testinput16 (from rev 835, code/branches/pcre16/testdata/testinput16)
===================================================================
--- code/trunk/testdata/testinput16                            (rev 0)
+++ code/trunk/testdata/testinput16    2011-12-28 17:16:11 UTC (rev 836)
@@ -0,0 +1,35 @@
+/-- This set of tests is run only with the 8-bit library when Unicode property 
+    support is available. It starts with tests of the POSIX interface, because
+    that is supported only with the 8-bit library. --/
+
+/\w/P
+    +++\x{c2}
+
+/\w/WP
+    +++\x{c2}
+    
+/A\x{391}\x{10427}\x{ff3a}\x{1fb0}/8iDZ
+
+/A\x{391}\x{10427}\x{ff3a}\x{1fb0}/8DZ
+
+/AB\x{1fb0}/8DZ
+
+/AB\x{1fb0}/8DZi
+
+/\x{401}\x{420}\x{421}\x{422}\x{423}\x{424}\x{425}\x{426}\x{427}\x{428}\x{429}\x{42a}\x{42b}\x{42c}\x{42d}\x{42e}\x{42f}/8iSI
+    \x{401}\x{420}\x{421}\x{422}\x{423}\x{424}\x{425}\x{426}\x{427}\x{428}\x{429}\x{42a}\x{42b}\x{42c}\x{42d}\x{42e}\x{42f}
+    \x{451}\x{440}\x{441}\x{442}\x{443}\x{444}\x{445}\x{446}\x{447}\x{448}\x{449}\x{44a}\x{44b}\x{44c}\x{44d}\x{44e}\x{44f}
+
+/[ⱥ]/8iBZ
+
+/[^ⱥ]/8iBZ
+
+/\h/SI
+
+/\v/SI
+
+/\R/SI
+
+/[[:blank:]]/WBZ
+
+/-- End of testinput16 --/
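The caseless Cyrillic test in testinput16 pairs an upper-case subject with its lower-case counterpart. A minimal Python sketch of that relationship follows; it uses Python's `re` module and `str.lower()`, which are not PCRE and may differ from it in other corner cases, so treat it as an illustration of the expected case mapping only.

```python
import re

# U+0401 followed by U+0420..U+042F, as in the pattern and first subject.
upper = chr(0x401) + "".join(chr(cp) for cp in range(0x420, 0x430))

# U+0451 followed by U+0440..U+044F, as in the second subject.
lower = chr(0x451) + "".join(chr(cp) for cp in range(0x440, 0x450))

# The pattern contains letters only, so it can be used as a regex verbatim.
caseless_match = re.fullmatch(upper, lower, re.IGNORECASE)
caseful_match = re.fullmatch(upper, lower)

print("caseless:", caseless_match is not None)
print("caseful:", caseful_match is not None)
```

Under Unicode case rules the two strings differ only by case, so only the caseless attempt matches.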

Copied: code/trunk/testdata/testinput17 (from rev 835, code/branches/pcre16/testdata/testinput17)
===================================================================
--- code/trunk/testdata/testinput17                            (rev 0)
+++ code/trunk/testdata/testinput17    2011-12-28 17:16:11 UTC (rev 836)
@@ -0,0 +1,277 @@
+/-- This set of tests is for the 16-bit library's basic (non-UTF-16) features 
+    that are not compatible with the 8-bit library, or which give different 
+    output in 16-bit mode. --/
+
+/a\Cb/
+    aXb
+    a\nb
+  
+/-- Check maximum non-UTF character size --/
+
+/\x{ffff}/
+    A\x{ffff}B
+
+/\x{10000}/ 
+
+/[^\x{c4}]/DZ
+
+  
+/\x{100}/I
+
+/  (?: [\040\t] |  \(
+(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
+\)  )*                          # optional leading comment
+(?:    (?:
+[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+    # some number of atom characters...
+(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
+|
+" (?:                      # opening quote...
+[^\\\x80-\xff\n\015"]                #   Anything except backslash and quote
+|                     #    or
+\\ [^\x80-\xff]           #   Escaped something (something != CR)
+)* "  # closing quote
+)                    # initial word
+(?:  (?: [\040\t] |  \(
+(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
+\)  )*  \.  (?: [\040\t] |  \(
+(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
+\)  )*   (?:
+[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+    # some number of atom characters...
+(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
+|
+" (?:                      # opening quote...
+[^\\\x80-\xff\n\015"]                #   Anything except backslash and quote
+|                     #    or
+\\ [^\x80-\xff]           #   Escaped something (something != CR)
+)* "  # closing quote
+)  )* # further okay, if led by a period
+(?: [\040\t] |  \(
+(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
+\)  )*  @  (?: [\040\t] |  \(
+(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
+\)  )*    (?:
+[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+    # some number of atom characters...
+(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
+|   \[                         # [
+(?: [^\\\x80-\xff\n\015\[\]] |  \\ [^\x80-\xff]  )*    #    stuff
+\]                        #           ]
+)                           # initial subdomain
+(?:                                  #
+(?: [\040\t] |  \(
+(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
+\)  )*  \.                        # if led by a period...
+(?: [\040\t] |  \(
+(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
+\)  )*   (?:
+[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+    # some number of atom characters...
+(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
+|   \[                         # [
+(?: [^\\\x80-\xff\n\015\[\]] |  \\ [^\x80-\xff]  )*    #    stuff
+\]                        #           ]
+)                     #   ...further okay
+)*
+# address
+|                     #  or
+(?:
+[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+    # some number of atom characters...
+(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
+|
+" (?:                      # opening quote...
+[^\\\x80-\xff\n\015"]                #   Anything except backslash and quote
+|                     #    or
+\\ [^\x80-\xff]           #   Escaped something (something != CR)
+)* "  # closing quote
+)             # one word, optionally followed by....
+(?:
+[^()<>@,;:".\\\[\]\x80-\xff\000-\010\012-\037]  |  # atom and space parts, or...
+\(
+(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
+\)       |  # comments, or...
+
+" (?:                      # opening quote...
+[^\\\x80-\xff\n\015"]                #   Anything except backslash and quote
+|                     #    or
+\\ [^\x80-\xff]           #   Escaped something (something != CR)
+)* "  # closing quote
+# quoted strings
+)*
+<  (?: [\040\t] |  \(
+(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
+\)  )*                     # leading <
+(?:  @  (?: [\040\t] |  \(
+(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
+\)  )*    (?:
+[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+    # some number of atom characters...
+(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
+|   \[                         # [
+(?: [^\\\x80-\xff\n\015\[\]] |  \\ [^\x80-\xff]  )*    #    stuff
+\]                        #           ]
+)                           # initial subdomain
+(?:                                  #
+(?: [\040\t] |  \(
+(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
+\)  )*  \.                        # if led by a period...
+(?: [\040\t] |  \(
+(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
+\)  )*   (?:
+[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+    # some number of atom characters...
+(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
+|   \[                         # [
+(?: [^\\\x80-\xff\n\015\[\]] |  \\ [^\x80-\xff]  )*    #    stuff
+\]                        #           ]
+)                     #   ...further okay
+)*
+
+(?:  (?: [\040\t] |  \(
+(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
+\)  )*  ,  (?: [\040\t] |  \(
+(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
+\)  )*  @  (?: [\040\t] |  \(
+(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
+\)  )*    (?:
+[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+    # some number of atom characters...
+(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
+|   \[                         # [
+(?: [^\\\x80-\xff\n\015\[\]] |  \\ [^\x80-\xff]  )*    #    stuff
+\]                        #           ]
+)                           # initial subdomain
+(?:                                  #
+(?: [\040\t] |  \(
+(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
+\)  )*  \.                        # if led by a period...
+(?: [\040\t] |  \(
+(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
+\)  )*   (?:
+[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+    # some number of atom characters...
+(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
+|   \[                         # [
+(?: [^\\\x80-\xff\n\015\[\]] |  \\ [^\x80-\xff]  )*    #    stuff
+\]                        #           ]
+)                     #   ...further okay
+)*
+)* # further okay, if led by comma
+:                                # closing colon
+(?: [\040\t] |  \(
+(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
+\)  )*  )? #       optional route
+(?:
+[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+    # some number of atom characters...
+(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
+|
+" (?:                      # opening quote...
+[^\\\x80-\xff\n\015"]                #   Anything except backslash and quote
+|                     #    or
+\\ [^\x80-\xff]           #   Escaped something (something != CR)
+)* "  # closing quote
+)                    # initial word
+(?:  (?: [\040\t] |  \(
+(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
+\)  )*  \.  (?: [\040\t] |  \(
+(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
+\)  )*   (?:
+[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+    # some number of atom characters...
+(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
+|
+" (?:                      # opening quote...
+[^\\\x80-\xff\n\015"]                #   Anything except backslash and quote
+|                     #    or
+\\ [^\x80-\xff]           #   Escaped something (something != CR)
+)* "  # closing quote
+)  )* # further okay, if led by a period
+(?: [\040\t] |  \(
+(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
+\)  )*  @  (?: [\040\t] |  \(
+(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
+\)  )*    (?:
+[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+    # some number of atom characters...
+(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
+|   \[                         # [
+(?: [^\\\x80-\xff\n\015\[\]] |  \\ [^\x80-\xff]  )*    #    stuff
+\]                        #           ]
+)                           # initial subdomain
+(?:                                  #
+(?: [\040\t] |  \(
+(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
+\)  )*  \.                        # if led by a period...
+(?: [\040\t] |  \(
+(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
+\)  )*   (?:
+[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+    # some number of atom characters...
+(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
+|   \[                         # [
+(?: [^\\\x80-\xff\n\015\[\]] |  \\ [^\x80-\xff]  )*    #    stuff
+\]                        #           ]
+)                     #   ...further okay
+)*
+#       address spec
+(?: [\040\t] |  \(
+(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
+\)  )*  > #                  trailing >
+# name and address
+)  (?: [\040\t] |  \(
+(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
+\)  )*                       # optional trailing comment
+/xSI
+
+<testdata/saved8
+
+/[\h]/BZ
+    >\x09<
+
+/[\h]+/BZ
+    >\x09\x20\xa0<
+
+/[\v]/BZ
+
+/[\H]/BZ
+
+/[^\h]/BZ
+
+/[\V]/BZ
+
+/[\x0a\V]/BZ
+
+/\h+/SI
+    \x{1681}\x{200b}\x{1680}\x{2000}\x{202f}\x{3000}
+    \x{3001}\x{2fff}\x{200a}\xa0\x{2000}
+
+/[\h\x{dc00}]+/BZSI
+    \x{1681}\x{200b}\x{1680}\x{2000}\x{202f}\x{3000}
+    \x{3001}\x{2fff}\x{200a}\xa0\x{2000}
+
+/\H+/SI
+    \x{1680}\x{180e}\x{167f}\x{1681}\x{180d}\x{180f}
+    \x{2000}\x{200a}\x{1fff}\x{200b}
+    \x{202f}\x{205f}\x{202e}\x{2030}\x{205e}\x{2060}
+    \xa0\x{3000}\x9f\xa1\x{2fff}\x{3001}
+
+/[\H\x{d800}]+/BZSI
+    \x{1680}\x{180e}\x{167f}\x{1681}\x{180d}\x{180f}
+    \x{2000}\x{200a}\x{1fff}\x{200b}
+    \x{202f}\x{205f}\x{202e}\x{2030}\x{205e}\x{2060}
+    \xa0\x{3000}\x9f\xa1\x{2fff}\x{3001}
+
+/\v+/SI
+    \x{2027}\x{2030}\x{2028}\x{2029}
+    \x09\x0e\x84\x86\x85\x0a\x0b\x0c\x0d
+
+/[\v\x{dc00}]+/BZSI
+    \x{2027}\x{2030}\x{2028}\x{2029}
+    \x09\x0e\x84\x86\x85\x0a\x0b\x0c\x0d
+
+/\V+/SI
+    \x{2028}\x{2029}\x{2027}\x{2030}
+    \x85\x0a\x0b\x0c\x0d\x09\x0e\x84\x86
+
+/[\V\x{d800}]+/BZSI
+    \x{2028}\x{2029}\x{2027}\x{2030}
+    \x85\x0a\x0b\x0c\x0d\x09\x0e\x84\x86
+
+/\R+/SI<bsr_unicode>
+    \x{2027}\x{2030}\x{2028}\x{2029}
+    \x09\x0e\x84\x86\x85\x0a\x0b\x0c\x0d
+
+/\x{d800}\x{d7ff}\x{dc00}\x{dc00}\x{dcff}\x{dd00}/I
+    \x{d800}\x{d7ff}\x{dc00}\x{dc00}\x{dcff}\x{dd00}
+
+/-- End of testinput17 --/
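The `\h+` and `\v+` subjects in testinput17 mix members of each class with their immediate neighbours. The Python sketch below models that, assuming the character lists given in the pcrepattern documentation for `\h` and `\v`; the sets are typed in by hand rather than read from the library, and `first_run` is a hypothetical helper that mimics what a leftmost greedy `X+` would cover.

```python
# Horizontal white space as listed in pcrepattern (assumed, not read from PCRE).
H_SPACE = {0x09, 0x20, 0xA0, 0x1680, 0x180E, 0x202F, 0x205F, 0x3000}
H_SPACE |= set(range(0x2000, 0x200B))  # U+2000 through U+200A

# Vertical white space as listed in pcrepattern (same assumption).
V_SPACE = {0x0A, 0x0B, 0x0C, 0x0D, 0x85, 0x2028, 0x2029}


def first_run(subject, members):
    """Return the first maximal run of code units that are all in members."""
    run = []
    for unit in subject:
        if unit in members:
            run.append(unit)
        elif run:
            break  # the run has ended; a leftmost match stops here
    return run


# Subjects copied from the /\h+/SI and /\v+/SI tests.
h_subject_1 = [0x1681, 0x200B, 0x1680, 0x2000, 0x202F, 0x3000]
h_subject_2 = [0x3001, 0x2FFF, 0x200A, 0xA0, 0x2000]
v_subject_1 = [0x2027, 0x2030, 0x2028, 0x2029]
v_subject_2 = [0x09, 0x0E, 0x84, 0x86, 0x85, 0x0A, 0x0B, 0x0C, 0x0D]

for name, subject, members in [
    ("h1", h_subject_1, H_SPACE),
    ("h2", h_subject_2, H_SPACE),
    ("v1", v_subject_1, V_SPACE),
    ("v2", v_subject_2, V_SPACE),
]:
    print(name, [hex(u) for u in first_run(subject, members)])
```

Each subject starts with values just outside the class (for example U+1681 and U+200B), which is what makes them useful boundary checks.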

Copied: code/trunk/testdata/testinput18 (from rev 835, code/branches/pcre16/testdata/testinput18)
===================================================================
--- code/trunk/testdata/testinput18                            (rev 0)
+++ code/trunk/testdata/testinput18    2011-12-28 17:16:11 UTC (rev 836)
@@ -0,0 +1,241 @@
+/-- This set of tests is for UTF-16 support, and is relevant only to the 16-bit
+    library. --/
+
+/\xC3\xC3\xC3xxx/8?DZSS
+
+/abc/8
+    \xC3]
+
+/X(\C{3})/8
+    X\x{11234}Y
+
+/X(\C{4})/8
+    X\x{11234}YZ
+
+/X\C*/8
+    XYZabcdce
+
+/X\C*?/8
+    XYZabcde
+
+/X\C{3,5}/8
+    Xabcdefg
+    X\x{11234}Y
+    X\x{11234}YZ
+    X\x{11234}\x{512}
+    X\x{11234}\x{512}YZ
+    X\x{11234}\x{512}\x{11234}Z
+
+/X\C{3,5}?/8
+    Xabcdefg
+    X\x{11234}Y
+    X\x{11234}YZ
+    X\x{11234}\x{512}YZ
+    *** Failers
+    X\x{11234}
+
+/a\Cb/8
+    aXb
+    a\nb
+
+/a\C\Cb/8
+    a\x{12257}b
+    ** Failers
+    a\x{100}b
+
+/ab\Cde/8
+    abXde
+
+/-- Check maximum character size --/
+
+/\x{ffff}/8DZ
+
+/\x{10000}/8DZ
+
+/\x{100}/8DZ
+
+/\x{1000}/8DZ
+
+/\x{10000}/8DZ
+
+/\x{100000}/8DZ
+
+/\x{10ffff}/8DZ
+
+/[\x{ff}]/8DZ
+
+/[\x{100}]/8DZ
+
+/\x80/8DZ
+
+/\xff/8DZ
+
+/\x{D55c}\x{ad6d}\x{C5B4}/DZ8
+    \x{D55c}\x{ad6d}\x{C5B4}
+
+/\x{65e5}\x{672c}\x{8a9e}/DZ8
+    \x{65e5}\x{672c}\x{8a9e}
+
+/\x{80}/DZ8
+
+/\x{084}/DZ8
+
+/\x{104}/DZ8
+
+/\x{861}/DZ8
+
+/\x{212ab}/DZ8
+
+/-- This one is here not because it's different to Perl, but because the way
+the captured single-byte is displayed. (In Perl it becomes a character, and you
+can't tell the difference.) --/
+
+/X(\C)(.*)/8
+    X\x{1234}
+    X\nabc
+
+/-- This one is here because Perl gives out a grumbly error message (quite
+correctly, but that messes up comparisons). --/
+
+/a\Cb/8
+    *** Failers
+    a\x{100}b
+
+/[^ab\xC0-\xF0]/8SDZ
+    \x{f1}
+    \x{bf}
+    \x{100}
+    \x{1000}
+    *** Failers
+    \x{c0}
+    \x{f0}
+
+/Ā{3,4}/8SDZ
+  \x{100}\x{100}\x{100}\x{100\x{100}
+
+/(\x{100}+|x)/8SDZ
+
+/(\x{100}*a|x)/8SDZ
+
+/(\x{100}{0,2}a|x)/8SDZ
+
+/(\x{100}{1,2}a|x)/8SDZ
+
+/\x{100}/8DZ
+
+/a\x{100}\x{101}*/8DZ
+
+/a\x{100}\x{101}+/8DZ
+
+/[^\x{c4}]/DZ
+
+/[\x{100}]/8DZ
+    \x{100}
+    Z\x{100}
+    \x{100}Z
+    *** Failers
+
+/[\xff]/DZ8
+    >\x{ff}<
+
+/[^\xff]/8DZ
+
+/\x{100}abc(xyz(?1))/8DZ
+
+/\777/8I
+  \x{1ff}
+  \777
+
+/\x{100}+\x{200}/8DZ
+
+/\x{100}+X/8DZ
+
+/^[\QĀ\E-\QŐ\E/BZ8
+
+/X/8
+    \x{0}\x{d7ff}\x{e000}\x{10ffff}
+    \x{d800}
+    \x{d800}\?
+    \x{da00}
+    \x{da00}\?
+    \x{dc00}
+    \x{dc00}\?
+    \x{de00}
+    \x{de00}\?
+    \x{dfff}
+    \x{dfff}\?
+    \x{110000}
+    \x{d800}\x{1234}
+    \x{fffe}
+
+/(*UTF16)\x{11234}/
+  abcd\x{11234}pqr
+
+/(*CRLF)(*UTF16)(*BSR_UNICODE)a\Rb/I
+
+/\h/SI8
+    ABC\x{09}
+    ABC\x{20}
+    ABC\x{a0}
+    ABC\x{1680}
+    ABC\x{180e}
+    ABC\x{2000}
+    ABC\x{202f}
+    ABC\x{205f}
+    ABC\x{3000}
+
+/\v/SI8
+    ABC\x{0a}
+    ABC\x{0b}
+    ABC\x{0c}
+    ABC\x{0d}
+    ABC\x{85}
+    ABC\x{2028}
+
+/\h*A/SI8
+    CDBABC
+
+/\v+A/SI8
+
+/\s?xxx\s/8SI
+
+/\sxxx\s/I8ST1
+    AB\x{85}xxx\x{a0}XYZ
+    AB\x{a0}xxx\x{85}XYZ
+
+/\S \S/I8ST1
+    \x{a2} \x{84}
+    A Z
+
+/a+/8
+    a\x{123}aa\>1
+    a\x{123}aa\>2
+    a\x{123}aa\>3
+    a\x{123}aa\>4
+    a\x{123}aa\>5
+    a\x{123}aa\>6
+
+/\x{1234}+/iS8I
+
+/\x{1234}+?/iS8I
+
+/\x{1234}++/iS8I
+
+/\x{1234}{2}/iS8I
+
+/[^\x{c4}]/8DZ
+
+/X+\x{200}/8DZ
+
+/\R/SI8
+
+/-- Check bad offset --/
+
+/a/8
+    \x{10000}\>1
+    \x{10000}ab\>2
+    \x{10000}ab\>3
+    \x{10000}ab\>4
+    \x{10000}ab\>5
+
+/-- End of testinput18 --/
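Several testinput18 subjects depend on how many 16-bit code units a character occupies, since `\C` consumes one unit and the `\>n` start offsets count units. The sketch below shows the standard UTF-16 surrogate arithmetic in Python; `utf16_units`, `encode_subject`, and `is_low_surrogate` are hypothetical helper names, and nothing here calls PCRE.

```python
def utf16_units(cp):
    """Return the UTF-16 code units for one code point."""
    if cp < 0 or cp > 0x10FFFF or 0xD800 <= cp <= 0xDFFF:
        raise ValueError("not encodable in UTF-16: %#x" % cp)
    if cp < 0x10000:
        return [cp]  # a BMP character is a single unit
    v = cp - 0x10000
    # High surrogate carries the top 10 bits, low surrogate the bottom 10.
    return [0xD800 + (v >> 10), 0xDC00 + (v & 0x3FF)]


def encode_subject(code_points):
    """Flatten a list of code points into UTF-16 code units."""
    units = []
    for cp in code_points:
        units.extend(utf16_units(cp))
    return units


def is_low_surrogate(unit):
    """True if unit is the second half of a surrogate pair."""
    return 0xDC00 <= unit <= 0xDFFF


# X\x{11234}Y: one unit for X, two for U+11234, one for Y.
subject = encode_subject([ord("X"), 0x11234, ord("Y")])

# \x{10000}ab: offset 1 falls on the low surrogate of U+10000.
offset_subject = encode_subject([0x10000, ord("a"), ord("b")])
mid_character = [is_low_surrogate(u) for u in offset_subject]

print([hex(u) for u in subject])
print(mid_character)
```

With three units after `X`, the subject `X\x{11234}Y` has exactly enough for `\C{3}`, while `X\x{11234}` alone has only two, which is consistent with it being listed under Failers for `/X\C{3,5}?/8`.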

Copied: code/trunk/testdata/testinput19 (from rev 835, code/branches/pcre16/testdata/testinput19)
===================================================================
--- code/trunk/testdata/testinput19                            (rev 0)
+++ code/trunk/testdata/testinput19    2011-12-28 17:16:11 UTC (rev 836)
@@ -0,0 +1,22 @@
+/-- This set of tests is for Unicode property support, relevant only to the
+    16-bit library. --/
+    
+/A\x{391}\x{10427}\x{ff3a}\x{1fb0}/8iDZ
+
+/A\x{391}\x{10427}\x{ff3a}\x{1fb0}/8DZ
+
+/AB\x{1fb0}/8DZ
+
+/AB\x{1fb0}/8DZi
+
+/\x{401}\x{420}\x{421}\x{422}\x{423}\x{424}\x{425}\x{426}\x{427}\x{428}\x{429}\x{42a}\x{42b}\x{42c}\x{42d}\x{42e}\x{42f}/8iSI
+    \x{401}\x{420}\x{421}\x{422}\x{423}\x{424}\x{425}\x{426}\x{427}\x{428}\x{429}\x{42a}\x{42b}\x{42c}\x{42d}\x{42e}\x{42f}
+    \x{451}\x{440}\x{441}\x{442}\x{443}\x{444}\x{445}\x{446}\x{447}\x{448}\x{449}\x{44a}\x{44b}\x{44c}\x{44d}\x{44e}\x{44f}
+
+/[ⱥ]/8iBZ
+
+/[^ⱥ]/8iBZ
+
+/[[:blank:]]/WBZ
+
+/-- End of testinput19 --/

Modified: code/trunk/testdata/testinput2
===================================================================
--- code/trunk/testdata/testinput2    2011-12-28 16:10:09 UTC (rev 835)
+++ code/trunk/testdata/testinput2    2011-12-28 17:16:11 UTC (rev 836)
@@ -3,12 +3,11 @@
     It also checks the non-Perl syntax the PCRE supports (Python, .NET, 
     Oniguruma). Finally, there are some tests where PCRE and Perl differ, 
     either because PCRE can't be compatible, or there is a possible Perl 
-    bug. --/  
+    bug.
+    
+    NOTE: This is a non-UTF set of tests. When UTF support is needed, use
+    test 5, and if Unicode Property Support is needed, use test 7. --/

-/-- Originally, the Perl >= 5.10 things were in here too, but now I have 
-    separated many (most?) of them out into test 11. However, there may still 
-    be some that were overlooked. --/   
-
 /(a)b|/I

 /abc/I
@@ -144,40 +143,6 @@
     defabc
     \Zdefabc

-/abc/P
-    abc
-    *** Failers
-
-/^abc|def/P
-    abcdef
-    abcdef\B
-
-/.*((abc)$|(def))/P
-    defabc
-    \Zdefabc
-
-/the quick brown fox/P
-    the quick brown fox
-    *** Failers
-    The Quick Brown Fox
-
-/the quick brown fox/Pi
-    the quick brown fox
-    The Quick Brown Fox
-
-/abc.def/P
-    *** Failers
-    abc\ndef
-
-/abc$/P
-    abc
-    abc\n
-
-/(abc)\2/P
-
-/(abc\1)/P
-    abc
-
 /)/

 /a[]b/
@@ -442,8 +407,6 @@

 /abc/\

-/abc/\P
-
 /abc/\i

 /(a)bc(d)/I
@@ -491,9 +454,6 @@
 /\Biss\B/I+
     Mississippi

-/\Biss\B/I+P
-    Mississippi
-
 /iss/IG+
     Mississippi

@@ -629,15 +589,6 @@
     *** Failers
     \Nabc

-/a*(b+)(z)(z)/P
-    aaaabbbbzzzz
-    aaaabbbbzzzz\O0
-    aaaabbbbzzzz\O1
-    aaaabbbbzzzz\O2
-    aaaabbbbzzzz\O3
-    aaaabbbbzzzz\O4
-    aaaabbbbzzzz\O5
-
 /^.?abcd/IS

 /\(             # ( at start
@@ -1491,17 +1442,6 @@
     ** Failers
     line one\nthis is a line\nbreak in the second line

-/ab.cd/P
-    ab-cd
-    ab=cd
-    ** Failers
-    ab\ncd
-
-/ab.cd/Ps
-    ab-cd
-    ab=cd
-    ab\ncd
-
 /(?i)(?-i)AbCd/I
     AbCd
     ** Failers
@@ -1552,14 +1492,6 @@
     (this)
     ((this))

-/a(b)c/PN
-    abc
-
-/a(?P<name>b)c/PN
-    abc
-
-/\x{100}/I
-
 /\x{0000ff}/I

 /^((?P<A>a1)|(?P<A>a2)b)/I
@@ -2241,22 +2173,6 @@
     xabcpqrx
     xxyzx

-/[\h]/BZ
-    >\x09<
-
-/[\h]+/BZ
-    >\x09\x20\xa0<
-
-/[\v]/BZ
-
-/[\H]/BZ
-
-/[^\h]/BZ
-
-/[\V]/BZ
-
-/[\x0a\V]/BZ
-
 /\H++X/BZ
     ** Failers
     XXXX
@@ -2616,11 +2532,6 @@

 /(?(?=.*b).*b|^d)/I

-/a?|b?/P
-    abc
-    ** Failers
-    ddd\N   
-
 /xyz/C
   xyz 
   abcxyz 
@@ -2812,12 +2723,6 @@
    abc\P
    abc\P\P

-/\w+A/P
-   CDAAAAB 
-
-/\w+A/PU
-   CDAAAAB 
-
 /abc\K123/
     xyzabc123pqr
     xyzabc12\P
@@ -2961,201 +2866,6 @@

 /^From +([^ ]+) +[a-zA-Z][a-zA-Z][a-zA-Z] +[a-zA-Z][a-zA-Z][a-zA-Z] +[0-9]?[0-9] +[0-9][0-9]:[0-9][0-9]/SI

-/  (?: [\040\t] |  \(
-(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
-\)  )*                          # optional leading comment
-(?:    (?:
-[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+    # some number of atom characters...
-(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
-|
-" (?:                      # opening quote...
-[^\\\x80-\xff\n\015"]                #   Anything except backslash and quote
-|                     #    or
-\\ [^\x80-\xff]           #   Escaped something (something != CR)
-)* "  # closing quote
-)                    # initial word
-(?:  (?: [\040\t] |  \(
-(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
-\)  )*  \.  (?: [\040\t] |  \(
-(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
-\)  )*   (?:
-[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+    # some number of atom characters...
-(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
-|
-" (?:                      # opening quote...
-[^\\\x80-\xff\n\015"]                #   Anything except backslash and quote
-|                     #    or
-\\ [^\x80-\xff]           #   Escaped something (something != CR)
-)* "  # closing quote
-)  )* # further okay, if led by a period
-(?: [\040\t] |  \(
-(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
-\)  )*  @  (?: [\040\t] |  \(
-(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
-\)  )*    (?:
-[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+    # some number of atom characters...
-(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
-|   \[                         # [
-(?: [^\\\x80-\xff\n\015\[\]] |  \\ [^\x80-\xff]  )*    #    stuff
-\]                        #           ]
-)                           # initial subdomain
-(?:                                  #
-(?: [\040\t] |  \(
-(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
-\)  )*  \.                        # if led by a period...
-(?: [\040\t] |  \(
-(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
-\)  )*   (?:
-[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+    # some number of atom characters...
-(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
-|   \[                         # [
-(?: [^\\\x80-\xff\n\015\[\]] |  \\ [^\x80-\xff]  )*    #    stuff
-\]                        #           ]
-)                     #   ...further okay
-)*
-# address
-|                     #  or
-(?:
-[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+    # some number of atom characters...
-(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
-|
-" (?:                      # opening quote...
-[^\\\x80-\xff\n\015"]                #   Anything except backslash and quote
-|                     #    or
-\\ [^\x80-\xff]           #   Escaped something (something != CR)
-)* "  # closing quote
-)             # one word, optionally followed by....
-(?:
-[^()<>@,;:".\\\[\]\x80-\xff\000-\010\012-\037]  |  # atom and space parts, or...
-\(
-(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
-\)       |  # comments, or...
-
-" (?:                      # opening quote...
-[^\\\x80-\xff\n\015"]                #   Anything except backslash and quote
-|                     #    or
-\\ [^\x80-\xff]           #   Escaped something (something != CR)
-)* "  # closing quote
-# quoted strings
-)*
-<  (?: [\040\t] |  \(
-(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
-\)  )*                     # leading <
-(?:  @  (?: [\040\t] |  \(
-(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
-\)  )*    (?:
-[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+    # some number of atom characters...
-(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
-|   \[                         # [
-(?: [^\\\x80-\xff\n\015\[\]] |  \\ [^\x80-\xff]  )*    #    stuff
-\]                        #           ]
-)                           # initial subdomain
-(?:                                  #
-(?: [\040\t] |  \(
-(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
-\)  )*  \.                        # if led by a period...
-(?: [\040\t] |  \(
-(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
-\)  )*   (?:
-[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+    # some number of atom characters...
-(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
-|   \[                         # [
-(?: [^\\\x80-\xff\n\015\[\]] |  \\ [^\x80-\xff]  )*    #    stuff
-\]                        #           ]
-)                     #   ...further okay
-)*
-
-(?:  (?: [\040\t] |  \(
-(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
-\)  )*  ,  (?: [\040\t] |  \(
-(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
-\)  )*  @  (?: [\040\t] |  \(
-(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
-\)  )*    (?:
-[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+    # some number of atom characters...
-(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
-|   \[                         # [
-(?: [^\\\x80-\xff\n\015\[\]] |  \\ [^\x80-\xff]  )*    #    stuff
-\]                        #           ]
-)                           # initial subdomain
-(?:                                  #
-(?: [\040\t] |  \(
-(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
-\)  )*  \.                        # if led by a period...
-(?: [\040\t] |  \(
-(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
-\)  )*   (?:
-[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+    # some number of atom characters...
-(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
-|   \[                         # [
-(?: [^\\\x80-\xff\n\015\[\]] |  \\ [^\x80-\xff]  )*    #    stuff
-\]                        #           ]
-)                     #   ...further okay
-)*
-)* # further okay, if led by comma
-:                                # closing colon
-(?: [\040\t] |  \(
-(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
-\)  )*  )? #       optional route
-(?:
-[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+    # some number of atom characters...
-(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
-|
-" (?:                      # opening quote...
-[^\\\x80-\xff\n\015"]                #   Anything except backslash and quote
-|                     #    or
-\\ [^\x80-\xff]           #   Escaped something (something != CR)
-)* "  # closing quote
-)                    # initial word
-(?:  (?: [\040\t] |  \(
-(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
-\)  )*  \.  (?: [\040\t] |  \(
-(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
-\)  )*   (?:
-[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+    # some number of atom characters...
-(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
-|
-" (?:                      # opening quote...
-[^\\\x80-\xff\n\015"]                #   Anything except backslash and quote
-|                     #    or
-\\ [^\x80-\xff]           #   Escaped something (something != CR)
-)* "  # closing quote
-)  )* # further okay, if led by a period
-(?: [\040\t] |  \(
-(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
-\)  )*  @  (?: [\040\t] |  \(
-(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
-\)  )*    (?:
-[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+    # some number of atom characters...
-(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
-|   \[                         # [
-(?: [^\\\x80-\xff\n\015\[\]] |  \\ [^\x80-\xff]  )*    #    stuff
-\]                        #           ]
-)                           # initial subdomain
-(?:                                  #
-(?: [\040\t] |  \(
-(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
-\)  )*  \.                        # if led by a period...
-(?: [\040\t] |  \(
-(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
-\)  )*   (?:
-[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+    # some number of atom characters...
-(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
-|   \[                         # [
-(?: [^\\\x80-\xff\n\015\[\]] |  \\ [^\x80-\xff]  )*    #    stuff
-\]                        #           ]
-)                     #   ...further okay
-)*
-#       address spec
-(?: [\040\t] |  \(
-(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
-\)  )*  > #                  trailing >
-# name and address
-)  (?: [\040\t] |  \(
-(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
-\)  )*                       # optional trailing comment
-/xSI
-
 /<tr([\w\W\s\d][^<>]{0,})><TD([\w\W\s\d][^<>]{0,})>([\d]{0,}\.)(.*)((<BR>([\w\W\s\d][^<>]{0,})|[\s]{0,}))<\/a><\/TD><TD([\w\W\s\d][^<>]{0,})>([\w\W\s\d][^<>]{0,})<\/TD><TD([\w\W\s\d][^<>]{0,})>([\w\W\s\d][^<>]{0,})<\/TD><\/TR>/isIS

 "(?>.*/)foo"SI
@@ -3322,116 +3032,19 @@
 /A(*PRUNE)B|A(*PRUNE)C/K
     AC

-/--- A whole lot of tests of verbs with arguments are here rather than in test
-     11 because Perl doesn't seem to follow its specification entirely 
-     correctly. ---/
-
-/--- Perl 5.11 sets $REGERROR on the AC failure case here; PCRE does not. It is
-     not clear how Perl defines "involved in the failure of the match". ---/ 
-
-/^(A(*THEN:A)B|C(*THEN:B)D)/K
-    AB
-    CD
-    ** Failers
-    AC
-    CB    
-    
-/--- Check the use of names for success and failure. PCRE doesn't show these 
-names for success, though Perl does, contrary to its spec. ---/
-
-/^(A(*PRUNE:A)B|C(*PRUNE:B)D)/K
-    AB
-    CD
-    ** Failers
-    AC
-    CB    
-    
-/--- An empty name does not pass back an empty string. It is the same as if no
-name were given. ---/ 
-
-/^(A(*PRUNE:)B|C(*PRUNE:B)D)/K
-    AB
-    CD 
-
-/--- PRUNE goes to next bumpalong; COMMIT does not. ---/
-    
-/A(*PRUNE:A)B/K
-    ACAB
-
-/(*MARK:A)(*PRUNE:B)(C|X)/KS
-    C
-    D 
-
-/(*MARK:A)(*PRUNE:B)(C|X)/KSS
-    C
-    D 
-
-/(*MARK:A)(*THEN:B)(C|X)/KS
-    C
-    D 
-
-/(*MARK:A)(*THEN:B)(C|X)/KSY
-    C
-    D 
-
-/(*MARK:A)(*THEN:B)(C|X)/KSS
-    C
-    D 
-
-/--- This should fail, as the skip causes a bump to offset 3 (the skip) ---/
-
-/A(*MARK:A)A+(*SKIP)(B|Z) | AC/xK
-    AAAC
-
-/--- Same --/
-
-/A(*MARK:A)A+(*MARK:B)(*SKIP:B)(B|Z) | AC/xK
-    AAAC
-
 /--- This should fail; the SKIP advances by one, but when we get to AC, the
-     PRUNE kills it. ---/ 
+     PRUNE kills it. Perl behaves differently. ---/

 /A(*PRUNE:A)A+(*SKIP:A)(B|Z) | AC/xK
     AAAC

-/A(*:A)A+(*SKIP)(B|Z) | AC/xK
-    AAAC
+/--- Mark names can be duplicated. Perl doesn't give a mark for this one,
+though PCRE does. ---/

-/--- This should fail, as a null name is the same as no name ---/
-
-/A(*MARK:A)A+(*SKIP:)(B|Z) | AC/xK
-    AAAC
-
-/--- This fails in PCRE, and I think that is in accordance with Perl's 
-     documentation, though in Perl it succeeds. ---/
-    
-/A(*MARK:A)A+(*SKIP:B)(B|Z) | AAC/xK
-    AAAC
-
-/--- Mark names can be duplicated ---/
-
-/A(*:A)B|X(*:A)Y/K
-    AABC
-    XXYZ 
-    
 /^A(*:A)B|^X(*:A)Y/K
     ** Failers
     XAQQ

-/--- A check on what happens after hitting a mark and them bumping along to
-something that does not even start. Perl reports tags after the failures here, 
-though it does not when the individual letters are made into something 
-more complicated. ---/
-
-/A(*:A)B|XX(*:B)Y/K
-    AABC
-    XXYZ 
-    ** Failers
-    XAQQ  
-    XAQQXZZ  
-    AXQQQ 
-    AXXQQQ 
-    
 /--- COMMIT at the start of a pattern should be the same as an anchor. Perl 
 optimizations defeat this. So does the PCRE optimization unless we disable it 
 with \Y. ---/
@@ -3441,78 +3054,6 @@
     ** Failers
     DEFGABC\Y

-/--- Repeat some tests with added studying. ---/
-
-/A(*COMMIT)B/+KS
-    ACABX
- 
-/A(*THEN)B|A(*THEN)C/KS
-    AC
-
-/A(*PRUNE)B|A(*PRUNE)C/KS
-    AC
-
-/^(A(*THEN:A)B|C(*THEN:B)D)/KS
-    AB
-    CD
-    ** Failers
-    AC
-    CB    
-
-/^(A(*PRUNE:A)B|C(*PRUNE:B)D)/KS
-    AB
-    CD
-    ** Failers
-    AC
-    CB    
-
-/^(A(*PRUNE:)B|C(*PRUNE:B)D)/KS
-    AB
-    CD 
-
-/A(*PRUNE:A)B/KS
-    ACAB
-
-/(*MARK:A)(*PRUNE:B)(C|X)/KS
-    C
-    D 
-
-/(*MARK:A)(*THEN:B)(C|X)/KS
-    C
-    D 
-
-/A(*MARK:A)A+(*SKIP)(B|Z) | AC/xKS
-    AAAC
-
-/A(*MARK:A)A+(*MARK:B)(*SKIP:B)(B|Z) | AC/xKS
-    AAAC
-    
-/A(*PRUNE:A)A+(*SKIP:A)(B|Z) | AC/xKS
-    AAAC
-
-/A(*:A)A+(*SKIP)(B|Z) | AC/xKS
-    AAAC
-
-/A(*MARK:A)A+(*SKIP:)(B|Z) | AC/xKS
-    AAAC
-
-/A(*MARK:A)A+(*SKIP:B)(B|Z) | AAC/xKS
-    AAAC
-
-/A(*:A)B|XX(*:B)Y/KS
-    AABC
-    XXYZ 
-    ** Failers
-    XAQQ  
-    XAQQXZZ  
-    AXQQQ 
-    AXXQQQ 
-    
-/(*COMMIT)ABC/
-    ABCDEFG
-    ** Failers
-    DEFGABC\Y  
-
 /^(ab (c+(*THEN)cd) | xyz)/x
     abcccd

@@ -3980,11 +3521,6 @@
 /^a\x1z/<JS>
     ax1z

-/^a\X41z/<JS>
-    aX41z
-    *** Failers
-    aAz
-
 /^a\u0041z/<JS>
     aAz
     *** Failers
@@ -4007,6 +3543,44 @@

 /(?(?=c)c|d)*+Y/BZ

-/(?<=ab\Cde)X/8
+/a[\NB]c/
+    aNc
+    
+/a[B-\Nc]/

+/(a)(?2){0,1999}?(b)/
+
+/(a)(?(DEFINE)(b))(?2){0,1999}?(?2)/
+
+/--- This test, with something more complicated than individual letters, causes
+different behaviour in Perl. Perhaps it disables some optimization; no tag is
+passed back for the failures, whereas in PCRE there is a tag. ---/
+    
+/(A|P)(*:A)(B|P) | (X|P)(X|P)(*:B)(Y|P)/xK
+    AABC
+    XXYZ 
+    ** Failers
+    XAQQ  
+    XAQQXZZ  
+    AXQQQ 
+    AXXQQQ 
+
+/-- Perl doesn't give marks for these, though it does if the alternatives are
+replaced by single letters. --/
+    
+/(b|q)(*:m)f|a(*:n)w/K
+    aw 
+    ** Failers 
+    abc
+
+/(q|b)(*:m)f|a(*:n)w/K
+    aw 
+    ** Failers 
+    abc
+
+/-- After a partial match, the behaviour is as for a failure. --/
+
+/^a(*:X)bcde/K
+   abc\P
+
 /-- End of testinput2 --/

Copied: code/trunk/testdata/testinput20 (from rev 835, code/branches/pcre16/testdata/testinput20)
===================================================================
--- code/trunk/testdata/testinput20                            (rev 0)
+++ code/trunk/testdata/testinput20    2011-12-28 17:16:11 UTC (rev 836)
@@ -0,0 +1,19 @@
+/-- These tests are for the handling of characters greater than 255 in 16-bit,
+    non-UTF-16 mode. --/
+
+/^\x{ffff}+/i
+    \x{ffff}
+
+/^\x{ffff}?/i
+    \x{ffff}
+
+/^\x{ffff}*/i
+    \x{ffff}
+
+/^\x{ffff}{3}/i
+    \x{ffff}\x{ffff}\x{ffff}
+
+/^\x{ffff}{0,3}/i
+    \x{ffff}
+
+/-- End of testinput20 --/

Modified: code/trunk/testdata/testinput4
===================================================================
--- code/trunk/testdata/testinput4    2011-12-28 16:10:09 UTC (rev 835)
+++ code/trunk/testdata/testinput4    2011-12-28 17:16:11 UTC (rev 836)
@@ -1,5 +1,6 @@
-/-- This set of tests is for UTF-8 support, excluding Unicode properties. It is
-    compatible with all versions of Perl 5. --/
+/-- This set of tests is for UTF support, excluding Unicode properties. It is
+    compatible with all versions of Perl >= 5.10 and both the 8-bit and 16-bit
+    PCRE libraries. --/

 /a.b/8
     acb
@@ -126,31 +127,6 @@
     *** Failers
     XYZ

-/X(\C{3})/8
-    X\x{1234}
-
-/X(\C{4})/8
-    X\x{1234}YZ
-    
-/X\C*/8
-    XYZabcdce
-    
-/X\C*?/8
-    XYZabcde
-    
-/X\C{3,5}/8
-    Xabcdefg   
-    X\x{1234} 
-    X\x{1234}YZ
-    X\x{1234}\x{512}  
-    X\x{1234}\x{512}YZ
-
-/X\C{3,5}?/8
-    Xabcdefg   
-    X\x{1234} 
-    X\x{1234}YZ
-    X\x{1234}\x{512}  
-
 /[^a]+/8g
     bcd
     \x{100}aY\x{256}Z 
@@ -456,17 +432,6 @@
     \x{150}X
     \x{200}X

-/a\Cb/
-    aXb
-    a\nb
-  
-/a\Cb/8
-    aXb
-    a\nb
-    
-/a\C\Cb/8 
-    a\x{100}b 
-
 /[z-\x{100}]/8i
     z
     Z 
@@ -650,7 +615,10 @@
 /(abc)\1/8
    abc

-/ab\Cde/8
-    abXde
+/a(*:a\x{1234}b)/8K
+    abc

+/a(*:a£b)/8K 
+    abc
+
 /-- End of testinput4 --/

Modified: code/trunk/testdata/testinput5
===================================================================
--- code/trunk/testdata/testinput5    2011-12-28 16:10:09 UTC (rev 835)
+++ code/trunk/testdata/testinput5    2011-12-28 17:16:11 UTC (rev 836)
@@ -1,79 +1,36 @@
-/-- This set of tests checks the API, internals, and non-Perl stuff for UTF-8
-    support, excluding Unicode properties. --/
+/-- This set of tests checks the API, internals, and non-Perl stuff for UTF
+    support, excluding Unicode properties. However, tests that give different
+    results in 8-bit and 16-bit modes are excluded (see tests 16 and 17). --/

-/\x{100}/8DZ
+/\x{110000}/8DZ

-/\x{1000}/8DZ
+/\x{ffffffff}/8

-/\x{10000}/8DZ
+/\x{100000000}/8

-/\x{100000}/8DZ
+/\x{d800}/8

-/\x{1000000}/8DZ
+/\x{dfff}/8

-/\x{4000000}/8DZ
+/\x{d7ff}/8

-/\x{7fffFFFF}/8DZ
+/\x{e000}/8

-/[\x{ff}]/8DZ
-
-/[\x{100}]/8DZ
-
-/\x{ffffffff}/8
-
-/\x{100000000}/8
-
 /^\x{100}a\x{1234}/8
     \x{100}a\x{1234}bcd

-/\x80/8DZ
-
-/\xff/8DZ
-
 /\x{0041}\x{2262}\x{0391}\x{002e}/DZ8
     \x{0041}\x{2262}\x{0391}\x{002e}

-/\x{D55c}\x{ad6d}\x{C5B4}/DZ8 
-    \x{D55c}\x{ad6d}\x{C5B4} 
-
-/\x{65e5}\x{672c}\x{8a9e}/DZ8
-    \x{65e5}\x{672c}\x{8a9e}
-
-/\x{80}/DZ8
-
-/\x{084}/DZ8
-
-/\x{104}/DZ8
-
-/\x{861}/DZ8
-
-/\x{212ab}/DZ8
-
 /.{3,5}X/DZ8
     \x{212ab}\x{212ab}\x{212ab}\x{861}X

-
 /.{3,5}?/DZ8
     \x{212ab}\x{212ab}\x{212ab}\x{861}

 /(?<=\C)X/8
     Should produce an error diagnostic

-/-- This one is here not because it's different to Perl, but because the way
-the captured single-byte is displayed. (In Perl it becomes a character, and you
-can't tell the difference.) --/
-    
-/X(\C)(.*)/8
-    X\x{1234}
-    X\nabc 
-
-/-- This one is here because Perl gives out a grumbly error message (quite 
-correctly, but that messes up comparisons). --/
-    
-/a\Cb/8
-    *** Failers 
-    a\x{100}b 
-    
 /^[ab]/8DZ
     bar
     *** Failers
@@ -88,26 +45,6 @@
     *** Failers 
     aaa

-/[^ab\xC0-\xF0]/8SDZ
-    \x{f1}
-    \x{bf}
-    \x{100}
-    \x{1000}   
-    *** Failers
-    \x{c0} 
-    \x{f0} 
-
-/Ā{3,4}/8SDZ
-  \x{100}\x{100}\x{100}\x{100\x{100}
-
-/(\x{100}+|x)/8SDZ
-
-/(\x{100}*a|x)/8SDZ
-
-/(\x{100}{0,2}a|x)/8SDZ
-
-/(\x{100}{1,2}a|x)/8SDZ
-
 /\x{100}*(\d+|"(?1)")/8
     1234
     "1234" 
@@ -118,33 +55,17 @@
     *** Failers 
     \x{100}\x{100}abcd

-/\x{100}/8DZ
-
 /\x{100}*/8DZ

 /a\x{100}*/8DZ

 /ab\x{100}*/8DZ

-/a\x{100}\x{101}*/8DZ
-
-/a\x{100}\x{101}+/8DZ
-
 /\x{100}*A/8DZ
     A

 /\x{100}*\d(?R)/8DZ

-/[^\x{c4}]/DZ
-
-/[^\x{c4}]/8DZ
-
-/[\x{100}]/8DZ
-    \x{100}
-    Z\x{100}
-    \x{100}Z
-    *** Failers 
-
 /[Z\x{100}]/8DZ
     Z\x{100}
     \x{100}
@@ -169,13 +90,8 @@
 /[\xFF]/DZ
     >\xff<

-/[\xff]/DZ8
-    >\x{ff}<
-
 /[^\xFF]/DZ

-/[^\xff]/8DZ
-
 /[Ä-Ü]/8
     Ö # Matches without Study
     \x{d6}
@@ -192,61 +108,6 @@
     Ö <-- Same with Study
     \x{d6}

-/[\xC3]/8
-
-/\xC3/8
-
-/\xC3\xC3\xC3xxx/8
-
-/\xC3\xC3\xC3xxx/8?DZSS
-
-/abc/8
-    \xC3]
-    \xC3
-    \xC3\xC3\xC3
-    \xC3\xC3\xC3\?
-    \xe1\x88 
-    \P\xe1\x88 
-    \P\P\xe1\x88 
-    XX\xea
-    \O0XX\xea
-    \O1XX\xea
-    \O2XX\xea
-    XX\xf1
-    XX\xf8  
-    XX\xfc
-    ZZ\xea\xaf\x20YY
-    ZZ\xfd\xbf\xbf\x2f\xbf\xbfYY  
-    ZZ\xfd\xbf\xbf\xbf\x2f\xbfYY  
-    ZZ\xfd\xbf\xbf\xbf\xbf\x2fYY  
-    ZZ\xffYY
-    ZZ\xfeYY  
-
-/anything/8
-    \xc0\x80
-    \xc1\x8f 
-    \xe0\x9f\x80
-    \xf0\x8f\x80\x80 
-    \xf8\x87\x80\x80\x80  
-    \xfc\x83\x80\x80\x80\x80
-    \xfe\x80\x80\x80\x80\x80  
-    \xff\x80\x80\x80\x80\x80  
-    \xc3\x8f
-    \xe0\xaf\x80
-    \xe1\x80\x80
-    \xf0\x9f\x80\x80 
-    \xf1\x8f\x80\x80 
-    \xf8\x88\x80\x80\x80  
-    \xf9\x87\x80\x80\x80  
-    \xfc\x84\x80\x80\x80\x80
-    \xfd\x83\x80\x80\x80\x80
-    \?\xf8\x88\x80\x80\x80  
-    \?\xf9\x87\x80\x80\x80  
-    \?\xfc\x84\x80\x80\x80\x80
-    \?\xfd\x83\x80\x80\x80\x80
-
-/\x{100}abc(xyz(?1))/8DZ
-
 /[^\x{100}]abc(xyz(?1))/8DZ

 /[ab\x{100}]abc(xyz(?1))/8DZ
@@ -266,17 +127,10 @@
 /\w/8
     \x{100}X

-/a\x{1234}b/P8
-    a\x{1234}b
-
 /^\ሴ/8DZ

 /\777/I

-/\777/8I
- \x{1ff}
- \777
-
 /\x{100}*\d/8DZ

 /\x{100}*\s/8DZ
@@ -289,12 +143,6 @@

 /\x{100}*\W/8DZ

-/\x{100}+\x{200}/8DZ
-
-/\x{100}+X/8DZ
-
-/X+\x{200}/8DZ
-
 /()()()()()()()()()()
 ()()()()()()()()()()
 ()()()()()()()()()()
@@ -306,8 +154,6 @@

 /^[\QĀ\E-\QŐ\E]/BZ8

-/^[\QĀ\E-\QŐ\E/BZ8
-
 /^abc./mgx8<any>
     abc1 \x0aabc2 \x0babc3xx \x0cabc4 \x0dabc5xx \x0d\x0aabc6 \x{0085}abc7 \x{2028}abc8 \x{2029}abc9 JUNK

@@ -402,23 +248,6 @@
 /.*$/8<any>
     \x{1ec5}

-/-- This tests the stricter UTF-8 check according to RFC 3629. --/ 
-    
-/X/8
-    \x{0}\x{d7ff}\x{e000}\x{10ffff}
-    \x{d800}
-    \x{d800}\?
-    \x{da00}
-    \x{da00}\?
-    \x{dfff}
-    \x{dfff}\?
-    \x{110000}    
-    \x{110000}\?    
-    \x{2000000} 
-    \x{2000000}\? 
-    \x{7fffffff} 
-    \x{7fffffff}\? 
-
 /a\Rb/I8<bsr_anycrlf>
     a\rb
     a\nb
@@ -477,16 +306,10 @@

 /(\x{de})\1/
     \x{de}\x{de}
-    \x{123}

 /X/8f<any> 
     A\x{1ec5}ABCXYZ

-/(*UTF8)\x{1234}/
-  abcd\x{1234}pqr
-
-/(*CRLF)(*UTF8)(*BSR_UNICODE)a\Rb/I
-
 /Xa{2,4}b/8
     X\P
     Xa\P
@@ -768,55 +591,13 @@
 /X\W{3}X/8
     \PX

-/\h/SI
-
-/\h/SI8
-    ABC\x{09}
-    ABC\x{20}
-    ABC\x{a0}
-    ABC\x{1680}
-    ABC\x{180e}
-    ABC\x{2000}
-    ABC\x{202f} 
-    ABC\x{205f} 
-    ABC\x{3000} 
-
-/\v/SI
-
-/\v/SI8
-    ABC\x{0a}
-    ABC\x{0b}
-    ABC\x{0c}
-    ABC\x{0d}
-    ABC\x{85}
-    ABC\x{2028}
-
-/\R/SI
-
-/\R/SI8
-
-/\h*A/SI8
-    CDBABC
-    
-/\v+A/SI8
-
-/\s?xxx\s/8SI
-
 /\sxxx\s/8T1
     AB\x{85}xxx\x{a0}XYZ
     AB\x{a0}xxx\x{85}XYZ

-/\sxxx\s/I8ST1
-    AB\x{85}xxx\x{a0}XYZ
-    AB\x{a0}xxx\x{85}XYZ
-
 /\S \S/8T1
     \x{a2} \x{84}

-/\S \S/I8ST1
-    \x{a2} \x{84} 
-    A Z 
-
 'A#хц'8x<any>BZ

 'A#хц
@@ -834,14 +615,6 @@
 /\g{A}xxx#bх(?'A'123)
 (?'A'456)/8x<any>BZ

-/a+/8
-    a\x{123}aa\>1
-    a\x{123}aa\>2
-    a\x{123}aa\>3
-    a\x{123}aa\>4
-    a\x{123}aa\>5
-    a\x{123}aa\>6
-
 /^\cģ/8

 /(\R*)(.)/s8
@@ -854,14 +627,6 @@
     \r\r\n\n\r 
     \r\r\n\n\r\n

-/\x{1234}+/iS8I
-
-/\x{1234}+?/iS8I
-
-/\x{1234}++/iS8I
-
-/\x{1234}{2}/iS8I
-
 /[^\x{1234}]+/iS8I

 /[^\x{1234}]+?/iS8I
@@ -883,5 +648,51 @@

 /f.*/8s
     \P\Pfor
+    
+/\x{d7ff}\x{e000}/8

+/\x{d800}/8
+
+/\x{dfff}/8 
+
+/\h+/8
+    \x{1681}\x{200b}\x{1680}\x{2000}\x{202f}\x{3000}
+    \x{3001}\x{2fff}\x{200a}\x{a0}\x{2000}
+
+/[\h\x{e000}]+/8BZ
+    \x{1681}\x{200b}\x{1680}\x{2000}\x{202f}\x{3000}
+    \x{3001}\x{2fff}\x{200a}\x{a0}\x{2000}
+
+/\H+/8
+    \x{1680}\x{180e}\x{167f}\x{1681}\x{180d}\x{180f}
+    \x{2000}\x{200a}\x{1fff}\x{200b}
+    \x{202f}\x{205f}\x{202e}\x{2030}\x{205e}\x{2060}
+    \x{a0}\x{3000}\x{9f}\x{a1}\x{2fff}\x{3001}
+
+/[\H\x{d7ff}]+/8BZ
+    \x{1680}\x{180e}\x{167f}\x{1681}\x{180d}\x{180f}
+    \x{2000}\x{200a}\x{1fff}\x{200b}
+    \x{202f}\x{205f}\x{202e}\x{2030}\x{205e}\x{2060}
+    \x{a0}\x{3000}\x{9f}\x{a1}\x{2fff}\x{3001}
+
+/\v+/8
+    \x{2027}\x{2030}\x{2028}\x{2029}
+    \x09\x0e\x{84}\x{86}\x{85}\x0a\x0b\x0c\x0d
+
+/[\v\x{e000}]+/8BZ
+    \x{2027}\x{2030}\x{2028}\x{2029}
+    \x09\x0e\x{84}\x{86}\x{85}\x0a\x0b\x0c\x0d
+
+/\V+/8
+    \x{2028}\x{2029}\x{2027}\x{2030}
+    \x{85}\x0a\x0b\x0c\x0d\x09\x0e\x{84}\x{86}
+
+/[\V\x{d7ff}]+/8BZ
+    \x{2028}\x{2029}\x{2027}\x{2030}
+    \x{85}\x0a\x0b\x0c\x0d\x09\x0e\x{84}\x{86}
+
+/\R+/8<bsr_unicode>
+    \x{2027}\x{2030}\x{2028}\x{2029}
+    \x09\x0e\x{84}\x{86}\x{85}\x0a\x0b\x0c\x0d
+
 /-- End of testinput5 --/

Modified: code/trunk/testdata/testinput6
===================================================================
--- code/trunk/testdata/testinput6    2011-12-28 16:10:09 UTC (rev 835)
+++ code/trunk/testdata/testinput6    2011-12-28 17:16:11 UTC (rev 836)
@@ -802,4 +802,18 @@
     ** Failers 
     a\xFCb

+/ⱥ/8i
+    ⱥ
+    Ⱥx 
+    Ⱥ 
+
+/[ⱥ]/8i
+    ⱥ
+    Ⱥx 
+    Ⱥ 
+
+/Ⱥ/8i
+    Ⱥ
+    ⱥ
+
 /-- End of testinput6 --/

Modified: code/trunk/testdata/testinput7
===================================================================
--- code/trunk/testdata/testinput7    2011-12-28 16:10:09 UTC (rev 835)
+++ code/trunk/testdata/testinput7    2011-12-28 17:16:11 UTC (rev 836)
@@ -1,4712 +1,561 @@
-/-- This set of tests check the DFA matching functionality of pcre_dfa_exec().
-    The -dfa flag must be used with pcretest when running it. --/
-     
-/abc/
-    abc
-    
-/ab*c/
-    abc
-    abbbbc
-    ac
-    
-/ab+c/
-    abc
-    abbbbbbc
-    *** Failers 
-    ac
-    ab
-    
-/a*/
-    a
-    aaaaaaaaaaaaaaaaa
-    aaaaaaaaaaaaaaaaaaaaaaaaaaaaaa 
-    aaaaaaaaaaaaaaaaaaaaaaaaaaaaaa\F 
-    
-/(a|abcd|african)/
-    a
-    abcd
-    african
-    
-/^abc/
-    abcdef
-    *** Failers
-    xyzabc
-    xyz\nabc    
-    
-/^abc/m
-    abcdef
-    xyz\nabc    
-    *** Failers
-    xyzabc
-    
-/\Aabc/
-    abcdef
-    *** Failers
-    xyzabc
-    xyz\nabc    
-    
-/\Aabc/m
-    abcdef
-    *** Failers
-    xyzabc
-    xyz\nabc    
-    
-/\Gabc/
-    abcdef
-    xyzabc\>3
-    *** Failers
-    xyzabc    
-    xyzabc\>2 
-    
-/x\dy\Dz/
-    x9yzz
-    x0y+z
-    *** Failers
-    xyz
-    xxy0z     
-    
-/x\sy\Sz/
-    x yzz
-    x y+z
-    *** Failers
-    xyz
-    xxyyz
-    
-/x\wy\Wz/
-    xxy+z
-    *** Failers
-    xxy0z
-    x+y+z         
-    
-/x.y/
-    x+y
-    x-y
-    *** Failers
-    x\ny
-    
-/x.y/s
-    x+y
-    x-y
-    x\ny
+/-- These tests for Unicode property support test PCRE's API and show some of
+    the compiled code. They are not Perl-compatible. --/

-/(a.b(?s)c.d|x.y)p.q/
-    a+bc+dp+q
-    a+bc\ndp+q
-    x\nyp+q 
-    *** Failers 
-    a\nbc\ndp+q
-    a+bc\ndp\nq
-    x\nyp\nq 
+/[\p{L}]/DZ

-/a\d\z/
-    ba0
-    *** Failers
-    ba0\n
-    ba0\ncd   
+/[\p{^L}]/DZ

-/a\d\z/m
-    ba0
-    *** Failers
-    ba0\n
-    ba0\ncd   
+/[\P{L}]/DZ

-/a\d\Z/
-    ba0
-    ba0\n
-    *** Failers
-    ba0\ncd   
+/[\P{^L}]/DZ

-/a\d\Z/m
-    ba0
-    ba0\n
-    *** Failers
-    ba0\ncd   
+/[abc\p{L}\x{0660}]/8DZ

-/a\d$/
-    ba0
-    ba0\n
-    *** Failers
-    ba0\ncd   
-
-/a\d$/m
-    ba0
-    ba0\n
-    ba0\ncd   
-    *** Failers
-
-/abc/i
-    abc
-    aBc
-    ABC
-    
-/[^a]/
-    abcd
-    
-/ab?\w/
-    abz
-    abbz
-    azz  
-
-/x{0,3}yz/
-    ayzq
-    axyzq
-    axxyz
-    axxxyzq
-    axxxxyzq
-    *** Failers
-    ax
-    axx     
-      
-/x{3}yz/
-    axxxyzq
-    axxxxyzq
-    *** Failers
-    ax
-    axx     
-    ayzq
-    axyzq
-    axxyz
-      
-/x{2,3}yz/
-    axxyz
-    axxxyzq
-    axxxxyzq
-    *** Failers
-    ax
-    axx     
-    ayzq
-    axyzq
-      
-/[^a]+/
-    bac
-    bcdefax
-    *** Failers
-    aaaaa   
-
-/[^a]*/
-    bac
-    bcdefax
-    *** Failers
-    aaaaa   
-    
-/[^a]{3,5}/
-    xyz
-    awxyza
-    abcdefa
-    abcdefghijk
-    *** Failers
-    axya
-    axa
-    aaaaa         
-
-/\d*/
-    1234b567
-    xyz
-    
-/\D*/
-    a1234b567
-    xyz
-     
-/\d+/
-    ab1234c56
-    *** Failers
-    xyz
-    
-/\D+/
-    ab123c56
-    *** Failers
-    789
-    
-/\d?A/
-    045ABC
-    ABC
-    *** Failers
-    XYZ
-    
-/\D?A/
-    ABC
-    BAC
-    9ABC             
-    *** Failers
-
-/a+/
-    aaaa
-
-/^.*xyz/
-    xyz
-    ggggggggxyz
-    
-/^.+xyz/
-    abcdxyz
-    axyz
-    *** Failers
-    xyz
-    
-/^.?xyz/
-    xyz
-    cxyz       
-
-/^\d{2,3}X/
-    12X
-    123X
-    *** Failers
-    X
-    1X
-    1234X     
-
-/^[abcd]\d/
-    a45
-    b93
-    c99z
-    d04
-    *** Failers
-    e45
-    abcd      
-    abcd1234
-    1234  
-
-/^[abcd]*\d/
-    a45
-    b93
-    c99z
-    d04
-    abcd1234
-    1234  
-    *** Failers
-    e45
-    abcd      
-
-/^[abcd]+\d/
-    a45
-    b93
-    c99z
-    d04
-    abcd1234
-    *** Failers
-    1234  
-    e45
-    abcd      
-
-/^a+X/
-    aX
-    aaX 
-
-/^[abcd]?\d/
-    a45
-    b93
-    c99z
-    d04
-    1234  
-    *** Failers
-    abcd1234
-    e45
-
-/^[abcd]{2,3}\d/
-    ab45
-    bcd93
-    *** Failers
-    1234 
-    a36 
-    abcd1234
-    ee45
-
-/^(abc)*\d/
-    abc45
-    abcabcabc45
-    42xyz 
-    *** Failers
-
-/^(abc)+\d/
-    abc45
-    abcabcabc45
-    *** Failers
-    42xyz 
-
-/^(abc)?\d/
-    abc45
-    42xyz 
-    *** Failers
-    abcabcabc45
-
-/^(abc){2,3}\d/
-    abcabc45
-    abcabcabc45
-    *** Failers
-    abcabcabcabc45
-    abc45
-    42xyz 
-
-/1(abc|xyz)2(?1)3/
-    1abc2abc3456
-    1abc2xyz3456 
-
-/^(a*\w|ab)=(a*\w|ab)/
-    ab=ab
-
-/^(a*\w|ab)=(?1)/
-    ab=ab
-
-/^([^()]|\((?1)*\))*$/
-    abc
-    a(b)c
-    a(b(c))d  
-    *** Failers)
-    a(b(c)d  
-
-/^>abc>([^()]|\((?1)*\))*<xyz<$/
-    >abc>123<xyz<
-    >abc>1(2)3<xyz<
-    >abc>(1(2)3)<xyz<
-
-/^(?>a*)\d/
-    aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa9876
-    *** Failers 
-    aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
-
-/< (?: (?(R) \d++  | [^<>]*+) | (?R)) * >/x
-    <>
-    <abcd>
-    <abc <123> hij>
-    <abc <def> hij>
-    <abc<>def> 
-    <abc<>      
-    *** Failers
-    <abc
-
-/^(?(?=abc)\w{3}:|\d\d)$/        
-    abc:                          
-    12                             
-    *** Failers                     
-    123                       
-    xyz                        
-                                
-/^(?(?!abc)\d\d|\w{3}:)$/      
-    abc:                        
-    12         
-    *** Failers
-    123
-    xyz    
-
-/^(?=abc)\w{5}:$/        
-    abcde:                          
-    *** Failers                     
-    abc.. 
-    123                       
-    vwxyz                        
-                                
-/^(?!abc)\d\d$/      
-    12         
-    *** Failers
-    abcde:
-    abc..  
-    123
-    vwxyz    
-
-/(?<=abc|xy)123/
-    abc12345
-    wxy123z
-    *** Failers
-    123abc
-
-/(?<!abc|xy)123/
-    123abc
-    mno123456 
-    *** Failers
-    abc12345
-    wxy123z
-
-/abc(?C1)xyz/
-    abcxyz
-    123abcxyz999 
-
-/(ab|cd){3,4}/C
-  ababab
-  abcdabcd
-  abcdcdcdcdcd  
-
-/^abc/
-    abcdef
-    *** Failers
-    abcdef\B  
-
-/^(a*|xyz)/
-    bcd
-    aaabcd
-    xyz
-    xyz\N  
-    *** Failers
-    bcd\N   
-    
-/xyz$/
-    xyz
-    xyz\n
-    *** Failers
-    xyz\Z
-    xyz\n\Z    
-    
-/xyz$/m
-    xyz
-    xyz\n 
-    abcxyz\npqr 
-    abcxyz\npqr\Z 
-    xyz\n\Z    
-    *** Failers
-    xyz\Z
-
-/\Gabc/
-    abcdef
-    defabcxyz\>3 
-    *** Failers 
-    defabcxyz
-
-/^abcdef/
-    ab\P
-    abcde\P
-    abcdef\P
-    *** Failers
-    abx\P    
-
-/^a{2,4}\d+z/
-    a\P
-    aa\P
-    aa2\P 
-    aaa\P
-    aaa23\P 
-    aaaa12345\P
-    aa0z\P
-    aaaa4444444444444z\P 
-    *** Failers
-    az\P 
-    aaaaa\P 
-    a56\P 
-
-/^abcdef/
-   abc\P
-   def\R 
-   
-/(?<=foo)bar/
-   xyzfo\P 
-   foob\P\>2 
-   foobar...\R\P\>4 
-   xyzfo\P
-   foobar\>2  
-   *** Failers
-   xyzfo\P
-   obar\R   
-
-/(ab*(cd|ef))+X/
-    adfadadaklhlkalkajhlkjahdfasdfasdfladsfjkj\P\Z
-    lkjhlkjhlkjhlkjhabbbbbbcdaefabbbbbbbefa\P\B\Z
-    cdabbbbbbbb\P\R\B\Z
-    efabbbbbbbbbbbbbbbb\P\R\B\Z
-    bbbbbbbbbbbbcdXyasdfadf\P\R\B\Z    
-
-/(a|b)/SF>testsavedregex
-<testsavedregex
-    abc
-    ** Failers
-    def  
-    
-/the quick brown fox/
-    the quick brown fox
-    The quick brown FOX
-    What do you know about the quick brown fox?
-    What do you know about THE QUICK BROWN FOX?
-
-/The quick brown fox/i
-    the quick brown fox
-    The quick brown FOX
-    What do you know about the quick brown fox?
-    What do you know about THE QUICK BROWN FOX?
-
-/abcd\t\n\r\f\a\e\071\x3b\$\\\?caxyz/
-    abcd\t\n\r\f\a\e9;\$\\?caxyz
-
-/a*abc?xyz+pqr{3}ab{2,}xy{4,5}pq{0,6}AB{0,}zz/
-    abxyzpqrrrabbxyyyypqAzz
-    abxyzpqrrrabbxyyyypqAzz
-    aabxyzpqrrrabbxyyyypqAzz
-    aaabxyzpqrrrabbxyyyypqAzz
-    aaaabxyzpqrrrabbxyyyypqAzz
-    abcxyzpqrrrabbxyyyypqAzz
-    aabcxyzpqrrrabbxyyyypqAzz
-    aaabcxyzpqrrrabbxyyyypAzz
-    aaabcxyzpqrrrabbxyyyypqAzz
-    aaabcxyzpqrrrabbxyyyypqqAzz
-    aaabcxyzpqrrrabbxyyyypqqqAzz
-    aaabcxyzpqrrrabbxyyyypqqqqAzz
-    aaabcxyzpqrrrabbxyyyypqqqqqAzz
-    aaabcxyzpqrrrabbxyyyypqqqqqqAzz
-    aaaabcxyzpqrrrabbxyyyypqAzz
-    abxyzzpqrrrabbxyyyypqAzz
-    aabxyzzzpqrrrabbxyyyypqAzz
-    aaabxyzzzzpqrrrabbxyyyypqAzz
-    aaaabxyzzzzpqrrrabbxyyyypqAzz
-    abcxyzzpqrrrabbxyyyypqAzz
-    aabcxyzzzpqrrrabbxyyyypqAzz
-    aaabcxyzzzzpqrrrabbxyyyypqAzz
-    aaaabcxyzzzzpqrrrabbxyyyypqAzz
-    aaaabcxyzzzzpqrrrabbbxyyyypqAzz
-    aaaabcxyzzzzpqrrrabbbxyyyyypqAzz
-    aaabcxyzpqrrrabbxyyyypABzz
-    aaabcxyzpqrrrabbxyyyypABBzz
-    >>>aaabxyzpqrrrabbxyyyypqAzz
-    >aaaabxyzpqrrrabbxyyyypqAzz
-    >>>>abcxyzpqrrrabbxyyyypqAzz
-    *** Failers
-    abxyzpqrrabbxyyyypqAzz
-    abxyzpqrrrrabbxyyyypqAzz
-    abxyzpqrrrabxyyyypqAzz
-    aaaabcxyzzzzpqrrrabbbxyyyyyypqAzz
-    aaaabcxyzzzzpqrrrabbbxyyypqAzz
-    aaabcxyzpqrrrabbxyyyypqqqqqqqAzz
-
-/^(abc){1,2}zz/
-    abczz
-    abcabczz
-    *** Failers
-    zz
-    abcabcabczz
-    >>abczz
-
-/^(b+?|a){1,2}?c/
-    bc
-    bbc
-    bbbc
-    bac
-    bbac
-    aac
-    abbbbbbbbbbbc
-    bbbbbbbbbbbac
-    *** Failers
-    aaac
-    abbbbbbbbbbbac
-
-/^(b+|a){1,2}c/
-    bc
-    bbc
-    bbbc
-    bac
-    bbac
-    aac
-    abbbbbbbbbbbc
-    bbbbbbbbbbbac
-    *** Failers
-    aaac
-    abbbbbbbbbbbac
-
-/^(b+|a){1,2}?bc/
-    bbc
-
-/^(b*|ba){1,2}?bc/
-    babc
-    bbabc
-    bababc
-    *** Failers
-    bababbc
-    babababc
-
-/^(ba|b*){1,2}?bc/
-    babc
-    bbabc
-    bababc
-    *** Failers
-    bababbc
-    babababc
-
-/^\ca\cA\c[\c{\c:/
-    \x01\x01\e;z
-
-/^[ab\]cde]/
-    athing
-    bthing
-    ]thing
-    cthing
-    dthing
-    ething
-    *** Failers
-    fthing
-    [thing
-    \\thing
-
-/^[]cde]/
-    ]thing
-    cthing
-    dthing
-    ething
-    *** Failers
-    athing
-    fthing
-
-/^[^ab\]cde]/
-    fthing
-    [thing
-    \\thing
-    *** Failers
-    athing
-    bthing
-    ]thing
-    cthing
-    dthing
-    ething
-
-/^[^]cde]/
-    athing
-    fthing
-    *** Failers
-    ]thing
-    cthing
-    dthing
-    ething
-
-/^\\x81/
-    \x81
-
-/^\xFF/
-    \xFF
-
-/^[0-9]+$/
-    0
-    1
-    2
-    3
-    4
-    5
-    6
-    7
-    8
-    9
-    10
-    100
-    *** Failers
-    abc
-
-/^.*nter/
-    enter
-    inter
-    uponter
-
-/^xxx[0-9]+$/
-    xxx0
-    xxx1234
-    *** Failers
-    xxx
-
-/^.+[0-9][0-9][0-9]$/
-    x123
-    xx123
-    123456
-    *** Failers
-    123
-    x1234
-
-/^.+?[0-9][0-9][0-9]$/
-    x123
-    xx123
-    123456
-    *** Failers
-    123
-    x1234
-
-/^([^!]+)!(.+)=apquxz\.ixr\.zzz\.ac\.uk$/
-    abc!pqr=apquxz.ixr.zzz.ac.uk
-    *** Failers
-    !pqr=apquxz.ixr.zzz.ac.uk
-    abc!=apquxz.ixr.zzz.ac.uk
-    abc!pqr=apquxz:ixr.zzz.ac.uk
-    abc!pqr=apquxz.ixr.zzz.ac.ukk
-
-/:/
-    Well, we need a colon: somewhere
-    *** Fail if we don't
-
-/([\da-f:]+)$/i
-    0abc
-    abc
-    fed
-    E
-    ::
-    5f03:12C0::932e
-    fed def
-    Any old stuff
-    *** Failers
-    0zzz
-    gzzz
-    fed\x20
-    Any old rubbish
-
-/^.*\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})$/
-    .1.2.3
-    A.12.123.0
-    *** Failers
-    .1.2.3333
-    1.2.3
-    1234.2.3
-
-/^(\d+)\s+IN\s+SOA\s+(\S+)\s+(\S+)\s*\(\s*$/
-    1 IN SOA non-sp1 non-sp2(
-    1    IN    SOA    non-sp1    non-sp2   (
-    *** Failers
-    1IN SOA non-sp1 non-sp2(
-
-/^[a-zA-Z\d][a-zA-Z\d\-]*(\.[a-zA-Z\d][a-zA-z\d\-]*)*\.$/
-    a.
-    Z.
-    2.
-    ab-c.pq-r.
-    sxk.zzz.ac.uk.
-    x-.y-.
-    *** Failers
-    -abc.peq.
-
-/^\*\.[a-z]([a-z\-\d]*[a-z\d]+)?(\.[a-z]([a-z\-\d]*[a-z\d]+)?)*$/
-    *.a
-    *.b0-a
-    *.c3-b.c
-    *.c-a.b-c
-    *** Failers
-    *.0
-    *.a-
-    *.a-b.c-
-    *.c-a.0-c
-
-/^(?=ab(de))(abd)(e)/
-    abde
-
-/^(?!(ab)de|x)(abd)(f)/
-    abdf
-
-/^(?=(ab(cd)))(ab)/
-    abcd
-
-/^[\da-f](\.[\da-f])*$/i
-    a.b.c.d
-    A.B.C.D
-    a.b.c.1.2.3.C
-
-/^\".*\"\s*(;.*)?$/
-    \"1234\"
-    \"abcd\" ;
-    \"\" ; rhubarb
-    *** Failers
-    \"1234\" : things
-
-/^$/
-    \
-    *** Failers
-
-/   ^    a   (?# begins with a)  b\sc (?# then b c) $ (?# then end)/x
-    ab c
-    *** Failers
-    abc
-    ab cde
-
-/(?x)   ^    a   (?# begins with a)  b\sc (?# then b c) $ (?# then end)/
-    ab c
-    *** Failers
-    abc
-    ab cde
-
-/^   a\ b[c ]d       $/x
-    a bcd
-    a b d
-    *** Failers
-    abcd
-    ab d
-
-/^(a(b(c)))(d(e(f)))(h(i(j)))(k(l(m)))$/
-    abcdefhijklm
-
-/^(?:a(b(c)))(?:d(e(f)))(?:h(i(j)))(?:k(l(m)))$/
-    abcdefhijklm
-
-/^[\w][\W][\s][\S][\d][\D][\b][\n][\c]][\022]/
-    a+ Z0+\x08\n\x1d\x12
-
-/^[.^$|()*+?{,}]+/
-    .^\$(*+)|{?,?}
-
-/^a*\w/
-    z
-    az
-    aaaz
-    a
-    aa
-    aaaa
-    a+
-    aa+
-
-/^a*?\w/
-    z
-    az
-    aaaz
-    a
-    aa
-    aaaa
-    a+
-    aa+
-
-/^a+\w/
-    az
-    aaaz
-    aa
-    aaaa
-    aa+
-
-/^a+?\w/
-    az
-    aaaz
-    aa
-    aaaa
-    aa+
-
-/^\d{8}\w{2,}/
-    1234567890
-    12345678ab
-    12345678__
-    *** Failers
-    1234567
-
-/^[aeiou\d]{4,5}$/
-    uoie
+/[\p{Nd}]/8DZ
     1234
-    12345
-    aaaaa
-    *** Failers
-    123456

-/^[aeiou\d]{4,5}?/
-    uoie
+/[\p{Nd}+-]+/8DZ
     1234
-    12345
-    aaaaa
-    123456
-
-/^From +([^ ]+) +[a-zA-Z][a-zA-Z][a-zA-Z] +[a-zA-Z][a-zA-Z][a-zA-Z] +[0-9]?[0-9] +[0-9][0-9]:[0-9][0-9]/
-    From abcd  Mon Sep 01 12:33:02 1997
-
-/^From\s+\S+\s+([a-zA-Z]{3}\s+){2}\d{1,2}\s+\d\d:\d\d/
-    From abcd  Mon Sep 01 12:33:02 1997
-    From abcd  Mon Sep  1 12:33:02 1997
-    *** Failers
-    From abcd  Sep 01 12:33:02 1997
-
-/^12.34/s
-    12\n34
-    12\r34
-
-/\w+(?=\t)/
-    the quick brown\t fox
-
-/foo(?!bar)(.*)/
-    foobar is foolish see?
-
-/(?:(?!foo)...|^.{0,2})bar(.*)/
-    foobar crowbar etc
-    barrel
-    2barrel
-    A barrel
-
-/^(\D*)(?=\d)(?!123)/
-    abc456
-    *** Failers
-    abc123
-
-/^1234(?# test newlines
-  inside)/
-    1234
-
-/^1234 #comment in extended re
-  /x
-    1234
-
-/#rhubarb
-  abcd/x
-    abcd
-
-/^abcd#rhubarb/x
-    abcd
-
-/(?!^)abc/
-    the abc
-    *** Failers
-    abc
-
-/(?=^)abc/
-    abc
-    *** Failers
-    the abc
-
-/^[ab]{1,3}(ab*|b)/
-    aabbbbb
-
-/^[ab]{1,3}?(ab*|b)/
-    aabbbbb
-
-/^[ab]{1,3}?(ab*?|b)/
-    aabbbbb
-
-/^[ab]{1,3}(ab*?|b)/
-    aabbbbb
-
-/  (?: [\040\t] |  \(
-(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
-\)  )*                          # optional leading comment
-(?:    (?:
-[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+    # some number of atom characters...
-(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
-|
-" (?:                      # opening quote...
-[^\\\x80-\xff\n\015"]                #   Anything except backslash and quote
-|                     #    or
-\\ [^\x80-\xff]           #   Escaped something (something != CR)
-)* "  # closing quote
-)                    # initial word
-(?:  (?: [\040\t] |  \(
-(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
-\)  )*  \.  (?: [\040\t] |  \(
-(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
-\)  )*   (?:
-[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+    # some number of atom characters...
-(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
-|
-" (?:                      # opening quote...
-[^\\\x80-\xff\n\015"]                #   Anything except backslash and quote
-|                     #    or
-\\ [^\x80-\xff]           #   Escaped something (something != CR)
-)* "  # closing quote
-)  )* # further okay, if led by a period
-(?: [\040\t] |  \(
-(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
-\)  )*  @  (?: [\040\t] |  \(
-(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
-\)  )*    (?:
-[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+    # some number of atom characters...
-(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
-|   \[                         # [
-(?: [^\\\x80-\xff\n\015\[\]] |  \\ [^\x80-\xff]  )*    #    stuff
-\]                        #           ]
-)                           # initial subdomain
-(?:                                  #
-(?: [\040\t] |  \(
-(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
-\)  )*  \.                        # if led by a period...
-(?: [\040\t] |  \(
-(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
-\)  )*   (?:
-[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+    # some number of atom characters...
-(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
-|   \[                         # [
-(?: [^\\\x80-\xff\n\015\[\]] |  \\ [^\x80-\xff]  )*    #    stuff
-\]                        #           ]
-)                     #   ...further okay
-)*
-# address
-|                     #  or
-(?:
-[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+    # some number of atom characters...
-(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
-|
-" (?:                      # opening quote...
-[^\\\x80-\xff\n\015"]                #   Anything except backslash and quote
-|                     #    or
-\\ [^\x80-\xff]           #   Escaped something (something != CR)
-)* "  # closing quote
-)             # one word, optionally followed by....
-(?:
-[^()<>@,;:".\\\[\]\x80-\xff\000-\010\012-\037]  |  # atom and space parts, or...
-\(
-(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
-\)       |  # comments, or...
-
-" (?:                      # opening quote...
-[^\\\x80-\xff\n\015"]                #   Anything except backslash and quote
-|                     #    or
-\\ [^\x80-\xff]           #   Escaped something (something != CR)
-)* "  # closing quote
-# quoted strings
-)*
-<  (?: [\040\t] |  \(
-(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
-\)  )*                     # leading <
-(?:  @  (?: [\040\t] |  \(
-(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
-\)  )*    (?:
-[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+    # some number of atom characters...
-(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
-|   \[                         # [
-(?: [^\\\x80-\xff\n\015\[\]] |  \\ [^\x80-\xff]  )*    #    stuff
-\]                        #           ]
-)                           # initial subdomain
-(?:                                  #
-(?: [\040\t] |  \(
-(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
-\)  )*  \.                        # if led by a period...
-(?: [\040\t] |  \(
-(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
-\)  )*   (?:
-[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+    # some number of atom characters...
-(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
-|   \[                         # [
-(?: [^\\\x80-\xff\n\015\[\]] |  \\ [^\x80-\xff]  )*    #    stuff
-\]                        #           ]
-)                     #   ...further okay
-)*
-
-(?:  (?: [\040\t] |  \(
-(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
-\)  )*  ,  (?: [\040\t] |  \(
-(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
-\)  )*  @  (?: [\040\t] |  \(
-(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
-\)  )*    (?:
-[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+    # some number of atom characters...
-(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
-|   \[                         # [
-(?: [^\\\x80-\xff\n\015\[\]] |  \\ [^\x80-\xff]  )*    #    stuff
-\]                        #           ]
-)                           # initial subdomain
-(?:                                  #
-(?: [\040\t] |  \(
-(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
-\)  )*  \.                        # if led by a period...
-(?: [\040\t] |  \(
-(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
-\)  )*   (?:
-[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+    # some number of atom characters...
-(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
-|   \[                         # [
-(?: [^\\\x80-\xff\n\015\[\]] |  \\ [^\x80-\xff]  )*    #    stuff
-\]                        #           ]
-)                     #   ...further okay
-)*
-)* # further okay, if led by comma
-:                                # closing colon
-(?: [\040\t] |  \(
-(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
-\)  )*  )? #       optional route
-(?:
-[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+    # some number of atom characters...
-(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
-|
-" (?:                      # opening quote...
-[^\\\x80-\xff\n\015"]                #   Anything except backslash and quote
-|                     #    or
-\\ [^\x80-\xff]           #   Escaped something (something != CR)
-)* "  # closing quote
-)                    # initial word
-(?:  (?: [\040\t] |  \(
-(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
-\)  )*  \.  (?: [\040\t] |  \(
-(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
-\)  )*   (?:
-[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+    # some number of atom characters...
-(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
-|
-" (?:                      # opening quote...
-[^\\\x80-\xff\n\015"]                #   Anything except backslash and quote
-|                     #    or
-\\ [^\x80-\xff]           #   Escaped something (something != CR)
-)* "  # closing quote
-)  )* # further okay, if led by a period
-(?: [\040\t] |  \(
-(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
-\)  )*  @  (?: [\040\t] |  \(
-(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
-\)  )*    (?:
-[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+    # some number of atom characters...
-(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
-|   \[                         # [
-(?: [^\\\x80-\xff\n\015\[\]] |  \\ [^\x80-\xff]  )*    #    stuff
-\]                        #           ]
-)                           # initial subdomain
-(?:                                  #
-(?: [\040\t] |  \(
-(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
-\)  )*  \.                        # if led by a period...
-(?: [\040\t] |  \(
-(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
-\)  )*   (?:
-[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+    # some number of atom characters...
-(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
-|   \[                         # [
-(?: [^\\\x80-\xff\n\015\[\]] |  \\ [^\x80-\xff]  )*    #    stuff
-\]                        #           ]
-)                     #   ...further okay
-)*
-#       address spec
-(?: [\040\t] |  \(
-(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
-\)  )*  > #                  trailing >
-# name and address
-)  (?: [\040\t] |  \(
-(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
-\)  )*                       # optional trailing comment
-/x
-    Alan Other <user\@dom.ain>
-    <user\@dom.ain>
-    user\@dom.ain
-    \"A. Other\" <user.1234\@dom.ain> (a comment)
-    A. Other <user.1234\@dom.ain> (a comment)
-    \"/s=user/ou=host/o=place/prmd=uu.yy/admd= /c=gb/\"\@x400-re.lay
-    A missing angle <user\@some.where
-    *** Failers
-    The quick brown fox
-
-/[\040\t]*                    # Nab whitespace.
-(?:
-\(                              #  (
-[^\\\x80-\xff\n\015()] *                             #     normal*
-(?:                                 #       (
-(?:  \\ [^\x80-\xff]  |
-\(                            #  (
-[^\\\x80-\xff\n\015()] *                            #     normal*
-(?:  \\ [^\x80-\xff]   [^\\\x80-\xff\n\015()] * )*        #     (special normal*)*
-\)                           #                       )
-)    #         special
-[^\\\x80-\xff\n\015()] *                         #         normal*
-)*                                  #            )*
-\)                             #                )
-[\040\t]* )*    # If comment found, allow more spaces.
-# optional leading comment
-(?:
-(?:
-[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+    # some number of atom characters...
-(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
-# Atom
-|                       #  or
-"                                     # "
-[^\\\x80-\xff\n\015"] *                            #   normal
-(?:  \\ [^\x80-\xff]  [^\\\x80-\xff\n\015"] * )*        #   ( special normal* )*
-"                                     #        "
-# Quoted string
-)
-[\040\t]*                    # Nab whitespace.
-(?:
-\(                              #  (
-[^\\\x80-\xff\n\015()] *                             #     normal*
-(?:                                 #       (
-(?:  \\ [^\x80-\xff]  |
-\(                            #  (
-[^\\\x80-\xff\n\015()] *                            #     normal*
-(?:  \\ [^\x80-\xff]   [^\\\x80-\xff\n\015()] * )*        #     (special normal*)*
-\)                           #                       )
-)    #         special
-[^\\\x80-\xff\n\015()] *                         #         normal*
-)*                                  #            )*
-\)                             #                )
-[\040\t]* )*    # If comment found, allow more spaces.
-(?:
-\.
-[\040\t]*                    # Nab whitespace.
-(?:
-\(                              #  (
-[^\\\x80-\xff\n\015()] *                             #     normal*
-(?:                                 #       (
-(?:  \\ [^\x80-\xff]  |
-\(                            #  (
-[^\\\x80-\xff\n\015()] *                            #     normal*
-(?:  \\ [^\x80-\xff]   [^\\\x80-\xff\n\015()] * )*        #     (special normal*)*
-\)                           #                       )
-)    #         special
-[^\\\x80-\xff\n\015()] *                         #         normal*
-)*                                  #            )*
-\)                             #                )
-[\040\t]* )*    # If comment found, allow more spaces.
-(?:
-[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+    # some number of atom characters...
-(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
-# Atom
-|                       #  or
-"                                     # "
-[^\\\x80-\xff\n\015"] *                            #   normal
-(?:  \\ [^\x80-\xff]  [^\\\x80-\xff\n\015"] * )*        #   ( special normal* )*
-"                                     #        "
-# Quoted string
-)
-[\040\t]*                    # Nab whitespace.
-(?:
-\(                              #  (
-[^\\\x80-\xff\n\015()] *                             #     normal*
-(?:                                 #       (
-(?:  \\ [^\x80-\xff]  |
-\(                            #  (
-[^\\\x80-\xff\n\015()] *                            #     normal*
-(?:  \\ [^\x80-\xff]   [^\\\x80-\xff\n\015()] * )*        #     (special normal*)*
-\)                           #                       )
-)    #         special
-[^\\\x80-\xff\n\015()] *                         #         normal*
-)*                                  #            )*
-\)                             #                )
-[\040\t]* )*    # If comment found, allow more spaces.
-# additional words
-)*
-@
-[\040\t]*                    # Nab whitespace.
-(?:
-\(                              #  (
-[^\\\x80-\xff\n\015()] *                             #     normal*
-(?:                                 #       (
-(?:  \\ [^\x80-\xff]  |
-\(                            #  (
-[^\\\x80-\xff\n\015()] *                            #     normal*
-(?:  \\ [^\x80-\xff]   [^\\\x80-\xff\n\015()] * )*        #     (special normal*)*
-\)                           #                       )
-)    #         special
-[^\\\x80-\xff\n\015()] *                         #         normal*
-)*                                  #            )*
-\)                             #                )
-[\040\t]* )*    # If comment found, allow more spaces.
-(?:
-[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+    # some number of atom characters...
-(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
-|
-\[                            # [
-(?: [^\\\x80-\xff\n\015\[\]] |  \\ [^\x80-\xff]  )*     #    stuff
-\]                           #           ]
-)
-[\040\t]*                    # Nab whitespace.
-(?:
-\(                              #  (
-[^\\\x80-\xff\n\015()] *                             #     normal*
-(?:                                 #       (
-(?:  \\ [^\x80-\xff]  |
-\(                            #  (
-[^\\\x80-\xff\n\015()] *                            #     normal*
-(?:  \\ [^\x80-\xff]   [^\\\x80-\xff\n\015()] * )*        #     (special normal*)*
-\)                           #                       )
-)    #         special
-[^\\\x80-\xff\n\015()] *                         #         normal*
-)*                                  #            )*
-\)                             #                )
-[\040\t]* )*    # If comment found, allow more spaces.
-# optional trailing comments
-(?:
-\.
-[\040\t]*                    # Nab whitespace.
-(?:
-\(                              #  (
-[^\\\x80-\xff\n\015()] *                             #     normal*
-(?:                                 #       (
-(?:  \\ [^\x80-\xff]  |
-\(                            #  (
-[^\\\x80-\xff\n\015()] *                            #     normal*
-(?:  \\ [^\x80-\xff]   [^\\\x80-\xff\n\015()] * )*        #     (special normal*)*
-\)                           #                       )
-)    #         special
-[^\\\x80-\xff\n\015()] *                         #         normal*
-)*                                  #            )*
-\)                             #                )
-[\040\t]* )*    # If comment found, allow more spaces.
-(?:
-[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+    # some number of atom characters...
-(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
-|
-\[                            # [
-(?: [^\\\x80-\xff\n\015\[\]] |  \\ [^\x80-\xff]  )*     #    stuff
-\]                           #           ]
-)
-[\040\t]*                    # Nab whitespace.
-(?:
-\(                              #  (
-[^\\\x80-\xff\n\015()] *                             #     normal*
-(?:                                 #       (
-(?:  \\ [^\x80-\xff]  |
-\(                            #  (
-[^\\\x80-\xff\n\015()] *                            #     normal*
-(?:  \\ [^\x80-\xff]   [^\\\x80-\xff\n\015()] * )*        #     (special normal*)*
-\)                           #                       )
-)    #         special
-[^\\\x80-\xff\n\015()] *                         #         normal*
-)*                                  #            )*
-\)                             #                )
-[\040\t]* )*    # If comment found, allow more spaces.
-# optional trailing comments
-)*
-# address
-|                             #  or
-(?:
-[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+    # some number of atom characters...
-(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
-# Atom
-|                       #  or
-"                                     # "
-[^\\\x80-\xff\n\015"] *                            #   normal
-(?:  \\ [^\x80-\xff]  [^\\\x80-\xff\n\015"] * )*        #   ( special normal* )*
-"                                     #        "
-# Quoted string
-)
-# leading word
-[^()<>@,;:".\\\[\]\x80-\xff\000-\010\012-\037] *               # "normal" atoms and or spaces
-(?:
-(?:
-\(                              #  (
-[^\\\x80-\xff\n\015()] *                             #     normal*
-(?:                                 #       (
-(?:  \\ [^\x80-\xff]  |
-\(                            #  (
-[^\\\x80-\xff\n\015()] *                            #     normal*
-(?:  \\ [^\x80-\xff]   [^\\\x80-\xff\n\015()] * )*        #     (special normal*)*
-\)                           #                       )
-)    #         special
-[^\\\x80-\xff\n\015()] *                         #         normal*
-)*                                  #            )*
-\)                             #                )
-|
-"                                     # "
-[^\\\x80-\xff\n\015"] *                            #   normal
-(?:  \\ [^\x80-\xff]  [^\\\x80-\xff\n\015"] * )*        #   ( special normal* )*
-"                                     #        "
-) # "special" comment or quoted string
-[^()<>@,;:".\\\[\]\x80-\xff\000-\010\012-\037] *            #  more "normal"
-)*
-<
-[\040\t]*                    # Nab whitespace.
-(?:
-\(                              #  (
-[^\\\x80-\xff\n\015()] *                             #     normal*
-(?:                                 #       (
-(?:  \\ [^\x80-\xff]  |
-\(                            #  (
-[^\\\x80-\xff\n\015()] *                            #     normal*
-(?:  \\ [^\x80-\xff]   [^\\\x80-\xff\n\015()] * )*        #     (special normal*)*
-\)                           #                       )
-)    #         special
-[^\\\x80-\xff\n\015()] *                         #         normal*
-)*                                  #            )*
-\)                             #                )
-[\040\t]* )*    # If comment found, allow more spaces.
-# <
-(?:
-@
-[\040\t]*                    # Nab whitespace.
-(?:
-\(                              #  (
-[^\\\x80-\xff\n\015()] *                             #     normal*
-(?:                                 #       (
-(?:  \\ [^\x80-\xff]  |
-\(                            #  (
-[^\\\x80-\xff\n\015()] *                            #     normal*
-(?:  \\ [^\x80-\xff]   [^\\\x80-\xff\n\015()] * )*        #     (special normal*)*
-\)                           #                       )
-)    #         special
-[^\\\x80-\xff\n\015()] *                         #         normal*
-)*                                  #            )*
-\)                             #                )
-[\040\t]* )*    # If comment found, allow more spaces.
-(?:
-[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+    # some number of atom characters...
-(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
-|
-\[                            # [
-(?: [^\\\x80-\xff\n\015\[\]] |  \\ [^\x80-\xff]  )*     #    stuff
-\]                           #           ]
-)
-[\040\t]*                    # Nab whitespace.
-(?:
-\(                              #  (
-[^\\\x80-\xff\n\015()] *                             #     normal*
-(?:                                 #       (
-(?:  \\ [^\x80-\xff]  |
-\(                            #  (
-[^\\\x80-\xff\n\015()] *                            #     normal*
-(?:  \\ [^\x80-\xff]   [^\\\x80-\xff\n\015()] * )*        #     (special normal*)*
-\)                           #                       )
-)    #         special
-[^\\\x80-\xff\n\015()] *                         #         normal*
-)*                                  #            )*
-\)                             #                )
-[\040\t]* )*    # If comment found, allow more spaces.
-# optional trailing comments
-(?:
-\.
-[\040\t]*                    # Nab whitespace.
-(?:
-\(                              #  (
-[^\\\x80-\xff\n\015()] *                             #     normal*
-(?:                                 #       (
-(?:  \\ [^\x80-\xff]  |
-\(                            #  (
-[^\\\x80-\xff\n\015()] *                            #     normal*
-(?:  \\ [^\x80-\xff]   [^\\\x80-\xff\n\015()] * )*        #     (special normal*)*
-\)                           #                       )
-)    #         special
-[^\\\x80-\xff\n\015()] *                         #         normal*
-)*                                  #            )*
-\)                             #                )
-[\040\t]* )*    # If comment found, allow more spaces.
-(?:
-[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+    # some number of atom characters...
-(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
-|
-\[                            # [
-(?: [^\\\x80-\xff\n\015\[\]] |  \\ [^\x80-\xff]  )*     #    stuff
-\]                           #           ]
-)
-[\040\t]*                    # Nab whitespace.
-(?:
-\(                              #  (
-[^\\\x80-\xff\n\015()] *                             #     normal*
-(?:                                 #       (
-(?:  \\ [^\x80-\xff]  |
-\(                            #  (
-[^\\\x80-\xff\n\015()] *                            #     normal*
-(?:  \\ [^\x80-\xff]   [^\\\x80-\xff\n\015()] * )*        #     (special normal*)*
-\)                           #                       )
-)    #         special
-[^\\\x80-\xff\n\015()] *                         #         normal*
-)*                                  #            )*
-\)                             #                )
-[\040\t]* )*    # If comment found, allow more spaces.
-# optional trailing comments
-)*
-(?: ,
-[\040\t]*                    # Nab whitespace.
-(?:
-\(                              #  (
-[^\\\x80-\xff\n\015()] *                             #     normal*
-(?:                                 #       (
-(?:  \\ [^\x80-\xff]  |
-\(                            #  (
-[^\\\x80-\xff\n\015()] *                            #     normal*
-(?:  \\ [^\x80-\xff]   [^\\\x80-\xff\n\015()] * )*        #     (special normal*)*
-\)                           #                       )
-)    #         special
-[^\\\x80-\xff\n\015()] *                         #         normal*
-)*                                  #            )*
-\)                             #                )
-[\040\t]* )*    # If comment found, allow more spaces.
-@
-[\040\t]*                    # Nab whitespace.
-(?:
-\(                              #  (
-[^\\\x80-\xff\n\015()] *                             #     normal*
-(?:                                 #       (
-(?:  \\ [^\x80-\xff]  |
-\(                            #  (
-[^\\\x80-\xff\n\015()] *                            #     normal*
-(?:  \\ [^\x80-\xff]   [^\\\x80-\xff\n\015()] * )*        #     (special normal*)*
-\)                           #                       )
-)    #         special
-[^\\\x80-\xff\n\015()] *                         #         normal*
-)*                                  #            )*
-\)                             #                )
-[\040\t]* )*    # If comment found, allow more spaces.
-(?:
-[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+    # some number of atom characters...
-(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
-|
-\[                            # [
-(?: [^\\\x80-\xff\n\015\[\]] |  \\ [^\x80-\xff]  )*     #    stuff
-\]                           #           ]
-)
-[\040\t]*                    # Nab whitespace.
-(?:
-\(                              #  (
-[^\\\x80-\xff\n\015()] *                             #     normal*
-(?:                                 #       (
-(?:  \\ [^\x80-\xff]  |
-\(                            #  (
-[^\\\x80-\xff\n\015()] *                            #     normal*
-(?:  \\ [^\x80-\xff]   [^\\\x80-\xff\n\015()] * )*        #     (special normal*)*
-\)                           #                       )
-)    #         special
-[^\\\x80-\xff\n\015()] *                         #         normal*
-)*                                  #            )*
-\)                             #                )
-[\040\t]* )*    # If comment found, allow more spaces.
-# optional trailing comments
-(?:
-\.
-[\040\t]*                    # Nab whitespace.
-(?:
-\(                              #  (
-[^\\\x80-\xff\n\015()] *                             #     normal*
-(?:                                 #       (
-(?:  \\ [^\x80-\xff]  |
-\(                            #  (
-[^\\\x80-\xff\n\015()] *                            #     normal*
-(?:  \\ [^\x80-\xff]   [^\\\x80-\xff\n\015()] * )*        #     (special normal*)*
-\)                           #                       )
-)    #         special
-[^\\\x80-\xff\n\015()] *                         #         normal*
-)*                                  #            )*
-\)                             #                )
-[\040\t]* )*    # If comment found, allow more spaces.
-(?:
-[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+    # some number of atom characters...
-(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
-|
-\[                            # [
-(?: [^\\\x80-\xff\n\015\[\]] |  \\ [^\x80-\xff]  )*     #    stuff
-\]                           #           ]
-)
-[\040\t]*                    # Nab whitespace.
-(?:
-\(                              #  (
-[^\\\x80-\xff\n\015()] *                             #     normal*
-(?:                                 #       (
-(?:  \\ [^\x80-\xff]  |
-\(                            #  (
-[^\\\x80-\xff\n\015()] *                            #     normal*
-(?:  \\ [^\x80-\xff]   [^\\\x80-\xff\n\015()] * )*        #     (special normal*)*
-\)                           #                       )
-)    #         special
-[^\\\x80-\xff\n\015()] *                         #         normal*
-)*                                  #            )*
-\)                             #                )
-[\040\t]* )*    # If comment found, allow more spaces.
-# optional trailing comments
-)*
-)*  # additional domains
-:
-[\040\t]*                    # Nab whitespace.
-(?:
-\(                              #  (
-[^\\\x80-\xff\n\015()] *                             #     normal*
-(?:                                 #       (
-(?:  \\ [^\x80-\xff]  |
-\(                            #  (
-[^\\\x80-\xff\n\015()] *                            #     normal*
-(?:  \\ [^\x80-\xff]   [^\\\x80-\xff\n\015()] * )*        #     (special normal*)*
-\)                           #                       )
-)    #         special
-[^\\\x80-\xff\n\015()] *                         #         normal*
-)*                                  #            )*
-\)                             #                )
-[\040\t]* )*    # If comment found, allow more spaces.
-# optional trailing comments
-)?     #       optional route
-(?:
-[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+    # some number of atom characters...
-(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
-# Atom
-|                       #  or
-"                                     # "
-[^\\\x80-\xff\n\015"] *                            #   normal
-(?:  \\ [^\x80-\xff]  [^\\\x80-\xff\n\015"] * )*        #   ( special normal* )*
-"                                     #        "
-# Quoted string
-)
-[\040\t]*                    # Nab whitespace.
-(?:
-\(                              #  (
-[^\\\x80-\xff\n\015()] *                             #     normal*
-(?:                                 #       (
-(?:  \\ [^\x80-\xff]  |
-\(                            #  (
-[^\\\x80-\xff\n\015()] *                            #     normal*
-(?:  \\ [^\x80-\xff]   [^\\\x80-\xff\n\015()] * )*        #     (special normal*)*
-\)                           #                       )
-)    #         special
-[^\\\x80-\xff\n\015()] *                         #         normal*
-)*                                  #            )*
-\)                             #                )
-[\040\t]* )*    # If comment found, allow more spaces.
-(?:
-\.
-[\040\t]*                    # Nab whitespace.
-(?:
-\(                              #  (
-[^\\\x80-\xff\n\015()] *                             #     normal*
-(?:                                 #       (
-(?:  \\ [^\x80-\xff]  |
-\(                            #  (
-[^\\\x80-\xff\n\015()] *                            #     normal*
-(?:  \\ [^\x80-\xff]   [^\\\x80-\xff\n\015()] * )*        #     (special normal*)*
-\)                           #                       )
-)    #         special
-[^\\\x80-\xff\n\015()] *                         #         normal*
-)*                                  #            )*
-\)                             #                )
-[\040\t]* )*    # If comment found, allow more spaces.
-(?:
-[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+    # some number of atom characters...
-(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
-# Atom
-|                       #  or
-"                                     # "
-[^\\\x80-\xff\n\015"] *                            #   normal
-(?:  \\ [^\x80-\xff]  [^\\\x80-\xff\n\015"] * )*        #   ( special normal* )*
-"                                     #        "
-# Quoted string
-)
-[\040\t]*                    # Nab whitespace.
-(?:
-\(                              #  (
-[^\\\x80-\xff\n\015()] *                             #     normal*
-(?:                                 #       (
-(?:  \\ [^\x80-\xff]  |
-\(                            #  (
-[^\\\x80-\xff\n\015()] *                            #     normal*
-(?:  \\ [^\x80-\xff]   [^\\\x80-\xff\n\015()] * )*        #     (special normal*)*
-\)                           #                       )
-)    #         special
-[^\\\x80-\xff\n\015()] *                         #         normal*
-)*                                  #            )*
-\)                             #                )
-[\040\t]* )*    # If comment found, allow more spaces.
-# additional words
-)*
-@
-[\040\t]*                    # Nab whitespace.
-(?:
-\(                              #  (
-[^\\\x80-\xff\n\015()] *                             #     normal*
-(?:                                 #       (
-(?:  \\ [^\x80-\xff]  |
-\(                            #  (
-[^\\\x80-\xff\n\015()] *                            #     normal*
-(?:  \\ [^\x80-\xff]   [^\\\x80-\xff\n\015()] * )*        #     (special normal*)*
-\)                           #                       )
-)    #         special
-[^\\\x80-\xff\n\015()] *                         #         normal*
-)*                                  #            )*
-\)                             #                )
-[\040\t]* )*    # If comment found, allow more spaces.
-(?:
-[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+    # some number of atom characters...
-(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
-|
-\[                            # [
-(?: [^\\\x80-\xff\n\015\[\]] |  \\ [^\x80-\xff]  )*     #    stuff
-\]                           #           ]
-)
-[\040\t]*                    # Nab whitespace.
-(?:
-\(                              #  (
-[^\\\x80-\xff\n\015()] *                             #     normal*
-(?:                                 #       (
-(?:  \\ [^\x80-\xff]  |
-\(                            #  (
-[^\\\x80-\xff\n\015()] *                            #     normal*
-(?:  \\ [^\x80-\xff]   [^\\\x80-\xff\n\015()] * )*        #     (special normal*)*
-\)                           #                       )
-)    #         special
-[^\\\x80-\xff\n\015()] *                         #         normal*
-)*                                  #            )*
-\)                             #                )
-[\040\t]* )*    # If comment found, allow more spaces.
-# optional trailing comments
-(?:
-\.
-[\040\t]*                    # Nab whitespace.
-(?:
-\(                              #  (
-[^\\\x80-\xff\n\015()] *                             #     normal*
-(?:                                 #       (
-(?:  \\ [^\x80-\xff]  |
-\(                            #  (
-[^\\\x80-\xff\n\015()] *                            #     normal*
-(?:  \\ [^\x80-\xff]   [^\\\x80-\xff\n\015()] * )*        #     (special normal*)*
-\)                           #                       )
-)    #         special
-[^\\\x80-\xff\n\015()] *                         #         normal*
-)*                                  #            )*
-\)                             #                )
-[\040\t]* )*    # If comment found, allow more spaces.
-(?:
-[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+    # some number of atom characters...
-(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
-|
-\[                            # [
-(?: [^\\\x80-\xff\n\015\[\]] |  \\ [^\x80-\xff]  )*     #    stuff
-\]                           #           ]
-)
-[\040\t]*                    # Nab whitespace.
-(?:
-\(                              #  (
-[^\\\x80-\xff\n\015()] *                             #     normal*
-(?:                                 #       (
-(?:  \\ [^\x80-\xff]  |
-\(                            #  (
-[^\\\x80-\xff\n\015()] *                            #     normal*
-(?:  \\ [^\x80-\xff]   [^\\\x80-\xff\n\015()] * )*        #     (special normal*)*
-\)                           #                       )
-)    #         special
-[^\\\x80-\xff\n\015()] *                         #         normal*
-)*                                  #            )*
-\)                             #                )
-[\040\t]* )*    # If comment found, allow more spaces.
-# optional trailing comments
-)*
-#       address spec
->                    #                 >
-# name and address
-)
-/x
-    Alan Other <user\@dom.ain>
-    <user\@dom.ain>
-    user\@dom.ain
-    \"A. Other\" <user.1234\@dom.ain> (a comment)
-    A. Other <user.1234\@dom.ain> (a comment)
-    \"/s=user/ou=host/o=place/prmd=uu.yy/admd= /c=gb/\"\@x400-re.lay
-    A missing angle <user\@some.where
-    *** Failers
-    The quick brown fox
-
-/abc\0def\00pqr\000xyz\0000AB/
-    abc\0def\00pqr\000xyz\0000AB
-    abc456 abc\0def\00pqr\000xyz\0000ABCDE
-
-/abc\x0def\x00pqr\x000xyz\x0000AB/
-    abc\x0def\x00pqr\x000xyz\x0000AB
-    abc456 abc\x0def\x00pqr\x000xyz\x0000ABCDE
-
-/^[\000-\037]/
-    \0A
-    \01B
-    \037C
-
-/\0*/
-    \0\0\0\0
-
-/A\x0{2,3}Z/
-    The A\x0\x0Z
-    An A\0\x0\0Z
-    *** Failers
-    A\0Z
-    A\0\x0\0\x0Z
-
-/^\s/
-    \040abc
-    \x0cabc
-    \nabc
-    \rabc
-    \tabc
-    *** Failers
-    abc
-
-/^a    b
-    ?  c/x
-    abc
-
-/ab{1,3}bc/
-    abbbbc
-    abbbc
-    abbc
-    *** Failers
-    abc
-    abbbbbc
-
-/([^.]*)\.([^:]*):[T ]+(.*)/
-    track1.title:TBlah blah blah
-
-/([^.]*)\.([^:]*):[T ]+(.*)/i
-    track1.title:TBlah blah blah
-
-/([^.]*)\.([^:]*):[t ]+(.*)/i
-    track1.title:TBlah blah blah
-
-/^[W-c]+$/
-    WXY_^abc
-    *** Failers
-    wxy
-
-/^[W-c]+$/i
-    WXY_^abc
-    wxy_^ABC
-
-/^[\x3f-\x5F]+$/i
-    WXY_^abc
-    wxy_^ABC
-
-/^abc$/m
-    abc
-    qqq\nabc
-    abc\nzzz
-    qqq\nabc\nzzz
-
-/^abc$/
-    abc
-    *** Failers
-    qqq\nabc
-    abc\nzzz
-    qqq\nabc\nzzz
-
-/\Aabc\Z/m
-    abc
-    abc\n 
-    *** Failers
-    qqq\nabc
-    abc\nzzz
-    qqq\nabc\nzzz
-    
-/\A(.)*\Z/s
-    abc\ndef
-
-/\A(.)*\Z/m
-    *** Failers
-    abc\ndef
-
-/(?:b)|(?::+)/
-    b::c
-    c::b
-
-/[-az]+/
-    az-
-    *** Failers
-    b
-
-/[az-]+/
-    za-
-    *** Failers
-    b
-
-/[a\-z]+/
-    a-z
-    *** Failers
-    b
-
-/[a-z]+/
-    abcdxyz
-
-/[\d-]+/
     12-34
-    *** Failers
-    aaa
+    12+\x{661}-34  
+    ** Failers
+    abcd

-/[\d-z]+/
-    12-34z
-    *** Failers
-    aaa
-
-/\x5c/
-    \\
-
-/\x20Z/
-    the Zoo
-    *** Failers
-    Zulu
-
-/ab{3cd/
-    ab{3cd
-
-/ab{3,cd/
-    ab{3,cd
-
-/ab{3,4a}cd/
-    ab{3,4a}cd
-
-/{4,5a}bc/
-    {4,5a}bc
-
-/^a.b/<lf>
-    a\rb
-    *** Failers
-    a\nb
-
-/abc$/
-    abc
-    abc\n
-    *** Failers
-    abc\ndef
-
-/(abc)\123/
-    abc\x53
-
-/(abc)\223/
-    abc\x93
-
-/(abc)\323/
-    abc\xd3
-
-/(abc)\100/
-    abc\x40
-    abc\100
-
-/(abc)\1000/
-    abc\x400
-    abc\x40\x30
-    abc\1000
-    abc\100\x30
-    abc\100\060
-    abc\100\60
-
-/abc\81/
-    abc\081
-    abc\0\x38\x31
-
-/abc\91/
-    abc\091
-    abc\0\x39\x31
-
-/(a)(b)(c)(d)(e)(f)(g)(h)(i)(j)(k)\12\123/
-    abcdefghijk\12S
-
-/ab\idef/
-    abidef
-
-/a{0}bc/
-    bc
-
-/(a|(bc)){0,0}?xyz/
-    xyz
-
-/abc[\10]de/
-    abc\010de
-
-/abc[\1]de/
-    abc\1de
-
-/(abc)[\1]de/
-    abc\1de
-
-/(?s)a.b/
-    a\nb
-
-/^([^a])([^\b])([^c]*)([^d]{3,4})/
-    baNOTccccd
-    baNOTcccd
-    baNOTccd
-    bacccd
-    *** Failers
-    anything
-    b\bc   
-    baccd
-
-/[^a]/
-    Abc
-  
-/[^a]/i
-    Abc 
-
-/[^a]+/
-    AAAaAbc
-  
-/[^a]+/i
-    AAAaAbc 
-
-/[^a]+/
-    bbb\nccc
-   
-/[^k]$/
-    abc
-    *** Failers
-    abk   
-   
-/[^k]{2,3}$/
-    abc
-    kbc
-    kabc 
-    *** Failers
-    abk
-    akb
-    akk 
-
-/^\d{8,}\@.+[^k]$/
-    12345678\@a.b.c.d
-    123456789\@x.y.z
-    *** Failers
-    12345678\@x.y.uk
-    1234567\@a.b.c.d       
-
-/[^a]/
-    aaaabcd
-    aaAabcd 
-
-/[^a]/i
-    aaaabcd
-    aaAabcd 
-
-/[^az]/
-    aaaabcd
-    aaAabcd 
-
-/[^az]/i
-    aaaabcd
-    aaAabcd 
-
-/\000\001\002\003\004\005\006\007\010\011\012\013\014\015\016\017\020\021\022\023\024\025\026\027\030\031\032\033\034\035\036\037\040\041\042\043\044\045\046\047\050\051\052\053\054\055\056\057\060\061\062\063\064\065\066\067\070\071\072\073\074\075\076\077\100\101\102\103\104\105\106\107\110\111\112\113\114\115\116\117\120\121\122\123\124\125\126\127\130\131\132\133\134\135\136\137\140\141\142\143\144\145\146\147\150\151\152\153\154\155\156\157\160\161\162\163\164\165\166\167\170\171\172\173\174\175\176\177\200\201\202\203\204\205\206\207\210\211\212\213\214\215\216\217\220\221\222\223\224\225\226\227\230\231\232\233\234\235\236\237\240\241\242\243\244\245\246\247\250\251\252\253\254\255\256\257\260\261\262\263\264\265\266\267\270\271\272\273\274\275\276\277\300\301\302\303\304\305\306\307\310\311\312\313\314\315\316\317\320\321\322\323\324\325\326\327\330\331\332\333\334\335\336\337\340\341\342\343\344\345\346\347\350\351\352\353\354\355\356\357\360\361\362\363\364\365\366\367\370\371\372\373\374\375\376\377/
- \000\001\002\003\004\005\006\007\010\011\012\013\014\015\016\017\020\021\022\023\024\025\026\027\030\031\032\033\034\035\036\037\040\041\042\043\044\045\046\047\050\051\052\053\054\055\056\057\060\061\062\063\064\065\066\067\070\071\072\073\074\075\076\077\100\101\102\103\104\105\106\107\110\111\112\113\114\115\116\117\120\121\122\123\124\125\126\127\130\131\132\133\134\135\136\137\140\141\142\143\144\145\146\147\150\151\152\153\154\155\156\157\160\161\162\163\164\165\166\167\170\171\172\173\174\175\176\177\200\201\202\203\204\205\206\207\210\211\212\213\214\215\216\217\220\221\222\223\224\225\226\227\230\231\232\233\234\235\236\237\240\241\242\243\244\245\246\247\250\251\252\253\254\255\256\257\260\261\262\263\264\265\266\267\270\271\272\273\274\275\276\277\300\301\302\303\304\305\306\307\310\311\312\313\314\315\316\317\320\321\322\323\324\325\326\327\330\331\332\333\334\335\336\337\340\341\342\343\344\345\346\347\350\351\352\353\354\355\356\357\360\361\362\363\364\365\366\367\370\371\372\373\374\375\376\377
-
-/P[^*]TAIRE[^*]{1,6}?LL/
-    xxxxxxxxxxxPSTAIREISLLxxxxxxxxx
-
-/P[^*]TAIRE[^*]{1,}?LL/
-    xxxxxxxxxxxPSTAIREISLLxxxxxxxxx
-
-/(\.\d\d[1-9]?)\d+/
-    1.230003938
-    1.875000282   
-    1.235  
-                  
-/(\.\d\d((?=0)|\d(?=\d)))/
-    1.230003938      
-    1.875000282
-    *** Failers 
-    1.235 
+/[\x{105}-\x{109}]/8iDZ
+    \x{104}
+    \x{105}
+    \x{109}  
+    ** Failers
+    \x{100}
+    \x{10a}

-/a(?)b/
-    ab 
- 
-/\b(foo)\s+(\w+)/i
-    Food is on the foo table
-    
-/foo(.*)bar/
-    The food is under the bar in the barn.
-    
-/foo(.*?)bar/  
-    The food is under the bar in the barn.
+/[z-\x{100}]/8iDZ 
+    Z
+    z
+    \x{39c}
+    \x{178}
+    |
+    \x{80}
+    \x{ff}
+    \x{100}
+    \x{101} 
+    ** Failers
+    \x{102}
+    Y
+    y

-/(.*)(\d*)/
-    I have 2 numbers: 53147
-    
-/(.*)(\d+)/
-    I have 2 numbers: 53147
- 
-/(.*?)(\d*)/
-    I have 2 numbers: 53147
+/[z-\x{100}]/8DZi

-/(.*?)(\d+)/
-    I have 2 numbers: 53147
+/(?:[\PPa*]*){8,}/

-/(.*)(\d+)$/
-    I have 2 numbers: 53147
+/[\P{Any}]/BZ

-/(.*?)(\d+)$/
-    I have 2 numbers: 53147
+/[\P{Any}\E]/BZ

-/(.*)\b(\d+)$/
-    I have 2 numbers: 53147
+/(\P{Yi}+\277)/

-/(.*\D)(\d+)$/
-    I have 2 numbers: 53147
+/(\P{Yi}+\277)?/

-/^\D*(?!123)/
-    ABC123
-     
-/^(\D*)(?=\d)(?!123)/
-    ABC445
-    *** Failers
-    ABC123
-    
-/^[W-]46]/
-    W46]789 
-    -46]789
-    *** Failers
-    Wall
-    Zebra
-    42
-    [abcd] 
-    ]abcd[
-       
-/^[W-\]46]/
-    W46]789 
-    Wall
-    Zebra
-    Xylophone  
-    42
-    [abcd] 
-    ]abcd[
-    \\backslash 
-    *** Failers
-    -46]789
-    well
-    
-/\d\d\/\d\d\/\d\d\d\d/
-    01/01/2000
+/(?<=\P{Yi}{3}A)X/

-/word (?:[a-zA-Z0-9]+ ){0,10}otherword/
- word cat dog elephant mussel cow horse canary baboon snake shark otherword
- word cat dog elephant mussel cow horse canary baboon snake shark
+/\p{Yi}+(\P{Yi}+)(?1)/

-/word (?:[a-zA-Z0-9]+ ){0,300}otherword/
- word cat dog elephant mussel cow horse canary baboon snake shark the quick brown fox and the lazy dog and several other words getting close to thirty by now I hope
+/(\P{Yi}{2}\277)?/

-/^(a){0,0}/
-    bcd
-    abc
-    aab     
+/[\P{Yi}A]/

-/^(a){0,1}/
-    bcd
-    abc
-    aab  
+/[\P{Yi}\P{Yi}\P{Yi}A]/

-/^(a){0,2}/
-    bcd
-    abc
-    aab  
+/[^\P{Yi}A]/

-/^(a){0,3}/
-    bcd
-    abc
-    aab
-    aaa   
+/[^\P{Yi}\P{Yi}\P{Yi}A]/

-/^(a){0,}/
-    bcd
-    abc
-    aab
-    aaa
-    aaaaaaaa    
+/(\P{Yi}*\277)*/

-/^(a){1,1}/
-    bcd
-    abc
-    aab  
+/(\P{Yi}*?\277)*/

-/^(a){1,2}/
-    bcd
-    abc
-    aab  
+/(\p{Yi}*+\277)*/

-/^(a){1,3}/
-    bcd
-    abc
-    aab
-    aaa   
+/(\P{Yi}?\277)*/

-/^(a){1,}/
-    bcd
-    abc
-    aab
-    aaa
-    aaaaaaaa    
+/(\P{Yi}??\277)*/

-/.*\.gif/
-    borfle\nbib.gif\nno
+/(\p{Yi}?+\277)*/

-/.{0,}\.gif/
-    borfle\nbib.gif\nno
+/(\P{Yi}{0,3}\277)*/

-/.*\.gif/m
-    borfle\nbib.gif\nno
+/(\P{Yi}{0,3}?\277)*/

-/.*\.gif/s
-    borfle\nbib.gif\nno
+/(\p{Yi}{0,3}+\277)*/

-/.*\.gif/ms
-    borfle\nbib.gif\nno
+/\p{Zl}{2,3}+/8BZ
+    \xe2\x80\xa8\xe2\x80\xa8
+    \x{2028}\x{2028}\x{2028}

-/.*$/
-    borfle\nbib.gif\nno
+/\p{Zl}/8BZ

-/.*$/m
-    borfle\nbib.gif\nno
+/\p{Lu}{3}+/8BZ

-/.*$/s
-    borfle\nbib.gif\nno
+/\pL{2}+/8BZ

-/.*$/ms
-    borfle\nbib.gif\nno
-    
-/.*$/
-    borfle\nbib.gif\nno\n
+/\p{Cc}{2}+/8BZ

-/.*$/m
-    borfle\nbib.gif\nno\n
-
-/.*$/s
-    borfle\nbib.gif\nno\n
-
-/.*$/ms
-    borfle\nbib.gif\nno\n
-    
-/(.*X|^B)/
-    abcde\n1234Xyz
-    BarFoo 
-    *** Failers
-    abcde\nBar  
-
-/(.*X|^B)/m
-    abcde\n1234Xyz
-    BarFoo 
-    abcde\nBar  
-
-/(.*X|^B)/s
-    abcde\n1234Xyz
-    BarFoo 
-    *** Failers
-    abcde\nBar  
-
-/(.*X|^B)/ms
-    abcde\n1234Xyz
-    BarFoo 
-    abcde\nBar  
-
-/(?s)(.*X|^B)/
-    abcde\n1234Xyz
-    BarFoo 
-    *** Failers 
-    abcde\nBar  
-
-/(?s:.*X|^B)/
-    abcde\n1234Xyz
-    BarFoo 
-    *** Failers 
-    abcde\nBar  
-
-/^.*B/
-    **** Failers
-    abc\nB
-     
-/(?s)^.*B/
-    abc\nB
-
-/(?m)^.*B/
-    abc\nB
-     
-/(?ms)^.*B/
-    abc\nB
-
-/(?ms)^B/
-    abc\nB
-
-/(?s)B$/
-    B\n
-
-/^[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]/
-    123456654321
+/^\p{Cs}/8
+    \?\x{dfff}
+    ** Failers
+    \x{09f}

-/^\d\d\d\d\d\d\d\d\d\d\d\d/
-    123456654321 
-
-/^[\d][\d][\d][\d][\d][\d][\d][\d][\d][\d][\d][\d]/
-    123456654321
+/^\p{Sc}+/8
+    $\x{a2}\x{a3}\x{a4}\x{a5}\x{a6}
+    \x{9f2}
+    ** Failers
+    X
+    \x{2c2}

-/^[abc]{12}/
-    abcabcabcabc
-    
-/^[a-c]{12}/
-    abcabcabcabc
-    
-/^(a|b|c){12}/
-    abcabcabcabc 
-
-/^[abcdefghijklmnopqrstuvwxy0123456789]/
-    n
-    *** Failers 
-    z 
-
-/abcde{0,0}/
-    abcd
-    *** Failers
-    abce  
-
-/ab[cd]{0,0}e/
-    abe
-    *** Failers
-    abcde 
-    
-/ab(c){0,0}d/
-    abd
-    *** Failers
-    abcd   
-
-/a(b*)/
-    a
-    ab
-    abbbb
-    *** Failers
-    bbbbb    
-    
-/ab\d{0}e/
-    abe
-    *** Failers
-    ab1e   
-    
-/"([^\\"]+|\\.)*"/
-    the \"quick\" brown fox
-    \"the \\\"quick\\\" brown fox\" 
-
-/.*?/g+
-    abc
+/^\p{Zs}/8
+    \ \
+    \x{a0}
+    \x{1680}
+    \x{180e}
+    \x{2000}
+    \x{2001}     
+    ** Failers
+    \x{2028}
+    \x{200d}

-/\b/g+
-    abc 
-
-/\b/+g
-    abc 
-
-//g
-    abc
-
-/<tr([\w\W\s\d][^<>]{0,})><TD([\w\W\s\d][^<>]{0,})>([\d]{0,}\.)(.*)((<BR>([\w\W\s\d][^<>]{0,})|[\s]{0,}))<\/a><\/TD><TD([\w\W\s\d][^<>]{0,})>([\w\W\s\d][^<>]{0,})<\/TD><TD([\w\W\s\d][^<>]{0,})>([\w\W\s\d][^<>]{0,})<\/TD><\/TR>/is
-  <TR BGCOLOR='#DBE9E9'><TD align=left valign=top>43.<a href='joblist.cfm?JobID=94 6735&Keyword='>Word Processor<BR>(N-1286)</a></TD><TD align=left valign=top>Lega lstaff.com</TD><TD align=left valign=top>CA - Statewide</TD></TR>
-
-/a[^a]b/
-    acb
-    a\nb
-    
-/a.b/
-    acb
-    *** Failers 
-    a\nb   
-    
-/a[^a]b/s
-    acb
-    a\nb  
-    
-/a.b/s
-    acb
-    a\nb  
-
-/^(b+?|a){1,2}?c/
-    bac
-    bbac
-    bbbac
-    bbbbac
-    bbbbbac 
-
-/^(b+|a){1,2}?c/
-    bac
-    bbac
-    bbbac
-    bbbbac
-    bbbbbac 
-    
-/(?!\A)x/m
-    x\nb\n
-    a\bx\n  
-    
-/\x0{ab}/
-    \0{ab} 
-
-/(A|B)*?CD/
-    CD 
-    
-/(A|B)*CD/
-    CD 
-
-/(?<!bar)foo/
-    foo
-    catfood
-    arfootle
-    rfoosh
-    *** Failers
-    barfoo
-    towbarfoo
-
-/\w{3}(?<!bar)foo/
-    catfood
-    *** Failers
-    foo
-    barfoo
-    towbarfoo
-
-/(?<=(foo)a)bar/
-    fooabar
-    *** Failers
-    bar
-    foobbar
+/-- These four are here rather than in test 6 because Perl has problems with
+    the negative versions of the properties. --/

-/\Aabc\z/m
-    abc
-    *** Failers
-    abc\n   
-    qqq\nabc
-    abc\nzzz
-    qqq\nabc\nzzz
+/\p{^Lu}/8i
+    1234
+    ** Failers
+    ABC

-"(?>.*/)foo"
-    /this/is/a/very/long/line/in/deed/with/very/many/slashes/in/it/you/see/
+/\P{Lu}/8i
+    1234
+    ** Failers
+    ABC

-"(?>.*/)foo"
-    /this/is/a/very/long/line/in/deed/with/very/many/slashes/in/and/foo
-
-/(?>(\.\d\d[1-9]?))\d+/
-    1.230003938
-    1.875000282
-    *** Failers 
-    1.235 
-
-/^((?>\w+)|(?>\s+))*$/
-    now is the time for all good men to come to the aid of the party
-    *** Failers
-    this is not a line with only words and spaces!
-    
-/(\d+)(\w)/
-    12345a
-    12345+ 
-
-/((?>\d+))(\w)/
-    12345a
-    *** Failers
-    12345+ 
-
-/(?>a+)b/
-    aaab
-
-/((?>a+)b)/
-    aaab
-
-/(?>(a+))b/
-    aaab
-
-/(?>b)+/
-    aaabbbccc
-
-/(?>a+|b+|c+)*c/
-    aaabbbbccccd
-    
-/(a+|b+|c+)*c/
-    aaabbbbccccd
-
-/((?>[^()]+)|\([^()]*\))+/
-    ((abc(ade)ufh()()x
-    
-/\(((?>[^()]+)|\([^()]+\))+\)/ 
-    (abc)
-    (abc(def)xyz)
-    *** Failers
-    ((()aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa   
-
-/a(?-i)b/i
-    ab
-    Ab
-    *** Failers 
-    aB
-    AB
-        
-/(a (?x)b c)d e/
-    a bcd e
-    *** Failers
-    a b cd e
-    abcd e   
-    a bcde 
- 
-/(a b(?x)c d (?-x)e f)/
-    a bcde f
-    *** Failers
-    abcdef  
-
-/(a(?i)b)c/
-    abc
-    aBc
-    *** Failers
-    abC
-    aBC  
-    Abc
-    ABc
-    ABC
-    AbC
-    
-/a(?i:b)c/
-    abc
-    aBc
-    *** Failers 
-    ABC
-    abC
-    aBC
-    
-/a(?i:b)*c/
-    aBc
-    aBBc
-    *** Failers 
-    aBC
-    aBBC
-    
-/a(?=b(?i)c)\w\wd/
-    abcd
-    abCd
-    *** Failers
-    aBCd
-    abcD     
-    
-/(?s-i:more.*than).*million/i
-    more than million
-    more than MILLION
-    more \n than Million 
-    *** Failers
-    MORE THAN MILLION    
-    more \n than \n million 
-
-/(?:(?s-i)more.*than).*million/i
-    more than million
-    more than MILLION
-    more \n than Million 
-    *** Failers
-    MORE THAN MILLION    
-    more \n than \n million 
-    
-/(?>a(?i)b+)+c/ 
-    abc
-    aBbc
-    aBBc 
-    *** Failers
-    Abc
-    abAb    
-    abbC 
-    
-/(?=a(?i)b)\w\wc/
-    abc
-    aBc
-    *** Failers
-    Ab 
-    abC
-    aBC     
-    
-/(?<=a(?i)b)(\w\w)c/
-    abxxc
-    aBxxc
-    *** Failers
-    Abxxc
-    ABxxc
-    abxxC      
-
-/^(?(?=abc)\w{3}:|\d\d)$/
-    abc:
-    12
-    *** Failers
-    123
-    xyz    
-
-/^(?(?!abc)\d\d|\w{3}:)$/
-    abc:
-    12
-    *** Failers
-    123
-    xyz    
-    
-/(?(?<=foo)bar|cat)/
-    foobar
-    cat
-    fcat
-    focat   
-    *** Failers
-    foocat  
-
-/(?(?<!foo)cat|bar)/
-    foobar
-    cat
-    fcat
-    focat   
-    *** Failers
-    foocat  
-
-/(?>a*)*/
+/\p{Ll}/8i 
     a
-    aa
-    aaaa
-    
-/(abc|)+/
-    abc
-    abcabc
-    abcabcabc
-    xyz      
+    Az
+    ** Failers
+    ABC

-/([a]*)*/
+/\p{Lu}/8i
+    A
+    a\x{10a0}B 
+    ** Failers 
     a
-    aaaaa 
- 
-/([ab]*)*/
-    a
-    b
-    ababab
-    aaaabcde
-    bbbb    
- 
-/([^a]*)*/
-    b
-    bbbb
-    aaa   
- 
-/([^ab]*)*/
-    cccc
-    abab  
- 
-/([a]*?)*/
-    a
-    aaaa 
- 
-/([ab]*?)*/
-    a
-    b
-    abab
-    baba   
- 
-/([^a]*?)*/
-    b
-    bbbb
-    aaa   
- 
-/([^ab]*?)*/
-    c
-    cccc
-    baba   
- 
-/(?>a*)*/
-    a
-    aaabcde 
- 
-/((?>a*))*/
-    aaaaa
-    aabbaa 
- 
-/((?>a*?))*/
-    aaaaa
-    aabbaa 
+    \x{1d00}

-/(?(?=[^a-z]+[a-z])  \d{2}-[a-z]{3}-\d{2}  |  \d{2}-\d{2}-\d{2} ) /x
-    12-sep-98
-    12-09-98
-    *** Failers
-    sep-12-98
-        
-/(?i:saturday|sunday)/
-    saturday
-    sunday
-    Saturday
-    Sunday
-    SATURDAY
-    SUNDAY
-    SunDay
-    
-/(a(?i)bc|BB)x/
-    abcx
-    aBCx
-    bbx
-    BBx
-    *** Failers
-    abcX
-    aBCX
-    bbX
-    BBX               
+/[\x{c0}\x{391}]/8i
+    \x{c0}
+    \x{e0}

-/^([ab](?i)[cd]|[ef])/
-    ac
-    aC
-    bD
-    elephant
-    Europe 
-    frog
-    France
-    *** Failers
-    Africa     
+/-- The next two are special cases where the lengths of the different cases of
+the same character differ. The first went wrong with heap frame storage; the
+second was broken in all cases. --/

-/^(ab|a(?i)[b-c](?m-i)d|x(?i)y|z)/
-    ab
-    aBd
-    xy
-    xY
-    zebra
-    Zambesi
-    *** Failers
-    aCD  
-    XY  
+/^\x{023a}+?(\x{0130}+)/8i
+  \x{023a}\x{2c65}\x{0130}
+  
+/^\x{023a}+([^X])/8i
+  \x{023a}\x{2c65}X

-/(?<=foo\n)^bar/m
-    foo\nbar
-    *** Failers
-    bar
-    baz\nbar   
+/\x{c0}+\x{116}+/8i
+    \x{c0}\x{e0}\x{116}\x{117}

-/(?<=(?<!foo)bar)baz/
-    barbaz
-    barbarbaz 
-    koobarbaz 
-    *** Failers
-    baz
-    foobarbaz 
+/[\x{c0}\x{116}]+/8i
+    \x{c0}\x{e0}\x{116}\x{117}

-/The following tests are taken from the Perl 5.005 test suite; some of them/
-/are compatible with 5.004, but I'd rather not have to sort them out./
+/(\x{de})\1/8i
+    \x{de}\x{de}
+    \x{de}\x{fe}
+    \x{fe}\x{fe}
+    \x{fe}\x{de}

-/abc/
-    abc
-    xabcy
-    ababc
-    *** Failers
-    xbc
-    axc
-    abx
+/^\x{c0}$/8i
+    \x{c0}
+    \x{e0}

-/ab*c/
-    abc
+/^\x{e0}$/8i
+    \x{c0}
+    \x{e0}

-/ab*bc/
-    abc
-    abbc
-    abbbbc
+/-- The next two should be Perl-compatible, but it fails to match \x{e0}. PCRE
+will match it only with UCP support, because without that it has no notion
+of case for anything other than the ASCII letters. --/

-/.{1}/
-    abbbbc
+/((?i)[\x{c0}])/8
+    \x{c0}
+    \x{e0}

-/.{3,4}/
-    abbbbc
+/(?i:[\x{c0}])/8
+    \x{c0}
+    \x{e0}

-/ab{0,}bc/
-    abbbbc
-
-/ab+bc/
-    abbc
-    *** Failers
-    abc
-    abq
-
-/ab+bc/
-    abbbbc
-
-/ab{1,}bc/
-    abbbbc
-
-/ab{1,3}bc/
-    abbbbc
-
-/ab{3,4}bc/
-    abbbbc
-
-/ab{4,5}bc/
-    *** Failers
-    abq
-    abbbbc
-
-/ab?bc/
-    abbc
-    abc
-
-/ab{0,1}bc/
-    abc
-
-/ab?bc/
-
-/ab?c/
-    abc
-
-/ab{0,1}c/
-    abc
-
-/^abc$/
-    abc
-    *** Failers
-    abbbbc
-    abcc
-
-/^abc/
-    abcc
-
-/^abc$/
-
-/abc$/
-    aabc
-    *** Failers
-    aabc
-    aabcd
-
-/^/
-    abc
-
-/$/
-    abc
-
-/a.c/
-    abc
-    axc
-
-/a.*c/
-    axyzc
-
-/a[bc]d/
-    abd
-    *** Failers
-    axyzd
-    abc
-
-/a[b-d]e/
-    ace
-
-/a[b-d]/
-    aac
-
-/a[-b]/
-    a-
-
-/a[b-]/
-    a-
-
-/a]/
-    a]
-
-/a[]]b/
-    a]b
-
-/a[^bc]d/
-    aed
-    *** Failers
-    abd
-    abd
-
-/a[^-b]c/
-    adc
-
-/a[^]b]c/
-    adc
-    *** Failers
-    a-c
-    a]c
-
-/\ba\b/
-    a-
-    -a
-    -a-
-
-/\by\b/
-    *** Failers
-    xy
-    yz
-    xyz
-
-/\Ba\B/
-    *** Failers
-    a-
-    -a
-    -a-
-
-/\By\b/
-    xy
-
-/\by\B/
-    yz
-
-/\By\B/
-    xyz
-
-/\w/
-    a
-
-/\W/
-    -
-    *** Failers
-    -
-    a
-
-/a\sb/
-    a b
-
-/a\Sb/
-    a-b
-    *** Failers
-    a-b
-    a b
-
-/\d/
-    1
-
-/\D/
-    -
-    *** Failers
-    -
-    1
-
-/[\w]/
-    a
-
-/[\W]/
-    -
-    *** Failers
-    -
-    a
-
-/a[\s]b/
-    a b
-
-/a[\S]b/
-    a-b
-    *** Failers
-    a-b
-    a b
-
-/[\d]/
-    1
-
-/[\D]/
-    -
-    *** Failers
-    -
-    1
-
-/ab|cd/
-    abc
-    abcd
-
-/()ef/
-    def
-
-/$b/
-
-/a\(b/
-    a(b
-
-/a\(*b/
-    ab
-    a((b
-
-/a\\b/
-    a\b
-
-/((a))/
-    abc
-
-/(a)b(c)/
-    abc
-
-/a+b+c/
-    aabbabc
-
-/a{1,}b{1,}c/
-    aabbabc
-
-/a.+?c/
-    abcabc
-
-/(a+|b)*/
-    ab
-
-/(a+|b){0,}/
-    ab
-
-/(a+|b)+/
-    ab
-
-/(a+|b){1,}/
-    ab
-
-/(a+|b)?/
-    ab
-
-/(a+|b){0,1}/
-    ab
-
-/[^ab]*/
-    cde
-
-/abc/
-    *** Failers
-    b
+/-- This should be Perl-compatible but Perl 5.11 gets \x{300} wrong. --/8

-
-/a*/
-    
-
-/([abc])*d/
-    abbbcd
-
-/([abc])*bcd/
-    abcd
-
-/a|b|c|d|e/
-    e
-
-/(a|b|c|d|e)f/
-    ef
-
-/abcd*efg/
-    abcdefg
-
-/ab*/
-    xabyabbbz
-    xayabbbz
-
-/(ab|cd)e/
-    abcde
-
-/[abhgefdc]ij/
-    hij
-
-/^(ab|cd)e/
-
-/(abc|)ef/
-    abcdef
-
-/(a|b)c*d/
-    abcd
-
-/(ab|ab*)bc/
-    abc
-
-/a([bc]*)c*/
-    abc
-
-/a([bc]*)(c*d)/
-    abcd
-
-/a([bc]+)(c*d)/
-    abcd
-
-/a([bc]*)(c+d)/
-    abcd
-
-/a[bcd]*dcdcde/
-    adcdcde
-
-/a[bcd]+dcdcde/
+/^\X/8
+    A
+    A\x{300}BC 
+    A\x{300}\x{301}\x{302}BC 
     *** Failers
-    abcde
-    adcdcde
-
-/(ab|a)b*c/
-    abc
-
-/((a)(b)c)(d)/
-    abcd
-
-/[a-zA-Z_][a-zA-Z0-9_]*/
-    alpha
-
-/^a(bc+|b[eh])g|.h$/
-    abh
-
-/(bc+d$|ef*g.|h?i(j|k))/
-    effgz
-    ij
-    reffgz
-    *** Failers
-    effg
-    bcdd
-
-/((((((((((a))))))))))/
-    a
-
-/(((((((((a)))))))))/
-    a
-
-/multiple words of text/
-    *** Failers
-    aa
-    uh-uh
-
-/multiple words/
-    multiple words, yeah
-
-/(.*)c(.*)/
-    abcde
-
-/\((.*), (.*)\)/
-    (a, b)
-
-/[k]/
-
-/abcd/
-    abcd
-
-/a(bc)d/
-    abcd
-
-/a[-]?c/
-    ac
-
-/abc/i
-    ABC
-    XABCY
-    ABABC
-    *** Failers
-    aaxabxbaxbbx
-    XBC
-    AXC
-    ABX
-
-/ab*c/i
-    ABC
-
-/ab*bc/i
-    ABC
-    ABBC
-
-/ab*?bc/i
-    ABBBBC
-
-/ab{0,}?bc/i
-    ABBBBC
-
-/ab+?bc/i
-    ABBC
-
-/ab+bc/i
-    *** Failers
-    ABC
-    ABQ
-
-/ab{1,}bc/i
-
-/ab+bc/i
-    ABBBBC
-
-/ab{1,}?bc/i
-    ABBBBC
-
-/ab{1,3}?bc/i
-    ABBBBC
-
-/ab{3,4}?bc/i
-    ABBBBC
-
-/ab{4,5}?bc/i
-    *** Failers
-    ABQ
-    ABBBBC
-
-/ab??bc/i
-    ABBC
-    ABC
-
-/ab{0,1}?bc/i
-    ABC
-
-/ab??bc/i
-
-/ab??c/i
-    ABC
-
-/ab{0,1}?c/i
-    ABC
-
-/^abc$/i
-    ABC
-    *** Failers
-    ABBBBC
-    ABCC
-
-/^abc/i
-    ABCC
-
-/^abc$/i
-
-/abc$/i
-    AABC
-
-/^/i
-    ABC
-
-/$/i
-    ABC
-
-/a.c/i
-    ABC
-    AXC
-
-/a.*?c/i
-    AXYZC
-
-/a.*c/i
-    *** Failers
-    AABC
-    AXYZD
-
-/a[bc]d/i
-    ABD
-
-/a[b-d]e/i
-    ACE
-    *** Failers
-    ABC
-    ABD
-
-/a[b-d]/i
-    AAC
-
-/a[-b]/i
-    A-
-
-/a[b-]/i
-    A-
-
-/a]/i
-    A]
-
-/a[]]b/i
-    A]B
-
-/a[^bc]d/i
-    AED
-
-/a[^-b]c/i
-    ADC
-    *** Failers
-    ABD
-    A-C
-
-/a[^]b]c/i
-    ADC
-
-/ab|cd/i
-    ABC
-    ABCD
-
-/()ef/i
-    DEF
-
-/$b/i
-    *** Failers
-    A]C
-    B
-
-/a\(b/i
-    A(B
-
-/a\(*b/i
-    AB
-    A((B
-
-/a\\b/i
-    A\B
-
-/((a))/i
-    ABC
-
-/(a)b(c)/i
-    ABC
-
-/a+b+c/i
-    AABBABC
-
-/a{1,}b{1,}c/i
-    AABBABC
-
-/a.+?c/i
-    ABCABC
-
-/a.*?c/i
-    ABCABC
-
-/a.{0,5}?c/i
-    ABCABC
-
-/(a+|b)*/i
-    AB
-
-/(a+|b){0,}/i
-    AB
-
-/(a+|b)+/i
-    AB
-
-/(a+|b){1,}/i
-    AB
-
-/(a+|b)?/i
-    AB
-
-/(a+|b){0,1}/i
-    AB
-
-/(a+|b){0,1}?/i
-    AB
-
-/[^ab]*/i
-    CDE
-
-/abc/i
-
-/a*/i
+    \x{300}

+/-- These are PCRE's extra properties to help with Unicodizing \d etc. --/

-/([abc])*d/i
-    ABBBCD
-
-/([abc])*bcd/i
+/^\p{Xan}/8
     ABCD
+    1234
+    \x{6ca}
+    \x{a6c}
+    \x{10a7}   
+    ** Failers
+    _ABC

-/a|b|c|d|e/i
-    E
+/^\p{Xan}+/8
+    ABCD1234\x{6ca}\x{a6c}\x{10a7}_
+    ** Failers
+    _ABC

-/(a|b|c|d|e)f/i
-    EF
+/^\p{Xan}+?/8
+    \x{6ca}\x{a6c}\x{10a7}_

-/abcd*efg/i
-    ABCDEFG
-
-/ab*/i
-    XABYABBBZ
-    XAYABBBZ
-
-/(ab|cd)e/i
-    ABCDE
-
-/[abhgefdc]ij/i
-    HIJ
-
-/^(ab|cd)e/i
-    ABCDE
-
-/(abc|)ef/i
-    ABCDEF
-
-/(a|b)c*d/i
-    ABCD
-
-/(ab|ab*)bc/i
-    ABC
-
-/a([bc]*)c*/i
-    ABC
-
-/a([bc]*)(c*d)/i
-    ABCD
-
-/a([bc]+)(c*d)/i
-    ABCD
-
-/a([bc]*)(c+d)/i
-    ABCD
-
-/a[bcd]*dcdcde/i
-    ADCDCDE
-
-/a[bcd]+dcdcde/i
-
-/(ab|a)b*c/i
-    ABC
-
-/((a)(b)c)(d)/i
-    ABCD
-
-/[a-zA-Z_][a-zA-Z0-9_]*/i
-    ALPHA
-
-/^a(bc+|b[eh])g|.h$/i
-    ABH
-
-/(bc+d$|ef*g.|h?i(j|k))/i
-    EFFGZ
-    IJ
-    REFFGZ
-    *** Failers
-    ADCDCDE
-    EFFG
-    BCDD
-
-/((((((((((a))))))))))/i
-    A
-
-/(((((((((a)))))))))/i
-    A
-
-/(?:(?:(?:(?:(?:(?:(?:(?:(?:(a))))))))))/i
-    A
-
-/(?:(?:(?:(?:(?:(?:(?:(?:(?:(a|b|c))))))))))/i
-    C
-
-/multiple words of text/i
-    *** Failers
-    AA
-    UH-UH
-
-/multiple words/i
-    MULTIPLE WORDS, YEAH
-
-/(.*)c(.*)/i
-    ABCDE
-
-/\((.*), (.*)\)/i
-    (A, B)
-
-/[k]/i
-
-/abcd/i
-    ABCD
-
-/a(bc)d/i
-    ABCD
-
-/a[-]?c/i
-    AC
-
-/a(?!b)./
-    abad
-
-/a(?=d)./
-    abad
-
-/a(?=c|d)./
-    abad
-
-/a(?:b|c|d)(.)/
-    ace
-
-/a(?:b|c|d)*(.)/
-    ace
-
-/a(?:b|c|d)+?(.)/
-    ace
-    acdbcdbe
-
-/a(?:b|c|d)+(.)/
-    acdbcdbe
-
-/a(?:b|c|d){2}(.)/
-    acdbcdbe
-
-/a(?:b|c|d){4,5}(.)/
-    acdbcdbe
-
-/a(?:b|c|d){4,5}?(.)/
-    acdbcdbe
-
-/((foo)|(bar))*/
-    foobar
-
-/a(?:b|c|d){6,7}(.)/
-    acdbcdbe
-
-/a(?:b|c|d){6,7}?(.)/
-    acdbcdbe
-
-/a(?:b|c|d){5,6}(.)/
-    acdbcdbe
-
-/a(?:b|c|d){5,6}?(.)/
-    acdbcdbe
-
-/a(?:b|c|d){5,7}(.)/
-    acdbcdbe
-
-/a(?:b|c|d){5,7}?(.)/
-    acdbcdbe
-
-/a(?:b|(c|e){1,2}?|d)+?(.)/
-    ace
-
-/^(.+)?B/
-    AB
-
-/^([^a-z])|(\^)$/
-    .
-
-/^[<>]&/
-    <&OUT
-
-/(?:(f)(o)(o)|(b)(a)(r))*/
-    foobar
-
-/(?<=a)b/
-    ab
-    *** Failers
-    cb
-    b
-
-/(?<!c)b/
-    ab
-    b
-    b
-
-/(?:..)*a/
-    aba
-
-/(?:..)*?a/
-    aba
-
-/^(){3,5}/
-    abc
-
-/^(a+)*ax/
-    aax
-
-/^((a|b)+)*ax/
-    aax
-
-/^((a|bc)+)*ax/
-    aax
-
-/(a|x)*ab/
-    cab
-
-/(a)*ab/
-    cab
-
-/(?:(?i)a)b/
-    ab
-
-/((?i)a)b/
-    ab
-
-/(?:(?i)a)b/
-    Ab
-
-/((?i)a)b/
-    Ab
-
-/(?:(?i)a)b/
-    *** Failers
-    cb
-    aB
-
-/((?i)a)b/
-
-/(?i:a)b/
-    ab
-
-/((?i:a))b/
-    ab
-
-/(?i:a)b/
-    Ab
-
-/((?i:a))b/
-    Ab
-
-/(?i:a)b/
-    *** Failers
-    aB
-    aB
-
-/((?i:a))b/
-
-/(?:(?-i)a)b/i
-    ab
-
-/((?-i)a)b/i
-    ab
-
-/(?:(?-i)a)b/i
-    aB
-
-/((?-i)a)b/i
-    aB
-
-/(?:(?-i)a)b/i
-    *** Failers
-    aB
-    Ab
-
-/((?-i)a)b/i
-
-/(?:(?-i)a)b/i
-    aB
-
-/((?-i)a)b/i
-    aB
-
-/(?:(?-i)a)b/i
-    *** Failers
-    Ab
-    AB
-
-/((?-i)a)b/i
-
-/(?-i:a)b/i
-    ab
-
-/((?-i:a))b/i
-    ab
-
-/(?-i:a)b/i
-    aB
-
-/((?-i:a))b/i
-    aB
-
-/(?-i:a)b/i
-    *** Failers
-    AB
-    Ab
-
-/((?-i:a))b/i
-
-/(?-i:a)b/i
-    aB
-
-/((?-i:a))b/i
-    aB
-
-/(?-i:a)b/i
-    *** Failers
-    Ab
-    AB
-
-/((?-i:a))b/i
-
-/((?-i:a.))b/i
-    *** Failers
-    AB
-    a\nB
-
-/((?s-i:a.))b/i
-    a\nB
-
-/(?:c|d)(?:)(?:a(?:)(?:b)(?:b(?:))(?:b(?:)(?:b)))/
-    cabbbb
-
-/(?:c|d)(?:)(?:aaaaaaaa(?:)(?:bbbbbbbb)(?:bbbbbbbb(?:))(?:bbbbbbbb(?:)(?:bbbbbbbb)))/
-    caaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb
-
-/foo\w*\d{4}baz/
-    foobar1234baz
-
-/x(~~)*(?:(?:F)?)?/
-    x~~
-
-/^a(?#xxx){3}c/
-    aaac
-
-/^a (?#xxx) (?#yyy) {3}c/x
-    aaac
-
-/(?<![cd])b/
-    *** Failers
-    B\nB
-    dbcb
-
-/(?<![cd])[ab]/
-    dbaacb
-
-/(?<!(c|d))b/
-
-/(?<!(c|d))[ab]/
-    dbaacb
-
-/(?<!cd)[ab]/
-    cdaccb
-
-/^(?:a?b?)*$/
-    *** Failers
-    dbcb
-    a--
-
-/((?s)^a(.))((?m)^b$)/
-    a\nb\nc\n
-
-/((?m)^b$)/
-    a\nb\nc\n
-
-/(?m)^b/
-    a\nb\n
-
-/(?m)^(b)/
-    a\nb\n
-
-/((?m)^b)/
-    a\nb\n
-
-/\n((?m)^b)/
-    a\nb\n
-
-/((?s).)c(?!.)/
-    a\nb\nc\n
-    a\nb\nc\n
-
-/((?s)b.)c(?!.)/
-    a\nb\nc\n
-    a\nb\nc\n
-
-/^b/
-
-/()^b/
-    *** Failers
-    a\nb\nc\n
-    a\nb\nc\n
-
-/((?m)^b)/
-    a\nb\nc\n
-
-/(?(?!a)a|b)/
-
-/(?(?!a)b|a)/
-    a
-
-/(?(?=a)b|a)/
-    *** Failers
-    a
-    a
-
-/(?(?=a)a|b)/
-    a
-
-/(\w+:)+/
-    one:
-
-/$(?<=^(a))/
-    a
-
-/([\w:]+::)?(\w+)$/
-    abcd
-    xy:z:::abcd
-
-/^[^bcd]*(c+)/
-    aexycd
-
-/(a*)b+/
-    caab
-
-/([\w:]+::)?(\w+)$/
-    abcd
-    xy:z:::abcd
-    *** Failers
-    abcd:
-    abcd:
-
-/^[^bcd]*(c+)/
-    aexycd
-
-/(>a+)ab/
-
-/(?>a+)b/
-    aaab
-
-/([[:]+)/
-    a:[b]:
-
-/([[=]+)/
-    a=[b]=
-
-/([[.]+)/
-    a.[b].
-
-/((?>a+)b)/
-    aaab
-
-/(?>(a+))b/
-    aaab
-
-/((?>[^()]+)|\([^()]*\))+/
-    ((abc(ade)ufh()()x
-
-/a\Z/
-    *** Failers
-    aaab
-    a\nb\n
-
-/b\Z/
-    a\nb\n
-
-/b\z/
-
-/b\Z/
-    a\nb
-
-/b\z/
-    a\nb
-    *** Failers
+/^\p{Xan}*/8
+    ABCD1234\x{6ca}\x{a6c}\x{10a7}_

-/(?>.*)(?<=(abcd|wxyz))/
-    alphabetabcd
-    endingwxyz
-    *** Failers
-    a rather long string that doesn't end with one of them
-
-/word (?>(?:(?!otherword)[a-zA-Z0-9]+ ){0,30})otherword/
-    word cat dog elephant mussel cow horse canary baboon snake shark otherword
-    word cat dog elephant mussel cow horse canary baboon snake shark
-  
-/word (?>[a-zA-Z0-9]+ ){0,30}otherword/
-    word cat dog elephant mussel cow horse canary baboon snake shark the quick brown fox and the lazy dog and several other words getting close to thirty by now I hope
-
-/(?<=\d{3}(?!999))foo/
-    999foo
-    123999foo 
-    *** Failers
-    123abcfoo
+/^\p{Xan}{2,9}/8
+    ABCD1234\x{6ca}\x{a6c}\x{10a7}_

-/(?<=(?!...999)\d{3})foo/
-    999foo
-    123999foo 
-    *** Failers
-    123abcfoo
-
-/(?<=\d{3}(?!999)...)foo/
-    123abcfoo
-    123456foo 
-    *** Failers
-    123999foo  
+/^\p{Xan}{2,9}?/8
+    \x{6ca}\x{a6c}\x{10a7}_

-/(?<=\d{3}...)(?<!999)foo/
-    123abcfoo   
-    123456foo 
-    *** Failers
-    123999foo  
+/^[\p{Xan}]/8
+    ABCD1234_
+    1234abcd_
+    \x{6ca}
+    \x{a6c}
+    \x{10a7}   
+    ** Failers
+    _ABC   
+ 
+/^[\p{Xan}]+/8
+    ABCD1234\x{6ca}\x{a6c}\x{10a7}_
+    ** Failers
+    _ABC

-/((Z)+|A)*/
-    ZABCDEFG
+/^>\p{Xsp}/8
+    >\x{1680}\x{2028}\x{0b}
+    >\x{a0} 
+    ** Failers
+    \x{0b}

-/(Z()|A)*/
-    ZABCDEFG
+/^>\p{Xsp}+/8
+    > \x{09}\x{0a}\x{0c}\x{0d}\x{a0}\x{1680}\x{2028}\x{0b}

-/(Z(())|A)*/
-    ZABCDEFG
+/^>\p{Xsp}+?/8
+    >\x{1680}\x{2028}\x{0b}

-/((?>Z)+|A)*/
-    ZABCDEFG
-
-/((?>)+|A)*/
-    ZABCDEFG
-
-/a*/g
-    abbab
-
-/^[a-\d]/
-    abcde
-    -things
-    0digit
-    *** Failers
-    bcdef    
-
-/^[\d-a]/
-    abcde
-    -things
-    0digit
-    *** Failers
-    bcdef    
+/^>\p{Xsp}*/8
+    > \x{09}\x{0a}\x{0c}\x{0d}\x{a0}\x{1680}\x{2028}\x{0b}

-/[[:space:]]+/
-    > \x09\x0a\x0c\x0d\x0b<
-     
-/[[:blank:]]+/
-    > \x09\x0a\x0c\x0d\x0b<
-     
-/[\s]+/
-    > \x09\x0a\x0c\x0d\x0b<
-     
-/\s+/
-    > \x09\x0a\x0c\x0d\x0b<
-     
-/a?b/x
-    ab
-
-/(?!\A)x/m
-  a\nxb\n
-
-/(?!^)x/m
-  a\nxb\n
-
-/abc\Qabc\Eabc/
-    abcabcabc
+/^>\p{Xsp}{2,9}/8
+    > \x{09}\x{0a}\x{0c}\x{0d}\x{a0}\x{1680}\x{2028}\x{0b}

-/abc\Q(*+|\Eabc/
-    abc(*+|abc 
-
-/   abc\Q abc\Eabc/x
-    abc abcabc
-    *** Failers
-    abcabcabc  
+/^>\p{Xsp}{2,9}?/8
+    > \x{09}\x{0a}\x{0c}\x{0d}\x{a0}\x{1680}\x{2028}\x{0b}

-/abc#comment
-    \Q#not comment
-    literal\E/x
-    abc#not comment\n    literal     
+/^>[\p{Xsp}]/8
+    >\x{2028}\x{0b}
+ 
+/^>[\p{Xsp}]+/8
+    > \x{09}\x{0a}\x{0c}\x{0d}\x{a0}\x{1680}\x{2028}\x{0b}

-/abc#comment
-    \Q#not comment
-    literal/x
-    abc#not comment\n    literal     
+/^>\p{Xps}/8
+    >\x{1680}\x{2028}\x{0b}
+    >\x{a0} 
+    ** Failers
+    \x{0b}

-/abc#comment
-    \Q#not comment
-    literal\E #more comment
-    /x
-    abc#not comment\n    literal     
+/^>\p{Xps}+/8
+    > \x{09}\x{0a}\x{0c}\x{0d}\x{a0}\x{1680}\x{2028}\x{0b}

-/abc#comment
-    \Q#not comment
-    literal\E #more comment/x
-    abc#not comment\n    literal     
+/^>\p{Xps}+?/8
+    >\x{1680}\x{2028}\x{0b}

-/\Qabc\$xyz\E/
-    abc\\\$xyz
-
-/\Qabc\E\$\Qxyz\E/
-    abc\$xyz
-
-/\Gabc/
-    abc
-    *** Failers
-    xyzabc  
-
-/\Gabc./g
-    abc1abc2xyzabc3
-
-/abc./g
-    abc1abc2xyzabc3 
-
-/a(?x: b c )d/
-    XabcdY
-    *** Failers 
-    Xa b c d Y 
-
-/((?x)x y z | a b c)/
-    XabcY
-    AxyzB 
-
-/(?i)AB(?-i)C/
-    XabCY
-    *** Failers
-    XabcY  
-
-/((?i)AB(?-i)C|D)E/
-    abCE
-    DE
-    *** Failers
-    abcE
-    abCe  
-    dE
-    De    
-
-/[z\Qa-d]\E]/
-    z
-    a
-    -
-    d
-    ] 
-    *** Failers
-    b     
-
-/[\z\C]/
-    z
-    C 
+/^>\p{Xps}*/8
+    > \x{09}\x{0a}\x{0c}\x{0d}\x{a0}\x{1680}\x{2028}\x{0b}

-/\M/
-    M 
+/^>\p{Xps}{2,9}/8
+    > \x{09}\x{0a}\x{0c}\x{0d}\x{a0}\x{1680}\x{2028}\x{0b}

-/(a+)*b/
-    aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa 
+/^>\p{Xps}{2,9}?/8
+    > \x{09}\x{0a}\x{0c}\x{0d}\x{a0}\x{1680}\x{2028}\x{0b}

-/(?i)reg(?:ul(?:[a\xE4]|ae)r|ex)/
-    REGular
-    regulaer
-    Regex  
-    regul\xE4r 
+/^>[\p{Xps}]/8
+    >\x{2028}\x{0b}
+ 
+/^>[\p{Xps}]+/8
+    > \x{09}\x{0a}\x{0c}\x{0d}\x{a0}\x{1680}\x{2028}\x{0b}

-/\xC5\xE6\xE5\xE4[\xE0-\xFF\xC0-\xDF]+/
-    \xC5\xE6\xE5\xE4\xE0
-    \xC5\xE6\xE5\xE4\xFF
-    \xC5\xE6\xE5\xE4\xC0
-    \xC5\xE6\xE5\xE4\xDF
+/^\p{Xwd}/8
+    ABCD
+    1234
+    \x{6ca}
+    \x{a6c}
+    \x{10a7}
+    _ABC    
+    ** Failers
+    []

-/(?<=Z)X./
-    \x84XAZXB
+/^\p{Xwd}+/8
+    ABCD1234\x{6ca}\x{a6c}\x{10a7}_

-/^(?(2)a|(1)(2))+$/
-    123a
+/^\p{Xwd}+?/8
+    \x{6ca}\x{a6c}\x{10a7}_

-/(?<=a|bbbb)c/
-    ac
-    bbbbc
-
-/abc/SS>testsavedregex
-<testsavedregex
-    abc
-    *** Failers
-    bca
+/^\p{Xwd}*/8
+    ABCD1234\x{6ca}\x{a6c}\x{10a7}_

-/abc/FSS>testsavedregex
-<testsavedregex
-    abc
-    *** Failers
-    bca
-
-/(a|b)/S>testsavedregex
-<testsavedregex
-    abc
-    *** Failers
-    def  
+/^\p{Xwd}{2,9}/8
+    A_B12\x{6ca}\x{a6c}\x{10a7}

-/(a|b)/SF>testsavedregex
-<testsavedregex
-    abc
-    *** Failers
-    def  
+/^\p{Xwd}{2,9}?/8
+    \x{6ca}\x{a6c}\x{10a7}_

-/line\nbreak/
-    this is a line\nbreak
-    line one\nthis is a line\nbreak in the second line 
+/^[\p{Xwd}]/8
+    ABCD1234_
+    1234abcd_
+    \x{6ca}
+    \x{a6c}
+    \x{10a7}   
+    _ABC 
+    ** Failers
+    []   
+ 
+/^[\p{Xwd}]+/8
+    ABCD1234\x{6ca}\x{a6c}\x{10a7}_

-/line\nbreak/f
-    this is a line\nbreak
-    ** Failers 
-    line one\nthis is a line\nbreak in the second line 
+/-- A check not in UTF-8 mode --/

-/line\nbreak/mf
-    this is a line\nbreak
-    ** Failers 
-    line one\nthis is a line\nbreak in the second line 
-
-/1234/
-    123\P
-    a4\P\R
-
-/1234/
-    123\P
-    4\P\R
-
-/^/mg
-    a\nb\nc\n
-    \ 
+/^[\p{Xwd}]+/
+    ABCD1234_

-/(?<=C\n)^/mg
-    A\nC\nC\n 
+/-- Some negative checks --/

-/(?s)A?B/
-    AB
-    aB  
+/^[\P{Xwd}]+/8
+    !.+\x{019}\x{35a}AB

-/(?s)A*B/
-    AB
-    aB  
+/^[\p{^Xwd}]+/8
+    !.+\x{019}\x{35a}AB

-/(?m)A?B/
-    AB
-    aB  
+/[\D]/WBZ8
+    1\x{3c8}2

-/(?m)A*B/
-    AB
-    aB  
+/[\d]/WBZ8
+    >\x{6f4}<

-/Content-Type\x3A[^\r\n]{6,}/
-    Content-Type:xxxxxyyy 
+/[\S]/WBZ8
+    \x{1680}\x{6f4}\x{1680}

-/Content-Type\x3A[^\r\n]{6,}z/
-    Content-Type:xxxxxyyyz
+/[\s]/WBZ8
+    >\x{1680}<

-/Content-Type\x3A[^a]{6,}/
-    Content-Type:xxxyyy 
+/[\W]/WBZ8
+    A\x{1712}B

-/Content-Type\x3A[^a]{6,}z/
-    Content-Type:xxxyyyz
+/[\w]/WBZ8
+    >\x{1723}<

-/^abc/m
-    xyz\nabc
-    xyz\nabc\<lf>
-    xyz\r\nabc\<lf>
-    xyz\rabc\<cr>
-    xyz\r\nabc\<crlf>
-    ** Failers 
-    xyz\nabc\<cr>
-    xyz\r\nabc\<cr>
-    xyz\nabc\<crlf>
-    xyz\rabc\<crlf>
-    xyz\rabc\<lf>
-    
-/abc$/m<lf>
-    xyzabc
-    xyzabc\n 
-    xyzabc\npqr 
-    xyzabc\r\<cr> 
-    xyzabc\rpqr\<cr> 
-    xyzabc\r\n\<crlf> 
-    xyzabc\r\npqr\<crlf> 
-    ** Failers
-    xyzabc\r 
-    xyzabc\rpqr 
-    xyzabc\r\n 
-    xyzabc\r\npqr 
-    
-/^abc/m<cr>
-    xyz\rabcdef
-    xyz\nabcdef\<lf>
-    ** Failers  
-    xyz\nabcdef
-       
-/^abc/m<lf>
-    xyz\nabcdef
-    xyz\rabcdef\<cr>
-    ** Failers  
-    xyz\rabcdef
-       
-/^abc/m<crlf>
-    xyz\r\nabcdef
-    xyz\rabcdef\<cr>
-    ** Failers  
-    xyz\rabcdef
-    
-/.*/<lf>
-    abc\ndef
-    abc\rdef
-    abc\r\ndef
-    \<cr>abc\ndef
-    \<cr>abc\rdef
-    \<cr>abc\r\ndef
-    \<crlf>abc\ndef
-    \<crlf>abc\rdef
-    \<crlf>abc\r\ndef
+/\D/WBZ8
+    1\x{3c8}2

-/\w+(.)(.)?def/s
-    abc\ndef
-    abc\rdef
-    abc\r\ndef
+/\d/WBZ8
+    >\x{6f4}<

-/^\w+=.*(\\\n.*)*/
-    abc=xyz\\\npqr
+/\S/WBZ8
+    \x{1680}\x{6f4}\x{1680}

-/^(a()*)*/
-    aaaa
+/\s/WBZ8
+    >\x{1680}>

-/^(?:a(?:(?:))*)*/
-    aaaa
+/\W/WBZ8
+    A\x{1712}B

-/^(a()+)+/
-    aaaa
+/\w/WBZ8
+    >\x{1723}<

-/^(?:a(?:(?:))+)+/
-    aaaa
+/[[:alpha:]]/WBZ

-/(a|)*\d/
- aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
- aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa4
+/[[:lower:]]/WBZ

-/(?>a|)*\d/
- aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
- aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa4
+/[[:upper:]]/WBZ

-/(?:a|)*\d/
- aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
- aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa4
+/[[:alnum:]]/WBZ

-/^a.b/<lf>
-    a\rb
-    a\nb\<cr> 
-    ** Failers
-    a\nb
-    a\nb\<any>
-    a\rb\<cr>   
-    a\rb\<any>   
+/[[:ascii:]]/WBZ

-/^abc./mgx<any>
-    abc1 \x0aabc2 \x0babc3xx \x0cabc4 \x0dabc5xx \x0d\x0aabc6 \x85abc7 JUNK
+/[[:cntrl:]]/WBZ

-/abc.$/mgx<any>
-    abc1\x0a abc2\x0b abc3\x0c abc4\x0d abc5\x0d\x0a abc6\x85 abc9
+/[[:digit:]]/WBZ

-/^a\Rb/<bsr_unicode>
-    a\nb
-    a\rb
-    a\r\nb
-    a\x0bb
-    a\x0cb
-    a\x85b   
-    ** Failers
-    a\n\rb    
+/[[:graph:]]/WBZ

-/^a\R*b/<bsr_unicode>
-    ab
-    a\nb
-    a\rb
-    a\r\nb
-    a\x0bb
-    a\x0cb
-    a\x85b   
-    a\n\rb    
-    a\n\r\x85\x0cb 
+/[[:print:]]/WBZ

-/^a\R+b/<bsr_unicode>
-    a\nb
-    a\rb
-    a\r\nb
-    a\x0bb
-    a\x0cb
-    a\x85b   
-    a\n\rb    
-    a\n\r\x85\x0cb 
-    ** Failers
-    ab  
-    
-/^a\R{1,3}b/<bsr_unicode>
-    a\nb
-    a\n\rb
-    a\n\r\x85b
-    a\r\n\r\nb 
-    a\r\n\r\n\r\nb 
-    a\n\r\n\rb
-    a\n\n\r\nb 
-    ** Failers
-    a\n\n\n\rb
-    a\r
+/[[:punct:]]/WBZ

-/^a[\R]b/<bsr_unicode>
-    aRb
-    ** Failers
-    a\nb  
+/[[:space:]]/WBZ

-/.+foo/
-    afoo
-    ** Failers 
-    \r\nfoo 
-    \nfoo 
+/[[:word:]]/WBZ

-/.+foo/<crlf>
-    afoo
-    \nfoo 
-    ** Failers 
-    \r\nfoo 
+/[[:xdigit:]]/WBZ

-/.+foo/<any>
-    afoo
-    ** Failers 
-    \nfoo 
-    \r\nfoo 
+/-- Unicode properties for \b abd \B --/

-/.+foo/s
-    afoo
-    \r\nfoo 
-    \nfoo 
+/\b...\B/8W
+    abc_
+    \x{37e}abc\x{376} 
+    \x{37e}\x{376}\x{371}\x{393}\x{394} 
+    !\x{c0}++\x{c1}\x{c2} 
+    !\x{c0}+++++

-/^$/mg<any>
-    abc\r\rxyz
-    abc\n\rxyz  
-    ** Failers 
-    abc\r\nxyz
+/-- Without PCRE_UCP, non-ASCII always fail, even if < 256  --/

-/^X/m
-    XABC
+/\b...\B/8
+    abc_
     ** Failers 
-    XABC\B
+    \x{37e}abc\x{376} 
+    \x{37e}\x{376}\x{371}\x{393}\x{394} 
+    !\x{c0}++\x{c1}\x{c2} 
+    !\x{c0}+++++

-/(?m)^$/<any>g+
-    abc\r\n\r\n
+/-- With PCRE_UCP, non-UTF8 chars that are < 256 still check properties  --/

-/(?m)^$|^\r\n/<any>g+ 
-    abc\r\n\r\n
-    
-/(?m)$/<any>g+ 
-    abc\r\n\r\n
+/\b...\B/W
+    abc_
+    !\x{c0}++\x{c1}\x{c2} 
+    !\x{c0}+++++

-/(?|(abc)|(xyz))/
- >abc<
- >xyz<
+/-- Some of these are silly, but they check various combinations --/

-/(x)(?|(abc)|(xyz))(x)/
-    xabcx
-    xxyzx 
+/[[:^alpha:][:^cntrl:]]+/8WBZ
+    123
+    abc

-/(x)(?|(abc)(pqr)|(xyz))(x)/
-    xabcpqrx
-    xxyzx 
+/[[:^cntrl:][:^alpha:]]+/8WBZ
+    123
+    abc

-/(?|(abc)|(xyz))(?1)/
-    abcabc
-    xyzabc 
-    ** Failers 
-    xyzxyz 
- 
-/\H\h\V\v/
-    X X\x0a
-    X\x09X\x0b
-    ** Failers
-    \xa0 X\x0a   
-    
-/\H*\h+\V?\v{3,4}/ 
-    \x09\x20\xa0X\x0a\x0b\x0c\x0d\x0a
-    \x09\x20\xa0\x0a\x0b\x0c\x0d\x0a
-    \x09\x20\xa0\x0a\x0b\x0c
-    ** Failers 
-    \x09\x20\xa0\x0a\x0b
-     
-/\H{3,4}/
-    XY  ABCDE
-    XY  PQR ST 
-    
-/.\h{3,4}./
-    XY  AB    PQRS
+/[[:alpha:]]+/8WBZ
+    abc

-/\h*X\h?\H+Y\H?Z/
-    >XNNNYZ
-    >  X NYQZ
-    ** Failers
-    >XYZ   
-    >  X NY Z
-
-/\v*X\v?Y\v+Z\V*\x0a\V+\x0b\V{2,3}\x0c/
-    >XY\x0aZ\x0aA\x0bNN\x0c
-    >\x0a\x0dX\x0aY\x0a\x0bZZZ\x0aAAA\x0bNNN\x0c
-
-/.+A/<crlf>
-    \r\nA
-    
-/\nA/<crlf>
-    \r\nA 
-
-/[\r\n]A/<crlf>
-    \r\nA 
-
-/(\r|\n)A/<crlf>
-    \r\nA 
-
-/a\Rb/I<bsr_anycrlf>
-    a\rb
-    a\nb
-    a\r\nb
-    ** Failers
-    a\x85b
-    a\x0bb     
-
-/a\Rb/I<bsr_unicode>
-    a\rb
-    a\nb
-    a\r\nb
-    a\x85b
-    a\x0bb     
-    ** Failers 
-    a\x85b\<bsr_anycrlf>
-    a\x0bb\<bsr_anycrlf>
-    
-/a\R?b/I<bsr_anycrlf>
-    a\rb
-    a\nb
-    a\r\nb
-    ** Failers
-    a\x85b
-    a\x0bb     
-
-/a\R?b/I<bsr_unicode>
-    a\rb
-    a\nb
-    a\r\nb
-    a\x85b
-    a\x0bb     
-    ** Failers 
-    a\x85b\<bsr_anycrlf>
-    a\x0bb\<bsr_anycrlf>
-    
-/a\R{2,4}b/I<bsr_anycrlf>
-    a\r\n\nb
-    a\n\r\rb
-    a\r\n\r\n\r\n\r\nb
-    ** Failers
-    a\x85\85b
-    a\x0b\0bb     
-
-/a\R{2,4}b/I<bsr_unicode>
-    a\r\rb
-    a\n\n\nb
-    a\r\n\n\r\rb
-    a\x85\85b
-    a\x0b\0bb     
-    ** Failers 
-    a\r\r\r\r\rb 
-    a\x85\85b\<bsr_anycrlf>
-    a\x0b\0bb\<bsr_anycrlf>
-    
-/a(?!)|\wbc/
+/[[:^alpha:]\S]+/8WBZ
+    123
     abc

-/a[]b/<JS>
-    ** Failers
-    ab
+/[^\d]+/8WBZ
+    abc123
+    abc\x{123}
+    \x{660}abc

-/a[]+b/<JS>
-    ** Failers
-    ab 
+/\p{Lu}+9\p{Lu}+B\p{Lu}+b/BZ

-/a[]*+b/<JS>
-    ** Failers
-    ab 
+/\p{^Lu}+9\p{^Lu}+B\p{^Lu}+b/BZ

-/a[^]b/<JS>
-    aXb
-    a\nb 
-    ** Failers
-    ab  
-    
-/a[^]+b/<JS> 
-    aXb
-    a\nX\nXb 
-    ** Failers
-    ab  
+/\P{Lu}+9\P{Lu}+B\P{Lu}+b/BZ

-/X$/E
-    X
-    ** Failers 
-    X\n 
+/\p{Han}+X\p{Greek}+\x{370}/BZ8

-/X$/
-    X
-    X\n 
+/\p{Xan}+!\p{Xan}+A/BZ

-/xyz/C
- xyz
- abcxyz
- abcxyz\Y
- ** Failers
- abc
- abc\Y
- abcxypqr
- abcxypqr\Y
+/\p{Xsp}+!\p{Xsp}\t/BZ

-/(*NO_START_OPT)xyz/C
-  abcxyz 
-  
-/(?C)ab/
-  ab
-  \C-ab
-  
-/ab/C
-  ab
-  \C-ab    
+/\p{Xps}+!\p{Xps}\t/BZ

-/^"((?(?=[a])[^"])|b)*"$/C
-    "ab"
-    \C-"ab"
+/\p{Xwd}+!\p{Xwd}_/BZ

-/\d+X|9+Y/
-    ++++123999\P
-    ++++123999Y\P
+/A+\p{N}A+\dB+\p{N}*B+\d*/WBZ

-/Z(*F)/
-    Z\P
-    ZA\P 
-    
-/Z(?!)/
-    Z\P 
-    ZA\P 
+/-- These behaved oddly in Perl, so they are kept in this test --/

-/dog(sbody)?/
-    dogs\P
-    dogs\P\P 
-    
-/dog(sbody)??/
-    dogs\P
-    dogs\P\P 
+/(\x{23a}\x{23a}\x{23a})?\1/8i
+    \x{23a}\x{23a}\x{23a}\x{2c65}\x{2c65}

-/dog|dogsbody/
-    dogs\P
-    dogs\P\P 
- 
-/dogsbody|dog/
-    dogs\P
-    dogs\P\P 
+/(ȺȺȺ)?\1/8i
+    ȺȺȺⱥⱥ

-/Z(*F)Q|ZXY/
-    Z\P
-    ZA\P 
-    X\P 
+/(\x{23a}\x{23a}\x{23a})?\1/8i
+    \x{23a}\x{23a}\x{23a}\x{2c65}\x{2c65}\x{2c65}

-/\bthe cat\b/
-    the cat\P
-    the cat\P\P
+/(ȺȺȺ)?\1/8i
+    ȺȺȺⱥⱥⱥ

-/dog(sbody)?/
-    dogs\D\P
-    body\D\R
+/(\x{23a}\x{23a}\x{23a})\1/8i
+    \x{23a}\x{23a}\x{23a}\x{2c65}\x{2c65}

-/dog(sbody)?/
-    dogs\D\P\P
-    body\D\R
+/(ȺȺȺ)\1/8i
+    ȺȺȺⱥⱥ

-/abc/
-   abc\P
-   abc\P\P
+/(\x{23a}\x{23a}\x{23a})\1/8i
+    \x{23a}\x{23a}\x{23a}\x{2c65}\x{2c65}\x{2c65}

-/abc\K123/
-    xyzabc123pqr
-    
-/(?<=abc)123/
-    xyzabc123pqr 
-    xyzabc12\P
-    xyzabc12\P\P
+/(ȺȺȺ)\1/8i
+    ȺȺȺⱥⱥⱥ

-/\babc\b/
-    +++abc+++
-    +++ab\P
-    +++ab\P\P  
-
-/(?=C)/g+
-    ABCDECBA
-
-/(abc|def|xyz)/I
-    terhjk;abcdaadsfe
-    the quick xyz brown fox 
-    \Yterhjk;abcdaadsfe
-    \Ythe quick xyz brown fox 
-    ** Failers
-    thejk;adlfj aenjl;fda asdfasd ehj;kjxyasiupd
-    \Ythejk;adlfj aenjl;fda asdfasd ehj;kjxyasiupd
-
-/(abc|def|xyz)/SI
-    terhjk;abcdaadsfe
-    the quick xyz brown fox 
-    \Yterhjk;abcdaadsfe
-    \Ythe quick xyz brown fox 
-    ** Failers
-    thejk;adlfj aenjl;fda asdfasd ehj;kjxyasiupd
-    \Ythejk;adlfj aenjl;fda asdfasd ehj;kjxyasiupd
-
-/abcd*/+
-    xxxxabcd\P
-    xxxxabcd\P\P
-    dddxxx\R 
-    xxxxabcd\P\P
-    xxx\R 
-
-/abcd*/i
-    xxxxabcd\P
-    xxxxabcd\P\P
-    XXXXABCD\P
-    XXXXABCD\P\P
-
-/abc\d*/
-    xxxxabc1\P
-    xxxxabc1\P\P
-
-/abc[de]*/
-    xxxxabcde\P
-    xxxxabcde\P\P
-
-/(?:(?1)|B)(A(*F)|C)/
-    ABCD
-    CCD
-    ** Failers
-    CAD   
-
-/^(?:(?1)|B)(A(*F)|C)/
-    CCD
-    BCD 
-    ** Failers
-    ABCD
-    CAD
-    BAD    
-
-/^(?!a(*SKIP)b)/
-    ac
+/(\x{2c65}\x{2c65})\1/8i
+    \x{2c65}\x{2c65}\x{23a}\x{23a}

-/^(?=a(*SKIP)b|ac)/
-    ** Failers
-    ac
+/(ⱥⱥ)\1/8i
+    ⱥⱥȺȺ

-/^(?=a(*THEN)b|ac)/
-    ac
-    
-/^(?=a(*PRUNE)b)/
-    ab  
-    ** Failers 
-    ac
+/(\x{23a}\x{23a}\x{23a})\1Y/8i
+    X\x{23a}\x{23a}\x{23a}\x{2c65}\x{2c65}\x{2c65}YZ

-/^(?(?!a(*SKIP)b))/
-    ac
+/(\x{2c65}\x{2c65})\1Y/8i
+    X\x{2c65}\x{2c65}\x{23a}\x{23a}YZ

-/(?<=abc)def/
-    abc\P\P
+/-- --/

-/abc$/
-    abc
-    abc\P
-    abc\P\P
+/-- These scripts weren't yet in Perl when I added Unicode 6.0.0 to PCRE --/

-/abc$/m
-    abc
-    abc\n
-    abc\P\P
-    abc\n\P\P 
-    abc\P
-    abc\n\P
-
-/abc\z/
-    abc
-    abc\P
-    abc\P\P
-
-/abc\Z/
-    abc
-    abc\P
-    abc\P\P
-
-/abc\b/
-    abc
-    abc\P
-    abc\P\P
-
-/abc\B/
-    abc
-    abc\P
-    abc\P\P
-
-/.+/
-    abc\>0
-    abc\>1
-    abc\>2
-    abc\>3
-    abc\>4
-    abc\>-4 
-
-/^(?:a)++\w/
-     aaaab
-     ** Failers 
-     aaaa 
-     bbb 
-
-/^(?:aa|(?:a)++\w)/
-     aaaab
-     aaaa 
-     ** Failers 
-     bbb 
-
-/^(?:a)*+\w/
-     aaaab
-     bbb 
-     ** Failers 
-     aaaa 
-
-/^(a)++\w/
-     aaaab
-     ** Failers 
-     aaaa 
-     bbb 
-
-/^(a|)++\w/
-     aaaab
-     ** Failers 
-     aaaa 
-     bbb 
-
-/(?=abc){3}abc/+
-    abcabcabc
+/^[\p{Batak}]/8
+    \x{1bc0}
+    \x{1bff}
     ** Failers
-    xyz  
+    \x{1bf4}

-/(?=abc)+abc/+
-    abcabcabc
+/^[\p{Brahmi}]/8
+    \x{11000}
+    \x{1106f}
     ** Failers
-    xyz  
+    \x{1104e}

-/(?=abc)++abc/+
-    abcabcabc
+/^[\p{Mandaic}]/8
+    \x{840}
+    \x{85e}
     ** Failers
-    xyz  
-    
-/(?=abc){0}xyz/
-    xyz 
+    \x{85c}
+    \x{85d}

-/(?=abc){1}xyz/
-    ** Failers
-    xyz 
-    
-/(?=(a))?./
-    ab
-    bc
-      
-/(?=(a))??./
-    ab
-    bc
+/-- --/

-/^(?=(a)){0}b(?1)/
-    backgammon
+/(\X*)(.)/s8
+    A\x{300}

-/^(?=(?1))?[az]([abc])d/
-    abd 
-    zcdxx 
+/^S(\X*)e(\X*)$/8
+    Stéréo
+    
+/^\X/8 
+    ́réo

-/^(?!a){0}\w+/
-    aaaaa
+/^a\X41z/<JS>
+    aX41z
+    *** Failers
+    aAz

-/(?<=(abc))?xyz/
-    abcxyz
-    pqrxyz 
+/(?<=ab\Cde)X/8

-/((?2))((?1))/
-    abc
-
-/(?(R)a+|(?R)b)/
-    aaaabcde
-
-/(?(R)a+|((?R))b)/
-    aaaabcde
-
-/((?(R)a+|(?1)b))/
-    aaaabcde
-
-/((?(R2)a+|(?1)b))/
-    aaaabcde
-
-/(?(R)a*(?1)|((?R))b)/
-    aaaabcde
-
-/(a+)/
-    \O6aaaa
-    \O8aaaa
-
-/ab\Cde/
-    abXde
-    
-/(?<=ab\Cde)X/
-    abZdeX
-
 /-- End of testinput7 --/

Modified: code/trunk/testdata/testinput8
===================================================================
--- code/trunk/testdata/testinput8    2011-12-28 16:10:09 UTC (rev 835)
+++ code/trunk/testdata/testinput8    2011-12-28 17:16:11 UTC (rev 836)
@@ -1,590 +1,4199 @@
-/-- This set of tests checks UTF-8 support with the DFA matching functionality
-    of pcre_dfa_exec(). The -dfa flag must be used with pcretest when running 
-    it. --/
-
-/\x{100}ab/8
-  \x{100}ab
-  
-/a\x{100}*b/8
+/-- This set of tests check the DFA matching functionality of pcre_dfa_exec().
+    The -dfa flag must be used with pcretest when running it. --/
+     
+/abc/
+    abc
+    
+/ab*c/
+    abc
+    abbbbc
+    ac
+    
+/ab+c/
+    abc
+    abbbbbbc
+    *** Failers 
+    ac
     ab
-    a\x{100}b  
-    a\x{100}\x{100}b

-/a\x{100}+b/8
-    a\x{100}b  
-    a\x{100}\x{100}b  
+/a*/
+    a
+    aaaaaaaaaaaaaaaaa
+    aaaaaaaaaaaaaaaaaaaaaaaaaaaaaa 
+    aaaaaaaaaaaaaaaaaaaaaaaaaaaaaa\F 
+    
+/(a|abcd|african)/
+    a
+    abcd
+    african
+    
+/^abc/
+    abcdef
+    *** Failers
+    xyzabc
+    xyz\nabc    
+    
+/^abc/m
+    abcdef
+    xyz\nabc    
+    *** Failers
+    xyzabc
+    
+/\Aabc/
+    abcdef
+    *** Failers
+    xyzabc
+    xyz\nabc    
+    
+/\Aabc/m
+    abcdef
+    *** Failers
+    xyzabc
+    xyz\nabc    
+    
+/\Gabc/
+    abcdef
+    xyzabc\>3
+    *** Failers
+    xyzabc    
+    xyzabc\>2 
+    
+/x\dy\Dz/
+    x9yzz
+    x0y+z
+    *** Failers
+    xyz
+    xxy0z     
+    
+/x\sy\Sz/
+    x yzz
+    x y+z
+    *** Failers
+    xyz
+    xxyyz
+    
+/x\wy\Wz/
+    xxy+z
+    *** Failers
+    xxy0z
+    x+y+z         
+    
+/x.y/
+    x+y
+    x-y
+    *** Failers
+    x\ny
+    
+/x.y/s
+    x+y
+    x-y
+    x\ny
+
+/(a.b(?s)c.d|x.y)p.q/
+    a+bc+dp+q
+    a+bc\ndp+q
+    x\nyp+q 
     *** Failers 
-    ab
+    a\nbc\ndp+q
+    a+bc\ndp\nq
+    x\nyp\nq 
+
+/a\d\z/
+    ba0
+    *** Failers
+    ba0\n
+    ba0\ncd   
+
+/a\d\z/m
+    ba0
+    *** Failers
+    ba0\n
+    ba0\ncd   
+
+/a\d\Z/
+    ba0
+    ba0\n
+    *** Failers
+    ba0\ncd   
+
+/a\d\Z/m
+    ba0
+    ba0\n
+    *** Failers
+    ba0\ncd   
+
+/a\d$/
+    ba0
+    ba0\n
+    *** Failers
+    ba0\ncd   
+
+/a\d$/m
+    ba0
+    ba0\n
+    ba0\ncd   
+    *** Failers
+
+/abc/i
+    abc
+    aBc
+    ABC
+    
+/[^a]/
+    abcd
+    
+/ab?\w/
+    abz
+    abbz
+    azz  
+
+/x{0,3}yz/
+    ayzq
+    axyzq
+    axxyz
+    axxxyzq
+    axxxxyzq
+    *** Failers
+    ax
+    axx     
+      
+/x{3}yz/
+    axxxyzq
+    axxxxyzq
+    *** Failers
+    ax
+    axx     
+    ayzq
+    axyzq
+    axxyz
+      
+/x{2,3}yz/
+    axxyz
+    axxxyzq
+    axxxxyzq
+    *** Failers
+    ax
+    axx     
+    ayzq
+    axyzq
+      
+/[^a]+/
+    bac
+    bcdefax
+    *** Failers
+    aaaaa   
+
+/[^a]*/
+    bac
+    bcdefax
+    *** Failers
+    aaaaa   
+    
+/[^a]{3,5}/
+    xyz
+    awxyza
+    abcdefa
+    abcdefghijk
+    *** Failers
+    axya
+    axa
+    aaaaa         
+
+/\d*/
+    1234b567
+    xyz
+    
+/\D*/
+    a1234b567
+    xyz

-/\bX/8
-    Xoanon
-    +Xoanon
-    \x{300}Xoanon 
+/\d+/
+    ab1234c56
+    *** Failers
+    xyz
+    
+/\D+/
+    ab123c56
+    *** Failers
+    789
+    
+/\d?A/
+    045ABC
+    ABC
+    *** Failers
+    XYZ
+    
+/\D?A/
+    ABC
+    BAC
+    9ABC             
+    *** Failers
+
+/a+/
+    aaaa
+
+/^.*xyz/
+    xyz
+    ggggggggxyz
+    
+/^.+xyz/
+    abcdxyz
+    axyz
+    *** Failers
+    xyz
+    
+/^.?xyz/
+    xyz
+    cxyz       
+
+/^\d{2,3}X/
+    12X
+    123X
+    *** Failers
+    X
+    1X
+    1234X     
+
+/^[abcd]\d/
+    a45
+    b93
+    c99z
+    d04
+    *** Failers
+    e45
+    abcd      
+    abcd1234
+    1234  
+
+/^[abcd]*\d/
+    a45
+    b93
+    c99z
+    d04
+    abcd1234
+    1234  
+    *** Failers
+    e45
+    abcd      
+
+/^[abcd]+\d/
+    a45
+    b93
+    c99z
+    d04
+    abcd1234
+    *** Failers
+    1234  
+    e45
+    abcd      
+
+/^a+X/
+    aX
+    aaX 
+
+/^[abcd]?\d/
+    a45
+    b93
+    c99z
+    d04
+    1234  
+    *** Failers
+    abcd1234
+    e45
+
+/^[abcd]{2,3}\d/
+    ab45
+    bcd93
+    *** Failers
+    1234 
+    a36 
+    abcd1234
+    ee45
+
+/^(abc)*\d/
+    abc45
+    abcabcabc45
+    42xyz 
+    *** Failers
+
+/^(abc)+\d/
+    abc45
+    abcabcabc45
+    *** Failers
+    42xyz 
+
+/^(abc)?\d/
+    abc45
+    42xyz 
+    *** Failers
+    abcabcabc45
+
+/^(abc){2,3}\d/
+    abcabc45
+    abcabcabc45
+    *** Failers
+    abcabcabcabc45
+    abc45
+    42xyz 
+
+/1(abc|xyz)2(?1)3/
+    1abc2abc3456
+    1abc2xyz3456 
+
+/^(a*\w|ab)=(a*\w|ab)/
+    ab=ab
+
+/^(a*\w|ab)=(?1)/
+    ab=ab
+
+/^([^()]|\((?1)*\))*$/
+    abc
+    a(b)c
+    a(b(c))d  
+    *** Failers)
+    a(b(c)d  
+
+/^>abc>([^()]|\((?1)*\))*<xyz<$/
+    >abc>123<xyz<
+    >abc>1(2)3<xyz<
+    >abc>(1(2)3)<xyz<
+
+/^(?>a*)\d/
+    aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa9876
     *** Failers 
-    YXoanon  
+    aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
+
+/< (?: (?(R) \d++  | [^<>]*+) | (?R)) * >/x
+    <>
+    <abcd>
+    <abc <123> hij>
+    <abc <def> hij>
+    <abc<>def> 
+    <abc<>      
+    *** Failers
+    <abc
+
+/^(?(?=abc)\w{3}:|\d\d)$/        
+    abc:                          
+    12                             
+    *** Failers                     
+    123                       
+    xyz                        
+                                
+/^(?(?!abc)\d\d|\w{3}:)$/      
+    abc:                        
+    12         
+    *** Failers
+    123
+    xyz    
+
+/^(?=abc)\w{5}:$/        
+    abcde:                          
+    *** Failers                     
+    abc.. 
+    123                       
+    vwxyz                        
+                                
+/^(?!abc)\d\d$/      
+    12         
+    *** Failers
+    abcde:
+    abc..  
+    123
+    vwxyz    
+
+/(?<=abc|xy)123/
+    abc12345
+    wxy123z
+    *** Failers
+    123abc
+
+/(?<!abc|xy)123/
+    123abc
+    mno123456 
+    *** Failers
+    abc12345
+    wxy123z
+
+/abc(?C1)xyz/
+    abcxyz
+    123abcxyz999 
+
+/(ab|cd){3,4}/C
+  ababab
+  abcdabcd
+  abcdcdcdcdcd  
+
+/^abc/
+    abcdef
+    *** Failers
+    abcdef\B  
+
+/^(a*|xyz)/
+    bcd
+    aaabcd
+    xyz
+    xyz\N  
+    *** Failers
+    bcd\N

-/\BX/8
-    YXoanon
+/xyz$/
+    xyz
+    xyz\n
     *** Failers
-    Xoanon
-    +Xoanon    
-    \x{300}Xoanon 
+    xyz\Z
+    xyz\n\Z    
+    
+/xyz$/m
+    xyz
+    xyz\n 
+    abcxyz\npqr 
+    abcxyz\npqr\Z 
+    xyz\n\Z    
+    *** Failers
+    xyz\Z

-/X\b/8
-    X+oanon
-    ZX\x{300}oanon 
-    FAX 
+/\Gabc/
+    abcdef
+    defabcxyz\>3 
     *** Failers 
-    Xoanon  
+    defabcxyz
+
+/^abcdef/
+    ab\P
+    abcde\P
+    abcdef\P
+    *** Failers
+    abx\P    
+
+/^a{2,4}\d+z/
+    a\P
+    aa\P
+    aa2\P 
+    aaa\P
+    aaa23\P 
+    aaaa12345\P
+    aa0z\P
+    aaaa4444444444444z\P 
+    *** Failers
+    az\P 
+    aaaaa\P 
+    a56\P 
+
+/^abcdef/
+   abc\P
+   def\R 
+   
+/(?<=foo)bar/
+   xyzfo\P 
+   foob\P\>2 
+   foobar...\R\P\>4 
+   xyzfo\P
+   foobar\>2  
+   *** Failers
+   xyzfo\P
+   obar\R   
+
+/(ab*(cd|ef))+X/
+    adfadadaklhlkalkajhlkjahdfasdfasdfladsfjkj\P\Z
+    lkjhlkjhlkjhlkjhabbbbbbcdaefabbbbbbbefa\P\B\Z
+    cdabbbbbbbb\P\R\B\Z
+    efabbbbbbbbbbbbbbbb\P\R\B\Z
+    bbbbbbbbbbbbcdXyasdfadf\P\R\B\Z    
+
+/(a|b)/SF>testsavedregex
+<testsavedregex
+    abc
+    ** Failers
+    def

-/X\B/8
-    Xoanon  
+/the quick brown fox/
+    the quick brown fox
+    The quick brown FOX
+    What do you know about the quick brown fox?
+    What do you know about THE QUICK BROWN FOX?
+
+/The quick brown fox/i
+    the quick brown fox
+    The quick brown FOX
+    What do you know about the quick brown fox?
+    What do you know about THE QUICK BROWN FOX?
+
+/abcd\t\n\r\f\a\e\071\x3b\$\\\?caxyz/
+    abcd\t\n\r\f\a\e9;\$\\?caxyz
+
+/a*abc?xyz+pqr{3}ab{2,}xy{4,5}pq{0,6}AB{0,}zz/
+    abxyzpqrrrabbxyyyypqAzz
+    abxyzpqrrrabbxyyyypqAzz
+    aabxyzpqrrrabbxyyyypqAzz
+    aaabxyzpqrrrabbxyyyypqAzz
+    aaaabxyzpqrrrabbxyyyypqAzz
+    abcxyzpqrrrabbxyyyypqAzz
+    aabcxyzpqrrrabbxyyyypqAzz
+    aaabcxyzpqrrrabbxyyyypAzz
+    aaabcxyzpqrrrabbxyyyypqAzz
+    aaabcxyzpqrrrabbxyyyypqqAzz
+    aaabcxyzpqrrrabbxyyyypqqqAzz
+    aaabcxyzpqrrrabbxyyyypqqqqAzz
+    aaabcxyzpqrrrabbxyyyypqqqqqAzz
+    aaabcxyzpqrrrabbxyyyypqqqqqqAzz
+    aaaabcxyzpqrrrabbxyyyypqAzz
+    abxyzzpqrrrabbxyyyypqAzz
+    aabxyzzzpqrrrabbxyyyypqAzz
+    aaabxyzzzzpqrrrabbxyyyypqAzz
+    aaaabxyzzzzpqrrrabbxyyyypqAzz
+    abcxyzzpqrrrabbxyyyypqAzz
+    aabcxyzzzpqrrrabbxyyyypqAzz
+    aaabcxyzzzzpqrrrabbxyyyypqAzz
+    aaaabcxyzzzzpqrrrabbxyyyypqAzz
+    aaaabcxyzzzzpqrrrabbbxyyyypqAzz
+    aaaabcxyzzzzpqrrrabbbxyyyyypqAzz
+    aaabcxyzpqrrrabbxyyyypABzz
+    aaabcxyzpqrrrabbxyyyypABBzz
+    >>>aaabxyzpqrrrabbxyyyypqAzz
+    >aaaabxyzpqrrrabbxyyyypqAzz
+    >>>>abcxyzpqrrrabbxyyyypqAzz
     *** Failers
-    X+oanon
-    ZX\x{300}oanon 
-    FAX 
-    
-/[^a]/8
+    abxyzpqrrabbxyyyypqAzz
+    abxyzpqrrrrabbxyyyypqAzz
+    abxyzpqrrrabxyyyypqAzz
+    aaaabcxyzzzzpqrrrabbbxyyyyyypqAzz
+    aaaabcxyzzzzpqrrrabbbxyyypqAzz
+    aaabcxyzpqrrrabbxyyyypqqqqqqqAzz
+
+/^(abc){1,2}zz/
+    abczz
+    abcabczz
+    *** Failers
+    zz
+    abcabcabczz
+    >>abczz
+
+/^(b+?|a){1,2}?c/
+    bc
+    bbc
+    bbbc
+    bac
+    bbac
+    aac
+    abbbbbbbbbbbc
+    bbbbbbbbbbbac
+    *** Failers
+    aaac
+    abbbbbbbbbbbac
+
+/^(b+|a){1,2}c/
+    bc
+    bbc
+    bbbc
+    bac
+    bbac
+    aac
+    abbbbbbbbbbbc
+    bbbbbbbbbbbac
+    *** Failers
+    aaac
+    abbbbbbbbbbbac
+
+/^(b+|a){1,2}?bc/
+    bbc
+
+/^(b*|ba){1,2}?bc/
+    babc
+    bbabc
+    bababc
+    *** Failers
+    bababbc
+    babababc
+
+/^(ba|b*){1,2}?bc/
+    babc
+    bbabc
+    bababc
+    *** Failers
+    bababbc
+    babababc
+
+/^\ca\cA\c[\c{\c:/
+    \x01\x01\e;z
+
+/^[ab\]cde]/
+    athing
+    bthing
+    ]thing
+    cthing
+    dthing
+    ething
+    *** Failers
+    fthing
+    [thing
+    \\thing
+
+/^[]cde]/
+    ]thing
+    cthing
+    dthing
+    ething
+    *** Failers
+    athing
+    fthing
+
+/^[^ab\]cde]/
+    fthing
+    [thing
+    \\thing
+    *** Failers
+    athing
+    bthing
+    ]thing
+    cthing
+    dthing
+    ething
+
+/^[^]cde]/
+    athing
+    fthing
+    *** Failers
+    ]thing
+    cthing
+    dthing
+    ething
+
+/^\\x81/
+    \x81
+
+/^\xFF/
+    \xFF
+
+/^[0-9]+$/
+    0
+    1
+    2
+    3
+    4
+    5
+    6
+    7
+    8
+    9
+    10
+    100
+    *** Failers
+    abc
+
+/^.*nter/
+    enter
+    inter
+    uponter
+
+/^xxx[0-9]+$/
+    xxx0
+    xxx1234
+    *** Failers
+    xxx
+
+/^.+[0-9][0-9][0-9]$/
+    x123
+    xx123
+    123456
+    *** Failers
+    123
+    x1234
+
+/^.+?[0-9][0-9][0-9]$/
+    x123
+    xx123
+    123456
+    *** Failers
+    123
+    x1234
+
+/^([^!]+)!(.+)=apquxz\.ixr\.zzz\.ac\.uk$/
+    abc!pqr=apquxz.ixr.zzz.ac.uk
+    *** Failers
+    !pqr=apquxz.ixr.zzz.ac.uk
+    abc!=apquxz.ixr.zzz.ac.uk
+    abc!pqr=apquxz:ixr.zzz.ac.uk
+    abc!pqr=apquxz.ixr.zzz.ac.ukk
+
+/:/
+    Well, we need a colon: somewhere
+    *** Fail if we don't
+
+/([\da-f:]+)$/i
+    0abc
+    abc
+    fed
+    E
+    ::
+    5f03:12C0::932e
+    fed def
+    Any old stuff
+    *** Failers
+    0zzz
+    gzzz
+    fed\x20
+    Any old rubbish
+
+/^.*\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})$/
+    .1.2.3
+    A.12.123.0
+    *** Failers
+    .1.2.3333
+    1.2.3
+    1234.2.3
+
+/^(\d+)\s+IN\s+SOA\s+(\S+)\s+(\S+)\s*\(\s*$/
+    1 IN SOA non-sp1 non-sp2(
+    1    IN    SOA    non-sp1    non-sp2   (
+    *** Failers
+    1IN SOA non-sp1 non-sp2(
+
+/^[a-zA-Z\d][a-zA-Z\d\-]*(\.[a-zA-Z\d][a-zA-z\d\-]*)*\.$/
+    a.
+    Z.
+    2.
+    ab-c.pq-r.
+    sxk.zzz.ac.uk.
+    x-.y-.
+    *** Failers
+    -abc.peq.
+
+/^\*\.[a-z]([a-z\-\d]*[a-z\d]+)?(\.[a-z]([a-z\-\d]*[a-z\d]+)?)*$/
+    *.a
+    *.b0-a
+    *.c3-b.c
+    *.c-a.b-c
+    *** Failers
+    *.0
+    *.a-
+    *.a-b.c-
+    *.c-a.0-c
+
+/^(?=ab(de))(abd)(e)/
+    abde
+
+/^(?!(ab)de|x)(abd)(f)/
+    abdf
+
+/^(?=(ab(cd)))(ab)/
     abcd
-    a\x{100}

-/^[abc\x{123}\x{400}-\x{402}]{2,3}\d/8
-    ab99
-    \x{123}\x{123}45
-    \x{400}\x{401}\x{402}6  
+/^[\da-f](\.[\da-f])*$/i
+    a.b.c.d
+    A.B.C.D
+    a.b.c.1.2.3.C
+
+/^\".*\"\s*(;.*)?$/
+    \"1234\"
+    \"abcd\" ;
+    \"\" ; rhubarb
     *** Failers
-    d99
-    \x{123}\x{122}4   
-    \x{400}\x{403}6  
-    \x{400}\x{401}\x{402}\x{402}6  
+    \"1234\" : things

-/abc/8
-    \xC3]
-    \xC3
-    \xC3\xC3\xC3
-    \xC3\xC3\xC3\?
-    \xe1\x88 
-    \P\xe1\x88 
-    \P\P\xe1\x88 
+/^$/
+    \
+    *** Failers

-/a.b/8
-    acb
-    a\x7fb
-    a\x{100}b 
+/   ^    a   (?# begins with a)  b\sc (?# then b c) $ (?# then end)/x
+    ab c
     *** Failers
-    a\nb  
+    abc
+    ab cde

-/a(.{3})b/8
-    a\x{4000}xyb 
-    a\x{4000}\x7fyb 
-    a\x{4000}\x{100}yb 
+/(?x)   ^    a   (?# begins with a)  b\sc (?# then b c) $ (?# then end)/
+    ab c
     *** Failers
-    a\x{4000}b 
-    ac\ncb 
+    abc
+    ab cde

-/a(.*?)(.)/
-    a\xc0\x88b
+/^   a\ b[c ]d       $/x
+    a bcd
+    a b d
+    *** Failers
+    abcd
+    ab d

-/a(.*?)(.)/8
-    a\x{100}b
+/^(a(b(c)))(d(e(f)))(h(i(j)))(k(l(m)))$/
+    abcdefhijklm

-/a(.*)(.)/
-    a\xc0\x88b
+/^(?:a(b(c)))(?:d(e(f)))(?:h(i(j)))(?:k(l(m)))$/
+    abcdefhijklm

-/a(.*)(.)/8
-    a\x{100}b
+/^[\w][\W][\s][\S][\d][\D][\b][\n][\c]][\022]/
+    a+ Z0+\x08\n\x1d\x12

-/a(.)(.)/
-    a\xc0\x92bcd
+/^[.^$|()*+?{,}]+/
+    .^\$(*+)|{?,?}

-/a(.)(.)/8
-    a\x{240}bcd
+/^a*\w/
+    z
+    az
+    aaaz
+    a
+    aa
+    aaaa
+    a+
+    aa+

-/a(.?)(.)/
-    a\xc0\x92bcd
+/^a*?\w/
+    z
+    az
+    aaaz
+    a
+    aa
+    aaaa
+    a+
+    aa+

-/a(.?)(.)/8
-    a\x{240}bcd
+/^a+\w/
+    az
+    aaaz
+    aa
+    aaaa
+    aa+

-/a(.??)(.)/
-    a\xc0\x92bcd
+/^a+?\w/
+    az
+    aaaz
+    aa
+    aaaa
+    aa+

-/a(.??)(.)/8
-    a\x{240}bcd
+/^\d{8}\w{2,}/
+    1234567890
+    12345678ab
+    12345678__
+    *** Failers
+    1234567

-/a(.{3})b/8
-    a\x{1234}xyb 
-    a\x{1234}\x{4321}yb 
-    a\x{1234}\x{4321}\x{3412}b 
+/^[aeiou\d]{4,5}$/
+    uoie
+    1234
+    12345
+    aaaaa
     *** Failers
-    a\x{1234}b 
-    ac\ncb 
+    123456

-/a(.{3,})b/8
-    a\x{1234}xyb 
-    a\x{1234}\x{4321}yb 
-    a\x{1234}\x{4321}\x{3412}b 
-    axxxxbcdefghijb 
-    a\x{1234}\x{4321}\x{3412}\x{3421}b 
+/^[aeiou\d]{4,5}?/
+    uoie
+    1234
+    12345
+    aaaaa
+    123456
+
+/^From +([^ ]+) +[a-zA-Z][a-zA-Z][a-zA-Z] +[a-zA-Z][a-zA-Z][a-zA-Z] +[0-9]?[0-9] +[0-9][0-9]:[0-9][0-9]/
+    From abcd  Mon Sep 01 12:33:02 1997
+
+/^From\s+\S+\s+([a-zA-Z]{3}\s+){2}\d{1,2}\s+\d\d:\d\d/
+    From abcd  Mon Sep 01 12:33:02 1997
+    From abcd  Mon Sep  1 12:33:02 1997
     *** Failers
-    a\x{1234}b 
+    From abcd  Sep 01 12:33:02 1997

-/a(.{3,}?)b/8
-    a\x{1234}xyb 
-    a\x{1234}\x{4321}yb 
-    a\x{1234}\x{4321}\x{3412}b 
-    axxxxbcdefghijb 
-    a\x{1234}\x{4321}\x{3412}\x{3421}b 
+/^12.34/s
+    12\n34
+    12\r34
+
+/\w+(?=\t)/
+    the quick brown\t fox
+
+/foo(?!bar)(.*)/
+    foobar is foolish see?
+
+/(?:(?!foo)...|^.{0,2})bar(.*)/
+    foobar crowbar etc
+    barrel
+    2barrel
+    A barrel
+
+/^(\D*)(?=\d)(?!123)/
+    abc456
     *** Failers
-    a\x{1234}b 
+    abc123

-/a(.{3,5})b/8
-    a\x{1234}xyb 
-    a\x{1234}\x{4321}yb 
-    a\x{1234}\x{4321}\x{3412}b 
-    axxxxbcdefghijb 
-    a\x{1234}\x{4321}\x{3412}\x{3421}b 
-    axbxxbcdefghijb 
-    axxxxxbcdefghijb 
+/^1234(?# test newlines
+  inside)/
+    1234
+
+/^1234 #comment in extended re
+  /x
+    1234
+
+/#rhubarb
+  abcd/x
+    abcd
+
+/^abcd#rhubarb/x
+    abcd
+
+/(?!^)abc/
+    the abc
     *** Failers
-    a\x{1234}b 
-    axxxxxxbcdefghijb 
+    abc

-/a(.{3,5}?)b/8
-    a\x{1234}xyb 
-    a\x{1234}\x{4321}yb 
-    a\x{1234}\x{4321}\x{3412}b 
-    axxxxbcdefghijb 
-    a\x{1234}\x{4321}\x{3412}\x{3421}b 
-    axbxxbcdefghijb 
-    axxxxxbcdefghijb 
+/(?=^)abc/
+    abc
     *** Failers
-    a\x{1234}b 
-    axxxxxxbcdefghijb 
+    the abc

-/^[a\x{c0}]/8
+/^[ab]{1,3}(ab*|b)/
+    aabbbbb
+
+/^[ab]{1,3}?(ab*|b)/
+    aabbbbb
+
+/^[ab]{1,3}?(ab*?|b)/
+    aabbbbb
+
+/^[ab]{1,3}(ab*?|b)/
+    aabbbbb
+
+/  (?: [\040\t] |  \(
+(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
+\)  )*                          # optional leading comment
+(?:    (?:
+[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+    # some number of atom characters...
+(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
+|
+" (?:                      # opening quote...
+[^\\\x80-\xff\n\015"]                #   Anything except backslash and quote
+|                     #    or
+\\ [^\x80-\xff]           #   Escaped something (something != CR)
+)* "  # closing quote
+)                    # initial word
+(?:  (?: [\040\t] |  \(
+(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
+\)  )*  \.  (?: [\040\t] |  \(
+(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
+\)  )*   (?:
+[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+    # some number of atom characters...
+(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
+|
+" (?:                      # opening quote...
+[^\\\x80-\xff\n\015"]                #   Anything except backslash and quote
+|                     #    or
+\\ [^\x80-\xff]           #   Escaped something (something != CR)
+)* "  # closing quote
+)  )* # further okay, if led by a period
+(?: [\040\t] |  \(
+(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
+\)  )*  @  (?: [\040\t] |  \(
+(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
+\)  )*    (?:
+[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+    # some number of atom characters...
+(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
+|   \[                         # [
+(?: [^\\\x80-\xff\n\015\[\]] |  \\ [^\x80-\xff]  )*    #    stuff
+\]                        #           ]
+)                           # initial subdomain
+(?:                                  #
+(?: [\040\t] |  \(
+(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
+\)  )*  \.                        # if led by a period...
+(?: [\040\t] |  \(
+(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
+\)  )*   (?:
+[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+    # some number of atom characters...
+(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
+|   \[                         # [
+(?: [^\\\x80-\xff\n\015\[\]] |  \\ [^\x80-\xff]  )*    #    stuff
+\]                        #           ]
+)                     #   ...further okay
+)*
+# address
+|                     #  or
+(?:
+[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+    # some number of atom characters...
+(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
+|
+" (?:                      # opening quote...
+[^\\\x80-\xff\n\015"]                #   Anything except backslash and quote
+|                     #    or
+\\ [^\x80-\xff]           #   Escaped something (something != CR)
+)* "  # closing quote
+)             # one word, optionally followed by....
+(?:
+[^()<>@,;:".\\\[\]\x80-\xff\000-\010\012-\037]  |  # atom and space parts, or...
+\(
+(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
+\)       |  # comments, or...
+
+" (?:                      # opening quote...
+[^\\\x80-\xff\n\015"]                #   Anything except backslash and quote
+|                     #    or
+\\ [^\x80-\xff]           #   Escaped something (something != CR)
+)* "  # closing quote
+# quoted strings
+)*
+<  (?: [\040\t] |  \(
+(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
+\)  )*                     # leading <
+(?:  @  (?: [\040\t] |  \(
+(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
+\)  )*    (?:
+[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+    # some number of atom characters...
+(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
+|   \[                         # [
+(?: [^\\\x80-\xff\n\015\[\]] |  \\ [^\x80-\xff]  )*    #    stuff
+\]                        #           ]
+)                           # initial subdomain
+(?:                                  #
+(?: [\040\t] |  \(
+(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
+\)  )*  \.                        # if led by a period...
+(?: [\040\t] |  \(
+(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
+\)  )*   (?:
+[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+    # some number of atom characters...
+(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
+|   \[                         # [
+(?: [^\\\x80-\xff\n\015\[\]] |  \\ [^\x80-\xff]  )*    #    stuff
+\]                        #           ]
+)                     #   ...further okay
+)*
+
+(?:  (?: [\040\t] |  \(
+(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
+\)  )*  ,  (?: [\040\t] |  \(
+(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
+\)  )*  @  (?: [\040\t] |  \(
+(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
+\)  )*    (?:
+[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+    # some number of atom characters...
+(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
+|   \[                         # [
+(?: [^\\\x80-\xff\n\015\[\]] |  \\ [^\x80-\xff]  )*    #    stuff
+\]                        #           ]
+)                           # initial subdomain
+(?:                                  #
+(?: [\040\t] |  \(
+(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
+\)  )*  \.                        # if led by a period...
+(?: [\040\t] |  \(
+(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
+\)  )*   (?:
+[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+    # some number of atom characters...
+(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
+|   \[                         # [
+(?: [^\\\x80-\xff\n\015\[\]] |  \\ [^\x80-\xff]  )*    #    stuff
+\]                        #           ]
+)                     #   ...further okay
+)*
+)* # further okay, if led by comma
+:                                # closing colon
+(?: [\040\t] |  \(
+(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
+\)  )*  )? #       optional route
+(?:
+[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+    # some number of atom characters...
+(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
+|
+" (?:                      # opening quote...
+[^\\\x80-\xff\n\015"]                #   Anything except backslash and quote
+|                     #    or
+\\ [^\x80-\xff]           #   Escaped something (something != CR)
+)* "  # closing quote
+)                    # initial word
+(?:  (?: [\040\t] |  \(
+(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
+\)  )*  \.  (?: [\040\t] |  \(
+(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
+\)  )*   (?:
+[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+    # some number of atom characters...
+(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
+|
+" (?:                      # opening quote...
+[^\\\x80-\xff\n\015"]                #   Anything except backslash and quote
+|                     #    or
+\\ [^\x80-\xff]           #   Escaped something (something != CR)
+)* "  # closing quote
+)  )* # further okay, if led by a period
+(?: [\040\t] |  \(
+(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
+\)  )*  @  (?: [\040\t] |  \(
+(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
+\)  )*    (?:
+[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+    # some number of atom characters...
+(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
+|   \[                         # [
+(?: [^\\\x80-\xff\n\015\[\]] |  \\ [^\x80-\xff]  )*    #    stuff
+\]                        #           ]
+)                           # initial subdomain
+(?:                                  #
+(?: [\040\t] |  \(
+(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
+\)  )*  \.                        # if led by a period...
+(?: [\040\t] |  \(
+(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
+\)  )*   (?:
+[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+    # some number of atom characters...
+(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
+|   \[                         # [
+(?: [^\\\x80-\xff\n\015\[\]] |  \\ [^\x80-\xff]  )*    #    stuff
+\]                        #           ]
+)                     #   ...further okay
+)*
+#       address spec
+(?: [\040\t] |  \(
+(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
+\)  )*  > #                  trailing >
+# name and address
+)  (?: [\040\t] |  \(
+(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
+\)  )*                       # optional trailing comment
+/x
+    Alan Other <user\@dom.ain>
+    <user\@dom.ain>
+    user\@dom.ain
+    \"A. Other\" <user.1234\@dom.ain> (a comment)
+    A. Other <user.1234\@dom.ain> (a comment)
+    \"/s=user/ou=host/o=place/prmd=uu.yy/admd= /c=gb/\"\@x400-re.lay
+    A missing angle <user\@some.where
     *** Failers
-    \x{100}
+    The quick brown fox

-/(?<=aXb)cd/8
-    aXbcd
+/[\040\t]*                    # Nab whitespace.
+(?:
+\(                              #  (
+[^\\\x80-\xff\n\015()] *                             #     normal*
+(?:                                 #       (
+(?:  \\ [^\x80-\xff]  |
+\(                            #  (
+[^\\\x80-\xff\n\015()] *                            #     normal*
+(?:  \\ [^\x80-\xff]   [^\\\x80-\xff\n\015()] * )*        #     (special normal*)*
+\)                           #                       )
+)    #         special
+[^\\\x80-\xff\n\015()] *                         #         normal*
+)*                                  #            )*
+\)                             #                )
+[\040\t]* )*    # If comment found, allow more spaces.
+# optional leading comment
+(?:
+(?:
+[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+    # some number of atom characters...
+(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
+# Atom
+|                       #  or
+"                                     # "
+[^\\\x80-\xff\n\015"] *                            #   normal
+(?:  \\ [^\x80-\xff]  [^\\\x80-\xff\n\015"] * )*        #   ( special normal* )*
+"                                     #        "
+# Quoted string
+)
+[\040\t]*                    # Nab whitespace.
+(?:
+\(                              #  (
+[^\\\x80-\xff\n\015()] *                             #     normal*
+(?:                                 #       (
+(?:  \\ [^\x80-\xff]  |
+\(                            #  (
+[^\\\x80-\xff\n\015()] *                            #     normal*
+(?:  \\ [^\x80-\xff]   [^\\\x80-\xff\n\015()] * )*        #     (special normal*)*
+\)                           #                       )
+)    #         special
+[^\\\x80-\xff\n\015()] *                         #         normal*
+)*                                  #            )*
+\)                             #                )
+[\040\t]* )*    # If comment found, allow more spaces.
+(?:
+\.
+[\040\t]*                    # Nab whitespace.
+(?:
+\(                              #  (
+[^\\\x80-\xff\n\015()] *                             #     normal*
+(?:                                 #       (
+(?:  \\ [^\x80-\xff]  |
+\(                            #  (
+[^\\\x80-\xff\n\015()] *                            #     normal*
+(?:  \\ [^\x80-\xff]   [^\\\x80-\xff\n\015()] * )*        #     (special normal*)*
+\)                           #                       )
+)    #         special
+[^\\\x80-\xff\n\015()] *                         #         normal*
+)*                                  #            )*
+\)                             #                )
+[\040\t]* )*    # If comment found, allow more spaces.
+(?:
+[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+    # some number of atom characters...
+(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
+# Atom
+|                       #  or
+"                                     # "
+[^\\\x80-\xff\n\015"] *                            #   normal
+(?:  \\ [^\x80-\xff]  [^\\\x80-\xff\n\015"] * )*        #   ( special normal* )*
+"                                     #        "
+# Quoted string
+)
+[\040\t]*                    # Nab whitespace.
+(?:
+\(                              #  (
+[^\\\x80-\xff\n\015()] *                             #     normal*
+(?:                                 #       (
+(?:  \\ [^\x80-\xff]  |
+\(                            #  (
+[^\\\x80-\xff\n\015()] *                            #     normal*
+(?:  \\ [^\x80-\xff]   [^\\\x80-\xff\n\015()] * )*        #     (special normal*)*
+\)                           #                       )
+)    #         special
+[^\\\x80-\xff\n\015()] *                         #         normal*
+)*                                  #            )*
+\)                             #                )
+[\040\t]* )*    # If comment found, allow more spaces.
+# additional words
+)*
+@
+[\040\t]*                    # Nab whitespace.
+(?:
+\(                              #  (
+[^\\\x80-\xff\n\015()] *                             #     normal*
+(?:                                 #       (
+(?:  \\ [^\x80-\xff]  |
+\(                            #  (
+[^\\\x80-\xff\n\015()] *                            #     normal*
+(?:  \\ [^\x80-\xff]   [^\\\x80-\xff\n\015()] * )*        #     (special normal*)*
+\)                           #                       )
+)    #         special
+[^\\\x80-\xff\n\015()] *                         #         normal*
+)*                                  #            )*
+\)                             #                )
+[\040\t]* )*    # If comment found, allow more spaces.
+(?:
+[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+    # some number of atom characters...
+(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
+|
+\[                            # [
+(?: [^\\\x80-\xff\n\015\[\]] |  \\ [^\x80-\xff]  )*     #    stuff
+\]                           #           ]
+)
+[\040\t]*                    # Nab whitespace.
+(?:
+\(                              #  (
+[^\\\x80-\xff\n\015()] *                             #     normal*
+(?:                                 #       (
+(?:  \\ [^\x80-\xff]  |
+\(                            #  (
+[^\\\x80-\xff\n\015()] *                            #     normal*
+(?:  \\ [^\x80-\xff]   [^\\\x80-\xff\n\015()] * )*        #     (special normal*)*
+\)                           #                       )
+)    #         special
+[^\\\x80-\xff\n\015()] *                         #         normal*
+)*                                  #            )*
+\)                             #                )
+[\040\t]* )*    # If comment found, allow more spaces.
+# optional trailing comments
+(?:
+\.
+[\040\t]*                    # Nab whitespace.
+(?:
+\(                              #  (
+[^\\\x80-\xff\n\015()] *                             #     normal*
+(?:                                 #       (
+(?:  \\ [^\x80-\xff]  |
+\(                            #  (
+[^\\\x80-\xff\n\015()] *                            #     normal*
+(?:  \\ [^\x80-\xff]   [^\\\x80-\xff\n\015()] * )*        #     (special normal*)*
+\)                           #                       )
+)    #         special
+[^\\\x80-\xff\n\015()] *                         #         normal*
+)*                                  #            )*
+\)                             #                )
+[\040\t]* )*    # If comment found, allow more spaces.
+(?:
+[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+    # some number of atom characters...
+(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
+|
+\[                            # [
+(?: [^\\\x80-\xff\n\015\[\]] |  \\ [^\x80-\xff]  )*     #    stuff
+\]                           #           ]
+)
+[\040\t]*                    # Nab whitespace.
+(?:
+\(                              #  (
+[^\\\x80-\xff\n\015()] *                             #     normal*
+(?:                                 #       (
+(?:  \\ [^\x80-\xff]  |
+\(                            #  (
+[^\\\x80-\xff\n\015()] *                            #     normal*
+(?:  \\ [^\x80-\xff]   [^\\\x80-\xff\n\015()] * )*        #     (special normal*)*
+\)                           #                       )
+)    #         special
+[^\\\x80-\xff\n\015()] *                         #         normal*
+)*                                  #            )*
+\)                             #                )
+[\040\t]* )*    # If comment found, allow more spaces.
+# optional trailing comments
+)*
+# address
+|                             #  or
+(?:
+[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+    # some number of atom characters...
+(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
+# Atom
+|                       #  or
+"                                     # "
+[^\\\x80-\xff\n\015"] *                            #   normal
+(?:  \\ [^\x80-\xff]  [^\\\x80-\xff\n\015"] * )*        #   ( special normal* )*
+"                                     #        "
+# Quoted string
+)
+# leading word
+[^()<>@,;:".\\\[\]\x80-\xff\000-\010\012-\037] *               # "normal" atoms and or spaces
+(?:
+(?:
+\(                              #  (
+[^\\\x80-\xff\n\015()] *                             #     normal*
+(?:                                 #       (
+(?:  \\ [^\x80-\xff]  |
+\(                            #  (
+[^\\\x80-\xff\n\015()] *                            #     normal*
+(?:  \\ [^\x80-\xff]   [^\\\x80-\xff\n\015()] * )*        #     (special normal*)*
+\)                           #                       )
+)    #         special
+[^\\\x80-\xff\n\015()] *                         #         normal*
+)*                                  #            )*
+\)                             #                )
+|
+"                                     # "
+[^\\\x80-\xff\n\015"] *                            #   normal
+(?:  \\ [^\x80-\xff]  [^\\\x80-\xff\n\015"] * )*        #   ( special normal* )*
+"                                     #        "
+) # "special" comment or quoted string
+[^()<>@,;:".\\\[\]\x80-\xff\000-\010\012-\037] *            #  more "normal"
+)*
+<
+[\040\t]*                    # Nab whitespace.
+(?:
+\(                              #  (
+[^\\\x80-\xff\n\015()] *                             #     normal*
+(?:                                 #       (
+(?:  \\ [^\x80-\xff]  |
+\(                            #  (
+[^\\\x80-\xff\n\015()] *                            #     normal*
+(?:  \\ [^\x80-\xff]   [^\\\x80-\xff\n\015()] * )*        #     (special normal*)*
+\)                           #                       )
+)    #         special
+[^\\\x80-\xff\n\015()] *                         #         normal*
+)*                                  #            )*
+\)                             #                )
+[\040\t]* )*    # If comment found, allow more spaces.
+# <
+(?:
+@
+[\040\t]*                    # Nab whitespace.
+(?:
+\(                              #  (
+[^\\\x80-\xff\n\015()] *                             #     normal*
+(?:                                 #       (
+(?:  \\ [^\x80-\xff]  |
+\(                            #  (
+[^\\\x80-\xff\n\015()] *                            #     normal*
+(?:  \\ [^\x80-\xff]   [^\\\x80-\xff\n\015()] * )*        #     (special normal*)*
+\)                           #                       )
+)    #         special
+[^\\\x80-\xff\n\015()] *                         #         normal*
+)*                                  #            )*
+\)                             #                )
+[\040\t]* )*    # If comment found, allow more spaces.
+(?:
+[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+    # some number of atom characters...
+(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
+|
+\[                            # [
+(?: [^\\\x80-\xff\n\015\[\]] |  \\ [^\x80-\xff]  )*     #    stuff
+\]                           #           ]
+)
+[\040\t]*                    # Nab whitespace.
+(?:
+\(                              #  (
+[^\\\x80-\xff\n\015()] *                             #     normal*
+(?:                                 #       (
+(?:  \\ [^\x80-\xff]  |
+\(                            #  (
+[^\\\x80-\xff\n\015()] *                            #     normal*
+(?:  \\ [^\x80-\xff]   [^\\\x80-\xff\n\015()] * )*        #     (special normal*)*
+\)                           #                       )
+)    #         special
+[^\\\x80-\xff\n\015()] *                         #         normal*
+)*                                  #            )*
+\)                             #                )
+[\040\t]* )*    # If comment found, allow more spaces.
+# optional trailing comments
+(?:
+\.
+[\040\t]*                    # Nab whitespace.
+(?:
+\(                              #  (
+[^\\\x80-\xff\n\015()] *                             #     normal*
+(?:                                 #       (
+(?:  \\ [^\x80-\xff]  |
+\(                            #  (
+[^\\\x80-\xff\n\015()] *                            #     normal*
+(?:  \\ [^\x80-\xff]   [^\\\x80-\xff\n\015()] * )*        #     (special normal*)*
+\)                           #                       )
+)    #         special
+[^\\\x80-\xff\n\015()] *                         #         normal*
+)*                                  #            )*
+\)                             #                )
+[\040\t]* )*    # If comment found, allow more spaces.
+(?:
+[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+    # some number of atom characters...
+(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
+|
+\[                            # [
+(?: [^\\\x80-\xff\n\015\[\]] |  \\ [^\x80-\xff]  )*     #    stuff
+\]                           #           ]
+)
+[\040\t]*                    # Nab whitespace.
+(?:
+\(                              #  (
+[^\\\x80-\xff\n\015()] *                             #     normal*
+(?:                                 #       (
+(?:  \\ [^\x80-\xff]  |
+\(                            #  (
+[^\\\x80-\xff\n\015()] *                            #     normal*
+(?:  \\ [^\x80-\xff]   [^\\\x80-\xff\n\015()] * )*        #     (special normal*)*
+\)                           #                       )
+)    #         special
+[^\\\x80-\xff\n\015()] *                         #         normal*
+)*                                  #            )*
+\)                             #                )
+[\040\t]* )*    # If comment found, allow more spaces.
+# optional trailing comments
+)*
+(?: ,
+[\040\t]*                    # Nab whitespace.
+(?:
+\(                              #  (
+[^\\\x80-\xff\n\015()] *                             #     normal*
+(?:                                 #       (
+(?:  \\ [^\x80-\xff]  |
+\(                            #  (
+[^\\\x80-\xff\n\015()] *                            #     normal*
+(?:  \\ [^\x80-\xff]   [^\\\x80-\xff\n\015()] * )*        #     (special normal*)*
+\)                           #                       )
+)    #         special
+[^\\\x80-\xff\n\015()] *                         #         normal*
+)*                                  #            )*
+\)                             #                )
+[\040\t]* )*    # If comment found, allow more spaces.
+@
+[\040\t]*                    # Nab whitespace.
+(?:
+\(                              #  (
+[^\\\x80-\xff\n\015()] *                             #     normal*
+(?:                                 #       (
+(?:  \\ [^\x80-\xff]  |
+\(                            #  (
+[^\\\x80-\xff\n\015()] *                            #     normal*
+(?:  \\ [^\x80-\xff]   [^\\\x80-\xff\n\015()] * )*        #     (special normal*)*
+\)                           #                       )
+)    #         special
+[^\\\x80-\xff\n\015()] *                         #         normal*
+)*                                  #            )*
+\)                             #                )
+[\040\t]* )*    # If comment found, allow more spaces.
+(?:
+[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+    # some number of atom characters...
+(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
+|
+\[                            # [
+(?: [^\\\x80-\xff\n\015\[\]] |  \\ [^\x80-\xff]  )*     #    stuff
+\]                           #           ]
+)
+[\040\t]*                    # Nab whitespace.
+(?:
+\(                              #  (
+[^\\\x80-\xff\n\015()] *                             #     normal*
+(?:                                 #       (
+(?:  \\ [^\x80-\xff]  |
+\(                            #  (
+[^\\\x80-\xff\n\015()] *                            #     normal*
+(?:  \\ [^\x80-\xff]   [^\\\x80-\xff\n\015()] * )*        #     (special normal*)*
+\)                           #                       )
+)    #         special
+[^\\\x80-\xff\n\015()] *                         #         normal*
+)*                                  #            )*
+\)                             #                )
+[\040\t]* )*    # If comment found, allow more spaces.
+# optional trailing comments
+(?:
+\.
+[\040\t]*                    # Nab whitespace.
+(?:
+\(                              #  (
+[^\\\x80-\xff\n\015()] *                             #     normal*
+(?:                                 #       (
+(?:  \\ [^\x80-\xff]  |
+\(                            #  (
+[^\\\x80-\xff\n\015()] *                            #     normal*
+(?:  \\ [^\x80-\xff]   [^\\\x80-\xff\n\015()] * )*        #     (special normal*)*
+\)                           #                       )
+)    #         special
+[^\\\x80-\xff\n\015()] *                         #         normal*
+)*                                  #            )*
+\)                             #                )
+[\040\t]* )*    # If comment found, allow more spaces.
+(?:
+[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+    # some number of atom characters...
+(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
+|
+\[                            # [
+(?: [^\\\x80-\xff\n\015\[\]] |  \\ [^\x80-\xff]  )*     #    stuff
+\]                           #           ]
+)
+[\040\t]*                    # Nab whitespace.
+(?:
+\(                              #  (
+[^\\\x80-\xff\n\015()] *                             #     normal*
+(?:                                 #       (
+(?:  \\ [^\x80-\xff]  |
+\(                            #  (
+[^\\\x80-\xff\n\015()] *                            #     normal*
+(?:  \\ [^\x80-\xff]   [^\\\x80-\xff\n\015()] * )*        #     (special normal*)*
+\)                           #                       )
+)    #         special
+[^\\\x80-\xff\n\015()] *                         #         normal*
+)*                                  #            )*
+\)                             #                )
+[\040\t]* )*    # If comment found, allow more spaces.
+# optional trailing comments
+)*
+)*  # additional domains
+:
+[\040\t]*                    # Nab whitespace.
+(?:
+\(                              #  (
+[^\\\x80-\xff\n\015()] *                             #     normal*
+(?:                                 #       (
+(?:  \\ [^\x80-\xff]  |
+\(                            #  (
+[^\\\x80-\xff\n\015()] *                            #     normal*
+(?:  \\ [^\x80-\xff]   [^\\\x80-\xff\n\015()] * )*        #     (special normal*)*
+\)                           #                       )
+)    #         special
+[^\\\x80-\xff\n\015()] *                         #         normal*
+)*                                  #            )*
+\)                             #                )
+[\040\t]* )*    # If comment found, allow more spaces.
+# optional trailing comments
+)?     #       optional route
+(?:
+[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+    # some number of atom characters...
+(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
+# Atom
+|                       #  or
+"                                     # "
+[^\\\x80-\xff\n\015"] *                            #   normal
+(?:  \\ [^\x80-\xff]  [^\\\x80-\xff\n\015"] * )*        #   ( special normal* )*
+"                                     #        "
+# Quoted string
+)
+[\040\t]*                    # Nab whitespace.
+(?:
+\(                              #  (
+[^\\\x80-\xff\n\015()] *                             #     normal*
+(?:                                 #       (
+(?:  \\ [^\x80-\xff]  |
+\(                            #  (
+[^\\\x80-\xff\n\015()] *                            #     normal*
+(?:  \\ [^\x80-\xff]   [^\\\x80-\xff\n\015()] * )*        #     (special normal*)*
+\)                           #                       )
+)    #         special
+[^\\\x80-\xff\n\015()] *                         #         normal*
+)*                                  #            )*
+\)                             #                )
+[\040\t]* )*    # If comment found, allow more spaces.
+(?:
+\.
+[\040\t]*                    # Nab whitespace.
+(?:
+\(                              #  (
+[^\\\x80-\xff\n\015()] *                             #     normal*
+(?:                                 #       (
+(?:  \\ [^\x80-\xff]  |
+\(                            #  (
+[^\\\x80-\xff\n\015()] *                            #     normal*
+(?:  \\ [^\x80-\xff]   [^\\\x80-\xff\n\015()] * )*        #     (special normal*)*
+\)                           #                       )
+)    #         special
+[^\\\x80-\xff\n\015()] *                         #         normal*
+)*                                  #            )*
+\)                             #                )
+[\040\t]* )*    # If comment found, allow more spaces.
+(?:
+[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+    # some number of atom characters...
+(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
+# Atom
+|                       #  or
+"                                     # "
+[^\\\x80-\xff\n\015"] *                            #   normal
+(?:  \\ [^\x80-\xff]  [^\\\x80-\xff\n\015"] * )*        #   ( special normal* )*
+"                                     #        "
+# Quoted string
+)
+[\040\t]*                    # Nab whitespace.
+(?:
+\(                              #  (
+[^\\\x80-\xff\n\015()] *                             #     normal*
+(?:                                 #       (
+(?:  \\ [^\x80-\xff]  |
+\(                            #  (
+[^\\\x80-\xff\n\015()] *                            #     normal*
+(?:  \\ [^\x80-\xff]   [^\\\x80-\xff\n\015()] * )*        #     (special normal*)*
+\)                           #                       )
+)    #         special
+[^\\\x80-\xff\n\015()] *                         #         normal*
+)*                                  #            )*
+\)                             #                )
+[\040\t]* )*    # If comment found, allow more spaces.
+# additional words
+)*
+@
+[\040\t]*                    # Nab whitespace.
+(?:
+\(                              #  (
+[^\\\x80-\xff\n\015()] *                             #     normal*
+(?:                                 #       (
+(?:  \\ [^\x80-\xff]  |
+\(                            #  (
+[^\\\x80-\xff\n\015()] *                            #     normal*
+(?:  \\ [^\x80-\xff]   [^\\\x80-\xff\n\015()] * )*        #     (special normal*)*
+\)                           #                       )
+)    #         special
+[^\\\x80-\xff\n\015()] *                         #         normal*
+)*                                  #            )*
+\)                             #                )
+[\040\t]* )*    # If comment found, allow more spaces.
+(?:
+[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+    # some number of atom characters...
+(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
+|
+\[                            # [
+(?: [^\\\x80-\xff\n\015\[\]] |  \\ [^\x80-\xff]  )*     #    stuff
+\]                           #           ]
+)
+[\040\t]*                    # Nab whitespace.
+(?:
+\(                              #  (
+[^\\\x80-\xff\n\015()] *                             #     normal*
+(?:                                 #       (
+(?:  \\ [^\x80-\xff]  |
+\(                            #  (
+[^\\\x80-\xff\n\015()] *                            #     normal*
+(?:  \\ [^\x80-\xff]   [^\\\x80-\xff\n\015()] * )*        #     (special normal*)*
+\)                           #                       )
+)    #         special
+[^\\\x80-\xff\n\015()] *                         #         normal*
+)*                                  #            )*
+\)                             #                )
+[\040\t]* )*    # If comment found, allow more spaces.
+# optional trailing comments
+(?:
+\.
+[\040\t]*                    # Nab whitespace.
+(?:
+\(                              #  (
+[^\\\x80-\xff\n\015()] *                             #     normal*
+(?:                                 #       (
+(?:  \\ [^\x80-\xff]  |
+\(                            #  (
+[^\\\x80-\xff\n\015()] *                            #     normal*
+(?:  \\ [^\x80-\xff]   [^\\\x80-\xff\n\015()] * )*        #     (special normal*)*
+\)                           #                       )
+)    #         special
+[^\\\x80-\xff\n\015()] *                         #         normal*
+)*                                  #            )*
+\)                             #                )
+[\040\t]* )*    # If comment found, allow more spaces.
+(?:
+[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+    # some number of atom characters...
+(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
+|
+\[                            # [
+(?: [^\\\x80-\xff\n\015\[\]] |  \\ [^\x80-\xff]  )*     #    stuff
+\]                           #           ]
+)
+[\040\t]*                    # Nab whitespace.
+(?:
+\(                              #  (
+[^\\\x80-\xff\n\015()] *                             #     normal*
+(?:                                 #       (
+(?:  \\ [^\x80-\xff]  |
+\(                            #  (
+[^\\\x80-\xff\n\015()] *                            #     normal*
+(?:  \\ [^\x80-\xff]   [^\\\x80-\xff\n\015()] * )*        #     (special normal*)*
+\)                           #                       )
+)    #         special
+[^\\\x80-\xff\n\015()] *                         #         normal*
+)*                                  #            )*
+\)                             #                )
+[\040\t]* )*    # If comment found, allow more spaces.
+# optional trailing comments
+)*
+#       address spec
+>                    #                 >
+# name and address
+)
+/x
+    Alan Other <user\@dom.ain>
+    <user\@dom.ain>
+    user\@dom.ain
+    \"A. Other\" <user.1234\@dom.ain> (a comment)
+    A. Other <user.1234\@dom.ain> (a comment)
+    \"/s=user/ou=host/o=place/prmd=uu.yy/admd= /c=gb/\"\@x400-re.lay
+    A missing angle <user\@some.where
+    *** Failers
+    The quick brown fox

-/(?<=a\x{100}b)cd/8
-    a\x{100}bcd
+/abc\0def\00pqr\000xyz\0000AB/
+    abc\0def\00pqr\000xyz\0000AB
+    abc456 abc\0def\00pqr\000xyz\0000ABCDE

-/(?<=a\x{100000}b)cd/8
-    a\x{100000}bcd
+/abc\x0def\x00pqr\x000xyz\x0000AB/
+    abc\x0def\x00pqr\x000xyz\x0000AB
+    abc456 abc\x0def\x00pqr\x000xyz\x0000ABCDE
+
+/^[\000-\037]/
+    \0A
+    \01B
+    \037C
+
+/\0*/
+    \0\0\0\0
+
+/A\x0{2,3}Z/
+    The A\x0\x0Z
+    An A\0\x0\0Z
+    *** Failers
+    A\0Z
+    A\0\x0\0\x0Z
+
+/^\s/
+    \040abc
+    \x0cabc
+    \nabc
+    \rabc
+    \tabc
+    *** Failers
+    abc
+
+/^a    b
+    ?  c/x
+    abc
+
+/ab{1,3}bc/
+    abbbbc
+    abbbc
+    abbc
+    *** Failers
+    abc
+    abbbbbc
+
+/([^.]*)\.([^:]*):[T ]+(.*)/
+    track1.title:TBlah blah blah
+
+/([^.]*)\.([^:]*):[T ]+(.*)/i
+    track1.title:TBlah blah blah
+
+/([^.]*)\.([^:]*):[t ]+(.*)/i
+    track1.title:TBlah blah blah
+
+/^[W-c]+$/
+    WXY_^abc
+    *** Failers
+    wxy
+
+/^[W-c]+$/i
+    WXY_^abc
+    wxy_^ABC
+
+/^[\x3f-\x5F]+$/i
+    WXY_^abc
+    wxy_^ABC
+
+/^abc$/m
+    abc
+    qqq\nabc
+    abc\nzzz
+    qqq\nabc\nzzz
+
+/^abc$/
+    abc
+    *** Failers
+    qqq\nabc
+    abc\nzzz
+    qqq\nabc\nzzz
+
+/\Aabc\Z/m
+    abc
+    abc\n 
+    *** Failers
+    qqq\nabc
+    abc\nzzz
+    qqq\nabc\nzzz

-/(?:\x{100}){3}b/8
-    \x{100}\x{100}\x{100}b
-    *** Failers 
-    \x{100}\x{100}b
+/\A(.)*\Z/s
+    abc\ndef

-/\x{ab}/8
-    \x{ab} 
-    \xc2\xab
-    *** Failers 
-    \x00{ab}
+/\A(.)*\Z/m
+    *** Failers
+    abc\ndef

-/(?<=(.))X/8
-    WXYZ
-    \x{256}XYZ 
+/(?:b)|(?::+)/
+    b::c
+    c::b
+
+/[-az]+/
+    az-
     *** Failers
-    XYZ 
+    b

-/[^a]+/8g
-    bcd
-    \x{100}aY\x{256}Z 
+/[az-]+/
+    za-
+    *** Failers
+    b
+
+/[a\-z]+/
+    a-z
+    *** Failers
+    b
+
+/[a-z]+/
+    abcdxyz
+
+/[\d-]+/
+    12-34
+    *** Failers
+    aaa
+
+/[\d-z]+/
+    12-34z
+    *** Failers
+    aaa
+
+/\x5c/
+    \\
+
+/\x20Z/
+    the Zoo
+    *** Failers
+    Zulu
+
+/ab{3cd/
+    ab{3cd
+
+/ab{3,cd/
+    ab{3,cd
+
+/ab{3,4a}cd/
+    ab{3,4a}cd
+
+/{4,5a}bc/
+    {4,5a}bc
+
+/^a.b/<lf>
+    a\rb
+    *** Failers
+    a\nb
+
+/abc$/
+    abc
+    abc\n
+    *** Failers
+    abc\ndef
+
+/(abc)\123/
+    abc\x53
+
+/(abc)\223/
+    abc\x93
+
+/(abc)\323/
+    abc\xd3
+
+/(abc)\100/
+    abc\x40
+    abc\100
+
+/(abc)\1000/
+    abc\x400
+    abc\x40\x30
+    abc\1000
+    abc\100\x30
+    abc\100\060
+    abc\100\60
+
+/abc\81/
+    abc\081
+    abc\0\x38\x31
+
+/abc\91/
+    abc\091
+    abc\0\x39\x31
+
+/(a)(b)(c)(d)(e)(f)(g)(h)(i)(j)(k)\12\123/
+    abcdefghijk\12S
+
+/ab\idef/
+    abidef
+
+/a{0}bc/
+    bc
+
+/(a|(bc)){0,0}?xyz/
+    xyz
+
+/abc[\10]de/
+    abc\010de
+
+/abc[\1]de/
+    abc\1de
+
+/(abc)[\1]de/
+    abc\1de
+
+/(?s)a.b/
+    a\nb
+
+/^([^a])([^\b])([^c]*)([^d]{3,4})/
+    baNOTccccd
+    baNOTcccd
+    baNOTccd
+    bacccd
+    *** Failers
+    anything
+    b\bc   
+    baccd
+
+/[^a]/
+    Abc
+  
+/[^a]/i
+    Abc 
+
+/[^a]+/
+    AAAaAbc
+  
+/[^a]+/i
+    AAAaAbc 
+
+/[^a]+/
+    bbb\nccc
+   
+/[^k]$/
+    abc
+    *** Failers
+    abk   
+   
+/[^k]{2,3}$/
+    abc
+    kbc
+    kabc 
+    *** Failers
+    abk
+    akb
+    akk 
+
+/^\d{8,}\@.+[^k]$/
+    12345678\@a.b.c.d
+    123456789\@x.y.z
+    *** Failers
+    12345678\@x.y.uk
+    1234567\@a.b.c.d       
+
+/[^a]/
+    aaaabcd
+    aaAabcd 
+
+/[^a]/i
+    aaaabcd
+    aaAabcd 
+
+/[^az]/
+    aaaabcd
+    aaAabcd 
+
+/[^az]/i
+    aaaabcd
+    aaAabcd 
+
+/\000\001\002\003\004\005\006\007\010\011\012\013\014\015\016\017\020\021\022\023\024\025\026\027\030\031\032\033\034\035\036\037\040\041\042\043\044\045\046\047\050\051\052\053\054\055\056\057\060\061\062\063\064\065\066\067\070\071\072\073\074\075\076\077\100\101\102\103\104\105\106\107\110\111\112\113\114\115\116\117\120\121\122\123\124\125\126\127\130\131\132\133\134\135\136\137\140\141\142\143\144\145\146\147\150\151\152\153\154\155\156\157\160\161\162\163\164\165\166\167\170\171\172\173\174\175\176\177\200\201\202\203\204\205\206\207\210\211\212\213\214\215\216\217\220\221\222\223\224\225\226\227\230\231\232\233\234\235\236\237\240\241\242\243\244\245\246\247\250\251\252\253\254\255\256\257\260\261\262\263\264\265\266\267\270\271\272\273\274\275\276\277\300\301\302\303\304\305\306\307\310\311\312\313\314\315\316\317\320\321\322\323\324\325\326\327\330\331\332\333\334\335\336\337\340\341\342\343\344\345\346\347\350\351\352\353\354\355\356\357\360\361\362\363\364\365\366\367\370\371\372\373\374\375\376\377/
+ \000\001\002\003\004\005\006\007\010\011\012\013\014\015\016\017\020\021\022\023\024\025\026\027\030\031\032\033\034\035\036\037\040\041\042\043\044\045\046\047\050\051\052\053\054\055\056\057\060\061\062\063\064\065\066\067\070\071\072\073\074\075\076\077\100\101\102\103\104\105\106\107\110\111\112\113\114\115\116\117\120\121\122\123\124\125\126\127\130\131\132\133\134\135\136\137\140\141\142\143\144\145\146\147\150\151\152\153\154\155\156\157\160\161\162\163\164\165\166\167\170\171\172\173\174\175\176\177\200\201\202\203\204\205\206\207\210\211\212\213\214\215\216\217\220\221\222\223\224\225\226\227\230\231\232\233\234\235\236\237\240\241\242\243\244\245\246\247\250\251\252\253\254\255\256\257\260\261\262\263\264\265\266\267\270\271\272\273\274\275\276\277\300\301\302\303\304\305\306\307\310\311\312\313\314\315\316\317\320\321\322\323\324\325\326\327\330\331\332\333\334\335\336\337\340\341\342\343\344\345\346\347\350\351\352\353\354\355\356\357\360\361\362\363\364\365\366\367\370\371\372\373\374\375\376\377
+
+/P[^*]TAIRE[^*]{1,6}?LL/
+    xxxxxxxxxxxPSTAIREISLLxxxxxxxxx
+
+/P[^*]TAIRE[^*]{1,}?LL/
+    xxxxxxxxxxxPSTAIREISLLxxxxxxxxx
+
+/(\.\d\d[1-9]?)\d+/
+    1.230003938
+    1.875000282   
+    1.235  
+                  
+/(\.\d\d((?=0)|\d(?=\d)))/
+    1.230003938      
+    1.875000282
+    *** Failers 
+    1.235

-/^[^a]{2}/8
-    \x{100}bc
+/a(?)b/
+    ab

-/^[^a]{2,}/8
-    \x{100}bcAa
+/\b(foo)\s+(\w+)/i
+    Food is on the foo table
+    
+/foo(.*)bar/
+    The food is under the bar in the barn.
+    
+/foo(.*?)bar/  
+    The food is under the bar in the barn.

-/^[^a]{2,}?/8
-    \x{100}bca
+/(.*)(\d*)/
+    I have 2 numbers: 53147
+    
+/(.*)(\d+)/
+    I have 2 numbers: 53147
+ 
+/(.*?)(\d*)/
+    I have 2 numbers: 53147

-/[^a]+/8ig
+/(.*?)(\d+)/
+    I have 2 numbers: 53147
+
+/(.*)(\d+)$/
+    I have 2 numbers: 53147
+
+/(.*?)(\d+)$/
+    I have 2 numbers: 53147
+
+/(.*)\b(\d+)$/
+    I have 2 numbers: 53147
+
+/(.*\D)(\d+)$/
+    I have 2 numbers: 53147
+
+/^\D*(?!123)/
+    ABC123
+     
+/^(\D*)(?=\d)(?!123)/
+    ABC445
+    *** Failers
+    ABC123
+    
+/^[W-]46]/
+    W46]789 
+    -46]789
+    *** Failers
+    Wall
+    Zebra
+    42
+    [abcd] 
+    ]abcd[
+       
+/^[W-\]46]/
+    W46]789 
+    Wall
+    Zebra
+    Xylophone  
+    42
+    [abcd] 
+    ]abcd[
+    \\backslash 
+    *** Failers
+    -46]789
+    well
+    
+/\d\d\/\d\d\/\d\d\d\d/
+    01/01/2000
+
+/word (?:[a-zA-Z0-9]+ ){0,10}otherword/
+  word cat dog elephant mussel cow horse canary baboon snake shark otherword
+  word cat dog elephant mussel cow horse canary baboon snake shark
+
+/word (?:[a-zA-Z0-9]+ ){0,300}otherword/
+  word cat dog elephant mussel cow horse canary baboon snake shark the quick brown fox and the lazy dog and several other words getting close to thirty by now I hope
+
+/^(a){0,0}/
     bcd
-    \x{100}aY\x{256}Z 
+    abc
+    aab     
+
+/^(a){0,1}/
+    bcd
+    abc
+    aab  
+
+/^(a){0,2}/
+    bcd
+    abc
+    aab  
+
+/^(a){0,3}/
+    bcd
+    abc
+    aab
+    aaa   
+
+/^(a){0,}/
+    bcd
+    abc
+    aab
+    aaa
+    aaaaaaaa    
+
+/^(a){1,1}/
+    bcd
+    abc
+    aab  
+
+/^(a){1,2}/
+    bcd
+    abc
+    aab  
+
+/^(a){1,3}/
+    bcd
+    abc
+    aab
+    aaa   
+
+/^(a){1,}/
+    bcd
+    abc
+    aab
+    aaa
+    aaaaaaaa    
+
+/.*\.gif/
+    borfle\nbib.gif\nno
+
+/.{0,}\.gif/
+    borfle\nbib.gif\nno
+
+/.*\.gif/m
+    borfle\nbib.gif\nno
+
+/.*\.gif/s
+    borfle\nbib.gif\nno
+
+/.*\.gif/ms
+    borfle\nbib.gif\nno

-/^[^a]{2}/8i
-    \x{100}bc
- 
-/^[^a]{2,}/8i
-    \x{100}bcAa
+/.*$/
+    borfle\nbib.gif\nno

-/^[^a]{2,}?/8i
-    \x{100}bca
+/.*$/m
+    borfle\nbib.gif\nno

-/\x{100}{0,0}/8
+/.*$/s
+    borfle\nbib.gif\nno
+
+/.*$/ms
+    borfle\nbib.gif\nno
+    
+/.*$/
+    borfle\nbib.gif\nno\n
+
+/.*$/m
+    borfle\nbib.gif\nno\n
+
+/.*$/s
+    borfle\nbib.gif\nno\n
+
+/.*$/ms
+    borfle\nbib.gif\nno\n
+    
+/(.*X|^B)/
+    abcde\n1234Xyz
+    BarFoo 
+    *** Failers
+    abcde\nBar  
+
+/(.*X|^B)/m
+    abcde\n1234Xyz
+    BarFoo 
+    abcde\nBar  
+
+/(.*X|^B)/s
+    abcde\n1234Xyz
+    BarFoo 
+    *** Failers
+    abcde\nBar  
+
+/(.*X|^B)/ms
+    abcde\n1234Xyz
+    BarFoo 
+    abcde\nBar  
+
+/(?s)(.*X|^B)/
+    abcde\n1234Xyz
+    BarFoo 
+    *** Failers 
+    abcde\nBar  
+
+/(?s:.*X|^B)/
+    abcde\n1234Xyz
+    BarFoo 
+    *** Failers 
+    abcde\nBar  
+
+/^.*B/
+    **** Failers
+    abc\nB
+     
+/(?s)^.*B/
+    abc\nB
+
+/(?m)^.*B/
+    abc\nB
+     
+/(?ms)^.*B/
+    abc\nB
+
+/(?ms)^B/
+    abc\nB
+
+/(?s)B$/
+    B\n
+
+/^[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]/
+    123456654321
+  
+/^\d\d\d\d\d\d\d\d\d\d\d\d/
+    123456654321 
+
+/^[\d][\d][\d][\d][\d][\d][\d][\d][\d][\d][\d][\d]/
+    123456654321
+  
+/^[abc]{12}/
+    abcabcabcabc
+    
+/^[a-c]{12}/
+    abcabcabcabc
+    
+/^(a|b|c){12}/
+    abcabcabcabc 
+
+/^[abcdefghijklmnopqrstuvwxy0123456789]/
+    n
+    *** Failers 
+    z 
+
+/abcde{0,0}/
     abcd
+    *** Failers
+    abce  
+
+/ab[cd]{0,0}e/
+    abe
+    *** Failers
+    abcde 
+    
+/ab(c){0,0}d/
+    abd
+    *** Failers
+    abcd   
+
+/a(b*)/
+    a
+    ab
+    abbbb
+    *** Failers
+    bbbbb    
+    
+/ab\d{0}e/
+    abe
+    *** Failers
+    ab1e   
+    
+/"([^\\"]+|\\.)*"/
+    the \"quick\" brown fox
+    \"the \\\"quick\\\" brown fox\" 
+
+/.*?/g+
+    abc
+  
+/\b/g+
+    abc 
+
+/\b/+g
+    abc 
+
+//g
+    abc
+
+/<tr([\w\W\s\d][^<>]{0,})><TD([\w\W\s\d][^<>]{0,})>([\d]{0,}\.)(.*)((<BR>([\w\W\s\d][^<>]{0,})|[\s]{0,}))<\/a><\/TD><TD([\w\W\s\d][^<>]{0,})>([\w\W\s\d][^<>]{0,})<\/TD><TD([\w\W\s\d][^<>]{0,})>([\w\W\s\d][^<>]{0,})<\/TD><\/TR>/is
+  <TR BGCOLOR='#DBE9E9'><TD align=left valign=top>43.<a href='joblist.cfm?JobID=94 6735&Keyword='>Word Processor<BR>(N-1286)</a></TD><TD align=left valign=top>Lega lstaff.com</TD><TD align=left valign=top>CA - Statewide</TD></TR>
+
+/a[^a]b/
+    acb
+    a\nb
+    
+/a.b/
+    acb
+    *** Failers 
+    a\nb   
+    
+/a[^a]b/s
+    acb
+    a\nb  
+    
+/a.b/s
+    acb
+    a\nb  
+
+/^(b+?|a){1,2}?c/
+    bac
+    bbac
+    bbbac
+    bbbbac
+    bbbbbac 
+
+/^(b+|a){1,2}?c/
+    bac
+    bbac
+    bbbac
+    bbbbac
+    bbbbbac 
+    
+/(?!\A)x/m
+    x\nb\n
+    a\bx\n  
+    
+/\x0{ab}/
+    \0{ab} 
+
+/(A|B)*?CD/
+    CD 
+    
+/(A|B)*CD/
+    CD 
+
+/(?<!bar)foo/
+    foo
+    catfood
+    arfootle
+    rfoosh
+    *** Failers
+    barfoo
+    towbarfoo
+
+/\w{3}(?<!bar)foo/
+    catfood
+    *** Failers
+    foo
+    barfoo
+    towbarfoo
+
+/(?<=(foo)a)bar/
+    fooabar
+    *** Failers
+    bar
+    foobbar
+      
+/\Aabc\z/m
+    abc
+    *** Failers
+    abc\n   
+    qqq\nabc
+    abc\nzzz
+    qqq\nabc\nzzz
+
+"(?>.*/)foo"
+    /this/is/a/very/long/line/in/deed/with/very/many/slashes/in/it/you/see/
+
+"(?>.*/)foo"
+    /this/is/a/very/long/line/in/deed/with/very/many/slashes/in/and/foo
+
+/(?>(\.\d\d[1-9]?))\d+/
+    1.230003938
+    1.875000282
+    *** Failers 
+    1.235 
+
+/^((?>\w+)|(?>\s+))*$/
+    now is the time for all good men to come to the aid of the party
+    *** Failers
+    this is not a line with only words and spaces!
+    
+/(\d+)(\w)/
+    12345a
+    12345+ 
+
+/((?>\d+))(\w)/
+    12345a
+    *** Failers
+    12345+ 
+
+/(?>a+)b/
+    aaab
+
+/((?>a+)b)/
+    aaab
+
+/(?>(a+))b/
+    aaab
+
+/(?>b)+/
+    aaabbbccc
+
+/(?>a+|b+|c+)*c/
+    aaabbbbccccd
+    
+/(a+|b+|c+)*c/
+    aaabbbbccccd
+
+/((?>[^()]+)|\([^()]*\))+/
+    ((abc(ade)ufh()()x
+    
+/\(((?>[^()]+)|\([^()]+\))+\)/ 
+    (abc)
+    (abc(def)xyz)
+    *** Failers
+    ((()aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa   
+
+/a(?-i)b/i
+    ab
+    Ab
+    *** Failers 
+    aB
+    AB
+        
+/(a (?x)b c)d e/
+    a bcd e
+    *** Failers
+    a b cd e
+    abcd e   
+    a bcde

-/\x{100}?/8
+/(a b(?x)c d (?-x)e f)/
+    a bcde f
+    *** Failers
+    abcdef  
+
+/(a(?i)b)c/
+    abc
+    aBc
+    *** Failers
+    abC
+    aBC  
+    Abc
+    ABc
+    ABC
+    AbC
+    
+/a(?i:b)c/
+    abc
+    aBc
+    *** Failers 
+    ABC
+    abC
+    aBC
+    
+/a(?i:b)*c/
+    aBc
+    aBBc
+    *** Failers 
+    aBC
+    aBBC
+    
+/a(?=b(?i)c)\w\wd/
     abcd
-    \x{100}\x{100} 
+    abCd
+    *** Failers
+    aBCd
+    abcD     
+    
+/(?s-i:more.*than).*million/i
+    more than million
+    more than MILLION
+    more \n than Million 
+    *** Failers
+    MORE THAN MILLION    
+    more \n than \n million

-/\x{100}{0,3}/8 
-    \x{100}\x{100} 
-    \x{100}\x{100}\x{100}\x{100} 
+/(?:(?s-i)more.*than).*million/i
+    more than million
+    more than MILLION
+    more \n than Million 
+    *** Failers
+    MORE THAN MILLION    
+    more \n than \n million

-/\x{100}*/8
-    abce
-    \x{100}\x{100}\x{100}\x{100} 
+/(?>a(?i)b+)+c/ 
+    abc
+    aBbc
+    aBBc 
+    *** Failers
+    Abc
+    abAb    
+    abbC 
+    
+/(?=a(?i)b)\w\wc/
+    abc
+    aBc
+    *** Failers
+    Ab 
+    abC
+    aBC     
+    
+/(?<=a(?i)b)(\w\w)c/
+    abxxc
+    aBxxc
+    *** Failers
+    Abxxc
+    ABxxc
+    abxxC

-/\x{100}{1,1}/8
-    abcd\x{100}\x{100}\x{100}\x{100} 
+/^(?(?=abc)\w{3}:|\d\d)$/
+    abc:
+    12
+    *** Failers
+    123
+    xyz

-/\x{100}{1,3}/8
-    abcd\x{100}\x{100}\x{100}\x{100} 
+/^(?(?!abc)\d\d|\w{3}:)$/
+    abc:
+    12
+    *** Failers
+    123
+    xyz    
+    
+/(?(?<=foo)bar|cat)/
+    foobar
+    cat
+    fcat
+    focat   
+    *** Failers
+    foocat

-/\x{100}+/8
-    abcd\x{100}\x{100}\x{100}\x{100} 
+/(?(?<!foo)cat|bar)/
+    foobar
+    cat
+    fcat
+    focat   
+    *** Failers
+    foocat

-/\x{100}{3}/8
-    abcd\x{100}\x{100}\x{100}XX
+/(?>a*)*/
+    a
+    aa
+    aaaa
+    
+/(abc|)+/
+    abc
+    abcabc
+    abcabcabc
+    xyz

-/\x{100}{3,5}/8
-    abcd\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}XX
+/([a]*)*/
+    a
+    aaaaa 
+ 
+/([ab]*)*/
+    a
+    b
+    ababab
+    aaaabcde
+    bbbb    
+ 
+/([^a]*)*/
+    b
+    bbbb
+    aaa   
+ 
+/([^ab]*)*/
+    cccc
+    abab  
+ 
+/([a]*?)*/
+    a
+    aaaa 
+ 
+/([ab]*?)*/
+    a
+    b
+    abab
+    baba   
+ 
+/([^a]*?)*/
+    b
+    bbbb
+    aaa   
+ 
+/([^ab]*?)*/
+    c
+    cccc
+    baba   
+ 
+/(?>a*)*/
+    a
+    aaabcde 
+ 
+/((?>a*))*/
+    aaaaa
+    aabbaa 
+ 
+/((?>a*?))*/
+    aaaaa
+    aabbaa

-/\x{100}{3,}/8
-    abcd\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}XX
+/(?(?=[^a-z]+[a-z])  \d{2}-[a-z]{3}-\d{2}  |  \d{2}-\d{2}-\d{2} ) /x
+    12-sep-98
+    12-09-98
+    *** Failers
+    sep-12-98
+        
+/(?i:saturday|sunday)/
+    saturday
+    sunday
+    Saturday
+    Sunday
+    SATURDAY
+    SUNDAY
+    SunDay
+    
+/(a(?i)bc|BB)x/
+    abcx
+    aBCx
+    bbx
+    BBx
+    *** Failers
+    abcX
+    aBCX
+    bbX
+    BBX

-/(?<=a\x{100}{2}b)X/8
-    Xyyya\x{100}\x{100}bXzzz
+/^([ab](?i)[cd]|[ef])/
+    ac
+    aC
+    bD
+    elephant
+    Europe 
+    frog
+    France
+    *** Failers
+    Africa

-/\D*/8
-  aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
+/^(ab|a(?i)[b-c](?m-i)d|x(?i)y|z)/
+    ab
+    aBd
+    xy
+    xY
+    zebra
+    Zambesi
+    *** Failers
+    aCD  
+    XY

-/\D*/8
-  \x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}
+/(?<=foo\n)^bar/m
+    foo\nbar
+    *** Failers
+    bar
+    baz\nbar

-/\D/8
-    1X2
-    1\x{100}2 
-  
-/>\S/8
-    > >X Y
-    > >\x{100} Y
-  
-/\d/8
-    \x{100}3
+/(?<=(?<!foo)bar)baz/
+    barbaz
+    barbarbaz 
+    koobarbaz 
+    *** Failers
+    baz
+    foobarbaz 
+
+/The following tests are taken from the Perl 5.005 test suite; some of them/
+/are compatible with 5.004, but I'd rather not have to sort them out./
+
+/abc/
+    abc
+    xabcy
+    ababc
+    *** Failers
+    xbc
+    axc
+    abx
+
+/ab*c/
+    abc
+
+/ab*bc/
+    abc
+    abbc
+    abbbbc
+
+/.{1}/
+    abbbbc
+
+/.{3,4}/
+    abbbbc
+
+/ab{0,}bc/
+    abbbbc
+
+/ab+bc/
+    abbc
+    *** Failers
+    abc
+    abq
+
+/ab+bc/
+    abbbbc
+
+/ab{1,}bc/
+    abbbbc
+
+/ab{1,3}bc/
+    abbbbc
+
+/ab{3,4}bc/
+    abbbbc
+
+/ab{4,5}bc/
+    *** Failers
+    abq
+    abbbbc
+
+/ab?bc/
+    abbc
+    abc
+
+/ab{0,1}bc/
+    abc
+
+/ab?bc/
+
+/ab?c/
+    abc
+
+/ab{0,1}c/
+    abc
+
+/^abc$/
+    abc
+    *** Failers
+    abbbbc
+    abcc
+
+/^abc/
+    abcc
+
+/^abc$/
+
+/abc$/
+    aabc
+    *** Failers
+    aabc
+    aabcd
+
+/^/
+    abc
+
+/$/
+    abc
+
+/a.c/
+    abc
+    axc
+
+/a.*c/
+    axyzc
+
+/a[bc]d/
+    abd
+    *** Failers
+    axyzd
+    abc
+
+/a[b-d]e/
+    ace
+
+/a[b-d]/
+    aac
+
+/a[-b]/
+    a-
+
+/a[b-]/
+    a-
+
+/a]/
+    a]
+
+/a[]]b/
+    a]b
+
+/a[^bc]d/
+    aed
+    *** Failers
+    abd
+    abd
+
+/a[^-b]c/
+    adc
+
+/a[^]b]c/
+    adc
+    *** Failers
+    a-c
+    a]c
+
+/\ba\b/
+    a-
+    -a
+    -a-
+
+/\by\b/
+    *** Failers
+    xy
+    yz
+    xyz
+
+/\Ba\B/
+    *** Failers
+    a-
+    -a
+    -a-
+
+/\By\b/
+    xy
+
+/\by\B/
+    yz
+
+/\By\B/
+    xyz
+
+/\w/
+    a
+
+/\W/
+    -
+    *** Failers
+    -
+    a
+
+/a\sb/
+    a b
+
+/a\Sb/
+    a-b
+    *** Failers
+    a-b
+    a b
+
+/\d/
+    1
+
+/\D/
+    -
+    *** Failers
+    -
+    1
+
+/[\w]/
+    a
+
+/[\W]/
+    -
+    *** Failers
+    -
+    a
+
+/a[\s]b/
+    a b
+
+/a[\S]b/
+    a-b
+    *** Failers
+    a-b
+    a b
+
+/[\d]/
+    1
+
+/[\D]/
+    -
+    *** Failers
+    -
+    1
+
+/ab|cd/
+    abc
+    abcd
+
+/()ef/
+    def
+
+/$b/
+
+/a\(b/
+    a(b
+
+/a\(*b/
+    ab
+    a((b
+
+/a\\b/
+    a\b
+
+/((a))/
+    abc
+
+/(a)b(c)/
+    abc
+
+/a+b+c/
+    aabbabc
+
+/a{1,}b{1,}c/
+    aabbabc
+
+/a.+?c/
+    abcabc
+
+/(a+|b)*/
+    ab
+
+/(a+|b){0,}/
+    ab
+
+/(a+|b)+/
+    ab
+
+/(a+|b){1,}/
+    ab
+
+/(a+|b)?/
+    ab
+
+/(a+|b){0,1}/
+    ab
+
+/[^ab]*/
+    cde
+
+/abc/
+    *** Failers
+    b

-/\s/8
-    \x{100} X
+
+/a*/

-/\D+/8
-    12abcd34
+
+/([abc])*d/
+    abbbcd
+
+/([abc])*bcd/
+    abcd
+
+/a|b|c|d|e/
+    e
+
+/(a|b|c|d|e)f/
+    ef
+
+/abcd*efg/
+    abcdefg
+
+/ab*/
+    xabyabbbz
+    xayabbbz
+
+/(ab|cd)e/
+    abcde
+
+/[abhgefdc]ij/
+    hij
+
+/^(ab|cd)e/
+
+/(abc|)ef/
+    abcdef
+
+/(a|b)c*d/
+    abcd
+
+/(ab|ab*)bc/
+    abc
+
+/a([bc]*)c*/
+    abc
+
+/a([bc]*)(c*d)/
+    abcd
+
+/a([bc]+)(c*d)/
+    abcd
+
+/a([bc]*)(c+d)/
+    abcd
+
+/a[bcd]*dcdcde/
+    adcdcde
+
+/a[bcd]+dcdcde/
     *** Failers
-    1234  
+    abcde
+    adcdcde

-/\D{2,3}/8
-    12abcd34
-    12ab34
-    *** Failers  
-    1234
-    12a34  
+/(ab|a)b*c/
+    abc

-/\D{2,3}?/8
-    12abcd34
-    12ab34
-    *** Failers  
-    1234
-    12a34  
+/((a)(b)c)(d)/
+    abcd

-/\d+/8
-    12abcd34
+/[a-zA-Z_][a-zA-Z0-9_]*/
+    alpha
+
+/^a(bc+|b[eh])g|.h$/
+    abh
+
+/(bc+d$|ef*g.|h?i(j|k))/
+    effgz
+    ij
+    reffgz
     *** Failers
+    effg
+    bcdd

-/\d{2,3}/8
-    12abcd34
-    1234abcd
-    *** Failers  
-    1.4 
+/((((((((((a))))))))))/
+    a

-/\d{2,3}?/8
-    12abcd34
-    1234abcd
-    *** Failers  
-    1.4 
+/(((((((((a)))))))))/
+    a

-/\S+/8
-    12abcd34
+/multiple words of text/
     *** Failers
-    \    \ 
+    aa
+    uh-uh

-/\S{2,3}/8
-    12abcd34
-    1234abcd
+/multiple words/
+    multiple words, yeah
+
+/(.*)c(.*)/
+    abcde
+
+/\((.*), (.*)\)/
+    (a, b)
+
+/[k]/
+
+/abcd/
+    abcd
+
+/a(bc)d/
+    abcd
+
+/a[-]?c/
+    ac
+
+/abc/i
+    ABC
+    XABCY
+    ABABC
     *** Failers
-    \     \  
+    aaxabxbaxbbx
+    XBC
+    AXC
+    ABX

-/\S{2,3}?/8
-    12abcd34
-    1234abcd
+/ab*c/i
+    ABC
+
+/ab*bc/i
+    ABC
+    ABBC
+
+/ab*?bc/i
+    ABBBBC
+
+/ab{0,}?bc/i
+    ABBBBC
+
+/ab+?bc/i
+    ABBC
+
+/ab+bc/i
     *** Failers
-    \     \  
+    ABC
+    ABQ

-/>\s+</8
-    12>      <34
+/ab{1,}bc/i
+
+/ab+bc/i
+    ABBBBC
+
+/ab{1,}?bc/i
+    ABBBBC
+
+/ab{1,3}?bc/i
+    ABBBBC
+
+/ab{3,4}?bc/i
+    ABBBBC
+
+/ab{4,5}?bc/i
     *** Failers
+    ABQ
+    ABBBBC

-/>\s{2,3}</8
-    ab>  <cd
-    ab>   <ce
+/ab??bc/i
+    ABBC
+    ABC
+
+/ab{0,1}?bc/i
+    ABC
+
+/ab??bc/i
+
+/ab??c/i
+    ABC
+
+/ab{0,1}?c/i
+    ABC
+
+/^abc$/i
+    ABC
     *** Failers
-    ab>    <cd 
+    ABBBBC
+    ABCC

-/>\s{2,3}?</8
-    ab>  <cd
-    ab>   <ce
+/^abc/i
+    ABCC
+
+/^abc$/i
+
+/abc$/i
+    AABC
+
+/^/i
+    ABC
+
+/$/i
+    ABC
+
+/a.c/i
+    ABC
+    AXC
+
+/a.*?c/i
+    AXYZC
+
+/a.*c/i
     *** Failers
-    ab>    <cd 
+    AABC
+    AXYZD

-/\w+/8
-    12      34
+/a[bc]d/i
+    ABD
+
+/a[b-d]e/i
+    ACE
     *** Failers
-    +++=*! 
+    ABC
+    ABD

-/\w{2,3}/8
-    ab  cd
-    abcd ce
+/a[b-d]/i
+    AAC
+
+/a[-b]/i
+    A-
+
+/a[b-]/i
+    A-
+
+/a]/i
+    A]
+
+/a[]]b/i
+    A]B
+
+/a[^bc]d/i
+    AED
+
+/a[^-b]c/i
+    ADC
     *** Failers
-    a.b.c
+    ABD
+    A-C

-/\w{2,3}?/8
-    ab  cd
-    abcd ce
+/a[^]b]c/i
+    ADC
+
+/ab|cd/i
+    ABC
+    ABCD
+
+/()ef/i
+    DEF
+
+/$b/i
     *** Failers
-    a.b.c
+    A]C
+    B

-/\W+/8
-    12====34
+/a\(b/i
+    A(B
+
+/a\(*b/i
+    AB
+    A((B
+
+/a\\b/i
+    A\B
+
+/((a))/i
+    ABC
+
+/(a)b(c)/i
+    ABC
+
+/a+b+c/i
+    AABBABC
+
+/a{1,}b{1,}c/i
+    AABBABC
+
+/a.+?c/i
+    ABCABC
+
+/a.*?c/i
+    ABCABC
+
+/a.{0,5}?c/i
+    ABCABC
+
+/(a+|b)*/i
+    AB
+
+/(a+|b){0,}/i
+    AB
+
+/(a+|b)+/i
+    AB
+
+/(a+|b){1,}/i
+    AB
+
+/(a+|b)?/i
+    AB
+
+/(a+|b){0,1}/i
+    AB
+
+/(a+|b){0,1}?/i
+    AB
+
+/[^ab]*/i
+    CDE
+
+/abc/i
+
+/a*/i
+    
+
+/([abc])*d/i
+    ABBBCD
+
+/([abc])*bcd/i
+    ABCD
+
+/a|b|c|d|e/i
+    E
+
+/(a|b|c|d|e)f/i
+    EF
+
+/abcd*efg/i
+    ABCDEFG
+
+/ab*/i
+    XABYABBBZ
+    XAYABBBZ
+
+/(ab|cd)e/i
+    ABCDE
+
+/[abhgefdc]ij/i
+    HIJ
+
+/^(ab|cd)e/i
+    ABCDE
+
+/(abc|)ef/i
+    ABCDEF
+
+/(a|b)c*d/i
+    ABCD
+
+/(ab|ab*)bc/i
+    ABC
+
+/a([bc]*)c*/i
+    ABC
+
+/a([bc]*)(c*d)/i
+    ABCD
+
+/a([bc]+)(c*d)/i
+    ABCD
+
+/a([bc]*)(c+d)/i
+    ABCD
+
+/a[bcd]*dcdcde/i
+    ADCDCDE
+
+/a[bcd]+dcdcde/i
+
+/(ab|a)b*c/i
+    ABC
+
+/((a)(b)c)(d)/i
+    ABCD
+
+/[a-zA-Z_][a-zA-Z0-9_]*/i
+    ALPHA
+
+/^a(bc+|b[eh])g|.h$/i
+    ABH
+
+/(bc+d$|ef*g.|h?i(j|k))/i
+    EFFGZ
+    IJ
+    REFFGZ
     *** Failers
-    abcd 
+    ADCDCDE
+    EFFG
+    BCDD

-/\W{2,3}/8
-    ab====cd
-    ab==cd
+/((((((((((a))))))))))/i
+    A
+
+/(((((((((a)))))))))/i
+    A
+
+/(?:(?:(?:(?:(?:(?:(?:(?:(?:(a))))))))))/i
+    A
+
+/(?:(?:(?:(?:(?:(?:(?:(?:(?:(a|b|c))))))))))/i
+    C
+
+/multiple words of text/i
     *** Failers
-    a.b.c
+    AA
+    UH-UH

-/\W{2,3}?/8
-    ab====cd
-    ab==cd
+/multiple words/i
+    MULTIPLE WORDS, YEAH
+
+/(.*)c(.*)/i
+    ABCDE
+
+/\((.*), (.*)\)/i
+    (A, B)
+
+/[k]/i
+
+/abcd/i
+    ABCD
+
+/a(bc)d/i
+    ABCD
+
+/a[-]?c/i
+    AC
+
+/a(?!b)./
+    abad
+
+/a(?=d)./
+    abad
+
+/a(?=c|d)./
+    abad
+
+/a(?:b|c|d)(.)/
+    ace
+
+/a(?:b|c|d)*(.)/
+    ace
+
+/a(?:b|c|d)+?(.)/
+    ace
+    acdbcdbe
+
+/a(?:b|c|d)+(.)/
+    acdbcdbe
+
+/a(?:b|c|d){2}(.)/
+    acdbcdbe
+
+/a(?:b|c|d){4,5}(.)/
+    acdbcdbe
+
+/a(?:b|c|d){4,5}?(.)/
+    acdbcdbe
+
+/((foo)|(bar))*/
+    foobar
+
+/a(?:b|c|d){6,7}(.)/
+    acdbcdbe
+
+/a(?:b|c|d){6,7}?(.)/
+    acdbcdbe
+
+/a(?:b|c|d){5,6}(.)/
+    acdbcdbe
+
+/a(?:b|c|d){5,6}?(.)/
+    acdbcdbe
+
+/a(?:b|c|d){5,7}(.)/
+    acdbcdbe
+
+/a(?:b|c|d){5,7}?(.)/
+    acdbcdbe
+
+/a(?:b|(c|e){1,2}?|d)+?(.)/
+    ace
+
+/^(.+)?B/
+    AB
+
+/^([^a-z])|(\^)$/
+    .
+
+/^[<>]&/
+    <&OUT
+
+/(?:(f)(o)(o)|(b)(a)(r))*/
+    foobar
+
+/(?<=a)b/
+    ab
     *** Failers
-    a.b.c
+    cb
+    b

-/[\x{100}]/8
-    \x{100}
-    Z\x{100}
-    \x{100}Z
-    *** Failers 
+/(?<!c)b/
+    ab
+    b
+    b

-/[Z\x{100}]/8
-    Z\x{100}
-    \x{100}
-    \x{100}Z
-    *** Failers 
+/(?:..)*a/
+    aba

-/[\x{100}\x{200}]/8
-   ab\x{100}cd
-   ab\x{200}cd
-   *** Failers  
+/(?:..)*?a/
+    aba

-/[\x{100}-\x{200}]/8
-   ab\x{100}cd
-   ab\x{200}cd
-   ab\x{111}cd 
-   *** Failers  
+/^(){3,5}/
+    abc

-/[z-\x{200}]/8
-   ab\x{100}cd
-   ab\x{200}cd
-   ab\x{111}cd 
-   abzcd
-   ab|cd  
-   *** Failers  
+/^(a+)*ax/
+    aax

-/[Q\x{100}\x{200}]/8
-   ab\x{100}cd
-   ab\x{200}cd
-   Q? 
-   *** Failers  
+/^((a|b)+)*ax/
+    aax

-/[Q\x{100}-\x{200}]/8
-   ab\x{100}cd
-   ab\x{200}cd
-   ab\x{111}cd 
-   Q? 
-   *** Failers  
+/^((a|bc)+)*ax/
+    aax

-/[Qz-\x{200}]/8
-   ab\x{100}cd
-   ab\x{200}cd
-   ab\x{111}cd 
-   abzcd
-   ab|cd  
-   Q? 
-   *** Failers  
+/(a|x)*ab/
+    cab

-/[\x{100}\x{200}]{1,3}/8
-   ab\x{100}cd
-   ab\x{200}cd
-   ab\x{200}\x{100}\x{200}\x{100}cd
-   *** Failers  
+/(a)*ab/
+    cab

-/[\x{100}\x{200}]{1,3}?/8
-   ab\x{100}cd
-   ab\x{200}cd
-   ab\x{200}\x{100}\x{200}\x{100}cd
-   *** Failers  
+/(?:(?i)a)b/
+    ab

-/[Q\x{100}\x{200}]{1,3}/8
-   ab\x{100}cd
-   ab\x{200}cd
-   ab\x{200}\x{100}\x{200}\x{100}cd
-   *** Failers  
+/((?i)a)b/
+    ab

-/[Q\x{100}\x{200}]{1,3}?/8
-   ab\x{100}cd
-   ab\x{200}cd
-   ab\x{200}\x{100}\x{200}\x{100}cd
-   *** Failers  
+/(?:(?i)a)b/
+    Ab

-/(?<=[\x{100}\x{200}])X/8
-    abc\x{200}X
-    abc\x{100}X 
+/((?i)a)b/
+    Ab
+
+/(?:(?i)a)b/
     *** Failers
-    X  
+    cb
+    aB

-/(?<=[Q\x{100}\x{200}])X/8
-    abc\x{200}X
-    abc\x{100}X 
-    abQX 
+/((?i)a)b/
+
+/(?i:a)b/
+    ab
+
+/((?i:a))b/
+    ab
+
+/(?i:a)b/
+    Ab
+
+/((?i:a))b/
+    Ab
+
+/(?i:a)b/
     *** Failers
-    X  
+    aB
+    aB

-/(?<=[\x{100}\x{200}]{3})X/8
-    abc\x{100}\x{200}\x{100}X
+/((?i:a))b/
+
+/(?:(?-i)a)b/i
+    ab
+
+/((?-i)a)b/i
+    ab
+
+/(?:(?-i)a)b/i
+    aB
+
+/((?-i)a)b/i
+    aB
+
+/(?:(?-i)a)b/i
     *** Failers
-    abc\x{200}X
-    X  
+    aB
+    Ab

-/[^\x{100}\x{200}]X/8
-    AX
-    \x{150}X
-    \x{500}X 
+/((?-i)a)b/i
+
+/(?:(?-i)a)b/i
+    aB
+
+/((?-i)a)b/i
+    aB
+
+/(?:(?-i)a)b/i
     *** Failers
-    \x{100}X
-    \x{200}X   
+    Ab
+    AB

-/[^Q\x{100}\x{200}]X/8
-    AX
-    \x{150}X
-    \x{500}X 
+/((?-i)a)b/i
+
+/(?-i:a)b/i
+    ab
+
+/((?-i:a))b/i
+    ab
+
+/(?-i:a)b/i
+    aB
+
+/((?-i:a))b/i
+    aB
+
+/(?-i:a)b/i
     *** Failers
-    \x{100}X
-    \x{200}X   
-    QX 
+    AB
+    Ab

-/[^\x{100}-\x{200}]X/8
-    AX
-    \x{500}X 
+/((?-i:a))b/i
+
+/(?-i:a)b/i
+    aB
+
+/((?-i:a))b/i
+    aB
+
+/(?-i:a)b/i
     *** Failers
-    \x{100}X
-    \x{150}X
-    \x{200}X   
+    Ab
+    AB

-/[z-\x{100}]/8i
-    z
-    Z 
-    \x{100}
+/((?-i:a))b/i
+
+/((?-i:a.))b/i
     *** Failers
-    \x{102}
-    y    
+    AB
+    a\nB

-/[\xFF]/
-    >\xff<
+/((?s-i:a.))b/i
+    a\nB

-/[\xff]/8
-    >\x{ff}<
+/(?:c|d)(?:)(?:a(?:)(?:b)(?:b(?:))(?:b(?:)(?:b)))/
+    cabbbb

-/[^\xFF]/
-    XYZ
+/(?:c|d)(?:)(?:aaaaaaaa(?:)(?:bbbbbbbb)(?:bbbbbbbb(?:))(?:bbbbbbbb(?:)(?:bbbbbbbb)))/
+    caaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb

-/[^\xff]/8
-    XYZ
-    \x{123} 
+/foo\w*\d{4}baz/
+    foobar1234baz

-/^[ac]*b/8
-  xb
+/x(~~)*(?:(?:F)?)?/
+    x~~

-/^[ac\x{100}]*b/8
-  xb
+/^a(?#xxx){3}c/
+    aaac

-/^[^x]*b/8i
-  xb
+/^a (?#xxx) (?#yyy) {3}c/x
+    aaac

-/^[^x]*b/8
-  xb
+/(?<![cd])b/
+    *** Failers
+    B\nB
+    dbcb
+
+/(?<![cd])[ab]/
+    dbaacb
+
+/(?<!(c|d))b/
+
+/(?<!(c|d))[ab]/
+    dbaacb
+
+/(?<!cd)[ab]/
+    cdaccb
+
+/^(?:a?b?)*$/
+    *** Failers
+    dbcb
+    a--
+
+/((?s)^a(.))((?m)^b$)/
+    a\nb\nc\n
+
+/((?m)^b$)/
+    a\nb\nc\n
+
+/(?m)^b/
+    a\nb\n
+
+/(?m)^(b)/
+    a\nb\n
+
+/((?m)^b)/
+    a\nb\n
+
+/\n((?m)^b)/
+    a\nb\n
+
+/((?s).)c(?!.)/
+    a\nb\nc\n
+    a\nb\nc\n
+
+/((?s)b.)c(?!.)/
+    a\nb\nc\n
+    a\nb\nc\n
+
+/^b/
+
+/()^b/
+    *** Failers
+    a\nb\nc\n
+    a\nb\nc\n
+
+/((?m)^b)/
+    a\nb\nc\n
+
+/(?(?!a)a|b)/
+
+/(?(?!a)b|a)/
+    a
+
+/(?(?=a)b|a)/
+    *** Failers
+    a
+    a
+
+/(?(?=a)a|b)/
+    a
+
+/(\w+:)+/
+    one:
+
+/$(?<=^(a))/
+    a
+
+/([\w:]+::)?(\w+)$/
+    abcd
+    xy:z:::abcd
+
+/^[^bcd]*(c+)/
+    aexycd
+
+/(a*)b+/
+    caab
+
+/([\w:]+::)?(\w+)$/
+    abcd
+    xy:z:::abcd
+    *** Failers
+    abcd:
+    abcd:
+
+/^[^bcd]*(c+)/
+    aexycd
+
+/(>a+)ab/
+
+/(?>a+)b/
+    aaab
+
+/([[:]+)/
+    a:[b]:
+
+/([[=]+)/
+    a=[b]=
+
+/([[.]+)/
+    a.[b].
+
+/((?>a+)b)/
+    aaab
+
+/(?>(a+))b/
+    aaab
+
+/((?>[^()]+)|\([^()]*\))+/
+    ((abc(ade)ufh()()x
+
+/a\Z/
+    *** Failers
+    aaab
+    a\nb\n
+
+/b\Z/
+    a\nb\n
+
+/b\z/
+
+/b\Z/
+    a\nb
+
+/b\z/
+    a\nb
+    *** Failers
+    
+/(?>.*)(?<=(abcd|wxyz))/
+    alphabetabcd
+    endingwxyz
+    *** Failers
+    a rather long string that doesn't end with one of them
+
+/word (?>(?:(?!otherword)[a-zA-Z0-9]+ ){0,30})otherword/
+    word cat dog elephant mussel cow horse canary baboon snake shark otherword
+    word cat dog elephant mussel cow horse canary baboon snake shark

-/^\d*b/8
-  xb 
+/word (?>[a-zA-Z0-9]+ ){0,30}otherword/
+    word cat dog elephant mussel cow horse canary baboon snake shark the quick brown fox and the lazy dog and several other words getting close to thirty by now I hope

-/(|a)/g8
-    catac
-    a\x{256}a 
+/(?<=\d{3}(?!999))foo/
+    999foo
+    123999foo 
+    *** Failers
+    123abcfoo
+    
+/(?<=(?!...999)\d{3})foo/
+    999foo
+    123999foo 
+    *** Failers
+    123abcfoo

-/^\x{85}$/8i
-    \x{85}
+/(?<=\d{3}(?!999)...)foo/
+    123abcfoo
+    123456foo 
+    *** Failers
+    123999foo  
+    
+/(?<=\d{3}...)(?<!999)foo/
+    123abcfoo   
+    123456foo 
+    *** Failers
+    123999foo

-/^abc./mgx8<any>
-    abc1 \x0aabc2 \x0babc3xx \x0cabc4 \x0dabc5xx \x0d\x0aabc6 \x{0085}abc7 \x{2028}abc8 \x{2029}abc9 JUNK
+/((Z)+|A)*/
+    ZABCDEFG

-/abc.$/mgx8<any>
-    abc1\x0a abc2\x0b abc3\x0c abc4\x0d abc5\x0d\x0a abc6\x{0085} abc7\x{2028} abc8\x{2029} abc9
+/(Z()|A)*/
+    ZABCDEFG

-/^a\Rb/8<bsr_unicode>
+/(Z(())|A)*/
+    ZABCDEFG
+
+/((?>Z)+|A)*/
+    ZABCDEFG
+
+/((?>)+|A)*/
+    ZABCDEFG
+
+/a*/g
+    abbab
+
+/^[a-\d]/
+    abcde
+    -things
+    0digit
+    *** Failers
+    bcdef    
+
+/^[\d-a]/
+    abcde
+    -things
+    0digit
+    *** Failers
+    bcdef    
+    
+/[[:space:]]+/
+    > \x09\x0a\x0c\x0d\x0b<
+     
+/[[:blank:]]+/
+    > \x09\x0a\x0c\x0d\x0b<
+     
+/[\s]+/
+    > \x09\x0a\x0c\x0d\x0b<
+     
+/\s+/
+    > \x09\x0a\x0c\x0d\x0b<
+     
+/a?b/x
+    ab
+
+/(?!\A)x/m
+  a\nxb\n
+
+/(?!^)x/m
+  a\nxb\n
+
+/abc\Qabc\Eabc/
+    abcabcabc
+    
+/abc\Q(*+|\Eabc/
+    abc(*+|abc 
+
+/   abc\Q abc\Eabc/x
+    abc abcabc
+    *** Failers
+    abcabcabc  
+    
+/abc#comment
+    \Q#not comment
+    literal\E/x
+    abc#not comment\n    literal     
+
+/abc#comment
+    \Q#not comment
+    literal/x
+    abc#not comment\n    literal     
+
+/abc#comment
+    \Q#not comment
+    literal\E #more comment
+    /x
+    abc#not comment\n    literal     
+
+/abc#comment
+    \Q#not comment
+    literal\E #more comment/x
+    abc#not comment\n    literal     
+
+/\Qabc\$xyz\E/
+    abc\\\$xyz
+
+/\Qabc\E\$\Qxyz\E/
+    abc\$xyz
+
+/\Gabc/
+    abc
+    *** Failers
+    xyzabc  
+
+/\Gabc./g
+    abc1abc2xyzabc3
+
+/abc./g
+    abc1abc2xyzabc3 
+
+/a(?x: b c )d/
+    XabcdY
+    *** Failers 
+    Xa b c d Y 
+
+/((?x)x y z | a b c)/
+    XabcY
+    AxyzB 
+
+/(?i)AB(?-i)C/
+    XabCY
+    *** Failers
+    XabcY  
+
+/((?i)AB(?-i)C|D)E/
+    abCE
+    DE
+    *** Failers
+    abcE
+    abCe  
+    dE
+    De    
+
+/[z\Qa-d]\E]/
+    z
+    a
+    -
+    d
+    ] 
+    *** Failers
+    b     
+
+/[\z\C]/
+    z
+    C 
+    
+/\M/
+    M 
+    
+/(a+)*b/
+    aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa 
+    
+/(?i)reg(?:ul(?:[a\xE4]|ae)r|ex)/
+    REGular
+    regulaer
+    Regex  
+    regul\xE4r 
+
+/\xC5\xE6\xE5\xE4[\xE0-\xFF\xC0-\xDF]+/
+    \xC5\xE6\xE5\xE4\xE0
+    \xC5\xE6\xE5\xE4\xFF
+    \xC5\xE6\xE5\xE4\xC0
+    \xC5\xE6\xE5\xE4\xDF
+
+/(?<=Z)X./
+    \x84XAZXB
+
+/^(?(2)a|(1)(2))+$/
+    123a
+
+/(?<=a|bbbb)c/
+    ac
+    bbbbc
+
+/abc/SS>testsavedregex
+<testsavedregex
+    abc
+    *** Failers
+    bca
+    
+/abc/FSS>testsavedregex
+<testsavedregex
+    abc
+    *** Failers
+    bca
+
+/(a|b)/S>testsavedregex
+<testsavedregex
+    abc
+    *** Failers
+    def  
+    
+/(a|b)/SF>testsavedregex
+<testsavedregex
+    abc
+    *** Failers
+    def  
+    
+/line\nbreak/
+    this is a line\nbreak
+    line one\nthis is a line\nbreak in the second line 
+
+/line\nbreak/f
+    this is a line\nbreak
+    ** Failers 
+    line one\nthis is a line\nbreak in the second line 
+
+/line\nbreak/mf
+    this is a line\nbreak
+    ** Failers 
+    line one\nthis is a line\nbreak in the second line 
+
+/1234/
+    123\P
+    a4\P\R
+
+/1234/
+    123\P
+    4\P\R
+
+/^/mg
+    a\nb\nc\n
+    \ 
+    
+/(?<=C\n)^/mg
+    A\nC\nC\n 
+
+/(?s)A?B/
+    AB
+    aB  
+
+/(?s)A*B/
+    AB
+    aB  
+
+/(?m)A?B/
+    AB
+    aB  
+
+/(?m)A*B/
+    AB
+    aB  
+
+/Content-Type\x3A[^\r\n]{6,}/
+    Content-Type:xxxxxyyy 
+
+/Content-Type\x3A[^\r\n]{6,}z/
+    Content-Type:xxxxxyyyz
+
+/Content-Type\x3A[^a]{6,}/
+    Content-Type:xxxyyy 
+
+/Content-Type\x3A[^a]{6,}z/
+    Content-Type:xxxyyyz
+
+/^abc/m
+    xyz\nabc
+    xyz\nabc\<lf>
+    xyz\r\nabc\<lf>
+    xyz\rabc\<cr>
+    xyz\r\nabc\<crlf>
+    ** Failers 
+    xyz\nabc\<cr>
+    xyz\r\nabc\<cr>
+    xyz\nabc\<crlf>
+    xyz\rabc\<crlf>
+    xyz\rabc\<lf>
+    
+/abc$/m<lf>
+    xyzabc
+    xyzabc\n 
+    xyzabc\npqr 
+    xyzabc\r\<cr> 
+    xyzabc\rpqr\<cr> 
+    xyzabc\r\n\<crlf> 
+    xyzabc\r\npqr\<crlf> 
+    ** Failers
+    xyzabc\r 
+    xyzabc\rpqr 
+    xyzabc\r\n 
+    xyzabc\r\npqr 
+    
+/^abc/m<cr>
+    xyz\rabcdef
+    xyz\nabcdef\<lf>
+    ** Failers  
+    xyz\nabcdef
+       
+/^abc/m<lf>
+    xyz\nabcdef
+    xyz\rabcdef\<cr>
+    ** Failers  
+    xyz\rabcdef
+       
+/^abc/m<crlf>
+    xyz\r\nabcdef
+    xyz\rabcdef\<cr>
+    ** Failers  
+    xyz\rabcdef
+    
+/.*/<lf>
+    abc\ndef
+    abc\rdef
+    abc\r\ndef
+    \<cr>abc\ndef
+    \<cr>abc\rdef
+    \<cr>abc\r\ndef
+    \<crlf>abc\ndef
+    \<crlf>abc\rdef
+    \<crlf>abc\r\ndef
+
+/\w+(.)(.)?def/s
+    abc\ndef
+    abc\rdef
+    abc\r\ndef
+
+/^\w+=.*(\\\n.*)*/
+    abc=xyz\\\npqr
+
+/^(a()*)*/
+    aaaa
+
+/^(?:a(?:(?:))*)*/
+    aaaa
+
+/^(a()+)+/
+    aaaa
+
+/^(?:a(?:(?:))+)+/
+    aaaa
+
+/(a|)*\d/
+  aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
+  aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa4
+
+/(?>a|)*\d/
+  aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
+  aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa4
+
+/(?:a|)*\d/
+  aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
+  aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa4
+
+/^a.b/<lf>
+    a\rb
+    a\nb\<cr> 
+    ** Failers
     a\nb
+    a\nb\<any>
+    a\rb\<cr>   
+    a\rb\<any>   
+
+/^abc./mgx<any>
+    abc1 \x0aabc2 \x0babc3xx \x0cabc4 \x0dabc5xx \x0d\x0aabc6 \x85abc7 JUNK
+
+/abc.$/mgx<any>
+    abc1\x0a abc2\x0b abc3\x0c abc4\x0d abc5\x0d\x0a abc6\x85 abc9
+
+/^a\Rb/<bsr_unicode>
+    a\nb
     a\rb
     a\r\nb
     a\x0bb
     a\x0cb
-    a\x{85}b   
-    a\x{2028}b 
-    a\x{2029}b 
+    a\x85b   
     ** Failers
     a\n\rb

-/^a\R*b/8<bsr_unicode>
+/^a\R*b/<bsr_unicode>
     ab
     a\nb
     a\rb
     a\r\nb
     a\x0bb
-    a\x0c\x{2028}\x{2029}b
-    a\x{85}b   
+    a\x0cb
+    a\x85b   
     a\n\rb    
-    a\n\r\x{85}\x0cb 
+    a\n\r\x85\x0cb

-/^a\R+b/8<bsr_unicode>
+/^a\R+b/<bsr_unicode>
     a\nb
     a\rb
     a\r\nb
     a\x0bb
-    a\x0c\x{2028}\x{2029}b
-    a\x{85}b   
+    a\x0cb
+    a\x85b   
     a\n\rb    
-    a\n\r\x{85}\x0cb 
+    a\n\r\x85\x0cb 
     ** Failers
     ab  
-
-/^a\R{1,3}b/8<bsr_unicode>
+    
+/^a\R{1,3}b/<bsr_unicode>
     a\nb
     a\n\rb
-    a\n\r\x{85}b
+    a\n\r\x85b
     a\r\n\r\nb 
     a\r\n\r\n\r\nb 
     a\n\r\n\rb
@@ -593,116 +4202,511 @@
     a\n\n\n\rb
     a\r

-/\h+\V?\v{3,4}/8 
-    \x09\x20\x{a0}X\x0a\x0b\x0c\x0d\x0a
+/^a[\R]b/<bsr_unicode>
+    aRb
+    ** Failers
+    a\nb

-/\V?\v{3,4}/8 
-    \x20\x{a0}X\x0a\x0b\x0c\x0d\x0a
+/.+foo/
+    afoo
+    ** Failers 
+    \r\nfoo 
+    \nfoo

-/\h+\V?\v{3,4}/8
-    >\x09\x20\x{a0}X\x0a\x0a\x0a<
+/.+foo/<crlf>
+    afoo
+    \nfoo 
+    ** Failers 
+    \r\nfoo

-/\V?\v{3,4}/8
-    >\x09\x20\x{a0}X\x0a\x0a\x0a<
+/.+foo/<any>
+    afoo
+    ** Failers 
+    \nfoo 
+    \r\nfoo

-/\H\h\V\v/8
+/.+foo/s
+    afoo
+    \r\nfoo 
+    \nfoo 
+
+/^$/mg<any>
+    abc\r\rxyz
+    abc\n\rxyz  
+    ** Failers 
+    abc\r\nxyz
+
+/^X/m
+    XABC
+    ** Failers 
+    XABC\B
+
+/(?m)^$/<any>g+
+    abc\r\n\r\n
+
+/(?m)^$|^\r\n/<any>g+ 
+    abc\r\n\r\n
+    
+/(?m)$/<any>g+ 
+    abc\r\n\r\n
+
+/(?|(abc)|(xyz))/
+   >abc<
+   >xyz< 
+
+/(x)(?|(abc)|(xyz))(x)/
+    xabcx
+    xxyzx 
+
+/(x)(?|(abc)(pqr)|(xyz))(x)/
+    xabcpqrx
+    xxyzx 
+
+/(?|(abc)|(xyz))(?1)/
+    abcabc
+    xyzabc 
+    ** Failers 
+    xyzxyz 
+ 
+/\H\h\V\v/
     X X\x0a
     X\x09X\x0b
     ** Failers
-    \x{a0} X\x0a   
+    \xa0 X\x0a

-/\H*\h+\V?\v{3,4}/8 
-    \x09\x20\x{a0}X\x0a\x0b\x0c\x0d\x0a
-    \x09\x20\x{a0}\x0a\x0b\x0c\x0d\x0a
-    \x09\x20\x{a0}\x0a\x0b\x0c
+/\H*\h+\V?\v{3,4}/ 
+    \x09\x20\xa0X\x0a\x0b\x0c\x0d\x0a
+    \x09\x20\xa0\x0a\x0b\x0c\x0d\x0a
+    \x09\x20\xa0\x0a\x0b\x0c
     ** Failers 
-    \x09\x20\x{a0}\x0a\x0b
+    \x09\x20\xa0\x0a\x0b

-/\H\h\V\v/8
-    \x{3001}\x{3000}\x{2030}\x{2028}
-    X\x{180e}X\x{85}
+/\H{3,4}/
+    XY  ABCDE
+    XY  PQR ST 
+    
+/.\h{3,4}./
+    XY  AB    PQRS
+
+/\h*X\h?\H+Y\H?Z/
+    >XNNNYZ
+    >  X NYQZ
     ** Failers
-    \x{2009} X\x0a   
+    >XYZ   
+    >  X NY Z
+
+/\v*X\v?Y\v+Z\V*\x0a\V+\x0b\V{2,3}\x0c/
+    >XY\x0aZ\x0aA\x0bNN\x0c
+    >\x0a\x0dX\x0aY\x0a\x0bZZZ\x0aAAA\x0bNNN\x0c
+
+/.+A/<crlf>
+    \r\nA

-/\H*\h+\V?\v{3,4}/8 
-    \x{1680}\x{180e}\x{2007}X\x{2028}\x{2029}\x0c\x0d\x0a
-    \x09\x{205f}\x{a0}\x0a\x{2029}\x0c\x{2028}\x0a
-    \x09\x20\x{202f}\x0a\x0b\x0c
-    ** Failers 
-    \x09\x{200a}\x{a0}\x{2028}\x0b
-     
-/a\Rb/I8<bsr_anycrlf>
+/\nA/<crlf>
+    \r\nA 
+
+/[\r\n]A/<crlf>
+    \r\nA 
+
+/(\r|\n)A/<crlf>
+    \r\nA 
+
+/a\Rb/I<bsr_anycrlf>
     a\rb
     a\nb
     a\r\nb
     ** Failers
-    a\x{85}b
+    a\x85b
     a\x0bb

-/a\Rb/I8<bsr_unicode>
+/a\Rb/I<bsr_unicode>
     a\rb
     a\nb
     a\r\nb
-    a\x{85}b
+    a\x85b
     a\x0bb     
     ** Failers 
-    a\x{85}b\<bsr_anycrlf>
+    a\x85b\<bsr_anycrlf>
     a\x0bb\<bsr_anycrlf>

-/a\R?b/I8<bsr_anycrlf>
+/a\R?b/I<bsr_anycrlf>
     a\rb
     a\nb
     a\r\nb
     ** Failers
-    a\x{85}b
+    a\x85b
     a\x0bb

-/a\R?b/I8<bsr_unicode>
+/a\R?b/I<bsr_unicode>
     a\rb
     a\nb
     a\r\nb
-    a\x{85}b
+    a\x85b
     a\x0bb     
     ** Failers 
-    a\x{85}b\<bsr_anycrlf>
+    a\x85b\<bsr_anycrlf>
     a\x0bb\<bsr_anycrlf>
+    
+/a\R{2,4}b/I<bsr_anycrlf>
+    a\r\n\nb
+    a\n\r\rb
+    a\r\n\r\n\r\n\r\nb
+    ** Failers
+    a\x85\85b
+    a\x0b\0bb     
+
+/a\R{2,4}b/I<bsr_unicode>
+    a\r\rb
+    a\n\n\nb
+    a\r\n\n\r\rb
+    a\x85\85b
+    a\x0b\0bb     
+    ** Failers 
+    a\r\r\r\r\rb 
+    a\x85\85b\<bsr_anycrlf>
+    a\x0b\0bb\<bsr_anycrlf>
+    
+/a(?!)|\wbc/
+    abc 
+
+/a[]b/<JS>
+    ** Failers
+    ab
+
+/a[]+b/<JS>
+    ** Failers
+    ab 
+
+/a[]*+b/<JS>
+    ** Failers
+    ab 
+
+/a[^]b/<JS>
+    aXb
+    a\nb 
+    ** Failers
+    ab  
+    
+/a[^]+b/<JS> 
+    aXb
+    a\nX\nXb 
+    ** Failers
+    ab  
+
+/X$/E
+    X
+    ** Failers 
+    X\n 
+
+/X$/
+    X
+    X\n 
+
+/xyz/C
+  xyz 
+  abcxyz 
+  abcxyz\Y
+  ** Failers 
+  abc
+  abc\Y
+  abcxypqr  
+  abcxypqr\Y  
+
+/(*NO_START_OPT)xyz/C
+  abcxyz 
+  
+/(?C)ab/
+  ab
+  \C-ab
+  
+/ab/C
+  ab
+  \C-ab    
+
+/^"((?(?=[a])[^"])|b)*"$/C
+    "ab"
+    \C-"ab"
+
+/\d+X|9+Y/
+    ++++123999\P
+    ++++123999Y\P
+
+/Z(*F)/
+    Z\P
+    ZA\P 
+    
+/Z(?!)/
+    Z\P 
+    ZA\P 
+
+/dog(sbody)?/
+    dogs\P
+    dogs\P\P 
+    
+/dog(sbody)??/
+    dogs\P
+    dogs\P\P 
+
+/dog|dogsbody/
+    dogs\P
+    dogs\P\P

-/X/8f<any> 
-    A\x{1ec5}ABCXYZ
+/dogsbody|dog/
+    dogs\P
+    dogs\P\P

-/abcd*/8
+/Z(*F)Q|ZXY/
+    Z\P
+    ZA\P 
+    X\P 
+
+/\bthe cat\b/
+    the cat\P
+    the cat\P\P
+
+/dog(sbody)?/
+    dogs\D\P
+    body\D\R
+
+/dog(sbody)?/
+    dogs\D\P\P
+    body\D\R
+
+/abc/
+   abc\P
+   abc\P\P
+
+/abc\K123/
+    xyzabc123pqr
+    
+/(?<=abc)123/
+    xyzabc123pqr 
+    xyzabc12\P
+    xyzabc12\P\P
+
+/\babc\b/
+    +++abc+++
+    +++ab\P
+    +++ab\P\P  
+
+/(?=C)/g+
+    ABCDECBA
+
+/(abc|def|xyz)/I
+    terhjk;abcdaadsfe
+    the quick xyz brown fox 
+    \Yterhjk;abcdaadsfe
+    \Ythe quick xyz brown fox 
+    ** Failers
+    thejk;adlfj aenjl;fda asdfasd ehj;kjxyasiupd
+    \Ythejk;adlfj aenjl;fda asdfasd ehj;kjxyasiupd
+
+/(abc|def|xyz)/SI
+    terhjk;abcdaadsfe
+    the quick xyz brown fox 
+    \Yterhjk;abcdaadsfe
+    \Ythe quick xyz brown fox 
+    ** Failers
+    thejk;adlfj aenjl;fda asdfasd ehj;kjxyasiupd
+    \Ythejk;adlfj aenjl;fda asdfasd ehj;kjxyasiupd
+
+/abcd*/+
     xxxxabcd\P
     xxxxabcd\P\P
+    dddxxx\R 
+    xxxxabcd\P\P
+    xxx\R

-/abcd*/i8
+/abcd*/i
     xxxxabcd\P
     xxxxabcd\P\P
     XXXXABCD\P
     XXXXABCD\P\P

-/abc\d*/8
+/abc\d*/
     xxxxabc1\P
     xxxxabc1\P\P

-/abc[de]*/8
+/abc[de]*/
     xxxxabcde\P
     xxxxabcde\P\P

-/\bthe cat\b/8
-    the cat\P
-    the cat\P\P
+/(?:(?1)|B)(A(*F)|C)/
+    ABCD
+    CCD
+    ** Failers
+    CAD

-/a+/8
-    a\x{123}aa\>1
-    a\x{123}aa\>2
-    a\x{123}aa\>3
-    a\x{123}aa\>4
-    a\x{123}aa\>5
-    a\x{123}aa\>6
+/^(?:(?1)|B)(A(*F)|C)/
+    CCD
+    BCD 
+    ** Failers
+    ABCD
+    CAD
+    BAD

-/ab\Cde/8
+/^(?!a(*SKIP)b)/
+    ac
+    
+/^(?=a(*SKIP)b|ac)/
+    ** Failers
+    ac
+    
+/^(?=a(*THEN)b|ac)/
+    ac
+    
+/^(?=a(*PRUNE)b)/
+    ab  
+    ** Failers 
+    ac
+
+/^(?(?!a(*SKIP)b))/
+    ac
+
+/(?<=abc)def/
+    abc\P\P
+
+/abc$/
+    abc
+    abc\P
+    abc\P\P
+
+/abc$/m
+    abc
+    abc\n
+    abc\P\P
+    abc\n\P\P 
+    abc\P
+    abc\n\P
+
+/abc\z/
+    abc
+    abc\P
+    abc\P\P
+
+/abc\Z/
+    abc
+    abc\P
+    abc\P\P
+
+/abc\b/
+    abc
+    abc\P
+    abc\P\P
+
+/abc\B/
+    abc
+    abc\P
+    abc\P\P
+
+/.+/
+    abc\>0
+    abc\>1
+    abc\>2
+    abc\>3
+    abc\>4
+    abc\>-4 
+
+/^(?:a)++\w/
+     aaaab
+     ** Failers 
+     aaaa 
+     bbb 
+
+/^(?:aa|(?:a)++\w)/
+     aaaab
+     aaaa 
+     ** Failers 
+     bbb 
+
+/^(?:a)*+\w/
+     aaaab
+     bbb 
+     ** Failers 
+     aaaa 
+
+/^(a)++\w/
+     aaaab
+     ** Failers 
+     aaaa 
+     bbb 
+
+/^(a|)++\w/
+     aaaab
+     ** Failers 
+     aaaa 
+     bbb 
+
+/(?=abc){3}abc/+
+    abcabcabc
+    ** Failers
+    xyz  
+    
+/(?=abc)+abc/+
+    abcabcabc
+    ** Failers
+    xyz  
+    
+/(?=abc)++abc/+
+    abcabcabc
+    ** Failers
+    xyz  
+    
+/(?=abc){0}xyz/
+    xyz 
+
+/(?=abc){1}xyz/
+    ** Failers
+    xyz 
+    
+/(?=(a))?./
+    ab
+    bc
+      
+/(?=(a))??./
+    ab
+    bc
+
+/^(?=(a)){0}b(?1)/
+    backgammon
+
+/^(?=(?1))?[az]([abc])d/
+    abd 
+    zcdxx 
+
+/^(?!a){0}\w+/
+    aaaaa
+
+/(?<=(abc))?xyz/
+    abcxyz
+    pqrxyz 
+
+/((?2))((?1))/
+    abc
+
+/(?(R)a+|(?R)b)/
+    aaaabcde
+
+/(?(R)a+|((?R))b)/
+    aaaabcde
+
+/((?(R)a+|(?1)b))/
+    aaaabcde
+
+/((?(R2)a+|(?1)b))/
+    aaaabcde
+
+/(?(R)a*(?1)|((?R))b)/
+    aaaabcde
+
+/(a+)/
+    \O6aaaa
+    \O8aaaa
+
+/ab\Cde/
     abXde
+    
+/(?<=ab\Cde)X/
+    abZdeX

-/(?<=ab\Cde)X/8
-
-/-- End of testinput8 --/
+/-- End of testinput8 --/

Modified: code/trunk/testdata/testinput9
===================================================================
--- code/trunk/testdata/testinput9    2011-12-28 16:10:09 UTC (rev 835)
+++ code/trunk/testdata/testinput9    2011-12-28 17:16:11 UTC (rev 836)
@@ -1,989 +1,691 @@
-/-- This set of tests check Unicode property support with the DFA matching 
-    functionality of pcre_dfa_exec(). The -dfa flag must be used with pcretest
-    when running it. --/
+/-- This set of tests checks UTF-8 support with the DFA matching functionality
+    of pcre_dfa_exec(). The -dfa flag must be used with pcretest when running 
+    it. --/

-/\pL\P{Nd}/8
-    AB
+/\x{100}ab/8
+  \x{100}ab
+  
+/a\x{100}*b/8
+    ab
+    a\x{100}b  
+    a\x{100}\x{100}b  
+    
+/a\x{100}+b/8
+    a\x{100}b  
+    a\x{100}\x{100}b  
+    *** Failers 
+    ab
+     
+/\bX/8
+    Xoanon
+    +Xoanon
+    \x{300}Xoanon 
+    *** Failers 
+    YXoanon  
+    
+/\BX/8
+    YXoanon
     *** Failers
-    A0
-    00   
+    Xoanon
+    +Xoanon    
+    \x{300}Xoanon

-/\X./8
-    AB
-    A\x{300}BC 
-    A\x{300}\x{301}\x{302}BC 
+/X\b/8
+    X+oanon
+    ZX\x{300}oanon 
+    FAX 
+    *** Failers 
+    Xoanon  
+    
+/X\B/8
+    Xoanon  
     *** Failers
-    \x{300}  
+    X+oanon
+    ZX\x{300}oanon 
+    FAX 
+    
+/[^a]/8
+    abcd
+    a\x{100}

-/\X\X/8
-    ABC
-    A\x{300}B\x{300}\x{301}C 
-    A\x{300}\x{301}\x{302}BC 
+/^[abc\x{123}\x{400}-\x{402}]{2,3}\d/8
+    ab99
+    \x{123}\x{123}45
+    \x{400}\x{401}\x{402}6  
     *** Failers
-    \x{300}  
+    d99
+    \x{123}\x{122}4   
+    \x{400}\x{403}6  
+    \x{400}\x{401}\x{402}\x{402}6

-/^\pL+/8
-    abcd
-    a 
-    *** Failers 
+/a.b/8
+    acb
+    a\x7fb
+    a\x{100}b 
+    *** Failers
+    a\nb

-/^\PL+/8
-    1234
-    = 
-    *** Failers 
-    abcd 
+/a(.{3})b/8
+    a\x{4000}xyb 
+    a\x{4000}\x7fyb 
+    a\x{4000}\x{100}yb 
+    *** Failers
+    a\x{4000}b 
+    ac\ncb

-/^\X+/8
-    abcdA\x{300}\x{301}\x{302}
-    A\x{300}\x{301}\x{302}
-    A\x{300}\x{301}\x{302}A\x{300}\x{301}\x{302}
-    a 
-    *** Failers 
-    \x{300}\x{301}\x{302}
+/a(.*?)(.)/
+    a\xc0\x88b

-/\X?abc/8
-    abc
-    A\x{300}abc
-    A\x{300}\x{301}\x{302}A\x{300}A\x{300}A\x{300}abcxyz
-    \x{300}abc  
-    *** Failers
+/a(.*?)(.)/8
+    a\x{100}b

-/^\X?abc/8
-    abc
-    A\x{300}abc
+/a(.*)(.)/
+    a\xc0\x88b
+
+/a(.*)(.)/8
+    a\x{100}b
+
+/a(.)(.)/
+    a\xc0\x92bcd
+
+/a(.)(.)/8
+    a\x{240}bcd
+
+/a(.?)(.)/
+    a\xc0\x92bcd
+
+/a(.?)(.)/8
+    a\x{240}bcd
+
+/a(.??)(.)/
+    a\xc0\x92bcd
+
+/a(.??)(.)/8
+    a\x{240}bcd
+
+/a(.{3})b/8
+    a\x{1234}xyb 
+    a\x{1234}\x{4321}yb 
+    a\x{1234}\x{4321}\x{3412}b 
     *** Failers
-    A\x{300}\x{301}\x{302}A\x{300}A\x{300}A\x{300}abcxyz
-    \x{300}abc  
+    a\x{1234}b 
+    ac\ncb

-/\X*abc/8
-    abc
-    A\x{300}abc
-    A\x{300}\x{301}\x{302}A\x{300}A\x{300}A\x{300}abcxyz
-    \x{300}abc  
+/a(.{3,})b/8
+    a\x{1234}xyb 
+    a\x{1234}\x{4321}yb 
+    a\x{1234}\x{4321}\x{3412}b 
+    axxxxbcdefghijb 
+    a\x{1234}\x{4321}\x{3412}\x{3421}b 
     *** Failers
+    a\x{1234}b

-/^\X*abc/8
-    abc
-    A\x{300}abc
-    A\x{300}\x{301}\x{302}A\x{300}A\x{300}A\x{300}abcxyz
+/a(.{3,}?)b/8
+    a\x{1234}xyb 
+    a\x{1234}\x{4321}yb 
+    a\x{1234}\x{4321}\x{3412}b 
+    axxxxbcdefghijb 
+    a\x{1234}\x{4321}\x{3412}\x{3421}b 
     *** Failers
-    \x{300}abc  
+    a\x{1234}b

-/^\pL?=./8
-    A=b
-    =c 
+/a(.{3,5})b/8
+    a\x{1234}xyb 
+    a\x{1234}\x{4321}yb 
+    a\x{1234}\x{4321}\x{3412}b 
+    axxxxbcdefghijb 
+    a\x{1234}\x{4321}\x{3412}\x{3421}b 
+    axbxxbcdefghijb 
+    axxxxxbcdefghijb 
     *** Failers
-    1=2 
-    AAAA=b  
+    a\x{1234}b 
+    axxxxxxbcdefghijb

-/^\pL*=./8
-    AAAA=b
-    =c 
+/a(.{3,5}?)b/8
+    a\x{1234}xyb 
+    a\x{1234}\x{4321}yb 
+    a\x{1234}\x{4321}\x{3412}b 
+    axxxxbcdefghijb 
+    a\x{1234}\x{4321}\x{3412}\x{3421}b 
+    axbxxbcdefghijb 
+    axxxxxbcdefghijb 
     *** Failers
-    1=2  
+    a\x{1234}b 
+    axxxxxxbcdefghijb

-/^\X{2,3}X/8
-    A\x{300}\x{301}\x{302}A\x{300}\x{301}\x{302}X
-    A\x{300}\x{301}\x{302}A\x{300}\x{301}\x{302}A\x{300}\x{301}\x{302}X 
+/^[a\x{c0}]/8
     *** Failers
-    X
-    A\x{300}\x{301}\x{302}X
-    A\x{300}\x{301}\x{302}A\x{300}\x{301}\x{302}A\x{300}\x{301}\x{302}A\x{300}\x{301}\x{302}X
+    \x{100}

-/^\pC\pL\pM\pN\pP\pS\pZ</8
-    \x7f\x{c0}\x{30f}\x{660}\x{66c}\x{f01}\x{1680}<
-    \np\x{300}9!\$ < 
-    ** Failers 
-    ap\x{300}9!\$ < 
-  
-/^\PC/8
-    X
-    ** Failers 
-    \x7f
-  
-/^\PL/8
-    9
-    ** Failers 
-    \x{c0}
-  
-/^\PM/8
-    X
-    ** Failers 
-    \x{30f}
-  
-/^\PN/8
-    X
-    ** Failers 
-    \x{660}
-  
-/^\PP/8
-    X
-    ** Failers 
-    \x{66c}
-  
-/^\PS/8
-    X
-    ** Failers 
-    \x{f01}
-  
-/^\PZ/8
-    X
-    ** Failers 
-    \x{1680}
+/(?<=aXb)cd/8
+    aXbcd
+
+/(?<=a\x{100}b)cd/8
+    a\x{100}bcd
+
+/(?<=a\x{100000}b)cd/8
+    a\x{100000}bcd

-/^\p{Cc}/8
-    \x{017}
-    \x{09f} 
-    ** Failers
-    \x{0600} 
-  
-/^\p{Cf}/8
-    \x{601}
-    ** Failers
-    \x{09f} 
-  
-/^\p{Cn}/8
-    ** Failers
-    \x{09f} 
-  
-/^\p{Co}/8
-    \x{f8ff}
-    ** Failers
-    \x{09f} 
-  
-/^\p{Cs}/8
-    \?\x{dfff}
-    ** Failers
-    \x{09f} 
-  
-/^\p{Ll}/8
-    a
-    ** Failers 
-    Z
-    \x{e000}  
-  
-/^\p{Lm}/8
-    \x{2b0}
-    ** Failers
-    a 
-  
-/^\p{Lo}/8
-    \x{1bb}
-    ** Failers
-    a 
-    \x{2b0}
-  
-/^\p{Lt}/8
-    \x{1c5}
-    ** Failers
-    a 
-    \x{2b0}
-  
-/^\p{Lu}/8
-    A
-    ** Failers
-    \x{2b0}
-  
-/^\p{Mc}/8
-    \x{903}
-    ** Failers
-    X
-    \x{300}
-       
-/^\p{Me}/8
-    \x{488}
-    ** Failers
-    X
-    \x{903}
-    \x{300}
-  
-/^\p{Mn}/8
-    \x{300}
-    ** Failers
-    X
-    \x{903}
-  
-/^\p{Nd}+/8
-    0123456789\x{660}\x{661}\x{662}\x{663}\x{664}\x{665}\x{666}\x{667}\x{668}\x{669}\x{66a}
-    \x{6f0}\x{6f1}\x{6f2}\x{6f3}\x{6f4}\x{6f5}\x{6f6}\x{6f7}\x{6f8}\x{6f9}\x{6fa}
-    \x{966}\x{967}\x{968}\x{969}\x{96a}\x{96b}\x{96c}\x{96d}\x{96e}\x{96f}\x{970}
-    ** Failers
-    X
-  
-/^\p{Nl}/8
-    \x{16ee}
-    ** Failers
-    X
-    \x{966}
-  
-/^\p{No}/8
-    \x{b2}
-    \x{b3}
-    ** Failers
-    X
-    \x{16ee}
-  
-/^\p{Pc}/8
-    \x5f
-    \x{203f}
-    ** Failers
-    X
-    -
-    \x{58a}
-  
-/^\p{Pd}/8
-    -
-    \x{58a}
-    ** Failers
-    X
-    \x{203f}
-  
-/^\p{Pe}/8
-    )
-    ]
-    }
-    \x{f3b}
-    ** Failers
-    X
-    \x{203f}
-    (
-    [
-    {
-    \x{f3c}
-  
-/^\p{Pf}/8
-    \x{bb}
-    \x{2019}
-    ** Failers
-    X
-    \x{203f}
-  
-/^\p{Pi}/8
-    \x{ab}
-    \x{2018}
-    ** Failers
-    X
-    \x{203f}
-  
-/^\p{Po}/8
-    !
-    \x{37e}
-    ** Failers
-    X
-    \x{203f}
-  
-/^\p{Ps}/8
-    (
-    [
-    {
-    \x{f3c}
-    ** Failers
-    X
-    )
-    ]
-    }
-    \x{f3b}
-  
-/^\p{Sc}+/8
-    $\x{a2}\x{a3}\x{a4}\x{a5}\x{a6}
-    \x{9f2}
-    ** Failers
-    X
-    \x{2c2}
-  
-/^\p{Sk}/8
-    \x{2c2}
-    ** Failers
-    X
-    \x{9f2}
-  
-/^\p{Sm}+/8
-    +<|~\x{ac}\x{2044}
-    ** Failers
-    X
-    \x{9f2}
-  
-/^\p{So}/8
-    \x{a6}
-    \x{482} 
-    ** Failers
-    X
-    \x{9f2}
-  
-/^\p{Zl}/8
-    \x{2028}
-    ** Failers
-    X
-    \x{2029}
-  
-/^\p{Zp}/8
-    \x{2029}
-    ** Failers
-    X
-    \x{2028}
-  
-/^\p{Zs}/8
-    \ \
-    \x{a0}
-    \x{1680}
-    \x{180e}
-    \x{2000}
-    \x{2001}     
-    ** Failers
-    \x{2028}
-    \x{200d} 
-  
-/\p{Nd}+(..)/8
-      \x{660}\x{661}\x{662}ABC
-  
-/\p{Nd}+?(..)/8
-      \x{660}\x{661}\x{662}ABC
-  
-/\p{Nd}{2,}(..)/8
-      \x{660}\x{661}\x{662}ABC
-  
-/\p{Nd}{2,}?(..)/8
-      \x{660}\x{661}\x{662}ABC
-  
-/\p{Nd}*(..)/8
-      \x{660}\x{661}\x{662}ABC
-  
-/\p{Nd}*?(..)/8
-      \x{660}\x{661}\x{662}ABC
-  
-/\p{Nd}{2}(..)/8
-      \x{660}\x{661}\x{662}ABC
-  
-/\p{Nd}{2,3}(..)/8
-      \x{660}\x{661}\x{662}ABC
-  
-/\p{Nd}{2,3}?(..)/8
-      \x{660}\x{661}\x{662}ABC
-  
-/\p{Nd}?(..)/8
-      \x{660}\x{661}\x{662}ABC
-  
-/\p{Nd}??(..)/8
-      \x{660}\x{661}\x{662}ABC
-  
-/\p{Nd}*+(..)/8
-      \x{660}\x{661}\x{662}ABC
-  
-/\p{Nd}*+(...)/8
-      \x{660}\x{661}\x{662}ABC
-  
-/\p{Nd}*+(....)/8
-      ** Failers
-      \x{660}\x{661}\x{662}ABC
-  
-/\p{Lu}/8i
-    A
-    a\x{10a0}B 
-    ** Failers 
-    a
-    \x{1d00}  
+/(?:\x{100}){3}b/8
+    \x{100}\x{100}\x{100}b
+    *** Failers 
+    \x{100}\x{100}b

-/\p{^Lu}/8i
-    1234
-    ** Failers
-    ABC 
+/\x{ab}/8
+    \x{ab} 
+    \xc2\xab
+    *** Failers 
+    \x00{ab}

-/\P{Lu}/8i
-    1234
-    ** Failers
-    ABC 
+/(?<=(.))X/8
+    WXYZ
+    \x{256}XYZ 
+    *** Failers
+    XYZ

-/(?<=A\p{Nd})XYZ/8
-    A2XYZ
-    123A5XYZPQR
-    ABA\x{660}XYZpqr
-    ** Failers
-    AXYZ
-    XYZ     
+/[^a]+/8g
+    bcd
+    \x{100}aY\x{256}Z

-/(?<!\pL)XYZ/8
-    1XYZ
-    AB=XYZ.. 
-    XYZ 
-    ** Failers
-    WXYZ 
+/^[^a]{2}/8
+    \x{100}bc
+ 
+/^[^a]{2,}/8
+    \x{100}bcAa

-/[\p{Nd}]/8
-    1234
+/^[^a]{2,}?/8
+    \x{100}bca

-/[\p{Nd}+-]+/8
-    1234
-    12-34
-    12+\x{661}-34  
-    ** Failers
-    abcd  
+/[^a]+/8ig
+    bcd
+    \x{100}aY\x{256}Z 
+    
+/^[^a]{2}/8i
+    \x{100}bc
+ 
+/^[^a]{2,}/8i
+    \x{100}bcAa

-/[\P{Nd}]+/8
+/^[^a]{2,}?/8i
+    \x{100}bca
+
+/\x{100}{0,0}/8
     abcd
-    ** Failers
-    1234
+ 
+/\x{100}?/8
+    abcd
+    \x{100}\x{100}

-/\D+/8
-    11111111111111111111111111111111111111111111111111111111111111111111111
-    aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
-     
-/\P{Nd}+/8
-    11111111111111111111111111111111111111111111111111111111111111111111111
-    aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
+/\x{100}{0,3}/8 
+    \x{100}\x{100} 
+    \x{100}\x{100}\x{100}\x{100} 
+    
+/\x{100}*/8
+    abce
+    \x{100}\x{100}\x{100}\x{100}

-/[\D]+/8
-    11111111111111111111111111111111111111111111111111111111111111111111111
-    aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
+/\x{100}{1,1}/8
+    abcd\x{100}\x{100}\x{100}\x{100}

-/[\P{Nd}]+/8
-    11111111111111111111111111111111111111111111111111111111111111111111111
-    aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
+/\x{100}{1,3}/8
+    abcd\x{100}\x{100}\x{100}\x{100}

-/[\D\P{Nd}]+/8
-    11111111111111111111111111111111111111111111111111111111111111111111111
-    aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
+/\x{100}+/8
+    abcd\x{100}\x{100}\x{100}\x{100}

-/\pL/8
-    a
-    A 
+/\x{100}{3}/8
+    abcd\x{100}\x{100}\x{100}XX

-/\pL/8i
-    a
-    A 
-    
-/\p{Lu}/8 
-    A
-    aZ
-    ** Failers
-    abc   
+/\x{100}{3,5}/8
+    abcd\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}XX

-/\p{Lu}/8i
-    A
-    aZ
-    ** Failers
-    abc   
+/\x{100}{3,}/8
+    abcd\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}XX

-/\p{Ll}/8 
-    a
-    Az
-    ** Failers
-    ABC   
+/(?<=a\x{100}{2}b)X/8
+    Xyyya\x{100}\x{100}bXzzz

-/\p{Ll}/8i 
-    a
-    Az
-    ** Failers
-    ABC   
+/\D*/8
+  aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa

-/^\x{c0}$/8i
-    \x{c0}
-    \x{e0} 
+/\D*/8
+  \x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}

-/^\x{e0}$/8i
-    \x{c0}
-    \x{e0} 
+/\D/8
+    1X2
+    1\x{100}2 
+  
+/>\S/8
+    > >X Y
+    > >\x{100} Y
+  
+/\d/8
+    \x{100}3
+    
+/\s/8
+    \x{100} X
+    
+/\D+/8
+    12abcd34
+    *** Failers
+    1234

-/A\x{391}\x{10427}\x{ff3a}\x{1fb0}/8
-    A\x{391}\x{10427}\x{ff3a}\x{1fb0}
-    ** Failers
-    a\x{391}\x{10427}\x{ff3a}\x{1fb0}   
-    A\x{3b1}\x{10427}\x{ff3a}\x{1fb0}
-    A\x{391}\x{1044F}\x{ff3a}\x{1fb0}
-    A\x{391}\x{10427}\x{ff5a}\x{1fb0}
-    A\x{391}\x{10427}\x{ff3a}\x{1fb8}
+/\D{2,3}/8
+    12abcd34
+    12ab34
+    *** Failers  
+    1234
+    12a34

-/A\x{391}\x{10427}\x{ff3a}\x{1fb0}/8i
-    A\x{391}\x{10427}\x{ff3a}\x{1fb0}
-    a\x{391}\x{10427}\x{ff3a}\x{1fb0}   
-    A\x{3b1}\x{10427}\x{ff3a}\x{1fb0}
-    A\x{391}\x{1044F}\x{ff3a}\x{1fb0}
-    A\x{391}\x{10427}\x{ff5a}\x{1fb0}
-    A\x{391}\x{10427}\x{ff3a}\x{1fb8}
+/\D{2,3}?/8
+    12abcd34
+    12ab34
+    *** Failers  
+    1234
+    12a34

-/\x{391}+/8i
-    \x{391}\x{3b1}\x{3b1}\x{3b1}\x{391}
+/\d+/8
+    12abcd34
+    *** Failers

-/\x{391}{3,5}(.)/8i
-    \x{391}\x{3b1}\x{3b1}\x{3b1}\x{391}X
+/\d{2,3}/8
+    12abcd34
+    1234abcd
+    *** Failers  
+    1.4

-/\x{391}{3,5}?(.)/8i
-    \x{391}\x{3b1}\x{3b1}\x{3b1}\x{391}X
+/\d{2,3}?/8
+    12abcd34
+    1234abcd
+    *** Failers  
+    1.4

-/[\x{391}\x{ff3a}]/8i
-    \x{391}
-    \x{ff3a}
-    \x{3b1}
-    \x{ff5a}   
-    
-/[\x{c0}\x{391}]/8i
-    \x{c0}
-    \x{e0} 
+/\S+/8
+    12abcd34
+    *** Failers
+    \    \

-/[\x{105}-\x{109}]/8i
-    \x{104}
-    \x{105}
-    \x{109}  
-    ** Failers
-    \x{100}
-    \x{10a} 
-    
-/[z-\x{100}]/8i 
-    Z
-    z
-    \x{39c}
-    \x{178}
-    |
-    \x{80}
-    \x{ff}
-    \x{100}
-    \x{101} 
-    ** Failers
-    \x{102}
-    Y
-    y           
+/\S{2,3}/8
+    12abcd34
+    1234abcd
+    *** Failers
+    \     \

-/[z-\x{100}]/8i
+/\S{2,3}?/8
+    12abcd34
+    1234abcd
+    *** Failers
+    \     \

-/^\X/8
-    A
-    A\x{300}BC 
-    A\x{300}\x{301}\x{302}BC 
+/>\s+</8
+    12>      <34
     *** Failers
-    \x{300}

-/^[\X]/8
-    X123
+/>\s{2,3}</8
+    ab>  <cd
+    ab>   <ce
     *** Failers
-    AXYZ
+    ab>    <cd

-/^(\X*)C/8
-    A\x{300}\x{301}\x{302}BCA\x{300}\x{301} 
-    A\x{300}\x{301}\x{302}BCA\x{300}\x{301}C 
+/>\s{2,3}?</8
+    ab>  <cd
+    ab>   <ce
+    *** Failers
+    ab>    <cd

-/^(\X*?)C/8
-    A\x{300}\x{301}\x{302}BCA\x{300}\x{301} 
-    A\x{300}\x{301}\x{302}BCA\x{300}\x{301}C 
+/\w+/8
+    12      34
+    *** Failers
+    +++=*!

-/^(\X*)(.)/8
-    A\x{300}\x{301}\x{302}BCA\x{300}\x{301} 
-    A\x{300}\x{301}\x{302}BCA\x{300}\x{301}C 
+/\w{2,3}/8
+    ab  cd
+    abcd ce
+    *** Failers
+    a.b.c

-/^(\X*?)(.)/8
-    A\x{300}\x{301}\x{302}BCA\x{300}\x{301} 
-    A\x{300}\x{301}\x{302}BCA\x{300}\x{301}C 
+/\w{2,3}?/8
+    ab  cd
+    abcd ce
+    *** Failers
+    a.b.c

-/^\X(.)/8
+/\W+/8
+    12====34
     *** Failers
-    A\x{300}\x{301}\x{302}
+    abcd

-/^\X{2,3}(.)/8
-    A\x{300}\x{301}B\x{300}X
-    A\x{300}\x{301}B\x{300}C\x{300}\x{301}
-    A\x{300}\x{301}B\x{300}C\x{300}\x{301}X
-    A\x{300}\x{301}B\x{300}C\x{300}\x{301}DA\x{300}X
-    
-/^\X{2,3}?(.)/8
-    A\x{300}\x{301}B\x{300}X
-    A\x{300}\x{301}B\x{300}C\x{300}\x{301}
-    A\x{300}\x{301}B\x{300}C\x{300}\x{301}X
-    A\x{300}\x{301}B\x{300}C\x{300}\x{301}DA\x{300}X
+/\W{2,3}/8
+    ab====cd
+    ab==cd
+    *** Failers
+    a.b.c

-/^\pN{2,3}X/
-    12X
-    123X
+/\W{2,3}?/8
+    ab====cd
+    ab==cd
     *** Failers
-    X
-    1X
-    1234X     
+    a.b.c

-/\x{100}/i8
-    \x{100}   
-    \x{101} 
-    
-/^\p{Han}+/8
-    \x{2e81}\x{3007}\x{2f804}\x{31a0}
-    ** Failers
-    \x{2e7f}  
+/[\x{100}]/8
+    \x{100}
+    Z\x{100}
+    \x{100}Z
+    *** Failers

-/^\P{Katakana}+/8
-    \x{3105}
-    ** Failers
-    \x{30ff}  
+/[Z\x{100}]/8
+    Z\x{100}
+    \x{100}
+    \x{100}Z
+    *** Failers

-/^[\p{Arabic}]/8
-    \x{06e9}
-    \x{060b}
-    ** Failers
-    X\x{06e9}   
+/[\x{100}\x{200}]/8
+   ab\x{100}cd
+   ab\x{200}cd
+   *** Failers

-/^[\P{Yi}]/8
-    \x{2f800}
-    ** Failers
-    \x{a014}
-    \x{a4c6}   
+/[\x{100}-\x{200}]/8
+   ab\x{100}cd
+   ab\x{200}cd
+   ab\x{111}cd 
+   *** Failers

-/^\p{Any}X/8
-    AXYZ
-    \x{1234}XYZ 
-    ** Failers
-    X  
-    
-/^\P{Any}X/8
-    ** Failers
-    AX
-    
-/^\p{Any}?X/8
-    XYZ
-    AXYZ
-    \x{1234}XYZ 
-    ** Failers
-    ABXYZ   
+/[z-\x{200}]/8
+   ab\x{100}cd
+   ab\x{200}cd
+   ab\x{111}cd 
+   abzcd
+   ab|cd  
+   *** Failers

-/^\P{Any}?X/8
-    XYZ
-    ** Failers
-    AXYZ
-    \x{1234}XYZ 
-    ABXYZ   
+/[Q\x{100}\x{200}]/8
+   ab\x{100}cd
+   ab\x{200}cd
+   Q? 
+   *** Failers

-/^\p{Any}+X/8
-    AXYZ
-    \x{1234}XYZ
-    A\x{1234}XYZ
-    ** Failers
-    XYZ
+/[Q\x{100}-\x{200}]/8
+   ab\x{100}cd
+   ab\x{200}cd
+   ab\x{111}cd 
+   Q? 
+   *** Failers

-/^\P{Any}+X/8
-    ** Failers
-    AXYZ
-    \x{1234}XYZ
-    A\x{1234}XYZ
-    XYZ
+/[Qz-\x{200}]/8
+   ab\x{100}cd
+   ab\x{200}cd
+   ab\x{111}cd 
+   abzcd
+   ab|cd  
+   Q? 
+   *** Failers

-/^\p{Any}*X/8
-    XYZ
-    AXYZ
-    \x{1234}XYZ
-    A\x{1234}XYZ
-    ** Failers
+/[\x{100}\x{200}]{1,3}/8
+   ab\x{100}cd
+   ab\x{200}cd
+   ab\x{200}\x{100}\x{200}\x{100}cd
+   *** Failers

-/^\P{Any}*X/8
-    XYZ
-    ** Failers
-    AXYZ
-    \x{1234}XYZ
-    A\x{1234}XYZ
+/[\x{100}\x{200}]{1,3}?/8
+   ab\x{100}cd
+   ab\x{200}cd
+   ab\x{200}\x{100}\x{200}\x{100}cd
+   *** Failers

-/^[\p{Any}]X/8
-    AXYZ
-    \x{1234}XYZ 
-    ** Failers
-    X  
-    
-/^[\P{Any}]X/8
-    ** Failers
-    AX
-    
-/^[\p{Any}]?X/8
-    XYZ
-    AXYZ
-    \x{1234}XYZ 
-    ** Failers
-    ABXYZ   
+/[Q\x{100}\x{200}]{1,3}/8
+   ab\x{100}cd
+   ab\x{200}cd
+   ab\x{200}\x{100}\x{200}\x{100}cd
+   *** Failers

-/^[\P{Any}]?X/8
-    XYZ
-    ** Failers
-    AXYZ
-    \x{1234}XYZ 
-    ABXYZ   
+/[Q\x{100}\x{200}]{1,3}?/8
+   ab\x{100}cd
+   ab\x{200}cd
+   ab\x{200}\x{100}\x{200}\x{100}cd
+   *** Failers

-/^[\p{Any}]+X/8
-    AXYZ
-    \x{1234}XYZ
-    A\x{1234}XYZ
-    ** Failers
-    XYZ
+/(?<=[\x{100}\x{200}])X/8
+    abc\x{200}X
+    abc\x{100}X 
+    *** Failers
+    X

-/^[\P{Any}]+X/8
-    ** Failers
-    AXYZ
-    \x{1234}XYZ
-    A\x{1234}XYZ
-    XYZ
+/(?<=[Q\x{100}\x{200}])X/8
+    abc\x{200}X
+    abc\x{100}X 
+    abQX 
+    *** Failers
+    X

-/^[\p{Any}]*X/8
-    XYZ
-    AXYZ
-    \x{1234}XYZ
-    A\x{1234}XYZ
-    ** Failers
+/(?<=[\x{100}\x{200}]{3})X/8
+    abc\x{100}\x{200}\x{100}X
+    *** Failers
+    abc\x{200}X
+    X

-/^[\P{Any}]*X/8
-    XYZ
-    ** Failers
-    AXYZ
-    \x{1234}XYZ
-    A\x{1234}XYZ
+/[^\x{100}\x{200}]X/8
+    AX
+    \x{150}X
+    \x{500}X 
+    *** Failers
+    \x{100}X
+    \x{200}X

-/^\p{Any}{3,5}?/8
-    abcdefgh
-    \x{1234}\n\r\x{3456}xyz 
+/[^Q\x{100}\x{200}]X/8
+    AX
+    \x{150}X
+    \x{500}X 
+    *** Failers
+    \x{100}X
+    \x{200}X   
+    QX

-/^\p{Any}{3,5}/8
-    abcdefgh
-    \x{1234}\n\r\x{3456}xyz 
+/[^\x{100}-\x{200}]X/8
+    AX
+    \x{500}X 
+    *** Failers
+    \x{100}X
+    \x{150}X
+    \x{200}X

-/^\P{Any}{3,5}?/8
-    ** Failers
-    abcdefgh
-    \x{1234}\n\r\x{3456}xyz 
+/[z-\x{100}]/8i
+    z
+    Z 
+    \x{100}
+    *** Failers
+    \x{102}
+    y

-/^\p{L&}X/8
-     AXY
-     aXY
-     \x{1c5}XY
-     ** Failers
-     \x{1bb}XY
-     \x{2b0}XY
-     !XY      
+/[\xFF]/
+    >\xff<

-/^[\p{L&}]X/8
-     AXY
-     aXY
-     \x{1c5}XY
-     ** Failers
-     \x{1bb}XY
-     \x{2b0}XY
-     !XY      
+/[\xff]/8
+    >\x{ff}<

-/^\p{L&}+X/8
-     AXY
-     aXY
-     AbcdeXyz 
-     \x{1c5}AbXY
-     abcDEXypqreXlmn 
-     ** Failers
-     \x{1bb}XY
-     \x{2b0}XY
-     !XY      
+/[^\xFF]/
+    XYZ

-/^[\p{L&}]+X/8
-     AXY
-     aXY
-     AbcdeXyz 
-     \x{1c5}AbXY
-     abcDEXypqreXlmn 
-     ** Failers
-     \x{1bb}XY
-     \x{2b0}XY
-     !XY      
+/[^\xff]/8
+    XYZ
+    \x{123}

-/^\p{L&}+?X/8
-     AXY
-     aXY
-     AbcdeXyz 
-     \x{1c5}AbXY
-     abcDEXypqreXlmn 
-     ** Failers
-     \x{1bb}XY
-     \x{2b0}XY
-     !XY      
+/^[ac]*b/8
+  xb

-/^[\p{L&}]+?X/8
-     AXY
-     aXY
-     AbcdeXyz 
-     \x{1c5}AbXY
-     abcDEXypqreXlmn 
-     ** Failers
-     \x{1bb}XY
-     \x{2b0}XY
-     !XY      
+/^[ac\x{100}]*b/8
+  xb

-/^\P{L&}X/8
-     !XY
-     \x{1bb}XY
-     \x{2b0}XY
-     ** Failers
-     \x{1c5}XY
-     AXY      
+/^[^x]*b/8i
+  xb

-/^[\P{L&}]X/8
-     !XY
-     \x{1bb}XY
-     \x{2b0}XY
-     ** Failers
-     \x{1c5}XY
-     AXY      
-
-/^\x{023a}+?(\x{0130}+)/8i
-  \x{023a}\x{2c65}\x{0130}
+/^[^x]*b/8
+  xb

-/^\x{023a}+([^X])/8i
-  \x{023a}\x{2c65}X
- 
-/\x{c0}+\x{116}+/8i
-    \x{c0}\x{e0}\x{116}\x{117}
+/^\d*b/8
+  xb

-/[\x{c0}\x{116}]+/8i
-    \x{c0}\x{e0}\x{116}\x{117}
+/(|a)/g8
+    catac
+    a\x{256}a

-/Check property support in non-UTF-8 mode/
- 
-/\p{L}{4}/
-    123abcdefg
-    123abc\xc4\xc5zz
+/^\x{85}$/8i
+    \x{85}

-/\p{Carian}\p{Cham}\p{Kayah_Li}\p{Lepcha}\p{Lycian}\p{Lydian}\p{Ol_Chiki}\p{Rejang}\p{Saurashtra}\p{Sundanese}\p{Vai}/8
-    \x{102A4}\x{AA52}\x{A91D}\x{1C46}\x{10283}\x{1092E}\x{1C6B}\x{A93B}\x{A8BF}\x{1BA0}\x{A50A}====
+/^abc./mgx8<any>
+    abc1 \x0aabc2 \x0babc3xx \x0cabc4 \x0dabc5xx \x0d\x0aabc6 \x{0085}abc7 \x{2028}abc8 \x{2029}abc9 JUNK

-/\x{a77d}\x{1d79}/8i
-    \x{a77d}\x{1d79}
-    \x{1d79}\x{a77d} 
+/abc.$/mgx8<any>
+    abc1\x0a abc2\x0b abc3\x0c abc4\x0d abc5\x0d\x0a abc6\x{0085} abc7\x{2028} abc8\x{2029} abc9

-/\x{a77d}\x{1d79}/8
-    \x{a77d}\x{1d79}
-    ** Failers 
-    \x{1d79}\x{a77d} 
-
-/^\p{Xan}/8
-    ABCD
-    1234
-    \x{6ca}
-    \x{a6c}
-    \x{10a7}   
+/^a\Rb/8<bsr_unicode>
+    a\nb
+    a\rb
+    a\r\nb
+    a\x0bb
+    a\x0cb
+    a\x{85}b   
+    a\x{2028}b 
+    a\x{2029}b 
     ** Failers
-    _ABC   
+    a\n\rb

-/^\p{Xan}+/8
-    ABCD1234\x{6ca}\x{a6c}\x{10a7}_
-    ** Failers
-    _ABC   
+/^a\R*b/8<bsr_unicode>
+    ab
+    a\nb
+    a\rb
+    a\r\nb
+    a\x0bb
+    a\x0c\x{2028}\x{2029}b
+    a\x{85}b   
+    a\n\rb    
+    a\n\r\x{85}\x0cb

-/^\p{Xan}*/8
-    ABCD1234\x{6ca}\x{a6c}\x{10a7}_
-    
-/^\p{Xan}{2,9}/8
-    ABCD1234\x{6ca}\x{a6c}\x{10a7}_
-    
-/^[\p{Xan}]/8
-    ABCD1234_
-    1234abcd_
-    \x{6ca}
-    \x{a6c}
-    \x{10a7}   
+/^a\R+b/8<bsr_unicode>
+    a\nb
+    a\rb
+    a\r\nb
+    a\x0bb
+    a\x0c\x{2028}\x{2029}b
+    a\x{85}b   
+    a\n\rb    
+    a\n\r\x{85}\x0cb 
     ** Failers
-    _ABC   
- 
-/^[\p{Xan}]+/8
-    ABCD1234\x{6ca}\x{a6c}\x{10a7}_
-    ** Failers
-    _ABC   
+    ab

-/^>\p{Xsp}/8
-    >\x{1680}\x{2028}\x{0b}
+/^a\R{1,3}b/8<bsr_unicode>
+    a\nb
+    a\n\rb
+    a\n\r\x{85}b
+    a\r\n\r\nb 
+    a\r\n\r\n\r\nb 
+    a\n\r\n\rb
+    a\n\n\r\nb 
     ** Failers
-    \x{0b} 
+    a\n\n\n\rb
+    a\r

-/^>\p{Xsp}+/8
-    > \x{09}\x{0a}\x{0c}\x{0d}\x{a0}\x{1680}\x{2028}\x{0b}
+/\h+\V?\v{3,4}/8 
+    \x09\x20\x{a0}X\x0a\x0b\x0c\x0d\x0a

-/^>\p{Xsp}*/8
-    > \x{09}\x{0a}\x{0c}\x{0d}\x{a0}\x{1680}\x{2028}\x{0b}
-    
-/^>\p{Xsp}{2,9}/8
-    > \x{09}\x{0a}\x{0c}\x{0d}\x{a0}\x{1680}\x{2028}\x{0b}
-    
-/^>[\p{Xsp}]/8
-    >\x{2028}\x{0b}
- 
-/^>[\p{Xsp}]+/8
-    > \x{09}\x{0a}\x{0c}\x{0d}\x{a0}\x{1680}\x{2028}\x{0b}
+/\V?\v{3,4}/8 
+    \x20\x{a0}X\x0a\x0b\x0c\x0d\x0a

-/^>\p{Xps}/8
-    >\x{1680}\x{2028}\x{0b}
-    >\x{a0} 
-    ** Failers
-    \x{0b} 
+/\h+\V?\v{3,4}/8
+    >\x09\x20\x{a0}X\x0a\x0a\x0a<

-/^>\p{Xps}+/8
-    > \x{09}\x{0a}\x{0c}\x{0d}\x{a0}\x{1680}\x{2028}\x{0b}
+/\V?\v{3,4}/8
+    >\x09\x20\x{a0}X\x0a\x0a\x0a<

-/^>\p{Xps}+?/8
-    >\x{1680}\x{2028}\x{0b}
-
-/^>\p{Xps}*/8
-    > \x{09}\x{0a}\x{0c}\x{0d}\x{a0}\x{1680}\x{2028}\x{0b}
+/\H\h\V\v/8
+    X X\x0a
+    X\x09X\x0b
+    ** Failers
+    \x{a0} X\x0a

-/^>\p{Xps}{2,9}/8
-    > \x{09}\x{0a}\x{0c}\x{0d}\x{a0}\x{1680}\x{2028}\x{0b}
+/\H*\h+\V?\v{3,4}/8 
+    \x09\x20\x{a0}X\x0a\x0b\x0c\x0d\x0a
+    \x09\x20\x{a0}\x0a\x0b\x0c\x0d\x0a
+    \x09\x20\x{a0}\x0a\x0b\x0c
+    ** Failers 
+    \x09\x20\x{a0}\x0a\x0b
+     
+/\H\h\V\v/8
+    \x{3001}\x{3000}\x{2030}\x{2028}
+    X\x{180e}X\x{85}
+    ** Failers
+    \x{2009} X\x0a

-/^>\p{Xps}{2,9}?/8
-    > \x{09}\x{0a}\x{0c}\x{0d}\x{a0}\x{1680}\x{2028}\x{0b}
-    
-/^>[\p{Xps}]/8
-    >\x{2028}\x{0b}
- 
-/^>[\p{Xps}]+/8
-    > \x{09}\x{0a}\x{0c}\x{0d}\x{a0}\x{1680}\x{2028}\x{0b}
-
-/^\p{Xwd}/8
-    ABCD
-    1234
-    \x{6ca}
-    \x{a6c}
-    \x{10a7}
-    _ABC    
+/\H*\h+\V?\v{3,4}/8 
+    \x{1680}\x{180e}\x{2007}X\x{2028}\x{2029}\x0c\x0d\x0a
+    \x09\x{205f}\x{a0}\x0a\x{2029}\x0c\x{2028}\x0a
+    \x09\x20\x{202f}\x0a\x0b\x0c
+    ** Failers 
+    \x09\x{200a}\x{a0}\x{2028}\x0b
+     
+/a\Rb/I8<bsr_anycrlf>
+    a\rb
+    a\nb
+    a\r\nb
     ** Failers
-    [] 
+    a\x{85}b
+    a\x0bb

-/^\p{Xwd}+/8
-    ABCD1234\x{6ca}\x{a6c}\x{10a7}_
-
-/^\p{Xwd}*/8
-    ABCD1234\x{6ca}\x{a6c}\x{10a7}_
+/a\Rb/I8<bsr_unicode>
+    a\rb
+    a\nb
+    a\r\nb
+    a\x{85}b
+    a\x0bb     
+    ** Failers 
+    a\x{85}b\<bsr_anycrlf>
+    a\x0bb\<bsr_anycrlf>

-/^\p{Xwd}{2,9}/8
-    A_12\x{6ca}\x{a6c}\x{10a7}
-    
-/^[\p{Xwd}]/8
-    ABCD1234_
-    1234abcd_
-    \x{6ca}
-    \x{a6c}
-    \x{10a7}   
-    _ABC 
+/a\R?b/I8<bsr_anycrlf>
+    a\rb
+    a\nb
+    a\r\nb
     ** Failers
-    []   
+    a\x{85}b
+    a\x0bb     
+
+/a\R?b/I8<bsr_unicode>
+    a\rb
+    a\nb
+    a\r\nb
+    a\x{85}b
+    a\x0bb     
+    ** Failers 
+    a\x{85}b\<bsr_anycrlf>
+    a\x0bb\<bsr_anycrlf>

-/^[\p{Xwd}]+/8
-    ABCD1234\x{6ca}\x{a6c}\x{10a7}_
+/X/8f<any> 
+    A\x{1ec5}ABCXYZ

-/-- Unicode properties for \b abd \B --/
+/abcd*/8
+    xxxxabcd\P
+    xxxxabcd\P\P

-/\b...\B/8W
-    abc_
-    \x{37e}abc\x{376} 
-    \x{37e}\x{376}\x{371}\x{393}\x{394} 
-    !\x{c0}++\x{c1}\x{c2} 
-    !\x{c0}+++++ 
+/abcd*/i8
+    xxxxabcd\P
+    xxxxabcd\P\P
+    XXXXABCD\P
+    XXXXABCD\P\P

-/-- Without PCRE_UCP, non-ASCII always fail, even if < 256  --/
+/abc\d*/8
+    xxxxabc1\P
+    xxxxabc1\P\P

-/\b...\B/8
-    abc_
-    ** Failers 
-    \x{37e}abc\x{376} 
-    \x{37e}\x{376}\x{371}\x{393}\x{394} 
-    !\x{c0}++\x{c1}\x{c2} 
-    !\x{c0}+++++ 
+/abc[de]*/8
+    xxxxabcde\P
+    xxxxabcde\P\P

-/-- With PCRE_UCP, non-UTF8 chars that are < 256 still check properties  --/
+/\bthe cat\b/8
+    the cat\P
+    the cat\P\P

-/\b...\B/W
-    abc_
-    !\x{c0}++\x{c1}\x{c2} 
-    !\x{c0}+++++ 
+/ab\Cde/8
+    abXde

+/(?<=ab\Cde)X/8
+
/-- End of testinput9 --/

Modified: code/trunk/testdata/testoutput1
===================================================================
--- code/trunk/testdata/testoutput1    2011-12-28 16:10:09 UTC (rev 835)
+++ code/trunk/testdata/testoutput1    2011-12-28 17:16:11 UTC (rev 836)
@@ -1,5 +1,6 @@
 /-- This set of tests is for features that are compatible with all versions of
-    Perl 5, in non-UTF-8 mode. --/
+    Perl >= 5.10, in non-UTF-8 mode. It should run clean for both the 8-bit and
+    16-bit PCRE libraries. --/

 /the quick brown fox/
     the quick brown fox
@@ -7053,4 +7054,1665 @@
     aJb
  0: aJb

+/\H\h\V\v/
+    X X\x0a
+ 0: X X\x0a
+    X\x09X\x0b
+ 0: X\x09X\x0b
+    ** Failers
+No match
+    \xa0 X\x0a   
+No match
+    
+/\H*\h+\V?\v{3,4}/ 
+    \x09\x20\xa0X\x0a\x0b\x0c\x0d\x0a
+ 0: \x09 \xa0X\x0a\x0b\x0c\x0d
+    \x09\x20\xa0\x0a\x0b\x0c\x0d\x0a
+ 0: \x09 \xa0\x0a\x0b\x0c\x0d
+    \x09\x20\xa0\x0a\x0b\x0c
+ 0: \x09 \xa0\x0a\x0b\x0c
+    ** Failers 
+No match
+    \x09\x20\xa0\x0a\x0b
+No match
+     
+/\H{3,4}/
+    XY  ABCDE
+ 0: ABCD
+    XY  PQR ST 
+ 0: PQR
+    
+/.\h{3,4}./
+    XY  AB    PQRS
+ 0: B    P
+
+/\h*X\h?\H+Y\H?Z/
+    >XNNNYZ
+ 0: XNNNYZ
+    >  X NYQZ
+ 0:   X NYQZ
+    ** Failers
+No match
+    >XYZ   
+No match
+    >  X NY Z
+No match
+
+/\v*X\v?Y\v+Z\V*\x0a\V+\x0b\V{2,3}\x0c/
+    >XY\x0aZ\x0aA\x0bNN\x0c
+ 0: XY\x0aZ\x0aA\x0bNN\x0c
+    >\x0a\x0dX\x0aY\x0a\x0bZZZ\x0aAAA\x0bNNN\x0c
+ 0: \x0a\x0dX\x0aY\x0a\x0bZZZ\x0aAAA\x0bNNN\x0c
+
+/(foo)\Kbar/
+    foobar
+ 0: bar
+ 1: foo
+   
+/(foo)(\Kbar|baz)/
+    foobar
+ 0: bar
+ 1: foo
+ 2: bar
+    foobaz 
+ 0: foobaz
+ 1: foo
+ 2: baz
+
+/(foo\Kbar)baz/
+    foobarbaz
+ 0: barbaz
+ 1: foobar
+
+/abc\K|def\K/g+
+    Xabcdefghi
+ 0: 
+ 0+ defghi
+ 0: 
+ 0+ ghi
+
+/ab\Kc|de\Kf/g+
+    Xabcdefghi
+ 0: c
+ 0+ defghi
+ 0: f
+ 0+ ghi
+    
+/(?=C)/g+
+    ABCDECBA
+ 0: 
+ 0+ CDECBA
+ 0: 
+ 0+ CBA
+    
+/^abc\K/+
+    abcdef
+ 0: 
+ 0+ def
+    ** Failers
+No match
+    defabcxyz   
+No match
+
+/^(a(b))\1\g1\g{1}\g-1\g{-1}\g{-02}Z/
+    ababababbbabZXXXX
+ 0: ababababbbabZ
+ 1: ab
+ 2: b
+
+/(?<A>tom|bon)-\g{A}/
+    tom-tom
+ 0: tom-tom
+ 1: tom
+    bon-bon 
+ 0: bon-bon
+ 1: bon
+    
+/(^(a|b\g{-1}))/
+    bacxxx
+No match
+
+/(?|(abc)|(xyz))\1/
+    abcabc
+ 0: abcabc
+ 1: abc
+    xyzxyz 
+ 0: xyzxyz
+ 1: xyz
+    ** Failers
+No match
+    abcxyz
+No match
+    xyzabc   
+No match
+    
+/(?|(abc)|(xyz))(?1)/
+    abcabc
+ 0: abcabc
+ 1: abc
+    xyzabc 
+ 0: xyzabc
+ 1: xyz
+    ** Failers 
+No match
+    xyzxyz 
+No match
+ 
+/^X(?5)(a)(?|(b)|(q))(c)(d)(Y)/
+    XYabcdY
+ 0: XYabcdY
+ 1: a
+ 2: b
+ 3: c
+ 4: d
+ 5: Y
+
+/^X(?7)(a)(?|(b|(r)(s))|(q))(c)(d)(Y)/
+    XYabcdY
+ 0: XYabcdY
+ 1: a
+ 2: b
+ 3: <unset>
+ 4: <unset>
+ 5: c
+ 6: d
+ 7: Y
+
+/^X(?7)(a)(?|(b|(?|(r)|(t))(s))|(q))(c)(d)(Y)/
+    XYabcdY
+ 0: XYabcdY
+ 1: a
+ 2: b
+ 3: <unset>
+ 4: <unset>
+ 5: c
+ 6: d
+ 7: Y
+
+/(?'abc'\w+):\k<abc>{2}/
+    a:aaxyz
+ 0: a:aa
+ 1: a
+    ab:ababxyz
+ 0: ab:abab
+ 1: ab
+    ** Failers
+No match
+    a:axyz
+No match
+    ab:abxyz
+No match
+
+/(?'abc'\w+):\g{abc}{2}/
+    a:aaxyz
+ 0: a:aa
+ 1: a
+    ab:ababxyz
+ 0: ab:abab
+ 1: ab
+    ** Failers
+No match
+    a:axyz
+No match
+    ab:abxyz
+No match
+
+/^(?<ab>a)? (?(<ab>)b|c) (?('ab')d|e)/x
+    abd
+ 0: abd
+ 1: a
+    ce
+ 0: ce
+
+/^(a.)\g-1Z/
+    aXaXZ
+ 0: aXaXZ
+ 1: aX
+
+/^(a.)\g{-1}Z/
+    aXaXZ
+ 0: aXaXZ
+ 1: aX
+
+/^(?(DEFINE) (?<A> a) (?<B> b) )  (?&A) (?&B) /x
+    abcd
+ 0: ab
+
+/(?<NAME>(?&NAME_PAT))\s+(?<ADDR>(?&ADDRESS_PAT))
+  (?(DEFINE)
+  (?<NAME_PAT>[a-z]+)
+  (?<ADDRESS_PAT>\d+)
+  )/x
+    metcalfe 33
+ 0: metcalfe 33
+ 1: metcalfe
+ 2: 33
+
+/(?(DEFINE)(?<byte>2[0-4]\d|25[0-5]|1\d\d|[1-9]?\d))\b(?&byte)(\.(?&byte)){3}/
+    1.2.3.4
+ 0: 1.2.3.4
+ 1: <unset>
+ 2: .4
+    131.111.10.206
+ 0: 131.111.10.206
+ 1: <unset>
+ 2: .206
+    10.0.0.0
+ 0: 10.0.0.0
+ 1: <unset>
+ 2: .0
+    ** Failers
+No match
+    10.6
+No match
+    455.3.4.5
+No match
+
+/\b(?&byte)(\.(?&byte)){3}(?(DEFINE)(?<byte>2[0-4]\d|25[0-5]|1\d\d|[1-9]?\d))/
+    1.2.3.4
+ 0: 1.2.3.4
+ 1: .4
+    131.111.10.206
+ 0: 131.111.10.206
+ 1: .206
+    10.0.0.0
+ 0: 10.0.0.0
+ 1: .0
+    ** Failers
+No match
+    10.6
+No match
+    455.3.4.5
+No match
+
+/^(\w++|\s++)*$/
+    now is the time for all good men to come to the aid of the party
+ 0: now is the time for all good men to come to the aid of the party
+ 1: party
+    *** Failers
+No match
+    this is not a line with only words and spaces!
+No match
+
+/(\d++)(\w)/
+    12345a
+ 0: 12345a
+ 1: 12345
+ 2: a
+    *** Failers
+No match
+    12345+
+No match
+
+/a++b/
+    aaab
+ 0: aaab
+
+/(a++b)/
+    aaab
+ 0: aaab
+ 1: aaab
+
+/(a++)b/
+    aaab
+ 0: aaab
+ 1: aaa
+
+/([^()]++|\([^()]*\))+/
+    ((abc(ade)ufh()()x
+ 0: abc(ade)ufh()()x
+ 1: x
+
+/\(([^()]++|\([^()]+\))+\)/
+    (abc)
+ 0: (abc)
+ 1: abc
+    (abc(def)xyz)
+ 0: (abc(def)xyz)
+ 1: xyz
+    *** Failers
+No match
+    ((()aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
+No match
+
+/^([^()]|\((?1)*\))*$/
+    abc
+ 0: abc
+ 1: c
+    a(b)c
+ 0: a(b)c
+ 1: c
+    a(b(c))d
+ 0: a(b(c))d
+ 1: d
+    *** Failers)
+No match
+    a(b(c)d
+No match
+
+/^>abc>([^()]|\((?1)*\))*<xyz<$/
+   >abc>123<xyz<
+ 0: >abc>123<xyz<
+ 1: 3
+   >abc>1(2)3<xyz<
+ 0: >abc>1(2)3<xyz<
+ 1: 3
+   >abc>(1(2)3)<xyz<
+ 0: >abc>(1(2)3)<xyz<
+ 1: (1(2)3)
+
+/^(?:((.)(?1)\2|)|((.)(?3)\4|.))$/i
+    1221
+ 0: 1221
+ 1: 1221
+ 2: 1
+    Satanoscillatemymetallicsonatas
+ 0: Satanoscillatemymetallicsonatas
+ 1: <unset>
+ 2: <unset>
+ 3: Satanoscillatemymetallicsonatas
+ 4: S
+    AmanaplanacanalPanama
+ 0: AmanaplanacanalPanama
+ 1: <unset>
+ 2: <unset>
+ 3: AmanaplanacanalPanama
+ 4: A
+    AblewasIereIsawElba
+ 0: AblewasIereIsawElba
+ 1: <unset>
+ 2: <unset>
+ 3: AblewasIereIsawElba
+ 4: A
+    *** Failers
+No match
+    Thequickbrownfox
+No match
+
+/^(\d+|\((?1)([+*-])(?1)\)|-(?1))$/
+    12
+ 0: 12
+ 1: 12
+    (((2+2)*-3)-7)
+ 0: (((2+2)*-3)-7)
+ 1: (((2+2)*-3)-7)
+ 2: -
+    -12
+ 0: -12
+ 1: -12
+    *** Failers
+No match
+    ((2+2)*-3)-7)
+No match
+
+/^(x(y|(?1){2})z)/
+    xyz
+ 0: xyz
+ 1: xyz
+ 2: y
+    xxyzxyzz
+ 0: xxyzxyzz
+ 1: xxyzxyzz
+ 2: xyzxyz
+    *** Failers
+No match
+    xxyzz
+No match
+    xxyzxyzxyzz
+No match
+
+/((< (?: (?(R) \d++  | [^<>]*+) | (?2)) * >))/x
+    <>
+ 0: <>
+ 1: <>
+ 2: <>
+    <abcd>
+ 0: <abcd>
+ 1: <abcd>
+ 2: <abcd>
+    <abc <123> hij>
+ 0: <abc <123> hij>
+ 1: <abc <123> hij>
+ 2: <abc <123> hij>
+    <abc <def> hij>
+ 0: <def>
+ 1: <def>
+ 2: <def>
+    <abc<>def>
+ 0: <abc<>def>
+ 1: <abc<>def>
+ 2: <abc<>def>
+    <abc<>
+ 0: <>
+ 1: <>
+ 2: <>
+    *** Failers
+No match
+    <abc
+No match
+
+/^a+(*FAIL)/
+    aaaaaa
+No match
+    
+/a+b?c+(*FAIL)/
+    aaabccc
+No match
+
+/a+b?(*PRUNE)c+(*FAIL)/
+    aaabccc
+No match
+
+/a+b?(*COMMIT)c+(*FAIL)/
+    aaabccc
+No match
+    
+/a+b?(*SKIP)c+(*FAIL)/
+    aaabcccaaabccc
+No match
+
+/^(?:aaa(*THEN)\w{6}|bbb(*THEN)\w{5}|ccc(*THEN)\w{4}|\w{3})/
+    aaaxxxxxx
+ 0: aaaxxxxxx
+    aaa++++++ 
+ 0: aaa
+    bbbxxxxx
+ 0: bbbxxxxx
+    bbb+++++ 
+ 0: bbb
+    cccxxxx
+ 0: cccxxxx
+    ccc++++ 
+ 0: ccc
+    dddddddd   
+ 0: ddd
+
+/^(aaa(*THEN)\w{6}|bbb(*THEN)\w{5}|ccc(*THEN)\w{4}|\w{3})/
+    aaaxxxxxx
+ 0: aaaxxxxxx
+ 1: aaaxxxxxx
+    aaa++++++ 
+ 0: aaa
+ 1: aaa
+    bbbxxxxx
+ 0: bbbxxxxx
+ 1: bbbxxxxx
+    bbb+++++ 
+ 0: bbb
+ 1: bbb
+    cccxxxx
+ 0: cccxxxx
+ 1: cccxxxx
+    ccc++++ 
+ 0: ccc
+ 1: ccc
+    dddddddd   
+ 0: ddd
+ 1: ddd
+
+/a+b?(*THEN)c+(*FAIL)/
+    aaabccc
+No match
+
+/(A (A|B(*ACCEPT)|C) D)(E)/x
+    AB
+ 0: AB
+ 1: AB
+ 2: B
+    ABX
+ 0: AB
+ 1: AB
+ 2: B
+    AADE
+ 0: AADE
+ 1: AAD
+ 2: A
+ 3: E
+    ACDE
+ 0: ACDE
+ 1: ACD
+ 2: C
+ 3: E
+    ** Failers
+No match
+    AD 
+No match
+        
+/^\W*+(?:((.)\W*+(?1)\W*+\2|)|((.)\W*+(?3)\W*+\4|\W*+.\W*+))\W*+$/i
+    1221
+ 0: 1221
+ 1: 1221
+ 2: 1
+    Satan, oscillate my metallic sonatas!
+ 0: Satan, oscillate my metallic sonatas!
+ 1: <unset>
+ 2: <unset>
+ 3: Satan, oscillate my metallic sonatas
+ 4: S
+    A man, a plan, a canal: Panama!
+ 0: A man, a plan, a canal: Panama!
+ 1: <unset>
+ 2: <unset>
+ 3: A man, a plan, a canal: Panama
+ 4: A
+    Able was I ere I saw Elba.
+ 0: Able was I ere I saw Elba.
+ 1: <unset>
+ 2: <unset>
+ 3: Able was I ere I saw Elba
+ 4: A
+    *** Failers
+No match
+    The quick brown fox
+No match
+
+/^((.)(?1)\2|.)$/
+    a
+ 0: a
+ 1: a
+    aba
+ 0: aba
+ 1: aba
+ 2: a
+    aabaa  
+ 0: aabaa
+ 1: aabaa
+ 2: a
+    abcdcba 
+ 0: abcdcba
+ 1: abcdcba
+ 2: a
+    pqaabaaqp  
+ 0: pqaabaaqp
+ 1: pqaabaaqp
+ 2: p
+    ablewasiereisawelba
+ 0: ablewasiereisawelba
+ 1: ablewasiereisawelba
+ 2: a
+    rhubarb
+No match
+    the quick brown fox  
+No match
+
+/(a)(?<=b(?1))/
+    baz
+ 0: a
+ 1: a
+    ** Failers
+No match
+    caz  
+No match
+    
+/(?<=b(?1))(a)/
+    zbaaz
+ 0: a
+ 1: a
+    ** Failers
+No match
+    aaa  
+No match
+    
+/(?<X>a)(?<=b(?&X))/
+    baz
+ 0: a
+ 1: a
+
+/^(?|(abc)|(def))\1/
+    abcabc
+ 0: abcabc
+ 1: abc
+    defdef 
+ 0: defdef
+ 1: def
+    ** Failers
+No match
+    abcdef
+No match
+    defabc   
+No match
+    
+/^(?|(abc)|(def))(?1)/
+    abcabc
+ 0: abcabc
+ 1: abc
+    defabc
+ 0: defabc
+ 1: def
+    ** Failers
+No match
+    defdef
+No match
+    abcdef    
+No match
+
+/(?:a(?<quote> (?<apostrophe>')|(?<realquote>")) |b(?<quote> (?<apostrophe>')|(?<realquote>")) ) (?('quote')[a-z]+|[0-9]+)/xJ
+    a\"aaaaa
+ 0: a"aaaaa
+ 1: "
+ 2: <unset>
+ 3: "
+    b\"aaaaa 
+ 0: b"aaaaa
+ 1: <unset>
+ 2: <unset>
+ 3: <unset>
+ 4: "
+ 5: <unset>
+ 6: "
+    ** Failers 
+No match
+    b\"11111
+No match
+
+/(?:(?1)|B)(A(*F)|C)/
+    ABCD
+ 0: BC
+ 1: C
+    CCD
+ 0: CC
+ 1: C
+    ** Failers
+No match
+    CAD   
+No match
+
+/^(?:(?1)|B)(A(*F)|C)/
+    CCD
+ 0: CC
+ 1: C
+    BCD 
+ 0: BC
+ 1: C
+    ** Failers
+No match
+    ABCD
+No match
+    CAD
+No match
+    BAD    
+No match
+
+/(?:(?1)|B)(A(*ACCEPT)XX|C)D/
+    AAD
+ 0: AA
+ 1: A
+    ACD
+ 0: ACD
+ 1: C
+    BAD
+ 0: BA
+ 1: A
+    BCD
+ 0: BCD
+ 1: C
+    BAX  
+ 0: BA
+ 1: A
+    ** Failers
+No match
+    ACX
+No match
+    ABC   
+No match
+
+/(?(DEFINE)(A))B(?1)C/
+    BAC
+ 0: BAC
+
+/(?(DEFINE)((A)\2))B(?1)C/
+    BAAC
+ 0: BAAC
+
+/(?<pn> \( ( [^()]++ | (?&pn) )* \) )/x
+    (ab(cd)ef)
+ 0: (ab(cd)ef)
+ 1: (ab(cd)ef)
+ 2: ef
+
+/^(?!a(*SKIP)b)/
+    ac
+ 0: 
+    
+/^(?=a(*SKIP)b|ac)/
+    ** Failers
+No match
+    ac
+No match
+    
+/^(?=a(*THEN)b|ac)/
+    ac
+ 0: 
+    
+/^(?=a(*PRUNE)b)/
+    ab  
+ 0: 
+    ** Failers 
+No match
+    ac
+No match
+
+/^(?=a(*ACCEPT)b)/
+    ac
+ 0: 
+
+/^(?(?!a(*SKIP)b))/
+    ac
+ 0: 
+
+/(?>a\Kb)/
+    ab
+ 0: b
+
+/((?>a\Kb))/
+    ab
+ 0: b
+ 1: ab
+
+/(a\Kb)/
+    ab
+ 0: b
+ 1: ab
+    
+/^a\Kcz|ac/
+    ac
+ 0: ac
+    
+/(?>a\Kbz|ab)/
+    ab 
+ 0: ab
+
+/^(?&t)(?(DEFINE)(?<t>a\Kb))$/
+    ab
+ 0: b
+
+/^([^()]|\((?1)*\))*$/
+    a(b)c
+ 0: a(b)c
+ 1: c
+    a(b(c)d)e 
+ 0: a(b(c)d)e
+ 1: e
+
+/(?P<L1>(?P<L2>0)(?P>L1)|(?P>L2))/
+    0
+ 0: 0
+ 1: 0
+    00
+ 0: 00
+ 1: 00
+ 2: 0
+    0000  
+ 0: 0000
+ 1: 0000
+ 2: 0
+
+/(?P<L1>(?P<L2>0)|(?P>L2)(?P>L1))/
+    0
+ 0: 0
+ 1: 0
+ 2: 0
+    00
+ 0: 0
+ 1: 0
+ 2: 0
+    0000  
+ 0: 0
+ 1: 0
+ 2: 0
+
+/--- This one does fail, as expected, in Perl. It needs the complex item at the
+     end of the pattern. A single letter instead of (B|D) makes it not fail,
+     which I think is a Perl bug. --- /
+
+/A(*COMMIT)(B|D)/
+    ACABX
+No match
+
+/--- Check the use of names for failure ---/
+
+/^(A(*PRUNE:A)B|C(*PRUNE:B)D)/K
+    ** Failers
+No match
+    AC
+No match, mark = A
+    CB    
+No match, mark = B
+    
+/--- Force no study, otherwise mark is not seen. The studied version is in
+     test 2 because it isn't Perl-compatible. ---/
+
+/(*MARK:A)(*SKIP:B)(C|X)/KSS
+    C
+ 0: C
+ 1: C
+MK: A
+    D
+No match, mark = A
+     
+/^(A(*THEN:A)B|C(*THEN:B)D)/K
+    ** Failers
+No match
+    CB    
+No match, mark = B
+
+/^(?:A(*THEN:A)B|C(*THEN:B)D)/K
+    CB    
+No match, mark = B
+    
+/^(?>A(*THEN:A)B|C(*THEN:B)D)/K
+    CB    
+No match, mark = B
+    
+/--- This should succeed, as the skip causes bump to offset 1 (the mark). Note
+that we have to have something complicated such as (B|Z) at the end because,
+for Perl, a simple character somehow causes an unwanted optimization to mess
+with the handling of backtracking verbs. ---/
+
+/A(*MARK:A)A+(*SKIP:A)(B|Z) | AC/xK
+    AAAC
+ 0: AC
+    
+/--- Test skipping over a non-matching mark. ---/
+
+/A(*MARK:A)A+(*MARK:B)(*SKIP:A)(B|Z) | AC/xK
+    AAAC
+ 0: AC
+    
+/--- Check shorthand for MARK ---/
+
+/A(*:A)A+(*SKIP:A)(B|Z) | AC/xK
+    AAAC
+ 0: AC
+
+/--- Don't loop! Force no study, otherwise mark is not seen. ---/
+
+/(*:A)A+(*SKIP:A)(B|Z)/KSS
+    AAAC
+No match, mark = A
+
+/--- This should succeed, as a non-existent skip name disables the skip ---/ 
+
+/A(*MARK:A)A+(*SKIP:B)(B|Z) | AC/xK
+    AAAC
+ 0: AC
+
+/A(*MARK:A)A+(*SKIP:B)(B|Z) | AC(*:B)/xK
+    AAAC
+ 0: AC
+MK: B
+
+/--- COMMIT at the start of a pattern should act like an anchor. Again, 
+however, we need the complication for Perl. ---/
+
+/(*COMMIT)(A|P)(B|P)(C|P)/
+    ABCDEFG
+ 0: ABC
+ 1: A
+ 2: B
+ 3: C
+    ** Failers
+No match
+    DEFGABC  
+No match
+
+/--- COMMIT inside an atomic group can't stop backtracking over the group. ---/
+
+/(\w+)(?>b(*COMMIT))\w{2}/
+    abbb
+ 0: abbb
+ 1: a
+
+/(\w+)b(*COMMIT)\w{2}/
+    abbb
+No match
+
+/--- Check opening parens in comment when seeking forward reference. ---/ 
+
+/(?&t)(?#()(?(DEFINE)(?<t>a))/
+    bac
+ 0: a
+
+/--- COMMIT should override THEN ---/
+
+/(?>(*COMMIT)(?>yes|no)(*THEN)(*F))?/
+  yes
+No match
+
+/(?>(*COMMIT)(yes|no)(*THEN)(*F))?/
+  yes
+No match
+
+/b?(*SKIP)c/
+    bc
+ 0: bc
+    abc
+ 0: bc
+   
+/(*SKIP)bc/
+    a
+No match
+
+/(*SKIP)b/
+    a 
+No match
+
+/(?P<abn>(?P=abn)xxx|)+/
+    xxx
+ 0: 
+ 1: 
+
+/(?i:([^b]))(?1)/
+    aa
+ 0: aa
+ 1: a
+    aA     
+ 0: aA
+ 1: a
+    ** Failers
+ 0: **
+ 1: *
+    ab
+No match
+    aB
+No match
+    Ba
+No match
+    ba
+No match
+
+/^(?&t)*+(?(DEFINE)(?<t>a))\w$/
+    aaaaaaX
+ 0: aaaaaaX
+    ** Failers 
+No match
+    aaaaaa 
+No match
+
+/^(?&t)*(?(DEFINE)(?<t>a))\w$/
+    aaaaaaX
+ 0: aaaaaaX
+    aaaaaa 
+ 0: aaaaaa
+
+/^(a)*+(\w)/
+    aaaaX
+ 0: aaaaX
+ 1: a
+ 2: X
+    YZ 
+ 0: Y
+ 1: <unset>
+ 2: Y
+    ** Failers 
+No match
+    aaaa
+No match
+
+/^(?:a)*+(\w)/
+    aaaaX
+ 0: aaaaX
+ 1: X
+    YZ 
+ 0: Y
+ 1: Y
+    ** Failers 
+No match
+    aaaa
+No match
+
+/^(a)++(\w)/
+    aaaaX
+ 0: aaaaX
+ 1: a
+ 2: X
+    ** Failers 
+No match
+    aaaa
+No match
+    YZ 
+No match
+
+/^(?:a)++(\w)/
+    aaaaX
+ 0: aaaaX
+ 1: X
+    ** Failers 
+No match
+    aaaa
+No match
+    YZ 
+No match
+
+/^(a)?+(\w)/
+    aaaaX
+ 0: aa
+ 1: a
+ 2: a
+    YZ 
+ 0: Y
+ 1: <unset>
+ 2: Y
+
+/^(?:a)?+(\w)/
+    aaaaX
+ 0: aa
+ 1: a
+    YZ 
+ 0: Y
+ 1: Y
+
+/^(a){2,}+(\w)/
+    aaaaX
+ 0: aaaaX
+ 1: a
+ 2: X
+    ** Failers
+No match
+    aaa
+No match
+    YZ 
+No match
+
+/^(?:a){2,}+(\w)/
+    aaaaX
+ 0: aaaaX
+ 1: X
+    ** Failers
+No match
+    aaa
+No match
+    YZ 
+No match
+
+/(a|)*(?1)b/
+    b
+ 0: b
+ 1: 
+    ab
+ 0: ab
+ 1: 
+    aab  
+ 0: aab
+ 1: 
+
+/(a)++(?1)b/
+    ** Failers
+No match
+    ab 
+No match
+    aab
+No match
+
+/(a)*+(?1)b/
+    ** Failers
+No match
+    ab
+No match
+    aab  
+No match
+
+/(?1)(?:(b)){0}/
+    b
+ 0: b
+
+/(foo ( \( ((?:(?> [^()]+ )|(?2))*) \) ) )/x
+    foo(bar(baz)+baz(bop))
+ 0: foo(bar(baz)+baz(bop))
+ 1: foo(bar(baz)+baz(bop))
+ 2: (bar(baz)+baz(bop))
+ 3: bar(baz)+baz(bop)
+
+/(A (A|B(*ACCEPT)|C) D)(E)/x
+    AB
+ 0: AB
+ 1: AB
+ 2: B
+
+/\A.*?(?:a|b(*THEN)c)/
+    ba
+ 0: ba
+
+/\A.*?(?:a|bc)/
+    ba
+ 0: ba
+
+/\A.*?(a|b(*THEN)c)/
+    ba
+ 0: ba
+ 1: a
+
+/\A.*?(a|bc)/
+    ba
+ 0: ba
+ 1: a
+
+/\A.*?(?:a|b(*THEN)c)++/
+    ba
+ 0: ba
+
+/\A.*?(?:a|bc)++/
+    ba
+ 0: ba
+
+/\A.*?(a|b(*THEN)c)++/
+    ba
+ 0: ba
+ 1: a
+
+/\A.*?(a|bc)++/
+    ba
+ 0: ba
+ 1: a
+
+/\A.*?(?:a|b(*THEN)c|d)/
+    ba
+ 0: ba
+
+/\A.*?(?:a|bc|d)/
+    ba
+ 0: ba
+
+/(?:(b))++/
+    beetle
+ 0: b
+ 1: b
+
+/(?(?=(a(*ACCEPT)z))a)/
+    a
+ 0: a
+ 1: a
+
+/^(a)(?1)+ab/
+    aaaab
+ 0: aaaab
+ 1: a
+    
+/^(a)(?1)++ab/
+    aaaab
+No match
+
+/^(?=a(*:M))aZ/K
+    aZbc
+ 0: aZ
+MK: M
+
+/^(?!(*:M)b)aZ/K
+    aZbc
+ 0: aZ
+
+/(?(DEFINE)(a))?b(?1)/
+    backgammon
+ 0: ba
+
+/^\N+/
+    abc\ndef
+ 0: abc
+    
+/^\N{1,}/
+    abc\ndef 
+ 0: abc
+
+/(?(R)a+|(?R)b)/
+    aaaabcde
+ 0: aaaab
+
+/(?(R)a+|((?R))b)/
+    aaaabcde
+ 0: aaaab
+ 1: aaaa
+
+/((?(R)a+|(?1)b))/
+    aaaabcde
+ 0: aaaab
+ 1: aaaab
+
+/((?(R1)a+|(?1)b))/
+    aaaabcde
+ 0: aaaab
+ 1: aaaab
+
+/a(*:any 
+name)/K
+    abc
+ 0: a
+MK: any \x0aname
+    
+/(?>(?&t)c|(?&t))(?(DEFINE)(?<t>a|b(*PRUNE)c))/
+    a
+ 0: a
+    ba
+ 0: a
+    bba 
+ 0: a
+    
+/--- Checking revised (*THEN) handling ---/ 
+
+/--- Capture ---/
+
+/^.*? (a(*THEN)b) c/x
+    aabc
+No match
+
+/^.*? (a(*THEN)b|(*F)) c/x
+    aabc
+ 0: aabc
+ 1: ab
+
+/^.*? ( (a(*THEN)b) | (*F) ) c/x
+    aabc
+ 0: aabc
+ 1: ab
+ 2: ab
+
+/^.*? ( (a(*THEN)b) ) c/x
+    aabc
+No match
+
+/--- Non-capture ---/
+
+/^.*? (?:a(*THEN)b) c/x
+    aabc
+No match
+
+/^.*? (?:a(*THEN)b|(*F)) c/x
+    aabc
+ 0: aabc
+
+/^.*? (?: (?:a(*THEN)b) | (*F) ) c/x
+    aabc
+ 0: aabc
+
+/^.*? (?: (?:a(*THEN)b) ) c/x
+    aabc
+No match
+
+/--- Atomic ---/
+
+/^.*? (?>a(*THEN)b) c/x
+    aabc
+No match
+
+/^.*? (?>a(*THEN)b|(*F)) c/x
+    aabc
+ 0: aabc
+
+/^.*? (?> (?>a(*THEN)b) | (*F) ) c/x
+    aabc
+ 0: aabc
+
+/^.*? (?> (?>a(*THEN)b) ) c/x
+    aabc
+No match
+
+/--- Possessive capture ---/
+
+/^.*? (a(*THEN)b)++ c/x
+    aabc
+No match
+
+/^.*? (a(*THEN)b|(*F))++ c/x
+    aabc
+ 0: aabc
+ 1: ab
+
+/^.*? ( (a(*THEN)b)++ | (*F) )++ c/x
+    aabc
+ 0: aabc
+ 1: ab
+ 2: ab
+
+/^.*? ( (a(*THEN)b)++ )++ c/x
+    aabc
+No match
+
+/--- Possessive non-capture ---/
+
+/^.*? (?:a(*THEN)b)++ c/x
+    aabc
+No match
+
+/^.*? (?:a(*THEN)b|(*F))++ c/x
+    aabc
+ 0: aabc
+
+/^.*? (?: (?:a(*THEN)b)++ | (*F) )++ c/x
+    aabc
+ 0: aabc
+
+/^.*? (?: (?:a(*THEN)b)++ )++ c/x
+    aabc
+No match
+    
+/--- Condition assertion ---/
+
+/^(?(?=a(*THEN)b)ab|ac)/
+    ac
+ 0: ac
+ 
+/--- Condition ---/
+
+/^.*?(?(?=a)a|b(*THEN)c)/
+    ba
+No match
+
+/^.*?(?:(?(?=a)a|b(*THEN)c)|d)/
+    ba
+ 0: ba
+
+/^.*?(?(?=a)a(*THEN)b|c)/
+    ac
+No match
+
+/--- Assertion ---/
+
+/^.*(?=a(*THEN)b)/ 
+    aabc
+ 0: a
+
+/------------------------------/
+
+/(?>a(*:m))/imsxSK 
+    a
+ 0: a
+MK: m
+
+/(?>(a)(*:m))/imsxSK 
+    a
+ 0: a
+ 1: a
+MK: m
+
+/(?<=a(*ACCEPT)b)c/
+    xacd
+ 0: c
+
+/(?<=(a(*ACCEPT)b))c/
+    xacd
+ 0: c
+ 1: a
+
+/(?<=(a(*COMMIT)b))c/
+    xabcd
+ 0: c
+ 1: ab
+    ** Failers 
+No match
+    xacd
+No match
+    
+/(?<!a(*FAIL)b)c/
+    xcd
+ 0: c
+    acd 
+ 0: c
+
+/(?<=a(*:N)b)c/K
+    xabcd
+ 0: c
+MK: N
+    
+/(?<=a(*PRUNE)b)c/
+    xabcd 
+ 0: c
+
+/(?<=a(*SKIP)b)c/
+    xabcd 
+ 0: c
+
+/(?<=a(*THEN)b)c/
+    xabcd 
+ 0: c
+
+/(a)(?2){2}(.)/
+    abcd
+ 0: abcd
+ 1: a
+ 2: d
+
+/(*MARK:A)(*PRUNE:B)(C|X)/KS
+    C
+ 0: C
+ 1: C
+MK: B
+    D 
+No match, mark = B
+
+/(*MARK:A)(*PRUNE:B)(C|X)/KSS
+    C
+ 0: C
+ 1: C
+MK: B
+    D 
+No match, mark = B
+
+/(*MARK:A)(*THEN:B)(C|X)/KS
+    C
+ 0: C
+ 1: C
+MK: B
+    D 
+No match, mark = B
+
+/(*MARK:A)(*THEN:B)(C|X)/KSY
+    C
+ 0: C
+ 1: C
+MK: B
+    D 
+No match, mark = B
+
+/(*MARK:A)(*THEN:B)(C|X)/KSS
+    C
+ 0: C
+ 1: C
+MK: B
+    D 
+No match, mark = B
+
+/--- This should fail, as the skip causes a bump to offset 3 (the skip) ---/
+
+/A(*MARK:A)A+(*SKIP)(B|Z) | AC/xK
+    AAAC
+No match, mark = A
+
+/--- Same --/
+
+/A(*MARK:A)A+(*MARK:B)(*SKIP:B)(B|Z) | AC/xK
+    AAAC
+No match, mark = B
+
+/A(*:A)A+(*SKIP)(B|Z) | AC/xK
+    AAAC
+No match, mark = A
+
+/--- This should fail, as a null name is the same as no name ---/
+
+/A(*MARK:A)A+(*SKIP:)(B|Z) | AC/xK
+    AAAC
+No match, mark = A
+
+/--- A check on what happens after hitting a mark and them bumping along to
+something that does not even start. Perl reports tags after the failures here, 
+though it does not when the individual letters are made into something 
+more complicated. ---/
+
+/A(*:A)B|XX(*:B)Y/K
+    AABC
+ 0: AB
+MK: A
+    XXYZ 
+ 0: XXY
+MK: B
+    ** Failers
+No match
+    XAQQ  
+No match, mark = A
+    XAQQXZZ  
+No match, mark = A
+    AXQQQ 
+No match, mark = A
+    AXXQQQ 
+No match, mark = B
+    
+/^(A(*THEN:A)B|C(*THEN:B)D)/K
+    AB
+ 0: AB
+ 1: AB
+MK: A
+    CD
+ 0: CD
+ 1: CD
+MK: B
+    ** Failers
+No match
+    AC
+No match, mark = A
+    CB    
+No match, mark = B
+    
+/^(A(*PRUNE:A)B|C(*PRUNE:B)D)/K
+    AB
+ 0: AB
+ 1: AB
+MK: A
+    CD
+ 0: CD
+ 1: CD
+MK: B
+    ** Failers
+No match
+    AC
+No match, mark = A
+    CB    
+No match, mark = B
+    
+/--- An empty name does not pass back an empty string. It is the same as if no
+name were given. ---/ 
+
+/^(A(*PRUNE:)B|C(*PRUNE:B)D)/K
+    AB
+ 0: AB
+ 1: AB
+    CD 
+ 0: CD
+ 1: CD
+MK: B
+
+/--- PRUNE goes to next bumpalong; COMMIT does not. ---/
+    
+/A(*PRUNE:A)B/K
+    ACAB
+ 0: AB
+MK: A
+
+/--- Mark names can be duplicated ---/
+
+/A(*:A)B|X(*:A)Y/K
+    AABC
+ 0: AB
+MK: A
+    XXYZ 
+ 0: XY
+MK: A
+    
+/b(*:m)f|a(*:n)w/K
+    aw 
+ 0: aw
+MK: n
+    ** Failers 
+No match, mark = n
+    abc
+No match, mark = m
+
+/b(*:m)f|aw/K
+    abaw
+ 0: aw
+    ** Failers 
+No match
+    abc
+No match, mark = m
+    abax 
+No match, mark = m
+
+/A(*MARK:A)A+(*SKIP:B)(B|Z) | AAC/xK
+    AAAC
+ 0: AAC
+
+/a(*PRUNE:X)bc|qq/KY
+    ** Failers
+No match, mark = X
+    axy
+No match, mark = X
+
+/a(*THEN:X)bc|qq/KY
+    ** Failers
+No match, mark = X
+    axy
+No match, mark = X
+
+/(?=a(*MARK:A)b)..x/K
+    abxy
+ 0: abx
+MK: A
+    ** Failers
+No match
+    abpq  
+No match
+
+/(?=a(*MARK:A)b)..(*:Y)x/K
+    abxy
+ 0: abx
+MK: Y
+    ** Failers
+No match
+    abpq  
+No match
+
+/(?=a(*PRUNE:A)b)..x/K
+    abxy
+ 0: abx
+MK: A
+    ** Failers
+No match
+    abpq  
+No match
+
+/(?=a(*PRUNE:A)b)..(*:Y)x/K
+    abxy
+ 0: abx
+MK: Y
+    ** Failers
+No match
+    abpq  
+No match
+
+/(?=a(*THEN:A)b)..x/K
+    abxy
+ 0: abx
+MK: A
+    ** Failers
+No match
+    abpq  
+No match
+
+/(?=a(*THEN:A)b)..(*:Y)x/K
+    abxy
+ 0: abx
+MK: Y
+    ** Failers
+No match
+    abpq  
+No match
+
+/(another)?(\1?)test/
+    hello world test
+ 0: test
+ 1: <unset>
+ 2: 
+
+/(another)?(\1+)test/
+    hello world test
+No match
+
 /-- End of testinput1 --/

Modified: code/trunk/testdata/testoutput10
===================================================================
--- code/trunk/testdata/testoutput10    2011-12-28 16:10:09 UTC (rev 835)
+++ code/trunk/testdata/testoutput10    2011-12-28 17:16:11 UTC (rev 836)
@@ -1,728 +1,2037 @@
-/-- These are a few representative patterns whose lengths and offsets are to be 
-shown when the link size is 2. This is just a doublecheck test to ensure the 
-sizes don't go horribly wrong when something is changed. The pattern contents 
-are all themselves checked in other tests. Unicode, including property support, 
-is required for these tests. --/
+/-- This set of tests check Unicode property support with the DFA matching 
+    functionality of pcre_dfa_exec(). The -dfa flag must be used with pcretest
+    when running it. --/

-/((?i)b)/BM
-Memory allocation (code space): 17
-------------------------------------------------------------------
-  0  13 Bra
-  3   7 CBra 1
-  8  /i b
- 10   7 Ket
- 13  13 Ket
- 16     End
-------------------------------------------------------------------
+/\pL\P{Nd}/8
+    AB
+ 0: AB
+    *** Failers
+ 0: Fa
+    A0
+No match
+    00   
+No match

-/(?s)(.*X|^B)/BM
-Memory allocation (code space): 25
-------------------------------------------------------------------
-  0  21 Bra
-  3   9 CBra 1
-  8     AllAny*
- 10     X
- 12   6 Alt
- 15     ^
- 16     B
- 18  15 Ket
- 21  21 Ket
- 24     End
-------------------------------------------------------------------
+/\X./8
+    AB
+ 0: AB
+    A\x{300}BC 
+ 0: A\x{300}B
+    A\x{300}\x{301}\x{302}BC 
+ 0: A\x{300}\x{301}\x{302}B
+    *** Failers
+ 0: **
+    \x{300}  
+No match

-/(?s:.*X|^B)/BM
-Memory allocation (code space): 23
-------------------------------------------------------------------
-  0  19 Bra
-  3   7 Bra
-  6     AllAny*
-  8     X
- 10   6 Alt
- 13     ^
- 14     B
- 16  13 Ket
- 19  19 Ket
- 22     End
-------------------------------------------------------------------
+/\X\X/8
+    ABC
+ 0: AB
+    A\x{300}B\x{300}\x{301}C 
+ 0: A\x{300}B\x{300}\x{301}
+    A\x{300}\x{301}\x{302}BC 
+ 0: A\x{300}\x{301}\x{302}B
+    *** Failers
+ 0: **
+    \x{300}  
+No match

-/^[[:alnum:]]/BM
-Memory allocation (code space): 41
-------------------------------------------------------------------
-  0  37 Bra
-  3     ^
-  4     [0-9A-Za-z]
- 37  37 Ket
- 40     End
-------------------------------------------------------------------
+/^\pL+/8
+    abcd
+ 0: abcd
+ 1: abc
+ 2: ab
+ 3: a
+    a 
+ 0: a
+    *** Failers 
+No match

-/#/IxMD
-Memory allocation (code space): 7
-------------------------------------------------------------------
-  0   3 Bra
-  3   3 Ket
-  6     End
-------------------------------------------------------------------
-Capturing subpattern count = 0
-Options: extended
-No first char
-No need char
+/^\PL+/8
+    1234
+ 0: 1234
+ 1: 123
+ 2: 12
+ 3: 1
+    = 
+ 0: =
+    *** Failers 
+ 0: *** 
+ 1: ***
+ 2: **
+ 3: *
+    abcd 
+No match

-/a#/IxMD
-Memory allocation (code space): 9
-------------------------------------------------------------------
-  0   5 Bra
-  3     a
-  5   5 Ket
-  8     End
-------------------------------------------------------------------
-Capturing subpattern count = 0
-Options: extended
-First char = 'a'
-No need char
+/^\X+/8
+    abcdA\x{300}\x{301}\x{302}
+ 0: abcdA\x{300}\x{301}\x{302}
+ 1: abcd
+ 2: abc
+ 3: ab
+ 4: a
+    A\x{300}\x{301}\x{302}
+ 0: A\x{300}\x{301}\x{302}
+    A\x{300}\x{301}\x{302}A\x{300}\x{301}\x{302}
+ 0: A\x{300}\x{301}\x{302}A\x{300}\x{301}\x{302}
+ 1: A\x{300}\x{301}\x{302}
+    a 
+ 0: a
+    *** Failers 
+ 0: *** Failers
+ 1: *** Failer
+ 2: *** Faile
+ 3: *** Fail
+ 4: *** Fai
+ 5: *** Fa
+ 6: *** F
+ 7: *** 
+ 8: ***
+ 9: **
+10: *
+    \x{300}\x{301}\x{302}
+No match

-/x?+/BM
-Memory allocation (code space): 9
-------------------------------------------------------------------
-  0   5 Bra
-  3     x?+
-  5   5 Ket
-  8     End
-------------------------------------------------------------------
+/\X?abc/8
+    abc
+ 0: abc
+    A\x{300}abc
+ 0: A\x{300}abc
+    A\x{300}\x{301}\x{302}A\x{300}A\x{300}A\x{300}abcxyz
+ 0: A\x{300}abc
+    \x{300}abc  
+ 0: abc
+    *** Failers
+No match

-/x++/BM
-Memory allocation (code space): 9
-------------------------------------------------------------------
-  0   5 Bra
-  3     x++
-  5   5 Ket
-  8     End
-------------------------------------------------------------------
+/^\X?abc/8
+    abc
+ 0: abc
+    A\x{300}abc
+ 0: A\x{300}abc
+    *** Failers
+No match
+    A\x{300}\x{301}\x{302}A\x{300}A\x{300}A\x{300}abcxyz
+No match
+    \x{300}abc  
+No match

-/x{1,3}+/BM 
-Memory allocation (code space): 19
-------------------------------------------------------------------
-  0  15 Bra
-  3   9 Once
-  6     x
-  8     x{0,2}
- 12   9 Ket
- 15  15 Ket
- 18     End
-------------------------------------------------------------------
+/\X*abc/8
+    abc
+ 0: abc
+    A\x{300}abc
+ 0: A\x{300}abc
+    A\x{300}\x{301}\x{302}A\x{300}A\x{300}A\x{300}abcxyz
+ 0: A\x{300}\x{301}\x{302}A\x{300}A\x{300}A\x{300}abc
+    \x{300}abc  
+ 0: abc
+    *** Failers
+No match

-/(x)*+/BM
-Memory allocation (code space): 18
-------------------------------------------------------------------
-  0  14 Bra
-  3     Braposzero
-  4   7 CBraPos 1
-  9     x
- 11   7 KetRpos
- 14  14 Ket
- 17     End
-------------------------------------------------------------------
+/^\X*abc/8
+    abc
+ 0: abc
+    A\x{300}abc
+ 0: A\x{300}abc
+    A\x{300}\x{301}\x{302}A\x{300}A\x{300}A\x{300}abcxyz
+ 0: A\x{300}\x{301}\x{302}A\x{300}A\x{300}A\x{300}abc
+    *** Failers
+No match
+    \x{300}abc  
+No match

-/^((a+)(?U)([ab]+)(?-U)([bc]+)(\w*))/BM
-Memory allocation (code space): 120
-------------------------------------------------------------------
-  0 116 Bra
-  3     ^
-  4 109 CBra 1
-  9   7 CBra 2
- 14     a+
- 16   7 Ket
- 19  39 CBra 3
- 24     [ab]+?
- 58  39 Ket
- 61  39 CBra 4
- 66     [bc]+
-100  39 Ket
-103   7 CBra 5
-108     \w*
-110   7 Ket
-113 109 Ket
-116 116 Ket
-119     End
-------------------------------------------------------------------
+/^\pL?=./8
+    A=b
+ 0: A=b
+    =c 
+ 0: =c
+    *** Failers
+No match
+    1=2 
+No match
+    AAAA=b  
+No match

-|8J\$WE\<\.rX\+ix\[d1b\!H\#\?vV0vrK\:ZH1\=2M\>iV\;\?aPhFB\<\*vW\@QW\@sO9\}cfZA\-i\'w\%hKd6gt1UJP\,15_\#QY\$M\^Mss_U\/\]\&LK9\[5vQub\^w\[KDD\<EjmhUZ\?\.akp2dF\>qmj\;2\}YWFdYx\.Ap\]hjCPTP\(n28k\+3\;o\&WXqs\/gOXdr\$\:r\'do0\;b4c\(f_Gr\=\"\\4\)\[01T7ajQJvL\$W\~mL_sS\/4h\:x\*\[ZN\=KLs\&L5zX\/\/\>it\,o\:aU\(\;Z\>pW\&T7oP\'2K\^E\:x9\'c\[\%z\-\,64JQ5AeH_G\#KijUKghQw\^\\vea3a\?kka_G\$8\#\`\*kynsxzBLru\'\]k_\[7FrVx\}\^\=\$blx\>s\-N\%j\;D\*aZDnsw\:YKZ\%Q\.Kne9\#hP\?\+b3\(SOvL\,\^\;\&u5\@\?5C5Bhb\=m\-vEh_L15Jl\]U\)0RP6\{q\%L\^_z5E\'Dw6X\b|BM
-Memory allocation (code space): 826
-------------------------------------------------------------------
-  0 822 Bra
-  3     8J$WE<.rX+ix[d1b!H#?vV0vrK:ZH1=2M>iV;?aPhFB<*vW@QW@sO9}cfZA-i'w%hKd6gt1UJP,15_#QY$M^Mss_U/]&LK9[5vQub^w[KDD<EjmhUZ?.akp2dF>qmj;2}YWFdYx.Ap]hjCPTP(n28k+3;o&WXqs/gOXdr$:r'do0;b4c(f_Gr="\4)[01T7ajQJvL$W~mL_sS/4h:x*[ZN=KLs&L5zX//>it,o:aU(;Z>pW&T7oP'2K^E:x9'c[%z-,64JQ5AeH_G#KijUKghQw^\vea3a?kka_G$8#`*kynsxzBLru']k_[7FrVx}^=$blx>s-N%j;D*aZDnsw:YKZ%Q.Kne9#hP?+b3(SOvL,^;&u5@?5C5Bhb=m-vEh_L15Jl]U)0RP6{q%L^_z5E'Dw6X
-821     \b
-822 822 Ket
-825     End
-------------------------------------------------------------------
+/^\pL*=./8
+    AAAA=b
+ 0: AAAA=b
+    =c 
+ 0: =c
+    *** Failers
+No match
+    1=2  
+No match

-|\$\<\.X\+ix\[d1b\!H\#\?vV0vrK\:ZH1\=2M\>iV\;\?aPhFB\<\*vW\@QW\@sO9\}cfZA\-i\'w\%hKd6gt1UJP\,15_\#QY\$M\^Mss_U\/\]\&LK9\[5vQub\^w\[KDD\<EjmhUZ\?\.akp2dF\>qmj\;2\}YWFdYx\.Ap\]hjCPTP\(n28k\+3\;o\&WXqs\/gOXdr\$\:r\'do0\;b4c\(f_Gr\=\"\\4\)\[01T7ajQJvL\$W\~mL_sS\/4h\:x\*\[ZN\=KLs\&L5zX\/\/\>it\,o\:aU\(\;Z\>pW\&T7oP\'2K\^E\:x9\'c\[\%z\-\,64JQ5AeH_G\#KijUKghQw\^\\vea3a\?kka_G\$8\#\`\*kynsxzBLru\'\]k_\[7FrVx\}\^\=\$blx\>s\-N\%j\;D\*aZDnsw\:YKZ\%Q\.Kne9\#hP\?\+b3\(SOvL\,\^\;\&u5\@\?5C5Bhb\=m\-vEh_L15Jl\]U\)0RP6\{q\%L\^_z5E\'Dw6X\b|BM
-Memory allocation (code space): 816
-------------------------------------------------------------------
-  0 812 Bra
-  3     $<.X+ix[d1b!H#?vV0vrK:ZH1=2M>iV;?aPhFB<*vW@QW@sO9}cfZA-i'w%hKd6gt1UJP,15_#QY$M^Mss_U/]&LK9[5vQub^w[KDD<EjmhUZ?.akp2dF>qmj;2}YWFdYx.Ap]hjCPTP(n28k+3;o&WXqs/gOXdr$:r'do0;b4c(f_Gr="\4)[01T7ajQJvL$W~mL_sS/4h:x*[ZN=KLs&L5zX//>it,o:aU(;Z>pW&T7oP'2K^E:x9'c[%z-,64JQ5AeH_G#KijUKghQw^\vea3a?kka_G$8#`*kynsxzBLru']k_[7FrVx}^=$blx>s-N%j;D*aZDnsw:YKZ%Q.Kne9#hP?+b3(SOvL,^;&u5@?5C5Bhb=m-vEh_L15Jl]U)0RP6{q%L^_z5E'Dw6X
-811     \b
-812 812 Ket
-815     End
-------------------------------------------------------------------
+/^\X{2,3}X/8
+    A\x{300}\x{301}\x{302}A\x{300}\x{301}\x{302}X
+ 0: A\x{300}\x{301}\x{302}A\x{300}\x{301}\x{302}X
+    A\x{300}\x{301}\x{302}A\x{300}\x{301}\x{302}A\x{300}\x{301}\x{302}X 
+ 0: A\x{300}\x{301}\x{302}A\x{300}\x{301}\x{302}A\x{300}\x{301}\x{302}X
+    *** Failers
+No match
+    X
+No match
+    A\x{300}\x{301}\x{302}X
+No match
+    A\x{300}\x{301}\x{302}A\x{300}\x{301}\x{302}A\x{300}\x{301}\x{302}A\x{300}\x{301}\x{302}X
+No match

-/(a(?1)b)/BM
-Memory allocation (code space): 22
-------------------------------------------------------------------
-  0  18 Bra
-  3  12 CBra 1
-  8     a
- 10   3 Recurse
- 13     b
- 15  12 Ket
- 18  18 Ket
- 21     End
-------------------------------------------------------------------
+/^\pC\pL\pM\pN\pP\pS\pZ</8
+    \x7f\x{c0}\x{30f}\x{660}\x{66c}\x{f01}\x{1680}<
+ 0: \x{7f}\x{c0}\x{30f}\x{660}\x{66c}\x{f01}\x{1680}<
+    \np\x{300}9!\$ < 
+ 0: \x{0a}p\x{300}9!$ <
+    ** Failers 
+No match
+    ap\x{300}9!\$ < 
+No match
+  
+/^\PC/8
+    X
+ 0: X
+    ** Failers 
+ 0: *
+    \x7f
+No match
+  
+/^\PL/8
+    9
+ 0: 9
+    ** Failers 
+ 0: *
+    \x{c0}
+No match
+  
+/^\PM/8
+    X
+ 0: X
+    ** Failers 
+ 0: *
+    \x{30f}
+No match
+  
+/^\PN/8
+    X
+ 0: X
+    ** Failers 
+ 0: *
+    \x{660}
+No match
+  
+/^\PP/8
+    X
+ 0: X
+    ** Failers 
+No match
+    \x{66c}
+No match
+  
+/^\PS/8
+    X
+ 0: X
+    ** Failers 
+ 0: *
+    \x{f01}
+No match
+  
+/^\PZ/8
+    X
+ 0: X
+    ** Failers 
+ 0: *
+    \x{1680}
+No match
+    
+/^\p{Cc}/8
+    \x{017}
+ 0: \x{17}
+    \x{09f} 
+ 0: \x{9f}
+    ** Failers
+No match
+    \x{0600} 
+No match
+  
+/^\p{Cf}/8
+    \x{601}
+ 0: \x{601}
+    ** Failers
+No match
+    \x{09f} 
+No match
+  
+/^\p{Cn}/8
+    ** Failers
+No match
+    \x{09f} 
+No match
+  
+/^\p{Co}/8
+    \x{f8ff}
+ 0: \x{f8ff}
+    ** Failers
+No match
+    \x{09f} 
+No match
+  
+/^\p{Cs}/8
+    \?\x{dfff}
+ 0: \x{dfff}
+    ** Failers
+No match
+    \x{09f} 
+No match
+  
+/^\p{Ll}/8
+    a
+ 0: a
+    ** Failers 
+No match
+    Z
+No match
+    \x{e000}  
+No match
+  
+/^\p{Lm}/8
+    \x{2b0}
+ 0: \x{2b0}
+    ** Failers
+No match
+    a 
+No match
+  
+/^\p{Lo}/8
+    \x{1bb}
+ 0: \x{1bb}
+    ** Failers
+No match
+    a 
+No match
+    \x{2b0}
+No match
+  
+/^\p{Lt}/8
+    \x{1c5}
+ 0: \x{1c5}
+    ** Failers
+No match
+    a 
+No match
+    \x{2b0}
+No match
+  
+/^\p{Lu}/8
+    A
+ 0: A
+    ** Failers
+No match
+    \x{2b0}
+No match
+  
+/^\p{Mc}/8
+    \x{903}
+ 0: \x{903}
+    ** Failers
+No match
+    X
+No match
+    \x{300}
+No match
+       
+/^\p{Me}/8
+    \x{488}
+ 0: \x{488}
+    ** Failers
+No match
+    X
+No match
+    \x{903}
+No match
+    \x{300}
+No match
+  
+/^\p{Mn}/8
+    \x{300}
+ 0: \x{300}
+    ** Failers
+No match
+    X
+No match
+    \x{903}
+No match
+  
+/^\p{Nd}+/8
+    0123456789\x{660}\x{661}\x{662}\x{663}\x{664}\x{665}\x{666}\x{667}\x{668}\x{669}\x{66a}
+ 0: 0123456789\x{660}\x{661}\x{662}\x{663}\x{664}\x{665}\x{666}\x{667}\x{668}\x{669}
+ 1: 0123456789\x{660}\x{661}\x{662}\x{663}\x{664}\x{665}\x{666}\x{667}\x{668}
+ 2: 0123456789\x{660}\x{661}\x{662}\x{663}\x{664}\x{665}\x{666}\x{667}
+ 3: 0123456789\x{660}\x{661}\x{662}\x{663}\x{664}\x{665}\x{666}
+ 4: 0123456789\x{660}\x{661}\x{662}\x{663}\x{664}\x{665}
+ 5: 0123456789\x{660}\x{661}\x{662}\x{663}\x{664}
+ 6: 0123456789\x{660}\x{661}\x{662}\x{663}
+ 7: 0123456789\x{660}\x{661}\x{662}
+ 8: 0123456789\x{660}\x{661}
+ 9: 0123456789\x{660}
+10: 0123456789
+11: 012345678
+12: 01234567
+13: 0123456
+14: 012345
+15: 01234
+16: 0123
+17: 012
+18: 01
+19: 0
+    \x{6f0}\x{6f1}\x{6f2}\x{6f3}\x{6f4}\x{6f5}\x{6f6}\x{6f7}\x{6f8}\x{6f9}\x{6fa}
+ 0: \x{6f0}\x{6f1}\x{6f2}\x{6f3}\x{6f4}\x{6f5}\x{6f6}\x{6f7}\x{6f8}\x{6f9}
+ 1: \x{6f0}\x{6f1}\x{6f2}\x{6f3}\x{6f4}\x{6f5}\x{6f6}\x{6f7}\x{6f8}
+ 2: \x{6f0}\x{6f1}\x{6f2}\x{6f3}\x{6f4}\x{6f5}\x{6f6}\x{6f7}
+ 3: \x{6f0}\x{6f1}\x{6f2}\x{6f3}\x{6f4}\x{6f5}\x{6f6}
+ 4: \x{6f0}\x{6f1}\x{6f2}\x{6f3}\x{6f4}\x{6f5}
+ 5: \x{6f0}\x{6f1}\x{6f2}\x{6f3}\x{6f4}
+ 6: \x{6f0}\x{6f1}\x{6f2}\x{6f3}
+ 7: \x{6f0}\x{6f1}\x{6f2}
+ 8: \x{6f0}\x{6f1}
+ 9: \x{6f0}
+    \x{966}\x{967}\x{968}\x{969}\x{96a}\x{96b}\x{96c}\x{96d}\x{96e}\x{96f}\x{970}
+ 0: \x{966}\x{967}\x{968}\x{969}\x{96a}\x{96b}\x{96c}\x{96d}\x{96e}\x{96f}
+ 1: \x{966}\x{967}\x{968}\x{969}\x{96a}\x{96b}\x{96c}\x{96d}\x{96e}
+ 2: \x{966}\x{967}\x{968}\x{969}\x{96a}\x{96b}\x{96c}\x{96d}
+ 3: \x{966}\x{967}\x{968}\x{969}\x{96a}\x{96b}\x{96c}
+ 4: \x{966}\x{967}\x{968}\x{969}\x{96a}\x{96b}
+ 5: \x{966}\x{967}\x{968}\x{969}\x{96a}
+ 6: \x{966}\x{967}\x{968}\x{969}
+ 7: \x{966}\x{967}\x{968}
+ 8: \x{966}\x{967}
+ 9: \x{966}
+    ** Failers
+No match
+    X
+No match
+  
+/^\p{Nl}/8
+    \x{16ee}
+ 0: \x{16ee}
+    ** Failers
+No match
+    X
+No match
+    \x{966}
+No match
+  
+/^\p{No}/8
+    \x{b2}
+ 0: \x{b2}
+    \x{b3}
+ 0: \x{b3}
+    ** Failers
+No match
+    X
+No match
+    \x{16ee}
+No match
+  
+/^\p{Pc}/8
+    \x5f
+ 0: _
+    \x{203f}
+ 0: \x{203f}
+    ** Failers
+No match
+    X
+No match
+    -
+No match
+    \x{58a}
+No match
+  
+/^\p{Pd}/8
+    -
+ 0: -
+    \x{58a}
+ 0: \x{58a}
+    ** Failers
+No match
+    X
+No match
+    \x{203f}
+No match
+  
+/^\p{Pe}/8
+    )
+ 0: )
+    ]
+ 0: ]
+    }
+ 0: }
+    \x{f3b}
+ 0: \x{f3b}
+    ** Failers
+No match
+    X
+No match
+    \x{203f}
+No match
+    (
+No match
+    [
+No match
+    {
+No match
+    \x{f3c}
+No match
+  
+/^\p{Pf}/8
+    \x{bb}
+ 0: \x{bb}
+    \x{2019}
+ 0: \x{2019}
+    ** Failers
+No match
+    X
+No match
+    \x{203f}
+No match
+  
+/^\p{Pi}/8
+    \x{ab}
+ 0: \x{ab}
+    \x{2018}
+ 0: \x{2018}
+    ** Failers
+No match
+    X
+No match
+    \x{203f}
+No match
+  
+/^\p{Po}/8
+    !
+ 0: !
+    \x{37e}
+ 0: \x{37e}
+    ** Failers
+ 0: *
+    X
+No match
+    \x{203f}
+No match
+  
+/^\p{Ps}/8
+    (
+ 0: (
+    [
+ 0: [
+    {
+ 0: {
+    \x{f3c}
+ 0: \x{f3c}
+    ** Failers
+No match
+    X
+No match
+    )
+No match
+    ]
+No match
+    }
+No match
+    \x{f3b}
+No match
+  
+/^\p{Sc}+/8
+    $\x{a2}\x{a3}\x{a4}\x{a5}\x{a6}
+ 0: $\x{a2}\x{a3}\x{a4}\x{a5}
+ 1: $\x{a2}\x{a3}\x{a4}
+ 2: $\x{a2}\x{a3}
+ 3: $\x{a2}
+ 4: $
+    \x{9f2}
+ 0: \x{9f2}
+    ** Failers
+No match
+    X
+No match
+    \x{2c2}
+No match
+  
+/^\p{Sk}/8
+    \x{2c2}
+ 0: \x{2c2}
+    ** Failers
+No match
+    X
+No match
+    \x{9f2}
+No match
+  
+/^\p{Sm}+/8
+    +<|~\x{ac}\x{2044}
+ 0: +<|~\x{ac}\x{2044}
+ 1: +<|~\x{ac}
+ 2: +<|~
+ 3: +<|
+ 4: +<
+ 5: +
+    ** Failers
+No match
+    X
+No match
+    \x{9f2}
+No match
+  
+/^\p{So}/8
+    \x{a6}
+ 0: \x{a6}
+    \x{482} 
+ 0: \x{482}
+    ** Failers
+No match
+    X
+No match
+    \x{9f2}
+No match
+  
+/^\p{Zl}/8
+    \x{2028}
+ 0: \x{2028}
+    ** Failers
+No match
+    X
+No match
+    \x{2029}
+No match
+  
+/^\p{Zp}/8
+    \x{2029}
+ 0: \x{2029}
+    ** Failers
+No match
+    X
+No match
+    \x{2028}
+No match
+  
+/^\p{Zs}/8
+    \ \
+ 0:  
+    \x{a0}
+ 0: \x{a0}
+    \x{1680}
+ 0: \x{1680}
+    \x{180e}
+ 0: \x{180e}
+    \x{2000}
+ 0: \x{2000}
+    \x{2001}     
+ 0: \x{2001}
+    ** Failers
+No match
+    \x{2028}
+No match
+    \x{200d} 
+No match
+  
+/\p{Nd}+(..)/8
+      \x{660}\x{661}\x{662}ABC
+ 0: \x{660}\x{661}\x{662}AB
+ 1: \x{660}\x{661}\x{662}A
+ 2: \x{660}\x{661}\x{662}
+  
+/\p{Nd}+?(..)/8
+      \x{660}\x{661}\x{662}ABC
+ 0: \x{660}\x{661}\x{662}AB
+ 1: \x{660}\x{661}\x{662}A
+ 2: \x{660}\x{661}\x{662}
+  
+/\p{Nd}{2,}(..)/8
+      \x{660}\x{661}\x{662}ABC
+ 0: \x{660}\x{661}\x{662}AB
+ 1: \x{660}\x{661}\x{662}A
+  
+/\p{Nd}{2,}?(..)/8
+      \x{660}\x{661}\x{662}ABC
+ 0: \x{660}\x{661}\x{662}AB
+ 1: \x{660}\x{661}\x{662}A
+  
+/\p{Nd}*(..)/8
+      \x{660}\x{661}\x{662}ABC
+ 0: \x{660}\x{661}\x{662}AB
+ 1: \x{660}\x{661}\x{662}A
+ 2: \x{660}\x{661}\x{662}
+ 3: \x{660}\x{661}
+  
+/\p{Nd}*?(..)/8
+      \x{660}\x{661}\x{662}ABC
+ 0: \x{660}\x{661}\x{662}AB
+ 1: \x{660}\x{661}\x{662}A
+ 2: \x{660}\x{661}\x{662}
+ 3: \x{660}\x{661}
+  
+/\p{Nd}{2}(..)/8
+      \x{660}\x{661}\x{662}ABC
+ 0: \x{660}\x{661}\x{662}A
+  
+/\p{Nd}{2,3}(..)/8
+      \x{660}\x{661}\x{662}ABC
+ 0: \x{660}\x{661}\x{662}AB
+ 1: \x{660}\x{661}\x{662}A
+  
+/\p{Nd}{2,3}?(..)/8
+      \x{660}\x{661}\x{662}ABC
+ 0: \x{660}\x{661}\x{662}AB
+ 1: \x{660}\x{661}\x{662}A
+  
+/\p{Nd}?(..)/8
+      \x{660}\x{661}\x{662}ABC
+ 0: \x{660}\x{661}\x{662}
+ 1: \x{660}\x{661}
+  
+/\p{Nd}??(..)/8
+      \x{660}\x{661}\x{662}ABC
+ 0: \x{660}\x{661}\x{662}
+ 1: \x{660}\x{661}
+  
+/\p{Nd}*+(..)/8
+      \x{660}\x{661}\x{662}ABC
+ 0: \x{660}\x{661}\x{662}AB
+  
+/\p{Nd}*+(...)/8
+      \x{660}\x{661}\x{662}ABC
+ 0: \x{660}\x{661}\x{662}ABC
+  
+/\p{Nd}*+(....)/8
+      ** Failers
+ 0: ** F
+      \x{660}\x{661}\x{662}ABC
+No match
+  
+/\p{Lu}/8i
+    A
+ 0: A
+    a\x{10a0}B 
+ 0: \x{10a0}
+    ** Failers 
+ 0: F
+    a
+No match
+    \x{1d00}  
+No match

-/(a(?1)+b)/BM
-Memory allocation (code space): 28
-------------------------------------------------------------------
-  0  24 Bra
-  3  18 CBra 1
-  8     a
- 10   6 Once
- 13   3 Recurse
- 16   6 KetRmax
- 19     b
- 21  18 Ket
- 24  24 Ket
- 27     End
-------------------------------------------------------------------
+/\p{^Lu}/8i
+    1234
+ 0: 1
+    ** Failers
+ 0: *
+    ABC 
+No match

-/a(?P<name1>b|c)d(?P<longername2>e)/BM
-Memory allocation (code space): 42
-------------------------------------------------------------------
-  0  32 Bra
-  3     a
-  5   7 CBra 1
- 10     b
- 12   5 Alt
- 15     c
- 17  12 Ket
- 20     d
- 22   7 CBra 2
- 27     e
- 29   7 Ket
- 32  32 Ket
- 35     End
-------------------------------------------------------------------
+/\P{Lu}/8i
+    1234
+ 0: 1
+    ** Failers
+ 0: *
+    ABC 
+No match

-/(?:a(?P<c>c(?P<d>d)))(?P<a>a)/BM
-Memory allocation (code space): 54
-------------------------------------------------------------------
-  0  41 Bra
-  3  25 Bra
-  6     a
-  8  17 CBra 1
- 13     c
- 15   7 CBra 2
- 20     d
- 22   7 Ket
- 25  17 Ket
- 28  25 Ket
- 31   7 CBra 3
- 36     a
- 38   7 Ket
- 41  41 Ket
- 44     End
-------------------------------------------------------------------
+/(?<=A\p{Nd})XYZ/8
+    A2XYZ
+ 0: XYZ
+    123A5XYZPQR
+ 0: XYZ
+    ABA\x{660}XYZpqr
+ 0: XYZ
+    ** Failers
+No match
+    AXYZ
+No match
+    XYZ     
+No match
+    
+/(?<!\pL)XYZ/8
+    1XYZ
+ 0: XYZ
+    AB=XYZ.. 
+ 0: XYZ
+    XYZ 
+ 0: XYZ
+    ** Failers
+No match
+    WXYZ 
+No match

-/(?P<a>a)...(?P=a)bbb(?P>a)d/BM
-Memory allocation (code space): 37
-------------------------------------------------------------------
-  0  30 Bra
-  3   7 CBra 1
-  8     a
- 10   7 Ket
- 13     Any
- 14     Any
- 15     Any
- 16     \1
- 19     bbb
- 25   3 Recurse
- 28     d
- 30  30 Ket
- 33     End
-------------------------------------------------------------------
+/[\p{Nd}]/8
+    1234
+ 0: 1

-/abc(?C255)de(?C)f/BM
-Memory allocation (code space): 31
-------------------------------------------------------------------
-  0  27 Bra
-  3     abc
-  9     Callout 255 10 1
- 15     de
- 19     Callout 0 16 1
- 25     f
- 27  27 Ket
- 30     End
-------------------------------------------------------------------
+/[\p{Nd}+-]+/8
+    1234
+ 0: 1234
+ 1: 123
+ 2: 12
+ 3: 1
+    12-34
+ 0: 12-34
+ 1: 12-3
+ 2: 12-
+ 3: 12
+ 4: 1
+    12+\x{661}-34  
+ 0: 12+\x{661}-34
+ 1: 12+\x{661}-3
+ 2: 12+\x{661}-
+ 3: 12+\x{661}
+ 4: 12+
+ 5: 12
+ 6: 1
+    ** Failers
+No match
+    abcd  
+No match

-/abcde/CBM
-Memory allocation (code space): 53
-------------------------------------------------------------------
-  0  49 Bra
-  3     Callout 255 0 1
-  9     a
- 11     Callout 255 1 1
- 17     b
- 19     Callout 255 2 1
- 25     c
- 27     Callout 255 3 1
- 33     d
- 35     Callout 255 4 1
- 41     e
- 43     Callout 255 5 0
- 49  49 Ket
- 52     End
-------------------------------------------------------------------
+/[\P{Nd}]+/8
+    abcd
+ 0: abcd
+ 1: abc
+ 2: ab
+ 3: a
+    ** Failers
+ 0: ** Failers
+ 1: ** Failer
+ 2: ** Faile
+ 3: ** Fail
+ 4: ** Fai
+ 5: ** Fa
+ 6: ** F
+ 7: ** 
+ 8: **
+ 9: *
+    1234
+No match

-/\x{100}/8BM
-Memory allocation (code space): 10
-------------------------------------------------------------------
-  0   6 Bra
-  3     \x{100}
-  6   6 Ket
-  9     End
-------------------------------------------------------------------
+/\D+/8
+    11111111111111111111111111111111111111111111111111111111111111111111111
+No match
+    aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
+Matched, but too many subsidiary matches
+ 0: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
+ 1: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
+ 2: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
+ 3: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
+ 4: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
+ 5: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
+ 6: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
+ 7: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
+ 8: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
+ 9: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
+10: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
+11: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
+12: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
+13: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
+14: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
+15: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
+16: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
+17: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
+18: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
+19: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
+20: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
+21: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
+     
+/\P{Nd}+/8
+    11111111111111111111111111111111111111111111111111111111111111111111111
+No match
+    aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
+Matched, but too many subsidiary matches
+ 0: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
+ 1: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
+ 2: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
+ 3: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
+ 4: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
+ 5: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
+ 6: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
+ 7: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
+ 8: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
+ 9: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
+10: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
+11: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
+12: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
+13: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
+14: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
+15: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
+16: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
+17: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
+18: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
+19: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
+20: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
+21: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa

-/\x{1000}/8BM
-Memory allocation (code space): 11
-------------------------------------------------------------------
-  0   7 Bra
-  3     \x{1000}
-  7   7 Ket
- 10     End
-------------------------------------------------------------------
+/[\D]+/8
+    11111111111111111111111111111111111111111111111111111111111111111111111
+No match
+    aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
+Matched, but too many subsidiary matches
+ 0: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
+ 1: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
+ 2: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
+ 3: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
+ 4: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
+ 5: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
+ 6: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
+ 7: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
+ 8: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
+ 9: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
+10: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
+11: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
+12: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
+13: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
+14: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
+15: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
+16: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
+17: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
+18: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
+19: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
+20: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
+21: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa

-/\x{10000}/8BM
-Memory allocation (code space): 12
-------------------------------------------------------------------
-  0   8 Bra
-  3     \x{10000}
-  8   8 Ket
- 11     End
-------------------------------------------------------------------
+/[\P{Nd}]+/8
+    11111111111111111111111111111111111111111111111111111111111111111111111
+No match
+    aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
+Matched, but too many subsidiary matches
+ 0: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
+ 1: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
+ 2: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
+ 3: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
+ 4: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
+ 5: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
+ 6: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
+ 7: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
+ 8: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
+ 9: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
+10: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
+11: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
+12: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
+13: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
+14: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
+15: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
+16: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
+17: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
+18: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
+19: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
+20: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
+21: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa

-/\x{100000}/8BM
-Memory allocation (code space): 12
-------------------------------------------------------------------
-  0   8 Bra
-  3     \x{100000}
-  8   8 Ket
- 11     End
-------------------------------------------------------------------
+/[\D\P{Nd}]+/8
+    11111111111111111111111111111111111111111111111111111111111111111111111
+No match
+    aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
+Matched, but too many subsidiary matches
+ 0: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
+ 1: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
+ 2: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
+ 3: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
+ 4: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
+ 5: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
+ 6: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
+ 7: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
+ 8: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
+ 9: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
+10: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
+11: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
+12: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
+13: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
+14: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
+15: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
+16: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
+17: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
+18: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
+19: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
+20: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
+21: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa

-/\x{1000000}/8BM
-Memory allocation (code space): 13
-------------------------------------------------------------------
-  0   9 Bra
-  3     \x{1000000}
-  9   9 Ket
- 12     End
-------------------------------------------------------------------
+/\pL/8
+    a
+ 0: a
+    A 
+ 0: A

-/\x{4000000}/8BM
-Memory allocation (code space): 14
-------------------------------------------------------------------
-  0  10 Bra
-  3     \x{4000000}
- 10  10 Ket
- 13     End
-------------------------------------------------------------------
+/\pL/8i
+    a
+ 0: a
+    A 
+ 0: A
+    
+/\p{Lu}/8 
+    A
+ 0: A
+    aZ
+ 0: Z
+    ** Failers
+ 0: F
+    abc   
+No match

-/\x{7fffFFFF}/8BM
-Memory allocation (code space): 14
-------------------------------------------------------------------
-  0  10 Bra
-  3     \x{7fffffff}
- 10  10 Ket
- 13     End
-------------------------------------------------------------------
+/\p{Lu}/8i
+    A
+ 0: A
+    aZ
+ 0: Z
+    ** Failers
+ 0: F
+    abc   
+No match

-/[\x{ff}]/8BM
-Memory allocation (code space): 10
-------------------------------------------------------------------
-  0   6 Bra
-  3     \x{ff}
-  6   6 Ket
-  9     End
-------------------------------------------------------------------
+/\p{Ll}/8 
+    a
+ 0: a
+    Az
+ 0: z
+    ** Failers
+ 0: a
+    ABC   
+No match

-/[\x{100}]/8BM
-Memory allocation (code space): 15
-------------------------------------------------------------------
-  0  11 Bra
-  3     [\x{100}]
- 11  11 Ket
- 14     End
-------------------------------------------------------------------
+/\p{Ll}/8i 
+    a
+ 0: a
+    Az
+ 0: z
+    ** Failers
+ 0: a
+    ABC   
+No match

-/\x80/8BM
-Memory allocation (code space): 10
-------------------------------------------------------------------
-  0   6 Bra
-  3     \x{80}
-  6   6 Ket
-  9     End
-------------------------------------------------------------------
+/^\x{c0}$/8i
+    \x{c0}
+ 0: \x{c0}
+    \x{e0} 
+ 0: \x{e0}

-/\xff/8BM
-Memory allocation (code space): 10
-------------------------------------------------------------------
-  0   6 Bra
-  3     \x{ff}
-  6   6 Ket
-  9     End
-------------------------------------------------------------------
+/^\x{e0}$/8i
+    \x{c0}
+ 0: \x{c0}
+    \x{e0} 
+ 0: \x{e0}

-/\x{0041}\x{2262}\x{0391}\x{002e}/D8M
-Memory allocation (code space): 18
-------------------------------------------------------------------
-  0  14 Bra
-  3     A\x{2262}\x{391}.
- 14  14 Ket
- 17     End
-------------------------------------------------------------------
-Capturing subpattern count = 0
-Options: utf8
-First char = 'A'
-Need char = '.'
+/A\x{391}\x{10427}\x{ff3a}\x{1fb0}/8
+    A\x{391}\x{10427}\x{ff3a}\x{1fb0}
+ 0: A\x{391}\x{10427}\x{ff3a}\x{1fb0}
+    ** Failers
+No match
+    a\x{391}\x{10427}\x{ff3a}\x{1fb0}   
+No match
+    A\x{3b1}\x{10427}\x{ff3a}\x{1fb0}
+No match
+    A\x{391}\x{1044F}\x{ff3a}\x{1fb0}
+No match
+    A\x{391}\x{10427}\x{ff5a}\x{1fb0}
+No match
+    A\x{391}\x{10427}\x{ff3a}\x{1fb8}
+No match
+
+/A\x{391}\x{10427}\x{ff3a}\x{1fb0}/8i
+    A\x{391}\x{10427}\x{ff3a}\x{1fb0}
+ 0: A\x{391}\x{10427}\x{ff3a}\x{1fb0}
+    a\x{391}\x{10427}\x{ff3a}\x{1fb0}   
+ 0: a\x{391}\x{10427}\x{ff3a}\x{1fb0}
+    A\x{3b1}\x{10427}\x{ff3a}\x{1fb0}
+ 0: A\x{3b1}\x{10427}\x{ff3a}\x{1fb0}
+    A\x{391}\x{1044F}\x{ff3a}\x{1fb0}
+ 0: A\x{391}\x{1044f}\x{ff3a}\x{1fb0}
+    A\x{391}\x{10427}\x{ff5a}\x{1fb0}
+ 0: A\x{391}\x{10427}\x{ff5a}\x{1fb0}
+    A\x{391}\x{10427}\x{ff3a}\x{1fb8}
+ 0: A\x{391}\x{10427}\x{ff3a}\x{1fb8}
+
+/\x{391}+/8i
+    \x{391}\x{3b1}\x{3b1}\x{3b1}\x{391}
+ 0: \x{391}\x{3b1}\x{3b1}\x{3b1}\x{391}
+ 1: \x{391}\x{3b1}\x{3b1}\x{3b1}
+ 2: \x{391}\x{3b1}\x{3b1}
+ 3: \x{391}\x{3b1}
+ 4: \x{391}
+
+/\x{391}{3,5}(.)/8i
+    \x{391}\x{3b1}\x{3b1}\x{3b1}\x{391}X
+ 0: \x{391}\x{3b1}\x{3b1}\x{3b1}\x{391}X
+ 1: \x{391}\x{3b1}\x{3b1}\x{3b1}\x{391}
+ 2: \x{391}\x{3b1}\x{3b1}\x{3b1}
+
+/\x{391}{3,5}?(.)/8i
+    \x{391}\x{3b1}\x{3b1}\x{3b1}\x{391}X
+ 0: \x{391}\x{3b1}\x{3b1}\x{3b1}\x{391}X
+ 1: \x{391}\x{3b1}\x{3b1}\x{3b1}\x{391}
+ 2: \x{391}\x{3b1}\x{3b1}\x{3b1}
+
+/[\x{391}\x{ff3a}]/8i
+    \x{391}
+ 0: \x{391}
+    \x{ff3a}
+ 0: \x{ff3a}
+    \x{3b1}
+ 0: \x{3b1}
+    \x{ff5a}   
+ 0: \x{ff5a}

-/\x{D55c}\x{ad6d}\x{C5B4}/D8M 
-Memory allocation (code space): 19
-------------------------------------------------------------------
-  0  15 Bra
-  3     \x{d55c}\x{ad6d}\x{c5b4}
- 15  15 Ket
- 18     End
-------------------------------------------------------------------
-Capturing subpattern count = 0
-Options: utf8
-First char = 237
-Need char = 180
+/[\x{c0}\x{391}]/8i
+    \x{c0}
+ 0: \x{c0}
+    \x{e0} 
+ 0: \x{e0}

-/\x{65e5}\x{672c}\x{8a9e}/D8M
-Memory allocation (code space): 19
-------------------------------------------------------------------
-  0  15 Bra
-  3     \x{65e5}\x{672c}\x{8a9e}
- 15  15 Ket
- 18     End
-------------------------------------------------------------------
-Capturing subpattern count = 0
-Options: utf8
-First char = 230
-Need char = 158
+/[\x{105}-\x{109}]/8i
+    \x{104}
+ 0: \x{104}
+    \x{105}
+ 0: \x{105}
+    \x{109}  
+ 0: \x{109}
+    ** Failers
+No match
+    \x{100}
+No match
+    \x{10a} 
+No match
+    
+/[z-\x{100}]/8i 
+    Z
+ 0: Z
+    z
+ 0: z
+    \x{39c}
+ 0: \x{39c}
+    \x{178}
+ 0: \x{178}
+    |
+ 0: |
+    \x{80}
+ 0: \x{80}
+    \x{ff}
+ 0: \x{ff}
+    \x{100}
+ 0: \x{100}
+    \x{101} 
+ 0: \x{101}
+    ** Failers
+No match
+    \x{102}
+No match
+    Y
+No match
+    y           
+No match

-/[\x{100}]/8BM
-Memory allocation (code space): 15
-------------------------------------------------------------------
-  0  11 Bra
-  3     [\x{100}]
- 11  11 Ket
- 14     End
-------------------------------------------------------------------
+/[z-\x{100}]/8i

-/[Z\x{100}]/8BM
-Memory allocation (code space): 47
-------------------------------------------------------------------
-  0  43 Bra
-  3     [Z\x{100}]
- 43  43 Ket
- 46     End
-------------------------------------------------------------------
+/^\X/8
+    A
+ 0: A
+    A\x{300}BC 
+ 0: A\x{300}
+    A\x{300}\x{301}\x{302}BC 
+ 0: A\x{300}\x{301}\x{302}
+    *** Failers
+ 0: *
+    \x{300}  
+No match

-/^[\x{100}\E-\Q\E\x{150}]/B8M
-Memory allocation (code space): 18
-------------------------------------------------------------------
-  0  14 Bra
-  3     ^
-  4     [\x{100}-\x{150}]
- 14  14 Ket
- 17     End
-------------------------------------------------------------------
+/^[\X]/8
+    X123
+ 0: X
+    *** Failers
+No match
+    AXYZ
+No match

-/^[\QĀ\E-\QŐ\E]/B8M
-Memory allocation (code space): 18
-------------------------------------------------------------------
-  0  14 Bra
-  3     ^
-  4     [\x{100}-\x{150}]
- 14  14 Ket
- 17     End
-------------------------------------------------------------------
+/^(\X*)C/8
+    A\x{300}\x{301}\x{302}BCA\x{300}\x{301} 
+ 0: A\x{300}\x{301}\x{302}BC
+    A\x{300}\x{301}\x{302}BCA\x{300}\x{301}C 
+ 0: A\x{300}\x{301}\x{302}BCA\x{300}\x{301}C
+ 1: A\x{300}\x{301}\x{302}BC

-/^[\QĀ\E-\QŐ\E/B8M
-Failed: missing terminating ] for character class at offset 15
+/^(\X*?)C/8
+    A\x{300}\x{301}\x{302}BCA\x{300}\x{301} 
+ 0: A\x{300}\x{301}\x{302}BC
+    A\x{300}\x{301}\x{302}BCA\x{300}\x{301}C 
+ 0: A\x{300}\x{301}\x{302}BCA\x{300}\x{301}C
+ 1: A\x{300}\x{301}\x{302}BC

-/[\p{L}]/BM
-Memory allocation (code space): 15
-------------------------------------------------------------------
-  0  11 Bra
-  3     [\p{L}]
- 11  11 Ket
- 14     End
-------------------------------------------------------------------
+/^(\X*)(.)/8
+    A\x{300}\x{301}\x{302}BCA\x{300}\x{301} 
+ 0: A\x{300}\x{301}\x{302}BCA
+ 1: A\x{300}\x{301}\x{302}BC
+ 2: A\x{300}\x{301}\x{302}B
+ 3: A
+    A\x{300}\x{301}\x{302}BCA\x{300}\x{301}C 
+ 0: A\x{300}\x{301}\x{302}BCA\x{300}\x{301}C
+ 1: A\x{300}\x{301}\x{302}BCA
+ 2: A\x{300}\x{301}\x{302}BC
+ 3: A\x{300}\x{301}\x{302}B
+ 4: A

-/[\p{^L}]/BM
-Memory allocation (code space): 15
-------------------------------------------------------------------
-  0  11 Bra
-  3     [\P{L}]
- 11  11 Ket
- 14     End
-------------------------------------------------------------------
+/^(\X*?)(.)/8
+    A\x{300}\x{301}\x{302}BCA\x{300}\x{301} 
+ 0: A\x{300}\x{301}\x{302}BCA
+ 1: A\x{300}\x{301}\x{302}BC
+ 2: A\x{300}\x{301}\x{302}B
+ 3: A
+    A\x{300}\x{301}\x{302}BCA\x{300}\x{301}C 
+ 0: A\x{300}\x{301}\x{302}BCA\x{300}\x{301}C
+ 1: A\x{300}\x{301}\x{302}BCA
+ 2: A\x{300}\x{301}\x{302}BC
+ 3: A\x{300}\x{301}\x{302}B
+ 4: A

-/[\P{L}]/BM
-Memory allocation (code space): 15
-------------------------------------------------------------------
-  0  11 Bra
-  3     [\P{L}]
- 11  11 Ket
- 14     End
-------------------------------------------------------------------
+/^\X(.)/8
+    *** Failers
+ 0: **
+    A\x{300}\x{301}\x{302}
+No match

-/[\P{^L}]/BM
-Memory allocation (code space): 15
-------------------------------------------------------------------
-  0  11 Bra
-  3     [\p{L}]
- 11  11 Ket
- 14     End
-------------------------------------------------------------------
+/^\X{2,3}(.)/8
+    A\x{300}\x{301}B\x{300}X
+ 0: A\x{300}\x{301}B\x{300}X
+    A\x{300}\x{301}B\x{300}C\x{300}\x{301}
+ 0: A\x{300}\x{301}B\x{300}C
+    A\x{300}\x{301}B\x{300}C\x{300}\x{301}X
+ 0: A\x{300}\x{301}B\x{300}C\x{300}\x{301}X
+ 1: A\x{300}\x{301}B\x{300}C
+    A\x{300}\x{301}B\x{300}C\x{300}\x{301}DA\x{300}X
+ 0: A\x{300}\x{301}B\x{300}C\x{300}\x{301}D
+ 1: A\x{300}\x{301}B\x{300}C
+    
+/^\X{2,3}?(.)/8
+    A\x{300}\x{301}B\x{300}X
+ 0: A\x{300}\x{301}B\x{300}X
+    A\x{300}\x{301}B\x{300}C\x{300}\x{301}
+ 0: A\x{300}\x{301}B\x{300}C
+    A\x{300}\x{301}B\x{300}C\x{300}\x{301}X
+ 0: A\x{300}\x{301}B\x{300}C\x{300}\x{301}X
+ 1: A\x{300}\x{301}B\x{300}C
+    A\x{300}\x{301}B\x{300}C\x{300}\x{301}DA\x{300}X
+ 0: A\x{300}\x{301}B\x{300}C\x{300}\x{301}D
+ 1: A\x{300}\x{301}B\x{300}C

-/[abc\p{L}\x{0660}]/8BM
-Memory allocation (code space): 50
-------------------------------------------------------------------
-  0  46 Bra
-  3     [a-c\p{L}\x{660}]
- 46  46 Ket
- 49     End
-------------------------------------------------------------------
+/^\pN{2,3}X/
+    12X
+ 0: 12X
+    123X
+ 0: 123X
+    *** Failers
+No match
+    X
+No match
+    1X
+No match
+    1234X     
+No match

-/[\p{Nd}]/8BM
-Memory allocation (code space): 15
-------------------------------------------------------------------
-  0  11 Bra
-  3     [\p{Nd}]
- 11  11 Ket
- 14     End
-------------------------------------------------------------------
+/\x{100}/i8
+    \x{100}   
+ 0: \x{100}
+    \x{101} 
+ 0: \x{101}
+    
+/^\p{Han}+/8
+    \x{2e81}\x{3007}\x{2f804}\x{31a0}
+ 0: \x{2e81}\x{3007}\x{2f804}
+ 1: \x{2e81}\x{3007}
+ 2: \x{2e81}
+    ** Failers
+No match
+    \x{2e7f}  
+No match

-/[\p{Nd}+-]+/8BM
-Memory allocation (code space): 48
-------------------------------------------------------------------
-  0  44 Bra
-  3     [+\-\p{Nd}]+
- 44  44 Ket
- 47     End
-------------------------------------------------------------------
+/^\P{Katakana}+/8
+    \x{3105}
+ 0: \x{3105}
+    ** Failers
+ 0: ** Failers
+ 1: ** Failer
+ 2: ** Faile
+ 3: ** Fail
+ 4: ** Fai
+ 5: ** Fa
+ 6: ** F
+ 7: ** 
+ 8: **
+ 9: *
+    \x{30ff}  
+No match

-/A\x{391}\x{10427}\x{ff3a}\x{1fb0}/8iBM
-Memory allocation (code space): 25
-------------------------------------------------------------------
-  0  21 Bra
-  3  /i A\x{391}\x{10427}\x{ff3a}\x{1fb0}
- 21  21 Ket
- 24     End
-------------------------------------------------------------------
+/^[\p{Arabic}]/8
+    \x{06e9}
+ 0: \x{6e9}
+    \x{060b}
+ 0: \x{60b}
+    ** Failers
+No match
+    X\x{06e9}   
+No match

-/A\x{391}\x{10427}\x{ff3a}\x{1fb0}/8BM
-Memory allocation (code space): 25
-------------------------------------------------------------------
-  0  21 Bra
-  3     A\x{391}\x{10427}\x{ff3a}\x{1fb0}
- 21  21 Ket
- 24     End
-------------------------------------------------------------------
+/^[\P{Yi}]/8
+    \x{2f800}
+ 0: \x{2f800}
+    ** Failers
+ 0: *
+    \x{a014}
+No match
+    \x{a4c6}   
+No match

-/[\x{105}-\x{109}]/8iBM
-Memory allocation (code space): 17
-------------------------------------------------------------------
-  0  13 Bra
-  3     [\x{104}-\x{109}]
- 13  13 Ket
- 16     End
-------------------------------------------------------------------
+/^\p{Any}X/8
+    AXYZ
+ 0: AX
+    \x{1234}XYZ 
+ 0: \x{1234}X
+    ** Failers
+No match
+    X  
+No match
+    
+/^\P{Any}X/8
+    ** Failers
+No match
+    AX
+No match
+    
+/^\p{Any}?X/8
+    XYZ
+ 0: X
+    AXYZ
+ 0: AX
+    \x{1234}XYZ 
+ 0: \x{1234}X
+    ** Failers
+No match
+    ABXYZ   
+No match

-/( ( (?(1)0|) )*   )/xBM
-Memory allocation (code space): 38
-------------------------------------------------------------------
-  0  34 Bra
-  3  28 CBra 1
-  8     Brazero
-  9  19 SCBra 2
- 14   8 Cond
- 17   1 Cond ref
- 20     0
- 22   3 Alt
- 25  11 Ket
- 28  19 KetRmax
- 31  28 Ket
- 34  34 Ket
- 37     End
-------------------------------------------------------------------
+/^\P{Any}?X/8
+    XYZ
+ 0: X
+    ** Failers
+No match
+    AXYZ
+No match
+    \x{1234}XYZ 
+No match
+    ABXYZ   
+No match

-/(  (?(1)0|)*   )/xBM
-Memory allocation (code space): 30
-------------------------------------------------------------------
-  0  26 Bra
-  3  20 CBra 1
-  8     Brazero
-  9   8 SCond
- 12   1 Cond ref
- 15     0
- 17   3 Alt
- 20  11 KetRmax
- 23  20 Ket
- 26  26 Ket
- 29     End
-------------------------------------------------------------------
+/^\p{Any}+X/8
+    AXYZ
+ 0: AX
+    \x{1234}XYZ
+ 0: \x{1234}X
+    A\x{1234}XYZ
+ 0: A\x{1234}X
+    ** Failers
+No match
+    XYZ
+No match

-/[a]/BM
-Memory allocation (code space): 9
-------------------------------------------------------------------
-  0   5 Bra
-  3     a
-  5   5 Ket
-  8     End
-------------------------------------------------------------------
+/^\P{Any}+X/8
+    ** Failers
+No match
+    AXYZ
+No match
+    \x{1234}XYZ
+No match
+    A\x{1234}XYZ
+No match
+    XYZ
+No match

-/[a]/8BM
-Memory allocation (code space): 9
-------------------------------------------------------------------
-  0   5 Bra
-  3     a
-  5   5 Ket
-  8     End
-------------------------------------------------------------------
+/^\p{Any}*X/8
+    XYZ
+ 0: X
+    AXYZ
+ 0: AX
+    \x{1234}XYZ
+ 0: \x{1234}X
+    A\x{1234}XYZ
+ 0: A\x{1234}X
+    ** Failers
+No match

-/[\xaa]/BM
-Memory allocation (code space): 9
-------------------------------------------------------------------
-  0   5 Bra
-  3     \xaa
-  5   5 Ket
-  8     End
-------------------------------------------------------------------
+/^\P{Any}*X/8
+    XYZ
+ 0: X
+    ** Failers
+No match
+    AXYZ
+No match
+    \x{1234}XYZ
+No match
+    A\x{1234}XYZ
+No match

-/[\xaa]/8BM
-Memory allocation (code space): 10
-------------------------------------------------------------------
-  0   6 Bra
-  3     \x{aa}
-  6   6 Ket
-  9     End
-------------------------------------------------------------------
+/^[\p{Any}]X/8
+    AXYZ
+ 0: AX
+    \x{1234}XYZ 
+ 0: \x{1234}X
+    ** Failers
+No match
+    X  
+No match
+    
+/^[\P{Any}]X/8
+    ** Failers
+No match
+    AX
+No match
+    
+/^[\p{Any}]?X/8
+    XYZ
+ 0: X
+    AXYZ
+ 0: AX
+    \x{1234}XYZ 
+ 0: \x{1234}X
+    ** Failers
+No match
+    ABXYZ   
+No match

-/[^a]/BM
-Memory allocation (code space): 9
-------------------------------------------------------------------
-  0   5 Bra
-  3     [^a]
-  5   5 Ket
-  8     End
-------------------------------------------------------------------
+/^[\P{Any}]?X/8
+    XYZ
+ 0: X
+    ** Failers
+No match
+    AXYZ
+No match
+    \x{1234}XYZ 
+No match
+    ABXYZ   
+No match

-/[^a]/8BM
-Memory allocation (code space): 9
-------------------------------------------------------------------
-  0   5 Bra
-  3     [^a]
-  5   5 Ket
-  8     End
-------------------------------------------------------------------
+/^[\p{Any}]+X/8
+    AXYZ
+ 0: AX
+    \x{1234}XYZ
+ 0: \x{1234}X
+    A\x{1234}XYZ
+ 0: A\x{1234}X
+    ** Failers
+No match
+    XYZ
+No match

-/[^\xaa]/BM
-Memory allocation (code space): 9
-------------------------------------------------------------------
-  0   5 Bra
-  3     [^\xaa]
-  5   5 Ket
-  8     End
-------------------------------------------------------------------
+/^[\P{Any}]+X/8
+    ** Failers
+No match
+    AXYZ
+No match
+    \x{1234}XYZ
+No match
+    A\x{1234}XYZ
+No match
+    XYZ
+No match

-/[^\xaa]/8BM
-Memory allocation (code space): 40
-------------------------------------------------------------------
-  0  36 Bra
-  3     [\x00-\xa9\xab-\xff] (neg)
- 36  36 Ket
- 39     End
-------------------------------------------------------------------
+/^[\p{Any}]*X/8
+    XYZ
+ 0: X
+    AXYZ
+ 0: AX
+    \x{1234}XYZ
+ 0: \x{1234}X
+    A\x{1234}XYZ
+ 0: A\x{1234}X
+    ** Failers
+No match

-/[^\d]/8WB
-------------------------------------------------------------------
-  0  11 Bra
-  3     [^\p{Nd}]
- 11  11 Ket
- 14     End
-------------------------------------------------------------------
+/^[\P{Any}]*X/8
+    XYZ
+ 0: X
+    ** Failers
+No match
+    AXYZ
+No match
+    \x{1234}XYZ
+No match
+    A\x{1234}XYZ
+No match

-/[[:^alpha:][:^cntrl:]]+/8WB
-------------------------------------------------------------------
-  0  44 Bra
-  3     [ -~\x80-\xff\P{L}]+
- 44  44 Ket
- 47     End
-------------------------------------------------------------------
+/^\p{Any}{3,5}?/8
+    abcdefgh
+ 0: abcde
+ 1: abcd
+ 2: abc
+    \x{1234}\n\r\x{3456}xyz 
+ 0: \x{1234}\x{0a}\x{0d}\x{3456}x
+ 1: \x{1234}\x{0a}\x{0d}\x{3456}
+ 2: \x{1234}\x{0a}\x{0d}

-/[[:^cntrl:][:^alpha:]]+/8WB
-------------------------------------------------------------------
-  0  44 Bra
-  3     [ -~\x80-\xff\P{L}]+
- 44  44 Ket
- 47     End
-------------------------------------------------------------------
+/^\p{Any}{3,5}/8
+    abcdefgh
+ 0: abcde
+ 1: abcd
+ 2: abc
+    \x{1234}\n\r\x{3456}xyz 
+ 0: \x{1234}\x{0a}\x{0d}\x{3456}x
+ 1: \x{1234}\x{0a}\x{0d}\x{3456}
+ 2: \x{1234}\x{0a}\x{0d}

-/[[:alpha:]]+/8WB
-------------------------------------------------------------------
-  0  12 Bra
-  3     [\p{L}]+
- 12  12 Ket
- 15     End
-------------------------------------------------------------------
+/^\P{Any}{3,5}?/8
+    ** Failers
+No match
+    abcdefgh
+No match
+    \x{1234}\n\r\x{3456}xyz 
+No match

-/[[:^alpha:]\S]+/8WB
-------------------------------------------------------------------
-  0  15 Bra
-  3     [\P{L}\P{Xsp}]+
- 15  15 Ket
- 18     End
-------------------------------------------------------------------
+/^\p{L&}X/8
+     AXY
+ 0: AX
+     aXY
+ 0: aX
+     \x{1c5}XY
+ 0: \x{1c5}X
+     ** Failers
+No match
+     \x{1bb}XY
+No match
+     \x{2b0}XY
+No match
+     !XY      
+No match

-/abc(d|e)(*THEN)x(123(*THEN)4|567(b|q)(*THEN)xx)/B
-------------------------------------------------------------------
-  0  73 Bra
-  3     abc
-  9   7 CBra 1
- 14     d
- 16   5 Alt
- 19     e
- 21  12 Ket
- 24     *THEN
- 25     x
- 27  14 CBra 2
- 32     123
- 38     *THEN
- 39     4
- 41  29 Alt
- 44     567
- 50   7 CBra 3
- 55     b
- 57   5 Alt
- 60     q
- 62  12 Ket
- 65     *THEN
- 66     xx
- 70  43 Ket
- 73  73 Ket
- 76     End
-------------------------------------------------------------------
+/^[\p{L&}]X/8
+     AXY
+ 0: AX
+     aXY
+ 0: aX
+     \x{1c5}XY
+ 0: \x{1c5}X
+     ** Failers
+No match
+     \x{1bb}XY
+No match
+     \x{2b0}XY
+No match
+     !XY      
+No match

-/-- End of testinput10 --/
+/^\p{L&}+X/8
+     AXY
+ 0: AX
+     aXY
+ 0: aX
+     AbcdeXyz 
+ 0: AbcdeX
+     \x{1c5}AbXY
+ 0: \x{1c5}AbX
+     abcDEXypqreXlmn 
+ 0: abcDEXypqreX
+ 1: abcDEX
+     ** Failers
+No match
+     \x{1bb}XY
+No match
+     \x{2b0}XY
+No match
+     !XY      
+No match
+
+/^[\p{L&}]+X/8
+     AXY
+ 0: AX
+     aXY
+ 0: aX
+     AbcdeXyz 
+ 0: AbcdeX
+     \x{1c5}AbXY
+ 0: \x{1c5}AbX
+     abcDEXypqreXlmn 
+ 0: abcDEXypqreX
+ 1: abcDEX
+     ** Failers
+No match
+     \x{1bb}XY
+No match
+     \x{2b0}XY
+No match
+     !XY      
+No match
+
+/^\p{L&}+?X/8
+     AXY
+ 0: AX
+     aXY
+ 0: aX
+     AbcdeXyz 
+ 0: AbcdeX
+     \x{1c5}AbXY
+ 0: \x{1c5}AbX
+     abcDEXypqreXlmn 
+ 0: abcDEXypqreX
+ 1: abcDEX
+     ** Failers
+No match
+     \x{1bb}XY
+No match
+     \x{2b0}XY
+No match
+     !XY      
+No match
+
+/^[\p{L&}]+?X/8
+     AXY
+ 0: AX
+     aXY
+ 0: aX
+     AbcdeXyz 
+ 0: AbcdeX
+     \x{1c5}AbXY
+ 0: \x{1c5}AbX
+     abcDEXypqreXlmn 
+ 0: abcDEXypqreX
+ 1: abcDEX
+     ** Failers
+No match
+     \x{1bb}XY
+No match
+     \x{2b0}XY
+No match
+     !XY      
+No match
+
+/^\P{L&}X/8
+     !XY
+ 0: !X
+     \x{1bb}XY
+ 0: \x{1bb}X
+     \x{2b0}XY
+ 0: \x{2b0}X
+     ** Failers
+No match
+     \x{1c5}XY
+No match
+     AXY      
+No match
+
+/^[\P{L&}]X/8
+     !XY
+ 0: !X
+     \x{1bb}XY
+ 0: \x{1bb}X
+     \x{2b0}XY
+ 0: \x{2b0}X
+     ** Failers
+No match
+     \x{1c5}XY
+No match
+     AXY      
+No match
+
+/^\x{023a}+?(\x{0130}+)/8i
+  \x{023a}\x{2c65}\x{0130}
+ 0: \x{23a}\x{2c65}\x{130}
+  
+/^\x{023a}+([^X])/8i
+  \x{023a}\x{2c65}X
+ 0: \x{23a}\x{2c65}
+ 
+/\x{c0}+\x{116}+/8i
+    \x{c0}\x{e0}\x{116}\x{117}
+ 0: \x{c0}\x{e0}\x{116}\x{117}
+ 1: \x{c0}\x{e0}\x{116}
+
+/[\x{c0}\x{116}]+/8i
+    \x{c0}\x{e0}\x{116}\x{117}
+ 0: \x{c0}\x{e0}\x{116}\x{117}
+ 1: \x{c0}\x{e0}\x{116}
+ 2: \x{c0}\x{e0}
+ 3: \x{c0}
+
+/Check property support in non-UTF-8 mode/
+ 
+/\p{L}{4}/
+    123abcdefg
+ 0: abcd
+    123abc\xc4\xc5zz
+ 0: abc\xc4
+
+/\p{Carian}\p{Cham}\p{Kayah_Li}\p{Lepcha}\p{Lycian}\p{Lydian}\p{Ol_Chiki}\p{Rejang}\p{Saurashtra}\p{Sundanese}\p{Vai}/8
+    \x{102A4}\x{AA52}\x{A91D}\x{1C46}\x{10283}\x{1092E}\x{1C6B}\x{A93B}\x{A8BF}\x{1BA0}\x{A50A}====
+ 0: \x{102a4}\x{aa52}\x{a91d}\x{1c46}\x{10283}\x{1092e}\x{1c6b}\x{a93b}\x{a8bf}\x{1ba0}\x{a50a}
+
+/\x{a77d}\x{1d79}/8i
+    \x{a77d}\x{1d79}
+ 0: \x{a77d}\x{1d79}
+    \x{1d79}\x{a77d} 
+ 0: \x{1d79}\x{a77d}
+
+/\x{a77d}\x{1d79}/8
+    \x{a77d}\x{1d79}
+ 0: \x{a77d}\x{1d79}
+    ** Failers 
+No match
+    \x{1d79}\x{a77d} 
+No match
+
+/^\p{Xan}/8
+    ABCD
+ 0: A
+    1234
+ 0: 1
+    \x{6ca}
+ 0: \x{6ca}
+    \x{a6c}
+ 0: \x{a6c}
+    \x{10a7}   
+ 0: \x{10a7}
+    ** Failers
+No match
+    _ABC   
+No match
+
+/^\p{Xan}+/8
+    ABCD1234\x{6ca}\x{a6c}\x{10a7}_
+ 0: ABCD1234\x{6ca}\x{a6c}\x{10a7}
+ 1: ABCD1234\x{6ca}\x{a6c}
+ 2: ABCD1234\x{6ca}
+ 3: ABCD1234
+ 4: ABCD123
+ 5: ABCD12
+ 6: ABCD1
+ 7: ABCD
+ 8: ABC
+ 9: AB
+10: A
+    ** Failers
+No match
+    _ABC   
+No match
+
+/^\p{Xan}*/8
+    ABCD1234\x{6ca}\x{a6c}\x{10a7}_
+ 0: ABCD1234\x{6ca}\x{a6c}\x{10a7}
+ 1: ABCD1234\x{6ca}\x{a6c}
+ 2: ABCD1234\x{6ca}
+ 3: ABCD1234
+ 4: ABCD123
+ 5: ABCD12
+ 6: ABCD1
+ 7: ABCD
+ 8: ABC
+ 9: AB
+10: A
+11: 
+    
+/^\p{Xan}{2,9}/8
+    ABCD1234\x{6ca}\x{a6c}\x{10a7}_
+ 0: ABCD1234\x{6ca}
+ 1: ABCD1234
+ 2: ABCD123
+ 3: ABCD12
+ 4: ABCD1
+ 5: ABCD
+ 6: ABC
+ 7: AB
+    
+/^[\p{Xan}]/8
+    ABCD1234_
+ 0: A
+    1234abcd_
+ 0: 1
+    \x{6ca}
+ 0: \x{6ca}
+    \x{a6c}
+ 0: \x{a6c}
+    \x{10a7}   
+ 0: \x{10a7}
+    ** Failers
+No match
+    _ABC   
+No match
+ 
+/^[\p{Xan}]+/8
+    ABCD1234\x{6ca}\x{a6c}\x{10a7}_
+ 0: ABCD1234\x{6ca}\x{a6c}\x{10a7}
+ 1: ABCD1234\x{6ca}\x{a6c}
+ 2: ABCD1234\x{6ca}
+ 3: ABCD1234
+ 4: ABCD123
+ 5: ABCD12
+ 6: ABCD1
+ 7: ABCD
+ 8: ABC
+ 9: AB
+10: A
+    ** Failers
+No match
+    _ABC   
+No match
+
+/^>\p{Xsp}/8
+    >\x{1680}\x{2028}\x{0b}
+ 0: >\x{1680}
+    ** Failers
+No match
+    \x{0b} 
+No match
+
+/^>\p{Xsp}+/8
+    > \x{09}\x{0a}\x{0c}\x{0d}\x{a0}\x{1680}\x{2028}\x{0b}
+ 0: > \x{09}\x{0a}\x{0c}\x{0d}\x{a0}\x{1680}\x{2028}
+ 1: > \x{09}\x{0a}\x{0c}\x{0d}\x{a0}\x{1680}
+ 2: > \x{09}\x{0a}\x{0c}\x{0d}\x{a0}
+ 3: > \x{09}\x{0a}\x{0c}\x{0d}
+ 4: > \x{09}\x{0a}\x{0c}
+ 5: > \x{09}\x{0a}
+ 6: > \x{09}
+ 7: > 
+
+/^>\p{Xsp}*/8
+    > \x{09}\x{0a}\x{0c}\x{0d}\x{a0}\x{1680}\x{2028}\x{0b}
+ 0: > \x{09}\x{0a}\x{0c}\x{0d}\x{a0}\x{1680}\x{2028}
+ 1: > \x{09}\x{0a}\x{0c}\x{0d}\x{a0}\x{1680}
+ 2: > \x{09}\x{0a}\x{0c}\x{0d}\x{a0}
+ 3: > \x{09}\x{0a}\x{0c}\x{0d}
+ 4: > \x{09}\x{0a}\x{0c}
+ 5: > \x{09}\x{0a}
+ 6: > \x{09}
+ 7: > 
+ 8: >
+    
+/^>\p{Xsp}{2,9}/8
+    > \x{09}\x{0a}\x{0c}\x{0d}\x{a0}\x{1680}\x{2028}\x{0b}
+ 0: > \x{09}\x{0a}\x{0c}\x{0d}\x{a0}\x{1680}\x{2028}
+ 1: > \x{09}\x{0a}\x{0c}\x{0d}\x{a0}\x{1680}
+ 2: > \x{09}\x{0a}\x{0c}\x{0d}\x{a0}
+ 3: > \x{09}\x{0a}\x{0c}\x{0d}
+ 4: > \x{09}\x{0a}\x{0c}
+ 5: > \x{09}\x{0a}
+ 6: > \x{09}
+    
+/^>[\p{Xsp}]/8
+    >\x{2028}\x{0b}
+ 0: >\x{2028}
+ 
+/^>[\p{Xsp}]+/8
+    > \x{09}\x{0a}\x{0c}\x{0d}\x{a0}\x{1680}\x{2028}\x{0b}
+ 0: > \x{09}\x{0a}\x{0c}\x{0d}\x{a0}\x{1680}\x{2028}
+ 1: > \x{09}\x{0a}\x{0c}\x{0d}\x{a0}\x{1680}
+ 2: > \x{09}\x{0a}\x{0c}\x{0d}\x{a0}
+ 3: > \x{09}\x{0a}\x{0c}\x{0d}
+ 4: > \x{09}\x{0a}\x{0c}
+ 5: > \x{09}\x{0a}
+ 6: > \x{09}
+ 7: > 
+
+/^>\p{Xps}/8
+    >\x{1680}\x{2028}\x{0b}
+ 0: >\x{1680}
+    >\x{a0} 
+ 0: >\x{a0}
+    ** Failers
+No match
+    \x{0b} 
+No match
+
+/^>\p{Xps}+/8
+    > \x{09}\x{0a}\x{0c}\x{0d}\x{a0}\x{1680}\x{2028}\x{0b}
+ 0: > \x{09}\x{0a}\x{0c}\x{0d}\x{a0}\x{1680}\x{2028}\x{0b}
+ 1: > \x{09}\x{0a}\x{0c}\x{0d}\x{a0}\x{1680}\x{2028}
+ 2: > \x{09}\x{0a}\x{0c}\x{0d}\x{a0}\x{1680}
+ 3: > \x{09}\x{0a}\x{0c}\x{0d}\x{a0}
+ 4: > \x{09}\x{0a}\x{0c}\x{0d}
+ 5: > \x{09}\x{0a}\x{0c}
+ 6: > \x{09}\x{0a}
+ 7: > \x{09}
+ 8: > 
+
+/^>\p{Xps}+?/8
+    >\x{1680}\x{2028}\x{0b}
+ 0: >\x{1680}\x{2028}\x{0b}
+ 1: >\x{1680}\x{2028}
+ 2: >\x{1680}
+
+/^>\p{Xps}*/8
+    > \x{09}\x{0a}\x{0c}\x{0d}\x{a0}\x{1680}\x{2028}\x{0b}
+ 0: > \x{09}\x{0a}\x{0c}\x{0d}\x{a0}\x{1680}\x{2028}\x{0b}
+ 1: > \x{09}\x{0a}\x{0c}\x{0d}\x{a0}\x{1680}\x{2028}
+ 2: > \x{09}\x{0a}\x{0c}\x{0d}\x{a0}\x{1680}
+ 3: > \x{09}\x{0a}\x{0c}\x{0d}\x{a0}
+ 4: > \x{09}\x{0a}\x{0c}\x{0d}
+ 5: > \x{09}\x{0a}\x{0c}
+ 6: > \x{09}\x{0a}
+ 7: > \x{09}
+ 8: > 
+ 9: >
+    
+/^>\p{Xps}{2,9}/8
+    > \x{09}\x{0a}\x{0c}\x{0d}\x{a0}\x{1680}\x{2028}\x{0b}
+ 0: > \x{09}\x{0a}\x{0c}\x{0d}\x{a0}\x{1680}\x{2028}\x{0b}
+ 1: > \x{09}\x{0a}\x{0c}\x{0d}\x{a0}\x{1680}\x{2028}
+ 2: > \x{09}\x{0a}\x{0c}\x{0d}\x{a0}\x{1680}
+ 3: > \x{09}\x{0a}\x{0c}\x{0d}\x{a0}
+ 4: > \x{09}\x{0a}\x{0c}\x{0d}
+ 5: > \x{09}\x{0a}\x{0c}
+ 6: > \x{09}\x{0a}
+ 7: > \x{09}
+    
+/^>\p{Xps}{2,9}?/8
+    > \x{09}\x{0a}\x{0c}\x{0d}\x{a0}\x{1680}\x{2028}\x{0b}
+ 0: > \x{09}\x{0a}\x{0c}\x{0d}\x{a0}\x{1680}\x{2028}\x{0b}
+ 1: > \x{09}\x{0a}\x{0c}\x{0d}\x{a0}\x{1680}\x{2028}
+ 2: > \x{09}\x{0a}\x{0c}\x{0d}\x{a0}\x{1680}
+ 3: > \x{09}\x{0a}\x{0c}\x{0d}\x{a0}
+ 4: > \x{09}\x{0a}\x{0c}\x{0d}
+ 5: > \x{09}\x{0a}\x{0c}
+ 6: > \x{09}\x{0a}
+ 7: > \x{09}
+    
+/^>[\p{Xps}]/8
+    >\x{2028}\x{0b}
+ 0: >\x{2028}
+ 
+/^>[\p{Xps}]+/8
+    > \x{09}\x{0a}\x{0c}\x{0d}\x{a0}\x{1680}\x{2028}\x{0b}
+ 0: > \x{09}\x{0a}\x{0c}\x{0d}\x{a0}\x{1680}\x{2028}\x{0b}
+ 1: > \x{09}\x{0a}\x{0c}\x{0d}\x{a0}\x{1680}\x{2028}
+ 2: > \x{09}\x{0a}\x{0c}\x{0d}\x{a0}\x{1680}
+ 3: > \x{09}\x{0a}\x{0c}\x{0d}\x{a0}
+ 4: > \x{09}\x{0a}\x{0c}\x{0d}
+ 5: > \x{09}\x{0a}\x{0c}
+ 6: > \x{09}\x{0a}
+ 7: > \x{09}
+ 8: > 
+
+/^\p{Xwd}/8
+    ABCD
+ 0: A
+    1234
+ 0: 1
+    \x{6ca}
+ 0: \x{6ca}
+    \x{a6c}
+ 0: \x{a6c}
+    \x{10a7}
+ 0: \x{10a7}
+    _ABC    
+ 0: _
+    ** Failers
+No match
+    [] 
+No match
+
+/^\p{Xwd}+/8
+    ABCD1234\x{6ca}\x{a6c}\x{10a7}_
+ 0: ABCD1234\x{6ca}\x{a6c}\x{10a7}_
+ 1: ABCD1234\x{6ca}\x{a6c}\x{10a7}
+ 2: ABCD1234\x{6ca}\x{a6c}
+ 3: ABCD1234\x{6ca}
+ 4: ABCD1234
+ 5: ABCD123
+ 6: ABCD12
+ 7: ABCD1
+ 8: ABCD
+ 9: ABC
+10: AB
+11: A
+
+/^\p{Xwd}*/8
+    ABCD1234\x{6ca}\x{a6c}\x{10a7}_
+ 0: ABCD1234\x{6ca}\x{a6c}\x{10a7}_
+ 1: ABCD1234\x{6ca}\x{a6c}\x{10a7}
+ 2: ABCD1234\x{6ca}\x{a6c}
+ 3: ABCD1234\x{6ca}
+ 4: ABCD1234
+ 5: ABCD123
+ 6: ABCD12
+ 7: ABCD1
+ 8: ABCD
+ 9: ABC
+10: AB
+11: A
+12: 
+    
+/^\p{Xwd}{2,9}/8
+    A_12\x{6ca}\x{a6c}\x{10a7}
+ 0: A_12\x{6ca}\x{a6c}\x{10a7}
+ 1: A_12\x{6ca}\x{a6c}
+ 2: A_12\x{6ca}
+ 3: A_12
+ 4: A_1
+ 5: A_
+    
+/^[\p{Xwd}]/8
+    ABCD1234_
+ 0: A
+    1234abcd_
+ 0: 1
+    \x{6ca}
+ 0: \x{6ca}
+    \x{a6c}
+ 0: \x{a6c}
+    \x{10a7}   
+ 0: \x{10a7}
+    _ABC 
+ 0: _
+    ** Failers
+No match
+    []   
+No match
+ 
+/^[\p{Xwd}]+/8
+    ABCD1234\x{6ca}\x{a6c}\x{10a7}_
+ 0: ABCD1234\x{6ca}\x{a6c}\x{10a7}_
+ 1: ABCD1234\x{6ca}\x{a6c}\x{10a7}
+ 2: ABCD1234\x{6ca}\x{a6c}
+ 3: ABCD1234\x{6ca}
+ 4: ABCD1234
+ 5: ABCD123
+ 6: ABCD12
+ 7: ABCD1
+ 8: ABCD
+ 9: ABC
+10: AB
+11: A
+
+/-- Unicode properties for \b and \B --/
+
+/\b...\B/8W
+    abc_
+ 0: abc
+    \x{37e}abc\x{376} 
+ 0: abc
+    \x{37e}\x{376}\x{371}\x{393}\x{394} 
+ 0: \x{376}\x{371}\x{393}
+    !\x{c0}++\x{c1}\x{c2} 
+ 0: ++\x{c1}
+    !\x{c0}+++++ 
+ 0: \x{c0}++
+
+/-- Without PCRE_UCP, non-ASCII always fail, even if < 256  --/
+
+/\b...\B/8
+    abc_
+ 0: abc
+    ** Failers 
+ 0: Fai
+    \x{37e}abc\x{376} 
+No match
+    \x{37e}\x{376}\x{371}\x{393}\x{394} 
+No match
+    !\x{c0}++\x{c1}\x{c2} 
+No match
+    !\x{c0}+++++ 
+No match
+
+/-- With PCRE_UCP, non-UTF8 chars that are < 256 still check properties  --/
+
+/\b...\B/W
+    abc_
+ 0: abc
+    !\x{c0}++\x{c1}\x{c2} 
+ 0: ++\xc1
+    !\x{c0}+++++ 
+ 0: \xc0++
+
+/-- End of testinput10 --/

Deleted: code/trunk/testdata/testoutput11
===================================================================
--- code/trunk/testdata/testoutput11    2011-12-28 16:10:09 UTC (rev 835)
+++ code/trunk/testdata/testoutput11    2011-12-28 17:16:11 UTC (rev 836)
@@ -1,1444 +0,0 @@
-/-- These tests are for the Perl >= 5.10 features that PCRE supports. --/
-
-/\H\h\V\v/
-    X X\x0a
- 0: X X\x0a
-    X\x09X\x0b
- 0: X\x09X\x0b
-    ** Failers
-No match
-    \xa0 X\x0a   
-No match
-    
-/\H*\h+\V?\v{3,4}/ 
-    \x09\x20\xa0X\x0a\x0b\x0c\x0d\x0a
- 0: \x09 \xa0X\x0a\x0b\x0c\x0d
-    \x09\x20\xa0\x0a\x0b\x0c\x0d\x0a
- 0: \x09 \xa0\x0a\x0b\x0c\x0d
-    \x09\x20\xa0\x0a\x0b\x0c
- 0: \x09 \xa0\x0a\x0b\x0c
-    ** Failers 
-No match
-    \x09\x20\xa0\x0a\x0b
-No match
-     
-/\H{3,4}/
-    XY  ABCDE
- 0: ABCD
-    XY  PQR ST 
- 0: PQR
-    
-/.\h{3,4}./
-    XY  AB    PQRS
- 0: B    P
-
-/\h*X\h?\H+Y\H?Z/
-    >XNNNYZ
- 0: XNNNYZ
-    >  X NYQZ
- 0:   X NYQZ
-    ** Failers
-No match
-    >XYZ   
-No match
-    >  X NY Z
-No match
-
-/\v*X\v?Y\v+Z\V*\x0a\V+\x0b\V{2,3}\x0c/
-    >XY\x0aZ\x0aA\x0bNN\x0c
- 0: XY\x0aZ\x0aA\x0bNN\x0c
-    >\x0a\x0dX\x0aY\x0a\x0bZZZ\x0aAAA\x0bNNN\x0c
- 0: \x0a\x0dX\x0aY\x0a\x0bZZZ\x0aAAA\x0bNNN\x0c
-
-/(foo)\Kbar/
-    foobar
- 0: bar
- 1: foo
-   
-/(foo)(\Kbar|baz)/
-    foobar
- 0: bar
- 1: foo
- 2: bar
-    foobaz 
- 0: foobaz
- 1: foo
- 2: baz
-
-/(foo\Kbar)baz/
-    foobarbaz
- 0: barbaz
- 1: foobar
-
-/abc\K|def\K/g+
-    Xabcdefghi
- 0: 
- 0+ defghi
- 0: 
- 0+ ghi
-
-/ab\Kc|de\Kf/g+
-    Xabcdefghi
- 0: c
- 0+ defghi
- 0: f
- 0+ ghi
-    
-/(?=C)/g+
-    ABCDECBA
- 0: 
- 0+ CDECBA
- 0: 
- 0+ CBA
-    
-/^abc\K/+
-    abcdef
- 0: 
- 0+ def
-    ** Failers
-No match
-    defabcxyz   
-No match
-
-/^(a(b))\1\g1\g{1}\g-1\g{-1}\g{-02}Z/
-    ababababbbabZXXXX
- 0: ababababbbabZ
- 1: ab
- 2: b
-
-/(?<A>tom|bon)-\g{A}/
-    tom-tom
- 0: tom-tom
- 1: tom
-    bon-bon 
- 0: bon-bon
- 1: bon
-    
-/(^(a|b\g{-1}))/
-    bacxxx
-No match
-
-/(?|(abc)|(xyz))\1/
-    abcabc
- 0: abcabc
- 1: abc
-    xyzxyz 
- 0: xyzxyz
- 1: xyz
-    ** Failers
-No match
-    abcxyz
-No match
-    xyzabc   
-No match
-    
-/(?|(abc)|(xyz))(?1)/
-    abcabc
- 0: abcabc
- 1: abc
-    xyzabc 
- 0: xyzabc
- 1: xyz
-    ** Failers 
-No match
-    xyzxyz 
-No match
- 
-/^X(?5)(a)(?|(b)|(q))(c)(d)(Y)/
-    XYabcdY
- 0: XYabcdY
- 1: a
- 2: b
- 3: c
- 4: d
- 5: Y
-
-/^X(?7)(a)(?|(b|(r)(s))|(q))(c)(d)(Y)/
-    XYabcdY
- 0: XYabcdY
- 1: a
- 2: b
- 3: <unset>
- 4: <unset>
- 5: c
- 6: d
- 7: Y
-
-/^X(?7)(a)(?|(b|(?|(r)|(t))(s))|(q))(c)(d)(Y)/
-    XYabcdY
- 0: XYabcdY
- 1: a
- 2: b
- 3: <unset>
- 4: <unset>
- 5: c
- 6: d
- 7: Y
-
-/(?'abc'\w+):\k<abc>{2}/
-    a:aaxyz
- 0: a:aa
- 1: a
-    ab:ababxyz
- 0: ab:abab
- 1: ab
-    ** Failers
-No match
-    a:axyz
-No match
-    ab:abxyz
-No match
-
-/(?'abc'\w+):\g{abc}{2}/
-    a:aaxyz
- 0: a:aa
- 1: a
-    ab:ababxyz
- 0: ab:abab
- 1: ab
-    ** Failers
-No match
-    a:axyz
-No match
-    ab:abxyz
-No match
-
-/^(?<ab>a)? (?(<ab>)b|c) (?('ab')d|e)/x
-    abd
- 0: abd
- 1: a
-    ce
- 0: ce
-
-/^(a.)\g-1Z/
-    aXaXZ
- 0: aXaXZ
- 1: aX
-
-/^(a.)\g{-1}Z/
-    aXaXZ
- 0: aXaXZ
- 1: aX
-
-/^(?(DEFINE) (?<A> a) (?<B> b) )  (?&A) (?&B) /x
-    abcd
- 0: ab
-
-/(?<NAME>(?&NAME_PAT))\s+(?<ADDR>(?&ADDRESS_PAT))
-  (?(DEFINE)
-  (?<NAME_PAT>[a-z]+)
-  (?<ADDRESS_PAT>\d+)
-  )/x
-    metcalfe 33
- 0: metcalfe 33
- 1: metcalfe
- 2: 33
-
-/(?(DEFINE)(?<byte>2[0-4]\d|25[0-5]|1\d\d|[1-9]?\d))\b(?&byte)(\.(?&byte)){3}/
-    1.2.3.4
- 0: 1.2.3.4
- 1: <unset>
- 2: .4
-    131.111.10.206
- 0: 131.111.10.206
- 1: <unset>
- 2: .206
-    10.0.0.0
- 0: 10.0.0.0
- 1: <unset>
- 2: .0
-    ** Failers
-No match
-    10.6
-No match
-    455.3.4.5
-No match
-
-/\b(?&byte)(\.(?&byte)){3}(?(DEFINE)(?<byte>2[0-4]\d|25[0-5]|1\d\d|[1-9]?\d))/
-    1.2.3.4
- 0: 1.2.3.4
- 1: .4
-    131.111.10.206
- 0: 131.111.10.206
- 1: .206
-    10.0.0.0
- 0: 10.0.0.0
- 1: .0
-    ** Failers
-No match
-    10.6
-No match
-    455.3.4.5
-No match
-
-/^(\w++|\s++)*$/
-    now is the time for all good men to come to the aid of the party
- 0: now is the time for all good men to come to the aid of the party
- 1: party
-    *** Failers
-No match
-    this is not a line with only words and spaces!
-No match
-
-/(\d++)(\w)/
-    12345a
- 0: 12345a
- 1: 12345
- 2: a
-    *** Failers
-No match
-    12345+
-No match
-
-/a++b/
-    aaab
- 0: aaab
-
-/(a++b)/
-    aaab
- 0: aaab
- 1: aaab
-
-/(a++)b/
-    aaab
- 0: aaab
- 1: aaa
-
-/([^()]++|\([^()]*\))+/
-    ((abc(ade)ufh()()x
- 0: abc(ade)ufh()()x
- 1: x
-
-/\(([^()]++|\([^()]+\))+\)/
-    (abc)
- 0: (abc)
- 1: abc
-    (abc(def)xyz)
- 0: (abc(def)xyz)
- 1: xyz
-    *** Failers
-No match
-    ((()aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
-No match
-
-/^([^()]|\((?1)*\))*$/
-    abc
- 0: abc
- 1: c
-    a(b)c
- 0: a(b)c
- 1: c
-    a(b(c))d
- 0: a(b(c))d
- 1: d
-    *** Failers)
-No match
-    a(b(c)d
-No match
-
-/^>abc>([^()]|\((?1)*\))*<xyz<$/
-   >abc>123<xyz<
- 0: >abc>123<xyz<
- 1: 3
-   >abc>1(2)3<xyz<
- 0: >abc>1(2)3<xyz<
- 1: 3
-   >abc>(1(2)3)<xyz<
- 0: >abc>(1(2)3)<xyz<
- 1: (1(2)3)
-
-/^(?:((.)(?1)\2|)|((.)(?3)\4|.))$/i
-    1221
- 0: 1221
- 1: 1221
- 2: 1
-    Satanoscillatemymetallicsonatas
- 0: Satanoscillatemymetallicsonatas
- 1: <unset>
- 2: <unset>
- 3: Satanoscillatemymetallicsonatas
- 4: S
-    AmanaplanacanalPanama
- 0: AmanaplanacanalPanama
- 1: <unset>
- 2: <unset>
- 3: AmanaplanacanalPanama
- 4: A
-    AblewasIereIsawElba
- 0: AblewasIereIsawElba
- 1: <unset>
- 2: <unset>
- 3: AblewasIereIsawElba
- 4: A
-    *** Failers
-No match
-    Thequickbrownfox
-No match
-
-/^(\d+|\((?1)([+*-])(?1)\)|-(?1))$/
-    12
- 0: 12
- 1: 12
-    (((2+2)*-3)-7)
- 0: (((2+2)*-3)-7)
- 1: (((2+2)*-3)-7)
- 2: -
-    -12
- 0: -12
- 1: -12
-    *** Failers
-No match
-    ((2+2)*-3)-7)
-No match
-
-/^(x(y|(?1){2})z)/
-    xyz
- 0: xyz
- 1: xyz
- 2: y
-    xxyzxyzz
- 0: xxyzxyzz
- 1: xxyzxyzz
- 2: xyzxyz
-    *** Failers
-No match
-    xxyzz
-No match
-    xxyzxyzxyzz
-No match
-
-/((< (?: (?(R) \d++  | [^<>]*+) | (?2)) * >))/x
-    <>
- 0: <>
- 1: <>
- 2: <>
-    <abcd>
- 0: <abcd>
- 1: <abcd>
- 2: <abcd>
-    <abc <123> hij>
- 0: <abc <123> hij>
- 1: <abc <123> hij>
- 2: <abc <123> hij>
-    <abc <def> hij>
- 0: <def>
- 1: <def>
- 2: <def>
-    <abc<>def>
- 0: <abc<>def>
- 1: <abc<>def>
- 2: <abc<>def>
-    <abc<>
- 0: <>
- 1: <>
- 2: <>
-    *** Failers
-No match
-    <abc
-No match
-
-/^a+(*FAIL)/
-    aaaaaa
-No match
-    
-/a+b?c+(*FAIL)/
-    aaabccc
-No match
-
-/a+b?(*PRUNE)c+(*FAIL)/
-    aaabccc
-No match
-
-/a+b?(*COMMIT)c+(*FAIL)/
-    aaabccc
-No match
-    
-/a+b?(*SKIP)c+(*FAIL)/
-    aaabcccaaabccc
-No match
-
-/^(?:aaa(*THEN)\w{6}|bbb(*THEN)\w{5}|ccc(*THEN)\w{4}|\w{3})/
-    aaaxxxxxx
- 0: aaaxxxxxx
-    aaa++++++ 
- 0: aaa
-    bbbxxxxx
- 0: bbbxxxxx
-    bbb+++++ 
- 0: bbb
-    cccxxxx
- 0: cccxxxx
-    ccc++++ 
- 0: ccc
-    dddddddd   
- 0: ddd
-
-/^(aaa(*THEN)\w{6}|bbb(*THEN)\w{5}|ccc(*THEN)\w{4}|\w{3})/
-    aaaxxxxxx
- 0: aaaxxxxxx
- 1: aaaxxxxxx
-    aaa++++++ 
- 0: aaa
- 1: aaa
-    bbbxxxxx
- 0: bbbxxxxx
- 1: bbbxxxxx
-    bbb+++++ 
- 0: bbb
- 1: bbb
-    cccxxxx
- 0: cccxxxx
- 1: cccxxxx
-    ccc++++ 
- 0: ccc
- 1: ccc
-    dddddddd   
- 0: ddd
- 1: ddd
-
-/a+b?(*THEN)c+(*FAIL)/
-    aaabccc
-No match
-
-/(A (A|B(*ACCEPT)|C) D)(E)/x
-    AB
- 0: AB
- 1: AB
- 2: B
-    ABX
- 0: AB
- 1: AB
- 2: B
-    AADE
- 0: AADE
- 1: AAD
- 2: A
- 3: E
-    ACDE
- 0: ACDE
- 1: ACD
- 2: C
- 3: E
-    ** Failers
-No match
-    AD 
-No match
-        
-/^\W*+(?:((.)\W*+(?1)\W*+\2|)|((.)\W*+(?3)\W*+\4|\W*+.\W*+))\W*+$/i
-    1221
- 0: 1221
- 1: 1221
- 2: 1
-    Satan, oscillate my metallic sonatas!
- 0: Satan, oscillate my metallic sonatas!
- 1: <unset>
- 2: <unset>
- 3: Satan, oscillate my metallic sonatas
- 4: S
-    A man, a plan, a canal: Panama!
- 0: A man, a plan, a canal: Panama!
- 1: <unset>
- 2: <unset>
- 3: A man, a plan, a canal: Panama
- 4: A
-    Able was I ere I saw Elba.
- 0: Able was I ere I saw Elba.
- 1: <unset>
- 2: <unset>
- 3: Able was I ere I saw Elba
- 4: A
-    *** Failers
-No match
-    The quick brown fox
-No match
-
-/^((.)(?1)\2|.)$/
-    a
- 0: a
- 1: a
-    aba
- 0: aba
- 1: aba
- 2: a
-    aabaa  
- 0: aabaa
- 1: aabaa
- 2: a
-    abcdcba 
- 0: abcdcba
- 1: abcdcba
- 2: a
-    pqaabaaqp  
- 0: pqaabaaqp
- 1: pqaabaaqp
- 2: p
-    ablewasiereisawelba
- 0: ablewasiereisawelba
- 1: ablewasiereisawelba
- 2: a
-    rhubarb
-No match
-    the quick brown fox  
-No match
-
-/(a)(?<=b(?1))/
-    baz
- 0: a
- 1: a
-    ** Failers
-No match
-    caz  
-No match
-    
-/(?<=b(?1))(a)/
-    zbaaz
- 0: a
- 1: a
-    ** Failers
-No match
-    aaa  
-No match
-    
-/(?<X>a)(?<=b(?&X))/
-    baz
- 0: a
- 1: a
-
-/^(?|(abc)|(def))\1/
-    abcabc
- 0: abcabc
- 1: abc
-    defdef 
- 0: defdef
- 1: def
-    ** Failers
-No match
-    abcdef
-No match
-    defabc   
-No match
-    
-/^(?|(abc)|(def))(?1)/
-    abcabc
- 0: abcabc
- 1: abc
-    defabc
- 0: defabc
- 1: def
-    ** Failers
-No match
-    defdef
-No match
-    abcdef    
-No match
-
-/(?:a(?<quote> (?<apostrophe>')|(?<realquote>")) |b(?<quote> (?<apostrophe>')|(?<realquote>")) ) (?('quote')[a-z]+|[0-9]+)/xJ
-    a\"aaaaa
- 0: a"aaaaa
- 1: "
- 2: <unset>
- 3: "
-    b\"aaaaa 
- 0: b"aaaaa
- 1: <unset>
- 2: <unset>
- 3: <unset>
- 4: "
- 5: <unset>
- 6: "
-    ** Failers 
-No match
-    b\"11111
-No match
-
-/(?:(?1)|B)(A(*F)|C)/
-    ABCD
- 0: BC
- 1: C
-    CCD
- 0: CC
- 1: C
-    ** Failers
-No match
-    CAD   
-No match
-
-/^(?:(?1)|B)(A(*F)|C)/
-    CCD
- 0: CC
- 1: C
-    BCD 
- 0: BC
- 1: C
-    ** Failers
-No match
-    ABCD
-No match
-    CAD
-No match
-    BAD    
-No match
-
-/(?:(?1)|B)(A(*ACCEPT)XX|C)D/
-    AAD
- 0: AA
- 1: A
-    ACD
- 0: ACD
- 1: C
-    BAD
- 0: BA
- 1: A
-    BCD
- 0: BCD
- 1: C
-    BAX  
- 0: BA
- 1: A
-    ** Failers
-No match
-    ACX
-No match
-    ABC   
-No match
-
-/(?(DEFINE)(A))B(?1)C/
-    BAC
- 0: BAC
-
-/(?(DEFINE)((A)\2))B(?1)C/
-    BAAC
- 0: BAAC
-
-/(?<pn> \( ( [^()]++ | (?&pn) )* \) )/x
-    (ab(cd)ef)
- 0: (ab(cd)ef)
- 1: (ab(cd)ef)
- 2: ef
-
-/^(?!a(*SKIP)b)/
-    ac
- 0: 
-    
-/^(?=a(*SKIP)b|ac)/
-    ** Failers
-No match
-    ac
-No match
-    
-/^(?=a(*THEN)b|ac)/
-    ac
- 0: 
-    
-/^(?=a(*PRUNE)b)/
-    ab  
- 0: 
-    ** Failers 
-No match
-    ac
-No match
-
-/^(?=a(*ACCEPT)b)/
-    ac
- 0: 
-
-/^(?(?!a(*SKIP)b))/
-    ac
- 0: 
-
-/(?>a\Kb)/
-    ab
- 0: b
-
-/((?>a\Kb))/
-    ab
- 0: b
- 1: ab
-
-/(a\Kb)/
-    ab
- 0: b
- 1: ab
-    
-/^a\Kcz|ac/
-    ac
- 0: ac
-    
-/(?>a\Kbz|ab)/
-    ab 
- 0: ab
-
-/^(?&t)(?(DEFINE)(?<t>a\Kb))$/
-    ab
- 0: b
-
-/^([^()]|\((?1)*\))*$/
-    a(b)c
- 0: a(b)c
- 1: c
-    a(b(c)d)e 
- 0: a(b(c)d)e
- 1: e
-
-/(?P<L1>(?P<L2>0)(?P>L1)|(?P>L2))/
-    0
- 0: 0
- 1: 0
-    00
- 0: 00
- 1: 00
- 2: 0
-    0000  
- 0: 0000
- 1: 0000
- 2: 0
-
-/(?P<L1>(?P<L2>0)|(?P>L2)(?P>L1))/
-    0
- 0: 0
- 1: 0
- 2: 0
-    00
- 0: 0
- 1: 0
- 2: 0
-    0000  
- 0: 0
- 1: 0
- 2: 0
-
-/--- This one does fail, as expected, in Perl. It needs the complex item at the
-     end of the pattern. A single letter instead of (B|D) makes it not fail,
-     which I think is a Perl bug. --- /
-
-/A(*COMMIT)(B|D)/
-    ACABX
-No match
-
-/--- Check the use of names for failure ---/
-
-/^(A(*PRUNE:A)B|C(*PRUNE:B)D)/K
-    ** Failers
-No match
-    AC
-No match, mark = A
-    CB    
-No match, mark = B
-    
-/--- Force no study, otherwise mark is not seen. The studied version is in
-     test 2 because it isn't Perl-compatible. ---/
-
-/(*MARK:A)(*SKIP:B)(C|X)/KSS
-    C
- 0: C
- 1: C
-MK: A
-    D
-No match, mark = A
-     
-/^(A(*THEN:A)B|C(*THEN:B)D)/K
-    ** Failers
-No match
-    CB    
-No match, mark = B
-
-/^(?:A(*THEN:A)B|C(*THEN:B)D)/K
-    CB    
-No match, mark = B
-    
-/^(?>A(*THEN:A)B|C(*THEN:B)D)/K
-    CB    
-No match, mark = B
-    
-/--- This should succeed, as the skip causes bump to offset 1 (the mark). Note
-that we have to have something complicated such as (B|Z) at the end because,
-for Perl, a simple character somehow causes an unwanted optimization to mess
-with the handling of backtracking verbs. ---/
-
-/A(*MARK:A)A+(*SKIP:A)(B|Z) | AC/xK
-    AAAC
- 0: AC
-    
-/--- Test skipping over a non-matching mark. ---/
-
-/A(*MARK:A)A+(*MARK:B)(*SKIP:A)(B|Z) | AC/xK
-    AAAC
- 0: AC
-    
-/--- Check shorthand for MARK ---/
-
-/A(*:A)A+(*SKIP:A)(B|Z) | AC/xK
-    AAAC
- 0: AC
-
-/--- Don't loop! Force no study, otherwise mark is not seen. ---/
-
-/(*:A)A+(*SKIP:A)(B|Z)/KSS
-    AAAC
-No match, mark = A
-
-/--- This should succeed, as a non-existent skip name disables the skip ---/ 
-
-/A(*MARK:A)A+(*SKIP:B)(B|Z) | AC/xK
-    AAAC
- 0: AC
-
-/A(*MARK:A)A+(*SKIP:B)(B|Z) | AC(*:B)/xK
-    AAAC
- 0: AC
-MK: B
-
-/--- We use something more complicated than individual letters here, because
-that causes different behaviour in Perl. Perhaps it disables some optimization;
-anyway, the result now matches PCRE in that no tag is passed back for the 
-failures. ---/
-    
-/(A|P)(*:A)(B|P) | (X|P)(X|P)(*:B)(Y|P)/xK
-    AABC
- 0: AB
- 1: A
- 2: B
-MK: A
-    XXYZ 
- 0: XXY
- 1: <unset>
- 2: <unset>
- 3: X
- 4: X
- 5: Y
-MK: B
-    ** Failers
-No match
-    XAQQ  
-No match
-    XAQQXZZ  
-No match
-    AXQQQ 
-No match
-    AXXQQQ 
-No match
-    
-/--- COMMIT at the start of a pattern should act like an anchor. Again, 
-however, we need the complication for Perl. ---/
-
-/(*COMMIT)(A|P)(B|P)(C|P)/
-    ABCDEFG
- 0: ABC
- 1: A
- 2: B
- 3: C
-    ** Failers
-No match
-    DEFGABC  
-No match
-
-/--- COMMIT inside an atomic group can't stop backtracking over the group. ---/
-
-/(\w+)(?>b(*COMMIT))\w{2}/
-    abbb
- 0: abbb
- 1: a
-
-/(\w+)b(*COMMIT)\w{2}/
-    abbb
-No match
-
-/--- Check opening parens in comment when seeking forward reference. ---/ 
-
-/(?&t)(?#()(?(DEFINE)(?<t>a))/
-    bac
- 0: a
-
-/--- COMMIT should override THEN ---/
-
-/(?>(*COMMIT)(?>yes|no)(*THEN)(*F))?/
-  yes
-No match
-
-/(?>(*COMMIT)(yes|no)(*THEN)(*F))?/
-  yes
-No match
-
-/b?(*SKIP)c/
-    bc
- 0: bc
-    abc
- 0: bc
-   
-/(*SKIP)bc/
-    a
-No match
-
-/(*SKIP)b/
-    a 
-No match
-
-/(?P<abn>(?P=abn)xxx|)+/
-    xxx
- 0: 
- 1: 
-
-/(?i:([^b]))(?1)/
-    aa
- 0: aa
- 1: a
-    aA     
- 0: aA
- 1: a
-    ** Failers
- 0: **
- 1: *
-    ab
-No match
-    aB
-No match
-    Ba
-No match
-    ba
-No match
-
-/^(?&t)*+(?(DEFINE)(?<t>a))\w$/
-    aaaaaaX
- 0: aaaaaaX
-    ** Failers 
-No match
-    aaaaaa 
-No match
-
-/^(?&t)*(?(DEFINE)(?<t>a))\w$/
-    aaaaaaX
- 0: aaaaaaX
-    aaaaaa 
- 0: aaaaaa
-
-/^(a)*+(\w)/
-    aaaaX
- 0: aaaaX
- 1: a
- 2: X
-    YZ 
- 0: Y
- 1: <unset>
- 2: Y
-    ** Failers 
-No match
-    aaaa
-No match
-
-/^(?:a)*+(\w)/
-    aaaaX
- 0: aaaaX
- 1: X
-    YZ 
- 0: Y
- 1: Y
-    ** Failers 
-No match
-    aaaa
-No match
-
-/^(a)++(\w)/
-    aaaaX
- 0: aaaaX
- 1: a
- 2: X
-    ** Failers 
-No match
-    aaaa
-No match
-    YZ 
-No match
-
-/^(?:a)++(\w)/
-    aaaaX
- 0: aaaaX
- 1: X
-    ** Failers 
-No match
-    aaaa
-No match
-    YZ 
-No match
-
-/^(a)?+(\w)/
-    aaaaX
- 0: aa
- 1: a
- 2: a
-    YZ 
- 0: Y
- 1: <unset>
- 2: Y
-
-/^(?:a)?+(\w)/
-    aaaaX
- 0: aa
- 1: a
-    YZ 
- 0: Y
- 1: Y
-
-/^(a){2,}+(\w)/
-    aaaaX
- 0: aaaaX
- 1: a
- 2: X
-    ** Failers
-No match
-    aaa
-No match
-    YZ 
-No match
-
-/^(?:a){2,}+(\w)/
-    aaaaX
- 0: aaaaX
- 1: X
-    ** Failers
-No match
-    aaa
-No match
-    YZ 
-No match
-
-/(a|)*(?1)b/
-    b
- 0: b
- 1: 
-    ab
- 0: ab
- 1: 
-    aab  
- 0: aab
- 1: 
-
-/(a)++(?1)b/
-    ** Failers
-No match
-    ab 
-No match
-    aab
-No match
-
-/(a)*+(?1)b/
-    ** Failers
-No match
-    ab
-No match
-    aab  
-No match
-
-/(?1)(?:(b)){0}/
-    b
- 0: b
-
-/(foo ( \( ((?:(?> [^()]+ )|(?2))*) \) ) )/x
-    foo(bar(baz)+baz(bop))
- 0: foo(bar(baz)+baz(bop))
- 1: foo(bar(baz)+baz(bop))
- 2: (bar(baz)+baz(bop))
- 3: bar(baz)+baz(bop)
-
-/(A (A|B(*ACCEPT)|C) D)(E)/x
-    AB
- 0: AB
- 1: AB
- 2: B
-
-/\A.*?(?:a|b(*THEN)c)/
-    ba
- 0: ba
-
-/\A.*?(?:a|bc)/
-    ba
- 0: ba
-
-/\A.*?(a|b(*THEN)c)/
-    ba
- 0: ba
- 1: a
-
-/\A.*?(a|bc)/
-    ba
- 0: ba
- 1: a
-
-/\A.*?(?:a|b(*THEN)c)++/
-    ba
- 0: ba
-
-/\A.*?(?:a|bc)++/
-    ba
- 0: ba
-
-/\A.*?(a|b(*THEN)c)++/
-    ba
- 0: ba
- 1: a
-
-/\A.*?(a|bc)++/
-    ba
- 0: ba
- 1: a
-
-/\A.*?(?:a|b(*THEN)c|d)/
-    ba
- 0: ba
-
-/\A.*?(?:a|bc|d)/
-    ba
- 0: ba
-
-/(?:(b))++/
-    beetle
- 0: b
- 1: b
-
-/(?(?=(a(*ACCEPT)z))a)/
-    a
- 0: a
- 1: a
-
-/^(a)(?1)+ab/
-    aaaab
- 0: aaaab
- 1: a
-    
-/^(a)(?1)++ab/
-    aaaab
-No match
-
-/^(?=a(*:M))aZ/K
-    aZbc
- 0: aZ
-MK: M
-
-/^(?!(*:M)b)aZ/K
-    aZbc
- 0: aZ
-
-/(?(DEFINE)(a))?b(?1)/
-    backgammon
- 0: ba
-
-/^\N+/
-    abc\ndef
- 0: abc
-    
-/^\N{1,}/
-    abc\ndef 
- 0: abc
-
-/(?(R)a+|(?R)b)/
-    aaaabcde
- 0: aaaab
-
-/(?(R)a+|((?R))b)/
-    aaaabcde
- 0: aaaab
- 1: aaaa
-
-/((?(R)a+|(?1)b))/
-    aaaabcde
- 0: aaaab
- 1: aaaab
-
-/((?(R1)a+|(?1)b))/
-    aaaabcde
- 0: aaaab
- 1: aaaab
-
-/a(*:any 
-name)/K
-    abc
- 0: a
-MK: any 
-name
-    
-/(?>(?&t)c|(?&t))(?(DEFINE)(?<t>a|b(*PRUNE)c))/
-    a
- 0: a
-    ba
- 0: a
-    bba 
- 0: a
-    
-/--- Checking revised (*THEN) handling ---/ 
-
-/--- Capture ---/
-
-/^.*? (a(*THEN)b) c/x
-    aabc
-No match
-
-/^.*? (a(*THEN)b|(*F)) c/x
-    aabc
- 0: aabc
- 1: ab
-
-/^.*? ( (a(*THEN)b) | (*F) ) c/x
-    aabc
- 0: aabc
- 1: ab
- 2: ab
-
-/^.*? ( (a(*THEN)b) ) c/x
-    aabc
-No match
-
-/--- Non-capture ---/
-
-/^.*? (?:a(*THEN)b) c/x
-    aabc
-No match
-
-/^.*? (?:a(*THEN)b|(*F)) c/x
-    aabc
- 0: aabc
-
-/^.*? (?: (?:a(*THEN)b) | (*F) ) c/x
-    aabc
- 0: aabc
-
-/^.*? (?: (?:a(*THEN)b) ) c/x
-    aabc
-No match
-
-/--- Atomic ---/
-
-/^.*? (?>a(*THEN)b) c/x
-    aabc
-No match
-
-/^.*? (?>a(*THEN)b|(*F)) c/x
-    aabc
- 0: aabc
-
-/^.*? (?> (?>a(*THEN)b) | (*F) ) c/x
-    aabc
- 0: aabc
-
-/^.*? (?> (?>a(*THEN)b) ) c/x
-    aabc
-No match
-
-/--- Possessive capture ---/
-
-/^.*? (a(*THEN)b)++ c/x
-    aabc
-No match
-
-/^.*? (a(*THEN)b|(*F))++ c/x
-    aabc
- 0: aabc
- 1: ab
-
-/^.*? ( (a(*THEN)b)++ | (*F) )++ c/x
-    aabc
- 0: aabc
- 1: ab
- 2: ab
-
-/^.*? ( (a(*THEN)b)++ )++ c/x
-    aabc
-No match
-
-/--- Possessive non-capture ---/
-
-/^.*? (?:a(*THEN)b)++ c/x
-    aabc
-No match
-
-/^.*? (?:a(*THEN)b|(*F))++ c/x
-    aabc
- 0: aabc
-
-/^.*? (?: (?:a(*THEN)b)++ | (*F) )++ c/x
-    aabc
- 0: aabc
-
-/^.*? (?: (?:a(*THEN)b)++ )++ c/x
-    aabc
-No match
-    
-/--- Condition assertion ---/
-
-/^(?(?=a(*THEN)b)ab|ac)/
-    ac
- 0: ac
- 
-/--- Condition ---/
-
-/^.*?(?(?=a)a|b(*THEN)c)/
-    ba
-No match
-
-/^.*?(?:(?(?=a)a|b(*THEN)c)|d)/
-    ba
- 0: ba
-
-/^.*?(?(?=a)a(*THEN)b|c)/
-    ac
-No match
-
-/--- Assertion ---/
-
-/^.*(?=a(*THEN)b)/ 
-    aabc
- 0: a
-
-/------------------------------/
-
-/(?>a(*:m))/imsxSK 
-    a
- 0: a
-MK: m
-
-/(?>(a)(*:m))/imsxSK 
-    a
- 0: a
- 1: a
-MK: m
-
-/(?<=a(*ACCEPT)b)c/
-    xacd
- 0: c
-
-/(?<=(a(*ACCEPT)b))c/
-    xacd
- 0: c
- 1: a
-
-/(?<=(a(*COMMIT)b))c/
-    xabcd
- 0: c
- 1: ab
-    ** Failers 
-No match
-    xacd
-No match
-    
-/(?<!a(*FAIL)b)c/
-    xcd
- 0: c
-    acd 
- 0: c
-
-/(?<=a(*:N)b)c/K
-    xabcd
- 0: c
-MK: N
-    
-/(?<=a(*PRUNE)b)c/
-    xabcd 
- 0: c
-
-/(?<=a(*SKIP)b)c/
-    xabcd 
- 0: c
-
-/(?<=a(*THEN)b)c/
-    xabcd 
- 0: c
-
-/-- End of testinput11 --/

Copied: code/trunk/testdata/testoutput11-16 (from rev 835, code/branches/pcre16/testdata/testoutput11-16)
===================================================================
--- code/trunk/testdata/testoutput11-16                            (rev 0)
+++ code/trunk/testdata/testoutput11-16    2011-12-28 17:16:11 UTC (rev 836)
@@ -0,0 +1,713 @@
+/-- These are a few representative patterns whose lengths and offsets are to be 
+shown when the link size is 2. This is just a doublecheck test to ensure the 
+sizes don't go horribly wrong when something is changed. The pattern contents 
+are all themselves checked in other tests. Unicode, including property support, 
+is required for these tests. --/
+
+/((?i)b)/BM
+Memory allocation (code space): 24
+------------------------------------------------------------------
+  0   9 Bra
+  2   5 CBra 1
+  5  /i b
+  7   5 Ket
+  9   9 Ket
+ 11     End
+------------------------------------------------------------------
+
+/(?s)(.*X|^B)/BM
+Memory allocation (code space): 38
+------------------------------------------------------------------
+  0  16 Bra
+  2   7 CBra 1
+  5     AllAny*
+  7     X
+  9   5 Alt
+ 11     ^
+ 12     B
+ 14  12 Ket
+ 16  16 Ket
+ 18     End
+------------------------------------------------------------------
+
+/(?s:.*X|^B)/BM
+Memory allocation (code space): 36
+------------------------------------------------------------------
+  0  15 Bra
+  2   6 Bra
+  4     AllAny*
+  6     X
+  8   5 Alt
+ 10     ^
+ 11     B
+ 13  11 Ket
+ 15  15 Ket
+ 17     End
+------------------------------------------------------------------
+
+/^[[:alnum:]]/BM
+Memory allocation (code space): 46
+------------------------------------------------------------------
+  0  20 Bra
+  2     ^
+  3     [0-9A-Za-z]
+ 20  20 Ket
+ 22     End
+------------------------------------------------------------------
+
+/#/IxMD
+Memory allocation (code space): 10
+------------------------------------------------------------------
+  0   2 Bra
+  2   2 Ket
+  4     End
+------------------------------------------------------------------
+Capturing subpattern count = 0
+Options: extended
+No first char
+No need char
+
+/a#/IxMD
+Memory allocation (code space): 14
+------------------------------------------------------------------
+  0   4 Bra
+  2     a
+  4   4 Ket
+  6     End
+------------------------------------------------------------------
+Capturing subpattern count = 0
+Options: extended
+First char = 'a'
+No need char
+
+/x?+/BM
+Memory allocation (code space): 14
+------------------------------------------------------------------
+  0   4 Bra
+  2     x?+
+  4   4 Ket
+  6     End
+------------------------------------------------------------------
+
+/x++/BM
+Memory allocation (code space): 14
+------------------------------------------------------------------
+  0   4 Bra
+  2     x++
+  4   4 Ket
+  6     End
+------------------------------------------------------------------
+
+/x{1,3}+/BM 
+Memory allocation (code space): 28
+------------------------------------------------------------------
+  0  11 Bra
+  2   7 Once
+  4     x
+  6     x{0,2}
+  9   7 Ket
+ 11  11 Ket
+ 13     End
+------------------------------------------------------------------
+
+/(x)*+/BM
+Memory allocation (code space): 26
+------------------------------------------------------------------
+  0  10 Bra
+  2     Braposzero
+  3   5 CBraPos 1
+  6     x
+  8   5 KetRpos
+ 10  10 Ket
+ 12     End
+------------------------------------------------------------------
+
+/^((a+)(?U)([ab]+)(?-U)([bc]+)(\w*))/BM
+Memory allocation (code space): 142
+------------------------------------------------------------------
+  0  68 Bra
+  2     ^
+  3  63 CBra 1
+  6   5 CBra 2
+  9     a+
+ 11   5 Ket
+ 13  21 CBra 3
+ 16     [ab]+?
+ 34  21 Ket
+ 36  21 CBra 4
+ 39     [bc]+
+ 57  21 Ket
+ 59   5 CBra 5
+ 62     \w*
+ 64   5 Ket
+ 66  63 Ket
+ 68  68 Ket
+ 70     End
+------------------------------------------------------------------
+
+|8J\$WE\<\.rX\+ix\[d1b\!H\#\?vV0vrK\:ZH1\=2M\>iV\;\?aPhFB\<\*vW\@QW\@sO9\}cfZA\-i\'w\%hKd6gt1UJP\,15_\#QY\$M\^Mss_U\/\]\&LK9\[5vQub\^w\[KDD\<EjmhUZ\?\.akp2dF\>qmj\;2\}YWFdYx\.Ap\]hjCPTP\(n28k\+3\;o\&WXqs\/gOXdr\$\:r\'do0\;b4c\(f_Gr\=\"\\4\)\[01T7ajQJvL\$W\~mL_sS\/4h\:x\*\[ZN\=KLs\&L5zX\/\/\>it\,o\:aU\(\;Z\>pW\&T7oP\'2K\^E\:x9\'c\[\%z\-\,64JQ5AeH_G\#KijUKghQw\^\\vea3a\?kka_G\$8\#\`\*kynsxzBLru\'\]k_\[7FrVx\}\^\=\$blx\>s\-N\%j\;D\*aZDnsw\:YKZ\%Q\.Kne9\#hP\?\+b3\(SOvL\,\^\;\&u5\@\?5C5Bhb\=m\-vEh_L15Jl\]U\)0RP6\{q\%L\^_z5E\'Dw6X\b|BM
+Memory allocation (code space): 1648
+------------------------------------------------------------------
+  0 821 Bra
+  2     8J$WE<.rX+ix[d1b!H#?vV0vrK:ZH1=2M>iV;?aPhFB<*vW@QW@sO9}cfZA-i'w%hKd6gt1UJP,15_#QY$M^Mss_U/]&LK9[5vQub^w[KDD<EjmhUZ?.akp2dF>qmj;2}YWFdYx.Ap]hjCPTP(n28k+3;o&WXqs/gOXdr$:r'do0;b4c(f_Gr="\4)[01T7ajQJvL$W~mL_sS/4h:x*[ZN=KLs&L5zX//>it,o:aU(;Z>pW&T7oP'2K^E:x9'c[%z-,64JQ5AeH_G#KijUKghQw^\vea3a?kka_G$8#`*kynsxzBLru']k_[7FrVx}^=$blx>s-N%j;D*aZDnsw:YKZ%Q.Kne9#hP?+b3(SOvL,^;&u5@?5C5Bhb=m-vEh_L15Jl]U)0RP6{q%L^_z5E'Dw6X
+820     \b
+821 821 Ket
+823     End
+------------------------------------------------------------------
+
+|\$\<\.X\+ix\[d1b\!H\#\?vV0vrK\:ZH1\=2M\>iV\;\?aPhFB\<\*vW\@QW\@sO9\}cfZA\-i\'w\%hKd6gt1UJP\,15_\#QY\$M\^Mss_U\/\]\&LK9\[5vQub\^w\[KDD\<EjmhUZ\?\.akp2dF\>qmj\;2\}YWFdYx\.Ap\]hjCPTP\(n28k\+3\;o\&WXqs\/gOXdr\$\:r\'do0\;b4c\(f_Gr\=\"\\4\)\[01T7ajQJvL\$W\~mL_sS\/4h\:x\*\[ZN\=KLs\&L5zX\/\/\>it\,o\:aU\(\;Z\>pW\&T7oP\'2K\^E\:x9\'c\[\%z\-\,64JQ5AeH_G\#KijUKghQw\^\\vea3a\?kka_G\$8\#\`\*kynsxzBLru\'\]k_\[7FrVx\}\^\=\$blx\>s\-N\%j\;D\*aZDnsw\:YKZ\%Q\.Kne9\#hP\?\+b3\(SOvL\,\^\;\&u5\@\?5C5Bhb\=m\-vEh_L15Jl\]U\)0RP6\{q\%L\^_z5E\'Dw6X\b|BM
+Memory allocation (code space): 1628
+------------------------------------------------------------------
+  0 811 Bra
+  2     $<.X+ix[d1b!H#?vV0vrK:ZH1=2M>iV;?aPhFB<*vW@QW@sO9}cfZA-i'w%hKd6gt1UJP,15_#QY$M^Mss_U/]&LK9[5vQub^w[KDD<EjmhUZ?.akp2dF>qmj;2}YWFdYx.Ap]hjCPTP(n28k+3;o&WXqs/gOXdr$:r'do0;b4c(f_Gr="\4)[01T7ajQJvL$W~mL_sS/4h:x*[ZN=KLs&L5zX//>it,o:aU(;Z>pW&T7oP'2K^E:x9'c[%z-,64JQ5AeH_G#KijUKghQw^\vea3a?kka_G$8#`*kynsxzBLru']k_[7FrVx}^=$blx>s-N%j;D*aZDnsw:YKZ%Q.Kne9#hP?+b3(SOvL,^;&u5@?5C5Bhb=m-vEh_L15Jl]U)0RP6{q%L^_z5E'Dw6X
+810     \b
+811 811 Ket
+813     End
+------------------------------------------------------------------
+
+/(a(?1)b)/BM
+Memory allocation (code space): 32
+------------------------------------------------------------------
+  0  13 Bra
+  2   9 CBra 1
+  5     a
+  7   2 Recurse
+  9     b
+ 11   9 Ket
+ 13  13 Ket
+ 15     End
+------------------------------------------------------------------
+
+/(a(?1)+b)/BM
+Memory allocation (code space): 40
+------------------------------------------------------------------
+  0  17 Bra
+  2  13 CBra 1
+  5     a
+  7   4 Once
+  9   2 Recurse
+ 11   4 KetRmax
+ 13     b
+ 15  13 Ket
+ 17  17 Ket
+ 19     End
+------------------------------------------------------------------
+
+/a(?P<name1>b|c)d(?P<longername2>e)/BM
+Memory allocation (code space): 80
+------------------------------------------------------------------
+  0  24 Bra
+  2     a
+  4   5 CBra 1
+  7     b
+  9   4 Alt
+ 11     c
+ 13   9 Ket
+ 15     d
+ 17   5 CBra 2
+ 20     e
+ 22   5 Ket
+ 24  24 Ket
+ 26     End
+------------------------------------------------------------------
+
+/(?:a(?P<c>c(?P<d>d)))(?P<a>a)/BM
+Memory allocation (code space): 73
+------------------------------------------------------------------
+  0  29 Bra
+  2  18 Bra
+  4     a
+  6  12 CBra 1
+  9     c
+ 11   5 CBra 2
+ 14     d
+ 16   5 Ket
+ 18  12 Ket
+ 20  18 Ket
+ 22   5 CBra 3
+ 25     a
+ 27   5 Ket
+ 29  29 Ket
+ 31     End
+------------------------------------------------------------------
+
+/(?P<a>a)...(?P=a)bbb(?P>a)d/BM
+Memory allocation (code space): 57
+------------------------------------------------------------------
+  0  24 Bra
+  2   5 CBra 1
+  5     a
+  7   5 Ket
+  9     Any
+ 10     Any
+ 11     Any
+ 12     \1
+ 14     bbb
+ 20   2 Recurse
+ 22     d
+ 24  24 Ket
+ 26     End
+------------------------------------------------------------------
+
+/abc(?C255)de(?C)f/BM
+Memory allocation (code space): 50
+------------------------------------------------------------------
+  0  22 Bra
+  2     abc
+  8     Callout 255 10 1
+ 12     de
+ 16     Callout 0 16 1
+ 20     f
+ 22  22 Ket
+ 24     End
+------------------------------------------------------------------
+
+/abcde/CBM
+Memory allocation (code space): 78
+------------------------------------------------------------------
+  0  36 Bra
+  2     Callout 255 0 1
+  6     a
+  8     Callout 255 1 1
+ 12     b
+ 14     Callout 255 2 1
+ 18     c
+ 20     Callout 255 3 1
+ 24     d
+ 26     Callout 255 4 1
+ 30     e
+ 32     Callout 255 5 0
+ 36  36 Ket
+ 38     End
+------------------------------------------------------------------
+
+/\x{100}/8BM
+Memory allocation (code space): 14
+------------------------------------------------------------------
+  0   4 Bra
+  2     \x{100}
+  4   4 Ket
+  6     End
+------------------------------------------------------------------
+
+/\x{1000}/8BM
+Memory allocation (code space): 14
+------------------------------------------------------------------
+  0   4 Bra
+  2     \x{1000}
+  4   4 Ket
+  6     End
+------------------------------------------------------------------
+
+/\x{10000}/8BM
+Memory allocation (code space): 16
+------------------------------------------------------------------
+  0   5 Bra
+  2     \x{10000}
+  5   5 Ket
+  7     End
+------------------------------------------------------------------
+
+/\x{100000}/8BM
+Memory allocation (code space): 16
+------------------------------------------------------------------
+  0   5 Bra
+  2     \x{100000}
+  5   5 Ket
+  7     End
+------------------------------------------------------------------
+
+/\x{10ffff}/8BM
+Memory allocation (code space): 16
+------------------------------------------------------------------
+  0   5 Bra
+  2     \x{10ffff}
+  5   5 Ket
+  7     End
+------------------------------------------------------------------
+
+/\x{110000}/8BM
+Failed: character value in \x{...} sequence is too large at offset 9
+
+/[\x{ff}]/8BM
+Memory allocation (code space): 14
+------------------------------------------------------------------
+  0   4 Bra
+  2     \xff
+  4   4 Ket
+  6     End
+------------------------------------------------------------------
+
+/[\x{100}]/8BM
+Memory allocation (code space): 14
+------------------------------------------------------------------
+  0   4 Bra
+  2     \x{100}
+  4   4 Ket
+  6     End
+------------------------------------------------------------------
+
+/\x80/8BM
+Memory allocation (code space): 14
+------------------------------------------------------------------
+  0   4 Bra
+  2     \x80
+  4   4 Ket
+  6     End
+------------------------------------------------------------------
+
+/\xff/8BM
+Memory allocation (code space): 14
+------------------------------------------------------------------
+  0   4 Bra
+  2     \xff
+  4   4 Ket
+  6     End
+------------------------------------------------------------------
+
+/\x{0041}\x{2262}\x{0391}\x{002e}/D8M
+Memory allocation (code space): 26
+------------------------------------------------------------------
+  0  10 Bra
+  2     A\x{2262}\x{391}.
+ 10  10 Ket
+ 12     End
+------------------------------------------------------------------
+Capturing subpattern count = 0
+Options: utf
+First char = 'A'
+Need char = '.'
+    
+/\x{D55c}\x{ad6d}\x{C5B4}/D8M 
+Memory allocation (code space): 22
+------------------------------------------------------------------
+  0   8 Bra
+  2     \x{d55c}\x{ad6d}\x{c5b4}
+  8   8 Ket
+ 10     End
+------------------------------------------------------------------
+Capturing subpattern count = 0
+Options: utf
+First char = \x{d55c}
+Need char = \x{c5b4}
+
+/\x{65e5}\x{672c}\x{8a9e}/D8M
+Memory allocation (code space): 22
+------------------------------------------------------------------
+  0   8 Bra
+  2     \x{65e5}\x{672c}\x{8a9e}
+  8   8 Ket
+ 10     End
+------------------------------------------------------------------
+Capturing subpattern count = 0
+Options: utf
+First char = \x{65e5}
+Need char = \x{8a9e}
+
+/[\x{100}]/8BM
+Memory allocation (code space): 14
+------------------------------------------------------------------
+  0   4 Bra
+  2     \x{100}
+  4   4 Ket
+  6     End
+------------------------------------------------------------------
+
+/[Z\x{100}]/8BM
+Memory allocation (code space): 54
+------------------------------------------------------------------
+  0  24 Bra
+  2     [Z\x{100}]
+ 24  24 Ket
+ 26     End
+------------------------------------------------------------------
+
+/^[\x{100}\E-\Q\E\x{150}]/B8M
+Memory allocation (code space): 26
+------------------------------------------------------------------
+  0  10 Bra
+  2     ^
+  3     [\x{100}-\x{150}]
+ 10  10 Ket
+ 12     End
+------------------------------------------------------------------
+
+/^[\QĀ\E-\QŐ\E]/B8M
+Memory allocation (code space): 26
+------------------------------------------------------------------
+  0  10 Bra
+  2     ^
+  3     [\x{100}-\x{150}]
+ 10  10 Ket
+ 12     End
+------------------------------------------------------------------
+
+/^[\QĀ\E-\QŐ\E/B8M
+Failed: missing terminating ] for character class at offset 13
+
+/[\p{L}]/BM
+Memory allocation (code space): 24
+------------------------------------------------------------------
+  0   9 Bra
+  2     [\p{L}]
+  9   9 Ket
+ 11     End
+------------------------------------------------------------------
+
+/[\p{^L}]/BM
+Memory allocation (code space): 24
+------------------------------------------------------------------
+  0   9 Bra
+  2     [\P{L}]
+  9   9 Ket
+ 11     End
+------------------------------------------------------------------
+
+/[\P{L}]/BM
+Memory allocation (code space): 24
+------------------------------------------------------------------
+  0   9 Bra
+  2     [\P{L}]
+  9   9 Ket
+ 11     End
+------------------------------------------------------------------
+
+/[\P{^L}]/BM
+Memory allocation (code space): 24
+------------------------------------------------------------------
+  0   9 Bra
+  2     [\p{L}]
+  9   9 Ket
+ 11     End
+------------------------------------------------------------------
+
+/[abc\p{L}\x{0660}]/8BM
+Memory allocation (code space): 60
+------------------------------------------------------------------
+  0  27 Bra
+  2     [a-c\p{L}\x{660}]
+ 27  27 Ket
+ 29     End
+------------------------------------------------------------------
+
+/[\p{Nd}]/8BM
+Memory allocation (code space): 24
+------------------------------------------------------------------
+  0   9 Bra
+  2     [\p{Nd}]
+  9   9 Ket
+ 11     End
+------------------------------------------------------------------
+
+/[\p{Nd}+-]+/8BM
+Memory allocation (code space): 58
+------------------------------------------------------------------
+  0  26 Bra
+  2     [+\-\p{Nd}]+
+ 26  26 Ket
+ 28     End
+------------------------------------------------------------------
+
+/A\x{391}\x{10427}\x{ff3a}\x{1fb0}/8iBM
+Memory allocation (code space): 32
+------------------------------------------------------------------
+  0  13 Bra
+  2  /i A\x{391}\x{10427}\x{ff3a}\x{1fb0}
+ 13  13 Ket
+ 15     End
+------------------------------------------------------------------
+
+/A\x{391}\x{10427}\x{ff3a}\x{1fb0}/8BM
+Memory allocation (code space): 32
+------------------------------------------------------------------
+  0  13 Bra
+  2     A\x{391}\x{10427}\x{ff3a}\x{1fb0}
+ 13  13 Ket
+ 15     End
+------------------------------------------------------------------
+
+/[\x{105}-\x{109}]/8iBM
+Memory allocation (code space): 24
+------------------------------------------------------------------
+  0   9 Bra
+  2     [\x{104}-\x{109}]
+  9   9 Ket
+ 11     End
+------------------------------------------------------------------
+
+/( ( (?(1)0|) )*   )/xBM
+Memory allocation (code space): 52
+------------------------------------------------------------------
+  0  23 Bra
+  2  19 CBra 1
+  5     Brazero
+  6  13 SCBra 2
+  9   6 Cond
+ 11   1 Cond ref
+ 13     0
+ 15   2 Alt
+ 17   8 Ket
+ 19  13 KetRmax
+ 21  19 Ket
+ 23  23 Ket
+ 25     End
+------------------------------------------------------------------
+
+/(  (?(1)0|)*   )/xBM
+Memory allocation (code space): 42
+------------------------------------------------------------------
+  0  18 Bra
+  2  14 CBra 1
+  5     Brazero
+  6   6 SCond
+  8   1 Cond ref
+ 10     0
+ 12   2 Alt
+ 14   8 KetRmax
+ 16  14 Ket
+ 18  18 Ket
+ 20     End
+------------------------------------------------------------------
+
+/[a]/BM
+Memory allocation (code space): 14
+------------------------------------------------------------------
+  0   4 Bra
+  2     a
+  4   4 Ket
+  6     End
+------------------------------------------------------------------
+
+/[a]/8BM
+Memory allocation (code space): 14
+------------------------------------------------------------------
+  0   4 Bra
+  2     a
+  4   4 Ket
+  6     End
+------------------------------------------------------------------
+
+/[\xaa]/BM
+Memory allocation (code space): 14
+------------------------------------------------------------------
+  0   4 Bra
+  2     \xaa
+  4   4 Ket
+  6     End
+------------------------------------------------------------------
+
+/[\xaa]/8BM
+Memory allocation (code space): 14
+------------------------------------------------------------------
+  0   4 Bra
+  2     \xaa
+  4   4 Ket
+  6     End
+------------------------------------------------------------------
+
+/[^a]/BM
+Memory allocation (code space): 14
+------------------------------------------------------------------
+  0   4 Bra
+  2     [^a]
+  4   4 Ket
+  6     End
+------------------------------------------------------------------
+
+/[^a]/8BM
+Memory allocation (code space): 14
+------------------------------------------------------------------
+  0   4 Bra
+  2     [^a]
+  4   4 Ket
+  6     End
+------------------------------------------------------------------
+
+/[^\xaa]/BM
+Memory allocation (code space): 14
+------------------------------------------------------------------
+  0   4 Bra
+  2     [^\xaa]
+  4   4 Ket
+  6     End
+------------------------------------------------------------------
+
+/[^\xaa]/8BM
+Memory allocation (code space): 14
+------------------------------------------------------------------
+  0   4 Bra
+  2     [^\x{aa}]
+  4   4 Ket
+  6     End
+------------------------------------------------------------------
+
+/[^\d]/8WB
+------------------------------------------------------------------
+  0   9 Bra
+  2     [^\p{Nd}]
+  9   9 Ket
+ 11     End
+------------------------------------------------------------------
+
+/[[:^alpha:][:^cntrl:]]+/8WB
+------------------------------------------------------------------
+  0  26 Bra
+  2     [ -~\x80-\xff\P{L}]+
+ 26  26 Ket
+ 28     End
+------------------------------------------------------------------
+
+/[[:^cntrl:][:^alpha:]]+/8WB
+------------------------------------------------------------------
+  0  26 Bra
+  2     [ -~\x80-\xff\P{L}]+
+ 26  26 Ket
+ 28     End
+------------------------------------------------------------------
+
+/[[:alpha:]]+/8WB
+------------------------------------------------------------------
+  0  10 Bra
+  2     [\p{L}]+
+ 10  10 Ket
+ 12     End
+------------------------------------------------------------------
+
+/[[:^alpha:]\S]+/8WB
+------------------------------------------------------------------
+  0  13 Bra
+  2     [\P{L}\P{Xsp}]+
+ 13  13 Ket
+ 15     End
+------------------------------------------------------------------
+
+/abc(d|e)(*THEN)x(123(*THEN)4|567(b|q)(*THEN)xx)/B
+------------------------------------------------------------------
+  0  60 Bra
+  2     abc
+  8   5 CBra 1
+ 11     d
+ 13   4 Alt
+ 15     e
+ 17   9 Ket
+ 19     *THEN
+ 20     x
+ 22  12 CBra 2
+ 25     123
+ 31     *THEN
+ 32     4
+ 34  24 Alt
+ 36     567
+ 42   5 CBra 3
+ 45     b
+ 47   4 Alt
+ 49     q
+ 51   9 Ket
+ 53     *THEN
+ 54     xx
+ 58  36 Ket
+ 60  60 Ket
+ 62     End
+------------------------------------------------------------------
+
+/-- End of testinput11 --/

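Note on reading the two listings: in the entries shown here that have no named groups, the reported "Memory allocation (code space)" figure equals the offset of the final End opcode plus one, multiplied by the code unit width (2 bytes in testoutput11-16, 1 byte in testoutput11-8). The sketch below checks that relation on the first pattern from each file. It is an illustration based on the listings themselves, not part of PCRE or pcretest; the function name check_listing and the two sample constants are hypothetical. In the 16-bit listing, patterns with named groups (for example the one reporting 80 with End at offset 26) do not satisfy this relation, and the sketch makes no attempt to model that.

```python
import re


def check_listing(listing: str, unit_bytes: int) -> bool:
    """Return True if code space == (End offset + 1) * unit_bytes.

    Hypothetical helper: assumes the listing has one 'Memory allocation'
    line and one 'End' opcode line, as in the pcretest output in this diff.
    """
    alloc = re.search(r"Memory allocation \(code space\): (\d+)", listing)
    end = re.search(r"^\s*(\d+)\s+End\s*$", listing, re.MULTILINE)
    if alloc is None or end is None:
        raise ValueError("listing lacks an allocation line or an End opcode")
    units = int(end.group(1)) + 1  # offsets are zero-based; End is the last unit
    return int(alloc.group(1)) == units * unit_bytes


# First pattern as it appears in testoutput11-16 (16-bit code units).
LISTING_16 = """\
/((?i)b)/BM
Memory allocation (code space): 24
  0   9 Bra
  2   5 CBra 1
  5  /i b
  7   5 Ket
  9   9 Ket
 11     End
"""

# The same pattern as it appears in testoutput11-8 (8-bit code units).
LISTING_8 = """\
/((?i)b)/BM
Memory allocation (code space): 17
  0  13 Bra
  3   7 CBra 1
  8  /i b
 10   7 Ket
 13  13 Ket
 16     End
"""

if __name__ == "__main__":
    print(check_listing(LISTING_16, 2))
    print(check_listing(LISTING_8, 1))
```
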
Copied: code/trunk/testdata/testoutput11-8 (from rev 835, code/branches/pcre16/testdata/testoutput11-8)
===================================================================
--- code/trunk/testdata/testoutput11-8                            (rev 0)
+++ code/trunk/testdata/testoutput11-8    2011-12-28 17:16:11 UTC (rev 836)
@@ -0,0 +1,713 @@
+/-- These are a few representative patterns whose lengths and offsets are to be 
+shown when the link size is 2. This is just a doublecheck test to ensure the 
+sizes don't go horribly wrong when something is changed. The pattern contents 
+are all themselves checked in other tests. Unicode, including property support, 
+is required for these tests. --/
+
+/((?i)b)/BM
+Memory allocation (code space): 17
+------------------------------------------------------------------
+  0  13 Bra
+  3   7 CBra 1
+  8  /i b
+ 10   7 Ket
+ 13  13 Ket
+ 16     End
+------------------------------------------------------------------
+
+/(?s)(.*X|^B)/BM
+Memory allocation (code space): 25
+------------------------------------------------------------------
+  0  21 Bra
+  3   9 CBra 1
+  8     AllAny*
+ 10     X
+ 12   6 Alt
+ 15     ^
+ 16     B
+ 18  15 Ket
+ 21  21 Ket
+ 24     End
+------------------------------------------------------------------
+
+/(?s:.*X|^B)/BM
+Memory allocation (code space): 23
+------------------------------------------------------------------
+  0  19 Bra
+  3   7 Bra
+  6     AllAny*
+  8     X
+ 10   6 Alt
+ 13     ^
+ 14     B
+ 16  13 Ket
+ 19  19 Ket
+ 22     End
+------------------------------------------------------------------
+
+/^[[:alnum:]]/BM
+Memory allocation (code space): 41
+------------------------------------------------------------------
+  0  37 Bra
+  3     ^
+  4     [0-9A-Za-z]
+ 37  37 Ket
+ 40     End
+------------------------------------------------------------------
+
+/#/IxMD
+Memory allocation (code space): 7
+------------------------------------------------------------------
+  0   3 Bra
+  3   3 Ket
+  6     End
+------------------------------------------------------------------
+Capturing subpattern count = 0
+Options: extended
+No first char
+No need char
+
+/a#/IxMD
+Memory allocation (code space): 9
+------------------------------------------------------------------
+  0   5 Bra
+  3     a
+  5   5 Ket
+  8     End
+------------------------------------------------------------------
+Capturing subpattern count = 0
+Options: extended
+First char = 'a'
+No need char
+
+/x?+/BM
+Memory allocation (code space): 9
+------------------------------------------------------------------
+  0   5 Bra
+  3     x?+
+  5   5 Ket
+  8     End
+------------------------------------------------------------------
+
+/x++/BM
+Memory allocation (code space): 9
+------------------------------------------------------------------
+  0   5 Bra
+  3     x++
+  5   5 Ket
+  8     End
+------------------------------------------------------------------
+
+/x{1,3}+/BM 
+Memory allocation (code space): 19
+------------------------------------------------------------------
+  0  15 Bra
+  3   9 Once
+  6     x
+  8     x{0,2}
+ 12   9 Ket
+ 15  15 Ket
+ 18     End
+------------------------------------------------------------------
+
+/(x)*+/BM
+Memory allocation (code space): 18
+------------------------------------------------------------------
+  0  14 Bra
+  3     Braposzero
+  4   7 CBraPos 1
+  9     x
+ 11   7 KetRpos
+ 14  14 Ket
+ 17     End
+------------------------------------------------------------------
+
+/^((a+)(?U)([ab]+)(?-U)([bc]+)(\w*))/BM
+Memory allocation (code space): 120
+------------------------------------------------------------------
+  0 116 Bra
+  3     ^
+  4 109 CBra 1
+  9   7 CBra 2
+ 14     a+
+ 16   7 Ket
+ 19  39 CBra 3
+ 24     [ab]+?
+ 58  39 Ket
+ 61  39 CBra 4
+ 66     [bc]+
+100  39 Ket
+103   7 CBra 5
+108     \w*
+110   7 Ket
+113 109 Ket
+116 116 Ket
+119     End
+------------------------------------------------------------------
+
+|8J\$WE\<\.rX\+ix\[d1b\!H\#\?vV0vrK\:ZH1\=2M\>iV\;\?aPhFB\<\*vW\@QW\@sO9\}cfZA\-i\'w\%hKd6gt1UJP\,15_\#QY\$M\^Mss_U\/\]\&LK9\[5vQub\^w\[KDD\<EjmhUZ\?\.akp2dF\>qmj\;2\}YWFdYx\.Ap\]hjCPTP\(n28k\+3\;o\&WXqs\/gOXdr\$\:r\'do0\;b4c\(f_Gr\=\"\\4\)\[01T7ajQJvL\$W\~mL_sS\/4h\:x\*\[ZN\=KLs\&L5zX\/\/\>it\,o\:aU\(\;Z\>pW\&T7oP\'2K\^E\:x9\'c\[\%z\-\,64JQ5AeH_G\#KijUKghQw\^\\vea3a\?kka_G\$8\#\`\*kynsxzBLru\'\]k_\[7FrVx\}\^\=\$blx\>s\-N\%j\;D\*aZDnsw\:YKZ\%Q\.Kne9\#hP\?\+b3\(SOvL\,\^\;\&u5\@\?5C5Bhb\=m\-vEh_L15Jl\]U\)0RP6\{q\%L\^_z5E\'Dw6X\b|BM
+Memory allocation (code space): 826
+------------------------------------------------------------------
+  0 822 Bra
+  3     8J$WE<.rX+ix[d1b!H#?vV0vrK:ZH1=2M>iV;?aPhFB<*vW@QW@sO9}cfZA-i'w%hKd6gt1UJP,15_#QY$M^Mss_U/]&LK9[5vQub^w[KDD<EjmhUZ?.akp2dF>qmj;2}YWFdYx.Ap]hjCPTP(n28k+3;o&WXqs/gOXdr$:r'do0;b4c(f_Gr="\4)[01T7ajQJvL$W~mL_sS/4h:x*[ZN=KLs&L5zX//>it,o:aU(;Z>pW&T7oP'2K^E:x9'c[%z-,64JQ5AeH_G#KijUKghQw^\vea3a?kka_G$8#`*kynsxzBLru']k_[7FrVx}^=$blx>s-N%j;D*aZDnsw:YKZ%Q.Kne9#hP?+b3(SOvL,^;&u5@?5C5Bhb=m-vEh_L15Jl]U)0RP6{q%L^_z5E'Dw6X
+821     \b
+822 822 Ket
+825     End
+------------------------------------------------------------------
+
+|\$\<\.X\+ix\[d1b\!H\#\?vV0vrK\:ZH1\=2M\>iV\;\?aPhFB\<\*vW\@QW\@sO9\}cfZA\-i\'w\%hKd6gt1UJP\,15_\#QY\$M\^Mss_U\/\]\&LK9\[5vQub\^w\[KDD\<EjmhUZ\?\.akp2dF\>qmj\;2\}YWFdYx\.Ap\]hjCPTP\(n28k\+3\;o\&WXqs\/gOXdr\$\:r\'do0\;b4c\(f_Gr\=\"\\4\)\[01T7ajQJvL\$W\~mL_sS\/4h\:x\*\[ZN\=KLs\&L5zX\/\/\>it\,o\:aU\(\;Z\>pW\&T7oP\'2K\^E\:x9\'c\[\%z\-\,64JQ5AeH_G\#KijUKghQw\^\\vea3a\?kka_G\$8\#\`\*kynsxzBLru\'\]k_\[7FrVx\}\^\=\$blx\>s\-N\%j\;D\*aZDnsw\:YKZ\%Q\.Kne9\#hP\?\+b3\(SOvL\,\^\;\&u5\@\?5C5Bhb\=m\-vEh_L15Jl\]U\)0RP6\{q\%L\^_z5E\'Dw6X\b|BM
+Memory allocation (code space): 816
+------------------------------------------------------------------
+  0 812 Bra
+  3     $<.X+ix[d1b!H#?vV0vrK:ZH1=2M>iV;?aPhFB<*vW@QW@sO9}cfZA-i'w%hKd6gt1UJP,15_#QY$M^Mss_U/]&LK9[5vQub^w[KDD<EjmhUZ?.akp2dF>qmj;2}YWFdYx.Ap]hjCPTP(n28k+3;o&WXqs/gOXdr$:r'do0;b4c(f_Gr="\4)[01T7ajQJvL$W~mL_sS/4h:x*[ZN=KLs&L5zX//>it,o:aU(;Z>pW&T7oP'2K^E:x9'c[%z-,64JQ5AeH_G#KijUKghQw^\vea3a?kka_G$8#`*kynsxzBLru']k_[7FrVx}^=$blx>s-N%j;D*aZDnsw:YKZ%Q.Kne9#hP?+b3(SOvL,^;&u5@?5C5Bhb=m-vEh_L15Jl]U)0RP6{q%L^_z5E'Dw6X
+811     \b
+812 812 Ket
+815     End
+------------------------------------------------------------------
+
+/(a(?1)b)/BM
+Memory allocation (code space): 22
+------------------------------------------------------------------
+  0  18 Bra
+  3  12 CBra 1
+  8     a
+ 10   3 Recurse
+ 13     b
+ 15  12 Ket
+ 18  18 Ket
+ 21     End
+------------------------------------------------------------------
+
+/(a(?1)+b)/BM
+Memory allocation (code space): 28
+------------------------------------------------------------------
+  0  24 Bra
+  3  18 CBra 1
+  8     a
+ 10   6 Once
+ 13   3 Recurse
+ 16   6 KetRmax
+ 19     b
+ 21  18 Ket
+ 24  24 Ket
+ 27     End
+------------------------------------------------------------------
+
+/a(?P<name1>b|c)d(?P<longername2>e)/BM
+Memory allocation (code space): 36
+------------------------------------------------------------------
+  0  32 Bra
+  3     a
+  5   7 CBra 1
+ 10     b
+ 12   5 Alt
+ 15     c
+ 17  12 Ket
+ 20     d
+ 22   7 CBra 2
+ 27     e
+ 29   7 Ket
+ 32  32 Ket
+ 35     End
+------------------------------------------------------------------
+
+/(?:a(?P<c>c(?P<d>d)))(?P<a>a)/BM
+Memory allocation (code space): 45
+------------------------------------------------------------------
+  0  41 Bra
+  3  25 Bra
+  6     a
+  8  17 CBra 1
+ 13     c
+ 15   7 CBra 2
+ 20     d
+ 22   7 Ket
+ 25  17 Ket
+ 28  25 Ket
+ 31   7 CBra 3
+ 36     a
+ 38   7 Ket
+ 41  41 Ket
+ 44     End
+------------------------------------------------------------------
+
+/(?P<a>a)...(?P=a)bbb(?P>a)d/BM
+Memory allocation (code space): 34
+------------------------------------------------------------------
+  0  30 Bra
+  3   7 CBra 1
+  8     a
+ 10   7 Ket
+ 13     Any
+ 14     Any
+ 15     Any
+ 16     \1
+ 19     bbb
+ 25   3 Recurse
+ 28     d
+ 30  30 Ket
+ 33     End
+------------------------------------------------------------------
+
+/abc(?C255)de(?C)f/BM
+Memory allocation (code space): 31
+------------------------------------------------------------------
+  0  27 Bra
+  3     abc
+  9     Callout 255 10 1
+ 15     de
+ 19     Callout 0 16 1
+ 25     f
+ 27  27 Ket
+ 30     End
+------------------------------------------------------------------
+
+/abcde/CBM
+Memory allocation (code space): 53
+------------------------------------------------------------------
+  0  49 Bra
+  3     Callout 255 0 1
+  9     a
+ 11     Callout 255 1 1
+ 17     b
+ 19     Callout 255 2 1
+ 25     c
+ 27     Callout 255 3 1
+ 33     d
+ 35     Callout 255 4 1
+ 41     e
+ 43     Callout 255 5 0
+ 49  49 Ket
+ 52     End
+------------------------------------------------------------------
+
+/\x{100}/8BM
+Memory allocation (code space): 10
+------------------------------------------------------------------
+  0   6 Bra
+  3     \x{100}
+  6   6 Ket
+  9     End
+------------------------------------------------------------------
+
+/\x{1000}/8BM
+Memory allocation (code space): 11
+------------------------------------------------------------------
+  0   7 Bra
+  3     \x{1000}
+  7   7 Ket
+ 10     End
+------------------------------------------------------------------
+
+/\x{10000}/8BM
+Memory allocation (code space): 12
+------------------------------------------------------------------
+  0   8 Bra
+  3     \x{10000}
+  8   8 Ket
+ 11     End
+------------------------------------------------------------------
+
+/\x{100000}/8BM
+Memory allocation (code space): 12
+------------------------------------------------------------------
+  0   8 Bra
+  3     \x{100000}
+  8   8 Ket
+ 11     End
+------------------------------------------------------------------
+
+/\x{10ffff}/8BM
+Memory allocation (code space): 12
+------------------------------------------------------------------
+  0   8 Bra
+  3     \x{10ffff}
+  8   8 Ket
+ 11     End
+------------------------------------------------------------------
+
+/\x{110000}/8BM
+Failed: character value in \x{...} sequence is too large at offset 9
+
+/[\x{ff}]/8BM
+Memory allocation (code space): 10
+------------------------------------------------------------------
+  0   6 Bra
+  3     \x{ff}
+  6   6 Ket
+  9     End
+------------------------------------------------------------------
+
+/[\x{100}]/8BM
+Memory allocation (code space): 10
+------------------------------------------------------------------
+  0   6 Bra
+  3     \x{100}
+  6   6 Ket
+  9     End
+------------------------------------------------------------------
+
+/\x80/8BM
+Memory allocation (code space): 10
+------------------------------------------------------------------
+  0   6 Bra
+  3     \x{80}
+  6   6 Ket
+  9     End
+------------------------------------------------------------------
+
+/\xff/8BM
+Memory allocation (code space): 10
+------------------------------------------------------------------
+  0   6 Bra
+  3     \x{ff}
+  6   6 Ket
+  9     End
+------------------------------------------------------------------
+
+/\x{0041}\x{2262}\x{0391}\x{002e}/D8M
+Memory allocation (code space): 18
+------------------------------------------------------------------
+  0  14 Bra
+  3     A\x{2262}\x{391}.
+ 14  14 Ket
+ 17     End
+------------------------------------------------------------------
+Capturing subpattern count = 0
+Options: utf
+First char = 'A'
+Need char = '.'
+    
+/\x{D55c}\x{ad6d}\x{C5B4}/D8M 
+Memory allocation (code space): 19
+------------------------------------------------------------------
+  0  15 Bra
+  3     \x{d55c}\x{ad6d}\x{c5b4}
+ 15  15 Ket
+ 18     End
+------------------------------------------------------------------
+Capturing subpattern count = 0
+Options: utf
+First char = \x{ed}
+Need char = \x{b4}
+
+/\x{65e5}\x{672c}\x{8a9e}/D8M
+Memory allocation (code space): 19
+------------------------------------------------------------------
+  0  15 Bra
+  3     \x{65e5}\x{672c}\x{8a9e}
+ 15  15 Ket
+ 18     End
+------------------------------------------------------------------
+Capturing subpattern count = 0
+Options: utf
+First char = \x{e6}
+Need char = \x{9e}
+
+/[\x{100}]/8BM
+Memory allocation (code space): 10
+------------------------------------------------------------------
+  0   6 Bra
+  3     \x{100}
+  6   6 Ket
+  9     End
+------------------------------------------------------------------
+
+/[Z\x{100}]/8BM
+Memory allocation (code space): 47
+------------------------------------------------------------------
+  0  43 Bra
+  3     [Z\x{100}]
+ 43  43 Ket
+ 46     End
+------------------------------------------------------------------
+
+/^[\x{100}\E-\Q\E\x{150}]/B8M
+Memory allocation (code space): 18
+------------------------------------------------------------------
+  0  14 Bra
+  3     ^
+  4     [\x{100}-\x{150}]
+ 14  14 Ket
+ 17     End
+------------------------------------------------------------------
+
+/^[\QĀ\E-\QŐ\E]/B8M
+Memory allocation (code space): 18
+------------------------------------------------------------------
+  0  14 Bra
+  3     ^
+  4     [\x{100}-\x{150}]
+ 14  14 Ket
+ 17     End
+------------------------------------------------------------------
+
+/^[\QĀ\E-\QŐ\E/B8M
+Failed: missing terminating ] for character class at offset 15
+
+/[\p{L}]/BM
+Memory allocation (code space): 15
+------------------------------------------------------------------
+  0  11 Bra
+  3     [\p{L}]
+ 11  11 Ket
+ 14     End
+------------------------------------------------------------------
+
+/[\p{^L}]/BM
+Memory allocation (code space): 15
+------------------------------------------------------------------
+  0  11 Bra
+  3     [\P{L}]
+ 11  11 Ket
+ 14     End
+------------------------------------------------------------------
+
+/[\P{L}]/BM
+Memory allocation (code space): 15
+------------------------------------------------------------------
+  0  11 Bra
+  3     [\P{L}]
+ 11  11 Ket
+ 14     End
+------------------------------------------------------------------
+
+/[\P{^L}]/BM
+Memory allocation (code space): 15
+------------------------------------------------------------------
+  0  11 Bra
+  3     [\p{L}]
+ 11  11 Ket
+ 14     End
+------------------------------------------------------------------
+
+/[abc\p{L}\x{0660}]/8BM
+Memory allocation (code space): 50
+------------------------------------------------------------------
+  0  46 Bra
+  3     [a-c\p{L}\x{660}]
+ 46  46 Ket
+ 49     End
+------------------------------------------------------------------
+
+/[\p{Nd}]/8BM
+Memory allocation (code space): 15
+------------------------------------------------------------------
+  0  11 Bra
+  3     [\p{Nd}]
+ 11  11 Ket
+ 14     End
+------------------------------------------------------------------
+
+/[\p{Nd}+-]+/8BM
+Memory allocation (code space): 48
+------------------------------------------------------------------
+  0  44 Bra
+  3     [+\-\p{Nd}]+
+ 44  44 Ket
+ 47     End
+------------------------------------------------------------------
+
+/A\x{391}\x{10427}\x{ff3a}\x{1fb0}/8iBM
+Memory allocation (code space): 25
+------------------------------------------------------------------
+  0  21 Bra
+  3  /i A\x{391}\x{10427}\x{ff3a}\x{1fb0}
+ 21  21 Ket
+ 24     End
+------------------------------------------------------------------
+
+/A\x{391}\x{10427}\x{ff3a}\x{1fb0}/8BM
+Memory allocation (code space): 25
+------------------------------------------------------------------
+  0  21 Bra
+  3     A\x{391}\x{10427}\x{ff3a}\x{1fb0}
+ 21  21 Ket
+ 24     End
+------------------------------------------------------------------
+
+/[\x{105}-\x{109}]/8iBM
+Memory allocation (code space): 17
+------------------------------------------------------------------
+  0  13 Bra
+  3     [\x{104}-\x{109}]
+ 13  13 Ket
+ 16     End
+------------------------------------------------------------------
+
+/( ( (?(1)0|) )*   )/xBM
+Memory allocation (code space): 38
+------------------------------------------------------------------
+  0  34 Bra
+  3  28 CBra 1
+  8     Brazero
+  9  19 SCBra 2
+ 14   8 Cond
+ 17   1 Cond ref
+ 20     0
+ 22   3 Alt
+ 25  11 Ket
+ 28  19 KetRmax
+ 31  28 Ket
+ 34  34 Ket
+ 37     End
+------------------------------------------------------------------
+
+/(  (?(1)0|)*   )/xBM
+Memory allocation (code space): 30
+------------------------------------------------------------------
+  0  26 Bra
+  3  20 CBra 1
+  8     Brazero
+  9   8 SCond
+ 12   1 Cond ref
+ 15     0
+ 17   3 Alt
+ 20  11 KetRmax
+ 23  20 Ket
+ 26  26 Ket
+ 29     End
+------------------------------------------------------------------
+
+/[a]/BM
+Memory allocation (code space): 9
+------------------------------------------------------------------
+  0   5 Bra
+  3     a
+  5   5 Ket
+  8     End
+------------------------------------------------------------------
+
+/[a]/8BM
+Memory allocation (code space): 9
+------------------------------------------------------------------
+  0   5 Bra
+  3     a
+  5   5 Ket
+  8     End
+------------------------------------------------------------------
+
+/[\xaa]/BM
+Memory allocation (code space): 9
+------------------------------------------------------------------
+  0   5 Bra
+  3     \xaa
+  5   5 Ket
+  8     End
+------------------------------------------------------------------
+
+/[\xaa]/8BM
+Memory allocation (code space): 10
+------------------------------------------------------------------
+  0   6 Bra
+  3     \x{aa}
+  6   6 Ket
+  9     End
+------------------------------------------------------------------
+
+/[^a]/BM
+Memory allocation (code space): 9
+------------------------------------------------------------------
+  0   5 Bra
+  3     [^a]
+  5   5 Ket
+  8     End
+------------------------------------------------------------------
+
+/[^a]/8BM
+Memory allocation (code space): 9
+------------------------------------------------------------------
+  0   5 Bra
+  3     [^a]
+  5   5 Ket
+  8     End
+------------------------------------------------------------------
+
+/[^\xaa]/BM
+Memory allocation (code space): 9
+------------------------------------------------------------------
+  0   5 Bra
+  3     [^\xaa]
+  5   5 Ket
+  8     End
+------------------------------------------------------------------
+
+/[^\xaa]/8BM
+Memory allocation (code space): 40
+------------------------------------------------------------------
+  0  36 Bra
+  3     [\x00-\xa9\xab-\xff] (neg)
+ 36  36 Ket
+ 39     End
+------------------------------------------------------------------
+
+/[^\d]/8WB
+------------------------------------------------------------------
+  0  11 Bra
+  3     [^\p{Nd}]
+ 11  11 Ket
+ 14     End
+------------------------------------------------------------------
+
+/[[:^alpha:][:^cntrl:]]+/8WB
+------------------------------------------------------------------
+  0  44 Bra
+  3     [ -~\x80-\xff\P{L}]+
+ 44  44 Ket
+ 47     End
+------------------------------------------------------------------
+
+/[[:^cntrl:][:^alpha:]]+/8WB
+------------------------------------------------------------------
+  0  44 Bra
+  3     [ -~\x80-\xff\P{L}]+
+ 44  44 Ket
+ 47     End
+------------------------------------------------------------------
+
+/[[:alpha:]]+/8WB
+------------------------------------------------------------------
+  0  12 Bra
+  3     [\p{L}]+
+ 12  12 Ket
+ 15     End
+------------------------------------------------------------------
+
+/[[:^alpha:]\S]+/8WB
+------------------------------------------------------------------
+  0  15 Bra
+  3     [\P{L}\P{Xsp}]+
+ 15  15 Ket
+ 18     End
+------------------------------------------------------------------
+
+/abc(d|e)(*THEN)x(123(*THEN)4|567(b|q)(*THEN)xx)/B
+------------------------------------------------------------------
+  0  73 Bra
+  3     abc
+  9   7 CBra 1
+ 14     d
+ 16   5 Alt
+ 19     e
+ 21  12 Ket
+ 24     *THEN
+ 25     x
+ 27  14 CBra 2
+ 32     123
+ 38     *THEN
+ 39     4
+ 41  29 Alt
+ 44     567
+ 50   7 CBra 3
+ 55     b
+ 57   5 Alt
+ 60     q
+ 62  12 Ket
+ 65     *THEN
+ 66     xx
+ 70  43 Ket
+ 73  73 Ket
+ 76     End
+------------------------------------------------------------------
+
+/-- End of testinput11 --/

Modified: code/trunk/testdata/testoutput12
===================================================================
--- code/trunk/testdata/testoutput12    2011-12-28 16:10:09 UTC (rev 835)
+++ code/trunk/testdata/testoutput12    2011-12-28 17:16:11 UTC (rev 836)
@@ -1,11 +1,51 @@
-/a(*:a\x{1234}b)/8K
-    abc
- 0: a
-MK: a\x{1234}b
+/-- This test is run only when JIT support is available. It checks for a
+successful and an unsuccessful JIT compile and save and restore behaviour,
+and a couple of things that are different with JIT. --/

-/a(*:a£b)/8K 
+/abc/S+I
+Capturing subpattern count = 0
+No options
+First char = 'a'
+Need char = 'c'
+Subject length lower bound = 3
+No set of starting bytes
+JIT study was successful
+
+/ab(*COMMIT)/S+I
+Capturing subpattern count = 0
+No options
+First char = 'a'
+Need char = 'b'
+Subject length lower bound = 2
+No set of starting bytes
+JIT study was not successful
+
+/abc/S+I>testsavedregex
+Capturing subpattern count = 0
+No options
+First char = 'a'
+Need char = 'c'
+Subject length lower bound = 3
+No set of starting bytes
+JIT study was successful
+Compiled pattern written to testsavedregex
+Study data written to testsavedregex
+
+<testsavedregex
+Compiled pattern loaded from testsavedregex
+Study data loaded from testsavedregex
     abc
- 0: a
-MK: a£b
+ 0: abc

+/a*/SI
+Capturing subpattern count = 0
+No options
+No first char
+No need char
+Study returned NULL
+
+/(?(R)a*(?1)|((?R))b)/S+
+    aaaabcde
+Error -27 (JIT stack limit reached)
+
 /-- End of testinput12 --/

Modified: code/trunk/testdata/testoutput13
===================================================================
--- code/trunk/testdata/testoutput13    2011-12-28 16:10:09 UTC (rev 835)
+++ code/trunk/testdata/testoutput13    2011-12-28 17:16:11 UTC (rev 836)
@@ -1,1281 +1,21 @@
-/-- These tests for Unicode property support test PCRE's API and show some of
-    the compiled code. They are not Perl-compatible. --/
-
-/[\p{L}]/DZ
-------------------------------------------------------------------
-        Bra
-        [\p{L}]
-        Ket
-        End
-------------------------------------------------------------------
+/-- This test is run only when JIT support is not available. It checks that an 
+attempt to use it has the expected behaviour. It also tests things that
+are different without JIT. --/
+   
+/abc/S+I
 Capturing subpattern count = 0
 No options
-No first char
-No need char
+First char = 'a'
+Need char = 'c'
+Subject length lower bound = 3
+No set of starting bytes
+JIT support is not available in this version of PCRE

-/[\p{^L}]/DZ
-------------------------------------------------------------------
-        Bra
-        [\P{L}]
-        Ket
-        End
-------------------------------------------------------------------
+/a*/SI
 Capturing subpattern count = 0
 No options
 No first char
 No need char
+Study returned NULL

-/[\P{L}]/DZ
-------------------------------------------------------------------
-        Bra
-        [\P{L}]
-        Ket
-        End
-------------------------------------------------------------------
-Capturing subpattern count = 0
-No options
-No first char
-No need char
-
-/[\P{^L}]/DZ
-------------------------------------------------------------------
-        Bra
-        [\p{L}]
-        Ket
-        End
-------------------------------------------------------------------
-Capturing subpattern count = 0
-No options
-No first char
-No need char
-
-/[abc\p{L}\x{0660}]/8DZ
-------------------------------------------------------------------
-        Bra
-        [a-c\p{L}\x{660}]
-        Ket
-        End
-------------------------------------------------------------------
-Capturing subpattern count = 0
-Options: utf8
-No first char
-No need char
-
-/[\p{Nd}]/8DZ
-------------------------------------------------------------------
-        Bra
-        [\p{Nd}]
-        Ket
-        End
-------------------------------------------------------------------
-Capturing subpattern count = 0
-Options: utf8
-No first char
-No need char
-    1234
- 0: 1
-
-/[\p{Nd}+-]+/8DZ
-------------------------------------------------------------------
-        Bra
-        [+\-\p{Nd}]+
-        Ket
-        End
-------------------------------------------------------------------
-Capturing subpattern count = 0
-Options: utf8
-No first char
-No need char
-    1234
- 0: 1234
-    12-34
- 0: 12-34
-    12+\x{661}-34  
- 0: 12+\x{661}-34
-    ** Failers
-No match
-    abcd  
-No match
-
-/A\x{391}\x{10427}\x{ff3a}\x{1fb0}/8iDZ
-------------------------------------------------------------------
-        Bra
-     /i A\x{391}\x{10427}\x{ff3a}\x{1fb0}
-        Ket
-        End
-------------------------------------------------------------------
-Capturing subpattern count = 0
-Options: caseless utf8
-First char = 'A' (caseless)
-No need char
-
-/A\x{391}\x{10427}\x{ff3a}\x{1fb0}/8DZ
-------------------------------------------------------------------
-        Bra
-        A\x{391}\x{10427}\x{ff3a}\x{1fb0}
-        Ket
-        End
-------------------------------------------------------------------
-Capturing subpattern count = 0
-Options: utf8
-First char = 'A'
-Need char = 176
-
-/AB\x{1fb0}/8DZ
-------------------------------------------------------------------
-        Bra
-        AB\x{1fb0}
-        Ket
-        End
-------------------------------------------------------------------
-Capturing subpattern count = 0
-Options: utf8
-First char = 'A'
-Need char = 176
-
-/AB\x{1fb0}/8DZi
-------------------------------------------------------------------
-        Bra
-     /i AB\x{1fb0}
-        Ket
-        End
-------------------------------------------------------------------
-Capturing subpattern count = 0
-Options: caseless utf8
-First char = 'A' (caseless)
-Need char = 'B' (caseless)
-
-/[\x{105}-\x{109}]/8iDZ
-------------------------------------------------------------------
-        Bra
-        [\x{104}-\x{109}]
-        Ket
-        End
-------------------------------------------------------------------
-Capturing subpattern count = 0
-Options: caseless utf8
-No first char
-No need char
-    \x{104}
- 0: \x{104}
-    \x{105}
- 0: \x{105}
-    \x{109}  
- 0: \x{109}
-    ** Failers
-No match
-    \x{100}
-No match
-    \x{10a} 
-No match
-    
-/[z-\x{100}]/8iDZ 
-------------------------------------------------------------------
-        Bra
-        [Z\x{39c}\x{178}z-\x{101}]
-        Ket
-        End
-------------------------------------------------------------------
-Capturing subpattern count = 0
-Options: caseless utf8
-No first char
-No need char
-    Z
- 0: Z
-    z
- 0: z
-    \x{39c}
- 0: \x{39c}
-    \x{178}
- 0: \x{178}
-    |
- 0: |
-    \x{80}
- 0: \x{80}
-    \x{ff}
- 0: \x{ff}
-    \x{100}
- 0: \x{100}
-    \x{101} 
- 0: \x{101}
-    ** Failers
-No match
-    \x{102}
-No match
-    Y
-No match
-    y           
-No match
-
-/[z-\x{100}]/8DZi
-------------------------------------------------------------------
-        Bra
-        [Z\x{39c}\x{178}z-\x{101}]
-        Ket
-        End
-------------------------------------------------------------------
-Capturing subpattern count = 0
-Options: caseless utf8
-No first char
-No need char
-
-/(?:[\PPa*]*){8,}/
-
-/[\P{Any}]/BZ
-------------------------------------------------------------------
-        Bra
-        [\P{Any}]
-        Ket
-        End
-------------------------------------------------------------------
-
-/[\P{Any}\E]/BZ
-------------------------------------------------------------------
-        Bra
-        [\P{Any}]
-        Ket
-        End
-------------------------------------------------------------------
-
-/(\P{Yi}+\277)/
-
-/(\P{Yi}+\277)?/
-
-/(?<=\P{Yi}{3}A)X/
-
-/\p{Yi}+(\P{Yi}+)(?1)/
-
-/(\P{Yi}{2}\277)?/
-
-/[\P{Yi}A]/
-
-/[\P{Yi}\P{Yi}\P{Yi}A]/
-
-/[^\P{Yi}A]/
-
-/[^\P{Yi}\P{Yi}\P{Yi}A]/
-
-/(\P{Yi}*\277)*/
-
-/(\P{Yi}*?\277)*/
-
-/(\p{Yi}*+\277)*/
-
-/(\P{Yi}?\277)*/
-
-/(\P{Yi}??\277)*/
-
-/(\p{Yi}?+\277)*/
-
-/(\P{Yi}{0,3}\277)*/
-
-/(\P{Yi}{0,3}?\277)*/
-
-/(\p{Yi}{0,3}+\277)*/
-
-/\p{Zl}{2,3}+/8BZ
-------------------------------------------------------------------
-        Bra
-        prop Zl {2}
-        prop Zl ?+
-        Ket
-        End
-------------------------------------------------------------------
-    \xe2\x80\xa8\xe2\x80\xa8
- 0: \x{2028}\x{2028}
-    \x{2028}\x{2028}\x{2028}
- 0: \x{2028}\x{2028}\x{2028}
-    
-/\p{Zl}/8BZ
-------------------------------------------------------------------
-        Bra
-        prop Zl
-        Ket
-        End
-------------------------------------------------------------------
-
-/\p{Lu}{3}+/8BZ
-------------------------------------------------------------------
-        Bra
-        prop Lu {3}
-        Ket
-        End
-------------------------------------------------------------------
-
-/\pL{2}+/8BZ
-------------------------------------------------------------------
-        Bra
-        prop L {2}
-        Ket
-        End
-------------------------------------------------------------------
-
-/\p{Cc}{2}+/8BZ
-------------------------------------------------------------------
-        Bra
-        prop Cc {2}
-        Ket
-        End
-------------------------------------------------------------------
-
-/^\p{Cs}/8
-    \?\x{dfff}
- 0: \x{dfff}
-    ** Failers
-No match
-    \x{09f} 
-No match
-  
-/^\p{Sc}+/8
-    $\x{a2}\x{a3}\x{a4}\x{a5}\x{a6}
- 0: $\x{a2}\x{a3}\x{a4}\x{a5}
-    \x{9f2}
- 0: \x{9f2}
-    ** Failers
-No match
-    X
-No match
-    \x{2c2}
-No match
-  
-/^\p{Zs}/8
-    \ \
- 0:  
-    \x{a0}
- 0: \x{a0}
-    \x{1680}
- 0: \x{1680}
-    \x{180e}
- 0: \x{180e}
-    \x{2000}
- 0: \x{2000}
-    \x{2001}     
- 0: \x{2001}
-    ** Failers
-No match
-    \x{2028}
-No match
-    \x{200d} 
-No match
-  
-/-- These four are here rather than in test 6 because Perl has problems with
-    the negative versions of the properties. --/
-      
-/\p{^Lu}/8i
-    1234
- 0: 1
-    ** Failers
- 0: *
-    ABC 
-No match
-
-/\P{Lu}/8i
-    1234
- 0: 1
-    ** Failers
- 0: *
-    ABC 
-No match
-
-/\p{Ll}/8i 
-    a
- 0: a
-    Az
- 0: z
-    ** Failers
- 0: a
-    ABC   
-No match
-
-/\p{Lu}/8i
-    A
- 0: A
-    a\x{10a0}B 
- 0: \x{10a0}
-    ** Failers 
- 0: F
-    a
-No match
-    \x{1d00}  
-No match
-
-/[\x{c0}\x{391}]/8i
-    \x{c0}
- 0: \x{c0}
-    \x{e0} 
- 0: \x{e0}
-
-/-- The next two are special cases where the lengths of the different cases of
-the same character differ. The first went wrong with heap frame storage; the
-second was broken in all cases. --/
-
-/^\x{023a}+?(\x{0130}+)/8i
-  \x{023a}\x{2c65}\x{0130}
- 0: \x{23a}\x{2c65}\x{130}
- 1: \x{130}
-  
-/^\x{023a}+([^X])/8i
-  \x{023a}\x{2c65}X
- 0: \x{23a}\x{2c65}
- 1: \x{2c65}
-
-/\x{c0}+\x{116}+/8i
-    \x{c0}\x{e0}\x{116}\x{117}
- 0: \x{c0}\x{e0}\x{116}\x{117}
-
-/[\x{c0}\x{116}]+/8i
-    \x{c0}\x{e0}\x{116}\x{117}
- 0: \x{c0}\x{e0}\x{116}\x{117}
-
-/(\x{de})\1/8i
-    \x{de}\x{de}
- 0: \x{de}\x{de}
- 1: \x{de}
-    \x{de}\x{fe}
- 0: \x{de}\x{fe}
- 1: \x{de}
-    \x{fe}\x{fe}
- 0: \x{fe}\x{fe}
- 1: \x{fe}
-    \x{fe}\x{de}
- 0: \x{fe}\x{de}
- 1: \x{fe}
-
-/^\x{c0}$/8i
-    \x{c0}
- 0: \x{c0}
-    \x{e0} 
- 0: \x{e0}
-
-/^\x{e0}$/8i
-    \x{c0}
- 0: \x{c0}
-    \x{e0} 
- 0: \x{e0}
-
-/-- The next two should be Perl-compatible, but it fails to match \x{e0}. PCRE
-will match it only with UCP support, because without that it has no notion
-of case for anything other than the ASCII letters. --/ 
-
-/((?i)[\x{c0}])/8
-    \x{c0}
- 0: \x{c0}
- 1: \x{c0}
-    \x{e0} 
- 0: \x{e0}
- 1: \x{e0}
-
-/(?i:[\x{c0}])/8
-    \x{c0}
- 0: \x{c0}
-    \x{e0} 
- 0: \x{e0}
-
-/-- This should be Perl-compatible but Perl 5.11 gets \x{300} wrong. --/8
-    
-/^\X/8
-    A
- 0: A
-    A\x{300}BC 
- 0: A\x{300}
-    A\x{300}\x{301}\x{302}BC 
- 0: A\x{300}\x{301}\x{302}
-    *** Failers
- 0: *
-    \x{300}  
-No match
-    
-/-- These are PCRE's extra properties to help with Unicodizing \d etc. --/
-
-/^\p{Xan}/8
-    ABCD
- 0: A
-    1234
- 0: 1
-    \x{6ca}
- 0: \x{6ca}
-    \x{a6c}
- 0: \x{a6c}
-    \x{10a7}   
- 0: \x{10a7}
-    ** Failers
-No match
-    _ABC   
-No match
-
-/^\p{Xan}+/8
-    ABCD1234\x{6ca}\x{a6c}\x{10a7}_
- 0: ABCD1234\x{6ca}\x{a6c}\x{10a7}
-    ** Failers
-No match
-    _ABC   
-No match
-
-/^\p{Xan}+?/8
-    \x{6ca}\x{a6c}\x{10a7}_
- 0: \x{6ca}
-
-/^\p{Xan}*/8
-    ABCD1234\x{6ca}\x{a6c}\x{10a7}_
- 0: ABCD1234\x{6ca}\x{a6c}\x{10a7}
-    
-/^\p{Xan}{2,9}/8
-    ABCD1234\x{6ca}\x{a6c}\x{10a7}_
- 0: ABCD1234\x{6ca}
-    
-/^\p{Xan}{2,9}?/8
-    \x{6ca}\x{a6c}\x{10a7}_
- 0: \x{6ca}\x{a6c}
-    
-/^[\p{Xan}]/8
-    ABCD1234_
- 0: A
-    1234abcd_
- 0: 1
-    \x{6ca}
- 0: \x{6ca}
-    \x{a6c}
- 0: \x{a6c}
-    \x{10a7}   
- 0: \x{10a7}
-    ** Failers
-No match
-    _ABC   
-No match
- 
-/^[\p{Xan}]+/8
-    ABCD1234\x{6ca}\x{a6c}\x{10a7}_
- 0: ABCD1234\x{6ca}\x{a6c}\x{10a7}
-    ** Failers
-No match
-    _ABC   
-No match
-
-/^>\p{Xsp}/8
-    >\x{1680}\x{2028}\x{0b}
- 0: >\x{1680}
-    >\x{a0} 
- 0: >\x{a0}
-    ** Failers
-No match
-    \x{0b} 
-No match
-
-/^>\p{Xsp}+/8
-    > \x{09}\x{0a}\x{0c}\x{0d}\x{a0}\x{1680}\x{2028}\x{0b}
- 0: > \x{09}\x{0a}\x{0c}\x{0d}\x{a0}\x{1680}\x{2028}
-
-/^>\p{Xsp}+?/8
-    >\x{1680}\x{2028}\x{0b}
- 0: >\x{1680}
-
-/^>\p{Xsp}*/8
-    > \x{09}\x{0a}\x{0c}\x{0d}\x{a0}\x{1680}\x{2028}\x{0b}
- 0: > \x{09}\x{0a}\x{0c}\x{0d}\x{a0}\x{1680}\x{2028}
-    
-/^>\p{Xsp}{2,9}/8
-    > \x{09}\x{0a}\x{0c}\x{0d}\x{a0}\x{1680}\x{2028}\x{0b}
- 0: > \x{09}\x{0a}\x{0c}\x{0d}\x{a0}\x{1680}\x{2028}
-    
-/^>\p{Xsp}{2,9}?/8
-    > \x{09}\x{0a}\x{0c}\x{0d}\x{a0}\x{1680}\x{2028}\x{0b}
- 0: > \x{09}
-    
-/^>[\p{Xsp}]/8
-    >\x{2028}\x{0b}
- 0: >\x{2028}
- 
-/^>[\p{Xsp}]+/8
-    > \x{09}\x{0a}\x{0c}\x{0d}\x{a0}\x{1680}\x{2028}\x{0b}
- 0: > \x{09}\x{0a}\x{0c}\x{0d}\x{a0}\x{1680}\x{2028}
-
-/^>\p{Xps}/8
-    >\x{1680}\x{2028}\x{0b}
- 0: >\x{1680}
-    >\x{a0} 
- 0: >\x{a0}
-    ** Failers
-No match
-    \x{0b} 
-No match
-
-/^>\p{Xps}+/8
-    > \x{09}\x{0a}\x{0c}\x{0d}\x{a0}\x{1680}\x{2028}\x{0b}
- 0: > \x{09}\x{0a}\x{0c}\x{0d}\x{a0}\x{1680}\x{2028}\x{0b}
-
-/^>\p{Xps}+?/8
-    >\x{1680}\x{2028}\x{0b}
- 0: >\x{1680}
-
-/^>\p{Xps}*/8
-    > \x{09}\x{0a}\x{0c}\x{0d}\x{a0}\x{1680}\x{2028}\x{0b}
- 0: > \x{09}\x{0a}\x{0c}\x{0d}\x{a0}\x{1680}\x{2028}\x{0b}
-    
-/^>\p{Xps}{2,9}/8
-    > \x{09}\x{0a}\x{0c}\x{0d}\x{a0}\x{1680}\x{2028}\x{0b}
- 0: > \x{09}\x{0a}\x{0c}\x{0d}\x{a0}\x{1680}\x{2028}\x{0b}
-    
-/^>\p{Xps}{2,9}?/8
-    > \x{09}\x{0a}\x{0c}\x{0d}\x{a0}\x{1680}\x{2028}\x{0b}
- 0: > \x{09}
-    
-/^>[\p{Xps}]/8
-    >\x{2028}\x{0b}
- 0: >\x{2028}
- 
-/^>[\p{Xps}]+/8
-    > \x{09}\x{0a}\x{0c}\x{0d}\x{a0}\x{1680}\x{2028}\x{0b}
- 0: > \x{09}\x{0a}\x{0c}\x{0d}\x{a0}\x{1680}\x{2028}\x{0b}
-
-/^\p{Xwd}/8
-    ABCD
- 0: A
-    1234
- 0: 1
-    \x{6ca}
- 0: \x{6ca}
-    \x{a6c}
- 0: \x{a6c}
-    \x{10a7}
- 0: \x{10a7}
-    _ABC    
- 0: _
-    ** Failers
-No match
-    [] 
-No match
-
-/^\p{Xwd}+/8
-    ABCD1234\x{6ca}\x{a6c}\x{10a7}_
- 0: ABCD1234\x{6ca}\x{a6c}\x{10a7}_
-
-/^\p{Xwd}+?/8
-    \x{6ca}\x{a6c}\x{10a7}_
- 0: \x{6ca}
-
-/^\p{Xwd}*/8
-    ABCD1234\x{6ca}\x{a6c}\x{10a7}_
- 0: ABCD1234\x{6ca}\x{a6c}\x{10a7}_
-    
-/^\p{Xwd}{2,9}/8
-    A_B12\x{6ca}\x{a6c}\x{10a7}
- 0: A_B12\x{6ca}\x{a6c}\x{10a7}
-    
-/^\p{Xwd}{2,9}?/8
-    \x{6ca}\x{a6c}\x{10a7}_
- 0: \x{6ca}\x{a6c}
-    
-/^[\p{Xwd}]/8
-    ABCD1234_
- 0: A
-    1234abcd_
- 0: 1
-    \x{6ca}
- 0: \x{6ca}
-    \x{a6c}
- 0: \x{a6c}
-    \x{10a7}   
- 0: \x{10a7}
-    _ABC 
- 0: _
-    ** Failers
-No match
-    []   
-No match
- 
-/^[\p{Xwd}]+/8
-    ABCD1234\x{6ca}\x{a6c}\x{10a7}_
- 0: ABCD1234\x{6ca}\x{a6c}\x{10a7}_
-
-/-- A check not in UTF-8 mode --/
-
-/^[\p{Xwd}]+/
-    ABCD1234_
- 0: ABCD1234_
-    
-/-- Some negative checks --/
-
-/^[\P{Xwd}]+/8
-    !.+\x{019}\x{35a}AB
- 0: !.+\x{19}\x{35a}
-
-/^[\p{^Xwd}]+/8
-    !.+\x{019}\x{35a}AB
- 0: !.+\x{19}\x{35a}
-
-/[\D]/WBZ8
-------------------------------------------------------------------
-        Bra
-        [\P{Nd}]
-        Ket
-        End
-------------------------------------------------------------------
-    1\x{3c8}2
- 0: \x{3c8}
-
-/[\d]/WBZ8
-------------------------------------------------------------------
-        Bra
-        [\p{Nd}]
-        Ket
-        End
-------------------------------------------------------------------
-    >\x{6f4}<
- 0: \x{6f4}
-
-/[\S]/WBZ8
-------------------------------------------------------------------
-        Bra
-        [\P{Xsp}]
-        Ket
-        End
-------------------------------------------------------------------
-    \x{1680}\x{6f4}\x{1680}
- 0: \x{6f4}
-
-/[\s]/WBZ8
-------------------------------------------------------------------
-        Bra
-        [\p{Xsp}]
-        Ket
-        End
-------------------------------------------------------------------
-    >\x{1680}<
- 0: \x{1680}
-
-/[\W]/WBZ8
-------------------------------------------------------------------
-        Bra
-        [\P{Xwd}]
-        Ket
-        End
-------------------------------------------------------------------
-    A\x{1712}B
- 0: \x{1712}
-
-/[\w]/WBZ8
-------------------------------------------------------------------
-        Bra
-        [\p{Xwd}]
-        Ket
-        End
-------------------------------------------------------------------
-    >\x{1723}<
- 0: \x{1723}
-
-/\D/WBZ8
-------------------------------------------------------------------
-        Bra
-        notprop Nd
-        Ket
-        End
-------------------------------------------------------------------
-    1\x{3c8}2
- 0: \x{3c8}
-
-/\d/WBZ8
-------------------------------------------------------------------
-        Bra
-        prop Nd
-        Ket
-        End
-------------------------------------------------------------------
-    >\x{6f4}<
- 0: \x{6f4}
-
-/\S/WBZ8
-------------------------------------------------------------------
-        Bra
-        notprop Xsp
-        Ket
-        End
-------------------------------------------------------------------
-    \x{1680}\x{6f4}\x{1680}
- 0: \x{6f4}
-
-/\s/WBZ8
-------------------------------------------------------------------
-        Bra
-        prop Xsp
-        Ket
-        End
-------------------------------------------------------------------
-    >\x{1680}>
- 0: \x{1680}
-
-/\W/WBZ8
-------------------------------------------------------------------
-        Bra
-        notprop Xwd
-        Ket
-        End
-------------------------------------------------------------------
-    A\x{1712}B
- 0: \x{1712}
-
-/\w/WBZ8
-------------------------------------------------------------------
-        Bra
-        prop Xwd
-        Ket
-        End
-------------------------------------------------------------------
-    >\x{1723}<
- 0: \x{1723}
-
-/[[:alpha:]]/WBZ
-------------------------------------------------------------------
-        Bra
-        [\p{L}]
-        Ket
-        End
-------------------------------------------------------------------
-
-/[[:lower:]]/WBZ
-------------------------------------------------------------------
-        Bra
-        [\p{Ll}]
-        Ket
-        End
-------------------------------------------------------------------
-
-/[[:upper:]]/WBZ
-------------------------------------------------------------------
-        Bra
-        [\p{Lu}]
-        Ket
-        End
-------------------------------------------------------------------
-
-/[[:alnum:]]/WBZ
-------------------------------------------------------------------
-        Bra
-        [\p{Xan}]
-        Ket
-        End
-------------------------------------------------------------------
-
-/[[:ascii:]]/WBZ
-------------------------------------------------------------------
-        Bra
-        [\x00-\x7f]
-        Ket
-        End
-------------------------------------------------------------------
-
-/[[:blank:]]/WBZ
-------------------------------------------------------------------
-        Bra
-        [\x09 \xa0]
-        Ket
-        End
-------------------------------------------------------------------
-
-/[[:cntrl:]]/WBZ
-------------------------------------------------------------------
-        Bra
-        [\x00-\x1f\x7f]
-        Ket
-        End
-------------------------------------------------------------------
-
-/[[:digit:]]/WBZ
-------------------------------------------------------------------
-        Bra
-        [\p{Nd}]
-        Ket
-        End
-------------------------------------------------------------------
-
-/[[:graph:]]/WBZ
-------------------------------------------------------------------
-        Bra
-        [!-~]
-        Ket
-        End
-------------------------------------------------------------------
-
-/[[:print:]]/WBZ
-------------------------------------------------------------------
-        Bra
-        [ -~]
-        Ket
-        End
-------------------------------------------------------------------
-
-/[[:punct:]]/WBZ
-------------------------------------------------------------------
-        Bra
-        [!-/:-@[-`{-~]
-        Ket
-        End
-------------------------------------------------------------------
-
-/[[:space:]]/WBZ
-------------------------------------------------------------------
-        Bra
-        [\p{Xps}]
-        Ket
-        End
-------------------------------------------------------------------
-
-/[[:word:]]/WBZ
-------------------------------------------------------------------
-        Bra
-        [\p{Xwd}]
-        Ket
-        End
-------------------------------------------------------------------
-
-/[[:xdigit:]]/WBZ
-------------------------------------------------------------------
-        Bra
-        [0-9A-Fa-f]
-        Ket
-        End
-------------------------------------------------------------------
-
-/-- Unicode properties for \b abd \B --/
-
-/\b...\B/8W
-    abc_
- 0: abc
-    \x{37e}abc\x{376} 
- 0: abc
-    \x{37e}\x{376}\x{371}\x{393}\x{394} 
- 0: \x{376}\x{371}\x{393}
-    !\x{c0}++\x{c1}\x{c2} 
- 0: ++\x{c1}
-    !\x{c0}+++++ 
- 0: \x{c0}++
-
-/-- Without PCRE_UCP, non-ASCII always fail, even if < 256  --/
-
-/\b...\B/8
-    abc_
- 0: abc
-    ** Failers 
- 0: Fai
-    \x{37e}abc\x{376} 
-No match
-    \x{37e}\x{376}\x{371}\x{393}\x{394} 
-No match
-    !\x{c0}++\x{c1}\x{c2} 
-No match
-    !\x{c0}+++++ 
-No match
-
-/-- With PCRE_UCP, non-UTF8 chars that are < 256 still check properties  --/
-
-/\b...\B/W
-    abc_
- 0: abc
-    !\x{c0}++\x{c1}\x{c2} 
- 0: ++\xc1
-    !\x{c0}+++++ 
- 0: \xc0++
-
-/-- POSIX interface --/
-
-/\w/P
-    +++\x{c2}
-No match: POSIX code 17: match failed
-
-/\w/WP
-    +++\x{c2}
- 0: \xc2
-    
-/-- Some of these are silly, but they check various combinations --/
-
-/[[:^alpha:][:^cntrl:]]+/8WBZ
-------------------------------------------------------------------
-        Bra
-        [ -~\x80-\xff\P{L}]+
-        Ket
-        End
-------------------------------------------------------------------
-    123
- 0: 123
-    abc 
- 0: abc
-
-/[[:^cntrl:][:^alpha:]]+/8WBZ
-------------------------------------------------------------------
-        Bra
-        [ -~\x80-\xff\P{L}]+
-        Ket
-        End
-------------------------------------------------------------------
-    123
- 0: 123
-    abc 
- 0: abc
-
-/[[:alpha:]]+/8WBZ
-------------------------------------------------------------------
-        Bra
-        [\p{L}]+
-        Ket
-        End
-------------------------------------------------------------------
-    abc
- 0: abc
-
-/[[:^alpha:]\S]+/8WBZ
-------------------------------------------------------------------
-        Bra
-        [\P{L}\P{Xsp}]+
-        Ket
-        End
-------------------------------------------------------------------
-    123
- 0: 123
-    abc 
- 0: abc
-
-/[^\d]+/8WBZ
-------------------------------------------------------------------
-        Bra
-        [^\p{Nd}]+
-        Ket
-        End
-------------------------------------------------------------------
-    abc123
- 0: abc
-    abc\x{123}
- 0: abc\x{123}
-    \x{660}abc   
- 0: abc
-
-/\x{401}\x{420}\x{421}\x{422}\x{423}\x{424}\x{425}\x{426}\x{427}\x{428}\x{429}\x{42a}\x{42b}\x{42c}\x{42d}\x{42e}\x{42f}/8iSI
-Capturing subpattern count = 0
-Options: caseless utf8
-No first char
-No need char
-Subject length lower bound = 17
-Starting byte set: \xd0 \xd1 
-    \x{401}\x{420}\x{421}\x{422}\x{423}\x{424}\x{425}\x{426}\x{427}\x{428}\x{429}\x{42a}\x{42b}\x{42c}\x{42d}\x{42e}\x{42f}
- 0: \x{401}\x{420}\x{421}\x{422}\x{423}\x{424}\x{425}\x{426}\x{427}\x{428}\x{429}\x{42a}\x{42b}\x{42c}\x{42d}\x{42e}\x{42f}
-    \x{451}\x{440}\x{441}\x{442}\x{443}\x{444}\x{445}\x{446}\x{447}\x{448}\x{449}\x{44a}\x{44b}\x{44c}\x{44d}\x{44e}\x{44f}
- 0: \x{451}\x{440}\x{441}\x{442}\x{443}\x{444}\x{445}\x{446}\x{447}\x{448}\x{449}\x{44a}\x{44b}\x{44c}\x{44d}\x{44e}\x{44f}
-
-/\p{Lu}+9\p{Lu}+B\p{Lu}+b/BZ
-------------------------------------------------------------------
-        Bra
-        prop Lu ++
-        9
-        prop Lu +
-        B
-        prop Lu ++
-        b
-        Ket
-        End
-------------------------------------------------------------------
-
-/\p{^Lu}+9\p{^Lu}+B\p{^Lu}+b/BZ
-------------------------------------------------------------------
-        Bra
-        notprop Lu +
-        9
-        notprop Lu ++
-        B
-        notprop Lu +
-        b
-        Ket
-        End
-------------------------------------------------------------------
-
-/\P{Lu}+9\P{Lu}+B\P{Lu}+b/BZ
-------------------------------------------------------------------
-        Bra
-        notprop Lu +
-        9
-        notprop Lu ++
-        B
-        notprop Lu +
-        b
-        Ket
-        End
-------------------------------------------------------------------
-
-/\p{Han}+X\p{Greek}+\x{370}/BZ8
-------------------------------------------------------------------
-        Bra
-        prop Han ++
-        X
-        prop Greek +
-        \x{370}
-        Ket
-        End
-------------------------------------------------------------------
-
-/\p{Xan}+!\p{Xan}+A/BZ
-------------------------------------------------------------------
-        Bra
-        prop Xan ++
-        !
-        prop Xan +
-        A
-        Ket
-        End
-------------------------------------------------------------------
-
-/\p{Xsp}+!\p{Xsp}\t/BZ
-------------------------------------------------------------------
-        Bra
-        prop Xsp ++
-        !
-        prop Xsp
-        \x09
-        Ket
-        End
-------------------------------------------------------------------
-
-/\p{Xps}+!\p{Xps}\t/BZ
-------------------------------------------------------------------
-        Bra
-        prop Xps ++
-        !
-        prop Xps
-        \x09
-        Ket
-        End
-------------------------------------------------------------------
-
-/\p{Xwd}+!\p{Xwd}_/BZ
-------------------------------------------------------------------
-        Bra
-        prop Xwd ++
-        !
-        prop Xwd
-        _
-        Ket
-        End
-------------------------------------------------------------------
-
-/A+\p{N}A+\dB+\p{N}*B+\d*/WBZ
-------------------------------------------------------------------
-        Bra
-        A++
-        prop N
-        A++
-        prop Nd
-        B+
-        prop N *+
-        B+
-        prop Nd *
-        Ket
-        End
-------------------------------------------------------------------
-
-/-- These behaved oddly in Perl, so they are kept in this test --/
-
-/(\x{23a}\x{23a}\x{23a})?\1/8i
-    \x{23a}\x{23a}\x{23a}\x{2c65}\x{2c65}
-No match
-
-/(ȺȺȺ)?\1/8i
-    ȺȺȺⱥⱥ
-No match
-
-/(\x{23a}\x{23a}\x{23a})?\1/8i
-    \x{23a}\x{23a}\x{23a}\x{2c65}\x{2c65}\x{2c65}
- 0: \x{23a}\x{23a}\x{23a}\x{2c65}\x{2c65}\x{2c65}
- 1: \x{23a}\x{23a}\x{23a}
-
-/(ȺȺȺ)?\1/8i
-    ȺȺȺⱥⱥⱥ
- 0: \x{23a}\x{23a}\x{23a}\x{2c65}\x{2c65}\x{2c65}
- 1: \x{23a}\x{23a}\x{23a}
-
-/(\x{23a}\x{23a}\x{23a})\1/8i
-    \x{23a}\x{23a}\x{23a}\x{2c65}\x{2c65}
-No match
-
-/(ȺȺȺ)\1/8i
-    ȺȺȺⱥⱥ
-No match
-
-/(\x{23a}\x{23a}\x{23a})\1/8i
-    \x{23a}\x{23a}\x{23a}\x{2c65}\x{2c65}\x{2c65}
- 0: \x{23a}\x{23a}\x{23a}\x{2c65}\x{2c65}\x{2c65}
- 1: \x{23a}\x{23a}\x{23a}
-
-/(ȺȺȺ)\1/8i
-    ȺȺȺⱥⱥⱥ
- 0: \x{23a}\x{23a}\x{23a}\x{2c65}\x{2c65}\x{2c65}
- 1: \x{23a}\x{23a}\x{23a}
-
-/(\x{2c65}\x{2c65})\1/8i
-    \x{2c65}\x{2c65}\x{23a}\x{23a}
- 0: \x{2c65}\x{2c65}\x{23a}\x{23a}
- 1: \x{2c65}\x{2c65}
-    
-/(ⱥⱥ)\1/8i
-    ⱥⱥȺȺ 
- 0: \x{2c65}\x{2c65}\x{23a}\x{23a}
- 1: \x{2c65}\x{2c65}
-    
-/(\x{23a}\x{23a}\x{23a})\1Y/8i
-    X\x{23a}\x{23a}\x{23a}\x{2c65}\x{2c65}\x{2c65}YZ
- 0: \x{23a}\x{23a}\x{23a}\x{2c65}\x{2c65}\x{2c65}Y
- 1: \x{23a}\x{23a}\x{23a}
-
-/(\x{2c65}\x{2c65})\1Y/8i
-    X\x{2c65}\x{2c65}\x{23a}\x{23a}YZ
- 0: \x{2c65}\x{2c65}\x{23a}\x{23a}Y
- 1: \x{2c65}\x{2c65}
-
-/-- --/ 
-
-/-- These scripts weren't yet in Perl when I added Unicode 6.0.0 to PCRE --/
-
-/^[\p{Batak}]/8
-    \x{1bc0}
- 0: \x{1bc0}
-    \x{1bff}
- 0: \x{1bff}
-    ** Failers
-No match
-    \x{1bf4}
-No match
-    
-/^[\p{Brahmi}]/8
-    \x{11000}
- 0: \x{11000}
-    \x{1106f}
- 0: \x{1106f}
-    ** Failers
-No match
-    \x{1104e}
-No match
-    
-/^[\p{Mandaic}]/8
-    \x{840}
- 0: \x{840}
-    \x{85e}
- 0: \x{85e}
-    ** Failers
-No match
-    \x{85c}
-No match
-    \x{85d}    
-No match
-
-/-- --/ 
-
-/(\X*)(.)/s8
-    A\x{300}
- 0: A
- 1: 
- 2: A
-
-/^S(\X*)e(\X*)$/8
-    Stéréo
-No match
-    
-/^\X/8 
-    ́réo
-No match
-
 /-- End of testinput13 --/

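Note on the caseless back-reference cases removed from testoutput13 above: U+023A and U+2C65 are a case pair whose UTF-8 encodings have different lengths, so a caseless \1 has to compare characters rather than a fixed number of bytes. That this is why the cases are singled out is an assumption; the test comment only says they behaved oddly in Perl. A minimal Python sketch (not PCRE code; all variable names are illustrative) that checks the two properties:

```python
# Illustrative only: the code points come from the test data above,
# the variable names are hypothetical.
upper = "\u023a"  # LATIN CAPITAL LETTER A WITH STROKE
lower = "\u2c65"  # LATIN SMALL LETTER A WITH STROKE

captured = upper * 3    # what group 1 captures in /(\x{23a}\x{23a}\x{23a})\1/8i
referenced = lower * 3  # what \1 has to match caselessly

# The two strings are equal once case is folded away ...
same_ignoring_case = captured.lower() == referenced.lower()

# ... but their UTF-8 byte lengths differ.
byte_lengths = (len(captured.encode("utf-8")), len(referenced.encode("utf-8")))
```

Here `same_ignoring_case` is true while `byte_lengths` holds two different values, so a byte-count comparison alone cannot decide whether the back reference matches.
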
Modified: code/trunk/testdata/testoutput14
===================================================================
--- code/trunk/testdata/testoutput14    2011-12-28 16:10:09 UTC (rev 835)
+++ code/trunk/testdata/testoutput14    2011-12-28 17:16:11 UTC (rev 836)
@@ -1,53 +1,448 @@
-/-- This test is run only when JIT support is available. It checks for a
-successful and an unsuccessful JIT compile and save and restore behaviour,
-and a couple of things that are different with JIT. --/
+/-- This set of tests is run only with the 8-bit library. It starts with all
+    the tests of the POSIX interface, because that is supported only with the
+    8-bit library. --/

-/abc/S+I
+/abc/P
+    abc
+ 0: abc
+    *** Failers
+No match: POSIX code 17: match failed
+
+/^abc|def/P
+    abcdef
+ 0: abc
+    abcdef\B
+ 0: def
+
+/.*((abc)$|(def))/P
+    defabc
+ 0: defabc
+ 1: abc
+ 2: abc
+    \Zdefabc
+ 0: def
+ 1: def
+ 3: def
+
+/the quick brown fox/P
+    the quick brown fox
+ 0: the quick brown fox
+    *** Failers
+No match: POSIX code 17: match failed
+    The Quick Brown Fox
+No match: POSIX code 17: match failed
+
+/the quick brown fox/Pi
+    the quick brown fox
+ 0: the quick brown fox
+    The Quick Brown Fox
+ 0: The Quick Brown Fox
+
+/abc.def/P
+    *** Failers
+No match: POSIX code 17: match failed
+    abc\ndef
+No match: POSIX code 17: match failed
+
+/abc$/P
+    abc
+ 0: abc
+    abc\n
+ 0: abc
+
+/(abc)\2/P
+Failed: POSIX code 15: bad back reference at offset 7     
+
+/(abc\1)/P
+    abc
+No match: POSIX code 17: match failed
+
+/a*(b+)(z)(z)/P
+    aaaabbbbzzzz
+ 0: aaaabbbbzz
+ 1: bbbb
+ 2: z
+ 3: z
+    aaaabbbbzzzz\O0
+    aaaabbbbzzzz\O1
+ 0: aaaabbbbzz
+    aaaabbbbzzzz\O2
+ 0: aaaabbbbzz
+ 1: bbbb
+    aaaabbbbzzzz\O3
+ 0: aaaabbbbzz
+ 1: bbbb
+ 2: z
+    aaaabbbbzzzz\O4
+ 0: aaaabbbbzz
+ 1: bbbb
+ 2: z
+ 3: z
+    aaaabbbbzzzz\O5
+ 0: aaaabbbbzz
+ 1: bbbb
+ 2: z
+ 3: z
+
+/ab.cd/P
+    ab-cd
+ 0: ab-cd
+    ab=cd
+ 0: ab=cd
+    ** Failers
+No match: POSIX code 17: match failed
+    ab\ncd
+No match: POSIX code 17: match failed
+
+/ab.cd/Ps
+    ab-cd
+ 0: ab-cd
+    ab=cd
+ 0: ab=cd
+    ab\ncd
+ 0: ab\x0acd
+
+/a(b)c/PN
+    abc
+Matched with REG_NOSUB
+
+/a(?P<name>b)c/PN
+    abc
+Matched with REG_NOSUB
+
+/a?|b?/P
+    abc
+ 0: a
+    ** Failers
+ 0: 
+    ddd\N   
+No match: POSIX code 17: match failed
+
+/\w+A/P
+   CDAAAAB 
+ 0: CDAAAA
+
+/\w+A/PU
+   CDAAAAB 
+ 0: CDA
+   
+/\Biss\B/I+P
+    Mississippi
+ 0: iss
+ 0+ issippi
+
+/abc/\P
+Failed: POSIX code 9: bad escape sequence at offset 4     
+
+/-- End of POSIX tests --/ 
+
+/a\Cb/
+    aXb
+ 0: aXb
+    a\nb
+ 0: a\x0ab
+    ** Failers (too big char) 
+No match
+    A\x{123}B 
+** Character \x{123} is greater than 255 and UTF-8 mode is not enabled.
+** Truncation will probably give the wrong result.
+No match
+  
+/\x{100}/I
+Failed: character value in \x{...} sequence is too large at offset 6
+
+/  (?: [\040\t] |  \(
+(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
+\)  )*                          # optional leading comment
+(?:    (?:
+[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+    # some number of atom characters...
+(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
+|
+" (?:                      # opening quote...
+[^\\\x80-\xff\n\015"]                #   Anything except backslash and quote
+|                     #    or
+\\ [^\x80-\xff]           #   Escaped something (something != CR)
+)* "  # closing quote
+)                    # initial word
+(?:  (?: [\040\t] |  \(
+(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
+\)  )*  \.  (?: [\040\t] |  \(
+(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
+\)  )*   (?:
+[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+    # some number of atom characters...
+(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
+|
+" (?:                      # opening quote...
+[^\\\x80-\xff\n\015"]                #   Anything except backslash and quote
+|                     #    or
+\\ [^\x80-\xff]           #   Escaped something (something != CR)
+)* "  # closing quote
+)  )* # further okay, if led by a period
+(?: [\040\t] |  \(
+(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
+\)  )*  @  (?: [\040\t] |  \(
+(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
+\)  )*    (?:
+[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+    # some number of atom characters...
+(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
+|   \[                         # [
+(?: [^\\\x80-\xff\n\015\[\]] |  \\ [^\x80-\xff]  )*    #    stuff
+\]                        #           ]
+)                           # initial subdomain
+(?:                                  #
+(?: [\040\t] |  \(
+(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
+\)  )*  \.                        # if led by a period...
+(?: [\040\t] |  \(
+(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
+\)  )*   (?:
+[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+    # some number of atom characters...
+(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
+|   \[                         # [
+(?: [^\\\x80-\xff\n\015\[\]] |  \\ [^\x80-\xff]  )*    #    stuff
+\]                        #           ]
+)                     #   ...further okay
+)*
+# address
+|                     #  or
+(?:
+[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+    # some number of atom characters...
+(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
+|
+" (?:                      # opening quote...
+[^\\\x80-\xff\n\015"]                #   Anything except backslash and quote
+|                     #    or
+\\ [^\x80-\xff]           #   Escaped something (something != CR)
+)* "  # closing quote
+)             # one word, optionally followed by....
+(?:
+[^()<>@,;:".\\\[\]\x80-\xff\000-\010\012-\037]  |  # atom and space parts, or...
+\(
+(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
+\)       |  # comments, or...
+
+" (?:                      # opening quote...
+[^\\\x80-\xff\n\015"]                #   Anything except backslash and quote
+|                     #    or
+\\ [^\x80-\xff]           #   Escaped something (something != CR)
+)* "  # closing quote
+# quoted strings
+)*
+<  (?: [\040\t] |  \(
+(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
+\)  )*                     # leading <
+(?:  @  (?: [\040\t] |  \(
+(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
+\)  )*    (?:
+[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+    # some number of atom characters...
+(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
+|   \[                         # [
+(?: [^\\\x80-\xff\n\015\[\]] |  \\ [^\x80-\xff]  )*    #    stuff
+\]                        #           ]
+)                           # initial subdomain
+(?:                                  #
+(?: [\040\t] |  \(
+(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
+\)  )*  \.                        # if led by a period...
+(?: [\040\t] |  \(
+(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
+\)  )*   (?:
+[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+    # some number of atom characters...
+(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
+|   \[                         # [
+(?: [^\\\x80-\xff\n\015\[\]] |  \\ [^\x80-\xff]  )*    #    stuff
+\]                        #           ]
+)                     #   ...further okay
+)*
+
+(?:  (?: [\040\t] |  \(
+(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
+\)  )*  ,  (?: [\040\t] |  \(
+(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
+\)  )*  @  (?: [\040\t] |  \(
+(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
+\)  )*    (?:
+[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+    # some number of atom characters...
+(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
+|   \[                         # [
+(?: [^\\\x80-\xff\n\015\[\]] |  \\ [^\x80-\xff]  )*    #    stuff
+\]                        #           ]
+)                           # initial subdomain
+(?:                                  #
+(?: [\040\t] |  \(
+(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
+\)  )*  \.                        # if led by a period...
+(?: [\040\t] |  \(
+(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
+\)  )*   (?:
+[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+    # some number of atom characters...
+(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
+|   \[                         # [
+(?: [^\\\x80-\xff\n\015\[\]] |  \\ [^\x80-\xff]  )*    #    stuff
+\]                        #           ]
+)                     #   ...further okay
+)*
+)* # further okay, if led by comma
+:                                # closing colon
+(?: [\040\t] |  \(
+(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
+\)  )*  )? #       optional route
+(?:
+[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+    # some number of atom characters...
+(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
+|
+" (?:                      # opening quote...
+[^\\\x80-\xff\n\015"]                #   Anything except backslash and quote
+|                     #    or
+\\ [^\x80-\xff]           #   Escaped something (something != CR)
+)* "  # closing quote
+)                    # initial word
+(?:  (?: [\040\t] |  \(
+(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
+\)  )*  \.  (?: [\040\t] |  \(
+(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
+\)  )*   (?:
+[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+    # some number of atom characters...
+(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
+|
+" (?:                      # opening quote...
+[^\\\x80-\xff\n\015"]                #   Anything except backslash and quote
+|                     #    or
+\\ [^\x80-\xff]           #   Escaped something (something != CR)
+)* "  # closing quote
+)  )* # further okay, if led by a period
+(?: [\040\t] |  \(
+(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
+\)  )*  @  (?: [\040\t] |  \(
+(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
+\)  )*    (?:
+[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+    # some number of atom characters...
+(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
+|   \[                         # [
+(?: [^\\\x80-\xff\n\015\[\]] |  \\ [^\x80-\xff]  )*    #    stuff
+\]                        #           ]
+)                           # initial subdomain
+(?:                                  #
+(?: [\040\t] |  \(
+(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
+\)  )*  \.                        # if led by a period...
+(?: [\040\t] |  \(
+(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
+\)  )*   (?:
+[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+    # some number of atom characters...
+(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
+|   \[                         # [
+(?: [^\\\x80-\xff\n\015\[\]] |  \\ [^\x80-\xff]  )*    #    stuff
+\]                        #           ]
+)                     #   ...further okay
+)*
+#       address spec
+(?: [\040\t] |  \(
+(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
+\)  )*  > #                  trailing >
+# name and address
+)  (?: [\040\t] |  \(
+(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
+\)  )*                       # optional trailing comment
+/xSI
 Capturing subpattern count = 0
-No options
-First char = 'a'
-Need char = 'c'
+Contains explicit CR or LF match
+Options: extended
+No first char
+No need char
 Subject length lower bound = 3
-No set of starting bytes
-JIT study was successful
+Starting byte set: \x09 \x20 ! " # $ % & ' ( * + - / 0 1 2 3 4 5 6 7 8 
+  9 = ? A B C D E F G H I J K L M N O P Q R S T U V W X Y Z ^ _ ` a b c d e 
+  f g h i j k l m n o p q r s t u v w x y z { | } ~ \x7f

-/ab(*COMMIT)/S+I
+<testdata/saved16
+Compiled pattern loaded from testdata/saved16
+No study data
+Error -28 from pcre_fullinfo(0)
+Running in 8-bit mode but pattern was compiled in 16-bit mode
+
+/\h/SI
Capturing subpattern count = 0
No options
-First char = 'a'
-Need char = 'b'
-Subject length lower bound = 2
-No set of starting bytes
-JIT study was not successful
+No first char
+No need char
+Subject length lower bound = 1
+Starting byte set: \x09 \x20 \xa0

-/abc/S+I>testsavedregex
+/\v/SI
Capturing subpattern count = 0
No options
-First char = 'a'
-Need char = 'c'
-Subject length lower bound = 3
-No set of starting bytes
-JIT study was successful
-Compiled pattern written to testsavedregex
-Study data written to testsavedregex
+No first char
+No need char
+Subject length lower bound = 1
+Starting byte set: \x0a \x0b \x0c \x0d \x85

-<testsavedregex
-Compiled pattern loaded from testsavedregex
-Study data loaded from testsavedregex
-    abc
- 0: abc
-
-/a*/SI
+/\R/SI
 Capturing subpattern count = 0
 No options
 No first char
 No need char
-Subject length lower bound = -1
-No set of starting bytes
-JIT study was successful
+Subject length lower bound = 1
+Starting byte set: \x0a \x0b \x0c \x0d \x85

-/(?(R)a*(?1)|((?R))b)/S+
-    aaaabcde
-Error -27 (JIT stack limit reached)
+/[\h]/BZ
+------------------------------------------------------------------
+        Bra
+        [\x09 \xa0]
+        Ket
+        End
+------------------------------------------------------------------
+    >\x09<
+ 0: \x09

+/[\h]+/BZ
+------------------------------------------------------------------
+        Bra
+        [\x09 \xa0]+
+        Ket
+        End
+------------------------------------------------------------------
+    >\x09\x20\xa0<
+ 0: \x09 \xa0
+
+/[\v]/BZ
+------------------------------------------------------------------
+        Bra
+        [\x0a-\x0d\x85]
+        Ket
+        End
+------------------------------------------------------------------
+
+/[\H]/BZ
+------------------------------------------------------------------
+        Bra
+        [\x00-\x08\x0a-\x1f!-\x9f\xa1-\xff]
+        Ket
+        End
+------------------------------------------------------------------
+
+/[^\h]/BZ
+------------------------------------------------------------------
+        Bra
+        [\x00-\x08\x0a-\x1f!-\x9f\xa1-\xff] (neg)
+        Ket
+        End
+------------------------------------------------------------------
+
+/[\V]/BZ
+------------------------------------------------------------------
+        Bra
+        [\x00-\x09\x0e-\x84\x86-\xff]
+        Ket
+        End
+------------------------------------------------------------------
+
+/[\x0a\V]/BZ
+------------------------------------------------------------------
+        Bra
+        [\x00-\x0a\x0e-\x84\x86-\xff]
+        Ket
+        End
+------------------------------------------------------------------
+
 /-- End of testinput14 --/

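Note on the 8-bit class output in testoutput14 above: the sets printed for [\H] and [\V] are the byte-wise complements of the starting byte sets reported for \h and \v. A minimal Python sketch (not part of PCRE; the helper names `complement` and `to_ranges` are hypothetical) that recomputes those ranges from the byte values listed in the output:

```python
# Byte values copied from the "Starting byte set" lines for /\h/SI and /\v/SI.
H_BYTES = {0x09, 0x20, 0xA0}
V_BYTES = {0x0A, 0x0B, 0x0C, 0x0D, 0x85}


def complement(byte_set):
    """Return every 8-bit value that is not in byte_set."""
    return set(range(256)) - set(byte_set)


def to_ranges(byte_set):
    """Collapse a set of byte values into sorted, inclusive (lo, hi) ranges."""
    ranges = []
    for b in sorted(byte_set):
        if ranges and b == ranges[-1][1] + 1:
            ranges[-1] = (ranges[-1][0], b)  # extend the current run
        else:
            ranges.append((b, b))            # start a new run
    return ranges


not_h = to_ranges(complement(H_BYTES))
not_v = to_ranges(complement(V_BYTES))
```

The ranges in `not_h` and `not_v` can be compared by eye with the class contents printed for /[\H]/BZ and /[\V]/BZ; 0x21 is the `!` that starts the third range of [\H].
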
Modified: code/trunk/testdata/testoutput15
===================================================================
--- code/trunk/testdata/testoutput15    2011-12-28 16:10:09 UTC (rev 835)
+++ code/trunk/testdata/testoutput15    2011-12-28 17:16:11 UTC (rev 836)
@@ -1,22 +1,901 @@
-/-- This test is run only when JIT support is not available. It checks that an 
-attempt to use it has the expected behaviour. It also tests things that
-are different without JIT. --/
-   
-/abc/S+I
+/-- This set of tests is for UTF-8 support, and is relevant only to the 8-bit 
+    library. --/
+
+/X(\C{3})/8
+    X\x{1234}
+ 0: X\x{1234}
+ 1: \x{1234}
+
+/X(\C{4})/8
+    X\x{1234}YZ
+ 0: X\x{1234}Y
+ 1: \x{1234}Y
+    
+/X\C*/8
+    XYZabcdce
+ 0: XYZabcdce
+    
+/X\C*?/8
+    XYZabcde
+ 0: X
+    
+/X\C{3,5}/8
+    Xabcdefg   
+ 0: Xabcde
+    X\x{1234} 
+ 0: X\x{1234}
+    X\x{1234}YZ
+ 0: X\x{1234}YZ
+    X\x{1234}\x{512}  
+ 0: X\x{1234}\x{512}
+    X\x{1234}\x{512}YZ
+ 0: X\x{1234}\x{512}
+
+/X\C{3,5}?/8
+    Xabcdefg   
+ 0: Xabc
+    X\x{1234} 
+ 0: X\x{1234}
+    X\x{1234}YZ
+ 0: X\x{1234}
+    X\x{1234}\x{512}  
+ 0: X\x{1234}
+
+/a\Cb/8
+    aXb
+ 0: aXb
+    a\nb
+ 0: a\x{0a}b
+    
+/a\C\Cb/8 
+    a\x{100}b 
+ 0: a\x{100}b
+
+/ab\Cde/8
+    abXde
+ 0: abXde
+
+/a\C\Cb/8 
+    a\x{100}b
+ 0: a\x{100}b
+    ** Failers 
+No match
+    a\x{12257}b
+No match
+
+/[\xC3]/8
+Failed: invalid UTF-8 string at offset 1
+
+/\xC3/8
+Failed: invalid UTF-8 string at offset 0
+
+/\xC3\xC3\xC3xxx/8
+Failed: invalid UTF-8 string at offset 0
+
+/\xC3\xC3\xC3xxx/8?DZSS
+------------------------------------------------------------------
+        Bra
+        \X{c0}\X{c0}\X{c0}xxx
+        Ket
+        End
+------------------------------------------------------------------
 Capturing subpattern count = 0
-No options
-First char = 'a'
-Need char = 'c'
+Options: utf no_utf_check
+First char = \x{c3}
+Need char = 'x'
+
+/abc/8
+    \xC3]
+Error -10 (bad UTF-8 string) offset=0 reason=6
+    \xC3
+Error -10 (bad UTF-8 string) offset=0 reason=1
+    \xC3\xC3\xC3
+Error -10 (bad UTF-8 string) offset=0 reason=6
+    \xC3\xC3\xC3\?
+No match
+    \xe1\x88 
+Error -10 (bad UTF-8 string) offset=0 reason=1
+    \P\xe1\x88 
+Error -10 (bad UTF-8 string) offset=0 reason=1
+    \P\P\xe1\x88 
+Error -25 (short UTF-8 string) offset=0 reason=1
+    XX\xea
+Error -10 (bad UTF-8 string) offset=2 reason=2
+    \O0XX\xea
+Error -10 (bad UTF-8 string)
+    \O1XX\xea
+Error -10 (bad UTF-8 string)
+    \O2XX\xea
+Error -10 (bad UTF-8 string) offset=2 reason=2
+    XX\xf1
+Error -10 (bad UTF-8 string) offset=2 reason=3
+    XX\xf8  
+Error -10 (bad UTF-8 string) offset=2 reason=4
+    XX\xfc
+Error -10 (bad UTF-8 string) offset=2 reason=5
+    ZZ\xea\xaf\x20YY
+Error -10 (bad UTF-8 string) offset=2 reason=7
+    ZZ\xfd\xbf\xbf\x2f\xbf\xbfYY  
+Error -10 (bad UTF-8 string) offset=2 reason=8
+    ZZ\xfd\xbf\xbf\xbf\x2f\xbfYY  
+Error -10 (bad UTF-8 string) offset=2 reason=9
+    ZZ\xfd\xbf\xbf\xbf\xbf\x2fYY  
+Error -10 (bad UTF-8 string) offset=2 reason=10
+    ZZ\xffYY
+Error -10 (bad UTF-8 string) offset=2 reason=21
+    ZZ\xfeYY  
+Error -10 (bad UTF-8 string) offset=2 reason=21
+
+/anything/8
+    \xc0\x80
+Error -10 (bad UTF-8 string) offset=0 reason=15
+    \xc1\x8f 
+Error -10 (bad UTF-8 string) offset=0 reason=15
+    \xe0\x9f\x80
+Error -10 (bad UTF-8 string) offset=0 reason=16
+    \xf0\x8f\x80\x80 
+Error -10 (bad UTF-8 string) offset=0 reason=17
+    \xf8\x87\x80\x80\x80  
+Error -10 (bad UTF-8 string) offset=0 reason=18
+    \xfc\x83\x80\x80\x80\x80
+Error -10 (bad UTF-8 string) offset=0 reason=19
+    \xfe\x80\x80\x80\x80\x80  
+Error -10 (bad UTF-8 string) offset=0 reason=21
+    \xff\x80\x80\x80\x80\x80  
+Error -10 (bad UTF-8 string) offset=0 reason=21
+    \xc3\x8f
+No match
+    \xe0\xaf\x80
+No match
+    \xe1\x80\x80
+No match
+    \xf0\x9f\x80\x80 
+No match
+    \xf1\x8f\x80\x80 
+No match
+    \xf8\x88\x80\x80\x80  
+Error -10 (bad UTF-8 string) offset=0 reason=11
+    \xf9\x87\x80\x80\x80  
+Error -10 (bad UTF-8 string) offset=0 reason=11
+    \xfc\x84\x80\x80\x80\x80
+Error -10 (bad UTF-8 string) offset=0 reason=12
+    \xfd\x83\x80\x80\x80\x80
+Error -10 (bad UTF-8 string) offset=0 reason=12
+    \?\xf8\x88\x80\x80\x80  
+No match
+    \?\xf9\x87\x80\x80\x80  
+No match
+    \?\xfc\x84\x80\x80\x80\x80
+No match
+    \?\xfd\x83\x80\x80\x80\x80
+No match
+
+/\x{100}/8DZ
+------------------------------------------------------------------
+        Bra
+        \x{100}
+        Ket
+        End
+------------------------------------------------------------------
+Capturing subpattern count = 0
+Options: utf
+First char = \x{c4}
+Need char = \x{80}
+
+/\x{1000}/8DZ
+------------------------------------------------------------------
+        Bra
+        \x{1000}
+        Ket
+        End
+------------------------------------------------------------------
+Capturing subpattern count = 0
+Options: utf
+First char = \x{e1}
+Need char = \x{80}
+
+/\x{10000}/8DZ
+------------------------------------------------------------------
+        Bra
+        \x{10000}
+        Ket
+        End
+------------------------------------------------------------------
+Capturing subpattern count = 0
+Options: utf
+First char = \x{f0}
+Need char = \x{80}
+
+/\x{100000}/8DZ
+------------------------------------------------------------------
+        Bra
+        \x{100000}
+        Ket
+        End
+------------------------------------------------------------------
+Capturing subpattern count = 0
+Options: utf
+First char = \x{f4}
+Need char = \x{80}
+
+/\x{10ffff}/8DZ
+------------------------------------------------------------------
+        Bra
+        \x{10ffff}
+        Ket
+        End
+------------------------------------------------------------------
+Capturing subpattern count = 0
+Options: utf
+First char = \x{f4}
+Need char = \x{bf}
+
+/[\x{ff}]/8DZ
+------------------------------------------------------------------
+        Bra
+        \x{ff}
+        Ket
+        End
+------------------------------------------------------------------
+Capturing subpattern count = 0
+Options: utf
+First char = \x{c3}
+Need char = \x{bf}
+
+/[\x{100}]/8DZ
+------------------------------------------------------------------
+        Bra
+        \x{100}
+        Ket
+        End
+------------------------------------------------------------------
+Capturing subpattern count = 0
+Options: utf
+First char = \x{c4}
+Need char = \x{80}
+
+/\x80/8DZ
+------------------------------------------------------------------
+        Bra
+        \x{80}
+        Ket
+        End
+------------------------------------------------------------------
+Capturing subpattern count = 0
+Options: utf
+First char = \x{c2}
+Need char = \x{80}
+
+/\xff/8DZ
+------------------------------------------------------------------
+        Bra
+        \x{ff}
+        Ket
+        End
+------------------------------------------------------------------
+Capturing subpattern count = 0
+Options: utf
+First char = \x{c3}
+Need char = \x{bf}
+
+/\x{D55c}\x{ad6d}\x{C5B4}/DZ8 
+------------------------------------------------------------------
+        Bra
+        \x{d55c}\x{ad6d}\x{c5b4}
+        Ket
+        End
+------------------------------------------------------------------
+Capturing subpattern count = 0
+Options: utf
+First char = \x{ed}
+Need char = \x{b4}
+    \x{D55c}\x{ad6d}\x{C5B4} 
+ 0: \x{d55c}\x{ad6d}\x{c5b4}
+
+/\x{65e5}\x{672c}\x{8a9e}/DZ8
+------------------------------------------------------------------
+        Bra
+        \x{65e5}\x{672c}\x{8a9e}
+        Ket
+        End
+------------------------------------------------------------------
+Capturing subpattern count = 0
+Options: utf
+First char = \x{e6}
+Need char = \x{9e}
+    \x{65e5}\x{672c}\x{8a9e}
+ 0: \x{65e5}\x{672c}\x{8a9e}
+
+/\x{80}/DZ8
+------------------------------------------------------------------
+        Bra
+        \x{80}
+        Ket
+        End
+------------------------------------------------------------------
+Capturing subpattern count = 0
+Options: utf
+First char = \x{c2}
+Need char = \x{80}
+
+/\x{084}/DZ8
+------------------------------------------------------------------
+        Bra
+        \x{84}
+        Ket
+        End
+------------------------------------------------------------------
+Capturing subpattern count = 0
+Options: utf
+First char = \x{c2}
+Need char = \x{84}
+
+/\x{104}/DZ8
+------------------------------------------------------------------
+        Bra
+        \x{104}
+        Ket
+        End
+------------------------------------------------------------------
+Capturing subpattern count = 0
+Options: utf
+First char = \x{c4}
+Need char = \x{84}
+
+/\x{861}/DZ8
+------------------------------------------------------------------
+        Bra
+        \x{861}
+        Ket
+        End
+------------------------------------------------------------------
+Capturing subpattern count = 0
+Options: utf
+First char = \x{e0}
+Need char = \x{a1}
+
+/\x{212ab}/DZ8
+------------------------------------------------------------------
+        Bra
+        \x{212ab}
+        Ket
+        End
+------------------------------------------------------------------
+Capturing subpattern count = 0
+Options: utf
+First char = \x{f0}
+Need char = \x{ab}
+
+/-- This one is here not because it's different to Perl, but because the way
+the captured single-byte is displayed. (In Perl it becomes a character, and you
+can't tell the difference.) --/
+    
+/X(\C)(.*)/8
+    X\x{1234}
+ 0: X\x{1234}
+ 1: \x{e1}
+ 2: \x{88}\x{b4}
+    X\nabc 
+ 0: X\x{0a}abc
+ 1: \x{0a}
+ 2: abc
+
+/-- This one is here because Perl gives out a grumbly error message (quite 
+correctly, but that messes up comparisons). --/
+    
+/a\Cb/8
+    *** Failers 
+No match
+    a\x{100}b 
+No match
+    
+/[^ab\xC0-\xF0]/8SDZ
+------------------------------------------------------------------
+        Bra
+        [\x00-`c-\xbf\xf1-\xff] (neg)
+        Ket
+        End
+------------------------------------------------------------------
+Capturing subpattern count = 0
+Options: utf
+No first char
+No need char
+Subject length lower bound = 1
+Starting byte set: \x00 \x01 \x02 \x03 \x04 \x05 \x06 \x07 \x08 \x09 \x0a 
+  \x0b \x0c \x0d \x0e \x0f \x10 \x11 \x12 \x13 \x14 \x15 \x16 \x17 \x18 \x19 
+  \x1a \x1b \x1c \x1d \x1e \x1f \x20 ! " # $ % & ' ( ) * + , - . / 0 1 2 3 4 
+  5 6 7 8 9 : ; < = > ? @ A B C D E F G H I J K L M N O P Q R S T U V W X Y 
+  Z [ \ ] ^ _ ` c d e f g h i j k l m n o p q r s t u v w x y z { | } ~ \x7f 
+  \xc2 \xc3 \xc4 \xc5 \xc6 \xc7 \xc8 \xc9 \xca \xcb \xcc \xcd \xce \xcf \xd0 
+  \xd1 \xd2 \xd3 \xd4 \xd5 \xd6 \xd7 \xd8 \xd9 \xda \xdb \xdc \xdd \xde \xdf 
+  \xe0 \xe1 \xe2 \xe3 \xe4 \xe5 \xe6 \xe7 \xe8 \xe9 \xea \xeb \xec \xed \xee 
+  \xef \xf0 \xf1 \xf2 \xf3 \xf4 \xf5 \xf6 \xf7 \xf8 \xf9 \xfa \xfb \xfc \xfd 
+  \xfe \xff 
+    \x{f1}
+ 0: \x{f1}
+    \x{bf}
+ 0: \x{bf}
+    \x{100}
+ 0: \x{100}
+    \x{1000}   
+ 0: \x{1000}
+    *** Failers
+ 0: *
+    \x{c0} 
+No match
+    \x{f0} 
+No match
+
+/Ā{3,4}/8SDZ
+------------------------------------------------------------------
+        Bra
+        \x{100}{3}
+        \x{100}?
+        Ket
+        End
+------------------------------------------------------------------
+Capturing subpattern count = 0
+Options: utf
+First char = \x{c4}
+Need char = \x{80}
 Subject length lower bound = 3
 No set of starting bytes
-JIT support is not available in this version of PCRE
+  \x{100}\x{100}\x{100}\x{100\x{100}
+ 0: \x{100}\x{100}\x{100}

-/a*/SI
+/(\x{100}+|x)/8SDZ
+------------------------------------------------------------------
+        Bra
+        CBra 1
+        \x{100}+
+        Alt
+        x
+        Ket
+        Ket
+        End
+------------------------------------------------------------------
+Capturing subpattern count = 1
+Options: utf
+No first char
+No need char
+Subject length lower bound = 1
+Starting byte set: x \xc4 
+
+/(\x{100}*a|x)/8SDZ
+------------------------------------------------------------------
+        Bra
+        CBra 1
+        \x{100}*+
+        a
+        Alt
+        x
+        Ket
+        Ket
+        End
+------------------------------------------------------------------
+Capturing subpattern count = 1
+Options: utf
+No first char
+No need char
+Subject length lower bound = 1
+Starting byte set: a x \xc4 
+
+/(\x{100}{0,2}a|x)/8SDZ
+------------------------------------------------------------------
+        Bra
+        CBra 1
+        \x{100}{0,2}
+        a
+        Alt
+        x
+        Ket
+        Ket
+        End
+------------------------------------------------------------------
+Capturing subpattern count = 1
+Options: utf
+No first char
+No need char
+Subject length lower bound = 1
+Starting byte set: a x \xc4 
+
+/(\x{100}{1,2}a|x)/8SDZ
+------------------------------------------------------------------
+        Bra
+        CBra 1
+        \x{100}
+        \x{100}{0,1}
+        a
+        Alt
+        x
+        Ket
+        Ket
+        End
+------------------------------------------------------------------
+Capturing subpattern count = 1
+Options: utf
+No first char
+No need char
+Subject length lower bound = 1
+Starting byte set: x \xc4 
+
+/\x{100}/8DZ
+------------------------------------------------------------------
+        Bra
+        \x{100}
+        Ket
+        End
+------------------------------------------------------------------
 Capturing subpattern count = 0
+Options: utf
+First char = \x{c4}
+Need char = \x{80}
+
+/a\x{100}\x{101}*/8DZ
+------------------------------------------------------------------
+        Bra
+        a\x{100}
+        \x{101}*
+        Ket
+        End
+------------------------------------------------------------------
+Capturing subpattern count = 0
+Options: utf
+First char = 'a'
+Need char = \x{80}
+
+/a\x{100}\x{101}+/8DZ
+------------------------------------------------------------------
+        Bra
+        a\x{100}
+        \x{101}+
+        Ket
+        End
+------------------------------------------------------------------
+Capturing subpattern count = 0
+Options: utf
+First char = 'a'
+Need char = \x{81}
+
+/[^\x{c4}]/DZ
+------------------------------------------------------------------
+        Bra
+        [^\xc4]
+        Ket
+        End
+------------------------------------------------------------------
+Capturing subpattern count = 0
 No options
 No first char
 No need char
-Study returned NULL
-JIT support is not available in this version of PCRE

+/[\x{100}]/8DZ
+------------------------------------------------------------------
+        Bra
+        \x{100}
+        Ket
+        End
+------------------------------------------------------------------
+Capturing subpattern count = 0
+Options: utf
+First char = \x{c4}
+Need char = \x{80}
+    \x{100}
+ 0: \x{100}
+    Z\x{100}
+ 0: \x{100}
+    \x{100}Z
+ 0: \x{100}
+    *** Failers 
+No match
+
+/[\xff]/DZ8
+------------------------------------------------------------------
+        Bra
+        \x{ff}
+        Ket
+        End
+------------------------------------------------------------------
+Capturing subpattern count = 0
+Options: utf
+First char = \x{c3}
+Need char = \x{bf}
+    >\x{ff}<
+ 0: \x{ff}
+
+/[^\xff]/8DZ
+------------------------------------------------------------------
+        Bra
+        [\x00-\xfe] (neg)
+        Ket
+        End
+------------------------------------------------------------------
+Capturing subpattern count = 0
+Options: utf
+No first char
+No need char
+
+/\x{100}abc(xyz(?1))/8DZ
+------------------------------------------------------------------
+        Bra
+        \x{100}abc
+        CBra 1
+        xyz
+        Recurse
+        Ket
+        Ket
+        End
+------------------------------------------------------------------
+Capturing subpattern count = 1
+Options: utf
+First char = \x{c4}
+Need char = 'z'
+
+/a\x{1234}b/P8
+    a\x{1234}b
+ 0: a\x{1234}b
+
+/\777/8I
+Capturing subpattern count = 0
+Options: utf
+First char = \x{c7}
+Need char = \x{bf}
+  \x{1ff}
+ 0: \x{1ff}
+  \777 
+ 0: \x{1ff}
+  
+/\x{100}+\x{200}/8DZ
+------------------------------------------------------------------
+        Bra
+        \x{100}++
+        \x{200}
+        Ket
+        End
+------------------------------------------------------------------
+Capturing subpattern count = 0
+Options: utf
+First char = \x{c4}
+Need char = \x{80}
+
+/\x{100}+X/8DZ
+------------------------------------------------------------------
+        Bra
+        \x{100}++
+        X
+        Ket
+        End
+------------------------------------------------------------------
+Capturing subpattern count = 0
+Options: utf
+First char = \x{c4}
+Need char = 'X'
+
+/^[\QĀ\E-\QŐ\E/BZ8
+Failed: missing terminating ] for character class at offset 15
+
+/-- This tests the stricter UTF-8 check according to RFC 3629. --/ 
+    
+/X/8
+    \x{0}\x{d7ff}\x{e000}\x{10ffff}
+No match
+    \x{d800}
+Error -10 (bad UTF-8 string) offset=0 reason=14
+    \x{d800}\?
+No match
+    \x{da00}
+Error -10 (bad UTF-8 string) offset=0 reason=14
+    \x{da00}\?
+No match
+    \x{dfff}
+Error -10 (bad UTF-8 string) offset=0 reason=14
+    \x{dfff}\?
+No match
+    \x{110000}    
+Error -10 (bad UTF-8 string) offset=0 reason=13
+    \x{110000}\?    
+No match
+    \x{2000000} 
+Error -10 (bad UTF-8 string) offset=0 reason=11
+    \x{2000000}\? 
+No match
+    \x{7fffffff} 
+Error -10 (bad UTF-8 string) offset=0 reason=12
+    \x{7fffffff}\? 
+No match
+
+/(*UTF8)\x{1234}/
+  abcd\x{1234}pqr
+ 0: \x{1234}
+
+/(*CRLF)(*UTF8)(*BSR_UNICODE)a\Rb/I
+Capturing subpattern count = 0
+Options: bsr_unicode utf
+Forced newline sequence: CRLF
+First char = 'a'
+Need char = 'b'
+
+/\h/SI8
+Capturing subpattern count = 0
+Options: utf
+No first char
+No need char
+Subject length lower bound = 1
+Starting byte set: \x09 \x20 \xc2 \xe1 \xe2 \xe3 
+    ABC\x{09}
+ 0: \x{09}
+    ABC\x{20}
+ 0:  
+    ABC\x{a0}
+ 0: \x{a0}
+    ABC\x{1680}
+ 0: \x{1680}
+    ABC\x{180e}
+ 0: \x{180e}
+    ABC\x{2000}
+ 0: \x{2000}
+    ABC\x{202f} 
+ 0: \x{202f}
+    ABC\x{205f} 
+ 0: \x{205f}
+    ABC\x{3000} 
+ 0: \x{3000}
+
+/\v/SI8
+Capturing subpattern count = 0
+Options: utf
+No first char
+No need char
+Subject length lower bound = 1
+Starting byte set: \x0a \x0b \x0c \x0d \xc2 \xe2 
+    ABC\x{0a}
+ 0: \x{0a}
+    ABC\x{0b}
+ 0: \x{0b}
+    ABC\x{0c}
+ 0: \x{0c}
+    ABC\x{0d}
+ 0: \x{0d}
+    ABC\x{85}
+ 0: \x{85}
+    ABC\x{2028}
+ 0: \x{2028}
+
+/\h*A/SI8
+Capturing subpattern count = 0
+Options: utf
+No first char
+Need char = 'A'
+Subject length lower bound = 1
+Starting byte set: \x09 \x20 A \xc2 \xe1 \xe2 \xe3 
+    CDBABC
+ 0: A
+    
+/\v+A/SI8
+Capturing subpattern count = 0
+Options: utf
+No first char
+Need char = 'A'
+Subject length lower bound = 2
+Starting byte set: \x0a \x0b \x0c \x0d \xc2 \xe2 
+
+/\s?xxx\s/8SI
+Capturing subpattern count = 0
+Options: utf
+No first char
+Need char = 'x'
+Subject length lower bound = 4
+Starting byte set: \x09 \x0a \x0c \x0d \x20 x 
+
+/\sxxx\s/I8ST1
+Capturing subpattern count = 0
+Options: utf
+No first char
+Need char = 'x'
+Subject length lower bound = 5
+Starting byte set: \x09 \x0a \x0c \x0d \x20 \xc2 
+    AB\x{85}xxx\x{a0}XYZ
+ 0: \x{85}xxx\x{a0}
+    AB\x{a0}xxx\x{85}XYZ
+ 0: \x{a0}xxx\x{85}
+
+/\S \S/I8ST1
+Capturing subpattern count = 0
+Options: utf
+No first char
+Need char = ' '
+Subject length lower bound = 3
+Starting byte set: \x00 \x01 \x02 \x03 \x04 \x05 \x06 \x07 \x08 \x0b \x0e 
+  \x0f \x10 \x11 \x12 \x13 \x14 \x15 \x16 \x17 \x18 \x19 \x1a \x1b \x1c \x1d 
+  \x1e \x1f ! " # $ % & ' ( ) * + , - . / 0 1 2 3 4 5 6 7 8 9 : ; < = > ? @ 
+  A B C D E F G H I J K L M N O P Q R S T U V W X Y Z [ \ ] ^ _ ` a b c d e 
+  f g h i j k l m n o p q r s t u v w x y z { | } ~ \x7f \xc0 \xc1 \xc2 \xc3 
+  \xc4 \xc5 \xc6 \xc7 \xc8 \xc9 \xca \xcb \xcc \xcd \xce \xcf \xd0 \xd1 \xd2 
+  \xd3 \xd4 \xd5 \xd6 \xd7 \xd8 \xd9 \xda \xdb \xdc \xdd \xde \xdf \xe0 \xe1 
+  \xe2 \xe3 \xe4 \xe5 \xe6 \xe7 \xe8 \xe9 \xea \xeb \xec \xed \xee \xef \xf0 
+  \xf1 \xf2 \xf3 \xf4 \xf5 \xf6 \xf7 \xf8 \xf9 \xfa \xfb \xfc \xfd \xfe \xff 
+    \x{a2} \x{84} 
+ 0: \x{a2} \x{84}
+    A Z 
+ 0: A Z
+
+/a+/8
+    a\x{123}aa\>1
+ 0: aa
+    a\x{123}aa\>2
+Error -11 (bad UTF-8 offset)
+    a\x{123}aa\>3
+ 0: aa
+    a\x{123}aa\>4
+ 0: a
+    a\x{123}aa\>5
+No match
+    a\x{123}aa\>6
+Error -24 (bad offset value)
+
+/\x{1234}+/iS8I
+Capturing subpattern count = 0
+Options: caseless utf
+No first char
+No need char
+Subject length lower bound = 1
+Starting byte set: \xe1 
+
+/\x{1234}+?/iS8I
+Capturing subpattern count = 0
+Options: caseless utf
+No first char
+No need char
+Subject length lower bound = 1
+Starting byte set: \xe1 
+
+/\x{1234}++/iS8I
+Capturing subpattern count = 0
+Options: caseless utf
+No first char
+No need char
+Subject length lower bound = 1
+Starting byte set: \xe1 
+
+/\x{1234}{2}/iS8I
+Capturing subpattern count = 0
+Options: caseless utf
+No first char
+No need char
+Subject length lower bound = 2
+Starting byte set: \xe1 
+
+/[^\x{c4}]/8DZ
+------------------------------------------------------------------
+        Bra
+        [\x00-\xc3\xc5-\xff] (neg)
+        Ket
+        End
+------------------------------------------------------------------
+Capturing subpattern count = 0
+Options: utf
+No first char
+No need char
+
+/X+\x{200}/8DZ
+------------------------------------------------------------------
+        Bra
+        X++
+        \x{200}
+        Ket
+        End
+------------------------------------------------------------------
+Capturing subpattern count = 0
+Options: utf
+First char = 'X'
+Need char = \x{80}
+
+/\R/SI8
+Capturing subpattern count = 0
+Options: utf
+No first char
+No need char
+Subject length lower bound = 1
+Starting byte set: \x0a \x0b \x0c \x0d \xc2 \xe2 
+
 /-- End of testinput15 --/

Copied: code/trunk/testdata/testoutput16 (from rev 835, code/branches/pcre16/testdata/testoutput16)
===================================================================
--- code/trunk/testdata/testoutput16                            (rev 0)
+++ code/trunk/testdata/testoutput16    2011-12-28 17:16:11 UTC (rev 836)
@@ -0,0 +1,121 @@
+/-- This set of tests is run only with the 8-bit library when Unicode property 
+    support is available. It starts with tests of the POSIX interface, because
+    that is supported only with the 8-bit library. --/
+
+/\w/P
+    +++\x{c2}
+No match: POSIX code 17: match failed
+
+/\w/WP
+    +++\x{c2}
+ 0: \xc2
+    
+/A\x{391}\x{10427}\x{ff3a}\x{1fb0}/8iDZ
+------------------------------------------------------------------
+        Bra
+     /i A\x{391}\x{10427}\x{ff3a}\x{1fb0}
+        Ket
+        End
+------------------------------------------------------------------
+Capturing subpattern count = 0
+Options: caseless utf
+First char = 'A' (caseless)
+No need char
+
+/A\x{391}\x{10427}\x{ff3a}\x{1fb0}/8DZ
+------------------------------------------------------------------
+        Bra
+        A\x{391}\x{10427}\x{ff3a}\x{1fb0}
+        Ket
+        End
+------------------------------------------------------------------
+Capturing subpattern count = 0
+Options: utf
+First char = 'A'
+Need char = \x{b0}
+
+/AB\x{1fb0}/8DZ
+------------------------------------------------------------------
+        Bra
+        AB\x{1fb0}
+        Ket
+        End
+------------------------------------------------------------------
+Capturing subpattern count = 0
+Options: utf
+First char = 'A'
+Need char = \x{b0}
+
+/AB\x{1fb0}/8DZi
+------------------------------------------------------------------
+        Bra
+     /i AB\x{1fb0}
+        Ket
+        End
+------------------------------------------------------------------
+Capturing subpattern count = 0
+Options: caseless utf
+First char = 'A' (caseless)
+Need char = 'B' (caseless)
+
+/\x{401}\x{420}\x{421}\x{422}\x{423}\x{424}\x{425}\x{426}\x{427}\x{428}\x{429}\x{42a}\x{42b}\x{42c}\x{42d}\x{42e}\x{42f}/8iSI
+Capturing subpattern count = 0
+Options: caseless utf
+No first char
+No need char
+Subject length lower bound = 17
+Starting byte set: \xd0 \xd1 
+    \x{401}\x{420}\x{421}\x{422}\x{423}\x{424}\x{425}\x{426}\x{427}\x{428}\x{429}\x{42a}\x{42b}\x{42c}\x{42d}\x{42e}\x{42f}
+ 0: \x{401}\x{420}\x{421}\x{422}\x{423}\x{424}\x{425}\x{426}\x{427}\x{428}\x{429}\x{42a}\x{42b}\x{42c}\x{42d}\x{42e}\x{42f}
+    \x{451}\x{440}\x{441}\x{442}\x{443}\x{444}\x{445}\x{446}\x{447}\x{448}\x{449}\x{44a}\x{44b}\x{44c}\x{44d}\x{44e}\x{44f}
+ 0: \x{451}\x{440}\x{441}\x{442}\x{443}\x{444}\x{445}\x{446}\x{447}\x{448}\x{449}\x{44a}\x{44b}\x{44c}\x{44d}\x{44e}\x{44f}
+
+/[ⱥ]/8iBZ
+------------------------------------------------------------------
+        Bra
+     /i \x{2c65}
+        Ket
+        End
+------------------------------------------------------------------
+
+/[^ⱥ]/8iBZ
+------------------------------------------------------------------
+        Bra
+        [^\x{2c65}\x{23a}]
+        Ket
+        End
+------------------------------------------------------------------
+
+/\h/SI
+Capturing subpattern count = 0
+No options
+No first char
+No need char
+Subject length lower bound = 1
+Starting byte set: \x09 \x20 \xa0 
+
+/\v/SI
+Capturing subpattern count = 0
+No options
+No first char
+No need char
+Subject length lower bound = 1
+Starting byte set: \x0a \x0b \x0c \x0d \x85 
+
+/\R/SI
+Capturing subpattern count = 0
+No options
+No first char
+No need char
+Subject length lower bound = 1
+Starting byte set: \x0a \x0b \x0c \x0d \x85 
+
+/[[:blank:]]/WBZ
+------------------------------------------------------------------
+        Bra
+        [\x09 \xa0]
+        Ket
+        End
+------------------------------------------------------------------
+
+/-- End of testinput16 --/

Copied: code/trunk/testdata/testoutput17 (from rev 835, code/branches/pcre16/testdata/testoutput17)
===================================================================
--- code/trunk/testdata/testoutput17                            (rev 0)
+++ code/trunk/testdata/testoutput17    2011-12-28 17:16:11 UTC (rev 836)
@@ -0,0 +1,457 @@
+/-- This set of tests is for the 16-bit library's basic (non-UTF-16) features 
+    that are not compatible with the 8-bit library, or which give different 
+    output in 16-bit mode. --/
+
+/a\Cb/
+    aXb
+ 0: aXb
+    a\nb
+ 0: a\x0ab
+  
+/-- Check maximum non-UTF character size --/
+
+/\x{ffff}/
+    A\x{ffff}B
+ 0: \x{ffff}
+
+/\x{10000}/ 
+Failed: character value in \x{...} sequence is too large at offset 8
+
+/[^\x{c4}]/DZ
+------------------------------------------------------------------
+        Bra
+        [^\xc4]
+        Ket
+        End
+------------------------------------------------------------------
+Capturing subpattern count = 0
+No options
+No first char
+No need char
+
+  
+/\x{100}/I
+Capturing subpattern count = 0
+No options
+First char = \x{100}
+No need char
+
+/  (?: [\040\t] |  \(
+(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
+\)  )*                          # optional leading comment
+(?:    (?:
+[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+    # some number of atom characters...
+(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
+|
+" (?:                      # opening quote...
+[^\\\x80-\xff\n\015"]                #   Anything except backslash and quote
+|                     #    or
+\\ [^\x80-\xff]           #   Escaped something (something != CR)
+)* "  # closing quote
+)                    # initial word
+(?:  (?: [\040\t] |  \(
+(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
+\)  )*  \.  (?: [\040\t] |  \(
+(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
+\)  )*   (?:
+[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+    # some number of atom characters...
+(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
+|
+" (?:                      # opening quote...
+[^\\\x80-\xff\n\015"]                #   Anything except backslash and quote
+|                     #    or
+\\ [^\x80-\xff]           #   Escaped something (something != CR)
+)* "  # closing quote
+)  )* # further okay, if led by a period
+(?: [\040\t] |  \(
+(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
+\)  )*  @  (?: [\040\t] |  \(
+(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
+\)  )*    (?:
+[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+    # some number of atom characters...
+(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
+|   \[                         # [
+(?: [^\\\x80-\xff\n\015\[\]] |  \\ [^\x80-\xff]  )*    #    stuff
+\]                        #           ]
+)                           # initial subdomain
+(?:                                  #
+(?: [\040\t] |  \(
+(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
+\)  )*  \.                        # if led by a period...
+(?: [\040\t] |  \(
+(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
+\)  )*   (?:
+[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+    # some number of atom characters...
+(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
+|   \[                         # [
+(?: [^\\\x80-\xff\n\015\[\]] |  \\ [^\x80-\xff]  )*    #    stuff
+\]                        #           ]
+)                     #   ...further okay
+)*
+# address
+|                     #  or
+(?:
+[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+    # some number of atom characters...
+(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
+|
+" (?:                      # opening quote...
+[^\\\x80-\xff\n\015"]                #   Anything except backslash and quote
+|                     #    or
+\\ [^\x80-\xff]           #   Escaped something (something != CR)
+)* "  # closing quote
+)             # one word, optionally followed by....
+(?:
+[^()<>@,;:".\\\[\]\x80-\xff\000-\010\012-\037]  |  # atom and space parts, or...
+\(
+(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
+\)       |  # comments, or...
+
+" (?:                      # opening quote...
+[^\\\x80-\xff\n\015"]                #   Anything except backslash and quote
+|                     #    or
+\\ [^\x80-\xff]           #   Escaped something (something != CR)
+)* "  # closing quote
+# quoted strings
+)*
+<  (?: [\040\t] |  \(
+(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
+\)  )*                     # leading <
+(?:  @  (?: [\040\t] |  \(
+(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
+\)  )*    (?:
+[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+    # some number of atom characters...
+(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
+|   \[                         # [
+(?: [^\\\x80-\xff\n\015\[\]] |  \\ [^\x80-\xff]  )*    #    stuff
+\]                        #           ]
+)                           # initial subdomain
+(?:                                  #
+(?: [\040\t] |  \(
+(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
+\)  )*  \.                        # if led by a period...
+(?: [\040\t] |  \(
+(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
+\)  )*   (?:
+[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+    # some number of atom characters...
+(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
+|   \[                         # [
+(?: [^\\\x80-\xff\n\015\[\]] |  \\ [^\x80-\xff]  )*    #    stuff
+\]                        #           ]
+)                     #   ...further okay
+)*
+
+(?:  (?: [\040\t] |  \(
+(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
+\)  )*  ,  (?: [\040\t] |  \(
+(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
+\)  )*  @  (?: [\040\t] |  \(
+(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
+\)  )*    (?:
+[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+    # some number of atom characters...
+(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
+|   \[                         # [
+(?: [^\\\x80-\xff\n\015\[\]] |  \\ [^\x80-\xff]  )*    #    stuff
+\]                        #           ]
+)                           # initial subdomain
+(?:                                  #
+(?: [\040\t] |  \(
+(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
+\)  )*  \.                        # if led by a period...
+(?: [\040\t] |  \(
+(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
+\)  )*   (?:
+[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+    # some number of atom characters...
+(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
+|   \[                         # [
+(?: [^\\\x80-\xff\n\015\[\]] |  \\ [^\x80-\xff]  )*    #    stuff
+\]                        #           ]
+)                     #   ...further okay
+)*
+)* # further okay, if led by comma
+:                                # closing colon
+(?: [\040\t] |  \(
+(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
+\)  )*  )? #       optional route
+(?:
+[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+    # some number of atom characters...
+(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
+|
+" (?:                      # opening quote...
+[^\\\x80-\xff\n\015"]                #   Anything except backslash and quote
+|                     #    or
+\\ [^\x80-\xff]           #   Escaped something (something != CR)
+)* "  # closing quote
+)                    # initial word
+(?:  (?: [\040\t] |  \(
+(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
+\)  )*  \.  (?: [\040\t] |  \(
+(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
+\)  )*   (?:
+[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+    # some number of atom characters...
+(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
+|
+" (?:                      # opening quote...
+[^\\\x80-\xff\n\015"]                #   Anything except backslash and quote
+|                     #    or
+\\ [^\x80-\xff]           #   Escaped something (something != CR)
+)* "  # closing quote
+)  )* # further okay, if led by a period
+(?: [\040\t] |  \(
+(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
+\)  )*  @  (?: [\040\t] |  \(
+(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
+\)  )*    (?:
+[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+    # some number of atom characters...
+(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
+|   \[                         # [
+(?: [^\\\x80-\xff\n\015\[\]] |  \\ [^\x80-\xff]  )*    #    stuff
+\]                        #           ]
+)                           # initial subdomain
+(?:                                  #
+(?: [\040\t] |  \(
+(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
+\)  )*  \.                        # if led by a period...
+(?: [\040\t] |  \(
+(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
+\)  )*   (?:
+[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+    # some number of atom characters...
+(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
+|   \[                         # [
+(?: [^\\\x80-\xff\n\015\[\]] |  \\ [^\x80-\xff]  )*    #    stuff
+\]                        #           ]
+)                     #   ...further okay
+)*
+#       address spec
+(?: [\040\t] |  \(
+(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
+\)  )*  > #                  trailing >
+# name and address
+)  (?: [\040\t] |  \(
+(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
+\)  )*                       # optional trailing comment
+/xSI
+Capturing subpattern count = 0
+Contains explicit CR or LF match
+Options: extended
+No first char
+No need char
+Subject length lower bound = 3
+Starting byte set: \x09 \x20 ! " # $ % & ' ( * + - / 0 1 2 3 4 5 6 7 8 
+  9 = ? A B C D E F G H I J K L M N O P Q R S T U V W X Y Z ^ _ ` a b c d e 
+  f g h i j k l m n o p q r s t u v w x y z { | } ~ \x7f \xff 
+
+<testdata/saved8
+Compiled pattern loaded from testdata/saved8
+No study data
+Error -28 from pcre16_fullinfo(0)
+Running in 16-bit mode but pattern was compiled in 8-bit mode
+
+/[\h]/BZ
+------------------------------------------------------------------
+        Bra
+        [\x09 \xa0\x{1680}\x{180e}\x{2000}-\x{200a}\x{202f}\x{205f}\x{3000}]
+        Ket
+        End
+------------------------------------------------------------------
+    >\x09<
+ 0: \x09
+
+/[\h]+/BZ
+------------------------------------------------------------------
+        Bra
+        [\x09 \xa0\x{1680}\x{180e}\x{2000}-\x{200a}\x{202f}\x{205f}\x{3000}]+
+        Ket
+        End
+------------------------------------------------------------------
+    >\x09\x20\xa0<
+ 0: \x09 \xa0
+
+/[\v]/BZ
+------------------------------------------------------------------
+        Bra
+        [\x0a-\x0d\x85\x{2028}-\x{2029}]
+        Ket
+        End
+------------------------------------------------------------------
+
+/[\H]/BZ
+------------------------------------------------------------------
+        Bra
+        [\x00-\x08\x0a-\x1f!-\x9f\xa1-\xff\x{100}-\x{167f}\x{1681}-\x{180d}\x{180f}-\x{1fff}\x{200b}-\x{202e}\x{2030}-\x{205e}\x{2060}-\x{2fff}\x{3001}-\x{ffff}]
+        Ket
+        End
+------------------------------------------------------------------
+
+/[^\h]/BZ
+------------------------------------------------------------------
+        Bra
+        [^\x09 \xa0\x{1680}\x{180e}\x{2000}-\x{200a}\x{202f}\x{205f}\x{3000}]
+        Ket
+        End
+------------------------------------------------------------------
+
+/[\V]/BZ
+------------------------------------------------------------------
+        Bra
+        [\x00-\x09\x0e-\x84\x86-\xff\x{100}-\x{2027}\x{202a}-\x{ffff}]
+        Ket
+        End
+------------------------------------------------------------------
+
+/[\x0a\V]/BZ
+------------------------------------------------------------------
+        Bra
+        [\x00-\x0a\x0e-\x84\x86-\xff\x{100}-\x{2027}\x{202a}-\x{ffff}]
+        Ket
+        End
+------------------------------------------------------------------
+
+/\h+/SI
+Capturing subpattern count = 0
+No options
+No first char
+No need char
+Subject length lower bound = 1
+Starting byte set: \x09 \x20 \xa0 \xff 
+    \x{1681}\x{200b}\x{1680}\x{2000}\x{202f}\x{3000}
+ 0: \x{1680}\x{2000}\x{202f}\x{3000}
+    \x{3001}\x{2fff}\x{200a}\xa0\x{2000}
+ 0: \x{200a}\xa0\x{2000}
+
+/[\h\x{dc00}]+/BZSI
+------------------------------------------------------------------
+        Bra
+        [\x09 \xa0\x{1680}\x{180e}\x{2000}-\x{200a}\x{202f}\x{205f}\x{3000}\x{dc00}]+
+        Ket
+        End
+------------------------------------------------------------------
+Capturing subpattern count = 0
+No options
+No first char
+No need char
+Subject length lower bound = 1
+No set of starting bytes
+    \x{1681}\x{200b}\x{1680}\x{2000}\x{202f}\x{3000}
+ 0: \x{1680}\x{2000}\x{202f}\x{3000}
+    \x{3001}\x{2fff}\x{200a}\xa0\x{2000}
+ 0: \x{200a}\xa0\x{2000}
+
+/\H+/SI
+Capturing subpattern count = 0
+No options
+No first char
+No need char
+Subject length lower bound = 1
+No set of starting bytes
+    \x{1680}\x{180e}\x{167f}\x{1681}\x{180d}\x{180f}
+ 0: \x{167f}\x{1681}\x{180d}\x{180f}
+    \x{2000}\x{200a}\x{1fff}\x{200b}
+ 0: \x{1fff}\x{200b}
+    \x{202f}\x{205f}\x{202e}\x{2030}\x{205e}\x{2060}
+ 0: \x{202e}\x{2030}\x{205e}\x{2060}
+    \xa0\x{3000}\x9f\xa1\x{2fff}\x{3001}
+ 0: \x9f\xa1\x{2fff}\x{3001}
+
+/[\H\x{d800}]+/BZSI
+------------------------------------------------------------------
+        Bra
+        [\x00-\x08\x0a-\x1f!-\x9f\xa1-\xff\x{100}-\x{167f}\x{1681}-\x{180d}\x{180f}-\x{1fff}\x{200b}-\x{202e}\x{2030}-\x{205e}\x{2060}-\x{2fff}\x{3001}-\x{ffff}\x{d800}]+
+        Ket
+        End
+------------------------------------------------------------------
+Capturing subpattern count = 0
+No options
+No first char
+No need char
+Subject length lower bound = 1
+No set of starting bytes
+    \x{1680}\x{180e}\x{167f}\x{1681}\x{180d}\x{180f}
+ 0: \x{167f}\x{1681}\x{180d}\x{180f}
+    \x{2000}\x{200a}\x{1fff}\x{200b}
+ 0: \x{1fff}\x{200b}
+    \x{202f}\x{205f}\x{202e}\x{2030}\x{205e}\x{2060}
+ 0: \x{202e}\x{2030}\x{205e}\x{2060}
+    \xa0\x{3000}\x9f\xa1\x{2fff}\x{3001}
+ 0: \x9f\xa1\x{2fff}\x{3001}
+
+/\v+/SI
+Capturing subpattern count = 0
+No options
+No first char
+No need char
+Subject length lower bound = 1
+Starting byte set: \x0a \x0b \x0c \x0d \x85 \xff 
+    \x{2027}\x{2030}\x{2028}\x{2029}
+ 0: \x{2028}\x{2029}
+    \x09\x0e\x84\x86\x85\x0a\x0b\x0c\x0d
+ 0: \x85\x0a\x0b\x0c\x0d
+
+/[\v\x{dc00}]+/BZSI
+------------------------------------------------------------------
+        Bra
+        [\x0a-\x0d\x85\x{2028}-\x{2029}\x{dc00}]+
+        Ket
+        End
+------------------------------------------------------------------
+Capturing subpattern count = 0
+No options
+No first char
+No need char
+Subject length lower bound = 1
+No set of starting bytes
+    \x{2027}\x{2030}\x{2028}\x{2029}
+ 0: \x{2028}\x{2029}
+    \x09\x0e\x84\x86\x85\x0a\x0b\x0c\x0d
+ 0: \x85\x0a\x0b\x0c\x0d
+
+/\V+/SI
+Capturing subpattern count = 0
+No options
+No first char
+No need char
+Subject length lower bound = 1
+No set of starting bytes
+    \x{2028}\x{2029}\x{2027}\x{2030}
+ 0: \x{2027}\x{2030}
+    \x85\x0a\x0b\x0c\x0d\x09\x0e\x84\x86
+ 0: \x09\x0e\x84\x86
+
+/[\V\x{d800}]+/BZSI
+------------------------------------------------------------------
+        Bra
+        [\x00-\x09\x0e-\x84\x86-\xff\x{100}-\x{2027}\x{202a}-\x{ffff}\x{d800}]+
+        Ket
+        End
+------------------------------------------------------------------
+Capturing subpattern count = 0
+No options
+No first char
+No need char
+Subject length lower bound = 1
+No set of starting bytes
+    \x{2028}\x{2029}\x{2027}\x{2030}
+ 0: \x{2027}\x{2030}
+    \x85\x0a\x0b\x0c\x0d\x09\x0e\x84\x86
+ 0: \x09\x0e\x84\x86
+
+/\R+/SI<bsr_unicode>
+Capturing subpattern count = 0
+Options: bsr_unicode
+No first char
+No need char
+Subject length lower bound = 1
+Starting byte set: \x0a \x0b \x0c \x0d \x85 \xff 
+    \x{2027}\x{2030}\x{2028}\x{2029}
+ 0: \x{2028}\x{2029}
+    \x09\x0e\x84\x86\x85\x0a\x0b\x0c\x0d
+ 0: \x85\x0a\x0b\x0c\x0d
+
+/\x{d800}\x{d7ff}\x{dc00}\x{dc00}\x{dcff}\x{dd00}/I
+Capturing subpattern count = 0
+No options
+First char = \x{d800}
+Need char = \x{dd00}
+    \x{d800}\x{d7ff}\x{dc00}\x{dc00}\x{dcff}\x{dd00}
+ 0: \x{d800}\x{d7ff}\x{dc00}\x{dc00}\x{dcff}\x{dd00}
+
+/-- End of testinput17 --/
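
Reader's note on the \h and \v results in testoutput17 above. The compiled classes printed by the BZ modifier list the code points that \h and \v cover in the 16-bit library. The following is a minimal sketch in Python, not PCRE code: the names H_SPACE, V_SPACE and first_run are hypothetical, and the code points are copied from the class listings above. It reproduces which part of a subject a pattern such as /\h+/ or /\v+/ picks up first.

```python
# Code points copied from the compiled classes in the test output:
#   \h -> [\x09 \xa0\x{1680}\x{180e}\x{2000}-\x{200a}\x{202f}\x{205f}\x{3000}]
#   \v -> [\x0a-\x0d\x85\x{2028}-\x{2029}]
H_SPACE = {0x09, 0x20, 0xA0, 0x1680, 0x180E, 0x202F, 0x205F, 0x3000}
H_SPACE |= set(range(0x2000, 0x200B))          # \x{2000}-\x{200a} inclusive
V_SPACE = set(range(0x0A, 0x0E)) | {0x85, 0x2028, 0x2029}


def first_run(subject, members):
    """Return the first maximal run of code points that are in `members`.

    This mimics what an unanchored /X+/ reports as match 0 when X is a
    single character class: skip non-members, then take members greedily.
    """
    run = []
    for cp in subject:
        if cp in members:
            run.append(cp)
        elif run:
            break
    return run


# Subjects taken from the /\h+/ and /\v+/ tests above.
h_match = first_run([0x1681, 0x200B, 0x1680, 0x2000, 0x202F, 0x3000], H_SPACE)
v_match = first_run([0x09, 0x0E, 0x84, 0x86, 0x85, 0x0A, 0x0B, 0x0C, 0x0D], V_SPACE)
print([hex(cp) for cp in h_match])
print([hex(cp) for cp in v_match])
```

The boundary values in the subjects (\x{1681}, \x{200b}, \x{84}, \x{86} and so on) sit just outside each range, which is why they are excluded from the reported matches.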

Copied: code/trunk/testdata/testoutput18 (from rev 835, code/branches/pcre16/testdata/testoutput18)
===================================================================
--- code/trunk/testdata/testoutput18                            (rev 0)
+++ code/trunk/testdata/testoutput18    2011-12-28 17:16:11 UTC (rev 836)
@@ -0,0 +1,845 @@
+/-- This set of tests is for UTF-16 support, and is relevant only to the 16-bit
+    library. --/
+
+/\xC3\xC3\xC3xxx/8?DZSS
+**Failed: invalid UTF-8 string cannot be converted to UTF-16
+
+/abc/8
+    \xC3]
+**Failed: invalid UTF-8 string cannot be converted to UTF-16
+
+/X(\C{3})/8
+    X\x{11234}Y
+ 0: X\x{11234}Y
+ 1: \x{11234}Y
+
+/X(\C{4})/8
+    X\x{11234}YZ
+ 0: X\x{11234}YZ
+ 1: \x{11234}YZ
+
+/X\C*/8
+    XYZabcdce
+ 0: XYZabcdce
+
+/X\C*?/8
+    XYZabcde
+ 0: X
+
+/X\C{3,5}/8
+    Xabcdefg
+ 0: Xabcde
+    X\x{11234}Y
+ 0: X\x{11234}Y
+    X\x{11234}YZ
+ 0: X\x{11234}YZ
+    X\x{11234}\x{512}
+ 0: X\x{11234}\x{512}
+    X\x{11234}\x{512}YZ
+ 0: X\x{11234}\x{512}YZ
+    X\x{11234}\x{512}\x{11234}Z
+ 0: X\x{11234}\x{512}\x{11234}
+
+/X\C{3,5}?/8
+    Xabcdefg
+ 0: Xabc
+    X\x{11234}Y
+ 0: X\x{11234}Y
+    X\x{11234}YZ
+ 0: X\x{11234}Y
+    X\x{11234}\x{512}YZ
+ 0: X\x{11234}\x{512}
+    *** Failers
+No match
+    X\x{11234}
+No match
+
+/a\Cb/8
+    aXb
+ 0: aXb
+    a\nb
+ 0: a\x{0a}b
+
+/a\C\Cb/8
+    a\x{12257}b
+ 0: a\x{12257}b
+    ** Failers
+No match
+    a\x{100}b
+No match
+
+/ab\Cde/8
+    abXde
+ 0: abXde
+
+/-- Check maximum character size --/
+
+/\x{ffff}/8DZ
+------------------------------------------------------------------
+        Bra
+        \x{ffff}
+        Ket
+        End
+------------------------------------------------------------------
+Capturing subpattern count = 0
+Options: utf
+First char = \x{ffff}
+No need char
+
+/\x{10000}/8DZ
+------------------------------------------------------------------
+        Bra
+        \x{10000}
+        Ket
+        End
+------------------------------------------------------------------
+Capturing subpattern count = 0
+Options: utf
+First char = \x{d800}
+Need char = \x{dc00}
+
+/\x{100}/8DZ
+------------------------------------------------------------------
+        Bra
+        \x{100}
+        Ket
+        End
+------------------------------------------------------------------
+Capturing subpattern count = 0
+Options: utf
+First char = \x{100}
+No need char
+
+/\x{1000}/8DZ
+------------------------------------------------------------------
+        Bra
+        \x{1000}
+        Ket
+        End
+------------------------------------------------------------------
+Capturing subpattern count = 0
+Options: utf
+First char = \x{1000}
+No need char
+
+/\x{10000}/8DZ
+------------------------------------------------------------------
+        Bra
+        \x{10000}
+        Ket
+        End
+------------------------------------------------------------------
+Capturing subpattern count = 0
+Options: utf
+First char = \x{d800}
+Need char = \x{dc00}
+
+/\x{100000}/8DZ
+------------------------------------------------------------------
+        Bra
+        \x{100000}
+        Ket
+        End
+------------------------------------------------------------------
+Capturing subpattern count = 0
+Options: utf
+First char = \x{dbc0}
+Need char = \x{dc00}
+
+/\x{10ffff}/8DZ
+------------------------------------------------------------------
+        Bra
+        \x{10ffff}
+        Ket
+        End
+------------------------------------------------------------------
+Capturing subpattern count = 0
+Options: utf
+First char = \x{dbff}
+Need char = \x{dfff}
+
+/[\x{ff}]/8DZ
+------------------------------------------------------------------
+        Bra
+        \xff
+        Ket
+        End
+------------------------------------------------------------------
+Capturing subpattern count = 0
+Options: utf
+First char = \x{ff}
+No need char
+
+/[\x{100}]/8DZ
+------------------------------------------------------------------
+        Bra
+        \x{100}
+        Ket
+        End
+------------------------------------------------------------------
+Capturing subpattern count = 0
+Options: utf
+First char = \x{100}
+No need char
+
+/\x80/8DZ
+------------------------------------------------------------------
+        Bra
+        \x80
+        Ket
+        End
+------------------------------------------------------------------
+Capturing subpattern count = 0
+Options: utf
+First char = \x{80}
+No need char
+
+/\xff/8DZ
+------------------------------------------------------------------
+        Bra
+        \xff
+        Ket
+        End
+------------------------------------------------------------------
+Capturing subpattern count = 0
+Options: utf
+First char = \x{ff}
+No need char
+
+/\x{D55c}\x{ad6d}\x{C5B4}/DZ8
+------------------------------------------------------------------
+        Bra
+        \x{d55c}\x{ad6d}\x{c5b4}
+        Ket
+        End
+------------------------------------------------------------------
+Capturing subpattern count = 0
+Options: utf
+First char = \x{d55c}
+Need char = \x{c5b4}
+    \x{D55c}\x{ad6d}\x{C5B4}
+ 0: \x{d55c}\x{ad6d}\x{c5b4}
+
+/\x{65e5}\x{672c}\x{8a9e}/DZ8
+------------------------------------------------------------------
+        Bra
+        \x{65e5}\x{672c}\x{8a9e}
+        Ket
+        End
+------------------------------------------------------------------
+Capturing subpattern count = 0
+Options: utf
+First char = \x{65e5}
+Need char = \x{8a9e}
+    \x{65e5}\x{672c}\x{8a9e}
+ 0: \x{65e5}\x{672c}\x{8a9e}
+
+/\x{80}/DZ8
+------------------------------------------------------------------
+        Bra
+        \x80
+        Ket
+        End
+------------------------------------------------------------------
+Capturing subpattern count = 0
+Options: utf
+First char = \x{80}
+No need char
+
+/\x{084}/DZ8
+------------------------------------------------------------------
+        Bra
+        \x84
+        Ket
+        End
+------------------------------------------------------------------
+Capturing subpattern count = 0
+Options: utf
+First char = \x{84}
+No need char
+
+/\x{104}/DZ8
+------------------------------------------------------------------
+        Bra
+        \x{104}
+        Ket
+        End
+------------------------------------------------------------------
+Capturing subpattern count = 0
+Options: utf
+First char = \x{104}
+No need char
+
+/\x{861}/DZ8
+------------------------------------------------------------------
+        Bra
+        \x{861}
+        Ket
+        End
+------------------------------------------------------------------
+Capturing subpattern count = 0
+Options: utf
+First char = \x{861}
+No need char
+
+/\x{212ab}/DZ8
+------------------------------------------------------------------
+        Bra
+        \x{212ab}
+        Ket
+        End
+------------------------------------------------------------------
+Capturing subpattern count = 0
+Options: utf
+First char = \x{d844}
+Need char = \x{deab}
+
+/-- This one is here not because it's different to Perl, but because the way
+the captured single-byte is displayed. (In Perl it becomes a character, and you
+can't tell the difference.) --/
+
+/X(\C)(.*)/8
+    X\x{1234}
+ 0: X\x{1234}
+ 1: \x{1234}
+ 2: 
+    X\nabc
+ 0: X\x{0a}abc
+ 1: \x{0a}
+ 2: abc
+
+/-- This one is here because Perl gives out a grumbly error message (quite
+correctly, but that messes up comparisons). --/
+
+/a\Cb/8
+    *** Failers
+No match
+    a\x{100}b
+ 0: a\x{100}b
+
+/[^ab\xC0-\xF0]/8SDZ
+------------------------------------------------------------------
+        Bra
+        [\x00-`c-\xbf\xf1-\xff] (neg)
+        Ket
+        End
+------------------------------------------------------------------
+Capturing subpattern count = 0
+Options: utf
+No first char
+No need char
+Subject length lower bound = 1
+Starting byte set: \x00 \x01 \x02 \x03 \x04 \x05 \x06 \x07 \x08 \x09 \x0a 
+  \x0b \x0c \x0d \x0e \x0f \x10 \x11 \x12 \x13 \x14 \x15 \x16 \x17 \x18 \x19 
+  \x1a \x1b \x1c \x1d \x1e \x1f \x20 ! " # $ % & ' ( ) * + , - . / 0 1 2 3 4 
+  5 6 7 8 9 : ; < = > ? @ A B C D E F G H I J K L M N O P Q R S T U V W X Y 
+  Z [ \ ] ^ _ ` c d e f g h i j k l m n o p q r s t u v w x y z { | } ~ \x7f 
+  \x80 \x81 \x82 \x83 \x84 \x85 \x86 \x87 \x88 \x89 \x8a \x8b \x8c \x8d \x8e 
+  \x8f \x90 \x91 \x92 \x93 \x94 \x95 \x96 \x97 \x98 \x99 \x9a \x9b \x9c \x9d 
+  \x9e \x9f \xa0 \xa1 \xa2 \xa3 \xa4 \xa5 \xa6 \xa7 \xa8 \xa9 \xaa \xab \xac 
+  \xad \xae \xaf \xb0 \xb1 \xb2 \xb3 \xb4 \xb5 \xb6 \xb7 \xb8 \xb9 \xba \xbb 
+  \xbc \xbd \xbe \xbf \xf1 \xf2 \xf3 \xf4 \xf5 \xf6 \xf7 \xf8 \xf9 \xfa \xfb 
+  \xfc \xfd \xfe \xff 
+    \x{f1}
+ 0: \x{f1}
+    \x{bf}
+ 0: \x{bf}
+    \x{100}
+ 0: \x{100}
+    \x{1000}
+ 0: \x{1000}
+    *** Failers
+ 0: *
+    \x{c0}
+No match
+    \x{f0}
+No match
+
+/Ā{3,4}/8SDZ
+------------------------------------------------------------------
+        Bra
+        \x{100}{3}
+        \x{100}?
+        Ket
+        End
+------------------------------------------------------------------
+Capturing subpattern count = 0
+Options: utf
+First char = \x{100}
+Need char = \x{100}
+Subject length lower bound = 3
+No set of starting bytes
+  \x{100}\x{100}\x{100}\x{100\x{100}
+ 0: \x{100}\x{100}\x{100}
+
+/(\x{100}+|x)/8SDZ
+------------------------------------------------------------------
+        Bra
+        CBra 1
+        \x{100}+
+        Alt
+        x
+        Ket
+        Ket
+        End
+------------------------------------------------------------------
+Capturing subpattern count = 1
+Options: utf
+No first char
+No need char
+Subject length lower bound = 1
+Starting byte set: x \xff 
+
+/(\x{100}*a|x)/8SDZ
+------------------------------------------------------------------
+        Bra
+        CBra 1
+        \x{100}*+
+        a
+        Alt
+        x
+        Ket
+        Ket
+        End
+------------------------------------------------------------------
+Capturing subpattern count = 1
+Options: utf
+No first char
+No need char
+Subject length lower bound = 1
+Starting byte set: a x \xff 
+
+/(\x{100}{0,2}a|x)/8SDZ
+------------------------------------------------------------------
+        Bra
+        CBra 1
+        \x{100}{0,2}
+        a
+        Alt
+        x
+        Ket
+        Ket
+        End
+------------------------------------------------------------------
+Capturing subpattern count = 1
+Options: utf
+No first char
+No need char
+Subject length lower bound = 1
+Starting byte set: a x \xff 
+
+/(\x{100}{1,2}a|x)/8SDZ
+------------------------------------------------------------------
+        Bra
+        CBra 1
+        \x{100}
+        \x{100}{0,1}
+        a
+        Alt
+        x
+        Ket
+        Ket
+        End
+------------------------------------------------------------------
+Capturing subpattern count = 1
+Options: utf
+No first char
+No need char
+Subject length lower bound = 1
+Starting byte set: x \xff 
+
+/\x{100}/8DZ
+------------------------------------------------------------------
+        Bra
+        \x{100}
+        Ket
+        End
+------------------------------------------------------------------
+Capturing subpattern count = 0
+Options: utf
+First char = \x{100}
+No need char
+
+/a\x{100}\x{101}*/8DZ
+------------------------------------------------------------------
+        Bra
+        a\x{100}
+        \x{101}*
+        Ket
+        End
+------------------------------------------------------------------
+Capturing subpattern count = 0
+Options: utf
+First char = 'a'
+Need char = \x{100}
+
+/a\x{100}\x{101}+/8DZ
+------------------------------------------------------------------
+        Bra
+        a\x{100}
+        \x{101}+
+        Ket
+        End
+------------------------------------------------------------------
+Capturing subpattern count = 0
+Options: utf
+First char = 'a'
+Need char = \x{101}
+
+/[^\x{c4}]/DZ
+------------------------------------------------------------------
+        Bra
+        [^\xc4]
+        Ket
+        End
+------------------------------------------------------------------
+Capturing subpattern count = 0
+No options
+No first char
+No need char
+
+/[\x{100}]/8DZ
+------------------------------------------------------------------
+        Bra
+        \x{100}
+        Ket
+        End
+------------------------------------------------------------------
+Capturing subpattern count = 0
+Options: utf
+First char = \x{100}
+No need char
+    \x{100}
+ 0: \x{100}
+    Z\x{100}
+ 0: \x{100}
+    \x{100}Z
+ 0: \x{100}
+    *** Failers
+No match
+
+/[\xff]/DZ8
+------------------------------------------------------------------
+        Bra
+        \xff
+        Ket
+        End
+------------------------------------------------------------------
+Capturing subpattern count = 0
+Options: utf
+First char = \x{ff}
+No need char
+    >\x{ff}<
+ 0: \x{ff}
+
+/[^\xff]/8DZ
+------------------------------------------------------------------
+        Bra
+        [^\x{ff}]
+        Ket
+        End
+------------------------------------------------------------------
+Capturing subpattern count = 0
+Options: utf
+No first char
+No need char
+
+/\x{100}abc(xyz(?1))/8DZ
+------------------------------------------------------------------
+        Bra
+        \x{100}abc
+        CBra 1
+        xyz
+        Recurse
+        Ket
+        Ket
+        End
+------------------------------------------------------------------
+Capturing subpattern count = 1
+Options: utf
+First char = \x{100}
+Need char = 'z'
+
+/\777/8I
+Capturing subpattern count = 0
+Options: utf
+First char = \x{1ff}
+No need char
+  \x{1ff}
+ 0: \x{1ff}
+  \777
+ 0: \x{1ff}
+
+/\x{100}+\x{200}/8DZ
+------------------------------------------------------------------
+        Bra
+        \x{100}++
+        \x{200}
+        Ket
+        End
+------------------------------------------------------------------
+Capturing subpattern count = 0
+Options: utf
+First char = \x{100}
+Need char = \x{200}
+
+/\x{100}+X/8DZ
+------------------------------------------------------------------
+        Bra
+        \x{100}++
+        X
+        Ket
+        End
+------------------------------------------------------------------
+Capturing subpattern count = 0
+Options: utf
+First char = \x{100}
+Need char = 'X'
+
+/^[\QĀ\E-\QŐ\E/BZ8
+Failed: missing terminating ] for character class at offset 13
+
+/X/8
+    \x{0}\x{d7ff}\x{e000}\x{10ffff}
+No match
+    \x{d800}
+Error -10 (bad UTF-16 string) offset=0 reason=1
+    \x{d800}\?
+No match
+    \x{da00}
+Error -10 (bad UTF-16 string) offset=0 reason=1
+    \x{da00}\?
+No match
+    \x{dc00}
+Error -10 (bad UTF-16 string) offset=0 reason=3
+    \x{dc00}\?
+No match
+    \x{de00}
+Error -10 (bad UTF-16 string) offset=0 reason=3
+    \x{de00}\?
+No match
+    \x{dfff}
+Error -10 (bad UTF-16 string) offset=0 reason=3
+    \x{dfff}\?
+No match
+    \x{110000}
+**Failed: character value greater than 0x10ffff cannot be converted to UTF-16
+    \x{d800}\x{1234}
+Error -10 (bad UTF-16 string) offset=1 reason=2
+    \x{fffe}
+Error -10 (bad UTF-16 string) offset=0 reason=4
+
+/(*UTF16)\x{11234}/
+  abcd\x{11234}pqr
+ 0: \x{11234}
+
+/(*CRLF)(*UTF16)(*BSR_UNICODE)a\Rb/I
+Capturing subpattern count = 0
+Options: bsr_unicode utf
+Forced newline sequence: CRLF
+First char = 'a'
+Need char = 'b'
+
+/\h/SI8
+Capturing subpattern count = 0
+Options: utf
+No first char
+No need char
+Subject length lower bound = 1
+Starting byte set: \x09 \x20 \xa0 \xff 
+    ABC\x{09}
+ 0: \x{09}
+    ABC\x{20}
+ 0:  
+    ABC\x{a0}
+ 0: \x{a0}
+    ABC\x{1680}
+ 0: \x{1680}
+    ABC\x{180e}
+ 0: \x{180e}
+    ABC\x{2000}
+ 0: \x{2000}
+    ABC\x{202f}
+ 0: \x{202f}
+    ABC\x{205f}
+ 0: \x{205f}
+    ABC\x{3000}
+ 0: \x{3000}
+
+/\v/SI8
+Capturing subpattern count = 0
+Options: utf
+No first char
+No need char
+Subject length lower bound = 1
+Starting byte set: \x0a \x0b \x0c \x0d \x85 \xff 
+    ABC\x{0a}
+ 0: \x{0a}
+    ABC\x{0b}
+ 0: \x{0b}
+    ABC\x{0c}
+ 0: \x{0c}
+    ABC\x{0d}
+ 0: \x{0d}
+    ABC\x{85}
+ 0: \x{85}
+    ABC\x{2028}
+ 0: \x{2028}
+
+/\h*A/SI8
+Capturing subpattern count = 0
+Options: utf
+No first char
+Need char = 'A'
+Subject length lower bound = 1
+Starting byte set: \x09 \x20 A \xa0 
+    CDBABC
+ 0: A
+
+/\v+A/SI8
+Capturing subpattern count = 0
+Options: utf
+No first char
+Need char = 'A'
+Subject length lower bound = 2
+Starting byte set: \x0a \x0b \x0c \x0d \x85 \xff 
+
+/\s?xxx\s/8SI
+Capturing subpattern count = 0
+Options: utf
+No first char
+Need char = 'x'
+Subject length lower bound = 4
+Starting byte set: \x09 \x0a \x0c \x0d \x20 x 
+
+/\sxxx\s/I8ST1
+Capturing subpattern count = 0
+Options: utf
+No first char
+Need char = 'x'
+Subject length lower bound = 5
+Starting byte set: \x09 \x0a \x0c \x0d \x20 \x85 \xa0 
+    AB\x{85}xxx\x{a0}XYZ
+ 0: \x{85}xxx\x{a0}
+    AB\x{a0}xxx\x{85}XYZ
+ 0: \x{a0}xxx\x{85}
+
+/\S \S/I8ST1
+Capturing subpattern count = 0
+Options: utf
+No first char
+Need char = ' '
+Subject length lower bound = 3
+Starting byte set: \x00 \x01 \x02 \x03 \x04 \x05 \x06 \x07 \x08 \x0b \x0e 
+  \x0f \x10 \x11 \x12 \x13 \x14 \x15 \x16 \x17 \x18 \x19 \x1a \x1b \x1c \x1d 
+  \x1e \x1f ! " # $ % & ' ( ) * + , - . / 0 1 2 3 4 5 6 7 8 9 : ; < = > ? @ 
+  A B C D E F G H I J K L M N O P Q R S T U V W X Y Z [ \ ] ^ _ ` a b c d e 
+  f g h i j k l m n o p q r s t u v w x y z { | } ~ \x7f \x80 \x81 \x82 \x83 
+  \x84 \x86 \x87 \x88 \x89 \x8a \x8b \x8c \x8d \x8e \x8f \x90 \x91 \x92 \x93 
+  \x94 \x95 \x96 \x97 \x98 \x99 \x9a \x9b \x9c \x9d \x9e \x9f \xa1 \xa2 \xa3 
+  \xa4 \xa5 \xa6 \xa7 \xa8 \xa9 \xaa \xab \xac \xad \xae \xaf \xb0 \xb1 \xb2 
+  \xb3 \xb4 \xb5 \xb6 \xb7 \xb8 \xb9 \xba \xbb \xbc \xbd \xbe \xbf \xc0 \xc1 
+  \xc2 \xc3 \xc4 \xc5 \xc6 \xc7 \xc8 \xc9 \xca \xcb \xcc \xcd \xce \xcf \xd0 
+  \xd1 \xd2 \xd3 \xd4 \xd5 \xd6 \xd7 \xd8 \xd9 \xda \xdb \xdc \xdd \xde \xdf 
+  \xe0 \xe1 \xe2 \xe3 \xe4 \xe5 \xe6 \xe7 \xe8 \xe9 \xea \xeb \xec \xed \xee 
+  \xef \xf0 \xf1 \xf2 \xf3 \xf4 \xf5 \xf6 \xf7 \xf8 \xf9 \xfa \xfb \xfc \xfd 
+  \xfe \xff 
+    \x{a2} \x{84}
+ 0: \x{a2} \x{84}
+    A Z
+ 0: A Z
+
+/a+/8
+    a\x{123}aa\>1
+ 0: aa
+    a\x{123}aa\>2
+ 0: aa
+    a\x{123}aa\>3
+ 0: a
+    a\x{123}aa\>4
+No match
+    a\x{123}aa\>5
+Error -24 (bad offset value)
+    a\x{123}aa\>6
+Error -24 (bad offset value)
+
+/\x{1234}+/iS8I
+Capturing subpattern count = 0
+Options: caseless utf
+First char = \x{1234}
+No need char
+Subject length lower bound = 1
+No set of starting bytes
+
+/\x{1234}+?/iS8I
+Capturing subpattern count = 0
+Options: caseless utf
+First char = \x{1234}
+No need char
+Subject length lower bound = 1
+No set of starting bytes
+
+/\x{1234}++/iS8I
+Capturing subpattern count = 0
+Options: caseless utf
+First char = \x{1234}
+No need char
+Subject length lower bound = 1
+No set of starting bytes
+
+/\x{1234}{2}/iS8I
+Capturing subpattern count = 0
+Options: caseless utf
+First char = \x{1234}
+Need char = \x{1234}
+Subject length lower bound = 2
+No set of starting bytes
+
+/[^\x{c4}]/8DZ
+------------------------------------------------------------------
+        Bra
+        [^\x{c4}]
+        Ket
+        End
+------------------------------------------------------------------
+Capturing subpattern count = 0
+Options: utf
+No first char
+No need char
+
+/X+\x{200}/8DZ
+------------------------------------------------------------------
+        Bra
+        X++
+        \x{200}
+        Ket
+        End
+------------------------------------------------------------------
+Capturing subpattern count = 0
+Options: utf
+First char = 'X'
+Need char = \x{200}
+
+/\R/SI8
+Capturing subpattern count = 0
+Options: utf
+No first char
+No need char
+Subject length lower bound = 1
+Starting byte set: \x0a \x0b \x0c \x0d \x85 \xff 
+
+/-- Check bad offset --/
+
+/a/8
+    \x{10000}\>1
+Error -11 (bad UTF-16 offset)
+    \x{10000}ab\>2
+ 0: a
+    \x{10000}ab\>3
+No match
+    \x{10000}ab\>4
+No match
+    \x{10000}ab\>5
+Error -24 (bad offset value)
+
+/-- End of testinput18 --/
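
Reader's note on the UTF-16 results in testoutput18 above. Several outputs follow directly from how UTF-16 splits code points above 0xffff into a surrogate pair: the "First char"/"Need char" values for \x{10000}, \x{100000}, \x{10ffff} and \x{212ab}, the number of \C units a subject needs, and the start-offset errors. The sketch below is Python written for illustration, not PCRE's implementation. All function names are hypothetical, and the reason numbers are simply the ones printed in the output above, with the conditions attached to them being an interpretation of those test cases.

```python
def to_utf16_units(cp):
    """Encode one code point as a list of 16-bit code units.

    Values in the surrogate range are passed through unchanged, so a
    subject such as \\x{d800} becomes a lone (invalid) code unit.
    """
    if cp > 0x10FFFF:
        raise ValueError("character value greater than 0x10ffff")
    if cp < 0x10000:
        return [cp]
    v = cp - 0x10000
    return [0xD800 + (v >> 10), 0xDC00 + (v & 0x3FF)]


def encode(code_points):
    """Encode a sequence of code points as one flat list of code units."""
    units = []
    for cp in code_points:
        units.extend(to_utf16_units(cp))
    return units


def first_utf16_error(units):
    """Return (offset, reason) for the first problem found, or None."""
    i = 0
    while i < len(units):
        u = units[i]
        if 0xD800 <= u <= 0xDBFF:                  # high surrogate
            if i + 1 == len(units):
                return (i, 1)                      # nothing follows it
            if not 0xDC00 <= units[i + 1] <= 0xDFFF:
                return (i + 1, 2)                  # follower is not a low surrogate
            i += 2
            continue
        if 0xDC00 <= u <= 0xDFFF:
            return (i, 3)                          # low surrogate on its own
        if u == 0xFFFE:
            return (i, 4)                          # 0xfffe is rejected
        i += 1
    return None


def check_start_offset(units, offset):
    """Classify a start offset, given in code units, as in the \\> tests."""
    if offset > len(units):
        return "bad offset value"
    if offset < len(units) and 0xDC00 <= units[offset] <= 0xDFFF:
        return "bad UTF-16 offset"                 # points into a surrogate pair
    return "ok"


print([hex(u) for u in to_utf16_units(0x212AB)])
print(len(encode([ord("X"), 0x11234, ord("Y")])))
print(check_start_offset(encode([0x10000, ord("a"), ord("b")]), 1))
```

Because \C matches one code unit rather than one character, X\x{11234}Y is four units long, which is why /X(\C{3})/ can match it while /a\C\Cb/ cannot match a\x{100}b.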

Copied: code/trunk/testdata/testoutput19 (from rev 835, code/branches/pcre16/testdata/testoutput19)
===================================================================
--- code/trunk/testdata/testoutput19                            (rev 0)
+++ code/trunk/testdata/testoutput19    2011-12-28 17:16:11 UTC (rev 836)
@@ -0,0 +1,88 @@
+/-- This set of tests is for Unicode property support, relevant only to the
+    16-bit library. --/
+    
+/A\x{391}\x{10427}\x{ff3a}\x{1fb0}/8iDZ
+------------------------------------------------------------------
+        Bra
+     /i A\x{391}\x{10427}\x{ff3a}\x{1fb0}
+        Ket
+        End
+------------------------------------------------------------------
+Capturing subpattern count = 0
+Options: caseless utf
+First char = 'A' (caseless)
+Need char = \x{1fb0} (caseless)
+
+/A\x{391}\x{10427}\x{ff3a}\x{1fb0}/8DZ
+------------------------------------------------------------------
+        Bra
+        A\x{391}\x{10427}\x{ff3a}\x{1fb0}
+        Ket
+        End
+------------------------------------------------------------------
+Capturing subpattern count = 0
+Options: utf
+First char = 'A'
+Need char = \x{1fb0}
+
+/AB\x{1fb0}/8DZ
+------------------------------------------------------------------
+        Bra
+        AB\x{1fb0}
+        Ket
+        End
+------------------------------------------------------------------
+Capturing subpattern count = 0
+Options: utf
+First char = 'A'
+Need char = \x{1fb0}
+
+/AB\x{1fb0}/8DZi
+------------------------------------------------------------------
+        Bra
+     /i AB\x{1fb0}
+        Ket
+        End
+------------------------------------------------------------------
+Capturing subpattern count = 0
+Options: caseless utf
+First char = 'A' (caseless)
+Need char = \x{1fb0} (caseless)
+
+/\x{401}\x{420}\x{421}\x{422}\x{423}\x{424}\x{425}\x{426}\x{427}\x{428}\x{429}\x{42a}\x{42b}\x{42c}\x{42d}\x{42e}\x{42f}/8iSI
+Capturing subpattern count = 0
+Options: caseless utf
+First char = \x{401} (caseless)
+Need char = \x{42f} (caseless)
+Subject length lower bound = 17
+No set of starting bytes
+    \x{401}\x{420}\x{421}\x{422}\x{423}\x{424}\x{425}\x{426}\x{427}\x{428}\x{429}\x{42a}\x{42b}\x{42c}\x{42d}\x{42e}\x{42f}
+ 0: \x{401}\x{420}\x{421}\x{422}\x{423}\x{424}\x{425}\x{426}\x{427}\x{428}\x{429}\x{42a}\x{42b}\x{42c}\x{42d}\x{42e}\x{42f}
+    \x{451}\x{440}\x{441}\x{442}\x{443}\x{444}\x{445}\x{446}\x{447}\x{448}\x{449}\x{44a}\x{44b}\x{44c}\x{44d}\x{44e}\x{44f}
+ 0: \x{451}\x{440}\x{441}\x{442}\x{443}\x{444}\x{445}\x{446}\x{447}\x{448}\x{449}\x{44a}\x{44b}\x{44c}\x{44d}\x{44e}\x{44f}
+
+/[ⱥ]/8iBZ
+------------------------------------------------------------------
+        Bra
+     /i \x{2c65}
+        Ket
+        End
+------------------------------------------------------------------
+
+/[^ⱥ]/8iBZ
+------------------------------------------------------------------
+        Bra
+     /i [^\x{2c65}]
+        Ket
+        End
+------------------------------------------------------------------
+
+/[[:blank:]]/WBZ
+------------------------------------------------------------------
+        Bra
+        [\x09 \xa0\x{1680}\x{180e}\x{2000}-\x{200a}\x{202f}\x{205f}\x{3000}]
+        Ket
+        End
+------------------------------------------------------------------
+
+/-- End of testinput19 --/
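
Reader's note on the caseless results in testoutput19 above. With Unicode property support, a caseless match compares each character against its other case, including characters beyond Latin-1 such as the Cyrillic run \x{401}\x{420}-\x{42f}. The sketch below illustrates the idea in Python. It is an assumption-laden stand-in: it relies on the Python interpreter's own Unicode tables through str.lower() and str.upper(), not on PCRE's tables, and the names other_case and caseless_equal are hypothetical.

```python
def other_case(cp):
    """Return the single-code-point other case of `cp`, or `cp` itself.

    Uses Python's built-in case mappings; mappings that expand to more
    than one character are ignored in this sketch.
    """
    ch = chr(cp)
    for candidate in (ch.lower(), ch.upper()):
        if len(candidate) == 1 and candidate != ch:
            return ord(candidate)
    return cp


def caseless_equal(pattern, subject):
    """Compare two code point sequences, allowing either case per position."""
    return len(pattern) == len(subject) and all(
        p == s or other_case(p) == s for p, s in zip(pattern, subject)
    )


# The Cyrillic pattern from the test, and its lower-case subject.
pattern = [0x401] + list(range(0x420, 0x430))
subject = [0x451] + list(range(0x440, 0x450))
print(len(pattern), caseless_equal(pattern, subject))
```

The pattern has 17 code points, each below 0x10000, which agrees with the "Subject length lower bound = 17" line reported for it.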

Modified: code/trunk/testdata/testoutput2
===================================================================
--- code/trunk/testdata/testoutput2    2011-12-28 16:10:09 UTC (rev 835)
+++ code/trunk/testdata/testoutput2    2011-12-28 17:16:11 UTC (rev 836)
@@ -3,12 +3,11 @@
     It also checks the non-Perl syntax the PCRE supports (Python, .NET, 
     Oniguruma). Finally, there are some tests where PCRE and Perl differ, 
     either because PCRE can't be compatible, or there is a possible Perl 
-    bug. --/  
+    bug.
+    
+    NOTE: This is a non-UTF set of tests. When UTF support is needed, use
+    test 5, and if Unicode Property Support is needed, use test 7. --/

-/-- Originally, the Perl >= 5.10 things were in here too, but now I have 
-    separated many (most?) of them out into test 11. However, there may still 
-    be some that were overlooked. --/   
-
 /(a)b|/I
 Capturing subpattern count = 1
 No options
@@ -377,61 +376,6 @@
  2: <unset>
  3: def

-/abc/P
-    abc
- 0: abc
-    *** Failers
-No match: POSIX code 17: match failed
-
-/^abc|def/P
-    abcdef
- 0: abc
-    abcdef\B
- 0: def
-
-/.*((abc)$|(def))/P
-    defabc
- 0: defabc
- 1: abc
- 2: abc
-    \Zdefabc
- 0: def
- 1: def
- 3: def
-
-/the quick brown fox/P
-    the quick brown fox
- 0: the quick brown fox
-    *** Failers
-No match: POSIX code 17: match failed
-    The Quick Brown Fox
-No match: POSIX code 17: match failed
-
-/the quick brown fox/Pi
-    the quick brown fox
- 0: the quick brown fox
-    The Quick Brown Fox
- 0: The Quick Brown Fox
-
-/abc.def/P
-    *** Failers
-No match: POSIX code 17: match failed
-    abc\ndef
-No match: POSIX code 17: match failed
-
-/abc$/P
-    abc
- 0: abc
-    abc\n
- 0: abc
-
-/(abc)\2/P
-Failed: POSIX code 15: bad back reference at offset 7     
-
-/(abc\1)/P
-    abc
-No match: POSIX code 17: match failed
-
 /)/
 Failed: unmatched parentheses at offset 0

@@ -1031,9 +975,6 @@
 /abc/\
 Failed: \ at end of pattern at offset 4

-/abc/\P
-Failed: POSIX code 9: bad escape sequence at offset 4     
-
 /abc/\i
 Failed: \ at end of pattern at offset 4

@@ -1149,7 +1090,7 @@
 No need char
     abc\00def\L\C0
  0: abc\x00def
- 0C abc (7)
+ 0C abc\x00def (7)
  0L abc

 /word ((?:[a-zA-Z0-9]+ )((?:[a-zA-Z0-9]+ )((?:[a-zA-Z0-9]+ )((?:[a-zA-Z0-9]+
@@ -1268,11 +1209,6 @@
  0: iss
  0+ issippi

-/\Biss\B/I+P
-    Mississippi
- 0: iss
- 0+ issippi
-
 /iss/IG+
 Capturing subpattern count = 0
 No options
@@ -1402,7 +1338,7 @@
 Contains explicit CR or LF match
 Options: multiline
 First char at start or follows newline
-Need char = 10
+Need char = \x0a
     ab\nab\ncd
  0: ab\x0a
  0+ ab\x0acd
@@ -1689,33 +1625,6 @@
     \Nabc
 No match

-/a*(b+)(z)(z)/P
-    aaaabbbbzzzz
- 0: aaaabbbbzz
- 1: bbbb
- 2: z
- 3: z
-    aaaabbbbzzzz\O0
-    aaaabbbbzzzz\O1
- 0: aaaabbbbzz
-    aaaabbbbzzzz\O2
- 0: aaaabbbbzz
- 1: bbbb
-    aaaabbbbzzzz\O3
- 0: aaaabbbbzz
- 1: bbbb
- 2: z
-    aaaabbbbzzzz\O4
- 0: aaaabbbbzz
- 1: bbbb
- 2: z
- 3: z
-    aaaabbbbzzzz\O5
- 0: aaaabbbbzz
- 1: bbbb
- 2: z
- 3: z
-
 /^.?abcd/IS
 Capturing subpattern count = 0
 Options: anchored
@@ -5851,24 +5760,6 @@
     line one\nthis is a line\nbreak in the second line
 No match

-/ab.cd/P
-    ab-cd
- 0: ab-cd
-    ab=cd
- 0: ab=cd
-    ** Failers
-No match: POSIX code 17: match failed
-    ab\ncd
-No match: POSIX code 17: match failed
-
-/ab.cd/Ps
-    ab-cd
- 0: ab-cd
-    ab=cd
- 0: ab=cd
-    ab\ncd
- 0: ab\x0acd
-
 /(?i)(?-i)AbCd/I
 Capturing subpattern count = 0
 No options
@@ -6161,21 +6052,10 @@
     ((this))
  0: ((this))

-/a(b)c/PN
-    abc
-Matched with REG_NOSUB
-
-/a(?P<name>b)c/PN
-    abc
-Matched with REG_NOSUB
-
-/\x{100}/I
-Failed: character value in \x{...} sequence is too large at offset 6
-
 /\x{0000ff}/I
 Capturing subpattern count = 0
 No options
-First char = 255
+First char = \xff
 No need char

 /^((?P<A>a1)|(?P<A>a2)b)/I
@@ -6285,7 +6165,7 @@
  0: a1
  1: a1
  2: a1
-copy substring Z failed -7
+get substring Z failed -7
   G a1 (2) A

 /^(?P<A>a)(?P<A>b)/IJ
@@ -6317,7 +6197,7 @@
   G a (1) A
     cd\GA
  0: cd
-copy substring A failed -7
+get substring A failed -7

 /^(?P<A>a)(?P<A>b)|cd(?P<A>ef)(?P<A>gh)/IJ
 Capturing subpattern count = 4
@@ -7548,7 +7428,7 @@
 /[^a]+a/BZi
 ------------------------------------------------------------------
         Bra
-     /i [^A]++
+     /i [^a]++
      /i a
         Ket
         End
@@ -7557,7 +7437,7 @@
 /[^a]+A/BZi
 ------------------------------------------------------------------
         Bra
-     /i [^A]++
+     /i [^a]++
      /i A
         Ket
         End
@@ -8503,66 +8383,6 @@
  3: <unset>
  4: x

-/[\h]/BZ
-------------------------------------------------------------------
-        Bra
-        [\x09 \xa0]
-        Ket
-        End
-------------------------------------------------------------------
-    >\x09<
- 0: \x09
-
-/[\h]+/BZ
-------------------------------------------------------------------
-        Bra
-        [\x09 \xa0]+
-        Ket
-        End
-------------------------------------------------------------------
-    >\x09\x20\xa0<
- 0: \x09 \xa0
-
-/[\v]/BZ
-------------------------------------------------------------------
-        Bra
-        [\x0a-\x0d\x85]
-        Ket
-        End
-------------------------------------------------------------------
-
-/[\H]/BZ
-------------------------------------------------------------------
-        Bra
-        [\x00-\x08\x0a-\x1f!-\x9f\xa1-\xff]
-        Ket
-        End
-------------------------------------------------------------------
-
-/[^\h]/BZ
-------------------------------------------------------------------
-        Bra
-        [\x00-\x08\x0a-\x1f!-\x9f\xa1-\xff] (neg)
-        Ket
-        End
-------------------------------------------------------------------
-
-/[\V]/BZ
-------------------------------------------------------------------
-        Bra
-        [\x00-\x09\x0e-\x84\x86-\xff]
-        Ket
-        End
-------------------------------------------------------------------
-
-/[\x0a\V]/BZ
-------------------------------------------------------------------
-        Bra
-        [\x00-\x0a\x0e-\x84\x86-\xff]
-        Ket
-        End
-------------------------------------------------------------------
-
 /\H++X/BZ
 ------------------------------------------------------------------
         Bra
@@ -9475,14 +9295,6 @@
 First char at start or follows newline
 No need char

-/a?|b?/P
-    abc
- 0: a
-    ** Failers
- 0: 
-    ddd\N   
-No match: POSIX code 17: match failed
-
 /xyz/C
   xyz 
 --->xyz
@@ -9877,14 +9689,6 @@
    abc\P\P
  0: abc

-/\w+A/P
-   CDAAAAB 
- 0: CDAAAA
-
-/\w+A/PU
-   CDAAAAB 
- 0: CDA
-
 /abc\K123/
     xyzabc123pqr
  0: 123
@@ -10277,210 +10081,6 @@
 Subject length lower bound = 22
 No set of starting bytes

-/  (?: [\040\t] |  \(
-(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
-\)  )*                          # optional leading comment
-(?:    (?:
-[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+    # some number of atom characters...
-(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
-|
-" (?:                      # opening quote...
-[^\\\x80-\xff\n\015"]                #   Anything except backslash and quote
-|                     #    or
-\\ [^\x80-\xff]           #   Escaped something (something != CR)
-)* "  # closing quote
-)                    # initial word
-(?:  (?: [\040\t] |  \(
-(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
-\)  )*  \.  (?: [\040\t] |  \(
-(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
-\)  )*   (?:
-[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+    # some number of atom characters...
-(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
-|
-" (?:                      # opening quote...
-[^\\\x80-\xff\n\015"]                #   Anything except backslash and quote
-|                     #    or
-\\ [^\x80-\xff]           #   Escaped something (something != CR)
-)* "  # closing quote
-)  )* # further okay, if led by a period
-(?: [\040\t] |  \(
-(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
-\)  )*  @  (?: [\040\t] |  \(
-(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
-\)  )*    (?:
-[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+    # some number of atom characters...
-(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
-|   \[                         # [
-(?: [^\\\x80-\xff\n\015\[\]] |  \\ [^\x80-\xff]  )*    #    stuff
-\]                        #           ]
-)                           # initial subdomain
-(?:                                  #
-(?: [\040\t] |  \(
-(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
-\)  )*  \.                        # if led by a period...
-(?: [\040\t] |  \(
-(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
-\)  )*   (?:
-[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+    # some number of atom characters...
-(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
-|   \[                         # [
-(?: [^\\\x80-\xff\n\015\[\]] |  \\ [^\x80-\xff]  )*    #    stuff
-\]                        #           ]
-)                     #   ...further okay
-)*
-# address
-|                     #  or
-(?:
-[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+    # some number of atom characters...
-(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
-|
-" (?:                      # opening quote...
-[^\\\x80-\xff\n\015"]                #   Anything except backslash and quote
-|                     #    or
-\\ [^\x80-\xff]           #   Escaped something (something != CR)
-)* "  # closing quote
-)             # one word, optionally followed by....
-(?:
-[^()<>@,;:".\\\[\]\x80-\xff\000-\010\012-\037]  |  # atom and space parts, or...
-\(
-(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
-\)       |  # comments, or...
-
-" (?:                      # opening quote...
-[^\\\x80-\xff\n\015"]                #   Anything except backslash and quote
-|                     #    or
-\\ [^\x80-\xff]           #   Escaped something (something != CR)
-)* "  # closing quote
-# quoted strings
-)*
-<  (?: [\040\t] |  \(
-(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
-\)  )*                     # leading <
-(?:  @  (?: [\040\t] |  \(
-(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
-\)  )*    (?:
-[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+    # some number of atom characters...
-(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
-|   \[                         # [
-(?: [^\\\x80-\xff\n\015\[\]] |  \\ [^\x80-\xff]  )*    #    stuff
-\]                        #           ]
-)                           # initial subdomain
-(?:                                  #
-(?: [\040\t] |  \(
-(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
-\)  )*  \.                        # if led by a period...
-(?: [\040\t] |  \(
-(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
-\)  )*   (?:
-[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+    # some number of atom characters...
-(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
-|   \[                         # [
-(?: [^\\\x80-\xff\n\015\[\]] |  \\ [^\x80-\xff]  )*    #    stuff
-\]                        #           ]
-)                     #   ...further okay
-)*
-
-(?:  (?: [\040\t] |  \(
-(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
-\)  )*  ,  (?: [\040\t] |  \(
-(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
-\)  )*  @  (?: [\040\t] |  \(
-(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
-\)  )*    (?:
-[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+    # some number of atom characters...
-(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
-|   \[                         # [
-(?: [^\\\x80-\xff\n\015\[\]] |  \\ [^\x80-\xff]  )*    #    stuff
-\]                        #           ]
-)                           # initial subdomain
-(?:                                  #
-(?: [\040\t] |  \(
-(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
-\)  )*  \.                        # if led by a period...
-(?: [\040\t] |  \(
-(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
-\)  )*   (?:
-[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+    # some number of atom characters...
-(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
-|   \[                         # [
-(?: [^\\\x80-\xff\n\015\[\]] |  \\ [^\x80-\xff]  )*    #    stuff
-\]                        #           ]
-)                     #   ...further okay
-)*
-)* # further okay, if led by comma
-:                                # closing colon
-(?: [\040\t] |  \(
-(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
-\)  )*  )? #       optional route
-(?:
-[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+    # some number of atom characters...
-(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
-|
-" (?:                      # opening quote...
-[^\\\x80-\xff\n\015"]                #   Anything except backslash and quote
-|                     #    or
-\\ [^\x80-\xff]           #   Escaped something (something != CR)
-)* "  # closing quote
-)                    # initial word
-(?:  (?: [\040\t] |  \(
-(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
-\)  )*  \.  (?: [\040\t] |  \(
-(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
-\)  )*   (?:
-[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+    # some number of atom characters...
-(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
-|
-" (?:                      # opening quote...
-[^\\\x80-\xff\n\015"]                #   Anything except backslash and quote
-|                     #    or
-\\ [^\x80-\xff]           #   Escaped something (something != CR)
-)* "  # closing quote
-)  )* # further okay, if led by a period
-(?: [\040\t] |  \(
-(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
-\)  )*  @  (?: [\040\t] |  \(
-(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
-\)  )*    (?:
-[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+    # some number of atom characters...
-(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
-|   \[                         # [
-(?: [^\\\x80-\xff\n\015\[\]] |  \\ [^\x80-\xff]  )*    #    stuff
-\]                        #           ]
-)                           # initial subdomain
-(?:                                  #
-(?: [\040\t] |  \(
-(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
-\)  )*  \.                        # if led by a period...
-(?: [\040\t] |  \(
-(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
-\)  )*   (?:
-[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+    # some number of atom characters...
-(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
-|   \[                         # [
-(?: [^\\\x80-\xff\n\015\[\]] |  \\ [^\x80-\xff]  )*    #    stuff
-\]                        #           ]
-)                     #   ...further okay
-)*
-#       address spec
-(?: [\040\t] |  \(
-(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
-\)  )*  > #                  trailing >
-# name and address
-)  (?: [\040\t] |  \(
-(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
-\)  )*                       # optional trailing comment
-/xSI
-Capturing subpattern count = 0
-Contains explicit CR or LF match
-Options: extended
-No first char
-No need char
-Subject length lower bound = 3
-Starting byte set: \x09 \x20 ! " # $ % & ' ( * + - / 0 1 2 3 4 5 6 7 8 
-  9 = ? A B C D E F G H I J K L M N O P Q R S T U V W X Y Z ^ _ ` a b c d e 
-  f g h i j k l m n o p q r s t u v w x y z { | } ~ \x7f 
-
 /<tr([\w\W\s\d][^<>]{0,})><TD([\w\W\s\d][^<>]{0,})>([\d]{0,}\.)(.*)((<BR>([\w\W\s\d][^<>]{0,})|[\s]{0,}))<\/a><\/TD><TD([\w\W\s\d][^<>]{0,})>([\w\W\s\d][^<>]{0,})<\/TD><TD([\w\W\s\d][^<>]{0,})>([\w\W\s\d][^<>]{0,})<\/TD><\/TR>/isIS
 Capturing subpattern count = 11
 Options: caseless dotall
@@ -10989,176 +10589,22 @@
     AC
 No match

-/--- A whole lot of tests of verbs with arguments are here rather than in test
-     11 because Perl doesn't seem to follow its specification entirely 
-     correctly. ---/
-
-/--- Perl 5.11 sets $REGERROR on the AC failure case here; PCRE does not. It is
-     not clear how Perl defines "involved in the failure of the match". ---/ 
-
-/^(A(*THEN:A)B|C(*THEN:B)D)/K
-    AB
- 0: AB
- 1: AB
-    CD
- 0: CD
- 1: CD
-    ** Failers
-No match
-    AC
-No match
-    CB    
-No match, mark = B
-    
-/--- Check the use of names for success and failure. PCRE doesn't show these 
-names for success, though Perl does, contrary to its spec. ---/
-
-/^(A(*PRUNE:A)B|C(*PRUNE:B)D)/K
-    AB
- 0: AB
- 1: AB
-    CD
- 0: CD
- 1: CD
-    ** Failers
-No match
-    AC
-No match, mark = A
-    CB    
-No match, mark = B
-    
-/--- An empty name does not pass back an empty string. It is the same as if no
-name were given. ---/ 
-
-/^(A(*PRUNE:)B|C(*PRUNE:B)D)/K
-    AB
- 0: AB
- 1: AB
-    CD 
- 0: CD
- 1: CD
-
-/--- PRUNE goes to next bumpalong; COMMIT does not. ---/
-    
-/A(*PRUNE:A)B/K
-    ACAB
- 0: AB
-
-/(*MARK:A)(*PRUNE:B)(C|X)/KS
-    C
- 0: C
- 1: C
-MK: A
-    D 
-No match
-
-/(*MARK:A)(*PRUNE:B)(C|X)/KSS
-    C
- 0: C
- 1: C
-MK: A
-    D 
-No match, mark = B
-
-/(*MARK:A)(*THEN:B)(C|X)/KS
-    C
- 0: C
- 1: C
-MK: A
-    D 
-No match
-
-/(*MARK:A)(*THEN:B)(C|X)/KSY
-    C
- 0: C
- 1: C
-MK: A
-    D 
-No match, mark = B
-
-/(*MARK:A)(*THEN:B)(C|X)/KSS
-    C
- 0: C
- 1: C
-MK: A
-    D 
-No match, mark = B
-
-/--- This should fail, as the skip causes a bump to offset 3 (the skip) ---/
-
-/A(*MARK:A)A+(*SKIP)(B|Z) | AC/xK
-    AAAC
-No match
-
-/--- Same --/
-
-/A(*MARK:A)A+(*MARK:B)(*SKIP:B)(B|Z) | AC/xK
-    AAAC
-No match
-
 /--- This should fail; the SKIP advances by one, but when we get to AC, the
-     PRUNE kills it. ---/ 
+     PRUNE kills it. Perl behaves differently. ---/

 /A(*PRUNE:A)A+(*SKIP:A)(B|Z) | AC/xK
     AAAC
-No match
+No match, mark = A

-/A(*:A)A+(*SKIP)(B|Z) | AC/xK
-    AAAC
-No match
+/--- Mark names can be duplicated. Perl doesn't give a mark for this one,
+though PCRE does. ---/

-/--- This should fail, as a null name is the same as no name ---/
-
-/A(*MARK:A)A+(*SKIP:)(B|Z) | AC/xK
-    AAAC
-No match
-
-/--- This fails in PCRE, and I think that is in accordance with Perl's 
-     documentation, though in Perl it succeeds. ---/
-    
-/A(*MARK:A)A+(*SKIP:B)(B|Z) | AAC/xK
-    AAAC
-No match
-
-/--- Mark names can be duplicated ---/
-
-/A(*:A)B|X(*:A)Y/K
-    AABC
- 0: AB
-MK: A
-    XXYZ 
- 0: XY
-MK: A
-    
 /^A(*:A)B|^X(*:A)Y/K
     ** Failers
 No match
     XAQQ
 No match, mark = A

-/--- A check on what happens after hitting a mark and them bumping along to
-something that does not even start. Perl reports tags after the failures here, 
-though it does not when the individual letters are made into something 
-more complicated. ---/
-
-/A(*:A)B|XX(*:B)Y/K
-    AABC
- 0: AB
-MK: A
-    XXYZ 
- 0: XXY
-MK: B
-    ** Failers
-No match
-    XAQQ  
-No match
-    XAQQXZZ  
-No match
-    AXQQQ 
-No match
-    AXXQQQ 
-No match
-    
 /--- COMMIT at the start of a pattern should be the same as an anchor. Perl 
 optimizations defeat this. So does the PCRE optimization unless we disable it 
 with \Y. ---/
@@ -11171,126 +10617,6 @@
     DEFGABC\Y  
 No match

-/--- Repeat some tests with added studying. ---/
-
-/A(*COMMIT)B/+KS
-    ACABX
-No match
- 
-/A(*THEN)B|A(*THEN)C/KS
-    AC
- 0: AC
-
-/A(*PRUNE)B|A(*PRUNE)C/KS
-    AC
-No match
-
-/^(A(*THEN:A)B|C(*THEN:B)D)/KS
-    AB
- 0: AB
- 1: AB
-    CD
- 0: CD
- 1: CD
-    ** Failers
-No match
-    AC
-No match
-    CB    
-No match, mark = B
-
-/^(A(*PRUNE:A)B|C(*PRUNE:B)D)/KS
-    AB
- 0: AB
- 1: AB
-    CD
- 0: CD
- 1: CD
-    ** Failers
-No match
-    AC
-No match, mark = A
-    CB    
-No match, mark = B
-
-/^(A(*PRUNE:)B|C(*PRUNE:B)D)/KS
-    AB
- 0: AB
- 1: AB
-    CD 
- 0: CD
- 1: CD
-
-/A(*PRUNE:A)B/KS
-    ACAB
- 0: AB
-
-/(*MARK:A)(*PRUNE:B)(C|X)/KS
-    C
- 0: C
- 1: C
-MK: A
-    D 
-No match
-
-/(*MARK:A)(*THEN:B)(C|X)/KS
-    C
- 0: C
- 1: C
-MK: A
-    D 
-No match
-
-/A(*MARK:A)A+(*SKIP)(B|Z) | AC/xKS
-    AAAC
-No match
-
-/A(*MARK:A)A+(*MARK:B)(*SKIP:B)(B|Z) | AC/xKS
-    AAAC
-No match
-    
-/A(*PRUNE:A)A+(*SKIP:A)(B|Z) | AC/xKS
-    AAAC
-No match
-
-/A(*:A)A+(*SKIP)(B|Z) | AC/xKS
-    AAAC
-No match
-
-/A(*MARK:A)A+(*SKIP:)(B|Z) | AC/xKS
-    AAAC
-No match
-
-/A(*MARK:A)A+(*SKIP:B)(B|Z) | AAC/xKS
-    AAAC
-No match
-
-/A(*:A)B|XX(*:B)Y/KS
-    AABC
- 0: AB
-MK: A
-    XXYZ 
- 0: XXY
-MK: B
-    ** Failers
-No match
-    XAQQ  
-No match
-    XAQQXZZ  
-No match
-    AXQQQ 
-No match
-    AXXQQQ 
-No match
-    
-/(*COMMIT)ABC/
-    ABCDEFG
- 0: ABC
-    ** Failers
-No match
-    DEFGABC\Y  
-No match
-
 /^(ab (c+(*THEN)cd) | xyz)/x
     abcccd  
 No match
@@ -11875,11 +11201,11 @@
  1: C
 MK: A
     D
-No match
+No match, mark = A

 /(*:A)A+(*SKIP:A)(B|Z)/KS
     AAAC
-No match
+No match, mark = A

 /-- --/

@@ -12257,7 +11583,6 @@
 Latest Mark: B
 +18 ^ ^          z
 +20 ^            a
-Latest Mark: <unset>
 +21 ^^           e
 +22 ^ ^          q
 +23 ^  ^         )
@@ -12518,14 +11843,6 @@
     ax1z
  0: ax1z

-/^a\X41z/<JS>
-    aX41z
- 0: aX41z
-    *** Failers
-No match
-    aAz
-No match
-
 /^a\u0041z/<JS>
     aAz
  0: aAz
@@ -12591,7 +11908,70 @@
         End
 ------------------------------------------------------------------

-/(?<=ab\Cde)X/8
-Failed: \C not allowed in lookbehind assertion at offset 10
+/a[\NB]c/
+Failed: \N is not supported in a class at offset 3

+/a[B-\Nc]/ 
+Failed: \N is not supported in a class at offset 5
+
+/(a)(?2){0,1999}?(b)/
+
+/(a)(?(DEFINE)(b))(?2){0,1999}?(?2)/
+
+/--- This test, with something more complicated than individual letters, causes
+different behaviour in Perl. Perhaps it disables some optimization; no tag is
+passed back for the failures, whereas in PCRE there is a tag. ---/
+    
+/(A|P)(*:A)(B|P) | (X|P)(X|P)(*:B)(Y|P)/xK
+    AABC
+ 0: AB
+ 1: A
+ 2: B
+MK: A
+    XXYZ 
+ 0: XXY
+ 1: <unset>
+ 2: <unset>
+ 3: X
+ 4: X
+ 5: Y
+MK: B
+    ** Failers
+No match
+    XAQQ  
+No match, mark = A
+    XAQQXZZ  
+No match, mark = A
+    AXQQQ 
+No match, mark = A
+    AXXQQQ 
+No match, mark = B
+
+/-- Perl doesn't give marks for these, though it does if the alternatives are
+replaced by single letters. --/
+    
+/(b|q)(*:m)f|a(*:n)w/K
+    aw 
+ 0: aw
+MK: n
+    ** Failers 
+No match, mark = n
+    abc
+No match, mark = m
+
+/(q|b)(*:m)f|a(*:n)w/K
+    aw 
+ 0: aw
+MK: n
+    ** Failers 
+No match, mark = n
+    abc
+No match, mark = m
+
+/-- After a partial match, the behaviour is as for a failure. --/
+
+/^a(*:X)bcde/K
+   abc\P
+Partial match, mark=X: abc
+
 /-- End of testinput2 --/

Copied: code/trunk/testdata/testoutput20 (from rev 835, code/branches/pcre16/testdata/testoutput20)
===================================================================
--- code/trunk/testdata/testoutput20                            (rev 0)
+++ code/trunk/testdata/testoutput20    2011-12-28 17:16:11 UTC (rev 836)
@@ -0,0 +1,24 @@
+/-- These tests are for the handling of characters greater than 255 in 16-bit,
+    non-UTF-16 mode. --/
+
+/^\x{ffff}+/i
+    \x{ffff}
+ 0: \x{ffff}
+
+/^\x{ffff}?/i
+    \x{ffff}
+ 0: \x{ffff}
+
+/^\x{ffff}*/i
+    \x{ffff}
+ 0: \x{ffff}
+
+/^\x{ffff}{3}/i
+    \x{ffff}\x{ffff}\x{ffff}
+ 0: \x{ffff}\x{ffff}\x{ffff}
+
+/^\x{ffff}{0,3}/i
+    \x{ffff}
+ 0: \x{ffff}
+
+/-- End of testinput20 --/

Modified: code/trunk/testdata/testoutput4
===================================================================
--- code/trunk/testdata/testoutput4    2011-12-28 16:10:09 UTC (rev 835)
+++ code/trunk/testdata/testoutput4    2011-12-28 17:16:11 UTC (rev 836)
@@ -1,5 +1,6 @@
-/-- This set of tests is for UTF-8 support, excluding Unicode properties. It is
-    compatible with all versions of Perl 5. --/
+/-- This set of tests is for UTF support, excluding Unicode properties. It is
+    compatible with all versions of Perl >= 5.10 and both the 8-bit and 16-bit
+    PCRE libraries. --/

 /a.b/8
     acb
@@ -255,46 +256,6 @@
     XYZ 
 No match

-/X(\C{3})/8
-    X\x{1234}
- 0: X\x{1234}
- 1: \x{1234}
-
-/X(\C{4})/8
-    X\x{1234}YZ
- 0: X\x{1234}Y
- 1: \x{1234}Y
-    
-/X\C*/8
-    XYZabcdce
- 0: XYZabcdce
-    
-/X\C*?/8
-    XYZabcde
- 0: X
-    
-/X\C{3,5}/8
-    Xabcdefg   
- 0: Xabcde
-    X\x{1234} 
- 0: X\x{1234}
-    X\x{1234}YZ
- 0: X\x{1234}YZ
-    X\x{1234}\x{512}  
- 0: X\x{1234}\x{512}
-    X\x{1234}\x{512}YZ
- 0: X\x{1234}\x{512}
-
-/X\C{3,5}?/8
-    Xabcdefg   
- 0: Xabc
-    X\x{1234} 
- 0: X\x{1234}
-    X\x{1234}YZ
- 0: X\x{1234}
-    X\x{1234}\x{512}  
- 0: X\x{1234}
-
 /[^a]+/8g
     bcd
  0: bcd
@@ -791,22 +752,6 @@
     \x{200}X   
 No match

-/a\Cb/
-    aXb
- 0: aXb
-    a\nb
- 0: a\x0ab
-  
-/a\Cb/8
-    aXb
- 0: aXb
-    a\nb
- 0: a\x{0a}b
-    
-/a\C\Cb/8 
-    a\x{100}b 
- 0: a\x{100}b
-
 /[z-\x{100}]/8i
     z
  0: z
@@ -1136,8 +1081,14 @@
    abc
 No match

-/ab\Cde/8
-    abXde
- 0: abXde
+/a(*:a\x{1234}b)/8K
+    abc
+ 0: a
+MK: a\x{1234}b

+/a(*:a£b)/8K 
+    abc
+ 0: a
+MK: a\x{a3}b
+
 /-- End of testinput4 --/
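
Note on the \C tests removed from testoutput4 above: \C matches one code unit at a time, so their expected output depends on the code unit width. For example, /X(\C{3})/8 matched X\x{1234} only because U+1234 occupies three bytes in UTF-8. The following is a minimal Python sketch, not part of the commit, that lists the UTF-8 bytes and UTF-16 code units for a code point. The helper name code_units is hypothetical and uses only Python's built-in codecs.

```python
def code_units(code_point):
    """Return (UTF-8 bytes, UTF-16 code units) for one code point, as lists of ints."""
    ch = chr(code_point)
    utf8 = list(ch.encode("utf-8"))
    # Big-endian UTF-16 without a BOM; split the bytes into 16-bit units.
    raw16 = ch.encode("utf-16-be")
    utf16 = [int.from_bytes(raw16[i:i + 2], "big") for i in range(0, len(raw16), 2)]
    return utf8, utf16


# Illustrative code points: two from the tests above, one outside the BMP.
for cp in (0x100, 0x1234, 0x10000):
    utf8, utf16 = code_units(cp)
    print(f"U+{cp:04X}: utf-8 {[hex(b) for b in utf8]}  utf-16 {[hex(u) for u in utf16]}")
```

U+1234 comes out as three UTF-8 bytes (e1 88 b4) but as a single 16-bit unit, which would explain why these tests cannot share expected output between the 8-bit and 16-bit libraries.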

Modified: code/trunk/testdata/testoutput5
===================================================================
--- code/trunk/testdata/testoutput5    2011-12-28 16:10:09 UTC (rev 835)
+++ code/trunk/testdata/testoutput5    2011-12-28 17:16:11 UTC (rev 836)
@@ -1,148 +1,30 @@
-/-- This set of tests checks the API, internals, and non-Perl stuff for UTF-8
-    support, excluding Unicode properties. --/
+/-- This set of tests checks the API, internals, and non-Perl stuff for UTF
+    support, excluding Unicode properties. However, tests that give different
+    results in 8-bit and 16-bit modes are excluded (see tests 16 and 17). --/

-/\x{100}/8DZ
-------------------------------------------------------------------
-        Bra
-        \x{100}
-        Ket
-        End
-------------------------------------------------------------------
-Capturing subpattern count = 0
-Options: utf8
-First char = 196
-Need char = 128
+/\x{110000}/8DZ
+Failed: character value in \x{...} sequence is too large at offset 9

-/\x{1000}/8DZ
-------------------------------------------------------------------
-        Bra
-        \x{1000}
-        Ket
-        End
-------------------------------------------------------------------
-Capturing subpattern count = 0
-Options: utf8
-First char = 225
-Need char = 128
-
-/\x{10000}/8DZ
-------------------------------------------------------------------
-        Bra
-        \x{10000}
-        Ket
-        End
-------------------------------------------------------------------
-Capturing subpattern count = 0
-Options: utf8
-First char = 240
-Need char = 128
-
-/\x{100000}/8DZ
-------------------------------------------------------------------
-        Bra
-        \x{100000}
-        Ket
-        End
-------------------------------------------------------------------
-Capturing subpattern count = 0
-Options: utf8
-First char = 244
-Need char = 128
-
-/\x{1000000}/8DZ
-------------------------------------------------------------------
-        Bra
-        \x{1000000}
-        Ket
-        End
-------------------------------------------------------------------
-Capturing subpattern count = 0
-Options: utf8
-First char = 249
-Need char = 128
-
-/\x{4000000}/8DZ
-------------------------------------------------------------------
-        Bra
-        \x{4000000}
-        Ket
-        End
-------------------------------------------------------------------
-Capturing subpattern count = 0
-Options: utf8
-First char = 252
-Need char = 128
-
-/\x{7fffFFFF}/8DZ
-------------------------------------------------------------------
-        Bra
-        \x{7fffffff}
-        Ket
-        End
-------------------------------------------------------------------
-Capturing subpattern count = 0
-Options: utf8
-First char = 253
-Need char = 191
-
-/[\x{ff}]/8DZ
-------------------------------------------------------------------
-        Bra
-        \x{ff}
-        Ket
-        End
-------------------------------------------------------------------
-Capturing subpattern count = 0
-Options: utf8
-First char = 195
-Need char = 191
-
-/[\x{100}]/8DZ
-------------------------------------------------------------------
-        Bra
-        [\x{100}]
-        Ket
-        End
-------------------------------------------------------------------
-Capturing subpattern count = 0
-Options: utf8
-No first char
-No need char
-
 /\x{ffffffff}/8
 Failed: character value in \x{...} sequence is too large at offset 11

 /\x{100000000}/8
 Failed: character value in \x{...} sequence is too large at offset 12

+/\x{d800}/8
+Failed: disallowed UTF-8/16 code point (>= 0xd800 && <= 0xdfff) at offset 7
+
+/\x{dfff}/8
+Failed: disallowed UTF-8/16 code point (>= 0xd800 && <= 0xdfff) at offset 7
+
+/\x{d7ff}/8
+
+/\x{e000}/8
+
 /^\x{100}a\x{1234}/8
     \x{100}a\x{1234}bcd
  0: \x{100}a\x{1234}

-/\x80/8DZ
-------------------------------------------------------------------
-        Bra
-        \x{80}
-        Ket
-        End
-------------------------------------------------------------------
-Capturing subpattern count = 0
-Options: utf8
-First char = 194
-Need char = 128
-
-/\xff/8DZ
-------------------------------------------------------------------
-        Bra
-        \x{ff}
-        Ket
-        End
-------------------------------------------------------------------
-Capturing subpattern count = 0
-Options: utf8
-First char = 195
-Need char = 191
-
 /\x{0041}\x{2262}\x{0391}\x{002e}/DZ8
 ------------------------------------------------------------------
         Bra
@@ -151,100 +33,12 @@
         End
 ------------------------------------------------------------------
 Capturing subpattern count = 0
-Options: utf8
+Options: utf
 First char = 'A'
 Need char = '.'
     \x{0041}\x{2262}\x{0391}\x{002e}
  0: A\x{2262}\x{391}.

-/\x{D55c}\x{ad6d}\x{C5B4}/DZ8 
-------------------------------------------------------------------
-        Bra
-        \x{d55c}\x{ad6d}\x{c5b4}
-        Ket
-        End
-------------------------------------------------------------------
-Capturing subpattern count = 0
-Options: utf8
-First char = 237
-Need char = 180
-    \x{D55c}\x{ad6d}\x{C5B4} 
- 0: \x{d55c}\x{ad6d}\x{c5b4}
-
-/\x{65e5}\x{672c}\x{8a9e}/DZ8
-------------------------------------------------------------------
-        Bra
-        \x{65e5}\x{672c}\x{8a9e}
-        Ket
-        End
-------------------------------------------------------------------
-Capturing subpattern count = 0
-Options: utf8
-First char = 230
-Need char = 158
-    \x{65e5}\x{672c}\x{8a9e}
- 0: \x{65e5}\x{672c}\x{8a9e}
-
-/\x{80}/DZ8
-------------------------------------------------------------------
-        Bra
-        \x{80}
-        Ket
-        End
-------------------------------------------------------------------
-Capturing subpattern count = 0
-Options: utf8
-First char = 194
-Need char = 128
-
-/\x{084}/DZ8
-------------------------------------------------------------------
-        Bra
-        \x{84}
-        Ket
-        End
-------------------------------------------------------------------
-Capturing subpattern count = 0
-Options: utf8
-First char = 194
-Need char = 132
-
-/\x{104}/DZ8
-------------------------------------------------------------------
-        Bra
-        \x{104}
-        Ket
-        End
-------------------------------------------------------------------
-Capturing subpattern count = 0
-Options: utf8
-First char = 196
-Need char = 132
-
-/\x{861}/DZ8
-------------------------------------------------------------------
-        Bra
-        \x{861}
-        Ket
-        End
-------------------------------------------------------------------
-Capturing subpattern count = 0
-Options: utf8
-First char = 224
-Need char = 161
-
-/\x{212ab}/DZ8
-------------------------------------------------------------------
-        Bra
-        \x{212ab}
-        Ket
-        End
-------------------------------------------------------------------
-Capturing subpattern count = 0
-Options: utf8
-First char = 240
-Need char = 171
-
 /.{3,5}X/DZ8
 ------------------------------------------------------------------
         Bra
@@ -255,13 +49,12 @@
         End
 ------------------------------------------------------------------
 Capturing subpattern count = 0
-Options: utf8
+Options: utf
 No first char
 Need char = 'X'
     \x{212ab}\x{212ab}\x{212ab}\x{861}X
  0: \x{212ab}\x{212ab}\x{212ab}\x{861}X

-
 /.{3,5}?/DZ8
 ------------------------------------------------------------------
         Bra
@@ -271,7 +64,7 @@
         End
 ------------------------------------------------------------------
 Capturing subpattern count = 0
-Options: utf8
+Options: utf
 No first char
 No need char
     \x{212ab}\x{212ab}\x{212ab}\x{861}
@@ -280,29 +73,6 @@
 /(?<=\C)X/8
 Failed: \C not allowed in lookbehind assertion at offset 6

-/-- This one is here not because it's different to Perl, but because the way
-the captured single-byte is displayed. (In Perl it becomes a character, and you
-can't tell the difference.) --/
-    
-/X(\C)(.*)/8
-    X\x{1234}
- 0: X\x{1234}
- 1: \xe1
- 2: \x88\xb4
-    X\nabc 
- 0: X\x{0a}abc
- 1: \x{0a}
- 2: abc
-
-/-- This one is here because Perl gives out a grumbly error message (quite 
-correctly, but that messes up comparisons). --/
-    
-/a\Cb/8
-    *** Failers 
-No match
-    a\x{100}b 
-No match
-    
 /^[ab]/8DZ
 ------------------------------------------------------------------
         Bra
@@ -312,7 +82,7 @@
         End
 ------------------------------------------------------------------
 Capturing subpattern count = 0
-Options: anchored utf8
+Options: anchored utf
 No first char
 No need char
     bar
@@ -335,7 +105,7 @@
         End
 ------------------------------------------------------------------
 Capturing subpattern count = 0
-Options: anchored utf8
+Options: anchored utf
 No first char
 No need char
     c
@@ -349,136 +119,6 @@
     aaa
 No match

-/[^ab\xC0-\xF0]/8SDZ
-------------------------------------------------------------------
-        Bra
-        [\x00-`c-\xbf\xf1-\xff] (neg)
-        Ket
-        End
-------------------------------------------------------------------
-Capturing subpattern count = 0
-Options: utf8
-No first char
-No need char
-Subject length lower bound = 1
-Starting byte set: \x00 \x01 \x02 \x03 \x04 \x05 \x06 \x07 \x08 \x09 \x0a 
-  \x0b \x0c \x0d \x0e \x0f \x10 \x11 \x12 \x13 \x14 \x15 \x16 \x17 \x18 \x19 
-  \x1a \x1b \x1c \x1d \x1e \x1f \x20 ! " # $ % & ' ( ) * + , - . / 0 1 2 3 4 
-  5 6 7 8 9 : ; < = > ? @ A B C D E F G H I J K L M N O P Q R S T U V W X Y 
-  Z [ \ ] ^ _ ` c d e f g h i j k l m n o p q r s t u v w x y z { | } ~ \x7f 
-  \xc2 \xc3 \xc4 \xc5 \xc6 \xc7 \xc8 \xc9 \xca \xcb \xcc \xcd \xce \xcf \xd0 
-  \xd1 \xd2 \xd3 \xd4 \xd5 \xd6 \xd7 \xd8 \xd9 \xda \xdb \xdc \xdd \xde \xdf 
-  \xe0 \xe1 \xe2 \xe3 \xe4 \xe5 \xe6 \xe7 \xe8 \xe9 \xea \xeb \xec \xed \xee 
-  \xef \xf0 \xf1 \xf2 \xf3 \xf4 \xf5 \xf6 \xf7 \xf8 \xf9 \xfa \xfb \xfc \xfd 
-  \xfe \xff 
-    \x{f1}
- 0: \x{f1}
-    \x{bf}
- 0: \x{bf}
-    \x{100}
- 0: \x{100}
-    \x{1000}   
- 0: \x{1000}
-    *** Failers
- 0: *
-    \x{c0} 
-No match
-    \x{f0} 
-No match
-
-/Ā{3,4}/8SDZ
-------------------------------------------------------------------
-        Bra
-        \x{100}{3}
-        \x{100}?
-        Ket
-        End
-------------------------------------------------------------------
-Capturing subpattern count = 0
-Options: utf8
-First char = 196
-Need char = 128
-Subject length lower bound = 3
-No set of starting bytes
-  \x{100}\x{100}\x{100}\x{100\x{100}
- 0: \x{100}\x{100}\x{100}
-
-/(\x{100}+|x)/8SDZ
-------------------------------------------------------------------
-        Bra
-        CBra 1
-        \x{100}+
-        Alt
-        x
-        Ket
-        Ket
-        End
-------------------------------------------------------------------
-Capturing subpattern count = 1
-Options: utf8
-No first char
-No need char
-Subject length lower bound = 1
-Starting byte set: x \xc4 
-
-/(\x{100}*a|x)/8SDZ
-------------------------------------------------------------------
-        Bra
-        CBra 1
-        \x{100}*+
-        a
-        Alt
-        x
-        Ket
-        Ket
-        End
-------------------------------------------------------------------
-Capturing subpattern count = 1
-Options: utf8
-No first char
-No need char
-Subject length lower bound = 1
-Starting byte set: a x \xc4 
-
-/(\x{100}{0,2}a|x)/8SDZ
-------------------------------------------------------------------
-        Bra
-        CBra 1
-        \x{100}{0,2}
-        a
-        Alt
-        x
-        Ket
-        Ket
-        End
-------------------------------------------------------------------
-Capturing subpattern count = 1
-Options: utf8
-No first char
-No need char
-Subject length lower bound = 1
-Starting byte set: a x \xc4 
-
-/(\x{100}{1,2}a|x)/8SDZ
-------------------------------------------------------------------
-        Bra
-        CBra 1
-        \x{100}
-        \x{100}{0,1}
-        a
-        Alt
-        x
-        Ket
-        Ket
-        End
-------------------------------------------------------------------
-Capturing subpattern count = 1
-Options: utf8
-No first char
-No need char
-Subject length lower bound = 1
-Starting byte set: x \xc4 
-
 /\x{100}*(\d+|"(?1)")/8
     1234
  0: 1234
@@ -503,18 +143,6 @@
     \x{100}\x{100}abcd
 No match

-/\x{100}/8DZ
-------------------------------------------------------------------
-        Bra
-        \x{100}
-        Ket
-        End
-------------------------------------------------------------------
-Capturing subpattern count = 0
-Options: utf8
-First char = 196
-Need char = 128
-
 /\x{100}*/8DZ
 ------------------------------------------------------------------
         Bra
@@ -523,7 +151,7 @@
         End
 ------------------------------------------------------------------
 Capturing subpattern count = 0
-Options: utf8
+Options: utf
 No first char
 No need char

@@ -536,7 +164,7 @@
         End
 ------------------------------------------------------------------
 Capturing subpattern count = 0
-Options: utf8
+Options: utf
 First char = 'a'
 No need char

@@ -549,36 +177,10 @@
         End
 ------------------------------------------------------------------
 Capturing subpattern count = 0
-Options: utf8
+Options: utf
 First char = 'a'
 Need char = 'b'

-/a\x{100}\x{101}*/8DZ
-------------------------------------------------------------------
-        Bra
-        a\x{100}
-        \x{101}*
-        Ket
-        End
-------------------------------------------------------------------
-Capturing subpattern count = 0
-Options: utf8
-First char = 'a'
-Need char = 128
-
-/a\x{100}\x{101}+/8DZ
-------------------------------------------------------------------
-        Bra
-        a\x{100}
-        \x{101}+
-        Ket
-        End
-------------------------------------------------------------------
-Capturing subpattern count = 0
-Options: utf8
-First char = 'a'
-Need char = 129
-
 /\x{100}*A/8DZ
 ------------------------------------------------------------------
         Bra
@@ -588,7 +190,7 @@
         End
 ------------------------------------------------------------------
 Capturing subpattern count = 0
-Options: utf8
+Options: utf
 No first char
 Need char = 'A'
     A
@@ -604,54 +206,10 @@
         End
 ------------------------------------------------------------------
 Capturing subpattern count = 0
-Options: utf8
+Options: utf
 No first char
 No need char

-/[^\x{c4}]/DZ
-------------------------------------------------------------------
-        Bra
-        [^\xc4]
-        Ket
-        End
-------------------------------------------------------------------
-Capturing subpattern count = 0
-No options
-No first char
-No need char
-
-/[^\x{c4}]/8DZ
-------------------------------------------------------------------
-        Bra
-        [\x00-\xc3\xc5-\xff] (neg)
-        Ket
-        End
-------------------------------------------------------------------
-Capturing subpattern count = 0
-Options: utf8
-No first char
-No need char
-
-/[\x{100}]/8DZ
-------------------------------------------------------------------
-        Bra
-        [\x{100}]
-        Ket
-        End
-------------------------------------------------------------------
-Capturing subpattern count = 0
-Options: utf8
-No first char
-No need char
-    \x{100}
- 0: \x{100}
-    Z\x{100}
- 0: \x{100}
-    \x{100}Z
- 0: \x{100}
-    *** Failers 
-No match
-
 /[Z\x{100}]/8DZ
 ------------------------------------------------------------------
         Bra
@@ -660,7 +218,7 @@
         End
 ------------------------------------------------------------------
 Capturing subpattern count = 0
-Options: utf8
+Options: utf
 No first char
 No need char
     Z\x{100}
@@ -695,7 +253,7 @@
         End
 ------------------------------------------------------------------
 Capturing subpattern count = 0
-Options: utf8
+Options: utf
 No first char
 No need char

@@ -707,7 +265,7 @@
         End
 ------------------------------------------------------------------
 Capturing subpattern count = 0
-Options: utf8
+Options: utf
 No first char
 No need char
     \x{100}
@@ -724,25 +282,11 @@
 ------------------------------------------------------------------
 Capturing subpattern count = 0
 No options
-First char = 255
+First char = \xff
 No need char
     >\xff<
  0: \xff

-/[\xff]/DZ8
-------------------------------------------------------------------
-        Bra
-        \x{ff}
-        Ket
-        End
-------------------------------------------------------------------
-Capturing subpattern count = 0
-Options: utf8
-First char = 195
-Need char = 191
-    >\x{ff}<
- 0: \x{ff}
-
 /[^\xFF]/DZ
 ------------------------------------------------------------------
         Bra
@@ -755,18 +299,6 @@
 No first char
 No need char

-/[^\xff]/8DZ
-------------------------------------------------------------------
-        Bra
-        [\x00-\xfe] (neg)
-        Ket
-        End
-------------------------------------------------------------------
-Capturing subpattern count = 0
-Options: utf8
-No first char
-No need char
-
 /[Ä-Ü]/8
     Ö # Matches without Study
  0: \x{d6}
@@ -791,129 +323,6 @@
     \x{d6} 
  0: \x{d6}

-/[\xC3]/8
-Failed: invalid UTF-8 string at offset 1
-
-/\xC3/8
-Failed: invalid UTF-8 string at offset 0
-
-/\xC3\xC3\xC3xxx/8
-Failed: invalid UTF-8 string at offset 0
-
-/\xC3\xC3\xC3xxx/8?DZSS
-------------------------------------------------------------------
-        Bra
-        \X{c0}\X{c0}\X{c0}xxx
-        Ket
-        End
-------------------------------------------------------------------
-Capturing subpattern count = 0
-Options: utf8 no_utf8_check
-First char = 195
-Need char = 'x'
-
-/abc/8
-    \xC3]
-Error -10 (bad UTF-8 string) offset=0 reason=6
-    \xC3
-Error -10 (bad UTF-8 string) offset=0 reason=1
-    \xC3\xC3\xC3
-Error -10 (bad UTF-8 string) offset=0 reason=6
-    \xC3\xC3\xC3\?
-No match
-    \xe1\x88 
-Error -10 (bad UTF-8 string) offset=0 reason=1
-    \P\xe1\x88 
-Error -10 (bad UTF-8 string) offset=0 reason=1
-    \P\P\xe1\x88 
-Error -25 (short UTF-8 string) offset=0 reason=1
-    XX\xea
-Error -10 (bad UTF-8 string) offset=2 reason=2
-    \O0XX\xea
-Error -10 (bad UTF-8 string)
-    \O1XX\xea
-Error -10 (bad UTF-8 string)
-    \O2XX\xea
-Error -10 (bad UTF-8 string) offset=2 reason=2
-    XX\xf1
-Error -10 (bad UTF-8 string) offset=2 reason=3
-    XX\xf8  
-Error -10 (bad UTF-8 string) offset=2 reason=4
-    XX\xfc
-Error -10 (bad UTF-8 string) offset=2 reason=5
-    ZZ\xea\xaf\x20YY
-Error -10 (bad UTF-8 string) offset=2 reason=7
-    ZZ\xfd\xbf\xbf\x2f\xbf\xbfYY  
-Error -10 (bad UTF-8 string) offset=2 reason=8
-    ZZ\xfd\xbf\xbf\xbf\x2f\xbfYY  
-Error -10 (bad UTF-8 string) offset=2 reason=9
-    ZZ\xfd\xbf\xbf\xbf\xbf\x2fYY  
-Error -10 (bad UTF-8 string) offset=2 reason=10
-    ZZ\xffYY
-Error -10 (bad UTF-8 string) offset=2 reason=21
-    ZZ\xfeYY  
-Error -10 (bad UTF-8 string) offset=2 reason=21
-
-/anything/8
-    \xc0\x80
-Error -10 (bad UTF-8 string) offset=0 reason=15
-    \xc1\x8f 
-Error -10 (bad UTF-8 string) offset=0 reason=15
-    \xe0\x9f\x80
-Error -10 (bad UTF-8 string) offset=0 reason=16
-    \xf0\x8f\x80\x80 
-Error -10 (bad UTF-8 string) offset=0 reason=17
-    \xf8\x87\x80\x80\x80  
-Error -10 (bad UTF-8 string) offset=0 reason=18
-    \xfc\x83\x80\x80\x80\x80
-Error -10 (bad UTF-8 string) offset=0 reason=19
-    \xfe\x80\x80\x80\x80\x80  
-Error -10 (bad UTF-8 string) offset=0 reason=21
-    \xff\x80\x80\x80\x80\x80  
-Error -10 (bad UTF-8 string) offset=0 reason=21
-    \xc3\x8f
-No match
-    \xe0\xaf\x80
-No match
-    \xe1\x80\x80
-No match
-    \xf0\x9f\x80\x80 
-No match
-    \xf1\x8f\x80\x80 
-No match
-    \xf8\x88\x80\x80\x80  
-Error -10 (bad UTF-8 string) offset=0 reason=11
-    \xf9\x87\x80\x80\x80  
-Error -10 (bad UTF-8 string) offset=0 reason=11
-    \xfc\x84\x80\x80\x80\x80
-Error -10 (bad UTF-8 string) offset=0 reason=12
-    \xfd\x83\x80\x80\x80\x80
-Error -10 (bad UTF-8 string) offset=0 reason=12
-    \?\xf8\x88\x80\x80\x80  
-No match
-    \?\xf9\x87\x80\x80\x80  
-No match
-    \?\xfc\x84\x80\x80\x80\x80
-No match
-    \?\xfd\x83\x80\x80\x80\x80
-No match
-
-/\x{100}abc(xyz(?1))/8DZ
-------------------------------------------------------------------
-        Bra
-        \x{100}abc
-        CBra 1
-        xyz
-        Recurse
-        Ket
-        Ket
-        End
-------------------------------------------------------------------
-Capturing subpattern count = 1
-Options: utf8
-First char = 196
-Need char = 'z'
-
 /[^\x{100}]abc(xyz(?1))/8DZ
 ------------------------------------------------------------------
         Bra
@@ -927,7 +336,7 @@
         End
 ------------------------------------------------------------------
 Capturing subpattern count = 1
-Options: utf8
+Options: utf
 No first char
 Need char = 'z'

@@ -944,7 +353,7 @@
         End
 ------------------------------------------------------------------
 Capturing subpattern count = 1
-Options: utf8
+Options: utf
 No first char
 Need char = 'z'

@@ -964,7 +373,7 @@
         End
 ------------------------------------------------------------------
 Capturing subpattern count = 2
-Options: utf8
+Options: utf
 No first char
 No need char

@@ -995,7 +404,7 @@
         End
 ------------------------------------------------------------------
 Capturing subpattern count = 2
-Options: utf8
+Options: utf
 No first char
 No need char

@@ -1015,7 +424,7 @@
         End
 ------------------------------------------------------------------
 Capturing subpattern count = 2
-Options: utf8
+Options: utf
 No first char
 No need char

@@ -1046,7 +455,7 @@
         End
 ------------------------------------------------------------------
 Capturing subpattern count = 2
-Options: utf8
+Options: utf
 No first char
 No need char

@@ -1060,10 +469,6 @@
     \x{100}X   
  0: X

-/a\x{1234}b/P8
-    a\x{1234}b
- 0: a\x{1234}b
-
 /^\ሴ/8DZ
 ------------------------------------------------------------------
         Bra
@@ -1073,23 +478,13 @@
         End
 ------------------------------------------------------------------
 Capturing subpattern count = 0
-Options: anchored utf8
+Options: anchored utf
 No first char
 No need char

 /\777/I
 Failed: octal value is greater than \377 (not in UTF-8 mode) at offset 3

-/\777/8I
-Capturing subpattern count = 0
-Options: utf8
-First char = 199
-Need char = 191
-  \x{1ff}
- 0: \x{1ff}
-  \777 
- 0: \x{1ff}
-  
 /\x{100}*\d/8DZ
 ------------------------------------------------------------------
         Bra
@@ -1099,7 +494,7 @@
         End
 ------------------------------------------------------------------
 Capturing subpattern count = 0
-Options: utf8
+Options: utf
 No first char
 No need char

@@ -1112,7 +507,7 @@
         End
 ------------------------------------------------------------------
 Capturing subpattern count = 0
-Options: utf8
+Options: utf
 No first char
 No need char

@@ -1125,7 +520,7 @@
         End
 ------------------------------------------------------------------
 Capturing subpattern count = 0
-Options: utf8
+Options: utf
 No first char
 No need char

@@ -1138,7 +533,7 @@
         End
 ------------------------------------------------------------------
 Capturing subpattern count = 0
-Options: utf8
+Options: utf
 No first char
 No need char

@@ -1151,7 +546,7 @@
         End
 ------------------------------------------------------------------
 Capturing subpattern count = 0
-Options: utf8
+Options: utf
 No first char
 No need char

@@ -1164,49 +559,10 @@
         End
 ------------------------------------------------------------------
 Capturing subpattern count = 0
-Options: utf8
+Options: utf
 No first char
 No need char

-/\x{100}+\x{200}/8DZ
-------------------------------------------------------------------
-        Bra
-        \x{100}++
-        \x{200}
-        Ket
-        End
-------------------------------------------------------------------
-Capturing subpattern count = 0
-Options: utf8
-First char = 196
-Need char = 128
-
-/\x{100}+X/8DZ
-------------------------------------------------------------------
-        Bra
-        \x{100}++
-        X
-        Ket
-        End
-------------------------------------------------------------------
-Capturing subpattern count = 0
-Options: utf8
-First char = 196
-Need char = 'X'
-
-/X+\x{200}/8DZ
-------------------------------------------------------------------
-        Bra
-        X++
-        \x{200}
-        Ket
-        End
-------------------------------------------------------------------
-Capturing subpattern count = 0
-Options: utf8
-First char = 'X'
-Need char = 128
-
 /()()()()()()()()()()
  ()()()()()()()()()()
  ()()()()()()()()()()
@@ -1248,9 +604,6 @@
         End
 ------------------------------------------------------------------

-/^[\QĀ\E-\QŐ\E/BZ8
-Failed: missing terminating ] for character class at offset 15
-
 /^abc./mgx8<any>
     abc1 \x0aabc2 \x0babc3xx \x0cabc4 \x0dabc5xx \x0d\x0aabc6 \x{0085}abc7 \x{2028}abc8 \x{2029}abc9 JUNK
  0: abc1
@@ -1436,7 +789,7 @@
 /[\H]/8BZ
 ------------------------------------------------------------------
         Bra
-        [\x00-\x08\x0a-\x1f!-\x9f\xa1-\xff\x{100}-\x{167f}\x{1681}-\x{180d}\x{180f}-\x{1fff}\x{200b}-\x{202e}\x{2030}-\x{205e}\x{2060}-\x{2fff}\x{3001}-\x{7fffffff}]
+        [\x00-\x08\x0a-\x1f!-\x9f\xa1-\xff\x{100}-\x{167f}\x{1681}-\x{180d}\x{180f}-\x{1fff}\x{200b}-\x{202e}\x{2030}-\x{205e}\x{2060}-\x{2fff}\x{3001}-\x{10ffff}]
         Ket
         End
 ------------------------------------------------------------------
@@ -1444,7 +797,7 @@
 /[\V]/8BZ
 ------------------------------------------------------------------
         Bra
-        [\x00-\x09\x0e-\x84\x86-\xff\x{100}-\x{2027}\x{2029}-\x{7fffffff}]
+        [\x00-\x09\x0e-\x84\x86-\xff\x{100}-\x{2027}\x{202a}-\x{10ffff}]
         Ket
         End
 ------------------------------------------------------------------
@@ -1453,39 +806,9 @@
     \x{1ec5} 
  0: \x{1ec5}

-/-- This tests the stricter UTF-8 check according to RFC 3629. --/ 
-    
-/X/8
-    \x{0}\x{d7ff}\x{e000}\x{10ffff}
-No match
-    \x{d800}
-Error -10 (bad UTF-8 string) offset=0 reason=14
-    \x{d800}\?
-No match
-    \x{da00}
-Error -10 (bad UTF-8 string) offset=0 reason=14
-    \x{da00}\?
-No match
-    \x{dfff}
-Error -10 (bad UTF-8 string) offset=0 reason=14
-    \x{dfff}\?
-No match
-    \x{110000}    
-Error -10 (bad UTF-8 string) offset=0 reason=13
-    \x{110000}\?    
-No match
-    \x{2000000} 
-Error -10 (bad UTF-8 string) offset=0 reason=11
-    \x{2000000}\? 
-No match
-    \x{7fffffff} 
-Error -10 (bad UTF-8 string) offset=0 reason=12
-    \x{7fffffff}\? 
-No match
-
 /a\Rb/I8<bsr_anycrlf>
 Capturing subpattern count = 0
-Options: bsr_anycrlf utf8
+Options: bsr_anycrlf utf
 First char = 'a'
 Need char = 'b'
     a\rb
@@ -1503,7 +826,7 @@

 /a\Rb/I8<bsr_unicode>
 Capturing subpattern count = 0
-Options: bsr_unicode utf8
+Options: bsr_unicode utf
 First char = 'a'
 Need char = 'b'
     a\rb
@@ -1525,7 +848,7 @@

 /a\R?b/I8<bsr_anycrlf>
 Capturing subpattern count = 0
-Options: bsr_anycrlf utf8
+Options: bsr_anycrlf utf
 First char = 'a'
 Need char = 'b'
     a\rb
@@ -1543,7 +866,7 @@

 /a\R?b/I8<bsr_unicode>
 Capturing subpattern count = 0
-Options: bsr_unicode utf8
+Options: bsr_unicode utf
 First char = 'a'
 Need char = 'b'
     a\rb
@@ -1600,26 +923,11 @@
     \x{de}\x{de}
  0: \xde\xde
  1: \xde
-    \x{123} 
-** Character \x{123} is greater than 255 and UTF-8 mode is not enabled.
-** Truncation will probably give the wrong result.
-No match

 /X/8f<any> 
     A\x{1ec5}ABCXYZ
  0: X

-/(*UTF8)\x{1234}/
-  abcd\x{1234}pqr
- 0: \x{1234}
-
-/(*CRLF)(*UTF8)(*BSR_UNICODE)a\Rb/I
-Capturing subpattern count = 0
-Options: bsr_unicode utf8
-Forced newline sequence: CRLF
-First char = 'a'
-Need char = 'b'
-
 /Xa{2,4}b/8
     X\P
 Partial match: X
@@ -2097,152 +1405,16 @@
     \PX
 Partial match: X

-/\h/SI
-Capturing subpattern count = 0
-No options
-No first char
-No need char
-Subject length lower bound = 1
-Starting byte set: \x09 \x20 \xa0 
-
-/\h/SI8
-Capturing subpattern count = 0
-Options: utf8
-No first char
-No need char
-Subject length lower bound = 1
-Starting byte set: \x09 \x20 \xc2 \xe1 \xe2 \xe3 
-    ABC\x{09}
- 0: \x{09}
-    ABC\x{20}
- 0:  
-    ABC\x{a0}
- 0: \x{a0}
-    ABC\x{1680}
- 0: \x{1680}
-    ABC\x{180e}
- 0: \x{180e}
-    ABC\x{2000}
- 0: \x{2000}
-    ABC\x{202f} 
- 0: \x{202f}
-    ABC\x{205f} 
- 0: \x{205f}
-    ABC\x{3000} 
- 0: \x{3000}
-
-/\v/SI
-Capturing subpattern count = 0
-No options
-No first char
-No need char
-Subject length lower bound = 1
-Starting byte set: \x0a \x0b \x0c \x0d \x85 
-
-/\v/SI8
-Capturing subpattern count = 0
-Options: utf8
-No first char
-No need char
-Subject length lower bound = 1
-Starting byte set: \x0a \x0b \x0c \x0d \xc2 \xe2 
-    ABC\x{0a}
- 0: \x{0a}
-    ABC\x{0b}
- 0: \x{0b}
-    ABC\x{0c}
- 0: \x{0c}
-    ABC\x{0d}
- 0: \x{0d}
-    ABC\x{85}
- 0: \x{85}
-    ABC\x{2028}
- 0: \x{2028}
-
-/\R/SI
-Capturing subpattern count = 0
-No options
-No first char
-No need char
-Subject length lower bound = 1
-Starting byte set: \x0a \x0b \x0c \x0d \x85 
-
-/\R/SI8
-Capturing subpattern count = 0
-Options: utf8
-No first char
-No need char
-Subject length lower bound = 1
-Starting byte set: \x0a \x0b \x0c \x0d \xc2 \xe2 
-
-/\h*A/SI8
-Capturing subpattern count = 0
-Options: utf8
-No first char
-Need char = 'A'
-Subject length lower bound = 1
-Starting byte set: \x09 \x20 A \xc2 \xe1 \xe2 \xe3 
-    CDBABC
- 0: A
-    
-/\v+A/SI8
-Capturing subpattern count = 0
-Options: utf8
-No first char
-Need char = 'A'
-Subject length lower bound = 2
-Starting byte set: \x0a \x0b \x0c \x0d \xc2 \xe2 
-
-/\s?xxx\s/8SI
-Capturing subpattern count = 0
-Options: utf8
-No first char
-Need char = 'x'
-Subject length lower bound = 4
-Starting byte set: \x09 \x0a \x0c \x0d \x20 x 
-
 /\sxxx\s/8T1
     AB\x{85}xxx\x{a0}XYZ
  0: \x{85}xxx\x{a0}
     AB\x{a0}xxx\x{85}XYZ
  0: \x{a0}xxx\x{85}

-/\sxxx\s/I8ST1
-Capturing subpattern count = 0
-Options: utf8
-No first char
-Need char = 'x'
-Subject length lower bound = 5
-Starting byte set: \x09 \x0a \x0c \x0d \x20 \xc2 
-    AB\x{85}xxx\x{a0}XYZ
- 0: \x{85}xxx\x{a0}
-    AB\x{a0}xxx\x{85}XYZ
- 0: \x{a0}xxx\x{85}
-
 /\S \S/8T1
     \x{a2} \x{84} 
  0: \x{a2} \x{84}

-/\S \S/I8ST1
-Capturing subpattern count = 0
-Options: utf8
-No first char
-Need char = ' '
-Subject length lower bound = 3
-Starting byte set: \x00 \x01 \x02 \x03 \x04 \x05 \x06 \x07 \x08 \x0b \x0e 
-  \x0f \x10 \x11 \x12 \x13 \x14 \x15 \x16 \x17 \x18 \x19 \x1a \x1b \x1c \x1d 
-  \x1e \x1f ! " # $ % & ' ( ) * + , - . / 0 1 2 3 4 5 6 7 8 9 : ; < = > ? @ 
-  A B C D E F G H I J K L M N O P Q R S T U V W X Y Z [ \ ] ^ _ ` a b c d e 
-  f g h i j k l m n o p q r s t u v w x y z { | } ~ \x7f \xc0 \xc1 \xc2 \xc3 
-  \xc4 \xc5 \xc6 \xc7 \xc8 \xc9 \xca \xcb \xcc \xcd \xce \xcf \xd0 \xd1 \xd2 
-  \xd3 \xd4 \xd5 \xd6 \xd7 \xd8 \xd9 \xda \xdb \xdc \xdd \xde \xdf \xe0 \xe1 
-  \xe2 \xe3 \xe4 \xe5 \xe6 \xe7 \xe8 \xe9 \xea \xeb \xec \xed \xee \xef \xf0 
-  \xf1 \xf2 \xf3 \xf4 \xf5 \xf6 \xf7 \xf8 \xf9 \xfa \xfb \xfc \xfd \xfe \xff 
-    \x{a2} \x{84} 
- 0: \x{a2} \x{84}
-    A Z 
- 0: A Z
-
 'A#хц'8x<any>BZ
 ------------------------------------------------------------------
         Bra
@@ -2306,20 +1478,6 @@
         End
 ------------------------------------------------------------------

-/a+/8
-    a\x{123}aa\>1
- 0: aa
-    a\x{123}aa\>2
-Error -11 (bad UTF-8 offset)
-    a\x{123}aa\>3
- 0: aa
-    a\x{123}aa\>4
- 0: a
-    a\x{123}aa\>5
-No match
-    a\x{123}aa\>6
-Error -24 (bad offset value)
-
 /^\cģ/8
 Failed: \c must be followed by an ASCII character at offset 3

@@ -2351,41 +1509,9 @@
  1: \x{0a}
  2: \x{0d}

-/\x{1234}+/iS8I
-Capturing subpattern count = 0
-Options: caseless utf8
-No first char
-No need char
-Subject length lower bound = 1
-Starting byte set: \xe1
-
-/\x{1234}+?/iS8I
-Capturing subpattern count = 0
-Options: caseless utf8
-No first char
-No need char
-Subject length lower bound = 1
-Starting byte set: \xe1
-
-/\x{1234}++/iS8I
-Capturing subpattern count = 0
-Options: caseless utf8
-No first char
-No need char
-Subject length lower bound = 1
-Starting byte set: \xe1
-
-/\x{1234}{2}/iS8I
-Capturing subpattern count = 0
-Options: caseless utf8
-No first char
-No need char
-Subject length lower bound = 2
-Starting byte set: \xe1
-
 /[^\x{1234}]+/iS8I
 Capturing subpattern count = 0
-Options: caseless utf8
+Options: caseless utf
 No first char
 No need char
 Subject length lower bound = 1
@@ -2393,7 +1519,7 @@

 /[^\x{1234}]+?/iS8I
 Capturing subpattern count = 0
-Options: caseless utf8
+Options: caseless utf
 No first char
 No need char
 Subject length lower bound = 1
@@ -2401,7 +1527,7 @@

 /[^\x{1234}]++/iS8I
 Capturing subpattern count = 0
-Options: caseless utf8
+Options: caseless utf
 No first char
 No need char
 Subject length lower bound = 1
@@ -2409,7 +1535,7 @@

 /[^\x{1234}]{2}/iS8I
 Capturing subpattern count = 0
-Options: caseless utf8
+Options: caseless utf
 No first char
 No need char
 Subject length lower bound = 2
@@ -2433,5 +1559,99 @@
 /f.*/8s
     \P\Pfor
 Partial match: for
+    
+/\x{d7ff}\x{e000}/8

+/\x{d800}/8
+Failed: disallowed UTF-8/16 code point (>= 0xd800 && <= 0xdfff) at offset 7
+
+/\x{dfff}/8 
+Failed: disallowed UTF-8/16 code point (>= 0xd800 && <= 0xdfff) at offset 7
+
+/\h+/8
+    \x{1681}\x{200b}\x{1680}\x{2000}\x{202f}\x{3000}
+ 0: \x{1680}\x{2000}\x{202f}\x{3000}
+    \x{3001}\x{2fff}\x{200a}\x{a0}\x{2000}
+ 0: \x{200a}\x{a0}\x{2000}
+
+/[\h\x{e000}]+/8BZ
+------------------------------------------------------------------
+        Bra
+        [\x09 \xa0\x{1680}\x{180e}\x{2000}-\x{200a}\x{202f}\x{205f}\x{3000}\x{e000}]+
+        Ket
+        End
+------------------------------------------------------------------
+    \x{1681}\x{200b}\x{1680}\x{2000}\x{202f}\x{3000}
+ 0: \x{1680}\x{2000}\x{202f}\x{3000}
+    \x{3001}\x{2fff}\x{200a}\x{a0}\x{2000}
+ 0: \x{200a}\x{a0}\x{2000}
+
+/\H+/8
+    \x{1680}\x{180e}\x{167f}\x{1681}\x{180d}\x{180f}
+ 0: \x{167f}\x{1681}\x{180d}\x{180f}
+    \x{2000}\x{200a}\x{1fff}\x{200b}
+ 0: \x{1fff}\x{200b}
+    \x{202f}\x{205f}\x{202e}\x{2030}\x{205e}\x{2060}
+ 0: \x{202e}\x{2030}\x{205e}\x{2060}
+    \x{a0}\x{3000}\x{9f}\x{a1}\x{2fff}\x{3001}
+ 0: \x{9f}\x{a1}\x{2fff}\x{3001}
+
+/[\H\x{d7ff}]+/8BZ
+------------------------------------------------------------------
+        Bra
+        [\x00-\x08\x0a-\x1f!-\x9f\xa1-\xff\x{100}-\x{167f}\x{1681}-\x{180d}\x{180f}-\x{1fff}\x{200b}-\x{202e}\x{2030}-\x{205e}\x{2060}-\x{2fff}\x{3001}-\x{10ffff}\x{d7ff}]+
+        Ket
+        End
+------------------------------------------------------------------
+    \x{1680}\x{180e}\x{167f}\x{1681}\x{180d}\x{180f}
+ 0: \x{167f}\x{1681}\x{180d}\x{180f}
+    \x{2000}\x{200a}\x{1fff}\x{200b}
+ 0: \x{1fff}\x{200b}
+    \x{202f}\x{205f}\x{202e}\x{2030}\x{205e}\x{2060}
+ 0: \x{202e}\x{2030}\x{205e}\x{2060}
+    \x{a0}\x{3000}\x{9f}\x{a1}\x{2fff}\x{3001}
+ 0: \x{9f}\x{a1}\x{2fff}\x{3001}
+
+/\v+/8
+    \x{2027}\x{2030}\x{2028}\x{2029}
+ 0: \x{2028}\x{2029}
+    \x09\x0e\x{84}\x{86}\x{85}\x0a\x0b\x0c\x0d
+ 0: \x{85}\x{0a}\x{0b}\x{0c}\x{0d}
+
+/[\v\x{e000}]+/8BZ
+------------------------------------------------------------------
+        Bra
+        [\x0a-\x0d\x85\x{2028}-\x{2029}\x{e000}]+
+        Ket
+        End
+------------------------------------------------------------------
+    \x{2027}\x{2030}\x{2028}\x{2029}
+ 0: \x{2028}\x{2029}
+    \x09\x0e\x{84}\x{86}\x{85}\x0a\x0b\x0c\x0d
+ 0: \x{85}\x{0a}\x{0b}\x{0c}\x{0d}
+
+/\V+/8
+    \x{2028}\x{2029}\x{2027}\x{2030}
+ 0: \x{2027}\x{2030}
+    \x{85}\x0a\x0b\x0c\x0d\x09\x0e\x{84}\x{86}
+ 0: \x{09}\x{0e}\x{84}\x{86}
+
+/[\V\x{d7ff}]+/8BZ
+------------------------------------------------------------------
+        Bra
+        [\x00-\x09\x0e-\x84\x86-\xff\x{100}-\x{2027}\x{202a}-\x{10ffff}\x{d7ff}]+
+        Ket
+        End
+------------------------------------------------------------------
+    \x{2028}\x{2029}\x{2027}\x{2030}
+ 0: \x{2027}\x{2030}
+    \x{85}\x0a\x0b\x0c\x0d\x09\x0e\x{84}\x{86}
+ 0: \x{09}\x{0e}\x{84}\x{86}
+
+/\R+/8<bsr_unicode>
+    \x{2027}\x{2030}\x{2028}\x{2029}
+ 0: \x{2028}\x{2029}
+    \x09\x0e\x{84}\x{86}\x{85}\x0a\x0b\x0c\x0d
+ 0: \x{85}\x{0a}\x{0b}\x{0c}\x{0d}
+
 /-- End of testinput5 --/

Modified: code/trunk/testdata/testoutput6
===================================================================
--- code/trunk/testdata/testoutput6    2011-12-28 16:10:09 UTC (rev 835)
+++ code/trunk/testdata/testoutput6    2011-12-28 17:16:11 UTC (rev 836)
@@ -1353,4 +1353,26 @@
     a\xFCb   
 No match

+/ⱥ/8i
+    ⱥ
+ 0: \x{2c65}
+    Ⱥx 
+ 0: \x{23a}
+    Ⱥ 
+ 0: \x{23a}
+
+/[ⱥ]/8i
+    ⱥ
+ 0: \x{2c65}
+    Ⱥx 
+ 0: \x{23a}
+    Ⱥ 
+ 0: \x{23a}
+
+/Ⱥ/8i
+    Ⱥ
+ 0: \x{23a}
+    ⱥ
+ 0: \x{2c65}
+
 /-- End of testinput6 --/

Modified: code/trunk/testdata/testoutput7
===================================================================
--- code/trunk/testdata/testoutput7    2011-12-28 16:10:09 UTC (rev 835)
+++ code/trunk/testdata/testoutput7    2011-12-28 17:16:11 UTC (rev 836)
@@ -1,7869 +1,1214 @@
-/-- This set of tests check the DFA matching functionality of pcre_dfa_exec().
-    The -dfa flag must be used with pcretest when running it. --/
-     
-/abc/
-    abc
- 0: abc
-    
-/ab*c/
-    abc
- 0: abc
-    abbbbc
- 0: abbbbc
-    ac
- 0: ac
-    
-/ab+c/
-    abc
- 0: abc
-    abbbbbbc
- 0: abbbbbbc
-    *** Failers 
-No match
-    ac
-No match
-    ab
-No match
-    
-/a*/
-    a
- 0: a
- 1: 
-    aaaaaaaaaaaaaaaaa
- 0: aaaaaaaaaaaaaaaaa
- 1: aaaaaaaaaaaaaaaa
- 2: aaaaaaaaaaaaaaa
- 3: aaaaaaaaaaaaaa
- 4: aaaaaaaaaaaaa
- 5: aaaaaaaaaaaa
- 6: aaaaaaaaaaa
- 7: aaaaaaaaaa
- 8: aaaaaaaaa
- 9: aaaaaaaa
-10: aaaaaaa
-11: aaaaaa
-12: aaaaa
-13: aaaa
-14: aaa
-15: aa
-16: a
-17: 
-    aaaaaaaaaaaaaaaaaaaaaaaaaaaaaa 
-Matched, but too many subsidiary matches
- 0: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
- 1: aaaaaaaaaaaaaaaaaaaaaaaaaaaaa
- 2: aaaaaaaaaaaaaaaaaaaaaaaaaaaa
- 3: aaaaaaaaaaaaaaaaaaaaaaaaaaa
- 4: aaaaaaaaaaaaaaaaaaaaaaaaaa
- 5: aaaaaaaaaaaaaaaaaaaaaaaaa
- 6: aaaaaaaaaaaaaaaaaaaaaaaa
- 7: aaaaaaaaaaaaaaaaaaaaaaa
- 8: aaaaaaaaaaaaaaaaaaaaaa
- 9: aaaaaaaaaaaaaaaaaaaaa
-10: aaaaaaaaaaaaaaaaaaaa
-11: aaaaaaaaaaaaaaaaaaa
-12: aaaaaaaaaaaaaaaaaa
-13: aaaaaaaaaaaaaaaaa
-14: aaaaaaaaaaaaaaaa
-15: aaaaaaaaaaaaaaa
-16: aaaaaaaaaaaaaa
-17: aaaaaaaaaaaaa
-18: aaaaaaaaaaaa
-19: aaaaaaaaaaa
-20: aaaaaaaaaa
-21: aaaaaaaaa
-    aaaaaaaaaaaaaaaaaaaaaaaaaaaaaa\F 
- 0: 
-    
-/(a|abcd|african)/
-    a
- 0: a
-    abcd
- 0: abcd
- 1: a
-    african
- 0: african
- 1: a
-    
-/^abc/
-    abcdef
- 0: abc
-    *** Failers
-No match
-    xyzabc
-No match
-    xyz\nabc    
-No match
-    
-/^abc/m
-    abcdef
- 0: abc
-    xyz\nabc    
- 0: abc
-    *** Failers
-No match
-    xyzabc
-No match
-    
-/\Aabc/
-    abcdef
- 0: abc
-    *** Failers
-No match
-    xyzabc
-No match
-    xyz\nabc    
-No match
-    
-/\Aabc/m
-    abcdef
- 0: abc
-    *** Failers
-No match
-    xyzabc
-No match
-    xyz\nabc    
-No match
-    
-/\Gabc/
-    abcdef
- 0: abc
-    xyzabc\>3
- 0: abc
-    *** Failers
-No match
-    xyzabc    
-No match
-    xyzabc\>2 
-No match
-    
-/x\dy\Dz/
-    x9yzz
- 0: x9yzz
-    x0y+z
- 0: x0y+z
-    *** Failers
-No match
-    xyz
-No match
-    xxy0z     
-No match
-    
-/x\sy\Sz/
-    x yzz
- 0: x yzz
-    x y+z
- 0: x y+z
-    *** Failers
-No match
-    xyz
-No match
-    xxyyz
-No match
-    
-/x\wy\Wz/
-    xxy+z
- 0: xxy+z
-    *** Failers
-No match
-    xxy0z
-No match
-    x+y+z         
-No match
-    
-/x.y/
-    x+y
- 0: x+y
-    x-y
- 0: x-y
-    *** Failers
-No match
-    x\ny
-No match
-    
-/x.y/s
-    x+y
- 0: x+y
-    x-y
- 0: x-y
-    x\ny
- 0: x\x0ay
+/-- These tests for Unicode property support test PCRE's API and show some of
+    the compiled code. They are not Perl-compatible. --/

-/(a.b(?s)c.d|x.y)p.q/
-    a+bc+dp+q
- 0: a+bc+dp+q
-    a+bc\ndp+q
- 0: a+bc\x0adp+q
-    x\nyp+q 
- 0: x\x0ayp+q
-    *** Failers 
-No match
-    a\nbc\ndp+q
-No match
-    a+bc\ndp\nq
-No match
-    x\nyp\nq 
-No match
+/[\p{L}]/DZ
+------------------------------------------------------------------
+        Bra
+        [\p{L}]
+        Ket
+        End
+------------------------------------------------------------------
+Capturing subpattern count = 0
+No options
+No first char
+No need char

-/a\d\z/
-    ba0
- 0: a0
-    *** Failers
-No match
-    ba0\n
-No match
-    ba0\ncd   
-No match
+/[\p{^L}]/DZ
+------------------------------------------------------------------
+        Bra
+        [\P{L}]
+        Ket
+        End
+------------------------------------------------------------------
+Capturing subpattern count = 0
+No options
+No first char
+No need char

-/a\d\z/m
-    ba0
- 0: a0
-    *** Failers
-No match
-    ba0\n
-No match
-    ba0\ncd   
-No match
+/[\P{L}]/DZ
+------------------------------------------------------------------
+        Bra
+        [\P{L}]
+        Ket
+        End
+------------------------------------------------------------------
+Capturing subpattern count = 0
+No options
+No first char
+No need char

-/a\d\Z/
-    ba0
- 0: a0
-    ba0\n
- 0: a0
-    *** Failers
-No match
-    ba0\ncd   
-No match
+/[\P{^L}]/DZ
+------------------------------------------------------------------
+        Bra
+        [\p{L}]
+        Ket
+        End
+------------------------------------------------------------------
+Capturing subpattern count = 0
+No options
+No first char
+No need char

-/a\d\Z/m
-    ba0
- 0: a0
-    ba0\n
- 0: a0
-    *** Failers
-No match
-    ba0\ncd   
-No match
+/[abc\p{L}\x{0660}]/8DZ
+------------------------------------------------------------------
+        Bra
+        [a-c\p{L}\x{660}]
+        Ket
+        End
+------------------------------------------------------------------
+Capturing subpattern count = 0
+Options: utf
+No first char
+No need char

-/a\d$/
-    ba0
- 0: a0
-    ba0\n
- 0: a0
-    *** Failers
-No match
-    ba0\ncd   
-No match
+/[\p{Nd}]/8DZ
+------------------------------------------------------------------
+        Bra
+        [\p{Nd}]
+        Ket
+        End
+------------------------------------------------------------------
+Capturing subpattern count = 0
+Options: utf
+No first char
+No need char
+    1234
+ 0: 1

-/a\d$/m
-    ba0
- 0: a0
-    ba0\n
- 0: a0
-    ba0\ncd   
- 0: a0
-    *** Failers
+/[\p{Nd}+-]+/8DZ
+------------------------------------------------------------------
+        Bra
+        [+\-\p{Nd}]+
+        Ket
+        End
+------------------------------------------------------------------
+Capturing subpattern count = 0
+Options: utf
+No first char
+No need char
+    1234
+ 0: 1234
+    12-34
+ 0: 12-34
+    12+\x{661}-34  
+ 0: 12+\x{661}-34
+    ** Failers
 No match
-
-/abc/i
-    abc
- 0: abc
-    aBc
- 0: aBc
-    ABC
- 0: ABC
-    
-/[^a]/
-    abcd
- 0: b
-    
-/ab?\w/
-    abz
- 0: abz
- 1: ab
-    abbz
- 0: abb
- 1: ab
-    azz  
- 0: az
-
-/x{0,3}yz/
-    ayzq
- 0: yz
-    axyzq
- 0: xyz
-    axxyz
- 0: xxyz
-    axxxyzq
- 0: xxxyz
-    axxxxyzq
- 0: xxxyz
-    *** Failers
+    abcd  
 No match
-    ax
-No match
-    axx     
-No match
-      
-/x{3}yz/
-    axxxyzq
- 0: xxxyz
-    axxxxyzq
- 0: xxxyz
-    *** Failers
-No match
-    ax
-No match
-    axx     
-No match
-    ayzq
-No match
-    axyzq
-No match
-    axxyz
-No match
-      
-/x{2,3}yz/
-    axxyz
- 0: xxyz
-    axxxyzq
- 0: xxxyz
-    axxxxyzq
- 0: xxxyz
-    *** Failers
-No match
-    ax
-No match
-    axx     
-No match
-    ayzq
-No match
-    axyzq
-No match
-      
-/[^a]+/
-    bac
- 0: b
-    bcdefax
- 0: bcdef
- 1: bcde
- 2: bcd
- 3: bc
- 4: b
-    *** Failers
- 0: *** F
- 1: *** 
- 2: ***
- 3: **
- 4: *
-    aaaaa   
-No match

-/[^a]*/
-    bac
- 0: b
- 1: 
-    bcdefax
- 0: bcdef
- 1: bcde
- 2: bcd
- 3: bc
- 4: b
- 5: 
-    *** Failers
- 0: *** F
- 1: *** 
- 2: ***
- 3: **
- 4: *
- 5: 
-    aaaaa   
- 0: 
-    
-/[^a]{3,5}/
-    xyz
- 0: xyz
-    awxyza
- 0: wxyz
- 1: wxy
-    abcdefa
- 0: bcdef
- 1: bcde
- 2: bcd
-    abcdefghijk
- 0: bcdef
- 1: bcde
- 2: bcd
-    *** Failers
- 0: *** F
- 1: *** 
- 2: ***
-    axya
+/[\x{105}-\x{109}]/8iDZ
+------------------------------------------------------------------
+        Bra
+        [\x{104}-\x{109}]
+        Ket
+        End
+------------------------------------------------------------------
+Capturing subpattern count = 0
+Options: caseless utf
+No first char
+No need char
+    \x{104}
+ 0: \x{104}
+    \x{105}
+ 0: \x{105}
+    \x{109}  
+ 0: \x{109}
+    ** Failers
 No match
-    axa
+    \x{100}
 No match
-    aaaaa         
+    \x{10a} 
 No match
-
-/\d*/
-    1234b567
- 0: 1234
- 1: 123
- 2: 12
- 3: 1
- 4: 
-    xyz
- 0:

-/\D*/
-    a1234b567
- 0: a
- 1: 
-    xyz
- 0: xyz
- 1: xy
- 2: x
- 3: 
-     
-/\d+/
-    ab1234c56
- 0: 1234
- 1: 123
- 2: 12
- 3: 1
-    *** Failers
+/[z-\x{100}]/8iDZ 
+------------------------------------------------------------------
+        Bra
+        [Z\x{39c}\x{178}z-\x{101}]
+        Ket
+        End
+------------------------------------------------------------------
+Capturing subpattern count = 0
+Options: caseless utf
+No first char
+No need char
+    Z
+ 0: Z
+    z
+ 0: z
+    \x{39c}
+ 0: \x{39c}
+    \x{178}
+ 0: \x{178}
+    |
+ 0: |
+    \x{80}
+ 0: \x{80}
+    \x{ff}
+ 0: \x{ff}
+    \x{100}
+ 0: \x{100}
+    \x{101} 
+ 0: \x{101}
+    ** Failers
 No match
-    xyz
+    \x{102}
 No match
-    
-/\D+/
-    ab123c56
- 0: ab
- 1: a
-    *** Failers
- 0: *** Failers
- 1: *** Failer
- 2: *** Faile
- 3: *** Fail
- 4: *** Fai
- 5: *** Fa
- 6: *** F
- 7: *** 
- 8: ***
- 9: **
-10: *
-    789
+    Y
 No match
-    
-/\d?A/
-    045ABC
- 0: 5A
-    ABC
- 0: A
-    *** Failers
+    y           
 No match
-    XYZ
-No match
-    
-/\D?A/
-    ABC
- 0: A
-    BAC
- 0: BA
-    9ABC             
- 0: A
-    *** Failers
-No match

-/a+/
-    aaaa
- 0: aaaa
- 1: aaa
- 2: aa
- 3: a
+/[z-\x{100}]/8DZi
+------------------------------------------------------------------
+        Bra
+        [Z\x{39c}\x{178}z-\x{101}]
+        Ket
+        End
+------------------------------------------------------------------
+Capturing subpattern count = 0
+Options: caseless utf
+No first char
+No need char

-/^.*xyz/
-    xyz
- 0: xyz
-    ggggggggxyz
- 0: ggggggggxyz
-    
-/^.+xyz/
-    abcdxyz
- 0: abcdxyz
-    axyz
- 0: axyz
-    *** Failers
-No match
-    xyz
-No match
-    
-/^.?xyz/
-    xyz
- 0: xyz
-    cxyz       
- 0: cxyz
+/(?:[\PPa*]*){8,}/

-/^\d{2,3}X/
-    12X
- 0: 12X
-    123X
- 0: 123X
-    *** Failers
-No match
-    X
-No match
-    1X
-No match
-    1234X     
-No match
+/[\P{Any}]/BZ
+------------------------------------------------------------------
+        Bra
+        [\P{Any}]
+        Ket
+        End
+------------------------------------------------------------------

-/^[abcd]\d/
-    a45
- 0: a4
-    b93
- 0: b9
-    c99z
- 0: c9
-    d04
- 0: d0
-    *** Failers
-No match
-    e45
-No match
-    abcd      
-No match
-    abcd1234
-No match
-    1234  
-No match
+/[\P{Any}\E]/BZ
+------------------------------------------------------------------
+        Bra
+        [\P{Any}]
+        Ket
+        End
+------------------------------------------------------------------

-/^[abcd]*\d/
-    a45
- 0: a4
-    b93
- 0: b9
-    c99z
- 0: c9
-    d04
- 0: d0
-    abcd1234
- 0: abcd1
-    1234  
- 0: 1
-    *** Failers
-No match
-    e45
-No match
-    abcd      
-No match
+/(\P{Yi}+\277)/

-/^[abcd]+\d/
-    a45
- 0: a4
-    b93
- 0: b9
-    c99z
- 0: c9
-    d04
- 0: d0
-    abcd1234
- 0: abcd1
-    *** Failers
-No match
-    1234  
-No match
-    e45
-No match
-    abcd      
-No match
+/(\P{Yi}+\277)?/

-/^a+X/
-    aX
- 0: aX
-    aaX 
- 0: aaX
+/(?<=\P{Yi}{3}A)X/

-/^[abcd]?\d/
-    a45
- 0: a4
-    b93
- 0: b9
-    c99z
- 0: c9
-    d04
- 0: d0
-    1234  
- 0: 1
-    *** Failers
-No match
-    abcd1234
-No match
-    e45
-No match
+/\p{Yi}+(\P{Yi}+)(?1)/

-/^[abcd]{2,3}\d/
-    ab45
- 0: ab4
-    bcd93
- 0: bcd9
-    *** Failers
-No match
-    1234 
-No match
-    a36 
-No match
-    abcd1234
-No match
-    ee45
-No match
+/(\P{Yi}{2}\277)?/

-/^(abc)*\d/
-    abc45
- 0: abc4
-    abcabcabc45
- 0: abcabcabc4
-    42xyz 
- 0: 4
-    *** Failers
-No match
+/[\P{Yi}A]/

-/^(abc)+\d/
-    abc45
- 0: abc4
-    abcabcabc45
- 0: abcabcabc4
-    *** Failers
-No match
-    42xyz 
-No match
+/[\P{Yi}\P{Yi}\P{Yi}A]/

-/^(abc)?\d/
-    abc45
- 0: abc4
-    42xyz 
- 0: 4
-    *** Failers
-No match
-    abcabcabc45
-No match
+/[^\P{Yi}A]/

-/^(abc){2,3}\d/
-    abcabc45
- 0: abcabc4
-    abcabcabc45
- 0: abcabcabc4
-    *** Failers
-No match
-    abcabcabcabc45
-No match
-    abc45
-No match
-    42xyz 
-No match
+/[^\P{Yi}\P{Yi}\P{Yi}A]/

-/1(abc|xyz)2(?1)3/
-    1abc2abc3456
- 0: 1abc2abc3
-    1abc2xyz3456 
- 0: 1abc2xyz3
+/(\P{Yi}*\277)*/

-/^(a*\w|ab)=(a*\w|ab)/
-    ab=ab
- 0: ab=ab
- 1: ab=a
+/(\P{Yi}*?\277)*/

-/^(a*\w|ab)=(?1)/
-    ab=ab
- 0: ab=ab
- 1: ab=a
+/(\p{Yi}*+\277)*/

-/^([^()]|\((?1)*\))*$/
-    abc
- 0: abc
-    a(b)c
- 0: a(b)c
-    a(b(c))d  
- 0: a(b(c))d
-    *** Failers)
-No match
-    a(b(c)d  
-No match
+/(\P{Yi}?\277)*/

-/^>abc>([^()]|\((?1)*\))*<xyz<$/
-    >abc>123<xyz<
- 0: >abc>123<xyz<
-    >abc>1(2)3<xyz<
- 0: >abc>1(2)3<xyz<
-    >abc>(1(2)3)<xyz<
- 0: >abc>(1(2)3)<xyz<
+/(\P{Yi}??\277)*/

-/^(?>a*)\d/
-    aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa9876
- 0: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa9
-    *** Failers 
-No match
-    aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
-No match
+/(\p{Yi}?+\277)*/

-/< (?: (?(R) \d++  | [^<>]*+) | (?R)) * >/x
-    <>
- 0: <>
-    <abcd>
- 0: <abcd>
-    <abc <123> hij>
- 0: <abc <123> hij>
-    <abc <def> hij>
- 0: <def>
-    <abc<>def> 
- 0: <abc<>def>
-    <abc<>      
- 0: <>
-    *** Failers
-No match
-    <abc
-No match
+/(\P{Yi}{0,3}\277)*/

-/^(?(?=abc)\w{3}:|\d\d)$/        
-    abc:                          
- 0: abc:
-    12                             
- 0: 12
-    *** Failers                     
-No match
-    123                       
-No match
-    xyz                        
-No match
-                                
-/^(?(?!abc)\d\d|\w{3}:)$/      
-    abc:                        
- 0: abc:
-    12         
- 0: 12
-    *** Failers
-No match
-    123
-No match
-    xyz    
-No match
+/(\P{Yi}{0,3}?\277)*/

-/^(?=abc)\w{5}:$/        
-    abcde:                          
- 0: abcde:
-    *** Failers                     
-No match
-    abc.. 
-No match
-    123                       
-No match
-    vwxyz                        
-No match
-                                
-/^(?!abc)\d\d$/      
-    12         
- 0: 12
-    *** Failers
-No match
-    abcde:
-No match
-    abc..  
-No match
-    123
-No match
-    vwxyz    
-No match
+/(\p{Yi}{0,3}+\277)*/

-/(?<=abc|xy)123/
-    abc12345
- 0: 123
-    wxy123z
- 0: 123
-    *** Failers
-No match
-    123abc
-No match
-
-/(?<!abc|xy)123/
-    123abc
- 0: 123
-    mno123456 
- 0: 123
-    *** Failers
-No match
-    abc12345
-No match
-    wxy123z
-No match
-
-/abc(?C1)xyz/
-    abcxyz
---->abcxyz
-  1 ^  ^       x
- 0: abcxyz
-    123abcxyz999 
---->123abcxyz999
-  1    ^  ^          x
- 0: abcxyz
-
-/(ab|cd){3,4}/C
-  ababab
---->ababab
- +0 ^          (ab|cd){3,4}
- +1 ^          a
- +4 ^          c
- +2 ^^         b
- +3 ^ ^        |
- +1 ^ ^        a
- +4 ^ ^        c
- +2 ^  ^       b
- +3 ^   ^      |
- +1 ^   ^      a
- +4 ^   ^      c
- +2 ^    ^     b
- +3 ^     ^    |
-+12 ^     ^    
- +1 ^     ^    a
- +4 ^     ^    c
- 0: ababab
-  abcdabcd
---->abcdabcd
- +0 ^            (ab|cd){3,4}
- +1 ^            a
- +4 ^            c
- +2 ^^           b
- +3 ^ ^          |
- +1 ^ ^          a
- +4 ^ ^          c
- +5 ^  ^         d
- +6 ^   ^        )
- +1 ^   ^        a
- +4 ^   ^        c
- +2 ^    ^       b
- +3 ^     ^      |
-+12 ^     ^      
- +1 ^     ^      a
- +4 ^     ^      c
- +5 ^      ^     d
- +6 ^       ^    )
-+12 ^       ^    
- 0: abcdabcd
- 1: abcdab
-  abcdcdcdcdcd  
---->abcdcdcdcdcd
- +0 ^                (ab|cd){3,4}
- +1 ^                a
- +4 ^                c
- +2 ^^               b
- +3 ^ ^              |
- +1 ^ ^              a
- +4 ^ ^              c
- +5 ^  ^             d
- +6 ^   ^            )
- +1 ^   ^            a
- +4 ^   ^            c
- +5 ^    ^           d
- +6 ^     ^          )
-+12 ^     ^          
- +1 ^     ^          a
- +4 ^     ^          c
- +5 ^      ^         d
- +6 ^       ^        )
-+12 ^       ^        
- 0: abcdcdcd
- 1: abcdcd
-
-/^abc/
-    abcdef
- 0: abc
-    *** Failers
-No match
-    abcdef\B  
-No match
-
-/^(a*|xyz)/
-    bcd
- 0: 
-    aaabcd
- 0: aaa
- 1: aa
- 2: a
- 3: 
-    xyz
- 0: xyz
- 1: 
-    xyz\N  
- 0: xyz
-    *** Failers
- 0: 
-    bcd\N   
-No match
+/\p{Zl}{2,3}+/8BZ
+------------------------------------------------------------------
+        Bra
+        prop Zl {2}
+        prop Zl ?+
+        Ket
+        End
+------------------------------------------------------------------
+    \xe2\x80\xa8\xe2\x80\xa8
+ 0: \x{2028}\x{2028}
+    \x{2028}\x{2028}\x{2028}
+ 0: \x{2028}\x{2028}\x{2028}

-/xyz$/
-    xyz
- 0: xyz
-    xyz\n
- 0: xyz
-    *** Failers
-No match
-    xyz\Z
-No match
-    xyz\n\Z    
-No match
-    
-/xyz$/m
-    xyz
- 0: xyz
-    xyz\n 
- 0: xyz
-    abcxyz\npqr 
- 0: xyz
-    abcxyz\npqr\Z 
- 0: xyz
-    xyz\n\Z    
- 0: xyz
-    *** Failers
-No match
-    xyz\Z
-No match
+/\p{Zl}/8BZ
+------------------------------------------------------------------
+        Bra
+        prop Zl
+        Ket
+        End
+------------------------------------------------------------------

-/\Gabc/
-    abcdef
- 0: abc
-    defabcxyz\>3 
- 0: abc
-    *** Failers 
-No match
-    defabcxyz
-No match
+/\p{Lu}{3}+/8BZ
+------------------------------------------------------------------
+        Bra
+        prop Lu {3}
+        Ket
+        End
+------------------------------------------------------------------

-/^abcdef/
-    ab\P
-Partial match: ab
-    abcde\P
-Partial match: abcde
-    abcdef\P
- 0: abcdef
-    *** Failers
-No match
-    abx\P    
-No match
+/\pL{2}+/8BZ
+------------------------------------------------------------------
+        Bra
+        prop L {2}
+        Ket
+        End
+------------------------------------------------------------------

-/^a{2,4}\d+z/
-    a\P
-Partial match: a
-    aa\P
-Partial match: aa
-    aa2\P 
-Partial match: aa2
-    aaa\P
-Partial match: aaa
-    aaa23\P 
-Partial match: aaa23
-    aaaa12345\P
-Partial match: aaaa12345
-    aa0z\P
- 0: aa0z
-    aaaa4444444444444z\P 
- 0: aaaa4444444444444z
-    *** Failers
-No match
-    az\P 
-No match
-    aaaaa\P 
-No match
-    a56\P 
-No match
+/\p{Cc}{2}+/8BZ
+------------------------------------------------------------------
+        Bra
+        prop Cc {2}
+        Ket
+        End
+------------------------------------------------------------------

-/^abcdef/
-   abc\P
-Partial match: abc
-   def\R 
- 0: def
-   
-/(?<=foo)bar/
-   xyzfo\P 
+/^\p{Cs}/8
+    \?\x{dfff}
+ 0: \x{dfff}
+    ** Failers
 No match
-   foob\P\>2 
-Partial match: foob
-   foobar...\R\P\>4 
- 0: ar
-   xyzfo\P
+    \x{09f} 
 No match
-   foobar\>2  
- 0: bar
-   *** Failers
+  
+/^\p{Sc}+/8
+    $\x{a2}\x{a3}\x{a4}\x{a5}\x{a6}
+ 0: $\x{a2}\x{a3}\x{a4}\x{a5}
+    \x{9f2}
+ 0: \x{9f2}
+    ** Failers
 No match
-   xyzfo\P
+    X
 No match
-   obar\R   
+    \x{2c2}
 No match
-
-/(ab*(cd|ef))+X/
-    adfadadaklhlkalkajhlkjahdfasdfasdfladsfjkj\P\Z
-No match
-    lkjhlkjhlkjhlkjhabbbbbbcdaefabbbbbbbefa\P\B\Z
-Partial match: abbbbbbcdaefabbbbbbbefa
-    cdabbbbbbbb\P\R\B\Z
-Partial match: cdabbbbbbbb
-    efabbbbbbbbbbbbbbbb\P\R\B\Z
-Partial match: efabbbbbbbbbbbbbbbb
-    bbbbbbbbbbbbcdXyasdfadf\P\R\B\Z    
- 0: bbbbbbbbbbbbcdX
-
-/(a|b)/SF>testsavedregex
-Compiled pattern written to testsavedregex
-Study data written to testsavedregex
-<testsavedregex
-Compiled pattern (byte-inverted) loaded from testsavedregex
-Study data loaded from testsavedregex
-    abc
- 0: a
+  
+/^\p{Zs}/8
+    \ \
+ 0:  
+    \x{a0}
+ 0: \x{a0}
+    \x{1680}
+ 0: \x{1680}
+    \x{180e}
+ 0: \x{180e}
+    \x{2000}
+ 0: \x{2000}
+    \x{2001}     
+ 0: \x{2001}
     ** Failers
- 0: a
-    def  
 No match
-    
-/the quick brown fox/
-    the quick brown fox
- 0: the quick brown fox
-    The quick brown FOX
+    \x{2028}
 No match
-    What do you know about the quick brown fox?
- 0: the quick brown fox
-    What do you know about THE QUICK BROWN FOX?
+    \x{200d} 
 No match
-
-/The quick brown fox/i
-    the quick brown fox
- 0: the quick brown fox
-    The quick brown FOX
- 0: The quick brown FOX
-    What do you know about the quick brown fox?
- 0: the quick brown fox
-    What do you know about THE QUICK BROWN FOX?
- 0: THE QUICK BROWN FOX
-
-/abcd\t\n\r\f\a\e\071\x3b\$\\\?caxyz/
-    abcd\t\n\r\f\a\e9;\$\\?caxyz
- 0: abcd\x09\x0a\x0d\x0c\x07\x1b9;$\?caxyz
-
-/a*abc?xyz+pqr{3}ab{2,}xy{4,5}pq{0,6}AB{0,}zz/
-    abxyzpqrrrabbxyyyypqAzz
- 0: abxyzpqrrrabbxyyyypqAzz
-    abxyzpqrrrabbxyyyypqAzz
- 0: abxyzpqrrrabbxyyyypqAzz
-    aabxyzpqrrrabbxyyyypqAzz
- 0: aabxyzpqrrrabbxyyyypqAzz
-    aaabxyzpqrrrabbxyyyypqAzz
- 0: aaabxyzpqrrrabbxyyyypqAzz
-    aaaabxyzpqrrrabbxyyyypqAzz
- 0: aaaabxyzpqrrrabbxyyyypqAzz
-    abcxyzpqrrrabbxyyyypqAzz
- 0: abcxyzpqrrrabbxyyyypqAzz
-    aabcxyzpqrrrabbxyyyypqAzz
- 0: aabcxyzpqrrrabbxyyyypqAzz
-    aaabcxyzpqrrrabbxyyyypAzz
- 0: aaabcxyzpqrrrabbxyyyypAzz
-    aaabcxyzpqrrrabbxyyyypqAzz
- 0: aaabcxyzpqrrrabbxyyyypqAzz
-    aaabcxyzpqrrrabbxyyyypqqAzz
- 0: aaabcxyzpqrrrabbxyyyypqqAzz
-    aaabcxyzpqrrrabbxyyyypqqqAzz
- 0: aaabcxyzpqrrrabbxyyyypqqqAzz
-    aaabcxyzpqrrrabbxyyyypqqqqAzz
- 0: aaabcxyzpqrrrabbxyyyypqqqqAzz
-    aaabcxyzpqrrrabbxyyyypqqqqqAzz
- 0: aaabcxyzpqrrrabbxyyyypqqqqqAzz
-    aaabcxyzpqrrrabbxyyyypqqqqqqAzz
- 0: aaabcxyzpqrrrabbxyyyypqqqqqqAzz
-    aaaabcxyzpqrrrabbxyyyypqAzz
- 0: aaaabcxyzpqrrrabbxyyyypqAzz
-    abxyzzpqrrrabbxyyyypqAzz
- 0: abxyzzpqrrrabbxyyyypqAzz
-    aabxyzzzpqrrrabbxyyyypqAzz
- 0: aabxyzzzpqrrrabbxyyyypqAzz
-    aaabxyzzzzpqrrrabbxyyyypqAzz
- 0: aaabxyzzzzpqrrrabbxyyyypqAzz
-    aaaabxyzzzzpqrrrabbxyyyypqAzz
- 0: aaaabxyzzzzpqrrrabbxyyyypqAzz
-    abcxyzzpqrrrabbxyyyypqAzz
- 0: abcxyzzpqrrrabbxyyyypqAzz
-    aabcxyzzzpqrrrabbxyyyypqAzz
- 0: aabcxyzzzpqrrrabbxyyyypqAzz
-    aaabcxyzzzzpqrrrabbxyyyypqAzz
- 0: aaabcxyzzzzpqrrrabbxyyyypqAzz
-    aaaabcxyzzzzpqrrrabbxyyyypqAzz
- 0: aaaabcxyzzzzpqrrrabbxyyyypqAzz
-    aaaabcxyzzzzpqrrrabbbxyyyypqAzz
- 0: aaaabcxyzzzzpqrrrabbbxyyyypqAzz
-    aaaabcxyzzzzpqrrrabbbxyyyyypqAzz
- 0: aaaabcxyzzzzpqrrrabbbxyyyyypqAzz
-    aaabcxyzpqrrrabbxyyyypABzz
- 0: aaabcxyzpqrrrabbxyyyypABzz
-    aaabcxyzpqrrrabbxyyyypABBzz
- 0: aaabcxyzpqrrrabbxyyyypABBzz
-    >>>aaabxyzpqrrrabbxyyyypqAzz
- 0: aaabxyzpqrrrabbxyyyypqAzz
-    >aaaabxyzpqrrrabbxyyyypqAzz
- 0: aaaabxyzpqrrrabbxyyyypqAzz
-    >>>>abcxyzpqrrrabbxyyyypqAzz
- 0: abcxyzpqrrrabbxyyyypqAzz
-    *** Failers
-No match
-    abxyzpqrrabbxyyyypqAzz
-No match
-    abxyzpqrrrrabbxyyyypqAzz
-No match
-    abxyzpqrrrabxyyyypqAzz
-No match
-    aaaabcxyzzzzpqrrrabbbxyyyyyypqAzz
-No match
-    aaaabcxyzzzzpqrrrabbbxyyypqAzz
-No match
-    aaabcxyzpqrrrabbxyyyypqqqqqqqAzz
-No match
-
-/^(abc){1,2}zz/
-    abczz
- 0: abczz
-    abcabczz
- 0: abcabczz
-    *** Failers
-No match
-    zz
-No match
-    abcabcabczz
-No match
-    >>abczz
-No match
-
-/^(b+?|a){1,2}?c/
-    bc
- 0: bc
-    bbc
- 0: bbc
-    bbbc
- 0: bbbc
-    bac
- 0: bac
-    bbac
- 0: bbac
-    aac
- 0: aac
-    abbbbbbbbbbbc
- 0: abbbbbbbbbbbc
-    bbbbbbbbbbbac
- 0: bbbbbbbbbbbac
-    *** Failers
-No match
-    aaac
-No match
-    abbbbbbbbbbbac
-No match
-
-/^(b+|a){1,2}c/
-    bc
- 0: bc
-    bbc
- 0: bbc
-    bbbc
- 0: bbbc
-    bac
- 0: bac
-    bbac
- 0: bbac
-    aac
- 0: aac
-    abbbbbbbbbbbc
- 0: abbbbbbbbbbbc
-    bbbbbbbbbbbac
- 0: bbbbbbbbbbbac
-    *** Failers
-No match
-    aaac
-No match
-    abbbbbbbbbbbac
-No match
-
-/^(b+|a){1,2}?bc/
-    bbc
- 0: bbc
-
-/^(b*|ba){1,2}?bc/
-    babc
- 0: babc
-    bbabc
- 0: bbabc
-    bababc
- 0: bababc
-    *** Failers
-No match
-    bababbc
-No match
-    babababc
-No match
-
-/^(ba|b*){1,2}?bc/
-    babc
- 0: babc
-    bbabc
- 0: bbabc
-    bababc
- 0: bababc
-    *** Failers
-No match
-    bababbc
-No match
-    babababc
-No match
-
-/^\ca\cA\c[\c{\c:/
-    \x01\x01\e;z
- 0: \x01\x01\x1b;z
-
-/^[ab\]cde]/
-    athing
- 0: a
-    bthing
- 0: b
-    ]thing
- 0: ]
-    cthing
- 0: c
-    dthing
- 0: d
-    ething
- 0: e
-    *** Failers
-No match
-    fthing
-No match
-    [thing
-No match
-    \\thing
-No match
-
-/^[]cde]/
-    ]thing
- 0: ]
-    cthing
- 0: c
-    dthing
- 0: d
-    ething
- 0: e
-    *** Failers
-No match
-    athing
-No match
-    fthing
-No match
-
-/^[^ab\]cde]/
-    fthing
- 0: f
-    [thing
- 0: [
-    \\thing
- 0: \
-    *** Failers
+  
+/-- These four are here rather than in test 6 because Perl has problems with
+    the negative versions of the properties. --/
+      
+/\p{^Lu}/8i
+    1234
+ 0: 1
+    ** Failers
  0: *
-    athing
+    ABC 
 No match
-    bthing
-No match
-    ]thing
-No match
-    cthing
-No match
-    dthing
-No match
-    ething
-No match

-/^[^]cde]/
-    athing
- 0: a
-    fthing
- 0: f
-    *** Failers
+/\P{Lu}/8i
+    1234
+ 0: 1
+    ** Failers
  0: *
-    ]thing
+    ABC 
 No match
-    cthing
-No match
-    dthing
-No match
-    ething
-No match

-/^\\x81/
-    \x81
- 0: \x81
-
-/^\xFF/
-    \xFF
- 0: \xff
-
-/^[0-9]+$/
-    0
- 0: 0
-    1
- 0: 1
-    2
- 0: 2
-    3
- 0: 3
-    4
- 0: 4
-    5
- 0: 5
-    6
- 0: 6
-    7
- 0: 7
-    8
- 0: 8
-    9
- 0: 9
-    10
- 0: 10
-    100
- 0: 100
-    *** Failers
-No match
-    abc
-No match
-
-/^.*nter/
-    enter
- 0: enter
-    inter
- 0: inter
-    uponter
- 0: uponter
-
-/^xxx[0-9]+$/
-    xxx0
- 0: xxx0
-    xxx1234
- 0: xxx1234
-    *** Failers
-No match
-    xxx
-No match
-
-/^.+[0-9][0-9][0-9]$/
-    x123
- 0: x123
-    xx123
- 0: xx123
-    123456
- 0: 123456
-    *** Failers
-No match
-    123
-No match
-    x1234
- 0: x1234
-
-/^.+?[0-9][0-9][0-9]$/
-    x123
- 0: x123
-    xx123
- 0: xx123
-    123456
- 0: 123456
-    *** Failers
-No match
-    123
-No match
-    x1234
- 0: x1234
-
-/^([^!]+)!(.+)=apquxz\.ixr\.zzz\.ac\.uk$/
-    abc!pqr=apquxz.ixr.zzz.ac.uk
- 0: abc!pqr=apquxz.ixr.zzz.ac.uk
-    *** Failers
-No match
-    !pqr=apquxz.ixr.zzz.ac.uk
-No match
-    abc!=apquxz.ixr.zzz.ac.uk
-No match
-    abc!pqr=apquxz:ixr.zzz.ac.uk
-No match
-    abc!pqr=apquxz.ixr.zzz.ac.ukk
-No match
-
-/:/
-    Well, we need a colon: somewhere
- 0: :
-    *** Fail if we don't
-No match
-
-/([\da-f:]+)$/i
-    0abc
- 0: 0abc
-    abc
- 0: abc
-    fed
- 0: fed
-    E
- 0: E
-    ::
- 0: ::
-    5f03:12C0::932e
- 0: 5f03:12C0::932e
-    fed def
- 0: def
-    Any old stuff
- 0: ff
-    *** Failers
-No match
-    0zzz
-No match
-    gzzz
-No match
-    fed\x20
-No match
-    Any old rubbish
-No match
-
-/^.*\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})$/
-    .1.2.3
- 0: .1.2.3
-    A.12.123.0
- 0: A.12.123.0
-    *** Failers
-No match
-    .1.2.3333
-No match
-    1.2.3
-No match
-    1234.2.3
-No match
-
-/^(\d+)\s+IN\s+SOA\s+(\S+)\s+(\S+)\s*\(\s*$/
-    1 IN SOA non-sp1 non-sp2(
- 0: 1 IN SOA non-sp1 non-sp2(
-    1    IN    SOA    non-sp1    non-sp2   (
- 0: 1    IN    SOA    non-sp1    non-sp2   (
-    *** Failers
-No match
-    1IN SOA non-sp1 non-sp2(
-No match
-
-/^[a-zA-Z\d][a-zA-Z\d\-]*(\.[a-zA-Z\d][a-zA-z\d\-]*)*\.$/
-    a.
- 0: a.
-    Z.
- 0: Z.
-    2.
- 0: 2.
-    ab-c.pq-r.
- 0: ab-c.pq-r.
-    sxk.zzz.ac.uk.
- 0: sxk.zzz.ac.uk.
-    x-.y-.
- 0: x-.y-.
-    *** Failers
-No match
-    -abc.peq.
-No match
-
-/^\*\.[a-z]([a-z\-\d]*[a-z\d]+)?(\.[a-z]([a-z\-\d]*[a-z\d]+)?)*$/
-    *.a
- 0: *.a
-    *.b0-a
- 0: *.b0-a
-    *.c3-b.c
- 0: *.c3-b.c
-    *.c-a.b-c
- 0: *.c-a.b-c
-    *** Failers
-No match
-    *.0
-No match
-    *.a-
-No match
-    *.a-b.c-
-No match
-    *.c-a.0-c
-No match
-
-/^(?=ab(de))(abd)(e)/
-    abde
- 0: abde
-
-/^(?!(ab)de|x)(abd)(f)/
-    abdf
- 0: abdf
-
-/^(?=(ab(cd)))(ab)/
-    abcd
- 0: ab
-
-/^[\da-f](\.[\da-f])*$/i
-    a.b.c.d
- 0: a.b.c.d
-    A.B.C.D
- 0: A.B.C.D
-    a.b.c.1.2.3.C
- 0: a.b.c.1.2.3.C
-
-/^\".*\"\s*(;.*)?$/
-    \"1234\"
- 0: "1234"
-    \"abcd\" ;
- 0: "abcd" ;
-    \"\" ; rhubarb
- 0: "" ; rhubarb
-    *** Failers
-No match
-    \"1234\" : things
-No match
-
-/^$/
-    \
- 0: 
-    *** Failers
-No match
-
-/   ^    a   (?# begins with a)  b\sc (?# then b c) $ (?# then end)/x
-    ab c
- 0: ab c
-    *** Failers
-No match
-    abc
-No match
-    ab cde
-No match
-
-/(?x)   ^    a   (?# begins with a)  b\sc (?# then b c) $ (?# then end)/
-    ab c
- 0: ab c
-    *** Failers
-No match
-    abc
-No match
-    ab cde
-No match
-
-/^   a\ b[c ]d       $/x
-    a bcd
- 0: a bcd
-    a b d
- 0: a b d
-    *** Failers
-No match
-    abcd
-No match
-    ab d
-No match
-
-/^(a(b(c)))(d(e(f)))(h(i(j)))(k(l(m)))$/
-    abcdefhijklm
- 0: abcdefhijklm
-
-/^(?:a(b(c)))(?:d(e(f)))(?:h(i(j)))(?:k(l(m)))$/
-    abcdefhijklm
- 0: abcdefhijklm
-
-/^[\w][\W][\s][\S][\d][\D][\b][\n][\c]][\022]/
-    a+ Z0+\x08\n\x1d\x12
- 0: a+ Z0+\x08\x0a\x1d\x12
-
-/^[.^$|()*+?{,}]+/
-    .^\$(*+)|{?,?}
- 0: .^$(*+)|{?,?}
- 1: .^$(*+)|{?,?
- 2: .^$(*+)|{?,
- 3: .^$(*+)|{?
- 4: .^$(*+)|{
- 5: .^$(*+)|
- 6: .^$(*+)
- 7: .^$(*+
- 8: .^$(*
- 9: .^$(
-10: .^$
-11: .^
-12: .
-
-/^a*\w/
-    z
- 0: z
-    az
- 0: az
- 1: a
-    aaaz
- 0: aaaz
- 1: aaa
- 2: aa
- 3: a
+/\p{Ll}/8i 
     a
  0: a
-    aa
- 0: aa
- 1: a
-    aaaa
- 0: aaaa
- 1: aaa
- 2: aa
- 3: a
-    a+
- 0: a
-    aa+
- 0: aa
- 1: a
-
-/^a*?\w/
-    z
+    Az
  0: z
-    az
- 0: az
- 1: a
-    aaaz
- 0: aaaz
- 1: aaa
- 2: aa
- 3: a
-    a
+    ** Failers
  0: a
-    aa
- 0: aa
- 1: a
-    aaaa
- 0: aaaa
- 1: aaa
- 2: aa
- 3: a
-    a+
- 0: a
-    aa+
- 0: aa
- 1: a
-
-/^a+\w/
-    az
- 0: az
-    aaaz
- 0: aaaz
- 1: aaa
- 2: aa
-    aa
- 0: aa
-    aaaa
- 0: aaaa
- 1: aaa
- 2: aa
-    aa+
- 0: aa
-
-/^a+?\w/
-    az
- 0: az
-    aaaz
- 0: aaaz
- 1: aaa
- 2: aa
-    aa
- 0: aa
-    aaaa
- 0: aaaa
- 1: aaa
- 2: aa
-    aa+
- 0: aa
-
-/^\d{8}\w{2,}/
-    1234567890
- 0: 1234567890
-    12345678ab
- 0: 12345678ab
-    12345678__
- 0: 12345678__
-    *** Failers
+    ABC   
 No match
-    1234567
-No match

-/^[aeiou\d]{4,5}$/
-    uoie
- 0: uoie
-    1234
- 0: 1234
-    12345
- 0: 12345
-    aaaaa
- 0: aaaaa
-    *** Failers
+/\p{Lu}/8i
+    A
+ 0: A
+    a\x{10a0}B 
+ 0: \x{10a0}
+    ** Failers 
+ 0: F
+    a
 No match
-    123456
+    \x{1d00}  
 No match

-/^[aeiou\d]{4,5}?/
-    uoie
- 0: uoie
-    1234
- 0: 1234
-    12345
- 0: 12345
- 1: 1234
-    aaaaa
- 0: aaaaa
- 1: aaaa
-    123456
- 0: 12345
- 1: 1234
+/[\x{c0}\x{391}]/8i
+    \x{c0}
+ 0: \x{c0}
+    \x{e0} 
+ 0: \x{e0}

-/^From +([^ ]+) +[a-zA-Z][a-zA-Z][a-zA-Z] +[a-zA-Z][a-zA-Z][a-zA-Z] +[0-9]?[0-9] +[0-9][0-9]:[0-9][0-9]/
-    From abcd  Mon Sep 01 12:33:02 1997
- 0: From abcd  Mon Sep 01 12:33
+/-- The next two are special cases where the lengths of the different cases of
+the same character differ. The first went wrong with heap frame storage; the
+second was broken in all cases. --/

-/^From\s+\S+\s+([a-zA-Z]{3}\s+){2}\d{1,2}\s+\d\d:\d\d/
-    From abcd  Mon Sep 01 12:33:02 1997
- 0: From abcd  Mon Sep 01 12:33
-    From abcd  Mon Sep  1 12:33:02 1997
- 0: From abcd  Mon Sep  1 12:33
-    *** Failers
-No match
-    From abcd  Sep 01 12:33:02 1997
-No match
+/^\x{023a}+?(\x{0130}+)/8i
+  \x{023a}\x{2c65}\x{0130}
+ 0: \x{23a}\x{2c65}\x{130}
+ 1: \x{130}
+  
+/^\x{023a}+([^X])/8i
+  \x{023a}\x{2c65}X
+ 0: \x{23a}\x{2c65}
+ 1: \x{2c65}

-/^12.34/s
-    12\n34
- 0: 12\x0a34
-    12\r34
- 0: 12\x0d34
+/\x{c0}+\x{116}+/8i
+    \x{c0}\x{e0}\x{116}\x{117}
+ 0: \x{c0}\x{e0}\x{116}\x{117}

-/\w+(?=\t)/
-    the quick brown\t fox
- 0: brown
+/[\x{c0}\x{116}]+/8i
+    \x{c0}\x{e0}\x{116}\x{117}
+ 0: \x{c0}\x{e0}\x{116}\x{117}

-/foo(?!bar)(.*)/
-    foobar is foolish see?
- 0: foolish see?
- 1: foolish see
- 2: foolish se
- 3: foolish s
- 4: foolish 
- 5: foolish
- 6: foolis
- 7: fooli
- 8: fool
- 9: foo
+/(\x{de})\1/8i
+    \x{de}\x{de}
+ 0: \x{de}\x{de}
+ 1: \x{de}
+    \x{de}\x{fe}
+ 0: \x{de}\x{fe}
+ 1: \x{de}
+    \x{fe}\x{fe}
+ 0: \x{fe}\x{fe}
+ 1: \x{fe}
+    \x{fe}\x{de}
+ 0: \x{fe}\x{de}
+ 1: \x{fe}

-/(?:(?!foo)...|^.{0,2})bar(.*)/
-    foobar crowbar etc
- 0: rowbar etc
- 1: rowbar et
- 2: rowbar e
- 3: rowbar 
- 4: rowbar
-    barrel
- 0: barrel
- 1: barre
- 2: barr
- 3: bar
-    2barrel
- 0: 2barrel
- 1: 2barre
- 2: 2barr
- 3: 2bar
-    A barrel
- 0: A barrel
- 1: A barre
- 2: A barr
- 3: A bar
+/^\x{c0}$/8i
+    \x{c0}
+ 0: \x{c0}
+    \x{e0} 
+ 0: \x{e0}

-/^(\D*)(?=\d)(?!123)/
-    abc456
- 0: abc
-    *** Failers
-No match
-    abc123
-No match
+/^\x{e0}$/8i
+    \x{c0}
+ 0: \x{c0}
+    \x{e0} 
+ 0: \x{e0}

-/^1234(?# test newlines
-  inside)/
-    1234
- 0: 1234
+/-- The next two should be Perl-compatible, but it fails to match \x{e0}. PCRE
+will match it only with UCP support, because without that it has no notion
+of case for anything other than the ASCII letters. --/

-/^1234 #comment in extended re
-  /x
-    1234
- 0: 1234
+/((?i)[\x{c0}])/8
+    \x{c0}
+ 0: \x{c0}
+ 1: \x{c0}
+    \x{e0} 
+ 0: \x{e0}
+ 1: \x{e0}

-/#rhubarb
-  abcd/x
-    abcd
- 0: abcd
+/(?i:[\x{c0}])/8
+    \x{c0}
+ 0: \x{c0}
+    \x{e0} 
+ 0: \x{e0}

-/^abcd#rhubarb/x
-    abcd
- 0: abcd
-
-/(?!^)abc/
-    the abc
- 0: abc
-    *** Failers
-No match
-    abc
-No match
-
-/(?=^)abc/
-    abc
- 0: abc
-    *** Failers
-No match
-    the abc
-No match
-
-/^[ab]{1,3}(ab*|b)/
-    aabbbbb
- 0: aabbbbb
- 1: aabbbb
- 2: aabbb
- 3: aabb
- 4: aab
- 5: aa
-
-/^[ab]{1,3}?(ab*|b)/
-    aabbbbb
- 0: aabbbbb
- 1: aabbbb
- 2: aabbb
- 3: aabb
- 4: aab
- 5: aa
-
-/^[ab]{1,3}?(ab*?|b)/
-    aabbbbb
- 0: aabbbbb
- 1: aabbbb
- 2: aabbb
- 3: aabb
- 4: aab
- 5: aa
-
-/^[ab]{1,3}(ab*?|b)/
-    aabbbbb
- 0: aabbbbb
- 1: aabbbb
- 2: aabbb
- 3: aabb
- 4: aab
- 5: aa
-
-/  (?: [\040\t] |  \(
-(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
-\)  )*                          # optional leading comment
-(?:    (?:
-[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+    # some number of atom characters...
-(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
-|
-" (?:                      # opening quote...
-[^\\\x80-\xff\n\015"]                #   Anything except backslash and quote
-|                     #    or
-\\ [^\x80-\xff]           #   Escaped something (something != CR)
-)* "  # closing quote
-)                    # initial word
-(?:  (?: [\040\t] |  \(
-(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
-\)  )*  \.  (?: [\040\t] |  \(
-(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
-\)  )*   (?:
-[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+    # some number of atom characters...
-(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
-|
-" (?:                      # opening quote...
-[^\\\x80-\xff\n\015"]                #   Anything except backslash and quote
-|                     #    or
-\\ [^\x80-\xff]           #   Escaped something (something != CR)
-)* "  # closing quote
-)  )* # further okay, if led by a period
-(?: [\040\t] |  \(
-(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
-\)  )*  @  (?: [\040\t] |  \(
-(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
-\)  )*    (?:
-[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+    # some number of atom characters...
-(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
-|   \[                         # [
-(?: [^\\\x80-\xff\n\015\[\]] |  \\ [^\x80-\xff]  )*    #    stuff
-\]                        #           ]
-)                           # initial subdomain
-(?:                                  #
-(?: [\040\t] |  \(
-(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
-\)  )*  \.                        # if led by a period...
-(?: [\040\t] |  \(
-(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
-\)  )*   (?:
-[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+    # some number of atom characters...
-(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
-|   \[                         # [
-(?: [^\\\x80-\xff\n\015\[\]] |  \\ [^\x80-\xff]  )*    #    stuff
-\]                        #           ]
-)                     #   ...further okay
-)*
-# address
-|                     #  or
-(?:
-[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+    # some number of atom characters...
-(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
-|
-" (?:                      # opening quote...
-[^\\\x80-\xff\n\015"]                #   Anything except backslash and quote
-|                     #    or
-\\ [^\x80-\xff]           #   Escaped something (something != CR)
-)* "  # closing quote
-)             # one word, optionally followed by....
-(?:
-[^()<>@,;:".\\\[\]\x80-\xff\000-\010\012-\037]  |  # atom and space parts, or...
-\(
-(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
-\)       |  # comments, or...
-
-" (?:                      # opening quote...
-[^\\\x80-\xff\n\015"]                #   Anything except backslash and quote
-|                     #    or
-\\ [^\x80-\xff]           #   Escaped something (something != CR)
-)* "  # closing quote
-# quoted strings
-)*
-<  (?: [\040\t] |  \(
-(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
-\)  )*                     # leading <
-(?:  @  (?: [\040\t] |  \(
-(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
-\)  )*    (?:
-[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+    # some number of atom characters...
-(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
-|   \[                         # [
-(?: [^\\\x80-\xff\n\015\[\]] |  \\ [^\x80-\xff]  )*    #    stuff
-\]                        #           ]
-)                           # initial subdomain
-(?:                                  #
-(?: [\040\t] |  \(
-(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
-\)  )*  \.                        # if led by a period...
-(?: [\040\t] |  \(
-(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
-\)  )*   (?:
-[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+    # some number of atom characters...
-(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
-|   \[                         # [
-(?: [^\\\x80-\xff\n\015\[\]] |  \\ [^\x80-\xff]  )*    #    stuff
-\]                        #           ]
-)                     #   ...further okay
-)*
-
-(?:  (?: [\040\t] |  \(
-(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
-\)  )*  ,  (?: [\040\t] |  \(
-(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
-\)  )*  @  (?: [\040\t] |  \(
-(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
-\)  )*    (?:
-[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+    # some number of atom characters...
-(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
-|   \[                         # [
-(?: [^\\\x80-\xff\n\015\[\]] |  \\ [^\x80-\xff]  )*    #    stuff
-\]                        #           ]
-)                           # initial subdomain
-(?:                                  #
-(?: [\040\t] |  \(
-(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
-\)  )*  \.                        # if led by a period...
-(?: [\040\t] |  \(
-(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
-\)  )*   (?:
-[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+    # some number of atom characters...
-(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
-|   \[                         # [
-(?: [^\\\x80-\xff\n\015\[\]] |  \\ [^\x80-\xff]  )*    #    stuff
-\]                        #           ]
-)                     #   ...further okay
-)*
-)* # further okay, if led by comma
-:                                # closing colon
-(?: [\040\t] |  \(
-(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
-\)  )*  )? #       optional route
-(?:
-[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+    # some number of atom characters...
-(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
-|
-" (?:                      # opening quote...
-[^\\\x80-\xff\n\015"]                #   Anything except backslash and quote
-|                     #    or
-\\ [^\x80-\xff]           #   Escaped something (something != CR)
-)* "  # closing quote
-)                    # initial word
-(?:  (?: [\040\t] |  \(
-(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
-\)  )*  \.  (?: [\040\t] |  \(
-(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
-\)  )*   (?:
-[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+    # some number of atom characters...
-(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
-|
-" (?:                      # opening quote...
-[^\\\x80-\xff\n\015"]                #   Anything except backslash and quote
-|                     #    or
-\\ [^\x80-\xff]           #   Escaped something (something != CR)
-)* "  # closing quote
-)  )* # further okay, if led by a period
-(?: [\040\t] |  \(
-(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
-\)  )*  @  (?: [\040\t] |  \(
-(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
-\)  )*    (?:
-[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+    # some number of atom characters...
-(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
-|   \[                         # [
-(?: [^\\\x80-\xff\n\015\[\]] |  \\ [^\x80-\xff]  )*    #    stuff
-\]                        #           ]
-)                           # initial subdomain
-(?:                                  #
-(?: [\040\t] |  \(
-(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
-\)  )*  \.                        # if led by a period...
-(?: [\040\t] |  \(
-(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
-\)  )*   (?:
-[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+    # some number of atom characters...
-(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
-|   \[                         # [
-(?: [^\\\x80-\xff\n\015\[\]] |  \\ [^\x80-\xff]  )*    #    stuff
-\]                        #           ]
-)                     #   ...further okay
-)*
-#       address spec
-(?: [\040\t] |  \(
-(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
-\)  )*  > #                  trailing >
-# name and address
-)  (?: [\040\t] |  \(
-(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
-\)  )*                       # optional trailing comment
-/x
-    Alan Other <user\@dom.ain>
- 0: Alan Other <user@dom.ain>
-    <user\@dom.ain>
- 0: user@dom.ain
- 1: user@dom
-    user\@dom.ain
- 0: user@dom.ain
- 1: user@dom
-    \"A. Other\" <user.1234\@dom.ain> (a comment)
- 0: "A. Other" <user.1234@dom.ain> (a comment)
- 1: "A. Other" <user.1234@dom.ain> 
- 2: "A. Other" <user.1234@dom.ain>
-    A. Other <user.1234\@dom.ain> (a comment)
- 0:  Other <user.1234@dom.ain> (a comment)
- 1:  Other <user.1234@dom.ain> 
- 2:  Other <user.1234@dom.ain>
-    \"/s=user/ou=host/o=place/prmd=uu.yy/admd= /c=gb/\"\@x400-re.lay
- 0: "/s=user/ou=host/o=place/prmd=uu.yy/admd= /c=gb/"@x400-re.lay
- 1: "/s=user/ou=host/o=place/prmd=uu.yy/admd= /c=gb/"@x400-re
-    A missing angle <user\@some.where
- 0: user@some.where
- 1: user@some
-    *** Failers
-No match
-    The quick brown fox
-No match
-
-/[\040\t]*                    # Nab whitespace.
-(?:
-\(                              #  (
-[^\\\x80-\xff\n\015()] *                             #     normal*
-(?:                                 #       (
-(?:  \\ [^\x80-\xff]  |
-\(                            #  (
-[^\\\x80-\xff\n\015()] *                            #     normal*
-(?:  \\ [^\x80-\xff]   [^\\\x80-\xff\n\015()] * )*        #     (special normal*)*
-\)                           #                       )
-)    #         special
-[^\\\x80-\xff\n\015()] *                         #         normal*
-)*                                  #            )*
-\)                             #                )
-[\040\t]* )*    # If comment found, allow more spaces.
-# optional leading comment
-(?:
-(?:
-[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+    # some number of atom characters...
-(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
-# Atom
-|                       #  or
-"                                     # "
-[^\\\x80-\xff\n\015"] *                            #   normal
-(?:  \\ [^\x80-\xff]  [^\\\x80-\xff\n\015"] * )*        #   ( special normal* )*
-"                                     #        "
-# Quoted string
-)
-[\040\t]*                    # Nab whitespace.
-(?:
-\(                              #  (
-[^\\\x80-\xff\n\015()] *                             #     normal*
-(?:                                 #       (
-(?:  \\ [^\x80-\xff]  |
-\(                            #  (
-[^\\\x80-\xff\n\015()] *                            #     normal*
-(?:  \\ [^\x80-\xff]   [^\\\x80-\xff\n\015()] * )*        #     (special normal*)*
-\)                           #                       )
-)    #         special
-[^\\\x80-\xff\n\015()] *                         #         normal*
-)*                                  #            )*
-\)                             #                )
-[\040\t]* )*    # If comment found, allow more spaces.
-(?:
-\.
-[\040\t]*                    # Nab whitespace.
-(?:
-\(                              #  (
-[^\\\x80-\xff\n\015()] *                             #     normal*
-(?:                                 #       (
-(?:  \\ [^\x80-\xff]  |
-\(                            #  (
-[^\\\x80-\xff\n\015()] *                            #     normal*
-(?:  \\ [^\x80-\xff]   [^\\\x80-\xff\n\015()] * )*        #     (special normal*)*
-\)                           #                       )
-)    #         special
-[^\\\x80-\xff\n\015()] *                         #         normal*
-)*                                  #            )*
-\)                             #                )
-[\040\t]* )*    # If comment found, allow more spaces.
-(?:
-[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+    # some number of atom characters...
-(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
-# Atom
-|                       #  or
-"                                     # "
-[^\\\x80-\xff\n\015"] *                            #   normal
-(?:  \\ [^\x80-\xff]  [^\\\x80-\xff\n\015"] * )*        #   ( special normal* )*
-"                                     #        "
-# Quoted string
-)
-[\040\t]*                    # Nab whitespace.
-(?:
-\(                              #  (
-[^\\\x80-\xff\n\015()] *                             #     normal*
-(?:                                 #       (
-(?:  \\ [^\x80-\xff]  |
-\(                            #  (
-[^\\\x80-\xff\n\015()] *                            #     normal*
-(?:  \\ [^\x80-\xff]   [^\\\x80-\xff\n\015()] * )*        #     (special normal*)*
-\)                           #                       )
-)    #         special
-[^\\\x80-\xff\n\015()] *                         #         normal*
-)*                                  #            )*
-\)                             #                )
-[\040\t]* )*    # If comment found, allow more spaces.
-# additional words
-)*
-@
-[\040\t]*                    # Nab whitespace.
-(?:
-\(                              #  (
-[^\\\x80-\xff\n\015()] *                             #     normal*
-(?:                                 #       (
-(?:  \\ [^\x80-\xff]  |
-\(                            #  (
-[^\\\x80-\xff\n\015()] *                            #     normal*
-(?:  \\ [^\x80-\xff]   [^\\\x80-\xff\n\015()] * )*        #     (special normal*)*
-\)                           #                       )
-)    #         special
-[^\\\x80-\xff\n\015()] *                         #         normal*
-)*                                  #            )*
-\)                             #                )
-[\040\t]* )*    # If comment found, allow more spaces.
-(?:
-[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+    # some number of atom characters...
-(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
-|
-\[                            # [
-(?: [^\\\x80-\xff\n\015\[\]] |  \\ [^\x80-\xff]  )*     #    stuff
-\]                           #           ]
-)
-[\040\t]*                    # Nab whitespace.
-(?:
-\(                              #  (
-[^\\\x80-\xff\n\015()] *                             #     normal*
-(?:                                 #       (
-(?:  \\ [^\x80-\xff]  |
-\(                            #  (
-[^\\\x80-\xff\n\015()] *                            #     normal*
-(?:  \\ [^\x80-\xff]   [^\\\x80-\xff\n\015()] * )*        #     (special normal*)*
-\)                           #                       )
-)    #         special
-[^\\\x80-\xff\n\015()] *                         #         normal*
-)*                                  #            )*
-\)                             #                )
-[\040\t]* )*    # If comment found, allow more spaces.
-# optional trailing comments
-(?:
-\.
-[\040\t]*                    # Nab whitespace.
-(?:
-\(                              #  (
-[^\\\x80-\xff\n\015()] *                             #     normal*
-(?:                                 #       (
-(?:  \\ [^\x80-\xff]  |
-\(                            #  (
-[^\\\x80-\xff\n\015()] *                            #     normal*
-(?:  \\ [^\x80-\xff]   [^\\\x80-\xff\n\015()] * )*        #     (special normal*)*
-\)                           #                       )
-)    #         special
-[^\\\x80-\xff\n\015()] *                         #         normal*
-)*                                  #            )*
-\)                             #                )
-[\040\t]* )*    # If comment found, allow more spaces.
-(?:
-[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+    # some number of atom characters...
-(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
-|
-\[                            # [
-(?: [^\\\x80-\xff\n\015\[\]] |  \\ [^\x80-\xff]  )*     #    stuff
-\]                           #           ]
-)
-[\040\t]*                    # Nab whitespace.
-(?:
-\(                              #  (
-[^\\\x80-\xff\n\015()] *                             #     normal*
-(?:                                 #       (
-(?:  \\ [^\x80-\xff]  |
-\(                            #  (
-[^\\\x80-\xff\n\015()] *                            #     normal*
-(?:  \\ [^\x80-\xff]   [^\\\x80-\xff\n\015()] * )*        #     (special normal*)*
-\)                           #                       )
-)    #         special
-[^\\\x80-\xff\n\015()] *                         #         normal*
-)*                                  #            )*
-\)                             #                )
-[\040\t]* )*    # If comment found, allow more spaces.
-# optional trailing comments
-)*
-# address
-|                             #  or
-(?:
-[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+    # some number of atom characters...
-(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
-# Atom
-|                       #  or
-"                                     # "
-[^\\\x80-\xff\n\015"] *                            #   normal
-(?:  \\ [^\x80-\xff]  [^\\\x80-\xff\n\015"] * )*        #   ( special normal* )*
-"                                     #        "
-# Quoted string
-)
-# leading word
-[^()<>@,;:".\\\[\]\x80-\xff\000-\010\012-\037] *               # "normal" atoms and or spaces
-(?:
-(?:
-\(                              #  (
-[^\\\x80-\xff\n\015()] *                             #     normal*
-(?:                                 #       (
-(?:  \\ [^\x80-\xff]  |
-\(                            #  (
-[^\\\x80-\xff\n\015()] *                            #     normal*
-(?:  \\ [^\x80-\xff]   [^\\\x80-\xff\n\015()] * )*        #     (special normal*)*
-\)                           #                       )
-)    #         special
-[^\\\x80-\xff\n\015()] *                         #         normal*
-)*                                  #            )*
-\)                             #                )
-|
-"                                     # "
-[^\\\x80-\xff\n\015"] *                            #   normal
-(?:  \\ [^\x80-\xff]  [^\\\x80-\xff\n\015"] * )*        #   ( special normal* )*
-"                                     #        "
-) # "special" comment or quoted string
-[^()<>@,;:".\\\[\]\x80-\xff\000-\010\012-\037] *            #  more "normal"
-)*
-<
-[\040\t]*                    # Nab whitespace.
-(?:
-\(                              #  (
-[^\\\x80-\xff\n\015()] *                             #     normal*
-(?:                                 #       (
-(?:  \\ [^\x80-\xff]  |
-\(                            #  (
-[^\\\x80-\xff\n\015()] *                            #     normal*
-(?:  \\ [^\x80-\xff]   [^\\\x80-\xff\n\015()] * )*        #     (special normal*)*
-\)                           #                       )
-)    #         special
-[^\\\x80-\xff\n\015()] *                         #         normal*
-)*                                  #            )*
-\)                             #                )
-[\040\t]* )*    # If comment found, allow more spaces.
-# <
-(?:
-@
-[\040\t]*                    # Nab whitespace.
-(?:
-\(                              #  (
-[^\\\x80-\xff\n\015()] *                             #     normal*
-(?:                                 #       (
-(?:  \\ [^\x80-\xff]  |
-\(                            #  (
-[^\\\x80-\xff\n\015()] *                            #     normal*
-(?:  \\ [^\x80-\xff]   [^\\\x80-\xff\n\015()] * )*        #     (special normal*)*
-\)                           #                       )
-)    #         special
-[^\\\x80-\xff\n\015()] *                         #         normal*
-)*                                  #            )*
-\)                             #                )
-[\040\t]* )*    # If comment found, allow more spaces.
-(?:
-[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+    # some number of atom characters...
-(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
-|
-\[                            # [
-(?: [^\\\x80-\xff\n\015\[\]] |  \\ [^\x80-\xff]  )*     #    stuff
-\]                           #           ]
-)
-[\040\t]*                    # Nab whitespace.
-(?:
-\(                              #  (
-[^\\\x80-\xff\n\015()] *                             #     normal*
-(?:                                 #       (
-(?:  \\ [^\x80-\xff]  |
-\(                            #  (
-[^\\\x80-\xff\n\015()] *                            #     normal*
-(?:  \\ [^\x80-\xff]   [^\\\x80-\xff\n\015()] * )*        #     (special normal*)*
-\)                           #                       )
-)    #         special
-[^\\\x80-\xff\n\015()] *                         #         normal*
-)*                                  #            )*
-\)                             #                )
-[\040\t]* )*    # If comment found, allow more spaces.
-# optional trailing comments
-(?:
-\.
-[\040\t]*                    # Nab whitespace.
-(?:
-\(                              #  (
-[^\\\x80-\xff\n\015()] *                             #     normal*
-(?:                                 #       (
-(?:  \\ [^\x80-\xff]  |
-\(                            #  (
-[^\\\x80-\xff\n\015()] *                            #     normal*
-(?:  \\ [^\x80-\xff]   [^\\\x80-\xff\n\015()] * )*        #     (special normal*)*
-\)                           #                       )
-)    #         special
-[^\\\x80-\xff\n\015()] *                         #         normal*
-)*                                  #            )*
-\)                             #                )
-[\040\t]* )*    # If comment found, allow more spaces.
-(?:
-[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+    # some number of atom characters...
-(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
-|
-\[                            # [
-(?: [^\\\x80-\xff\n\015\[\]] |  \\ [^\x80-\xff]  )*     #    stuff
-\]                           #           ]
-)
-[\040\t]*                    # Nab whitespace.
-(?:
-\(                              #  (
-[^\\\x80-\xff\n\015()] *                             #     normal*
-(?:                                 #       (
-(?:  \\ [^\x80-\xff]  |
-\(                            #  (
-[^\\\x80-\xff\n\015()] *                            #     normal*
-(?:  \\ [^\x80-\xff]   [^\\\x80-\xff\n\015()] * )*        #     (special normal*)*
-\)                           #                       )
-)    #         special
-[^\\\x80-\xff\n\015()] *                         #         normal*
-)*                                  #            )*
-\)                             #                )
-[\040\t]* )*    # If comment found, allow more spaces.
-# optional trailing comments
-)*
-(?: ,
-[\040\t]*                    # Nab whitespace.
-(?:
-\(                              #  (
-[^\\\x80-\xff\n\015()] *                             #     normal*
-(?:                                 #       (
-(?:  \\ [^\x80-\xff]  |
-\(                            #  (
-[^\\\x80-\xff\n\015()] *                            #     normal*
-(?:  \\ [^\x80-\xff]   [^\\\x80-\xff\n\015()] * )*        #     (special normal*)*
-\)                           #                       )
-)    #         special
-[^\\\x80-\xff\n\015()] *                         #         normal*
-)*                                  #            )*
-\)                             #                )
-[\040\t]* )*    # If comment found, allow more spaces.
-@
-[\040\t]*                    # Nab whitespace.
-(?:
-\(                              #  (
-[^\\\x80-\xff\n\015()] *                             #     normal*
-(?:                                 #       (
-(?:  \\ [^\x80-\xff]  |
-\(                            #  (
-[^\\\x80-\xff\n\015()] *                            #     normal*
-(?:  \\ [^\x80-\xff]   [^\\\x80-\xff\n\015()] * )*        #     (special normal*)*
-\)                           #                       )
-)    #         special
-[^\\\x80-\xff\n\015()] *                         #         normal*
-)*                                  #            )*
-\)                             #                )
-[\040\t]* )*    # If comment found, allow more spaces.
-(?:
-[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+    # some number of atom characters...
-(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
-|
-\[                            # [
-(?: [^\\\x80-\xff\n\015\[\]] |  \\ [^\x80-\xff]  )*     #    stuff
-\]                           #           ]
-)
-[\040\t]*                    # Nab whitespace.
-(?:
-\(                              #  (
-[^\\\x80-\xff\n\015()] *                             #     normal*
-(?:                                 #       (
-(?:  \\ [^\x80-\xff]  |
-\(                            #  (
-[^\\\x80-\xff\n\015()] *                            #     normal*
-(?:  \\ [^\x80-\xff]   [^\\\x80-\xff\n\015()] * )*        #     (special normal*)*
-\)                           #                       )
-)    #         special
-[^\\\x80-\xff\n\015()] *                         #         normal*
-)*                                  #            )*
-\)                             #                )
-[\040\t]* )*    # If comment found, allow more spaces.
-# optional trailing comments
-(?:
-\.
-[\040\t]*                    # Nab whitespace.
-(?:
-\(                              #  (
-[^\\\x80-\xff\n\015()] *                             #     normal*
-(?:                                 #       (
-(?:  \\ [^\x80-\xff]  |
-\(                            #  (
-[^\\\x80-\xff\n\015()] *                            #     normal*
-(?:  \\ [^\x80-\xff]   [^\\\x80-\xff\n\015()] * )*        #     (special normal*)*
-\)                           #                       )
-)    #         special
-[^\\\x80-\xff\n\015()] *                         #         normal*
-)*                                  #            )*
-\)                             #                )
-[\040\t]* )*    # If comment found, allow more spaces.
-(?:
-[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+    # some number of atom characters...
-(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
-|
-\[                            # [
-(?: [^\\\x80-\xff\n\015\[\]] |  \\ [^\x80-\xff]  )*     #    stuff
-\]                           #           ]
-)
-[\040\t]*                    # Nab whitespace.
-(?:
-\(                              #  (
-[^\\\x80-\xff\n\015()] *                             #     normal*
-(?:                                 #       (
-(?:  \\ [^\x80-\xff]  |
-\(                            #  (
-[^\\\x80-\xff\n\015()] *                            #     normal*
-(?:  \\ [^\x80-\xff]   [^\\\x80-\xff\n\015()] * )*        #     (special normal*)*
-\)                           #                       )
-)    #         special
-[^\\\x80-\xff\n\015()] *                         #         normal*
-)*                                  #            )*
-\)                             #                )
-[\040\t]* )*    # If comment found, allow more spaces.
-# optional trailing comments
-)*
-)*  # additional domains
-:
-[\040\t]*                    # Nab whitespace.
-(?:
-\(                              #  (
-[^\\\x80-\xff\n\015()] *                             #     normal*
-(?:                                 #       (
-(?:  \\ [^\x80-\xff]  |
-\(                            #  (
-[^\\\x80-\xff\n\015()] *                            #     normal*
-(?:  \\ [^\x80-\xff]   [^\\\x80-\xff\n\015()] * )*        #     (special normal*)*
-\)                           #                       )
-)    #         special
-[^\\\x80-\xff\n\015()] *                         #         normal*
-)*                                  #            )*
-\)                             #                )
-[\040\t]* )*    # If comment found, allow more spaces.
-# optional trailing comments
-)?     #       optional route
-(?:
-[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+    # some number of atom characters...
-(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
-# Atom
-|                       #  or
-"                                     # "
-[^\\\x80-\xff\n\015"] *                            #   normal
-(?:  \\ [^\x80-\xff]  [^\\\x80-\xff\n\015"] * )*        #   ( special normal* )*
-"                                     #        "
-# Quoted string
-)
-[\040\t]*                    # Nab whitespace.
-(?:
-\(                              #  (
-[^\\\x80-\xff\n\015()] *                             #     normal*
-(?:                                 #       (
-(?:  \\ [^\x80-\xff]  |
-\(                            #  (
-[^\\\x80-\xff\n\015()] *                            #     normal*
-(?:  \\ [^\x80-\xff]   [^\\\x80-\xff\n\015()] * )*        #     (special normal*)*
-\)                           #                       )
-)    #         special
-[^\\\x80-\xff\n\015()] *                         #         normal*
-)*                                  #            )*
-\)                             #                )
-[\040\t]* )*    # If comment found, allow more spaces.
-(?:
-\.
-[\040\t]*                    # Nab whitespace.
-(?:
-\(                              #  (
-[^\\\x80-\xff\n\015()] *                             #     normal*
-(?:                                 #       (
-(?:  \\ [^\x80-\xff]  |
-\(                            #  (
-[^\\\x80-\xff\n\015()] *                            #     normal*
-(?:  \\ [^\x80-\xff]   [^\\\x80-\xff\n\015()] * )*        #     (special normal*)*
-\)                           #                       )
-)    #         special
-[^\\\x80-\xff\n\015()] *                         #         normal*
-)*                                  #            )*
-\)                             #                )
-[\040\t]* )*    # If comment found, allow more spaces.
-(?:
-[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+    # some number of atom characters...
-(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
-# Atom
-|                       #  or
-"                                     # "
-[^\\\x80-\xff\n\015"] *                            #   normal
-(?:  \\ [^\x80-\xff]  [^\\\x80-\xff\n\015"] * )*        #   ( special normal* )*
-"                                     #        "
-# Quoted string
-)
-[\040\t]*                    # Nab whitespace.
-(?:
-\(                              #  (
-[^\\\x80-\xff\n\015()] *                             #     normal*
-(?:                                 #       (
-(?:  \\ [^\x80-\xff]  |
-\(                            #  (
-[^\\\x80-\xff\n\015()] *                            #     normal*
-(?:  \\ [^\x80-\xff]   [^\\\x80-\xff\n\015()] * )*        #     (special normal*)*
-\)                           #                       )
-)    #         special
-[^\\\x80-\xff\n\015()] *                         #         normal*
-)*                                  #            )*
-\)                             #                )
-[\040\t]* )*    # If comment found, allow more spaces.
-# additional words
-)*
-@
-[\040\t]*                    # Nab whitespace.
-(?:
-\(                              #  (
-[^\\\x80-\xff\n\015()] *                             #     normal*
-(?:                                 #       (
-(?:  \\ [^\x80-\xff]  |
-\(                            #  (
-[^\\\x80-\xff\n\015()] *                            #     normal*
-(?:  \\ [^\x80-\xff]   [^\\\x80-\xff\n\015()] * )*        #     (special normal*)*
-\)                           #                       )
-)    #         special
-[^\\\x80-\xff\n\015()] *                         #         normal*
-)*                                  #            )*
-\)                             #                )
-[\040\t]* )*    # If comment found, allow more spaces.
-(?:
-[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+    # some number of atom characters...
-(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
-|
-\[                            # [
-(?: [^\\\x80-\xff\n\015\[\]] |  \\ [^\x80-\xff]  )*     #    stuff
-\]                           #           ]
-)
-[\040\t]*                    # Nab whitespace.
-(?:
-\(                              #  (
-[^\\\x80-\xff\n\015()] *                             #     normal*
-(?:                                 #       (
-(?:  \\ [^\x80-\xff]  |
-\(                            #  (
-[^\\\x80-\xff\n\015()] *                            #     normal*
-(?:  \\ [^\x80-\xff]   [^\\\x80-\xff\n\015()] * )*        #     (special normal*)*
-\)                           #                       )
-)    #         special
-[^\\\x80-\xff\n\015()] *                         #         normal*
-)*                                  #            )*
-\)                             #                )
-[\040\t]* )*    # If comment found, allow more spaces.
-# optional trailing comments
-(?:
-\.
-[\040\t]*                    # Nab whitespace.
-(?:
-\(                              #  (
-[^\\\x80-\xff\n\015()] *                             #     normal*
-(?:                                 #       (
-(?:  \\ [^\x80-\xff]  |
-\(                            #  (
-[^\\\x80-\xff\n\015()] *                            #     normal*
-(?:  \\ [^\x80-\xff]   [^\\\x80-\xff\n\015()] * )*        #     (special normal*)*
-\)                           #                       )
-)    #         special
-[^\\\x80-\xff\n\015()] *                         #         normal*
-)*                                  #            )*
-\)                             #                )
-[\040\t]* )*    # If comment found, allow more spaces.
-(?:
-[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+    # some number of atom characters...
-(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
-|
-\[                            # [
-(?: [^\\\x80-\xff\n\015\[\]] |  \\ [^\x80-\xff]  )*     #    stuff
-\]                           #           ]
-)
-[\040\t]*                    # Nab whitespace.
-(?:
-\(                              #  (
-[^\\\x80-\xff\n\015()] *                             #     normal*
-(?:                                 #       (
-(?:  \\ [^\x80-\xff]  |
-\(                            #  (
-[^\\\x80-\xff\n\015()] *                            #     normal*
-(?:  \\ [^\x80-\xff]   [^\\\x80-\xff\n\015()] * )*        #     (special normal*)*
-\)                           #                       )
-)    #         special
-[^\\\x80-\xff\n\015()] *                         #         normal*
-)*                                  #            )*
-\)                             #                )
-[\040\t]* )*    # If comment found, allow more spaces.
-# optional trailing comments
-)*
-#       address spec
->                    #                 >
-# name and address
-)
-/x
-    Alan Other <user\@dom.ain>
- 0: Alan Other <user@dom.ain>
-    <user\@dom.ain>
- 0: user@dom.ain
- 1: user@dom
-    user\@dom.ain
- 0: user@dom.ain
- 1: user@dom
-    \"A. Other\" <user.1234\@dom.ain> (a comment)
- 0: "A. Other" <user.1234@dom.ain>
-    A. Other <user.1234\@dom.ain> (a comment)
- 0:  Other <user.1234@dom.ain>
-    \"/s=user/ou=host/o=place/prmd=uu.yy/admd= /c=gb/\"\@x400-re.lay
- 0: "/s=user/ou=host/o=place/prmd=uu.yy/admd= /c=gb/"@x400-re.lay
- 1: "/s=user/ou=host/o=place/prmd=uu.yy/admd= /c=gb/"@x400-re
-    A missing angle <user\@some.where
- 0: user@some.where
- 1: user@some
-    *** Failers
-No match
-    The quick brown fox
-No match
-
-/abc\0def\00pqr\000xyz\0000AB/
-    abc\0def\00pqr\000xyz\0000AB
- 0: abc\x00def\x00pqr\x00xyz\x000AB
-    abc456 abc\0def\00pqr\000xyz\0000ABCDE
- 0: abc\x00def\x00pqr\x00xyz\x000AB
-
-/abc\x0def\x00pqr\x000xyz\x0000AB/
-    abc\x0def\x00pqr\x000xyz\x0000AB
- 0: abc\x0def\x00pqr\x000xyz\x0000AB
-    abc456 abc\x0def\x00pqr\x000xyz\x0000ABCDE
- 0: abc\x0def\x00pqr\x000xyz\x0000AB
-
-/^[\000-\037]/
-    \0A
- 0: \x00
-    \01B
- 0: \x01
-    \037C
- 0: \x1f
-
-/\0*/
-    \0\0\0\0
- 0: \x00\x00\x00\x00
- 1: \x00\x00\x00
- 2: \x00\x00
- 3: \x00
- 4: 
-
-/A\x0{2,3}Z/
-    The A\x0\x0Z
- 0: A\x00\x00Z
-    An A\0\x0\0Z
- 0: A\x00\x00\x00Z
-    *** Failers
-No match
-    A\0Z
-No match
-    A\0\x0\0\x0Z
-No match
-
-/^\s/
-    \040abc
- 0:  
-    \x0cabc
- 0: \x0c
-    \nabc
- 0: \x0a
-    \rabc
- 0: \x0d
-    \tabc
- 0: \x09
-    *** Failers
-No match
-    abc
-No match
-
-/^a    b
-    ?  c/x
-    abc
- 0: abc
-
-/ab{1,3}bc/
-    abbbbc
- 0: abbbbc
-    abbbc
- 0: abbbc
-    abbc
- 0: abbc
-    *** Failers
-No match
-    abc
-No match
-    abbbbbc
-No match
-
-/([^.]*)\.([^:]*):[T ]+(.*)/
-    track1.title:TBlah blah blah
- 0: track1.title:TBlah blah blah
- 1: track1.title:TBlah blah bla
- 2: track1.title:TBlah blah bl
- 3: track1.title:TBlah blah b
- 4: track1.title:TBlah blah 
- 5: track1.title:TBlah blah
- 6: track1.title:TBlah bla
- 7: track1.title:TBlah bl
- 8: track1.title:TBlah b
- 9: track1.title:TBlah 
-10: track1.title:TBlah
-11: track1.title:TBla
-12: track1.title:TBl
-13: track1.title:TB
-14: track1.title:T
-
-/([^.]*)\.([^:]*):[T ]+(.*)/i
-    track1.title:TBlah blah blah
- 0: track1.title:TBlah blah blah
- 1: track1.title:TBlah blah bla
- 2: track1.title:TBlah blah bl
- 3: track1.title:TBlah blah b
- 4: track1.title:TBlah blah 
- 5: track1.title:TBlah blah
- 6: track1.title:TBlah bla
- 7: track1.title:TBlah bl
- 8: track1.title:TBlah b
- 9: track1.title:TBlah 
-10: track1.title:TBlah
-11: track1.title:TBla
-12: track1.title:TBl
-13: track1.title:TB
-14: track1.title:T
-
-/([^.]*)\.([^:]*):[t ]+(.*)/i
-    track1.title:TBlah blah blah
- 0: track1.title:TBlah blah blah
- 1: track1.title:TBlah blah bla
- 2: track1.title:TBlah blah bl
- 3: track1.title:TBlah blah b
- 4: track1.title:TBlah blah 
- 5: track1.title:TBlah blah
- 6: track1.title:TBlah bla
- 7: track1.title:TBlah bl
- 8: track1.title:TBlah b
- 9: track1.title:TBlah 
-10: track1.title:TBlah
-11: track1.title:TBla
-12: track1.title:TBl
-13: track1.title:TB
-14: track1.title:T
-
-/^[W-c]+$/
-    WXY_^abc
- 0: WXY_^abc
-    *** Failers
-No match
-    wxy
-No match
-
-/^[W-c]+$/i
-    WXY_^abc
- 0: WXY_^abc
-    wxy_^ABC
- 0: wxy_^ABC
-
-/^[\x3f-\x5F]+$/i
-    WXY_^abc
- 0: WXY_^abc
-    wxy_^ABC
- 0: wxy_^ABC
-
-/^abc$/m
-    abc
- 0: abc
-    qqq\nabc
- 0: abc
-    abc\nzzz
- 0: abc
-    qqq\nabc\nzzz
- 0: abc
-
-/^abc$/
-    abc
- 0: abc
-    *** Failers
-No match
-    qqq\nabc
-No match
-    abc\nzzz
-No match
-    qqq\nabc\nzzz
-No match
-
-/\Aabc\Z/m
-    abc
- 0: abc
-    abc\n 
- 0: abc
-    *** Failers
-No match
-    qqq\nabc
-No match
-    abc\nzzz
-No match
-    qqq\nabc\nzzz
-No match
+/-- This should be Perl-compatible but Perl 5.11 gets \x{300} wrong. --/8

-/\A(.)*\Z/s
-    abc\ndef
- 0: abc\x0adef
-
-/\A(.)*\Z/m
-    *** Failers
- 0: *** Failers
-    abc\ndef
-No match
-
-/(?:b)|(?::+)/
-    b::c
- 0: b
-    c::b
- 0: ::
- 1: :
-
-/[-az]+/
-    az-
- 0: az-
- 1: az
- 2: a
-    *** Failers
- 0: a
-    b
-No match
-
-/[az-]+/
-    za-
- 0: za-
- 1: za
- 2: z
-    *** Failers
- 0: a
-    b
-No match
-
-/[a\-z]+/
-    a-z
- 0: a-z
- 1: a-
- 2: a
-    *** Failers
- 0: a
-    b
-No match
-
-/[a-z]+/
-    abcdxyz
- 0: abcdxyz
- 1: abcdxy
- 2: abcdx
- 3: abcd
- 4: abc
- 5: ab
- 6: a
-
-/[\d-]+/
-    12-34
- 0: 12-34
- 1: 12-3
- 2: 12-
- 3: 12
- 4: 1
-    *** Failers
-No match
-    aaa
-No match
-
-/[\d-z]+/
-    12-34z
- 0: 12-34z
- 1: 12-34
- 2: 12-3
- 3: 12-
- 4: 12
- 5: 1
-    *** Failers
-No match
-    aaa
-No match
-
-/\x5c/
-    \\
- 0: \
-
-/\x20Z/
-    the Zoo
- 0:  Z
-    *** Failers
-No match
-    Zulu
-No match
-
-/ab{3cd/
-    ab{3cd
- 0: ab{3cd
-
-/ab{3,cd/
-    ab{3,cd
- 0: ab{3,cd
-
-/ab{3,4a}cd/
-    ab{3,4a}cd
- 0: ab{3,4a}cd
-
-/{4,5a}bc/
-    {4,5a}bc
- 0: {4,5a}bc
-
-/^a.b/<lf>
-    a\rb
- 0: a\x0db
-    *** Failers
-No match
-    a\nb
-No match
-
-/abc$/
-    abc
- 0: abc
-    abc\n
- 0: abc
-    *** Failers
-No match
-    abc\ndef
-No match
-
-/(abc)\123/
-    abc\x53
- 0: abcS
-
-/(abc)\223/
-    abc\x93
- 0: abc\x93
-
-/(abc)\323/
-    abc\xd3
- 0: abc\xd3
-
-/(abc)\100/
-    abc\x40
- 0: abc@
-    abc\100
- 0: abc@
-
-/(abc)\1000/
-    abc\x400
- 0: abc@0
-    abc\x40\x30
- 0: abc@0
-    abc\1000
- 0: abc@0
-    abc\100\x30
- 0: abc@0
-    abc\100\060
- 0: abc@0
-    abc\100\60
- 0: abc@0
-
-/abc\81/
-    abc\081
- 0: abc\x0081
-    abc\0\x38\x31
- 0: abc\x0081
-
-/abc\91/
-    abc\091
- 0: abc\x0091
-    abc\0\x39\x31
- 0: abc\x0091
-
-/(a)(b)(c)(d)(e)(f)(g)(h)(i)(j)(k)\12\123/
-    abcdefghijk\12S
- 0: abcdefghijk\x0aS
-
-/ab\idef/
-    abidef
- 0: abidef
-
-/a{0}bc/
-    bc
- 0: bc
-
-/(a|(bc)){0,0}?xyz/
-    xyz
- 0: xyz
-
-/abc[\10]de/
-    abc\010de
- 0: abc\x08de
-
-/abc[\1]de/
-    abc\1de
- 0: abc\x01de
-
-/(abc)[\1]de/
-    abc\1de
- 0: abc\x01de
-
-/(?s)a.b/
-    a\nb
- 0: a\x0ab
-
-/^([^a])([^\b])([^c]*)([^d]{3,4})/
-    baNOTccccd
- 0: baNOTcccc
- 1: baNOTccc
- 2: baNOTcc
- 3: baNOTc
- 4: baNOT
-    baNOTcccd
- 0: baNOTccc
- 1: baNOTcc
- 2: baNOTc
- 3: baNOT
-    baNOTccd
- 0: baNOTcc
- 1: baNOTc
- 2: baNOT
-    bacccd
- 0: baccc
-    *** Failers
- 0: *** Failers
- 1: *** Failer
- 2: *** Faile
- 3: *** Fail
- 4: *** Fai
- 5: *** Fa
- 6: *** F
-    anything
-No match
-    b\bc   
-No match
-    baccd
-No match
-
-/[^a]/
-    Abc
+/^\X/8
+    A
  0: A
-  
-/[^a]/i
-    Abc 
- 0: b
-
-/[^a]+/
-    AAAaAbc
- 0: AAA
- 1: AA
- 2: A
-  
-/[^a]+/i
-    AAAaAbc 
- 0: bc
- 1: b
-
-/[^a]+/
-    bbb\nccc
- 0: bbb\x0accc
- 1: bbb\x0acc
- 2: bbb\x0ac
- 3: bbb\x0a
- 4: bbb
- 5: bb
- 6: b
-   
-/[^k]$/
-    abc
- 0: c
+    A\x{300}BC 
+ 0: A\x{300}
+    A\x{300}\x{301}\x{302}BC 
+ 0: A\x{300}\x{301}\x{302}
     *** Failers
- 0: s
-    abk   
+ 0: *
+    \x{300}  
 No match
-   
-/[^k]{2,3}$/
-    abc
- 0: abc
-    kbc
- 0: bc
-    kabc 
- 0: abc
-    *** Failers
- 0: ers
-    abk
-No match
-    akb
-No match
-    akk 
-No match
+    
+/-- These are PCRE's extra properties to help with Unicodizing \d etc. --/

-/^\d{8,}\@.+[^k]$/
-    12345678\@a.b.c.d
- 0: 12345678@a.b.c.d
-    123456789\@x.y.z
- 0: 123456789@x.y.z
-    *** Failers
-No match
-    12345678\@x.y.uk
-No match
-    1234567\@a.b.c.d       
-No match
-
-/[^a]/
-    aaaabcd
- 0: b
-    aaAabcd 
+/^\p{Xan}/8
+    ABCD
  0: A
-
-/[^a]/i
-    aaaabcd
- 0: b
-    aaAabcd 
- 0: b
-
-/[^az]/
-    aaaabcd
- 0: b
-    aaAabcd 
- 0: A
-
-/[^az]/i
-    aaaabcd
- 0: b
-    aaAabcd 
- 0: b
-
-/\000\001\002\003\004\005\006\007\010\011\012\013\014\015\016\017\020\021\022\023\024\025\026\027\030\031\032\033\034\035\036\037\040\041\042\043\044\045\046\047\050\051\052\053\054\055\056\057\060\061\062\063\064\065\066\067\070\071\072\073\074\075\076\077\100\101\102\103\104\105\106\107\110\111\112\113\114\115\116\117\120\121\122\123\124\125\126\127\130\131\132\133\134\135\136\137\140\141\142\143\144\145\146\147\150\151\152\153\154\155\156\157\160\161\162\163\164\165\166\167\170\171\172\173\174\175\176\177\200\201\202\203\204\205\206\207\210\211\212\213\214\215\216\217\220\221\222\223\224\225\226\227\230\231\232\233\234\235\236\237\240\241\242\243\244\245\246\247\250\251\252\253\254\255\256\257\260\261\262\263\264\265\266\267\270\271\272\273\274\275\276\277\300\301\302\303\304\305\306\307\310\311\312\313\314\315\316\317\320\321\322\323\324\325\326\327\330\331\332\333\334\335\336\337\340\341\342\343\344\345\346\347\350\351\352\353\354\355\356\357\360\361\362\363\364\365\366\367\370\371\372\373\374\375\376\377/
- \000\001\002\003\004\005\006\007\010\011\012\013\014\015\016\017\020\021\022\023\024\025\026\027\030\031\032\033\034\035\036\037\040\041\042\043\044\045\046\047\050\051\052\053\054\055\056\057\060\061\062\063\064\065\066\067\070\071\072\073\074\075\076\077\100\101\102\103\104\105\106\107\110\111\112\113\114\115\116\117\120\121\122\123\124\125\126\127\130\131\132\133\134\135\136\137\140\141\142\143\144\145\146\147\150\151\152\153\154\155\156\157\160\161\162\163\164\165\166\167\170\171\172\173\174\175\176\177\200\201\202\203\204\205\206\207\210\211\212\213\214\215\216\217\220\221\222\223\224\225\226\227\230\231\232\233\234\235\236\237\240\241\242\243\244\245\246\247\250\251\252\253\254\255\256\257\260\261\262\263\264\265\266\267\270\271\272\273\274\275\276\277\300\301\302\303\304\305\306\307\310\311\312\313\314\315\316\317\320\321\322\323\324\325\326\327\330\331\332\333\334\335\336\337\340\341\342\343\344\345\346\347\350\351\352\353\354\355\356\357\360\361\362\363\364\365\366\367\370\371\372\373\374\375\376\377
- 0: \x00\x01\x02\x03\x04\x05\x06\x07\x08\x09\x0a\x0b\x0c\x0d\x0e\x0f\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1a\x1b\x1c\x1d\x1e\x1f !"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyz{|}~\x7f\x80\x81\x82\x83\x84\x85\x86\x87\x88\x89\x8a\x8b\x8c\x8d\x8e\x8f\x90\x91\x92\x93\x94\x95\x96\x97\x98\x99\x9a\x9b\x9c\x9d\x9e\x9f\xa0\xa1\xa2\xa3\xa4\xa5\xa6\xa7\xa8\xa9\xaa\xab\xac\xad\xae\xaf\xb0\xb1\xb2\xb3\xb4\xb5\xb6\xb7\xb8\xb9\xba\xbb\xbc\xbd\xbe\xbf\xc0\xc1\xc2\xc3\xc4\xc5\xc6\xc7\xc8\xc9\xca\xcb\xcc\xcd\xce\xcf\xd0\xd1\xd2\xd3\xd4\xd5\xd6\xd7\xd8\xd9\xda\xdb\xdc\xdd\xde\xdf\xe0\xe1\xe2\xe3\xe4\xe5\xe6\xe7\xe8\xe9\xea\xeb\xec\xed\xee\xef\xf0\xf1\xf2\xf3\xf4\xf5\xf6\xf7\xf8\xf9\xfa\xfb\xfc\xfd\xfe\xff
-
-/P[^*]TAIRE[^*]{1,6}?LL/
-    xxxxxxxxxxxPSTAIREISLLxxxxxxxxx
- 0: PSTAIREISLL
-
-/P[^*]TAIRE[^*]{1,}?LL/
-    xxxxxxxxxxxPSTAIREISLLxxxxxxxxx
- 0: PSTAIREISLL
-
-/(\.\d\d[1-9]?)\d+/
-    1.230003938
- 0: .230003938
- 1: .23000393
- 2: .2300039
- 3: .230003
- 4: .23000
- 5: .2300
- 6: .230
-    1.875000282   
- 0: .875000282
- 1: .87500028
- 2: .8750002
- 3: .875000
- 4: .87500
- 5: .8750
- 6: .875
-    1.235  
- 0: .235
-                  
-/(\.\d\d((?=0)|\d(?=\d)))/
-    1.230003938      
- 0: .230
- 1: .23
-    1.875000282
- 0: .875
-    *** Failers 
+    1234
+ 0: 1
+    \x{6ca}
+ 0: \x{6ca}
+    \x{a6c}
+ 0: \x{a6c}
+    \x{10a7}   
+ 0: \x{10a7}
+    ** Failers
 No match
-    1.235 
+    _ABC   
 No match
-    
-/a(?)b/
-    ab 
- 0: ab
- 
-/\b(foo)\s+(\w+)/i
-    Food is on the foo table
- 0: foo table
- 1: foo tabl
- 2: foo tab
- 3: foo ta
- 4: foo t
-    
-/foo(.*)bar/
-    The food is under the bar in the barn.
- 0: food is under the bar in the bar
- 1: food is under the bar
-    
-/foo(.*?)bar/  
-    The food is under the bar in the barn.
- 0: food is under the bar in the bar
- 1: food is under the bar

-/(.*)(\d*)/
-    I have 2 numbers: 53147
-Matched, but too many subsidiary matches
- 0: I have 2 numbers: 53147
- 1: I have 2 numbers: 5314
- 2: I have 2 numbers: 531
- 3: I have 2 numbers: 53
- 4: I have 2 numbers: 5
- 5: I have 2 numbers: 
- 6: I have 2 numbers:
- 7: I have 2 numbers
- 8: I have 2 number
- 9: I have 2 numbe
-10: I have 2 numb
-11: I have 2 num
-12: I have 2 nu
-13: I have 2 n
-14: I have 2 
-15: I have 2
-16: I have 
-17: I have
-18: I hav
-19: I ha
-20: I h
-21: I 
-    
-/(.*)(\d+)/
-    I have 2 numbers: 53147
- 0: I have 2 numbers: 53147
- 1: I have 2 numbers: 5314
- 2: I have 2 numbers: 531
- 3: I have 2 numbers: 53
- 4: I have 2 numbers: 5
- 5: I have 2
- 
-/(.*?)(\d*)/
-    I have 2 numbers: 53147
-Matched, but too many subsidiary matches
- 0: I have 2 numbers: 53147
- 1: I have 2 numbers: 5314
- 2: I have 2 numbers: 531
- 3: I have 2 numbers: 53
- 4: I have 2 numbers: 5
- 5: I have 2 numbers: 
- 6: I have 2 numbers:
- 7: I have 2 numbers
- 8: I have 2 number
- 9: I have 2 numbe
-10: I have 2 numb
-11: I have 2 num
-12: I have 2 nu
-13: I have 2 n
-14: I have 2 
-15: I have 2
-16: I have 
-17: I have
-18: I hav
-19: I ha
-20: I h
-21: I 
-
-/(.*?)(\d+)/
-    I have 2 numbers: 53147
- 0: I have 2 numbers: 53147
- 1: I have 2 numbers: 5314
- 2: I have 2 numbers: 531
- 3: I have 2 numbers: 53
- 4: I have 2 numbers: 5
- 5: I have 2
-
-/(.*)(\d+)$/
-    I have 2 numbers: 53147
- 0: I have 2 numbers: 53147
-
-/(.*?)(\d+)$/
-    I have 2 numbers: 53147
- 0: I have 2 numbers: 53147
-
-/(.*)\b(\d+)$/
-    I have 2 numbers: 53147
- 0: I have 2 numbers: 53147
-
-/(.*\D)(\d+)$/
-    I have 2 numbers: 53147
- 0: I have 2 numbers: 53147
-
-/^\D*(?!123)/
-    ABC123
- 0: AB
- 1: A
- 2: 
-     
-/^(\D*)(?=\d)(?!123)/
-    ABC445
- 0: ABC
-    *** Failers
+/^\p{Xan}+/8
+    ABCD1234\x{6ca}\x{a6c}\x{10a7}_
+ 0: ABCD1234\x{6ca}\x{a6c}\x{10a7}
+    ** Failers
 No match
-    ABC123
+    _ABC   
 No match
-    
-/^[W-]46]/
-    W46]789 
- 0: W46]
-    -46]789
- 0: -46]
-    *** Failers
-No match
-    Wall
-No match
-    Zebra
-No match
-    42
-No match
-    [abcd] 
-No match
-    ]abcd[
-No match
-       
-/^[W-\]46]/
-    W46]789 
- 0: W
-    Wall
- 0: W
-    Zebra
- 0: Z
-    Xylophone  
- 0: X
-    42
- 0: 4
-    [abcd] 
- 0: [
-    ]abcd[
- 0: ]
-    \\backslash 
- 0: \
-    *** Failers
-No match
-    -46]789
-No match
-    well
-No match
-    
-/\d\d\/\d\d\/\d\d\d\d/
-    01/01/2000
- 0: 01/01/2000

-/word (?:[a-zA-Z0-9]+ ){0,10}otherword/
-  word cat dog elephant mussel cow horse canary baboon snake shark otherword
- 0: word cat dog elephant mussel cow horse canary baboon snake shark otherword
-  word cat dog elephant mussel cow horse canary baboon snake shark
-No match
+/^\p{Xan}+?/8
+    \x{6ca}\x{a6c}\x{10a7}_
+ 0: \x{6ca}

-/word (?:[a-zA-Z0-9]+ ){0,300}otherword/
-  word cat dog elephant mussel cow horse canary baboon snake shark the quick brown fox and the lazy dog and several other words getting close to thirty by now I hope
-No match
-
-/^(a){0,0}/
-    bcd
- 0: 
-    abc
- 0: 
-    aab     
- 0: 
-
-/^(a){0,1}/
-    bcd
- 0: 
-    abc
- 0: a
- 1: 
-    aab  
- 0: a
- 1: 
-
-/^(a){0,2}/
-    bcd
- 0: 
-    abc
- 0: a
- 1: 
-    aab  
- 0: aa
- 1: a
- 2: 
-
-/^(a){0,3}/
-    bcd
- 0: 
-    abc
- 0: a
- 1: 
-    aab
- 0: aa
- 1: a
- 2: 
-    aaa   
- 0: aaa
- 1: aa
- 2: a
- 3: 
-
-/^(a){0,}/
-    bcd
- 0: 
-    abc
- 0: a
- 1: 
-    aab
- 0: aa
- 1: a
- 2: 
-    aaa
- 0: aaa
- 1: aa
- 2: a
- 3: 
-    aaaaaaaa    
- 0: aaaaaaaa
- 1: aaaaaaa
- 2: aaaaaa
- 3: aaaaa
- 4: aaaa
- 5: aaa
- 6: aa
- 7: a
- 8: 
-
-/^(a){1,1}/
-    bcd
-No match
-    abc
- 0: a
-    aab  
- 0: a
-
-/^(a){1,2}/
-    bcd
-No match
-    abc
- 0: a
-    aab  
- 0: aa
- 1: a
-
-/^(a){1,3}/
-    bcd
-No match
-    abc
- 0: a
-    aab
- 0: aa
- 1: a
-    aaa   
- 0: aaa
- 1: aa
- 2: a
-
-/^(a){1,}/
-    bcd
-No match
-    abc
- 0: a
-    aab
- 0: aa
- 1: a
-    aaa
- 0: aaa
- 1: aa
- 2: a
-    aaaaaaaa    
- 0: aaaaaaaa
- 1: aaaaaaa
- 2: aaaaaa
- 3: aaaaa
- 4: aaaa
- 5: aaa
- 6: aa
- 7: a
-
-/.*\.gif/
-    borfle\nbib.gif\nno
- 0: bib.gif
-
-/.{0,}\.gif/
-    borfle\nbib.gif\nno
- 0: bib.gif
-
-/.*\.gif/m
-    borfle\nbib.gif\nno
- 0: bib.gif
-
-/.*\.gif/s
-    borfle\nbib.gif\nno
- 0: borfle\x0abib.gif
-
-/.*\.gif/ms
-    borfle\nbib.gif\nno
- 0: borfle\x0abib.gif
+/^\p{Xan}*/8
+    ABCD1234\x{6ca}\x{a6c}\x{10a7}_
+ 0: ABCD1234\x{6ca}\x{a6c}\x{10a7}

-/.*$/
-    borfle\nbib.gif\nno
- 0: no
-
-/.*$/m
-    borfle\nbib.gif\nno
- 0: borfle
-
-/.*$/s
-    borfle\nbib.gif\nno
- 0: borfle\x0abib.gif\x0ano
-
-/.*$/ms
-    borfle\nbib.gif\nno
- 0: borfle\x0abib.gif\x0ano
- 1: borfle\x0abib.gif
- 2: borfle
+/^\p{Xan}{2,9}/8
+    ABCD1234\x{6ca}\x{a6c}\x{10a7}_
+ 0: ABCD1234\x{6ca}

-/.*$/
-    borfle\nbib.gif\nno\n
- 0: no
-
-/.*$/m
-    borfle\nbib.gif\nno\n
- 0: borfle
-
-/.*$/s
-    borfle\nbib.gif\nno\n
- 0: borfle\x0abib.gif\x0ano\x0a
- 1: borfle\x0abib.gif\x0ano
-
-/.*$/ms
-    borfle\nbib.gif\nno\n
- 0: borfle\x0abib.gif\x0ano\x0a
- 1: borfle\x0abib.gif\x0ano
- 2: borfle\x0abib.gif
- 3: borfle
+/^\p{Xan}{2,9}?/8
+    \x{6ca}\x{a6c}\x{10a7}_
+ 0: \x{6ca}\x{a6c}

-/(.*X|^B)/
-    abcde\n1234Xyz
- 0: 1234X
-    BarFoo 
- 0: B
-    *** Failers
+/^[\p{Xan}]/8
+    ABCD1234_
+ 0: A
+    1234abcd_
+ 0: 1
+    \x{6ca}
+ 0: \x{6ca}
+    \x{a6c}
+ 0: \x{a6c}
+    \x{10a7}   
+ 0: \x{10a7}
+    ** Failers
 No match
-    abcde\nBar  
+    _ABC   
 No match
-
-/(.*X|^B)/m
-    abcde\n1234Xyz
- 0: 1234X
-    BarFoo 
- 0: B
-    abcde\nBar  
- 0: B
-
-/(.*X|^B)/s
-    abcde\n1234Xyz
- 0: abcde\x0a1234X
-    BarFoo 
- 0: B
-    *** Failers
+ 
+/^[\p{Xan}]+/8
+    ABCD1234\x{6ca}\x{a6c}\x{10a7}_
+ 0: ABCD1234\x{6ca}\x{a6c}\x{10a7}
+    ** Failers
 No match
-    abcde\nBar  
+    _ABC   
 No match

-/(.*X|^B)/ms
-    abcde\n1234Xyz
- 0: abcde\x0a1234X
-    BarFoo 
- 0: B
-    abcde\nBar  
- 0: B
-
-/(?s)(.*X|^B)/
-    abcde\n1234Xyz
- 0: abcde\x0a1234X
-    BarFoo 
- 0: B
-    *** Failers 
+/^>\p{Xsp}/8
+    >\x{1680}\x{2028}\x{0b}
+ 0: >\x{1680}
+    >\x{a0} 
+ 0: >\x{a0}
+    ** Failers
 No match
-    abcde\nBar  
+    \x{0b} 
 No match

-/(?s:.*X|^B)/
-    abcde\n1234Xyz
- 0: abcde\x0a1234X
-    BarFoo 
- 0: B
-    *** Failers 
-No match
-    abcde\nBar  
-No match
+/^>\p{Xsp}+/8
+    > \x{09}\x{0a}\x{0c}\x{0d}\x{a0}\x{1680}\x{2028}\x{0b}
+ 0: > \x{09}\x{0a}\x{0c}\x{0d}\x{a0}\x{1680}\x{2028}

-/^.*B/
-    **** Failers
-No match
-    abc\nB
-No match
-     
-/(?s)^.*B/
-    abc\nB
- 0: abc\x0aB
+/^>\p{Xsp}+?/8
+    >\x{1680}\x{2028}\x{0b}
+ 0: >\x{1680}

-/(?m)^.*B/
-    abc\nB
- 0: B
-     
-/(?ms)^.*B/
-    abc\nB
- 0: abc\x0aB
-
-/(?ms)^B/
-    abc\nB
- 0: B
-
-/(?s)B$/
-    B\n
- 0: B
-
-/^[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]/
-    123456654321
- 0: 123456654321
-  
-/^\d\d\d\d\d\d\d\d\d\d\d\d/
-    123456654321 
- 0: 123456654321
-
-/^[\d][\d][\d][\d][\d][\d][\d][\d][\d][\d][\d][\d]/
-    123456654321
- 0: 123456654321
-  
-/^[abc]{12}/
-    abcabcabcabc
- 0: abcabcabcabc
+/^>\p{Xsp}*/8
+    > \x{09}\x{0a}\x{0c}\x{0d}\x{a0}\x{1680}\x{2028}\x{0b}
+ 0: > \x{09}\x{0a}\x{0c}\x{0d}\x{a0}\x{1680}\x{2028}

-/^[a-c]{12}/
-    abcabcabcabc
- 0: abcabcabcabc
+/^>\p{Xsp}{2,9}/8
+    > \x{09}\x{0a}\x{0c}\x{0d}\x{a0}\x{1680}\x{2028}\x{0b}
+ 0: > \x{09}\x{0a}\x{0c}\x{0d}\x{a0}\x{1680}\x{2028}

-/^(a|b|c){12}/
-    abcabcabcabc 
- 0: abcabcabcabc
-
-/^[abcdefghijklmnopqrstuvwxy0123456789]/
-    n
- 0: n
-    *** Failers 
-No match
-    z 
-No match
-
-/abcde{0,0}/
-    abcd
- 0: abcd
-    *** Failers
-No match
-    abce  
-No match
-
-/ab[cd]{0,0}e/
-    abe
- 0: abe
-    *** Failers
-No match
-    abcde 
-No match
+/^>\p{Xsp}{2,9}?/8
+    > \x{09}\x{0a}\x{0c}\x{0d}\x{a0}\x{1680}\x{2028}\x{0b}
+ 0: > \x{09}

-/ab(c){0,0}d/
-    abd
- 0: abd
-    *** Failers
-No match
-    abcd   
-No match
+/^>[\p{Xsp}]/8
+    >\x{2028}\x{0b}
+ 0: >\x{2028}
+ 
+/^>[\p{Xsp}]+/8
+    > \x{09}\x{0a}\x{0c}\x{0d}\x{a0}\x{1680}\x{2028}\x{0b}
+ 0: > \x{09}\x{0a}\x{0c}\x{0d}\x{a0}\x{1680}\x{2028}

-/a(b*)/
-    a
- 0: a
-    ab
- 0: ab
- 1: a
-    abbbb
- 0: abbbb
- 1: abbb
- 2: abb
- 3: ab
- 4: a
-    *** Failers
- 0: a
-    bbbbb    
+/^>\p{Xps}/8
+    >\x{1680}\x{2028}\x{0b}
+ 0: >\x{1680}
+    >\x{a0} 
+ 0: >\x{a0}
+    ** Failers
 No match
-    
-/ab\d{0}e/
-    abe
- 0: abe
-    *** Failers
+    \x{0b} 
 No match
-    ab1e   
-No match
-    
-/"([^\\"]+|\\.)*"/
-    the \"quick\" brown fox
- 0: "quick"
-    \"the \\\"quick\\\" brown fox\" 
- 0: "the \"quick\" brown fox"

-/.*?/g+
-    abc
- 0: abc
- 0+ 
- 1: ab
- 2: a
- 3: 
- 0: 
- 0+ 
-  
-/\b/g+
-    abc 
- 0: 
- 0+ abc
- 0: 
- 0+ 
+/^>\p{Xps}+/8
+    > \x{09}\x{0a}\x{0c}\x{0d}\x{a0}\x{1680}\x{2028}\x{0b}
+ 0: > \x{09}\x{0a}\x{0c}\x{0d}\x{a0}\x{1680}\x{2028}\x{0b}

-/\b/+g
-    abc 
- 0: 
- 0+ abc
- 0: 
- 0+ 
+/^>\p{Xps}+?/8
+    >\x{1680}\x{2028}\x{0b}
+ 0: >\x{1680}

-//g
-    abc
- 0: 
- 0: 
- 0: 
- 0: 
-
-/<tr([\w\W\s\d][^<>]{0,})><TD([\w\W\s\d][^<>]{0,})>([\d]{0,}\.)(.*)((<BR>([\w\W\s\d][^<>]{0,})|[\s]{0,}))<\/a><\/TD><TD([\w\W\s\d][^<>]{0,})>([\w\W\s\d][^<>]{0,})<\/TD><TD([\w\W\s\d][^<>]{0,})>([\w\W\s\d][^<>]{0,})<\/TD><\/TR>/is
-  <TR BGCOLOR='#DBE9E9'><TD align=left valign=top>43.<a href='joblist.cfm?JobID=94 6735&Keyword='>Word Processor<BR>(N-1286)</a></TD><TD align=left valign=top>Lega lstaff.com</TD><TD align=left valign=top>CA - Statewide</TD></TR>
- 0: <TR BGCOLOR='#DBE9E9'><TD align=left valign=top>43.<a href='joblist.cfm?JobID=94 6735&Keyword='>Word Processor<BR>(N-1286)</a></TD><TD align=left valign=top>Lega lstaff.com</TD><TD align=left valign=top>CA - Statewide</TD></TR>
-
-/a[^a]b/
-    acb
- 0: acb
-    a\nb
- 0: a\x0ab
+/^>\p{Xps}*/8
+    > \x{09}\x{0a}\x{0c}\x{0d}\x{a0}\x{1680}\x{2028}\x{0b}
+ 0: > \x{09}\x{0a}\x{0c}\x{0d}\x{a0}\x{1680}\x{2028}\x{0b}

-/a.b/
-    acb
- 0: acb
-    *** Failers 
-No match
-    a\nb   
-No match
+/^>\p{Xps}{2,9}/8
+    > \x{09}\x{0a}\x{0c}\x{0d}\x{a0}\x{1680}\x{2028}\x{0b}
+ 0: > \x{09}\x{0a}\x{0c}\x{0d}\x{a0}\x{1680}\x{2028}\x{0b}

-/a[^a]b/s
-    acb
- 0: acb
-    a\nb  
- 0: a\x0ab
+/^>\p{Xps}{2,9}?/8
+    > \x{09}\x{0a}\x{0c}\x{0d}\x{a0}\x{1680}\x{2028}\x{0b}
+ 0: > \x{09}

-/a.b/s
-    acb
- 0: acb
-    a\nb  
- 0: a\x0ab
+/^>[\p{Xps}]/8
+    >\x{2028}\x{0b}
+ 0: >\x{2028}
+ 
+/^>[\p{Xps}]+/8
+    > \x{09}\x{0a}\x{0c}\x{0d}\x{a0}\x{1680}\x{2028}\x{0b}
+ 0: > \x{09}\x{0a}\x{0c}\x{0d}\x{a0}\x{1680}\x{2028}\x{0b}

-/^(b+?|a){1,2}?c/
-    bac
- 0: bac
-    bbac
- 0: bbac
-    bbbac
- 0: bbbac
-    bbbbac
- 0: bbbbac
-    bbbbbac 
- 0: bbbbbac
-
-/^(b+|a){1,2}?c/
-    bac
- 0: bac
-    bbac
- 0: bbac
-    bbbac
- 0: bbbac
-    bbbbac
- 0: bbbbac
-    bbbbbac 
- 0: bbbbbac
-    
-/(?!\A)x/m
-    x\nb\n
+/^\p{Xwd}/8
+    ABCD
+ 0: A
+    1234
+ 0: 1
+    \x{6ca}
+ 0: \x{6ca}
+    \x{a6c}
+ 0: \x{a6c}
+    \x{10a7}
+ 0: \x{10a7}
+    _ABC    
+ 0: _
+    ** Failers
 No match
-    a\bx\n  
- 0: x
-    
-/\x0{ab}/
-    \0{ab} 
- 0: \x00{ab}
-
-/(A|B)*?CD/
-    CD 
- 0: CD
-    
-/(A|B)*CD/
-    CD 
- 0: CD
-
-/(?<!bar)foo/
-    foo
- 0: foo
-    catfood
- 0: foo
-    arfootle
- 0: foo
-    rfoosh
- 0: foo
-    *** Failers
+    [] 
 No match
-    barfoo
-No match
-    towbarfoo
-No match

-/\w{3}(?<!bar)foo/
-    catfood
- 0: catfoo
-    *** Failers
-No match
-    foo
-No match
-    barfoo
-No match
-    towbarfoo
-No match
+/^\p{Xwd}+/8
+    ABCD1234\x{6ca}\x{a6c}\x{10a7}_
+ 0: ABCD1234\x{6ca}\x{a6c}\x{10a7}_

-/(?<=(foo)a)bar/
-    fooabar
- 0: bar
-    *** Failers
-No match
-    bar
-No match
-    foobbar
-No match
-      
-/\Aabc\z/m
-    abc
- 0: abc
-    *** Failers
-No match
-    abc\n   
-No match
-    qqq\nabc
-No match
-    abc\nzzz
-No match
-    qqq\nabc\nzzz
-No match
+/^\p{Xwd}+?/8
+    \x{6ca}\x{a6c}\x{10a7}_
+ 0: \x{6ca}

-"(?>.*/)foo"
-    /this/is/a/very/long/line/in/deed/with/very/many/slashes/in/it/you/see/
-No match
-
-"(?>.*/)foo"
-    /this/is/a/very/long/line/in/deed/with/very/many/slashes/in/and/foo
- 0: /this/is/a/very/long/line/in/deed/with/very/many/slashes/in/and/foo
-
-/(?>(\.\d\d[1-9]?))\d+/
-    1.230003938
- 0: .230003938
- 1: .23000393
- 2: .2300039
- 3: .230003
- 4: .23000
- 5: .2300
- 6: .230
-    1.875000282
- 0: .875000282
- 1: .87500028
- 2: .8750002
- 3: .875000
- 4: .87500
- 5: .8750
-    *** Failers 
-No match
-    1.235 
-No match
-
-/^((?>\w+)|(?>\s+))*$/
-    now is the time for all good men to come to the aid of the party
- 0: now is the time for all good men to come to the aid of the party
-    *** Failers
-No match
-    this is not a line with only words and spaces!
-No match
+/^\p{Xwd}*/8
+    ABCD1234\x{6ca}\x{a6c}\x{10a7}_
+ 0: ABCD1234\x{6ca}\x{a6c}\x{10a7}_

-/(\d+)(\w)/
-    12345a
- 0: 12345a
- 1: 12345
- 2: 1234
- 3: 123
- 4: 12
-    12345+ 
- 0: 12345
- 1: 1234
- 2: 123
- 3: 12
-
-/((?>\d+))(\w)/
-    12345a
- 0: 12345a
-    *** Failers
-No match
-    12345+ 
-No match
-
-/(?>a+)b/
-    aaab
- 0: aaab
-
-/((?>a+)b)/
-    aaab
- 0: aaab
-
-/(?>(a+))b/
-    aaab
- 0: aaab
-
-/(?>b)+/
-    aaabbbccc
- 0: bbb
- 1: bb
- 2: b
-
-/(?>a+|b+|c+)*c/
-    aaabbbbccccd
- 0: aaabbbbcccc
- 1: aaabbbbc
+/^\p{Xwd}{2,9}/8
+    A_B12\x{6ca}\x{a6c}\x{10a7}
+ 0: A_B12\x{6ca}\x{a6c}\x{10a7}

-/(a+|b+|c+)*c/
-    aaabbbbccccd
- 0: aaabbbbcccc
- 1: aaabbbbccc
- 2: aaabbbbcc
- 3: aaabbbbc
-
-/((?>[^()]+)|\([^()]*\))+/
-    ((abc(ade)ufh()()x
- 0: abc(ade)ufh()()x
- 1: abc(ade)ufh()()
- 2: abc(ade)ufh()
- 3: abc(ade)ufh
- 4: abc(ade)
- 5: abc
+/^\p{Xwd}{2,9}?/8
+    \x{6ca}\x{a6c}\x{10a7}_
+ 0: \x{6ca}\x{a6c}

-/\(((?>[^()]+)|\([^()]+\))+\)/ 
-    (abc)
- 0: (abc)
-    (abc(def)xyz)
- 0: (abc(def)xyz)
-    *** Failers
+/^[\p{Xwd}]/8
+    ABCD1234_
+ 0: A
+    1234abcd_
+ 0: 1
+    \x{6ca}
+ 0: \x{6ca}
+    \x{a6c}
+ 0: \x{a6c}
+    \x{10a7}   
+ 0: \x{10a7}
+    _ABC 
+ 0: _
+    ** Failers
 No match
-    ((()aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa   
+    []   
 No match
-
-/a(?-i)b/i
-    ab
- 0: ab
-    Ab
- 0: Ab
-    *** Failers 
-No match
-    aB
-No match
-    AB
-No match
-        
-/(a (?x)b c)d e/
-    a bcd e
- 0: a bcd e
-    *** Failers
-No match
-    a b cd e
-No match
-    abcd e   
-No match
-    a bcde 
-No match

-/(a b(?x)c d (?-x)e f)/
-    a bcde f
- 0: a bcde f
-    *** Failers
-No match
-    abcdef  
-No match
+/^[\p{Xwd}]+/8
+    ABCD1234\x{6ca}\x{a6c}\x{10a7}_
+ 0: ABCD1234\x{6ca}\x{a6c}\x{10a7}_

-/(a(?i)b)c/
-    abc
- 0: abc
-    aBc
- 0: aBc
-    *** Failers
-No match
-    abC
-No match
-    aBC  
-No match
-    Abc
-No match
-    ABc
-No match
-    ABC
-No match
-    AbC
-No match
-    
-/a(?i:b)c/
-    abc
- 0: abc
-    aBc
- 0: aBc
-    *** Failers 
-No match
-    ABC
-No match
-    abC
-No match
-    aBC
-No match
-    
-/a(?i:b)*c/
-    aBc
- 0: aBc
-    aBBc
- 0: aBBc
-    *** Failers 
-No match
-    aBC
-No match
-    aBBC
-No match
-    
-/a(?=b(?i)c)\w\wd/
-    abcd
- 0: abcd
-    abCd
- 0: abCd
-    *** Failers
-No match
-    aBCd
-No match
-    abcD     
-No match
-    
-/(?s-i:more.*than).*million/i
-    more than million
- 0: more than million
-    more than MILLION
- 0: more than MILLION
-    more \n than Million 
- 0: more \x0a than Million
-    *** Failers
-No match
-    MORE THAN MILLION    
-No match
-    more \n than \n million 
-No match
+/-- A check not in UTF-8 mode --/

-/(?:(?s-i)more.*than).*million/i
-    more than million
- 0: more than million
-    more than MILLION
- 0: more than MILLION
-    more \n than Million 
- 0: more \x0a than Million
-    *** Failers
-No match
-    MORE THAN MILLION    
-No match
-    more \n than \n million 
-No match
+/^[\p{Xwd}]+/
+    ABCD1234_
+ 0: ABCD1234_

-/(?>a(?i)b+)+c/ 
-    abc
- 0: abc
-    aBbc
- 0: aBbc
-    aBBc 
- 0: aBBc
-    *** Failers
-No match
-    Abc
-No match
-    abAb    
-No match
-    abbC 
-No match
-    
-/(?=a(?i)b)\w\wc/
-    abc
- 0: abc
-    aBc
- 0: aBc
-    *** Failers
-No match
-    Ab 
-No match
-    abC
-No match
-    aBC     
-No match
-    
-/(?<=a(?i)b)(\w\w)c/
-    abxxc
- 0: xxc
-    aBxxc
- 0: xxc
-    *** Failers
-No match
-    Abxxc
-No match
-    ABxxc
-No match
-    abxxC      
-No match
+/-- Some negative checks --/

-/^(?(?=abc)\w{3}:|\d\d)$/
-    abc:
- 0: abc:
-    12
- 0: 12
-    *** Failers
-No match
-    123
-No match
-    xyz    
-No match
+/^[\P{Xwd}]+/8
+    !.+\x{019}\x{35a}AB
+ 0: !.+\x{19}\x{35a}

-/^(?(?!abc)\d\d|\w{3}:)$/
-    abc:
- 0: abc:
-    12
- 0: 12
-    *** Failers
-No match
-    123
-No match
-    xyz    
-No match
-    
-/(?(?<=foo)bar|cat)/
-    foobar
- 0: bar
-    cat
- 0: cat
-    fcat
- 0: cat
-    focat   
- 0: cat
-    *** Failers
-No match
-    foocat  
-No match
+/^[\p{^Xwd}]+/8
+    !.+\x{019}\x{35a}AB
+ 0: !.+\x{19}\x{35a}

-/(?(?<!foo)cat|bar)/
-    foobar
- 0: bar
-    cat
- 0: cat
-    fcat
- 0: cat
-    focat   
- 0: cat
-    *** Failers
-No match
-    foocat  
-No match
+/[\D]/WBZ8
+------------------------------------------------------------------
+        Bra
+        [\P{Nd}]
+        Ket
+        End
+------------------------------------------------------------------
+    1\x{3c8}2
+ 0: \x{3c8}

-/(?>a*)*/
-    a
- 0: a
- 1: 
-    aa
- 0: aa
- 1: 
-    aaaa
- 0: aaaa
- 1: 
-    
-/(abc|)+/
-    abc
- 0: abc
- 1: 
-    abcabc
- 0: abcabc
- 1: abc
- 2: 
-    abcabcabc
- 0: abcabcabc
- 1: abcabc
- 2: abc
- 3: 
-    xyz      
- 0: 
+/[\d]/WBZ8
+------------------------------------------------------------------
+        Bra
+        [\p{Nd}]
+        Ket
+        End
+------------------------------------------------------------------
+    >\x{6f4}<
+ 0: \x{6f4}

-/([a]*)*/
-    a
- 0: a
- 1: 
-    aaaaa 
- 0: aaaaa
- 1: aaaa
- 2: aaa
- 3: aa
- 4: a
- 5: 
- 
-/([ab]*)*/
-    a
- 0: a
- 1: 
-    b
- 0: b
- 1: 
-    ababab
- 0: ababab
- 1: ababa
- 2: abab
- 3: aba
- 4: ab
- 5: a
- 6: 
-    aaaabcde
- 0: aaaab
- 1: aaaa
- 2: aaa
- 3: aa
- 4: a
- 5: 
-    bbbb    
- 0: bbbb
- 1: bbb
- 2: bb
- 3: b
- 4: 
- 
-/([^a]*)*/
-    b
- 0: b
- 1: 
-    bbbb
- 0: bbbb
- 1: bbb
- 2: bb
- 3: b
- 4: 
-    aaa   
- 0: 
- 
-/([^ab]*)*/
-    cccc
- 0: cccc
- 1: ccc
- 2: cc
- 3: c
- 4: 
-    abab  
- 0: 
- 
-/([a]*?)*/
-    a
- 0: a
- 1: 
-    aaaa 
- 0: aaaa
- 1: aaa
- 2: aa
- 3: a
- 4: 
- 
-/([ab]*?)*/
-    a
- 0: a
- 1: 
-    b
- 0: b
- 1: 
-    abab
- 0: abab
- 1: aba
- 2: ab
- 3: a
- 4: 
-    baba   
- 0: baba
- 1: bab
- 2: ba
- 3: b
- 4: 
- 
-/([^a]*?)*/
-    b
- 0: b
- 1: 
-    bbbb
- 0: bbbb
- 1: bbb
- 2: bb
- 3: b
- 4: 
-    aaa   
- 0: 
- 
-/([^ab]*?)*/
-    c
- 0: c
- 1: 
-    cccc
- 0: cccc
- 1: ccc
- 2: cc
- 3: c
- 4: 
-    baba   
- 0: 
- 
-/(?>a*)*/
-    a
- 0: a
- 1: 
-    aaabcde 
- 0: aaa
- 1: 
- 
-/((?>a*))*/
-    aaaaa
- 0: aaaaa
- 1: 
-    aabbaa 
- 0: aa
- 1: 
- 
-/((?>a*?))*/
-    aaaaa
- 0: aaaaa
- 1: 
-    aabbaa 
- 0: aa
- 1: 
+/[\S]/WBZ8
+------------------------------------------------------------------
+        Bra
+        [\P{Xsp}]
+        Ket
+        End
+------------------------------------------------------------------
+    \x{1680}\x{6f4}\x{1680}
+ 0: \x{6f4}

-/(?(?=[^a-z]+[a-z])  \d{2}-[a-z]{3}-\d{2}  |  \d{2}-\d{2}-\d{2} ) /x
-    12-sep-98
- 0: 12-sep-98
-    12-09-98
- 0: 12-09-98
-    *** Failers
-No match
-    sep-12-98
-No match
-        
-/(?i:saturday|sunday)/
-    saturday
- 0: saturday
-    sunday
- 0: sunday
-    Saturday
- 0: Saturday
-    Sunday
- 0: Sunday
-    SATURDAY
- 0: SATURDAY
-    SUNDAY
- 0: SUNDAY
-    SunDay
- 0: SunDay
-    
-/(a(?i)bc|BB)x/
-    abcx
- 0: abcx
-    aBCx
- 0: aBCx
-    bbx
- 0: bbx
-    BBx
- 0: BBx
-    *** Failers
-No match
-    abcX
-No match
-    aBCX
-No match
-    bbX
-No match
-    BBX               
-No match
+/[\s]/WBZ8
+------------------------------------------------------------------
+        Bra
+        [\p{Xsp}]
+        Ket
+        End
+------------------------------------------------------------------
+    >\x{1680}<
+ 0: \x{1680}

-/^([ab](?i)[cd]|[ef])/
-    ac
- 0: ac
-    aC
- 0: aC
-    bD
- 0: bD
-    elephant
- 0: e
-    Europe 
- 0: E
-    frog
- 0: f
-    France
- 0: F
-    *** Failers
-No match
-    Africa     
-No match
+/[\W]/WBZ8
+------------------------------------------------------------------
+        Bra
+        [\P{Xwd}]
+        Ket
+        End
+------------------------------------------------------------------
+    A\x{1712}B
+ 0: \x{1712}

-/^(ab|a(?i)[b-c](?m-i)d|x(?i)y|z)/
-    ab
- 0: ab
-    aBd
- 0: aBd
-    xy
- 0: xy
-    xY
- 0: xY
-    zebra
- 0: z
-    Zambesi
- 0: Z
-    *** Failers
-No match
-    aCD  
-No match
-    XY  
-No match
+/[\w]/WBZ8
+------------------------------------------------------------------
+        Bra
+        [\p{Xwd}]
+        Ket
+        End
+------------------------------------------------------------------
+    >\x{1723}<
+ 0: \x{1723}

-/(?<=foo\n)^bar/m
-    foo\nbar
- 0: bar
-    *** Failers
-No match
-    bar
-No match
-    baz\nbar   
-No match
+/\D/WBZ8
+------------------------------------------------------------------
+        Bra
+        notprop Nd
+        Ket
+        End
+------------------------------------------------------------------
+    1\x{3c8}2
+ 0: \x{3c8}

-/(?<=(?<!foo)bar)baz/
-    barbaz
- 0: baz
-    barbarbaz 
- 0: baz
-    koobarbaz 
- 0: baz
-    *** Failers
-No match
-    baz
-No match
-    foobarbaz 
-No match
+/\d/WBZ8
+------------------------------------------------------------------
+        Bra
+        prop Nd
+        Ket
+        End
+------------------------------------------------------------------
+    >\x{6f4}<
+ 0: \x{6f4}

-/The following tests are taken from the Perl 5.005 test suite; some of them/
-/are compatible with 5.004, but I'd rather not have to sort them out./
-No match
+/\S/WBZ8
+------------------------------------------------------------------
+        Bra
+        notprop Xsp
+        Ket
+        End
+------------------------------------------------------------------
+    \x{1680}\x{6f4}\x{1680}
+ 0: \x{6f4}

-/abc/
-    abc
- 0: abc
-    xabcy
- 0: abc
-    ababc
- 0: abc
-    *** Failers
-No match
-    xbc
-No match
-    axc
-No match
-    abx
-No match
+/\s/WBZ8
+------------------------------------------------------------------
+        Bra
+        prop Xsp
+        Ket
+        End
+------------------------------------------------------------------
+    >\x{1680}>
+ 0: \x{1680}

-/ab*c/
-    abc
- 0: abc
+/\W/WBZ8
+------------------------------------------------------------------
+        Bra
+        notprop Xwd
+        Ket
+        End
+------------------------------------------------------------------
+    A\x{1712}B
+ 0: \x{1712}

-/ab*bc/
-    abc
- 0: abc
-    abbc
- 0: abbc
-    abbbbc
- 0: abbbbc
+/\w/WBZ8
+------------------------------------------------------------------
+        Bra
+        prop Xwd
+        Ket
+        End
+------------------------------------------------------------------
+    >\x{1723}<
+ 0: \x{1723}

-/.{1}/
-    abbbbc
- 0: a
+/[[:alpha:]]/WBZ
+------------------------------------------------------------------
+        Bra
+        [\p{L}]
+        Ket
+        End
+------------------------------------------------------------------

-/.{3,4}/
-    abbbbc
- 0: abbb
- 1: abb
+/[[:lower:]]/WBZ
+------------------------------------------------------------------
+        Bra
+        [\p{Ll}]
+        Ket
+        End
+------------------------------------------------------------------

-/ab{0,}bc/
-    abbbbc
- 0: abbbbc
+/[[:upper:]]/WBZ
+------------------------------------------------------------------
+        Bra
+        [\p{Lu}]
+        Ket
+        End
+------------------------------------------------------------------

-/ab+bc/
-    abbc
- 0: abbc
-    *** Failers
-No match
-    abc
-No match
-    abq
-No match
+/[[:alnum:]]/WBZ
+------------------------------------------------------------------
+        Bra
+        [\p{Xan}]
+        Ket
+        End
+------------------------------------------------------------------

-/ab+bc/
-    abbbbc
- 0: abbbbc
+/[[:ascii:]]/WBZ
+------------------------------------------------------------------
+        Bra
+        [\x00-\x7f]
+        Ket
+        End
+------------------------------------------------------------------

-/ab{1,}bc/
-    abbbbc
- 0: abbbbc
+/[[:cntrl:]]/WBZ
+------------------------------------------------------------------
+        Bra
+        [\x00-\x1f\x7f]
+        Ket
+        End
+------------------------------------------------------------------

-/ab{1,3}bc/
-    abbbbc
- 0: abbbbc
+/[[:digit:]]/WBZ
+------------------------------------------------------------------
+        Bra
+        [\p{Nd}]
+        Ket
+        End
+------------------------------------------------------------------

-/ab{3,4}bc/
-    abbbbc
- 0: abbbbc
+/[[:graph:]]/WBZ
+------------------------------------------------------------------
+        Bra
+        [!-~]
+        Ket
+        End
+------------------------------------------------------------------

-/ab{4,5}bc/
-    *** Failers
-No match
-    abq
-No match
-    abbbbc
-No match
+/[[:print:]]/WBZ
+------------------------------------------------------------------
+        Bra
+        [ -~]
+        Ket
+        End
+------------------------------------------------------------------

-/ab?bc/
-    abbc
- 0: abbc
-    abc
- 0: abc
+/[[:punct:]]/WBZ
+------------------------------------------------------------------
+        Bra
+        [!-/:-@[-`{-~]
+        Ket
+        End
+------------------------------------------------------------------

-/ab{0,1}bc/
-    abc
- 0: abc
+/[[:space:]]/WBZ
+------------------------------------------------------------------
+        Bra
+        [\p{Xps}]
+        Ket
+        End
+------------------------------------------------------------------

-/ab?bc/
+/[[:word:]]/WBZ
+------------------------------------------------------------------
+        Bra
+        [\p{Xwd}]
+        Ket
+        End
+------------------------------------------------------------------

-/ab?c/
-    abc
- 0: abc
+/[[:xdigit:]]/WBZ
+------------------------------------------------------------------
+        Bra
+        [0-9A-Fa-f]
+        Ket
+        End
+------------------------------------------------------------------

-/ab{0,1}c/
-    abc
- 0: abc
+/-- Unicode properties for \b abd \B --/

-/^abc$/
-    abc
+/\b...\B/8W
+    abc_
  0: abc
-    *** Failers
-No match
-    abbbbc
-No match
-    abcc
-No match
-
-/^abc/
-    abcc
+    \x{37e}abc\x{376} 
  0: abc
+    \x{37e}\x{376}\x{371}\x{393}\x{394} 
+ 0: \x{376}\x{371}\x{393}
+    !\x{c0}++\x{c1}\x{c2} 
+ 0: ++\x{c1}
+    !\x{c0}+++++ 
+ 0: \x{c0}++

-/^abc$/
+/-- Without PCRE_UCP, non-ASCII always fail, even if < 256 --/

-/abc$/
-    aabc
+/\b...\B/8
+    abc_
  0: abc
-    *** Failers
+    ** Failers 
+ 0: Fai
+    \x{37e}abc\x{376} 
 No match
-    aabc
- 0: abc
-    aabcd
+    \x{37e}\x{376}\x{371}\x{393}\x{394} 
 No match
-
-/^/
-    abc
- 0: 
-
-/$/
-    abc
- 0: 
-
-/a.c/
-    abc
- 0: abc
-    axc
- 0: axc
-
-/a.*c/
-    axyzc
- 0: axyzc
-
-/a[bc]d/
-    abd
- 0: abd
-    *** Failers
+    !\x{c0}++\x{c1}\x{c2} 
 No match
-    axyzd
+    !\x{c0}+++++ 
 No match
-    abc
-No match

-/a[b-d]e/
-    ace
- 0: ace
+/-- With PCRE_UCP, non-UTF8 chars that are < 256 still check properties  --/

-/a[b-d]/
-    aac
- 0: ac
-
-/a[-b]/
-    a-
- 0: a-
-
-/a[b-]/
-    a-
- 0: a-
-
-/a]/
-    a]
- 0: a]
-
-/a[]]b/
-    a]b
- 0: a]b
-
-/a[^bc]d/
-    aed
- 0: aed
-    *** Failers
-No match
-    abd
-No match
-    abd
-No match
-
-/a[^-b]c/
-    adc
- 0: adc
-
-/a[^]b]c/
-    adc
- 0: adc
-    *** Failers
-No match
-    a-c
- 0: a-c
-    a]c
-No match
-
-/\ba\b/
-    a-
- 0: a
-    -a
- 0: a
-    -a-
- 0: a
-
-/\by\b/
-    *** Failers
-No match
-    xy
-No match
-    yz
-No match
-    xyz
-No match
-
-/\Ba\B/
-    *** Failers
- 0: a
-    a-
-No match
-    -a
-No match
-    -a-
-No match
-
-/\By\b/
-    xy
- 0: y
-
-/\by\B/
-    yz
- 0: y
-
-/\By\B/
-    xyz
- 0: y
-
-/\w/
-    a
- 0: a
-
-/\W/
-    -
- 0: -
-    *** Failers
- 0: *
-    -
- 0: -
-    a
-No match
-
-/a\sb/
-    a b
- 0: a b
-
-/a\Sb/
-    a-b
- 0: a-b
-    *** Failers
-No match
-    a-b
- 0: a-b
-    a b
-No match
-
-/\d/
-    1
- 0: 1
-
-/\D/
-    -
- 0: -
-    *** Failers
- 0: *
-    -
- 0: -
-    1
-No match
-
-/[\w]/
-    a
- 0: a
-
-/[\W]/
-    -
- 0: -
-    *** Failers
- 0: *
-    -
- 0: -
-    a
-No match
-
-/a[\s]b/
-    a b
- 0: a b
-
-/a[\S]b/
-    a-b
- 0: a-b
-    *** Failers
-No match
-    a-b
- 0: a-b
-    a b
-No match
-
-/[\d]/
-    1
- 0: 1
-
-/[\D]/
-    -
- 0: -
-    *** Failers
- 0: *
-    -
- 0: -
-    1
-No match
-
-/ab|cd/
-    abc
- 0: ab
-    abcd
- 0: ab
-
-/()ef/
-    def
- 0: ef
-
-/$b/
-
-/a\(b/
-    a(b
- 0: a(b
-
-/a\(*b/
-    ab
- 0: ab
-    a((b
- 0: a((b
-
-/a\\b/
-    a\b
-No match
-
-/((a))/
-    abc
- 0: a
-
-/(a)b(c)/
-    abc
+/\b...\B/W
+    abc_
  0: abc
+    !\x{c0}++\x{c1}\x{c2} 
+ 0: ++\xc1
+    !\x{c0}+++++ 
+ 0: \xc0++

-/a+b+c/
-    aabbabc
- 0: abc
+/-- Some of these are silly, but they check various combinations --/

-/a{1,}b{1,}c/
-    aabbabc
+/[[:^alpha:][:^cntrl:]]+/8WBZ
+------------------------------------------------------------------
+        Bra
+        [ -~\x80-\xff\P{L}]+
+        Ket
+        End
+------------------------------------------------------------------
+    123
+ 0: 123
+    abc 
  0: abc

-/a.+?c/
-    abcabc
- 0: abcabc
- 1: abc
-
-/(a+|b)*/
-    ab
- 0: ab
- 1: a
- 2: 
-
-/(a+|b){0,}/
-    ab
- 0: ab
- 1: a
- 2: 
-
-/(a+|b)+/
-    ab
- 0: ab
- 1: a
-
-/(a+|b){1,}/
-    ab
- 0: ab
- 1: a
-
-/(a+|b)?/
-    ab
- 0: a
- 1: 
-
-/(a+|b){0,1}/
-    ab
- 0: a
- 1: 
-
-/[^ab]*/
-    cde
- 0: cde
- 1: cd
- 2: c
- 3: 
-
-/abc/
-    *** Failers
-No match
-    b
-No match
-    
-
-/a*/
-    
-
-/([abc])*d/
-    abbbcd
- 0: abbbcd
-
-/([abc])*bcd/
-    abcd
- 0: abcd
-
-/a|b|c|d|e/
-    e
- 0: e
-
-/(a|b|c|d|e)f/
-    ef
- 0: ef
-
-/abcd*efg/
-    abcdefg
- 0: abcdefg
-
-/ab*/
-    xabyabbbz
- 0: ab
- 1: a
-    xayabbbz
- 0: a
-
-/(ab|cd)e/
-    abcde
- 0: cde
-
-/[abhgefdc]ij/
-    hij
- 0: hij
-
-/^(ab|cd)e/
-
-/(abc|)ef/
-    abcdef
- 0: ef
-
-/(a|b)c*d/
-    abcd
- 0: bcd
-
-/(ab|ab*)bc/
-    abc
+/[[:^cntrl:][:^alpha:]]+/8WBZ
+------------------------------------------------------------------
+        Bra
+        [ -~\x80-\xff\P{L}]+
+        Ket
+        End
+------------------------------------------------------------------
+    123
+ 0: 123
+    abc 
  0: abc

-/a([bc]*)c*/
+/[[:alpha:]]+/8WBZ
+------------------------------------------------------------------
+        Bra
+        [\p{L}]+
+        Ket
+        End
+------------------------------------------------------------------
     abc
  0: abc
- 1: ab
- 2: a

-/a([bc]*)(c*d)/
-    abcd
- 0: abcd
-
-/a([bc]+)(c*d)/
-    abcd
- 0: abcd
-
-/a([bc]*)(c+d)/
-    abcd
- 0: abcd
-
-/a[bcd]*dcdcde/
-    adcdcde
- 0: adcdcde
-
-/a[bcd]+dcdcde/
-    *** Failers
-No match
-    abcde
-No match
-    adcdcde
-No match
-
-/(ab|a)b*c/
-    abc
+/[[:^alpha:]\S]+/8WBZ
+------------------------------------------------------------------
+        Bra
+        [\P{L}\P{Xsp}]+
+        Ket
+        End
+------------------------------------------------------------------
+    123
+ 0: 123
+    abc 
  0: abc

-/((a)(b)c)(d)/
-    abcd
- 0: abcd
-
-/[a-zA-Z_][a-zA-Z0-9_]*/
-    alpha
- 0: alpha
- 1: alph
- 2: alp
- 3: al
- 4: a
-
-/^a(bc+|b[eh])g|.h$/
-    abh
- 0: bh
-
-/(bc+d$|ef*g.|h?i(j|k))/
-    effgz
- 0: effgz
-    ij
- 0: ij
-    reffgz
- 0: effgz
-    *** Failers
-No match
-    effg
-No match
-    bcdd
-No match
-
-/((((((((((a))))))))))/
-    a
- 0: a
-
-/(((((((((a)))))))))/
-    a
- 0: a
-
-/multiple words of text/
-    *** Failers
-No match
-    aa
-No match
-    uh-uh
-No match
-
-/multiple words/
-    multiple words, yeah
- 0: multiple words
-
-/(.*)c(.*)/
-    abcde
- 0: abcde
- 1: abcd
- 2: abc
-
-/\((.*), (.*)\)/
-    (a, b)
- 0: (a, b)
-
-/[k]/
-
-/abcd/
-    abcd
- 0: abcd
-
-/a(bc)d/
-    abcd
- 0: abcd
-
-/a[-]?c/
-    ac
- 0: ac
-
-/abc/i
-    ABC
- 0: ABC
-    XABCY
- 0: ABC
-    ABABC
- 0: ABC
-    *** Failers
-No match
-    aaxabxbaxbbx
-No match
-    XBC
-No match
-    AXC
-No match
-    ABX
-No match
-
-/ab*c/i
-    ABC
- 0: ABC
-
-/ab*bc/i
-    ABC
- 0: ABC
-    ABBC
- 0: ABBC
-
-/ab*?bc/i
-    ABBBBC
- 0: ABBBBC
-
-/ab{0,}?bc/i
-    ABBBBC
- 0: ABBBBC
-
-/ab+?bc/i
-    ABBC
- 0: ABBC
-
-/ab+bc/i
-    *** Failers
-No match
-    ABC
-No match
-    ABQ
-No match
-
-/ab{1,}bc/i
-
-/ab+bc/i
-    ABBBBC
- 0: ABBBBC
-
-/ab{1,}?bc/i
-    ABBBBC
- 0: ABBBBC
-
-/ab{1,3}?bc/i
-    ABBBBC
- 0: ABBBBC
-
-/ab{3,4}?bc/i
-    ABBBBC
- 0: ABBBBC
-
-/ab{4,5}?bc/i
-    *** Failers
-No match
-    ABQ
-No match
-    ABBBBC
-No match
-
-/ab??bc/i
-    ABBC
- 0: ABBC
-    ABC
- 0: ABC
-
-/ab{0,1}?bc/i
-    ABC
- 0: ABC
-
-/ab??bc/i
-
-/ab??c/i
-    ABC
- 0: ABC
-
-/ab{0,1}?c/i
-    ABC
- 0: ABC
-
-/^abc$/i
-    ABC
- 0: ABC
-    *** Failers
-No match
-    ABBBBC
-No match
-    ABCC
-No match
-
-/^abc/i
-    ABCC
- 0: ABC
-
-/^abc$/i
-
-/abc$/i
-    AABC
- 0: ABC
-
-/^/i
-    ABC
- 0: 
-
-/$/i
-    ABC
- 0: 
-
-/a.c/i
-    ABC
- 0: ABC
-    AXC
- 0: AXC
-
-/a.*?c/i
-    AXYZC
- 0: AXYZC
-
-/a.*c/i
-    *** Failers
-No match
-    AABC
- 0: AABC
-    AXYZD
-No match
-
-/a[bc]d/i
-    ABD
- 0: ABD
-
-/a[b-d]e/i
-    ACE
- 0: ACE
-    *** Failers
-No match
-    ABC
-No match
-    ABD
-No match
-
-/a[b-d]/i
-    AAC
- 0: AC
-
-/a[-b]/i
-    A-
- 0: A-
-
-/a[b-]/i
-    A-
- 0: A-
-
-/a]/i
-    A]
- 0: A]
-
-/a[]]b/i
-    A]B
- 0: A]B
-
-/a[^bc]d/i
-    AED
- 0: AED
-
-/a[^-b]c/i
-    ADC
- 0: ADC
-    *** Failers
-No match
-    ABD
-No match
-    A-C
-No match
-
-/a[^]b]c/i
-    ADC
- 0: ADC
-
-/ab|cd/i
-    ABC
- 0: AB
-    ABCD
- 0: AB
-
-/()ef/i
-    DEF
- 0: EF
-
-/$b/i
-    *** Failers
-No match
-    A]C
-No match
-    B
-No match
-
-/a\(b/i
-    A(B
- 0: A(B
-
-/a\(*b/i
-    AB
- 0: AB
-    A((B
- 0: A((B
-
-/a\\b/i
-    A\B
-No match
-
-/((a))/i
-    ABC
- 0: A
-
-/(a)b(c)/i
-    ABC
- 0: ABC
-
-/a+b+c/i
-    AABBABC
- 0: ABC
-
-/a{1,}b{1,}c/i
-    AABBABC
- 0: ABC
-
-/a.+?c/i
-    ABCABC
- 0: ABCABC
- 1: ABC
-
-/a.*?c/i
-    ABCABC
- 0: ABCABC
- 1: ABC
-
-/a.{0,5}?c/i
-    ABCABC
- 0: ABCABC
- 1: ABC
-
-/(a+|b)*/i
-    AB
- 0: AB
- 1: A
- 2: 
-
-/(a+|b){0,}/i
-    AB
- 0: AB
- 1: A
- 2: 
-
-/(a+|b)+/i
-    AB
- 0: AB
- 1: A
-
-/(a+|b){1,}/i
-    AB
- 0: AB
- 1: A
-
-/(a+|b)?/i
-    AB
- 0: A
- 1: 
-
-/(a+|b){0,1}/i
-    AB
- 0: A
- 1: 
-
-/(a+|b){0,1}?/i
-    AB
- 0: A
- 1: 
-
-/[^ab]*/i
-    CDE
- 0: CDE
- 1: CD
- 2: C
- 3: 
-
-/abc/i
-
-/a*/i
-    
-
-/([abc])*d/i
-    ABBBCD
- 0: ABBBCD
-
-/([abc])*bcd/i
-    ABCD
- 0: ABCD
-
-/a|b|c|d|e/i
-    E
- 0: E
-
-/(a|b|c|d|e)f/i
-    EF
- 0: EF
-
-/abcd*efg/i
-    ABCDEFG
- 0: ABCDEFG
-
-/ab*/i
-    XABYABBBZ
- 0: AB
- 1: A
-    XAYABBBZ
- 0: A
-
-/(ab|cd)e/i
-    ABCDE
- 0: CDE
-
-/[abhgefdc]ij/i
-    HIJ
- 0: HIJ
-
-/^(ab|cd)e/i
-    ABCDE
-No match
-
-/(abc|)ef/i
-    ABCDEF
- 0: EF
-
-/(a|b)c*d/i
-    ABCD
- 0: BCD
-
-/(ab|ab*)bc/i
-    ABC
- 0: ABC
-
-/a([bc]*)c*/i
-    ABC
- 0: ABC
- 1: AB
- 2: A
-
-/a([bc]*)(c*d)/i
-    ABCD
- 0: ABCD
-
-/a([bc]+)(c*d)/i
-    ABCD
- 0: ABCD
-
-/a([bc]*)(c+d)/i
-    ABCD
- 0: ABCD
-
-/a[bcd]*dcdcde/i
-    ADCDCDE
- 0: ADCDCDE
-
-/a[bcd]+dcdcde/i
-
-/(ab|a)b*c/i
-    ABC
- 0: ABC
-
-/((a)(b)c)(d)/i
-    ABCD
- 0: ABCD
-
-/[a-zA-Z_][a-zA-Z0-9_]*/i
-    ALPHA
- 0: ALPHA
- 1: ALPH
- 2: ALP
- 3: AL
- 4: A
-
-/^a(bc+|b[eh])g|.h$/i
-    ABH
- 0: BH
-
-/(bc+d$|ef*g.|h?i(j|k))/i
-    EFFGZ
- 0: EFFGZ
-    IJ
- 0: IJ
-    REFFGZ
- 0: EFFGZ
-    *** Failers
-No match
-    ADCDCDE
-No match
-    EFFG
-No match
-    BCDD
-No match
-
-/((((((((((a))))))))))/i
-    A
- 0: A
-
-/(((((((((a)))))))))/i
-    A
- 0: A
-
-/(?:(?:(?:(?:(?:(?:(?:(?:(?:(a))))))))))/i
-    A
- 0: A
-
-/(?:(?:(?:(?:(?:(?:(?:(?:(?:(a|b|c))))))))))/i
-    C
- 0: C
-
-/multiple words of text/i
-    *** Failers
-No match
-    AA
-No match
-    UH-UH
-No match
-
-/multiple words/i
-    MULTIPLE WORDS, YEAH
- 0: MULTIPLE WORDS
-
-/(.*)c(.*)/i
-    ABCDE
- 0: ABCDE
- 1: ABCD
- 2: ABC
-
-/\((.*), (.*)\)/i
-    (A, B)
- 0: (A, B)
-
-/[k]/i
-
-/abcd/i
-    ABCD
- 0: ABCD
-
-/a(bc)d/i
-    ABCD
- 0: ABCD
-
-/a[-]?c/i
-    AC
- 0: AC
-
-/a(?!b)./
-    abad
- 0: ad
-
-/a(?=d)./
-    abad
- 0: ad
-
-/a(?=c|d)./
-    abad
- 0: ad
-
-/a(?:b|c|d)(.)/
-    ace
- 0: ace
-
-/a(?:b|c|d)*(.)/
-    ace
- 0: ace
- 1: ac
-
-/a(?:b|c|d)+?(.)/
-    ace
- 0: ace
-    acdbcdbe
- 0: acdbcdbe
- 1: acdbcdb
- 2: acdbcd
- 3: acdbc
- 4: acdb
- 5: acd
-
-/a(?:b|c|d)+(.)/
-    acdbcdbe
- 0: acdbcdbe
- 1: acdbcdb
- 2: acdbcd
- 3: acdbc
- 4: acdb
- 5: acd
-
-/a(?:b|c|d){2}(.)/
-    acdbcdbe
- 0: acdb
-
-/a(?:b|c|d){4,5}(.)/
-    acdbcdbe
- 0: acdbcdb
- 1: acdbcd
-
-/a(?:b|c|d){4,5}?(.)/
-    acdbcdbe
- 0: acdbcdb
- 1: acdbcd
-
-/((foo)|(bar))*/
-    foobar
- 0: foobar
- 1: foo
- 2: 
-
-/a(?:b|c|d){6,7}(.)/
-    acdbcdbe
- 0: acdbcdbe
-
-/a(?:b|c|d){6,7}?(.)/
-    acdbcdbe
- 0: acdbcdbe
-
-/a(?:b|c|d){5,6}(.)/
-    acdbcdbe
- 0: acdbcdbe
- 1: acdbcdb
-
-/a(?:b|c|d){5,6}?(.)/
-    acdbcdbe
- 0: acdbcdbe
- 1: acdbcdb
-
-/a(?:b|c|d){5,7}(.)/
-    acdbcdbe
- 0: acdbcdbe
- 1: acdbcdb
-
-/a(?:b|c|d){5,7}?(.)/
-    acdbcdbe
- 0: acdbcdbe
- 1: acdbcdb
-
-/a(?:b|(c|e){1,2}?|d)+?(.)/
-    ace
- 0: ace
-
-/^(.+)?B/
-    AB
- 0: AB
-
-/^([^a-z])|(\^)$/
-    .
- 0: .
-
-/^[<>]&/
-    <&OUT
- 0: <&
-
-/(?:(f)(o)(o)|(b)(a)(r))*/
-    foobar
- 0: foobar
- 1: foo
- 2: 
-
-/(?<=a)b/
-    ab
- 0: b
-    *** Failers
-No match
-    cb
-No match
-    b
-No match
-
-/(?<!c)b/
-    ab
- 0: b
-    b
- 0: b
-    b
- 0: b
-
-/(?:..)*a/
-    aba
- 0: aba
- 1: a
-
-/(?:..)*?a/
-    aba
- 0: aba
- 1: a
-
-/^(){3,5}/
-    abc
- 0: 
-
-/^(a+)*ax/
-    aax
- 0: aax
-
-/^((a|b)+)*ax/
-    aax
- 0: aax
-
-/^((a|bc)+)*ax/
-    aax
- 0: aax
-
-/(a|x)*ab/
-    cab
- 0: ab
-
-/(a)*ab/
-    cab
- 0: ab
-
-/(?:(?i)a)b/
-    ab
- 0: ab
-
-/((?i)a)b/
-    ab
- 0: ab
-
-/(?:(?i)a)b/
-    Ab
- 0: Ab
-
-/((?i)a)b/
-    Ab
- 0: Ab
-
-/(?:(?i)a)b/
-    *** Failers
-No match
-    cb
-No match
-    aB
-No match
-
-/((?i)a)b/
-
-/(?i:a)b/
-    ab
- 0: ab
-
-/((?i:a))b/
-    ab
- 0: ab
-
-/(?i:a)b/
-    Ab
- 0: Ab
-
-/((?i:a))b/
-    Ab
- 0: Ab
-
-/(?i:a)b/
-    *** Failers
-No match
-    aB
-No match
-    aB
-No match
-
-/((?i:a))b/
-
-/(?:(?-i)a)b/i
-    ab
- 0: ab
-
-/((?-i)a)b/i
-    ab
- 0: ab
-
-/(?:(?-i)a)b/i
-    aB
- 0: aB
-
-/((?-i)a)b/i
-    aB
- 0: aB
-
-/(?:(?-i)a)b/i
-    *** Failers
-No match
-    aB
- 0: aB
-    Ab
-No match
-
-/((?-i)a)b/i
-
-/(?:(?-i)a)b/i
-    aB
- 0: aB
-
-/((?-i)a)b/i
-    aB
- 0: aB
-
-/(?:(?-i)a)b/i
-    *** Failers
-No match
-    Ab
-No match
-    AB
-No match
-
-/((?-i)a)b/i
-
-/(?-i:a)b/i
-    ab
- 0: ab
-
-/((?-i:a))b/i
-    ab
- 0: ab
-
-/(?-i:a)b/i
-    aB
- 0: aB
-
-/((?-i:a))b/i
-    aB
- 0: aB
-
-/(?-i:a)b/i
-    *** Failers
-No match
-    AB
-No match
-    Ab
-No match
-
-/((?-i:a))b/i
-
-/(?-i:a)b/i
-    aB
- 0: aB
-
-/((?-i:a))b/i
-    aB
- 0: aB
-
-/(?-i:a)b/i
-    *** Failers
-No match
-    Ab
-No match
-    AB
-No match
-
-/((?-i:a))b/i
-
-/((?-i:a.))b/i
-    *** Failers
-No match
-    AB
-No match
-    a\nB
-No match
-
-/((?s-i:a.))b/i
-    a\nB
- 0: a\x0aB
-
-/(?:c|d)(?:)(?:a(?:)(?:b)(?:b(?:))(?:b(?:)(?:b)))/
-    cabbbb
- 0: cabbbb
-
-/(?:c|d)(?:)(?:aaaaaaaa(?:)(?:bbbbbbbb)(?:bbbbbbbb(?:))(?:bbbbbbbb(?:)(?:bbbbbbbb)))/
-    caaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb
- 0: caaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb
-
-/foo\w*\d{4}baz/
-    foobar1234baz
- 0: foobar1234baz
-
-/x(~~)*(?:(?:F)?)?/
-    x~~
- 0: x~~
- 1: x
-
-/^a(?#xxx){3}c/
-    aaac
- 0: aaac
-
-/^a (?#xxx) (?#yyy) {3}c/x
-    aaac
- 0: aaac
-
-/(?<![cd])b/
-    *** Failers
-No match
-    B\nB
-No match
-    dbcb
-No match
-
-/(?<![cd])[ab]/
-    dbaacb
- 0: a
-
-/(?<!(c|d))b/
-
-/(?<!(c|d))[ab]/
-    dbaacb
- 0: a
-
-/(?<!cd)[ab]/
-    cdaccb
- 0: b
-
-/^(?:a?b?)*$/
-    *** Failers
-No match
-    dbcb
-No match
-    a--
-No match
-
-/((?s)^a(.))((?m)^b$)/
-    a\nb\nc\n
- 0: a\x0ab
-
-/((?m)^b$)/
-    a\nb\nc\n
- 0: b
-
-/(?m)^b/
-    a\nb\n
- 0: b
-
-/(?m)^(b)/
-    a\nb\n
- 0: b
-
-/((?m)^b)/
-    a\nb\n
- 0: b
-
-/\n((?m)^b)/
-    a\nb\n
- 0: \x0ab
-
-/((?s).)c(?!.)/
-    a\nb\nc\n
- 0: \x0ac
-    a\nb\nc\n
- 0: \x0ac
-
-/((?s)b.)c(?!.)/
-    a\nb\nc\n
- 0: b\x0ac
-    a\nb\nc\n
- 0: b\x0ac
-
-/^b/
-
-/()^b/
-    *** Failers
-No match
-    a\nb\nc\n
-No match
-    a\nb\nc\n
-No match
-
-/((?m)^b)/
-    a\nb\nc\n
- 0: b
-
-/(?(?!a)a|b)/
-
-/(?(?!a)b|a)/
-    a
- 0: a
-
-/(?(?=a)b|a)/
-    *** Failers
-No match
-    a
-No match
-    a
-No match
-
-/(?(?=a)a|b)/
-    a
- 0: a
-
-/(\w+:)+/
-    one:
- 0: one:
-
-/$(?<=^(a))/
-    a
- 0: 
-
-/([\w:]+::)?(\w+)$/
-    abcd
- 0: abcd
-    xy:z:::abcd
- 0: xy:z:::abcd
-
-/^[^bcd]*(c+)/
-    aexycd
- 0: aexyc
-
-/(a*)b+/
-    caab
- 0: aab
-
-/([\w:]+::)?(\w+)$/
-    abcd
- 0: abcd
-    xy:z:::abcd
- 0: xy:z:::abcd
-    *** Failers
- 0: Failers
-    abcd:
-No match
-    abcd:
-No match
-
-/^[^bcd]*(c+)/
-    aexycd
- 0: aexyc
-
-/(>a+)ab/
-
-/(?>a+)b/
-    aaab
- 0: aaab
-
-/([[:]+)/
-    a:[b]:
- 0: :[
- 1: :
-
-/([[=]+)/
-    a=[b]=
- 0: =[
- 1: =
-
-/([[.]+)/
-    a.[b].
- 0: .[
- 1: .
-
-/((?>a+)b)/
-    aaab
- 0: aaab
-
-/(?>(a+))b/
-    aaab
- 0: aaab
-
-/((?>[^()]+)|\([^()]*\))+/
-    ((abc(ade)ufh()()x
- 0: abc(ade)ufh()()x
- 1: abc(ade)ufh()()
- 2: abc(ade)ufh()
- 3: abc(ade)ufh
- 4: abc(ade)
- 5: abc
-
-/a\Z/
-    *** Failers
-No match
-    aaab
-No match
-    a\nb\n
-No match
-
-/b\Z/
-    a\nb\n
- 0: b
-
-/b\z/
-
-/b\Z/
-    a\nb
- 0: b
-
-/b\z/
-    a\nb
- 0: b
-    *** Failers
-No match
-    
-/(?>.*)(?<=(abcd|wxyz))/
-    alphabetabcd
- 0: alphabetabcd
-    endingwxyz
- 0: endingwxyz
-    *** Failers
-No match
-    a rather long string that doesn't end with one of them
-No match
-
-/word (?>(?:(?!otherword)[a-zA-Z0-9]+ ){0,30})otherword/
-    word cat dog elephant mussel cow horse canary baboon snake shark otherword
- 0: word cat dog elephant mussel cow horse canary baboon snake shark otherword
-    word cat dog elephant mussel cow horse canary baboon snake shark
-No match
-  
-/word (?>[a-zA-Z0-9]+ ){0,30}otherword/
-    word cat dog elephant mussel cow horse canary baboon snake shark the quick brown fox and the lazy dog and several other words getting close to thirty by now I hope
-No match
-
-/(?<=\d{3}(?!999))foo/
-    999foo
- 0: foo
-    123999foo 
- 0: foo
-    *** Failers
-No match
-    123abcfoo
-No match
-    
-/(?<=(?!...999)\d{3})foo/
-    999foo
- 0: foo
-    123999foo 
- 0: foo
-    *** Failers
-No match
-    123abcfoo
-No match
-
-/(?<=\d{3}(?!999)...)foo/
-    123abcfoo
- 0: foo
-    123456foo 
- 0: foo
-    *** Failers
-No match
-    123999foo  
-No match
-    
-/(?<=\d{3}...)(?<!999)foo/
-    123abcfoo   
- 0: foo
-    123456foo 
- 0: foo
-    *** Failers
-No match
-    123999foo  
-No match
-
-/((Z)+|A)*/
-    ZABCDEFG
- 0: ZA
- 1: Z
- 2: 
-
-/(Z()|A)*/
-    ZABCDEFG
- 0: ZA
- 1: Z
- 2: 
-
-/(Z(())|A)*/
-    ZABCDEFG
- 0: ZA
- 1: Z
- 2: 
-
-/((?>Z)+|A)*/
-    ZABCDEFG
- 0: ZA
- 1: Z
- 2: 
-
-/((?>)+|A)*/
-    ZABCDEFG
- 0: 
-
-/a*/g
-    abbab
- 0: a
- 1: 
- 0: 
- 0: 
- 0: a
- 1: 
- 0: 
- 0: 
-
-/^[a-\d]/
-    abcde
- 0: a
-    -things
- 0: -
-    0digit
- 0: 0
-    *** Failers
-No match
-    bcdef    
-No match
-
-/^[\d-a]/
-    abcde
- 0: a
-    -things
- 0: -
-    0digit
- 0: 0
-    *** Failers
-No match
-    bcdef    
-No match
-    
-/[[:space:]]+/
-    > \x09\x0a\x0c\x0d\x0b<
- 0:  \x09\x0a\x0c\x0d\x0b
- 1:  \x09\x0a\x0c\x0d
- 2:  \x09\x0a\x0c
- 3:  \x09\x0a
- 4:  \x09
- 5:  
-     
-/[[:blank:]]+/
-    > \x09\x0a\x0c\x0d\x0b<
- 0:  \x09
- 1:  
-     
-/[\s]+/
-    > \x09\x0a\x0c\x0d\x0b<
- 0:  \x09\x0a\x0c\x0d
- 1:  \x09\x0a\x0c
- 2:  \x09\x0a
- 3:  \x09
- 4:  
-     
-/\s+/
-    > \x09\x0a\x0c\x0d\x0b<
- 0:  \x09\x0a\x0c\x0d
- 1:  \x09\x0a\x0c
- 2:  \x09\x0a
- 3:  \x09
- 4:  
-     
-/a?b/x
-    ab
-No match
-
-/(?!\A)x/m
-  a\nxb\n
- 0: x
-
-/(?!^)x/m
-  a\nxb\n
-No match
-
-/abc\Qabc\Eabc/
-    abcabcabc
- 0: abcabcabc
-    
-/abc\Q(*+|\Eabc/
-    abc(*+|abc 
- 0: abc(*+|abc
-
-/   abc\Q abc\Eabc/x
-    abc abcabc
- 0: abc abcabc
-    *** Failers
-No match
-    abcabcabc  
-No match
-    
-/abc#comment
-    \Q#not comment
-    literal\E/x
-    abc#not comment\n    literal     
- 0: abc#not comment\x0a    literal
-
-/abc#comment
-    \Q#not comment
-    literal/x
-    abc#not comment\n    literal     
- 0: abc#not comment\x0a    literal
-
-/abc#comment
-    \Q#not comment
-    literal\E #more comment
-    /x
-    abc#not comment\n    literal     
- 0: abc#not comment\x0a    literal
-
-/abc#comment
-    \Q#not comment
-    literal\E #more comment/x
-    abc#not comment\n    literal     
- 0: abc#not comment\x0a    literal
-
-/\Qabc\$xyz\E/
-    abc\\\$xyz
- 0: abc\$xyz
-
-/\Qabc\E\$\Qxyz\E/
-    abc\$xyz
- 0: abc$xyz
-
-/\Gabc/
-    abc
+/[^\d]+/8WBZ
+------------------------------------------------------------------
+        Bra
+        [^\p{Nd}]+
+        Ket
+        End
+------------------------------------------------------------------
+    abc123
  0: abc
-    *** Failers
-No match
-    xyzabc  
-No match
-
-/\Gabc./g
-    abc1abc2xyzabc3
- 0: abc1
- 0: abc2
-
-/abc./g
-    abc1abc2xyzabc3 
- 0: abc1
- 0: abc2
- 0: abc3
-
-/a(?x: b c )d/
-    XabcdY
- 0: abcd
-    *** Failers 
-No match
-    Xa b c d Y 
-No match
-
-/((?x)x y z | a b c)/
-    XabcY
+    abc\x{123}
+ 0: abc\x{123}
+    \x{660}abc   
  0: abc
-    AxyzB 
- 0: xyz

-/(?i)AB(?-i)C/
-    XabCY
- 0: abC
-    *** Failers
-No match
-    XabcY  
-No match
+/\p{Lu}+9\p{Lu}+B\p{Lu}+b/BZ
+------------------------------------------------------------------
+        Bra
+        prop Lu ++
+        9
+        prop Lu +
+        B
+        prop Lu ++
+        b
+        Ket
+        End
+------------------------------------------------------------------

-/((?i)AB(?-i)C|D)E/
-    abCE
- 0: abCE
-    DE
- 0: DE
-    *** Failers
-No match
-    abcE
-No match
-    abCe  
-No match
-    dE
-No match
-    De    
-No match
+/\p{^Lu}+9\p{^Lu}+B\p{^Lu}+b/BZ
+------------------------------------------------------------------
+        Bra
+        notprop Lu +
+        9
+        notprop Lu ++
+        B
+        notprop Lu +
+        b
+        Ket
+        End
+------------------------------------------------------------------

-/[z\Qa-d]\E]/
-    z
- 0: z
-    a
- 0: a
-    -
- 0: -
-    d
- 0: d
-    ] 
- 0: ]
-    *** Failers
- 0: a
-    b     
-No match
+/\P{Lu}+9\P{Lu}+B\P{Lu}+b/BZ
+------------------------------------------------------------------
+        Bra
+        notprop Lu +
+        9
+        notprop Lu ++
+        B
+        notprop Lu +
+        b
+        Ket
+        End
+------------------------------------------------------------------

-/[\z\C]/
-    z
- 0: z
-    C 
- 0: C
-    
-/\M/
-    M 
- 0: M
-    
-/(a+)*b/
-    aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa 
-No match
-    
-/(?i)reg(?:ul(?:[a\xE4]|ae)r|ex)/
-    REGular
- 0: REGular
-    regulaer
- 0: regulaer
-    Regex  
- 0: Regex
-    regul\xE4r 
- 0: regul\xe4r
+/\p{Han}+X\p{Greek}+\x{370}/BZ8
+------------------------------------------------------------------
+        Bra
+        prop Han ++
+        X
+        prop Greek +
+        \x{370}
+        Ket
+        End
+------------------------------------------------------------------

-/\xC5\xE6\xE5\xE4[\xE0-\xFF\xC0-\xDF]+/
-    \xC5\xE6\xE5\xE4\xE0
- 0: \xc5\xe6\xe5\xe4\xe0
-    \xC5\xE6\xE5\xE4\xFF
- 0: \xc5\xe6\xe5\xe4\xff
-    \xC5\xE6\xE5\xE4\xC0
- 0: \xc5\xe6\xe5\xe4\xc0
-    \xC5\xE6\xE5\xE4\xDF
- 0: \xc5\xe6\xe5\xe4\xdf
+/\p{Xan}+!\p{Xan}+A/BZ
+------------------------------------------------------------------
+        Bra
+        prop Xan ++
+        !
+        prop Xan +
+        A
+        Ket
+        End
+------------------------------------------------------------------

-/(?<=Z)X./
-    \x84XAZXB
- 0: XB
+/\p{Xsp}+!\p{Xsp}\t/BZ
+------------------------------------------------------------------
+        Bra
+        prop Xsp ++
+        !
+        prop Xsp
+        \x09
+        Ket
+        End
+------------------------------------------------------------------

-/^(?(2)a|(1)(2))+$/
-    123a
-Error -17 (backreference condition or recursion test not supported for DFA matching)
+/\p{Xps}+!\p{Xps}\t/BZ
+------------------------------------------------------------------
+        Bra
+        prop Xps ++
+        !
+        prop Xps
+        \x09
+        Ket
+        End
+------------------------------------------------------------------

-/(?<=a|bbbb)c/
-    ac
- 0: c
-    bbbbc
- 0: c
+/\p{Xwd}+!\p{Xwd}_/BZ
+------------------------------------------------------------------
+        Bra
+        prop Xwd ++
+        !
+        prop Xwd
+        _
+        Ket
+        End
+------------------------------------------------------------------

-/abc/SS>testsavedregex
-Compiled pattern written to testsavedregex
-<testsavedregex
-Compiled pattern loaded from testsavedregex
-No study data
-    abc
- 0: abc
-    *** Failers
-No match
-    bca
-No match
-    
-/abc/FSS>testsavedregex
-Compiled pattern written to testsavedregex
-<testsavedregex
-Compiled pattern (byte-inverted) loaded from testsavedregex
-No study data
-    abc
- 0: abc
-    *** Failers
-No match
-    bca
-No match
+/A+\p{N}A+\dB+\p{N}*B+\d*/WBZ
+------------------------------------------------------------------
+        Bra
+        A++
+        prop N
+        A++
+        prop Nd
+        B+
+        prop N *+
+        B+
+        prop Nd *
+        Ket
+        End
+------------------------------------------------------------------

-/(a|b)/S>testsavedregex
-Compiled pattern written to testsavedregex
-Study data written to testsavedregex
-<testsavedregex
-Compiled pattern loaded from testsavedregex
-Study data loaded from testsavedregex
-    abc
- 0: a
-    *** Failers
- 0: a
-    def  
-No match
-    
-/(a|b)/SF>testsavedregex
-Compiled pattern written to testsavedregex
-Study data written to testsavedregex
-<testsavedregex
-Compiled pattern (byte-inverted) loaded from testsavedregex
-Study data loaded from testsavedregex
-    abc
- 0: a
-    *** Failers
- 0: a
-    def  
-No match
-    
-/line\nbreak/
-    this is a line\nbreak
- 0: line\x0abreak
-    line one\nthis is a line\nbreak in the second line 
- 0: line\x0abreak
+/-- These behaved oddly in Perl, so they are kept in this test --/

-/line\nbreak/f
-    this is a line\nbreak
- 0: line\x0abreak
-    ** Failers 
+/(\x{23a}\x{23a}\x{23a})?\1/8i
+    \x{23a}\x{23a}\x{23a}\x{2c65}\x{2c65}
 No match
-    line one\nthis is a line\nbreak in the second line 
-No match

-/line\nbreak/mf
-    this is a line\nbreak
- 0: line\x0abreak
-    ** Failers 
+/(ȺȺȺ)?\1/8i
+    ȺȺȺⱥⱥ
 No match
-    line one\nthis is a line\nbreak in the second line 
-No match

-/1234/
-    123\P
-Partial match: 123
-    a4\P\R
-No match
+/(\x{23a}\x{23a}\x{23a})?\1/8i
+    \x{23a}\x{23a}\x{23a}\x{2c65}\x{2c65}\x{2c65}
+ 0: \x{23a}\x{23a}\x{23a}\x{2c65}\x{2c65}\x{2c65}
+ 1: \x{23a}\x{23a}\x{23a}

-/1234/
-    123\P
-Partial match: 123
-    4\P\R
- 0: 4
+/(ȺȺȺ)?\1/8i
+    ȺȺȺⱥⱥⱥ
+ 0: \x{23a}\x{23a}\x{23a}\x{2c65}\x{2c65}\x{2c65}
+ 1: \x{23a}\x{23a}\x{23a}

-/^/mg
-    a\nb\nc\n
- 0: 
- 0: 
- 0: 
-    \ 
- 0: 
-    
-/(?<=C\n)^/mg
-    A\nC\nC\n 
- 0: 
-
-/(?s)A?B/
-    AB
- 0: AB
-    aB  
- 0: B
-
-/(?s)A*B/
-    AB
- 0: AB
-    aB  
- 0: B
-
-/(?m)A?B/
-    AB
- 0: AB
-    aB  
- 0: B
-
-/(?m)A*B/
-    AB
- 0: AB
-    aB  
- 0: B
-
-/Content-Type\x3A[^\r\n]{6,}/
-    Content-Type:xxxxxyyy 
- 0: Content-Type:xxxxxyyy
- 1: Content-Type:xxxxxyy
- 2: Content-Type:xxxxxy
-
-/Content-Type\x3A[^\r\n]{6,}z/
-    Content-Type:xxxxxyyyz
- 0: Content-Type:xxxxxyyyz
-
-/Content-Type\x3A[^a]{6,}/
-    Content-Type:xxxyyy 
- 0: Content-Type:xxxyyy
-
-/Content-Type\x3A[^a]{6,}z/
-    Content-Type:xxxyyyz
- 0: Content-Type:xxxyyyz
-
-/^abc/m
-    xyz\nabc
- 0: abc
-    xyz\nabc\<lf>
- 0: abc
-    xyz\r\nabc\<lf>
- 0: abc
-    xyz\rabc\<cr>
- 0: abc
-    xyz\r\nabc\<crlf>
- 0: abc
-    ** Failers 
+/(\x{23a}\x{23a}\x{23a})\1/8i
+    \x{23a}\x{23a}\x{23a}\x{2c65}\x{2c65}
 No match
-    xyz\nabc\<cr>
-No match
-    xyz\r\nabc\<cr>
-No match
-    xyz\nabc\<crlf>
-No match
-    xyz\rabc\<crlf>
-No match
-    xyz\rabc\<lf>
-No match
-    
-/abc$/m<lf>
-    xyzabc
- 0: abc
-    xyzabc\n 
- 0: abc
-    xyzabc\npqr 
- 0: abc
-    xyzabc\r\<cr> 
- 0: abc
-    xyzabc\rpqr\<cr> 
- 0: abc
-    xyzabc\r\n\<crlf> 
- 0: abc
-    xyzabc\r\npqr\<crlf> 
- 0: abc
-    ** Failers
-No match
-    xyzabc\r 
-No match
-    xyzabc\rpqr 
-No match
-    xyzabc\r\n 
-No match
-    xyzabc\r\npqr 
-No match
-    
-/^abc/m<cr>
-    xyz\rabcdef
- 0: abc
-    xyz\nabcdef\<lf>
- 0: abc
-    ** Failers  
-No match
-    xyz\nabcdef
-No match
-       
-/^abc/m<lf>
-    xyz\nabcdef
- 0: abc
-    xyz\rabcdef\<cr>
- 0: abc
-    ** Failers  
-No match
-    xyz\rabcdef
-No match
-       
-/^abc/m<crlf>
-    xyz\r\nabcdef
- 0: abc
-    xyz\rabcdef\<cr>
- 0: abc
-    ** Failers  
-No match
-    xyz\rabcdef
-No match
-    
-/.*/<lf>
-    abc\ndef
- 0: abc
- 1: ab
- 2: a
- 3: 
-    abc\rdef
- 0: abc\x0ddef
- 1: abc\x0dde
- 2: abc\x0dd
- 3: abc\x0d
- 4: abc
- 5: ab
- 6: a
- 7: 
-    abc\r\ndef
- 0: abc\x0d
- 1: abc
- 2: ab
- 3: a
- 4: 
-    \<cr>abc\ndef
- 0: abc\x0adef
- 1: abc\x0ade
- 2: abc\x0ad
- 3: abc\x0a
- 4: abc
- 5: ab
- 6: a
- 7: 
-    \<cr>abc\rdef
- 0: abc
- 1: ab
- 2: a
- 3: 
-    \<cr>abc\r\ndef
- 0: abc
- 1: ab
- 2: a
- 3: 
-    \<crlf>abc\ndef
- 0: abc\x0adef
- 1: abc\x0ade
- 2: abc\x0ad
- 3: abc\x0a
- 4: abc
- 5: ab
- 6: a
- 7: 
-    \<crlf>abc\rdef
- 0: abc\x0ddef
- 1: abc\x0dde
- 2: abc\x0dd
- 3: abc\x0d
- 4: abc
- 5: ab
- 6: a
- 7: 
-    \<crlf>abc\r\ndef
- 0: abc
- 1: ab
- 2: a
- 3:

-/\w+(.)(.)?def/s
-    abc\ndef
- 0: abc\x0adef
-    abc\rdef
- 0: abc\x0ddef
-    abc\r\ndef
- 0: abc\x0d\x0adef
-
-/^\w+=.*(\\\n.*)*/
-    abc=xyz\\\npqr
- 0: abc=xyz\\x0apqr
- 1: abc=xyz\\x0apq
- 2: abc=xyz\\x0ap
- 3: abc=xyz\\x0a
- 4: abc=xyz\
- 5: abc=xyz
- 6: abc=xy
- 7: abc=x
- 8: abc=
-
-/^(a()*)*/
-    aaaa
- 0: aaaa
- 1: aaa
- 2: aa
- 3: a
- 4: 
-
-/^(?:a(?:(?:))*)*/
-    aaaa
- 0: aaaa
- 1: aaa
- 2: aa
- 3: a
- 4: 
-
-/^(a()+)+/
-    aaaa
- 0: aaaa
- 1: aaa
- 2: aa
- 3: a
-
-/^(?:a(?:(?:))+)+/
-    aaaa
- 0: aaaa
- 1: aaa
- 2: aa
- 3: a
-
-/(a|)*\d/
-  aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
+/(ȺȺȺ)\1/8i
+    ȺȺȺⱥⱥ
 No match
-  aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa4
- 0: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa4

-/(?>a|)*\d/
-  aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
-No match
-  aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa4
- 0: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa4
+/(\x{23a}\x{23a}\x{23a})\1/8i
+    \x{23a}\x{23a}\x{23a}\x{2c65}\x{2c65}\x{2c65}
+ 0: \x{23a}\x{23a}\x{23a}\x{2c65}\x{2c65}\x{2c65}
+ 1: \x{23a}\x{23a}\x{23a}

-/(?:a|)*\d/
-  aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
-No match
-  aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa4
- 0: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa4
+/(ȺȺȺ)\1/8i
+    ȺȺȺⱥⱥⱥ
+ 0: \x{23a}\x{23a}\x{23a}\x{2c65}\x{2c65}\x{2c65}
+ 1: \x{23a}\x{23a}\x{23a}

-/^a.b/<lf>
-    a\rb
- 0: a\x0db
-    a\nb\<cr> 
- 0: a\x0ab
-    ** Failers
-No match
-    a\nb
-No match
-    a\nb\<any>
-No match
-    a\rb\<cr>   
-No match
-    a\rb\<any>   
-No match
-
-/^abc./mgx<any>
-    abc1 \x0aabc2 \x0babc3xx \x0cabc4 \x0dabc5xx \x0d\x0aabc6 \x85abc7 JUNK
- 0: abc1
- 0: abc2
- 0: abc3
- 0: abc4
- 0: abc5
- 0: abc6
- 0: abc7
-
-/abc.$/mgx<any>
-    abc1\x0a abc2\x0b abc3\x0c abc4\x0d abc5\x0d\x0a abc6\x85 abc9
- 0: abc1
- 0: abc2
- 0: abc3
- 0: abc4
- 0: abc5
- 0: abc6
- 0: abc9
-
-/^a\Rb/<bsr_unicode>
-    a\nb
- 0: a\x0ab
-    a\rb
- 0: a\x0db
-    a\r\nb
- 0: a\x0d\x0ab
-    a\x0bb
- 0: a\x0bb
-    a\x0cb
- 0: a\x0cb
-    a\x85b   
- 0: a\x85b
-    ** Failers
-No match
-    a\n\rb    
-No match
-
-/^a\R*b/<bsr_unicode>
-    ab
- 0: ab
-    a\nb
- 0: a\x0ab
-    a\rb
- 0: a\x0db
-    a\r\nb
- 0: a\x0d\x0ab
-    a\x0bb
- 0: a\x0bb
-    a\x0cb
- 0: a\x0cb
-    a\x85b   
- 0: a\x85b
-    a\n\rb    
- 0: a\x0a\x0db
-    a\n\r\x85\x0cb 
- 0: a\x0a\x0d\x85\x0cb
-
-/^a\R+b/<bsr_unicode>
-    a\nb
- 0: a\x0ab
-    a\rb
- 0: a\x0db
-    a\r\nb
- 0: a\x0d\x0ab
-    a\x0bb
- 0: a\x0bb
-    a\x0cb
- 0: a\x0cb
-    a\x85b   
- 0: a\x85b
-    a\n\rb    
- 0: a\x0a\x0db
-    a\n\r\x85\x0cb 
- 0: a\x0a\x0d\x85\x0cb
-    ** Failers
-No match
-    ab  
-No match
+/(\x{2c65}\x{2c65})\1/8i
+    \x{2c65}\x{2c65}\x{23a}\x{23a}
+ 0: \x{2c65}\x{2c65}\x{23a}\x{23a}
+ 1: \x{2c65}\x{2c65}

-/^a\R{1,3}b/<bsr_unicode>
-    a\nb
- 0: a\x0ab
-    a\n\rb
- 0: a\x0a\x0db
-    a\n\r\x85b
- 0: a\x0a\x0d\x85b
-    a\r\n\r\nb 
- 0: a\x0d\x0a\x0d\x0ab
-    a\r\n\r\n\r\nb 
- 0: a\x0d\x0a\x0d\x0a\x0d\x0ab
-    a\n\r\n\rb
- 0: a\x0a\x0d\x0a\x0db
-    a\n\n\r\nb 
- 0: a\x0a\x0a\x0d\x0ab
-    ** Failers
-No match
-    a\n\n\n\rb
-No match
-    a\r
-No match
-
-/^a[\R]b/<bsr_unicode>
-    aRb
- 0: aRb
-    ** Failers
-No match
-    a\nb  
-No match
-
-/.+foo/
-    afoo
- 0: afoo
-    ** Failers 
-No match
-    \r\nfoo 
-No match
-    \nfoo 
-No match
-
-/.+foo/<crlf>
-    afoo
- 0: afoo
-    \nfoo 
- 0: \x0afoo
-    ** Failers 
-No match
-    \r\nfoo 
-No match
-
-/.+foo/<any>
-    afoo
- 0: afoo
-    ** Failers 
-No match
-    \nfoo 
-No match
-    \r\nfoo 
-No match
-
-/.+foo/s
-    afoo
- 0: afoo
-    \r\nfoo 
- 0: \x0d\x0afoo
-    \nfoo 
- 0: \x0afoo
-
-/^$/mg<any>
-    abc\r\rxyz
- 0: 
-    abc\n\rxyz  
- 0: 
-    ** Failers 
-No match
-    abc\r\nxyz
-No match
-
-/^X/m
-    XABC
- 0: X
-    ** Failers 
-No match
-    XABC\B
-No match
-
-/(?m)^$/<any>g+
-    abc\r\n\r\n
- 0: 
- 0+ \x0d\x0a
-
-/(?m)^$|^\r\n/<any>g+ 
-    abc\r\n\r\n
- 0: \x0d\x0a
- 0+ 
- 1: 
+/(ⱥⱥ)\1/8i
+    ⱥⱥȺȺ 
+ 0: \x{2c65}\x{2c65}\x{23a}\x{23a}
+ 1: \x{2c65}\x{2c65}

-/(?m)$/<any>g+ 
-    abc\r\n\r\n
- 0: 
- 0+ \x0d\x0a\x0d\x0a
- 0: 
- 0+ \x0d\x0a
- 0: 
- 0+ 
+/(\x{23a}\x{23a}\x{23a})\1Y/8i
+    X\x{23a}\x{23a}\x{23a}\x{2c65}\x{2c65}\x{2c65}YZ
+ 0: \x{23a}\x{23a}\x{23a}\x{2c65}\x{2c65}\x{2c65}Y
+ 1: \x{23a}\x{23a}\x{23a}

-/(?|(abc)|(xyz))/
-   >abc<
- 0: abc
-   >xyz< 
- 0: xyz
+/(\x{2c65}\x{2c65})\1Y/8i
+    X\x{2c65}\x{2c65}\x{23a}\x{23a}YZ
+ 0: \x{2c65}\x{2c65}\x{23a}\x{23a}Y
+ 1: \x{2c65}\x{2c65}

-/(x)(?|(abc)|(xyz))(x)/
-    xabcx
- 0: xabcx
-    xxyzx 
- 0: xxyzx
+/-- --/

-/(x)(?|(abc)(pqr)|(xyz))(x)/
-    xabcpqrx
- 0: xabcpqrx
-    xxyzx 
- 0: xxyzx
+/-- These scripts weren't yet in Perl when I added Unicode 6.0.0 to PCRE --/

-/(?|(abc)|(xyz))(?1)/
-    abcabc
- 0: abcabc
-    xyzabc 
- 0: xyzabc
-    ** Failers 
-No match
-    xyzxyz 
-No match
- 
-/\H\h\V\v/
-    X X\x0a
- 0: X X\x0a
-    X\x09X\x0b
- 0: X\x09X\x0b
+/^[\p{Batak}]/8
+    \x{1bc0}
+ 0: \x{1bc0}
+    \x{1bff}
+ 0: \x{1bff}
     ** Failers
 No match
-    \xa0 X\x0a   
+    \x{1bf4}
 No match

-/\H*\h+\V?\v{3,4}/ 
-    \x09\x20\xa0X\x0a\x0b\x0c\x0d\x0a
- 0: \x09 \xa0X\x0a\x0b\x0c\x0d
- 1: \x09 \xa0X\x0a\x0b\x0c
-    \x09\x20\xa0\x0a\x0b\x0c\x0d\x0a
- 0: \x09 \xa0\x0a\x0b\x0c\x0d
- 1: \x09 \xa0\x0a\x0b\x0c
-    \x09\x20\xa0\x0a\x0b\x0c
- 0: \x09 \xa0\x0a\x0b\x0c
-    ** Failers 
-No match
-    \x09\x20\xa0\x0a\x0b
-No match
-     
-/\H{3,4}/
-    XY  ABCDE
- 0: ABCD
- 1: ABC
-    XY  PQR ST 
- 0: PQR
-    
-/.\h{3,4}./
-    XY  AB    PQRS
- 0: B    P
- 1: B    
-
-/\h*X\h?\H+Y\H?Z/
-    >XNNNYZ
- 0: XNNNYZ
-    >  X NYQZ
- 0:   X NYQZ
+/^[\p{Brahmi}]/8
+    \x{11000}
+ 0: \x{11000}
+    \x{1106f}
+ 0: \x{1106f}
     ** Failers
 No match
-    >XYZ   
+    \x{1104e}
 No match
-    >  X NY Z
-No match
-
-/\v*X\v?Y\v+Z\V*\x0a\V+\x0b\V{2,3}\x0c/
-    >XY\x0aZ\x0aA\x0bNN\x0c
- 0: XY\x0aZ\x0aA\x0bNN\x0c
-    >\x0a\x0dX\x0aY\x0a\x0bZZZ\x0aAAA\x0bNNN\x0c
- 0: \x0a\x0dX\x0aY\x0a\x0bZZZ\x0aAAA\x0bNNN\x0c
-
-/.+A/<crlf>
-    \r\nA
-No match

-/\nA/<crlf>
-    \r\nA 
- 0: \x0aA
-
-/[\r\n]A/<crlf>
-    \r\nA 
- 0: \x0aA
-
-/(\r|\n)A/<crlf>
-    \r\nA 
- 0: \x0aA
-
-/a\Rb/I<bsr_anycrlf>
-Capturing subpattern count = 0
-Options: bsr_anycrlf
-First char = 'a'
-Need char = 'b'
-    a\rb
- 0: a\x0db
-    a\nb
- 0: a\x0ab
-    a\r\nb
- 0: a\x0d\x0ab
+/^[\p{Mandaic}]/8
+    \x{840}
+ 0: \x{840}
+    \x{85e}
+ 0: \x{85e}
     ** Failers
 No match
-    a\x85b
+    \x{85c}
 No match
-    a\x0bb     
+    \x{85d}    
 No match

-/a\Rb/I<bsr_unicode>
-Capturing subpattern count = 0
-Options: bsr_unicode
-First char = 'a'
-Need char = 'b'
-    a\rb
- 0: a\x0db
-    a\nb
- 0: a\x0ab
-    a\r\nb
- 0: a\x0d\x0ab
-    a\x85b
- 0: a\x85b
-    a\x0bb     
- 0: a\x0bb
-    ** Failers 
-No match
-    a\x85b\<bsr_anycrlf>
-No match
-    a\x0bb\<bsr_anycrlf>
-No match
-    
-/a\R?b/I<bsr_anycrlf>
-Capturing subpattern count = 0
-Options: bsr_anycrlf
-First char = 'a'
-Need char = 'b'
-    a\rb
- 0: a\x0db
-    a\nb
- 0: a\x0ab
-    a\r\nb
- 0: a\x0d\x0ab
-    ** Failers
-No match
-    a\x85b
-No match
-    a\x0bb     
-No match
+/-- --/

-/a\R?b/I<bsr_unicode>
-Capturing subpattern count = 0
-Options: bsr_unicode
-First char = 'a'
-Need char = 'b'
-    a\rb
- 0: a\x0db
-    a\nb
- 0: a\x0ab
-    a\r\nb
- 0: a\x0d\x0ab
-    a\x85b
- 0: a\x85b
-    a\x0bb     
- 0: a\x0bb
-    ** Failers 
-No match
-    a\x85b\<bsr_anycrlf>
-No match
-    a\x0bb\<bsr_anycrlf>
-No match
-    
-/a\R{2,4}b/I<bsr_anycrlf>
-Capturing subpattern count = 0
-Options: bsr_anycrlf
-First char = 'a'
-Need char = 'b'
-    a\r\n\nb
- 0: a\x0d\x0a\x0ab
-    a\n\r\rb
- 0: a\x0a\x0d\x0db
-    a\r\n\r\n\r\n\r\nb
- 0: a\x0d\x0a\x0d\x0a\x0d\x0a\x0d\x0ab
-    ** Failers
-No match
-    a\x85\85b
-No match
-    a\x0b\0bb     
-No match
+/(\X*)(.)/s8
+    A\x{300}
+ 0: A
+ 1: 
+ 2: A

-/a\R{2,4}b/I<bsr_unicode>
-Capturing subpattern count = 0
-Options: bsr_unicode
-First char = 'a'
-Need char = 'b'
-    a\r\rb
- 0: a\x0d\x0db
-    a\n\n\nb
- 0: a\x0a\x0a\x0ab
-    a\r\n\n\r\rb
- 0: a\x0d\x0a\x0a\x0d\x0db
-    a\x85\85b
+/^S(\X*)e(\X*)$/8
+    Stéréo
 No match
-    a\x0b\0bb     
-No match
-    ** Failers 
-No match
-    a\r\r\r\r\rb 
-No match
-    a\x85\85b\<bsr_anycrlf>
-No match
-    a\x0b\0bb\<bsr_anycrlf>
-No match

-/a(?!)|\wbc/
-    abc 
- 0: abc
-
-/a[]b/<JS>
-    ** Failers
+/^\X/8 
+    ́réo
 No match
-    ab
-No match

-/a[]+b/<JS>
-    ** Failers
+/^a\X41z/<JS>
+    aX41z
+ 0: aX41z
+    *** Failers
 No match
-    ab 
+    aAz
 No match

-/a[]*+b/<JS>
-    ** Failers
-No match
-    ab 
-No match
+/(?<=ab\Cde)X/8
+Failed: \C not allowed in lookbehind assertion at offset 10

-/a[^]b/<JS>
-    aXb
- 0: aXb
-    a\nb 
- 0: a\x0ab
-    ** Failers
-No match
-    ab  
-No match
-    
-/a[^]+b/<JS> 
-    aXb
- 0: aXb
-    a\nX\nXb 
- 0: a\x0aX\x0aXb
-    ** Failers
-No match
-    ab  
-No match
-
-/X$/E
-    X
- 0: X
-    ** Failers 
-No match
-    X\n 
-No match
-
-/X$/
-    X
- 0: X
-    X\n 
- 0: X
-
-/xyz/C
-  xyz 
---->xyz
- +0 ^       x
- +1 ^^      y
- +2 ^ ^     z
- +3 ^  ^    
- 0: xyz
-  abcxyz 
---->abcxyz
- +0    ^       x
- +1    ^^      y
- +2    ^ ^     z
- +3    ^  ^    
- 0: xyz
-  abcxyz\Y
---->abcxyz
- +0 ^          x
- +0  ^         x
- +0   ^        x
- +0    ^       x
- +1    ^^      y
- +2    ^ ^     z
- +3    ^  ^    
- 0: xyz
-  ** Failers 
-No match
-  abc
-No match
-  abc\Y
---->abc
- +0 ^       x
- +0  ^      x
- +0   ^     x
- +0    ^    x
-No match
-  abcxypqr  
-No match
-  abcxypqr\Y  
---->abcxypqr
- +0 ^            x
- +0  ^           x
- +0   ^          x
- +0    ^         x
- +1    ^^        y
- +2    ^ ^       z
- +0     ^        x
- +0      ^       x
- +0       ^      x
- +0        ^     x
- +0         ^    x
-No match
-
-/(*NO_START_OPT)xyz/C
-  abcxyz 
---->abcxyz
-+15 ^          x
-+15  ^         x
-+15   ^        x
-+15    ^       x
-+16    ^^      y
-+17    ^ ^     z
-+18    ^  ^    
- 0: xyz
-  
-/(?C)ab/
-  ab
---->ab
-  0 ^      a
- 0: ab
-  \C-ab
- 0: ab
-  
-/ab/C
-  ab
---->ab
- +0 ^      a
- +1 ^^     b
- +2 ^ ^    
- 0: ab
-  \C-ab    
- 0: ab
-
-/^"((?(?=[a])[^"])|b)*"$/C
-    "ab"
---->"ab"
- +0 ^        ^
- +1 ^        "
- +2 ^^       ((?(?=[a])[^"])|b)*
-+21 ^^       "
- +3 ^^       (?(?=[a])[^"])
-+18 ^^       b
- +5 ^^       (?=[a])
- +8  ^       [a]
-+11  ^^      )
-+12 ^^       [^"]
-+16 ^ ^      )
-+17 ^ ^      |
-+21 ^ ^      "
- +3 ^ ^      (?(?=[a])[^"])
-+18 ^ ^      b
- +5 ^ ^      (?=[a])
- +8   ^      [a]
-+19 ^  ^     )
-+21 ^  ^     "
- +3 ^  ^     (?(?=[a])[^"])
-+18 ^  ^     b
- +5 ^  ^     (?=[a])
- +8    ^     [a]
-+17 ^  ^     |
-+22 ^   ^    $
-+23 ^   ^    
- 0: "ab"
-    \C-"ab"
- 0: "ab"
-
-/\d+X|9+Y/
-    ++++123999\P
-Partial match: 123999
-    ++++123999Y\P
- 0: 999Y
-
-/Z(*F)/
-    Z\P
-No match
-    ZA\P 
-No match
-    
-/Z(?!)/
-    Z\P 
-No match
-    ZA\P 
-No match
-
-/dog(sbody)?/
-    dogs\P
- 0: dog
-    dogs\P\P 
-Partial match: dogs
-    
-/dog(sbody)??/
-    dogs\P
- 0: dog
-    dogs\P\P 
-Partial match: dogs
-
-/dog|dogsbody/
-    dogs\P
- 0: dog
-    dogs\P\P 
-Partial match: dogs
- 
-/dogsbody|dog/
-    dogs\P
- 0: dog
-    dogs\P\P 
-Partial match: dogs
-
-/Z(*F)Q|ZXY/
-    Z\P
-Partial match: Z
-    ZA\P 
-No match
-    X\P 
-No match
-
-/\bthe cat\b/
-    the cat\P
- 0: the cat
-    the cat\P\P
-Partial match: the cat
-
-/dog(sbody)?/
-    dogs\D\P
- 0: dog
-    body\D\R
- 0: body
-
-/dog(sbody)?/
-    dogs\D\P\P
-Partial match: dogs
-    body\D\R
- 0: body
-
-/abc/
-   abc\P
- 0: abc
-   abc\P\P
- 0: abc
-
-/abc\K123/
-    xyzabc123pqr
-Error -16 (item unsupported for DFA matching)
-    
-/(?<=abc)123/
-    xyzabc123pqr 
- 0: 123
-    xyzabc12\P
-Partial match: abc12
-    xyzabc12\P\P
-Partial match: abc12
-
-/\babc\b/
-    +++abc+++
- 0: abc
-    +++ab\P
-Partial match: +ab
-    +++ab\P\P  
-Partial match: +ab
-
-/(?=C)/g+
-    ABCDECBA
- 0: 
- 0+ CDECBA
- 0: 
- 0+ CBA
-
-/(abc|def|xyz)/I
-Capturing subpattern count = 1
-No options
-No first char
-No need char
-    terhjk;abcdaadsfe
- 0: abc
-    the quick xyz brown fox 
- 0: xyz
-    \Yterhjk;abcdaadsfe
- 0: abc
-    \Ythe quick xyz brown fox 
- 0: xyz
-    ** Failers
-No match
-    thejk;adlfj aenjl;fda asdfasd ehj;kjxyasiupd
-No match
-    \Ythejk;adlfj aenjl;fda asdfasd ehj;kjxyasiupd
-No match
-
-/(abc|def|xyz)/SI
-Capturing subpattern count = 1
-No options
-No first char
-No need char
-Subject length lower bound = 3
-Starting byte set: a d x 
-    terhjk;abcdaadsfe
- 0: abc
-    the quick xyz brown fox 
- 0: xyz
-    \Yterhjk;abcdaadsfe
- 0: abc
-    \Ythe quick xyz brown fox 
- 0: xyz
-    ** Failers
-No match
-    thejk;adlfj aenjl;fda asdfasd ehj;kjxyasiupd
-No match
-    \Ythejk;adlfj aenjl;fda asdfasd ehj;kjxyasiupd
-No match
-
-/abcd*/+
-    xxxxabcd\P
- 0: abcd
- 0+ 
- 1: abc
-    xxxxabcd\P\P
-Partial match: abcd
-    dddxxx\R 
- 0: ddd
- 0+ xxx
- 1: dd
- 2: d
- 3: 
-    xxxxabcd\P\P
-Partial match: abcd
-    xxx\R 
- 0: 
- 0+ xxx
-
-/abcd*/i
-    xxxxabcd\P
- 0: abcd
- 1: abc
-    xxxxabcd\P\P
-Partial match: abcd
-    XXXXABCD\P
- 0: ABCD
- 1: ABC
-    XXXXABCD\P\P
-Partial match: ABCD
-
-/abc\d*/
-    xxxxabc1\P
- 0: abc1
- 1: abc
-    xxxxabc1\P\P
-Partial match: abc1
-
-/abc[de]*/
-    xxxxabcde\P
- 0: abcde
- 1: abcd
- 2: abc
-    xxxxabcde\P\P
-Partial match: abcde
-
-/(?:(?1)|B)(A(*F)|C)/
-    ABCD
- 0: BC
-    CCD
- 0: CC
-    ** Failers
-No match
-    CAD   
-No match
-
-/^(?:(?1)|B)(A(*F)|C)/
-    CCD
- 0: CC
-    BCD 
- 0: BC
-    ** Failers
-No match
-    ABCD
-No match
-    CAD
-No match
-    BAD    
-No match
-
-/^(?!a(*SKIP)b)/
-    ac
-Error -16 (item unsupported for DFA matching)
-    
-/^(?=a(*SKIP)b|ac)/
-    ** Failers
-No match
-    ac
-Error -16 (item unsupported for DFA matching)
-    
-/^(?=a(*THEN)b|ac)/
-    ac
-Error -16 (item unsupported for DFA matching)
-    
-/^(?=a(*PRUNE)b)/
-    ab  
-Error -16 (item unsupported for DFA matching)
-    ** Failers 
-No match
-    ac
-Error -16 (item unsupported for DFA matching)
-
-/^(?(?!a(*SKIP)b))/
-    ac
-Error -16 (item unsupported for DFA matching)
-
-/(?<=abc)def/
-    abc\P\P
-Partial match: abc
-
-/abc$/
-    abc
- 0: abc
-    abc\P
- 0: abc
-    abc\P\P
-Partial match: abc
-
-/abc$/m
-    abc
- 0: abc
-    abc\n
- 0: abc
-    abc\P\P
-Partial match: abc
-    abc\n\P\P 
- 0: abc
-    abc\P
- 0: abc
-    abc\n\P
- 0: abc
-
-/abc\z/
-    abc
- 0: abc
-    abc\P
- 0: abc
-    abc\P\P
-Partial match: abc
-
-/abc\Z/
-    abc
- 0: abc
-    abc\P
- 0: abc
-    abc\P\P
-Partial match: abc
-
-/abc\b/
-    abc
- 0: abc
-    abc\P
- 0: abc
-    abc\P\P
-Partial match: abc
-
-/abc\B/
-    abc
-No match
-    abc\P
-Partial match: abc
-    abc\P\P
-Partial match: abc
-
-/.+/
-    abc\>0
- 0: abc
- 1: ab
- 2: a
-    abc\>1
- 0: bc
- 1: b
-    abc\>2
- 0: c
-    abc\>3
-No match
-    abc\>4
-Error -24 (bad offset value)
-    abc\>-4 
-Error -24 (bad offset value)
-
-/^(?:a)++\w/
-     aaaab
- 0: aaaab
-     ** Failers 
-No match
-     aaaa 
-No match
-     bbb 
-No match
-
-/^(?:aa|(?:a)++\w)/
-     aaaab
- 0: aaaab
- 1: aa
-     aaaa 
- 0: aa
-     ** Failers 
-No match
-     bbb 
-No match
-
-/^(?:a)*+\w/
-     aaaab
- 0: aaaab
-     bbb 
- 0: b
-     ** Failers 
-No match
-     aaaa 
-No match
-
-/^(a)++\w/
-     aaaab
- 0: aaaab
-     ** Failers 
-No match
-     aaaa 
-No match
-     bbb 
-No match
-
-/^(a|)++\w/
-     aaaab
- 0: aaaab
-     ** Failers 
-No match
-     aaaa 
-No match
-     bbb 
-No match
-
-/(?=abc){3}abc/+
-    abcabcabc
- 0: abc
- 0+ abcabc
-    ** Failers
-No match
-    xyz  
-No match
-    
-/(?=abc)+abc/+
-    abcabcabc
- 0: abc
- 0+ abcabc
-    ** Failers
-No match
-    xyz  
-No match
-    
-/(?=abc)++abc/+
-    abcabcabc
- 0: abc
- 0+ abcabc
-    ** Failers
-No match
-    xyz  
-No match
-    
-/(?=abc){0}xyz/
-    xyz 
- 0: xyz
-
-/(?=abc){1}xyz/
-    ** Failers
-No match
-    xyz 
-No match
-    
-/(?=(a))?./
-    ab
- 0: a
-    bc
- 0: b
-      
-/(?=(a))??./
-    ab
- 0: a
-    bc
- 0: b
-
-/^(?=(a)){0}b(?1)/
-    backgammon
- 0: ba
-
-/^(?=(?1))?[az]([abc])d/
-    abd 
- 0: abd
-    zcdxx 
- 0: zcd
-
-/^(?!a){0}\w+/
-    aaaaa
- 0: aaaaa
- 1: aaaa
- 2: aaa
- 3: aa
- 4: a
-
-/(?<=(abc))?xyz/
-    abcxyz
- 0: xyz
-    pqrxyz 
- 0: xyz
-
-/((?2))((?1))/
-    abc
-Error -26 (nested recursion at the same subject position)
-
-/(?(R)a+|(?R)b)/
-    aaaabcde
- 0: aaaab
-
-/(?(R)a+|((?R))b)/
-    aaaabcde
- 0: aaaab
-
-/((?(R)a+|(?1)b))/
-    aaaabcde
- 0: aaaab
-
-/((?(R2)a+|(?1)b))/
-    aaaabcde
-Error -17 (backreference condition or recursion test not supported for DFA matching)
-
-/(?(R)a*(?1)|((?R))b)/
-    aaaabcde
-Error -26 (nested recursion at the same subject position)
-
-/(a+)/
-    \O6aaaa
-Matched, but too many subsidiary matches
- 0: aaaa
- 1: aaa
- 2: aa
-    \O8aaaa
- 0: aaaa
- 1: aaa
- 2: aa
- 3: a
-
-/ab\Cde/
-    abXde
- 0: abXde
-    
-/(?<=ab\Cde)X/
-    abZdeX
- 0: X
-
 /-- End of testinput7 --/

Modified: code/trunk/testdata/testoutput8
===================================================================
--- code/trunk/testdata/testoutput8    2011-12-28 16:10:09 UTC (rev 835)
+++ code/trunk/testdata/testoutput8    2011-12-28 17:16:11 UTC (rev 836)
@@ -1,1039 +1,6813 @@
-/-- This set of tests checks UTF-8 support with the DFA matching functionality
-    of pcre_dfa_exec(). The -dfa flag must be used with pcretest when running 
-    it. --/
-
-/\x{100}ab/8
-  \x{100}ab
- 0: \x{100}ab
-  
-/a\x{100}*b/8
-    ab
- 0: ab
-    a\x{100}b  
- 0: a\x{100}b
-    a\x{100}\x{100}b  
- 0: a\x{100}\x{100}b
+/-- This set of tests check the DFA matching functionality of pcre_dfa_exec().
+    The -dfa flag must be used with pcretest when running it. --/
+     
+/abc/
+    abc
+ 0: abc

-/a\x{100}+b/8
-    a\x{100}b  
- 0: a\x{100}b
-    a\x{100}\x{100}b  
- 0: a\x{100}\x{100}b
+/ab*c/
+    abc
+ 0: abc
+    abbbbc
+ 0: abbbbc
+    ac
+ 0: ac
+    
+/ab+c/
+    abc
+ 0: abc
+    abbbbbbc
+ 0: abbbbbbc
     *** Failers 
 No match
+    ac
+No match
     ab
 No match
+    
+/a*/
+    a
+ 0: a
+ 1: 
+    aaaaaaaaaaaaaaaaa
+ 0: aaaaaaaaaaaaaaaaa
+ 1: aaaaaaaaaaaaaaaa
+ 2: aaaaaaaaaaaaaaa
+ 3: aaaaaaaaaaaaaa
+ 4: aaaaaaaaaaaaa
+ 5: aaaaaaaaaaaa
+ 6: aaaaaaaaaaa
+ 7: aaaaaaaaaa
+ 8: aaaaaaaaa
+ 9: aaaaaaaa
+10: aaaaaaa
+11: aaaaaa
+12: aaaaa
+13: aaaa
+14: aaa
+15: aa
+16: a
+17: 
+    aaaaaaaaaaaaaaaaaaaaaaaaaaaaaa 
+Matched, but too many subsidiary matches
+ 0: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
+ 1: aaaaaaaaaaaaaaaaaaaaaaaaaaaaa
+ 2: aaaaaaaaaaaaaaaaaaaaaaaaaaaa
+ 3: aaaaaaaaaaaaaaaaaaaaaaaaaaa
+ 4: aaaaaaaaaaaaaaaaaaaaaaaaaa
+ 5: aaaaaaaaaaaaaaaaaaaaaaaaa
+ 6: aaaaaaaaaaaaaaaaaaaaaaaa
+ 7: aaaaaaaaaaaaaaaaaaaaaaa
+ 8: aaaaaaaaaaaaaaaaaaaaaa
+ 9: aaaaaaaaaaaaaaaaaaaaa
+10: aaaaaaaaaaaaaaaaaaaa
+11: aaaaaaaaaaaaaaaaaaa
+12: aaaaaaaaaaaaaaaaaa
+13: aaaaaaaaaaaaaaaaa
+14: aaaaaaaaaaaaaaaa
+15: aaaaaaaaaaaaaaa
+16: aaaaaaaaaaaaaa
+17: aaaaaaaaaaaaa
+18: aaaaaaaaaaaa
+19: aaaaaaaaaaa
+20: aaaaaaaaaa
+21: aaaaaaaaa
+    aaaaaaaaaaaaaaaaaaaaaaaaaaaaaa\F 
+ 0: 
+    
+/(a|abcd|african)/
+    a
+ 0: a
+    abcd
+ 0: abcd
+ 1: a
+    african
+ 0: african
+ 1: a
+    
+/^abc/
+    abcdef
+ 0: abc
+    *** Failers
+No match
+    xyzabc
+No match
+    xyz\nabc    
+No match
+    
+/^abc/m
+    abcdef
+ 0: abc
+    xyz\nabc    
+ 0: abc
+    *** Failers
+No match
+    xyzabc
+No match
+    
+/\Aabc/
+    abcdef
+ 0: abc
+    *** Failers
+No match
+    xyzabc
+No match
+    xyz\nabc    
+No match
+    
+/\Aabc/m
+    abcdef
+ 0: abc
+    *** Failers
+No match
+    xyzabc
+No match
+    xyz\nabc    
+No match
+    
+/\Gabc/
+    abcdef
+ 0: abc
+    xyzabc\>3
+ 0: abc
+    *** Failers
+No match
+    xyzabc    
+No match
+    xyzabc\>2 
+No match
+    
+/x\dy\Dz/
+    x9yzz
+ 0: x9yzz
+    x0y+z
+ 0: x0y+z
+    *** Failers
+No match
+    xyz
+No match
+    xxy0z     
+No match
+    
+/x\sy\Sz/
+    x yzz
+ 0: x yzz
+    x y+z
+ 0: x y+z
+    *** Failers
+No match
+    xyz
+No match
+    xxyyz
+No match
+    
+/x\wy\Wz/
+    xxy+z
+ 0: xxy+z
+    *** Failers
+No match
+    xxy0z
+No match
+    x+y+z         
+No match
+    
+/x.y/
+    x+y
+ 0: x+y
+    x-y
+ 0: x-y
+    *** Failers
+No match
+    x\ny
+No match
+    
+/x.y/s
+    x+y
+ 0: x+y
+    x-y
+ 0: x-y
+    x\ny
+ 0: x\x0ay
+
+/(a.b(?s)c.d|x.y)p.q/
+    a+bc+dp+q
+ 0: a+bc+dp+q
+    a+bc\ndp+q
+ 0: a+bc\x0adp+q
+    x\nyp+q 
+ 0: x\x0ayp+q
+    *** Failers 
+No match
+    a\nbc\ndp+q
+No match
+    a+bc\ndp\nq
+No match
+    x\nyp\nq 
+No match
+
+/a\d\z/
+    ba0
+ 0: a0
+    *** Failers
+No match
+    ba0\n
+No match
+    ba0\ncd   
+No match
+
+/a\d\z/m
+    ba0
+ 0: a0
+    *** Failers
+No match
+    ba0\n
+No match
+    ba0\ncd   
+No match
+
+/a\d\Z/
+    ba0
+ 0: a0
+    ba0\n
+ 0: a0
+    *** Failers
+No match
+    ba0\ncd   
+No match
+
+/a\d\Z/m
+    ba0
+ 0: a0
+    ba0\n
+ 0: a0
+    *** Failers
+No match
+    ba0\ncd   
+No match
+
+/a\d$/
+    ba0
+ 0: a0
+    ba0\n
+ 0: a0
+    *** Failers
+No match
+    ba0\ncd   
+No match
+
+/a\d$/m
+    ba0
+ 0: a0
+    ba0\n
+ 0: a0
+    ba0\ncd   
+ 0: a0
+    *** Failers
+No match
+
+/abc/i
+    abc
+ 0: abc
+    aBc
+ 0: aBc
+    ABC
+ 0: ABC
+    
+/[^a]/
+    abcd
+ 0: b
+    
+/ab?\w/
+    abz
+ 0: abz
+ 1: ab
+    abbz
+ 0: abb
+ 1: ab
+    azz  
+ 0: az
+
+/x{0,3}yz/
+    ayzq
+ 0: yz
+    axyzq
+ 0: xyz
+    axxyz
+ 0: xxyz
+    axxxyzq
+ 0: xxxyz
+    axxxxyzq
+ 0: xxxyz
+    *** Failers
+No match
+    ax
+No match
+    axx     
+No match
+      
+/x{3}yz/
+    axxxyzq
+ 0: xxxyz
+    axxxxyzq
+ 0: xxxyz
+    *** Failers
+No match
+    ax
+No match
+    axx     
+No match
+    ayzq
+No match
+    axyzq
+No match
+    axxyz
+No match
+      
+/x{2,3}yz/
+    axxyz
+ 0: xxyz
+    axxxyzq
+ 0: xxxyz
+    axxxxyzq
+ 0: xxxyz
+    *** Failers
+No match
+    ax
+No match
+    axx     
+No match
+    ayzq
+No match
+    axyzq
+No match
+      
+/[^a]+/
+    bac
+ 0: b
+    bcdefax
+ 0: bcdef
+ 1: bcde
+ 2: bcd
+ 3: bc
+ 4: b
+    *** Failers
+ 0: *** F
+ 1: *** 
+ 2: ***
+ 3: **
+ 4: *
+    aaaaa   
+No match
+
+/[^a]*/
+    bac
+ 0: b
+ 1: 
+    bcdefax
+ 0: bcdef
+ 1: bcde
+ 2: bcd
+ 3: bc
+ 4: b
+ 5: 
+    *** Failers
+ 0: *** F
+ 1: *** 
+ 2: ***
+ 3: **
+ 4: *
+ 5: 
+    aaaaa   
+ 0: 
+    
+/[^a]{3,5}/
+    xyz
+ 0: xyz
+    awxyza
+ 0: wxyz
+ 1: wxy
+    abcdefa
+ 0: bcdef
+ 1: bcde
+ 2: bcd
+    abcdefghijk
+ 0: bcdef
+ 1: bcde
+ 2: bcd
+    *** Failers
+ 0: *** F
+ 1: *** 
+ 2: ***
+    axya
+No match
+    axa
+No match
+    aaaaa         
+No match
+
+/\d*/
+    1234b567
+ 0: 1234
+ 1: 123
+ 2: 12
+ 3: 1
+ 4: 
+    xyz
+ 0: 
+    
+/\D*/
+    a1234b567
+ 0: a
+ 1: 
+    xyz
+ 0: xyz
+ 1: xy
+ 2: x
+ 3:

-/\bX/8
-    Xoanon
- 0: X
-    +Xoanon
- 0: X
-    \x{300}Xoanon 
- 0: X
+/\d+/
+    ab1234c56
+ 0: 1234
+ 1: 123
+ 2: 12
+ 3: 1
+    *** Failers
+No match
+    xyz
+No match
+    
+/\D+/
+    ab123c56
+ 0: ab
+ 1: a
+    *** Failers
+ 0: *** Failers
+ 1: *** Failer
+ 2: *** Faile
+ 3: *** Fail
+ 4: *** Fai
+ 5: *** Fa
+ 6: *** F
+ 7: *** 
+ 8: ***
+ 9: **
+10: *
+    789
+No match
+    
+/\d?A/
+    045ABC
+ 0: 5A
+    ABC
+ 0: A
+    *** Failers
+No match
+    XYZ
+No match
+    
+/\D?A/
+    ABC
+ 0: A
+    BAC
+ 0: BA
+    9ABC             
+ 0: A
+    *** Failers
+No match
+
+/a+/
+    aaaa
+ 0: aaaa
+ 1: aaa
+ 2: aa
+ 3: a
+
+/^.*xyz/
+    xyz
+ 0: xyz
+    ggggggggxyz
+ 0: ggggggggxyz
+    
+/^.+xyz/
+    abcdxyz
+ 0: abcdxyz
+    axyz
+ 0: axyz
+    *** Failers
+No match
+    xyz
+No match
+    
+/^.?xyz/
+    xyz
+ 0: xyz
+    cxyz       
+ 0: cxyz
+
+/^\d{2,3}X/
+    12X
+ 0: 12X
+    123X
+ 0: 123X
+    *** Failers
+No match
+    X
+No match
+    1X
+No match
+    1234X     
+No match
+
+/^[abcd]\d/
+    a45
+ 0: a4
+    b93
+ 0: b9
+    c99z
+ 0: c9
+    d04
+ 0: d0
+    *** Failers
+No match
+    e45
+No match
+    abcd      
+No match
+    abcd1234
+No match
+    1234  
+No match
+
+/^[abcd]*\d/
+    a45
+ 0: a4
+    b93
+ 0: b9
+    c99z
+ 0: c9
+    d04
+ 0: d0
+    abcd1234
+ 0: abcd1
+    1234  
+ 0: 1
+    *** Failers
+No match
+    e45
+No match
+    abcd      
+No match
+
+/^[abcd]+\d/
+    a45
+ 0: a4
+    b93
+ 0: b9
+    c99z
+ 0: c9
+    d04
+ 0: d0
+    abcd1234
+ 0: abcd1
+    *** Failers
+No match
+    1234  
+No match
+    e45
+No match
+    abcd      
+No match
+
+/^a+X/
+    aX
+ 0: aX
+    aaX 
+ 0: aaX
+
+/^[abcd]?\d/
+    a45
+ 0: a4
+    b93
+ 0: b9
+    c99z
+ 0: c9
+    d04
+ 0: d0
+    1234  
+ 0: 1
+    *** Failers
+No match
+    abcd1234
+No match
+    e45
+No match
+
+/^[abcd]{2,3}\d/
+    ab45
+ 0: ab4
+    bcd93
+ 0: bcd9
+    *** Failers
+No match
+    1234 
+No match
+    a36 
+No match
+    abcd1234
+No match
+    ee45
+No match
+
+/^(abc)*\d/
+    abc45
+ 0: abc4
+    abcabcabc45
+ 0: abcabcabc4
+    42xyz 
+ 0: 4
+    *** Failers
+No match
+
+/^(abc)+\d/
+    abc45
+ 0: abc4
+    abcabcabc45
+ 0: abcabcabc4
+    *** Failers
+No match
+    42xyz 
+No match
+
+/^(abc)?\d/
+    abc45
+ 0: abc4
+    42xyz 
+ 0: 4
+    *** Failers
+No match
+    abcabcabc45
+No match
+
+/^(abc){2,3}\d/
+    abcabc45
+ 0: abcabc4
+    abcabcabc45
+ 0: abcabcabc4
+    *** Failers
+No match
+    abcabcabcabc45
+No match
+    abc45
+No match
+    42xyz 
+No match
+
+/1(abc|xyz)2(?1)3/
+    1abc2abc3456
+ 0: 1abc2abc3
+    1abc2xyz3456 
+ 0: 1abc2xyz3
+
+/^(a*\w|ab)=(a*\w|ab)/
+    ab=ab
+ 0: ab=ab
+ 1: ab=a
+
+/^(a*\w|ab)=(?1)/
+    ab=ab
+ 0: ab=ab
+ 1: ab=a
+
+/^([^()]|\((?1)*\))*$/
+    abc
+ 0: abc
+    a(b)c
+ 0: a(b)c
+    a(b(c))d  
+ 0: a(b(c))d
+    *** Failers)
+No match
+    a(b(c)d  
+No match
+
+/^>abc>([^()]|\((?1)*\))*<xyz<$/
+    >abc>123<xyz<
+ 0: >abc>123<xyz<
+    >abc>1(2)3<xyz<
+ 0: >abc>1(2)3<xyz<
+    >abc>(1(2)3)<xyz<
+ 0: >abc>(1(2)3)<xyz<
+
+/^(?>a*)\d/
+    aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa9876
+ 0: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa9
     *** Failers 
 No match
-    YXoanon  
+    aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
 No match
+
+/< (?: (?(R) \d++  | [^<>]*+) | (?R)) * >/x
+    <>
+ 0: <>
+    <abcd>
+ 0: <abcd>
+    <abc <123> hij>
+ 0: <abc <123> hij>
+    <abc <def> hij>
+ 0: <def>
+    <abc<>def> 
+ 0: <abc<>def>
+    <abc<>      
+ 0: <>
+    *** Failers
+No match
+    <abc
+No match
+
+/^(?(?=abc)\w{3}:|\d\d)$/        
+    abc:                          
+ 0: abc:
+    12                             
+ 0: 12
+    *** Failers                     
+No match
+    123                       
+No match
+    xyz                        
+No match
+                                
+/^(?(?!abc)\d\d|\w{3}:)$/      
+    abc:                        
+ 0: abc:
+    12         
+ 0: 12
+    *** Failers
+No match
+    123
+No match
+    xyz    
+No match
+
+/^(?=abc)\w{5}:$/        
+    abcde:                          
+ 0: abcde:
+    *** Failers                     
+No match
+    abc.. 
+No match
+    123                       
+No match
+    vwxyz                        
+No match
+                                
+/^(?!abc)\d\d$/      
+    12         
+ 0: 12
+    *** Failers
+No match
+    abcde:
+No match
+    abc..  
+No match
+    123
+No match
+    vwxyz    
+No match
+
+/(?<=abc|xy)123/
+    abc12345
+ 0: 123
+    wxy123z
+ 0: 123
+    *** Failers
+No match
+    123abc
+No match
+
+/(?<!abc|xy)123/
+    123abc
+ 0: 123
+    mno123456 
+ 0: 123
+    *** Failers
+No match
+    abc12345
+No match
+    wxy123z
+No match
+
+/abc(?C1)xyz/
+    abcxyz
+--->abcxyz
+  1 ^  ^       x
+ 0: abcxyz
+    123abcxyz999 
+--->123abcxyz999
+  1    ^  ^          x
+ 0: abcxyz
+
+/(ab|cd){3,4}/C
+  ababab
+--->ababab
+ +0 ^          (ab|cd){3,4}
+ +1 ^          a
+ +4 ^          c
+ +2 ^^         b
+ +3 ^ ^        |
+ +1 ^ ^        a
+ +4 ^ ^        c
+ +2 ^  ^       b
+ +3 ^   ^      |
+ +1 ^   ^      a
+ +4 ^   ^      c
+ +2 ^    ^     b
+ +3 ^     ^    |
++12 ^     ^    
+ +1 ^     ^    a
+ +4 ^     ^    c
+ 0: ababab
+  abcdabcd
+--->abcdabcd
+ +0 ^            (ab|cd){3,4}
+ +1 ^            a
+ +4 ^            c
+ +2 ^^           b
+ +3 ^ ^          |
+ +1 ^ ^          a
+ +4 ^ ^          c
+ +5 ^  ^         d
+ +6 ^   ^        )
+ +1 ^   ^        a
+ +4 ^   ^        c
+ +2 ^    ^       b
+ +3 ^     ^      |
++12 ^     ^      
+ +1 ^     ^      a
+ +4 ^     ^      c
+ +5 ^      ^     d
+ +6 ^       ^    )
++12 ^       ^    
+ 0: abcdabcd
+ 1: abcdab
+  abcdcdcdcdcd  
+--->abcdcdcdcdcd
+ +0 ^                (ab|cd){3,4}
+ +1 ^                a
+ +4 ^                c
+ +2 ^^               b
+ +3 ^ ^              |
+ +1 ^ ^              a
+ +4 ^ ^              c
+ +5 ^  ^             d
+ +6 ^   ^            )
+ +1 ^   ^            a
+ +4 ^   ^            c
+ +5 ^    ^           d
+ +6 ^     ^          )
++12 ^     ^          
+ +1 ^     ^          a
+ +4 ^     ^          c
+ +5 ^      ^         d
+ +6 ^       ^        )
++12 ^       ^        
+ 0: abcdcdcd
+ 1: abcdcd
+
+/^abc/
+    abcdef
+ 0: abc
+    *** Failers
+No match
+    abcdef\B  
+No match
+
+/^(a*|xyz)/
+    bcd
+ 0: 
+    aaabcd
+ 0: aaa
+ 1: aa
+ 2: a
+ 3: 
+    xyz
+ 0: xyz
+ 1: 
+    xyz\N  
+ 0: xyz
+    *** Failers
+ 0: 
+    bcd\N   
+No match

-/\BX/8
-    YXoanon
- 0: X
+/xyz$/
+    xyz
+ 0: xyz
+    xyz\n
+ 0: xyz
     *** Failers
 No match
-    Xoanon
+    xyz\Z
 No match
-    +Xoanon    
+    xyz\n\Z    
 No match
-    \x{300}Xoanon 
+    
+/xyz$/m
+    xyz
+ 0: xyz
+    xyz\n 
+ 0: xyz
+    abcxyz\npqr 
+ 0: xyz
+    abcxyz\npqr\Z 
+ 0: xyz
+    xyz\n\Z    
+ 0: xyz
+    *** Failers
 No match
+    xyz\Z
+No match

-/X\b/8
-    X+oanon
- 0: X
-    ZX\x{300}oanon 
- 0: X
-    FAX 
- 0: X
+/\Gabc/
+    abcdef
+ 0: abc
+    defabcxyz\>3 
+ 0: abc
     *** Failers 
 No match
-    Xoanon  
+    defabcxyz
 No match
+
+/^abcdef/
+    ab\P
+Partial match: ab
+    abcde\P
+Partial match: abcde
+    abcdef\P
+ 0: abcdef
+    *** Failers
+No match
+    abx\P    
+No match
+
+/^a{2,4}\d+z/
+    a\P
+Partial match: a
+    aa\P
+Partial match: aa
+    aa2\P 
+Partial match: aa2
+    aaa\P
+Partial match: aaa
+    aaa23\P 
+Partial match: aaa23
+    aaaa12345\P
+Partial match: aaaa12345
+    aa0z\P
+ 0: aa0z
+    aaaa4444444444444z\P 
+ 0: aaaa4444444444444z
+    *** Failers
+No match
+    az\P 
+No match
+    aaaaa\P 
+No match
+    a56\P 
+No match
+
+/^abcdef/
+   abc\P
+Partial match: abc
+   def\R 
+ 0: def
+   
+/(?<=foo)bar/
+   xyzfo\P 
+No match
+   foob\P\>2 
+Partial match: foob
+   foobar...\R\P\>4 
+ 0: ar
+   xyzfo\P
+No match
+   foobar\>2  
+ 0: bar
+   *** Failers
+No match
+   xyzfo\P
+No match
+   obar\R   
+No match
+
+/(ab*(cd|ef))+X/
+    adfadadaklhlkalkajhlkjahdfasdfasdfladsfjkj\P\Z
+No match
+    lkjhlkjhlkjhlkjhabbbbbbcdaefabbbbbbbefa\P\B\Z
+Partial match: abbbbbbcdaefabbbbbbbefa
+    cdabbbbbbbb\P\R\B\Z
+Partial match: cdabbbbbbbb
+    efabbbbbbbbbbbbbbbb\P\R\B\Z
+Partial match: efabbbbbbbbbbbbbbbb
+    bbbbbbbbbbbbcdXyasdfadf\P\R\B\Z    
+ 0: bbbbbbbbbbbbcdX
+
+/(a|b)/SF>testsavedregex
+Compiled pattern written to testsavedregex
+Study data written to testsavedregex
+<testsavedregex
+Compiled pattern (byte-inverted) loaded from testsavedregex
+Study data loaded from testsavedregex
+    abc
+ 0: a
+    ** Failers
+ 0: a
+    def  
+No match

-/X\B/8
-    Xoanon  
- 0: X
+/the quick brown fox/
+    the quick brown fox
+ 0: the quick brown fox
+    The quick brown FOX
+No match
+    What do you know about the quick brown fox?
+ 0: the quick brown fox
+    What do you know about THE QUICK BROWN FOX?
+No match
+
+/The quick brown fox/i
+    the quick brown fox
+ 0: the quick brown fox
+    The quick brown FOX
+ 0: The quick brown FOX
+    What do you know about the quick brown fox?
+ 0: the quick brown fox
+    What do you know about THE QUICK BROWN FOX?
+ 0: THE QUICK BROWN FOX
+
+/abcd\t\n\r\f\a\e\071\x3b\$\\\?caxyz/
+    abcd\t\n\r\f\a\e9;\$\\?caxyz
+ 0: abcd\x09\x0a\x0d\x0c\x07\x1b9;$\?caxyz
+
+/a*abc?xyz+pqr{3}ab{2,}xy{4,5}pq{0,6}AB{0,}zz/
+    abxyzpqrrrabbxyyyypqAzz
+ 0: abxyzpqrrrabbxyyyypqAzz
+    abxyzpqrrrabbxyyyypqAzz
+ 0: abxyzpqrrrabbxyyyypqAzz
+    aabxyzpqrrrabbxyyyypqAzz
+ 0: aabxyzpqrrrabbxyyyypqAzz
+    aaabxyzpqrrrabbxyyyypqAzz
+ 0: aaabxyzpqrrrabbxyyyypqAzz
+    aaaabxyzpqrrrabbxyyyypqAzz
+ 0: aaaabxyzpqrrrabbxyyyypqAzz
+    abcxyzpqrrrabbxyyyypqAzz
+ 0: abcxyzpqrrrabbxyyyypqAzz
+    aabcxyzpqrrrabbxyyyypqAzz
+ 0: aabcxyzpqrrrabbxyyyypqAzz
+    aaabcxyzpqrrrabbxyyyypAzz
+ 0: aaabcxyzpqrrrabbxyyyypAzz
+    aaabcxyzpqrrrabbxyyyypqAzz
+ 0: aaabcxyzpqrrrabbxyyyypqAzz
+    aaabcxyzpqrrrabbxyyyypqqAzz
+ 0: aaabcxyzpqrrrabbxyyyypqqAzz
+    aaabcxyzpqrrrabbxyyyypqqqAzz
+ 0: aaabcxyzpqrrrabbxyyyypqqqAzz
+    aaabcxyzpqrrrabbxyyyypqqqqAzz
+ 0: aaabcxyzpqrrrabbxyyyypqqqqAzz
+    aaabcxyzpqrrrabbxyyyypqqqqqAzz
+ 0: aaabcxyzpqrrrabbxyyyypqqqqqAzz
+    aaabcxyzpqrrrabbxyyyypqqqqqqAzz
+ 0: aaabcxyzpqrrrabbxyyyypqqqqqqAzz
+    aaaabcxyzpqrrrabbxyyyypqAzz
+ 0: aaaabcxyzpqrrrabbxyyyypqAzz
+    abxyzzpqrrrabbxyyyypqAzz
+ 0: abxyzzpqrrrabbxyyyypqAzz
+    aabxyzzzpqrrrabbxyyyypqAzz
+ 0: aabxyzzzpqrrrabbxyyyypqAzz
+    aaabxyzzzzpqrrrabbxyyyypqAzz
+ 0: aaabxyzzzzpqrrrabbxyyyypqAzz
+    aaaabxyzzzzpqrrrabbxyyyypqAzz
+ 0: aaaabxyzzzzpqrrrabbxyyyypqAzz
+    abcxyzzpqrrrabbxyyyypqAzz
+ 0: abcxyzzpqrrrabbxyyyypqAzz
+    aabcxyzzzpqrrrabbxyyyypqAzz
+ 0: aabcxyzzzpqrrrabbxyyyypqAzz
+    aaabcxyzzzzpqrrrabbxyyyypqAzz
+ 0: aaabcxyzzzzpqrrrabbxyyyypqAzz
+    aaaabcxyzzzzpqrrrabbxyyyypqAzz
+ 0: aaaabcxyzzzzpqrrrabbxyyyypqAzz
+    aaaabcxyzzzzpqrrrabbbxyyyypqAzz
+ 0: aaaabcxyzzzzpqrrrabbbxyyyypqAzz
+    aaaabcxyzzzzpqrrrabbbxyyyyypqAzz
+ 0: aaaabcxyzzzzpqrrrabbbxyyyyypqAzz
+    aaabcxyzpqrrrabbxyyyypABzz
+ 0: aaabcxyzpqrrrabbxyyyypABzz
+    aaabcxyzpqrrrabbxyyyypABBzz
+ 0: aaabcxyzpqrrrabbxyyyypABBzz
+    >>>aaabxyzpqrrrabbxyyyypqAzz
+ 0: aaabxyzpqrrrabbxyyyypqAzz
+    >aaaabxyzpqrrrabbxyyyypqAzz
+ 0: aaaabxyzpqrrrabbxyyyypqAzz
+    >>>>abcxyzpqrrrabbxyyyypqAzz
+ 0: abcxyzpqrrrabbxyyyypqAzz
     *** Failers
 No match
-    X+oanon
+    abxyzpqrrabbxyyyypqAzz
 No match
-    ZX\x{300}oanon 
+    abxyzpqrrrrabbxyyyypqAzz
 No match
-    FAX 
+    abxyzpqrrrabxyyyypqAzz
 No match
-    
-/[^a]/8
-    abcd
+    aaaabcxyzzzzpqrrrabbbxyyyyyypqAzz
+No match
+    aaaabcxyzzzzpqrrrabbbxyyypqAzz
+No match
+    aaabcxyzpqrrrabbxyyyypqqqqqqqAzz
+No match
+
+/^(abc){1,2}zz/
+    abczz
+ 0: abczz
+    abcabczz
+ 0: abcabczz
+    *** Failers
+No match
+    zz
+No match
+    abcabcabczz
+No match
+    >>abczz
+No match
+
+/^(b+?|a){1,2}?c/
+    bc
+ 0: bc
+    bbc
+ 0: bbc
+    bbbc
+ 0: bbbc
+    bac
+ 0: bac
+    bbac
+ 0: bbac
+    aac
+ 0: aac
+    abbbbbbbbbbbc
+ 0: abbbbbbbbbbbc
+    bbbbbbbbbbbac
+ 0: bbbbbbbbbbbac
+    *** Failers
+No match
+    aaac
+No match
+    abbbbbbbbbbbac
+No match
+
+/^(b+|a){1,2}c/
+    bc
+ 0: bc
+    bbc
+ 0: bbc
+    bbbc
+ 0: bbbc
+    bac
+ 0: bac
+    bbac
+ 0: bbac
+    aac
+ 0: aac
+    abbbbbbbbbbbc
+ 0: abbbbbbbbbbbc
+    bbbbbbbbbbbac
+ 0: bbbbbbbbbbbac
+    *** Failers
+No match
+    aaac
+No match
+    abbbbbbbbbbbac
+No match
+
+/^(b+|a){1,2}?bc/
+    bbc
+ 0: bbc
+
+/^(b*|ba){1,2}?bc/
+    babc
+ 0: babc
+    bbabc
+ 0: bbabc
+    bababc
+ 0: bababc
+    *** Failers
+No match
+    bababbc
+No match
+    babababc
+No match
+
+/^(ba|b*){1,2}?bc/
+    babc
+ 0: babc
+    bbabc
+ 0: bbabc
+    bababc
+ 0: bababc
+    *** Failers
+No match
+    bababbc
+No match
+    babababc
+No match
+
+/^\ca\cA\c[\c{\c:/
+    \x01\x01\e;z
+ 0: \x01\x01\x1b;z
+
+/^[ab\]cde]/
+    athing
+ 0: a
+    bthing
  0: b
-    a\x{100}   
- 0: \x{100}
+    ]thing
+ 0: ]
+    cthing
+ 0: c
+    dthing
+ 0: d
+    ething
+ 0: e
+    *** Failers
+No match
+    fthing
+No match
+    [thing
+No match
+    \\thing
+No match

-/^[abc\x{123}\x{400}-\x{402}]{2,3}\d/8
-    ab99
- 0: ab9
-    \x{123}\x{123}45
- 0: \x{123}\x{123}4
-    \x{400}\x{401}\x{402}6  
- 0: \x{400}\x{401}\x{402}6
+/^[]cde]/
+    ]thing
+ 0: ]
+    cthing
+ 0: c
+    dthing
+ 0: d
+    ething
+ 0: e
     *** Failers
 No match
-    d99
+    athing
 No match
-    \x{123}\x{122}4   
+    fthing
 No match
-    \x{400}\x{403}6  
+
+/^[^ab\]cde]/
+    fthing
+ 0: f
+    [thing
+ 0: [
+    \\thing
+ 0: \
+    *** Failers
+ 0: *
+    athing
 No match
-    \x{400}\x{401}\x{402}\x{402}6  
+    bthing
 No match
+    ]thing
+No match
+    cthing
+No match
+    dthing
+No match
+    ething
+No match

-/abc/8
-    \xC3]
-Error -10 (bad UTF-8 string) offset=0 reason=6
-    \xC3
-Error -10 (bad UTF-8 string) offset=0 reason=1
-    \xC3\xC3\xC3
-Error -10 (bad UTF-8 string) offset=0 reason=6
-    \xC3\xC3\xC3\?
+/^[^]cde]/
+    athing
+ 0: a
+    fthing
+ 0: f
+    *** Failers
+ 0: *
+    ]thing
 No match
-    \xe1\x88 
-Error -10 (bad UTF-8 string) offset=0 reason=1
-    \P\xe1\x88 
-Error -10 (bad UTF-8 string) offset=0 reason=1
-    \P\P\xe1\x88 
-Error -25 (short UTF-8 string) offset=0 reason=1
+    cthing
+No match
+    dthing
+No match
+    ething
+No match

-/a.b/8
-    acb
- 0: acb
-    a\x7fb
- 0: a\x{7f}b
-    a\x{100}b 
- 0: a\x{100}b
+/^\\x81/
+    \x81
+ 0: \x81
+
+/^\xFF/
+    \xFF
+ 0: \xff
+
+/^[0-9]+$/
+    0
+ 0: 0
+    1
+ 0: 1
+    2
+ 0: 2
+    3
+ 0: 3
+    4
+ 0: 4
+    5
+ 0: 5
+    6
+ 0: 6
+    7
+ 0: 7
+    8
+ 0: 8
+    9
+ 0: 9
+    10
+ 0: 10
+    100
+ 0: 100
     *** Failers
 No match
-    a\nb  
+    abc
 No match

-/a(.{3})b/8
-    a\x{4000}xyb 
- 0: a\x{4000}xyb
-    a\x{4000}\x7fyb 
- 0: a\x{4000}\x{7f}yb
-    a\x{4000}\x{100}yb 
- 0: a\x{4000}\x{100}yb
+/^.*nter/
+    enter
+ 0: enter
+    inter
+ 0: inter
+    uponter
+ 0: uponter
+
+/^xxx[0-9]+$/
+    xxx0
+ 0: xxx0
+    xxx1234
+ 0: xxx1234
     *** Failers
 No match
-    a\x{4000}b 
+    xxx
 No match
-    ac\ncb 
+
+/^.+[0-9][0-9][0-9]$/
+    x123
+ 0: x123
+    xx123
+ 0: xx123
+    123456
+ 0: 123456
+    *** Failers
 No match
+    123
+No match
+    x1234
+ 0: x1234

-/a(.*?)(.)/
-    a\xc0\x88b
- 0: a\xc0\x88b
- 1: a\xc0\x88
- 2: a\xc0
+/^.+?[0-9][0-9][0-9]$/
+    x123
+ 0: x123
+    xx123
+ 0: xx123
+    123456
+ 0: 123456
+    *** Failers
+No match
+    123
+No match
+    x1234
+ 0: x1234

-/a(.*?)(.)/8
-    a\x{100}b
- 0: a\x{100}b
- 1: a\x{100}
+/^([^!]+)!(.+)=apquxz\.ixr\.zzz\.ac\.uk$/
+    abc!pqr=apquxz.ixr.zzz.ac.uk
+ 0: abc!pqr=apquxz.ixr.zzz.ac.uk
+    *** Failers
+No match
+    !pqr=apquxz.ixr.zzz.ac.uk
+No match
+    abc!=apquxz.ixr.zzz.ac.uk
+No match
+    abc!pqr=apquxz:ixr.zzz.ac.uk
+No match
+    abc!pqr=apquxz.ixr.zzz.ac.ukk
+No match

-/a(.*)(.)/
-    a\xc0\x88b
- 0: a\xc0\x88b
- 1: a\xc0\x88
- 2: a\xc0
+/:/
+    Well, we need a colon: somewhere
+ 0: :
+    *** Fail if we don't
+No match

-/a(.*)(.)/8
-    a\x{100}b
- 0: a\x{100}b
- 1: a\x{100}
+/([\da-f:]+)$/i
+    0abc
+ 0: 0abc
+    abc
+ 0: abc
+    fed
+ 0: fed
+    E
+ 0: E
+    ::
+ 0: ::
+    5f03:12C0::932e
+ 0: 5f03:12C0::932e
+    fed def
+ 0: def
+    Any old stuff
+ 0: ff
+    *** Failers
+No match
+    0zzz
+No match
+    gzzz
+No match
+    fed\x20
+No match
+    Any old rubbish
+No match

-/a(.)(.)/
-    a\xc0\x92bcd
- 0: a\xc0\x92
+/^.*\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})$/
+    .1.2.3
+ 0: .1.2.3
+    A.12.123.0
+ 0: A.12.123.0
+    *** Failers
+No match
+    .1.2.3333
+No match
+    1.2.3
+No match
+    1234.2.3
+No match

-/a(.)(.)/8
-    a\x{240}bcd
- 0: a\x{240}b
+/^(\d+)\s+IN\s+SOA\s+(\S+)\s+(\S+)\s*\(\s*$/
+    1 IN SOA non-sp1 non-sp2(
+ 0: 1 IN SOA non-sp1 non-sp2(
+    1    IN    SOA    non-sp1    non-sp2   (
+ 0: 1    IN    SOA    non-sp1    non-sp2   (
+    *** Failers
+No match
+    1IN SOA non-sp1 non-sp2(
+No match

-/a(.?)(.)/
-    a\xc0\x92bcd
- 0: a\xc0\x92
- 1: a\xc0
+/^[a-zA-Z\d][a-zA-Z\d\-]*(\.[a-zA-Z\d][a-zA-z\d\-]*)*\.$/
+    a.
+ 0: a.
+    Z.
+ 0: Z.
+    2.
+ 0: 2.
+    ab-c.pq-r.
+ 0: ab-c.pq-r.
+    sxk.zzz.ac.uk.
+ 0: sxk.zzz.ac.uk.
+    x-.y-.
+ 0: x-.y-.
+    *** Failers
+No match
+    -abc.peq.
+No match

-/a(.?)(.)/8
-    a\x{240}bcd
- 0: a\x{240}b
- 1: a\x{240}
+/^\*\.[a-z]([a-z\-\d]*[a-z\d]+)?(\.[a-z]([a-z\-\d]*[a-z\d]+)?)*$/
+    *.a
+ 0: *.a
+    *.b0-a
+ 0: *.b0-a
+    *.c3-b.c
+ 0: *.c3-b.c
+    *.c-a.b-c
+ 0: *.c-a.b-c
+    *** Failers
+No match
+    *.0
+No match
+    *.a-
+No match
+    *.a-b.c-
+No match
+    *.c-a.0-c
+No match

-/a(.??)(.)/
-    a\xc0\x92bcd
- 0: a\xc0\x92
- 1: a\xc0
+/^(?=ab(de))(abd)(e)/
+    abde
+ 0: abde

-/a(.??)(.)/8
-    a\x{240}bcd
- 0: a\x{240}b
- 1: a\x{240}
+/^(?!(ab)de|x)(abd)(f)/
+    abdf
+ 0: abdf

-/a(.{3})b/8
-    a\x{1234}xyb 
- 0: a\x{1234}xyb
-    a\x{1234}\x{4321}yb 
- 0: a\x{1234}\x{4321}yb
-    a\x{1234}\x{4321}\x{3412}b 
- 0: a\x{1234}\x{4321}\x{3412}b
+/^(?=(ab(cd)))(ab)/
+    abcd
+ 0: ab
+
+/^[\da-f](\.[\da-f])*$/i
+    a.b.c.d
+ 0: a.b.c.d
+    A.B.C.D
+ 0: A.B.C.D
+    a.b.c.1.2.3.C
+ 0: a.b.c.1.2.3.C
+
+/^\".*\"\s*(;.*)?$/
+    \"1234\"
+ 0: "1234"
+    \"abcd\" ;
+ 0: "abcd" ;
+    \"\" ; rhubarb
+ 0: "" ; rhubarb
     *** Failers
 No match
-    a\x{1234}b 
+    \"1234\" : things
 No match
-    ac\ncb 
+
+/^$/
+    \
+ 0: 
+    *** Failers
 No match

-/a(.{3,})b/8
-    a\x{1234}xyb 
- 0: a\x{1234}xyb
-    a\x{1234}\x{4321}yb 
- 0: a\x{1234}\x{4321}yb
-    a\x{1234}\x{4321}\x{3412}b 
- 0: a\x{1234}\x{4321}\x{3412}b
-    axxxxbcdefghijb 
- 0: axxxxbcdefghijb
- 1: axxxxb
-    a\x{1234}\x{4321}\x{3412}\x{3421}b 
- 0: a\x{1234}\x{4321}\x{3412}\x{3421}b
+/   ^    a   (?# begins with a)  b\sc (?# then b c) $ (?# then end)/x
+    ab c
+ 0: ab c
     *** Failers
 No match
-    a\x{1234}b 
+    abc
 No match
+    ab cde
+No match

-/a(.{3,}?)b/8
-    a\x{1234}xyb 
- 0: a\x{1234}xyb
-    a\x{1234}\x{4321}yb 
- 0: a\x{1234}\x{4321}yb
-    a\x{1234}\x{4321}\x{3412}b 
- 0: a\x{1234}\x{4321}\x{3412}b
-    axxxxbcdefghijb 
- 0: axxxxbcdefghijb
- 1: axxxxb
-    a\x{1234}\x{4321}\x{3412}\x{3421}b 
- 0: a\x{1234}\x{4321}\x{3412}\x{3421}b
+/(?x)   ^    a   (?# begins with a)  b\sc (?# then b c) $ (?# then end)/
+    ab c
+ 0: ab c
     *** Failers
 No match
-    a\x{1234}b 
+    abc
 No match
+    ab cde
+No match

-/a(.{3,5})b/8
-    a\x{1234}xyb 
- 0: a\x{1234}xyb
-    a\x{1234}\x{4321}yb 
- 0: a\x{1234}\x{4321}yb
-    a\x{1234}\x{4321}\x{3412}b 
- 0: a\x{1234}\x{4321}\x{3412}b
-    axxxxbcdefghijb 
- 0: axxxxb
-    a\x{1234}\x{4321}\x{3412}\x{3421}b 
- 0: a\x{1234}\x{4321}\x{3412}\x{3421}b
-    axbxxbcdefghijb 
- 0: axbxxb
-    axxxxxbcdefghijb 
- 0: axxxxxb
+/^   a\ b[c ]d       $/x
+    a bcd
+ 0: a bcd
+    a b d
+ 0: a b d
     *** Failers
 No match
-    a\x{1234}b 
+    abcd
 No match
-    axxxxxxbcdefghijb 
+    ab d
 No match

-/a(.{3,5}?)b/8
-    a\x{1234}xyb 
- 0: a\x{1234}xyb
-    a\x{1234}\x{4321}yb 
- 0: a\x{1234}\x{4321}yb
-    a\x{1234}\x{4321}\x{3412}b 
- 0: a\x{1234}\x{4321}\x{3412}b
-    axxxxbcdefghijb 
- 0: axxxxb
-    a\x{1234}\x{4321}\x{3412}\x{3421}b 
- 0: a\x{1234}\x{4321}\x{3412}\x{3421}b
-    axbxxbcdefghijb 
- 0: axbxxb
-    axxxxxbcdefghijb 
- 0: axxxxxb
+/^(a(b(c)))(d(e(f)))(h(i(j)))(k(l(m)))$/
+    abcdefhijklm
+ 0: abcdefhijklm
+
+/^(?:a(b(c)))(?:d(e(f)))(?:h(i(j)))(?:k(l(m)))$/
+    abcdefhijklm
+ 0: abcdefhijklm
+
+/^[\w][\W][\s][\S][\d][\D][\b][\n][\c]][\022]/
+    a+ Z0+\x08\n\x1d\x12
+ 0: a+ Z0+\x08\x0a\x1d\x12
+
+/^[.^$|()*+?{,}]+/
+    .^\$(*+)|{?,?}
+ 0: .^$(*+)|{?,?}
+ 1: .^$(*+)|{?,?
+ 2: .^$(*+)|{?,
+ 3: .^$(*+)|{?
+ 4: .^$(*+)|{
+ 5: .^$(*+)|
+ 6: .^$(*+)
+ 7: .^$(*+
+ 8: .^$(*
+ 9: .^$(
+10: .^$
+11: .^
+12: .
+
+/^a*\w/
+    z
+ 0: z
+    az
+ 0: az
+ 1: a
+    aaaz
+ 0: aaaz
+ 1: aaa
+ 2: aa
+ 3: a
+    a
+ 0: a
+    aa
+ 0: aa
+ 1: a
+    aaaa
+ 0: aaaa
+ 1: aaa
+ 2: aa
+ 3: a
+    a+
+ 0: a
+    aa+
+ 0: aa
+ 1: a
+
+/^a*?\w/
+    z
+ 0: z
+    az
+ 0: az
+ 1: a
+    aaaz
+ 0: aaaz
+ 1: aaa
+ 2: aa
+ 3: a
+    a
+ 0: a
+    aa
+ 0: aa
+ 1: a
+    aaaa
+ 0: aaaa
+ 1: aaa
+ 2: aa
+ 3: a
+    a+
+ 0: a
+    aa+
+ 0: aa
+ 1: a
+
+/^a+\w/
+    az
+ 0: az
+    aaaz
+ 0: aaaz
+ 1: aaa
+ 2: aa
+    aa
+ 0: aa
+    aaaa
+ 0: aaaa
+ 1: aaa
+ 2: aa
+    aa+
+ 0: aa
+
+/^a+?\w/
+    az
+ 0: az
+    aaaz
+ 0: aaaz
+ 1: aaa
+ 2: aa
+    aa
+ 0: aa
+    aaaa
+ 0: aaaa
+ 1: aaa
+ 2: aa
+    aa+
+ 0: aa
+
+/^\d{8}\w{2,}/
+    1234567890
+ 0: 1234567890
+    12345678ab
+ 0: 12345678ab
+    12345678__
+ 0: 12345678__
     *** Failers
 No match
-    a\x{1234}b 
+    1234567
 No match
-    axxxxxxbcdefghijb 
+
+/^[aeiou\d]{4,5}$/
+    uoie
+ 0: uoie
+    1234
+ 0: 1234
+    12345
+ 0: 12345
+    aaaaa
+ 0: aaaaa
+    *** Failers
 No match
+    123456
+No match

-/^[a\x{c0}]/8
+/^[aeiou\d]{4,5}?/
+    uoie
+ 0: uoie
+    1234
+ 0: 1234
+    12345
+ 0: 12345
+ 1: 1234
+    aaaaa
+ 0: aaaaa
+ 1: aaaa
+    123456
+ 0: 12345
+ 1: 1234
+
+/^From +([^ ]+) +[a-zA-Z][a-zA-Z][a-zA-Z] +[a-zA-Z][a-zA-Z][a-zA-Z] +[0-9]?[0-9] +[0-9][0-9]:[0-9][0-9]/
+    From abcd  Mon Sep 01 12:33:02 1997
+ 0: From abcd  Mon Sep 01 12:33
+
+/^From\s+\S+\s+([a-zA-Z]{3}\s+){2}\d{1,2}\s+\d\d:\d\d/
+    From abcd  Mon Sep 01 12:33:02 1997
+ 0: From abcd  Mon Sep 01 12:33
+    From abcd  Mon Sep  1 12:33:02 1997
+ 0: From abcd  Mon Sep  1 12:33
     *** Failers
 No match
-    \x{100}
+    From abcd  Sep 01 12:33:02 1997
 No match

-/(?<=aXb)cd/8
-    aXbcd
- 0: cd
+/^12.34/s
+    12\n34
+ 0: 12\x0a34
+    12\r34
+ 0: 12\x0d34

-/(?<=a\x{100}b)cd/8
-    a\x{100}bcd
- 0: cd
+/\w+(?=\t)/
+    the quick brown\t fox
+ 0: brown

-/(?<=a\x{100000}b)cd/8
-    a\x{100000}bcd
- 0: cd
+/foo(?!bar)(.*)/
+    foobar is foolish see?
+ 0: foolish see?
+ 1: foolish see
+ 2: foolish se
+ 3: foolish s
+ 4: foolish 
+ 5: foolish
+ 6: foolis
+ 7: fooli
+ 8: fool
+ 9: foo
+
+/(?:(?!foo)...|^.{0,2})bar(.*)/
+    foobar crowbar etc
+ 0: rowbar etc
+ 1: rowbar et
+ 2: rowbar e
+ 3: rowbar 
+ 4: rowbar
+    barrel
+ 0: barrel
+ 1: barre
+ 2: barr
+ 3: bar
+    2barrel
+ 0: 2barrel
+ 1: 2barre
+ 2: 2barr
+ 3: 2bar
+    A barrel
+ 0: A barrel
+ 1: A barre
+ 2: A barr
+ 3: A bar
+
+/^(\D*)(?=\d)(?!123)/
+    abc456
+ 0: abc
+    *** Failers
+No match
+    abc123
+No match
+
+/^1234(?# test newlines
+  inside)/
+    1234
+ 0: 1234
+
+/^1234 #comment in extended re
+  /x
+    1234
+ 0: 1234
+
+/#rhubarb
+  abcd/x
+    abcd
+ 0: abcd
+
+/^abcd#rhubarb/x
+    abcd
+ 0: abcd
+
+/(?!^)abc/
+    the abc
+ 0: abc
+    *** Failers
+No match
+    abc
+No match
+
+/(?=^)abc/
+    abc
+ 0: abc
+    *** Failers
+No match
+    the abc
+No match
+
+/^[ab]{1,3}(ab*|b)/
+    aabbbbb
+ 0: aabbbbb
+ 1: aabbbb
+ 2: aabbb
+ 3: aabb
+ 4: aab
+ 5: aa
+
+/^[ab]{1,3}?(ab*|b)/
+    aabbbbb
+ 0: aabbbbb
+ 1: aabbbb
+ 2: aabbb
+ 3: aabb
+ 4: aab
+ 5: aa
+
+/^[ab]{1,3}?(ab*?|b)/
+    aabbbbb
+ 0: aabbbbb
+ 1: aabbbb
+ 2: aabbb
+ 3: aabb
+ 4: aab
+ 5: aa
+
+/^[ab]{1,3}(ab*?|b)/
+    aabbbbb
+ 0: aabbbbb
+ 1: aabbbb
+ 2: aabbb
+ 3: aabb
+ 4: aab
+ 5: aa
+
+/  (?: [\040\t] |  \(
+(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
+\)  )*                          # optional leading comment
+(?:    (?:
+[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+    # some number of atom characters...
+(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
+|
+" (?:                      # opening quote...
+[^\\\x80-\xff\n\015"]                #   Anything except backslash and quote
+|                     #    or
+\\ [^\x80-\xff]           #   Escaped something (something != CR)
+)* "  # closing quote
+)                    # initial word
+(?:  (?: [\040\t] |  \(
+(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
+\)  )*  \.  (?: [\040\t] |  \(
+(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
+\)  )*   (?:
+[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+    # some number of atom characters...
+(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
+|
+" (?:                      # opening quote...
+[^\\\x80-\xff\n\015"]                #   Anything except backslash and quote
+|                     #    or
+\\ [^\x80-\xff]           #   Escaped something (something != CR)
+)* "  # closing quote
+)  )* # further okay, if led by a period
+(?: [\040\t] |  \(
+(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
+\)  )*  @  (?: [\040\t] |  \(
+(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
+\)  )*    (?:
+[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+    # some number of atom characters...
+(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
+|   \[                         # [
+(?: [^\\\x80-\xff\n\015\[\]] |  \\ [^\x80-\xff]  )*    #    stuff
+\]                        #           ]
+)                           # initial subdomain
+(?:                                  #
+(?: [\040\t] |  \(
+(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
+\)  )*  \.                        # if led by a period...
+(?: [\040\t] |  \(
+(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
+\)  )*   (?:
+[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+    # some number of atom characters...
+(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
+|   \[                         # [
+(?: [^\\\x80-\xff\n\015\[\]] |  \\ [^\x80-\xff]  )*    #    stuff
+\]                        #           ]
+)                     #   ...further okay
+)*
+# address
+|                     #  or
+(?:
+[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+    # some number of atom characters...
+(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
+|
+" (?:                      # opening quote...
+[^\\\x80-\xff\n\015"]                #   Anything except backslash and quote
+|                     #    or
+\\ [^\x80-\xff]           #   Escaped something (something != CR)
+)* "  # closing quote
+)             # one word, optionally followed by....
+(?:
+[^()<>@,;:".\\\[\]\x80-\xff\000-\010\012-\037]  |  # atom and space parts, or...
+\(
+(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
+\)       |  # comments, or...
+
+" (?:                      # opening quote...
+[^\\\x80-\xff\n\015"]                #   Anything except backslash and quote
+|                     #    or
+\\ [^\x80-\xff]           #   Escaped something (something != CR)
+)* "  # closing quote
+# quoted strings
+)*
+<  (?: [\040\t] |  \(
+(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
+\)  )*                     # leading <
+(?:  @  (?: [\040\t] |  \(
+(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
+\)  )*    (?:
+[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+    # some number of atom characters...
+(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
+|   \[                         # [
+(?: [^\\\x80-\xff\n\015\[\]] |  \\ [^\x80-\xff]  )*    #    stuff
+\]                        #           ]
+)                           # initial subdomain
+(?:                                  #
+(?: [\040\t] |  \(
+(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
+\)  )*  \.                        # if led by a period...
+(?: [\040\t] |  \(
+(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
+\)  )*   (?:
+[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+    # some number of atom characters...
+(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
+|   \[                         # [
+(?: [^\\\x80-\xff\n\015\[\]] |  \\ [^\x80-\xff]  )*    #    stuff
+\]                        #           ]
+)                     #   ...further okay
+)*
+
+(?:  (?: [\040\t] |  \(
+(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
+\)  )*  ,  (?: [\040\t] |  \(
+(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
+\)  )*  @  (?: [\040\t] |  \(
+(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
+\)  )*    (?:
+[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+    # some number of atom characters...
+(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
+|   \[                         # [
+(?: [^\\\x80-\xff\n\015\[\]] |  \\ [^\x80-\xff]  )*    #    stuff
+\]                        #           ]
+)                           # initial subdomain
+(?:                                  #
+(?: [\040\t] |  \(
+(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
+\)  )*  \.                        # if led by a period...
+(?: [\040\t] |  \(
+(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
+\)  )*   (?:
+[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+    # some number of atom characters...
+(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
+|   \[                         # [
+(?: [^\\\x80-\xff\n\015\[\]] |  \\ [^\x80-\xff]  )*    #    stuff
+\]                        #           ]
+)                     #   ...further okay
+)*
+)* # further okay, if led by comma
+:                                # closing colon
+(?: [\040\t] |  \(
+(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
+\)  )*  )? #       optional route
+(?:
+[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+    # some number of atom characters...
+(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
+|
+" (?:                      # opening quote...
+[^\\\x80-\xff\n\015"]                #   Anything except backslash and quote
+|                     #    or
+\\ [^\x80-\xff]           #   Escaped something (something != CR)
+)* "  # closing quote
+)                    # initial word
+(?:  (?: [\040\t] |  \(
+(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
+\)  )*  \.  (?: [\040\t] |  \(
+(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
+\)  )*   (?:
+[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+    # some number of atom characters...
+(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
+|
+" (?:                      # opening quote...
+[^\\\x80-\xff\n\015"]                #   Anything except backslash and quote
+|                     #    or
+\\ [^\x80-\xff]           #   Escaped something (something != CR)
+)* "  # closing quote
+)  )* # further okay, if led by a period
+(?: [\040\t] |  \(
+(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
+\)  )*  @  (?: [\040\t] |  \(
+(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
+\)  )*    (?:
+[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+    # some number of atom characters...
+(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
+|   \[                         # [
+(?: [^\\\x80-\xff\n\015\[\]] |  \\ [^\x80-\xff]  )*    #    stuff
+\]                        #           ]
+)                           # initial subdomain
+(?:                                  #
+(?: [\040\t] |  \(
+(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
+\)  )*  \.                        # if led by a period...
+(?: [\040\t] |  \(
+(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
+\)  )*   (?:
+[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+    # some number of atom characters...
+(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
+|   \[                         # [
+(?: [^\\\x80-\xff\n\015\[\]] |  \\ [^\x80-\xff]  )*    #    stuff
+\]                        #           ]
+)                     #   ...further okay
+)*
+#       address spec
+(?: [\040\t] |  \(
+(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
+\)  )*  > #                  trailing >
+# name and address
+)  (?: [\040\t] |  \(
+(?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  |  \( (?:  [^\\\x80-\xff\n\015()]  |  \\ [^\x80-\xff]  )* \)  )*
+\)  )*                       # optional trailing comment
+/x
+    Alan Other <user\@dom.ain>
+ 0: Alan Other <user@dom.ain>
+    <user\@dom.ain>
+ 0: user@dom.ain
+ 1: user@dom
+    user\@dom.ain
+ 0: user@dom.ain
+ 1: user@dom
+    \"A. Other\" <user.1234\@dom.ain> (a comment)
+ 0: "A. Other" <user.1234@dom.ain> (a comment)
+ 1: "A. Other" <user.1234@dom.ain> 
+ 2: "A. Other" <user.1234@dom.ain>
+    A. Other <user.1234\@dom.ain> (a comment)
+ 0:  Other <user.1234@dom.ain> (a comment)
+ 1:  Other <user.1234@dom.ain> 
+ 2:  Other <user.1234@dom.ain>
+    \"/s=user/ou=host/o=place/prmd=uu.yy/admd= /c=gb/\"\@x400-re.lay
+ 0: "/s=user/ou=host/o=place/prmd=uu.yy/admd= /c=gb/"@x400-re.lay
+ 1: "/s=user/ou=host/o=place/prmd=uu.yy/admd= /c=gb/"@x400-re
+    A missing angle <user\@some.where
+ 0: user@some.where
+ 1: user@some
+    *** Failers
+No match
+    The quick brown fox
+No match
+
+/[\040\t]*                    # Nab whitespace.
+(?:
+\(                              #  (
+[^\\\x80-\xff\n\015()] *                             #     normal*
+(?:                                 #       (
+(?:  \\ [^\x80-\xff]  |
+\(                            #  (
+[^\\\x80-\xff\n\015()] *                            #     normal*
+(?:  \\ [^\x80-\xff]   [^\\\x80-\xff\n\015()] * )*        #     (special normal*)*
+\)                           #                       )
+)    #         special
+[^\\\x80-\xff\n\015()] *                         #         normal*
+)*                                  #            )*
+\)                             #                )
+[\040\t]* )*    # If comment found, allow more spaces.
+# optional leading comment
+(?:
+(?:
+[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+    # some number of atom characters...
+(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
+# Atom
+|                       #  or
+"                                     # "
+[^\\\x80-\xff\n\015"] *                            #   normal
+(?:  \\ [^\x80-\xff]  [^\\\x80-\xff\n\015"] * )*        #   ( special normal* )*
+"                                     #        "
+# Quoted string
+)
+[\040\t]*                    # Nab whitespace.
+(?:
+\(                              #  (
+[^\\\x80-\xff\n\015()] *                             #     normal*
+(?:                                 #       (
+(?:  \\ [^\x80-\xff]  |
+\(                            #  (
+[^\\\x80-\xff\n\015()] *                            #     normal*
+(?:  \\ [^\x80-\xff]   [^\\\x80-\xff\n\015()] * )*        #     (special normal*)*
+\)                           #                       )
+)    #         special
+[^\\\x80-\xff\n\015()] *                         #         normal*
+)*                                  #            )*
+\)                             #                )
+[\040\t]* )*    # If comment found, allow more spaces.
+(?:
+\.
+[\040\t]*                    # Nab whitespace.
+(?:
+\(                              #  (
+[^\\\x80-\xff\n\015()] *                             #     normal*
+(?:                                 #       (
+(?:  \\ [^\x80-\xff]  |
+\(                            #  (
+[^\\\x80-\xff\n\015()] *                            #     normal*
+(?:  \\ [^\x80-\xff]   [^\\\x80-\xff\n\015()] * )*        #     (special normal*)*
+\)                           #                       )
+)    #         special
+[^\\\x80-\xff\n\015()] *                         #         normal*
+)*                                  #            )*
+\)                             #                )
+[\040\t]* )*    # If comment found, allow more spaces.
+(?:
+[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+    # some number of atom characters...
+(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
+# Atom
+|                       #  or
+"                                     # "
+[^\\\x80-\xff\n\015"] *                            #   normal
+(?:  \\ [^\x80-\xff]  [^\\\x80-\xff\n\015"] * )*        #   ( special normal* )*
+"                                     #        "
+# Quoted string
+)
+[\040\t]*                    # Nab whitespace.
+(?:
+\(                              #  (
+[^\\\x80-\xff\n\015()] *                             #     normal*
+(?:                                 #       (
+(?:  \\ [^\x80-\xff]  |
+\(                            #  (
+[^\\\x80-\xff\n\015()] *                            #     normal*
+(?:  \\ [^\x80-\xff]   [^\\\x80-\xff\n\015()] * )*        #     (special normal*)*
+\)                           #                       )
+)    #         special
+[^\\\x80-\xff\n\015()] *                         #         normal*
+)*                                  #            )*
+\)                             #                )
+[\040\t]* )*    # If comment found, allow more spaces.
+# additional words
+)*
+@
+[\040\t]*                    # Nab whitespace.
+(?:
+\(                              #  (
+[^\\\x80-\xff\n\015()] *                             #     normal*
+(?:                                 #       (
+(?:  \\ [^\x80-\xff]  |
+\(                            #  (
+[^\\\x80-\xff\n\015()] *                            #     normal*
+(?:  \\ [^\x80-\xff]   [^\\\x80-\xff\n\015()] * )*        #     (special normal*)*
+\)                           #                       )
+)    #         special
+[^\\\x80-\xff\n\015()] *                         #         normal*
+)*                                  #            )*
+\)                             #                )
+[\040\t]* )*    # If comment found, allow more spaces.
+(?:
+[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+    # some number of atom characters...
+(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
+|
+\[                            # [
+(?: [^\\\x80-\xff\n\015\[\]] |  \\ [^\x80-\xff]  )*     #    stuff
+\]                           #           ]
+)
+[\040\t]*                    # Nab whitespace.
+(?:
+\(                              #  (
+[^\\\x80-\xff\n\015()] *                             #     normal*
+(?:                                 #       (
+(?:  \\ [^\x80-\xff]  |
+\(                            #  (
+[^\\\x80-\xff\n\015()] *                            #     normal*
+(?:  \\ [^\x80-\xff]   [^\\\x80-\xff\n\015()] * )*        #     (special normal*)*
+\)                           #                       )
+)    #         special
+[^\\\x80-\xff\n\015()] *                         #         normal*
+)*                                  #            )*
+\)                             #                )
+[\040\t]* )*    # If comment found, allow more spaces.
+# optional trailing comments
+(?:
+\.
+[\040\t]*                    # Nab whitespace.
+(?:
+\(                              #  (
+[^\\\x80-\xff\n\015()] *                             #     normal*
+(?:                                 #       (
+(?:  \\ [^\x80-\xff]  |
+\(                            #  (
+[^\\\x80-\xff\n\015()] *                            #     normal*
+(?:  \\ [^\x80-\xff]   [^\\\x80-\xff\n\015()] * )*        #     (special normal*)*
+\)                           #                       )
+)    #         special
+[^\\\x80-\xff\n\015()] *                         #         normal*
+)*                                  #            )*
+\)                             #                )
+[\040\t]* )*    # If comment found, allow more spaces.
+(?:
+[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+    # some number of atom characters...
+(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
+|
+\[                            # [
+(?: [^\\\x80-\xff\n\015\[\]] |  \\ [^\x80-\xff]  )*     #    stuff
+\]                           #           ]
+)
+[\040\t]*                    # Nab whitespace.
+(?:
+\(                              #  (
+[^\\\x80-\xff\n\015()] *                             #     normal*
+(?:                                 #       (
+(?:  \\ [^\x80-\xff]  |
+\(                            #  (
+[^\\\x80-\xff\n\015()] *                            #     normal*
+(?:  \\ [^\x80-\xff]   [^\\\x80-\xff\n\015()] * )*        #     (special normal*)*
+\)                           #                       )
+)    #         special
+[^\\\x80-\xff\n\015()] *                         #         normal*
+)*                                  #            )*
+\)                             #                )
+[\040\t]* )*    # If comment found, allow more spaces.
+# optional trailing comments
+)*
+# address
+|                             #  or
+(?:
+[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+    # some number of atom characters...
+(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
+# Atom
+|                       #  or
+"                                     # "
+[^\\\x80-\xff\n\015"] *                            #   normal
+(?:  \\ [^\x80-\xff]  [^\\\x80-\xff\n\015"] * )*        #   ( special normal* )*
+"                                     #        "
+# Quoted string
+)
+# leading word
+[^()<>@,;:".\\\[\]\x80-\xff\000-\010\012-\037] *               # "normal" atoms and or spaces
+(?:
+(?:
+\(                              #  (
+[^\\\x80-\xff\n\015()] *                             #     normal*
+(?:                                 #       (
+(?:  \\ [^\x80-\xff]  |
+\(                            #  (
+[^\\\x80-\xff\n\015()] *                            #     normal*
+(?:  \\ [^\x80-\xff]   [^\\\x80-\xff\n\015()] * )*        #     (special normal*)*
+\)                           #                       )
+)    #         special
+[^\\\x80-\xff\n\015()] *                         #         normal*
+)*                                  #            )*
+\)                             #                )
+|
+"                                     # "
+[^\\\x80-\xff\n\015"] *                            #   normal
+(?:  \\ [^\x80-\xff]  [^\\\x80-\xff\n\015"] * )*        #   ( special normal* )*
+"                                     #        "
+) # "special" comment or quoted string
+[^()<>@,;:".\\\[\]\x80-\xff\000-\010\012-\037] *            #  more "normal"
+)*
+<
+[\040\t]*                    # Nab whitespace.
+(?:
+\(                              #  (
+[^\\\x80-\xff\n\015()] *                             #     normal*
+(?:                                 #       (
+(?:  \\ [^\x80-\xff]  |
+\(                            #  (
+[^\\\x80-\xff\n\015()] *                            #     normal*
+(?:  \\ [^\x80-\xff]   [^\\\x80-\xff\n\015()] * )*        #     (special normal*)*
+\)                           #                       )
+)    #         special
+[^\\\x80-\xff\n\015()] *                         #         normal*
+)*                                  #            )*
+\)                             #                )
+[\040\t]* )*    # If comment found, allow more spaces.
+# <
+(?:
+@
+[\040\t]*                    # Nab whitespace.
+(?:
+\(                              #  (
+[^\\\x80-\xff\n\015()] *                             #     normal*
+(?:                                 #       (
+(?:  \\ [^\x80-\xff]  |
+\(                            #  (
+[^\\\x80-\xff\n\015()] *                            #     normal*
+(?:  \\ [^\x80-\xff]   [^\\\x80-\xff\n\015()] * )*        #     (special normal*)*
+\)                           #                       )
+)    #         special
+[^\\\x80-\xff\n\015()] *                         #         normal*
+)*                                  #            )*
+\)                             #                )
+[\040\t]* )*    # If comment found, allow more spaces.
+(?:
+[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+    # some number of atom characters...
+(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
+|
+\[                            # [
+(?: [^\\\x80-\xff\n\015\[\]] |  \\ [^\x80-\xff]  )*     #    stuff
+\]                           #           ]
+)
+[\040\t]*                    # Nab whitespace.
+(?:
+\(                              #  (
+[^\\\x80-\xff\n\015()] *                             #     normal*
+(?:                                 #       (
+(?:  \\ [^\x80-\xff]  |
+\(                            #  (
+[^\\\x80-\xff\n\015()] *                            #     normal*
+(?:  \\ [^\x80-\xff]   [^\\\x80-\xff\n\015()] * )*        #     (special normal*)*
+\)                           #                       )
+)    #         special
+[^\\\x80-\xff\n\015()] *                         #         normal*
+)*                                  #            )*
+\)                             #                )
+[\040\t]* )*    # If comment found, allow more spaces.
+# optional trailing comments
+(?:
+\.
+[\040\t]*                    # Nab whitespace.
+(?:
+\(                              #  (
+[^\\\x80-\xff\n\015()] *                             #     normal*
+(?:                                 #       (
+(?:  \\ [^\x80-\xff]  |
+\(                            #  (
+[^\\\x80-\xff\n\015()] *                            #     normal*
+(?:  \\ [^\x80-\xff]   [^\\\x80-\xff\n\015()] * )*        #     (special normal*)*
+\)                           #                       )
+)    #         special
+[^\\\x80-\xff\n\015()] *                         #         normal*
+)*                                  #            )*
+\)                             #                )
+[\040\t]* )*    # If comment found, allow more spaces.
+(?:
+[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+    # some number of atom characters...
+(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
+|
+\[                            # [
+(?: [^\\\x80-\xff\n\015\[\]] |  \\ [^\x80-\xff]  )*     #    stuff
+\]                           #           ]
+)
+[\040\t]*                    # Nab whitespace.
+(?:
+\(                              #  (
+[^\\\x80-\xff\n\015()] *                             #     normal*
+(?:                                 #       (
+(?:  \\ [^\x80-\xff]  |
+\(                            #  (
+[^\\\x80-\xff\n\015()] *                            #     normal*
+(?:  \\ [^\x80-\xff]   [^\\\x80-\xff\n\015()] * )*        #     (special normal*)*
+\)                           #                       )
+)    #         special
+[^\\\x80-\xff\n\015()] *                         #         normal*
+)*                                  #            )*
+\)                             #                )
+[\040\t]* )*    # If comment found, allow more spaces.
+# optional trailing comments
+)*
+(?: ,
+[\040\t]*                    # Nab whitespace.
+(?:
+\(                              #  (
+[^\\\x80-\xff\n\015()] *                             #     normal*
+(?:                                 #       (
+(?:  \\ [^\x80-\xff]  |
+\(                            #  (
+[^\\\x80-\xff\n\015()] *                            #     normal*
+(?:  \\ [^\x80-\xff]   [^\\\x80-\xff\n\015()] * )*        #     (special normal*)*
+\)                           #                       )
+)    #         special
+[^\\\x80-\xff\n\015()] *                         #         normal*
+)*                                  #            )*
+\)                             #                )
+[\040\t]* )*    # If comment found, allow more spaces.
+@
+[\040\t]*                    # Nab whitespace.
+(?:
+\(                              #  (
+[^\\\x80-\xff\n\015()] *                             #     normal*
+(?:                                 #       (
+(?:  \\ [^\x80-\xff]  |
+\(                            #  (
+[^\\\x80-\xff\n\015()] *                            #     normal*
+(?:  \\ [^\x80-\xff]   [^\\\x80-\xff\n\015()] * )*        #     (special normal*)*
+\)                           #                       )
+)    #         special
+[^\\\x80-\xff\n\015()] *                         #         normal*
+)*                                  #            )*
+\)                             #                )
+[\040\t]* )*    # If comment found, allow more spaces.
+(?:
+[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+    # some number of atom characters...
+(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
+|
+\[                            # [
+(?: [^\\\x80-\xff\n\015\[\]] |  \\ [^\x80-\xff]  )*     #    stuff
+\]                           #           ]
+)
+[\040\t]*                    # Nab whitespace.
+(?:
+\(                              #  (
+[^\\\x80-\xff\n\015()] *                             #     normal*
+(?:                                 #       (
+(?:  \\ [^\x80-\xff]  |
+\(                            #  (
+[^\\\x80-\xff\n\015()] *                            #     normal*
+(?:  \\ [^\x80-\xff]   [^\\\x80-\xff\n\015()] * )*        #     (special normal*)*
+\)                           #                       )
+)    #         special
+[^\\\x80-\xff\n\015()] *                         #         normal*
+)*                                  #            )*
+\)                             #                )
+[\040\t]* )*    # If comment found, allow more spaces.
+# optional trailing comments
+(?:
+\.
+[\040\t]*                    # Nab whitespace.
+(?:
+\(                              #  (
+[^\\\x80-\xff\n\015()] *                             #     normal*
+(?:                                 #       (
+(?:  \\ [^\x80-\xff]  |
+\(                            #  (
+[^\\\x80-\xff\n\015()] *                            #     normal*
+(?:  \\ [^\x80-\xff]   [^\\\x80-\xff\n\015()] * )*        #     (special normal*)*
+\)                           #                       )
+)    #         special
+[^\\\x80-\xff\n\015()] *                         #         normal*
+)*                                  #            )*
+\)                             #                )
+[\040\t]* )*    # If comment found, allow more spaces.
+(?:
+[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+    # some number of atom characters...
+(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
+|
+\[                            # [
+(?: [^\\\x80-\xff\n\015\[\]] |  \\ [^\x80-\xff]  )*     #    stuff
+\]                           #           ]
+)
+[\040\t]*                    # Nab whitespace.
+(?:
+\(                              #  (
+[^\\\x80-\xff\n\015()] *                             #     normal*
+(?:                                 #       (
+(?:  \\ [^\x80-\xff]  |
+\(                            #  (
+[^\\\x80-\xff\n\015()] *                            #     normal*
+(?:  \\ [^\x80-\xff]   [^\\\x80-\xff\n\015()] * )*        #     (special normal*)*
+\)                           #                       )
+)    #         special
+[^\\\x80-\xff\n\015()] *                         #         normal*
+)*                                  #            )*
+\)                             #                )
+[\040\t]* )*    # If comment found, allow more spaces.
+# optional trailing comments
+)*
+)*  # additional domains
+:
+[\040\t]*                    # Nab whitespace.
+(?:
+\(                              #  (
+[^\\\x80-\xff\n\015()] *                             #     normal*
+(?:                                 #       (
+(?:  \\ [^\x80-\xff]  |
+\(                            #  (
+[^\\\x80-\xff\n\015()] *                            #     normal*
+(?:  \\ [^\x80-\xff]   [^\\\x80-\xff\n\015()] * )*        #     (special normal*)*
+\)                           #                       )
+)    #         special
+[^\\\x80-\xff\n\015()] *                         #         normal*
+)*                                  #            )*
+\)                             #                )
+[\040\t]* )*    # If comment found, allow more spaces.
+# optional trailing comments
+)?     #       optional route
+(?:
+[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+    # some number of atom characters...
+(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
+# Atom
+|                       #  or
+"                                     # "
+[^\\\x80-\xff\n\015"] *                            #   normal
+(?:  \\ [^\x80-\xff]  [^\\\x80-\xff\n\015"] * )*        #   ( special normal* )*
+"                                     #        "
+# Quoted string
+)
+[\040\t]*                    # Nab whitespace.
+(?:
+\(                              #  (
+[^\\\x80-\xff\n\015()] *                             #     normal*
+(?:                                 #       (
+(?:  \\ [^\x80-\xff]  |
+\(                            #  (
+[^\\\x80-\xff\n\015()] *                            #     normal*
+(?:  \\ [^\x80-\xff]   [^\\\x80-\xff\n\015()] * )*        #     (special normal*)*
+\)                           #                       )
+)    #         special
+[^\\\x80-\xff\n\015()] *                         #         normal*
+)*                                  #            )*
+\)                             #                )
+[\040\t]* )*    # If comment found, allow more spaces.
+(?:
+\.
+[\040\t]*                    # Nab whitespace.
+(?:
+\(                              #  (
+[^\\\x80-\xff\n\015()] *                             #     normal*
+(?:                                 #       (
+(?:  \\ [^\x80-\xff]  |
+\(                            #  (
+[^\\\x80-\xff\n\015()] *                            #     normal*
+(?:  \\ [^\x80-\xff]   [^\\\x80-\xff\n\015()] * )*        #     (special normal*)*
+\)                           #                       )
+)    #         special
+[^\\\x80-\xff\n\015()] *                         #         normal*
+)*                                  #            )*
+\)                             #                )
+[\040\t]* )*    # If comment found, allow more spaces.
+(?:
+[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+    # some number of atom characters...
+(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
+# Atom
+|                       #  or
+"                                     # "
+[^\\\x80-\xff\n\015"] *                            #   normal
+(?:  \\ [^\x80-\xff]  [^\\\x80-\xff\n\015"] * )*        #   ( special normal* )*
+"                                     #        "
+# Quoted string
+)
+[\040\t]*                    # Nab whitespace.
+(?:
+\(                              #  (
+[^\\\x80-\xff\n\015()] *                             #     normal*
+(?:                                 #       (
+(?:  \\ [^\x80-\xff]  |
+\(                            #  (
+[^\\\x80-\xff\n\015()] *                            #     normal*
+(?:  \\ [^\x80-\xff]   [^\\\x80-\xff\n\015()] * )*        #     (special normal*)*
+\)                           #                       )
+)    #         special
+[^\\\x80-\xff\n\015()] *                         #         normal*
+)*                                  #            )*
+\)                             #                )
+[\040\t]* )*    # If comment found, allow more spaces.
+# additional words
+)*
+@
+[\040\t]*                    # Nab whitespace.
+(?:
+\(                              #  (
+[^\\\x80-\xff\n\015()] *                             #     normal*
+(?:                                 #       (
+(?:  \\ [^\x80-\xff]  |
+\(                            #  (
+[^\\\x80-\xff\n\015()] *                            #     normal*
+(?:  \\ [^\x80-\xff]   [^\\\x80-\xff\n\015()] * )*        #     (special normal*)*
+\)                           #                       )
+)    #         special
+[^\\\x80-\xff\n\015()] *                         #         normal*
+)*                                  #            )*
+\)                             #                )
+[\040\t]* )*    # If comment found, allow more spaces.
+(?:
+[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+    # some number of atom characters...
+(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
+|
+\[                            # [
+(?: [^\\\x80-\xff\n\015\[\]] |  \\ [^\x80-\xff]  )*     #    stuff
+\]                           #           ]
+)
+[\040\t]*                    # Nab whitespace.
+(?:
+\(                              #  (
+[^\\\x80-\xff\n\015()] *                             #     normal*
+(?:                                 #       (
+(?:  \\ [^\x80-\xff]  |
+\(                            #  (
+[^\\\x80-\xff\n\015()] *                            #     normal*
+(?:  \\ [^\x80-\xff]   [^\\\x80-\xff\n\015()] * )*        #     (special normal*)*
+\)                           #                       )
+)    #         special
+[^\\\x80-\xff\n\015()] *                         #         normal*
+)*                                  #            )*
+\)                             #                )
+[\040\t]* )*    # If comment found, allow more spaces.
+# optional trailing comments
+(?:
+\.
+[\040\t]*                    # Nab whitespace.
+(?:
+\(                              #  (
+[^\\\x80-\xff\n\015()] *                             #     normal*
+(?:                                 #       (
+(?:  \\ [^\x80-\xff]  |
+\(                            #  (
+[^\\\x80-\xff\n\015()] *                            #     normal*
+(?:  \\ [^\x80-\xff]   [^\\\x80-\xff\n\015()] * )*        #     (special normal*)*
+\)                           #                       )
+)    #         special
+[^\\\x80-\xff\n\015()] *                         #         normal*
+)*                                  #            )*
+\)                             #                )
+[\040\t]* )*    # If comment found, allow more spaces.
+(?:
+[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+    # some number of atom characters...
+(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
+|
+\[                            # [
+(?: [^\\\x80-\xff\n\015\[\]] |  \\ [^\x80-\xff]  )*     #    stuff
+\]                           #           ]
+)
+[\040\t]*                    # Nab whitespace.
+(?:
+\(                              #  (
+[^\\\x80-\xff\n\015()] *                             #     normal*
+(?:                                 #       (
+(?:  \\ [^\x80-\xff]  |
+\(                            #  (
+[^\\\x80-\xff\n\015()] *                            #     normal*
+(?:  \\ [^\x80-\xff]   [^\\\x80-\xff\n\015()] * )*        #     (special normal*)*
+\)                           #                       )
+)    #         special
+[^\\\x80-\xff\n\015()] *                         #         normal*
+)*                                  #            )*
+\)                             #                )
+[\040\t]* )*    # If comment found, allow more spaces.
+# optional trailing comments
+)*
+#       address spec
+>                    #                 >
+# name and address
+)
+/x
+    Alan Other <user\@dom.ain>
+ 0: Alan Other <user@dom.ain>
+    <user\@dom.ain>
+ 0: user@dom.ain
+ 1: user@dom
+    user\@dom.ain
+ 0: user@dom.ain
+ 1: user@dom
+    \"A. Other\" <user.1234\@dom.ain> (a comment)
+ 0: "A. Other" <user.1234@dom.ain>
+    A. Other <user.1234\@dom.ain> (a comment)
+ 0:  Other <user.1234@dom.ain>
+    \"/s=user/ou=host/o=place/prmd=uu.yy/admd= /c=gb/\"\@x400-re.lay
+ 0: "/s=user/ou=host/o=place/prmd=uu.yy/admd= /c=gb/"@x400-re.lay
+ 1: "/s=user/ou=host/o=place/prmd=uu.yy/admd= /c=gb/"@x400-re
+    A missing angle <user\@some.where
+ 0: user@some.where
+ 1: user@some
+    *** Failers
+No match
+    The quick brown fox
+No match
+
+/abc\0def\00pqr\000xyz\0000AB/
+    abc\0def\00pqr\000xyz\0000AB
+ 0: abc\x00def\x00pqr\x00xyz\x000AB
+    abc456 abc\0def\00pqr\000xyz\0000ABCDE
+ 0: abc\x00def\x00pqr\x00xyz\x000AB
+
+/abc\x0def\x00pqr\x000xyz\x0000AB/
+    abc\x0def\x00pqr\x000xyz\x0000AB
+ 0: abc\x0def\x00pqr\x000xyz\x0000AB
+    abc456 abc\x0def\x00pqr\x000xyz\x0000ABCDE
+ 0: abc\x0def\x00pqr\x000xyz\x0000AB
+
+/^[\000-\037]/
+    \0A
+ 0: \x00
+    \01B
+ 0: \x01
+    \037C
+ 0: \x1f
+
+/\0*/
+    \0\0\0\0
+ 0: \x00\x00\x00\x00
+ 1: \x00\x00\x00
+ 2: \x00\x00
+ 3: \x00
+ 4: 
+
+/A\x0{2,3}Z/
+    The A\x0\x0Z
+ 0: A\x00\x00Z
+    An A\0\x0\0Z
+ 0: A\x00\x00\x00Z
+    *** Failers
+No match
+    A\0Z
+No match
+    A\0\x0\0\x0Z
+No match
+
+/^\s/
+    \040abc
+ 0:  
+    \x0cabc
+ 0: \x0c
+    \nabc
+ 0: \x0a
+    \rabc
+ 0: \x0d
+    \tabc
+ 0: \x09
+    *** Failers
+No match
+    abc
+No match
+
+/^a    b
+    ?  c/x
+    abc
+ 0: abc
+
+/ab{1,3}bc/
+    abbbbc
+ 0: abbbbc
+    abbbc
+ 0: abbbc
+    abbc
+ 0: abbc
+    *** Failers
+No match
+    abc
+No match
+    abbbbbc
+No match
+
+/([^.]*)\.([^:]*):[T ]+(.*)/
+    track1.title:TBlah blah blah
+ 0: track1.title:TBlah blah blah
+ 1: track1.title:TBlah blah bla
+ 2: track1.title:TBlah blah bl
+ 3: track1.title:TBlah blah b
+ 4: track1.title:TBlah blah 
+ 5: track1.title:TBlah blah
+ 6: track1.title:TBlah bla
+ 7: track1.title:TBlah bl
+ 8: track1.title:TBlah b
+ 9: track1.title:TBlah 
+10: track1.title:TBlah
+11: track1.title:TBla
+12: track1.title:TBl
+13: track1.title:TB
+14: track1.title:T
+
+/([^.]*)\.([^:]*):[T ]+(.*)/i
+    track1.title:TBlah blah blah
+ 0: track1.title:TBlah blah blah
+ 1: track1.title:TBlah blah bla
+ 2: track1.title:TBlah blah bl
+ 3: track1.title:TBlah blah b
+ 4: track1.title:TBlah blah 
+ 5: track1.title:TBlah blah
+ 6: track1.title:TBlah bla
+ 7: track1.title:TBlah bl
+ 8: track1.title:TBlah b
+ 9: track1.title:TBlah 
+10: track1.title:TBlah
+11: track1.title:TBla
+12: track1.title:TBl
+13: track1.title:TB
+14: track1.title:T
+
+/([^.]*)\.([^:]*):[t ]+(.*)/i
+    track1.title:TBlah blah blah
+ 0: track1.title:TBlah blah blah
+ 1: track1.title:TBlah blah bla
+ 2: track1.title:TBlah blah bl
+ 3: track1.title:TBlah blah b
+ 4: track1.title:TBlah blah 
+ 5: track1.title:TBlah blah
+ 6: track1.title:TBlah bla
+ 7: track1.title:TBlah bl
+ 8: track1.title:TBlah b
+ 9: track1.title:TBlah 
+10: track1.title:TBlah
+11: track1.title:TBla
+12: track1.title:TBl
+13: track1.title:TB
+14: track1.title:T
+
+/^[W-c]+$/
+    WXY_^abc
+ 0: WXY_^abc
+    *** Failers
+No match
+    wxy
+No match
+
+/^[W-c]+$/i
+    WXY_^abc
+ 0: WXY_^abc
+    wxy_^ABC
+ 0: wxy_^ABC
+
+/^[\x3f-\x5F]+$/i
+    WXY_^abc
+ 0: WXY_^abc
+    wxy_^ABC
+ 0: wxy_^ABC
+
+/^abc$/m
+    abc
+ 0: abc
+    qqq\nabc
+ 0: abc
+    abc\nzzz
+ 0: abc
+    qqq\nabc\nzzz
+ 0: abc
+
+/^abc$/
+    abc
+ 0: abc
+    *** Failers
+No match
+    qqq\nabc
+No match
+    abc\nzzz
+No match
+    qqq\nabc\nzzz
+No match
+
+/\Aabc\Z/m
+    abc
+ 0: abc
+    abc\n 
+ 0: abc
+    *** Failers
+No match
+    qqq\nabc
+No match
+    abc\nzzz
+No match
+    qqq\nabc\nzzz
+No match

-/(?:\x{100}){3}b/8
-    \x{100}\x{100}\x{100}b
- 0: \x{100}\x{100}\x{100}b
-    *** Failers 
+/\A(.)*\Z/s
+    abc\ndef
+ 0: abc\x0adef
+
+/\A(.)*\Z/m
+    *** Failers
+ 0: *** Failers
+    abc\ndef
 No match
-    \x{100}\x{100}b
+
+/(?:b)|(?::+)/
+    b::c
+ 0: b
+    c::b
+ 0: ::
+ 1: :
+
+/[-az]+/
+    az-
+ 0: az-
+ 1: az
+ 2: a
+    *** Failers
+ 0: a
+    b
 No match

-/\x{ab}/8
-    \x{ab} 
- 0: \x{ab}
-    \xc2\xab
- 0: \x{ab}
-    *** Failers 
+/[az-]+/
+    za-
+ 0: za-
+ 1: za
+ 2: z
+    *** Failers
+ 0: a
+    b
 No match
-    \x00{ab}
+
+/[a\-z]+/
+    a-z
+ 0: a-z
+ 1: a-
+ 2: a
+    *** Failers
+ 0: a
+    b
 No match

-/(?<=(.))X/8
-    WXYZ
- 0: X
-    \x{256}XYZ 
- 0: X
+/[a-z]+/
+    abcdxyz
+ 0: abcdxyz
+ 1: abcdxy
+ 2: abcdx
+ 3: abcd
+ 4: abc
+ 5: ab
+ 6: a
+
+/[\d-]+/
+    12-34
+ 0: 12-34
+ 1: 12-3
+ 2: 12-
+ 3: 12
+ 4: 1
     *** Failers
 No match
-    XYZ 
+    aaa
 No match

-/[^a]+/8g
-    bcd
- 0: bcd
- 1: bc
- 2: b
-    \x{100}aY\x{256}Z 
- 0: \x{100}
- 0: Y\x{256}Z
- 1: Y\x{256}
- 2: Y
+/[\d-z]+/
+    12-34z
+ 0: 12-34z
+ 1: 12-34
+ 2: 12-3
+ 3: 12-
+ 4: 12
+ 5: 1
+    *** Failers
+No match
+    aaa
+No match
+
+/\x5c/
+    \\
+ 0: \
+
+/\x20Z/
+    the Zoo
+ 0:  Z
+    *** Failers
+No match
+    Zulu
+No match
+
+/ab{3cd/
+    ab{3cd
+ 0: ab{3cd
+
+/ab{3,cd/
+    ab{3,cd
+ 0: ab{3,cd
+
+/ab{3,4a}cd/
+    ab{3,4a}cd
+ 0: ab{3,4a}cd
+
+/{4,5a}bc/
+    {4,5a}bc
+ 0: {4,5a}bc
+
+/^a.b/<lf>
+    a\rb
+ 0: a\x0db
+    *** Failers
+No match
+    a\nb
+No match
+
+/abc$/
+    abc
+ 0: abc
+    abc\n
+ 0: abc
+    *** Failers
+No match
+    abc\ndef
+No match
+
+/(abc)\123/
+    abc\x53
+ 0: abcS
+
+/(abc)\223/
+    abc\x93
+ 0: abc\x93
+
+/(abc)\323/
+    abc\xd3
+ 0: abc\xd3
+
+/(abc)\100/
+    abc\x40
+ 0: abc@
+    abc\100
+ 0: abc@
+
+/(abc)\1000/
+    abc\x400
+ 0: abc@0
+    abc\x40\x30
+ 0: abc@0
+    abc\1000
+ 0: abc@0
+    abc\100\x30
+ 0: abc@0
+    abc\100\060
+ 0: abc@0
+    abc\100\60
+ 0: abc@0
+
+/abc\81/
+    abc\081
+ 0: abc\x0081
+    abc\0\x38\x31
+ 0: abc\x0081
+
+/abc\91/
+    abc\091
+ 0: abc\x0091
+    abc\0\x39\x31
+ 0: abc\x0091
+
+/(a)(b)(c)(d)(e)(f)(g)(h)(i)(j)(k)\12\123/
+    abcdefghijk\12S
+ 0: abcdefghijk\x0aS
+
+/ab\idef/
+    abidef
+ 0: abidef
+
+/a{0}bc/
+    bc
+ 0: bc
+
+/(a|(bc)){0,0}?xyz/
+    xyz
+ 0: xyz
+
+/abc[\10]de/
+    abc\010de
+ 0: abc\x08de
+
+/abc[\1]de/
+    abc\1de
+ 0: abc\x01de
+
+/(abc)[\1]de/
+    abc\1de
+ 0: abc\x01de
+
+/(?s)a.b/
+    a\nb
+ 0: a\x0ab
+
+/^([^a])([^\b])([^c]*)([^d]{3,4})/
+    baNOTccccd
+ 0: baNOTcccc
+ 1: baNOTccc
+ 2: baNOTcc
+ 3: baNOTc
+ 4: baNOT
+    baNOTcccd
+ 0: baNOTccc
+ 1: baNOTcc
+ 2: baNOTc
+ 3: baNOT
+    baNOTccd
+ 0: baNOTcc
+ 1: baNOTc
+ 2: baNOT
+    bacccd
+ 0: baccc
+    *** Failers
+ 0: *** Failers
+ 1: *** Failer
+ 2: *** Faile
+ 3: *** Fail
+ 4: *** Fai
+ 5: *** Fa
+ 6: *** F
+    anything
+No match
+    b\bc   
+No match
+    baccd
+No match
+
+/[^a]/
+    Abc
+ 0: A
+  
+/[^a]/i
+    Abc 
+ 0: b
+
+/[^a]+/
+    AAAaAbc
+ 0: AAA
+ 1: AA
+ 2: A
+  
+/[^a]+/i
+    AAAaAbc 
+ 0: bc
+ 1: b
+
+/[^a]+/
+    bbb\nccc
+ 0: bbb\x0accc
+ 1: bbb\x0acc
+ 2: bbb\x0ac
+ 3: bbb\x0a
+ 4: bbb
+ 5: bb
+ 6: b
+   
+/[^k]$/
+    abc
+ 0: c
+    *** Failers
+ 0: s
+    abk   
+No match
+   
+/[^k]{2,3}$/
+    abc
+ 0: abc
+    kbc
+ 0: bc
+    kabc 
+ 0: abc
+    *** Failers
+ 0: ers
+    abk
+No match
+    akb
+No match
+    akk 
+No match
+
+/^\d{8,}\@.+[^k]$/
+    12345678\@a.b.c.d
+ 0: 12345678@a.b.c.d
+    123456789\@x.y.z
+ 0: 123456789@x.y.z
+    *** Failers
+No match
+    12345678\@x.y.uk
+No match
+    1234567\@a.b.c.d       
+No match
+
+/[^a]/
+    aaaabcd
+ 0: b
+    aaAabcd 
+ 0: A
+
+/[^a]/i
+    aaaabcd
+ 0: b
+    aaAabcd 
+ 0: b
+
+/[^az]/
+    aaaabcd
+ 0: b
+    aaAabcd 
+ 0: A
+
+/[^az]/i
+    aaaabcd
+ 0: b
+    aaAabcd 
+ 0: b
+
+/\000\001\002\003\004\005\006\007\010\011\012\013\014\015\016\017\020\021\022\023\024\025\026\027\030\031\032\033\034\035\036\037\040\041\042\043\044\045\046\047\050\051\052\053\054\055\056\057\060\061\062\063\064\065\066\067\070\071\072\073\074\075\076\077\100\101\102\103\104\105\106\107\110\111\112\113\114\115\116\117\120\121\122\123\124\125\126\127\130\131\132\133\134\135\136\137\140\141\142\143\144\145\146\147\150\151\152\153\154\155\156\157\160\161\162\163\164\165\166\167\170\171\172\173\174\175\176\177\200\201\202\203\204\205\206\207\210\211\212\213\214\215\216\217\220\221\222\223\224\225\226\227\230\231\232\233\234\235\236\237\240\241\242\243\244\245\246\247\250\251\252\253\254\255\256\257\260\261\262\263\264\265\266\267\270\271\272\273\274\275\276\277\300\301\302\303\304\305\306\307\310\311\312\313\314\315\316\317\320\321\322\323\324\325\326\327\330\331\332\333\334\335\336\337\340\341\342\343\344\345\346\347\350\351\352\353\354\355\356\357\360\361\362\363\364\365\366\367\370\371\372\373\374\375\376\377/
+ \000\001\002\003\004\005\006\007\010\011\012\013\014\015\016\017\020\021\022\023\024\025\026\027\030\031\032\033\034\035\036\037\040\041\042\043\044\045\046\047\050\051\052\053\054\055\056\057\060\061\062\063\064\065\066\067\070\071\072\073\074\075\076\077\100\101\102\103\104\105\106\107\110\111\112\113\114\115\116\117\120\121\122\123\124\125\126\127\130\131\132\133\134\135\136\137\140\141\142\143\144\145\146\147\150\151\152\153\154\155\156\157\160\161\162\163\164\165\166\167\170\171\172\173\174\175\176\177\200\201\202\203\204\205\206\207\210\211\212\213\214\215\216\217\220\221\222\223\224\225\226\227\230\231\232\233\234\235\236\237\240\241\242\243\244\245\246\247\250\251\252\253\254\255\256\257\260\261\262\263\264\265\266\267\270\271\272\273\274\275\276\277\300\301\302\303\304\305\306\307\310\311\312\313\314\315\316\317\320\321\322\323\324\325\326\327\330\331\332\333\334\335\336\337\340\341\342\343\344\345\346\347\350\351\352\353\354\355\356\357\360\361\362\363\364\365\366\367\370\371\372\373\374\375\376\377
+ 0: \x00\x01\x02\x03\x04\x05\x06\x07\x08\x09\x0a\x0b\x0c\x0d\x0e\x0f\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1a\x1b\x1c\x1d\x1e\x1f !"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyz{|}~\x7f\x80\x81\x82\x83\x84\x85\x86\x87\x88\x89\x8a\x8b\x8c\x8d\x8e\x8f\x90\x91\x92\x93\x94\x95\x96\x97\x98\x99\x9a\x9b\x9c\x9d\x9e\x9f\xa0\xa1\xa2\xa3\xa4\xa5\xa6\xa7\xa8\xa9\xaa\xab\xac\xad\xae\xaf\xb0\xb1\xb2\xb3\xb4\xb5\xb6\xb7\xb8\xb9\xba\xbb\xbc\xbd\xbe\xbf\xc0\xc1\xc2\xc3\xc4\xc5\xc6\xc7\xc8\xc9\xca\xcb\xcc\xcd\xce\xcf\xd0\xd1\xd2\xd3\xd4\xd5\xd6\xd7\xd8\xd9\xda\xdb\xdc\xdd\xde\xdf\xe0\xe1\xe2\xe3\xe4\xe5\xe6\xe7\xe8\xe9\xea\xeb\xec\xed\xee\xef\xf0\xf1\xf2\xf3\xf4\xf5\xf6\xf7\xf8\xf9\xfa\xfb\xfc\xfd\xfe\xff
+
+/P[^*]TAIRE[^*]{1,6}?LL/
+    xxxxxxxxxxxPSTAIREISLLxxxxxxxxx
+ 0: PSTAIREISLL
+
+/P[^*]TAIRE[^*]{1,}?LL/
+    xxxxxxxxxxxPSTAIREISLLxxxxxxxxx
+ 0: PSTAIREISLL
+
+/(\.\d\d[1-9]?)\d+/
+    1.230003938
+ 0: .230003938
+ 1: .23000393
+ 2: .2300039
+ 3: .230003
+ 4: .23000
+ 5: .2300
+ 6: .230
+    1.875000282   
+ 0: .875000282
+ 1: .87500028
+ 2: .8750002
+ 3: .875000
+ 4: .87500
+ 5: .8750
+ 6: .875
+    1.235  
+ 0: .235
+                  
+/(\.\d\d((?=0)|\d(?=\d)))/
+    1.230003938      
+ 0: .230
+ 1: .23
+    1.875000282
+ 0: .875
+    *** Failers 
+No match
+    1.235 
+No match

-/^[^a]{2}/8
-    \x{100}bc
- 0: \x{100}b
+/a(?)b/
+    ab 
+ 0: ab

-/^[^a]{2,}/8
-    \x{100}bcAa
- 0: \x{100}bcA
- 1: \x{100}bc
- 2: \x{100}b
+/\b(foo)\s+(\w+)/i
+    Food is on the foo table
+ 0: foo table
+ 1: foo tabl
+ 2: foo tab
+ 3: foo ta
+ 4: foo t
+    
+/foo(.*)bar/
+    The food is under the bar in the barn.
+ 0: food is under the bar in the bar
+ 1: food is under the bar
+    
+/foo(.*?)bar/  
+    The food is under the bar in the barn.
+ 0: food is under the bar in the bar
+ 1: food is under the bar

-/^[^a]{2,}?/8
-    \x{100}bca
- 0: \x{100}bc
- 1: \x{100}b
-
-/[^a]+/8ig
-    bcd
- 0: bcd
- 1: bc
- 2: b
-    \x{100}aY\x{256}Z 
- 0: \x{100}
- 0: Y\x{256}Z
- 1: Y\x{256}
- 2: Y
+/(.*)(\d*)/
+    I have 2 numbers: 53147
+Matched, but too many subsidiary matches
+ 0: I have 2 numbers: 53147
+ 1: I have 2 numbers: 5314
+ 2: I have 2 numbers: 531
+ 3: I have 2 numbers: 53
+ 4: I have 2 numbers: 5
+ 5: I have 2 numbers: 
+ 6: I have 2 numbers:
+ 7: I have 2 numbers
+ 8: I have 2 number
+ 9: I have 2 numbe
+10: I have 2 numb
+11: I have 2 num
+12: I have 2 nu
+13: I have 2 n
+14: I have 2 
+15: I have 2
+16: I have 
+17: I have
+18: I hav
+19: I ha
+20: I h
+21: I

-/^[^a]{2}/8i
-    \x{100}bc
- 0: \x{100}b
+/(.*)(\d+)/
+    I have 2 numbers: 53147
+ 0: I have 2 numbers: 53147
+ 1: I have 2 numbers: 5314
+ 2: I have 2 numbers: 531
+ 3: I have 2 numbers: 53
+ 4: I have 2 numbers: 5
+ 5: I have 2

-/^[^a]{2,}/8i
-    \x{100}bcAa
- 0: \x{100}bc
- 1: \x{100}b
+/(.*?)(\d*)/
+    I have 2 numbers: 53147
+Matched, but too many subsidiary matches
+ 0: I have 2 numbers: 53147
+ 1: I have 2 numbers: 5314
+ 2: I have 2 numbers: 531
+ 3: I have 2 numbers: 53
+ 4: I have 2 numbers: 5
+ 5: I have 2 numbers: 
+ 6: I have 2 numbers:
+ 7: I have 2 numbers
+ 8: I have 2 number
+ 9: I have 2 numbe
+10: I have 2 numb
+11: I have 2 num
+12: I have 2 nu
+13: I have 2 n
+14: I have 2 
+15: I have 2
+16: I have 
+17: I have
+18: I hav
+19: I ha
+20: I h
+21: I

-/^[^a]{2,}?/8i
-    \x{100}bca
- 0: \x{100}bc
- 1: \x{100}b
+/(.*?)(\d+)/
+    I have 2 numbers: 53147
+ 0: I have 2 numbers: 53147
+ 1: I have 2 numbers: 5314
+ 2: I have 2 numbers: 531
+ 3: I have 2 numbers: 53
+ 4: I have 2 numbers: 5
+ 5: I have 2

-/\x{100}{0,0}/8
-    abcd
+/(.*)(\d+)$/
+    I have 2 numbers: 53147
+ 0: I have 2 numbers: 53147
+
+/(.*?)(\d+)$/
+    I have 2 numbers: 53147
+ 0: I have 2 numbers: 53147
+
+/(.*)\b(\d+)$/
+    I have 2 numbers: 53147
+ 0: I have 2 numbers: 53147
+
+/(.*\D)(\d+)$/
+    I have 2 numbers: 53147
+ 0: I have 2 numbers: 53147
+
+/^\D*(?!123)/
+    ABC123
+ 0: AB
+ 1: A
+ 2: 
+     
+/^(\D*)(?=\d)(?!123)/
+    ABC445
+ 0: ABC
+    *** Failers
+No match
+    ABC123
+No match
+    
+/^[W-]46]/
+    W46]789 
+ 0: W46]
+    -46]789
+ 0: -46]
+    *** Failers
+No match
+    Wall
+No match
+    Zebra
+No match
+    42
+No match
+    [abcd] 
+No match
+    ]abcd[
+No match
+       
+/^[W-\]46]/
+    W46]789 
+ 0: W
+    Wall
+ 0: W
+    Zebra
+ 0: Z
+    Xylophone  
+ 0: X
+    42
+ 0: 4
+    [abcd] 
+ 0: [
+    ]abcd[
+ 0: ]
+    \\backslash 
+ 0: \
+    *** Failers
+No match
+    -46]789
+No match
+    well
+No match
+    
+/\d\d\/\d\d\/\d\d\d\d/
+    01/01/2000
+ 0: 01/01/2000
+
+/word (?:[a-zA-Z0-9]+ ){0,10}otherword/
+  word cat dog elephant mussel cow horse canary baboon snake shark otherword
+ 0: word cat dog elephant mussel cow horse canary baboon snake shark otherword
+  word cat dog elephant mussel cow horse canary baboon snake shark
+No match
+
+/word (?:[a-zA-Z0-9]+ ){0,300}otherword/
+  word cat dog elephant mussel cow horse canary baboon snake shark the quick brown fox and the lazy dog and several other words getting close to thirty by now I hope
+No match
+
+/^(a){0,0}/
+    bcd
  0: 
- 
-/\x{100}?/8
-    abcd
+    abc
  0: 
-    \x{100}\x{100} 
- 0: \x{100}
+    aab     
+ 0: 
+
+/^(a){0,1}/
+    bcd
+ 0: 
+    abc
+ 0: a
  1: 
+    aab  
+ 0: a
+ 1:

-/\x{100}{0,3}/8 
-    \x{100}\x{100} 
- 0: \x{100}\x{100}
- 1: \x{100}
+/^(a){0,2}/
+    bcd
+ 0: 
+    abc
+ 0: a
+ 1: 
+    aab  
+ 0: aa
+ 1: a
  2: 
-    \x{100}\x{100}\x{100}\x{100} 
- 0: \x{100}\x{100}\x{100}
- 1: \x{100}\x{100}
- 2: \x{100}
+
+/^(a){0,3}/
+    bcd
+ 0: 
+    abc
+ 0: a
+ 1: 
+    aab
+ 0: aa
+ 1: a
+ 2: 
+    aaa   
+ 0: aaa
+ 1: aa
+ 2: a
  3: 
-    
-/\x{100}*/8
-    abce
+
+/^(a){0,}/
+    bcd
  0: 
-    \x{100}\x{100}\x{100}\x{100} 
- 0: \x{100}\x{100}\x{100}\x{100}
- 1: \x{100}\x{100}\x{100}
- 2: \x{100}\x{100}
- 3: \x{100}
- 4: 
+    abc
+ 0: a
+ 1: 
+    aab
+ 0: aa
+ 1: a
+ 2: 
+    aaa
+ 0: aaa
+ 1: aa
+ 2: a
+ 3: 
+    aaaaaaaa    
+ 0: aaaaaaaa
+ 1: aaaaaaa
+ 2: aaaaaa
+ 3: aaaaa
+ 4: aaaa
+ 5: aaa
+ 6: aa
+ 7: a
+ 8:

-/\x{100}{1,1}/8
-    abcd\x{100}\x{100}\x{100}\x{100} 
- 0: \x{100}
+/^(a){1,1}/
+    bcd
+No match
+    abc
+ 0: a
+    aab  
+ 0: a

-/\x{100}{1,3}/8
-    abcd\x{100}\x{100}\x{100}\x{100} 
- 0: \x{100}\x{100}\x{100}
- 1: \x{100}\x{100}
- 2: \x{100}
+/^(a){1,2}/
+    bcd
+No match
+    abc
+ 0: a
+    aab  
+ 0: aa
+ 1: a

-/\x{100}+/8
-    abcd\x{100}\x{100}\x{100}\x{100} 
- 0: \x{100}\x{100}\x{100}\x{100}
- 1: \x{100}\x{100}\x{100}
- 2: \x{100}\x{100}
- 3: \x{100}
+/^(a){1,3}/
+    bcd
+No match
+    abc
+ 0: a
+    aab
+ 0: aa
+ 1: a
+    aaa   
+ 0: aaa
+ 1: aa
+ 2: a

-/\x{100}{3}/8
-    abcd\x{100}\x{100}\x{100}XX
- 0: \x{100}\x{100}\x{100}
+/^(a){1,}/
+    bcd
+No match
+    abc
+ 0: a
+    aab
+ 0: aa
+ 1: a
+    aaa
+ 0: aaa
+ 1: aa
+ 2: a
+    aaaaaaaa    
+ 0: aaaaaaaa
+ 1: aaaaaaa
+ 2: aaaaaa
+ 3: aaaaa
+ 4: aaaa
+ 5: aaa
+ 6: aa
+ 7: a

-/\x{100}{3,5}/8
-    abcd\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}XX
- 0: \x{100}\x{100}\x{100}\x{100}\x{100}
- 1: \x{100}\x{100}\x{100}\x{100}
- 2: \x{100}\x{100}\x{100}
+/.*\.gif/
+    borfle\nbib.gif\nno
+ 0: bib.gif

-/\x{100}{3,}/8
-    abcd\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}XX
- 0: \x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}
- 1: \x{100}\x{100}\x{100}\x{100}\x{100}\x{100}
- 2: \x{100}\x{100}\x{100}\x{100}\x{100}
- 3: \x{100}\x{100}\x{100}\x{100}
- 4: \x{100}\x{100}\x{100}
+/.{0,}\.gif/
+    borfle\nbib.gif\nno
+ 0: bib.gif

-/(?<=a\x{100}{2}b)X/8
-    Xyyya\x{100}\x{100}bXzzz
- 0: X
+/.*\.gif/m
+    borfle\nbib.gif\nno
+ 0: bib.gif

-/\D*/8
-  aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
-Matched, but too many subsidiary matches
- 0: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
- 1: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
- 2: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
- 3: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
- 4: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
- 5: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
- 6: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
- 7: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
- 8: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
- 9: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
-10: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
-11: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
-12: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
-13: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
-14: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
-15: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
-16: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
-17: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
-18: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
-19: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
-20: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
-21: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
+/.*\.gif/s
+    borfle\nbib.gif\nno
+ 0: borfle\x0abib.gif

-/\D*/8
-  \x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}
-Matched, but too many subsidiary matches
- 0: \x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}
- 1: \x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}
- 2: \x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}
- 3: \x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}
- 4: \x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}
- 5: \x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}
- 6: \x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}
- 7: \x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}
- 8: \x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}
- 9: \x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}
-10: \x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}
-11: \x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}
-12: \x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}
-13: \x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}
-14: \x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}
-15: \x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}
-16: \x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}
-17: \x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}
-18: \x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}
-19: \x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}
-20: \x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}
-21: \x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}
+/.*\.gif/ms
+    borfle\nbib.gif\nno
+ 0: borfle\x0abib.gif
+    
+/.*$/
+    borfle\nbib.gif\nno
+ 0: no

-/\D/8
-    1X2
- 0: X
-    1\x{100}2 
- 0: \x{100}
+/.*$/m
+    borfle\nbib.gif\nno
+ 0: borfle
+
+/.*$/s
+    borfle\nbib.gif\nno
+ 0: borfle\x0abib.gif\x0ano
+
+/.*$/ms
+    borfle\nbib.gif\nno
+ 0: borfle\x0abib.gif\x0ano
+ 1: borfle\x0abib.gif
+ 2: borfle
+    
+/.*$/
+    borfle\nbib.gif\nno\n
+ 0: no
+
+/.*$/m
+    borfle\nbib.gif\nno\n
+ 0: borfle
+
+/.*$/s
+    borfle\nbib.gif\nno\n
+ 0: borfle\x0abib.gif\x0ano\x0a
+ 1: borfle\x0abib.gif\x0ano
+
+/.*$/ms
+    borfle\nbib.gif\nno\n
+ 0: borfle\x0abib.gif\x0ano\x0a
+ 1: borfle\x0abib.gif\x0ano
+ 2: borfle\x0abib.gif
+ 3: borfle
+    
+/(.*X|^B)/
+    abcde\n1234Xyz
+ 0: 1234X
+    BarFoo 
+ 0: B
+    *** Failers
+No match
+    abcde\nBar  
+No match
+
+/(.*X|^B)/m
+    abcde\n1234Xyz
+ 0: 1234X
+    BarFoo 
+ 0: B
+    abcde\nBar  
+ 0: B
+
+/(.*X|^B)/s
+    abcde\n1234Xyz
+ 0: abcde\x0a1234X
+    BarFoo 
+ 0: B
+    *** Failers
+No match
+    abcde\nBar  
+No match
+
+/(.*X|^B)/ms
+    abcde\n1234Xyz
+ 0: abcde\x0a1234X
+    BarFoo 
+ 0: B
+    abcde\nBar  
+ 0: B
+
+/(?s)(.*X|^B)/
+    abcde\n1234Xyz
+ 0: abcde\x0a1234X
+    BarFoo 
+ 0: B
+    *** Failers 
+No match
+    abcde\nBar  
+No match
+
+/(?s:.*X|^B)/
+    abcde\n1234Xyz
+ 0: abcde\x0a1234X
+    BarFoo 
+ 0: B
+    *** Failers 
+No match
+    abcde\nBar  
+No match
+
+/^.*B/
+    **** Failers
+No match
+    abc\nB
+No match
+     
+/(?s)^.*B/
+    abc\nB
+ 0: abc\x0aB
+
+/(?m)^.*B/
+    abc\nB
+ 0: B
+     
+/(?ms)^.*B/
+    abc\nB
+ 0: abc\x0aB
+
+/(?ms)^B/
+    abc\nB
+ 0: B
+
+/(?s)B$/
+    B\n
+ 0: B
+
+/^[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]/
+    123456654321
+ 0: 123456654321

-/>\S/8
-    > >X Y
- 0: >X
-    > >\x{100} Y
- 0: >\x{100}
+/^\d\d\d\d\d\d\d\d\d\d\d\d/
+    123456654321 
+ 0: 123456654321
+
+/^[\d][\d][\d][\d][\d][\d][\d][\d][\d][\d][\d][\d]/
+    123456654321
+ 0: 123456654321

-/\d/8
-    \x{100}3
- 0: 3
+/^[abc]{12}/
+    abcabcabcabc
+ 0: abcabcabcabc

-/\s/8
-    \x{100} X
- 0:  
+/^[a-c]{12}/
+    abcabcabcabc
+ 0: abcabcabcabc

-/\D+/8
-    12abcd34
+/^(a|b|c){12}/
+    abcabcabcabc 
+ 0: abcabcabcabc
+
+/^[abcdefghijklmnopqrstuvwxy0123456789]/
+    n
+ 0: n
+    *** Failers 
+No match
+    z 
+No match
+
+/abcde{0,0}/
+    abcd
  0: abcd
- 1: abc
- 2: ab
- 3: a
     *** Failers
- 0: *** Failers
- 1: *** Failer
- 2: *** Faile
- 3: *** Fail
- 4: *** Fai
- 5: *** Fa
- 6: *** F
- 7: *** 
- 8: ***
- 9: **
-10: *
-    1234  
 No match
+    abce  
+No match

-/\D{2,3}/8
-    12abcd34
+/ab[cd]{0,0}e/
+    abe
+ 0: abe
+    *** Failers
+No match
+    abcde 
+No match
+    
+/ab(c){0,0}d/
+    abd
+ 0: abd
+    *** Failers
+No match
+    abcd   
+No match
+
+/a(b*)/
+    a
+ 0: a
+    ab
+ 0: ab
+ 1: a
+    abbbb
+ 0: abbbb
+ 1: abbb
+ 2: abb
+ 3: ab
+ 4: a
+    *** Failers
+ 0: a
+    bbbbb    
+No match
+    
+/ab\d{0}e/
+    abe
+ 0: abe
+    *** Failers
+No match
+    ab1e   
+No match
+    
+/"([^\\"]+|\\.)*"/
+    the \"quick\" brown fox
+ 0: "quick"
+    \"the \\\"quick\\\" brown fox\" 
+ 0: "the \"quick\" brown fox"
+
+/.*?/g+
+    abc
  0: abc
+ 0+ 
  1: ab
-    12ab34
- 0: ab
-    *** Failers  
- 0: ***
- 1: **
-    1234
+ 2: a
+ 3: 
+ 0: 
+ 0+ 
+  
+/\b/g+
+    abc 
+ 0: 
+ 0+ abc
+ 0: 
+ 0+ 
+
+/\b/+g
+    abc 
+ 0: 
+ 0+ abc
+ 0: 
+ 0+ 
+
+//g
+    abc
+ 0: 
+ 0: 
+ 0: 
+ 0: 
+
+/<tr([\w\W\s\d][^<>]{0,})><TD([\w\W\s\d][^<>]{0,})>([\d]{0,}\.)(.*)((<BR>([\w\W\s\d][^<>]{0,})|[\s]{0,}))<\/a><\/TD><TD([\w\W\s\d][^<>]{0,})>([\w\W\s\d][^<>]{0,})<\/TD><TD([\w\W\s\d][^<>]{0,})>([\w\W\s\d][^<>]{0,})<\/TD><\/TR>/is
+  <TR BGCOLOR='#DBE9E9'><TD align=left valign=top>43.<a href='joblist.cfm?JobID=94 6735&Keyword='>Word Processor<BR>(N-1286)</a></TD><TD align=left valign=top>Lega lstaff.com</TD><TD align=left valign=top>CA - Statewide</TD></TR>
+ 0: <TR BGCOLOR='#DBE9E9'><TD align=left valign=top>43.<a href='joblist.cfm?JobID=94 6735&Keyword='>Word Processor<BR>(N-1286)</a></TD><TD align=left valign=top>Lega lstaff.com</TD><TD align=left valign=top>CA - Statewide</TD></TR>
+
+/a[^a]b/
+    acb
+ 0: acb
+    a\nb
+ 0: a\x0ab
+    
+/a.b/
+    acb
+ 0: acb
+    *** Failers 
 No match
-    12a34  
+    a\nb   
 No match
+    
+/a[^a]b/s
+    acb
+ 0: acb
+    a\nb  
+ 0: a\x0ab
+    
+/a.b/s
+    acb
+ 0: acb
+    a\nb  
+ 0: a\x0ab

-/\D{2,3}?/8
-    12abcd34
+/^(b+?|a){1,2}?c/
+    bac
+ 0: bac
+    bbac
+ 0: bbac
+    bbbac
+ 0: bbbac
+    bbbbac
+ 0: bbbbac
+    bbbbbac 
+ 0: bbbbbac
+
+/^(b+|a){1,2}?c/
+    bac
+ 0: bac
+    bbac
+ 0: bbac
+    bbbac
+ 0: bbbac
+    bbbbac
+ 0: bbbbac
+    bbbbbac 
+ 0: bbbbbac
+    
+/(?!\A)x/m
+    x\nb\n
+No match
+    a\bx\n  
+ 0: x
+    
+/\x0{ab}/
+    \0{ab} 
+ 0: \x00{ab}
+
+/(A|B)*?CD/
+    CD 
+ 0: CD
+    
+/(A|B)*CD/
+    CD 
+ 0: CD
+
+/(?<!bar)foo/
+    foo
+ 0: foo
+    catfood
+ 0: foo
+    arfootle
+ 0: foo
+    rfoosh
+ 0: foo
+    *** Failers
+No match
+    barfoo
+No match
+    towbarfoo
+No match
+
+/\w{3}(?<!bar)foo/
+    catfood
+ 0: catfoo
+    *** Failers
+No match
+    foo
+No match
+    barfoo
+No match
+    towbarfoo
+No match
+
+/(?<=(foo)a)bar/
+    fooabar
+ 0: bar
+    *** Failers
+No match
+    bar
+No match
+    foobbar
+No match
+      
+/\Aabc\z/m
+    abc
  0: abc
- 1: ab
-    12ab34
+    *** Failers
+No match
+    abc\n   
+No match
+    qqq\nabc
+No match
+    abc\nzzz
+No match
+    qqq\nabc\nzzz
+No match
+
+"(?>.*/)foo"
+    /this/is/a/very/long/line/in/deed/with/very/many/slashes/in/it/you/see/
+No match
+
+"(?>.*/)foo"
+    /this/is/a/very/long/line/in/deed/with/very/many/slashes/in/and/foo
+ 0: /this/is/a/very/long/line/in/deed/with/very/many/slashes/in/and/foo
+
+/(?>(\.\d\d[1-9]?))\d+/
+    1.230003938
+ 0: .230003938
+ 1: .23000393
+ 2: .2300039
+ 3: .230003
+ 4: .23000
+ 5: .2300
+ 6: .230
+    1.875000282
+ 0: .875000282
+ 1: .87500028
+ 2: .8750002
+ 3: .875000
+ 4: .87500
+ 5: .8750
+    *** Failers 
+No match
+    1.235 
+No match
+
+/^((?>\w+)|(?>\s+))*$/
+    now is the time for all good men to come to the aid of the party
+ 0: now is the time for all good men to come to the aid of the party
+    *** Failers
+No match
+    this is not a line with only words and spaces!
+No match
+    
+/(\d+)(\w)/
+    12345a
+ 0: 12345a
+ 1: 12345
+ 2: 1234
+ 3: 123
+ 4: 12
+    12345+ 
+ 0: 12345
+ 1: 1234
+ 2: 123
+ 3: 12
+
+/((?>\d+))(\w)/
+    12345a
+ 0: 12345a
+    *** Failers
+No match
+    12345+ 
+No match
+
+/(?>a+)b/
+    aaab
+ 0: aaab
+
+/((?>a+)b)/
+    aaab
+ 0: aaab
+
+/(?>(a+))b/
+    aaab
+ 0: aaab
+
+/(?>b)+/
+    aaabbbccc
+ 0: bbb
+ 1: bb
+ 2: b
+
+/(?>a+|b+|c+)*c/
+    aaabbbbccccd
+ 0: aaabbbbcccc
+ 1: aaabbbbc
+    
+/(a+|b+|c+)*c/
+    aaabbbbccccd
+ 0: aaabbbbcccc
+ 1: aaabbbbccc
+ 2: aaabbbbcc
+ 3: aaabbbbc
+
+/((?>[^()]+)|\([^()]*\))+/
+    ((abc(ade)ufh()()x
+ 0: abc(ade)ufh()()x
+ 1: abc(ade)ufh()()
+ 2: abc(ade)ufh()
+ 3: abc(ade)ufh
+ 4: abc(ade)
+ 5: abc
+    
+/\(((?>[^()]+)|\([^()]+\))+\)/ 
+    (abc)
+ 0: (abc)
+    (abc(def)xyz)
+ 0: (abc(def)xyz)
+    *** Failers
+No match
+    ((()aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa   
+No match
+
+/a(?-i)b/i
+    ab
  0: ab
-    *** Failers  
- 0: ***
- 1: **
-    1234
+    Ab
+ 0: Ab
+    *** Failers 
 No match
-    12a34  
+    aB
 No match
+    AB
+No match
+        
+/(a (?x)b c)d e/
+    a bcd e
+ 0: a bcd e
+    *** Failers
+No match
+    a b cd e
+No match
+    abcd e   
+No match
+    a bcde 
+No match
+ 
+/(a b(?x)c d (?-x)e f)/
+    a bcde f
+ 0: a bcde f
+    *** Failers
+No match
+    abcdef  
+No match

-/\d+/8
-    12abcd34
- 0: 12
- 1: 1
+/(a(?i)b)c/
+    abc
+ 0: abc
+    aBc
+ 0: aBc
     *** Failers
 No match
+    abC
+No match
+    aBC  
+No match
+    Abc
+No match
+    ABc
+No match
+    ABC
+No match
+    AbC
+No match
+    
+/a(?i:b)c/
+    abc
+ 0: abc
+    aBc
+ 0: aBc
+    *** Failers 
+No match
+    ABC
+No match
+    abC
+No match
+    aBC
+No match
+    
+/a(?i:b)*c/
+    aBc
+ 0: aBc
+    aBBc
+ 0: aBBc
+    *** Failers 
+No match
+    aBC
+No match
+    aBBC
+No match
+    
+/a(?=b(?i)c)\w\wd/
+    abcd
+ 0: abcd
+    abCd
+ 0: abCd
+    *** Failers
+No match
+    aBCd
+No match
+    abcD     
+No match
+    
+/(?s-i:more.*than).*million/i
+    more than million
+ 0: more than million
+    more than MILLION
+ 0: more than MILLION
+    more \n than Million 
+ 0: more \x0a than Million
+    *** Failers
+No match
+    MORE THAN MILLION    
+No match
+    more \n than \n million 
+No match

-/\d{2,3}/8
-    12abcd34
+/(?:(?s-i)more.*than).*million/i
+    more than million
+ 0: more than million
+    more than MILLION
+ 0: more than MILLION
+    more \n than Million 
+ 0: more \x0a than Million
+    *** Failers
+No match
+    MORE THAN MILLION    
+No match
+    more \n than \n million 
+No match
+    
+/(?>a(?i)b+)+c/ 
+    abc
+ 0: abc
+    aBbc
+ 0: aBbc
+    aBBc 
+ 0: aBBc
+    *** Failers
+No match
+    Abc
+No match
+    abAb    
+No match
+    abbC 
+No match
+    
+/(?=a(?i)b)\w\wc/
+    abc
+ 0: abc
+    aBc
+ 0: aBc
+    *** Failers
+No match
+    Ab 
+No match
+    abC
+No match
+    aBC     
+No match
+    
+/(?<=a(?i)b)(\w\w)c/
+    abxxc
+ 0: xxc
+    aBxxc
+ 0: xxc
+    *** Failers
+No match
+    Abxxc
+No match
+    ABxxc
+No match
+    abxxC      
+No match
+
+/^(?(?=abc)\w{3}:|\d\d)$/
+    abc:
+ 0: abc:
+    12
  0: 12
-    1234abcd
- 0: 123
- 1: 12
-    *** Failers  
+    *** Failers
 No match
-    1.4 
+    123
 No match
+    xyz    
+No match

-/\d{2,3}?/8
-    12abcd34
+/^(?(?!abc)\d\d|\w{3}:)$/
+    abc:
+ 0: abc:
+    12
  0: 12
-    1234abcd
- 0: 123
- 1: 12
-    *** Failers  
+    *** Failers
 No match
-    1.4 
+    123
 No match
+    xyz    
+No match
+    
+/(?(?<=foo)bar|cat)/
+    foobar
+ 0: bar
+    cat
+ 0: cat
+    fcat
+ 0: cat
+    focat   
+ 0: cat
+    *** Failers
+No match
+    foocat  
+No match

-/\S+/8
-    12abcd34
- 0: 12abcd34
- 1: 12abcd3
- 2: 12abcd
- 3: 12abc
- 4: 12ab
- 5: 12a
- 6: 12
- 7: 1
+/(?(?<!foo)cat|bar)/
+    foobar
+ 0: bar
+    cat
+ 0: cat
+    fcat
+ 0: cat
+    focat   
+ 0: cat
     *** Failers
- 0: ***
- 1: **
- 2: *
-    \    \ 
 No match
+    foocat  
+No match

-/\S{2,3}/8
-    12abcd34
- 0: 12a
- 1: 12
-    1234abcd
- 0: 123
- 1: 12
+/(?>a*)*/
+    a
+ 0: a
+ 1: 
+    aa
+ 0: aa
+ 1: 
+    aaaa
+ 0: aaaa
+ 1: 
+    
+/(abc|)+/
+    abc
+ 0: abc
+ 1: 
+    abcabc
+ 0: abcabc
+ 1: abc
+ 2: 
+    abcabcabc
+ 0: abcabcabc
+ 1: abcabc
+ 2: abc
+ 3: 
+    xyz      
+ 0: 
+
+/([a]*)*/
+    a
+ 0: a
+ 1: 
+    aaaaa 
+ 0: aaaaa
+ 1: aaaa
+ 2: aaa
+ 3: aa
+ 4: a
+ 5: 
+ 
+/([ab]*)*/
+    a
+ 0: a
+ 1: 
+    b
+ 0: b
+ 1: 
+    ababab
+ 0: ababab
+ 1: ababa
+ 2: abab
+ 3: aba
+ 4: ab
+ 5: a
+ 6: 
+    aaaabcde
+ 0: aaaab
+ 1: aaaa
+ 2: aaa
+ 3: aa
+ 4: a
+ 5: 
+    bbbb    
+ 0: bbbb
+ 1: bbb
+ 2: bb
+ 3: b
+ 4: 
+ 
+/([^a]*)*/
+    b
+ 0: b
+ 1: 
+    bbbb
+ 0: bbbb
+ 1: bbb
+ 2: bb
+ 3: b
+ 4: 
+    aaa   
+ 0: 
+ 
+/([^ab]*)*/
+    cccc
+ 0: cccc
+ 1: ccc
+ 2: cc
+ 3: c
+ 4: 
+    abab  
+ 0: 
+ 
+/([a]*?)*/
+    a
+ 0: a
+ 1: 
+    aaaa 
+ 0: aaaa
+ 1: aaa
+ 2: aa
+ 3: a
+ 4: 
+ 
+/([ab]*?)*/
+    a
+ 0: a
+ 1: 
+    b
+ 0: b
+ 1: 
+    abab
+ 0: abab
+ 1: aba
+ 2: ab
+ 3: a
+ 4: 
+    baba   
+ 0: baba
+ 1: bab
+ 2: ba
+ 3: b
+ 4: 
+ 
+/([^a]*?)*/
+    b
+ 0: b
+ 1: 
+    bbbb
+ 0: bbbb
+ 1: bbb
+ 2: bb
+ 3: b
+ 4: 
+    aaa   
+ 0: 
+ 
+/([^ab]*?)*/
+    c
+ 0: c
+ 1: 
+    cccc
+ 0: cccc
+ 1: ccc
+ 2: cc
+ 3: c
+ 4: 
+    baba   
+ 0: 
+ 
+/(?>a*)*/
+    a
+ 0: a
+ 1: 
+    aaabcde 
+ 0: aaa
+ 1: 
+ 
+/((?>a*))*/
+    aaaaa
+ 0: aaaaa
+ 1: 
+    aabbaa 
+ 0: aa
+ 1: 
+ 
+/((?>a*?))*/
+    aaaaa
+ 0: aaaaa
+ 1: 
+    aabbaa 
+ 0: aa
+ 1: 
+
+/(?(?=[^a-z]+[a-z])  \d{2}-[a-z]{3}-\d{2}  |  \d{2}-\d{2}-\d{2} ) /x
+    12-sep-98
+ 0: 12-sep-98
+    12-09-98
+ 0: 12-09-98
     *** Failers
- 0: ***
- 1: **
-    \     \  
 No match
+    sep-12-98
+No match
+        
+/(?i:saturday|sunday)/
+    saturday
+ 0: saturday
+    sunday
+ 0: sunday
+    Saturday
+ 0: Saturday
+    Sunday
+ 0: Sunday
+    SATURDAY
+ 0: SATURDAY
+    SUNDAY
+ 0: SUNDAY
+    SunDay
+ 0: SunDay
+    
+/(a(?i)bc|BB)x/
+    abcx
+ 0: abcx
+    aBCx
+ 0: aBCx
+    bbx
+ 0: bbx
+    BBx
+ 0: BBx
+    *** Failers
+No match
+    abcX
+No match
+    aBCX
+No match
+    bbX
+No match
+    BBX               
+No match

-/\S{2,3}?/8
-    12abcd34
- 0: 12a
- 1: 12
-    1234abcd
- 0: 123
- 1: 12
+/^([ab](?i)[cd]|[ef])/
+    ac
+ 0: ac
+    aC
+ 0: aC
+    bD
+ 0: bD
+    elephant
+ 0: e
+    Europe 
+ 0: E
+    frog
+ 0: f
+    France
+ 0: F
     *** Failers
- 0: ***
- 1: **
-    \     \  
 No match
+    Africa     
+No match

-/>\s+</8
-    12>      <34
- 0: >      <
+/^(ab|a(?i)[b-c](?m-i)d|x(?i)y|z)/
+    ab
+ 0: ab
+    aBd
+ 0: aBd
+    xy
+ 0: xy
+    xY
+ 0: xY
+    zebra
+ 0: z
+    Zambesi
+ 0: Z
     *** Failers
 No match
+    aCD  
+No match
+    XY  
+No match

-/>\s{2,3}</8
-    ab>  <cd
- 0: >  <
-    ab>   <ce
- 0: >   <
+/(?<=foo\n)^bar/m
+    foo\nbar
+ 0: bar
     *** Failers
 No match
-    ab>    <cd 
+    bar
 No match
+    baz\nbar   
+No match

-/>\s{2,3}?</8
-    ab>  <cd
- 0: >  <
-    ab>   <ce
- 0: >   <
+/(?<=(?<!foo)bar)baz/
+    barbaz
+ 0: baz
+    barbarbaz 
+ 0: baz
+    koobarbaz 
+ 0: baz
     *** Failers
 No match
-    ab>    <cd 
+    baz
 No match
+    foobarbaz 
+No match

-/\w+/8
-    12      34
- 0: 12
- 1: 1
+/The following tests are taken from the Perl 5.005 test suite; some of them/
+/are compatible with 5.004, but I'd rather not have to sort them out./
+No match
+
+/abc/
+    abc
+ 0: abc
+    xabcy
+ 0: abc
+    ababc
+ 0: abc
     *** Failers
- 0: Failers
- 1: Failer
- 2: Faile
- 3: Fail
- 4: Fai
- 5: Fa
- 6: F
-    +++=*! 
 No match
+    xbc
+No match
+    axc
+No match
+    abx
+No match

-/\w{2,3}/8
-    ab  cd
+/ab*c/
+    abc
+ 0: abc
+
+/ab*bc/
+    abc
+ 0: abc
+    abbc
+ 0: abbc
+    abbbbc
+ 0: abbbbc
+
+/.{1}/
+    abbbbc
+ 0: a
+
+/.{3,4}/
+    abbbbc
+ 0: abbb
+ 1: abb
+
+/ab{0,}bc/
+    abbbbc
+ 0: abbbbc
+
+/ab+bc/
+    abbc
+ 0: abbc
+    *** Failers
+No match
+    abc
+No match
+    abq
+No match
+
+/ab+bc/
+    abbbbc
+ 0: abbbbc
+
+/ab{1,}bc/
+    abbbbc
+ 0: abbbbc
+
+/ab{1,3}bc/
+    abbbbc
+ 0: abbbbc
+
+/ab{3,4}bc/
+    abbbbc
+ 0: abbbbc
+
+/ab{4,5}bc/
+    *** Failers
+No match
+    abq
+No match
+    abbbbc
+No match
+
+/ab?bc/
+    abbc
+ 0: abbc
+    abc
+ 0: abc
+
+/ab{0,1}bc/
+    abc
+ 0: abc
+
+/ab?bc/
+
+/ab?c/
+    abc
+ 0: abc
+
+/ab{0,1}c/
+    abc
+ 0: abc
+
+/^abc$/
+    abc
+ 0: abc
+    *** Failers
+No match
+    abbbbc
+No match
+    abcc
+No match
+
+/^abc/
+    abcc
+ 0: abc
+
+/^abc$/
+
+/abc$/
+    aabc
+ 0: abc
+    *** Failers
+No match
+    aabc
+ 0: abc
+    aabcd
+No match
+
+/^/
+    abc
+ 0: 
+
+/$/
+    abc
+ 0: 
+
+/a.c/
+    abc
+ 0: abc
+    axc
+ 0: axc
+
+/a.*c/
+    axyzc
+ 0: axyzc
+
+/a[bc]d/
+    abd
+ 0: abd
+    *** Failers
+No match
+    axyzd
+No match
+    abc
+No match
+
+/a[b-d]e/
+    ace
+ 0: ace
+
+/a[b-d]/
+    aac
+ 0: ac
+
+/a[-b]/
+    a-
+ 0: a-
+
+/a[b-]/
+    a-
+ 0: a-
+
+/a]/
+    a]
+ 0: a]
+
+/a[]]b/
+    a]b
+ 0: a]b
+
+/a[^bc]d/
+    aed
+ 0: aed
+    *** Failers
+No match
+    abd
+No match
+    abd
+No match
+
+/a[^-b]c/
+    adc
+ 0: adc
+
+/a[^]b]c/
+    adc
+ 0: adc
+    *** Failers
+No match
+    a-c
+ 0: a-c
+    a]c
+No match
+
+/\ba\b/
+    a-
+ 0: a
+    -a
+ 0: a
+    -a-
+ 0: a
+
+/\by\b/
+    *** Failers
+No match
+    xy
+No match
+    yz
+No match
+    xyz
+No match
+
+/\Ba\B/
+    *** Failers
+ 0: a
+    a-
+No match
+    -a
+No match
+    -a-
+No match
+
+/\By\b/
+    xy
+ 0: y
+
+/\by\B/
+    yz
+ 0: y
+
+/\By\B/
+    xyz
+ 0: y
+
+/\w/
+    a
+ 0: a
+
+/\W/
+    -
+ 0: -
+    *** Failers
+ 0: *
+    -
+ 0: -
+    a
+No match
+
+/a\sb/
+    a b
+ 0: a b
+
+/a\Sb/
+    a-b
+ 0: a-b
+    *** Failers
+No match
+    a-b
+ 0: a-b
+    a b
+No match
+
+/\d/
+    1
+ 0: 1
+
+/\D/
+    -
+ 0: -
+    *** Failers
+ 0: *
+    -
+ 0: -
+    1
+No match
+
+/[\w]/
+    a
+ 0: a
+
+/[\W]/
+    -
+ 0: -
+    *** Failers
+ 0: *
+    -
+ 0: -
+    a
+No match
+
+/a[\s]b/
+    a b
+ 0: a b
+
+/a[\S]b/
+    a-b
+ 0: a-b
+    *** Failers
+No match
+    a-b
+ 0: a-b
+    a b
+No match
+
+/[\d]/
+    1
+ 0: 1
+
+/[\D]/
+    -
+ 0: -
+    *** Failers
+ 0: *
+    -
+ 0: -
+    1
+No match
+
+/ab|cd/
+    abc
  0: ab
-    abcd ce
+    abcd
+ 0: ab
+
+/()ef/
+    def
+ 0: ef
+
+/$b/
+
+/a\(b/
+    a(b
+ 0: a(b
+
+/a\(*b/
+    ab
+ 0: ab
+    a((b
+ 0: a((b
+
+/a\\b/
+    a\b
+No match
+
+/((a))/
+    abc
+ 0: a
+
+/(a)b(c)/
+    abc
  0: abc
- 1: ab
+
+/a+b+c/
+    aabbabc
+ 0: abc
+
+/a{1,}b{1,}c/
+    aabbabc
+ 0: abc
+
+/a.+?c/
+    abcabc
+ 0: abcabc
+ 1: abc
+
+/(a+|b)*/
+    ab
+ 0: ab
+ 1: a
+ 2: 
+
+/(a+|b){0,}/
+    ab
+ 0: ab
+ 1: a
+ 2: 
+
+/(a+|b)+/
+    ab
+ 0: ab
+ 1: a
+
+/(a+|b){1,}/
+    ab
+ 0: ab
+ 1: a
+
+/(a+|b)?/
+    ab
+ 0: a
+ 1: 
+
+/(a+|b){0,1}/
+    ab
+ 0: a
+ 1: 
+
+/[^ab]*/
+    cde
+ 0: cde
+ 1: cd
+ 2: c
+ 3: 
+
+/abc/
     *** Failers
- 0: Fai
- 1: Fa
-    a.b.c
 No match
+    b
+No match
+

-/\w{2,3}?/8
-    ab  cd
+/a*/
+    
+
+/([abc])*d/
+    abbbcd
+ 0: abbbcd
+
+/([abc])*bcd/
+    abcd
+ 0: abcd
+
+/a|b|c|d|e/
+    e
+ 0: e
+
+/(a|b|c|d|e)f/
+    ef
+ 0: ef
+
+/abcd*efg/
+    abcdefg
+ 0: abcdefg
+
+/ab*/
+    xabyabbbz
  0: ab
-    abcd ce
+ 1: a
+    xayabbbz
+ 0: a
+
+/(ab|cd)e/
+    abcde
+ 0: cde
+
+/[abhgefdc]ij/
+    hij
+ 0: hij
+
+/^(ab|cd)e/
+
+/(abc|)ef/
+    abcdef
+ 0: ef
+
+/(a|b)c*d/
+    abcd
+ 0: bcd
+
+/(ab|ab*)bc/
+    abc
  0: abc
+
+/a([bc]*)c*/
+    abc
+ 0: abc
  1: ab
+ 2: a
+
+/a([bc]*)(c*d)/
+    abcd
+ 0: abcd
+
+/a([bc]+)(c*d)/
+    abcd
+ 0: abcd
+
+/a([bc]*)(c+d)/
+    abcd
+ 0: abcd
+
+/a[bcd]*dcdcde/
+    adcdcde
+ 0: adcdcde
+
+/a[bcd]+dcdcde/
     *** Failers
- 0: Fai
- 1: Fa
-    a.b.c
 No match
+    abcde
+No match
+    adcdcde
+No match

-/\W+/8
-    12====34
- 0: ====
- 1: ===
- 2: ==
- 3: =
+/(ab|a)b*c/
+    abc
+ 0: abc
+
+/((a)(b)c)(d)/
+    abcd
+ 0: abcd
+
+/[a-zA-Z_][a-zA-Z0-9_]*/
+    alpha
+ 0: alpha
+ 1: alph
+ 2: alp
+ 3: al
+ 4: a
+
+/^a(bc+|b[eh])g|.h$/
+    abh
+ 0: bh
+
+/(bc+d$|ef*g.|h?i(j|k))/
+    effgz
+ 0: effgz
+    ij
+ 0: ij
+    reffgz
+ 0: effgz
     *** Failers
- 0: *** 
- 1: ***
- 2: **
- 3: *
-    abcd 
 No match
+    effg
+No match
+    bcdd
+No match

-/\W{2,3}/8
-    ab====cd
- 0: ===
- 1: ==
-    ab==cd
- 0: ==
+/((((((((((a))))))))))/
+    a
+ 0: a
+
+/(((((((((a)))))))))/
+    a
+ 0: a
+
+/multiple words of text/
     *** Failers
- 0: ***
- 1: **
-    a.b.c
 No match
+    aa
+No match
+    uh-uh
+No match

-/\W{2,3}?/8
-    ab====cd
- 0: ===
- 1: ==
-    ab==cd
- 0: ==
+/multiple words/
+    multiple words, yeah
+ 0: multiple words
+
+/(.*)c(.*)/
+    abcde
+ 0: abcde
+ 1: abcd
+ 2: abc
+
+/\((.*), (.*)\)/
+    (a, b)
+ 0: (a, b)
+
+/[k]/
+
+/abcd/
+    abcd
+ 0: abcd
+
+/a(bc)d/
+    abcd
+ 0: abcd
+
+/a[-]?c/
+    ac
+ 0: ac
+
+/abc/i
+    ABC
+ 0: ABC
+    XABCY
+ 0: ABC
+    ABABC
+ 0: ABC
     *** Failers
- 0: ***
- 1: **
-    a.b.c
 No match
+    aaxabxbaxbbx
+No match
+    XBC
+No match
+    AXC
+No match
+    ABX
+No match

-/[\x{100}]/8
-    \x{100}
- 0: \x{100}
-    Z\x{100}
- 0: \x{100}
-    \x{100}Z
- 0: \x{100}
-    *** Failers 
+/ab*c/i
+    ABC
+ 0: ABC
+
+/ab*bc/i
+    ABC
+ 0: ABC
+    ABBC
+ 0: ABBC
+
+/ab*?bc/i
+    ABBBBC
+ 0: ABBBBC
+
+/ab{0,}?bc/i
+    ABBBBC
+ 0: ABBBBC
+
+/ab+?bc/i
+    ABBC
+ 0: ABBC
+
+/ab+bc/i
+    *** Failers
 No match
+    ABC
+No match
+    ABQ
+No match

-/[Z\x{100}]/8
-    Z\x{100}
- 0: Z
-    \x{100}
- 0: \x{100}
-    \x{100}Z
- 0: \x{100}
-    *** Failers 
+/ab{1,}bc/i
+
+/ab+bc/i
+    ABBBBC
+ 0: ABBBBC
+
+/ab{1,}?bc/i
+    ABBBBC
+ 0: ABBBBC
+
+/ab{1,3}?bc/i
+    ABBBBC
+ 0: ABBBBC
+
+/ab{3,4}?bc/i
+    ABBBBC
+ 0: ABBBBC
+
+/ab{4,5}?bc/i
+    *** Failers
 No match
+    ABQ
+No match
+    ABBBBC
+No match

-/[\x{100}\x{200}]/8
-   ab\x{100}cd
- 0: \x{100}
-   ab\x{200}cd
- 0: \x{200}
-   *** Failers  
+/ab??bc/i
+    ABBC
+ 0: ABBC
+    ABC
+ 0: ABC
+
+/ab{0,1}?bc/i
+    ABC
+ 0: ABC
+
+/ab??bc/i
+
+/ab??c/i
+    ABC
+ 0: ABC
+
+/ab{0,1}?c/i
+    ABC
+ 0: ABC
+
+/^abc$/i
+    ABC
+ 0: ABC
+    *** Failers
 No match
+    ABBBBC
+No match
+    ABCC
+No match

-/[\x{100}-\x{200}]/8
-   ab\x{100}cd
- 0: \x{100}
-   ab\x{200}cd
- 0: \x{200}
-   ab\x{111}cd 
- 0: \x{111}
-   *** Failers  
+/^abc/i
+    ABCC
+ 0: ABC
+
+/^abc$/i
+
+/abc$/i
+    AABC
+ 0: ABC
+
+/^/i
+    ABC
+ 0: 
+
+/$/i
+    ABC
+ 0: 
+
+/a.c/i
+    ABC
+ 0: ABC
+    AXC
+ 0: AXC
+
+/a.*?c/i
+    AXYZC
+ 0: AXYZC
+
+/a.*c/i
+    *** Failers
 No match
+    AABC
+ 0: AABC
+    AXYZD
+No match

-/[z-\x{200}]/8
-   ab\x{100}cd
- 0: \x{100}
-   ab\x{200}cd
- 0: \x{200}
-   ab\x{111}cd 
- 0: \x{111}
-   abzcd
- 0: z
-   ab|cd  
- 0: |
-   *** Failers  
+/a[bc]d/i
+    ABD
+ 0: ABD
+
+/a[b-d]e/i
+    ACE
+ 0: ACE
+    *** Failers
 No match
+    ABC
+No match
+    ABD
+No match

-/[Q\x{100}\x{200}]/8
-   ab\x{100}cd
- 0: \x{100}
-   ab\x{200}cd
- 0: \x{200}
-   Q? 
- 0: Q
-   *** Failers  
+/a[b-d]/i
+    AAC
+ 0: AC
+
+/a[-b]/i
+    A-
+ 0: A-
+
+/a[b-]/i
+    A-
+ 0: A-
+
+/a]/i
+    A]
+ 0: A]
+
+/a[]]b/i
+    A]B
+ 0: A]B
+
+/a[^bc]d/i
+    AED
+ 0: AED
+
+/a[^-b]c/i
+    ADC
+ 0: ADC
+    *** Failers
 No match
+    ABD
+No match
+    A-C
+No match

-/[Q\x{100}-\x{200}]/8
-   ab\x{100}cd
- 0: \x{100}
-   ab\x{200}cd
- 0: \x{200}
-   ab\x{111}cd 
- 0: \x{111}
-   Q? 
- 0: Q
-   *** Failers  
+/a[^]b]c/i
+    ADC
+ 0: ADC
+
+/ab|cd/i
+    ABC
+ 0: AB
+    ABCD
+ 0: AB
+
+/()ef/i
+    DEF
+ 0: EF
+
+/$b/i
+    *** Failers
 No match
+    A]C
+No match
+    B
+No match

-/[Qz-\x{200}]/8
-   ab\x{100}cd
- 0: \x{100}
-   ab\x{200}cd
- 0: \x{200}
-   ab\x{111}cd 
- 0: \x{111}
-   abzcd
- 0: z
-   ab|cd  
- 0: |
-   Q? 
- 0: Q
-   *** Failers  
+/a\(b/i
+    A(B
+ 0: A(B
+
+/a\(*b/i
+    AB
+ 0: AB
+    A((B
+ 0: A((B
+
+/a\\b/i
+    A\B
 No match

-/[\x{100}\x{200}]{1,3}/8
-   ab\x{100}cd
- 0: \x{100}
-   ab\x{200}cd
- 0: \x{200}
-   ab\x{200}\x{100}\x{200}\x{100}cd
- 0: \x{200}\x{100}\x{200}
- 1: \x{200}\x{100}
- 2: \x{200}
-   *** Failers  
+/((a))/i
+    ABC
+ 0: A
+
+/(a)b(c)/i
+    ABC
+ 0: ABC
+
+/a+b+c/i
+    AABBABC
+ 0: ABC
+
+/a{1,}b{1,}c/i
+    AABBABC
+ 0: ABC
+
+/a.+?c/i
+    ABCABC
+ 0: ABCABC
+ 1: ABC
+
+/a.*?c/i
+    ABCABC
+ 0: ABCABC
+ 1: ABC
+
+/a.{0,5}?c/i
+    ABCABC
+ 0: ABCABC
+ 1: ABC
+
+/(a+|b)*/i
+    AB
+ 0: AB
+ 1: A
+ 2: 
+
+/(a+|b){0,}/i
+    AB
+ 0: AB
+ 1: A
+ 2: 
+
+/(a+|b)+/i
+    AB
+ 0: AB
+ 1: A
+
+/(a+|b){1,}/i
+    AB
+ 0: AB
+ 1: A
+
+/(a+|b)?/i
+    AB
+ 0: A
+ 1: 
+
+/(a+|b){0,1}/i
+    AB
+ 0: A
+ 1: 
+
+/(a+|b){0,1}?/i
+    AB
+ 0: A
+ 1: 
+
+/[^ab]*/i
+    CDE
+ 0: CDE
+ 1: CD
+ 2: C
+ 3: 
+
+/abc/i
+
+/a*/i
+    
+
+/([abc])*d/i
+    ABBBCD
+ 0: ABBBCD
+
+/([abc])*bcd/i
+    ABCD
+ 0: ABCD
+
+/a|b|c|d|e/i
+    E
+ 0: E
+
+/(a|b|c|d|e)f/i
+    EF
+ 0: EF
+
+/abcd*efg/i
+    ABCDEFG
+ 0: ABCDEFG
+
+/ab*/i
+    XABYABBBZ
+ 0: AB
+ 1: A
+    XAYABBBZ
+ 0: A
+
+/(ab|cd)e/i
+    ABCDE
+ 0: CDE
+
+/[abhgefdc]ij/i
+    HIJ
+ 0: HIJ
+
+/^(ab|cd)e/i
+    ABCDE
 No match

-/[\x{100}\x{200}]{1,3}?/8
-   ab\x{100}cd
- 0: \x{100}
-   ab\x{200}cd
- 0: \x{200}
-   ab\x{200}\x{100}\x{200}\x{100}cd
- 0: \x{200}\x{100}\x{200}
- 1: \x{200}\x{100}
- 2: \x{200}
-   *** Failers  
+/(abc|)ef/i
+    ABCDEF
+ 0: EF
+
+/(a|b)c*d/i
+    ABCD
+ 0: BCD
+
+/(ab|ab*)bc/i
+    ABC
+ 0: ABC
+
+/a([bc]*)c*/i
+    ABC
+ 0: ABC
+ 1: AB
+ 2: A
+
+/a([bc]*)(c*d)/i
+    ABCD
+ 0: ABCD
+
+/a([bc]+)(c*d)/i
+    ABCD
+ 0: ABCD
+
+/a([bc]*)(c+d)/i
+    ABCD
+ 0: ABCD
+
+/a[bcd]*dcdcde/i
+    ADCDCDE
+ 0: ADCDCDE
+
+/a[bcd]+dcdcde/i
+
+/(ab|a)b*c/i
+    ABC
+ 0: ABC
+
+/((a)(b)c)(d)/i
+    ABCD
+ 0: ABCD
+
+/[a-zA-Z_][a-zA-Z0-9_]*/i
+    ALPHA
+ 0: ALPHA
+ 1: ALPH
+ 2: ALP
+ 3: AL
+ 4: A
+
+/^a(bc+|b[eh])g|.h$/i
+    ABH
+ 0: BH
+
+/(bc+d$|ef*g.|h?i(j|k))/i
+    EFFGZ
+ 0: EFFGZ
+    IJ
+ 0: IJ
+    REFFGZ
+ 0: EFFGZ
+    *** Failers
 No match
+    ADCDCDE
+No match
+    EFFG
+No match
+    BCDD
+No match

-/[Q\x{100}\x{200}]{1,3}/8
-   ab\x{100}cd
- 0: \x{100}
-   ab\x{200}cd
- 0: \x{200}
-   ab\x{200}\x{100}\x{200}\x{100}cd
- 0: \x{200}\x{100}\x{200}
- 1: \x{200}\x{100}
- 2: \x{200}
-   *** Failers  
+/((((((((((a))))))))))/i
+    A
+ 0: A
+
+/(((((((((a)))))))))/i
+    A
+ 0: A
+
+/(?:(?:(?:(?:(?:(?:(?:(?:(?:(a))))))))))/i
+    A
+ 0: A
+
+/(?:(?:(?:(?:(?:(?:(?:(?:(?:(a|b|c))))))))))/i
+    C
+ 0: C
+
+/multiple words of text/i
+    *** Failers
 No match
+    AA
+No match
+    UH-UH
+No match

-/[Q\x{100}\x{200}]{1,3}?/8
-   ab\x{100}cd
- 0: \x{100}
-   ab\x{200}cd
- 0: \x{200}
-   ab\x{200}\x{100}\x{200}\x{100}cd
- 0: \x{200}\x{100}\x{200}
- 1: \x{200}\x{100}
- 2: \x{200}
-   *** Failers  
+/multiple words/i
+    MULTIPLE WORDS, YEAH
+ 0: MULTIPLE WORDS
+
+/(.*)c(.*)/i
+    ABCDE
+ 0: ABCDE
+ 1: ABCD
+ 2: ABC
+
+/\((.*), (.*)\)/i
+    (A, B)
+ 0: (A, B)
+
+/[k]/i
+
+/abcd/i
+    ABCD
+ 0: ABCD
+
+/a(bc)d/i
+    ABCD
+ 0: ABCD
+
+/a[-]?c/i
+    AC
+ 0: AC
+
+/a(?!b)./
+    abad
+ 0: ad
+
+/a(?=d)./
+    abad
+ 0: ad
+
+/a(?=c|d)./
+    abad
+ 0: ad
+
+/a(?:b|c|d)(.)/
+    ace
+ 0: ace
+
+/a(?:b|c|d)*(.)/
+    ace
+ 0: ace
+ 1: ac
+
+/a(?:b|c|d)+?(.)/
+    ace
+ 0: ace
+    acdbcdbe
+ 0: acdbcdbe
+ 1: acdbcdb
+ 2: acdbcd
+ 3: acdbc
+ 4: acdb
+ 5: acd
+
+/a(?:b|c|d)+(.)/
+    acdbcdbe
+ 0: acdbcdbe
+ 1: acdbcdb
+ 2: acdbcd
+ 3: acdbc
+ 4: acdb
+ 5: acd
+
+/a(?:b|c|d){2}(.)/
+    acdbcdbe
+ 0: acdb
+
+/a(?:b|c|d){4,5}(.)/
+    acdbcdbe
+ 0: acdbcdb
+ 1: acdbcd
+
+/a(?:b|c|d){4,5}?(.)/
+    acdbcdbe
+ 0: acdbcdb
+ 1: acdbcd
+
+/((foo)|(bar))*/
+    foobar
+ 0: foobar
+ 1: foo
+ 2: 
+
+/a(?:b|c|d){6,7}(.)/
+    acdbcdbe
+ 0: acdbcdbe
+
+/a(?:b|c|d){6,7}?(.)/
+    acdbcdbe
+ 0: acdbcdbe
+
+/a(?:b|c|d){5,6}(.)/
+    acdbcdbe
+ 0: acdbcdbe
+ 1: acdbcdb
+
+/a(?:b|c|d){5,6}?(.)/
+    acdbcdbe
+ 0: acdbcdbe
+ 1: acdbcdb
+
+/a(?:b|c|d){5,7}(.)/
+    acdbcdbe
+ 0: acdbcdbe
+ 1: acdbcdb
+
+/a(?:b|c|d){5,7}?(.)/
+    acdbcdbe
+ 0: acdbcdbe
+ 1: acdbcdb
+
+/a(?:b|(c|e){1,2}?|d)+?(.)/
+    ace
+ 0: ace
+
+/^(.+)?B/
+    AB
+ 0: AB
+
+/^([^a-z])|(\^)$/
+    .
+ 0: .
+
+/^[<>]&/
+    <&OUT
+ 0: <&
+
+/(?:(f)(o)(o)|(b)(a)(r))*/
+    foobar
+ 0: foobar
+ 1: foo
+ 2: 
+
+/(?<=a)b/
+    ab
+ 0: b
+    *** Failers
 No match
+    cb
+No match
+    b
+No match

-/(?<=[\x{100}\x{200}])X/8
-    abc\x{200}X
- 0: X
-    abc\x{100}X 
- 0: X
+/(?<!c)b/
+    ab
+ 0: b
+    b
+ 0: b
+    b
+ 0: b
+
+/(?:..)*a/
+    aba
+ 0: aba
+ 1: a
+
+/(?:..)*?a/
+    aba
+ 0: aba
+ 1: a
+
+/^(){3,5}/
+    abc
+ 0: 
+
+/^(a+)*ax/
+    aax
+ 0: aax
+
+/^((a|b)+)*ax/
+    aax
+ 0: aax
+
+/^((a|bc)+)*ax/
+    aax
+ 0: aax
+
+/(a|x)*ab/
+    cab
+ 0: ab
+
+/(a)*ab/
+    cab
+ 0: ab
+
+/(?:(?i)a)b/
+    ab
+ 0: ab
+
+/((?i)a)b/
+    ab
+ 0: ab
+
+/(?:(?i)a)b/
+    Ab
+ 0: Ab
+
+/((?i)a)b/
+    Ab
+ 0: Ab
+
+/(?:(?i)a)b/
     *** Failers
 No match
-    X  
+    cb
 No match
+    aB
+No match

-/(?<=[Q\x{100}\x{200}])X/8
-    abc\x{200}X
- 0: X
-    abc\x{100}X 
- 0: X
-    abQX 
- 0: X
+/((?i)a)b/
+
+/(?i:a)b/
+    ab
+ 0: ab
+
+/((?i:a))b/
+    ab
+ 0: ab
+
+/(?i:a)b/
+    Ab
+ 0: Ab
+
+/((?i:a))b/
+    Ab
+ 0: Ab
+
+/(?i:a)b/
     *** Failers
 No match
-    X  
+    aB
 No match
+    aB
+No match

-/(?<=[\x{100}\x{200}]{3})X/8
-    abc\x{100}\x{200}\x{100}X
- 0: X
+/((?i:a))b/
+
+/(?:(?-i)a)b/i
+    ab
+ 0: ab
+
+/((?-i)a)b/i
+    ab
+ 0: ab
+
+/(?:(?-i)a)b/i
+    aB
+ 0: aB
+
+/((?-i)a)b/i
+    aB
+ 0: aB
+
+/(?:(?-i)a)b/i
     *** Failers
 No match
-    abc\x{200}X
+    aB
+ 0: aB
+    Ab
 No match
-    X  
+
+/((?-i)a)b/i
+
+/(?:(?-i)a)b/i
+    aB
+ 0: aB
+
+/((?-i)a)b/i
+    aB
+ 0: aB
+
+/(?:(?-i)a)b/i
+    *** Failers
 No match
+    Ab
+No match
+    AB
+No match

-/[^\x{100}\x{200}]X/8
-    AX
- 0: AX
-    \x{150}X
- 0: \x{150}X
-    \x{500}X 
- 0: \x{500}X
+/((?-i)a)b/i
+
+/(?-i:a)b/i
+    ab
+ 0: ab
+
+/((?-i:a))b/i
+    ab
+ 0: ab
+
+/(?-i:a)b/i
+    aB
+ 0: aB
+
+/((?-i:a))b/i
+    aB
+ 0: aB
+
+/(?-i:a)b/i
     *** Failers
 No match
-    \x{100}X
+    AB
 No match
-    \x{200}X   
+    Ab
 No match

-/[^Q\x{100}\x{200}]X/8
-    AX
- 0: AX
-    \x{150}X
- 0: \x{150}X
-    \x{500}X 
- 0: \x{500}X
+/((?-i:a))b/i
+
+/(?-i:a)b/i
+    aB
+ 0: aB
+
+/((?-i:a))b/i
+    aB
+ 0: aB
+
+/(?-i:a)b/i
     *** Failers
 No match
-    \x{100}X
+    Ab
 No match
-    \x{200}X   
+    AB
 No match
-    QX 
+
+/((?-i:a))b/i
+
+/((?-i:a.))b/i
+    *** Failers
 No match
+    AB
+No match
+    a\nB
+No match

-/[^\x{100}-\x{200}]X/8
-    AX
- 0: AX
-    \x{500}X 
- 0: \x{500}X
+/((?s-i:a.))b/i
+    a\nB
+ 0: a\x0aB
+
+/(?:c|d)(?:)(?:a(?:)(?:b)(?:b(?:))(?:b(?:)(?:b)))/
+    cabbbb
+ 0: cabbbb
+
+/(?:c|d)(?:)(?:aaaaaaaa(?:)(?:bbbbbbbb)(?:bbbbbbbb(?:))(?:bbbbbbbb(?:)(?:bbbbbbbb)))/
+    caaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb
+ 0: caaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb
+
+/foo\w*\d{4}baz/
+    foobar1234baz
+ 0: foobar1234baz
+
+/x(~~)*(?:(?:F)?)?/
+    x~~
+ 0: x~~
+ 1: x
+
+/^a(?#xxx){3}c/
+    aaac
+ 0: aaac
+
+/^a (?#xxx) (?#yyy) {3}c/x
+    aaac
+ 0: aaac
+
+/(?<![cd])b/
     *** Failers
 No match
-    \x{100}X
+    B\nB
 No match
-    \x{150}X
+    dbcb
 No match
-    \x{200}X   
+
+/(?<![cd])[ab]/
+    dbaacb
+ 0: a
+
+/(?<!(c|d))b/
+
+/(?<!(c|d))[ab]/
+    dbaacb
+ 0: a
+
+/(?<!cd)[ab]/
+    cdaccb
+ 0: b
+
+/^(?:a?b?)*$/
+    *** Failers
 No match
+    dbcb
+No match
+    a--
+No match

-/[z-\x{100}]/8i
-    z
- 0: z
-    Z 
- 0: Z
-    \x{100}
- 0: \x{100}
+/((?s)^a(.))((?m)^b$)/
+    a\nb\nc\n
+ 0: a\x0ab
+
+/((?m)^b$)/
+    a\nb\nc\n
+ 0: b
+
+/(?m)^b/
+    a\nb\n
+ 0: b
+
+/(?m)^(b)/
+    a\nb\n
+ 0: b
+
+/((?m)^b)/
+    a\nb\n
+ 0: b
+
+/\n((?m)^b)/
+    a\nb\n
+ 0: \x0ab
+
+/((?s).)c(?!.)/
+    a\nb\nc\n
+ 0: \x0ac
+    a\nb\nc\n
+ 0: \x0ac
+
+/((?s)b.)c(?!.)/
+    a\nb\nc\n
+ 0: b\x0ac
+    a\nb\nc\n
+ 0: b\x0ac
+
+/^b/
+
+/()^b/
     *** Failers
 No match
-    \x{102}
+    a\nb\nc\n
 No match
-    y    
+    a\nb\nc\n
 No match

-/[\xFF]/
-    >\xff<
- 0: \xff
+/((?m)^b)/
+    a\nb\nc\n
+ 0: b

-/[\xff]/8
-    >\x{ff}<
- 0: \x{ff}
+/(?(?!a)a|b)/

-/[^\xFF]/
-    XYZ
- 0: X
+/(?(?!a)b|a)/
+    a
+ 0: a

-/[^\xff]/8
-    XYZ
- 0: X
-    \x{123} 
- 0: \x{123}
+/(?(?=a)b|a)/
+    *** Failers
+No match
+    a
+No match
+    a
+No match

-/^[ac]*b/8
-  xb
+/(?(?=a)a|b)/
+    a
+ 0: a
+
+/(\w+:)+/
+    one:
+ 0: one:
+
+/$(?<=^(a))/
+    a
+ 0: 
+
+/([\w:]+::)?(\w+)$/
+    abcd
+ 0: abcd
+    xy:z:::abcd
+ 0: xy:z:::abcd
+
+/^[^bcd]*(c+)/
+    aexycd
+ 0: aexyc
+
+/(a*)b+/
+    caab
+ 0: aab
+
+/([\w:]+::)?(\w+)$/
+    abcd
+ 0: abcd
+    xy:z:::abcd
+ 0: xy:z:::abcd
+    *** Failers
+ 0: Failers
+    abcd:
 No match
+    abcd:
+No match

-/^[ac\x{100}]*b/8
-  xb
+/^[^bcd]*(c+)/
+    aexycd
+ 0: aexyc
+
+/(>a+)ab/
+
+/(?>a+)b/
+    aaab
+ 0: aaab
+
+/([[:]+)/
+    a:[b]:
+ 0: :[
+ 1: :
+
+/([[=]+)/
+    a=[b]=
+ 0: =[
+ 1: =
+
+/([[.]+)/
+    a.[b].
+ 0: .[
+ 1: .
+
+/((?>a+)b)/
+    aaab
+ 0: aaab
+
+/(?>(a+))b/
+    aaab
+ 0: aaab
+
+/((?>[^()]+)|\([^()]*\))+/
+    ((abc(ade)ufh()()x
+ 0: abc(ade)ufh()()x
+ 1: abc(ade)ufh()()
+ 2: abc(ade)ufh()
+ 3: abc(ade)ufh
+ 4: abc(ade)
+ 5: abc
+
+/a\Z/
+    *** Failers
 No match
+    aaab
+No match
+    a\nb\n
+No match

-/^[^x]*b/8i
-  xb
+/b\Z/
+    a\nb\n
+ 0: b
+
+/b\z/
+
+/b\Z/
+    a\nb
+ 0: b
+
+/b\z/
+    a\nb
+ 0: b
+    *** Failers
 No match
+    
+/(?>.*)(?<=(abcd|wxyz))/
+    alphabetabcd
+ 0: alphabetabcd
+    endingwxyz
+ 0: endingwxyz
+    *** Failers
+No match
+    a rather long string that doesn't end with one of them
+No match

-/^[^x]*b/8
-  xb
+/word (?>(?:(?!otherword)[a-zA-Z0-9]+ ){0,30})otherword/
+    word cat dog elephant mussel cow horse canary baboon snake shark otherword
+ 0: word cat dog elephant mussel cow horse canary baboon snake shark otherword
+    word cat dog elephant mussel cow horse canary baboon snake shark
 No match

-/^\d*b/8
-  xb 
+/word (?>[a-zA-Z0-9]+ ){0,30}otherword/
+    word cat dog elephant mussel cow horse canary baboon snake shark the quick brown fox and the lazy dog and several other words getting close to thirty by now I hope
 No match

-/(|a)/g8
-    catac
+/(?<=\d{3}(?!999))foo/
+    999foo
+ 0: foo
+    123999foo 
+ 0: foo
+    *** Failers
+No match
+    123abcfoo
+No match
+    
+/(?<=(?!...999)\d{3})foo/
+    999foo
+ 0: foo
+    123999foo 
+ 0: foo
+    *** Failers
+No match
+    123abcfoo
+No match
+
+/(?<=\d{3}(?!999)...)foo/
+    123abcfoo
+ 0: foo
+    123456foo 
+ 0: foo
+    *** Failers
+No match
+    123999foo  
+No match
+    
+/(?<=\d{3}...)(?<!999)foo/
+    123abcfoo   
+ 0: foo
+    123456foo 
+ 0: foo
+    *** Failers
+No match
+    123999foo  
+No match
+
+/((Z)+|A)*/
+    ZABCDEFG
+ 0: ZA
+ 1: Z
+ 2: 
+
+/(Z()|A)*/
+    ZABCDEFG
+ 0: ZA
+ 1: Z
+ 2: 
+
+/(Z(())|A)*/
+    ZABCDEFG
+ 0: ZA
+ 1: Z
+ 2: 
+
+/((?>Z)+|A)*/
+    ZABCDEFG
+ 0: ZA
+ 1: Z
+ 2: 
+
+/((?>)+|A)*/
+    ZABCDEFG
  0: 
+
+/a*/g
+    abbab
  0: a
  1: 
  0: 
+ 0: 
  0: a
  1: 
  0: 
  0: 
-    a\x{256}a 
+
+/^[a-\d]/
+    abcde
  0: a
- 1: 
- 0: 
+    -things
+ 0: -
+    0digit
+ 0: 0
+    *** Failers
+No match
+    bcdef    
+No match
+
+/^[\d-a]/
+    abcde
  0: a
- 1: 
+    -things
+ 0: -
+    0digit
+ 0: 0
+    *** Failers
+No match
+    bcdef    
+No match
+    
+/[[:space:]]+/
+    > \x09\x0a\x0c\x0d\x0b<
+ 0:  \x09\x0a\x0c\x0d\x0b
+ 1:  \x09\x0a\x0c\x0d
+ 2:  \x09\x0a\x0c
+ 3:  \x09\x0a
+ 4:  \x09
+ 5:  
+     
+/[[:blank:]]+/
+    > \x09\x0a\x0c\x0d\x0b<
+ 0:  \x09
+ 1:  
+     
+/[\s]+/
+    > \x09\x0a\x0c\x0d\x0b<
+ 0:  \x09\x0a\x0c\x0d
+ 1:  \x09\x0a\x0c
+ 2:  \x09\x0a
+ 3:  \x09
+ 4:  
+     
+/\s+/
+    > \x09\x0a\x0c\x0d\x0b<
+ 0:  \x09\x0a\x0c\x0d
+ 1:  \x09\x0a\x0c
+ 2:  \x09\x0a
+ 3:  \x09
+ 4:  
+     
+/a?b/x
+    ab
+No match
+
+/(?!\A)x/m
+  a\nxb\n
+ 0: x
+
+/(?!^)x/m
+  a\nxb\n
+No match
+
+/abc\Qabc\Eabc/
+    abcabcabc
+ 0: abcabcabc
+    
+/abc\Q(*+|\Eabc/
+    abc(*+|abc 
+ 0: abc(*+|abc
+
+/   abc\Q abc\Eabc/x
+    abc abcabc
+ 0: abc abcabc
+    *** Failers
+No match
+    abcabcabc  
+No match
+    
+/abc#comment
+    \Q#not comment
+    literal\E/x
+    abc#not comment\n    literal     
+ 0: abc#not comment\x0a    literal
+
+/abc#comment
+    \Q#not comment
+    literal/x
+    abc#not comment\n    literal     
+ 0: abc#not comment\x0a    literal
+
+/abc#comment
+    \Q#not comment
+    literal\E #more comment
+    /x
+    abc#not comment\n    literal     
+ 0: abc#not comment\x0a    literal
+
+/abc#comment
+    \Q#not comment
+    literal\E #more comment/x
+    abc#not comment\n    literal     
+ 0: abc#not comment\x0a    literal
+
+/\Qabc\$xyz\E/
+    abc\\\$xyz
+ 0: abc\$xyz
+
+/\Qabc\E\$\Qxyz\E/
+    abc\$xyz
+ 0: abc$xyz
+
+/\Gabc/
+    abc
+ 0: abc
+    *** Failers
+No match
+    xyzabc  
+No match
+
+/\Gabc./g
+    abc1abc2xyzabc3
+ 0: abc1
+ 0: abc2
+
+/abc./g
+    abc1abc2xyzabc3 
+ 0: abc1
+ 0: abc2
+ 0: abc3
+
+/a(?x: b c )d/
+    XabcdY
+ 0: abcd
+    *** Failers 
+No match
+    Xa b c d Y 
+No match
+
+/((?x)x y z | a b c)/
+    XabcY
+ 0: abc
+    AxyzB 
+ 0: xyz
+
+/(?i)AB(?-i)C/
+    XabCY
+ 0: abC
+    *** Failers
+No match
+    XabcY  
+No match
+
+/((?i)AB(?-i)C|D)E/
+    abCE
+ 0: abCE
+    DE
+ 0: DE
+    *** Failers
+No match
+    abcE
+No match
+    abCe  
+No match
+    dE
+No match
+    De    
+No match
+
+/[z\Qa-d]\E]/
+    z
+ 0: z
+    a
+ 0: a
+    -
+ 0: -
+    d
+ 0: d
+    ] 
+ 0: ]
+    *** Failers
+ 0: a
+    b     
+No match
+
+/[\z\C]/
+    z
+ 0: z
+    C 
+ 0: C
+    
+/\M/
+    M 
+ 0: M
+    
+/(a+)*b/
+    aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa 
+No match
+    
+/(?i)reg(?:ul(?:[a\xE4]|ae)r|ex)/
+    REGular
+ 0: REGular
+    regulaer
+ 0: regulaer
+    Regex  
+ 0: Regex
+    regul\xE4r 
+ 0: regul\xe4r
+
+/\xC5\xE6\xE5\xE4[\xE0-\xFF\xC0-\xDF]+/
+    \xC5\xE6\xE5\xE4\xE0
+ 0: \xc5\xe6\xe5\xe4\xe0
+    \xC5\xE6\xE5\xE4\xFF
+ 0: \xc5\xe6\xe5\xe4\xff
+    \xC5\xE6\xE5\xE4\xC0
+ 0: \xc5\xe6\xe5\xe4\xc0
+    \xC5\xE6\xE5\xE4\xDF
+ 0: \xc5\xe6\xe5\xe4\xdf
+
+/(?<=Z)X./
+    \x84XAZXB
+ 0: XB
+
+/^(?(2)a|(1)(2))+$/
+    123a
+Error -17 (backreference condition or recursion test not supported for DFA matching)
+
+/(?<=a|bbbb)c/
+    ac
+ 0: c
+    bbbbc
+ 0: c
+
+/abc/SS>testsavedregex
+Compiled pattern written to testsavedregex
+<testsavedregex
+Compiled pattern loaded from testsavedregex
+No study data
+    abc
+ 0: abc
+    *** Failers
+No match
+    bca
+No match
+    
+/abc/FSS>testsavedregex
+Compiled pattern written to testsavedregex
+<testsavedregex
+Compiled pattern (byte-inverted) loaded from testsavedregex
+No study data
+    abc
+ 0: abc
+    *** Failers
+No match
+    bca
+No match
+
+/(a|b)/S>testsavedregex
+Compiled pattern written to testsavedregex
+Study data written to testsavedregex
+<testsavedregex
+Compiled pattern loaded from testsavedregex
+Study data loaded from testsavedregex
+    abc
+ 0: a
+    *** Failers
+ 0: a
+    def  
+No match
+    
+/(a|b)/SF>testsavedregex
+Compiled pattern written to testsavedregex
+Study data written to testsavedregex
+<testsavedregex
+Compiled pattern (byte-inverted) loaded from testsavedregex
+Study data loaded from testsavedregex
+    abc
+ 0: a
+    *** Failers
+ 0: a
+    def  
+No match
+    
+/line\nbreak/
+    this is a line\nbreak
+ 0: line\x0abreak
+    line one\nthis is a line\nbreak in the second line 
+ 0: line\x0abreak
+
+/line\nbreak/f
+    this is a line\nbreak
+ 0: line\x0abreak
+    ** Failers 
+No match
+    line one\nthis is a line\nbreak in the second line 
+No match
+
+/line\nbreak/mf
+    this is a line\nbreak
+ 0: line\x0abreak
+    ** Failers 
+No match
+    line one\nthis is a line\nbreak in the second line 
+No match
+
+/1234/
+    123\P
+Partial match: 123
+    a4\P\R
+No match
+
+/1234/
+    123\P
+Partial match: 123
+    4\P\R
+ 0: 4
+
+/^/mg
+    a\nb\nc\n
  0: 
+ 0: 
+ 0: 
+    \ 
+ 0: 
+    
+/(?<=C\n)^/mg
+    A\nC\nC\n 
+ 0:

-/^\x{85}$/8i
-    \x{85}
- 0: \x{85}
+/(?s)A?B/
+    AB
+ 0: AB
+    aB  
+ 0: B

-/^abc./mgx8<any>
-    abc1 \x0aabc2 \x0babc3xx \x0cabc4 \x0dabc5xx \x0d\x0aabc6 \x{0085}abc7 \x{2028}abc8 \x{2029}abc9 JUNK
+/(?s)A*B/
+    AB
+ 0: AB
+    aB  
+ 0: B
+
+/(?m)A?B/
+    AB
+ 0: AB
+    aB  
+ 0: B
+
+/(?m)A*B/
+    AB
+ 0: AB
+    aB  
+ 0: B
+
+/Content-Type\x3A[^\r\n]{6,}/
+    Content-Type:xxxxxyyy 
+ 0: Content-Type:xxxxxyyy
+ 1: Content-Type:xxxxxyy
+ 2: Content-Type:xxxxxy
+
+/Content-Type\x3A[^\r\n]{6,}z/
+    Content-Type:xxxxxyyyz
+ 0: Content-Type:xxxxxyyyz
+
+/Content-Type\x3A[^a]{6,}/
+    Content-Type:xxxyyy 
+ 0: Content-Type:xxxyyy
+
+/Content-Type\x3A[^a]{6,}z/
+    Content-Type:xxxyyyz
+ 0: Content-Type:xxxyyyz
+
+/^abc/m
+    xyz\nabc
+ 0: abc
+    xyz\nabc\<lf>
+ 0: abc
+    xyz\r\nabc\<lf>
+ 0: abc
+    xyz\rabc\<cr>
+ 0: abc
+    xyz\r\nabc\<crlf>
+ 0: abc
+    ** Failers 
+No match
+    xyz\nabc\<cr>
+No match
+    xyz\r\nabc\<cr>
+No match
+    xyz\nabc\<crlf>
+No match
+    xyz\rabc\<crlf>
+No match
+    xyz\rabc\<lf>
+No match
+    
+/abc$/m<lf>
+    xyzabc
+ 0: abc
+    xyzabc\n 
+ 0: abc
+    xyzabc\npqr 
+ 0: abc
+    xyzabc\r\<cr> 
+ 0: abc
+    xyzabc\rpqr\<cr> 
+ 0: abc
+    xyzabc\r\n\<crlf> 
+ 0: abc
+    xyzabc\r\npqr\<crlf> 
+ 0: abc
+    ** Failers
+No match
+    xyzabc\r 
+No match
+    xyzabc\rpqr 
+No match
+    xyzabc\r\n 
+No match
+    xyzabc\r\npqr 
+No match
+    
+/^abc/m<cr>
+    xyz\rabcdef
+ 0: abc
+    xyz\nabcdef\<lf>
+ 0: abc
+    ** Failers  
+No match
+    xyz\nabcdef
+No match
+       
+/^abc/m<lf>
+    xyz\nabcdef
+ 0: abc
+    xyz\rabcdef\<cr>
+ 0: abc
+    ** Failers  
+No match
+    xyz\rabcdef
+No match
+       
+/^abc/m<crlf>
+    xyz\r\nabcdef
+ 0: abc
+    xyz\rabcdef\<cr>
+ 0: abc
+    ** Failers  
+No match
+    xyz\rabcdef
+No match
+    
+/.*/<lf>
+    abc\ndef
+ 0: abc
+ 1: ab
+ 2: a
+ 3: 
+    abc\rdef
+ 0: abc\x0ddef
+ 1: abc\x0dde
+ 2: abc\x0dd
+ 3: abc\x0d
+ 4: abc
+ 5: ab
+ 6: a
+ 7: 
+    abc\r\ndef
+ 0: abc\x0d
+ 1: abc
+ 2: ab
+ 3: a
+ 4: 
+    \<cr>abc\ndef
+ 0: abc\x0adef
+ 1: abc\x0ade
+ 2: abc\x0ad
+ 3: abc\x0a
+ 4: abc
+ 5: ab
+ 6: a
+ 7: 
+    \<cr>abc\rdef
+ 0: abc
+ 1: ab
+ 2: a
+ 3: 
+    \<cr>abc\r\ndef
+ 0: abc
+ 1: ab
+ 2: a
+ 3: 
+    \<crlf>abc\ndef
+ 0: abc\x0adef
+ 1: abc\x0ade
+ 2: abc\x0ad
+ 3: abc\x0a
+ 4: abc
+ 5: ab
+ 6: a
+ 7: 
+    \<crlf>abc\rdef
+ 0: abc\x0ddef
+ 1: abc\x0dde
+ 2: abc\x0dd
+ 3: abc\x0d
+ 4: abc
+ 5: ab
+ 6: a
+ 7: 
+    \<crlf>abc\r\ndef
+ 0: abc
+ 1: ab
+ 2: a
+ 3: 
+
+/\w+(.)(.)?def/s
+    abc\ndef
+ 0: abc\x0adef
+    abc\rdef
+ 0: abc\x0ddef
+    abc\r\ndef
+ 0: abc\x0d\x0adef
+
+/^\w+=.*(\\\n.*)*/
+    abc=xyz\\\npqr
+ 0: abc=xyz\\x0apqr
+ 1: abc=xyz\\x0apq
+ 2: abc=xyz\\x0ap
+ 3: abc=xyz\\x0a
+ 4: abc=xyz\
+ 5: abc=xyz
+ 6: abc=xy
+ 7: abc=x
+ 8: abc=
+
+/^(a()*)*/
+    aaaa
+ 0: aaaa
+ 1: aaa
+ 2: aa
+ 3: a
+ 4: 
+
+/^(?:a(?:(?:))*)*/
+    aaaa
+ 0: aaaa
+ 1: aaa
+ 2: aa
+ 3: a
+ 4: 
+
+/^(a()+)+/
+    aaaa
+ 0: aaaa
+ 1: aaa
+ 2: aa
+ 3: a
+
+/^(?:a(?:(?:))+)+/
+    aaaa
+ 0: aaaa
+ 1: aaa
+ 2: aa
+ 3: a
+
+/(a|)*\d/
+  aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
+No match
+  aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa4
+ 0: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa4
+
+/(?>a|)*\d/
+  aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
+No match
+  aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa4
+ 0: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa4
+
+/(?:a|)*\d/
+  aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
+No match
+  aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa4
+ 0: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa4
+
+/^a.b/<lf>
+    a\rb
+ 0: a\x0db
+    a\nb\<cr> 
+ 0: a\x0ab
+    ** Failers
+No match
+    a\nb
+No match
+    a\nb\<any>
+No match
+    a\rb\<cr>   
+No match
+    a\rb\<any>   
+No match
+
+/^abc./mgx<any>
+    abc1 \x0aabc2 \x0babc3xx \x0cabc4 \x0dabc5xx \x0d\x0aabc6 \x85abc7 JUNK
  0: abc1
  0: abc2
  0: abc3
@@ -1041,100 +6815,92 @@
  0: abc5
  0: abc6
  0: abc7
- 0: abc8
- 0: abc9

-/abc.$/mgx8<any>
-    abc1\x0a abc2\x0b abc3\x0c abc4\x0d abc5\x0d\x0a abc6\x{0085} abc7\x{2028} abc8\x{2029} abc9
+/abc.$/mgx<any>
+    abc1\x0a abc2\x0b abc3\x0c abc4\x0d abc5\x0d\x0a abc6\x85 abc9
  0: abc1
  0: abc2
  0: abc3
  0: abc4
  0: abc5
  0: abc6
- 0: abc7
- 0: abc8
  0: abc9

-/^a\Rb/8<bsr_unicode>
+/^a\Rb/<bsr_unicode>
     a\nb
- 0: a\x{0a}b
+ 0: a\x0ab
     a\rb
- 0: a\x{0d}b
+ 0: a\x0db
     a\r\nb
- 0: a\x{0d}\x{0a}b
+ 0: a\x0d\x0ab
     a\x0bb
- 0: a\x{0b}b
+ 0: a\x0bb
     a\x0cb
- 0: a\x{0c}b
-    a\x{85}b   
- 0: a\x{85}b
-    a\x{2028}b 
- 0: a\x{2028}b
-    a\x{2029}b 
- 0: a\x{2029}b
+ 0: a\x0cb
+    a\x85b   
+ 0: a\x85b
     ** Failers
 No match
     a\n\rb    
 No match

-/^a\R*b/8<bsr_unicode>
+/^a\R*b/<bsr_unicode>
     ab
  0: ab
     a\nb
- 0: a\x{0a}b
+ 0: a\x0ab
     a\rb
- 0: a\x{0d}b
+ 0: a\x0db
     a\r\nb
- 0: a\x{0d}\x{0a}b
+ 0: a\x0d\x0ab
     a\x0bb
- 0: a\x{0b}b
-    a\x0c\x{2028}\x{2029}b
- 0: a\x{0c}\x{2028}\x{2029}b
-    a\x{85}b   
- 0: a\x{85}b
+ 0: a\x0bb
+    a\x0cb
+ 0: a\x0cb
+    a\x85b   
+ 0: a\x85b
     a\n\rb    
- 0: a\x{0a}\x{0d}b
-    a\n\r\x{85}\x0cb 
- 0: a\x{0a}\x{0d}\x{85}\x{0c}b
+ 0: a\x0a\x0db
+    a\n\r\x85\x0cb 
+ 0: a\x0a\x0d\x85\x0cb

-/^a\R+b/8<bsr_unicode>
+/^a\R+b/<bsr_unicode>
     a\nb
- 0: a\x{0a}b
+ 0: a\x0ab
     a\rb
- 0: a\x{0d}b
+ 0: a\x0db
     a\r\nb
- 0: a\x{0d}\x{0a}b
+ 0: a\x0d\x0ab
     a\x0bb
- 0: a\x{0b}b
-    a\x0c\x{2028}\x{2029}b
- 0: a\x{0c}\x{2028}\x{2029}b
-    a\x{85}b   
- 0: a\x{85}b
+ 0: a\x0bb
+    a\x0cb
+ 0: a\x0cb
+    a\x85b   
+ 0: a\x85b
     a\n\rb    
- 0: a\x{0a}\x{0d}b
-    a\n\r\x{85}\x0cb 
- 0: a\x{0a}\x{0d}\x{85}\x{0c}b
+ 0: a\x0a\x0db
+    a\n\r\x85\x0cb 
+ 0: a\x0a\x0d\x85\x0cb
     ** Failers
 No match
     ab  
 No match
-
-/^a\R{1,3}b/8<bsr_unicode>
+    
+/^a\R{1,3}b/<bsr_unicode>
     a\nb
- 0: a\x{0a}b
+ 0: a\x0ab
     a\n\rb
- 0: a\x{0a}\x{0d}b
-    a\n\r\x{85}b
- 0: a\x{0a}\x{0d}\x{85}b
+ 0: a\x0a\x0db
+    a\n\r\x85b
+ 0: a\x0a\x0d\x85b
     a\r\n\r\nb 
- 0: a\x{0d}\x{0a}\x{0d}\x{0a}b
+ 0: a\x0d\x0a\x0d\x0ab
     a\r\n\r\n\r\nb 
- 0: a\x{0d}\x{0a}\x{0d}\x{0a}\x{0d}\x{0a}b
+ 0: a\x0d\x0a\x0d\x0a\x0d\x0ab
     a\n\r\n\rb
- 0: a\x{0a}\x{0d}\x{0a}\x{0d}b
+ 0: a\x0a\x0d\x0a\x0db
     a\n\n\r\nb 
- 0: a\x{0a}\x{0a}\x{0d}\x{0a}b
+ 0: a\x0a\x0a\x0d\x0ab
     ** Failers
 No match
     a\n\n\n\rb
@@ -1142,164 +6908,645 @@
     a\r
 No match

-/\h+\V?\v{3,4}/8 
-    \x09\x20\x{a0}X\x0a\x0b\x0c\x0d\x0a
- 0: \x{09} \x{a0}X\x{0a}\x{0b}\x{0c}\x{0d}
- 1: \x{09} \x{a0}X\x{0a}\x{0b}\x{0c}
+/^a[\R]b/<bsr_unicode>
+    aRb
+ 0: aRb
+    ** Failers
+No match
+    a\nb  
+No match

-/\V?\v{3,4}/8 
-    \x20\x{a0}X\x0a\x0b\x0c\x0d\x0a
- 0: X\x{0a}\x{0b}\x{0c}\x{0d}
- 1: X\x{0a}\x{0b}\x{0c}
+/.+foo/
+    afoo
+ 0: afoo
+    ** Failers 
+No match
+    \r\nfoo 
+No match
+    \nfoo 
+No match

-/\h+\V?\v{3,4}/8
-    >\x09\x20\x{a0}X\x0a\x0a\x0a<
- 0: \x{09} \x{a0}X\x{0a}\x{0a}\x{0a}
+/.+foo/<crlf>
+    afoo
+ 0: afoo
+    \nfoo 
+ 0: \x0afoo
+    ** Failers 
+No match
+    \r\nfoo 
+No match

-/\V?\v{3,4}/8
-    >\x09\x20\x{a0}X\x0a\x0a\x0a<
- 0: X\x{0a}\x{0a}\x{0a}
+/.+foo/<any>
+    afoo
+ 0: afoo
+    ** Failers 
+No match
+    \nfoo 
+No match
+    \r\nfoo 
+No match

-/\H\h\V\v/8
+/.+foo/s
+    afoo
+ 0: afoo
+    \r\nfoo 
+ 0: \x0d\x0afoo
+    \nfoo 
+ 0: \x0afoo
+
+/^$/mg<any>
+    abc\r\rxyz
+ 0: 
+    abc\n\rxyz  
+ 0: 
+    ** Failers 
+No match
+    abc\r\nxyz
+No match
+
+/^X/m
+    XABC
+ 0: X
+    ** Failers 
+No match
+    XABC\B
+No match
+
+/(?m)^$/<any>g+
+    abc\r\n\r\n
+ 0: 
+ 0+ \x0d\x0a
+
+/(?m)^$|^\r\n/<any>g+ 
+    abc\r\n\r\n
+ 0: \x0d\x0a
+ 0+ 
+ 1: 
+    
+/(?m)$/<any>g+ 
+    abc\r\n\r\n
+ 0: 
+ 0+ \x0d\x0a\x0d\x0a
+ 0: 
+ 0+ \x0d\x0a
+ 0: 
+ 0+ 
+
+/(?|(abc)|(xyz))/
+   >abc<
+ 0: abc
+   >xyz< 
+ 0: xyz
+
+/(x)(?|(abc)|(xyz))(x)/
+    xabcx
+ 0: xabcx
+    xxyzx 
+ 0: xxyzx
+
+/(x)(?|(abc)(pqr)|(xyz))(x)/
+    xabcpqrx
+ 0: xabcpqrx
+    xxyzx 
+ 0: xxyzx
+
+/(?|(abc)|(xyz))(?1)/
+    abcabc
+ 0: abcabc
+    xyzabc 
+ 0: xyzabc
+    ** Failers 
+No match
+    xyzxyz 
+No match
+ 
+/\H\h\V\v/
     X X\x0a
- 0: X X\x{0a}
+ 0: X X\x0a
     X\x09X\x0b
- 0: X\x{09}X\x{0b}
+ 0: X\x09X\x0b
     ** Failers
 No match
-    \x{a0} X\x0a   
+    \xa0 X\x0a   
 No match

-/\H*\h+\V?\v{3,4}/8 
-    \x09\x20\x{a0}X\x0a\x0b\x0c\x0d\x0a
- 0: \x{09} \x{a0}X\x{0a}\x{0b}\x{0c}\x{0d}
- 1: \x{09} \x{a0}X\x{0a}\x{0b}\x{0c}
-    \x09\x20\x{a0}\x0a\x0b\x0c\x0d\x0a
- 0: \x{09} \x{a0}\x{0a}\x{0b}\x{0c}\x{0d}
- 1: \x{09} \x{a0}\x{0a}\x{0b}\x{0c}
-    \x09\x20\x{a0}\x0a\x0b\x0c
- 0: \x{09} \x{a0}\x{0a}\x{0b}\x{0c}
+/\H*\h+\V?\v{3,4}/ 
+    \x09\x20\xa0X\x0a\x0b\x0c\x0d\x0a
+ 0: \x09 \xa0X\x0a\x0b\x0c\x0d
+ 1: \x09 \xa0X\x0a\x0b\x0c
+    \x09\x20\xa0\x0a\x0b\x0c\x0d\x0a
+ 0: \x09 \xa0\x0a\x0b\x0c\x0d
+ 1: \x09 \xa0\x0a\x0b\x0c
+    \x09\x20\xa0\x0a\x0b\x0c
+ 0: \x09 \xa0\x0a\x0b\x0c
     ** Failers 
 No match
-    \x09\x20\x{a0}\x0a\x0b
+    \x09\x20\xa0\x0a\x0b
 No match

-/\H\h\V\v/8
-    \x{3001}\x{3000}\x{2030}\x{2028}
- 0: \x{3001}\x{3000}\x{2030}\x{2028}
-    X\x{180e}X\x{85}
- 0: X\x{180e}X\x{85}
+/\H{3,4}/
+    XY  ABCDE
+ 0: ABCD
+ 1: ABC
+    XY  PQR ST 
+ 0: PQR
+    
+/.\h{3,4}./
+    XY  AB    PQRS
+ 0: B    P
+ 1: B    
+
+/\h*X\h?\H+Y\H?Z/
+    >XNNNYZ
+ 0: XNNNYZ
+    >  X NYQZ
+ 0:   X NYQZ
     ** Failers
 No match
-    \x{2009} X\x0a   
+    >XYZ   
 No match
-    
-/\H*\h+\V?\v{3,4}/8 
-    \x{1680}\x{180e}\x{2007}X\x{2028}\x{2029}\x0c\x0d\x0a
- 0: \x{1680}\x{180e}\x{2007}X\x{2028}\x{2029}\x{0c}\x{0d}
- 1: \x{1680}\x{180e}\x{2007}X\x{2028}\x{2029}\x{0c}
-    \x09\x{205f}\x{a0}\x0a\x{2029}\x0c\x{2028}\x0a
- 0: \x{09}\x{205f}\x{a0}\x{0a}\x{2029}\x{0c}\x{2028}
- 1: \x{09}\x{205f}\x{a0}\x{0a}\x{2029}\x{0c}
-    \x09\x20\x{202f}\x0a\x0b\x0c
- 0: \x{09} \x{202f}\x{0a}\x{0b}\x{0c}
-    ** Failers 
+    >  X NY Z
 No match
-    \x09\x{200a}\x{a0}\x{2028}\x0b
+
+/\v*X\v?Y\v+Z\V*\x0a\V+\x0b\V{2,3}\x0c/
+    >XY\x0aZ\x0aA\x0bNN\x0c
+ 0: XY\x0aZ\x0aA\x0bNN\x0c
+    >\x0a\x0dX\x0aY\x0a\x0bZZZ\x0aAAA\x0bNNN\x0c
+ 0: \x0a\x0dX\x0aY\x0a\x0bZZZ\x0aAAA\x0bNNN\x0c
+
+/.+A/<crlf>
+    \r\nA
 No match
-     
-/a\Rb/I8<bsr_anycrlf>
+    
+/\nA/<crlf>
+    \r\nA 
+ 0: \x0aA
+
+/[\r\n]A/<crlf>
+    \r\nA 
+ 0: \x0aA
+
+/(\r|\n)A/<crlf>
+    \r\nA 
+ 0: \x0aA
+
+/a\Rb/I<bsr_anycrlf>
 Capturing subpattern count = 0
-Options: bsr_anycrlf utf8
+Options: bsr_anycrlf
 First char = 'a'
 Need char = 'b'
     a\rb
- 0: a\x{0d}b
+ 0: a\x0db
     a\nb
- 0: a\x{0a}b
+ 0: a\x0ab
     a\r\nb
- 0: a\x{0d}\x{0a}b
+ 0: a\x0d\x0ab
     ** Failers
 No match
-    a\x{85}b
+    a\x85b
 No match
     a\x0bb     
 No match

-/a\Rb/I8<bsr_unicode>
+/a\Rb/I<bsr_unicode>
 Capturing subpattern count = 0
-Options: bsr_unicode utf8
+Options: bsr_unicode
 First char = 'a'
 Need char = 'b'
     a\rb
- 0: a\x{0d}b
+ 0: a\x0db
     a\nb
- 0: a\x{0a}b
+ 0: a\x0ab
     a\r\nb
- 0: a\x{0d}\x{0a}b
-    a\x{85}b
- 0: a\x{85}b
+ 0: a\x0d\x0ab
+    a\x85b
+ 0: a\x85b
     a\x0bb     
- 0: a\x{0b}b
+ 0: a\x0bb
     ** Failers 
 No match
-    a\x{85}b\<bsr_anycrlf>
+    a\x85b\<bsr_anycrlf>
 No match
     a\x0bb\<bsr_anycrlf>
 No match

-/a\R?b/I8<bsr_anycrlf>
+/a\R?b/I<bsr_anycrlf>
 Capturing subpattern count = 0
-Options: bsr_anycrlf utf8
+Options: bsr_anycrlf
 First char = 'a'
 Need char = 'b'
     a\rb
- 0: a\x{0d}b
+ 0: a\x0db
     a\nb
- 0: a\x{0a}b
+ 0: a\x0ab
     a\r\nb
- 0: a\x{0d}\x{0a}b
+ 0: a\x0d\x0ab
     ** Failers
 No match
-    a\x{85}b
+    a\x85b
 No match
     a\x0bb     
 No match

-/a\R?b/I8<bsr_unicode>
+/a\R?b/I<bsr_unicode>
 Capturing subpattern count = 0
-Options: bsr_unicode utf8
+Options: bsr_unicode
 First char = 'a'
 Need char = 'b'
     a\rb
- 0: a\x{0d}b
+ 0: a\x0db
     a\nb
- 0: a\x{0a}b
+ 0: a\x0ab
     a\r\nb
- 0: a\x{0d}\x{0a}b
-    a\x{85}b
- 0: a\x{85}b
+ 0: a\x0d\x0ab
+    a\x85b
+ 0: a\x85b
     a\x0bb     
- 0: a\x{0b}b
+ 0: a\x0bb
     ** Failers 
 No match
-    a\x{85}b\<bsr_anycrlf>
+    a\x85b\<bsr_anycrlf>
 No match
     a\x0bb\<bsr_anycrlf>
 No match
- 
-/X/8f<any> 
-    A\x{1ec5}ABCXYZ
+    
+/a\R{2,4}b/I<bsr_anycrlf>
+Capturing subpattern count = 0
+Options: bsr_anycrlf
+First char = 'a'
+Need char = 'b'
+    a\r\n\nb
+ 0: a\x0d\x0a\x0ab
+    a\n\r\rb
+ 0: a\x0a\x0d\x0db
+    a\r\n\r\n\r\n\r\nb
+ 0: a\x0d\x0a\x0d\x0a\x0d\x0a\x0d\x0ab
+    ** Failers
+No match
+    a\x85\85b
+No match
+    a\x0b\0bb     
+No match
+
+/a\R{2,4}b/I<bsr_unicode>
+Capturing subpattern count = 0
+Options: bsr_unicode
+First char = 'a'
+Need char = 'b'
+    a\r\rb
+ 0: a\x0d\x0db
+    a\n\n\nb
+ 0: a\x0a\x0a\x0ab
+    a\r\n\n\r\rb
+ 0: a\x0d\x0a\x0a\x0d\x0db
+    a\x85\85b
+No match
+    a\x0b\0bb     
+No match
+    ** Failers 
+No match
+    a\r\r\r\r\rb 
+No match
+    a\x85\85b\<bsr_anycrlf>
+No match
+    a\x0b\0bb\<bsr_anycrlf>
+No match
+    
+/a(?!)|\wbc/
+    abc 
+ 0: abc
+
+/a[]b/<JS>
+    ** Failers
+No match
+    ab
+No match
+
+/a[]+b/<JS>
+    ** Failers
+No match
+    ab 
+No match
+
+/a[]*+b/<JS>
+    ** Failers
+No match
+    ab 
+No match
+
+/a[^]b/<JS>
+    aXb
+ 0: aXb
+    a\nb 
+ 0: a\x0ab
+    ** Failers
+No match
+    ab  
+No match
+    
+/a[^]+b/<JS> 
+    aXb
+ 0: aXb
+    a\nX\nXb 
+ 0: a\x0aX\x0aXb
+    ** Failers
+No match
+    ab  
+No match
+
+/X$/E
+    X
  0: X
+    ** Failers 
+No match
+    X\n 
+No match

-/abcd*/8
+/X$/
+    X
+ 0: X
+    X\n 
+ 0: X
+
+/xyz/C
+  xyz 
+--->xyz
+ +0 ^       x
+ +1 ^^      y
+ +2 ^ ^     z
+ +3 ^  ^    
+ 0: xyz
+  abcxyz 
+--->abcxyz
+ +0    ^       x
+ +1    ^^      y
+ +2    ^ ^     z
+ +3    ^  ^    
+ 0: xyz
+  abcxyz\Y
+--->abcxyz
+ +0 ^          x
+ +0  ^         x
+ +0   ^        x
+ +0    ^       x
+ +1    ^^      y
+ +2    ^ ^     z
+ +3    ^  ^    
+ 0: xyz
+  ** Failers 
+No match
+  abc
+No match
+  abc\Y
+--->abc
+ +0 ^       x
+ +0  ^      x
+ +0   ^     x
+ +0    ^    x
+No match
+  abcxypqr  
+No match
+  abcxypqr\Y  
+--->abcxypqr
+ +0 ^            x
+ +0  ^           x
+ +0   ^          x
+ +0    ^         x
+ +1    ^^        y
+ +2    ^ ^       z
+ +0     ^        x
+ +0      ^       x
+ +0       ^      x
+ +0        ^     x
+ +0         ^    x
+No match
+
+/(*NO_START_OPT)xyz/C
+  abcxyz 
+--->abcxyz
++15 ^          x
++15  ^         x
++15   ^        x
++15    ^       x
++16    ^^      y
++17    ^ ^     z
++18    ^  ^    
+ 0: xyz
+  
+/(?C)ab/
+  ab
+--->ab
+  0 ^      a
+ 0: ab
+  \C-ab
+ 0: ab
+  
+/ab/C
+  ab
+--->ab
+ +0 ^      a
+ +1 ^^     b
+ +2 ^ ^    
+ 0: ab
+  \C-ab    
+ 0: ab
+
+/^"((?(?=[a])[^"])|b)*"$/C
+    "ab"
+--->"ab"
+ +0 ^        ^
+ +1 ^        "
+ +2 ^^       ((?(?=[a])[^"])|b)*
++21 ^^       "
+ +3 ^^       (?(?=[a])[^"])
++18 ^^       b
+ +5 ^^       (?=[a])
+ +8  ^       [a]
++11  ^^      )
++12 ^^       [^"]
++16 ^ ^      )
++17 ^ ^      |
++21 ^ ^      "
+ +3 ^ ^      (?(?=[a])[^"])
++18 ^ ^      b
+ +5 ^ ^      (?=[a])
+ +8   ^      [a]
++19 ^  ^     )
++21 ^  ^     "
+ +3 ^  ^     (?(?=[a])[^"])
++18 ^  ^     b
+ +5 ^  ^     (?=[a])
+ +8    ^     [a]
++17 ^  ^     |
++22 ^   ^    $
++23 ^   ^    
+ 0: "ab"
+    \C-"ab"
+ 0: "ab"
+
+/\d+X|9+Y/
+    ++++123999\P
+Partial match: 123999
+    ++++123999Y\P
+ 0: 999Y
+
+/Z(*F)/
+    Z\P
+No match
+    ZA\P 
+No match
+    
+/Z(?!)/
+    Z\P 
+No match
+    ZA\P 
+No match
+
+/dog(sbody)?/
+    dogs\P
+ 0: dog
+    dogs\P\P 
+Partial match: dogs
+    
+/dog(sbody)??/
+    dogs\P
+ 0: dog
+    dogs\P\P 
+Partial match: dogs
+
+/dog|dogsbody/
+    dogs\P
+ 0: dog
+    dogs\P\P 
+Partial match: dogs
+ 
+/dogsbody|dog/
+    dogs\P
+ 0: dog
+    dogs\P\P 
+Partial match: dogs
+
+/Z(*F)Q|ZXY/
+    Z\P
+Partial match: Z
+    ZA\P 
+No match
+    X\P 
+No match
+
+/\bthe cat\b/
+    the cat\P
+ 0: the cat
+    the cat\P\P
+Partial match: the cat
+
+/dog(sbody)?/
+    dogs\D\P
+ 0: dog
+    body\D\R
+ 0: body
+
+/dog(sbody)?/
+    dogs\D\P\P
+Partial match: dogs
+    body\D\R
+ 0: body
+
+/abc/
+   abc\P
+ 0: abc
+   abc\P\P
+ 0: abc
+
+/abc\K123/
+    xyzabc123pqr
+Error -16 (item unsupported for DFA matching)
+    
+/(?<=abc)123/
+    xyzabc123pqr 
+ 0: 123
+    xyzabc12\P
+Partial match: abc12
+    xyzabc12\P\P
+Partial match: abc12
+
+/\babc\b/
+    +++abc+++
+ 0: abc
+    +++ab\P
+Partial match: +ab
+    +++ab\P\P  
+Partial match: +ab
+
+/(?=C)/g+
+    ABCDECBA
+ 0: 
+ 0+ CDECBA
+ 0: 
+ 0+ CBA
+
+/(abc|def|xyz)/I
+Capturing subpattern count = 1
+No options
+No first char
+No need char
+    terhjk;abcdaadsfe
+ 0: abc
+    the quick xyz brown fox 
+ 0: xyz
+    \Yterhjk;abcdaadsfe
+ 0: abc
+    \Ythe quick xyz brown fox 
+ 0: xyz
+    ** Failers
+No match
+    thejk;adlfj aenjl;fda asdfasd ehj;kjxyasiupd
+No match
+    \Ythejk;adlfj aenjl;fda asdfasd ehj;kjxyasiupd
+No match
+
+/(abc|def|xyz)/SI
+Capturing subpattern count = 1
+No options
+No first char
+No need char
+Subject length lower bound = 3
+Starting byte set: a d x 
+    terhjk;abcdaadsfe
+ 0: abc
+    the quick xyz brown fox 
+ 0: xyz
+    \Yterhjk;abcdaadsfe
+ 0: abc
+    \Ythe quick xyz brown fox 
+ 0: xyz
+    ** Failers
+No match
+    thejk;adlfj aenjl;fda asdfasd ehj;kjxyasiupd
+No match
+    \Ythejk;adlfj aenjl;fda asdfasd ehj;kjxyasiupd
+No match
+
+/abcd*/+
     xxxxabcd\P
  0: abcd
+ 0+ 
  1: abc
     xxxxabcd\P\P
 Partial match: abcd
+    dddxxx\R 
+ 0: ddd
+ 0+ xxx
+ 1: dd
+ 2: d
+ 3: 
+    xxxxabcd\P\P
+Partial match: abcd
+    xxx\R 
+ 0: 
+ 0+ xxx

-/abcd*/i8
+/abcd*/i
     xxxxabcd\P
  0: abcd
  1: abc
@@ -1311,14 +7558,14 @@
     XXXXABCD\P\P
 Partial match: ABCD

-/abc\d*/8
+/abc\d*/
     xxxxabc1\P
  0: abc1
  1: abc
     xxxxabc1\P\P
 Partial match: abc1

-/abc[de]*/8
+/abc[de]*/
     xxxxabcde\P
  0: abcde
  1: abcd
@@ -1326,33 +7573,297 @@
     xxxxabcde\P\P
 Partial match: abcde

-/\bthe cat\b/8
-    the cat\P
- 0: the cat
-    the cat\P\P
-Partial match: the cat
+/(?:(?1)|B)(A(*F)|C)/
+    ABCD
+ 0: BC
+    CCD
+ 0: CC
+    ** Failers
+No match
+    CAD   
+No match

-/a+/8
-    a\x{123}aa\>1
- 0: aa
- 1: a
-    a\x{123}aa\>2
-Error -11 (bad UTF-8 offset)
-    a\x{123}aa\>3
- 0: aa
- 1: a
-    a\x{123}aa\>4
- 0: a
-    a\x{123}aa\>5
+/^(?:(?1)|B)(A(*F)|C)/
+    CCD
+ 0: CC
+    BCD 
+ 0: BC
+    ** Failers
 No match
-    a\x{123}aa\>6
+    ABCD
+No match
+    CAD
+No match
+    BAD    
+No match
+
+/^(?!a(*SKIP)b)/
+    ac
+Error -16 (item unsupported for DFA matching)
+    
+/^(?=a(*SKIP)b|ac)/
+    ** Failers
+No match
+    ac
+Error -16 (item unsupported for DFA matching)
+    
+/^(?=a(*THEN)b|ac)/
+    ac
+Error -16 (item unsupported for DFA matching)
+    
+/^(?=a(*PRUNE)b)/
+    ab  
+Error -16 (item unsupported for DFA matching)
+    ** Failers 
+No match
+    ac
+Error -16 (item unsupported for DFA matching)
+
+/^(?(?!a(*SKIP)b))/
+    ac
+Error -16 (item unsupported for DFA matching)
+
+/(?<=abc)def/
+    abc\P\P
+Partial match: abc
+
+/abc$/
+    abc
+ 0: abc
+    abc\P
+ 0: abc
+    abc\P\P
+Partial match: abc
+
+/abc$/m
+    abc
+ 0: abc
+    abc\n
+ 0: abc
+    abc\P\P
+Partial match: abc
+    abc\n\P\P 
+ 0: abc
+    abc\P
+ 0: abc
+    abc\n\P
+ 0: abc
+
+/abc\z/
+    abc
+ 0: abc
+    abc\P
+ 0: abc
+    abc\P\P
+Partial match: abc
+
+/abc\Z/
+    abc
+ 0: abc
+    abc\P
+ 0: abc
+    abc\P\P
+Partial match: abc
+
+/abc\b/
+    abc
+ 0: abc
+    abc\P
+ 0: abc
+    abc\P\P
+Partial match: abc
+
+/abc\B/
+    abc
+No match
+    abc\P
+Partial match: abc
+    abc\P\P
+Partial match: abc
+
+/.+/
+    abc\>0
+ 0: abc
+ 1: ab
+ 2: a
+    abc\>1
+ 0: bc
+ 1: b
+    abc\>2
+ 0: c
+    abc\>3
+No match
+    abc\>4
 Error -24 (bad offset value)
+    abc\>-4 
+Error -24 (bad offset value)

-/ab\Cde/8
+/^(?:a)++\w/
+     aaaab
+ 0: aaaab
+     ** Failers 
+No match
+     aaaa 
+No match
+     bbb 
+No match
+
+/^(?:aa|(?:a)++\w)/
+     aaaab
+ 0: aaaab
+ 1: aa
+     aaaa 
+ 0: aa
+     ** Failers 
+No match
+     bbb 
+No match
+
+/^(?:a)*+\w/
+     aaaab
+ 0: aaaab
+     bbb 
+ 0: b
+     ** Failers 
+No match
+     aaaa 
+No match
+
+/^(a)++\w/
+     aaaab
+ 0: aaaab
+     ** Failers 
+No match
+     aaaa 
+No match
+     bbb 
+No match
+
+/^(a|)++\w/
+     aaaab
+ 0: aaaab
+     ** Failers 
+No match
+     aaaa 
+No match
+     bbb 
+No match
+
+/(?=abc){3}abc/+
+    abcabcabc
+ 0: abc
+ 0+ abcabc
+    ** Failers
+No match
+    xyz  
+No match
+    
+/(?=abc)+abc/+
+    abcabcabc
+ 0: abc
+ 0+ abcabc
+    ** Failers
+No match
+    xyz  
+No match
+    
+/(?=abc)++abc/+
+    abcabcabc
+ 0: abc
+ 0+ abcabc
+    ** Failers
+No match
+    xyz  
+No match
+    
+/(?=abc){0}xyz/
+    xyz 
+ 0: xyz
+
+/(?=abc){1}xyz/
+    ** Failers
+No match
+    xyz 
+No match
+    
+/(?=(a))?./
+    ab
+ 0: a
+    bc
+ 0: b
+      
+/(?=(a))??./
+    ab
+ 0: a
+    bc
+ 0: b
+
+/^(?=(a)){0}b(?1)/
+    backgammon
+ 0: ba
+
+/^(?=(?1))?[az]([abc])d/
+    abd 
+ 0: abd
+    zcdxx 
+ 0: zcd
+
+/^(?!a){0}\w+/
+    aaaaa
+ 0: aaaaa
+ 1: aaaa
+ 2: aaa
+ 3: aa
+ 4: a
+
+/(?<=(abc))?xyz/
+    abcxyz
+ 0: xyz
+    pqrxyz 
+ 0: xyz
+
+/((?2))((?1))/
+    abc
+Error -26 (nested recursion at the same subject position)
+
+/(?(R)a+|(?R)b)/
+    aaaabcde
+ 0: aaaab
+
+/(?(R)a+|((?R))b)/
+    aaaabcde
+ 0: aaaab
+
+/((?(R)a+|(?1)b))/
+    aaaabcde
+ 0: aaaab
+
+/((?(R2)a+|(?1)b))/
+    aaaabcde
+Error -17 (backreference condition or recursion test not supported for DFA matching)
+
+/(?(R)a*(?1)|((?R))b)/
+    aaaabcde
+Error -26 (nested recursion at the same subject position)
+
+/(a+)/
+    \O6aaaa
+Matched, but too many subsidiary matches
+ 0: aaaa
+ 1: aaa
+ 2: aa
+    \O8aaaa
+ 0: aaaa
+ 1: aaa
+ 2: aa
+ 3: a
+
+/ab\Cde/
     abXde
-Error -16 (item unsupported for DFA matching)
+ 0: abXde
+    
+/(?<=ab\Cde)X/
+    abZdeX
+ 0: X

-/(?<=ab\Cde)X/8
-Failed: \C not allowed in lookbehind assertion at offset 10
-
-/-- End of testinput8 --/
+/-- End of testinput8 --/

Modified: code/trunk/testdata/testoutput9
===================================================================
--- code/trunk/testdata/testoutput9    2011-12-28 16:10:09 UTC (rev 835)
+++ code/trunk/testdata/testoutput9    2011-12-28 17:16:11 UTC (rev 836)
@@ -1,2037 +1,1326 @@
-/-- This set of tests check Unicode property support with the DFA matching 
-    functionality of pcre_dfa_exec(). The -dfa flag must be used with pcretest
-    when running it. --/
+/-- This set of tests checks UTF-8 support with the DFA matching functionality
+    of pcre_dfa_exec(). The -dfa flag must be used with pcretest when running 
+    it. --/

-/\pL\P{Nd}/8
-    AB
- 0: AB
-    *** Failers
- 0: Fa
-    A0
+/\x{100}ab/8
+  \x{100}ab
+ 0: \x{100}ab
+  
+/a\x{100}*b/8
+    ab
+ 0: ab
+    a\x{100}b  
+ 0: a\x{100}b
+    a\x{100}\x{100}b  
+ 0: a\x{100}\x{100}b
+    
+/a\x{100}+b/8
+    a\x{100}b  
+ 0: a\x{100}b
+    a\x{100}\x{100}b  
+ 0: a\x{100}\x{100}b
+    *** Failers 
 No match
-    00   
+    ab
 No match
-
-/\X./8
-    AB
- 0: AB
-    A\x{300}BC 
- 0: A\x{300}B
-    A\x{300}\x{301}\x{302}BC 
- 0: A\x{300}\x{301}\x{302}B
-    *** Failers
- 0: **
-    \x{300}  
+     
+/\bX/8
+    Xoanon
+ 0: X
+    +Xoanon
+ 0: X
+    \x{300}Xoanon 
+ 0: X
+    *** Failers 
 No match
-
-/\X\X/8
-    ABC
- 0: AB
-    A\x{300}B\x{300}\x{301}C 
- 0: A\x{300}B\x{300}\x{301}
-    A\x{300}\x{301}\x{302}BC 
- 0: A\x{300}\x{301}\x{302}B
+    YXoanon  
+No match
+    
+/\BX/8
+    YXoanon
+ 0: X
     *** Failers
- 0: **
-    \x{300}  
 No match
-
-/^\pL+/8
-    abcd
- 0: abcd
- 1: abc
- 2: ab
- 3: a
-    a 
- 0: a
-    *** Failers 
+    Xoanon
 No match
-
-/^\PL+/8
-    1234
- 0: 1234
- 1: 123
- 2: 12
- 3: 1
-    = 
- 0: =
-    *** Failers 
- 0: *** 
- 1: ***
- 2: **
- 3: *
-    abcd 
+    +Xoanon    
 No match
+    \x{300}Xoanon 
+No match

-/^\X+/8
-    abcdA\x{300}\x{301}\x{302}
- 0: abcdA\x{300}\x{301}\x{302}
- 1: abcd
- 2: abc
- 3: ab
- 4: a
-    A\x{300}\x{301}\x{302}
- 0: A\x{300}\x{301}\x{302}
-    A\x{300}\x{301}\x{302}A\x{300}\x{301}\x{302}
- 0: A\x{300}\x{301}\x{302}A\x{300}\x{301}\x{302}
- 1: A\x{300}\x{301}\x{302}
-    a 
- 0: a
+/X\b/8
+    X+oanon
+ 0: X
+    ZX\x{300}oanon 
+ 0: X
+    FAX 
+ 0: X
     *** Failers 
- 0: *** Failers
- 1: *** Failer
- 2: *** Faile
- 3: *** Fail
- 4: *** Fai
- 5: *** Fa
- 6: *** F
- 7: *** 
- 8: ***
- 9: **
-10: *
-    \x{300}\x{301}\x{302}
 No match
-
-/\X?abc/8
-    abc
- 0: abc
-    A\x{300}abc
- 0: A\x{300}abc
-    A\x{300}\x{301}\x{302}A\x{300}A\x{300}A\x{300}abcxyz
- 0: A\x{300}abc
-    \x{300}abc  
- 0: abc
-    *** Failers
+    Xoanon  
 No match
-
-/^\X?abc/8
-    abc
- 0: abc
-    A\x{300}abc
- 0: A\x{300}abc
+    
+/X\B/8
+    Xoanon  
+ 0: X
     *** Failers
 No match
-    A\x{300}\x{301}\x{302}A\x{300}A\x{300}A\x{300}abcxyz
+    X+oanon
 No match
-    \x{300}abc  
+    ZX\x{300}oanon 
 No match
-
-/\X*abc/8
-    abc
- 0: abc
-    A\x{300}abc
- 0: A\x{300}abc
-    A\x{300}\x{301}\x{302}A\x{300}A\x{300}A\x{300}abcxyz
- 0: A\x{300}\x{301}\x{302}A\x{300}A\x{300}A\x{300}abc
-    \x{300}abc  
- 0: abc
-    *** Failers
+    FAX 
 No match
+    
+/[^a]/8
+    abcd
+ 0: b
+    a\x{100}   
+ 0: \x{100}

-/^\X*abc/8
-    abc
- 0: abc
-    A\x{300}abc
- 0: A\x{300}abc
-    A\x{300}\x{301}\x{302}A\x{300}A\x{300}A\x{300}abcxyz
- 0: A\x{300}\x{301}\x{302}A\x{300}A\x{300}A\x{300}abc
+/^[abc\x{123}\x{400}-\x{402}]{2,3}\d/8
+    ab99
+ 0: ab9
+    \x{123}\x{123}45
+ 0: \x{123}\x{123}4
+    \x{400}\x{401}\x{402}6  
+ 0: \x{400}\x{401}\x{402}6
     *** Failers
 No match
-    \x{300}abc  
+    d99
 No match
-
-/^\pL?=./8
-    A=b
- 0: A=b
-    =c 
- 0: =c
-    *** Failers
+    \x{123}\x{122}4   
 No match
-    1=2 
+    \x{400}\x{403}6  
 No match
-    AAAA=b  
+    \x{400}\x{401}\x{402}\x{402}6  
 No match

-/^\pL*=./8
-    AAAA=b
- 0: AAAA=b
-    =c 
- 0: =c
+/a.b/8
+    acb
+ 0: acb
+    a\x7fb
+ 0: a\x{7f}b
+    a\x{100}b 
+ 0: a\x{100}b
     *** Failers
 No match
-    1=2  
+    a\nb  
 No match

-/^\X{2,3}X/8
-    A\x{300}\x{301}\x{302}A\x{300}\x{301}\x{302}X
- 0: A\x{300}\x{301}\x{302}A\x{300}\x{301}\x{302}X
-    A\x{300}\x{301}\x{302}A\x{300}\x{301}\x{302}A\x{300}\x{301}\x{302}X 
- 0: A\x{300}\x{301}\x{302}A\x{300}\x{301}\x{302}A\x{300}\x{301}\x{302}X
+/a(.{3})b/8
+    a\x{4000}xyb 
+ 0: a\x{4000}xyb
+    a\x{4000}\x7fyb 
+ 0: a\x{4000}\x{7f}yb
+    a\x{4000}\x{100}yb 
+ 0: a\x{4000}\x{100}yb
     *** Failers
 No match
-    X
+    a\x{4000}b 
 No match
-    A\x{300}\x{301}\x{302}X
+    ac\ncb 
 No match
-    A\x{300}\x{301}\x{302}A\x{300}\x{301}\x{302}A\x{300}\x{301}\x{302}A\x{300}\x{301}\x{302}X
-No match

-/^\pC\pL\pM\pN\pP\pS\pZ</8
-    \x7f\x{c0}\x{30f}\x{660}\x{66c}\x{f01}\x{1680}<
- 0: \x{7f}\x{c0}\x{30f}\x{660}\x{66c}\x{f01}\x{1680}<
-    \np\x{300}9!\$ < 
- 0: \x{0a}p\x{300}9!$ <
-    ** Failers 
+/a(.*?)(.)/
+    a\xc0\x88b
+ 0: a\xc0\x88b
+ 1: a\xc0\x88
+ 2: a\xc0
+
+/a(.*?)(.)/8
+    a\x{100}b
+ 0: a\x{100}b
+ 1: a\x{100}
+
+/a(.*)(.)/
+    a\xc0\x88b
+ 0: a\xc0\x88b
+ 1: a\xc0\x88
+ 2: a\xc0
+
+/a(.*)(.)/8
+    a\x{100}b
+ 0: a\x{100}b
+ 1: a\x{100}
+
+/a(.)(.)/
+    a\xc0\x92bcd
+ 0: a\xc0\x92
+
+/a(.)(.)/8
+    a\x{240}bcd
+ 0: a\x{240}b
+
+/a(.?)(.)/
+    a\xc0\x92bcd
+ 0: a\xc0\x92
+ 1: a\xc0
+
+/a(.?)(.)/8
+    a\x{240}bcd
+ 0: a\x{240}b
+ 1: a\x{240}
+
+/a(.??)(.)/
+    a\xc0\x92bcd
+ 0: a\xc0\x92
+ 1: a\xc0
+
+/a(.??)(.)/8
+    a\x{240}bcd
+ 0: a\x{240}b
+ 1: a\x{240}
+
+/a(.{3})b/8
+    a\x{1234}xyb 
+ 0: a\x{1234}xyb
+    a\x{1234}\x{4321}yb 
+ 0: a\x{1234}\x{4321}yb
+    a\x{1234}\x{4321}\x{3412}b 
+ 0: a\x{1234}\x{4321}\x{3412}b
+    *** Failers
 No match
-    ap\x{300}9!\$ < 
+    a\x{1234}b 
 No match
-  
-/^\PC/8
-    X
- 0: X
-    ** Failers 
- 0: *
-    \x7f
+    ac\ncb 
 No match
-  
-/^\PL/8
-    9
- 0: 9
-    ** Failers 
- 0: *
-    \x{c0}
+
+/a(.{3,})b/8
+    a\x{1234}xyb 
+ 0: a\x{1234}xyb
+    a\x{1234}\x{4321}yb 
+ 0: a\x{1234}\x{4321}yb
+    a\x{1234}\x{4321}\x{3412}b 
+ 0: a\x{1234}\x{4321}\x{3412}b
+    axxxxbcdefghijb 
+ 0: axxxxbcdefghijb
+ 1: axxxxb
+    a\x{1234}\x{4321}\x{3412}\x{3421}b 
+ 0: a\x{1234}\x{4321}\x{3412}\x{3421}b
+    *** Failers
 No match
-  
-/^\PM/8
-    X
- 0: X
-    ** Failers 
- 0: *
-    \x{30f}
+    a\x{1234}b 
 No match
-  
-/^\PN/8
-    X
- 0: X
-    ** Failers 
- 0: *
-    \x{660}
+
+/a(.{3,}?)b/8
+    a\x{1234}xyb 
+ 0: a\x{1234}xyb
+    a\x{1234}\x{4321}yb 
+ 0: a\x{1234}\x{4321}yb
+    a\x{1234}\x{4321}\x{3412}b 
+ 0: a\x{1234}\x{4321}\x{3412}b
+    axxxxbcdefghijb 
+ 0: axxxxbcdefghijb
+ 1: axxxxb
+    a\x{1234}\x{4321}\x{3412}\x{3421}b 
+ 0: a\x{1234}\x{4321}\x{3412}\x{3421}b
+    *** Failers
 No match
-  
-/^\PP/8
-    X
- 0: X
-    ** Failers 
+    a\x{1234}b 
 No match
-    \x{66c}
+
+/a(.{3,5})b/8
+    a\x{1234}xyb 
+ 0: a\x{1234}xyb
+    a\x{1234}\x{4321}yb 
+ 0: a\x{1234}\x{4321}yb
+    a\x{1234}\x{4321}\x{3412}b 
+ 0: a\x{1234}\x{4321}\x{3412}b
+    axxxxbcdefghijb 
+ 0: axxxxb
+    a\x{1234}\x{4321}\x{3412}\x{3421}b 
+ 0: a\x{1234}\x{4321}\x{3412}\x{3421}b
+    axbxxbcdefghijb 
+ 0: axbxxb
+    axxxxxbcdefghijb 
+ 0: axxxxxb
+    *** Failers
 No match
-  
-/^\PS/8
-    X
- 0: X
-    ** Failers 
- 0: *
-    \x{f01}
+    a\x{1234}b 
 No match
-  
-/^\PZ/8
-    X
- 0: X
-    ** Failers 
- 0: *
-    \x{1680}
+    axxxxxxbcdefghijb 
 No match
-    
-/^\p{Cc}/8
-    \x{017}
- 0: \x{17}
-    \x{09f} 
- 0: \x{9f}
-    ** Failers
+
+/a(.{3,5}?)b/8
+    a\x{1234}xyb 
+ 0: a\x{1234}xyb
+    a\x{1234}\x{4321}yb 
+ 0: a\x{1234}\x{4321}yb
+    a\x{1234}\x{4321}\x{3412}b 
+ 0: a\x{1234}\x{4321}\x{3412}b
+    axxxxbcdefghijb 
+ 0: axxxxb
+    a\x{1234}\x{4321}\x{3412}\x{3421}b 
+ 0: a\x{1234}\x{4321}\x{3412}\x{3421}b
+    axbxxbcdefghijb 
+ 0: axbxxb
+    axxxxxbcdefghijb 
+ 0: axxxxxb
+    *** Failers
 No match
-    \x{0600} 
+    a\x{1234}b 
 No match
-  
-/^\p{Cf}/8
-    \x{601}
- 0: \x{601}
-    ** Failers
+    axxxxxxbcdefghijb 
 No match
-    \x{09f} 
+
+/^[a\x{c0}]/8
+    *** Failers
 No match
-  
-/^\p{Cn}/8
-    ** Failers
+    \x{100}
 No match
-    \x{09f} 
-No match
-  
-/^\p{Co}/8
-    \x{f8ff}
- 0: \x{f8ff}
-    ** Failers
-No match
-    \x{09f} 
-No match
-  
-/^\p{Cs}/8
-    \?\x{dfff}
- 0: \x{dfff}
-    ** Failers
-No match
-    \x{09f} 
-No match
-  
-/^\p{Ll}/8
-    a
- 0: a
-    ** Failers 
-No match
-    Z
-No match
-    \x{e000}  
-No match
-  
-/^\p{Lm}/8
-    \x{2b0}
- 0: \x{2b0}
-    ** Failers
-No match
-    a 
-No match
-  
-/^\p{Lo}/8
-    \x{1bb}
- 0: \x{1bb}
-    ** Failers
-No match
-    a 
-No match
-    \x{2b0}
-No match
-  
-/^\p{Lt}/8
-    \x{1c5}
- 0: \x{1c5}
-    ** Failers
-No match
-    a 
-No match
-    \x{2b0}
-No match
-  
-/^\p{Lu}/8
-    A
- 0: A
-    ** Failers
-No match
-    \x{2b0}
-No match
-  
-/^\p{Mc}/8
-    \x{903}
- 0: \x{903}
-    ** Failers
-No match
-    X
-No match
-    \x{300}
-No match
-       
-/^\p{Me}/8
-    \x{488}
- 0: \x{488}
-    ** Failers
-No match
-    X
-No match
-    \x{903}
-No match
-    \x{300}
-No match
-  
-/^\p{Mn}/8
-    \x{300}
- 0: \x{300}
-    ** Failers
-No match
-    X
-No match
-    \x{903}
-No match
-  
-/^\p{Nd}+/8
-    0123456789\x{660}\x{661}\x{662}\x{663}\x{664}\x{665}\x{666}\x{667}\x{668}\x{669}\x{66a}
- 0: 0123456789\x{660}\x{661}\x{662}\x{663}\x{664}\x{665}\x{666}\x{667}\x{668}\x{669}
- 1: 0123456789\x{660}\x{661}\x{662}\x{663}\x{664}\x{665}\x{666}\x{667}\x{668}
- 2: 0123456789\x{660}\x{661}\x{662}\x{663}\x{664}\x{665}\x{666}\x{667}
- 3: 0123456789\x{660}\x{661}\x{662}\x{663}\x{664}\x{665}\x{666}
- 4: 0123456789\x{660}\x{661}\x{662}\x{663}\x{664}\x{665}
- 5: 0123456789\x{660}\x{661}\x{662}\x{663}\x{664}
- 6: 0123456789\x{660}\x{661}\x{662}\x{663}
- 7: 0123456789\x{660}\x{661}\x{662}
- 8: 0123456789\x{660}\x{661}
- 9: 0123456789\x{660}
-10: 0123456789
-11: 012345678
-12: 01234567
-13: 0123456
-14: 012345
-15: 01234
-16: 0123
-17: 012
-18: 01
-19: 0
-    \x{6f0}\x{6f1}\x{6f2}\x{6f3}\x{6f4}\x{6f5}\x{6f6}\x{6f7}\x{6f8}\x{6f9}\x{6fa}
- 0: \x{6f0}\x{6f1}\x{6f2}\x{6f3}\x{6f4}\x{6f5}\x{6f6}\x{6f7}\x{6f8}\x{6f9}
- 1: \x{6f0}\x{6f1}\x{6f2}\x{6f3}\x{6f4}\x{6f5}\x{6f6}\x{6f7}\x{6f8}
- 2: \x{6f0}\x{6f1}\x{6f2}\x{6f3}\x{6f4}\x{6f5}\x{6f6}\x{6f7}
- 3: \x{6f0}\x{6f1}\x{6f2}\x{6f3}\x{6f4}\x{6f5}\x{6f6}
- 4: \x{6f0}\x{6f1}\x{6f2}\x{6f3}\x{6f4}\x{6f5}
- 5: \x{6f0}\x{6f1}\x{6f2}\x{6f3}\x{6f4}
- 6: \x{6f0}\x{6f1}\x{6f2}\x{6f3}
- 7: \x{6f0}\x{6f1}\x{6f2}
- 8: \x{6f0}\x{6f1}
- 9: \x{6f0}
-    \x{966}\x{967}\x{968}\x{969}\x{96a}\x{96b}\x{96c}\x{96d}\x{96e}\x{96f}\x{970}
- 0: \x{966}\x{967}\x{968}\x{969}\x{96a}\x{96b}\x{96c}\x{96d}\x{96e}\x{96f}
- 1: \x{966}\x{967}\x{968}\x{969}\x{96a}\x{96b}\x{96c}\x{96d}\x{96e}
- 2: \x{966}\x{967}\x{968}\x{969}\x{96a}\x{96b}\x{96c}\x{96d}
- 3: \x{966}\x{967}\x{968}\x{969}\x{96a}\x{96b}\x{96c}
- 4: \x{966}\x{967}\x{968}\x{969}\x{96a}\x{96b}
- 5: \x{966}\x{967}\x{968}\x{969}\x{96a}
- 6: \x{966}\x{967}\x{968}\x{969}
- 7: \x{966}\x{967}\x{968}
- 8: \x{966}\x{967}
- 9: \x{966}
-    ** Failers
-No match
-    X
-No match
-  
-/^\p{Nl}/8
-    \x{16ee}
- 0: \x{16ee}
-    ** Failers
-No match
-    X
-No match
-    \x{966}
-No match
-  
-/^\p{No}/8
-    \x{b2}
- 0: \x{b2}
-    \x{b3}
- 0: \x{b3}
-    ** Failers
-No match
-    X
-No match
-    \x{16ee}
-No match
-  
-/^\p{Pc}/8
-    \x5f
- 0: _
-    \x{203f}
- 0: \x{203f}
-    ** Failers
-No match
-    X
-No match
-    -
-No match
-    \x{58a}
-No match
-  
-/^\p{Pd}/8
-    -
- 0: -
-    \x{58a}
- 0: \x{58a}
-    ** Failers
-No match
-    X
-No match
-    \x{203f}
-No match
-  
-/^\p{Pe}/8
-    )
- 0: )
-    ]
- 0: ]
-    }
- 0: }
-    \x{f3b}
- 0: \x{f3b}
-    ** Failers
-No match
-    X
-No match
-    \x{203f}
-No match
-    (
-No match
-    [
-No match
-    {
-No match
-    \x{f3c}
-No match
-  
-/^\p{Pf}/8
-    \x{bb}
- 0: \x{bb}
-    \x{2019}
- 0: \x{2019}
-    ** Failers
-No match
-    X
-No match
-    \x{203f}
-No match
-  
-/^\p{Pi}/8
-    \x{ab}
- 0: \x{ab}
-    \x{2018}
- 0: \x{2018}
-    ** Failers
-No match
-    X
-No match
-    \x{203f}
-No match
-  
-/^\p{Po}/8
-    !
- 0: !
-    \x{37e}
- 0: \x{37e}
-    ** Failers
- 0: *
-    X
-No match
-    \x{203f}
-No match
-  
-/^\p{Ps}/8
-    (
- 0: (
-    [
- 0: [
-    {
- 0: {
-    \x{f3c}
- 0: \x{f3c}
-    ** Failers
-No match
-    X
-No match
-    )
-No match
-    ]
-No match
-    }
-No match
-    \x{f3b}
-No match
-  
-/^\p{Sc}+/8
-    $\x{a2}\x{a3}\x{a4}\x{a5}\x{a6}
- 0: $\x{a2}\x{a3}\x{a4}\x{a5}
- 1: $\x{a2}\x{a3}\x{a4}
- 2: $\x{a2}\x{a3}
- 3: $\x{a2}
- 4: $
-    \x{9f2}
- 0: \x{9f2}
-    ** Failers
-No match
-    X
-No match
-    \x{2c2}
-No match
-  
-/^\p{Sk}/8
-    \x{2c2}
- 0: \x{2c2}
-    ** Failers
-No match
-    X
-No match
-    \x{9f2}
-No match
-  
-/^\p{Sm}+/8
-    +<|~\x{ac}\x{2044}
- 0: +<|~\x{ac}\x{2044}
- 1: +<|~\x{ac}
- 2: +<|~
- 3: +<|
- 4: +<
- 5: +
-    ** Failers
-No match
-    X
-No match
-    \x{9f2}
-No match
-  
-/^\p{So}/8
-    \x{a6}
- 0: \x{a6}
-    \x{482} 
- 0: \x{482}
-    ** Failers
-No match
-    X
-No match
-    \x{9f2}
-No match
-  
-/^\p{Zl}/8
-    \x{2028}
- 0: \x{2028}
-    ** Failers
-No match
-    X
-No match
-    \x{2029}
-No match
-  
-/^\p{Zp}/8
-    \x{2029}
- 0: \x{2029}
-    ** Failers
-No match
-    X
-No match
-    \x{2028}
-No match
-  
-/^\p{Zs}/8
-    \ \
- 0:  
-    \x{a0}
- 0: \x{a0}
-    \x{1680}
- 0: \x{1680}
-    \x{180e}
- 0: \x{180e}
-    \x{2000}
- 0: \x{2000}
-    \x{2001}     
- 0: \x{2001}
-    ** Failers
-No match
-    \x{2028}
-No match
-    \x{200d} 
-No match
-  
-/\p{Nd}+(..)/8
-      \x{660}\x{661}\x{662}ABC
- 0: \x{660}\x{661}\x{662}AB
- 1: \x{660}\x{661}\x{662}A
- 2: \x{660}\x{661}\x{662}
-  
-/\p{Nd}+?(..)/8
-      \x{660}\x{661}\x{662}ABC
- 0: \x{660}\x{661}\x{662}AB
- 1: \x{660}\x{661}\x{662}A
- 2: \x{660}\x{661}\x{662}
-  
-/\p{Nd}{2,}(..)/8
-      \x{660}\x{661}\x{662}ABC
- 0: \x{660}\x{661}\x{662}AB
- 1: \x{660}\x{661}\x{662}A
-  
-/\p{Nd}{2,}?(..)/8
-      \x{660}\x{661}\x{662}ABC
- 0: \x{660}\x{661}\x{662}AB
- 1: \x{660}\x{661}\x{662}A
-  
-/\p{Nd}*(..)/8
-      \x{660}\x{661}\x{662}ABC
- 0: \x{660}\x{661}\x{662}AB
- 1: \x{660}\x{661}\x{662}A
- 2: \x{660}\x{661}\x{662}
- 3: \x{660}\x{661}
-  
-/\p{Nd}*?(..)/8
-      \x{660}\x{661}\x{662}ABC
- 0: \x{660}\x{661}\x{662}AB
- 1: \x{660}\x{661}\x{662}A
- 2: \x{660}\x{661}\x{662}
- 3: \x{660}\x{661}
-  
-/\p{Nd}{2}(..)/8
-      \x{660}\x{661}\x{662}ABC
- 0: \x{660}\x{661}\x{662}A
-  
-/\p{Nd}{2,3}(..)/8
-      \x{660}\x{661}\x{662}ABC
- 0: \x{660}\x{661}\x{662}AB
- 1: \x{660}\x{661}\x{662}A
-  
-/\p{Nd}{2,3}?(..)/8
-      \x{660}\x{661}\x{662}ABC
- 0: \x{660}\x{661}\x{662}AB
- 1: \x{660}\x{661}\x{662}A
-  
-/\p{Nd}?(..)/8
-      \x{660}\x{661}\x{662}ABC
- 0: \x{660}\x{661}\x{662}
- 1: \x{660}\x{661}
-  
-/\p{Nd}??(..)/8
-      \x{660}\x{661}\x{662}ABC
- 0: \x{660}\x{661}\x{662}
- 1: \x{660}\x{661}
-  
-/\p{Nd}*+(..)/8
-      \x{660}\x{661}\x{662}ABC
- 0: \x{660}\x{661}\x{662}AB
-  
-/\p{Nd}*+(...)/8
-      \x{660}\x{661}\x{662}ABC
- 0: \x{660}\x{661}\x{662}ABC
-  
-/\p{Nd}*+(....)/8
-      ** Failers
- 0: ** F
-      \x{660}\x{661}\x{662}ABC
-No match
-  
-/\p{Lu}/8i
-    A
- 0: A
-    a\x{10a0}B 
- 0: \x{10a0}
-    ** Failers 
- 0: F
-    a
-No match
-    \x{1d00}  
-No match

-/\p{^Lu}/8i
-    1234
- 0: 1
-    ** Failers
- 0: *
-    ABC 
-No match
+/(?<=aXb)cd/8
+    aXbcd
+ 0: cd

-/\P{Lu}/8i
-    1234
- 0: 1
-    ** Failers
- 0: *
-    ABC 
-No match
+/(?<=a\x{100}b)cd/8
+    a\x{100}bcd
+ 0: cd

-/(?<=A\p{Nd})XYZ/8
-    A2XYZ
- 0: XYZ
-    123A5XYZPQR
- 0: XYZ
-    ABA\x{660}XYZpqr
- 0: XYZ
-    ** Failers
-No match
-    AXYZ
-No match
-    XYZ     
-No match
+/(?<=a\x{100000}b)cd/8
+    a\x{100000}bcd
+ 0: cd

-/(?<!\pL)XYZ/8
-    1XYZ
- 0: XYZ
-    AB=XYZ.. 
- 0: XYZ
-    XYZ 
- 0: XYZ
-    ** Failers
+/(?:\x{100}){3}b/8
+    \x{100}\x{100}\x{100}b
+ 0: \x{100}\x{100}\x{100}b
+    *** Failers 
 No match
-    WXYZ 
+    \x{100}\x{100}b
 No match

-/[\p{Nd}]/8
-    1234
- 0: 1
-
-/[\p{Nd}+-]+/8
-    1234
- 0: 1234
- 1: 123
- 2: 12
- 3: 1
-    12-34
- 0: 12-34
- 1: 12-3
- 2: 12-
- 3: 12
- 4: 1
-    12+\x{661}-34  
- 0: 12+\x{661}-34
- 1: 12+\x{661}-3
- 2: 12+\x{661}-
- 3: 12+\x{661}
- 4: 12+
- 5: 12
- 6: 1
-    ** Failers
+/\x{ab}/8
+    \x{ab} 
+ 0: \x{ab}
+    \xc2\xab
+ 0: \x{ab}
+    *** Failers 
 No match
-    abcd  
+    \x00{ab}
 No match

-/[\P{Nd}]+/8
-    abcd
- 0: abcd
- 1: abc
- 2: ab
- 3: a
-    ** Failers
- 0: ** Failers
- 1: ** Failer
- 2: ** Faile
- 3: ** Fail
- 4: ** Fai
- 5: ** Fa
- 6: ** F
- 7: ** 
- 8: **
- 9: *
-    1234
+/(?<=(.))X/8
+    WXYZ
+ 0: X
+    \x{256}XYZ 
+ 0: X
+    *** Failers
 No match
-
-/\D+/8
-    11111111111111111111111111111111111111111111111111111111111111111111111
+    XYZ 
 No match
-    aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
-Matched, but too many subsidiary matches
- 0: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
- 1: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
- 2: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
- 3: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
- 4: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
- 5: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
- 6: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
- 7: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
- 8: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
- 9: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
-10: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
-11: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
-12: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
-13: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
-14: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
-15: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
-16: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
-17: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
-18: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
-19: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
-20: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
-21: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
-     
-/\P{Nd}+/8
-    11111111111111111111111111111111111111111111111111111111111111111111111
-No match
-    aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
-Matched, but too many subsidiary matches
- 0: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
- 1: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
- 2: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
- 3: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
- 4: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
- 5: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
- 6: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
- 7: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
- 8: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
- 9: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
-10: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
-11: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
-12: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
-13: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
-14: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
-15: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
-16: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
-17: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
-18: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
-19: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
-20: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
-21: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa

-/[\D]+/8
-    11111111111111111111111111111111111111111111111111111111111111111111111
-No match
-    aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
-Matched, but too many subsidiary matches
- 0: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
- 1: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
- 2: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
- 3: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
- 4: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
- 5: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
- 6: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
- 7: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
- 8: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
- 9: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
-10: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
-11: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
-12: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
-13: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
-14: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
-15: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
-16: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
-17: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
-18: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
-19: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
-20: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
-21: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
+/[^a]+/8g
+    bcd
+ 0: bcd
+ 1: bc
+ 2: b
+    \x{100}aY\x{256}Z 
+ 0: \x{100}
+ 0: Y\x{256}Z
+ 1: Y\x{256}
+ 2: Y
+    
+/^[^a]{2}/8
+    \x{100}bc
+ 0: \x{100}b
+ 
+/^[^a]{2,}/8
+    \x{100}bcAa
+ 0: \x{100}bcA
+ 1: \x{100}bc
+ 2: \x{100}b

-/[\P{Nd}]+/8
-    11111111111111111111111111111111111111111111111111111111111111111111111
-No match
-    aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
-Matched, but too many subsidiary matches
- 0: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
- 1: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
- 2: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
- 3: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
- 4: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
- 5: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
- 6: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
- 7: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
- 8: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
- 9: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
-10: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
-11: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
-12: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
-13: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
-14: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
-15: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
-16: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
-17: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
-18: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
-19: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
-20: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
-21: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
+/^[^a]{2,}?/8
+    \x{100}bca
+ 0: \x{100}bc
+ 1: \x{100}b

-/[\D\P{Nd}]+/8
-    11111111111111111111111111111111111111111111111111111111111111111111111
-No match
-    aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
-Matched, but too many subsidiary matches
- 0: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
- 1: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
- 2: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
- 3: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
- 4: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
- 5: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
- 6: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
- 7: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
- 8: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
- 9: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
-10: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
-11: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
-12: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
-13: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
-14: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
-15: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
-16: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
-17: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
-18: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
-19: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
-20: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
-21: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
+/[^a]+/8ig
+    bcd
+ 0: bcd
+ 1: bc
+ 2: b
+    \x{100}aY\x{256}Z 
+ 0: \x{100}
+ 0: Y\x{256}Z
+ 1: Y\x{256}
+ 2: Y
+    
+/^[^a]{2}/8i
+    \x{100}bc
+ 0: \x{100}b
+ 
+/^[^a]{2,}/8i
+    \x{100}bcAa
+ 0: \x{100}bc
+ 1: \x{100}b

-/\pL/8
-    a
- 0: a
-    A 
- 0: A
+/^[^a]{2,}?/8i
+    \x{100}bca
+ 0: \x{100}bc
+ 1: \x{100}b

-/\pL/8i
-    a
- 0: a
-    A 
- 0: A
+/\x{100}{0,0}/8
+    abcd
+ 0: 
+ 
+/\x{100}?/8
+    abcd
+ 0: 
+    \x{100}\x{100} 
+ 0: \x{100}
+ 1: 
+
+/\x{100}{0,3}/8 
+    \x{100}\x{100} 
+ 0: \x{100}\x{100}
+ 1: \x{100}
+ 2: 
+    \x{100}\x{100}\x{100}\x{100} 
+ 0: \x{100}\x{100}\x{100}
+ 1: \x{100}\x{100}
+ 2: \x{100}
+ 3:

-/\p{Lu}/8 
-    A
- 0: A
-    aZ
- 0: Z
-    ** Failers
- 0: F
-    abc   
-No match
+/\x{100}*/8
+    abce
+ 0: 
+    \x{100}\x{100}\x{100}\x{100} 
+ 0: \x{100}\x{100}\x{100}\x{100}
+ 1: \x{100}\x{100}\x{100}
+ 2: \x{100}\x{100}
+ 3: \x{100}
+ 4:

-/\p{Lu}/8i
-    A
- 0: A
-    aZ
- 0: Z
-    ** Failers
- 0: F
-    abc   
-No match
+/\x{100}{1,1}/8
+    abcd\x{100}\x{100}\x{100}\x{100} 
+ 0: \x{100}

-/\p{Ll}/8 
-    a
- 0: a
-    Az
- 0: z
-    ** Failers
- 0: a
-    ABC   
-No match
+/\x{100}{1,3}/8
+    abcd\x{100}\x{100}\x{100}\x{100} 
+ 0: \x{100}\x{100}\x{100}
+ 1: \x{100}\x{100}
+ 2: \x{100}

-/\p{Ll}/8i 
-    a
- 0: a
-    Az
- 0: z
-    ** Failers
- 0: a
-    ABC   
-No match
+/\x{100}+/8
+    abcd\x{100}\x{100}\x{100}\x{100} 
+ 0: \x{100}\x{100}\x{100}\x{100}
+ 1: \x{100}\x{100}\x{100}
+ 2: \x{100}\x{100}
+ 3: \x{100}

-/^\x{c0}$/8i
-    \x{c0}
- 0: \x{c0}
-    \x{e0} 
- 0: \x{e0}
+/\x{100}{3}/8
+    abcd\x{100}\x{100}\x{100}XX
+ 0: \x{100}\x{100}\x{100}

-/^\x{e0}$/8i
-    \x{c0}
- 0: \x{c0}
-    \x{e0} 
- 0: \x{e0}
+/\x{100}{3,5}/8
+    abcd\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}XX
+ 0: \x{100}\x{100}\x{100}\x{100}\x{100}
+ 1: \x{100}\x{100}\x{100}\x{100}
+ 2: \x{100}\x{100}\x{100}

-/A\x{391}\x{10427}\x{ff3a}\x{1fb0}/8
-    A\x{391}\x{10427}\x{ff3a}\x{1fb0}
- 0: A\x{391}\x{10427}\x{ff3a}\x{1fb0}
-    ** Failers
-No match
-    a\x{391}\x{10427}\x{ff3a}\x{1fb0}   
-No match
-    A\x{3b1}\x{10427}\x{ff3a}\x{1fb0}
-No match
-    A\x{391}\x{1044F}\x{ff3a}\x{1fb0}
-No match
-    A\x{391}\x{10427}\x{ff5a}\x{1fb0}
-No match
-    A\x{391}\x{10427}\x{ff3a}\x{1fb8}
-No match
+/\x{100}{3,}/8
+    abcd\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}XX
+ 0: \x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}
+ 1: \x{100}\x{100}\x{100}\x{100}\x{100}\x{100}
+ 2: \x{100}\x{100}\x{100}\x{100}\x{100}
+ 3: \x{100}\x{100}\x{100}\x{100}
+ 4: \x{100}\x{100}\x{100}

-/A\x{391}\x{10427}\x{ff3a}\x{1fb0}/8i
-    A\x{391}\x{10427}\x{ff3a}\x{1fb0}
- 0: A\x{391}\x{10427}\x{ff3a}\x{1fb0}
-    a\x{391}\x{10427}\x{ff3a}\x{1fb0}   
- 0: a\x{391}\x{10427}\x{ff3a}\x{1fb0}
-    A\x{3b1}\x{10427}\x{ff3a}\x{1fb0}
- 0: A\x{3b1}\x{10427}\x{ff3a}\x{1fb0}
-    A\x{391}\x{1044F}\x{ff3a}\x{1fb0}
- 0: A\x{391}\x{1044f}\x{ff3a}\x{1fb0}
-    A\x{391}\x{10427}\x{ff5a}\x{1fb0}
- 0: A\x{391}\x{10427}\x{ff5a}\x{1fb0}
-    A\x{391}\x{10427}\x{ff3a}\x{1fb8}
- 0: A\x{391}\x{10427}\x{ff3a}\x{1fb8}
+/(?<=a\x{100}{2}b)X/8
+    Xyyya\x{100}\x{100}bXzzz
+ 0: X

-/\x{391}+/8i
-    \x{391}\x{3b1}\x{3b1}\x{3b1}\x{391}
- 0: \x{391}\x{3b1}\x{3b1}\x{3b1}\x{391}
- 1: \x{391}\x{3b1}\x{3b1}\x{3b1}
- 2: \x{391}\x{3b1}\x{3b1}
- 3: \x{391}\x{3b1}
- 4: \x{391}
+/\D*/8
+  aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
+Matched, but too many subsidiary matches
+ 0: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
+ 1: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
+ 2: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
+ 3: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
+ 4: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
+ 5: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
+ 6: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
+ 7: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
+ 8: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
+ 9: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
+10: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
+11: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
+12: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
+13: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
+14: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
+15: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
+16: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
+17: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
+18: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
+19: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
+20: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
+21: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa

-/\x{391}{3,5}(.)/8i
-    \x{391}\x{3b1}\x{3b1}\x{3b1}\x{391}X
- 0: \x{391}\x{3b1}\x{3b1}\x{3b1}\x{391}X
- 1: \x{391}\x{3b1}\x{3b1}\x{3b1}\x{391}
- 2: \x{391}\x{3b1}\x{3b1}\x{3b1}
+/\D*/8
+  \x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}
+Matched, but too many subsidiary matches
+ 0: \x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}
+ 1: \x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}
+ 2: \x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}
+ 3: \x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}
+ 4: \x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}
+ 5: \x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}
+ 6: \x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}
+ 7: \x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}
+ 8: \x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}
+ 9: \x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}
+10: \x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}
+11: \x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}
+12: \x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}
+13: \x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}
+14: \x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}
+15: \x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}
+16: \x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}
+17: \x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}
+18: \x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}
+19: \x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}
+20: \x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}
+21: \x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}

-/\x{391}{3,5}?(.)/8i
-    \x{391}\x{3b1}\x{3b1}\x{3b1}\x{391}X
- 0: \x{391}\x{3b1}\x{3b1}\x{3b1}\x{391}X
- 1: \x{391}\x{3b1}\x{3b1}\x{3b1}\x{391}
- 2: \x{391}\x{3b1}\x{3b1}\x{3b1}
-
-/[\x{391}\x{ff3a}]/8i
-    \x{391}
- 0: \x{391}
-    \x{ff3a}
- 0: \x{ff3a}
-    \x{3b1}
- 0: \x{3b1}
-    \x{ff5a}   
- 0: \x{ff5a}
+/\D/8
+    1X2
+ 0: X
+    1\x{100}2 
+ 0: \x{100}
+  
+/>\S/8
+    > >X Y
+ 0: >X
+    > >\x{100} Y
+ 0: >\x{100}
+  
+/\d/8
+    \x{100}3
+ 0: 3

-/[\x{c0}\x{391}]/8i
-    \x{c0}
- 0: \x{c0}
-    \x{e0} 
- 0: \x{e0}
+/\s/8
+    \x{100} X
+ 0:  
+    
+/\D+/8
+    12abcd34
+ 0: abcd
+ 1: abc
+ 2: ab
+ 3: a
+    *** Failers
+ 0: *** Failers
+ 1: *** Failer
+ 2: *** Faile
+ 3: *** Fail
+ 4: *** Fai
+ 5: *** Fa
+ 6: *** F
+ 7: *** 
+ 8: ***
+ 9: **
+10: *
+    1234  
+No match

-/[\x{105}-\x{109}]/8i
-    \x{104}
- 0: \x{104}
-    \x{105}
- 0: \x{105}
-    \x{109}  
- 0: \x{109}
-    ** Failers
+/\D{2,3}/8
+    12abcd34
+ 0: abc
+ 1: ab
+    12ab34
+ 0: ab
+    *** Failers  
+ 0: ***
+ 1: **
+    1234
 No match
-    \x{100}
+    12a34  
 No match
-    \x{10a} 
+
+/\D{2,3}?/8
+    12abcd34
+ 0: abc
+ 1: ab
+    12ab34
+ 0: ab
+    *** Failers  
+ 0: ***
+ 1: **
+    1234
 No match
-    
-/[z-\x{100}]/8i 
-    Z
- 0: Z
-    z
- 0: z
-    \x{39c}
- 0: \x{39c}
-    \x{178}
- 0: \x{178}
-    |
- 0: |
-    \x{80}
- 0: \x{80}
-    \x{ff}
- 0: \x{ff}
-    \x{100}
- 0: \x{100}
-    \x{101} 
- 0: \x{101}
-    ** Failers
+    12a34  
 No match
-    \x{102}
+
+/\d+/8
+    12abcd34
+ 0: 12
+ 1: 1
+    *** Failers
 No match
-    Y
+
+/\d{2,3}/8
+    12abcd34
+ 0: 12
+    1234abcd
+ 0: 123
+ 1: 12
+    *** Failers  
 No match
-    y           
+    1.4 
 No match

-/[z-\x{100}]/8i
+/\d{2,3}?/8
+    12abcd34
+ 0: 12
+    1234abcd
+ 0: 123
+ 1: 12
+    *** Failers  
+No match
+    1.4 
+No match

-/^\X/8
-    A
- 0: A
-    A\x{300}BC 
- 0: A\x{300}
-    A\x{300}\x{301}\x{302}BC 
- 0: A\x{300}\x{301}\x{302}
+/\S+/8
+    12abcd34
+ 0: 12abcd34
+ 1: 12abcd3
+ 2: 12abcd
+ 3: 12abc
+ 4: 12ab
+ 5: 12a
+ 6: 12
+ 7: 1
     *** Failers
- 0: *
-    \x{300}  
+ 0: ***
+ 1: **
+ 2: *
+    \    \ 
 No match

-/^[\X]/8
-    X123
- 0: X
+/\S{2,3}/8
+    12abcd34
+ 0: 12a
+ 1: 12
+    1234abcd
+ 0: 123
+ 1: 12
     *** Failers
+ 0: ***
+ 1: **
+    \     \  
 No match
-    AXYZ
+
+/\S{2,3}?/8
+    12abcd34
+ 0: 12a
+ 1: 12
+    1234abcd
+ 0: 123
+ 1: 12
+    *** Failers
+ 0: ***
+ 1: **
+    \     \  
 No match

-/^(\X*)C/8
-    A\x{300}\x{301}\x{302}BCA\x{300}\x{301} 
- 0: A\x{300}\x{301}\x{302}BC
-    A\x{300}\x{301}\x{302}BCA\x{300}\x{301}C 
- 0: A\x{300}\x{301}\x{302}BCA\x{300}\x{301}C
- 1: A\x{300}\x{301}\x{302}BC
-
-/^(\X*?)C/8
-    A\x{300}\x{301}\x{302}BCA\x{300}\x{301} 
- 0: A\x{300}\x{301}\x{302}BC
-    A\x{300}\x{301}\x{302}BCA\x{300}\x{301}C 
- 0: A\x{300}\x{301}\x{302}BCA\x{300}\x{301}C
- 1: A\x{300}\x{301}\x{302}BC
-
-/^(\X*)(.)/8
-    A\x{300}\x{301}\x{302}BCA\x{300}\x{301} 
- 0: A\x{300}\x{301}\x{302}BCA
- 1: A\x{300}\x{301}\x{302}BC
- 2: A\x{300}\x{301}\x{302}B
- 3: A
-    A\x{300}\x{301}\x{302}BCA\x{300}\x{301}C 
- 0: A\x{300}\x{301}\x{302}BCA\x{300}\x{301}C
- 1: A\x{300}\x{301}\x{302}BCA
- 2: A\x{300}\x{301}\x{302}BC
- 3: A\x{300}\x{301}\x{302}B
- 4: A
-
-/^(\X*?)(.)/8
-    A\x{300}\x{301}\x{302}BCA\x{300}\x{301} 
- 0: A\x{300}\x{301}\x{302}BCA
- 1: A\x{300}\x{301}\x{302}BC
- 2: A\x{300}\x{301}\x{302}B
- 3: A
-    A\x{300}\x{301}\x{302}BCA\x{300}\x{301}C 
- 0: A\x{300}\x{301}\x{302}BCA\x{300}\x{301}C
- 1: A\x{300}\x{301}\x{302}BCA
- 2: A\x{300}\x{301}\x{302}BC
- 3: A\x{300}\x{301}\x{302}B
- 4: A
-
-/^\X(.)/8
+/>\s+</8
+    12>      <34
+ 0: >      <
     *** Failers
- 0: **
-    A\x{300}\x{301}\x{302}
 No match

-/^\X{2,3}(.)/8
-    A\x{300}\x{301}B\x{300}X
- 0: A\x{300}\x{301}B\x{300}X
-    A\x{300}\x{301}B\x{300}C\x{300}\x{301}
- 0: A\x{300}\x{301}B\x{300}C
-    A\x{300}\x{301}B\x{300}C\x{300}\x{301}X
- 0: A\x{300}\x{301}B\x{300}C\x{300}\x{301}X
- 1: A\x{300}\x{301}B\x{300}C
-    A\x{300}\x{301}B\x{300}C\x{300}\x{301}DA\x{300}X
- 0: A\x{300}\x{301}B\x{300}C\x{300}\x{301}D
- 1: A\x{300}\x{301}B\x{300}C
-    
-/^\X{2,3}?(.)/8
-    A\x{300}\x{301}B\x{300}X
- 0: A\x{300}\x{301}B\x{300}X
-    A\x{300}\x{301}B\x{300}C\x{300}\x{301}
- 0: A\x{300}\x{301}B\x{300}C
-    A\x{300}\x{301}B\x{300}C\x{300}\x{301}X
- 0: A\x{300}\x{301}B\x{300}C\x{300}\x{301}X
- 1: A\x{300}\x{301}B\x{300}C
-    A\x{300}\x{301}B\x{300}C\x{300}\x{301}DA\x{300}X
- 0: A\x{300}\x{301}B\x{300}C\x{300}\x{301}D
- 1: A\x{300}\x{301}B\x{300}C
-
-/^\pN{2,3}X/
-    12X
- 0: 12X
-    123X
- 0: 123X
+/>\s{2,3}</8
+    ab>  <cd
+ 0: >  <
+    ab>   <ce
+ 0: >   <
     *** Failers
 No match
-    X
+    ab>    <cd 
 No match
-    1X
+
+/>\s{2,3}?</8
+    ab>  <cd
+ 0: >  <
+    ab>   <ce
+ 0: >   <
+    *** Failers
 No match
-    1234X     
+    ab>    <cd 
 No match

-/\x{100}/i8
-    \x{100}   
- 0: \x{100}
-    \x{101} 
- 0: \x{101}
-    
-/^\p{Han}+/8
-    \x{2e81}\x{3007}\x{2f804}\x{31a0}
- 0: \x{2e81}\x{3007}\x{2f804}
- 1: \x{2e81}\x{3007}
- 2: \x{2e81}
-    ** Failers
+/\w+/8
+    12      34
+ 0: 12
+ 1: 1
+    *** Failers
+ 0: Failers
+ 1: Failer
+ 2: Faile
+ 3: Fail
+ 4: Fai
+ 5: Fa
+ 6: F
+    +++=*! 
 No match
-    \x{2e7f}  
+
+/\w{2,3}/8
+    ab  cd
+ 0: ab
+    abcd ce
+ 0: abc
+ 1: ab
+    *** Failers
+ 0: Fai
+ 1: Fa
+    a.b.c
 No match

-/^\P{Katakana}+/8
-    \x{3105}
- 0: \x{3105}
-    ** Failers
- 0: ** Failers
- 1: ** Failer
- 2: ** Faile
- 3: ** Fail
- 4: ** Fai
- 5: ** Fa
- 6: ** F
- 7: ** 
- 8: **
- 9: *
-    \x{30ff}  
+/\w{2,3}?/8
+    ab  cd
+ 0: ab
+    abcd ce
+ 0: abc
+ 1: ab
+    *** Failers
+ 0: Fai
+ 1: Fa
+    a.b.c
 No match

-/^[\p{Arabic}]/8
-    \x{06e9}
- 0: \x{6e9}
-    \x{060b}
- 0: \x{60b}
-    ** Failers
+/\W+/8
+    12====34
+ 0: ====
+ 1: ===
+ 2: ==
+ 3: =
+    *** Failers
+ 0: *** 
+ 1: ***
+ 2: **
+ 3: *
+    abcd 
 No match
-    X\x{06e9}   
+
+/\W{2,3}/8
+    ab====cd
+ 0: ===
+ 1: ==
+    ab==cd
+ 0: ==
+    *** Failers
+ 0: ***
+ 1: **
+    a.b.c
 No match

-/^[\P{Yi}]/8
-    \x{2f800}
- 0: \x{2f800}
-    ** Failers
- 0: *
-    \x{a014}
+/\W{2,3}?/8
+    ab====cd
+ 0: ===
+ 1: ==
+    ab==cd
+ 0: ==
+    *** Failers
+ 0: ***
+ 1: **
+    a.b.c
 No match
-    \x{a4c6}   
+
+/[\x{100}]/8
+    \x{100}
+ 0: \x{100}
+    Z\x{100}
+ 0: \x{100}
+    \x{100}Z
+ 0: \x{100}
+    *** Failers 
 No match

-/^\p{Any}X/8
-    AXYZ
- 0: AX
-    \x{1234}XYZ 
- 0: \x{1234}X
-    ** Failers
+/[Z\x{100}]/8
+    Z\x{100}
+ 0: Z
+    \x{100}
+ 0: \x{100}
+    \x{100}Z
+ 0: \x{100}
+    *** Failers 
 No match
-    X  
+
+/[\x{100}\x{200}]/8
+   ab\x{100}cd
+ 0: \x{100}
+   ab\x{200}cd
+ 0: \x{200}
+   *** Failers  
 No match
-    
-/^\P{Any}X/8
-    ** Failers
+
+/[\x{100}-\x{200}]/8
+   ab\x{100}cd
+ 0: \x{100}
+   ab\x{200}cd
+ 0: \x{200}
+   ab\x{111}cd 
+ 0: \x{111}
+   *** Failers  
 No match
-    AX
+
+/[z-\x{200}]/8
+   ab\x{100}cd
+ 0: \x{100}
+   ab\x{200}cd
+ 0: \x{200}
+   ab\x{111}cd 
+ 0: \x{111}
+   abzcd
+ 0: z
+   ab|cd  
+ 0: |
+   *** Failers  
 No match
-    
-/^\p{Any}?X/8
-    XYZ
- 0: X
-    AXYZ
- 0: AX
-    \x{1234}XYZ 
- 0: \x{1234}X
-    ** Failers
+
+/[Q\x{100}\x{200}]/8
+   ab\x{100}cd
+ 0: \x{100}
+   ab\x{200}cd
+ 0: \x{200}
+   Q? 
+ 0: Q
+   *** Failers  
 No match
-    ABXYZ   
+
+/[Q\x{100}-\x{200}]/8
+   ab\x{100}cd
+ 0: \x{100}
+   ab\x{200}cd
+ 0: \x{200}
+   ab\x{111}cd 
+ 0: \x{111}
+   Q? 
+ 0: Q
+   *** Failers  
 No match

-/^\P{Any}?X/8
-    XYZ
- 0: X
-    ** Failers
+/[Qz-\x{200}]/8
+   ab\x{100}cd
+ 0: \x{100}
+   ab\x{200}cd
+ 0: \x{200}
+   ab\x{111}cd 
+ 0: \x{111}
+   abzcd
+ 0: z
+   ab|cd  
+ 0: |
+   Q? 
+ 0: Q
+   *** Failers  
 No match
-    AXYZ
+
+/[\x{100}\x{200}]{1,3}/8
+   ab\x{100}cd
+ 0: \x{100}
+   ab\x{200}cd
+ 0: \x{200}
+   ab\x{200}\x{100}\x{200}\x{100}cd
+ 0: \x{200}\x{100}\x{200}
+ 1: \x{200}\x{100}
+ 2: \x{200}
+   *** Failers  
 No match
-    \x{1234}XYZ 
+
+/[\x{100}\x{200}]{1,3}?/8
+   ab\x{100}cd
+ 0: \x{100}
+   ab\x{200}cd
+ 0: \x{200}
+   ab\x{200}\x{100}\x{200}\x{100}cd
+ 0: \x{200}\x{100}\x{200}
+ 1: \x{200}\x{100}
+ 2: \x{200}
+   *** Failers  
 No match
-    ABXYZ   
+
+/[Q\x{100}\x{200}]{1,3}/8
+   ab\x{100}cd
+ 0: \x{100}
+   ab\x{200}cd
+ 0: \x{200}
+   ab\x{200}\x{100}\x{200}\x{100}cd
+ 0: \x{200}\x{100}\x{200}
+ 1: \x{200}\x{100}
+ 2: \x{200}
+   *** Failers  
 No match

-/^\p{Any}+X/8
-    AXYZ
- 0: AX
-    \x{1234}XYZ
- 0: \x{1234}X
-    A\x{1234}XYZ
- 0: A\x{1234}X
-    ** Failers
+/[Q\x{100}\x{200}]{1,3}?/8
+   ab\x{100}cd
+ 0: \x{100}
+   ab\x{200}cd
+ 0: \x{200}
+   ab\x{200}\x{100}\x{200}\x{100}cd
+ 0: \x{200}\x{100}\x{200}
+ 1: \x{200}\x{100}
+ 2: \x{200}
+   *** Failers  
 No match
-    XYZ
-No match

-/^\P{Any}+X/8
-    ** Failers
+/(?<=[\x{100}\x{200}])X/8
+    abc\x{200}X
+ 0: X
+    abc\x{100}X 
+ 0: X
+    *** Failers
 No match
-    AXYZ
+    X  
 No match
-    \x{1234}XYZ
-No match
-    A\x{1234}XYZ
-No match
-    XYZ
-No match

-/^\p{Any}*X/8
-    XYZ
+/(?<=[Q\x{100}\x{200}])X/8
+    abc\x{200}X
  0: X
-    AXYZ
- 0: AX
-    \x{1234}XYZ
- 0: \x{1234}X
-    A\x{1234}XYZ
- 0: A\x{1234}X
-    ** Failers
+    abc\x{100}X 
+ 0: X
+    abQX 
+ 0: X
+    *** Failers
 No match
+    X  
+No match

-/^\P{Any}*X/8
-    XYZ
+/(?<=[\x{100}\x{200}]{3})X/8
+    abc\x{100}\x{200}\x{100}X
  0: X
-    ** Failers
+    *** Failers
 No match
-    AXYZ
+    abc\x{200}X
 No match
-    \x{1234}XYZ
+    X  
 No match
-    A\x{1234}XYZ
-No match

-/^[\p{Any}]X/8
-    AXYZ
+/[^\x{100}\x{200}]X/8
+    AX
  0: AX
-    \x{1234}XYZ 
- 0: \x{1234}X
-    ** Failers
+    \x{150}X
+ 0: \x{150}X
+    \x{500}X 
+ 0: \x{500}X
+    *** Failers
 No match
-    X  
+    \x{100}X
 No match
-    
-/^[\P{Any}]X/8
-    ** Failers
+    \x{200}X   
 No match
+
+/[^Q\x{100}\x{200}]X/8
     AX
-No match
-    
-/^[\p{Any}]?X/8
-    XYZ
- 0: X
-    AXYZ
  0: AX
-    \x{1234}XYZ 
- 0: \x{1234}X
-    ** Failers
+    \x{150}X
+ 0: \x{150}X
+    \x{500}X 
+ 0: \x{500}X
+    *** Failers
 No match
-    ABXYZ   
+    \x{100}X
 No match
-
-/^[\P{Any}]?X/8
-    XYZ
- 0: X
-    ** Failers
+    \x{200}X   
 No match
-    AXYZ
+    QX 
 No match
-    \x{1234}XYZ 
-No match
-    ABXYZ   
-No match

-/^[\p{Any}]+X/8
-    AXYZ
+/[^\x{100}-\x{200}]X/8
+    AX
  0: AX
-    \x{1234}XYZ
- 0: \x{1234}X
-    A\x{1234}XYZ
- 0: A\x{1234}X
-    ** Failers
+    \x{500}X 
+ 0: \x{500}X
+    *** Failers
 No match
-    XYZ
+    \x{100}X
 No match
-
-/^[\P{Any}]+X/8
-    ** Failers
+    \x{150}X
 No match
-    AXYZ
+    \x{200}X   
 No match
-    \x{1234}XYZ
+
+/[z-\x{100}]/8i
+    z
+ 0: z
+    Z 
+ 0: Z
+    \x{100}
+ 0: \x{100}
+    *** Failers
 No match
-    A\x{1234}XYZ
+    \x{102}
 No match
-    XYZ
+    y    
 No match

-/^[\p{Any}]*X/8
+/[\xFF]/
+    >\xff<
+ 0: \xff
+
+/[\xff]/8
+    >\x{ff}<
+ 0: \x{ff}
+
+/[^\xFF]/
     XYZ
  0: X
-    AXYZ
- 0: AX
-    \x{1234}XYZ
- 0: \x{1234}X
-    A\x{1234}XYZ
- 0: A\x{1234}X
-    ** Failers
-No match

-/^[\P{Any}]*X/8
+/[^\xff]/8
     XYZ
  0: X
-    ** Failers
-No match
-    AXYZ
-No match
-    \x{1234}XYZ
-No match
-    A\x{1234}XYZ
-No match
+    \x{123} 
+ 0: \x{123}

-/^\p{Any}{3,5}?/8
-    abcdefgh
- 0: abcde
- 1: abcd
- 2: abc
-    \x{1234}\n\r\x{3456}xyz 
- 0: \x{1234}\x{0a}\x{0d}\x{3456}x
- 1: \x{1234}\x{0a}\x{0d}\x{3456}
- 2: \x{1234}\x{0a}\x{0d}
-
-/^\p{Any}{3,5}/8
-    abcdefgh
- 0: abcde
- 1: abcd
- 2: abc
-    \x{1234}\n\r\x{3456}xyz 
- 0: \x{1234}\x{0a}\x{0d}\x{3456}x
- 1: \x{1234}\x{0a}\x{0d}\x{3456}
- 2: \x{1234}\x{0a}\x{0d}
-
-/^\P{Any}{3,5}?/8
-    ** Failers
+/^[ac]*b/8
+  xb
 No match
-    abcdefgh
-No match
-    \x{1234}\n\r\x{3456}xyz 
-No match

-/^\p{L&}X/8
-     AXY
- 0: AX
-     aXY
- 0: aX
-     \x{1c5}XY
- 0: \x{1c5}X
-     ** Failers
+/^[ac\x{100}]*b/8
+  xb
 No match
-     \x{1bb}XY
-No match
-     \x{2b0}XY
-No match
-     !XY      
-No match

-/^[\p{L&}]X/8
-     AXY
- 0: AX
-     aXY
- 0: aX
-     \x{1c5}XY
- 0: \x{1c5}X
-     ** Failers
+/^[^x]*b/8i
+  xb
 No match
-     \x{1bb}XY
-No match
-     \x{2b0}XY
-No match
-     !XY      
-No match

-/^\p{L&}+X/8
-     AXY
- 0: AX
-     aXY
- 0: aX
-     AbcdeXyz 
- 0: AbcdeX
-     \x{1c5}AbXY
- 0: \x{1c5}AbX
-     abcDEXypqreXlmn 
- 0: abcDEXypqreX
- 1: abcDEX
-     ** Failers
+/^[^x]*b/8
+  xb
 No match
-     \x{1bb}XY
+  
+/^\d*b/8
+  xb 
 No match
-     \x{2b0}XY
-No match
-     !XY      
-No match

-/^[\p{L&}]+X/8
-     AXY
- 0: AX
-     aXY
- 0: aX
-     AbcdeXyz 
- 0: AbcdeX
-     \x{1c5}AbXY
- 0: \x{1c5}AbX
-     abcDEXypqreXlmn 
- 0: abcDEXypqreX
- 1: abcDEX
-     ** Failers
-No match
-     \x{1bb}XY
-No match
-     \x{2b0}XY
-No match
-     !XY      
-No match
+/(|a)/g8
+    catac
+ 0: 
+ 0: a
+ 1: 
+ 0: 
+ 0: a
+ 1: 
+ 0: 
+ 0: 
+    a\x{256}a 
+ 0: a
+ 1: 
+ 0: 
+ 0: a
+ 1: 
+ 0:

-/^\p{L&}+?X/8
-     AXY
- 0: AX
-     aXY
- 0: aX
-     AbcdeXyz 
- 0: AbcdeX
-     \x{1c5}AbXY
- 0: \x{1c5}AbX
-     abcDEXypqreXlmn 
- 0: abcDEXypqreX
- 1: abcDEX
-     ** Failers
-No match
-     \x{1bb}XY
-No match
-     \x{2b0}XY
-No match
-     !XY      
-No match
+/^\x{85}$/8i
+    \x{85}
+ 0: \x{85}

-/^[\p{L&}]+?X/8
-     AXY
- 0: AX
-     aXY
- 0: aX
-     AbcdeXyz 
- 0: AbcdeX
-     \x{1c5}AbXY
- 0: \x{1c5}AbX
-     abcDEXypqreXlmn 
- 0: abcDEXypqreX
- 1: abcDEX
-     ** Failers
+/^abc./mgx8<any>
+    abc1 \x0aabc2 \x0babc3xx \x0cabc4 \x0dabc5xx \x0d\x0aabc6 \x{0085}abc7 \x{2028}abc8 \x{2029}abc9 JUNK
+ 0: abc1
+ 0: abc2
+ 0: abc3
+ 0: abc4
+ 0: abc5
+ 0: abc6
+ 0: abc7
+ 0: abc8
+ 0: abc9
+
+/abc.$/mgx8<any>
+    abc1\x0a abc2\x0b abc3\x0c abc4\x0d abc5\x0d\x0a abc6\x{0085} abc7\x{2028} abc8\x{2029} abc9
+ 0: abc1
+ 0: abc2
+ 0: abc3
+ 0: abc4
+ 0: abc5
+ 0: abc6
+ 0: abc7
+ 0: abc8
+ 0: abc9
+
+/^a\Rb/8<bsr_unicode>
+    a\nb
+ 0: a\x{0a}b
+    a\rb
+ 0: a\x{0d}b
+    a\r\nb
+ 0: a\x{0d}\x{0a}b
+    a\x0bb
+ 0: a\x{0b}b
+    a\x0cb
+ 0: a\x{0c}b
+    a\x{85}b   
+ 0: a\x{85}b
+    a\x{2028}b 
+ 0: a\x{2028}b
+    a\x{2029}b 
+ 0: a\x{2029}b
+    ** Failers
 No match
-     \x{1bb}XY
+    a\n\rb    
 No match
-     \x{2b0}XY
-No match
-     !XY      
-No match

-/^\P{L&}X/8
-     !XY
- 0: !X
-     \x{1bb}XY
- 0: \x{1bb}X
-     \x{2b0}XY
- 0: \x{2b0}X
-     ** Failers
+/^a\R*b/8<bsr_unicode>
+    ab
+ 0: ab
+    a\nb
+ 0: a\x{0a}b
+    a\rb
+ 0: a\x{0d}b
+    a\r\nb
+ 0: a\x{0d}\x{0a}b
+    a\x0bb
+ 0: a\x{0b}b
+    a\x0c\x{2028}\x{2029}b
+ 0: a\x{0c}\x{2028}\x{2029}b
+    a\x{85}b   
+ 0: a\x{85}b
+    a\n\rb    
+ 0: a\x{0a}\x{0d}b
+    a\n\r\x{85}\x0cb 
+ 0: a\x{0a}\x{0d}\x{85}\x{0c}b
+
+/^a\R+b/8<bsr_unicode>
+    a\nb
+ 0: a\x{0a}b
+    a\rb
+ 0: a\x{0d}b
+    a\r\nb
+ 0: a\x{0d}\x{0a}b
+    a\x0bb
+ 0: a\x{0b}b
+    a\x0c\x{2028}\x{2029}b
+ 0: a\x{0c}\x{2028}\x{2029}b
+    a\x{85}b   
+ 0: a\x{85}b
+    a\n\rb    
+ 0: a\x{0a}\x{0d}b
+    a\n\r\x{85}\x0cb 
+ 0: a\x{0a}\x{0d}\x{85}\x{0c}b
+    ** Failers
 No match
-     \x{1c5}XY
+    ab  
 No match
-     AXY      
-No match

-/^[\P{L&}]X/8
-     !XY
- 0: !X
-     \x{1bb}XY
- 0: \x{1bb}X
-     \x{2b0}XY
- 0: \x{2b0}X
-     ** Failers
+/^a\R{1,3}b/8<bsr_unicode>
+    a\nb
+ 0: a\x{0a}b
+    a\n\rb
+ 0: a\x{0a}\x{0d}b
+    a\n\r\x{85}b
+ 0: a\x{0a}\x{0d}\x{85}b
+    a\r\n\r\nb 
+ 0: a\x{0d}\x{0a}\x{0d}\x{0a}b
+    a\r\n\r\n\r\nb 
+ 0: a\x{0d}\x{0a}\x{0d}\x{0a}\x{0d}\x{0a}b
+    a\n\r\n\rb
+ 0: a\x{0a}\x{0d}\x{0a}\x{0d}b
+    a\n\n\r\nb 
+ 0: a\x{0a}\x{0a}\x{0d}\x{0a}b
+    ** Failers
 No match
-     \x{1c5}XY
+    a\n\n\n\rb
 No match
-     AXY      
+    a\r
 No match

-/^\x{023a}+?(\x{0130}+)/8i
-  \x{023a}\x{2c65}\x{0130}
- 0: \x{23a}\x{2c65}\x{130}
-  
-/^\x{023a}+([^X])/8i
-  \x{023a}\x{2c65}X
- 0: \x{23a}\x{2c65}
- 
-/\x{c0}+\x{116}+/8i
-    \x{c0}\x{e0}\x{116}\x{117}
- 0: \x{c0}\x{e0}\x{116}\x{117}
- 1: \x{c0}\x{e0}\x{116}
+/\h+\V?\v{3,4}/8 
+    \x09\x20\x{a0}X\x0a\x0b\x0c\x0d\x0a
+ 0: \x{09} \x{a0}X\x{0a}\x{0b}\x{0c}\x{0d}
+ 1: \x{09} \x{a0}X\x{0a}\x{0b}\x{0c}

-/[\x{c0}\x{116}]+/8i
-    \x{c0}\x{e0}\x{116}\x{117}
- 0: \x{c0}\x{e0}\x{116}\x{117}
- 1: \x{c0}\x{e0}\x{116}
- 2: \x{c0}\x{e0}
- 3: \x{c0}
+/\V?\v{3,4}/8 
+    \x20\x{a0}X\x0a\x0b\x0c\x0d\x0a
+ 0: X\x{0a}\x{0b}\x{0c}\x{0d}
+ 1: X\x{0a}\x{0b}\x{0c}

-/Check property support in non-UTF-8 mode/
- 
-/\p{L}{4}/
-    123abcdefg
- 0: abcd
-    123abc\xc4\xc5zz
- 0: abc\xc4
+/\h+\V?\v{3,4}/8
+    >\x09\x20\x{a0}X\x0a\x0a\x0a<
+ 0: \x{09} \x{a0}X\x{0a}\x{0a}\x{0a}

-/\p{Carian}\p{Cham}\p{Kayah_Li}\p{Lepcha}\p{Lycian}\p{Lydian}\p{Ol_Chiki}\p{Rejang}\p{Saurashtra}\p{Sundanese}\p{Vai}/8
-    \x{102A4}\x{AA52}\x{A91D}\x{1C46}\x{10283}\x{1092E}\x{1C6B}\x{A93B}\x{A8BF}\x{1BA0}\x{A50A}====
- 0: \x{102a4}\x{aa52}\x{a91d}\x{1c46}\x{10283}\x{1092e}\x{1c6b}\x{a93b}\x{a8bf}\x{1ba0}\x{a50a}
+/\V?\v{3,4}/8
+    >\x09\x20\x{a0}X\x0a\x0a\x0a<
+ 0: X\x{0a}\x{0a}\x{0a}

-/\x{a77d}\x{1d79}/8i
-    \x{a77d}\x{1d79}
- 0: \x{a77d}\x{1d79}
-    \x{1d79}\x{a77d} 
- 0: \x{1d79}\x{a77d}
-
-/\x{a77d}\x{1d79}/8
-    \x{a77d}\x{1d79}
- 0: \x{a77d}\x{1d79}
-    ** Failers 
+/\H\h\V\v/8
+    X X\x0a
+ 0: X X\x{0a}
+    X\x09X\x0b
+ 0: X\x{09}X\x{0b}
+    ** Failers
 No match
-    \x{1d79}\x{a77d} 
+    \x{a0} X\x0a   
 No match
-
-/^\p{Xan}/8
-    ABCD
- 0: A
-    1234
- 0: 1
-    \x{6ca}
- 0: \x{6ca}
-    \x{a6c}
- 0: \x{a6c}
-    \x{10a7}   
- 0: \x{10a7}
-    ** Failers
+    
+/\H*\h+\V?\v{3,4}/8 
+    \x09\x20\x{a0}X\x0a\x0b\x0c\x0d\x0a
+ 0: \x{09} \x{a0}X\x{0a}\x{0b}\x{0c}\x{0d}
+ 1: \x{09} \x{a0}X\x{0a}\x{0b}\x{0c}
+    \x09\x20\x{a0}\x0a\x0b\x0c\x0d\x0a
+ 0: \x{09} \x{a0}\x{0a}\x{0b}\x{0c}\x{0d}
+ 1: \x{09} \x{a0}\x{0a}\x{0b}\x{0c}
+    \x09\x20\x{a0}\x0a\x0b\x0c
+ 0: \x{09} \x{a0}\x{0a}\x{0b}\x{0c}
+    ** Failers 
 No match
-    _ABC   
+    \x09\x20\x{a0}\x0a\x0b
 No match
-
-/^\p{Xan}+/8
-    ABCD1234\x{6ca}\x{a6c}\x{10a7}_
- 0: ABCD1234\x{6ca}\x{a6c}\x{10a7}
- 1: ABCD1234\x{6ca}\x{a6c}
- 2: ABCD1234\x{6ca}
- 3: ABCD1234
- 4: ABCD123
- 5: ABCD12
- 6: ABCD1
- 7: ABCD
- 8: ABC
- 9: AB
-10: A
+     
+/\H\h\V\v/8
+    \x{3001}\x{3000}\x{2030}\x{2028}
+ 0: \x{3001}\x{3000}\x{2030}\x{2028}
+    X\x{180e}X\x{85}
+ 0: X\x{180e}X\x{85}
     ** Failers
 No match
-    _ABC   
+    \x{2009} X\x0a   
 No match
-
-/^\p{Xan}*/8
-    ABCD1234\x{6ca}\x{a6c}\x{10a7}_
- 0: ABCD1234\x{6ca}\x{a6c}\x{10a7}
- 1: ABCD1234\x{6ca}\x{a6c}
- 2: ABCD1234\x{6ca}
- 3: ABCD1234
- 4: ABCD123
- 5: ABCD12
- 6: ABCD1
- 7: ABCD
- 8: ABC
- 9: AB
-10: A
-11:

-/^\p{Xan}{2,9}/8
-    ABCD1234\x{6ca}\x{a6c}\x{10a7}_
- 0: ABCD1234\x{6ca}
- 1: ABCD1234
- 2: ABCD123
- 3: ABCD12
- 4: ABCD1
- 5: ABCD
- 6: ABC
- 7: AB
-    
-/^[\p{Xan}]/8
-    ABCD1234_
- 0: A
-    1234abcd_
- 0: 1
-    \x{6ca}
- 0: \x{6ca}
-    \x{a6c}
- 0: \x{a6c}
-    \x{10a7}   
- 0: \x{10a7}
-    ** Failers
+/\H*\h+\V?\v{3,4}/8 
+    \x{1680}\x{180e}\x{2007}X\x{2028}\x{2029}\x0c\x0d\x0a
+ 0: \x{1680}\x{180e}\x{2007}X\x{2028}\x{2029}\x{0c}\x{0d}
+ 1: \x{1680}\x{180e}\x{2007}X\x{2028}\x{2029}\x{0c}
+    \x09\x{205f}\x{a0}\x0a\x{2029}\x0c\x{2028}\x0a
+ 0: \x{09}\x{205f}\x{a0}\x{0a}\x{2029}\x{0c}\x{2028}
+ 1: \x{09}\x{205f}\x{a0}\x{0a}\x{2029}\x{0c}
+    \x09\x20\x{202f}\x0a\x0b\x0c
+ 0: \x{09} \x{202f}\x{0a}\x{0b}\x{0c}
+    ** Failers 
 No match
-    _ABC   
+    \x09\x{200a}\x{a0}\x{2028}\x0b
 No match
- 
-/^[\p{Xan}]+/8
-    ABCD1234\x{6ca}\x{a6c}\x{10a7}_
- 0: ABCD1234\x{6ca}\x{a6c}\x{10a7}
- 1: ABCD1234\x{6ca}\x{a6c}
- 2: ABCD1234\x{6ca}
- 3: ABCD1234
- 4: ABCD123
- 5: ABCD12
- 6: ABCD1
- 7: ABCD
- 8: ABC
- 9: AB
-10: A
+     
+/a\Rb/I8<bsr_anycrlf>
+Capturing subpattern count = 0
+Options: bsr_anycrlf utf
+First char = 'a'
+Need char = 'b'
+    a\rb
+ 0: a\x{0d}b
+    a\nb
+ 0: a\x{0a}b
+    a\r\nb
+ 0: a\x{0d}\x{0a}b
     ** Failers
 No match
-    _ABC   
+    a\x{85}b
 No match
-
-/^>\p{Xsp}/8
-    >\x{1680}\x{2028}\x{0b}
- 0: >\x{1680}
-    ** Failers
+    a\x0bb     
 No match
-    \x{0b} 
-No match

-/^>\p{Xsp}+/8
-    > \x{09}\x{0a}\x{0c}\x{0d}\x{a0}\x{1680}\x{2028}\x{0b}
- 0: > \x{09}\x{0a}\x{0c}\x{0d}\x{a0}\x{1680}\x{2028}
- 1: > \x{09}\x{0a}\x{0c}\x{0d}\x{a0}\x{1680}
- 2: > \x{09}\x{0a}\x{0c}\x{0d}\x{a0}
- 3: > \x{09}\x{0a}\x{0c}\x{0d}
- 4: > \x{09}\x{0a}\x{0c}
- 5: > \x{09}\x{0a}
- 6: > \x{09}
- 7: > 
-
-/^>\p{Xsp}*/8
-    > \x{09}\x{0a}\x{0c}\x{0d}\x{a0}\x{1680}\x{2028}\x{0b}
- 0: > \x{09}\x{0a}\x{0c}\x{0d}\x{a0}\x{1680}\x{2028}
- 1: > \x{09}\x{0a}\x{0c}\x{0d}\x{a0}\x{1680}
- 2: > \x{09}\x{0a}\x{0c}\x{0d}\x{a0}
- 3: > \x{09}\x{0a}\x{0c}\x{0d}
- 4: > \x{09}\x{0a}\x{0c}
- 5: > \x{09}\x{0a}
- 6: > \x{09}
- 7: > 
- 8: >
-    
-/^>\p{Xsp}{2,9}/8
-    > \x{09}\x{0a}\x{0c}\x{0d}\x{a0}\x{1680}\x{2028}\x{0b}
- 0: > \x{09}\x{0a}\x{0c}\x{0d}\x{a0}\x{1680}\x{2028}
- 1: > \x{09}\x{0a}\x{0c}\x{0d}\x{a0}\x{1680}
- 2: > \x{09}\x{0a}\x{0c}\x{0d}\x{a0}
- 3: > \x{09}\x{0a}\x{0c}\x{0d}
- 4: > \x{09}\x{0a}\x{0c}
- 5: > \x{09}\x{0a}
- 6: > \x{09}
-    
-/^>[\p{Xsp}]/8
-    >\x{2028}\x{0b}
- 0: >\x{2028}
- 
-/^>[\p{Xsp}]+/8
-    > \x{09}\x{0a}\x{0c}\x{0d}\x{a0}\x{1680}\x{2028}\x{0b}
- 0: > \x{09}\x{0a}\x{0c}\x{0d}\x{a0}\x{1680}\x{2028}
- 1: > \x{09}\x{0a}\x{0c}\x{0d}\x{a0}\x{1680}
- 2: > \x{09}\x{0a}\x{0c}\x{0d}\x{a0}
- 3: > \x{09}\x{0a}\x{0c}\x{0d}
- 4: > \x{09}\x{0a}\x{0c}
- 5: > \x{09}\x{0a}
- 6: > \x{09}
- 7: > 
-
-/^>\p{Xps}/8
-    >\x{1680}\x{2028}\x{0b}
- 0: >\x{1680}
-    >\x{a0} 
- 0: >\x{a0}
-    ** Failers
+/a\Rb/I8<bsr_unicode>
+Capturing subpattern count = 0
+Options: bsr_unicode utf
+First char = 'a'
+Need char = 'b'
+    a\rb
+ 0: a\x{0d}b
+    a\nb
+ 0: a\x{0a}b
+    a\r\nb
+ 0: a\x{0d}\x{0a}b
+    a\x{85}b
+ 0: a\x{85}b
+    a\x0bb     
+ 0: a\x{0b}b
+    ** Failers 
 No match
-    \x{0b} 
+    a\x{85}b\<bsr_anycrlf>
 No match
-
-/^>\p{Xps}+/8
-    > \x{09}\x{0a}\x{0c}\x{0d}\x{a0}\x{1680}\x{2028}\x{0b}
- 0: > \x{09}\x{0a}\x{0c}\x{0d}\x{a0}\x{1680}\x{2028}\x{0b}
- 1: > \x{09}\x{0a}\x{0c}\x{0d}\x{a0}\x{1680}\x{2028}
- 2: > \x{09}\x{0a}\x{0c}\x{0d}\x{a0}\x{1680}
- 3: > \x{09}\x{0a}\x{0c}\x{0d}\x{a0}
- 4: > \x{09}\x{0a}\x{0c}\x{0d}
- 5: > \x{09}\x{0a}\x{0c}
- 6: > \x{09}\x{0a}
- 7: > \x{09}
- 8: > 
-
-/^>\p{Xps}+?/8
-    >\x{1680}\x{2028}\x{0b}
- 0: >\x{1680}\x{2028}\x{0b}
- 1: >\x{1680}\x{2028}
- 2: >\x{1680}
-
-/^>\p{Xps}*/8
-    > \x{09}\x{0a}\x{0c}\x{0d}\x{a0}\x{1680}\x{2028}\x{0b}
- 0: > \x{09}\x{0a}\x{0c}\x{0d}\x{a0}\x{1680}\x{2028}\x{0b}
- 1: > \x{09}\x{0a}\x{0c}\x{0d}\x{a0}\x{1680}\x{2028}
- 2: > \x{09}\x{0a}\x{0c}\x{0d}\x{a0}\x{1680}
- 3: > \x{09}\x{0a}\x{0c}\x{0d}\x{a0}
- 4: > \x{09}\x{0a}\x{0c}\x{0d}
- 5: > \x{09}\x{0a}\x{0c}
- 6: > \x{09}\x{0a}
- 7: > \x{09}
- 8: > 
- 9: >
+    a\x0bb\<bsr_anycrlf>
+No match

-/^>\p{Xps}{2,9}/8
-    > \x{09}\x{0a}\x{0c}\x{0d}\x{a0}\x{1680}\x{2028}\x{0b}
- 0: > \x{09}\x{0a}\x{0c}\x{0d}\x{a0}\x{1680}\x{2028}\x{0b}
- 1: > \x{09}\x{0a}\x{0c}\x{0d}\x{a0}\x{1680}\x{2028}
- 2: > \x{09}\x{0a}\x{0c}\x{0d}\x{a0}\x{1680}
- 3: > \x{09}\x{0a}\x{0c}\x{0d}\x{a0}
- 4: > \x{09}\x{0a}\x{0c}\x{0d}
- 5: > \x{09}\x{0a}\x{0c}
- 6: > \x{09}\x{0a}
- 7: > \x{09}
-    
-/^>\p{Xps}{2,9}?/8
-    > \x{09}\x{0a}\x{0c}\x{0d}\x{a0}\x{1680}\x{2028}\x{0b}
- 0: > \x{09}\x{0a}\x{0c}\x{0d}\x{a0}\x{1680}\x{2028}\x{0b}
- 1: > \x{09}\x{0a}\x{0c}\x{0d}\x{a0}\x{1680}\x{2028}
- 2: > \x{09}\x{0a}\x{0c}\x{0d}\x{a0}\x{1680}
- 3: > \x{09}\x{0a}\x{0c}\x{0d}\x{a0}
- 4: > \x{09}\x{0a}\x{0c}\x{0d}
- 5: > \x{09}\x{0a}\x{0c}
- 6: > \x{09}\x{0a}
- 7: > \x{09}
-    
-/^>[\p{Xps}]/8
-    >\x{2028}\x{0b}
- 0: >\x{2028}
- 
-/^>[\p{Xps}]+/8
-    > \x{09}\x{0a}\x{0c}\x{0d}\x{a0}\x{1680}\x{2028}\x{0b}
- 0: > \x{09}\x{0a}\x{0c}\x{0d}\x{a0}\x{1680}\x{2028}\x{0b}
- 1: > \x{09}\x{0a}\x{0c}\x{0d}\x{a0}\x{1680}\x{2028}
- 2: > \x{09}\x{0a}\x{0c}\x{0d}\x{a0}\x{1680}
- 3: > \x{09}\x{0a}\x{0c}\x{0d}\x{a0}
- 4: > \x{09}\x{0a}\x{0c}\x{0d}
- 5: > \x{09}\x{0a}\x{0c}
- 6: > \x{09}\x{0a}
- 7: > \x{09}
- 8: > 
-
-/^\p{Xwd}/8
-    ABCD
- 0: A
-    1234
- 0: 1
-    \x{6ca}
- 0: \x{6ca}
-    \x{a6c}
- 0: \x{a6c}
-    \x{10a7}
- 0: \x{10a7}
-    _ABC    
- 0: _
+/a\R?b/I8<bsr_anycrlf>
+Capturing subpattern count = 0
+Options: bsr_anycrlf utf
+First char = 'a'
+Need char = 'b'
+    a\rb
+ 0: a\x{0d}b
+    a\nb
+ 0: a\x{0a}b
+    a\r\nb
+ 0: a\x{0d}\x{0a}b
     ** Failers
 No match
-    [] 
+    a\x{85}b
 No match
+    a\x0bb     
+No match

-/^\p{Xwd}+/8
-    ABCD1234\x{6ca}\x{a6c}\x{10a7}_
- 0: ABCD1234\x{6ca}\x{a6c}\x{10a7}_
- 1: ABCD1234\x{6ca}\x{a6c}\x{10a7}
- 2: ABCD1234\x{6ca}\x{a6c}
- 3: ABCD1234\x{6ca}
- 4: ABCD1234
- 5: ABCD123
- 6: ABCD12
- 7: ABCD1
- 8: ABCD
- 9: ABC
-10: AB
-11: A
-
-/^\p{Xwd}*/8
-    ABCD1234\x{6ca}\x{a6c}\x{10a7}_
- 0: ABCD1234\x{6ca}\x{a6c}\x{10a7}_
- 1: ABCD1234\x{6ca}\x{a6c}\x{10a7}
- 2: ABCD1234\x{6ca}\x{a6c}
- 3: ABCD1234\x{6ca}
- 4: ABCD1234
- 5: ABCD123
- 6: ABCD12
- 7: ABCD1
- 8: ABCD
- 9: ABC
-10: AB
-11: A
-12: 
-    
-/^\p{Xwd}{2,9}/8
-    A_12\x{6ca}\x{a6c}\x{10a7}
- 0: A_12\x{6ca}\x{a6c}\x{10a7}
- 1: A_12\x{6ca}\x{a6c}
- 2: A_12\x{6ca}
- 3: A_12
- 4: A_1
- 5: A_
-    
-/^[\p{Xwd}]/8
-    ABCD1234_
- 0: A
-    1234abcd_
- 0: 1
-    \x{6ca}
- 0: \x{6ca}
-    \x{a6c}
- 0: \x{a6c}
-    \x{10a7}   
- 0: \x{10a7}
-    _ABC 
- 0: _
-    ** Failers
+/a\R?b/I8<bsr_unicode>
+Capturing subpattern count = 0
+Options: bsr_unicode utf
+First char = 'a'
+Need char = 'b'
+    a\rb
+ 0: a\x{0d}b
+    a\nb
+ 0: a\x{0a}b
+    a\r\nb
+ 0: a\x{0d}\x{0a}b
+    a\x{85}b
+ 0: a\x{85}b
+    a\x0bb     
+ 0: a\x{0b}b
+    ** Failers 
 No match
-    []   
+    a\x{85}b\<bsr_anycrlf>
 No match
+    a\x0bb\<bsr_anycrlf>
+No match

-/^[\p{Xwd}]+/8
-    ABCD1234\x{6ca}\x{a6c}\x{10a7}_
- 0: ABCD1234\x{6ca}\x{a6c}\x{10a7}_
- 1: ABCD1234\x{6ca}\x{a6c}\x{10a7}
- 2: ABCD1234\x{6ca}\x{a6c}
- 3: ABCD1234\x{6ca}
- 4: ABCD1234
- 5: ABCD123
- 6: ABCD12
- 7: ABCD1
- 8: ABCD
- 9: ABC
-10: AB
-11: A
+/X/8f<any> 
+    A\x{1ec5}ABCXYZ
+ 0: X

-/-- Unicode properties for \b abd \B --/
+/abcd*/8
+    xxxxabcd\P
+ 0: abcd
+ 1: abc
+    xxxxabcd\P\P
+Partial match: abcd

-/\b...\B/8W
-    abc_
- 0: abc
-    \x{37e}abc\x{376} 
- 0: abc
-    \x{37e}\x{376}\x{371}\x{393}\x{394} 
- 0: \x{376}\x{371}\x{393}
-    !\x{c0}++\x{c1}\x{c2} 
- 0: ++\x{c1}
-    !\x{c0}+++++ 
- 0: \x{c0}++
+/abcd*/i8
+    xxxxabcd\P
+ 0: abcd
+ 1: abc
+    xxxxabcd\P\P
+Partial match: abcd
+    XXXXABCD\P
+ 0: ABCD
+ 1: ABC
+    XXXXABCD\P\P
+Partial match: ABCD

-/-- Without PCRE_UCP, non-ASCII always fail, even if < 256  --/
+/abc\d*/8
+    xxxxabc1\P
+ 0: abc1
+ 1: abc
+    xxxxabc1\P\P
+Partial match: abc1

-/\b...\B/8
-    abc_
- 0: abc
-    ** Failers 
- 0: Fai
-    \x{37e}abc\x{376} 
-No match
-    \x{37e}\x{376}\x{371}\x{393}\x{394} 
-No match
-    !\x{c0}++\x{c1}\x{c2} 
-No match
-    !\x{c0}+++++ 
-No match
+/abc[de]*/8
+    xxxxabcde\P
+ 0: abcde
+ 1: abcd
+ 2: abc
+    xxxxabcde\P\P
+Partial match: abcde

-/-- With PCRE_UCP, non-UTF8 chars that are < 256 still check properties  --/
+/\bthe cat\b/8
+    the cat\P
+ 0: the cat
+    the cat\P\P
+Partial match: the cat

-/\b...\B/W
-    abc_
- 0: abc
-    !\x{c0}++\x{c1}\x{c2} 
- 0: ++\xc1
-    !\x{c0}+++++ 
- 0: \xc0++
+/ab\Cde/8
+    abXde
+Error -16 (item unsupported for DFA matching)

+/(?<=ab\Cde)X/8
+Failed: \C not allowed in lookbehind assertion at offset 10
+
/-- End of testinput9 --/
