Author: Subversion repository
Date:  
To: pcre-svn
Subject: [Pcre-svn] [835] code/trunk: Rolled back trunk to r755 to prepare for merging the 16-bit branch.
Revision: 835
          http://vcs.pcre.org/viewvc?view=rev&revision=835
Author:   ph10
Date:     2011-12-28 16:10:09 +0000 (Wed, 28 Dec 2011)


Log Message:
-----------
Rolled back trunk to r755 to prepare for merging the 16-bit branch.

Modified Paths:
--------------
    code/trunk/ChangeLog
    code/trunk/Makefile.am
    code/trunk/NEWS
    code/trunk/PrepareRelease
    code/trunk/RunTest
    code/trunk/RunTest.bat
    code/trunk/configure.ac
    code/trunk/doc/html/pcreapi.html
    code/trunk/doc/html/pcrecallout.html
    code/trunk/doc/html/pcrecompat.html
    code/trunk/doc/html/pcrejit.html
    code/trunk/doc/html/pcrelimits.html
    code/trunk/doc/html/pcrematching.html
    code/trunk/doc/html/pcrepattern.html
    code/trunk/doc/html/pcretest.html
    code/trunk/doc/pcre.txt
    code/trunk/doc/pcreapi.3
    code/trunk/doc/pcrecallout.3
    code/trunk/doc/pcrecompat.3
    code/trunk/doc/pcrejit.3
    code/trunk/doc/pcrelimits.3
    code/trunk/doc/pcrepattern.3
    code/trunk/doc/pcretest.1
    code/trunk/doc/pcretest.txt
    code/trunk/maint/ManyConfigTests
    code/trunk/pcre-config.in
    code/trunk/pcre.h.in
    code/trunk/pcre_compile.c
    code/trunk/pcre_dfa_exec.c
    code/trunk/pcre_exec.c
    code/trunk/pcre_fullinfo.c
    code/trunk/pcre_internal.h
    code/trunk/pcre_jit_compile.c
    code/trunk/pcre_study.c
    code/trunk/pcre_valid_utf8.c
    code/trunk/pcregrep.c
    code/trunk/pcreposix.c
    code/trunk/pcretest.c
    code/trunk/perltest.pl
    code/trunk/sljit/sljitConfigInternal.h
    code/trunk/sljit/sljitExecAllocator.c
    code/trunk/sljit/sljitLir.h
    code/trunk/sljit/sljitNativeARM_Thumb2.c
    code/trunk/sljit/sljitNativeARM_v5.c
    code/trunk/sljit/sljitNativeMIPS_common.c
    code/trunk/sljit/sljitNativePPC_common.c
    code/trunk/sljit/sljitNativeX86_common.c
    code/trunk/testdata/testinput11
    code/trunk/testdata/testinput13
    code/trunk/testdata/testinput2
    code/trunk/testdata/testinput6
    code/trunk/testdata/testoutput10
    code/trunk/testdata/testoutput11
    code/trunk/testdata/testoutput13
    code/trunk/testdata/testoutput14
    code/trunk/testdata/testoutput15
    code/trunk/testdata/testoutput2
    code/trunk/testdata/testoutput6


Modified: code/trunk/ChangeLog
===================================================================
--- code/trunk/ChangeLog    2011-12-28 15:53:12 UTC (rev 834)
+++ code/trunk/ChangeLog    2011-12-28 16:10:09 UTC (rev 835)
@@ -1,19 +1,9 @@
 ChangeLog for PCRE
 ------------------


-Version 8.22
+Version 8.21
 ------------

-1.  Renamed "isnumber" as "is_a_number" because in some Mac environments this
-    name is defined in ctype.h.
-    
-2.  Fixed a bug in the code for calculating the fixed length of lookbehind
-    assertions.
-      
-
-Version 8.21 12-Dec-2011
-------------------------
-
 1.  Updating the JIT compiler.


 2.  JIT compiler now supports OP_NCREF, OP_RREF and OP_NRREF. New test cases
@@ -23,7 +13,7 @@
     PCRE_EXTRA_TABLES is not suported by JIT, and should be checked before
     calling _pcre_jit_exec. Some extra comments are added.


-4.  (*MARK) settings inside atomic groups that do not contain any capturing
+4.  Mark settings inside atomic groups that do not contain any capturing
     parentheses, for example, (?>a(*:m)), were not being passed out. This bug
     was introduced by change 18 for 8.20.


@@ -32,101 +22,37 @@

 6.  Lookbehinds such as (?<=a{2}b) that contained a fixed repetition were
     erroneously being rejected as "not fixed length" if PCRE_CASELESS was set.
-    This bug was probably introduced by change 9 of 8.13.
-
+    This bug was probably introduced by change 9 of 8.13. 
+    
 7.  While fixing 6 above, I noticed that a number of other items were being
-    incorrectly rejected as "not fixed length". This arose partly because newer
+    incorrectly rejected as "not fixed length". This arose partly because newer 
     opcodes had not been added to the fixed-length checking code. I have (a)
     corrected the bug and added tests for these items, and (b) arranged for an
     error to occur if an unknown opcode is encountered while checking for fixed
-    length instead of just assuming "not fixed length". The items that were
-    rejected were: (*ACCEPT), (*COMMIT), (*FAIL), (*MARK), (*PRUNE), (*SKIP),
-    (*THEN), \h, \H, \v, \V, and single character negative classes with fixed
+    length instead of just assuming "not fixed length". The items that were 
+    rejected were: (*ACCEPT), (*COMMIT), (*FAIL), (*MARK), (*PRUNE), (*SKIP), 
+    (*THEN), \h, \H, \v, \V, and single character negative classes with fixed 
     repetitions, e.g. [^a]{3}, with and without PCRE_CASELESS.
-
+    
 8.  A possessively repeated conditional subpattern such as (?(?=c)c|d)++ was
-    being incorrectly compiled and would have given unpredicatble results.
-
-9.  A possessively repeated subpattern with minimum repeat count greater than
+    being incorrectly compiled and would have given unpredicatble results. 
+    
+9.  A possessively repeated subpattern with minimum repeat count greater than 
     one behaved incorrectly. For example, (A){2,}+ behaved as if it was
-    (A)(A)++ which meant that, after a subsequent mismatch, backtracking into
-    the first (A) could occur when it should not.
+    (A)(A)++ which meant that, after a subsequent mismatch, backtracking into 
+    the first (A) could occur when it should not. 
+    
+10. Add a cast and remove a redundant test from the code. 


-10. Add a cast and remove a redundant test from the code.
-
 11. JIT should use pcre_malloc/pcre_free for allocation.

 12. Updated pcre-config so that it no longer shows -L/usr/lib, which seems
-    best practice nowadays, and helps with cross-compiling. (If the exec_prefix
-    is anything other than /usr, -L is still shown).
-
+    best practice nowadays, and helps with cross-compiling. (If the exec_prefix 
+    is anything other than /usr, -L is still shown). 
+    
 13. In non-UTF-8 mode, \C is now supported in lookbehinds and DFA matching.


-14. Perl does not support \N without a following name in a [] class; PCRE now
-    also gives an error.


-15. If a forward reference was repeated with an upper limit of around 2000,
-    it caused the error "internal error: overran compiling workspace". The
-    maximum number of forward references (including repeats) was limited by the
-    internal workspace, and dependent on the LINK_SIZE. The code has been
-    rewritten so that the workspace expands (via pcre_malloc) if necessary, and
-    the default depends on LINK_SIZE. There is a new upper limit (for safety)
-    of around 200,000 forward references. While doing this, I also speeded up
-    the filling in of repeated forward references.
-
-16. A repeated forward reference in a pattern such as (a)(?2){2}(.) was
-    incorrectly expecting the subject to contain another "a" after the start.
-
-17. When (*SKIP:name) is activated without a corresponding (*MARK:name) earlier
-    in the match, the SKIP should be ignored. This was not happening; instead
-    the SKIP was being treated as NOMATCH. For patterns such as
-    /A(*MARK:A)A+(*SKIP:B)Z|AAC/ this meant that the AAC branch was never
-    tested.
-
-18. The behaviour of (*MARK), (*PRUNE), and (*THEN) has been reworked and is
-    now much more compatible with Perl, in particular in cases where the result
-    is a non-match for a non-anchored pattern. For example, if
-    /b(*:m)f|a(*:n)w/ is matched against "abc", the non-match returns the name
-    "m", where previously it did not return a name. A side effect of this
-    change is that for partial matches, the last encountered mark name is
-    returned, as for non matches. A number of tests that were previously not
-    Perl-compatible have been moved into the Perl-compatible test files. The
-    refactoring has had the pleasing side effect of removing one argument from
-    the match() function, thus reducing its stack requirements.
-
-19. If the /S+ option was used in pcretest to study a pattern using JIT,
-    subsequent uses of /S (without +) incorrectly behaved like /S+.
-
-21. Retrieve executable code size support for the JIT compiler and fixing
-    some warnings.
-
-22. A caseless match of a UTF-8 character whose other case uses fewer bytes did
-    not work when the shorter character appeared right at the end of the
-    subject string.
-
-23. Added some (int) casts to non-JIT modules to reduce warnings on 64-bit
-    systems.
-
-24. Added PCRE_INFO_JITSIZE to pass on the value from (21) above, and also
-    output it when the /M option is used in pcretest.
-
-25. The CheckMan script was not being included in the distribution. Also, added
-    an explicit "perl" to run Perl scripts from the PrepareRelease script
-    because this is reportedly needed in Windows.
-
-26. If study data was being save in a file and studying had not found a set of
-    "starts with" bytes for the pattern, the data written to the file (though
-    never used) was taken from uninitialized memory and so caused valgrind to
-    complain.
-
-27. Updated RunTest.bat as provided by Sheri Pierce.
-
-28. Fixed a possible uninitialized memory bug in pcre_jit_compile.c.
-
-29. Computation of memory usage for the table of capturing group names was
-    giving an unnecessarily large value.
-
-
 Version 8.20 21-Oct-2011
 ------------------------



Modified: code/trunk/Makefile.am
===================================================================
--- code/trunk/Makefile.am    2011-12-28 15:53:12 UTC (rev 834)
+++ code/trunk/Makefile.am    2011-12-28 16:10:09 UTC (rev 835)
@@ -100,7 +100,6 @@
 # These files are used in the preparation of a release
 EXTRA_DIST += \
   PrepareRelease \
-  CheckMan \
   CleanTxt \
   Detrail \
   132html \


Modified: code/trunk/NEWS
===================================================================
--- code/trunk/NEWS    2011-12-28 15:53:12 UTC (rev 834)
+++ code/trunk/NEWS    2011-12-28 16:10:09 UTC (rev 835)
@@ -1,13 +1,6 @@
 News about PCRE releases
 ------------------------


-Release 8.21 12-Dec-2011
-------------------------
-
-This is almost entirely a bug-fix release. The only new feature is the ability
-to obtain the size of the memory used by the JIT compiler.
-
-
 Release 8.20 21-Oct-2011
 ------------------------


Modified: code/trunk/PrepareRelease
===================================================================
--- code/trunk/PrepareRelease    2011-12-28 15:53:12 UTC (rev 834)
+++ code/trunk/PrepareRelease    2011-12-28 16:10:09 UTC (rev 835)
@@ -37,7 +37,7 @@


 # Check the remaining man pages

-perl ../CheckMan *.1 *.3
+../CheckMan *.1 *.3
 if [ $? != 0 ] ; then exit 1; fi

 # Make Text form of the documentation. It needs some mangling to make it
@@ -64,7 +64,7 @@
             pcrelimits pcrestack ; do
   echo "  Processing $file.3"
   nroff -c -man $file.3 >$file.rawtxt
-  perl ../CleanTxt <$file.rawtxt >>pcre.txt
+  ../CleanTxt <$file.rawtxt >>pcre.txt
   /bin/rm $file.rawtxt
   echo "------------------------------------------------------------------------------" >>pcre.txt
   if [ "$file" != "pcresample" ] ; then
@@ -77,7 +77,7 @@
 for file in pcretest pcregrep pcre-config ; do
   echo Making $file.txt
   nroff -c -man $file.1 >$file.rawtxt
-  perl ../CleanTxt <$file.rawtxt >$file.txt
+  ../CleanTxt <$file.rawtxt >$file.txt
   /bin/rm $file.rawtxt
 done


@@ -126,7 +126,7 @@
 for file in *.1 ; do
   base=`basename $file .1`
   echo "  Making $base.html"
-  perl ../132html -toc $base <$file >html/$base.html
+  ../132html -toc $base <$file >html/$base.html
 done

 # Exclude table of contents for function summaries. It seems that expr
@@ -146,7 +146,7 @@
     toc=""
   fi
   echo "  Making $base.html"
-  perl ../132html $toc $base <$file >html/$base.html
+  ../132html $toc $base <$file >html/$base.html
   if [ $? != 0 ] ; then exit 1; fi
 done


@@ -235,7 +235,7 @@
libpcreposix.def"

echo Detrailing
-perl ./Detrail $files doc/p* doc/html/*
+./Detrail $files doc/p* doc/html/*

echo Doing basic configure to get default pcre.h and config.h
# This is in case the caller has set aliases (as I do - PH)

Modified: code/trunk/RunTest
===================================================================
--- code/trunk/RunTest    2011-12-28 15:53:12 UTC (rev 834)
+++ code/trunk/RunTest    2011-12-28 16:10:09 UTC (rev 835)
@@ -222,7 +222,7 @@
 # PCRE tests that are not JIT or Perl-compatible: API, errors, internals


 if [ $do2 = yes ] ; then
-  echo "Test 2: API, errors, internals, and non-Perl stuff (not UTF-8)"
+  echo "Test 2: API, errors, internals, and non-Perl stuff"
   for opt in "" "-s" $jitopt; do
     $sim $valgrind ./pcretest -q $opt $testdata/testinput2 testtry
     if [ $? = 0 ] ; then


Modified: code/trunk/RunTest.bat
===================================================================
--- code/trunk/RunTest.bat    2011-12-28 15:53:12 UTC (rev 834)
+++ code/trunk/RunTest.bat    2011-12-28 16:10:09 UTC (rev 835)
@@ -18,8 +18,6 @@
 @rem 14 requires presense of jit support
 @rem 15 requires absence of jit support
 @rem Sheri P also added override tests for study and jit testing
-@rem JIT testing n/a for tests 7-10, removed JIT override test for them
-@rem removed override tests for 14-15


setlocal enabledelayedexpansion
if [%srcdir%]==[] (
@@ -29,7 +27,7 @@
if [%srcdir%]==[] (
if exist ..\..\testdata\ set srcdir=..\..)
if NOT exist "%srcdir%\testdata\" (
-Error: echo distribution testdata folder not found!
+Error: echo distribution testdata folder not found.
call :conferror
exit /b 1
goto :eof
@@ -43,14 +41,14 @@
echo pcregrep=%pcregrep%

if NOT exist "%pcregrep%" (
-echo Error: "%pcregrep%" not found!
+echo Error: "%pcregrep%" not found.
echo.
call :conferror
exit /b 1
)

if NOT exist "%pcretest%" (
-echo Error: "%pcretest%" not found!
+echo Error: "%pcretest%" not found.
echo.
call :conferror
exit /b 1
@@ -221,7 +219,7 @@
goto :eof

:do2
- call :runsub 2 testout "API, errors, internals, and non-Perl stuff (not UTF-8)" -q
+ call :runsub 2 testout "API, errors, internals, and non-Perl stuff" -q
call :runsub 2 testoutstudy "Test with Study Override" -q -s
if %jit% EQU 1 call :runsub 2 testoutjit "Test with JIT Override" -q -s+
goto :eof
@@ -265,6 +263,7 @@
:do7
call :runsub 7 testout "DFA matching" -q -dfa
call :runsub 7 testoutstudy "Test with Study Override" -q -dfa -s
+ if %jit% EQU 1 call :runsub 7 testoutjit "Test with JIT Override" -q -dfa -s+
goto :eof

:do8
@@ -274,6 +273,7 @@
)
call :runsub 8 testout "DFA matching with UTF-8" -q -dfa
call :runsub 8 testoutstudy "Test with Study Override" -q -dfa -s
+ if %jit% EQU 1 call :runsub 8 testoutjit "Test with JIT Override" -q -dfa -s+
goto :eof

:do9
@@ -283,6 +283,7 @@
)
call :runsub 9 testout "DFA matching with Unicode properties" -q -dfa
call :runsub 9 testoutstudy "Test with Study Override" -q -dfa -s
+ if %jit% EQU 1 call :runsub 9 testoutjit "Test with JIT Override" -q -dfa -s+
goto :eof

:do10
@@ -292,6 +293,7 @@
)
call :runsub 10 testout "Internal offsets and code size tests" -q
call :runsub 10 testoutstudy "Test with Study Override" -q -s
+ if %jit% EQU 1 call :runsub 10 testoutjit "Test with JIT Override" -q -s+
goto :eof

:do11
@@ -326,6 +328,8 @@
goto :eof
)
call :runsub 14 testout "JIT-specific features - have JIT" -q
+ call :runsub 14 testoutstudy "Test with Study Override" -q -s
+ call :runsub 14 testoutjit "Test with JIT Override" -q -s+
goto :eof

:do15
@@ -334,12 +338,12 @@
goto :eof
)
call :runsub 15 testout "JIT-specific features - no JIT" -q
+ call :runsub 15 testoutstudy "Test with Study Override" -q -s
goto :eof

:conferror
+@echo Configuration error.
@echo.
-@echo Either your build is incomplete or you have a configuration error.
-@echo.
@echo If configured with cmake and executed via "make test" or the MSVC "RUN_TESTS"
@echo project, pcre_test.bat defines variables and automatically calls RunTest.bat.
@echo For manual testing of all available features, after configuring with cmake

Modified: code/trunk/configure.ac
===================================================================
--- code/trunk/configure.ac    2011-12-28 15:53:12 UTC (rev 834)
+++ code/trunk/configure.ac    2011-12-28 16:10:09 UTC (rev 835)
@@ -9,9 +9,9 @@
 dnl be defined as -RC2, for example. For real releases, it should be empty.


m4_define(pcre_major, [8])
-m4_define(pcre_minor, [22])
-m4_define(pcre_prerelease, [])
-m4_define(pcre_date, [2011-12-12])
+m4_define(pcre_minor, [21])
+m4_define(pcre_prerelease, [-RC1])
+m4_define(pcre_date, [2011-11-14])

# Libtool shared library interface versions (current:revision:age)
m4_define(libpcre_version, [0:1:0])

Modified: code/trunk/doc/html/pcreapi.html
===================================================================
--- code/trunk/doc/html/pcreapi.html    2011-12-28 15:53:12 UTC (rev 834)
+++ code/trunk/doc/html/pcreapi.html    2011-12-28 16:10:09 UTC (rev 835)
@@ -649,23 +649,6 @@
 string (by default this causes the current matching alternative to fail). A
 pattern such as (\1)(a) succeeds when this option is set (assuming it can find
 an "a" in the subject), whereas it fails by default, for Perl compatibility.
-</P>
-<P>
-(3) \U matches an upper case "U" character; by default \U causes a compile
-time error (Perl uses \U to upper case subsequent characters).
-</P>
-<P>
-(4) \u matches a lower case "u" character unless it is followed by four
-hexadecimal digits, in which case the hexadecimal number defines the code point
-to match. By default, \u causes a compile time error (Perl uses it to upper
-case the following character).
-</P>
-<P>
-(5) \x matches a lower case "x" character unless it is followed by two
-hexadecimal digits, in which case the hexadecimal number defines the code point
-to match. By default, as in Perl, a hexadecimal number is always expected after
-\x, but it may have zero, one, or two digits (so, for example, \xz matches a
-binary zero character followed by z).
 <pre>
   PCRE_MULTILINE
 </pre>
@@ -1144,12 +1127,6 @@
 <a href="pcrejit.html"><b>pcrejit</b></a>
 documentation for details of what can and cannot be handled.
 <pre>
-  PCRE_INFO_JITSIZE
-</pre>
-If the pattern was successfully studied with the PCRE_STUDY_JIT_COMPILE option,
-return the size of the JIT compiled code, otherwise return zero. The fourth
-argument should point to a <b>size_t</b> variable.
-<pre>
   PCRE_INFO_LASTLITERAL
 </pre>
 Return the value of the rightmost literal byte that must exist in any matched
@@ -1258,13 +1235,10 @@
 <pre>
   PCRE_INFO_SIZE
 </pre>
-Return the size of the compiled pattern. The fourth argument should point to a
-<b>size_t</b> variable. This value does not include the size of the <b>pcre</b>
-structure that is returned by <b>pcre_compile()</b>. The value that is passed as
-the argument to <b>pcre_malloc()</b> when <b>pcre_compile()</b> is getting memory
-in which to place the compiled data is the value returned by this option plus
-the size of the <b>pcre</b> structure. Studying a compiled pattern, with or
-without JIT, does not alter the value returned by this option.
+Return the size of the compiled pattern, that is, the value that was passed as
+the argument to <b>pcre_malloc()</b> when PCRE was getting memory in which to
+place the compiled data. The fourth argument should point to a <b>size_t</b>
+variable.
 <pre>
   PCRE_INFO_STUDYSIZE
 </pre>
@@ -2512,7 +2486,7 @@
 </P>
 <br><a name="SEC24" href="#TOC1">REVISION</a><br>
 <P>
-Last updated: 02 December 2011
+Last updated: 23 September 2011
 <br>
 Copyright &copy; 1997-2011 University of Cambridge.
 <br>


Modified: code/trunk/doc/html/pcrecallout.html
===================================================================
--- code/trunk/doc/html/pcrecallout.html    2011-12-28 15:53:12 UTC (rev 834)
+++ code/trunk/doc/html/pcrecallout.html    2011-12-28 16:10:09 UTC (rev 835)
@@ -189,10 +189,9 @@
 <P>
 The <i>mark</i> field is present from version 2 of the <i>pcre_callout</i>
 structure. In callouts from <b>pcre_exec()</b> it contains a pointer to the
-zero-terminated name of the most recently passed (*MARK), (*PRUNE), or (*THEN)
-item in the match, or NULL if no such items have been passed. Instances of
-(*PRUNE) or (*THEN) without a name do not obliterate a previous (*MARK). In
-callouts from <b>pcre_dfa_exec()</b> this field always contains NULL.
+zero-terminated name of the most recently passed (*MARK) item in the match, or
+NULL if there are no (*MARK)s in the current matching path. In callouts from
+<b>pcre_dfa_exec()</b> this field always contains NULL.
 </P>
 <br><a name="SEC4" href="#TOC1">RETURN VALUES</a><br>
 <P>
@@ -220,7 +219,7 @@
 </P>
 <br><a name="SEC6" href="#TOC1">REVISION</a><br>
 <P>
-Last updated: 30 November 2011
+Last updated: 26 August 2011
 <br>
 Copyright &copy; 1997-2011 University of Cambridge.
 <br>


Modified: code/trunk/doc/html/pcrecompat.html
===================================================================
--- code/trunk/doc/html/pcrecompat.html    2011-12-28 15:53:12 UTC (rev 834)
+++ code/trunk/doc/html/pcrecompat.html    2011-12-28 16:10:09 UTC (rev 835)
@@ -53,8 +53,7 @@
 own, matching a non-newline character, is supported.) In fact these are
 implemented by Perl's general string-handling and are not part of its pattern
 matching engine. If any of these are encountered by PCRE, an error is
-generated by default. However, if the PCRE_JAVASCRIPT_COMPAT option is set,
-\U and \u are interpreted as JavaScript interprets them.
+generated.
 </P>
 <P>
 6. The Perl escape sequences \p, \P, and \X are supported only if PCRE is
@@ -203,7 +202,7 @@
 REVISION
 </b><br>
 <P>
-Last updated: 14 November 2011
+Last updated: 09 October 2011
 <br>
 Copyright &copy; 1997-2011 University of Cambridge.
 <br>


Modified: code/trunk/doc/html/pcrejit.html
===================================================================
--- code/trunk/doc/html/pcrejit.html    2011-12-28 15:53:12 UTC (rev 834)
+++ code/trunk/doc/html/pcrejit.html    2011-12-28 16:10:09 UTC (rev 835)
@@ -20,11 +20,10 @@
 <li><a name="TOC5" href="#SEC5">RETURN VALUES FROM JIT EXECUTION</a>
 <li><a name="TOC6" href="#SEC6">SAVING AND RESTORING COMPILED PATTERNS</a>
 <li><a name="TOC7" href="#SEC7">CONTROLLING THE JIT STACK</a>
-<li><a name="TOC8" href="#SEC8">JIT STACK FAQ</a>
-<li><a name="TOC9" href="#SEC9">EXAMPLE CODE</a>
-<li><a name="TOC10" href="#SEC10">SEE ALSO</a>
-<li><a name="TOC11" href="#SEC11">AUTHOR</a>
-<li><a name="TOC12" href="#SEC12">REVISION</a>
+<li><a name="TOC8" href="#SEC8">EXAMPLE CODE</a>
+<li><a name="TOC9" href="#SEC9">SEE ALSO</a>
+<li><a name="TOC10" href="#SEC10">AUTHOR</a>
+<li><a name="TOC11" href="#SEC11">REVISION</a>
 </ul>
 <br><a name="SEC1" href="#TOC1">PCRE JUST-IN-TIME COMPILER SUPPORT</a><br>
 <P>
@@ -58,18 +57,12 @@
 fails.
 </P>
 <P>
-A program that is linked with PCRE 8.20 or later can tell if JIT support is
-available by calling <b>pcre_config()</b> with the PCRE_CONFIG_JIT option. The
-result is 1 when JIT is available, and 0 otherwise. However, a simple program
-does not need to check this in order to use JIT. The API is implemented in a
-way that falls back to the ordinary PCRE code if JIT is not available.
+A program can tell if JIT support is available by calling <b>pcre_config()</b>
+with the PCRE_CONFIG_JIT option. The result is 1 when JIT is available, and 0
+otherwise. However, a simple program does not need to check this in order to
+use JIT. The API is implemented in a way that falls back to the ordinary PCRE
+code if JIT is not available.
 </P>
-<P>
-If your program may sometimes be linked with versions of PCRE that are older
-than 8.20, but you want to use JIT when it is available, you can test
-the values of PCRE_MAJOR and PCRE_MINOR, or the existence of a JIT macro such
-as PCRE_CONFIG_JIT, for compile-time control of your code.
-</P>
 <br><a name="SEC3" href="#TOC1">SIMPLE USE OF JIT</a><br>
 <P>
 You have to do two things to make use of the JIT support in the simplest way:
@@ -82,21 +75,6 @@
       no longer needed instead of just freeing it yourself. This
       ensures that any JIT data is also freed.
 </pre>
-For a program that may be linked with pre-8.20 versions of PCRE, you can insert
-<pre>
-  #ifndef PCRE_STUDY_JIT_COMPILE
-  #define PCRE_STUDY_JIT_COMPILE 0
-  #endif
-</pre>
-so that no option is passed to <b>pcre_study()</b>, and then use something like
-this to free the study data:
-<pre>
-  #ifdef PCRE_CONFIG_JIT
-      pcre_free_study(study_ptr);
-  #else
-      pcre_free(study_ptr);
-  #endif
-</pre>
 In some circumstances you may need to call additional functions. These are
 described in the section entitled
 <a href="#stackcontrol">"Controlling the JIT stack"</a>
@@ -138,8 +116,12 @@
 <P>
 The unsupported pattern items are:
 <pre>
-  \C             match a single byte; not supported in UTF-8 mode
+  \C            match a single byte; not supported in UTF-8 mode
   (?Cn)          callouts
+  (?(&#60;name&#62;)...  conditional test on setting of a named subpattern
+  (?(R)...       conditional test on whole pattern recursion
+  (?(Rn)...      conditional test on recursion, by number
+  (?(R&name)...  conditional test on recursion, by name
   (*COMMIT)      )
   (*MARK)        )
   (*PRUNE)       ) the backtracking control verbs
@@ -185,10 +167,7 @@
 By default, it uses 32K on the machine stack. However, some large or
 complicated patterns need more than this. The error PCRE_ERROR_JIT_STACKLIMIT
 is given when there is not enough stack. Three functions are provided for
-managing blocks of memory for use as JIT stacks. There is further discussion
-about the use of JIT stacks in the section entitled
-<a href="#stackcontrol">"JIT stack FAQ"</a>
-below.
+managing blocks of memory for use as JIT stacks.
 </P>
 <P>
 The <b>pcre_jit_stack_alloc()</b> function creates a JIT stack. Its arguments
@@ -255,87 +234,9 @@
 and <b>pcre_assign_jit_stack()</b> does nothing unless the <b>extra</b> argument
 is non-NULL and points to a <b>pcre_extra</b> block that is the result of a
 successful study with PCRE_STUDY_JIT_COMPILE.
-<a name="stackfaq"></a></P>
-<br><a name="SEC8" href="#TOC1">JIT STACK FAQ</a><br>
-<P>
-(1) Why do we need JIT stacks?
-<br>
-<br>
-PCRE (and JIT) is a recursive, depth-first engine, so it needs a stack where
-the local data of the current node is pushed before checking its child nodes.
-Allocating real machine stack on some platforms is difficult. For example, the
-stack chain needs to be updated every time if we extend the stack on PowerPC.
-Although it is possible, its updating time overhead decreases performance. So
-we do the recursion in memory.
 </P>
+<br><a name="SEC8" href="#TOC1">EXAMPLE CODE</a><br>
 <P>
-(2) Why don't we simply allocate blocks of memory with <b>malloc()</b>?
-<br>
-<br>
-Modern operating systems have a nice feature: they can reserve an address space
-instead of allocating memory. We can safely allocate memory pages inside this
-address space, so the stack could grow without moving memory data (this is
-important because of pointers). Thus we can allocate 1M address space, and use
-only a single memory page (usually 4K) if that is enough. However, we can still
-grow up to 1M anytime if needed.
-</P>
-<P>
-(3) Who "owns" a JIT stack?
-<br>
-<br>
-The owner of the stack is the user program, not the JIT studied pattern or
-anything else. The user program must ensure that if a stack is used by
-<b>pcre_exec()</b>, (that is, it is assigned to the pattern currently running),
-that stack must not be used by any other threads (to avoid overwriting the same
-memory area). The best practice for multithreaded programs is to allocate a
-stack for each thread, and return this stack through the JIT callback function.
-</P>
-<P>
-(4) When should a JIT stack be freed?
-<br>
-<br>
-You can free a JIT stack at any time, as long as it will not be used by
-<b>pcre_exec()</b> again. When you assign the stack to a pattern, only a pointer
-is set. There is no reference counting or any other magic. You can free the
-patterns and stacks in any order, anytime. Just <i>do not</i> call
-<b>pcre_exec()</b> with a pattern pointing to an already freed stack, as that
-will cause SEGFAULT. (Also, do not free a stack currently used by
-<b>pcre_exec()</b> in another thread). You can also replace the stack for a
-pattern at any time. You can even free the previous stack before assigning a
-replacement.
-</P>
-<P>
-(5) Should I allocate/free a stack every time before/after calling
-<b>pcre_exec()</b>?
-<br>
-<br>
-No, because this is too costly in terms of resources. However, you could
-implement some clever idea which release the stack if it is not used in let's
-say two minutes. The JIT callback can help to achive this without keeping a
-list of the currently JIT studied patterns.
-</P>
-<P>
-(6) OK, the stack is for long term memory allocation. But what happens if a
-pattern causes stack overflow with a stack of 1M? Is that 1M kept until the
-stack is freed?
-<br>
-<br>
-Especially on embedded sytems, it might be a good idea to release
-memory sometimes without freeing the stack. There is no API for this at the
-moment. Probably a function call which returns with the currently allocated
-memory for any stack and another which allows releasing memory (shrinking the
-stack) would be a good idea if someone needs this.
-</P>
-<P>
-(7) This is too much of a headache. Isn't there any better solution for JIT
-stack handling?
-<br>
-<br>
-No, thanks to Windows. If POSIX threads were used everywhere, we could throw
-out this complicated API.
-</P>
-<br><a name="SEC9" href="#TOC1">EXAMPLE CODE</a><br>
-<P>
 This is a single-threaded example that specifies a JIT stack without using a
 callback.
 <pre>
@@ -359,22 +260,22 @@


</PRE>
</P>
-<br><a name="SEC10" href="#TOC1">SEE ALSO</a><br>
+<br><a name="SEC9" href="#TOC1">SEE ALSO</a><br>
<P>
<b>pcreapi</b>(3)
</P>
-<br><a name="SEC11" href="#TOC1">AUTHOR</a><br>
+<br><a name="SEC10" href="#TOC1">AUTHOR</a><br>
<P>
-Philip Hazel (FAQ by Zoltan Herczeg)
+Philip Hazel
<br>
University Computing Service
<br>
Cambridge CB2 3QH, England.
<br>
</P>
-<br><a name="SEC12" href="#TOC1">REVISION</a><br>
+<br><a name="SEC11" href="#TOC1">REVISION</a><br>
<P>
-Last updated: 26 November 2011
+Last updated: 19 October 2011
<br>
Copyright &copy; 1997-2011 University of Cambridge.
<br>

Modified: code/trunk/doc/html/pcrelimits.html
===================================================================
--- code/trunk/doc/html/pcrelimits.html    2011-12-28 15:53:12 UTC (rev 834)
+++ code/trunk/doc/html/pcrelimits.html    2011-12-28 16:10:09 UTC (rev 835)
@@ -37,12 +37,6 @@
 no more than 65535 capturing subpatterns.
 </P>
 <P>
-There is a limit to the number of forward references to subsequent subpatterns
-of around 200,000. Repeated forward references with fixed upper limits, for
-example, (?2){0,100} when subpattern number 2 is to the right, are included in
-the count. There is no limit to the number of backward references.
-</P>
-<P>
 The maximum length of name for a named subpattern is 32 characters, and the
 maximum number of named subpatterns is 10000.
 </P>
@@ -71,7 +65,7 @@
 REVISION
 </b><br>
 <P>
-Last updated: 30 November 2011
+Last updated: 24 August 2011
 <br>
 Copyright &copy; 1997-2011 University of Cambridge.
 <br>


Modified: code/trunk/doc/html/pcrematching.html
===================================================================
--- code/trunk/doc/html/pcrematching.html    2011-12-28 15:53:12 UTC (rev 834)
+++ code/trunk/doc/html/pcrematching.html    2011-12-28 16:10:09 UTC (rev 835)
@@ -164,9 +164,9 @@
 </P>
 <P>
 7. The \C escape sequence, which (in the standard algorithm) matches a single
-byte, even in UTF-8 mode, is not supported in UTF-8 mode, because the
-alternative algorithm moves through the subject string one character at a time,
-for all active paths through the tree.
+byte, even in UTF-8 mode, is not supported because the alternative algorithm
+moves through the subject string one character at a time, for all active paths
+through the tree.
 </P>
 <P>
 8. Except for (*FAIL), the backtracking control verbs such as (*PRUNE) are not
@@ -220,7 +220,7 @@
 </P>
 <br><a name="SEC8" href="#TOC1">REVISION</a><br>
 <P>
-Last updated: 19 November 2011
+Last updated: 17 November 2010
 <br>
 Copyright &copy; 1997-2010 University of Cambridge.
 <br>
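Note on item 7 in the pcrematching.html hunk above: the restriction on \C follows from the fact that one UTF-8 character can occupy more than one byte, so a byte-level match has no natural place in an algorithm that advances one character at a time. The Python sketch below only illustrates that byte/character mismatch. It does not call PCRE, and the sample subject string is an arbitrary assumption chosen for illustration.

```python
# Illustration only: compare character and byte counts for a UTF-8 subject.
# The sample string is arbitrary; no PCRE function is called here.
subject = "caf\u00e9"              # four characters, the last one non-ASCII
encoded = subject.encode("utf-8")  # the bytes a UTF-8 matcher would scan

char_count = len(subject)
byte_count = len(encoded)

# Byte offset at which each character starts in the encoded subject.
char_starts = []
offset = 0
for ch in subject:
    char_starts.append(offset)
    offset += len(ch.encode("utf-8"))

print(char_count, byte_count, char_starts)
```

The last character starts at byte offset 3 and spans two bytes, so a step of "one character" and a step of "one byte" stop at different positions in the subject.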


Modified: code/trunk/doc/html/pcrepattern.html
===================================================================
--- code/trunk/doc/html/pcrepattern.html    2011-12-28 15:53:12 UTC (rev 834)
+++ code/trunk/doc/html/pcrepattern.html    2011-12-28 16:10:09 UTC (rev 835)
@@ -268,8 +268,7 @@
   \t        tab (hex 09)
   \ddd      character with octal code ddd, or back reference
   \xhh      character with hex code hh
-  \x{hhh..} character with hex code hhh.. (non-JavaScript mode)
-  \uhhhh    character with hex code hhhh (JavaScript mode only)
+  \x{hhh..} character with hex code hhh..
 </pre>
 The precise effect of \cx is as follows: if x is a lower case letter, it
 is converted to upper case. Then bit 6 of the character (hex 40) is inverted.
@@ -281,12 +280,12 @@
 0xc0 bits are flipped.)
 </P>
 <P>
-By default, after \x, from zero to two hexadecimal digits are read (letters
-can be in upper or lower case). Any number of hexadecimal digits may appear
-between \x{ and }, but the value of the character code must be less than 256
-in non-UTF-8 mode, and less than 2**31 in UTF-8 mode. That is, the maximum
-value in hexadecimal is 7FFFFFFF. Note that this is bigger than the largest
-Unicode code point, which is 10FFFF.
+After \x, from zero to two hexadecimal digits are read (letters can be in
+upper or lower case). Any number of hexadecimal digits may appear between \x{
+and }, but the value of the character code must be less than 256 in non-UTF-8
+mode, and less than 2**31 in UTF-8 mode. That is, the maximum value in
+hexadecimal is 7FFFFFFF. Note that this is bigger than the largest Unicode code
+point, which is 10FFFF.
 </P>
 <P>
 If characters other than hexadecimal digits appear between \x{ and }, or if
@@ -295,17 +294,9 @@
 following digits, giving a character whose value is zero.
 </P>
 <P>
-If the PCRE_JAVASCRIPT_COMPAT option is set, the interpretation of \x is
-as just described only when it is followed by two hexadecimal digits.
-Otherwise, it matches a literal "x" character. In JavaScript mode, support for
-code points greater than 256 is provided by \u, which must be followed by
-four hexadecimal digits; otherwise it matches a literal "u" character.
-</P>
-<P>
 Characters whose value is less than 256 can be defined by either of the two
-syntaxes for \x (or by \u in JavaScript mode). There is no difference in the
-way they are handled. For example, \xdc is exactly the same as \x{dc} (or
-\u00dc in JavaScript mode).
+syntaxes for \x. There is no difference in the way they are handled. For
+example, \xdc is exactly the same as \x{dc}.
 </P>
 <P>
 After \0 up to two further octal digits are read. If there are fewer than two
@@ -347,27 +338,14 @@
 </P>
 <P>
 All the sequences that define a single character value can be used both inside
-and outside character classes. In addition, inside a character class, \b is
-interpreted as the backspace character (hex 08).
+and outside character classes. In addition, inside a character class, the
+sequence \b is interpreted as the backspace character (hex 08). The sequences
+\B, \N, \R, and \X are not special inside a character class. Like any other
+unrecognized escape sequences, they are treated as the literal characters "B",
+"N", "R", and "X" by default, but cause an error if the PCRE_EXTRA option is
+set. Outside a character class, these sequences have different meanings.
 </P>
-<P>
-\N is not allowed in a character class. \B, \R, and \X are not special
-inside a character class. Like other unrecognized escape sequences, they are
-treated as the literal characters "B", "R", and "X" by default, but cause an
-error if the PCRE_EXTRA option is set. Outside a character class, these
-sequences have different meanings.
-</P>
 <br><b>
-Unsupported escape sequences
-</b><br>
-<P>
-In Perl, the sequences \l, \L, \u, and \U are recognized by its string
-handler and used to modify the case of following characters. By default, PCRE
-does not support these escape sequences. However, if the PCRE_JAVASCRIPT_COMPAT
-option is set, \U matches a "U" character, and \u can be used to define a
-character by code point, as described in the previous section.
-</P>
-<br><b>
 Absolute and relative back references
 </b><br>
 <P>
@@ -411,8 +389,7 @@
 There is also the single sequence \N, which matches a non-newline character.
 This is the same as
 <a href="#fullstopdot">the "." metacharacter</a>
-when PCRE_DOTALL is not set. Perl also uses \N to match characters by name;
-PCRE does not support this.
+when PCRE_DOTALL is not set.
 </P>
 <P>
 Each pair of lower and upper case escape sequences partitions the complete set
@@ -986,8 +963,7 @@
 <P>
 The escape sequence \N behaves like a dot, except that it is not affected by
 the PCRE_DOTALL option. In other words, it matches any character except one
-that signifies the end of a line. Perl also uses \N to match characters by
-name; PCRE does not support this.
+that signifies the end of a line.
 </P>
 <br><a name="SEC7" href="#TOC1">MATCHING A SINGLE BYTE</a><br>
 <P>
@@ -1003,8 +979,8 @@
 </P>
 <P>
 PCRE does not allow \C to appear in lookbehind assertions
-<a href="#lookbehind">(described below)</a>
-in UTF-8 mode, because this would make it impossible to calculate the length of
+<a href="#lookbehind">(described below),</a>
+because in UTF-8 mode this would make it impossible to calculate the length of
 the lookbehind.
 </P>
 <P>
@@ -1950,10 +1926,10 @@
 assertion fails.
 </P>
 <P>
-In UTF-8 mode, PCRE does not allow the \C escape (which matches a single byte,
-even in UTF-8 mode) to appear in lookbehind assertions, because it makes it
-impossible to calculate the length of the lookbehind. The \X and \R escapes,
-which can match different numbers of bytes, are also not permitted.
+PCRE does not allow the \C escape (which matches a single byte in UTF-8 mode)
+to appear in lookbehind assertions, because it makes it impossible to calculate
+the length of the lookbehind. The \X and \R escapes, which can match
+different numbers of bytes, are also not permitted.
 </P>
 <P>
 <a href="#subpatternsassubroutines">"Subroutine"</a>
@@ -2535,11 +2511,10 @@
 If any of these verbs are used in an assertion or in a subpattern that is
 called as a subroutine (whether or not recursively), their effect is confined
 to that subpattern; it does not extend to the surrounding pattern, with one
-exception: the name from a *(MARK), (*PRUNE), or (*THEN) that is encountered in
-a successful positive assertion <i>is</i> passed back when a match succeeds
-(compare capturing parentheses in assertions). Note that such subpatterns are
-processed as anchored at the point where they are tested. Note also that Perl's
-treatment of subroutines is different in some cases.
+exception: a *MARK that is encountered in a positive assertion <i>is</i> passed
+back (compare capturing parentheses in assertions). Note that such subpatterns
+are processed as anchored at the point where they are tested. Note also that
+Perl's treatment of subroutines is different in some cases.
 </P>
 <P>
 The new verbs make use of what was previously invalid syntax: an opening
@@ -2561,10 +2536,6 @@
 when calling <b>pcre_compile()</b> or <b>pcre_exec()</b>, or by starting the
 pattern with (*NO_START_OPT).
 </P>
-<P>
-Experiments with Perl suggest that it too has similar optimizations, sometimes
-leading to anomalous results.
-</P>
 <br><b>
 Verbs that act immediately
 </b><br>
@@ -2612,17 +2583,17 @@
 (*MARK) as you like in a pattern, and their names do not have to be unique.
 </P>
 <P>
-When a match succeeds, the name of the last-encountered (*MARK) on the matching
-path is passed back to the caller via the <i>pcre_extra</i> data structure, as
-described in the
+When a match succeeds, the name of the last-encountered (*MARK) is passed back
+to the caller via the <i>pcre_extra</i> data structure, as described in the
 <a href="pcreapi.html#extradata">section on <i>pcre_extra</i></a>
 in the
 <a href="pcreapi.html"><b>pcreapi</b></a>
-documentation. Here is an example of <b>pcretest</b> output, where the /K
-modifier requests the retrieval and outputting of (*MARK) data:
+documentation. No data is returned for a partial match. Here is an example of
+<b>pcretest</b> output, where the /K modifier requests the retrieval and
+outputting of (*MARK) data:
 <pre>
-    re&#62; /X(*MARK:A)Y|X(*MARK:B)Z/K
-  data&#62; XY
+  /X(*MARK:A)Y|X(*MARK:B)Z/K
+  XY
    0: XY
   MK: A
   XZ
@@ -2640,18 +2611,33 @@
 assertions.
 </P>
 <P>
-After a partial match or a failed match, the name of the last encountered
-(*MARK) in the entire match process is returned. For example:
+A name may also be returned after a failed match if the final path through the
+pattern involves (*MARK). However, unless (*MARK) used in conjunction with
+(*COMMIT), this is unlikely to happen for an unanchored pattern because, as the
+starting point for matching is advanced, the final check is often with an empty
+string, causing a failure before (*MARK) is reached. For example:
 <pre>
-    re&#62; /X(*MARK:A)Y|X(*MARK:B)Z/K
-  data&#62; XP
+  /X(*MARK:A)Y|X(*MARK:B)Z/K
+  XP
+  No match
+</pre>
+There are three potential starting points for this match (starting with X,
+starting with P, and with an empty string). If the pattern is anchored, the
+result is different:
+<pre>
+  /^X(*MARK:A)Y|^X(*MARK:B)Z/K
+  XP
   No match, mark = B
 </pre>
-Note that in this unanchored example the mark is retained from the match
-attempt that started at the letter "X". Subsequent match attempts starting at
-"P" and then with an empty string do not get as far as the (*MARK) item, but
-nevertheless do not reset it.
+PCRE's start-of-match optimizations can also interfere with this. For example,
+if, as a result of a call to <b>pcre_study()</b>, it knows the minimum
+subject length for a match, a shorter subject will not be scanned at all.
 </P>
+<P>
+Note that similar anomalies (though different in detail) exist in Perl, no
+doubt for the same reasons. The use of (*MARK) data after a failed match of an
+unanchored pattern is not recommended, unless (*COMMIT) is involved.
+</P>
 <br><b>
 Verbs that act after backtracking
 </b><br>
@@ -2689,8 +2675,8 @@
 unless PCRE's start-of-match optimizations are turned off, as shown in this
 <b>pcretest</b> example:
 <pre>
-    re&#62; /(*COMMIT)abc/
-  data&#62; xyzabc
+  /(*COMMIT)abc/
+  xyzabc
    0: abc
   xyzabc\Y
   No match
@@ -2711,8 +2697,10 @@
 the right, backtracking cannot cross (*PRUNE). In simple cases, the use of
 (*PRUNE) is just an alternative to an atomic group or possessive quantifier,
 but there are some uses of (*PRUNE) that cannot be expressed in any other way.
-The behaviour of (*PRUNE:NAME) is the same as (*MARK:NAME)(*PRUNE). In an
-anchored pattern (*PRUNE) has the same effect as (*COMMIT).
+The behaviour of (*PRUNE:NAME) is the same as (*MARK:NAME)(*PRUNE) when the
+match fails completely; the name is passed back if this is the final attempt.
+(*PRUNE:NAME) does not pass back a name if the match succeeds. In an anchored
+pattern (*PRUNE) has the same effect as (*COMMIT).
 <pre>
   (*SKIP)
 </pre>
@@ -2738,7 +2726,8 @@
 searched for the most recent (*MARK) that has the same name. If one is found,
 the "bumpalong" advance is to the subject position that corresponds to that
 (*MARK) instead of to where (*SKIP) was encountered. If no (*MARK) with a
-matching name is found, the (*SKIP) is ignored.
+matching name is found, normal "bumpalong" of one character happens (that is,
+the (*SKIP) is ignored).
 <pre>
   (*THEN) or (*THEN:NAME)
 </pre>
@@ -2752,8 +2741,9 @@
 If the COND1 pattern matches, FOO is tried (and possibly further items after
 the end of the group if FOO succeeds); on failure, the matcher skips to the
 second alternative and tries COND2, without backtracking into COND1. The
-behaviour of (*THEN:NAME) is exactly the same as (*MARK:NAME)(*THEN).
-If (*THEN) is not inside an alternation, it acts like (*PRUNE).
+behaviour of (*THEN:NAME) is exactly the same as (*MARK:NAME)(*THEN) if the
+overall match fails. If (*THEN) is not inside an alternation, it acts like
+(*PRUNE).
 </P>
 <P>
 Note that a subpattern that does not contain a | character is just a part of
@@ -2829,7 +2819,7 @@
 </P>
 <br><a name="SEC28" href="#TOC1">REVISION</a><br>
 <P>
-Last updated: 29 November 2011
+Last updated: 19 October 2011
 <br>
 Copyright &copy; 1997-2011 University of Cambridge.
 <br>
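Note on the \x{hhh..} paragraph in the pcrepattern.html hunks above: the stated limits can be written down as a small range check. The Python sketch below is a hedged illustration of the documented rule only. The function name x_brace_value_allowed and its parameters are hypothetical and are not part of the PCRE API.

```python
# Hypothetical helper: mirrors the documented limits for \x{hhh..}.
# Not a PCRE function; it only restates the rule quoted in the text.
LARGEST_UNICODE_CODE_POINT = 0x10FFFF

def x_brace_value_allowed(hex_digits: str, utf8: bool) -> bool:
    """Return True if the value of \\x{hex_digits} is within the documented limit."""
    value = int(hex_digits, 16)
    # Less than 256 in non-UTF-8 mode, less than 2**31 in UTF-8 mode.
    limit = 2**31 if utf8 else 256
    return value < limit

for digits, utf8 in [("dc", False), ("100", False), ("7FFFFFFF", True), ("80000000", True)]:
    print(digits, "UTF-8" if utf8 else "non-UTF-8", x_brace_value_allowed(digits, utf8))
```

As the text notes, the UTF-8 mode maximum of 7FFFFFFF is larger than the largest Unicode code point (10FFFF), so a value can pass this range check without being a valid Unicode code point.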


Modified: code/trunk/doc/html/pcretest.html
===================================================================
--- code/trunk/doc/html/pcretest.html    2011-12-28 15:53:12 UTC (rev 834)
+++ code/trunk/doc/html/pcretest.html    2011-12-28 16:10:09 UTC (rev 835)
@@ -364,10 +364,7 @@
 </P>
 <P>
 The <b>/M</b> modifier causes the size of memory block used to hold the compiled
-pattern to be output. This does not include the size of the <b>pcre</b> block;
-it is just the actual compiled data. If the pattern is successfully studied
-with the PCRE_STUDY_JIT_COMPILE option, the size of the JIT compiled code is
-also output.
+pattern to be output.
 </P>
 <P>
 If the <b>/S</b> modifier appears once, it causes <b>pcre_study()</b> to be
@@ -859,7 +856,7 @@
 </P>
 <br><a name="SEC15" href="#TOC1">REVISION</a><br>
 <P>
-Last updated: 02 December 2011
+Last updated: 26 August 2011
 <br>
 Copyright &copy; 1997-2011 University of Cambridge.
 <br>


Modified: code/trunk/doc/pcre.txt
===================================================================
--- code/trunk/doc/pcre.txt    2011-12-28 15:53:12 UTC (rev 834)
+++ code/trunk/doc/pcre.txt    2011-12-28 16:10:09 UTC (rev 835)
@@ -633,9 +633,9 @@
        always 1, and the value of the capture_last field is always -1.


        7.  The \C escape sequence, which (in the standard algorithm) matches a
-       single byte, even in UTF-8  mode,  is  not  supported  in  UTF-8  mode,
-       because  the alternative algorithm moves through the subject string one
-       character at a time, for all active paths through the tree.
+       single byte, even in UTF-8 mode, is not supported because the  alterna-
+       tive  algorithm  moves  through  the  subject string one character at a
+       time, for all active paths through the tree.


        8. Except for (*FAIL), the backtracking control verbs such as  (*PRUNE)
        are  not  supported.  (*FAIL)  is supported, and behaves like a failing
@@ -685,7 +685,7 @@


REVISION

-       Last updated: 19 November 2011
+       Last updated: 17 November 2010
        Copyright (c) 1997-2010 University of Cambridge.
 ------------------------------------------------------------------------------


@@ -1256,20 +1256,6 @@
        set  (assuming  it can find an "a" in the subject), whereas it fails by
        default, for Perl compatibility.


-       (3) \U matches an upper case "U" character; by default \U causes a com-
-       pile time error (Perl uses \U to upper case subsequent characters).
-
-       (4) \u matches a lower case "u" character unless it is followed by four
-       hexadecimal digits, in which case the hexadecimal  number  defines  the
-       code  point  to match. By default, \u causes a compile time error (Perl
-       uses it to upper case the following character).
-
-       (5) \x matches a lower case "x" character unless it is followed by  two
-       hexadecimal  digits,  in  which case the hexadecimal number defines the
-       code point to match. By default, as in Perl, a  hexadecimal  number  is
-       always expected after \x, but it may have zero, one, or two digits (so,
-       for example, \xz matches a binary zero character followed by z).
-
          PCRE_MULTILINE


        By default, PCRE treats the subject string as consisting  of  a  single
@@ -1724,12 +1710,6 @@
        compiler could not handle this particular pattern. See the pcrejit doc-
        umentation for details of what can and cannot be handled.


-         PCRE_INFO_JITSIZE
-
-       If the pattern was successfully studied with the PCRE_STUDY_JIT_COMPILE
-       option, return the size of the  JIT  compiled  code,  otherwise  return
-       zero. The fourth argument should point to a size_t variable.
-
          PCRE_INFO_LASTLITERAL


        Return  the  value of the rightmost literal byte that must exist in any
@@ -1838,14 +1818,10 @@


          PCRE_INFO_SIZE


-       Return  the  size  of  the compiled pattern. The fourth argument should
-       point to a size_t variable. This value does not include the size of the
-       pcre  structure  that  is returned by pcre_compile(). The value that is
-       passed as the argument to pcre_malloc() when pcre_compile() is  getting
-       memory  in  which  to  place the compiled data is the value returned by
-       this option plus the size of the pcre structure.  Studying  a  compiled
-       pattern, with or without JIT, does not alter the value returned by this
-       option.
+       Return  the  size  of the compiled pattern, that is, the value that was
+       passed as the argument to pcre_malloc() when PCRE was getting memory in
+       which to place the compiled data. The fourth argument should point to a
+       size_t variable.


          PCRE_INFO_STUDYSIZE


@@ -3004,7 +2980,7 @@

REVISION

-       Last updated: 02 December 2011
+       Last updated: 23 September 2011
        Copyright (c) 1997-2011 University of Cambridge.
 ------------------------------------------------------------------------------


@@ -3167,11 +3143,9 @@

        The mark field is present from version 2 of the pcre_callout structure.
        In  callouts  from pcre_exec() it contains a pointer to the zero-termi-
-       nated name of the most recently passed (*MARK),  (*PRUNE),  or  (*THEN)
-       item in the match, or NULL if no such items have been passed. Instances
-       of (*PRUNE) or (*THEN) without a name  do  not  obliterate  a  previous
-       (*MARK).  In  callouts  from pcre_dfa_exec() this field always contains
-       NULL.
+       nated name of the most recently passed (*MARK) item in  the  match,  or
+       NULL if there are no (*MARK)s in the current matching path. In callouts
+       from pcre_dfa_exec() this field always contains NULL.



RETURN VALUES
@@ -3199,7 +3173,7 @@

REVISION

-       Last updated: 30 November 2011
+       Last updated: 26 August 2011
        Copyright (c) 1997-2011 University of Cambridge.
 ------------------------------------------------------------------------------


@@ -3244,9 +3218,7 @@
        its own, matching a non-newline character, is supported.) In fact these
        are implemented by Perl's general string-handling and are not  part  of
        its  pattern  matching engine. If any of these are encountered by PCRE,
-       an error is generated by default. However, if the  PCRE_JAVASCRIPT_COM-
-       PAT  option  is set, \U and \u are interpreted as JavaScript interprets
-       them.
+       an error is generated.


        6. The Perl escape sequences \p, \P, and \X are supported only if  PCRE
        is  built  with Unicode character property support. The properties that
@@ -3373,7 +3345,7 @@


REVISION

-       Last updated: 14 November 2011
+       Last updated: 09 October 2011
        Copyright (c) 1997-2011 University of Cambridge.
 ------------------------------------------------------------------------------


@@ -3600,8 +3572,7 @@
          \t        tab (hex 09)
          \ddd      character with octal code ddd, or back reference
          \xhh      character with hex code hh
-         \x{hhh..} character with hex code hhh.. (non-JavaScript mode)
-         \uhhhh    character with hex code hhhh (JavaScript mode only)
+         \x{hhh..} character with hex code hhh..


        The precise effect of \cx is as follows: if x is a lower  case  letter,
        it  is converted to upper case. Then bit 6 of the character (hex 40) is
@@ -3612,12 +3583,12 @@
        is compiled in EBCDIC mode, all byte values are  valid.  A  lower  case
        letter is converted to upper case, and then the 0xc0 bits are flipped.)


-       By  default,  after  \x,  from  zero to two hexadecimal digits are read
-       (letters can be in upper or lower case). Any number of hexadecimal dig-
-       its  may  appear between \x{ and }, but the value of the character code
-       must be less than 256 in non-UTF-8 mode, and less than 2**31  in  UTF-8
-       mode.  That is, the maximum value in hexadecimal is 7FFFFFFF. Note that
-       this is bigger than the largest Unicode code point, which is 10FFFF.
+       After  \x, from zero to two hexadecimal digits are read (letters can be
+       in upper or lower case). Any number of hexadecimal  digits  may  appear
+       between  \x{  and  },  but the value of the character code must be less
+       than 256 in non-UTF-8 mode, and less than 2**31 in UTF-8 mode. That is,
+       the  maximum value in hexadecimal is 7FFFFFFF. Note that this is bigger
+       than the largest Unicode code point, which is 10FFFF.


        If characters other than hexadecimal digits appear between \x{  and  },
        or if there is no terminating }, this form of escape is not recognized.
@@ -3625,17 +3596,9 @@
        escape,  with  no  following  digits, giving a character whose value is
        zero.


-       If the PCRE_JAVASCRIPT_COMPAT option is set, the interpretation  of  \x
-       is  as  just described only when it is followed by two hexadecimal dig-
-       its.  Otherwise, it matches a  literal  "x"  character.  In  JavaScript
-       mode, support for code points greater than 256 is provided by \u, which
-       must be followed by four hexadecimal digits;  otherwise  it  matches  a
-       literal "u" character.
-
        Characters whose value is less than 256 can be defined by either of the
-       two syntaxes for \x (or by \u in JavaScript mode). There is no  differ-
-       ence in the way they are handled. For example, \xdc is exactly the same
-       as \x{dc} (or \u00dc in JavaScript mode).
+       two  syntaxes  for  \x. There is no difference in the way they are han-
+       dled. For example, \xdc is exactly the same as \x{dc}.


        After \0 up to two further octal digits are read. If  there  are  fewer
        than  two  digits,  just  those  that  are  present  are used. Thus the
@@ -3679,23 +3642,13 @@


        All the sequences that define a single character value can be used both
        inside and outside character classes. In addition, inside  a  character
-       class, \b is interpreted as the backspace character (hex 08).
+       class,  the  sequence \b is interpreted as the backspace character (hex
+       08). The sequences \B, \N, \R, and \X are not special inside a  charac-
+       ter  class.  Like  any  other  unrecognized  escape sequences, they are
+       treated as the literal characters "B", "N", "R", and  "X"  by  default,
+       but cause an error if the PCRE_EXTRA option is set. Outside a character
+       class, these sequences have different meanings.


-       \N  is not allowed in a character class. \B, \R, and \X are not special
-       inside a character class. Like  other  unrecognized  escape  sequences,
-       they  are  treated  as  the  literal  characters  "B",  "R", and "X" by
-       default, but cause an error if the PCRE_EXTRA option is set. Outside  a
-       character class, these sequences have different meanings.
-
-   Unsupported escape sequences
-
-       In  Perl, the sequences \l, \L, \u, and \U are recognized by its string
-       handler and used  to  modify  the  case  of  following  characters.  By
-       default,  PCRE does not support these escape sequences. However, if the
-       PCRE_JAVASCRIPT_COMPAT option is set, \U matches a "U"  character,  and
-       \u can be used to define a character by code point, as described in the
-       previous section.
-
    Absolute and relative back references


        The sequence \g followed by an unsigned or a negative  number,  option-
@@ -3729,54 +3682,53 @@


        There is also the single sequence \N, which matches a non-newline char-
        acter.   This  is the same as the "." metacharacter when PCRE_DOTALL is
-       not set. Perl also uses \N to match characters by name; PCRE  does  not
-       support this.
+       not set.


-       Each  pair of lower and upper case escape sequences partitions the com-
-       plete set of characters into two disjoint  sets.  Any  given  character
-       matches  one, and only one, of each pair. The sequences can appear both
-       inside and outside character classes. They each match one character  of
-       the  appropriate  type.  If the current matching point is at the end of
-       the subject string, all of them fail, because there is no character  to
+       Each pair of lower and upper case escape sequences partitions the  com-
+       plete  set  of  characters  into two disjoint sets. Any given character
+       matches one, and only one, of each pair. The sequences can appear  both
+       inside  and outside character classes. They each match one character of
+       the appropriate type. If the current matching point is at  the  end  of
+       the  subject string, all of them fail, because there is no character to
        match.


-       For  compatibility  with Perl, \s does not match the VT character (code
-       11).  This makes it different from the the POSIX "space" class. The  \s
-       characters  are  HT  (9), LF (10), FF (12), CR (13), and space (32). If
+       For compatibility with Perl, \s does not match the VT  character  (code
+       11).   This makes it different from the the POSIX "space" class. The \s
+       characters are HT (9), LF (10), FF (12), CR (13), and  space  (32).  If
        "use locale;" is included in a Perl script, \s may match the VT charac-
        ter. In PCRE, it never does.


-       A  "word"  character is an underscore or any character that is a letter
-       or digit.  By default, the definition of letters  and  digits  is  con-
-       trolled  by PCRE's low-valued character tables, and may vary if locale-
-       specific matching is taking place (see "Locale support" in the  pcreapi
-       page).  For  example,  in  a French locale such as "fr_FR" in Unix-like
-       systems, or "french" in Windows, some character codes greater than  128
-       are  used  for  accented letters, and these are then matched by \w. The
+       A "word" character is an underscore or any character that is  a  letter
+       or  digit.   By  default,  the definition of letters and digits is con-
+       trolled by PCRE's low-valued character tables, and may vary if  locale-
+       specific  matching is taking place (see "Locale support" in the pcreapi
+       page). For example, in a French locale such  as  "fr_FR"  in  Unix-like
+       systems,  or "french" in Windows, some character codes greater than 128
+       are used for accented letters, and these are then matched  by  \w.  The
        use of locales with Unicode is discouraged.


-       By default, in UTF-8 mode, characters  with  values  greater  than  128
-       never  match  \d,  \s,  or  \w,  and always match \D, \S, and \W. These
-       sequences retain their original meanings from before UTF-8 support  was
-       available,  mainly for efficiency reasons. However, if PCRE is compiled
-       with Unicode property support, and the PCRE_UCP option is set, the  be-
-       haviour  is  changed  so  that Unicode properties are used to determine
+       By  default,  in  UTF-8  mode,  characters with values greater than 128
+       never match \d, \s, or \w, and always  match  \D,  \S,  and  \W.  These
+       sequences  retain their original meanings from before UTF-8 support was
+       available, mainly for efficiency reasons. However, if PCRE is  compiled
+       with  Unicode property support, and the PCRE_UCP option is set, the be-
+       haviour is changed so that Unicode properties  are  used  to  determine
        character types, as follows:


          \d  any character that \p{Nd} matches (decimal digit)
          \s  any character that \p{Z} matches, plus HT, LF, FF, CR
          \w  any character that \p{L} or \p{N} matches, plus underscore


-       The upper case escapes match the inverse sets of characters. Note  that
-       \d  matches  only decimal digits, whereas \w matches any Unicode digit,
-       as well as any Unicode letter, and underscore. Note also that  PCRE_UCP
-       affects  \b,  and  \B  because  they are defined in terms of \w and \W.
+       The  upper case escapes match the inverse sets of characters. Note that
+       \d matches only decimal digits, whereas \w matches any  Unicode  digit,
+       as  well as any Unicode letter, and underscore. Note also that PCRE_UCP
+       affects \b, and \B because they are defined in  terms  of  \w  and  \W.
        Matching these sequences is noticeably slower when PCRE_UCP is set.


-       The sequences \h, \H, \v, and \V are features that were added  to  Perl
-       at  release  5.10. In contrast to the other sequences, which match only
-       ASCII characters by default, these  always  match  certain  high-valued
-       codepoints  in UTF-8 mode, whether or not PCRE_UCP is set. The horizon-
+       The  sequences  \h, \H, \v, and \V are features that were added to Perl
+       at release 5.10. In contrast to the other sequences, which  match  only
+       ASCII  characters  by  default,  these always match certain high-valued
+       codepoints in UTF-8 mode, whether or not PCRE_UCP is set. The  horizon-
        tal space characters are:


          U+0009     Horizontal tab
@@ -3811,104 +3763,104 @@


    Newline sequences


-       Outside a character class, by default, the escape sequence  \R  matches
+       Outside  a  character class, by default, the escape sequence \R matches
        any Unicode newline sequence. In non-UTF-8 mode \R is equivalent to the
        following:


          (?>\r\n|\n|\x0b|\f|\r|\x85)


-       This is an example of an "atomic group", details  of  which  are  given
+       This  is  an  example  of an "atomic group", details of which are given
        below.  This particular group matches either the two-character sequence
-       CR followed by LF, or  one  of  the  single  characters  LF  (linefeed,
+       CR  followed  by  LF,  or  one  of  the single characters LF (linefeed,
        U+000A), VT (vertical tab, U+000B), FF (formfeed, U+000C), CR (carriage
        return, U+000D), or NEL (next line, U+0085). The two-character sequence
        is treated as a single unit that cannot be split.


-       In  UTF-8  mode, two additional characters whose codepoints are greater
+       In UTF-8 mode, two additional characters whose codepoints  are  greater
        than 255 are added: LS (line separator, U+2028) and PS (paragraph sepa-
-       rator,  U+2029).   Unicode character property support is not needed for
+       rator, U+2029).  Unicode character property support is not  needed  for
        these characters to be recognized.


        It is possible to restrict \R to match only CR, LF, or CRLF (instead of
-       the  complete  set  of  Unicode  line  endings)  by  setting the option
+       the complete set  of  Unicode  line  endings)  by  setting  the  option
        PCRE_BSR_ANYCRLF either at compile time or when the pattern is matched.
        (BSR is an abbrevation for "backslash R".) This can be made the default
-       when PCRE is built; if this is the case, the  other  behaviour  can  be
-       requested  via  the  PCRE_BSR_UNICODE  option.   It is also possible to
-       specify these settings by starting a pattern string  with  one  of  the
+       when  PCRE  is  built;  if this is the case, the other behaviour can be
+       requested via the PCRE_BSR_UNICODE option.   It  is  also  possible  to
+       specify  these  settings  by  starting a pattern string with one of the
        following sequences:


          (*BSR_ANYCRLF)   CR, LF, or CRLF only
          (*BSR_UNICODE)   any Unicode newline sequence


-       These  override  the default and the options given to pcre_compile() or
-       pcre_compile2(), but  they  can  be  overridden  by  options  given  to
+       These override the default and the options given to  pcre_compile()  or
+       pcre_compile2(),  but  they  can  be  overridden  by  options  given to
        pcre_exec() or pcre_dfa_exec(). Note that these special settings, which
-       are not Perl-compatible, are recognized only at the  very  start  of  a
-       pattern,  and that they must be in upper case. If more than one of them
+       are  not  Perl-compatible,  are  recognized only at the very start of a
+       pattern, and that they must be in upper case. If more than one of  them
        is present, the last one is used. They can be combined with a change of
        newline convention; for example, a pattern can start with:


          (*ANY)(*BSR_ANYCRLF)


        They can also be combined with the (*UTF8) or (*UCP) special sequences.
-       Inside a character class, \R  is  treated  as  an  unrecognized  escape
+       Inside  a  character  class,  \R  is  treated as an unrecognized escape
        sequence, and so matches the letter "R" by default, but causes an error
        if PCRE_EXTRA is set.


    Unicode character properties


        When PCRE is built with Unicode character property support, three addi-
-       tional  escape sequences that match characters with specific properties
-       are available.  When not in UTF-8 mode, these sequences are  of  course
-       limited  to  testing characters whose codepoints are less than 256, but
+       tional escape sequences that match characters with specific  properties
+       are  available.   When not in UTF-8 mode, these sequences are of course
+       limited to testing characters whose codepoints are less than  256,  but
        they do work in this mode.  The extra escape sequences are:


          \p{xx}   a character with the xx property
          \P{xx}   a character without the xx property
          \X       an extended Unicode sequence


-       The property names represented by xx above are limited to  the  Unicode
+       The  property  names represented by xx above are limited to the Unicode
        script names, the general category properties, "Any", which matches any
-       character  (including  newline),  and  some  special  PCRE   properties
-       (described  in the next section).  Other Perl properties such as "InMu-
-       sicalSymbols" are not currently supported by PCRE.  Note  that  \P{Any}
+       character   (including  newline),  and  some  special  PCRE  properties
+       (described in the next section).  Other Perl properties such as  "InMu-
+       sicalSymbols"  are  not  currently supported by PCRE. Note that \P{Any}
        does not match any characters, so always causes a match failure.


        Sets of Unicode characters are defined as belonging to certain scripts.
-       A character from one of these sets can be matched using a script  name.
+       A  character from one of these sets can be matched using a script name.
        For example:


          \p{Greek}
          \P{Han}


-       Those  that are not part of an identified script are lumped together as
+       Those that are not part of an identified script are lumped together  as
        "Common". The current list of scripts is:


        Arabic, Armenian, Avestan, Balinese, Bamum, Bengali, Bopomofo, Braille,
-       Buginese,  Buhid,  Canadian_Aboriginal, Carian, Cham, Cherokee, Common,
-       Coptic,  Cuneiform,  Cypriot,  Cyrillic,  Deseret,  Devanagari,   Egyp-
-       tian_Hieroglyphs,   Ethiopic,   Georgian,  Glagolitic,  Gothic,  Greek,
-       Gujarati, Gurmukhi,  Han,  Hangul,  Hanunoo,  Hebrew,  Hiragana,  Impe-
+       Buginese, Buhid, Canadian_Aboriginal, Carian, Cham,  Cherokee,  Common,
+       Coptic,   Cuneiform,  Cypriot,  Cyrillic,  Deseret,  Devanagari,  Egyp-
+       tian_Hieroglyphs,  Ethiopic,  Georgian,  Glagolitic,   Gothic,   Greek,
+       Gujarati,  Gurmukhi,  Han,  Hangul,  Hanunoo,  Hebrew,  Hiragana, Impe-
        rial_Aramaic, Inherited, Inscriptional_Pahlavi, Inscriptional_Parthian,
-       Javanese, Kaithi, Kannada, Katakana, Kayah_Li, Kharoshthi, Khmer,  Lao,
+       Javanese,  Kaithi, Kannada, Katakana, Kayah_Li, Kharoshthi, Khmer, Lao,
        Latin,  Lepcha,  Limbu,  Linear_B,  Lisu,  Lycian,  Lydian,  Malayalam,
-       Meetei_Mayek, Mongolian, Myanmar, New_Tai_Lue, Nko, Ogham,  Old_Italic,
-       Old_Persian,  Old_South_Arabian,  Old_Turkic, Ol_Chiki, Oriya, Osmanya,
-       Phags_Pa, Phoenician, Rejang, Runic,  Samaritan,  Saurashtra,  Shavian,
-       Sinhala,  Sundanese,  Syloti_Nagri,  Syriac, Tagalog, Tagbanwa, Tai_Le,
-       Tai_Tham, Tai_Viet, Tamil, Telugu,  Thaana,  Thai,  Tibetan,  Tifinagh,
+       Meetei_Mayek,  Mongolian, Myanmar, New_Tai_Lue, Nko, Ogham, Old_Italic,
+       Old_Persian, Old_South_Arabian, Old_Turkic, Ol_Chiki,  Oriya,  Osmanya,
+       Phags_Pa,  Phoenician,  Rejang,  Runic, Samaritan, Saurashtra, Shavian,
+       Sinhala, Sundanese, Syloti_Nagri, Syriac,  Tagalog,  Tagbanwa,  Tai_Le,
+       Tai_Tham,  Tai_Viet,  Tamil,  Telugu,  Thaana, Thai, Tibetan, Tifinagh,
        Ugaritic, Vai, Yi.


        Each character has exactly one Unicode general category property, spec-
-       ified by a two-letter abbreviation. For compatibility with Perl,  nega-
-       tion  can  be  specified  by including a circumflex between the opening
-       brace and the property name.  For  example,  \p{^Lu}  is  the  same  as
+       ified  by a two-letter abbreviation. For compatibility with Perl, nega-
+       tion can be specified by including a  circumflex  between  the  opening
+       brace  and  the  property  name.  For  example,  \p{^Lu} is the same as
        \P{Lu}.


        If only one letter is specified with \p or \P, it includes all the gen-
-       eral category properties that start with that letter. In this case,  in
-       the  absence of negation, the curly brackets in the escape sequence are
+       eral  category properties that start with that letter. In this case, in
+       the absence of negation, the curly brackets in the escape sequence  are
        optional; these two examples have the same effect:


          \p{L}
@@ -3960,54 +3912,54 @@
          Zp    Paragraph separator
          Zs    Space separator


-       The special property L& is also supported: it matches a character  that
-       has  the  Lu,  Ll, or Lt property, in other words, a letter that is not
+       The  special property L& is also supported: it matches a character that
+       has the Lu, Ll, or Lt property, in other words, a letter  that  is  not
        classified as a modifier or "other".
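
Editorial sketch (not part of the commit): the general-category rules described in this part of the page (two-letter codes, one-letter prefixes, negation with a circumflex, and the special L& property) can be modelled with Python's unicodedata module. Python's re module has no \p escape, so this is an illustration of the rules and not of PCRE's implementation. The function name has_property is hypothetical, and script names such as Greek are deliberately not handled.

```python
import unicodedata


def has_property(ch, prop):
    # Model of \p{prop} for general categories only (no script names).
    negate = prop.startswith("^")       # \p{^Lu} is the same as \P{Lu}
    if negate:
        prop = prop[1:]
    category = unicodedata.category(ch)  # two-letter code such as "Lu"
    if prop == "Any":
        result = True
    elif prop == "L&":
        # L& covers cased letters: upper, lower, or title case.
        result = category in ("Lu", "Ll", "Lt")
    else:
        # A single letter such as "L" matches every category that starts
        # with it; a two-letter code matches only itself.
        result = category.startswith(prop)
    return result != negate
```

For instance, U+02B0 (a modifier letter, category Lm) satisfies L but not L&.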


-       The Cs (Surrogate) property applies only to  characters  in  the  range
-       U+D800  to  U+DFFF. Such characters are not valid in UTF-8 strings (see
+       The  Cs  (Surrogate)  property  applies only to characters in the range
+       U+D800 to U+DFFF. Such characters are not valid in UTF-8  strings  (see
        RFC 3629) and so cannot be tested by PCRE, unless UTF-8 validity check-
-       ing  has  been  turned off (see the discussion of PCRE_NO_UTF8_CHECK in
+       ing has been turned off (see the discussion  of  PCRE_NO_UTF8_CHECK  in
        the pcreapi page). Perl does not support the Cs property.


-       The long synonyms for  property  names  that  Perl  supports  (such  as
-       \p{Letter})  are  not  supported by PCRE, nor is it permitted to prefix
+       The  long  synonyms  for  property  names  that  Perl supports (such as
+       \p{Letter}) are not supported by PCRE, nor is it  permitted  to  prefix
        any of these properties with "Is".


        No character that is in the Unicode table has the Cn (unassigned) prop-
        erty.  Instead, this property is assumed for any code point that is not
        in the Unicode table.


-       Specifying caseless matching does not affect  these  escape  sequences.
+       Specifying  caseless  matching  does not affect these escape sequences.
        For example, \p{Lu} always matches only upper case letters.


-       The  \X  escape  matches  any number of Unicode characters that form an
+       The \X escape matches any number of Unicode  characters  that  form  an
        extended Unicode sequence. \X is equivalent to


          (?>\PM\pM*)


-       That is, it matches a character without the "mark"  property,  followed
-       by  zero  or  more  characters with the "mark" property, and treats the
-       sequence as an atomic group (see below).  Characters  with  the  "mark"
-       property  are  typically  accents  that affect the preceding character.
-       None of them have codepoints less than 256, so  in  non-UTF-8  mode  \X
+       That  is,  it matches a character without the "mark" property, followed
+       by zero or more characters with the "mark"  property,  and  treats  the
+       sequence  as  an  atomic group (see below).  Characters with the "mark"
+       property are typically accents that  affect  the  preceding  character.
+       None  of  them  have  codepoints less than 256, so in non-UTF-8 mode \X
        matches any one character.
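
Editorial sketch (not part of the commit): the definition of \X as a non-mark character followed by zero or more mark characters can be illustrated without a regex engine. The helper names below are hypothetical, and the code assumes that a "mark" is any character whose Unicode general category starts with M.

```python
import unicodedata


def is_mark(ch):
    # Mn, Mc, and Me all count as having the "mark" property.
    return unicodedata.category(ch).startswith("M")


def extended_sequences(text):
    # Group each base character with the marks that follow it, roughly
    # what repeatedly matching \PM\pM* would report.
    sequences = []
    for ch in text:
        if not is_mark(ch):
            sequences.append(ch)         # a new base character
        elif sequences:
            sequences[-1] += ch          # a mark joins the previous base
        # A leading mark with no base character is skipped, because
        # \PM cannot match at that position.
    return sequences
```

With this sketch, the string "e" followed by U+0301 (combining acute accent) is reported as a single sequence.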


        Note that recent versions of Perl have changed \X to match what Unicode
        calls an "extended grapheme cluster", which has a more complicated def-
        inition.


-       Matching  characters  by Unicode property is not fast, because PCRE has
-       to search a structure that contains  data  for  over  fifteen  thousand
+       Matching characters by Unicode property is not fast, because  PCRE  has
+       to  search  a  structure  that  contains data for over fifteen thousand
        characters. That is why the traditional escape sequences such as \d and
-       \w do not use Unicode properties in PCRE by  default,  though  you  can
+       \w  do  not  use  Unicode properties in PCRE by default, though you can
        make them do so by setting the PCRE_UCP option for pcre_compile() or by
        starting the pattern with (*UCP).


    PCRE's additional properties


-       As well as the standard Unicode properties described  in  the  previous
-       section,  PCRE supports four more that make it possible to convert tra-
+       As  well  as  the standard Unicode properties described in the previous
+       section, PCRE supports four more that make it possible to convert  tra-
        ditional escape sequences such as \w and \s and POSIX character classes
        to use Unicode properties. PCRE uses these non-standard, non-Perl prop-
        erties internally when PCRE_UCP is set. They are:
@@ -4017,40 +3969,40 @@
          Xsp   Any Perl space character
          Xwd   Any Perl "word" character


-       Xan matches characters that have either the L (letter) or the  N  (num-
-       ber)  property. Xps matches the characters tab, linefeed, vertical tab,
-       formfeed, or carriage return, and any other character that  has  the  Z
+       Xan  matches  characters that have either the L (letter) or the N (num-
+       ber) property. Xps matches the characters tab, linefeed, vertical  tab,
+       formfeed,  or  carriage  return, and any other character that has the Z
        (separator) property.  Xsp is the same as Xps, except that vertical tab
        is excluded. Xwd matches the same characters as Xan, plus underscore.
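
Editorial sketch (not part of the commit): the four definitions in the paragraph above translate almost word for word into predicates over Unicode general categories. The function name pcre_extra_property is hypothetical, and the sketch follows the wording of this page only; it makes no claim about how PCRE implements these properties internally.

```python
import unicodedata


def pcre_extra_property(ch, name):
    # Evaluate Xan, Xps, Xsp, or Xwd for one character, as described above.
    major = unicodedata.category(ch)[0]      # first letter of the category
    xan = major in ("L", "N")                # letter or number
    xps = ch in "\t\n\x0b\x0c\r" or major == "Z"
    if name == "Xan":
        return xan
    if name == "Xps":
        return xps
    if name == "Xsp":
        return xps and ch != "\x0b"          # Xps without vertical tab
    if name == "Xwd":
        return xan or ch == "_"              # Xan plus underscore
    raise ValueError("unknown property: " + name)
```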


    Resetting the match start


-       The escape sequence \K causes any previously matched characters not  to
+       The  escape sequence \K causes any previously matched characters not to
        be included in the final matched sequence. For example, the pattern:


          foo\Kbar


-       matches  "foobar",  but reports that it has matched "bar". This feature
-       is similar to a lookbehind assertion (described  below).   However,  in
-       this  case, the part of the subject before the real match does not have
-       to be of fixed length, as lookbehind assertions do. The use of \K  does
-       not  interfere  with  the setting of captured substrings.  For example,
+       matches "foobar", but reports that it has matched "bar".  This  feature
+       is  similar  to  a lookbehind assertion (described below).  However, in
+       this case, the part of the subject before the real match does not  have
+       to  be of fixed length, as lookbehind assertions do. The use of \K does
+       not interfere with the setting of captured  substrings.   For  example,
        when the pattern


          (foo)\Kbar


        matches "foobar", the first substring is still set to "foo".
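
Editorial sketch (not part of the commit): Python's re module does not support \K, so the two examples above are imitated here with constructs it does have. The first stand-in uses a fixed-length lookbehind, which is exactly the restriction \K avoids; the second captures both parts and reports only the part after the point where \K would sit. Variable names are hypothetical.

```python
import re

subject = "foobar"

# Stand-in for foo\Kbar: the lookbehind checks "foo" but reports only "bar".
m = re.search(r"(?<=foo)bar", subject)
reported = m.group(0)

# Stand-in for (foo)\Kbar: group 1 is still set to "foo", and group 2 holds
# the text that \K would have reported as the match.
m2 = re.search(r"(foo)(bar)", subject)
first_substring = m2.group(1)
after_reset = m2.group(2)
```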


-       Perl documents that the use  of  \K  within  assertions  is  "not  well
-       defined".  In  PCRE,  \K  is  acted upon when it occurs inside positive
+       Perl  documents  that  the  use  of  \K  within assertions is "not well
+       defined". In PCRE, \K is acted upon  when  it  occurs  inside  positive
        assertions, but is ignored in negative assertions.


    Simple assertions


-       The final use of backslash is for certain simple assertions. An  asser-
-       tion  specifies a condition that has to be met at a particular point in
-       a match, without consuming any characters from the subject string.  The
-       use  of subpatterns for more complicated assertions is described below.
+       The  final use of backslash is for certain simple assertions. An asser-
+       tion specifies a condition that has to be met at a particular point  in
+       a  match, without consuming any characters from the subject string. The
+       use of subpatterns for more complicated assertions is described  below.
        The backslashed assertions are:


          \b     matches at a word boundary
@@ -4061,49 +4013,49 @@
          \z     matches only at the end of the subject
          \G     matches at the first matching position in the subject


-       Inside a character class, \b has a different meaning;  it  matches  the
-       backspace  character.  If  any  other  of these assertions appears in a
-       character class, by default it matches the corresponding literal  char-
+       Inside  a  character  class, \b has a different meaning; it matches the
+       backspace character. If any other of  these  assertions  appears  in  a
+       character  class, by default it matches the corresponding literal char-
        acter  (for  example,  \B  matches  the  letter  B).  However,  if  the
-       PCRE_EXTRA option is set, an "invalid escape sequence" error is  gener-
+       PCRE_EXTRA  option is set, an "invalid escape sequence" error is gener-
        ated instead.


-       A  word  boundary is a position in the subject string where the current
-       character and the previous character do not both match \w or  \W  (i.e.
-       one  matches  \w  and the other matches \W), or the start or end of the
-       string if the first or last  character  matches  \w,  respectively.  In
-       UTF-8  mode,  the  meanings  of \w and \W can be changed by setting the
-       PCRE_UCP option. When this is done, it also affects \b and \B.  Neither
-       PCRE  nor  Perl has a separate "start of word" or "end of word" metase-
-       quence. However, whatever follows \b normally determines which  it  is.
+       A word boundary is a position in the subject string where  the  current
+       character  and  the previous character do not both match \w or \W (i.e.
+       one matches \w and the other matches \W), or the start or  end  of  the
+       string  if  the  first  or  last character matches \w, respectively. In
+       UTF-8 mode, the meanings of \w and \W can be  changed  by  setting  the
+       PCRE_UCP  option. When this is done, it also affects \b and \B. Neither
+       PCRE nor Perl has a separate "start of word" or "end of  word"  metase-
+       quence.  However,  whatever follows \b normally determines which it is.
        For example, the fragment \ba matches "a" at the start of a word.
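
Editorial sketch (not part of the commit): two of the points above behave the same way in Python's re module, which is used here only as a convenient stand-in for PCRE. \ba finds an "a" at the start of a word, and \b inside a character class means the backspace character.

```python
import re

# \b followed by "a" selects only words that begin with "a".
words = re.findall(r"\ba\w*", "an apple banana, a sofa")

# Inside a character class, \b is the backspace character (0x08).
backspace = re.fullmatch(r"[\b]", "\x08")
```

Neither the "a" characters inside "banana" nor the one ending "sofa" are found, because none of them is preceded by a word boundary.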


-       The  \A,  \Z,  and \z assertions differ from the traditional circumflex
+       The \A, \Z, and \z assertions differ from  the  traditional  circumflex
        and dollar (described in the next section) in that they only ever match
-       at  the  very start and end of the subject string, whatever options are
-       set. Thus, they are independent of multiline mode. These  three  asser-
+       at the very start and end of the subject string, whatever  options  are
+       set.  Thus,  they are independent of multiline mode. These three asser-
        tions are not affected by the PCRE_NOTBOL or PCRE_NOTEOL options, which
-       affect only the behaviour of the circumflex and dollar  metacharacters.
-       However,  if the startoffset argument of pcre_exec() is non-zero, indi-
+       affect  only the behaviour of the circumflex and dollar metacharacters.
+       However, if the startoffset argument of pcre_exec() is non-zero,  indi-
        cating that matching is to start at a point other than the beginning of
-       the  subject,  \A  can never match. The difference between \Z and \z is
+       the subject, \A can never match. The difference between \Z  and  \z  is
        that \Z matches before a newline at the end of the string as well as at
        the very end, whereas \z matches only at the end.
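
Editorial sketch (not part of the commit): the difference between "end of subject" and "end, or before a final newline" can be seen with Python's re module, with one important caveat. In Python, \Z matches only at the very end, so it corresponds to PCRE's \z; PCRE's \Z is imitated below with a lookahead. That mapping is an assumption of this sketch, not something stated in the page.

```python
import re

subject = "abc\n"

# Dollar accepts a newline at the very end of the subject.
dollar = re.search(r"abc$", subject)

# Python's \Z behaves like PCRE's \z: only the very end will do.
very_end = re.search(r"abc\Z", subject)

# Imitation of PCRE's \Z: the end, or just before a final newline.
pcre_big_z = re.search(r"abc(?=\n?\Z)", subject)

# \A is independent of multiline mode, unlike circumflex.
a_multiline = re.search(r"\Aabc", "def\nabc", re.MULTILINE)
caret_multiline = re.search(r"^abc", "def\nabc", re.MULTILINE)
```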


-       The  \G assertion is true only when the current matching position is at
-       the start point of the match, as specified by the startoffset  argument
-       of  pcre_exec().  It  differs  from \A when the value of startoffset is
-       non-zero. By calling pcre_exec() multiple times with appropriate  argu-
+       The \G assertion is true only when the current matching position is  at
+       the  start point of the match, as specified by the startoffset argument
+       of pcre_exec(). It differs from \A when the  value  of  startoffset  is
+       non-zero.  By calling pcre_exec() multiple times with appropriate argu-
        ments, you can mimic Perl's /g option, and it is in this kind of imple-
        mentation where \G can be useful.
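
Editorial sketch (not part of the commit): the idea of calling the matcher repeatedly, each time starting where the previous match ended, can be shown with Python's Pattern.match(string, pos), which anchors the match at pos much as a leading \G does with a non-zero startoffset. Python is a stand-in here; the TOKEN pattern and the tokens function are hypothetical.

```python
import re

# One token: optional white space, then a number or a lower-case word.
TOKEN = re.compile(r"\s*(\d+|[a-z]+)")


def tokens(subject):
    # Collect consecutive tokens, each anchored where the last one ended.
    found = []
    pos = 0
    while pos < len(subject):
        m = TOKEN.match(subject, pos)    # anchored at pos, like \G
        if m is None:
            break                        # nothing matches exactly here
        found.append(m.group(1))
        pos = m.end()                    # next call starts at this offset
    return found
```

Because every call is anchored at the current offset, the loop stops at the first character that cannot start a token instead of skipping ahead to a later match.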


-       Note, however, that PCRE's interpretation of \G, as the  start  of  the
+       Note,  however,  that  PCRE's interpretation of \G, as the start of the
        current match, is subtly different from Perl's, which defines it as the
-       end of the previous match. In Perl, these can  be  different  when  the
-       previously  matched  string was empty. Because PCRE does just one match
+       end  of  the  previous  match. In Perl, these can be different when the
+       previously matched string was empty. Because PCRE does just  one  match
        at a time, it cannot reproduce this behaviour.


-       If all the alternatives of a pattern begin with \G, the  expression  is
+       If  all  the alternatives of a pattern begin with \G, the expression is
        anchored to the starting match position, and the "anchored" flag is set
        in the compiled regular expression.


@@ -4111,81 +4063,80 @@
CIRCUMFLEX AND DOLLAR

        Outside a character class, in the default matching mode, the circumflex
-       character  is  an  assertion  that is true only if the current matching
-       point is at the start of the subject string. If the  startoffset  argu-
-       ment  of  pcre_exec()  is  non-zero,  circumflex can never match if the
-       PCRE_MULTILINE option is unset. Inside a  character  class,  circumflex
+       character is an assertion that is true only  if  the  current  matching
+       point  is  at the start of the subject string. If the startoffset argu-
+       ment of pcre_exec() is non-zero, circumflex  can  never  match  if  the
+       PCRE_MULTILINE  option  is  unset. Inside a character class, circumflex
        has an entirely different meaning (see below).


-       Circumflex  need  not be the first character of the pattern if a number
-       of alternatives are involved, but it should be the first thing in  each
-       alternative  in  which  it appears if the pattern is ever to match that
-       branch. If all possible alternatives start with a circumflex, that  is,
-       if  the  pattern  is constrained to match only at the start of the sub-
-       ject, it is said to be an "anchored" pattern.  (There  are  also  other
+       Circumflex need not be the first character of the pattern if  a  number
+       of  alternatives are involved, but it should be the first thing in each
+       alternative in which it appears if the pattern is ever  to  match  that
+       branch.  If all possible alternatives start with a circumflex, that is,
+       if the pattern is constrained to match only at the start  of  the  sub-
+       ject,  it  is  said  to be an "anchored" pattern. (There are also other
        constructs that can cause a pattern to be anchored.)


-       A  dollar  character  is  an assertion that is true only if the current
-       matching point is at the end of  the  subject  string,  or  immediately
+       A dollar character is an assertion that is true  only  if  the  current
+       matching  point  is  at  the  end of the subject string, or immediately
        before a newline at the end of the string (by default). Dollar need not
-       be the last character of the pattern if a number  of  alternatives  are
-       involved,  but  it  should  be  the last item in any branch in which it
+       be  the  last  character of the pattern if a number of alternatives are
+       involved, but it should be the last item in  any  branch  in  which  it
        appears. Dollar has no special meaning in a character class.


-       The meaning of dollar can be changed so that it  matches  only  at  the
-       very  end  of  the string, by setting the PCRE_DOLLAR_ENDONLY option at
+       The  meaning  of  dollar  can be changed so that it matches only at the
+       very end of the string, by setting the  PCRE_DOLLAR_ENDONLY  option  at
        compile time. This does not affect the \Z assertion.


        The meanings of the circumflex and dollar characters are changed if the
-       PCRE_MULTILINE  option  is  set.  When  this  is the case, a circumflex
-       matches immediately after internal newlines as well as at the start  of
-       the  subject  string.  It  does not match after a newline that ends the
-       string. A dollar matches before any newlines in the string, as well  as
-       at  the very end, when PCRE_MULTILINE is set. When newline is specified
-       as the two-character sequence CRLF, isolated CR and  LF  characters  do
+       PCRE_MULTILINE option is set. When  this  is  the  case,  a  circumflex
+       matches  immediately after internal newlines as well as at the start of
+       the subject string. It does not match after a  newline  that  ends  the
+       string.  A dollar matches before any newlines in the string, as well as
+       at the very end, when PCRE_MULTILINE is set. When newline is  specified
+       as  the  two-character  sequence CRLF, isolated CR and LF characters do
        not indicate newlines.


-       For  example, the pattern /^abc$/ matches the subject string "def\nabc"
-       (where \n represents a newline) in multiline mode, but  not  otherwise.
-       Consequently,  patterns  that  are anchored in single line mode because
-       all branches start with ^ are not anchored in  multiline  mode,  and  a
-       match  for  circumflex  is  possible  when  the startoffset argument of
-       pcre_exec() is non-zero. The PCRE_DOLLAR_ENDONLY option is  ignored  if
+       For example, the pattern /^abc$/ matches the subject string  "def\nabc"
+       (where  \n  represents a newline) in multiline mode, but not otherwise.
+       Consequently, patterns that are anchored in single  line  mode  because
+       all  branches  start  with  ^ are not anchored in multiline mode, and a
+       match for circumflex is  possible  when  the  startoffset  argument  of
+       pcre_exec()  is  non-zero. The PCRE_DOLLAR_ENDONLY option is ignored if
        PCRE_MULTILINE is set.
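
Editorial sketch (not part of the commit): the /^abc$/ example and the remark about a non-zero start offset can be reproduced with Python's re module, whose re.MULTILINE flag and pos argument play roles similar to PCRE_MULTILINE and startoffset. Python recognizes only LF as a newline here, so this is an analogy and not a PCRE test.

```python
import re

subject = "def\nabc"

# Default mode: circumflex matches only at the start of the subject.
single_line = re.search(r"^abc$", subject)

# Multiline mode: circumflex also matches after the internal newline.
multi_line = re.search(r"^abc$", subject, re.MULTILINE)

# Searching from offset 4 (just after the newline): without multiline
# mode the circumflex can never match; with it, a match is possible.
from_offset_single = re.compile(r"^abc").search(subject, 4)
from_offset_multi = re.compile(r"^abc", re.MULTILINE).search(subject, 4)
```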


-       Note  that  the sequences \A, \Z, and \z can be used to match the start
-       and end of the subject in both modes, and if all branches of a  pattern
-       start  with  \A it is always anchored, whether or not PCRE_MULTILINE is
+       Note that the sequences \A, \Z, and \z can be used to match  the  start
+       and  end of the subject in both modes, and if all branches of a pattern
+       start with \A it is always anchored, whether or not  PCRE_MULTILINE  is
        set.



FULL STOP (PERIOD, DOT) AND \N

        Outside a character class, a dot in the pattern matches any one charac-
-       ter  in  the subject string except (by default) a character that signi-
-       fies the end of a line. In UTF-8 mode, the  matched  character  may  be
+       ter in the subject string except (by default) a character  that  signi-
+       fies  the  end  of  a line. In UTF-8 mode, the matched character may be
        more than one byte long.


-       When  a line ending is defined as a single character, dot never matches
-       that character; when the two-character sequence CRLF is used, dot  does
-       not  match  CR  if  it  is immediately followed by LF, but otherwise it
-       matches all characters (including isolated CRs and LFs). When any  Uni-
-       code  line endings are being recognized, dot does not match CR or LF or
+       When a line ending is defined as a single character, dot never  matches
+       that  character; when the two-character sequence CRLF is used, dot does
+       not match CR if it is immediately followed  by  LF,  but  otherwise  it
+       matches  all characters (including isolated CRs and LFs). When any Uni-
+       code line endings are being recognized, dot does not match CR or LF  or
        any of the other line ending characters.


-       The behaviour of dot with regard to newlines can  be  changed.  If  the
-       PCRE_DOTALL  option  is  set,  a dot matches any one character, without
+       The  behaviour  of  dot  with regard to newlines can be changed. If the
+       PCRE_DOTALL option is set, a dot matches  any  one  character,  without
        exception. If the two-character sequence CRLF is present in the subject
        string, it takes two dots to match it.
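
Editorial sketch (not part of the commit): the effect of PCRE_DOTALL, including the point that CRLF needs two dots, has a close analogue in Python's re.DOTALL flag. The sketch assumes Python's convention, in which only LF counts as a line ending for dot.

```python
import re

# By default, dot does not match the newline character.
default_dot = re.search(r"a.b", "a\nb")

# With DOTALL, dot matches any one character, without exception.
dotall_dot = re.search(r"a.b", "a\nb", re.DOTALL)

# CRLF is two characters, so it takes two dots to match it.
one_dot_crlf = re.fullmatch(r"a.b", "a\r\nb", re.DOTALL)
two_dots_crlf = re.fullmatch(r"a..b", "a\r\nb", re.DOTALL)
```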


-       The  handling of dot is entirely independent of the handling of circum-
-       flex and dollar, the only relationship being  that  they  both  involve
+       The handling of dot is entirely independent of the handling of  circum-
+       flex  and  dollar,  the  only relationship being that they both involve
        newlines. Dot has no special meaning in a character class.


-       The  escape  sequence  \N  behaves  like  a  dot, except that it is not
-       affected by the PCRE_DOTALL option. In  other  words,  it  matches  any
-       character  except  one that signifies the end of a line. Perl also uses
-       \N to match characters by name; PCRE does not support this.
+       The escape sequence \N behaves like  a  dot,  except  that  it  is  not
+       affected  by  the  PCRE_DOTALL  option.  In other words, it matches any
+       character except one that signifies the end of a line.



 MATCHING A SINGLE BYTE
@@ -4202,7 +4153,7 @@
        PCRE_NO_UTF8_CHECK option is used).


        PCRE  does  not  allow \C to appear in lookbehind assertions (described
-       below) in UTF-8 mode, because this would make it impossible  to  calcu-
+       below), because in UTF-8 mode this would make it impossible  to  calcu-
        late the length of the lookbehind.


        In  general, the \C escape sequence is best avoided in UTF-8 mode. How-
@@ -5109,41 +5060,40 @@
        then try to match. If there are insufficient characters before the cur-
        rent position, the assertion fails.


-       In  UTF-8 mode, PCRE does not allow the \C escape (which matches a sin-
-       gle byte, even in UTF-8  mode)  to  appear  in  lookbehind  assertions,
-       because  it  makes it impossible to calculate the length of the lookbe-
-       hind. The \X and \R escapes,  which  can  match  different  numbers  of
-       bytes, are also not permitted.
+       PCRE does not allow the \C escape (which matches a single byte in UTF-8
+       mode) to appear in lookbehind assertions, because it makes it  impossi-
+       ble  to  calculate the length of the lookbehind. The \X and \R escapes,
+       which can match different numbers of bytes, are also not permitted.


-       "Subroutine"  calls  (see below) such as (?2) or (?&X) are permitted in
-       lookbehinds, as long as the subpattern matches a  fixed-length  string.
+       "Subroutine" calls (see below) such as (?2) or (?&X) are  permitted  in
+       lookbehinds,  as  long as the subpattern matches a fixed-length string.
        Recursion, however, is not supported.


-       Possessive  quantifiers  can  be  used  in  conjunction with lookbehind
+       Possessive quantifiers can  be  used  in  conjunction  with  lookbehind
        assertions to specify efficient matching of fixed-length strings at the
        end of subject strings. Consider a simple pattern such as


          abcd$


-       when  applied  to  a  long string that does not match. Because matching
+       when applied to a long string that does  not  match.  Because  matching
        proceeds from left to right, PCRE will look for each "a" in the subject
-       and  then  see  if what follows matches the rest of the pattern. If the
+       and then see if what follows matches the rest of the  pattern.  If  the
        pattern is specified as


          ^.*abcd$


-       the initial .* matches the entire string at first, but when this  fails
+       the  initial .* matches the entire string at first, but when this fails
        (because there is no following "a"), it backtracks to match all but the
-       last character, then all but the last two characters, and so  on.  Once
-       again  the search for "a" covers the entire string, from right to left,
+       last  character,  then all but the last two characters, and so on. Once
+       again the search for "a" covers the entire string, from right to  left,
        so we are no better off. However, if the pattern is written as


          ^.*+(?<=abcd)


-       there can be no backtracking for the .*+ item; it can  match  only  the
-       entire  string.  The subsequent lookbehind assertion does a single test
-       on the last four characters. If it fails, the match fails  immediately.
-       For  long  strings, this approach makes a significant difference to the
+       there  can  be  no backtracking for the .*+ item; it can match only the
+       entire string. The subsequent lookbehind assertion does a  single  test
+       on  the last four characters. If it fails, the match fails immediately.
+       For long strings, this approach makes a significant difference  to  the
        processing time.


    Using multiple assertions
@@ -5152,18 +5102,18 @@


          (?<=\d{3})(?<!999)foo


-       matches "foo" preceded by three digits that are not "999". Notice  that
-       each  of  the  assertions is applied independently at the same point in
-       the subject string. First there is a  check  that  the  previous  three
-       characters  are  all  digits,  and  then there is a check that the same
+       matches  "foo" preceded by three digits that are not "999". Notice that
+       each of the assertions is applied independently at the  same  point  in
+       the  subject  string.  First  there  is a check that the previous three
+       characters are all digits, and then there is  a  check  that  the  same
        three characters are not "999".  This pattern does not match "foo" pre-
-       ceded  by  six  characters,  the first of which are digits and the last
-       three of which are not "999". For example, it  doesn't  match  "123abc-
+       ceded by six characters, the first of which are  digits  and  the  last
+       three  of  which  are not "999". For example, it doesn't match "123abc-
        foo". A pattern to do that is


          (?<=\d{3}...)(?<!999)foo


-       This  time  the  first assertion looks at the preceding six characters,
+       This time the first assertion looks at the  preceding  six  characters,
        checking that the first three are digits, and then the second assertion
        checks that the preceding three characters are not "999".


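The two lookbehind patterns shown in the hunk above can be tried outside PCRE. What follows is a minimal sketch in Python, not part of the commit. It assumes that Python's re module treats these fixed-length lookbehind assertions the same way PCRE does. The names INDEPENDENT, SIX_CHARS and matches are hypothetical and do not come from PCRE.

```python
import re

# Assumption: Python's re module handles these fixed-length lookbehinds
# the same way PCRE does. Both patterns are copied from the manual text.
INDEPENDENT = re.compile(r"(?<=\d{3})(?<!999)foo")
SIX_CHARS = re.compile(r"(?<=\d{3}...)(?<!999)foo")


def matches(pattern, subject):
    """Return True if the compiled pattern matches anywhere in subject."""
    return pattern.search(subject) is not None


if __name__ == "__main__":
    # Print one row per subject: subject, first pattern, second pattern.
    for subject in ("123foo", "999foo", "123abcfoo", "123999foo"):
        print(subject, matches(INDEPENDENT, subject), matches(SIX_CHARS, subject))
```

In the first pattern both assertions inspect the same three characters, so "123abcfoo" is rejected. The second pattern widens the first assertion to six characters, so "123abcfoo" is accepted while "123999foo" is still rejected.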
@@ -5171,29 +5121,29 @@

          (?<=(?<!foo)bar)baz


-       matches  an occurrence of "baz" that is preceded by "bar" which in turn
+       matches an occurrence of "baz" that is preceded by "bar" which in  turn
        is not preceded by "foo", while


          (?<=\d{3}(?!999)...)foo


-       is another pattern that matches "foo" preceded by three digits and  any
+       is  another pattern that matches "foo" preceded by three digits and any
        three characters that are not "999".



CONDITIONAL SUBPATTERNS

-       It  is possible to cause the matching process to obey a subpattern con-
-       ditionally or to choose between two alternative subpatterns,  depending
-       on  the result of an assertion, or whether a specific capturing subpat-
-       tern has already been matched. The two possible  forms  of  conditional
+       It is possible to cause the matching process to obey a subpattern  con-
+       ditionally  or to choose between two alternative subpatterns, depending
+       on the result of an assertion, or whether a specific capturing  subpat-
+       tern  has  already  been matched. The two possible forms of conditional
        subpattern are:


          (?(condition)yes-pattern)
          (?(condition)yes-pattern|no-pattern)


-       If  the  condition is satisfied, the yes-pattern is used; otherwise the
-       no-pattern (if present) is used. If there are more  than  two  alterna-
-       tives  in  the subpattern, a compile-time error occurs. Each of the two
+       If the condition is satisfied, the yes-pattern is used;  otherwise  the
+       no-pattern  (if  present)  is used. If there are more than two alterna-
+       tives in the subpattern, a compile-time error occurs. Each of  the  two
        alternatives may itself contain nested subpatterns of any form, includ-
        ing  conditional  subpatterns;  the  restriction  to  two  alternatives
        applies only at the level of the condition. This pattern fragment is an
@@ -5202,73 +5152,73 @@
          (?(1) (A|B|C) | (D | (?(2)E|F) | E) )



-       There  are  four  kinds of condition: references to subpatterns, refer-
+       There are four kinds of condition: references  to  subpatterns,  refer-
        ences to recursion, a pseudo-condition called DEFINE, and assertions.


    Checking for a used subpattern by number


-       If the text between the parentheses consists of a sequence  of  digits,
+       If  the  text between the parentheses consists of a sequence of digits,
        the condition is true if a capturing subpattern of that number has pre-
-       viously matched. If there is more than one  capturing  subpattern  with
-       the  same  number  (see  the earlier section about duplicate subpattern
-       numbers), the condition is true if any of them have matched. An  alter-
-       native  notation is to precede the digits with a plus or minus sign. In
-       this case, the subpattern number is relative rather than absolute.  The
-       most  recently opened parentheses can be referenced by (?(-1), the next
-       most recent by (?(-2), and so on. Inside loops it can also  make  sense
+       viously  matched.  If  there is more than one capturing subpattern with
+       the same number (see the earlier  section  about  duplicate  subpattern
+       numbers),  the condition is true if any of them have matched. An alter-
+       native notation is to precede the digits with a plus or minus sign.  In
+       this  case, the subpattern number is relative rather than absolute. The
+       most recently opened parentheses can be referenced by (?(-1), the  next
+       most  recent  by (?(-2), and so on. Inside loops it can also make sense
        to refer to subsequent groups. The next parentheses to be opened can be
-       referenced as (?(+1), and so on. (The value zero in any of these  forms
+       referenced  as (?(+1), and so on. (The value zero in any of these forms
        is not used; it provokes a compile-time error.)


-       Consider  the  following  pattern, which contains non-significant white
+       Consider the following pattern, which  contains  non-significant  white
        space to make it more readable (assume the PCRE_EXTENDED option) and to
        divide it into three parts for ease of discussion:


          ( \( )?    [^()]+    (?(1) \) )


-       The  first  part  matches  an optional opening parenthesis, and if that
+       The first part matches an optional opening  parenthesis,  and  if  that
        character is present, sets it as the first captured substring. The sec-
-       ond  part  matches one or more characters that are not parentheses. The
-       third part is a conditional subpattern that tests whether  or  not  the
-       first  set  of  parentheses  matched.  If they did, that is, if subject
-       started with an opening parenthesis, the condition is true, and so  the
-       yes-pattern  is  executed and a closing parenthesis is required. Other-
-       wise, since no-pattern is not present, the subpattern matches  nothing.
-       In  other  words,  this  pattern matches a sequence of non-parentheses,
+       ond part matches one or more characters that are not  parentheses.  The
+       third  part  is  a conditional subpattern that tests whether or not the
+       first set of parentheses matched. If they  did,  that  is,  if  subject
+       started  with an opening parenthesis, the condition is true, and so the
+       yes-pattern is executed and a closing parenthesis is  required.  Other-
+       wise,  since no-pattern is not present, the subpattern matches nothing.
+       In other words, this pattern matches  a  sequence  of  non-parentheses,
        optionally enclosed in parentheses.


-       If you were embedding this pattern in a larger one,  you  could  use  a
+       If  you  were  embedding  this pattern in a larger one, you could use a
        relative reference:


          ...other stuff... ( \( )?    [^()]+    (?(-1) \) ) ...


-       This  makes  the  fragment independent of the parentheses in the larger
+       This makes the fragment independent of the parentheses  in  the  larger
        pattern.


    Checking for a used subpattern by name


-       Perl uses the syntax (?(<name>)...) or (?('name')...)  to  test  for  a
-       used  subpattern  by  name.  For compatibility with earlier versions of
-       PCRE, which had this facility before Perl, the syntax  (?(name)...)  is
-       also  recognized. However, there is a possible ambiguity with this syn-
-       tax, because subpattern names may  consist  entirely  of  digits.  PCRE
-       looks  first for a named subpattern; if it cannot find one and the name
-       consists entirely of digits, PCRE looks for a subpattern of  that  num-
-       ber,  which must be greater than zero. Using subpattern names that con-
+       Perl  uses  the  syntax  (?(<name>)...) or (?('name')...) to test for a
+       used subpattern by name. For compatibility  with  earlier  versions  of
+       PCRE,  which  had this facility before Perl, the syntax (?(name)...) is
+       also recognized. However, there is a possible ambiguity with this  syn-
+       tax,  because  subpattern  names  may  consist entirely of digits. PCRE
+       looks first for a named subpattern; if it cannot find one and the  name
+       consists  entirely  of digits, PCRE looks for a subpattern of that num-
+       ber, which must be greater than zero. Using subpattern names that  con-
        sist entirely of digits is not recommended.


        Rewriting the above example to use a named subpattern gives this:


          (?<OPEN> \( )?    [^()]+    (?(<OPEN>) \) )


-       If the name used in a condition of this kind is a duplicate,  the  test
-       is  applied to all subpatterns of the same name, and is true if any one
+       If  the  name used in a condition of this kind is a duplicate, the test
+       is applied to all subpatterns of the same name, and is true if any  one
        of them has matched.


    Checking for pattern recursion


        If the condition is the string (R), and there is no subpattern with the
-       name  R, the condition is true if a recursive call to the whole pattern
+       name R, the condition is true if a recursive call to the whole  pattern
        or any subpattern has been made. If digits or a name preceded by amper-
        sand follow the letter R, for example:


@@ -5276,51 +5226,51 @@

        the condition is true if the most recent recursion is into a subpattern
        whose number or name is given. This condition does not check the entire
-       recursion  stack.  If  the  name  used in a condition of this kind is a
+       recursion stack. If the name used in a condition  of  this  kind  is  a
        duplicate, the test is applied to all subpatterns of the same name, and
        is true if any one of them is the most recent recursion.


-       At  "top  level",  all  these recursion test conditions are false.  The
+       At "top level", all these recursion test  conditions  are  false.   The
        syntax for recursive patterns is described below.


    Defining subpatterns for use by reference only


-       If the condition is the string (DEFINE), and  there  is  no  subpattern
-       with  the  name  DEFINE,  the  condition is always false. In this case,
-       there may be only one alternative  in  the  subpattern.  It  is  always
-       skipped  if  control  reaches  this  point  in the pattern; the idea of
-       DEFINE is that it can be used to define subroutines that can be  refer-
-       enced  from elsewhere. (The use of subroutines is described below.) For
-       example, a pattern to match an IPv4 address  such  as  "192.168.23.245"
+       If  the  condition  is  the string (DEFINE), and there is no subpattern
+       with the name DEFINE, the condition is  always  false.  In  this  case,
+       there  may  be  only  one  alternative  in the subpattern. It is always
+       skipped if control reaches this point  in  the  pattern;  the  idea  of
+       DEFINE  is that it can be used to define subroutines that can be refer-
+       enced from elsewhere. (The use of subroutines is described below.)  For
+       example,  a  pattern  to match an IPv4 address such as "192.168.23.245"
        could be written like this (ignore whitespace and line breaks):


          (?(DEFINE) (?<byte> 2[0-4]\d | 25[0-5] | 1\d\d | [1-9]?\d) )
          \b (?&byte) (\.(?&byte)){3} \b


-       The  first part of the pattern is a DEFINE group inside which a another
-       group named "byte" is defined. This matches an individual component  of
-       an  IPv4  address  (a number less than 256). When matching takes place,
-       this part of the pattern is skipped because DEFINE acts  like  a  false
-       condition.  The  rest of the pattern uses references to the named group
-       to match the four dot-separated components of an IPv4 address,  insist-
+       The first part of the pattern is a DEFINE group inside which a  another
+       group  named "byte" is defined. This matches an individual component of
+       an IPv4 address (a number less than 256). When  matching  takes  place,
+       this  part  of  the pattern is skipped because DEFINE acts like a false
+       condition. The rest of the pattern uses references to the  named  group
+       to  match the four dot-separated components of an IPv4 address, insist-
        ing on a word boundary at each end.


    Assertion conditions


-       If  the  condition  is  not  in any of the above formats, it must be an
-       assertion.  This may be a positive or negative lookahead or  lookbehind
-       assertion.  Consider  this  pattern,  again  containing non-significant
+       If the condition is not in any of the above  formats,  it  must  be  an
+       assertion.   This may be a positive or negative lookahead or lookbehind
+       assertion. Consider  this  pattern,  again  containing  non-significant
        white space, and with the two alternatives on the second line:


          (?(?=[^a-z]*[a-z])
          \d{2}-[a-z]{3}-\d{2}  |  \d{2}-\d{2}-\d{2} )


-       The condition  is  a  positive  lookahead  assertion  that  matches  an
-       optional  sequence of non-letters followed by a letter. In other words,
-       it tests for the presence of at least one letter in the subject.  If  a
-       letter  is found, the subject is matched against the first alternative;
-       otherwise it is  matched  against  the  second.  This  pattern  matches
-       strings  in  one  of the two forms dd-aaa-dd or dd-dd-dd, where aaa are
+       The  condition  is  a  positive  lookahead  assertion  that  matches an
+       optional sequence of non-letters followed by a letter. In other  words,
+       it  tests  for the presence of at least one letter in the subject. If a
+       letter is found, the subject is matched against the first  alternative;
+       otherwise  it  is  matched  against  the  second.  This pattern matches
+       strings in one of the two forms dd-aaa-dd or dd-dd-dd,  where  aaa  are
        letters and dd are digits.



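The optional-parenthesis pattern from the "Checking for a used subpattern" text above can be exercised with a short script. This is a sketch in Python rather than PCRE, and it is not part of the commit. It assumes that Python's re module gives (?(1)...) and (?(name)...) the same meaning as PCRE, and that re.VERBOSE stands in for PCRE_EXTENDED. Python spells the named group (?P<OPEN>...) and the condition (?(OPEN)...), which differs from the Perl syntax in the text. The helper name is_well_formed is hypothetical.

```python
import re

# Assumption: Python's re conditionals behave like PCRE's for these cases.
# re.VERBOSE ignores the white space, as PCRE_EXTENDED does.
BY_NUMBER = re.compile(r"( \( )?    [^()]+    (?(1) \) )", re.VERBOSE)

# Python syntax for the named variant: (?P<OPEN>...) and (?(OPEN)...).
BY_NAME = re.compile(r"(?P<OPEN> \( )?    [^()]+    (?(OPEN) \) )", re.VERBOSE)


def is_well_formed(pattern, subject):
    """Return True if the whole subject is a run of non-parentheses,
    optionally enclosed in one pair of parentheses."""
    return pattern.fullmatch(subject) is not None


if __name__ == "__main__":
    # Print one row per subject: subject, by-number result, by-name result.
    for subject in ("abc", "(abc)", "(abc", "abc)"):
        print(subject, is_well_formed(BY_NUMBER, subject), is_well_formed(BY_NAME, subject))
```

The sketch uses fullmatch so that an unbalanced subject such as "(abc" is rejected outright. With an unanchored search the same pattern would still find "abc" inside it.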
@@ -5329,41 +5279,41 @@
        There are two ways of including comments in patterns that are processed
        by PCRE. In both cases, the start of the comment must not be in a char-
        acter class, nor in the middle of any other sequence of related charac-
-       ters  such  as  (?: or a subpattern name or number. The characters that
+       ters such as (?: or a subpattern name or number.  The  characters  that
        make up a comment play no part in the pattern matching.


-       The sequence (?# marks the start of a comment that continues up to  the
-       next  closing parenthesis. Nested parentheses are not permitted. If the
+       The  sequence (?# marks the start of a comment that continues up to the
+       next closing parenthesis. Nested parentheses are not permitted. If  the
        PCRE_EXTENDED option is set, an unescaped # character also introduces a
-       comment,  which  in  this  case continues to immediately after the next
-       newline character or character sequence in the pattern.  Which  charac-
+       comment, which in this case continues to  immediately  after  the  next
+       newline  character  or character sequence in the pattern. Which charac-
        ters are interpreted as newlines is controlled by the options passed to
        pcre_compile() or by a special sequence at the start of the pattern, as
-       described  in  the  section  entitled "Newline conventions" above. Note
-       that the end of this type of comment is a literal newline  sequence  in
+       described in the section entitled  "Newline  conventions"  above.  Note
+       that  the  end of this type of comment is a literal newline sequence in
        the pattern; escape sequences that happen to represent a newline do not
-       count. For example, consider this pattern when  PCRE_EXTENDED  is  set,
+       count.  For  example,  consider this pattern when PCRE_EXTENDED is set,
        and the default newline convention is in force:


          abc #comment \n still comment


-       On  encountering  the  # character, pcre_compile() skips along, looking
-       for a newline in the pattern. The sequence \n is still literal at  this
-       stage,  so  it does not terminate the comment. Only an actual character
+       On encountering the # character, pcre_compile()  skips  along,  looking
+       for  a newline in the pattern. The sequence \n is still literal at this
+       stage, so it does not terminate the comment. Only an  actual  character
        with the code value 0x0a (the default newline) does so.



RECURSIVE PATTERNS

-       Consider the problem of matching a string in parentheses, allowing  for
-       unlimited  nested  parentheses.  Without the use of recursion, the best
-       that can be done is to use a pattern that  matches  up  to  some  fixed
-       depth  of  nesting.  It  is not possible to handle an arbitrary nesting
+       Consider  the problem of matching a string in parentheses, allowing for
+       unlimited nested parentheses. Without the use of  recursion,  the  best
+       that  can  be  done  is  to use a pattern that matches up to some fixed
+       depth of nesting. It is not possible to  handle  an  arbitrary  nesting
        depth.


        For some time, Perl has provided a facility that allows regular expres-
-       sions  to recurse (amongst other things). It does this by interpolating
-       Perl code in the expression at run time, and the code can refer to  the
+       sions to recurse (amongst other things). It does this by  interpolating
+       Perl  code in the expression at run time, and the code can refer to the
        expression itself. A Perl pattern using code interpolation to solve the
        parentheses problem can be created like this:


@@ -5373,201 +5323,201 @@
        refers recursively to the pattern in which it appears.


        Obviously, PCRE cannot support the interpolation of Perl code. Instead,
-       it supports special syntax for recursion of  the  entire  pattern,  and
-       also  for  individual  subpattern  recursion. After its introduction in
-       PCRE and Python, this kind of  recursion  was  subsequently  introduced
+       it  supports  special  syntax  for recursion of the entire pattern, and
+       also for individual subpattern recursion.  After  its  introduction  in
+       PCRE  and  Python,  this  kind of recursion was subsequently introduced
        into Perl at release 5.10.


-       A  special  item  that consists of (? followed by a number greater than
-       zero and a closing parenthesis is a recursive subroutine  call  of  the
-       subpattern  of  the  given  number, provided that it occurs inside that
-       subpattern. (If not, it is a non-recursive subroutine  call,  which  is
-       described  in  the  next  section.)  The special item (?R) or (?0) is a
+       A special item that consists of (? followed by a  number  greater  than
+       zero  and  a  closing parenthesis is a recursive subroutine call of the
+       subpattern of the given number, provided that  it  occurs  inside  that
+       subpattern.  (If  not,  it is a non-recursive subroutine call, which is
+       described in the next section.) The special item  (?R)  or  (?0)  is  a
        recursive call of the entire regular expression.


-       This PCRE pattern solves the nested  parentheses  problem  (assume  the
+       This  PCRE  pattern  solves  the nested parentheses problem (assume the
        PCRE_EXTENDED option is set so that white space is ignored):


          \( ( [^()]++ | (?R) )* \)


-       First  it matches an opening parenthesis. Then it matches any number of
-       substrings which can either be a  sequence  of  non-parentheses,  or  a
-       recursive  match  of the pattern itself (that is, a correctly parenthe-
+       First it matches an opening parenthesis. Then it matches any number  of
+       substrings  which  can  either  be  a sequence of non-parentheses, or a
+       recursive match of the pattern itself (that is, a  correctly  parenthe-
        sized substring).  Finally there is a closing parenthesis. Note the use
        of a possessive quantifier to avoid backtracking into sequences of non-
        parentheses.


-       If this were part of a larger pattern, you would not  want  to  recurse
+       If  this  were  part of a larger pattern, you would not want to recurse
        the entire pattern, so instead you could use this:


          ( \( ( [^()]++ | (?1) )* \) )


-       We  have  put the pattern into parentheses, and caused the recursion to
+       We have put the pattern into parentheses, and caused the  recursion  to
        refer to them instead of the whole pattern.


-       In a larger pattern,  keeping  track  of  parenthesis  numbers  can  be
-       tricky.  This is made easier by the use of relative references. Instead
+       In  a  larger  pattern,  keeping  track  of  parenthesis numbers can be
+       tricky. This is made easier by the use of relative references.  Instead
        of (?1) in the pattern above you can write (?-2) to refer to the second
-       most  recently  opened  parentheses  preceding  the recursion. In other
-       words, a negative number counts capturing  parentheses  leftwards  from
+       most recently opened parentheses  preceding  the  recursion.  In  other
+       words,  a  negative  number counts capturing parentheses leftwards from
        the point at which it is encountered.


-       It  is  also  possible  to refer to subsequently opened parentheses, by
-       writing references such as (?+2). However, these  cannot  be  recursive
-       because  the  reference  is  not inside the parentheses that are refer-
-       enced. They are always non-recursive subroutine calls, as described  in
+       It is also possible to refer to  subsequently  opened  parentheses,  by
+       writing  references  such  as (?+2). However, these cannot be recursive
+       because the reference is not inside the  parentheses  that  are  refer-
+       enced.  They are always non-recursive subroutine calls, as described in
        the next section.


-       An  alternative  approach is to use named parentheses instead. The Perl
-       syntax for this is (?&name); PCRE's earlier syntax  (?P>name)  is  also
+       An alternative approach is to use named parentheses instead.  The  Perl
+       syntax  for  this  is (?&name); PCRE's earlier syntax (?P>name) is also
        supported. We could rewrite the above example as follows:


          (?<pn> \( ( [^()]++ | (?&pn) )* \) )


-       If  there  is more than one subpattern with the same name, the earliest
+       If there is more than one subpattern with the same name,  the  earliest
        one is used.


-       This particular example pattern that we have been looking  at  contains
+       This  particular  example pattern that we have been looking at contains
        nested unlimited repeats, and so the use of a possessive quantifier for
        matching strings of non-parentheses is important when applying the pat-
-       tern  to  strings  that do not match. For example, when this pattern is
+       tern to strings that do not match. For example, when  this  pattern  is
        applied to


          (aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa()


-       it yields "no match" quickly. However, if a  possessive  quantifier  is
-       not  used, the match runs for a very long time indeed because there are
-       so many different ways the + and * repeats can carve  up  the  subject,
+       it  yields  "no  match" quickly. However, if a possessive quantifier is
+       not used, the match runs for a very long time indeed because there  are
+       so  many  different  ways the + and * repeats can carve up the subject,
        and all have to be tested before failure can be reported.


-       At  the  end  of a match, the values of capturing parentheses are those
-       from the outermost level. If you want to obtain intermediate values,  a
-       callout  function can be used (see below and the pcrecallout documenta-
+       At the end of a match, the values of capturing  parentheses  are  those
+       from  the outermost level. If you want to obtain intermediate values, a
+       callout function can be used (see below and the pcrecallout  documenta-
        tion). If the pattern above is matched against


          (ab(cd)ef)


-       the value for the inner capturing parentheses  (numbered  2)  is  "ef",
-       which  is the last value taken on at the top level. If a capturing sub-
-       pattern is not matched at the top level, its final  captured  value  is
-       unset,  even  if  it was (temporarily) set at a deeper level during the
+       the  value  for  the  inner capturing parentheses (numbered 2) is "ef",
+       which is the last value taken on at the top level. If a capturing  sub-
+       pattern  is  not  matched at the top level, its final captured value is
+       unset, even if it was (temporarily) set at a deeper  level  during  the
        matching process.


-       If there are more than 15 capturing parentheses in a pattern, PCRE  has
-       to  obtain extra memory to store data during a recursion, which it does
+       If  there are more than 15 capturing parentheses in a pattern, PCRE has
+       to obtain extra memory to store data during a recursion, which it  does
        by using pcre_malloc, freeing it via pcre_free afterwards. If no memory
        can be obtained, the match fails with the PCRE_ERROR_NOMEMORY error.


-       Do  not  confuse  the (?R) item with the condition (R), which tests for
-       recursion.  Consider this pattern, which matches text in  angle  brack-
-       ets,  allowing for arbitrary nesting. Only digits are allowed in nested
-       brackets (that is, when recursing), whereas any characters are  permit-
+       Do not confuse the (?R) item with the condition (R),  which  tests  for
+       recursion.   Consider  this pattern, which matches text in angle brack-
+       ets, allowing for arbitrary nesting. Only digits are allowed in  nested
+       brackets  (that is, when recursing), whereas any characters are permit-
        ted at the outer level.


          < (?: (?(R) \d++  | [^<>]*+) | (?R)) * >


-       In  this  pattern, (?(R) is the start of a conditional subpattern, with
-       two different alternatives for the recursive and  non-recursive  cases.
+       In this pattern, (?(R) is the start of a conditional  subpattern,  with
+       two  different  alternatives for the recursive and non-recursive cases.
        The (?R) item is the actual recursive call.


    Differences in recursion processing between PCRE and Perl


-       Recursion  processing  in PCRE differs from Perl in two important ways.
-       In PCRE (like Python, but unlike Perl), a recursive subpattern call  is
+       Recursion processing in PCRE differs from Perl in two  important  ways.
+       In  PCRE (like Python, but unlike Perl), a recursive subpattern call is
        always treated as an atomic group. That is, once it has matched some of
        the subject string, it is never re-entered, even if it contains untried
-       alternatives  and  there  is a subsequent matching failure. This can be
-       illustrated by the following pattern, which purports to match a  palin-
-       dromic  string  that contains an odd number of characters (for example,
+       alternatives and there is a subsequent matching failure.  This  can  be
+       illustrated  by the following pattern, which purports to match a palin-
+       dromic string that contains an odd number of characters  (for  example,
        "a", "aba", "abcba", "abcdcba"):


          ^(.|(.)(?1)\2)$


        The idea is that it either matches a single character, or two identical
-       characters  surrounding  a sub-palindrome. In Perl, this pattern works;
-       in PCRE it does not if the pattern is  longer  than  three  characters.
+       characters surrounding a sub-palindrome. In Perl, this  pattern  works;
+       in  PCRE  it  does  not if the pattern is longer than three characters.
        Consider the subject string "abcba":


-       At  the  top level, the first character is matched, but as it is not at
+       At the top level, the first character is matched, but as it is  not  at
        the end of the string, the first alternative fails; the second alterna-
        tive is taken and the recursion kicks in. The recursive call to subpat-
-       tern 1 successfully matches the next character ("b").  (Note  that  the
+       tern  1  successfully  matches the next character ("b"). (Note that the
        beginning and end of line tests are not part of the recursion).


-       Back  at  the top level, the next character ("c") is compared with what
-       subpattern 2 matched, which was "a". This fails. Because the  recursion
-       is  treated  as  an atomic group, there are now no backtracking points,
-       and so the entire match fails. (Perl is able, at  this  point,  to  re-
-       enter  the  recursion  and try the second alternative.) However, if the
+       Back at the top level, the next character ("c") is compared  with  what
+       subpattern  2 matched, which was "a". This fails. Because the recursion
+       is treated as an atomic group, there are now  no  backtracking  points,
+       and  so  the  entire  match fails. (Perl is able, at this point, to re-
+       enter the recursion and try the second alternative.)  However,  if  the
        pattern is written with the alternatives in the other order, things are
        different:


          ^((.)(?1)\2|.)$


-       This  time,  the recursing alternative is tried first, and continues to
-       recurse until it runs out of characters, at which point  the  recursion
-       fails.  But  this  time  we  do  have another alternative to try at the
-       higher level. That is the big difference:  in  the  previous  case  the
+       This time, the recursing alternative is tried first, and  continues  to
+       recurse  until  it runs out of characters, at which point the recursion
+       fails. But this time we do have  another  alternative  to  try  at  the
+       higher  level.  That  is  the  big difference: in the previous case the
        remaining alternative is at a deeper recursion level, which PCRE cannot
        use.


-       To change the pattern so that it matches all palindromic  strings,  not
-       just  those  with an odd number of characters, it is tempting to change
+       To  change  the pattern so that it matches all palindromic strings, not
+       just those with an odd number of characters, it is tempting  to  change
        the pattern to this:


          ^((.)(?1)\2|.?)$


-       Again, this works in Perl, but not in PCRE, and for  the  same  reason.
-       When  a  deeper  recursion has matched a single character, it cannot be
-       entered again in order to match an empty string.  The  solution  is  to
-       separate  the two cases, and write out the odd and even cases as alter-
+       Again,  this  works  in Perl, but not in PCRE, and for the same reason.
+       When a deeper recursion has matched a single character,  it  cannot  be
+       entered  again  in  order  to match an empty string. The solution is to
+       separate the two cases, and write out the odd and even cases as  alter-
        natives at the higher level:


          ^(?:((.)(?1)\2|)|((.)(?3)\4|.))


-       If you want to match typical palindromic phrases, the  pattern  has  to
+       If  you  want  to match typical palindromic phrases, the pattern has to
        ignore all non-word characters, which can be done like this:


          ^\W*+(?:((.)\W*+(?1)\W*+\2|)|((.)\W*+(?3)\W*+\4|\W*+.\W*+))\W*+$


        If run with the PCRE_CASELESS option, this pattern matches phrases such
        as "A man, a plan, a canal: Panama!" and it works well in both PCRE and
-       Perl.  Note the use of the possessive quantifier *+ to avoid backtrack-
-       ing into sequences of non-word characters. Without this, PCRE  takes  a
-       great  deal  longer  (ten  times or more) to match typical phrases, and
+       Perl. Note the use of the possessive quantifier *+ to avoid  backtrack-
+       ing  into  sequences of non-word characters. Without this, PCRE takes a
+       great deal longer (ten times or more) to  match  typical  phrases,  and
        Perl takes so long that you think it has gone into a loop.


-       WARNING: The palindrome-matching patterns above work only if  the  sub-
-       ject  string  does not start with a palindrome that is shorter than the
-       entire string.  For example, although "abcba" is correctly matched,  if
-       the  subject  is "ababa", PCRE finds the palindrome "aba" at the start,
-       then fails at top level because the end of the string does not  follow.
-       Once  again, it cannot jump back into the recursion to try other alter-
+       WARNING:  The  palindrome-matching patterns above work only if the sub-
+       ject string does not start with a palindrome that is shorter  than  the
+       entire  string.  For example, although "abcba" is correctly matched, if
+       the subject is "ababa", PCRE finds the palindrome "aba" at  the  start,
+       then  fails at top level because the end of the string does not follow.
+       Once again, it cannot jump back into the recursion to try other  alter-
        natives, so the entire match fails.
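
When trying out the palindrome patterns above, an independent check that uses no recursion at all can be handy for deciding what a subject "should" do. The following is a minimal Python sketch and is not part of PCRE. The function name is_palindromic_phrase is made up for illustration. It assumes that "word character" means whatever Python's \w matches, and it folds case as a stand-in for PCRE_CASELESS.

```python
import re

def is_palindromic_phrase(text):
    """Return True if the word characters of text read the same both ways."""
    # Keep only word characters and fold case (stand-in for PCRE_CASELESS).
    letters = [ch.lower() for ch in re.findall(r"\w", text)]
    return letters == letters[::-1]

print(is_palindromic_phrase("A man, a plan, a canal: Panama!"))
print(is_palindromic_phrase("ababa"))
print(is_palindromic_phrase("sense and sensibility"))
```

This check accepts "ababa" as well as "abcba", so comparing its answer with a pattern's result is one way to spot subjects that hit the limitation described in the WARNING.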


-       The second way in which PCRE and Perl differ in  their  recursion  pro-
-       cessing  is in the handling of captured values. In Perl, when a subpat-
-       tern is called recursively or as a subpattern (see the  next  section),
-       it  has  no  access to any values that were captured outside the recur-
-       sion, whereas in PCRE these values can  be  referenced.  Consider  this
+       The  second  way  in which PCRE and Perl differ in their recursion pro-
+       cessing is in the handling of captured values. In Perl, when a  subpat-
+       tern  is  called recursively or as a subpattern (see the next section),
+       it has no access to any values that were captured  outside  the  recur-
+       sion,  whereas  in  PCRE  these values can be referenced. Consider this
        pattern:


          ^(.)(\1|a(?2))


-       In  PCRE,  this  pattern matches "bab". The first capturing parentheses
-       match "b", then in the second group, when the back reference  \1  fails
-       to  match "b", the second alternative matches "a" and then recurses. In
-       the recursion, \1 does now match "b" and so the whole  match  succeeds.
-       In  Perl,  the pattern fails to match because inside the recursive call
+       In PCRE, this pattern matches "bab". The  first  capturing  parentheses
+       match  "b",  then in the second group, when the back reference \1 fails
+       to match "b", the second alternative matches "a" and then recurses.  In
+       the  recursion,  \1 does now match "b" and so the whole match succeeds.
+       In Perl, the pattern fails to match because inside the  recursive  call
        \1 cannot access the externally set value.



SUBPATTERNS AS SUBROUTINES

-       If the syntax for a recursive subpattern call (either by number  or  by
-       name)  is  used outside the parentheses to which it refers, it operates
-       like a subroutine in a programming language. The called subpattern  may
-       be  defined  before or after the reference. A numbered reference can be
+       If  the  syntax for a recursive subpattern call (either by number or by
+       name) is used outside the parentheses to which it refers,  it  operates
+       like  a subroutine in a programming language. The called subpattern may
+       be defined before or after the reference. A numbered reference  can  be
        absolute or relative, as in these examples:


          (...(absolute)...)...(?2)...
@@ -5578,109 +5528,108 @@


          (sens|respons)e and \1ibility


-       matches "sense and sensibility" and "response and responsibility",  but
+       matches  "sense and sensibility" and "response and responsibility", but
        not "sense and responsibility". If instead the pattern


          (sens|respons)e and (?1)ibility


-       is  used, it does match "sense and responsibility" as well as the other
-       two strings. Another example is  given  in  the  discussion  of  DEFINE
+       is used, it does match "sense and responsibility" as well as the  other
+       two  strings.  Another  example  is  given  in the discussion of DEFINE
        above.
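
The difference between the back reference and the subroutine call can be approximated in Python. This is only a sketch: Python's re module has no (?1) syntax, so the call is imitated by writing the group's pattern text a second time. That stand-in is reasonable here only because no option changes or captures are involved, and it ignores the fact that PCRE treats the call as an atomic group. The names GROUP, backref and subroutine_like are illustrative.

```python
import re

GROUP = r"sens|respons"

# Back reference: \1 must repeat the text that group 1 actually matched.
backref = re.compile(rf"({GROUP})e and \1ibility")

# Stand-in for (?1): the group's pattern is tried again from scratch,
# so either alternative may match the second time.
subroutine_like = re.compile(rf"({GROUP})e and (?:{GROUP})ibility")

for subject in ("sense and sensibility",
                "response and responsibility",
                "sense and responsibility"):
    print(subject,
          bool(backref.fullmatch(subject)),
          bool(subroutine_like.fullmatch(subject)))
```

Only the third subject separates the two forms: the back reference insists on the same text, whereas the re-tried pattern does not.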


-       All  subroutine  calls, whether recursive or not, are always treated as
-       atomic groups. That is, once a subroutine has matched some of the  sub-
+       All subroutine calls, whether recursive or not, are always  treated  as
+       atomic  groups. That is, once a subroutine has matched some of the sub-
        ject string, it is never re-entered, even if it contains untried alter-
-       natives and there is  a  subsequent  matching  failure.  Any  capturing
-       parentheses  that  are  set  during the subroutine call revert to their
+       natives  and  there  is  a  subsequent  matching failure. Any capturing
+       parentheses that are set during the subroutine  call  revert  to  their
        previous values afterwards.


-       Processing options such as case-independence are fixed when  a  subpat-
-       tern  is defined, so if it is used as a subroutine, such options cannot
+       Processing  options  such as case-independence are fixed when a subpat-
+       tern is defined, so if it is used as a subroutine, such options  cannot
        be changed for different calls. For example, consider this pattern:


          (abc)(?i:(?-1))


-       It matches "abcabc". It does not match "abcABC" because the  change  of
+       It  matches  "abcabc". It does not match "abcABC" because the change of
        processing option does not affect the called subpattern.



ONIGURUMA SUBROUTINE SYNTAX

-       For  compatibility with Oniguruma, the non-Perl syntax \g followed by a
+       For compatibility with Oniguruma, the non-Perl syntax \g followed by  a
        name or a number enclosed either in angle brackets or single quotes, is
-       an  alternative  syntax  for  referencing a subpattern as a subroutine,
-       possibly recursively. Here are two of the examples used above,  rewrit-
+       an alternative syntax for referencing a  subpattern  as  a  subroutine,
+       possibly  recursively. Here are two of the examples used above, rewrit-
        ten using this syntax:


          (?<pn> \( ( (?>[^()]+) | \g<pn> )* \) )
          (sens|respons)e and \g'1'ibility


-       PCRE  supports  an extension to Oniguruma: if a number is preceded by a
+       PCRE supports an extension to Oniguruma: if a number is preceded  by  a
        plus or a minus sign it is taken as a relative reference. For example:


          (abc)(?i:\g<-1>)


-       Note that \g{...} (Perl syntax) and \g<...> (Oniguruma syntax) are  not
-       synonymous.  The former is a back reference; the latter is a subroutine
+       Note  that \g{...} (Perl syntax) and \g<...> (Oniguruma syntax) are not
+       synonymous. The former is a back reference; the latter is a  subroutine
        call.



CALLOUTS

        Perl has a feature whereby using the sequence (?{...}) causes arbitrary
-       Perl  code to be obeyed in the middle of matching a regular expression.
+       Perl code to be obeyed in the middle of matching a regular  expression.
        This makes it possible, amongst other things, to extract different sub-
        strings that match the same pair of parentheses when there is a repeti-
        tion.


        PCRE provides a similar feature, but of course it cannot obey arbitrary
        Perl code. The feature is called "callout". The caller of PCRE provides
-       an external function by putting its entry point in the global  variable
-       pcre_callout.   By default, this variable contains NULL, which disables
+       an  external function by putting its entry point in the global variable
+       pcre_callout.  By default, this variable contains NULL, which  disables
        all calling out.


-       Within a regular expression, (?C) indicates the  points  at  which  the
-       external  function  is  to be called. If you want to identify different
-       callout points, you can put a number less than 256 after the letter  C.
-       The  default  value is zero.  For example, this pattern has two callout
+       Within  a  regular  expression,  (?C) indicates the points at which the
+       external function is to be called. If you want  to  identify  different
+       callout  points, you can put a number less than 256 after the letter C.
+       The default value is zero.  For example, this pattern has  two  callout
        points:


          (?C1)abc(?C2)def


        If the PCRE_AUTO_CALLOUT flag is passed to pcre_compile(), callouts are
-       automatically  installed  before each item in the pattern. They are all
+       automatically installed before each item in the pattern. They  are  all
        numbered 255.


        During matching, when PCRE reaches a callout point (and pcre_callout is
-       set),  the  external function is called. It is provided with the number
-       of the callout, the position in the pattern, and, optionally, one  item
-       of  data  originally supplied by the caller of pcre_exec(). The callout
-       function may cause matching to proceed, to backtrack, or to fail  alto-
+       set), the external function is called. It is provided with  the  number
+       of  the callout, the position in the pattern, and, optionally, one item
+       of data originally supplied by the caller of pcre_exec().  The  callout
+       function  may cause matching to proceed, to backtrack, or to fail alto-
        gether. A complete description of the interface to the callout function
        is given in the pcrecallout documentation.



BACKTRACKING CONTROL

-       Perl 5.10 introduced a number of "Special Backtracking Control  Verbs",
+       Perl  5.10 introduced a number of "Special Backtracking Control Verbs",
        which are described in the Perl documentation as "experimental and sub-
-       ject to change or removal in a future version of Perl". It goes  on  to
-       say:  "Their usage in production code should be noted to avoid problems
+       ject  to  change or removal in a future version of Perl". It goes on to
+       say: "Their usage in production code should be noted to avoid  problems
        during upgrades." The same remarks apply to the PCRE features described
        in this section.


-       Since  these  verbs  are  specifically related to backtracking, most of
-       them can be  used  only  when  the  pattern  is  to  be  matched  using
+       Since these verbs are specifically related  to  backtracking,  most  of
+       them  can  be  used  only  when  the  pattern  is  to  be matched using
        pcre_exec(), which uses a backtracking algorithm. With the exception of
        (*FAIL), which behaves like a failing negative assertion, they cause an
        error if encountered by pcre_dfa_exec().


-       If  any of these verbs are used in an assertion or in a subpattern that
+       If any of these verbs are used in an assertion or in a subpattern  that
        is called as a subroutine (whether or not recursively), their effect is
        confined to that subpattern; it does not extend to the surrounding pat-
-       tern, with one exception: the name from a *(MARK), (*PRUNE), or (*THEN)
-       that  is  encountered in a successful positive assertion is passed back
-       when a match succeeds (compare capturing  parentheses  in  assertions).
+       tern,  with  one  exception:  a *MARK that is encountered in a positive
+       assertion is passed back (compare capturing parentheses in assertions).
        Note that such subpatterns are processed as anchored at the point where
        they are tested. Note also that Perl's treatment of subroutines is dif-
        ferent in some cases.
@@ -5703,61 +5652,59 @@
        by setting the PCRE_NO_START_OPTIMIZE  option  when  calling  pcre_com-
        pile() or pcre_exec(), or by starting the pattern with (*NO_START_OPT).


-       Experiments  with  Perl  suggest that it too has similar optimizations,
-       sometimes leading to anomalous results.
-
    Verbs that act immediately


-       The following verbs act as soon as they are encountered. They  may  not
+       The  following  verbs act as soon as they are encountered. They may not
        be followed by a name.


           (*ACCEPT)


-       This  verb causes the match to end successfully, skipping the remainder
-       of the pattern. However, when it is inside a subpattern that is  called
-       as  a  subroutine, only that subpattern is ended successfully. Matching
-       then continues at the outer level. If  (*ACCEPT)  is  inside  capturing
+       This verb causes the match to end successfully, skipping the  remainder
+       of  the pattern. However, when it is inside a subpattern that is called
+       as a subroutine, only that subpattern is ended  successfully.  Matching
+       then  continues  at  the  outer level. If (*ACCEPT) is inside capturing
        parentheses, the data so far is captured. For example:


          A((?:A|B(*ACCEPT)|C)D)


-       This  matches  "AB", "AAD", or "ACD"; when it matches "AB", "B" is cap-
+       This matches "AB", "AAD", or "ACD"; when it matches "AB", "B"  is  cap-
        tured by the outer parentheses.


          (*FAIL) or (*F)


-       This verb causes a matching failure, forcing backtracking to occur.  It
-       is  equivalent to (?!) but easier to read. The Perl documentation notes
-       that it is probably useful only when combined  with  (?{})  or  (??{}).
-       Those  are,  of course, Perl features that are not present in PCRE. The
-       nearest equivalent is the callout feature, as for example in this  pat-
+       This  verb causes a matching failure, forcing backtracking to occur. It
+       is equivalent to (?!) but easier to read. The Perl documentation  notes
+       that  it  is  probably  useful only when combined with (?{}) or (??{}).
+       Those are, of course, Perl features that are not present in  PCRE.  The
+       nearest  equivalent is the callout feature, as for example in this pat-
        tern:


          a+(?C)(*FAIL)


-       A  match  with the string "aaaa" always fails, but the callout is taken
+       A match with the string "aaaa" always fails, but the callout  is  taken
        before each backtrack happens (in this example, 10 times).
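
The figure of 10 can be checked by hand: starting at each of the four positions in "aaaa", a+ first takes every remaining "a" and then gives them back one at a time, reaching the callout 4, 3, 2 and 1 times respectively. The Python sketch below models only that counting. It does not call PCRE, the function name count_callouts is made up, and it assumes that every starting position is tried (that is, no start-of-match optimization skips any of them).

```python
def count_callouts(subject, ch="a"):
    """Count how often the callout in a+(?C)(*FAIL) would be reached."""
    count = 0
    for start in range(len(subject)):
        # Longest run of ch beginning at this starting position.
        run = 0
        while start + run < len(subject) and subject[start + run] == ch:
            run += 1
        # a+ first takes the whole run, then gives back one character at
        # a time; the callout is reached once for each length run..1.
        count += run
    return count

print(count_callouts("aaaa"))
```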


    Recording which path was taken


-       There is one verb whose main purpose  is  to  track  how  a  match  was
-       arrived  at,  though  it  also  has a secondary use in conjunction with
+       There  is  one  verb  whose  main  purpose  is to track how a match was
+       arrived at, though it also has a  secondary  use  in  conjunction  with
        advancing the match starting point (see (*SKIP) below).


          (*MARK:NAME) or (*:NAME)


-       A name is always  required  with  this  verb.  There  may  be  as  many
-       instances  of  (*MARK) as you like in a pattern, and their names do not
+       A  name  is  always  required  with  this  verb.  There  may be as many
+       instances of (*MARK) as you like in a pattern, and their names  do  not
        have to be unique.


-       When a match succeeds, the name of the last-encountered (*MARK) on  the
-       matching  path  is  passed  back  to the caller via the pcre_extra data
-       structure, as described in the section on  pcre_extra  in  the  pcreapi
-       documentation. Here is an example of pcretest output, where the /K mod-
-       ifier requests the retrieval and outputting of (*MARK) data:
+       When  a  match  succeeds,  the  name of the last-encountered (*MARK) is
+       passed back to  the  caller  via  the  pcre_extra  data  structure,  as
+       described in the section on pcre_extra in the pcreapi documentation. No
+       data is returned for a partial match. Here is an  example  of  pcretest
+       output,  where the /K modifier requests the retrieval and outputting of
+       (*MARK) data:


-           re> /X(*MARK:A)Y|X(*MARK:B)Z/K
-         data> XY
+         /X(*MARK:A)Y|X(*MARK:B)Z/K
+         XY
           0: XY
          MK: A
          XZ
@@ -5773,78 +5720,98 @@
        and passed back if it is the last-encountered. This does not happen for
        negative assertions.


-       After  a  partial match or a failed match, the name of the last encoun-
-       tered (*MARK) in the entire match process is returned. For example:
+       A  name  may  also  be  returned after a failed match if the final path
+       through the pattern involves (*MARK). However, unless (*MARK)  used  in
+       conjunction  with  (*COMMIT),  this  is unlikely to happen for an unan-
+       chored pattern because, as the starting point for matching is advanced,
+       the final check is often with an empty string, causing a failure before
+       (*MARK) is reached. For example:


-           re> /X(*MARK:A)Y|X(*MARK:B)Z/K
-         data> XP
+         /X(*MARK:A)Y|X(*MARK:B)Z/K
+         XP
+         No match
+
+       There are three potential starting points for this match (starting with
+       X,  starting  with  P,  and  with  an  empty string). If the pattern is
+       anchored, the result is different:
+
+         /^X(*MARK:A)Y|^X(*MARK:B)Z/K
+         XP
          No match, mark = B


-       Note that in this unanchored example the  mark  is  retained  from  the
-       match attempt that started at the letter "X". Subsequent match attempts
-       starting at "P" and then with an empty string do not get as far as  the
-       (*MARK) item, but nevertheless do not reset it.
+       PCRE's start-of-match optimizations can also interfere with  this.  For
+       example,  if, as a result of a call to pcre_study(), it knows the mini-
+       mum subject length for a match, a shorter subject will not  be  scanned
+       at all.


+       Note that similar anomalies (though different in detail) exist in Perl,
+       no doubt for the same reasons. The use of (*MARK) data after  a  failed
+       match  of an unanchored pattern is not recommended, unless (*COMMIT) is
+       involved.
+
    Verbs that act after backtracking


        The following verbs do nothing when they are encountered. Matching con-
-       tinues with what follows, but if there is no subsequent match,  causing
-       a  backtrack  to  the  verb, a failure is forced. That is, backtracking
-       cannot pass to the left of the verb. However, when one of  these  verbs
-       appears  inside  an atomic group, its effect is confined to that group,
-       because once the group has been matched, there is never any  backtrack-
-       ing  into  it.  In  this situation, backtracking can "jump back" to the
-       left of the entire atomic group. (Remember also, as stated above,  that
+       tinues  with what follows, but if there is no subsequent match, causing
+       a backtrack to the verb, a failure is  forced.  That  is,  backtracking
+       cannot  pass  to the left of the verb. However, when one of these verbs
+       appears inside an atomic group, its effect is confined to  that  group,
+       because  once the group has been matched, there is never any backtrack-
+       ing into it. In this situation, backtracking can  "jump  back"  to  the
+       left  of the entire atomic group. (Remember also, as stated above, that
        this localization also applies in subroutine calls and assertions.)


-       These  verbs  differ  in exactly what kind of failure occurs when back-
+       These verbs differ in exactly what kind of failure  occurs  when  back-
        tracking reaches them.


          (*COMMIT)


-       This verb, which may not be followed by a name, causes the whole  match
+       This  verb, which may not be followed by a name, causes the whole match
        to fail outright if the rest of the pattern does not match. Even if the
        pattern is unanchored, no further attempts to find a match by advancing
        the  starting  point  take  place.  Once  (*COMMIT)  has  been  passed,
-       pcre_exec() is committed to finding a match  at  the  current  starting
+       pcre_exec()  is  committed  to  finding a match at the current starting
        point, or not at all. For example:


          a+(*COMMIT)b


-       This  matches  "xxaab" but not "aacaab". It can be thought of as a kind
+       This matches "xxaab" but not "aacaab". It can be thought of as  a  kind
        of dynamic anchor, or "I've started, so I must finish." The name of the
-       most  recently passed (*MARK) in the path is passed back when (*COMMIT)
+       most recently passed (*MARK) in the path is passed back when  (*COMMIT)
        forces a match failure.


-       Note that (*COMMIT) at the start of a pattern is not  the  same  as  an
-       anchor,  unless  PCRE's start-of-match optimizations are turned off, as
+       Note  that  (*COMMIT)  at  the start of a pattern is not the same as an
+       anchor, unless PCRE's start-of-match optimizations are turned  off,  as
        shown in this pcretest example:


-           re> /(*COMMIT)abc/
-         data> xyzabc
+         /(*COMMIT)abc/
+         xyzabc
           0: abc
          xyzabc\Y
          No match


-       PCRE knows that any match must start  with  "a",  so  the  optimization
-       skips  along the subject to "a" before running the first match attempt,
-       which succeeds. When the optimization is disabled by the \Y  escape  in
+       PCRE  knows  that  any  match  must start with "a", so the optimization
+       skips along the subject to "a" before running the first match  attempt,
+       which  succeeds.  When the optimization is disabled by the \Y escape in
        the second subject, the match starts at "x" and so the (*COMMIT) causes
        it to fail without trying any other starting points.


          (*PRUNE) or (*PRUNE:NAME)


-       This verb causes the match to fail at the current starting position  in
-       the  subject  if the rest of the pattern does not match. If the pattern
-       is unanchored, the normal "bumpalong"  advance  to  the  next  starting
-       character  then happens. Backtracking can occur as usual to the left of
-       (*PRUNE), before it is reached,  or  when  matching  to  the  right  of
-       (*PRUNE),  but  if  there is no match to the right, backtracking cannot
-       cross (*PRUNE). In simple cases, the use of (*PRUNE) is just an  alter-
-       native  to an atomic group or possessive quantifier, but there are some
+       This  verb causes the match to fail at the current starting position in
+       the subject if the rest of the pattern does not match. If  the  pattern
+       is  unanchored,  the  normal  "bumpalong"  advance to the next starting
+       character then happens. Backtracking can occur as usual to the left  of
+       (*PRUNE),  before  it  is  reached,  or  when  matching to the right of
+       (*PRUNE), but if there is no match to the  right,  backtracking  cannot
+       cross  (*PRUNE). In simple cases, the use of (*PRUNE) is just an alter-
+       native to an atomic group or possessive quantifier, but there are  some
        uses of (*PRUNE) that cannot be expressed in any other way.  The behav-
-       iour  of  (*PRUNE:NAME)  is  the  same  as  (*MARK:NAME)(*PRUNE). In an
-       anchored pattern (*PRUNE) has the same effect as (*COMMIT).
+       iour of (*PRUNE:NAME) is the  same  as  (*MARK:NAME)(*PRUNE)  when  the
+       match  fails  completely;  the name is passed back if this is the final
+       attempt.  (*PRUNE:NAME) does not pass back a name  if  the  match  suc-
+       ceeds.  In  an  anchored pattern (*PRUNE) has the same effect as (*COM-
+       MIT).


          (*SKIP)


@@ -5871,66 +5838,67 @@
        is searched for the most recent (*MARK) that has the same name. If  one
        is  found, the "bumpalong" advance is to the subject position that cor-
        responds to that (*MARK) instead of to where (*SKIP)  was  encountered.
-       If no (*MARK) with a matching name is found, the (*SKIP) is ignored.
+       If  no (*MARK) with a matching name is found, normal "bumpalong" of one
+       character happens (that is, the (*SKIP) is ignored).


          (*THEN) or (*THEN:NAME)


-       This  verb  causes a skip to the next innermost alternative if the rest
-       of the pattern does not match. That is, it cancels  pending  backtrack-
-       ing,  but  only within the current alternative. Its name comes from the
+       This verb causes a skip to the next innermost alternative if  the  rest
+       of  the  pattern does not match. That is, it cancels pending backtrack-
+       ing, but only within the current alternative. Its name comes  from  the
        observation that it can be used for a pattern-based if-then-else block:


          ( COND1 (*THEN) FOO | COND2 (*THEN) BAR | COND3 (*THEN) BAZ ) ...


-       If the COND1 pattern matches, FOO is tried (and possibly further  items
-       after  the  end  of the group if FOO succeeds); on failure, the matcher
-       skips to the second alternative and tries COND2,  without  backtracking
-       into  COND1.  The  behaviour  of  (*THEN:NAME)  is  exactly the same as
-       (*MARK:NAME)(*THEN).  If (*THEN) is not inside an alternation, it  acts
-       like (*PRUNE).
+       If  the COND1 pattern matches, FOO is tried (and possibly further items
+       after the end of the group if FOO succeeds); on  failure,  the  matcher
+       skips  to  the second alternative and tries COND2, without backtracking
+       into COND1. The behaviour  of  (*THEN:NAME)  is  exactly  the  same  as
+       (*MARK:NAME)(*THEN)  if  the  overall  match  fails.  If (*THEN) is not
+       inside an alternation, it acts like (*PRUNE).


-       Note  that  a  subpattern that does not contain a | character is just a
-       part of the enclosing alternative; it is not a nested alternation  with
-       only  one alternative. The effect of (*THEN) extends beyond such a sub-
-       pattern to the enclosing alternative. Consider this pattern,  where  A,
+       Note that a subpattern that does not contain a | character  is  just  a
+       part  of the enclosing alternative; it is not a nested alternation with
+       only one alternative. The effect of (*THEN) extends beyond such a  sub-
+       pattern  to  the enclosing alternative. Consider this pattern, where A,
        B, etc. are complex pattern fragments that do not contain any | charac-
        ters at this level:


          A (B(*THEN)C) | D


-       If A and B are matched, but there is a failure in C, matching does  not
+       If  A and B are matched, but there is a failure in C, matching does not
        backtrack into A; instead it moves to the next alternative, that is, D.
-       However, if the subpattern containing (*THEN) is given an  alternative,
+       However,  if the subpattern containing (*THEN) is given an alternative,
        it behaves differently:


          A (B(*THEN)C | (*FAIL)) | D


-       The  effect of (*THEN) is now confined to the inner subpattern. After a
+       The effect of (*THEN) is now confined to the inner subpattern. After  a
        failure in C, matching moves to (*FAIL), which causes the whole subpat-
-       tern  to  fail  because  there are no more alternatives to try. In this
+       tern to fail because there are no more alternatives  to  try.  In  this
        case, matching does now backtrack into A.


        Note also that a conditional subpattern is not considered as having two
-       alternatives,  because  only  one  is  ever used. In other words, the |
+       alternatives, because only one is ever used.  In  other  words,  the  |
        character in a conditional subpattern has a different meaning. Ignoring
        white space, consider:


          ^.*? (?(?=a) a | b(*THEN)c )


-       If  the  subject  is  "ba", this pattern does not match. Because .*? is
-       ungreedy, it initially matches zero  characters.  The  condition  (?=a)
-       then  fails,  the  character  "b"  is  matched, but "c" is not. At this
-       point, matching does not backtrack to .*? as might perhaps be  expected
-       from  the  presence  of  the | character. The conditional subpattern is
+       If the subject is "ba", this pattern does not  match.  Because  .*?  is
+       ungreedy,  it  initially  matches  zero characters. The condition (?=a)
+       then fails, the character "b" is matched,  but  "c"  is  not.  At  this
+       point,  matching does not backtrack to .*? as might perhaps be expected
+       from the presence of the | character.  The  conditional  subpattern  is
        part of the single alternative that comprises the whole pattern, and so
-       the  match  fails.  (If  there was a backtrack into .*?, allowing it to
+       the match fails. (If there was a backtrack into  .*?,  allowing  it  to
        match "b", the match would succeed.)


-       The verbs just described provide four different "strengths" of  control
+       The  verbs just described provide four different "strengths" of control
        when subsequent matching fails. (*THEN) is the weakest, carrying on the
-       match at the next alternative. (*PRUNE) comes next, failing  the  match
-       at  the  current starting position, but allowing an advance to the next
-       character (for an unanchored pattern). (*SKIP) is similar, except  that
+       match  at  the next alternative. (*PRUNE) comes next, failing the match
+       at the current starting position, but allowing an advance to  the  next
+       character  (for an unanchored pattern). (*SKIP) is similar, except that
        the advance may be more than one character. (*COMMIT) is the strongest,
        causing the entire match to fail.


@@ -5940,8 +5908,8 @@

          (A(*COMMIT)B(*THEN)C|D)


-       Once A has matched, PCRE is committed to this  match,  at  the  current
-       starting  position. If subsequently B matches, but C does not, the nor-
+       Once  A  has  matched,  PCRE is committed to this match, at the current
+       starting position. If subsequently B matches, but C does not, the  nor-
        mal (*THEN) action of trying the next alternative (that is, D) does not
        happen because (*COMMIT) overrides.


@@ -5960,7 +5928,7 @@

REVISION

-       Last updated: 29 November 2011
+       Last updated: 19 October 2011
        Copyright (c) 1997-2011 University of Cambridge.
 ------------------------------------------------------------------------------


@@ -6529,19 +6497,13 @@
        been  fully  tested. If --enable-jit is set on an unsupported platform,
        compilation fails.


-       A program that is linked with PCRE 8.20 or later can tell if  JIT  sup-
-       port  is  available  by  calling pcre_config() with the PCRE_CONFIG_JIT
-       option. The result is 1 when JIT is available, and  0  otherwise.  How-
-       ever, a simple program does not need to check this in order to use JIT.
-       The API is implemented in a way that falls back to  the  ordinary  PCRE
-       code if JIT is not available.
+       A program can tell if JIT support is available by calling pcre_config()
+       with the PCRE_CONFIG_JIT option. The result is 1 when JIT is available,
+       and 0 otherwise. However, a simple program does not need to check  this
+       in order to use JIT. The API is implemented in a way that falls back to
+       the ordinary PCRE code if JIT is not available.


-       If  your program may sometimes be linked with versions of PCRE that are
-       older than 8.20, but you want to use JIT when it is available, you  can
-       test the values of PCRE_MAJOR and PCRE_MINOR, or the existence of a JIT
-       macro such as PCRE_CONFIG_JIT, for compile-time control of your code.


-
SIMPLE USE OF JIT

        You have to do two things to make use of the JIT support  in  the  sim-
@@ -6555,22 +6517,6 @@
              no longer needed instead of just freeing it yourself. This
              ensures that any JIT data is also freed.


-       For  a  program  that may be linked with pre-8.20 versions of PCRE, you
-       can insert
-
-         #ifndef PCRE_STUDY_JIT_COMPILE
-         #define PCRE_STUDY_JIT_COMPILE 0
-         #endif
-
-       so that no option is passed to pcre_study(),  and  then  use  something
-       like this to free the study data:
-
-         #ifdef PCRE_CONFIG_JIT
-             pcre_free_study(study_ptr);
-         #else
-             pcre_free(study_ptr);
-         #endif
-
        In  some circumstances you may need to call additional functions. These
        are described in the  section  entitled  "Controlling  the  JIT  stack"
        below.
@@ -6609,8 +6555,12 @@


        The unsupported pattern items are:


-         \C             match a single byte; not supported in UTF-8 mode
+         \C            match a single byte; not supported in UTF-8 mode
          (?Cn)          callouts
+         (?(<name>)...  conditional test on setting of a named subpattern
+         (?(R)...       conditional test on whole pattern recursion
+         (?(Rn)...      conditional test on recursion, by number
+         (?(R&name)...  conditional test on recursion, by name
          (*COMMIT)      )
          (*MARK)        )
          (*PRUNE)       ) the backtracking control verbs
@@ -6659,29 +6609,28 @@
        large  or  complicated  patterns  need  more  than  this.   The   error
        PCRE_ERROR_JIT_STACKLIMIT  is  given  when  there  is not enough stack.
        Three functions are provided for managing blocks of memory for  use  as
-       JIT  stacks. There is further discussion about the use of JIT stacks in
-       the section entitled "JIT stack FAQ" below.
+       JIT stacks.


-       The pcre_jit_stack_alloc() function creates a JIT stack. Its  arguments
-       are  a starting size and a maximum size, and it returns a pointer to an
-       opaque structure of type pcre_jit_stack, or NULL if there is an  error.
-       The  pcre_jit_stack_free() function can be used to free a stack that is
-       no longer needed. (For the technically minded:  the  address  space  is
+       The  pcre_jit_stack_alloc() function creates a JIT stack. Its arguments
+       are a starting size and a maximum size, and it returns a pointer to  an
+       opaque  structure of type pcre_jit_stack, or NULL if there is an error.
+       The pcre_jit_stack_free() function can be used to free a stack that  is
+       no  longer  needed.  (For  the technically minded: the address space is
        allocated by mmap or VirtualAlloc.)


-       JIT  uses far less memory for recursion than the interpretive code, and
-       a maximum stack size of 512K to 1M should be more than enough  for  any
+       JIT uses far less memory for recursion than the interpretive code,  and
+       a  maximum  stack size of 512K to 1M should be more than enough for any
        pattern.


-       The  pcre_assign_jit_stack()  function  specifies  which stack JIT code
+       The pcre_assign_jit_stack() function specifies  which  stack  JIT  code
        should use. Its arguments are as follows:


          pcre_extra         *extra
          pcre_jit_callback  callback
          void               *data


-       The extra argument must be  the  result  of  studying  a  pattern  with
-       PCRE_STUDY_JIT_COMPILE.  There  are  three  cases for the values of the
+       The  extra  argument  must  be  the  result  of studying a pattern with
+       PCRE_STUDY_JIT_COMPILE. There are three cases for  the  values  of  the
        other two options:


          (1) If callback is NULL and data is NULL, an internal 32K block
@@ -6696,18 +6645,18 @@
              is used; otherwise the return value must be a valid JIT stack,
              the result of calling pcre_jit_stack_alloc().


-       You may safely assign the same JIT stack to more than one  pattern,  as
+       You  may  safely assign the same JIT stack to more than one pattern, as
        long as they are all matched sequentially in the same thread. In a mul-
        tithread application, each thread must use its own JIT stack.


-       Strictly speaking, even more is allowed. You can assign the same  stack
-       to  any number of patterns as long as they are not used for matching by
+       Strictly  speaking, even more is allowed. You can assign the same stack
+       to any number of patterns as long as they are not used for matching  by
        multiple threads at the same time. For example, you can assign the same
-       stack  to all compiled patterns, and use a global mutex in the callback
+       stack to all compiled patterns, and use a global mutex in the  callback
        to wait until the stack is available for use. However, this is an inef-
        ficient solution, and not recommended.


-       This  is  a  suggestion  for  how a typical multithreaded program might
+       This is a suggestion for how  a  typical  multithreaded  program  might
        operate:


          During thread initalization
@@ -6719,80 +6668,12 @@
          Use a one-line callback function
            return thread_local_var


-       All the functions described in this section do nothing if  JIT  is  not
-       available,  and  pcre_assign_jit_stack()  does nothing unless the extra
-       argument is non-NULL and points to  a  pcre_extra  block  that  is  the
+       All  the  functions  described in this section do nothing if JIT is not
+       available, and pcre_assign_jit_stack() does nothing  unless  the  extra
+       argument  is  non-NULL  and  points  to  a pcre_extra block that is the
        result of a successful study with PCRE_STUDY_JIT_COMPILE.



-JIT STACK FAQ
-
-       (1) Why do we need JIT stacks?
-
-       PCRE  (and JIT) is a recursive, depth-first engine, so it needs a stack
-       where the local data of the current node is pushed before checking  its
-       child nodes.  Allocating real machine stack on some platforms is diffi-
-       cult. For example, the stack chain needs to be updated every time if we
-       extend  the  stack  on  PowerPC.  Although it is possible, its updating
-       time overhead decreases performance. So we do the recursion in memory.
-
-       (2) Why don't we simply allocate blocks of memory with malloc()?
-
-       Modern operating systems have a  nice  feature:  they  can  reserve  an
-       address space instead of allocating memory. We can safely allocate mem-
-       ory pages inside this address space, so the stack  could  grow  without
-       moving memory data (this is important because of pointers). Thus we can
-       allocate 1M address space, and use only a single memory  page  (usually
-       4K)  if  that is enough. However, we can still grow up to 1M anytime if
-       needed.
-
-       (3) Who "owns" a JIT stack?
-
-       The owner of the stack is the user program, not the JIT studied pattern
-       or  anything else. The user program must ensure that if a stack is used
-       by pcre_exec(), (that is, it is assigned to the pattern currently  run-
-       ning), that stack must not be used by any other threads (to avoid over-
-       writing the same memory area). The best practice for multithreaded pro-
-       grams  is  to  allocate  a stack for each thread, and return this stack
-       through the JIT callback function.
-
-       (4) When should a JIT stack be freed?
-
-       You can free a JIT stack at any time, as long as it will not be used by
-       pcre_exec()  again.  When  you  assign  the  stack to a pattern, only a
-       pointer is set. There is no reference counting or any other magic.  You
-       can  free  the  patterns  and stacks in any order, anytime. Just do not
-       call pcre_exec() with a pattern pointing to an already freed stack,  as
-       that  will cause SEGFAULT. (Also, do not free a stack currently used by
-       pcre_exec() in another thread). You can also replace the  stack  for  a
-       pattern  at  any  time.  You  can  even  free the previous stack before
-       assigning a replacement.
-
-       (5) Should I allocate/free a  stack  every  time  before/after  calling
-       pcre_exec()?
-
-       No,  because  this  is  too  costly in terms of resources. However, you
-       could implement some clever idea which release the stack if it  is  not
-       used in let's say two minutes. The JIT callback can help to achive this
-       without keeping a list of the currently JIT studied patterns.
-
-       (6) OK, the stack is for long term memory allocation. But what  happens
-       if  a pattern causes stack overflow with a stack of 1M? Is that 1M kept
-       until the stack is freed?
-
-       Especially on embedded sytems, it might be a good idea to release  mem-
-       ory  sometimes  without  freeing the stack. There is no API for this at
-       the moment. Probably a function call which returns with  the  currently
-       allocated  memory for any stack and another which allows releasing mem-
-       ory (shrinking the stack) would be a good idea if someone needs this.
-
-       (7) This is too much of a headache. Isn't there any better solution for
-       JIT stack handling?
-
-       No,  thanks to Windows. If POSIX threads were used everywhere, we could
-       throw out this complicated API.
-
-
 EXAMPLE CODE


        This is a single-threaded example that specifies a  JIT  stack  without
@@ -6824,14 +6705,14 @@


AUTHOR

-       Philip Hazel (FAQ by Zoltan Herczeg)
+       Philip Hazel
        University Computing Service
        Cambridge CB2 3QH, England.



REVISION

-       Last updated: 26 November 2011
+       Last updated: 19 October 2011
        Copyright (c) 1997-2011 University of Cambridge.
 ------------------------------------------------------------------------------


@@ -8272,12 +8153,6 @@
        There is no limit to the number of parenthesized subpatterns, but there
        can be no more than 65535 capturing subpatterns.


-       There is a limit to the number of forward references to subsequent sub-
-       patterns of around 200,000.  Repeated  forward  references  with  fixed
-       upper  limits,  for example, (?2){0,100} when subpattern number 2 is to
-       the right, are included in the count. There is no limit to  the  number
-       of backward references.
-
        The maximum length of name for a named subpattern is 32 characters, and
        the maximum number of named subpatterns is 10000.


@@ -8298,7 +8173,7 @@

REVISION

-       Last updated: 30 November 2011
+       Last updated: 24 August 2011
        Copyright (c) 1997-2011 University of Cambridge.
 ------------------------------------------------------------------------------



Modified: code/trunk/doc/pcreapi.3
===================================================================
--- code/trunk/doc/pcreapi.3    2011-12-28 15:53:12 UTC (rev 834)
+++ code/trunk/doc/pcreapi.3    2011-12-28 16:10:09 UTC (rev 835)
@@ -644,18 +644,18 @@
 pattern such as (\e1)(a) succeeds when this option is set (assuming it can find
 an "a" in the subject), whereas it fails by default, for Perl compatibility.
 .P
-(3) \eU matches an upper case "U" character; by default \eU causes a compile
+(3) \eU matches an upper case "U" character; by default \eU causes a compile 
 time error (Perl uses \eU to upper case subsequent characters).
 .P
-(4) \eu matches a lower case "u" character unless it is followed by four
-hexadecimal digits, in which case the hexadecimal number defines the code point
-to match. By default, \eu causes a compile time error (Perl uses it to upper
+(4) \eu matches a lower case "u" character unless it is followed by four 
+hexadecimal digits, in which case the hexadecimal number defines the code point 
+to match. By default, \eu causes a compile time error (Perl uses it to upper 
 case the following character).
 .P
-(5) \ex matches a lower case "x" character unless it is followed by two
-hexadecimal digits, in which case the hexadecimal number defines the code point
-to match. By default, as in Perl, a hexadecimal number is always expected after
-\ex, but it may have zero, one, or two digits (so, for example, \exz matches a
+(5) \ex matches a lower case "x" character unless it is followed by two 
+hexadecimal digits, in which case the hexadecimal number defines the code point 
+to match. By default, as in Perl, a hexadecimal number is always expected after 
+\ex, but it may have zero, one, or two digits (so, for example, \exz matches a 
 binary zero character followed by z).
 .sp
   PCRE_MULTILINE
@@ -1147,12 +1147,6 @@
 .\"
 documentation for details of what can and cannot be handled.
 .sp
-  PCRE_INFO_JITSIZE
-.sp
-If the pattern was successfully studied with the PCRE_STUDY_JIT_COMPILE option,
-return the size of the JIT compiled code, otherwise return zero. The fourth
-argument should point to a \fBsize_t\fP variable.
-.sp
   PCRE_INFO_LASTLITERAL
 .sp
 Return the value of the rightmost literal byte that must exist in any matched
@@ -1268,13 +1262,10 @@
 .sp
   PCRE_INFO_SIZE
 .sp
-Return the size of the compiled pattern. The fourth argument should point to a
-\fBsize_t\fP variable. This value does not include the size of the \fBpcre\fP
-structure that is returned by \fBpcre_compile()\fP. The value that is passed as
-the argument to \fBpcre_malloc()\fP when \fBpcre_compile()\fP is getting memory
-in which to place the compiled data is the value returned by this option plus
-the size of the \fBpcre\fP structure. Studying a compiled pattern, with or
-without JIT, does not alter the value returned by this option.
+Return the size of the compiled pattern, that is, the value that was passed as
+the argument to \fBpcre_malloc()\fP when PCRE was getting memory in which to
+place the compiled data. The fourth argument should point to a \fBsize_t\fP
+variable.
 .sp
   PCRE_INFO_STUDYSIZE
 .sp
@@ -2553,6 +2544,6 @@
 .rs
 .sp
 .nf
-Last updated: 02 December 2011
+Last updated: 14 November 2011
 Copyright (c) 1997-2011 University of Cambridge.
 .fi
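
The pcreapi.3 hunk above describes how \x is read with and without PCRE_JAVASCRIPT_COMPAT. The following is a minimal sketch of that rule only, not PCRE's own parser: the helper names are hypothetical, the \x{hhh..} form is deliberately left out, and an ASCII character set is assumed.

```c
#include <ctype.h>
#include <stddef.h>

/* Value of one hexadecimal digit; the caller guarantees isxdigit(c).
   Assumes an ASCII-compatible character set. */
static int hex_value(unsigned char c)
{
    if (isdigit(c)) return c - '0';
    return tolower(c) - 'a' + 10;
}

/* Hypothetical helper, not part of PCRE. `s` points just past "\x".
   Returns the character value to match and stores in *used the number
   of characters after "\x" that were consumed.

   Default mode: zero, one, or two hex digits are read, so "\xz" gives
   the value 0 and leaves "z" unread.
   JavaScript-compatible mode: exactly two hex digits are required;
   otherwise the escape stands for a literal "x" and nothing is consumed. */
static int parse_x_escape(const char *s, int javascript_compat, size_t *used)
{
    int value = 0;
    size_t n = 0;

    while (n < 2 && isxdigit((unsigned char)s[n])) {
        value = value * 16 + hex_value((unsigned char)s[n]);
        n++;
    }
    if (javascript_compat && n != 2) {
        *used = 0;
        return 'x';
    }
    *used = n;
    return value;
}
```

Called on the text after "\x", parse_x_escape("z", 0, &used) yields a binary zero with nothing consumed, while the same call with the second argument set to 1 yields a literal "x".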


Modified: code/trunk/doc/pcrecallout.3
===================================================================
--- code/trunk/doc/pcrecallout.3    2011-12-28 15:53:12 UTC (rev 834)
+++ code/trunk/doc/pcrecallout.3    2011-12-28 16:10:09 UTC (rev 835)
@@ -160,10 +160,9 @@
 .P
 The \fImark\fP field is present from version 2 of the \fIpcre_callout\fP
 structure. In callouts from \fBpcre_exec()\fP it contains a pointer to the
-zero-terminated name of the most recently passed (*MARK), (*PRUNE), or (*THEN)
-item in the match, or NULL if no such items have been passed. Instances of
-(*PRUNE) or (*THEN) without a name do not obliterate a previous (*MARK). In
-callouts from \fBpcre_dfa_exec()\fP this field always contains NULL.
+zero-terminated name of the most recently passed (*MARK) item in the match, or
+NULL if there are no (*MARK)s in the current matching path. In callouts from
+\fBpcre_dfa_exec()\fP this field always contains NULL.
 .
 .
 .SH "RETURN VALUES"
@@ -196,6 +195,6 @@
 .rs
 .sp
 .nf
-Last updated: 30 November 2011
+Last updated: 26 August 2011
 Copyright (c) 1997-2011 University of Cambridge.
 .fi


Modified: code/trunk/doc/pcrecompat.3
===================================================================
--- code/trunk/doc/pcrecompat.3    2011-12-28 15:53:12 UTC (rev 834)
+++ code/trunk/doc/pcrecompat.3    2011-12-28 16:10:09 UTC (rev 835)
@@ -38,7 +38,7 @@
 own, matching a non-newline character, is supported.) In fact these are
 implemented by Perl's general string-handling and are not part of its pattern
 matching engine. If any of these are encountered by PCRE, an error is
-generated by default. However, if the PCRE_JAVASCRIPT_COMPAT option is set,
+generated by default. However, if the PCRE_JAVASCRIPT_COMPAT option is set, 
 \eU and \eu are interpreted as JavaScript interprets them.
 .P
 6. The Perl escape sequences \ep, \eP, and \eX are supported only if PCRE is


Modified: code/trunk/doc/pcrejit.3
===================================================================
--- code/trunk/doc/pcrejit.3    2011-12-28 15:53:12 UTC (rev 834)
+++ code/trunk/doc/pcrejit.3    2011-12-28 16:10:09 UTC (rev 835)
@@ -34,16 +34,11 @@
 fully tested. If --enable-jit is set on an unsupported platform, compilation
 fails.
 .P
-A program that is linked with PCRE 8.20 or later can tell if JIT support is
-available by calling \fBpcre_config()\fP with the PCRE_CONFIG_JIT option. The
-result is 1 when JIT is available, and 0 otherwise. However, a simple program
-does not need to check this in order to use JIT. The API is implemented in a
-way that falls back to the ordinary PCRE code if JIT is not available.
-.P
-If your program may sometimes be linked with versions of PCRE that are older
-than 8.20, but you want to use JIT when it is available, you can test
-the values of PCRE_MAJOR and PCRE_MINOR, or the existence of a JIT macro such
-as PCRE_CONFIG_JIT, for compile-time control of your code.
+A program can tell if JIT support is available by calling \fBpcre_config()\fP
+with the PCRE_CONFIG_JIT option. The result is 1 when JIT is available, and 0
+otherwise. However, a simple program does not need to check this in order to
+use JIT. The API is implemented in a way that falls back to the ordinary PCRE
+code if JIT is not available.
 .
 .
 .SH "SIMPLE USE OF JIT"
@@ -59,21 +54,6 @@
       no longer needed instead of just freeing it yourself. This
       ensures that any JIT data is also freed.
 .sp
-For a program that may be linked with pre-8.20 versions of PCRE, you can insert
-.sp
-  #ifndef PCRE_STUDY_JIT_COMPILE
-  #define PCRE_STUDY_JIT_COMPILE 0
-  #endif
-.sp
-so that no option is passed to \fBpcre_study()\fP, and then use something like
-this to free the study data:
-.sp
-  #ifdef PCRE_CONFIG_JIT
-      pcre_free_study(study_ptr);
-  #else
-      pcre_free(study_ptr);
-  #endif
-.sp
 In some circumstances you may need to call additional functions. These are
 described in the section entitled
 .\" HTML <a href="#stackcontrol">
@@ -115,7 +95,7 @@
 .P
 The unsupported pattern items are:
 .sp
-  \eC             match a single byte; not supported in UTF-8 mode
+  \eC            match a single byte; not supported in UTF-8 mode
   (?Cn)          callouts
   (*COMMIT)      )
   (*MARK)        )
@@ -173,13 +153,7 @@
 By default, it uses 32K on the machine stack. However, some large or
 complicated patterns need more than this. The error PCRE_ERROR_JIT_STACKLIMIT
 is given when there is not enough stack. Three functions are provided for
-managing blocks of memory for use as JIT stacks. There is further discussion
-about the use of JIT stacks in the section entitled
-.\" HTML <a href="#stackcontrol">
-.\" </a>
-"JIT stack FAQ"
-.\"
-below.
+managing blocks of memory for use as JIT stacks.
 .P
 The \fBpcre_jit_stack_alloc()\fP function creates a JIT stack. Its arguments
 are a starting size and a maximum size, and it returns a pointer to an opaque
@@ -243,74 +217,6 @@
 successful study with PCRE_STUDY_JIT_COMPILE.
 .
 .
-.\" HTML <a name="stackfaq"></a>
-.SH "JIT STACK FAQ"
-.rs
-.sp
-(1) Why do we need JIT stacks?
-.sp
-PCRE (and JIT) is a recursive, depth-first engine, so it needs a stack where
-the local data of the current node is pushed before checking its child nodes.
-Allocating real machine stack on some platforms is difficult. For example, the
-stack chain needs to be updated every time if we extend the stack on PowerPC.
-Although it is possible, its updating time overhead decreases performance. So
-we do the recursion in memory.
-.P
-(2) Why don't we simply allocate blocks of memory with \fBmalloc()\fP?
-.sp
-Modern operating systems have a nice feature: they can reserve an address space
-instead of allocating memory. We can safely allocate memory pages inside this
-address space, so the stack could grow without moving memory data (this is
-important because of pointers). Thus we can allocate 1M address space, and use
-only a single memory page (usually 4K) if that is enough. However, we can still
-grow up to 1M anytime if needed.
-.P
-(3) Who "owns" a JIT stack?
-.sp
-The owner of the stack is the user program, not the JIT studied pattern or
-anything else. The user program must ensure that if a stack is used by
-\fBpcre_exec()\fP, (that is, it is assigned to the pattern currently running),
-that stack must not be used by any other threads (to avoid overwriting the same
-memory area). The best practice for multithreaded programs is to allocate a
-stack for each thread, and return this stack through the JIT callback function.
-.P
-(4) When should a JIT stack be freed?
-.sp
-You can free a JIT stack at any time, as long as it will not be used by
-\fBpcre_exec()\fP again. When you assign the stack to a pattern, only a pointer
-is set. There is no reference counting or any other magic. You can free the
-patterns and stacks in any order, anytime. Just \fIdo not\fP call
-\fBpcre_exec()\fP with a pattern pointing to an already freed stack, as that
-will cause SEGFAULT. (Also, do not free a stack currently used by
-\fBpcre_exec()\fP in another thread). You can also replace the stack for a
-pattern at any time. You can even free the previous stack before assigning a
-replacement.
-.P
-(5) Should I allocate/free a stack every time before/after calling
-\fBpcre_exec()\fP?
-.sp
-No, because this is too costly in terms of resources. However, you could
-implement some clever idea which release the stack if it is not used in let's
-say two minutes. The JIT callback can help to achive this without keeping a
-list of the currently JIT studied patterns.
-.P
-(6) OK, the stack is for long term memory allocation. But what happens if a
-pattern causes stack overflow with a stack of 1M? Is that 1M kept until the
-stack is freed?
-.sp
-Especially on embedded sytems, it might be a good idea to release
-memory sometimes without freeing the stack. There is no API for this at the
-moment. Probably a function call which returns with the currently allocated
-memory for any stack and another which allows releasing memory (shrinking the
-stack) would be a good idea if someone needs this.
-.P
-(7) This is too much of a headache. Isn't there any better solution for JIT
-stack handling?
-.sp
-No, thanks to Windows. If POSIX threads were used everywhere, we could throw
-out this complicated API.
-.
-.
 .SH "EXAMPLE CODE"
 .rs
 .sp
@@ -347,7 +253,7 @@
 .rs
 .sp
 .nf
-Philip Hazel (FAQ by Zoltan Herczeg)
+Philip Hazel
 University Computing Service
 Cambridge CB2 3QH, England.
 .fi
@@ -357,6 +263,6 @@
 .rs
 .sp
 .nf
-Last updated: 26 November 2011
+Last updated: 15 November 2011
 Copyright (c) 1997-2011 University of Cambridge.
 .fi
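
The text removed from pcrejit.3 above suggests testing PCRE_MAJOR and PCRE_MINOR when a program may be built against releases older than 8.20. The following is a minimal sketch of that comparison. The helper name is hypothetical and not part of PCRE; in a real program the arguments would be the PCRE_MAJOR and PCRE_MINOR macros from pcre.h, and the same test could be written in an #if directive.

```c
/* Hypothetical helper, not part of PCRE: returns 1 when the given
   version is 8.20 or later, the first release that the removed text
   associates with the JIT interface, and 0 otherwise. */
static int version_has_jit_api(int major, int minor)
{
    return major > 8 || (major == 8 && minor >= 20);
}
```

The major number is compared first, so a later major release with a small minor number still counts as newer than 8.20.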


Modified: code/trunk/doc/pcrelimits.3
===================================================================
--- code/trunk/doc/pcrelimits.3    2011-12-28 15:53:12 UTC (rev 834)
+++ code/trunk/doc/pcrelimits.3    2011-12-28 16:10:09 UTC (rev 835)
@@ -23,11 +23,6 @@
 There is no limit to the number of parenthesized subpatterns, but there can be
 no more than 65535 capturing subpatterns.
 .P
-There is a limit to the number of forward references to subsequent subpatterns
-of around 200,000. Repeated forward references with fixed upper limits, for
-example, (?2){0,100} when subpattern number 2 is to the right, are included in
-the count. There is no limit to the number of backward references.
-.P
 The maximum length of name for a named subpattern is 32 characters, and the
 maximum number of named subpatterns is 10000.
 .P
@@ -57,6 +52,6 @@
 .rs
 .sp
 .nf
-Last updated: 30 November 2011
+Last updated: 24 August 2011
 Copyright (c) 1997-2011 University of Cambridge.
 .fi


Modified: code/trunk/doc/pcrepattern.3
===================================================================
--- code/trunk/doc/pcrepattern.3    2011-12-28 15:53:12 UTC (rev 834)
+++ code/trunk/doc/pcrepattern.3    2011-12-28 16:10:09 UTC (rev 835)
@@ -242,7 +242,7 @@
   \eddd      character with octal code ddd, or back reference
   \exhh      character with hex code hh
   \ex{hhh..} character with hex code hhh.. (non-JavaScript mode)
-  \euhhhh    character with hex code hhhh (JavaScript mode only)
+  \euhhhh    character with hex code hhhh (JavaScript mode only) 
 .sp
 The precise effect of \ecx is as follows: if x is a lower case letter, it
 is converted to upper case. Then bit 6 of the character (hex 40) is inverted.
@@ -265,15 +265,15 @@
 initial \ex will be interpreted as a basic hexadecimal escape, with no
 following digits, giving a character whose value is zero.
 .P
-If the PCRE_JAVASCRIPT_COMPAT option is set, the interpretation of \ex is
-as just described only when it is followed by two hexadecimal digits.
+If the PCRE_JAVASCRIPT_COMPAT option is set, the interpretation of \ex is 
+as just described only when it is followed by two hexadecimal digits. 
 Otherwise, it matches a literal "x" character. In JavaScript mode, support for
-code points greater than 256 is provided by \eu, which must be followed by
+code points greater than 256 is provided by \eu, which must be followed by 
 four hexadecimal digits; otherwise it matches a literal "u" character.
 .P
 Characters whose value is less than 256 can be defined by either of the two
 syntaxes for \ex (or by \eu in JavaScript mode). There is no difference in the
-way they are handled. For example, \exdc is exactly the same as \ex{dc} (or
+way they are handled. For example, \exdc is exactly the same as \ex{dc} (or 
 \eu00dc in JavaScript mode).
 .P
 After \e0 up to two further octal digits are read. If there are fewer than two
@@ -328,14 +328,12 @@
 zero, because no more than three octal digits are ever read.
 .P
 All the sequences that define a single character value can be used both inside
-and outside character classes. In addition, inside a character class, \eb is
-interpreted as the backspace character (hex 08).
-.P
-\eN is not allowed in a character class. \eB, \eR, and \eX are not special
-inside a character class. Like other unrecognized escape sequences, they are
-treated as the literal characters "B", "R", and "X" by default, but cause an
-error if the PCRE_EXTRA option is set. Outside a character class, these
-sequences have different meanings.
+and outside character classes. In addition, inside a character class, the
+sequence \eb is interpreted as the backspace character (hex 08). The sequences
+\eB, \eN, \eR, and \eX are not special inside a character class. Like any other
+unrecognized escape sequences, they are treated as the literal characters "B",
+"N", "R", and "X" by default, but cause an error if the PCRE_EXTRA option is
+set. Outside a character class, these sequences have different meanings.
 .
 .
 .SS "Unsupported escape sequences"
@@ -407,7 +405,7 @@
 .\" </a>
 the "." metacharacter
 .\"
-when PCRE_DOTALL is not set. Perl also uses \eN to match characters by name;
+when PCRE_DOTALL is not set. Perl also uses \eN to match characters by name; 
 PCRE does not support this.
 .P
 Each pair of lower and upper case escape sequences partitions the complete set
@@ -2569,11 +2567,10 @@
 If any of these verbs are used in an assertion or in a subpattern that is
 called as a subroutine (whether or not recursively), their effect is confined
 to that subpattern; it does not extend to the surrounding pattern, with one
-exception: the name from a *(MARK), (*PRUNE), or (*THEN) that is encountered in
-a successful positive assertion \fIis\fP passed back when a match succeeds
-(compare capturing parentheses in assertions). Note that such subpatterns are
-processed as anchored at the point where they are tested. Note also that Perl's
-treatment of subroutines is different in some cases.
+exception: a *MARK that is encountered in a positive assertion \fIis\fP passed
+back (compare capturing parentheses in assertions). Note that such subpatterns
+are processed as anchored at the point where they are tested. Note also that
+Perl's treatment of subroutines is different in some cases.
 .P
 The new verbs make use of what was previously invalid syntax: an opening
 parenthesis followed by an asterisk. They are generally of the form
@@ -2592,9 +2589,6 @@
 the start-of-match optimizations by setting the PCRE_NO_START_OPTIMIZE option
 when calling \fBpcre_compile()\fP or \fBpcre_exec()\fP, or by starting the
 pattern with (*NO_START_OPT).
-.P
-Experiments with Perl suggest that it too has similar optimizations, sometimes
-leading to anomalous results.
 .
 .
 .SS "Verbs that act immediately"
@@ -2642,9 +2636,8 @@
 A name is always required with this verb. There may be as many instances of
 (*MARK) as you like in a pattern, and their names do not have to be unique.
 .P
-When a match succeeds, the name of the last-encountered (*MARK) on the matching
-path is passed back to the caller via the \fIpcre_extra\fP data structure, as
-described in the
+When a match succeeds, the name of the last-encountered (*MARK) is passed back
+to the caller via the \fIpcre_extra\fP data structure, as described in the
 .\" HTML <a href="pcreapi.html#extradata">
 .\" </a>
 section on \fIpcre_extra\fP
@@ -2653,11 +2646,12 @@
 .\" HREF
 \fBpcreapi\fP
 .\"
-documentation. Here is an example of \fBpcretest\fP output, where the /K
-modifier requests the retrieval and outputting of (*MARK) data:
+documentation. No data is returned for a partial match. Here is an example of
+\fBpcretest\fP output, where the /K modifier requests the retrieval and
+outputting of (*MARK) data:
 .sp
-    re> /X(*MARK:A)Y|X(*MARK:B)Z/K
-  data> XY
+  /X(*MARK:A)Y|X(*MARK:B)Z/K
+  XY
    0: XY
   MK: A
   XZ
@@ -2673,17 +2667,31 @@
 passed back if it is the last-encountered. This does not happen for negative
 assertions.
 .P
-After a partial match or a failed match, the name of the last encountered
-(*MARK) in the entire match process is returned. For example:
+A name may also be returned after a failed match if the final path through the
+pattern involves (*MARK). However, unless (*MARK) used in conjunction with
+(*COMMIT), this is unlikely to happen for an unanchored pattern because, as the
+starting point for matching is advanced, the final check is often with an empty
+string, causing a failure before (*MARK) is reached. For example:
 .sp
-    re> /X(*MARK:A)Y|X(*MARK:B)Z/K
-  data> XP
+  /X(*MARK:A)Y|X(*MARK:B)Z/K
+  XP
+  No match
+.sp
+There are three potential starting points for this match (starting with X,
+starting with P, and with an empty string). If the pattern is anchored, the
+result is different:
+.sp
+  /^X(*MARK:A)Y|^X(*MARK:B)Z/K
+  XP
   No match, mark = B
 .sp
-Note that in this unanchored example the mark is retained from the match
-attempt that started at the letter "X". Subsequent match attempts starting at
-"P" and then with an empty string do not get as far as the (*MARK) item, but
-nevertheless do not reset it.
+PCRE's start-of-match optimizations can also interfere with this. For example,
+if, as a result of a call to \fBpcre_study()\fP, it knows the minimum
+subject length for a match, a shorter subject will not be scanned at all.
+.P
+Note that similar anomalies (though different in detail) exist in Perl, no
+doubt for the same reasons. The use of (*MARK) data after a failed match of an
+unanchored pattern is not recommended, unless (*COMMIT) is involved.
 .
 .
 .SS "Verbs that act after backtracking"
@@ -2720,8 +2728,8 @@
 unless PCRE's start-of-match optimizations are turned off, as shown in this
 \fBpcretest\fP example:
 .sp
-    re> /(*COMMIT)abc/
-  data> xyzabc
+  /(*COMMIT)abc/
+  xyzabc
    0: abc
   xyzabc\eY
   No match
@@ -2742,8 +2750,10 @@
 the right, backtracking cannot cross (*PRUNE). In simple cases, the use of
 (*PRUNE) is just an alternative to an atomic group or possessive quantifier,
 but there are some uses of (*PRUNE) that cannot be expressed in any other way.
-The behaviour of (*PRUNE:NAME) is the same as (*MARK:NAME)(*PRUNE). In an
-anchored pattern (*PRUNE) has the same effect as (*COMMIT).
+The behaviour of (*PRUNE:NAME) is the same as (*MARK:NAME)(*PRUNE) when the
+match fails completely; the name is passed back if this is the final attempt.
+(*PRUNE:NAME) does not pass back a name if the match succeeds. In an anchored
+pattern (*PRUNE) has the same effect as (*COMMIT).
 .sp
   (*SKIP)
 .sp
@@ -2769,7 +2779,8 @@
 searched for the most recent (*MARK) that has the same name. If one is found,
 the "bumpalong" advance is to the subject position that corresponds to that
 (*MARK) instead of to where (*SKIP) was encountered. If no (*MARK) with a
-matching name is found, the (*SKIP) is ignored.
+matching name is found, normal "bumpalong" of one character happens (that is,
+the (*SKIP) is ignored).
 .sp
   (*THEN) or (*THEN:NAME)
 .sp
@@ -2783,8 +2794,9 @@
 If the COND1 pattern matches, FOO is tried (and possibly further items after
 the end of the group if FOO succeeds); on failure, the matcher skips to the
 second alternative and tries COND2, without backtracking into COND1. The
-behaviour of (*THEN:NAME) is exactly the same as (*MARK:NAME)(*THEN).
-If (*THEN) is not inside an alternation, it acts like (*PRUNE).
+behaviour of (*THEN:NAME) is exactly the same as (*MARK:NAME)(*THEN) if the
+overall match fails. If (*THEN) is not inside an alternation, it acts like
+(*PRUNE).
 .P
 Note that a subpattern that does not contain a | character is just a part of
 the enclosing alternative; it is not a nested alternation with only one
@@ -2862,6 +2874,6 @@
 .rs
 .sp
 .nf
-Last updated: 29 November 2011
+Last updated: 19 November 2011
 Copyright (c) 1997-2011 University of Cambridge.
 .fi
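
As a rough illustration of the pcrepattern.3 text above: an unanchored match
against "XP" has three potential starting points (at X, at P, and at the empty
string at the end), while an anchored pattern is tried only at the first. The
sketch below models only this "bumpalong" of start offsets. It uses Python's re
module as a stand-in, which is an assumption made for illustration: re has no
(*MARK) or (*COMMIT), the helper name start_points is hypothetical, and real
PCRE may skip attempts altogether through its start-of-match optimizations.

```python
import re


def start_points(pattern, subject, anchored=False):
    """Return the subject offsets at which a match attempt is started.

    Hypothetical helper: it mimics the bumpalong loop of a backtracking
    matcher by trying the pattern at offset 0, then 1, and so on, up to
    and including the empty string at the end of the subject.
    """
    compiled = re.compile(pattern)
    # An anchored pattern is only ever tried at the start of the subject.
    last_offset = 0 if anchored else len(subject)
    tried = []
    for offset in range(last_offset + 1):
        tried.append(offset)
        if compiled.match(subject, offset):
            break  # a successful attempt ends the search
    return tried


if __name__ == "__main__":
    print(start_points(r"XY|XZ", "XP"))                 # unanchored
    print(start_points(r"XY|XZ", "XP", anchored=True))  # anchored
```

In the unanchored case the last attempt starts at the empty string, so it
fails before any (*MARK) in the pattern could be reached. This is why the text
above advises against relying on (*MARK) data after a failed unanchored match.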


Modified: code/trunk/doc/pcretest.1
===================================================================
--- code/trunk/doc/pcretest.1    2011-12-28 15:53:12 UTC (rev 834)
+++ code/trunk/doc/pcretest.1    2011-12-28 16:10:09 UTC (rev 835)
@@ -319,10 +319,7 @@
 which it appears.
 .P
 The \fB/M\fP modifier causes the size of memory block used to hold the compiled
-pattern to be output. This does not include the size of the \fBpcre\fP block;
-it is just the actual compiled data. If the pattern is successfully studied
-with the PCRE_STUDY_JIT_COMPILE option, the size of the JIT compiled code is
-also output.
+pattern to be output.
 .P
 If the \fB/S\fP modifier appears once, it causes \fBpcre_study()\fP to be
 called after the expression has been compiled, and the results used when the
@@ -878,6 +875,6 @@
 .rs
 .sp
 .nf
-Last updated: 02 December 2011
+Last updated: 26 August 2011
 Copyright (c) 1997-2011 University of Cambridge.
 .fi


Modified: code/trunk/doc/pcretest.txt
===================================================================
--- code/trunk/doc/pcretest.txt    2011-12-28 15:53:12 UTC (rev 834)
+++ code/trunk/doc/pcretest.txt    2011-12-28 16:10:09 UTC (rev 835)
@@ -305,33 +305,30 @@
        it appears.


        The /M modifier causes the size of memory block used to hold  the  com-
-       piled  pattern to be output. This does not include the size of the pcre
-       block; it is just the actual compiled data. If the pattern is  success-
-       fully  studied  with the PCRE_STUDY_JIT_COMPILE option, the size of the
-       JIT compiled code is also output.
+       piled pattern to be output.


-       If the /S modifier appears once, it causes pcre_study()  to  be  called
-       after  the  expression has been compiled, and the results used when the
-       expression is matched. If /S appears  twice,  it  suppresses  studying,
+       If  the  /S  modifier appears once, it causes pcre_study() to be called
+       after the expression has been compiled, and the results used  when  the
+       expression  is  matched.  If  /S appears twice, it suppresses studying,
        even if it was requested externally by the -s command line option. This
-       makes it possible to specify that certain patterns are always  studied,
+       makes  it possible to specify that certain patterns are always studied,
        and others are never studied, independently of -s. This feature is used
        in the test files in a few cases where the output is different when the
        pattern is studied.


-       If  the  /S modifier is immediately followed by a + character, the call
-       to  pcre_study()  is  made  with  the  PCRE_STUDY_JIT_COMPILE   option,
-       requesting  just-in-time  optimization support if it is available. Note
-       that there is also a /+ modifier; it  must  not  be  given  immediately
-       after  /S  because this will be misinterpreted. If JIT studying is suc-
-       cessful, it will automatically be used when pcre_exec() is run,  except
-       when  incompatible  run-time  options  are specified. These include the
+       If the /S modifier is immediately followed by a + character,  the  call
+       to   pcre_study()  is  made  with  the  PCRE_STUDY_JIT_COMPILE  option,
+       requesting just-in-time optimization support if it is  available.  Note
+       that  there  is  also  a  /+ modifier; it must not be given immediately
+       after /S because this will be misinterpreted. If JIT studying  is  suc-
+       cessful,  it will automatically be used when pcre_exec() is run, except
+       when incompatible run-time options are  specified.  These  include  the
        partial matching options; a complete list is given in the pcrejit docu-
-       mentation.  See  also the \J escape sequence below for a way of setting
+       mentation. See also the \J escape sequence below for a way  of  setting
        the size of the JIT stack.


-       The /T modifier must be followed by a single digit. It  causes  a  spe-
-       cific  set of built-in character tables to be passed to pcre_compile().
+       The  /T  modifier  must be followed by a single digit. It causes a spe-
+       cific set of built-in character tables to be passed to  pcre_compile().
        It is used in the standard PCRE tests to check behaviour with different
        character tables. The digit specifies the tables as follows:


@@ -339,12 +336,12 @@
                pcre_chartables.c.dist
          1   a set of tables defining ISO 8859 characters


-       In  table 1, some characters whose codes are greater than 128 are iden-
+       In table 1, some characters whose codes are greater than 128 are  iden-
        tified as letters, digits, spaces, etc.


    Using the POSIX wrapper API


-       The /P modifier causes pcretest to call PCRE via the POSIX wrapper  API
+       The  /P modifier causes pcretest to call PCRE via the POSIX wrapper API
        rather than its native API. When /P is set, the following modifiers set
        options for the regcomp() function:


@@ -356,17 +353,17 @@
          /W    REG_UCP        )   the POSIX standard
          /8    REG_UTF8       )


-       The /+ modifier works as  described  above.  All  other  modifiers  are
+       The  /+  modifier  works  as  described  above. All other modifiers are
        ignored.



DATA LINES

-       Before  each  data  line is passed to pcre_exec(), leading and trailing
-       white space is removed, and it is then scanned for \ escapes.  Some  of
-       these  are  pretty esoteric features, intended for checking out some of
-       the more complicated features of PCRE. If you are just  testing  "ordi-
-       nary"  regular  expressions,  you probably don't need any of these. The
+       Before each data line is passed to pcre_exec(),  leading  and  trailing
+       white  space  is removed, and it is then scanned for \ escapes. Some of
+       these are pretty esoteric features, intended for checking out  some  of
+       the  more  complicated features of PCRE. If you are just testing "ordi-
+       nary" regular expressions, you probably don't need any  of  these.  The
        following escapes are recognized:


          \a         alarm (BEL, \x07)
@@ -447,95 +444,95 @@
          \<any>     pass the PCRE_NEWLINE_ANY option to pcre_exec()
                       or pcre_dfa_exec()


-       Note that \xhh always specifies one byte,  even  in  UTF-8  mode;  this
+       Note  that  \xhh  always  specifies  one byte, even in UTF-8 mode; this
        makes it possible to construct invalid UTF-8 sequences for testing pur-
        poses. On the other hand, \x{hh} is interpreted as a UTF-8 character in
-       UTF-8  mode, generating more than one byte if the value is greater than
+       UTF-8 mode, generating more than one byte if the value is greater  than
        127. When not in UTF-8 mode, it generates one byte for values less than
        256, and causes an error for greater values.


-       The  escapes  that  specify  line ending sequences are literal strings,
+       The escapes that specify line ending  sequences  are  literal  strings,
        exactly as shown. No more than one newline setting should be present in
        any data line.


-       A  backslash  followed by anything else just escapes the anything else.
-       If the very last character is a backslash, it is ignored. This gives  a
-       way  of  passing  an empty line as data, since a real empty line termi-
+       A backslash followed by anything else just escapes the  anything  else.
+       If  the very last character is a backslash, it is ignored. This gives a
+       way of passing an empty line as data, since a real  empty  line  termi-
        nates the data input.


-       The \J escape provides a way of setting the maximum stack size that  is
-       used  by the just-in-time optimization code. It is ignored if JIT opti-
-       mization is not being used. Providing a stack that is larger  than  the
+       The  \J escape provides a way of setting the maximum stack size that is
+       used by the just-in-time optimization code. It is ignored if JIT  opti-
+       mization  is  not being used. Providing a stack that is larger than the
        default 32K is necessary only for very complicated patterns.


-       If  \M  is present, pcretest calls pcre_exec() several times, with dif-
-       ferent values in the match_limit and  match_limit_recursion  fields  of
-       the  pcre_extra  data structure, until it finds the minimum numbers for
-       each parameter  that  allow  pcre_exec()  to  complete  without  error.
-       Because  this  is testing a specific feature of the normal interpretive
-       pcre_exec() execution, the use of any JIT optimization that might  have
+       If \M is present, pcretest calls pcre_exec() several times,  with  dif-
+       ferent  values  in  the match_limit and match_limit_recursion fields of
+       the pcre_extra data structure, until it finds the minimum  numbers  for
+       each  parameter  that  allow  pcre_exec()  to  complete  without error.
+       Because this is testing a specific feature of the  normal  interpretive
+       pcre_exec()  execution, the use of any JIT optimization that might have
        been set up by the /S+ qualifier or -s+ option is disabled.


-       The  match_limit number is a measure of the amount of backtracking that
-       takes place, and checking it out can be instructive.  For  most  simple
-       matches,  the  number  is quite small, but for patterns with very large
-       numbers of matching possibilities, it can  become  large  very  quickly
-       with  increasing  length  of  subject string. The match_limit_recursion
-       number is a measure of how much stack (or, if  PCRE  is  compiled  with
-       NO_RECURSE,  how  much  heap)  memory  is  needed to complete the match
+       The match_limit number is a measure of the amount of backtracking  that
+       takes  place,  and  checking it out can be instructive. For most simple
+       matches, the number is quite small, but for patterns  with  very  large
+       numbers  of  matching  possibilities,  it can become large very quickly
+       with increasing length of  subject  string.  The  match_limit_recursion
+       number  is  a  measure  of how much stack (or, if PCRE is compiled with
+       NO_RECURSE, how much heap) memory  is  needed  to  complete  the  match
        attempt.


-       When \O is used, the value specified may be higher or  lower  than  the
+       When  \O  is  used, the value specified may be higher or lower than the
        size set by the -O command line option (or defaulted to 45); \O applies
        only to the call of pcre_exec() for the line in which it appears.


-       If the /P modifier was present on the pattern, causing the POSIX  wrap-
-       per  API  to  be  used, the only option-setting sequences that have any
-       effect are \B,  \N,  and  \Z,  causing  REG_NOTBOL,  REG_NOTEMPTY,  and
+       If  the /P modifier was present on the pattern, causing the POSIX wrap-
+       per API to be used, the only option-setting  sequences  that  have  any
+       effect  are  \B,  \N,  and  \Z,  causing  REG_NOTBOL, REG_NOTEMPTY, and
        REG_NOTEOL, respectively, to be passed to regexec().


-       The  use of \x{hh...} to represent UTF-8 characters is not dependent on
-       the use of the /8 modifier on the pattern.  It  is  recognized  always.
-       There  may  be  any number of hexadecimal digits inside the braces. The
-       result is from one to six bytes,  encoded  according  to  the  original
-       UTF-8  rules  of  RFC  2279.  This  allows for values in the range 0 to
-       0x7FFFFFFF. Note that not all of those are valid Unicode  code  points,
-       or  indeed  valid  UTF-8 characters according to the later rules in RFC
+       The use of \x{hh...} to represent UTF-8 characters is not dependent  on
+       the  use  of  the  /8 modifier on the pattern. It is recognized always.
+       There may be any number of hexadecimal digits inside  the  braces.  The
+       result  is  from  one  to  six bytes, encoded according to the original
+       UTF-8 rules of RFC 2279. This allows for  values  in  the  range  0  to
+       0x7FFFFFFF.  Note  that not all of those are valid Unicode code points,
+       or indeed valid UTF-8 characters according to the later  rules  in  RFC
        3629.



THE ALTERNATIVE MATCHING FUNCTION

-       By  default,  pcretest  uses  the  standard  PCRE  matching   function,
+       By   default,  pcretest  uses  the  standard  PCRE  matching  function,
        pcre_exec() to match each data line. From release 6.0, PCRE supports an
-       alternative matching function, pcre_dfa_exec(),  which  operates  in  a
-       different  way,  and has some restrictions. The differences between the
+       alternative  matching  function,  pcre_dfa_exec(),  which operates in a
+       different way, and has some restrictions. The differences  between  the
        two functions are described in the pcrematching documentation.


-       If a data line contains the \D escape sequence, or if the command  line
-       contains  the -dfa option, the alternative matching function is called.
+       If  a data line contains the \D escape sequence, or if the command line
+       contains the -dfa option, the alternative matching function is  called.
        This function finds all possible matches at a given point. If, however,
-       the  \F escape sequence is present in the data line, it stops after the
+       the \F escape sequence is present in the data line, it stops after  the
        first match is found. This is always the shortest possible match.



DEFAULT OUTPUT FROM PCRETEST

-       This section describes the output when the  normal  matching  function,
+       This  section  describes  the output when the normal matching function,
        pcre_exec(), is being used.


        When a match succeeds, pcretest outputs the list of captured substrings
-       that pcre_exec() returns, starting with number 0 for  the  string  that
-       matched  the  whole  pattern. Otherwise, it outputs "No match" when the
+       that  pcre_exec()  returns,  starting with number 0 for the string that
+       matched the whole pattern. Otherwise, it outputs "No  match"  when  the
        return is PCRE_ERROR_NOMATCH, and "Partial match:" followed by the par-
-       tially  matching substring when pcre_exec() returns PCRE_ERROR_PARTIAL.
-       (Note that this is the entire substring that was inspected  during  the
-       partial  match; it may include characters before the actual match start
-       if a lookbehind assertion, \K, \b, or \B was involved.) For  any  other
-       return,  pcretest  outputs  the  PCRE negative error number and a short
-       descriptive phrase. If the error is a failed UTF-8  string  check,  the
-       byte  offset  of the start of the failing character and the reason code
-       are also output, provided that the size of  the  output  vector  is  at
+       tially matching substring when pcre_exec() returns  PCRE_ERROR_PARTIAL.
+       (Note  that  this is the entire substring that was inspected during the
+       partial match; it may include characters before the actual match  start
+       if  a  lookbehind assertion, \K, \b, or \B was involved.) For any other
+       return, pcretest outputs the PCRE negative error  number  and  a  short
+       descriptive  phrase.  If  the error is a failed UTF-8 string check, the
+       byte offset of the start of the failing character and the  reason  code
+       are  also  output,  provided  that  the size of the output vector is at
        least two. Here is an example of an interactive pcretest run.


          $ pcretest
@@ -550,9 +547,9 @@


        Unset capturing substrings that are not followed by one that is set are
        not returned by pcre_exec(), and are not shown by pcretest. In the fol-
-       lowing  example, there are two capturing substrings, but when the first
-       data line is matched, the second, unset  substring  is  not  shown.  An
-       "internal"  unset  substring  is  shown as "<unset>", as for the second
+       lowing example, there are two capturing substrings, but when the  first
+       data  line  is  matched,  the  second, unset substring is not shown. An
+       "internal" unset substring is shown as "<unset>",  as  for  the  second
        data line.


            re> /(a)|(b)/
@@ -564,11 +561,11 @@
           1: <unset>
           2: b


-       If the strings contain any non-printing characters, they are output  as
-       \0x  escapes,  or  as \x{...} escapes if the /8 modifier was present on
-       the pattern. See below for the definition of  non-printing  characters.
-       If  the pattern has the /+ modifier, the output for substring 0 is
-       followed by the rest of the subject string, identified  by  "0+"  like
+       If  the strings contain any non-printing characters, they are output as
+       \0x escapes, or as \x{...} escapes if the /8 modifier  was  present  on
+       the  pattern.  See below for the definition of non-printing characters.
+       If the pattern has the /+ modifier, the output for substring 0 is
+       followed  by  the rest of the subject string, identified by "0+" like
        this:


            re> /cat/+
@@ -576,7 +573,7 @@
           0: cat
           0+ aract


-       If  the  pattern  has  the /g or /G modifier, the results of successive
+       If the pattern has the /g or /G modifier,  the  results  of  successive
        matching attempts are output in sequence, like this:


            re> /\Bi(\w\w)/g
@@ -588,32 +585,32 @@
           0: ipp
           1: pp


-       "No match" is output only if the first match attempt fails. Here is  an
-       example  of a failure message (the offset 4 that is specified by \>4 is
+       "No  match" is output only if the first match attempt fails. Here is an
+       example of a failure message (the offset 4 that is specified by \>4  is
        past the end of the subject string):


            re> /xyz/
          data> xyz\>4
          Error -24 (bad offset value)


-       If any of the sequences \C, \G, or \L are present in a data  line  that
-       is  successfully  matched,  the substrings extracted by the convenience
+       If  any  of the sequences \C, \G, or \L are present in a data line that
+       is successfully matched, the substrings extracted  by  the  convenience
        functions are output with C, G, or L after the string number instead of
        a colon. This is in addition to the normal full list. The string length
-       (that is, the return from the extraction function) is given  in  paren-
+       (that  is,  the return from the extraction function) is given in paren-
        theses after each string for \C and \G.


        Note that whereas patterns can be continued over several lines (a plain
        ">" prompt is used for continuations), data lines may not. However new-
-       lines  can  be included in data by means of the \n escape (or \r, \r\n,
+       lines can be included in data by means of the \n escape (or  \r,  \r\n,
        etc., depending on the newline sequence setting).



OUTPUT FROM THE ALTERNATIVE MATCHING FUNCTION

-       When the alternative matching function, pcre_dfa_exec(),  is  used  (by
-       means  of  the \D escape sequence or the -dfa command line option), the
-       output consists of a list of all the matches that start  at  the  first
+       When  the  alternative  matching function, pcre_dfa_exec(), is used (by
+       means of the \D escape sequence or the -dfa command line  option),  the
+       output  consists  of  a list of all the matches that start at the first
        point in the subject where there is at least one match. For example:


            re> /(tang|tangerine|tan)/
@@ -622,11 +619,11 @@
           1: tang
           2: tan


-       (Using  the  normal  matching function on this data finds only "tang".)
-       The longest matching string is always given first (and numbered  zero).
+       (Using the normal matching function on this data  finds  only  "tang".)
+       The  longest matching string is always given first (and numbered zero).
        After a PCRE_ERROR_PARTIAL return, the output is "Partial match:", fol-
-       lowed by the partially matching  substring.  (Note  that  this  is  the
-       entire  substring  that  was inspected during the partial match; it may
+       lowed  by  the  partially  matching  substring.  (Note that this is the
+       entire substring that was inspected during the partial  match;  it  may
        include characters before the actual match start if a lookbehind asser-
        tion, \K, \b, or \B was involved.)


@@ -642,16 +639,16 @@
           1: tan
           0: tan


-       Since the matching function does not  support  substring  capture,  the
-       escape  sequences  that  are concerned with captured substrings are not
+       Since  the  matching  function  does not support substring capture, the
+       escape sequences that are concerned with captured  substrings  are  not
        relevant.



RESTARTING AFTER A PARTIAL MATCH

        When the alternative matching function has given the PCRE_ERROR_PARTIAL
-       return,  indicating that the subject partially matched the pattern, you
-       can restart the match with additional subject data by means of  the  \R
+       return, indicating that the subject partially matched the pattern,  you
+       can  restart  the match with additional subject data by means of the \R
        escape sequence. For example:


            re> /^\d?\d(jan|feb|mar|apr|may|jun|jul|aug|sep|oct|nov|dec)\d\d$/
@@ -660,30 +657,30 @@
          data> n05\R\D
           0: n05


-       For  further  information  about  partial matching, see the pcrepartial
+       For further information about partial  matching,  see  the  pcrepartial
        documentation.



CALLOUTS

-       If the pattern contains any callout requests, pcretest's callout  func-
-       tion  is  called  during  matching. This works with both matching func-
+       If  the pattern contains any callout requests, pcretest's callout func-
+       tion is called during matching. This works  with  both  matching  func-
        tions. By default, the called function displays the callout number, the
-       start  and  current  positions in the text at the callout time, and the
+       start and current positions in the text at the callout  time,  and  the
        next pattern item to be tested. For example, the output


          --->pqrabcdef
            0    ^  ^     \d


-       indicates that callout number 0 occurred for a match  attempt  starting
-       at  the fourth character of the subject string, when the pointer was at
-       the seventh character of the data, and when the next pattern  item  was
-       \d.  Just  one  circumflex is output if the start and current positions
+       indicates  that  callout number 0 occurred for a match attempt starting
+       at the fourth character of the subject string, when the pointer was  at
+       the  seventh  character of the data, and when the next pattern item was
+       \d. Just one circumflex is output if the start  and  current  positions
        are the same.


        Callouts numbered 255 are assumed to be automatic callouts, inserted as
-       a  result  of the /C pattern modifier. In this case, instead of showing
-       the callout number, the offset in the pattern, preceded by a  plus,  is
+       a result of the /C pattern modifier. In this case, instead  of  showing
+       the  callout  number, the offset in the pattern, preceded by a plus, is
        output. For example:


            re> /\d?[A-E]\*/C
@@ -696,7 +693,7 @@
           0: E*


        If a pattern contains (*MARK) items, an additional line is output when-
-       ever a change of latest mark is passed to  the  callout  function.  For
+       ever  a  change  of  latest mark is passed to the callout function. For
        example:


            re> /a(*MARK:X)bc/C
@@ -710,59 +707,59 @@
          +12 ^  ^
           0: abc


-       The  mark  changes between matching "a" and "b", but stays the same for
-       the rest of the match, so nothing more is output. If, as  a  result  of
-       backtracking,  the  mark  reverts to being unset, the text "<unset>" is
+       The mark changes between matching "a" and "b", but stays the  same  for
+       the  rest  of  the match, so nothing more is output. If, as a result of
+       backtracking, the mark reverts to being unset, the  text  "<unset>"  is
        output.


-       The callout function in pcretest returns zero (carry  on  matching)  by
-       default,  but you can use a \C item in a data line (as described above)
+       The  callout  function  in pcretest returns zero (carry on matching) by
+       default, but you can use a \C item in a data line (as described  above)
        to change this and other parameters of the callout.


-       Inserting callouts can be helpful when using pcretest to check  compli-
-       cated  regular expressions. For further information about callouts, see
+       Inserting  callouts can be helpful when using pcretest to check compli-
+       cated regular expressions. For further information about callouts,  see
        the pcrecallout documentation.



NON-PRINTING CHARACTERS

-       When pcretest is outputting text in the compiled version of a  pattern,
-       bytes  other  than 32-126 are always treated as non-printing characters
+       When  pcretest is outputting text in the compiled version of a pattern,
+       bytes other than 32-126 are always treated as  non-printing  characters
        and are therefore shown as hex escapes.


-       When pcretest is outputting text that is a matched part  of  a  subject
-       string,  it behaves in the same way, unless a different locale has been
-       set for the  pattern  (using  the  /L  modifier).  In  this  case,  the
+       When  pcretest  is  outputting text that is a matched part of a subject
+       string, it behaves in the same way, unless a different locale has  been
+       set  for  the  pattern  (using  the  /L  modifier).  In  this case, the
        isprint() function is used to distinguish printing and non-printing characters.



SAVING AND RELOADING COMPILED PATTERNS

-       The  facilities  described  in  this section are not available when the
-       POSIX interface to PCRE is being used, that is,  when  the  /P  pattern
+       The facilities described in this section are  not  available  when  the
+       POSIX  interface  to  PCRE  is being used, that is, when the /P pattern
        modifier is specified.


        When the POSIX interface is not in use, you can cause pcretest to write
-       a compiled pattern to a file, by following the modifiers with >  and  a
+       a  compiled  pattern to a file, by following the modifiers with > and a
        file name.  For example:


          /pattern/im >/some/file


-       See  the pcreprecompile documentation for a discussion about saving and
-       re-using compiled patterns.  Note that if the pattern was  successfully
+       See the pcreprecompile documentation for a discussion about saving  and
+       re-using  compiled patterns.  Note that if the pattern was successfully
        studied with JIT optimization, the JIT data cannot be saved.


-       The  data  that  is  written  is  binary. The first eight bytes are the
-       length of the compiled pattern data  followed  by  the  length  of  the
-       optional  study  data,  each  written as four bytes in big-endian order
-       (most significant byte first). If there is no study  data  (either  the
+       The data that is written is binary.  The  first  eight  bytes  are  the
+       length  of  the  compiled  pattern  data  followed by the length of the
+       optional study data, each written as four  bytes  in  big-endian  order
+       (most  significant  byte  first). If there is no study data (either the
        pattern was not studied, or studying did not return any data), the sec-
-       ond length is zero. The lengths are followed by an exact  copy  of  the
-       compiled  pattern.  If  there is additional study data, this (excluding
-       any JIT data) follows immediately after  the  compiled  pattern.  After
+       ond  length  is  zero. The lengths are followed by an exact copy of the
+       compiled pattern. If there is additional study  data,  this  (excluding
+       any  JIT  data)  follows  immediately after the compiled pattern. After
        writing the file, pcretest expects to read a new pattern.


-       A  saved  pattern  can  be reloaded into pcretest by specifying < and a
+       A saved pattern can be reloaded into pcretest by  specifying  <  and  a
        file name instead of a pattern. The name of the file must not contain a
        < character, as otherwise pcretest will interpret the line as a pattern
        delimited by < characters.  For example:
@@ -771,27 +768,27 @@
          Compiled pattern loaded from /some/file
          No study data


-       If the pattern was previously studied with the  JIT  optimization,  the
-       JIT  information cannot be saved and restored, and so is lost. When the
-       pattern has been loaded, pcretest proceeds to read data  lines  in  the
+       If  the  pattern  was previously studied with the JIT optimization, the
+       JIT information cannot be saved and restored, and so is lost. When  the
+       pattern  has  been  loaded, pcretest proceeds to read data lines in the
        usual way.


-       You  can copy a file written by pcretest to a different host and reload
-       it there, even if the new host has opposite endianness to  the  one  on
-       which  the pattern was compiled. For example, you can compile on an i86
+       You can copy a file written by pcretest to a different host and  reload
+       it  there,  even  if the new host has opposite endianness to the one on
+       which the pattern was compiled. For example, you can compile on an  i86
        machine and run on a SPARC machine.


-       File names for saving and reloading can be absolute  or  relative,  but
-       note  that the shell facility of expanding a file name that starts with
+       File  names  for  saving and reloading can be absolute or relative, but
+       note that the shell facility of expanding a file name that starts  with
        a tilde (~) is not available.


-       The ability to save and reload files in pcretest is intended for  test-
-       ing  and experimentation. It is not intended for production use because
-       only a single pattern can be written to a file. Furthermore,  there  is
-       no  facility  for  supplying  custom  character  tables  for use with a
-       reloaded pattern. If the original  pattern  was  compiled  with  custom
-       tables,  an  attempt to match a subject string using a reloaded pattern
-       is likely to cause pcretest to crash.  Finally, if you attempt to  load
+       The  ability to save and reload files in pcretest is intended for test-
+       ing and experimentation. It is not intended for production use  because
+       only  a  single pattern can be written to a file. Furthermore, there is
+       no facility for supplying  custom  character  tables  for  use  with  a
+       reloaded  pattern.  If  the  original  pattern was compiled with custom
+       tables, an attempt to match a subject string using a  reloaded  pattern
+       is  likely to cause pcretest to crash.  Finally, if you attempt to load
        a file that is not in the correct format, the result is undefined.



@@ -810,5 +807,5 @@

REVISION

-       Last updated: 02 December 2011
+       Last updated: 26 August 2011
        Copyright (c) 1997-2011 University of Cambridge.
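
Note on the saved-pattern layout described in the pcretest text above: the file starts with two four-byte big-endian lengths (compiled pattern, then study data), and a zero second length means no study data. A minimal sketch of decoding that header follows. The names read_be32 and parse_saved_header are hypothetical and do not come from pcretest.c.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Decode four bytes stored most significant byte first. */
uint32_t read_be32(const unsigned char *p)
{
  assert(p != NULL);
  return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
         ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Hypothetical helper: split the eight-byte header into its two lengths.
   A study_len of zero means the file carries no study data. */
void parse_saved_header(const unsigned char header[8],
                        uint32_t *pattern_len, uint32_t *study_len)
{
  *pattern_len = read_be32(header);
  *study_len   = read_be32(header + 4);
}
```

Because the lengths are assembled byte by byte, the decoding gives the same result on little-endian and big-endian hosts. This covers only the two length fields, not the compiled pattern that follows them.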


Modified: code/trunk/maint/ManyConfigTests
===================================================================
--- code/trunk/maint/ManyConfigTests    2011-12-28 15:53:12 UTC (rev 834)
+++ code/trunk/maint/ManyConfigTests    2011-12-28 16:10:09 UTC (rev 835)
@@ -141,10 +141,6 @@
 # builds both) to save a bit of time by building only one version of the
 # library for the subsequent tests.


-valgrind=
-cvalgrind=
-withvalgrind=
-
echo "Tests in the current directory"
srcdir=.
for opts in \
@@ -183,13 +179,10 @@
runtest
done

-valgrind=
-cvalgrind=
-withvalgrind=
-
# Clean up the distribution and then do at least one build and test in a
# directory other than the source directory. It doesn't work unless the
-# source directory is cleaned up first.
+# source directory is cleaned up first - and anyway, it's best to leave it
+# in a clean state after all this reconfiguring.

if [ -f Makefile ]; then
echo "Running 'make distclean'"

Modified: code/trunk/pcre-config.in
===================================================================
--- code/trunk/pcre-config.in    2011-12-28 15:53:12 UTC (rev 834)
+++ code/trunk/pcre-config.in    2011-12-28 16:10:09 UTC (rev 835)
@@ -70,7 +70,7 @@
       ;;
     --libs-cpp)
       if test @enable_cpp@ = yes ; then
-        echo $libS$libR -lpcrecpp -lpcre
+        echo -L@libdir@$libR -lpcrecpp -lpcre
       else
         echo "${usage}" 1>&2
       fi


Modified: code/trunk/pcre.h.in
===================================================================
--- code/trunk/pcre.h.in    2011-12-28 15:53:12 UTC (rev 834)
+++ code/trunk/pcre.h.in    2011-12-28 16:10:09 UTC (rev 835)
@@ -98,25 +98,20 @@
 /* Options. Some are compile-time only, some are run-time only, and some are
 both, so we keep them all distinct. However, almost all the bits in the options
 word are now used. In the long run, we may have to re-use some of the
-compile-time only bits for runtime options, or vice versa. In the comments
-below, "compile", "exec", and "DFA exec" mean that the option is permitted to
-be set for those functions; "used in" means that an option may be set only for
-compile, but is subsequently referenced in exec and/or DFA exec. Any of the
-compile-time options may be inspected during studying (and therefore JIT
-compiling). */
+compile-time only bits for runtime options, or vice versa. */


 #define PCRE_CASELESS           0x00000001  /* Compile */
 #define PCRE_MULTILINE          0x00000002  /* Compile */
 #define PCRE_DOTALL             0x00000004  /* Compile */
 #define PCRE_EXTENDED           0x00000008  /* Compile */
 #define PCRE_ANCHORED           0x00000010  /* Compile, exec, DFA exec */
-#define PCRE_DOLLAR_ENDONLY     0x00000020  /* Compile, used in exec, DFA exec */
+#define PCRE_DOLLAR_ENDONLY     0x00000020  /* Compile */
 #define PCRE_EXTRA              0x00000040  /* Compile */
 #define PCRE_NOTBOL             0x00000080  /* Exec, DFA exec */
 #define PCRE_NOTEOL             0x00000100  /* Exec, DFA exec */
 #define PCRE_UNGREEDY           0x00000200  /* Compile */
 #define PCRE_NOTEMPTY           0x00000400  /* Exec, DFA exec */
-#define PCRE_UTF8               0x00000800  /* Compile, used in exec, DFA exec */
+#define PCRE_UTF8               0x00000800  /* Compile */
 #define PCRE_NO_AUTO_CAPTURE    0x00001000  /* Compile */
 #define PCRE_NO_UTF8_CHECK      0x00002000  /* Compile, exec, DFA exec */
 #define PCRE_AUTO_CALLOUT       0x00004000  /* Compile */
@@ -124,7 +119,7 @@
 #define PCRE_PARTIAL            0x00008000  /* Backwards compatible synonym */
 #define PCRE_DFA_SHORTEST       0x00010000  /* DFA exec */
 #define PCRE_DFA_RESTART        0x00020000  /* DFA exec */
-#define PCRE_FIRSTLINE          0x00040000  /* Compile, used in exec, DFA exec */
+#define PCRE_FIRSTLINE          0x00040000  /* Compile */
 #define PCRE_DUPNAMES           0x00080000  /* Compile */
 #define PCRE_NEWLINE_CR         0x00100000  /* Compile, exec, DFA exec */
 #define PCRE_NEWLINE_LF         0x00200000  /* Compile, exec, DFA exec */
@@ -133,12 +128,12 @@
 #define PCRE_NEWLINE_ANYCRLF    0x00500000  /* Compile, exec, DFA exec */
 #define PCRE_BSR_ANYCRLF        0x00800000  /* Compile, exec, DFA exec */
 #define PCRE_BSR_UNICODE        0x01000000  /* Compile, exec, DFA exec */
-#define PCRE_JAVASCRIPT_COMPAT  0x02000000  /* Compile, used in exec */
+#define PCRE_JAVASCRIPT_COMPAT  0x02000000  /* Compile */
 #define PCRE_NO_START_OPTIMIZE  0x04000000  /* Compile, exec, DFA exec */
 #define PCRE_NO_START_OPTIMISE  0x04000000  /* Synonym */
 #define PCRE_PARTIAL_HARD       0x08000000  /* Exec, DFA exec */
 #define PCRE_NOTEMPTY_ATSTART   0x10000000  /* Exec, DFA exec */
-#define PCRE_UCP                0x20000000  /* Compile, used in exec, DFA exec */
+#define PCRE_UCP                0x20000000  /* Compile */


/* Exec-time and get/set-time error codes */

@@ -216,7 +211,6 @@
 #define PCRE_INFO_HASCRORLF         14
 #define PCRE_INFO_MINLENGTH         15
 #define PCRE_INFO_JIT               16
-#define PCRE_INFO_JITSIZE           17


/* Request types for pcre_config(). Do not re-arrange, in order to remain
compatible. */
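
Note on the option word above: because every option has its own bit, a caller can test for a whole class of options with one mask. The sketch below assumes a mask built only from the bits the listing marks "Exec, DFA exec". The EX_ names and both helpers are illustrative and are not part of pcre.h.

```c
#include <assert.h>

/* Values copied from the listing above; the EX_ names are illustrative. */
#define EX_CASELESS          0x00000001u  /* Compile */
#define EX_ANCHORED          0x00000010u  /* Compile, exec, DFA exec */
#define EX_NOTBOL            0x00000080u  /* Exec, DFA exec */
#define EX_NOTEOL            0x00000100u  /* Exec, DFA exec */
#define EX_NOTEMPTY          0x00000400u  /* Exec, DFA exec */
#define EX_PARTIAL_HARD      0x08000000u  /* Exec, DFA exec */
#define EX_NOTEMPTY_ATSTART  0x10000000u  /* Exec, DFA exec */

/* Assumed mask for this sketch, not the library's internal one. */
#define EX_EXEC_ONLY_MASK \
  (EX_NOTBOL | EX_NOTEOL | EX_NOTEMPTY | EX_PARTIAL_HARD | EX_NOTEMPTY_ATSTART)

/* Returns 1 if any exec-only bit is set in options, else 0. */
int ex_has_exec_only_bits(unsigned int options)
{
  return (options & EX_EXEC_ONLY_MASK) != 0;
}

/* Debug-build guard a caller might place before compiling a pattern. */
void ex_require_compile_options(unsigned int options)
{
  assert(!ex_has_exec_only_bits(options));
}
```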

Modified: code/trunk/pcre_compile.c
===================================================================
--- code/trunk/pcre_compile.c    2011-12-28 15:53:12 UTC (rev 834)
+++ code/trunk/pcre_compile.c    2011-12-28 16:10:09 UTC (rev 835)
@@ -88,21 +88,14 @@
 The same workspace is used during the second, actual compile phase for
 remembering forward references to groups so that they can be filled in at the
 end. Each entry in this list occupies LINK_SIZE bytes, so even when LINK_SIZE
-is 4 there is plenty of room for most patterns. However, the memory can get
-filled up by repetitions of forward references, for example patterns like
-/(?1){0,1999}(b)/, and one user did hit the limit. The code has been changed so
-that the workspace is expanded using malloc() in this situation. The value
-below is therefore a minimum, and we put a maximum on it for safety. The
-minimum is now also defined in terms of LINK_SIZE so that the use of malloc()
-kicks in at the same number of forward references in all cases. */
+is 4 there is plenty of room. */


-#define COMPILE_WORK_SIZE (2048*LINK_SIZE)
-#define COMPILE_WORK_SIZE_MAX (100*COMPILE_WORK_SIZE)
+#define COMPILE_WORK_SIZE (4096)

/* The overrun tests check for a slightly smaller size so that they detect the
overrun before it actually does run off the end of the data block. */

-#define WORK_SIZE_SAFETY_MARGIN (100)
+#define WORK_SIZE_CHECK (COMPILE_WORK_SIZE - 100)


/* Table for handling escaped characters in the range '0'-'z'. Positive returns
@@ -419,8 +412,6 @@
"\\k is not followed by a braced, angle-bracketed, or quoted name\0"
/* 70 */
"internal error: unknown opcode in find_fixedlength()\0"
- "\\N is not supported in a class\0"
- "too many forward references\0"
;

/* Table to identify digits and hex digits. This is used when compiling
@@ -589,44 +580,6 @@


 /*************************************************
-*           Expand the workspace                 *
-*************************************************/
-
-/* This function is called during the second compiling phase, if the number of
-forward references fills the existing workspace, which is originally a block on
-the stack. A larger block is obtained from malloc() unless the ultimate limit
-has been reached or the increase will be rather small.
-
-Argument: pointer to the compile data block
-Returns:  0 if all went well, else an error number
-*/
-
-static int
-expand_workspace(compile_data *cd)
-{
-uschar *newspace;
-int newsize = cd->workspace_size * 2;
-
-if (newsize > COMPILE_WORK_SIZE_MAX) newsize = COMPILE_WORK_SIZE_MAX;
-if (cd->workspace_size >= COMPILE_WORK_SIZE_MAX ||
-    newsize - cd->workspace_size < WORK_SIZE_SAFETY_MARGIN)
- return ERR72;
-
-newspace = (pcre_malloc)(newsize);
-if (newspace == NULL) return ERR21;
-
-memcpy(newspace, cd->start_workspace, cd->workspace_size);
-cd->hwm = (uschar *)newspace + (cd->hwm - cd->start_workspace);
-if (cd->workspace_size > COMPILE_WORK_SIZE)
-  (pcre_free)((void *)cd->start_workspace);
-cd->start_workspace = newspace;
-cd->workspace_size = newsize;
-return 0;
-}
-
-
-
-/*************************************************
 *            Check for counted repeat            *
 *************************************************/


@@ -1655,8 +1608,7 @@
     case OP_ASSERTBACK:
     case OP_ASSERTBACK_NOT:
     do cc += GET(cc, 1); while (*cc == OP_ALT);
-    cc += _pcre_OP_lengths[*cc];
-    break;
+    /* Fall through */


     /* Skip over things that don't match chars */


@@ -1750,7 +1702,7 @@
     cc++;
     break;


-    /* The single-byte matcher isn't allowed. This only happens in UTF-8 mode;
+    /* The single-byte matcher isn't allowed. This only happens in UTF-8 mode; 
     otherwise \C is coded as OP_ALLANY. */


     case OP_ANYBYTE:
@@ -3378,8 +3330,7 @@
 #ifdef PCRE_DEBUG
     if (code > cd->hwm) cd->hwm = code;                 /* High water info */
 #endif
-    if (code > cd->start_workspace + cd->workspace_size -
-        WORK_SIZE_SAFETY_MARGIN)                       /* Check for overrun */
+    if (code > cd->start_workspace + WORK_SIZE_CHECK)   /* Check for overrun */
       {
       *errorcodeptr = ERR52;
       goto FAILED;
@@ -3429,8 +3380,7 @@
   /* In the real compile phase, just check the workspace used by the forward
   reference list. */


-  else if (cd->hwm > cd->start_workspace + cd->workspace_size -
-           WORK_SIZE_SAFETY_MARGIN)
+  else if (cd->hwm > cd->start_workspace + WORK_SIZE_CHECK)
     {
     *errorcodeptr = ERR52;
     goto FAILED;
@@ -3684,7 +3634,7 @@


       if (lengthptr != NULL)
         {
-        *lengthptr += (int)(class_utf8data - class_utf8data_base);
+        *lengthptr += class_utf8data - class_utf8data_base;
         class_utf8data = class_utf8data_base;
         }


@@ -3820,11 +3770,6 @@
         if (*errorcodeptr != 0) goto FAILED;


         if (-c == ESC_b) c = CHAR_BS;    /* \b is backspace in a class */
-        else if (-c == ESC_N)            /* \N is not supported in a class */
-          {
-          *errorcodeptr = ERR71;
-          goto FAILED;
-          }
         else if (-c == ESC_Q)            /* Handle start of quoted string */
           {
           if (ptr[1] == CHAR_BACKSLASH && ptr[2] == CHAR_E)
@@ -4383,7 +4328,7 @@


       /* Now fill in the complete length of the item */


-      PUT(previous, 1, (int)(code - previous));
+      PUT(previous, 1, code - previous);
       break;   /* End of class handling */
       }
 #endif
@@ -4481,7 +4426,7 @@
     past, but it no longer happens for non-repeated recursions. In fact, the
     repeated ones could be re-implemented independently so as not to need this,
     but for the moment we rely on the code for repeating groups. */
-
+    
     if (*previous == OP_RECURSE)
       {
       memmove(previous + 1 + LINK_SIZE, previous, 1 + LINK_SIZE);
@@ -4525,7 +4470,7 @@
         {
         uschar *lastchar = code - 1;
         while((*lastchar & 0xc0) == 0x80) lastchar--;
-        c = (int)(code - lastchar);     /* Length of UTF-8 character */
+        c = code - lastchar;            /* Length of UTF-8 character */
         memcpy(utf8_char, lastchar, c); /* Save the char */
         c |= 0x80;                      /* Flag c as a length */
         }
@@ -4932,32 +4877,16 @@
             *lengthptr += delta;
             }


-          /* This is compiling for real. If there is a set first byte for
-          the group, and we have not yet set a "required byte", set it. Make
-          sure there is enough workspace for copying forward references before
-          doing the copy. */
+          /* This is compiling for real */


           else
             {
             if (groupsetfirstbyte && reqbyte < 0) reqbyte = firstbyte;
-
             for (i = 1; i < repeat_min; i++)
               {
               uschar *hc;
               uschar *this_hwm = cd->hwm;
               memcpy(code, previous, len);
-
-              while (cd->hwm > cd->start_workspace + cd->workspace_size -
-                     WORK_SIZE_SAFETY_MARGIN - (this_hwm - save_hwm))
-                {
-                int save_offset = save_hwm - cd->start_workspace;
-                int this_offset = this_hwm - cd->start_workspace;
-                *errorcodeptr = expand_workspace(cd);
-                if (*errorcodeptr != 0) goto FAILED;
-                save_hwm = (uschar *)cd->start_workspace + save_offset;
-                this_hwm = (uschar *)cd->start_workspace + this_offset;
-                }
-
               for (hc = save_hwm; hc < this_hwm; hc += LINK_SIZE)
                 {
                 PUT(cd->hwm, 0, GET(hc, 0) + len);
@@ -5025,21 +4954,6 @@
             }


           memcpy(code, previous, len);
-
-          /* Ensure there is enough workspace for forward references before
-          copying them. */
-
-          while (cd->hwm > cd->start_workspace + cd->workspace_size -
-                 WORK_SIZE_SAFETY_MARGIN - (this_hwm - save_hwm))
-            {
-            int save_offset = save_hwm - cd->start_workspace;
-            int this_offset = this_hwm - cd->start_workspace;
-            *errorcodeptr = expand_workspace(cd);
-            if (*errorcodeptr != 0) goto FAILED;
-            save_hwm = (uschar *)cd->start_workspace + save_offset;
-            this_hwm = (uschar *)cd->start_workspace + this_offset;
-            }
-
           for (hc = save_hwm; hc < this_hwm; hc += LINK_SIZE)
             {
             PUT(cd->hwm, 0, GET(hc, 0) + len + ((i != 0)? 2+LINK_SIZE : 1));
@@ -5070,24 +4984,24 @@
       ONCE brackets can be converted into non-capturing brackets, as the
       behaviour of (?:xx)++ is the same as (?>xx)++ and this saves having to
       deal with possessive ONCEs specially.
-
+      
       Otherwise, when we are doing the actual compile phase, check to see
       whether this group is one that could match an empty string. If so,
       convert the initial operator to the S form (e.g. OP_BRA -> OP_SBRA) so
       that runtime checking can be done. [This check is also applied to ONCE
       groups at runtime, but in a different way.]


-      Then, if the quantifier was possessive and the bracket is not a
+      Then, if the quantifier was possessive and the bracket is not a 
       conditional, we convert the BRA code to the POS form, and the KET code to
       KETRPOS. (It turns out to be convenient at runtime to detect this kind of
       subpattern at both the start and at the end.) The use of special opcodes
       makes it possible to reduce greatly the stack usage in pcre_exec(). If
-      the group is preceded by OP_BRAZERO, convert this to OP_BRAPOSZERO.
-
+      the group is preceded by OP_BRAZERO, convert this to OP_BRAPOSZERO. 
+       
       Then, if the minimum number of matches is 1 or 0, cancel the possessive
       flag so that the default action below, of wrapping everything inside
       atomic brackets, does not happen. When the minimum is greater than 1,
-      there will be earlier copies of the group, and so we still have to wrap
+      there will be earlier copies of the group, and so we still have to wrap 
       the whole thing. */


       else
@@ -5096,23 +5010,23 @@
         uschar *bracode = ketcode - GET(ketcode, 1);


         /* Convert possessive ONCE brackets to non-capturing */
-
+         
         if ((*bracode == OP_ONCE || *bracode == OP_ONCE_NC) &&
             possessive_quantifier) *bracode = OP_BRA;


         /* For non-possessive ONCE brackets, all we need to do is to
         set the KET. */
-
+          
         if (*bracode == OP_ONCE || *bracode == OP_ONCE_NC)
           *ketcode = OP_KETRMAX + repeat_type;
-
+        
         /* Handle non-ONCE brackets and possessive ONCEs (which have been
-        converted to non-capturing above). */
-
+        converted to non-capturing above). */ 
+   
         else
           {
           /* In the compile phase, check for empty string matching. */
-
+             
           if (lengthptr == NULL)
             {
             uschar *scode = bracode;
@@ -5127,7 +5041,7 @@
               }
             while (*scode == OP_ALT);
             }
-
+          
           /* Handle possessive quantifiers. */


           if (possessive_quantifier)
@@ -5136,7 +5050,7 @@
             repeated non-capturing bracket, because we have not invented POS
             versions of the COND opcodes. Because we are moving code along, we
             must ensure that any pending recursive references are updated. */
-
+   
             if (*bracode == OP_COND || *bracode == OP_SCOND)
               {
               int nlen = (int)(code - bracode);
@@ -5149,25 +5063,25 @@
               *code++ = OP_KETRPOS;
               PUTINC(code, 0, nlen);
               PUT(bracode, 1, nlen);
-              }
-
+              }  
+ 
             /* For non-COND brackets, we modify the BRA code and use KETRPOS. */
-
-            else
+             
+            else 
               {
               *bracode += 1;              /* Switch to xxxPOS opcodes */
               *ketcode = OP_KETRPOS;
               }
-
-            /* If the minimum is zero, mark it as possessive, then unset the
+            
+            /* If the minimum is zero, mark it as possessive, then unset the 
             possessive flag when the minimum is 0 or 1. */
-
+             
             if (brazeroptr != NULL) *brazeroptr = OP_BRAPOSZERO;
             if (repeat_min < 2) possessive_quantifier = FALSE;
             }
-
+            
           /* Non-possessive quantifier */
-
+           
           else *ketcode = OP_KETRMAX + repeat_type;
           }
         }
@@ -6059,12 +5973,6 @@
               of the group. Then remember the forward reference. */


               called = cd->start_code + recno;
-              if (cd->hwm >= cd->start_workspace + cd->workspace_size -
-                  WORK_SIZE_SAFETY_MARGIN)
-                {
-                *errorcodeptr = expand_workspace(cd);
-                if (*errorcodeptr != 0) goto FAILED;
-                }
               PUTINC(cd->hwm, 0, (int)(code + 1 - cd->start_code));
               }


@@ -6085,14 +5993,11 @@
               }
             }


-          /* Insert the recursion/subroutine item. It does not have a set first
-          byte (relevant if it is repeated, because it will then be wrapped
-          with ONCE brackets). */
+          /* Insert the recursion/subroutine item. */


           *code = OP_RECURSE;
           PUT(code, 1, (int)(called - cd->start_code));
           code += 1 + LINK_SIZE;
-          groupsetfirstbyte = FALSE;
           }


         /* Can't determine a first byte now */
@@ -6451,10 +6356,10 @@


         if (ptr[1] != CHAR_PLUS && ptr[1] != CHAR_MINUS)
           {
-          BOOL is_a_number = TRUE;
+          BOOL isnumber = TRUE;
           for (p = ptr + 1; *p != 0 && *p != terminator; p++)
             {
-            if ((cd->ctypes[*p] & ctype_digit) == 0) is_a_number = FALSE;
+            if ((cd->ctypes[*p] & ctype_digit) == 0) isnumber = FALSE;
             if ((cd->ctypes[*p] & ctype_word) == 0) break;
             }
           if (*p != terminator)
@@ -6462,7 +6367,7 @@
             *errorcodeptr = ERR57;
             break;
             }
-          if (is_a_number)
+          if (isnumber)
             {
             ptr++;
             goto HANDLE_NUMERICAL_RECURSION;
@@ -6576,8 +6481,8 @@
 #endif
         /* In non-UTF-8 mode, we turn \C into OP_ALLANY instead of OP_ANYBYTE
         so that it works in DFA mode and in lookbehinds. */
-
-          {
+         
+          {  
           previous = (-c > ESC_b && -c < ESC_Z)? code : NULL;
           *code++ = (!utf8 && c == -ESC_C)? OP_ALLANY : -c;
           }
@@ -7315,8 +7220,7 @@
 computing the amount of memory that is needed. Compiled items are thrown away
 as soon as possible, so that a fairly large buffer should be sufficient for
 this purpose. The same space is used in the second phase for remembering where
-to fill in forward references to subpatterns. That may overflow, in which case
-new memory is obtained from malloc(). */
+to fill in forward references to subpatterns. */


uschar cworkspace[COMPILE_WORK_SIZE];

@@ -7506,10 +7410,9 @@
cd->names_found = 0;
cd->name_entry_size = 0;
cd->name_table = NULL;
+cd->start_workspace = cworkspace;
cd->start_code = cworkspace;
cd->hwm = cworkspace;
-cd->start_workspace = cworkspace;
-cd->workspace_size = COMPILE_WORK_SIZE;
cd->start_pattern = (const uschar *)pattern;
cd->end_pattern = (const uschar *)(pattern + strlen(pattern));
cd->req_varyopt = 0;
@@ -7544,7 +7447,7 @@
because nowadays we limit the maximum value of cd->names_found and
cd->name_entry_size. */

-size = length + sizeof(real_pcre) + cd->names_found * cd->name_entry_size;
+size = length + sizeof(real_pcre) + cd->names_found * (cd->name_entry_size + 3);
re = (real_pcre *)(pcre_malloc)(size);

if (re == NULL)
@@ -7587,7 +7490,7 @@
cd->name_table = (uschar *)re + re->name_table_offset;
codestart = cd->name_table + re->name_entry_size * re->name_count;
cd->start_code = codestart;
-cd->hwm = (uschar *)(cd->start_workspace);
+cd->hwm = cworkspace;
cd->req_varyopt = 0;
cd->had_accept = FALSE;
cd->check_lookbehind = FALSE;
@@ -7621,34 +7524,20 @@
if (code - codestart > length) errorcode = ERR23;
#endif

-/* Fill in any forward references that are required. There may be repeated
-references; optimize for them, as searching a large regex takes time. */
+/* Fill in any forward references that are required. */

-if (cd->hwm > cd->start_workspace)
+while (errorcode == 0 && cd->hwm > cworkspace)
   {
-  int prev_recno = -1;
-  const uschar *groupptr = NULL;
-  while (errorcode == 0 && cd->hwm > cd->start_workspace)
-    {
-    int offset, recno;
-    cd->hwm -= LINK_SIZE;
-    offset = GET(cd->hwm, 0);
-    recno = GET(codestart, offset);
-    if (recno != prev_recno)
-      {
-      groupptr = _pcre_find_bracket(codestart, utf8, recno);
-      prev_recno = recno;
-      }
-    if (groupptr == NULL) errorcode = ERR53;
-      else PUT(((uschar *)codestart), offset, (int)(groupptr - codestart));
-    }
+  int offset, recno;
+  const uschar *groupptr;
+  cd->hwm -= LINK_SIZE;
+  offset = GET(cd->hwm, 0);
+  recno = GET(codestart, offset);
+  groupptr = _pcre_find_bracket(codestart, utf8, recno);
+  if (groupptr == NULL) errorcode = ERR53;
+    else PUT(((uschar *)codestart), offset, (int)(groupptr - codestart));
   }


-/* If the workspace had to be expanded, free the new memory. */
-
-if (cd->workspace_size > COMPILE_WORK_SIZE)
- (pcre_free)((void *)cd->start_workspace);
-
/* Give an error if there's back reference to a non-existent capturing
subpattern. */
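
Note on the forward-reference loop restored above: the workspace acts as a stack of code offsets, and each pop looks up the group named at that offset and overwrites the placeholder with the group's position. The toy version below is a sketch under stated assumptions: a two-byte link size, big-endian ex_get/ex_put helpers standing in for GET/PUT, and a lookup table standing in for _pcre_find_bracket(). All ex_ names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define EX_LINK_SIZE 2   /* assumed link size for this sketch */

/* Read a two-byte value stored most significant byte first. */
unsigned int ex_get(const unsigned char *a, size_t n)
{
  return ((unsigned int)a[n] << 8) | a[n + 1];
}

/* Write a two-byte value most significant byte first. */
void ex_put(unsigned char *a, size_t n, unsigned int d)
{
  a[n] = (unsigned char)(d >> 8);
  a[n + 1] = (unsigned char)(d & 255);
}

/* Pop pending references off the stack held in [work, hwm) and replace each
   placeholder group number in code[] with that group's offset.
   group_offset[recno] stands in for the bracket search; 0 means "not found".
   Returns 0 on success, or -1 (standing in for ERR53) on the first
   unresolved reference. */
int ex_fill_forward_refs(unsigned char *code, const unsigned char *work,
                         const unsigned char *hwm,
                         const unsigned int *group_offset, size_t ngroups)
{
  assert(code != NULL && work != NULL && hwm >= work);
  while (hwm > work)
    {
    unsigned int offset, recno;
    hwm -= EX_LINK_SIZE;
    offset = ex_get(hwm, 0);
    recno = ex_get(code, offset);
    if (recno >= ngroups || group_offset[recno] == 0) return -1;
    ex_put(code, offset, group_offset[recno]);
    }
  return 0;
}
```

Like the restored loop, this performs one lookup per reference. The removed version instead remembered prev_recno so that runs of references to the same group reused one search result.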


Modified: code/trunk/pcre_dfa_exec.c
===================================================================
--- code/trunk/pcre_dfa_exec.c    2011-12-28 15:53:12 UTC (rev 834)
+++ code/trunk/pcre_dfa_exec.c    2011-12-28 16:10:09 UTC (rev 835)
@@ -2781,7 +2781,7 @@
             {
             const uschar *p = ptr;
             const uschar *pp = local_ptr;
-            charcount = (int)(pp - p);
+            charcount = pp - p;
             while (p < pp) if ((*p++ & 0xc0) == 0x80) charcount--;
             ADD_NEW_DATA(-next_state_offset, 0, (charcount - 1));
             }
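
Note on the charcount lines above: the code starts from the byte distance between the two pointers and subtracts one for every UTF-8 continuation byte (top bits 10), which leaves the number of characters. A standalone sketch of the same counting follows. The function name ex_utf8_charcount is hypothetical, and the input is assumed to be valid UTF-8.

```c
#include <assert.h>
#include <stddef.h>

/* Count characters in the valid UTF-8 byte range [p, pp): start from the
   byte count and subtract one for every continuation byte (10xxxxxx). */
int ex_utf8_charcount(const unsigned char *p, const unsigned char *pp)
{
  int charcount;
  assert(p != NULL && pp >= p);
  charcount = (int)(pp - p);   /* explicit narrowing from ptrdiff_t */
  while (p < pp) if ((*p++ & 0xc0) == 0x80) charcount--;
  return charcount;
}
```

The sketch keeps the explicit (int) cast. Without it, the pointer difference is narrowed implicitly, which some compilers warn about on hosts where ptrdiff_t is wider than int.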


Modified: code/trunk/pcre_exec.c
===================================================================
--- code/trunk/pcre_exec.c    2011-12-28 15:53:12 UTC (rev 834)
+++ code/trunk/pcre_exec.c    2011-12-28 16:10:09 UTC (rev 835)
@@ -82,6 +82,14 @@
 #define MATCH_SKIP_ARG     (-993)
 #define MATCH_THEN         (-992)


+/* This is a convenience macro for code that occurs many times. */
+
+#define MRRETURN(ra) \
+ { \
+ md->mark = markptr; \
+ RRETURN(ra); \
+ }
+
/* Maximum number of ints of offset to save on the stack for recursive calls.
If the offset vector is bigger, malloc is used. This should be a multiple of 3,
because the offset vector is always a multiple of 3 long. */
@@ -217,7 +225,7 @@
while (length-- > 0) if (*p++ != *eptr++) return -1;
}

-return (int)(eptr - eptr_start);
+return eptr - eptr_start;
}


@@ -282,7 +290,7 @@
#define RMATCH(ra,rb,rc,rd,re,rw) \
{ \
printf("match() called in line %d\n", __LINE__); \
- rrc = match(ra,rb,mstart,rc,rd,re,rdepth+1); \
+ rrc = match(ra,rb,mstart,markptr,rc,rd,re,rdepth+1); \
printf("to line %d\n", __LINE__); \
}
#define RRETURN(ra) \
@@ -292,7 +300,7 @@
}
#else
#define RMATCH(ra,rb,rc,rd,re,rw) \
- rrc = match(ra,rb,mstart,rc,rd,re,rdepth+1)
+ rrc = match(ra,rb,mstart,markptr,rc,rd,re,rdepth+1)
#define RRETURN(ra) return ra
#endif

@@ -313,6 +321,7 @@
newframe->Xeptr = ra;\
newframe->Xecode = rb;\
newframe->Xmstart = mstart;\
+ newframe->Xmarkptr = markptr;\
newframe->Xoffset_top = rc;\
newframe->Xeptrb = re;\
newframe->Xrdepth = frame->Xrdepth + 1;\
@@ -348,6 +357,7 @@
USPTR Xeptr;
const uschar *Xecode;
USPTR Xmstart;
+ USPTR Xmarkptr;
int Xoffset_top;
eptrblock *Xeptrb;
unsigned int Xrdepth;
@@ -417,7 +427,7 @@
same response. */

 /* These macros pack up tests that are used for partial matching, and which
-appear several times in the code. We set the "hit end" flag if the pointer is
+appears several times in the code. We set the "hit end" flag if the pointer is
 at the end of the subject and also past the start of the subject (i.e.
 something has been matched). For hard partial matching, we then return
 immediately. The second one is used when we already know we are past the end of
@@ -428,14 +438,14 @@
       eptr > md->start_used_ptr) \
     { \
     md->hitend = TRUE; \
-    if (md->partial > 1) RRETURN(PCRE_ERROR_PARTIAL); \
+    if (md->partial > 1) MRRETURN(PCRE_ERROR_PARTIAL); \
     }


 #define SCHECK_PARTIAL()\
   if (md->partial != 0 && eptr > md->start_used_ptr) \
     { \
     md->hitend = TRUE; \
-    if (md->partial > 1) RRETURN(PCRE_ERROR_PARTIAL); \
+    if (md->partial > 1) MRRETURN(PCRE_ERROR_PARTIAL); \
     }



@@ -449,6 +459,7 @@
    ecode       pointer to current position in compiled code
    mstart      pointer to the current match start position (can be modified
                  by encountering \K)
+   markptr     pointer to the most recent MARK name, or NULL
    offset_top  current top pointer
    md          pointer to "static" info for the match
    eptrb       pointer to chain of blocks containing eptr at start of
@@ -464,7 +475,8 @@


 static int
 match(REGISTER USPTR eptr, REGISTER const uschar *ecode, USPTR mstart,
-  int offset_top, match_data *md, eptrblock *eptrb, unsigned int rdepth)
+  const uschar *markptr, int offset_top, match_data *md, eptrblock *eptrb,
+  unsigned int rdepth)
 {
 /* These variables do not need to be preserved over recursion in this function,
 so they can be ordinary variables in all cases. Mark some of them with
@@ -494,6 +506,7 @@
 frame->Xeptr = eptr;
 frame->Xecode = ecode;
 frame->Xmstart = mstart;
+frame->Xmarkptr = markptr;
 frame->Xoffset_top = offset_top;
 frame->Xeptrb = eptrb;
 frame->Xrdepth = rdepth;
@@ -507,6 +520,7 @@
 #define eptr               frame->Xeptr
 #define ecode              frame->Xecode
 #define mstart             frame->Xmstart
+#define markptr            frame->Xmarkptr
 #define offset_top         frame->Xoffset_top
 #define eptrb              frame->Xeptrb
 #define rdepth             frame->Xrdepth
@@ -687,12 +701,9 @@
   switch(op)
     {
     case OP_MARK:
-    md->nomatch_mark = ecode + 2;
-    md->mark = NULL;    /* In case previously set by assertion */
+    markptr = ecode + 2;
     RMATCH(eptr, ecode + _pcre_OP_lengths[*ecode] + ecode[1], offset_top, md,
       eptrb, RM55);
-    if ((rrc == MATCH_MATCH || rrc == MATCH_ACCEPT) &&
-         md->mark == NULL) md->mark = ecode + 2;


     /* A return of MATCH_SKIP_ARG means that matching failed at SKIP with an
     argument, and we must check whether that argument matches this MARK's
@@ -701,16 +712,18 @@
     position and return MATCH_SKIP. Otherwise, pass back the return code
     unaltered. */


-    else if (rrc == MATCH_SKIP_ARG &&
-        strcmp((char *)(ecode + 2), (char *)(md->start_match_ptr)) == 0)
+    if (rrc == MATCH_SKIP_ARG &&
+        strcmp((char *)markptr, (char *)(md->start_match_ptr)) == 0)
       {
       md->start_match_ptr = eptr;
       RRETURN(MATCH_SKIP);
       }
+
+    if (md->mark == NULL) md->mark = markptr;
     RRETURN(rrc);


     case OP_FAIL:
-    RRETURN(MATCH_NOMATCH);
+    MRRETURN(MATCH_NOMATCH);


     /* COMMIT overrides PRUNE, SKIP, and THEN */


@@ -721,7 +734,7 @@
         rrc != MATCH_SKIP && rrc != MATCH_SKIP_ARG &&
         rrc != MATCH_THEN)
       RRETURN(rrc);
-    RRETURN(MATCH_COMMIT);
+    MRRETURN(MATCH_COMMIT);


     /* PRUNE overrides THEN */


@@ -729,16 +742,13 @@
     RMATCH(eptr, ecode + _pcre_OP_lengths[*ecode], offset_top, md,
       eptrb, RM51);
     if (rrc != MATCH_NOMATCH && rrc != MATCH_THEN) RRETURN(rrc);
-    RRETURN(MATCH_PRUNE);
+    MRRETURN(MATCH_PRUNE);


     case OP_PRUNE_ARG:
-    md->nomatch_mark = ecode + 2;
-    md->mark = NULL;    /* In case previously set by assertion */
     RMATCH(eptr, ecode + _pcre_OP_lengths[*ecode] + ecode[1], offset_top, md,
       eptrb, RM56);
-    if ((rrc == MATCH_MATCH || rrc == MATCH_ACCEPT) &&
-         md->mark == NULL) md->mark = ecode + 2;
     if (rrc != MATCH_NOMATCH && rrc != MATCH_THEN) RRETURN(rrc);
+    md->mark = ecode + 2;
     RRETURN(MATCH_PRUNE);


     /* SKIP overrides PRUNE and THEN */
@@ -749,18 +759,9 @@
     if (rrc != MATCH_NOMATCH && rrc != MATCH_PRUNE && rrc != MATCH_THEN)
       RRETURN(rrc);
     md->start_match_ptr = eptr;   /* Pass back current position */
-    RRETURN(MATCH_SKIP);
+    MRRETURN(MATCH_SKIP);


-    /* Note that, for Perl compatibility, SKIP with an argument does NOT set
-    nomatch_mark. There is a flag that disables this opcode when re-matching a
-    pattern that ended with a SKIP for which there was not a matching MARK. */
-
     case OP_SKIP_ARG:
-    if (md->ignore_skip_arg)
-      {
-      ecode += _pcre_OP_lengths[*ecode] + ecode[1];
-      break;
-      }
     RMATCH(eptr, ecode + _pcre_OP_lengths[*ecode] + ecode[1], offset_top, md,
       eptrb, RM57);
     if (rrc != MATCH_NOMATCH && rrc != MATCH_PRUNE && rrc != MATCH_THEN)
@@ -768,8 +769,8 @@


     /* Pass back the current skip name by overloading md->start_match_ptr and
     returning the special MATCH_SKIP_ARG return code. This will either be
-    caught by a matching MARK, or get to the top, where it causes a rematch
-    with the md->ignore_skip_arg flag set. */
+    caught by a matching MARK, or get to the top, where it is treated the same
+    as PRUNE. */


     md->start_match_ptr = ecode + 2;
     RRETURN(MATCH_SKIP_ARG);
@@ -783,17 +784,14 @@
       eptrb, RM54);
     if (rrc != MATCH_NOMATCH) RRETURN(rrc);
     md->start_match_ptr = ecode;
-    RRETURN(MATCH_THEN);
+    MRRETURN(MATCH_THEN);


     case OP_THEN_ARG:
-    md->nomatch_mark = ecode + 2;
-    md->mark = NULL;    /* In case previously set by assertion */
     RMATCH(eptr, ecode + _pcre_OP_lengths[*ecode] + ecode[1], offset_top,
       md, eptrb, RM58);
-    if ((rrc == MATCH_MATCH || rrc == MATCH_ACCEPT) &&
-         md->mark == NULL) md->mark = ecode + 2;
     if (rrc != MATCH_NOMATCH) RRETURN(rrc);
     md->start_match_ptr = ecode;
+    md->mark = ecode + 2;
     RRETURN(MATCH_THEN);


     /* Handle an atomic group that does not contain any capturing parentheses.
@@ -818,6 +816,7 @@
       if (rrc == MATCH_MATCH)  /* Note: _not_ MATCH_ACCEPT */
         {
         mstart = md->start_match_ptr;
+        markptr = md->mark;
         break;
         }
       if (rrc == MATCH_THEN)
@@ -955,6 +954,7 @@


       /* At this point, rrc will be one of MATCH_ONCE or MATCH_NOMATCH. */


+      if (md->mark == NULL) md->mark = markptr;
       RRETURN(rrc);
       }


@@ -1042,6 +1042,7 @@
       if (*ecode != OP_ALT) break;
       }


+    if (md->mark == NULL) md->mark = markptr;
     RRETURN(MATCH_NOMATCH);


     /* Handle possessive capturing brackets with an unlimited repeat. We come
@@ -1070,7 +1071,7 @@
     if (offset < md->offset_max)
       {
       matched_once = FALSE;
-      code_offset = (int)(ecode - md->start_code);
+      code_offset = ecode - md->start_code;


       save_offset1 = md->offset_vector[offset];
       save_offset2 = md->offset_vector[offset+1];
@@ -1129,6 +1130,7 @@
         md->offset_vector[md->offset_end - number] = save_offset3;
         }


+      if (md->mark == NULL) md->mark = markptr;
       if (allow_zero || matched_once)
         {
         ecode += 1 + LINK_SIZE;
@@ -1160,7 +1162,7 @@


     POSSESSIVE_NON_CAPTURE:
     matched_once = FALSE;
-    code_offset = (int)(ecode - md->start_code);
+    code_offset = ecode - md->start_code;


     for (;;)
       {
@@ -1230,8 +1232,8 @@
         cb.capture_top      = offset_top/2;
         cb.capture_last     = md->capture_last;
         cb.callout_data     = md->callout_data;
-        cb.mark             = md->nomatch_mark;
-        if ((rrc = (*pcre_callout)(&cb)) > 0) RRETURN(MATCH_NOMATCH);
+        cb.mark             = markptr;
+        if ((rrc = (*pcre_callout)(&cb)) > 0) MRRETURN(MATCH_NOMATCH);
         if (rrc < 0) RRETURN(rrc);
         }
       ecode += _pcre_OP_lengths[OP_CALLOUT];
@@ -1486,7 +1488,7 @@
          (md->notempty ||
            (md->notempty_atstart &&
              mstart == md->start_subject + md->start_offset)))
-      RRETURN(MATCH_NOMATCH);
+      MRRETURN(MATCH_NOMATCH);


     /* Otherwise, we have a match. */


@@ -1495,10 +1497,10 @@
     md->start_match_ptr = mstart;       /* and the start (\K can modify) */


     /* For some reason, the macros don't work properly if an expression is
-    given as the argument to RRETURN when the heap is in use. */
+    given as the argument to MRRETURN when the heap is in use. */


     rrc = (op == OP_END)? MATCH_MATCH : MATCH_ACCEPT;
-    RRETURN(rrc);
+    MRRETURN(rrc);


     /* Assertion brackets. Check the alternative branches in turn - the
     matching won't pass the KET for an assertion. If any one branch matches,
@@ -1526,6 +1528,7 @@
       if (rrc == MATCH_MATCH || rrc == MATCH_ACCEPT)
         {
         mstart = md->start_match_ptr;   /* In case \K reset it */
+        markptr = md->mark;
         break;
         }


@@ -1537,7 +1540,7 @@
       }
     while (*ecode == OP_ALT);


-    if (*ecode == OP_KET) RRETURN(MATCH_NOMATCH);
+    if (*ecode == OP_KET) MRRETURN(MATCH_NOMATCH);


     /* If checking an assertion for a condition, return MATCH_MATCH. */


@@ -1567,7 +1570,7 @@
     do
       {
       RMATCH(eptr, ecode + 1 + LINK_SIZE, offset_top, md, NULL, RM5);
-      if (rrc == MATCH_MATCH || rrc == MATCH_ACCEPT) RRETURN(MATCH_NOMATCH);
+      if (rrc == MATCH_MATCH || rrc == MATCH_ACCEPT) MRRETURN(MATCH_NOMATCH);
       if (rrc == MATCH_SKIP || rrc == MATCH_PRUNE || rrc == MATCH_COMMIT)
         {
         do ecode += GET(ecode,1); while (*ecode == OP_ALT);
@@ -1600,7 +1603,7 @@
       while (i-- > 0)
         {
         eptr--;
-        if (eptr < md->start_subject) RRETURN(MATCH_NOMATCH);
+        if (eptr < md->start_subject) MRRETURN(MATCH_NOMATCH);
         BACKCHAR(eptr);
         }
       }
@@ -1611,7 +1614,7 @@


       {
       eptr -= GET(ecode, 1);
-      if (eptr < md->start_subject) RRETURN(MATCH_NOMATCH);
+      if (eptr < md->start_subject) MRRETURN(MATCH_NOMATCH);
       }


     /* Save the earliest consulted character, then skip to next op code */
@@ -1640,8 +1643,8 @@
       cb.capture_top      = offset_top/2;
       cb.capture_last     = md->capture_last;
       cb.callout_data     = md->callout_data;
-      cb.mark             = md->nomatch_mark;
-      if ((rrc = (*pcre_callout)(&cb)) > 0) RRETURN(MATCH_NOMATCH);
+      cb.mark             = markptr;
+      if ((rrc = (*pcre_callout)(&cb)) > 0) MRRETURN(MATCH_NOMATCH);
       if (rrc < 0) RRETURN(rrc);
       }
     ecode += 2 + 2*LINK_SIZE;
@@ -1755,7 +1758,7 @@
       md->recursive = new_recursive.prevrec;
       if (new_recursive.offset_save != stacksave)
         (pcre_free)(new_recursive.offset_save);
-      RRETURN(MATCH_NOMATCH);
+      MRRETURN(MATCH_NOMATCH);
       }


     RECURSION_MATCHED:
@@ -1835,7 +1838,7 @@
       md->end_match_ptr = eptr;      /* For ONCE_NC */
       md->end_offset_top = offset_top;
       md->start_match_ptr = mstart;
-      RRETURN(MATCH_MATCH);         /* Sets md->mark */
+      MRRETURN(MATCH_MATCH);         /* Sets md->mark */
       }


     /* For capturing groups we have to check the group number back at the start
@@ -1977,29 +1980,29 @@
     /* Not multiline mode: start of subject assertion, unless notbol. */


     case OP_CIRC:
-    if (md->notbol && eptr == md->start_subject) RRETURN(MATCH_NOMATCH);
+    if (md->notbol && eptr == md->start_subject) MRRETURN(MATCH_NOMATCH);


     /* Start of subject assertion */


     case OP_SOD:
-    if (eptr != md->start_subject) RRETURN(MATCH_NOMATCH);
+    if (eptr != md->start_subject) MRRETURN(MATCH_NOMATCH);
     ecode++;
     break;


     /* Multiline mode: start of subject unless notbol, or after any newline. */


     case OP_CIRCM:
-    if (md->notbol && eptr == md->start_subject) RRETURN(MATCH_NOMATCH);
+    if (md->notbol && eptr == md->start_subject) MRRETURN(MATCH_NOMATCH);
     if (eptr != md->start_subject &&
         (eptr == md->end_subject || !WAS_NEWLINE(eptr)))
-      RRETURN(MATCH_NOMATCH);
+      MRRETURN(MATCH_NOMATCH);
     ecode++;
     break;


     /* Start of match assertion */


     case OP_SOM:
-    if (eptr != md->start_subject + md->start_offset) RRETURN(MATCH_NOMATCH);
+    if (eptr != md->start_subject + md->start_offset) MRRETURN(MATCH_NOMATCH);
     ecode++;
     break;


@@ -2015,10 +2018,10 @@

     case OP_DOLLM:
     if (eptr < md->end_subject)
-      { if (!IS_NEWLINE(eptr)) RRETURN(MATCH_NOMATCH); }
+      { if (!IS_NEWLINE(eptr)) MRRETURN(MATCH_NOMATCH); }
     else
       {
-      if (md->noteol) RRETURN(MATCH_NOMATCH);
+      if (md->noteol) MRRETURN(MATCH_NOMATCH);
       SCHECK_PARTIAL();
       }
     ecode++;
@@ -2028,7 +2031,7 @@
     subject unless noteol is set. */


     case OP_DOLL:
-    if (md->noteol) RRETURN(MATCH_NOMATCH);
+    if (md->noteol) MRRETURN(MATCH_NOMATCH);
     if (!md->endonly) goto ASSERT_NL_OR_EOS;


     /* ... else fall through for endonly */
@@ -2036,7 +2039,7 @@
     /* End of subject assertion (\z) */


     case OP_EOD:
-    if (eptr < md->end_subject) RRETURN(MATCH_NOMATCH);
+    if (eptr < md->end_subject) MRRETURN(MATCH_NOMATCH);
     SCHECK_PARTIAL();
     ecode++;
     break;
@@ -2047,7 +2050,7 @@
     ASSERT_NL_OR_EOS:
     if (eptr < md->end_subject &&
         (!IS_NEWLINE(eptr) || eptr != md->end_subject - md->nllen))
-      RRETURN(MATCH_NOMATCH);
+      MRRETURN(MATCH_NOMATCH);


     /* Either at end of string or \n before end. */


@@ -2169,21 +2172,21 @@

       if ((*ecode++ == OP_WORD_BOUNDARY)?
            cur_is_word == prev_is_word : cur_is_word != prev_is_word)
-        RRETURN(MATCH_NOMATCH);
+        MRRETURN(MATCH_NOMATCH);
       }
     break;


     /* Match a single character type; inline for speed */


     case OP_ANY:
-    if (IS_NEWLINE(eptr)) RRETURN(MATCH_NOMATCH);
+    if (IS_NEWLINE(eptr)) MRRETURN(MATCH_NOMATCH);
     /* Fall through */


     case OP_ALLANY:
     if (eptr >= md->end_subject)   /* DO NOT merge the eptr++ here; it must */
       {                            /* not be updated before SCHECK_PARTIAL. */
       SCHECK_PARTIAL();
-      RRETURN(MATCH_NOMATCH);
+      MRRETURN(MATCH_NOMATCH);
       }
     eptr++;
     if (utf8) while (eptr < md->end_subject && (*eptr & 0xc0) == 0x80) eptr++;
@@ -2197,7 +2200,7 @@
     if (eptr >= md->end_subject)   /* DO NOT merge the eptr++ here; it must */
       {                            /* not be updated before SCHECK_PARTIAL. */
       SCHECK_PARTIAL();
-      RRETURN(MATCH_NOMATCH);
+      MRRETURN(MATCH_NOMATCH);
       }
     eptr++;
     ecode++;
@@ -2207,7 +2210,7 @@
     if (eptr >= md->end_subject)
       {
       SCHECK_PARTIAL();
-      RRETURN(MATCH_NOMATCH);
+      MRRETURN(MATCH_NOMATCH);
       }
     GETCHARINCTEST(c, eptr);
     if (
@@ -2216,7 +2219,7 @@
 #endif
        (md->ctypes[c] & ctype_digit) != 0
        )
-      RRETURN(MATCH_NOMATCH);
+      MRRETURN(MATCH_NOMATCH);
     ecode++;
     break;


@@ -2224,7 +2227,7 @@
     if (eptr >= md->end_subject)
       {
       SCHECK_PARTIAL();
-      RRETURN(MATCH_NOMATCH);
+      MRRETURN(MATCH_NOMATCH);
       }
     GETCHARINCTEST(c, eptr);
     if (
@@ -2233,7 +2236,7 @@
 #endif
        (md->ctypes[c] & ctype_digit) == 0
        )
-      RRETURN(MATCH_NOMATCH);
+      MRRETURN(MATCH_NOMATCH);
     ecode++;
     break;


@@ -2241,7 +2244,7 @@
     if (eptr >= md->end_subject)
       {
       SCHECK_PARTIAL();
-      RRETURN(MATCH_NOMATCH);
+      MRRETURN(MATCH_NOMATCH);
       }
     GETCHARINCTEST(c, eptr);
     if (
@@ -2250,7 +2253,7 @@
 #endif
        (md->ctypes[c] & ctype_space) != 0
        )
-      RRETURN(MATCH_NOMATCH);
+      MRRETURN(MATCH_NOMATCH);
     ecode++;
     break;


@@ -2258,7 +2261,7 @@
     if (eptr >= md->end_subject)
       {
       SCHECK_PARTIAL();
-      RRETURN(MATCH_NOMATCH);
+      MRRETURN(MATCH_NOMATCH);
       }
     GETCHARINCTEST(c, eptr);
     if (
@@ -2267,7 +2270,7 @@
 #endif
        (md->ctypes[c] & ctype_space) == 0
        )
-      RRETURN(MATCH_NOMATCH);
+      MRRETURN(MATCH_NOMATCH);
     ecode++;
     break;


@@ -2275,7 +2278,7 @@
     if (eptr >= md->end_subject)
       {
       SCHECK_PARTIAL();
-      RRETURN(MATCH_NOMATCH);
+      MRRETURN(MATCH_NOMATCH);
       }
     GETCHARINCTEST(c, eptr);
     if (
@@ -2284,7 +2287,7 @@
 #endif
        (md->ctypes[c] & ctype_word) != 0
        )
-      RRETURN(MATCH_NOMATCH);
+      MRRETURN(MATCH_NOMATCH);
     ecode++;
     break;


@@ -2292,7 +2295,7 @@
     if (eptr >= md->end_subject)
       {
       SCHECK_PARTIAL();
-      RRETURN(MATCH_NOMATCH);
+      MRRETURN(MATCH_NOMATCH);
       }
     GETCHARINCTEST(c, eptr);
     if (
@@ -2301,7 +2304,7 @@
 #endif
        (md->ctypes[c] & ctype_word) == 0
        )
-      RRETURN(MATCH_NOMATCH);
+      MRRETURN(MATCH_NOMATCH);
     ecode++;
     break;


@@ -2309,12 +2312,12 @@
     if (eptr >= md->end_subject)
       {
       SCHECK_PARTIAL();
-      RRETURN(MATCH_NOMATCH);
+      MRRETURN(MATCH_NOMATCH);
       }
     GETCHARINCTEST(c, eptr);
     switch(c)
       {
-      default: RRETURN(MATCH_NOMATCH);
+      default: MRRETURN(MATCH_NOMATCH);


       case 0x000d:
       if (eptr < md->end_subject && *eptr == 0x0a) eptr++;
@@ -2328,7 +2331,7 @@
       case 0x0085:
       case 0x2028:
       case 0x2029:
-      if (md->bsr_anycrlf) RRETURN(MATCH_NOMATCH);
+      if (md->bsr_anycrlf) MRRETURN(MATCH_NOMATCH);
       break;
       }
     ecode++;
@@ -2338,7 +2341,7 @@
     if (eptr >= md->end_subject)
       {
       SCHECK_PARTIAL();
-      RRETURN(MATCH_NOMATCH);
+      MRRETURN(MATCH_NOMATCH);
       }
     GETCHARINCTEST(c, eptr);
     switch(c)
@@ -2363,7 +2366,7 @@
       case 0x202f:    /* NARROW NO-BREAK SPACE */
       case 0x205f:    /* MEDIUM MATHEMATICAL SPACE */
       case 0x3000:    /* IDEOGRAPHIC SPACE */
-      RRETURN(MATCH_NOMATCH);
+      MRRETURN(MATCH_NOMATCH);
       }
     ecode++;
     break;
@@ -2372,12 +2375,12 @@
     if (eptr >= md->end_subject)
       {
       SCHECK_PARTIAL();
-      RRETURN(MATCH_NOMATCH);
+      MRRETURN(MATCH_NOMATCH);
       }
     GETCHARINCTEST(c, eptr);
     switch(c)
       {
-      default: RRETURN(MATCH_NOMATCH);
+      default: MRRETURN(MATCH_NOMATCH);
       case 0x09:      /* HT */
       case 0x20:      /* SPACE */
       case 0xa0:      /* NBSP */
@@ -2406,7 +2409,7 @@
     if (eptr >= md->end_subject)
       {
       SCHECK_PARTIAL();
-      RRETURN(MATCH_NOMATCH);
+      MRRETURN(MATCH_NOMATCH);
       }
     GETCHARINCTEST(c, eptr);
     switch(c)
@@ -2419,7 +2422,7 @@
       case 0x85:      /* NEL */
       case 0x2028:    /* LINE SEPARATOR */
       case 0x2029:    /* PARAGRAPH SEPARATOR */
-      RRETURN(MATCH_NOMATCH);
+      MRRETURN(MATCH_NOMATCH);
       }
     ecode++;
     break;
@@ -2428,12 +2431,12 @@
     if (eptr >= md->end_subject)
       {
       SCHECK_PARTIAL();
-      RRETURN(MATCH_NOMATCH);
+      MRRETURN(MATCH_NOMATCH);
       }
     GETCHARINCTEST(c, eptr);
     switch(c)
       {
-      default: RRETURN(MATCH_NOMATCH);
+      default: MRRETURN(MATCH_NOMATCH);
       case 0x0a:      /* LF */
       case 0x0b:      /* VT */
       case 0x0c:      /* FF */
@@ -2455,7 +2458,7 @@
     if (eptr >= md->end_subject)
       {
       SCHECK_PARTIAL();
-      RRETURN(MATCH_NOMATCH);
+      MRRETURN(MATCH_NOMATCH);
       }
     GETCHARINCTEST(c, eptr);
       {
@@ -2464,29 +2467,29 @@
       switch(ecode[1])
         {
         case PT_ANY:
-        if (op == OP_NOTPROP) RRETURN(MATCH_NOMATCH);
+        if (op == OP_NOTPROP) MRRETURN(MATCH_NOMATCH);
         break;


         case PT_LAMP:
         if ((prop->chartype == ucp_Lu ||
              prop->chartype == ucp_Ll ||
              prop->chartype == ucp_Lt) == (op == OP_NOTPROP))
-          RRETURN(MATCH_NOMATCH);
+          MRRETURN(MATCH_NOMATCH);
         break;


         case PT_GC:
         if ((ecode[2] != _pcre_ucp_gentype[prop->chartype]) == (op == OP_PROP))
-          RRETURN(MATCH_NOMATCH);
+          MRRETURN(MATCH_NOMATCH);
         break;


         case PT_PC:
         if ((ecode[2] != prop->chartype) == (op == OP_PROP))
-          RRETURN(MATCH_NOMATCH);
+          MRRETURN(MATCH_NOMATCH);
         break;


         case PT_SC:
         if ((ecode[2] != prop->script) == (op == OP_PROP))
-          RRETURN(MATCH_NOMATCH);
+          MRRETURN(MATCH_NOMATCH);
         break;


         /* These are specials */
@@ -2494,14 +2497,14 @@
         case PT_ALNUM:
         if ((_pcre_ucp_gentype[prop->chartype] == ucp_L ||
              _pcre_ucp_gentype[prop->chartype] == ucp_N) == (op == OP_NOTPROP))
-          RRETURN(MATCH_NOMATCH);
+          MRRETURN(MATCH_NOMATCH);
         break;


         case PT_SPACE:    /* Perl space */
         if ((_pcre_ucp_gentype[prop->chartype] == ucp_Z ||
              c == CHAR_HT || c == CHAR_NL || c == CHAR_FF || c == CHAR_CR)
                == (op == OP_NOTPROP))
-          RRETURN(MATCH_NOMATCH);
+          MRRETURN(MATCH_NOMATCH);
         break;


         case PT_PXSPACE:  /* POSIX space */
@@ -2509,14 +2512,14 @@
              c == CHAR_HT || c == CHAR_NL || c == CHAR_VT ||
              c == CHAR_FF || c == CHAR_CR)
                == (op == OP_NOTPROP))
-          RRETURN(MATCH_NOMATCH);
+          MRRETURN(MATCH_NOMATCH);
         break;


         case PT_WORD:
         if ((_pcre_ucp_gentype[prop->chartype] == ucp_L ||
              _pcre_ucp_gentype[prop->chartype] == ucp_N ||
              c == CHAR_UNDERSCORE) == (op == OP_NOTPROP))
-          RRETURN(MATCH_NOMATCH);
+          MRRETURN(MATCH_NOMATCH);
         break;


         /* This should never occur */
@@ -2536,10 +2539,10 @@
     if (eptr >= md->end_subject)
       {
       SCHECK_PARTIAL();
-      RRETURN(MATCH_NOMATCH);
+      MRRETURN(MATCH_NOMATCH);
       }
     GETCHARINCTEST(c, eptr);
-    if (UCD_CATEGORY(c) == ucp_M) RRETURN(MATCH_NOMATCH);
+    if (UCD_CATEGORY(c) == ucp_M) MRRETURN(MATCH_NOMATCH);
     while (eptr < md->end_subject)
       {
       int len = 1;
@@ -2613,7 +2616,7 @@
       if ((length = match_ref(offset, eptr, length, md, caseless)) < 0)
         {
         CHECK_PARTIAL();
-        RRETURN(MATCH_NOMATCH);
+        MRRETURN(MATCH_NOMATCH);
         }
       eptr += length;
       continue;              /* With the main loop */
@@ -2634,7 +2637,7 @@
       if ((slength = match_ref(offset, eptr, length, md, caseless)) < 0)
         {
         CHECK_PARTIAL();
-        RRETURN(MATCH_NOMATCH);
+        MRRETURN(MATCH_NOMATCH);
         }
       eptr += slength;
       }
@@ -2653,11 +2656,11 @@
         int slength;
         RMATCH(eptr, ecode, offset_top, md, eptrb, RM14);
         if (rrc != MATCH_NOMATCH) RRETURN(rrc);
-        if (fi >= max) RRETURN(MATCH_NOMATCH);
+        if (fi >= max) MRRETURN(MATCH_NOMATCH);
         if ((slength = match_ref(offset, eptr, length, md, caseless)) < 0)
           {
           CHECK_PARTIAL();
-          RRETURN(MATCH_NOMATCH);
+          MRRETURN(MATCH_NOMATCH);
           }
         eptr += slength;
         }
@@ -2685,7 +2688,7 @@
         if (rrc != MATCH_NOMATCH) RRETURN(rrc);
         eptr -= length;
         }
-      RRETURN(MATCH_NOMATCH);
+      MRRETURN(MATCH_NOMATCH);
       }
     /* Control never gets here */


@@ -2746,16 +2749,16 @@
           if (eptr >= md->end_subject)
             {
             SCHECK_PARTIAL();
-            RRETURN(MATCH_NOMATCH);
+            MRRETURN(MATCH_NOMATCH);
             }
           GETCHARINC(c, eptr);
           if (c > 255)
             {
-            if (op == OP_CLASS) RRETURN(MATCH_NOMATCH);
+            if (op == OP_CLASS) MRRETURN(MATCH_NOMATCH);
             }
           else
             {
-            if ((data[c/8] & (1 << (c&7))) == 0) RRETURN(MATCH_NOMATCH);
+            if ((data[c/8] & (1 << (c&7))) == 0) MRRETURN(MATCH_NOMATCH);
             }
           }
         }
@@ -2768,10 +2771,10 @@
           if (eptr >= md->end_subject)
             {
             SCHECK_PARTIAL();
-            RRETURN(MATCH_NOMATCH);
+            MRRETURN(MATCH_NOMATCH);
             }
           c = *eptr++;
-          if ((data[c/8] & (1 << (c&7))) == 0) RRETURN(MATCH_NOMATCH);
+          if ((data[c/8] & (1 << (c&7))) == 0) MRRETURN(MATCH_NOMATCH);
           }
         }


@@ -2793,20 +2796,20 @@
             {
             RMATCH(eptr, ecode, offset_top, md, eptrb, RM16);
             if (rrc != MATCH_NOMATCH) RRETURN(rrc);
-            if (fi >= max) RRETURN(MATCH_NOMATCH);
+            if (fi >= max) MRRETURN(MATCH_NOMATCH);
             if (eptr >= md->end_subject)
               {
               SCHECK_PARTIAL();
-              RRETURN(MATCH_NOMATCH);
+              MRRETURN(MATCH_NOMATCH);
               }
             GETCHARINC(c, eptr);
             if (c > 255)
               {
-              if (op == OP_CLASS) RRETURN(MATCH_NOMATCH);
+              if (op == OP_CLASS) MRRETURN(MATCH_NOMATCH);
               }
             else
               {
-              if ((data[c/8] & (1 << (c&7))) == 0) RRETURN(MATCH_NOMATCH);
+              if ((data[c/8] & (1 << (c&7))) == 0) MRRETURN(MATCH_NOMATCH);
               }
             }
           }
@@ -2818,14 +2821,14 @@
             {
             RMATCH(eptr, ecode, offset_top, md, eptrb, RM17);
             if (rrc != MATCH_NOMATCH) RRETURN(rrc);
-            if (fi >= max) RRETURN(MATCH_NOMATCH);
+            if (fi >= max) MRRETURN(MATCH_NOMATCH);
             if (eptr >= md->end_subject)
               {
               SCHECK_PARTIAL();
-              RRETURN(MATCH_NOMATCH);
+              MRRETURN(MATCH_NOMATCH);
               }
             c = *eptr++;
-            if ((data[c/8] & (1 << (c&7))) == 0) RRETURN(MATCH_NOMATCH);
+            if ((data[c/8] & (1 << (c&7))) == 0) MRRETURN(MATCH_NOMATCH);
             }
           }
         /* Control never gets here */
@@ -2891,7 +2894,7 @@
             }
           }


-        RRETURN(MATCH_NOMATCH);
+        MRRETURN(MATCH_NOMATCH);
         }
       }
     /* Control never gets here */
@@ -2943,10 +2946,10 @@
         if (eptr >= md->end_subject)
           {
           SCHECK_PARTIAL();
-          RRETURN(MATCH_NOMATCH);
+          MRRETURN(MATCH_NOMATCH);
           }
         GETCHARINCTEST(c, eptr);
-        if (!_pcre_xclass(c, data)) RRETURN(MATCH_NOMATCH);
+        if (!_pcre_xclass(c, data)) MRRETURN(MATCH_NOMATCH);
         }


       /* If max == min we can continue with the main loop without the
@@ -2963,14 +2966,14 @@
           {
           RMATCH(eptr, ecode, offset_top, md, eptrb, RM20);
           if (rrc != MATCH_NOMATCH) RRETURN(rrc);
-          if (fi >= max) RRETURN(MATCH_NOMATCH);
+          if (fi >= max) MRRETURN(MATCH_NOMATCH);
           if (eptr >= md->end_subject)
             {
             SCHECK_PARTIAL();
-            RRETURN(MATCH_NOMATCH);
+            MRRETURN(MATCH_NOMATCH);
             }
           GETCHARINCTEST(c, eptr);
-          if (!_pcre_xclass(c, data)) RRETURN(MATCH_NOMATCH);
+          if (!_pcre_xclass(c, data)) MRRETURN(MATCH_NOMATCH);
           }
         /* Control never gets here */
         }
@@ -2999,7 +3002,7 @@
           if (eptr-- == pp) break;        /* Stop if tried at original pos */
           if (utf8) BACKCHAR(eptr);
           }
-        RRETURN(MATCH_NOMATCH);
+        MRRETURN(MATCH_NOMATCH);
         }


       /* Control never gets here */
@@ -3018,9 +3021,9 @@
       if (length > md->end_subject - eptr)
         {
         CHECK_PARTIAL();             /* Not SCHECK_PARTIAL() */
-        RRETURN(MATCH_NOMATCH);
+        MRRETURN(MATCH_NOMATCH);
         }
-      while (length-- > 0) if (*ecode++ != *eptr++) RRETURN(MATCH_NOMATCH);
+      while (length-- > 0) if (*ecode++ != *eptr++) MRRETURN(MATCH_NOMATCH);
       }
     else
 #endif
@@ -3030,23 +3033,16 @@
       if (md->end_subject - eptr < 1)
         {
         SCHECK_PARTIAL();            /* This one can use SCHECK_PARTIAL() */
-        RRETURN(MATCH_NOMATCH);
+        MRRETURN(MATCH_NOMATCH);
         }
-      if (ecode[1] != *eptr++) RRETURN(MATCH_NOMATCH);
+      if (ecode[1] != *eptr++) MRRETURN(MATCH_NOMATCH);
       ecode += 2;
       }
     break;


-    /* Match a single character, caselessly. If we are at the end of the
-    subject, give up immediately. */
+    /* Match a single character, caselessly */


     case OP_CHARI:
-    if (eptr >= md->end_subject)
-      {
-      SCHECK_PARTIAL();
-      RRETURN(MATCH_NOMATCH);
-      }
-
 #ifdef SUPPORT_UTF8
     if (utf8)
       {
@@ -3054,19 +3050,21 @@
       ecode++;
       GETCHARLEN(fc, ecode, length);


+      if (length > md->end_subject - eptr)
+        {
+        CHECK_PARTIAL();             /* Not SCHECK_PARTIAL() */
+        MRRETURN(MATCH_NOMATCH);
+        }
+
       /* If the pattern character's value is < 128, we have only one byte, and
-      we know that its other case must also be one byte long, so we can use the
-      fast lookup table. We know that there is at least one byte left in the
-      subject. */
+      can use the fast lookup table. */


       if (fc < 128)
         {
-        if (md->lcc[*ecode++] != md->lcc[*eptr++]) RRETURN(MATCH_NOMATCH);
+        if (md->lcc[*ecode++] != md->lcc[*eptr++]) MRRETURN(MATCH_NOMATCH);
         }


-      /* Otherwise we must pick up the subject character. Note that we cannot
-      use the value of "length" to check for sufficient bytes left, because the
-      other case of the character may have more or fewer bytes.  */
+      /* Otherwise we must pick up the subject character */


       else
         {
@@ -3082,7 +3080,7 @@
 #ifdef SUPPORT_UCP
           if (dc != UCD_OTHERCASE(fc))
 #endif
-            RRETURN(MATCH_NOMATCH);
+            MRRETURN(MATCH_NOMATCH);
           }
         }
       }
@@ -3091,7 +3089,12 @@


     /* Non-UTF-8 mode */
       {
-      if (md->lcc[ecode[1]] != md->lcc[*eptr++]) RRETURN(MATCH_NOMATCH);
+      if (md->end_subject - eptr < 1)
+        {
+        SCHECK_PARTIAL();            /* This one can use SCHECK_PARTIAL() */
+        MRRETURN(MATCH_NOMATCH);
+        }
+      if (md->lcc[ecode[1]] != md->lcc[*eptr++]) MRRETURN(MATCH_NOMATCH);
       ecode += 2;
       }
     break;
@@ -3197,7 +3200,7 @@
           else
             {
             CHECK_PARTIAL();
-            RRETURN(MATCH_NOMATCH);
+            MRRETURN(MATCH_NOMATCH);
             }
           }


@@ -3209,7 +3212,7 @@
             {
             RMATCH(eptr, ecode, offset_top, md, eptrb, RM22);
             if (rrc != MATCH_NOMATCH) RRETURN(rrc);
-            if (fi >= max) RRETURN(MATCH_NOMATCH);
+            if (fi >= max) MRRETURN(MATCH_NOMATCH);
             if (eptr <= md->end_subject - length &&
               memcmp(eptr, charptr, length) == 0) eptr += length;
 #ifdef SUPPORT_UCP
@@ -3220,7 +3223,7 @@
             else
               {
               CHECK_PARTIAL();
-              RRETURN(MATCH_NOMATCH);
+              MRRETURN(MATCH_NOMATCH);
               }
             }
           /* Control never gets here */
@@ -3251,7 +3254,7 @@
             {
             RMATCH(eptr, ecode, offset_top, md, eptrb, RM23);
             if (rrc != MATCH_NOMATCH) RRETURN(rrc);
-            if (eptr == pp) { RRETURN(MATCH_NOMATCH); }
+            if (eptr == pp) { MRRETURN(MATCH_NOMATCH); }
 #ifdef SUPPORT_UCP
             eptr--;
             BACKCHAR(eptr);
@@ -3294,9 +3297,9 @@
         if (eptr >= md->end_subject)
           {
           SCHECK_PARTIAL();
-          RRETURN(MATCH_NOMATCH);
+          MRRETURN(MATCH_NOMATCH);
           }
-        if (fc != md->lcc[*eptr++]) RRETURN(MATCH_NOMATCH);
+        if (fc != md->lcc[*eptr++]) MRRETURN(MATCH_NOMATCH);
         }
       if (min == max) continue;
       if (minimize)
@@ -3305,13 +3308,13 @@
           {
           RMATCH(eptr, ecode, offset_top, md, eptrb, RM24);
           if (rrc != MATCH_NOMATCH) RRETURN(rrc);
-          if (fi >= max) RRETURN(MATCH_NOMATCH);
+          if (fi >= max) MRRETURN(MATCH_NOMATCH);
           if (eptr >= md->end_subject)
             {
             SCHECK_PARTIAL();
-            RRETURN(MATCH_NOMATCH);
+            MRRETURN(MATCH_NOMATCH);
             }
-          if (fc != md->lcc[*eptr++]) RRETURN(MATCH_NOMATCH);
+          if (fc != md->lcc[*eptr++]) MRRETURN(MATCH_NOMATCH);
           }
         /* Control never gets here */
         }
@@ -3337,7 +3340,7 @@
           eptr--;
           if (rrc != MATCH_NOMATCH) RRETURN(rrc);
           }
-        RRETURN(MATCH_NOMATCH);
+        MRRETURN(MATCH_NOMATCH);
         }
       /* Control never gets here */
       }
@@ -3351,9 +3354,9 @@
         if (eptr >= md->end_subject)
           {
           SCHECK_PARTIAL();
-          RRETURN(MATCH_NOMATCH);
+          MRRETURN(MATCH_NOMATCH);
           }
-        if (fc != *eptr++) RRETURN(MATCH_NOMATCH);
+        if (fc != *eptr++) MRRETURN(MATCH_NOMATCH);
         }


       if (min == max) continue;
@@ -3364,13 +3367,13 @@
           {
           RMATCH(eptr, ecode, offset_top, md, eptrb, RM26);
           if (rrc != MATCH_NOMATCH) RRETURN(rrc);
-          if (fi >= max) RRETURN(MATCH_NOMATCH);
+          if (fi >= max) MRRETURN(MATCH_NOMATCH);
           if (eptr >= md->end_subject)
             {
             SCHECK_PARTIAL();
-            RRETURN(MATCH_NOMATCH);
+            MRRETURN(MATCH_NOMATCH);
             }
-          if (fc != *eptr++) RRETURN(MATCH_NOMATCH);
+          if (fc != *eptr++) MRRETURN(MATCH_NOMATCH);
           }
         /* Control never gets here */
         }
@@ -3395,7 +3398,7 @@
           eptr--;
           if (rrc != MATCH_NOMATCH) RRETURN(rrc);
           }
-        RRETURN(MATCH_NOMATCH);
+        MRRETURN(MATCH_NOMATCH);
         }
       }
     /* Control never gets here */
@@ -3408,7 +3411,7 @@
     if (eptr >= md->end_subject)
       {
       SCHECK_PARTIAL();
-      RRETURN(MATCH_NOMATCH);
+      MRRETURN(MATCH_NOMATCH);
       }
     ecode++;
     GETCHARINCTEST(c, eptr);
@@ -3418,11 +3421,11 @@
       if (c < 256)
 #endif
       c = md->lcc[c];
-      if (md->lcc[*ecode++] == c) RRETURN(MATCH_NOMATCH);
+      if (md->lcc[*ecode++] == c) MRRETURN(MATCH_NOMATCH);
       }
     else    /* Caseful */
       {
-      if (*ecode++ == c) RRETURN(MATCH_NOMATCH);
+      if (*ecode++ == c) MRRETURN(MATCH_NOMATCH);
       }
     break;


@@ -3529,11 +3532,11 @@
           if (eptr >= md->end_subject)
             {
             SCHECK_PARTIAL();
-            RRETURN(MATCH_NOMATCH);
+            MRRETURN(MATCH_NOMATCH);
             }
           GETCHARINC(d, eptr);
           if (d < 256) d = md->lcc[d];
-          if (fc == d) RRETURN(MATCH_NOMATCH);
+          if (fc == d) MRRETURN(MATCH_NOMATCH);
           }
         }
       else
@@ -3546,9 +3549,9 @@
           if (eptr >= md->end_subject)
             {
             SCHECK_PARTIAL();
-            RRETURN(MATCH_NOMATCH);
+            MRRETURN(MATCH_NOMATCH);
             }
-          if (fc == md->lcc[*eptr++]) RRETURN(MATCH_NOMATCH);
+          if (fc == md->lcc[*eptr++]) MRRETURN(MATCH_NOMATCH);
           }
         }


@@ -3565,15 +3568,15 @@
             {
             RMATCH(eptr, ecode, offset_top, md, eptrb, RM28);
             if (rrc != MATCH_NOMATCH) RRETURN(rrc);
-            if (fi >= max) RRETURN(MATCH_NOMATCH);
+            if (fi >= max) MRRETURN(MATCH_NOMATCH);
             if (eptr >= md->end_subject)
               {
               SCHECK_PARTIAL();
-              RRETURN(MATCH_NOMATCH);
+              MRRETURN(MATCH_NOMATCH);
               }
             GETCHARINC(d, eptr);
             if (d < 256) d = md->lcc[d];
-            if (fc == d) RRETURN(MATCH_NOMATCH);
+            if (fc == d) MRRETURN(MATCH_NOMATCH);
             }
           }
         else
@@ -3584,13 +3587,13 @@
             {
             RMATCH(eptr, ecode, offset_top, md, eptrb, RM29);
             if (rrc != MATCH_NOMATCH) RRETURN(rrc);
-            if (fi >= max) RRETURN(MATCH_NOMATCH);
+            if (fi >= max) MRRETURN(MATCH_NOMATCH);
             if (eptr >= md->end_subject)
               {
               SCHECK_PARTIAL();
-              RRETURN(MATCH_NOMATCH);
+              MRRETURN(MATCH_NOMATCH);
               }
-            if (fc == md->lcc[*eptr++]) RRETURN(MATCH_NOMATCH);
+            if (fc == md->lcc[*eptr++]) MRRETURN(MATCH_NOMATCH);
             }
           }
         /* Control never gets here */
@@ -3652,7 +3655,7 @@
             }
           }

-        RRETURN(MATCH_NOMATCH);
+        MRRETURN(MATCH_NOMATCH);
         }
       /* Control never gets here */
       }
@@ -3671,10 +3674,10 @@
           if (eptr >= md->end_subject)
             {
             SCHECK_PARTIAL();
-            RRETURN(MATCH_NOMATCH);
+            MRRETURN(MATCH_NOMATCH);
             }
           GETCHARINC(d, eptr);
-          if (fc == d) RRETURN(MATCH_NOMATCH);
+          if (fc == d) MRRETURN(MATCH_NOMATCH);
           }
         }
       else
@@ -3686,9 +3689,9 @@
           if (eptr >= md->end_subject)
             {
             SCHECK_PARTIAL();
-            RRETURN(MATCH_NOMATCH);
+            MRRETURN(MATCH_NOMATCH);
             }
-          if (fc == *eptr++) RRETURN(MATCH_NOMATCH);
+          if (fc == *eptr++) MRRETURN(MATCH_NOMATCH);
           }
         }

@@ -3705,14 +3708,14 @@
             {
             RMATCH(eptr, ecode, offset_top, md, eptrb, RM32);
             if (rrc != MATCH_NOMATCH) RRETURN(rrc);
-            if (fi >= max) RRETURN(MATCH_NOMATCH);
+            if (fi >= max) MRRETURN(MATCH_NOMATCH);
             if (eptr >= md->end_subject)
               {
               SCHECK_PARTIAL();
-              RRETURN(MATCH_NOMATCH);
+              MRRETURN(MATCH_NOMATCH);
               }
             GETCHARINC(d, eptr);
-            if (fc == d) RRETURN(MATCH_NOMATCH);
+            if (fc == d) MRRETURN(MATCH_NOMATCH);
             }
           }
         else
@@ -3723,13 +3726,13 @@
             {
             RMATCH(eptr, ecode, offset_top, md, eptrb, RM33);
             if (rrc != MATCH_NOMATCH) RRETURN(rrc);
-            if (fi >= max) RRETURN(MATCH_NOMATCH);
+            if (fi >= max) MRRETURN(MATCH_NOMATCH);
             if (eptr >= md->end_subject)
               {
               SCHECK_PARTIAL();
-              RRETURN(MATCH_NOMATCH);
+              MRRETURN(MATCH_NOMATCH);
               }
-            if (fc == *eptr++) RRETURN(MATCH_NOMATCH);
+            if (fc == *eptr++) MRRETURN(MATCH_NOMATCH);
             }
           }
         /* Control never gets here */
@@ -3790,7 +3793,7 @@
             }
           }

-        RRETURN(MATCH_NOMATCH);
+        MRRETURN(MATCH_NOMATCH);
         }
       }
     /* Control never gets here */
@@ -3884,13 +3887,13 @@
         switch(prop_type)
           {
           case PT_ANY:
-          if (prop_fail_result) RRETURN(MATCH_NOMATCH);
+          if (prop_fail_result) MRRETURN(MATCH_NOMATCH);
           for (i = 1; i <= min; i++)
             {
             if (eptr >= md->end_subject)
               {
               SCHECK_PARTIAL();
-              RRETURN(MATCH_NOMATCH);
+              MRRETURN(MATCH_NOMATCH);
               }
             GETCHARINCTEST(c, eptr);
             }
@@ -3903,14 +3906,14 @@
             if (eptr >= md->end_subject)
               {
               SCHECK_PARTIAL();
-              RRETURN(MATCH_NOMATCH);
+              MRRETURN(MATCH_NOMATCH);
               }
             GETCHARINCTEST(c, eptr);
             chartype = UCD_CHARTYPE(c);
             if ((chartype == ucp_Lu ||
                  chartype == ucp_Ll ||
                  chartype == ucp_Lt) == prop_fail_result)
-              RRETURN(MATCH_NOMATCH);
+              MRRETURN(MATCH_NOMATCH);
             }
           break;

@@ -3920,11 +3923,11 @@
             if (eptr >= md->end_subject)
               {
               SCHECK_PARTIAL();
-              RRETURN(MATCH_NOMATCH);
+              MRRETURN(MATCH_NOMATCH);
               }
             GETCHARINCTEST(c, eptr);
             if ((UCD_CATEGORY(c) == prop_value) == prop_fail_result)
-              RRETURN(MATCH_NOMATCH);
+              MRRETURN(MATCH_NOMATCH);
             }
           break;

@@ -3934,11 +3937,11 @@
             if (eptr >= md->end_subject)
               {
               SCHECK_PARTIAL();
-              RRETURN(MATCH_NOMATCH);
+              MRRETURN(MATCH_NOMATCH);
               }
             GETCHARINCTEST(c, eptr);
             if ((UCD_CHARTYPE(c) == prop_value) == prop_fail_result)
-              RRETURN(MATCH_NOMATCH);
+              MRRETURN(MATCH_NOMATCH);
             }
           break;

@@ -3948,11 +3951,11 @@
             if (eptr >= md->end_subject)
               {
               SCHECK_PARTIAL();
-              RRETURN(MATCH_NOMATCH);
+              MRRETURN(MATCH_NOMATCH);
               }
             GETCHARINCTEST(c, eptr);
             if ((UCD_SCRIPT(c) == prop_value) == prop_fail_result)
-              RRETURN(MATCH_NOMATCH);
+              MRRETURN(MATCH_NOMATCH);
             }
           break;

@@ -3963,12 +3966,12 @@
             if (eptr >= md->end_subject)
               {
               SCHECK_PARTIAL();
-              RRETURN(MATCH_NOMATCH);
+              MRRETURN(MATCH_NOMATCH);
               }
             GETCHARINCTEST(c, eptr);
             category = UCD_CATEGORY(c);
             if ((category == ucp_L || category == ucp_N) == prop_fail_result)
-              RRETURN(MATCH_NOMATCH);
+              MRRETURN(MATCH_NOMATCH);
             }
           break;

@@ -3978,13 +3981,13 @@
             if (eptr >= md->end_subject)
               {
               SCHECK_PARTIAL();
-              RRETURN(MATCH_NOMATCH);
+              MRRETURN(MATCH_NOMATCH);
               }
             GETCHARINCTEST(c, eptr);
             if ((UCD_CATEGORY(c) == ucp_Z || c == CHAR_HT || c == CHAR_NL ||
                  c == CHAR_FF || c == CHAR_CR)
                    == prop_fail_result)
-              RRETURN(MATCH_NOMATCH);
+              MRRETURN(MATCH_NOMATCH);
             }
           break;

@@ -3994,13 +3997,13 @@
             if (eptr >= md->end_subject)
               {
               SCHECK_PARTIAL();
-              RRETURN(MATCH_NOMATCH);
+              MRRETURN(MATCH_NOMATCH);
               }
             GETCHARINCTEST(c, eptr);
             if ((UCD_CATEGORY(c) == ucp_Z || c == CHAR_HT || c == CHAR_NL ||
                  c == CHAR_VT || c == CHAR_FF || c == CHAR_CR)
                    == prop_fail_result)
-              RRETURN(MATCH_NOMATCH);
+              MRRETURN(MATCH_NOMATCH);
             }
           break;

@@ -4011,13 +4014,13 @@
             if (eptr >= md->end_subject)
               {
               SCHECK_PARTIAL();
-              RRETURN(MATCH_NOMATCH);
+              MRRETURN(MATCH_NOMATCH);
               }
             GETCHARINCTEST(c, eptr);
             category = UCD_CATEGORY(c);
             if ((category == ucp_L || category == ucp_N || c == CHAR_UNDERSCORE)
                    == prop_fail_result)
-              RRETURN(MATCH_NOMATCH);
+              MRRETURN(MATCH_NOMATCH);
             }
           break;

@@ -4038,10 +4041,10 @@
           if (eptr >= md->end_subject)
             {
             SCHECK_PARTIAL();
-            RRETURN(MATCH_NOMATCH);
+            MRRETURN(MATCH_NOMATCH);
             }
           GETCHARINCTEST(c, eptr);
-          if (UCD_CATEGORY(c) == ucp_M) RRETURN(MATCH_NOMATCH);
+          if (UCD_CATEGORY(c) == ucp_M) MRRETURN(MATCH_NOMATCH);
           while (eptr < md->end_subject)
             {
             int len = 1;
@@ -4066,9 +4069,9 @@
           if (eptr >= md->end_subject)
             {
             SCHECK_PARTIAL();
-            RRETURN(MATCH_NOMATCH);
+            MRRETURN(MATCH_NOMATCH);
             }
-          if (IS_NEWLINE(eptr)) RRETURN(MATCH_NOMATCH);
+          if (IS_NEWLINE(eptr)) MRRETURN(MATCH_NOMATCH);
           eptr++;
           while (eptr < md->end_subject && (*eptr & 0xc0) == 0x80) eptr++;
           }
@@ -4080,7 +4083,7 @@
           if (eptr >= md->end_subject)
             {
             SCHECK_PARTIAL();
-            RRETURN(MATCH_NOMATCH);
+            MRRETURN(MATCH_NOMATCH);
             }
           eptr++;
           while (eptr < md->end_subject && (*eptr & 0xc0) == 0x80) eptr++;
@@ -4088,7 +4091,7 @@
         break;

         case OP_ANYBYTE:
-        if (eptr > md->end_subject - min) RRETURN(MATCH_NOMATCH);
+        if (eptr > md->end_subject - min) MRRETURN(MATCH_NOMATCH);
         eptr += min;
         break;

@@ -4098,12 +4101,12 @@
           if (eptr >= md->end_subject)
             {
             SCHECK_PARTIAL();
-            RRETURN(MATCH_NOMATCH);
+            MRRETURN(MATCH_NOMATCH);
             }
           GETCHARINC(c, eptr);
           switch(c)
             {
-            default: RRETURN(MATCH_NOMATCH);
+            default: MRRETURN(MATCH_NOMATCH);

             case 0x000d:
             if (eptr < md->end_subject && *eptr == 0x0a) eptr++;
@@ -4117,7 +4120,7 @@
             case 0x0085:
             case 0x2028:
             case 0x2029:
-            if (md->bsr_anycrlf) RRETURN(MATCH_NOMATCH);
+            if (md->bsr_anycrlf) MRRETURN(MATCH_NOMATCH);
             break;
             }
           }
@@ -4129,7 +4132,7 @@
           if (eptr >= md->end_subject)
             {
             SCHECK_PARTIAL();
-            RRETURN(MATCH_NOMATCH);
+            MRRETURN(MATCH_NOMATCH);
             }
           GETCHARINC(c, eptr);
           switch(c)
@@ -4154,7 +4157,7 @@
             case 0x202f:    /* NARROW NO-BREAK SPACE */
             case 0x205f:    /* MEDIUM MATHEMATICAL SPACE */
             case 0x3000:    /* IDEOGRAPHIC SPACE */
-            RRETURN(MATCH_NOMATCH);
+            MRRETURN(MATCH_NOMATCH);
             }
           }
         break;
@@ -4165,12 +4168,12 @@
           if (eptr >= md->end_subject)
             {
             SCHECK_PARTIAL();
-            RRETURN(MATCH_NOMATCH);
+            MRRETURN(MATCH_NOMATCH);
             }
           GETCHARINC(c, eptr);
           switch(c)
             {
-            default: RRETURN(MATCH_NOMATCH);
+            default: MRRETURN(MATCH_NOMATCH);
             case 0x09:      /* HT */
             case 0x20:      /* SPACE */
             case 0xa0:      /* NBSP */
@@ -4201,7 +4204,7 @@
           if (eptr >= md->end_subject)
             {
             SCHECK_PARTIAL();
-            RRETURN(MATCH_NOMATCH);
+            MRRETURN(MATCH_NOMATCH);
             }
           GETCHARINC(c, eptr);
           switch(c)
@@ -4214,7 +4217,7 @@
             case 0x85:      /* NEL */
             case 0x2028:    /* LINE SEPARATOR */
             case 0x2029:    /* PARAGRAPH SEPARATOR */
-            RRETURN(MATCH_NOMATCH);
+            MRRETURN(MATCH_NOMATCH);
             }
           }
         break;
@@ -4225,12 +4228,12 @@
           if (eptr >= md->end_subject)
             {
             SCHECK_PARTIAL();
-            RRETURN(MATCH_NOMATCH);
+            MRRETURN(MATCH_NOMATCH);
             }
           GETCHARINC(c, eptr);
           switch(c)
             {
-            default: RRETURN(MATCH_NOMATCH);
+            default: MRRETURN(MATCH_NOMATCH);
             case 0x0a:      /* LF */
             case 0x0b:      /* VT */
             case 0x0c:      /* FF */
@@ -4249,11 +4252,11 @@
           if (eptr >= md->end_subject)
             {
             SCHECK_PARTIAL();
-            RRETURN(MATCH_NOMATCH);
+            MRRETURN(MATCH_NOMATCH);
             }
           GETCHARINC(c, eptr);
           if (c < 128 && (md->ctypes[c] & ctype_digit) != 0)
-            RRETURN(MATCH_NOMATCH);
+            MRRETURN(MATCH_NOMATCH);
           }
         break;

@@ -4263,10 +4266,10 @@
           if (eptr >= md->end_subject)
             {
             SCHECK_PARTIAL();
-            RRETURN(MATCH_NOMATCH);
+            MRRETURN(MATCH_NOMATCH);
             }
           if (*eptr >= 128 || (md->ctypes[*eptr++] & ctype_digit) == 0)
-            RRETURN(MATCH_NOMATCH);
+            MRRETURN(MATCH_NOMATCH);
           /* No need to skip more bytes - we know it's a 1-byte character */
           }
         break;
@@ -4277,10 +4280,10 @@
           if (eptr >= md->end_subject)
             {
             SCHECK_PARTIAL();
-            RRETURN(MATCH_NOMATCH);
+            MRRETURN(MATCH_NOMATCH);
             }
           if (*eptr < 128 && (md->ctypes[*eptr] & ctype_space) != 0)
-            RRETURN(MATCH_NOMATCH);
+            MRRETURN(MATCH_NOMATCH);
           while (++eptr < md->end_subject && (*eptr & 0xc0) == 0x80);
           }
         break;
@@ -4291,10 +4294,10 @@
           if (eptr >= md->end_subject)
             {
             SCHECK_PARTIAL();
-            RRETURN(MATCH_NOMATCH);
+            MRRETURN(MATCH_NOMATCH);
             }
           if (*eptr >= 128 || (md->ctypes[*eptr++] & ctype_space) == 0)
-            RRETURN(MATCH_NOMATCH);
+            MRRETURN(MATCH_NOMATCH);
           /* No need to skip more bytes - we know it's a 1-byte character */
           }
         break;
@@ -4305,10 +4308,10 @@
           if (eptr >= md->end_subject)
             {
             SCHECK_PARTIAL();
-            RRETURN(MATCH_NOMATCH);
+            MRRETURN(MATCH_NOMATCH);
             }
           if (*eptr < 128 && (md->ctypes[*eptr] & ctype_word) != 0)
-            RRETURN(MATCH_NOMATCH);
+            MRRETURN(MATCH_NOMATCH);
           while (++eptr < md->end_subject && (*eptr & 0xc0) == 0x80);
           }
         break;
@@ -4319,10 +4322,10 @@
           if (eptr >= md->end_subject)
             {
             SCHECK_PARTIAL();
-            RRETURN(MATCH_NOMATCH);
+            MRRETURN(MATCH_NOMATCH);
             }
           if (*eptr >= 128 || (md->ctypes[*eptr++] & ctype_word) == 0)
-            RRETURN(MATCH_NOMATCH);
+            MRRETURN(MATCH_NOMATCH);
           /* No need to skip more bytes - we know it's a 1-byte character */
           }
         break;
@@ -4345,9 +4348,9 @@
           if (eptr >= md->end_subject)
             {
             SCHECK_PARTIAL();
-            RRETURN(MATCH_NOMATCH);
+            MRRETURN(MATCH_NOMATCH);
             }
-          if (IS_NEWLINE(eptr)) RRETURN(MATCH_NOMATCH);
+          if (IS_NEWLINE(eptr)) MRRETURN(MATCH_NOMATCH);
           eptr++;
           }
         break;
@@ -4356,7 +4359,7 @@
         if (eptr > md->end_subject - min)
           {
           SCHECK_PARTIAL();
-          RRETURN(MATCH_NOMATCH);
+          MRRETURN(MATCH_NOMATCH);
           }
         eptr += min;
         break;
@@ -4365,7 +4368,7 @@
         if (eptr > md->end_subject - min)
           {
           SCHECK_PARTIAL();
-          RRETURN(MATCH_NOMATCH);
+          MRRETURN(MATCH_NOMATCH);
           }
         eptr += min;
         break;
@@ -4376,11 +4379,11 @@
           if (eptr >= md->end_subject)
             {
             SCHECK_PARTIAL();
-            RRETURN(MATCH_NOMATCH);
+            MRRETURN(MATCH_NOMATCH);
             }
           switch(*eptr++)
             {
-            default: RRETURN(MATCH_NOMATCH);
+            default: MRRETURN(MATCH_NOMATCH);

             case 0x000d:
             if (eptr < md->end_subject && *eptr == 0x0a) eptr++;
@@ -4392,7 +4395,7 @@
             case 0x000b:
             case 0x000c:
             case 0x0085:
-            if (md->bsr_anycrlf) RRETURN(MATCH_NOMATCH);
+            if (md->bsr_anycrlf) MRRETURN(MATCH_NOMATCH);
             break;
             }
           }
@@ -4404,7 +4407,7 @@
           if (eptr >= md->end_subject)
             {
             SCHECK_PARTIAL();
-            RRETURN(MATCH_NOMATCH);
+            MRRETURN(MATCH_NOMATCH);
             }
           switch(*eptr++)
             {
@@ -4412,7 +4415,7 @@
             case 0x09:      /* HT */
             case 0x20:      /* SPACE */
             case 0xa0:      /* NBSP */
-            RRETURN(MATCH_NOMATCH);
+            MRRETURN(MATCH_NOMATCH);
             }
           }
         break;
@@ -4423,11 +4426,11 @@
           if (eptr >= md->end_subject)
             {
             SCHECK_PARTIAL();
-            RRETURN(MATCH_NOMATCH);
+            MRRETURN(MATCH_NOMATCH);
             }
           switch(*eptr++)
             {
-            default: RRETURN(MATCH_NOMATCH);
+            default: MRRETURN(MATCH_NOMATCH);
             case 0x09:      /* HT */
             case 0x20:      /* SPACE */
             case 0xa0:      /* NBSP */
@@ -4442,7 +4445,7 @@
           if (eptr >= md->end_subject)
             {
             SCHECK_PARTIAL();
-            RRETURN(MATCH_NOMATCH);
+            MRRETURN(MATCH_NOMATCH);
             }
           switch(*eptr++)
             {
@@ -4452,7 +4455,7 @@
             case 0x0c:      /* FF */
             case 0x0d:      /* CR */
             case 0x85:      /* NEL */
-            RRETURN(MATCH_NOMATCH);
+            MRRETURN(MATCH_NOMATCH);
             }
           }
         break;
@@ -4463,11 +4466,11 @@
           if (eptr >= md->end_subject)
             {
             SCHECK_PARTIAL();
-            RRETURN(MATCH_NOMATCH);
+            MRRETURN(MATCH_NOMATCH);
             }
           switch(*eptr++)
             {
-            default: RRETURN(MATCH_NOMATCH);
+            default: MRRETURN(MATCH_NOMATCH);
             case 0x0a:      /* LF */
             case 0x0b:      /* VT */
             case 0x0c:      /* FF */
@@ -4484,9 +4487,9 @@
           if (eptr >= md->end_subject)
             {
             SCHECK_PARTIAL();
-            RRETURN(MATCH_NOMATCH);
+            MRRETURN(MATCH_NOMATCH);
             }
-          if ((md->ctypes[*eptr++] & ctype_digit) != 0) RRETURN(MATCH_NOMATCH);
+          if ((md->ctypes[*eptr++] & ctype_digit) != 0) MRRETURN(MATCH_NOMATCH);
           }
         break;

@@ -4496,9 +4499,9 @@
           if (eptr >= md->end_subject)
             {
             SCHECK_PARTIAL();
-            RRETURN(MATCH_NOMATCH);
+            MRRETURN(MATCH_NOMATCH);
             }
-          if ((md->ctypes[*eptr++] & ctype_digit) == 0) RRETURN(MATCH_NOMATCH);
+          if ((md->ctypes[*eptr++] & ctype_digit) == 0) MRRETURN(MATCH_NOMATCH);
           }
         break;

@@ -4508,9 +4511,9 @@
           if (eptr >= md->end_subject)
             {
             SCHECK_PARTIAL();
-            RRETURN(MATCH_NOMATCH);
+            MRRETURN(MATCH_NOMATCH);
             }
-          if ((md->ctypes[*eptr++] & ctype_space) != 0) RRETURN(MATCH_NOMATCH);
+          if ((md->ctypes[*eptr++] & ctype_space) != 0) MRRETURN(MATCH_NOMATCH);
           }
         break;

@@ -4520,9 +4523,9 @@
           if (eptr >= md->end_subject)
             {
             SCHECK_PARTIAL();
-            RRETURN(MATCH_NOMATCH);
+            MRRETURN(MATCH_NOMATCH);
             }
-          if ((md->ctypes[*eptr++] & ctype_space) == 0) RRETURN(MATCH_NOMATCH);
+          if ((md->ctypes[*eptr++] & ctype_space) == 0) MRRETURN(MATCH_NOMATCH);
           }
         break;

@@ -4532,10 +4535,10 @@
           if (eptr >= md->end_subject)
             {
             SCHECK_PARTIAL();
-            RRETURN(MATCH_NOMATCH);
+            MRRETURN(MATCH_NOMATCH);
             }
           if ((md->ctypes[*eptr++] & ctype_word) != 0)
-            RRETURN(MATCH_NOMATCH);
+            MRRETURN(MATCH_NOMATCH);
           }
         break;

@@ -4545,10 +4548,10 @@
           if (eptr >= md->end_subject)
             {
             SCHECK_PARTIAL();
-            RRETURN(MATCH_NOMATCH);
+            MRRETURN(MATCH_NOMATCH);
             }
           if ((md->ctypes[*eptr++] & ctype_word) == 0)
-            RRETURN(MATCH_NOMATCH);
+            MRRETURN(MATCH_NOMATCH);
           }
         break;

@@ -4577,14 +4580,14 @@
             {
             RMATCH(eptr, ecode, offset_top, md, eptrb, RM36);
             if (rrc != MATCH_NOMATCH) RRETURN(rrc);
-            if (fi >= max) RRETURN(MATCH_NOMATCH);
+            if (fi >= max) MRRETURN(MATCH_NOMATCH);
             if (eptr >= md->end_subject)
               {
               SCHECK_PARTIAL();
-              RRETURN(MATCH_NOMATCH);
+              MRRETURN(MATCH_NOMATCH);
               }
             GETCHARINCTEST(c, eptr);
-            if (prop_fail_result) RRETURN(MATCH_NOMATCH);
+            if (prop_fail_result) MRRETURN(MATCH_NOMATCH);
             }
           /* Control never gets here */

@@ -4594,18 +4597,18 @@
             int chartype;
             RMATCH(eptr, ecode, offset_top, md, eptrb, RM37);
             if (rrc != MATCH_NOMATCH) RRETURN(rrc);
-            if (fi >= max) RRETURN(MATCH_NOMATCH);
+            if (fi >= max) MRRETURN(MATCH_NOMATCH);
             if (eptr >= md->end_subject)
               {
               SCHECK_PARTIAL();
-              RRETURN(MATCH_NOMATCH);
+              MRRETURN(MATCH_NOMATCH);
               }
             GETCHARINCTEST(c, eptr);
             chartype = UCD_CHARTYPE(c);
             if ((chartype == ucp_Lu ||
                  chartype == ucp_Ll ||
                  chartype == ucp_Lt) == prop_fail_result)
-              RRETURN(MATCH_NOMATCH);
+              MRRETURN(MATCH_NOMATCH);
             }
           /* Control never gets here */

@@ -4614,15 +4617,15 @@
             {
             RMATCH(eptr, ecode, offset_top, md, eptrb, RM38);
             if (rrc != MATCH_NOMATCH) RRETURN(rrc);
-            if (fi >= max) RRETURN(MATCH_NOMATCH);
+            if (fi >= max) MRRETURN(MATCH_NOMATCH);
             if (eptr >= md->end_subject)
               {
               SCHECK_PARTIAL();
-              RRETURN(MATCH_NOMATCH);
+              MRRETURN(MATCH_NOMATCH);
               }
             GETCHARINCTEST(c, eptr);
             if ((UCD_CATEGORY(c) == prop_value) == prop_fail_result)
-              RRETURN(MATCH_NOMATCH);
+              MRRETURN(MATCH_NOMATCH);
             }
           /* Control never gets here */

@@ -4631,15 +4634,15 @@
             {
             RMATCH(eptr, ecode, offset_top, md, eptrb, RM39);
             if (rrc != MATCH_NOMATCH) RRETURN(rrc);
-            if (fi >= max) RRETURN(MATCH_NOMATCH);
+            if (fi >= max) MRRETURN(MATCH_NOMATCH);
             if (eptr >= md->end_subject)
               {
               SCHECK_PARTIAL();
-              RRETURN(MATCH_NOMATCH);
+              MRRETURN(MATCH_NOMATCH);
               }
             GETCHARINCTEST(c, eptr);
             if ((UCD_CHARTYPE(c) == prop_value) == prop_fail_result)
-              RRETURN(MATCH_NOMATCH);
+              MRRETURN(MATCH_NOMATCH);
             }
           /* Control never gets here */

@@ -4648,15 +4651,15 @@
             {
             RMATCH(eptr, ecode, offset_top, md, eptrb, RM40);
             if (rrc != MATCH_NOMATCH) RRETURN(rrc);
-            if (fi >= max) RRETURN(MATCH_NOMATCH);
+            if (fi >= max) MRRETURN(MATCH_NOMATCH);
             if (eptr >= md->end_subject)
               {
               SCHECK_PARTIAL();
-              RRETURN(MATCH_NOMATCH);
+              MRRETURN(MATCH_NOMATCH);
               }
             GETCHARINCTEST(c, eptr);
             if ((UCD_SCRIPT(c) == prop_value) == prop_fail_result)
-              RRETURN(MATCH_NOMATCH);
+              MRRETURN(MATCH_NOMATCH);
             }
           /* Control never gets here */

@@ -4666,16 +4669,16 @@
             int category;
             RMATCH(eptr, ecode, offset_top, md, eptrb, RM59);
             if (rrc != MATCH_NOMATCH) RRETURN(rrc);
-            if (fi >= max) RRETURN(MATCH_NOMATCH);
+            if (fi >= max) MRRETURN(MATCH_NOMATCH);
             if (eptr >= md->end_subject)
               {
               SCHECK_PARTIAL();
-              RRETURN(MATCH_NOMATCH);
+              MRRETURN(MATCH_NOMATCH);
               }
             GETCHARINCTEST(c, eptr);
             category = UCD_CATEGORY(c);
             if ((category == ucp_L || category == ucp_N) == prop_fail_result)
-              RRETURN(MATCH_NOMATCH);
+              MRRETURN(MATCH_NOMATCH);
             }
           /* Control never gets here */

@@ -4684,17 +4687,17 @@
             {
             RMATCH(eptr, ecode, offset_top, md, eptrb, RM60);
             if (rrc != MATCH_NOMATCH) RRETURN(rrc);
-            if (fi >= max) RRETURN(MATCH_NOMATCH);
+            if (fi >= max) MRRETURN(MATCH_NOMATCH);
             if (eptr >= md->end_subject)
               {
               SCHECK_PARTIAL();
-              RRETURN(MATCH_NOMATCH);
+              MRRETURN(MATCH_NOMATCH);
               }
             GETCHARINCTEST(c, eptr);
             if ((UCD_CATEGORY(c) == ucp_Z || c == CHAR_HT || c == CHAR_NL ||
                  c == CHAR_FF || c == CHAR_CR)
                    == prop_fail_result)
-              RRETURN(MATCH_NOMATCH);
+              MRRETURN(MATCH_NOMATCH);
             }
           /* Control never gets here */

@@ -4703,17 +4706,17 @@
             {
             RMATCH(eptr, ecode, offset_top, md, eptrb, RM61);
             if (rrc != MATCH_NOMATCH) RRETURN(rrc);
-            if (fi >= max) RRETURN(MATCH_NOMATCH);
+            if (fi >= max) MRRETURN(MATCH_NOMATCH);
             if (eptr >= md->end_subject)
               {
               SCHECK_PARTIAL();
-              RRETURN(MATCH_NOMATCH);
+              MRRETURN(MATCH_NOMATCH);
               }
             GETCHARINCTEST(c, eptr);
             if ((UCD_CATEGORY(c) == ucp_Z || c == CHAR_HT || c == CHAR_NL ||
                  c == CHAR_VT || c == CHAR_FF || c == CHAR_CR)
                    == prop_fail_result)
-              RRETURN(MATCH_NOMATCH);
+              MRRETURN(MATCH_NOMATCH);
             }
           /* Control never gets here */

@@ -4723,11 +4726,11 @@
             int category;
             RMATCH(eptr, ecode, offset_top, md, eptrb, RM62);
             if (rrc != MATCH_NOMATCH) RRETURN(rrc);
-            if (fi >= max) RRETURN(MATCH_NOMATCH);
+            if (fi >= max) MRRETURN(MATCH_NOMATCH);
             if (eptr >= md->end_subject)
               {
               SCHECK_PARTIAL();
-              RRETURN(MATCH_NOMATCH);
+              MRRETURN(MATCH_NOMATCH);
               }
             GETCHARINCTEST(c, eptr);
             category = UCD_CATEGORY(c);
@@ -4735,7 +4738,7 @@
                  category == ucp_N ||
                  c == CHAR_UNDERSCORE)
                    == prop_fail_result)
-              RRETURN(MATCH_NOMATCH);
+              MRRETURN(MATCH_NOMATCH);
             }
           /* Control never gets here */

@@ -4755,14 +4758,14 @@
           {
           RMATCH(eptr, ecode, offset_top, md, eptrb, RM41);
           if (rrc != MATCH_NOMATCH) RRETURN(rrc);
-          if (fi >= max) RRETURN(MATCH_NOMATCH);
+          if (fi >= max) MRRETURN(MATCH_NOMATCH);
           if (eptr >= md->end_subject)
             {
             SCHECK_PARTIAL();
-            RRETURN(MATCH_NOMATCH);
+            MRRETURN(MATCH_NOMATCH);
             }
           GETCHARINCTEST(c, eptr);
-          if (UCD_CATEGORY(c) == ucp_M) RRETURN(MATCH_NOMATCH);
+          if (UCD_CATEGORY(c) == ucp_M) MRRETURN(MATCH_NOMATCH);
           while (eptr < md->end_subject)
             {
             int len = 1;
@@ -4783,14 +4786,14 @@
           {
           RMATCH(eptr, ecode, offset_top, md, eptrb, RM42);
           if (rrc != MATCH_NOMATCH) RRETURN(rrc);
-          if (fi >= max) RRETURN(MATCH_NOMATCH);
+          if (fi >= max) MRRETURN(MATCH_NOMATCH);
           if (eptr >= md->end_subject)
             {
             SCHECK_PARTIAL();
-            RRETURN(MATCH_NOMATCH);
+            MRRETURN(MATCH_NOMATCH);
             }
           if (ctype == OP_ANY && IS_NEWLINE(eptr))
-            RRETURN(MATCH_NOMATCH);
+            MRRETURN(MATCH_NOMATCH);
           GETCHARINC(c, eptr);
           switch(ctype)
             {
@@ -4802,7 +4805,7 @@
             case OP_ANYNL:
             switch(c)
               {
-              default: RRETURN(MATCH_NOMATCH);
+              default: MRRETURN(MATCH_NOMATCH);
               case 0x000d:
               if (eptr < md->end_subject && *eptr == 0x0a) eptr++;
               break;
@@ -4814,7 +4817,7 @@
               case 0x0085:
               case 0x2028:
               case 0x2029:
-              if (md->bsr_anycrlf) RRETURN(MATCH_NOMATCH);
+              if (md->bsr_anycrlf) MRRETURN(MATCH_NOMATCH);
               break;
               }
             break;
@@ -4842,14 +4845,14 @@
               case 0x202f:    /* NARROW NO-BREAK SPACE */
               case 0x205f:    /* MEDIUM MATHEMATICAL SPACE */
               case 0x3000:    /* IDEOGRAPHIC SPACE */
-              RRETURN(MATCH_NOMATCH);
+              MRRETURN(MATCH_NOMATCH);
               }
             break;

             case OP_HSPACE:
             switch(c)
               {
-              default: RRETURN(MATCH_NOMATCH);
+              default: MRRETURN(MATCH_NOMATCH);
               case 0x09:      /* HT */
               case 0x20:      /* SPACE */
               case 0xa0:      /* NBSP */
@@ -4884,14 +4887,14 @@
               case 0x85:      /* NEL */
               case 0x2028:    /* LINE SEPARATOR */
               case 0x2029:    /* PARAGRAPH SEPARATOR */
-              RRETURN(MATCH_NOMATCH);
+              MRRETURN(MATCH_NOMATCH);
               }
             break;

             case OP_VSPACE:
             switch(c)
               {
-              default: RRETURN(MATCH_NOMATCH);
+              default: MRRETURN(MATCH_NOMATCH);
               case 0x0a:      /* LF */
               case 0x0b:      /* VT */
               case 0x0c:      /* FF */
@@ -4905,32 +4908,32 @@

             case OP_NOT_DIGIT:
             if (c < 256 && (md->ctypes[c] & ctype_digit) != 0)
-              RRETURN(MATCH_NOMATCH);
+              MRRETURN(MATCH_NOMATCH);
             break;

             case OP_DIGIT:
             if (c >= 256 || (md->ctypes[c] & ctype_digit) == 0)
-              RRETURN(MATCH_NOMATCH);
+              MRRETURN(MATCH_NOMATCH);
             break;

             case OP_NOT_WHITESPACE:
             if (c < 256 && (md->ctypes[c] & ctype_space) != 0)
-              RRETURN(MATCH_NOMATCH);
+              MRRETURN(MATCH_NOMATCH);
             break;

             case OP_WHITESPACE:
             if  (c >= 256 || (md->ctypes[c] & ctype_space) == 0)
-              RRETURN(MATCH_NOMATCH);
+              MRRETURN(MATCH_NOMATCH);
             break;

             case OP_NOT_WORDCHAR:
             if (c < 256 && (md->ctypes[c] & ctype_word) != 0)
-              RRETURN(MATCH_NOMATCH);
+              MRRETURN(MATCH_NOMATCH);
             break;

             case OP_WORDCHAR:
             if (c >= 256 || (md->ctypes[c] & ctype_word) == 0)
-              RRETURN(MATCH_NOMATCH);
+              MRRETURN(MATCH_NOMATCH);
             break;

             default:
@@ -4946,14 +4949,14 @@
           {
           RMATCH(eptr, ecode, offset_top, md, eptrb, RM43);
           if (rrc != MATCH_NOMATCH) RRETURN(rrc);
-          if (fi >= max) RRETURN(MATCH_NOMATCH);
+          if (fi >= max) MRRETURN(MATCH_NOMATCH);
           if (eptr >= md->end_subject)
             {
             SCHECK_PARTIAL();
-            RRETURN(MATCH_NOMATCH);
+            MRRETURN(MATCH_NOMATCH);
             }
           if (ctype == OP_ANY && IS_NEWLINE(eptr))
-            RRETURN(MATCH_NOMATCH);
+            MRRETURN(MATCH_NOMATCH);
           c = *eptr++;
           switch(ctype)
             {
@@ -4965,7 +4968,7 @@
             case OP_ANYNL:
             switch(c)
               {
-              default: RRETURN(MATCH_NOMATCH);
+              default: MRRETURN(MATCH_NOMATCH);
               case 0x000d:
               if (eptr < md->end_subject && *eptr == 0x0a) eptr++;
               break;
@@ -4976,7 +4979,7 @@
               case 0x000b:
               case 0x000c:
               case 0x0085:
-              if (md->bsr_anycrlf) RRETURN(MATCH_NOMATCH);
+              if (md->bsr_anycrlf) MRRETURN(MATCH_NOMATCH);
               break;
               }
             break;
@@ -4988,14 +4991,14 @@
               case 0x09:      /* HT */
               case 0x20:      /* SPACE */
               case 0xa0:      /* NBSP */
-              RRETURN(MATCH_NOMATCH);
+              MRRETURN(MATCH_NOMATCH);
               }
             break;


             case OP_HSPACE:
             switch(c)
               {
-              default: RRETURN(MATCH_NOMATCH);
+              default: MRRETURN(MATCH_NOMATCH);
               case 0x09:      /* HT */
               case 0x20:      /* SPACE */
               case 0xa0:      /* NBSP */
@@ -5012,14 +5015,14 @@
               case 0x0c:      /* FF */
               case 0x0d:      /* CR */
               case 0x85:      /* NEL */
-              RRETURN(MATCH_NOMATCH);
+              MRRETURN(MATCH_NOMATCH);
               }
             break;


             case OP_VSPACE:
             switch(c)
               {
-              default: RRETURN(MATCH_NOMATCH);
+              default: MRRETURN(MATCH_NOMATCH);
               case 0x0a:      /* LF */
               case 0x0b:      /* VT */
               case 0x0c:      /* FF */
@@ -5030,27 +5033,27 @@
             break;


             case OP_NOT_DIGIT:
-            if ((md->ctypes[c] & ctype_digit) != 0) RRETURN(MATCH_NOMATCH);
+            if ((md->ctypes[c] & ctype_digit) != 0) MRRETURN(MATCH_NOMATCH);
             break;


             case OP_DIGIT:
-            if ((md->ctypes[c] & ctype_digit) == 0) RRETURN(MATCH_NOMATCH);
+            if ((md->ctypes[c] & ctype_digit) == 0) MRRETURN(MATCH_NOMATCH);
             break;


             case OP_NOT_WHITESPACE:
-            if ((md->ctypes[c] & ctype_space) != 0) RRETURN(MATCH_NOMATCH);
+            if ((md->ctypes[c] & ctype_space) != 0) MRRETURN(MATCH_NOMATCH);
             break;


             case OP_WHITESPACE:
-            if  ((md->ctypes[c] & ctype_space) == 0) RRETURN(MATCH_NOMATCH);
+            if  ((md->ctypes[c] & ctype_space) == 0) MRRETURN(MATCH_NOMATCH);
             break;


             case OP_NOT_WORDCHAR:
-            if ((md->ctypes[c] & ctype_word) != 0) RRETURN(MATCH_NOMATCH);
+            if ((md->ctypes[c] & ctype_word) != 0) MRRETURN(MATCH_NOMATCH);
             break;


             case OP_WORDCHAR:
-            if ((md->ctypes[c] & ctype_word) == 0) RRETURN(MATCH_NOMATCH);
+            if ((md->ctypes[c] & ctype_word) == 0) MRRETURN(MATCH_NOMATCH);
             break;


             default:
@@ -5792,7 +5795,7 @@


       /* Get here if we can't make it match with any permitted repetitions */


-      RRETURN(MATCH_NOMATCH);
+      MRRETURN(MATCH_NOMATCH);
       }
     /* Control never gets here */


@@ -6087,7 +6090,6 @@
 md->endonly = (re->options & PCRE_DOLLAR_ENDONLY) != 0;
 md->use_ucp = (re->options & PCRE_UCP) != 0;
 md->jscript_compat = (re->options & PCRE_JAVASCRIPT_COMPAT) != 0;
-md->ignore_skip_arg = FALSE;

 /* Some options are unpacked into BOOL variables in the hope that testing
 them will be faster than individual option bits. */
@@ -6098,7 +6100,7 @@
 md->notempty_atstart = (options & PCRE_NOTEMPTY_ATSTART) != 0;

 md->hitend = FALSE;
-md->mark = md->nomatch_mark = NULL;     /* In case never set */
+md->mark = NULL;                        /* In case never set */

 md->recursive = NULL;                   /* No recursion at top level */
 md->hasthen = (re->flags & PCRE_HASTHEN) != 0;
@@ -6450,23 +6452,11 @@
   md->match_call_count = 0;
   md->match_function_type = 0;
   md->end_offset_top = 0;
-  rc = match(start_match, md->start_code, start_match, 2, md, NULL, 0);
+  rc = match(start_match, md->start_code, start_match, NULL, 2, md, NULL, 0);
   if (md->hitend && start_partial == NULL) start_partial = md->start_used_ptr;


   switch(rc)
     {
-    /* If MATCH_SKIP_ARG reaches this level it means that a MARK that matched
-    the SKIP's arg was not found. In this circumstance, Perl ignores the SKIP
-    entirely. The only way we can do that is to re-do the match at the same
-    point, with a flag to force SKIP with an argument to be ignored. Just
-    treating this case as NOMATCH does not work because it does not check other
-    alternatives in patterns such as A(*SKIP:A)B|AC when the subject is AC. */
-
-    case MATCH_SKIP_ARG:
-    new_start_match = start_match;
-    md->ignore_skip_arg = TRUE;
-    break;
-
     /* SKIP passes back the next starting point explicitly, but if it is the
     same as the match we have just done, treat it as NOMATCH. */


@@ -6478,13 +6468,18 @@
       }
     /* Fall through */


+    /* If MATCH_SKIP_ARG reaches this level it means that a MARK that matched
+    the SKIP's arg was not found. We also treat this as NOMATCH. */
+
+    case MATCH_SKIP_ARG:
+    /* Fall through */
+
     /* NOMATCH and PRUNE advance by one character. THEN at this level acts
-    exactly like PRUNE. Unset the ignore SKIP-with-argument flag. */
+    exactly like PRUNE. */


     case MATCH_NOMATCH:
     case MATCH_PRUNE:
     case MATCH_THEN:
-    md->ignore_skip_arg = FALSE;
     new_start_match = start_match + 1;
 #ifdef SUPPORT_UTF8
     if (utf8)
@@ -6613,12 +6608,8 @@
     offsets[1] = (int)(md->end_match_ptr - md->start_subject);
     }

-  /* Return MARK data if requested */
-
-  if (extra_data != NULL && (extra_data->flags & PCRE_EXTRA_MARK) != 0)
-    *(extra_data->mark) = (unsigned char *)(md->mark);
   DPRINTF((">>>> returning %d\n", rc));
-  return rc;
+  goto RETURN_MARK;
   }

 /* Control gets here if there has been an error, or if the overall match
@@ -6662,8 +6653,10 @@

 /* Return the MARK data if it has been requested. */

+RETURN_MARK:
+
 if (extra_data != NULL && (extra_data->flags & PCRE_EXTRA_MARK) != 0)
-  *(extra_data->mark) = (unsigned char *)(md->nomatch_mark);
+  *(extra_data->mark) = (unsigned char *)(md->mark);
 return rc;
 }


Modified: code/trunk/pcre_fullinfo.c
===================================================================
--- code/trunk/pcre_fullinfo.c    2011-12-28 15:53:12 UTC (rev 834)
+++ code/trunk/pcre_fullinfo.c    2011-12-28 16:10:09 UTC (rev 835)
@@ -100,19 +100,6 @@
   *((size_t *)where) = (study == NULL)? 0 : study->size;
   break;


-  case PCRE_INFO_JITSIZE:
-#ifdef SUPPORT_JIT
-  *((size_t *)where) =
-      (extra_data != NULL &&
-      (extra_data->flags & PCRE_EXTRA_EXECUTABLE_JIT) != 0 &&
-      extra_data->executable_jit != NULL)?
-    _pcre_jit_get_size(extra_data->executable_jit) : 0;
-#else
-  *((size_t *)where) = 0;
-#endif
-
-  break;
-
   case PCRE_INFO_CAPTURECOUNT:
   *((int *)where) = re->top_bracket;
   break;


Modified: code/trunk/pcre_internal.h
===================================================================
--- code/trunk/pcre_internal.h    2011-12-28 15:53:12 UTC (rev 834)
+++ code/trunk/pcre_internal.h    2011-12-28 16:10:09 UTC (rev 835)
@@ -1665,7 +1665,7 @@
        ERR40, ERR41, ERR42, ERR43, ERR44, ERR45, ERR46, ERR47, ERR48, ERR49,
        ERR50, ERR51, ERR52, ERR53, ERR54, ERR55, ERR56, ERR57, ERR58, ERR59,
        ERR60, ERR61, ERR62, ERR63, ERR64, ERR65, ERR66, ERR67, ERR68, ERR69,
-       ERR70, ERR71, ERR72, ERRCOUNT };
+       ERR70, ERRCOUNT };


 /* The real format of the start of the pcre block; the index of names and the
 code vector run on as long as necessary after the end. We store an explicit
@@ -1741,7 +1741,6 @@
   uschar *name_table;           /* The name/number table */
   int  names_found;             /* Number of entries so far */
   int  name_entry_size;         /* Size of each entry */
-  int  workspace_size;          /* Size of workspace */
   int  bracount;                /* Count of capturing parens as we compile */
   int  final_bracount;          /* Saved value after first pass */
   int  top_backref;             /* Maximum back reference */
@@ -1825,7 +1824,6 @@
   BOOL   hitend;                /* Hit the end of the subject at some point */
   BOOL   bsr_anycrlf;           /* \R is just any CRLF, not full Unicode */
   BOOL   hasthen;               /* Pattern contains (*THEN) */
-  BOOL   ignore_skip_arg;       /* For re-run when SKIP name not found */
   const  uschar *start_code;    /* For use when recursing */
   USPTR  start_subject;         /* Start of the subject string */
   USPTR  end_subject;           /* End of the subject string */
@@ -1841,8 +1839,7 @@
   int    eptrn;                 /* Next free eptrblock */
   recursion_info *recursive;    /* Linked list of recursion data */
   void  *callout_data;          /* To pass back to callouts */
-  const  uschar *mark;          /* Mark pointer to pass back on success */
-  const  uschar *nomatch_mark;  /* Mark pointer to pass back on failure */
+  const  uschar *mark;          /* Mark pointer to pass back */
   const  uschar *once_target;   /* Where to back up to for atomic groups */
 } match_data;


@@ -1953,7 +1950,6 @@
 extern int           _pcre_jit_exec(const real_pcre *, void *, PCRE_SPTR,
                         int, int, int, int, int *, int);
 extern void          _pcre_jit_free(void *);
-extern int           _pcre_jit_get_size(void *);
 #endif


/* Unicode character database (UCD) */

Modified: code/trunk/pcre_jit_compile.c
===================================================================
--- code/trunk/pcre_jit_compile.c    2011-12-28 15:53:12 UTC (rev 834)
+++ code/trunk/pcre_jit_compile.c    2011-12-28 16:10:09 UTC (rev 835)
@@ -166,7 +166,6 @@
   void *executable_func;
   pcre_jit_callback callback;
   void *userdata;
-  sljit_uw executable_size;
 } executable_function;


 typedef struct jump_list {
@@ -5698,8 +5697,7 @@
     {
     SLJIT_ASSERT(opcode == OP_COND || opcode == OP_SCOND);
     assert = CURRENT_AS(bracket_fallback)->u.assert;
-    if ((ccbegin[1 + LINK_SIZE] == OP_ASSERT_NOT || ccbegin[1 + LINK_SIZE] == OP_ASSERTBACK_NOT) && assert->framesize >= 0)
-
+    if (assert->framesize >= 0 && (ccbegin[1 + LINK_SIZE] == OP_ASSERT_NOT || ccbegin[1 + LINK_SIZE] == OP_ASSERTBACK_NOT))
       {
       OP1(SLJIT_MOV, STACK_TOP, 0, SLJIT_MEM1(SLJIT_LOCALS_REG), assert->localptr);
       add_jump(compiler, &common->revertframes, JUMP(SLJIT_FAST_CALL));
@@ -6101,7 +6099,6 @@
 uschar *ccend;
 executable_function *function;
 void *executable_func;
-sljit_uw executable_size;
 struct sljit_label *leave;
 struct sljit_label *mainloop = NULL;
 struct sljit_label *empty_match_found;
@@ -6431,7 +6428,6 @@


 SLJIT_FREE(common->localptrs);
 executable_func = sljit_generate_code(compiler);
-executable_size = sljit_get_generated_code_size(compiler);
 sljit_free_compiler(compiler);
 if (executable_func == NULL)
   return;
@@ -6446,7 +6442,6 @@
}

 function->executable_func = executable_func;
-function->executable_size = executable_size;
 function->callback = NULL;
 function->userdata = NULL;
 extra->executable_jit = function;
@@ -6535,12 +6530,6 @@
SLJIT_FREE(function);
}

-int
-_pcre_jit_get_size(void *executable_func)
-{
-return ((executable_function*)executable_func)->executable_size;
-}
-
PCRE_EXP_DECL pcre_jit_stack *
pcre_jit_stack_alloc(int startsize, int maxsize)
{

Modified: code/trunk/pcre_study.c
===================================================================
--- code/trunk/pcre_study.c    2011-12-28 15:53:12 UTC (rev 834)
+++ code/trunk/pcre_study.c    2011-12-28 16:10:09 UTC (rev 835)
@@ -286,8 +286,8 @@
     cc++;
     break;


-    /* The single-byte matcher means we can't proceed in UTF-8 mode. (In
-    non-UTF-8 mode \C will actually be turned into OP_ALLANY, so won't ever
+    /* The single-byte matcher means we can't proceed in UTF-8 mode. (In 
+    non-UTF-8 mode \C will actually be turned into OP_ALLANY, so won't ever 
     appear, but leave the code, just in case.) */


     case OP_ANYBYTE:
@@ -1322,16 +1322,11 @@
   study->size = sizeof(pcre_study_data);
   study->flags = 0;


-  /* Set the start bits always, to avoid unset memory errors if the
-  study data is written to a file, but set the flag only if any of the bits
-  are set, to save time looking when none are. */
-
   if (bits_set)
     {
     study->flags |= PCRE_STUDY_MAPPED;
     memcpy(study->start_bits, start_bits, sizeof(start_bits));
     }
-  else memset(study->start_bits, 0, 32 * sizeof(uschar));


/* Always set the minlength value in the block, because the JIT compiler
makes use of it. However, don't set the bit unless the length is greater than

Modified: code/trunk/pcre_valid_utf8.c
===================================================================
--- code/trunk/pcre_valid_utf8.c    2011-12-28 15:53:12 UTC (rev 834)
+++ code/trunk/pcre_valid_utf8.c    2011-12-28 16:10:09 UTC (rev 835)
@@ -111,7 +111,7 @@
 if (length < 0)
   {
   for (p = string; *p != 0; p++);
-  length = (int)(p - string);
+  length = p - string;
   }


for (p = string; length-- > 0; p++)
@@ -123,20 +123,20 @@

   if (c < 0xc0)                         /* Isolated 10xx xxxx byte */
     {
-    *erroroffset = (int)(p - string);
+    *erroroffset = p - string;
     return PCRE_UTF8_ERR20;
     }


   if (c >= 0xfe)                        /* Invalid 0xfe or 0xff bytes */
     {
-    *erroroffset = (int)(p - string);
+    *erroroffset = p - string;
     return PCRE_UTF8_ERR21;
     }


   ab = _pcre_utf8_table4[c & 0x3f];     /* Number of additional bytes */
   if (length < ab)
     {
-    *erroroffset = (int)(p - string);          /* Missing bytes */
+    *erroroffset = p - string;          /* Missing bytes */
     return ab - length;                 /* Codes ERR1 to ERR5 */
     }
   length -= ab;                         /* Length remaining */
@@ -145,7 +145,7 @@


   if (((d = *(++p)) & 0xc0) != 0x80)
     {
-    *erroroffset = (int)(p - string) - 1;
+    *erroroffset = p - string - 1;
     return PCRE_UTF8_ERR6;
     }


@@ -160,7 +160,7 @@

     case 1: if ((c & 0x3e) == 0)
       {
-      *erroroffset = (int)(p - string) - 1;
+      *erroroffset = p - string - 1;
       return PCRE_UTF8_ERR15;
       }
     break;
@@ -172,17 +172,17 @@
     case 2:
     if ((*(++p) & 0xc0) != 0x80)     /* Third byte */
       {
-      *erroroffset = (int)(p - string) - 2;
+      *erroroffset = p - string - 2;
       return PCRE_UTF8_ERR7;
       }
     if (c == 0xe0 && (d & 0x20) == 0)
       {
-      *erroroffset = (int)(p - string) - 2;
+      *erroroffset = p - string - 2;
       return PCRE_UTF8_ERR16;
       }
     if (c == 0xed && d >= 0xa0)
       {
-      *erroroffset = (int)(p - string) - 2;
+      *erroroffset = p - string - 2;
       return PCRE_UTF8_ERR14;
       }
     break;
@@ -194,22 +194,22 @@
     case 3:
     if ((*(++p) & 0xc0) != 0x80)     /* Third byte */
       {
-      *erroroffset = (int)(p - string) - 2;
+      *erroroffset = p - string - 2;
       return PCRE_UTF8_ERR7;
       }
     if ((*(++p) & 0xc0) != 0x80)     /* Fourth byte */
       {
-      *erroroffset = (int)(p - string) - 3;
+      *erroroffset = p - string - 3;
       return PCRE_UTF8_ERR8;
       }
     if (c == 0xf0 && (d & 0x30) == 0)
       {
-      *erroroffset = (int)(p - string) - 3;
+      *erroroffset = p - string - 3;
       return PCRE_UTF8_ERR17;
       }
     if (c > 0xf4 || (c == 0xf4 && d > 0x8f))
       {
-      *erroroffset = (int)(p - string) - 3;
+      *erroroffset = p - string - 3;
       return PCRE_UTF8_ERR13;
       }
     break;
@@ -225,22 +225,22 @@
     case 4:
     if ((*(++p) & 0xc0) != 0x80)     /* Third byte */
       {
-      *erroroffset = (int)(p - string) - 2;
+      *erroroffset = p - string - 2;
       return PCRE_UTF8_ERR7;
       }
     if ((*(++p) & 0xc0) != 0x80)     /* Fourth byte */
       {
-      *erroroffset = (int)(p - string) - 3;
+      *erroroffset = p - string - 3;
       return PCRE_UTF8_ERR8;
       }
     if ((*(++p) & 0xc0) != 0x80)     /* Fifth byte */
       {
-      *erroroffset = (int)(p - string) - 4;
+      *erroroffset = p - string - 4;
       return PCRE_UTF8_ERR9;
       }
     if (c == 0xf8 && (d & 0x38) == 0)
       {
-      *erroroffset = (int)(p - string) - 4;
+      *erroroffset = p - string - 4;
       return PCRE_UTF8_ERR18;
       }
     break;
@@ -251,27 +251,27 @@
     case 5:
     if ((*(++p) & 0xc0) != 0x80)     /* Third byte */
       {
-      *erroroffset = (int)(p - string) - 2;
+      *erroroffset = p - string - 2;
       return PCRE_UTF8_ERR7;
       }
     if ((*(++p) & 0xc0) != 0x80)     /* Fourth byte */
       {
-      *erroroffset = (int)(p - string) - 3;
+      *erroroffset = p - string - 3;
       return PCRE_UTF8_ERR8;
       }
     if ((*(++p) & 0xc0) != 0x80)     /* Fifth byte */
       {
-      *erroroffset = (int)(p - string) - 4;
+      *erroroffset = p - string - 4;
       return PCRE_UTF8_ERR9;
       }
     if ((*(++p) & 0xc0) != 0x80)     /* Sixth byte */
       {
-      *erroroffset = (int)(p - string) - 5;
+      *erroroffset = p - string - 5;
       return PCRE_UTF8_ERR10;
       }
     if (c == 0xfc && (d & 0x3c) == 0)
       {
-      *erroroffset = (int)(p - string) - 5;
+      *erroroffset = p - string - 5;
       return PCRE_UTF8_ERR19;
       }
     break;
@@ -283,7 +283,7 @@


   if (ab > 3)
     {
-    *erroroffset = (int)(p - string) - ab;
+    *erroroffset = p - string - ab;
     return (ab == 4)? PCRE_UTF8_ERR11 : PCRE_UTF8_ERR12;
     }
   }


Modified: code/trunk/pcregrep.c
===================================================================
--- code/trunk/pcregrep.c    2011-12-28 15:53:12 UTC (rev 834)
+++ code/trunk/pcregrep.c    2011-12-28 16:10:09 UTC (rev 835)
@@ -1410,7 +1410,7 @@
         and its line-ending characters (if they matched the pattern), so there
         may be no more to print. */


-        plength = (int)((linelength + endlinelength) - startoffset);
+        plength = (linelength + endlinelength) - startoffset;
         if (plength > 0) FWRITE(ptr + startoffset, 1, plength, stdout);
         }


@@ -1462,7 +1462,7 @@

   if (input_line_buffered && bufflength < (size_t)bufsize)
     {
-    int add = read_one_line(ptr, bufsize - (int)(ptr - main_buffer), in);
+    int add = read_one_line(ptr, bufsize - (ptr - main_buffer), in);
     bufflength += add;
     endptr += add;
     }


Modified: code/trunk/pcreposix.c
===================================================================
--- code/trunk/pcreposix.c    2011-12-28 15:53:12 UTC (rev 834)
+++ code/trunk/pcreposix.c    2011-12-28 16:10:09 UTC (rev 835)
@@ -154,9 +154,7 @@
   REG_BADPAT,  /* \c must be followed by an ASCII character */
   REG_BADPAT,  /* \k is not followed by a braced, angle-bracketed, or quoted name */
   /* 70 */
-  REG_BADPAT,  /* internal error: unknown opcode in find_fixedlength() */
-  REG_BADPAT,  /* \N is not supported in a class */
-  REG_BADPAT,  /* too many forward references */
+  REG_BADPAT,  /* internal error: unknown opcode in find_fixedlength() */ 
 };


/* Table of texts corresponding to POSIX error codes */

Modified: code/trunk/pcretest.c
===================================================================
--- code/trunk/pcretest.c    2011-12-28 15:53:12 UTC (rev 834)
+++ code/trunk/pcretest.c    2011-12-28 16:10:09 UTC (rev 835)
@@ -191,7 +191,6 @@
 static int show_malloc;
 static int use_utf8;
 static size_t gotten_store;
-static size_t first_gotten_store = 0;
 static const unsigned char *last_callout_mark = NULL;


/* The buffers grow automatically if very long input lines are encountered. */
@@ -1000,14 +999,12 @@
*************************************************/

/* Alternative malloc function, to test functionality and save the size of a
-compiled re, which is the first store request that pcre_compile() makes. The
-show_malloc variable is set only during matching. */
+compiled re. The show_malloc variable is set only during matching. */

 static void *new_malloc(size_t size)
 {
 void *block = malloc(size);
 gotten_store = size;
-if (first_gotten_store == 0) first_gotten_store = size;
 if (show_malloc)
   fprintf(outfile, "malloc       %3d %p\n", (int)size, block);
 return block;
@@ -1523,7 +1520,7 @@
       (sbuf[4] << 24) | (sbuf[5] << 16) | (sbuf[6] << 8) | sbuf[7];


     re = (real_pcre *)new_malloc(true_size);
-    regex_gotten_store = first_gotten_store;
+    regex_gotten_store = gotten_store;


     if (fread(re, 1, true_size, f) != true_size) goto FAIL_READ;


@@ -1632,7 +1629,6 @@
   /* Look for options after final delimiter */

   options = 0;
-  study_options = 0;
   log_store = showstore;  /* default from command line */

   while (*pp != 0)
@@ -1781,7 +1777,6 @@
     if ((options & PCRE_UCP) != 0) cflags |= REG_UCP;
     if ((options & PCRE_UNGREEDY) != 0) cflags |= REG_UNGREEDY;


-    first_gotten_store = 0;
     rc = regcomp(&preg, (char *)p, cflags);


     /* Compilation failed; go back for another re, skipping to blank line
@@ -1819,7 +1814,6 @@
           (double)CLOCKS_PER_SEC);
       }


-    first_gotten_store = 0;
     re = pcre_compile((char *)p, options, &error, &erroroffset, tables);


     /* Compilation failed; go back for another re, skipping to blank line
@@ -1854,20 +1848,22 @@
     new_info(re, NULL, PCRE_INFO_OPTIONS, &get_options);
     if ((get_options & PCRE_UTF8) != 0) use_utf8 = 1;


-    /* Extract the size for possible writing before possibly flipping it,
-    and remember the store that was got. */
+    /* Print information if required. There are now two info-returning
+    functions. The old one has a limited interface and returns only limited
+    data. Check that it agrees with the newer one. */


-    true_size = ((real_pcre *)re)->size;
-    regex_gotten_store = first_gotten_store;
-
-    /* Output code size information if requested */
-
     if (log_store)
       fprintf(outfile, "Memory allocation (code space): %d\n",
-        (int)(first_gotten_store -
+        (int)(gotten_store -
               sizeof(real_pcre) -
               ((real_pcre *)re)->name_count * ((real_pcre *)re)->name_entry_size));


+    /* Extract the size for possible writing before possibly flipping it,
+    and remember the store that was got. */
+
+    true_size = ((real_pcre *)re)->size;
+    regex_gotten_store = gotten_store;
+
     /* If -s or /S was present, study the regex to generate additional info to
     help with the matching, unless the pattern has the SS option, which
     suppresses the effect of /S (used for a few test patterns where studying is
@@ -1892,16 +1888,7 @@
       if (error != NULL)
         fprintf(outfile, "Failed to study: %s\n", error);
       else if (extra != NULL)
-        {
         true_study_size = ((pcre_study_data *)(extra->study_data))->size;
-        if (log_store)
-          {
-          size_t jitsize;
-          new_info(re, extra, PCRE_INFO_JITSIZE, &jitsize);
-          if (jitsize != 0)
-            fprintf(outfile, "Memory allocation (JIT code): %d\n", jitsize);
-          }
-        }
       }


     /* If /K was present, we set up for handling MARK data. */
@@ -1954,9 +1941,7 @@
         }
       }


-    /* Extract information from the compiled data if required. There are now
-    two info-returning functions. The old one has a limited interface and
-    returns only limited data. Check that it agrees with the newer one. */
+    /* Extract information from the compiled data if required */


     SHOW_INFO:



Modified: code/trunk/perltest.pl
===================================================================
--- code/trunk/perltest.pl    2011-12-28 15:53:12 UTC (rev 834)
+++ code/trunk/perltest.pl    2011-12-28 16:10:09 UTC (rev 835)
@@ -111,10 +111,6 @@


$pattern =~ s/S(?=[a-zA-Z]*$)//g;

- # Remove /Y from a pattern (asks pcretest to disable PCRE optimization)
-
- $pattern =~ s/Y(?=[a-zA-Z]*$)//;
-
# Check that the pattern is valid

eval "\$_ =~ ${pattern}";

Modified: code/trunk/sljit/sljitConfigInternal.h
===================================================================
--- code/trunk/sljit/sljitConfigInternal.h    2011-12-28 15:53:12 UTC (rev 834)
+++ code/trunk/sljit/sljitConfigInternal.h    2011-12-28 16:10:09 UTC (rev 835)
@@ -354,8 +354,8 @@
 #endif /* !SLJIT_UNALIGNED */


#if (defined SLJIT_EXECUTABLE_ALLOCATOR && SLJIT_EXECUTABLE_ALLOCATOR)
-SLJIT_API_FUNC_ATTRIBUTE void* sljit_malloc_exec(sljit_uw size);
-SLJIT_API_FUNC_ATTRIBUTE void sljit_free_exec(void* ptr);
+void* sljit_malloc_exec(sljit_uw size);
+void sljit_free_exec(void* ptr);
#define SLJIT_MALLOC_EXEC(size) sljit_malloc_exec(size)
#define SLJIT_FREE_EXEC(ptr) sljit_free_exec(ptr)
#endif

Modified: code/trunk/sljit/sljitExecAllocator.c
===================================================================
--- code/trunk/sljit/sljitExecAllocator.c    2011-12-28 15:53:12 UTC (rev 834)
+++ code/trunk/sljit/sljitExecAllocator.c    2011-12-28 16:10:09 UTC (rev 835)
@@ -163,7 +163,7 @@
     }
 }


-SLJIT_API_FUNC_ATTRIBUTE void* sljit_malloc_exec(sljit_uw size)
+void* sljit_malloc_exec(sljit_uw size)
 {
     struct block_header *header;
     struct block_header *next_header;
@@ -231,7 +231,7 @@
     return MEM_START(header);
 }


-SLJIT_API_FUNC_ATTRIBUTE void sljit_free_exec(void* ptr)
+void sljit_free_exec(void* ptr)
 {
     struct block_header *header;
     struct free_block* free_block;


Modified: code/trunk/sljit/sljitLir.h
===================================================================
--- code/trunk/sljit/sljitLir.h    2011-12-28 15:53:12 UTC (rev 834)
+++ code/trunk/sljit/sljitLir.h    2011-12-28 16:10:09 UTC (rev 835)
@@ -195,8 +195,6 @@
     int local_size;
     /* Code size. */
     sljit_uw size;
-    /* For statistical purposes. */
-    sljit_uw executable_size;


 #if (defined SLJIT_CONFIG_X86_32 && SLJIT_CONFIG_X86_32)
     int args;
@@ -293,15 +291,6 @@
 SLJIT_API_FUNC_ATTRIBUTE void* sljit_generate_code(struct sljit_compiler *compiler);
 SLJIT_API_FUNC_ATTRIBUTE void sljit_free_code(void* code);


-/*
- After the code generation we can retrieve the allocated executable memory size,
- although this area may not be fully filled with instructions depending on some
- optimizations. This function is useful only for statistical purposes.
-
- Before a successful code generation, this function returns with 0.
-*/
-static SLJIT_INLINE sljit_uw sljit_get_generated_code_size(struct sljit_compiler *compiler) { return compiler->executable_size; }
-
/* Instruction generation. Returns with error code. */

/*

Modified: code/trunk/sljit/sljitNativeARM_Thumb2.c
===================================================================
--- code/trunk/sljit/sljitNativeARM_Thumb2.c    2011-12-28 15:53:12 UTC (rev 834)
+++ code/trunk/sljit/sljitNativeARM_Thumb2.c    2011-12-28 16:10:09 UTC (rev 835)
@@ -416,7 +416,6 @@


     SLJIT_CACHE_FLUSH(code, code_ptr);
     compiler->error = SLJIT_ERR_COMPILED;
-    compiler->executable_size = compiler->size * sizeof(sljit_uh);
     /* Set thumb mode flag. */
     return (void*)((sljit_uw)code | 0x1);
 }


Modified: code/trunk/sljit/sljitNativeARM_v5.c
===================================================================
--- code/trunk/sljit/sljitNativeARM_v5.c    2011-12-28 15:53:12 UTC (rev 834)
+++ code/trunk/sljit/sljitNativeARM_v5.c    2011-12-28 16:10:09 UTC (rev 835)
@@ -788,7 +788,6 @@


     SLJIT_CACHE_FLUSH(code, code_ptr);
     compiler->error = SLJIT_ERR_COMPILED;
-    compiler->executable_size = size * sizeof(sljit_uw);
     return code;
 }



Modified: code/trunk/sljit/sljitNativeMIPS_common.c
===================================================================
--- code/trunk/sljit/sljitNativeMIPS_common.c    2011-12-28 15:53:12 UTC (rev 834)
+++ code/trunk/sljit/sljitNativeMIPS_common.c    2011-12-28 16:10:09 UTC (rev 835)
@@ -397,7 +397,6 @@
     }


     compiler->error = SLJIT_ERR_COMPILED;
-    compiler->executable_size = compiler->size * sizeof(sljit_ins);
 #ifndef __GNUC__
     SLJIT_CACHE_FLUSH(code, code_ptr);
 #else


Modified: code/trunk/sljit/sljitNativePPC_common.c
===================================================================
--- code/trunk/sljit/sljitNativePPC_common.c    2011-12-28 15:53:12 UTC (rev 834)
+++ code/trunk/sljit/sljitNativePPC_common.c    2011-12-28 16:10:09 UTC (rev 835)
@@ -354,7 +354,6 @@


     SLJIT_CACHE_FLUSH(code, code_ptr);
     compiler->error = SLJIT_ERR_COMPILED;
-    compiler->executable_size = compiler->size * sizeof(sljit_ins);


 #if (defined SLJIT_CONFIG_PPC_64 && SLJIT_CONFIG_PPC_64)
     if (((sljit_w)code_ptr) & 0x4)


Modified: code/trunk/sljit/sljitNativeX86_common.c
===================================================================
--- code/trunk/sljit/sljitNativeX86_common.c    2011-12-28 15:53:12 UTC (rev 834)
+++ code/trunk/sljit/sljitNativeX86_common.c    2011-12-28 16:10:09 UTC (rev 835)
@@ -357,22 +357,22 @@
     while (jump) {
         if (jump->flags & PATCH_MB) {
             SLJIT_ASSERT((sljit_w)(jump->u.label->addr - (jump->addr + sizeof(sljit_b))) >= -128 && (sljit_w)(jump->u.label->addr - (jump->addr + sizeof(sljit_b))) <= 127);
-            *(sljit_ub*)jump->addr = (sljit_ub)(jump->u.label->addr - (jump->addr + sizeof(sljit_b)));
+            *(sljit_ub*)jump->addr = jump->u.label->addr - (jump->addr + sizeof(sljit_b));
         } else if (jump->flags & PATCH_MW) {
             if (jump->flags & JUMP_LABEL) {
 #if (defined SLJIT_CONFIG_X86_32 && SLJIT_CONFIG_X86_32)
-                *(sljit_w*)jump->addr = (sljit_w)(jump->u.label->addr - (jump->addr + sizeof(sljit_w)));
+                *(sljit_w*)jump->addr = jump->u.label->addr - (jump->addr + sizeof(sljit_w));
 #else
                 SLJIT_ASSERT((sljit_w)(jump->u.label->addr - (jump->addr + sizeof(sljit_hw))) >= -0x80000000ll && (sljit_w)(jump->u.label->addr - (jump->addr + sizeof(sljit_hw))) <= 0x7fffffffll);
-                *(sljit_hw*)jump->addr = (sljit_hw)(jump->u.label->addr - (jump->addr + sizeof(sljit_hw)));
+                *(sljit_hw*)jump->addr = jump->u.label->addr - (jump->addr + sizeof(sljit_hw));
 #endif
             }
             else {
 #if (defined SLJIT_CONFIG_X86_32 && SLJIT_CONFIG_X86_32)
-                *(sljit_w*)jump->addr = (sljit_w)(jump->u.target - (jump->addr + sizeof(sljit_w)));
+                *(sljit_w*)jump->addr = jump->u.target - (jump->addr + sizeof(sljit_w));
 #else
                 SLJIT_ASSERT((sljit_w)(jump->u.target - (jump->addr + sizeof(sljit_hw))) >= -0x80000000ll && (sljit_w)(jump->u.target - (jump->addr + sizeof(sljit_hw))) <= 0x7fffffffll);
-                *(sljit_hw*)jump->addr = (sljit_hw)(jump->u.target - (jump->addr + sizeof(sljit_hw)));
+                *(sljit_hw*)jump->addr = jump->u.target - (jump->addr + sizeof(sljit_hw));
 #endif
             }
         }
@@ -387,7 +387,6 @@
     /* Maybe we waste some space because of short jumps. */
     SLJIT_ASSERT(code_ptr <= code + compiler->size);
     compiler->error = SLJIT_ERR_COMPILED;
-    compiler->executable_size = compiler->size;
     return (void*)code;
 }


@@ -1361,7 +1360,7 @@
             code = (sljit_ub*)ensure_buf(compiler, 1 + 4);
             FAIL_IF(!code);
             INC_CSIZE(4);
-            *(sljit_hw*)code = (sljit_hw)src1w;
+            *(sljit_hw*)code = src1w;
         }
         else {
             EMIT_MOV(compiler, TMP_REG2, 0, SLJIT_IMM, src1w);
@@ -1404,7 +1403,7 @@
             code = (sljit_ub*)ensure_buf(compiler, 1 + 4);
             FAIL_IF(!code);
             INC_CSIZE(4);
-            *(sljit_hw*)code = (sljit_hw)src2w;
+            *(sljit_hw*)code = src2w;
         }
         else {
             EMIT_MOV(compiler, TMP_REG2, 0, SLJIT_IMM, src1w);


Modified: code/trunk/testdata/testinput11
===================================================================
--- code/trunk/testdata/testinput11    2011-12-28 15:53:12 UTC (rev 834)
+++ code/trunk/testdata/testinput11    2011-12-28 16:10:09 UTC (rev 835)
@@ -452,6 +452,20 @@
 /A(*MARK:A)A+(*SKIP:B)(B|Z) | AC(*:B)/xK
     AAAC


+/--- We use something more complicated than individual letters here, because
+that causes different behaviour in Perl. Perhaps it disables some optimization;
+anyway, the result now matches PCRE in that no tag is passed back for the 
+failures. ---/
+    
+/(A|P)(*:A)(B|P) | (X|P)(X|P)(*:B)(Y|P)/xK
+    AABC
+    XXYZ 
+    ** Failers
+    XAQQ  
+    XAQQXZZ  
+    AXQQQ 
+    AXXQQQ 
+    
 /--- COMMIT at the start of a pattern should act like an anchor. Again, 
 however, we need the complication for Perl. ---/


@@ -786,143 +800,4 @@
 /(?<=a(*THEN)b)c/
     xabcd 


-/(a)(?2){2}(.)/
-    abcd
-
-/(*MARK:A)(*PRUNE:B)(C|X)/KS
-    C
-    D 
-
-/(*MARK:A)(*PRUNE:B)(C|X)/KSS
-    C
-    D 
-
-/(*MARK:A)(*THEN:B)(C|X)/KS
-    C
-    D 
-
-/(*MARK:A)(*THEN:B)(C|X)/KSY
-    C
-    D 
-
-/(*MARK:A)(*THEN:B)(C|X)/KSS
-    C
-    D 
-
-/--- This should fail, as the skip causes a bump to offset 3 (the skip) ---/
-
-/A(*MARK:A)A+(*SKIP)(B|Z) | AC/xK
-    AAAC
-
-/--- Same --/
-
-/A(*MARK:A)A+(*MARK:B)(*SKIP:B)(B|Z) | AC/xK
-    AAAC
-
-/A(*:A)A+(*SKIP)(B|Z) | AC/xK
-    AAAC
-
-/--- This should fail, as a null name is the same as no name ---/
-
-/A(*MARK:A)A+(*SKIP:)(B|Z) | AC/xK
-    AAAC
-
-/--- A check on what happens after hitting a mark and them bumping along to
-something that does not even start. Perl reports tags after the failures here, 
-though it does not when the individual letters are made into something 
-more complicated. ---/
-
-/A(*:A)B|XX(*:B)Y/K
-    AABC
-    XXYZ 
-    ** Failers
-    XAQQ  
-    XAQQXZZ  
-    AXQQQ 
-    AXXQQQ 
-    
-/^(A(*THEN:A)B|C(*THEN:B)D)/K
-    AB
-    CD
-    ** Failers
-    AC
-    CB    
-    
-/^(A(*PRUNE:A)B|C(*PRUNE:B)D)/K
-    AB
-    CD
-    ** Failers
-    AC
-    CB    
-    
-/--- An empty name does not pass back an empty string. It is the same as if no
-name were given. ---/ 
-
-/^(A(*PRUNE:)B|C(*PRUNE:B)D)/K
-    AB
-    CD 
-
-/--- PRUNE goes to next bumpalong; COMMIT does not. ---/
-    
-/A(*PRUNE:A)B/K
-    ACAB
-
-/--- Mark names can be duplicated ---/
-
-/A(*:A)B|X(*:A)Y/K
-    AABC
-    XXYZ 
-    
-/b(*:m)f|a(*:n)w/K
-    aw 
-    ** Failers 
-    abc
-
-/b(*:m)f|aw/K
-    abaw
-    ** Failers 
-    abc
-    abax 
-
-/A(*MARK:A)A+(*SKIP:B)(B|Z) | AAC/xK
-    AAAC
-
-/a(*PRUNE:X)bc|qq/KY
-    ** Failers
-    axy
-
-/a(*THEN:X)bc|qq/KY
-    ** Failers
-    axy
-
-/(?=a(*MARK:A)b)..x/K
-    abxy
-    ** Failers
-    abpq  
-
-/(?=a(*MARK:A)b)..(*:Y)x/K
-    abxy
-    ** Failers
-    abpq  
-
-/(?=a(*PRUNE:A)b)..x/K
-    abxy
-    ** Failers
-    abpq  
-
-/(?=a(*PRUNE:A)b)..(*:Y)x/K
-    abxy
-    ** Failers
-    abpq  
-
-/(?=a(*THEN:A)b)..x/K
-    abxy
-    ** Failers
-    abpq  
-
-/(?=a(*THEN:A)b)..(*:Y)x/K
-    abxy
-    ** Failers
-    abpq  
-
 /-- End of testinput11 --/


Modified: code/trunk/testdata/testinput13
===================================================================
--- code/trunk/testdata/testinput13    2011-12-28 15:53:12 UTC (rev 834)
+++ code/trunk/testdata/testinput13    2011-12-28 16:10:09 UTC (rev 835)
@@ -573,11 +573,4 @@
 /^\X/8 
     ́réo


-/^a\X41z/<JS>
-    aX41z
-    *** Failers
-    aAz
-
-/(?<=ab\Cde)X/8
-
 /-- End of testinput13 --/


Modified: code/trunk/testdata/testinput2
===================================================================
--- code/trunk/testdata/testinput2    2011-12-28 15:53:12 UTC (rev 834)
+++ code/trunk/testdata/testinput2    2011-12-28 16:10:09 UTC (rev 835)
@@ -3,10 +3,7 @@
     It also checks the non-Perl syntax the PCRE supports (Python, .NET, 
     Oniguruma). Finally, there are some tests where PCRE and Perl differ, 
     either because PCRE can't be compatible, or there is a possible Perl 
-    bug.
-    
-    NOTE: This is a non-UTF-8 set of tests. When UTF-8 is needed, use test
-    5, and if Unicode Property Support is needed, use test 13. --/  
+    bug. --/  


 /-- Originally, the Perl >= 5.10 things were in here too, but now I have 
     separated many (most?) of them out into test 11. However, there may still 
@@ -3325,19 +3322,116 @@
 /A(*PRUNE)B|A(*PRUNE)C/K
     AC


+/--- A whole lot of tests of verbs with arguments are here rather than in test
+     11 because Perl doesn't seem to follow its specification entirely 
+     correctly. ---/
+
+/--- Perl 5.11 sets $REGERROR on the AC failure case here; PCRE does not. It is
+     not clear how Perl defines "involved in the failure of the match". ---/ 
+
+/^(A(*THEN:A)B|C(*THEN:B)D)/K
+    AB
+    CD
+    ** Failers
+    AC
+    CB    
+    
+/--- Check the use of names for success and failure. PCRE doesn't show these 
+names for success, though Perl does, contrary to its spec. ---/
+
+/^(A(*PRUNE:A)B|C(*PRUNE:B)D)/K
+    AB
+    CD
+    ** Failers
+    AC
+    CB    
+    
+/--- An empty name does not pass back an empty string. It is the same as if no
+name were given. ---/ 
+
+/^(A(*PRUNE:)B|C(*PRUNE:B)D)/K
+    AB
+    CD 
+
+/--- PRUNE goes to next bumpalong; COMMIT does not. ---/
+    
+/A(*PRUNE:A)B/K
+    ACAB
+
+/(*MARK:A)(*PRUNE:B)(C|X)/KS
+    C
+    D 
+
+/(*MARK:A)(*PRUNE:B)(C|X)/KSS
+    C
+    D 
+
+/(*MARK:A)(*THEN:B)(C|X)/KS
+    C
+    D 
+
+/(*MARK:A)(*THEN:B)(C|X)/KSY
+    C
+    D 
+
+/(*MARK:A)(*THEN:B)(C|X)/KSS
+    C
+    D 
+
+/--- This should fail, as the skip causes a bump to offset 3 (the skip) ---/
+
+/A(*MARK:A)A+(*SKIP)(B|Z) | AC/xK
+    AAAC
+
+/--- Same --/
+
+/A(*MARK:A)A+(*MARK:B)(*SKIP:B)(B|Z) | AC/xK
+    AAAC
+
 /--- This should fail; the SKIP advances by one, but when we get to AC, the
-     PRUNE kills it. Perl behaves differently. ---/ 
+     PRUNE kills it. ---/ 


 /A(*PRUNE:A)A+(*SKIP:A)(B|Z) | AC/xK
     AAAC


-/--- Mark names can be duplicated. Perl doesn't give a mark for this one,
-though PCRE does. ---/
+/A(*:A)A+(*SKIP)(B|Z) | AC/xK
+    AAAC


+/--- This should fail, as a null name is the same as no name ---/
+
+/A(*MARK:A)A+(*SKIP:)(B|Z) | AC/xK
+    AAAC
+
+/--- This fails in PCRE, and I think that is in accordance with Perl's 
+     documentation, though in Perl it succeeds. ---/
+    
+/A(*MARK:A)A+(*SKIP:B)(B|Z) | AAC/xK
+    AAAC
+
+/--- Mark names can be duplicated ---/
+
+/A(*:A)B|X(*:A)Y/K
+    AABC
+    XXYZ 
+    
 /^A(*:A)B|^X(*:A)Y/K
     ** Failers
     XAQQ


+/--- A check on what happens after hitting a mark and them bumping along to
+something that does not even start. Perl reports tags after the failures here, 
+though it does not when the individual letters are made into something 
+more complicated. ---/
+
+/A(*:A)B|XX(*:B)Y/K
+    AABC
+    XXYZ 
+    ** Failers
+    XAQQ  
+    XAQQXZZ  
+    AXQQQ 
+    AXXQQQ 
+    
 /--- COMMIT at the start of a pattern should be the same as an anchor. Perl 
 optimizations defeat this. So does the PCRE optimization unless we disable it 
 with \Y. ---/
@@ -3347,6 +3441,78 @@
     ** Failers
     DEFGABC\Y  


+/--- Repeat some tests with added studying. ---/
+
+/A(*COMMIT)B/+KS
+    ACABX
+ 
+/A(*THEN)B|A(*THEN)C/KS
+    AC
+
+/A(*PRUNE)B|A(*PRUNE)C/KS
+    AC
+
+/^(A(*THEN:A)B|C(*THEN:B)D)/KS
+    AB
+    CD
+    ** Failers
+    AC
+    CB    
+
+/^(A(*PRUNE:A)B|C(*PRUNE:B)D)/KS
+    AB
+    CD
+    ** Failers
+    AC
+    CB    
+
+/^(A(*PRUNE:)B|C(*PRUNE:B)D)/KS
+    AB
+    CD 
+
+/A(*PRUNE:A)B/KS
+    ACAB
+
+/(*MARK:A)(*PRUNE:B)(C|X)/KS
+    C
+    D 
+
+/(*MARK:A)(*THEN:B)(C|X)/KS
+    C
+    D 
+
+/A(*MARK:A)A+(*SKIP)(B|Z) | AC/xKS
+    AAAC
+
+/A(*MARK:A)A+(*MARK:B)(*SKIP:B)(B|Z) | AC/xKS
+    AAAC
+    
+/A(*PRUNE:A)A+(*SKIP:A)(B|Z) | AC/xKS
+    AAAC
+
+/A(*:A)A+(*SKIP)(B|Z) | AC/xKS
+    AAAC
+
+/A(*MARK:A)A+(*SKIP:)(B|Z) | AC/xKS
+    AAAC
+
+/A(*MARK:A)A+(*SKIP:B)(B|Z) | AAC/xKS
+    AAAC
+
+/A(*:A)B|XX(*:B)Y/KS
+    AABC
+    XXYZ 
+    ** Failers
+    XAQQ  
+    XAQQXZZ  
+    AXQQQ 
+    AXXQQQ 
+    
+/(*COMMIT)ABC/
+    ABCDEFG
+    ** Failers
+    DEFGABC\Y  
+
 /^(ab (c+(*THEN)cd) | xyz)/x
     abcccd  


@@ -3814,6 +3980,11 @@
 /^a\x1z/<JS>
     ax1z


+/^a\X41z/<JS>
+    aX41z
+    *** Failers
+    aAz
+
 /^a\u0041z/<JS>
     aAz
     *** Failers
@@ -3836,44 +4007,6 @@


 /(?(?=c)c|d)*+Y/BZ

-/a[\NB]c/
-    aNc
-    
-/a[B-\Nc]/ 
+/(?<=ab\Cde)X/8


-/(a)(?2){0,1999}?(b)/
-
-/(a)(?(DEFINE)(b))(?2){0,1999}?(?2)/
-
-/--- This test, with something more complicated than individual letters, causes
-different behaviour in Perl. Perhaps it disables some optimization; no tag is
-passed back for the failures, whereas in PCRE there is a tag. ---/
-    
-/(A|P)(*:A)(B|P) | (X|P)(X|P)(*:B)(Y|P)/xK
-    AABC
-    XXYZ 
-    ** Failers
-    XAQQ  
-    XAQQXZZ  
-    AXQQQ 
-    AXXQQQ 
-
-/-- Perl doesn't give marks for these, though it does if the alternatives are
-replaced by single letters. --/
-    
-/(b|q)(*:m)f|a(*:n)w/K
-    aw 
-    ** Failers 
-    abc
-
-/(q|b)(*:m)f|a(*:n)w/K
-    aw 
-    ** Failers 
-    abc
-
-/-- After a partial match, the behaviour is as for a failure. --/
-
-/^a(*:X)bcde/K
-   abc\P
-
 /-- End of testinput2 --/


Modified: code/trunk/testdata/testinput6
===================================================================
--- code/trunk/testdata/testinput6    2011-12-28 15:53:12 UTC (rev 834)
+++ code/trunk/testdata/testinput6    2011-12-28 16:10:09 UTC (rev 835)
@@ -802,18 +802,4 @@
     ** Failers 
     a\xFCb   


-/ⱥ/8i
-    ⱥ
-    Ⱥx 
-    Ⱥ 
-
-/[ⱥ]/8i
-    ⱥ
-    Ⱥx 
-    Ⱥ 
-
-/Ⱥ/8i
-    Ⱥ
-    ⱥ
-
 /-- End of testinput6 --/


Modified: code/trunk/testdata/testoutput10
===================================================================
--- code/trunk/testdata/testoutput10    2011-12-28 15:53:12 UTC (rev 834)
+++ code/trunk/testdata/testoutput10    2011-12-28 16:10:09 UTC (rev 835)
@@ -194,7 +194,7 @@
 ------------------------------------------------------------------


 /a(?P<name1>b|c)d(?P<longername2>e)/BM
-Memory allocation (code space): 36
+Memory allocation (code space): 42
 ------------------------------------------------------------------
   0  32 Bra
   3     a
@@ -212,7 +212,7 @@
 ------------------------------------------------------------------


 /(?:a(?P<c>c(?P<d>d)))(?P<a>a)/BM
-Memory allocation (code space): 45
+Memory allocation (code space): 54
 ------------------------------------------------------------------
   0  41 Bra
   3  25 Bra
@@ -232,7 +232,7 @@
 ------------------------------------------------------------------

 /(?P<a>a)...(?P=a)bbb(?P>a)d/BM
-Memory allocation (code space): 34
+Memory allocation (code space): 37
 ------------------------------------------------------------------
   0  30 Bra
   3   7 CBra 1

Modified: code/trunk/testdata/testoutput11
===================================================================
--- code/trunk/testdata/testoutput11    2011-12-28 15:53:12 UTC (rev 834)
+++ code/trunk/testdata/testoutput11    2011-12-28 16:10:09 UTC (rev 835)
@@ -888,6 +888,36 @@
  0: AC
 MK: B


+/--- We use something more complicated than individual letters here, because
+that causes different behaviour in Perl. Perhaps it disables some optimization;
+anyway, the result now matches PCRE in that no tag is passed back for the 
+failures. ---/
+    
+/(A|P)(*:A)(B|P) | (X|P)(X|P)(*:B)(Y|P)/xK
+    AABC
+ 0: AB
+ 1: A
+ 2: B
+MK: A
+    XXYZ 
+ 0: XXY
+ 1: <unset>
+ 2: <unset>
+ 3: X
+ 4: X
+ 5: Y
+MK: B
+    ** Failers
+No match
+    XAQQ  
+No match
+    XAQQXZZ  
+No match
+    AXQQQ 
+No match
+    AXXQQQ 
+No match
+    
 /--- COMMIT at the start of a pattern should act like an anchor. Again, 
 however, we need the complication for Perl. ---/


@@ -1411,245 +1441,4 @@
     xabcd 
  0: c


-/(a)(?2){2}(.)/
-    abcd
- 0: abcd
- 1: a
- 2: d
-
-/(*MARK:A)(*PRUNE:B)(C|X)/KS
-    C
- 0: C
- 1: C
-MK: B
-    D 
-No match, mark = B
-
-/(*MARK:A)(*PRUNE:B)(C|X)/KSS
-    C
- 0: C
- 1: C
-MK: B
-    D 
-No match, mark = B
-
-/(*MARK:A)(*THEN:B)(C|X)/KS
-    C
- 0: C
- 1: C
-MK: B
-    D 
-No match, mark = B
-
-/(*MARK:A)(*THEN:B)(C|X)/KSY
-    C
- 0: C
- 1: C
-MK: B
-    D 
-No match, mark = B
-
-/(*MARK:A)(*THEN:B)(C|X)/KSS
-    C
- 0: C
- 1: C
-MK: B
-    D 
-No match, mark = B
-
-/--- This should fail, as the skip causes a bump to offset 3 (the skip) ---/
-
-/A(*MARK:A)A+(*SKIP)(B|Z) | AC/xK
-    AAAC
-No match, mark = A
-
-/--- Same --/
-
-/A(*MARK:A)A+(*MARK:B)(*SKIP:B)(B|Z) | AC/xK
-    AAAC
-No match, mark = B
-
-/A(*:A)A+(*SKIP)(B|Z) | AC/xK
-    AAAC
-No match, mark = A
-
-/--- This should fail, as a null name is the same as no name ---/
-
-/A(*MARK:A)A+(*SKIP:)(B|Z) | AC/xK
-    AAAC
-No match, mark = A
-
-/--- A check on what happens after hitting a mark and them bumping along to
-something that does not even start. Perl reports tags after the failures here, 
-though it does not when the individual letters are made into something 
-more complicated. ---/
-
-/A(*:A)B|XX(*:B)Y/K
-    AABC
- 0: AB
-MK: A
-    XXYZ 
- 0: XXY
-MK: B
-    ** Failers
-No match
-    XAQQ  
-No match, mark = A
-    XAQQXZZ  
-No match, mark = A
-    AXQQQ 
-No match, mark = A
-    AXXQQQ 
-No match, mark = B
-    
-/^(A(*THEN:A)B|C(*THEN:B)D)/K
-    AB
- 0: AB
- 1: AB
-MK: A
-    CD
- 0: CD
- 1: CD
-MK: B
-    ** Failers
-No match
-    AC
-No match, mark = A
-    CB    
-No match, mark = B
-    
-/^(A(*PRUNE:A)B|C(*PRUNE:B)D)/K
-    AB
- 0: AB
- 1: AB
-MK: A
-    CD
- 0: CD
- 1: CD
-MK: B
-    ** Failers
-No match
-    AC
-No match, mark = A
-    CB    
-No match, mark = B
-    
-/--- An empty name does not pass back an empty string. It is the same as if no
-name were given. ---/ 
-
-/^(A(*PRUNE:)B|C(*PRUNE:B)D)/K
-    AB
- 0: AB
- 1: AB
-    CD 
- 0: CD
- 1: CD
-MK: B
-
-/--- PRUNE goes to next bumpalong; COMMIT does not. ---/
-    
-/A(*PRUNE:A)B/K
-    ACAB
- 0: AB
-MK: A
-
-/--- Mark names can be duplicated ---/
-
-/A(*:A)B|X(*:A)Y/K
-    AABC
- 0: AB
-MK: A
-    XXYZ 
- 0: XY
-MK: A
-    
-/b(*:m)f|a(*:n)w/K
-    aw 
- 0: aw
-MK: n
-    ** Failers 
-No match, mark = n
-    abc
-No match, mark = m
-
-/b(*:m)f|aw/K
-    abaw
- 0: aw
-    ** Failers 
-No match
-    abc
-No match, mark = m
-    abax 
-No match, mark = m
-
-/A(*MARK:A)A+(*SKIP:B)(B|Z) | AAC/xK
-    AAAC
- 0: AAC
-
-/a(*PRUNE:X)bc|qq/KY
-    ** Failers
-No match, mark = X
-    axy
-No match, mark = X
-
-/a(*THEN:X)bc|qq/KY
-    ** Failers
-No match, mark = X
-    axy
-No match, mark = X
-
-/(?=a(*MARK:A)b)..x/K
-    abxy
- 0: abx
-MK: A
-    ** Failers
-No match
-    abpq  
-No match
-
-/(?=a(*MARK:A)b)..(*:Y)x/K
-    abxy
- 0: abx
-MK: Y
-    ** Failers
-No match
-    abpq  
-No match
-
-/(?=a(*PRUNE:A)b)..x/K
-    abxy
- 0: abx
-MK: A
-    ** Failers
-No match
-    abpq  
-No match
-
-/(?=a(*PRUNE:A)b)..(*:Y)x/K
-    abxy
- 0: abx
-MK: Y
-    ** Failers
-No match
-    abpq  
-No match
-
-/(?=a(*THEN:A)b)..x/K
-    abxy
- 0: abx
-MK: A
-    ** Failers
-No match
-    abpq  
-No match
-
-/(?=a(*THEN:A)b)..(*:Y)x/K
-    abxy
- 0: abx
-MK: Y
-    ** Failers
-No match
-    abpq  
-No match
-
 /-- End of testinput11 --/


Modified: code/trunk/testdata/testoutput13
===================================================================
--- code/trunk/testdata/testoutput13    2011-12-28 15:53:12 UTC (rev 834)
+++ code/trunk/testdata/testoutput13    2011-12-28 16:10:09 UTC (rev 835)
@@ -1278,15 +1278,4 @@
     ́réo
 No match


-/^a\X41z/<JS>
-    aX41z
- 0: aX41z
-    *** Failers
-No match
-    aAz
-No match
-
-/(?<=ab\Cde)X/8
-Failed: \C not allowed in lookbehind assertion at offset 10
-
 /-- End of testinput13 --/


Modified: code/trunk/testdata/testoutput14
===================================================================
--- code/trunk/testdata/testoutput14    2011-12-28 15:53:12 UTC (rev 834)
+++ code/trunk/testdata/testoutput14    2011-12-28 16:10:09 UTC (rev 835)
@@ -42,7 +42,9 @@
 No options
 No first char
 No need char
-Study returned NULL
+Subject length lower bound = -1
+No set of starting bytes
+JIT study was successful


 /(?(R)a*(?1)|((?R))b)/S+
     aaaabcde


Modified: code/trunk/testdata/testoutput15
===================================================================
--- code/trunk/testdata/testoutput15    2011-12-28 15:53:12 UTC (rev 834)
+++ code/trunk/testdata/testoutput15    2011-12-28 16:10:09 UTC (rev 835)
@@ -17,5 +17,6 @@
 No first char
 No need char
 Study returned NULL
+JIT support is not available in this version of PCRE


 /-- End of testinput15 --/

Modified: code/trunk/testdata/testoutput2
===================================================================
--- code/trunk/testdata/testoutput2    2011-12-28 15:53:12 UTC (rev 834)
+++ code/trunk/testdata/testoutput2    2011-12-28 16:10:09 UTC (rev 835)
@@ -3,10 +3,7 @@
     It also checks the non-Perl syntax the PCRE supports (Python, .NET, 
     Oniguruma). Finally, there are some tests where PCRE and Perl differ, 
     either because PCRE can't be compatible, or there is a possible Perl 
-    bug.
-    
-    NOTE: This is a non-UTF-8 set of tests. When UTF-8 is needed, use test
-    5, and if Unicode Property Support is needed, use test 13. --/  
+    bug. --/  


 /-- Originally, the Perl >= 5.10 things were in here too, but now I have 
     separated many (most?) of them out into test 11. However, there may still 
@@ -10992,22 +10989,176 @@
     AC
 No match


+/--- A whole lot of tests of verbs with arguments are here rather than in test
+     11 because Perl doesn't seem to follow its specification entirely 
+     correctly. ---/
+
+/--- Perl 5.11 sets $REGERROR on the AC failure case here; PCRE does not. It is
+     not clear how Perl defines "involved in the failure of the match". ---/ 
+
+/^(A(*THEN:A)B|C(*THEN:B)D)/K
+    AB
+ 0: AB
+ 1: AB
+    CD
+ 0: CD
+ 1: CD
+    ** Failers
+No match
+    AC
+No match
+    CB    
+No match, mark = B
+    
+/--- Check the use of names for success and failure. PCRE doesn't show these 
+names for success, though Perl does, contrary to its spec. ---/
+
+/^(A(*PRUNE:A)B|C(*PRUNE:B)D)/K
+    AB
+ 0: AB
+ 1: AB
+    CD
+ 0: CD
+ 1: CD
+    ** Failers
+No match
+    AC
+No match, mark = A
+    CB    
+No match, mark = B
+    
+/--- An empty name does not pass back an empty string. It is the same as if no
+name were given. ---/ 
+
+/^(A(*PRUNE:)B|C(*PRUNE:B)D)/K
+    AB
+ 0: AB
+ 1: AB
+    CD 
+ 0: CD
+ 1: CD
+
+/--- PRUNE goes to next bumpalong; COMMIT does not. ---/
+    
+/A(*PRUNE:A)B/K
+    ACAB
+ 0: AB
+
+/(*MARK:A)(*PRUNE:B)(C|X)/KS
+    C
+ 0: C
+ 1: C
+MK: A
+    D 
+No match
+
+/(*MARK:A)(*PRUNE:B)(C|X)/KSS
+    C
+ 0: C
+ 1: C
+MK: A
+    D 
+No match, mark = B
+
+/(*MARK:A)(*THEN:B)(C|X)/KS
+    C
+ 0: C
+ 1: C
+MK: A
+    D 
+No match
+
+/(*MARK:A)(*THEN:B)(C|X)/KSY
+    C
+ 0: C
+ 1: C
+MK: A
+    D 
+No match, mark = B
+
+/(*MARK:A)(*THEN:B)(C|X)/KSS
+    C
+ 0: C
+ 1: C
+MK: A
+    D 
+No match, mark = B
+
+/--- This should fail, as the skip causes a bump to offset 3 (the skip) ---/
+
+/A(*MARK:A)A+(*SKIP)(B|Z) | AC/xK
+    AAAC
+No match
+
+/--- Same --/
+
+/A(*MARK:A)A+(*MARK:B)(*SKIP:B)(B|Z) | AC/xK
+    AAAC
+No match
+
 /--- This should fail; the SKIP advances by one, but when we get to AC, the
-     PRUNE kills it. Perl behaves differently. ---/ 
+     PRUNE kills it. ---/ 


 /A(*PRUNE:A)A+(*SKIP:A)(B|Z) | AC/xK
     AAAC
-No match, mark = A
+No match


-/--- Mark names can be duplicated. Perl doesn't give a mark for this one,
-though PCRE does. ---/
+/A(*:A)A+(*SKIP)(B|Z) | AC/xK
+    AAAC
+No match


+/--- This should fail, as a null name is the same as no name ---/
+
+/A(*MARK:A)A+(*SKIP:)(B|Z) | AC/xK
+    AAAC
+No match
+
+/--- This fails in PCRE, and I think that is in accordance with Perl's 
+     documentation, though in Perl it succeeds. ---/
+    
+/A(*MARK:A)A+(*SKIP:B)(B|Z) | AAC/xK
+    AAAC
+No match
+
+/--- Mark names can be duplicated ---/
+
+/A(*:A)B|X(*:A)Y/K
+    AABC
+ 0: AB
+MK: A
+    XXYZ 
+ 0: XY
+MK: A
+    
 /^A(*:A)B|^X(*:A)Y/K
     ** Failers
 No match
     XAQQ
 No match, mark = A


+/--- A check on what happens after hitting a mark and them bumping along to
+something that does not even start. Perl reports tags after the failures here, 
+though it does not when the individual letters are made into something 
+more complicated. ---/
+
+/A(*:A)B|XX(*:B)Y/K
+    AABC
+ 0: AB
+MK: A
+    XXYZ 
+ 0: XXY
+MK: B
+    ** Failers
+No match
+    XAQQ  
+No match
+    XAQQXZZ  
+No match
+    AXQQQ 
+No match
+    AXXQQQ 
+No match
+    
 /--- COMMIT at the start of a pattern should be the same as an anchor. Perl 
 optimizations defeat this. So does the PCRE optimization unless we disable it 
 with \Y. ---/
@@ -11020,6 +11171,126 @@
     DEFGABC\Y  
 No match


+/--- Repeat some tests with added studying. ---/
+
+/A(*COMMIT)B/+KS
+    ACABX
+No match
+ 
+/A(*THEN)B|A(*THEN)C/KS
+    AC
+ 0: AC
+
+/A(*PRUNE)B|A(*PRUNE)C/KS
+    AC
+No match
+
+/^(A(*THEN:A)B|C(*THEN:B)D)/KS
+    AB
+ 0: AB
+ 1: AB
+    CD
+ 0: CD
+ 1: CD
+    ** Failers
+No match
+    AC
+No match
+    CB    
+No match, mark = B
+
+/^(A(*PRUNE:A)B|C(*PRUNE:B)D)/KS
+    AB
+ 0: AB
+ 1: AB
+    CD
+ 0: CD
+ 1: CD
+    ** Failers
+No match
+    AC
+No match, mark = A
+    CB    
+No match, mark = B
+
+/^(A(*PRUNE:)B|C(*PRUNE:B)D)/KS
+    AB
+ 0: AB
+ 1: AB
+    CD 
+ 0: CD
+ 1: CD
+
+/A(*PRUNE:A)B/KS
+    ACAB
+ 0: AB
+
+/(*MARK:A)(*PRUNE:B)(C|X)/KS
+    C
+ 0: C
+ 1: C
+MK: A
+    D 
+No match
+
+/(*MARK:A)(*THEN:B)(C|X)/KS
+    C
+ 0: C
+ 1: C
+MK: A
+    D 
+No match
+
+/A(*MARK:A)A+(*SKIP)(B|Z) | AC/xKS
+    AAAC
+No match
+
+/A(*MARK:A)A+(*MARK:B)(*SKIP:B)(B|Z) | AC/xKS
+    AAAC
+No match
+    
+/A(*PRUNE:A)A+(*SKIP:A)(B|Z) | AC/xKS
+    AAAC
+No match
+
+/A(*:A)A+(*SKIP)(B|Z) | AC/xKS
+    AAAC
+No match
+
+/A(*MARK:A)A+(*SKIP:)(B|Z) | AC/xKS
+    AAAC
+No match
+
+/A(*MARK:A)A+(*SKIP:B)(B|Z) | AAC/xKS
+    AAAC
+No match
+
+/A(*:A)B|XX(*:B)Y/KS
+    AABC
+ 0: AB
+MK: A
+    XXYZ 
+ 0: XXY
+MK: B
+    ** Failers
+No match
+    XAQQ  
+No match
+    XAQQXZZ  
+No match
+    AXQQQ 
+No match
+    AXXQQQ 
+No match
+    
+/(*COMMIT)ABC/
+    ABCDEFG
+ 0: ABC
+    ** Failers
+No match
+    DEFGABC\Y  
+No match
+
 /^(ab (c+(*THEN)cd) | xyz)/x
     abcccd  
 No match
@@ -11604,11 +11875,11 @@
  1: C
 MK: A
     D
-No match, mark = A
+No match


 /(*:A)A+(*SKIP:A)(B|Z)/KS
     AAAC
-No match, mark = A
+No match


 /-- --/

@@ -11986,6 +12257,7 @@
 Latest Mark: B
 +18 ^ ^          z
 +20 ^            a
+Latest Mark: <unset>
 +21 ^^           e
 +22 ^ ^          q
 +23 ^  ^         )
@@ -12246,6 +12518,14 @@
     ax1z
  0: ax1z


+/^a\X41z/<JS>
+    aX41z
+ 0: aX41z
+    *** Failers
+No match
+    aAz
+No match
+
 /^a\u0041z/<JS>
     aAz
  0: aAz
@@ -12311,70 +12591,7 @@
         End
 ------------------------------------------------------------------


-/a[\NB]c/
-Failed: \N is not supported in a class at offset 3
+/(?<=ab\Cde)X/8
+Failed: \C not allowed in lookbehind assertion at offset 10

-/a[B-\Nc]/ 
-Failed: \N is not supported in a class at offset 5
-
-/(a)(?2){0,1999}?(b)/
-
-/(a)(?(DEFINE)(b))(?2){0,1999}?(?2)/
-
-/--- This test, with something more complicated than individual letters, causes
-different behaviour in Perl. Perhaps it disables some optimization; no tag is
-passed back for the failures, whereas in PCRE there is a tag. ---/
-    
-/(A|P)(*:A)(B|P) | (X|P)(X|P)(*:B)(Y|P)/xK
-    AABC
- 0: AB
- 1: A
- 2: B
-MK: A
-    XXYZ 
- 0: XXY
- 1: <unset>
- 2: <unset>
- 3: X
- 4: X
- 5: Y
-MK: B
-    ** Failers
-No match
-    XAQQ  
-No match, mark = A
-    XAQQXZZ  
-No match, mark = A
-    AXQQQ 
-No match, mark = A
-    AXXQQQ 
-No match, mark = B
-
-/-- Perl doesn't give marks for these, though it does if the alternatives are
-replaced by single letters. --/
-    
-/(b|q)(*:m)f|a(*:n)w/K
-    aw 
- 0: aw
-MK: n
-    ** Failers 
-No match, mark = n
-    abc
-No match, mark = m
-
-/(q|b)(*:m)f|a(*:n)w/K
-    aw 
- 0: aw
-MK: n
-    ** Failers 
-No match, mark = n
-    abc
-No match, mark = m
-
-/-- After a partial match, the behaviour is as for a failure. --/
-
-/^a(*:X)bcde/K
-   abc\P
-Partial match, mark=X: abc
-
 /-- End of testinput2 --/


Modified: code/trunk/testdata/testoutput6
===================================================================
--- code/trunk/testdata/testoutput6    2011-12-28 15:53:12 UTC (rev 834)
+++ code/trunk/testdata/testoutput6    2011-12-28 16:10:09 UTC (rev 835)
@@ -1353,26 +1353,4 @@
     a\xFCb   
 No match


-/ⱥ/8i
-    ⱥ
- 0: \x{2c65}
-    Ⱥx 
- 0: \x{23a}
-    Ⱥ 
- 0: \x{23a}
-
-/[ⱥ]/8i
-    ⱥ
- 0: \x{2c65}
-    Ⱥx 
- 0: \x{23a}
-    Ⱥ 
- 0: \x{23a}
-
-/Ⱥ/8i
-    Ⱥ
- 0: \x{23a}
-    ⱥ
- 0: \x{2c65}
-
 /-- End of testinput6 --/