[Pcre-svn] [612] code/trunk: Fix two study bugs concerned wi…

Author: Subversion repository
Date:  
To: pcre-svn
Subject: [Pcre-svn] [612] code/trunk: Fix two study bugs concerned with minimum subject lengths; add features to
Revision: 612
          http://vcs.pcre.org/viewvc?view=rev&revision=612
Author:   ph10
Date:     2011-07-02 16:20:59 +0100 (Sat, 02 Jul 2011)


Log Message:
-----------
Fix two study bugs concerned with minimum subject lengths; add features to
pcretest so that all tests can be run with or without study; adjust tests so
that this happens.

Modified Paths:
--------------
    code/trunk/ChangeLog
    code/trunk/HACKING
    code/trunk/RunTest
    code/trunk/doc/pcretest.1
    code/trunk/pcre_internal.h
    code/trunk/pcre_study.c
    code/trunk/pcretest.c
    code/trunk/perltest.pl
    code/trunk/testdata/testinput11
    code/trunk/testdata/testinput2
    code/trunk/testdata/testinput5
    code/trunk/testdata/testinput7
    code/trunk/testdata/testoutput11
    code/trunk/testdata/testoutput2
    code/trunk/testdata/testoutput5
    code/trunk/testdata/testoutput7


Modified: code/trunk/ChangeLog
===================================================================
--- code/trunk/ChangeLog    2011-06-29 08:49:21 UTC (rev 611)
+++ code/trunk/ChangeLog    2011-07-02 15:20:59 UTC (rev 612)
@@ -79,8 +79,10 @@
     synonym of -m (show memory usage). I have changed it to mean "force study 
     for every regex", that is, assume /S for every regex. This is similar to -i 
     and -d etc. It's slightly incompatible, but I'm hoping nobody is still 
-    using it. It makes it easier to run collection of tests with study enabled, 
-    and thereby test pcre_study() more easily.  
+    using it. It makes it easier to run collections of tests with and without
+    study enabled, and thereby test pcre_study() more easily. All the standard 
+    tests are now run with and without -s (but some patterns can be marked as 
+    "never study" - see 20 below).


 15. When (*ACCEPT) was used in a subpattern that was called recursively, the
     restoration of the capturing data to the outer values was not happening 
@@ -101,6 +103,13 @@
 18. If a pattern containing \R was studied, it was assumed that \R always
     matched two bytes, thus causing the minimum subject length to be
     incorrectly computed because \R can also match just one byte.
+    
+19. If a pattern containing (*ACCEPT) was studied, the minimum subject length
+    was incorrectly computed. 
+    
+20. If /S is present twice on a test pattern in pcretest input, it *disables*
+    studying, thereby overriding the use of -s on the command line. This is
+    necessary for one or two tests to keep the output identical in both cases. 



Version 8.12 15-Jan-2011

Modified: code/trunk/HACKING
===================================================================
--- code/trunk/HACKING    2011-06-29 08:49:21 UTC (rev 611)
+++ code/trunk/HACKING    2011-07-02 15:20:59 UTC (rev 612)
@@ -2,7 +2,8 @@
 --------------------------


These are very rough technical notes that record potentially useful information
-about PCRE internals.
+about PCRE internals. For information about testing PCRE, see the pcretest
+documentation and the comment at the head of the RunTest file.


Historical note 1
@@ -449,4 +450,4 @@


Philip Hazel
-May 2011
+July 2011

Modified: code/trunk/RunTest
===================================================================
--- code/trunk/RunTest    2011-06-29 08:49:21 UTC (rev 611)
+++ code/trunk/RunTest    2011-07-02 15:20:59 UTC (rev 612)
@@ -1,6 +1,14 @@
 #! /bin/sh


-# Run PCRE tests.
+# Run the PCRE tests using the pcretest program. All tests are now run both
+# with and without -s, to ensure that everything is tested with and without
+# studying. However, there are some tests that produce different output after
+# studying, typically when we are tracing the actual matching process (for
+# example, using auto-callouts). In these few cases, the tests are duplicated
+# in the files, one with /S to force studying always, and one with /SS to force
+# *not* studying always. The use of -s doesn't then make any difference to
+# their output. There is also one test which compiles invalid UTF-8 with the
+# UTF-8 check turned off for which studying is disabled with /SS.

valgrind=

@@ -137,33 +145,37 @@

 if [ $do1 = yes ] ; then
   echo "Test 1: main functionality (Compatible with Perl >= 5.8)"
-  $valgrind ./pcretest -q $testdata/testinput1 testtry
-  if [ $? = 0 ] ; then
-    $cf $testdata/testoutput1 testtry
-    if [ $? != 0 ] ; then exit 1; fi
-  else exit 1
-  fi
-  echo "OK"
+  for opt in "" "-s"; do 
+    $valgrind ./pcretest -q $opt $testdata/testinput1 testtry
+    if [ $? = 0 ] ; then
+      $cf $testdata/testoutput1 testtry
+      if [ $? != 0 ] ; then exit 1; fi
+    else exit 1
+    fi
+    if [ "$opt" = "-s" ] ; then echo "OK with study" ; else echo "OK"; fi
+  done 
 fi


# PCRE tests that are not Perl-compatible - API, errors, internals

 if [ $do2 = yes ] ; then
   echo "Test 2: API, errors, internals, and non-Perl stuff"
-  $valgrind ./pcretest -q $testdata/testinput2 testtry
-  if [ $? = 0 ] ; then
-    $cf $testdata/testoutput2 testtry
-    if [ $? != 0 ] ; then exit 1; fi
-  else
-    echo " "
-    echo "** Test 2 requires a lot of stack. If it has crashed with a"
-    echo "** segmentation fault, it may be that you do not have enough"
-    echo "** stack available by default. Please see the 'pcrestack' man"
-    echo "** page for a discussion of PCRE's stack usage."
-    echo " "
-    exit 1
-  fi
-  echo "OK"
+  for opt in "" "-s"; do 
+    $valgrind ./pcretest -q $opt $testdata/testinput2 testtry
+    if [ $? = 0 ] ; then
+      $cf $testdata/testoutput2 testtry
+      if [ $? != 0 ] ; then exit 1; fi
+    else
+      echo " "
+      echo "** Test 2 requires a lot of stack. If it has crashed with a"
+      echo "** segmentation fault, it may be that you do not have enough"
+      echo "** stack available by default. Please see the 'pcrestack' man"
+      echo "** page for a discussion of PCRE's stack usage."
+      echo " "
+      exit 1
+    fi
+    if [ "$opt" = "-s" ] ; then echo "OK with study" ; else echo "OK"; fi
+  done   
 fi


# Locale-specific tests, provided that either the "fr_FR" or the "french"
@@ -191,19 +203,22 @@

   if [ "$locale" != "" ] ; then
     echo "Test 3: locale-specific features (using '$locale' locale)"
-    $valgrind ./pcretest -q $infile testtry
-    if [ $? = 0 ] ; then
-      $cf $outfile testtry
-      if [ $? != 0 ] ; then
-        echo " "
-        echo "Locale test did not run entirely successfully."
-        echo "This usually means that there is a problem with the locale"
-        echo "settings rather than a bug in PCRE."
-      else
-      echo "OK"
+    for opt in "" "-s"; do 
+      $valgrind ./pcretest -q $opt $infile testtry
+      if [ $? = 0 ] ; then
+        $cf $outfile testtry
+        if [ $? != 0 ] ; then
+          echo " "
+          echo "Locale test did not run entirely successfully."
+          echo "This usually means that there is a problem with the locale"
+          echo "settings rather than a bug in PCRE."
+          break; 
+        else
+          if [ "$opt" = "-s" ] ; then echo "OK with study" ; else echo "OK"; fi
+        fi
+      else exit 1
       fi
-    else exit 1
-    fi
+    done   
   else
     echo "Cannot test locale-specific features - neither the 'fr_FR' nor the"
     echo "'french' locale exists, or the \"locale\" command is not available"
@@ -216,70 +231,82 @@


 if [ $do4 = yes ] ; then
   echo "Test 4: UTF-8 support (Compatible with Perl >= 5.8)"
-  $valgrind ./pcretest -q $testdata/testinput4 testtry
-  if [ $? = 0 ] ; then
-    $cf $testdata/testoutput4 testtry
-    if [ $? != 0 ] ; then exit 1; fi
-  else exit 1
-  fi
-  echo "OK"
+  for opt in "" "-s"; do 
+    $valgrind ./pcretest -q $opt $testdata/testinput4 testtry
+    if [ $? = 0 ] ; then
+      $cf $testdata/testoutput4 testtry
+      if [ $? != 0 ] ; then exit 1; fi
+    else exit 1
+    fi
+    if [ "$opt" = "-s" ] ; then echo "OK with study" ; else echo "OK"; fi
+  done   
 fi


 if [ $do5 = yes ] ; then
   echo "Test 5: API, internals, and non-Perl stuff for UTF-8 support"
-  $valgrind ./pcretest -q $testdata/testinput5 testtry
-  if [ $? = 0 ] ; then
-    $cf $testdata/testoutput5 testtry
-    if [ $? != 0 ] ; then exit 1; fi
-  else exit 1
-  fi
-  echo "OK"
+  for opt in "" "-s"; do 
+    $valgrind ./pcretest -q $opt $testdata/testinput5 testtry
+    if [ $? = 0 ] ; then
+      $cf $testdata/testoutput5 testtry
+      if [ $? != 0 ] ; then exit 1; fi
+    else exit 1
+    fi
+    if [ "$opt" = "-s" ] ; then echo "OK with study" ; else echo "OK"; fi
+  done   
 fi


 if [ $do6 = yes ] ; then
   echo "Test 6: Unicode property support (Compatible with Perl >= 5.10)"
-  $valgrind ./pcretest -q $testdata/testinput6 testtry
-  if [ $? = 0 ] ; then
-    $cf $testdata/testoutput6 testtry
-    if [ $? != 0 ] ; then exit 1; fi
-  else exit 1
-  fi
-  echo "OK"
+  for opt in "" "-s"; do 
+    $valgrind ./pcretest -q $opt $testdata/testinput6 testtry
+    if [ $? = 0 ] ; then
+      $cf $testdata/testoutput6 testtry
+      if [ $? != 0 ] ; then exit 1; fi
+    else exit 1
+    fi
+    if [ "$opt" = "-s" ] ; then echo "OK with study" ; else echo "OK"; fi
+  done   
 fi


# Tests for DFA matching support

 if [ $do7 = yes ] ; then
   echo "Test 7: DFA matching"
-  $valgrind ./pcretest -q -dfa $testdata/testinput7 testtry
-  if [ $? = 0 ] ; then
-    $cf $testdata/testoutput7 testtry
-    if [ $? != 0 ] ; then exit 1; fi
-  else exit 1
-  fi
-  echo "OK"
+  for opt in "" "-s"; do 
+    $valgrind ./pcretest -q $opt -dfa $testdata/testinput7 testtry
+    if [ $? = 0 ] ; then
+      $cf $testdata/testoutput7 testtry
+      if [ $? != 0 ] ; then exit 1; fi
+    else exit 1
+    fi
+    if [ "$opt" = "-s" ] ; then echo "OK with study" ; else echo "OK"; fi
+  done   
 fi


 if [ $do8 = yes ] ; then
   echo "Test 8: DFA matching with UTF-8"
-  $valgrind ./pcretest -q -dfa $testdata/testinput8 testtry
-  if [ $? = 0 ] ; then
-    $cf $testdata/testoutput8 testtry
-    if [ $? != 0 ] ; then exit 1; fi
-  else exit 1
-  fi
-  echo "OK"
+  for opt in "" "-s"; do 
+    $valgrind ./pcretest -q $opt -dfa $testdata/testinput8 testtry
+    if [ $? = 0 ] ; then
+      $cf $testdata/testoutput8 testtry
+      if [ $? != 0 ] ; then exit 1; fi
+    else exit 1
+    fi
+    if [ "$opt" = "-s" ] ; then echo "OK with study" ; else echo "OK"; fi
+  done   
 fi


 if [ $do9 = yes ] ; then
   echo "Test 9: DFA matching with Unicode properties"
-  $valgrind ./pcretest -q -dfa $testdata/testinput9 testtry
-  if [ $? = 0 ] ; then
-    $cf $testdata/testoutput9 testtry
-    if [ $? != 0 ] ; then exit 1; fi
-  else exit 1
-  fi
-  echo "OK"
+  for opt in "" "-s"; do 
+    $valgrind ./pcretest -q $opt -dfa $testdata/testinput9 testtry
+    if [ $? = 0 ] ; then
+      $cf $testdata/testoutput9 testtry
+      if [ $? != 0 ] ; then exit 1; fi
+    else exit 1
+    fi
+    if [ "$opt" = "-s" ] ; then echo "OK with study" ; else echo "OK"; fi
+  done   
 fi


# Test of internal offsets and code sizes. This test is run only when there
@@ -290,39 +317,45 @@

 if [ $do10 = yes ] ; then
   echo "Test 10: Internal offsets and code size tests"
-  $valgrind ./pcretest -q $testdata/testinput10 testtry
-  if [ $? = 0 ] ; then
-    $cf $testdata/testoutput10 testtry
-    if [ $? != 0 ] ; then exit 1; fi
-  else exit 1
-  fi
-  echo "OK"
+  for opt in "" "-s"; do 
+    $valgrind ./pcretest -q $opt $testdata/testinput10 testtry
+    if [ $? = 0 ] ; then
+      $cf $testdata/testoutput10 testtry
+      if [ $? != 0 ] ; then exit 1; fi
+    else exit 1
+    fi
+    if [ "$opt" = "-s" ] ; then echo "OK with study" ; else echo "OK"; fi
+  done   
 fi


# Test of Perl >= 5.10 features

 if [ $do11 = yes ] ; then
   echo "Test 11: Features from Perl >= 5.10"
-  $valgrind ./pcretest -q $testdata/testinput11 testtry
-  if [ $? = 0 ] ; then
-    $cf $testdata/testoutput11 testtry
-    if [ $? != 0 ] ; then exit 1; fi
-  else exit 1
-  fi
-  echo "OK"
+  for opt in "" "-s"; do 
+    $valgrind ./pcretest -q $opt $testdata/testinput11 testtry
+    if [ $? = 0 ] ; then
+      $cf $testdata/testoutput11 testtry
+      if [ $? != 0 ] ; then exit 1; fi
+    else exit 1
+    fi
+    if [ "$opt" = "-s" ] ; then echo "OK with study" ; else echo "OK"; fi
+  done   
 fi


# Test non-Perl-compatible Unicode property support

 if [ $do12 = yes ] ; then
   echo "Test 12: API, internals, and non-Perl stuff for Unicode property support"
-  $valgrind ./pcretest -q $testdata/testinput12 testtry
-  if [ $? = 0 ] ; then
-    $cf $testdata/testoutput12 testtry
-    if [ $? != 0 ] ; then exit 1; fi
-  else exit 1
-  fi
-  echo "OK"
+  for opt in "" "-s"; do 
+    $valgrind ./pcretest -q $opt $testdata/testinput12 testtry
+    if [ $? = 0 ] ; then
+      $cf $testdata/testoutput12 testtry
+      if [ $? != 0 ] ; then exit 1; fi
+    else exit 1
+    fi
+    if [ "$opt" = "-s" ] ; then echo "OK with study" ; else echo "OK"; fi
+  done   
 fi


# End

Modified: code/trunk/doc/pcretest.1
===================================================================
--- code/trunk/doc/pcretest.1    2011-06-29 08:49:21 UTC (rev 611)
+++ code/trunk/doc/pcretest.1    2011-07-02 15:20:59 UTC (rev 612)
@@ -4,7 +4,7 @@
 .SH SYNOPSIS
 .rs
 .sp
-.B pcretest "[options] [source] [destination]"
+.B pcretest "[options] [input file [output file]]"
 .sp
 \fBpcretest\fP was written as a test program for the PCRE regular expression
 library itself, but it can also be used for experimenting with regular
@@ -18,14 +18,17 @@
 .\" HREF
 \fBpcreapi\fP
 .\"
-documentation.
+documentation. The input for \fBpcretest\fP is a sequence of regular expression 
+patterns and strings to be matched, as described below. The output shows the 
+result of each match. Options on the command line and the patterns control PCRE 
+options and exactly what is output.
 .
 .
-.SH OPTIONS
+.SH COMMAND LINE OPTIONS
 .rs
 .TP 10
 \fB-b\fP
-Behave as if each regex has the \fB/B\fP (show byte code) modifier; the
+Behave as if each pattern has the \fB/B\fP (show byte code) modifier; the
 internal form is output after compilation.
 .TP 10
 \fB-C\fP
@@ -33,7 +36,7 @@
 about the optional features that are included, and then exit.
 .TP 10
 \fB-d\fP
-Behave as if each regex has the \fB/D\fP (debug) modifier; the internal
+Behave as if each pattern has the \fB/D\fP (debug) modifier; the internal
 form and information about the compiled pattern is output after compilation;
 \fB-d\fP is equivalent to \fB-b -i\fP.
 .TP 10
@@ -46,7 +49,7 @@
 Output a brief summary these options and then exit.
 .TP 10
 \fB-i\fP
-Behave as if each regex has the \fB/I\fP modifier; information about the
+Behave as if each pattern has the \fB/I\fP modifier; information about the
 compiled pattern is given after compilation.
 .TP 10
 \fB-M\fP
@@ -67,7 +70,7 @@
 below).
 .TP 10
 \fB-p\fP
-Behave as if each regex has the \fB/P\fP modifier; the POSIX wrapper API is
+Behave as if each pattern has the \fB/P\fP modifier; the POSIX wrapper API is
 used to call PCRE. None of the other options has any effect when \fB-p\fP is
 set.
 .TP 10
@@ -79,8 +82,21 @@
 megabytes.
 .TP 10
 \fB-s\fP
-Behave as if each regex has the \fB/S\fP modifier; in other words, force each 
-regex to be studied.
+Behave as if each pattern has the \fB/S\fP modifier; in other words, force each
+pattern to be studied. If the \fB/I\fP or \fB/D\fP option is present on a
+pattern (requesting output about the compiled pattern), information about the
+result of studying is not included when studying is caused only by \fB-s\fP and
+neither \fB-i\fP nor \fB-d\fP is present on the command line. This behaviour
+means that the output from tests that are run with and without \fB-s\fP should
+be identical, except when options that output information about the actual
+running of a match are set. The \fB-M\fP, \fB-t\fP, and \fB-tm\fP options,
+which give information about resources used, are likely to produce different
+output with and without \fB-s\fP. Output may also differ if the \fB/C\fP option
+is present on an individual pattern. This uses callouts to trace the the
+matching process, and this may be different between studied and non-studied
+patterns. If the pattern contains (*MARK) items there may also be differences,
+for the same reason. The \fB-s\fP command line option can be overridden for
+specific patterns that should never be studied (see the /S option below).
 .TP 10
 \fB-t\fP
 Run each compile, study, and match many times with a timer, and output
@@ -193,10 +209,10 @@
   \fB/<bsr_unicode>\fP  PCRE_BSR_UNICODE
 .sp
 The modifiers that are enclosed in angle brackets are literal strings as shown,
-including the angle brackets, but the letters can be in either case. This
-example sets multiline matching with CRLF as the line ending sequence:
+including the angle brackets, but the letters within can be in either case.
+This example sets multiline matching with CRLF as the line ending sequence:
 .sp
-  /^abc/m<crlf>
+  /^abc/m<CRLF>
 .sp
 As well as turning on the PCRE_UTF8 option, the \fB/8\fP modifier also causes
 any non-printing characters in output strings to be printed using the
@@ -290,9 +306,13 @@
 The \fB/M\fP modifier causes the size of memory block used to hold the compiled
 pattern to be output.
 .P
-The \fB/S\fP modifier causes \fBpcre_study()\fP to be called after the
-expression has been compiled, and the results used when the expression is
-matched.
+If the \fB/S\fP modifier appears once, it causes \fBpcre_study()\fP to be
+called after the expression has been compiled, and the results used when the
+expression is matched. If \fB/S\fP appears twice, it suppresses studying, even 
+if it was requested externally by the \fB-s\fP command line option. This makes 
+it possible to specify that certain patterns are always studied, and others are 
+never studied, independently of \fB-s\fP. This feature is used in the test 
+files in a few cases where the output is different when the pattern is studied.
 .P
 The \fB/T\fP modifier must be followed by a single digit. It causes a specific
 set of built-in character tables to be passed to \fBpcre_compile()\fP. It is
@@ -746,7 +766,7 @@
 For example:
 .sp
    re> </some/file
-  Compiled regex loaded from /some/file
+  Compiled pattern loaded from /some/file
   No study data
 .sp
 When the pattern has been loaded, \fBpcretest\fP proceeds to read data lines in
@@ -792,6 +812,6 @@
 .rs
 .sp
 .nf
-Last updated: 06 June 2011
+Last updated: 02 July 2011
 Copyright (c) 1997-2011 University of Cambridge.
 .fi


Modified: code/trunk/pcre_internal.h
===================================================================
--- code/trunk/pcre_internal.h    2011-06-29 08:49:21 UTC (rev 611)
+++ code/trunk/pcre_internal.h    2011-07-02 15:20:59 UTC (rev 612)
@@ -595,10 +595,10 @@
 #define PCRE_JCHANGED      0x0010  /* j option used in regex */
 #define PCRE_HASCRORLF     0x0020  /* explicit \r or \n in pattern */


-/* Options for the "extra" block produced by pcre_study(). */
+/* Flags for the "extra" block produced by pcre_study(). */

-#define PCRE_STUDY_MAPPED   0x01     /* a map of starting chars exists */
-#define PCRE_STUDY_MINLEN   0x02     /* a minimum length field exists */
+#define PCRE_STUDY_MAPPED  0x0001  /* a map of starting chars exists */
+#define PCRE_STUDY_MINLEN  0x0002  /* a minimum length field exists */


/* Masks for identifying the public options that are permitted at compile
time, run time, or study time, respectively. */
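
Note on the flags above: PCRE_STUDY_MINLEN marks a study block that carries a minimum subject length, which is why the two study bugs fixed in this revision matter. A matcher that trusts too large a minimum can reject a subject that would in fact match. The following is only a sketch of that check. The struct toy_study_data and the function subject_long_enough() are hypothetical names made up for illustration, and the struct is not PCRE's real study-data layout.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Flag values as defined in the pcre_internal.h hunk above. */
#define PCRE_STUDY_MAPPED  0x0001  /* a map of starting chars exists */
#define PCRE_STUDY_MINLEN  0x0002  /* a minimum length field exists */

/* Hypothetical stand-in for the study block; not PCRE's real layout. */
typedef struct {
  unsigned int flags;
  unsigned int minlength;
} toy_study_data;

/* Returns false only when the subject is provably too short to match. */
bool subject_long_enough(const toy_study_data *study, size_t subject_length)
{
  assert(study == NULL || study->flags <= 0xffffu);
  if (study == NULL) return true;                        /* never studied */
  if ((study->flags & PCRE_STUDY_MINLEN) == 0) return true;  /* no minimum */
  return subject_length >= study->minlength;
}
```

With a wrongly computed minimum of 4 for /(A (A|B(*ACCEPT)|C) D)(E)/x, such a check would reject the two-character subject AB, which is the case added to testinput11 in this revision.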

Modified: code/trunk/pcre_study.c
===================================================================
--- code/trunk/pcre_study.c    2011-06-29 08:49:21 UTC (rev 611)
+++ code/trunk/pcre_study.c    2011-07-02 15:20:59 UTC (rev 612)
@@ -66,9 +66,10 @@
 rather than bytes.


 Arguments:
-  code       pointer to start of group (the bracket)
-  startcode  pointer to start of the whole pattern
-  options    the compiling options
+  code        pointer to start of group (the bracket)
+  startcode   pointer to start of the whole pattern
+  options     the compiling options
+  had_accept  pointer to flag for (*ACCEPT) encountered


 Returns:   the minimum length
            -1 if \C was encountered
@@ -77,7 +78,8 @@
 */


 static int
-find_minlength(const uschar *code, const uschar *startcode, int options)
+find_minlength(const uschar *code, const uschar *startcode, int options,
+  BOOL *had_accept_ptr)
 {
 int length = -1;
 BOOL utf8 = (options & PCRE_UTF8) != 0;
@@ -125,17 +127,23 @@
     case OP_BRAPOS:
     case OP_SBRAPOS:
     case OP_ONCE:
-    d = find_minlength(cc, startcode, options);
+    d = find_minlength(cc, startcode, options, had_accept_ptr);
     if (d < 0) return d;
     branchlength += d;
+    if (*had_accept_ptr) return branchlength; 
     do cc += GET(cc, 1); while (*cc == OP_ALT);
     cc += 1 + LINK_SIZE;
     break;


     /* Reached end of a branch; if it's a ket it is the end of a nested
-    call. If it's ALT it is an alternation in a nested call. If it is
-    END it's the end of the outer call. All can be handled by the same code. */
+    call. If it's ALT it is an alternation in a nested call. If it is END it's
+    the end of the outer call. All can be handled by the same code. If it is
+    ACCEPT, it is essentially the same as END, but we set a flag so that
+    counting stops. */


+    case OP_ACCEPT: 
+    *had_accept_ptr = TRUE;
+    /* Fall through */ 
     case OP_ALT:
     case OP_KET:
     case OP_KETRMAX:
@@ -144,7 +152,7 @@
     case OP_END:
     if (length < 0 || (!had_recurse && branchlength < length))
       length = branchlength;
-    if (*cc != OP_ALT) return length;
+    if (op != OP_ALT) return length;
     cc += 1 + LINK_SIZE;
     branchlength = 0;
     had_recurse = FALSE;
@@ -367,7 +375,11 @@
         d = 0;
         had_recurse = TRUE;
         }
-      else d = find_minlength(cs, startcode, options);
+      else 
+        {
+        d = find_minlength(cs, startcode, options, had_accept_ptr);
+        *had_accept_ptr = FALSE; 
+        } 
       }
     else d = 0;
     cc += 3;
@@ -411,7 +423,10 @@
     if (cc > cs && cc < ce)
       had_recurse = TRUE;
     else
-      branchlength += find_minlength(cs, startcode, options);
+      { 
+      branchlength += find_minlength(cs, startcode, options, had_accept_ptr);
+      *had_accept_ptr = FALSE;
+      }  
     cc += 1 + LINK_SIZE;
     break;


@@ -479,10 +494,9 @@
     case OP_THEN_ARG:
     cc += _pcre_OP_lengths[op] + cc[1+LINK_SIZE];
     break;
-
+    
     /* The remaining opcodes are just skipped over. */


-    case OP_ACCEPT:
     case OP_CLOSE:
     case OP_COMMIT:
     case OP_FAIL:
@@ -688,6 +702,7 @@
   while (try_next)    /* Loop for items in this branch */
     {
     int rc;
+
     switch(*tcode)
       {
       /* If we reach something we don't understand, it means a new opcode has
@@ -1200,6 +1215,7 @@
 {
 int min;
 BOOL bits_set = FALSE;
+BOOL had_accept = FALSE;
 uschar start_bits[32];
 pcre_extra *extra;
 pcre_study_data *study;
@@ -1257,7 +1273,7 @@


/* Find the minimum length of subject string. */

-switch(min = find_minlength(code, code, re->options))
+switch(min = find_minlength(code, code, re->options, &had_accept))
{
case -2: *errorptr = "internal error: missing capturing bracket"; break;
case -3: *errorptr = "internal error: opcode not recognized"; break;
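
Note on the find_minlength() change above: the idea is that once (*ACCEPT) has been seen, nothing that follows it in the pattern is required, so counting has to stop. The sketch below illustrates that idea on a toy pattern tree. It is an assumption-laden simplification, not the committed code: the node type, the node kinds, and both function names are hypothetical, and the real function walks compiled opcodes rather than a tree. The toy version also takes the minimum over all alternatives of a group before stopping, which gives a conservative lower bound.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy pattern tree. These node kinds are illustrative, not PCRE opcodes. */
typedef enum { N_LITERAL, N_ACCEPT, N_SEQ, N_ALT } node_kind;

typedef struct node {
  node_kind kind;
  int length;                  /* N_LITERAL: characters consumed */
  const struct node *kids[4];  /* N_SEQ, N_ALT: child nodes */
  size_t nkids;
} node;

/* Lower bound on the subject length needed for a match. */
int min_length(const node *n, bool *had_accept)
{
  int total = 0, best = -1;
  size_t i;

  assert(n != NULL && had_accept != NULL);
  switch (n->kind)
    {
    case N_LITERAL:
    return n->length;

    case N_ACCEPT:
    *had_accept = true;          /* tell callers to stop counting */
    return 0;

    case N_SEQ:
    for (i = 0; i < n->nkids; i++)
      {
      total += min_length(n->kids[i], had_accept);
      if (*had_accept) break;    /* nothing after (*ACCEPT) is required */
      }
    return total;

    case N_ALT:
    for (i = 0; i < n->nkids; i++)
      {
      bool branch_accept = false;
      int d = min_length(n->kids[i], &branch_accept);
      if (best < 0 || d < best) best = d;
      if (branch_accept) *had_accept = true;
      }
    return best < 0 ? 0 : best;
    }
  return 0;
}

/* /(A (A|B(*ACCEPT)|C) D)(E)/x from testinput11, capture groups ignored. */
int accept_example_min_length(void)
{
  const node lit = { N_LITERAL, 1, { NULL }, 0 };  /* any single letter */
  const node acc = { N_ACCEPT, 0, { NULL }, 0 };
  const node b_accept = { N_SEQ, 0, { &lit, &acc }, 2 };
  const node alt = { N_ALT, 0, { &lit, &b_accept, &lit }, 3 };
  const node pattern = { N_SEQ, 0, { &lit, &alt, &lit, &lit }, 4 };
  bool had_accept = false;

  return min_length(&pattern, &had_accept);
}
```

For this pattern the sketch yields 2, which is consistent with the new AB subject in testinput11 matching. If the flag were ignored and counting continued past the group, the same tree would yield 4.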

Modified: code/trunk/pcretest.c
===================================================================
--- code/trunk/pcretest.c    2011-06-29 08:49:21 UTC (rev 611)
+++ code/trunk/pcretest.c    2011-07-02 15:20:59 UTC (rev 612)
@@ -1436,6 +1436,7 @@
   size_t size, regex_gotten_store;
   int do_mark = 0;
   int do_study = 0;
+  int no_force_study = 0; 
   int do_debug = debug;
   int do_G = 0;
   int do_g = 0;
@@ -1502,7 +1503,7 @@
         }
       }


-    fprintf(outfile, "Compiled regex%s loaded from %s\n",
+    fprintf(outfile, "Compiled pattern%s loaded from %s\n",
       do_flip? " (byte-inverted)" : "", p);


     /* Need to know if UTF-8 for printing data strings */
@@ -1510,7 +1511,7 @@
     new_info(re, NULL, PCRE_INFO_OPTIONS, &get_options);
     use_utf8 = (get_options & PCRE_UTF8) != 0;


-    /* Now see if there is any following study data */
+    /* Now see if there is any following study data. */


     if (true_study_size != 0)
       {
@@ -1624,7 +1625,14 @@
       case 'P': do_posix = 1; break;
 #endif


-      case 'S': do_study = 1; break;
+      case 'S': 
+      if (do_study == 0) do_study = 1; else
+        {
+        do_study = 0;
+        no_force_study = 1;
+        }    
+      break;
+
       case 'U': options |= PCRE_UNGREEDY; break;
       case 'W': options |= PCRE_UCP; break;
       case 'X': options |= PCRE_EXTRA; break;
@@ -1808,10 +1816,12 @@
     true_size = ((real_pcre *)re)->size;
     regex_gotten_store = gotten_store;


-    /* If -s or /S was present, study the regexp to generate additional info to
-    help with the matching. */
+    /* If -s or /S was present, study the regex to generate additional info to
+    help with the matching, unless the pattern has the SS option, which 
+    suppresses the effect of /S (used for a few test patterns where studying is
+    never sensible). */


-    if (do_study || force_study)
+    if (do_study || (force_study && !no_force_study))
       {
       if (timeit > 0)
         {
@@ -2049,9 +2059,12 @@
       /* Don't output study size; at present it is in any case a fixed
       value, but it varies, depending on the computer architecture, and
       so messes up the test suite. (And with the /F option, it might be
-      flipped.) */
+      flipped.) If study was forced by an external -s, don't show this 
+      information unless -i or -d was also present. This means that, except
+      when auto-callouts are involved, the output from runs with and without
+      -s should be identical. */


-      if (do_study || force_study)
+      if (do_study || (force_study && showinfo && !no_force_study))
         {
         if (extra == NULL)
           fprintf(outfile, "Study returned NULL\n");
@@ -2129,7 +2142,11 @@
           }
         else
           {
-          fprintf(outfile, "Compiled regex written to %s\n", to_file);
+          fprintf(outfile, "Compiled pattern written to %s\n", to_file);
+          
+          /* If there is study data, write it, but verify the writing only
+          if the studying was requested by /S, not just by -s. */
+ 
           if (extra != NULL)
             {
             if (fwrite(extra->study_data, 1, true_study_size, f) <
@@ -2139,7 +2156,6 @@
                 strerror(errno));
               }
             else fprintf(outfile, "Study data written to %s\n", to_file);
-
             }
           }
         fclose(f);
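
Note on the /S handling above: a single /S requests studying, a second /S cancels the request and also blocks the command-line -s. The sketch below restates that decision as two small functions, assuming only the variable names visible in the diff (do_study, no_force_study, force_study). The struct and function names are hypothetical, and the real pcretest keeps these values as local ints inside its main loop.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical holder for the two per-pattern flags used in the diff. */
typedef struct {
  bool do_study;        /* pattern asked for studying via /S */
  bool no_force_study;  /* pattern had /SS, so -s must be ignored */
} study_flags;

/* Scan a modifier string such as "ISS" and apply the /S toggle. */
study_flags parse_study_modifiers(const char *modifiers)
{
  study_flags f = { false, false };
  const char *p;

  assert(modifiers != NULL);
  for (p = modifiers; *p != '\0'; p++)
    {
    if (*p != 'S') continue;
    if (!f.do_study) f.do_study = true;
    else
      {
      f.do_study = false;       /* second S: cancel the request */
      f.no_force_study = true;  /* and override -s as well */
      }
    }
  return f;
}

/* force_study corresponds to -s on the command line. */
bool should_study(study_flags f, bool force_study)
{
  return f.do_study || (force_study && !f.no_force_study);
}
```

Under these assumptions a pattern marked /IS is studied with or without -s, a pattern marked /ISS is never studied, and an unmarked pattern follows -s.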


Modified: code/trunk/perltest.pl
===================================================================
--- code/trunk/perltest.pl    2011-06-29 08:49:21 UTC (rev 611)
+++ code/trunk/perltest.pl    2011-07-02 15:20:59 UTC (rev 612)
@@ -103,6 +103,10 @@


$pattern =~ s/W(?=[a-zA-Z]*$)//;

+ # Remove /S or /SS from a pattern (asks pcretest to study or not to study)
+
+ $pattern =~ s/S(?=[a-zA-Z]*$)//g;
+
# Check that the pattern is valid

eval "\$_ =~ ${pattern}";

Modified: code/trunk/testdata/testinput11
===================================================================
--- code/trunk/testdata/testinput11    2011-06-29 08:49:21 UTC (rev 611)
+++ code/trunk/testdata/testinput11    2011-07-02 15:20:59 UTC (rev 612)
@@ -246,6 +246,7 @@
     aaabccc


 /(A (A|B(*ACCEPT)|C) D)(E)/x
+    AB
     ABX
     AADE
     ACDE
@@ -403,7 +404,10 @@
     AC
     CB    


-/(*MARK:A)(*SKIP:B)(C|X)/K
+/--- Force no study, otherwise mark is not seen. The studied version is in
+     test 2 because it isn't Perl-compatible. ---/
+
+/(*MARK:A)(*SKIP:B)(C|X)/KSS
     C
     D


@@ -435,9 +439,9 @@
 /A(*:A)A+(*SKIP:A)(B|Z) | AC/xK
     AAAC


-/--- Don't loop! ---/
+/--- Don't loop! Force no study, otherwise mark is not seen. ---/

-/(*:A)A+(*SKIP:A)(B|Z)/K
+/(*:A)A+(*SKIP:A)(B|Z)/KSS
     AAAC


/--- This should succeed, as a non-existent skip name disables the skip ---/

Modified: code/trunk/testdata/testinput2
===================================================================
--- code/trunk/testdata/testinput2    2011-06-29 08:49:21 UTC (rev 611)
+++ code/trunk/testdata/testinput2    2011-07-02 15:20:59 UTC (rev 612)
@@ -1061,11 +1061,16 @@
 /abc(?C)de(?C1)f/I
     123abcdef


-/(?C1)\dabc(?C2)def/I
+/(?C1)\dabc(?C2)def/IS
     1234abcdef
     *** Failers
     abcdef


+/(?C1)\dabc(?C2)def/ISS
+    1234abcdef
+    *** Failers
+    abcdef
+
 /(?C255)ab/I


/(?C256)ab/I
@@ -1310,29 +1315,44 @@
abcde
abcdfe

-/a*b/ICDZ
+/a*b/ICDZS
ab
aaaab
aaaacb

+/a*b/ICDZSS
+ ab
+ aaaab
+ aaaacb
+
/a+b/ICDZ
ab
aaaab
aaaacb

-/(abc|def)x/ICDZ
+/(abc|def)x/ICDZS
abcx
defx
+ ** Failers
abcdefzx

+/(abc|def)x/ICDZSS
+ abcx
+ defx
+ ** Failers
+ abcdefzx
+
/(ab|cd){3,4}/IC
ababab
abcdabcd
abcdcdcdcdcd

-/([ab]{,4}c|xy)/ICDZ
+/([ab]{,4}c|xy)/ICDZS
     Note: that { does NOT introduce a quantifier


+/([ab]{,4}c|xy)/ICDZSS
+    Note: that { does NOT introduce a quantifier
+
 /([ab]{1,4}c|xy){4,5}?123/ICDZ
     aacaacaacaacaac123


@@ -1404,30 +1424,54 @@
     1X
     123456\P


-/abc/I>testsavedregex
+/abc/IS>testsavedregex
 <testsavedregex
     abc
     ** Failers
     bca


-/abc/IF>testsavedregex
+/abc/ISS>testsavedregex
 <testsavedregex
     abc
     ** Failers
     bca


+/abc/IFS>testsavedregex
+<testsavedregex
+    abc
+    ** Failers
+    bca
+
+/abc/IFSS>testsavedregex
+<testsavedregex
+    abc
+    ** Failers
+    bca
+
 /(a|b)/IS>testsavedregex
 <testsavedregex
     abc
     ** Failers
     def


+/(a|b)/ISS>testsavedregex
+<testsavedregex
+    abc
+    ** Failers
+    def
+
 /(a|b)/ISF>testsavedregex
 <testsavedregex
     abc
     ** Failers
     def


+/(a|b)/ISSF>testsavedregex
+<testsavedregex
+    abc
+    ** Failers
+    def
+
 ~<(\w+)/?>(.)*</(\1)>~smgI
     <!DOCTYPE seite SYSTEM "http://www.lco.lineas.de/xmlCms.dtd">\n<seite>\n<dokumenteninformation>\n<seitentitel>Partner der LCO</seitentitel>\n<sprache>de</sprache>\n<seitenbeschreibung>Partner der LINEAS Consulting\nGmbH</seitenbeschreibung>\n<schluesselworte>LINEAS Consulting GmbH Hamburg\nPartnerfirmen</schluesselworte>\n<revisit>30 days</revisit>\n<robots>index,follow</robots>\n<menueinformation>\n<aktiv>ja</aktiv>\n<menueposition>3</menueposition>\n<menuetext>Partner</menuetext>\n</menueinformation>\n<lastedited>\n<autor>LCO</autor>\n<firma>LINEAS Consulting</firma>\n<datum>15.10.2003</datum>\n</lastedited>\n</dokumenteninformation>\n<inhalt>\n\n<absatzueberschrift>Die Partnerfirmen der LINEAS Consulting\nGmbH</absatzueberschrift>\n\n<absatz><link ziel="http://www.ca.com/" zielfenster="_blank">\n<bild name="logo_ca.gif" rahmen="no"/></link> <link\nziel="http://www.ey.com/" zielfenster="_blank"><bild\nname="logo_euy.gif" rahmen="no"/></link>\n</absatz>\n\n<absatz><link ziel="http://www.cisco.de/" zielfenster="_blank">\n<bild name="logo_cisco.gif" rahmen="ja"/></link></absatz>\n\n<absatz><link ziel="http://www.atelion.de/"\nzielfenster="_blank"><bild\nname="logo_atelion.gif" rahmen="no"/></link>\n</absatz>\n\n<absatz><link ziel="http://www.line-information.de/"\nzielfenster="_blank">\n<bild name="logo_line_information.gif" rahmen="no"/></link>\n</absatz>\n\n<absatz><bild name="logo_aw.gif" rahmen="no"/></absatz>\n\n<absatz><link ziel="http://www.incognis.de/"\nzielfenster="_blank"><bild\nname="logo_incognis.gif" rahmen="no"/></link></absatz>\n\n<absatz><link ziel="http://www.addcraft.com/"\nzielfenster="_blank"><bild\nname="logo_addcraft.gif" rahmen="no"/></link></absatz>\n\n<absatz><link ziel="http://www.comendo.com/"\nzielfenster="_blank"><bild\nname="logo_comendo.gif" rahmen="no"/></link></absatz>\n\n</inhalt>\n</seite>


@@ -3312,14 +3356,22 @@
 /A(*PRUNE:A)B/K
     ACAB


-/(*MARK:A)(*PRUNE:B)(C|X)/K
+/(*MARK:A)(*PRUNE:B)(C|X)/KS
     C
     D 


-/(*MARK:A)(*THEN:B)(C|X)/K
+/(*MARK:A)(*PRUNE:B)(C|X)/KSS
     C
     D 


+/(*MARK:A)(*THEN:B)(C|X)/KS
+    C
+    D 
+
+/(*MARK:A)(*THEN:B)(C|X)/KSS
+    C
+    D 
+
 /--- This should fail, as the skip causes a bump to offset 3 (the skip) ---/


 /A(*MARK:A)A+(*SKIP)(B|Z) | AC/xK
@@ -3681,4 +3733,16 @@

 /-- --/

+/-- These studied versions are here because they are not Perl-compatible; the
+    studying means the mark is not seen. --/
+
+/(*MARK:A)(*SKIP:B)(C|X)/KS
+    C
+    D
+     
+/(*:A)A+(*SKIP:A)(B|Z)/KS
+    AAAC
+
+/-- --/
+
 /-- End of testinput2 --/


Modified: code/trunk/testdata/testinput5
===================================================================
--- code/trunk/testdata/testinput5    2011-06-29 08:49:21 UTC (rev 611)
+++ code/trunk/testdata/testinput5    2011-07-02 15:20:59 UTC (rev 612)
@@ -198,7 +198,7 @@


 /\xC3\xC3\xC3xxx/8

-/\xC3\xC3\xC3xxx/8?DZ
+/\xC3\xC3\xC3xxx/8?DZSS

 /abc/8
     \xC3]


Modified: code/trunk/testdata/testinput7
===================================================================
--- code/trunk/testdata/testinput7    2011-06-29 08:49:21 UTC (rev 611)
+++ code/trunk/testdata/testinput7    2011-07-02 15:20:59 UTC (rev 612)
@@ -3973,13 +3973,13 @@
     ac
     bbbbc


-/abc/>testsavedregex
+/abc/SS>testsavedregex
 <testsavedregex
     abc
     *** Failers
     bca


-/abc/F>testsavedregex
+/abc/FSS>testsavedregex
 <testsavedregex
     abc
     *** Failers


Modified: code/trunk/testdata/testoutput11
===================================================================
--- code/trunk/testdata/testoutput11    2011-06-29 08:49:21 UTC (rev 611)
+++ code/trunk/testdata/testoutput11    2011-07-02 15:20:59 UTC (rev 612)
@@ -501,6 +501,10 @@
 No match


 /(A (A|B(*ACCEPT)|C) D)(E)/x
+    AB
+ 0: AB
+ 1: AB
+ 2: B
     ABX
  0: AB
  1: AB
@@ -821,7 +825,10 @@
     CB    
 No match, mark = B


-/(*MARK:A)(*SKIP:B)(C|X)/K
+/--- Force no study, otherwise mark is not seen. The studied version is in
+     test 2 because it isn't Perl-compatible. ---/
+
+/(*MARK:A)(*SKIP:B)(C|X)/KSS
     C
  0: C
  1: C
@@ -864,9 +871,9 @@
     AAAC
  0: AC


-/--- Don't loop! ---/
+/--- Don't loop! Force no study, otherwise mark is not seen. ---/

-/(*:A)A+(*SKIP:A)(B|Z)/K
+/(*:A)A+(*SKIP:A)(B|Z)/KSS
     AAAC
 No match, mark = A



Modified: code/trunk/testdata/testoutput2
===================================================================
--- code/trunk/testdata/testoutput2    2011-06-29 08:49:21 UTC (rev 611)
+++ code/trunk/testdata/testoutput2    2011-07-02 15:20:59 UTC (rev 612)
@@ -3580,11 +3580,13 @@
   1    ^    ^     f
  0: abcdef


-/(?C1)\dabc(?C2)def/I
+/(?C1)\dabc(?C2)def/IS
 Capturing subpattern count = 0
 No options
 No first char
 Need char = 'f'
+Subject length lower bound = 7
+Starting byte set: 0 1 2 3 4 5 6 7 8 9 
     1234abcdef
 --->1234abcdef
   1 ^              \d
@@ -3596,6 +3598,24 @@
     *** Failers
 No match
     abcdef
+No match
+
+/(?C1)\dabc(?C2)def/ISS
+Capturing subpattern count = 0
+No options
+No first char
+Need char = 'f'
+    1234abcdef
+--->1234abcdef
+  1 ^              \d
+  1  ^             \d
+  1   ^            \d
+  1    ^           \d
+  2    ^   ^       d
+ 0: 4abcdef
+    *** Failers
+No match
+    abcdef
 --->abcdef
   1 ^          \d
   1  ^         \d
@@ -4778,7 +4798,7 @@
  +4 ^   ^      e
 No match


-/a*b/ICDZ
+/a*b/ICDZS
 ------------------------------------------------------------------
         Bra
         Callout 255 0 2
@@ -4793,6 +4813,8 @@
 Options:
 No first char
 Need char = 'b'
+Subject length lower bound = 1
+Starting byte set: a b 
   ab
 --->ab
  +0 ^      a*
@@ -4815,6 +4837,48 @@
  +2   ^ ^      b
  +0    ^       a*
  +2    ^^      b
+ +0      ^     a*
+ +2      ^     b
+ +3      ^^    
+ 0: b
+
+/a*b/ICDZSS
+------------------------------------------------------------------
+        Bra
+        Callout 255 0 2
+        a*+
+        Callout 255 2 1
+        b
+        Callout 255 3 0
+        Ket
+        End
+------------------------------------------------------------------
+Capturing subpattern count = 0
+Options:
+No first char
+Need char = 'b'
+  ab
+--->ab
+ +0 ^      a*
+ +2 ^^     b
+ +3 ^ ^    
+ 0: ab
+  aaaab
+--->aaaab
+ +0 ^         a*
+ +2 ^   ^     b
+ +3 ^    ^    
+ 0: aaaab
+  aaaacb
+--->aaaacb
+ +0 ^          a*
+ +2 ^   ^      b
+ +0  ^         a*
+ +2  ^  ^      b
+ +0   ^        a*
+ +2   ^ ^      b
+ +0    ^       a*
+ +2    ^^      b
  +0     ^      a*
  +2     ^      b
  +0      ^     a*
@@ -4861,7 +4925,7 @@
  +2    ^^      b
 No match


-/(abc|def)x/ICDZ
+/(abc|def)x/ICDZS
 ------------------------------------------------------------------
         Bra
         Callout 255 0 9
@@ -4892,6 +4956,8 @@
 Options:
 No first char
 Need char = 'x'
+Subject length lower bound = 4
+Starting byte set: a d 
   abcx
 --->abcx
  +0 ^        (abc|def)
@@ -4915,6 +4981,8 @@
 +10 ^   ^    
  0: defx
  1: def
+  ** Failers 
+No match
   abcdefzx
 --->abcdefzx
  +0 ^            (abc|def)
@@ -4924,6 +4992,80 @@
  +4 ^  ^         |
  +9 ^  ^         x
  +5 ^            d
+ +0    ^         (abc|def)
+ +1    ^         a
+ +5    ^         d
+ +6    ^^        e
+ +7    ^ ^       f
+ +8    ^  ^      )
+ +9    ^  ^      x
+No match
+
+/(abc|def)x/ICDZSS
+------------------------------------------------------------------
+        Bra
+        Callout 255 0 9
+        CBra 1
+        Callout 255 1 1
+        a
+        Callout 255 2 1
+        b
+        Callout 255 3 1
+        c
+        Callout 255 4 0
+        Alt
+        Callout 255 5 1
+        d
+        Callout 255 6 1
+        e
+        Callout 255 7 1
+        f
+        Callout 255 8 0
+        Ket
+        Callout 255 9 1
+        x
+        Callout 255 10 0
+        Ket
+        End
+------------------------------------------------------------------
+Capturing subpattern count = 1
+Options:
+No first char
+Need char = 'x'
+  abcx
+--->abcx
+ +0 ^        (abc|def)
+ +1 ^        a
+ +2 ^^       b
+ +3 ^ ^      c
+ +4 ^  ^     |
+ +9 ^  ^     x
++10 ^   ^    
+ 0: abcx
+ 1: abc
+  defx
+--->defx
+ +0 ^        (abc|def)
+ +1 ^        a
+ +5 ^        d
+ +6 ^^       e
+ +7 ^ ^      f
+ +8 ^  ^     )
+ +9 ^  ^     x
++10 ^   ^    
+ 0: defx
+ 1: def
+  ** Failers 
+No match
+  abcdefzx
+--->abcdefzx
+ +0 ^            (abc|def)
+ +1 ^            a
+ +2 ^^           b
+ +3 ^ ^          c
+ +4 ^  ^         |
+ +9 ^  ^         x
+ +5 ^            d
  +0  ^           (abc|def)
  +1  ^           a
  +5  ^           d
@@ -5015,7 +5157,7 @@
  0: abcdcdcd
  1: cd


-/([ab]{,4}c|xy)/ICDZ
+/([ab]{,4}c|xy)/ICDZS
 ------------------------------------------------------------------
         Bra
         Callout 255 0 14
@@ -5048,8 +5190,59 @@
 Options:
 No first char
 No need char
+Subject length lower bound = 2
+Starting byte set: a b x 
     Note: that { does NOT introduce a quantifier
 --->Note: that { does NOT introduce a quantifier
+ +0         ^                                        ([ab]{,4}c|xy)
+ +1         ^                                        [ab]
+ +5         ^^                                       {
++11         ^                                        x
+ +0                                 ^                ([ab]{,4}c|xy)
+ +1                                 ^                [ab]
+ +5                                 ^^               {
++11                                 ^                x
+ +0                                     ^            ([ab]{,4}c|xy)
+ +1                                     ^            [ab]
+ +5                                     ^^           {
++11                                     ^            x
+No match
+
+/([ab]{,4}c|xy)/ICDZSS
+------------------------------------------------------------------
+        Bra
+        Callout 255 0 14
+        CBra 1
+        Callout 255 1 4
+        [ab]
+        Callout 255 5 1
+        {
+        Callout 255 6 1
+        ,
+        Callout 255 7 1
+        4
+        Callout 255 8 1
+        }
+        Callout 255 9 1
+        c
+        Callout 255 10 0
+        Alt
+        Callout 255 11 1
+        x
+        Callout 255 12 1
+        y
+        Callout 255 13 0
+        Ket
+        Callout 255 14 0
+        Ket
+        End
+------------------------------------------------------------------
+Capturing subpattern count = 1
+Options:
+No first char
+No need char
+    Note: that { does NOT introduce a quantifier
+--->Note: that { does NOT introduce a quantifier
  +0 ^                                                ([ab]{,4}c|xy)
  +1 ^                                                [ab]
 +11 ^                                                x
@@ -5467,14 +5660,33 @@
     123456\P
 No match


-/abc/I>testsavedregex
+/abc/IS>testsavedregex
 Capturing subpattern count = 0
 No options
 First char = 'a'
 Need char = 'c'
-Compiled regex written to testsavedregex
+Subject length lower bound = 3
+No set of starting bytes
+Compiled pattern written to testsavedregex
+Study data written to testsavedregex
 <testsavedregex
-Compiled regex loaded from testsavedregex
+Compiled pattern loaded from testsavedregex
+Study data loaded from testsavedregex
+    abc
+ 0: abc
+    ** Failers
+No match
+    bca
+No match
+
+/abc/ISS>testsavedregex
+Capturing subpattern count = 0
+No options
+First char = 'a'
+Need char = 'c'
+Compiled pattern written to testsavedregex
+<testsavedregex
+Compiled pattern loaded from testsavedregex
 No study data
     abc
  0: abc
@@ -5483,14 +5695,33 @@
     bca
 No match


-/abc/IF>testsavedregex
+/abc/IFS>testsavedregex
 Capturing subpattern count = 0
 No options
 First char = 'a'
 Need char = 'c'
-Compiled regex written to testsavedregex
+Subject length lower bound = 3
+No set of starting bytes
+Compiled pattern written to testsavedregex
+Study data written to testsavedregex
 <testsavedregex
-Compiled regex (byte-inverted) loaded from testsavedregex
+Compiled pattern (byte-inverted) loaded from testsavedregex
+Study data loaded from testsavedregex
+    abc
+ 0: abc
+    ** Failers
+No match
+    bca
+No match
+
+/abc/IFSS>testsavedregex
+Capturing subpattern count = 0
+No options
+First char = 'a'
+Need char = 'c'
+Compiled pattern written to testsavedregex
+<testsavedregex
+Compiled pattern (byte-inverted) loaded from testsavedregex
 No study data
     abc
  0: abc
@@ -5506,10 +5737,10 @@
 No need char
 Subject length lower bound = 1
 Starting byte set: a b 
-Compiled regex written to testsavedregex
+Compiled pattern written to testsavedregex
 Study data written to testsavedregex
 <testsavedregex
-Compiled regex loaded from testsavedregex
+Compiled pattern loaded from testsavedregex
 Study data loaded from testsavedregex
     abc
  0: a
@@ -5520,6 +5751,24 @@
     def
 No match


+/(a|b)/ISS>testsavedregex
+Capturing subpattern count = 1
+No options
+No first char
+No need char
+Compiled pattern written to testsavedregex
+<testsavedregex
+Compiled pattern loaded from testsavedregex
+No study data
+    abc
+ 0: a
+ 1: a
+    ** Failers
+ 0: a
+ 1: a
+    def
+No match
+
 /(a|b)/ISF>testsavedregex
 Capturing subpattern count = 1
 No options
@@ -5527,10 +5776,10 @@
 No need char
 Subject length lower bound = 1
 Starting byte set: a b 
-Compiled regex written to testsavedregex
+Compiled pattern written to testsavedregex
 Study data written to testsavedregex
 <testsavedregex
-Compiled regex (byte-inverted) loaded from testsavedregex
+Compiled pattern (byte-inverted) loaded from testsavedregex
 Study data loaded from testsavedregex
     abc
  0: a
@@ -5541,6 +5790,24 @@
     def
 No match


+/(a|b)/ISSF>testsavedregex
+Capturing subpattern count = 1
+No options
+No first char
+No need char
+Compiled pattern written to testsavedregex
+<testsavedregex
+Compiled pattern (byte-inverted) loaded from testsavedregex
+No study data
+    abc
+ 0: a
+ 1: a
+    ** Failers
+ 0: a
+ 1: a
+    def
+No match
+
 ~<(\w+)/?>(.)*</(\1)>~smgI
 Capturing subpattern count = 3
 Max back reference = 1
@@ -10805,20 +11072,36 @@
     ACAB
  0: AB


-/(*MARK:A)(*PRUNE:B)(C|X)/K
+/(*MARK:A)(*PRUNE:B)(C|X)/KS
     C
  0: C
  1: C
 MK: A
     D 
+No match
+
+/(*MARK:A)(*PRUNE:B)(C|X)/KSS
+    C
+ 0: C
+ 1: C
+MK: A
+    D 
 No match, mark = B


-/(*MARK:A)(*THEN:B)(C|X)/K
+/(*MARK:A)(*THEN:B)(C|X)/KS
     C
  0: C
  1: C
 MK: A
     D 
+No match
+
+/(*MARK:A)(*THEN:B)(C|X)/KSS
+    C
+ 0: C
+ 1: C
+MK: A
+    D 
 No match, mark = B


 /--- This should fail, as the skip causes a bump to offset 3 (the skip) ---/
@@ -11577,4 +11860,21 @@

 /-- --/

+/-- These studied versions are here because they are not Perl-compatible; the
+    studying means the mark is not seen. --/
+
+/(*MARK:A)(*SKIP:B)(C|X)/KS
+    C
+ 0: C
+ 1: C
+MK: A
+    D
+No match
+     
+/(*:A)A+(*SKIP:A)(B|Z)/KS
+    AAAC
+No match
+
+/-- --/
+
 /-- End of testinput2 --/


Modified: code/trunk/testdata/testoutput5
===================================================================
--- code/trunk/testdata/testoutput5    2011-06-29 08:49:21 UTC (rev 611)
+++ code/trunk/testdata/testoutput5    2011-07-02 15:20:59 UTC (rev 612)
@@ -802,7 +802,7 @@
 /\xC3\xC3\xC3xxx/8
 Failed: invalid UTF-8 string at offset 0


-/\xC3\xC3\xC3xxx/8?DZ
+/\xC3\xC3\xC3xxx/8?DZSS
 ------------------------------------------------------------------
         Bra
         \X{c0}\X{c0}\X{c0}xxx
@@ -2184,7 +2184,7 @@
 No options
 No first char
 No need char
-Subject length lower bound = 2
+Subject length lower bound = 1
 Starting byte set: \x0a \x0b \x0c \x0d \x85 


 /\R/SI8
@@ -2192,7 +2192,7 @@
 Options: utf8
 No first char
 No need char
-Subject length lower bound = 2
+Subject length lower bound = 1
 Starting byte set: \x0a \x0b \x0c \x0d \xc2 \xe2

 /\h*A/SI8

Modified: code/trunk/testdata/testoutput7
===================================================================
--- code/trunk/testdata/testoutput7    2011-06-29 08:49:21 UTC (rev 611)
+++ code/trunk/testdata/testoutput7    2011-07-02 15:20:59 UTC (rev 612)
@@ -1011,10 +1011,10 @@
  0: bbbbbbbbbbbbcdX


 /(a|b)/SF>testsavedregex
-Compiled regex written to testsavedregex
+Compiled pattern written to testsavedregex
 Study data written to testsavedregex
 <testsavedregex
-Compiled regex (byte-inverted) loaded from testsavedregex
+Compiled pattern (byte-inverted) loaded from testsavedregex
 Study data loaded from testsavedregex
     abc
  0: a
@@ -6439,10 +6439,10 @@
     bbbbc
  0: c


-/abc/>testsavedregex
-Compiled regex written to testsavedregex
+/abc/SS>testsavedregex
+Compiled pattern written to testsavedregex
 <testsavedregex
-Compiled regex loaded from testsavedregex
+Compiled pattern loaded from testsavedregex
 No study data
     abc
  0: abc
@@ -6451,10 +6451,10 @@
     bca
 No match


-/abc/F>testsavedregex
-Compiled regex written to testsavedregex
+/abc/FSS>testsavedregex
+Compiled pattern written to testsavedregex
 <testsavedregex
-Compiled regex (byte-inverted) loaded from testsavedregex
+Compiled pattern (byte-inverted) loaded from testsavedregex
 No study data
     abc
  0: abc
@@ -6464,10 +6464,10 @@
 No match


 /(a|b)/S>testsavedregex
-Compiled regex written to testsavedregex
+Compiled pattern written to testsavedregex
 Study data written to testsavedregex
 <testsavedregex
-Compiled regex loaded from testsavedregex
+Compiled pattern loaded from testsavedregex
 Study data loaded from testsavedregex
     abc
  0: a
@@ -6477,10 +6477,10 @@
 No match


 /(a|b)/SF>testsavedregex
-Compiled regex written to testsavedregex
+Compiled pattern written to testsavedregex
 Study data written to testsavedregex
 <testsavedregex
-Compiled regex (byte-inverted) loaded from testsavedregex
+Compiled pattern (byte-inverted) loaded from testsavedregex
 Study data loaded from testsavedregex
     abc
  0: a