[Pcre-svn] [716] code/trunk: Make (*THEN) work as in Perl in…

Author: Subversion repository
Date:
To: pcre-svn
Subject: [Pcre-svn] [716] code/trunk: Make (*THEN) work as in Perl in subpatterns that do not contain | alternatives.
Revision: 716
          http://vcs.pcre.org/viewvc?view=rev&revision=716
Author:   ph10
Date:     2011-10-04 17:38:05 +0100 (Tue, 04 Oct 2011)


Log Message:
-----------
Make (*THEN) work as in Perl in subpatterns that do not contain | alternatives.

Modified Paths:
--------------
    code/trunk/ChangeLog
    code/trunk/HACKING
    code/trunk/configure.ac
    code/trunk/doc/pcrecompat.3
    code/trunk/doc/pcrepattern.3
    code/trunk/pcre_compile.c
    code/trunk/pcre_exec.c
    code/trunk/pcre_internal.h
    code/trunk/pcre_printint.src
    code/trunk/testdata/testinput11
    code/trunk/testdata/testinput2
    code/trunk/testdata/testoutput10
    code/trunk/testdata/testoutput11
    code/trunk/testdata/testoutput2
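
Illustrative note (not part of the commit): the rule that this revision adds
to pcre_exec.c can be sketched as a small toy model in C. This is only a
sketch under stated assumptions, not PCRE code. The names toy_op, toy_branch,
then_stops_at, and toy_demo are hypothetical, and the offsets are made up. The
model assumes that a branch is described by the opcode that starts it, the
opcode that ends it, and the position of that ending opcode, which is roughly
what the matcher compares when a MATCH_THEN result is passed back to a group.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy opcodes: hypothetical stand-ins for PCRE's OP_BRA, OP_ALT and OP_KET. */
typedef enum { TOY_BRA, TOY_ALT, TOY_KET } toy_op;

/* One branch of a group: the opcode that starts it, the opcode that ends it,
   and the (made-up) offset of that ending opcode in the compiled pattern. */
typedef struct {
  toy_op start_op;
  toy_op end_op;
  size_t end_offset;
} toy_branch;

/* Returns true when a THEN located at then_offset has reached the limit of
   its action in branch b. This mirrors the shape of the test in the diff:
     md->start_match_ptr < next && (*ecode == OP_ALT || *next == OP_ALT)
   When it returns false, THEN is passed back to an enclosing alternative. */
bool then_stops_at(const toy_branch *b, size_t then_offset)
{
  bool inside = then_offset < b->end_offset;
  bool has_alternatives = b->start_op == TOY_ALT || b->end_op == TOY_ALT;
  return inside && has_alternatives;
}

/* The two cases from ChangeLog item 15, with invented offsets. */
void toy_demo(void)
{
  /* (B(*THEN)C): a single branch, BRA ... KET, so THEN passes outwards. */
  toy_branch no_bar = { TOY_BRA, TOY_KET, 20 };
  assert(!then_stops_at(&no_bar, 10));

  /* (B(*THEN)C|(*FAIL)): the first branch ends at ALT, so THEN is confined. */
  toy_branch with_bar = { TOY_BRA, TOY_ALT, 20 };
  assert(then_stops_at(&with_bar, 10));
}
```

In this model a group without | never absorbs THEN, which corresponds to the
statement in the code comment that (X) is NOT the same as (X|(*F)).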


Modified: code/trunk/ChangeLog
===================================================================
--- code/trunk/ChangeLog    2011-10-01 06:42:38 UTC (rev 715)
+++ code/trunk/ChangeLog    2011-10-04 16:38:05 UTC (rev 716)
@@ -52,7 +52,7 @@
     so that no minimum is registered for a pattern that contains *ACCEPT.


 8.  If (*THEN) was present in the first (true) branch of a conditional group,
-    it was not handled as intended.
+    it was not handled as intended. [But see 16 below.]


 9.  Replaced RunTest.bat with the much improved version provided by Sheri
     Pierce.
@@ -74,7 +74,23 @@
     For "fr", it uses the Windows-specific input and output files. 


 14. If (*THEN) appeared in a group that was called recursively or as a 
-    subroutine, it did not work as intended. 
+    subroutine, it did not work as intended. [But see next item.]
+    
+15. Consider the pattern /A (B(*THEN)C) | D/ where A, B, C, and D are complex
+    pattern fragments (but not containing any | characters). If A and B are
+    matched, but there is a failure in C so that it backtracks to (*THEN), PCRE 
+    was behaving differently to Perl. PCRE backtracked into A, but Perl goes to 
+    D. In other words, Perl considers parentheses that do not contain any | 
+    characters to be part of a surrounding alternative, whereas PCRE was 
+    treating (B(*THEN)C) the same as (B(*THEN)C|(*FAIL)) -- which Perl handles 
+    differently. PCRE now behaves in the same way as Perl, except in the case 
+    of subroutine/recursion calls such as (?1) which have in any case always 
+    been different (but PCRE had them first :-).
+    
+16. Related to 15 above: Perl does not treat the | in a conditional group as 
+    creating alternatives. Such a group is treated in the same way as an 
+    ordinary group without any | characters when processing (*THEN). PCRE has 
+    been changed to match Perl's behaviour.



Version 8.13 16-Aug-2011

Modified: code/trunk/HACKING
===================================================================
--- code/trunk/HACKING    2011-10-01 06:42:38 UTC (rev 715)
+++ code/trunk/HACKING    2011-10-04 16:38:05 UTC (rev 716)
@@ -178,17 +178,14 @@
   OP_SKIP                ) indicating which parentheses must be closed.



-Backtracking control verbs with data
-------------------------------------
-
-OP_THEN is followed by a LINK_SIZE offset, which is the distance back to the
-start of the current branch.
+Backtracking control verbs with (optional) data
+-----------------------------------------------

-OP_MARK is followed by the mark name, preceded by a one-byte length, and
-followed by a binary zero. For (*PRUNE), (*SKIP), and (*THEN) with arguments,
-the opcodes OP_PRUNE_ARG, OP_SKIP_ARG, and OP_THEN_ARG are used. For the first
-two, the name follows immediately; for OP_THEN_ARG, it follows the LINK_SIZE
-offset value.
+(*THEN) without an argument generates the opcode OP_THEN and no following data.
+OP_MARK is followed by the mark name, preceded by a one-byte length, and
+followed by a binary zero. For (*PRUNE), (*SKIP), and (*THEN) with arguments,
+the opcodes OP_PRUNE_ARG, OP_SKIP_ARG, and OP_THEN_ARG are used, with the name
+following in the same format.


Matching literal characters
@@ -453,4 +450,4 @@


Philip Hazel
-August 2011
+October 2011

Modified: code/trunk/configure.ac
===================================================================
--- code/trunk/configure.ac    2011-10-01 06:42:38 UTC (rev 715)
+++ code/trunk/configure.ac    2011-10-04 16:38:05 UTC (rev 716)
@@ -10,8 +10,8 @@


m4_define(pcre_major, [8])
m4_define(pcre_minor, [20])
-m4_define(pcre_prerelease, [-RC2])
-m4_define(pcre_date, [2011-09-23])
+m4_define(pcre_prerelease, [-RC3])
+m4_define(pcre_date, [2011-09-30])

# Libtool shared library interface versions (current:revision:age)
m4_define(libpcre_version, [0:1:0])

Modified: code/trunk/doc/pcrecompat.3
===================================================================
--- code/trunk/doc/pcrecompat.3    2011-10-01 06:42:38 UTC (rev 715)
+++ code/trunk/doc/pcrecompat.3    2011-10-04 16:38:05 UTC (rev 716)
@@ -79,9 +79,9 @@
 .\"
 documentation for details.
 .P
-10. Subpatterns that are called recursively or as "subroutines" are always
-treated as atomic groups in PCRE. This is like Python, but unlike Perl. There
-is a discussion of an example that explains this in more detail in the
+10. Subpatterns that are called as subroutines (whether or not recursively) are
+always treated as atomic groups in PCRE. This is like Python, but unlike Perl.
+There is a discussion of an example that explains this in more detail in the
 .\" HTML <a href="pcrepattern.html#recursiondifference">
 .\" </a>
 section on recursion differences from Perl
@@ -92,11 +92,14 @@
 .\"
 page.
 .P
-11. There are some differences that are concerned with the settings of captured
+11. If (*THEN) is present in a group that is called as a subroutine, its action
+is limited to that group, even if the group does not contain any | characters.
+.P
+12. There are some differences that are concerned with the settings of captured
 strings when part of a pattern is repeated. For example, matching "aba" against
 the pattern /^(a(b)?)+$/ in Perl leaves $2 unset, but in PCRE it is set to "b".
 .P
-12. PCRE's handling of duplicate subpattern numbers and duplicate subpattern
+13. PCRE's handling of duplicate subpattern numbers and duplicate subpattern
 names is not as general as Perl's. This is a consequence of the fact the PCRE
 works internally just with numbers, using an external table to translate
 between numbers and names. In particular, a pattern such as (?|(?<a>A)|(?<b)B),
@@ -106,12 +109,12 @@
 names map to capturing subpattern number 1. To avoid this confusing situation,
 an error is given at compile time.
 .P
-13. Perl recognizes comments in some places that PCRE does not, for example,
+14. Perl recognizes comments in some places that PCRE does not, for example,
 between the ( and ? at the start of a subpattern. If the /x modifier is set,
 Perl allows whitespace between ( and ? but PCRE never does, even if the
 PCRE_EXTENDED option is set.
 .P
-14. PCRE provides some extensions to the Perl regular expression facilities.
+15. PCRE provides some extensions to the Perl regular expression facilities.
 Perl 5.10 includes new features that are not in earlier versions of Perl, some
 of which (such as named parentheses) have been in PCRE for some time. This list
 is with respect to Perl 5.10:
@@ -145,7 +148,8 @@
 (i) The partial matching facility is PCRE-specific.
 .sp
 (j) Patterns compiled by PCRE can be saved and re-used at a later time, even on
-different hosts that have the other endianness.
+different hosts that have the other endianness. However, this does not apply to 
+optimized data created by the just-in-time compiler.
 .sp
 (k) The alternative matching function (\fBpcre_dfa_exec()\fP) matches in a
 different way and is not Perl-compatible.
@@ -168,6 +172,6 @@
 .rs
 .sp
 .nf
-Last updated: 24 August 2011
+Last updated: 04 October 2011
 Copyright (c) 1997-2011 University of Cambridge.
 .fi


Modified: code/trunk/doc/pcrepattern.3
===================================================================
--- code/trunk/doc/pcrepattern.3    2011-10-01 06:42:38 UTC (rev 715)
+++ code/trunk/doc/pcrepattern.3    2011-10-04 16:38:05 UTC (rev 716)
@@ -1315,9 +1315,9 @@
 .sp
   /(?|(abc)|(def))\e1/
 .sp
-In contrast, a recursive or "subroutine" call to a numbered subpattern always
-refers to the first one in the pattern with the given number. The following
-pattern matches "abcabc" or "defabc":
+In contrast, a subroutine call to a numbered subpattern always refers to the
+first one in the pattern with the given number. The following pattern matches
+"abcabc" or "defabc":
 .sp
   /(?|(abc)|(def))(?1)/
 .sp
@@ -1434,7 +1434,7 @@
   a character class
   a back reference (see next section)
   a parenthesized subpattern (including assertions)
-  a recursive or "subroutine" call to a subpattern
+  a subroutine call to a subpattern (recursive or otherwise)
 .sp
 The general repetition quantifier specifies a minimum and maximum number of
 permitted matches, by giving the two numbers in curly brackets (braces),
@@ -2123,10 +2123,10 @@
 name DEFINE, the condition is always false. In this case, there may be only one
 alternative in the subpattern. It is always skipped if control reaches this
 point in the pattern; the idea of DEFINE is that it can be used to define
-"subroutines" that can be referenced from elsewhere. (The use of
+subroutines that can be referenced from elsewhere. (The use of
 .\" HTML <a href="#subpatternsassubroutines">
 .\" </a>
-"subroutines"
+subroutines
 .\"
 is described below.) For example, a pattern to match an IPv4 address such as
 "192.168.23.245" could be written like this (ignore whitespace and line
@@ -2221,11 +2221,11 @@
 this kind of recursion was subsequently introduced into Perl at release 5.10.
 .P
 A special item that consists of (? followed by a number greater than zero and a
-closing parenthesis is a recursive call of the subpattern of the given number,
-provided that it occurs inside that subpattern. (If not, it is a
+closing parenthesis is a recursive subroutine call of the subpattern of the
+given number, provided that it occurs inside that subpattern. (If not, it is a
 .\" HTML <a href="#subpatternsassubroutines">
 .\" </a>
-"subroutine"
+non-recursive subroutine
 .\"
 call, which is described in the next section.) The special item (?R) or (?0) is
 a recursive call of the entire regular expression.
@@ -2260,7 +2260,7 @@
 reference is not inside the parentheses that are referenced. They are always
 .\" HTML <a href="#subpatternsassubroutines">
 .\" </a>
-"subroutine"
+non-recursive subroutine
 .\"
 calls, as described in the next section.
 .P
@@ -2393,9 +2393,9 @@
 .SH "SUBPATTERNS AS SUBROUTINES"
 .rs
 .sp
-If the syntax for a recursive subpattern reference (either by number or by
+If the syntax for a recursive subpattern call (either by number or by
 name) is used outside the parentheses to which it refers, it operates like a
-subroutine in a programming language. The "called" subpattern may be defined
+subroutine in a programming language. The called subpattern may be defined
 before or after the reference. A numbered reference can be absolute or
 relative, as in these examples:
 .sp
@@ -2415,15 +2415,15 @@
 is used, it does match "sense and responsibility" as well as the other two
 strings. Another example is given in the discussion of DEFINE above.
 .P
-Like recursive subpatterns, a subroutine call is always treated as an atomic
-group. That is, once it has matched some of the subject string, it is never
-re-entered, even if it contains untried alternatives and there is a subsequent
-matching failure. Any capturing parentheses that are set during the subroutine
-call revert to their previous values afterwards.
+All subroutine calls, whether recursive or not, are always treated as atomic
+groups. That is, once a subroutine has matched some of the subject string, it
+is never re-entered, even if it contains untried alternatives and there is a
+subsequent matching failure. Any capturing parentheses that are set during the
+subroutine call revert to their previous values afterwards.
 .P
-When a subpattern is used as a subroutine, processing options such as
-case-independence are fixed when the subpattern is defined. They cannot be
-changed for different calls. For example, consider this pattern:
+Processing options such as case-independence are fixed when a subpattern is
+defined, so if it is used as a subroutine, such options cannot be changed for
+different calls. For example, consider this pattern:
 .sp
   (abc)(?i:(?-1))
 .sp
@@ -2504,20 +2504,22 @@
 failing negative assertion, they cause an error if encountered by
 \fBpcre_dfa_exec()\fP.
 .P
-If any of these verbs are used in an assertion or subroutine subpattern
-(including recursive subpatterns), their effect is confined to that subpattern;
-it does not extend to the surrounding pattern, with one exception: a *MARK that
-is encountered in a positive assertion \fIis\fP passed back (compare capturing
-parentheses in assertions). Note that such subpatterns are processed as
-anchored at the point where they are tested.
+If any of these verbs are used in an assertion or in a subpattern that is
+called as a subroutine (whether or not recursively), their effect is confined
+to that subpattern; it does not extend to the surrounding pattern, with one
+exception: a *MARK that is encountered in a positive assertion \fIis\fP passed
+back (compare capturing parentheses in assertions). Note that such subpatterns
+are processed as anchored at the point where they are tested. Note also that
+Perl's treatment of subroutines is different in some cases.
 .P
 The new verbs make use of what was previously invalid syntax: an opening
 parenthesis followed by an asterisk. They are generally of the form
 (*VERB) or (*VERB:NAME). Some may take either form, with differing behaviour,
-depending on whether or not an argument is present. An name is a sequence of
-letters, digits, and underscores. If the name is empty, that is, if the closing
-parenthesis immediately follows the colon, the effect is as if the colon were
-not there. Any number of these verbs may occur in a pattern.
+depending on whether or not an argument is present. A name is any sequence of
+characters that does not include a closing parenthesis. If the name is empty,
+that is, if the closing parenthesis immediately follows the colon, the effect
+is as if the colon were not there. Any number of these verbs may occur in a
+pattern.
 .P
 PCRE contains some optimizations that are used to speed up matching by running
 some checks at the start of each match attempt. For example, it may know the
@@ -2538,9 +2540,10 @@
    (*ACCEPT)
 .sp
 This verb causes the match to end successfully, skipping the remainder of the
-pattern. When inside a recursion, only the innermost pattern is ended
-immediately. If (*ACCEPT) is inside capturing parentheses, the data so far is
-captured. (This feature was added to PCRE at release 8.00.) For example:
+pattern. However, when it is inside a subpattern that is called as a
+subroutine, only that subpattern is ended successfully. Matching then continues
+at the outer level. If (*ACCEPT) is inside capturing parentheses, the data so
+far is captured. For example:
 .sp
   A((?:A|B(*ACCEPT)|C)D)
 .sp
@@ -2549,7 +2552,7 @@
 .sp
   (*FAIL) or (*F)
 .sp
-This verb causes the match to fail, forcing backtracking to occur. It is
+This verb causes a matching failure, forcing backtracking to occur. It is
 equivalent to (?!) but easier to read. The Perl documentation notes that it is
 probably useful only when combined with (?{}) or (??{}). Those are, of course,
 Perl features that are not present in PCRE. The nearest equivalent is the
@@ -2602,7 +2605,7 @@
 .P
 If (*MARK) is encountered in a positive assertion, its name is recorded and
 passed back if it is the last-encountered. This does not happen for negative
-assetions.
+assertions.
 .P
 A name may also be returned after a failed match if the final path through the
 pattern involves (*MARK). However, unless (*MARK) used in conjunction with
@@ -2716,41 +2719,77 @@
 searched for the most recent (*MARK) that has the same name. If one is found,
 the "bumpalong" advance is to the subject position that corresponds to that
 (*MARK) instead of to where (*SKIP) was encountered. If no (*MARK) with a
-matching name is found, normal "bumpalong" of one character happens (the
-(*SKIP) is ignored).
+matching name is found, normal "bumpalong" of one character happens (that is,
+the (*SKIP) is ignored).
 .sp
   (*THEN) or (*THEN:NAME)
 .sp
-This verb causes a skip to the next alternation in the innermost enclosing
-group if the rest of the pattern does not match. That is, it cancels pending
-backtracking, but only within the current alternation. Its name comes from the
-observation that it can be used for a pattern-based if-then-else block:
+This verb causes a skip to the next innermost alternative if the rest of the
+pattern does not match. That is, it cancels pending backtracking, but only
+within the current alternative. Its name comes from the observation that it can
+be used for a pattern-based if-then-else block:
 .sp
   ( COND1 (*THEN) FOO | COND2 (*THEN) BAR | COND3 (*THEN) BAZ ) ...
 .sp
 If the COND1 pattern matches, FOO is tried (and possibly further items after
-the end of the group if FOO succeeds); on failure the matcher skips to the
+the end of the group if FOO succeeds); on failure, the matcher skips to the
 second alternative and tries COND2, without backtracking into COND1. The
 behaviour of (*THEN:NAME) is exactly the same as (*MARK:NAME)(*THEN) if the
-overall match fails. If (*THEN) is not directly inside an alternation, it acts
-like (*PRUNE).
-.
+overall match fails. If (*THEN) is not inside an alternation, it acts like
+(*PRUNE).
 .P
-The above verbs provide four different "strengths" of control when subsequent
-matching fails. (*THEN) is the weakest, carrying on the match at the next
-alternation. (*PRUNE) comes next, failing the match at the current starting
-position, but allowing an advance to the next character (for an unanchored
-pattern). (*SKIP) is similar, except that the advance may be more than one
-character. (*COMMIT) is the strongest, causing the entire match to fail.
+Note that a subpattern that does not contain a | character is just a part of
+the enclosing alternative; it is not a nested alternation with only one
+alternative. The effect of (*THEN) extends beyond such a subpattern to the
+enclosing alternative. Consider this pattern, where A, B, etc. are complex
+pattern fragments that do not contain any | characters at this level:
+.sp
+  A (B(*THEN)C) | D
+.sp
+If A and B are matched, but there is a failure in C, matching does not 
+backtrack into A; instead it moves to the next alternative, that is, D.
+However, if the subpattern containing (*THEN) is given an alternative, it
+behaves differently:
+.sp
+  A (B(*THEN)C | (*FAIL)) | D
+.sp
+The effect of (*THEN) is now confined to the inner subpattern. After a failure
+in C, matching moves to (*FAIL), which causes the whole subpattern to fail 
+because there are no more alternatives to try. In this case, matching does now 
+backtrack into A.
 .P
-If more than one is present in a pattern, the "stongest" one wins. For example,
-consider this pattern, where A, B, etc. are complex pattern fragments:
+Note also that a conditional subpattern is not considered as having two 
+alternatives, because only one is ever used. In other words, the | character in 
+a conditional subpattern has a different meaning. Ignoring white space,
+consider:
 .sp
+  ^.*? (?(?=a) a | b(*THEN)c )
+.sp
+If the subject is "ba", this pattern does not match. Because .*? is ungreedy, 
+it initially matches zero characters. The condition (?=a) then fails, the 
+character "b" is matched, but "c" is not. At this point, matching does not
+backtrack to .*? as might perhaps be expected from the presence of the |
+character. The conditional subpattern is part of the single alternative that
+comprises the whole pattern, and so the match fails. (If there was a backtrack 
+into .*?, allowing it to match "b", the match would succeed.)
+.P
+The verbs just described provide four different "strengths" of control when
+subsequent matching fails. (*THEN) is the weakest, carrying on the match at the
+next alternative. (*PRUNE) comes next, failing the match at the current
+starting position, but allowing an advance to the next character (for an
+unanchored pattern). (*SKIP) is similar, except that the advance may be more
+than one character. (*COMMIT) is the strongest, causing the entire match to
+fail.
+.P
+If more than one such verb is present in a pattern, the "strongest" one wins.
+For example, consider this pattern, where A, B, etc. are complex pattern
+fragments:
+.sp
   (A(*COMMIT)B(*THEN)C|D)
 .sp
 Once A has matched, PCRE is committed to this match, at the current starting
 position. If subsequently B matches, but C does not, the normal (*THEN) action
-of trying the next alternation (that is, D) does not happen because (*COMMIT)
+of trying the next alternative (that is, D) does not happen because (*COMMIT)
 overrides.
 .
 .
@@ -2775,6 +2814,6 @@
 .rs
 .sp
 .nf
-Last updated: 24 August 2011
+Last updated: 04 October 2011
 Copyright (c) 1997-2011 University of Cambridge.
 .fi


Modified: code/trunk/pcre_compile.c
===================================================================
--- code/trunk/pcre_compile.c    2011-10-01 06:42:38 UTC (rev 715)
+++ code/trunk/pcre_compile.c    2011-10-04 16:38:05 UTC (rev 716)
@@ -1761,7 +1761,7 @@
       break;


       case OP_THEN_ARG:
-      code += code[1+LINK_SIZE];
+      code += code[1];
       break;
       }


@@ -1880,7 +1880,7 @@
       break;


       case OP_THEN_ARG:
-      code += code[1+LINK_SIZE];
+      code += code[1];
       break;
       }


@@ -2217,7 +2217,7 @@
     break;


     case OP_THEN_ARG:
-    code += code[1+LINK_SIZE];
+    code += code[1];
     break;


     /* None of the remaining opcodes are required to match a character. */
@@ -5060,12 +5060,7 @@
               goto FAILED;
               }
             *code = verbs[i].op;
-            if (*code++ == OP_THEN)
-              {
-              PUT(code, 0, code - bcptr->current_branch - 1);
-              code += LINK_SIZE;
-              cd->external_flags |= PCRE_HASTHEN;
-              }
+            if (*code++ == OP_THEN) cd->external_flags |= PCRE_HASTHEN;
             }


           else
@@ -5076,11 +5071,7 @@
               goto FAILED;
               }
             *code = verbs[i].op_arg;
-            if (*code++ == OP_THEN_ARG)
-              {
-              PUT(code, 0, code - bcptr->current_branch - 1);
-              code += LINK_SIZE;
-              }
+            if (*code++ == OP_THEN_ARG) cd->external_flags |= PCRE_HASTHEN;
             *code++ = arglen;
             memcpy(code, arg, arglen);
             code += arglen;


Modified: code/trunk/pcre_exec.c
===================================================================
--- code/trunk/pcre_exec.c    2011-10-01 06:42:38 UTC (rev 715)
+++ code/trunk/pcre_exec.c    2011-10-04 16:38:05 UTC (rev 716)
@@ -775,24 +775,23 @@
     md->start_match_ptr = ecode + 2;
     RRETURN(MATCH_SKIP_ARG);


-    /* For THEN (and THEN_ARG) we pass back the address of the bracket or
-    the alt that is at the start of the current branch. This makes it possible
-    to skip back past alternatives that precede the THEN within the current
-    branch. */
+    /* For THEN (and THEN_ARG) we pass back the address of the opcode, so that
+    the branch in which it occurs can be determined. Overload the start of
+    match pointer to do this. */


     case OP_THEN:
     RMATCH(eptr, ecode + _pcre_OP_lengths[*ecode], offset_top, md,
       eptrb, RM54);
     if (rrc != MATCH_NOMATCH) RRETURN(rrc);
-    md->start_match_ptr = ecode - GET(ecode, 1);
+    md->start_match_ptr = ecode;
     MRRETURN(MATCH_THEN);


     case OP_THEN_ARG:
-    RMATCH(eptr, ecode + _pcre_OP_lengths[*ecode] + ecode[1+LINK_SIZE],
-      offset_top, md, eptrb, RM58);
+    RMATCH(eptr, ecode + _pcre_OP_lengths[*ecode] + ecode[1], offset_top, 
+      md, eptrb, RM58);
     if (rrc != MATCH_NOMATCH) RRETURN(rrc);
-    md->start_match_ptr = ecode - GET(ecode, 1);
-    md->mark = ecode + LINK_SIZE + 2;
+    md->start_match_ptr = ecode;     
+    md->mark = ecode + 2;
     RRETURN(MATCH_THEN);


     /* Handle a capturing bracket, other than those that are possessive with an
@@ -838,9 +837,29 @@
         RMATCH(eptr, ecode + _pcre_OP_lengths[*ecode], offset_top, md,
           eptrb, RM1);
         if (rrc == MATCH_ONCE) break;  /* Backing up through an atomic group */
-        if (rrc != MATCH_NOMATCH &&
-            (rrc != MATCH_THEN || md->start_match_ptr != ecode))
-          RRETURN(rrc);
+     
+        /* If we backed up to a THEN, check whether it is within the current 
+        branch by comparing the address of the THEN that is passed back with 
+        the end of the branch. If it is within the current branch, and the
+        branch is one of two or more alternatives (it either starts or ends
+        with OP_ALT), we have reached the limit of THEN's action, so convert 
+        the return code to NOMATCH, which will cause normal backtracking to 
+        happen from now on. Otherwise, THEN is passed back to an outer
+        alternative. This implements Perl's treatment of parenthesized groups, 
+        where a group not containing | does not affect the current alternative, 
+        that is, (X) is NOT the same as (X|(*F)). */
+
+        if (rrc == MATCH_THEN)
+          {
+          next = ecode + GET(ecode,1);
+          if (md->start_match_ptr < next && 
+              (*ecode == OP_ALT || *next == OP_ALT))
+            rrc = MATCH_NOMATCH;
+          }  
+          
+        /* Anything other than NOMATCH is passed back. */
+
+        if (rrc != MATCH_NOMATCH) RRETURN(rrc);
         md->capture_last = save_capture_last;
         ecode += GET(ecode, 1);
         if (*ecode != OP_ALT) break;
@@ -851,11 +870,10 @@
       md->offset_vector[offset+1] = save_offset2;
       md->offset_vector[md->offset_end - number] = save_offset3;


-      /* At this point, rrc will be one of MATCH_ONCE, MATCH_NOMATCH, or
-      MATCH_THEN. */
+      /* At this point, rrc will be one of MATCH_ONCE or MATCH_NOMATCH. */


-      if (rrc != MATCH_THEN && md->mark == NULL) md->mark = markptr;
-      RRETURN(((rrc == MATCH_ONCE)? MATCH_ONCE:MATCH_NOMATCH));
+      if (md->mark == NULL) md->mark = markptr;
+      RRETURN(rrc);
       }


     /* FALL THROUGH ... Insufficient room for saving captured contents. Treat
@@ -912,9 +930,20 @@


       RMATCH(eptr, ecode + _pcre_OP_lengths[*ecode], offset_top, md, eptrb,
         RM2);
-      if (rrc != MATCH_NOMATCH &&
-          (rrc != MATCH_THEN || md->start_match_ptr != ecode))
+        
+      /* See comment in the code for capturing groups above about handling
+      THEN. */
+
+      if (rrc == MATCH_THEN)
         {
+        next = ecode + GET(ecode,1);
+        if (md->start_match_ptr < next && 
+            (*ecode == OP_ALT || *next == OP_ALT))
+          rrc = MATCH_NOMATCH;
+        }  
+         
+      if (rrc != MATCH_NOMATCH)          
+        {
         if (rrc == MATCH_ONCE)
           {
           const uschar *scode = ecode;
@@ -930,7 +959,8 @@
       ecode += GET(ecode, 1);
       if (*ecode != OP_ALT) break;
       }
-    if (rrc != MATCH_THEN && md->mark == NULL) md->mark = markptr;
+      
+    if (md->mark == NULL) md->mark = markptr;
     RRETURN(MATCH_NOMATCH);


     /* Handle possessive capturing brackets with an unlimited repeat. We come
@@ -993,9 +1023,19 @@
           matched_once = TRUE;
           continue;
           }
-        if (rrc != MATCH_NOMATCH &&
-            (rrc != MATCH_THEN || md->start_match_ptr != ecode))
-          RRETURN(rrc);
+          
+        /* See comment in the code for capturing groups above about handling
+        THEN. */
+
+        if (rrc == MATCH_THEN)
+          {
+          next = ecode + GET(ecode,1);
+          if (md->start_match_ptr < next && 
+              (*ecode == OP_ALT || *next == OP_ALT))
+            rrc = MATCH_NOMATCH;
+          }  
+
+        if (rrc != MATCH_NOMATCH) RRETURN(rrc);
         md->capture_last = save_capture_last;
         ecode += GET(ecode, 1);
         if (*ecode != OP_ALT) break;
@@ -1008,7 +1048,7 @@
         md->offset_vector[md->offset_end - number] = save_offset3;
         }


-      if (rrc != MATCH_THEN && md->mark == NULL) md->mark = markptr;
+      if (md->mark == NULL) md->mark = markptr;
       if (allow_zero || matched_once)
         {
         ecode += 1 + LINK_SIZE;
@@ -1055,9 +1095,19 @@
         matched_once = TRUE;
         continue;
         }
-      if (rrc != MATCH_NOMATCH &&
-          (rrc != MATCH_THEN || md->start_match_ptr != ecode))
-        RRETURN(rrc);
+        
+      /* See comment in the code for capturing groups above about handling
+      THEN. */
+
+      if (rrc == MATCH_THEN)
+        {
+        next = ecode + GET(ecode,1);
+        if (md->start_match_ptr < next && 
+            (*ecode == OP_ALT || *next == OP_ALT))
+          rrc = MATCH_NOMATCH;
+        }  
+
+      if (rrc != MATCH_NOMATCH) RRETURN(rrc);
       ecode += GET(ecode, 1);
       if (*ecode != OP_ALT) break;
       }
@@ -1269,8 +1319,11 @@
         ecode += 1 + LINK_SIZE + GET(ecode, LINK_SIZE + 2);
         while (*ecode == OP_ALT) ecode += GET(ecode, 1);
         }
-      else if (rrc != MATCH_NOMATCH &&
-              (rrc != MATCH_THEN || md->start_match_ptr != ecode))
+ 
+      /* PCRE doesn't allow the effect of (*THEN) to escape beyond an
+      assertion; it is therefore treated as NOMATCH. */ 
+
+      else if (rrc != MATCH_NOMATCH && rrc != MATCH_THEN)              
         {
         RRETURN(rrc);         /* Need braces because of following else */
         }
@@ -1281,42 +1334,26 @@
         }
       }


-    /* We are now at the branch that is to be obeyed. As there is only one,
-    we used always to use tail recursion to avoid using another stack frame,
-    except when there was unlimited repeat of a possibly empty group. However,
-    that strategy no longer works because of the possibilty of (*THEN) being
-    encountered in the branch. However, we can still use tail recursion if
-    there are no (*THEN)s in the pattern. Otherwise, a recursive call to
-    match() is always required, unless the second alternative doesn't exist, in
-    which case we can just plough on. */
+    /* We are now at the branch that is to be obeyed. As there is only one, can
+    use tail recursion to avoid using another stack frame, except when there is
+    unlimited repeat of a possibly empty group. In the latter case, a recursive
+    call to match() is always required, unless the second alternative doesn't
+    exist, in which case we can just plough on. Note that, for compatibility
+    with Perl, the | in a conditional group is NOT treated as creating two
+    alternatives. If a THEN is encountered in the branch, it propagates out to
+    the enclosing alternative (unless nested in a deeper set of alternatives,
+    of course). */


     if (condition || *ecode == OP_ALT)
       {
-      if (op == OP_SCOND) md->match_function_type = MATCH_CBEGROUP;
-      else if (!md->hasthen)
+      if (op != OP_SCOND)
         {
         ecode += 1 + LINK_SIZE;
         goto TAIL_RECURSE;
         }
-
-      /* A call to match() is required. */
-
+ 
+      md->match_function_type = MATCH_CBEGROUP;
       RMATCH(eptr, ecode + 1 + LINK_SIZE, offset_top, md, eptrb, RM49);
-
-      /* If the result is THEN from within the "true" branch of the condition,
-      md->start_match_ptr will point to the original OP_COND, not to the start
-      of the branch, so we have do work to see if it matches. If THEN comes
-      from the "false" branch, md->start_match_ptr does point to OP_ALT. */
-
-      if (rrc == MATCH_THEN)
-        {
-        if (*ecode != OP_ALT)
-          {
-          do ecode += GET(ecode, 1); while (*ecode == OP_ALT);
-          ecode -= GET(ecode, 1);
-          }
-        if (md->start_match_ptr == ecode) rrc = MATCH_NOMATCH;
-        }
       RRETURN(rrc);
       }


@@ -1412,9 +1449,11 @@
         markptr = md->mark;
         break;
         }
-      if (rrc != MATCH_NOMATCH &&
-          (rrc != MATCH_THEN || md->start_match_ptr != ecode))
-        RRETURN(rrc);
+     
+      /* PCRE does not allow THEN to escape beyond an assertion; it is treated 
+      as NOMATCH. */
+   
+      if (rrc != MATCH_NOMATCH && rrc != MATCH_THEN) RRETURN(rrc);
       ecode += GET(ecode, 1);
       }
     while (*ecode == OP_ALT);
@@ -1455,9 +1494,11 @@
         do ecode += GET(ecode,1); while (*ecode == OP_ALT);
         break;
         }
-      if (rrc != MATCH_NOMATCH &&
-          (rrc != MATCH_THEN || md->start_match_ptr != ecode))
-        RRETURN(rrc);
+
+      /* PCRE does not allow THEN to escape beyond an assertion; it is treated 
+      as NOMATCH. */
+
+      if (rrc != MATCH_NOMATCH && rrc != MATCH_THEN) RRETURN(rrc);
       ecode += GET(ecode,1);
       }
     while (*ecode == OP_ALT);
@@ -1614,8 +1655,11 @@
           mstart = md->start_match_ptr;
           goto RECURSION_MATCHED;        /* Exit loop; end processing */
           }
-        else if (rrc != MATCH_NOMATCH &&
-                (rrc != MATCH_THEN || md->start_match_ptr != callpat))
+
+        /* PCRE does not allow THEN to escape beyond a recursion; it is treated
+        as NOMATCH. */
+
+        else if (rrc != MATCH_NOMATCH && rrc != MATCH_THEN)     
           {
           DPRINTF(("Recursion gave error %d\n", rrc));
           if (new_recursive.offset_save != stacksave)


Modified: code/trunk/pcre_internal.h
===================================================================
--- code/trunk/pcre_internal.h    2011-10-01 06:42:38 UTC (rev 715)
+++ code/trunk/pcre_internal.h    2011-10-04 16:38:05 UTC (rev 716)
@@ -1643,7 +1643,7 @@
   1, 1, 1,                       /* BRAZERO, BRAMINZERO, BRAPOSZERO        */ \
   3, 1, 3,                       /* MARK, PRUNE, PRUNE_ARG                 */ \
   1, 3,                          /* SKIP, SKIP_ARG                         */ \
-  1+LINK_SIZE, 3+LINK_SIZE,      /* THEN, THEN_ARG                         */ \
+  1, 3,                          /* THEN, THEN_ARG                         */ \
   1, 1, 1, 1,                    /* COMMIT, FAIL, ACCEPT, ASSERT_ACCEPT    */ \
   3, 1                           /* CLOSE, SKIPZERO  */



Modified: code/trunk/pcre_printint.src
===================================================================
--- code/trunk/pcre_printint.src    2011-10-01 06:42:38 UTC (rev 715)
+++ code/trunk/pcre_printint.src    2011-10-04 16:38:05 UTC (rev 716)
@@ -587,19 +587,12 @@
     break;


     case OP_THEN:
-    if (print_lengths)
-      fprintf(f, "    %s %d", OP_names[*code], GET(code, 1));
-    else
-      fprintf(f, "    %s", OP_names[*code]);
+    fprintf(f, "    %s", OP_names[*code]);
     break;


     case OP_THEN_ARG:
-    if (print_lengths)
-      fprintf(f, "    %s %d %s", OP_names[*code], GET(code, 1),
-        code + 2 + LINK_SIZE);
-    else
-      fprintf(f, "    %s %s", OP_names[*code], code + 2 + LINK_SIZE);
-    extra += code[1+LINK_SIZE];
+    fprintf(f, "    %s %s", OP_names[*code], code + 2);
+    extra += code[1];
     break;


     case OP_CIRCM:


Modified: code/trunk/testdata/testinput11
===================================================================
--- code/trunk/testdata/testinput11    2011-10-01 06:42:38 UTC (rev 715)
+++ code/trunk/testdata/testinput11    2011-10-04 16:38:05 UTC (rev 716)
@@ -495,9 +495,6 @@
 /(?>(*COMMIT)(yes|no)(*THEN)(*F))?/
   yes


-/^((yes|no)(*THEN)(*F))?/
-  yes
-
 /b?(*SKIP)c/
     bc
     abc
@@ -674,5 +671,100 @@
     a
     ba
     bba 
+    
+/--- Checking revised (*THEN) handling ---/ 


+/--- Capture ---/
+
+/^.*? (a(*THEN)b) c/x
+    aabc
+
+/^.*? (a(*THEN)b|(*F)) c/x
+    aabc
+
+/^.*? ( (a(*THEN)b) | (*F) ) c/x
+    aabc
+
+/^.*? ( (a(*THEN)b) ) c/x
+    aabc
+
+/--- Non-capture ---/
+
+/^.*? (?:a(*THEN)b) c/x
+    aabc
+
+/^.*? (?:a(*THEN)b|(*F)) c/x
+    aabc
+
+/^.*? (?: (?:a(*THEN)b) | (*F) ) c/x
+    aabc
+
+/^.*? (?: (?:a(*THEN)b) ) c/x
+    aabc
+
+/--- Atomic ---/
+
+/^.*? (?>a(*THEN)b) c/x
+    aabc
+
+/^.*? (?>a(*THEN)b|(*F)) c/x
+    aabc
+
+/^.*? (?> (?>a(*THEN)b) | (*F) ) c/x
+    aabc
+
+/^.*? (?> (?>a(*THEN)b) ) c/x
+    aabc
+
+/--- Possessive capture ---/
+
+/^.*? (a(*THEN)b)++ c/x
+    aabc
+
+/^.*? (a(*THEN)b|(*F))++ c/x
+    aabc
+
+/^.*? ( (a(*THEN)b)++ | (*F) )++ c/x
+    aabc
+
+/^.*? ( (a(*THEN)b)++ )++ c/x
+    aabc
+
+/--- Possessive non-capture ---/
+
+/^.*? (?:a(*THEN)b)++ c/x
+    aabc
+
+/^.*? (?:a(*THEN)b|(*F))++ c/x
+    aabc
+
+/^.*? (?: (?:a(*THEN)b)++ | (*F) )++ c/x
+    aabc
+
+/^.*? (?: (?:a(*THEN)b)++ )++ c/x
+    aabc
+    
+/--- Condition assertion ---/
+
+/^(?(?=a(*THEN)b)ab|ac)/
+    ac
+ 
+/--- Condition ---/
+
+/^.*?(?(?=a)a|b(*THEN)c)/
+    ba
+
+/^.*?(?:(?(?=a)a|b(*THEN)c)|d)/
+    ba
+
+/^.*?(?(?=a)a(*THEN)b|c)/
+    ac
+
+/--- Assertion ---/
+
+/^.*(?=a(*THEN)b)/ 
+    aabc
+
+/------------------------------/
+
 /-- End of testinput11 --/


Modified: code/trunk/testdata/testinput2
===================================================================
--- code/trunk/testdata/testinput2    2011-10-01 06:42:38 UTC (rev 715)
+++ code/trunk/testdata/testinput2    2011-10-04 16:38:05 UTC (rev 716)
@@ -3885,5 +3885,57 @@


 /a(?:.)*?a/ims                                                                  
     \Mabbbbbbbbbbbbbbbbbbbbba
+    
+/a(?:.(*THEN))*?a/ims
+    \Mabbbbbbbbbbbbbbbbbbbbba


+/a(?:.(*THEN:ABC))*?a/ims
+    \Mabbbbbbbbbbbbbbbbbbbbba
+
+/-- These tests are in agreement with development Perl 5.015, which has fixed
+    some things, but they don't all work with 5.012, so they aren't in the
+    Perl-compatible tests. Those after the first come from Perl's own test
+    files. --/
+    
+/^((yes|no)(*THEN)(*F))?/
+  yes
+
+/(A (.*)   C? (*THEN)  | A D) (*FAIL)/x
+AbcdCBefgBhiBqz
+
+/(A (.*)   C? (*THEN)  | A D) z/x
+AbcdCBefgBhiBqz
+
+/(A (.*)   C? (*THEN)  | A D) \s* (*FAIL)/x
+AbcdCBefgBhiBqz
+
+/(A (.*)   C? (*THEN)  | A D) \s* z/x
+AbcdCBefgBhiBqz
+
+/(A (.*)   (?:C|) (*THEN)  | A D) (*FAIL)/x
+AbcdCBefgBhiBqz
+
+/(A (.*)   (?:C|) (*THEN)  | A D) z/x
+AbcdCBefgBhiBqz
+
+/(A (.*)   C{0,6} (*THEN)  | A D) (*FAIL)/x
+AbcdCBefgBhiBqz
+
+/(A (.*)   C{0,6} (*THEN)  | A D) z/x
+AbcdCBefgBhiBqz
+
+/(A (.*)   (CE){0,6} (*THEN)  | A D) (*FAIL)/x
+AbcdCEBefgBhiBqz
+
+/(A (.*)   (CE){0,6} (*THEN)  | A D) z/x
+AbcdCEBefgBhiBqz
+
+/(A (.*)   (CE*){0,6} (*THEN)  | A D) (*FAIL)/x
+AbcdCBefgBhiBqz
+
+/(A (.*)   (CE*){0,6} (*THEN)  | A D) z/x
+AbcdCBefgBhiBqz
+
+/-----------------------------------------------/  
+
 /-- End of testinput2 --/


Modified: code/trunk/testdata/testoutput10
===================================================================
--- code/trunk/testdata/testoutput10    2011-10-01 06:42:38 UTC (rev 715)
+++ code/trunk/testdata/testoutput10    2011-10-04 16:38:05 UTC (rev 716)
@@ -698,31 +698,31 @@


 /abc(d|e)(*THEN)x(123(*THEN)4|567(b|q)(*THEN)xx)/B
 ------------------------------------------------------------------
-  0  79 Bra
+  0  73 Bra
   3     abc
   9   7 CBra 1
  14     d
  16   5 Alt
  19     e
  21  12 Ket
- 24     *THEN 24
- 27     x
- 29  16 CBra 2
- 34     123
- 40     *THEN 11
- 43     4
- 45  31 Alt
- 48     567
- 54   7 CBra 3
- 59     b
- 61   5 Alt
- 64     q
- 66  12 Ket
- 69     *THEN 24
- 72     xx
- 76  47 Ket
- 79  79 Ket
- 82     End
+ 24     *THEN
+ 25     x
+ 27  14 CBra 2
+ 32     123
+ 38     *THEN
+ 39     4
+ 41  29 Alt
+ 44     567
+ 50   7 CBra 3
+ 55     b
+ 57   5 Alt
+ 60     q
+ 62  12 Ket
+ 65     *THEN
+ 66     xx
+ 70  43 Ket
+ 73  73 Ket
+ 76     End
 ------------------------------------------------------------------


 /-- End of testinput10 --/

Modified: code/trunk/testdata/testoutput11
===================================================================
--- code/trunk/testdata/testoutput11    2011-10-01 06:42:38 UTC (rev 715)
+++ code/trunk/testdata/testoutput11    2011-10-04 16:38:05 UTC (rev 716)
@@ -959,10 +959,6 @@
   yes
 No match


-/^((yes|no)(*THEN)(*F))?/
-  yes
- 0: 
-
 /b?(*SKIP)c/
     bc
  0: bc
@@ -1266,5 +1262,131 @@
  0: a
     bba 
  0: a
+    
+/--- Checking revised (*THEN) handling ---/ 


+/--- Capture ---/
+
+/^.*? (a(*THEN)b) c/x
+    aabc
+No match
+
+/^.*? (a(*THEN)b|(*F)) c/x
+    aabc
+ 0: aabc
+ 1: ab
+
+/^.*? ( (a(*THEN)b) | (*F) ) c/x
+    aabc
+ 0: aabc
+ 1: ab
+ 2: ab
+
+/^.*? ( (a(*THEN)b) ) c/x
+    aabc
+No match
+
+/--- Non-capture ---/
+
+/^.*? (?:a(*THEN)b) c/x
+    aabc
+No match
+
+/^.*? (?:a(*THEN)b|(*F)) c/x
+    aabc
+ 0: aabc
+
+/^.*? (?: (?:a(*THEN)b) | (*F) ) c/x
+    aabc
+ 0: aabc
+
+/^.*? (?: (?:a(*THEN)b) ) c/x
+    aabc
+No match
+
+/--- Atomic ---/
+
+/^.*? (?>a(*THEN)b) c/x
+    aabc
+No match
+
+/^.*? (?>a(*THEN)b|(*F)) c/x
+    aabc
+ 0: aabc
+
+/^.*? (?> (?>a(*THEN)b) | (*F) ) c/x
+    aabc
+ 0: aabc
+
+/^.*? (?> (?>a(*THEN)b) ) c/x
+    aabc
+No match
+
+/--- Possessive capture ---/
+
+/^.*? (a(*THEN)b)++ c/x
+    aabc
+No match
+
+/^.*? (a(*THEN)b|(*F))++ c/x
+    aabc
+ 0: aabc
+ 1: ab
+
+/^.*? ( (a(*THEN)b)++ | (*F) )++ c/x
+    aabc
+ 0: aabc
+ 1: ab
+ 2: ab
+
+/^.*? ( (a(*THEN)b)++ )++ c/x
+    aabc
+No match
+
+/--- Possessive non-capture ---/
+
+/^.*? (?:a(*THEN)b)++ c/x
+    aabc
+No match
+
+/^.*? (?:a(*THEN)b|(*F))++ c/x
+    aabc
+ 0: aabc
+
+/^.*? (?: (?:a(*THEN)b)++ | (*F) )++ c/x
+    aabc
+ 0: aabc
+
+/^.*? (?: (?:a(*THEN)b)++ )++ c/x
+    aabc
+No match
+    
+/--- Condition assertion ---/
+
+/^(?(?=a(*THEN)b)ab|ac)/
+    ac
+ 0: ac
+ 
+/--- Condition ---/
+
+/^.*?(?(?=a)a|b(*THEN)c)/
+    ba
+No match
+
+/^.*?(?:(?(?=a)a|b(*THEN)c)|d)/
+    ba
+ 0: ba
+
+/^.*?(?(?=a)a(*THEN)b|c)/
+    ac
+No match
+
+/--- Assertion ---/
+
+/^.*(?=a(*THEN)b)/ 
+    aabc
+ 0: a
+
+/------------------------------/
+
 /-- End of testinput11 --/


Modified: code/trunk/testdata/testoutput2
===================================================================
--- code/trunk/testdata/testoutput2    2011-10-01 06:42:38 UTC (rev 715)
+++ code/trunk/testdata/testoutput2    2011-10-04 16:38:05 UTC (rev 716)
@@ -11066,7 +11066,7 @@
  1: C
 MK: A
     D 
-No match
+No match, mark = B


 /(*MARK:A)(*THEN:B)(C|X)/KSS
     C
@@ -11231,7 +11231,7 @@
  1: C
 MK: A
     D 
-No match
+No match, mark = B


 /A(*MARK:A)A+(*SKIP)(B|Z) | AC/xKS
     AAAC
@@ -11824,7 +11824,7 @@


 /^.*?(?(?=a)a|b(*THEN)c)/
     ba
- 0: ba
+No match


 /^.*?(?(?=a)a|bc)/
     ba
@@ -11832,16 +11832,15 @@


 /^.*?(?(?=a)a(*THEN)b|c)/
     ac
- 0: ac
+No match


 /^.*?(?(?=a)a(*THEN)b)c/
     ac
- 0: ac
+No match


 /^.*?(a(*THEN)b)c/
     aabc
- 0: aabc
- 1: ab
+No match


 /^.*? (?1) c (?(DEFINE)(a(*THEN)b))/x
     aabc
@@ -12331,5 +12330,76 @@
 Minimum match() limit = 65
 Minimum match() recursion limit = 2
  0: abbbbbbbbbbbbbbbbbbbbba
+    
+/a(?:.(*THEN))*?a/ims
+    \Mabbbbbbbbbbbbbbbbbbbbba
+Minimum match() limit = 86
+Minimum match() recursion limit = 45
+ 0: abbbbbbbbbbbbbbbbbbbbba


+/a(?:.(*THEN:ABC))*?a/ims
+    \Mabbbbbbbbbbbbbbbbbbbbba
+Minimum match() limit = 86
+Minimum match() recursion limit = 45
+ 0: abbbbbbbbbbbbbbbbbbbbba
+
+/-- These tests are in agreement with development Perl 5.015, which has fixed
+    some things, but they don't all work with 5.012, so they aren't in the
+    Perl-compatible tests. Those after the first come from Perl's own test
+    files. --/
+    
+/^((yes|no)(*THEN)(*F))?/
+  yes
+No match
+
+/(A (.*)   C? (*THEN)  | A D) (*FAIL)/x
+AbcdCBefgBhiBqz
+No match
+
+/(A (.*)   C? (*THEN)  | A D) z/x
+AbcdCBefgBhiBqz
+No match
+
+/(A (.*)   C? (*THEN)  | A D) \s* (*FAIL)/x
+AbcdCBefgBhiBqz
+No match
+
+/(A (.*)   C? (*THEN)  | A D) \s* z/x
+AbcdCBefgBhiBqz
+No match
+
+/(A (.*)   (?:C|) (*THEN)  | A D) (*FAIL)/x
+AbcdCBefgBhiBqz
+No match
+
+/(A (.*)   (?:C|) (*THEN)  | A D) z/x
+AbcdCBefgBhiBqz
+No match
+
+/(A (.*)   C{0,6} (*THEN)  | A D) (*FAIL)/x
+AbcdCBefgBhiBqz
+No match
+
+/(A (.*)   C{0,6} (*THEN)  | A D) z/x
+AbcdCBefgBhiBqz
+No match
+
+/(A (.*)   (CE){0,6} (*THEN)  | A D) (*FAIL)/x
+AbcdCEBefgBhiBqz
+No match
+
+/(A (.*)   (CE){0,6} (*THEN)  | A D) z/x
+AbcdCEBefgBhiBqz
+No match
+
+/(A (.*)   (CE*){0,6} (*THEN)  | A D) (*FAIL)/x
+AbcdCBefgBhiBqz
+No match
+
+/(A (.*)   (CE*){0,6} (*THEN)  | A D) z/x
+AbcdCBefgBhiBqz
+No match
+
+/-----------------------------------------------/  
+
 /-- End of testinput2 --/