Revision: 927
http://vcs.pcre.org/viewvc?view=rev&revision=927
Author: ph10
Date: 2012-02-22 15:15:08 +0000 (Wed, 22 Feb 2012)

Log Message:
-----------
Correct and tidy up comments relating to OP_NOT (no code changes).

Modified Paths:
--------------
code/trunk/HACKING
code/trunk/pcre_compile.c

Modified: code/trunk/HACKING
===================================================================
--- code/trunk/HACKING 2012-02-22 15:01:32 UTC (rev 926)
+++ code/trunk/HACKING 2012-02-22 15:15:08 UTC (rev 927)
@@ -285,9 +285,7 @@
If there is only one character in the class, OP_CHAR or OP_CHARI is used for a
positive class, and OP_NOT or OP_NOTI for a negative one (that is, for
-something like [^a]). However, OP_NOT[I] can be used only with single-unit
-characters, so in UTF-8 (UTF-16) mode, the use of OP_NOT[I] applies only to
-characters whose code points are no greater than 127 (0xffff).
+something like [^a]).
Another set of 13 repeating opcodes (called OP_NOTSTAR etc.) are used for
repeated, negated, single-character classes. The normal single-character
@@ -467,4 +465,4 @@
Philip Hazel
-December 2011
+February 2012
Modified: code/trunk/pcre_compile.c
===================================================================
--- code/trunk/pcre_compile.c 2012-02-22 15:01:32 UTC (rev 926)
+++ code/trunk/pcre_compile.c 2012-02-22 15:15:08 UTC (rev 927)
@@ -4516,11 +4516,12 @@
LONE_SINGLE_CHARACTER:
/* Only the value of 1 matters for class_single_char. */
+
if (class_single_char < 2) class_single_char++;
/* If class_charcount is 1, we saw precisely one character. As long as
- there was no use of \p or \P, in other words, no use of any XCLASS features,
- we can optimize.
+ there was no use of \p or \P, in other words, no use of any XCLASS
+ features, we can optimize.
The optimization throws away the bit map. We turn the item into a
1-character OP_CHAR[I] if it's positive, or OP_NOT[I] if it's negative.
@@ -4533,8 +4534,6 @@
ptr++;
zeroreqchar = reqchar;
- /* The OP_NOT[I] opcodes work on single characters only. */
-
if (negate_class)
{
if (firstchar == REQ_UNSET) firstchar = REQ_NONE;
@@ -4804,21 +4803,22 @@
/* Now handle repetition for the different types of item. */
/* If previous was a character or negated character match, abolish the item
- and generate a repeat item instead. If a char item has a minumum of more
- than one, ensure that it is set in reqchar - it might not be if a sequence
- such as x{3} is the first thing in a branch because the x will have gone
+ and generate a repeat item instead. If a char item has a minimum of more
+ than one, ensure that it is set in reqchar - it might not be if a sequence
+ such as x{3} is the first thing in a branch because the x will have gone
into firstchar instead. */
if (*previous == OP_CHAR || *previous == OP_CHARI
|| *previous == OP_NOT || *previous == OP_NOTI)
{
- switch (*previous) {
- default: /* Make compiler happy. */
- case OP_CHAR: op_type = OP_STAR - OP_STAR; break;
- case OP_CHARI: op_type = OP_STARI - OP_STAR; break;
- case OP_NOT: op_type = OP_NOTSTAR - OP_STAR; break;
- case OP_NOTI: op_type = OP_NOTSTARI - OP_STAR; break;
- }
+ switch (*previous)
+ {
+ default: /* Make compiler happy. */
+ case OP_CHAR: op_type = OP_STAR - OP_STAR; break;
+ case OP_CHARI: op_type = OP_STARI - OP_STAR; break;
+ case OP_NOT: op_type = OP_NOTSTAR - OP_STAR; break;
+ case OP_NOTI: op_type = OP_NOTSTARI - OP_STAR; break;
+ }
/* Deal with UTF characters that take up more than one character. It's
easier to write this out separately than try to macrify it. Use c to