Revision: 227
http://www.exim.org/viewvc/pcre2?view=rev&revision=227
Author: ph10
Date: 2015-03-16 15:38:26 +0000 (Mon, 16 Mar 2015)
Log Message:
-----------
Test binary zero in callout strings; change offset to PCRE2_SIZE; some
documentation tidies.
Modified Paths:
--------------
code/trunk/doc/pcre2callout.3
code/trunk/doc/pcre2test.1
code/trunk/src/pcre2.h.in
code/trunk/testdata/testinput2
code/trunk/testdata/testinput6
code/trunk/testdata/testoutput2
code/trunk/testdata/testoutput6
Modified: code/trunk/doc/pcre2callout.3
===================================================================
--- code/trunk/doc/pcre2callout.3 2015-03-15 17:49:03 UTC (rev 226)
+++ code/trunk/doc/pcre2callout.3 2015-03-16 15:38:26 UTC (rev 227)
@@ -1,4 +1,4 @@
-.TH PCRE2CALLOUT 3 "15 March 2015" "PCRE2 10.20"
+.TH PCRE2CALLOUT 3 "16 March 2015" "PCRE2 10.20"
.SH NAME
PCRE2 - Perl-compatible regular expressions (revised API)
.SH SYNOPSIS
@@ -197,8 +197,8 @@
PCRE2_SIZE \fIpattern_position\fP;
PCRE2_SIZE \fInext_item_length\fP;
PCRE2_SIZE \fIcallout_string_offset\fP;
+ PCRE2_SIZE \fIcallout_string_length\fP;
PCRE2_SPTR \fIcallout_string\fP;
- uint32_t \fIcallout_string_length\fP;
.sp
The \fIversion\fP field contains the version number of the block format. The
@@ -225,11 +225,12 @@
\fIcallout_string\fP points to the string that is contained within the compiled
pattern. Its length is given by \fIcallout_string_length\fP. Duplicated ending
delimiters that were present in the original pattern string have been turned
-into single characters. An additional code unit containing binary zero is
-present after the string, but is not included in the length. The delimiter that
-was used to start the string is also stored within the pattern, immediately
-before the string itself. You can therefore access this delimiter as
-\fIcallout_string\fP[-1] if you need it.
+into single characters, but there is no other processing of the callout string
+argument. An additional code unit containing binary zero is present after the
+string, but is not included in the length. The delimiter that was used to start
+the string is also stored within the pattern, immediately before the string
+itself. You can access this delimiter as \fIcallout_string\fP[-1] if you need
+it.
.P
The \fIcallout_string_offset\fP field is the code unit offset to the start of
the callout argument string within the original pattern string. This is
@@ -327,6 +328,6 @@
.rs
.sp
.nf
-Last updated: 15 March 2015
+Last updated: 16 March 2015
Copyright (c) 1997-2015 University of Cambridge.
.fi
Modified: code/trunk/doc/pcre2test.1
===================================================================
--- code/trunk/doc/pcre2test.1 2015-03-15 17:49:03 UTC (rev 226)
+++ code/trunk/doc/pcre2test.1 2015-03-16 15:38:26 UTC (rev 227)
@@ -1,4 +1,4 @@
-.TH PCRE2TEST 1 "14 March 2015" "PCRE 10.20"
+.TH PCRE2TEST 1 "16 March 2015" "PCRE 10.20"
.SH NAME
pcre2test - a program for testing Perl-compatible regular expressions.
.SH SYNOPSIS
@@ -61,11 +61,17 @@
.sp
Input to \fBpcre2test\fP is processed line by line, either by calling the C
library's \fBfgets()\fP function, or via the \fBlibreadline\fP library (see
-below). In Unix-like environments, \fBfgets()\fP treats any bytes other than
-newline as data characters. However, in some Windows environments character 26
-(hex 1A) causes an immediate end of file, and no further data is read. For
-maximum portability, therefore, it is safest to avoid non-printing characters
-in \fBpcre2test\fP input files.
+below). The input is processed using C's string functions, so must not
+contain binary zeroes, even though in Unix-like environments, \fBfgets()\fP
+treats any bytes other than newline as data characters. In some Windows
+environments character 26 (hex 1A) causes an immediate end of file, and no
+further data is read.
+.P
+For maximum portability, therefore, it is safest to avoid non-printing
+characters in \fBpcre2test\fP input files. There is a facility for specifying a
+pattern's characters as hexadecimal pairs, thus making it possible to include
+binary zeroes in a pattern for testing purposes. Subject lines are processed
+for backslash escapes, which makes it possible to include any data value.
.
.
.SH "COMMAND LINE OPTIONS"
@@ -1431,6 +1437,6 @@
.rs
.sp
.nf
-Last updated: 14 March 2015
+Last updated: 16 March 2015
Copyright (c) 1997-2015 University of Cambridge.
.fi
Modified: code/trunk/src/pcre2.h.in
===================================================================
--- code/trunk/src/pcre2.h.in 2015-03-15 17:49:03 UTC (rev 226)
+++ code/trunk/src/pcre2.h.in 2015-03-16 15:38:26 UTC (rev 227)
@@ -339,8 +339,8 @@
PCRE2_SIZE next_item_length; /* Length of next item in the pattern */ \
/* ------------------- Added for Version 1 -------------------------- */ \
PCRE2_SIZE callout_string_offset; /* Offset to string within pattern */ \
+ PCRE2_SIZE callout_string_length; /* Length of string compiled into pattern */ \
PCRE2_SPTR callout_string; /* String compiled into pattern */ \
- uint32_t callout_string_length; /* Length of string compiled into pattern */ \
/* ------------------------------------------------------------------ */ \
} pcre2_callout_block;
Modified: code/trunk/testdata/testinput2
===================================================================
--- code/trunk/testdata/testinput2 2015-03-15 17:49:03 UTC (rev 226)
+++ code/trunk/testdata/testinput2 2015-03-16 15:38:26 UTC (rev 227)
@@ -4224,4 +4224,9 @@
/(?:a(?C`code`)){3}X/
aaaXY
+# Binary zero in callout string
+# a ( ? C ' x z ' ) b
+/ 61 28 3f 43 27 78 00 7a 27 29 62/hex
+ abcdefgh
+
# End of testinput2
Modified: code/trunk/testdata/testinput6
===================================================================
--- code/trunk/testdata/testinput6 2015-03-15 17:49:03 UTC (rev 226)
+++ code/trunk/testdata/testinput6 2015-03-16 15:38:26 UTC (rev 227)
@@ -4841,4 +4841,9 @@
/(?:a(?C`code`)){3}X/
aaaXY
+# Binary zero in callout string
+# a ( ? C ' x z ' ) b
+/ 61 28 3f 43 27 78 00 7a 27 29 62/hex
+ abcdefgh
+
# End of testinput6
Modified: code/trunk/testdata/testoutput2
===================================================================
--- code/trunk/testdata/testoutput2 2015-03-15 17:49:03 UTC (rev 226)
+++ code/trunk/testdata/testoutput2 2015-03-16 15:38:26 UTC (rev 227)
@@ -14169,4 +14169,13 @@
^ ^ )
0: aaaX
+# Binary zero in callout string
+# a ( ? C ' x z ' ) b
+/ 61 28 3f 43 27 78 00 7a 27 29 62/hex
+ abcdefgh
+Callout (5): 'x\x00z'
+--->abcdefgh
+ ^^ b
+ 0: ab
+
# End of testinput2
Modified: code/trunk/testdata/testoutput6
===================================================================
--- code/trunk/testdata/testoutput6 2015-03-15 17:49:03 UTC (rev 226)
+++ code/trunk/testdata/testoutput6 2015-03-16 15:38:26 UTC (rev 227)
@@ -7910,4 +7910,13 @@
^ ^ )
0: aaaX
+# Binary zero in callout string
+# a ( ? C ' x z ' ) b
+/ 61 28 3f 43 27 78 00 7a 27 29 62/hex
+ abcdefgh
+Callout (5): 'x\x00z'
+--->abcdefgh
+ ^^ b
+ 0: ab
+
# End of testinput6
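
Note on the callout change above: a callout string may contain a binary zero (the new test prints 'x\x00z'), so code that reads it should rely on callout_string_length rather than on the terminating zero. The following is a minimal sketch of that idea in plain C. It is not code from PCRE2 or pcre2test. The helper name escape_counted_string and its buffer interface are hypothetical, and 8-bit code units are assumed. It only shows one way to render a length-delimited string with non-printing bytes escaped as \xhh.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper (not part of the PCRE2 API).
 * Copies len bytes from str into out, writing printable ASCII as-is and
 * every other byte, including binary zero, as a four-character \xhh escape.
 * The length is taken from len, never from a terminating zero.
 * Returns the number of characters written (excluding the final NUL),
 * or -1 if out is too small to hold the result. */
static int escape_counted_string(const unsigned char *str, size_t len,
                                 char *out, size_t outsize)
{
  size_t used = 0;

  assert(str != NULL || len == 0);
  assert(out != NULL);

  for (size_t i = 0; i < len; i++)
    {
    unsigned char c = str[i];
    int printable = (c >= 0x20 && c < 0x7f);
    size_t need = printable ? 1 : 4;

    /* Always keep one byte in reserve for the final NUL. */
    if (used + need + 1 > outsize) return -1;

    if (printable)
      out[used++] = (char)c;
    else
      {
      /* Writes exactly four characters plus a NUL that is overwritten
         by the next iteration or by the final terminator. */
      snprintf(out + used, 5, "\\x%02x", c);
      used += 4;
      }
    }

  if (used + 1 > outsize) return -1;
  out[used] = '\0';
  return (int)used;
}
```

In a callout function for the 8-bit library, the callout_string and callout_string_length fields of the pcre2_callout_block would presumably be passed as the first two arguments. A length taken from strlen() would stop at the embedded zero and report 1 for the string in the test above, not 3.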