[Pcre-svn] [227] code/trunk: Test binary zero in callout str…

Author: Subversion repository
To: pcre-svn
Subject: [Pcre-svn] [227] code/trunk: Test binary zero in callout strings; change offset to PCRE2_SIZE; some
Revision: 227
          http://www.exim.org/viewvc/pcre2?view=rev&revision=227
Author:   ph10
Date:     2015-03-16 15:38:26 +0000 (Mon, 16 Mar 2015)


Log Message:
-----------
Test binary zero in callout strings; change offset to PCRE2_SIZE; some
documentation tidies.

Modified Paths:
--------------
    code/trunk/doc/pcre2callout.3
    code/trunk/doc/pcre2test.1
    code/trunk/src/pcre2.h.in
    code/trunk/testdata/testinput2
    code/trunk/testdata/testinput6
    code/trunk/testdata/testoutput2
    code/trunk/testdata/testoutput6


Modified: code/trunk/doc/pcre2callout.3
===================================================================
--- code/trunk/doc/pcre2callout.3    2015-03-15 17:49:03 UTC (rev 226)
+++ code/trunk/doc/pcre2callout.3    2015-03-16 15:38:26 UTC (rev 227)
@@ -1,4 +1,4 @@
-.TH PCRE2CALLOUT 3 "15 March 2015" "PCRE2 10.20"
+.TH PCRE2CALLOUT 3 "16 March 2015" "PCRE2 10.20"
 .SH NAME
 PCRE2 - Perl-compatible regular expressions (revised API)
 .SH SYNOPSIS
@@ -197,8 +197,8 @@
   PCRE2_SIZE    \fIpattern_position\fP;
   PCRE2_SIZE    \fInext_item_length\fP;
   PCRE2_SIZE    \fIcallout_string_offset\fP;
+  PCRE2_SIZE    \fIcallout_string_length\fP; 
   PCRE2_SPTR    \fIcallout_string\fP; 
-  uint32_t      \fIcallout_string_length\fP; 


.sp
The \fIversion\fP field contains the version number of the block format. The
@@ -225,11 +225,12 @@
\fIcallout_string\fP points to the string that is contained within the compiled
pattern. Its length is given by \fIcallout_string_length\fP. Duplicated ending
delimiters that were present in the original pattern string have been turned
-into single characters. An additional code unit containing binary zero is
-present after the string, but is not included in the length. The delimiter that
-was used to start the string is also stored within the pattern, immediately
-before the string itself. You can therefore access this delimiter as
-\fIcallout_string\fP[-1] if you need it.
+into single characters, but there is no other processing of the callout string
+argument. An additional code unit containing binary zero is present after the
+string, but is not included in the length. The delimiter that was used to start
+the string is also stored within the pattern, immediately before the string
+itself. You can access this delimiter as \fIcallout_string\fP[-1] if you need
+it.
.P
The \fIcallout_string_offset\fP field is the code unit offset to the start of
the callout argument string within the original pattern string. This is
@@ -327,6 +328,6 @@
.rs
.sp
.nf
-Last updated: 15 March 2015
+Last updated: 16 March 2015
Copyright (c) 1997-2015 University of Cambridge.
.fi
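
Note on the documentation change above: because a callout string may now contain a binary zero, a callout function has to rely on callout_string_length rather than on the terminating zero. The following is a minimal sketch of that idea, not code from PCRE2 or pcre2test. The helper name escape_bytes is hypothetical, and the sketch assumes the 8-bit library, where code units are bytes. It renders non-printing bytes as \xhh, in the same style as the 'x\x00z' line in the test output of this commit.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper (not part of PCRE2): write an escaped copy of the
   `len` bytes starting at `str` into `out`, rendering every byte outside
   the printable ASCII range as \xhh. The length is taken from `len`, never
   from a terminating zero, so embedded binary zeros are preserved.
   Returns 0 on success, or -1 if `out` is too small. */
static int escape_bytes(const unsigned char *str, size_t len,
                        char *out, size_t outsize)
{
  size_t used = 0;

  for (size_t i = 0; i < len; i++)
    {
    unsigned char c = str[i];
    int printable = (c >= 0x20 && c < 0x7f);
    size_t need = printable ? 1 : 4;

    /* Always leave room for the terminating zero of the output. */
    if (used + need + 1 > outsize) return -1;

    if (printable)
      out[used++] = (char)c;
    else
      {
      snprintf(out + used, outsize - used, "\\x%02x", (unsigned int)c);
      used += 4;
      }
    }

  if (used + 1 > outsize) return -1;
  out[used] = '\0';
  return 0;
}
```

Inside a real callout function, one would presumably pass the callout_string and callout_string_length fields of the callout block to such a helper. Calling strlen() on the same string would stop at the first binary zero and report a length of 1 for 'x\0z'.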

Modified: code/trunk/doc/pcre2test.1
===================================================================
--- code/trunk/doc/pcre2test.1    2015-03-15 17:49:03 UTC (rev 226)
+++ code/trunk/doc/pcre2test.1    2015-03-16 15:38:26 UTC (rev 227)
@@ -1,4 +1,4 @@
-.TH PCRE2TEST 1 "14 March 2015" "PCRE 10.20"
+.TH PCRE2TEST 1 "16 March 2015" "PCRE 10.20"
 .SH NAME
 pcre2test - a program for testing Perl-compatible regular expressions.
 .SH SYNOPSIS
@@ -61,11 +61,17 @@
 .sp
 Input to \fBpcre2test\fP is processed line by line, either by calling the C
 library's \fBfgets()\fP function, or via the \fBlibreadline\fP library (see
-below). In Unix-like environments, \fBfgets()\fP treats any bytes other than
-newline as data characters. However, in some Windows environments character 26
-(hex 1A) causes an immediate end of file, and no further data is read. For
-maximum portability, therefore, it is safest to avoid non-printing characters
-in \fBpcre2test\fP input files.
+below). The input is processed using using C's string functions, so must not
+contain binary zeroes, even though in Unix-like environments, \fBfgets()\fP
+treats any bytes other than newline as data characters. In some Windows
+environments character 26 (hex 1A) causes an immediate end of file, and no
+further data is read. 
+.P
+For maximum portability, therefore, it is safest to avoid non-printing
+characters in \fBpcre2test\fP input files. There is a facility for specifying a
+pattern's characters as hexadecimal pairs, thus making it possible to include
+binary zeroes in a pattern for testing purposes. Subject lines are processed
+for backslash escapes, which makes it possible to include any data value.
 .
 .
 .SH "COMMAND LINE OPTIONS"
@@ -1431,6 +1437,6 @@
 .rs
 .sp
 .nf
-Last updated: 14 March 2015
+Last updated: 16 March 2015
 Copyright (c) 1997-2015 University of Cambridge.
 .fi


Modified: code/trunk/src/pcre2.h.in
===================================================================
--- code/trunk/src/pcre2.h.in    2015-03-15 17:49:03 UTC (rev 226)
+++ code/trunk/src/pcre2.h.in    2015-03-16 15:38:26 UTC (rev 227)
@@ -339,8 +339,8 @@
   PCRE2_SIZE    next_item_length;  /* Length of next item in the pattern */ \
   /* ------------------- Added for Version 1 -------------------------- */ \
   PCRE2_SIZE    callout_string_offset; /* Offset to string within pattern */ \
+  PCRE2_SIZE    callout_string_length; /* Length of string compiled into pattern */ \
   PCRE2_SPTR    callout_string;    /* String compiled into pattern */ \
-  uint32_t      callout_string_length; /* Length of string compiled into pattern */ \
   /* ------------------------------------------------------------------ */ \
 } pcre2_callout_block;



Modified: code/trunk/testdata/testinput2
===================================================================
--- code/trunk/testdata/testinput2    2015-03-15 17:49:03 UTC (rev 226)
+++ code/trunk/testdata/testinput2    2015-03-16 15:38:26 UTC (rev 227)
@@ -4224,4 +4224,9 @@
 /(?:a(?C`code`)){3}X/
     aaaXY


+# Binary zero in callout string
+#  a  (  ?  C  '  x     z  '  )  b
+/ 61 28 3f 43 27 78 00 7a 27 29 62/hex
+    abcdefgh
+
 # End of testinput2 
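
Note on the new test above: the pattern is written as hexadecimal pairs because a pcre2test input line cannot itself contain a binary zero. The sketch below shows how such a line decodes to the bytes named in the test's comment line. It is an illustration only, not pcre2test's own decoder, and the helper names hex_value and decode_hex_pairs are hypothetical.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: numeric value of one hex digit, or -1 if the
   character is not a hex digit. */
static int hex_value(int c)
{
  if (c >= '0' && c <= '9') return c - '0';
  if (c >= 'a' && c <= 'f') return c - 'a' + 10;
  if (c >= 'A' && c <= 'F') return c - 'A' + 10;
  return -1;
}

/* Hypothetical helper: decode whitespace-separated hex pairs from `text`
   into `out`. Returns the number of bytes written, or -1 on a malformed
   pair or if `out` is too small. The result is a counted byte sequence,
   not a C string, so it may contain binary zeros. */
static long decode_hex_pairs(const char *text, unsigned char *out,
                             size_t outsize)
{
  size_t n = 0;

  while (*text != '\0')
    {
    if (isspace((unsigned char)*text)) { text++; continue; }

    int hi = hex_value((unsigned char)text[0]);
    int lo = (hi < 0) ? -1 : hex_value((unsigned char)text[1]);
    if (hi < 0 || lo < 0 || n >= outsize) return -1;

    out[n++] = (unsigned char)(hi * 16 + lo);
    text += 2;
    }

  return (long)n;
}
```

Applied to the text between the delimiters in the test, this yields the 11 bytes of a(?C'x\0z')b, with the binary zero as the seventh byte. The decoded pattern therefore has to be handled with an explicit length rather than as a zero-terminated string.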


Modified: code/trunk/testdata/testinput6
===================================================================
--- code/trunk/testdata/testinput6    2015-03-15 17:49:03 UTC (rev 226)
+++ code/trunk/testdata/testinput6    2015-03-16 15:38:26 UTC (rev 227)
@@ -4841,4 +4841,9 @@
 /(?:a(?C`code`)){3}X/
     aaaXY


+# Binary zero in callout string
+#  a  (  ?  C  '  x     z  '  )  b
+/ 61 28 3f 43 27 78 00 7a 27 29 62/hex
+    abcdefgh
+
 # End of testinput6


Modified: code/trunk/testdata/testoutput2
===================================================================
--- code/trunk/testdata/testoutput2    2015-03-15 17:49:03 UTC (rev 226)
+++ code/trunk/testdata/testoutput2    2015-03-16 15:38:26 UTC (rev 227)
@@ -14169,4 +14169,13 @@
     ^  ^      )
  0: aaaX


+# Binary zero in callout string
+#  a  (  ?  C  '  x     z  '  )  b
+/ 61 28 3f 43 27 78 00 7a 27 29 62/hex
+    abcdefgh
+Callout (5): 'x\x00z'
+--->abcdefgh
+    ^^           b
+ 0: ab
+
 # End of testinput2 


Modified: code/trunk/testdata/testoutput6
===================================================================
--- code/trunk/testdata/testoutput6    2015-03-15 17:49:03 UTC (rev 226)
+++ code/trunk/testdata/testoutput6    2015-03-16 15:38:26 UTC (rev 227)
@@ -7910,4 +7910,13 @@
     ^  ^      )
  0: aaaX


+# Binary zero in callout string
+#  a  (  ?  C  '  x     z  '  )  b
+/ 61 28 3f 43 27 78 00 7a 27 29 62/hex
+    abcdefgh
+Callout (5): 'x\x00z'
+--->abcdefgh
+    ^^           b
+ 0: ab
+
 # End of testinput6