[Pcre-svn] [585] code/trunk: Documentation update for fuzz s…

Top Page
Delete this message
Author: Subversion repository
Date:  
To: pcre-svn
Subject: [Pcre-svn] [585] code/trunk: Documentation update for fuzz support.
Revision: 585
          http://www.exim.org/viewvc/pcre2?view=rev&revision=585
Author:   ph10
Date:     2016-11-01 11:56:07 +0000 (Tue, 01 Nov 2016)
Log Message:
-----------
Documentation update for fuzz support.


Modified Paths:
--------------
    code/trunk/README
    code/trunk/doc/pcre2build.3


Modified: code/trunk/README
===================================================================
--- code/trunk/README    2016-10-31 19:04:22 UTC (rev 584)
+++ code/trunk/README    2016-11-01 11:56:07 UTC (rev 585)
@@ -204,13 +204,6 @@
   --enable-newline-is-crlf, --enable-newline-is-anycrlf, or
   --enable-newline-is-any to the "configure" command, respectively.


- If you specify --enable-newline-is-cr or --enable-newline-is-crlf, some of
- the standard tests will fail, because the lines in the test files end with
- LF. Even if the files are edited to change the line endings, there are likely
- to be some failures. With --enable-newline-is-anycrlf or
- --enable-newline-is-any, many tests should succeed, but there may be some
- failures.
-
. By default, the sequence \R in a pattern matches any Unicode line ending
sequence. This is independent of the option specifying what PCRE2 considers
to be the end of a line (see above). However, the caller of PCRE2 can
@@ -253,13 +246,13 @@
sizes in the pcre2stack man page.

. In the 8-bit library, the default maximum compiled pattern size is around
- 64K. You can increase this by adding --with-link-size=3 to the "configure"
- command. PCRE2 then uses three bytes instead of two for offsets to different
- parts of the compiled pattern. In the 16-bit library, --with-link-size=3 is
- the same as --with-link-size=4, which (in both libraries) uses four-byte
- offsets. Increasing the internal link size reduces performance in the 8-bit
- and 16-bit libraries. In the 32-bit library, the link size setting is
- ignored, as 4-byte offsets are always used.
+ 64K bytes. You can increase this by adding --with-link-size=3 to the
+ "configure" command. PCRE2 then uses three bytes instead of two for offsets
+ to different parts of the compiled pattern. In the 16-bit library,
+ --with-link-size=3 is the same as --with-link-size=4, which (in both
+ libraries) uses four-byte offsets. Increasing the internal link size reduces
+ performance in the 8-bit and 16-bit libraries. In the 32-bit library, the
+ link size setting is ignored, as 4-byte offsets are always used.

. You can build PCRE2 so that its internal match() function that is called from
pcre2_match() does not call itself recursively. Instead, it uses memory
@@ -346,7 +339,8 @@

The value must be a plain integer. The default is 20480. The amount of memory
used by pcre2grep is actually three times this number, to allow for "before"
- and "after" lines.
+ and "after" lines. If very long lines are encountered, the buffer is
+ automatically enlarged, up to a fixed maximum size.

. The default maximum size of pcre2grep's internal buffer can be set by, for
example:
@@ -378,6 +372,22 @@
If you get error messages about missing functions tgetstr, tgetent, tputs,
tgetflag, or tgoto, this is the problem, and linking with the ncurses library
should fix it.
+
+. There is a special option called --enable-fuzz-support for use by people who
+ want to run fuzzing tests on PCRE2. At present this applies only to the 8-bit
+ library. If set, it causes an extra library called libpcre2-fuzzsupport.a to
+ be built, but not installed. This contains a single function called
+ LLVMFuzzerTestOneInput() whose arguments are a pointer to a string and the
+ length of the string. When called, this function tries to compile the string
+ as a pattern, and if that succeeds, to match it. This is done both with no
+ options and with some random options bits that are generated from the string.
+ Setting --enable-fuzz-support also causes a binary called pcre2fuzzcheck to
+ be created. This is normally run under valgrind or used when PCRE2 is
+ compiled with address sanitizing enabled. It calls the fuzzing function and
+ outputs information about it is doing. The input strings are specified by
+ arguments: if an argument starts with "=" the rest of it is a literal input
+ string. Otherwise, it is assumed to be a file name, and the contents of the
+ file are the test string.

The "configure" script builds the following files for the basic C library:

@@ -553,7 +563,7 @@


Testing PCRE2
-------------
+-------------

 To test the basic PCRE2 library on a Unix-like system, run the RunTest script.
 There is another script called RunGrepTest that tests the pcre2grep command.
@@ -767,6 +777,7 @@
   src/pcre2_xclass.c       )


   src/pcre2_printint.c     debugging function that is used by pcre2test,
+  src/pcre2_fuzzsupport.c  function for (optional) fuzzing support 


   src/config.h.in          template for config.h, when built by "configure"
   src/pcre2.h.in           template for pcre2.h when built by "configure"
@@ -855,4 +866,4 @@
 Philip Hazel
 Email local part: ph10
 Email domain: cam.ac.uk
-Last updated: 07 October 2016
+Last updated: 01 November 2016


Modified: code/trunk/doc/pcre2build.3
===================================================================
--- code/trunk/doc/pcre2build.3    2016-10-31 19:04:22 UTC (rev 584)
+++ code/trunk/doc/pcre2build.3    2016-11-01 11:56:07 UTC (rev 585)
@@ -1,4 +1,4 @@
-.TH PCRE2BUILD 3 "07 October 2016" "PCRE2 10.23"
+.TH PCRE2BUILD 3 "01 November 2016" "PCRE2 10.23"
 .SH NAME
 PCRE2 - Perl-compatible regular expressions (revised API)
 .
@@ -515,6 +515,28 @@
 documentation.
 .
 .
+.SH "SUPPORT FOR FUZZERS"
+.rs
+.sp
+There is a special option for use by people who want to run fuzzing tests on
+PCRE2:
+.sp
+  --enable-fuzz-support
+.sp    
+At present this applies only to the 8-bit library. If set, it causes an extra
+library called libpcre2-fuzzsupport.a to be built, but not installed. This
+contains a single function called LLVMFuzzerTestOneInput() whose arguments are
+a pointer to a string and the length of the string. When called, this function
+tries to compile the string as a pattern, and if that succeeds, to match it.
+This is done both with no options and with some random options bits that are
+generated from the string. Setting --enable-fuzz-support also causes a binary
+called \fBpcre2fuzzcheck\fP to be created. This is normally run under valgrind
+or used when PCRE2 is compiled with address sanitizing enabled. It calls the
+fuzzing function and outputs information about it is doing. The input strings
+are specified by arguments: if an argument starts with "=" the rest of it is a
+literal input string. Otherwise, it is assumed to be a file name, and the
+contents of the file are the test string.
+.
 .SH "SEE ALSO"
 .rs
 .sp
@@ -535,6 +557,6 @@
 .rs
 .sp
 .nf
-Last updated: 07 October 2016
+Last updated: 01 November 2016
 Copyright (c) 1997-2016 University of Cambridge.
 .fi