[Pcre-svn] [1118] code/trunk/doc: Documentation update.

From: Subversion repository
To: pcre-svn
Revision: 1118
          http://www.exim.org/viewvc/pcre2?view=rev&revision=1118
Author:   ph10
Date:     2019-06-22 17:36:15 +0100 (Sat, 22 Jun 2019)
Log Message:
-----------
Documentation update.


Modified Paths:
--------------
    code/trunk/doc/html/pcre2pattern.html
    code/trunk/doc/pcre2-config.txt
    code/trunk/doc/pcre2.txt
    code/trunk/doc/pcre2grep.txt
    code/trunk/doc/pcre2pattern.3
    code/trunk/doc/pcre2test.txt


Modified: code/trunk/doc/html/pcre2pattern.html
===================================================================
--- code/trunk/doc/html/pcre2pattern.html    2019-06-21 16:10:17 UTC (rev 1117)
+++ code/trunk/doc/html/pcre2pattern.html    2019-06-22 16:36:15 UTC (rev 1118)
@@ -3525,9 +3525,10 @@
 instead of skipping on to "c".
 </P>
 <P>
-If (*SKIP) is used inside a lookbehind to specify a new starting position that
-is not later than the starting point of the current match, the position 
-specified by (*SKIP) is ignored, and instead the normal "bumpalong" occurs.
+If (*SKIP) is used to specify a new starting position that is the same as the
+starting position of the current match, or (by being inside a lookbehind)
+earlier, the position specified by (*SKIP) is ignored, and instead the normal
+"bumpalong" occurs.
 <pre>
   (*SKIP:NAME)
 </pre>
@@ -3754,7 +3755,7 @@
 </P>
 <br><a name="SEC31" href="#TOC1">REVISION</a><br>
 <P>
-Last updated: 21 June 2019
+Last updated: 22 June 2019
 <br>
 Copyright &copy; 1997-2019 University of Cambridge.
 <br>


Modified: code/trunk/doc/pcre2-config.txt
===================================================================
--- code/trunk/doc/pcre2-config.txt    2019-06-21 16:10:17 UTC (rev 1117)
+++ code/trunk/doc/pcre2-config.txt    2019-06-22 16:36:15 UTC (rev 1118)
@@ -16,8 +16,8 @@


        pcre2-config returns the configuration of the installed PCRE2 libraries
        and the options required to compile a program to use them. Some of  the
-       options  apply  only  to  the  8-bit,  or  16-bit, or 32-bit libraries,
-       respectively, and are not available for libraries that  have  not  been
+       options  apply  only  to the 8-bit, or 16-bit, or 32-bit libraries, re-
+       spectively, and are not available for  libraries  that  have  not  been
        built. If an unavailable option is encountered, the "usage" information
        is output.


@@ -36,30 +36,30 @@
        --version Writes the version number of the installed PCRE2 libraries to
                  the standard output.


-       --libs8   Writes  to  the  standard  output  the  command  line options
-                 required to link with the 8-bit PCRE2 library  (-lpcre2-8  on
+       --libs8   Writes  to  the  standard output the command line options re-
+                 quired to link with the 8-bit  PCRE2  library  (-lpcre2-8  on
                  many systems).


-       --libs16  Writes  to  the  standard  output  the  command  line options
-                 required to link with the 16-bit PCRE2 library (-lpcre2-16 on
+       --libs16  Writes  to  the  standard output the command line options re-
+                 quired to link with the 16-bit PCRE2 library  (-lpcre2-16  on
                  many systems).


-       --libs32  Writes  to  the  standard  output  the  command  line options
-                 required to link with the 32-bit PCRE2 library (-lpcre2-32 on
+       --libs32  Writes  to  the  standard output the command line options re-
+                 quired to link with the 32-bit PCRE2 library  (-lpcre2-32  on
                  many systems).


        --libs-posix
-                 Writes  to  the  standard  output  the  command  line options
-                 required to link  with  PCRE2's  POSIX  API  wrapper  library
+                 Writes  to  the  standard output the command line options re-
+                 quired  to  link  with  PCRE2's  POSIX  API  wrapper  library
                  (-lpcre2-posix -lpcre2-8 on many systems).


-       --cflags  Writes  to  the  standard  output  the  command  line options
-                 required to compile files that use PCRE2  (this  may  include
-                 some -I options, but is blank on many systems).
+       --cflags  Writes  to  the  standard output the command line options re-
+                 quired to compile files that use PCRE2 (this may include some
+                 -I options, but is blank on many systems).


        --cflags-posix
-                 Writes  to  the  standard  output  the  command  line options
-                 required to compile files that use PCRE2's POSIX API  wrapper
+                 Writes  to  the  standard output the command line options re-
+                 quired to compile files that use PCRE2's  POSIX  API  wrapper
                  library  (this  may  include some -I options, but is blank on
                  many systems).



Modified: code/trunk/doc/pcre2.txt
===================================================================
--- code/trunk/doc/pcre2.txt    2019-06-21 16:10:17 UTC (rev 1117)
+++ code/trunk/doc/pcre2.txt    2019-06-22 16:36:15 UTC (rev 1118)
@@ -31,8 +31,8 @@
        appeared in Python and the original PCRE before they appeared  in  Perl
        are  available  using the Python syntax. There is also some support for
        one or two .NET and Oniguruma syntax items, and there are  options  for
-       requesting   some  minor  changes  that  give  better  ECMAScript  (aka
-       JavaScript) compatibility.
+       requesting  some  minor  changes that give better ECMAScript (aka Java-
+       Script) compatibility.


        The source code for PCRE2 can be compiled to support 8-bit, 16-bit,  or
        32-bit  code units, which means that up to three separate libraries may
@@ -111,9 +111,9 @@


        The use of the \C escape sequence in a UTF-8 or UTF-16 pattern can lead
        to  problems,  because  it  may leave the current matching point in the
-       middle of  a  multi-code-unit  character.  The  PCRE2_NEVER_BACKSLASH_C
-       option can be used by an application to lock out the use of \C, causing
-       a compile-time error if it is encountered. It is also possible to build
+       middle of a multi-code-unit character. The PCRE2_NEVER_BACKSLASH_C  op-
+       tion can be used by an application to lock out the use of \C, causing a
+       compile-time error if it is encountered. It is also possible  to  build
        PCRE2 with the use of \C permanently disabled.


        Another  way  that  performance can be hit is by running a pattern that
@@ -451,8 +451,8 @@
        void pcre2_converted_pattern_free(PCRE2_UCHAR *converted_pattern);


        These functions provide a way of  converting  non-PCRE2  patterns  into
-       patterns  that  can  be  processed by pcre2_compile(). This facility is
-       experimental and may be changed in future releases. At present, "globs"
+       patterns that can be processed by pcre2_compile(). This facility is ex-
+       perimental and may be changed in future releases. At  present,  "globs"
        and  POSIX  basic  and  extended patterns can be converted. Details are
        given in the pcre2convert documentation.


@@ -543,22 +543,22 @@

        The compiling and matching functions recognize various options that are
        passed as bits in an options argument. There are also some more compli-
-       cated  parameters  such  as  custom  memory  management  functions  and
-       resource  limits  that  are passed in "contexts" (which are just memory
+       cated parameters such as custom memory  management  functions  and  re-
+       source  limits  that  are  passed  in "contexts" (which are just memory
        blocks, described below). Simple applications do not need to  make  use
        of contexts.


        Just-in-time  (JIT)  compiler  support  is an optional feature of PCRE2
        that can be built in  appropriate  hardware  environments.  It  greatly
-       speeds  up  the  matching  performance  of  many patterns. Programs can
-       request that it be used if  available  by  calling  pcre2_jit_compile()
-       after a pattern has been successfully compiled by pcre2_compile(). This
-       does nothing if JIT support is not available.
+       speeds  up  the matching performance of many patterns. Programs can re-
+       quest that it be used if available by calling pcre2_jit_compile() after
+       a  pattern has been successfully compiled by pcre2_compile(). This does
+       nothing if JIT support is not available.


        More complicated programs might need to  make  use  of  the  specialist
        functions    pcre2_jit_stack_create(),    pcre2_jit_stack_free(),   and
-       pcre2_jit_stack_assign() in order to  control  the  JIT  code's  memory
-       usage.
+       pcre2_jit_stack_assign() in order to control the JIT code's memory  us-
+       age.


        JIT matching is automatically used by pcre2_match() if it is available,
        unless the PCRE2_NO_JIT option is set. There is also a direct interface
@@ -570,13 +570,12 @@
        patible, is also provided. This uses  a  different  algorithm  for  the
        matching.  The  alternative  algorithm finds all possible matches (at a
        given point in the subject), and scans the subject  just  once  (unless
-       there  are  lookaround  assertions).  However,  this algorithm does not
-       return captured substrings. A description of  the  two  matching  algo-
-       rithms   and  their  advantages  and  disadvantages  is  given  in  the
-       pcre2matching   documentation.   There   is   no   JIT   support    for
-       pcre2_dfa_match().
+       there  are lookaround assertions). However, this algorithm does not re-
+       turn captured substrings. A description of the two matching  algorithms
+       and  their  advantages  and disadvantages is given in the pcre2matching
+       documentation. There is no JIT support for pcre2_dfa_match().


-       In  addition  to  the  main compiling and matching functions, there are
+       In addition to the main compiling and  matching  functions,  there  are
        convenience functions for extracting captured substrings from a subject
        string that has been matched by pcre2_match(). They are:


@@ -590,35 +589,35 @@
          pcre2_substring_nametable_scan()
          pcre2_substring_number_from_name()


-       pcre2_substring_free()  and  pcre2_substring_list_free()  are also pro-
-       vided, to free memory used for extracted strings. If  either  of  these
-       functions  is called with a NULL argument, the function returns immedi-
+       pcre2_substring_free() and pcre2_substring_list_free()  are  also  pro-
+       vided,  to  free  memory used for extracted strings. If either of these
+       functions is called with a NULL argument, the function returns  immedi-
        ately without doing anything.


-       The function pcre2_substitute() can be called to match  a  pattern  and
-       return  a  copy of the subject string with substitutions for parts that
+       The  function  pcre2_substitute()  can be called to match a pattern and
+       return a copy of the subject string with substitutions for  parts  that
        were matched.


-       Functions whose names begin with pcre2_serialize_ are used  for  saving
+       Functions  whose  names begin with pcre2_serialize_ are used for saving
        compiled patterns on disc or elsewhere, and reloading them later.


-       Finally,  there  are functions for finding out information about a com-
-       piled pattern (pcre2_pattern_info()) and about the  configuration  with
+       Finally, there are functions for finding out information about  a  com-
+       piled  pattern  (pcre2_pattern_info()) and about the configuration with
        which PCRE2 was built (pcre2_config()).


-       Functions  with  names  ending with _free() are used for freeing memory
-       blocks of various sorts. In all cases, if one  of  these  functions  is
+       Functions with names ending with _free() are used  for  freeing  memory
+       blocks  of  various  sorts.  In all cases, if one of these functions is
        called with a NULL argument, it does nothing.



STRING LENGTHS AND OFFSETS

-       The  PCRE2  API  uses  string  lengths and offsets into strings of code
-       units in several places. These values are always  of  type  PCRE2_SIZE,
-       which  is an unsigned integer type, currently always defined as size_t.
-       The largest  value  that  can  be  stored  in  such  a  type  (that  is
-       ~(PCRE2_SIZE)0)  is reserved as a special indicator for zero-terminated
-       strings and unset offsets.  Therefore, the longest string that  can  be
+       The PCRE2 API uses string lengths and  offsets  into  strings  of  code
+       units  in  several  places. These values are always of type PCRE2_SIZE,
+       which is an unsigned integer type, currently always defined as  size_t.
+       The  largest  value  that  can  be  stored  in  such  a  type  (that is
+       ~(PCRE2_SIZE)0) is reserved as a special indicator for  zero-terminated
+       strings  and  unset offsets.  Therefore, the longest string that can be
        handled is one less than this maximum.



@@ -625,31 +624,31 @@
NEWLINES

        PCRE2 supports five different conventions for indicating line breaks in
-       strings: a single CR (carriage return) character, a  single  LF  (line-
+       strings:  a  single  CR (carriage return) character, a single LF (line-
        feed) character, the two-character sequence CRLF, any of the three pre-
-       ceding, or any Unicode newline sequence. The Unicode newline  sequences
-       are  the  three just mentioned, plus the single characters VT (vertical
+       ceding,  or any Unicode newline sequence. The Unicode newline sequences
+       are the three just mentioned, plus the single characters  VT  (vertical
        tab, U+000B), FF (form feed, U+000C), NEL (next line, U+0085), LS (line
        separator, U+2028), and PS (paragraph separator, U+2029).


-       Each  of  the first three conventions is used by at least one operating
+       Each of the first three conventions is used by at least  one  operating
        system as its standard newline sequence. When PCRE2 is built, a default
        can be specified.  If it is not, the default is set to LF, which is the
-       Unix standard. However, the newline convention can  be  changed  by  an
-       application  when  calling  pcre2_compile(),  or it can be specified by
-       special text at the start of the pattern  itself;  this  overrides  any
-       other  settings.  See  the pcre2pattern page for details of the special
-       character sequences.
+       Unix standard. However, the newline convention can be changed by an ap-
+       plication when calling pcre2_compile(), or it can be specified by  spe-
+       cial  text at the start of the pattern itself; this overrides any other
+       settings. See the pcre2pattern page for details of the special  charac-
+       ter sequences.


-       In the PCRE2 documentation the word "newline"  is  used  to  mean  "the
+       In  the  PCRE2  documentation  the  word "newline" is used to mean "the
        character or pair of characters that indicate a line break". The choice
-       of newline convention affects the handling of the dot, circumflex,  and
+       of  newline convention affects the handling of the dot, circumflex, and
        dollar metacharacters, the handling of #-comments in /x mode, and, when
-       CRLF is a recognized line ending sequence, the match position  advance-
+       CRLF  is a recognized line ending sequence, the match position advance-
        ment for a non-anchored pattern. There is more detail about this in the
        section on pcre2_match() options below.


-       The choice of newline convention does not affect the interpretation  of
+       The  choice of newline convention does not affect the interpretation of
        the \n or \r escape sequences, nor does it affect what \R matches; this
        has its own separate convention.


@@ -656,12 +655,12 @@

MULTITHREADING

-       In a multithreaded application it is important to keep  thread-specific
-       data  separate  from data that can be shared between threads. The PCRE2
-       library code itself is thread-safe: it contains  no  static  or  global
-       variables.  The  API  is  designed to be fairly simple for non-threaded
-       applications while at the same time ensuring that multithreaded  appli-
-       cations can use it.
+       In  a multithreaded application it is important to keep thread-specific
+       data separate from data that can be shared between threads.  The  PCRE2
+       library  code  itself  is  thread-safe: it contains no static or global
+       variables. The API is designed to be fairly simple for non-threaded ap-
+       plications  while at the same time ensuring that multithreaded applica-
+       tions can use it.


        There are several different blocks of data that are used to pass infor-
        mation between the application and the PCRE2 libraries.
@@ -668,19 +667,19 @@


    The compiled pattern


-       A pointer to the compiled form of a pattern is  returned  to  the  user
+       A  pointer  to  the  compiled form of a pattern is returned to the user
        when pcre2_compile() is successful. The data in the compiled pattern is
-       fixed, and does not change when the pattern is matched.  Therefore,  it
-       is  thread-safe, that is, the same compiled pattern can be used by more
+       fixed,  and  does not change when the pattern is matched. Therefore, it
+       is thread-safe, that is, the same compiled pattern can be used by  more
        than one thread simultaneously. For example, an application can compile
        all its patterns at the start, before forking off multiple threads that
-       use them. However, if the just-in-time (JIT)  optimization  feature  is
-       being  used,  it needs separate memory stack areas for each thread. See
+       use  them.  However,  if the just-in-time (JIT) optimization feature is
+       being used, it needs separate memory stack areas for each  thread.  See
        the pcre2jit documentation for more details.


-       In a more complicated situation, where patterns are compiled only  when
-       they  are  first needed, but are still shared between threads, pointers
-       to compiled patterns must be protected  from  simultaneous  writing  by
+       In  a more complicated situation, where patterns are compiled only when
+       they are first needed, but are still shared between  threads,  pointers
+       to  compiled  patterns  must  be protected from simultaneous writing by
        multiple threads, at least until a pattern has been compiled. The logic
        can be something like this:


@@ -693,65 +692,65 @@
          Release the lock
          Use pointer in pcre2_match()


-       Of course, testing for compilation errors should also  be  included  in
+       Of  course,  testing  for compilation errors should also be included in
        the code.


        If JIT is being used, but the JIT compilation is not being done immedi-
-       ately, (perhaps waiting to see if the pattern  is  used  often  enough)
+       ately,  (perhaps  waiting  to  see if the pattern is used often enough)
        similar logic is required. JIT compilation updates a pointer within the
-       compiled code block, so a thread must gain unique write access  to  the
-       pointer     before    calling    pcre2_jit_compile().    Alternatively,
-       pcre2_code_copy()  or  pcre2_code_copy_with_tables()  can  be  used  to
-       obtain  a private copy of the compiled code before calling the JIT com-
+       compiled  code  block, so a thread must gain unique write access to the
+       pointer    before    calling    pcre2_jit_compile().     Alternatively,
+       pcre2_code_copy()  or  pcre2_code_copy_with_tables() can be used to ob-
+       tain a private copy of the compiled code before calling  the  JIT  com-
        piler.


    Context blocks


-       The next main section below introduces the idea of "contexts" in  which
+       The  next main section below introduces the idea of "contexts" in which
        PCRE2 functions are called. A context is nothing more than a collection
        of parameters that control the way PCRE2 operates. Grouping a number of
        parameters together in a context is a convenient way of passing them to
-       a PCRE2 function without using lots of arguments. The  parameters  that
-       are  stored  in  contexts  are in some sense "advanced features" of the
+       a  PCRE2  function without using lots of arguments. The parameters that
+       are stored in contexts are in some sense  "advanced  features"  of  the
        API. Many straightforward applications will not need to use contexts.


        In a multithreaded application, if the parameters in a context are val-
-       ues  that  are  never  changed, the same context can be used by all the
+       ues that are never changed, the same context can be  used  by  all  the
        threads. However, if any thread needs to change any value in a context,
        it must make its own thread-specific copy.


    Match blocks


-       The  matching  functions need a block of memory for storing the results
+       The matching functions need a block of memory for storing  the  results
        of a match. This includes details of what was matched, as well as addi-
-       tional  information  such as the name of a (*MARK) setting. Each thread
+       tional information such as the name of a (*MARK) setting.  Each  thread
        must provide its own copy of this memory.



PCRE2 CONTEXTS

-       Some PCRE2 functions have a lot of parameters, many of which  are  used
-       only  by  specialist  applications,  for example, those that use custom
-       memory management or non-standard character tables.  To  keep  function
-       argument  lists  at a reasonable size, and at the same time to keep the
-       API extensible, "uncommon" parameters are passed to  certain  functions
-       in  a  context instead of directly. A context is just a block of memory
-       that holds the parameter values.  Applications  that  do  not  need  to
-       adjust  any  of  the  context  parameters  can pass NULL when a context
-       pointer is required.
+       Some  PCRE2  functions have a lot of parameters, many of which are used
+       only by specialist applications, for example,  those  that  use  custom
+       memory  management  or  non-standard character tables. To keep function
+       argument lists at a reasonable size, and at the same time to  keep  the
+       API  extensible,  "uncommon" parameters are passed to certain functions
+       in a context instead of directly. A context is just a block  of  memory
+       that  holds the parameter values.  Applications that do not need to ad-
+       just any of the context parameters can pass NULL when a context pointer
+       is required.


-       There are three different types of context: a general context  that  is
-       relevant  for  several  PCRE2 operations, a compile-time context, and a
+       There  are  three different types of context: a general context that is
+       relevant for several PCRE2 operations, a compile-time  context,  and  a
        match-time context.


    The general context


-       At present, this context just  contains  pointers  to  (and  data  for)
-       external  memory  management  functions  that  are  called from several
-       places in the PCRE2 library. The context is named `general' rather than
-       specifically  `memory'  because in future other fields may be added. If
-       you do not want to supply your own custom memory management  functions,
-       you  do not need to bother with a general context. A general context is
+       At  present,  this context just contains pointers to (and data for) ex-
+       ternal memory management functions that are called from several  places
+       in  the  PCRE2  library.  The  context  is  named `general' rather than
+       specifically `memory' because in future other fields may be  added.  If
+       you  do not want to supply your own custom memory management functions,
+       you do not need to bother with a general context. A general context  is
        created by:


        pcre2_general_context *pcre2_general_context_create(
@@ -758,7 +757,7 @@
          void *(*private_malloc)(PCRE2_SIZE, void *),
          void (*private_free)(void *, void *), void *memory_data);


-       The two function pointers specify custom memory  management  functions,
+       The  two  function pointers specify custom memory management functions,
        whose prototypes are:


          void *private_malloc(PCRE2_SIZE, void *);
@@ -766,16 +765,16 @@


        Whenever code in PCRE2 calls these functions, the final argument is the
        value of memory_data. Either of the first two arguments of the creation
-       function  may be NULL, in which case the system memory management func-
-       tions malloc() and free() are used. (This is not currently  useful,  as
-       there  are  no  other  fields in a general context, but in future there
-       might be.)  The private_malloc() function  is  used  (if  supplied)  to
-       obtain  memory  for storing the context, and all three values are saved
-       as part of the context.
+       function may be NULL, in which case the system memory management  func-
+       tions  malloc()  and free() are used. (This is not currently useful, as
+       there are no other fields in a general context,  but  in  future  there
+       might  be.)  The private_malloc() function is used (if supplied) to ob-
+       tain memory for storing the context, and all three values are saved  as
+       part of the context.


-       Whenever PCRE2 creates a data block of any kind, the block  contains  a
-       pointer  to the free() function that matches the malloc() function that
-       was used. When the time comes to  free  the  block,  this  function  is
+       Whenever  PCRE2  creates a data block of any kind, the block contains a
+       pointer to the free() function that matches the malloc() function  that
+       was  used.  When  the  time  comes  to free the block, this function is
        called.


        A general context can be copied by calling:
@@ -787,13 +786,13 @@


        void pcre2_general_context_free(pcre2_general_context *gcontext);


-       If  this  function  is  passed  a NULL argument, it returns immediately
+       If this function is passed a  NULL  argument,  it  returns  immediately
        without doing anything.


    The compile context


-       A compile context is required if you want to provide an external  func-
-       tion  for  stack  checking  during compilation or to change the default
+       A  compile context is required if you want to provide an external func-
+       tion for stack checking during compilation or  to  change  the  default
        values of any of the following compile-time parameters:


          What \R matches (Unicode newlines or CR, LF, CRLF only)
@@ -803,11 +802,11 @@
          The maximum length of the pattern string
          The extra options bits (none set by default)


-       A compile context is also required if you are using custom memory  man-
-       agement.   If  none of these apply, just pass NULL as the context argu-
+       A  compile context is also required if you are using custom memory man-
+       agement.  If none of these apply, just pass NULL as the  context  argu-
        ment of pcre2_compile().


-       A compile context is created, copied, and freed by the following  func-
+       A  compile context is created, copied, and freed by the following func-
        tions:


        pcre2_compile_context *pcre2_compile_context_create(
@@ -818,7 +817,7 @@


        void pcre2_compile_context_free(pcre2_compile_context *ccontext);


-       A  compile  context  is created with default values for its parameters.
+       A compile context is created with default values  for  its  parameters.
        These can be changed by calling the following functions, which return 0
        on success, or PCRE2_ERROR_BADDATA if invalid data is detected.


@@ -825,16 +824,16 @@
        int pcre2_set_bsr(pcre2_compile_context *ccontext,
          uint32_t value);


-       The  value  must  be PCRE2_BSR_ANYCRLF, to specify that \R matches only
-       CR, LF, or CRLF, or PCRE2_BSR_UNICODE, to specify that \R  matches  any
+       The value must be PCRE2_BSR_ANYCRLF, to specify that  \R  matches  only
+       CR,  LF,  or CRLF, or PCRE2_BSR_UNICODE, to specify that \R matches any
        Unicode line ending sequence. The value is used by the JIT compiler and
-       by  the  two  interpreted   matching   functions,   pcre2_match()   and
+       by   the   two   interpreted   matching  functions,  pcre2_match()  and
        pcre2_dfa_match().


        int pcre2_set_character_tables(pcre2_compile_context *ccontext,
          const unsigned char *tables);


-       The  value  must  be  the result of a call to pcre2_maketables(), whose
+       The value must be the result of a  call  to  pcre2_maketables(),  whose
        only argument is a general context. This function builds a set of char-
        acter tables in the current locale.


@@ -841,22 +840,22 @@
        int pcre2_set_compile_extra_options(pcre2_compile_context *ccontext,
          uint32_t extra_options);


-       As  PCRE2  has developed, almost all the 32 option bits that are avail-
-       able in the options argument of pcre2_compile() have been used  up.  To
-       avoid  running  out, the compile context contains a set of extra option
-       bits which are used for some newer, assumed rarer, options. This  func-
-       tion  sets  those bits. It always sets all the bits (either on or off).
-       It does not modify any existing  setting.  The  available  options  are
-       defined in the section entitled "Extra compile options" below.
+       As PCRE2 has developed, almost all the 32 option bits that  are  avail-
+       able  in  the options argument of pcre2_compile() have been used up. To
+       avoid running out, the compile context contains a set of  extra  option
+       bits  which are used for some newer, assumed rarer, options. This func-
+       tion sets those bits. It always sets all the bits (either on  or  off).
+       It  does not modify any existing setting. The available options are de-
+       fined in the section entitled "Extra compile options" below.


        int pcre2_set_max_pattern_length(pcre2_compile_context *ccontext,
          PCRE2_SIZE value);


-       This  sets a maximum length, in code units, for any pattern string that
-       is compiled with this context. If the pattern is longer,  an  error  is
-       generated.   This facility is provided so that applications that accept
+       This sets a maximum length, in code units, for any pattern string  that
+       is  compiled  with  this context. If the pattern is longer, an error is
+       generated.  This facility is provided so that applications that  accept
        patterns from external sources can limit their size. The default is the
-       largest  number  that  a  PCRE2_SIZE variable can hold, which is effec-
+       largest number that a PCRE2_SIZE variable can  hold,  which  is  effec-
        tively unlimited.


        int pcre2_set_newline(pcre2_compile_context *ccontext,
@@ -863,46 +862,46 @@
          uint32_t value);


        This specifies which characters or character sequences are to be recog-
-       nized  as newlines. The value must be one of PCRE2_NEWLINE_CR (carriage
+       nized as newlines. The value must be one of PCRE2_NEWLINE_CR  (carriage
        return only), PCRE2_NEWLINE_LF (linefeed only), PCRE2_NEWLINE_CRLF (the
-       two-character  sequence  CR followed by LF), PCRE2_NEWLINE_ANYCRLF (any
-       of the above), PCRE2_NEWLINE_ANY (any  Unicode  newline  sequence),  or
+       two-character sequence CR followed by LF),  PCRE2_NEWLINE_ANYCRLF  (any
+       of  the  above),  PCRE2_NEWLINE_ANY  (any Unicode newline sequence), or
        PCRE2_NEWLINE_NUL (the NUL character, that is a binary zero).


        A pattern can override the value set in the compile context by starting
        with a sequence such as (*CRLF). See the pcre2pattern page for details.


-       When   a   pattern   is   compiled   with   the    PCRE2_EXTENDED    or
-       PCRE2_EXTENDED_MORE option, the newline convention affects the recogni-
-       tion of the end of internal comments starting  with  #.  The  value  is
-       saved  with the compiled pattern for subsequent use by the JIT compiler
-       and by  the  two  interpreted  matching  functions,  pcre2_match()  and
+       When  a  pattern  is  compiled  with  the  PCRE2_EXTENDED  or PCRE2_EX-
+       TENDED_MORE option, the newline convention affects the  recognition  of
+       the  end  of internal comments starting with #. The value is saved with
+       the compiled pattern for subsequent use by the JIT compiler and by  the
+       two     interpreted     matching     functions,    pcre2_match()    and
        pcre2_dfa_match().


        int pcre2_set_parens_nest_limit(pcre2_compile_context *ccontext,
          uint32_t value);


-       This  parameter  adjusts  the  limit,  set when PCRE2 is built (default
-       250), on the depth of parenthesis nesting  in  a  pattern.  This  limit
-       stops  rogue  patterns  using  up too much system stack when being com-
-       piled. The limit applies to parentheses of all kinds, not just  captur-
+       This parameter adjusts the limit, set  when  PCRE2  is  built  (default
+       250),  on  the  depth  of  parenthesis nesting in a pattern. This limit
+       stops rogue patterns using up too much system  stack  when  being  com-
+       piled.  The limit applies to parentheses of all kinds, not just captur-
        ing parentheses.


        int pcre2_set_compile_recursion_guard(pcre2_compile_context *ccontext,
          int (*guard_function)(uint32_t, void *), void *user_data);


-       There  is at least one application that runs PCRE2 in threads with very
-       limited system stack, where running out of stack is to  be  avoided  at
-       all  costs. The parenthesis limit above cannot take account of how much
-       stack is actually available during compilation. For  a  finer  control,
-       you  can  supply  a  function  that  is called whenever pcre2_compile()
-       starts to compile a parenthesized part of a pattern. This function  can
-       check  the  actual  stack  size  (or anything else that it wants to, of
+       There is at least one application that runs PCRE2 in threads with  very
+       limited  system  stack,  where running out of stack is to be avoided at
+       all costs. The parenthesis limit above cannot take account of how  much
+       stack  is  actually  available during compilation. For a finer control,
+       you can supply a  function  that  is  called  whenever  pcre2_compile()
+       starts  to compile a parenthesized part of a pattern. This function can
+       check the actual stack size (or anything else  that  it  wants  to,  of
        course).


-       The first argument to the callout function gives the current  depth  of
-       nesting,  and  the second is user data that is set up by the last argu-
-       ment  of  pcre2_set_compile_recursion_guard().  The  callout   function
+       The  first  argument to the callout function gives the current depth of
+       nesting, and the second is user data that is set up by the  last  argu-
+       ment   of  pcre2_set_compile_recursion_guard().  The  callout  function
        should return zero if all is well, or non-zero to force an error.
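
A guard with the signature described above could look like the following minimal sketch. The guard_limits structure and the depth_guard name are hypothetical, introduced only for illustration; the registration call is shown in a comment because this fragment does not include pcre2.h.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical user data: the deepest nesting this application allows. */
typedef struct {
  uint32_t max_depth;
} guard_limits;

/* Has the shape int (*)(uint32_t, void *) expected for a guard function.
   Returns zero to let compilation continue, non-zero to force an error. */
int depth_guard(uint32_t depth, void *user_data)
{
  const guard_limits *limits = (const guard_limits *)user_data;
  if (limits == NULL) return 0;            /* no limits supplied: allow */
  return depth > limits->max_depth ? 1 : 0;
}

/* Registration, assuming pcre2.h is included and ccontext is a valid
   compile context:
     guard_limits limits = { 64 };
     pcre2_set_compile_recursion_guard(ccontext, depth_guard, &limits);
*/
```

A real guard would more likely measure the remaining stack than compare a fixed depth; the fixed cap just keeps the sketch portable.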


    The match context
@@ -916,10 +915,10 @@
          Change the backtracking depth limit
          Set custom memory management specifically for the match


-       If  none  of  these  apply,  just  pass NULL as the context argument of
+       If none of these apply, just pass  NULL  as  the  context  argument  of
        pcre2_match(), pcre2_dfa_match(), or pcre2_jit_match().


-       A match context is created, copied, and freed by  the  following  func-
+       A  match  context  is created, copied, and freed by the following func-
        tions:


        pcre2_match_context *pcre2_match_context_create(
@@ -930,7 +929,7 @@


        void pcre2_match_context_free(pcre2_match_context *mcontext);


-       A  match  context  is  created  with default values for its parameters.
+       A match context is created with  default  values  for  its  parameters.
        These can be changed by calling the following functions, which return 0
        on success, or PCRE2_ERROR_BADDATA if invalid data is detected.


@@ -938,7 +937,7 @@
          int (*callout_function)(pcre2_callout_block *, void *),
          void *callout_data);


-       This  sets  up a callout function for PCRE2 to call at specified points
+       This sets up a callout function for PCRE2 to call at  specified  points
        during a matching operation. Details are given in the pcre2callout doc-
        umentation.


@@ -946,7 +945,7 @@
          int (*callout_function)(pcre2_substitute_callout_block *, void *),
          void *callout_data);


-       This  sets up a callout function for PCRE2 to call after each substitu-
+       This sets up a callout function for PCRE2 to call after each  substitu-
        tion made by pcre2_substitute(). Details are given in the section enti-
        tled "Creating a new string with substitutions" below.


@@ -953,12 +952,11 @@
        int pcre2_set_offset_limit(pcre2_match_context *mcontext,
          PCRE2_SIZE value);


-       The  offset_limit  parameter  limits  how  far an unanchored search can
-       advance in the subject string. The default value  is  PCRE2_UNSET.  The
-       pcre2_match()      and      pcre2_dfa_match()      functions     return
-       PCRE2_ERROR_NOMATCH if a match with a starting point before or  at  the
-       given  offset  is  not  found. The pcre2_substitute() function makes no
-       more substitutions.
+       The offset_limit parameter limits how far an unanchored search can  ad-
+       vance  in  the  subject  string.  The default value is PCRE2_UNSET. The
+       pcre2_match() and pcre2_dfa_match()  functions  return  PCRE2_ERROR_NO-
+       MATCH if a match with a starting point before or at the given offset is
+       not found. The pcre2_substitute() function makes no more substitutions.


        For example, if the pattern /abc/ is matched against "123abc"  with  an
        offset  limit  less  than 3, the result is PCRE2_ERROR_NOMATCH. A match
@@ -966,16 +964,15 @@
        pcre2_dfa_match(),  or  pcre2_substitute()  is  greater than the offset
        limit set in the match context.
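
The /abc/ example can be modelled without PCRE2 at all. The sketch below is a plain literal search, not the library's matcher, and find_with_offset_limit is a hypothetical name; it only illustrates the rule that a match must start at or before the offset limit.

```c
#include <stddef.h>
#include <string.h>

/* Returns the offset of the first occurrence of needle that starts at or
   before offset_limit, or -1 if there is none (the analogue of
   PCRE2_ERROR_NOMATCH in this model). */
long find_with_offset_limit(const char *subject, const char *needle,
                            size_t offset_limit)
{
  size_t slen = strlen(subject);
  size_t nlen = strlen(needle);
  if (nlen > slen) return -1;
  for (size_t start = 0; start + nlen <= slen && start <= offset_limit; start++) {
    if (memcmp(subject + start, needle, nlen) == 0) return (long)start;
  }
  return -1;
}
```

With subject "123abc" and needle "abc", a limit of 2 yields no match, while a limit of 3 or more finds the match at offset 3.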


-       When using this  facility,  you  must  set  the  PCRE2_USE_OFFSET_LIMIT
-       option when calling pcre2_compile() so that when JIT is in use, differ-
-       ent code can be compiled. If a match  is  started  with  a  non-default
-       match  limit when PCRE2_USE_OFFSET_LIMIT is not set, an error is gener-
-       ated.
+       When using this facility, you must set the  PCRE2_USE_OFFSET_LIMIT  op-
+       tion when calling pcre2_compile() so that when JIT is in use, different
+       code can be compiled. If a match is started with  a  non-default  match
+       limit when PCRE2_USE_OFFSET_LIMIT is not set, an error is generated.


-       The offset limit facility can be used to track progress when  searching
-       large  subject  strings or to limit the extent of global substitutions.
-       See also the PCRE2_FIRSTLINE option, which requires a  match  to  start
-       before  or  at  the first newline that follows the start of matching in
+       The  offset limit facility can be used to track progress when searching
+       large subject strings or to limit the extent of  global  substitutions.
+       See  also  the  PCRE2_FIRSTLINE option, which requires a match to start
+       before or at the first newline that follows the start  of  matching  in
        the subject. If this is set with an offset limit, a match must occur in
        the first line and also within the offset limit. In other words, which-
        ever limit comes first is used.
@@ -984,15 +981,15 @@
          uint32_t value);


        The heap_limit parameter specifies, in units of kibibytes (1024 bytes),
-       the  maximum  amount  of heap memory that pcre2_match() may use to hold
+       the maximum amount of heap memory that pcre2_match() may  use  to  hold
        backtracking information when running an interpretive match. This limit
        also applies to pcre2_dfa_match(), which may use the heap when process-
-       ing patterns with a lot of nested pattern recursion or  lookarounds  or
+       ing  patterns  with a lot of nested pattern recursion or lookarounds or
        atomic groups. This limit does not apply to matching with the JIT opti-
-       mization, which has  its  own  memory  control  arrangements  (see  the
-       pcre2jit  documentation for more details). If the limit is reached, the
-       negative error code  PCRE2_ERROR_HEAPLIMIT  is  returned.  The  default
-       limit  can be set when PCRE2 is built; if it is not, the default is set
+       mization,  which  has  its  own  memory  control  arrangements (see the
+       pcre2jit documentation for more details). If the limit is reached,  the
+       negative  error  code  PCRE2_ERROR_HEAPLIMIT  is  returned. The default
+       limit can be set when PCRE2 is built; if it is not, the default is  set
        very large and is essentially "unlimited".


        A value for the heap limit may also be supplied by an item at the start
@@ -1000,101 +997,101 @@


          (*LIMIT_HEAP=ddd)


-       where  ddd  is  a  decimal  number.  However, such a setting is ignored
-       unless ddd is less than the limit set by the  caller  of  pcre2_match()
-       or, if no such limit is set, less than the default.
+       where ddd is a decimal number. However, such a setting is  ignored  un-
+       less  ddd is less than the limit set by the caller of pcre2_match() or,
+       if no such limit is set, less than the default.


-       The  pcre2_match() function starts out using a 20KiB vector on the sys-
+       The pcre2_match() function starts out using a 20KiB vector on the  sys-
        tem stack for recording backtracking points. The more nested backtrack-
-       ing  points  there  are (that is, the deeper the search tree), the more
-       memory is needed.  Heap memory is used only if the  initial  vector  is
+       ing points there are (that is, the deeper the search  tree),  the  more
+       memory  is  needed.   Heap memory is used only if the initial vector is
        too small. If the heap limit is set to a value less than 21 (in partic-
-       ular, zero) no heap memory will be used. In this  case,  only  patterns
-       that  do not have a lot of nested backtracking can be successfully pro-
+       ular,  zero)  no  heap memory will be used. In this case, only patterns
+       that do not have a lot of nested backtracking can be successfully  pro-
        cessed.


-       Similarly, for pcre2_dfa_match(), a vector on the system stack is  used
-       when  processing pattern recursions, lookarounds, or atomic groups, and
-       only if this is not big enough is heap memory used. In this case,  too,
+       Similarly,  for pcre2_dfa_match(), a vector on the system stack is used
+       when processing pattern recursions, lookarounds, or atomic groups,  and
+       only  if this is not big enough is heap memory used. In this case, too,
        setting a value of zero disables the use of the heap.


        int pcre2_set_match_limit(pcre2_match_context *mcontext,
          uint32_t value);


-       The  match_limit  parameter  provides  a means of preventing PCRE2 from
-       using up too many computing resources when processing patterns that are
+       The match_limit parameter provides a means of preventing PCRE2 from us-
+       ing  up  too many computing resources when processing patterns that are
        not going to match, but which have a very large number of possibilities
-       in their search trees. The classic  example  is  a  pattern  that  uses
+       in  their  search  trees.  The  classic  example is a pattern that uses
        nested unlimited repeats.


-       There  is an internal counter in pcre2_match() that is incremented each
-       time round its main matching loop. If  this  value  reaches  the  match
+       There is an internal counter in pcre2_match() that is incremented  each
+       time  round  its  main  matching  loop. If this value reaches the match
        limit, pcre2_match() returns the negative value PCRE2_ERROR_MATCHLIMIT.
-       This has the effect of limiting the amount  of  backtracking  that  can
+       This  has  the  effect  of limiting the amount of backtracking that can
        take place. For patterns that are not anchored, the count restarts from
-       zero for each position in the subject string. This limit  also  applies
+       zero  for  each position in the subject string. This limit also applies
        to pcre2_dfa_match(), though the counting is done in a different way.


-       When  pcre2_match() is called with a pattern that was successfully pro-
+       When pcre2_match() is called with a pattern that was successfully  pro-
        cessed by pcre2_jit_compile(), the way in which matching is executed is
-       entirely  different. However, there is still the possibility of runaway
-       matching that goes on for a very long  time,  and  so  the  match_limit
-       value  is  also used in this case (but in a different way) to limit how
+       entirely different. However, there is still the possibility of  runaway
+       matching  that  goes  on  for  a very long time, and so the match_limit
+       value is also used in this case (but in a different way) to  limit  how
        long the matching can continue.


-       The default value for the limit can be set when  PCRE2  is  built;  the
-       default  default  is 10 million, which handles all but the most extreme
-       cases. A value for the match limit may also be supplied by an  item  at
+       The default value for the limit can be set when PCRE2 is built; the de-
+       fault default is 10 million, which handles all  but  the  most  extreme
+       cases.  A  value for the match limit may also be supplied by an item at
        the start of a pattern of the form


          (*LIMIT_MATCH=ddd)


-       where  ddd  is  a  decimal  number.  However, such a setting is ignored
-       unless ddd is less than the limit set by the caller of pcre2_match() or
+       where ddd is a decimal number. However, such a setting is  ignored  un-
+       less  ddd  is less than the limit set by the caller of pcre2_match() or
        pcre2_dfa_match() or, if no such limit is set, less than the default.
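
The rule for (*LIMIT_MATCH=ddd), which is stated the same way for (*LIMIT_HEAP=ddd) and (*LIMIT_DEPTH=ddd), amounts to taking the smaller of two values. A minimal sketch, assuming the first argument is whichever limit is in force before the pattern is consulted (the caller's limit if one was set, otherwise the default); effective_limit is a hypothetical helper, not a PCRE2 function.

```c
#include <stdint.h>

/* A pattern item can only lower the limit in force; a value that is not
   smaller than that limit is ignored. */
uint32_t effective_limit(uint32_t limit_in_force, uint32_t pattern_value)
{
  return pattern_value < limit_in_force ? pattern_value : limit_in_force;
}
```

So a pattern can make itself stricter than the application requires, but it cannot lift a limit that the application has imposed.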


        int pcre2_set_depth_limit(pcre2_match_context *mcontext,
          uint32_t value);


-       This   parameter   limits   the   depth   of   nested  backtracking  in
-       pcre2_match().  Each time a nested backtracking point is passed, a  new
+       This  parameter  limits   the   depth   of   nested   backtracking   in
+       pcre2_match().   Each time a nested backtracking point is passed, a new
        memory "frame" is used to remember the state of matching at that point.
-       Thus, this parameter indirectly limits the amount  of  memory  that  is
-       used  in  a  match.  However,  because  the size of each memory "frame"
-       depends on the number of capturing parentheses, the actual memory limit
-       varies  from pattern to pattern. This limit was more useful in versions
+       Thus,  this  parameter  indirectly  limits the amount of memory that is
+       used in a match. However, because the size of each memory  "frame"  de-
+       pends  on  the number of capturing parentheses, the actual memory limit
+       varies from pattern to pattern. This limit was more useful in  versions
        before 10.30, where function recursion was used for backtracking.


-       The depth limit is not relevant, and is ignored, when matching is  done
+       The  depth limit is not relevant, and is ignored, when matching is done
        using JIT compiled code. However, it is supported by pcre2_dfa_match(),
-       which uses it to limit the depth of nested internal recursive  function
-       calls  that implement atomic groups, lookaround assertions, and pattern
+       which  uses it to limit the depth of nested internal recursive function
+       calls that implement atomic groups, lookaround assertions, and  pattern
        recursions. This limits, indirectly, the amount of system stack that is
-       used.  It  was  more useful in versions before 10.32, when stack memory
+       used. It was more useful in versions before 10.32,  when  stack  memory
        was used for local workspace vectors for recursive function calls. From
-       version  10.32,  only local variables are allocated on the stack and as
+       version 10.32, only local variables are allocated on the stack  and  as
        each call uses only a few hundred bytes, even a small stack can support
        quite a lot of recursion.


-       If  the  depth  of  internal  recursive function calls is great enough,
-       local workspace vectors are allocated on the heap  from  version  10.32
-       onwards,  so  the depth limit also indirectly limits the amount of heap
+       If the depth of internal recursive function calls is great enough,  lo-
+       cal  workspace vectors are allocated on the heap from version 10.32 on-
+       wards, so the depth limit also indirectly limits  the  amount  of  heap
        memory that is used. A recursive pattern such as /(.(?2))((?1)|)/, when
-       matched  to a very long string using pcre2_dfa_match(), can use a great
-       deal of memory. However, it is probably  better  to  limit  heap  usage
-       directly by calling pcre2_set_heap_limit().
+       matched to a very long string using pcre2_dfa_match(), can use a  great
+       deal  of memory. However, it is probably better to limit heap usage di-
+       rectly by calling pcre2_set_heap_limit().


-       The  default  value for the depth limit can be set when PCRE2 is built;
-       if it is not, the default is set to the same value as the  default  for
-       the   match   limit.   If  the  limit  is  exceeded,  pcre2_match()  or
+       The default value for the depth limit can be set when PCRE2  is  built;
+       if  it  is not, the default is set to the same value as the default for
+       the  match  limit.   If  the  limit  is  exceeded,   pcre2_match()   or
        pcre2_dfa_match() returns PCRE2_ERROR_DEPTHLIMIT. A value for the depth
-       limit  may also be supplied by an item at the start of a pattern of the
+       limit may also be supplied by an item at the start of a pattern of  the
        form


          (*LIMIT_DEPTH=ddd)


-       where ddd is a decimal number.  However,  such  a  setting  is  ignored
-       unless ddd is less than the limit set by the caller of pcre2_match() or
+       where  ddd  is a decimal number. However, such a setting is ignored un-
+       less ddd is less than the limit set by the caller of  pcre2_match()  or
        pcre2_dfa_match() or, if no such limit is set, less than the default.



@@ -1102,96 +1099,96 @@

        int pcre2_config(uint32_t what, void *where);


-       The function pcre2_config() makes it possible for  a  PCRE2  client  to
-       discover  which  optional  features  have  been compiled into the PCRE2
-       library. The pcre2build documentation  has  more  details  about  these
-       optional features.
+       The  function  pcre2_config()  makes  it possible for a PCRE2 client to
+       discover which optional features have been compiled into the PCRE2  li-
+       brary.  The  pcre2build  documentation has more details about these op-
+       tional features.


-       The  first  argument  for pcre2_config() specifies which information is
-       required. The second argument is a pointer to  memory  into  which  the
-       information  is  placed.  If  NULL  is passed, the function returns the
-       amount of memory that is needed  for  the  requested  information.  For
-       calls  that  return  numerical  values,  the  value  is  in bytes; when
-       requesting these values, where should point  to  appropriately  aligned
-       memory.  For calls that return strings, the required length is given in
-       code units, not counting the terminating zero.
+       The first argument for pcre2_config() specifies  which  information  is
+       required. The second argument is a pointer to memory into which the in-
+       formation is placed. If NULL is passed, the function returns the amount
+       of  memory that is needed for the requested information. For calls that
+       return numerical values, the value is in bytes; when  requesting  these
+       values,  where  should point to appropriately aligned memory. For calls
+       that return strings, the required length is given in  code  units,  not
+       counting the terminating zero.


-       When requesting information, the returned value from pcre2_config()  is
-       non-negative  on success, or the negative error code PCRE2_ERROR_BADOP-
-       TION if the value in the first argument is not recognized. The  follow-
+       When  requesting information, the returned value from pcre2_config() is
+       non-negative on success, or the negative error code  PCRE2_ERROR_BADOP-
+       TION  if the value in the first argument is not recognized. The follow-
        ing information is available:


          PCRE2_CONFIG_BSR


-       The  output  is a uint32_t integer whose value indicates what character
-       sequences the \R  escape  sequence  matches  by  default.  A  value  of
-       PCRE2_BSR_UNICODE  means  that  \R  matches  any  Unicode  line  ending
-       sequence; a value of PCRE2_BSR_ANYCRLF means that \R matches  only  CR,
-       LF, or CRLF. The default can be overridden when a pattern is compiled.
+       The output is a uint32_t integer whose value indicates  what  character
+       sequences  the  \R  escape  sequence  matches  by  default.  A value of
+       PCRE2_BSR_UNICODE means that \R matches any  Unicode  line  ending  se-
+       quence; a value of PCRE2_BSR_ANYCRLF means that \R matches only CR, LF,
+       or CRLF. The default can be overridden when a pattern is compiled.


          PCRE2_CONFIG_COMPILED_WIDTHS


-       The  output  is a uint32_t integer whose lower bits indicate which code
-       unit widths were selected when PCRE2 was  built.  The  1-bit  indicates
-       8-bit  support, and the 2-bit and 4-bit indicate 16-bit and 32-bit sup-
+       The output is a uint32_t integer whose lower bits indicate  which  code
+       unit  widths  were  selected  when PCRE2 was built. The 1-bit indicates
+       8-bit support, and the 2-bit and 4-bit indicate 16-bit and 32-bit  sup-
        port, respectively.
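
Decoding the value is a matter of testing the three low bits. In the sketch below, width_support and decode_compiled_widths are hypothetical names; the mask is assumed to be the uint32_t written by pcre2_config() for PCRE2_CONFIG_COMPILED_WIDTHS.

```c
#include <stdbool.h>
#include <stdint.h>

/* Which code unit widths the build supports (hypothetical helper type). */
typedef struct {
  bool has8;
  bool has16;
  bool has32;
} width_support;

/* The 1-bit means 8-bit support, the 2-bit 16-bit, the 4-bit 32-bit. */
width_support decode_compiled_widths(uint32_t mask)
{
  width_support w;
  w.has8  = (mask & 1u) != 0;
  w.has16 = (mask & 2u) != 0;
  w.has32 = (mask & 4u) != 0;
  return w;
}
```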


          PCRE2_CONFIG_DEPTHLIMIT


-       The output is a uint32_t integer that gives the default limit  for  the
-       depth  of  nested  backtracking in pcre2_match() or the depth of nested
-       recursions, lookarounds, and atomic groups in  pcre2_dfa_match().  Fur-
+       The  output  is a uint32_t integer that gives the default limit for the
+       depth of nested backtracking in pcre2_match() or the  depth  of  nested
+       recursions,  lookarounds,  and atomic groups in pcre2_dfa_match(). Fur-
        ther details are given with pcre2_set_depth_limit() above.


          PCRE2_CONFIG_HEAPLIMIT


-       The  output is a uint32_t integer that gives, in kibibytes, the default
-       limit  for  the  amount  of  heap  memory  used  by  pcre2_match()   or
-       pcre2_dfa_match().      Further      details     are     given     with
+       The output is a uint32_t integer that gives, in kibibytes, the  default
+       limit   for  the  amount  of  heap  memory  used  by  pcre2_match()  or
+       pcre2_dfa_match().     Further     details     are      given      with
        pcre2_set_heap_limit() above.


          PCRE2_CONFIG_JIT


-       The output is a uint32_t integer that is set  to  one  if  support  for
+       The  output  is  a  uint32_t  integer that is set to one if support for
        just-in-time compiling is available; otherwise it is set to zero.


          PCRE2_CONFIG_JITTARGET


-       The  where  argument  should point to a buffer that is at least 48 code
-       units long.  (The  exact  length  required  can  be  found  by  calling
-       pcre2_config()  with  where  set  to NULL.) The buffer is filled with a
-       string that contains the name of the architecture  for  which  the  JIT
-       compiler  is  configured,  for  example  "x86  32bit  (little  endian +
-       unaligned)". If JIT support is not available, PCRE2_ERROR_BADOPTION  is
-       returned,  otherwise the number of code units used is returned. This is
+       The where argument should point to a buffer that is at  least  48  code
+       units  long.  (The  exact  length  required  can  be  found  by calling
+       pcre2_config() with where set to NULL.) The buffer  is  filled  with  a
+       string  that  contains  the  name of the architecture for which the JIT
+       compiler is configured, for example "x86 32bit  (little  endian  +  un-
+       aligned)".  If  JIT  support is not available, PCRE2_ERROR_BADOPTION is
+       returned, otherwise the number of code units used is returned. This  is
        the length of the string, plus one unit for the terminating zero.


          PCRE2_CONFIG_LINKSIZE


        The output is a uint32_t integer that contains the number of bytes used
-       for  internal  linkage  in  compiled regular expressions. When PCRE2 is
-       configured, the value can be set to 2, 3, or 4, with the default  being
-       2.  This is the value that is returned by pcre2_config(). However, when
-       the 16-bit library is compiled, a value of 3 is rounded up  to  4,  and
-       when  the  32-bit  library  is compiled, internal linkages always use 4
+       for internal linkage in compiled regular  expressions.  When  PCRE2  is
+       configured,  the value can be set to 2, 3, or 4, with the default being
+       2. This is the value that is returned by pcre2_config(). However,  when
+       the  16-bit  library  is compiled, a value of 3 is rounded up to 4, and
+       when the 32-bit library is compiled, internal  linkages  always  use  4
        bytes, so the configured value is not relevant.
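
The relationship between the configured value and the link size actually used can be written down as a small function. This is only a sketch of the rule as described in the paragraph above; effective_link_size is a hypothetical name, and pcre2_config() itself reports the configured value, not the result of this calculation.

```c
#include <stdint.h>

/* configured: 2, 3, or 4, as chosen when PCRE2 was built.
   code_unit_bits: 8, 16, or 32, identifying the library in use. */
uint32_t effective_link_size(uint32_t configured, unsigned code_unit_bits)
{
  if (code_unit_bits == 32) return 4;                     /* always 4 bytes */
  if (code_unit_bits == 16 && configured == 3) return 4;  /* 3 rounds up to 4 */
  return configured;                                      /* used as configured */
}
```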


        The default value of 2 for the 8-bit and 16-bit libraries is sufficient
-       for  all but the most massive patterns, since it allows the size of the
-       compiled pattern to be up to 65535  code  units.  Larger  values  allow
-       larger  regular  expressions to be compiled by those two libraries, but
+       for all but the most massive patterns, since it allows the size of  the
+       compiled  pattern  to  be  up  to 65535 code units. Larger values allow
+       larger regular expressions to be compiled by those two  libraries,  but
        at the expense of slower matching.


          PCRE2_CONFIG_MATCHLIMIT


        The output is a uint32_t integer that gives the default match limit for
-       pcre2_match().  Further  details are given with pcre2_set_match_limit()
+       pcre2_match(). Further details are given  with  pcre2_set_match_limit()
        above.


          PCRE2_CONFIG_NEWLINE


-       The output is a uint32_t integer  whose  value  specifies  the  default
-       character  sequence that is recognized as meaning "newline". The values
+       The  output  is  a  uint32_t  integer whose value specifies the default
+       character sequence that is recognized as meaning "newline". The  values
        are:


          PCRE2_NEWLINE_CR       Carriage return (CR)
@@ -1201,23 +1198,23 @@
          PCRE2_NEWLINE_ANYCRLF  Any of CR, LF, or CRLF
          PCRE2_NEWLINE_NUL      The NUL character (binary zero)


-       The default should normally correspond to  the  standard  sequence  for
+       The  default  should  normally  correspond to the standard sequence for
        your operating system.


          PCRE2_CONFIG_NEVER_BACKSLASH_C


-       The  output  is  a uint32_t integer that is set to one if the use of \C
-       was permanently disabled when PCRE2 was built; otherwise it is  set  to
+       The output is a uint32_t integer that is set to one if the  use  of  \C
+       was  permanently  disabled when PCRE2 was built; otherwise it is set to
        zero.


          PCRE2_CONFIG_PARENSLIMIT


-       The  output is a uint32_t integer that gives the maximum depth of nest-
+       The output is a uint32_t integer that gives the maximum depth of  nest-
        ing of parentheses (of any kind) in a pattern. This limit is imposed to
-       cap  the  amount of system stack used when a pattern is compiled. It is
-       specified when PCRE2 is built; the default is 250. This limit does  not
-       take  into  account  the  stack that may already be used by the calling
-       application. For  finer  control  over  compilation  stack  usage,  see
+       cap the amount of system stack used when a pattern is compiled.  It  is
+       specified  when PCRE2 is built; the default is 250. This limit does not
+       take into account the stack that may already be used by the calling ap-
+       plication.   For  finer  control  over  compilation  stack  usage,  see
        pcre2_set_compile_recursion_guard().


          PCRE2_CONFIG_STACKRECURSE
@@ -1227,25 +1224,25 @@


          PCRE2_CONFIG_UNICODE_VERSION


-       The where argument should point to a buffer that is at  least  24  code
-       units  long.  (The  exact  length  required  can  be  found  by calling
-       pcre2_config() with where set to NULL.)  If  PCRE2  has  been  compiled
-       without  Unicode  support,  the buffer is filled with the text "Unicode
-       not supported". Otherwise, the Unicode  version  string  (for  example,
-       "8.0.0")  is  inserted. The number of code units used is returned. This
+       The  where  argument  should point to a buffer that is at least 24 code
+       units long.  (The  exact  length  required  can  be  found  by  calling
+       pcre2_config()  with  where  set  to  NULL.) If PCRE2 has been compiled
+       without Unicode support, the buffer is filled with  the  text  "Unicode
+       not  supported".  Otherwise,  the  Unicode version string (for example,
+       "8.0.0") is inserted. The number of code units used is  returned.  This
        is the length of the string plus one unit for the terminating zero.


          PCRE2_CONFIG_UNICODE


-       The output is a uint32_t integer that is set to one if Unicode  support
-       is  available; otherwise it is set to zero. Unicode support implies UTF
+       The  output is a uint32_t integer that is set to one if Unicode support
+       is available; otherwise it is set to zero. Unicode support implies  UTF
        support.


          PCRE2_CONFIG_VERSION


-       The where argument should point to a buffer that is at  least  24  code
-       units  long.  (The  exact  length  required  can  be  found  by calling
-       pcre2_config() with where set to NULL.) The buffer is filled  with  the
+       The  where  argument  should point to a buffer that is at least 24 code
+       units long.  (The  exact  length  required  can  be  found  by  calling
+       pcre2_config()  with  where set to NULL.) The buffer is filled with the
        PCRE2 version string, zero-terminated. The number of code units used is
        returned. This is the length of the string plus one unit for the termi-
        nating zero.
@@ -1263,103 +1260,103 @@


        pcre2_code *pcre2_code_copy_with_tables(const pcre2_code *code);


-       The  pcre2_compile() function compiles a pattern into an internal form.
-       The pattern is defined by a pointer to a string of  code  units  and  a
-       length  (in  code units). If the pattern is zero-terminated, the length
-       can be specified  as  PCRE2_ZERO_TERMINATED.  The  function  returns  a
-       pointer  to  a  block  of memory that contains the compiled pattern and
-       related data, or NULL if an error occurred.
+       The pcre2_compile() function compiles a pattern into an internal  form.
+       The  pattern  is  defined  by a pointer to a string of code units and a
+       length (in code units). If the pattern is zero-terminated,  the  length
+       can  be  specified  as  PCRE2_ZERO_TERMINATED.  The  function returns a
+       pointer to a block of memory that contains the compiled pattern and re-
+       lated data, or NULL if an error occurred.


-       If the compile context argument ccontext is NULL, memory for  the  com-
-       piled  pattern  is  obtained  by  calling  malloc().  Otherwise,  it is
-       obtained from the same memory function that was used  for  the  compile
-       context.  The  caller must free the memory by calling pcre2_code_free()
-       when it is no longer needed.  If pcre2_code_free()  is  called  with  a
-       NULL argument, it returns immediately, without doing anything.
+       If  the  compile context argument ccontext is NULL, memory for the com-
+       piled pattern is obtained by calling malloc().  Otherwise,  it  is  ob-
+       tained from the same memory function that was used for the compile con-
+       text. The caller must free the memory by calling pcre2_code_free() when
+       it is no longer needed.  If pcre2_code_free() is called with a NULL ar-
+       gument, it returns immediately, without doing anything.


        The function pcre2_code_copy() makes a copy of the compiled code in new
-       memory, using the same memory allocator as was used for  the  original.
-       However,  if  the  code  has  been  processed  by the JIT compiler (see
-       below), the JIT information cannot be copied (because it  is  position-
-       dependent).  The new copy can initially be used only for non-JIT match-
-       ing, though it can be passed to  pcre2_jit_compile()  if  required.  If
+       memory,  using  the same memory allocator as was used for the original.
+       However, if the code has been processed by the JIT  compiler  (see  be-
+       low),  the JIT information cannot be copied (because it is position-de-
+       pendent).  The new copy can initially be used only for  non-JIT  match-
+       ing,  though  it  can  be passed to pcre2_jit_compile() if required. If
        pcre2_code_copy() is called with a NULL argument, it returns NULL.


        The pcre2_code_copy() function provides a way for individual threads in
-       a multithreaded application to acquire a private copy  of  shared  com-
-       piled  code.   However, it does not make a copy of the character tables
-       used by the compiled pattern; the new pattern code points to  the  same
-       tables  as  the original code.  (See "Locale Support" below for details
-       of these character tables.) In many applications the  same  tables  are
-       used  throughout, so this behaviour is appropriate. Nevertheless, there
+       a  multithreaded  application  to acquire a private copy of shared com-
+       piled code.  However, it does not make a copy of the  character  tables
+       used  by  the compiled pattern; the new pattern code points to the same
+       tables as the original code.  (See "Locale Support" below  for  details
+       of  these  character  tables.) In many applications the same tables are
+       used throughout, so this behaviour is appropriate. Nevertheless,  there
        are occasions when a copy of a compiled pattern and the relevant tables
-       are  needed.  The pcre2_code_copy_with_tables() provides this facility.
-       Copies of both the code and the tables are  made,  with  the  new  code
-       pointing  to the new tables. The memory for the new tables is automati-
-       cally freed when pcre2_code_free() is called for the new  copy  of  the
-       compiled  code.  If pcre2_code_copy_with_tables() is called with a NULL
+       are needed. The pcre2_code_copy_with_tables() provides  this  facility.
+       Copies  of  both  the  code  and the tables are made, with the new code
+       pointing to the new tables. The memory for the new tables is  automati-
+       cally  freed  when  pcre2_code_free() is called for the new copy of the
+       compiled code. If pcre2_code_copy_with_tables() is called with  a  NULL
        argument, it returns NULL.


-       NOTE: When one of the matching functions is  called,  pointers  to  the
+       NOTE:  When  one  of  the matching functions is called, pointers to the
        compiled pattern and the subject string are set in the match data block
-       so that they can be referenced by the  substring  extraction  functions
-       after  a  successful match.  After running a match, you must not free a
-       compiled pattern or a subject string until after all operations on  the
-       match  data  block have taken place, unless, in the case of the subject
-       string, you have used the PCRE2_COPY_MATCHED_SUBJECT option,  which  is
-       described  in  the  section  entitled  "Option  bits for pcre2_match()"
-       below.
+       so  that  they  can be referenced by the substring extraction functions
+       after a successful match.  After running a match, you must not  free  a
+       compiled  pattern or a subject string until after all operations on the
+       match data block have taken place, unless, in the case of  the  subject
+       string,  you  have used the PCRE2_COPY_MATCHED_SUBJECT option, which is
+       described in the section entitled "Option bits for  pcre2_match()"  be-
+       low.


-       The options argument for pcre2_compile() contains various bit  settings
-       that  affect  the  compilation.  It  should be zero if none of them are
-       required. The available options are described below. Some of  them  (in
-       particular,  those  that  are  compatible with Perl, but some others as
-       well) can also be set and  unset  from  within  the  pattern  (see  the
-       detailed description in the pcre2pattern documentation).
+       The  options argument for pcre2_compile() contains various bit settings
+       that affect the compilation. It should be zero if none of them are  re-
+       quired.  The  available  options  are described below. Some of them (in
+       particular, those that are compatible with Perl,  but  some  others  as
+       well)  can  also  be set and unset from within the pattern (see the de-
+       tailed description in the pcre2pattern documentation).


-       For  those options that can be different in different parts of the pat-
-       tern, the contents of the options argument  specify  their settings  at
-       the  start  of  compilation. The PCRE2_ANCHORED, PCRE2_ENDANCHORED, and
-       PCRE2_NO_UTF_CHECK options can be set at the time of matching  as  well
+       For those options that can be different in different parts of the  pat-
+       tern,  the contents of the options argument  specify  their settings at
+       the start of compilation. The  PCRE2_ANCHORED,  PCRE2_ENDANCHORED,  and
+       PCRE2_NO_UTF_CHECK  options  can be set at the time of matching as well
        as at compile time.


-       Some  additional  options  and  less  frequently  required compile-time
-       parameters (for example, the newline setting) can be provided in a com-
+       Some additional options and less frequently required  compile-time  pa-
+       rameters  (for  example, the newline setting) can be provided in a com-
        pile context (as described above).


        If errorcode or erroroffset is NULL, pcre2_compile() returns NULL imme-
-       diately. Otherwise, the variables to which these point are  set  to  an
-       error  code  and  an  offset (number of code units) within the pattern,
-       respectively, when pcre2_compile() returns NULL because  a  compilation
-       error has occurred. The values are not defined when compilation is suc-
+       diately.  Otherwise,  the  variables to which these point are set to an
+       error code and an offset (number of code units) within the pattern, re-
+       spectively, when pcre2_compile() returns NULL because a compilation er-
+       ror has occurred. The values are not defined when compilation  is  suc-
        cessful and pcre2_compile() returns a non-NULL value.


-       There are nearly 100 positive  error  codes  that  pcre2_compile()  may
-       return  if  it finds an error in the pattern. There are also some nega-
-       tive error codes that are used for invalid UTF  strings  when  validity
-       checking  is in force. These are the same as given by pcre2_match() and
+       There  are nearly 100 positive error codes that pcre2_compile() may re-
+       turn if it finds an error in the pattern. There are also some  negative
+       error  codes that are used for invalid UTF strings when validity check-
+       ing is in force. These are the  same  as  given  by  pcre2_match()  and
        pcre2_dfa_match(), and are described in the pcre2unicode documentation.
-       There  is  no  separate  documentation  for  the  positive error codes,
-       because the textual error messages that are  obtained  by  calling  the
+       There is no separate documentation for the positive  error  codes,  be-
+       cause  the  textual  error  messages  that  are obtained by calling the
        pcre2_get_error_message() function (see "Obtaining a textual error mes-
-       sage" below) should be  self-explanatory.  Macro  names  starting  with
-       PCRE2_ERROR_  are defined for both positive and negative error codes in
+       sage"  below)  should  be  self-explanatory.  Macro names starting with
+       PCRE2_ERROR_ are defined for both positive and negative error codes  in
        pcre2.h.


        The value returned in erroroffset is an indication of where in the pat-
-       tern  the  error  occurred. It is not necessarily the furthest point in
-       the pattern that was read. For example,  after  the  error  "lookbehind
-       assertion is not fixed length", the error offset points to the start of
-       the failing assertion. For an invalid UTF-8 or UTF-16 string, the  off-
+       tern the error occurred. It is not necessarily the  furthest  point  in
+       the pattern that was read. For example, after the error "lookbehind as-
+       sertion is not fixed length", the error offset points to the  start  of
+       the  failing assertion. For an invalid UTF-8 or UTF-16 string, the off-
        set is that of the first code unit of the failing character.


-       Some  errors are not detected until the whole pattern has been scanned;
-       in these cases, the offset passed back is the length  of  the  pattern.
-       Note  that  the  offset is in code units, not characters, even in a UTF
+       Some errors are not detected until the whole pattern has been  scanned;
+       in  these  cases,  the offset passed back is the length of the pattern.
+       Note that the offset is in code units, not characters, even  in  a  UTF
        mode. It may sometimes point into the middle of a UTF-8 or UTF-16 char-
        acter.
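
        Because the offset counts code units, an  application  that  wants  to
        report a character position for a UTF-8 pattern has to convert it.  The
        following is a minimal sketch of one way to do that. The function  name
        utf8_char_index() is hypothetical and is not part of the PCRE2 API; the
        sketch assumes the pattern is valid UTF-8.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not part of PCRE2: convert a code-unit offset (such
   as the value returned in erroroffset) into a zero-based character index
   for a UTF-8 pattern. If the offset points into the middle of a character,
   the index of that character is returned. */
size_t utf8_char_index(const unsigned char *pattern, size_t length,
                       size_t unit_offset)
{
  size_t i, chars = 0;

  assert(pattern != NULL && unit_offset <= length);

  /* Count the characters that start before the offset. Continuation bytes
     have the form 10xxxxxx and do not start a character. */
  for (i = 0; i < unit_offset; i++)
    {
    if ((pattern[i] & 0xC0) != 0x80) chars++;
    }

  /* If the offset itself is a continuation byte, the last character counted
     is the one that contains the offset, so step back to its index. */
  if (unit_offset < length && (pattern[unit_offset] & 0xC0) == 0x80 &&
      chars > 0)
    chars--;

  return chars;
}
```

        For the four-unit pattern "a", U+00E9, "z"  (the  middle  character  is
        two code units long), offsets 1 and 2 both map to character index 1.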


-       This  code  fragment shows a typical straightforward call to pcre2_com-
+       This code fragment shows a typical straightforward call  to  pcre2_com-
        pile():


          pcre2_code *re;
@@ -1376,28 +1373,28 @@


    Main compile options


-       The following names for option bits are defined in the  pcre2.h  header
+       The  following  names for option bits are defined in the pcre2.h header
        file:


          PCRE2_ANCHORED


        If this bit is set, the pattern is forced to be "anchored", that is, it
-       is constrained to match only at the first matching point in the  string
-       that  is being searched (the "subject string"). This effect can also be
-       achieved by appropriate constructs in the pattern itself, which is  the
+       is  constrained to match only at the first matching point in the string
+       that is being searched (the "subject string"). This effect can also  be
+       achieved  by appropriate constructs in the pattern itself, which is the
        only way to do it in Perl.


          PCRE2_ALLOW_EMPTY_CLASS


-       By  default, for compatibility with Perl, a closing square bracket that
-       immediately follows an opening one is treated as a data  character  for
-       the  class.  When  PCRE2_ALLOW_EMPTY_CLASS  is  set,  it terminates the
+       By default, for compatibility with Perl, a closing square bracket  that
+       immediately  follows  an opening one is treated as a data character for
+       the class. When  PCRE2_ALLOW_EMPTY_CLASS  is  set,  it  terminates  the
        class, which therefore contains no characters and so can never match.


          PCRE2_ALT_BSUX


-       This option requests alternative handling of  three  escape  sequences,
-       which  makes  PCRE2's  behaviour more like ECMAscript (aka JavaScript).
+       This  option  requests alternative  handling of three escape sequences,
+       which makes PCRE2's behaviour more like  ECMAscript  (aka  JavaScript).
        When it is set:


        (1) \U matches an upper case "U" character; by default \U causes a com-
@@ -1404,129 +1401,128 @@
        pile time error (Perl uses \U to upper case subsequent characters).


        (2) \u matches a lower case "u" character unless it is followed by four
-       hexadecimal digits, in which case the hexadecimal  number  defines  the
-       code  point  to match. By default, \u causes a compile time error (Perl
+       hexadecimal  digits,  in  which case the hexadecimal number defines the
+       code point to match. By default, \u causes a compile time  error  (Perl
        uses it to upper case the following character).


-       (3) \x matches a lower case "x" character unless it is followed by  two
-       hexadecimal  digits,  in  which case the hexadecimal number defines the
-       code point to match. By default, as in Perl, a  hexadecimal  number  is
+       (3)  \x matches a lower case "x" character unless it is followed by two
+       hexadecimal digits, in which case the hexadecimal  number  defines  the
+       code  point  to  match. By default, as in Perl, a hexadecimal number is
        always expected after \x, but it may have zero, one, or two digits (so,
        for example, \xz matches a binary zero character followed by z).
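
        The three rules above can be modelled in a few lines of C.  This  is  a
        simplified sketch for illustration only, not PCRE2's own code: the name
        alt_bsux_code_point() and its interface are assumptions, and it  covers
        only the \U, \u and \x escapes as they behave when  PCRE2_ALT_BSUX  is
        set.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Read exactly `digits` hexadecimal digits from s. Returns 1 and stores the
   value on success, or 0 if any of the characters is not a hex digit (the
   terminating zero is not a hex digit, so short strings are safe). */
static int read_hex(const char *s, int digits, unsigned long *value)
{
  unsigned long v = 0;
  int i;
  for (i = 0; i < digits; i++)
    {
    unsigned char c = (unsigned char)s[i];
    if (!isxdigit(c)) return 0;
    v = v * 16 + (unsigned long)(isdigit(c) ? c - '0' : tolower(c) - 'a' + 10);
    }
  *value = v;
  return 1;
}

/* Hypothetical model of the PCRE2_ALT_BSUX rules. p points at a backslash
   followed by 'U', 'u' or 'x'. Returns the code point the escape matches.
   If consumed is not NULL, it receives the number of pattern characters
   used, or 0 for an escape that this sketch does not model. */
unsigned long alt_bsux_code_point(const char *p, size_t *consumed)
{
  unsigned long cp = 0;
  size_t used = 2;
  int want = 0;

  assert(p != NULL && p[0] == '\\');

  switch (p[1])
    {
    case 'U': cp = 'U'; break;      /* rule (1): a literal upper case U */
    case 'u': want = 4; break;      /* rule (2): needs four hex digits */
    case 'x': want = 2; break;      /* rule (3): needs two hex digits */
    default:  used = 0; break;      /* not covered by this sketch */
    }

  if (want > 0)
    {
    if (read_hex(p + 2, want, &cp)) used += (size_t)want;
    else cp = (unsigned char)p[1];  /* too few digits: literal u or x */
    }

  if (consumed != NULL) *consumed = used;
  return cp;
}
```

        With this model, the pattern text \u00e9 yields code point  0xE9  and
        uses six characters, whereas \u12 yields a literal "u" and uses two.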


        ECMAscript 6 added additional functionality to \u. This can be accessed
-       using   the  PCRE2_EXTRA_ALT_BSUX  extra  option  (see  "Extra  compile
-       options" below).  Note that this alternative  escape  handling  applies
-       only  to  patterns.  Neither of these options affects the processing of
-       replacement strings passed to pcre2_substitute().
+       using the PCRE2_EXTRA_ALT_BSUX extra option  (see  "Extra  compile  op-
+       tions" below).  Note that this alternative escape handling applies only
+       to patterns. Neither of these options affects  the  processing  of  re-
+       placement strings passed to pcre2_substitute().


          PCRE2_ALT_CIRCUMFLEX


        In  multiline  mode  (when  PCRE2_MULTILINE  is  set),  the  circumflex
-       metacharacter  matches at the start of the subject (unless PCRE2_NOTBOL
-       is set), and also after any internal  newline.  However,  it  does  not
+       metacharacter matches at the start of the subject (unless  PCRE2_NOTBOL
+       is  set),  and  also  after  any internal newline. However, it does not
        match after a newline at the end of the subject, for compatibility with
-       Perl. If you want a multiline circumflex also to match after  a  termi-
+       Perl.  If  you want a multiline circumflex also to match after a termi-
        nating newline, you must set PCRE2_ALT_CIRCUMFLEX.


          PCRE2_ALT_VERBNAMES


-       By  default, for compatibility with Perl, the name in any verb sequence
-       such as (*MARK:NAME) is  any  sequence  of  characters  that  does  not
-       include  a  closing  parenthesis. The name is not processed in any way,
-       and it is not possible to include a closing parenthesis  in  the  name.
-       However,  if  the  PCRE2_ALT_VERBNAMES  option is set, normal backslash
-       processing is applied to verb  names  and  only  an  unescaped  closing
-       parenthesis  terminates the name. A closing parenthesis can be included
-       in a name either as \) or between \Q and \E. If the  PCRE2_EXTENDED  or
-       PCRE2_EXTENDED_MORE  option  is set with PCRE2_ALT_VERBNAMES, unescaped
-       whitespace in verb names is  skipped  and  #-comments  are  recognized,
-       exactly as in the rest of the pattern.
+       By default, for compatibility with Perl, the name in any verb  sequence
+       such  as  (*MARK:NAME)  is any sequence of characters that does not in-
+       clude a closing parenthesis. The name is not processed in any way,  and
+       it  is  not possible to include a closing parenthesis in the name. How-
+       ever, if the PCRE2_ALT_VERBNAMES option is set, normal  backslash  pro-
+       cessing  is  applied to verb names and only an unescaped closing paren-
+       thesis terminates the name. A closing parenthesis can be included in  a
+       name  either  as  \)  or  between  \Q  and \E. If the PCRE2_EXTENDED or
+       PCRE2_EXTENDED_MORE option is set with  PCRE2_ALT_VERBNAMES,  unescaped
+       whitespace  in verb names is skipped and #-comments are recognized, ex-
+       actly as in the rest of the pattern.


          PCRE2_AUTO_CALLOUT


-       If  this  bit  is  set,  pcre2_compile()  automatically inserts callout
-       items, all with number 255, before each pattern  item,  except  immedi-
-       ately  before  or after an explicit callout in the pattern. For discus-
+       If this bit  is  set,  pcre2_compile()  automatically  inserts  callout
+       items,  all  with  number 255, before each pattern item, except immedi-
+       ately before or after an explicit callout in the pattern.  For  discus-
        sion of the callout facility, see the pcre2callout documentation.


          PCRE2_CASELESS


-       If this bit is set, letters in the pattern match both upper  and  lower
-       case  letters in the subject. It is equivalent to Perl's /i option, and
-       it can be changed within  a  pattern  by  a  (?i)  option  setting.  If
-       PCRE2_UTF  is  set, Unicode properties are used for all characters with
-       more than one other case, and for all characters whose code points  are
-       greater  than  U+007F.  For lower valued characters with only one other
-       case, a lookup table is used for speed. When PCRE2_UTF is  not  set,  a
+       If  this  bit is set, letters in the pattern match both upper and lower
+       case letters in the subject. It is equivalent to Perl's /i option,  and
+       it  can  be  changed  within  a  pattern  by  a (?i) option setting. If
+       PCRE2_UTF is set, Unicode properties are used for all  characters  with
+       more  than one other case, and for all characters whose code points are
+       greater than U+007F. For lower valued characters with  only  one  other
+       case,  a  lookup  table is used for speed. When PCRE2_UTF is not set, a
        lookup table is used for all code points less than 256, and higher code
-       points (available only in 16-bit or 32-bit mode)  are  treated  as  not
+       points  (available  only  in  16-bit or 32-bit mode) are treated as not
        having another case.


          PCRE2_DOLLAR_ENDONLY


-       If  this bit is set, a dollar metacharacter in the pattern matches only
-       at the end of the subject string. Without this option,  a  dollar  also
-       matches  immediately before a newline at the end of the string (but not
-       before any other newlines). The PCRE2_DOLLAR_ENDONLY option is  ignored
-       if  PCRE2_MULTILINE  is  set.  There is no equivalent to this option in
+       If this bit is set, a dollar metacharacter in the pattern matches  only
+       at  the  end  of the subject string. Without this option, a dollar also
+       matches immediately before a newline at the end of the string (but  not
+       before  any other newlines). The PCRE2_DOLLAR_ENDONLY option is ignored
+       if PCRE2_MULTILINE is set. There is no equivalent  to  this  option  in
        Perl, and no way to set it within a pattern.
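
        The positions at which a dollar can match when PCRE2_MULTILINE is  not
        set can be expressed as a small predicate. This is a  sketch  of  the
        rule as described above, not PCRE2's implementation. The function name
        dollar_matches_at() is hypothetical, and the sketch assumes  that  the
        newline convention is a single "\n" character.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of a dollar metacharacter in non-multiline mode.
   Returns 1 if a dollar can match at code-unit position pos of the subject,
   otherwise 0. dollar_endonly is non-zero when PCRE2_DOLLAR_ENDONLY would
   be set. Assumes a single-character "\n" newline. */
int dollar_matches_at(const char *subject, size_t length, size_t pos,
                      int dollar_endonly)
{
  assert(subject != NULL && pos <= length);

  /* The very end of the subject always qualifies. */
  if (pos == length) return 1;

  /* Without the option, the position immediately before a newline that
     ends the subject also qualifies. Earlier newlines never do. */
  if (!dollar_endonly && pos + 1 == length && subject[pos] == '\n') return 1;

  return 0;
}
```

        For the subject "abc\n", position 3 qualifies  by  default  but  not
        when the option is modelled as set; position 4 qualifies in both cases.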


          PCRE2_DOTALL


-       If this bit is set, a dot metacharacter  in  the  pattern  matches  any
-       character,  including  one  that  indicates a newline. However, it only
+       If  this  bit  is  set,  a dot metacharacter in the pattern matches any
+       character, including one that indicates a  newline.  However,  it  only
        ever matches one character, even if newlines are coded as CRLF. Without
        this option, a dot does not match when the current position in the sub-
-       ject is at a newline. This option is equivalent to  Perl's  /s  option,
+       ject  is  at  a newline. This option is equivalent to Perl's /s option,
        and it can be changed within a pattern by a (?s) option setting. A neg-
-       ative class such as [^a] always matches newline characters, and the  \N
-       escape  sequence always matches a non-newline character, independent of
+       ative  class such as [^a] always matches newline characters, and the \N
+       escape sequence always matches a non-newline character, independent  of
        the setting of PCRE2_DOTALL.


          PCRE2_DUPNAMES


-       If this bit is set, names used to identify capture groups need  not  be
-       unique.   This  can  be helpful for certain types of pattern when it is
-       known that only one instance of the named group can  ever  be  matched.
-       There  are  more  details  of  named capture groups below; see also the
+       If  this  bit is set, names used to identify capture groups need not be
+       unique.  This can be helpful for certain types of pattern  when  it  is
+       known  that  only  one instance of the named group can ever be matched.
+       There are more details of named capture  groups  below;  see  also  the
        pcre2pattern documentation.


          PCRE2_ENDANCHORED


-       If this bit is set, the end of any pattern match must be right  at  the
+       If  this  bit is set, the end of any pattern match must be right at the
        end of the string being searched (the "subject string"). If the pattern
        match succeeds by reaching (*ACCEPT), but does not reach the end of the
-       subject,  the match fails at the current starting point. For unanchored
-       patterns, a new match is then tried at the next  starting  point.  How-
+       subject, the match fails at the current starting point. For  unanchored
+       patterns,  a  new  match is then tried at the next starting point. How-
        ever, if the match succeeds by reaching the end of the pattern, but not
-       the end of the subject, backtracking occurs and  an  alternative  match
+       the  end  of  the subject, backtracking occurs and an alternative match
        may be found. Consider these two patterns:


          .(*ACCEPT)|..
          .|..


-       If  matched against "abc" with PCRE2_ENDANCHORED set, the first matches
-       "c" whereas the second matches "bc". The  effect  of  PCRE2_ENDANCHORED
-       can  also  be achieved by appropriate constructs in the pattern itself,
+       If matched against "abc" with PCRE2_ENDANCHORED set, the first  matches
+       "c"  whereas  the  second matches "bc". The effect of PCRE2_ENDANCHORED
+       can also be achieved by appropriate constructs in the  pattern  itself,
        which is the only way to do it in Perl.


        For DFA matching with pcre2_dfa_match(), PCRE2_ENDANCHORED applies only
-       to  the  first  (that  is,  the longest) matched string. Other parallel
-       matches, which are necessarily substrings of the first one, must  obvi-
+       to the first (that is, the  longest)  matched  string.  Other  parallel
+       matches,  which are necessarily substrings of the first one, must obvi-
        ously end before the end of the subject.


          PCRE2_EXTENDED


-       If  this  bit  is  set,  most white space characters in the pattern are
-       totally ignored except when escaped or inside a character  class.  How-
-       ever,  white  space  is  not  allowed within sequences such as (?> that
-       introduce various parenthesized groups, nor  within  numerical  quanti-
-       fiers such as {1,3}. Ignorable white space is permitted between an item
-       and a following quantifier and between a quantifier and a  following  +
-       that  indicates  possessiveness. PCRE2_EXTENDED is equivalent to Perl's
-       /x option, and it can be changed within a pattern by a (?x) option set-
-       ting.
+       If this bit is set, most white space characters in the pattern are  to-
+       tally ignored except when escaped or inside a character class. However,
+       white space is not allowed within sequences such as (?> that  introduce
+       various  parenthesized groups, nor within numerical quantifiers such as
+       {1,3}. Ignorable white space is permitted between an item and a follow-
+       ing  quantifier  and  between a quantifier and a following + that indi-
+       cates possessiveness. PCRE2_EXTENDED is equivalent to Perl's /x option,
+       and it can be changed within a pattern by a (?x) option setting.


        When  PCRE2  is compiled without Unicode support, PCRE2_EXTENDED recog-
        nizes as white space only those characters with code points  less  than
@@ -1561,90 +1557,88 @@


          PCRE2_EXTENDED_MORE


-       This option  has  the  effect  of  PCRE2_EXTENDED,  but,  in  addition,
-       unescaped  space  and  horizontal  tab  characters are ignored inside a
-       character class. Note: only these two characters are ignored,  not  the
-       full  set  of pattern white space characters that are ignored outside a
-       character  class.  PCRE2_EXTENDED_MORE  is  equivalent  to  Perl's  /xx
-       option,  and  it can be changed within a pattern by a (?xx) option set-
-       ting.
+       This option has the effect of PCRE2_EXTENDED,  but,  in  addition,  un-
+       escaped  space and horizontal tab characters are ignored inside a char-
+       acter class. Note: only these two characters are ignored, not the  full
+       set  of pattern white space characters that are ignored outside a char-
+       acter class. PCRE2_EXTENDED_MORE is equivalent to  Perl's  /xx  option,
+       and it can be changed within a pattern by a (?xx) option setting.


          PCRE2_FIRSTLINE


        If this option is set, the start of an unanchored pattern match must be
-       before  or  at  the  first  newline in the subject string following the
-       start of matching, though the matched text may continue over  the  new-
+       before or at the first newline in  the  subject  string  following  the
+       start  of  matching, though the matched text may continue over the new-
        line. If startoffset is non-zero, the limiting newline is not necessar-
-       ily the first newline in the  subject.  For  example,  if  the  subject
+       ily  the  first  newline  in  the  subject. For example, if the subject
        string is "abc\nxyz" (where \n represents a single-character newline) a
-       pattern match for "yz" succeeds with PCRE2_FIRSTLINE if startoffset  is
-       greater  than 3. See also PCRE2_USE_OFFSET_LIMIT, which provides a more
-       general limiting facility. If PCRE2_FIRSTLINE is  set  with  an  offset
-       limit,  a match must occur in the first line and also within the offset
+       pattern  match for "yz" succeeds with PCRE2_FIRSTLINE if startoffset is
+       greater than 3. See also PCRE2_USE_OFFSET_LIMIT, which provides a  more
+       general  limiting  facility.  If  PCRE2_FIRSTLINE is set with an offset
+       limit, a match must occur in the first line and also within the  offset
        limit. In other words, whichever limit comes first is used.
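
        The interaction between the first-line rule and an offset limit can be
        sketched as follows. This is an illustrative model only,  not  PCRE2's
        code. The name firstline_start_limit() is hypothetical, SIZE_MAX is used
        here as an assumed stand-in for "no offset limit", and the newline  is
        assumed to be a single "\n" character.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical model: return the latest code-unit offset at which an
   unanchored match may start when PCRE2_FIRSTLINE is in force.
   offset_limit is SIZE_MAX when no offset limit applies (an assumption of
   this sketch). Assumes a single-character "\n" newline. */
size_t firstline_start_limit(const char *subject, size_t length,
                             size_t startoffset, size_t offset_limit)
{
  size_t limit = length;
  const char *nl;

  assert(subject != NULL && startoffset <= length);

  /* Only a newline at or after the start of matching limits the search. */
  nl = memchr(subject + startoffset, '\n', length - startoffset);
  if (nl != NULL) limit = (size_t)(nl - subject);

  /* Whichever limit comes first is used. */
  return (offset_limit < limit) ? offset_limit : limit;
}
```

        For the subject "abc\nxyz", a startoffset of 0 gives a limit of 3,  so
        "yz" at offset 5 cannot match; a startoffset of 4 gives a limit  of  7,
        so it can.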


          PCRE2_LITERAL


        If this option is set, all meta-characters in the pattern are disabled,
-       and  it is treated as a literal string. Matching literal strings with a
+       and it is treated as a literal string. Matching literal strings with  a
        regular expression engine is not the most efficient way of doing it. If
-       you  are  doing  a  lot of literal matching and are worried about effi-
+       you are doing a lot of literal matching and  are  worried  about  effi-
        ciency, you should consider using other approaches. The only other main
        options  that  are  allowed  with  PCRE2_LITERAL  are:  PCRE2_ANCHORED,
        PCRE2_ENDANCHORED, PCRE2_AUTO_CALLOUT, PCRE2_CASELESS, PCRE2_FIRSTLINE,
        PCRE2_MATCH_INVALID_UTF,  PCRE2_NO_START_OPTIMIZE,  PCRE2_NO_UTF_CHECK,
-       PCRE2_UTF,    and    PCRE2_USE_OFFSET_LIMIT.    The    extra    options
-       PCRE2_EXTRA_MATCH_LINE  and  PCRE2_EXTRA_MATCH_WORD are also supported.
-       Any other options cause an error.
+       PCRE2_UTF,  and  PCRE2_USE_OFFSET_LIMIT.  The  extra  options PCRE2_EX-
+       TRA_MATCH_LINE and PCRE2_EXTRA_MATCH_WORD are also supported. Any other
+       options cause an error.


          PCRE2_MATCH_INVALID_UTF


-       This option forces PCRE2_UTF (see below) and also enables  support  for
-       matching  by  pcre2_match() in subject strings that contain invalid UTF
-       sequences.  This facility  is  not  supported  for  DFA  matching.  For
-       details, see the pcre2unicode documentation.
+       This  option  forces PCRE2_UTF (see below) and also enables support for
+       matching by pcre2_match() in subject strings that contain  invalid  UTF
+       sequences.   This  facility  is not supported for DFA matching. For de-
+       tails, see the pcre2unicode documentation.


          PCRE2_MATCH_UNSET_BACKREF


-       If  this  option  is  set,  a  backreference  to an unset capture group
-       matches an empty string (by default this causes  the  current  matching
-       alternative  to  fail).   A  pattern such as (\1)(a) succeeds when this
-       option is set (assuming it can find an "a" in the subject), whereas  it
-       fails  by  default,  for  Perl compatibility. Setting this option makes
+       If this option is set,  a  backreference  to  an  unset  capture  group
+       matches  an  empty  string (by default this causes the current matching
+       alternative to fail).  A pattern such as (\1)(a) succeeds when this op-
+       tion  is  set  (assuming it can find an "a" in the subject), whereas it
+       fails by default, for Perl compatibility.  Setting  this  option  makes
        PCRE2 behave more like ECMAscript (aka JavaScript).


          PCRE2_MULTILINE


-       By default, for the purposes of matching "start of line"  and  "end  of
-       line",  PCRE2  treats the subject string as consisting of a single line
-       of characters, even if it actually contains  newlines.  The  "start  of
-       line"  metacharacter  (^)  matches only at the start of the string, and
-       the "end of line" metacharacter ($) matches only  at  the  end  of  the
-       string,  or  before  a  terminating  newline  (except  when  PCRE2_DOL-
-       LAR_ENDONLY is set). Note, however, that unless  PCRE2_DOTALL  is  set,
-       the "any character" metacharacter (.) does not match at a newline. This
-       behaviour (for ^, $, and dot) is the same as Perl.
+       By  default,  for  the purposes of matching "start of line" and "end of
+       line", PCRE2 treats the subject string as consisting of a  single  line
+       of  characters,  even  if  it actually contains newlines. The "start of
+       line" metacharacter (^) matches only at the start of  the  string,  and
+       the  "end  of  line"  metacharacter  ($) matches only at the end of the
+       string, or before a terminating newline (except  when  PCRE2_DOLLAR_EN-
+       DONLY is set). Note, however, that unless PCRE2_DOTALL is set, the "any
+       character" metacharacter (.) does not match at a newline.  This  behav-
+       iour (for ^, $, and dot) is the same as Perl.


-       When PCRE2_MULTILINE is set, the "start of line" and "end  of  line"
-       constructs  match  immediately following or immediately before internal
-       newlines in the subject string, respectively, as well as  at  the  very
-       start  and  end.  This is equivalent to Perl's /m option, and it can be
+       When  PCRE2_MULTILINE  is set, the "start of line" and "end of line"
+       constructs match immediately following or immediately  before  internal
+       newlines  in  the  subject string, respectively, as well as at the very
+       start and end. This is equivalent to Perl's /m option, and  it  can  be
        changed within a pattern by a (?m) option setting. Note that the "start
        of line" metacharacter does not match after a newline at the end of the
-       subject, for compatibility with Perl.  However, you can change this  by
-       setting  the PCRE2_ALT_CIRCUMFLEX option. If there are no newlines in a
-       subject string, or no occurrences of ^  or  $  in  a  pattern,  setting
+       subject,  for compatibility with Perl.  However, you can change this by
+       setting the PCRE2_ALT_CIRCUMFLEX option. If there are no newlines in  a
+       subject  string,  or  no  occurrences  of  ^ or $ in a pattern, setting
        PCRE2_MULTILINE has no effect.


          PCRE2_NEVER_BACKSLASH_C


-       This  option  locks out the use of \C in the pattern that is being com-
-       piled.  This escape can  cause  unpredictable  behaviour  in  UTF-8  or
-       UTF-16  modes,  because  it may leave the current matching point in the
-       middle of a multi-code-unit character. This option  may  be  useful  in
-       applications  that  process  patterns  from external sources. Note that
-       there is also a build-time option that permanently locks out the use of
-       \C.
+       This option locks out the use of \C in the pattern that is  being  com-
+       piled.   This  escape  can  cause  unpredictable  behaviour in UTF-8 or
+       UTF-16 modes, because it may leave the current matching  point  in  the
+       middle of a multi-code-unit character. This option may be useful in ap-
+       plications that process patterns from external sources. Note that there
+       is also a build-time option that permanently locks out the use of \C.


          PCRE2_NEVER_UCP


@@ -1661,9 +1655,9 @@
        This  option  locks out interpretation of the pattern as UTF-8, UTF-16,
        or UTF-32, depending on which library is in use. In particular, it pre-
        vents  the  creator of the pattern from switching to UTF interpretation
-       by starting the pattern with (*UTF).  This  option  may  be  useful  in
-       applications  that process patterns from external sources. The combina-
-       tion of PCRE2_UTF and PCRE2_NEVER_UTF causes an error.
+       by starting the pattern with (*UTF). This option may be useful  in  ap-
+       plications that process patterns from external sources. The combination
+       of PCRE2_UTF and PCRE2_NEVER_UTF causes an error.


          PCRE2_NO_AUTO_CAPTURE


@@ -1738,8 +1732,8 @@
        does. However, if the same match is  run  with  PCRE2_NO_START_OPTIMIZE
        set,  the  initial  scan  along the subject string does not happen. The
        first match attempt is run starting  from  "D"  and  when  this  fails,
-       (*COMMIT)  prevents  any  further  matches  being tried, so the overall
-       result is "no match".
+       (*COMMIT)  prevents any further matches being tried, so the overall re-
+       sult is "no match".


        Another start-up optimization makes use of a minimum  length  for  a
        matching subject, which is recorded when possible. Consider the pattern
@@ -1750,8 +1744,8 @@
        "XXBB", the "starting character" optimization skips "XX", then tries to
        match  "BB", which is long enough. In the process, (*MARK:2) is encoun-
        tered and remembered. When the match attempt fails,  the  next  "B"  is
-       found,  but  there  is  only  one  character left, so there are no more
-       attempts, and "no match" is returned with the "last mark seen"  set  to
+       found,  but  there is only one character left, so there are no more at-
+       tempts, and "no match" is returned with the "last  mark  seen"  set  to
        "2".  If  NO_START_OPTIMIZE is set, however, matches are tried at every
        possible starting position, including at the end of the subject,  where
        (*MARK:1)  is encountered, but there is no "B", so the "last mark seen"
@@ -1769,8 +1763,8 @@


        If you know that your pattern is a valid UTF string, and  you  want  to
        skip   this   check   for   performance   reasons,   you  can  set  the
-       PCRE2_NO_UTF_CHECK option. When it is set, the  effect  of  passing  an
-       invalid UTF string as a pattern is undefined. It may cause your program
+       PCRE2_NO_UTF_CHECK option. When it is set, the effect of passing an in-
+       valid  UTF  string as a pattern is undefined. It may cause your program
        to crash or loop.


        Note  that  this  option  can  also  be  passed  to  pcre2_match()  and
@@ -1810,8 +1804,8 @@
        This option must be set for pcre2_compile() if pcre2_set_offset_limit()
        is  going  to be used to set a non-default offset limit in a match con-
        text for matches that use this pattern. An error  is  generated  if  an
-       offset  limit  is  set  without  this option. For more details, see the
-       description of pcre2_set_offset_limit() in the section  that  describes
+       offset  limit is set without this option. For more details, see the de-
+       scription of pcre2_set_offset_limit() in  the  section  that  describes
        match contexts. See also the PCRE2_FIRSTLINE option above.


          PCRE2_UTF
@@ -1820,9 +1814,9 @@
        strings that are subsequently processed as strings  of  UTF  characters
        instead  of  single-code-unit  strings.  It  is available when PCRE2 is
        built to include Unicode support (which is  the  default).  If  Unicode
-       support  is  not  available,  the use of this option provokes an error.
-       Details of how PCRE2_UTF changes the behaviour of PCRE2  are  given  in
-       the  pcre2unicode  page.  In  particular,  note that it changes the way
+       support is not available, the use of this option provokes an error. De-
+       tails of how PCRE2_UTF changes the behaviour of PCRE2 are given in  the
+       pcre2unicode  page.  In  particular,  note  that  it  changes  the  way
        PCRE2_CASELESS handles characters with code points greater than 127.


    Extra compile options
@@ -1843,74 +1837,73 @@


        These values also cause errors if encountered in escape sequences  such
        as \x{d912} within a pattern. However, it seems that some applications,
-       when using PCRE2 to check for unwanted  characters  in  UTF-8  strings,
-       explicitly   test  for  the  surrogates  using  escape  sequences.  The
-       PCRE2_NO_UTF_CHECK option does  not  disable  the  error  that  occurs,
-       because  it applies only to the testing of input strings for UTF valid-
-       ity.
+       when using PCRE2 to check for unwanted characters in UTF-8 strings, ex-
+       plicitly   test   for   the  surrogates  using  escape  sequences.  The
+       PCRE2_NO_UTF_CHECK option does not disable the error that  occurs,  be-
+       cause it applies only to the testing of input strings for UTF validity.


-       If the extra option PCRE2_EXTRA_ALLOW_SURROGATE_ESCAPES is set,  surro-
-       gate  code  point values in UTF-8 and UTF-32 patterns no longer provoke
-       errors and are incorporated in the compiled pattern. However, they  can
-       only  match  subject characters if the matching function is called with
+       If  the extra option PCRE2_EXTRA_ALLOW_SURROGATE_ESCAPES is set, surro-
+       gate code point values in UTF-8 and UTF-32 patterns no  longer  provoke
+       errors  and are incorporated in the compiled pattern. However, they can
+       only match subject characters if the matching function is  called  with
        PCRE2_NO_UTF_CHECK set.


          PCRE2_EXTRA_ALT_BSUX


-       The original option PCRE2_ALT_BSUX causes PCRE2 to process \U, \u,  and
-       \x  in  the way that ECMAscript (aka JavaScript) does. Additional func-
+       The  original option PCRE2_ALT_BSUX causes PCRE2 to process \U, \u, and
+       \x in the way that ECMAscript (aka JavaScript) does.  Additional  func-
        tionality was defined by ECMAscript 6; setting PCRE2_EXTRA_ALT_BSUX has
-       the  effect  of PCRE2_ALT_BSUX, but in addition it recognizes \u{hhh..}
+       the effect of PCRE2_ALT_BSUX, but in addition it  recognizes  \u{hhh..}
        as a hexadecimal character code, where hhh.. is any number of hexadeci-
        mal digits.


          PCRE2_EXTRA_BAD_ESCAPE_IS_LITERAL


-       This  is a dangerous option. Use with care. By default, an unrecognized
-       escape such as \j or a malformed one such as \x{2z} causes  a  compile-
+       This is a dangerous option. Use with care. By default, an  unrecognized
+       escape  such  as \j or a malformed one such as \x{2z} causes a compile-
        time error when detected by pcre2_compile(). Perl is somewhat inconsis-
-       tent in handling such items: for example, \j is treated  as  a  literal
-       "j",  and non-hexadecimal digits in \x{} are just ignored, though warn-
-       ings are given in both cases if Perl's warning switch is enabled.  How-
-       ever,  a  malformed  octal  number  after \o{ always causes an error in
+       tent  in  handling  such items: for example, \j is treated as a literal
+       "j", and non-hexadecimal digits in \x{} are just ignored, though  warn-
+       ings  are given in both cases if Perl's warning switch is enabled. How-
+       ever, a malformed octal number after \o{  always  causes  an  error  in
        Perl.


-       If the PCRE2_EXTRA_BAD_ESCAPE_IS_LITERAL  extra  option  is  passed  to
-       pcre2_compile(),  all  unrecognized  or  malformed escape sequences are
-       treated as single-character escapes. For example, \j is a  literal  "j"
-       and  \x{2z}  is  treated  as  the  literal string "x{2z}". Setting this
-       option means that typos in patterns may go undetected  and  have  unex-
-       pected  results. Also note that a sequence such as [\N{] is interpreted
-       as a malformed attempt at [\N{...}] and so is treated as  [N{]  whereas
-       [\N]  gives  an  error  because  an  unqualified  \N  is a valid escape
-       sequence but is not supported in a character class. To reiterate:  this
-       is a dangerous option. Use with great care.
+       If  the  PCRE2_EXTRA_BAD_ESCAPE_IS_LITERAL  extra  option  is passed to
+       pcre2_compile(), all unrecognized or  malformed  escape  sequences  are
+       treated  as  single-character escapes. For example, \j is a literal "j"
+       and \x{2z} is treated as the literal string "x{2z}". Setting  this  op-
+       tion means that typos in patterns may go undetected and have unexpected
+       results. Also note that a sequence such as [\N{] is  interpreted  as  a
+       malformed  attempt  at [\N{...}] and so is treated as [N{] whereas [\N]
+       gives an error because an unqualified \N is a valid escape sequence but
+       is  not supported in a character class. To reiterate: this is a danger-
+       ous option. Use with great care.


          PCRE2_EXTRA_ESCAPED_CR_IS_LF


-       There  are  some  legacy applications where the escape sequence \r in a
-       pattern is expected to match a newline. If this option is set, \r in  a
-       pattern  is  converted to \n so that it matches a LF (linefeed) instead
-       of a CR (carriage return) character. The option does not affect a  lit-
-       eral  CR in the pattern, nor does it affect CR specified as an explicit
+       There are some legacy applications where the escape sequence  \r  in  a
+       pattern  is expected to match a newline. If this option is set, \r in a
+       pattern is converted to \n so that it matches a LF  (linefeed)  instead
+       of  a CR (carriage return) character. The option does not affect a lit-
+       eral CR in the pattern, nor does it affect CR specified as an  explicit
        code point such as \x{0D}.


          PCRE2_EXTRA_MATCH_LINE


-       This option is provided for use by  the  -x  option  of  pcre2grep.  It
-       causes  the  pattern  only to match complete lines. This is achieved by
-       automatically inserting the code for "^(?:" at the start  of  the  com-
-       piled  pattern  and ")$" at the end. Thus, when PCRE2_MULTILINE is set,
-       the matched line may be in the  middle  of  the  subject  string.  This
-       option can be used with PCRE2_LITERAL.
+       This  option  is  provided  for  use  by the -x option of pcre2grep. It
+       causes the pattern only to match complete lines. This  is  achieved  by
+       automatically  inserting  the  code for "^(?:" at the start of the com-
+       piled pattern and ")$" at the end. Thus, when PCRE2_MULTILINE  is  set,
+       the  matched  line may be in the middle of the subject string. This op-
+       tion can be used with PCRE2_LITERAL.


          PCRE2_EXTRA_MATCH_WORD


-       This  option  is  provided  for  use  by the -w option of pcre2grep. It
-       causes the pattern only to match strings that have a word  boundary  at
-       the  start and the end. This is achieved by automatically inserting the
-       code for "\b(?:" at the start of the compiled pattern and ")\b" at  the
-       end.  The option may be used with PCRE2_LITERAL. However, it is ignored
+       This option is provided for use by  the  -w  option  of  pcre2grep.  It
+       causes  the  pattern only to match strings that have a word boundary at
+       the start and the end. This is achieved by automatically inserting  the
+       code  for "\b(?:" at the start of the compiled pattern and ")\b" at the
+       end. The option may be used with PCRE2_LITERAL. However, it is  ignored
        if PCRE2_EXTRA_MATCH_LINE is also set.
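
The wrapping described for PCRE2_EXTRA_MATCH_LINE and PCRE2_EXTRA_MATCH_WORD
can be pictured with a small sketch that builds the equivalent pattern text.
This is an illustration only: PCRE2 inserts the extra code into the compiled
pattern rather than editing the pattern string, so unlike the real options
this textual version would not be correct together with PCRE2_LITERAL. The
helper names wrap_pattern() and wrap_for_options() are hypothetical and are
not part of the PCRE2 API.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not a PCRE2 function): returns a newly allocated
   string made of prefix + pattern + suffix, or NULL if malloc() fails.
   The caller must free() the result. */
static char *wrap_pattern(const char *prefix, const char *pattern,
                          const char *suffix)
{
    assert(prefix != NULL && pattern != NULL && suffix != NULL);
    size_t len = strlen(prefix) + strlen(pattern) + strlen(suffix) + 1;
    char *out = malloc(len);
    if (out == NULL)
        return NULL;
    snprintf(out, len, "%s%s%s", prefix, pattern, suffix);
    return out;
}

/* Textual analogue of the two extra options. As in the description above,
   the word wrapping is ignored when line matching is also requested. */
static char *wrap_for_options(const char *pattern, int match_line,
                              int match_word)
{
    if (match_line)
        return wrap_pattern("^(?:", pattern, ")$");
    if (match_word)
        return wrap_pattern("\\b(?:", pattern, ")\\b");
    return wrap_pattern("", pattern, "");
}
```

The non-capturing group matters: without it, a pattern such as cat|dog would
bind the leading assertion only to the first alternative and the trailing one
only to the last.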



@@ -1933,53 +1926,53 @@

        void pcre2_jit_stack_free(pcre2_jit_stack *jit_stack);


-       These functions provide support for  JIT  compilation,  which,  if  the
-       just-in-time  compiler  is available, further processes a compiled pat-
+       These  functions  provide  support  for  JIT compilation, which, if the
+       just-in-time compiler is available, further processes a  compiled  pat-
        tern into machine code that executes much faster than the pcre2_match()
-       interpretive  matching function. Full details are given in the pcre2jit
+       interpretive matching function. Full details are given in the  pcre2jit
        documentation.


-       JIT compilation is a heavyweight optimization. It can  take  some  time
-       for  patterns  to  be analyzed, and for one-off matches and simple pat-
-       terns the benefit of faster execution might be offset by a much  slower
-       compilation  time.  Most (but not all) patterns can be optimized by the
+       JIT  compilation  is  a heavyweight optimization. It can take some time
+       for patterns to be analyzed, and for one-off matches  and  simple  pat-
+       terns  the benefit of faster execution might be offset by a much slower
+       compilation time.  Most (but not all) patterns can be optimized by  the
        JIT compiler.



LOCALE SUPPORT

-       PCRE2 handles caseless matching, and determines whether characters  are
-       letters,  digits, or whatever, by reference to a set of tables, indexed
-       by character code point. This applies only  to  characters  whose  code
-       points  are  less than 256. By default, higher-valued code points never
-       match escapes such as \w or \d.  However, if PCRE2 is built  with  Uni-
+       PCRE2  handles caseless matching, and determines whether characters are
+       letters, digits, or whatever, by reference to a set of tables,  indexed
+       by  character  code  point.  This applies only to characters whose code
+       points are less than 256. By default, higher-valued code  points  never
+       match  escapes  such as \w or \d.  However, if PCRE2 is built with Uni-
        code support, all characters can be tested with \p and \P, or, alterna-
-       tively, the PCRE2_UCP option can be set when  a  pattern  is  compiled;
-       this  causes  \w and friends to use Unicode property support instead of
+       tively,  the  PCRE2_UCP  option  can be set when a pattern is compiled;
+       this causes \w and friends to use Unicode property support  instead  of
        the built-in tables.


-       The use of locales with Unicode is discouraged.  If  you  are  handling
-       characters  with  code  points  greater than 127, you should either use
+       The  use  of  locales  with Unicode is discouraged. If you are handling
+       characters with code points greater than 127,  you  should  either  use
        Unicode support, or use locales, but not try to mix the two.


-       PCRE2 contains an internal set of character tables  that  are  used  by
-       default.   These  are  sufficient  for many applications. Normally, the
-       internal tables recognize only ASCII characters. However, when PCRE2 is
+       PCRE2 contains an internal set of character tables that are used by de-
+       fault.  These are sufficient for many applications. Normally,  the  in-
+       ternal  tables  recognize only ASCII characters. However, when PCRE2 is
        built, it is possible to cause the internal tables to be rebuilt in the
        default "C" locale of the local system, which may cause them to be dif-
        ferent.


-       The  internal tables can be overridden by tables supplied by the appli-
-       cation that calls PCRE2. These may be created  in  a  different  locale
-       from  the  default.  As more and more applications change to using Uni-
+       The internal tables can be overridden by tables supplied by the  appli-
+       cation  that  calls  PCRE2.  These may be created in a different locale
+       from the default.  As more and more applications change to  using  Uni-
        code, the need for this locale support is expected to die away.


-       External tables are built by calling the  pcre2_maketables()  function,
-       in  the relevant locale. The result can be passed to pcre2_compile() as
-       often  as  necessary,  by  creating  a  compile  context  and   calling
-       pcre2_set_character_tables()  to  set  the  tables pointer therein. For
-       example, to build and use tables that are appropriate  for  the  French
-       locale  (where  accented  characters  with  values greater than 127 are
+       External  tables  are built by calling the pcre2_maketables() function,
+       in the relevant locale. The result can be passed to pcre2_compile()  as
+       often   as  necessary,  by  creating  a  compile  context  and  calling
+       pcre2_set_character_tables() to set the tables pointer therein. For ex-
+       ample,  to build and use tables that are appropriate for the French lo-
+       cale (where accented  characters  with  values  greater  than  127  are
        treated as letters), the following code could be used:


          setlocale(LC_CTYPE, "fr_FR");
@@ -1988,15 +1981,15 @@
          pcre2_set_character_tables(ccontext, tables);
          re = pcre2_compile(..., ccontext);


-       The locale name "fr_FR" is used on Linux and other  Unix-like  systems;
-       if  you  are using Windows, the name for the French locale is "french".
-       It is the caller's responsibility to ensure that the memory  containing
+       The  locale  name "fr_FR" is used on Linux and other Unix-like systems;
+       if you are using Windows, the name for the French locale  is  "french".
+       It  is the caller's responsibility to ensure that the memory containing
        the tables remains available for as long as it is needed.


        The pointer that is passed (via the compile context) to pcre2_compile()
-       is saved with the compiled pattern, and the same  tables  are  used  by
-       pcre2_match()  and pcre2_dfa_match(). Thus, for any single pattern, com-
-       pilation and matching both happen in the  same  locale,  but  different
+       is  saved  with  the  compiled pattern, and the same tables are used by
+       pcre2_match() and pcre2_dfa_match(). Thus, for any single pattern,  com-
+       pilation  and  matching  both  happen in the same locale, but different
        patterns can be processed in different locales.



@@ -2004,13 +1997,13 @@

        int pcre2_pattern_info(const pcre2_code *code, uint32_t what, void *where);


-       The  pcre2_pattern_info()  function returns general information about a
+       The pcre2_pattern_info() function returns general information  about  a
        compiled pattern. For information about callouts, see the next section.
-       The  first  argument  for pcre2_pattern_info() is a pointer to the com-
+       The first argument for pcre2_pattern_info() is a pointer  to  the  com-
        piled pattern. The second argument specifies which piece of information
-       is  required,  and  the  third  argument  is a pointer to a variable to
-       receive the data. If the third argument is NULL, the first argument  is
-       ignored,  and  the  function  returns the size in bytes of the variable
+       is required, and the third argument is a pointer to a variable  to  re-
+       ceive  the  data.  If the third argument is NULL, the first argument is
+       ignored, and the function returns the size in  bytes  of  the  variable
        that is required for the information requested. Otherwise, the yield of
        the function is zero for success, or one of the following negative num-
        bers:
@@ -2020,9 +2013,9 @@
          PCRE2_ERROR_BADOPTION      the value of what was invalid
          PCRE2_ERROR_UNSET          the requested field is not set


-       The "magic number" is placed at the start of each compiled  pattern  as
-       a  simple check against passing an arbitrary memory pointer. Here is a
-       typical call of pcre2_pattern_info(), to obtain the length of the  com-
+       The  "magic  number" is placed at the start of each compiled pattern as
+       a simple check against passing an arbitrary memory pointer. Here is  a
+       typical  call of pcre2_pattern_info(), to obtain the length of the com-
        piled pattern:


          int rc;
@@ -2040,22 +2033,22 @@
          PCRE2_INFO_EXTRAOPTIONS


        Return copies of the pattern's options. The third argument should point
-       to  a  uint32_t  variable.  PCRE2_INFO_ARGOPTIONS  returns  exactly the
-       options that were passed to pcre2_compile(), whereas  PCRE2_INFO_ALLOP-
-       TIONS  returns  the compile options as modified by any top-level (*XXX)
-       option settings such as (*UTF) at the  start  of  the  pattern  itself.
-       PCRE2_INFO_EXTRAOPTIONS  returns the extra options that were set in the
-       compile context by calling the pcre2_set_compile_extra_options()  func-
+       to a uint32_t variable. PCRE2_INFO_ARGOPTIONS returns exactly  the  op-
+       tions  that  were  passed to pcre2_compile(), whereas PCRE2_INFO_ALLOP-
+       TIONS returns the compile options as modified by any  top-level  (*XXX)
+       option  settings  such  as  (*UTF)  at the start of the pattern itself.
+       PCRE2_INFO_EXTRAOPTIONS returns the extra options that were set in  the
+       compile  context by calling the pcre2_set_compile_extra_options() func-
        tion.


-       For   example,   if  the  pattern  /(*UTF)abc/  is  compiled  with  the
-       PCRE2_EXTENDED  option,  the  result   for   PCRE2_INFO_ALLOPTIONS   is
-       PCRE2_EXTENDED  and  PCRE2_UTF.   Option settings such as (?i) that can
-       change within a pattern do not affect the result  of  PCRE2_INFO_ALLOP-
-       TIONS, even if they appear right at the start of the pattern. (This was
-       different in some earlier releases.)
+       For example, if the pattern /(*UTF)abc/ is compiled with the  PCRE2_EX-
+       TENDED  option,  the result for PCRE2_INFO_ALLOPTIONS is PCRE2_EXTENDED
+       and PCRE2_UTF.  Option settings such as (?i) that can change  within  a
+       pattern do not affect the result of PCRE2_INFO_ALLOPTIONS, even if they
+       appear right at the start of the pattern. (This was different  in  some
+       earlier releases.)


-       A pattern compiled without PCRE2_ANCHORED is automatically anchored  by
+       A  pattern compiled without PCRE2_ANCHORED is automatically anchored by
        PCRE2 if the first significant item in every top-level branch is one of
        the following:


@@ -2064,7 +2057,7 @@
          \G    always
          .*    sometimes - see below


-       When .* is the first significant item, anchoring is possible only  when
+       When  .* is the first significant item, anchoring is possible only when
        all the following are true:


          .* is not in an atomic group
@@ -2074,15 +2067,15 @@
          Neither (*PRUNE) nor (*SKIP) appears in the pattern
          PCRE2_NO_DOTSTAR_ANCHOR is not set


-       For  patterns  that are auto-anchored, the PCRE2_ANCHORED bit is set in
+       For patterns that are auto-anchored, the PCRE2_ANCHORED bit is  set  in
        the options returned for PCRE2_INFO_ALLOPTIONS.


          PCRE2_INFO_BACKREFMAX


-       Return the number of the highest  backreference  in  the  pattern.  The
-       third  argument  should  point  to  an uint32_t variable. Named capture
-       groups acquire numbers as well as names, and these  count  towards  the
-       highest  backreference.  Backreferences  such as \4 or \g{12} match the
+       Return  the  number  of  the  highest backreference in the pattern. The
+       third argument should point to  an  uint32_t  variable.  Named  capture
+       groups  acquire  numbers  as well as names, and these count towards the
+       highest backreference. Backreferences such as \4 or  \g{12}  match  the
        captured characters of the given group, but in addition, the check that
        a capture group is set in a conditional group such as (?(3)a|b) is also
        a backreference.  Zero is returned if there are no backreferences.
@@ -2089,56 +2082,56 @@


          PCRE2_INFO_BSR


-       The output is a uint32_t integer whose value indicates  what  character
-       sequences  the \R escape sequence matches. A value of PCRE2_BSR_UNICODE
-       means that \R matches any Unicode line  ending  sequence;  a  value  of
+       The  output  is a uint32_t integer whose value indicates what character
+       sequences the \R escape sequence matches. A value of  PCRE2_BSR_UNICODE
+       means  that  \R  matches  any  Unicode line ending sequence; a value of
        PCRE2_BSR_ANYCRLF means that \R matches only CR, LF, or CRLF.


          PCRE2_INFO_CAPTURECOUNT


-       Return  the  highest  capture  group number in the pattern. In patterns
+       Return the highest capture group number in  the  pattern.  In  patterns
        where (?| is not used, this is also the total number of capture groups.
        The third argument should point to an uint32_t variable.


          PCRE2_INFO_DEPTHLIMIT


-       If  the  pattern set a backtracking depth limit by including an item of
-       the form (*LIMIT_DEPTH=nnnn) at the start, the value is  returned.  The
+       If the pattern set a backtracking depth limit by including an  item  of
+       the  form  (*LIMIT_DEPTH=nnnn) at the start, the value is returned. The
        third argument should point to a uint32_t integer. If no such value has
-       been  set,  the  call  to  pcre2_pattern_info()   returns   the   error
-       PCRE2_ERROR_UNSET. Note that this limit will only be used during match-
-       ing if it is less than the limit set or defaulted by the caller of  the
-       match function.
+       been  set, the call to pcre2_pattern_info() returns the error PCRE2_ER-
+       ROR_UNSET. Note that this limit will only be used during matching if it
+       is  less  than  the  limit  set or defaulted by the caller of the match
+       function.


          PCRE2_INFO_FIRSTBITMAP


-       In  the absence of a single first code unit for a non-anchored pattern,
-       pcre2_compile() may construct a 256-bit table that defines a fixed  set
-       of  values for the first code unit in any match. For example, a pattern
-       that starts with [abc] results in a table with  three  bits  set.  When
-       code  unit  values greater than 255 are supported, the flag bit for 255
-       means "any code unit of value 255 or above". If such a table  was  con-
-       structed,  a pointer to it is returned. Otherwise NULL is returned. The
+       In the absence of a single first code unit for a non-anchored  pattern,
+       pcre2_compile()  may construct a 256-bit table that defines a fixed set
+       of values for the first code unit in any match. For example, a  pattern
+       that  starts  with  [abc]  results in a table with three bits set. When
+       code unit values greater than 255 are supported, the flag bit  for  255
+       means  "any  code unit of value 255 or above". If such a table was con-
+       structed, a pointer to it is returned. Otherwise NULL is returned.  The
        third argument should point to a const uint8_t * variable.


          PCRE2_INFO_FIRSTCODETYPE


        Return information about the first code unit of any matched string, for
-       a  non-anchored pattern. The third argument should point to an uint32_t
-       variable. If there is a fixed first value, for example, the letter  "c"
-       from  a  pattern such as (cat|cow|coyote), 1 is returned, and the value
-       can be retrieved using PCRE2_INFO_FIRSTCODEUNIT. If there is  no  fixed
-       first  value,  but it is known that a match can occur only at the start
-       of the subject or following a newline in the subject,  2  is  returned.
+       a non-anchored pattern. The third argument should point to an  uint32_t
+       variable.  If there is a fixed first value, for example, the letter "c"
+       from a pattern such as (cat|cow|coyote), 1 is returned, and  the  value
+       can  be  retrieved using PCRE2_INFO_FIRSTCODEUNIT. If there is no fixed
+       first value, but it is known that a match can occur only at  the  start
+       of  the  subject  or following a newline in the subject, 2 is returned.
        Otherwise, and for anchored patterns, 0 is returned.


          PCRE2_INFO_FIRSTCODEUNIT


-       Return  the  value  of  the first code unit of any matched string for a
-       pattern where PCRE2_INFO_FIRSTCODETYPE returns 1; otherwise  return  0.
-       The  third  argument should point to an uint32_t variable. In the 8-bit
-       library, the value is always less than 256. In the 16-bit  library  the
-       value  can  be  up  to 0xffff. In the 32-bit library in UTF-32 mode the
+       Return the value of the first code unit of any  matched  string  for  a
+       pattern  where  PCRE2_INFO_FIRSTCODETYPE returns 1; otherwise return 0.
+       The third argument should point to an uint32_t variable. In  the  8-bit
+       library,  the  value is always less than 256. In the 16-bit library the
+       value can be up to 0xffff. In the 32-bit library  in  UTF-32  mode  the
        value can be up to 0x10ffff, and up to 0xffffffff when not using UTF-32
        mode.


@@ -2145,8 +2138,8 @@
          PCRE2_INFO_FRAMESIZE


        Return the size (in bytes) of the data frames that are used to remember
-       backtracking positions when the pattern is processed  by  pcre2_match()
-       without  the  use  of  JIT. The third argument should point to a size_t
+       backtracking  positions  when the pattern is processed by pcre2_match()
+       without the use of JIT. The third argument should  point  to  a  size_t
        variable. The frame size depends on the number of capturing parentheses
        in the pattern. Each additional capture group adds two PCRE2_SIZE vari-
        ables.
@@ -2153,16 +2146,16 @@


          PCRE2_INFO_HASBACKSLASHC


-       Return 1 if the pattern contains any instances of \C, otherwise 0.  The
+       Return  1 if the pattern contains any instances of \C, otherwise 0. The
        third argument should point to an uint32_t variable.


          PCRE2_INFO_HASCRORLF


-       Return  1  if  the  pattern  contains any explicit matches for CR or LF
+       Return 1 if the pattern contains any explicit  matches  for  CR  or  LF
        characters, otherwise 0. The third argument should point to an uint32_t
-       variable.  An explicit match is either a literal CR or LF character, or
-       \r or  \n  or  one  of  the  equivalent  hexadecimal  or  octal  escape
-       sequences.
+       variable. An explicit match is either a literal CR or LF character,  or
+       \r  or  \n  or  one  of  the equivalent hexadecimal or octal escape se-
+       quences.


          PCRE2_INFO_HEAPLIMIT


@@ -2169,82 +2162,82 @@
        If the pattern set a heap memory limit by including an item of the form
        (*LIMIT_HEAP=nnnn) at the start, the value is returned. The third argu-
        ment should point to a uint32_t integer. If no such value has been set,
-       the call to pcre2_pattern_info() returns the  error  PCRE2_ERROR_UNSET.
-       Note  that  this  limit will only be used during matching if it is less
+       the  call  to pcre2_pattern_info() returns the error PCRE2_ERROR_UNSET.
+       Note that this limit will only be used during matching if  it  is  less
        than the limit set or defaulted by the caller of the match function.


          PCRE2_INFO_JCHANGED


-       Return 1 if the (?J) or (?-J) option setting is used  in  the  pattern,
-       otherwise  0.  The third argument should point to an uint32_t variable.
-       (?J) and (?-J) set and unset the local PCRE2_DUPNAMES  option,  respec-
+       Return  1  if  the (?J) or (?-J) option setting is used in the pattern,
+       otherwise 0. The third argument should point to an  uint32_t  variable.
+       (?J)  and  (?-J) set and unset the local PCRE2_DUPNAMES option, respec-
        tively.


          PCRE2_INFO_JITSIZE


-       If  the  compiled  pattern was successfully processed by pcre2_jit_com-
-       pile(), return the size of the  JIT  compiled  code,  otherwise  return
+       If the compiled pattern was successfully  processed  by  pcre2_jit_com-
+       pile(),  return  the  size  of  the JIT compiled code, otherwise return
        zero. The third argument should point to a size_t variable.


          PCRE2_INFO_LASTCODETYPE


-       Returns  1 if there is a rightmost literal code unit that must exist in
-       any matched string, other than at its start. The third argument  should
-       point  to  an  uint32_t  variable.  If  there  is  no  such value, 0 is
-       returned. When 1 is  returned,  the  code  unit  value  itself  can  be
-       retrieved  using PCRE2_INFO_LASTCODEUNIT. For anchored patterns, a last
-       literal value is recorded only if  it  follows  something  of  variable
-       length.  For example, for the pattern /^a\d+z\d+/ the returned value is
-       1 (with "z" returned from PCRE2_INFO_LASTCODEUNIT), but  for  /^a\dz\d/
-       the returned value is 0.
+       Returns 1 if there is a rightmost literal code unit that must exist  in
+       any  matched string, other than at its start. The third argument should
+       point to an uint32_t variable. If there is no  such  value,  0  is  re-
+       turned. When 1 is returned, the code unit value itself can be retrieved
+       using PCRE2_INFO_LASTCODEUNIT. For anchored patterns,  a  last  literal
+       value  is recorded only if it follows something of variable length. For
+       example, for the pattern /^a\d+z\d+/ the returned value is 1 (with  "z"
+       returned  from PCRE2_INFO_LASTCODEUNIT), but for /^a\dz\d/ the returned
+       value is 0.


          PCRE2_INFO_LASTCODEUNIT


-       Return  the value of the rightmost literal code unit that must exist in
-       any matched string, other than  at  its  start,  for  a  pattern  where
+       Return the value of the rightmost literal code unit that must exist  in
+       any  matched  string,  other  than  at  its  start, for a pattern where
        PCRE2_INFO_LASTCODETYPE returns 1. Otherwise, return 0. The third argu-
        ment should point to an uint32_t variable.


          PCRE2_INFO_MATCHEMPTY


-       Return 1 if the pattern might match an empty string, otherwise  0.  The
-       third  argument  should  point  to an uint32_t variable. When a pattern
+       Return  1  if the pattern might match an empty string, otherwise 0. The
+       third argument should point to an uint32_t  variable.  When  a  pattern
        contains recursive subroutine calls it is not always possible to deter-
-       mine  whether  or  not it can match an empty string. PCRE2 takes a cau-
+       mine whether or not it can match an empty string. PCRE2  takes  a  cau-
        tious approach and returns 1 in such cases.


          PCRE2_INFO_MATCHLIMIT


-       If the pattern set a match limit by  including  an  item  of  the  form
-       (*LIMIT_MATCH=nnnn)  at  the  start,  the  value is returned. The third
-       argument should point to a uint32_t integer. If no such value has  been
-       set,    the    call   to   pcre2_pattern_info()   returns   the   error
-       PCRE2_ERROR_UNSET. Note that this limit will only be used during match-
-       ing  if it is less than the limit set or defaulted by the caller of the
-       match function.
+       If  the  pattern  set  a  match  limit by including an item of the form
+       (*LIMIT_MATCH=nnnn) at the start, the value is returned. The third  ar-
+       gument  should  point  to a uint32_t integer. If no such value has been
+       set, the call to pcre2_pattern_info() returns the error PCRE2_ERROR_UN-
+       SET.  Note  that  this limit will only be used during matching if it is
+       less than the limit set or defaulted by the caller of the  match  func-
+       tion.


          PCRE2_INFO_MAXLOOKBEHIND


        Return the number of characters (not code units) in the longest lookbe-
-       hind  assertion  in  the  pattern. The third argument should point to a
-       uint32_t integer. This information is useful when  doing  multi-segment
-       matching  using  the  partial matching facilities. Note that the simple
+       hind assertion in the pattern. The third argument  should  point  to  a
+       uint32_t  integer.  This information is useful when doing multi-segment
+       matching using the partial matching facilities. Note  that  the  simple
        assertions \b and \B require a one-character lookbehind. \A also regis-
-       ters  a  one-character  lookbehind, though it does not actually inspect
-       the previous character. This is to ensure that at least  one  character
-       from  the old segment is retained when a new segment is processed. Oth-
-       erwise, if there are no lookbehinds in  the  pattern,  \A  might  match
-       incorrectly at the start of a second or subsequent segment.
+       ters a one-character lookbehind, though it does  not  actually  inspect
+       the  previous  character. This is to ensure that at least one character
+       from the old segment is retained when a new segment is processed.  Oth-
+       erwise,  if there are no lookbehinds in the pattern, \A might match in-
+       correctly at the start of a second or subsequent segment.


          PCRE2_INFO_MINLENGTH


-       If  a  minimum  length  for  matching subject strings was computed, its
+       If a minimum length for matching  subject  strings  was  computed,  its
        value is returned. Otherwise the returned value is 0. This value is not
-       computed  when PCRE2_NO_START_OPTIMIZE is set. The value is a number of
-       characters, which in UTF mode may be different from the number of  code
-       units.  The  third  argument  should point to an uint32_t variable. The
-       value is a lower bound to the length of any matching string. There  may
-       not  be  any  strings  of that length that do actually match, but every
+       computed when PCRE2_NO_START_OPTIMIZE is set. The value is a number  of
+       characters,  which in UTF mode may be different from the number of code
+       units. The third argument should point to  an  uint32_t  variable.  The
+       value  is a lower bound to the length of any matching string. There may
+       not be any strings of that length that do  actually  match,  but  every
        string that does match is at least that long.


          PCRE2_INFO_NAMECOUNT
@@ -2252,44 +2245,44 @@
          PCRE2_INFO_NAMETABLE


        PCRE2 supports the use of named as well as numbered capturing parenthe-
-       ses.  The names are just an additional way of identifying the parenthe-
+       ses. The names are just an additional way of identifying the  parenthe-
        ses, which still acquire numbers. Several convenience functions such as
-       pcre2_substring_get_byname()  are provided for extracting captured sub-
-       strings by name. It is also possible to extract the data  directly,  by
-       first  converting  the  name to a number in order to access the correct
-       pointers in the output vector (described with pcre2_match() below).  To
-       do  the  conversion,  you  need to use the name-to-number map, which is
-       described by these three values.
+       pcre2_substring_get_byname() are provided for extracting captured  sub-
+       strings  by  name. It is also possible to extract the data directly, by
+       first converting the name to a number in order to  access  the  correct
+       pointers  in the output vector (described with pcre2_match() below). To
+       do the conversion, you need to use the name-to-number map, which is de-
+       scribed by these three values.


-       The map consists of a number of  fixed-size  entries.  PCRE2_INFO_NAME-
-       COUNT  gives  the number of entries, and PCRE2_INFO_NAMEENTRYSIZE gives
-       the size of each entry in code units; both of these return  a  uint32_t
+       The  map  consists  of a number of fixed-size entries. PCRE2_INFO_NAME-
+       COUNT gives the number of entries, and  PCRE2_INFO_NAMEENTRYSIZE  gives
+       the  size  of each entry in code units; both of these return a uint32_t
        value. The entry size depends on the length of the longest name.


        PCRE2_INFO_NAMETABLE returns a pointer to the first entry of the table.
-       This is a PCRE2_SPTR pointer to a block of code  units.  In  the  8-bit
-       library,  the  first two bytes of each entry are the number of the cap-
-       turing parenthesis, most significant byte first. In the 16-bit library,
-       the  pointer  points  to 16-bit code units, the first of which contains
-       the parenthesis number. In the 32-bit library, the  pointer  points  to
-       32-bit  code units, the first of which contains the parenthesis number.
+       This is a PCRE2_SPTR pointer to a block of code units. In the 8-bit li-
+       brary, the first two bytes of each entry are the number of the  captur-
+       ing  parenthesis,  most  significant byte first. In the 16-bit library,
+       the pointer points to 16-bit code units, the first  of  which  contains
+       the  parenthesis  number.  In the 32-bit library, the pointer points to
+       32-bit code units, the first of which contains the parenthesis  number.
        The rest of the entry is the corresponding name, zero terminated.


-       The names are in alphabetical order. If (?| is used to create  multiple
-       capture  groups  with  the  same number, as described in the section on
-       duplicate group numbers in the pcre2pattern page,  the  groups  may  be
-       given  the same name, but there is only one entry in the table. Differ-
-       ent names for groups of the same number are not permitted.
+       The  names are in alphabetical order. If (?| is used to create multiple
+       capture groups with the same number, as described in the section on du-
+       plicate group numbers in the pcre2pattern page, the groups may be given
+       the same name, but there is only one  entry  in  the  table.  Different
+       names for groups of the same number are not permitted.


-       Duplicate names for capture groups with different numbers  are  permit-
+       Duplicate  names  for capture groups with different numbers are permit-
        ted, but only if PCRE2_DUPNAMES is set. They appear in the table in the
-       order in which they were found in the pattern. In the  absence  of  (?|
-       this  is  the  order of increasing number; when (?| is used this is not
-       necessarily the case because later capture groups may have  lower  num-
+       order  in  which  they were found in the pattern. In the absence of (?|
+       this is the order of increasing number; when (?| is used  this  is  not
+       necessarily  the  case because later capture groups may have lower num-
        bers.


-       As  a  simple  example of the name/number table, consider the following
-       pattern after compilation by the 8-bit library  (assume  PCRE2_EXTENDED
+       As a simple example of the name/number table,  consider  the  following
+       pattern  after  compilation by the 8-bit library (assume PCRE2_EXTENDED
        is set, so white space - including newlines - is ignored):


          (?<date> (?<year>(\d\d)?\d\d) -
@@ -2296,7 +2289,7 @@
          (?<month>\d\d) - (?<day>\d\d) )


        There are four named capture groups, so the table has four entries, and
-       each entry in the table is eight bytes long. The table is  as  follows,
+       each  entry  in the table is eight bytes long. The table is as follows,
        with non-printing bytes shown in hexadecimal, and undefined bytes shown
        as ??:


@@ -2305,8 +2298,8 @@
          00 04 m  o  n  t  h  00
          00 02 y  e  a  r  00 ??


-       When writing code to extract data from named capture groups  using  the
-       name-to-number  map,  remember that the length of the entries is likely
+       When  writing  code to extract data from named capture groups using the
+       name-to-number map, remember that the length of the entries  is  likely
        to be different for each compiled pattern.
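
The 8-bit table layout described above can be walked with a few lines of C.
This is only a sketch: lookup_group_number() is a hypothetical helper, not a
PCRE2 function, and it assumes that the caller has already obtained the table
pointer, the entry count, and the entry size (for example via the three
PCRE2_INFO_NAME... requests) and passes them in as plain values.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper, not part of PCRE2. Each entry is entry_size bytes:
   two bytes holding the group number (most significant byte first),
   followed by the zero-terminated name. Returns the group number of the
   first entry whose name matches, or -1 if the name is not in the table. */
int lookup_group_number(const uint8_t *table, uint32_t name_count,
                        uint32_t entry_size, const char *name)
{
    for (uint32_t i = 0; i < name_count; i++) {
        /* The stride comes from the pattern, never from a constant. */
        const uint8_t *entry = table + (size_t)i * entry_size;
        if (strcmp((const char *)(entry + 2), name) == 0)
            return (entry[0] << 8) | entry[1];
    }
    return -1;
}
```

For the table shown above (four entries of eight bytes), looking up "month"
yields 4 and "year" yields 2. A linear scan keeps the sketch short; because
the names are in alphabetical order, a binary search would also work. With
duplicate names, this returns only the first matching entry.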


          PCRE2_INFO_NEWLINE
@@ -2325,15 +2318,15 @@


          PCRE2_INFO_SIZE


-       Return  the  size  of  the  compiled  pattern  in  bytes (for all three
-       libraries). The third argument should point to a size_t variable.  This
-       value  includes  the  size  of the general data block that precedes the
-       code units of the compiled pattern itself. The value that is used  when
-       pcre2_compile()  is  getting memory in which to place the compiled pat-
-       tern may be slightly larger than the value  returned  by  this  option,
-       because  there are cases where the code that calculates the size has to
-       over-estimate. Processing a pattern with  the  JIT  compiler  does  not
-       alter the value returned by this option.
+       Return the size of the compiled pattern in bytes  (for  all  three  li-
+       braries).  The  third  argument should point to a size_t variable. This
+       value includes the size of the general data  block  that  precedes  the
+       code  units of the compiled pattern itself. The value that is used when
+       pcre2_compile() is getting memory in which to place the  compiled  pat-
+       tern may be slightly larger than the value returned by this option, be-
+       cause there are cases where the code that calculates the  size  has  to
+       over-estimate.  Processing a pattern with the JIT compiler does not al-
+       ter the value returned by this option.



 INFORMATION ABOUT A PATTERN'S CALLOUTS
@@ -2343,30 +2336,30 @@
          void *user_data);


        A script language that supports the use of string arguments in callouts
-       might like to scan all the callouts in a  pattern  before  running  the
+       might  like  to  scan  all the callouts in a pattern before running the
        match. This can be done by calling pcre2_callout_enumerate(). The first
-       argument is a pointer to a compiled pattern, the  second  points  to  a
-       callback  function,  and the third is arbitrary user data. The callback
-       function is called for every callout in the pattern  in  the  order  in
+       argument  is  a  pointer  to a compiled pattern, the second points to a
+       callback function, and the third is arbitrary user data.  The  callback
+       function  is  called  for  every callout in the pattern in the order in
        which they appear. Its first argument is a pointer to a callout enumer-
-       ation block, and its second argument is the user_data  value  that  was
-       passed  to  pcre2_callout_enumerate(). The contents of the callout enu-
-       meration block are described in the pcre2callout  documentation,  which
+       ation  block,  and  its second argument is the user_data value that was
+       passed to pcre2_callout_enumerate(). The contents of the  callout  enu-
+       meration  block  are described in the pcre2callout documentation, which
        also gives further details about callouts.



 SERIALIZATION AND PRECOMPILING

-       It  is  possible  to  save  compiled patterns on disc or elsewhere, and
-       reload them later, subject to a number of  restrictions.  The  host  on
-       which  the  patterns  are  reloaded must be running the same version of
+       It is possible to save compiled patterns  on  disc  or  elsewhere,  and
+       reload  them  later,  subject  to a number of restrictions. The host on
+       which the patterns are reloaded must be running  the  same  version  of
        PCRE2, with the same code unit width, and must also have the same endi-
-       anness,  pointer  width,  and PCRE2_SIZE type. Before compiled patterns
-       can be saved, they must be converted to a "serialized" form,  which  in
-       the  case of PCRE2 is really just a bytecode dump.  The functions whose
-       names begin with pcre2_serialize_ are used for converting to  and  from
-       the  serialized form. They are described in the pcre2serialize documen-
-       tation. Note that PCRE2 serialization does not  convert  compiled  pat-
+       anness, pointer width, and PCRE2_SIZE type.  Before  compiled  patterns
+       can  be  saved, they must be converted to a "serialized" form, which in
+       the case of PCRE2 is really just a bytecode dump.  The functions  whose
+       names  begin  with pcre2_serialize_ are used for converting to and from
+       the serialized form. They are described in the pcre2serialize  documen-
+       tation.  Note  that  PCRE2 serialization does not convert compiled pat-
        terns to an abstract format like Java or .NET serialization.



@@ -2380,60 +2373,58 @@

        void pcre2_match_data_free(pcre2_match_data *match_data);


-       Information  about  a  successful  or unsuccessful match is placed in a
-       match data block, which is an opaque  structure  that  is  accessed  by
-       function  calls.  In particular, the match data block contains a vector
-       of offsets into the subject string that define the matched part of  the
-       subject  and  any  substrings  that were captured. This is known as the
+       Information about a successful or unsuccessful match  is  placed  in  a
+       match  data  block,  which  is  an opaque structure that is accessed by
+       function calls. In particular, the match data block contains  a  vector
+       of  offsets into the subject string that define the matched part of the
+       subject and any substrings that were captured. This  is  known  as  the
        ovector.


-       Before calling pcre2_match(), pcre2_dfa_match(),  or  pcre2_jit_match()
+       Before  calling  pcre2_match(), pcre2_dfa_match(), or pcre2_jit_match()
        you must create a match data block by calling one of the creation func-
-       tions above. For pcre2_match_data_create(), the first argument  is  the
-       number  of  pairs  of  offsets  in  the ovector. One pair of offsets is
-       required to identify the string that matched the whole pattern, with an
-       additional  pair for each captured substring. For example, a value of 4
-       creates enough space to record the matched portion of the subject  plus
-       three  captured  substrings. A minimum of at least 1 pair is imposed by
+       tions  above.  For pcre2_match_data_create(), the first argument is the
+       number of pairs of offsets in the ovector. One pair of offsets  is  re-
+       quired  to  identify the string that matched the whole pattern, with an
+       additional pair for each captured substring. For example, a value of  4
+       creates  enough space to record the matched portion of the subject plus
+       three captured substrings. A minimum of at least 1 pair is  imposed  by
        pcre2_match_data_create(), so it is always possible to return the over-
        all matched string.


        The second argument of pcre2_match_data_create() is a pointer to a gen-
-       eral context, which can specify custom memory management for  obtaining
+       eral  context, which can specify custom memory management for obtaining
        the memory for the match data block. If you are not using custom memory
        management, pass NULL, which causes malloc() to be used.


-       For pcre2_match_data_create_from_pattern(), the  first  argument  is  a
+       For  pcre2_match_data_create_from_pattern(),  the  first  argument is a
        pointer to a compiled pattern. The ovector is created to be exactly the
        right size to hold all the substrings a pattern might capture. The sec-
-       ond  argument is again a pointer to a general context, but in this case
+       ond argument is again a pointer to a general context, but in this  case
        if NULL is passed, the memory is obtained using the same allocator that
        was used for the compiled pattern (custom or default).


-       A  match  data block can be used many times, with the same or different
-       compiled patterns. You can extract information from a match data  block
-       after  a  match  operation  has  finished,  using  functions  that  are
-       described in the sections on  matched  strings  and  other  match  data
-       below.
+       A match data block can be used many times, with the same  or  different
+       compiled  patterns. You can extract information from a match data block
+       after a match operation has finished,  using  functions  that  are  de-
+       scribed in the sections on matched strings and other match data below.


        When  a  call  of  pcre2_match()  fails, valid data is available in the
-       match   block   only   when   the   error    is    PCRE2_ERROR_NOMATCH,
-       PCRE2_ERROR_PARTIAL,  or  one  of  the  error  codes for an invalid UTF
-       string. Exactly what is available depends on the error, and is detailed
-       below.
+       match block only  when  the  error  is  PCRE2_ERROR_NOMATCH,  PCRE2_ER-
+       ROR_PARTIAL,  or  one of the error codes for an invalid UTF string. Ex-
+       actly what is available depends on the error, and is detailed below.


-       When  one of the matching functions is called, pointers to the compiled
-       pattern and the subject string are set in the match data block so  that
-       they  can  be referenced by the extraction functions after a successful
+       When one of the matching functions is called, pointers to the  compiled
+       pattern  and the subject string are set in the match data block so that
+       they can be referenced by the extraction functions after  a  successful
        match. After running a match, you must not free a compiled pattern or a
-       subject  string until after all operations on the match data block (for
-       that match) have taken place,  unless,  in  the  case  of  the  subject
-       string,  you  have used the PCRE2_COPY_MATCHED_SUBJECT option, which is
-       described in the  section  entitled  "Option  bits  for  pcre2_match()"
-       below.
+       subject string until after all operations on the match data block  (for
+       that  match)  have  taken  place,  unless,  in  the case of the subject
+       string, you have used the PCRE2_COPY_MATCHED_SUBJECT option,  which  is
+       described  in  the section entitled "Option bits for pcre2_match()" be-
+       low.


-       When  a match data block itself is no longer needed, it should be freed
-       by calling pcre2_match_data_free(). If this function is called  with  a
+       When a match data block itself is no longer needed, it should be  freed
+       by  calling  pcre2_match_data_free(). If this function is called with a
        NULL argument, it returns immediately, without doing anything.



@@ -2444,15 +2435,15 @@
          uint32_t options, pcre2_match_data *match_data,
          pcre2_match_context *mcontext);


-       The  function pcre2_match() is called to match a subject string against
-       a compiled pattern, which is passed in the code argument. You can  call
+       The function pcre2_match() is called to match a subject string  against
+       a  compiled pattern, which is passed in the code argument. You can call
        pcre2_match() with the same code argument as many times as you like, in
-       order to find multiple matches in the subject string or to  match  dif-
+       order  to  find multiple matches in the subject string or to match dif-
        ferent subject strings with the same pattern.


-       This  function  is  the  main  matching facility of the library, and it
-       operates in a Perl-like manner. For specialist use  there  is  also  an
-       alternative  matching function, which is described below in the section
+       This function is the main matching facility of the library, and it  op-
+       erates  in  a Perl-like manner. For specialist use there is also an al-
+       ternative matching function, which is described below  in  the  section
        about the pcre2_dfa_match() function.


        Here is an example of a simple call to pcre2_match():
@@ -2467,7 +2458,7 @@
            md,             /* the match data block */
            NULL);          /* a match context; NULL means use defaults */


-       If the subject string is zero-terminated, the length can  be  given  as
+       If  the  subject  string is zero-terminated, the length can be given as
        PCRE2_ZERO_TERMINATED. A match context must be provided if certain less
        common matching parameters are to be changed. For details, see the sec-
        tion on the match context above.
@@ -2474,110 +2465,110 @@


    The string to be matched by pcre2_match()


-       The  subject string is passed to pcre2_match() as a pointer in subject,
-       a length in length, and a starting offset in  startoffset.  The  length
-       and  offset  are  in  code units, not characters.  That is, they are in
-       bytes for the 8-bit library, 16-bit code units for the 16-bit  library,
-       and  32-bit  code units for the 32-bit library, whether or not UTF pro-
+       The subject string is passed to pcre2_match() as a pointer in  subject,
+       a  length  in  length, and a starting offset in startoffset. The length
+       and offset are in code units, not characters.  That  is,  they  are  in
+       bytes  for the 8-bit library, 16-bit code units for the 16-bit library,
+       and 32-bit code units for the 32-bit library, whether or not  UTF  pro-
        cessing is enabled.


        If startoffset is greater than the length of the subject, pcre2_match()
-       returns  PCRE2_ERROR_BADOFFSET.  When  the starting offset is zero, the
-       search for a match starts at the beginning of the subject, and this  is
+       returns PCRE2_ERROR_BADOFFSET. When the starting offset  is  zero,  the
+       search  for a match starts at the beginning of the subject, and this is
        by far the most common case. In UTF-8 or UTF-16 mode, the starting off-
-       set must point to the start of a character, or to the end of  the  sub-
-       ject  (in  UTF-32 mode, one code unit equals one character, so all off-
-       sets are valid). Like the  pattern  string,  the  subject  may  contain
-       binary zeros.
+       set  must  point to the start of a character, or to the end of the sub-
+       ject (in UTF-32 mode, one code unit equals one character, so  all  off-
+       sets  are  valid). Like the pattern string, the subject may contain bi-
+       nary zeros.
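
The rule for starting offsets in UTF-8 mode can be checked by the caller
before the match is attempted. The sketch below is a hypothetical helper, not
part of PCRE2, and it assumes the subject is already known to be valid UTF-8.
It relies only on the fact that UTF-8 continuation bytes have the bit pattern
10xxxxxx.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not part of PCRE2. Returns true if offset (in code
   units, i.e. bytes for the 8-bit library) is usable as a starting offset
   for a valid UTF-8 subject of the given length. */
bool utf8_offset_is_char_start(const uint8_t *subject, size_t length,
                               size_t offset)
{
    if (offset > length)
        return false;      /* beyond the end: PCRE2_ERROR_BADOFFSET case */
    if (offset == length)
        return true;       /* the end of the subject is allowed */
    /* A continuation byte (10xxxxxx) is the middle of a character. */
    return (subject[offset] & 0xC0) != 0x80;
}
```

For the four-byte subject "a", 0xC3, 0xA9, "z" (the letter e-acute sits
between two ASCII letters), offsets 0, 1, 3, and 4 pass the check, while
offset 2 points into the middle of the two-byte character and fails.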


-       A  non-zero  starting offset is useful when searching for another match
-       in the same subject by calling pcre2_match()  again  after  a  previous
-       success.   Setting  startoffset  differs  from passing over a shortened
-       string and setting PCRE2_NOTBOL in the case of a  pattern  that  begins
+       A non-zero starting offset is useful when searching for  another  match
+       in  the  same  subject  by calling pcre2_match() again after a previous
+       success.  Setting startoffset differs from  passing  over  a  shortened
+       string  and  setting  PCRE2_NOTBOL in the case of a pattern that begins
        with any kind of lookbehind. For example, consider the pattern


          \Biss\B


-       which  finds  occurrences  of "iss" in the middle of words. (\B matches
-       only if the current position in the subject is not  a  word  boundary.)
+       which finds occurrences of "iss" in the middle of  words.  (\B  matches
+       only  if  the  current position in the subject is not a word boundary.)
        When applied to the string "Mississippi" the first call to pcre2_match()
-       finds the first occurrence. If pcre2_match() is called again with  just
-       the  remainder  of  the  subject,  namely "issippi", it does not match,
-       because \B is always false at the start of the subject, which is deemed
-       to  be  a word boundary. However, if pcre2_match() is passed the entire
+       finds  the first occurrence. If pcre2_match() is called again with just
+       the remainder of the subject, namely "issippi", it does not match,  be-
+       cause  \B  is always false at the start of the subject, which is deemed
+       to be a word boundary. However, if pcre2_match() is passed  the  entire
        string again, but with startoffset set to 4, it finds the second occur-
-       rence  of "iss" because it is able to look behind the starting point to
+       rence of "iss" because it is able to look behind the starting point  to
        discover that it is preceded by a letter.


-       Finding all the matches in a subject is tricky  when  the  pattern  can
+       Finding  all  the  matches  in a subject is tricky when the pattern can
        match an empty string. It is possible to emulate Perl's /g behaviour by
-       first  trying  the  match  again  at  the   same   offset,   with   the
-       PCRE2_NOTEMPTY_ATSTART  and  PCRE2_ANCHORED  options,  and then if that
-       fails, advancing the starting  offset  and  trying  an  ordinary  match
-       again.  There  is  some  code  that  demonstrates how to do this in the
-       pcre2demo sample program. In the most general case, you have  to  check
-       to  see  if the newline convention recognizes CRLF as a newline, and if
-       so, and the current character is CR followed by LF, advance the  start-
+       first   trying   the   match   again  at  the  same  offset,  with  the
+       PCRE2_NOTEMPTY_ATSTART and PCRE2_ANCHORED options,  and  then  if  that
+       fails,  advancing  the  starting  offset  and  trying an ordinary match
+       again. There is some code that demonstrates  how  to  do  this  in  the
+       pcre2demo  sample  program. In the most general case, you have to check
+       to see if the newline convention recognizes CRLF as a newline,  and  if
+       so,  and the current character is CR followed by LF, advance the start-
        ing offset by two characters instead of one.
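
Editorial illustration (not part of the committed text): the "advance the starting offset" step described above might look roughly like the sketch below. It is modelled loosely on the approach the paragraph attributes to pcre2demo, but the function name next_start_offset and its parameters are hypothetical. The caller is assumed to have already determined whether the newline convention recognizes CRLF and whether the pattern was compiled in UTF-8 mode.

```c
#include <stddef.h>

/* Hypothetical helper, not part of PCRE2.
   Computes the offset at which to retry an ordinary match after the
   anchored, non-empty retry at `offset` has failed.
   crlf_is_newline: non-zero if the newline convention recognizes CRLF.
   utf8:            non-zero if the subject is UTF-8 (8-bit library). */
size_t next_start_offset(const unsigned char *subject, size_t length,
                         size_t offset, int crlf_is_newline, int utf8)
{
    if (offset >= length) return length;   /* nothing left to scan */

    /* At a CR LF pair that counts as one newline: skip both characters. */
    if (crlf_is_newline && subject[offset] == '\r' &&
        offset + 1 < length && subject[offset + 1] == '\n')
        return offset + 2;

    /* Otherwise advance by one character. In UTF-8 mode that means one
       lead byte plus any continuation bytes (10xxxxxx) that follow it. */
    offset++;
    if (utf8)
        while (offset < length && (subject[offset] & 0xC0) == 0x80)
            offset++;
    return offset;
}
```

When the returned value equals the subject length there is nothing left to scan except a possible empty match at the very end.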


        If a non-zero starting offset is passed when the pattern is anchored, a
        single attempt to match at the given offset is made. This can only suc-
-       ceed  if  the  pattern does not require the match to be at the start of
-       the subject. In other words, the anchoring must be the result  of  set-
-       ting  the PCRE2_ANCHORED option or the use of .* with PCRE2_DOTALL, not
+       ceed if the pattern does not require the match to be at  the  start  of
+       the  subject.  In other words, the anchoring must be the result of set-
+       ting the PCRE2_ANCHORED option or the use of .* with PCRE2_DOTALL,  not
        by starting the pattern with ^ or \A.


    Option bits for pcre2_match()


        The unused bits of the options argument for pcre2_match() must be zero.
-       The    only    bits    that    may    be    set   are   PCRE2_ANCHORED,
-       PCRE2_COPY_MATCHED_SUBJECT,      PCRE2_ENDANCHORED,       PCRE2_NOTBOL,
-       PCRE2_NOTEOL,   PCRE2_NOTEMPTY,  PCRE2_NOTEMPTY_ATSTART,  PCRE2_NO_JIT,
-       PCRE2_NO_UTF_CHECK, PCRE2_PARTIAL_HARD, and  PCRE2_PARTIAL_SOFT.  Their
+       The   only   bits    that    may    be    set    are    PCRE2_ANCHORED,
+       PCRE2_COPY_MATCHED_SUBJECT,  PCRE2_ENDANCHORED, PCRE2_NOTBOL, PCRE2_NO-
+       TEOL,     PCRE2_NOTEMPTY,     PCRE2_NOTEMPTY_ATSTART,     PCRE2_NO_JIT,
+       PCRE2_NO_UTF_CHECK,  PCRE2_PARTIAL_HARD,  and PCRE2_PARTIAL_SOFT. Their
        action is described below.


-       Setting  PCRE2_ANCHORED  or PCRE2_ENDANCHORED at match time is not sup-
-       ported by the just-in-time (JIT) compiler. If it is set,  JIT  matching
-       is  disabled  and  the interpretive code in pcre2_match() is run. Apart
-       from PCRE2_NO_JIT (obviously), the remaining options are supported  for
+       Setting PCRE2_ANCHORED or PCRE2_ENDANCHORED at match time is  not  sup-
+       ported  by  the just-in-time (JIT) compiler. If it is set, JIT matching
+       is disabled and the interpretive code in pcre2_match()  is  run.  Apart
+       from  PCRE2_NO_JIT (obviously), the remaining options are supported for
        JIT matching.


          PCRE2_ANCHORED


        The PCRE2_ANCHORED option limits pcre2_match() to matching at the first
-       matching position. If a pattern was compiled  with  PCRE2_ANCHORED,  or
-       turned  out to be anchored by virtue of its contents, it cannot be made
-       unachored at matching time. Note that setting the option at match  time
+       matching  position.  If  a pattern was compiled with PCRE2_ANCHORED, or
+       turned out to be anchored by virtue of its contents, it cannot be  made
+       unanchored at matching time. Note that setting the option at match time
        disables JIT matching.


          PCRE2_COPY_MATCHED_SUBJECT


-       By  default,  a  pointer to the subject is remembered in the match data
-       block so that, after a successful match, it can be  referenced  by  the
-       substring  extraction  functions.  This means that the subject's memory
-       must not be freed until all such  operations  are  complete.  For  some
-       applications  where  the  lifetime of the subject string is not guaran-
-       teed, it may be necessary to make a copy of the subject string, but  it
-       is wasteful to do this unless the match is successful. After a success-
-       ful match, if PCRE2_COPY_MATCHED_SUBJECT is set, the subject is  copied
-       and  the  new  pointer is remembered in the match data block instead of
-       the original subject pointer. The memory allocator that  was  used  for
-       the  match  block  itself is used. The copy is automatically freed when
-       pcre2_match_data_free() is called to free the match data block.  It  is
+       By default, a pointer to the subject is remembered in  the  match  data
+       block  so  that,  after a successful match, it can be referenced by the
+       substring extraction functions. This means that  the  subject's  memory
+       must  not be freed until all such operations are complete. For some ap-
+       plications where the lifetime of the subject string is not  guaranteed,
+       it  may  be  necessary  to make a copy of the subject string, but it is
+       wasteful to do this unless the match is successful. After a  successful
+       match,  if PCRE2_COPY_MATCHED_SUBJECT is set, the subject is copied and
+       the new pointer is remembered in the match data block  instead  of  the
+       original  subject  pointer.  The memory allocator that was used for the
+       match block itself is  used.  The  copy  is  automatically  freed  when
+       pcre2_match_data_free()  is  called to free the match data block. It is
        also automatically freed if the match data block is re-used for another
        match operation.


          PCRE2_ENDANCHORED


-       If the PCRE2_ENDANCHORED option is set, any string  that  pcre2_match()
-       matches  must be right at the end of the subject string. Note that set-
+       If  the  PCRE2_ENDANCHORED option is set, any string that pcre2_match()
+       matches must be right at the end of the subject string. Note that  set-
        ting the option at match time disables JIT matching.


          PCRE2_NOTBOL


        This option specifies that the first character of the subject string is not
-       the  beginning  of  a  line, so the circumflex metacharacter should not
-       match before it. Setting this without  having  set  PCRE2_MULTILINE  at
+       the beginning of a line, so the  circumflex  metacharacter  should  not
+       match  before  it.  Setting  this without having set PCRE2_MULTILINE at
        compile time causes circumflex never to match. This option affects only
        the behaviour of the circumflex metacharacter. It does not affect \A.


@@ -2584,9 +2575,9 @@
          PCRE2_NOTEOL


        This option specifies that the end of the subject string is not the end
-       of  a line, so the dollar metacharacter should not match it nor (except
-       in multiline mode) a newline immediately before it. Setting this  with-
-       out  having  set PCRE2_MULTILINE at compile time causes dollar never to
+       of a line, so the dollar metacharacter should not match it nor  (except
+       in  multiline mode) a newline immediately before it. Setting this with-
+       out having set PCRE2_MULTILINE at compile time causes dollar  never  to
        match. This option affects only the behaviour of the dollar metacharac-
        ter. It does not affect \Z or \z.


@@ -2593,85 +2584,85 @@
          PCRE2_NOTEMPTY


        An empty string is not considered to be a valid match if this option is
-       set. If there are alternatives in the pattern, they are tried.  If  all
-       the  alternatives  match  the empty string, the entire match fails. For
+       set.  If  there are alternatives in the pattern, they are tried. If all
+       the alternatives match the empty string, the entire  match  fails.  For
        example, if the pattern


          a?b?


-       is applied to a string not beginning with "a" or  "b",  it  matches  an
+       is  applied  to  a  string not beginning with "a" or "b", it matches an
        empty string at the start of the subject. With PCRE2_NOTEMPTY set, this
-       match is not valid, so pcre2_match() searches further into  the  string
+       match  is  not valid, so pcre2_match() searches further into the string
        for occurrences of "a" or "b".


          PCRE2_NOTEMPTY_ATSTART


-       This  is  like PCRE2_NOTEMPTY, except that it locks out an empty string
+       This is like PCRE2_NOTEMPTY, except that it locks out an  empty  string
        match only at the first matching position, that is, at the start of the
-       subject  plus  the  starting offset. An empty string match later in the
-       subject is permitted.  If the pattern is anchored,  such  a  match  can
-       occur only if the pattern contains \K.
+       subject plus the starting offset. An empty string match  later  in  the
+       subject is permitted.  If the pattern is anchored, such a match can oc-
+       cur only if the pattern contains \K.


          PCRE2_NO_JIT


-       By   default,   if   a  pattern  has  been  successfully  processed  by
-       pcre2_jit_compile(), JIT is automatically used  when  pcre2_match()  is
-       called  with  options  that JIT supports. Setting PCRE2_NO_JIT disables
+       By  default,  if  a  pattern  has  been   successfully   processed   by
+       pcre2_jit_compile(),  JIT  is  automatically used when pcre2_match() is
+       called with options that JIT supports.  Setting  PCRE2_NO_JIT  disables
        the use of JIT; it forces matching to be done by the interpreter.


          PCRE2_NO_UTF_CHECK


        When PCRE2_UTF is set at compile time, the validity of the subject as a
-       UTF   string   is   checked  unless  PCRE2_NO_UTF_CHECK  is  passed  to
+       UTF  string  is  checked  unless  PCRE2_NO_UTF_CHECK   is   passed   to
        pcre2_match() or PCRE2_MATCH_INVALID_UTF was passed to pcre2_compile().
        The latter special case is discussed in detail in the pcre2unicode doc-
        umentation.


-       In the default case, if a non-zero starting offset is given, the  check
-       is  applied  only  to  that part of the subject that could be inspected
-       during matching, and there is a check that the starting  offset  points
-       to  the first code unit of a character or to the end of the subject. If
-       there are no lookbehind assertions in the pattern, the check starts  at
+       In  the default case, if a non-zero starting offset is given, the check
+       is applied only to that part of the subject  that  could  be  inspected
+       during  matching,  and there is a check that the starting offset points
+       to the first code unit of a character or to the end of the subject.  If
+       there  are no lookbehind assertions in the pattern, the check starts at
        the starting offset.  Otherwise, it starts at the length of the longest
-       lookbehind before the starting offset, or at the start of  the  subject
-       if  there are not that many characters before the starting offset. Note
+       lookbehind  before  the starting offset, or at the start of the subject
+       if there are not that many characters before the starting offset.  Note
        that the sequences \b and \B are one-character lookbehinds.


        The check is carried out before any other processing takes place, and a
-       negative  error  code is returned if the check fails. There are several
-       UTF error codes for each code unit width,  corresponding  to  different
-       problems  with  the code unit sequence. There are discussions about the
-       validity of UTF-8 strings, UTF-16 strings, and UTF-32  strings  in  the
+       negative error code is returned if the check fails. There  are  several
+       UTF  error  codes  for each code unit width, corresponding to different
+       problems with the code unit sequence. There are discussions  about  the
+       validity  of  UTF-8  strings, UTF-16 strings, and UTF-32 strings in the
        pcre2unicode documentation.


        If you know that your subject is valid, and you want to skip this check
        for performance reasons, you can set the PCRE2_NO_UTF_CHECK option when
-       calling  pcre2_match().  You  might  want to do this for the second and
-       subsequent calls to pcre2_match() if you are making repeated  calls  to
+       calling pcre2_match(). You might want to do this  for  the  second  and
+       subsequent  calls  to pcre2_match() if you are making repeated calls to
        find multiple matches in the same subject string.


-       Warning:  Unless  PCRE2_MATCH_INVALID_UTF was set at compile time, when
-       PCRE2_NO_UTF_CHECK is set at  match  time  the  effect  of  passing  an
-       invalid  string  as  a  subject, or an invalid value of startoffset, is
-       undefined.  Your program may crash or loop indefinitely or  give  wrong
-       results.
+       Warning: Unless PCRE2_MATCH_INVALID_UTF was set at compile  time,  when
+       PCRE2_NO_UTF_CHECK  is  set  at match time the effect of passing an in-
+       valid string as a subject, or an invalid value of startoffset, is unde-
+       fined.   Your  program may crash or loop indefinitely or give wrong re-
+       sults.


          PCRE2_PARTIAL_HARD
          PCRE2_PARTIAL_SOFT


-       These  options  turn  on  the partial matching feature. A partial match
-       occurs if the end of the subject string is  reached  successfully,  but
-       there  are not enough subject characters to complete the match. If this
-       happens when PCRE2_PARTIAL_SOFT (but not  PCRE2_PARTIAL_HARD)  is  set,
-       matching  continues  by  testing any remaining alternatives. Only if no
-       complete match can be found is PCRE2_ERROR_PARTIAL returned instead  of
-       PCRE2_ERROR_NOMATCH.  In other words, PCRE2_PARTIAL_SOFT specifies that
-       the caller is prepared to handle a partial match, but only if  no  com-
+       These options turn on the partial matching feature. A partial match oc-
+       curs  if  the  end  of  the subject string is reached successfully, but
+       there are not enough subject characters to complete the match. If  this
+       happens  when  PCRE2_PARTIAL_SOFT  (but not PCRE2_PARTIAL_HARD) is set,
+       matching continues by testing any remaining alternatives.  Only  if  no
+       complete  match can be found is PCRE2_ERROR_PARTIAL returned instead of
+       PCRE2_ERROR_NOMATCH. In other words, PCRE2_PARTIAL_SOFT specifies  that
+       the  caller  is prepared to handle a partial match, but only if no com-
        plete match can be found.


-       If  PCRE2_PARTIAL_HARD is set, it overrides PCRE2_PARTIAL_SOFT. In this
-       case, if a partial match is found,  pcre2_match()  immediately  returns
-       PCRE2_ERROR_PARTIAL,  without  considering  any  other alternatives. In
+       If PCRE2_PARTIAL_HARD is set, it overrides PCRE2_PARTIAL_SOFT. In  this
+       case,  if  a  partial match is found, pcre2_match() immediately returns
+       PCRE2_ERROR_PARTIAL, without considering  any  other  alternatives.  In
        other words, when PCRE2_PARTIAL_HARD is set, a partial match is consid-
        ered to be more important than an alternative complete match.


@@ -2681,38 +2672,38 @@

NEWLINE HANDLING WHEN MATCHING

-       When PCRE2 is built, a default newline convention is set; this is  usu-
-       ally  the standard convention for the operating system. The default can
-       be overridden in a compile context by calling  pcre2_set_newline().  It
-       can  also be overridden by starting a pattern string with, for example,
-       (*CRLF), as described in the section  on  newline  conventions  in  the
-       pcre2pattern  page. During matching, the newline choice affects the be-
-       haviour of the dot, circumflex, and dollar metacharacters. It may  also
-       alter  the  way  the  match starting position is advanced after a match
+       When  PCRE2 is built, a default newline convention is set; this is usu-
+       ally the standard convention for the operating system. The default  can
+       be  overridden  in a compile context by calling pcre2_set_newline(). It
+       can also be overridden by starting a pattern string with, for  example,
+       (*CRLF),  as  described  in  the  section on newline conventions in the
+       pcre2pattern page. During matching, the newline choice affects the  be-
+       haviour  of the dot, circumflex, and dollar metacharacters. It may also
+       alter the way the match starting position is  advanced  after  a  match
        failure for an unanchored pattern.


        When PCRE2_NEWLINE_CRLF, PCRE2_NEWLINE_ANYCRLF, or PCRE2_NEWLINE_ANY is
-       set  as  the  newline convention, and a match attempt for an unanchored
+       set as the newline convention, and a match attempt  for  an  unanchored
        pattern fails when the current starting position is at a CRLF sequence,
-       and  the  pattern contains no explicit matches for CR or LF characters,
-       the match position is advanced by two characters  instead  of  one,  in
+       and the pattern contains no explicit matches for CR or  LF  characters,
+       the  match  position  is  advanced by two characters instead of one, in
        other words, to after the CRLF.


        The above rule is a compromise that makes the most common cases work as
-       expected. For example, if the pattern  is  .+A  (and  the  PCRE2_DOTALL
-       option is not set), it does not match the string "\r\nA" because, after
-       failing at the start, it skips both the CR and the LF before  retrying.
-       However,  the  pattern  [\r\n]A does match that string, because it con-
+       expected.  For example, if the pattern is .+A (and the PCRE2_DOTALL op-
+       tion is not set), it does not match the string "\r\nA"  because,  after
+       failing  at the start, it skips both the CR and the LF before retrying.
+       However, the pattern [\r\n]A does match that string,  because  it  con-
        tains an explicit CR or LF reference, and so advances only by one char-
        acter after the first failure.


        An explicit match for CR or LF is either a literal appearance of one of
-       those characters in the pattern, or one of the \r or \n  or  equivalent
+       those  characters  in the pattern, or one of the \r or \n or equivalent
        octal or hexadecimal escape sequences. Implicit matches such as [^X] do
-       not count, nor does \s, even though it includes CR and LF in the  char-
+       not  count, nor does \s, even though it includes CR and LF in the char-
        acters that it matches.


-       Notwithstanding  the above, anomalous effects may still occur when CRLF
+       Notwithstanding the above, anomalous effects may still occur when  CRLF
        is a valid newline sequence and explicit \r or \n escapes appear in the
        pattern.


@@ -2723,82 +2714,82 @@

        PCRE2_SIZE *pcre2_get_ovector_pointer(pcre2_match_data *match_data);


-       In  general, a pattern matches a certain portion of the subject, and in
-       addition, further substrings from the subject  may  be  picked  out  by
-       parenthesized  parts  of  the  pattern.  Following the usage in Jeffrey
-       Friedl's book, this is called "capturing"  in  what  follows,  and  the
-       phrase  "capture  group" (Perl terminology) is used for a fragment of a
-       pattern that picks out a substring. PCRE2 supports several other  kinds
+       In general, a pattern matches a certain portion of the subject, and  in
+       addition,  further  substrings  from  the  subject may be picked out by
+       parenthesized parts of the pattern.  Following  the  usage  in  Jeffrey
+       Friedl's  book,  this  is  called  "capturing" in what follows, and the
+       phrase "capture group" (Perl terminology) is used for a fragment  of  a
+       pattern  that picks out a substring. PCRE2 supports several other kinds
        of parenthesized group that do not cause substrings to be captured. The
-       pcre2_pattern_info() function can be used to find out how many  capture
+       pcre2_pattern_info()  function can be used to find out how many capture
        groups there are in a compiled pattern.


-       You  can  use  auxiliary functions for accessing captured substrings by
+       You can use auxiliary functions for accessing  captured  substrings  by
        number or by name, as described in sections below.


        Alternatively, you can make direct use of the vector of PCRE2_SIZE val-
-       ues,  called  the  ovector,  which  contains  the  offsets  of captured
-       strings.  It  is  part  of  the  match  data   block.    The   function
-       pcre2_get_ovector_pointer()  returns  the  address  of the ovector, and
+       ues, called  the  ovector,  which  contains  the  offsets  of  captured
+       strings.   It   is   part  of  the  match  data  block.   The  function
+       pcre2_get_ovector_pointer() returns the address  of  the  ovector,  and
        pcre2_get_ovector_count() returns the number of pairs of values it con-
        tains.


        Within the ovector, the first in each pair of values is set to the off-
        set of the first code unit of a substring, and the second is set to the
-       offset  of the first code unit after the end of a substring. These val-
-       ues are always code unit offsets, not character offsets. That is,  they
-       are  byte  offsets  in  the 8-bit library, 16-bit offsets in the 16-bit
-       library, and 32-bit offsets in the 32-bit library.
+       offset of the first code unit after the end of a substring. These  val-
+       ues  are always code unit offsets, not character offsets. That is, they
+       are byte offsets in the 8-bit library, 16-bit offsets in the 16-bit li-
+       brary, and 32-bit offsets in the 32-bit library.


-       After a partial match  (error  return  PCRE2_ERROR_PARTIAL),  only  the
-       first  pair  of  offsets  (that is, ovector[0] and ovector[1]) are set.
-       They identify the part of the subject that was partially  matched.  See
+       After  a  partial  match  (error  return PCRE2_ERROR_PARTIAL), only the
+       first pair of offsets (that is, ovector[0]  and  ovector[1])  are  set.
+       They  identify  the part of the subject that was partially matched. See
        the pcre2partial documentation for details of partial matching.


-       After  a  fully  successful match, the first pair of offsets identifies
-       the portion of the subject string that was matched by the  entire  pat-
-       tern.  The  next  pair is used for the first captured substring, and so
-       on. The value returned by pcre2_match() is one more  than  the  highest
-       numbered  pair  that  has been set. For example, if two substrings have
-       been captured, the returned value is 3. If there are no  captured  sub-
+       After a fully successful match, the first pair  of  offsets  identifies
+       the  portion  of the subject string that was matched by the entire pat-
+       tern. The next pair is used for the first captured  substring,  and  so
+       on.  The  value  returned by pcre2_match() is one more than the highest
+       numbered pair that has been set. For example, if  two  substrings  have
+       been  captured,  the returned value is 3. If there are no captured sub-
        strings, the return value from a successful match is 1, indicating that
        just the first pair of offsets has been set.


-       If a pattern uses the \K escape sequence within a  positive  assertion,
+       If  a  pattern uses the \K escape sequence within a positive assertion,
        the reported start of a successful match can be greater than the end of
-       the match.  For example, if the pattern  (?=ab\K)  is  matched  against
+       the  match.   For  example,  if the pattern (?=ab\K) is matched against
        "ab", the start and end offset values for the match are 2 and 0.


-       If  a  capture group is matched repeatedly within a single match opera-
-       tion, it is the last portion of the subject that  it  matched  that  is
-       returned.
+       If a capture group is matched repeatedly within a single  match  opera-
+       tion, it is the last portion of the subject that it matched that is re-
+       turned.


        If the ovector is too small to hold all the captured substring offsets,
-       as much as possible is filled in, and the function returns a  value  of
-       zero.  If captured substrings are not of interest, pcre2_match() may be
+       as  much  as possible is filled in, and the function returns a value of
+       zero. If captured substrings are not of interest, pcre2_match() may  be
        called with a match data block whose ovector is of minimum length (that
        is, one pair).


-       It  is  possible for capture group number n+1 to match some part of the
-       subject when group n has not been used at  all.  For  example,  if  the
+       It is possible for capture group number n+1 to match some part  of  the
+       subject  when  group  n  has  not been used at all. For example, if the
        string "abc" is matched against the pattern (a|(z))(bc) the return from
-       the function is 4, and groups 1 and 3 are matched, but 2 is  not.  When
-       this  happens,  both values in the offset pairs corresponding to unused
+       the  function  is 4, and groups 1 and 3 are matched, but 2 is not. When
+       this happens, both values in the offset pairs corresponding  to  unused
        groups are set to PCRE2_UNSET.
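
Editorial illustration (not part of the committed text): the ovector rules above can be exercised without calling the library. The sketch below assumes the offsets have already been obtained (for example via pcre2_get_ovector_pointer()) and uses a local stand-in, OFFSET_UNSET, defined the same way as PCRE2_UNSET, so that it compiles without pcre2.h. The function name copy_group is hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Stand-in for PCRE2_UNSET so that this sketch needs no PCRE2 headers. */
#define OFFSET_UNSET (~(size_t)0)

/* Hypothetical helper, not part of PCRE2.
   Copies capture group n into buf as a NUL-terminated string.
   rc is the (positive) value returned by the match function, that is,
   one more than the highest numbered pair that was set.
   Returns the substring length, or -1 if the group is unset, lies at or
   beyond rc, has start > end (possible when \K is used inside an
   assertion), or does not fit into buf. */
long copy_group(const char *subject, const size_t *ovector, int rc, int n,
                char *buf, size_t bufsize)
{
    if (n < 0 || n >= rc) return -1;

    size_t start = ovector[2 * n];
    size_t end = ovector[2 * n + 1];

    if (start == OFFSET_UNSET || end == OFFSET_UNSET) return -1;
    if (end < start) return -1;

    size_t len = end - start;
    if (len + 1 > bufsize) return -1;

    memcpy(buf, subject + start, len);
    buf[len] = '\0';
    return (long)len;
}
```

For the example in the text, "abc" against (a|(z))(bc), the offsets would be {0,3, 0,1, unset,unset, 1,3} with a return value of 4: groups 0, 1 and 3 yield "abc", "a" and "bc", and group 2 is reported as unset.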


-       Offset values that correspond to  unused  groups  at  the  end  of  the
-       expression  are  also  set  to  PCRE2_UNSET. For example, if the string
-       "abc" is matched against the pattern (abc)(x(yz)?)? groups 2 and 3  are
-       not  matched.  The  return  from the function is 2, because the highest
-       used capture group number is 1. The offsets  for  for  the  second  and
-       third  capture groupss (assuming the vector is large enough, of course)
-       are set to PCRE2_UNSET.
+       Offset  values  that  correspond to unused groups at the end of the ex-
+       pression are also set to PCRE2_UNSET. For example, if the string  "abc"
+       is  matched  against  the pattern (abc)(x(yz)?)? groups 2 and 3 are not
+       matched. The return from the function is 2, because  the  highest  used
+       capture  group  number  is  1.  The  offsets  for  the second and third
+       capture  groups (assuming the vector is large enough,  of  course)  are
+       set to PCRE2_UNSET.


        Elements in the ovector that do not correspond to capturing parentheses
        in the pattern are never changed. That is, if a pattern contains n cap-
        turing parentheses, no more than ovector[0] to ovector[2n+1] are set by
-       pcre2_match().  The  other  elements retain whatever values they previ-
-       ously had. After a failed match attempt, the contents  of  the  ovector
+       pcre2_match(). The other elements retain whatever  values  they  previ-
+       ously  had.  After  a failed match attempt, the contents of the ovector
        are unchanged.



@@ -2808,55 +2799,55 @@

        PCRE2_SIZE pcre2_get_startchar(pcre2_match_data *match_data);


-       As  well as the offsets in the ovector, other information about a match
-       is retained in the match data block and can be retrieved by  the  above
-       functions  in  appropriate  circumstances.  If they are called at other
+       As well as the offsets in the ovector, other information about a  match
+       is  retained  in the match data block and can be retrieved by the above
+       functions in appropriate circumstances. If they  are  called  at  other
        times, the result is undefined.


-       After a successful match, a partial match (PCRE2_ERROR_PARTIAL),  or  a
-       failure  to  match (PCRE2_ERROR_NOMATCH), a mark name may be available.
-       The function pcre2_get_mark() can be called to access this name,  which
-       can  be  specified  in  the  pattern by any of the backtracking control
+       After  a  successful match, a partial match (PCRE2_ERROR_PARTIAL), or a
+       failure to match (PCRE2_ERROR_NOMATCH), a mark name may  be  available.
+       The  function pcre2_get_mark() can be called to access this name, which
+       can be specified in the pattern by  any  of  the  backtracking  control
        verbs, not just (*MARK). The same function applies to all the verbs. It
        returns a pointer to the zero-terminated name, which is within the com-
        piled pattern. If no name is available, NULL is returned. The length of
-       the  name  (excluding  the terminating zero) is stored in the code unit
-       that precedes the name. You should use this length instead  of  relying
+       the name (excluding the terminating zero) is stored in  the  code  unit
+       that  precedes  the name. You should use this length instead of relying
        on the terminating zero if the name might contain a binary zero.
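
Editorial illustration (not part of the committed text): reading the stored length of a mark name, as the paragraph above recommends, could be sketched as follows for the 8-bit library. The function name mark_name_length is hypothetical, and the pointer is assumed to be one returned by pcre2_get_mark() (or NULL).

```c
#include <stddef.h>

/* Hypothetical helper, not part of PCRE2.
   `mark` is assumed to be a pointer as returned by pcre2_get_mark() in
   the 8-bit library, or NULL if no name is available. The length of the
   name, excluding the terminating zero, is read from the code unit that
   immediately precedes the name, so names containing binary zeros are
   measured correctly. */
size_t mark_name_length(const unsigned char *mark)
{
    return (mark == NULL) ? 0 : (size_t)mark[-1];
}
```

With a stored name of three code units 'A', zero, 'B', this reports 3, whereas strlen() would stop at the embedded zero and report 1.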


-       After  a  successful  match, the name that is returned is the last mark
+       After a successful match, the name that is returned is  the  last  mark
        name encountered on the matching path through the pattern. Instances of
-       backtracking  verbs  without  names do not count. Thus, for example, if
+       backtracking verbs without names do not count. Thus,  for  example,  if
        the matching path contains (*MARK:A)(*PRUNE), the name "A" is returned.
-       After  a  "no  match"  or a partial match, the last encountered name is
-       returned. For example, consider this pattern:
+       After a "no match" or a partial match, the last encountered name is re-
+       turned. For example, consider this pattern:


          ^(*MARK:A)((*MARK:B)a|b)c


-       When it matches "bc", the returned name is A. The B mark is  "seen"  in
-       the  first  branch of the group, but it is not on the matching path. On
-       the other hand, when this pattern fails to  match  "bx",  the  returned
+       When  it  matches "bc", the returned name is A. The B mark is "seen" in
+       the first branch of the group, but it is not on the matching  path.  On
+       the  other  hand,  when  this pattern fails to match "bx", the returned
        name is B.


-       Warning:  By  default, certain start-of-match optimizations are used to
-       give a fast "no match" result in some situations. For example,  if  the
-       anchoring  is removed from the pattern above, there is an initial check
-       for the presence of "c" in the  subject  before  running  the  matching
-       engine. This check fails for "bx", causing a match failure without see-
-       ing any marks. You can disable the start-of-match optimizations by set-
-       ting  the  PCRE2_NO_START_OPTIMIZE  option  for  pcre2_compile()  or by
-       starting the pattern with (*NO_START_OPT).
+       Warning: By default, certain start-of-match optimizations are  used  to
+       give  a  fast "no match" result in some situations. For example, if the
+       anchoring is removed from the pattern above, there is an initial  check
+       for  the presence of "c" in the subject before running the matching en-
+       gine. This check fails for "bx", causing a match failure without seeing
+       any  marks. You can disable the start-of-match optimizations by setting
+       the PCRE2_NO_START_OPTIMIZE option for pcre2_compile() or  by  starting
+       the pattern with (*NO_START_OPT).


-       After a successful match, a partial match, or one of  the  invalid  UTF
-       errors  (for example, PCRE2_ERROR_UTF8_ERR5), pcre2_get_startchar() can
+       After  a  successful  match, a partial match, or one of the invalid UTF
+       errors (for example, PCRE2_ERROR_UTF8_ERR5), pcre2_get_startchar()  can
        be called. After a successful or partial match it returns the code unit
-       offset  of  the character at which the match started. For a non-partial
-       match, this can be different to the value of ovector[0] if the  pattern
-       contains  the  \K escape sequence. After a partial match, however, this
-       value is always the same as ovector[0] because \K does not  affect  the
+       offset of the character at which the match started. For  a  non-partial
+       match,  this can be different to the value of ovector[0] if the pattern
+       contains the \K escape sequence. After a partial match,  however,  this
+       value  is  always the same as ovector[0] because \K does not affect the
        result of a partial match.


-       After  a UTF check failure, pcre2_get_startchar() can be used to obtain
+       After a UTF check failure, pcre2_get_startchar() can be used to  obtain
        the code unit offset of the invalid UTF character. Details are given in
        the pcre2unicode page.


@@ -2863,14 +2854,14 @@

ERROR RETURNS FROM pcre2_match()

-       If  pcre2_match() fails, it returns a negative number. This can be con-
-       verted to a text string by calling the pcre2_get_error_message()  func-
-       tion  (see  "Obtaining a textual error message" below).  Negative error
-       codes are also returned by other functions,  and  are  documented  with
-       them.  The codes are given names in the header file. If UTF checking is
+       If pcre2_match() fails, it returns a negative number. This can be  con-
+       verted  to a text string by calling the pcre2_get_error_message() func-
+       tion (see "Obtaining a textual error message" below).   Negative  error
+       codes  are  also  returned  by other functions, and are documented with
+       them. The codes are given names in the header file. If UTF checking  is
        in force and an invalid UTF subject string is detected, one of a number
-       of  UTF-specific negative error codes is returned. Details are given in
-       the pcre2unicode page. The following are the other errors that  may  be
+       of UTF-specific negative error codes is returned. Details are given  in
+       the  pcre2unicode  page. The following are the other errors that may be
        returned by pcre2_match():


          PCRE2_ERROR_NOMATCH
@@ -2879,20 +2870,20 @@


          PCRE2_ERROR_PARTIAL


-       The  subject  string did not match, but it did match partially. See the
+       The subject string did not match, but it did match partially.  See  the
        pcre2partial documentation for details of partial matching.


          PCRE2_ERROR_BADMAGIC


        PCRE2 stores a 4-byte "magic number" at the start of the compiled code,
-       to  catch  the case when it is passed a junk pointer. This is the error
+       to catch the case when it is passed a junk pointer. This is  the  error
        that is returned when the magic number is not present.


          PCRE2_ERROR_BADMODE


-       This error is given when a compiled pattern is passed to a function  in
-       a  library  of a different code unit width, for example, a pattern com-
-       piled by the 8-bit library is passed to  a  16-bit  or  32-bit  library
+       This  error is given when a compiled pattern is passed to a function in
+       a library of a different code unit width, for example, a  pattern  com-
+       piled  by  the  8-bit  library  is passed to a 16-bit or 32-bit library
        function.


          PCRE2_ERROR_BADOFFSET
@@ -2906,15 +2897,15 @@
          PCRE2_ERROR_BADUTFOFFSET


        The UTF code unit sequence that was passed as a subject was checked and
-       found to be valid (the PCRE2_NO_UTF_CHECK option was not set), but  the
-       value  of startoffset did not point to the beginning of a UTF character
+       found  to be valid (the PCRE2_NO_UTF_CHECK option was not set), but the
+       value of startoffset did not point to the beginning of a UTF  character
        or the end of the subject.


          PCRE2_ERROR_CALLOUT


-       This error is never generated by pcre2_match() itself. It  is  provided
-       for  use  by  callout  functions  that  want  to cause pcre2_match() or
-       pcre2_callout_enumerate() to return a distinctive error code.  See  the
+       This  error  is never generated by pcre2_match() itself. It is provided
+       for use by callout  functions  that  want  to  cause  pcre2_match()  or
+       pcre2_callout_enumerate()  to  return a distinctive error code. See the
        pcre2callout documentation for details.


          PCRE2_ERROR_DEPTHLIMIT
@@ -2927,15 +2918,15 @@


          PCRE2_ERROR_INTERNAL


-       An  unexpected  internal error has occurred. This error could be caused
+       An unexpected internal error has occurred. This error could  be  caused
        by a bug in PCRE2 or by overwriting of the compiled pattern.


          PCRE2_ERROR_JIT_STACKLIMIT


-       This error is returned when a pattern  that  was  successfully  studied
-       using  JIT  is being matched, but the memory available for the just-in-
-       time processing stack is not large enough. See the pcre2jit  documenta-
-       tion for more details.
+       This error is returned when a pattern that was successfully studied us-
+       ing JIT is being matched, but the memory available for the just-in-time
+       processing  stack  is  not large enough. See the pcre2jit documentation
+       for more details.


          PCRE2_ERROR_MATCHLIMIT


@@ -2943,11 +2934,11 @@

          PCRE2_ERROR_NOMEMORY


-       If  a  pattern contains many nested backtracking points, heap memory is
-       used to remember them. This error is given when the  memory  allocation
-       function  (default  or  custom)  fails.  Note  that  a different error,
-       PCRE2_ERROR_HEAPLIMIT, is given if the amount of memory needed  exceeds
-       the    heap   limit.   PCRE2_ERROR_NOMEMORY   is   also   returned   if
+       If a pattern contains many nested backtracking points, heap  memory  is
+       used  to  remember them. This error is given when the memory allocation
+       function (default or  custom)  fails.  Note  that  a  different  error,
+       PCRE2_ERROR_HEAPLIMIT,  is given if the amount of memory needed exceeds
+       the   heap   limit.   PCRE2_ERROR_NOMEMORY   is   also   returned    if
        PCRE2_COPY_MATCHED_SUBJECT is set and memory allocation fails.


          PCRE2_ERROR_NULL
@@ -2956,12 +2947,12 @@


          PCRE2_ERROR_RECURSELOOP


-       This error is returned when  pcre2_match()  detects  a  recursion  loop
-       within  the  pattern. Specifically, it means that either the whole pat-
+       This  error  is  returned  when  pcre2_match() detects a recursion loop
+       within the pattern. Specifically, it means that either the  whole  pat-
        tern or a capture group has been called recursively for the second time
-       at  the  same position in the subject string. Some simple patterns that
-       might do this are detected and faulted at compile time, but  more  com-
-       plicated  cases,  in particular mutual recursions between two different
+       at the same position in the subject string. Some simple  patterns  that
+       might  do  this are detected and faulted at compile time, but more com-
+       plicated cases, in particular mutual recursions between  two  different
        groups, cannot be detected until matching is attempted.



@@ -2970,21 +2961,21 @@
        int pcre2_get_error_message(int errorcode, PCRE2_UCHAR *buffer,
          PCRE2_SIZE bufflen);


-       A text message for an error code  from  any  PCRE2  function  (compile,
-       match,  or  auxiliary)  can be obtained by calling pcre2_get_error_mes-
-       sage(). The code is passed as the first argument,  with  the  remaining
-       two  arguments  specifying  a  code  unit buffer and its length in code
-       units, into which the text message is placed. The message  is  returned
-       in  code  units  of the appropriate width for the library that is being
+       A  text  message  for  an  error code from any PCRE2 function (compile,
+       match, or auxiliary) can be obtained  by  calling  pcre2_get_error_mes-
+       sage().  The  code  is passed as the first argument, with the remaining
+       two arguments specifying a code unit buffer  and  its  length  in  code
+       units,  into  which the text message is placed. The message is returned
+       in code units of the appropriate width for the library  that  is  being
        used.


-       The returned message is terminated with a trailing zero, and the  func-
-       tion  returns  the  number  of  code units used, excluding the trailing
-       zero.  If  the  error  number  is  unknown,  the  negative  error  code
-       PCRE2_ERROR_BADDATA  is  returned. If the buffer is too small, the mes-
-       sage is truncated (but still with a trailing zero),  and  the  negative
-       error  code PCRE2_ERROR_NOMEMORY is returned.  None of the messages are
-       very long; a buffer size of 120 code units is ample.
+       The  returned message is terminated with a trailing zero, and the func-
+       tion returns the number of code  units  used,  excluding  the  trailing
+       zero. If the error number is unknown, the negative error code PCRE2_ER-
+       ROR_BADDATA is returned. If the buffer is too  small,  the  message  is
+       truncated (but still with a trailing zero), and the negative error code
+       PCRE2_ERROR_NOMEMORY is returned.  None of the messages are very  long;
+       a buffer size of 120 code units is ample.



EXTRACTING CAPTURED SUBSTRINGS BY NUMBER
@@ -3002,39 +2993,39 @@

        void pcre2_substring_free(PCRE2_UCHAR *buffer);


-       Captured substrings can be accessed directly by using  the  ovector  as
+       Captured  substrings  can  be accessed directly by using the ovector as
        described above.  For convenience, auxiliary functions are provided for
-       extracting  captured  substrings  as  new,  separate,   zero-terminated
+       extracting   captured  substrings  as  new,  separate,  zero-terminated
        strings. A substring that contains a binary zero is correctly extracted
-       and has a further zero added on the end, but  the  result  is  not,  of
+       and  has  a  further  zero  added on the end, but the result is not, of
        course, a C string.


        The functions in this section identify substrings by number. The number
        zero refers to the entire matched substring, with higher numbers refer-
-       ring  to  substrings  captured by parenthesized groups. After a partial
-       match, only substring zero is available.  An  attempt  to  extract  any
-       other  substring  gives the error PCRE2_ERROR_PARTIAL. The next section
+       ring to substrings captured by parenthesized groups.  After  a  partial
+       match,  only  substring  zero  is  available. An attempt to extract any
+       other substring gives the error PCRE2_ERROR_PARTIAL. The  next  section
        describes similar functions for extracting captured substrings by name.


-       If a pattern uses the \K escape sequence within a  positive  assertion,
+       If  a  pattern uses the \K escape sequence within a positive assertion,
        the reported start of a successful match can be greater than the end of
-       the match.  For example, if the pattern  (?=ab\K)  is  matched  against
-       "ab",  the  start  and  end offset values for the match are 2 and 0. In
-       this situation, calling these functions with a  zero  substring  number
+       the  match.   For  example,  if the pattern (?=ab\K) is matched against
+       "ab", the start and end offset values for the match are  2  and  0.  In
+       this  situation,  calling  these functions with a zero substring number
        extracts a zero-length empty string.
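
The arithmetic behind that last sentence can be sketched as follows. The helper name span_length() is hypothetical and is not part of the PCRE2 API; it only shows how a start offset that lies beyond the end offset can be mapped to an empty string instead of a wrapped-around unsigned length.

```c
#include <stddef.h>

/* Length, in code units, of the substring described by one ovector pair.
   When start is beyond end (as with \K inside a lookahead), the result
   is zero rather than a huge wrapped-around value. */
static size_t span_length(size_t start, size_t end)
{
    return (start > end) ? 0 : end - start;
}
```

For the (?=ab\K) example, span_length(2, 0) gives 0.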


-       You  can  find the length in code units of a captured substring without
-       extracting it by calling pcre2_substring_length_bynumber().  The  first
-       argument  is a pointer to the match data block, the second is the group
-       number, and the third is a pointer to a variable into which the  length
-       is  placed.  If  you just want to know whether or not the substring has
+       You can find the length in code units of a captured  substring  without
+       extracting  it  by calling pcre2_substring_length_bynumber(). The first
+       argument is a pointer to the match data block, the second is the  group
+       number,  and the third is a pointer to a variable into which the length
+       is placed. If you just want to know whether or not  the  substring  has
        been captured, you can pass the third argument as NULL.


-       The pcre2_substring_copy_bynumber() function  copies  a  captured  sub-
-       string  into  a supplied buffer, whereas pcre2_substring_get_bynumber()
-       copies it into new memory, obtained using the  same  memory  allocation
-       function  that  was  used for the match data block. The first two argu-
-       ments of these functions are a pointer to the match data  block  and  a
+       The  pcre2_substring_copy_bynumber()  function  copies  a captured sub-
+       string into a supplied buffer,  whereas  pcre2_substring_get_bynumber()
+       copies  it  into  new memory, obtained using the same memory allocation
+       function that was used for the match data block. The  first  two  argu-
+       ments  of  these  functions are a pointer to the match data block and a
        capture group number.


        The final arguments of pcre2_substring_copy_bynumber() are a pointer to
@@ -3043,25 +3034,25 @@
        for the extracted substring, excluding the terminating zero.


        For pcre2_substring_get_bynumber() the third and fourth arguments point
-       to  variables that are updated with a pointer to the new memory and the
-       number of code units that comprise the substring, again  excluding  the
-       terminating  zero.  When  the substring is no longer needed, the memory
+       to variables that are updated with a pointer to the new memory and  the
+       number  of  code units that comprise the substring, again excluding the
+       terminating zero. When the substring is no longer  needed,  the  memory
        should be freed by calling pcre2_substring_free().


-       The return value from all these functions is zero  for  success,  or  a
-       negative  error  code.  If  the pattern match failed, the match failure
-       code is returned.  If a substring number  greater  than  zero  is  used
-       after  a partial match, PCRE2_ERROR_PARTIAL is returned. Other possible
+       The  return  value  from  all these functions is zero for success, or a
+       negative error code. If the pattern match  failed,  the  match  failure
+       code  is returned.  If a substring number greater than zero is used af-
+       ter a partial match, PCRE2_ERROR_PARTIAL is  returned.  Other  possible
        error codes are:


          PCRE2_ERROR_NOMEMORY


-       The buffer was too small for  pcre2_substring_copy_bynumber(),  or  the
+       The  buffer  was  too small for pcre2_substring_copy_bynumber(), or the
        attempt to get memory failed for pcre2_substring_get_bynumber().


          PCRE2_ERROR_NOSUBSTRING


-       There  is  no  substring  with that number in the pattern, that is, the
+       There is no substring with that number in the  pattern,  that  is,  the
        number is greater than the number of capturing parentheses.


          PCRE2_ERROR_UNAVAILABLE
@@ -3072,8 +3063,8 @@


          PCRE2_ERROR_UNSET


-       The substring did not participate in the match.  For  example,  if  the
-       pattern  is  (abc)|(def) and the subject is "def", and the ovector con-
+       The  substring  did  not  participate in the match. For example, if the
+       pattern is (abc)|(def) and the subject is "def", and the  ovector  con-
        tains at least two capturing slots, substring number 1 is unset.



@@ -3084,31 +3075,31 @@

        void pcre2_substring_list_free(PCRE2_SPTR *list);


-       The pcre2_substring_list_get() function  extracts  all  available  sub-
-       strings  and  builds  a  list of pointers to them. It also (optionally)
-       builds a second list that  contains  their  lengths  (in  code  units),
-       excluding a terminating zero that is added to each of them. All this is
+       The  pcre2_substring_list_get()  function  extracts  all available sub-
+       strings and builds a list of pointers to  them.  It  also  (optionally)
+       builds  a  second list that contains their lengths (in code units), ex-
+       cluding a terminating zero that is added to each of them. All  this  is
        done in a single block of memory that is obtained using the same memory
        allocation function that was used to get the match data block.


-       This  function  must be called only after a successful match. If called
+       This function must be called only after a successful match.  If  called
        after a partial match, the error code PCRE2_ERROR_PARTIAL is returned.


-       The address of the memory block is returned via listptr, which is  also
+       The  address of the memory block is returned via listptr, which is also
        the start of the list of string pointers. The end of the list is marked
-       by a NULL pointer. The address of the list of lengths is  returned  via
-       lengthsptr.  If your strings do not contain binary zeros and you do not
+       by  a  NULL pointer. The address of the list of lengths is returned via
+       lengthsptr. If your strings do not contain binary zeros and you do  not
        therefore need the lengths, you may supply NULL as the lengthsptr argu-
-       ment  to  disable  the  creation of a list of lengths. The yield of the
-       function is zero if all went well, or PCRE2_ERROR_NOMEMORY if the  mem-
-       ory  block could not be obtained. When the list is no longer needed, it
+       ment to disable the creation of a list of lengths.  The  yield  of  the
+       function  is zero if all went well, or PCRE2_ERROR_NOMEMORY if the mem-
+       ory block could not be obtained. When the list is no longer needed,  it
        should be freed by calling pcre2_substring_list_free().


        If this function encounters a substring that is unset, which can happen
-       when  capture  group  number  n+1 matches some part of the subject, but
-       group n has not been used at all, it returns an empty string. This  can
+       when capture group number n+1 matches some part  of  the  subject,  but
+       group  n has not been used at all, it returns an empty string. This can
        be distinguished from a genuine zero-length substring by inspecting the
-       appropriate offset in the ovector, which contains PCRE2_UNSET for  unset
+       appropriate  offset in the ovector, which contains PCRE2_UNSET for unset
        substrings, or by calling pcre2_substring_length_bynumber().
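
A minimal sketch of the ovector check, written without the library so that it stands alone. SKETCH_UNSET is a local stand-in assumed to mirror PCRE2_UNSET (all bits set in a PCRE2_SIZE); classify_group() and the example array are hypothetical names for this illustration, and the array holds the offsets one would expect after matching (abc)|(def) against "def".

```c
#include <stddef.h>

/* Local stand-in for PCRE2_UNSET, defined here so that the sketch
   compiles without pcre2.h. */
#define SKETCH_UNSET (~(size_t)0)

typedef enum { GROUP_UNSET, GROUP_EMPTY, GROUP_NONEMPTY } group_state;

/* Inspect the offset pair for group n: an unset group holds the unset
   marker, while a genuinely empty capture has start equal to end. */
static group_state classify_group(const size_t *ovector, unsigned n)
{
    size_t start = ovector[2 * n];
    size_t end = ovector[2 * n + 1];

    if (start == SKETCH_UNSET)
        return GROUP_UNSET;
    return (start == end) ? GROUP_EMPTY : GROUP_NONEMPTY;
}

/* Offsets for group 0 (whole match), group 1 (unset) and group 2. */
static const size_t example_ovector[] = {
    0, 3, SKETCH_UNSET, SKETCH_UNSET, 0, 3
};
```

Here group 1 is classified as unset, even though a list built from the same match would show it as an empty string.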



@@ -3128,7 +3119,7 @@

        void pcre2_substring_free(PCRE2_UCHAR *buffer);


-       To  extract a substring by name, you first have to find the associated
+       To extract a substring by name, you first have to find the  associated
        number.  For example, for this pattern:


          (a+)b(?<xxx>\d+)...
@@ -3136,32 +3127,32 @@
        the number of the capture group called "xxx" is 2. If the name is known
        to be unique (PCRE2_DUPNAMES was not set), you can find the number from
        the name by calling pcre2_substring_number_from_name(). The first argu-
-       ment  is the compiled pattern, and the second is the name. The yield of
-       the function is the group number, PCRE2_ERROR_NOSUBSTRING if  there  is
-       no  group  with that name, or PCRE2_ERROR_NOUNIQUESUBSTRING if there is
-       more than one group with that name.  Given the number, you can  extract
-       the  substring  directly from the ovector, or use one of the "bynumber"
+       ment is the compiled pattern, and the second is the name. The yield  of
+       the  function  is the group number, PCRE2_ERROR_NOSUBSTRING if there is
+       no group with that name, or PCRE2_ERROR_NOUNIQUESUBSTRING if  there  is
+       more  than one group with that name.  Given the number, you can extract
+       the substring directly from the ovector, or use one of  the  "bynumber"
        functions described above.


-       For convenience, there are also "byname" functions that  correspond  to
-       the  "bynumber"  functions,  the  only difference being that the second
-       argument is a name instead of a number. If PCRE2_DUPNAMES  is  set  and
+       For  convenience,  there are also "byname" functions that correspond to
+       the "bynumber" functions, the only difference being that the second ar-
+       gument  is  a  name  instead  of a number. If PCRE2_DUPNAMES is set and
        there are duplicate names, these functions scan all the groups with the
-       given name, and return the captured  substring  from  the  first  named
+       given  name,  and  return  the  captured substring from the first named
        group that is set.


-       If  there are no groups with the given name, PCRE2_ERROR_NOSUBSTRING is
-       returned. If all groups with the name have  numbers  that  are  greater
-       than  the  number  of  slots in the ovector, PCRE2_ERROR_UNAVAILABLE is
-       returned. If there is at least one group with a slot  in  the  ovector,
-       but no group is found to be set, PCRE2_ERROR_UNSET is returned.
+       If there are no groups with the given name, PCRE2_ERROR_NOSUBSTRING  is
+       returned.  If  all  groups  with the name have numbers that are greater
+       than the number of slots in the ovector, PCRE2_ERROR_UNAVAILABLE is re-
+       turned.  If there is at least one group with a slot in the ovector, but
+       no group is found to be set, PCRE2_ERROR_UNSET is returned.


        Warning: If the pattern uses the (?| feature to set up multiple capture
-       groups with the same number, as described in the section  on  duplicate
+       groups  with  the same number, as described in the section on duplicate
        group numbers in the pcre2pattern page, you cannot use names to distin-
-       guish the different capture groups, because names are not  included  in
-       the  compiled  code.  The  matching process uses only numbers. For this
-       reason, the use of different names for  groups  with  the  same  number
+       guish  the  different capture groups, because names are not included in
+       the compiled code. The matching process uses  only  numbers.  For  this
+       reason,  the  use  of  different  names for groups with the same number
        causes an error at compile time.



@@ -3174,72 +3165,72 @@
          PCRE2_SIZE rlength, PCRE2_UCHAR *outputbuffer,
          PCRE2_SIZE *outlengthptr);


-       This  function calls pcre2_match() and then makes a copy of the subject
-       string in outputbuffer, replacing one or more parts that  were  matched
+       This function calls pcre2_match() and then makes a copy of the  subject
+       string  in  outputbuffer, replacing one or more parts that were matched
        with the replacement string, whose length is supplied in rlength.  This
-       can be given as PCRE2_ZERO_TERMINATED  for  a  zero-terminated  string.
-       The  default is to perform just one replacement, but there is an option
-       that requests multiple replacements (see PCRE2_SUBSTITUTE_GLOBAL  below
+       can  be  given  as  PCRE2_ZERO_TERMINATED for a zero-terminated string.
+       The default is to perform just one replacement, but there is an  option
+       that  requests multiple replacements (see PCRE2_SUBSTITUTE_GLOBAL below
        for details).


-       Matches  in  which  a  \K item in a lookahead in the pattern causes the
-       match to end before it starts are not supported, and give  rise  to  an
+       Matches in which a \K item in a lookahead in  the  pattern  causes  the
+       match  to  end  before it starts are not supported, and give rise to an
        error return. For global replacements, matches in which \K in a lookbe-
-       hind causes the match to start earlier than the point that was  reached
+       hind  causes the match to start earlier than the point that was reached
        in the previous iteration are also not supported.


-       The  first  seven  arguments  of pcre2_substitute() are the same as for
+       The first seven arguments of pcre2_substitute() are  the  same  as  for
        pcre2_match(), except that the partial matching options are not permit-
-       ted,  and  match_data may be passed as NULL, in which case a match data
-       block is obtained and freed within this function, using memory  manage-
-       ment  functions from the match context, if provided, or else those that
+       ted, and match_data may be passed as NULL, in which case a  match  data
+       block  is obtained and freed within this function, using memory manage-
+       ment functions from the match context, if provided, or else those  that
        were used to allocate memory for the compiled code.


-       If an external match_data block is provided,  its  contents  afterwards
-       are  those  set by the final call to pcre2_match(). For global changes,
-       this will have ended in a matching error. The contents of  the  ovector
+       If  an  external  match_data block is provided, its contents afterwards
+       are those set by the final call to pcre2_match(). For  global  changes,
+       this  will  have ended in a matching error. The contents of the ovector
        within the match data block may or may not have been changed.


-       The  outlengthptr  argument  must point to a variable that contains the
-       length, in code units, of the output buffer. If the  function  is  suc-
-       cessful,  the value is updated to contain the length of the new string,
+       The outlengthptr argument must point to a variable  that  contains  the
+       length,  in  code  units, of the output buffer. If the function is suc-
+       cessful, the value is updated to contain the length of the new  string,
        excluding the trailing zero that is automatically added.


-       If the function is not  successful,  the  value  set  via  outlengthptr
-       depends  on  the  type  of  error. For syntax errors in the replacement
-       string, the value is the offset in the  replacement  string  where  the
-       error  was  detected.  For  other  errors,  the value is PCRE2_UNSET by
-       default. This includes the case of the output buffer being  too  small,
-       unless  PCRE2_SUBSTITUTE_OVERFLOW_LENGTH  is  set (see below), in which
-       case the value is the minimum length needed, including  space  for  the
-       trailing  zero.  Note  that  in  order  to compute the required length,
-       pcre2_substitute() has  to  simulate  all  the  matching  and  copying,
-       instead of giving an error return as soon as the buffer overflows. Note
-       also that the length is in code units, not bytes.
+       If  the  function is not successful, the value set via outlengthptr de-
+       pends on the type of  error.  For  syntax  errors  in  the  replacement
+       string, the value is the offset in the replacement string where the er-
+       ror was detected. For other errors, the value  is  PCRE2_UNSET  by  de-
+       fault. This includes the case of the output buffer being too small, un-
+       less PCRE2_SUBSTITUTE_OVERFLOW_LENGTH is set (see below), in which case
+       the  value is the minimum length needed, including space for the trail-
+       ing zero. Note that in order to compute the required length, pcre2_sub-
+       stitute() has to simulate all the matching and copying, instead of giv-
+       ing an error return as soon as the buffer overflows. Note also that the
+       length is in code units, not bytes.


-       In the replacement string, which is interpreted as a UTF string in  UTF
-       mode,  and  is  checked  for UTF validity unless the PCRE2_NO_UTF_CHECK
-       option is set, a dollar character is an escape character that can spec-
-       ify  the  insertion  of  characters  from  capture groups or names from
-       (*MARK) or other control verbs in the pattern. The following forms  are
-       always recognized:
+       In  the replacement string, which is interpreted as a UTF string in UTF
+       mode, and is checked for UTF validity unless the PCRE2_NO_UTF_CHECK op-
+       tion is set, a dollar character is an escape character that can specify
+       the insertion of characters from capture groups or names  from  (*MARK)
+       or  other  control verbs in the pattern. The following forms are always
+       recognized:


          $$                  insert a dollar character
          $<n> or ${<n>}      insert the contents of group <n>
          $*MARK or ${*MARK}  insert a control verb name


-       Either  a  group  number  or  a  group name can be given for <n>. Curly
-       brackets are required only if the following character would  be  inter-
+       Either a group number or a group name  can  be  given  for  <n>.  Curly
+       brackets  are  required only if the following character would be inter-
        preted as part of the number or name. The number may be zero to include
-       the entire matched string.   For  example,  if  the  pattern  a(b)c  is
-       matched  with "=abc=" and the replacement string "+$1$0$1+", the result
+       the  entire  matched  string.   For  example,  if  the pattern a(b)c is
+       matched with "=abc=" and the replacement string "+$1$0$1+", the  result
        is "=+babcb+=".
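
The replacement forms can be illustrated with a small stand-alone expander. This is a simplified sketch and not how pcre2_substitute() is implemented: expand_replacement() is a hypothetical helper that handles only $$, $<n> and ${<n>} with numeric groups, and it takes the captured strings as a plain array instead of a match data block. Group names and $*MARK are deliberately left out.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Expand $$, $<n> and ${<n>} (numeric groups only) into out.
   Returns 0 on success, or -1 for a syntax error, an unknown group
   number, or an output buffer that is too small. */
static int expand_replacement(const char *repl, const char *const *groups,
                              size_t ngroups, char *out, size_t outsize)
{
    size_t used = 0;

    while (*repl != '\0') {
        const char *piece = repl;   /* default: copy one literal character */
        size_t piece_len = 1;

        if (*repl == '$') {
            repl++;
            if (*repl == '$') {     /* "$$" inserts a single dollar */
                piece = repl;
                repl++;
            } else {
                int braced = (*repl == '{');
                size_t n = 0;

                if (braced)
                    repl++;
                if (!isdigit((unsigned char)*repl))
                    return -1;
                while (isdigit((unsigned char)*repl)) {
                    n = n * 10 + (size_t)(*repl - '0');
                    repl++;
                }
                if (braced) {
                    if (*repl != '}')
                        return -1;
                    repl++;
                }
                if (n >= ngroups)
                    return -1;
                piece = groups[n];
                piece_len = strlen(piece);
            }
        } else {
            repl++;
        }

        /* Keep one code unit free for the trailing zero. */
        if (used + piece_len + 1 > outsize)
            return -1;
        memcpy(out + used, piece, piece_len);
        used += piece_len;
    }

    if (outsize == 0)
        return -1;
    out[used] = '\0';
    return 0;
}
```

Given the groups "abc" (group 0) and "b" (group 1) from matching a(b)c, the replacement "+$1$0$1+" expands to "+babcb+"; substituted for the matched part of "=abc=", that yields the "=+babcb+=" shown above.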


-       $*MARK inserts the name from the last encountered backtracking  control
-       verb  on the matching path that has a name. (*MARK) must always include
-       a name, but the other verbs need not.  For  example,  in  the  case  of
+       $*MARK  inserts the name from the last encountered backtracking control
+       verb on the matching path that has a name. (*MARK) must always  include
+       a  name,  but  the  other  verbs  need not. For example, in the case of
        (*MARK:A)(*PRUNE) the name inserted is "A", but for (*MARK:A)(*PRUNE:B)
-       the relevant name is "B". This facility can be used to  perform  simple
+       the  relevant  name is "B". This facility can be used to perform simple
        simultaneous substitutions, as this pcre2test example shows:


          /(*MARK:pear)apple|(*MARK:orange)lemon/g,replace=${*MARK}
@@ -3246,19 +3237,19 @@
              apple lemon
           2: pear orange


-       As  well as the usual options for pcre2_match(), a number of additional
+       As well as the usual options for pcre2_match(), a number of  additional
        options can be set in the options argument of pcre2_substitute().


        PCRE2_SUBSTITUTE_GLOBAL causes the function to iterate over the subject
-       string,  replacing every matching substring. If this option is not set,
-       only the first matching substring is replaced. The search  for  matches
-       takes  place in the original subject string (that is, previous replace-
-       ments do not affect it).  Iteration is  implemented  by  advancing  the
-       startoffset  value  for  each search, which is always passed the entire
+       string, replacing every matching substring. If this option is not  set,
+       only  the  first matching substring is replaced. The search for matches
+       takes place in the original subject string (that is, previous  replace-
+       ments  do  not  affect  it).  Iteration is implemented by advancing the
+       startoffset value for each search, which is always  passed  the  entire
        subject string. If an offset limit is set in the match context, search-
        ing stops when that limit is reached.


-       You  can  restrict  the effect of a global substitution to a portion of
+       You can restrict the effect of a global substitution to  a  portion  of
        the subject string by setting either or both of startoffset and an off-
        set limit. Here is a pcre2test example:


@@ -3266,68 +3257,67 @@
          ABC ABC ABC ABC\=offset=3,offset_limit=12
           2: ABC A!C A!C ABC


-       When  continuing  with  global substitutions after matching a substring
+       When continuing with global substitutions after  matching  a  substring
        with zero length, an attempt to find a non-empty match at the same off-
        set is performed.  If this is not successful, the offset is advanced by
        one character except when CRLF is a valid newline sequence and the next
-       two  characters are CR, LF. In this case, the offset is advanced by two
+       two characters are CR, LF. In this case, the offset is advanced by  two
        characters.


-       PCRE2_SUBSTITUTE_OVERFLOW_LENGTH changes what happens when  the  output
+       PCRE2_SUBSTITUTE_OVERFLOW_LENGTH  changes  what happens when the output
        buffer is too small. The default action is to return PCRE2_ERROR_NOMEM-
-       ORY immediately. If this option  is  set,  however,  pcre2_substitute()
+       ORY  immediately.  If  this  option is set, however, pcre2_substitute()
        continues to go through the motions of matching and substituting (with-
-       out, of course, writing anything) in order to compute the size of  buf-
-       fer  that  is  needed.  This  value is passed back via the outlengthptr
-       variable,   with   the   result   of   the   function    still    being
-       PCRE2_ERROR_NOMEMORY.
+       out,  of course, writing anything) in order to compute the size of buf-
+       fer that is needed. This value is  passed  back  via  the  outlengthptr
+       variable,  with  the  result  of  the  function  still  being PCRE2_ER-
+       ROR_NOMEMORY.


-       Passing  a  buffer  size  of zero is a permitted way of finding out how
-       much memory is needed for given substitution. However, this  does  mean
+       Passing a buffer size of zero is a permitted way  of  finding  out  how
+       much memory is needed for a given substitution. However, this does mean
        that the entire operation is carried out twice. Depending on the appli-
-       cation, it may be more efficient to allocate a large  buffer  and  free
-       the   excess   afterwards,   instead  of  using  PCRE2_SUBSTITUTE_OVER-
+       cation,  it  may  be more efficient to allocate a large buffer and free
+       the  excess  afterwards,  instead   of   using   PCRE2_SUBSTITUTE_OVER-
        FLOW_LENGTH.


        PCRE2_SUBSTITUTE_UNKNOWN_UNSET causes references to capture groups that
        do not appear in the pattern to be treated as unset groups. This option
-       should be used with care, because it means that a typo in a group  name
+       should  be used with care, because it means that a typo in a group name
        or number no longer causes the PCRE2_ERROR_NOSUBSTRING error.


-       PCRE2_SUBSTITUTE_UNSET_EMPTY  causes  unset  capture  groups (including
-       unknown  groups  when  PCRE2_SUBSTITUTE_UNKNOWN_UNSET  is  set)  to  be
-       treated  as  empty  strings  when  inserted as described above. If this
-       option is not set, an attempt to  insert  an  unset  group  causes  the
-       PCRE2_ERROR_UNSET  error.  This  option does not influence the extended
-       substitution syntax described below.
+       PCRE2_SUBSTITUTE_UNSET_EMPTY causes unset capture groups (including un-
+       known  groups when PCRE2_SUBSTITUTE_UNKNOWN_UNSET is set) to be treated
+       as empty strings when inserted as described above. If  this  option  is
+       not set, an attempt to insert an unset group causes the PCRE2_ERROR_UN-
+       SET error. This option does not  influence  the  extended  substitution
+       syntax described below.


-       PCRE2_SUBSTITUTE_EXTENDED causes extra processing to be applied to  the
-       replacement  string.  Without this option, only the dollar character is
-       special, and only the group insertion forms  listed  above  are  valid.
+       PCRE2_SUBSTITUTE_EXTENDED  causes extra processing to be applied to the
+       replacement string. Without this option, only the dollar  character  is
+       special,  and  only  the  group insertion forms listed above are valid.
        When PCRE2_SUBSTITUTE_EXTENDED is set, two things change:


-       Firstly,  backslash in a replacement string is interpreted as an escape
+       Firstly, backslash in a replacement string is interpreted as an  escape
        character. The usual forms such as \n or \x{ddd} can be used to specify
-       particular  character codes, and backslash followed by any non-alphanu-
-       meric character quotes that character. Extended quoting  can  be  coded
+       particular character codes, and backslash followed by any  non-alphanu-
+       meric  character  quotes  that character. Extended quoting can be coded
        using \Q...\E, exactly as in pattern strings.


-       There  are  also four escape sequences for forcing the case of inserted
-       letters.  The insertion mechanism has three states:  no  case  forcing,
+       There are also four escape sequences for forcing the case  of  inserted
+       letters.   The  insertion  mechanism has three states: no case forcing,
        force upper case, and force lower case. The escape sequences change the
        current state: \U and \L change to upper or lower case forcing, respec-
-       tively,  and  \E (when not terminating a \Q quoted sequence) reverts to
-       no case forcing. The sequences \u and \l force the next  character  (if
-       it  is  a  letter)  to  upper or lower case, respectively, and then the
+       tively, and \E (when not terminating a \Q quoted sequence)  reverts  to
+       no  case  forcing. The sequences \u and \l force the next character (if
+       it is a letter) to upper or lower  case,  respectively,  and  then  the
        state automatically reverts to no case forcing. Case forcing applies to
-       all  inserted  characters, including those from capture groups and let-
+       all inserted  characters, including those from capture groups and  let-
        ters within \Q...\E quoted sequences.


        Note that case forcing sequences such as \U...\E do not nest. For exam-
-       ple,  the  result of processing "\Uaa\LBB\Ecc\E" is "AAbbcc"; the final
-       \E  has  no   effect.   Note   also   that   the   PCRE2_ALT_BSUX   and
-       PCRE2_EXTRA_ALT_BSUX  options  do not apply to not apply to replacement
-       strings.
+       ple, the result of processing "\Uaa\LBB\Ecc\E" is "AAbbcc";  the  final
+       \E  has  no  effect.  Note  also  that the PCRE2_ALT_BSUX and PCRE2_EX-
+       TRA_ALT_BSUX options do not apply to replacement strings.


        The second effect of setting PCRE2_SUBSTITUTE_EXTENDED is to  add  more
        flexibility  to  capture  group  substitution. The syntax is similar to
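
The case-forcing rules described in this hunk can be modelled as a small state machine. The C sketch below is illustrative only and is not taken from the PCRE2 sources. The names force_state and apply_case_forcing are hypothetical. It handles ASCII input only and recognizes just \U, \L, \E, \u and \l; every other backslash sequence is copied through unchanged and \Q...\E quoting is not modelled. It also assumes that a \u or \l request is consumed by the next character whether or not that character is a letter.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical names for illustration; this is not PCRE2 code. */
enum force_state { FORCE_NONE, FORCE_UPPER, FORCE_LOWER };

/* Expand the case-forcing escapes in an ASCII replacement string.
   Returns the number of characters written (excluding the NUL),
   or (size_t)-1 if the output buffer is too small. */
size_t apply_case_forcing(const char *repl, char *out, size_t outsize)
{
    enum force_state state = FORCE_NONE;   /* set by \U, \L, \E */
    enum force_state single = FORCE_NONE;  /* one-shot, set by \u, \l */
    size_t n = 0;

    assert(repl != NULL && out != NULL);
    if (outsize == 0) return (size_t)-1;

    for (const char *p = repl; *p != '\0'; p++) {
        if (*p == '\\' && p[1] != '\0') {
            switch (p[1]) {
            case 'U': state = FORCE_UPPER; single = FORCE_NONE; p++; continue;
            case 'L': state = FORCE_LOWER; single = FORCE_NONE; p++; continue;
            case 'E': state = FORCE_NONE;  single = FORCE_NONE; p++; continue;
            case 'u': single = FORCE_UPPER; p++; continue;
            case 'l': single = FORCE_LOWER; p++; continue;
            default:  break;  /* outside this sketch: copied as-is */
            }
        }

        unsigned char c = (unsigned char)*p;
        if (single != FORCE_NONE) {
            c = (unsigned char)(single == FORCE_UPPER ? toupper(c) : tolower(c));
            single = FORCE_NONE;
            state = FORCE_NONE;  /* the text says the state reverts to none */
        } else if (state == FORCE_UPPER) {
            c = (unsigned char)toupper(c);
        } else if (state == FORCE_LOWER) {
            c = (unsigned char)tolower(c);
        }

        if (n + 1 >= outsize) return (size_t)-1;  /* keep room for the NUL */
        out[n++] = (char)c;
    }
    out[n] = '\0';
    return n;
}
```

With this sketch, the replacement "\Uaa\LBB\Ecc\E" from the text produces "AAbbcc": the first \E already resets the state, so the final \E changes nothing.
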
@@ -3357,8 +3347,8 @@
           1: HELLO


        The PCRE2_SUBSTITUTE_UNSET_EMPTY option does not affect these  extended
-       substitutions.   However,   PCRE2_SUBSTITUTE_UNKNOWN_UNSET  does  cause
-       unknown groups in the extended syntax forms to be treated as unset.
+       substitutions.  However,  PCRE2_SUBSTITUTE_UNKNOWN_UNSET does cause un-
+       known groups in the extended syntax forms to be treated as unset.


        If successful, pcre2_substitute()  returns  the  number  of  successful
        matches.  This  may  be  zero  if  no  matches were found, and is never
@@ -3373,8 +3363,8 @@


        PCRE2_ERROR_UNSET is returned for an unset substring insertion (includ-
        ing an unknown substring when  PCRE2_SUBSTITUTE_UNKNOWN_UNSET  is  set)
-       when  the  simple  (non-extended)  syntax  is  used  and  PCRE2_SUBSTI-
-       TUTE_UNSET_EMPTY is not set.
+       when  the simple (non-extended) syntax is used and PCRE2_SUBSTITUTE_UN-
+       SET_EMPTY is not set.


        PCRE2_ERROR_NOMEMORY is returned  if  the  output  buffer  is  not  big
        enough. If the PCRE2_SUBSTITUTE_OVERFLOW_LENGTH option is set, the size
@@ -3382,17 +3372,17 @@
        does not happen by default.


        PCRE2_ERROR_BADREPLACEMENT  is  used for miscellaneous syntax errors in
-       the   replacement   string,   with   more   particular   errors   being
-       PCRE2_ERROR_BADREPESCAPE  (invalid  escape  sequence), PCRE2_ERROR_REP-
-       MISSINGBRACE (closing curly bracket not found),  PCRE2_ERROR_BADSUBSTI-
-       TUTION   (syntax   error   in   extended   group   substitution),   and
-       PCRE2_ERROR_BADSUBSPATTERN (the pattern match ended before  it  started
-       or  the match started earlier than the current position in the subject,
-       which can happen if \K is used in an assertion).
+       the replacement string, with more  particular  errors  being  PCRE2_ER-
+       ROR_BADREPESCAPE (invalid escape sequence), PCRE2_ERROR_REPMISSINGBRACE
+       (closing curly bracket not found), PCRE2_ERROR_BADSUBSTITUTION  (syntax
+       error  in  extended group substitution), and PCRE2_ERROR_BADSUBSPATTERN
+       (the pattern match ended before it started or the match started earlier
+       than  the  current  position  in the subject, which can happen if \K is
+       used in an assertion).


        As for all PCRE2 errors, a text message that describes the error can be
-       obtained   by   calling  the  pcre2_get_error_message()  function  (see
-       "Obtaining a textual error message" above).
+       obtained  by  calling  the pcre2_get_error_message() function (see "Ob-
+       taining a textual error message" above).


    Substitution callouts


@@ -3442,8 +3432,8 @@

        If the value is zero, the replacement is accepted, and,  if  PCRE2_SUB-
        STITUTE_GLOBAL  is set, processing continues with a search for the next
-       match. If the value  is  not  zero,  the  current  replacement  is  not
-       accepted.  If the value is greater than zero, processing continues when
+       match. If the value is not zero, the current  replacement  is  not  ac-
+       cepted.  If  the  value is greater than zero, processing continues when
        PCRE2_SUBSTITUTE_GLOBAL is set. Otherwise (the value is less than  zero
        or  PCRE2_SUBSTITUTE_GLOBAL  is  not  set),  the  rest  of the input is
        copied to the output and the call to pcre2_substitute() exits,  return-
@@ -3456,10 +3446,10 @@
          PCRE2_SPTR name, PCRE2_SPTR *first, PCRE2_SPTR *last);


        When  a  pattern  is compiled with the PCRE2_DUPNAMES option, names for
-       capture groups are not required  to  be  unique.  Duplicate  names  are
-       always  allowed  for  groups with the same number, created by using the
-       (?| feature. Indeed, if such groups are named, they are required to use
-       the same names.
+       capture groups are not required to be unique. Duplicate names  are  al-
+       ways  allowed for groups with the same number, created by using the (?|
+       feature. Indeed, if such groups are named, they are required to use the
+       same names.


        Normally,  patterns  that  use duplicate names are such that in any one
        match, only one of each set of identically-named  groups  participates.
@@ -3467,10 +3457,10 @@


        When   duplicates   are   present,   pcre2_substring_copy_byname()  and
        pcre2_substring_get_byname() return the first  substring  corresponding
-       to   the   given   name   that   is  set.  Only  if  none  are  set  is
-       PCRE2_ERROR_UNSET is returned.  The  pcre2_substring_number_from_name()
-       function returns the error PCRE2_ERROR_NOUNIQUESUBSTRING when there are
-       duplicate names.
+       to  the given name that is set. Only if none are set is PCRE2_ERROR_UN-
+       SET returned. The  pcre2_substring_number_from_name()   function   re-
+       turns  the error PCRE2_ERROR_NOUNIQUESUBSTRING when there are duplicate
+       names.


        If you want to get full details of all captured substrings for a  given
        name,  you  must use the pcre2_substring_nametable_scan() function. The
@@ -3557,44 +3547,43 @@


        The unused bits of the options argument for pcre2_dfa_match()  must  be
        zero.   The   only   bits   that   may   be   set  are  PCRE2_ANCHORED,
-       PCRE2_COPY_MATCHED_SUBJECT,      PCRE2_ENDANCHORED,       PCRE2_NOTBOL,
-       PCRE2_NOTEOL,          PCRE2_NOTEMPTY,          PCRE2_NOTEMPTY_ATSTART,
-       PCRE2_NO_UTF_CHECK,       PCRE2_PARTIAL_HARD,       PCRE2_PARTIAL_SOFT,
-       PCRE2_DFA_SHORTEST,  and  PCRE2_DFA_RESTART.  All  but the last four of
-       these are exactly the same as for pcre2_match(), so  their  description
-       is not repeated here.
+       PCRE2_COPY_MATCHED_SUBJECT, PCRE2_ENDANCHORED, PCRE2_NOTBOL,  PCRE2_NO-
+       TEOL,   PCRE2_NOTEMPTY,   PCRE2_NOTEMPTY_ATSTART,   PCRE2_NO_UTF_CHECK,
+       PCRE2_PARTIAL_HARD,   PCRE2_PARTIAL_SOFT,    PCRE2_DFA_SHORTEST,    and
+       PCRE2_DFA_RESTART.  All but the last four of these are exactly the same
+       as for pcre2_match(), so their description is not repeated here.


          PCRE2_PARTIAL_HARD
          PCRE2_PARTIAL_SOFT


-       These  have  the  same general effect as they do for pcre2_match(), but
-       the details are slightly different. When PCRE2_PARTIAL_HARD is set  for
-       pcre2_dfa_match(),  it  returns  PCRE2_ERROR_PARTIAL  if the end of the
+       These have the same general effect as they do  for  pcre2_match(),  but
+       the  details are slightly different. When PCRE2_PARTIAL_HARD is set for
+       pcre2_dfa_match(), it returns PCRE2_ERROR_PARTIAL if  the  end  of  the
        subject is reached and there is still at least one matching possibility
        that requires additional characters. This happens even if some complete
-       matches have already been found. When PCRE2_PARTIAL_SOFT  is  set,  the
-       return  code  PCRE2_ERROR_NOMATCH is converted into PCRE2_ERROR_PARTIAL
-       if the end of the subject is  reached,  there  have  been  no  complete
+       matches  have  already  been found. When PCRE2_PARTIAL_SOFT is set, the
+       return code PCRE2_ERROR_NOMATCH is converted  into  PCRE2_ERROR_PARTIAL
+       if  the  end  of  the  subject  is reached, there have been no complete
        matches, but there is still at least one matching possibility. The por-
-       tion of the string that was inspected when the  longest  partial  match
+       tion  of  the  string that was inspected when the longest partial match
        was found is set as the first matching string in both cases. There is a
-       more detailed discussion of partial and  multi-segment  matching,  with
+       more  detailed  discussion  of partial and multi-segment matching, with
        examples, in the pcre2partial documentation.


          PCRE2_DFA_SHORTEST


-       Setting  the PCRE2_DFA_SHORTEST option causes the matching algorithm to
+       Setting the PCRE2_DFA_SHORTEST option causes the matching algorithm  to
        stop as soon as it has found one match. Because of the way the alterna-
-       tive  algorithm  works, this is necessarily the shortest possible match
+       tive algorithm works, this is necessarily the shortest  possible  match
        at the first possible matching point in the subject string.


          PCRE2_DFA_RESTART


-       When pcre2_dfa_match() returns a partial match, it is possible to  call
+       When  pcre2_dfa_match() returns a partial match, it is possible to call
        it again, with additional subject characters, and have it continue with
        the same match. The PCRE2_DFA_RESTART option requests this action; when
-       it  is  set,  the workspace and wscount options must reference the same
-       vector as before because data about the match so far is  left  in  them
+       it is set, the workspace and wscount options must  reference  the  same
+       vector  as  before  because data about the match so far is left in them
        after a partial match. There is more discussion of this facility in the
        pcre2partial documentation.
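
The difference between the two partial-matching options described in this hunk comes down to a decision on two facts known when the end of the subject is reached. The C sketch below only restates that decision. It does not call PCRE2, and every name in it (sketch_result, sketch_partial_mode, dfa_partial_outcome) is hypothetical. The pending_at_end flag is assumed to mean "the end of the subject was reached and at least one matching possibility still needs more characters".

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins; these are not the real PCRE2 constants. */
enum sketch_result { SKETCH_MATCH, SKETCH_NOMATCH, SKETCH_PARTIAL };
enum sketch_partial_mode { PARTIAL_OFF, PARTIAL_SOFT, PARTIAL_HARD };

/* Decide the outcome of a DFA match attempt as described in the text. */
enum sketch_result dfa_partial_outcome(enum sketch_partial_mode mode,
                                       int complete_matches,
                                       bool pending_at_end)
{
    assert(complete_matches >= 0);

    /* Hard: a partial match wins even if complete matches were found. */
    if (mode == PARTIAL_HARD && pending_at_end)
        return SKETCH_PARTIAL;

    /* Soft: only a "no match" result is converted to a partial match. */
    if (mode == PARTIAL_SOFT && complete_matches == 0 && pending_at_end)
        return SKETCH_PARTIAL;

    return complete_matches > 0 ? SKETCH_MATCH : SKETCH_NOMATCH;
}
```

In this model the two modes differ only when complete matches exist and a longer possibility is still open at the end of the subject: the hard mode reports a partial match, the soft mode reports the complete matches.
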


@@ -3602,8 +3591,8 @@

        When pcre2_dfa_match() succeeds, it may have matched more than one sub-
        string in the subject. Note, however, that all the matches from one run
-       of the function start at the same point in  the  subject.  The  shorter
-       matches  are all initial substrings of the longer matches. For example,
+       of  the  function  start  at the same point in the subject. The shorter
+       matches are all initial substrings of the longer matches. For  example,
        if the pattern


          <.*>
@@ -3618,80 +3607,80 @@
          <something> <something else>
          <something>


-       On success, the yield of the function is a number  greater  than  zero,
-       which  is  the  number  of  matched substrings. The offsets of the sub-
-       strings are returned in the ovector, and can be extracted by number  in
-       the  same way as for pcre2_match(), but the numbers bear no relation to
-       any capture groups that may exist in the pattern, because DFA  matching
+       On  success,  the  yield of the function is a number greater than zero,
+       which is the number of matched substrings.  The  offsets  of  the  sub-
+       strings  are returned in the ovector, and can be extracted by number in
+       the same way as for pcre2_match(), but the numbers bear no relation  to
+       any  capture groups that may exist in the pattern, because DFA matching
        does not support capturing.


-       Calls  to  the  convenience  functions  that extract substrings by name
-       return the error PCRE2_ERROR_DFA_UFUNC (unsupported function)  if  used
-       after a DFA match. The convenience functions that extract substrings by
+       Calls to the convenience functions that extract substrings by name  re-
+       turn the error PCRE2_ERROR_DFA_UFUNC (unsupported function) if used af-
+       ter a DFA match. The convenience functions that extract  substrings  by
        number never return PCRE2_ERROR_NOSUBSTRING.


-       The matched strings are stored in  the  ovector  in  reverse  order  of
-       length;  that  is,  the longest matching string is first. If there were
-       too many matches to fit into the ovector, the yield of the function  is
+       The  matched  strings  are  stored  in  the ovector in reverse order of
+       length; that is, the longest matching string is first.  If  there  were
+       too  many matches to fit into the ovector, the yield of the function is
        zero, and the vector is filled with the longest matches.


-       NOTE:  PCRE2's  "auto-possessification" optimization usually applies to
-       character repeats at the end of a pattern (as well as internally).  For
-       example,  the pattern "a\d+" is compiled as if it were "a\d++". For DFA
-       matching, this means that only one possible  match  is  found.  If  you
-       really  do  want multiple matches in such cases, either use an ungreedy
-       repeat such as "a\d+?" or set  the  PCRE2_NO_AUTO_POSSESS  option  when
-       compiling.
+       NOTE: PCRE2's "auto-possessification" optimization usually  applies  to
+       character  repeats at the end of a pattern (as well as internally). For
+       example, the pattern "a\d+" is compiled as if it were "a\d++". For  DFA
+       matching,  this means that only one possible match is found. If you re-
+       ally do want multiple matches in such cases, either use an ungreedy re-
+       peat  such as "a\d+?" or set the PCRE2_NO_AUTO_POSSESS option when com-
+       piling.


    Error returns from pcre2_dfa_match()


        The pcre2_dfa_match() function returns a negative number when it fails.
-       Many of the errors are the same  as  for  pcre2_match(),  as  described
+       Many  of  the  errors  are  the same as for pcre2_match(), as described
        above.  There are in addition the following errors that are specific to
        pcre2_dfa_match():


          PCRE2_ERROR_DFA_UITEM


-       This return is given if pcre2_dfa_match() encounters  an  item  in  the
-       pattern  that it does not support, for instance, the use of \C in a UTF
+       This  return  is  given  if pcre2_dfa_match() encounters an item in the
+       pattern that it does not support, for instance, the use of \C in a  UTF
        mode or a backreference.


          PCRE2_ERROR_DFA_UCOND


-       This return is given if pcre2_dfa_match() encounters a  condition  item
+       This  return  is given if pcre2_dfa_match() encounters a condition item
        that uses a backreference for the condition, or a test for recursion in
        a specific capture group. These are not supported.


          PCRE2_ERROR_DFA_UINVALID_UTF


-       This return is given if pcre2_dfa_match() is called for a pattern  that
-       was  compiled  with  PCRE2_MATCH_INVALID_UTF. This is not supported for
+       This  return is given if pcre2_dfa_match() is called for a pattern that
+       was compiled with PCRE2_MATCH_INVALID_UTF. This is  not  supported  for
        DFA matching.


          PCRE2_ERROR_DFA_WSSIZE


-       This return is given if pcre2_dfa_match() runs  out  of  space  in  the
+       This  return  is  given  if  pcre2_dfa_match() runs out of space in the
        workspace vector.


          PCRE2_ERROR_DFA_RECURSE


        When a recursion or subroutine call is processed, the matching function
-       calls itself recursively, using private  memory  for  the  ovector  and
-       workspace.   This  error  is given if the internal ovector is not large
-       enough. This should be extremely rare, as a  vector  of  size  1000  is
+       calls  itself  recursively,  using  private  memory for the ovector and
+       workspace.  This error is given if the internal ovector  is  not  large
+       enough.  This  should  be  extremely  rare, as a vector of size 1000 is
        used.


          PCRE2_ERROR_DFA_BADRESTART


-       When  pcre2_dfa_match()  is  called  with the PCRE2_DFA_RESTART option,
-       some plausibility checks are made on the  contents  of  the  workspace,
-       which  should  contain data about the previous partial match. If any of
+       When pcre2_dfa_match() is called  with  the  PCRE2_DFA_RESTART  option,
+       some  plausibility  checks  are  made on the contents of the workspace,
+       which should contain data about the previous partial match. If  any  of
        these checks fail, this error is given.



SEE ALSO

-       pcre2build(3),   pcre2callout(3),    pcre2demo(3),    pcre2matching(3),
+       pcre2build(3),    pcre2callout(3),    pcre2demo(3),   pcre2matching(3),
        pcre2partial(3), pcre2posix(3), pcre2sample(3), pcre2unicode(3).



@@ -3721,9 +3710,9 @@
        PCRE2  is distributed with a configure script that can be used to build
        the library in Unix-like environments using the applications  known  as
        Autotools. Also in the distribution are files to support building using
-       CMake instead of configure.  The  text  file  README  contains  general
-       information  about  building  with Autotools (some of which is repeated
-       below), and also has some comments about building on various  operating
+       CMake instead of configure. The text file README contains  general  in-
+       formation  about building with Autotools (some of which is repeated be-
+       low), and also has some comments about building  on  various  operating
        systems.  There  is a lot more information about building PCRE2 without
        using Autotools (including information about using CMake  and  building
        "by  hand")  in  the  text file called NON-AUTOTOOLS-BUILD.  You should
@@ -3746,8 +3735,8 @@
        compiler, as described in NON-AUTOTOOLS-BUILD.


        The complete list of options for configure (which includes the standard
-       ones such as the  selection  of  the  installation  directory)  can  be
-       obtained by running
+       ones such as the selection of the installation directory)  can  be  ob-
+       tained by running


          ./configure --help


@@ -3804,8 +3793,8 @@
          --disable-unicode


        to the configure command. This setting applies to all three  libraries.
-       It  is  not  possible  to  build  one library with Unicode support, and
-       another without, in the same configuration.
+       It  is  not possible to build one library with Unicode support, and an-
+       other without, in the same configuration.


        Of itself, Unicode support does not make PCRE2 treat strings as  UTF-8,
        UTF-16 or UTF-32. To do that, applications that use the library can set
@@ -3831,8 +3820,8 @@
        The \C escape sequence, which matches a single code unit, even in a UTF
        mode,  can  cause unpredictable behaviour because it may leave the cur-
        rent matching point in the middle of a multi-code-unit  character.  The
-       application  can  lock  it  out  by setting the PCRE2_NEVER_BACKSLASH_C
-       option when calling pcre2_compile(). There is also a build-time option
+       application  can lock it out by setting the PCRE2_NEVER_BACKSLASH_C op-
+       tion when calling pcre2_compile(). There is also a build-time option


          --enable-never-backslash-C


@@ -3878,8 +3867,8 @@

          --enable-newline-is-cr


-       to the configure  command.  There  is  also  an  --enable-newline-is-lf
-       option, which explicitly specifies linefeed as the newline character.
+       to the configure command. There is also an  --enable-newline-is-lf  op-
+       tion, which explicitly specifies linefeed as the newline character.


        Alternatively, you can specify that line endings are to be indicated by
        the two-character sequence CRLF (CR immediately followed by LF). If you
@@ -3991,8 +3980,8 @@
        used,  but because the size of each backtracking "frame" depends on the
        number of capturing parentheses in a pattern, the amount of  heap  that
        is  used  before  the  limit is reached varies from pattern to pattern.
-       This limit was more useful in versions  before  10.30,  where  function
-       recursion was used for backtracking.
+       This limit was more useful in versions before 10.30, where function re-
+       cursion was used for backtracking.


        As well as applying to pcre2_match(), the depth limit also controls the
        depth of recursive function calls in pcre2_dfa_match(). These are  used
@@ -4028,8 +4017,8 @@
          --enable-ebcdic --disable-unicode


        to the configure command. This setting implies --enable-rebuild-charta-
-       bles.  You  should  only  use  it if you know that you are in an EBCDIC
-       environment (for example, an IBM mainframe operating system).
+       bles.  You should only use it if you know that you are in an EBCDIC en-
+       vironment (for example, an IBM mainframe operating system).


        It is not possible to support both EBCDIC and UTF-8 codes in  the  same
        version  of  the  library. Consequently, --enable-unicode and --enable-
@@ -4058,9 +4047,9 @@
        erates output using local code, and another that calls an external pro-
        gram or script.  If --disable-pcre2grep-callout-fork is  added  to  the
        configure  command,  only  the  first  kind of callout is supported; if
-       --disable-pcre2grep-callout  is  used,  all  callouts  are   completely
-       ignored. For more details of pcre2grep callouts, see the pcre2grep doc-
-       umentation.
+       --disable-pcre2grep-callout is used, all callouts  are  completely  ig-
+       nored.  For more details of pcre2grep callouts, see the pcre2grep docu-
+       mentation.



 PCRE2GREP OPTIONS FOR COMPRESSED FILE SUPPORT
@@ -4105,8 +4094,8 @@
          --enable-pcre2test-libreadline
          --enable-pcre2test-libedit


-       to  the  configure  command,  pcre2test  is linked with the libreadline
-       orlibedit library, respectively, and when its input is from a terminal,
+       to  the  configure command, pcre2test is linked with the libreadline or
+       libedit library, respectively, and when its input is from  a  terminal,
        it  reads  it using the readline() function. This provides line-editing
        and history facilities. Note that libreadline is  GPL-licensed,  so  if
        you  distribute  a binary of pcre2test linked in this way, there may be
@@ -4149,14 +4138,13 @@
          --enable-valgrind


        to the configure command, PCRE2 will use valgrind annotations  to  mark
-       certain  memory  regions  as  unaddressable.  This  allows it to detect
-       invalid memory accesses, and  is  mostly  useful  for  debugging  PCRE2
-       itself.
+       certain  memory  regions as unaddressable. This allows it to detect in-
+       valid memory accesses, and is mostly useful for debugging PCRE2 itself.



CODE COVERAGE REPORTING

-       If  your  C  compiler is gcc, you can build a version of PCRE2 that can
+       If your C compiler is gcc, you can build a version of  PCRE2  that  can
        generate a code coverage report for its test suite. To enable this, you
        must install lcov version 1.6 or above. Then specify


@@ -4165,7 +4153,7 @@
        to the configure command and build PCRE2 in the usual way.


        Note that using ccache (a caching C compiler) is incompatible with code
-       coverage reporting. If you have configured ccache to run  automatically
+       coverage  reporting. If you have configured ccache to run automatically
        on your system, you must set the environment variable


          CCACHE_DISABLE=1
@@ -4172,13 +4160,13 @@


        before running make to build PCRE2, so that ccache is not used.


-       When  --enable-coverage  is  used,  the  following addition targets are
+       When --enable-coverage is used, the following  additional  targets  are
        added to the Makefile:


          make coverage


-       This creates a fresh coverage report for the PCRE2 test  suite.  It  is
-       equivalent  to running "make coverage-reset", "make coverage-baseline",
+       This  creates  a  fresh coverage report for the PCRE2 test suite. It is
+       equivalent to running "make coverage-reset", "make  coverage-baseline",
        "make check", and then "make coverage-report".


          make coverage-reset
@@ -4195,29 +4183,29 @@


          make coverage-clean-report


-       This removes the generated coverage report without cleaning the  cover-
+       This  removes the generated coverage report without cleaning the cover-
        age data itself.


          make coverage-clean-data


-       This  removes  the captured coverage data without removing the coverage
+       This removes the captured coverage data without removing  the  coverage
        files created at compile time (*.gcno).


          make coverage-clean


-       This cleans all coverage data including the generated coverage  report.
-       For  more  information about code coverage, see the gcov and lcov docu-
+       This  cleans all coverage data including the generated coverage report.
+       For more information about code coverage, see the gcov and  lcov  docu-
        mentation.



DISABLING THE Z AND T FORMATTING MODIFIERS

-       The C99 standard defines formatting modifiers z and t  for  size_t  and
-       ptrdiff_t  values, respectively. By default, PCRE2 uses these modifiers
-       in environments other than Microsoft  Visual  Studio  when  __STDC_VER-
-       SION__  is  defined  and  has  a value greater than or equal to 199901L
-       (indicating C99).  However, there is  at  least  one  environment  that
-       claims to be C99 but does not support these modifiers. If
+       The  C99  standard  defines formatting modifiers z and t for size_t and
+       ptrdiff_t values, respectively. By default, PCRE2 uses these  modifiers
+       in  environments  other  than  Microsoft Visual Studio when __STDC_VER-
+       SION__ is defined and has a value greater than or equal to 199901L (in-
+       dicating  C99).  However, there is at least one environment that claims
+       to be C99 but does not support these modifiers. If


          --disable-percent-zt


@@ -4227,39 +4215,39 @@

SUPPORT FOR FUZZERS

-       There is a special option for use by people who  want  to  run  fuzzing
+       There  is  a  special  option for use by people who want to run fuzzing
        tests on PCRE2:


          --enable-fuzz-support


        At present this applies only to the 8-bit library. If set, it causes an
-       extra library  called  libpcre2-fuzzsupport.a  to  be  built,  but  not
-       installed.  This contains a single function called LLVMFuzzerTestOneIn-
-       put() whose arguments are a pointer to a string and the length  of  the
-       string.  When  called,  this  function tries to compile the string as a
-       pattern, and if that succeeds, to match it.  This is done both with  no
-       options  and  with some random options bits that are generated from the
+       extra  library  called  libpcre2-fuzzsupport.a to be built, but not in-
+       stalled. This contains a single  function  called  LLVMFuzzerTestOneIn-
+       put()  whose  arguments are a pointer to a string and the length of the
+       string. When called, this function tries to compile  the  string  as  a
+       pattern,  and if that succeeds, to match it.  This is done both with no
+       options and with some random options bits that are generated  from  the
        string.


-       Setting --enable-fuzz-support also causes  a  binary  called  pcre2fuz-
-       zcheck  to be created. This is normally run under valgrind or used when
+       Setting  --enable-fuzz-support  also  causes  a binary called pcre2fuz-
+       zcheck to be created. This is normally run under valgrind or used  when
        PCRE2 is compiled with address sanitizing enabled. It calls the fuzzing
-       function  and  outputs  information  about  what it is doing. The input
-       strings are specified by arguments: if an argument starts with "="  the
-       rest  of it is a literal input string. Otherwise, it is assumed to be a
+       function and outputs information about what  it  is  doing.  The  input
+       strings  are specified by arguments: if an argument starts with "=" the
+       rest of it is a literal input string. Otherwise, it is assumed to be  a
        file name, and the contents of the file are the test string.



OBSOLETE OPTION

-       In versions of PCRE2 prior to 10.30, there were two  ways  of  handling
-       backtracking  in the pcre2_match() function. The default was to use the
+       In  versions  of  PCRE2 prior to 10.30, there were two ways of handling
+       backtracking in the pcre2_match() function. The default was to use  the
        system stack, but if


          --disable-stack-for-recursion


-       was set, memory on the heap was used. From release 10.30  onwards  this
-       has  changed  (the  stack  is  no longer used) and this option now does
+       was  set,  memory on the heap was used. From release 10.30 onwards this
+       has changed (the stack is no longer used)  and  this  option  now  does
        nothing except give a warning.



@@ -4426,8 +4414,8 @@
          No match


        This  shows  that all match attempts start at the beginning of the sub-
-       ject. In other words, the pattern is anchored.  You  can  disable  this
-       optimization  by passing PCRE2_NO_DOTSTAR_ANCHOR to pcre2_compile(), or
+       ject. In other words, the pattern is anchored. You can disable this op-
+       timization  by  passing  PCRE2_NO_DOTSTAR_ANCHOR to pcre2_compile(), or
        starting the pattern with (*NO_DOTSTAR_ANCHOR). In this case, the  out-
        put changes to:


@@ -4538,10 +4526,10 @@
        (the "ovector"). You may read the elements in this vector, but you must
        not change any of them.


-       For calls to pcre2_match(),  the  offset_vector  field  is  not  (since
-       release  10.30)  a pointer to the actual ovector that was passed to the
-       matching function in the match data block.  Instead  it  points  to  an
-       internal  ovector  of a size large enough to hold all possible captured
+       For calls to pcre2_match(), the offset_vector field is not  (since  re-
+       lease  10.30)  a  pointer  to the actual ovector that was passed to the
+       matching function in the match data block. Instead it points to an  in-
+       ternal  ovector  of  a  size large enough to hold all possible captured
        substrings in the pattern. Note that whenever a recursion or subroutine
        call  within  a pattern completes, the capturing state is reset to what
        it was before.
@@ -4554,8 +4542,8 @@
        differ   by   one;  for  example,  when  the  callout  in  the  pattern
        ((a)(b))(?C2) is taken, capture_last is 1 but capture_top is 4.


-       The  contents  of  ovector[2]  to  ovector[<capture_top>*2-1]  can   be
-       inspected in order to extract substrings that have been matched so far,
+       The contents of ovector[2] to  ovector[<capture_top>*2-1]  can  be  in-
+       spected  in  order to extract substrings that have been matched so far,
        in the same way as extracting substrings after a match  has  completed.
        The  values in ovector[0] and ovector[1] are always PCRE2_UNSET because
        the match is by definition not complete. Substrings that have not  been
@@ -4574,8 +4562,8 @@
        were passed to the matching function.


        The  start_match  field normally contains the offset within the subject
-       at which the current match attempt  started.  However,  if  the  escape
-       sequence  \K has been encountered, this value is changed to reflect the
+       at which the current match attempt started. However, if the escape  se-
+       quence  \K  has  been encountered, this value is changed to reflect the
        modified starting point. If the pattern is not  anchored,  the  callout
        function may be called several times from the same point in the pattern
        for different starting points in the subject.
@@ -4649,11 +4637,10 @@
        failed. If the value is less than zero, the match is abandoned, and the
        matching function returns the negative value.


-       Negative  values  should  normally  be   chosen   from   the   set   of
-       PCRE2_ERROR_xxx  values.  In  particular,  PCRE2_ERROR_NOMATCH forces a
-       standard "no match" failure. The error  number  PCRE2_ERROR_CALLOUT  is
-       reserved  for  use by callout functions; it will never be used by PCRE2
-       itself.
+       Negative values should normally be chosen from  the  set  of  PCRE2_ER-
+       ROR_xxx  values.  In  particular, PCRE2_ERROR_NOMATCH forces a standard
+       "no match" failure. The error number  PCRE2_ERROR_CALLOUT  is  reserved
+       for use by callout functions; it will never be used by PCRE2 itself.



 CALLOUT ENUMERATION
@@ -4663,14 +4650,14 @@
          void *user_data);


        A script language that supports the use of string arguments in callouts
-       might  like  to  scan  all the callouts in a pattern before running the
+       might like to scan all the callouts in a  pattern  before  running  the
        match. This can be done by calling pcre2_callout_enumerate(). The first
-       argument  is  a  pointer  to a compiled pattern, the second points to a
-       callback function, and the third is arbitrary user data.  The  callback
-       function  is  called  for  every callout in the pattern in the order in
+       argument is a pointer to a compiled pattern, the  second  points  to  a
+       callback  function,  and the third is arbitrary user data. The callback
+       function is called for every callout in the pattern  in  the  order  in
        which they appear. Its first argument is a pointer to a callout enumer-
-       ation  block,  and  its second argument is the user_data value that was
-       passed to pcre2_callout_enumerate(). The data block contains  the  fol-
+       ation block, and its second argument is the user_data  value  that  was
+       passed  to  pcre2_callout_enumerate(). The data block contains the fol-
        lowing fields:


          version                Block version number
@@ -4681,17 +4668,17 @@
          callout_string_length  Length of callout string
          callout_string         Points to callout string or is NULL


-       The  version  number is currently 0. It will increase if new fields are
-       ever added to the block. The remaining fields are  the  same  as  their
-       namesakes  in  the pcre2_callout block that is used for callouts during
+       The version number is currently 0. It will increase if new  fields  are
+       ever  added  to  the  block. The remaining fields are the same as their
+       namesakes in the pcre2_callout block that is used for  callouts  during
        matching, as described above.


-       Note that the value of pattern_position is  unique  for  each  callout.
-       However,  if  a callout occurs inside a group that is quantified with a
+       Note  that  the  value  of pattern_position is unique for each callout.
+       However, if a callout occurs inside a group that is quantified  with  a
        non-zero minimum or a fixed maximum, the group is replicated inside the
-       compiled  pattern.  For example, a pattern such as /(a){2}/ is compiled
-       as if it were /(a)(a)/. This means that the callout will be  enumerated
-       more  than  once,  but with the same value for pattern_position in each
+       compiled pattern. For example, a pattern such as /(a){2}/  is  compiled
+       as  if it were /(a)(a)/. This means that the callout will be enumerated
+       more than once, but with the same value for  pattern_position  in  each
        case.


        The callback function should normally return zero. If it returns a non-
@@ -4723,9 +4710,9 @@
 DIFFERENCES BETWEEN PCRE2 AND PERL


        This document describes the differences in the ways that PCRE2 and Perl
-       handle regular expressions. The differences  described  here  are  with
-       respect  to Perl versions 5.26, but as both Perl and PCRE2 are continu-
-       ally changing, the information may sometimes be out of date.
+       handle regular expressions. The differences described here are with re-
+       spect to Perl versions 5.26, but as both Perl and PCRE2 are continually
+       changing, the information may sometimes be out of date.


        1. PCRE2 has only a subset of Perl's Unicode support. Details  of  what
        it does have are given in the pcre2unicode page.
@@ -4732,8 +4719,8 @@


        2.  Like  Perl, PCRE2 allows repeat quantifiers on parenthesized asser-
        tions, but they do not mean what you might think. For example, (?!a){3}
-       does  not  assert  that  the next three characters are not "a". It just
-       asserts that the next character is not "a" three times  (in  principle;
+       does not assert that the next three characters are not "a". It just as-
+       serts that the next character is not "a"  three  times  (in  principle;
        PCRE2  optimizes this to run the assertion just once). Perl allows some
        repeat quantifiers on other  assertions,  for  example,  \b*  (but  not
        \b{3}), but these do not seem to have any use.
@@ -4767,11 +4754,11 @@
        in between are treated as literals. However, this is slightly different
        from Perl in that $ and @ are  also  handled  as  literals  inside  the
        quotes. In Perl, they cause variable interpolation (but of course PCRE2
-       does not have variables). Also,  Perl  does  "double-quotish  backslash
-       interpolation" on any backslashes between \Q and \E which, its documen-
-       tation says, "may lead to confusing results". PCRE2 treats a  backslash
-       between  \Q  and  \E  just like any other character. Note the following
-       examples:
+       does not have variables). Also, Perl does "double-quotish backslash in-
+       terpolation" on any backslashes between \Q and \E which, its documenta-
+       tion says, "may lead to confusing results". PCRE2  treats  a  backslash
+       between \Q and \E just like any other character. Note the following ex-
+       amples:


            Pattern            PCRE2 matches     Perl matches


@@ -4814,8 +4801,8 @@

        12. There are some differences that are concerned with the settings  of
        captured  strings  when  part  of  a  pattern is repeated. For example,
-       matching "aba" against the  pattern  /^(a(b)?)+$/  in  Perl  leaves  $2
-       unset, but in PCRE2 it is set to "b".
+       matching "aba" against the pattern /^(a(b)?)+$/ in Perl leaves  $2  un-
+       set, but in PCRE2 it is set to "b".


        13.  PCRE2's  handling  of duplicate capture group numbers and names is
        not as general as Perl's. This is a consequence of the fact  the  PCRE2
@@ -4845,10 +4832,10 @@
        \p{Ll} match all letters, regardless of case, when case independence is
        specified.


-       17.  PCRE2  provides  some  extensions  to  the Perl regular expression
-       facilities.  Perl 5.10 includes new features that are  not  in  earlier
-       versions  of  Perl,  some  of which (such as named parentheses) were in
-       PCRE2 for some time before. This list is with respect to Perl 5.26:
+       17.  PCRE2  provides some extensions to the Perl regular expression fa-
+       cilities.  Perl 5.10 includes new features that are not in earlier ver-
+       sions  of Perl, some of which (such as named parentheses) were in PCRE2
+       for some time before. This list is with respect to Perl 5.26:


        (a) Although lookbehind assertions in PCRE2  must  match  fixed  length
        strings,  each alternative branch of a lookbehind assertion can match a
@@ -4892,8 +4879,8 @@
        changed within the pattern.


        18.  The  Perl  /a modifier restricts /d numbers to pure ascii, and the
-       /aa modifier restricts /i  case-insensitive  matching  to  pure  ascii,
-       ignoring  Unicode  rules.  This  separation  cannot be represented with
+       /aa modifier restricts /i case-insensitive matching to pure ascii,  ig-
+       noring  Unicode  rules.  This  separation  cannot  be  represented with
        PCRE2_UCP.


        19. Perl has different limits than PCRE2. See the pcre2limit documenta-
@@ -4986,8 +4973,8 @@


        There  is  a limit to the size of pattern that JIT supports, imposed by
        the size of machine stack that it uses. The exact rules are  not  docu-
-       mented  because  they  may  change at any time, in particular, when new
-       optimizations are introduced.  If a pattern  is  too  big,  a  call  to
+       mented because they may change at any time, in particular, when new op-
+       timizations are introduced.  If  a  pattern  is  too  big,  a  call  to
        pcre2_jit_compile() returns PCRE2_ERROR_NOMEMORY.


        PCRE2_JIT_COMPLETE  requests the JIT compiler to generate code for com-
@@ -5013,8 +5000,8 @@
        the entire compiled pattern is freed by calling pcre2_code_free().


        In  some circumstances you may need to call additional functions. These
-       are described in the  section  entitled  "Controlling  the  JIT  stack"
-       below.
+       are described in the section entitled "Controlling the JIT  stack"  be-
+       low.


        There are some pcre2_match() options that are not supported by JIT, and
        there are also some pattern items that JIT cannot handle.  Details  are
@@ -5029,8 +5016,8 @@


        If the JIT compiler finds an unsupported item, no JIT  data  is  gener-
        ated.  You  can find out if JIT matching is available after compiling a
-       pattern by calling  pcre2_pattern_info()  with  the  PCRE2_INFO_JITSIZE
-       option.  A non-zero result means that JIT compilation was successful. A
+       pattern by calling pcre2_pattern_info() with the PCRE2_INFO_JITSIZE op-
+       tion.  A  non-zero  result means that JIT compilation was successful. A
        result of 0 means that JIT support is not available, or the pattern was
        not  processed by pcre2_jit_compile(), or the JIT compiler was not able
        to handle the pattern.
@@ -5039,9 +5026,9 @@
 MATCHING SUBJECTS CONTAINING INVALID UTF


        When a pattern is compiled with the PCRE2_UTF option,  subject  strings
-       are  normally  expected  to  be  a valid sequence of UTF code units. By
-       default, this is checked at the start of matching and an error is  gen-
-       erated if invalid UTF is detected. The PCRE2_NO_UTF_CHECK option can be
+       are  normally expected to be a valid sequence of UTF code units. By de-
+       fault, this is checked at the start of matching and an error is  gener-
+       ated  if  invalid UTF is detected. The PCRE2_NO_UTF_CHECK option can be
        passed to pcre2_match() to skip the check (for improved performance) if
        you  are  sure  that  a subject string is valid. If this option is used
        with an invalid string, the result is undefined.
@@ -5050,8 +5037,8 @@
        UTF   sequences   is   available.   Calling  pcre2_compile()  with  the
        PCRE2_MATCH_INVALID_UTF option has two effects:  it  tells  the  inter-
        preter  in pcre2_match() to support invalid UTF, and, if pcre2_jit_com-
-       pile() is called, the compiled JIT  code  also  supports  invalid  UTF.
-       Details of how this support works, in both the JIT and the interpretive
+       pile() is called, the compiled JIT code also supports invalid UTF.  De-
+       tails  of  how this support works, in both the JIT and the interpretive
        cases, is given in the pcre2unicode documentation.


        There  is  also  an  obsolete  option  for  pcre2_jit_compile()  called
@@ -5096,11 +5083,11 @@


        When the compiled JIT code runs, it needs a block of memory to use as a
        stack.  By default, it uses 32KiB on the machine stack.  However,  some
-       large   or   complicated  patterns  need  more  than  this.  The  error
-       PCRE2_ERROR_JIT_STACKLIMIT is given when there  is  not  enough  stack.
-       Three  functions  are provided for managing blocks of memory for use as
-       JIT stacks. There is further discussion about the use of JIT stacks  in
-       the section entitled "JIT stack FAQ" below.
+       large  or complicated patterns need more than this. The error PCRE2_ER-
+       ROR_JIT_STACKLIMIT is given when there is not enough stack. Three func-
+       tions are provided for managing blocks of memory for use as JIT stacks.
+       There is further discussion about the use of JIT stacks in the  section
+       entitled "JIT stack FAQ" below.


        The  pcre2_jit_stack_create()  function  creates a JIT stack. Its argu-
        ments are a starting size, a maximum size, and a general  context  (for
@@ -5144,8 +5131,8 @@
        A  callback function is obeyed whenever JIT code is about to be run; it
        is not obeyed when pcre2_match() is called with options that are incom-
        patible  for JIT matching. A callback function can therefore be used to
-       determine whether a match operation was  executed  by  JIT  or  by  the
-       interpreter.
+       determine whether a match operation was executed by JIT or by  the  in-
+       terpreter.


        You may safely use the same JIT stack for more than one pattern (either
        by assigning directly or by callback), as  long  as  the  patterns  are
@@ -5155,10 +5142,10 @@
        stack to the one used for currently suspended match(es).


        In a multithread application, if you do not specify a JIT stack, or  if
-       you  assign  or  pass  back  NULL from a callback, that is thread-safe,
-       because each thread has its own machine stack. However, if  you  assign
-       or  pass  back a non-NULL JIT stack, this must be a different stack for
-       each thread so that the application is thread-safe.
+       you  assign or pass back NULL from a callback, that is thread-safe, be-
+       cause each thread has its own machine stack. However, if you assign  or
+       pass back a non-NULL JIT stack, this must be a different stack for each
+       thread so that the application is thread-safe.


        Strictly speaking, even more is allowed. You can assign the  same  non-
        NULL  stack  to a match context that is used by any number of patterns,
@@ -5197,11 +5184,11 @@


        (2) Why don't we simply allocate blocks of memory with malloc()?


-       Modern  operating  systems  have  a  nice  feature: they can reserve an
-       address space instead of allocating memory. We can safely allocate mem-
-       ory  pages  inside  this address space, so the stack could grow without
-       moving memory data (this is important because of pointers). Thus we can
-       allocate 1MiB address space, and use only a single memory page (usually
+       Modern  operating  systems have a nice feature: they can reserve an ad-
+       dress space instead of allocating memory. We can safely allocate memory
+       pages inside this address space, so the stack could grow without moving
+       memory data (this is important because of pointers). Thus we can  allo-
+       cate  1MiB  address  space,  and use only a single memory page (usually
        4KiB) if that is enough. However, we can still grow up to 1MiB  anytime
        if needed.


@@ -5304,12 +5291,12 @@
        matching  directly instead of calling pcre2_match() (obviously only for
        patterns that have been successfully processed by pcre2_jit_compile()).


-       The fast path  function  is  called  pcre2_jit_match(),  and  it  takes
-       exactly  the  same  arguments  as  pcre2_match().  However, the subject
-       string must be specified with a length;  PCRE2_ZERO_TERMINATED  is  not
-       supported.   Unsupported  option  bits  (for  example,  PCRE2_ANCHORED,
-       PCRE2_ENDANCHORED and PCRE2_COPY_MATCHED_SUBJECT) are  ignored,  as  is
-       the  PCRE2_NO_JIT  option.  The  return values are also the same as for
+       The fast path function is called pcre2_jit_match(), and  it  takes  ex-
+       actly  the same arguments as pcre2_match(). However, the subject string
+       must be specified with a  length;  PCRE2_ZERO_TERMINATED  is  not  sup-
+       ported. Unsupported option bits (for example, PCRE2_ANCHORED, PCRE2_EN-
+       DANCHORED  and  PCRE2_COPY_MATCHED_SUBJECT)  are  ignored,  as  is  the
+       PCRE2_NO_JIT  option.  The  return  values  are  also  the  same as for
        pcre2_match(), plus PCRE2_ERROR_JIT_BADOPTION if a matching mode  (par-
        tial or complete) is requested that was not compiled.


@@ -5357,14 +5344,14 @@

        The maximum size of a compiled pattern  is  approximately  64  thousand
        code units for the 8-bit and 16-bit libraries if PCRE2 is compiled with
-       the  default  internal  linkage  size,  which  is  2  bytes  for  these
-       libraries.  If  you  want to process regular expressions that are truly
+       the default internal linkage size, which  is  2  bytes  for  these  li-
+       braries.  If  you  want  to  process regular expressions that are truly
        enormous, you can compile PCRE2 with an internal linkage size of 3 or 4
        (when  building  the  16-bit  library,  3  is rounded up to 4). See the
        README file in the source distribution and the pcre2build documentation
        for  details.  In  these cases the limit is substantially larger.  How-
-       ever, the speed of execution is slower.  In  the  32-bit  library,  the
-       internal linkage size is always 4.
+       ever, the speed of execution is slower. In the 32-bit library, the  in-
+       ternal linkage size is always 4.


        The maximum length of a source pattern string is essentially unlimited;
        it is the largest number a PCRE2_SIZE variable can hold.  However,  the
@@ -5371,10 +5358,10 @@
        program that calls pcre2_compile() can specify a smaller limit.


        The maximum length (in code units) of a subject string is one less than
-       the largest number a PCRE2_SIZE variable can  hold.  PCRE2_SIZE  is  an
-       unsigned  integer  type,  usually  defined as size_t. Its maximum value
-       (that is ~(PCRE2_SIZE)0) is reserved as a special indicator  for  zero-
-       terminated strings and unset offsets.
+       the largest number a PCRE2_SIZE variable can hold. PCRE2_SIZE is an un-
+       signed integer type, usually defined as size_t. Its maximum value (that
+       is ~(PCRE2_SIZE)0) is reserved as a special indicator  for  zero-termi-
+       nated strings and unset offsets.


        All values in repeating quantifiers must be less than 65536.


@@ -5432,8 +5419,8 @@

        An alternative algorithm is provided by the pcre2_dfa_match() function;
        it operates in a different way, and is not Perl-compatible. This alter-
-       native  has  advantages  and  disadvantages  compared with the standard
-       algorithm, and these are described below.
+       native  has advantages and disadvantages compared with the standard al-
+       gorithm, and these are described below.


        When there is only one possible way in which a given subject string can
        match  a pattern, the two algorithms give the same answer. A difference
@@ -5501,8 +5488,8 @@
        Although the general principle of this matching algorithm  is  that  it
        scans  the subject string only once, without backtracking, there is one
        exception: when a lookaround assertion is encountered,  the  characters
-       following  or  preceding  the  current  point  have to be independently
-       inspected.
+       following  or  preceding the current point have to be independently in-
+       spected.


        The scan continues until either the end of the subject is  reached,  or
        there  are  no more unterminated paths. At this point, terminated paths
@@ -5536,9 +5523,9 @@
        not  supported  or behave differently in the alternative matching func-
        tion. Those that are not supported cause an error if encountered.


-       1. Because the algorithm finds all  possible  matches,  the  greedy  or
-       ungreedy  nature  of  repetition quantifiers is not relevant (though it
-       may affect auto-possessification, as just described). During  matching,
+       1. Because the algorithm finds all possible matches, the greedy or  un-
+       greedy  nature of repetition quantifiers is not relevant (though it may
+       affect auto-possessification,  as  just  described).  During  matching,
        greedy  and  ungreedy  quantifiers are treated in exactly the same way.
        However, possessive quantifiers can make a difference when what follows
        could  also  match  what  is  quantified, for example in a pattern like
@@ -5567,8 +5554,8 @@


        5. Again for the same reason, script runs are not supported.


-       6.  Because  many  paths  through the tree may be active, the \K escape
-       sequence, which resets the start of the match when encountered (but may
+       6. Because many paths through the tree may be active, the \K escape se-
+       quence, which resets the start of the match when encountered  (but  may
        be on some paths and not on others), is not supported.


        7.  Callouts  are  supported, but the value of the capture_top field is
@@ -5660,11 +5647,11 @@


        If the application sees the user's keystrokes one by one, and can check
        that what has been typed so far is potentially valid,  it  is  able  to
-       raise  an  error  as  soon  as  a  mistake  is made, by beeping and not
-       reflecting the character that has been typed, for example. This immedi-
-       ate  feedback is likely to be a better user interface than a check that
-       is delayed until the entire string has been entered.  Partial  matching
-       can  also be useful when the subject string is very long and is not all
+       raise  an  error  as  soon as a mistake is made, by beeping and not re-
+       flecting the character that has been typed, for example. This immediate
+       feedback  is  likely to be a better user interface than a check that is
+       delayed until the entire string has been entered. Partial matching  can
+       also  be  useful  when  the  subject string is very long and is not all
        available at once.


        PCRE2 supports partial matching by means of the PCRE2_PARTIAL_SOFT  and
@@ -5671,8 +5658,8 @@
        PCRE2_PARTIAL_HARD  options,  which  can be set when calling a matching
        function.  The difference between the two options is whether or  not  a
        partial match is preferred to an alternative complete match, though the
-       details differ between the two types  of  matching  function.  If  both
-       options are set, PCRE2_PARTIAL_HARD takes precedence.
+       details differ between the two types of matching function. If both  op-
+       tions are set, PCRE2_PARTIAL_HARD takes precedence.


        If  you  want to use partial matching with just-in-time optimized code,
        you must call pcre2_jit_compile() with one or both of these options:
@@ -5684,8 +5671,8 @@
        tial  matches  on the same pattern. If the appropriate JIT mode has not
        been compiled, interpretive matching code is used.


-       Setting a partial matching option  disables  two  of  PCRE2's  standard
-       optimizations. PCRE2 remembers the last literal code unit in a pattern,
+       Setting a partial matching option disables two of PCRE2's standard  op-
+       timizations.  PCRE2  remembers the last literal code unit in a pattern,
        and abandons matching immediately if it is not present in  the  subject
        string.  This  optimization  cannot  be  used for a subject string that
        might match only partially. PCRE2 also knows the minimum  length  of  a
@@ -5808,8 +5795,8 @@
        Because the DFA functions always search for all possible  matches,  and
        there  is  no  difference between greedy and ungreedy repetition, their
        behaviour is different from  the  standard  functions  when  PCRE2_PAR-
-       TIAL_HARD  is  set.  Consider  the  string  "dog"  matched  against the
-       ungreedy pattern shown above:
+       TIAL_HARD  is  set.  Consider  the string "dog" matched against the un-
+       greedy pattern shown above:


          /dog(sbody)??/


@@ -5887,11 +5874,11 @@
        to.


        That means that, for an unanchored pattern, if a continued match fails,
-       it is not possible to try again at  a  new  starting  point.  All  this
-       facility  is  capable  of  doing  is continuing with the previous match
-       attempt. In the previous example, if the second set of data  is  "ug23"
-       the  result is no match, even though there would be a match for "aug23"
-       if the entire string were given at once. Depending on the  application,
+       it is not possible to try again at a new starting point. All  this  fa-
+       cility  is  capable  of doing is continuing with the previous match at-
+       tempt. In the previous example, if the second set of data is "ug23" the
+       result  is  no match, even though there would be a match for "aug23" if
+       the entire string were given at once.  Depending  on  the  application,
        this may or may not be what you want.  The only way to allow for start-
        ing again at the next character is to retain the matched  part  of  the
        subject and try a new complete match.
@@ -5933,8 +5920,8 @@


        1. If the pattern contains a test for the beginning of a line, you need
        to  pass  the  PCRE2_NOTBOL option when the subject string for any call
-       does start at the beginning of a line. There  is  also  a  PCRE2_NOTEOL
-       option, but in practice when doing multi-segment matching you should be
+       does start at the beginning of a line. There is also a PCRE2_NOTEOL op-
+       tion,  but  in practice when doing multi-segment matching you should be
        using PCRE2_PARTIAL_HARD, which includes the effect of PCRE2_NOTEOL.


        2. If a pattern contains a lookbehind assertion, characters  that  pre-
@@ -6053,8 +6040,8 @@
          1234|ABCD


        where  no  string can be a partial match for both alternatives. This is
-       not a problem if a standard matching  function  is  used,  because  the
-       entire match has to be rerun each time:
+       not a problem if a standard matching function is used, because the  en-
+       tire match has to be rerun each time:


            re> /1234|3789/
          data> ABC123\=ph
@@ -6103,10 +6090,10 @@


        Perl's  regular expressions are described in its own documentation, and
        regular expressions in general are covered in a number of  books,  some
-       of  which  have  copious  examples. Jeffrey Friedl's "Mastering Regular
-       Expressions", published by  O'Reilly,  covers  regular  expressions  in
-       great  detail.  This  description  of  PCRE2's  regular  expressions is
-       intended as reference material.
+       of which have copious examples. Jeffrey Friedl's "Mastering Regular Ex-
+       pressions", published by O'Reilly, covers regular expressions in  great
+       detail.  This description of PCRE2's regular expressions is intended as
+       reference material.


        This document discusses the regular expression patterns that  are  sup-
        ported  by  PCRE2  when  its  main matching function, pcre2_match(), is
@@ -6124,8 +6111,8 @@
        set by special items at the start of a pattern. These are not Perl-com-
        patible,  but  are provided to make these options accessible to pattern
        writers who are not able to change the program that processes the  pat-
-       tern.  Any  number  of  these  items  may  appear, but they must all be
-       together right at the start of the pattern string, and the letters must
+       tern.  Any  number  of these items may appear, but they must all be to-
+       gether right at the start of the pattern string, and the  letters  must
        be in upper case.


    UTF support
@@ -6144,8 +6131,8 @@


        Some applications that allow their users to supply patterns may wish to
        restrict  them  to  non-UTF  data  for   security   reasons.   If   the
-       PCRE2_NEVER_UTF  option  is  passed  to  pcre2_compile(), (*UTF) is not
-       allowed, and its appearance in a pattern causes an error.
+       PCRE2_NEVER_UTF  option is passed to pcre2_compile(), (*UTF) is not al-
+       lowed, and its appearance in a pattern causes an error.


    Unicode property support


@@ -6165,8 +6152,8 @@
        Starting a pattern with (*NOTEMPTY) or (*NOTEMPTY_ATSTART) has the same
        effect as passing the PCRE2_NOTEMPTY or  PCRE2_NOTEMPTY_ATSTART  option
        to whichever matching function is subsequently called to match the pat-
-       tern. These options lock out the  matching  of  empty  strings,  either
-       entirely, or only at the start of the subject.
+       tern. These options lock out the matching of empty strings, either  en-
+       tirely, or only at the start of the subject.


    Disabling auto-possessification


@@ -6258,8 +6245,8 @@
          (*NUL)       the NUL character (binary zero)


        These override the default and the options given to the compiling func-
-       tion. For example, on a Unix system where LF  is  the  default  newline
-       sequence, the pattern
+       tion. For example, on a Unix system where LF is the default newline se-
+       quence, the pattern


          (*CR)a.b


@@ -6271,8 +6258,8 @@
        tions are true. It also affects the interpretation of the dot metachar-
        acter  when  PCRE2_DOTALL  is not set, and the behaviour of \N when not
        followed by an opening brace. However, it does not affect what  the  \R
-       escape  sequence  matches.  By  default,  this  is  any Unicode newline
-       sequence, for Perl compatibility. However, this can be changed; see the
+       escape  sequence  matches.  By default, this is any Unicode newline se-
+       quence, for Perl compatibility. However, this can be changed;  see  the
        next section and the description of \R in the section entitled "Newline
        sequences" below. A change of \R setting can be combined with a  change
        of newline convention.
@@ -6430,9 +6417,9 @@
        as \x{dc} or \334.  However, using the braced versions does  make  such
        sequences easier to read.


-       Support  is  available  for  some  ECMAScript  (aka  JavaScript) escape
-       sequences via two compile-time options. If PCRE2_ALT_BSUX is  set,  the
-       sequence  \x followed by { is not recognized. Only if \x is followed by
+       Support  is  available  for some ECMAScript (aka JavaScript) escape se-
+       quences via two compile-time options. If PCRE2_ALT_BSUX is set, the se-
+       quence  \x  followed  by { is not recognized. Only if \x is followed by
        two hexadecimal digits is it recognized as a character  escape.  Other-
        wise  it  is interpreted as a literal "x" character. In this mode, sup-
        port for code points greater than 256 is provided by \u, which must  be
@@ -6439,10 +6426,10 @@
        followed  by  four hexadecimal digits; otherwise it is interpreted as a
        literal "u" character.


-       PCRE2_EXTRA_ALT_BSUX has the same  effect  as  PCRE2_ALT_BSUX  and,  in
-       addition,  \u{hhh..}  is recognized as the character specified by hexa-
-       decimal code point.  There may be any  number  of  hexadecimal  digits.
-       This syntax is from ECMAScript 6.
+       PCRE2_EXTRA_ALT_BSUX has the same effect as PCRE2_ALT_BSUX and, in  ad-
+       dition, \u{hhh..} is recognized as the character specified by hexadeci-
+       mal code point.  There may be any number of  hexadecimal  digits.  This
+       syntax is from ECMAScript 6.


        The  \N{U+hhh..} escape sequence is recognized only when PCRE2 is oper-
        ating in UTF mode. Perl also uses \N{name}  to  specify  characters  by
@@ -6450,8 +6437,8 @@
        followed by an opening brace (curly bracket) it has an entirely differ-
        ent meaning, matching any character that is not a newline.


-       There  are  some  legacy  applications  where the escape sequence \r is
-       expected to match a newline. If the PCRE2_EXTRA_ESCAPED_CR_IS_LF option
+       There  are some legacy applications where the escape sequence \r is ex-
+       pected to match a newline. If the  PCRE2_EXTRA_ESCAPED_CR_IS_LF  option
        is  set,  \r  in  a  pattern is converted to \n so that it matches a LF
        (linefeed) instead of a CR (carriage return) character.


@@ -6469,8 +6456,8 @@
        one  of @, [, \, ], ^, _, or ?. Any other character provokes a compile-
        time error. The sequence \c@ encodes character code  0;  after  \c  the
        letters  (in either case) encode characters 1-26 (hex 01 to hex 1A); [,
-       \, ], ^, and _ encode characters 27-31 (hex 1B  to  hex  1F),  and  \c?
-       becomes either 255 (hex FF) or 95 (hex 5F).
+       \, ], ^, and _ encode characters 27-31 (hex 1B to hex 1F), and \c?  be-
+       comes either 255 (hex FF) or 95 (hex 5F).


        Thus,  apart  from  \c?, these escapes generate the same character code
        values as they do in an ASCII environment, though the meanings  of  the
@@ -6486,8 +6473,8 @@
        95; otherwise it generates 255.


        After \0 up to two further octal digits are read. If  there  are  fewer
-       than  two  digits,  just  those  that  are  present  are used. Thus the
-       sequence \0\x\015 specifies two binary zeros followed by a CR character
+       than  two  digits,  just  those that are present are used. Thus the se-
+       quence \0\x\015 specifies two binary zeros followed by a  CR  character
        (code value 13). Make sure you supply two digits after the initial zero
        if the pattern character that follows is itself an octal digit.


@@ -6570,14 +6557,14 @@
        In  Perl,  the  sequences  \F, \l, \L, \u, and \U are recognized by its
        string handler and used to modify the case of following characters.  By
        default,  PCRE2  does  not  support these escape sequences in patterns.
-       However,  if  either  of  the  PCRE2_ALT_BSUX  or  PCRE2_EXTRA_ALT_BSUX
-       options  is  set,  \U  matches  a  "U" character, and \u can be used to
-       define a character by code point, as described above.
+       However, if either of the PCRE2_ALT_BSUX  or  PCRE2_EXTRA_ALT_BSUX  op-
+       tions  is set, \U matches a "U" character, and \u can be used to define
+       a character by code point, as described above.


    Absolute and relative backreferences


-       The sequence \g followed by a signed  or  unsigned  number,  optionally
-       enclosed  in  braces, is an absolute or relative backreference. A named
+       The sequence \g followed by a signed or unsigned number, optionally en-
+       closed  in  braces,  is  an absolute or relative backreference. A named
        backreference can be coded as \g{name}.  Backreferences  are  discussed
        later, following the discussion of parenthesized groups.


@@ -6622,8 +6609,8 @@
        match.


        The  default  \s  characters  are HT (9), LF (10), VT (11), FF (12), CR
-       (13), and space (32), which are defined  as  white  space  in  the  "C"
-       locale. This list may vary if locale-specific matching is taking place.
+       (13), and space (32), which are defined as white space in the  "C"  lo-
+       cale.  This  list may vary if locale-specific matching is taking place.
        For example, in some locales the "non-breaking space" character  (\xA0)
        is recognized as white space, and in others the VT character is not.


@@ -6701,8 +6688,8 @@

          (?>\r\n|\n|\x0b|\f|\r|\x85)


-       This  is  an  example  of an "atomic group", details of which are given
-       below.  This particular group matches either the two-character sequence
+       This is an example of an "atomic group", details of which are given be-
+       low.  This particular group matches either the  two-character  sequence
        CR  followed  by  LF,  or  one  of  the single characters LF (linefeed,
        U+000A), VT (vertical tab, U+000B), FF (form feed,  U+000C),  CR  (car-
        riage  return,  U+000D), or NEL (next line, U+0085). Because this is an
@@ -6729,14 +6716,14 @@
        tion.  Note that these special settings, which are not Perl-compatible,
        are recognized only at the very start of a pattern, and that they  must
        be  in upper case. If more than one of them is present, the last one is
-       used. They can be combined with a change  of  newline  convention;  for
-       example, a pattern can start with:
+       used. They can be combined with a change of newline convention; for ex-
+       ample, a pattern can start with:


          (*ANY)(*BSR_ANYCRLF)


        They  can also be combined with the (*UTF) or (*UCP) special sequences.
-       Inside a character class, \R  is  treated  as  an  unrecognized  escape
-       sequence, and causes an error.
+       Inside a character class, \R is treated as an unrecognized  escape  se-
+       quence, and causes an error.


    Unicode character properties


@@ -6746,8 +6733,8 @@
        non-UTF modes these sequences are of course limited to testing  charac-
        ters  whose code points are less than U+0100 and U+10000, respectively.
        In 32-bit non-UTF mode, code points greater than 0x10ffff (the  Unicode
-       limit)  may  be  encountered.  These  are  all  treated as being in the
-       Unknown script and with an unassigned type. The extra escape  sequences
+       limit)  may  be  encountered. These are all treated as being in the Un-
+       known script and with an unassigned type. The  extra  escape  sequences
        are:


          \p{xx}   a character with the xx property
@@ -6779,35 +6766,34 @@
        Braille, Buginese, Buhid, Canadian_Aboriginal, Carian,  Caucasian_Alba-
        nian,  Chakma,  Cham,  Cherokee,  Common,  Coptic,  Cuneiform, Cypriot,
        Cyrillic, Deseret, Devanagari, Dogra,  Duployan,  Egyptian_Hieroglyphs,
-       Elbasan,   Ethiopic,  Georgian,  Glagolitic,  Gothic,  Grantha,  Greek,
-       Gujarati,  Gunjala_Gondi,  Gurmukhi,  Han,   Hangul,   Hanifi_Rohingya,
-       Hanunoo,   Hatran,   Hebrew,   Hiragana,  Imperial_Aramaic,  Inherited,
-       Inscriptional_Pahlavi, Inscriptional_Parthian, Javanese,  Kaithi,  Kan-
-       nada,  Katakana,  Kayah_Li,  Kharoshthi, Khmer, Khojki, Khudawadi, Lao,
-       Latin, Lepcha, Limbu, Linear_A, Linear_B, Lisu, Lycian,  Lydian,  Maha-
-       jani,  Makasar, Malayalam, Mandaic, Manichaean, Marchen, Masaram_Gondi,
-       Medefaidrin,     Meetei_Mayek,     Mende_Kikakui,     Meroitic_Cursive,
-       Meroitic_Hieroglyphs,  Miao,  Modi,  Mongolian,  Mro, Multani, Myanmar,
-       Nabataean, New_Tai_Lue, Newa, Nko, Nushu, Ogham, Ol_Chiki,  Old_Hungar-
-       ian,  Old_Italic,  Old_North_Arabian, Old_Permic, Old_Persian, Old_Sog-
-       dian,   Old_South_Arabian,   Old_Turkic,   Oriya,    Osage,    Osmanya,
-       Pahawh_Hmong,    Palmyrene,    Pau_Cin_Hau,    Phags_Pa,    Phoenician,
-       Psalter_Pahlavi, Rejang, Runic, Samaritan,  Saurashtra,  Sharada,  Sha-
-       vian,  Siddham,  SignWriting,  Sinhala, Sogdian, Sora_Sompeng, Soyombo,
-       Sundanese, Syloti_Nagri, Syriac, Tagalog, Tagbanwa,  Tai_Le,  Tai_Tham,
-       Tai_Viet,  Takri,  Tamil,  Tangut, Telugu, Thaana, Thai, Tibetan, Tifi-
-       nagh,  Tirhuta,  Ugaritic,  Unknown,   Vai,   Warang_Citi,   Yi,   Zan-
-       abazar_Square.
+       Elbasan,  Ethiopic,  Georgian,  Glagolitic, Gothic, Grantha, Greek, Gu-
+       jarati, Gunjala_Gondi, Gurmukhi, Han, Hangul, Hanifi_Rohingya, Hanunoo,
+       Hatran,   Hebrew,   Hiragana,   Imperial_Aramaic,  Inherited,  Inscrip-
+       tional_Pahlavi,  Inscriptional_Parthian,  Javanese,  Kaithi,   Kannada,
+       Katakana,  Kayah_Li,  Kharoshthi, Khmer, Khojki, Khudawadi, Lao, Latin,
+       Lepcha, Limbu, Linear_A,  Linear_B,  Lisu,  Lycian,  Lydian,  Mahajani,
+       Makasar,  Malayalam, Mandaic, Manichaean, Marchen, Masaram_Gondi, Mede-
+       faidrin, Meetei_Mayek, Mende_Kikakui, Meroitic_Cursive, Meroitic_Hiero-
+       glyphs,  Miao,  Modi,  Mongolian,  Mro,  Multani,  Myanmar,  Nabataean,
+       New_Tai_Lue,  Newa,  Nko,  Nushu,   Ogham,   Ol_Chiki,   Old_Hungarian,
+       Old_Italic,  Old_North_Arabian,  Old_Permic,  Old_Persian, Old_Sogdian,
+       Old_South_Arabian, Old_Turkic,  Oriya,  Osage,  Osmanya,  Pahawh_Hmong,
+       Palmyrene,  Pau_Cin_Hau, Phags_Pa, Phoenician, Psalter_Pahlavi, Rejang,
+       Runic, Samaritan, Saurashtra, Sharada, Shavian,  Siddham,  SignWriting,
+       Sinhala,  Sogdian, Sora_Sompeng, Soyombo, Sundanese, Syloti_Nagri, Syr-
+       iac, Tagalog,  Tagbanwa,  Tai_Le,  Tai_Tham,  Tai_Viet,  Takri,  Tamil,
+       Tangut, Telugu, Thaana, Thai, Tibetan, Tifinagh, Tirhuta, Ugaritic, Un-
+       known, Vai, Warang_Citi, Yi, Zanabazar_Square.


        Each character has exactly one Unicode general category property, spec-
-       ified by a two-letter abbreviation. For compatibility with Perl,  nega-
-       tion  can  be  specified  by including a circumflex between the opening
-       brace and the property name.  For  example,  \p{^Lu}  is  the  same  as
+       ified  by a two-letter abbreviation. For compatibility with Perl, nega-
+       tion can be specified by including a  circumflex  between  the  opening
+       brace  and  the  property  name.  For  example,  \p{^Lu} is the same as
        \P{Lu}.


        If only one letter is specified with \p or \P, it includes all the gen-
-       eral category properties that start with that letter. In this case,  in
-       the  absence of negation, the curly brackets in the escape sequence are
+       eral  category properties that start with that letter. In this case, in
+       the absence of negation, the curly brackets in the escape sequence  are
        optional; these two examples have the same effect:


          \p{L}
@@ -6859,20 +6845,20 @@
          Zp    Paragraph separator
          Zs    Space separator


-       The special property L& is also supported: it matches a character  that
-       has  the  Lu,  Ll, or Lt property, in other words, a letter that is not
+       The  special property L& is also supported: it matches a character that
+       has the Lu, Ll, or Lt property, in other words, a letter  that  is  not
        classified as a modifier or "other".


-       The Cs (Surrogate) property  applies  only  to  characters  whose  code
-       points  are in the range U+D800 to U+DFFF. These characters are no dif-
-       ferent to any other character when PCRE2 is not in UTF mode (using  the
-       16-bit  or  32-bit  library).   However,  they are not valid in Unicode
+       The  Cs  (Surrogate)  property  applies  only  to characters whose code
+       points are in the range U+D800 to U+DFFF. These characters are no  dif-
+       ferent  to any other character when PCRE2 is not in UTF mode (using the
+       16-bit or 32-bit library).  However, they  are  not  valid  in  Unicode
        strings and so cannot be tested by PCRE2 in UTF mode, unless UTF valid-
-       ity   checking   has   been   turned   off   (see   the  discussion  of
+       ity  checking  has   been   turned   off   (see   the   discussion   of
        PCRE2_NO_UTF_CHECK in the pcre2api page).


-       The long synonyms for  property  names  that  Perl  supports  (such  as
-       \p{Letter})  are  not supported by PCRE2, nor is it permitted to prefix
+       The  long  synonyms  for  property  names  that  Perl supports (such as
+       \p{Letter}) are not supported by PCRE2, nor is it permitted  to  prefix
        any of these properties with "Is".


        No character that is in the Unicode table has the Cn (unassigned) prop-
@@ -6879,68 +6865,68 @@
        erty.  Instead, this property is assumed for any code point that is not
        in the Unicode table.


-       Specifying caseless matching does not affect  these  escape  sequences.
-       For  example,  \p{Lu}  always  matches only upper case letters. This is
+       Specifying  caseless  matching  does not affect these escape sequences.
+       For example, \p{Lu} always matches only upper  case  letters.  This  is
        different from the behaviour of current versions of Perl.


-       Matching characters by Unicode property is not fast, because PCRE2  has
-       to  do  a  multistage table lookup in order to find a character's prop-
+       Matching  characters by Unicode property is not fast, because PCRE2 has
+       to do a multistage table lookup in order to find  a  character's  prop-
        erty. That is why the traditional escape sequences such as \d and \w do
-       not  use  Unicode  properties  in PCRE2 by default, though you can make
-       them do so by setting the PCRE2_UCP option or by starting  the  pattern
+       not use Unicode properties in PCRE2 by default,  though  you  can  make
+       them  do  so by setting the PCRE2_UCP option or by starting the pattern
        with (*UCP).


    Extended grapheme clusters


-       The  \X  escape  matches  any number of Unicode characters that form an
+       The \X escape matches any number of Unicode  characters  that  form  an
        "extended grapheme cluster", and treats the sequence as an atomic group
-       (see  below).  Unicode supports various kinds of composite character by
-       giving each character a grapheme breaking property,  and  having  rules
+       (see below).  Unicode supports various kinds of composite character  by
+       giving  each  character  a grapheme breaking property, and having rules
        that use these properties to define the boundaries of extended grapheme
-       clusters. The rules are defined in Unicode Standard Annex 29,  "Unicode
-       Text  Segmentation".  Unicode 11.0.0 abandoned the use of some previous
-       properties that had been used for emojis.  Instead it introduced  vari-
-       ous  emoji-specific  properties.  PCRE2  uses  only the Extended Picto-
+       clusters.  The rules are defined in Unicode Standard Annex 29, "Unicode
+       Text Segmentation". Unicode 11.0.0 abandoned the use of  some  previous
+       properties  that had been used for emojis.  Instead it introduced vari-
+       ous emoji-specific properties. PCRE2  uses  only  the  Extended  Picto-
        graphic property.


-       \X always matches at least one character. Then it  decides  whether  to
+       \X  always  matches  at least one character. Then it decides whether to
        add additional characters according to the following rules for ending a
        cluster:


        1. End at the end of the subject string.


-       2. Do not end between CR and LF; otherwise end after any control  char-
+       2.  Do not end between CR and LF; otherwise end after any control char-
        acter.


-       3.  Do  not  break  Hangul (a Korean script) syllable sequences. Hangul
-       characters are of five types: L, V, T, LV, and LVT. An L character  may
-       be  followed by an L, V, LV, or LVT character; an LV or V character may
+       3. Do not break Hangul (a Korean  script)  syllable  sequences.  Hangul
+       characters  are of five types: L, V, T, LV, and LVT. An L character may
+       be followed by an L, V, LV, or LVT character; an LV or V character  may
        be followed by a V or T character; an LVT or T character may be follwed
        only by a T character.


-       4.  Do  not  end  before  extending  characters or spacing marks or the
-       "zero-width joiner" character.  Characters  with  the  "mark"  property
-       always have the "extend" grapheme breaking property.
+       4. Do not end before extending  characters  or  spacing  marks  or  the
+       "zero-width  joiner" character. Characters with the "mark" property al-
+       ways have the "extend" grapheme breaking property.


        5. Do not end after prepend characters.


        6. Do not break within emoji modifier sequences or emoji zwj sequences.
        That is, do not break between characters with the Extended_Pictographic
-       property.   Extend  and  ZWJ characters are allowed between the charac-
+       property.  Extend and ZWJ characters are allowed  between  the  charac-
        ters.


-       7. Do not break within emoji flag sequences.  That  is,  do  not  break
-       between  regional  indicator (RI) characters if there are an odd number
-       of RI characters before the break point.
+       7.  Do not break within emoji flag sequences. That is, do not break be-
+       tween regional indicator (RI) characters if there are an odd number  of
+       RI characters before the break point.


        8. Otherwise, end the cluster.


    PCRE2's additional properties


-       As well as the standard Unicode properties described above, PCRE2  sup-
-       ports  four  more  that  make it possible to convert traditional escape
-       sequences such as \w and \s to use Unicode properties. PCRE2 uses these
-       non-standard,  non-Perl  properties  internally  when PCRE2_UCP is set.
+       As  well as the standard Unicode properties described above, PCRE2 sup-
+       ports four more that make it possible to convert traditional escape se-
+       quences  such  as \w and \s to use Unicode properties. PCRE2 uses these
+       non-standard, non-Perl properties internally  when  PCRE2_UCP  is  set.
        However, they may also be used explicitly. These properties are:


          Xan   Any alphanumeric character
@@ -6948,42 +6934,42 @@
          Xsp   Any Perl space character
          Xwd   Any Perl "word" character


-       Xan matches characters that have either the L (letter) or the  N  (num-
-       ber)  property. Xps matches the characters tab, linefeed, vertical tab,
-       form feed, or carriage return, and any other character that has  the  Z
-       (separator)  property.   Xsp  is  the  same as Xps; in PCRE1 it used to
-       exclude vertical tab, for Perl compatibility,  but  Perl  changed.  Xwd
+       Xan  matches  characters that have either the L (letter) or the N (num-
+       ber) property. Xps matches the characters tab, linefeed, vertical  tab,
+       form  feed,  or carriage return, and any other character that has the Z
+       (separator) property.  Xsp is the same as Xps; in PCRE1 it used to  ex-
+       clude  vertical  tab,  for  Perl  compatibility,  but Perl changed. Xwd
        matches the same characters as Xan, plus underscore.


-       There  is another non-standard property, Xuc, which matches any charac-
-       ter that can be represented by a Universal Character Name  in  C++  and
-       other  programming  languages.  These are the characters $, @, ` (grave
-       accent), and all characters with Unicode code points  greater  than  or
-       equal  to U+00A0, except for the surrogates U+D800 to U+DFFF. Note that
-       most base (ASCII) characters are excluded. (Universal  Character  Names
-       are  of  the  form \uHHHH or \UHHHHHHHH where H is a hexadecimal digit.
+       There is another non-standard property, Xuc, which matches any  charac-
+       ter  that  can  be represented by a Universal Character Name in C++ and
+       other programming languages. These are the characters $,  @,  `  (grave
+       accent),  and  all  characters with Unicode code points greater than or
+       equal to U+00A0, except for the surrogates U+D800 to U+DFFF. Note  that
+       most  base  (ASCII) characters are excluded. (Universal Character Names
+       are of the form \uHHHH or \UHHHHHHHH where H is  a  hexadecimal  digit.
        Note that the Xuc property does not match these sequences but the char-
        acters that they represent.)


    Resetting the match start


-       In  normal  use,  the  escape sequence \K causes any previously matched
-       characters not to be included in the final  matched  sequence  that  is
-       returned. For example, the pattern:
+       In normal use, the escape sequence \K  causes  any  previously  matched
+       characters not to be included in the final matched sequence that is re-
+       turned. For example, the pattern:


          foo\Kbar


-       matches  "foobar",  but  reports that it has matched "bar". \K does not
+       matches "foobar", but reports that it has matched "bar".  \K  does  not
        interact with anchoring in any way. The pattern:


          ^foo\Kbar


-       matches only when the subject begins  with  "foobar"  (in  single  line
-       mode),  though  it again reports the matched string as "bar". This fea-
-       ture is similar to a lookbehind assertion (described below).   However,
-       in  this  case,  the part of the subject before the real match does not
-       have to be of fixed length, as lookbehind assertions do. The use of  \K
-       does  not interfere with the setting of captured substrings.  For exam-
+       matches  only  when  the  subject  begins with "foobar" (in single line
+       mode), though it again reports the matched string as "bar".  This  fea-
+       ture  is similar to a lookbehind assertion (described below).  However,
+       in this case, the part of the subject before the real  match  does  not
+       have  to be of fixed length, as lookbehind assertions do. The use of \K
+       does not interfere with the setting of captured substrings.  For  exam-
        ple, when the pattern


          (foo)\Kbar
@@ -6990,27 +6976,27 @@


        matches "foobar", the first substring is still set to "foo".


-       Perl documents that the use  of  \K  within  assertions  is  "not  well
-       defined".  In  PCRE2,  \K  is acted upon when it occurs inside positive
-       assertions, but is ignored in negative assertions.  Note  that  when  a
-       pattern  such  as (?=ab\K) matches, the reported start of the match can
-       be greater than the end of the match. Using \K in a  lookbehind  asser-
-       tion  at the start of a pattern can also lead to odd effects. For exam-
-       ple, consider this pattern:
+       Perl  documents  that  the use of \K within assertions is "not well de-
+       fined". In PCRE2, \K is acted upon when it occurs inside  positive  as-
+       sertions,  but is ignored in negative assertions. Note that when a pat-
+       tern such as (?=ab\K) matches, the reported start of the match  can  be
+       greater  than  the end of the match. Using \K in a lookbehind assertion
+       at the start of a pattern can also lead to odd  effects.  For  example,
+       consider this pattern:


          (?<=\Kfoo)bar


-       If the subject is "foobar", a call to  pcre2_match()  with  a  starting
-       offset  of 3 succeeds and reports the matching string as "foobar", that
-       is, the start of the reported match is earlier  than  where  the  match
+       If  the  subject  is  "foobar", a call to pcre2_match() with a starting
+       offset of 3 succeeds and reports the matching string as "foobar",  that
+       is,  the  start  of  the reported match is earlier than where the match
        started.


    Simple assertions


-       The  final use of backslash is for certain simple assertions. An asser-
-       tion specifies a condition that has to be met at a particular point  in
-       a  match, without consuming any characters from the subject string. The
-       use of groups for more complicated assertions is described below.   The
+       The final use of backslash is for certain simple assertions. An  asser-
+       tion  specifies a condition that has to be met at a particular point in
+       a match, without consuming any characters from the subject string.  The
+       use  of groups for more complicated assertions is described below.  The
        backslashed assertions are:


          \b     matches at a word boundary
@@ -7021,48 +7007,48 @@
          \z     matches only at the end of the subject
          \G     matches at the first matching position in the subject


-       Inside  a  character  class, \b has a different meaning; it matches the
-       backspace character. If any other of  these  assertions  appears  in  a
+       Inside a character class, \b has a different meaning;  it  matches  the
+       backspace  character.  If  any  other  of these assertions appears in a
        character class, an "invalid escape sequence" error is generated.


-       A  word  boundary is a position in the subject string where the current
-       character and the previous character do not both match \w or  \W  (i.e.
-       one  matches  \w  and the other matches \W), or the start or end of the
-       string if the first or last character matches  \w,  respectively.  When
-       PCRE2  is  built with Unicode support, the meanings of \w and \W can be
-       changed by setting the PCRE2_UCP option. When this  is  done,  it  also
-       affects  \b  and  \B.  Neither  PCRE2 nor Perl has a separate "start of
-       word" or "end of word" metasequence. However, whatever follows \b  nor-
-       mally determines which it is. For example, the fragment \ba matches "a"
-       at the start of a word.
+       A word boundary is a position in the subject string where  the  current
+       character  and  the previous character do not both match \w or \W (i.e.
+       one matches \w and the other matches \W), or the start or  end  of  the
+       string  if  the  first or last character matches \w, respectively. When
+       PCRE2 is built with Unicode support, the meanings of \w and \W  can  be
+       changed by setting the PCRE2_UCP option. When this is done, it also af-
+       fects \b and \B. Neither PCRE2 nor Perl has a separate "start of  word"
+       or  "end  of  word" metasequence. However, whatever follows \b normally
+       determines which it is. For example, the fragment \ba  matches  "a"  at
+       the start of a word.


-       The \A, \Z, and \z assertions differ from  the  traditional  circumflex
+       The  \A,  \Z,  and \z assertions differ from the traditional circumflex
        and dollar (described in the next section) in that they only ever match
-       at the very start and end of the subject string, whatever  options  are
-       set.  Thus,  they are independent of multiline mode. These three asser-
-       tions are not affected by the  PCRE2_NOTBOL  or  PCRE2_NOTEOL  options,
-       which  affect only the behaviour of the circumflex and dollar metachar-
-       acters. However, if the startoffset argument of pcre2_match()  is  non-
-       zero,  indicating  that  matching is to start at a point other than the
-       beginning of the subject, \A can never match.  The  difference  between
-       \Z  and \z is that \Z matches before a newline at the end of the string
+       at  the  very start and end of the subject string, whatever options are
+       set. Thus, they are independent of multiline mode. These  three  asser-
+       tions  are  not  affected  by the PCRE2_NOTBOL or PCRE2_NOTEOL options,
+       which affect only the behaviour of the circumflex and dollar  metachar-
+       acters.  However,  if the startoffset argument of pcre2_match() is non-
+       zero, indicating that matching is to start at a point  other  than  the
+       beginning  of  the subject, \A can never match.  The difference between
+       \Z and \z is that \Z matches before a newline at the end of the  string
        as well as at the very end, whereas \z matches only at the end.


-       The \G assertion is true only when the current matching position is  at
-       the  start point of the matching process, as specified by the startoff-
-       set argument of pcre2_match(). It differs from \A  when  the  value  of
-       startoffset  is  non-zero. By calling pcre2_match() multiple times with
-       appropriate arguments, you can mimic Perl's /g option,  and  it  is  in
+       The  \G assertion is true only when the current matching position is at
+       the start point of the matching process, as specified by the  startoff-
+       set  argument  of  pcre2_match().  It differs from \A when the value of
+       startoffset is non-zero. By calling pcre2_match() multiple  times  with
+       appropriate  arguments,  you  can  mimic Perl's /g option, and it is in
        this kind of implementation where \G can be useful.


-       Note,  however,  that  PCRE2's  implementation of \G, being true at the
-       starting character of the matching process, is  subtly  different  from
-       Perl's,  which  defines it as true at the end of the previous match. In
-       Perl, these can be different when the  previously  matched  string  was
+       Note, however, that PCRE2's implementation of \G,  being  true  at  the
+       starting  character  of  the matching process, is subtly different from
+       Perl's, which defines it as true at the end of the previous  match.  In
+       Perl,  these  can  be  different when the previously matched string was
        empty. Because PCRE2 does just one match at a time, it cannot reproduce
        this behaviour.


-       If all the alternatives of a pattern begin with \G, the  expression  is
+       If  all  the alternatives of a pattern begin with \G, the expression is
        anchored to the starting match position, and the "anchored" flag is set
        in the compiled regular expression.


@@ -7069,70 +7055,70 @@

CIRCUMFLEX AND DOLLAR

-       The circumflex and dollar  metacharacters  are  zero-width  assertions.
-       That  is,  they test for a particular condition being true without con-
+       The  circumflex  and  dollar  metacharacters are zero-width assertions.
+       That is, they test for a particular condition being true  without  con-
        suming any characters from the subject string. These two metacharacters
-       are  concerned  with matching the starts and ends of lines. If the new-
-       line convention is set so that only the two-character sequence CRLF  is
-       recognized  as  a newline, isolated CR and LF characters are treated as
+       are concerned with matching the starts and ends of lines. If  the  new-
+       line  convention is set so that only the two-character sequence CRLF is
+       recognized as a newline, isolated CR and LF characters are  treated  as
        ordinary data characters, and are not recognized as newlines.


        Outside a character class, in the default matching mode, the circumflex
-       character  is  an  assertion  that is true only if the current matching
-       point is at the start of the subject string. If the  startoffset  argu-
-       ment  of  pcre2_match() is non-zero, or if PCRE2_NOTBOL is set, circum-
-       flex can never match if the PCRE2_MULTILINE option is unset.  Inside  a
-       character  class,  circumflex  has  an  entirely different meaning (see
-       below).
+       character is an assertion that is true only  if  the  current  matching
+       point  is  at the start of the subject string. If the startoffset argu-
+       ment of pcre2_match() is non-zero, or if PCRE2_NOTBOL is  set,  circum-
+       flex  can  never match if the PCRE2_MULTILINE option is unset. Inside a
+       character class, circumflex has an entirely different meaning (see  be-
+       low).


-       Circumflex need not be the first character of the pattern if  a  number
-       of  alternatives are involved, but it should be the first thing in each
-       alternative in which it appears if the pattern is ever  to  match  that
-       branch.  If all possible alternatives start with a circumflex, that is,
-       if the pattern is constrained to match only at the start  of  the  sub-
-       ject,  it  is  said  to be an "anchored" pattern. (There are also other
+       Circumflex  need  not be the first character of the pattern if a number
+       of alternatives are involved, but it should be the first thing in  each
+       alternative  in  which  it appears if the pattern is ever to match that
+       branch. If all possible alternatives start with a circumflex, that  is,
+       if  the  pattern  is constrained to match only at the start of the sub-
+       ject, it is said to be an "anchored" pattern.  (There  are  also  other
        constructs that can cause a pattern to be anchored.)


-       The dollar character is an assertion that is true only if  the  current
-       matching  point  is  at  the  end of the subject string, or immediately
-       before a newline  at  the  end  of  the  string  (by  default),  unless
-       PCRE2_NOTEOL is set. Note, however, that it does not actually match the
-       newline. Dollar need not be the last character of the pattern if a num-
-       ber of alternatives are involved, but it should be the last item in any
-       branch in which it appears. Dollar has no special meaning in a  charac-
+       The  dollar  character is an assertion that is true only if the current
+       matching point is at the end of the subject string, or immediately  be-
+       fore  a newline at the end of the string (by default), unless PCRE2_NO-
+       TEOL is set. Note, however, that it does not actually  match  the  new-
+       line.  Dollar need not be the last character of the pattern if a number
+       of alternatives are involved, but it should be the  last  item  in  any
+       branch  in which it appears. Dollar has no special meaning in a charac-
        ter class.


-       The  meaning  of  dollar  can be changed so that it matches only at the
-       very end of the string, by setting the PCRE2_DOLLAR_ENDONLY  option  at
+       The meaning of dollar can be changed so that it  matches  only  at  the
+       very  end  of the string, by setting the PCRE2_DOLLAR_ENDONLY option at
        compile time. This does not affect the \Z assertion.


        The meanings of the circumflex and dollar metacharacters are changed if
-       the PCRE2_MULTILINE option is set. When this  is  the  case,  a  dollar
-       character  matches before any newlines in the string, as well as at the
-       very end, and a circumflex matches immediately after internal  newlines
-       as  well as at the start of the subject string. It does not match after
-       a newline that ends the string, for compatibility with  Perl.  However,
+       the  PCRE2_MULTILINE  option  is  set.  When this is the case, a dollar
+       character matches before any newlines in the string, as well as at  the
+       very  end, and a circumflex matches immediately after internal newlines
+       as well as at the start of the subject string. It does not match  after
+       a  newline  that ends the string, for compatibility with Perl. However,
        this can be changed by setting the PCRE2_ALT_CIRCUMFLEX option.


-       For  example, the pattern /^abc$/ matches the subject string "def\nabc"
-       (where \n represents a newline) in multiline mode, but  not  otherwise.
-       Consequently,  patterns  that  are anchored in single line mode because
-       all branches start with ^ are not anchored in  multiline  mode,  and  a
-       match  for  circumflex  is  possible  when  the startoffset argument of
-       pcre2_match() is non-zero. The PCRE2_DOLLAR_ENDONLY option  is  ignored
+       For example, the pattern /^abc$/ matches the subject string  "def\nabc"
+       (where  \n  represents a newline) in multiline mode, but not otherwise.
+       Consequently, patterns that are anchored in single  line  mode  because
+       all  branches  start  with  ^ are not anchored in multiline mode, and a
+       match for circumflex is  possible  when  the  startoffset  argument  of
+       pcre2_match()  is  non-zero. The PCRE2_DOLLAR_ENDONLY option is ignored
        if PCRE2_MULTILINE is set.
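
The /^abc$/ example above can be tried out with a short sketch. This uses Python's built-in re module as a stand-in, not PCRE2 itself; the two engines are assumed to agree only for this simple case. The names SUBJECT, PATTERN and found are illustrative.

```python
import re

SUBJECT = "def\nabc"   # "\n" is the newline from the example above
PATTERN = r"^abc$"


def found(pattern, flags=0):
    """Return True if pattern matches somewhere in SUBJECT."""
    return re.search(pattern, SUBJECT, flags) is not None


# Default mode: ^ is only true at the very start of the subject.
single_line = found(PATTERN)

# Multiline mode: ^ is also true immediately after an internal newline.
multi_line = found(PATTERN, re.MULTILINE)

# \A stays tied to the start of the subject even in multiline mode.
anchored_start = found(r"\Aabc", re.MULTILINE)

print(single_line, multi_line, anchored_start)
```

Note that Python spells the "very end only" assertion as \Z, which corresponds to PCRE2's \z, so end-of-subject assertions are deliberately left out of this sketch.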


-       When  the  newline  convention (see "Newline conventions" below) recog-
-       nizes the two-character sequence CRLF as a newline, this is  preferred,
-       even  if  the  single  characters CR and LF are also recognized as new-
-       lines. For example, if the newline convention  is  "any",  a  multiline
-       mode  circumflex matches before "xyz" in the string "abc\r\nxyz" rather
-       than after CR, even though CR on its own is a valid newline.  (It  also
+       When the newline convention (see "Newline  conventions"  below)  recog-
+       nizes  the two-character sequence CRLF as a newline, this is preferred,
+       even if the single characters CR and LF are  also  recognized  as  new-
+       lines.  For  example,  if  the newline convention is "any", a multiline
+       mode circumflex matches before "xyz" in the string "abc\r\nxyz"  rather
+       than  after  CR, even though CR on its own is a valid newline. (It also
        matches at the very start of the string, of course.)


-       Note  that  the sequences \A, \Z, and \z can be used to match the start
-       and end of the subject in both modes, and if all branches of a  pattern
-       start  with \A it is always anchored, whether or not PCRE2_MULTILINE is
+       Note that the sequences \A, \Z, and \z can be used to match  the  start
+       and  end of the subject in both modes, and if all branches of a pattern
+       start with \A it is always anchored, whether or not PCRE2_MULTILINE  is
        set.



@@ -7139,73 +7125,73 @@
FULL STOP (PERIOD, DOT) AND \N

        Outside a character class, a dot in the pattern matches any one charac-
-       ter  in  the subject string except (by default) a character that signi-
+       ter in the subject string except (by default) a character  that  signi-
        fies the end of a line.


-       When a line ending is defined as a single character, dot never  matches
-       that  character; when the two-character sequence CRLF is used, dot does
-       not match CR if it is immediately followed  by  LF,  but  otherwise  it
-       matches  all characters (including isolated CRs and LFs). When any Uni-
-       code line endings are being recognized, dot does not match CR or LF  or
+       When  a line ending is defined as a single character, dot never matches
+       that character; when the two-character sequence CRLF is used, dot  does
+       not  match  CR  if  it  is immediately followed by LF, but otherwise it
+       matches all characters (including isolated CRs and LFs). When any  Uni-
+       code  line endings are being recognized, dot does not match CR or LF or
        any of the other line ending characters.


-       The  behaviour  of  dot  with regard to newlines can be changed. If the
-       PCRE2_DOTALL option is set, a dot matches any  one  character,  without
-       exception.   If  the two-character sequence CRLF is present in the sub-
+       The behaviour of dot with regard to newlines can  be  changed.  If  the
+       PCRE2_DOTALL  option  is  set, a dot matches any one character, without
+       exception.  If the two-character sequence CRLF is present in  the  sub-
        ject string, it takes two dots to match it.
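
A rough illustration of the dot behaviour described above, again using Python's re module rather than PCRE2. Python's re.DOTALL plays the role of PCRE2_DOTALL here, and Python only ever treats "\n" as a line ending for dot, so this models just the single-character newline case plus the "two dots for CRLF" remark.

```python
import re


def whole_match(pattern, subject, flags=0):
    """Return True if pattern matches the entire subject."""
    return re.fullmatch(pattern, subject, flags) is not None


# Without DOTALL, a dot refuses to match the newline character.
default_dot = whole_match(r".", "\n")

# With DOTALL, a dot matches any one character, newline included.
dotall_dot = whole_match(r".", "\n", re.DOTALL)

# CRLF is two characters, so one dot cannot cover it, but two can.
one_dot_crlf = whole_match(r".", "\r\n", re.DOTALL)
two_dots_crlf = whole_match(r"..", "\r\n", re.DOTALL)

print(default_dot, dotall_dot, one_dot_crlf, two_dots_crlf)
```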


-       The handling of dot is entirely independent of the handling of  circum-
-       flex  and  dollar,  the  only relationship being that they both involve
+       The  handling of dot is entirely independent of the handling of circum-
+       flex and dollar, the only relationship being  that  they  both  involve
        newlines. Dot has no special meaning in a character class.


-       The escape sequence \N when not followed by an  opening  brace  behaves
-       like  a dot, except that it is not affected by the PCRE2_DOTALL option.
-       In other words, it matches any character except one that signifies  the
+       The  escape  sequence  \N when not followed by an opening brace behaves
+       like a dot, except that it is not affected by the PCRE2_DOTALL  option.
+       In  other words, it matches any character except one that signifies the
        end of a line.


        When \N is followed by an opening brace it has a different meaning. See
-       the section entitled "Non-printing characters" above for details.  Perl
-       also  uses  \N{name}  to specify characters by Unicode name; PCRE2 does
+       the  section entitled "Non-printing characters" above for details. Perl
+       also uses \N{name} to specify characters by Unicode  name;  PCRE2  does
        not support this.



MATCHING A SINGLE CODE UNIT

-       Outside a character class, the escape sequence \C matches any one  code
-       unit,  whether or not a UTF mode is set. In the 8-bit library, one code
-       unit is one byte; in the 16-bit library it is a  16-bit  unit;  in  the
-       32-bit  library  it  is  a 32-bit unit. Unlike a dot, \C always matches
-       line-ending characters. The feature is provided in  Perl  in  order  to
+       Outside  a character class, the escape sequence \C matches any one code
+       unit, whether or not a UTF mode is set. In the 8-bit library, one  code
+       unit  is  one  byte;  in the 16-bit library it is a 16-bit unit; in the
+       32-bit library it is a 32-bit unit. Unlike a  dot,  \C  always  matches
+       line-ending  characters.  The  feature  is provided in Perl in order to
        match individual bytes in UTF-8 mode, but it is unclear how it can use-
        fully be used.


-       Because \C breaks up characters into individual  code  units,  matching
-       one  unit  with  \C  in UTF-8 or UTF-16 mode means that the rest of the
-       string may start with a malformed UTF  character.  This  has  undefined
-       results, because PCRE2 assumes that it is matching character by charac-
-       ter in a valid UTF string (by default it checks  the  subject  string's
-       validity  at  the  start of processing unless the PCRE2_NO_UTF_CHECK or
+       Because  \C  breaks  up characters into individual code units, matching
+       one unit with \C in UTF-8 or UTF-16 mode means that  the  rest  of  the
+       string may start with a malformed UTF character. This has undefined re-
+       sults, because PCRE2 assumes that it is matching character by character
+       in a valid UTF string (by default it checks the subject string's valid-
+       ity at  the  start  of  processing  unless  the  PCRE2_NO_UTF_CHECK  or
        PCRE2_MATCH_INVALID_UTF option is used).


-       An  application  can  lock  out  the  use  of   \C   by   setting   the
-       PCRE2_NEVER_BACKSLASH_C  option  when  compiling  a pattern. It is also
+       An   application   can   lock   out  the  use  of  \C  by  setting  the
+       PCRE2_NEVER_BACKSLASH_C option when compiling a  pattern.  It  is  also
        possible to build PCRE2 with the use of \C permanently disabled.


-       PCRE2 does not allow \C to appear in lookbehind  assertions  (described
-       below)  in UTF-8 or UTF-16 modes, because this would make it impossible
-       to calculate the length of  the  lookbehind.  Neither  the  alternative
+       PCRE2  does  not allow \C to appear in lookbehind assertions (described
+       below) in UTF-8 or UTF-16 modes, because this would make it  impossible
+       to  calculate  the  length  of  the lookbehind. Neither the alternative
        matching function pcre2_dfa_match() nor the JIT optimizer support \C in
        these UTF modes.  The former gives a match-time error; the latter fails
        to optimize and so the match is always run using the interpreter.


-       In  the  32-bit  library,  however,  \C  is  always supported (when not
-       explicitly locked out) because it always matches a  single  code  unit,
+       In the 32-bit library, however, \C is always supported  (when  not  ex-
+       plicitly  locked  out)  because  it  always matches a single code unit,
        whether or not UTF-32 is specified.


        In general, the \C escape sequence is best avoided. However, one way of
-       using it that avoids the problem of malformed UTF-8 or  UTF-16  charac-
-       ters  is  to use a lookahead to check the length of the next character,
-       as in this pattern, which could be used with  a  UTF-8  string  (ignore
+       using  it  that avoids the problem of malformed UTF-8 or UTF-16 charac-
+       ters is to use a lookahead to check the length of the  next  character,
+       as  in  this  pattern,  which could be used with a UTF-8 string (ignore
        white space and line breaks):


          (?| (?=[\x00-\x7f])(\C) |
@@ -7213,12 +7199,12 @@
              (?=[\x{800}-\x{ffff}])(\C)(\C)(\C) |
              (?=[\x{10000}-\x{1fffff}])(\C)(\C)(\C)(\C))


-       In  this  example,  a  group  that starts with (?| resets the capturing
-       parentheses numbers in each alternative (see "Duplicate Group  Numbers"
+       In this example, a group that starts  with  (?|  resets  the  capturing
+       parentheses  numbers in each alternative (see "Duplicate Group Numbers"
        below). The assertions at the start of each branch check the next UTF-8
-       character for values whose encoding uses 1, 2, 3, or 4  bytes,  respec-
-       tively.  The  character's  individual  bytes  are  then captured by the
-       appropriate number of \C groups.
+       character  for  values whose encoding uses 1, 2, 3, or 4 bytes, respec-
+       tively. The character's individual bytes are then captured by  the  ap-
+       propriate number of \C groups.
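
The lookahead ranges in the pattern above rely on the standard UTF-8 encoding lengths. The following sketch, in plain Python with no PCRE2 involved, checks those lengths at the boundaries of each range. The helper name utf8_length is illustrative, and the upper limit is taken as U+10FFFF, the highest valid Unicode code point.

```python
def utf8_length(code_point):
    """Return how many bytes UTF-8 uses to encode one code point."""
    return len(chr(code_point).encode("utf-8"))


# First and last code point of each encoding length.
boundaries = [
    (0x00, 0x7F),         # 1 byte
    (0x80, 0x7FF),        # 2 bytes
    (0x800, 0xFFFF),      # 3 bytes
    (0x10000, 0x10FFFF),  # 4 bytes
]

lengths = [(utf8_length(lo), utf8_length(hi)) for lo, hi in boundaries]
print(lengths)
```

Each pair in lengths holds the byte count at the low and high end of a range, which is the number of \C groups the corresponding branch needs.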



SQUARE BRACKETS AND CHARACTER CLASSES
@@ -7225,62 +7211,61 @@

        An opening square bracket introduces a character class, terminated by a
        closing square bracket. A closing square bracket on its own is not spe-
-       cial by default.  If a closing square bracket is required as  a  member
+       cial  by  default.  If a closing square bracket is required as a member
        of the class, it should be the first data character in the class (after
-       an initial circumflex, if present) or escaped with  a  backslash.  This
-       means  that,  by default, an empty class cannot be defined. However, if
-       the PCRE2_ALLOW_EMPTY_CLASS option is set, a closing square bracket  at
+       an  initial  circumflex,  if present) or escaped with a backslash. This
+       means that, by default, an empty class cannot be defined.  However,  if
+       the  PCRE2_ALLOW_EMPTY_CLASS option is set, a closing square bracket at
        the start does end the (empty) class.


-       A  character class matches a single character in the subject. A matched
+       A character class matches a single character in the subject. A  matched
        character must be in the set of characters defined by the class, unless
-       the  first  character in the class definition is a circumflex, in which
+       the first character in the class definition is a circumflex,  in  which
        case the subject character must not be in the set defined by the class.
-       If  a  circumflex is actually required as a member of the class, ensure
+       If a circumflex is actually required as a member of the  class,  ensure
        it is not the first character, or escape it with a backslash.


-       For example, the character class [aeiou] matches any lower case  vowel,
-       while  [^aeiou]  matches  any character that is not a lower case vowel.
+       For  example, the character class [aeiou] matches any lower case vowel,
+       while [^aeiou] matches any character that is not a  lower  case  vowel.
        Note that a circumflex is just a convenient notation for specifying the
-       characters  that  are in the class by enumerating those that are not. A
-       class that starts with a circumflex is not an assertion; it still  con-
-       sumes  a  character  from the subject string, and therefore it fails if
+       characters that are in the class by enumerating those that are  not.  A
+       class  that starts with a circumflex is not an assertion; it still con-
+       sumes a character from the subject string, and therefore  it  fails  if
        the current pointer is at the end of the string.


-       Characters in a class may be specified by their code points  using  \o,
-       \x,  or \N{U+hh..} in the usual way. When caseless matching is set, any
-       letters in a class represent both their upper case and lower case  ver-
-       sions,  so  for example, a caseless [aeiou] matches "A" as well as "a",
-       and a caseless [^aeiou] does not match "A", whereas a  caseful  version
+       Characters  in  a class may be specified by their code points using \o,
+       \x, or \N{U+hh..} in the usual way. When caseless matching is set,  any
+       letters  in a class represent both their upper case and lower case ver-
+       sions, so for example, a caseless [aeiou] matches "A" as well  as  "a",
+       and  a  caseless [^aeiou] does not match "A", whereas a caseful version
        would.


-       Characters  that  might  indicate  line breaks are never treated in any
-       special way  when  matching  character  classes,  whatever  line-ending
-       sequence  is  in  use,  and  whatever  setting  of the PCRE2_DOTALL and
-       PCRE2_MULTILINE options is used. A class such as  [^a]  always  matches
+       Characters that might indicate line breaks are  never  treated  in  any
+       special  way  when matching character classes, whatever line-ending se-
+       quence is  in  use,  and  whatever  setting  of  the  PCRE2_DOTALL  and
+       PCRE2_MULTILINE  options  is  used. A class such as [^a] always matches
        one of these characters.


        The generic character type escape sequences \d, \D, \h, \H, \p, \P, \s,
-       \S, \v, \V, \w, and \W may appear in a character  class,  and  add  the
-       characters  that  they  match  to  the  class.  For example, [\dABCDEF]
-       matches any hexadecimal digit.  In  UTF  modes,  the  PCRE2_UCP  option
-       affects  the meanings of \d, \s, \w and their upper case partners, just
-       as it does when they appear outside a character class, as described  in
-       the  section  entitled  "Generic  character  types"  above.  The escape
-       sequence \b has a  different  meaning  inside  a  character  class;  it
-       matches  the  backspace character. The sequences \B, \R, and \X are not
-       special inside a character class. Like any  other  unrecognized  escape
-       sequences,  they  cause an error. The same is true for \N when not fol-
-       lowed by an opening brace.
+       \S,  \v,  \V,  \w,  and \W may appear in a character class, and add the
+       characters that they  match  to  the  class.  For  example,  [\dABCDEF]
+       matches  any  hexadecimal digit. In UTF modes, the PCRE2_UCP option af-
+       fects the meanings of \d, \s, \w and their upper case partners, just as
+       it does when they appear outside a character class, as described in the
+       section entitled "Generic character types" above. The  escape  sequence
+       \b  has  a  different  meaning inside a character class; it matches the
+       backspace character. The sequences \B, \R, and \X are not  special  in-
+       side  a  character class. Like any other unrecognized escape sequences,
+       they cause an error. The same is true for \N when not  followed  by  an
+       opening brace.


-       The minus (hyphen) character can be used to specify a range of  charac-
-       ters  in  a  character  class.  For  example,  [d-m] matches any letter
-       between d and m, inclusive. If a  minus  character  is  required  in  a
-       class,  it  must  be  escaped  with a backslash or appear in a position
-       where it cannot be interpreted as indicating a range, typically as  the
-       first or last character in the class, or immediately after a range. For
-       example, [b-d-z] matches letters in the range b to d, a hyphen  charac-
-       ter, or z.
+       The  minus (hyphen) character can be used to specify a range of charac-
+       ters in a character class. For example, [d-m] matches  any  letter  be-
+       tween  d and m, inclusive. If a minus character is required in a class,
+       it must be escaped with a backslash or appear in a  position  where  it
+       cannot  be interpreted as indicating a range, typically as the first or
+       last character in the class, or immediately after a range. For example,
+       [b-d-z] matches letters in the range b to d, a hyphen character, or z.
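
The [b-d-z] example can be checked with a small sketch. It uses Python's re module as a stand-in for PCRE2, on the assumption that both parse this particular class the same way: a range b to d, then a literal hyphen, then z.

```python
import re

# A hyphen immediately after a range cannot start another range,
# so it is taken as a literal character.
char_class = re.compile(r"[b-d-z]")

candidates = "abcde-yz"
accepted = [ch for ch in candidates if char_class.fullmatch(ch)]

print(accepted)
```

Characters such as "e" and "y" are rejected, which shows that the class is not read as a range running from d to z.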


        Perl treats a hyphen as a literal if it appears before or after a POSIX
        class (see below) or before or after a character type escape such as
@@ -7299,8 +7284,8 @@
        a range.


        Ranges normally include all code points between the start and end char-
-       acters, inclusive. They can also be  used  for  code  points  specified
-       numerically, for example [\000-\037]. Ranges can include any characters
+       acters, inclusive. They can also be used for code points specified  nu-
+       merically,  for  example [\000-\037]. Ranges can include any characters
        that are valid for the current mode. In any  UTF  mode,  the  so-called
        "surrogate"  characters (those whose code points lie between 0xd800 and
        0xdfff inclusive) may not  be  specified  explicitly  by  default  (the
@@ -7334,8 +7319,8 @@
        range), circumflex (only at the start), opening  square  bracket  (only
        when  it can be interpreted as introducing a POSIX class name, or for a
        special compatibility feature - see the next  two  sections),  and  the
-       terminating  closing  square  bracket.  However,  escaping  other  non-
-       alphanumeric characters does no harm.
+       terminating  closing  square  bracket.  However, escaping other non-al-
+       phanumeric characters does no harm.



 POSIX CHARACTER CLASSES
@@ -7384,8 +7369,8 @@
        ters in the range 128-255 when locale-specific matching  is  happening.
        However,  if the PCRE2_UCP option is passed to pcre2_compile(), some of
        the classes are changed so that Unicode character properties are  used.
-       This  is  achieved  by  replacing  certain  POSIX  classes  with  other
-       sequences, as follows:
+       This  is  achieved  by  replacing  certain POSIX classes with other se-
+       quences, as follows:


          [:alnum:]  becomes  \p{Xan}
          [:alpha:]  becomes  \p{L}
@@ -7436,9 +7421,9 @@
        from other environments, and is best not used in any new patterns. Note
        that \b matches at the start and the end of a word (see "Simple  asser-
        tions"  above),  and in a Perl-style pattern the preceding or following
-       character normally shows which is wanted,  without  the  need  for  the
-       assertions  that  are used above in order to give exactly the POSIX be-
-       haviour.
+       character normally shows which is wanted, without the need for the  as-
+       sertions  that are used above in order to give exactly the POSIX behav-
+       iour.



VERTICAL BAR
@@ -7460,8 +7445,8 @@

        The settings  of  the  PCRE2_CASELESS,  PCRE2_MULTILINE,  PCRE2_DOTALL,
        PCRE2_EXTENDED,  PCRE2_EXTENDED_MORE, and PCRE2_NO_AUTO_CAPTURE options
-       can be changed from  within  the  pattern  by  a  sequence  of  letters
-       enclosed  between "(?"  and ")". These options are Perl-compatible, and
+       can be changed from within the pattern by a  sequence  of  letters  en-
+       closed  between  "(?"   and ")". These options are Perl-compatible, and
        are described in detail in the pcre2api documentation. The option  let-
        ters are:


@@ -7473,8 +7458,8 @@
          xx for PCRE2_EXTENDED_MORE


        For example, (?im) sets caseless, multiline matching. It is also possi-
-       ble to unset these options by preceding the  relevant  letters  with  a
-       hyphen, for example (?-im). The two "extended" options are not indepen-
+       ble to unset these options by preceding the relevant letters with a hy-
+       phen,  for  example (?-im). The two "extended" options are not indepen-
        dent; unsetting either one cancels the effects of both of them.


        A  combined  setting  and  unsetting  such  as  (?im-sx),  which   sets
@@ -7486,15 +7471,15 @@


        If  the  first character following (? is a circumflex, it causes all of
        the above options to be unset. Thus, (?^) is equivalent  to  (?-imnsx).
-       Letters  may  follow  the  circumflex  to  cause some options to be re-
-       instated, but a hyphen may not appear.
+       Letters  may  follow  the circumflex to cause some options to be re-in-
+       stated, but a hyphen may not appear.


        The PCRE2-specific options PCRE2_DUPNAMES  and  PCRE2_UNGREEDY  can  be
        changed  in  the  same  way as the Perl-compatible options by using the
        characters J and U respectively. However, these are not unset by (?^).


-       When one of these option changes occurs at  top  level  (that  is,  not
-       inside  group  parentheses), the change applies to the remainder of the
+       When one of these option changes occurs at top level (that is, not  in-
+       side  group  parentheses),  the  change applies to the remainder of the
        pattern that follows. An option change within a group (see below for  a
        description of groups) affects only that part of the group that follows
        it, so
@@ -7667,11 +7652,11 @@
        well  as  convenience  functions  for extracting captured substrings by
        name.


-       Warning: When more than one capture  group  has  the  same  number,  as
-       described  in the previous section, a name given to one of them applies
-       to all of them. Perl allows identically numbered groups to have differ-
-       ent  names.  Consider this pattern, where there are two capture groups,
-       both numbered 1:
+       Warning: When more than one capture group has the same number,  as  de-
+       scribed in the previous section, a name given to one of them applies to
+       all of them. Perl allows identically numbered groups to have  different
+       names.  Consider this pattern, where there are two capture groups, both
+       numbered 1:


          (?|(?<AA>aa)|(?<BB>bb))


@@ -7733,8 +7718,8 @@
        conditions below), either to check whether a capture group has matched,
        or to check for recursion, all groups with the same name are tested. If
        the condition is true for any one of them,  the  overall  condition  is
-       true.  This  is  the  same  behaviour as testing by number. For further
-       details of the interfaces for handling named capture  groups,  see  the
+       true.  This is the same behaviour as testing by number. For further de-
+       tails of the interfaces for handling  named  capture  groups,  see  the
        pcre2api documentation.



@@ -7817,10 +7802,10 @@
        By default, quantifiers are "greedy", that is, they match  as  much  as
        possible (up to the maximum number of permitted times), without causing
        the rest of the pattern to fail. The  classic  example  of  where  this
-       gives  problems  is  in  trying  to match comments in C programs. These
-       appear between /* and */ and within the comment,  individual  *  and  /
-       characters  may  appear. An attempt to match C comments by applying the
-       pattern
+       gives  problems is in trying to match comments in C programs. These ap-
+       pear between /* and */ and within the comment, individual * and / char-
+       acters  may appear. An attempt to match C comments by applying the pat-
+       tern


          /\*.*\*/


@@ -7852,16 +7837,16 @@
        words, it inverts the default behaviour.


        When  a  parenthesized  group is quantified with a minimum repeat count
-       that is greater than 1 or  with  a  limited  maximum,  more  memory  is
-       required  for  the  compiled  pattern, in proportion to the size of the
-       minimum or maximum.
+       that is greater than 1 or with a limited maximum, more  memory  is  re-
+       quired for the compiled pattern, in proportion to the size of the mini-
+       mum or maximum.


        If a pattern starts with  .*  or  .{0,}  and  the  PCRE2_DOTALL  option
        (equivalent  to  Perl's /s) is set, thus allowing the dot to match new-
        lines, the pattern is implicitly  anchored,  because  whatever  follows
        will  be  tried against every character position in the subject string,
-       so there is no point in retrying the  overall  match  at  any  position
-       after the first. PCRE2 normally treats such a pattern as though it were
+       so there is no point in retrying the overall match at any position  af-
+       ter  the  first. PCRE2 normally treats such a pattern as though it were
        preceded by \A.


        In cases where it is known that the subject  string  contains  no  new-
@@ -7946,8 +7931,8 @@
        match, if anchored at the current point in the subject string.


        Atomic groups are not capture groups. Simple cases such  as  the  above
-       example  can  be  thought  of  as a maximizing repeat that must swallow
-       everything it can.  So, while both \d+ and \d+? are prepared to  adjust
+       example  can be thought of as a maximizing repeat that must swallow ev-
+       erything it can.  So, while both \d+ and \d+? are  prepared  to  adjust
        the  number  of digits they match in order to make the rest of the pat-
        tern match, (?>\d+) can only match an entire sequence of digits.


@@ -7965,10 +7950,10 @@

          (abc|xyz){2,3}+


-       Possessive  quantifiers  are  always  greedy;  the   setting   of   the
-       PCRE2_UNGREEDY  option  is  ignored. They are a convenient notation for
-       the simpler forms of atomic group. However, there is no  difference  in
-       the meaning of a possessive quantifier and the equivalent atomic group,
+       Possessive quantifiers are always greedy; the setting of the  PCRE2_UN-
+       GREEDY  option  is ignored. They are a convenient notation for the sim-
+       pler forms of atomic group. However, there  is  no  difference  in  the
+       meaning  of  a  possessive  quantifier and the equivalent atomic group,
        though there may be a performance  difference;  possessive  quantifiers
        should be slightly faster.


@@ -7984,8 +7969,8 @@
        when B must follow.  This feature can be disabled by the PCRE2_NO_AUTO-
        POSSESS option, or starting the pattern with (*NO_AUTO_POSSESS).


-       When  a  pattern  contains  an unlimited repeat inside a group that can
-       itself be repeated an unlimited number of times, the use of  an  atomic
+       When a pattern contains an unlimited repeat inside a group that can it-
+       self be repeated an unlimited number of times, the  use  of  an  atomic
        group  is the only way to avoid some failing matches taking a very long
        time indeed. The pattern


@@ -7999,13 +7984,13 @@

        it takes a long time before reporting  failure.  This  is  because  the
        string  can be divided between the internal \D+ repeat and the external
-       * repeat in a large number of ways, and all  have  to  be  tried.  (The
-       example  uses  [!?]  rather than a single character at the end, because
-       both PCRE2 and Perl have an optimization that allows for  fast  failure
-       when  a single character is used. They remember the last single charac-
-       ter that is required for a match, and fail early if it is  not  present
-       in  the  string.)  If  the pattern is changed so that it uses an atomic
-       group, like this:
+       * repeat in a large number of ways, and all have to be tried. (The  ex-
+       ample uses [!?] rather than a single character at the end, because both
+       PCRE2 and Perl have an optimization that allows for fast failure when a
+       single  character is used. They remember the last single character that
+       is required for a match, and fail early if it is  not  present  in  the
+       string.)  If  the  pattern  is changed so that it uses an atomic group,
+       like this:


          ((?>\D+)|<\d+>)*[!?]


@@ -8115,8 +8100,8 @@

        A backreference that occurs inside the group to which it  refers  fails
        when  the  group  is  first used, so, for example, (a\1) never matches.
-       However, such references can be  useful  inside  repeated  groups.  For
-       example, the pattern
+       However, such references can be useful inside repeated groups. For  ex-
+       ample, the pattern


          (a|b\1)+


@@ -8142,8 +8127,8 @@

        More complicated assertions are coded as  parenthesized  groups.  There
        are  two  kinds:  those  that look ahead of the current position in the
-       subject string, and those that look behind it,  and  in  each  case  an
-       assertion  may be positive (must match for the assertion to be true) or
+       subject string, and those that look behind it, and in each case an  as-
+       sertion  may  be  positive (must match for the assertion to be true) or
        negative (must not match for the assertion to be  true).  An  assertion
        group is matched in the normal way, and if it is true, matching contin-
        ues after it, but with the matching position in the  subject  string
@@ -8150,8 +8135,8 @@
        reset to what it was before the assertion was processed.


        A  lookaround  assertion  may  also appear as the condition in a condi-
-       tional group (see below). In this case,  the  result  of  matching  the
-       assertion determines which branch of the condition is followed.
+       tional group (see below). In this case, the result of matching the  as-
+       sertion determines which branch of the condition is followed.


        Lookaround assertions are atomic. If an assertion is true, but there is
        a subsequent matching failure, there is no backtracking into the asser-
@@ -8159,8 +8144,8 @@


        Assertion  groups are not capture groups. If an assertion contains cap-
        ture groups within it, these are counted for the purposes of  numbering
-       the  capture  groups  in  the  whole  pattern. Within each branch of an
-       assertion, locally captured substrings may be referenced in  the  usual
+       the  capture  groups in the whole pattern. Within each branch of an as-
+       sertion, locally captured substrings may be  referenced  in  the  usual
        way.  For  example,  a  sequence such as (.)\g{-1} can be used to check
        that two adjacent characters are the same.


@@ -8240,8 +8225,8 @@
        "bar". A lookbehind assertion is needed to achieve the other effect.


        If you want to force a matching failure at some point in a pattern, the
-       most  convenient  way  to  do  it  is with (?!) because an empty string
-       always matches, so an assertion that requires there not to be an  empty
+       most  convenient  way to do it is with (?!) because an empty string al-
+       ways matches, so an assertion that requires there not to  be  an  empty
        string must always fail.  The backtracking control verb (*FAIL) or (*F)
        is a synonym for (?!).


@@ -8297,18 +8282,18 @@
        that is already active, is not supported.


        Perl does not support backreferences in lookbehinds. PCRE2 does support
-       them,   but   only    if    certain    conditions    are    met.    The
-       PCRE2_MATCH_UNSET_BACKREF  option must not be set, there must be no use
-       of (?| in the pattern (it creates duplicate group numbers), and if  the
-       backreference  is by name, the name must be unique. Of course, the ref-
-       erenced group must itself match a fixed length substring. The following
-       pattern matches words containing at least two characters that begin and
-       end with the same character:
+       them, but only if  certain  conditions  are  met.  The  PCRE2_MATCH_UN-
+       SET_BACKREF  option must not be set, there must be no use of (?| in the
+       pattern (it creates duplicate group numbers), and if the  backreference
+       is  by  name,  the name must be unique. Of course, the referenced group
+       must itself match a  fixed  length  substring.  The  following  pattern
+       matches  words  containing  at  least two characters that begin and end
+       with the same character:


           \b(\w)\w++(?<=\1)


-       Possessive quantifiers can  be  used  in  conjunction  with  lookbehind
-       assertions to specify efficient matching of fixed-length strings at the
+       Possessive quantifiers can be used in conjunction with  lookbehind  as-
+       sertions  to  specify efficient matching of fixed-length strings at the
        end of subject strings. Consider a simple pattern such as


          abcd$
@@ -8379,9 +8364,9 @@


        If  part  of a pattern is enclosed between (*script_run: or (*sr: and a
        closing parenthesis, it fails if the sequence  of  characters  that  it
-       matches  is  not  a  script  run.  After a failure, normal backtracking
-       occurs. Script runs can be used to detect spoofing attacks using  char-
-       acters  that  look the same, but are from different scripts. The string
+       matches  is  not a script run. After a failure, normal backtracking oc-
+       curs. Script runs can be used to detect spoofing attacks using  charac-
+       ters  that  look  the  same, but are from different scripts. The string
        "paypal.com" is an infamous example, where the letters could be a  mix-
        ture of Latin and Cyrillic. This pattern ensures that the matched char-
        acters in a sequence of non-spaces that follow white space are a script
@@ -8437,9 +8422,9 @@
        If  the  condition is satisfied, the yes-pattern is used; otherwise the
        no-pattern (if present) is used. An absent no-pattern is equivalent  to
        an  empty string (it always matches). If there are more than two alter-
-       natives in the group, a compile-time error  occurs.  Each  of  the  two
-       alternatives  may  itself  contain nested groups of any form, including
-       conditional groups; the restriction to two alternatives applies only at
+       natives in the group, a compile-time error occurs. Each of the two  al-
+       ternatives may itself contain nested groups of any form, including con-
+       ditional groups; the restriction to two alternatives  applies  only  at
        the  level of the condition itself. This pattern fragment is an example
        where the alternatives are complex:


@@ -8546,65 +8531,64 @@

        If the condition is the string (DEFINE), the condition is always false,
        even  if there is a group with the name DEFINE. In this case, there may
-       be only one alternative in the rest of the  conditional  group.  It  is
-       always  skipped  if control reaches this point in the pattern; the idea
-       of DEFINE is that it can be used to define subroutines that can be ref-
-       erenced  from  elsewhere.  (The use of subroutines is described below.)
-       For  example,  a  pattern  to   match   an   IPv4   address   such   as
-       "192.168.23.245"  could  be  written  like this (ignore white space and
-       line breaks):
+       be only one alternative in the rest of the conditional group. It is al-
+       ways  skipped if control reaches this point in the pattern; the idea of
+       DEFINE is that it can be used to define subroutines that can be  refer-
+       enced  from elsewhere. (The use of subroutines is described below.) For
+       example, a pattern to match an IPv4 address  such  as  "192.168.23.245"
+       could be written like this (ignore white space and line breaks):


          (?(DEFINE) (?<byte> 2[0-4]\d | 25[0-5] | 1\d\d | [1-9]?\d) )
          \b (?&byte) (\.(?&byte)){3} \b


-       The first part of the pattern is a DEFINE group  inside  which  another
-       group  named "byte" is defined. This matches an individual component of
-       an IPv4 address (a number less than 256). When  matching  takes  place,
-       this  part  of  the pattern is skipped because DEFINE acts like a false
-       condition. The rest of the pattern uses references to the  named  group
-       to  match the four dot-separated components of an IPv4 address, insist-
+       The  first  part of the pattern is a DEFINE group inside which another
+       group named "byte" is defined. This matches an individual component  of
+       an  IPv4  address  (a number less than 256). When matching takes place,
+       this part of the pattern is skipped because DEFINE acts  like  a  false
+       condition.  The  rest of the pattern uses references to the named group
+       to match the four dot-separated components of an IPv4 address,  insist-
        ing on a word boundary at each end.


    Checking the PCRE2 version


-       Programs that link with a PCRE2 library can check the version by  call-
-       ing  pcre2_config()  with  appropriate arguments. Users of applications
-       that do not have access to the underlying code cannot do this.  A  spe-
-       cial  "condition" called VERSION exists to allow such users to discover
+       Programs  that link with a PCRE2 library can check the version by call-
+       ing pcre2_config() with appropriate arguments.  Users  of  applications
+       that  do  not have access to the underlying code cannot do this. A spe-
+       cial "condition" called VERSION exists to allow such users to  discover
        which version of PCRE2 they are dealing with by using this condition to
-       match  a string such as "yesno". VERSION must be followed either by "="
+       match a string such as "yesno". VERSION must be followed either by  "="
        or ">=" and a version number.  For example:


          (?(VERSION>=10.4)yes|no)


-       This pattern matches "yes" if the  PCRE2  version  is  at  least
-       10.4,  or "no" otherwise. The fractional part of the version number may
+       This  pattern  matches  "yes"  if  the PCRE2 version is at least
+       10.4, or "no" otherwise. The fractional part of the version number  may
        not contain more than two digits.


    Assertion conditions


-       If the condition is not in any of the  above  formats,  it  must  be  a
-       parenthesized  assertion.  This may be a positive or negative lookahead
-       or lookbehind assertion. Consider this pattern, again  containing  non-
-       significant  white  space,  and with the two alternatives on the second
+       If  the  condition  is  not  in  any of the above formats, it must be a
+       parenthesized assertion. This may be a positive or  negative  lookahead
+       or  lookbehind  assertion. Consider this pattern, again containing non-
+       significant white space, and with the two alternatives  on  the  second
        line:


          (?(?=[^a-z]*[a-z])
          \d{2}-[a-z]{3}-\d{2}  |  \d{2}-\d{2}-\d{2} )


-       The condition  is  a  positive  lookahead  assertion  that  matches  an
-       optional  sequence of non-letters followed by a letter. In other words,
-       it tests for the presence of at least one letter in the subject.  If  a
-       letter  is found, the subject is matched against the first alternative;
-       otherwise it is  matched  against  the  second.  This  pattern  matches
-       strings  in  one  of the two forms dd-aaa-dd or dd-dd-dd, where aaa are
+       The  condition  is  a  positive lookahead assertion that matches an op-
+       tional sequence of non-letters followed by a letter. In other words, it
+       tests for the presence of at least one letter in the subject. If a let-
+       ter is found, the subject is matched  against  the  first  alternative;
+       otherwise  it  is  matched  against  the  second.  This pattern matches
+       strings in one of the two forms dd-aaa-dd or dd-dd-dd,  where  aaa  are
        letters and dd are digits.


        When an assertion that is a condition contains capture groups, any cap-
-       turing  that  occurs  in  a matching branch is retained afterwards, for
-       both positive and negative assertions, because matching always  contin-
-       ues  after  the  assertion, whether it succeeds or fails. (Compare non-
-       conditional assertions, for which captures are retained only for  posi-
+       turing that occurs in a matching branch  is  retained  afterwards,  for
+       both  positive and negative assertions, because matching always contin-
+       ues after the assertion, whether it succeeds or  fails.  (Compare  non-
+       conditional  assertions, for which captures are retained only for posi-
        tive assertions that succeed.)



@@ -8611,44 +8595,44 @@
COMMENTS

        There are two ways of including comments in patterns that are processed
-       by PCRE2. In both cases, the start of the comment  must  not  be  in  a
-       character  class,  nor  in  the middle of any other sequence of related
-       characters such as (?: or a group name or number. The  characters  that
+       by  PCRE2.  In  both  cases,  the start of the comment must not be in a
+       character class, nor in the middle of any  other  sequence  of  related
+       characters  such  as (?: or a group name or number. The characters that
        make up a comment play no part in the pattern matching.


-       The  sequence (?# marks the start of a comment that continues up to the
-       next closing parenthesis. Nested parentheses are not permitted. If  the
-       PCRE2_EXTENDED  or  PCRE2_EXTENDED_MORE  option  is set, an unescaped #
-       character also introduces a comment, which in this  case  continues  to
-       immediately  after  the next newline character or character sequence in
+       The sequence (?# marks the start of a comment that continues up to  the
+       next  closing parenthesis. Nested parentheses are not permitted. If the
+       PCRE2_EXTENDED or PCRE2_EXTENDED_MORE option is  set,  an  unescaped  #
+       character  also  introduces  a comment, which in this case continues to
+       immediately after the next newline character or character  sequence  in
        the pattern. Which characters are interpreted as newlines is controlled
-       by  an option passed to the compiling function or by a special sequence
+       by an option passed to the compiling function or by a special  sequence
        at the start of the pattern, as described in the section entitled "New-
        line conventions" above. Note that the end of this type of comment is a
-       literal newline sequence in the pattern; escape sequences  that  happen
+       literal  newline  sequence in the pattern; escape sequences that happen
        to represent a newline do not count. For example, consider this pattern
-       when PCRE2_EXTENDED is set, and the default newline convention (a  sin-
+       when  PCRE2_EXTENDED is set, and the default newline convention (a sin-
        gle linefeed character) is in force:


          abc #comment \n still comment


-       On  encountering  the # character, pcre2_compile() skips along, looking
-       for a newline in the pattern. The sequence \n is still literal at  this
-       stage,  so  it does not terminate the comment. Only an actual character
+       On encountering the # character, pcre2_compile() skips  along,  looking
+       for  a newline in the pattern. The sequence \n is still literal at this
+       stage, so it does not terminate the comment. Only an  actual  character
        with the code value 0x0a (the default newline) does so.



RECURSIVE PATTERNS

-       Consider the problem of matching a string in parentheses, allowing  for
-       unlimited  nested  parentheses.  Without the use of recursion, the best
-       that can be done is to use a pattern that  matches  up  to  some  fixed
-       depth  of  nesting.  It  is not possible to handle an arbitrary nesting
+       Consider  the problem of matching a string in parentheses, allowing for
+       unlimited nested parentheses. Without the use of  recursion,  the  best
+       that  can  be  done  is  to use a pattern that matches up to some fixed
+       depth of nesting. It is not possible to  handle  an  arbitrary  nesting
        depth.


        For some time, Perl has provided a facility that allows regular expres-
-       sions  to recurse (amongst other things). It does this by interpolating
-       Perl code in the expression at run time, and the code can refer to  the
+       sions to recurse (amongst other things). It does this by  interpolating
+       Perl  code in the expression at run time, and the code can refer to the
        expression itself. A Perl pattern using code interpolation to solve the
        parentheses problem can be created like this:


@@ -8657,67 +8641,67 @@
        The (?p{...}) item interpolates Perl code at run time, and in this case
        refers recursively to the pattern in which it appears.


-       Obviously,  PCRE2  cannot  support  the  interpolation  of  Perl  code.
-       Instead, it supports special syntax for recursion of  the  entire  pat-
-       tern, and also for individual capture group recursion. After its intro-
-       duction in PCRE1 and Python, this kind of  recursion  was  subsequently
-       introduced into Perl at release 5.10.
+       Obviously, PCRE2 cannot support the interpolation  of  Perl  code.  In-
+       stead,  it supports special syntax for recursion of the entire pattern,
+       and also for individual capture group recursion. After its introduction
+       in PCRE1 and Python, this kind of recursion was subsequently introduced
+       into Perl at release 5.10.


-       A  special  item  that consists of (? followed by a number greater than
-       zero and a closing parenthesis is a recursive subroutine  call  of  the
-       capture  group of the given number, provided that it occurs inside that
-       group. (If not,  it  is  a  non-recursive  subroutine  call,  which  is
-       described  in  the  next  section.)  The special item (?R) or (?0) is a
-       recursive call of the entire regular expression.
+       A special item that consists of (? followed by a  number  greater  than
+       zero  and  a  closing parenthesis is a recursive subroutine call of the
+       capture group of the given number, provided that it occurs inside  that
+       group.  (If  not,  it  is a non-recursive subroutine call, which is de-
+       scribed in the next section.) The special item (?R) or (?0) is a recur-
+       sive call of the entire regular expression.


-       This PCRE2 pattern solves the nested parentheses  problem  (assume  the
+       This  PCRE2  pattern  solves the nested parentheses problem (assume the
        PCRE2_EXTENDED option is set so that white space is ignored):


          \( ( [^()]++ | (?R) )* \)


-       First  it matches an opening parenthesis. Then it matches any number of
-       substrings which can either be a  sequence  of  non-parentheses,  or  a
-       recursive  match  of the pattern itself (that is, a correctly parenthe-
-       sized substring).  Finally there is a closing parenthesis. Note the use
-       of a possessive quantifier to avoid backtracking into sequences of non-
+       First it matches an opening parenthesis. Then it matches any number  of
+       substrings  which can either be a sequence of non-parentheses, or a re-
+       cursive match of the pattern itself (that is, a correctly parenthesized
+       substring).   Finally there is a closing parenthesis. Note the use of a
+       possessive quantifier to avoid  backtracking  into  sequences  of  non-
        parentheses.


-       If this were part of a larger pattern, you would not  want  to  recurse
+       If  this  were  part of a larger pattern, you would not want to recurse
        the entire pattern, so instead you could use this:


          ( \( ( [^()]++ | (?1) )* \) )


-       We  have  put the pattern into parentheses, and caused the recursion to
+       We have put the pattern into parentheses, and caused the  recursion  to
        refer to them instead of the whole pattern.


-       In a larger pattern,  keeping  track  of  parenthesis  numbers  can  be
-       tricky.  This is made easier by the use of relative references. Instead
+       In  a  larger  pattern,  keeping  track  of  parenthesis numbers can be
+       tricky. This is made easier by the use of relative references.  Instead
        of (?1) in the pattern above you can write (?-2) to refer to the second
-       most  recently  opened  parentheses  preceding  the recursion. In other
-       words, a negative number counts capturing  parentheses  leftwards  from
+       most recently opened parentheses  preceding  the  recursion.  In  other
+       words,  a  negative  number counts capturing parentheses leftwards from
        the point at which it is encountered.


-       Be  aware  however, that if duplicate capture group numbers are in use,
-       relative references refer to the earliest group  with  the  appropriate
+       Be aware however, that if duplicate capture group numbers are  in  use,
+       relative  references  refer  to the earliest group with the appropriate
        number. Consider, for example:


          (?|(a)|(b)) (c) (?-2)


        The first two capture groups (a) and (b) are both numbered 1, and group
-       (c) is number 2. When the reference (?-2) is  encountered,  the  second
-       most  recently opened parentheses has the number 1, but it is the first
+       (c)  is  number  2. When the reference (?-2) is encountered, the second
+       most recently opened parentheses has the number 1, but it is the  first
        such group (the (a) group) to which the recursion refers. This would be
-       the  same if an absolute reference (?1) was used. In other words, rela-
+       the same if an absolute reference (?1) was used. In other words,  rela-
        tive references are just a shorthand for computing a group number.


-       It is also possible to refer to subsequent capture groups,  by  writing
-       references  such  as  (?+2). However, these cannot be recursive because
-       the reference is not inside the parentheses that are  referenced.  They
-       are  always  non-recursive  subroutine  calls, as described in the next
+       It  is  also possible to refer to subsequent capture groups, by writing
+       references such as (?+2). However, these cannot  be  recursive  because
+       the  reference  is not inside the parentheses that are referenced. They
+       are always non-recursive subroutine calls, as  described  in  the  next
        section.


-       An alternative approach is to use named parentheses.  The  Perl  syntax
-       for  this  is  (?&name);  PCRE1's earlier syntax (?P>name) is also sup-
+       An  alternative  approach  is to use named parentheses. The Perl syntax
+       for this is (?&name); PCRE1's earlier syntax  (?P>name)  is  also  sup-
        ported. We could rewrite the above example as follows:


          (?<pn> \( ( [^()]++ | (?&pn) )* \) )
@@ -8726,40 +8710,40 @@
        used.


        The example pattern that we have been looking at contains nested unlim-
-       ited repeats, and so the use of a possessive  quantifier  for  matching
-       strings  of  non-parentheses  is important when applying the pattern to
+       ited  repeats,  and  so the use of a possessive quantifier for matching
+       strings of non-parentheses is important when applying  the  pattern  to
        strings that do not match. For example, when this pattern is applied to


          (aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa()


-       it yields "no match" quickly. However, if a  possessive  quantifier  is
-       not  used, the match runs for a very long time indeed because there are
-       so many different ways the + and * repeats can carve  up  the  subject,
+       it  yields  "no  match" quickly. However, if a possessive quantifier is
+       not used, the match runs for a very long time indeed because there  are
+       so  many  different  ways the + and * repeats can carve up the subject,
        and all have to be tested before failure can be reported.


-       At  the  end  of a match, the values of capturing parentheses are those
-       from the outermost level. If you want to obtain intermediate values,  a
+       At the end of a match, the values of capturing  parentheses  are  those
+       from  the outermost level. If you want to obtain intermediate values, a
        callout function can be used (see below and the pcre2callout documenta-
        tion). If the pattern above is matched against


          (ab(cd)ef)


-       the value for the inner capturing parentheses  (numbered  2)  is  "ef",
-       which  is  the last value taken on at the top level. If a capture group
-       is not matched at the top level, its final  captured  value  is  unset,
-       even  if it was (temporarily) set at a deeper level during the matching
+       the  value  for  the  inner capturing parentheses (numbered 2) is "ef",
+       which is the last value taken on at the top level. If a  capture  group
+       is  not  matched  at  the top level, its final captured value is unset,
+       even if it was (temporarily) set at a deeper level during the  matching
        process.


-       Do not confuse the (?R) item with the condition (R),  which  tests  for
-       recursion.   Consider  this pattern, which matches text in angle brack-
-       ets, allowing for arbitrary nesting. Only digits are allowed in  nested
-       brackets  (that is, when recursing), whereas any characters are permit-
+       Do  not  confuse  the (?R) item with the condition (R), which tests for
+       recursion.  Consider this pattern, which matches text in  angle  brack-
+       ets,  allowing for arbitrary nesting. Only digits are allowed in nested
+       brackets (that is, when recursing), whereas any characters are  permit-
        ted at the outer level.


          < (?: (?(R) \d++  | [^<>]*+) | (?R)) * >


-       In this pattern, (?(R) is the start of a conditional  group,  with  two
-       different  alternatives  for the recursive and non-recursive cases. The
+       In  this  pattern,  (?(R) is the start of a conditional group, with two
+       different alternatives for the recursive and non-recursive  cases.  The
        (?R) item is the actual recursive call.


    Differences in recursion processing between PCRE2 and Perl
@@ -8766,22 +8750,21 @@


        Some former differences between PCRE2 and Perl no longer exist.


-       Before release 10.30, recursion processing in PCRE2 differed from  Perl
-       in  that  a  recursive  subroutine call was always treated as an atomic
-       group. That is, once it had matched some of the subject string, it  was
-       never  re-entered,  even if it contained untried alternatives and there
-       was a subsequent matching failure. (Historical note:  PCRE  implemented
+       Before  release 10.30, recursion processing in PCRE2 differed from Perl
+       in that a recursive subroutine call was always  treated  as  an  atomic
+       group.  That is, once it had matched some of the subject string, it was
+       never re-entered, even if it contained untried alternatives  and  there
+       was  a  subsequent matching failure. (Historical note: PCRE implemented
        recursion before Perl did.)


-       Starting  with  release 10.30, recursive subroutine calls are no longer
+       Starting with release 10.30, recursive subroutine calls are  no  longer
        treated as atomic. That is, they can be re-entered to try unused alter-
-       natives  if  there  is a matching failure later in the pattern. This is
-       now compatible with the way Perl works. If you want a  subroutine  call
+       natives if there is a matching failure later in the  pattern.  This  is
+       now  compatible  with the way Perl works. If you want a subroutine call
        to be atomic, you must explicitly enclose it in an atomic group.


-       Supporting  backtracking  into  recursions  simplifies certain types of
-       recursive  pattern.  For  example,  this  pattern  matches  palindromic
-       strings:
+       Supporting backtracking into recursions simplifies certain types of re-
+       cursive pattern. For example, this pattern matches palindromic strings:


          ^((.)(?1)\2|.?)$
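
        As a rough illustration only (this is not how PCRE2 executes the pat-
        tern), the shape of the recursive group can be mirrored by a small
        hand-written C function. The names matches_group1 and is_palindrome
        are invented for this sketch. The first branch corresponds to
        (.)(?1)\2 and the second to .?; newline handling by dot and $ is ig-
        nored. Because the sketch fixes the end of the group in advance, it
        never needs to backtrack, which the real matcher does.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: does s[start..end) have the shape described by
   group 1 of ^((.)(?1)\2|.?)$ ? Plain C, not PCRE2 code. */
static int matches_group1(const char *s, size_t start, size_t end)
{
    /* First alternative, (.)(?1)\2: at least two characters, the first
       equal to the last, and the same group matching in between. */
    if (end - start >= 2 && s[start] == s[end - 1] &&
        matches_group1(s, start + 1, end - 1))
        return 1;
    /* Second alternative, .?: zero or one character. */
    return end - start <= 1;
}

int is_palindrome(const char *s)
{
    assert(s != NULL);
    return matches_group1(s, 0, strlen(s));
}
```
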


@@ -8824,9 +8807,9 @@
        is used outside the parentheses to which it refers, it operates  a  bit
        like  a  subroutine  in  a programming language. More accurately, PCRE2
        treats the referenced group as an independent subpattern which it tries
-       to  match  at  the  current  matching position. The called group may be
-       defined before or after the reference.  A  numbered  reference  can  be
-       absolute or relative, as in these examples:
+       to  match at the current matching position. The called group may be de-
+       fined before or after the reference. A numbered reference can be  abso-
+       lute or relative, as in these examples:


          (...(absolute)...)...(?2)...
          (...(relative)...)...(?-1)...
@@ -8868,8 +8851,8 @@


        For compatibility with Oniguruma, the non-Perl syntax \g followed by  a
        name or a number enclosed either in angle brackets or single quotes, is
-       an alternative syntax for calling a group  as  a  subroutine,  possibly
-       recursively.  Here  are two of the examples used above, rewritten using
+       an alternative syntax for calling a group as a subroutine, possibly re-
+       cursively.  Here  are  two  of the examples used above, rewritten using
        this syntax:


          (?<pn> \( ( (?>[^()]+) | \g<pn> )* \) )
@@ -9049,12 +9032,12 @@
        where  A,  B, and C may be complex expressions. After matching "A", the
        matcher processes "BC"; if that fails, causing a  backtrack,  (*ACCEPT)
        is  triggered  and the match succeeds. In both cases, all but C is cap-
-       tured. Whereas (*COMMIT) (see  below)  means  "fail  on  backtrack",  a
-       repeated (*ACCEPT) of this type means "succeed on backtrack".
+       tured. Whereas (*COMMIT) (see below) means "fail on backtrack",  a  re-
+       peated (*ACCEPT) of this type means "succeed on backtrack".


-       Warning:  (*ACCEPT)  should  not  be  used  within  a script run group,
-       because it causes an immediate  exit  from  the  group,  bypassing  the
-       script run checking.
+       Warning:  (*ACCEPT)  should  not be used within a script run group, be-
+       cause it causes an immediate exit from the group, bypassing the  script
+       run checking.


          (*FAIL) or (*FAIL:NAME)


@@ -9070,15 +9053,15 @@
        A match with the string "aaaa" always fails, but the callout  is  taken
        before each backtrack happens (in this example, 10 times).


-       (*ACCEPT:NAME)     and     (*FAIL:NAME)     behave    the    same    as
-       (*MARK:NAME)(*ACCEPT) and (*MARK:NAME)(*FAIL), respectively, that is, a
-       (*MARK) is recorded just before the verb acts.
+       (*ACCEPT:NAME)  and  (*FAIL:NAME)  behave the same as (*MARK:NAME)(*AC-
+       CEPT) and (*MARK:NAME)(*FAIL), respectively,  that  is,  a  (*MARK)  is
+       recorded just before the verb acts.


    Recording which path was taken


-       There  is  one  verb  whose  main  purpose  is to track how a match was
-       arrived at, though it also has a  secondary  use  in  conjunction  with
-       advancing the match starting point (see (*SKIP) below).
+       There  is  one  verb whose main purpose is to track how a match was ar-
+       rived at, though it also has a secondary use in  conjunction  with  ad-
+       vancing the match starting point (see (*SKIP) below).


          (*MARK:NAME) or (*:NAME)


@@ -9175,8 +9158,8 @@
        (*COMMIT) during a match does not always guarantee that a match must be
        at this starting point.


-       Note that (*COMMIT) at the start of a pattern is not  the  same  as  an
-       anchor,  unless PCRE2's start-of-match optimizations are turned off, as
+       Note that (*COMMIT) at the start of a pattern is not the same as an an-
+       chor,  unless  PCRE2's  start-of-match optimizations are turned off, as
        shown in this output from pcre2test:


            re> /(*COMMIT)abc/
@@ -9229,14 +9212,14 @@
        (starting  at  the  first  character in the string), the starting point
        skips on to start the next attempt at "c". Note that a possessive quan-
        tifier does not have the same effect as this example; although it would
-       suppress backtracking  during  the  first  match  attempt,  the  second
-       attempt  would  start at the second character instead of skipping on to
+       suppress backtracking during the first match attempt,  the  second  at-
+       tempt  would  start  at  the second character instead of skipping on to
        "c".


-       If (*SKIP) is used inside a lookbehind to specify a new starting  posi-
-       tion  that  is  not later than the starting point of the current match,
-       the position specified by (*SKIP) is ignored, and  instead  the  normal
-       "bumpalong" occurs.
+       If (*SKIP) is used to specify a new starting position that is the  same
+       as  the  starting  position of the current match, or (by being inside a
+       lookbehind) earlier, the position specified by (*SKIP) is ignored,  and
+       instead the normal "bumpalong" occurs.


          (*SKIP:NAME)


@@ -9289,8 +9272,8 @@
        skips  to  the second alternative and tries COND2, without backtracking
        into COND1. If that succeeds and BAR fails, COND3 is tried.  If  subse-
        quently  BAZ fails, there are no more alternatives, so there is a back-
-       track to whatever came before the  entire  group.  If  (*THEN)  is  not
-       inside an alternation, it acts like (*PRUNE).
+       track to whatever came before the entire group. If (*THEN) is  not  in-
+       side an alternation, it acts like (*PRUNE).


        The  behaviour  of (*THEN:NAME) is not the same as (*MARK:NAME)(*THEN).
        It is like (*MARK:NAME) in that the name is remembered for passing back
@@ -9297,11 +9280,11 @@
        to  the  caller. However, (*SKIP:NAME) searches only for names set with
        (*MARK), ignoring those set by other backtracking verbs.


-       A group that does not contain a | character  is  just  a  part  of  the
-       enclosing  alternative;  it  is  not a nested alternation with only one
-       alternative. The effect of (*THEN) extends beyond such a group  to  the
-       enclosing  alternative.   Consider  this  pattern, where A, B, etc. are
-       complex pattern fragments that do not contain any | characters at  this
+       A group that does not contain a | character is just a part of  the  en-
+       closing  alternative;  it is not a nested alternation with only one al-
+       ternative. The effect of (*THEN) extends beyond such a group to the en-
+       closing  alternative.  Consider this pattern, where A, B, etc. are com-
+       plex pattern fragments that do not contain any  |  characters  at  this
        level:


          A (B(*THEN)C) | D
@@ -9325,14 +9308,14 @@


          ^.*? (?(?=a) a | b(*THEN)c )


-       If the subject is "ba", this pattern does not  match.  Because  .*?  is
-       ungreedy,  it  initially  matches  zero characters. The condition (?=a)
-       then fails, the character "b" is matched,  but  "c"  is  not.  At  this
-       point,  matching does not backtrack to .*? as might perhaps be expected
-       from the presence of the | character. The conditional group is part  of
-       the  single  alternative  that  comprises the whole pattern, and so the
-       match fails. (If there was a backtrack into .*?, allowing it  to  match
-       "b", the match would succeed.)
+       If the subject is "ba", this pattern does not match. Because .*? is un-
+       greedy,  it initially matches zero characters. The condition (?=a) then
+       fails, the character "b" is matched, but "c" is  not.  At  this  point,
+       matching  does  not  backtrack to .*? as might perhaps be expected from
+       the presence of the | character. The conditional group is part  of  the
+       single  alternative  that comprises the whole pattern, and so the match
+       fails. (If there was a backtrack into .*?, allowing it  to  match  "b",
+       the match would succeed.)


        The  verbs just described provide four different "strengths" of control
        when subsequent matching fails. (*THEN) is the weakest, carrying on the
@@ -9383,9 +9366,9 @@


        (*ACCEPT) in a standalone positive assertion causes  the  assertion  to
        succeed  without  any  further  processing; captured strings and a mark
-       name (if  set)  are  retained.  In  a  standalone  negative  assertion,
-       (*ACCEPT)  causes the assertion to fail without any further processing;
-       captured substrings and any mark name are discarded.
+       name (if set) are retained. In a standalone negative  assertion,  (*AC-
+       CEPT) causes the assertion to fail without any further processing; cap-
+       tured substrings and any mark name are discarded.


        If the assertion is a condition, (*ACCEPT) causes the condition  to  be
        true  for  a  positive assertion and false for a negative one; captured
@@ -9415,8 +9398,8 @@
        These behaviours occur whether or not the group is called recursively.


        (*ACCEPT) in a group called as a subroutine causes the subroutine match
-       to succeed without any  further  processing.  Matching  then  continues
-       after the subroutine call. Perl documents this behaviour. Perl's treat-
+       to succeed without any further processing. Matching then continues  af-
+       ter  the  subroutine call. Perl documents this behaviour. Perl's treat-
        ment of the other verbs in subroutines is different in some cases.


        (*FAIL) in a group called as a subroutine has  its  normal  effect:  it
@@ -9447,7 +9430,7 @@


REVISION

-       Last updated: 21 June 2019
+       Last updated: 22 June 2019
        Copyright (c) 1997-2019 University of Cambridge.
 ------------------------------------------------------------------------------


@@ -9531,9 +9514,9 @@
        also reduce the memory requirements.


        In contrast to  pcre2_match(),  pcre2_dfa_match()  does  use  recursive
-       function  calls,  but  only  for  processing  atomic groups, lookaround
-       assertions, and recursion within the pattern. The original  version  of
-       the code used to allocate quite large internal workspace vectors on the
+       function  calls,  but only for processing atomic groups, lookaround as-
+       sertions, and recursion within the pattern. The original version of the
+       code  used  to  allocate  quite large internal workspace vectors on the
        stack, which caused some problems for  some  patterns  in  environments
        with  small  stacks.  From release 10.32 the code for pcre2_dfa_match()
        has been re-factored to use heap memory  when  necessary  for  internal
@@ -9553,8 +9536,8 @@
        (a|e|i|o|u).  In  general,  the simplest construction that provides the
        required behaviour is usually the most efficient. Jeffrey Friedl's book
        contains  a  lot  of useful general discussion about optimizing regular
-       expressions for efficient performance. This  document  contains  a  few
-       observations about PCRE2.
+       expressions for efficient performance. This document contains a few ob-
+       servations about PCRE2.


        Using  Unicode  character  properties  (the  \p, \P, and \X escapes) is
        slow, because PCRE2 has to use a multi-stage table lookup  whenever  it
@@ -9575,11 +9558,11 @@
        option  is  set,  the pattern is implicitly anchored by PCRE2, since it
        can match only at the start of a subject string.  If  the  pattern  has
        multiple top-level branches, they must all be anchorable. The optimiza-
-       tion can be disabled by  the  PCRE2_NO_DOTSTAR_ANCHOR  option,  and  is
-       automatically disabled if the pattern contains (*PRUNE) or (*SKIP).
+       tion can be disabled by the PCRE2_NO_DOTSTAR_ANCHOR option, and is  au-
+       tomatically disabled if the pattern contains (*PRUNE) or (*SKIP).


-       If  PCRE2_DOTALL  is  not  set,  PCRE2  cannot  make this optimization,
-       because the dot metacharacter does not then match a newline, and if the
+       If  PCRE2_DOTALL  is  not set, PCRE2 cannot make this optimization, be-
+       cause the dot metacharacter does not then match a newline, and  if  the
        subject  string contains newlines, the pattern may match from the char-
        acter immediately following one of them instead of from the very start.
        For example, the pattern
@@ -9593,8 +9576,8 @@


        If you are using such a pattern with subject strings that do  not  con-
        tain   newlines,   the   best   performance   is  obtained  by  setting
-       PCRE2_DOTALL, or starting the pattern with  ^.*  or  ^.*?  to  indicate
-       explicit anchoring. That saves PCRE2 from having to scan along the sub-
+       PCRE2_DOTALL, or starting the pattern with ^.* or ^.*? to indicate  ex-
+       plicit  anchoring.  That saves PCRE2 from having to scan along the sub-
        ject looking for a newline to restart at.


        Beware of patterns that contain nested indefinite  repeats.  These  can
@@ -9608,8 +9591,8 @@
        2, 3, or 4 times, and for each of those cases other than 0 or 4, the  +
        repeats  can  match  different numbers of times.) When the remainder of
        the pattern is such that the entire match is going to fail,  PCRE2  has
-       in  principle  to  try  every  possible variation, and this can take an
-       extremely long time, even for relatively short strings.
+       in  principle to try every possible variation, and this can take an ex-
+       tremely long time, even for relatively short strings.


        An optimization catches some of the more simple cases such as


@@ -9663,8 +9646,8 @@
        matching,  and  on  the amount of heap memory that is used. The default
        values of the limits are very large, and unlikely ever to operate. They
        can  be  changed  when  PCRE2  is  built, and they can also be set when
-       pcre2_match() or pcre2_dfa_match() is  called.  For  details  of  these
-       interfaces,  see  the pcre2build documentation and the section entitled
+       pcre2_match() or pcre2_dfa_match() is called. For details of these  in-
+       terfaces,  see  the  pcre2build  documentation and the section entitled
        "The match context" in the pcre2api documentation.


        The pcre2test test program has a modifier called  "find_limits"  which,
@@ -9714,8 +9697,8 @@


        This  set of functions provides a POSIX-style API for the PCRE2 regular
        expression 8-bit library. There are no POSIX-style wrappers for PCRE2's
-       16-bit  and  32-bit  libraries.  See  the  pcre2api documentation for a
-       description of PCRE2's native API, which contains much additional func-
+       16-bit  and  32-bit libraries. See the pcre2api documentation for a de-
+       scription of PCRE2's native API, which contains much  additional  func-
        tionality.


        The functions described here are wrapper functions that ultimately call
@@ -9776,11 +9759,11 @@


COMPILING A PATTERN

-       The  function  pcre2_regcomp()  is  called to compile a pattern into an
-       internal form. By default, the pattern is a C string  terminated  by  a
-       binary zero (but see REG_PEND below). The preg argument is a pointer to
-       a regex_t structure that is used as  a  base  for  storing  information
-       about  the compiled regular expression. (It is also used for input when
+       The function pcre2_regcomp() is called to compile a pattern into an in-
+       ternal form. By default, the pattern is a C string terminated by a  bi-
+       nary zero (but see REG_PEND below). The preg argument is a pointer to a
+       regex_t structure that is used as a base for storing information  about
+       the  compiled  regular  expression.  (It  is  also  used for input when
        REG_PEND is set.)


        The argument cflags is either zero, or contains one or more of the bits
@@ -9816,10 +9799,10 @@
          REG_NOSUB


        When  a  pattern  that  is  compiled  with  this  flag  is  passed   to
-       pcre2_regexec()  for  matching,  the  nmatch  and  pmatch arguments are
-       ignored, and no captured strings are returned.  Versions  of  the  PCRE
-       library  prior  to  10.22 used to set the PCRE2_NO_AUTO_CAPTURE compile
-       option, but this no longer happens because it disables the use of back-
+       pcre2_regexec()  for  matching, the nmatch and pmatch arguments are ig-
+       nored, and no captured strings are returned. Versions of the  PCRE  li-
+       brary  prior to 10.22 used to set the PCRE2_NO_AUTO_CAPTURE compile op-
+       tion, but this no longer happens because it disables the use  of  back-
        references.


          REG_PEND
@@ -9826,8 +9809,8 @@


        If  this option is set, the re_endp  field in the preg structure (which
        has the type const char *) must be set to point to the character beyond
-       the  end  of  the  pattern  before calling pcre2_regcomp(). The pattern
-       itself may now contain binary zeros, which are treated as data  charac-
+       the  end of the pattern before calling pcre2_regcomp(). The pattern it-
+       self may now contain binary zeros, which are treated  as  data  charac-
        ters.  Without  REG_PEND,  a binary zero terminates the pattern and the
        re_endp field is ignored. This is a GNU extension to the POSIX standard
        and  should be used with caution in software intended to be portable to
@@ -9854,8 +9837,8 @@
        Note that REG_UTF is not part of the POSIX standard.


        In the absence of these flags, no options  are  passed  to  the  native
-       function.   This  means  that the  regex is compiled with PCRE2 default
-       semantics. In particular, the way it handles newline characters in  the
+       function.  This means that the regex is compiled with PCRE2 default se-
+       mantics. In particular, the way it handles newline  characters  in  the
        subject  string  is  the Perl way, not the POSIX way. Note that setting
        PCRE2_MULTILINE has only some of the effects specified for REG_NEWLINE.
        It  does not affect the way newlines are matched by the dot metacharac-
@@ -9864,8 +9847,8 @@
        The yield of pcre2_regcomp() is zero on success,  and  non-zero  other-
        wise.  The preg structure is filled in on success, and one other member
        of the structure (as well as re_endp) is public: re_nsub  contains  the
-       number  of  capturing  subpatterns  in  the regular expression. Various
-       error codes are defined in the header file.
+       number  of capturing subpatterns in the regular expression. Various er-
+       ror codes are defined in the header file.


        NOTE: If the yield of pcre2_regcomp() is non-zero, you must not attempt
        to use the contents of the preg structure. If, for example, you pass it
@@ -9906,8 +9889,8 @@


        Default POSIX newline handling can be obtained by setting  PCRE2_DOTALL
        and  PCRE2_DOLLAR_ENDONLY  when  calling  pcre2_compile() directly, but
-       there is no way to make PCRE2 behave exactly  as  for  the  REG_NEWLINE
-       action.  When  using  the  POSIX  API,  passing  REG_NEWLINE to PCRE2's
+       there is no way to make PCRE2 behave exactly as for the REG_NEWLINE ac-
+       tion.  When  using  the  POSIX  API,  passing  REG_NEWLINE  to  PCRE2's
        pcre2_regcomp()  function  causes  PCRE2_MULTILINE  to  be  passed   to
        pcre2_compile(), and REG_DOTALL passes PCRE2_DOTALL. There is no way to
        pass PCRE2_DOLLAR_ENDONLY.
@@ -9941,15 +9924,15 @@


        When this option  is  set,  the  subject  string  starts  at  string  +
        pmatch[0].rm_so  and  ends  at  string  + pmatch[0].rm_eo, which should
-       point to the first character beyond the string.  There  may  be  binary
-       zeros  within the subject string, and indeed, using REG_STARTEND is the
+       point to the first character beyond the string. There may be binary ze-
+       ros  within  the  subject string, and indeed, using REG_STARTEND is the
        only way to pass a subject string that contains a binary zero.


        Whatever the value of  pmatch[0].rm_so,  the  offsets  of  the  matched
        string  and  any  captured  substrings  are still given relative to the
        start of string itself. (Before PCRE2 release 10.30  these  were  given
-       relative  to  string  +  pmatch[0].rm_so,  but  this differs from other
-       implementations.)
+       relative  to  string + pmatch[0].rm_so, but this differs from other im-
+       plementations.)
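
        The following is a minimal sketch of the REG_STARTEND calling conven-
        tion, not a definitive implementation. It uses the standard names
        regcomp(), regexec() and regfree() from the system <regex.h>, and as-
        sumes that this header defines REG_STARTEND (it is an extension and
        is not available everywhere). The helper name search_range is in-
        vented for the example. The offsets it reports are relative to the
        start of the subject, as described above.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/* Hypothetical helper: search for pattern within subject[from..to).
   Returns 0 on a match and stores the match offsets, relative to
   subject, in *so and *eo. Otherwise returns the error code from
   regcomp() or regexec(). */
int search_range(const char *pattern, const char *subject,
                 size_t from, size_t to, long *so, long *eo)
{
    regex_t re;
    regmatch_t pm[1];
    int rc;

    assert(from <= to && so != NULL && eo != NULL);
    rc = regcomp(&re, pattern, REG_EXTENDED);
    if (rc != 0)
        return rc;                  /* re must not be used after a failure */

    pm[0].rm_so = (regoff_t)from;   /* input: where the subject starts */
    pm[0].rm_eo = (regoff_t)to;     /* input: first offset beyond its end */
    rc = regexec(&re, subject, 1, pm, REG_STARTEND);
    if (rc == 0) {
        *so = (long)pm[0].rm_so;    /* output: start of the match */
        *eo = (long)pm[0].rm_eo;    /* output: end of the match */
    }
    regfree(&re);
    return rc;
}
```
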


        This is a BSD extension, compatible with  but  not  specified  by  IEEE
        Standard  1003.2 (POSIX.2), and should be used with caution in software
@@ -9964,8 +9947,8 @@
        pcre2_regexec() are ignored (except possibly  as  input  for  REG_STAR-
        TEND).


-       The  value  of  nmatch  may  be  zero, and the value pmatch may be NULL
-       (unless REG_STARTEND is set); in both these cases  no  data  about  any
+       The  value of nmatch may be zero, and the value pmatch may be NULL (un-
+       less REG_STARTEND is set); in  both  these  cases  no  data  about  any
        matched strings is returned.


        Otherwise,  the  portion  of  the string that was matched, and also any
@@ -9978,9 +9961,9 @@
        elements relate to the capturing subpatterns of the regular expression.
        Unused entries in the array have both structure members set to -1.


-       A  successful  match  yields  a  zero  return;  various error codes are
-       defined in the header file, of  which  REG_NOMATCH  is  the  "expected"
-       failure code.
+       A  successful  match  yields a zero return; various error codes are de-
+       fined in the header file, of which REG_NOMATCH is the "expected"  fail-
+       ure code.
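
        A short sketch of how the return value and the pmatch vector might be
        handled is shown here. It is written against the standard names in
        the system <regex.h>, not PCRE2's own header, and the helper name
        capture_group is an assumption made for the example. It sizes the
        vector from re_nsub, treats a zero return from regexec() as a match,
        and treats an rm_so of -1 as an unused entry.

```c
#include <assert.h>
#include <regex.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: return a newly allocated copy of capture group
   `group` from the first match of `pattern` in `subject`. Returns NULL on
   a compile error, no match, an out-of-range group, or an unset group.
   The caller frees the result. */
char *capture_group(const char *pattern, const char *subject, size_t group)
{
    regex_t re;
    regmatch_t *pm;
    char *out = NULL;
    size_t n;

    assert(pattern != NULL && subject != NULL);
    if (regcomp(&re, pattern, REG_EXTENDED) != 0)
        return NULL;                      /* re is not usable here */

    n = re.re_nsub + 1;                   /* entry 0 is the whole match */
    pm = malloc(n * sizeof *pm);
    if (pm != NULL && group < n &&
        regexec(&re, subject, n, pm, 0) == 0 &&   /* zero means a match */
        pm[group].rm_so != -1) {                  /* -1 marks an unused entry */
        size_t len = (size_t)(pm[group].rm_eo - pm[group].rm_so);
        out = malloc(len + 1);
        if (out != NULL) {
            memcpy(out, subject + pm[group].rm_so, len);
            out[len] = '\0';
        }
    }
    free(pm);
    regfree(&re);
    return out;
}
```
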



 ERROR MESSAGES
@@ -9989,8 +9972,8 @@
        pcre2_regcomp() or pcre2_regexec() to a printable message. If  preg  is
        not  NULL, the error should have arisen from the use of that structure.
        A message terminated by a binary zero is placed in errbuf. If the  buf-
-       fer  is  too  short,  only  the first errbuf_size - 1 characters of the
-       error message are used. The yield of the function is the size of buffer
+       fer  is too short, only the first errbuf_size - 1 characters of the er-
+       ror message are used. The yield of the function is the size  of  buffer
        needed  to hold the whole message, including the terminating zero. This
        value is greater than errbuf_size if the message was truncated.
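
        Because the yield is the size needed for the whole message, one pos-
        sible approach is to ask for the size first and then allocate. The
        sketch below uses the standard regerror() from the system <regex.h>,
        where a zero errbuf_size is specified to report only the size; that
        PCRE2's wrapper accepts a NULL buffer in the same way is an assump-
        tion that should be checked. The name regex_error_message is invented
        for the example.

```c
#include <assert.h>
#include <regex.h>
#include <stdlib.h>

/* Hypothetical helper: return a malloc'd copy of the complete message for
   errcode, or NULL if allocation fails. The caller frees the result. */
char *regex_error_message(int errcode, const regex_t *preg)
{
    /* With a zero-sized buffer, regerror() only reports the size needed,
       which includes the terminating zero. */
    size_t needed = regerror(errcode, preg, NULL, 0);
    char *buf = malloc(needed);
    size_t again;

    if (buf == NULL)
        return NULL;
    again = regerror(errcode, preg, buf, needed);
    assert(again <= needed);        /* the message was not truncated */
    (void)again;                    /* silence warnings under NDEBUG */
    return buf;
}
```
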


@@ -9999,8 +9982,8 @@

        Compiling a regular expression causes memory to be allocated and  asso-
        ciated  with the preg structure. The function pcre2_regfree() frees all
-       such memory, after which preg may no  longer  be  used  as  a  compiled
-       expression.
+       such memory, after which preg may no longer be used as a  compiled  ex-
+       pression.



AUTHOR
@@ -10060,8 +10043,8 @@

        If PCRE2 is installed elsewhere, you may need to add additional options
        to the command line. For example, on a Unix-like system that has  PCRE2
-       installed  in  /usr/local,  you  can  compile the demonstration program
-       using a command like this:
+       installed  in /usr/local, you can compile the demonstration program us-
+       ing a command like this:


          cc -o pcre2demo -I/usr/local/include pcre2demo.c \
             -L/usr/local/lib -lpcre2-8
@@ -10073,8 +10056,8 @@
          ./pcre2demo -g 'cat|dog' 'the dog sat on the cat'


        Note  that  there  is  a  much  more comprehensive test program, called
-       pcre2test, which supports many  more  facilities  for  testing  regular
-       expressions using all three PCRE2 libraries (8-bit, 16-bit, and 32-bit,
+       pcre2test, which supports many more facilities for testing regular  ex-
+       pressions  using  all three PCRE2 libraries (8-bit, 16-bit, and 32-bit,
        though not all three need be installed). The pcre2demo program is  pro-
        vided as a relatively simple coding example.


@@ -10210,8 +10193,8 @@
          errorcode = fwrite(bytes, 1, bytescount, fd);


        Note  that  the  serialized data is binary data that may contain any of
-       the 256 possible byte  values.  On  systems  that  make  a  distinction
-       between binary and non-binary data, be sure that the file is opened for
+       the 256 possible byte values. On systems that make  a  distinction  be-
+       tween  binary  and non-binary data, be sure that the file is opened for
        binary output.
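
        As a hedged sketch of the point about binary output, the helper below
        (write_bytes is a name invented here, and it uses only standard C
        stdio, not any PCRE2 function) opens the file with the "wb" mode and
        checks that every byte was written. Note that fwrite() yields a count
        of items written and not an error code.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: write n bytes to path in binary mode.
   Returns 0 on success and -1 on any error. */
int write_bytes(const char *path, const uint8_t *bytes, size_t n)
{
    FILE *fp;
    size_t written;
    int close_failed;

    assert(path != NULL && (bytes != NULL || n == 0));
    fp = fopen(path, "wb");         /* "b" matters on some systems */
    if (fp == NULL)
        return -1;
    written = fwrite(bytes, 1, n, fp);
    close_failed = (fclose(fp) != 0);
    return (written == n && !close_failed) ? 0 : -1;
}
```
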


        Serializing a set of patterns leaves the original  data  untouched,  so
@@ -10218,18 +10201,18 @@
        they  can  still  be used for matching. Their memory must eventually be
        freed in the usual way by calling pcre2_code_free(). When you have fin-
        ished with the byte stream, it too must be freed by calling pcre2_seri-
-       alize_free(). If this function is  called  with  a  NULL  argument,  it
-       returns immediately without doing anything.
+       alize_free(). If this function is called with a NULL argument,  it  re-
+       turns immediately without doing anything.



RE-USING PRECOMPILED PATTERNS

-       In  order  to  re-use  a  set of saved patterns you must first make the
-       serialized byte stream available in main memory (for example, by  read-
-       ing  from  a  file).  The  management of this memory block is up to the
-       application.  You  can  use  the  pcre2_serialize_get_number_of_codes()
-       function  to  find out how many compiled patterns are in the serialized
-       data without actually decoding the patterns:
+       In  order to re-use a set of saved patterns you must first make the se-
+       rialized byte stream available in main memory (for example, by  reading
+       from a file). The management of this memory block is up to the applica-
+       tion. You can use the pcre2_serialize_get_number_of_codes() function to
+       find  out how many compiled patterns are in the serialized data without
+       actually decoding the patterns:


          uint8_t *bytes = <serialized data>;
          int32_t number_of_codes = pcre2_serialize_get_number_of_codes(bytes);
@@ -10237,8 +10220,8 @@
        The pcre2_serialize_decode() function reads a byte stream and recreates
        the compiled patterns in new memory blocks, setting pointers to them in
        a vector. The first two arguments are a pointer to  a  suitable  vector
-       and  its  length,  and  the third argument points to a byte stream. The
-       final argument is a pointer to a general context, which can be used  to
+       and its length, and the third argument points to a byte stream. The fi-
+       nal argument is a pointer to a general context, which can  be  used  to
        specify  custom  memory  management functions for the decoded patterns.
        If this argument is NULL, malloc() and free() are used. After deserial-
        ization, the byte stream is no longer needed and can be discarded.
@@ -10250,9 +10233,9 @@
            pcre2_serialize_decode(list_of_codes, 2, bytes, NULL);


        If  the  vector  is  not  large enough for all the patterns in the byte
-       stream, it is filled  with  those  that  fit,  and  the  remainder  are
-       ignored.  The  yield of the function is the number of decoded patterns,
-       or one of the following negative error codes:
+       stream, it is filled with those that fit, and  the  remainder  are  ig-
+       nored.  The yield of the function is the number of decoded patterns, or
+       one of the following negative error codes:


          PCRE2_ERROR_BADDATA    second argument is zero or less
          PCRE2_ERROR_BADMAGIC   mismatch of id bytes in the data
@@ -10266,9 +10249,9 @@


        Decoded patterns can be used for matching in the usual way, and must be
        freed by calling pcre2_code_free(). However, be aware that there  is  a
-       potential  race  issue  if  you  are  using multiple patterns that were
-       decoded from a single byte stream in  a  multithreaded  application.  A
-       single copy of the character tables is used by all the decoded patterns
+       potential  race  issue if you are using multiple patterns that were de-
+       coded from a single byte stream in a multithreaded application. A  sin-
+       gle  copy  of  the character tables is used by all the decoded patterns
        and a reference count is used to arrange for its memory to be automati-
        cally  freed when the last pattern is freed, but there is no locking on
        this reference count. Therefore, if you want to call  pcre2_code_free()
@@ -10351,8 +10334,8 @@


        Note that \0dd is always an octal code. The treatment of backslash fol-
        lowed  by  a non-zero digit is complicated; for details see the section
-       "Non-printing characters"  in  the  pcre2pattern  documentation,  where
-       details  of  escape  processing  in EBCDIC environments are also given.
+       "Non-printing characters" in the pcre2pattern documentation, where  de-
+       tails  of  escape  processing  in  EBCDIC  environments are also given.
        \N{U+hh..} is synonymous with \x{hh..} in PCRE2 but is not supported in
        EBCDIC  environments.  Note  that  \N  not followed by an opening curly
        bracket has a different meaning (see below).
@@ -10460,24 +10443,24 @@
        Braille,  Buginese, Buhid, Canadian_Aboriginal, Carian, Caucasian_Alba-
        nian, Chakma,  Cham,  Cherokee,  Common,  Coptic,  Cuneiform,  Cypriot,
        Cyrillic,  Deseret,  Devanagari, Dogra, Duployan, Egyptian_Hieroglyphs,
-       Elbasan,  Ethiopic,  Georgian,  Glagolitic,  Gothic,  Grantha,   Greek,
-       Gujarati,   Gunjala_Gondi,   Gurmukhi,  Han,  Hangul,  Hanifi_Rohingya,
-       Hanunoo,  Hatran,  Hebrew,   Hiragana,   Imperial_Aramaic,   Inherited,
-       Inscriptional_Pahlavi,  Inscriptional_Parthian,  Javanese, Kaithi, Kan-
-       nada, Katakana, Kayah_Li, Kharoshthi, Khmer,  Khojki,  Khudawadi,  Lao,
-       Latin,  Lepcha,  Limbu, Linear_A, Linear_B, Lisu, Lycian, Lydian, Maha-
-       jani, Makasar, Malayalam, Mandaic, Manichaean, Marchen,  Masaram_Gondi,
-       Medefaidrin,     Meetei_Mayek,     Mende_Kikakui,     Meroitic_Cursive,
-       Meroitic_Hieroglyphs, Miao, Modi,  Mongolian,  Mro,  Multani,  Myanmar,
-       Nabataean,  New_Tai_Lue, Newa, Nko, Nushu, Ogham, Ol_Chiki, Old_Hungar-
-       ian, Old_Italic, Old_North_Arabian, Old_Permic,  Old_Persian,  Old_Sog-
-       dian,    Old_South_Arabian,    Old_Turkic,   Oriya,   Osage,   Osmanya,
-       Pahawh_Hmong,    Palmyrene,    Pau_Cin_Hau,    Phags_Pa,    Phoenician,
-       Psalter_Pahlavi,  Rejang,  Runic,  Samaritan, Saurashtra, Sharada, Sha-
-       vian, Siddham, SignWriting, Sinhala,  Sogdian,  Sora_Sompeng,  Soyombo,
-       Sundanese,  Syloti_Nagri,  Syriac, Tagalog, Tagbanwa, Tai_Le, Tai_Tham,
-       Tai_Viet, Takri, Tamil, Tangut, Telugu, Thaana,  Thai,  Tibetan,  Tifi-
-       nagh, Tirhuta, Ugaritic, Vai, Warang_Citi, Yi, Zanabazar_Square.
+       Elbasan, Ethiopic, Georgian, Glagolitic, Gothic,  Grantha,  Greek,  Gu-
+       jarati, Gunjala_Gondi, Gurmukhi, Han, Hangul, Hanifi_Rohingya, Hanunoo,
+       Hatran,  Hebrew,  Hiragana,   Imperial_Aramaic,   Inherited,   Inscrip-
+       tional_Pahlavi,   Inscriptional_Parthian,  Javanese,  Kaithi,  Kannada,
+       Katakana, Kayah_Li, Kharoshthi, Khmer, Khojki, Khudawadi,  Lao,  Latin,
+       Lepcha,  Limbu,  Linear_A,  Linear_B,  Lisu,  Lycian, Lydian, Mahajani,
+       Makasar, Malayalam, Mandaic, Manichaean, Marchen, Masaram_Gondi,  Mede-
+       faidrin, Meetei_Mayek, Mende_Kikakui, Meroitic_Cursive, Meroitic_Hiero-
+       glyphs,  Miao,  Modi,  Mongolian,  Mro,  Multani,  Myanmar,  Nabataean,
+       New_Tai_Lue,   Newa,   Nko,   Nushu,  Ogham,  Ol_Chiki,  Old_Hungarian,
+       Old_Italic, Old_North_Arabian,  Old_Permic,  Old_Persian,  Old_Sogdian,
+       Old_South_Arabian,  Old_Turkic,  Oriya,  Osage,  Osmanya, Pahawh_Hmong,
+       Palmyrene, Pau_Cin_Hau, Phags_Pa, Phoenician, Psalter_Pahlavi,  Rejang,
+       Runic,  Samaritan,  Saurashtra, Sharada, Shavian, Siddham, SignWriting,
+       Sinhala, Sogdian, Sora_Sompeng, Soyombo, Sundanese, Syloti_Nagri,  Syr-
+       iac,  Tagalog,  Tagbanwa,  Tai_Le,  Tai_Tham,  Tai_Viet,  Takri, Tamil,
+       Tangut, Telugu, Thaana, Thai,  Tibetan,  Tifinagh,  Tirhuta,  Ugaritic,
+       Vai, Warang_Citi, Yi, Zanabazar_Square.



 CHARACTER CLASSES
@@ -10604,9 +10587,9 @@
        for example (?^in). An option setting may appear at the start of a non-
        capture group, for example (?i:...).


-       The following are recognized only at the very start  of  a  pattern  or
-       after  one  of the newline or \R options with similar syntax. More than
-       one of them may appear. For the first three, d is a decimal number.
+       The following are recognized only at the very start of a pattern or af-
+       ter one of the newline or \R options with similar syntax. More than one
+       of them may appear. For the first three, d is a decimal number.


          (*LIMIT_DEPTH=d) set the backtracking limit to d
          (*LIMIT_HEAP=d)  set the heap size limit to d * 1024 bytes
@@ -10630,8 +10613,8 @@


NEWLINE CONVENTION

-       These  are  recognized  only  at the very start of the pattern or after
-       option settings with a similar syntax.
+       These are recognized only at the very start of the pattern or after op-
+       tion settings with a similar syntax.


          (*CR)           carriage return only
          (*LF)           linefeed only
@@ -10643,8 +10626,8 @@


WHAT \R MATCHES

-       These are recognized only at the very start of  the  pattern  or  after
-       option setting with a similar syntax.
+       These are recognized only at the very start of the pattern or after op-
+       tion setting with a similar syntax.


          (*BSR_ANYCRLF)  CR, LF, or CRLF
          (*BSR_UNICODE)  any Unicode newline sequence
@@ -10886,23 +10869,23 @@
        The character escapes \b, \B, \d, \D, \s, \S, \w, and \W correctly test
        characters  of  any  code  value,  but, by default, the characters that
        PCRE2 recognizes as digits, spaces, or word characters remain the  same
-       set  as  in  non-UTF  mode,  all  with  code points less than 256. This
-       remains true even when PCRE2  is  built  to  include  Unicode  support,
-       because  to do otherwise would slow down matching in many common cases.
-       Note that this also applies to \b and \B, because they are  defined  in
-       terms  of  \w  and  \W.  If you want to test for a wider sense of, say,
-       "digit", you can use explicit Unicode property tests  such  as  \p{Nd}.
-       Alternatively,  if you set the PCRE2_UCP option, the way that the char-
-       acter escapes work is changed so that Unicode properties  are  used  to
-       determine which characters match. There are more details in the section
-       on generic character types in the pcre2pattern documentation.
+       set  as  in  non-UTF mode, all with code points less than 256. This re-
+       mains true even when PCRE2 is built to include Unicode support, because
+       to  do  otherwise  would  slow down matching in many common cases. Note
+       that this also applies to \b and \B, because they are defined in  terms
+       of  \w  and \W. If you want to test for a wider sense of, say, "digit",
+       you can use explicit Unicode property tests such  as  \p{Nd}.  Alterna-
+       tively, if you set the PCRE2_UCP option, the way that the character es-
+       capes work is changed so that Unicode properties are used to  determine
+       which  characters  match.  There  are  more  details  in the section on
+       generic character types in the pcre2pattern documentation.


        Similarly, characters that match the POSIX named character classes  are
        all low-valued characters, unless the PCRE2_UCP option is set.


-       However,  the  special  horizontal  and  vertical  white space matching
-       escapes (\h, \H, \v, and \V) do match all the appropriate Unicode char-
-       acters, whether or not PCRE2_UCP is set.
+       However,  the  special horizontal and vertical white space matching es-
+       capes (\h, \H, \v, and \V) do match all the appropriate Unicode charac-
+       ters, whether or not PCRE2_UCP is set.



 CASE-EQUIVALENCE IN UTF MODE
@@ -10947,12 +10930,12 @@
        are  only  normally  used  with a small number of scripts. For example,
        U+102E0 (Coptic Epact thousands mark) is used only with Arabic and Cop-
        tic.  In  order  to  make it possible to check this, a Unicode property
-       called Script Extension exists. Its value is a  list  of  scripts  that
-       apply  to  the character. For the majority of characters, the list con-
-       tains just one script, the same one as the  Script  property.  However,
-       for  characters  such  as U+102E0 more than one Script is listed. There
-       are also some Common characters that have a single,  non-Common  script
-       in their Script Extension list.
+       called Script Extension exists. Its value is a list of scripts that ap-
+       ply to the character. For the majority of characters, the list contains
+       just one script, the same one as  the  Script  property.  However,  for
+       characters  such  as  U+102E0 more than one Script is listed. There are
+       also some Common characters that have a single,  non-Common  script  in
+       their Script Extension list.


        The next section describes the basic rules for deciding whether a given
        string of characters is a script run. Note,  however,  that  there  are
@@ -10989,8 +10972,8 @@


        The  first  has the Script Extension list Arabic, Hanifi Rohingya, Syr-
        iac, and Thaana; the second has just Arabic and Hanifi  Rohingya.  Both
-       of  them  could  appear  in  script  runs  of  either  Arabic or Hanifi
-       Rohingya. The first could also appear in Syriac or Thaana script  runs,
+       of  them  could  appear  in  script runs of either Arabic or Hanifi Ro-
+       hingya. The first could also appear in Syriac or  Thaana  script  runs,
        but the second could not.


    The Chinese Han script
@@ -11004,8 +10987,8 @@
        gana,  Katakana,  and Han, or a mixture of Hangul and Han, or a mixture
        of Bopomofo and Han, but not, for example,  a  mixture  of  Hangul  and
        Bopomofo  and  Han. PCRE2 (like Perl) follows Unicode's Technical Stan-
-       dard     39     ("Unicode     Security     Mechanisms",     http://uni-
-       code.org/reports/tr39/) in allowing such mixtures.
+       dard  39   ("Unicode   Security   Mechanisms",   http://unicode.org/re-
+       ports/tr39/) in allowing such mixtures.


    Decimal digits


@@ -11012,8 +10995,8 @@
        Unicode  contains  many sets of 10 decimal digits in different scripts,
        and some scripts (including the Common script) contain  more  than  one
        set.  Some  of  these  decimal  digits  are  visually indistinguishable
-       from the common ASCII  digits.  In  addition  to  the  script  checking
-       described above, if a script run contains any decimal digits, they must
+       from the common ASCII digits. In addition to the  script  checking  de-
+       scribed  above,  if a script run contains any decimal digits, they must
        all come from the same set of 10 adjacent characters.
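
The decimal-digit rule above can be sketched in C. This is only an illustration, not PCRE2's implementation: the table lists a handful of digit sets by the code point of their zero digit (ASCII, Arabic-Indic, Extended Arabic-Indic, Devanagari, Fullwidth), and the function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Code points of DIGIT ZERO for a few Unicode decimal digit sets.
   Illustrative subset only; real Unicode data has many more sets. */
static const uint32_t digit_zero[] = {
    0x0030,  /* ASCII */
    0x0660,  /* Arabic-Indic */
    0x06F0,  /* Extended Arabic-Indic */
    0x0966,  /* Devanagari */
    0xFF10   /* Fullwidth */
};

/* Return the zero digit of the set containing cp, or 0 if cp is not a
   decimal digit in the table above. */
static uint32_t digit_set_of(uint32_t cp)
{
    size_t i;
    for (i = 0; i < sizeof digit_zero / sizeof digit_zero[0]; i++)
        if (cp >= digit_zero[i] && cp <= digit_zero[i] + 9)
            return digit_zero[i];
    return 0;
}

/* Return 1 if every decimal digit in cps[0..n-1] comes from the same
   set of 10 adjacent characters, 0 otherwise. Non-digits are skipped. */
static int digits_from_one_set(const uint32_t *cps, size_t n)
{
    uint32_t seen = 0;
    size_t i;
    assert(cps != NULL || n == 0);
    for (i = 0; i < n; i++) {
        uint32_t set = digit_set_of(cps[i]);
        if (set == 0) continue;          /* not a digit we know about */
        if (seen == 0) seen = set;       /* first digit fixes the set */
        else if (set != seen) return 0;  /* mixed digit sets */
    }
    return 1;
}
```

A string that mixes ASCII "1" with ARABIC-INDIC DIGIT ONE (U+0661) fails this check, whereas a string whose digits are all Devanagari passes.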



@@ -11022,8 +11005,8 @@
        When the PCRE2_UTF option is set, the strings passed  as  patterns  and
        subjects are (by default) checked for validity on entry to the relevant
        functions. If an invalid UTF string is passed, a negative error code is
-       returned.  The  code  unit  offset  to  the  offending character can be
-       extracted from the match data block by  calling  pcre2_get_startchar(),
+       returned.  The  code  unit offset to the offending character can be ex-
+       tracted from the match data  block  by  calling  pcre2_get_startchar(),
        which is used for this purpose after a UTF error.


        In  some  situations, you may already know that your strings are valid,
@@ -11072,16 +11055,16 @@
        UTF-16, where they are used in pairs to encode code points with  values
        greater  than  0xFFFF. The code points that are encoded by UTF-16 pairs
        are available independently in the  UTF-8  and  UTF-32  encodings.  (In
-       other  words,  the  whole  surrogate  thing is a fudge for UTF-16 which
-       unfortunately messes up UTF-8 and UTF-32.)
+       other  words, the whole surrogate thing is a fudge for UTF-16 which un-
+       fortunately messes up UTF-8 and UTF-32.)


        Setting PCRE2_NO_UTF_CHECK at compile time does not disable  the  error
        that  is  given if an escape sequence for an invalid Unicode code point
        is encountered in the pattern. If you want to  allow  escape  sequences
-       such   as   \x{d800}   (a   surrogate  code  point)  you  can  set  the
-       PCRE2_EXTRA_ALLOW_SURROGATE_ESCAPES extra option. However, this is pos-
-       sible only in UTF-8 and UTF-32 modes, because these values are not rep-
-       resentable in UTF-16.
+       such  as  \x{d800}  (a  surrogate code point) you can set the PCRE2_EX-
+       TRA_ALLOW_SURROGATE_ESCAPES extra option.  However,  this  is  possible
+       only  in  UTF-8  and  UTF-32 modes, because these values are not repre-
+       sentable in UTF-16.


    Errors in UTF-8 strings


@@ -11171,18 +11154,18 @@
MATCHING IN INVALID UTF STRINGS

        You can run pattern matches on subject strings that may contain invalid
-       UTF    sequences    if    you    call    pcre2_compile()    with    the
-       PCRE2_MATCH_INVALID_UTF  option.  This  is  supported by pcre2_match(),
-       including  JIT   matching,   but   not   by   pcre2_dfa_match().   When
-       PCRE2_MATCH_INVALID_UTF  is set, it forces PCRE2_UTF to be set as well.
-       Note, however, that the pattern itself must be a valid UTF string.
+       UTF sequences if you  call  pcre2_compile()  with  the  PCRE2_MATCH_IN-
+       VALID_UTF  option.  This  is  supported by pcre2_match(), including JIT
+       matching, but not by pcre2_dfa_match(). When PCRE2_MATCH_INVALID_UTF is
+       set,  it  forces  PCRE2_UTF  to be set as well. Note, however, that the
+       pattern itself must be a valid UTF string.


        Setting PCRE2_MATCH_INVALID_UTF does not  affect  what  pcre2_compile()
        generates,  but  if pcre2_jit_compile() is subsequently called, it does
        generate different code. If JIT is not used, the option affects the be-
-       haviour    of    the   interpretive   code   in   pcre2_match().   When
-       PCRE2_MATCH_INVALID_UTF is set at compile time,  PCRE2_NO_UTF_CHECK  is
-       ignored at match time.
+       haviour of the interpretive code in pcre2_match(). When PCRE2_MATCH_IN-
+       VALID_UTF is set at compile  time,  PCRE2_NO_UTF_CHECK  is  ignored  at
+       match time.


        In  this  mode,  an  invalid  code  unit  sequence in the subject never
        matches any pattern item. It does not match  dot,  it  does  not  match


Modified: code/trunk/doc/pcre2grep.txt
===================================================================
--- code/trunk/doc/pcre2grep.txt    2019-06-21 16:10:17 UTC (rev 1117)
+++ code/trunk/doc/pcre2grep.txt    2019-06-22 16:36:15 UTC (rev 1118)
@@ -12,11 +12,11 @@
 DESCRIPTION


        pcre2grep  searches  files  for  character patterns, in the same way as
-       other grep commands do,  but  it  uses  the  PCRE2  regular  expression
-       library  to  support  patterns  that  are  compatible  with the regular
-       expressions of Perl 5. See pcre2syntax(3) for a quick-reference summary
-       of  pattern  syntax,  or  pcre2pattern(3) for a full description of the
-       syntax and semantics of the regular expressions that PCRE2 supports.
+       other grep commands do, but it uses the PCRE2  regular  expression  li-
+       brary  to support patterns that are compatible with the regular expres-
+       sions of Perl 5. See pcre2syntax(3) for a  quick-reference  summary  of
+       pattern syntax, or pcre2pattern(3) for a full description of the syntax
+       and semantics of the regular expressions that PCRE2 supports.


        Patterns, whether supplied on the command line or in a  separate  file,
        are given without delimiters. For example:
@@ -26,8 +26,8 @@
        If you attempt to use delimiters (for example, by surrounding a pattern
        with slashes, as is common in Perl scripts), they  are  interpreted  as
        part  of  the pattern. Quotes can of course be used to delimit patterns
-       on the command line because they are  interpreted  by  the  shell,  and
-       indeed  quotes  are required if a pattern contains white space or shell
+       on the command line because they are interpreted by the shell, and  in-
+       deed  quotes  are  required  if a pattern contains white space or shell
        metacharacters.


        The first argument that follows any option settings is treated  as  the
@@ -54,8 +54,8 @@
        controlled  by  parameters  that  can  be  set by the --buffer-size and
        --max-buffer-size options. The first of these sets the size  of  buffer
        that  is obtained at the start of processing. If an input file contains
-       very long lines, a larger buffer may be  needed;  this  is  handled  by
-       automatically extending the buffer, up to the limit specified by --max-
+       very long lines, a larger buffer may be needed; this is handled by  au-
+       tomatically  extending  the buffer, up to the limit specified by --max-
        buffer-size. The default values for these parameters can  be  set  when
        pcre2grep  is  built;  if nothing is specified, the defaults are set to
        20KiB and 1MiB respectively. An error occurs if a line is too long  and
@@ -75,12 +75,12 @@
        By default, as soon as one pattern matches a line, no further  patterns
        are considered. However, if --colour (or --color) is used to colour the
        matching substrings, or if --only-matching, --file-offsets, or  --line-
-       offsets  is  used  to  output  only  the  part of the line that matched
-       (either shown literally, or as an offset), scanning resumes immediately
+       offsets  is  used to output only the part of the line that matched (ei-
+       ther shown literally, or as an offset),  scanning  resumes  immediately
        following  the  match,  so that further matches on the same line can be
-       found. If there are multiple  patterns,  they  are  all  tried  on  the
-       remainder  of  the  line, but patterns that follow the one that matched
-       are not tried on the earlier part of the line.
+       found. If there are multiple patterns, they are all tried  on  the  re-
+       mainder  of the line, but patterns that follow the one that matched are
+       not tried on the earlier part of the line.


        This behaviour means that the order  in  which  multiple  patterns  are
        specified  can affect the output when one of the above options is used.
@@ -89,11 +89,11 @@
        overlap).


        Patterns that can match an empty string are accepted, but empty  string
-       matches   are   never   recognized.   An   example   is   the   pattern
-       "(super)?(man)?", in which all components are  optional.  This  pattern
-       finds  all  occurrences  of  both "super" and "man"; the output differs
-       from matching with "super|man" when only the  matching  substrings  are
-       being shown.
+       matches   are  never  recognized.  An  example  is  the  pattern  "(su-
+       per)?(man)?", in which all components are optional. This pattern  finds
+       all  occurrences  of  both  "super"  and "man"; the output differs from
+       matching with "super|man" when only the matching substrings  are  being
+       shown.


        If  the  LC_ALL or LC_CTYPE environment variable is set, pcre2grep uses
        the value to set a locale when calling the PCRE2 library.  The --locale
@@ -116,153 +116,152 @@
        By  default,  a  file that contains a binary zero byte within the first
        1024 bytes is identified as a binary file, and is processed  specially.
        (GNU grep identifies binary files in this manner.) However, if the new-
-       line type is specified as "nul", that is,  the  line  terminator  is  a
-       binary  zero,  the  test  for  a  binary  file  is not applied. See the
-       --binary-files option for a means of changing the way binary files  are
-       handled.
+       line type is specified as "nul", that is, the line terminator is a  bi-
+       nary zero, the test for a binary file is not applied. See the --binary-
+       files option for a means of changing the way binary files are handled.
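
The binary-file test described above can be sketched as follows. This is a minimal illustration under stated assumptions, not pcre2grep's source: the function name and the newline_is_nul flag are hypothetical stand-ins for however the newline type is actually tracked.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define BINARY_CHECK_LEN 1024  /* the "first 1024 bytes" from the text */

/* Return 1 if the buffer looks like a binary file, that is, a zero byte
   occurs within its first 1024 bytes. The test is skipped (returns 0)
   when the line terminator itself is a binary zero. */
static int looks_binary(const unsigned char *buf, size_t len,
                        int newline_is_nul)
{
    size_t n = len < BINARY_CHECK_LEN ? len : BINARY_CHECK_LEN;
    assert(buf != NULL || len == 0);
    if (newline_is_nul || n == 0) return 0;
    return memchr(buf, 0, n) != NULL;
}
```

A zero byte at offset 1500 does not trigger the test, because only the first 1024 bytes are examined.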



BINARY ZEROS IN PATTERNS

-       Patterns  passed  from the command line are strings that are terminated
-       by a binary zero, so cannot contain internal zeros.  However,  patterns
+       Patterns passed from the command line are strings that  are  terminated
+       by  a  binary zero, so cannot contain internal zeros. However, patterns
        that are read from a file via the -f option may contain binary zeros.



OPTIONS

-       The  order  in  which some of the options appear can affect the output.
-       For example, both the -H and -l options affect  the  printing  of  file
-       names.  Whichever  comes later in the command line will be the one that
-       takes effect. Similarly, except where noted  below,  if  an  option  is
-       given  twice,  the  later setting is used. Numerical values for options
-       may be followed by K  or  M,  to  signify  multiplication  by  1024  or
+       The order in which some of the options appear can  affect  the  output.
+       For  example,  both  the  -H and -l options affect the printing of file
+       names. Whichever comes later in the command line will be the  one  that
+       takes  effect.  Similarly,  except  where  noted below, if an option is
+       given twice, the later setting is used. Numerical  values  for  options
+       may  be  followed  by  K  or  M,  to  signify multiplication by 1024 or
        1024*1024 respectively.
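
The K and M suffix rule can be sketched in C as below. The helper is hypothetical and is not taken from pcre2grep; it assumes only upper-case K and M are accepted, as written in the text above, and it rejects anything else after the digits.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Parse a decimal option value with an optional K or M suffix.
   Returns 0 and stores the scaled value in *out on success; returns -1
   on malformed input or overflow. */
static int parse_size_option(const char *text, uint64_t *out)
{
    char *end;
    unsigned long long value;
    uint64_t scale = 1;

    assert(out != NULL);
    if (text == NULL || *text < '0' || *text > '9') return -1;

    errno = 0;
    value = strtoull(text, &end, 10);
    if (errno == ERANGE) return -1;

    if (*end == 'K') { scale = 1024; end++; }
    else if (*end == 'M') { scale = 1024 * 1024; end++; }

    if (*end != '\0') return -1;                /* trailing junk */
    if (value > UINT64_MAX / scale) return -1;  /* would overflow */

    *out = (uint64_t)value * scale;
    return 0;
}
```

With this sketch, "20K" yields 20480 and "1M" yields 1048576, which correspond to the 20KiB and 1MiB defaults mentioned for --buffer-size and --max-buffer-size.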


        --        This terminates the list of options. It is useful if the next
-                 item on the command line starts with a hyphen but is  not  an
-                 option.  This  allows for the processing of patterns and file
+                 item  on  the command line starts with a hyphen but is not an
+                 option. This allows for the processing of patterns  and  file
                  names that start with hyphens.


        -A number, --after-context=number
-                 Output up to number lines  of  context  after  each  matching
-                 line.  Fewer lines are output if the next match or the end of
-                 the file is reached, or if the  processing  buffer  size  has
-                 been  set  too  small.  If file names and/or line numbers are
-                 being output, a hyphen separator is used instead of  a  colon
-                 for  the  context  lines.  A  line  containing "--" is output
-                 between each group of lines, unless they are in fact contigu-
-                 ous  in the input file. The value of number is expected to be
-                 relatively small. When -c is used, -A is ignored.
+                 Output  up  to  number  lines  of context after each matching
+                 line. Fewer lines are output if the next match or the end  of
+                 the  file  is  reached,  or if the processing buffer size has
+                 been set too small. If file names and/or line numbers are be-
+                 ing output, a hyphen separator is used instead of a colon for
+                 the context lines. A line containing "--" is  output  between
+                 each  group  of  lines, unless they are in fact contiguous in
+                 the input file. The value of number is expected to  be  rela-
+                 tively small. When -c is used, -A is ignored.


        -a, --text
-                 Treat binary files as text. This is equivalent  to  --binary-
+                 Treat  binary  files as text. This is equivalent to --binary-
                  files=text.


        -B number, --before-context=number
-                 Output  up  to  number  lines of context before each matching
-                 line. Fewer lines are output if the  previous  match  or  the
-                 start  of the file is within number lines, or if the process-
-                 ing buffer size has been set too small. If file names  and/or
-                 line  numbers  are  being  output, a hyphen separator is used
-                 instead of a colon for the context lines. A  line  containing
-                 "--"  is  output between each group of lines, unless they are
-                 in fact contiguous in the input file. The value of number  is
-                 expected  to  be  relatively  small.  When  -c is used, -B is
-                 ignored.
+                 Output up to number lines of  context  before  each  matching
+                 line.  Fewer  lines  are  output if the previous match or the
+                 start of the file is within number lines, or if the  process-
+                 ing  buffer size has been set too small. If file names and/or
+                 line numbers are being output, a hyphen separator is used in-
+                 stead  of  a  colon  for the context lines. A line containing
+                 "--" is output between each group of lines, unless  they  are
+                 in  fact contiguous in the input file. The value of number is
+                 expected to be relatively small. When -c is used, -B  is  ig-
+                 nored.


        --binary-files=word
-                 Specify how binary files are to be processed. If the word  is
-                 "binary"  (the  default),  pattern  matching  is performed on
-                 binary files, but the only  output  is  "Binary  file  <name>
-                 matches"  when a match succeeds. If the word is "text", which
-                 is equivalent to the -a or --text option,  binary  files  are
-                 processed  in  the  same way as any other file. In this case,
-                 when a match succeeds, the  output  may  be  binary  garbage,
-                 which  can  have  nasty effects if sent to a terminal. If the
-                 word is  "without-match",  which  is  equivalent  to  the  -I
-                 option,  binary  files  are  not  processed  at all; they are
-                 assumed not to be of interest and are skipped without causing
-                 any output or affecting the return code.
+                 Specify  how binary files are to be processed. If the word is
+                 "binary" (the default), pattern matching is performed on  bi-
+                 nary  files,  but  the  only  output  is  "Binary file <name>
+                 matches" when a match succeeds. If the word is "text",  which
+                 is  equivalent  to  the -a or --text option, binary files are
+                 processed in the same way as any other file.  In  this  case,
+                 when  a  match  succeeds,  the  output may be binary garbage,
+                 which can have nasty effects if sent to a  terminal.  If  the
+                 word  is  "without-match",  which is equivalent to the -I op-
+                 tion, binary files are not processed at all; they are assumed
+                 not  to  be  of  interest and are skipped without causing any
+                 output or affecting the return code.


        --buffer-size=number
-                 Set  the  parameter that controls how much memory is obtained
+                 Set the parameter that controls how much memory  is  obtained
                  at the start of processing for buffering files that are being
                  scanned. See also --max-buffer-size below.


        -C number, --context=number
-                 Output  number  lines  of  context both before and after each
-                 matching line.  This is equivalent to setting both -A and  -B
+                 Output number lines of context both  before  and  after  each
+                 matching  line.  This is equivalent to setting both -A and -B
                  to the same value.


        -c, --count
-                 Do  not  output  lines from the files that are being scanned;
-                 instead output the number  of  lines  that  would  have  been
+                 Do not output lines from the files that  are  being  scanned;
+                 instead  output  the  number  of  lines  that would have been
                  shown, either because they matched, or, if -v is set, because
-                 they failed to match. By default, this count is  exactly  the
-                 same  as the number of lines that would have been output, but
-                 if the -M (multiline) option is used (without -v), there  may
-                 be  more suppressed lines than the count (that is, the number
+                 they  failed  to match. By default, this count is exactly the
+                 same as the number of lines that would have been output,  but
+                 if  the -M (multiline) option is used (without -v), there may
+                 be more suppressed lines than the count (that is, the  number
                  of matches).


-                 If no lines are selected, the number zero is output. If  sev-
-                 eral  files  are  being  scanned,  a count is output for each
-                 of them and the -t option can be used to cause a total to  be
-                 output  at  the  end.  However,  if  the --files-with-matches
-                 option is also  used,  only  those  files  whose  counts  are
-                 greater  than  zero  are listed. When -c is used, the -A, -B,
-                 and -C options are ignored.
+                 If  no lines are selected, the number zero is output. If sev-
+                 eral  files  are  being  scanned,  a count is output for each
+                 of  them and the -t option can be used to cause a total to be
+                 output at the end. However, if the  --files-with-matches  op-
+                 tion  is also used, only those files whose counts are greater
+                 than zero are listed. When -c is used, the -A, -B, and -C op-
+                 tions are ignored.


        --colour, --color
                  If this option is given without any data, it is equivalent to
-                 "--colour=auto".   If  data  is required, it must be given in
+                 "--colour=auto".  If data is required, it must  be  given  in
                  the same shell item, separated by an equals sign.


        --colour=value, --color=value
                  This option specifies under what circumstances the parts of a
                  line that matched a pattern should be coloured in the output.
-                 By default, the output is not coloured. The value  (which  is
-                 optional,  see above) may be "never", "always", or "auto". In
-                 the latter case, colouring happens only if the standard  out-
-                 put  is connected to a terminal. More resources are used when
+                 By  default,  the output is not coloured. The value (which is
+                 optional, see above) may be "never", "always", or "auto".  In
+                 the  latter case, colouring happens only if the standard out-
+                 put is connected to a terminal. More resources are used  when
                  colouring is enabled, because pcre2grep has to search for all
-                 possible  matches in a line, not just one, in order to colour
+                 possible matches in a line, not just one, in order to  colour
                  them all.


-                 The colour that is used can be specified by  setting  one  of
-                 the  environment variables PCRE2GREP_COLOUR, PCRE2GREP_COLOR,
+                 The  colour  that  is used can be specified by setting one of
+                 the environment variables PCRE2GREP_COLOUR,  PCRE2GREP_COLOR,
                  PCREGREP_COLOUR, or PCREGREP_COLOR, which are checked in that
                  order.  If  none  of  these  are  set,  pcre2grep  looks  for
-                 GREP_COLORS or GREP_COLOR (in that order). The value  of  the
-                 variable  should  be  a string of two numbers, separated by a
-                 semicolon, except in the  case  of  GREP_COLORS,  which  must
+                 GREP_COLORS  or  GREP_COLOR (in that order). The value of the
+                 variable should be a string of two numbers,  separated  by  a
+                 semicolon,  except  in  the  case  of GREP_COLORS, which must
                  start with "ms=" or "mt=" followed by two semicolon-separated
-                 colours, terminated by the end of the string or by  a  colon.
-                 If  GREP_COLORS  does  not  start  with  "ms=" or "mt=" it is
-                 ignored, and GREP_COLOR is checked.
+                 colours,  terminated  by the end of the string or by a colon.
+                 If GREP_COLORS does not start with "ms=" or "mt=" it  is  ig-
+                 nored, and GREP_COLOR is checked.


-                 If the string obtained from one of the above  variables  con-
+                 If  the  string obtained from one of the above variables con-
                  tains any characters other than semicolon or digits, the set-
                  ting is ignored and the default colour is used. The string is
                  copied directly into the control string for setting colour on
-                 a terminal, so it is your responsibility to ensure  that  the
-                 values  make  sense.  If  no relevant environment variable is
+                 a  terminal,  so it is your responsibility to ensure that the
+                 values make sense. If no  relevant  environment  variable  is
                  set, the default is "1;31", which gives red.


        -D action, --devices=action
-                 If an input path is  not  a  regular  file  or  a  directory,
-                 "action"  specifies  how  it is to be processed. Valid values
-                 are "read" (the default) or "skip" (silently skip the path).
+                 If  an  input path is not a regular file or a directory, "ac-
+                 tion" specifies how it is to be processed. Valid  values  are
+                 "read" (the default) or "skip" (silently skip the path).


        -d action, --directories=action
                  If an input path is a directory, "action" specifies how it is
-                 to  be  processed.   Valid  values are "read" (the default in
-                 non-Windows environments, for compatibility with  GNU  grep),
-                 "recurse"  (equivalent to the -r option), or "skip" (silently
-                 skip the path, the default in Windows environments).  In  the
-                 "read"  case,  directories  are read as if they were ordinary
-                 files. In some operating systems  the  effect  of  reading  a
-                 directory like this is an immediate end-of-file; in others it
+                 to be processed.  Valid values are  "read"  (the  default  in
+                 non-Windows  environments,  for compatibility with GNU grep),
+                 "recurse" (equivalent to the -r option), or "skip"  (silently
+                 skip  the  path, the default in Windows environments). In the
+                 "read" case, directories are read as if  they  were  ordinary
+                 files.  In some operating systems the effect of reading a di-
+                 rectory like this is an immediate end-of-file; in  others  it
                  may provoke an error.


        --depth-limit=number
@@ -271,142 +270,142 @@
        -e pattern, --regex=pattern, --regexp=pattern
                  Specify a pattern to be matched. This option can be used mul-
                  tiple times in order to specify several patterns. It can also
-                 be used as a way of specifying a single pattern  that  starts
-                 with  a hyphen. When -e is used, no argument pattern is taken
-                 from the command line; all  arguments  are  treated  as  file
-                 names.  There is no limit to the number of patterns. They are
-                 applied to each line in the order in which they  are  defined
+                 be  used  as a way of specifying a single pattern that starts
+                 with a hyphen. When -e is used, no argument pattern is  taken
+                 from  the  command  line;  all  arguments are treated as file
+                 names. There is no limit to the number of patterns. They  are
+                 applied  to  each line in the order in which they are defined
                  until one matches.


-                 If  -f is used with -e, the command line patterns are matched
+                 If -f is used with -e, the command line patterns are  matched
                  first, followed by the patterns from the file(s), independent
-                 of  the order in which these options are specified. Note that
-                 multiple use of -e is not the same as a single  pattern  with
+                 of the order in which these options are specified. Note  that
+                 multiple  use  of -e is not the same as a single pattern with
                  alternatives. For example, X|Y finds the first character in a
-                 line that is X or Y, whereas if the two  patterns  are  given
+                 line  that  is  X or Y, whereas if the two patterns are given
                  separately, with X first, pcre2grep finds X if it is present,
                  even if it follows Y in the line. It finds Y only if there is
-                 no  X  in  the line. This matters only if you are using -o or
+                 no X in the line. This matters only if you are  using  -o  or
                  --colo(u)r to show the part(s) of the line that matched.


        --exclude=pattern
                  Files (but not directories) whose names match the pattern are
-                 skipped  without  being processed. This applies to all files,
-                 whether listed on the command  line,  obtained  from  --file-
+                 skipped without being processed. This applies to  all  files,
+                 whether  listed  on  the  command line, obtained from --file-
                  list, or by scanning a directory. The pattern is a PCRE2 reg-
-                 ular expression, and is matched against the  final  component
-                 of  the  file  name,  not the entire path. The -F, -w, and -x
-                 options do not apply to this pattern. The option may be given
+                 ular  expression,  and is matched against the final component
+                 of the file name, not the entire path. The -F, -w, and -x op-
+                 tions  do  not apply to this pattern. The option may be given
                  any number of times in order to specify multiple patterns. If
-                 a file name matches both an --include and an  --exclude  pat-
+                 a  file  name matches both an --include and an --exclude pat-
                  tern, it is excluded. There is no short form for this option.


        --exclude-from=filename
-                 Treat  each  non-empty  line  of  the file as the data for an
+                 Treat each non-empty line of the file  as  the  data  for  an
                  --exclude option. What constitutes a newline when reading the
-                 file  is the operating system's default. The --newline option
-                 has no effect on this option. This option may be  given  more
+                 file is the operating system's default. The --newline  option
+                 has  no  effect on this option. This option may be given more
                  than once in order to specify a number of files to read.


        --exclude-dir=pattern
                  Directories whose names match the pattern are skipped without
-                 being processed, whatever  the  setting  of  the  --recursive
-                 option.  This  applies  to all directories, whether listed on
-                 the command line, obtained from --file-list, or by scanning a
-                 parent  directory. The pattern is a PCRE2 regular expression,
-                 and is matched against the final component of  the  directory
-                 name,  not the entire path. The -F, -w, and -x options do not
-                 apply to this pattern. The option may be given any number  of
-                 times  in order to specify more than one pattern. If a direc-
-                 tory matches both  --include-dir  and  --exclude-dir,  it  is
-                 excluded. There is no short form for this option.
+                 being  processed, whatever the setting of the --recursive op-
+                 tion. This applies to all directories, whether listed on  the
+                 command  line,  obtained  from  --file-list, or by scanning a
+                 parent directory. The pattern is a PCRE2 regular  expression,
+                 and  is  matched against the final component of the directory
+                 name, not the entire path. The -F, -w, and -x options do  not
+                 apply  to this pattern. The option may be given any number of
+                 times in order to specify more than one pattern. If a  direc-
+                 tory  matches both --include-dir and --exclude-dir, it is ex-
+                 cluded. There is no short form for this option.


        -F, --fixed-strings
-                 Interpret  each  data-matching  pattern  as  a  list of fixed
-                 strings, separated by  newlines,  instead  of  as  a  regular
-                 expression.  What  constitutes  a newline for this purpose is
-                 controlled by the --newline option. The -w (match as a  word)
-                 and  -x (match whole line) options can be used with -F.  They
-                 apply to each of the fixed strings. A line is selected if any
+                 Interpret each data-matching  pattern  as  a  list  of  fixed
+                 strings,  separated  by newlines, instead of as a regular ex-
+                 pression. What constitutes a newline for this purpose is con-
+                 trolled by the --newline option. The -w (match as a word) and
+                 -x (match whole line) options can be used with -F.  They  ap-
+                 ply  to  each of the fixed strings. A line is selected if any
                  of the fixed strings are found in it (subject to -w or -x, if
-                 present). This option applies only to the patterns  that  are
-                 matched  against  the contents of files; it does not apply to
-                 patterns specified by  any  of  the  --include  or  --exclude
-                 options.
+                 present).  This  option applies only to the patterns that are
+                 matched against the contents of files; it does not  apply  to
+                 patterns  specified  by any of the --include or --exclude op-
+                 tions.


        -f filename, --file=filename
-                 Read  patterns  from  the  file, one per line, and match them
-                 against each line of input. As is the case with  patterns  on
-                 the  command line, no delimiters should be used. What consti-
-                 tutes a newline when reading the file is the  operating  sys-
-                 tem's  default interpretation of \n. The --newline option has
-                 no effect on this option. Trailing  white  space  is  removed
-                 from  each  line,  and blank lines are ignored. An empty file
-                 contains no patterns and therefore matches nothing.  Patterns
-                 read  from a file in this way may contain binary zeros, which
-                 are treated as ordinary data characters. See  also  the  com-
-                 ments  about  multiple  patterns versus a single pattern with
+                 Read patterns from the file, one per  line,  and  match  them
+                 against  each  line of input. As is the case with patterns on
+                 the command line, no delimiters should be used. What  consti-
+                 tutes  a  newline when reading the file is the operating sys-
+                 tem's default interpretation of \n. The --newline option  has
+                 no  effect  on  this  option. Trailing white space is removed
+                 from each line, and blank lines are ignored.  An  empty  file
+                 contains  no patterns and therefore matches nothing. Patterns
+                 read from a file in this way may contain binary zeros,  which
+                 are  treated  as  ordinary data characters. See also the com-
+                 ments about multiple patterns versus a  single  pattern  with
                  alternatives in the description of -e above.


-                 If this option is given more than  once,  all  the  specified
-                 files  are read. A data line is output if any of the patterns
-                 match it. A file name can be given as "-"  to  refer  to  the
-                 standard  input.  When  -f is used, patterns specified on the
-                 command line using -e may also be present;  they  are  tested
-                 before  the  file's  patterns.  However,  no other pattern is
+                 If  this  option  is  given more than once, all the specified
+                 files are read. A data line is output if any of the  patterns
+                 match  it.  A  file  name can be given as "-" to refer to the
+                 standard input. When -f is used, patterns  specified  on  the
+                 command  line  using  -e may also be present; they are tested
+                 before the file's patterns.  However,  no  other  pattern  is
                  taken from the command line; all arguments are treated as the
                  names of paths to be searched.


        --file-list=filename
-                 Read  a  list  of  files  and/or  directories  that are to be
+                 Read a list of  files  and/or  directories  that  are  to  be
                  scanned from the given file, one per line. What constitutes a
-                 newline  when  reading  the  file  is  the operating system's
-                 default. Trailing white space is removed from each line,  and
+                 newline when reading the file is the operating  system's  de-
+                 fault.  Trailing  white  space is removed from each line, and
                  blank lines are ignored. These paths are processed before any
-                 that are listed on the command line. The  file  name  can  be
-                 given  as  "-"  to refer to the standard input. If --file and
-                 --file-list are both specified  as  "-",  patterns  are  read
-                 first.  This is useful only when the standard input is a ter-
-                 minal, from which further lines (the list of  files)  can  be
+                 that  are  listed  on  the command line. The file name can be
+                 given as "-" to refer to the standard input.  If  --file  and
+                 --file-list  are  both  specified  as  "-", patterns are read
+                 first. This is useful only when the standard input is a  ter-
+                 minal,  from  which  further lines (the list of files) can be
                  read after an end-of-file indication. If this option is given
                  more than once, all the specified files are read.


        --file-offsets
-                 Instead of showing lines or parts of lines that  match,  show
-                 each  match  as  an  offset  from the start of the file and a
-                 length, separated by a comma. In this  mode,  no  context  is
-                 shown.  That  is,  the -A, -B, and -C options are ignored. If
+                 Instead  of  showing lines or parts of lines that match, show
+                 each match as an offset from the start  of  the  file  and  a
+                 length,  separated  by  a  comma. In this mode, no context is
+                 shown. That is, the -A, -B, and -C options  are  ignored.  If
                  there is more than one match in a line, each of them is shown
-                 separately.  This option is mutually exclusive with --output,
+                 separately. This option is mutually exclusive with  --output,
                  --line-offsets, and --only-matching.


        -H, --with-filename
-                 Force the inclusion of the file name at the start  of  output
+                 Force  the  inclusion of the file name at the start of output
                  lines when searching a single file. By default, the file name
                  is not shown in this case.  For matching lines, the file name
                  is followed by a colon; for context lines, a hyphen separator
-                 is used. If a line number is also being  output,  it  follows
-                 the  file  name. When the -M option causes a pattern to match
-                 more than one line, only the first is preceded  by  the  file
-                 name.  This  option  overrides  any  previous  -h,  -l, or -L
-                 options.
+                 is  used.  If  a line number is also being output, it follows
+                 the file name. When the -M option causes a pattern  to  match
+                 more  than  one  line, only the first is preceded by the file
+                 name. This option overrides any previous -h, -l,  or  -L  op-
+                 tions.


        -h, --no-filename
                  Suppress the output file names when searching multiple files.
-                 By  default,  file  names  are  shown when multiple files are
-                 searched. For matching lines, the file name is followed by  a
-                 colon;  for  context lines, a hyphen separator is used.  If a
-                 line number is also being output, it follows the  file  name.
+                 By default, file names are  shown  when  multiple  files  are
+                 searched.  For matching lines, the file name is followed by a
+                 colon; for context lines, a hyphen separator is used.   If  a
+                 line  number  is also being output, it follows the file name.
                  This option overrides any previous -H, -L, or -l options.


        --heap-limit=number
                  See --match-limit below.


-       --help    Output  a  help  message, giving brief details of the command
-                 options and file type support, and then exit.  Anything  else
+       --help    Output a help message, giving brief details  of  the  command
+                 options  and  file type support, and then exit. Anything else
                  on the command line is ignored.


-       -I        Ignore   binary   files.  This  is  equivalent  to  --binary-
+       -I        Ignore  binary  files.  This  is  equivalent   to   --binary-
                  files=without-match.


        -i, --ignore-case
@@ -413,56 +412,56 @@
                  Ignore upper/lower case distinctions during comparisons.


        --include=pattern
-                 If any --include patterns are specified, the only files  that
-                 are  processed  are those that match one of the patterns (and
-                 do not match an --exclude  pattern).  This  option  does  not
-                 affect  directories,  but  it  applies  to all files, whether
-                 listed on the command line, obtained from --file-list, or  by
-                 scanning  a directory. The pattern is a PCRE2 regular expres-
-                 sion, and is matched against the final component of the  file
-                 name,  not the entire path. The -F, -w, and -x options do not
-                 apply to this pattern. The option may be given any number  of
-                 times.  If  a  file  name  matches  both  an --include and an
-                 --exclude pattern, it is excluded.  There is  no  short  form
-                 for this option.
+                 If  any --include patterns are specified, the only files that
+                 are processed are those that match one of the  patterns  (and
+                 do  not match an --exclude pattern). This option does not af-
+                 fect directories, but it applies to all files, whether listed
+                 on  the  command line, obtained from --file-list, or by scan-
+                 ning a directory. The pattern is a PCRE2 regular  expression,
+                 and  is matched against the final component of the file name,
+                 not the entire path. The -F, -w, and -x options do not  apply
+                 to this pattern. The option may be given any number of times.
+                 If a file name matches both an  --include  and  an  --exclude
+                 pattern, it is excluded.  There is no short form for this op-
+                 tion.


        --include-from=filename
-                 Treat  each  non-empty  line  of  the file as the data for an
+                 Treat each non-empty line of the file  as  the  data  for  an
                  --include option. What constitutes a newline for this purpose
-                 is  the  operating system's default. The --newline option has
+                 is the operating system's default. The --newline  option  has
                  no effect on this option. This option may be given any number
                  of times; all the files are read.


        --include-dir=pattern
-                 If  any --include-dir patterns are specified, the only direc-
-                 tories that are processed are those that  match  one  of  the
-                 patterns  (and  do  not match an --exclude-dir pattern). This
-                 applies to all directories, whether  listed  on  the  command
-                 line,  obtained  from  --file-list,  or  by scanning a parent
-                 directory. The pattern is a PCRE2 regular expression, and  is
-                 matched  against  the  final component of the directory name,
-                 not the entire path. The -F, -w, and -x options do not  apply
+                 If any --include-dir patterns are specified, the only  direc-
+                 tories  that  are  processed  are those that match one of the
+                 patterns (and do not match an  --exclude-dir  pattern).  This
+                 applies  to  all  directories,  whether listed on the command
+                 line, obtained from --file-list, or by scanning a parent  di-
+                 rectory.  The  pattern  is a PCRE2 regular expression, and is
+                 matched against the final component of  the  directory  name,
+                 not  the entire path. The -F, -w, and -x options do not apply
                  to this pattern. The option may be given any number of times.
-                 If a directory matches both --include-dir and  --exclude-dir,
+                 If  a directory matches both --include-dir and --exclude-dir,
                  it is excluded. There is no short form for this option.


        -L, --files-without-match
-                 Instead  of  outputting lines from the files, just output the
-                 names of the files that do not contain any lines  that  would
-                 have  been  output. Each file name is output once, on a sepa-
-                 rate line. This option overrides any previous -H, -h,  or  -l
+                 Instead of outputting lines from the files, just  output  the
+                 names  of  the files that do not contain any lines that would
+                 have been output. Each file name is output once, on  a  sepa-
+                 rate  line.  This option overrides any previous -H, -h, or -l
                  options.


        -l, --files-with-matches
-                 Instead  of  outputting lines from the files, just output the
+                 Instead of outputting lines from the files, just  output  the
                  names of the files containing lines that would have been out-
-                 put.  Each  file  name  is  output  once, on a separate line.
-                 Searching normally stops as soon as a matching line is  found
-                 in  a  file.  However, if the -c (count) option is also used,
-                 matching continues in order to obtain the correct count,  and
-                 those  files  that  have  at least one match are listed along
+                 put. Each file name is  output  once,  on  a  separate  line.
+                 Searching  normally stops as soon as a matching line is found
+                 in a file. However, if the -c (count) option  is  also  used,
+                 matching  continues in order to obtain the correct count, and
+                 those files that have at least one  match  are  listed  along
                  with their counts. Using this option with -c is a way of sup-
-                 pressing  the  listing  of files with no matches. This option
+                 pressing the listing of files with no  matches.  This  option
                  overrides any previous -H, -h, or -L options.


        --label=name
@@ -471,312 +470,312 @@
                  input)" is used. There is no short form for this option.


        --line-buffered
-                 When this option is given, non-compressed input is  read  and
-                 processed  line by line, and the output is flushed after each
-                 write. By default, input is  read  in  large  chunks,  unless
-                 pcre2grep  can  determine  that it is reading from a terminal
-                 (which is currently possible only in  Unix-like  environments
-                 or  Windows).  Output  to  terminal is normally automatically
-                 flushed by the operating system. This option  can  be  useful
+                 When  this  option is given, non-compressed input is read and
+                 processed line by line, and the output is flushed after  each
+                 write.  By  default,  input  is  read in large chunks, unless
+                 pcre2grep can determine that it is reading  from  a  terminal
+                 (which  is  currently possible only in Unix-like environments
+                 or Windows). Output to  terminal  is  normally  automatically
+                 flushed  by  the  operating system. This option can be useful
                  when the input or output is attached to a pipe and you do not
-                 want pcre2grep to buffer up large amounts of data.   However,
-                 its  use  will  affect  performance,  and  the -M (multiline)
-                 option ceases to work. When input is from a compressed .gz or
+                 want  pcre2grep to buffer up large amounts of data.  However,
+                 its use will affect performance, and the -M  (multiline)  op-
+                 tion  ceases  to work. When input is from a compressed .gz or
                  .bz2 file, --line-buffered is ignored.


        --line-offsets
-                 Instead  of  showing lines or parts of lines that match, show
+                 Instead of showing lines or parts of lines that  match,  show
                  each match as a line number, the offset from the start of the
-                 line,  and a length. The line number is terminated by a colon
-                 (as usual; see the -n option), and the offset and length  are
-                 separated  by  a  comma.  In  this mode, no context is shown.
-                 That is, the -A, -B, and -C options are ignored. If there  is
-                 more  than  one  match in a line, each of them is shown sepa-
-                 rately. This option  is  mutually  exclusive  with  --output,
+                 line, and a length. The line number is terminated by a  colon
+                 (as  usual; see the -n option), and the offset and length are
+                 separated by a comma. In this  mode,  no  context  is  shown.
+                 That  is, the -A, -B, and -C options are ignored. If there is
+                 more than one match in a line, each of them  is  shown  sepa-
+                 rately.  This  option  is  mutually  exclusive with --output,
                  --file-offsets, and --only-matching.


        --locale=locale-name
-                 This  option specifies a locale to be used for pattern match-
-                 ing. It overrides the value in the LC_ALL or  LC_CTYPE  envi-
-                 ronment  variables.  If  no  locale  is  specified, the PCRE2
-                 library's default (usually the "C" locale) is used. There  is
-                 no short form for this option.
+                 This option specifies a locale to be used for pattern  match-
+                 ing.  It  overrides the value in the LC_ALL or LC_CTYPE envi-
+                 ronment variables. If no locale is specified, the  PCRE2  li-
+                 brary's default (usually the "C" locale) is used. There is no
+                 short form for this option.


        --match-limit=number
-                 Processing  some  regular expression patterns may take a very
+                 Processing some regular expression patterns may take  a  very
                  long time to search for all possible matching strings. Others
-                 may  require  a  very large amount of memory. There are three
+                 may require a very large amount of memory.  There  are  three
                  options that set resource limits for matching.


                  The --match-limit option provides a means of limiting comput-
-                 ing  resource  usage  when  processing  patterns that are not
-                 going to match, but which have a very large number of  possi-
-                 bilities in their search trees. The classic example is a pat-
-                 tern that uses nested unlimited  repeats.  Internally,  PCRE2
-                 has  a  counter that is incremented each time around its main
-                 processing  loop.  If  the  value  set  by  --match-limit  is
-                 reached, an error occurs.
+                 ing resource usage when processing patterns that are not  go-
+                 ing to match, but which have a very large number of possibil-
+                 ities in their search trees. The classic example is a pattern
+                 that  uses  nested unlimited repeats. Internally, PCRE2 has a
+                 counter that is incremented each time around  its  main  pro-
+                 cessing  loop.  If the value set by --match-limit is reached,
+                 an error occurs.


-                 The  --heap-limit  option specifies, as a number of kibibytes
-                 (units of 1024 bytes), the amount of heap memory that may  be
+                 The --heap-limit option specifies, as a number  of  kibibytes
+                 (units  of 1024 bytes), the amount of heap memory that may be
                  used for matching. Heap memory is needed only if matching the
-                 pattern requires a significant number of nested  backtracking
+                 pattern  requires a significant number of nested backtracking
                  points to be remembered. This parameter can be set to zero to
                  forbid the use of heap memory altogether.


-                 The --depth-limit option limits the  depth  of  nested  back-
+                 The  --depth-limit  option  limits  the depth of nested back-
                  tracking points, which indirectly limits the amount of memory
                  that is used. The amount of memory needed for each backtrack-
-                 ing  point  depends on the number of capturing parentheses in
+                 ing point depends on the number of capturing  parentheses  in
                  the pattern, so the amount of memory that is used before this
-                 limit  acts  varies from pattern to pattern. This limit is of
+                 limit acts varies from pattern to pattern. This limit  is  of
                  use only if it is set smaller than --match-limit.


-                 There are no short forms for these options. The default  lim-
-                 its  can  be  set when the PCRE2 library is compiled; if they
-                 are not specified, the defaults are very large and so  effec-
+                 There  are no short forms for these options. The default lim-
+                 its can be set when the PCRE2 library is  compiled;  if  they
+                 are  not specified, the defaults are very large and so effec-
                  tively unlimited.


        --max-buffer-size=number
-                 This  limits  the  expansion  of the processing buffer, whose
-                 initial size can be set by --buffer-size. The maximum  buffer
-                 size  is  silently  forced to be no smaller than the starting
+                 This limits the expansion of  the  processing  buffer,  whose
+                 initial  size can be set by --buffer-size. The maximum buffer
+                 size is silently forced to be no smaller  than  the  starting
                  buffer size.


        -M, --multiline
-                 Allow patterns to match more than one line. When this  option
+                 Allow  patterns to match more than one line. When this option
                  is set, the PCRE2 library is called in "multiline" mode. This
-                 allows a matched string to extend past the end of a line  and
-                 continue  on one or more subsequent lines. Patterns used with
+                 allows  a matched string to extend past the end of a line and
+                 continue on one or more subsequent lines. Patterns used  with
                  -M may usefully contain literal newline characters and inter-
-                 nal  occurrences of ^ and $ characters. The output for a suc-
-                 cessful match may consist of more than one  line.  The  first
-                 line  is  the  line  in which the match started, and the last
-                 line is the line in which the match  ended.  If  the  matched
-                 string  ends  with a newline sequence, the output ends at the
-                 end of that line.  If -v is set,  none  of  the  lines  in  a
-                 multi-line  match  are output. Once a match has been handled,
-                 scanning restarts at the beginning of the line after the  one
+                 nal occurrences of ^ and $ characters. The output for a  suc-
+                 cessful  match  may  consist of more than one line. The first
+                 line is the line in which the match  started,  and  the  last
+                 line  is  the  line  in which the match ended. If the matched
+                 string ends with a newline sequence, the output ends  at  the
+                 end  of  that  line.   If  -v  is set, none of the lines in a
+                 multi-line match are output. Once a match has  been  handled,
+                 scanning  restarts at the beginning of the line after the one
                  in which the match ended.


-                 The  newline  sequence  that separates multiple lines must be
-                 matched as part of the pattern.  For  example,  to  find  the
-                 phrase  "regular  expression" in a file where "regular" might
-                 be at the end of a line and "expression" at the start of  the
+                 The newline sequence that separates multiple  lines  must  be
+                 matched  as  part  of  the  pattern. For example, to find the
+                 phrase "regular expression" in a file where  "regular"  might
+                 be  at the end of a line and "expression" at the start of the
                  next line, you could use this command:


                    pcre2grep -M 'regular\s+expression' <file>


-                 The  \s  escape  sequence  matches any white space character,
-                 including newlines, and is followed  by  +  so  as  to  match
-                 trailing  white  space  on the first line as well as possibly
-                 handling a two-character newline sequence.
+                 The \s escape sequence matches any white space character, in-
+                 cluding  newlines, and is followed by + so as to match trail-
+                 ing white space on the first line as well  as  possibly  han-
+                 dling a two-character newline sequence.


-                 There is a limit to the number of lines that can be  matched,
-                 imposed  by  the way that pcre2grep buffers the input file as
-                 it scans it. With a  sufficiently  large  processing  buffer,
+                 There  is a limit to the number of lines that can be matched,
+                 imposed by the way that pcre2grep buffers the input  file  as
+                 it  scans  it.  With  a sufficiently large processing buffer,
                  this should not be a problem, but the -M option does not work
                  when input is read line by line (see --line-buffered.)


        -N newline-type, --newline=newline-type
-                 The PCRE2 library supports  five  different  conventions  for
-                 indicating  the  ends of lines. They are the single-character
-                 sequences CR (carriage return) and LF  (linefeed),  the  two-
-                 character  sequence CRLF, an "anycrlf" convention, which rec-
-                 ognizes any of the preceding three types, and an  "any"  con-
-                 vention, in which any Unicode line ending sequence is assumed
-                 to end a line. The Unicode sequences are the three just  men-
-                 tioned,  plus  VT  (vertical  tab,  U+000B),  FF  (form feed,
-                 U+000C),  NEL  (next  line,  U+0085),  LS  (line   separator,
+                 The PCRE2 library supports five different conventions for in-
+                 dicating the ends of lines. They are the single-character se-
+                 quences CR (carriage return) and LF (linefeed), the two-char-
+                 acter sequence CRLF, an "anycrlf"  convention,  which  recog-
+                 nizes  any of the preceding three types, and an "any" conven-
+                 tion, in which any Unicode line ending sequence is assumed to
+                 end  a  line.  The  Unicode sequences are the three just men-
+                 tioned, plus  VT  (vertical  tab,  U+000B),  FF  (form  feed,
+                 U+000C),   NEL  (next  line,  U+0085),  LS  (line  separator,
                  U+2028), and PS (paragraph separator, U+2029).


-                 When  the  PCRE2  library  is  built,  a  default line-ending
-                 sequence  is  specified.   This  is  normally  the   standard
-                 sequence for the operating system. Unless otherwise specified
-                 by this option, pcre2grep uses the  library's  default.   The
-                 possible values for this option are CR, LF, CRLF, ANYCRLF, or
-                 ANY. This makes it possible to use pcre2grep  to  scan  files
-                 that have come from other environments without having to mod-
-                 ify their line endings. If the data  that  is  being  scanned
-                 does  not  agree  with  the  convention  set  by this option,
-                 pcre2grep may behave in strange ways. Note that  this  option
-                 does  not apply to files specified by the -f, --exclude-from,
-                 or --include-from options, which  are  expected  to  use  the
-                 operating system's standard newline sequence.
+                 When the PCRE2 library is built, a  default  line-ending  se-
+                 quence  is specified.  This is normally the standard sequence
+                 for the operating system. Unless otherwise specified by  this
+                 option,  pcre2grep  uses the library's default.  The possible
+                 values for this option are CR, LF,  CRLF,  ANYCRLF,  or  ANY.
+                 This  makes  it  possible to use pcre2grep to scan files that
+                 have come from other environments without  having  to  modify
+                 their  line  endings.  If the data that is being scanned does
+                 not agree with the convention set by this  option,  pcre2grep
+                 may  behave  in  strange ways. Note that this option does not
+                 apply to files specified by the -f, --exclude-from, or  --in-
+                 clude-from  options,  which are expected to use the operating
+                 system's standard newline sequence.


        -n, --line-number
                  Precede each output line by its line number in the file, fol-
-                 lowed by a colon for matching lines or a hyphen  for  context
+                 lowed  by  a colon for matching lines or a hyphen for context
                  lines. If the file name is also being output, it precedes the
-                 line number. When the -M option causes  a  pattern  to  match
-                 more  than  one  line, only the first is preceded by its line
+                 line  number.  When  the  -M option causes a pattern to match
+                 more than one line, only the first is preceded  by  its  line
                  number. This option is forced if --line-offsets is used.


-       --no-jit  If the PCRE2 library is built with support  for  just-in-time
+       --no-jit  If  the  PCRE2 library is built with support for just-in-time
                  compiling (which speeds up matching), pcre2grep automatically
                  makes use of this, unless it was explicitly disabled at build
-                 time.  This  option  can be used to disable the use of JIT at
-                 run time. It is provided for testing and working round  prob-
+                 time. This option can be used to disable the use  of  JIT  at
+                 run  time. It is provided for testing and working round prob-
                  lems.  It should never be needed in normal use.


        -O text, --output=text
-                 When  there  is a match, instead of outputting the whole line
-                 that matched, output just the  given  text.  This  option  is
-                 mutually  exclusive with --only-matching, --file-offsets, and
+                 When there is a match, instead of outputting the  whole  line
+                 that  matched, output just the given text. This option is mu-
+                 tually exclusive with  --only-matching,  --file-offsets,  and
                  --line-offsets. Escape sequences starting with a dollar char-
-                 acter  may be used to insert the contents of the matched part
+                 acter may be used to insert the contents of the matched  part
                  of the line and/or captured substrings into the text.


-                 $<digits> or ${<digits>} is replaced  by  the  captured  sub-
-                 string  of  the  given  decimal  number; zero substitutes the
+                 $<digits>  or  ${<digits>}  is  replaced by the captured sub-
+                 string of the given  decimal  number;  zero  substitutes  the
                  whole match. If the number is greater than the number of cap-
-                 turing  substrings,  or if the capture is unset, the replace-
+                 turing substrings, or if the capture is unset,  the  replace-
                  ment is empty.


-                 $a is replaced by bell; $b by backspace; $e by escape; $f  by
-                 form  feed;  $n by newline; $r by carriage return; $t by tab;
+                 $a  is replaced by bell; $b by backspace; $e by escape; $f by
+                 form feed; $n by newline; $r by carriage return; $t  by  tab;
                  $v by vertical tab.


-                 $o<digits> is replaced by the character  represented  by  the
+                 $o<digits>  is  replaced  by the character represented by the
                  given octal number; up to three digits are processed.


-                 $x<digits>  is  replaced  by the character represented by the
+                 $x<digits> is replaced by the character  represented  by  the
                  given hexadecimal number; up to two digits are processed.


-                 Any other character is substituted by itself. In  particular,
+                 Any  other character is substituted by itself. In particular,
                  $$ is replaced by a single dollar.


        -o, --only-matching
                  Show only the part of the line that matched a pattern instead
-                 of the whole line. In this mode, no context  is  shown.  That
-                 is,  the -A, -B, and -C options are ignored. If there is more
-                 than one match in a line, each of them is  shown  separately,
-                 on  a  separate  line  of  output.  If -o is combined with -v
-                 (invert the sense of the match to find  non-matching  lines),
-                 no  output is generated, but the return code is set appropri-
-                 ately. If the matched portion of the line is  empty,  nothing
-                 is  output  unless  the  file  name  or line number are being
-                 printed, in which case they are shown on an  otherwise  empty
+                 of  the  whole  line. In this mode, no context is shown. That
+                 is, the -A, -B, and -C options are ignored. If there is  more
+                 than  one  match in a line, each of them is shown separately,
+                 on a separate line of output. If -o is combined with -v  (in-
+                 vert  the  sense of the match to find non-matching lines), no
+                 output is generated, but the return  code  is  set  appropri-
+                 ately.  If  the matched portion of the line is empty, nothing
+                 is output unless the file  name  or  line  number  are  being
+                 printed,  in  which case they are shown on an otherwise empty
                  line.  This  option  is  mutually  exclusive  with  --output,
                  --file-offsets and --line-offsets.


        -onumber, --only-matching=number
-                 Show only the part of the line  that  matched  the  capturing
+                 Show  only  the  part  of the line that matched the capturing
                  parentheses of the given number. Up to 50 capturing parenthe-
-                 ses are supported by default. This limit can be  changed  via
-                 the  --om-capture option. A pattern may contain any number of
-                 capturing parentheses, but only those whose number is  within
-                 the  limit can be accessed by -o. An error occurs if the num-
+                 ses  are  supported by default. This limit can be changed via
+                 the --om-capture option. A pattern may contain any number  of
+                 capturing  parentheses, but only those whose number is within
+                 the limit can be accessed by -o. An error occurs if the  num-
                  ber specified by -o is greater than the limit.


                  -o0 is the same as -o without a number. Because these options
-                 can  be given without an argument (see above), if an argument
-                 is present, it must be given in  the  same  shell  item,  for
-                 example, -o3 or --only-matching=2. The comments given for the
-                 non-argument case above also apply to  this  option.  If  the
-                 specified  capturing parentheses do not exist in the pattern,
-                 or were not set in the match, nothing is  output  unless  the
+                 can be given without an argument (see above), if an  argument
+                 is  present, it must be given in the same shell item, for ex-
+                 ample, -o3 or --only-matching=2. The comments given  for  the
+                 non-argument  case  above  also  apply to this option. If the
+                 specified capturing parentheses do not exist in the  pattern,
+                 or  were  not  set in the match, nothing is output unless the
                  file name or line number are being output.


-                 If  this  option is given multiple times, multiple substrings
-                 are output for each match,  in  the  order  the  options  are
-                 given,  and  all on one line. For example, -o3 -o1 -o3 causes
-                 the substrings matched by capturing parentheses 3 and  1  and
-                 then  3 again to be output. By default, there is no separator
+                 If this option is given multiple times,  multiple  substrings
+                 are  output  for  each  match,  in  the order the options are
+                 given, and all on one line. For example, -o3 -o1  -o3  causes
+                 the  substrings  matched by capturing parentheses 3 and 1 and
+                 then 3 again to be output. By default, there is no  separator
                  (but see the next but one option).


        --om-capture=number
-                 Set the number of capturing parentheses that can be  accessed
+                 Set  the number of capturing parentheses that can be accessed
                  by -o. The default is 50.


        --om-separator=text
-                 Specify  a  separating string for multiple occurrences of -o.
-                 The default is an empty string. Separating strings are  never
+                 Specify a separating string for multiple occurrences  of  -o.
+                 The  default is an empty string. Separating strings are never
                  coloured.


        -q, --quiet
                  Work quietly, that is, display nothing except error messages.
-                 The exit status indicates whether or  not  any  matches  were
+                 The  exit  status  indicates  whether or not any matches were
                  found.


        -r, --recursive
-                 If  any given path is a directory, recursively scan the files
-                 it contains, taking note of any --include and --exclude  set-
-                 tings.  By  default, a directory is read as a normal file; in
-                 some operating systems this gives an  immediate  end-of-file.
-                 This  option  is  a  shorthand  for  setting the -d option to
-                 "recurse".
+                 If any given path is a directory, recursively scan the  files
+                 it  contains, taking note of any --include and --exclude set-
+                 tings. By default, a directory is read as a normal  file;  in
+                 some  operating  systems this gives an immediate end-of-file.
+                 This option is a shorthand for setting the -d option to  "re-
+                 curse".


        --recursion-limit=number
                  See --match-limit above.


        -s, --no-messages
-                 Suppress error  messages  about  non-existent  or  unreadable
-                 files.  Such  files  are quietly skipped. However, the return
+                 Suppress  error  messages  about  non-existent  or unreadable
+                 files. Such files are quietly skipped.  However,  the  return
                  code is still 2, even if matches were found in other files.


        -t, --total-count
-                 This option is useful when scanning more than  one  file.  If
-                 used  on its own, -t suppresses all output except for a grand
-                 total number of matching lines (or non-matching lines  if  -v
-                 is  used)  in  all  the files. If -t is used with -c, a grand
-                 total is output except when the previous output is  just  one
-                 line.  In  other words, it is not output when just one file's
-                 count is listed. If file names are being  output,  the  grand
-                 total  is preceded by "TOTAL:". Otherwise, it appears as just
-                 another number. The -t option is ignored when  used  with  -L
-                 (list  files  without matches), because the grand total would
+                 This  option  is  useful when scanning more than one file. If
+                 used on its own, -t suppresses all output except for a  grand
+                 total  number  of matching lines (or non-matching lines if -v
+                 is used) in all the files. If -t is used with -c, a grand to-
+                 tal  is  output  except  when the previous output is just one
+                 line. In other words, it is not output when just  one  file's
+                 count  is  listed.  If file names are being output, the grand
+                 total is preceded by "TOTAL:". Otherwise, it appears as  just
+                 another  number.  The  -t option is ignored when used with -L
+                 (list files without matches), because the grand  total  would
                  always be zero.


        -u, --utf Operate in UTF-8 mode. This option is available only if PCRE2
                  has been compiled with UTF-8 support. All patterns (including
-                 those for any --exclude and --include options) and  all  sub-
-                 ject  lines  that  are scanned must be valid strings of UTF-8
+                 those  for  any --exclude and --include options) and all sub-
+                 ject lines that are scanned must be valid  strings  of  UTF-8
                  characters.


        -U, --utf-allow-invalid
-                 As --utf, but in addition subject lines may  contain  invalid
-                 UTF-8  code  unit sequences. These can never form part of any
+                 As  --utf,  but in addition subject lines may contain invalid
+                 UTF-8 code unit sequences. These can never form part  of  any
                  pattern match. This facility allows valid UTF-8 strings to be
                  sought in executable or other binary files.  For more details
-                 about matching in non-valid UTF-8 strings, see the  pcre2uni-
+                 about  matching in non-valid UTF-8 strings, see the pcre2uni-
                  code(3) documentation.


        -V, --version
-                 Write  the version numbers of pcre2grep and the PCRE2 library
-                 to the standard output and then exit. Anything  else  on  the
+                 Write the version numbers of pcre2grep and the PCRE2  library
+                 to  the  standard  output and then exit. Anything else on the
                  command line is ignored.


        -v, --invert-match
-                 Invert  the  sense  of  the match, so that lines which do not
+                 Invert the sense of the match, so that  lines  which  do  not
                  match any of the patterns are the ones that are found.


        -w, --word-regex, --word-regexp
                  Force the patterns only to match "words". That is, there must
-                 be  a  word  boundary  at  the  start and end of each matched
-                 string. This is equivalent to having "\b(?:" at the start  of
-                 each  pattern, and ")\b" at the end. This option applies only
-                 to the patterns that are  matched  against  the  contents  of
-                 files;  it does not apply to patterns specified by any of the
+                 be a word boundary at the  start  and  end  of  each  matched
+                 string.  This is equivalent to having "\b(?:" at the start of
+                 each pattern, and ")\b" at the end. This option applies  only
+                 to  the  patterns  that  are  matched against the contents of
+                 files; it does not apply to patterns specified by any of  the
                  --include or --exclude options.


        -x, --line-regex, --line-regexp
-                 Force the patterns to start matching only at  the  beginnings
-                 of  lines,  and  in  addition,  require  them to match entire
+                 Force  the  patterns to start matching only at the beginnings
+                 of lines, and in  addition,  require  them  to  match  entire
                  lines. In multiline mode the match may be more than one line.
                  This is equivalent to having "^(?:" at the start of each pat-
-                 tern and ")$" at the end. This option  applies  only  to  the
-                 patterns  that  are matched against the contents of files; it
-                 does not apply to patterns specified by any of the  --include
+                 tern  and  ")$"  at  the end. This option applies only to the
+                 patterns that are matched against the contents of  files;  it
+                 does  not apply to patterns specified by any of the --include
                  or --exclude options.



ENVIRONMENT VARIABLES

-       The  environment  variables  LC_ALL  and LC_CTYPE are examined, in that
-       order, for a locale. The first one that is set is  used.  This  can  be
-       overridden  by  the  --locale  option.  If  no locale is set, the PCRE2
-       library's default (usually the "C" locale) is used.
+       The environment variables LC_ALL and LC_CTYPE are examined, in that or-
+       der, for a locale. The first one that is set is used. This can be over-
+       ridden by the --locale option. If no locale is set, the PCRE2 library's
+       default (usually the "C" locale) is used.



NEWLINES
@@ -783,14 +782,14 @@

        The -N (--newline) option allows pcre2grep to scan files with different
        newline conventions from the default. Any parts of the input files that
-       are written to the standard output are copied identically,  with  what-
-       ever  newline sequences they have in the input. However, the setting of
-       this option affects only the way scanned files are processed.  It  does
-       not  affect  the  interpretation  of files specified by the -f, --file-
+       are  written  to the standard output are copied identically, with what-
+       ever newline sequences they have in the input. However, the setting  of
+       this  option  affects only the way scanned files are processed. It does
+       not affect the interpretation of files specified  by  the  -f,  --file-
        list, --exclude-from, or --include-from options, nor does it affect the
-       way  in  which  pcre2grep writes informational messages to the standard
+       way in which pcre2grep writes informational messages  to  the  standard
        error and output streams. For these it uses the string "\n" to indicate
-       newlines,  relying on the C I/O library to convert this to an appropri-
+       newlines, relying on the C I/O library to convert this to an  appropri-
        ate sequence.



@@ -797,18 +796,18 @@
OPTIONS COMPATIBILITY

        Many of the short and long forms of pcre2grep's options are the same as
-       in  the GNU grep program. Any long option of the form --xxx-regexp (GNU
+       in the GNU grep program. Any long option of the form --xxx-regexp  (GNU
        terminology) is also available as --xxx-regex (PCRE2 terminology). How-
-       ever,  the  --depth-limit,  --file-list,  --file-offsets, --heap-limit,
-       --include-dir, --line-offsets, --locale,  --match-limit,  -M,  --multi-
-       line,  -N,  --newline,  --om-separator,  --output,  -u,  --utf, -U, and
+       ever, the  --depth-limit,  --file-list,  --file-offsets,  --heap-limit,
+       --include-dir,  --line-offsets,  --locale,  --match-limit, -M, --multi-
+       line, -N, --newline,  --om-separator,  --output,  -u,  --utf,  -U,  and
        --utf-allow-invalid options are specific to pcre2grep, as is the use of
        the --only-matching option with a capturing parentheses number.


-       Although  most  of the common options work the same way, a few are dif-
-       ferent in pcre2grep. For example, the --include option's argument is  a
-       glob  for GNU grep, but a regular expression for pcre2grep. If both the
-       -c and -l options are given, GNU grep lists only  file  names,  without
+       Although most of the common options work the same way, a few  are  dif-
+       ferent  in pcre2grep. For example, the --include option's argument is a
+       glob for GNU grep, but a regular expression for pcre2grep. If both  the
+       -c  and  -l  options are given, GNU grep lists only file names, without
        counts, but pcre2grep gives the counts as well.



@@ -815,7 +814,7 @@
OPTIONS WITH DATA

        There are four different ways in which an option with data can be spec-
-       ified.  If a short form option is used, the  data  may  follow  immedi-
+       ified.   If  a  short  form option is used, the data may follow immedi-
        ately, or (with one exception) in the next command line item. For exam-
        ple:


@@ -822,69 +821,69 @@
          -f/some/file
          -f /some/file


-       The exception is the -o option, which may appear with or without  data.
-       Because  of this, if data is present, it must follow immediately in the
+       The  exception is the -o option, which may appear with or without data.
+       Because of this, if data is present, it must follow immediately in  the
        same item, for example -o3.


-       If a long form option is used, the data may appear in the same  command
-       line  item,  separated by an equals character, or (with two exceptions)
+       If  a long form option is used, the data may appear in the same command
+       line item, separated by an equals character, or (with  two  exceptions)
        it may appear in the next command line item. For example:


          --file=/some/file
          --file /some/file


-       Note, however, that if you want to supply a file name beginning with  ~
-       as  data  in  a  shell  command,  and have the shell expand ~ to a home
-       directory, you must separate the file name from the option, because the
+       Note,  however, that if you want to supply a file name beginning with ~
+       as data in a shell command, and have the shell expand ~ to a  home  di-
+       rectory,  you  must separate the file name from the option, because the
        shell does not treat ~ specially unless it is at the start of an item.


-       The  exceptions  to the above are the --colour (or --color) and --only-
-       matching options, for which the data  is  optional.  If  one  of  these
-       options  does  have  data, it must be given in the first form, using an
+       The exceptions to the above are the --colour (or --color)  and  --only-
+       matching  options,  for which the data is optional. If one of these op-
+       tions does have data, it must be given in  the  first  form,  using  an
        equals character. Otherwise pcre2grep will assume that it has no data.



USING PCRE2'S CALLOUT FACILITY

-       pcre2grep has, by default, support for  calling  external  programs  or
-       scripts  or  echoing  specific strings during matching by making use of
-       PCRE2's callout facility. However, this support can  be  completely  or
-       partially  disabled  when  pcre2grep is built. You can find out whether
-       your binary has support for callouts by  running  it  with  the  --help
-       option. If callout support is completely disabled, all callouts in pat-
+       pcre2grep  has,  by  default,  support for calling external programs or
+       scripts or echoing specific strings during matching by  making  use  of
+       PCRE2's  callout  facility.  However, this support can be completely or
+       partially disabled when pcre2grep is built. You can  find  out  whether
+       your  binary has support for callouts by running it with the --help op-
+       tion. If callout support is completely disabled, all callouts  in  pat-
        terns are ignored by pcre2grep.  If the facility is partially disabled,
-       calling  external  programs is not supported, and callouts that request
+       calling external programs is not supported, and callouts  that  request
        it are ignored.


-       A callout in a PCRE2 pattern is of the form (?C<arg>) where  the  argu-
-       ment  is either a number or a quoted string (see the pcre2callout docu-
-       mentation for details). Numbered callouts  are  ignored  by  pcre2grep;
+       A  callout  in a PCRE2 pattern is of the form (?C<arg>) where the argu-
+       ment is either a number or a quoted string (see the pcre2callout  docu-
+       mentation  for  details).  Numbered  callouts are ignored by pcre2grep;
        only callouts with string arguments are useful.


    Calling external programs or scripts


        This facility can be independently disabled when pcre2grep is built. It
-       is supported for Windows, where a call to _spawnvp() is used, for  VMS,
-       where  lib$spawn()  is  used,  and  for any other Unix-like environment
+       is  supported for Windows, where a call to _spawnvp() is used, for VMS,
+       where lib$spawn() is used, and  for  any  other  Unix-like  environment
        where fork() and execv() are available.


        If the callout string does not start with a pipe (vertical bar) charac-
-       ter,  it  is parsed into a list of substrings separated by pipe charac-
-       ters. The first substring must be an executable name, with the  follow-
+       ter, it is parsed into a list of substrings separated by  pipe  charac-
+       ters.  The first substring must be an executable name, with the follow-
        ing substrings specifying arguments:


          executable_name|arg1|arg2|...


-       Any  substring  (including  the  executable  name)  may  contain escape
-       sequences started by a dollar character: $<digits>  or  ${<digits>}  is
-       replaced  by  the captured substring of the given decimal number, which
-       must be greater than zero. If the number is greater than the number  of
-       capturing  substrings,  or  if the capture is unset, the replacement is
+       Any substring (including the executable name) may  contain  escape  se-
+       quences  started by a dollar character: $<digits> or ${<digits>} is re-
+       placed by the captured substring of the  given  decimal  number,  which
+       must  be greater than zero. If the number is greater than the number of
+       capturing substrings, or if the capture is unset,  the  replacement  is
        empty.


-       Any other character is substituted by  itself.  In  particular,  $$  is
-       replaced  by  a  single  dollar and $| is replaced by a pipe character.
-       Here is an example:
+       Any  other character is substituted by itself. In particular, $$ is re-
+       placed by a single dollar and $| is replaced by a pipe character.  Here
+       is an example:


          echo -e "abcde\n12345" | pcre2grep \
            '(?x)(.)(..(.))
@@ -897,13 +896,13 @@
            Arg1: [1] [234] [4] Arg2: |1| ()
            12345


-       The parameters for the system call that is used to run the  program  or
+       The  parameters  for the system call that is used to run the program or
        script are zero-terminated strings. This means that binary zero charac-
-       ters in the callout argument will cause premature termination of  their
-       substrings,  and  therefore should not be present. Any syntax errors in
-       the string (for example, a dollar not followed  by  another  character)
-       cause  the  callout to be ignored. If running the program fails for any
-       reason (including the non-existence of the executable), a local  match-
+       ters  in the callout argument will cause premature termination of their
+       substrings, and therefore should not be present. Any syntax  errors  in
+       the  string  (for  example, a dollar not followed by another character)
+       cause the callout to be ignored. If running the program fails  for  any
+       reason  (including the non-existence of the executable), a local match-
        ing failure occurs and the matcher backtracks in the normal way.


    Echoing a specific string
@@ -912,28 +911,28 @@
        pletely disabled when pcre2grep was built. If the callout string starts
        with a pipe (vertical bar) character, the rest of the string is written
        to the output, having been passed through the same escape processing as
-       text  from the --output option. This provides a simple echoing facility
-       that avoids calling an external program or  script.  No  terminator  is
-       added  to  the  string,  so  if you want a newline, you must include it
-       explicitly. Matching continues normally after the string is output.  If
-       you  want  to  see  only  the callout output but not any output from an
-       actual match, you should end the relevant pattern with (*FAIL).
+       text from the --output option. This provides a simple echoing  facility
+       that  avoids  calling  an  external program or script. No terminator is
+       added to the string, so if you want a newline, you must include it  ex-
+       plicitly.  Matching  continues  normally after the string is output. If
+       you want to see only the callout output but not any output from an  ac-
+       tual match, you should end the relevant pattern with (*FAIL).



MATCHING ERRORS

-       It is possible to supply a regular expression that takes  a  very  long
-       time  to  fail  to  match certain lines. Such patterns normally involve
-       nested indefinite repeats, for example: (a+)*\d when matched against  a
-       line  of  a's  with  no  final digit. The PCRE2 matching function has a
-       resource limit that causes it to abort in these circumstances. If  this
-       happens,  pcre2grep  outputs  an error message and the line that caused
-       the problem to the standard error stream. If there  are  more  than  20
+       It  is  possible  to supply a regular expression that takes a very long
+       time to fail to match certain lines.  Such  patterns  normally  involve
+       nested  indefinite repeats, for example: (a+)*\d when matched against a
+       line of a's with no final digit. The PCRE2 matching function has a  re-
+       source  limit  that  causes it to abort in these circumstances. If this
+       happens, pcre2grep outputs an error message and the  line  that  caused
+       the  problem  to  the  standard error stream. If there are more than 20
        such errors, pcre2grep gives up.


-       The  --match-limit  option  of pcre2grep can be used to set the overall
-       resource limit. There are also other limits that affect the  amount  of
-       memory  used  during  matching;  see the discussion of --heap-limit and
+       The --match-limit option of pcre2grep can be used to  set  the  overall
+       resource  limit.  There are also other limits that affect the amount of
+       memory used during matching; see the  discussion  of  --heap-limit  and
        --depth-limit above.



@@ -940,13 +939,13 @@
DIAGNOSTICS

        Exit status is 0 if any matches were found, 1 if no matches were found,
-       and  2  for syntax errors, overlong lines, non-existent or inaccessible
-       files (even if matches were found in other files) or too many  matching
+       and 2 for syntax errors, overlong lines, non-existent  or  inaccessible
+       files  (even if matches were found in other files) or too many matching
        errors. Using the -s option to suppress error messages about inaccessi-
        ble files does not affect the return code.


-       When  run  under  VMS,  the  return  code  is  placed  in  the   symbol
-       PCRE2GREP_RC  because  VMS  does  not  distinguish  between exit(0) and
+       When   run  under  VMS,  the  return  code  is  placed  in  the  symbol
+       PCRE2GREP_RC because VMS  does  not  distinguish  between  exit(0)  and
        exit(1).



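Note on the pcre2grep.txt hunks above: the callout-string rules they describe (substrings separated by pipe characters, with $<digits>, ${<digits>}, $$ and $| escapes) can be illustrated with a short Python sketch. This is not pcre2grep's own code. The function name expand_callout and the use of None for an unset capture are hypothetical, and treating $0 or an unclosed ${ as a syntax error is an assumption drawn from the wording "must be greater than zero". It covers only callout strings that do not start with a pipe character.

```python
from typing import List, Optional, Sequence

DIGITS = "0123456789"


def expand_callout(callout: str,
                   captures: Sequence[Optional[str]]) -> Optional[List[str]]:
    """Split a callout string on '|' and expand its dollar escapes.

    captures[0] holds capture group 1; None marks an unset group.
    Returns None for a syntax error, in which case the callout is ignored.
    """
    args: List[str] = []
    current: List[str] = []
    i, n = 0, len(callout)
    while i < n:
        ch = callout[i]
        if ch == "|":
            # An unescaped pipe ends the current substring.
            args.append("".join(current))
            current = []
            i += 1
        elif ch == "$":
            i += 1
            if i >= n:
                return None  # dollar not followed by another character
            braced = callout[i] == "{"
            if braced:
                i += 1
            start = i
            while i < n and callout[i] in DIGITS:
                i += 1
            digits = callout[start:i]
            if braced:
                # Assumption: ${ must contain digits and be closed by }.
                if not digits or i >= n or callout[i] != "}":
                    return None
                i += 1
            if digits:
                number = int(digits)
                if number == 0:
                    return None  # assumption: $0 is treated as an error
                in_range = number <= len(captures)
                value = captures[number - 1] if in_range else None
                current.append(value or "")  # unset or too large: empty
            else:
                # Any other character stands for itself ($$ and $|).
                current.append(callout[i])
                i += 1
        else:
            current.append(ch)
            i += 1
    args.append("".join(current))
    return args
```

The first element of the returned list is the executable name and the rest are its arguments. With captures "a", "bcd" and "d", a callout string shaped like the one in the documentation's own example expands to the arguments "Arg1: [a] [bcd] [d]" and "Arg2: |a| ()".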

Modified: code/trunk/doc/pcre2pattern.3
===================================================================
--- code/trunk/doc/pcre2pattern.3    2019-06-21 16:10:17 UTC (rev 1117)
+++ code/trunk/doc/pcre2pattern.3    2019-06-22 16:36:15 UTC (rev 1118)
@@ -1,4 +1,4 @@
-.TH PCRE2PATTERN 3 "21 June 2019" "PCRE2 10.34"
+.TH PCRE2PATTERN 3 "22 June 2019" "PCRE2 10.34"
 .SH NAME
 PCRE2 - Perl-compatible regular expressions (revised API)
 .SH "PCRE2 REGULAR EXPRESSION DETAILS"
@@ -3564,9 +3564,10 @@
 first match attempt, the second attempt would start at the second character
 instead of skipping on to "c".
 .P
-If (*SKIP) is used inside a lookbehind to specify a new starting position that
-is not later than the starting point of the current match, the position 
-specified by (*SKIP) is ignored, and instead the normal "bumpalong" occurs.
+If (*SKIP) is used to specify a new starting position that is the same as the
+starting position of the current match, or (by being inside a lookbehind)
+earlier, the position specified by (*SKIP) is ignored, and instead the normal
+"bumpalong" occurs.
 .sp
   (*SKIP:NAME)
 .sp
@@ -3787,6 +3788,6 @@
 .rs
 .sp
 .nf
-Last updated: 21 June 2019
+Last updated: 22 June 2019
 Copyright (c) 1997-2019 University of Cambridge.
 .fi

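Note on the (*SKIP) wording changed in the pcre2pattern.3 hunk above: the revised rule can be summarised in a few lines of Python. This is a sketch of the documented behaviour, not PCRE2's implementation. The name next_start and its parameters are hypothetical, offsets are counted in characters, and the "bumpalong" is assumed to advance by exactly one character (ignoring CRLF and UTF handling).

```python
def next_start(current_start: int, skip_position: int) -> int:
    """Return where the next match attempt begins after (*SKIP) fires.

    A skip position later than the current starting position is honoured.
    One that is the same, or earlier (possible inside a lookbehind),
    is ignored and the normal bumpalong of one character occurs instead.
    """
    if skip_position > current_start:
        return skip_position
    return current_start + 1  # normal "bumpalong"
```

For a match attempt that started at offset 3, a (*SKIP) position of 7 moves the next attempt to offset 7, whereas a position of 3 or 1 is ignored and the next attempt starts at offset 4.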

Modified: code/trunk/doc/pcre2test.txt
===================================================================
--- code/trunk/doc/pcre2test.txt    2019-06-21 16:10:17 UTC (rev 1117)
+++ code/trunk/doc/pcre2test.txt    2019-06-22 16:36:15 UTC (rev 1118)
@@ -13,8 +13,8 @@
        but it can also be used for  experimenting  with  regular  expressions.
        This  document  describes the features of the test program; for details
        of the regular expressions themselves, see the pcre2pattern  documenta-
-       tion.  For  details  of  the  PCRE2  library  function  calls and their
-       options, see the pcre2api documentation.
+       tion.  For  details  of  the PCRE2 library function calls and their op-
+       tions, see the pcre2api documentation.


        The input for pcre2test is a sequence of  regular  expression  patterns
        and  subject  strings  to  be matched. There are also command lines for
@@ -33,8 +33,8 @@
        which are specifically designed for use in conjunction  with  the  test
        script  and  data  files that are distributed as part of PCRE2. All the
        modifiers are documented here, some  without  much  justification,  but
-       many  of  them  are  unlikely  to  be  of  use  except when testing the
-       libraries.
+       many  of  them  are  unlikely  to be of use except when testing the li-
+       braries.



PCRE2's 8-BIT, 16-BIT AND 32-BIT LIBRARIES
@@ -41,18 +41,18 @@

        Different versions of the PCRE2 library can be built to support charac-
        ter  strings  that  are encoded in 8-bit, 16-bit, or 32-bit code units.
-       One, two, or  all  three  of  these  libraries  may  be  simultaneously
-       installed. The pcre2test program can be used to test all the libraries.
+       One, two, or all three of these libraries  may  be  simultaneously  in-
+       stalled.  The  pcre2test program can be used to test all the libraries.
        However, its own input and output are  always  in  8-bit  format.  When
        testing  the  16-bit  or 32-bit libraries, patterns and subject strings
-       are converted to 16-bit or 32-bit format before  being  passed  to  the
-       library  functions.  Results are converted back to 8-bit code units for
+       are converted to 16-bit or 32-bit format before being passed to the li-
+       brary  functions.  Results  are  converted back to 8-bit code units for
        output.


        In the rest of this document, the names of library functions and struc-
-       tures  are  given  in  generic  form,  for example, pcre_compile(). The
-       actual names used in the libraries have a suffix _8, _16,  or  _32,  as
-       appropriate.
+       tures  are  given in generic form, for example, pcre_compile(). The ac-
+       tual names used in the libraries have a suffix _8, _16, or _32, as  ap-
+       propriate.



 INPUT ENCODING
@@ -70,8 +70,8 @@
        processed for backslash escapes, which makes it possible to include any
        data value in strings that are passed to the library for matching.  For
        patterns,  there  is a facility for specifying some or all of the 8-bit
-       input characters as hexadecimal  pairs,  which  makes  it  possible  to
-       include binary zeros.
+       input characters as hexadecimal pairs, which makes it possible  to  in-
+       clude binary zeros.


    Input for the 16-bit and 32-bit libraries


@@ -78,10 +78,10 @@
        When testing the 16-bit or 32-bit libraries, there is a need to be able
        to generate character code points greater than 255 in the strings  that
        are  passed to the library. For subject lines, backslash escapes can be
-       used. In addition, when the  utf  modifier  (see  "Setting  compilation
-       options" below) is set, the pattern and any following subject lines are
-       interpreted as UTF-8 strings and translated  to  UTF-16  or  UTF-32  as
-       appropriate.
+       used. In addition, when the utf modifier (see "Setting compilation  op-
+       tions"  below)  is set, the pattern and any following subject lines are
+       interpreted as UTF-8 strings and translated to UTF-16 or UTF-32 as  ap-
+       propriate.


        For  non-UTF testing of wide characters, the utf8_input modifier can be
        used. This is mutually exclusive with  utf,  and  is  allowed  only  in
@@ -121,8 +121,8 @@
                  piled.


        -AC       As  for  -ac,  but in addition behave as if each subject line
-                 has the callout_extra  modifier,  that  is,  show  additional
-                 information from callouts.
+                 has the callout_extra modifier, that is, show additional  in-
+                 formation from callouts.


        -b        Behave  as  if each pattern has the fullbincode modifier; the
                  full internal binary form of the pattern is output after com-
@@ -130,9 +130,9 @@


        -C        Output  the  version  number  of  the  PCRE2 library, and all
                  available information about the optional  features  that  are
-                 included,  and  then  exit  with  zero  exit  code. All other
-                 options are ignored. If both -C and -LM are  present,  which-
-                 ever is first is recognized.
+                 included,  and  then  exit with zero exit code. All other op-
+                 tions are ignored. If both -C and -LM are present,  whichever
+                 is first is recognized.


        -C option Output  information  about a specific build-time option, then
                  exit. This functionality is intended for use in scripts  such
@@ -269,8 +269,8 @@
        supply them explicitly.


        An  empty  line  or  the end of the file signals the end of the subject
-       lines for a test, at which point a  new  pattern  or  command  line  is
-       expected if there is still input to be read.
+       lines for a test, at which point a new pattern or command line  is  ex-
+       pected if there is still input to be read.



 COMMAND LINES
@@ -311,8 +311,8 @@
        as indicating a newline in a pattern or subject string. The default can
        be  overridden when a pattern is compiled. The standard test files con-
        tain tests of various newline conventions,  but  the  majority  of  the
-       tests  expect  a  single  linefeed  to  be  recognized  as a newline by
-       default. Without special action the tests would fail when PCRE2 is com-
+       tests  expect  a  single  linefeed to be recognized as a newline by de-
+       fault. Without special action the tests would fail when PCRE2  is  com-
        piled with either CR or CRLF as the default newline.


        The #newline_default command specifies a list of newline types that are
@@ -323,14 +323,14 @@


        If the default newline is in the list, this command has no effect. Oth-
        erwise, except when testing the POSIX  API,  a  newline  modifier  that
-       specifies  the  first  newline  convention in the list (LF in the above
-       example) is added to any pattern that does not already have  a  newline
+       specifies the first newline convention in the list (LF in the above ex-
+       ample) is added to any pattern that does not  already  have  a  newline
        modifier. If the newline list is empty, the feature is turned off. This
        command is present in a number of the standard test input files.


-       When the POSIX API is being tested there is  no  way  to  override  the
-       default  newline  convention,  though it is possible to set the newline
-       convention from within the pattern. A warning is given if the posix  or
+       When the POSIX API is being tested there is no way to override the  de-
+       fault newline convention, though it is possible to set the newline con-
+       vention from within the pattern. A warning is given  if  the  posix  or
        posix_nosub  modifier is used when #newline_default would set a default
        for the non-POSIX API.


@@ -344,8 +344,8 @@
        The  appearance of this line causes all subsequent modifier settings to
        be checked for compatibility with the perltest.sh script, which is used
        to  confirm that Perl gives the same results as PCRE2. Also, apart from
-       comment lines, #pattern commands, and #subject  commands  that  set  or
-       unset  "mark", no command lines are permitted, because they and many of
+       comment lines, #pattern commands, and #subject commands that set or un-
+       set  "mark",  no  command lines are permitted, because they and many of
        the modifiers are specific to pcre2test, and should not be used in test
        files  that  are  also  processed by perltest.sh. The #perltest command
        helps detect tests that are accidentally put in the wrong file.
@@ -376,8 +376,8 @@
        list are separated by commas followed by optional white space. Trailing
        whitespace in a modifier list is ignored. Some modifiers may  be  given
        for  both patterns and subject lines, whereas others are valid only for
-       one  or  the  other.  Each  modifier  has  a  long  name,  for  example
-       "anchored",  and  some of them must be followed by an equals sign and a
+       one or the other. Each modifier has  a  long  name,  for  example  "an-
+       chored",  and  some  of  them  must be followed by an equals sign and a
        value, for example, "offset=12". Values cannot  contain  comma  charac-
        ters,  but may contain spaces. Modifiers that do not take values may be
        preceded by a minus sign to turn off a previous setting.
@@ -498,8 +498,8 @@
          \= This is a comment.
          abc\= This is an invalid modifier list.


-       A  backslash  followed  by  any  other  non-alphanumeric character just
-       escapes that character. A backslash followed by anything else causes an
+       A  backslash  followed by any other non-alphanumeric character just es-
+       capes that character. A backslash followed by anything else  causes  an
        error.  However,  if the very last character in the line is a backslash
        (and there is no modifier list), it is ignored. This  gives  a  way  of
        passing  an  empty line as data, since a real empty line terminates the
@@ -523,13 +523,13 @@
        The following modifiers set options for pcre2_compile(). Most  of  them
        set  bits  in  the  options  argument of that function, but those whose
        names start with PCRE2_EXTRA are additional options that are set in the
-       compile  context.  For  the  main options, there are some single-letter
-       abbreviations that are the same as Perl options. There is special  han-
+       compile context. For the main options, there are some single-letter ab-
+       breviations that are the same as Perl options. There  is  special  han-
        dling  for  /x:  if  a second x is present, PCRE2_EXTENDED is converted
-       into  PCRE2_EXTENDED_MORE  as  in  Perl.  A   third   appearance   adds
-       PCRE2_EXTENDED  as  well,  though  this  makes no difference to the way
-       pcre2_compile() behaves. See pcre2api for a description of the  effects
-       of these options.
+       into PCRE2_EXTENDED_MORE as in Perl. A third appearance adds  PCRE2_EX-
+       TENDED  as  well, though this makes no difference to the way pcre2_com-
+       pile() behaves. See pcre2api for a description of the effects of  these
+       options.


              allow_empty_class         set PCRE2_ALLOW_EMPTY_CLASS
              allow_surrogate_escapes   set PCRE2_EXTRA_ALLOW_SURROGATE_ESCAPES
@@ -577,9 +577,9 @@


    Setting compilation controls


-       The following modifiers  affect  the  compilation  process  or  request
-       information  about  the  pattern. There are single-letter abbreviations
-       for some that are heavily used in the test files.
+       The following modifiers affect the compilation process or  request  in-
+       formation  about the pattern. There are single-letter abbreviations for
+       some that are heavily used in the test files.


              bsr=[anycrlf|unicode]     specify \R handling
          /B  bincode                   show binary code without lengths
@@ -717,8 +717,8 @@
        minated  strings but can be passed by length instead of being zero-ter-
        minated. The use_length modifier causes this to happen. Using a  length
        happens  automatically  (whether  or not use_length is set) when hex is
-       set, because patterns  specified  in  hexadecimal  may  contain  binary
-       zeros.
+       set, because patterns specified in hexadecimal may contain  binary  ze-
+       ros.


        If hex or use_length is used with the POSIX wrapper API (see "Using the
        POSIX wrapper API" below), the REG_PEND extension is used to  pass  the
@@ -770,8 +770,8 @@
        partial modifier in "Subject Modifiers" below for details of how  these
        options are specified for each match attempt.


-       JIT  compilation  is  requested  by the jit pattern modifier, which may
-       optionally be followed by an equals sign and a number in the range 0 to
+       JIT compilation is requested by the jit pattern modifier, which may op-
+       tionally be followed by an equals sign and a number in the range  0  to
        7.   The  three bits that make up the number specify which of the three
        JIT operating modes are to be compiled:


@@ -799,8 +799,8 @@
        none was compiled for non-partial matching.


        If  JIT compilation is successful, the compiled JIT code will automati-
-       cally be used when an appropriate type of match  is  run,  except  when
-       incompatible  run-time options are specified. For more details, see the
+       cally be used when an appropriate type of match is run, except when in-
+       compatible  run-time  options  are specified. For more details, see the
        pcre2jit documentation. See also the jitstack modifier below for a  way
        of setting the size of the JIT stack.


@@ -847,8 +847,8 @@
    Limiting nested parentheses


        The parens_nest_limit modifier sets a limit  on  the  depth  of  nested
-       parentheses  in  a  pattern.  Breaching  the limit causes a compilation
-       error.  The default for the library is set when  PCRE2  is  built,  but
+       parentheses  in a pattern. Breaching the limit causes a compilation er-
+       ror.  The default for the library is  set  when  PCRE2  is  built,  but
        pcre2test  sets  its  own default of 220, which is required for running
        the standard test suite.


@@ -886,13 +886,13 @@
        buffer is too small for the error message. If  this  modifier  has  not
        been set, a large buffer is used.


-       The  aftertext  and  allaftertext  subject  modifiers work as described
-       below. All other modifiers are either ignored, with a warning  message,
-       or cause an error.
+       The  aftertext and allaftertext subject modifiers work as described be-
+       low. All other modifiers are either ignored, with a warning message, or
+       cause an error.


-       The  pattern  is  passed  to  regcomp()  as a zero-terminated string by
-       default, but if the use_length or hex modifiers are set,  the  REG_PEND
-       extension is used to pass it by length.
+       The  pattern  is passed to regcomp() as a zero-terminated string by de-
+       fault, but if the use_length or hex modifiers are set, the REG_PEND ex-
+       tension is used to pass it by length.


    Testing the stack guard feature


@@ -920,8 +920,8 @@
          2   a set of tables defining ISO 8859 characters


        In  table 2, some characters whose codes are greater than 128 are iden-
-       tified as letters, digits, spaces,  etc.  Setting  alternate  character
-       tables and a locale are mutually exclusive.
+       tified as letters, digits, spaces, etc. Setting alternate character ta-
+       bles and a locale are mutually exclusive.


    Setting certain match controls


@@ -971,12 +971,12 @@
        terns"  below.  If pushcopy is used instead of push, a copy of the com-
        piled pattern is stacked, leaving the original  as  current,  ready  to
        match  the  following  input  lines. This provides a way of testing the
-       pcre2_code_copy() function.   The  push  and  pushcopy   modifiers  are
-       incompatible  with  compilation  modifiers  such  as global that act at
-       match time. Any that are specified are ignored (for the stacked  copy),
-       with a warning message, except for replace, which causes an error. Note
-       that jitverify, which is allowed, does not carry through to any  subse-
-       quent matching that uses a stacked pattern.
+       pcre2_code_copy() function.  The push and pushcopy  modifiers  are  in-
+       compatible  with compilation modifiers such as global that act at match
+       time. Any that are specified are ignored (for the stacked copy), with a
+       warning  message,  except for replace, which causes an error. Note that
+       jitverify, which is allowed, does not carry through to  any  subsequent
+       matching that uses a stacked pattern.


    Testing foreign pattern conversion


@@ -1124,12 +1124,12 @@
        The  allusedtext modifier requests that all the text that was consulted
        during a successful pattern match by the interpreter should  be  shown.
        This  feature  is not supported for JIT matching, and if requested with
-       JIT it is ignored (with  a  warning  message).  Setting  this  modifier
-       affects the output if there is a lookbehind at the start of a match, or
-       a lookahead at the end, or if \K is used  in  the  pattern.  Characters
-       that  precede or follow the start and end of the actual match are indi-
-       cated in the output by '<' or '>' characters underneath them.  Here  is
-       an example:
+       JIT it is ignored (with a warning message). Setting this  modifier  af-
+       fects the output if there is a lookbehind at the start of a match, or a
+       lookahead at the end, or if \K is used in the pattern. Characters  that
+       precede  or  follow the start and end of the actual match are indicated
+       in the output by '<' or '>' characters underneath them. Here is an  ex-
+       ample:


            re> /(?<=pqr)abc(?=xyz)/
          data> 123pqrabcxyz456\=allusedtext
@@ -1145,8 +1145,8 @@
        string. The only time when this occurs is when \K has been processed as
        part of the match. In this situation, the output for the matched string
        is  displayed  from  the  starting  character instead of from the match
-       point, with circumflex characters under  the  earlier  characters.  For
-       example:
+       point, with circumflex characters under the earlier characters. For ex-
+       ample:


            re> /abc\Kxyz/
          data> abcxyz\=startchar
@@ -1171,12 +1171,12 @@
        The allvector modifier requests that the entire ovector be shown, what-
        ever the outcome of the match. Compare allcaptures, which shows only up
        to the maximum number of capture groups for the pattern, and then  only
-       for  a  successful  complete  non-DFA  match. This modifier, which acts
-       after any match result, and also for DFA matching, provides a means  of
+       for  a successful complete non-DFA match. This modifier, which acts af-
+       ter any match result, and also for DFA matching, provides  a  means  of
        checking  that there are no unexpected modifications to ovector fields.
        Before each match attempt, the ovector is filled with a special  value,
-       and   if   this  is  found  in  both  elements  of  a  capturing  pair,
-       "<unchanged>" is output. After a successful match, this applies to  all
+       and  if  this  is  found  in  both  elements of a capturing pair, "<un-
+       changed>" is output. After a successful  match,  this  applies  to  all
        groups  after the maximum capture group for the pattern. In other cases
        it applies to the entire ovector. After a partial match, the first  two
        elements  are  the only ones that should be set. After a DFA match, the
@@ -1207,12 +1207,12 @@
        If an empty string  is  matched,  the  next  match  is  done  with  the
        PCRE2_NOTEMPTY_ATSTART and PCRE2_ANCHORED flags set, in order to search
        for another, non-empty, match at the same point in the subject. If this
-       match  fails,  the  start  offset  is advanced, and the normal match is
-       retried. This imitates the way Perl handles such cases when  using  the
-       /g  modifier  or  the  split()  function. Normally, the start offset is
-       advanced by one character, but if  the  newline  convention  recognizes
-       CRLF  as  a newline, and the current character is CR followed by LF, an
-       advance of two characters occurs.
+       match  fails, the start offset is advanced, and the normal match is re-
+       tried. This imitates the way Perl handles such cases when using the  /g
+       modifier  or  the  split()  function. Normally, the start offset is ad-
+       vanced by one character, but if the newline convention recognizes  CRLF
+       as  a  newline,  and the current character is CR followed by LF, an ad-
+       vance of two characters occurs.


    Testing substring extraction functions


@@ -1275,8 +1275,8 @@
        than  256 characters) for substitution tests, as fixed-size buffers are
        used. To make it easy to test for buffer overflow, if  the  replacement
        string  starts  with a number in square brackets, that number is passed
-       to pcre2_substitute() as the  size  of  the  output  buffer,  with  the
-       replacement  string  starting at the next character. Here is an example
+       to pcre2_substitute() as the size of the output buffer,  with  the  re-
+       placement  string  starting  at  the next character. Here is an example
        that tests the edge case:


          /abc/
@@ -1285,10 +1285,10 @@
              123abc123\=replace=[9]XYZ
          Failed: error -47: no more memory


-       The   default   action   of    pcre2_substitute()    is    to    return
-       PCRE2_ERROR_NOMEMORY  when  the output buffer is too small. However, if
-       the PCRE2_SUBSTITUTE_OVERFLOW_LENGTH option is set (by using  the  sub-
-       stitute_overflow_length  modifier),  pcre2_substitute() continues to go
+       The  default  action  of  pcre2_substitute()  is  to  return  PCRE2_ER-
+       ROR_NOMEMORY  when  the  output  buffer  is  too small. However, if the
+       PCRE2_SUBSTITUTE_OVERFLOW_LENGTH option is set (by  using  the  substi-
+       tute_overflow_length  modifier),  pcre2_substitute()  continues  to  go
        through the motions of matching and substituting  (but  not  doing  any
        callouts),  in  order  to  compute the size of buffer that is required.
        When this happens, pcre2test shows the required  buffer  length  (which
@@ -1323,8 +1323,8 @@
        Then are listed the offsets of the old substring, its contents, and the
        same for the replacement.


-       By default, the  substitution  callout  function  returns  zero,  which
-       accepts the replacement and causes matching to continue if /g was used.
+       By default, the substitution callout function returns zero,  which  ac-
+       cepts  the  replacement and causes matching to continue if /g was used.
        Two further modifiers can be used to test other return values. If  sub-
        stitute_skip  is  set to a value greater than zero the callout function
        returns +1 for the match of that number, and similarly  substitute_stop
@@ -1411,8 +1411,8 @@


        The memory modifier causes pcre2test to log the sizes of all heap  mem-
        ory   allocation  and  freeing  calls  that  occur  during  a  call  to
-       pcre2_match() or pcre2_dfa_match().  These  occur  only  when  a  match
-       requires  a bigger vector than the default for remembering backtracking
+       pcre2_match() or pcre2_dfa_match(). These occur only when a  match  re-
+       quires  a  bigger  vector than the default for remembering backtracking
        points (pcre2_match()) or for internal  workspace  (pcre2_dfa_match()).
        In  many cases there will be no heap memory used and therefore no addi-
        tional output. No heap memory is allocated during matching with JIT, so
@@ -1435,9 +1435,9 @@


    Setting the size of the output vector


-       The ovector modifier applies only to  the  subject  line  in  which  it
-       appears,  though  of  course  it can also be used to set a default in a
-       #subject command. It specifies the number of pairs of offsets that  are
+       The ovector modifier applies only to the subject line in which  it  ap-
+       pears, though of course it can also be used to set a default in a #sub-
+       ject command. It specifies the number of  pairs  of  offsets  that  are
        available for storing matching information. The default is 15.


        A  value of zero is useful when testing the POSIX API because it causes
@@ -1491,12 +1491,12 @@


        When a match succeeds, pcre2test outputs  the  list  of  captured  sub-
        strings,  starting  with number 0 for the string that matched the whole
-       pattern.   Otherwise,  it  outputs  "No  match"  when  the  return   is
-       PCRE2_ERROR_NOMATCH,  or  "Partial  match:"  followed  by the partially
-       matching substring when the return is PCRE2_ERROR_PARTIAL.  (Note  that
-       this  is  the  entire  substring  that was inspected during the partial
-       match; it may include characters before the actual  match  start  if  a
-       lookbehind assertion, \K, \b, or \B was involved.)
+       pattern.  Otherwise, it outputs "No match" when the return is PCRE2_ER-
+       ROR_NOMATCH,  or  "Partial  match:"  followed by the partially matching
+       substring when the return is PCRE2_ERROR_PARTIAL. (Note  that  this  is
+       the  entire  substring  that was inspected during the partial match; it
+       may include characters before the actual match start  if  a  lookbehind
+       assertion, \K, \b, or \B was involved.)


        For any other return, pcre2test outputs the PCRE2 negative error number
        and a short descriptive phrase. If the error is  a  failed  UTF  string
@@ -1541,8 +1541,8 @@
           0: cat
           0+ aract


-       If global matching is requested, the  results  of  successive  matching
-       attempts are output in sequence, like this:
+       If global matching is requested, the results of successive matching at-
+       tempts are output in sequence, like this:


            re> /\Bi(\w\w)/g
          data> Mississippi
@@ -1580,12 +1580,12 @@
           2: tan


        Using the normal matching function on this data finds only "tang".  The
-       longest  matching  string  is  always  given first (and numbered zero).
-       After a PCRE2_ERROR_PARTIAL return, the  output  is  "Partial  match:",
-       followed  by  the  partially  matching substring. Note that this is the
-       entire substring that was inspected during the partial  match;  it  may
-       include characters before the actual match start if a lookbehind asser-
-       tion, \b, or \B was involved. (\K is not supported for DFA matching.)
+       longest  matching string is always given first (and numbered zero). Af-
+       ter a PCRE2_ERROR_PARTIAL return, the output is "Partial match:",  fol-
+       lowed by the partially matching substring. Note that this is the entire
+       substring that was inspected during the partial match; it  may  include
+       characters before the actual match start if a lookbehind assertion, \b,
+       or \B was involved. (\K is not supported for DFA matching.)


        If global matching is requested, the search for further matches resumes
        at the end of the longest match. For example:
@@ -1638,12 +1638,12 @@
          --->pqrabcdef
            0    ^  ^     \d


-       This  output  indicates  that  callout  number  0  occurred for a match
-       attempt starting at the fourth character of the  subject  string,  when
-       the  pointer  was  at  the seventh character, and when the next pattern
-       item was \d. Just one circumflex is output if  the  start  and  current
-       positions  are  the same, or if the current position precedes the start
-       position, which can happen if the callout is in a lookbehind assertion.
+       This  output  indicates  that callout number 0 occurred for a match at-
+       tempt starting at the fourth character of the subject string, when  the
+       pointer  was  at  the seventh character, and when the next pattern item
+       was \d. Just one circumflex is output if the start  and  current  posi-
+       tions are the same, or if the current position precedes the start posi-
+       tion, which can happen if the callout is in a lookbehind assertion.


        Callouts numbered 255 are assumed to be automatic callouts, inserted as
        a result of the auto_callout pattern modifier. In this case, instead of
@@ -1660,8 +1660,8 @@
           0: E*


        If a pattern contains (*MARK) items, an additional line is output when-
-       ever a change of latest mark is passed to  the  callout  function.  For
-       example:
+       ever a change of latest mark is passed to the callout function. For ex-
+       ample:


            re> /a(*MARK:X)bc/auto_callout
          data> abc
@@ -1683,8 +1683,8 @@


        The output for a callout with a string argument is similar, except that
        instead  of outputting a callout number before the position indicators,
-       the callout string and its offset in  the  pattern  string  are  output
-       before  the reflection of the subject string, and the subject string is
+       the callout string and its offset in the pattern string are output  be-
+       fore  the  reflection  of the subject string, and the subject string is
        reflected for each callout. For example:


            re> /^ab(?C'first')cd(?C"second")ef/
@@ -1800,9 +1800,9 @@


        When  pcre2test  is outputting text that is a matched part of a subject
        string, it behaves in the same way, unless a different locale has  been
-       set  for  the  pattern  (using  the locale modifier). In this case, the
-       isprint() function is used to  distinguish  printing  and  non-printing
-       characters.
+       set  for the pattern (using the locale modifier). In this case, the is-
+       print() function is used to distinguish printing and non-printing char-
+       acters.



 SAVING AND RESTORING COMPILED PATTERNS
@@ -1814,14 +1814,14 @@
        have  the  same  endianness,  pointer width and PCRE2_SIZE type. Before
        compiled patterns can be saved they must be serialized, that  is,  con-
        verted  to a stream of bytes. A single byte stream may contain any num-
-       ber of compiled patterns, but they must  all  use  the  same  character
-       tables. A single copy of the tables is included in the byte stream (its
+       ber of compiled patterns, but they must all use the same character  ta-
+       bles.  A  single copy of the tables is included in the byte stream (its
        size is 1088 bytes).


-       The functions whose names begin  with  pcre2_serialize_  are  used  for
-       serializing  and de-serializing. They are described in the pcre2serial-
-       ize  documentation.  In  this  section  we  describe  the  features  of
-       pcre2test that can be used to test these functions.
+       The functions whose names begin with pcre2_serialize_ are used for  se-
+       rializing  and de-serializing. They are described in the pcre2serialize
+       documentation. In this section we describe the  features  of  pcre2test
+       that can be used to test these functions.


        Note  that  "serialization" in PCRE2 does not convert compiled patterns
        to an abstract format like Java or .NET. It  just  makes  a  reloadable
@@ -1831,8 +1831,8 @@
        piled, it is pushed onto a stack of compiled  patterns,  and  pcre2test
        expects  the next line to contain a new pattern (or command) instead of
        a subject line. By contrast, the pushcopy modifier causes a copy of the
-       compiled  pattern  to  be  stacked,  leaving the original available for
-       immediate matching. By using push and/or pushcopy, a number of patterns
+       compiled  pattern to be stacked, leaving the original available for im-
+       mediate matching. By using push and/or pushcopy, a number  of  patterns
        can  be  compiled  and  retained. These modifiers are incompatible with
        posix, and control modifiers that act at match time are ignored (with a
        message)  for the stacked patterns. The jitverify modifier applies only
@@ -1855,8 +1855,8 @@
        matched with the pattern, terminated as usual by an empty line  or  end
        of  file.  This  command  may be followed by a modifier list containing
        only control modifiers that act after a pattern has been  compiled.  In
-       particular,  hex,  posix,  posix_nosub,  push,  and  pushcopy  are  not
-       allowed, nor are any option-setting modifiers.  The JIT modifiers  are,
+       particular,  hex,  posix,  posix_nosub,  push, and pushcopy are not al-
+       lowed, nor are any option-setting modifiers.  The  JIT  modifiers  are,
        however, permitted.  Here is an example that saves and reloads two pat-
        terns.