On Thu, Feb 22, 2007 at 01:52:00AM -0500, Daniel Richard G. wrote:
> Okay everyone, new version of my patch. This one integrates Bob's changes
> between patch 5 and 6, plus a few more of my own. It is not quite yet patch
> 7, but please give it a whirl, and let me know what more it needs.
OK, great!
> To apply:
>
> cd /path/to/pcre-7.0
> mv ucptable.c ucptable.h
> mv perltest perltest.pl
> patch -p1 </path/to/pcre-7.0-drg2.patch
At this point, I get,
$ patch -p1 < ~/pcre-7.0-drg2.patch
patching file autogen.sh
...
patching file RunGrepTest.in
Hunk #5 FAILED at 201.
1 out of 7 hunks FAILED -- saving rejects to file RunGrepTest.in.rej
patching file RunTest.in
...
RunGrepTest.in.rej is attached.
> chmod +x autogen.sh
> ./autogen.sh
> ./configure
> make distcheck
>
>
> Change notes from drg1:
>
>
> ==== configure.ac ====
>
> * Added checking of --with-link-size option argument.
Good thinking!
> * There's a simplification I'd like to make: When you give
> AC_ARG_ENABLE(foo, ...), Autoconf automatically defines for you a shell
> variable $enable_foo that contains the value of the option ("yes", "no",
> or whatever was specified with --enable-foo=blah). Same deal with
> AC_ARG_WITH and $with_foo.
>
> What do y'all say we ditch the ac_pcre_* variables, and use
> enable_foo/with_foo in their stead? We could then eliminate the third
> argument from most of the AC_ARG_ENABLE/WITH calls, as it would be
> redundant.
I like the way it is. That's because it's easy to provide default
values. Also, it's nice and easy to read. Using the built in variables
is slightly more confusing to someone that doesn't know autoconf that
well.
> ==== Makefile.am ====
>
> * perltest --> perltest.pl
Why is this necessary at this point in time? Does it help solve a
build system problem?
> * Fixed up the pcrecpp documentation conditionals. For future reference:
> there is no such thing as dist_noinst_MANS; you have to assign those
> files to EXTRA_DIST.
>
> ==== pcre_compile.c ====
>
> * Use "#ifndef EBCDIC" instead of "#if !EBCDIC", to fix the "EBCDIC not
> defined" problem that Philip observed.
> diff -ruN pcre-7.0/autogen.sh pcre-7.1-RC0/autogen.sh
> --- pcre-7.0/autogen.sh 1969-12-31 19:00:00.000000000 -0500
> +++ pcre-7.1-RC0/autogen.sh 2007-02-20 22:44:21.000000000 -0500
> @@ -0,0 +1,21 @@
> +#!/bin/sh
Do you think the rewrite of the autoconf script is necessary at this
point? I'm personally looking for minimal change at this point, only
until the patch is acceptted upstream. Then we can reak havok again!
> diff -ruN pcre-7.0/CMakeLists.txt pcre-7.1-RC0/CMakeLists.txt
> --- pcre-7.0/CMakeLists.txt 1969-12-31 19:00:00.000000000 -0500
> +++ pcre-7.1-RC0/CMakeLists.txt 2007-02-21 01:55:06.000000000 -0500
> @@ -0,0 +1 @@
> +# This file is a placeholder.
What's this?
> diff -ruN pcre-7.0/RunGrepTest.in pcre-7.1-RC0/RunGrepTest.in
> --- pcre-7.0/RunGrepTest.in 2006-12-19 04:31:35.000000000 -0500
> +++ pcre-7.1-RC0/RunGrepTest.in 2007-02-22 00:43:52.000000000 -0500
> @@ -3,6 +3,10 @@
> # This file is generated by configure from RunGrepTest.in. Make any changes
> # to that file.
>
> +# Set the C locale, so that sort(1) behaves predictably.
> +LC_ALL=C
> +export LC_ALL
> +
> echo "Testing pcregrep"
> ./pcregrep -V
>
> @@ -10,7 +14,7 @@
> # itself. What we are checking here is the file handling and options that are
> # supported by pcregrep.
>
> -cf=diff
> +cf="diff -u"
> valgrind=
> if [ ! -d testdata ] ; then
> ln -s @top_srcdir@/testdata testdata
> @@ -19,7 +23,7 @@
>
> while [ $# -gt 0 ] ; do
> case $1 in
> - valgrind) valgrind="valgrind -q --leak-check=no";;
> + valgrind) valgrind="valgrind -q --leak-check=no";;
> *) echo "Unknown argument $1"; exit 1;;
> esac
> shift
> @@ -127,19 +131,19 @@
> $valgrind ./pcregrep -L 'fox' $testdata/grepinput $testdata/grepinputx >>testtry
>
> echo "---------------------------- Test 33 -----------------------------" >>testtry
> -$valgrind ./pcregrep 'fox' $testdata/grepnonexist >>testtry 2>&1
> +$valgrind ./pcregrep 'fox' $testdata/grepnonexist >>testtry 2>&1
> echo "RC=$?" >>testtry
>
> echo "---------------------------- Test 34 -----------------------------" >>testtry
> -$valgrind ./pcregrep -s 'fox' $testdata/grepnonexist >>testtry 2>&1
> +$valgrind ./pcregrep -s 'fox' $testdata/grepnonexist >>testtry 2>&1
> echo "RC=$?" >>testtry
>
> echo "---------------------------- Test 35 -----------------------------" >>testtry
> -$valgrind ./pcregrep -L -r --include=grepinputx 'fox' $testdata >>testtry
> +$valgrind ./pcregrep -L -r --include=grepinputx 'fox' $testdata >>testtry
> echo "RC=$?" >>testtry
>
> echo "---------------------------- Test 36 -----------------------------" >>testtry
> -$valgrind ./pcregrep -L -r --include=grepinput --exclude 'grepinput$' 'fox' $testdata >>testtry
> +$valgrind ./pcregrep -L -r --include=grepinput --exclude 'grepinput$' 'fox' $testdata | sort >>testtry
> echo "RC=$?" >>testtry
>
> echo "---------------------------- Test 37 -----------------------------" >>testtry
> @@ -197,11 +201,12 @@
> $valgrind ./pcregrep --newline=crlf "^(abc|def|ghi|jkl)" $testdata/grepinputx >>testtry
>
> echo "---------------------------- Test 52 ------------------------------" >>testtry
> -$valgrind ./pcregrep --newline=cr -F "def
jkl" $testdata/grepinputx >>testtry
> +pattern=`printf 'def\rjkl'`
> +$valgrind ./pcregrep --newline=cr -F "$pattern" $testdata/grepinputx >>testtry
>
> echo "---------------------------- Test 53 ------------------------------" >>testtry
> -$valgrind ./pcregrep --newline=crlf -F "xxx
> -jkl" $testdata/grepinputx >>testtry
> +pattern=`printf 'xxx\r\njkl'`
> +$valgrind ./pcregrep --newline=crlf -F "$pattern" $testdata/grepinputx >>testtry
>
> echo "---------------------------- Test 54 ------------------------------" >>testtry
> $valgrind ./pcregrep -n --newline=any "^(abc|def|ghi|jkl)" $testdata/grepinputx >>testtry
> @@ -215,7 +220,7 @@
>
> if [ $no_utf8 -ne 0 ] ; then
> echo "Testing pcregrep UTF-8 features"
> -
> +
> echo "---------------------------- Test U1 ------------------------------" >testtry
> $valgrind ./pcregrep -n -u --newline=any "^X" $testdata/grepinput8 >>testtry
>
> @@ -229,6 +234,6 @@
> echo "Skipping pcregrep UTF-8 tests: no UTF-8 support in PCRE library"
> fi
>
> -exit 0
> +exit 0
>
> # End
> diff -ruN pcre-7.0/RunTest.in pcre-7.1-RC0/RunTest.in
> --- pcre-7.0/RunTest.in 2006-12-19 04:31:35.000000000 -0500
> +++ pcre-7.1-RC0/RunTest.in 2007-02-21 22:06:36.000000000 -0500
> @@ -3,6 +3,9 @@
> # This file is generated by configure from RunTest.in. Make any changes
> # to that file.
>
> +# TODO: Replace use of variables @LINK_SIZE@, @UTF8@ and @UCP@ with
> +# the data found from the output of 'pcretest -C' instead.
> +
> # Run PCRE tests
>
> cf=diff
> @@ -36,13 +39,13 @@
> 7) do7=yes;;
> 8) do8=yes;;
> 9) do9=yes;;
> - valgrind) valgrind="valgrind -q";;
> + valgrind) valgrind="valgrind -q";;
> *) echo "Unknown test number $1"; exit 1;;
> esac
> shift
> done
>
> -if [ "@LINK_SIZE@" != "" -a "@LINK_SIZE@" != "-DLINK_SIZE=2" ] ; then
> +if [ "@LINK_SIZE@" != "2" ] ; then
> if [ $do2 = yes ] ; then
> echo "Can't run test 2 with an internal link size other than 2"
> exit 1
> @@ -57,7 +60,7 @@
> fi
> fi
>
> -if [ "@UTF8@" = "" ] ; then
> +if [ "@UTF8@" = "no" ] ; then
> if [ $do4 = yes ] ; then
> echo "Can't run test 4 because UTF-8 support is not configured"
> exit 1
> @@ -80,7 +83,7 @@
> fi
> fi
>
> -if [ "@UCP@" = "" ] ; then
> +if [ "@UCP@" = "no" ] ; then
> if [ $do6 = yes ] ; then
> echo "Can't run test 6 because Unicode property support is not configured"
> exit 1
> @@ -97,12 +100,12 @@
> do1=yes
> do2=yes
> do3=yes
> - if [ "@UTF8@" != "" ] ; then do4=yes; fi
> - if [ "@UTF8@" != "" ] ; then do5=yes; fi
> - if [ "@UTF8@" != "" -a "@UCP@" != "" ] ; then do6=yes; fi
> - do7=yes
> - if [ "@UTF8@" != "" ] ; then do8=yes; fi
> - if [ "@UTF8@" != "" -a "@UCP@" != "" ] ; then do9=yes; fi
> + if [ "@UTF8@" != "no" ] ; then do4=yes; fi
> + if [ "@UTF8@" != "no" ] ; then do5=yes; fi
> + if [ "@UTF8@" != "no" -a "@UCP@" != "no" ] ; then do6=yes; fi
> + do7=yes
> + if [ "@UTF8@" != "no" ] ; then do8=yes; fi
> + if [ "@UTF8@" != "no" -a "@UCP@" != "no" ] ; then do9=yes; fi
> fi
>
> # Show which release
> @@ -126,7 +129,7 @@
> # PCRE tests that are not Perl-compatible - API & error tests, mostly
>
> if [ $do2 = yes ] ; then
> - if [ "@LINK_SIZE@" = "" -o "@LINK_SIZE@" = "-DLINK_SIZE=2" ] ; then
> + if [ "@LINK_SIZE@" = "2" ] ; then
> echo "Test 2: API and error handling (not Perl compatible)"
> $valgrind ./pcretest -q $testdata/testinput2 testtry
> if [ $? = 0 ] ; then
> @@ -138,7 +141,7 @@
> echo " "
> else
> echo Test 2 skipped for link size other than 2 \(@LINK_SIZE@\)
> - echo " "
> + echo " "
> fi
> fi
>
> @@ -184,7 +187,7 @@
> fi
>
> if [ $do5 = yes ] ; then
> - if [ "@LINK_SIZE@" = "" -o "@LINK_SIZE@" = "-DLINK_SIZE=2" ] ; then
> + if [ "@LINK_SIZE@" = "2" ] ; then
> echo "Test 5: API and internals for UTF-8 support (not Perl compatible)"
> $valgrind ./pcretest -q $testdata/testinput5 testtry
> if [ $? = 0 ] ; then
> @@ -196,12 +199,12 @@
> echo " "
> else
> echo Test 5 skipped for link size other than 2 \(@LINK_SIZE@\)
> - echo " "
> + echo " "
> fi
> fi
>
> if [ $do6 = yes ] ; then
> - if [ "@LINK_SIZE@" = "" -o "@LINK_SIZE@" = "-DLINK_SIZE=2" ] ; then
> + if [ "@LINK_SIZE@" = "2" ] ; then
> echo "Test 6: Unicode property support"
> $valgrind ./pcretest -q $testdata/testinput6 testtry
> if [ $? = 0 ] ; then
> @@ -213,7 +216,7 @@
> echo " "
> else
> echo Test 6 skipped for link size other than 2 \(@LINK_SIZE@\)
> - echo " "
> + echo " "
> fi
> fi
>
> diff -ruN pcre-7.0/configure.ac pcre-7.1-RC0/configure.ac
> --- pcre-7.0/configure.ac 2006-12-19 04:31:35.000000000 -0500
> +++ pcre-7.1-RC0/configure.ac 2007-02-21 23:50:44.000000000 -0500
> @@ -1,311 +1,413 @@
> dnl Process this file with autoconf to produce a configure script.
>
> -dnl This configure.in file has been hacked around quite a lot as a result of
> -dnl patches that various people have sent to me (PH). Sometimes the information
> -dnl I get is contradictory. I've tried to put in comments that explain things,
> -dnl but in some cases the information is second-hand and I have no way of
> -dnl verifying it. I am not an autoconf or libtool expert!
> -
> -dnl This is required at the start; the name is the name of a file
> -dnl it should be seeing, to verify it is in the same directory.
> -
> -AC_INIT(dftables.c)
> -AC_CONFIG_SRCDIR([pcre.h])
> -
> -dnl A safety precaution
> +m4_define(pcre_major, [7])
> +m4_define(pcre_minor, [1])
> +m4_define(pcre_prerelease, [-RC0])
> +m4_define(pcre_date, [2006-Dec-18])
I like the way you quoted the rhs, I almost did that. What do you think
about quoting the rhs too?
> +# Libtool shared library interface versions (current:revision:age)
> +m4_define(libpcre_version, [0:1:0])
> +m4_define(libpcreposix_version, [0:0:0])
> +m4_define(libpcrecpp_version, [0:0:0])
What's the motivation of changing this from a shell variable to an m4
define? Did it solve a problem?
> +AC_INIT(PCRE, pcre_major.pcre_minor[]pcre_prerelease, , pcre)
We previously had
AC_INIT(pcre, pcre_major.pcre_minor[pcre_prerelease])
Does everyone want to switch from "pcre" to "PCRE"?
> +# Set a more sensible default value for $(htmldir).
> +if test "x$htmldir" = 'x${docdir}'
> +then
> + htmldir='${docdir}/html'
> +fi
>
> -dnl Provide versioning information for libtool shared libraries that
> -dnl are built by default on Unix systems.
> +# Define all variables that represents the AC_ARG_ENABLE options.
> +ac_pcre_cpp=yes
> +ac_pcre_utf8=maybe # default is "no"
The comment above does not match the value. Also, reimplementing the
same feature in a different way is fine with me if it has a benefit.
However, it seems like a waste of my time to have written it originally
if it is being reimplented for personal preferences. Please don't read
this as an attack on your work, I actually like the way it's done. I
just dislike wasting my time.
> +ac_pcre_unicode_properties=no
> +ac_pcre_newline=lf
> +ac_pcre_ebcdic=no
> +ac_pcre_recurse=no
> +ac_pcre_posix_malloc_threshold=10
> +ac_pcre_link_size=2
> +ac_pcre_match_limit=10000000
> +ac_pcre_match_limit_recursion=MATCH_LIMIT
The above line stands out as odd to me. This is the first time that
MATCH_LIMIT is used in the file. Is this what you intended?
>
> +# Handle --disable-cpp
> +AC_ARG_ENABLE(cpp,
> + AS_HELP_STRING([--disable-cpp],
> + [disable C++ support]),
> + ac_pcre_cpp="$enableval")
Now, I previously had,
AC_ARG_ENABLE(cpp, AC_HELP_STRING([--disable-cpp],[disable C++ support]),
ac_pcre_cpp="no")
Now, the major difference here, is that you set ac_pcre_cpp to
"$enableval" and I was setting it to "no". I'm not sure what's best
here. I prefer the "no" because I know exactly what the value is going
to be set to. Is there any way the user can manipulate the value of
$enableval to be something other than "no"?
This question goes for the other values as well.
> +# Make sure that if ac_pcre_unicode_properties was set, that UTF-8 support
> +# is enabled.
> +#
> +if test "x$ac_pcre_unicode_properties" = "xyes"
> +then
> + if test "x$ac_pcre_utf8" = "xno"
> + then
> + AC_MSG_ERROR([support for Unicode properties requires UTF-8 support])
> + fi
> + ac_pcre_utf8=yes
> +fi
> +if test "x$ac_pcre_utf8" = "xmaybe"
> +then
> + ac_pcre_utf8=no
> +fi
The above looks great. Good work!
> +# This facilitates -ansi builds under Linux
> +dnl AC_DEFINE([_GNU_SOURCE], [], [Enable GNU extensions in glibc])
Whoa, why would we want to add this? I don't think this is a good idea
until after the build system is integrated. What effects could this have
on the translation units that I'm diffing?
> +$PACKAGE-$VERSION configuration summary:
> +
> + Install prefix ......... : ${prefix}
> + C preprocessor ......... : ${CPP}
> + C compiler ............. : ${CC}
> + C++ preprocessor ....... : ${CXXCPP}
> + C++ compiler ........... : ${CXX}
> + Linker ................. : ${LD}
> + C preprocessor flags ... : ${CPPFLAGS}
> + C compiler flags ....... : ${CFLAGS}
> + C++ compiler flags ..... : ${CXXFLAGS}
> + Linker flags ........... : ${LDFLAGS}
> + Extra libraries ........ : ${LIBS}
> +
> + Build C++ library ...... : ${ac_pcre_cpp}
> + Enable UTF-8 support ... : ${ac_pcre_utf8}
> + Unicode properties ..... : ${ac_pcre_unicode_properties}
> + Newline char/sequence .. : ${ac_pcre_newline}
> + EBCDIC coding .......... : ${ac_pcre_ebcdic}
> + Use stack recursion .... : ${ac_pcre_recurse}
> + POSIX mem threshold .... : ${ac_pcre_posix_malloc_threshold}
> + Internal link size ..... : ${ac_pcre_link_size}
> + Match limit ............ : ${ac_pcre_match_limit}
> + Match limit recursion .. : ${ac_pcre_match_limit_recursion}
> + Build shared libs ...... : ${enable_shared}
> + Build static libs ...... : ${enable_static}
The above is neat. Good job.
>
> diff -ruN pcre-7.0/libpcre.pc.in pcre-7.1-RC0/libpcre.pc.in
> --- pcre-7.0/libpcre.pc.in 2006-12-19 04:31:35.000000000 -0500
> +++ pcre-7.1-RC0/libpcre.pc.in 2007-02-15 22:06:46.000000000 -0500
> @@ -7,6 +7,6 @@
>
> Name: libpcre
> Description: PCRE - Perl compatible regular expressions C library
> -Version: @PCRE_VERSION@
> +Version: @PACKAGE_VERSION@
> Libs: -L${libdir} -lpcre
> Cflags: -I${includedir}
> diff -ruN pcre-7.0/libpcrecpp.pc.in pcre-7.1-RC0/libpcrecpp.pc.in
> --- pcre-7.0/libpcrecpp.pc.in 2006-12-19 04:31:35.000000000 -0500
> +++ pcre-7.1-RC0/libpcrecpp.pc.in 2007-02-15 22:06:38.000000000 -0500
> @@ -7,6 +7,6 @@
>
> Name: libpcrecpp
> Description: PCRECPP - C++ wrapper for PCRE
> -Version: @PCRE_VERSION@
> +Version: @PACKAGE_VERSION@
> Libs: -L${libdir} -lpcre -lpcrecpp
> Cflags: -I${includedir}
> diff -ruN pcre-7.0/pcre-config.in pcre-7.1-RC0/pcre-config.in
> --- pcre-7.0/pcre-config.in 2006-12-19 04:31:35.000000000 -0500
> +++ pcre-7.1-RC0/pcre-config.in 2007-02-15 22:06:29.000000000 -0500
> @@ -43,7 +43,7 @@
> echo $exec_prefix
> ;;
> --version)
> - echo @PCRE_VERSION@
> + echo @PACKAGE_VERSION@
> ;;
> --cflags | --cflags-posix)
> if test @includedir@ != /usr/include ; then
> diff -ruN pcre-7.0/pcre.h.in pcre-7.1-RC0/pcre.h.in
> --- pcre-7.0/pcre.h.in 1969-12-31 19:00:00.000000000 -0500
> +++ pcre-7.1-RC0/pcre.h.in 2007-02-15 21:08:23.000000000 -0500
> @@ -0,0 +1,294 @@
> +/*************************************************
> +* Perl-Compatible Regular Expressions *
> +*************************************************/
> +
> +/* This is the public header file for the PCRE library, to be #included by
> +applications that call the PCRE functions.
> +
> + Copyright (c) 1997-2006 University of Cambridge
> +
> +-----------------------------------------------------------------------------
> +Redistribution and use in source and binary forms, with or without
> +modification, are permitted provided that the following conditions are met:
> +
> + * Redistributions of source code must retain the above copyright notice,
> + this list of conditions and the following disclaimer.
> +
> + * Redistributions in binary form must reproduce the above copyright
> + notice, this list of conditions and the following disclaimer in the
> + documentation and/or other materials provided with the distribution.
> +
> + * Neither the name of the University of Cambridge nor the names of its
> + contributors may be used to endorse or promote products derived from
> + this software without specific prior written permission.
> +
> +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
> +AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
> +IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
> +ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE
> +LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
> +CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
> +SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
> +INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
> +CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
> +ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
> +POSSIBILITY OF SUCH DAMAGE.
> +-----------------------------------------------------------------------------
> +*/
> +
> +#ifndef _PCRE_H
> +#define _PCRE_H
> +
> +/* The current PCRE version information. */
> +
> +/* NOTES FOR FUTURE MAINTAINERS: Do not use numbers with leading zeros, because
> +they may be treated as octal constants. The PCRE_PRERELEASE feature is for
> +identifying release candidates. It might be defined as -RC2, for example. In
> +real releases, it should be defined empty. Do not change the alignment of these
> +statments. The code in ./configure greps out the version numbers by using "cut"
> +to get values from column 29 onwards. These are substituted into pcre-config
> +and libpcre.pc. The values are not put into configure.ac and substituted here
> +(which would simplify this issue) because that makes life harder for those who
> +cannot run ./configure. As it now stands, this file need not be edited in that
> +circumstance. */
> +
> +#define PCRE_MAJOR @PCRE_MAJOR@
> +#define PCRE_MINOR @PCRE_MINOR@
> +#define PCRE_PRERELEASE @PCRE_PRERELEASE@
> +#define PCRE_DATE @PCRE_DATE@
> +
> +/* Win32 uses DLL by default; it needs special stuff for exported functions
> +when building PCRE. */
> +
> +#ifdef _WIN32
> +# ifdef PCRE_DEFINITION
> +# ifdef DLL_EXPORT
> +# define PCRE_DATA_SCOPE __declspec(dllexport)
> +# endif
> +# else
> +# ifndef PCRE_STATIC
> +# define PCRE_DATA_SCOPE extern __declspec(dllimport)
> +# endif
> +# endif
> +#endif
> +
> +/* Otherwise, we use the standard "extern". */
> +
> +#ifndef PCRE_DATA_SCOPE
> +# ifdef __cplusplus
> +# define PCRE_DATA_SCOPE extern "C"
> +# else
> +# define PCRE_DATA_SCOPE extern
> +# endif
> +#endif
> +
> +/* Have to include stdlib.h in order to ensure that size_t is defined;
> +it is needed here for malloc. */
> +
> +#include <stdlib.h>
> +
> +/* Allow for C++ users */
> +
> +#ifdef __cplusplus
> +extern "C" {
> +#endif
> +
> +/* Options */
> +
> +#define PCRE_CASELESS 0x00000001
> +#define PCRE_MULTILINE 0x00000002
> +#define PCRE_DOTALL 0x00000004
> +#define PCRE_EXTENDED 0x00000008
> +#define PCRE_ANCHORED 0x00000010
> +#define PCRE_DOLLAR_ENDONLY 0x00000020
> +#define PCRE_EXTRA 0x00000040
> +#define PCRE_NOTBOL 0x00000080
> +#define PCRE_NOTEOL 0x00000100
> +#define PCRE_UNGREEDY 0x00000200
> +#define PCRE_NOTEMPTY 0x00000400
> +#define PCRE_UTF8 0x00000800
> +#define PCRE_NO_AUTO_CAPTURE 0x00001000
> +#define PCRE_NO_UTF8_CHECK 0x00002000
> +#define PCRE_AUTO_CALLOUT 0x00004000
> +#define PCRE_PARTIAL 0x00008000
> +#define PCRE_DFA_SHORTEST 0x00010000
> +#define PCRE_DFA_RESTART 0x00020000
> +#define PCRE_FIRSTLINE 0x00040000
> +#define PCRE_DUPNAMES 0x00080000
> +#define PCRE_NEWLINE_CR 0x00100000
> +#define PCRE_NEWLINE_LF 0x00200000
> +#define PCRE_NEWLINE_CRLF 0x00300000
> +#define PCRE_NEWLINE_ANY 0x00400000
> +
> +/* Exec-time and get/set-time error codes */
> +
> +#define PCRE_ERROR_NOMATCH (-1)
> +#define PCRE_ERROR_NULL (-2)
> +#define PCRE_ERROR_BADOPTION (-3)
> +#define PCRE_ERROR_BADMAGIC (-4)
> +#define PCRE_ERROR_UNKNOWN_OPCODE (-5)
> +#define PCRE_ERROR_UNKNOWN_NODE (-5) /* For backward compatibility */
> +#define PCRE_ERROR_NOMEMORY (-6)
> +#define PCRE_ERROR_NOSUBSTRING (-7)
> +#define PCRE_ERROR_MATCHLIMIT (-8)
> +#define PCRE_ERROR_CALLOUT (-9) /* Never used by PCRE itself */
> +#define PCRE_ERROR_BADUTF8 (-10)
> +#define PCRE_ERROR_BADUTF8_OFFSET (-11)
> +#define PCRE_ERROR_PARTIAL (-12)
> +#define PCRE_ERROR_BADPARTIAL (-13)
> +#define PCRE_ERROR_INTERNAL (-14)
> +#define PCRE_ERROR_BADCOUNT (-15)
> +#define PCRE_ERROR_DFA_UITEM (-16)
> +#define PCRE_ERROR_DFA_UCOND (-17)
> +#define PCRE_ERROR_DFA_UMLIMIT (-18)
> +#define PCRE_ERROR_DFA_WSSIZE (-19)
> +#define PCRE_ERROR_DFA_RECURSE (-20)
> +#define PCRE_ERROR_RECURSIONLIMIT (-21)
> +#define PCRE_ERROR_NULLWSLIMIT (-22)
> +#define PCRE_ERROR_BADNEWLINE (-23)
> +
> +/* Request types for pcre_fullinfo() */
> +
> +#define PCRE_INFO_OPTIONS 0
> +#define PCRE_INFO_SIZE 1
> +#define PCRE_INFO_CAPTURECOUNT 2
> +#define PCRE_INFO_BACKREFMAX 3
> +#define PCRE_INFO_FIRSTBYTE 4
> +#define PCRE_INFO_FIRSTCHAR 4 /* For backwards compatibility */
> +#define PCRE_INFO_FIRSTTABLE 5
> +#define PCRE_INFO_LASTLITERAL 6
> +#define PCRE_INFO_NAMEENTRYSIZE 7
> +#define PCRE_INFO_NAMECOUNT 8
> +#define PCRE_INFO_NAMETABLE 9
> +#define PCRE_INFO_STUDYSIZE 10
> +#define PCRE_INFO_DEFAULT_TABLES 11
> +
> +/* Request types for pcre_config(). Do not re-arrange, in order to remain
> +compatible. */
> +
> +#define PCRE_CONFIG_UTF8 0
> +#define PCRE_CONFIG_NEWLINE 1
> +#define PCRE_CONFIG_LINK_SIZE 2
> +#define PCRE_CONFIG_POSIX_MALLOC_THRESHOLD 3
> +#define PCRE_CONFIG_MATCH_LIMIT 4
> +#define PCRE_CONFIG_STACKRECURSE 5
> +#define PCRE_CONFIG_UNICODE_PROPERTIES 6
> +#define PCRE_CONFIG_MATCH_LIMIT_RECURSION 7
> +
> +/* Bit flags for the pcre_extra structure. Do not re-arrange or redefine
> +these bits, just add new ones on the end, in order to remain compatible. */
> +
> +#define PCRE_EXTRA_STUDY_DATA 0x0001
> +#define PCRE_EXTRA_MATCH_LIMIT 0x0002
> +#define PCRE_EXTRA_CALLOUT_DATA 0x0004
> +#define PCRE_EXTRA_TABLES 0x0008
> +#define PCRE_EXTRA_MATCH_LIMIT_RECURSION 0x0010
> +
> +/* Types */
> +
> +struct real_pcre; /* declaration; the definition is private */
> +typedef struct real_pcre pcre;
> +
> +/* When PCRE is compiled as a C++ library, the subject pointer type can be
> +replaced with a custom type. For conventional use, the public interface is a
> +const char *. */
> +
> +#ifndef PCRE_SPTR
> +#define PCRE_SPTR const char *
> +#endif
> +
> +/* The structure for passing additional data to pcre_exec(). This is defined in
> +such as way as to be extensible. Always add new fields at the end, in order to
> +remain compatible. */
> +
> +typedef struct pcre_extra {
> + unsigned long int flags; /* Bits for which fields are set */
> + void *study_data; /* Opaque data from pcre_study() */
> + unsigned long int match_limit; /* Maximum number of calls to match() */
> + void *callout_data; /* Data passed back in callouts */
> + const unsigned char *tables; /* Pointer to character tables */
> + unsigned long int match_limit_recursion; /* Max recursive calls to match() */
> +} pcre_extra;
> +
> +/* The structure for passing out data via the pcre_callout_function. We use a
> +structure so that new fields can be added on the end in future versions,
> +without changing the API of the function, thereby allowing old clients to work
> +without modification. */
> +
> +typedef struct pcre_callout_block {
> + int version; /* Identifies version of block */
> + /* ------------------------ Version 0 ------------------------------- */
> + int callout_number; /* Number compiled into pattern */
> + int *offset_vector; /* The offset vector */
> + PCRE_SPTR subject; /* The subject being matched */
> + int subject_length; /* The length of the subject */
> + int start_match; /* Offset to start of this match attempt */
> + int current_position; /* Where we currently are in the subject */
> + int capture_top; /* Max current capture */
> + int capture_last; /* Most recently closed capture */
> + void *callout_data; /* Data passed in with the call */
> + /* ------------------- Added for Version 1 -------------------------- */
> + int pattern_position; /* Offset to next item in the pattern */
> + int next_item_length; /* Length of next item in the pattern */
> + /* ------------------------------------------------------------------ */
> +} pcre_callout_block;
> +
> +/* Indirection for store get and free functions. These can be set to
> +alternative malloc/free functions if required. Special ones are used in the
> +non-recursive case for "frames". There is also an optional callout function
> +that is triggered by the (?) regex item. For Virtual Pascal, these definitions
> +have to take another form. */
> +
> +#ifndef VPCOMPAT
> +PCRE_DATA_SCOPE void *(*pcre_malloc)(size_t);
> +PCRE_DATA_SCOPE void (*pcre_free)(void *);
> +PCRE_DATA_SCOPE void *(*pcre_stack_malloc)(size_t);
> +PCRE_DATA_SCOPE void (*pcre_stack_free)(void *);
> +PCRE_DATA_SCOPE int (*pcre_callout)(pcre_callout_block *);
> +#else /* VPCOMPAT */
> +PCRE_DATA_SCOPE void *pcre_malloc(size_t);
> +PCRE_DATA_SCOPE void pcre_free(void *);
> +PCRE_DATA_SCOPE void *pcre_stack_malloc(size_t);
> +PCRE_DATA_SCOPE void pcre_stack_free(void *);
> +PCRE_DATA_SCOPE int pcre_callout(pcre_callout_block *);
> +#endif /* VPCOMPAT */
> +
> +/* Exported PCRE functions */
> +
> +PCRE_DATA_SCOPE pcre *pcre_compile(const char *, int, const char **, int *,
> + const unsigned char *);
> +PCRE_DATA_SCOPE pcre *pcre_compile2(const char *, int, int *, const char **,
> + int *, const unsigned char *);
> +PCRE_DATA_SCOPE int pcre_config(int, void *);
> +PCRE_DATA_SCOPE int pcre_copy_named_substring(const pcre *, const char *,
> + int *, int, const char *, char *, int);
> +PCRE_DATA_SCOPE int pcre_copy_substring(const char *, int *, int, int, char *,
> + int);
> +PCRE_DATA_SCOPE int pcre_dfa_exec(const pcre *, const pcre_extra *,
> + const char *, int, int, int, int *, int , int *, int);
> +PCRE_DATA_SCOPE int pcre_exec(const pcre *, const pcre_extra *, PCRE_SPTR,
> + int, int, int, int *, int);
> +PCRE_DATA_SCOPE void pcre_free_substring(const char *);
> +PCRE_DATA_SCOPE void pcre_free_substring_list(const char **);
> +PCRE_DATA_SCOPE int pcre_fullinfo(const pcre *, const pcre_extra *, int,
> + void *);
> +PCRE_DATA_SCOPE int pcre_get_named_substring(const pcre *, const char *,
> + int *, int, const char *, const char **);
> +PCRE_DATA_SCOPE int pcre_get_stringnumber(const pcre *, const char *);
> +PCRE_DATA_SCOPE int pcre_get_stringtable_entries(const pcre *, const char *,
> + char **, char **);
> +PCRE_DATA_SCOPE int pcre_get_substring(const char *, int *, int, int,
> + const char **);
> +PCRE_DATA_SCOPE int pcre_get_substring_list(const char *, int *, int,
> + const char ***);
> +PCRE_DATA_SCOPE int pcre_info(const pcre *, int *, int *);
> +PCRE_DATA_SCOPE const unsigned char *pcre_maketables(void);
> +PCRE_DATA_SCOPE int pcre_refcount(pcre *, int);
> +PCRE_DATA_SCOPE pcre_extra *pcre_study(const pcre *, int, const char **);
> +PCRE_DATA_SCOPE const char *pcre_version(void);
> +
> +#ifdef __cplusplus
> +} /* extern "C" */
> +#endif
> +
> +#endif /* End of pcre.h */
> diff -ruN pcre-7.0/pcre_compile.c pcre-7.1-RC0/pcre_compile.c
> --- pcre-7.0/pcre_compile.c 2006-12-19 04:31:35.000000000 -0500
> +++ pcre-7.1-RC0/pcre_compile.c 2007-02-21 22:21:29.000000000 -0500
> @@ -312,7 +312,7 @@
> 0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00, /* - 71 40 */
> 0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00, /* 72- | */
> 0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00, /* & - 87 50 */
> - 0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00, /* 88- ? */
> + 0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00, /* 88- */
Do you really want to change the above?
> 0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00, /* - -103 60 */
> 0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00, /* 104- ? */
> 0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00, /* 112-119 70 */
> @@ -346,7 +346,7 @@
> 0x01,0x00,0x00,0x00,0x00,0x00,0x00,0x00, /* - 71 */
> 0x00,0x00,0x00,0x80,0x00,0x80,0x80,0x80, /* 72- | */
> 0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00, /* & - 87 */
> - 0x00,0x00,0x00,0x80,0x80,0x80,0x00,0x00, /* 88- ? */
> + 0x00,0x00,0x00,0x80,0x80,0x80,0x00,0x00, /* 88- */
Same here, why is this being changed?
> 0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00, /* - -103 */
> 0x00,0x00,0x00,0x00,0x00,0x10,0x00,0x80, /* 104- ? */
> 0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00, /* 112-119 */
Honestly, I would really prefer to leave the EBCDIC stuff the way it was
until we get the patch committed. However, if it needs to be done now, I
can update my expected translation units. I'm just afraid that the more
code we change, the more will be broken.
> diff -ruN pcre-7.0/pcre_ucp_searchfuncs.c pcre-7.1-RC0/pcre_ucp_searchfuncs.c
> --- pcre-7.0/pcre_ucp_searchfuncs.c 2006-12-19 04:31:35.000000000 -0500
> +++ pcre-7.1-RC0/pcre_ucp_searchfuncs.c 2007-02-20 22:37:47.000000000 -0500
> @@ -45,7 +45,7 @@
>
> #include "ucp.h" /* Category definitions */
> #include "ucpinternal.h" /* Internal table details */
> -#include "ucptable.c" /* The table itself */
> +#include "ucptable.h" /* The table itself */
This is another change that could wait I believe.
> /* Table to translate from particular type value to the general value. */
> diff -ruN pcre-7.0/pcrecpp.cc pcre-7.1-RC0/pcrecpp.cc
> --- pcre-7.0/pcrecpp.cc 2006-12-19 04:31:35.000000000 -0500
> +++ pcre-7.1-RC0/pcrecpp.cc 2007-02-20 22:57:43.000000000 -0500
> @@ -29,6 +29,10 @@
> //
> // Author: Sanjay Ghemawat
>
> +#ifdef HAVE_CONFIG_H
> +# include <config.h>
> +#endif
Why do you need the ifdef? If config.h isn't #included, I think we'll
get a compile error. I don't see any problem with the way this file was
before.
> #include <stdlib.h>
> #include <stdio.h>
> #include <ctype.h>
> @@ -37,7 +41,6 @@
> #include <errno.h>
> #include <string>
> #include <algorithm>
> -#include "config.h"
> // We need this to compile the proper dll on windows/msys. This is copied
> // from pcre_internal.h. It would probably be better just to include that.
> #define PCRE_DEFINITION /* Win32 __declspec(export) trigger for .dll */
> diff -ruN pcre-7.0/pcredemo.c pcre-7.1-RC0/pcredemo.c
> --- pcre-7.0/pcredemo.c 2006-12-19 04:31:35.000000000 -0500
> +++ pcre-7.1-RC0/pcredemo.c 2007-02-21 01:06:35.000000000 -0500
> @@ -16,6 +16,10 @@
> */
>
>
> +#ifdef HAVE_CONFIG_H
> +# include <config.h>
> +#endif
Same comment here. In fact, why are we adding the #include here?
> #include <stdio.h>
> #include <string.h>
> #include <pcre.h>
> diff -ruN pcre-7.0/pcregrep.c pcre-7.1-RC0/pcregrep.c
> --- pcre-7.0/pcregrep.c 2006-12-19 04:31:35.000000000 -0500
> +++ pcre-7.1-RC0/pcregrep.c 2007-02-20 22:57:49.000000000 -0500
> @@ -37,6 +37,10 @@
> -----------------------------------------------------------------------------
> */
>
> +#ifdef HAVE_CONFIG_H
> +# include <config.h>
> +#endif
Same comment.
> +
> #include <ctype.h>
> #include <locale.h>
> #include <stdio.h>
> @@ -48,7 +52,6 @@
> #include <sys/stat.h>
> #include <unistd.h>
>
> -#include "config.h"
> #include "pcre.h"
>
> #define FALSE 0
> @@ -56,7 +59,6 @@
>
> typedef int BOOL;
>
> -#define VERSION "4.4 29-Nov-2006"
Yeah, good catch here!
> #define MAX_PATTERN_COUNT 100
>
> #if BUFSIZ > 8192
> @@ -244,7 +246,7 @@
>
> /************* Directory scanning in Unix ***********/
>
> -#if IS_UNIX
> +#if defined HAVE_SYS_STAT_H && defined HAVE_DIRENT_H && defined HAVE_SYS_TYPES_H
> #include <sys/types.h>
> #include <sys/stat.h>
> #include <dirent.h>
> @@ -314,7 +316,7 @@
> when it did not exist. */
>
>
> -#elif HAVE_WIN32API
> +#elif HAVE_WINDOWS_H
>
> #ifndef STRICT
> # define STRICT
> @@ -436,8 +438,8 @@
> typedef void directory_type;
>
> int isdirectory(char *filename) { return 0; }
> -directory_type * opendirectory(char *filename) {}
> -char *readdirectory(directory_type *dir) {}
> +directory_type * opendirectory(char *filename) { return (directory_type*)0;}
> +char *readdirectory(directory_type *dir) { return (char*)0;}
> void closedirectory(directory_type *dir) {}
>
>
> @@ -1328,8 +1330,7 @@
> case 'x': process_options |= PO_LINE_MATCH; break;
>
> case 'V':
> - fprintf(stderr, "pcregrep version %s using ", VERSION);
> - fprintf(stderr, "PCRE version %s\n", pcre_version());
> + fprintf(stderr, "pcregrep version %s\n", pcre_version());
> exit(0);
> break;
>
> diff -ruN pcre-7.0/perltest.pl pcre-7.1-RC0/perltest.pl
> --- pcre-7.0/perltest.pl 2006-12-19 04:31:35.000000000 -0500
> +++ pcre-7.1-RC0/perltest.pl 2007-02-21 01:31:20.000000000 -0500
> @@ -1,4 +1,4 @@
> -#! /usr/bin/perl
> +#! /usr/bin/env perl
I also don't understand this. It seems non standard.
> # Program for testing regular expressions with perl to check that PCRE handles
> # them the same. This is the version that supports /8 for UTF-8 testing. As it
> diff -ruN pcre-7.0/testdata/grepoutput pcre-7.1-RC0/testdata/grepoutput
> --- pcre-7.0/testdata/grepoutput 2006-12-19 04:31:35.000000000 -0500
> +++ pcre-7.1-RC0/testdata/grepoutput 2007-02-21 00:39:11.000000000 -0500
> @@ -342,8 +342,8 @@
> ./testdata/grepinputx
> RC=0
> ---------------------------- Test 36 -----------------------------
> -./testdata/grepinputx
> ./testdata/grepinput8
> +./testdata/grepinputx
Why is this necessary?
Thanks! Your changes look great.
Bob Rossi