Re: [pcre-dev] PCRE autotools patch drg2

トップ ページ
このメッセージを削除
著者: Bob Rossi
日付:  
To: Daniel Richard G.
CC: pcre-dev
題目: Re: [pcre-dev] PCRE autotools patch drg2
On Thu, Feb 22, 2007 at 08:55:00PM -0500, Daniel Richard G. wrote:
> On Thu, 2007 Feb 22 09:48:26 -0500, Bob Rossi wrote:
> >
> > At this point, I get,
> > $ patch -p1 < ~/pcre-7.0-drg2.patch
> > patching file autogen.sh
> > ...
> > patching file RunGrepTest.in
> > Hunk #5 FAILED at 201.
> > 1 out of 7 hunks FAILED -- saving rejects to file RunGrepTest.in.rej
> > patching file RunTest.in
> > ...
> >
> > RunGrepTest.in.rej is attached.
>
> As Philip noted, this is due to embedded CR characters in the patch getting
> mangled. I'll uuencode the next patch to get around this.


OK, sounds good.

> > > What do y'all say we ditch the ac_pcre_* variables, and use
> > > enable_foo/with_foo in their stead? We could then eliminate the third
> > > argument from most of the AC_ARG_ENABLE/WITH calls, as it would be
> > > redundant.
> >
> > I like the way it is. That's because it's easy to provide default
> > values. Also, it's nice and easy to read. Using the built in variables
> > is slightly more confusing to someone that doesn't know autoconf that
> > well.
>
> The default variable assignments would stay as they are; this is just about
> using variables that Automake already defines for us, instead of creating a
> parallel set.


OK, is it pretty much guarenteed that autoconf won't rename this
variables?

> > Also, reimplementing the same feature in a different way is fine with me
> > if it has a benefit. However, it seems like a waste of my time to have
> > written it originally if it is being reimplented for personal
> > preferences. Please don't read this as an attack on your work, I actually
> > like the way it's done. I just dislike wasting my time.
>
> You're referring to the ac_pcre_unicode_properties_set mechanism? I thought
> we'd discussed that it wouldn't catch the --enable-unicode-properties
> --disable-utf8 error case, and that you'd agreed that this could be
> addressed later on....


I see. It's fine. The new way is nice to me as well.

> > > +ac_pcre_match_limit_recursion=MATCH_LIMIT
> >
> > The above line stands out as odd to me. This is the first time that
> > MATCH_LIMIT is used in the file. Is this what you intended?
>
> Yes! If the user doesn't give a value for match-limit-recursion, then it
> would be much nicer for config.h to have
>
>     #define MATCH_LIMIT        10000000
>     #define MATCH_LIMIT_RECURSION    MATCH_LIMIT

>
> instead of
>
>     #define MATCH_LIMIT        10000000
>     #define MATCH_LIMIT_RECURSION    10000000

>
> :)


OK, that's tricky. Maybe you should add a comment to the configure.ac if
you are going to do it this way. I doubt many people will understand
what is going on unless they look deeply into it.

> > > +# Handle --disable-cpp
> > > +AC_ARG_ENABLE(cpp,
> > > +              AS_HELP_STRING([--disable-cpp],
> > > +                             [disable C++ support]),
> > > +              ac_pcre_cpp="$enableval")

> >
> > Now, I previously had,
> > AC_ARG_ENABLE(cpp, AC_HELP_STRING([--disable-cpp],[disable C++ support]),
> > ac_pcre_cpp="no")
> > Now, the major difference here, is that you set ac_pcre_cpp to
> > "$enableval" and I was setting it to "no". I'm not sure what's best
> > here. I prefer the "no" because I know exactly what the value is going
> > to be set to. Is there any way the user can manipulate the value of
> > $enableval to be something other than "no"?
> >
> > This question goes for the other values as well.
>
> Yes indeedy... if the user specifies any of the following:
>
>     --enable-cpp        => enableval="yes"
>     --enable-cpp=blah    => enableval="blah"
>     --disable-cpp=xyzzy    => enableval="xyzzy"

>
> There's nothing about an AC_ARG_ENABLE option that makes it inherently a
> positive (--enable-*) or negative (--disable-*) option. (Ditto for
> AC_ARG_WITH.) There is the help string that you show the user, but that has
> no effect on the logic.
>
> Even though we don't advertise --enable-cpp, and it's not needed (being the
> default anyway), we don't want it to have an effect opposite from what a
> user would expect.


I understand. I can see both arguments to this point. I mean, the user
would do '--enable-cpp=blah', and that would do nothing for them. Is
this what we want?

> > > +# This facilitates -ansi builds under Linux
> > > +dnl AC_DEFINE([_GNU_SOURCE], [], [Enable GNU extensions in glibc])
> >
> > Whoa, why would we want to add this? I don't think this is a good idea
> > until after the build system is integrated. What effects could this have
> > on the translation units that I'm diffing?
>
> I've commented it out for now, but the motivation is to allow strict-ANSI
> builds under Linux. Right now, if you try to build with -ansi, some
> stat()-related macros fail in pcregrep.c.
>
> (Being able to build in strict-ANSI mode is useful for ensuring
> compatibility with older Unix environments, so it's something we'll want to
> have in the new PCRE.)


Thanks.

> > > diff -ruN pcre-7.0/pcre_compile.c pcre-7.1-RC0/pcre_compile.c
> > > --- pcre-7.0/pcre_compile.c    2006-12-19 04:31:35.000000000 -0500
> > > +++ pcre-7.1-RC0/pcre_compile.c    2007-02-21 22:21:29.000000000 -0500
> > > @@ -312,7 +312,7 @@
> > >    0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00, /*    - 71 40 */
> > >    0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00, /*  72- |     */
> > >    0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00, /*  & - 87 50 */
> > > -  0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00, /*  88- ?     */
> > > +  0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00, /*  88-      */

> >
> > Do you really want to change the above?
>
> Argh! Looks like my editor mangled a Latin-1 character. That'll definitely
> have to go... we could convert it to UTF-8, but the safest thing would be
> to change it to a \x## or \### character reference, so that the source file
> is 7-bit clean.
>
> Philip, what do you think? We surely don't want Latin-1 chars in the source
> files (and preferably no UTF-8 ones either) as that makes for encoding
> headaches all round when people work on them---case in point....


Maybe it's cause I live in the US, but why would an editor modify a file
in places that you don't edit? That just seems wrong to me.

> > Honestly, I would really prefer to leave the EBCDIC stuff the way it was
> > until we get the patch committed. However, if it needs to be done now, I
> > can update my expected translation units. I'm just afraid that the more
> > code we change, the more will be broken.
>
> The "#if !EBCDIC" --> "#ifndef EBCDIC" changes are there since the cpp
> semantic is directly tied to what's in the configuration header. We're
> generally refraining from source changes right now, but this is one of the
> exceptions, due to the close connection.
>
> > > diff -ruN pcre-7.0/pcre_ucp_searchfuncs.c pcre-7.1-RC0/pcre_ucp_searchfuncs.c
> > > --- pcre-7.0/pcre_ucp_searchfuncs.c    2006-12-19 04:31:35.000000000 -0500
> > > +++ pcre-7.1-RC0/pcre_ucp_searchfuncs.c    2007-02-20 22:37:47.000000000 -0500
> > > @@ -45,7 +45,7 @@

> > >
> > >  #include "ucp.h"               /* Category definitions */
> > >  #include "ucpinternal.h"       /* Internal table details */
> > > -#include "ucptable.c"          /* The table itself */
> > > +#include "ucptable.h"          /* The table itself */

> >
> > This is another change that could wait I believe.
>
> A similar issue closely tied to the build system. If the file is named
> ucptable.c, then Automake will want to compile it separately, which fails.


Why didn't I see this?

> > >  /* Table to translate from particular type value to the general value. */
> > > diff -ruN pcre-7.0/pcrecpp.cc pcre-7.1-RC0/pcrecpp.cc
> > > --- pcre-7.0/pcrecpp.cc    2006-12-19 04:31:35.000000000 -0500
> > > +++ pcre-7.1-RC0/pcrecpp.cc    2007-02-20 22:57:43.000000000 -0500
> > > @@ -29,6 +29,10 @@
> > >  //
> > >  // Author: Sanjay Ghemawat

> > >
> > > +#ifdef HAVE_CONFIG_H
> > > +# include <config.h>
> > > +#endif
> >
> > Why do you need the ifdef? If config.h isn't #included, I think we'll
> > get a compile error. I don't see any problem with the way this file was
> > before.
>
> The HAVE_CONFIG_H bit is a standard idiom. The idea is, either you or
> Autotools #defines PACKAGE_VERSION et al. in config.h, and passes
> -DHAVE_CONFIG_H to the compiler---or it gives -DPACKAGE_VERSION=7.1 et al.
> and no -DHAVE_CONFIG_H.
>
> (Autotools will #define all the config.h symbols on the compiler command
> line if you don't do the AC_CONFIG_HEADERS bit)


I see, that makes sense I suppose.

> > > diff -ruN pcre-7.0/pcredemo.c pcre-7.1-RC0/pcredemo.c
> > > --- pcre-7.0/pcredemo.c    2006-12-19 04:31:35.000000000 -0500
> > > +++ pcre-7.1-RC0/pcredemo.c    2007-02-21 01:06:35.000000000 -0500
> > > @@ -16,6 +16,10 @@
> > >  */

> > >
> > >
> > > +#ifdef HAVE_CONFIG_H
> > > +# include <config.h>
> > > +#endif
> >
> > Same comment here. In fact, why are we adding the #include here?
>
> IIRC, I got a compile error without it. Eventually, all the *.c files will
> need to have this bit added---that's the convention for the config header.


I don't get a compiler error with out it. What's changed on your end?

> > > diff -ruN pcre-7.0/testdata/grepoutput pcre-7.1-RC0/testdata/grepoutput
> > > --- pcre-7.0/testdata/grepoutput    2006-12-19 04:31:35.000000000 -0500
> > > +++ pcre-7.1-RC0/testdata/grepoutput    2007-02-21 00:39:11.000000000 -0500
> > > @@ -342,8 +342,8 @@
> > >  ./testdata/grepinputx
> > >  RC=0
> > >  ---------------------------- Test 36 -----------------------------
> > > -./testdata/grepinputx
> > >  ./testdata/grepinput8
> > > +./testdata/grepinputx

> >
> > Why is this necessary?
>
> This output is from a pcregrep -L -r command, which gives a list of files
> as output. The list, alas, is not sorted, and the test was failing for me
> as I was getting a different file order than the reference. So I re-sorted
> the list to proper lexical order ('8' < 'x') and added a sort(1) command to
> RunGrepTest.in so that the output is always sorted.


Does the test fail with pcre-7.0 as well? I wouldn't want to improve the
test suite in this patch if it was failing before as well.

> > Thanks! Your changes look great.
>
> Likewise for your critiques :-) Thanks for going through my patch so
> meticulously!


Your welcome. I appreciate all of the answers you have given me!

> On Thu, 2007 Feb 22 11:28:46 +0000, Philip Hazel wrote:
> I'll make another patch. There's a few other bits that need addressing as
> well.
>
>
> Thank you Philip, Bob, and Craig for your input; it's much appreciated!


Can't wait to see the next patch.

Bob Rossi