Re: [pcre-dev] Kudos on PCRE 7.2...

From: Daniel Richard G.
Date:
To: Robert Roessler
Cc: pcre-dev
Subject: Re: [pcre-dev] Kudos on PCRE 7.2...
On Thu, 2007 Aug 30 19:32:13 -0700, Robert Roessler wrote:
>
> The changes are all in the direction of making a simple system more
> complex (and with more pre-requisites), which I would think should make
> any experienced software engineer look askance (maybe more than once).


Supporting multiple systems (and build systems) is not a simple task,
and it cannot be handled by a simple setup. Look at every software
package that uses autotools instead of a plain user-edited makefile, and
ask why.

> How does not only taking a clean, simple, SINGLE include of config.h
> (with or without the proper language-intended construction - "" vs <>),
> and transforming that into twenty-odd points of failure or confusion (do
> I need this here? hmmm...), but now the includes themselves don't even
> HAPPEN without defining a mysterious symbol which isn't ever mentioned?


This is a standard idiom in software that makes use of autotools, and
ensures that config.h is the first non-comment source seen by the
compiler. This requirement was not fulfilled with the previous single
#include of config.h, and what's more, it was not obvious that the
requirement was not being fulfilled.

This "mysterious symbol" HAVE_CONFIG_H is there because in its absence,
you are expected to define the required preprocessor symbols on the
compiler command line.
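
To make the idiom concrete, here is a minimal sketch of what the top of
such a source file looks like. It is not taken from the PCRE sources:
the file name example.c, the setting EXAMPLE_LINK_SIZE, and the function
example_link_size() are all made up for illustration, and the fallback
value exists only so that the sketch compiles on its own.

```c
/* example.c: hypothetical source file, not taken from PCRE. */

/* The guarded include comes before anything else, so that every later
 * header and declaration sees the configured settings. */
#ifdef HAVE_CONFIG_H
#include <config.h>
#endif

#include <stdio.h>

/* EXAMPLE_LINK_SIZE is a made-up setting. It would normally come from
 * config.h or from -DEXAMPLE_LINK_SIZE=... on the command line; this
 * fallback only keeps the sketch compilable without either. */
#ifndef EXAMPLE_LINK_SIZE
#define EXAMPLE_LINK_SIZE 2
#endif

/* Returns the value this translation unit was configured with. */
int example_link_size(void)
{
    return EXAMPLE_LINK_SIZE;
}

/* Prints the configured value, e.g. for a quick build sanity check. */
void example_print_config(void)
{
    printf("EXAMPLE_LINK_SIZE = %d\n", example_link_size());
}
```

Assuming a config.h sits in the current directory, the same file can be
compiled either way:

    cc -DHAVE_CONFIG_H -I. -c example.c
    cc -DEXAMPLE_LINK_SIZE=3 -c example.c

The first form takes its settings from the config header; the second
skips the header and passes the setting on the command line.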

> Not to mention the fact that forever after (unless this "improvement"
> is rescinded in a timely fashion), every new source file added to PCRE
> forever will now need to also have this gratuitous and bizarrely
> expressed include.


Yes. That is exactly the intent. Every individually compiled .c file
will have its own "#include <config.h>", protected by "#ifdef
HAVE_CONFIG_H".

There is nothing gratuitous or bizarre about this; it is as common and
prosaic in free software C code as GPL or BSD license boilerplate.

> This seems like a losing proposition from any and all of the
> perspectives of ease of use, reliability, maintainability, and just
> simple code clarity/transparency.


I think it is a winning proposition on all those points, and moreover, I
can point to an actual instance where the change improved matters
(ensuring that config.h comes first, and keeping it that way).

> And how does ANY of this large set of changes make life easier (or even
> as good as it was) for users of one of the more popular toolchains on
> the dominant operating system on the planet? The point is not so much
> that there are not solution/project files for any version of Visual
> Studio (I can certainly contribute those), but that clearly the concerns
> of this portion of the developer population don't seem to be very well
> represented... if they were even considered at all.


We would be happy to address inadequacies in the new system, but your
comments so far reflect more an unfamiliarity with common conventions in
free software code than actual failings specific to PCRE. You could say
"I think the configuration script should be called 'config.sh'
instead of 'configure'", and what am I to say to that?

Moreover, PCRE has already been down the path of user-contributed Visual
Studio projects, and it was not pretty---these could not be maintained
by Mr. Hazel, and so were usually out of date (and never distributed in
the tarball).

With CMake in place, you'd have an even higher bar to meet. Could you
also contribute projects for Visual Studio 6.0? Borland's IDE? Xcode on
Mac OS X? Eclipse? KDevelop? And provide updates for every future
release?

CMake, like Autoconf, is not the prettiest or most straightforward tool
in the world. There are many aspects of it that I don't like. But the
benefits it provides are simply too numerous to ignore, and the
alternatives don't even pass the laugh test.

> > Maybe it is possible you could contribute a Makefile.static makefile
> > that would build pcre as a static library. It could contain a bunch of
> > simple commands to do this.
>
> My own uses don't include anything quite this simple.


Because the simplest approach isn't always the most serviceable ^_^

> > What about changing the symbol PCREPOSIX_STATIC -> PCRE_STATIC made your
> > life harder?
>
> Pretty straightforward, actually - as would be the case with pretty much
> any use of PCRE that includes redistributing PCRE as a component of some
> package, one likes to support as many different releases of PCRE as
> possible, to give one's ultimate consumers the flexibility to select a
> version of PCRE that meets *their* needs (possibly extended formalized
> testing, just plain better comfort level from experience with a
> particular release, etc).


So your beef is with the incompatible non-API source-level changes. That
is an understandable point. Things are in flux right now, and the view
is that long-standing issues are being addressed (e.g. declspec
directives that were never quite right until now).

Things will settle down eventually, and will stay as they are, because
they will finally have been done "The Right Way(tm)." That is what I'm
aiming for, and a major reason why I want PCRE to conform to these
common free-software conventions.

> ONE of my two major uses of PCRE involves shipping an Objective Caml
> binding for [the POSIX form of] PCRE in a single package that is
> buildable on Windows OR *nix platforms - all controlled with a
> reasonably simple makefile. To digress, my approach to this is to have
> them expand the PCRE distribution of choice *somewhere* and then point
> to it with make vars from wherever they have expanded my package - I
> then use VPATH in the makefile to actually reference the PCRE source.
>
> Anyway, back on point, life is already complex enough in this
> environment, because version to version, there have been changes that
> force me to have conditional paths in the makefile to define or not
> define symbols that the PCRE source now wants / doesn't want... one of
> my personal favorites was the abrupt shift of the default build mode of
> PCRE [posix] from static lib to DLL (back in the 6.x days?). ;)
>
> So every time that some change like "PCREPOSIX_STATIC -> PCRE_STATIC"
> happens, it means I have to notice that it happened (since it probably
> hasn't been pointed out in a changelist), code a new path in the
> makefile to deal with it (adding a possible point of failure), and now
> my testing task has just gotten larger.


You are maintaining a parallel PCRE build system. One of the downsides
of this is that you have to track changes in the preprocessor interface
to the source code.

Your issue is not a problem with the PCRE source or build system; it is
simply an occupational hazard of the approach you are taking. You would
encounter the same headache if Mr. Hazel added a new source file, and I
dare say you're not going to ask us to avoid doing that.

The standard approach for handling a source package bundled inside a
larger source package, as your use case seems to be, is to have the
outer package invoke something along the lines of

    (cd pcre && ./configure && make)


i.e. allow the inner package's own build system to deal with the source
code, instead of driving the build directly. Your experience illustrates
exactly why this approach is more common.

> I am getting the idea that even with my two wildly different uses of
> PCRE (Windows-only using the Visual Studio IDE, as well as a
> cross-platform source-based release for *nix AND Windows systems), I am
> somehow not being *perceived* as your target demographic (if such
> terminology is meaningful here)... OK, a bit of sarcasm. :)


CMake was added for the explicit purpose of supporting Windows builds.
My employer makes use of PCRE, and building on Windows before this
addition was like pulling teeth.

> But, allow me to make my penultimate point: while I will admit this is a
> home-grown idea (it came to me during this sequence of communications),
> it strikes me that when shipping open-source projects, it is in the
> interests of the project's *consumers* for the *developers* to extend
> the concept of "interface"... so that ideas like contracts and expected
> invariants include the externally visible aspects and controlling inputs
> to the build process.
>
> In case this seems too tortured, I am trying to draw an analog between
> all the reasons we expect adherence to an *interface* to make changing
> *implementations* safe and easy, to having a package maintain
> [obviously, as much as possible] an unchanging build methodology for all
> things observable.
>
> So, I am saying that the same things should be done when breaking these
> observable aspects of the build process as when we decide that a [gasp,
> sacred] interface be changed: do we REALLY need to do this, are the
> intended beneficiaries us or our consumers (both can be valid - or not),
> have we left the old "interface" in place for those whose needs are
> already being met perfectly adequately, and if the answer is NO for this
> last question, then have we been very upfront and open with our
> consumers about WHY we made this choice, WHAT it will do for them, and
> HOW they manage to keep using our product without everything they use it
> in being BROKEN.


I understand what you are asking for. Rather than putting this
constraint on PCRE development, I think you should either refrain from
maintaining a third-party build system for PCRE (and use the one we
ship), or accept that doing so is going to involve some effort tracking
minor source-level changes from release to release. Again, what if we
add a new source file? A new generated-source file?

The build system that can build PCRE X.Y will not necessarily build PCRE
X.(Y+1), and it is not reasonable to ask for this to be the case.

> To wrap up, my philosophical backing on simplicity in engineering comes
> from Albert Einstein: "Everything should be made as simple as possible,
> but not simpler."(*) It seems like we differ substantially on where
> that break is.(**)


I'm no more a fan of gratuitous complexity than you are, Mr. Roessler.
The difference is an awareness of why this feature or that is present,
and why its absence constitutes "simpler than possible."

> (**) - still trying to communicate: the original authors of Unix had, as
> one of their cool ideas, the thought that while a tool might do wondrous
> and complex things (requiring wondrous and complex specification of
> controlling options), as a DEFAULT it should do something USEFUL. I
> submit that re-structuring these silly includes such that they DON'T
> happen at all (even though they are absolutely required) without a
> magical extra define appearing on every compile command line (documented
> or NOT)... completely disregards this time-proven concept.


But those #includes are _not_ absolutely required---not if you pass all
the required preprocessor symbols in as -Dfoo=bar directives on the
command line. Many build systems do exactly this instead of bothering
with a config header (though, admittedly, larger projects tend toward
the latter).


--Daniel