Re: [pcre-dev] Serialization format versioning

Author: Daniel Richard G.
Date:  
To: Zoltán Herczeg, pcre-dev
Subject: Re: [pcre-dev] Serialization format versioning
On Mon, 2018 Jun 25 20:14+0200, Zoltán Herczeg wrote:
> > I can understand that. But I would point out that PCRE2's current
> > notion of serialization is quite limited compared to what that word
> > usually implies (cf. Java, .NET object serialization), so this is
> > not likely to be the only time that an application developer finds
> > the functionality does not meet their needs.
>
> True. Bytecode dump would probably be a better name.


I agree. It's more or less a sanitized version of saving/loading the
memory contents of a pcre2_code object.

While I wouldn't go so far as to say the API names should change, I do
think it would be good to have a blurb in the pcre2serialize doc along
the lines of

    Note that this API does not serialize compiled regexes to an
    abstract format, like Java or .NET object serialization. The
    serialized output is more like a bytecode dump, and can only be
    re-loaded by the same environment that produced it (PCRE2 version,
    PCRE2 code unit width, machine word size, etc.). As any PCRE2
    version upgrade will invalidate all serialized regexes produced by
    earlier versions, applications should be prepared to re-compile
    regexes from their source text as needed.


In particular, I'd like to prevent developers from falling into the trap of
assuming PCRE2's serialization is abstract, building an application
around that assumption, and then discovering this isn't the case at the
next version bump.
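
To make "be prepared to re-compile" concrete, here is a minimal sketch
of the application-side logic, under some loud assumptions: the
env_stamp and cache_entry structs and the choose_load_action() function
are hypothetical names of my own, not part of PCRE2. The idea is only
that the application stores the regex source text and a description of
the producing environment next to each serialized blob, and falls back
to the source text whenever the environment differs. In real code the
stamp fields would presumably be filled from
pcre2_config(PCRE2_CONFIG_VERSION, ...), PCRE2_CODE_UNIT_WIDTH and
sizeof(void *), and a failure from pcre2_serialize_decode() (e.g.
PCRE2_ERROR_BADMODE or PCRE2_ERROR_BADMAGIC) should be treated the same
way as a stamp mismatch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical description of the environment that produced a blob.
 * None of these names come from PCRE2 itself. */
typedef struct {
    char    pcre2_version[32]; /* version string of the producing library */
    uint8_t code_unit_width;   /* 8, 16 or 32 */
    uint8_t pointer_size;      /* sizeof(void *) on the producing machine */
} env_stamp;

/* Hypothetical cache entry: the source text is always kept, because the
 * serialized bytes may become unusable after any upgrade. */
typedef struct {
    env_stamp            stamp;
    const char          *pattern_source; /* regex text, never discarded */
    const unsigned char *blob;           /* serialized bytes, may be NULL */
    size_t               blob_len;
} cache_entry;

typedef enum { USE_CACHED_BLOB, RECOMPILE_FROM_SOURCE } load_action;

/* Returns nonzero only if every field of the two stamps is identical. */
int stamp_matches(const env_stamp *a, const env_stamp *b)
{
    return strncmp(a->pcre2_version, b->pcre2_version,
                   sizeof a->pcre2_version) == 0
        && a->code_unit_width == b->code_unit_width
        && a->pointer_size == b->pointer_size;
}

/* Decide whether a cached blob may be handed to the decoder at all.
 * Any doubt (missing blob, different environment) means recompiling
 * from pattern_source. */
load_action choose_load_action(const cache_entry *entry,
                               const env_stamp *current)
{
    assert(entry != NULL && current != NULL);
    assert(entry->pattern_source != NULL); /* the fallback must exist */

    if (entry->blob == NULL || entry->blob_len == 0)
        return RECOMPILE_FROM_SOURCE;

    return stamp_matches(&entry->stamp, current)
        ? USE_CACHED_BLOB
        : RECOMPILE_FROM_SOURCE;
}
```

The stamp check is only a cheap early-out; it does not replace checking
the return value of the decode call. The point of the design is that
the serialized blob is treated as a disposable cache and the source
text as the real stored form.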


--Daniel


P.S.: Please Cc: me on any replies, as I am not subscribed to this list.


--
Daniel Richard G. || skunk@???
My ASCII-art .sig got a bad case of Times New Roman.