A year or so ago there were discussions on this list about a new API for PCRE.
The current one is 17 years old and has been greatly hacked around to
accommodate new features while retaining compatibility. Over most of this year
I have been working on implementing the new API, known as PCRE2. It has now got
to the stage where it is nearly complete. It runs all the tests and there are
pcre2test and pcre2grep programs that use it. I have made the basic reference
documentation (the pcre2api and pcre2test man pages) but there is still a lot
of documentation work to do. I have removed the discussion documents from the
FTP site, because the implementation has ended up being slightly different in
some places.
The code of PCRE2 is available for checking out from the SVN repository:

    svn co svn://vcs.exim.org/pcre2/code/trunk pcre2

If there are any keen folks on this list who would like to take a look at it,
please do so. I would be very grateful for any comments - positive or negative.
There is still time to make changes before even an alpha release, though it is
too late for anything major. Please note the following points:
1. You should treat this as a new project, not just a drastic update to PCRE1.
A lot has changed, though the underlying structure of the code is much the
same. We have made the version number 10.00 (currently set to 10.00-DEV) so as
to avoid any confusion with PCRE1 versions.
2. Note that --enable-utf and --enable-ucp have been amalgamated into
--enable-unicode.
3. I have updated the CMake files as well as the configure files. CMake works
for me on Linux, but I have no way of testing it on Windows.
4. One very visible change is that explicit "studying" of compiled patterns has
been abolished - it now always happens automatically. Originally I thought that
studying might take a lot of resources, but in practice it never has. Things
are simpler without a separate step.
5. Another very visible change is that pcre_exec() has become pcre2_match(),
which is a better name. Various other names have been changed as well.
6. The pcre2test program is a complete rewrite. The old test program
started as a quick hack, but with so many added options its syntax became
horribly messy. The input format has been redesigned and is largely
incompatible with that of the old program.
7. The JIT code is not yet present (it is being worked on). An attempt to
configure with JIT will be ignored. References to JIT in the documentation are
anticipatory.
8. There are no facilities for saving/restoring a compiled pattern. This was
always a hack in PCRE1, added when processors were slower, and before the
existence of JIT support (which cannot be saved).
9. There is no C++ wrapper. The existing PCRE1 wrapper has no maintainer at the
moment, so it is unlikely to be ported to PCRE2. It now seems to me that in
fact it is best NOT to include such a wrapper with PCRE2, but to encourage
somebody to create and maintain a separate project - or several projects, as I
think there are different views on how best to do the wrapping.
10. There is some HTML documentation, but a lot of links will be broken for
files that do not yet exist.
11. The discussion document proposed a "find and replace" function. This has
not yet been implemented.
12. Finally ... this is still very much a work in progress. There will be
changes to the files from time to time, though I think the base code is now
relatively stable.
I hope some people on this list will be able to find time to do some testing.
There is no hurry, as it will take some time to finish the documentation and
remaining code. My hope is that there will be a PCRE2 release by the end of
the year.