[pcre-dev] [Bug 2290] Recursion issue

Author: admin
Date:  
To: pcre-dev
Subject: [pcre-dev] [Bug 2290] Recursion issue
https://bugs.exim.org/show_bug.cgi?id=2290

--- Comment #4 from Philip Hazel <ph10@???> ---
(In reply to smx from comment #3)

> The code i use is some thin c++ wrapper, it was used for quite some years,
> but recursion was never used in it.


Ah. Well, in that case I suspect there is a problem with the wrapper. I'm not a
C++ programmer, so I'm not familiar with C++-isms.

> What this thin wrapper is using, boils down to example like this - taken
> directly from the debugger:
> std::string patt="\\(([^()]++|(?R))*\\)";
> 1)
> cpattern=pcre_compile2(patt.c_str(),0,&error_code,&error_string,
> &error_offset,NULL);


Did you check that the result is not NULL? That is, check for a compile error?

> 2) pcre_exec(cpattern,NULL,"(abc)",5,0,0,offsets,96)
> It return -5
>
> That question - just to make sure - does 96 here means: 96/3 which is max 32
> capturedExpressions?


Yes, 96 should be the total size of the offsets vector, which allows for
96/3 = 32 pairs of offsets. That pattern has only one capturing group, though.
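
The sizing arithmetic can be sketched as below. This is only an illustration, not part of PCRE: the helper names max_offset_pairs and ovector_size_for are hypothetical. It relies on the documented behaviour of pcre_exec() in the 8.xx series, where the first two-thirds of the vector holds the offset pairs (pair 0 being the whole match) and the last third is workspace.

```cpp
// Hypothetical helper (not a PCRE function): number of (start, end) offset
// pairs pcre_exec() can hand back for a vector of `ovecsize` ints. Only the
// first two-thirds of the vector holds pairs; the rest is workspace.
constexpr int max_offset_pairs(int ovecsize) {
    return ovecsize / 3;
}

// Hypothetical helper: smallest vector size, in ints, that has room for the
// whole match (pair 0) plus `groups` capturing groups.
constexpr int ovector_size_for(int groups) {
    return (groups + 1) * 3;
}

// 96 ints give 32 pairs: the whole match plus up to 31 capturing groups.
static_assert(max_offset_pairs(96) == 32, "96 ints hold 32 offset pairs");
// The pattern in this report has one capturing group, so 6 ints would do.
static_assert(ovector_size_for(1) == 6, "one group needs 6 ints");
```

So a vector of 96 is far larger than this pattern needs, and its size is unlikely to be the cause of the error.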

> Hmm, pcre1(8.xx) or newer(10.xx) - which is faster and/or smaller?


PCRE2 is probably bigger, because it has been extended. Originally, it was
exactly the same code, but with a revised (and more extendable) API. However,
there have been major refactorings since then. A quick look at the static
libraries on my Linux box shows my guess above was wrong. libpcre.a is 2206564
bytes whereas the latest libpcre2-8.a (not yet released) is 1829282 bytes. So
not a huge difference. I don't know about speed. In both cases you will get a
big speed-up if you can make use of the JIT optimization.

> I read somewhere that Pcre 10.xx is missing one feature that 8.xx series
> has, but i don't remember what that is(callouts or maybe something else).


It's the bundled C++ wrapper, which for the 8.xx series lost its maintainer.
I decided not to bundle one with PCRE2, because I had learned that different
people had different ideas as to how it should be wrapped, so it seemed better
not to choose just one style. (And as I said, I'm not a C++ person.)

> This then should work if it works on a pcretest, i'll try to play with
> pcretest and find the cause, maybe that wrapper just doesn't use something
> right or overrides something - but all other cases works fine and i checked
> it a few times, any idea why this may not work is welcome.


If everything else works, it is most mysterious. If pcretest fails for you, try
pcretest's -d option (to check what's compiled) and maybe the /C modifier on a
pattern to insert automatic callouts to see how it is matching.
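
While debugging a wrapper, it can also help to log the numeric result of pcre_exec() as a name. The sketch below is a hypothetical helper, not a PCRE function. The values are copied by hand from the 8.xx pcre.h as an assumption, so that the example compiles without the header; real code should use the PCRE_ERROR_* macros from pcre.h and check them against the installed version.

```cpp
#include <string>

// Hypothetical helper (not part of PCRE): names a few pcre_exec() return
// codes. The numeric values are assumed from the 8.xx pcre.h; verify them
// against your own header before relying on this table.
std::string pcre_exec_error_name(int rc) {
    if (rc >= 0) {
        return "not an error";
    }
    switch (rc) {
        case -1: return "PCRE_ERROR_NOMATCH";
        case -2: return "PCRE_ERROR_NULL";
        case -3: return "PCRE_ERROR_BADOPTION";
        case -4: return "PCRE_ERROR_BADMAGIC";
        case -5: return "PCRE_ERROR_UNKNOWN_OPCODE";
        case -6: return "PCRE_ERROR_NOMEMORY";
        default: return "unlisted PCRE error";
    }
}
```

Under that assumption, the -5 reported above corresponds to PCRE_ERROR_UNKNOWN_OPCODE, which the pcreapi documentation says can be caused by a bug in PCRE or by the compiled pattern being overwritten. That would fit a wrapper that copies, frees, or otherwise mishandles the compiled pattern.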

> Also, does maybe PCRE_NO_UTF8_CHECK(with PCRE_UTF8) speed things a little
> when it's known for sure utf8 is used?


Yes. That's what it is for.
