Philip, Zoltán, did you see this?
I just tested the latest SVN 977 and the problem persists. Other
than that, it works just fine.
Ralf
On 12.06.2012 17:38, Ralf Junker wrote:
> On 12.06.2012 10:30, Philip Hazel wrote:
>
>>> I am right now investigating a DFA 8-bit vs. DFA 16-bit "?R" recursive
>>> pattern inconsistency. I will provide details ASAP.
>>>
>>> If possible, please hold back the PCRE 8.31 release for another day.
>>
>> No problem! I was going to leave it till the end of this week in any
>> case. Thanks for your testing.
>
> Thanks! Here are the details:
>
> The pattern and subject below return different results when run with 8-bit and 16-bit DFA matching. The problem only shows up when specifying the non-default ISO 8859 character tables (/T1).
>
> Matching with pcre16_dfa_exec yields a "Pointer arithmetic underrun in process: pcretest.exe(4172)" in pcre_dfa_exec.c line 3208. This is the pcretest output for 16-bit:
>
>
> PCRE version 8.31-RC1 2012-06-01
>
> /<H((?(?!<H|F>)(.)|(?R))++)*F>/T1
> \Dtext <H more text <H texting more hexA0-"\xA0" hex above 7F-"\xBC" F> text xxxxx <H text F> text F> text2 <H text sample F> more text.
> 0: <H more text <H texting more hexA0-"\xa0" hex above 7F-"\xbc" F> text xxxxx <H text F> text F>
>
>
> For 8-bit there is no buffer underrun. The matched string is shorter than with 16-bit. This is the pcretest output for 8-bit:
>
>
> PCRE version 8.31-RC1 2012-06-01
>
> /<H((?(?!<H|F>)(.)|(?R))++)*F>/T1
> \Dtext <H more text <H texting more hexA0-"\xA0" hex above 7F-"\xBC" F> text xxxxx <H text F> text F> text2 <H text sample F> more text.
> 0: <H more text <H texting more hexA0-"\xa0" hex above 7F-"\xbc" F>
>
>
> Note: Both 8-bit and 16-bit match identically if the non-ASCII chars \xA0 and \xBC are changed to ASCII (<= 127).
>
> Ralf