Re: [pcre-dev] 8.31-RC1 test release is available

Author: Ralf Junker
Date:  
To: pcre-dev
Subject: Re: [pcre-dev] 8.31-RC1 test release is available
On 12.06.2012 10:30, Philip Hazel wrote:

>> I am right now investigating a DFA 8-bit vs. DFA 16-bit "?R" recursive
>> pattern inconsistency. I will provide details ASAP.
>>
>> If possible, please hold back the PCRE 8.31 release for another day.
>
> No problem! I was going to leave it till the end of this week in any
> case. Thanks for your testing.


Thanks! Here are the details:

The pattern and subject below return different results under 8-bit and 16-bit DFA matching. The problem only shows up when the non-default ISO 8859 character tables (/T1) are specified.

Matching with pcre16_dfa_exec yields a "Pointer arithmetic underrun in process: pcretest.exe(4172)" in pcre_dfa_exec.c line 3208. This is the pcretest output for 16-bit:


PCRE version 8.31-RC1 2012-06-01

/<H((?(?!<H|F>)(.)|(?R))++)*F>/T1
\Dtext <H more text <H texting more  hexA0-"\xA0"    hex above 7F-"\xBC" F> text xxxxx <H text F> text F> text2 <H text sample F> more text.
 0: <H more text <H texting more  hexA0-"\xa0"    hex above 7F-"\xbc" F> text xxxxx <H text F> text F>



For 8-bit there is no buffer underrun. The matched string is shorter than in the 16-bit case. This is the pcretest output for 8-bit:


PCRE version 8.31-RC1 2012-06-01

/<H((?(?!<H|F>)(.)|(?R))++)*F>/T1
\Dtext <H more text <H texting more  hexA0-"\xA0"    hex above 7F-"\xBC" F> text xxxxx <H text F> text F> text2 <H text sample F> more text.
 0: <H more text <H texting more  hexA0-"\xa0"    hex above 7F-"\xbc" F>



Note: Both 8-bit and 16-bit match identically if the non-ASCII chars \xA0 and \xBC are changed to ASCII (<= 127).
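
Since the difference depends on bytes above 127 in the subject, it can help to locate them when reducing a test case. The following is only a minimal sketch under that assumption: first_non_ascii is a hypothetical helper, not part of PCRE or pcretest, and it says nothing about the cause of the underrun.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not a PCRE or pcretest function): returns the offset
   of the first byte above 0x7F in subject[0..length), or -1 if every byte
   is plain ASCII. */
long first_non_ascii(const char *subject, size_t length)
{
    assert(subject != NULL);
    for (size_t i = 0; i < length; i++) {
        /* Cast to unsigned char: plain char may be signed, in which case a
           byte such as 0xA0 would otherwise compare as a negative value. */
        if ((unsigned char)subject[i] > 0x7F)
            return (long)i;
    }
    return -1;
}
```

Running this over the subject above would point at the \xA0 byte first; replacing the reported bytes with ASCII one at a time shows which of them is needed to trigger the mismatch.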

Ralf