Re: [pcre-dev] \K in lookahead assertion gives unpredictable…

Author: Zoltán Herczeg
Date:  
To: ND
CC: Pcre-dev
Subject: Re: [pcre-dev] \K in lookahead assertion gives unpredictable result
Hi,

just a little help for debugging this. In cases like this one, where the start offset is greater than the end offset (1 > 0 here), pcretest prints the characters from the start offset to the end of the input. It could print an empty string (as Perl does), but that would not be helpful for testing or debugging.

The characters are printed by this function:

static int pchars(pcre_uint8 *p, int length, FILE *f)

It starts with:

if (length < 0)
    length = strlen((char *)p);

I suspect something is wrong with p.
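
To make the arithmetic concrete, here is a minimal sketch of that fallback. It is not pcretest's actual code: the helper names effective_length and printed_bytes are made up for illustration, and it assumes the length handed to pchars is simply end offset minus start offset.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not taken from pcretest: returns how many bytes a
   pchars()-style printer would emit for a given length argument. */
static size_t effective_length(const unsigned char *p, int length)
{
    assert(p != NULL);
    if (length < 0)                      /* negative length: fall back to strlen */
        return strlen((const char *)p);
    return (size_t)length;
}

/* Hypothetical helper: bytes printed for a match with the given offsets,
   assuming the caller passes (subject + start, end - start). */
static size_t printed_bytes(const unsigned char *subject, int start, int end)
{
    assert(subject != NULL && start >= 0);
    return effective_length(subject + start, end - start);
}
```

With the subject "ab", start 1 and end 0, the length is -1, so the fallback counts from 'b' up to the terminator and exactly one byte is printed. Anything after the 'b' in the output means strlen ran past the intended end, which is why p and the terminator are the things to look at.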

The string is guaranteed to be zero-terminated by line 4925 in pcretest:
    if (pcre_mode == PCRE8_MODE)
    {
      *q8 = 0;
      len = (int)(q8 - (pcre_uint8 *)dbuffer);
    }


Please check what dbuffer contains after this line, and what the value of 'p' is in pchars(..).
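
If a debugger is not at hand, a small hex dump can do the same job. The following is only a sketch: hex_dump is a hypothetical helper that does not exist in pcretest, and it formats into a caller-supplied buffer so the result can be printed or compared.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical debugging helper (not part of pcretest): writes the first
   `count` bytes of `buf` into `out` as two-digit hex values separated by
   spaces. Returns the number of characters written, or -1 if `out` is
   too small to hold the whole dump. */
static int hex_dump(const unsigned char *buf, size_t count,
                    char *out, size_t out_size)
{
    size_t used = 0;
    size_t i;

    assert(buf != NULL && out != NULL && out_size > 0);
    out[0] = '\0';
    for (i = 0; i < count; i++) {
        int n = snprintf(out + used, out_size - used,
                         i ? " %02x" : "%02x", (unsigned)buf[i]);
        if (n < 0 || (size_t)n >= out_size - used)
            return -1;                   /* output would be truncated */
        used += (size_t)n;
    }
    return (int)used;
}
```

Called on dbuffer with a count of len + 1 right after the line above, the dump for the input "ab" should read "61 62 00". If the last byte is not 00, the terminator is missing; if it is 00, the problem is more likely in the pointer passed to pchars.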

Philip, perhaps we could print a note when the length is < 0, such as:
0: <start offset (1) is bigger than end offset (0), print input to the end> b
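
A rough sketch of how such a note could be produced is below. The function format_offset_note is hypothetical and only builds the text; where it would be called from in pcretest is left open.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical sketch, not pcretest code: builds the suggested note in
   `out` when start is bigger than end, and an empty string otherwise.
   Returns what snprintf returns for the note, or 0 when no note is needed. */
static int format_offset_note(char *out, size_t out_size, int start, int end)
{
    assert(out != NULL && out_size > 0);
    if (start <= end) {
        out[0] = '\0';                   /* normal match: no note */
        return 0;
    }
    return snprintf(out, out_size,
        "<start offset (%d) is bigger than end offset (%d), "
        "print input to the end> ", start, end);
}
```

The note would be written between the "0: " prefix and the characters themselves, so for the case in this thread the line ends with the single 'b'.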

Hope this helps,
Zoltan

"Zoltán Herczeg" <hzmester@???> írta:
>Hi,
>
>yes, it seems there is rubbish after the 'b'. However, I don't see this behavior in the recent release (under Linux at least), and your binary is half a year old. Could you try it with a newer version?
>
>Anyway, I suspect the result is correct (it starts with b); it is just that the printing does not stop at the end of the input. This might be a pcretest or Windows libc bug.
>
>Regards,
>Zoltan
>
>ND <nadenj@???> írta:
>>Good day!
>>
>>Here is pcretest.exe listing:
>>
>>
>>PCRE version 8.34-RC 2013-06-14
>>/(?=a\K)/
>>ab
>> 0: b\x89b\x1f\xe4J~\x04
>>
>>
>>This match result looks unpredictable to me. There may be a bug here.
>>
>>Thanks a lot.
>>
>>--
>>## List details at https://lists.exim.org/mailman/listinfo/pcre-dev