Re: [pcre-dev] Start optimization issue

Author: Zoltán Herczeg
Date:  
To: ND
CC: Pcre-dev
Subject: Re: [pcre-dev] Start optimization issue
Hi,

I have investigated this issue. In the optimized case PCRE_STARTLINE is set, so after a failed attempt the matcher skips ahead to the first newline instead of trying every start position. A comment before is_startline(...) describes this:

/* This is called to find out if every branch starts with ^ or .* so that
"first char" processing can be done to speed things up in multiline
matching and for non-DOTALL patterns that start with .* (which must start at
the beginning or after \n). As in the case of is_anchored() (see above), we
have to take account of back references to capturing brackets that contain .*
because in that case we can't make the assumption. ... */

The atomic group is probably what breaks this case: it removes the ability of .* to backtrack, so the assumption no longer holds. Backtracking control verbs such as (*COMMIT) may have the same effect.
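
To make the reasoning concrete, here is a rough Python sketch of what I think happens with ND's pattern on "aba". It is only a toy emulation, not PCRE code: the helper names (match_at, line_starts, search) are made up for illustration, and the atomic group and lookbehind are emulated by hand under the assumption that (?>.*?a) keeps only the shortest match at each start offset.

```python
import re

# Toy emulation, not PCRE internals: the atomic group (?>.*?a) keeps only the
# first (shortest) match of .*?a at a given start and never backtracks into it.
ATOMIC_BODY = re.compile(r".*?a")


def match_at(subject, start):
    """Try (?>.*?a)(?<=ba) at one start offset; return the matched text or None."""
    m = ATOMIC_BODY.match(subject, start)
    if m is None:
        return None
    end = m.end()
    # Lookbehind (?<=ba): the two characters before `end` must be "ba".
    if end >= 2 and subject[end - 2:end] == "ba":
        return subject[start:end]
    return None  # atomic group: no longer alternative is tried from this start


def line_starts(subject):
    """Offsets a start-of-line scan would try: 0 and just after each newline."""
    return [0] + [i + 1 for i, ch in enumerate(subject) if ch == "\n"]


def search(subject, starts):
    """Return the first match found among the given start offsets, else None."""
    for start in starts:
        hit = match_at(subject, start)
        if hit is not None:
            return hit
    return None


subject = "aba"
optimized = search(subject, line_starts(subject))       # start-of-line offsets only
unoptimized = search(subject, range(len(subject) + 1))  # every offset
print(optimized)    # None
print(unoptimized)  # ba
```

At offset 0 the group matches just "a" and the lookbehind fails. The match only succeeds from offset 1, which a start-of-line scan never tries. This agrees with the "No match" and "0: ba" results in the listing below.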

Regards,
Zoltan

ND <nadenj@???> wrote:
> Good day!
>
> Here is a pcretest.exe listing:
>
> PCRE version 8.31 2012-07-06
> /(?>.*?a)(?<=ba)/
> aba
> No match
>
> A match was expected.
> Further investigation shows that this is a start optimization issue.
>
> PCRE version 8.31 2012-07-06
> /(*NO_START_OPT)(?>.*?a)(?<=ba)/
> aba
> 0: ba
>
> What kind of start optimization is doing this? I can't find anything
> about this case in the documentation.
>
> Thanks.