Re: [pcre-dev] Remove some restrictions of lookbehind assert…

Author: ND
Date:  
To: Pcre-dev
Subject: Re: [pcre-dev] Remove some restrictions of lookbehind assertions
On 2019-08-01 08:20, Zoltán Herczeg wrote:
>If we would use your idea for doing (0,n-1) match, that could be too
> slow for large subject, and people would complain.
>


Yes, it could be slow. But:
- we can use [\G,n-1]
- we can honestly warn about it in the docs. The performance of X(?<=Y)
would be roughly comparable to that of "X(?=.*?Y\z)", wouldn't it?


> Before we chose anything to implement, it would be good to know about
> the problems we want to solve. Especially whether we can solve them with
> the current construct. I mean you can always construct artificial use
> cases for certain features, but are they necessary or you can solve them
> other ways.
>

I have needed a variable-length lookbehind twice:

1. when data arrived as a stream of 24 kB blocks and I needed to find the
last number in each block
/.{24000}(?<=(\d++)\D*+)/g

2. when I had a JSON array file and wanted to find every top-level element
that has an "id" tag at any nesting level
/(\{(?:[^{}]++|(?1))*+\})(?<=\{"id":"(?>.*?").*)/g
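
To make the intent of pattern 1 concrete, here is a minimal Python sketch of
the equivalent "split first, then search" approach. The function name and the
block_size parameter are illustrative assumptions, not anything from PCRE. It
also differs from the pattern in one respect: it never looks for digits
before the start of the current block.

```python
import re

BLOCK_SIZE = 24000  # block length taken from pattern 1 above


def last_number_per_block(data, block_size=BLOCK_SIZE):
    """Return the last run of digits in each full block (None if there is none)."""
    results = []
    # Only full blocks are examined; .{24000} would not match a shorter tail.
    for start in range(0, len(data) - block_size + 1, block_size):
        block = data[start:start + block_size]
        runs = re.findall(r"\d+", block)  # every digit run inside this block
        results.append(runs[-1] if runs else None)
    return results
```

This is the kind of helper that has to be written outside the regex today,
which is only possible when a programming language is available.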


> It is also frequent that you combine regexes with other script
> languages. For example you split a string first into records, and do
> some search in each record, rather than trying everything with one
> complicated regex.


1. Combining is sometimes bad from a performance point of view.
2. There are cases where no programming language is available to the
user, only regex. This is exactly the case in one of my applications.