Re: [pcre-dev] Remove some restrictions of lookbehind assert…

Author: Zoltán Herczeg
Date:  
To: pcre-dev@exim.org
CC: ND
Subject: Re: [pcre-dev] Remove some restrictions of lookbehind assertions
I have been thinking about practical use cases. With the proposed changes, doing a submatch is quite complicated:

(*:A)submatch(*:B)(*MOVE:A)(*SETEND:B)match-submatch-again(*MOVE:B)(*SETEND)

Perhaps the other idea, using capturing brackets for this purpose, would be better. Something like this:

(submatch)(*match:{1}pattern) is easier.

A name can be given inside the {} as well.
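
To make the intended behaviour concrete, here is a rough emulation in Python's re module, not in PCRE. It assumes that (*match:{1}pattern) would mean "search for pattern only inside the text captured by group 1"; the exact semantics are not settled in this thread, and the helper name match_with_submatch is made up for illustration.

```python
import re

def match_with_submatch(outer, inner, subject, group=1):
    """Match `outer`, then require `inner` to match inside the text
    captured by `group` (a number or a name). Hypothetical helper."""
    m = re.search(outer, subject)
    if m is None:
        return None
    start, end = m.span(group)
    if start == -1:  # the group did not take part in the match
        return None
    # Restrict the second match to the captured window only.
    sub = re.compile(inner).search(subject, start, end)
    return (m, sub) if sub else None

# The group is referenced by name here, as the {} syntax would allow.
result = match_with_submatch(r"<(?P<tag>[^>]+)>", r"\bid=", '<div id="x">', group="tag")
missing = match_with_submatch(r"<(?P<tag>[^>]+)>", r"class=", '<div id="x">', group="tag")
```

The second pattern never sees anything outside the capture, which is the part that needs two marks plus (*MOVE) and (*SETEND) in the verb-based form.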

The (*MOVE) could be kept for moving the string pointer around.

Let me know your opinion.

Regards,
Zoltan
 
-------- Original message --------
From: Zoltán Herczeg <hzmester@freemail.hu>
Date: 29 July 2019, 18:51:34
Subject: Re: [pcre-dev] Remove some restrictions of lookbehind assertions
To: pcre-dev@exim.org
> > (*SETEND:mark_name)
> >   - This verb changes the end position to the position recorded by the last mark which name is
> >     mark_name. If the position is smaller than the current string position, it is set to the
> >     current string position.
> By "end position" do you mean "end of subject"? I'm misunderstanding
> something here because won't a MARK name usually be earlier than the
> current position? Or do you envisage using this in some kind of loop? In
> the interpreter, this will be easy to implement only if the MARK is
> earlier on the matching path. Oh, are you thinking of something like
> this?
> (*MARK:A)<stuff>(*MARK:B)<stuff>(*MOVE:A)(*SETEND:B)<more stuff>
> That would be straightforward in the interpreter, I think.

Exactly. Usually (*MOVE:A)(*SETEND:B) would be used inside an assertion to check something that should appear somewhere earlier, a kind of generic lookbehind assertion. Maybe we could use capturing blocks for that, which require far less text (and are a bit less flexible). Anyway, we should not forget about the flags (noteol and friends, I think) that control \z (and friends).
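
As a rough analogy only, Python's re module already has a similar windowing idea: compiled patterns accept pos and endpos arguments. The sketch below treats two plain integers as the positions recorded by hypothetical (*MARK:A) and (*MARK:B) verbs and applies the clamping rule from the quoted proposal; match_between_marks is an invented name and nothing here reflects how PCRE would implement it.

```python
import re

def match_between_marks(pattern, subject, mark_a, mark_b):
    """Hypothetical emulation: move to mark_a, set the end to mark_b,
    then match `pattern` at the new position."""
    # Per the proposal, an end before the current position is clamped to it.
    end = max(mark_a, mark_b)
    return re.compile(pattern).match(subject, mark_a, end)

subject = "key=value; rest"
mark_a = subject.index("=") + 1  # stand-in for the position of (*MARK:A)
mark_b = subject.index(";")      # stand-in for the position of (*MARK:B)

windowed = match_between_marks(r"[a-z]+\Z", subject, mark_a, mark_b)
# Without the end limit, \Z only matches at the real end of the subject.
unlimited = re.compile(r"[a-z]+\Z").match(subject, mark_a)
```

In Python, endpos makes the subject behave as if it ended there, so \Z succeeds inside the window. Whether \z and the related flags should behave the same way after (*SETEND) is exactly the question raised above.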
Anyway, before we implement anything, let's discuss it. I think we can figure out something which is:
- Easy to maintain: I suspect not many people will use this (at least not now), so a simpler approach is less likely to introduce a lot of new bugs.
- Flexible enough in practice: moving string pointers and doing a sub-match looks like a valid use case. However, since Perl does not have this feature, it is likely that very few practical use cases actually require it.
Regards,
Zoltan