Re: [pcre-dev] [Bug 1447] New: Support for Enumerations

Author: ph10
Date:  
To: 1447
CC: pcre-dev
Subject: Re: [pcre-dev] [Bug 1447] New: Support for Enumerations
On Thu, 20 Feb 2014, Go L. Elijah wrote:
>
> I believe regular expressions lack a construct (ignore the spaces):
>
> (?@ aaa | bbb(d*) | ccc(a*) | ddd)
>
> which, when applied to
>
> "cccaaaddd"
>
> results in this (PHP-alike notation):
>
> array("cccaaa", 2, "aaa")
>
> meaning the (?@ ... ) captures as an integer, and the clauses merely enumerate
> options.


I am not sure what you mean by "captures as an integer", nor quite how
you are doing the matching (I'm not familiar with PHP). I work only at
the C-level code of PCRE, where the result of a match is a list of
strings.
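
For readers following the thread, here is one possible reading of the
quoted proposal, sketched with Python's standard `re` module rather than
PCRE. It is only an emulation under stated assumptions: the helper name
`match_enumerated` and the `alt0`, `alt1`, ... group names are invented
for illustration, and the alternatives are assumed to contain no named
groups or numeric backreferences of their own. The "integer capture" is
taken to be the zero-based index of the alternative that matched.

```python
import re

# The four alternatives from the quoted example, spaces removed.
ALTERNATIVES = ["aaa", "bbb(d*)", "ccc(a*)", "ddd"]


def match_enumerated(alternatives, subject):
    """Hypothetical emulation of the proposed (?@ a | b | c) construct.

    Returns [matched_text, alternative_index, *inner_captures],
    or None if no alternative matches anywhere in the subject.
    """
    # Wrap every alternative in a named group so we can tell which one matched.
    source = "|".join(
        "(?P<alt%d>%s)" % (i, alt) for i, alt in enumerate(alternatives)
    )
    compiled = re.compile(source)
    m = compiled.search(subject)
    if m is None:
        return None

    # Exactly one wrapper group participates in a successful match.
    index = next(
        i for i in range(len(alternatives)) if m.group("alt%d" % i) is not None
    )

    # Inner captures are the groups numbered between this wrapper
    # and the next wrapper (or the end of the pattern).
    start = compiled.groupindex["alt%d" % index]
    if index + 1 < len(alternatives):
        end = compiled.groupindex["alt%d" % (index + 1)]
    else:
        end = compiled.groups + 1
    inner = [m.group(g) for g in range(start + 1, end)]

    return [m.group(0), index] + inner


print(match_enumerated(ALTERNATIVES, "cccaaaddd"))
```

On the subject string from the report, this yields the matched text
"cccaaa", the index 2, and the inner capture "aaa", which corresponds to
the array shown in the quoted message. Whether that is what the reporter
intended is a guess.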

> this may be an attractive substitute (with identical output):
>
> ( aaa (?0)| bbb(d*) (?1)| ccc(a*) (?2)| ddd (?3))
>
> however, the numbers are now optional, which requires a dynamic typing scheme
> as in PHP. but it would also allow for:
>
> ( aaa (?"one")| bbb (?"two")| ccc (?"three")| ddd (?"four"))
>
> which means that certain patterns can be replaced by a canonical pattern.
>
> anyway, you can see where this is going.


No, I'm sorry, I can't. I have a feeling that this may relate to the
existing features for duplicate subpattern numbers or names, but I am not
sure. Are you familiar with those features?

In any event, this suggestion looks like a major non-Perl-compatible
change, which makes it unlikely to be attractive to PCRE developers.

Philip

--
Philip Hazel