[pcre-dev] [Bug 1336] Universal Character Name escape code

Author: Paulo Torrens
Date:  
To: pcre-dev
Subject: [pcre-dev] [Bug 1336] Universal Character Name escape code
------- You are receiving this mail because: -------
You are on the CC list for the bug.

http://bugs.exim.org/show_bug.cgi?id=1336




--- Comment #6 from Paulo Torrens <paulo_torrens@???> 2013-02-17 17:52:27 ---
This is a nice feature, but my problem is basically having to repeat the
universal character ranges across regexes and files. E.g.:

c.sah:
(...)
identifier:
/[_a-z$@`\xA0-\xD7FF\xE000-\xFFFF][\w$@`\xA0-\xD7FF\xE000-\xFFFF]*/i;
(...)


cpp.sah:
(...)
identifier:
/[_a-z$@`\xA0-\xD7FF\xE000-\xFFFF][\w$@`\xA0-\xD7FF\xE000-\xFFFF]*/i;
(...)


java.sah:
(...)
identifier:
/[_a-z$@`\xA0-\xD7FF\xE000-\xFFFF][\w$@`\xA0-\xD7FF\xE000-\xFFFF]*/i;
(...)


Aaaand so on.
And that's with the example not even allowing the \uXXXX escape code within
identifiers, which C/C++ accepts: "int abc\u1234def = 10;" is valid code.



That's why an escape code would help me with this problem. :)
I wanted to avoid repetition (DRY code).

I will probably use an implicit rule UCN (and EUCN or something like that for
escaped characters) that will expand to a complex regex which accepts what I
need...
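
Roughly, the expansion idea could be sketched like this. This is only an illustration in Python's re module, not PCRE and not the .sah syntax: the names UCN, EUCN and identifier_pattern are hypothetical, and the ranges are written in Python's \uXXXX notation. The point is that the ranges are defined once and every identifier rule is built from them.

```python
import re

# Hypothetical named fragments, defined once and reused by every rule.
# UCN: body of a character class covering the raw universal characters
# (U+00A0..U+D7FF and U+E000..U+FFFF, i.e. skipping the surrogates).
UCN = r"\u00A0-\uD7FF\uE000-\uFFFF"
# EUCN: the escaped spellings, \uXXXX or \UXXXXXXXX, as literal text.
EUCN = r"\\u[0-9A-Fa-f]{4}|\\U[0-9A-Fa-f]{8}"


def identifier_pattern(allow_escapes=False):
    """Build the identifier regex from the shared fragments."""
    # (?i:...) keeps the case-insensitivity local to the first character class.
    start = rf"(?i:[_a-z$@`{UCN}])"
    rest = rf"[\w$@`{UCN}]"
    if allow_escapes:
        # Optionally accept an escaped universal character at any position.
        start = rf"(?:{start}|{EUCN})"
        rest = rf"(?:{rest}|{EUCN})"
    return re.compile(rf"{start}{rest}*")


PLAIN = identifier_pattern()
WITH_ESCAPES = identifier_pattern(allow_escapes=True)
```

With this, PLAIN rejects "abc\u1234def" written with a literal backslash, while WITH_ESCAPES accepts it, and each language file would only need to refer to the shared rule.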

But thanks for your suggestion! =D


--
Configure bugmail: http://bugs.exim.org/userprefs.cgi?tab=email