I ran across something strange and wonder whether it is expected.
In the transcript below, the two REs do not behave the same way. The
intent of the REs is to match 1 or more characters in a class
consisting of decimal digits, lower case letters, and the closing
square bracket (']').
The two REs are: "[a-z\x5d0-9]+" and "\x5ba-z\x5d0-9]+" (the '['
starting the first RE is replaced by '\x5b' in the second RE).
The first RE works as expected, but the second RE, which begins with
'\x5b', does not: there the '\x5d' appears to terminate the character
class, whereas in the first RE it does not!
If it is expected, can someone explain the logic behind how these
escape sequences are supposed to work?
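
For comparison, here is a minimal sketch of the same two patterns run
through Python's re module. Python's re is not PCRE, so treat it only
as a stand-in; the names CLASS_RE, LITERAL_RE and first_match are made
up for this illustration. In Python, a hex escape always yields a
literal character, so '\x5b' never opens a class. The pcretest listing
for the second RE looks consistent with that reading, but I have not
confirmed that PCRE follows the same rule.

```python
import re

# First RE: a real character class; \x5d inside it is a literal ']'.
CLASS_RE = r"[a-z\x5d0-9]+"

# Second RE: \x5b is a literal '[', so no class is opened at all.
# In Python this reads as the literal text "[a-z]0-9" followed by one or more ']'.
LITERAL_RE = r"\x5ba-z\x5d0-9]+"


def first_match(pattern, subject):
    """Return the first matched substring, or None if there is no match."""
    m = re.search(pattern, subject)
    return m.group(0) if m else None


subject = "AA a02]23 "
print(first_match(CLASS_RE, subject))         # a02]23
print(first_match(LITERAL_RE, subject))       # None
print(first_match(LITERAL_RE, "[a-z]0-9]]"))  # [a-z]0-9]]
```

If PCRE does follow the same rule, the second RE can only match a
subject that literally contains "[a-z]0-9]", which would explain the
"No match" in the transcript.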
Thanks,
Jim
---
C:\Users\JimD\Downloads\pcretest\bin>pcretest -b
PCRE version 7.0 18-Dec-2006
re> "[a-z\x5d0-9]+"
------------------------------------------------------------------
0 37 Bra 0
3 [0-9\]a-z]+
37 37 Ket
40 End
------------------------------------------------------------------
data> "AA a02]23 "
0: a02]23
data> ^Z
C:\Users\JimD\Downloads\pcretest\bin>pcretest -b
PCRE version 7.0 18-Dec-2006
re> "\x5ba-z\x5d0-9]+"
------------------------------------------------------------------
0 21 Bra 0
3 [a-z]0-9
19 ]+
21 21 Ket
24 End
------------------------------------------------------------------
data> "AA a02]23 "
No match
data> ^Z
C:\Users\JimD\Downloads\pcretest\bin>