[Pcre-svn] [1001] code/trunk/doc/pcrepattern.3: Improve doc…

Author: Subversion repository
Date:  
To: pcre-svn
Subject: [Pcre-svn] [1001] code/trunk/doc/pcrepattern.3: Improve documentation of \c in EBCDIC mode.
Revision: 1001
          http://vcs.pcre.org/viewvc?view=rev&revision=1001
Author:   ph10
Date:     2012-08-08 11:18:25 +0100 (Wed, 08 Aug 2012)


Log Message:
-----------
Improve documentation of \c in EBCDIC mode.

Modified Paths:
--------------
    code/trunk/doc/pcrepattern.3


Modified: code/trunk/doc/pcrepattern.3
===================================================================
--- code/trunk/doc/pcrepattern.3    2012-08-08 09:38:49 UTC (rev 1000)
+++ code/trunk/doc/pcrepattern.3    2012-08-08 10:18:25 UTC (rev 1001)
@@ -246,15 +246,22 @@
   \ex{hhh..} character with hex code hhh.. (non-JavaScript mode)
   \euhhhh    character with hex code hhhh (JavaScript mode only)
 .sp
-The precise effect of \ecx is as follows: if x is a lower case letter, it
-is converted to upper case. Then bit 6 of the character (hex 40) is inverted.
-Thus \ecz becomes hex 1A (z is 7A), but \ec{ becomes hex 3B ({ is 7B), while
-\ec; becomes hex 7B (; is 3B). If the byte following \ec has a value greater
-than 127, a compile-time error occurs. This locks out non-ASCII characters in
-all modes. (When PCRE is compiled in EBCDIC mode, all byte values are valid. A
-lower case letter is converted to upper case, and then the 0xc0 bits are
-flipped.)
+The precise effect of \ecx on ASCII characters is as follows: if x is a lower
+case letter, it is converted to upper case. Then bit 6 of the character (hex
+40) is inverted. Thus \ecA to \ecZ become hex 01 to hex 1A (A is 41, Z is 5A),
+but \ec{ becomes hex 3B ({ is 7B), and \ec; becomes hex 7B (; is 3B). If the
+data item (byte or 16-bit value) following \ec has a value greater than 127, a
+compile-time error occurs. This locks out non-ASCII characters in all modes.
 .P
+The \ec facility was designed for use with ASCII characters, but with the
+extension to Unicode it is even less useful than it once was. It is, however,
+recognized when PCRE is compiled in EBCDIC mode, where data items are always
+bytes. In this mode, all values are valid after \ec. If the next character is a
+lower case letter, it is converted to upper case. Then the 0xc0 bits of the
+byte are inverted. Thus \ecA becomes hex 01, as in ASCII (A is C1), but because
+the EBCDIC letters are disjoint, \ecZ becomes hex 29 (Z is E9), and other 
+characters also generate different values.
+.P
 By default, after \ex, from zero to two hexadecimal digits are read (letters
 can be in upper or lower case). Any number of hexadecimal digits may appear
 between \ex{ and }, but the character code is constrained as follows:
@@ -2922,6 +2929,6 @@
 .rs
 .sp
 .nf
-Last updated: 10 July 2012
+Last updated: 08 August 2012
 Copyright (c) 1997-2012 University of Cambridge.
 .fi
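
The arithmetic described in the revised \c text can be checked with a short sketch. This is only an illustration of the documented rules, not PCRE's implementation: the function names ctrl_ascii and ctrl_ebcdic are hypothetical, and the EBCDIC lower-case ranges (81-89, 91-99, A2-A9) are an assumption based on common code pages such as 037 and 1047.

```python
def ctrl_ascii(ch: str) -> int:
    """Value produced by \\c followed by an ASCII character (sketch)."""
    code = ord(ch)
    if code > 127:
        # The documentation says this case is a compile-time error.
        raise ValueError("\\c must be followed by an ASCII character")
    if ord("a") <= code <= ord("z"):
        code -= 0x20          # convert lower case to upper case
    return code ^ 0x40        # invert bit 6 (hex 40)


# Assumed EBCDIC lower-case letter ranges (a-i, j-r, s-z).
_EBCDIC_LOWER = (range(0x81, 0x8A), range(0x91, 0x9A), range(0xA2, 0xAA))


def ctrl_ebcdic(byte: int) -> int:
    """Value produced by \\c followed by an EBCDIC byte (sketch)."""
    if not 0 <= byte <= 0xFF:
        raise ValueError("EBCDIC data items are single bytes")
    if any(byte in r for r in _EBCDIC_LOWER):
        byte += 0x40          # upper case sits 0x40 above lower case
    return byte ^ 0xC0        # invert the 0xc0 bits


if __name__ == "__main__":
    for ch in "AZz{;":
        print(f"ASCII  \\c{ch} -> hex {ctrl_ascii(ch):02X}")
    for name, byte in (("A", 0xC1), ("Z", 0xE9)):
        print(f"EBCDIC \\c{name} -> hex {ctrl_ebcdic(byte):02X}")
```

The EBCDIC function takes the byte value rather than a character, so the sketch does not depend on any particular codec being available.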