Re: [pcre-dev] [Bug 1295] add 32-bit library

Author: Thorsten Schöning
Date:  
To: PCRE Development Mailing List
Subject: Re: [pcre-dev] [Bug 1295] add 32-bit library
Hello Ze'ev Atlas,
on Sunday, 28 October 2012 at 21:35 you wrote:

> Wow, wow... stop it right there.  Back in the seventies, when we
> used such techniques, they were already considered IMPOLITE (or
> shall we say, downright wrong).


I'm not involved in any of the UTF-32 work, but I find the
discussion very interesting and would like to add my thoughts on the
problem: I agree with this statement. My feeling is that the only
real argument Christian has is that his data and scenario seem to
need or prefer masking/PCRE_NO_UTF32_CHECK, but this is only one
scenario, and as a library PCRE should focus on the main valid ones.

Let's look at character conversion functions, which I think are
comparable to pattern matching. They provide defaults under which
invalid input may be silently ignored or replaced with e.g. "?" or
something else. And what's the result? The internet and support
forums are full of encoding problems, because people simply didn't
know about different character sets, or that they need to know how
the text they use is encoded and pass this information to the
conversion routines. The usual outcome is garbage wherever the text
is finally presented.

Within a few years most of those functions were considered bad design
and were changed to at least optionally report an error on invalid
input, wherever it is possible to know which input is invalid.

http://msdn.microsoft.com/en-us/library/windows/desktop/dd374130(v=vs.85).aspx

With UTF-32, PCRE seems to be able to know which input is valid, and
from my point of view the default should always be to report an error
on any invalid input. If the functionality for
PCRE_UTF32_MASK_AND_NO_CHECK is currently available, provide it as a
compile-time option under this name, as the name itself may help new
users of the library to understand such problems.
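
To illustrate the difference between the two behaviours, here is a
minimal C sketch. It is not PCRE's code or API: the function names are
hypothetical, and the 21-bit mask (0x1FFFFF) is only my assumption of
what "masking" would mean. The validity rule itself is the Unicode
one: a UTF-32 code unit must be at most 0x10FFFF and must not be a
surrogate (0xD800 to 0xDFFF).

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not part of PCRE: a UTF-32 code unit is valid
   if it is at most 0x10FFFF and not in the surrogate range. */
static int utf32_unit_is_valid(uint32_t c)
{
    return c <= 0x10FFFFu && !(c >= 0xD800u && c <= 0xDFFFu);
}

/* Strict behaviour (the proposed default): report the index of the
   first invalid code unit, or -1 if the whole subject is valid. */
static ptrdiff_t utf32_first_invalid(const uint32_t *s, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        if (!utf32_unit_is_valid(s[i]))
            return (ptrdiff_t)i;
    }
    return -1;
}

/* Lenient behaviour (assumed meaning of "mask and no check"): drop
   the top 11 bits and do not validate the result. */
static uint32_t utf32_mask(uint32_t c)
{
    return c & 0x1FFFFFu;
}
```

For a subject such as { 0x48, 0x80000041, 0x49 }, the strict check
reports index 1, while masking silently turns the second unit into
0x41. Note that masking alone does not make input valid: a surrogate
such as 0xD800 passes through unchanged, which is why an explicit
option name matters.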

Kind regards,

Thorsten Schöning

-- 
Thorsten Schöning       E-Mail:Thorsten.Schoening@???
AM-SoFT IT-Systeme      http://www.AM-SoFT.de/


Phone.............05151- 9468- 55
Fax...............05151- 9468- 88
Mobile.............0178-8 9468- 04

AM-SoFT GmbH IT-Systeme, Brandenburger Str. 7c, 31789 Hameln
AG Hannover HRB 207 694 - Managing Director: Andreas Muchow