Thanks, Philip.
It's true that the empty character class will not match any character.
But shouldn't regular expressions containing empty character class(es) be rejected
by PCRE?
Thanks,
Mahendra Ladhe
--- On Tue, 4/11/08, Philip Hazel <ph10@???> wrote:
From: Philip Hazel <ph10@???>
Subject: Re: [pcre-dev] pcretest program accepts empty character class
To: "Mahendra Ladhe" <lml108@???>
Cc: pcre-dev@???
Date: Tuesday, 4 November, 2008, 12:34 PM
On Tue, 4 Nov 2008, Mahendra Ladhe wrote:
> Kindly confirm whether the following is a defect in the PCRE library.
> Here, I'm giving a regular expression that consists of an empty
> character class to the pcretest program, which accepts it without
> giving any error.
>
> mladhe@linux61:~/softwares/pcre-7.8/cmake] ./pcretest
> PCRE version 7.8 2008-09-05
>
> re> /[^\x00-\xff]/
> data> \x00
> No match
> data> \xff
> No match
> data> \x80
> No match
> data>
>
> It does not match any character as shown by a few examples above.
It is exactly compatible with Perl, which gives the same result. Of
course, Perl is permanently in Unicode mode these days, whereas PCRE can
be in either UTF-8 or non-UTF-8 mode, but I think that the above is
still the correct behaviour. It is not the only construct that always
fails (it wouldn't always fail in UTF-8 mode, of course, where
characters above \xff exist). Constructs like (?!) also always fail.
Philip
--
Philip Hazel
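
The behaviour described above can be sketched outside pcretest. The
following is only an illustration in Python, whose `re` module is not
PCRE; it is used here as a stand-in on the assumption that a bytes
pattern roughly corresponds to PCRE without UTF-8 and a str pattern to
PCRE in UTF-8 mode. The helper name `byte_class_matches_anything` is
made up for this example.

```python
import re

# Bytes pattern: every possible byte value 0x00-0xff is excluded, so the
# negated class can never match (assumed analogue of non-UTF-8 PCRE).
byte_class = re.compile(rb'[^\x00-\xff]')

# Text pattern: the same class can match code points above U+00FF
# (assumed analogue of PCRE in UTF-8 mode).
text_class = re.compile(r'[^\x00-\xff]')

# An empty negative lookahead is another construct that always fails.
never = re.compile(r'(?!)')


def byte_class_matches_anything():
    """Return True if any single byte value matches the bytes pattern."""
    return any(byte_class.search(bytes([b])) for b in range(256))


if __name__ == "__main__":
    print(byte_class_matches_anything())
    print(text_class.search('\u0100') is not None)
    print(never.search('abc') is not None)
```

All three patterns compile without error; whether they can ever match
depends only on the subject text, which is the point made in the reply.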
Date: Wed, 5 Nov 2008 09:13:30 +0000 (GMT)
From: Philip Hazel <ph10@???>
To: Mahendra Ladhe <lml108@???>
Cc: pcre-dev@???
Subject: Re: [pcre-dev] pcretest program accepts empty character class
On Wed, 5 Nov 2008, Mahendra Ladhe wrote:
> It's true that the empty character class will not match any character.
> But shouldn't regular expressions containing empty character class(es)
> be rejected by PCRE?
They are not rejected by Perl. PCRE tries to be Perl-compatible. So no,
I do not think they should be rejected by PCRE.
Philip
--
Philip Hazel