[pcre-dev] [Bug 978] Generalized support for alternate chara…

Author: Philip Hazel
Date:  
To: pcre-dev
Subject: [pcre-dev] [Bug 978] Generalized support for alternate character encodings

http://bugs.exim.org/show_bug.cgi?id=978




--- Comment #1 from Philip Hazel <ph10@???> 2010-04-18 17:42:56 ---
On Sun, 18 Apr 2010, Doug Cook wrote:

> One possible solution would be to define the input character sequence using a
> more object-oriented system.


...

> The potential disadvantages of this (other than implementation cost) would be
> an unknown performance impact (only way to determine would be to try it)


I suspect the performance impact would be large. Instead of just a
straightforward "load byte" (and a bit more, in UTF-8 mode, but still
done inline), your scheme uses an indirect function call for each
character. I know there are PCRE users to whom performance really does
matter, and I wouldn't like to degrade it.

Implementing this for testing would, perhaps, be fairly easy, because you
could just modify the GETCHAR etc. macros. Then you could prove or
disprove my assertion above.
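
A minimal sketch of the kind of experiment described above, for
illustration only. It does not reproduce PCRE's real GETCHAR
definitions; the names char_source, GETBYTE_INLINE and
GETBYTE_INDIRECT are hypothetical stand-ins. One macro expands to a
plain inline byte load, the other to an indirect function call per
character, so the same scanning loop can be built both ways and timed.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical character source: one indirect call per character.
   This is an assumption about how the proposed scheme might look. */
typedef struct char_source {
    int (*next)(struct char_source *self);  /* returns -1 at end of input */
    const unsigned char *p;
    const unsigned char *end;
} char_source;

/* Byte-at-a-time implementation of the hypothetical interface. */
static int byte_next(char_source *s) {
    return (s->p < s->end) ? *s->p++ : -1;
}

/* Inline variant: a plain load and pointer increment.
   Simplified stand-in, not PCRE's actual macro. */
#define GETBYTE_INLINE(c, ptr) ((c) = *(ptr)++)

/* Indirect variant: the same fetch routed through a function pointer. */
#define GETBYTE_INDIRECT(c, src) ((c) = (src)->next(src))

/* Count occurrences of target using the inline fetch. */
size_t count_inline(const unsigned char *subject, size_t len, int target) {
    const unsigned char *p = subject;
    const unsigned char *end = subject + len;
    size_t n = 0;
    int c;
    while (p < end) {
        GETBYTE_INLINE(c, p);
        if (c == target) n++;
    }
    return n;
}

/* Count occurrences of target using the indirect fetch. */
size_t count_indirect(const unsigned char *subject, size_t len, int target) {
    char_source src = { byte_next, subject, subject + len };
    size_t n = 0;
    int c;
    for (;;) {
        GETBYTE_INDIRECT(c, &src);
        if (c < 0) break;
        if (c == target) n++;
    }
    return n;
}

/* Both variants must agree before any timing comparison is meaningful. */
void check_same_result(const char *text, int target) {
    size_t len = strlen(text);
    assert(count_inline((const unsigned char *)text, len, target) ==
           count_indirect((const unsigned char *)text, len, target));
}
```

Timing the two count functions over a large subject would give a
rough idea of the per-character cost. Note that a compiler may inline
byte_next when it can see the target of the call, so a realistic test
would probably need the implementation in a separate translation unit.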

Philip

