Re: [pcre-dev] Ignoring a whole set of unicode characters

Author: Ze'ev Atlas
Date:
To: pcre-dev@exim.org
Subject: Re: [pcre-dev] Ignoring a whole set of unicode characters

Philip wrote:
>Final thought: It might be easier in the DFA matching function, because
>that moves along the subject character by character, without
>backtracking.
Thank you, Philip.

You are probably right that the DFA matching function is a better place to add such an option, because there it would perform no worse (and perhaps better) than writing, assuming conceptual POSIX-style classes,

    if ($a =~ /\b((?:[[:desired language consonant:]][[:desired language mark:]]*)+)\b/) {...}

to identify a 'valid' word in the desired language. However, the pattern above (I did not test it, so it might have some bugs) would not remove the undesired characters from the captured result.
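To make that limitation concrete, here is a minimal sketch in Python rather than Perl or PCRE. It assumes Hebrew as the "desired language" purely for illustration; the character ranges and the names `CONSONANT`, `MARK`, `WORD` and `first_word` are hypothetical and not part of any PCRE feature.

```python
import re

# Assumption: Hebrew stands in for the "desired language"; ranges are illustrative.
CONSONANT = "\u05D0-\u05EA"  # alef through tav
MARK = "\u0591-\u05BD\u05BF\u05C1\u05C2\u05C4\u05C5\u05C7"  # accents and vowel points

# One or more consonants, each optionally followed by marks. Lookarounds stand in
# for \b, since Python's re does not appear to count combining marks as \w.
WORD = re.compile(
    rf"(?<![{CONSONANT}{MARK}])((?:[{CONSONANT}][{MARK}]*)+)(?![{CONSONANT}{MARK}])"
)

def first_word(text):
    """Return the first matching word, marks included, or None."""
    m = WORD.search(text)
    return m.group(1) if m else None

pointed = "\u05E9\u05B8\u05C1\u05DC\u05D5\u05B9\u05DD"  # a pointed Hebrew word
captured = first_word("abc " + pointed + " def")
```

Here `captured` is the word exactly as it appears in the subject, so the marks are still inside it. The match tolerates them but does not drop them.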
At the end of the day, you are also right that I could do something like

    $a =~ s/[[:desired language mark:]]//g;

prior to the match, so I guess I should go with that type of solution, at least until the wishlist is fulfilled :)

Ze'ev Atlas
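For what it is worth, the strip-then-match workaround might look like the following minimal sketch, again in Python with Hebrew assumed as the example language. The ranges and the names `MARK`, `WORD` and `find_words` are hypothetical.

```python
import re

# Assumption: Hebrew is the example language; these ranges are illustrative only.
MARK = re.compile("[\u0591-\u05BD\u05BF\u05C1\u05C2\u05C4\u05C5\u05C7]")
WORD = re.compile("[\u05D0-\u05EA]+")  # runs of consonants, alef through tav

def find_words(text):
    """Delete every mark from a copy of the text, then collect consonant-only words."""
    stripped = MARK.sub("", text)  # re.sub replaces all occurrences, like s///g
    return WORD.findall(stripped)

pointed = "\u05E9\u05B8\u05C1\u05DC\u05D5\u05B9\u05DD"  # a pointed Hebrew word
words = find_words("abc " + pointed + " def")
```

One trade-off: any match offsets refer to the stripped copy, not to the original subject, which is part of why an option inside the matcher itself would still be nice to have.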
