[pcre-dev] Help in rewriting broken lookbehind assertion

Author: Thorsten Schöning
Date:  
To: pcre-dev
Subject: [pcre-dev] Help in rewriting broken lookbehind assertion
Hello,

In one of my applications I want to match all file names except those
with certain extensions. I already have similar regexes with
lookbehind assertions, but those were simpler and needed only one
top-level branch; the one I need now is a bit more complex. It doesn't
work as I expected because I use sub-level branches of variable
matching length, which is not permitted in lookbehind assertions.

The only solution I have found is to use a lookahead assertion and
negate the check for a match in my source code. Although this is easy
to accomplish, I would rather avoid it: all the other regexes in my
application are used as whitelists with lookbehind assertions, and for
the sake of consistency I would like to use this problematic one as a
whitelist, too.
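
For illustration, the negate-in-code workaround might look roughly like
the sketch below. It is written in Python rather than against the PCRE
API, and it simplifies the idea by dropping the assertion altogether:
the three alternatives from the lookbehind become a plain alternation
anchored at the end of the name, and the caller inverts the result. The
names SIG, BLACKLIST and is_allowed are made up for this example.

```python
import re

# Hypothetical helper names; only the alternatives come from the regex in
# this message. Outside a lookbehind, variable-length branches are allowed.
SIG = r"(?:p7m|pk7|pkcs7)"
BLACKLIST = re.compile(rf"(?i)(?:{SIG}\.{SIG}|zip|zip\.{SIG})$")


def is_allowed(filename: str) -> bool:
    """Return True when the name does NOT end in a blacklisted suffix."""
    # The negation lives in application code, not in the regex itself.
    return BLACKLIST.search(filename) is None
```

The drawback is the one described above: the pattern acts as a
blacklist, so it cannot be dropped into code that treats every regex as
a whitelist.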

The regex:

(?i)(?<!(?:(?:p7m|pk7|pkcs7)\.(?:p7m|pk7|pkcs7))|zip|zip\.(?:p7m|pk7|pkcs7))$

Some example filenames:

test.txt
test.txt.pk7
test.txt.p7m.pk7
test.txt.pkcs7.zip
test.txt.pkcs7.zip.p7m.pk7

The regex should match only the first and second file names.
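
One possible direction, sketched here in Python and not verified
against PCRE itself, is to write out every forbidden suffix as a
literal, so that each has a single fixed length, and to chain one
negative lookbehind per literal in front of the `$`. Python's `re`
module requires each lookbehind to be fixed-width, which is why the
sketch chains them; as far as I know, PCRE accepts top-level
alternatives of different fixed lengths, so there the literals could
probably be joined with `|` inside one `(?<!...)`. The names SIGS,
forbidden and WHITELIST are invented for the example.

```python
import re
from itertools import product

SIGS = ("p7m", "pk7", "pkcs7")

# Every forbidden suffix spelled out literally, so each has one fixed length.
forbidden = [f"{a}.{b}" for a, b in product(SIGS, SIGS)]
forbidden += ["zip"] + [f"zip.{s}" for s in SIGS]

# One negative lookbehind per literal; chained together they mean
# "none of these suffixes ends at this position".
lookbehinds = "".join(f"(?<!{re.escape(s)})" for s in forbidden)
WHITELIST = re.compile(f"(?i){lookbehinds}$")

names = [
    "test.txt",
    "test.txt.pk7",
    "test.txt.p7m.pk7",
    "test.txt.pkcs7.zip",
    "test.txt.pkcs7.zip.p7m.pk7",
]
matched = [n for n in names if WHITELIST.search(n)]
print(matched)  # ['test.txt', 'test.txt.pk7']
```

The expanded pattern is long, but it stays a whitelist: a match means
the file name is accepted, with no negation needed in the calling code.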

I would really appreciate it if some of you could have a look at the
regex and give me some hints on how I could rewrite it as a working
lookbehind assertion, or as something similar that is usable as a
whitelist. Thanks for any ideas!

Kind regards,

Thorsten Schöning

-- 
Thorsten Schöning       E-Mail:Thorsten.Schoening@???
AM-SoFT IT-Systeme      http://www.AM-SoFT.de/


Phone..............030-2 1001-310
Fax...............05151- 9468- 88
Mobile.............0178-8 9468- 04

AM-SoFT GmbH IT-Systeme, Brandenburger Str. 7c, 31789 Hameln
AG Hannover HRB 207 694 - Managing Director: Andreas Muchow