Re: [exim] "Bad words" filtering

From: Mike Cardwell
Date:  
To: Exim Users List
Subject: Re: [exim] "Bad words" filtering
Chris Bayliss wrote:

>>> We're currently using a custom-written SMTP server that filters "bad words".
>>>
>>
>> Cue the Scunthorpe problem.
>
> We used to filter along these lines a long time ago until better
> things were available. What amazed me was the number of surnames that
> fell foul of the filter, such as Wank, Cock and Cunther (all real
> examples).
>
> The other issue is that misspelling is a reasonably common way to
> evade filters. Once you try matching similar words, the problem of
> false positives gets worse.


I think you could cover most of the cases quite easily with a small
amount of effort in the regex creation, i.e. use word boundaries and
account for obvious obfuscation tricks and misspellings.

/\bw[a4]nk(s|z|[e30o]r[sz5]?)\b/
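
A quick way to sanity-check a pattern like this is to run it against a few sample strings. The sketch below is in Python rather than Exim's own configuration, assumes the first character class is meant to read [a4], and adds case-insensitive matching as an extra assumption. The function name is_flagged and the sample strings are made up for illustration.

```python
import re

# Assumed reading of the pattern: first character class closed as [a4].
# re.IGNORECASE is an addition for this sketch, not part of the original regex.
PATTERN = re.compile(r"\bw[a4]nk(s|z|[e30o]r[sz5]?)\b", re.IGNORECASE)

def is_flagged(text: str) -> bool:
    """Return True if the text contains a match for PATTERN."""
    return PATTERN.search(text) is not None

# Hypothetical sample inputs: obfuscated form, embedded substring,
# bare stem as a surname, and the plain word.
samples = ["you w4nk3rz", "a swanker boat", "Dear Mr Wank", "what a wanker"]
for s in samples:
    print(s, "->", is_flagged(s))
```

With this reading, the word boundary stops "swanker" from matching, and the bare stem "Wank" is not matched either, because the suffix group is mandatory.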

However, this doesn't get rid of the case where somebody might have a
swear word for a surname. That's when it might become a good idea to
star out the word, rather than block the entire email. Depends why
you're filtering though I suppose.

--
Mike Cardwell
IT Consultant .. http://cardwellit.com/