* On 23/10/06 21:16 -0200, Marlon Cabrera Oliveira wrote:
| Hi,
|
| > To tell you the truth I'm losing ground lately against spammers. Two
| > reasons. The Image spam is getting through and because it poisons the
| > bayes I've lost much of the effectiveness of bayes filtering. I'm still
| > holding on but I've had people who I hosted for for over a year who
| > never had a single spam who are now getting a few. I am also having a
| > few more false positives than I used to.
|
|
| I'm having success here detecting image spam using the OSBF-Lua filter.
|
| From the OSBF-Lua website:
|
| "OSBF-Lua (Orthogonal Sparse Bigrams with confidence Factor) is a Lua C module
| for text classification. It is a port of the OSBF classifier implemented in
| the CRM114 project. This implementation attempts to put focus on the
| classification task itself by using Lua as the scripting language, a powerful
| yet light-weight and fast language, which makes it easier to build and test
| more elaborated filters and training methods.
|
| The OSBF algorithm is a typical Bayesian classifier but enhanced with two
| techniques that I originally developed for the CRM114 project: Orthogonal
| Sparse Bigrams - OSB, for feature extraction, and the Exponential
| Differential Document Count - EDDC (a.k.a Confidence Factor) for automatic
| feature selection. Combined, these two techniques produce a highly accurate
| classifier. OSBF was developed focused on two classes, SPAM and NON-SPAM, so
| the performance for more than two classes may not be the same."
|
|
| OSBF-Lua learns very fast. It only requires Lua 5.1, with dynamic loading
| enabled, to be installed on the Exim server.
| See the install doc: http://osbf-lua.luaforge.net/#installation
|
|
| In exim.conf I added these statements:
|
| In the ## ON CONFIGURATION SETTINGS ## section:
|
| # set OSBF_LUA_DIR to where spamfilter.lua, spamfilter_command.lua etc. were
| # installed
| OSBF_LUA_DIR=/usr/local/osbf-lua
|
|
| In the ## TRANSPORTS CONFIGURATION ## section, add transport_filter to the
| local_delivery transport:
|
| local_delivery:
|   driver = appendfile
|   check_string = ""
|   create_directory
|   delivery_date_add
|   directory = ${home}/Maildir/
|   directory_mode = 700
|   envelope_to_add
|   return_path_add
|   group = mail
|   maildir_format
|   maildir_tag = ,S=$message_size
|   message_prefix = ""
|   message_suffix = ""
|   mode = 0600
|   quota = ${lookup{$local_part}lsearch*{/etc/mail/quota_usr}{$value} {4M}}
|   quota_size_regex = S=(\d+)$
|   quota_warn_threshold = 75%
|   transport_filter = OSBF_LUA_DIR/spamfilter.lua --udir $home/osbf-lua
|
|
| that's it!! :)
|
|
| Verify your setup by sending a message to yourself with the following in the
| subject line: help <your password>
|
| You will receive a message with help on the spam filter.
|
| To verify that the databases were created correctly, send: stats <your password>
|
| From now on, all messages that you receive will be classified and tagged
| according to the score they get:
|
| Tag Meaning
|
| [--] almost sure it's spam - score <= -20
|
| [-] probably spam (reinforcement zone) - score < 0 and > -20
|
| [+] probably it's not spam (reinforcement zone) - score >=0 and < 20
|
| [++] almost sure it's not spam - score >= 20. This tag is here just for
| symmetry, it's not used. An empty tag is used in place of it so as not to
| pollute the messages.
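The thresholds in the table above can be written down as a small function. This is only a sketch of the mapping as described in this message, not code taken from OSBF-Lua; the name tag_for_score is hypothetical.

```python
def tag_for_score(score: float) -> str:
    """Map a classifier score to the subject tag described in the table.

    Hypothetical helper: the thresholds come from the table in this
    message, not from the OSBF-Lua sources.
    """
    if score <= -20:
        return "[--]"  # almost sure it's spam
    if score < 0:
        return "[-]"   # probably spam (reinforcement zone)
    if score < 20:
        return "[+]"   # probably not spam (reinforcement zone)
    # score >= 20: "[++]" exists only for symmetry; an empty tag is used.
    return ""
```

Messages tagged [-] or [+] fall in the reinforcement zone, so those are the ones most worth checking and training on.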
|
|
| If the classification is wrong you must train the filter by resending the
| message to yourself, replacing the subject with the corresponding training
| command:
|
| learn <password> spam or learn <password> nonspam
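Preparing such a training message could be scripted along these lines. This is a minimal sketch using only Python's standard email package; make_training_message is a hypothetical helper, and the only thing taken from the description above is the subject format "learn <password> spam|nonspam". Sending the result back to yourself is left out.

```python
from email import message_from_string
from email.message import Message


def make_training_message(raw: str, password: str, label: str) -> Message:
    """Return the parsed message with its subject replaced by a training command.

    Hypothetical helper; `label` must be "spam" or "nonspam", matching
    the two training commands quoted above.
    """
    if label not in ("spam", "nonspam"):
        raise ValueError("label must be 'spam' or 'nonspam'")
    msg = message_from_string(raw)
    # Remove any existing Subject header(s) before setting the new one;
    # deleting a header that is absent is a no-op.
    del msg["Subject"]
    msg["Subject"] = f"learn {password} {label}"
    return msg
```

The body and the remaining headers are left untouched, so the filter sees the same content it misclassified.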
|
|
| After training on a few messages, OSBF-Lua's accuracy on spam detection
| will increase.
| If you have a database of pre-classified messages (nonspam / spam) in an IMAP
| folder, you can use the script toer.lua to do the training.
This doesn't look like a good solution to me. Ideally we would not accept
the message at all. Of course I know that's possible with Exim, but this
setup still leans towards SpamAssassin-ism: a transport filter only tags
mail that has already been accepted. If this could be integrated within
the Exiscan framework, then I'd rethink my stand.
cheers
- wash
+----------------------------------+-----------------------------------------+
Odhiambo Washington . WANANCHI ONLINE LTD (Nairobi, KE) |
wash () WANANCHI ! com . 1ere Etage, Loita Hse, Loita St., |
GSM: (+254) 722 743 223 . # 10286, 00100 NAIROBI |
GSM: (+254) 733 744 121 . (+254) 020 313 985 - 9 |
+---------------------------------+------------------------------------------+
"Oh My God! They killed init! You Bastards!"
--from a /. post