I was looking for a spam filter for my Exchange server. I had great luck with SpamAssassin on another box (just regular SMTP), and luckily I found two great resources today:
How To Use SpamAssassin on Win32: This is a fantastic example of someone documenting something they know how to do, and documenting it well.
It’s a thorough body of information, written by someone who has been doing it for a long time. Everything is covered, including odd permutations, bugs, warnings, and dependencies.
Exchange SpamAssassin Sink: This event sink fires on every inbound message, writes it to a file, sics SpamAssassin on it, and parses the result. It can just add headers to the message (allowing client filtering), or it can toss it altogether.
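To make the "run SpamAssassin, then parse the result" step concrete, here is a minimal Python sketch. It is not how the sink itself is implemented. It assumes a `spamassassin` command on the PATH that reads a message on stdin and writes the marked-up message to stdout, and it reads the `X-Spam-Status` header SpamAssassin adds. The function names and the `discard_at` cutoff are hypothetical, made up for illustration.

```python
import re
import subprocess
from email import message_from_string

# Matches e.g. "Yes, score=12.3 required=5.0 ..." (older releases wrote "hits=").
_STATUS_RE = re.compile(
    r"^(?P<flag>Yes|No),\s*(?:score|hits)=(?P<score>-?\d+(?:\.\d+)?)"
    r"\s+required=(?P<required>-?\d+(?:\.\d+)?)"
)


def run_spamassassin(raw_message, command=("spamassassin",)):
    """Pipe a raw message through SpamAssassin and return the rewritten message.

    Assumption: `command` names a SpamAssassin executable reachable on PATH.
    """
    result = subprocess.run(
        list(command), input=raw_message, capture_output=True, text=True, check=True
    )
    return result.stdout


def parse_spam_status(raw_message):
    """Return the verdict from X-Spam-Status, or None if the header is absent."""
    header = message_from_string(raw_message).get("X-Spam-Status")
    if header is None:
        return None
    header = " ".join(header.split())  # unfold a header wrapped over several lines
    match = _STATUS_RE.match(header)
    if match is None:
        return None
    return {
        "is_spam": match.group("flag") == "Yes",
        "score": float(match.group("score")),
        "required": float(match.group("required")),
    }


def decide(status, discard_at=10.0):
    """Hypothetical policy: deliver ham, tag spam, discard very high scores."""
    if status is None or not status["is_spam"]:
        return "deliver"
    return "discard" if status["score"] >= discard_at else "tag"
```

In use, `decide(parse_spam_status(run_spamassassin(raw)))` would give one of "deliver", "tag", or "discard" for a raw message string. The "tag" case corresponds to just leaving the headers in place for client-side filtering.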
I installed this whole solution in about two hours this afternoon, including tuning and fiddling. It’s currently filtering away like crazy: 50% of inbound email is being flagged as spam right now, and I know I can still lower the score threshold quite a bit.
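For a feel of what lowering the threshold does: SpamAssassin flags a message once its score reaches the `required_score` setting (called `required_hits` in older releases, 5.0 by default). The sketch below counts how many messages would be flagged at a few thresholds. The scores are invented for illustration, not taken from my logs, and `flagged_fraction` is a hypothetical helper.

```python
def flagged_fraction(scores, threshold):
    """Fraction of messages whose score meets or exceeds the threshold."""
    if not scores:
        return 0.0
    return sum(1 for score in scores if score >= threshold) / len(scores)


# Invented scores, purely for illustration.
sample_scores = [0.1, 1.8, 3.2, 4.4, 4.9, 5.6, 7.3, 12.0]

for threshold in (5.0, 4.0, 3.0):
    share = flagged_fraction(sample_scores, threshold)
    print(f"threshold {threshold}: {share:.1%} flagged")
```

Running something like this over the scores of real saved mail, before changing the setting, would show how much borderline ham a lower threshold starts to catch.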
On a well-powered Windows Server 2003 machine, it’s taking at most one second to filter each email. (It’s probably less, but the logs don’t show fractions of a second. Suffice it to say that no email has taken more than one second to process.) Remember, however, that none of the network tests (Razor, Pyzor, etc.) work on Windows, and they’re what tended to add all the processing time.
What’s nice about this setup is that it saves all email in “Ham” and “Spam” folders. While this is obviously a bit of a privacy risk, it also allows you to save up thousands of good and bad emails and then train SpamAssassin’s Bayesian filter on them (it even includes a BAT file to do that in one click). My understanding is that SpamAssassin gets scary-good when you have a well-trained Bayesian database behind it.
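I haven't reproduced the bundled BAT file here, but training presumably comes down to pointing SpamAssassin's `sa-learn` tool at each folder with its `--ham` and `--spam` options. A rough Python sketch of that idea follows. It assumes `sa-learn` is on the PATH, and the folder paths and function names are hypothetical.

```python
import subprocess


def build_training_commands(ham_dir, spam_dir, sa_learn="sa-learn"):
    """Build one sa-learn invocation per folder of saved messages."""
    return [
        [sa_learn, "--ham", ham_dir],    # messages known to be legitimate
        [sa_learn, "--spam", spam_dir],  # messages known to be spam
    ]


def train(ham_dir, spam_dir):
    """Run both training passes. Assumption: sa-learn is installed and on PATH."""
    for command in build_training_commands(ham_dir, spam_dir):
        subprocess.run(command, check=True)
```

Calling `train("Ham", "Spam")` from the directory holding the two folders would feed both sets to the Bayesian database. The messages need to be correctly sorted first, since a wrongly filed message teaches the filter the wrong thing.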