[lug] Jafo's writeup on Spam filtering and BLUG archives

Sat Sep 11 15:54:11 MDT 2004

On Sat, Sep 11, 2004 at 11:54:59AM -0500, Michael J. Hammel wrote:
> I *reeeeally* need to get a spam filter set up.  I've managed to live
> without one for a long time, but I'm really tired of scanning all those
> messages every day for the one or two that have any real value.
> 
> Sean put up a really good write up recently on a web site about his
> experiences with various filtering systems.  I think the message came
> through the LUG.  But when I tried to access the September archives I
> get strange errors - like the mime-type is forcing me to download the
> page and then when it tries to download it the page isn't there.
> 
> Can someone point me to Sean's write up?  I think it's in one of his
> blogs on "codeslinger" or similar.

Several of the ideas in Sean's write-up are covered in much greater
detail in this paper:

http://slett.net/spam-filtering-for-mx/

which I have posted here before.  Sorry for the repeat, but it sounds
like you need it.  Of course everything in this paper assumes you have
control over your mail server.  

I have also had good results with greylisting (I think an 80% reduction
is about what I've seen too).  It was very easy to set up using Tor's
greylistd package (packaged for Debian, but it will work with anything
that runs Python 2.3 and Exim 4).  I set it up in an hour or so, and I
don't have much mail server experience.  It is a bit of a pain to set up
on Debian Woody, because you have to get backports of both Python and
Exim.  Any newer distribution will be easy.
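
For anyone who hasn't run into greylisting: the server temporarily
rejects the first delivery attempt from an unknown (client IP, envelope
sender, recipient) triplet and accepts it once the sending host retries
after a delay.  Real mail servers retry; a lot of spam software doesn't.
Here is a minimal Python sketch of that decision.  It is not greylistd's
actual code; the Greylist class, the 600-second delay, and the in-memory
dict are all assumptions made up for illustration.

```python
import time


class Greylist:
    """Toy in-memory greylist keyed on (client IP, sender, recipient)."""

    def __init__(self, retry_min=600, clock=time.time):
        self.retry_min = retry_min  # seconds a new triplet must wait (assumed value)
        self.clock = clock          # injectable so the logic can be tested
        self.first_seen = {}        # triplet -> time of the first attempt

    def check(self, client_ip, sender, recipient):
        """Return "defer" (answer with a temporary 4xx error) or "accept"."""
        triplet = (client_ip, sender.lower(), recipient.lower())
        now = self.clock()
        # Record the first attempt; later attempts keep the original timestamp.
        first = self.first_seen.setdefault(triplet, now)
        if now - first < self.retry_min:
            return "defer"
        return "accept"
```

A real setup would also expire old entries and whitelist hosts that have
already passed once, which this sketch leaves out.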

I don't have any other spam prevention method running, and greylisting
has cut my spam down to fewer than 10 per day, so I haven't bothered to
add anything else.  Still, I'm curious what would be the second most
effective strategy after greylisting?  SpamAssassin?  DNS blacklists?
Filtering on invalid SMTP (Exim does some of that by default though)?
It seems like my spam level keeps creeping up, like the graph on Sean's
page, so eventually I'm sure I'll have to add more filtering.
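
As a rough illustration of the DNS blacklist option: a DNSBL lookup is
just a DNS query for the client's IPv4 address with its octets reversed,
prepended to the blacklist's zone.  If the name resolves, the address is
listed.  The sketch below assumes a made-up zone, dnsbl.example.org, and
the function names are mine, not from any particular package.

```python
import ipaddress
import socket


def dnsbl_query_name(ip, zone):
    """Build the name to look up, e.g. 192.0.2.99 -> 99.2.0.192.<zone>."""
    octets = str(ipaddress.IPv4Address(ip)).split(".")  # raises ValueError if not IPv4
    return ".".join(reversed(octets)) + "." + zone


def is_listed(ip, zone="dnsbl.example.org"):
    """Return True if the DNSBL zone has a record for this address.

    The default zone is a placeholder, not a real blacklist.
    """
    try:
        socket.gethostbyname(dnsbl_query_name(ip, zone))
        return True
    except socket.gaierror:
        # Any resolution failure is treated as "not listed" in this sketch.
        return False
```

In practice the MTA does this lookup itself at SMTP time, so code like
this is only useful for poking at a blacklist by hand.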

One idea I ran by Tor was this:  assuming you have blacklists and all
the other techniques he recommends set up, where are the remaining spams
coming from?  I'm guessing a lot of them are coming from open relays
that have been discovered by a spammer for such a short time that they
haven't gotten on the blacklists yet.  I randomly tested a couple of
non-bounce spams that got through my greylisting, and they both came
from open relays.  An obvious idea is to somehow test incoming messages
from unknown MXes to see whether they're coming from an open relay, and
bounce the message if they are.  Tor thought this would be difficult
with Exim without new features; I'm a novice, so I don't have a clue.
Anyone heard of such a thing?  Does it already exist?
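
To make the idea concrete, here is a rough Python sketch of the test I
have in mind: connect back to the sending host and see whether it
accepts mail from one outside address to another.  This is only my guess
at an approach, not an existing Exim feature.  The relay_probe function
and the probe addresses are hypothetical, and the session argument is
assumed to behave like a connected smtplib.SMTP object.

```python
def relay_probe(session, probe_from, probe_to):
    """Return True if the server agrees to relay between two foreign addresses.

    `session` is assumed to offer helo(), mail(), rcpt() and rset() methods
    that each return a (code, message) pair, as smtplib.SMTP does.
    Both addresses must belong to domains the probed server is not
    responsible for, otherwise acceptance proves nothing.
    """
    session.helo()
    code, _ = session.mail(probe_from)
    if code != 250:
        return False
    code, _ = session.rcpt(probe_to)
    session.rset()  # abandon the transaction; no message is ever sent
    # 250 = accepted, 251 = "user not local; will forward".
    return code in (250, 251)
```

It could be called with something like
relay_probe(smtplib.SMTP(client_ip, 25, timeout=10), ...), and the
results would need caching so each unknown host is only probed once.
Some servers accept the recipient and reject later, so an accepted RCPT
is a hint rather than proof.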

The project below seems to be what I'm thinking of, for Postfix, but I
don't understand exactly how it works:

http://www.zonque.org/projects/grinch/

Daniel