Killing Spam(mers)

I’m writing this ex post facto, so when I say things like “I did this,” it actually happened months ago.  I’m just now recounting it.  Note that this document assumes sendmail.  You’re running sendmail as your MTA, right?  No?  Well go back to the beginning and try again.  So… you’re running sendmail as your MTA, right?  No? …

Spammers Can Fuck Off and Die in a Fire

No one likes email spam, and everyone on the Internet has to deal with it.  If spammers could figure out a way to deliver it to people who don’t have email addresses, they’d do it.  They’re the scum of the Internet, but unfortunately it’s still illegal to shoot them.

Back in my earlier days at AOL, one of the Mail Operations guys filled out a Purchase Request form and tacked it up outside his office.  The contents of the PR:

  • 1 white unmarked van
  • 4 black ninja suits and hoods
  • 4 baseball bats

The “project” this PR belonged to: Spam Fighting.  This would have been a great catharsis, but for obvious reasons the PR was never submitted.  Until the government is willing to suspend habeas corpus on spammers, the only realistic way to deal with them is to prevent their email tripe from getting through.

SpamAssassin

I run and have run my own email server for a bunch of years.  Long before GMail was even a twinkle in Google’s eye.  I do that mainly so that I can have my own email address at my own domain, but more importantly so that I can have full control of what doesn’t come in, what does come in, and where it goes if it makes it in.  Unfortunately, somewhere along the way, I slipped up and published my email address or it was stolen from (or sold by) one of the vendors I do business with.  And that’s all she wrote.  The spam started, and has never stopped.

SpamAssassin is a useful tool for combating spam once it gets into your local server.  It’s not perfect, and in fact it’s pretty easy for spammers to skate through it.  But it’s an easy first line of defense to set up, and I recommend doing that.  For FreeBSD users:

pkg install procmail
pkg install spamassassin

Procmail was installed so that sendmail could use it to send messages to the spamassassin daemon.  Once done, I had to tell sendmail to call procmail for each incoming message.  The file /etc/mail/hostname.mc was edited to add:

FEATURE(local_procmail)dnl
MAILER(procmail)dnl

And then the command

make install

was run in the /etc/mail directory.  A new sendmail.cf was generated.  I restarted sendmail with:

service sendmail restart

And at that point, sendmail was calling procmail.  Without a ~/.procmailrc file, procmail would just deliver everything to my user’s spool file.  But the point behind having procmail installed was to have it send things to SpamAssassin.  So I had to get the daemon running.  In /etc/rc.conf:

spamd_enable="YES"

And then from the CLI:

service sa-spamd start

With the daemon running, it was time to send messages to it.  I’ll include small clips of my ~/.procmailrc and attempt to explain them.

#!/bin/sh
MAILDIR=$HOME/Mail
DEFAULT=$MAILDIR/spam
SPOOL=/var/mail/myuser
SPR=9876543210

The idea here was to set up some interesting variables that would be used later. The $SPR variable might seem the most curious; it’s there to allow me to group a bunch of Subject: lines together, or a bunch of From: lines together, and then do the same thing with them. You’ll see that a bit further in the file.

:0fw
| /usr/local/bin/spamc

Send everything that procmail sees (which is everything sendmail sees for this user) to the SpamAssassin daemon for checking. The SpamAssassin daemon will determine whether it thinks a message is spam based on various heuristics. It’ll mark the message with a Spam level, using *s. And if it thinks a message is spam (default is 5 *s or more), it’ll change the Subject: of the message to prepend SPAM to it.

I decided that messages with 8 *s or more weren’t even worth looking at. They go straight to /dev/null. In the same .procmailrc file:

#
# Anything 8 stars or more is trashed
:0
* ^X-Spam-Level: \*\*\*\*\*\*\*\*.*
/dev/null
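
As a rough illustration of what that recipe matches on, here is a small sh sketch that counts the stars in a message’s X-Spam-Level: header and compares the count with a threshold.  The function names spam_stars and has_min_stars are made up for this example; procmail does the real work with the regex above.

```sh
#!/bin/sh
# Print the number of asterisks in the first X-Spam-Level: header of a
# message read from stdin (prints 0 if the header is absent).
spam_stars() {
    awk '
        /^$/ { exit }                      # headers end at the first blank line
        /^X-Spam-Level:/ {
            print gsub(/\*/, "*")          # gsub returns the number of matches
            found = 1
            exit
        }
        END { if (!found) print 0 }
    '
}

# Succeed (exit status 0) when the message on stdin has at least $1 stars.
# The default of 8 mirrors the /dev/null rule above.
has_min_stars() {
    [ "$(spam_stars)" -ge "${1:-8}" ]
}
```

Something like `has_min_stars 8 < some-message && echo trash` (with some-message being any saved mail file) is a handy way to check what the recipe would do with a given message.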

After running and training SpamAssassin for a while, I found that messages were still slinking through it. I decided to collect up a group of messages via their Subject: lines or their From: lines, and just trash them to /dev/null. This is where the $SPR variable came into play. For instance:

# Group the From: and Sender: lines that will end up in /dev/null.
:0
* $ $SPR^0 ^From:.*bad_sender_1.*
* $ $SPR^0 ^From:.*bad_sender_2.*
* $ $SPR^0 ^From:.*bad_sender_3.*
* $ $SPR^0 ^From:.*bad_sender_4.*
/dev/null

The basic idea with these entries is that if any one of them matches, the entire rule is true and the message ends up in /dev/null. Like a giant logical or statement. I did the same thing with common Subject: lines and patterns, sending those to /dev/null as well.
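
To see that logical or in isolation, here is a shell sketch that mimics it with grep.  The trick in the procmail version, as its scoring documentation (procmailsc(5)) appears to describe it, is that a weight as large as $SPR pushes the score to its maximum on the first match, so the remaining conditions no longer matter.  The function name matches_any is hypothetical, and grep’s regex dialect is only an approximation of procmail’s.

```sh
#!/bin/sh
# matches_any PATTERN...: succeed if any extended-regex PATTERN matches a
# line of the header text on stdin. -i mirrors procmail's default of
# case-insensitive matching.
matches_any() {
    headers=$(cat)
    for pattern in "$@"; do
        if printf '%s\n' "$headers" | grep -Eiq -- "$pattern"; then
            return 0          # first hit wins, like the saturated score
        fi
    done
    return 1                  # no pattern matched
}
```

Feeding it a message’s headers along with a handful of ^From: patterns gives the same yes/no answer the recipe acts on.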

# Group things that are probably spam but need further examination
:0
* $ $SPR^0 ^X-Spam-Status: .*BAYES_99.*
* $ $SPR^0 ^X-Spam-Status: .*BAYES_95.*
* $ $SPR^0 ^X-Spam-Status: Yes
* $ $SPR^0 ^Subject:.*\[SPAM\].*
$DEFAULT

I used $SPR here as well to group together things that are probably spam but deserve a second look.  They’ll end up in my ~/Mail/spam mbox file.

Training SpamAssassin

If you receive a bunch of spam on a daily basis, it makes sense to train SpamAssassin.  I did this by creating a new mbox file called ~/Mail/is-spam, which the IMAP daemon on joker immediately saw and made public to my IMAP client (Thunderbird, Apple Mail, et al).  If a message made it by SpamAssassin and ended up in my in-box, I would just move it to the is-spam folder.  Then on joker, I set up a crontab for my user that looked like this:

MAILTO=""
0,10,20,30,40,50 * * * * /usr/local/bin/sa-learn --spam --mbox ~/Mail/is-spam

Meaning every 10 minutes, read the is-spam mbox file and learn from it.  SpamAssassin is smart enough to know if it’s already seen and learned from a mail message, so it’ll ignore repeated runs on the same set of messages and only learn from new ones.  I would (and still do) routinely delete the messages in the is-spam mbox file.
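
Since the is-spam file is often empty right after a cleanup, the cron job could go through a tiny wrapper that skips the run when there is nothing to learn from.  This is only a sketch: the function name learn_spam and the SA_LEARN override are assumptions made for the example, while the sa-learn flags are the ones from the crontab above.

```sh
#!/bin/sh
# learn_spam MBOX: feed MBOX to sa-learn only when it exists and is non-empty.
# SA_LEARN may be overridden (e.g. SA_LEARN=echo for a dry run); the default
# is the path used in the crontab.
learn_spam() {
    mbox=$1
    if [ ! -s "$mbox" ]; then
        return 0          # nothing to learn from, skip quietly
    fi
    "${SA_LEARN:-/usr/local/bin/sa-learn}" --spam --mbox "$mbox"
}
```

Saved to a script that ends with `learn_spam "$HOME/Mail/is-spam"`, it can replace the direct sa-learn call in the crontab line.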

SpamAssassin: Not Good Enough

I’m not done with this entry.  In fact, it’s going to go on for quite a bit more, so if you need to take a bio break, go get another cup of coffee, or both: go do that now.

Now that you’re back: SpamAssassin is too weak a filter to stop most messages these days.  When I rebuilt joker as a FreeBSD box as noted in this entry, I somehow managed to completely muck up the Bayes database that I’d been building up for years.  But no matter how hard I tried to re-train it, spammers were still getting through.  Massively.  As in nearly 200, sometimes 300 messages a day.  It almost had me convinced to get rid of the same email address I’d been using for a couple of decades.

Before throwing in the towel and moving all of my email to GMail (yuck!) I did a lot of searching around on the IntardWebz to see what solutions other than SpamAssassin existed.  It was through this search that I learned of things called sendmail milters.  I hunted around for spam-fighting milters and tried a few of them out.  None of them worked until I found:

My Hero: Spamilter and, by Proxy, Neal Horman

I somehow stumbled upon the project called spamilter by Neal Horman.  It’s a sendmail milter written completely in C (very tight C code, too) that checks a bunch of things before the remote MTA is even allowed to send the SMTP “DATA” command to submit the body of the message.  It checks things like incoming IP addresses against known blacklists (spamhaus, spamcop, et al), whether the hostname in the HELO resolves properly, etc.  If any of these checks fails, it tells sendmail to immediately shut the door on the incoming MTA.
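
For anyone curious what a blacklist check looks like on the wire, the generic DNSBL convention is to reverse the octets of the connecting IPv4 address, append the list’s zone, and do a DNS lookup on the result; an answer means “listed.”  The sketch below only builds that name.  Whether spamilter does exactly this internally is an assumption, and dnsbl_name is a made-up helper.

```sh
#!/bin/sh
# dnsbl_name IPV4 ZONE: print the hostname a DNSBL client would look up,
# i.e. the address with its octets reversed, followed by the list's zone.
# Prints nothing if the first argument is not four dot-separated fields.
dnsbl_name() {
    echo "$1" | awk -F. -v zone="$2" \
        'NF == 4 { print $4 "." $3 "." $2 "." $1 "." zone }'
}
```

For example, `host "$(dnsbl_name 127.0.0.2 zen.spamhaus.org)"` looks up 2.0.0.127.zen.spamhaus.org; 127.0.0.2 is the address DNSBLs conventionally keep listed for testing.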

I initially built spamilter using the existing /usr/ports included with FreeBSD 10.1:

cd /usr/ports/mail/spamilter
make
make install

Getting it running took a few steps.  The documentation included with the port was a bit sparse, but Neal’s website helped.  A /usr/local/etc/spamilter/spamilter.conf file was needed, and the defaults included were a good starting point.  The one entry for:

PolicyUrl = http://www.somedomain.com/policy.html

Needed to be changed to match my website.  I used the policy.html file included with the port and just put it in the appropriate directory for the apache24 daemon to pick it up.

Next to change: the /etc/syslog.conf.  Spamilter writes its logs to two files by default: /var/log/spam.log and /var/log/spam.info:

# Needed for spamilter
!Spamilter
*.=info /var/log/spam.log
*.<>info /var/log/spam.info

And in /etc/newsyslog.conf:

/var/log/spam.log                       644  30     *    $D0   Z
/var/log/spam.info                      644  30     *    $D0   Z

Edit /etc/rc.conf, of course:

# Spamilter
spamilter_enable="YES"

Restart syslog, and start spamilter:

service syslogd restart
service spamilter start

And finally: tell sendmail to use the milter by editing the /etc/mail/hostname.mc file.  Add this up near the first few options:

INPUT_MAIL_FILTER(`spamilter', `S=inet:7726@127.0.0.1, F=T, T=C:30s;R:4m;S:30s;E:30s')

Remake the sendmail.cf and restart sendmail:

make install
service sendmail restart

Spamilter’s Ammo: /var/db/spamilter

After a “make install” of spamilter, a directory called /var/db/spamilter should have been created.  If not, create it and fill it with the example db.* files that are included in the port.  This is spamilter’s primary ammo.  These files are read with every incoming message, so edits to them are live the moment you save them.  The file that I edit the most is /var/db/spamilter/db.sndr.

I do this mainly because of the immense number of new TLDs that have been created.  I say this not as a Verisign employee, but as a receiver of spam faked through those new domains: Fuck.  You.  IANA.  Seriously.  Whoever within IANA (and ICANN) decided to significantly expand that list of TLDs needs to go play in traffic.

That file is also helpful to whitelist domain entries for people who don’t know how to do email properly.  For instance, a couple of my banks really don’t know how to set up their DNS or MTA configurations properly.  Better said: whichever company they hired to do all of that, does it poorly.  So if my-bank.com has broken DNS or a fucked up MTA, I can put a whitelist entry in the db.sndr file:

# My-Bank sucks
.my-bank.com                      |               |Accept

Then those stupid TLDs. This is dangerous because it means I’ll never be able to receive email from any of them. But I’m OK with that. If you’re not emailing me from a .com, .net, .org, or other traditional TLD… go get a real domain. If you don’t like that, well, I don’t want your email. Capiche?

.click                  |               |Reject
.website                        |               |Reject
.xyg                    |               |Reject
.xyz                    |               |Reject
[lots clipped]
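
Once that file grows, it helps to be able to list what is actually in it.  The sketch below assumes nothing more about db.sndr than what the entries above show: “#” comments and three “|”-separated fields with the action last.  The helper name list_by_action is hypothetical, and this is not how spamilter itself parses the file.

```sh
#!/bin/sh
# list_by_action FILE ACTION: print the first field of every non-comment
# line in FILE whose third |-separated field equals ACTION.
list_by_action() {
    awk -F'|' -v want="$2" '
        /^[ \t]*#/ { next }                # skip comment lines
        /^[ \t]*$/ { next }                # skip blank lines
        {
            # trim leading and trailing blanks from every field
            for (i = 1; i <= NF; i++) gsub(/^[ \t]+|[ \t]+$/, "", $i)
            if ($3 == want) print $1
        }
    ' "$1"
}
```

Running `list_by_action /var/db/spamilter/db.sndr Reject` prints every pattern that is currently being turned away, and the same call with Accept shows the whitelist.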

No IPv6?

After getting the milter running, I was shocked at the number of incoming attempts that were instantly closed by sendmail and the milter.  None of those messages ever got to SpamAssassin.  (Of note: SpamAssassin also has a milter, which I tried.  It… doesn’t work well.)  But there was a new problem: incoming IPv6 connections were being closed.  The milter didn’t grok IPv6 MTA connections, and Google (via GMail) has IPv6 MTAs.  So my friends using GMail couldn’t send me mail.

Doh!

This is when I made a new friend.  I found Neal’s contact information in the source code and sent him some email asking about IPv6.  He admitted that he hadn’t much experience with coding to it, but it was on his mind.  And within fairly short order, he had a patch submitted to handle incoming IPv6 MTAs.  I could once again receive GMail-sourced mail.

Filtering Before Sendmail

Neal’s work with spamilter includes a daemon called ipfwmtad.  It will talk to FreeBSD’s local ipfw (or Linux’s iptables) to put in timed incoming port 25 blocks for certain MTAs.  When spamilter rejects a connection from an MTA, it’ll tell ipfwmtad to add another block.

Since my joker VM was using pf for its packet filtering, I’d disabled ipfw completely.  I figured I had no real need for it.  But, as it turns out, both can comfortably co-exist on a FreeBSD box without getting in each other’s way.  I wanted to make sure that ipfw started in “open” mode because I wanted it to pass everything it saw.  In /etc/rc.conf:

# Firewall ... ipfw for use with spamilter
firewall_enable="YES"
firewall_type="open"
ipfwmtad_enable="YES"

And then in /usr/local/etc/spamilter/spamilter.conf:

MtaHostIpfw = 1

Fire everything up:

service ipfw start
service spamilter restart

By default, ipfwmtad will insert rule #90 into ipfw and add to that. Assuming everything is set up properly, you’ll see rule #90 growing and eventually being pruned as older entries time out. Check that with:

ipfw list 90
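
To watch that growth over time, a small filter can count how many lines of ipfw’s listing belong to a given rule number.  This is a sketch under one assumption about the output: that each line starts with the rule number, zero-padded (00090).  The helper name count_rule is made up.

```sh
#!/bin/sh
# count_rule N: count the lines on stdin whose first field is rule number N.
# Rule numbers are compared numerically, so 00090 and 90 are the same rule.
count_rule() {
    awk -v n="$1" \
        '$1 ~ /^[0-9]+$/ && $1 + 0 == n + 0 { c++ } END { print c + 0 }'
}
```

With that in place, `ipfw list | count_rule 90` gives a single number that can be logged or graphed.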

Up to Date Code

Neal was keeping his up-to-date code on SourceForge.  Some time in May, 2015, he moved it all over to GitHub.  The latest and greatest code can easily be snagged with:

git clone https://github.com/nkhorman/spamilter.git

It’s far more up to date than what’s available in the /usr/ports directory for FreeBSD, or in the spamilter pkg that also exists.  To build it, you’ll want the FreeBSD “src” package installed as well, because spamilter assumes sendmail’s source code is in /usr/src/contrib/sendmail.
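
Before kicking off a build from the git checkout, it is worth confirming that the sendmail source is where spamilter expects it.  A minimal sketch, with have_sendmail_src being a hypothetical helper name and the default path taken from the paragraph above:

```sh
#!/bin/sh
# have_sendmail_src [DIR]: succeed if DIR exists and is a non-empty
# directory. DIR defaults to the location spamilter is said to assume.
have_sendmail_src() {
    dir=${1:-/usr/src/contrib/sendmail}
    [ -d "$dir" ] && [ -n "$(ls -A "$dir" 2>/dev/null)" ]
}
```

Something like `have_sendmail_src || echo "install the FreeBSD src package first"` catches the missing-source case before the compiler does.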

More Coming

There’s more configuration and customization I’ve done with spamilter, and with Neal’s help.  I’ll write another entry when I get a chance, which will focus on ipfwmtad, having it run on a different machine than spamilter, and a few other topics.  But the summary is: at this point spam rarely gets by.  Maybe once or twice a week I’ll receive a piece of spam that made it through the milter and somehow fooled SpamAssassin.  But out of the nearly 200 messages I receive a day, that’s not a bad average.
