Technology Options for Fighting Spam

Monday, October 13, 2003
Posted by: Charles Oriez

AITP shares the view of many that technology works better than legislation at fighting spam. Because spam crosses national borders so easily, legislation confined to one country, or to one state within a country, faces a steep uphill climb to be an effective tool against it.

There are many options that you can choose among to help you avoid spam.

Spam fighting is not formulaic. The set of blacklists and other options I use for a small non-profit in Colorado without fear of false positives may not be the combination that is right for your company. For instance, blocking all traffic from mainland China will work for me, with fairly high confidence that I will only be blocking spam. If you have a major customer or supplier in Hong Kong, you cannot make that same decision with that same expectation. If too much spam is getting through, or too much good mail is getting blocked, adjustments will have to be made quickly.

Blacklists

A DNSBL is a DNS-based blocking list created to limit the acceptance of spam. The MAPS RBL was the first, but has since been supplanted by others. Sendmail, which seems to be the most popular mail server software, permits the ISP or company to add a DNSBL with a FEATURE line in the configuration file, which might look like this:

FEATURE(`dnsbl', `bl.spamcop.net', `"550 Spam blocked!"')

Every piece of e-mail hitting your server carries the IP address of the server that is attempting to hand off the mail to you. In this example we compare that address to a database on the remote spamcop.net server. If the address is listed on spamcop as a spam source, we refuse the e-mail. To check the address against other databases, we simply add additional FEATURE statements to our configuration file.
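
The lookup behind such a FEATURE line can be sketched in a few lines of Python. This is only an illustration of the general DNSBL convention (reverse the octets of the connecting address, append the list's zone, and ask DNS whether that name exists), not what sendmail literally runs. The function names are made up for this sketch, and the zone name bl.spamcop.net is an assumption about how the spamcop list is published.

```python
import ipaddress
import socket


def dnsbl_query_name(ip: str, zone: str) -> str:
    """Build the DNS name a DNSBL client looks up for an IPv4 address."""
    # Validate the address, then reverse its octets: 192.0.2.99 -> 99.2.0.192
    octets = str(ipaddress.IPv4Address(ip)).split(".")
    return ".".join(reversed(octets)) + "." + zone


def is_listed(ip: str, zone: str = "bl.spamcop.net") -> bool:
    """Return True if the zone has a record for this address.

    Needs network access. Any lookup failure, including a timeout,
    is treated as "not listed" in this sketch.
    """
    try:
        socket.gethostbyname(dnsbl_query_name(ip, zone))
        return True
    except socket.gaierror:
        return False
```

Checking several lists is then just a loop over zone names, which mirrors adding more FEATURE statements to the configuration file.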

There are well over 400 different DNSBLs in existence today. Most of them, including many of the most effective ones, are free to use. The database providers' reasoning is that the more people use them, the more effective the providers will be in putting pressure on spam-friendly areas of the Net to clean up their act, making their own spam-fighting job easier.

But if you decide to go in this direction, which DNSBL(s) do you use to the best effect? That, of course, is up to your own business needs.

The single most effective thing any ISP can do to reduce the amount of spam reaching its customers is to block all traffic coming through Asian servers, in particular those in China, Taiwan and Korea. This works, of course, only if your company is not doing any business in those countries. If you have a major presence in Korea, for instance, blocking traffic from there is probably not a viable option. Blackholes.us is your best option for country-specific blocking.

Spamcop, which I used in my example, tracks current spam. They use an algorithm that compares total traffic originating on a server to the amount of reported spam originating on that server. When a threshold value is hit, the server is listed. The listing remains only until the spam stops coming, and potentially as long as 48 hours longer. I usually recommend to clients using DNSBLs to block spam that they query spamcop first. They are at http://spamcop.net/.
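
The listing rule described above can be sketched roughly as follows. The actual threshold and formula are not given here, so the 2 percent figure, the function names, and the handling of zero traffic are all assumptions made for illustration. Only the 48-hour upper bound comes from the description above.

```python
from datetime import datetime, timedelta

# Hypothetical value: the real threshold is not stated in this article.
LISTING_THRESHOLD = 0.02
# Upper bound mentioned in the text: up to 48 hours after the spam stops.
GRACE_PERIOD = timedelta(hours=48)


def should_list(total_messages: int, reported_spam: int,
                threshold: float = LISTING_THRESHOLD) -> bool:
    """List a server once reported spam reaches a set share of its traffic."""
    if total_messages <= 0:
        # No traffic observed apart from reports: any report counts (assumption).
        return reported_spam > 0
    return reported_spam / total_messages >= threshold


def listing_expired(last_report: datetime, now: datetime,
                    grace: timedelta = GRACE_PERIOD) -> bool:
    """A listing lapses once no new reports have arrived for the grace period."""
    return now - last_report > grace
```

The point of the ratio is that a busy server with a handful of reports stays off the list, while a small server sending mostly spam is listed quickly.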

SPEWS has spam traps scattered over the Internet. Spam comes in, they complain, and the Internet provider has an opportunity to act on or ignore the complaint. One key to SPEWS' success is that nothing in the complaint alerts the ISP that the complaint is coming from SPEWS. Ideally, an ISP should treat every complaint it gets as coming from SPEWS, and act equally promptly on all of them. However, if the ISP chooses not to act, a small part of the ISP's address space covering the spammer's operations gets listed on SPEWS. If the ISP still fails to act, progressively larger parts of the ISP's space get listed. Eventually, the listing gets large enough that someone decides it is time to act on the complaints. Judging from complaints from spammers, spam-friendly providers and legitimate customers who get caught in the expanding blocks, this list seems to be particularly popular and effective. SPEWS is at http://spews.org.
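
The escalation idea can be illustrated with Python's ipaddress module. How SPEWS actually chooses its blocks is not described here, so doubling the listed range at each step is purely an assumption, as are the function name and the private addresses used in the example.

```python
import ipaddress


def escalate(block: ipaddress.IPv4Network,
             limit: ipaddress.IPv4Network) -> ipaddress.IPv4Network:
    """Widen a listed block by one prefix bit, never past the ISP's allocation."""
    if block.prefixlen <= limit.prefixlen:
        return limit
    return block.supernet(prefixlen_diff=1)


# Hypothetical ISP allocation and spammer range, using private address space.
isp = ipaddress.ip_network("10.20.0.0/16")
listed = ipaddress.ip_network("10.20.30.0/24")

steps = [listed]
while steps[-1] != isp:
    steps.append(escalate(steps[-1], isp))
```

Each ignored complaint moves the listing one step along this sequence, so more and more of the ISP's other customers find their mail refused.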

Spamhaus works more or less like SPEWS, with one key difference. Spamhaus expands its blocks by adding the corporate servers of the spam-friendly ISP to the list. Then, instead of customers complaining that mail isn't getting delivered, it is the sales force complaining that mail isn't being delivered. Spamhaus is at http://www.spamhaus.org/.

There are also a large number of lists that focus on specific security holes that spammers exploit to steal third-party resources to send their spam. ORDB and DSBL both look for open relays. They can be found at http://www.ordb.org/ and http://dsbl.org/. Another favorite security hole for spammers is open proxies. Monkeys.com's open proxy list at http://www.monkeys.com/ seems best for identifying those. And finally, there used to be a problem with insecure formmail scripts. SORBS http://dnsbl.sorbs.net/ provides a list of those.

A reasonably complete list of all free DNSBLs is available at http://moensted.dk/spam/. I recommend the Osirusoft suite of DNSBLs, augmented perhaps by country-specific lists provided by blackholes.us if you are confident that you will not be getting legitimate mail from the countries in question. In addition to the far eastern countries already mentioned, consider blocking Argentina and Brazil.

Since this was written, both Osirusoft and SPEWS have been subject to denial of service attacks, presumably orchestrated by spammers upset at the effectiveness of these tools in detecting and blocking spam. Osirusoft has permanently shut down. The SPEWS query engine remains reachable via mirrors at SORBS and Reynolds.

Procmail

Procmail is not for the faint of heart. Because of the potential for problems, few ISPs permit their customers to use it. I'm lucky, in that my ISP has it installed on its server and lets me use it to filter my mail. However, learning to use it and coding scripts without mistakes is my responsibility rather than theirs. And more than once I have unintentionally deleted mail and had to contact correspondents to ask them to retransmit their mail.

After my ISP has run mail through its global filters and decided to accept it, they route my domain's mail to either my mailbox or my users' mailboxes. Procmail then looks at the headers of the e-mail and decides whether to accept it or not, depending on my own criteria.

The advantage of Procmail is that this filtering decision happens at the server level, rather than at my desktop. I don't spend the time downloading something only to delete it later. My ISP saves the spool space that it would otherwise use holding the mail until I download it.

Procmail uses regular expressions to determine whether the mail should be rejected or not. Below are some examples with explanations. The lines starting with a # are comment lines only, to remind me of why a particular entry is in my script, and when I entered it.

# mpinet 06.17.03 216.53.128.0/17
:0
* ^Received.*216\.53\.12[89]\.[0-9]|\
^Received.*216\.53\.1[3-9][0-9]\.[0-9]|\
^Received.*216\.53\.2[0-5][0-9]\.[0-9]
{
  EXITCODE=77
  LOG="mpinet - "

  :0
  /dev/null
}

The example above rejects any mail coming from a server in the mpinet region of the Internet. The EXITCODE=77 returns a "permission denied" message to the sender, while the /dev/null line deletes the mail from the server. The LOG entry puts information in my reject log so that I know which script bounced the mail.
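
The three regular expressions in that recipe together spell out one address range, 216.53.128.0/17. For readers who want to check such a pattern, the same test can be written in Python with the standard email and ipaddress modules. This is a sketch for verification only, not part of the Procmail setup, and the function name and sample headers are invented for the example.

```python
import ipaddress
import re
from email import message_from_string

MPINET = ipaddress.ip_network("216.53.128.0/17")
# Loose pattern for anything that looks like a dotted-quad address.
DOTTED_QUAD = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")


def received_from_block(raw_message: str, block=MPINET) -> bool:
    """True if any Received header mentions an address inside the block."""
    msg = message_from_string(raw_message)
    for received in msg.get_all("Received", []):
        for candidate in DOTTED_QUAD.findall(received):
            try:
                if ipaddress.ip_address(candidate) in block:
                    return True
            except ValueError:
                # Looked like an address but was not one (e.g. 999.1.1.1).
                continue
    return False


sample = (
    "Received: from mail.example.net (unknown [216.53.200.14])\n"
    "\tby mx.example.org; Tue, 17 Jun 2003 10:00:00 -0600\n"
    "Subject: hello\n"
    "\n"
    "body\n"
)
```

Working from the CIDR range instead of hand-built digit patterns makes it harder to leave a gap in the range, which is the kind of mistake that deletes good mail or lets spam through.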

# hotmail DAV exploit 6/8/03
:0
* ^Received.*hotmail\.com with DAV
{
  EXITCODE=77
  LOG="DAV - "

  :0
  /dev/null
}

This one blocks anything coming from hotmail that uses the DAV function to transmit spam. This hotmail security problem was only recently discovered, and will probably be fixed. However, almost all spam coming with a hotmail return address, except for those with forged addresses, is using this security hole. Good information on using procmail, including some standard recipes, can be found at http://www.uwasa.fi/~ts/info/proctips.html, while the source code, for any ISP who wants to install it on a server, can be found at http://www.procmail.org/.

Challenge-Response systems

Challenge-Response systems force senders of e-mail to prove they are human, with real addresses, rather than spammers using forged addresses. Most of your legitimate e-mail comes from someone you know. They may even be in your address book, but most likely you have already exchanged mail with them. When mail comes from a known address, it is automatically passed through to your inbox. However, when mail comes from an unknown person, it doesn't immediately show up in your inbox. In that case, the traffic is held on the server while a message goes back to the sender, asking them to confirm that they are real. They have to respond to that challenge before your ISP's system lets the original mail through to your inbox. Earthlink is implementing just such a procedure on its systems now.

It appears to be a great idea, but it has problems. Sometimes mail comes from a system that is not spam, but isn't coming from a person. Recently, I renewed my Denver Broncos season tickets on-line. As part of the process, I got an automated message from them confirming the purchase and providing various other information such as my priority number, when to expect the tickets in the mail, and instructions on getting a parking or transit pass. That did not come from a real person, and a challenge to it would not have been read and responded to by a real person. Similarly, when a Denver AITP member makes a reservation for dinner he might get parking or speaker information. In those cases, the person receiving the confirmation message can pre-approve the sender, but only if he knows what address the sender is using. This also poses problems for administrators of legitimate opt-in mailing lists. "They can get pretty overwhelming, is a nice polite way of putting it," said David Farber, a former Federal Communications Commission chief technologist who runs a 25,000-member list on technology.

Ironically, the vendor of one challenge-response system (Mail-block.com) used spam to promote its system and got blacklisted. As a result, the challenge-response messages issued by its system were rejected as spam on systems using DNSBLs to manage their inboxes.
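
The basic challenge-response flow described in this section can be sketched as a small Python class. This is a toy model, not how Earthlink or any vendor implements it. The class and method names are hypothetical, and sending the actual challenge message is left out.

```python
from dataclasses import dataclass, field


@dataclass
class ChallengeResponseBox:
    known_senders: set = field(default_factory=set)
    held: dict = field(default_factory=dict)   # sender -> messages awaiting confirmation
    inbox: list = field(default_factory=list)

    def receive(self, sender: str, message: str) -> str:
        """Deliver mail from known senders; hold everything else."""
        sender = sender.lower()
        if sender in self.known_senders:
            self.inbox.append(message)
            return "delivered"
        # A real system would send the challenge e-mail at this point.
        self.held.setdefault(sender, []).append(message)
        return "challenged"

    def confirm(self, sender: str) -> int:
        """Sender answered the challenge (or was pre-approved by the user)."""
        sender = sender.lower()
        self.known_senders.add(sender)
        released = self.held.pop(sender, [])
        self.inbox.extend(released)
        return len(released)
```

The automated-sender problem shows up directly in this model: a ticketing system or mailing list never calls confirm, so its mail stays in held unless the recipient pre-approves the exact sending address.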

Bayesian filters

Spam has certain patterns to it. If you receive mail discussing Viagra, it is almost certainly spam if you are not a doctor or pharmacist. Similarly, mail whose subject lines contain lots of capital letters or exclamation points is probably spam. Mail that claims to comply with S1618 is guaranteed to be spam. Mail mentioning certain African countries and bank accounts, certain parts of the body, or for that matter some anti-virus software, all will usually be spam. Bayesian filters detect these key words and add to a score. When the score gets large enough, the mail is rejected or tagged as spam.

This of course presents certain problems. When I did a spam workshop for Mile High AITP, the syllabus that I sent to the program chair scored too high. People discussing the Super Bowl ran into problems once Super Bowl XXX happened, and Amnesty International once had a problem when they reported a massacre of "over 21" people somewhere in Africa. The problem is that certain spammish words can occur quite naturally in any legitimate e-mail communication.
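
The keyword scoring described above, and the false positives that come with it, can be sketched in Python. Note that this is simple additive scoring with fixed weights. A true Bayesian filter would instead learn a probability for each word from samples of spam and good mail. The patterns, weights and threshold below are hypothetical and chosen only for illustration.

```python
import re

# Hypothetical rule weights, for illustration only.
RULES = {
    r"\bviagra\b": 3.0,
    r"\bxxx\b": 2.5,
    r"\bover 21\b": 2.0,
    r"!{2,}": 1.0,
}
THRESHOLD = 5.0


def spam_score(text: str) -> float:
    """Add up the weight of every rule that matches the text."""
    lowered = text.lower()
    return sum(weight for pattern, weight in RULES.items()
               if re.search(pattern, lowered))


def is_probable_spam(text: str) -> bool:
    return spam_score(text) >= THRESHOLD


# A perfectly legitimate invitation that still crosses the threshold.
false_positive = "Super Bowl XXX party, over 21 only!!"
```

Under these made-up weights the party invitation is flagged just as readily as a real advertisement, which is why the score is better used to tag mail than to delete it.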

Popular Bayesian filter software includes SpamAssassin (found at http://spamassassin.org/) and Spamnix (http://spamnix.com/). Mile High AITP's ISP uses SpamAssassin on their server. I use Spamnix as a plug-in to Eudora on my desktop. And while I have pointed out the problems with this type of filter, they have proven to be fairly accurate. Not perfectly accurate though, so I use them to tag spam rather than to delete it. Tagging is a process by which either a mail server or a client identifies mail as probable spam and inserts a keyword identifier, either in the headers or in the body of the mail. SpamAssassin inserts **JUNK** in the subject line and adds some extra headers with its analysis. Spamnix puts a block of analysis at the bottom of the mail. The recipient then routes anything identified as possible spam by either application to a special junk mail folder for further review.

These applications don't just use Bayesian logic though. They will also query many of the standard DNSBLs. If a message was sent through an open relay or a compromised proxy server, the score reflects that. A complete list of the tests used and default scores assigned by both of those applications can be found at spamassassin.org.
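
Tagging, as opposed to deleting, can be sketched with Python's standard email module. The **JUNK** marker follows the description in this section. The X-Spam-Score header name and the function name are hypothetical and are not the headers SpamAssassin or Spamnix actually write.

```python
from email import message_from_string


def tag_as_spam(raw_message: str, score: float, marker: str = "**JUNK**") -> str:
    """Mark a message as probable spam without deleting it."""
    msg = message_from_string(raw_message)
    subject = msg.get("Subject", "")
    if not subject.startswith(marker):
        # Deleting a header that is absent is harmless.
        del msg["Subject"]
        msg["Subject"] = f"{marker} {subject}".strip()
    # Hypothetical header name; replace any earlier value so tagging is repeatable.
    del msg["X-Spam-Score"]
    msg["X-Spam-Score"] = f"{score:.1f}"
    return msg.as_string()
```

A mail client rule that files anything whose subject starts with the marker into a junk folder completes the picture, and a wrongly tagged message is still there to be rescued.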

Poetry

Since this was written, CEO Ann Mitchell departed from Habeas, citing irreconcilable differences with her board over the future direction of the company. Whether the direction chosen by the new leadership of the company will preserve its usefulness as a spam-fighting resource remains to be seen.

One interesting tool being experimented with right now, which might bear some promise, is a haiku-based system developed by Habeas. Founded by the former attorney for the MAPS Realtime Blacklist, this system uses copyright law, rather than anti-spam law, to fight spammers.

If you are a legitimate sender of bulk mail, which is to say someone who uses confirmed opt-in to build your lists, you contract with Habeas to include their haiku in your headers. This haiku is a method of sender warranting that the e-mail is not spam, so its presence lets your mail pass SpamAssassin and other filters. However, if a spammer inserts the haiku without signing the contract, or without having a confirmed opt-in process, then they have infringed on the Habeas copyright. Penalties under DMCA for copyright infringement are substantially higher than the penalties for violating various state anti-spam laws.

This is an innovative combination of law and technology that may be worth watching, even though it relies on the court system to go after spammers, which presents many of the same objections that we have to other legislative and legal solutions.

More information can be found at http://www.habeas.com/. It may be just the thing for legitimate senders of bulk mail to use in order to avoid the filters that currently route them into my spam folder.

Charles Oriez has an MS-CIS from the University of Denver and writes and speaks on e-mail issues in the Denver area.

Looking for more information on Spam? Check out the next two articles in this series, "Complaining About Spam 101" and "Spam in the Courtroom," which will be published in the November/December 2003 issue of Information Executive.

