Lorenzo Peraldo
Chapter 1
Introduction
The abuse of electronic messaging to send unauthorized and inappropriate bulk messages is commonly
called spamming. Spam is now widespread across many media, including instant messaging, web search
engines, blogs, forums and even mobile phone messaging, but the most widely recognized and common
form of spam is certainly e-mail spam.
E-mail spam, also known as unsolicited bulk e-mail (UBE) or unsolicited commercial e-mail (UCE),
consists of sending e-mail messages, usually with commercial content, in large quantities to an
indiscriminate set of recipients. E-mail spamming began in the early days of the internet and grew
exponentially over the following years; today spam accounts for 80-85% of all e-mail messages in
the world. One reason the volume of spam has risen every year is that spamming costs spammers
nothing. They can therefore manage huge mailing lists without any operating costs and keep adding
more and more recipients for their bulk advertising. Advertising messages are the most common, but
lately other kinds of spam, such as messages with political or religious purposes, have also begun
to circulate.
Although spamming costs spammers nothing, its effects are devastating in terms of the consumption
of computer and network resources and of human attention and time. Moreover, it imposes a high direct
cost on companies and internet service providers that want to fight spam, as well as indirect costs
borne by the victims of spam, such as financial theft, identity theft, data and intellectual property
theft, fraud, and the viruses and other malware infections that often accompany spam messages.
Even though sending junk e-mail has been prohibited since the beginning of the internet by the
Terms of Service/Acceptable Use Policies (ToS/AUP) of internet service providers, many countries
have adopted permissive laws rather than tough laws against spam, notably the US (with the
CAN-SPAM Act of 2003), while other countries, such as Australia and the member states of the
European Union, have passed stricter anti-spam laws. As a result, statistics show that most spam
e-mail today originates in the USA, while Australia, for example, has dropped down this negative
ranking since its tough anti-spam laws came into force.
Chapter 2
Spam
In order to find a solution to the problem of spam, it is very important to define what is actually
considered spam and how spammers exploit weaknesses of networks to send it.
wasn’t a problem yet, but as spam from these insecure resources grew, DNSBL operators started
listing their IP addresses in order to block spam coming from them.
Partly for this reason, since 2003 spammers, rather than searching the global network for exploitable
services, have been creating services of their own by commissioning computer viruses designed to
deploy proxies and other spam-sending tools on thousands of end-user computers. Virus-infected
computers serve spammers not only as spamming tools that send spam messages, thus acting as proxies,
but also as a means of perpetrating distributed denial-of-service attacks. To fight spam, many
anti-spam techniques have been implemented, with varying degrees of success, but as the years go by
spammers keep finding new methods to circumvent these techniques.
Chapter 3
Anti-spam techniques
To prevent e-mail spam, various anti-spam techniques are used by both end users and e-mail system
administrators. Depending on who carries them out, they can be divided into four categories:
end-user techniques, which require action by individual users; automated techniques for e-mail
administrators, which can be automated and implemented directly on proxies or MTAs; automated
techniques for e-mail senders, which are implemented on end users’ computers, possibly embedded
in products or software; and techniques for researchers and law enforcement officials. None of
these techniques is a complete and definitive solution to the problem of spam, as each involves a
trade-off between failing to block some spam and rejecting legitimate messages, along with the
associated costs in time and effort.
are delivered. Some e-mail servers may decide to reject all messages coming from certain countries
they never expect to communicate with; they therefore use country-based filtering, in which the
country of origin of an e-mail is determined from the sender’s IP address. DNSBLs, or DNS-based
Blackhole Lists, are used very often. These lists, published via the DNS, list sites known to emit
spam, open mail relays or proxies, or ISPs known to support spam, so that mail servers can easily
reject mail from those sources. Other DNS-based anti-spam systems instead use whitelisting and mark
IPs, domains or URLs as good (white). Some mail administrators also reduce spam by setting
restrictions on the MTA, for example by enforcing the technical requirements of SMTP and blocking
mail coming from systems that do not comply with the RFC standards. For example, simple HELO/EHLO
checking can reduce spam significantly.
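To make the DNSBL mechanism concrete, here is a minimal sketch of how the name for such a lookup is
usually built: the octets of the IPv4 address are reversed and the list's zone is appended. The
function name and the zone dnsbl.example.org are hypothetical and used only for illustration; the
sketch builds the name but performs no DNS query.

```python
import ipaddress


def dnsbl_query_name(ip: str, zone: str) -> str:
    """Build the DNS name that is looked up to test an IPv4 address against a DNSBL zone."""
    address = ipaddress.IPv4Address(ip)  # raises ValueError on malformed input
    reversed_octets = ".".join(reversed(str(address).split(".")))
    return f"{reversed_octets}.{zone}"


if __name__ == "__main__":
    # "dnsbl.example.org" is a hypothetical zone, not a real blocklist.
    print(dnsbl_query_name("192.0.2.99", "dnsbl.example.org"))
```

A mail server would then resolve this name as an A record: an answer in 127.0.0.0/8 usually means
the address is listed, while a "no such name" answer means it is not.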
The PTR DNS records in the reverse DNS can be used for several purposes. For example, most e-mail
MTAs use FCrDNS (forward-confirmed reverse DNS) verification and, if there is a valid domain name,
put it into the Received: trace header field. Some MTAs perform FCrDNS verification on the domain
name given in the SMTP HELO and EHLO commands, but in this case e-mail is not rejected by default.
PTR DNS records may also be used to check whether the domain names in the rDNS are likely to belong
to dial-up users, dynamically assigned addresses, or home-based broadband customers. Finally, a
forward-confirmed reverse DNS verification provides a weak form of authentication, showing that
there is a valid relationship between the owner of a domain name and the owner of the network that
has been given an IP address. Although this authentication is weak, it can be strong enough to be
used for whitelisting purposes, because spammers and phishers usually cannot bypass this
verification when they use zombie computers to forge the domains.
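The FCrDNS check described above can be sketched as follows: look up the PTR names of the connecting
IP, resolve each name forward again, and keep only the names that map back to the same IP. This is a
simplified illustration, not the procedure of any particular MTA; the function names are
hypothetical, and the resolvers are passed in as parameters so the logic can be tried without
network access.

```python
import socket
from typing import Callable, List


def reverse_lookup(ip: str) -> List[str]:
    """Return the PTR names for an IP address (empty list if there are none)."""
    try:
        name, aliases, _ = socket.gethostbyaddr(ip)
    except OSError:
        return []
    return [name, *aliases]


def forward_lookup(name: str) -> List[str]:
    """Return the IPv4 addresses a host name resolves to (empty list on failure)."""
    try:
        _, _, addresses = socket.gethostbyname_ex(name)
    except OSError:
        return []
    return addresses


def fcrdns_names(
    ip: str,
    reverse: Callable[[str], List[str]] = reverse_lookup,
    forward: Callable[[str], List[str]] = forward_lookup,
) -> List[str]:
    """Return the PTR names of `ip` whose forward lookup includes `ip` again."""
    return [name for name in reverse(ip) if ip in forward(name)]
```

An empty result means the address has no forward-confirmed name; a non-empty result gives the
names that could be recorded in the Received: header or matched against a whitelist.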
not stolen and check the Spamhaus Project ROKSO list before accepting new customers. One thing
spammers always try to exploit is the difficulty of implementing opt-in mailing lists properly. To
avoid this, it is very important that mailing lists use confirmed opt-in instead, so that an address
is never added to a mailing list until the owner of that address confirms the opt-in. This point is
very important because it underlies anti-spam techniques and blacklists such as those implemented
by Spamhaus. Firewalls and routers can also be useful in combating spam; they can, for example, be
programmed to stop SMTP traffic (port 25) from machines that are not supposed to send e-mail. Since
home users may also be blocked when an ISP does this, e-mail can still be sent from those computers
through port 587. All port 25 traffic can also be intercepted by a NAT (Network Address Translator)
and redirected to a mail server for checks such as rate limiting.
Contributions from e-mail users are always welcome in the fight against spam. SpamCop, for
example, gathers spam reports from users and, by monitoring these reports, ISPs can learn of problems
before their mail servers are blacklisted.
Chapter 4
Honeypots
A honeypot is a trap set to detect, deflect or in some manner counteract attempts at unauthorized
use of information systems. It is disguised as something containing valuable information or
resources in order to attract attackers, in our case spammers. Honeypots are assigned unused IP
addresses and have no production value, so all the traffic they see can be presumed malicious or
unauthorized; in particular, all the traffic passing through honeypots designed to thwart spam can
be treated as illicit. Honeypots’ IP addresses are usually hidden so that no ordinary user can find
them, but they can be collected by address harvesting techniques and so end up on spammers’ mailing
lists.
Honeypots can be classified according to two factors. Based on deployment, we can distinguish:
• production honeypots, which are easy to use, are mainly used to improve the security of an
organization, and capture only limited information about attacks and attackers;
The second classification is based on the level of involvement of the honeypot. We can distinguish the
following categories:
• low-interaction honeypots, such as honeyd, a GPL-licensed daemon that works by emulating
computers on the unused IP addresses of a network and provides simple functionality;
• mwcollect and nepenthes, used to collect autonomously spreading malware and obtain the
malware binaries without being infected (as everything is done in a virtualized environment);
• honeytraps, which create port listeners based on TCP connection attempts to monitor traffic
and handle some unknown attacks;
• high-interaction honeypots, called honeynets, which are networks of real systems containing
several honeypots.
Having seen these classifications and types of honeypots, let us concentrate on what interests us
most: spam honeypots. These honeypots are designed to masquerade as abusable resources such as open
mail relays and open proxies, which are very attractive to attackers, in order to discover the
activities of spammers. Honeypots provide very important functionality. Not only do they block
spam, but they also make it possible to determine the source of the attack and to capture the spam
in bulk; the captured spam can then be analysed to determine the URLs and response mechanisms used
by spammers. For example, an open relay honeypot can deceive spammers by determining the e-mail
addresses (dropboxes) they use as targets for their test messages and by delivering any illicit
relay e-mail addressed to such a dropbox address, so that the spammer believes the honeypot is a
real abusable open relay. Since the introduction of honeypots as anti-spam tools, spammers have
started using chains of abused systems to send spam, to make detection of the actual source more
difficult. One merit of honeypots is therefore certainly that they have made abuse harder and
riskier for spammers.
Many non-profit organizations started using honeypots and spamtraps not only to block the large
amount of spam passing through or directly addressed to their honeypots, but also to analyse spam
messages and their senders. In doing so they were able to create large block lists (DNSBLs),
published on the web for free, that any ISP or mail server can query to control the traffic on its
own network. These organizations include The Spamhaus Project (www.spamhaus.org),
SORBS (www.au.sorbs.net) and SpamCop.net (www.spamcop.net).
Chapter 5
The Spamhaus Project is a volunteer effort founded by Steve Linford in 1998 that aims to track e-mail
spammers and spam-related activity. Spamhaus is responsible for three widely used DNS Blocklists
that many internet service providers use to reduce the amount of spam they take on.
In generating these three blocklists, Spamhaus follows a strict policy, which requires a precise
definition of spam. As we said before, e-mail messages are considered spam if they are both bulk and
unsolicited (UBE); spam is not an issue of content (it does not matter what is written in the
message) but of consent. For this reason it is very important to understand the meaning of opt-in,
opt-out and confirmed opt-in. To opt in means to have one’s e-mail address added to a mailing list.
Spammers exploit the fact that once an address is opted in, the recipient rarely opts out formally
to remove the address from that mailing list, so the spammer goes on sending spam to that address.
From the legal point of view that is still unsolicited e-mail and therefore spam. For e-mail to
count as solicited, the recipient must have verifiably confirmed permission for the address to be
included on the specific mailing list, by confirming (responding to) the list subscription request
verification.
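The confirmed opt-in flow can be illustrated with a small sketch. An address is only recorded as
pending when a subscription is requested, and it joins the list only when the confirmation token
sent to that address comes back. The class and method names are hypothetical, and actually mailing
the token is left out.

```python
import secrets
from typing import Dict, Set


class ConfirmedOptInList:
    """Toy mailing list that adds an address only after its owner confirms."""

    def __init__(self) -> None:
        self.pending: Dict[str, str] = {}  # token -> address awaiting confirmation
        self.subscribers: Set[str] = set()

    def request_subscription(self, address: str) -> str:
        """Record a pending request and return the token to e-mail to the address."""
        token = secrets.token_urlsafe(16)
        self.pending[token] = address
        return token

    def confirm(self, token: str) -> bool:
        """Subscribe the address tied to `token`; unknown or reused tokens are refused."""
        address = self.pending.pop(token, None)
        if address is None:
            return False
        self.subscribers.add(address)
        return True

    def opt_out(self, address: str) -> None:
        """Remove an address from the list."""
        self.subscribers.discard(address)
```

Because the token reaches only the owner of the mailbox, a third party who submits someone else's
address cannot complete the subscription.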
sent to Spamhaus spamtraps or submitted to Spamhaus by trusted third party intelligence are listed
in the SBL; spam services, including mail, web, DNS and other servers identified as being an integral
part of a spam operation or being under the direct control of spammers are listed in the SBL; the SBL
also lists known spam operations and gangs listed in the ROKSO list (described later), and services
supporting these known spam operations.
IP addresses are removed from the SBL database immediately once the SBL Team receives notification
from the IP owner (the Internet Service Provider responsible for assigning or routing the IP
address) that the reason for the listing has been corrected or terminated. If this does not happen,
SBL records are removed automatically when they time out. The time-out can differ for each entry of
the SBL, depending on the spam source (in any case it is always the entry’s editor who decides it).
For unidentified spammers it can be 2 to 14 days, persistent spammers may have time-outs of 6
months, and known spam gangs can be listed for 1 year or more.
• 127.0.0.2, if the data source is the SBL, which will contain direct UBE sources, spam services
and ROKSO spammers;
• 127.0.0.4-8, if the data source is the XBL, which will contain illegal third party exploits
(proxies, worms, trojans);
• 127.0.0.10-11, if the data source is the PBL, which will contain non-MTA IP address ranges
set by outbound mail policy.
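The return codes listed above can be turned into a small lookup helper. This is only a sketch of
the mapping as given in the text; the function name is hypothetical, and any code outside the
listed ranges is reported as unknown rather than guessed.

```python
import ipaddress
from typing import Optional


def spamhaus_data_source(return_code: str) -> Optional[str]:
    """Map a 127.0.0.x return code to the data source named in the text, or None."""
    address = ipaddress.IPv4Address(return_code)
    if address not in ipaddress.ip_network("127.0.0.0/24"):
        return None
    last_octet = int(address) & 0xFF
    if last_octet == 2:
        return "SBL"
    if 4 <= last_octet <= 8:
        return "XBL"
    if 10 <= last_octet <= 11:
        return "PBL"
    return None
```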
5.2 ROKSO
The Spamhaus Register of Known Spam Operations (ROKSO) is a database of “hard-core spam gangs”:
spammers and spam operations that have been terminated by three or more ISPs for spamming. The
ROKSO list is not a DNSBL; it is, rather, a directory of publicly sourced information about these
persons and their business and, at times, criminal activities.
To be placed on the ROKSO list, a spammer must first be terminated by a minimum of 3 ISPs for AUP
violations. Once a spammer is listed in ROKSO, IP addresses under that spammer’s control are
automatically and preemptively listed in the Spamhaus Block List. For qualified law enforcement
agencies, Spamhaus provides a special version of the ROKSO database which gives access to records
with evidence, logs and information on the illegal activities of many of these gangs that are too
sensitive to publish openly.
Each spam operation, or “spam gang”, consists on average of 1 to 5 spammers. The majority of the
spammers on the ROKSO list operate illegally and move from network to network and country to
country, seeking out Internet Service Providers with poor security or known for not enforcing
anti-spam policies. Many of these spam operations pretend to operate “offshore”. Those who do not
hide behind anonymity pretend to be small ISPs themselves, claiming to their providers that the
spam is being sent not by them but by non-existent customers. When caught, almost all use the
age-old tactic of lying to each ISP long enough to buy a few more days or weeks of spamming and,
when terminated, simply move on to the next ISP, already set up and waiting.
5.3 DROP
The Spamhaus Don’t Route Or Peer (DROP) List is an advisory “drop all traffic” list, consisting
of stolen zombie netblocks and netblocks controlled entirely by professional spammers. DROP is a
tiny sub-set of the SBL designed for use by firewalls and routing equipment. DROP is simply a
text list of these IP address spaces, with the numbers of the underlying SBL listings as comments.
When implemented at a network or ISP’s core routers, DROP can protect all the network’s users from
spamming, scanning, harvesting and DDoS attacks originating on rogue netblocks.
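As an illustration of how such a text list could be consumed, the sketch below parses lines of the
form "netblock ; comment" and tests whether an address falls inside any listed netblock. The ";"
comment separator, the function names and the sample entries (documentation address ranges with
made-up SBL numbers) are assumptions for the example, not data taken from the real list.

```python
import ipaddress
from typing import List, Tuple

DropEntry = Tuple[ipaddress.IPv4Network, str]


def parse_drop_list(text: str) -> List[DropEntry]:
    """Parse 'netblock ; comment' lines, skipping blank and comment-only lines."""
    entries: List[DropEntry] = []
    for line in text.splitlines():
        netblock, _, comment = line.partition(";")
        netblock = netblock.strip()
        if not netblock:
            continue  # blank line, or a line holding only a comment
        entries.append((ipaddress.IPv4Network(netblock), comment.strip()))
    return entries


def is_dropped(ip: str, entries: List[DropEntry]) -> bool:
    """Return True if the IPv4 address lies inside any listed netblock."""
    address = ipaddress.IPv4Address(ip)
    return any(address in network for network, _ in entries)


# Hypothetical sample in the assumed format; the SBL numbers are invented.
SAMPLE = """; hypothetical header comment
192.0.2.0/24 ; SBL000001
198.51.100.0/25 ; SBL000002
"""
```

A firewall or router would normally load the netblocks into its own filter tables; the linear scan
here is only meant to show the matching rule.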
Chapter 6
SORBS
SORBS stands for Spam and Open Relay Blocking System. It is an open proxy and open mail relay
DNSBL, later extended with complementary lists that include various other classes of hosts. The
SORBS DNSBL was created in 2002, first as a private list, and was launched to the public in 2003.
It was originally conceived as an anti-spam project based on a daemon checking “on the fly” whether
the e-mail it received had passed through proxies and open relay servers. The DNSBL created in this
way listed thousands of compromised hosts and proxy servers. SORBS has since expanded to include
hacked and hijacked servers, formmail scripts and trojan infestations, and it now also
pre-emptively lists all dynamically allocated IP address space.
SORBS provides many different zones of the form *.sorbs.net. Some examples are dnsbl.sorbs.net
(which includes all the other DNS zones except spam.dnsbl.sorbs.net), rhsbl.sorbs.net (containing
all RHS zones) and, of course, all their sub-zones. SORBS also provides other aggregated zones such
as safe.dnsbl.sorbs.net, problems.dnsbl.sorbs.net, relays.dnsbl.sorbs.net and
proxies.dnsbl.sorbs.net. These zones are the ones that servers query and to which requests for new
entries are addressed. In addition to providing the SORBS zones, SORBS also makes the ASPEWS and
SPEWS data available by DNSBL lookup, but since the policy of SORBS is to publish only data that is
fully under SORBS control, the ASPEWS and SPEWS zones are not included in the SORBS aggregate zone.
6.1 DUHL
SORBS adds IP ranges that belong to dial-up modem pools, dynamically allocated wireless and DSL
connections, and DHCP LAN ranges, using reverse DNS PTR records, WHOIS records and sometimes
submissions from the ISPs themselves. These IPs form the so-called DUHL (Dynamic User and Host
List). It is similar to other DUL lists, but while those list dial-up ranges only, the DUHL also
lists IP space where addresses are assigned dynamically, since the increasing use of cable modem
and DSL connections has made dial-up quite rare and simple DUL lists are no longer as effective.
The SORBS DUHL originally started life as a straight import of the Dynablock list maintained by
Easynet NL. SORBS accepts requests to add or remove entries from the ISPs responsible for a given
IP address space, and also lists dynamically allocated addresses that SORBS comes across, typically
after receiving spam from them and checking their reverse DNS names. When examining rDNS, SORBS
relies on the IETF draft “draft-msullivan-dnsop-generic-naming-schemes-00.txt”, which makes
recommendations on naming for static and dynamic assignments, to determine whether a network
allocates static or dynamic addresses, on the assumption that these naming recommendations are
followed. Matthew Sullivan of SORBS proposed in this draft that generic reverse DNS names include
purpose tokens such as “static” or “dynamic”. The draft has since expired, and it is generally
considered more appropriate for ISPs simply to block outgoing traffic to port 25 if they wish to
prevent users from sending e-mail directly, rather than signalling this in the reverse DNS record
for the IP. Another very important point is that SORBS expects hosts to have long TTLs, as short
TTL values (especially under 1 hour) usually indicate that the record is about to change. Removal
requests, for example, require the Time To Live of the PTR record to be 43200 seconds or more.
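The two checks described in this section can be sketched in a few lines: looking for a “static” or
“dynamic” token in a PTR name, and testing the 43200-second TTL requirement for removal requests.
The function names are hypothetical, and splitting the name on dots and hyphens is a simplifying
assumption; the draft's exact naming rules are not reproduced here.

```python
import re
from typing import Optional

MIN_PTR_TTL_SECONDS = 43200  # minimum PTR TTL for removal requests, as stated in the text


def purpose_token(ptr_name: str) -> Optional[str]:
    """Return 'static' or 'dynamic' if that token appears in the PTR name, else None."""
    parts = re.split(r"[.\-]", ptr_name.lower())
    for token in ("static", "dynamic"):
        if token in parts:
            return token
    return None


def ttl_allows_removal(ptr_ttl_seconds: int) -> bool:
    """Return True if the PTR record's TTL meets the stated minimum."""
    return ptr_ttl_seconds >= MIN_PTR_TTL_SECONDS
```

For example, a name such as host-192-0-2-1.dynamic.example.net (a made-up name) would be classified
as dynamic, while a name with neither token gives no indication either way.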
6.2 Submissions and queries
Submissions to SORBS can be made for three different lists:
• The Dynamic User/Host List (DUHL). This is an IP-based list; it therefore forms part of
dnsbl.sorbs.net and is available separately as dul.dnsbl.sorbs.net. SORBS accepts submissions
to the DUHL only from registered logins whose registered e-mail address matches the WHOIS
record for the domain.
• The Bad DNS Config List. This is a domain-based list (sometimes known as a Right Hand
Side Block List, RHSBL) and forms part of rhsbl.sorbs.net. It is available separately as
baddns.rhsbl.sorbs.net. This list is explicitly for domains with bad DNS configurations that
can cause real problems for some mail servers. There are two reasons why hosts and
domains may be listed here: the first is that at least one MX record points to 127.0.0.1/32,
0.0.0.0/8 or 255.255.255.0/8; the second is that at least one MX record points to 10.0.0.0/8,
172.16.0.0/12, 192.168.0.0/16 or any address in 224.0.0.0 - 254.255.255.255 and the domain does
not have an MX record in normal address space.
• The No e-mail from this domain list. Like the previous one, this is a domain-based list that is
part of rhsbl.sorbs.net. It lists hosts and domains that will never be used for sending legitimate
e-mail. For example, SuperNews admins have indicated that no mail will ever be sent from the
domains *.supernews.net.
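The two listing conditions of the Bad DNS Config List can be expressed as a short check over the
addresses a domain's MX records resolve to. This is a sketch of the rules as stated above, not
SORBS code; the function names are hypothetical, and the range written as 255.255.255.0/8 is
normalised by the ipaddress module to 255.0.0.0/8, which is an assumption about what was meant.

```python
import ipaddress
from typing import Iterable

# First condition: ranges that are bad on their own. strict=False normalises
# the non-canonical "255.255.255.0/8" to 255.0.0.0/8 (an assumption).
ALWAYS_BAD = [
    ipaddress.ip_network(n, strict=False)
    for n in ("127.0.0.1/32", "0.0.0.0/8", "255.255.255.0/8")
]
# Second condition: private ranges plus the 224.0.0.0 - 254.255.255.255 span.
PRIVATE = [ipaddress.ip_network(n) for n in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")]
HIGH_START = ipaddress.IPv4Address("224.0.0.0")
HIGH_END = ipaddress.IPv4Address("254.255.255.255")


def _always_bad(address: ipaddress.IPv4Address) -> bool:
    return any(address in network for network in ALWAYS_BAD)


def _non_routable(address: ipaddress.IPv4Address) -> bool:
    return any(address in network for network in PRIVATE) or HIGH_START <= address <= HIGH_END


def has_bad_dns_config(mx_addresses: Iterable[str]) -> bool:
    """Apply the two listing conditions to the IPv4 addresses of a domain's MX records."""
    addresses = [ipaddress.IPv4Address(a) for a in mx_addresses]
    if any(_always_bad(a) for a in addresses):
        return True  # first condition: one such MX record is enough
    has_non_routable = any(_non_routable(a) for a in addresses)
    has_normal = any(not _always_bad(a) and not _non_routable(a) for a in addresses)
    return has_non_routable and not has_normal  # second condition
```

Under these rules a domain with one MX in 10.0.0.0/8 and another in ordinary address space is not
listed, whereas a domain whose only MX is in 10.0.0.0/8 is.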
SORBS can be queried by providing the address we want to check. The query produces a return code
that indicates which database the result was obtained from. If the query is made on an aggregate
zone, the return code still identifies the specific zone from which the result was obtained. All
return codes are of the form 127.0.0.x. For example, 127.0.0.2 refers to http.dnsbl.sorbs.net and
127.0.0.8 refers to block.dnsbl.sorbs.net. If an IP address appears in more than one database, all
applicable codes are returned, so a single query reveals every database containing that IP
address.
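Interpreting the answer to an aggregate-zone query then amounts to translating each returned code
back to its zone. The sketch below includes only the two codes mentioned in the text; the other
codes are deliberately left out rather than guessed, and the function name is hypothetical.

```python
from typing import Dict, Iterable, List

# Only the two codes given in the text; the full table is not reproduced here.
SORBS_RETURN_CODES: Dict[str, str] = {
    "127.0.0.2": "http.dnsbl.sorbs.net",
    "127.0.0.8": "block.dnsbl.sorbs.net",
}


def zones_for(return_codes: Iterable[str]) -> List[str]:
    """Translate every returned 127.0.0.x code into the zone it identifies."""
    return [SORBS_RETURN_CODES.get(code, f"unknown ({code})") for code in return_codes]
```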
Chapter 7
SpamCop
SpamCop is a free spam reporting service which allows recipients of unsolicited bulk e-mail (UBE)
and unsolicited commercial e-mail (UCE) to report offenses to the senders’ ISPs, and sometimes to
their web hosts. SpamCop uses these reports to compile a DNSBL of computers sending spam, called
the “SpamCop Blocking List” (SCBL), while websites referenced in the spam are used to create the
Spam URI Realtime Blocklists (SURBL) RHSBL. SpamCop has tools for ISPs to manage the reports sent
to them, to see details on individual spam messages, and to mark incidents as resolved.
The listing system operates based on the following rules, taking into account the reputation points
and number of reports.
• The SCBL lists IP addresses with a large number of reports relative to reputation points. The
threshold is set manually by the SpamCop team in order to make the list as accurate as possible.
• Reports are weighted by freshness, that is, by how recently the e-mail was received:
• Total reports are weighted with respect to spamtrap report scores in the following way: for
spamtrap scores less than 6, the number of spamtrap reports is multiplied by 5; for spamtrap
scores of 6 or more, this number is squared. These scores are then added to the total of reports.
For example:
– an IP address with 2 spamtrap reports and 3 SpamCop user reports will have a weighted
score of (2 × 5) + 3 = 13;
– a host with 7 spamtrap reports and 3 manual reports will have a weighted score of (7 × 7) + 3 = 52.
• The SCBL does not count reports regarding URLs or addresses in the body of the email. There-
fore, the SCBL does not list websites or email addresses used to receive replies in reported email,
unless that IP is also used to send the mail.
• The SCBL will not list an IP address with only one report filed.
• With only two reports against an IP address, the SCBL will list the IP address for a maximum
of 12 hours after the most recent reported mail was sent.
• The SCBL will not list an IP address if there are no reports against it within 24 hours.
• If a server sends bounces to an SCBL spamtrap in sufficient quantity to meet the listing criteria,
the SCBL will list that server. This situation arises because some mail servers do not reject mail
during the SMTP transaction, but rather accept the mail and then send a bounce message later.
Viruses and spam often contain a forged From: field, so if the e-mail is rejected or blocked during
the SMTP transaction, the bounce will go to the connecting IP. If the bounce comes after the
mail is accepted for delivery, then the bounce will go to the address in the From: field. Viruses
and spam often use addresses from the list of recipients to populate the From: field. Sometimes,
these addresses are spamtraps.
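The spamtrap weighting rule in the list above can be written as a short function and checked
against the two worked examples. The function name is hypothetical. Because the worked example
squares a spamtrap score of 7, this sketch assumes that squaring applies from 6 upward; treat that
boundary as an assumption.

```python
def weighted_score(spamtrap_reports: int, user_reports: int) -> int:
    """Weight spamtrap reports (x5 below 6, squared otherwise) and add the user reports."""
    if spamtrap_reports < 6:
        weighted_traps = spamtrap_reports * 5
    else:
        # Assumed boundary: squaring starts at 6 spamtrap reports.
        weighted_traps = spamtrap_reports ** 2
    return weighted_traps + user_reports


if __name__ == "__main__":
    print(weighted_score(2, 3))  # (2 * 5) + 3 = 13
    print(weighted_score(7, 3))  # (7 * 7) + 3 = 52
```

The squaring step makes repeated spamtrap hits count far more heavily than the same number of
manual reports, which fits the idea that mail arriving at a spamtrap is unsolicited by
construction.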
7.2 Limitations
For first-time SpamCop Reporters, the SpamCop Parsing and Reporting Service requires the reporter
to verify manually that each submission is spam and that the destinations of the spam reports are
correct. People who use tools to automatically report spam, who report e-mail that is not spam, or
who report to the wrong people may be fined or banned. This verification requires extra time and
effort. Despite these steps, reports to innocent bystanders do happen, and ISPs may need to
configure SpamCop not to send further reports if they do not want to see them again. SpamCop
Reporters with a proven track record are allowed to file Quick Reports, reducing both time and
effort.
It is not clear whether reporting spam through SpamCop’s reporting service actually reduces the
amount of spam received, and complaints on SpamCop’s online forum provide anecdotal evidence
to support some skepticism about its effectiveness. Spammers who determine the identity of the
complainants can, by doing so, also verify that the email addresses are still in use. What is clear is
that much spam email is filtered or blocked by the SCBL, which is fed by many SpamCop Reporters
reporting their spam.
That said, SpamCop is effective at helping ISPs, web hosts and email providers identify accounts
that are being abused and shut them down before the spammer finishes operations. Finally, SpamCop
provides information from its reports to third parties who are also working to fight spam, amplifying
the impact of its services beyond its own reach.
It is also remarkable in its own right that SpamCop has survived for so many years, considering
the severity of opposition other anti-spam companies have faced in the past. SpamCop has dealt with
attacks by spammers thus far by hiring services from Akamai, but is still the target of many hackers
and could face serious difficulties if it continues to grow in size and effectiveness. Significant offensive
weapons can be wielded by the criminal syndicates behind spammers. SpamCop views itself as an
attempt to stop spam without the necessity of governmental intervention, but because it lacks the
power of a government or large ISP, it may have greater difficulty dealing with spammers’ expertise
as well as the large “bot” networks that they control and that they could use to perform a massive
DDoS attack.
Chapter 8
Conclusions
We have seen many different anti-spam techniques, in particular some based on honeypots and
spamtraps, and how these techniques are used to create useful blocking lists and databases. The
introduction of these methods, as we have already said, has the great merit of having made the
abuse of exploitable network resources harder and riskier for spammers. Besides this, organizations
like The Spamhaus Project, which maintain not only lists of simple IP addresses but also databases
with detailed descriptions and evidence of spammers’ attacks and the techniques used, can be really
helpful if combined with effective legislation and law enforcement by the State.
Furthermore, some interesting aspects emerge from the listing policies of these DNSBLs. Some are
built solely from feeds from honeypots or trusted third parties, while the SpamCop SCBL, for
example, also accepts feeds from its registered users, which can balance the filtering and listing
against what users actually consider spam. On the other hand, this method is not always effective,
or at least we have no assurance that it is, since not all spam reported by users will be blocked,
because the listing criteria are somewhat more complicated.
Another relevant point about Spamhaus, SORBS, SpamCop and all the other honeypot-based anti-spam
organizations is that there will always be a trade-off between failing to reject some spam and
blocking legitimate mail; some of them are often considered too aggressive. For this reason it is
very important to have balanced listing criteria, and it is advisable to use whitelists in order
to prevent messages from wanted senders from being blocked. The last point to consider is the price
in time paid for queries to the DNSBLs and databases, but this is a matter for each mail server
administrator to judge.
In conclusion, we can say that, despite not being the ultimate anti-spam tool that will defeat the
problem of spam forever, honeypots have had a positive impact in fighting spam, and the three
organizations analyzed have for years been a source of trouble for spammers. As I said, they still
require more co-ordination with law enforcement, which is what they were created for, and less
tolerance on the part of the State, so that spammers are not just blocked by a few servers but
brought before a court.
Bibliography
[1] www.wikipedia.org
[2] www.cbsnews.com
[3] www.spamhaus.org
[4] www.au.sorbs.net
[5] www.spamcop.net
[6] Matthew Sullivan, Spam and Open Relay Blocking System, IETF Internet Draft