2017 Bad Bot Report

Table of Contents
INVESTIGATING SURPRISE ATTACKS AT THE APPLICATION LAYER
EXECUTIVE SUMMARY OF FINDINGS
SECTION 1: THE BAD BOT LANDSCAPE
What is a Bad Bot?
Bad Bot vs Good Bot vs Human Traffic, 2016
Trend: Bad Bot vs Good Bot vs Human Traffic 2014-2016
Explosive Internet User Growth Continues in 2016
Size Matters: Bad Bot Traffic By Website Size, 2016
Bad Bots Lie About Their Identity
Even Bad Bots Skip Updates
The Good, The Bad, and the Amazon
The Weaponization of the Data Center
Mobile: The Undefended Frontier
Uncle Sam's Bot Army
China: The Most Blocked Country
Countries with the Highest Bad Bot GDP
SECTION 2: BAD BOTS KNOW WHAT THEY WANT
The Daily Struggle: Bad Bots in the Wild
OWASP: One Taxonomy to Rule Them All
Bad Bot Magnet: If You Build It They Will Come
You've Been Scraped
How Scraping Works
Sign Up Pages
Bad Bots Love Login Pages
How Credential Cracking Works
How Credential Stuffing Works
Protecting Your Login Page Is Not Enough
Application Denial of Service is Not Volumetric DDoS
All Your Web Analytics Are Wrong
Vulnerability Scanners Are Everywhere
Spamming is a Grievous Annoyance
Bad Bot Sophistication Levels
CONCLUSION
REFERENCES
ABOUT DISTIL NETWORKS
CONTACT US
Copyright 2017 Distil Networks. All Rights Reserved.

Investigating Surprise Attacks at the Application Layer
The 2017 Bad Bot Report investigates the daily surprise attacks sneaking under sensors and wreaking
havoc on websites. This report is based on 2016 data collected from Distil Networks' global network,
and includes hundreds of billions of bad bot requests, anonymized over thousands of domains. The
goal is to offer those on the frontlines of website security guidance about the nature and impact
of automated threats.
What makes this report unique is the focus on bad bot activity at the application layer (layer 7 of
the OSI model [1]). Automated application layer attacks differ from volumetric DDoS attacks, the latter
manipulating lower level network protocols (see SYN flood for more detail [2]).
Bad bots interact with applications in the same way a legitimate user would, making them harder to
prevent. Bots enable high-speed abuse, misuse, and attacks on websites and APIs. They enable
attackers, unsavory competitors, and fraudsters to perform a wide array of malicious activities.
This includes web scraping, competitive data mining, personal and financial data harvesting,
brute-force login and man-in-the-middle attacks, digital ad fraud, spam, transaction fraud, and more.

2016: The Year Bad Bots Appear Before Congress
The bad bot problem has become so rampant it's earned its first piece of federal legislation. In an
attempt to make the use of ticket scraping bots illegal, the US Congress passed the Better Online
Ticket Sales Act (BOTS) in September 2016. Its purpose was to prohibit the use and selling of
software that circumvents security measures on ticket seller websites. It also prohibits selling any
ticket in interstate commerce that was knowingly obtained in violation of the prohibition. While
legislation is a welcome deterrent, scraping is a technical problem, and it's difficult to legislate
against those you can't identify [3].
Businesses hire bad bot creators to price scrape competitor websites, capture findings from
consumer monitoring and opinion gathering sites, and scrape contact information from consumers to
whom they wish to market. Developers that create sophisticated scraping bots can earn as much as
$128,000 per year. Renting bots-for-hire can cost as little as $3.33 per hour [4].
After a year of record-breaking DDoS attacks from weaponized IoT devices, congressional hearings
on anti-scraping legislation, and increased bot activity, it's clear that bad bots are here to stay.

Executive Summary of Findings
Bigger Site, Bigger Target
Bad bots made up 20% of all web traffic and are everywhere, at all times; they don't take breaks
and they don't sleep. Even though bad bots are on all sites, larger sites were hit the hardest
in 2016. Bad bots accounted for 21.83% of large website web traffic, an increase of
36.43% since 2015.
Bad Bots Tell Alternative Facts
Bad bots lie. 75.9% claimed to be the most popular browsers: Chrome, Safari, Internet Explorer,
and Firefox. There was also a 42.78% year-over-year increase in bad bots claiming to be
mobile browsers.
The Weaponization of the Data Center
Data centers were the weapon of choice for bad bots, with 60.1% coming from the cloud. Amazon
AWS was the top originating ISP for the third year in a row with 16.37% of all bad bot traffic, four
times more than the next ISP (OVH SAS).
Bad Bots Go Mobile
Looking to scrape a competitor's site? There may be an app for that. In 2016, 16.1% of bad bots
self-reported as mobile users. Mobile ISPs accounted for 9.4% of bad bot traffic. For the first time,
Mobile Safari made the top five list of self-reported user agents and outranked Web Safari in
bad bot traffic, by 17%.
Some Bad Bots Partying Like It's 1999
We humans aren't the only ones falling behind on software updates; it turns out bad bots have the
same problem. One in every ten (9.45%) bad bots said they were using browser versions released
before 2013. Some bad bots were reporting browser versions released as far back as 1999.

USA is the Only Bot Superpower, China and Russia are the Most Blocked
More bad bots claimed to be American than all other nationalities combined. Over half of bad bots
(55.4%) were hiding in plain sight within American data centers. China reached the top three for bad
bots for the first time, and along with Russia they were the two most blocked countries by websites.
USA Only Fifth in Bad Bot GDP (Bad Bots per Online User)
Dominica, Netherlands, Seychelles, and Iceland all had higher bad bot GDPs than the US. The
Caribbean island of Dominica had the highest number of bad bots per online user, double its nearest
rival. USA was only fifth highest, behind Iceland.
If You Build It, They Will Come
When it comes to the attractiveness of a website, bad bots have a type. There are four key website
features bad bots look for: proprietary content and/or pricing information, a login section, web forms,
and payment processing.
97% of sites with proprietary content and/or pricing were hit by unwanted scraping. 96% of websites
with login pages were hit by bad bots. 90% of websites were hit by bad bots that bypassed the login
page. 31% of websites with forms were hit by spam bots.
Advanced Persistent Bots (APBs)
Today's advanced persistent bots are sophisticated in that they can load JavaScript, hold onto
cookies, and load up external resources, and persistent, in that they can randomize their IP address,
headers, and user agents. In 2016, 75% of bad bots were Advanced Persistent Bots.
Telltale Signs Bots Are On Your Website
You can tell bad bots are on your site when unexpected spikes in traffic cause slowdowns
and downtime. In 2016, a third (32.36%) of sites had bad bot traffic spikes of 3x the mean, and
averaged 16 such spikes per year.

You'll know bad bots are a problem when your site's SEO rankings plummet due to price
scraping and your ad spend is misdirected by skewed analytics. 93.9% of sites were visited
by bad bots that trigger marketing analytics trackers and performance measuring tools.
Because of bad bots, your company will have a plethora of chargebacks to resolve with your
bank due to fraudulent transactions. You'll see high numbers of failed login attempts and
increased customer complaints regarding account lockouts.
Bad bots will leave fake posts, malicious backlinks, and competitor ads in your forums and
customer review sections. 31.1% of sites were hit with bots spamming their web forms.
WAFs Are No Match for Advanced Persistent Bots
If you're using a web application firewall (WAF) and are filtering out known violator user agents
and IP addresses, that's a good start. However, bad bots rotate through IPs and cycle through
user agents to evade these WAF filters.
You'll need a way to differentiate humans from bad bots using headless browsers, browser
automation tools, and man-in-the-browser malware. 52.05% of bad bots load and
execute JavaScript, meaning they have a JavaScript engine installed.

Section 1
THE BAD BOT LANDSCAPE
What is a Bad Bot?
What is the difference between a good and a bad bot? In simplistic terms, a good bot ensures online
businesses and their products can be found by prospective customers. Search engine crawlers like
GoogleBot and Bingbot are examples of good bots because they index websites by keywords to help
people match their search engine queries with the best set of websites for a given question.
Bad bots scrape data from sites without permission in order to reuse the data (e.g., pricing, inventory
levels) and gain a competitive edge. The truly ugly ones undertake criminal activities, such as fraud
and outright theft.
The Open Web Application Security Project (OWASP) provides a list of the different bad bot types in its
Automated Threat Handbook [5].
Bad Bot vs Good Bot vs Human Traffic, 2016

In 2016, bad bots accounted for 19.9% of all website traffic, a 6.98% increase over 2015. Both
human and bad bot traffic numbers were higher than the prior year. Human traffic increased
by 12.8% to account for 61.3% of all website traffic. Good bots decreased by 30.5% in 2016,
accounting for 18.8% of all website traffic.
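As a rough consistency check on these figures, the sketch below backs out the 2015 shares they imply. It assumes the quoted changes are relative year-over-year changes in each category's share of traffic (not percentage-point changes); the report does not state this explicitly, and the function and variable names are illustrative only.

```python
# 2016 traffic shares (percent) and quoted year-over-year changes (percent),
# taken from the paragraph above.
SHARES_2016 = {"bad_bots": 19.9, "humans": 61.3, "good_bots": 18.8}
YOY_CHANGE = {"bad_bots": 6.98, "humans": 12.8, "good_bots": -30.5}


def implied_prior_share(share_now: float, relative_change_pct: float) -> float:
    """Back out last year's share, ASSUMING the change is relative, not additive."""
    return share_now / (1 + relative_change_pct / 100)


# Implied 2015 share for each traffic category, rounded to one decimal place.
implied_2015 = {
    name: round(implied_prior_share(SHARES_2016[name], YOY_CHANGE[name]), 1)
    for name in SHARES_2016
}

if __name__ == "__main__":
    for name, share in implied_2015.items():
        print(f"{name}: {share}% of traffic in 2015 (implied)")
```

Under that assumption the implied 2015 shares come to roughly 18.6% bad bots, 54.3% humans, and 27.1% good bots, which sum to about 100% and so are at least internally consistent.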
Trend: Bad Bot vs Good Bot vs Human Traffic 2014-2016
The chart below shows traffic trends since 2014.
Although bad bot traffic grew, its share of overall traffic remained relatively constant.
Why is this? More people are coming online from developing nations, and they use
multiple devices (including smartphones, tablets, work and personal laptops)
to access the internet.

Explosive Internet User Growth Continues in 2016
In 2016, approximately 185 million new internet users came online, 98.9% of whom live outside
the United States [6]. Leading the way was India, which added an incredible 108 million users
(a 30.5% increase over 2015). The explosive growth is a result of state policies to spread internet
service to rural areas.
New Internet Users by Country, 2016
There are still many more people around the world without internet service. In India alone, 65.2%
of its population is still not connected. This means that explosive internet user growth will be a
continuing trend for the foreseeable future.

Size Matters: Bad Bot Traffic By Website Size, 2016
We define a website's size by its Alexa index [7], which ranks sites by the amount of traffic received.
An Alexa score of 1 means that it's the most popular internet site; as of this writing, it's Google.com.
We used Alexa rankings to categorize site sizes as follows:
Large = Alexa 1-10,000
Medium = Alexa 10,001-50,000
Small = Alexa 50,001-150,000
Tiny = Alexa 150,000+
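The size bands above can be expressed as a small lookup. This is only a sketch: the function name is hypothetical, and because rank 150,000 appears in both the Small and Tiny bands as published, the code assumes it belongs to Small.

```python
def site_size(alexa_rank: int) -> str:
    """Map an Alexa rank to the report's site-size category.

    ASSUMPTION: rank 150,000 is treated as Small, since the published
    bands list it under both Small and Tiny.
    """
    if alexa_rank < 1:
        raise ValueError("Alexa ranks start at 1")
    if alexa_rank <= 10_000:
        return "Large"
    if alexa_rank <= 50_000:
        return "Medium"
    if alexa_rank <= 150_000:
        return "Small"
    return "Tiny"
```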
Good Bot, Bad Bot, and Human Traffic to All Sized Sites, 2016
Even though bad bots were everywhere, large sites were hit the hardest: 21.83% of their traffic was
bad bots, up 36.43% over 2015. Tiny sites had the least human traffic: 71.93% of visitors were bots.

The explanation for the drop in human traffic on smaller sites has to do with search engines. Good
bots like Googlebot and Bingbot crawl the web more or less equally regardless of site size. However,
larger sites are generally ranked higher in search engine results. Because humans rarely look past the
first few search engine results, small and tiny sites don't get the same level of SEO traffic uplift as do
large and medium sites.
Large and medium sites are more enticing targets for bad bots. The following four charts show the
ratio of bad to good bot traffic on large, medium, small, and tiny sites. Looking at traffic on large sites,
the bot ratio was 57.9% bad to 42.1% good. At the other end of the scale, bad bots made up
26.1% and 28.6% of bot traffic on small and tiny sites, respectively.
Charts: Large Sites, Medium Sites, Small Sites, Tiny Sites

Bad Bots Lie About Their Identity
Bad bots must lie about who they are to avoid detection. They do this by reporting their user agent as
a web browser or mobile device. In 2016, Chrome was the most popular browser bad bots claimed to
be, followed by Firefox (38.61%) and, cracking the top three for the first time, Mobile Safari (8.95%).
Top Self-Reported Browsers, 2014-2016
Similar to previous years, in 2016 the majority of bad bots (75.9%) self-reported as either Chrome,
Safari, Firefox, or Internet Explorer. 16.1% self-reported as mobile browsers, such as Mobile Safari
and Opera. The other 8% reported themselves as good bots such as Googlebot and Bingbot.
Bad Bot Reported User Agent Types, 2016

Interestingly, bad bots self-reporting as mobile users have increased by 42.78% since 2015, a trend
we expect to continue. Web security teams are used to seeing bad traffic coming from web browsers,
so self-reporting as mobile browsers raises fewer red flags. The following chart shows year-over-year
growth since 2015 of user agent types bad bots claim to be.
Bad Bot Reported User Agent Types, 2015-2016
Even Bad Bots Skip Updates
Even though the majority of bad bots detected in 2016 were programmed to report as major
browsers, not all were up-to-date. Humans are always falling behind on performing software
updates and it's the same for bad bots.

The 10 Oldest Self-Reported Browsers by Bad Bots, 2016
One in every ten (9.45%) bad bots reported themselves as being out-of-date browsers at least five
years old. Released in 1999, Internet Explorer 5 was the oldest.
Why were bad bots reporting as out-of-date browsers? Perhaps some were written many years ago
and are still at work today. Some may have been targeting specific systems that only accept specific
browser versions. Others may have been out-of-control programs, bouncing around the internet in
endless loops, still causing collateral damage.
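One way to surface such traffic is to compare the browser version a client claims against that version's release year. The sketch below is illustrative only: it recognizes just the legacy Internet Explorer "MSIE" token, the release-year table is a small hand-written sample, and the function names and the five-year threshold default are assumptions rather than Distil's actual detection logic.

```python
import re
from typing import Optional

# Illustrative sample only: release years for a few Internet Explorer versions.
IE_RELEASE_YEAR = {5: 1999, 6: 2001, 7: 2006, 8: 2009, 9: 2011}

# Legacy IE user agents carry a token such as "MSIE 5.0".
_MSIE_TOKEN = re.compile(r"MSIE (\d+)\.")


def claimed_release_year(user_agent: str) -> Optional[int]:
    """Return the release year of the claimed IE version, or None if unknown."""
    match = _MSIE_TOKEN.search(user_agent)
    if match is None:
        return None
    return IE_RELEASE_YEAR.get(int(match.group(1)))


def is_outdated(user_agent: str, as_of_year: int = 2016, min_age_years: int = 5) -> bool:
    """Flag user agents whose claimed browser is at least min_age_years old."""
    year = claimed_release_year(user_agent)
    return year is not None and as_of_year - year >= min_age_years
```

For example, `is_outdated("Mozilla/4.0 (compatible; MSIE 5.0; Windows 98)")` flags a client claiming to be Internet Explorer 5, while a user agent the sample table does not recognize is simply left unflagged.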

The Good, The Bad, and the Amazon
It turns out bad bots are hip. They live in the cloud and are experimenting with mobile. Amazon
AWS, the world's leading cloud host, generated four times the amount of bad bot traffic (16.37%) as
OVH SAS (a French ISP), which was a distant second place.
Clouds such as Amazon, Digital Ocean, and Google Cloud were all in the top 10 bad bot-originating
ISPs. T-Mobile, responsible for 1.98% of bad bot traffic, popped into the top 10 as well.
The 10 Bad Bot Originating ISPs, 2016

The Weaponization of the Data Center
In looking at bad bot traffic by ISP type, 60.1% came from data centers, 30.5% from residential ISPs
such as Comcast and AT&T U-verse, and 9.4% from mobile carriers such as T-Mobile and KPN Mobile.
Bad Bot Traffic by ISP Type, 2016
Bad bots make news cycle headlines when massive botnets, usually launched from
commandeered residential networks, take popular websites offline by way of a distributed denial
of service (DDoS) attack. But when it comes to (OSI) layer 7 bad bot assaults such as account
takeover and web scraping, the ISP of choice is the data center. Why?
It's never been easier to build bad bots with open source software or cheaper to launch them from
globally distributed networks using the cloud. These trends have also broadened the scope of
bad bot use cases. Advanced persistent bots (APBs) can carry out sophisticated attacks, such as
account-based abuse and transaction fraud, which require multiple steps and deeper penetration
into the web application.

Mobile: The Undefended Frontier
Looking to scrape a competitor's site? Maybe a smartphone app exists for that. Mobile ISPs accounted
for 9.4% of bad bot traffic in 2016, with the top three contributing networks being T-Mobile, AT&T
Wireless, and China Mobile.
The 10 Mobile ISPs

Uncle Sam's Bot Army
Percent of US Bad Bots vs. Rest of World, 2016
Solidifying its place as the bad bot superpower, the United States topped the list of bad bot-originating
countries for the third year in a row. In fact, the US had more bad bot traffic (55.4%) than
all other countries combined. The Netherlands, which generated 11.4% of bad bot traffic, was the next
closest country.
You might be wondering if over half of all cybercrime really comes from US citizens. The answer
is almost certainly no. Unlike the criminals of yesteryear who needed to be physically present to
commit crimes, cyber thieves have technology to do their bidding for them.

Sure, a spammer bot might originate from a US data center, but the perpetrator responsible for
it could be located anywhere in the world. Individuals building careers by attacking US web
properties generally live in countries that don't have extradition treaties with America. Thanks to
virtual private data centers such as Amazon AWS, such cyber crooks leverage US-based ISPs to
carry out their attacks as if they originated inside America.
Of the top 10 bad bot originating countries, three maintained their rank from last year: US,
Germany, and Russia. China moved up four spots to crack the top three for the first time in three
years. South Korea made the biggest jump, up 14 spots from 2015. Russia still remains near the
bottom of the top ten.
Top 10 Bad Bot Originating Countries

China: The Most Blocked Country
China is the most blocked country by Distil Networks customers, followed by Russia and Germany.
Top 10 Customer Blocked Regions, 2016
Many companies use geo-fencing blacklists to choke off large swaths of unwanted traffic. In some
cases, it simply doesn't make sense that foreign visitors would use the site, so blocking chunks of
foreign IP addresses is good hygiene. In other cases, customers have suffered attacks from countries
that didn't traditionally generate good traffic, and so have taken sensible measures to protect themselves.
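At its simplest, a geo-fencing blacklist is a membership test of the client address against a set of blocked network ranges. The following is a minimal sketch using Python's standard ipaddress module; the blocked ranges are reserved documentation networks standing in for real country allocations, and the names are hypothetical. Production systems would typically load ranges from a geolocation database instead.

```python
import ipaddress

# HYPOTHETICAL blocklist: documentation-only ranges used as placeholders
# for the country-level allocations a real deployment would load.
BLOCKED_NETWORKS = [
    ipaddress.ip_network("198.51.100.0/24"),
    ipaddress.ip_network("203.0.113.0/24"),
]


def is_blocked(client_ip: str) -> bool:
    """Return True if the client address falls inside any blocked network."""
    address = ipaddress.ip_address(client_ip)
    return any(address in network for network in BLOCKED_NETWORKS)
```

A check like this is cheap, but it only sees the connecting address, which is why traffic relayed through data centers in unblocked countries passes straight through.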

Analyzing the 2016 data, even though the US is by far the most dangerous country in terms of bad bot
traffic, China and Russia accounted for 79.9% of country-specific block requests.
Most Blocked Countries, 2016
Countries with the Highest Bad Bot GDP
By comparing the number of bad bots per online user within a given country, Distil is able to spotlight
countries having unusually high bad bot traffic. Part of the Caribbean's Windward Islands chain,
Dominica had the highest number of bad bots per online user, double its nearest rival. The US was
only the fifth highest, behind Iceland.
While high Bad Bot GDP countries like the US, Canada, and the Netherlands make sense on this list, few
would consider Dominica, Seychelles, and Mauritius hotbeds of malicious web activity.
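The "Bad Bot GDP" idea is a per-capita ratio, which is why very small countries can top the list. The sketch below assumes the metric is bad bot requests divided by online users (the report does not spell out the exact formula), and the country names and counts are invented purely for illustration.

```python
from typing import Dict, List, Tuple


def bad_bot_gdp(bad_bot_requests: int, online_users: int) -> float:
    """Bad bot activity per online user (ASSUMED definition of the metric)."""
    if online_users <= 0:
        raise ValueError("online_users must be positive")
    return bad_bot_requests / online_users


def rank_countries(stats: Dict[str, Tuple[int, int]]) -> List[Tuple[str, float]]:
    """Rank countries by bad bot GDP, highest first.

    stats maps country -> (bad_bot_requests, online_users).
    """
    scored = [(name, bad_bot_gdp(bots, users)) for name, (bots, users) in stats.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# INVENTED figures: a tiny country with modest bot traffic outranks a large one.
example = {"Smallland": (50_000, 10_000), "Bigland": (10_000_000, 50_000_000)}
ranking = rank_countries(example)
```

With these made-up numbers the small country scores 5.0 against 0.2 for the large one, despite generating a fraction of the absolute bot traffic.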

Bad Bot Per Capita, 2016
Having small populations, and thus small numbers of internet users, does help explain the
prominence of countries such as Dominica, Seychelles, and Mauritius on the list. However, Distil has previously
observed bad actors who use such tiny countries in an effort to help conceal their activities. In 2014,
we reported about an infamous Russian hacker by the name of Track 2 who was operating out of
the Maldives.
And this isn't the first time Seychelles has been connected to nefarious online activities. In 2009,
following the conviction of the owners of (the BitTorrent site) Pirate Bay, it was sold to a
Seychelles-based company [8].
If certain conditions such as good internet connectivity, lax law enforcement policies regarding
hacking, and no extradition (not to mention beautiful beaches) are met, obscure countries can be
ideal hotbeds for malicious actors.

Section 2
BAD BOTS KNOW WHAT THEY WANT
The Daily Struggle: Bad Bots in the Wild
Data from some of the most targeted victims of global bad bot activity reveals common
themes in the daily struggle facing website defenders. What worked yesterday won't work today, and
what is working today won't work tomorrow. So the key to handling the problem is understanding what bots
are gunning for on your site.
OWASP: One Taxonomy to Rule Them All
Readers may be familiar with the OWASP Top 10 that defines commonly known web application
vulnerabilities. Bad bots don't typically exploit such vulnerabilities, but rather abuse the business logic of
an application itself. To describe how bad bots do this, OWASP introduced a new taxonomy in its
OWASP Automated Threat Handbook [9].
Bad Bot Magnet: If You Build It They Will Come
Websites having one of the following attributes are most attractive to bad bots:
Unique content and/or product and pricing information
Sign up, login, and account pages
Payment processors
Web forms, such as contact, discussion forums, and reviews

You've Been Scraped
OWASP AUTOMATED THREAT: SCRAPING (OAT-011)
Web scraping extracts data in a form that can be understood and reused. Bots will often simulate
the filling of a field (e.g. an airport on a travel site) to return the desired information. As is the
case with many bots, web scrapers therefore need to mimic human activity. This makes them difficult to
differentiate from legitimate human users and good bots. In 2016, 97% of sites were victims of web
scraping bots.
How Scraping Works
Scraping can also occur on pages behind a login screen.
OAT-011 Web Scraping
STEP 1: Original Site
STEP 2: Duplicate Content
STEP 3: SEO Penalized (SEO Crawler)
STEP 4: Revenue Drops

Sign Up Pages
OWASP AUTOMATED THREAT: ACCOUNT CREATION (OAT-019)
In 2016, 82% of sites having sign-up pages were the victims of bot activity aimed at creating
fake accounts.
Bad Bots Love Login Pages
OWASP AUTOMATED THREAT: CREDENTIAL CRACKING (OAT-007)
Credential cracking is sometimes referred to as a brute force dictionary attack. That is, if
a perpetrator knows a username for a digital resource, they're able to cycle through the
dictionary to try to guess the password. This is why you'll often see sites that require
passwords with upper and lowercase letters, numbers, and special characters.
But hackers are savvy; they test for obvious substitutions such as 0 for o. Longer, more
complex passwords are harder to hack, but additional low-cost, cloud-based computing
resources are available to assist them. Symptoms of such brute force attempts include a high
number of failed login attempts and increased complaints regarding account lockouts.
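The failed-login symptom can be watched for with a sliding-window counter per username. The following is a minimal sketch, not a description of any particular product: the class name, the thresholds, and the choice to key on username (rather than IP address, which cracking bots may rotate) are all assumptions made for illustration.

```python
from collections import defaultdict, deque
from typing import Deque, Dict


class FailedLoginMonitor:
    """Flag usernames with too many failed logins inside a time window."""

    def __init__(self, max_failures: int = 5, window_seconds: float = 300.0) -> None:
        self.max_failures = max_failures
        self.window_seconds = window_seconds
        # Per-username timestamps of recent failures, oldest first.
        self._failures: Dict[str, Deque[float]] = defaultdict(deque)

    def record_failure(self, username: str, timestamp: float) -> bool:
        """Record one failed attempt; return True if the account looks under attack."""
        recent = self._failures[username]
        recent.append(timestamp)
        # Drop failures that have aged out of the window.
        while recent and timestamp - recent[0] > self.window_seconds:
            recent.popleft()
        return len(recent) > self.max_failures
```

Each call to `record_failure` takes the time of the attempt in seconds, so the same logic can be replayed over historical logs as well as run live.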

How Credential Cracking Works

OAT-007 Credential Cracking
STEP 1: Install credential cracking software
STEP 2: Pick a target site: banking site, social network, email, etc.
STEP 3: Steal confidential information, credit card information, or purchase goods
and services in stolen account
In 2016, 95.8% of websites fell prey to account credential bots on their login page. In other words,
if you sample any group of 100 websites that contain a login page, 96 of them will have been
attacked in this manner.
Think about that for a moment. If you have a login page, you are almost certainly being attacked
by bad bots. With billions of stolen login credentials available on the dark web, bad bots are busy
testing them against websites all over the globe.

Bad bots target login pages in an attempt to take over an account or steal data contained within,
such as credit card information. A hijacked account can ruin your paying customers' web experience,
and it increases costs for your business in handling related customer
service matters. Such assaults overwhelm systems and can lead to slowdowns and downtime.
OWASP AUTOMATED THREAT: CREDENTIAL STUFFING (OAT-008)
Credential stuffing exploits our propensity to reuse passwords across multiple sites. If the
infrastructure of a site owner is compromised and a list of usernames and passwords is stolen,
then that list can be leveraged to attack other websites. Of course, a responsible site owner should
encrypt such data, but many do not. Compromised admin accounts could also lead to encryption
keys being compromised as well.
How Credential Stuffing Works
OAT-008 Credential Stuffing
STEP 1: List of leaked passwords and usernames procured
STEP 2: Login info tested against target sites via login page
STEP 3: Resulting lists of confirmed login information created
STEP 4: Monetization
Option 1: Sell confirmed login creds on dark web for higher price
Option 2: Use logins to pilfer account

Protecting Your Login Page Is Not Enough
OWASP AUTOMATED THREAT: CARDING (OAT-001)

OWASP AUTOMATED THREAT: CARD CRACKING (OAT-010)
OWASP AUTOMATED THREAT: CASHING OUT (OAT-012)
That an account has been compromised may not be clear to its owner; in most cases the aim is for
an account takeover to appear, at least in the short term, as if a valid user is going about legitimate
activity. Furthermore, because of the way credentials are traded on the dark web, the criminal use of
an account may occur some time after the initial takeover.
In 2016, 90% of websites with login pages had bad bots traversing web pages behind those logins.
Why take over accounts? There are two main reasons to install a bad bot behind a login page. The
first reason is to scrape content that is only made available to registered members.
The second is transaction fraud. For example, bad bots are used to validate payment card details and
drive card not present (CNP) fraud. The quality of stolen payment card data is often unknown
and criminals do not want to waste their time using card details that will never work in the first place.
A file of stolen card details may have millions of entries. Carding [10] is a process in which bots work
through lists of stolen credit card numbers to find which ones are still valid. They do this by running
small transactions against a target merchant's online payment processor.

Even when a payment card is valid, the expiry dates may not have been stored or may be out of
date. CVC numbers are never stored. However, the range of possible values for both of these
elements is small. The current expiry date, in the format of month and year, only represents 30 to
40 values to test.
The CVC code is a three-digit number so only 1,000 values are possible (000 to 999). Bad bots
can be used to test the range of possible values against a merchant's online payment process to
identify the missing values in a process known as card cracking [11].
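To put those ranges in perspective, the arithmetic below compares two worst cases. It is a back-of-the-envelope sketch under stated assumptions: the report does not say whether a payment process reveals which field was wrong, so both cases are shown, and the function names are illustrative.

```python
CVC_VALUES = 10 ** 3  # three digits: 000 to 999


def joint_guesses(expiry_candidates: int) -> int:
    """Worst case if expiry date and CVC must both be right in a single attempt."""
    return expiry_candidates * CVC_VALUES


def independent_guesses(expiry_candidates: int) -> int:
    """Worst case if each field can be confirmed separately (ASSUMED scenario)."""
    return expiry_candidates + CVC_VALUES
```

With 30 to 40 candidate expiry dates, the joint case tops out at 30,000 to 40,000 attempts and the independent case at little more than 1,000, both trivial volumes for automated traffic. This is why limits on failed payment attempts matter.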
Once a criminal has an enriched set of validated payment card records, they can really get
going. Using a single payment card record for a large standalone fraudulent transaction may be
tempting but is more likely to come under scrutiny and may require secondary authentication.
Another tactic is to use bots to process high volumes of small value transactions, which OWASP
calls cashing out [12].
For straight financial transactions this is usually a transfer of funds via a mule account which
serves as a buffer account to anonymise the fraud. Other ways of monetising validated payment
card details include VAT refunds claimable on some types of purchases, and product return
fraud, where goods are purchased and returned with a refund request that's siphoned off from
the card holder's account. Items that can be delivered electronically, such as event tickets and
equities, are popular targets.
It shouldn't come as a shock that our data shows bad bots were running rampant behind login
pages in 2016. In our sample of sites with proprietary content and payment processing, 77.2% of
bad bot activity happened inside the account and 22.8% happened on the login page.

Bad Bot Login Requests vs Bad Bot Requests Once Logged In, 2016
It is tempting to consider this finding a damning indictment of login security. It isn't that simple.
You can't prevent a nefarious actor from signing up for an account manually and then handing
the account over to a bot. Few websites require more than a valid email address before
granting membership.

Application Denial of Service is Not Volumetric DDoS
OWASP AUTOMATED THREAT: DENIAL OF SERVICE (OAT-015)
Bad bots love scraping unique content or using stolen credentials to take over accounts, but the
cascading secondary effects like Denial of Service and Skewing can be just as damaging.
With a volumetric DDoS attack the website is flooded, preventing access to its services. It's a layer 3
attack and easy to spot; it can flood your upstream infrastructure to the point where the packets never
arrive at the web server.
In contrast, an application denial of service event occurs when bots programmatically abuse the
business logic of your website. This happens at layer 7, so you won't notice it on your firewall and
your load balancer will be just fine. It's the web application and backend that keel over.
For example, if traffic to your home page triples, you can handle it. The same amount of traffic to a
shopping cart page will incur a much higher computational hit, because the requests impact inventory
and cross-sell databases, as well as payment processing, fraud, and other tools. It doesn't take much
traffic on that page for an application DoS attack to take hold.
A third (33%) of all websites experienced unexpected spikes in bad bot traffic that can lead to
slowdowns or downtime. We've defined such a spike as an event that equates to at least three times
the 30-day rolling daily average of bad bot traffic.
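The spike definition translates directly into a rolling-average check. The sketch below is one reading of it, with a few assumptions labeled: the baseline is the mean of the 30 days preceding the day being tested (the current day is excluded), the first 30 days are not evaluated, and the function name is hypothetical.

```python
from typing import List, Sequence


def find_spikes(daily_counts: Sequence[float], window: int = 30, factor: float = 3.0) -> List[int]:
    """Return indexes of days whose bad bot traffic is >= factor x the rolling mean.

    ASSUMPTION: the rolling mean covers the `window` days before the tested day.
    """
    spikes: List[int] = []
    for day in range(window, len(daily_counts)):
        baseline = sum(daily_counts[day - window:day]) / window
        if baseline > 0 and daily_counts[day] >= factor * baseline:
            spikes.append(day)
    return spikes
```

For instance, thirty days of 100 bad bot requests followed by a day of 350 would flag that day, since 350 is at least three times the baseline of 100.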

All Your Web Analytics Are Wrong
OWASP AUTOMATED THREAT: SKEWING (OAT-016)
Businesses rely on website metrics, such as visits, conversions, and other important analytical
data. In 2016, bad bots skewed such analytics on 94% of sites. This affects decision making within
organizations and can potentially lead to incorrect investments.

Vulnerability Scanners Are Everywhere
OWASP AUTOMATED THREAT: VULNERABILITY SCANNING (OAT-014)

OWASP AUTOMATED THREAT: FOOTPRINTING (OAT-018)
OWASP AUTOMATED THREAT: FINGERPRINTING (OAT-004)
Vulnerability scanners are intrinsically bots; they're scripts that run against websites in search of
security vulnerabilities. Contracted whitehat pentesters use them to help assess enterprise security,
but unauthorized vulnerability scanners were detected on 88% of all sites.
By constantly scanning your website, bots can immediately act on the latest unpatched zero-day
threat or flawed code release.

Spamming is a Grievous Annoyance
OWASP AUTOMATED THREAT: SPAMMING (OAT-017)
For 31% of websites with web forms such as contact, discussion forums, and reviews, spam was a
frustrating reality last year. Form spam damages the customer experience, affects brand perception,
and can divert traffic away from your site. For any organization attempting to police form spam, it is a
time-consuming and costly task.
Bad Bot Sophistication Levels
For this report, we have introduced a new classification system that designates the sophistication
level of each of the four bad bot types:
SIMPLE - These are the most bot-like. Connecting from a single, ISP-assigned IP address, they
connect to sites using automated scripts, not browsers, and don't self-report as being browsers.
MODERATE - This type is human-like. They use headless browser software that
simulates browser technology, including the ability to execute JavaScript.
SOPHISTICATED - These are the most human-like. They use browser automation
software, or malware installed within real browsers, to connect to sites.
ADVANCED PERSISTENT BOTS (APBs) - APBs are a combination of moderate and
sophisticated bad bots, and also tend to cycle through random IP addresses, come in through
anonymous proxies and peer-to-peer networks, as well as change their user agents.
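As a rough sketch, the classification above could be expressed as a small decision function. The field names in `BotTraits` are hypothetical signals, how each one is detected is out of scope, and treating IP or user agent rotation as the deciding APB trait is an assumption drawn from the description rather than Distil's actual method.

```python
from dataclasses import dataclass

@dataclass
class BotTraits:
    # Hypothetical signals; how each one is measured is not covered here.
    executes_javascript: bool = False   # e.g. headless browser software
    uses_real_browser: bool = False     # browser automation or in-browser malware
    rotates_ips: bool = False
    rotates_user_agents: bool = False

def sophistication(traits: BotTraits) -> str:
    """Map observed traits to one of the three sophistication levels."""
    if traits.uses_real_browser:
        return "sophisticated"
    if traits.executes_javascript:
        return "moderate"
    return "simple"

def is_apb(traits: BotTraits) -> bool:
    """Assumption: an APB is a moderate or sophisticated bot that also
    cycles through IP addresses or changes its user agent."""
    evasive = traits.rotates_ips or traits.rotates_user_agents
    return sophistication(traits) != "simple" and evasive

print(sophistication(BotTraits(executes_javascript=True)))          # prints moderate
print(is_apb(BotTraits(executes_javascript=True, rotates_ips=True)))  # prints True
```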

Bad Bot Sophistication Levels, 2016
Simple bots arrived early in internet history and caused enough damage that website operators took
measures to thwart them. Using a web application firewall (WAF), security professionals could easily
blacklist bad IPs and user agents. In 2016, simple bad bots accounted for 25% of miscreant web traffic.
The majority of nefarious traffic (46.4%) came from bad bots classified as moderate. One of their
characteristics is their ability to execute JavaScript (52.05% of bad bots in 2016). The most difficult to
detect are the sophisticated bots, which comprised 28.7% of bad bot traffic last year.
Most alarmingly, three quarters of all bad bot traffic in 2016 came from advanced persistent bots that
use a mix of technologies and techniques to evade detection and maintain persistence on target sites.
Another characteristic of APBs is that they're able to easily cycle through IP addresses and switch
their user agents. Simple blacklisting of IPs is wholly ineffective against APBs.
One of the side effects of increasing bad bot sophistication is their ability to carry out significant
attacks using fewer requests. APBs delay requests and stay under request rate limits. This method,
known as "low and slow," reduces the noise of many bad bot attack campaigns.
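To illustrate why pacing defeats simple rate limiting, here is a minimal sketch of a sliding-window limiter for a single client. The class name, the limit of 10 requests per 60 seconds, and both traffic patterns are hypothetical values chosen for the example, not figures from the report.

```python
from collections import deque

class SlidingWindowLimiter:
    """Allow at most `max_requests` per `window_seconds` for one client."""

    def __init__(self, max_requests, window_seconds):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self.timestamps = deque()

    def allow(self, now):
        # Drop timestamps that have aged out of the window.
        while self.timestamps and now - self.timestamps[0] >= self.window_seconds:
            self.timestamps.popleft()
        if len(self.timestamps) >= self.max_requests:
            return False
        self.timestamps.append(now)
        return True

def blocked_count(request_times, max_requests=10, window_seconds=60):
    """Count how many of the given request times the limiter rejects."""
    limiter = SlidingWindowLimiter(max_requests, window_seconds)
    return sum(not limiter.allow(t) for t in request_times)

burst = [i * 0.1 for i in range(100)]  # 100 requests within 10 seconds
paced = [i * 7.0 for i in range(100)]  # 100 requests, one every 7 seconds

print(blocked_count(burst))  # prints 90
print(blocked_count(paced))  # prints 0
```

Both clients send the same 100 requests, but only the burst trips the limit. The paced client completes its campaign without a single rejection.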

Conclusion
Do you have proprietary content like pricing information? Do you have logins? Do you have payment
processors? Do you have web forms? These are the four website attributes that attract bad bots.
Take time to learn about these areas of your website and find out if they are all properly secured.
One way to choke off bad bots is to geo-fence your website by blocking users from foreign nations
where your company doesn't do business. China and Russia are a good start.
Ask yourself if there is a good reason for your users to be on browsers that are several years past
their release date. Having a whitelist policy that imposes browser version age limits stops up to 10%
of bad bots.
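A browser age limit of this kind might look like the following sketch. The browser name, version numbers, and release dates in `RELEASE_DATES` are placeholders, not real data, and the three-year cutoff is an assumed policy value. A real deployment would need a maintained table of actual release dates.

```python
from datetime import date

# Hypothetical release dates for illustration only.
RELEASE_DATES = {
    ("ExampleBrowser", 10): date(2011, 3, 1),
    ("ExampleBrowser", 55): date(2016, 11, 1),
}

def within_age_limit(browser, major_version, today, max_age_years=3):
    """Return True if the self-reported browser version is recent enough."""
    released = RELEASE_DATES.get((browser, major_version))
    if released is None:
        # Whitelist policy: versions not in the table are rejected.
        return False
    age_days = (today - released).days
    return age_days <= max_age_years * 365

today = date(2017, 1, 1)
print(within_age_limit("ExampleBrowser", 55, today))  # prints True
print(within_age_limit("ExampleBrowser", 10, today))  # prints False
```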
Also ask yourself if all automated programs, even ones that aren't search engine crawlers or pre-approved
tools, belong on your site. Consider creating a whitelist policy for good bots and setting
up filters to block all other bots. Doing so blocks up to 25% of bad bots.
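The whitelist policy for good bots can be sketched as a simple decision rule. This assumes a separate detection step has already decided whether a request is automated and has verified the bot's identity. The bot names in `ALLOWED_BOTS` and the function `decide` are hypothetical. Because bad bots lie about their identity, a self-reported user agent alone would not be a safe input for `verified_bot_name`.

```python
# Hypothetical names of bots the site operator has approved.
ALLOWED_BOTS = {"examplesearchbot", "examplemonitor"}

def decide(is_automated, verified_bot_name=None):
    """Allow humans and approved bots; block every other automated client."""
    if not is_automated:
        return "allow"
    if verified_bot_name is not None and verified_bot_name.lower() in ALLOWED_BOTS:
        return "allow"
    return "block"

print(decide(False))                     # prints allow
print(decide(True, "ExampleSearchBot"))  # prints allow
print(decide(True, "UnknownScraper"))    # prints block
```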
These tips will help you filter out bad bots. However, if you don't have a system in place that can
uniquely identify, distil, and respond to all your web and mobile traffic in real time, you won't see the
next bad bot attack coming even though it's all over your site.
In the wake of such an attack, your content will be all over the web, whatever you're storing in
customer accounts will be leaked, your website performance analytics will be useless for marketing,
and your site might be down.

References
1. The OSI Model's Seven Layers Defined and Functions Explained
https://support.microsoft.com/en-us/help/103884/the-osi-model-s-seven-layers-defined-and-functions-explained
2. SYN flood (half open attack)
searchsecurity.techtarget.com/definition/SYN-flooding
3. The 2016 Better Online Ticket Sales Act and Advanced Persistent Bots
https://resources.distilnetworks.com/all-blog-posts/beating-ticket-bots
4. The 2016 Economics of Web Scraping
https://resources.distilnetworks.com/distil-white-papers/distil-networks-2016-economics-of-web-scraping
5. OWASP Automated Threat Handbook Web Applications
www.owasp.org/images/3/33/Automated-threat-handbook.pdf
6. Internet Users by Country (2016)
internetlivestats.com/internet-users-by-country
7. Alexa
Alexa.org/topsites

8. Reservella: The shadowy company behind The Pirate Bay
https://arstechnica.com/tech-policy/2009/10/who-owns-the-pirate-bay-part-ii/
9. OWASP Automated Threat Handbook Web Applications
www.owasp.org/images/3/33/Automated-threat-handbook.pdf
10. Carding (OAT-001): multiple payment authorisation attempts used to
verify the validity of bulk stolen payment card data
11. Card Cracking (OAT-010): identify missing start/expiry dates and CVC codes
for stolen payment card data by trying different values.
12. Cashing Out (OAT-012): buy goods or obtain cash utilising validated
stolen payment card or other user account data.

About Distil Networks
Distil Networks, the global leader in bot detection and mitigation, is the only proactive and
precise way to identify and police malicious website traffic, mitigating 100% of OWASP
Automated Threats without impacting legitimate users.
Distil protects against web scraping, account takeovers, competitive data mining, online fraud,
unauthorized vulnerability scans, spam, man-in-the-middle attacks, digital ad fraud, API abuse,
and application denial of service.
Slash the high tax that bots place on your internal teams and web infrastructure and make
your online applications more secure with API security, real-time threat intelligence, an
analyst-managed service, and complete visibility and control over human, good bot, and bad bot
traffic. For more information on Distil Networks, visit us at www.distilnetworks.com or follow
@DISTIL on Twitter.
Contact Us
For help mitigating account takeovers and other automated threats, please visit
www.distilnetworks.com, call 415-423-0831 or email sales@distilnetworks.com to
speak with one of our security experts.

CONFIDENTIALITY STATEMENT
2017 Distil Networks. All rights reserved. The Distil and Distil Networks names and logos and
all other names, logos, and slogans identifying Distil's products and services are trademarks
and service marks or registered trademarks and service marks of Distil Networks, Inc., or its
affiliates in the United States and/or other countries. All other trademarks and service marks
are the property of their respective owners.
