

From the Editor-in-Chief

Special systems: Overlooked sources of security risk?

After the events of September 11, 2001, increasing attention is being paid to security in special types of automated systems: systems that provide physical access control, SCADA systems, plant process control systems, and so on. The consequences of any security breach in these types of systems are, after all, potentially catastrophic. Consider, for example, the consequences of a compromise of a physical access control system in a nuclear power plant. If a saboteur were able to gain unauthorized physical access, untold damage and loss of life could easily occur.
Information security professionals and auditors generally
focus on the security risk associated with these types of systems, as they rightfully should. At the same time, however, I
sometimes wonder if their risk analyses for such systems
are sufficiently complete. These types of systems for the
most part were originally developed for and deployed in nonnetworked environments. Risk analyses in past decades thus
did not need to consider the risks accruing from network connectivity: the only plausible attack scenarios involved perpetrators who were able to gain physical access to the systems.
Today, however, the situation has changed considerably in
that nearly all such systems are connected to some type of
network, potentially exposing these systems to all kinds of remote attacks. Worse yet, many of these systems are not connected to some kind of air-gapped network that insulates
internal traffic from the outside world and vice versa. Many
are instead Internet connected, resulting in far greater levels
of security-related risk than were previously ever envisioned.
It has been my experience that security and other professionals are generally not oblivious to the perils of special
systems being connected to the outside world. Yet at the
same time, I have learned of incidents in which these systems
have been accessed over the Internet without authorization,
resulting in highly negative outcomes. In one case a remote
perpetrator broke into a system that controlled lighting levels
in a building; the perpetrator had a heyday changing the lighting levels back and forth until one of the administrators of this
system finally determined what the problem was and cut off
the perpetrator's access. Needless to say, neither the owner
nor the administrator of this system had anticipated that
this kind of thing could happen.

There is, however, another facet of the risk associated with SCADA, process control, physical security and other systems that is lamentably almost universally overlooked: the relationship of these special systems to security risk associated
with other networked systems and devices. It is almost as if
imagined security breach scenarios end, i.e., the game is
over, so to speak, if a perpetrator breaks into one or more
of these systems, yet the game will only have begun in
many cases. Perpetrators could easily use systems such as
process control systems that they have compromised to
launch vulnerability scans, perpetrate denial of service attacks, intrude into other networked systems, steal valuable
and sensitive data from databases and files, and so on.
The bottom line is that risk analysis performed on special systems must take into account not only the risk associated with the outcomes of these systems themselves
becoming compromised, but also the potential risk of these
systems being used against other networked systems and devices. To do anything less is to perform an incomplete risk
analysis. At the same time, however, no one would rightfully
expect that the few paragraphs in this editorial would persuade the majority of security professionals and auditors to
expand their view and focus when they assess risks associated with special systems. War stories, case studies of
real-life incidents in which these systems have been used
without authorization to launch attacks against other systems
and network devices, will in contrast have a much greater
effect. I thus invite and encourage readers to submit papers
that describe these kinds of war stories (without attribution
or references to the organizations in which they have
occurred, of course) to Computers & Security.
E. Eugene Schultz Ph.D., CISSP, CISM
Editor-in-Chief
E-mail address: eeschultz@sbcglobal.net
4 March 2006

Security views
1. Malware update

A Visual Basic worm with multiple names, Blackmal.e, MyWife.d, and others, is infecting Windows systems. It arrives
in the form of an attachment that can be an executable file or
a Multipurpose Internet Mail Extensions (MIME) file with an
embedded executable file. It replicates via shared folders and
tries to stop a number of different security-related programs.
The CME-24 worm (also called the Blackmal and Nyxem
worm) is destructive; it is programmed to destroy files on infected
Windows systems on February 3, 2006. This worm causes systems that it infects to go to an on-line counter Web site. Easynet, a UK-based Internet service provider (ISP), is monitoring
the counter traffic to determine whether any of its customers'
systems are infected and is warning customers accordingly.
A link to proof-of-concept exploit code for Mac OS X has
been posted on the Internet. This code, which purports to contain screenshots of Mac OS X Leopard version 10.5, sends itself to other
systems via the iChat instant messaging system. Fortunately,
this code does not have a malicious payload.
The Mare-D worm capitalizes on vulnerabilities in XML-RPC
for PHP and Mambo to infect Linux systems. This worm can also
install an IRC-controlled backdoor on systems that it infects.
Mare-D has been deemed to not pose much risk, however.
A newly detected proof-of-concept worm exploits a vulnerability in Apple's Bluetooth implementation. Apple released
a patch for this vulnerability in the middle of last year. The
worm replicates by searching for other Bluetooth-enabled
devices and then sends a copy of itself to devices that it finds.
Two of the above news items were about vulnerabilities in
Mac OS X. For years people have paid little attention to security
in Mac OS X systems, for the most part I suspect because few perpetrators have been interested in creating exploit code for these
systems. The trend appears to be changing; an increasing number of exploits are being produced and break-ins into Mac OS X
systems are becoming increasingly commonplace. The big question is whether the Mac user community, which has so far been
relatively complacent about security issues, will start to become
interested in Mac OS X security and, more importantly, do what is
needed to harden these systems sufficiently to make them less
vulnerable to attack.

2. Update in the war against Cybercrime

Ang Chiong Teck, a student at Nanyang Technological University in Singapore, has received a jail sentence of four months

for selling illegal copies of Microsoft software. Purchasers of this software had complained that they did not get the codes
necessary for on-line registration and downloading updates.
Additionally, the illegal copies even had counterfeited certificates of authenticity. When law enforcement investigated,
Teck quickly became a suspect. He had SGD 20,000 worth of counterfeit software in his possession when he was arrested. His
sentencing was delayed to allow him to take his university
examinations first.
Spanish law enforcement has arrested a man on charges
that he broke into a computer at the US Navy base in
San Diego, California. His house was searched, resulting in
seizure of a computer and other potential evidence. He is
suspected of belonging to a ring that may have broken into
more than 100 computers, causing a financial loss of more
than USD 500,000.
Daniel Lin, one of four people charged for allegedly using
compromised computers to send volumes of spam, is
expected to plead guilty to the charges during his forthcoming
court appearance. Having admitted to using government and
commercial networks to send spam, Lin reached a deal with
federal prosecutors that will result in his receiving a prison
sentence of between 24 and 57 months. Without the deal Lin
would have gotten a much longer sentence. The accused individuals allegedly used proxies with bogus return paths to
transmit spam, a violation of the CAN-SPAM Act.
A federal grand jury has indicted Joseph Nathaniel Harris,
the former office manager of the San Jose Medical Group in
California, on multiple charges that resulted from stealing
computers and DVDs on which patient records were stored.
He allegedly broke into the medical group's office after he
quit his job. Harris could receive a prison sentence of up to
10 years and a fine of up to USD 250,000.
The well-known Web site of UK student Alex Tew, MillionDollarHomepage.com, has been hit with a barrage of denial-of-service (DoS) attacks launched by cyber extortionists. Tew
set up this site to pay for his education by selling individual
pixels to advertisers. Having earned more than USD 1 million
so far, Tew recently revealed in his blog that people have
demanded a substantial amount of money in return for
leaving his site alone. Law enforcement says that Russians
are responsible for the attacks.
Eight Bulgarians have been arrested on charges that they
conducted a phishing scheme. The group allegedly ran numerous bogus Microsoft Web sites used in connection with
phishing email that was sent with falsified addresses to look as if it had been sent by Microsoft account billing. Recipients were urged to enter credit card information; perpetrators
allegedly used any of this information that they obtained to
purchase goods and to make wire transfers.
Jeanson Ancheta has pleaded guilty in Los Angeles federal
court to charges related to establishing a botnet consisting of
hundreds of thousands of compromised computers. He allegedly offered to sell the botnet's capabilities to enable others to
send spam and launch distributed denial-of-service (DDoS) attacks. According to the terms of a plea bargain, Ancheta would
agree to serve four to six years in prison, repay USD 58,000, forfeit a BMW, and pay USD 19,000 of restitution. The judge in the
case has not, however, approved the plea bargain. Ancheta
will be sentenced on May 1.
A federal grand jury in Seattle has named Christopher
Maxwell in an indictment related to his allegedly illegally creating and using a botnet. Maxwell and two alleged partners
reportedly made USD 100,000 by using the botnet to install
adware. Maxwell and the other two were additionally indicted
on charges of perpetrating a botnet attack on a hospital in the
Seattle area that rendered doctors' pagers inoperable and closed down the hospital's intensive care unit. If convicted
of the charges that he faces, Maxwell could receive up to
a 10-year prison term and a fine of USD 250,000.
The British High Court has ordered 10 Internet service providers (ISPs) to reveal the identities of 150 individuals who may
have engaged in illegal file sharing. These ISPs must hand over
names, addresses and other personal information about the alleged illicit file sharers to the Federation Against Software Theft.
The British High Court has also directed two UK men to pay a total of GBP 6500 for making almost 9000 songs available via peer-to-peer sharing. Cases against three other individuals are also pending. The British Phonographic Industry (BPI) has initiated these cases; defendants must also pay the BPI's court costs, which so
far have amounted to more than GBP 20,000.
The US District Attorney's Office in Chicago has obtained
indictments against 19 individuals who are allegedly part of
an international group involved in piracy. US law enforcement
is pressing for extradition of two alleged members of this
group. Prosecutors say that this group illegally copied and distributed software, games and movies valued at more than
USD 6.5 million, even though currently it appears that personal usage, not financial gain, may have been the primary
motivation for the group's activities. The alleged members of this group will be charged with conspiracy to infringe on
copyrights. If they are convicted, they may have to serve up
to five years of imprisonment, pay a fine of up to USD
250,000, and pay restitution.
Canada's biggest record company, Nettwerk Music Group,
has announced that it will pay to defend David Gruebel,
whom the Recording Industry Association of America (RIAA)
has sued for allegedly having music that was illegally downloaded. The chief executive for Nettwerk, Terry McBride,
said: "The current actions of the RIAA are not in my artists' best interests. Litigation is a deterrent to creativity ... and it is hurting the business I love." After hiring a Chicago law
firm to defend Gruebel, Nettwerk asserts that it will cover
any fines resulting from losing the case if this outcome occurs.
The RIAA wants a USD 9000 fine in this case, but will agree to
USD 4500 if the fine is paid before the ruling date.

A new phishing ploy has surfaced in the US. A message that purports to be from the Internal Revenue Service (IRS) includes
a URL for a Web site that appears to inform users of the status of
their taxpayer refunds. Users are tricked into revealing their
names, Social Security Numbers (SSNs), and credit card data.
Israeli natives Ruth and Michael Haephrati have both been
extradited from the UK to Israel on charges that they created
and distributed a spyware tool designed to pilfer competitors'
information. The couple reportedly obtained GBP 2000 for
each copy of this tool. Twenty other individuals in Israel and
the UK have been arrested for being involved in the Haephratis' alleged activities.
Christopher William Smith has been ordered to pay America Online (AOL) more than USD 5 million for damages caused
and legal fees incurred as the result of his sending out volumes of spam. AOL originally sued Smith in 2004 under provisions of the CAN-SPAM Act. Smith must also face drug
violation-based charges.
William Genovese Jr. received a sentence of two years of
imprisonment for selling source code for two Windows operating systems after he pleaded guilty last year to one count
of illegal distribution of trade secrets. He has previously
been convicted of criminal charges 12 times; three of these
convictions are for computer crime-related activities. After
he serves his sentence, he must serve three years of supervised release in which he must continuously run programs
that monitor his Internet activity on his computer.
Japanese law enforcement has arrested Atsushi Takewaka
on charges that he and an alleged accomplice, Kiichi
Hirayama, created spyware that they deployed to steal passwords used for Internet banking. Hirayama allegedly persuaded Takewaka to create the spyware; Hirayama allegedly
sent CD-ROMs containing the spyware to certain corporations. Some of the recipients installed the spyware on their
computing systems. The two individuals then allegedly used
passwords that they pilfered to steal money from bank
accounts.
The US District Attorney's Office in Los Angeles has
arrested Jeffrey Brett Goodin on the grounds that he allegedly
perpetrated a phishing scheme designed to fool America
Online (AOL) users into disclosing their credit card information. Bogus email messages in connection with this scheme
informed AOL users that they needed to update their billing
information and that they had to go to a certain Web site to do
so. The site was then used to glean financial information
that unsuspecting users entered. The stolen information
was used to run up fraudulent charges. The charges against
Goodin include wire fraud and illegal use of an access device.
If he is convicted, he could get a prison sentence of up to 30
years.
A Spanish man who used a computer worm three years ago
to perpetrate a denial-of-service attack has received a sentence of two years in jail and a fine of EUR 1.4 million. Santiago
Garrido launched the attack, which interrupted Internet
service for millions of Spaniards. His motivation was
revenge: he had previously been banned from using an IRC
chat room.
Russian computer criminals may be using Trojan horse
programs in an effort to steal money from French bank accounts. More than EUR 1 million has reportedly already been stolen. The programs reportedly come embedded in email messages and are also downloaded from malicious Web sites.
Remaining dormant until users access their on-line banking
accounts, the programs supposedly glean passwords and
other financial data and send them to the perpetrators. The
perpetrators are also allegedly working in connection with
people who allow stolen funds to go through their accounts
in return for a fee of up to 10% of the proceeds.
Honeywell International has lodged a civil complaint
against a former employee, Howard Nugent, who it claims disclosed information about 19,000 company employees on the
Internet. A US District Court judge in Arizona has ordered
him to not divulge the Honeywell information. Papers filed
in the court case indicate that Honeywell's computers were not broken into; Nugent allegedly instead exceeded the level
of access allowed to him.
At a hearing last month, UK district judge Nicholas Evers
decided to deny the US request to have UK citizen Gary
McKinnon extradited unless the US promises that McKinnon
will not be considered a terrorist. McKinnon allegedly intruded into US Department of Defense (DOD) and NASA computers. The judge said that he worries that in the US terrorist
suspects can be tried under military law.
Brazilian federal police have arrested 41 individuals who
allegedly used a Trojan horse program to pilfer BRL 10 million
from 200 accounts at six banks. The suspects allegedly sent
the program in email messages. Twenty-four additional
suspects are still wanted.
Stephen Sussich of Australia has been fined AUD 2000 and
must pay AUD 3000 in restitution because he installed a rootkit
on a server owned by Webcentral, an Australian company.
Sussich pleaded guilty to two counts of unauthorized data
modification to cause harm. Sussich's motivation does not appear to have been financial; apparently he did not even access
credit card data.
Luis Ochoa of California has been arrested for allegedly
uploading an Academy Award-nominated movie to the Internet. Law enforcement set up a sting operation after someone
informed the Motion Picture Association of America (MPAA)
that he had revealed in a chat room that he was going to upload the movie. The movie had a watermark that indicated it
was a screener copy meant for viewing only by individuals
with Academy voting privileges. If convicted of all charges
that he faces, he could be sentenced to one year of imprisonment and a fine.
Former CA chief executive officer Sanjay Kumar faces
charges resulting from his allegedly deleting information
from his laptop's hard drive. The information could conceivably have been used as evidence in the accounting debacle that led to Kumar's exit from CA. The US District Court in Eastern New York has stated that it intends to submit evidence that Kumar reformatted his laptop to run the Linux operating system, thereby destroying the contents of the laptop's hard drive. Kumar's action allegedly occurred after the government
investigation had started and after a memorandum ordered
CA employees to preserve all pertinent data. Kumar was
indicted after a government investigation into dubious
accounting practices at CA.
Scott Levine of Florida, former CEO of Snipermail.com, a bulk email company, has received a sentence of eight years of imprisonment for intruding into Acxiom Corporation's database of consumer data and then pilfering more than one
billion records. He was convicted of 120 counts of illegal access
to a computer connected to the Internet, two counts of device
fraud, and one count of obstruction of justice. No evidence
that Levine used the data to perpetrate identity fraud exists.
Levine must also pay a fine of USD 12,300 as well as restitution;
the exact amount of restitution has not yet been determined,
however.
Police raids in Switzerland and Belgium have closed down
Razorback2, one of the largest index servers within the
eDonkey file sharing network. According to the RIAA, these
servers held an index of approximately 170 million illegally
copied files. The server's owner has been arrested and the
equipment has been seized.
The length of the War against Cybercrime portion of
Security Views seems to continually grow; more news items
concerning investigations, arrests and sentences for computer crime are being covered. Law enforcement and the legal
system in an increasing number of countries appear to be coming up to speed when it comes to dealing with computer
crime. At the same time, however, computer crime perpetrators seem to constantly create new ways of doing their evil
deeds. Additionally, a rapidly increasing number of computer
criminals seem to be surfacing for various reasons, chief among which is the promise of a great amount of financial gain, generally without all that much risk to them. So while
it is good to see some progress in the effort against computer
crime being made, computer crime is, unfortunately, inevitably going to become more prevalent.

3. More compromises of personal and financial information occur
An Ameriprise Financial employee's laptop that contained
customer information was stolen out of a car. Ameriprise
Financial has sent letters to 158,000 customers informing
them accordingly. No customer SSNs were stored on the laptop, but unfortunately a file that contained the names and
SSNs of 68,000 current and former financial advisers was.
Providence Home Services is notifying 265,000 current and
former patients that their medical information fell into unauthorized hands when disks and tapes containing this information were pilfered from the car of an employee. Information
about numerous current and former employees was also on
the stolen disks and tapes. No evidence that the stolen information has been used for identity fraud purposes exists. Having
employees take home disks and tapes is a standard business
continuity-related procedure for Providence Home Services.
Providence Home Services has set up a hotline to answer inquiries from those whose information was compromised.
A perpetrator gained unauthorized access to a computer at
the University of Delaware's School of Urban Affairs and Public Policy; SSNs of 159 graduate students were stored on that
system. Additionally, someone pilfered a backup hard drive
from the University's Department of Entomology and Wildlife
Ecology; the hard drive contained personal data. The university has notified all affected individuals.

The University of Notre Dame is looking into a break-in into a server on which confidential data about financial
donors is stored. The attack was discovered early this year.
The compromised server, which was taken offline, was not
connected to central databases at the university; it is being
forensically analyzed. The school has informed individuals
whose information was compromised.
The University of Northern Iowa has sent letters to 6000
staff members telling them that information about them in
Internal Revenue Service W-2 forms was potentially exposed when a laptop's security was compromised. University
administrators say that there is no evidence that any of the
information was actually accessed, however. Staff members
were encouraged to closely watch their financial accounts in
case of identity theft attempts.
Canterbury University in New Zealand has terminated all
on-line access to student records after learning that students
were able to see other students' records while they enrolled
on-line. The University is trying to determine the source of
the problem.
Perpetrators reportedly broke into the Rhode Island
government Web site, www.RI.gov, and stole credit card data
belonging to individuals who had engaged in on-line business
with Rhode Island state agencies. The credit card data were
encrypted. Several individuals boasted of these deeds on a Russian-language Web site over a month ago. A spokesperson for the Rhode Island Web site said that security for the site complies with the Payment Card Industry's Data Security Standard. Technical staff members have patched the vulnerability that the attackers exploited.
Credit card and bank routing information of up to nearly
one quarter million Boston Globe and Worcester Telegram &
Gazette subscribers was leaked when internal reports containing this information were recycled for use as routing
receipts for bundles of newspapers. When it found out what
had happened, the Globe sent delivery staff to gather the routing receipts, but was successful in retrieving only a fraction of
them. The Globe has advised credit card companies and financial institutions concerning the incident and intends to send
notification letters to its subscribers. This company has also
set up a hotline to enable customers to learn whether or not
their financial information was leaked.
A Boston investment bank has reported that it has been getting FAXes from Brigham and Women's Hospital with patient medical information (SSNs, medical test results, and more) concerning women who had recently given birth at this facility. The bank's finance manager has been destroying every FAX copy and has notified the hospital several times, but until recently to no avail; the FAXes kept coming for six
months. The documents contain a great deal of personal
information, including SSNs and medical test results. The hospital plans to inform the patients whose data were exposed.
The FBI is looking into unauthorized changes in a MySQL
database on which an electronic medical record system at
an orthopedics clinic is based. Orthopedics Northeast (ONE),
which is based in Indiana, stumbled onto the problem when
severe performance decrements occurred three months ago.
Technical staff concluded that the changes were apparently
made by someone who had gained unauthorized access to
the system. The original path of entry was through a virtual private network (VPN); the attacker reached a proxy server and then exploited a backdoor in WebChart software made
by Medical Informatics Engineering (MIE). The attacker
appended characters to a database query, causing the server
to crash, and also erased a print server directory.
Security breaches at a Wal-Mart store and also at an OfficeMax
store in California prompted the Bank of America (BoA), Washington Mutual Bank, and a credit union to void about 200,000
debit cards. The BoA sent letters to potentially affected customers; the letters explained that their debit cards were canceled and urged them to be on the alert for any unauthorized
transactions. Fortunately, so far no indications of customer account compromises or identity theft have surfaced. Neither
store has commented on the incidents. The FBI and the Secret
Service have launched an investigation.
John Lynch, Governor of New Hampshire, announced that
one of the states computers was accessed without authorization. The perpetrators may have tried to obtain credit card
account information about New Hampshire residents. At
risk is information concerning computer and in-person transactions at state liquor stores, motor vehicle offices, and other
places. Individuals who have used credit cards to buy something from one of these places during the last six months
were advised to watch for fraudulent transactions. The incident was discovered by technical staff, who found a Trojan
horse monitoring program running on the system.
A McAfee spokesperson reported that Deloitte & Touche,
the company's external auditing firm, lost a CD that contained
the names, SSNs and McAfee stock holdings of a large number
of current and prior McAfee employees. Acknowledging that
an employee left the unlabelled CD in the seat back pocket
on an airplane, Deloitte & Touche informed McAfee about
the lost disk early this year. All potentially affected employees
have been informed accordingly.
So many compromises and potential compromises of
personal and financial data have occurred that I am not sure
exactly where to start in commenting on this last round of
lamentable incidents. I have a very difficult time understanding why some organizations are so deficient in their security
and other practices. Consider, for example, all the organizations that allow unencrypted personal and financial data to
be stored on employee laptops or CDs that employees carry
around with them. Providence Home Services' business continuity practice of having employees take backup disks and tapes home with them certainly qualifies as a worst practice anywhere, as does Brigham and Women's Hospital's sending FAXes containing patient medical information to the wrong destination and then not ceasing to do so when informed of the problem. In any case, it is clear that many organizations have a long way to go when it comes to securing financial and personal data. As such, in the future we are very likely to see an increasing amount of news related to data security breaches.

4. Violence Against Women and DOJ Reauthorization Act bans annoying postings and messages
By signing the Violence Against Women and Department of
Justice Reauthorization Act, President Bush also signed into law Section 113, "Preventing Cyberstalking," which among other things makes it illegal to anonymously post annoying
Web messages or send annoying email messages. The criminal penalties for not revealing one's identity when posting
even potentially annoying Web content or email messages
include large fines and up to two years of imprisonment. The
law in effect updates existing telephone harassment laws to
prohibit using the Internet anonymously with the intention
to annoy. The pertinent section was buried in the unrelated
bill that passed in both houses of Congress. Critics contend
that this legislation circumvents the First Amendment's protection of citizens' right to write something that is potentially
annoying as well as their right to do it anonymously.
I like the idea of trying to clamp down on some of the
excesses (such as cyberstalking) that occur on the Internet,
especially those designed to trigger fear and resentment, but I
seriously question some of the provisions of this act. What is
annoying to one person may be perfectly acceptable to
another. The subjectivity involved in defining "annoying"
promises to render this act difficult to interpret and enforce.
Additionally, the civil rights implications are downright
frightening. If this act stands up to the test cases that will
invariably surface, Americans will lose yet more rights during
a time in which civil liberties are already being seriously eroded
in the US.

5. Proposed US legislation would require deletion of personal information on US Web sites
Proposed federal legislation currently being considered by the
US Congress would require every US Web site to delete all information about visitors to the site, including names, street and
email addresses, telephone numbers, and so forth, if the information is no longer needed for a bona fide business reason. The
provisions for personal information deletion in the proposed
legislation, the Eliminate Warehousing of Consumer Internet
Data Act of 2006, are intended to fight identity theft because
Web sites that contain personal information are often major
targets for computer criminals. Some speculate that one effect
of this requirement would be to reduce concern about search
engines storing information about users' search terms, something for which the US Department of Justice (DOJ) recently
subpoenaed Yahoo, Google, and other search engine providers.
As the bill is currently worded, "personal information" does not
refer to search terms or Internet addresses. If this proposed legislation is passed, violations of this law could be punished by
the FTC as deceptive business practices, whether a Web site is run by a business or an individual.
The proposed legislation described in this news item appears to be another step forward in fighting identity theft. If
there is no genuine business-related reason to keep personal
information on a Web site, it is only logical that this information be removed. Controversies and challenges concerning the
interpretation of "no longer needed" will, of course, surface.
Still, requiring Web site operators to purge unneeded personal
information will help ensure that at least some targets of
opportunity for would-be identity thieves will disappear.

6. Financial Services Authority Report highlights need for banks to boost on-line security
In its Financial Risk 2006 Report, the UK's Financial Services
Authority (FSA) found that 50% of Internet users are very concerned about the risk of fraud. The report stated that banks
should strive more to alleviate these concerns by educating
Web users about on-line security. Of the 1500 people asked
about their on-line habits, many reported that they used
good security practices, yet a fourth did not remember when their security software, such as anti-virus software, was last updated. Industry group Apacs found that Internet fraud losses rose to GBP 14.4 million during the first half of 2005, more than triple that of the same period the previous year.
A critical point made in the FSA report was that if customers
were expected to absorb the costs for on-line fraud, 77% would
avoid on-line banking altogether. Recent reports of criminal
gangs stealing millions from the government through tax
credit scams involving the Department for Work and Pensions and Network Rail have fueled on-line customers' concerns about on-line security.
It would be very difficult to disagree with the findings and
recommendations of the recent FSA report. Banks rely on
on-line transactions, yet they often do not go far enough in ensuring that these transactions are secure. I especially worry
about the threat of keystroke sniffers being installed on users'
computers; few users know what keystroke loggers are, let
alone how to detect them. Losses from fraudulent on-line
transactions are starting to mount, as indicated in this and
several previous news items. As these losses grow, banks
and other financial institutions will be virtually forced to pay
more attention to on-line security.

7. Washington State and Microsoft sue anti-spyware vendor
Microsoft and the State of Washington each filed lawsuits in
the US District Court for the Western District of Washington
against Secure Computer and its principals. The charges include violation of Washington's Computer Spyware Act and
three other laws. Secure Computer allegedly used scare tactics
that included putting misleading links on Google's Web site,
producing unwanted pop up advertising, and spamming.
Secure Computer implied that their software came from or
was endorsed by Microsoft and then went further by using
a Windows feature to pop up warnings on PCs, informing
the users that their system had been compromised and that
they should run a spyware scan. Users were later advised to
buy Secure Computer's Spyware Cleaner for USD 49.95 to
remove the malware that was supposedly installed on their
computers. The program does not work, however. Washington
state law establishes a fine of up to USD 100,000 per violation.
If Secure Computer has actually done what it is being
accused of having done, the lawsuits brought against this
company are a just punishment. As I have said so many times
before, computer users are for the most part incredibly naïve concerning security issues; it would not be difficult for an unscrupulous person or organization to cause uncertainty and even fear sufficient to motivate them to buy software that
promises to fix whatever the apparent problem is.

8. Morgan Stanley offers to settle with the SEC
US-based investment bank Morgan Stanley has offered to
settle with the Securities and Exchange Commission (SEC)
for USD 15 million to resolve a matter related to Morgan Stanley's having destroyed potential electronic evidence. The
company did not comply with an order to keep electronic
messages that pertained to a lawsuit that had been filed
against it. Morgan Stanley claims that backup tapes on which
the email messages in question were stored were accidentally
overwritten. The SEC has not decided whether to accept
Morgan Stanleys settlement offer.
This is a truly fascinating case. Morgan Stanley somehow
got its wires crossed and deleted evidence that the SEC
ordered it to hand over. I do not blame the SEC for taking its
time in deciding how to deal with this investment bank.
If the SEC accepts Morgan Stanley's offer, Morgan Stanley will
not only get away relatively cheaply (remember, USD 15 million is small change for a company such as Morgan Stanley),
but other companies faced with the dilemma of having to
hand over evidence that they know will be used against them
will also be tempted to "accidentally" erase the evidence. On
the other hand, offering to pay the SEC right up front not
only appears to be a magnanimous move on Morgan Stanley's
part, but it also promises to close one of the many complicated
cases with which I am sure that the SEC is having to deal.

9. FTC settles with CardSystems Solutions and ChoicePoint
Payment card processor CardSystems Solutions has settled charges brought by the Federal Trade Commission (FTC)
that this company failed to secure sensitive customer data.
The charges followed a major security incident that led to
more than 260,000 individual cases of identity fraud. CardSystems Solutions had been obtaining data from the magnetic
strips of credit and debit cards and storing them without
deploying ample security safeguards. The company, bought
by Pay By Touch late last year, has agreed to implement
a wide-ranging security program and undergo independent
security audits every two years for 20 years.
In settling with the FTC, ID verification services vendor
ChoicePoint must pay USD 10 million in civil penalties and
USD 5 million for consumer damages. The USD 10 million is
the FTC's largest civil fine to date. ChoicePoint was charged
with not sufficiently screening its clients for legitimacy and
for data handling methods that violated the Fair Credit
Reporting Act, the FTC Act, other federal laws, and privacy
rights. The settlement requires ChoicePoint to establish a security program that includes verifying the legitimacy of clients
for their services, auditing its clients' use of the information
obtained, and making visits to client sites. ChoicePoint now must also submit its new security program to independent security audits every two years until 2026.
The FTC deserves a lot of credit for its efforts here. This
commission played hardball with both CardSystems Solutions and ChoicePoint because of their very deficient data protection practices and got a very good outcome in each case.
The large fine that ChoicePoint had to pay is particularly noteworthy; the FTC is in effect saying that organizations that do
a poor job in protecting personal and financial data are going
to have to face meaningful punishment. Hopefully, this outcome will send a powerful message to other organizations
that have poor data security practices.

10. Lawsuits are not curtailing illegal downloads
Surveys of 3000 on-line users in Spain, Germany, and the UK by the industry group Jupiter and studies by the International Federation of the Phonographic Industry (IFPI) are indicating
that despite almost 20,000 people being sued in illegal song
downloading cases in 17 countries, illegal file sharing activity
has remained close to the same for the past two years.
Approximately 335 legal download stores and on-line music services have two million songs legally available (double the amount from the previous year), with 420 million singles legally downloaded in 2005 and sales exceeding USD 1 billion in 2005, up from USD 380 million in 2004. More rapid growth is predicted this year. According to the surveys, 35% of illegal file sharers have cut back on their activity, 14% have increased their activity, and 33% of them buy less music than those who obtain their music through legal channels. With approximately 870 million song files available
through illegal downloading on the net, the music industry
is having a difficult time persuading song-swappers to get
their music legally. The music industry is threatening to sue
Internet service providers (ISPs) if they do not start identifying
and stopping customers who ignore copyright restrictions. In
its Digital Music Report, the IFPI stated that music downloads
for mobile phones had reached USD 400 million annually,
which comprises 40% of the digital music business. Meanwhile, the plusses and minuses of the use of Digital Rights
Management technology, something that limits what consumers can do with their music once they have purchased
it, are still being debated.
The entertainment industry faces a continuing uphill struggle in its war against piracy. Using lawsuits as a mechanism for
reducing illegal downloads may not be working, but it nevertheless was a logical course of action to pursue. I suspect that much
of the reason that lawsuits are not working better than they are
is that most of the lawsuits have targeted individuals instead of
organizations. As such, many individuals who illegally download movies and music are probably not even aware of the
many lawsuits that have been filed over these types of activities,
so there is little or no intimidation factor. It is logical to also assume that the entertainment industry will in the not-too-distant future shift its strategy by increasingly going after ISPs who
do not prevent users from performing illegal downloads. The
road to success for this possible strategy is also not certain,
however; in the past numerous ISPs have been able to win court battles against the RIAA and other entertainment industry entities when they have been directed to hand over names of illegal
file sharers. Again, the entertainment industry does indeed
have a long way to go.

11. Russian stock exchange operations disrupted by virus
A virus halted computing operations at the main Russian
stock exchange. The Russian Trading System (RTS) halted
operations in its three markets for slightly over an hour after an
unidentified virus infected computing systems there. The
infection produced a massive amount of outgoing traffic that
disrupted normal network operations. The virus reportedly came in over the Internet and infected a computer connected to a test trading system. The infected computer then started generating huge volumes of traffic to the point that it overloaded the RTS's support routers. The result was that normal
traffic (data going into and out of the trading system) was not
being processed.
It is truly scary to think that a virus has actually
stopped trading transactions within a stock market. An incident of this nature should not occur in as critical a setting
as in a stock market. I would be very curious to learn more
details about this incident, including what operating system ran on the infected computer and how the virus in
question actually worked. I also wonder what kind (if
any) of anti-virus and incident response measures were
in effect.


Modeling network security


Danny Bradbury
article info
Article history:
Received 6 March 2006
Revised 9 March 2006
Accepted 9 March 2006

E-mail address: danny@itjournalist.com

A model idea for network security? For those that know how, testing a virtual version of your network for security can be more productive than testing the real thing.
For companies with complex networks, duplicating the
equipment and the intricate infrastructure for testing purposes is impossible. There is simply too much of it, and it
would be too expensive. Testing live systems for vulnerabilities is generally preferred, but how can you be sure that you
have tested for everything? And how can you model the security implications of planned changes to the network?
The alternative to physical testing is to do it virtually.
Building mathematical models of networks can help security
evaluators to understand their strengths and weaknesses. A
model can help to identify single points of failure, or reveal
how particular events in some nodes could lead to unexpected
results elsewhere. Using these models as a guide, administrators may be able to develop strategies to make networks more
reliable and more secure, protecting them from attack.
Ideally, a network model should be as accurate as possible,
but David A. Fisher, a senior member of the technical staff at Carnegie Mellon's Software Engineering Institute, draws a distinction between accuracy and precision in network modeling. "They are often confused," he warns. "Accuracy has to do with correctness but it can be precise or abstract."
A precise model will be as detailed as possible as it tries to
represent your network in software, usually down to the configuration of specific devices, connection speeds, and applications. An abstract model, on the other hand, concerns itself
more with the general behavior of a network with certain
numbers of vaguely described nodes exhibiting certain behaviors. "If you want to do predictive simulations, you must be both accurate and precise, but if you're interested in gaining insight, or understanding the mechanisms involved, an accurate simulation without the precision is quite acceptable," Fisher explains.
The value of abstract models becomes apparent in the context of activities at CERT, Fisher's old employer. Concerned
with broader issues of Internet security and broad security responses, the organization wants to understand the basic
mechanisms involved in an attack rather than getting into detail. Malicious activities such as distributed denial of service
attacks occur across huge numbers of machines. "These models don't depend on details like the topology of the Internet or who is connected to who," Fisher explains. "They can be abstracted away. You are concerned about the number of machines vulnerable to attack, and the number of machines capable of launching attacks."
It is these types of network, with large numbers of autonomous nodes, where emergent behavior is prevalent, Fisher
says. Outcomes for the whole system derive from local events.
As one nodes influence affects its neighbor, the neighbors
behavior in turn will affect other neighbors. This emergent
behavior is similar in different domains. The spread of viruses
in large computer networks, for example, can be similar to the
spread of biological epidemics even though the details in
each domain (people vs computers) will be totally different.
Fisher developed a software language called Easel designed to help model emergent behaviors in Internet security and the critical national infrastructure. It works on the basis that, although you could not reasonably model every single node on the Internet, for example, you must model enough nodes to reflect emergent behavior. Although Fisher does most of his work in the range of 20 to 1000 nodes, the Macintosh-based tool can model abstract networks up to 32,000 nodes in size.
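As a rough illustration of this kind of abstract, emergent-behavior model, the short Python sketch below is a hypothetical example with made-up parameters (it is not Easel and is not taken from Fisher's work). Each infected node contacts a few randomly chosen peers per step; topology is deliberately abstracted away, yet a system-wide infection curve emerges from purely local interactions.

import random

# Abstract infection model: every step, each infected node contacts a few
# randomly chosen peers; the network topology is deliberately abstracted away.
def simulate(n_nodes=1000, initially_infected=5, contacts_per_step=4,
             p_infect=0.1, p_clean=0.05, steps=50, seed=42):
    rng = random.Random(seed)
    infected = set(rng.sample(range(n_nodes), initially_infected))
    history = []
    for _ in range(steps):
        newly_infected, cleaned = set(), set()
        for node in infected:
            # Local interaction: try to infect a handful of random peers.
            for peer in rng.sample(range(n_nodes), contacts_per_step):
                if peer not in infected and rng.random() < p_infect:
                    newly_infected.add(peer)
            # Some infected nodes get cleaned up each step.
            if rng.random() < p_clean:
                cleaned.add(node)
        infected = (infected | newly_infected) - cleaned
        history.append(len(infected))  # the emergent, system-wide outcome
    return history

if __name__ == "__main__":
    print("infected nodes per step:", simulate())

Varying p_infect or contacts_per_step shifts the whole curve between die-out and saturation, which is exactly the kind of system-level question an abstract model is meant to answer.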

Modeling the emergent behavior in abstract networks may be useful for looking at the critical national infrastructure, but
how will it help a real-world corporate network? Companies like Opnet focus on precision models of such infrastructures. "We produce a virtual representation of the network infrastructure," says product marketing director for enterprise solutions Devesh Satyavolu. "This software model contains everything from the servers hosting your applications through to the client machines, routers, switches, and the protocols in use."
"We create a baseline of the production infrastructure by talking to various sources of monitoring information," Satyavolu continues. "These sources include most industry systems management players such as Computer Associates, BMC, and Hewlett-Packard." The product feeds information into a configuration management database contained within its IT Guru product, and its Virtual Network Environment then models its behavior, using both historical network traffic information where available, and also enabling administrators to model "what if" scenarios. The company's Application Characterization Environment (ACE) also models application transaction behavior, enabling staff to understand how interactions between clients and servers affect the network.
Opnet's is a general network modeling environment, designed to help staff manage everything from capacity to performance. It is not security specific, but Satyavolu says that it can be used for this purpose. "We have a comprehensive rules engine for almost 400 rules that we ship, and you could evaluate the accuracy of configurations on your devices (firewalls, switches, routers, etc.) as they pertain to network security," he says, arguing that analyzing the network against the rules can help you to answer security questions. "Can someone burrow from point A to point B in my network, or will access controllers in the middle stop them, and if so, where?"
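The sort of question Satyavolu describes can be sketched as a simple reachability check over a graph of permitted hops. The Python fragment below is a hypothetical illustration under assumed inputs (the topology, rule format and names are not Opnet's actual rule engine): links blocked by an access-control rule are dropped, then a breadth-first search looks for a remaining path from point A to point B.

from collections import deque

def allowed_edges(links, deny_rules):
    # Keep only the hops that no access-control rule blocks.
    return [(src, dst) for src, dst in links if (src, dst) not in deny_rules]

def find_path(edges, start, goal):
    # Breadth-first search; returns a list of hops, or None if access is blocked.
    graph = {}
    for src, dst in edges:
        graph.setdefault(src, []).append(dst)
    queue, seen = deque([[start]]), {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for nxt in graph.get(path[-1], []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None

if __name__ == "__main__":
    links = [("A", "switch"), ("switch", "B"), ("switch", "firewall"), ("firewall", "B")]
    deny_rules = {("switch", "B")}  # an access controller in the middle blocks this hop
    print(find_path(allowed_edges(links, deny_rules), "A", "B"))  # ['A', 'switch', 'firewall', 'B']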
Skybox Security's modeling software focuses heavily on using precise network models to analyze vulnerabilities, explains vice president of worldwide marketing Ed Cooper. Its
Skybox View Suite gathers information about the network
and builds what the company calls an integrated security
model. The software tools in the suite can then be used to simulate attacks on the model, while also analyzing access paths
through the network to throw up unexpected vulnerabilities.
The model is of little use without adequate simulation, argues Cooper, because like Opnet it enables you to conduct "what if" analyses on the network. "What if we deployed our intrusion prevention systems and moved them from point A to point B?" he asks. "If we infect a virtual model with a worm, what propagation attributes will it adopt?"
Getting the information out of real-world networks to populate these precise models can be daunting. For companies without a properly populated configuration management database, Skybox's software must be given administration rights to all network devices so that it can download their configuration files during auto-discovery.
How such auto-discovery works in practice remains to be
seen, but an interesting function of the Skybox software is the ability to include business impact rules enabling network administrators to associate technology assets with monetary
risk to the business in the event of an attack. This can be
done using risk metrics, or regulatory compliance frameworks
such as ISO 17799 or Sarbanes-Oxley.
Should companies choose an abstract modeling approach
or use more precise network models to create a more detailed
picture of their network? It isn't an either/or decision, says Colin O'Halloran, director of the systems assurance group at QinetiQ, a commercial spin-off from the UK's Ministry of Defence's Defence Evaluation and Research Agency, and a visiting professor at the University of York. "Any model will make simplifications because the more faithful you are to the network the closer you'll eventually get to the network itself," he says.
Models that attempt to represent your system in detail will naturally make some assumptions and simplifications, O'Halloran argues. The best approach is to use different models at different stages of analysis. "It's not one model; it's a whole family of models that you need."
QinetiQ begins an analysis by checking models at the highest abstract level, characterizing nodes as simply compromised or not. For this, it uses the Failures-Divergence Refinement (FDR) tool, a model checker for state machines. Given a description of a system in terms of components with simple states, it attempts to explore every possible combination of states. Such tools often suffer from a combinatorial explosion problem, in which the number of combined states becomes astronomically large. QinetiQ breaks down the network into chunks, analyzing them individually.
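To see why exhaustive state exploration blows up, and why chunking helps, consider the toy Python calculation below. It is an illustrative sketch under simple assumptions, not FDR itself: if each component is either "ok" or "compromised", a network of n components has 2**n combined states, whereas analyzing it in fixed-size chunks keeps the number of explored states roughly linear in n.

from itertools import product

STATES = ("ok", "compromised")

def combined_states(n_components):
    # Every combination of component states: 2**n of them.
    return list(product(STATES, repeat=n_components))

def chunked_state_count(n_components, chunk_size):
    # States explored when the network is broken into chunks analyzed separately.
    full_chunks, remainder = divmod(n_components, chunk_size)
    total = full_chunks * len(STATES) ** chunk_size
    if remainder:
        total += len(STATES) ** remainder
    return total

if __name__ == "__main__":
    print("one chunk of 3 components has", len(combined_states(3)), "states")  # 8
    n = 30
    print("whole network:", len(STATES) ** n, "combined states")   # 1,073,741,824
    print("chunks of 10:", chunked_state_count(n, 10), "states")   # 3,072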
This abstract analysis can help to highlight areas of potential vulnerability without pinning down what that vulnerability is, Hopkins explains. "How it comes about is almost irrelevant; it just allows someone from outside the firewall to do something that they shouldn't be able to do."
This level of analysis is pessimistic in nature. It will tell security consultants that there's a possibility of an intrusion path through the network, but it might be a false positive. At this point, O'Halloran hands off to Paul Hopkins, group manager for investigations and security health check at QinetiQ,
who stress tests that particular part of the network using
the Skybox Security software to see if there is a real vulnerability there, or whether the problem is simply an artifact of
the abstract model.
But then, QinetiQ is a specialist in the security area, and
network modeling is not an activity for the faint of heart. At
the abstract level, you need the technical capability to build
and simulate these models, which in many cases will involve
programming the necessary properties and rules. At the precise level, your infrastructure must be mature enough to provide the information to make the model as complete as
possible (again, with the understanding that it is not possible
to be entirely complete).
Ideally, you will use a combination of the two, but this will be a pipe dream for most network departments struggling to firefight everyday performance and capacity problems. They will be hammering out security scenarios using conventional analysis tools on live networks for some years to come.


Information Security – The Fourth Wave


Basie von Solms*
University of Johannesburg, Johannesburg, South Africa

article info

Article history:
Received 20 February 2006
Revised 9 March 2006
Accepted 9 March 2006

Keywords:
Corporate Governance
Information Security
Information Security Management
Information Security Governance
Risk management
Sarbanes-Oxley
Social engineering

abstract

In a previous article [von Solms, 2000], the development of Information Security up to the year 2000 was characterized as consisting of three waves:

- the technical wave,
- the management wave, and
- the institutional wave.

This paper continues this development of Information Security by characterizing the Fourth Wave: that of Information Security Governance.

© 2006 Elsevier Ltd. All rights reserved.

1. Introduction

The First Wave was characterized by Information Security being a technical issue, best left to the technical experts.
The Second Wave was driven by the realization that Information Security has a strong management dimension, and that
aspects like policies and management involvement are very
important. The Third Wave consisted of the need to have
some form of standardization of Information Security in
a company, and aspects like best practices, certification, an
Information Security culture and the measurement and
monitoring of Information Security became important.
Since the paper (von Solms, 2000) introducing this development cycle for Information Security appeared in Computers & Security, the development of the next wave of Information Security, the Fourth, has become very clear and well defined.
This wave relates to the development and crucial role of
Information Security Governance.

* Tel.: +27 11 489 2843; fax: +27 11 489 2138.
E-mail address: basie@rau.ac.za

The drivers behind this Fourth Wave are closely related to developments in the fields of Corporate Governance and the related legal and regulatory areas. Top management and Boards
of Directors felt the heat as they started to become personally
accountable for the health (read Information Security) of their
IT systems on which they base their planning and decisions.
This paper will discuss this Fourth Wave, and the drivers
behind the wave.
In Section 2 we will briefly investigate the development of
Corporate Governance, and highlight the relationship with
Information Security. Section 3 will discuss the relationship
between Corporate Governance and Information Security in
more detail, followed by Section 4 investigating the concept
of Information Security Governance. After that, in Section 5
we look at some of the drivers behind the Fourth Wave,
followed by Section 6 which presents the discussion about
some of the consequences of this wave. We conclude with
a summary in Section 7.


2. Corporate Governance and Information Security

Several documents related to Corporate Governance have appeared during the last five years, and the importance of Corporate Governance in general is now established on an international level. Important examples of such documents are the OECD Principles of Corporate Governance (OECD Principles of Corporate Governance, 2004) and the King 2 Report on Corporate Governance (King 2 Report on Corporate Governance, 2002).

The following two quotes come from the OECD document under the section Responsibilities of the Board:

[Responsibilities of the Board include] ensuring the integrity of the corporation's accounting and financial reporting systems, including the independent audit, and that appropriate systems of control are in place, in particular, systems for risk management, financial and operational control, and compliance with the law and relevant standards.
In order to fulfill their responsibilities, board members should
have access to accurate, relevant and timely information.
Therefore, although these documents do not necessarily
refer to Information Security per se, they do refer to aspects
like reporting systems, systems of control, compliance with
relevant standards, risk management, accurate, relevant and
timely information, internal controls, etc.
Most companies are totally dependent on their IT
systems to capture, store, process and distribute company
information. As Information Security is and has always
been the discipline to mitigate risks impacting on the
confidentiality, integrity and availability of a company's
IT resources, Information Security is extremely relevant
to what is required in such Corporate Governance
documents.
Several legal and regulatory developments related to
Corporate Governance have further escalated the role and accountability of senior management as far as their Corporate
Governance responsibilities are concerned, reaching the
agendas of board and other high level meetings. The leading
example here is the Sarbanes-Oxley Act (Sarbanes-Oxley, 2002).
This Act requires top management (and the Board) to sign
off on the information contained in annual reports.
… in this law (Act) there is a provision mandating that CEOs and CFOs attest to their companies having proper internal controls. It's hard to sign off on the validity of data if the systems maintaining it are not secure. It's the IT systems that keep the books. If systems are not secure, then internal controls are not going to be too good. (Hurley, 2003)
From the above discussion, it is clear that, although indirectly mentioned, there is a significant relationship between Corporate Governance and Information
Security.

3. The relationship between Corporate Governance and Information Security

The important, and interesting, aspect of the relationship between governance and security is the clarity with which this relationship has been expressed in relevant and recent documentation. The following type of statement has started to appear more regularly, highlighting the integral role of Information Security in Corporate Governance.

Corporate Governance consists of the set of policies and internal controls by which organizations, irrespective of size or form, are directed and managed. Information security governance is a subset of organizations' overall (corporate) governance program. (Information Security Governance: a call to action)
… boards of directors will increasingly be expected to
make information security an intrinsic part of governance,
preferably integrated with the processes they have in place
to govern IT. (Information Security Governance: Guidance
for Boards of Directors and Executive Management).
What has also emerged is the pivotal role of Information
Security as a risk management or risk mitigation discipline.
A representative statement in this case is:
An information security programme is a risk mitigation
method like other control and governance actions and
should therefore clearly fit into overall enterprise governance. (Information Security Governance: Guidance for
Boards of Directors and Executive Management).
This growing realization has established the fact that Information Security Governance has an enterprise wide impact,
and that the risks mitigated by an Information Security Governance plan are risks which have an enterprise wide business
implication.
Of course, we, as professionals and practitioners in the
field of Information Security, had been making these statements for some time, but we never really succeeded in getting
the impact we wanted. The wider emphasis on good
Corporate Governance has now succeeded to achieve that
which we had been preaching for so long.
Let us now have a closer look at precisely what we can
understand under the concept of Information Security
Governance.

4. Information Security and Information Security Governance
From the previous discussion, and many other references,
there can be no doubt that the developments in the field of
good Corporate Governance over the last three to four years
had escalated the importance of Information Security to
higher levels. It is not only the fact that the spotlight was on
Information Security which resulted in this, but also the


establishment and growth in maturity of the concept of


Information Security Governance.
It became clear that Information Security Governance is
more than just Information Security Management. Information Security Governance clearly indicates the significant
role of top management and Boards of Directors in the way
Information Security is handled in a company.
The following definition tries to reflect this wider meaning of
Information Security Governance which flowed from its explicit
inclusion as an integral part of good Corporate Governance:
Information Security Governance is an integral part of
Corporate Governance, and consists of
• the management and leadership commitment of the Board and Top management towards good information security;
• the proper organizational structures for enforcing good information security;
• full user awareness and commitment towards good information security; and
• the necessary policies, procedures, processes, technologies and compliance enforcement mechanisms

all working together to ensure that the confidentiality, integrity and availability (CIA) of the company's electronic assets (data, information, software, hardware, people etc.) are maintained at all times.
Information Security Governance therefore involves everyone in a company from the Chairman of the Board right
through to the data entry clerk on the shop floor and the driver
of the vehicle delivering the products to the customers.
Information Security Governance can be seen as the overall way in which Information Security as a discipline is handled (used) to mitigate IT risks. One of the essential
characteristics of Information Security Governance is the
fact that it consists of a closed loop.
The loop starts with managements commitment to Information Security by treating it as a strategic aspect pivotal
to the existence of the company and being responsible for
managing the IT risks of the company. This treatment includes
the sanctioning of a Corporate Information Security Policy
accepted and signed off by the Board.
This Policy is supported by a suitable organizational structure for Information Security, specifying ownership and
responsibilities on all levels. The organizational structure
must take the compliance and operational management of
Information Security into account (von Solms, 2005). Such
ownership and responsibilities are strengthened by the necessary
User Awareness programs for all users of IT systems.
The required technology is rolled out and managed, and
compliance monitoring is instituted to measure the level of
compliance to policies, etc., reflecting the level to which IT
risks are managed. The results of such compliance monitoring
efforts are then fed back to Top Management to comprehensively inform them about the status of IT risk management.
This closes the loop.
Information Security Governance is therefore the implementation of the full, well-known Plan-Do-Control-Measure-Report loop.


Let us now investigate some of the drivers behind this


Fourth Wave in more detail.

5. Drivers behind the Fourth Wave

As discussed above, some of the major drivers behind this


Fourth Wave are definitely the bigger emphasis on good Corporate Governance and the supporting legal and regulatory
developments in this area.
Taking one step back, we can again reason that the major
drivers for this bigger emphasis on good Corporate Governance
and the supporting legal and regulatory developments are the
risks of committing fraud and misusing financial resources by manipulating the company's electronic data stored on its IT systems.
Therefore, preventing fraud through manipulating electronic company data seems to be the core of this drive. From
this core came the relevant regulatory and legal developments, as well as the pressure for good Corporate Governance.
The total integration of IT into the strategic operation of
companies over the last few years, and pervasiveness of the
use of IT throughout companies and the services they deliver,
opened up many opportunities to commit fraud using the
companys IT systems, resulting in serious risks.
One of the most serious of these risks is that of social engineering and its relationship to Information Security.
Senior management realized that the human side of using
IT systems, by employees, clients and customers, can cause
serious risks, notwithstanding the amount of money spent
on the technical measures. It became clear to them that the Information Security problem cannot be solved by technical
means alone, and that strategic decisions on a high level had
to be made to ensure that all users are aware of possible risks,
and the impact of social engineering in attacking IT systems.
Attempts to use social engineering to commit fraud seem
to be rising. It is essential to realize that good Information Security Governance, in the sense discussed above, is essential
to addressing this risk.
Again, this has been stated over and over by Information
Security practitioners over many years, but the pressure
caused by good Corporate Governance allowed the penny to
drop on the level we targeted for such a long time.
An important question is, of course, whether this Fourth
Wave will be sustainable.

6. Some consequences of the Fourth Wave

As discussed above, the major drivers behind this Fourth


Wave are definitely the emphasis on good Corporate Governance and the supporting legal and regulatory developments
in this area. For this reason it can be accepted that the Fourth
Wave will be sustainable, in the sense that top management
will not lose interest – they cannot afford to, because their
heads are on the block.
This will give more exposure to Information Security in
general, which is what we have hoped for anyway. We will
probably find that audit committees become much more
sensitive towards Information Security, and we will even
see a person or persons on the Board assigned specific


Information Security Governance responsibilities. In many


instances, these steps have started already.
The following quote supports the realization mentioned
above:
According to the 2005 Global Information Security Workforce
Study, sponsored by the International Information Systems
Security Certification Consortium, IT security professionals
are gaining increased access to corporate boardrooms. More
than 70% of those surveyed said they felt they had increased
influence on executives in 2005, and even more expect that
influence to keep growing. (Security Log, 2006).
It is, however, crucial to realize that Information Security
Governance, as introduced by this Fourth Wave, is NOT a technical issue. Although it contains technical issues, other (non-technical) issues like awareness and compliance management – ensuring that the stakeholders conform to all relevant policies, procedures and standards – are core to good Information Security Governance.
As compliance and risk reporting is core to Information Security Governance, we will see that the Fourth Wave requires more formal reporting tools and mechanisms – ways and means to give Top Management an
easily understandable overview of precisely what the IT risks
are, and how these risks are being managed over time.

7. Summary

Based on the three waves in the development of Information


Security as introduced in other articles (von Solms, 2000),
Information Security development is presently in its Fourth
Wave.
This wave reflects the development of Information Security Governance as a result of the emphasis on good Corporate
Governance.
The Fourth Wave of Information Security can therefore be
defined as the process of the explicit inclusion of Information
Security as an integral part of good Corporate Governance,
and the maturing of the concept of Information Security
Governance.
We as Information Security practitioners must use this
development to its optimum to ensure the security of IT
systems.

references

OECD Principles of Corporate Governance, http://www.oecd.org/dataoecd/32/18/31557724.pdf; 2004 [accessed 13.01.2006].
King 2 Report on Corporate Governance, http://www.iodsa.co.za/corporate.htm; 2002 [accessed 13.01.2006].
Sarbanes-Oxley, http://news.findlaw.com/hdocs/docs/gwbush/sarbanesoxley072302.pdf; 2002.
Hurley E, http://searchsecurity.techtarget.com/originalContent/0,289142,sid14_gci929451,00.html; 2003.
Information Security Governance: a call to action, National Cyber Security Summit Task Force, www.cyberpartnership.org/InfoSecGov_04.pdf; 2003.
Information Security Governance: guidance for Boards of Directors and Executive Management. USA: IT Governance Institute, ISBN 1-893209-28-8, www.itgovernance.org.
Security Log. Computerworld, http://www.computerworld.com/securitytopics/security/story/0,10801,107706,00.html?source=NLT_SEC&nid=107706; 2006 [accessed 18.01.2006].
von Solms B. Information security governance. Computers and Security 2005;24:443–7.
von Solms B. Information Security – The Third Wave? Computers and Security 2000;19:615–20.

Prof SH (Basie) von Solms holds a PhD in Computer Science,


and is the Head of Department of the Academy for Information Technology at the University of Johannesburg in Johannesburg, South Africa. He has been lecturing in Computer
Science and IT related fields since 1970. Prof von Solms specializes in research and consultancy in the area of Information
Security. He has written more than 90 papers on this aspect,
most of which were published internationally. Prof. S. H. von
Solms also supervised more than 15 PhD students and more than 45 Master's students. Prof von Solms is the present Vice-President of IFIP, the International Federation for Information
Processing, and the immediate past Chairman of Technical
Committee 11 (Information Security), of the IFIP. He is also
a member of the General Assembly of IFIP. He has given numerous papers, related to Information Security, at International conferences and is regularly invited to be a member of
the Program Committees for international conferences. Prof
von Solms has been a consultant to industry on the subject
of Information Security for the last 10 years, and received
the 2005 ICT Leadership Award from the ICT Industry in SA.
He is a Member of the British Computer Society, a Fellow of
the Computer Society of South Africa, and a SAATCA Certificated Auditor for ISO 17799, the international Code of Practice
for Information Security Management.

Computers & Security (2006) 25

www.elsevier.com/locate/cose

EVENTS
For a more detailed listing of IS security and audit events, please refer to the events diary on www.compseconline.com
CSI NET SEC 06
12–14 June 2006
Scottsdale, Arizona, USA
www.csinetsec.com

INFOSECURITY CANADA
14–16 June 2006
Toronto, Canada
www.infosecuritycanada.com

INTERNATIONAL CONFERENCE ON DEPENDABLE SYSTEMS AND NETWORKS 2006
25–28 June 2006
Philadelphia, PA, USA
www.dsn.org

18TH ANNUAL FIRST CONFERENCE
25–30 June 2006
Baltimore, Maryland, USA
www.first.org/conference/2006

THE 26TH INTERNATIONAL CONFERENCE ON DISTRIBUTED COMPUTING SYSTEMS
4–7 July 2006
Lisboa, Portugal
http://icdcs2006.di.fc.ul.pt/

IEEE CEC 2006 SPECIAL SESSION ON EVOLUTIONARY COMPUTATION IN CRYPTOLOGY AND COMPUTER SECURITY
16–21 July 2006
Vancouver BC, Canada
http://163.117.149.137/cec2006ss.html

BLACK HAT USA 2006
29 July–3 August 2006
Las Vegas, USA
http://www.blackhat.com/html/bh-link/briefings.html

ISACA INTERNATIONAL CONFERENCE
30 July–2 August 2006
Adelaide, Australia
www.isaca.org

8TH ANNUAL NEBRASKACERT
8–10 August 2006
Omaha, Nebraska, USA
www.certconf.org

19TH IFIP WORLD COMPUTER CONGRESS
20–25 August 2006
Santiago, Chile
http://www.wcc-2006.org/

CSI ANNUAL CONFERENCE AND EXHIBITION
5–8 November 2006
Orlando, Florida, USA
www.gocsi.com

Computers & Security (2006) 25, 169–183

www.elsevier.com/locate/cose

Real-time analysis of intrusion detection alerts via correlation
Soojin Lee*, Byungchun Chung, Heeyoul Kim, Yunho Lee,
Chanil Park, Hyunsoo Yoon
Division of Computer Science, Department of Electrical Engineering and Computer Science,
Korea Advanced Institute of Science and Technology (KAIST), 373-1 Guseong-Dong, Yuseong-Gu,
Daejeon, Republic of Korea
Received 8 December 2004; revised 28 June 2005; accepted 23 September 2005

KEYWORDS
Security;
Intrusion detection;
Correlation;
Alert analysis;
Reduction;
Attack scenario

Abstract With the growing deployment of networks and the Internet, the importance of network security has increased. Recently, however, systems that detect
intrusions, which are important in security countermeasures, have been unable
to provide proper analysis or an effective defense mechanism. Instead, they have
overwhelmed human operators with a large volume of intrusion detection alerts.
This paper presents a fast and efficient system for analyzing alerts. Our system basically depends on the probabilistic correlation. However, we enhance the probabilistic correlation by applying more systematically defined similarity functions and
also present a new correlation component that is absent in other correlation models. The system can produce meaningful information by aggregating and correlating
the large volume of alerts and can detect large-scale attacks such as distributed
denial of service (DDoS) in the early stage. We measured the processing rate of each elementary component and carried out a scenario-based test in order to analyze the efficiency of our system. Although the system is still imperfect, we were able to reduce the numerous redundant alerts to 5.5% of the original volume, without distorting their meaning, through two-phase reduction. This ability reduces the management overhead drastically and makes the analysis and correlation easy. Moreover, we were able to construct attack scenarios for multistep attacks and detect large-scale
2005 Elsevier Ltd. All rights reserved.

* Corresponding author. Tel.: +82 42 869 5552; fax: +82 42 869 5569.


E-mail addresses: sjlee@camars.kaist.ac.kr (S. Lee), bcchung@camars.kaist.ac.kr (B. Chung), hykim@camars.kaist.ac.kr
(H. Kim), yhlee@camars.kaist.ac.kr (Y. Lee), chanil@camars.kaist.ac.kr (C. Park), hyoon@camars.kaist.ac.kr (H. Yoon).
0167-4048/$ - see front matter © 2005 Elsevier Ltd. All rights reserved.
doi:10.1016/j.cose.2005.09.004


Introduction
Cyber attacks are escalating as the mission-critical
infrastructures for governments, companies, institutions, and millions of everyday users become
increasingly reliant on interdependent computer
networks and the Internet. Moreover, current
cyber attacks show a tendency to become more
precise, distributive, and large-scale (CERT Coordination Center; Bugtraq). However, recent intrusion detection systems (IDSs), which are important
in security countermeasures, have been unable
to provide proper analysis or an effective security
mechanism for defending such cyber attacks because of several limitations.
First, as network traffic increases, the intrusion
detection alerts produced by IDSs are increasing
exponentially. In spite of this increase, most IDSs
neglect the overhead of human operators, who are
overwhelmed by the large volume of alerts. Second, human operators are fully responsible for
analyzing a network's status and the trends of
cyber attacks. Third, although cyber attacks can
produce multiple correlated alerts (Kendall, 1999;
CERT Coordination Center), IDSs are generally unable to detect such attacks as a complex single
attack but regard each alert as a separate attack.
Therefore, in the early stage, it is difficult to detect large-scale attacks such as a distributed denial of service (DDoS) or a worm.
These limitations are caused by the absence of
a mechanism that can preprocess and correlate the
massive number of alerts from IDSs. In fact, preprocessing and correlation of alerts are essential for
human operators because the information reproduced by this means can reduce the overhead of
human operators and help them react appropriately
(Bloedorn et al., 2001).
In proposing a fast and efficient system that
analyzes intrusion detection alerts via correlation,
we focused on providing human operators with
a level of flexibility that matches the topology and
status of a network. Our system basically depends
on the probabilistic correlation proposed in Valdes
and Skinner (2001) rather than the fixed rule-based
correlation of Perrochon et al. (2000), Cuppens
(2001a,b) Cuppens et al. (2002), Lee (1999) and
Lee et al. (2000). Compared with other models,
our model, which is similar to the probabilistic
correlation, has several advantages.
First, we considered the time similarity, though
this major measure of correlation is disregarded in
other models, and we used a mathematical function that computes the time similarity on the basis
of Browne's result in Browne et al. (2001). To process the time information more systematically,
we also applied the result to our system.
Second, for immediate analysis of the status of
a managed network and the trends of cyber
attacks, we used a situator, which can grasp the
trend of attacks being generated in the network by
analyzing the relations between the source and
the destination, as one of our components. With
a situator, we could detect large-scale attacks
such as a DDoS or worm in the early stage, and we
could respond to such threats as soon as possible.
Third, we implemented our model and tested it
for various attack scenarios. Moreover, as a result
of our improvement, the system has the capability
of real-time processing and is therefore more
practical than other models.
The remainder of this paper is organized as
follows. In the next section, we describe the architecture of our proposed system and the details of
each component. Then, we describe the correlation hierarchy and similarity functions of our
system. Further, we compare our system with
other correlation systems and illustrate the performance of our system which is followed by an
overview of previously proposed correlation mechanisms. Finally, in the last section, we summarize
the paper and discuss future work.

System architecture
Our system consists of the five components as
shown in Fig. 1: Filter, Control center, Aggregator,
Correlator, and Situator. We attached the filter to
the sensor in each managed network and operated
other components from the control center.

Control center
The control center receives filtered alerts as
a Thread Event from the filter and saves them in

Figure 1 Overall system architecture.

Figure 2 Internal architecture and processing flow of filter.

a database before forwarding them to the aggregator and the situator for further processing. When
the system is started, the control center initializes
a runtime environment by connecting to a database, setting the parameters and so on. The viewers that are used for inspecting the processed
information in each component and the data structure are also defined in the control center.

Filter
The filter gathers alerts from the sensor in each
managed network and eliminates redundancies
among those alerts. The features that the filter
uses to eliminate the redundancies are the source
and class of the attack. The filter merges the
redundant alerts into Thread Events and forwards
them to the control center at regular intervals.
The filter consists of three modules: an Alert
Receiver, a ThreadEvent Maker, and a ThreadEvent
Sender. The alert receiver forms one process and
the other two modules behave as multiple threads
in a single process. Fig. 2 shows the internal architecture and processing flow of the filter.
Alert receiver: the primary sensor of our system is
a widespread NIDS, Snort. The alert receiver receives alerts from the sensor in the form of an Alertpkt struct type (Alertpkt is a data structure of the Snort log) and sends them to the alert queue. The
alert queue saves the alerts in the order of arrival.
ThreadEvent Maker: after receiving the alerts
from the queue, the ThreadEvent Maker compares
them with previous alerts. If exact matches exist

between the alerts, ThreadEvent Maker merges


the alerts into a matching thread event. Otherwise,
if there is no match in the source or class of the
attack, a new thread event is generated. Fig. 3
shows the flow diagram of the ThreadEvent Maker.
ThreadEvent Sender: the ThreadEvent Sender
transfers the thread events to the control center
at predefined intervals. That is, whenever the alert
aggregation interval defined in the timer expires,
the ThreadEvent Sender stops updating the thread
event table and transfers thread events to control
center for further processing. The timer is reconfigurable according to the status of the network.
In our experiment, we set the timer as 1 min.
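To make the merging rule above concrete, the following is a minimal Python sketch of the filter, assuming a simplified alert record; the names Alert, ThreadEvent and Filter, and their fields, are illustrative only and not the authors' implementation.

from dataclasses import dataclass, field

@dataclass
class Alert:
    src_ip: str
    dst_ip: str
    attack_class: str
    timestamp: float

@dataclass
class ThreadEvent:
    src_ip: str
    attack_class: str
    alerts: list = field(default_factory=list)

class Filter:
    def __init__(self):
        self.table = {}                 # (src_ip, attack_class) -> ThreadEvent

    def receive(self, alert: Alert):
        key = (alert.src_ip, alert.attack_class)
        if key not in self.table:       # no matching source/class: new thread event
            self.table[key] = ThreadEvent(alert.src_ip, alert.attack_class)
        self.table[key].alerts.append(alert)   # exact match: merge into thread event

    def flush(self):
        """Empty the thread event table, e.g. when the 1-min timer expires."""
        events, self.table = list(self.table.values()), {}
        return events

In this sketch, calling flush() when the timer expires plays the role of the ThreadEvent Sender handing the accumulated thread events to the control center.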

Aggregator
The aggregator compares the similarity of features
between the thread events transferred from each
filter. If common features exist between two
thread events, the aggregator merges them into
one meta event named an Aggregation Event. The
aggregator can merge the thread events that may
not be merged into a similar thread event in the
filter because the aggregator has a longer merging
interval than the filter.
Fig. 4 illustrates a diagram of the processing flow
of the aggregator. When new thread events are
transferred into the control center, the network
module that is communicating with each managed
network calls the aggregator. The aggregator then
extracts the previous aggregation events generated
for a certain period of time from the database and,
using the similarity functions defined in section

Figure 3 Flow diagram of ThreadEvent Maker.

Similarity functions, compares them with the


newly transferred thread events to determine
whether they have common features. If they
have common features that satisfy the predefined
conditions, the aggregator updates the previous
aggregation event to include the new thread
event. Otherwise, the aggregator generates
a new aggregation event. Table 1 shows the weight
of each feature that is used to merge the thread
events in the aggregator. To aggregate the duplicated alerts, we set the minimum expectation of similarity on the source, destination, and attack class of the attack to high. We also relaxed
the minimum expectation and similarity expectation of the time in order to aggregate thread
events that could not be merged in the filter due
to the short merging interval.

Correlator
By analyzing the timing and causal relation between aggregation events, the correlator can catch
attack scenarios that are carried out in multiple
steps and accumulate a store of knowledge about
new attack patterns. We therefore enforced the minimum expectation of similarity on the source and destination of the attack, as shown in Table 2.
We also enforced the similarity expectation on the
source and destination of the attack. Moreover, to
correlate various attacks with the same destination, the minimum expectation and similarity expectation of the attack class are set to low.
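One possible way to hold the qualitative weights of Tables 1 and 2 in code is a per-feature pair of (similarity expectation, minimum expectation). The numeric values assigned to Low/Medium/High below are assumptions for illustration only; the paper gives the levels qualitatively.

# Per-feature (similarity expectation, minimum expectation) pairs, mirroring
# Tables 1 and 2; Low/Medium/High numeric values are assumed, not from the paper.
LOW, MEDIUM, HIGH = 0.2, 0.5, 0.8

AGGREGATOR_WEIGHTS = {      # Table 1
    "src_ip":       (MEDIUM, HIGH),
    "src_port":     (LOW, LOW),
    "dst_ip":       (MEDIUM, HIGH),
    "dst_port":     (LOW, LOW),
    "attack_class": (HIGH, HIGH),
    "time":         (MEDIUM, MEDIUM),
}

CORRELATOR_WEIGHTS = {      # Table 2
    "src_ip":       (HIGH, HIGH),
    "src_port":     (LOW, LOW),
    "dst_ip":       (HIGH, HIGH),
    "dst_port":     (LOW, LOW),
    "attack_class": (LOW, LOW),
    "time":         (LOW, LOW),
}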
As shown in Fig. 5, which illustrates a diagram of
the processing flow of the correlator, the correlator
only processes aggregation events. When a new

Figure 4 Processing flow diagram of aggregator.

Table 1 Weight of each feature in aggregator

Feature             Expectation(a)   Minimum(b)
Source IP           Medium           High
Source port         Low              Low
Destination IP      Medium           High
Destination port    Low              Low
Attack class        High             High
Time                Medium           Medium

(a) Expectation: similarity expectation.
(b) Minimum: minimum expectation of similarity.

aggregation event is transferred from the aggregator, the correlator selects the previously generated
Correlation Events within certain periods. If there
is a matching event in the lists of selected events,
a new aggregation event is merged into that correlation event with the time information. Otherwise,
a new correlation event is generated.
The correlator can provide us with important
information about the similarity of an attack class.
For example, the attack scenarios detected in the
correlator may consist of related attacks, and
these attacks can be considered as similar. Therefore, to construct a more precise matrix, the
results of the correlator should be fed back to
the similarity matrix.

Situator
The situator grasps the trend of attacks being
generated in the network by analyzing the relations between the source and the destination. This
capability enables early detection of the largescale attacks that originate from many attackers
around the world, such as DDoS and worm, and it
reduces the response time.
The situator can detect three types of attack:
1:N, N:1, and M:N. The 1:N attack means an attack
that originates from a single source to multiple destinations, such as a network scan and a service scan.
In contrast, the N:1 attack means an attack that originates from multiple sources to a single destination.
One example of an N:1 attack is a DDoS, and
such attacks tend to increase without warning.
Therefore, by analyzing the attack trends in a network, we can detect attacks in the early stage. As
with a worm or virus, an M:N attack has the entire
network as its destination. While this type of attack generates a small number of events for a specific source and destination, it generates a great
number of events in the entire network.

Fig. 6 shows the internal architecture and a simple flow diagram of the situator. The situator saves
each thread event that is transferred from the filters
in candidate lists first. If the number of thread
events saved in each candidate list exceeds a predefined threshold, the situator classifies them into
a corresponding situation and generates the Situation Events. A human operator can reconfigure the
threshold according to the status of the managed
network or the trend of the current attacks. Fig. 7
shows a more detailed flow diagram of the situator.
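The counting logic behind the three situation types can be sketched as follows. The threshold names echo the T_N1/T_1N labels in Fig. 7, and the default values are arbitrary placeholders for the operator-configurable thresholds mentioned above; the data model is simplified for illustration.

from collections import defaultdict

class Situator:
    """Count distinct sources/destinations per attack class against thresholds."""
    def __init__(self, t_n1=20, t_1n=20):
        self.t_n1, self.t_1n = t_n1, t_1n     # reconfigurable thresholds
        self.dsts = defaultdict(set)          # (attack_class, src_ip) -> dst_ips
        self.srcs = defaultdict(set)          # (attack_class, dst_ip) -> src_ips

    def observe(self, attack_class, src_ip, dst_ip):
        self.dsts[(attack_class, src_ip)].add(dst_ip)
        self.srcs[(attack_class, dst_ip)].add(src_ip)

    def situations(self):
        found = set()
        for (cls, _), dsts in self.dsts.items():
            if len(dsts) > self.t_1n:
                found.add((cls, "1:N"))       # e.g. network or service scan
        for (cls, _), srcs in self.srcs.items():
            if len(srcs) > self.t_n1:
                found.add((cls, "N:1"))       # e.g. DDoS
        for cls in {c for c, _ in found}:
            if (cls, "1:N") in found and (cls, "N:1") in found:
                found.add((cls, "M:N"))       # e.g. worm or virus activity
        return found

For example, repeatedly calling observe("icmp-flood", attacker_ip, victim_ip) for many distinct attacker addresses would eventually report an ("icmp-flood", "N:1") situation.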

Hierarchy of correlation
Each component of our system achieves the
following hierarchy of correlation and, to get the
correlation at different stages of the hierarchy, we
use multiple events. For example, we can infer
thread events (within a sensor correlation) and
then merge them into aggregation events or
situation events for center-level inspection. The
aggregation events are correlated into the correlation events again in order to construct the attack
scenarios of multistep attacks. Fig. 8 shows the
overall hierarchy of correlation.
Thread event: a thread event is the primary information unit in our system. The filter first eliminates the redundancies among a large number of
raw alerts and merges them into a small number
of thread events. In this process, the filter does
not use the similarity functions defined in section
Similarity functions. Instead, the filter simply
compares the source and the attack class. The
thread events generated in the filter are transferred to the control center for further processing.
Aggregation event: by setting the minimum expectation of similarity on the source, the destination ip address, and the attack class as high, and by
relaxing the similarity expectation for the time as

Table 2 Weight of each feature in correlator

Feature             Expectation   Minimum
Source IP           High          High
Source port         Low           Low
Destination IP      High          High
Destination port    Low           Low
Attack class        Low           Low
Time                Low           Low

Figure 5 Processing flow diagram of correlator.

shown in Table 1, we can merge more thread


events from several sensors into a single aggregation event.
Correlation event: by relaxing the minimum
expectation of similarity on the attack class as
shown in Table 2, we were able to reconstruct
various steps of a multistep attack. Each step of
an attack may itself be an aggregation event. In
view of this possibility, we can recognize a multistep attack composed of, for example, a probe
followed by an exploit to gain access to a critical
host, and then using that host to launch an attack
to a more critical asset. To correlate the aggregation event, we also enforced the minimum expectations on the source and destination of the
attack.
Situation event: a situation event, which is independent of the aggregation event and the correlation event, comprises only thread events. As with
the filter, the similarity functions are not used
when the situation events are generated.

Similarity functions
In so far as we consider the similarity of features,
the minimum expectation of similarity, and the
expectation of similarity, our correlation approach
is similar to the probabilistic alert correlation

proposed in Valdes and Skinner (2001). However,


as mentioned in section Introduction, we introduce a more systematic approach by referring to
earlier research. To correlate meta events that
are possibly composed of several alerts, we defined similarity functions for a list value. Features
used in analyzing the similarity include the ip address, the port, the attack class of the attack,
and the time information. In this section, we describe only the basic similarity functions.

IP address similarity
If the sources of two different events (or attacks)
belong to the same sub-network, there is greater
probability that the same attacker launched the
two events. This probability may increase exponentially as the matching address becomes longer.
We can infer, therefore, that the similarity of ip
address agrees with the log scale. The similarity
function for the ip address is defined as follows and
its value can be readjusted to a realistic level
through more experiments.
def ip_similarity(ip1: str, ip2: str) -> float:
    a, b = ip1.split("."), ip2.split(".")
    if a == b:
        return 1.0      # perfect match
    if a[:3] == b[:3]:
        return 0.8      # C class match (first three octets)
    if a[:2] == b[:2]:
        return 0.4      # B class match (first two octets)
    if a[:1] == b[:1]:
        return 0.2      # A class match (first octet)
    return 0.0

Port similarity
There is a high probability that the port list of an input event will be a subset of the port list of the meta event, because newly input events are generally at a lower level than the earlier meta events. We therefore defined the similarity between the port lists of two events as the mean, over the ports of the input event, of each port's best similarity to the ports of the meta event. If the input event has a port list L1 = {x1, x2, ..., xn} and the meta event has a port list L2 = {y1, y2, ..., ym}, then the similarity S between L1 and L2 is defined as follows.



Si = max_{1 <= j <= m} Similarity(xi, yj)

S = (1/n) · Σ_{i=1}^{n} Si

Figure 6 Internal architecture and processing flow diagram of situator.
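A direct rendering of the two formulas above, with the element-level similarity left as a pluggable placeholder (the exact per-port similarity is not specified here):

def list_similarity(input_list, meta_list,
                    element_sim=lambda x, y: 1.0 if x == y else 0.0):
    """Mean over input elements of their best match in the meta event's list."""
    if not input_list or not meta_list:
        return 0.0
    per_element = [max(element_sim(x, y) for y in meta_list) for x in input_list]
    return sum(per_element) / len(per_element)   # S = (1/n) * sum_i S_i

# e.g. list_similarity([80, 81], [80]) == 0.5: port 80 matches fully, port 81 not at all.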

Attack class similarity


Because heterogeneous sensors may use a different
name for the same attack, it is difficult to

Figure 7 Detailed processing flow of situator.

correlate alerts from heterogeneous sensors. For


that reason, we need to define the definite set of
the attack class, and classify various attacks into
the corresponding class according to their
characteristics.
This potential problem between heterogeneous
sensors does not occur in our current developed
system, since we use only one sensor, Snort.
However, to enable our system to correlate heterogeneous sensors in an integrated system, we
need to define the attack class. In our current system, we used the 34 attack types that Snort provides (the classtype field of a Snort alert uses the 34 classifications defined by the classification config option; the classifications used by the rules shipped with Snort are defined in etc/classification.config); in the future, we hope to offer clearer definitions of the attack classes.
To define the similarity between two attack
classes, we constructed a similarity matrix, S. The
size of the S matrix is 34 by 34, and each value in
the matrix is between 0 and 1. If the attack class of
the input event is i and the attack class of the
meta event is j, similarity between the two attack
classes is defined as S[i, j]. To establish the initial
values of the similarity matrix of the attack classes, we statistically analyzed a DARPA data set
and the real intrusion detection alerts.
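As an illustration of the lookup, the attack-class similarity matrix can be held as a 34 × 34 array indexed by Snort classtype. The entries below are placeholders only; the paper initializes the matrix statistically from the DARPA data set and live alerts.

import numpy as np

N_CLASSES = 34
# A few real Snort classtypes mapped to indices; the remaining ones are omitted here.
class_index = {"attempted-recon": 0, "attempted-admin": 1, "shellcode-detect": 2}

S = np.eye(N_CLASSES)                               # identical classes are fully similar
S[class_index["attempted-recon"], class_index["attempted-admin"]] = 0.3   # assumed value
S = np.maximum(S, S.T)                              # keep the matrix symmetric

def class_similarity(input_class: str, meta_class: str) -> float:
    """Similarity S[i, j] between the input event's and the meta event's class."""
    return float(S[class_index[input_class], class_index[meta_class]])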

Time similarity

The time information is important in alert correlation, and time similarity has great significance when we calculate the overall similarity. For example, if most features of the two events are similar, and if the two events are extremely distant in time, we should regard the two events as having no similarity. According to the trend analysis of exploitations conducted by Browne et al. (2001), the number of incidents caused by one exploit can be modeled with the following formula.
C = I + S · √M

where

C: the cumulative count of reported incidents


I, S: the regression coefficients determined by
analysis of the incident report data
M: the time since the start of the exploit cycle
The above formula indicates that most events
(or incidents) are generated in the early stage of
an exploit cycle, and the number decreases with
time. Consequently, the time similarity between
a newly input event and an earlier meta event
increases the closer the two events are in time and decreases rapidly the farther apart they are. If the
created time of the input event is t2 and the


created time of the meta event is t1, then the time


similarity S is defined as follows.
S = I - V · √(t2 - t1)

where
I, V: the coefficients determined by the status of the network
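Assuming the reconstruction S = I - V·√(t2 - t1) above, a clamped version of the time-similarity term might look as follows; the coefficient values here are illustrative only and would be tuned to the network's status.

import math

def time_similarity(t1: float, t2: float, I: float = 1.0, V: float = 0.05) -> float:
    """t1: creation time of the meta event, t2: creation time of the input event (seconds)."""
    return max(0.0, min(1.0, I - V * math.sqrt(max(t2 - t1, 0.0))))

# e.g. time_similarity(0, 0) == 1.0, and the value decays toward 0 as t2 - t1 grows.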

Figure 8 Overall hierarchy of correlation.

Overall similarity
After calculating the similarity of each feature, we
need to calculate the overall similarity in order to
decide whether the two events can be correlated.
When calculating overall similarity, the expectation of similarity and the minimum expectation
of similarity play important roles as a weight and a necessary condition, respectively. By using the
expectation of similarity, we can attach importance to the significant features. The minimum
expectation of similarity is used as a threshold
value. For instance, certain features can be required to match exactly or approximately for an
event to be considered as a candidate for correlation with another. The minimum expectation thus
expresses the necessary but not sufficient conditions for correlation.
If any overlapping feature matches at a value less
than the minimum similarity for the feature, the
overall similarity between two events is zero.
Otherwise, the overall similarity is the weighted
average of the similarities of the overlapping features, using the respective expectations of similarity as weights.
As with the probabilistic approach (Valdes and
Skinner, 2001), we can define the overall similarity
between a new event, X, and an earlier event, Y,
as follows.
SIM(X, Y) = Σj Ej · SIM(Xj, Yj) / Σj Ej

j: index over features in the event


Ej: expectation of similarity for feature j
Xj, Yj: values for feature j in events X and Y,
respectively
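Combining the weighted average above with the minimum-expectation gate described earlier gives a short sketch such as the following; the per-feature weights follow the (expectation, minimum) pairs of Tables 1 and 2, and the function names are illustrative.

def overall_similarity(feature_sims, weights):
    """feature_sims: {feature: similarity in [0, 1]};
       weights: {feature: (expectation, minimum)} as in Tables 1 and 2."""
    common = [f for f in feature_sims if f in weights]
    if not common:
        return 0.0
    if any(feature_sims[f] < weights[f][1] for f in common):
        return 0.0                                   # a necessary condition failed
    num = sum(weights[f][0] * feature_sims[f] for f in common)
    den = sum(weights[f][0] for f in common)
    return num / den                                 # weighted average of similarities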
After the overall similarities are calculated
between a new event and earlier candidate
events, the largest value becomes a candidate
value for correlation. If the candidate value is
larger than the predefined threshold, the new
event and the corresponding meta event are

merged into another meta event. Otherwise, a new event is generated using the new event.

Analysis and performance evaluation

In this section, we describe the differences between our system and previous correlation systems. We then proceed with the results of the performance evaluation.

Comparison with previous correlation systems
We based our proposed system on probabilistic
alert correlation (Valdes and Skinner, 2001). That
approach is similar to ours in so far as the correlation components use feature-specific similarity
functions and a probabilistic criterion. Our system,
however, has some distinctive characteristics.
First, our system uses the filter in each managed
network to merge duplicate alerts into thread
events; and, before conducting further correlation, the thread events that are transferred to
the control center are merged again by the aggregator. That is, although all alerts are considered
the subject of correlation in the probabilistic
approach, in our system only preprocessed alerts
(or events) in the filter and the aggregator are correlated. This property can drastically reduce the
overhead of our system. Second, in contrast to
our system, the probabilistic approach does not
consider time information as a significant feature.
For example, in the probabilistic approach, the
thread and scenario aggregation may occur over
intervals of days. Third, to construct the attack
class similarity matrix, we statistically analyze
the DARPA data set and the live alerts collected
from our network.
This approach enables us to reflect the recent
attack trends and to construct a more substantial
matrix.
Our system is more flexible than rule-based
approaches such as the Stanford CIDF correlator
(Perrochon et al., 2000) and the planning process
model (Cuppens et al., 2002). For instance, it
can find a new multistep attack and get a realistic
result in the correlation process. Moreover, while
most correlation systems (Perrochon et al., 2000;
Cuppens et al., 2002; Debar and Wespi, 2001;
Valdes and Skinner, 2001; Porras et al., 2002;
Ning et al., 2002; Lee et al., 2000) cannot detect
large-scale attacks such as DDoS or worm in the
early stage, our system can detect such attacks
in real time.


Performance evaluation
To assess the processing power and efficiency of our
system, we measured the reduction ratio of the
filter and the aggregator, and the processing time of each component. To evaluate the correlation performance, we also conducted the following scenario-based test using known techniques and exploits.
 Scenario #1: stealth scan to specific host
 Scenario #2: Buffer Overflow attack to FTP
server
 Scenario #3: CGI attack to Web server
 Scenario #4: Buffer Overflow attack to RPC
service
 Scenario #5: network scan to multiple host [1:N
attack]
 Scenario #6: DDoS attack [N:1 attack]
 Scenario #7: attack using worm and virus [N:M
attack]
Although useful for evaluating the performance
of the IDS, the DARPA data set is unsuitable for
evaluating the correlation system. We therefore
conducted known attack scenarios to inspect
whether our correlation component successfully
detects multistep attacks and large-scale attacks
in the early stage. In this paper, we only present
the results of Scenarios #2, #4, and #6.
Reduction ratio of filter and aggregator
We measured the reduction ratio of the filter and
the aggregator by dividing the number of events
generated in the filter or the aggregator by the
total number of alerts generated in the test
period. The results are as shown in Table 3.
When we set the timer in the filter to 1 min, the
average reduction ratio of the filter is 11.1%. In
the case of ICMP Nachi Worm by Ping CyberKit,
the filter on average merged 20 alerts into a single
thread event, and the maximum reduction ratio
was 5%.
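As a quick cross-check against Table 3, 66,775 thread events out of 599,403 raw alerts gives 66,775 / 599,403 ≈ 0.111, i.e. the 11.1% average reduction ratio quoted above; for the ICMP Nachi Worm alerts, merging 20 alerts into one thread event corresponds to 1/20 = 5%; and 33,173 aggregation events correspond to 33,173 / 599,403 ≈ 5.5%.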
Processing time of each component
To assess the processing efficiency of each component, we measured the time required for a single thread event (which may include several alerts) to be processed completely in each component.
As shown in Table 4, our system has a real-time
processing capability, since a single thread event
can be processed completely in all components
within 0.8 s. Furthermore, in contrast to other systems, in which most processes are conducted manually, the automation in our system can drastically
reduce the management overhead of human

Table 3 Reduction ratio of filter and aggregator

Total number of alerts generated within the period of test: 599,403

Component    Metric                                                           Value
Filter       # of thread events                                               66,775
Filter       Average reduction ratio (# of thread events / # of alerts)       11.1%
Aggregator   # of aggregation events                                          33,173
Aggregator   Average reduction ratio (# of aggregation events / # of alerts)  5.5%

operators and simplify analysis. Moreover, by


providing useful information for the reaction of human operators in real time, our system can reduce
the response time.
Performance evaluation of correlator
(Scenario #2)
Scenario #2 is a multistep attack that exploits the
vulnerabilities of ftp server and consists of following steps.
 Step 1: host scan (detect whether ftp service is
running)
 Step 2: login attempt to detect the type of ftp
server
 Step 3: execution of exploit code to attack the
vulnerabilities of ftp server
 Step 4: system file access after acquiring the
privilege of root
In this scenario, an attacker may use NMAP
(NMAP) to try to detect whether an ftp service is
running on the target host. According to the results
of the NMAP, the attacker can determine whether
the target host runs the ftp service. The attacker
then logs in to the target host to detect the type
of ftp server that is running. Finally, the attacker
learns that the WU-FTP is running.
The attacker exploits the specific vulnerability
of the WU-FTP server as shown in Fig. 9. If the
exploitation succeeds, the attacker can access

Table 4 Processing time of each component

Component                                Processing time (s)
Thread event saving in control center    0.0097
Aggregator                               0.5398
Correlator (includes event update)       0.0887
Situator                                 0.1691
Total                                    0.8103

Figure 9 Buffer Overflow attack to FTP server.

the server using the username ftp and gain the


privilege of the system administrator. The attacker
then transfers the significant system file (that is, /
etc/passwd) to the attacker's host.


Fig. 10 shows the thread events that are generated as a consequence of a Buffer Overflow attack
to an ftp server. As shown in Fig. 10, current IDSs
usually provide the detected alerts as they are,

Figure 10 Detection results of FTP server exploit: thread event.

Figure 11 Results of correlation with FTP server exploit in correlator.

and they cannot represent the relation between


those alerts. For that reason, IDSs overwhelm human operators with a large volume of alerts and
make it difficult to analyze the attack trends.
Our system, however, can analyze alerts and
represent their relations by using our correlation
mechanism as shown in Fig. 11. The various events
shown in Fig. 11 are represented as a single meta
event because they have the same source and target of the attack. More detailed information such
as low level events (or attacks) of the correlation
event is also provided.

However, our system can find out the timing and


causal relations between the mingled thread
events and successfully correlate them into a single
meta event. As shown in Fig. 14, a series of attacks
is correlated into a correlation event, and the
name of the event is made up of the source and
destination of the attack.
Fig. 15 shows the attack scenario that our system finds in real time, and the results correspond with our intended attack steps. More
detailed information such as a list of attack signatures is also provided.

Performance evaluation of correlator


(Scenario #4)
Scenario #4 is a multistep attack that exploits the
vulnerabilities of an rpc service and consists of
following steps.

Performance evaluation of situator


(Scenario #6)
To evaluate the performance of our situator, we
classified the attack scenario as 1:N, N:1, and N:M.
In all cases (Scenarios #4, #5, and #6), we achieved
the intended results in real time, but here we
describe only the results of Scenario #6.
Scenario #6 is a many-to-one attack such as
a DDoS attack. A DDoS attack, which is usually

 Step 1: host scan using NMAP (detect whether


an rpc service is running)
 Step 2: execution of exploit code to attack the
vulnerabilities of rpc service
 Step 3: system file access after acquiring the
privilege of root (Root shell)
Fig. 12 shows all the steps. An attacker uses
NMAP to detect whether the rpc service is running
on the known host. With the result of NMAP, the attacker can determine whether the target host runs
the rpc service. The attacker then executes the
exploitation to exploit the specific vulnerability
of the rpc service. If the exploit succeeds, the attacker can gain the privilege of the system administrator and may access the significant system file
(that is, /etc/passwd).
Fig. 13 shows the thread events transferred to
the control center. Because these thread events
are mingled with other thread events from the
multiple sensors, it is difficult for human operator
to analyze the relation between those events.

Figure 12 Buffer Overflow attack to RPC service.

Figure 13 Detection results of RPC exploit: thread event.

carried out in order to interrupt the service provision or normal operation of a specific host,
causes a great deal of overhead in a managed
network. Most IDSs, however, cannot detect such an
attack in the early stage.
We emulated a DDoS attack with the aid of an
ICMP Flooder. As shown in Fig. 16, the ICMP Flooder
continuously transfers large packets to the target
of the attack.
Fig. 17 shows the thread events that were transferred to the control center. The thread events
generated by our emulated DDoS attack are the
events included in the rectangles. As you can see
in the Count field of the table, the DDoS attack
usually generates a large volume of alerts within
a short period. Whenever such a large volume of
alerts is transferred to the control center without
preprocessing (that is, merging in the filter) as is
the case in most IDSs, human operators may be
easily overwhelmed and react inappropriately.
Our system, however, can reduce the numerous
alerts to a small volume that human operators
can easily handle. For example, more than a thousand alerts can be merged into just 10 thread
events, as shown in Fig. 17.
Furthermore, in the case of alert flooding
(i.e. when a large volume of alerts is generated in
the managed network), the attack count of each
thread event also shows the average and maximum
reduction ratio of our system. The maximum


reduction ratio of the filter is 0.78%; or in other


words, 129 alerts can be merged into a single thread
event.
In Fig. 17, we can see that various attacks occurred in the managed network and that 10 out
of them are a similar type of attack. Our situator
can detect such situations in the early stage of
the attack and start the correlation process.
Fig. 18 shows the final result of the correlation.
Numerous alerts were correlated into only one
situation event in real time, since they had the
same class and target of the attack. In addition,
we can easily find out which attackers execute
a DDoS attack to the same target host and inspect
the detailed attack signatures. When the same
type of attacks continuously invade the same
target host, our situator updates the matching
situation event and increases its AlertCount.

Related work
Several alert aggregation and correlation techniques
(Perrochon et al., 2000; Cuppens, 2001a,b; Cuppens
et al., 2002; Debar and Wespi, 2001; Valdes and Skinner, 2001; Porras et al., 2002; Ning et al., 2002; Morin
et al., 2002) have been proposed to facilitate the
analysis of intrusions. In their own ways, these
approaches tried to find the relationships between
alerts and to generate the significant information.

Figure 14 Result of correlation with RPC service exploit in correlator: correlation event.

Figure 15 Correlation event: more detailed information.

Perrochon et al. (2000) used a predefined rule to correlate alerts and to find the attack scenarios. Cuppens (2001a,b) and Cuppens et al. (2002) used
Lambda language to specify attack scenarios and
used Prolog predicates to correlate alerts based on
IDMEF data model. In Debar and Wespi (2001), an aggregation and correlation component was built into
a Tivoli Enterprise Console. In Valdes and Skinner
(2001), a probabilistic method was used to correlate
alerts by using the similarity between their features.
Porras et al. (2002) proposed a mission-impact-based
approach to analyzing the security alerts produced
by spatially distributed heterogeneous information
security (INFOSEC) devices. They intended to provide analysts with a powerful capability to automatically fuse together and isolate the INFOSEC alerts
that represent the greatest threat to the health
and security of their networks. Ning et al. (2002) developed three utilities to facilitate the analysis of
large sets of correlated alerts. In Morin et al.


(2002), a formal data model called M2D2 was


proposed in order to make full use of the available information. The effectiveness of the proposed aggregation and correlation algorithms depends heavily on
the information provided by the individual IDS.

Conclusion and future work


In this paper, we propose a fast and efficient
system for analyzing intrusion detection alerts.
By analyzing and correlating a large volume of
alerts with respect to feature similarity, our
system can produce meaningful information that
may be used in timely decisions and proper
responses. Several properties distinguish our system from other systems: two-phase reduction of
alerts in the filter and aggregator, the time similarity function, the situator, and the feed-back mechanism for the attack class similarity matrix in the correlator.

Figure 16 Execution of DDoS attack to target host.

Figure 17 Detection results of DDoS attack: thread event.

The two-phase reduction of numerous


alerts drastically reduces the management overhead and simplifies the analysis and correlation.
The time similarity function is defined more systematically than in other systems, and it produces more accurate results in real situations. The situator enables us to detect large-scale attacks such as DDoS attacks or worms at an early stage. This ability is absent in


other systems. The feed-back mechanism for the
attack class similarity matrix that we conceptually
described is still under construction.
Figure 18 Detection result of DDoS attack in situator: situation event.

To better evaluate the performance of our system, we plan to use more attack scenarios. Moreover, in order to make a more flexible system that can correlate heterogeneous sensors, we plan
to introduce a host-based IDS and other NIDS.

Acknowledgement
This work was supported by the Korea Science and Engineering Foundation (KOSEF) through the Advanced Information Technology Research Center (AITrc) and the University IT Research Center Project.

References
Bloedorn E, Christiansen AD, Hill W, Skorupka C, Talbot LM,
Tivel J. Data mining for network intrusion detection: how
to get started. MITRE Technical Report; August 2001.
Browne H, Arbaugh W, McHugh J, Fithen W. A trend analysis of
exploitations. In: Proceedings of the 2001 IEEE symposium on
security and privacy; May 2001. p. 214–29.
Bugtraq. Security focus online, <http://online.securityfocus.
com/archive/1>.
CERT Coordination Center. CERT/CC advisories. Carnegie Mellon
Software Engineering Institute. Online, <http://www.cert.
org/advisories/>.
Cuppens F. Cooperative intrusion detection. In: International
symposium Information Superiority: Tools for Crisis and
Conflict-Management. Paris, France; September 2001a.
Cuppens F. Managing alerts in a multi intrusion detection environment. In: 17th annual computer security applications
conference (ACSAC). New Orleans; December 2001b.
Cuppens F, Autrel F, Miege A, Benferhat S. Correlation in an intrusion detection process. In: Internet security communication workshop (SECI02). Tunis-Tunisia; September 2002.
Debar H, Wespi A. Aggregation and correlation of intrusiondetection alerts. In: Proceedings of 2001 international workshop on recent advances in intrusion detection. Davis, CA;
October 2001.
Kendall K. A database of computer attacks for the evaluation of
intrusion detection systems. Master's thesis. Massachusetts
Institute of Technology; June 1999.
Lee W. A framework for constructing features and models for intrusion detection system. PhD thesis. Columbia University;
June 1999.
Lee W, Nimbalkar RA, Yee KK, Patil SB, Desai PH, Tran TT, et al.
A data mining and CIDF-based approach for detecting novel
and distributed intrusions. In: Proceedings of 2000 international workshop on recent advances in intrusion detection
(RAID00). Toulouse, France; October 2000.
Morin B, Me L, Debar H, Ducasse M. M2D2: a formal data model for
IDS alert correlation. In: Proceedings of the fifth international
symposium on recent advances in intrusion detection
(RAID02). In: LNCS 2516. Zurich, Switzerland; October 16–18, 2002. p. 115–37.
Ning P, Cui Y, Reeves DS. Analyzing intensive intrusion alerts
via correlation. In: Proceedings of the fifth international
symposium on recent advances in intrusion detection
(RAID02). In: LNCS 2516. Zurich, Switzerland; October
2002. p. 74–94.
NMAP network mapping tool, <http://www.insecure.org/
nmap/>.


Perrochon L, Jang E, Luckham DC. Enlisting event patterns for


cyber battlefield awareness. In: DARPA information survivability conference and exposition (DISCEX00). Hilton
Head, South Carolina; January 2000.
Porras P, Fong M, Valdes A. A mission impact-based approach to
INFOSEC alarm correlation. In: Fifth international workshop
on the recent advances in intrusion detection. Zurich,
Switzerland; October 2002.
Valdes A, Skinner K. Probabilistic alert correlation. In: Fourth
international workshop on the recent advances in intrusion
detection. Davis, USA; October 2001.
Soojin Lee received the B.S. degree in Computer Science from
Korea Military Academy in 1992. He also received the M.S. in
Computer Science from Yonsei University, South Korea, in
1996. He is currently working toward the Ph.D. degree at the
Division of Computer Science, Korea Advanced Institute of
Science and Technology (KAIST). His research interests include ad hoc and sensor networks, cryptography, and computer security, especially intrusion detection systems.
Byungchun Chung received the B.E. degree in Information and
Computer Engineering from Sungkyunkwan University, South
Korea, in 1998. He also received the M.S. degree in Computer
Science from Korea Advanced Institute of Science and Technology (KAIST) in 2001. He is currently working toward the Ph.D.
degree at the Division of Computer Science, KAIST. His research interests include cryptography and computer security, especially elliptic curve cryptography.
Heeyoul Kim received the B.E. degree in the Division of Computer Science from Korea Advanced Institute of Science and
Technology (KAIST), South Korea, in 2000, the M.S. degree in
Computer Science from KAIST, in 2002. He is currently working
toward the Ph.D. degree at the Division of Computer Science,
KAIST. His research interests include cryptography and computer security, especially secure group communication.
Yunho Lee received the B.E. degree in the Division of Computer
Science from Korea Advanced Institute of Science and Technology (KAIST), South Korea, in 2000, the M.S. degree in Computer
Science from KAIST, in 2002. He is currently working toward the
Ph.D. degree at the Division of Computer Science, KAIST. His research interests include cryptography and computer security, especially digital signatures.
Chanil Park received the B.S. degree in Mathematics from Inha
University, South Korea, in 1999. He also received the M.S. degree in Mathematics from Korea Advanced Institute of Science
and Technology (KAIST) in 2001. He is currently working toward
the Ph.D. degree at the Division of Computer Science, KAIST. His research interests include cryptography and computer security, especially authentication.
Hyunsoo Yoon received the B.E. degree in electronics engineering from Seoul National University, South Korea, in 1979, the
M.S. degree in Computer Science from Korea Advanced Institute
of Science and Technology (KAIST) in 1981, and the Ph.D. degree in computer and information science from the Ohio State
University, Columbus, Ohio, in 1988. From 1988 to 1989, he
was a member of technical staff at AT&T Bell Labs. Since 1989 he has been a faculty member of the Division of Computer Science at KAIST. His main research interests include wireless sensor networks, 4G networks, and network security.

Computers & Security (2006) 25, 184–189

www.elsevier.com/locate/cose

A novel remote user authentication scheme using bilinear pairings
Manik Lal Das a,b,*, Ashutosh Saxena a, Ved P. Gulati a, Deepak B. Phatak b
a Institute for Development and Research in Banking Technology, Castle Hills, Road Number 1, Masab Tank, Hyderabad-500057, India
b K. R. School of Information Technology, Indian Institute of Technology, Mumbai-400076, India
Received 20 January 2005; revised 16 August 2005; accepted 23 September 2005

KEYWORDS
Authentication;
Bilinear pairings;
Smart card;
Password;
Timestamp

Abstract The paper presents a remote user authentication scheme using the properties of bilinear pairings. In the scheme, the remote system receives the user's login request and allows login to the remote system if the login request is valid. The scheme prohibits the scenario of many logged in users with the same login-ID, and provides a flexible password change option to the registered users without any assistance from the remote system.
© 2005 Elsevier Ltd. All rights reserved.

Introduction
Password authentication is an important technique
to verify the legitimacy of a user. The technique is
regarded as one of the most convenient methods
for remote user authentication. Based on the
computation complexity, password-based authentication schemes are classified into two broad

* Corresponding author. Institute for Development and Research in Banking Technology, Castle Hills, Road Number 1, Masab Tank, Hyderabad-500057, India. Tel.: +91 40 2353 4981; fax: +91 40 2353 5157.
E-mail addresses: mdas@it.iitb.ac.in, mldas@idrbt.ac.in (M.L. Das), asaxena@idrbt.ac.in (A. Saxena), vpgulati@idrbt.ac.in (V.P. Gulati), dbp@it.iitb.ac.in (D.B. Phatak).

categories, viz. hash-based (Menezes et al.,


1996) authentication and public-key based authentication (IEEE P1363.2 Draft D12, 2003).
In 1981, Lamport introduced the first well-known
hash-based password authentication scheme.
Lamport's scheme suffers from high hash overhead and password resetting problems. Later, Shimizu et al. (1998) overcame the weakness of Lamport
(1981) and proposed a modified scheme. Thereafter, many schemes and improvements (Lee et al.,
2002; Peyravian and Zunic, 2000; Ku et al., 2003;
Ku, 2004) on hash-based remote user authentication have been proposed. These schemes have low computation cost and are computationally viable for implementation in a handheld device like a smart card; however, they primarily suffer from password guessing, stolen-verifier and


denial-of-service attacks (Ku et al., 2003; Hsieh
et al., 2003). In contrast, public-key based authentication schemes require high computation cost for
implementation, but meet higher security requirements. So far, several research works on public-key
based remote user authentication (Chang and Wu,
1993; Chang and Liao, 1994; Hwang and Yeh,
2002; Shen et al., 2003) have been done. Unfortunately, a paper often breaks a previous scheme and proposes a new one (Ku et al., 2003; Hsieh et al., 2003), which someone later breaks and replaces in turn, and so on. Most of such work, though quite important and useful, essentially provides an incremental advance on the same basic theme (Peyravian and Zunic, 2000).
Recently, bilinear pairings (Boneh and Franklin, 2001), namely the Weil pairing and the Tate pairing of algebraic curves, have found important applications (Boneh and Franklin, 2001; Hess, 2003) in cryptography and have allowed us to construct identity (ID)-based cryptographic schemes. In 1984, Shamir introduced the concept of the ID-based cryptosystem; however, practical ID-based schemes (Boneh and Franklin, 2001; Cocks, 2001) were not found until 2001.
In this paper, we present a remote user authentication scheme using the properties of bilinear pairings. In our scheme, the user is assigned a smart card, which is personalized with some parameters during the user registration process. The use of a smart card not only makes the scheme secure but also prevents the users from distributing their login-IDs, which effectively prevents the scenario of many logged in users with the same login-ID. The characteristics of our scheme are summarised as follows:
- The user's smart card generates a dynamic login request and sends it to the remote system for login to the system. The login request is computed by the smart card internally, without any human intervention, and incorporates the user system's timestamp. Thus, an adversary cannot predict the next login request with the help of the current login request.
- The users can choose and change their preferred
passwords freely without any assistance from
the remote system. During the user registration
process, the remote system stores a secret component and other parameters in a smart card,
and then sends it to the user securely. With the
help of the smart card and its secret component
the user can change his password without any assistance from the remote system.
- The remote system does not maintain any password or verifier table for the verification of the user's login request. The login request verification requires the user's identity and the remote system's public key corresponding to the remote system's secret key.
- The scheme prevents the scenario of many logged in users with the same login-ID. Typically, a registered user can share his password or secret component with others; thus all who know the password or secret component with respect to the user's login-ID can log in to the remote system. This generally happens in a digital library, where a subscriber can share his login-ID and password with others, and many users (who know the login-ID and password) can download or view the digital documents. In our scheme, the login request is generated by the smart card using its stored secret component without any human intervention. It is extremely difficult to extract the secret component from the smart card, and thus the user cannot share it with others. Even if the legitimate user's password is shared with others, the other person cannot log in to the system without the smart card. Once a valid user logs in to the remote system, his smart card remains inside the terminal until the user logs out. If the user pulls the card out of the terminal after logging in to the remote system, the login session expires immediately. Thus, the scheme can successfully prevent the scenario of many logged in users with the same login-ID.
- The scheme can resist the replay, forgery and
insider attacks.
The rest of the paper is organised as follows. In the next section, we give some preliminaries of bilinear pairings. In the section following that, we propose our scheme, and we analyse it in the section on correctness, performance and security. Finally, we conclude the paper in the last section.

Preliminaries
Bilinear pairings
Suppose G1 is an additive cyclic group generated by P, whose order is a prime q, and G2 is a multiplicative cyclic group of the same order. A map ê: G1 × G1 → G2 is called a bilinear mapping if it satisfies the following properties:
1. Bilinear: ê(aP, bQ) = ê(P, Q)^ab for all P, Q ∈ G1 and a, b ∈ Zq*.
2. Non-degenerate: there exist P, Q ∈ G1 such that ê(P, Q) ≠ 1.
3. Computable: there is an efficient algorithm to compute ê(P, Q) for all P, Q ∈ G1.
We note that G1 is the group of points on an
elliptic curve and G2 is a multiplicative subgroup
of a finite field. Typically, the mapping ê will be
derived from either the Weil or the Tate pairing
on an elliptic curve over a finite field.
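In particular, bilinearity gives the following identities, which the correctness argument for the verification equation later in the paper relies on (a small reminder we add here):

\[ \hat{e}(aP, Q) \;=\; \hat{e}(P, Q)^{a} \;=\; \hat{e}(P, aQ) \qquad \text{for all } P, Q \in G_1 \text{ and } a \in \mathbb{Z}_q^{*}. \]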

Mathematical problems

Definition 1 (Discrete Logarithm Problem (DLP)). Given Q, R ∈ G1, find an integer x ∈ Zq* such that R = xQ.
The MOV and FR reductions: Menezes et al. (1993) and Frey and Ruck (1994) show a reduction from the DLP in G1 to the DLP in G2. The reduction is: Given an instance Q, R ∈ G1, where Q is a point of order q, find x ∈ Zq* such that R = xQ. Let T be an element of G1 such that g = ê(T, Q) has order q, and let h = ê(T, R). Using the bilinear property of ê, we have ê(T, R) = ê(T, Q)^x, i.e., h = g^x. Thus, the DLP in G1 is no harder than the DLP in G2.

Definition 2 (Computational Diffie-Hellman Problem (CDHP)). Given (P, aP, bP) for a, b ∈ Zq*, compute abP.
The advantage of any probabilistic polynomial-time algorithm A in solving the CDHP in G1 is defined as Adv_{A,G1}^{CDH} = Prob[A(P, aP, bP, abP) = 1 : a, b ∈ Zq*]. For every probabilistic algorithm A, Adv_{A,G1}^{CDH} is negligible.

Proposed scheme
There are three entities in the proposed scheme,
namely the user, the user's smart card and the remote
system. The scheme consists mainly of three phases: the setup phase, the registration phase and the authentication phase.

Setup phase
Suppose G1 is an additive cyclic group of prime order q, and G2 is a multiplicative cyclic group of the same order. Suppose P is a generator of G1, ê: G1 × G1 → G2 is a bilinear mapping and H: {0, 1}* → G1 is a cryptographic hash function. The remote system (we call it RS in the rest of the paper) selects a secret key s and computes the public key as PubRS = sP. Then, the RS publishes the system parameters ⟨G1, G2, ê, q, P, PubRS, H⟩ and keeps s secret.

Registration phase
This phase is executed by the following steps when
a new user wants to register with the RS.
R1. Suppose a new user Ui wants to register with
the RS.
R2. Ui submits his identity IDi and password PWi to
the RS.
R3. On receiving the registration request, the RS computes RegIDi = s·H(IDi) + H(PWi).
R4. The RS personalizes a smart card with the parameters IDi, RegIDi, H(·) and sends the smart card to Ui over a secure channel.

Authentication phase
This phase is executed every time a user logs into the RS. The phase is further divided into the login and verification phases. In the login phase, the user sends a login request to the RS. The login request comprises a dynamic coupon, called DID, which depends on the user's ID, password and the RS's secret key. The RS allows the
user to access the system only after successful
verification of the login request.
Login phase
The user Ui inserts the smart card in a terminal and
keys IDi and PWi. If IDi is identical to the one that is
stored in the smart card, the smart card performs
the following operations:
L1. Computes DIDi = T·RegIDi, where T is the user system's timestamp.
L2. Computes Vi = T·H(PWi).
L3. Sends the login request ⟨IDi, DIDi, Vi, T⟩ to the RS over a public channel.

Verification phase
Let the RS receive the login message ⟨IDi, DIDi, Vi, T⟩ at time T* (≥ T). The RS performs the following operations to verify the login request:
V1. Verifies the validity of the time interval between T* and T. If (T* - T) ≤ ΔT, the RS proceeds to step (V2), where ΔT denotes the expected valid time interval for transmission delay. Otherwise, it rejects the login request. We note that at the time of registration, the user and the RS have agreed on the accepted value of the transmission delay ΔT.
V2. Checks whether ê(DIDi - Vi, P) = ê(H(IDi), PubRS)^T. If it holds, the RS accepts the login request; otherwise, it rejects it.
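To make the message flow concrete, the following is a minimal, self-contained C sketch of the setup, registration, login and verification steps. It is only an illustration of how the equations fit together, not the authors' implementation: the elliptic-curve pairing is replaced by a toy substitute in which G1 is modelled as the additive group Zq with P = 1 and ê(x, y) = g^(xy mod q) mod p over the order-q subgroup of Zp* (q = 1019, p = 2039, g = 4), and H is a toy string hash. These choices keep the arithmetic small enough to run but are completely insecure, and all identifiers (toy_hash, pair, and so on) are ours, not the paper's.

/* toy_bilinear_auth.c
 * Toy walk-through of the setup/registration/login/verification flow.
 * G1 is modelled as (Z_q, +) with generator P = 1, and the "pairing"
 * is e(x, y) = g^(x*y mod q) mod p, where g = 4 has prime order
 * q = 1019 in Z_p* with p = 2q + 1 = 2039.  This map is bilinear and
 * lets the verification equation be checked, but it is NOT secure
 * (the DLP in Z_q is trivial); it is for demonstration only. */
#include <stdio.h>
#include <time.h>

#define Q     1019UL   /* order of G1 and G2 (toy size)            */
#define P_MOD 2039UL   /* p = 2q + 1                                */
#define G     4UL      /* generator of the order-q subgroup of Z_p* */

/* modular exponentiation: base^exp mod m */
static unsigned long modexp(unsigned long base, unsigned long exp, unsigned long m) {
    unsigned long r = 1;
    base %= m;
    while (exp) {
        if (exp & 1) r = (r * base) % m;
        base = (base * base) % m;
        exp >>= 1;
    }
    return r;
}

/* toy hash onto G1 = Z_q (stands in for H: {0,1}* -> G1) */
static unsigned long toy_hash(const char *s) {
    unsigned long h = 5381;
    while (*s) h = (h * 131 + (unsigned char)*s++) % Q;
    return h;
}

/* toy pairing e(x, y) = g^(x*y) with x, y in G1 = Z_q */
static unsigned long pair(unsigned long x, unsigned long y) {
    return modexp(G, (x * y) % Q, P_MOD);
}

int main(void) {
    /* Setup: RS picks secret s and publishes PubRS = s*P (here P = 1) */
    unsigned long s = 777 % Q;
    unsigned long PubRS = s;

    /* Registration: RegID = s*H(ID) + H(PW) (all in Z_q) */
    const char *ID = "alice", *PW = "secret";
    unsigned long RegID = (s * toy_hash(ID) + toy_hash(PW)) % Q;

    /* Login: the smart card computes DID = T*RegID and V = T*H(PW) */
    unsigned long T = (unsigned long)time(NULL) % Q;
    if (T == 0) T = 1;
    unsigned long DID = (T * RegID) % Q;
    unsigned long V = (T * toy_hash(PW)) % Q;

    /* Verification (step V2): e(DID - V, P) == e(H(ID), PubRS)^T ?
       (a real RS would first check the timestamp freshness, step V1) */
    unsigned long lhs = pair((DID + Q - V) % Q, 1);
    unsigned long rhs = modexp(pair(toy_hash(ID), PubRS), T, P_MOD);

    printf("login %s\n", lhs == rhs ? "accepted" : "rejected");
    return 0;
}

Compiling and running this prints "login accepted", since DID - V = T·s·H(ID) and both sides of the check in step (V2) evaluate to the same group element; a real deployment would of course use a genuine pairing-friendly elliptic curve and a cryptographic hash-to-point function.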

Password change phase


This phase is invoked whenever a user Ui wants to
change his password. By invoking this phase, Ui can
easily change his password without taking any assistance from the RS. The phase works as follows:
P1. Ui attaches the smart card to a terminal and
keys IDi and PWi. If IDi is identical to the one
that is stored in the smart card, the smart card proceeds to step (P2); otherwise, it terminates the
operation.
P2. Ui submits a new password PWi*.
P3. The smart card computes RegIDi* = RegIDi - H(PWi) + H(PWi*) = s·H(IDi) + H(PWi*).
P4. The password has now been changed to the new password PWi*, and the smart card replaces the previously stored RegIDi value with the RegIDi* value.
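A one-line check (ours, not stated in the paper) that the update in step (P3) keeps future logins verifiable: a login request built with the new password PWi* still satisfies step (V2), because the quantity the RS effectively tests is unchanged,

\[ DID_i - V_i \;=\; T\bigl(s\,H(ID_i) + H(PW_i^{*})\bigr) - T\,H(PW_i^{*}) \;=\; s\,T\,H(ID_i), \]

exactly as with the old password.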

Correctness, performance and security


Correctness
The verification step (V2) of a login request is verified by the following:

ê(DIDi - Vi, P) = ê(T·RegIDi - Vi, P)
  = ê(T·(s·H(IDi) + H(PWi)) - T·H(PWi), P)
  = ê(s·H(IDi), P)^T   (as ê(aP, Q) = ê(P, Q)^a, by bilinearity of ê)
  = ê(H(IDi), PubRS)^T   (as ê(bP, Q) = ê(P, bQ) and PubRS = sP)

Performance
In order to compare the performance of our
scheme with the existing public-key based remote
user authentication schemes, we consider the
schemes (Chang and Liao, 1994; Shen et al.,
2003), which are based on ElGamal's (1985) signature scheme and use smart cards. The smart
card personalization cost for the registration process of our scheme is as per the schemes in (Chang
and Liao, 1994; Shen et al., 2003). The login phase
in (Chang and Liao, 1994; Shen et al., 2003) requires four discrete logarithm operations, one scalar multiplication and one hash computation, whereas the verification phase requires two discrete logarithm operations, one scalar multiplication, one hash computation and one inverse operation. Our scheme needs two scalar multiplications of an elliptic curve point and one hash-to-point operation in the login phase, and two bilinear pairing operations, one scalar multiplication of a curve point, one point addition and one hash-to-point operation in the verification phase. As the pairing operation is costly (Barreto et al., 2002), the verification phase of our scheme has a higher computation cost than the verification phase in (Chang and Liao, 1994; Shen et al., 2003). However, the verification process is done by the RS, which has ample computational resources, so the computation cost of the verification process is not a constraint. The computation cost at the user's system (e.g., the smart card) is the crucial issue, and the login phase of our scheme is more efficient than the login phase in (Chang and Liao, 1994; Shen et al., 2003). Furthermore, our scheme claims the following characteristics:

Claim 1. The scheme prevents the scenario of


many logged in users with the same login-ID.
Typically, a valid user can share his password or secret component with others; thus all who know the password or secret component corresponding to the user's login-ID can log in to the RS. For example, in a digital library, a subscriber can share his login-ID and password with others. Now, the users who know the login-ID and password of the genuine subscriber can download or view the information. In our proposed scheme, the login request is generated by the smart card using its stored secret component without any human intervention. The secret component cannot be extracted from the smart card and thus cannot be shared with others. Even if the legitimate user's password is shared, the other person cannot log in to the RS without the smart card. That is, only someone who has the smart card and knows the valid password can log in to the RS. It is noted that the smart card remains inside the terminal until the user logs out. If the user pulls the card out of the terminal after logging in to the RS, the login session expires immediately. Thus, the proposed scheme can successfully prevent the scenario of many logged in users with the same login-ID.

Claim 2. The scheme provides a user-friendly


password change option to the user without any
assistance from the RS.
The user can choose and change his preferred password freely without any assistance from the RS. The user is given a smart card during the user registration process, where the smart card is personalized with a secret component and some other parameters. With the help of the secret component, the user can change his password without any assistance from the RS. This avoids burdening the RS and reduces the communication cost of the password change protocol.

Claim 3. The RS does not maintain any password


or verifier table for user login request
verification.
In our scheme, the RS does not maintain any password or verifier table for the verification of the user's login request. The login request is verified using the user's identity and the RS's public key corresponding to the RS's secret key.

Security
Here, we show that the proposed scheme can
withstand the following attacks.
Replay attack
Suppose an adversary replays an intercepted valid login request and the RS receives the request at time Tnew. The attack cannot work, because it fails step (V1) of the verification phase: the time interval (Tnew - T) exceeds the expected transmission delay ΔT.
Forgery attack
A valid user login message consists of IDi, DIDi, Vi and T, where DIDi = T·RegIDi and Vi = T·H(PWi). The RegIDi is stored in the smart card by the RS at the time of Ui's registration, and it is extremely difficult to extract RegIDi from the smart card. An adversary cannot construct a valid RegIDi (= s·H(IDi) + H(PWi)) without the knowledge of the RS's secret key s and the user's password. If an adversary intercepts a valid login message ⟨IDi, DIDi, Vi, T⟩, he cannot reuse it later, because the timestamp will be different the next time and the replay fails step (V1) of the verification phase. If a valid smart card is stolen, the unauthorized user cannot log in to the RS because he does not know the password of the card owner. Furthermore, on intercepting a valid login request ⟨IDi, DIDi, Vi, T⟩, an adversary can get s·T·H(IDi) by calculating DIDi - Vi = T·RegIDi - T·H(PWi) = s·T·H(IDi). Using s·T·H(IDi), the adversary can try the following:
(i) Given ⟨T·H(IDi), s·T·H(IDi)⟩, compute s.
To compute s from ⟨T·H(IDi), s·T·H(IDi)⟩ is a hard problem; it is equivalent to solving the discrete logarithm problem (Definition 1 in Section Preliminaries).
(ii) Given ⟨s·T·H(IDi), T'·T·H(IDi), T·H(IDi)⟩, compute s·T'·T·H(IDi).
The adversary can choose a new timestamp T' and can try to generate a valid DIDi' - Vi', that is, s·T'·T·H(IDi). However, to compute s·T'·T·H(IDi) from ⟨s·T·H(IDi), T'·T·H(IDi), T·H(IDi)⟩ is a Computational Diffie-Hellman Problem (Definition 2 in Section Preliminaries).
Therefore, the adversary cannot forge a valid login request with the help of s·T·H(IDi).
Insider attack
In many scenarios, the user uses a common password to access several systems for his convenience. If the user's login request is password-based and the RS maintains a password or verifier table for login request verification, an insider of the RS could impersonate the user's login by stealing the password and gain access to the other systems. In our scheme, the user's login request is based on the user's password as well as the RS's secret key. The RS does not maintain any password or verifier table; thus an insider cannot obtain the user's password. Though the user submits his password to the RS during the registration process, he can change his password by invoking the password change phase after registration; thereby the scheme can withstand the insider attack.

Conclusion
We proposed a remote user authentication scheme
using the properties of bilinear pairings. The scheme prevents forgery attacks by employing a dynamic login request in every login session. The use of a smart card not only makes the scheme secure but also prevents the users from distributing their login-IDs, which effectively prohibits the scenario of many logged in users with the same login-ID. Moreover, the scheme provides a flexible password change option, where the users can change their passwords at any time without any assistance from the remote system.

References
Barreto PSLM, Kim HY, Lynn B, Scott M. Efficient algorithms for pairing-based cryptosystems. In: Advances in cryptology – Crypto'02, LNCS, vol. 2442. Springer-Verlag; 2002. p. 354–68.
Boneh D, Franklin M. Identity-based encryption from the Weil pairing. In: Advances in cryptology – Crypto'01, LNCS, vol. 2139. Springer-Verlag; 2001. p. 213–29.
Chang CC, Wu TC. Remote password authentication with smart cards. IEE Proceedings-E 1993;138(3):165–8.
Chang CC, Liao WY. A remote password authentication scheme based upon ElGamal's signature scheme. Computers & Security 1994;13(2):137–44.
Cocks C. An identity based encryption scheme based on quadratic residues. In: Cryptography and coding, LNCS, vol. 2260. Springer-Verlag; 2001. p. 360–3.
ElGamal T. A public key cryptosystem and signature scheme based on the discrete logarithms. IEEE Transactions on Information Theory 1985;31(4):469–72.
Frey G, Ruck H. A remark concerning m-divisibility and the discrete logarithm in the divisor class group of curves. Mathematics of Computation 1994;62:865–74.
Hess F. Efficient identity based signature schemes based on pairings. In: Selected areas in cryptography'02, LNCS, vol. 2595. Springer-Verlag; 2003. p. 310–24.
Hsieh BT, Sun HM, Hwang T. On the security of some password authentication protocols. Informatica 2003;14(2):195–204.
Hwang JJ, Yeh TC. Improvement on Peyravian–Zunic's password authentication schemes. IEICE Transactions on Communications 2002;E85-B(4):823–5.
IEEE P1363.2 draft D12: standard specifications for password-based public key cryptographic techniques. IEEE P1363 working group; 2003.
Ku WC, Chen CM, Lee HL. Weaknesses of Lee–Li–Hwang's hash-based password authentication scheme. ACM Operating Systems Review 2003;37(4):9–25.
Ku WC. A hash-based strong-password authentication scheme without using smart cards. ACM Operating Systems Review 2004;38(1):29–34.
Lamport L. Password authentication with insecure communication. Communications of the ACM 1981;24(11):770–2.
Lee CC, Li LH, Hwang MS. A remote user authentication scheme using hash functions. ACM Operating Systems Review 2002;36(4):23–9.
Menezes A, Okamoto T, Vanstone S. Reducing elliptic curve logarithms to logarithms in a finite field. IEEE Transactions on Information Theory 1993;39:1639–46.
Menezes A, van Oorschot PC, Vanstone S. Handbook of applied cryptography. CRC Press; 1996.
Peyravian M, Zunic N. Methods for protecting password transmission. Computers & Security 2000;19(5):466–9.
Shamir A. Identity-based cryptosystems and signature schemes. In: Advances in cryptology – Crypto'84, LNCS, vol. 196. Springer-Verlag; 1984. p. 47–53.
Shen JJ, Lin CW, Hwang MS. A modified remote user authentication scheme using smart cards. IEEE Transactions on Consumer Electronics 2003;49(2):414–6.
Shimizu A, Horioka T, Inagaki H. A password authentication method for contents communication on the Internet. IEICE Transactions on Communications 1998;E81-B(8):1666–73.
Manik Lal Das received his M. Tech.
degree in 1998. He is working at the Institute for Development and Research in Banking Technology, Hyderabad, as a Research Officer and pursuing his Ph.D. degree at the K. R. School of Information Technology, Indian Institute of Technology, Bombay, India. He has published over 15 research articles in refereed journals and conferences. He is a member of the Cryptology Research Society of India and the Indian Society for Technical Education. His research interests include Cryptography and Information Security.
Ashutosh Saxena received his M.Sc.
(1990), M. Tech. (1992) and Ph.D. in
Computer Science (1999) from Devi
Ahilya University, Indore. Presently, he is working as an Associate Professor at the Institute for Development and Research in Banking Technology, Hyderabad. He is on the Editorial Committees of various International Journals and Conferences, and is a Life Member of the Computer Society of India and the Cryptology Research Society of India and a Member of the IEEE Computer Society. He has authored and co-authored more than 50 research papers published in National/International Journals and Conferences. His main research interest is in the areas of Authentication Technologies, Smart Cards, Key Management and Security Issues in Banking.
Ved P. Gulati received his Ph.D. degree from Indian Institute of Technology, Kanpur, India. Presently, he is
a consultant advisor in Tata Consultancy Services, Hyderabad, India. He
was Director of Institute for Development and Research in Banking Technology, Hyderabad, India from 1997
to 2004. He is a member of the IEEE, the Cryptology Research Society of India and the Computer Society of India. His research interests include Payment Systems, Security Technologies, and Financial Networks.
Deepak B. Phatak received his Ph.D.
degree from the Indian Institute of Technology, Bombay, India. He is the Subrao M. Nilekani Chair Professor at the K. R. School of Information Technology, Indian Institute of Technology Bombay, India. His research interests include Databases, System Performance Evaluation, Smart Cards and Information Systems.

Computers & Security (2006) 25, 190–200

www.elsevier.com/locate/cose

A novel approach for computer security education using Minix instructional operating system*
Wenliang Du*, Mingdong Shang, Haizhi Xu
Department of Electrical Engineering and Computer Science, 3-114 Center for Science and Technology,
Syracuse University, Syracuse, NY 13244, USA
Received 8 December 2004; revised 23 September 2005; accepted 23 September 2005

KEYWORDS
Computer security;
Education;
Courseware;
Laboratory projects;
Minix

Abstract To address national needs for computer security education, many universities have incorporated computer and information security courses into their undergraduate and graduate curricula. In these courses, students learn how to design, implement, analyze, test, and operate a system or a network to achieve security. Pedagogical research has shown that effective laboratory exercises are critically important to the success of these types of courses. However, such effective laboratories do not exist in computer security education.
Intrigued by the successful practice in operating system and network courses, we adopted a similar practice, i.e., building our laboratories based on an instructional operating system. We use the Minix operating system as the lab basis, and in each lab we require students to add a different security mechanism to the system. Benefiting from the instructional operating system, we design our lab exercises in such a way that students can focus on one or a few specific security concepts while doing each exercise. A similar approach has proved to be effective in teaching operating system and network courses, but it has not yet been used in teaching computer security courses.
© 2005 Elsevier Ltd. All rights reserved.

Introduction
* The project is supported by grant DUE-0231122 from the National Science Foundation and by funding from the CASE Center.
* Corresponding author. Tel.: +1 315 443 9180; fax: +1 315 443 1122.
E-mail addresses: wedu@ecs.syr.edu (W. Du), mshang@ecs.syr.edu (M. Shang), hxu02@ecs.syr.edu (H. Xu).

The high priority that information security education warrants has been recognized since the early 1990s. In 2001, Eugene Spafford, director of the
Center for Education and Research in Information
Assurance and Security (CERIAS) at Purdue University, testified before Congress that to ensure safe


computing, the security (and other desirable properties) must be designed in from the start. To do
that, we need to be sure all of our students
understand the many concerns of security, privacy,
integrity, and reliability (Spafford, 1997).
To address these needs, many universities have
incorporated computer and information security
courses into their undergraduate and graduate
curricula. In many curricula, computer security
and network security are two core courses. These
courses teach students how to design, implement,
analyze, test, and operate a system or a network
with the goal of making it secure. Pedagogical
research has shown that students' learning is
enhanced if they can engage in a significant
amount of hands-on exercises. Therefore, effective laboratory exercises (or course projects) are
critically important to the success of computer
security education.
Traditional courses, such as operating systems,
compilers, and networking, have effective laboratory exercises, as the result of 20 years' maturation. In contrast, laboratory designs in security
education courses are still embryonic. A variety of
approaches are currently used; three of the most
frequently used designs are the following: (1) the
free-style approach, i.e., instructors allow students to pick any security-related topic they are
interested in for the course projects; (2) the dedicated computing environment approach, i.e., students conduct security implementation, analysis
and testing (Hill et al., 2001; Mayo and Kearns,
1999) in a contained environment; and (3) the
build-it-from-scratch approach, i.e., students
build a secure system from scratch (Mitchener
and Vahdat, 2001).
Free-style design projects are effective for
creative students; however, most students become
frustrated with this strategy because of the difficulty in finding an interesting topic. With the
dedicated environment approach, projects can be very interesting, but they come with the logistical burdens of the laboratory: obtaining, setting up, and managing the computing environment. In addition, course size is constrained by the size of the dedicated environment. The third design approach requires students to spend a considerable amount of time on activities that are irrelevant to computer
security education but are essential for a meaningful and functional system.
The lack of an effective and efficient laboratory
for security courses motivated us to consider
practices adopted by the traditional mature
courses, e.g., operating systems (OS) and compilers. In OS courses, a widely adopted successful
practice is using an instructional OS (e.g., MINIX

(Tanenbaum, 1996), NACHOS (Christopher et al.,


1993), and XINU (Comer, 1984)) as a framework
and asking students to write significant portions of
each major piece of a modern OS. The compiler
and network courses adopted a similar approach.
Inspired by the success of the instructional OS
strategy, we adapt it to our computer security
courses. Specifically, we provide students with
a system as the framework, and then ask them to
implement significant portions of each fundamental security-relevant functionality for a system.
Although there are a number of instructional systems for OS courses, to our knowledge, this approach has not yet been applied to computer and
information security courses.
Our goal is to develop a courseware system,
serving as an experimental platform and framework for computer security courses. The courseware is not designed to create new security
mechanisms, but to let students practice existing
security work. The courseware contains a set of
well defined and documented projects for helping
students focus on (1) grasping security concepts,
principles, and technologies; (2) practicing design
and implementation of security mechanisms and
policies; and (3) analyzing and testing a system
for its security properties.
We chose Minix as our base system, and have
designed a number of laboratory assignments on
it. These assignments cover topics ranging from
the design and implementation of security mechanisms to the analysis and testing of a system for security purposes. Each assignment can be considered
as adding/modifying security mechanisms to
Minix. To finish each task, students just need to
focus on those security mechanisms, with minimum effort on other parts of the system. For
example, while learning discretionary access control (DAC), we give students a file system without
DAC mechanisms; students only need to design
and implement DAC for this existing file system.
Students can immediately see how their DAC
implementation affects the system. This strategy
helps students to stay focused on security
concepts.
Our course projects consist of two parts. One
part focuses on design and implementation. This
part of the projects requires students to add new
security mechanisms to the underlying Minix system to enhance its security. The security mechanisms students need to implement include access
control, capability, sandbox, and encrypted file
systems. In the second part of our projects, we
give students a modified Minix system that contains a number of injected vulnerabilities. Students need to use the skills they have learned from the

lectures to identify, exploit, and fix those
vulnerabilities.
Our approach is open-ended, i.e., we can add
more laboratory projects to this framework without affecting others. The projects presented in
this paper are the result of three years' maturation, with more components added each year. We are also planning to design a number of network security projects for Minix based on Minix's existing networking functionality.
The paper is organized as follows: the next section briefly describes our computer security course. Then the design of our courseware is described, followed by a description of each of our laboratory projects. The experiences and lessons we have gained during our 3-year practice are then presented. Finally, the last section concludes the paper and describes future work.

The computer security course


Scope of the course
Our department offers two graduate courses in
security: one is computer security, and the other
is network security. The computer security course
focuses on the concepts, principles, and techniques
for system security, such as encryption algorithms,
authentication, access control, privilege, vulnerabilities, system protection, etc. Currently, our
proposed approach only targets the computer
security course, but we plan to extend this approach
to the network security course in our future
work.

Pedagogical approach
Lecturing on theories, principles and techniques of
computer security is not enough for students to
understand system security. Students must be able
to put what they have learned into use. We use the
"learning by doing" approach. It was shown in
other studies that this type of active learning
approach has a higher chance of having a lasting
effect on students than letting students passively
listen to lectures without reinforcement (Meyers
and Jones, 1993).
More specifically, we try to use the Minix OS as
our base system to develop assignments that can
give students hands-on experience with those theories taught in class. For example, when teaching
the Set-UID concept of Unix, we developed an assignment for students to play with this security

mechanism, figure out why it is needed, and
understand how it is implemented.
We have developed two types of assignments:
small assignments and comprehensive assignments. Each small assignment focuses on one
specific concept, such as Set-UID and access control. These assignments are usually small; they do
not need much programming, and take only 1 or 2
weeks; therefore, we can have several small projects to cover a variety of concepts in system security. However, being able to deal with each
individual concept is not enough; students need
to learn how to put them together. We have developed comprehensive assignments, which cover
a number of concepts in one assignment. They
are ideal candidates for final projects.

Course prerequisites
Because this course focuses on system security, we
require students to have appropriate system background. Students taking the course are expected
to have taken a graduate-level operating systems course. They should be proficient in C programming.

Design of course projects


The goal of our projects is to provide a set of
exercises for students to practice their security
design, implementation, analysis, testing, and operation skills. Using the Minix instructional operating system, we designed two classes of projects,
one focusing on design and implementation of security mechanisms, and the other focusing on security analysis and testing. The overview of our
projects is depicted in Fig. 1.
Design and implementation. In our computer
security class, we aim at covering a number of
important security mechanisms, such as Privilege,
Authentication, Access Control, Capability, and
Sandboxing. We expect students to have first-hand
experience with most of them during a one-semester period. However, asking students to implement a system with all of these mechanisms from scratch is infeasible. Using an instructional operating system, our goal becomes feasible for the following reasons: (1) An instructional OS provides
students with a structured framework upon which
they can build various security mechanisms. (2)
An instructional OS is functional even if the
students have not implemented the security
modules completely. This gives students quick
feedback as to how their implementations work
and whether the modules are implemented
correctly.


Figure 1 Overview of course projects based on Minix: the Minix instructional operating system underpins security design & implementation projects (preparation, privilege/Set-UID, access control, capability, sandboxing, encrypted file system) and security exploit, analysis & testing projects (a pool of vulnerabilities).

Some of the security mechanisms are already


implemented in Minix, such as privilege and
access control. For some of these mechanisms,
our projects are designed in a way that requires
students to study and play with the existing implementation, so they can gain first-hand experience.
For other existing mechanisms, we ask students
to extend them and add more functionalities. For
example, we ask students to extend the Minixs
abbreviated access control list mechanism to support full access control lists. Several security
mechanisms that we cover in class do not exist in
Minix, such as capability and encrypted file system. For them, we designed course projects that
ask students to implement these mechanisms in
Minix. To make the tasks doable within 2e3
weeks, the security mechanisms are simplified
compared to those implemented in a real operating system.
Security analysis and testing. To master the security analysis and testing skills the students have
learned from the class, they need to practice those
skills in some systems. One way to do this is to give
them a vulnerable system, such as older versions
of Windows 2000 or Linux, and ask them to find
security flaws in those systems. Although these
systems contain many vulnerabilities, identifying
and exploiting them is not a trivial task even for
seasoned system administrators, much less students who have just learned the basic skills.
We have created a pool of vulnerable components for Minix, with some in the application layer
and some in the kernel layer. The vulnerabilities
we choose reflect vulnerabilities in the real world.
They include buffer-overflow errors, race condition errors, sym-link errors, input validation errors, authentication errors, domain errors, and
design errors (Landwehr et al., 1994).
Instructors can choose the vulnerable components they like and inject them into Minix. The
flawed Minix system is then given to students,

who need to find those vulnerabilities and exploit


them. Before starting these exercises, students
are equipped with theoretical knowledge of these
vulnerabilities, the methods of detection and exploitation, and the methodologies of penetration
testing and standard security testing.

Why choose Minix?


Before we decided to use Minix, we investigated a number of alternatives. We had the following criteria in mind when choosing an operating
system as the base of our courseware:
1. Source code availability. Because the system
security course involves implementation of system security mechanisms, studying the source
code is important for the learning process.
2. Complete but not complex. The OS should
provide sufficient infrastructure to students.
Students should be able to immediately see
how their implementation behaves without
having to build the security-irrelevant components to make the whole system work. However, the OS should not be too complex;
otherwise students need to spend much time
in understanding the underlying system.
3. Modularized. The security modules in the
system should be highly modularized, so that
they can be modified or replaced independently.
4. No need for superuser privilege. It is preferable for students to carry out lab assignments
in a general computing environment using normal user accounts, as opposed to in a dedicated
computing environment using superuser
privileges.
A full-featured OS like Linux seems a good
candidate because of its completeness. However,
if we choose such an operating system, the


students will need a considerable amount of time to understand the functionality of the OS and thus
lose focus on security. To overcome this drawback,
many operating system courses use simplified operating systems, such as Xinu, Nachos and Minix,
for educational purposes. We adopted a similar
practice.
Most computer security course projects require the administrator/superuser privilege, which can jeopardize the security of the experimental environment. With the superuser privilege, students can have complete control over the experimental domain. A malicious student might use it to gain unwanted access to other people's accounts. Even if all students are well behaved, they might accidentally introduce security holes into the system because of their lack of system administration experience. Some universities do give students the superuser privilege for this type of project, but the computers have to be restricted to an isolated environment. Although this approach has been widely used in practice, it requires a high cost for lab setup and management. We chose a different approach: to enable students to build and run the operating system without giving them the superuser privilege.
We chose the Minix instructional operating system as our base system for three reasons: first, Minix is complete compared to other Unix-style instructional OSs; second, Minix can run on Solaris systems as a non-privileged process; third, Minix
is small and easy to understand. Table 1 compares
the pros and cons of using different OSs as the base
of our courseware.

Introduction to the Minix operating system


Minix is a Unix operating system, and its name came from "mini Unix". As an instructional operating system, the Minix system is designed to be small and simple. It only has about 15,000 lines of code, which are publicly available at http://www.cs.vu.nl/~ast/minix.html (Tanenbaum).


A textbook was also written by Tanenbaum to explain how Minix works (Tanenbaum, 1996). Students meeting the prerequisites can understand this operating system within a short period of time. The Minix system has a highly modular structure, which makes it not only easy to understand, but also easy for students to extend and modify.
Minix was originally developed as a real operating system, running directly on Intel x86 machines.
Later on, Ashton ported Minix to run on the
SUN Solaris systems as a non-privileged process
(Ashton, 1996).

Course projects
Laboratory setup
We use Minix on Solaris in our course. All of the
laboratory exercises will be conducted in the SUN Solaris environment using the C language. Except for giving students more disk space (100 MB) to store the files of the Minix system, Minix poses no special requirements on the general Solaris computing environment.
The Minix operating system can also be installed in simulated environments such as VMware, Bochs, and so on. Installing the operating system on VMware is not a difficult process, and no superuser privilege is needed to run Minix on VMware. Therefore, this could be another installation option. Both approaches can be used in our laboratory designs. However, we preferred the Solaris approach, so students do not need to buy a VMware license or use freeware that is not yet stable.
We have designed a variety of course projects
on Minix. Depending on the course schedule and
the students' familiarity with Unix and their proficiency in C programming, instructors might want
to choose a subset of the projects we designed.
Currently, we are still developing more assignments, and we will also solicit contributions from

Table 1 A comparison of various operating systems

                    Source code     Complete   Complex   Superuser    Modularized
                    availability                         privilege
Instructional OS
  Minix             Yes             Yes        No        No           Yes
  Nachos            Yes             Partial    No        No           Yes
  Xinu              Yes             Yes        No        Yes          Yes
Commercial OS
  Linux             Yes             Yes        Yes       Yes          Yes
  BSD               Yes             Yes        Yes       Yes          Yes
  SunOS             No              Yes        Yes       Yes          Yes
  Windows           No              Yes        Yes       Yes          Yes


other people. Our goal is to create a pool of lab assignments, such that different instructors can
choose the subset to meet the requirements of
their syllabi.


Preparation
In this warm-up project, students get familiar with
the Minix operating system, such as installing and
compiling the Minix OS, conducting simple administration tasks (e.g., adding/removing users), and
learning to use/modify some common utilities.
More importantly, we want students to understand
the Minix kernel. For our system security course, students just need to understand in detail the system calls, the file system, and the data structures of the i-node and the process table. They do not need to study non-security modules such as process scheduling and memory management. Students meeting the prerequisites should be comfortable with the Minix environment in 2–3 weeks.
The following is a list of sample tasks we used.
In reality, instructors can choose different tasks to
achieve the same goals:
 Compile and install Minix, then add three user
accounts to the system.
 Change the password verification procedure, such that a user is blocked for 15 min after three failed attempts (a minimal sketch of such lockout logic is given after this list).
 Implement system calls to enable users to print
out attributes in i-node and process table.
Appropriate security checking should be implemented to ensure that a user cannot steal information from other accounts.
Our experiments show that it is better to guide
students to conduct the above tasks in one or two
lab sessions, in which a teaching assistant can provide immediate help. These lab sessions are especially necessary when students have significantly different backgrounds.

Set-UID programs
Set-UID is an important security concept in Unix
operating systems. It is a good example to show
students how privileges are escalated in a system.
In this project, students learn the Set-UID concept
and its implementation. Students also learn how
an attacker can escalate his privileges via exploiting a vulnerable Set-UID program.
Students need to finish the following tasks: (1) Figure out why the passwd, chsh, and su commands need to be Set-UID programs, and what will happen if they are not. (2) Students are given the binary code of the passwd program, which contains a number of security flaws injected beforehand. Students need to identify those flaws and exploit the vulnerable program to gain root privileges. (3) Read the Minix source code and figure out how Set-UID is implemented in the system. (4) Modify the kernel source code to disable the Set-UID mechanism.
This project is quite straightforward. On average it takes students 1 week to finish.
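As a small companion illustration (ours, not part of the assignment handout), the following C program makes the Set-UID effect visible. When the compiled binary is owned by root and has the set-user-ID bit set (chown root suid_demo; chmod 4755 suid_demo), the real and effective UIDs it prints differ, which is exactly the privilege escalation the tasks above explore.

/* suid_demo.c - show the difference between real and effective UID.
 * Run normally: both IDs are the caller's.
 * Run as a root-owned Set-UID binary: the effective UID becomes 0. */
#include <stdio.h>
#include <unistd.h>

int main(void) {
    uid_t ruid = getuid();   /* real UID: who actually ran the program   */
    uid_t euid = geteuid();  /* effective UID: whose privileges are used */

    printf("real UID = %d, effective UID = %d\n", (int)ruid, (int)euid);

    if (ruid != euid) {
        printf("running with escalated (Set-UID) privileges\n");
        /* a careful Set-UID program drops the extra privilege as soon as
           it is no longer needed */
        if (seteuid(ruid) == 0)
            printf("effective UID dropped back to %d\n", (int)geteuid());
    }
    return 0;
}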

Access control list


Access control is an important security mechanism
implemented in many systems. It can be classified
as Discretionary Access Control (DAC) and Mandatory Access Control (MAC). In DAC systems, the owner
of an object can decide its security properties
(e.g., who can read this file?); while in MAC
systems, the security properties are determined
and controlled only by a security manager. Access
permissions can be represented on a per object
basis (i.e., who can do what operations on an
object); this is called Access Control Lists. Permissions can also be represented on a per subject
(principal) basis (i.e. what operations on what objects the subject can do); this is called Capabilities. This project focuses on access control lists.
The goal of this project is two-fold: (1) to get
first-hand experience with DAC and (2) to be able
to implement DAC. Minix already has an implementation of abbreviated ACL; namely the access
control is based on three classes: owner, group,
and others. Students need to extend this abbreviated ACL to a full ACL, i.e., a user can assign a specific access right to any single user. On average
students need about 2–3 weeks to finish this project. Students need to deal with the following issues (a minimal representation sketch follows the list):
 How access control works: Before working on
their implementations, students need to understand the entire process of access control,
and they need to trace the program execution
to find out how the access control is conducted
in Minix. This enhances their understanding of
access control.
 ACL representation: Students need to think
about how to represent the full ACL, how to allow ACLs to specify access permissions on a per
user (principal) basis, rather than the current
owner-group-other protection method. Students also need to make their representation flexible enough to support adding and removing entries.

 Storing the ACLs: This is another challenging
part of the project. Students need to think about
where exactly they should store the access
control list. The current Minix implementation
does not seem to have a place to store the full
access control list. Students need to solve this
issue. A hint we give them is to use some unused entries in i-nodes or store the access
control lists in separate files.
 ACL management: In addition to implementing
the full ACL in the kernel, students also need to
implement the corresponding utilities, such
that users can manage the access control list
of their own files.
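For concreteness, here is one possible, deliberately simplified in-memory representation of a full ACL in C, with a check routine. It is our illustration only, not the Minix data structures or a student solution, and it sidesteps the on-disk storage question raised above.

/* acl_sketch.c - toy per-user ACL: a linked list of (uid, permission-bits)
 * entries attached to a file object, checked before access is granted. */
#include <stdio.h>
#include <stdlib.h>

#define PERM_READ  0x1
#define PERM_WRITE 0x2
#define PERM_EXEC  0x4

struct acl_entry {
    int uid;                 /* the user this entry applies to */
    unsigned perms;          /* bitwise OR of PERM_* flags     */
    struct acl_entry *next;
};

struct file_obj {
    int owner_uid;
    struct acl_entry *acl;   /* full ACL, one entry per user   */
};

/* add (or extend) an entry for uid */
void acl_grant(struct file_obj *f, int uid, unsigned perms) {
    struct acl_entry *e;
    for (e = f->acl; e; e = e->next)
        if (e->uid == uid) { e->perms |= perms; return; }
    e = malloc(sizeof *e);
    e->uid = uid; e->perms = perms; e->next = f->acl;
    f->acl = e;
}

/* 1 if uid may perform the requested operations, 0 otherwise;
   the owner is always allowed (a common DAC convention). */
int acl_check(const struct file_obj *f, int uid, unsigned wanted) {
    const struct acl_entry *e;
    if (uid == f->owner_uid) return 1;
    for (e = f->acl; e; e = e->next)
        if (e->uid == uid) return (e->perms & wanted) == wanted;
    return 0;
}

int main(void) {
    struct file_obj f = { 100, NULL };      /* file owned by uid 100 */
    acl_grant(&f, 200, PERM_READ);          /* uid 200 may read      */
    printf("uid 200 read:  %d\n", acl_check(&f, 200, PERM_READ));   /* 1 */
    printf("uid 200 write: %d\n", acl_check(&f, 200, PERM_WRITE));  /* 0 */
    printf("uid 300 read:  %d\n", acl_check(&f, 300, PERM_READ));   /* 0 */
    return 0;
}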

Capability
Capability is another important concept in computer security. The goal of this project is to help
students understand the concept of capability. We
defined a set of capabilities in this project, with
each capability representing whether a process
can invoke a specific system call. Students need to
implement these capabilities in Minix. Specifically, their capability mechanism should be able to
achieve the following functionalities: (1) Permission granting based on capability. (2) Capability
copying: A process should be able to copy its capabilities to another process. (3) Capability reduction/restoration: A process should be able to reduce its current capabilities and later restore them. For example, a process can temporarily remove its own Set-UID capability and later add it back. Of course, a process cannot assign a new capability to itself. (4) Capability revocation: Root should
be able to revoke capabilities from processes.
In this project, students need to take care of
the following issues:
 Capability list representation: Students need
to think about how to represent the set of defined capabilities. They also need to think about how to associate capabilities with each process. The final representation should conveniently support the required functionalities (e.g., copying, removing, etc.); one possible representation is sketched after this list.
 Storing the capabilities: This is another challenging part of the project where students
need to think where capabilities should be
stored.
One option is to add an entry to the process
table to store the capabilities. A potential issue is
how feasible it is to extend the process table (note
that the process table is a kernel data structure
used by many other components).

 Capability revocation: Students need to think
about how to revoke an object's capability.
They must be careful not to introduce vulnerabilities in this part.
 Capability management: Students need to take
care of two types of users, normal and superusers. They need to consider the following issues:
how they manage these two types of users, and
what functionalities are associated with each
of them.
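As one possible shape for the representation (an illustrative sketch of our own, with assumed names; it is not the Minix code and not a prescribed design), the per-process capability set can be a bitmask kept alongside the process-table entry, with one bit per guarded system call:

/* Illustrative sketch only: per-process capabilities as a bitmask. */
#include <stdint.h>

typedef uint32_t cap_set_t;

#define CAP_FORK    (1u << 0)
#define CAP_EXEC    (1u << 1)
#define CAP_SETUID  (1u << 2)
#define CAP_KILL    (1u << 3)
/* ... one bit per system call chosen for the project ... */

struct proc_caps {
    cap_set_t current;   /* capabilities the process holds now           */
    cap_set_t dropped;   /* removed bits remembered so they can be
                            restored later, but never extended           */
};

/* Check inside a guarded system-call handler. */
static inline int cap_allowed(const struct proc_caps *pc, cap_set_t cap)
{
    return (pc->current & cap) == cap;
}

/* Temporarily drop a capability; it can be restored later because it is
 * remembered in `dropped`, but a bit that was never held cannot be gained. */
static inline void cap_drop(struct proc_caps *pc, cap_set_t cap)
{
    pc->dropped |= pc->current & cap;
    pc->current &= ~cap;
}

static inline void cap_restore(struct proc_caps *pc, cap_set_t cap)
{
    pc->current |= pc->dropped & cap;
}

/* Copying to another process simply hands over the current set. */
static inline cap_set_t cap_copy(const struct proc_caps *from)
{
    return from->current;
}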
This project enhanced the students' understanding of the capability concept. At the beginning, most students had trouble mapping the capability concept to the real world. We did not tell the students how the capability should be implemented, but instead asked them to design their own capability mechanisms. This requires them to figure out how the
capabilities should be represented in the system,
where to store the capabilities, how the system can
use the capability to conduct access control, etc.
Once students have figured out all of these issues,
the implementation becomes relatively easy;
therefore the amount of coding for this project is
not significant, and students are able to accomplish
the task within 2 weeks. Had it not been for Minix,
students would need to spend a lot of time implementing a meaningful system where the effect of
the capability can be demonstrated.
We encouraged students to design some other
features beyond the basic requirements. Students
were highly motivated: some implemented a more generic capability-based access control mechanism than the required one, and some allowed new capabilities to be defined by the superuser.

Sandbox
A sandbox is an environment in which the actions
of an untrusted process are restricted according to
a security policy (Bishop, 2002). Such restriction
protects the system from untrusted applications.
In Unix, chroot can be used to achieve a simple
sandbox.
The command chroot newroot cmd causes cmd to be executed relative to newroot, i.e., the root directory is changed to newroot for cmd and any of its child processes. Any program running within this sandbox can only access files within the subdirectory of newroot.
Some Unix systems allow normal users to run the chroot sandbox (by making chroot a Set-UID program). However, this can introduce a serious
problem: malicious users may create a login environment with their own shadow file and passwd
file under newroot, which will help them gain


a root shell. Once they get that privilege, they can create a Set-UID shell program that they can use after exiting the sandbox. The attack is described in the following:


test $ mkdir /tmp/etc
test $ echo root::0:0::/:/bin/sh > /tmp/etc/passwd
test $ mkdir /tmp/bin
test $ cp /bin/sh /tmp/bin/sh
test $ cp /bin/chmod /tmp/bin/chmod
test $ chroot /tmp /bin/login      (login as root with no password)
root # chmod 4755 /bin/sh          (change shell to Set-UID)
root # exit
test $ cd /tmp/bin
test $ ./sh
root #                             (get root shell in real system)
One of the goals of this project is to let students
find out this vulnerability with some provided
clues. Students need to implement attack procedures and demonstrate how to take advantage of
the vulnerability to gain root privileges. This is an efficient way for students to deepen their understanding of security holes at the kernel level.
The simplest way to fix the above vulnerability is to disallow normal users from using chroot; however, normal users will then not be able to take advantage of the sandbox. We therefore ask students to extend the current chroot such that the program is safe to be used by normal users.
We suggest that students design a security policy for this sandbox. A sandbox security policy defines a set of permissions and restrictions that a program must obey while running. For example, the policy can define whether a program is permitted to read files or connect to the Internet. Any program attempting to violate the security policy will be blocked. Students need to consider a number of issues, including how to define the policy, where to save it, when it should be read in, and how to secure the policy file. Students should be able to finish this project within 2–3 weeks.
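As one illustration of what such a policy might look like in memory (our own sketch under assumed names, not a design required by the project), a per-sandbox policy could be a set of permission flags plus a list of readable path prefixes that a guarded system call consults:

/* Illustrative sketch only: a possible in-kernel sandbox policy and the
 * check a guarded system call could make.  All names are assumptions. */
#include <string.h>

#define POLICY_ALLOW_READ     0x01
#define POLICY_ALLOW_WRITE    0x02
#define POLICY_ALLOW_NETWORK  0x04
#define POLICY_MAX_PATHS      8

struct sandbox_policy {
    unsigned flags;                                /* OR of POLICY_* bits  */
    char readable_prefixes[POLICY_MAX_PATHS][64];  /* allowed path roots   */
};

/* Return non-zero if reading `path` is permitted by the policy. */
int policy_allows_read(const struct sandbox_policy *p, const char *path)
{
    if (!(p->flags & POLICY_ALLOW_READ))
        return 0;
    for (int i = 0; i < POLICY_MAX_PATHS; i++) {
        size_t len = strlen(p->readable_prefixes[i]);
        if (len > 0 && strncmp(path, p->readable_prefixes[i], len) == 0)
            return 1;
    }
    return 0;
}

How the policy file is located, parsed, and protected from tampering is exactly the set of issues the project leaves to the students.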

Encrypted file system


A non-encrypted file system stores plain text on disks, so if the disk is stolen, the information on it can be disclosed. An Encrypted File System (EFS) solves this problem by encrypting the file system, such that only users who know the encryption keys can access the files. The primary benefit of an EFS is to defend against unauthorized access. The encryption/decryption operations should be transparent to users. Implementing an EFS requires students to combine techniques such as encryption, key management, authentication, access control, and security in OS kernels and file systems; it is therefore a comprehensive project, and we give it as the final project.
The Minix system has a complete file system, so students can build the EFS on top of it. As we mentioned before, the Minix file system is reasonably easy to understand; students can start building their own EFS once they understand how the file system works.
This project is a good candidate for the final comprehensive project because it covers a variety of security-related concepts and properties:
 User transparency: The main challenge of this


project is how to make the EFS transparent. If transparency were not an issue, students could easily implement a set of encryption/decryption utilities that users invoke manually to encrypt and decrypt their files. Transparency means that encryption/decryption should be performed on the fly while users are reading and writing their files. To achieve transparency, students need to modify the system calls related to reading and writing, and insert the encryption algorithms at the proper positions in those system calls.
 Key management: Another challenge of this
project is the key management, namely how
and where the encryption keys should be
stored, how the keys should be protected,
changed, and revoked. We have seen different
designs from students. For example, regarding
the key storage problem, some students store
the key (encrypted) in a file, and some store
it in the i-node of the encrypted file. We also
found out that some students mistakenly save
the plain text key on the disk, which defeats
the whole purpose of the EFS.
 Authentication: How to decide whether a user
can access the encrypted file system or not?
This part of the project not only teaches students about authentication; more importantly, it teaches an important lesson about the trade-off between usability and security. Some students' projects require users to authenticate themselves each time they access a file in the EFS; some conduct just one authentication when the user mounts the EFS (a good implementation in our opinion); some conduct the authentication during login. During their demos, we point out the advantages and disadvantages of their designs, so they can evaluate their own designs.
 Using encryption and hashing algorithms: Although students are provided with codes for
encryption and hashing algorithms, they still need to learn how to use the code correctly. Because AES is a block cipher, students need to deal with the issues related to block size and padding; otherwise, their reading/writing system calls might not function correctly (a small padding sketch follows this list).
 Security analysis: After most of the students
had finished their designs, we gave them several incorrect designs that we had encountered in the past, and we asked them to determine whether those designs are secure or not and, if not, how to break those EFSs.
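For example, a write path must round the plaintext up to AES's 16-byte block size before handing it to the cipher. The following is a minimal sketch of PKCS#7-style padding; aes_encrypt_block is an assumed name for the provided block-cipher routine, since the paper does not specify the interface of the code handed to students:

/* Minimal sketch of PKCS#7-style padding for AES's 16-byte blocks.
 * aes_encrypt_block() is an assumed, instructor-provided routine that
 * encrypts exactly one 16-byte block; ECB is used here only for brevity. */
#include <string.h>

#define AES_BLOCK 16

void aes_encrypt_block(const unsigned char in[AES_BLOCK],
                       unsigned char out[AES_BLOCK],
                       const unsigned char *key);   /* assumed, provided */

/* Pads `len` bytes from `in` into `out` (which must hold len + AES_BLOCK
 * bytes), encrypts block by block, and returns the padded length. */
size_t encrypt_with_padding(const unsigned char *in, size_t len,
                            unsigned char *out, const unsigned char *key)
{
    size_t pad = AES_BLOCK - (len % AES_BLOCK);   /* always 1..16 */
    size_t total = len + pad;
    unsigned char buf[AES_BLOCK];

    for (size_t off = 0; off < total; off += AES_BLOCK) {
        for (size_t i = 0; i < AES_BLOCK; i++) {
            size_t pos = off + i;
            buf[i] = (pos < len) ? in[pos] : (unsigned char)pad;
        }
        aes_encrypt_block(buf, out + off, key);
    }
    return total;
}

A real design also needs a proper chaining mode and a record of the original length, but the block/padding arithmetic above is the part that most often breaks students' read/write system calls.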
Project simplification
For students who do not have sufficient background in operating system kernel programming,
we need to customize our projects for them. We
divide the EFS project into three projects:
1. Project 1: Encryption algorithms. This project
gets students familiar with the AES algorithm.
Students need to implement a user-level program to encrypt and decrypt files.
2. Project 2: Kernel modification. The second project asks students to modify the corresponding
system calls, such that some special files are always read and written using encryption. However,
to simplify this project, we ask them to always
use a fixed key for the encryption. The key can
be hard-coded in their programs.
3. Project 3: Key management. This project deals
with the key management issue that is intentionally left out of the previous project. Students now need to find a place to store the key; they need to decide whether
to use the same key for all the files or one
key for each file; they also need to deal with
the authentication issues, etc.

Vulnerability analysis
Vulnerability analysis strengthens the system security by identifying and analyzing security flaws
in computer systems. This project intends to expose students to such a critical approach. We
have two goals in this project: The first goal is to
let students gain first-hand experience with software vulnerabilities, become familiar with a list of common security flaws, and understand how a seemingly-not-so-harmful flaw in a program can become a risk
to a system. The second goal is to give students

opportunities to practice their vulnerability analysis and testing skills. Students can learn a number
of methodologies from class, such as vulnerability
hypothesis, penetration testing methodology,
code inspection techniques, and blackbox and
whitebox testing (Pfleeger et al., 1989). They
need to practice these methodologies in this
project.
To achieve our goals, we modify the Minix
source code and intentionally introduce a set of
vulnerabilities. We call these vulnerabilities the
injected vulnerabilities. The revised Minix system
is then given to students. The students are given
some hints, such as a list of possible vulnerabilities, the possible locations of the vulnerable programs, etc. Their task is to find out and verify
these vulnerabilities.
The injected vulnerabilities cover a wide spectrum of vulnerabilities, such as buffer overflow,
race condition, security holes in the access control
mechanisms, security holes in Set-UID programs,
information leakage, and denial of service. These
vulnerabilities reflect system flaws caused by incorrect design, implementation, and configuration. All these vulnerabilities are collected from
real commercial Unix operating systems, such as
SunOS, HP-Unix and Linux, and are then ported
to Minix. We have ported nine vulnerabilities so
far, with six at the user level and three at the kernel level. We will port other typical vulnerabilities
to Minix in the future.
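To give a flavor of the kind of flaw that gets injected (an illustrative example of our own, not one of the nine ported vulnerabilities), a classic race condition in a Set-UID program looks like this:

/* Illustrative only: a time-of-check-to-time-of-use (TOCTOU) race in a
 * root-owned Set-UID program.  Between access() and open(), the user can
 * replace the checked file with a symbolic link to a protected file
 * (e.g. /etc/passwd), which the program then opens with root privilege. */
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int main(int argc, char *argv[])
{
    if (argc < 2)
        return 1;

    if (access(argv[1], W_OK) == 0) {        /* check: real UID may write */
        int fd = open(argv[1], O_WRONLY);    /* use: opens with root EUID */
        if (fd >= 0) {
            write(fd, "data\n", 5);
            close(fd);
        }
    } else {
        fprintf(stderr, "permission denied\n");
    }
    return 0;
}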
Students in this project need to accomplish the
following tasks:
 Identify vulnerabilities. This is a warm-up
practice to help students become familiar with the environment in which the vulnerabilities live.
 Exploit vulnerabilities. This is a challenging
and interesting part of the project in which students write attack programs aiming at these
vulnerabilities. Demonstration is needed to
show what unauthorized privilege can be
obtained.
 Fix vulnerabilities. Students need to design
solutions to eliminate or remedy the identified
vulnerabilities.

Experiences and lessons


We did a teaching experiment in the 2002 spring
semester when we taught the graduate-level
computer security course at Syracuse University.
At that time, we asked students to add certain
specific security mechanisms to Minix. We gave students only one project for the whole semester


because modifying an OS seems to be a daunting


job for most of the students. The students liked
the project very much and were highly motivated.
At the end of the semester, the students provided
a number of useful suggestions. For example,
many students noted, "most of our time was spent on figuring out how such an operating system works; if somebody or some documentation could explain that to us, we could have done four or five different projects of this type instead of doing one during the whole semester." This observation shaped the goal of our design: we want students to implement a project within 2–4 weeks using our
proposed instructional environment.
When we taught the course again in Spring 2003,
we provided students with sufficient information
on how Minix works, and we added a lecture to introduce Minix. As a result, students had become
familiar with Minix within the first 3 weeks, and
were ready for the projects we had designed for
them. The same degree of familiarity took students half of a semester previously due to the
lack of information.
In our first experiment in 2002, the requirements of each project were not tailored to a scope
appropriate for 2–3 weeks. During the last 3 years' experiments, we simplified those requirements. In the 2004 semester, we successfully assigned four projects in one semester, including the Set-UID project, the capability project, the access control project, and the comprehensive encrypted file system project. However, we were still unable to assign the vulnerability project due to the lack of time. We will further improve our strategy in the coming Spring 2005 semester.
During the last 3 years, we have also learned the
following lessons:

• Preparation: From our experience, the preparation project is crucial to the success of the subsequent assignments. Some students who overlooked this assignment found themselves in trouble later. In fact, when we used the proposed approach for the first time, we did not give students this assignment because we thought it was not necessary. As a result, students later spent a great deal of time figuring out how to accomplish the tasks in this assignment. Most of the students told us that they spent 80% of their time getting familiar with the system. Once they knew how Minix works, they could finish the required tasks in a short time. Therefore, when we used the approach again, we used several lectures to cover the necessary materials and asked the TA to devote a significant amount of time to helping the students finish this assignment. The preparation part is extremely important: if students fail this part, they will spend far more time on the subsequent projects. This was very clear when we compared the performance of the students in our 2003 course with that of the students in 2002. We plan to integrate the materials related to Minix into the lecture, so that students are better prepared.
 Background knowledge: We also realized that
some students in the class are not familiar
with the Unix environment because they have
been using Windows most of the time. This
brings some challenges because these students
do not know how to set up the PATH environment variable, how to search for a file, etc.
We plan to develop materials to help students
get over this obstacle.
 Cheating: Cheating did occur, especially on the
final encrypted file system project. We now
have a list of questions that we will ask during
students' demonstrations. They not only help us evaluate students' projects, but have also been quite effective so far in identifying cheating. Examples of such questions include "where do you save keys and why?", "can your implementation work on large files, and how did you handle that?", etc. Students who simply copy others' implementations will most likely be unable to answer these questions.

Conclusion and future work


We have described a laboratory design for our
graduate-level computer security course. Our approach is inspired by the successful practice of using instructional systems in operating system and networking courses. In our approach, we use the Minix instructional operating system as the basis of our laboratory; in design-oriented laboratory projects, students add a specific security mechanism to the system; in analysis-oriented laboratory projects, students identify,
exploit, and fix vulnerabilities in Minix. Because
of the desirable properties of Minix, our laboratory
projects can be finished within a reasonable
amount of time and in a general computing environment without using superuser privileges. We have
designed a series of laboratory projects based on
Minix, and have experimented with our approach
for the last 3 years. The experience obtained is
encouraging, and students in our class have shown
great interest in the course and the projects.
We will continue experimenting and perfecting
our approach. More importantly, we will work
on making this laboratory approach easy for others to adopt. This requires us to provide detailed documentation, instructions,
and a pool of different projects covering a wide
range of security concepts.

References
Ashton P. SMX – the Solaris port of Minix; 1996.
Bishop M. Computer security: art and science. Addison-Wesley; 2002.
Bochs, <http://bochs.sourceforge.net>; 2002.
Christopher WA, Procter SJ, Anderson TE. The Nachos instructional operating system. In: Proceedings of the Winter 1993 USENIX conference, San Diego, CA, USA, January 25–29, 1993. p. 481–9. Available from: http://http.cs.berkeley.edu/~tea/nachos.
Comer D. Operating system design: the XINU approach. Prentice Hall; 1984.
Hill JMD, Carver CA Jr, Humphries JW, Pooch UW. Using an isolated network laboratory to teach advanced networks and security. In: Proceedings of the 32nd SIGCSE technical symposium on computer science education, Charlotte, NC, USA, February 2001. p. 36–40.
Landwehr CE, Bull AR, McDermott JP, Choi WS. A taxonomy of computer program security flaws. ACM Computing Surveys September 1994;26(3):211–54.
Mayo J, Kearns P. A secure unrestricted advanced systems laboratory. In: Proceedings of the 30th SIGCSE technical symposium on computer science education, New Orleans, USA, March 24–28, 1999. p. 165–9.
Meyers C, Jones TB. Promoting active learning: strategies for the college classroom. San Francisco, CA: Jossey-Bass; 1993.
Mitchener WG, Vahdat A. A chat room assignment for teaching network security. In: Proceedings of the 32nd SIGCSE technical symposium on computer science education, Charlotte, NC, USA, February 2001. p. 31–5.
Pfleeger C, Pfleeger S, Theofanos M. A methodology for penetration testing. Computers and Security 1989;8(7):613–20.
Spafford EH. February 1997 testimony before the United States House of Representatives subcommittee on technology, computer and network security; 2000. Available from: http://www.house.gov/science/hearing.htm.
Tanenbaum A. Operating systems: design and implementation. 2nd ed. Prentice Hall; 1996.
Tanenbaum A, <http://www.cs.vu.nl/~ast/minix.html>; 1996.
VMWare, <http://www.vmware.com>; 1996.
Wenliang Du received the B.S. degree in Computer Science
from the University of Science and Technology of China, Hefei,
China, in 1993, the M.S. degree and the Ph.D. degree from the
Computer Science Department at Purdue University, West Lafayette, Indiana, USA, in 1999 and 2001, respectively. During
his studies in Purdue, he did research in the Center for Education and Research in Information Assurance and Security
(CERIAS). Dr. Du is currently an assistant professor in the
Department of Electrical Engineering and Computer Science at
Syracuse University, Syracuse, New York, USA. His research
background is in computer and network security. In particular,
he is interested in wireless sensor network security and privacy-preserving data mining. He is also interested in developing
instructional laboratories for security education using instructional operating systems. His research has been supported by
the National Science Foundation and the Army Research Office.
Mingdong Shang received his B.S. Degree in Electrical and
Mechanical Engineering from Beijing University of Aeronautics
and Astronautics in 1998. He is currently a Ph.D. student in
the Department of Electrical Engineering and Computer Science
at Syracuse University. His research interests include computer
security and network security, and he has been focusing on
developing Minix-based instructional laboratory environment
and lab exercises for computer and network security courses.
Haizhi Xu received his B.S. and M.S. degrees both in computer
engineering from Harbin Institute of Technology, Harbin, China,
in 1995 and 1997 respectively. He is a Ph.D. Candidate at Syracuse University, Syracuse, NY, USA, majoring in computer engineering. His current research interests are computer system
security, intrusion detection and mitigation, and operating
systems.

Computers & Security (2006) 25, 201–206

www.elsevier.com/locate/cose

A traceable threshold signature scheme with


multiple signing policies
Jun Shao, Zhenfu Cao*
Department of Computer Science and Engineering, Shanghai Jiao Tong University,
1954 Huashan Road, Shanghai 200030, People's Republic of China
Received 15 November 2005; accepted 15 November 2005

* Corresponding author. Tel.: +86 21 62835602; fax: +86 21 62933504. E-mail address: cao-zf@cs.sjtu.edu.cn (Z. Cao).

KEYWORDS: Threshold cryptography; Signature schemes; Multi-secret; Traceability; Multiple signing policies

Abstract In recent years, a great deal of work has been done on threshold signature schemes and many excellent schemes have been proposed. In Eurocrypt '94, Li et al. [Threshold-multisignature schemes where suspected forgery implies traceability of adversarial shareholders. In: Advances in Cryptology – Proceedings of EUROCRYPT '94; 1994. p. 413–9] proposed a threshold signature scheme with traceability, which allows us to trace back to find the signer without revealing the secret keys. In 2001, Lee [Threshold signature scheme with multiple signing policies. IEE Proc Comput Digit Tech 2001;148(2):95–9] proposed a threshold signature scheme with multiple signing policies, which allows multiple secret keys to be shared among a group of users, with each secret key having its specific threshold value. In this paper, based on these schemes, we present a traceable threshold signature scheme with multiple signing policies, which not only inherits their properties, but also fixes their weaknesses.
2005 Elsevier Ltd. All rights reserved.

Introduction
In order to keep the secret efficiently and safely,
Shamir (1979) and Blakley (1979) presented (l, n)
threshold secret sharing schemes independently
in 1979. In such a scheme, the dealer splits the secret x into shares x_1, ..., x_n among the players, and sends each share to the corresponding player. As

a result, any l or more players can cooperate to


recover the secret x, but any l - 1 or fewer players
cannot get any useful information about the secret
x. A threshold secret sharing scheme has many
practical applications, such as opening a bank
vault, launching a nuclear missile, or authenticating an
electronic funds transfer. On the other hand, there
are several situations in which more than one
secret is to be shared among players. As an example, consider the following situation, described by
Simmons (1991): there is a missile battery and not
all of the missiles have the same launch enable
code. The problem is to devise a scheme that


will allow any one, or any selected subset, of the
launch enable codes to be activated in this
scheme. Till now, many efficient schemes for sharing more than one secret have been proposed
(Blundo et al., 1994; Lee, 2001).
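To make the basic (l, n) threshold idea concrete, here is a small worked example of our own (it does not appear in the cited papers). Take l = 2, n = 3, q = 11, secret x = 7, and the random degree-1 polynomial f(X) = 7 + 3X mod 11. The shares are

f(1) = 10, \quad f(2) = 13 \equiv 2, \quad f(3) = 16 \equiv 5 \pmod{11}.

Any two shares recover the secret by Lagrange interpolation at 0; from the shares (1, 10) and (2, 2),

f(0) = 10 \cdot \frac{0-2}{1-2} + 2 \cdot \frac{0-1}{2-1} = 20 - 2 = 18 \equiv 7 \pmod{11},

while a single share gives no information about x.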
Digital signature is a major research topic in
modern cryptography and computer security. The
signer needs to take full responsibility for their
digital signatures. In 1991, Desmedt and Frankel
(1991) combined digital signatures and threshold
secret sharing schemes to propose the concept of
threshold signature. Like threshold secret sharing
schemes, in a threshold signature scheme, the responsibility for signing a document is shared by
a group of signers from time to time. A threshold
signature scheme is designed to allow that only
when the number of players attains the given
threshold value, the signature can be created.
More precisely, a typical (l, n) threshold signature
scheme follows the three basic properties:
 Any l or more players in the group can cooperate with each other to generate a valid group
signature, while they do not reveal any information about their sub-secret keys and the
secret key.
 Any l  1 or fewer players in the group cannot
create a valid group signature.
 Any verifier can verify the group signature with
only knowing the group public key.
However, Li et al. (1994) have pointed out that
most of the (l, n) threshold digital signature
schemes proposed so far suffer from the so-called
conspiracy attack. That is, any l or more players
can cooperate to impersonate any other set of players to forge the signature. To prevent the attack, they added a random number to the shadow
to form a sub-secret key held by each player. The
additional random number gives the (l, n) threshold
signature scheme the property of traceability,
which means that we can trace adversarial signers
if forgery is suspected. Unfortunately, Michels and
Horster (1996) showed that the signer cannot be
sure who his cosigners are in Li et al.s (1994)
scheme, and this weakness violates the traceability
property.
Corresponding to multi-secret sharing scheme,
there is threshold signature scheme with multiple
group secret keys. In this kind of scheme, different
secret keys can be used to sign documents depending on the significance of the documents. Once the
number of the cooperated users is greater than or
equal to the threshold value of the group secret
key, they can cooperate to sign the document.
In 2001, Lee proposed an efficient threshold

signature scheme with multiple signing policies.
However, in Lee's scheme, there are n group secret keys S_0, S_1, ..., S_{n-1}; if the group secret key S_0 is exposed, then the scheme is broken. Furthermore, the partial signature cannot be verified.
In this paper, based on Li et al.'s scheme and Lee's scheme, we present a traceable threshold
signature scheme with multiple signing policies.
The proposed scheme allows the players to apply
different group secret keys to sign documents, and
only two sub-secret keys need to be kept by each
player. Furthermore, in the proposed scheme, we
can trace back to find the signer without revealing
the secret keys. In addition, the exposure of any one of the group secret keys cannot harm the security of the other unexposed group secret keys.
The rest of this paper is organized as follows. In the next section, we first review Li et al.'s scheme and Lee's scheme. Then we propose our scheme and discuss its security. Finally, conclusions are given.

Review of Li et al.'s scheme and Lee's scheme

In this section, we briefly describe Li et al.'s scheme (1994) and Lee's scheme (2001).

Li et al.'s scheme
Li et al. (1994) proposed two (l, n) threshold signatures with traceable players: the first one needs
a mutually trusted dealer while the second one
does not. In this section we only review their first
scheme, which needs a mutually trusted dealer
(Michels and Horster, 1996).
The dealer picks two large primes p, q with q | (p - 1), a generator g \in GF(p) of order q, and a polynomial f(x) = \sum_{i=0}^{l-1} a_i x^i mod q (a_i \in Z_q^*, i = 0, 1, ..., l - 1). Then the dealer determines x = f(0) = a_0 as the group secret key and computes y = g^x mod p as the public group key. The secret share of each player P_i (1 <= i <= n) with identity ID_i is u_i = b_i + f(ID_i) mod q, using a random value b_i \in Z_q^*, and the public keys are y_i = g^{u_i} mod p and z_i = g^{b_i} mod p.
If a group B with |B| = t of players would like to generate a signature of a message m, then each player P_i (i \in B) picks k_i \in Z_q^* and broadcasts r_i = g^{k_i} mod p. Once all r_i are available, each player and the designated combiner (DC) compute

R = \prod_{i \in B} r_i mod p   and   E = H(m, R) mod q.

Then each player P_i (i \in B) computes

s_i = u_i d_i + k_i E mod q,

where d_i = \prod_{j \in B, j \neq i} (0 - ID_j)/(ID_i - ID_j). Each player P_i (i \in B) sends the values m and s_i to the DC, who can verify s_i by the following equation:

g^{s_i} = y_i^{d_i} r_i^E mod p.

Then the DC computes the group signature by

S = \sum_{i \in B} s_i mod q.

(m, B, R, S) is the group signature of the signers in B to message m. This signature can be checked by computing

T = \prod_{i \in B} z_i^{d_i} mod p,   E = H(m, R) mod q

and checking whether the equation

g^S = y T R^E mod p

holds.
Lee's scheme

Now we review the scheme by Lee (2001). In this scheme, a trusted dealer is assumed too. The dealer picks three numbers N, L and a, where N = pq (p = 2p' + 1 and q = 2q' + 1; p, q, p' and q' are large primes), L is a random number with gcd(L, \phi(N)) = 1 (\phi(N) = 2p'q'), and a is primitive in both GF(p) and GF(q). Also, the dealer picks a random number d (gcd(d, \phi(N)) = 1) and a random polynomial f(x) of degree n - 1 with f(0) = d mod \phi(N). N, L and a are published, while p, q, p', q', \phi(N) and f(x) are kept secret. Then the dealer computes the n group secret keys S_i (i = 0, 1, ..., n - 1) and the corresponding n public group keys Y_i (i = 0, 1, ..., n - 1) as follows:

S_i = a^{d L^i} mod N,   Y_i = a^{d L^{n+i}} mod N.

The threshold value of S_i is set to be n - i. The secret share of each player P_i (i = n, ..., 2n - 1) is K_i (K_i = a^{s_i} mod N, s_i = (f(x_i)/2) / \prod_{j \in D, j \neq i} ((x_i - x_j)/2) mod p'q', x_i is a public odd integer, and f(x_i) is a secret even integer). Also, the dealer publishes an odd integer x_i with an even f(x_i), and a public shadow K_i = a^{L^i s_i} mod N (i = 1, 2, ..., n - 1). Let A (|A| = n) be the set of all players' public values x_i in the group, C (|C| = n - 1) be the set of all public shadows' public values x_i, and D be the union of A and C.
Let B_l (|B_l| = l) be denoted as the set of the l players' x_i, B_a (|B_a| = n - l) be denoted as the set of the public values x_1, ..., x_{n-l}, and B be the union of B_l and B_a. If a group B_l with |B_l| = l of players would like to generate a signature of a message m with the threshold value l, then each player P_i (i \in B_l) picks r_i \in [1, N - 1] and broadcasts u_i = r_i^{L^n} mod N. Once all u_i are available, then each player and the DC compute

U = \prod_{i \in B_l} u_i mod N = R^{L^n} mod N,   e = H(m, U),

where R = \prod_{i \in B_l} r_i mod N. Then each player P_i (i \in B_l) computes

z_i = r_i K_i^{L^{n-l} d_i e} mod N,

where d_i = \prod_{j \in D, j \notin B} (x_i - x_j) \prod_{j \in B, j \neq i} (0 - x_j). Each player P_i (i \in B_l) sends the values m and z_i to the DC, who can compute the group signature by

Z = \prod_{i \in B_l} z_i \prod_{i \in B_a} W_i mod N = R a^{d e L^{n-l}} mod N,

where W_i = K_i^{L^{n-l-i} d_i e} mod N. (m, e, Z) is the group signature of the signers in B_l to message m. This signature can be checked by computing

U = Z^{L^n} Y_{n-l}^{-e} mod N

and checking whether the equation e = H(m, U) holds.

The proposed scheme


In this section, we present our scheme that is
based on the schemes of Li et al. and Lee. In the
proposed scheme, a trusted dealer is also assumed. Let us divide the proposed scheme into
three phases: the initialization phase, the partial
signature generation/verification phase, and the
group signature generation/verification phase. The
proposed scheme can thus be stated as follows.

Initialization phase

Firstly, the dealer selects the following parameters:

(1) two large primes c, c', with c' | (c - 1);
(2) a generator g of GF(c) of order c';
(3) a number N = pq (p = 2p' + 1 and q = 2q' + 1), where p, q, p' and q' are large primes, and defines \phi(N) = 2p'q';
(4) three random numbers L, L' and d, where gcd(L, \phi(N)) = 1, gcd(L', \phi(N)) = 1 and gcd(d, \phi(N)) = 1;
(5) two numbers a and b, where a and b are both primitive in both GF(p) and GF(q);
(6) two collision-free hash functions H_1 and H_2;
(7) a polynomial f(x) of degree n - 1, where f(0) = d mod \phi(N).

Thus, the dealer publishes (c, c', g, N, L, L', a, b, H_1, H_2) as the group public parameters and keeps {p, q, p', q', \phi(N), d, f(x)} from being revealed. Let A (|A| = n) be the set of all players' public values x_i in the group, C (|C| = n - 1) be the set of all public shadows' public values x_i, and D be the union of A and C.
Then the dealer computes the n group secret keys S_i (i = 0, ..., n - 1) and the corresponding n public group keys Y_i (i = 0, ..., n - 1) as follows:

S_i = a^{d L^i L'^{n-i}} mod N    (1)
Y_i = a^{d L^{n+i} L'^{n-i}} mod N    (2)

The threshold value of S_i is set to be n - i, e.g. the threshold value of S_0 is n and the threshold value of S_{n-1} is 1. The dealer sends (s_i, b_i) to player P_i, and publishes y_i = b^{s_i} mod N and v_i = g^{b_i} mod c (i = n, ..., 2n - 1), where s_i = (f(x_i)/2) / \prod_{j \in D, j \neq i} ((x_i - x_j)/2) mod p'q', x_i is a public odd integer, f(x_i) is a secret even integer, and b_i is a random integer. Also, the dealer publishes an odd integer x_i with an even f(x_i), and a public shadow K_i = a^{L^i s_i} mod N (i = 1, ..., n - 1).

Partial signature generation/verification phase

Assume that a message m is required to be signed by the cooperation of l players (l could be any integer from 1 to n). Let B_l (|B_l| = l) be denoted as the set of the l players' x_i, B_a (|B_a| = n - l) be denoted as the set of the public values x_1, ..., x_{n-l}, and B be the union of B_l and B_a.
To generate the signature for message m, each player P_i (i \in B_l) picks r_i \in [1, N - 1] and broadcasts u_i = g^{r_i} mod c. Once all u_i are available, then each player computes

U = \prod_{i \in B_l} u_i mod c    (3)
e = H_1(m, U, B_l)    (4)
g_i = a^{r_i} mod N    (5)
g'_i = b^{r_i} mod N    (6)
z_i = a^{s_i e} mod N    (7)
\lambda_i = H_2(a, b, z_i, y_i^e, g_i, g'_i)    (8)
w_i = r_i + \lambda_i s_i e    (9)
t_i = b_i + e r_i mod c'    (10)

Also, the DC computes U and e as in Eqs. (3) and (4), respectively. Then each player P_i sends his partial signature (m, \lambda_i, w_i, z_i, t_i) to the DC, who can check the validity of the partial signature by

\lambda_i = H_2(a, b, z_i, y_i^e, a^{w_i} / z_i^{\lambda_i}, b^{w_i} / y_i^{\lambda_i e})    (11)
g^{t_i} = v_i u_i^e mod c    (12)

Group signature generation/verification phase

Once these l partial signatures are verified, the DC can compute the group signature by

Z = \prod_{i \in B_l} z_i^{d_i L^{n-l} L'^{l}} \prod_{i \in B_a} W_i^{L'^{l}} mod N = a^{L^{n-l} L'^{l} d e} mod N    (13)
T = \sum_{i \in B_l} t_i mod c'    (14)

where W_i = K_i^{L^{n-l-i} d_i e} mod N, and d_i = \prod_{j \in D, j \notin B} (x_i - x_j) \prod_{j \in B, j \neq i} (0 - x_j). (m, Z, T, U, B_l) is the group signature of the signers in B_l to message m. This signature can be checked by computing

e = H_1(m, U, B_l)

and checking whether the equations

1 = Z^{L^n} Y_{n-l}^{-e} mod N    (15)
g^T = \prod_{i \in B_l} v_i \cdot U^e mod c    (16)

hold.

Theorem 1. If 1 = Z^{L^n} Y_{n-l}^{-e} mod N and g^T = \prod_{i \in B_l} v_i \cdot U^e mod c hold, then (m, Z, T, U, B_l) is the valid group signature of m with threshold value l.
Proof 1. Since

z_i = a^{s_i e} mod N,

where i \in B_l, and

W_i = K_i^{L^{n-l-i} d_i e} mod N = (a^{L^i s_i})^{L^{n-l-i} d_i e} mod N = a^{L^{n-l} s_i d_i e} mod N,

where i \in B_a, we have

Z = \prod_{i \in B_l} z_i^{d_i L^{n-l} L'^{l}} \prod_{i \in B_a} W_i^{L'^{l}} mod N = \prod_{i \in B_l} a^{s_i e d_i L^{n-l} L'^{l}} \prod_{i \in B_a} a^{s_i e d_i L^{n-l} L'^{l}} mod N = \prod_{i \in B} a^{s_i e d_i L^{n-l} L'^{l}} mod N.

Also, we have

s_i = (f(x_i)/2) / \prod_{j \in D, j \neq i} ((x_i - x_j)/2) mod p'q',
d_i = \prod_{j \in D, j \notin B} (x_i - x_j) \prod_{j \in B, j \neq i} (0 - x_j).

By the threshold secret sharing scheme (Shamir, 1979), the unique (n - 1)th degree polynomial f(x) can be determined with knowledge of n pairs (x_i, f(x_i)); thus

Z = \prod_{i \in B} a^{s_i e d_i L^{n-l} L'^{l}} mod N = a^{d e L^{n-l} L'^{l}} mod N.

Consequently,

Z^{L^n} Y_{n-l}^{-e} = (a^{d e L^{n-l} L'^{l}})^{L^n} (a^{d L^{2n-l} L'^{l}})^{-e} mod N = a^{d e L^{2n-l} L'^{l}} a^{-d e L^{2n-l} L'^{l}} mod N = 1 mod N.

On the other hand, since

g^{t_i} = g^{b_i + e r_i} mod c = g^{b_i} (g^{r_i})^e mod c = v_i u_i^e mod c,

by multiplying g^{t_i} for all i \in B_l, we have

\prod_{i \in B_l} g^{t_i} = \prod_{i \in B_l} v_i u_i^e mod c = (\prod_{i \in B_l} v_i) (\prod_{i \in B_l} u_i)^e mod c = (\prod_{i \in B_l} v_i) U^e mod c.

Since T can be expressed as

T = \sum_{i \in B_l} t_i mod c',

we have

g^T = (\prod_{i \in B_l} v_i) U^e mod c.

Security discussions

According to Theorem 1, any subset Bl of l players


can generate a valid group signature with threshold value l. The group signature can also be verified easily by any verifier.
Although any player can associate the n - 1 public shadows to retrieve the group secret key a^{d L^{n-1} L'}, this group secret key can only be used for generating group signatures with threshold value 1.
Moreover, the exposure of any group secret key a^{d L^i L'^{n-i}} cannot harm the security of the other unexposed group secret keys, unless the adversary can find L^{-1} mod \phi(N) or L'^{-1} mod \phi(N). However, that is as difficult as factoring N.
The adversary cannot get any useful information about s_i from Eqs. (5)–(9), because it is a well-known
non-interactive protocol, due to Shoup (2000). Moreover, we directly adopt the method proposed in
Michels and Horster (1996), thus, our scheme can
withstand the attack presented by them.
Furthermore, we can check the security of our
scheme by replying to the questions given by
Li et al. (1994) and Lee (2001). We omit the detail
analysis here because it is very similar to that presented earlier (Li et al., 1994; Lee, 2001). The
reader may refer to the above-mentioned works
for more detailed information.

Conclusions
Based on the schemes of Li et al. and Lee we have
devised a traceable threshold signature scheme with
multiple signing policies. In the proposed scheme, the exposure of any group secret key cannot harm the security of the other unexposed group secret keys.
Moreover, our scheme has the traceability property.

Acknowledgements
This research is supported by the National Natural
Science Foundation of China for Distinguished
Young Scholars under Grant No. 60225007, the
National Research Fund for the Doctoral Program
of Higher Education of China under Grant
No.20020248024, and the Science and Technology
Research Project of Shanghai under Grant Nos.
04JC14055 and 04DZ07067.

References
Blakley GR. Safeguarding cryptographic keys. In: Proceedings of AFIPS National Computer Conference, vol. 48, Arlington, VA, June 1979. p. 313–7.
Blundo C, Santis AD, Crescenzo GD, Gaggia AG, Vaccaro U. Multi-secret sharing schemes. In: Desmedt YG, editor. Advances in cryptology – Crypto '94 proceedings. LNCS 839. Berlin: Springer-Verlag; 1994. p. 150–63.
Desmedt Y, Frankel Y. Shared generation of authenticators and signatures. In: Advances in cryptology – Crypto '91 proceedings; 1991. p. 457–69.
Lee NY. Threshold signature scheme with multiple signing policies. IEE Proc Comput Digit Tech March 2001;148(2):95–9.
Li C, Hwang T, Lee N. Threshold-multisignature schemes where suspected forgery implies traceability of adversarial shareholders. In: Advances in cryptology – Proceedings of EUROCRYPT '94; 1994. p. 413–9.
Michels M, Horster P. On the risk of disruption in several multiparty signature schemes. In: Advances in cryptology – Proceedings of Asiacrypt '96; 1996. p. 334–45.
Shamir A. How to share a secret. Commun ACM 1979;22(11):612–3.
Shoup V. Practical threshold signatures. In: Preneel B, editor. EUROCRYPT 2000. LNCS 1807; 2000. p. 207–20.
Simmons GJ. An introduction to shared secret and/or shared control schemes and their application. In: Contemporary cryptology. IEEE Press; 1991. p. 441–97.
Jun Shao received his B.S. degree in Computer Science from
Northwestern Polytechnical University in 2003. Currently,
he is a doctoral candidate in the Department of Computer
Science and Engineering, Shanghai Jiao Tong University.
His research interests lie in cryptography and network
security.
Zhenfu Cao is the professor and the doctoral supervisor of
the Department of Computer Science and Engineering,
Shanghai Jiao Tong University. His main research areas
are number theory, modern cryptography, and information
security. He is the recipient of the Youth Award and Research
Fund of Chinese Science Academy (1986), the first prize Award
for Science and Technology in Chinese University (2001), and
the National Outstanding Youth Fund of China (2002), etc.

Computers & Security (2006) 25, 207–212

www.elsevier.com/locate/cose

Security implications in RFID and


authentication processing framework
John Ayoade*
Security Advancement Group, National Institute of Information and
Communications Technology, Japan
Received 17 November 2004; accepted 15 November 2005

* 101 Domiru-Tsuda, 3-25-41 Tsudamachi, Kodaira-shi, Tokyo, Japan. Tel./fax: +81 423 43 4403. E-mail addresses: ayoadejohn@yahoo.com, ayoade@nict.go.jp

KEYWORDS: RFID; Access control; Authentication; Security; APF

Abstract The objective of this paper is to propose an idea called APF (Authentication Processing Framework) as one way to address the growing concern that unauthorized readers may access the tag (transponder), which could result in violations of the information stored in the tag. On one hand, we discuss the importance of RFID systems; on the other hand, we discuss the security implications that RFID systems have for consumers' privacy and security. In this paper, we try to weigh these two issues, the importance of RFID systems and their security implications. Having done that, we recommend our idea called APF (Authentication Processing Framework) as a good method to overcome the above-mentioned problem.
2005 Elsevier Ltd. All rights reserved.

Introduction
A typical RFID system will consist of a tag, a reader,
an antenna and a host system. Most RFID tags are
passive, which means that they are battery-less and obtain the power to operate from the reader, while some are battery-powered (active) tags that do not need power from the reader to function. RFID tags are
tiny computer chips connected to miniature antennae that can be affixed to physical objects

(Berthon, 2000). In the most commonly touted applications of RFID, the microchip contains an Electronic Product Code (EPC) with sufficient capacity
to provide unique identifiers for all items produced
worldwide. When an RFID reader emits a radio signal, tags in the vicinity respond by transmitting
their stored data to the reader.
With passive (battery-less) RFID tags, read-range can vary from less than an inch to 20–30
feet, while active (self-powered) tags can have
a much longer read-range.
Typically, the data are sent to a distributed
computing system involved in, perhaps, supply
chain management or inventory control (Spychips,
2003).


The RFID system has many beneficial uses, as it can be applied to many areas of our day-to-day activities. It supports many versatile applications, including entrance gate control at transport facilities, custody control and so on. However, the major barrier that the RFID system presently faces is the possibility of privacy violations as a result of illegal access.
RFID tags respond automatically to any reader (that is, they transmit without the knowledge of the bearer), and this property can be used to track a specific user or object over wide areas.
While expectations are growing for the use of RFID
systems in various fields, opposition to their use
without the knowledge of the user is increasing
(CASPIAN).
Furthermore, if personal identity were linked
with unique RFID tag numbers, individuals could be
profiled and tracked without their knowledge or
consent. For example, a tag embedded in a shoe
could serve as a de facto identifier for the person
wearing it. Even if item-level information remains
generic, identifying items people wear or carry
could associate them with, for example, particular
events like political rallies (Spychips, 2003).
Our main goal is to find a solution to the privacy
problem of illegal access by readers to the tags in the RFID system.
Moreover, RFID has been around for many years now. The first notable application was in identifying aircraft as friend or foe. Since then RFID has been deployed in a number of applications
such as identifying and tracking animals from
implanted tags; tracking transport containers;
access control systems; keyless entry systems for
vehicles; and automatic collection of road tolls
(Allan, 2003).
Many other RFID applications may emerge.
Consider an airport setting. Both boarding passes
and luggage labels could be tagged with RFID
devices. Before take-off, an RFID enabled airplane
could verify that all boarding passes issued were
on the plane and that all luggage associated with
those was in the hold. Within an airport, tracking
passengers by their boarding passes could improve
both security and customer service. Of course, in
other environments this would be an undesirable
violation of privacy (Weis, 2003).
Regarding consumers' privacy violation, we can refer to the above example. Since many airlines operate in the airport with different workers, there could be malicious workers working for different airlines with ulterior motives to violate consumers' privacy. There is a possibility that such malicious workers would access and monitor the private information of consumers.

Therefore, there should be a preventive method
that should be put in place to deter the violation of
privacy of consumers.

Importance and implications of RFID


systems
The problem we are dealing with in this paper is
the privacy problem in RFID systems: in an RFID system, any reader can read and write to any tag within its vicinity, so it is obvious that any item a tag is attached to is susceptible to tracking or monitoring.
This is explained in Ohkubo et al. (2004) as
a leakage of information regarding use of belongings, for example, money and expensive products,
medicine (which may indicate a particular disease), and books (which mirror personal consciousness or avocation). It means if such items are
tagged, various types of personal information can
be acquired without the knowledge of the user
(Ohkubo et al., 2004).
In this paper, we propose the APF, which stands for Authentication Processing Framework. We discuss later in the paper how the APF is able to circumvent the problem described above.

Importance of RFID systems


RFID systems are now used for a variety of industrial and consumer applications, including
access control, asset management, and warehouse
automation (UBICOM).
Electronic toll collection and road pricing are
a typical use of active and semi-active tags.
Automobiles are equipped with an active tag
that can be read as the car moves through
a toll booth or drives along the road. Each tag
has a unique serial number; a database correlates
the serial number with an account number that is
automatically debited each time the tag is read
(E-ZPASS).

Security implications in RFID


Personal data protection
While recognizing the benefits of RFID, business
also has to consider fully the implications for
personal data protection and security. Citizens
have already voiced concerns about the ability of
RFID to track them personally, to gather information about their purchasing habits, and to compromise their personal security.



System reliability
System reliability has been identified as key
element for the future deployment of RFID. The
most pressing concern is the possibility that data
from tags could be compromised or altered by an
unauthorized source.
Consumer education
A core part of any future dialogue on RFID will be
consumer education. Consumers should be provided with accurate information to enable them to
participate fully in discussions regarding RFID
technology, usage and management and to understand any possible benefits from the use of RFID
(Nakamura).

Pros and cons of previous work


Several researchers have worked on this privacy problem in RFID systems. Below we discuss some of the ideas and approaches they used.

a. Kill command idea – The standard mode of operation proposed by the Auto-ID Center is indeed for tags to be killed upon purchase of the tagged product. With their proposed tag design, a tag can be killed by sending it a special kill command. However, there are many environments in which simple measures like the kill command are unworkable or undesirable for privacy enforcement. For example, consumers may wish RFID tags to remain operative while in their possession.
b. Faraday cage approach – An RFID tag may be shielded from scrutiny using what is known as a Faraday cage: a container made of metal mesh or foil which is impenetrable by radio signals (of certain frequencies). There have been reports that some thieves have been using foil-lined bags in retail shops to defeat shoplifting-detection mechanisms (Liu et al., 2004).
c. The active jamming approach – An active jamming approach is a physical means of shielding tags from view. In this approach, the user could use a radio frequency device which actively sends radio signals so as to block the operation of any nearby RFID readers. However, this approach could be illegal; for example, if the broadcast power is too high it could disrupt all nearby RFID systems, and beyond that it could be dangerous and cause problems in restricted areas such as hospitals and trains.
d. The blocker tag approach – The blocker tag is a tag that replies with simulated signals when queried by a reader, so that the reader cannot trust the received signals. Like active jamming, it may affect other legitimate tags (Ari et al.).

Importance of a proper security and


access control in RFID systems
In this paper, our objective is to find a solution to
the pressing concern of data from tags being
compromised or altered by an unauthorized source.
We propose an authentication framework called APF (Authentication Processing Framework). This is a framework that makes it compulsory for readers to authenticate themselves with the APF database before they can access registered tags.
In order to prevent illegal access to the memory segment of the tag, there should be procedural access control to the memory segment of the tag.
As shown in Fig. 1, each tag will register its unique ID and the access key to its memory with the APF. This means both the unique access key and the data in the tag will be encrypted, and the access key will be registered with the APF. This is necessary to protect the tag from unscrupulous readers with ulterior intentions. Once a tag registers its unique identity and access key with the APF, it will be difficult for any reader to access the memory segment of the tag without possessing the access key to the tag. We discuss how an authenticated reader gains access to the memory segment of the tag in the next paragraph.
Furthermore, every reader will register itself with the APF in order to be authenticated before it requests the key to access the data in the tag.
In a nutshell, every reader will register its
unique identification number with the APF and
this will be confirmed by the APF before releasing





the encrypted key to the reader in order to read the encrypted data in the specific tag.

Figure 1. The registration of tags with the APF.
Figure 2. The registration of readers with the APF.

From Fig. 2, every reader registers its unique identification number with the APF. Since both readers
and tags register their identification numbers with
the APF, this will serve as a mutual authentication
and will protect tags from malicious readers, which is one of the concerns users have about the full realization of RFID systems. This means that unauthorized access to the tag will be eliminated if the APF framework is implemented and used. In the next paragraph we discuss the registration and access control of readers to the APF.
In the previous paragraphs we discussed the registration of the tag memory segment's unique identity and access key with the APF. We also discussed the registration of readers with the APF prior to accessing the tags. When the reader sends a read command to the tag, the tag replies with its identification number and the encrypted data; this means that only a reader registered with the APF will be able to get the decryption key to access the encrypted data. Once

the key is received the data in the tag will be


readable (Fig. 3).
In this framework, there are two important
processes, the first one is that, mutual authentication will be carried out by the APF because it
authenticates the reader and the tag.
Secondly, the privacy concern will be guaranteed because the data stored in the tag are
protected from malicious reader. Since, the information the reader got from the tag is encrypted
and it can only be read after the decryption key to
access the information is received from the APF.

The flowchart of the APF framework


The flowchart of the APF framework is given in
Fig. 4.

The pseudo code of the APF framework


1. Tags register decryption key with the APF.
2. Readers register their unique identification
numbers with the APF.
3. Readers issue command to access the tag.
4. Response from the tag to release the encrypted
data.
5. Readers request for the decryption key.
6. Readers can decrypt the encrypted data.
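As a rough illustration of step 5 (the key-release decision), the core check the APF database would have to perform can be sketched as follows. This is our own sketch with assumed names and structures, since the paper does not define an API for the APF:

/* Illustrative sketch of the APF key-release check (step 5 above).
 * All structure and function names are assumptions made for illustration. */
#include <string.h>

struct tag_record    { char tag_id[16];    char enc_key[32]; };
struct reader_record { char reader_id[16]; };

struct apf_db {
    struct tag_record    tags[100];    int n_tags;
    struct reader_record readers[100]; int n_readers;
};

/* Release the decryption key for `tag_id` only if `reader_id` was
 * registered beforehand; otherwise the request is refused. */
const char *apf_request_key(const struct apf_db *db,
                            const char *reader_id, const char *tag_id)
{
    int authenticated = 0;
    for (int i = 0; i < db->n_readers; i++)
        if (strcmp(db->readers[i].reader_id, reader_id) == 0)
            authenticated = 1;
    if (!authenticated)
        return NULL;                       /* access denied */
    for (int i = 0; i < db->n_tags; i++)
        if (strcmp(db->tags[i].tag_id, tag_id) == 0)
            return db->tags[i].enc_key;    /* key released to reader */
    return NULL;                           /* unknown tag */
}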

Merits and demerits of the APF


The APF provides assurance to the RFID users that
the information stored in the tag is secured in the
sense that only a reader authenticated by the APF can have access to the tag. The reason for this is
that the information received by the reader from
the tag is encrypted and this information can only
be decrypted by getting the decryption key from

Figure 3. The registration/access control of readers to the APF/tag. O means access granted; X means access denied.

Figure 4. The flowchart of the APF framework.

the APF. Also, a reader that did not register with the APF prior to getting the information from the tag will be denied the decryption key; the reason for this is to prevent malicious readers from accessing tags illegally. The only disadvantage of this framework is that it is designed for read-only RFID systems; it cannot handle a reader writing into the tag. The reason for this is that, after the reader gets the decryption key from the APF, the encrypted information will be decrypted automatically, so there is no need for the reader to go back to the tag for any further process.

Validity and effectiveness of the APF


The validity and effectiveness of the APF is given in Fig. 5.

Figure 5. The validity and effectiveness of the APF.
- Kill command: tags are to be killed upon purchase; unworkable or undesirable in many environments.
- Faraday cage: the tag is shielded from scrutiny; thieves have used foil-lined bags to defeat shoplifting detection.
- Active jamming: a physical means of shielding tags from view; illegal and dangerous, especially in restricted areas such as hospitals and trains.
- Blocker tag: the tag replies with simulated signals when queried by the reader so that the reader cannot trust the received signals; like active jamming, illegal and dangerous.
- APF: readers must be authenticated before they can access the tag, so ill-intentioned readers cannot decrypt the information collected from the tag.

The importance of the APF


i. It prevents malicious readers from reading the
information in the tags.
ii. It permits consumers their wishes for RFID tags
to remain operative while in their possession.
iii. It also helps to authenticate both tags
and readers that is, it deploys mutual
authentication.

Real world application of the APF system


We now consider a particular area in which the APF system could be applied in the real world. Patient confidential/personal information is one area where the APF system could be widely applied. Take, for example, a patient who has an RFID tag attached to his hospital card: his doctor diagnoses his ailment and prescribes some drugs, and all of this information, together with other personal information, is stored in the tag of the patient's card.

Figure 5 The validity and effectiveness of the APF.

In this case, the patient needs some level of confidentiality for the information stored in this tag and wants only his doctor, or a specific doctor, to know about it. However, since the current RFID system allows any reader to access any tag, the patient's private information stored in the tag could be jeopardized. If the APF system is used, the patient's doctor's reader will first be authenticated by the APF, which means that not just any reader can access the patient's tag. Malicious readers will therefore be denied access to the stored information, since in the APF system any reader that did not register with the APF prior to accessing the tag will be denied the key that unlocks the tag's information. Should the patient want to change his hospital and doctor, it is simply a matter of informing the APF of his intention; the new doctor's and hospital's reader must register with the APF before it can access the patient's tag, and the former doctor's and hospital's readers will henceforth be denied access. This example shows the importance and contribution of the APF system and the level of user confidence that it can guarantee in an RFID system.

Conclusion

In conclusion, the information in tags can be protected from being read by unauthorized readers through the authentication procedures described above in the APF system. It is imperative to prevent unauthorized access to the tag in order to protect the privacy and confidentiality of the information stored in it. Moreover, the above framework provides mutual authentication, which makes it a system that can protect the information stored in RFID tags from unauthorized or malicious readers.

Dr. John Ayoade is an expert researcher in the Security Advancement Group of the National Institute of Information and Communications Technology, Tokyo, Japan. He obtained his Ph.D. in Information Systems, under a Japanese government scholarship, at the Graduate School of Information Systems of the University of Electro-Communications, Tokyo, Japan.
Dr. Ayoade's research focuses on information and communications security and privacy. He has wide experience of university teaching, involving lectures and practical work in the principles and practice of telecommunications and network policies, coupled with sound theoretical and practical knowledge of computer science. He has presented and published papers in many conferences and journals.
Dr. Ayoade is happily married to his loving and caring wife Oluwatomi and they are blessed with a daughter and a son, Opeyemi and Ayodeji, respectively.

Computers & Security (2006) 25, 213–220

www.elsevier.com/locate/cose

Change trend of averaged Hurst parameter of traffic under DDOS flood attacks

Ming Li*

School of Information Science and Technology, East China Normal University, No. 3663, Zhongshan Bei Road, Shanghai 200026, PR China
Received 22 November 2004; revised 15 November 2005; accepted 15 November 2005

KEYWORDS
Hurst parameter;
Traffic;
Time series;
Distributed denial-of-service flood attacks;
Anomaly detection

Abstract Distributed denial-of-service (DDOS) flood attacks remain great threats to the Internet even though various approaches and systems have been proposed. Because the pattern of arrival traffic under DDOS flood attacks differs significantly from the pattern of normal traffic (i.e., attack-free traffic) at the protected site, anomaly detection plays a role in the detection of DDOS flood attacks. Hence, quantitatively studying the statistics of traffic under DDOS flood attacks (abnormal traffic for short) is essential to anomaly detection of DDOS flood attacks.
References giving qualitative descriptions of abnormal traffic are quite rich, but quantitative descriptions of its statistics are seldom seen. Though statistics of normal traffic are abundant, where the Hurst parameter H of traffic plays a key role, how H of traffic varies under DDOS flood attacks is rarely reported. As a supplement to our early work, this paper shows that the averaged H of abnormal traffic usually tends to be significantly smaller than that of normal traffic at the protected site. This abnormality of abnormal traffic is demonstrated with test data provided by MIT Lincoln Laboratory and explained from the viewpoint of Fourier analysis.
© 2005 Elsevier Ltd. All rights reserved.

Introduction

The Internet is the infrastructure that supports computer communications. It has become the electricity of modern society because its use is so pervasive and so many people rely on it heavily. For instance, employees in modern society would rather give up access to their telephone than give up access to their email. Nevertheless, the Internet is subject to electronic attacks (Coulouris et al., 2001), e.g., distributed denial-of-service (DDOS) flood attacks (Sorensen, 2004). The threats of DDOS attacks to individuals are severe. For instance, any denial of service of a bank server implies a loss of money and disgruntled or lost customers.

* Tel.: +86 21 62233389; fax: +86 21 62232517. E-mail addresses: mli@ee.ecnu.edu.cn, ming_lihk@yahoo.com. URL: http://www.ee.ecnu.edu.cn/teachers/mli/js_lm(Eng).htm.

Hence, intrusion detection systems (IDS) and intrusion prevention systems (IPS) are desired (Kemmerer and Vigna, 2002; Householder et al., 2002; Schultz, 2004; Sorensen, 2004; Gong, 2003; Li, 2004; Streilein et al., 2003; Bencsath and Vajda, 2004; Feinstein et al., 2003; Oh and Lee, 2003; Liston, 2004).
There are several categories of denial-of-service (DOS) attacks (Gong, 2003). The CERT Coordination Center (CERT/CC) divides DOS attacks into three categories: (1) flood (i.e., bandwidth) attacks, (2) protocol attacks, and (3) logical attacks. This paper considers flood attacks.
A DDOS flood attack sends attack packets to a site (victim) in a huge volume of traffic, the sources of which are distributed over the world, so as to effectively jam its entrance and block access by legitimate users or significantly degrade its performance. It never tries to break into the victim's system, making security defenses at the protected site irrelevant (DDoS; Dittrich-a; Dittrich-b; Dittrich-c; Dittrich-d; Dietrich et al.; Geng et al., 2002).
Usually, IDSs are classified into two categories.
One is misuse detection and the other anomaly
detection. Solutions given by misuse detection are
primarily based on a library of known signatures to
match against network traffic. Hence, unknown
signatures from new variants of an attack mean
100% miss. Therefore, anomaly detectors play
a role in detection of DDOS flood attacks. As far
as anomaly detection is concerned, quantitatively
characterizing abnormalities of statistics of abnormal traffic is fundamental.
A traffic stream is a packet flow. A packet
consists of a number of fields, such as protocol
type, source IP, destination IP, ports, flag setting
(in the case of TCP or UDP), message type (in the case of ICMP), timestamp, and data length (packet size). Each may serve as a feature of a packet. The
literature discussing traffic features is rich (see
e.g. Li, 2004; Streilein et al., 2003; Bencsath and
Vajda, 2004; Feinstein et al., 2003; Oh and Lee,
2003; Cho and Park, 2003; Cho and Cha, 2004; Lan
et al., 2003; Paxson and Floyd, 1995; Li et al.,
2003; Beran, 1994; Willinger and Paxson, 1998;
Willinger et al., 1995; Csabai, 1994; Tsybakov
and Georganas, 1998; MIT; Garber, 2000; Kim
et al., 2004; Mahajan et al., 2002; Kim et al.,
2004; Bettati et al., 1999). For instance, Mahajan
et al. (2002) consider flow rate, Kim et al. (2004)
use head message, Oh and Lee (2003) alone consider 86 features of traffic (not from a statistics view
though), and so on. To the best of our knowledge,
however, taking into account the Hurst parameter
H in characterizing abnormality of traffic series in
packet size under DDOS flood attacks is rarely seen

except for Li (2004), where the autocorrelation function (ACF) of traffic series in packet size (traffic for short) with long-range dependence (LRD) is taken as its statistical feature. As a supplement to Li (2004), this paper specifically studies how H of traffic varies under DDOS flood attacks. In this regard, the following two questions are fundamental.
(1) Is H of traffic when a site is under DDOS flood attacks (abnormal traffic for short) significantly different from that of normal traffic (i.e., attack-free traffic)?
(2) What is the change trend of H of traffic when a site suffers from DDOS flood attacks?
We give the answers to these questions from the viewpoints of traffic data processing and theoretical inference and analysis.
In the rest of the paper, section Test data sets describes the test data. Section Brief of data traffic briefly reviews data traffic and uses a series of normal traffic from the ACM archive to explain how its H normally varies. The answer to question (1) is given in section Using H to describe abnormality of traffic under DDOS flood attacks. Then, in section Change trend of H of traffic under DDOS flood attacks, we use a pair of series (one normal traffic and the other abnormal) provided by MIT Lincoln Laboratory to demonstrate that the averaged H of abnormal traffic tends to be significantly smaller than that of normal traffic, and briefly discuss this abnormality from the viewpoint of Fourier analysis. The answer to question (2) is given in that section. Section Conclusions concludes the paper.

Test data sets


Three series of test data are utilized in this paper. The first is an attack-free series measured at the Lawrence Berkeley Laboratory from 14:00 to 15:00 on Friday, 29 January 1994. It is named LBL-PKT-4 and has been widely used in research on general (normal) traffic patterns (see e.g. Paxson and Floyd, 1995; Li et al., 2004). We use it to show a case of how H of normal traffic varies. The second is Outside-MIT-week1-1-1999-attack-free (OM-W1-1-1999AF for short) (MIT). It was recorded from 08:00:02, 1 March (Monday) to 06:00:02, 2 March (Tuesday), 1999. The third is Outside-MIT-week2-1-1999-attack-contained (OM-W2-1-1999AC for short) (MIT), which was collected from 08:00:01, 8 March (Monday) to 06:00:49, 9 March (Tuesday), 1999. The two MIT series are used to demonstrate a case of how H of traffic varies under DDOS attacks. Though whether the MIT test data are standardized is worth further discussion, as stated in McHugh (2000), they are valuable and can still serve as test data for research on the abnormality of abnormal traffic, since available traffic data under DDOS flood attacks are rare.

Brief of data traffic


Denote x(t_i) a traffic series, indicating the number of bytes in a packet at time t_i, i = 0, 1, 2, .... From the view of a discrete series, we write x(t_i) as x(i), implying the number of bytes in the ith packet. Let r(k) be the ACF of x(i). Then,

r(k) ~ c·k^(2H−2), for c > 0, H ∈ (0.5, 1),   (1)

where ~ stands for asymptotic equivalence under the limit k → ∞, and H is the Hurst parameter. The ACF in Eq. (1) is non-summable for H ∈ (0.5, 1), implying LRD. Hence, H is a measure of the LRD of traffic.
According to research in traffic engineering, fractional Gaussian noise (FGN) is an approximate model of traffic (Paxson and Floyd, 1995; Li et al., 2003; Beran, 1994; Willinger and Paxson, 1998; Willinger et al., 2002; Li et al., 2004; Paxson, 1997; Li and Chi, 2003; Michiel and Laevens, 1997; Adas, 1997; Leland et al., 1994; Beran et al., 1995; Stallings, 1998; Carmona et al., 1999; Pitts and Schormans, 2000; MaDysan, 2000). The ACF of FGN is given by

R(k; H) = 0.5 σ² [ |k + 1|^(2H) − 2|k|^(2H) + |k − 1|^(2H) ],   (2)

where

σ² = Γ(2 − 2H) cos(πH) / [πH(2H − 1)]

(Mandelbrot, 2001; Muniandy and Lim, 2001).
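As a quick aid to intuition, the normalized form of Eq. (2) (with σ² = 1) can be evaluated numerically. The short sketch below is ours, not the paper's; the function name and the use of NumPy are assumptions made purely for illustration.

```python
import numpy as np

def fgn_acf(k, H):
    """Normalized ACF of fractional Gaussian noise, Eq. (2) with sigma^2 = 1."""
    k = np.abs(np.asarray(k, dtype=float))
    return 0.5 * ((k + 1) ** (2 * H) - 2 * k ** (2 * H) + np.abs(k - 1) ** (2 * H))

lags = np.arange(64)
print(fgn_acf(lags, 0.75)[:5])   # slow, power-law-like decay typical of LRD
```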


By taking FGN as an approximate model of xi,
we P
consider another series given by xiL
i1L1
1=L jiL
xj. According to the analysis in selfsimilar processes (see e.g. Beran, 1994; Mandelbrot,
2001; Beran et al., 1995), one has
 
Var xL zL2H2 Varx;
where Var implies the variance operator. Thus,
traffic has the property of self-similarity measured
by H. Consequently, H characterizes the properties
of both LRD and self-similarity of traffic.
In practice, measured traffic is of finite length. Let x be a series of length P. Divide x into N non-overlapping sections. Each section is divided into M non-overlapping segments, and each segment into K non-overlapping blocks, each block being of length L. Let x_m^(L)(i; n) be the series with aggregation level L in the mth segment of the nth section (m = 0, 1, ..., M − 1; n = 0, 1, ..., N − 1). Let H_m(n) be the H value of x_m^(L)(i; n), and let r(k) be the measured ACF of x_m^(L)(i; n) in the normalized case. Then,

R(k; H_m(n)) = 0.5 [ |k + 1|^(2H_m(n)) − 2|k|^(2H_m(n)) + |k − 1|^(2H_m(n)) ].   (3)

The above expression exhibits the multi-fractal property of traffic, as explained from a mathematical viewpoint (Muniandy and Lim, 2001; Muniandy and Lim, 2000). Let

J(H_m(n)) = Σ_k [ R(k; H_m(n)) − r(k) ]²

be the cost function. Then, one has

H_m(n) = arg min J(H_m(n)).   (4)
Averaging H_m(n) over the index m yields

H(n) = (1/M) Σ_{m=0}^{M−1} H_m(n),   (5)

representing the H estimate of the series in the nth section. In practical terms, a normality assumption for H(n) is quite accurate in most cases for M > 10, regardless of the probability distribution function of H (Bendat and Piersol, 1986). Thus,

H_x = E[H(n)]   (6)

is taken as the mean estimate of H of x, where E is the mean operator. Let s_H be the standard deviation of H(n). Then,

Prob[ z_{1−α/2} < (H(n) − H_x)/s_H ≤ z_{α/2} ] = 1 − α,

where 1 − α is the confidence coefficient. The confidence interval of H(n) with confidence coefficient 1 − α is given by [H_x − s_H·z_{α/2}, H_x + s_H·z_{α/2}].
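A compact sketch of the estimation procedure in Eqs. (3)-(6) might look as follows. It is only an outline under our own assumptions: it uses a grid search over H and a simple biased sample ACF, and omits the block aggregation at level L for brevity; none of these choices are prescribed by the paper.

```python
import numpy as np

def fgn_acf(k, H):
    """Normalized FGN ACF, Eq. (3), used as the model to fit."""
    k = np.abs(np.asarray(k, dtype=float))
    return 0.5 * ((k + 1) ** (2 * H) - 2 * k ** (2 * H) + np.abs(k - 1) ** (2 * H))

def sample_acf(x, max_lag):
    """Biased sample autocorrelation r(k), normalized so that r(0) = 1."""
    x = np.asarray(x, dtype=float) - np.mean(x)
    c0 = np.dot(x, x) / len(x)
    return np.array([np.dot(x[:len(x) - k], x[k:]) / (len(x) * c0)
                     for k in range(max_lag)])

def estimate_H(segment, max_lag=64):
    """H_m(n) = argmin_H J(H), with J(H) = sum_k [R(k; H) - r(k)]^2 (Eq. (4))."""
    r = sample_acf(segment, max_lag)
    k = np.arange(max_lag)
    grid = np.linspace(0.501, 0.999, 499)
    costs = [np.sum((fgn_acf(k, H) - r) ** 2) for H in grid]
    return grid[int(np.argmin(costs))]

def section_H(section, M):
    """Average the per-segment estimates over m, Eq. (5)."""
    segments = np.array_split(np.asarray(section), M)
    return float(np.mean([estimate_H(s) for s in segments]))

# H_x and its 95% confidence interval over the N sections, Eq. (6):
# H_n = np.array([section_H(sec, M=32) for sec in sections])
# H_x, s_H = H_n.mean(), H_n.std(ddof=1)
# ci = (H_x - 1.96 * s_H, H_x + 1.96 * s_H)
```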
The following demonstration exhibits H(n) of the traffic series LBL-PKT-4.

Demonstration 1: The first 1024 points of the series x(i) of LBL-PKT-4 are indicated in Fig. 1(a). Consider the first 524,288 (= P) points of x(i). The partition settings are as follows: L = 32, K = 16, M = 32, N = 32, and J = 2048. Computing H in each section yields H(n) as shown in Fig. 1(b). Its histogram is indicated in Fig. 1(c).

According to Eq. (6), we have H_x = 0.758. The confidence interval with 95% confidence level is [0.750, 0.766]. Hence, we have 95% confidence to say that the H estimate in each section of that series takes H_x = 0.758 as its approximation, with fluctuation not greater than 7.431 × 10⁻³.

Figure 1 Demonstrating statistically invariable H. (a) A real-traffic time series; (b) estimate H(n); (c) histogram of H(n).

Using H to describe abnormality of traffic under DDOS flood attacks

From the previous discussions, we see that H is a parameter that characterizes the properties of both LRD and self-similarity of traffic. On the other hand, the ACF is a statistical feature of a time series, which is used in queuing analysis of network systems (Livny et al., 1993; Li and Hwang, 1993). Hence the following lemma.

Lemma: Let x and y be normal traffic and abnormal traffic, respectively. Let r_xx and r_yy be the ACFs of x and y, respectively. During the transition process of DDOS flood attacking, ‖r_yy − r_xx‖ is noteworthy (Li, 2004).

Proof: A network system is a queuing system. Arrival traffic x of a queuing system has its statistical pattern r_xx (Livny et al., 1993; Li and Hwang, 1993). Suppose the site suffers from DDOS flood attacks, and suppose that ‖r_yy − r_xx‖ is negligible in this case. Then, the site would be overwhelmed in its normal state even if there were no DDOS flood packets. This is an obvious contradiction. □
For each value of H ∈ (0.5, 1), there is exactly one ACF of FGN with LRD, as can be seen from Beran (1994, p. 55). Thus, a consequence of the Lemma is that ‖H_y − H_x‖ is considerable, where H_x and H_y are the average H values of x and y, respectively. Hence, H is a parameter that can be used to describe the abnormality of traffic under DDOS flood attacks. This gives the answer to question (1) in section Introduction.

Change trend of H of traffic under DDOS flood attacks

Demonstrations

This subsection gives two demonstrations of H(n), one for normal traffic and the other for abnormal traffic. The two demonstrations show that the average value of H of abnormal traffic tends to be significantly smaller than that of normal traffic.

Demonstration 2 (attack-free traffic): The first 1024 points of the series x(i) of attack-free traffic OM-W1-1-1999AF are indicated in Fig. 2(a). Its H(n) is plotted in Fig. 2(b) and its histogram in Fig. 2(c). By computation, we obtain

H_x = 0.895,   (7)

its variance 5.693 × 10⁻⁴, and the confidence interval with 95% confidence level [0.865, 0.895].

Figure 2 Demonstrating H(n) of attack-free traffic OM-W1-1-1999AF. (a) Time series of OM-W1-1-1999AF; (b) estimate H(n) of OM-W1-1-1999AF; (c) histogram of H(n) of OM-W1-1-1999AF.

Demonstration 3 (abnormal traffic): The first 1024 points of the series y(i) of attack-contained traffic OM-W2-1-1999AC are indicated in Fig. 3(a). Its H(n) is plotted in Fig. 3(b) and its histogram in Fig. 3(c). By computation, we obtain

H_y = 0.774,   (8)

its variance 6.777 × 10⁻⁴, and the confidence interval with 95% confidence level [0.723, 0.825].

Figure 3 Demonstrating H(n) of abnormal traffic OM-W2-1-1999AC. (a) Time series of OM-W2-1-1999AC; (b) estimate H(n) of OM-W2-1-1999AC; (c) histogram of H(n) of OM-W2-1-1999AC.

Comparing the means of H in the above two demonstrations, we see that

H_y < H_x.   (9)

The above inequality exhibits a case of the change trend of H of traffic under DDOS flood attacks. It actually follows a general rule, as can be seen from the following analysis.


Analysis of change trend of H of traffic under DDOS flood attacks
In the case of multi-fractional FGN, we let H represent the mean estimate of the Hurst parameter, as in Eq. (6), for the sake of simplicity. As

0.5 [ (t + 1)^(2H) − 2t^(2H) + (t − 1)^(2H) ]

is the finite second-order difference of 0.5·t^(2H) (Beran, 1994; Mandelbrot, 2001; Li and Chi, 2003; Caccia et al., 1997), approximating it with the second-order differential of 0.5·t^(2H) yields

0.5 [ (t + 1)^(2H) − 2t^(2H) + (t − 1)^(2H) ] ≈ H(2H − 1) t^(2H−2).   (10)

In the domain of generalized functions (Lighthill, 1958, p. 43), we obtain

F[ |t|^(2H−2) ] = 2 cos( π(2H − 1)/2 ) (2H − 2)! |u|^(−(2H−1)),   (11)

where F is the operator of the Fourier transform. As is known, the frequency bandwidth of x is the width of its power spectrum S(u), which is usually explained in the sense of the maximum effective frequency in engineering (Stalling, 1994). Hence, the following is a consequence of Eq. (11).

Corollary: Let B1 and B2 be the bandwidths of LRD FGN x1 and x2, respectively. Let the mean estimates of H of x1 and x2 be H1 and H2, respectively. Then, H2 < H1 if B2 > B1.

As is known, the data rate of abnormal traffic is usually greater than that of attack-free traffic (Garber, 2000). Hence, the bandwidth of abnormal traffic is wider than that of attack-free traffic, according to the relationship between data rate and bandwidth (Stalling, 1994). Then, according to the Corollary, we see that the average H of abnormal traffic is smaller than that of attack-free traffic, giving the answer to question (2) in section Introduction. Eq. (9) is a case of this rule. As the larger the H, the stronger the LRD as well as the self-similarity (Beran, 1994; Mandelbrot, 2001), we note that the LRD and self-similarity of abnormal traffic become weaker than those of attack-free traffic.

In passing, the Corollary gives the reason why Li (2004) designs the case study by assigning the abnormal traffic's H values smaller than that of the normal traffic.
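A toy numerical check of the Corollary can be made under the assumption of the power-law spectrum implied by Eq. (11) and an arbitrary, fixed observation band. The 90% power cut-off used below as a proxy for "effective bandwidth" is our own illustrative choice, not a definition from the paper.

```python
import numpy as np

def effective_bandwidth(H, power_fraction=0.9):
    """Frequency below which power_fraction of the spectral power lies,
    for S(u) ~ |u|^(-(2H-1)) on a fixed band (illustrative proxy only)."""
    u = np.linspace(1e-3, 1.0, 100_000)
    S = u ** (-(2 * H - 1))
    cum = np.cumsum(S)
    cum /= cum[-1]
    return u[np.searchsorted(cum, power_fraction)]

for H in (0.95, 0.75, 0.55):
    print(H, round(float(effective_bandwidth(H)), 3))
# The smaller H is, the higher the cut-off frequency, i.e. the wider the
# effective bandwidth: consistent with H2 < H1 when B2 > B1.
```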

Conclusions

To reveal how a statistical feature of traffic varies under DDOS flood attacks is crucial to anomaly detection of DDOS flood attacks (Liston, 2004). As the Hurst parameter H (or, equivalently, the autocorrelation function) plays a key role in traffic analysis (Li, 2004; Paxson and Floyd, 1995; Li et al., 2003; Beran, 1994; Willinger and Paxson, 1998; Willinger et al., 1995; Tsybakov and Georganas, 1998; Willinger et al., 2002; Li et al., 2004; Mandelbrot, 2001; Paxson, 1997; Li and Chi, 2003; Michiel and Laevens, 1997; Adas, 1997; Leland et al., 1994; Beran et al., 1995; Stallings, 1998; Carmona et al., 1999; Pitts and Schormans, 2000; MaDysan, 2000; Mandelbrot, 1971), this paper has aimed at revealing how H varies under DDOS flood attacks. We have explained that the average H of abnormal traffic significantly differs from that of normal traffic as a consequence of the Lemma, where H represents the mean estimate in the case of multi-fractional series. We have given a corollary to show that the average H of abnormal traffic is smaller than that of normal traffic. The theoretical results are demonstrated and also validated with the test data provided by MIT Lincoln Laboratory.

Acknowledgement
This work was supported in part by the National
Natural Science Foundation of China under the
project grant number 60573125. MIT Lincoln Laboratory is highly appreciated.

References
Adas A. Traffic models in broadband networks. IEEE Communications Magazine 1997;35(7):82e9.
Bencsath B, Vajda I. Protection against DDoS attacks based on
traffic level measurements. In: International symposium on
collaborative technologies and systems. Waleed W. Smari,
William McQuay; 2004. p. 22e8.
Bendat JS, Piersol AG. Random data: analysis and measurement
procedure. 2nd ed. John Wiley & Sons; 1986.
Beran J, Sherman R, Taqqu MS, Willinger W. Long-range dependence in variable bit-rate video traffic. IEEE Transactions on
Communications FebruaryeApril 1995;43(2e4):1566e79.
Beran J. Statistics for long-memory processes. Chapman & Hall;
1994.
Bettati R, Zhao W, Teodor D. Real-time intrusion detection and
suppression in ATM networks. In: Proceedings of the first
USENIX workshop on intrusion detection and network monitoring; April 1999.
Caccia DC, Percival D, Cannon MJ, Raymond G,
Bassingthwaighte JB. Analyzing exact fractal time series:
evaluating dispersional analysis and rescaled range methods.
Physica A 1997;246(3e4):609e32.

Carmona R, Hwang W-L, Torresani B. Practical time-frequency
analysis: Gabor and wavelet transforms with an implementation in S. Academic Press; 1999. p. 244e7.
Cho S, Cha S. SAD: web session anomaly detection based on
parameter estimation. Computers & Security 2004;23(4):
312e9.
Cho S-B, Park H-J. Efficient anomaly detection by modeling privilege flows using hidden Markov model. Computers & Security
2003;22(1):45e55.
Coulouris G, Dollimore J, Kindberg T. Distributed systems:
concepts and design. 3rd ed. Addison-Wesley; 2001.
Csabai I. 1/f noise in computer network traffic. Journal of Physics A: Mathematical and General 1994;27(12):L417e21.
Data are available from: <http://www.acm.org/sigcomm/ITA/>.
Distributed denial of service (DDoS) attacks/tools, <http://
staff.washington.edu/dittrich/misc/ddos/>.
Dietrich S, Long N, Dittrich D. An analysis of the Shaft distributed denial of service tool, <http://www.adelphi.edu/
wspock/shaft_analysis.txt>.
Dittrich D. The DoS projects Trinoo distributed denial of
service attack tool, <http://staff.washington.edu/dittrich/
misc/trinoo.analysis> (Dittrich-a).
Dittrich D. The Tribe Flood Network distributed denial of
service attack tool, <http://staff.washington.edu/dittrich/
misc/tfn.analysis.txt> (Dittrich-b).
Dittrich D. The Stacheldraht distributed denial of service
attack tool, <http://staff.washington.edu/dittrich/misc/
stacheldraht.analysis.txt> (Dittrich-c).
Dittrich D. The Mstream distributed denial of service attack
tool, <http://staff.washington.edu/dittrich/misc/mstream.
analysis.txt> (Dittrich-d).
Feinstein L, Schnackenberg D, Balupari R, Kindred D. Statistical
approaches to DDoS attack detection and response. In:
DARPA information survivability conference and exposition, vol. I, April 22e24, 2003. Washington, DC; 2003.
p. 303e14.
Garber L. Denial-of-service attacks rip the Internet. Computer
April 2000;33(4):12e7.
Geng X, Huang Y, Whinston AB. Defending wireless infrastructure against the challenge of DDoS attacks. Mobile Networks
and Applications 2002;7:213e23.
Gong F. Deciphering detection techniques: part III denial of
service detection. White Paper. McAfee Network Security
Technologies Group; January 2003.
Householder A, Houle K, Dougherty C. Computer attack trends
challenge Internet security. Supplement to Computer. IEEE
Security & Privacy April 2002;35(4):5e7.
Kemmerer RA, Vigna G. Intrusion detection: a brief history and
overview. Supplement to Computer. IEEE Security & Privacy
April 2002;35(4):27e30.
Kim SS, Reddy ALN, Vannucci M. Detecting traffic anomalies at
the source though aggregate analysis of packet header
data. In: Proceedings of Networking 2004. LNCS, vol. 3042,
Athens, Greece; May 2004. p. 1047e59.
Kim Y, Lau WC, Chuah MC, Chao HJ. PacketScore: statisticsbased overload control against distributed denial-of-service
attacks. In: IEEE Infocom 2004, Hong Kong; 2004.
Lan K, Hussain A, Dutta D. Effect of malicious traffic on the network. In: Proceedings of passive and active measurement
workshop, April 2003, La Jolla, California; 2003.
Leland E, Taqqu M, Willinger W, Wilson DV. On the self-similar
nature of ethernet traffic, (extended version). IEEE/ACM
Transactions on Networking February 1994;2(1):1e15.
Li Ming, Chi C-H. A correlation-based computational method
for simulating long-range dependent data. Journal of the
Franklin Institute SeptembereNovember 2003;340(6e7):
503e14.


Li S-Q, Hwang C-L. Queue response to input correlation functions: continuous spectral analysis. IEEE/ACM Transactions
on Networking December 1993;1(6):678e92.
Li Ming, Zhao W, Jia WJ, Chi C-H, Long DY. Modeling autocorrelation functions of self-similar teletraffic in communication networks based on optimal approximation in
Hilbert space. Applied Mathematical Modelling 2003;
27(3):155e68.
Li Ming, Chi C-H, Long DY. Fractional Gaussian noise: a tool of
characterizing traffic for detection purpose. In: Content computing LNCS, vol. 3309. Springer; November 2004. p. 94e103.
Li Ming. An approach for reliably identifying signs of DDoS flood
attacks based on LRD traffic pattern recognition. Computers
& Security 2004;23(7):549e58.
Lighthill MJ. An introduction to Fourier analysis and generalised
functions. Cambridge University Press; 1958.
Liston K. Intrusion detection FAQ: can you explain traffic analysis and anomaly detection? <www.sans.org/resources/idfaq/
anomaly_detection.php>; 6 July, 2004.
Livny M, Melamed B, Tsiolis AK. The impact of autocorrelation on
queuing systems. Management Science 1993;39:322e39.
MaDysan D. QoS & traffic management in IP & ATM networks.
McGraw-Hill; 2000.
Mahajan R, Bellovin S, Floyd S, Ioannidis J, Paxson V, Shenker S.
Controlling high bandwidth aggregates in the network.
Computer Communications Review July 2002;32(3):62e73.
Mandelbrot BB. Fast fractional Gaussian noise generator. Water
Resources Research 1971;7(3):543e53.
Mandelbrot BB. Gaussian self-affinity and fractals. Springer;
2001.
McHugh J. Testing intrusion detection systems: a critique of
the 1988 and 1999 DARPA intrusion detection system evaluations as performed by Lincoln laboratory. ACM Transactions
on Information System Security November 2000;3(4):262e94.
Michiel H, Laevens K. Teletraffic engineering in a broad-band era.
Proceedings of the IEEE December 1997;85(12):2007e33.
<http://www.ll.mit.edu/IST/ideval>.
Muniandy SV, Lim SC. On some possible generalizations of fractional Brownian motion. Physics Letters A 2000;266:140e5.
Muniandy SV, Lim SC. Modelling of locally self-similar processes
using multifractional Brownian motion of RiemanneLiouville
type. Physical Review E 2001;63:046104.
Oh SH, Lee WS. An anomaly intrusion detection method by clustering normal user behavior. Computers & Security 2003;
22(7):596e612.
Paxson V, Floyd S. Wide-area traffic: the failure of Poisson modeling. IEEE/ACM Transactions on Networking June 1995;3(3):
226e44.
Paxson V. Fast, approximate synthesis of fractional Gaussian
noise for generating self-similar network traffic. Computer
Communications Review October 1997;27(5):5e18.
Pitts JM, Schormans JA. Introduction to IP and ATM design and
performance: with applications and analysis software.
John Wiley; 2000. p. 287e93.
Schultz E. Intrusion prevention. Computers & Security 2004;
23(4):265e6.
Sorensen S. Competitive overview of statistical anomaly detection.
White Paper. Juniper Networks Inc., www.juniper.net; 2004.
Stalling W. Data and computer communications. 4th ed.
Macmillan; 1994.
Stallings W. High-speed networks: TCP/IP and ATM design
principles. Prentice Hall; 1998 [chapter 8].
Streilein WW, Fried DJ, Cunningham RK. Detecting flood-based
denial-of-service attacks with SNMP/RMON. In: Workshop on
statistical and machine learning techniques in computer
intrusion detection. September 24e26, 2003. George Mason
University; 2003.

Tsybakov B, Georganas ND. Self-similar processes in communications networks. IEEE Transactions on Information Theory
September 1998;44(5):1713e25.
Willinger W, Paxson V. Where mathematics meets the Internet.
Notices of the American Mathematical Society August 1998;
45(8):961e70.
Willinger W, Taqqu MS, Leland WE, Wilson DV. Self-similarity in
high-speed packet traffic: analysis and modeling of ethernet
traffic measurements. Statistical Science 1995;10(10):
67e85.
Willinger W, Paxson V, Riedi RH, Taqqu MS. Long-range dependence
and data network traffic. In: Doukhan P, Oppenheim G,
Taqqu MS, editors. Long-range dependence: theory and applications. Birkhauser; 2002.

Ming Li completed his undergraduate program in electronic
engineering at Tsinghua University. He received the M.S. degree
in mechanics from China Ship Scientific Research Center and
Ph.D. degree in Computer Science from City University of
Hong Kong, respectively. In March 2004, he joined East China
Normal University (ECNU) as a professor after several years experiences in National University of Singapore and City University
of Hong Kong. He is currently a Division Head for Communications & Information Systems at ECNU. His current research
interests include teletraffic modeling and its applications to
anomaly detection and guaranteed quality of service, fractal
time series, testing and measurement techniques. He has published over 50 papers in international journals and international
conferences in those areas.

Computers & Security (2006) 25, 221–228

www.elsevier.com/locate/cose

An empirical examination of the reverse engineering process for binary files

Iain Sutherland a,*, George E. Kalb b, Andrew Blyth a, Gaius Mulley a

a School of Computing, University of Glamorgan, Treforest, Wales, UK
b The Johns Hopkins University, Information Security Institute, Baltimore, Maryland, USA

Received 18 November 2004; accepted 4 November 2005

KEYWORDS
Reverse engineering;
Software protection;
Process metrics;
Binary code;
Complexity metrics

Abstract Reverse engineering of binary code files has become increasingly easy to perform. Binary reverse engineering and subsequent software exploitation activities represent a significant threat to the intellectual property content of commercially supplied software products. Protection technologies integrated within the software products offer a viable solution towards deterring the software exploitation threat. However, the absence of metrics, measures, and models to characterize the software exploitation process prevents the execution of quantitative assessments to define the extent of protection technology suitable for application to a particular software product. This paper examines a framework for collecting reverse engineering measurements, the execution of a reverse engineering experiment, and the analysis of the findings to determine the primary factors that affect the software exploitation process. The results of this research form a foundation for the specification of metrics, the gathering of additional measurements, and the development of predictive models to characterize the software exploitation process.
© 2005 Elsevier Ltd. All rights reserved.

Introduction
Deployed software products are known to be
susceptible to software exploitation through reverse engineering of the binary code (executable)
files. Numerous accounts of commercial companies reverse engineering their competitors' products, for the purpose of gaining competitive advantage, have been published (Bull et al., 1995; Chen, 1995; Tabernero, 2002).

* Corresponding author. E-mail address: isutherl@glam.ac.uk (I. Sutherland).

Global movement towards the


use of industrial standards, commercially supplied
hardware computing environments, and common
operating environments achieves software engineering goals of interoperability, portability, and
reusability. This same global movement results in
a reduced cost of entry for clandestine software
exploiters to successfully reverse engineer a binary
code file. A software exploiter with rudimentary skills poses a threat to recently deployed commercial software products because (1) machine-code instruction sets and executable file
formats (Tilley, 2000) are routinely published, (2) hex editors, disassemblers, and software in-circuit emulator tools are readily available from Internet sources, and (3) similar attack scenarios involving reverse engineering of binary code files are readily accessible through numerous hacking websites.
There are also legitimate reasons for reverse engineering code, for example in the case of legacy systems (Muller et al., 2000; Cifuentes and Fitzgerald, 2000), and so there is a body of published academic material (Weide et al., 1995; Interrante and Basrawala, 1988; Demeyer et al., 1999; Wills and Cross, 1996; Gannod et al., 1988) to which a software exploiter could refer, although the main focus of that work is at the source code level (Muller et al., 2000).
The commercial software product developer is
forced to employ various protection technologies to
protect both the intellectual property content and
the software development investment represented
by the software asset to be released into the
marketplace. The commercial software product developer must determine the appropriate protection
technologies that are both affordable and supply
adequate protection against the reverse engineering
threat for a desired period of performance.
The absence of predictive models that characterize the binary reverse engineering software
exploitation process precludes an objective and
quantitative assessment of the time since first
release of the software asset to when software
exploitation is expected to successfully extract
useful information content. Similar to parametric
software development estimation models (e.g.,
COCOMO), the size and complexity of the binary
code file to be reverse engineered are considered
to be a prime contributing factor to the time and
effort required to execute the reverse engineering
activity. Additionally, the skill level of the software exploiter is also considered to be a primary
contributing factor. This paper describes the execution of an experiment to derive empirical data
that will validate a set of proposed attributes that
are believed to be the primary factors affecting
the binary reverse engineering process.

Background
An insider is assumed to have access to developmental information resources pertaining to the
commercial software product including the product source code. An outsider does not have access
to this information and must resort to analysis of
available software product resources. Such available software product resources may be little
more than the binary code file as released from

the original developer. The outsider is forced to
execute a binary reverse engineering activity
beginning with the binary code file and concluding
when some desired end goal has been achieved.
The entry criterion is defined as the time when
the outsider first obtains a copy of the binary code
file so as to commence the reverse engineering
process. The commercial software product vendor
must assume that this entry criterion coincides
with the first market release of the product.
The exit criterion is determined by the time when
the outsider has satisfied a particular end goal for
the software exploitation process. Unlike software
development activities where the singular end goal
is to deliver a reasonably well-tested software
product to an end user given the available funding
and schedule resources, binary reverse engineering
activities may have multiple software exploitation
end goals (Kalb). The first software exploitation end
goal is defined as obtaining sufficient information regarding the software product's operational function, performance, capabilities, and limitations. Satisfying
this first software exploitation end goal enables the
software exploiter to transfer the information gathered to other software products that are either in
development or are already deployed. The second
software exploitation end goal builds upon the first
and is defined as enabling minor modifications to
alter/enhance the deployed software product. Satisfying this second software exploitation end goal
enables (1) circumvention of existing performance
limiters and protection technologies to enhance
the operational performance of the deployed software product, and/or (2) insertion of malicious
code artefacts to corrupt the execution of the deployed software product. The third software exploitation end goal builds upon the previous two and is
defined as enabling major modifications to enhance
the operational performance of the deployed software product. Satisfying this third software exploitation end goal enables a significant alteration of
the deployed software products functional and
operational performance characteristics.
Regardless of the particular software exploitation end goal to be obtained, the software exploitation process must be defined as the basis for a series of experiments that will enable the capture of measurement data. This software exploitation
process commences when the exploiter acquires
the binary code file that represents the subject for
the reverse engineering activity. For network-centric computing, this acquisition step is performed rather expediently and may involve no more effort than locating the particular executable or load file
that will be the subject of subsequent reverse
engineering activities. For commercial software



products, this acquisition step encompasses the
purchase and installation of the product followed
by the selection of a particular executable or load
file for subsequent reverse engineering activities.
Embedded computer systems may require greater
effort during the acquisition step since the binary
code assets must be extracted from internal
memory devices using various attack scenarios
(e.g., in-circuit emulators, bus monitors or
invasive memory read-out attacks).
The next step in the software exploitation process is a static analysis of the binary code file to
derive information to support subsequent reverse
engineering activities (Kalb). Using a hex editing
tool the software exploiter can identify useful
text strings that may encompass library function
names, symbol table entries, debug messages, error
messages, I/O messages, and/or residual text inserted by the compilation environment (e.g., compiler version number, data and time stamps, etc.).
The software exploiter can analyze the file header
information used during the loading process to verify the binary file format employed (e.g., COFF, ELF,
PE, etc.). Knowledge of the binary file format
enables correct navigation through the contents
of the binary code file along with identification of
the major structural segments contained within
the binary code file such as the instruction segment.
Static analysis may include using a disassembler to
produce human readable assembly code for sections of the instruction segment that may be analyzed to determine functional attributes.
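By way of illustration, the first static-analysis steps described above (identifying the file format from well-known header magic bytes and pulling out printable strings) can be reproduced with a few lines of Python. The sketch below is ours; the file name is a placeholder, and in practice an exploiter would more likely use a hex editor, the strings utility, or a disassembler.

```python
import re

MAGIC = {b"\x7fELF": "ELF", b"MZ": "MS-DOS/PE"}   # a few well-known signatures

def identify_format(data):
    """Guess the binary file format from its leading magic bytes."""
    for magic, name in MAGIC.items():
        if data.startswith(magic):
            return name
    return "unknown"

def printable_strings(data, min_len=4):
    """Emulate the classic 'strings' pass of a static analysis."""
    return [m.group().decode("ascii")
            for m in re.finditer(rb"[ -~]{%d,}" % min_len, data)]

with open("a.out", "rb") as f:            # placeholder test object
    blob = f.read()
print(identify_format(blob))
print(printable_strings(blob)[:20])       # library names, error messages, etc.
```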
The next step in the software exploitation process
is a dynamic analysis of the binary code file to
evaluate the operational characteristics of the software product (Kalb). Execution of the binary code file
either on the actual target processor or within an
emulation environment enables the observation of
the execution behaviour of the software product.
Test case inputs can be supplied to stimulate functionality within the software product wherein the
execution behaviour may be observed by the software exploiter. The information gathered through
static and dynamic analyses of the binary code file
is sufficient for the software exploiter to achieve
the first software exploitation end goal.
Achieving the second or third software exploitation end goal requires modification of the software
product. The software exploiter uses the information gathered through static and dynamic analyses
of the software product to determine the nature
and location of the desired change/enhancement
to be applied to the software product. The actual
application of the change/enhancement takes the
form of a software patch of the existing binary code
file to alter the execution of the software product.


The extent of the modifications determines whether


it is the second or the third software exploitation
end goal that is to be achieved.
Anticipating the software exploitation of the
deployed software product, the commercial software product developer can perform a vulnerability
assessment culminating with the selection of appropriate tamper resistance technologies to be
integrated into the end product. The vulnerability
assessment concludes with an estimate of the time
since first deployment of the software product when
it is anticipated that software exploiters would have
achieved one of the software exploitation end
goals. Based upon the estimate of software exploitation timeline, the commercial software product
developer may elect to employ software tamper
resistance technology. The application of software
tamper resistance technology extends the software
exploitation timeline by increasing the difficulty
relating to reverse engineering of the binary code
file contents.
Experiments have been used in the past to
perform both tool assessments and user studies
(Cifuentes and Fitzgerald, 2000; Gleason, 1992;
Storey et al., 1996). The experiment described in
this paper attempts to determine the primary factors that affect the software reverse engineering
process. These primary factors once defined and
characterized could be used to quantitatively estimate the software exploitation timeline diminishing the subjectivity that currently dominates the
estimation process.

Assertions
Prior to executing the reverse engineering experiment, a set of assertions was identified to be validated once experimental results had been
model could illustrate the relationship between
education and technical ability of the software
exploiter and their ability to successfully reverse
engineer a software product. The second assertion
was that the complexity of the binary code file is
related to the complexity of the human readable
source code. The reverse engineering experiment
uses the Halstead and McCabe software complexity
metrics to explore this relationship.

Experiment
The reverse engineering experiment requires a set
of test subjects to perform a sequence of tasks
relating to the reverse engineering of a set of

binary code files. The test subjects' progress and
success during each task are monitored using
a variety of techniques to enable a series of
deductions to be made concerning the effort
required to reverse engineer a binary code file of
known size and complexity. To expediently execute the reverse engineering experiment, each
task was allotted a specific amount of time. The
progress of each test subject towards achieving
the task objective is then assessed. This approach
avoids the potentially open ended approach of
allowing each test subject to perform the task to
a completion criterion consuming as much time as
required to complete the task.
The set of test subjects included 10 student volunteers attending the University of Glamorgan: six undergraduates (three second-year students and three third-year students), three master's students, and one post-master's student, providing diversity in the education/technical skills suitable for the experimental requirements.
Prior to the commencement of the experiment the
test subjects were informed that the nature of the
experiment related to reverse engineering of
executable programs that contained simple algorithms. The test subjects were provided a reading
list and a copy of the platform used (Redhat 7.2
GNU/Linux) along with documentation.
The reverse engineering experiment is partitioned into three stages that include an initial
assessment of the test subjects knowledge/skill
base, execution of the reverse engineering tasks
on a set of test objects, and a post-experiment
assessment to obtain feedback on the experiment.
A set of six test object programs were developed
that included (1) Hello World, (2) Date, (3) Bubble
Sort, (4) Prime Number, (5) LIBC, and (6) GCD
(Table 1). The test object programs were purposely
selected to be easily recognizable algorithms, of approximately the same size (to afford reasonable reverse engineering progress within a restricted amount of time), and free of proprietary software elements (to avoid legal infringements associated with reverse engineering of binary code files).
A subset of the six test object programs were
compiled with the debug option enabled (Program
Set A) while another subset of the six test objects
were compiled with the debug option disabled
(Program Set B). This approach provides the test
subjects the opportunity to reverse engineer the
same test object thereby enabling the assessment
of the value that debug information retained in the
binary code file adds to the reverse engineering
process.
The initial assessment of the test subjects'
knowledge/skill base requires each test subject

to complete a questionnaire. The questionnaire
inquired as to the number of years of experience
the test subject possessed regarding UNIX and the
C programming language. The majority of test
subjects had at least one year's experience with
UNIX and the C programming language. The questionnaire also included a series of multiple choice
questions. The multiple choice questions focused
on UNIX commands relating to reverse engineering
to provide an assessment of the test subjects' level
of experience/capability.
The execution of the reverse engineering experiment required each test subject to perform
a static, dynamic, and modification task on each of
the test object programs within a constrained time
limit. Test object filenames were selected so as
not to reveal the function of the binary. Each test
subject was supplied with a tutorial worksheet
that provided general guidance during each specific task. For example, the static task tutorial
worksheet requested each test subject to determine the size of the binary, determine the creation
time of the binary, speculate as to the type of
information contained in the file, identify all
strings and any constants present in the executable, and generate the assembly language for the
program. The dynamic task tutorial worksheet
requested each test subject to determine if any
input is required by the binary, describe the output
produced by the binary, identify any command line
arguments required by the binary, and describe
the function/purpose of the binary. The modify
task tutorial worksheet requested each test subject to perform a specific modification to the test
object program that requires the development and
insertion of a software patch to the binary code
file. For example, the test subjects were requested to modify the Hello World binary so that
upon execution the program would output World
Hello or to modify the Bubble Sort binary so that
upon execution the program sorts in descending
rather than ascending order. During the time
allotted for each task the test subjects were
required to perform the work requested and record their findings on the tutorial worksheets
provided for that task. Upon expiration of the
allotted time the tutorial worksheets were collected and replaced with the next tutorial worksheet in the experiment.
Test subjects were provided with Program Set A
during the morning session of the reverse engineering experiment. Experiment developers were
present to observe the execution of the experiment and to observe any interactions between test
subjects. Test subjects were allowed to interact
during the lunchtime break since it was decided

Table 1 Reverse engineering experiment framework

Morning session
  Initial assessment
  Program Set A (debug option enabled):
    Hello World: Static 15 min, Dynamic 10 min, Modify 10 min (total 35 min)
    Date: Static 10 min, Dynamic 10 min, Modify 10 min (total 30 min)
    Bubble Sort: Static 15 min, Dynamic 15 min, Modify 15 min (total 45 min)
    Prime Number: Static 15 min, Dynamic 15 min, Modify 15 min (total 45 min)
Lunch
Afternoon session
  Program Set B (debug option disabled):
    Hello World: Static 10 min, Dynamic 10 min, Modify 10 min (total 30 min)
    Date: Static 10 min, Dynamic 10 min, Modify 10 min (total 30 min)
    GCD: Static 15 min, Dynamic 15 min, Modify 15 min (total 45 min)
    LIBC: Static 15 min, Dynamic 15 min, Modify 15 min (total 45 min)
  Exit questionnaire

that some limited collaboration on experimental


results would emulate real world conditions present during actual software exploitation activities.
Test subjects were provided with Program Set B
during the afternoon session of the reverse engineering experiment. The experiment developers
were again present to observe any interactions
between test subjects.
To further observe the test subject activities
during the execution of the reverse engineering
experiment, the test developers employed an
automated screen capture tool (Camtasia) to provide a permanent record of activities. The reverse
engineering experiment platform involved an
Intel-based computer executing Linux Redhat 7.2
within a VMWare virtual environment hosted on
Windows NT4. This enabled the complete experimental environment to be retained for future
analysis and included Bash histories of command
line instructions, and all temporary and history
files arising from Internet accesses. The screen
captures, Bash histories, temporary and history
files coupled with the initial questionnaire and
tutorial worksheets, provide a detailed accounting
of the test subject activities.

At the completion of Program Set B the test


subjects were provided an exit questionnaire to
enable post-experiment assessment. The exit
questionnaire assessed the amount of materials
supplied on the reading list that were actually used
by test subjects during the experiment along with
general comments pertaining to the various stages
of the reverse engineering experiment.

Results
The measurements collected during the reverse
engineering experiment are analyzed to validate
the two assertions defined in the beginning of this
paper (section Assertions).

Education/technical ability
The first assertion to be validated by the experimental results concerned whether the use of
a statistical model could illustrate the relationship
between education and technical ability of the
software exploiter and their ability to successfully
reverse engineer a software product. This assertion

Figure 1 Normalized data (ability and score) plotted for each test subject.

is validated through analysis of the initial questionnaire and tutorial worksheet responses. The education/technical ability (Fig. 1, ability) is derived from the initial questionnaire responses for each test subject and is normalized to values between 0 and 3 (Table 2) based on their experience with operating systems, platforms, and the range of commands used during the reverse engineering experiment. The ability to successfully reverse engineer a software product (Fig. 1, score) is derived from the tutorial worksheet responses for each test subject and is normalized by applying a consistent grading scheme per question response (Table 2) and then averaging over all of the responses (3 tasks × 8 test objects) for that particular test subject. The education/technical ability and reverse engineering score values are plotted against the test subject's identification number. Although the two
graphs do not coincide one-for-one, a correlation coefficient of 0.7236642 was computed, illustrating a statistically significant relationship between the educational/technical ability of the software exploiter and their ability to successfully reverse engineer the binary code file of a software product. This result provides validation evidence for the first experiment assertion.

Table 2 Grading scheme used to normalize responses

Grade  Description
0      The test subject has failed to answer the questions, or the answer is completely incorrect.
1      The test subject has failed to demonstrate an adequate understanding of the problem. There is some factual information presented, but there may be significant errors. The answer provided by the test subject lacks substantive matter.
2      Demonstrates an adequate understanding of the major issues and the complexity of the issues involved. The answer provided by the test subject is correct, but it may contain minor errors.
3      Demonstrates an excellent understanding of the problem and the complexity of the issues involved.
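For reference, once the normalized ability and score vectors are available, the correlation step described above reduces to a single library call. The values below are made-up placeholders for ten test subjects, not the experimental data.

```python
import numpy as np

# Hypothetical normalized values for 10 test subjects (placeholders only).
ability = np.array([0.6, 1.2, 2.1, 0.9, 1.8, 2.4, 0.7, 1.5, 2.0, 1.1])
score = np.array([0.5, 1.0, 1.9, 1.1, 1.6, 2.2, 0.6, 1.3, 1.8, 1.0])

r = np.corrcoef(ability, score)[0, 1]     # Pearson correlation coefficient
print(round(r, 4))
```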

Complexity/size metric
The second assertion to be validated by the experimental results concerned the relationship between the complexity of the binary code file and the complexity of the human readable source code. This assertion is validated by correlating the tutorial worksheet responses (regarding the reverse engineering of the eight test objects) with the Halstead and McCabe metrics computed on the human readable source code (six software programs that, when compiled, produced the eight test objects). The tutorial worksheet responses for the static, dynamic, and modification tasks were normalized using the grading scheme (Table 2) and then averaged to produce the mean grade per test object (3 tasks × 10 test subjects). The Halstead and McCabe metrics were computed using the source code for each of the test objects. The mean grade per test object is correlated with each of the individual metric items to determine the extent of any dependencies (Tables 3 and 4).
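For reference, the Halstead items listed in Tables 3 and 4 follow from four counts taken over a source file. The sketch below uses invented counts and the standard Halstead formulas (it is not the tool used by the authors), with McCabe's cyclomatic complexity noted separately.

import math

# Hypothetical counts for one source file (not taken from the paper's programs):
n1, n2 = 12, 9        # distinct operators, distinct operands
N1, N2 = 40, 28       # total operator and operand occurrences

vocabulary = n1 + n2                  # Halstead vocabulary
length = N1 + N2                      # Halstead length
volume = length * math.log2(vocabulary)
difficulty = (n1 / 2) * (N2 / n2)     # Halstead difficulty
level = 1 / difficulty                # Halstead program level
effort = difficulty * volume          # Halstead effort
time_s = effort / 18                  # Halstead time estimate (seconds)

# McCabe cyclomatic complexity is counted separately: decision points + 1.
decisions = 2                         # hypothetical number of branch points
cyclomatic = decisions + 1

print(vocabulary, length, round(volume, 1), round(difficulty, 2),
      round(level, 3), round(effort, 1), round(time_s, 1), cyclomatic)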
The statistical analysis reveals that there are no
significant positive correlations between the
source code metrics and the ability of the software
exploiter to successfully reverse engineer a software product. The lack of correlation illustrates
that source code artefacts that contribute to size
and complexity metrics do not impact the reverse
engineering process applied to binary code files.
For example, the amount of branching (decision
points) within a source code file is the basis of
the McCabe cyclomatic complexity metric and
has significant bearing on unit-level testing of
the software module. Comparatively, branching
instructions (jump instructions) within a binary
code file are easily disassembled and understood
by the software exploiter.

Table 3 – Source code metrics, debug enabled

Metric                        Hello World   Date     Bubble Sort   Prime Number   Correlation
Mean grade per test object    1.483         1.300    0.786         0.867
Lines of code                 6             10       9             21             0.5802
Software length (a)           7             27       14            33             0.3958
Software vocabulary (a)       6             14       11            15             0.5560
Software volume (a)           18            103      48            130            0.4006
Software level (a)            0.667         0.167    2.5           0.094          0.4833
Software difficulty (a)       1.499         5.988    5.988         10.638         0.7454
Effort (a)                    27            618      120           1435           0.3972
Intelligence (a)              12            17       19            15             0.6744
Software time (a)             0.001         0.001    0.001         0.001          0
Language level (a)            8             2.86     7.68          1.83           0.1909
Cyclomatic complexity         1             1        1             3              0.4802

(a) Halstead metrics.

Table 4 – Source code metrics, debug disabled

Metric                        Hello World   Date     GCD      LIBC     Correlation
Mean grade per test object    1.350         1.558    1.700    1.008
Lines of code                 6             10       49       665      0.3821
Software length (a)           7             27       40       59       0.3922
Software vocabulary (a)       6             14       20       21       0.0904
Software volume (a)           18            103      178      275      0.4189
Software level (a)            0.667         0.167    0.131    0.134    0.1045
Software difficulty (a)       1.499         5.988    7.633    7.462    0.0567
Effort (a)                    27            618      2346     5035     0.5952
Intelligence (a)              12            17       17       19       0.1935
Software time (a)             0.001         0.001    0.2      0.4      0.5755
Language level (a)            8             2.86     2.43     2.3      0.0743
Cyclomatic complexity         1             1        3        11       0.7844

(a) Halstead metrics.

Conclusion

The reverse engineering experiment as defined within this paper represents a framework for the experimental collection of measurement data in a consistent and repeatable fashion. The 10 test subjects participating in the actual reverse engineering experiment, although representing a relatively small data set, provide the basis of a preliminary assessment as to the primary factors that affect the software reverse engineering process. The reverse engineering experiment provides quantitative evidence that there is a relationship between the education/technical ability of the software exploiter and their ability to successfully reverse engineer a software product. This evidence provides the foundation for modelling this relationship using existing predictive models. Development and maturation of a reverse engineering model that characterizes the software exploitation process will enable commercial software product developers to quantitatively predict the time following product deployment at which a software exploiter is anticipated to have achieved a given exploitation end goal.

The reverse engineering experiment also provides quantitative evidence that industry accepted source code size and complexity metrics are not suitable for characterizing the size and complexity of binary code files for the purpose of estimating the time required to perform software exploitation activities. The literature review conducted at the commencement of this project did not identify binary size and complexity metrics that could have been used instead of the source code size and complexity metrics. Size and complexity metrics that directly characterize the binary code files must be defined. Such size and complexity metrics are required to support the development of software exploitation predictive models; a follow-on research project has been proposed to define these metrics and then use the existing reverse engineering experiment framework to gather measurements to corroborate the defined metrics.

Acknowledgment
The researchers wish to thank the sponsor of this project, who requested to remain anonymous, for generously funding this work and for providing funding for the follow-on research project.



Iain Sutherland is a lecturer in the Information Security Research Group at the School of Computing, University of Glamorgan, UK. His main research interests are Information Security
and Computer Forensics. Dr. Sutherland received his Ph.D.
from Cardiff University.
George E. Kalb is an Instructor and Institute Fellow at the Johns
Hopkins University Information Security Institute, US. His research interests are in the domains of binary reverse engineering and tamper resistance technologies. He has a B.A. in Physics
and Chemistry from University of Maryland and an M.S. in Computer Science from Johns Hopkins University.
Andrew Blyth is currently the Head of the Information Security
Research Group at the School of Computing, University of Glamorgan, UK. His research interests include network and operating systems security, and reverse engineering. Dr. Blyth
received his Ph.D. from Newcastle University.
Gaius Mulley is a senior lecturer at the University of Glamorgan.
He is the author of GNU Modula-2 and the groff html device
driver grohtml. His research interests also include performance
of micro-kernels and compiler design. Dr. Mulley received his
Ph.D. and B.Sc.(Hons) from the University of Reading.

computers & security 25 (2006) 229–236

A simple, configurable SMTP anti-spam filter: Greylists


Guillermo Gonzalez-Talavan*
Department of Computer Science and Automation, University of Salamanca, Facultad de Ciencias,
Plaza de la Merced, s/n, 37008 Salamanca, Spain

article info

Article history:
Received 27 August 2004
Revised 25 October 2005
Accepted 15 February 2006

Keywords:
Spam
Anti-spam filter
Whitelist
Greylist
UNIX
Sendmail

abstract

This paper addresses methods for combating spam, focusing especially on those based on the economic motivations of unsolicited commercial e-mail. Considering the fact that to date no machine has passed the Turing test, well-known blacklist and whitelist solutions can be generalized by greylists. An outline of a simple SMTP anti-spam application following these ideas and running on a UNIX machine is offered. Some problems regarding the application are discussed, together with some of the results obtained after a two-month test period.

© 2006 Elsevier Ltd. All rights reserved.

* Tel.: +34 923294500x1302; fax: +34 923294514. E-mail address: gyermo@usal.es

1. Introduction

Spam is the word commonly used to refer to unsolicited commercial e-mail (UCE) or unsolicited bulk e-mail (UBE). As well as causing users a certain displeasure, spam is a waste of money and Internet resources (Grimes, 2004; Spam). Furthermore,
owing to its content, its distribution methods, and the way it
usually forges its sources, it can be regarded as fraudulent
(Hinde, 2003). Currently, more than half of all circulating Internet e-mails are spam. Forecasts point to an even worse situation in the future. While the number of legitimate e-mails in
2007 will be the same as now, it is believed that spam will
double (Spam filters, 2004).
There is readily available software to fight spam. Anti-spam
software is in constant evolution, and so are the tools used to
generate it. Indeed, a fight has arisen on the spam battlefield
similar to the one between other computing opponents, such
as viruses and antivirus software. Just as a computer virus
has a life cycle comparable to the life cycle of its biological

counterpart, spam also seems to resemble another type of biological behaviour: i.e., parasitic behaviour. Spammers eat
Internet resources for their own benefit and give nothing in
return. It is said that if spam keeps expanding at its present pace, it may well bring the Internet to an end, at least as it is known now. If the parasite comparison is correct, however, this will not actually happen, since no parasite wishes its host's death, which in the long run of course means its own death.
In the following sections several methods for fighting spam
will be discussed. A simple application based on some of them
will be presented. This application is easy to implement on
a UNIX machine and is currently being tested at our
Department.

2. Some methods for combating spam

In order to combat spam, two main independent battlefronts have been opened: the legal one and the technological one.


There are supporters of the former (Mertz) as well as of the latter (Grimes, 2004; Hinde, 2003). However, a problem as multifaceted as spam probably requires a combined solution.
With regard to the legal front, some difficulties have been encountered. They may be due to the international character of the Internet and the lenience of some recently enacted laws (Asaravala). Severe anti-spam legislation could perhaps lead to a lack of competitiveness against less scrupulous neighbouring countries. Opt-in and opt-out models are also being debated, as well as public registries where the addresses of people who do not want e-mail marketing are to be included. Some of these measures may be counter-productive, however, since spammers can use them maliciously for their own benefit.
Regarding the technological aspect, several measures have
been devised and put into practice. Among them are the
following:
0) Preventive methods: such as trying to prevent spammers from including one's e-mail address in their lists.
1) Blacklists: these are lists of e-mail or machine addresses
from which it is known that spam is sent. They may be
personal or public, local or distributed. When a message
arrives coming from an address or machine listed on the
blacklist, it is rejected.
2) Honeypots: in connection with blacklists, these consist of
invented e-mail addresses. Their aim is to attract as
much spam as possible in order to alert other users or
take further measures. They are based on spam usually
being distributed in bulk. Characteristic features
(fingerprints) are obtained from received messages. User
software connects to the honeypot to find out if the relevant message has already been received there.
3) Whitelists: their operation is the opposite of blacklists.
They consist of a list of addresses from which all mail is
accepted. Mail coming from other addresses is transferred
to a low priority folder (Ookoboiny). A few commercial
implementations are available and some of them are evaluated in PC Magazine, 2004.
4) Content filters: these compute a score for each incoming message as a function of some previously user-established criteria. If the score of the message is greater than a given threshold, the message is considered spam (a minimal scoring sketch appears after this list).
5) Bayesian filters (Graham): statistics about the content of the message are used to classify it as spam or not. Users must train their filters to make them learn which messages are spam and which are not. This method is appealing because it is adaptable; that is, it learns from its user's concept of spam as more and more messages are processed.
6) Neural networks (Vinther): if a human being is easily capable of detecting spam, perhaps artificial intelligence
should be tried out. Although no systems are currently
available commercially, some efforts have been made.
7) Sender Id: this method is devised to get rid of forged sender
information (domain spoofing). It simply asks the presumed
sender domain for IP addresses from which that message
can be sent. The message is considered spam if the e-mail
connection did not come from one of those (Sender ID
Framework).
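As an illustration of the content-filter approach in item 4, the minimal sketch below scores a message against a handful of user-established criteria and applies a threshold. The rules, weights and threshold are invented placeholders rather than those of any particular product.

import re

# Minimal content-filter sketch: score a message against user-established
# criteria and compare the total with a threshold.  All values are invented.
RULES = [
    (re.compile(r"viagra|lottery|winner", re.I), 3.0),   # suspicious words
    (re.compile(r"FREE!!!"), 2.0),                       # shouted offers
    (re.compile(r"click here", re.I), 1.5),
    (re.compile(r"^Subject:\s*$", re.M), 1.0),           # empty subject line
]
THRESHOLD = 4.0

def is_spam(raw_message: str) -> bool:
    score = sum(weight for pattern, weight in RULES if pattern.search(raw_message))
    return score >= THRESHOLD

msg = "Subject: You are a WINNER\n\nClick here to claim your FREE!!! prize."
print(is_spam(msg))   # True with these placeholder rules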

In order to find out how good a spam detection method is, it must be considered that two kinds of errors can be produced: a message is classified as spam when it is not (a false positive), or the message is classified as legitimate when it is spam (a false negative). Depending on the type of user, it may be necessary for one of these errors to be minimized or even eliminated.
No single method can be considered the universal panacea against spam; each has its own problems. With blacklists, a legitimate user belonging to a spammer's domain may be misjudged. Whitelists are not suitable when the majority of good e-mails come from unknown people who are
sending their first e-mail to that address. Users of filters or
neural networks lack control over their operations and the
message must be received in its entirety before a decision
is made.
Another important aspect is the policy to be followed with
the spam detected. Anti-spam software usually moves it to
a different folder or marks it. This is a disadvantage since
spammers, who do not get the message bounced back, can
think that at least their messages are reaching their destination and that there is an active address there. If an alternative
policy is decided on and the messages are rejected even before
retrieving their body, there is no chance of rescuing false
positives.
For a complete review of the techniques and tools currently
used by spammers, as well as the sources where they get
e-mail addresses from, Cournane and Hunt (2004) can be
consulted. Some personal statistics about the efficiency of
several anti-spam methods appear in Mertz.

3.

Spam on spam

Recently, there has been much debate about the economic aspects of fighting spam (McWilliams). Clear evidence in support
of how profitable a spam-based commercial campaign can be
is seen in spam e-mails that advertise spam services. Sometimes, e-mails offering bulk marketing programs via e-mail
are received. Their price ranges between a few dollar cents
and one dollar for each thousand e-mails sent, depending on
whether the spammers are in charge of the design. If one is
interested in merely buying e-mail addresses to send spam
e-mails, the price is in the region of a hundred dollars. It thus
depends on whether the addresses are classified by country,
Internet domain, field of activity, etc. and whether they are
verified (the addresses are not dead). The promised response
rate also ranges between 1% and a more realistic 1/10,000.
Quoting the spammers' own advertisement: "You sell a product or service for 10 euros. You decide to promote this product or service on the Internet to 10 million people, only 1% decides buy your product (sic), do the math and see how much money you would make. [...] You would make one million euros sending 10 million emails. You understand now why you receive so much email every day in your mailbox: Advertising on Internet is extremely lucrative." Just as certain animals or plants
produce hundreds of eggs or seeds, a spammer spreads a huge
amount of messages, despite knowing that the vast majority
will not bear fruit.


Other spammers tell us in their advertisements about the advantages of bulk e-mail marketing:

- Low cost per message
- Quick implementation
- Immediate results
- Market segmentation possibility
- Personalised messages
- Direct contact with clients
- Ability to interact with the recipient
- Almost no restrictions on message size or design

It is clear that this kind of marketing has these advantages only for rather unknown companies with doubtful reputations. They certainly will not see, or will not mind seeing, their
name spoilt by thousands of people with indiscriminately
filled mailboxes.
Spam companies also offer premium services, such as:
- Legal advice, to dodge defective anti-spam legislation.
- Bullet-proof web sites. As spammers explain it: "as you already know, many web hosting companies have Terms of Service (TOS) or Acceptable Use Policies (AUP) against the delivery of e-mails advertising or promoting your web site. If your web site host receives complaints or discovers that your web site has been advertised in broadcasts, they may disconnect your account and shut down your web site."
- Sorted or classified lists of addresses.
- Software for sending spam.
- Verified addresses.
It so happens that when a client pays money for a number
of e-mail addresses, that person can and will demand working
ones. Therefore, spammers need a certain amount of feedback when sending e-mail. This can be accomplished in
several ways:
- With personalised opt-out links. When somebody clicks onto them, the spammer has proof that the e-mail reached the recipient.
- Recording the lack of bounced mails. The message headers are usually forged to make tracing back more difficult. But, sometimes, exploratory e-mails with correct headers are sent in the hope they will not be bounced back. In this case, the address probed is added to the spammer's list.
- By using an HTML e-mail trick. A common one consists of placing a personalised URL for an embedded image in the HTML message (e.g. http://mailcheckisp.biz/load_gifs.asp?pic=gyermo@gugu.usal.es). When the e-mail is shown, the image is retrieved from the spammer's server and they log a hit. This resource is invaluable for efficient spammers since they know with certainty that their e-mail has broken all barriers and has been opened by the recipient. The Achilles' heel is an annoyed recipient forging the image link to point to known spammers' e-mail addresses (those used for contact information on their advertised web pages).
- Lack of errors in sending connections. Spammers frequently use customized software to send spam. That software directly connects to the destination e-mail relay to deliver mail. If the software gets an error for an address, it will probably discard it for future uses. The spammer is aware of the address malfunction regardless of possible forged headers.
The variety of spam services can sometimes lead to nonsense: some spammers offer anti-spam filters or the possibility of removing an address from their lists by spam e-mail.
They are aware that if recipients are reading their mail it is because their anti-spam defence did not work! Following this
trend, one finds the idea of paying a small amount to spammers for deleting certain e-mail addresses from their lists. I
do not believe this is an adequate solution. It encourages
new spam companies to be born to engage in this type of business, currently very profitable in itself.
How can one combat spam from the economic viewpoint? By cutting off its source of profit, the same way one gets rid of biological parasites by preventing them from assimilating the host's body substances and finally making them die of hunger. Essentially, the secret lies in transferring part or all of the bulk e-mail expenses to the sender of spam. Today, these expenses are borne by information transport companies and eventually passed on to their Internet subscribers.
Among several proposed solutions based on economic
aspects, one is to place a very small fee (micropayment) on
electronic mail (Grimes, 2004). That micropayment may even
involve computation resources. This electronic franking is thought to be minimal for a few mails, but unaffordable for sending millions. Such a solution is severely contested by Internet users, already accustomed to free electronic mail. The solution also implies a serious handicap for legitimate companies who have their clients' consent to send electronic marketing.
Perhaps this idea could be put into practice only through some
kind of registered premium e-mail service, guaranteed free of
spam.
Other techniques (Miller, 2003) implying economic considerations are indirectly based on the Turing test (Turing, 1950).
At present, there is an imbalance between generating and
fighting spam software. Spammers can send millions of emails with relatively simple software, while users have to
deal with complex statistical or heuristic filters to set the
spam aside. In the worst cases, valuable user time is wasted
in separating the wheat from the chaff. Can these roles be
swapped? Some people have put forward ideas that require some sort of intelligence on the sender's part to keep machines out of the game. These methods are usually combined with whitelists in the following way: if a message with an unknown sender is received, that message is bounced back asking the sender to answer a simple challenge. The challenge can be as simple as visiting a web page, answering an easy question or even finding the solution of a popular riddle. The task is designed to be very easy for a human being, but terribly challenging for the machine. The answer can be added to the subject line of the message, see for instance Manes. Thus, if this becomes a common approach, spam companies will need to recruit new staff to send their mail, losing competitiveness against other more conventional marketing. Unfortunately, this solution is expected to have limited validity: just
up to the time when a machine might be able to pass the
Turing test.

4. Spam filter test with greylists

Considering the above ideas, a low-cost and highly customizable anti-spam application has been developed at the
Computer Science Department at the University of Salamanca. To accomplish this, only a web server (Apache) and
an e-mail management program (Sendmail) were needed.
A simple C program and a dozen lines added to the configuration file of Sendmail (sendmail.cf) sufficed for work to
begin.
An SMTP anti-spam barrier was chosen. SMTP (Klensin)
stands for Simple Mail Transfer Protocol. As its name suggests,
SMTP is very straightforward. The simplest of all SMTP working procedures to deliver mail is shown in Fig. 1. Mail transfer
begins with the recipient machine greeting and introducing
itself. A dialogue with the senders machine follows, which
is quite easy to understand. After the sender has issued the
QUIT command and the other part has acknowledged it, the
connection is closed.
It must be remarked that the sender machine's statements about its name or the sender's address may be false. There is no guarantee that they are real. Once the mail has been processed, the stated machine name, along with the IP address from which the connection was established, will appear in the Received headers of the e-mail. The rest of the headers are probably kept unchanged (Fig. 2). It is important to mention that both the origin address (MAIL FROM) and the destination address (RCPT TO) may differ from the ones in the headers of the message (From: and To:, respectively). This is why the former are sometimes known as the SMTP envelope addresses. The specification states that any notification or error detected once the SMTP connection is closed has to be addressed to the MAIL FROM address.
The SMTP anti-spam filter developed therefore has to decide whether the connection is good with only two pieces of information at hand: who the recipient of the message is (RCPT TO) and from whom it is said to come (MAIL FROM). By
working this way, the e-mail is rejected even before its body has been transmitted, resulting in server resource savings in space and time. If the spammer's software is directly connected to the machine where the application is held, it gets immediate feedback: the address to which it is trying to send the e-mail is not working properly.
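A minimal sketch of that decision point, assuming a BLACK/GREY/WHITE classification (introduced in the next paragraph) has already been obtained for the envelope addresses: the reply is chosen at RCPT TO time, so a rejected message never reaches the DATA phase. The reply texts are modelled on those of Fig. 5; everything else is invented.

def rcpt_reply(classification: str, mail_from: str, info_url: str) -> str:
    # Reply returned at RCPT TO time, before any DATA is accepted.
    if classification == "BLACK":
        return f"550 5.7.1 {mail_from} blacklisted (spam)"
    if classification == "GREY":
        return f"550 5.1.0 {mail_from} blocked. Info: {info_url}"
    return "250 ok"                    # WHITE: accept and carry on with the transaction

print(rcpt_reply("GREY", "alice@sender.part.com",
                 "http://tejo.fis.usal.es/~gyermo/as.htm"))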
Each user belonging to the Department test program can state both blacklisted sender addresses (immediately rejected) and whitelisted sender addresses (admitted with no further checking) by means of a configuration file. Furthermore, the concept of blacklist and whitelist is extended to that of greylist. Each address received at a MAIL FROM statement must belong to exactly one of the three recipient's lists: blacklist, whitelist or greylist. The action taken for either of the first two is clear, but if the address belongs to the greylist, the mail is rejected in a standard way with this additional information (e.g.): "alice@sender.part.com blocked. Info: http://tejo.fis.usal.es/~gyermo/as.htm".
The web page in the example belongs to the recipient of the
message and that person can change it as s/he likes. The
sender whose address appears in the greylist therefore has
to visit the web page to pass the spam barrier. Instructions
on how to proceed are to be found there. The English content
of the present web page is illustrated in Fig. 3. The challenge in
this case is merely to add a password (pera) to the login
name of the recipient (gyermo). It is not possible to place the
password in the subject line, as is the custom, because the
message subject is not included in the SMTP envelope data
but in the body of the message. The addresses on the web
page are gif images in order to make their automatic retrieval
more difficult. It is obvious that the web page can be modified
at will to include a more difficult challenge of any sort, provided that the answer to the challenge is a word to be added
to the login name of the user.

Fig. 3 – Current English content of the web page of the challenge.

sender.part.com                       recipient.part.com [listening on port 25]

                                      220 recipient.part.com ESMTP
HELO marte
                                      250 recipient.part.com welcomes you
MAIL FROM: <alice@sender.part.com>
                                      250 ok
RCPT TO: <bob@recipient.part.com>
                                      250 ok
DATA
                                      354 go ahead
Hello there...
.
                                      250 ok 1093222544 qp 18015
QUIT
                                      221 nice to see you

Fig. 1 – Simple SMTP transaction.


From alice@sender.part.com Mon Oct 10 03:06:43 MDT 2005
Received: from marte ([xxx.xxx.xxx.xxx])
        by recipient.part.com (8.12.10/8.11.1) with SMTP id yyy
        for <bob@recipient.part.com>; Mon, 10 Oct 2005 03:05:54 -0600 (MDT)
Date: Mon, 10 Oct 2005 03:05:13 -0600 (MDT)
Message-Id: <zzz@recipient.part.com>
Subject: A present for you
From: santa@northpole.com
To: bob@recipient.part.com

Hello there...

Fig. 2 – Received message, including headers.

5. Users' lists specification

Users can customize the filter to suit their needs. They must create a file named .blacklist in their home directory. An example of such a file can be seen in Fig. 4. The file syntax is very simple. It is a text file with independent entries on different lines. Each entry has two parts separated by a colon (:). The type of entry comes after the colon and can be PASSWORD, BLACK, GREY or WHITE. A PASSWORD entry is used to set the password of the user. BLACK, GREY or WHITE deal with address lists. When a sender requests the system to deliver an e-mail to a local user and the address does not carry a password, the local user's .blacklist file is scanned sequentially. The first time that the left part of a line matches the
SMTP MAIL FROM address, the right part will show what to do
with the request (i.e. which list it belongs to). If the end of
the file is reached, the address is considered to be GREY. As
can be seen in the figure, regular expressions can be used to
specify addresses. This is not a minor enhancement. Joining
the three lists in a single file and using regular expressions afford the application great flexibility.
For example, one can decide to allow all incoming mail from the mars.com domain with the line *.mars.com:WHITE. If later one finds that spam is arriving from alienvacation@mars.com, inserting the line alienvacation@mars.com:BLACK before the first line can solve it. It is quite reasonable to insert all addresses of the most popular free webmail services into the greylist, for spammers like to include false return addresses from them. One only has to add *@popularwebmail.com:GREY to the .blacklist file, for example, to accomplish this. If, then, one's grandmother opens an account with popularwebmail, one needs another addition before it: granny@popularwebmail.com:WHITE. The user maintains absolute control over the filter. The administrator of the machine is exempt from liability in the event of losing an important e-mail due to the filter.

pera:PASSWORD
*.mars.com:WHITE
alienvacation@mars.com:BLACK
granny@popularwebmail.com:WHITE
*popularwebmail.com:GREY
:WHITE
compromised@recipient.part.com:GREY
*[@.]recipient.part.com:WHITE

Fig. 4 – An example of .blacklist file.

6. Application description

Thanks to the extraordinary configuration possibilities of Sendmail, the reader can have a working filter of this sort running on a UNIX machine in a few hours. An updated version of Sendmail is required. A standard web server is also recommended for the challenges. Only a few lines have to be added to the configuration file of Sendmail (sendmail.cf is its usual name). This file is essentially made up of rules. Each rule has two parts: an input string and an output string. If the input string matches the left part of the rule, it is rewritten following the instructions found at the right part of the rule. The rule may be repeated until the input string does not match its left part. Then, the output string is the resulting input string. If there is no match, the output string equals the input string. For further information, readers are referred to the Sendmail documentation (The whole scoop in the configuration file).
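As a loose illustration of that rewriting behaviour, and not of Sendmail's actual pattern syntax, the sketch below keeps applying a rule while its left part matches the workspace and returns the input unchanged when nothing matches; the rules shown are invented.

import re

def apply_rules(workspace: str, rules: list[tuple[str, str]]) -> str:
    # Apply (pattern, replacement) rules in the spirit of sendmail.cf rewriting:
    # a rule keeps firing while its left part matches; no match leaves the input as is.
    for pattern, replacement in rules:
        while re.match(pattern, workspace):
            new = re.sub(pattern, replacement, workspace, count=1)
            if new == workspace:        # guard against a rule that rewrites to itself
                break
            workspace = new
    return workspace

# Invented toy rules, only to show the control flow described above.
rules = [(r"^<(.*)>$", r"\1"),          # strip one pair of angle brackets
         (r"^(.*)@local$", r"\1 | local")]
print(apply_rules("<<alice@local>>", rules))   # "alice | local"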
Rules are grouped in procedures. Different procedures are
invoked in different parts of the SMTP connection. check_rcpt
is the name of the procedure which is called when an SMTP
RCPT TO address has been received (Sendmail 8.8). In the version
of Sendmail used to develop the test filter, check_rcpt calls another procedure whose name is Local_check_rcpt. These procedures are widely used to avoid mail relaying. However, they can
also be used to implement a spam filter. For example, Fig. 5 illustrates how Local_check_rcpt was used in the test example. Sendmail must be restarted for the changes to take effect.

[01] KCheckAS program /root/ANTISPAM/antispam
[02] C{ProgramaAS} gyermo
[03] SLocal_check_rcpt
[04] ...
[05] R$*                                  $: $1 $| $&{rcpt_addr} $| $&{rcpt_mailer}
[06] R$* $| $={ProgramaAS} $| local       $1 $| $( CheckAS $2:$&{mail_addr} $)
[07] R$* $| $={ProgramaAS}+$+ $| local    $1 $| $( CheckAS $&{rcpt_addr}:$&{mail_addr} $)
[08] R$* $| BLACK                         $#error $@ 5.7.1 $: "550 " $&{mail_addr} " blacklisted (spam)"
[09] R$* $| GREY                          $#error $@ 5.1.0 $: "550 " $&{mail_addr} " blocked." " Info: http://tejo.fis.usal.es/~gyermo/as.htm"
[10] R$* $| BADPASS                       $#error $@ 5.1.1 $: "550 Wrong or expired password"
[11] R$* $| $*                            $1

Fig. 5 – Additions to sendmail.cf file.
A general explanation of Fig. 5 follows. On the first line, the CheckAS key is defined as an executable program located at /root/ANTISPAM/antispam. The program takes an argument consisting of the recipient's login, a colon and what was read from the SMTP MAIL FROM statement. The program writes to the standard output BLACK, GREY, WHITE, GOODPASS or BADPASS according to the argument passed. The present version looks up the user's .blacklist file as described above. The second line builds up a set, whose name is ProgramaAS, listing all local users whom the administrator wants included in the anti-spam program. In the example of the figure, only user gyermo is included. Following these two lines, the relevant core of the modifications is shown. When Sendmail receives the SMTP RCPT TO statement, check_rcpt is invoked, which, in turn, calls Local_check_rcpt. On the fifth line, the rule adds what was stated in MAIL FROM plus the word local if the mail is for a local user. The sixth line more or less states that if the receiving user is local and belongs to the anti-spam program, the CheckAS program will be run with the corresponding argument. The seventh line works the same as the sixth, but for the password case. The eighth, ninth and tenth lines intercept blacklist, greylist and bad password cases.
They make Sendmail produce an SMTP error. Attempts were made to ensure that the error codes would follow the RFC 3461 specification (Moore). The last line is reached only in the remaining cases. It leaves the input string as it was at the procedure entrance, ready for additional rules if necessary.
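The CheckAS helper itself is not listed in the paper. The Python sketch below shows one way a checker with the contract just described could behave: it takes login:mailfrom as its argument, scans the user's .blacklist with first match wins and GREY as the default, and reports GOODPASS or BADPASS when the recipient address carries an extra word after the login. The home-directory layout, the glob-style matching and the password-suffix convention are assumptions, not the authors' C implementation.

import fnmatch
import sys
from pathlib import Path

def classify(rcpt_local: str, mail_from: str, users: list[str]) -> str:
    # Work out which enrolled login the RCPT TO local part starts with; anything
    # left over is treated as a candidate password appended to the login
    # (an assumed address form such as "gyermopera").
    login = next((u for u in users if rcpt_local.startswith(u)), rcpt_local)
    suffix = rcpt_local[len(login):]
    password = None
    rules = []                                   # (pattern, list) entries in file order
    cfg = Path("/home") / login / ".blacklist"   # home-directory layout is an assumption
    for line in cfg.read_text().splitlines():
        left, _, kind = line.rpartition(":")
        if kind == "PASSWORD":
            password = left
        elif kind in ("BLACK", "GREY", "WHITE"):
            rules.append((left, kind))
    if suffix:                          # address of the form <login><password>@...
        return "GOODPASS" if suffix == password else "BADPASS"
    for pattern, kind in rules:         # first match wins, as described in Section 5
        if fnmatch.fnmatch(mail_from, pattern):
            return kind
    return "GREY"                       # default when no entry matches

if __name__ == "__main__":
    rcpt_local, mail_from = sys.argv[1].split(":", 1)
    print(classify(rcpt_local, mail_from, users=["gyermo"]))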
It may seem strange to produce an error at the RCPT TO statement given that the main cause of that error comes from the previous MAIL FROM statement. The fact is that both items of information are needed to diagnose the problem. This issue is reflected in RFC 2821: "Despite the apparent scope of this requirement, there are circumstances in which the acceptability of the reverse-path may not be determined until one or more forward-paths (in RCPT commands) can be examined. In those cases, the server MAY reasonably accept the reverse-path (with a 250 reply) and then report problems after the forward-paths are received and examined."
For simplicity, log functionality has not been considered in
Fig. 5. These logs allow one to check all incoming message addresses and how Sendmail classified them so that statistics
can be produced. It should also be noted that the method
used to implement the filter is highly inefficient. One process
has to be spawned for each received e-mail message. It is
therefore not suitable for large organizations and is only
presented here for testing purposes.

7. Results

The application was tested for two months on a single account. Before the test was started, the e-mail account used to receive
an average of 27 spam messages a day: 1647 messages in two
months. After that, only 14 out of the previous 1647 spam messages reached the users inbox. These results were expected
and are compatible with other whitelist-based methods.
Regarding false positives, although they are difficult to detect accurately with the application logs, only four cases were
found during these two months. Two of them correctly visited
the web page and sent the password accordingly. One of the
four users said that he thought the address had problems,
and concerning the last case nothing is known. These cases
cannot be considered statistically representative since there
are few of them and the account is mainly used for Computer
Science academic purposes. Nevertheless, when the program
started, the address was useless for practical purposes and is
now nearly fully functional again.


Surprisingly, less spam mail is now caught trying to pass through the filter. Presently, only an average of 13 spam connection attempts a day is detected (as compared with 27 received before). This means that some spammer software
detects some of the bounced messages and decides not to
send any more, at least for some time. This assumption is supported by the fact that at one point spam increased notably, exactly when the filter was down and messages did not bounce. On that occasion, the spam level rose to approximately the values usually detected before the filter was installed.

8. Problems

There is no universal solution to spam. Each user has his/her own different requirements. Thus, for example, a person
working for an e-commerce site will probably not be willing
to lose a single client due to greylist challenges. In such
a case, that person has to be very careful with SMTP filtering,
if using it at all. Perhaps she/he can filter mails coming from
countries with which his/her company does not have trade
relations and also try a very soft content filter.
Some people are also concerned about losing an important
e-mail because of spam filtering. In a solution such as the one
shown here, a systematic check of rejected addresses is
a must. However, if applications like this become popular, people sending e-mail to an address for the first time will gradually
get used to possibly receiving a confirmation request from the
recipient. It may even come to be considered good e-manners, a kind of introduction perhaps. All this small individual
inconvenience is after all due to some people abusing common
Internet e-mail resources. To draw an analogy, at some time
and place there was no need for closed doors. As time passed,
they had to be closed for security reasons despite the small inconvenience of knocking before entering.
On the other hand, as stated, no verification of the stated MAIL FROM data is carried out. A spammer can simply lie about it. In order to be successful, the spammer must find an address on the recipient's whitelist. The response to this may consist of alerting the person whose address has been compromised and passing the address, at least for some time, to the blacklist. It is a sudden reaction and it requires an equally sudden counter-measure on the spammer's side. The situation balances out: it is a one-to-one fight. The spammer needs human staff to keep the lists working.
But one may guess that the important issue here is not the
security of the SMTP information. The MAIL FROM field can be
understood as a visiting card or as a sort of soft password
for a user's inbox. All strangers, or at least all who do not show a known address, become suspects. In agreement with Dominus, today's experience shows that spammers seldom
give valid return addresses, and even more seldom open
returned mail. At best, their software automatically discards
broken addresses.
Information inside the .blacklist file can also be stolen. This is certainly a very serious problem and drastic measures should be taken. It is also true that to do so the spammer needs to break the security of the host where the lists are kept, and it may not be worth the effort. If this comes to be a problem, .blacklist can be encoded with a hash function similar to the one used in the UNIX /etc/passwd. The drawback is that regular expression flexibility would be lost.
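A minimal sketch of that trade-off, assuming SHA-256 as the hash (the paper does not name one): entries become digests of exact addresses, so lookups still work but a wildcard pattern such as *.mars.com can no longer be expressed.

import hashlib

def digest(address: str) -> str:
    # One-way digest of an exact sender address (hash choice is an assumption).
    return hashlib.sha256(address.lower().encode()).hexdigest()

# The stored file would hold lines such as "<digest>:BLACK" instead of plain addresses.
stored = {digest("alienvacation@mars.com"): "BLACK",
          digest("granny@popularwebmail.com"): "WHITE"}

def classify(mail_from: str) -> str:
    # Exact-match lookup only: a pattern like *.mars.com cannot be hashed.
    return stored.get(digest(mail_from), "GREY")

print(classify("alienvacation@mars.com"))   # BLACK
print(classify("someone@else.org"))         # GREY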
Spammers can attack the filter using the null address (<>)
in MAIL FROM. RFC 2821 clearly states that all notification or error messages must set MAIL FROM to the null address to avoid
loops. Accordingly, it is not advisable to blacklist this address.
However, if spammers were to use it, that would, as a side effect, provide a sure method for identifying spam. Nevertheless, spam
messages would share null addresses with legitimate return
messages and this means an additional problem. The solution
may be to place certain codes in all sent messages so that the
software knows when a return message was really sent by us
and is not a fake.
It is fairly convenient to include local domain addresses in
the whitelist. Spammers can effortlessly take advantage of
this. In such a case, and if one wishes to keep the local domain
in the whitelist, one has to reject mail from local users coming
from outside or, alternatively, install a validated SMTP server.
An open SMTP relay must, of course, not be allowed for the
sake of the hygiene of the Internet e-mail system.
One last point claimed against greylist-like methods is that
they are poorly compatible with automatic response e-mail
systems: online stores, mailing lists, etc. Most users can work
with a temporary address, free from spam, for such purposes.
If this is not possible and whitelisting the relevant domain
does not work either, the user can directly register his/her
password address (gyermopera@tejo.fis.usal.es in the example) on those services. By making the e-mail address hold the
password instead of the message subject line, an additional advantage arises. We have plenty of fresh new addresses at our
disposal. If one of them falls, we change it and that's that.

9. Summary and future work

With little effort and with readily available software, a first barrier against spam proliferation has been accomplished at
the SMTP connection level. The proposed solution is 100%
compatible with current standards and older software. The results obtained are promising, especially considering that the
resources required were few. Although the solution has been
tested in a UNIX environment with Sendmail, it can be easily
ported to other operating systems or e-mail access methods,
including webmail.
The solution is highly customizable and is totally under the
users control. This relieves the system administrator of work
and liability. If solutions like this become popular, users are
expected to overcome adaptation problems.
There are plans to extend the system to more users and
add new features such as several passwords for the same account or the possibility of silently discarding e-mails (suitable
against indirect mail-bombing, i.e., mail-bombing based on
bounced e-mail).

Acknowledgements
This work has been partially supported by the Spanish Ministerio de Ciencia y Tecnología (FEDER funds, grant BFM2002-00033) and by the Junta de Castilla y León (grant SA107/03).


references

Asaravala A, et al. With this law, you can spam. Available from: <http://www.wired.com/news/business/0,1367,62020,00.html>.
Cournane A, Hunt R. An analysis of the tools used for the generation and prevention of spam. Computers & Security 2004;23:154–66.
Dominus MJ. My life with spam: Part 3. Available from: <http://www.perl.com/pub/a/2000/03/spam3.html>.
Graham P. A plan for spam. Available from: <http://www.paulgraham.com/spam.html>.
Grimes GA. Issues with spam. Computer Fraud & Security 2004;5:126.
Hinde S. Spam: the evolution of a nuisance. Computers & Security 2003;22:474–8.
Klensin J. Simple Mail Transfer Protocol (RFC 2821). Available from: <http://www.ietf.org/rfc/rfc2821.txt>.
Manes S. Kill spam with your own two hands. Available from: <http://www.forbes.com/forbes/2003/0623/136_print.html>.
McWilliams B. Swollen orders show spam's allure. Available from: <http://www.wired.com/news/business/0,1367,59907,00.html>.
Mertz D. Spam filtering techniques, six approaches to eliminating unwanted e-mail. Available from: <http://www-106.ibm.com/developerworks/linux/library/l-spamf.html>.
Miller MJ. Forward thinking. How spam solutions lead to more problems. PC Magazine December 2003:7.
Moore K. Simple Mail Transfer Protocol (SMTP) Service Extension for Delivery Status Notifications (DSNs) (RFC 3461). Available from: <http://www.ietf.org/rfc/rfc3461.txt>.
Ookoboiny G. Whitelist-based spam filtering. Available from: <http://impressive.net/people/gerald/2000/12/spam-filtering.html>.
Whitelist. PC Magazine February 2004:82.
Sender ID framework overview. Available from: <http://www.microsoft.com/mscorp/safety/technologies/senderid/overview.mspx>.
Using check_* in sendmail 8.8. Available from: <http://www.sendmail.org/wca/email/check.html>.
Spam: about the problem. Available from: <http://www.cauce.org/about/problem.shtml>.
Spam filters. How technology works. Technology Review April 2004:79.
The whole scoop in the configuration file. Available from: <http://www.sendmail.org/wca/email/doc8.12/op-sh-5.html#sh-5>.
Turing A. Computing machinery and intelligence. Mind 1950;59:236.
Vinther M. Intelligent junk mail detection using neural networks. Available from: <http://www.logicnet.dk/reports/JunkDetection/JunkDetection.htm>.

Guillermo Gonzalez-Talavan is a graduate in Physics and Computer Science from the University of Salamanca, in Spain.
He is a Tenured Lecturer at the Computer Science and Automation Department and is Head of the Computer Architecture
Area at the University of Salamanca. His main areas of interest
and research include Computer Architecture and Security,
and Operating Systems.
