
CSI 28th Annual Computer Security Conference October 29-31, 2001 Washington, D.C.

Cost-Benefit Analysis for Network Intrusion Detection Systems

Huaqiang Wei, Deb Frinke, Olivia Carter, Chris Ritter
Center for Secure and Dependable Software
University of Idaho
Moscow, Idaho 83844
huaqianw@cs.uidaho.edu

Abstract

Assessing the cost-benefit tradeoff of a network intrusion detection system requires an understanding of the effectiveness of the system and the cost of its employment. In this paper, we propose a cost-benefit analysis methodology and build a cost model based on an investigation of the cost factors and categories of various intrusions. The model can be used to quantitatively and qualitatively calculate the cost of detecting and responding to an intrusion, and to provide the advice necessary for determining the tradeoff between costs and benefits. Our overall goal is to use this model in a real-time network intrusion detection system; we therefore provide an example of our cost model's application in the cooperative intrusion detection system Hummer.

1. Introduction

Computer information security risk analysis and assessment has been a major research field since the beginning of modern network technology, especially the Internet. The Internet has changed the way people think, do business, and communicate with each other. By means of computer networks, we can shop, research, and conduct financial business online. The downside is that while online, we risk exploitation at the hands of others who may access our private information through the network. Many researchers and vendors have spent time and money on the development of sophisticated security monitoring tools [7, 9, 12, 16], such as firewalls and network intrusion detection tools. By implementing these tools, organizations may significantly reduce security risks. Unfortunately, most current detection mechanisms do not treat the cost of operating a network intrusion detection system as an important factor when deploying one.
In fact, the cost of operating the system, weighed against the benefits it brings to an organization, is an important part of the risk analysis and management of network administration, and has a great impact on the operation of a network intrusion detection system. Some form of cost-benefit analysis therefore becomes necessary. In this paper, we first review previous studies of information risk analysis technologies and methodologies. We then examine the network intrusion detection system Hummer [13]. Finally, we propose a cost-benefit analysis methodology and develop a cost model for that system.

2.0 Computer Security Risk Analysis

2.1 Literature survey

Security is the protection of information, systems, and services against disasters, mistakes, and manipulation so that the likelihood and impact of security incidents are minimized [22]. Security comprises confidentiality, integrity, and availability. Confidentiality issues arise because sensitive business information and processes must be disclosed only to authorized persons, so controls are required to restrict access to these objects. Integrity pertains to a business's need to control modification of objects, such as information and processes; controls are required to ensure that objects are accurate and complete. Finally, the information and services of a business must be available when needed, so controls are required to ensure reliability of services. When the security of a business is compromised, threats can affect the confidentiality, integrity, and availability of its assets, leading to potential loss and damage [22]. Security planning begins with risk analysis [7], which determines a network's exposure to threats and potential harm. It is an analytical process with a large number of variables, many of which are unique to the environment.
Many researchers and groups have studied risk analysis schemes, including the National Institute of Standards and Technology (NIST) [1, 5, 6, 7, 22, 23]. Even though the definitions of risk vary with culture, business, and environment, the core parts of risk analysis procedures are very similar. These include identification and analysis of assets and values, identification of threats and vulnerabilities, risk assessment, management control, and cost-benefit evaluation. Early on, R. Campbell [1] proposed a modular approach to computer security risk management. In his research, Campbell designed a model intended to provide a framework responsive to the needs of most environments. Within the model there are several sub-models, including value analysis, threat analysis and identification, vulnerability analysis, risk analysis, risk assessment, management decision, control implementation, and effectiveness review. Summers [6] proposed a similar four-step risk analysis procedure: 1) identify the assets and assign monetary values to them; 2) identify the threats and the vulnerabilities; 3) calculate the annual loss expectancy (ALE) of each threat; 4) identify potential safeguards and estimate how much they reduce exposure. Furthermore, S. Boran [22] designed top-down and bottom-up risk analysis methodologies to improve security. The bottom-up approach identifies what degree of protection a system or business needs and then determines the potential risk to the system. This method is fast but not very precise. The top-down approach is more meticulous and precise than the bottom-up, but it can be slower and more costly. It involves five steps: asset analysis, analysis of current security rules/policies/practices, definition of basic security objectives, threat analysis, and impact analysis. A similar approach developed by Pfleeger [7] concentrates on the calculation of expected annual loss, the cost of control, and the annual savings of control. Based on the research of the above authors and our own experience, we propose a similar procedure to analyze the risk of a network system.
This involves identification of the system's assets, values, and vulnerabilities based upon potential threats. It also includes risk assessment and prediction of the likelihood of an occurrence, which must be managed and controlled. It then computes the annual loss expectancy (ALE) for management and control, and performs a cost-benefit analysis.

2.2 Description of Our Risk Analysis Procedure

2.2.1 Identification of the Assets and Values

To start our analysis, we must identify the assets of a network system and their values. As with computer systems, the assets of a network system can be divided into several categories [25], which can in turn be divided into smaller elements. See Table 2.1.

Table 2.1: Cost Categories


COST CATEGORY           COST ELEMENTS
Equipment and Hardware  Computers (every kind), disks, tape drives, printers, telecommunications equipment, network systems, modems.
Software                Operating systems, utility programs, diagnostic programs, application programs.
Services                Commercially provided services, such as teleprocessing, local batch processing, on-line processing, Internet access, e-mail, voice mail, telephone, fax, and packet switching of data.
Supplies                Any consumable item designed specifically for use with equipment, software, services, or support services.
Personnel               The salaries (compensation) and benefits of persons who perform functions such as development, support, management, operation, and analysis for running the system.
Other resources         Any resources not included in the above categories.

We can further divide these assets into tangible and intangible assets. Tangible assets can be assigned dollar values, while intangible assets cannot. Tangible assets include physical assets such as disks, memory, CPUs, workstations, and servers. Intangible assets include logical assets, such as data and programs stored on disks. Because it is difficult to measure the dollar value of intangible assets, we assign them relative values. For example, one software element may be assigned five credits while another is assigned only three, because the first element is more important to the system than the second. Two additional concepts worth noting are criticality and sensitivity. Criticality refers to items that are critical to an operation and have the potential to cause major impacts on the organization. These impacts may be as drastic as loss of human life, though they are more likely to be business impacts, such as major destruction of corporate assets, revenue losses, embarrassment, and legal problems [5]. Sensitivity refers to the system's value or importance, and its vulnerability. A sensitivity evaluation must include privacy, trade secrets, planning information, and financial data [5]. Both criticality and sensitivity are important factors in determining the value of assets, services, and resources.
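A valuation step like this can be sketched in a few lines. The asset names, dollar figures, credit values, and interest rate below are illustrative assumptions, not values from any real inventory: tangible assets are discounted to present value with the standard formula P = F/(1+I)^n, while intangible assets carry relative credits.

```python
# Illustrative sketch only: asset names, dollar figures, credit
# values, and the interest rate are hypothetical assumptions.

def present_value(future_value, interest_rate, years):
    """Discount a book value to the present: P = F / (1 + I)**n."""
    return future_value / (1 + interest_rate) ** years

# Tangible assets get dollar values; intangible assets get relative credits.
tangible = {"server": 12000.0, "disk array": 4000.0}          # dollars
intangible = {"customer database": 5, "billing software": 3}  # credits

# Present value of the tangible assets after 3 years at 5% interest.
total_pv = sum(present_value(v, 0.05, 3) for v in tangible.values())
print(round(total_pv, 2))
```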

To properly assign values to assets, we need to consider their market value, depreciation, and discounted value. When an asset is first purchased, it is purchased at its market or book value. After a certain amount of time, the value of the asset decreases, resulting in depreciation. We then use the following formula to calculate the actual value of an asset at any given point in time [25]:

P = F / (1 + I)^n

where P = present value, F = future value, I = interest rate, and n = number of years.

2.2.2 Identification of Threats and Vulnerabilities

A threat is any action that can affect the security of assets and cause harm to a system in the form of destruction, disclosure, modification of data, and/or denial of service [5, 22]. Vulnerabilities are the weaknesses in the defense mechanisms of an information system. A threat is manifested by a threat agent using a specific technique, methodology, or spontaneous occurrence to produce an undesired effect on a network system [1]. To clearly identify risks, we must identify the various threat agents and the methodologies they use [5]. Normally, threats come from two types of agents: natural disasters and human beings. Natural disasters include power outages, thunderstorms, wind, fires, facility collapses, and earthquakes. They can damage equipment and render services unavailable. Although the destructive power and arbitrariness of natural disasters can make them seem quite dangerous, they are actually neither as prevalent nor as harmful as human beings. Since human beings create and use network systems, they are the most likely to harm them. They can make use of diverse technologies and cause more diverse damage, not only to hardware and other physical facilities but also to software and operating systems. In short, human attackers can disable a network without physically damaging it. To help organize the consideration of threats and assets, we create a table similar to Pfleeger's [7], as shown in Table 2.2.
Rather than a rigid tool, the table loosely guides our determination of the maximum threats to a network's assets.

Table 2.2: Assets and security properties

ASSET                   CONFIDENTIALITY          INTEGRITY                              AVAILABILITY
Equipment and Hardware                           Overloaded, destroyed, tampered with   Failed, stolen, destroyed, unavailable
Software                Stolen, copied, pirated  Trojan horse, modified, tampered with  Deleted, misplaced, usage expired
Services                Copied                   Overloaded, destroyed, tampered with   Failed, unavailable, destroyed
Supplies                                                                                Lost, stolen, damaged
Personnel                                                                               Quit job, retired, terminated, on vacation
Other resources                                                                         Other

2.2.3 Risk Assessment and Prediction of the Likelihood of Occurrence

A risk assessment evaluates identified risks to determine their relative impact on the facility, the information handled, the processing performed, the support being provided, and the mission accomplishment of the organizations being supported [1]. It also assesses the severity of the identified risks and weighs the likelihood of occurrence so that risks can be ranked according to their degree of acceptability. The risk assessment must summarize all the previous risk analysis results and present them to the appropriate levels of management for review and evaluation [5]. Likelihood of occurrence relates to the stringency of the existing controls and the likelihood that someone or something will evade them [22]. Pfleeger describes the following approaches to predicting likelihood [7]: 1) calculate the probability that the risk may occur, from observed data for the specific system; 2) estimate the number of occurrences in a given time period; 3) estimate the likelihood from a table: the analyst gives a rating based on several different risk analysis methodologies, then creates a table to hold and compare the ratings; 4) the Delphi approach: several raters individually estimate the probable likelihood of an event, combine their estimates, and choose the best one.

2.2.4 Computation of Annual Loss Expectancy (ALE)

Because of the complexity of assets and threats, it is difficult to estimate the precise cost of each event. We use annual loss expectancy (ALE) to represent the cost of each event over a year. Once we determine the cost of one event, we can calculate the ALE by multiplying that cost by the expected number of incidents per year. For example, one event with an expected cost of $20,000 may happen twice a year, while another event that costs $500,000 may occur once every 4 years. The ALE of the first event is $40,000, while the ALE of the second is $125,000. We calculate the total ALE for the organization by adding the ALEs of the events together.

2.2.5 Management and Control

The purpose of management and control is to evaluate identified risks according to their degree of acceptability, in consideration of the nature of the threats and vulnerabilities as they relate to risk, and to identify and select countermeasures that effectively reduce the risk [1]. In other words, given the existing controls, we calculate the expected loss; if the loss is unacceptably high, we implement new controls. For example, if a network intrusion's cost is too high, we evaluate and implement new network intrusion detection software and countermeasures. To effectively enforce the new controls, we need to develop and execute a plan to implement the countermeasures required to improve security and provide an acceptable degree of risk. This process includes selecting countermeasures, testing their effectiveness, and performing a cost-benefit analysis. Some suggested countermeasures are cryptographic controls, such as secure protocols and operating system protection features. Also available are identification and authentication countermeasures, such as access controls and physical controls [7, 22].
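The ALE arithmetic described above (Section 2.2.4) is easy to mechanize; the sketch below simply re-computes the paper's two example events in Python.

```python
def annual_loss_expectancy(cost_per_event, events_per_year):
    """ALE = cost of a single event * expected occurrences per year."""
    return cost_per_event * events_per_year

# The two events from the example: $20,000 twice a year, and
# $500,000 once every 4 years.
events = [(20_000, 2), (500_000, 1 / 4)]
ales = [annual_loss_expectancy(cost, rate) for cost, rate in events]
total_ale = sum(ales)
print(ales, total_ale)   # ALEs of $40,000 and $125,000; $165,000 total
```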
2.2.6 Cost-Benefit Analysis

The purpose of the cost-benefit analysis is to periodically review the effectiveness of planned and implemented security controls to determine whether they are doing what they are supposed to do, rather than creating additional vulnerabilities. It is used to support the management and control actions. After completing steps 2.2.1 to 2.2.5, we compute the true cost or savings from the implementation of new countermeasures. We then calculate the effective cost, which is the new countermeasure cost minus any reduction in ALE from the use of the new countermeasure [7].
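This effective-cost rule can be sketched directly; the figures below anticipate the worked example that follows (a $200,000 loss at 50% yearly likelihood, and an 85%-effective tool costing $40,000).

```python
def effective_cost(countermeasure_cost, ale_without_control, effectiveness):
    """New countermeasure cost minus the reduction in ALE it brings.

    A negative result means the countermeasure saves more than it costs.
    """
    ale_reduction = ale_without_control * effectiveness
    return countermeasure_cost - ale_reduction

# $200,000 recovery cost at 50% likelihood per year => $100,000 ALE.
ale = 200_000 * 0.5
print(effective_cost(40_000, ale, 0.85))   # -45000.0, i.e. $45,000 saved
```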

For example, one organization suffered a substantial loss because of a network intrusion: the company's important data was deleted, and recovery of the data cost $200,000. This incident may happen once every 2 years, so the company decided to install a sophisticated network intrusion detection tool costing $40,000. Table 2.3 lists the cost items and the benefits of this new tool. This example shows that we can employ risk analysis to evaluate the cost of controls and guide the choice among different security measures.

Table 2.3: Justification of a Network Intrusion Detection Tool

ITEM                                                        AMOUNT
Risk: disclosure and damage of company confidential data
Cost to recover data: $200,000 @ 50% likelihood per year    $100,000
Effectiveness of the tool: 85%                              -$85,000
Cost of tool                                                $40,000
ALE due to loss and control: $100,000 - $85,000 + $40,000   $55,000
Savings: $100,000 - $55,000                                 $45,000

2.3 Risk Management Software Tools

To automate and standardize the risk management process, most risk analysts adopt software tools [6, 14]. The functions of these tools include gathering and storing data, computing risk measures, evaluating the cost effectiveness of countermeasures, and presenting the results in effective forms, such as matrices and graphs. When an organization chooses its risk tools, it should weigh its requirements against the capabilities of the various programs. NIST [23] proposes a risk methodology and a set of standards and instructions for choosing risk analysis tools. The fundamental features of a risk analysis tool should be data collection, analysis, and result output. The data collection feature should provide a structure for gathering information, either textually or graphically, about the system under study. This step is necessary in order to maintain a description of the assets and their values to the organization. Data collection also includes threat and vulnerability analysis and identification.
The risk analysis feature analyzes the relationship between assets, threats, vulnerabilities, and controls. Finally, the tool displays its results; result presentation is an important consideration for any risk analysis tool. The Central Computer and Telecommunications Agency (CCTA) Risk Analysis and Management Methodology (CRAMM) [6, 14], adopted by the U.K. government, is one of the best-developed programs available. A CRAMM analysis begins by identifying the assets of a network, assigning values to them, and determining the potential impact of an intrusion. The four impacts are disclosure, modification, unavailability, and destruction. The method provides questionnaires for obtaining information about the assets. Assets can be organized into groups, within which they are analyzed together. CRAMM also includes a built-in list of generic threats, from which the program can generate the questionnaire, determine the strength of each threat, and rate the vulnerability as low, medium, or high. Finally, the program provides a countermeasure for each known potential threat, based on the vulnerability of the system. CRAMM is also upgradeable, supporting periodic updating of its threat list and defense mechanisms. The Aerospace Risk Evaluation System (ARIES) is another example of a quantitative risk methodology. It is a six-step procedure involving project planning, information gathering, risk element definition, screening and assessment of risk acceptability, cost-benefit assessment, and prioritization of control sets [6]. The Los Alamos Vulnerability/Risk Assessment system (LAVA) is a risk methodology developed by the Los Alamos National Laboratory. It is a systematic methodology for assessing vulnerabilities and risk in complex security systems. LAVA was developed to address large, complex systems that are generally too large for other risk analysis methods [6].

2.4 Hummer: A Cooperative Principle-Based Network Intrusion Detection System

2.4.1 Cooperative Principle for Network Intrusion Detection

Early intrusion detection systems required either a centralized architecture [12, 17] or a centralized decision-making point [10]. Recently, business technology has extended to networks, via either the Internet or intranets; as a result, most systems are no longer stand-alone.
The data that users share over these networks adds a new level of convenience but also introduces security risks; consequently, intrusions have become much more common. To detect an intrusion effectively, a cooperative effort between connected hosts is necessary, because a single host cannot identify the intrusion's source on its own. Therefore, many researchers have begun looking into cooperative intrusion detection. The cooperative principle was established based on investigations of traditional law enforcement techniques and agreements, such as Neighborhood Watch and the Israel-Jordan Peace Treaty [3]. It is intended to provide a way for sites lacking centralized control or a homogeneous security policy to cooperate in the detection and prevention of a widespread intrusion. The cooperative principle defines several cooperation relationships. The first is a manager/subordinate relationship, which occurs within a single network. A manager is a host that provides a set of subordinate or managed hosts with information regarding data collection, warnings, countermeasures, and so forth. Managers have some control over their subordinates, and the managed hosts' behavior should be aligned with the manager's behavior. The manager and subordinates pass messages to each other, and occasionally subordinates can manage themselves. We consider the manager's policies to be a superset of the managed hosts'. Another type of cooperation relationship involves peer groups. These exist across multiple networks, where trust is not necessarily reciprocated. A peer relationship is the least restrictive relationship: no restrictions are placed on the policies governing interactions between peers. Hummer (Hierarchical Management of Misuse Report Protocol) [3, 13] is a cooperative principle-based on-line network intrusion detection system. The Hummer protocol addresses the requirements needed to permit network sites to share security-relevant data while retaining local control of data-gathering activities, to decide locally how much trust to place in data received from outside sites, and to determine how much data to share with those sites. Some of Hummer's basic features are local control over policy decisions, autonomous but cooperative data collection, and separate enforcement of policy and identification of policy violations.
In addition, Hummer validates transactions and shares structure information both hierarchically and cooperatively. The data collector is layered and overlapping, and it provides both data reduction and sanitization. Lastly, Hummer contains an easy mechanism for adding new data collection tools and storing new types of data, as well as adjusting data collection granularity, with a common format between hosts but potentially specialized formats permitted within domains.

Figure 2.1: Architecture of Hummer

2.4.2 System Architecture [13]

In the following section, we explain the major components of the Hummer system; Figure 2.1 displays its architecture. In this paper, the term Hummer refers both to a single machine (IP address) running the Hummer system and to the system as a whole, while Hummers refers to multiple Hummer packages running on several machines across one or more networks. The Hummer system is composed of five components: the Hummer Server, Configuration (Config) Server, Peer Server, Message Server, and Tool Server.

Figure 2.2: Components of Hummer Server

The Hummer Server (Figure 2.2) is considered the main server, and it runs every other server. A client called HummerI (Hummer Interface) connects to the Hummer Server. Upon connection to the server, a text menu appears with several options. The user can start or stop the other servers individually or simultaneously. One option in the Hummer Interface menu provides administration of accounts, specifically operations relating to the keystore. These operations include listing the users in the keystore, displaying access values for a user, adding a new user, adding a new server, adding new peers, adding accounts, and reinitializing accounts. There is also an option to connect to a Tool Server interface through the running Hummer Server. The Config Server handles all major changes to the functionality of Hummer and offers multiple operations the user can perform. One function is to set appropriate filters for the Message Server to apply. These filters contribute to the system by alerting the console and/or logging incoming information to a database. The filters on the Config Server can forward messages to certain Hummers and peer groups. The user can implement this filtering by using the Config Server to specify the IP addresses from which Hummer receives messages. Two types of filters exist: local filters and inherited filters. Local filters are those set up specifically for this Hummer; inherited filters are filters pushed to a Hummer by its managing Hummer. The Config Server also controls the ability to set up hierarchical relationships between Hummers. A computer network can consist of one or more managing Hummers under which subordinate Hummers reside. A Hummer can maintain this manager/subordinate relationship fairly easily. A manager can push its local filter configuration to its subordinates; a manager can change its connected subordinates, and a subordinate can change its manager. However, a manager/subordinate relationship cannot exist unless a two-way handshake between machines takes place: the manager must first add the subordinate Hummer, and then the subordinate must add the managing Hummer, establishing a mutual relationship between the two. Two additional aspects of the Config Server are levels and kill files. Levels, made up of associated Trust, Integrity, and Cooperation (TIC) values, apply to messages received from other Hummers. The TIC values range from 1 to 5, with 5 being the highest. The trust value refers to the degree to which the current Hummer trusts the other system's updates and monitoring scheme. The integrity value reflects the program's level of confidence that no manipulation of the messages has occurred. Cooperation defines the willingness of one Hummer to devote resources to aiding another Hummer. The kill files implement an "Allow Everything Except For" policy, causing the Message Server to dump messages received from any Hummer that appears in the kill files. The Config Server also manages peer groups.
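The kill-file and filter behavior described above can be illustrated with a short sketch. This is our own simplification, not Hummer's actual code: the names, message format, and policy details are assumptions. A message from a kill-file sender is dumped outright ("Allow Everything Except For"); otherwise it is kept only if one of the configured filters, each a regular expression, matches its contents.

```python
import re

# Hypothetical illustration of Hummer-style filtering; the IP
# addresses and filter patterns are invented for this example.
kill_file = {"10.0.0.66"}                        # senders to dump outright
filters = [re.compile(r"login failed"),          # e.g. a local filter
           re.compile(r"port scan")]             # e.g. an inherited filter

def keep_message(sender_ip, contents):
    """Apply the kill file, then keep the message only on a filter match."""
    if sender_ip in kill_file:        # "Allow Everything Except For" policy
        return False
    return any(f.search(contents) for f in filters)

print(keep_message("10.0.0.66", "port scan detected"))  # False: killed sender
print(keep_message("10.0.0.5", "port scan detected"))   # True: filter match
```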
When a request to subscribe to or unsubscribe from a peer group occurs, the Config Server connects to the Peer Server. The Config Server handles all changes performed in the peer groups other than subscribing and unsubscribing. Another action of the Config Server is the updating of keystores: the Config Server updates keystores each time a Hummer either adds a new user or subscribes to a peer group, allowing connections from the new member to be made to the server. The Config Server is also the main connection to the database and controls all requests for information from it. A managing Hummer connects to the Config Server when pushing its filter configuration to the corresponding subordinates; this allows the managing Hummer to contact and send information to all of its associated subordinates. The Peer Server is the main point of contact for any subscriptions, unsubscriptions, or messages passed to peer groups. Hummer prohibits all means of subscription to a peer group unless performed through the moderator. After acceptance, Hummer contacts the moderator's Peer Server; the Peer Server adds the Hummer and sends the appropriate information to update the Hummer's database. When a member of a peer group wishes to unsubscribe from that group, it connects to the moderator's Peer Server, which deletes the Hummer from the database, and the moderator removes the Hummer from the peer group. For communication in a peer group, a message sent to the peer group also proceeds to the Peer Server that moderates the group. The moderator first increases the age of the message, then checks whether a peer relationship exists between the sender and the proposed receiver of the message. The moderator then sends the message on to the members of the peer group. Connections to the Peer Server are made automatically by either the Config Server (subscription and unsubscription) or the Message Server (passing messages to the peer group). Another part of Hummer is the Message Server. Hummer sends all generated messages through the Message Server. These messages can come from several different sources: the tools, other Message Servers, Peer Servers, and command lines through PeerI and ToolI. The following is the path that a message takes through the Message Server. Upon receipt of a message by the Message Server, the age of the message is increased by one. The Message Server checks the age of the message against an age limit residing in the Hummer's properties file. Next, the IP address where the message originated is checked against any IPs that reside in the kill file.
If the IP address resides in the kill file, then Hummer dumps the message; otherwise, the message continues to the next phase. If the message proceeds, the Message Server compares each filter with the contents of the message, looking for a match. Each filter consists of a regular expression that is matched against the information output by the system. A match tells Hummer to keep the message and continue through the cycle. The Message Server then checks the accept list, which holds the filters that implement an "Allow Nothing Except" policy; an empty accept list means there are no restrictions on which IP addresses may send messages to this server. When all the checking is complete, the Message Server decides on an action to take. The Message Server may proceed to log the message to the database. It then decides whether the message should continue to the Alert Console (HAC, the Hummer Alert Console). Next, the Message Server forwards the message to the Hummers listed in the forward list, as well as to any peer groups that were set by the filters. Finally, all other Message Servers that receive the message perform the same checks. The last major element of Hummer is the Tool Server, which controls all of the tools that gather the system information sent to the Message Server. Each of these tools uses ToolI to send its messages to Hummer. ToolI is a program that runs from the tools and allows the passage of information from the tools to Hummer. HummerI, discussed earlier, gives access to a tool administration menu on the Tool Server, with options such as starting or stopping all tools, starting or stopping one tool, and determining the status of the tools. Because of Hummer's unique features and robust extendibility, we chose to combine it with our cost-benefit analysis model.

3. Cost Sensitive Model for Network Intrusion Detection

3.1 QoS Application in Computer Security

Quality of service (QoS) has long been studied and applied in network technology and resource management [26, 27, 28]. Aurrecoechea and Campbell systematically studied the architecture of QoS [26]. Irvine proposed the concept of quality of security service (QoSS), along with a cost method and taxonomy for QoSS [27, 28]. Since QoS involves users' requests for different levels of service, which are related to performance-sensitive variables [27], security service can naturally be viewed as one of those services. In this research, Irvine defines security as a dimension of quality of service [28]. Security services include data confidentiality, integrity, traffic flow confidentiality, authenticity, non-repudiation, availability, audit and intrusion detection, and boundary control.
For each security service, Irvine also defines three service areas: End System (ES), Intermediate Node (IN), and Network Connection (NC). An example of an ES is a client or server system; routers and switches are examples of INs; and NC indicates the wires that connect systems and nodes. In addition to these three service areas, Irvine defines Total Subnet (TS), a service area that can't be assigned exclusively to IN, NC, or ES. For each security service and service area, at least one security mechanism is defined. For example, to protect data confidentiality, the operating system and cryptographic credentials are defined as the security mechanisms in the ES and INs; for intrusion detection, the mechanisms are auditing of network control functions in the IN and rule-based network intrusion detection systems in the TS. Irvine also discusses the costs of those security services and mechanisms; Table 3.1 gives cost examples [28].

Table 3.1: Security Cost Examples
Security Service         Area  Mechanism                  Cost Measure
----------------------------------------------------------------------------
Data Confidentiality     NC    Link-layer 40-bit DES      Processor clocks
                                                          per byte
Message Non-Repudiation  ES    Remote non-repudiation     2n bytes per message
                               service                    of network bandwidth,
                                                          plus c clocks per
                                                          byte
Intrusion Detection      TS    Experimental system        N Mbytes per second
                                                          of overall bandwidth,
                                                          plus m instructions
                                                          per second, plus b
                                                          bytes per second of
                                                          storage
The example table shows that the measurement units are impractical and hard to use, even though Irvine proposes a clear costing method and taxonomy. Moreover, he does not provide a quantitative and qualitative cost-benefit analysis, or cost-benefit tradeoff criteria, for computer security services.

3.2 Cost Model for Network Intrusion Detection

Network intrusion detection technologies, as part of risk management, have been studied for more than a decade, but most systems [12, 17] are concerned only with detection itself, attempting by brute force to catch every possible intrusion while ignoring technical effectiveness [21]. However, it is impossible to catch every attack, and impractical to employ an extremely restrictive network intrusion detection system. Furthermore, the cost of detection and countermeasures can be much higher than the benefit. This is why the cost-benefit tradeoff is one of the most important parts of a network intrusion detection system: it can be used to decide whether the system is valuable enough to employ countermeasures to stop an intrusion. Lee and Stolfo's research [20, 21] on cost modeling for network intrusion detection systems follows a risk analysis procedure to select sensitive data and assets and to create a cost matrix for each intrusion. They divide the cost items into damage cost, operation cost, and response cost, and combine them to calculate the total cost of each intrusion. Damage cost (DamageCost) represents the maximum

amount of damage to an attack target when the intrusion detection system and other protective measures are unavailable or ineffective. Response cost (ResponseCost) is the cost of responding to the intrusion, including the actions taken to stop the intrusion and reduce the damage. These actions, or countermeasures, should be defined during the risk analysis process according to the specific threats. Operation cost (OperationCost) is the cost of processing the stream of events monitored by an intrusion detection system and analyzing the activities with intrusion detection models. Lee [21] divides the features used for detection into three levels. Level 1 features use few resources and can be obtained at the beginning of an event; for example, the destination service can be determined from the first packet of a connection. Level 2 features use a moderate amount of resources; they can be computed at any point during an event and are maintained throughout the event's duration, for example, the number of data bytes transferred during an FTP session [13]. Level 3 features use resources from several events within a window, or from different hosts. Lee assigns a different cost weight to each level: a level 1 feature may cost 1 to 5, a level 2 feature 10, and a level 3 feature 100. After the cost factors are defined, cost values can be assigned during risk analysis and assessment, producing the related cost matrix. Finally, they propose a cost model:
Cost_total = Σ (i = 1 … N) [ CCost(e_i) + OperationCost(e_i) ]        (1)

where Cost_total is the total cost, N is the number of events, and CCost(e_i) is the consequential cost of the network intrusion detection system's prediction for intrusion event e_i, determined by the damage cost and the response cost. Researchers [18, 19, 21] have identified five prediction cases: False Negative (FN), True Positive (TP), False Positive (FP), True Negative (TN), and Misclassified Hit. The False Negative (FN) cost is the cost of not detecting an attack. It is incurred either by a site that has not installed an intrusion detection system, or by one whose intrusion detection system malfunctions and mistakenly ignores an attack. In this case the attack succeeds and the target resource is damaged, so the FN cost is defined as the damage cost of the attack.

True Positive (TP) cost is incurred when an attack is correctly classified; it is the cost of detecting the attack and responding to it. To determine whether a responsive action should be taken, the response cost and the damage cost must be compared. If the damage cost is less than the response cost, the intrusion is only recorded and otherwise ignored, and the total loss is the damage cost. If instead the response cost is less than the damage cost, the intrusion is countered, and the loss is the response cost. In a real-time situation, since the attack may still be in progress when it is detected, the incurred damage is a fraction of the maximum damage cost, represented as Progress x DamageCost, where Progress is the percentage of the attack completed. False Positive (FP) cost occurs when a normal event is incorrectly classified as an attack; for example, a network intrusion detection system may misidentify a simple connection as an attack. If the response cost is less than the damage cost, action is taken and the total loss equals the response cost, which is an unnecessary loss because no attack has occurred. If the damage cost is less than the response cost, the system takes no action, and there is no loss or damage. The True Negative (TN) cost is always 0; it is incurred when the system correctly decides that an event is normal. The Misclassified Hit cost is incurred when the wrong type of attack is identified. If the response cost is less than the damage cost, a response is taken to stop the presumed attack; since that action does not address the actual attack, some damage cost accrues as the true attack progresses. To evaluate his model, Lee [21] chose an attack taxonomy example from DARPA, which we also use in our model (Table 3.2). Lee also studied a mechanism for reducing the operation cost by applying machine learning technology in the intrusion detection model.
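The five prediction cases and their decision rules can be sketched as a single function. This is our own illustration of the logic summarized above, not Lee's implementation; the function name, argument names, and the returned action labels are our choices, and the TP/Misclassified cases use the real-time formulation in which incurred damage is Progress x DamageCost.

```python
def consequential_cost(outcome, damage, response, progress=1.0):
    """Return (CCost, action) for one prediction outcome.

    outcome  -- 'FN', 'TP', 'FP', 'TN', or 'MISCLASSIFIED'
    damage   -- maximum damage cost of the (true) attack
    response -- cost of the countermeasure
    progress -- fraction of the attack completed when it is detected
    """
    if outcome == 'FN':            # attack missed: full damage is incurred
        return damage, 'none'
    if outcome == 'TN':            # normal event correctly ignored: no loss
        return 0.0, 'none'
    if outcome == 'TP':
        if response < damage:      # worth responding; attack partly done
            return response + progress * damage, 'respond'
        return damage, 'ignore'    # cheaper to absorb the damage
    if outcome == 'FP':
        if response < damage:      # unnecessary response to a normal event
            return response, 'respond'
        return 0.0, 'ignore'       # no attack, no action, no loss
    if outcome == 'MISCLASSIFIED':
        # the response targets the wrong attack, so the true attack
        # progresses and still causes some damage
        if response < damage:
            return response + progress * damage, 'respond'
        return damage, 'ignore'
    raise ValueError('unknown outcome: %s' % outcome)
```

For example, a correctly detected DOS crashing attack (damage 30, response 10) caught halfway through yields a consequential cost of 10 + 0.5 x 30 = 25.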
Finally, he performed empirical experiments to verify the effectiveness of the model. In our research, we borrow from Lee's work in defining the cost factors and the basic procedure. To verify our model, we adopted DARPA's attack taxonomy, but improved it by dividing the attack categories further. Finally, we study ways to combine our cost model with Hummer.

Table 3.2: Attack Taxonomy used by DARPA and Lee [21]

1. ROOT -- Illegal root access is obtained.
   1.1 local: by first logging in as a legitimate user on a local
       system, e.g., a buffer overflow in a local system program such
       as eject. DamageCost = 100, ResponseCost = 40.
   1.2 remote: from a remote host, e.g., a buffer overflow in a daemon
       running suid root. DamageCost = 100, ResponseCost = 60.

2. R2L -- Illegal user access is obtained from outside.
   2.1 single: a single event, e.g., guessing a password.
       DamageCost = 50, ResponseCost = 20.
   2.2 multiple: multiple events, hosts, or days.
       DamageCost = 50, ResponseCost = 40.

3. DOS -- Denial of service of the target is accomplished.
   3.1 crashing: a single malicious event (or a few packets) crashes a
       system, e.g., the teardrop attack. DamageCost = 30,
       ResponseCost = 10.
   3.2 consumption: a large number of events exhausts network bandwidth
       or system resources, e.g., synflood. DamageCost = 30,
       ResponseCost = 15.

4. PROBE -- Information about the target is gathered.
   4.1 simple: many probes within a short period of time, e.g., a fast
       port scan. DamageCost = 2, ResponseCost = 5.
   4.2 stealth: probe events distributed sparsely across long time
       windows, e.g., a slow port scan. DamageCost = 2,
       ResponseCost = 7.
3.3 Our Cost Model for Hummer

Based on our investigation of Lee's research [20, 21], we build our own cost model, which considers costs not only from multiple events but also from multiple hosts, making it more comprehensive. The first formula therefore becomes:
Cost_total(e) = Σ (i = 1 … N) Σ (j = 1 … H) DamageCost(i, j) + ResponseCost(e) + OperationCost(e)        (2)

where H is the number of attacked hosts. To simplify the cost model, we initially consider only the individual attacks detectable by intrusion detection systems, which are relatively easy to detect and process. For example, in a complicated intrusion, an external intruder first uses port scanning to map the address space of the network, then illegally acquires user-level access, and finally obtains root access, which he or she uses to damage the network. The attack involves several sequential actions that we can

divide into three individual attacks: port scanning, illegal access to the network, and root access. Since the model is intended for real-time use, faster processing improves the ability to reduce damage. Once the model is complete, we may enhance it by enlarging the time window and event window to handle more complicated, coordinated attacks.

We have built specific rules to detect each intrusion category and case. For example, for a Ping of Death attack we define the rule: if data_size >= 65535 and protocol = ICMP, then Ping of Death. For a synflood intrusion we define the rule: if packet = SYN and IP stack loop_number > 5, then synflood. Since the operation cost is determined by the computing resources used to detect the intrusion, the choice of rule set strongly influences the operation cost.

3.4 Implementation of the Cost Model on the Hummer System

Figure 3.1: Data flow chart between the Cost Model and the Hummer servers
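The two example detection rules from Section 3.3 can be sketched as simple predicates over a packet record. This is only an illustration: the dictionary-based packet representation and the field names (protocol, data_size, flags) and the half-open connection count are our own assumptions, not part of Hummer.

```python
def is_ping_of_death(packet):
    """Rule: if data_size >= 65535 and protocol = ICMP, then Ping of Death."""
    return packet['protocol'] == 'ICMP' and packet['data_size'] >= 65535

def is_synflood(packet, half_open_count):
    """Rule: if packet = SYN and IP stack loop_number > 5, then synflood.

    half_open_count stands in for the IP stack loop_number of the rule.
    """
    return packet.get('flags') == 'SYN' and half_open_count > 5
```

In an on-line setting such predicates would be evaluated against each monitored event before the cost-benefit analysis decides whether a response is worthwhile.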

The cost model could be developed into an independent model for off-line analysis; however, the primary objective of this research is to develop an on-line system for real-time use. To fulfill this goal, we propose combining the cost model with Hummer [3, 13]. The most important parts of such a combination are the interface and the data communication between the intrusion detection function and the cost analysis model. We describe this communication in Figure 3.1.

In the Hummer system, the Message Server, Message Handler, Alert Server, and Alert Console are the key components. Before the cost-benefit analysis model is combined with Hummer, the Message Server collects data from the other tools and employs the Message Handler, which verifies the intrusion and logs messages to the database. If an intrusion occurs, the Message Handler forwards a message to the Alert Server and displays the alert message on the Alert Console.

After the cost-benefit analysis model is added, when the Message Handler determines that an intrusion is occurring, it first sends the intrusion's name, status, target name, and other related information to the cost-benefit analysis system. That system analyzes the costs and benefits, calculates the cost, compares it with the alternatives, and sends the result back to the Message Server and Message Handler. If the cost of responding to the intrusion is higher than the benefit, the Message Handler relabels the message and does not send it to the Alert Server, so the message is never displayed on the Alert Console. If the cost is less than the benefit, the Message Server sends the message to the Alert Server and the Response Server, and the Response Server takes action, based on the suggestion of the cost-benefit analysis, to stop the intrusion.

4. System Verification and Cost Modeling Simulation

Before we move the model into real-time use, we need to perform a simulation and verify the system off-line to make sure it works for all cases. To test the system, we must gather all possible attack data and train the rule set; the rule set must be verified and the threshold values determined during this training process. Only then can we move the system to an on-line test. To test our model, we gathered and used the following attack data from current network intrusion detection systems [11, 15, 17, 18].

4.1 Test Case 1: TearDrop Attack

The TearDrop attack is a very common attack type. It is initiated by sending multiple fragmented IP packets that, when reassembled, have overlapping data portions that create bad data fragments and make the system unstable. The protocol used is UDP. The attack can be detected by a rule that checks the number of bad fragments and the protocol. For example, at 12:05 a.m. we receive a message that there are five bad fragments and that the protocol is UDP. After checking the intruder's IP address, we determine that this is an external attack. From the rule set, we know that this is a TearDrop attack belonging to the DOS crashing category, with a damage cost of 30. The response action is termination of the session, which costs 10. Since the damage cost is higher than the response cost, the attack is countered. The resource we use is a level 2 feature that costs 10, so we apply the cost model to calculate the total cost:
Cost_total = Progress x DamageCost + ResponseCost + OperationCost

We estimate that the attack is halfway complete, so Progress is 0.50, and the total cost is 0.5 x 30 + 10 + 10 = 35.

4.2 Test Case 2: Ping of Death

The Ping of Death, also a DOS attack, is generated by sending multiple fragmented ICMP packets that, when reassembled, have a data portion greater than 65,535 bytes. Because this violates the TCP/IP specification, it causes the

TCP/IP stack on a vulnerable computer to crash. To detect this attack, we need to check the size of the data portion of the packet and the protocol. The damage cost is 30 per computer, and since the attack spans three hosts, the total damage cost is 90. The response cost is 30 to terminate the sessions, plus 5 to notify the administrator. Since the damage cost is higher than the response cost, we decide to stop the attack. The attack has only recently begun, so we estimate its progress at 10%. With an operation cost of 10, the total cost is 90 x 0.1 + 30 + 5 + 10 = 54.

4.3 Test Case 3: Coordinated Attack

A coordinated attack includes several steps and sub-attacks (simple attacks). Since our model can currently evaluate only simple attacks, we must divide a coordinated attack into its constituent simple attacks and evaluate them individually. One day, at 11:30:20 p.m., a network intrusion detection system detects a port scan from outside that identifies one machine. Since the response cost (5) is higher than the damage cost (2), the system does not stop the attack. Several minutes later, at 11:32, the intruder attempts to log into the identified machine via telnet. The attacker first tries the lp account, which is password protected, and then demos, which is not. The attacker then accesses the password files of both the computer and the network and tries to modify them. The network intrusion detection system detects this unusual activity and classifies it as a remote root access attack. The damage cost for such an attack is 100, and the response cost can be as high as 60, which includes terminating the connection session and notifying the administrator. Since the damage cost is higher than the response cost, the system decides to take action to stop the attack. Because the password file has already been changed, the attack is fully complete (Progress = 1.0); with a level 2 operation resource, the total cost is 100 + 60 + 10 = 170.
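The three test cases can be checked with a few lines of code applying the paper's total-cost formula, Cost_total = Progress x DamageCost + ResponseCost + OperationCost (for the multi-host Ping of Death, the damage cost is summed over the attacked hosts). The helper function below is our own sketch; the numbers are taken from the cases above.

```python
def total_cost(damage, response, operation, progress):
    """Total cost of a detected, countered attack."""
    return progress * damage + response + operation

# Test case 1: TearDrop (one host, attack halfway complete)
teardrop = total_cost(damage=30, response=10, operation=10, progress=0.5)

# Test case 2: Ping of Death (three hosts at 30 each; response is
# 30 to terminate the sessions plus 5 to notify the administrator)
ping_of_death = total_cost(damage=3 * 30, response=30 + 5,
                           operation=10, progress=0.1)

# Test case 3: remote root access (attack complete, so progress = 1.0)
root_access = total_cost(damage=100, response=60, operation=10, progress=1.0)

print(teardrop, ping_of_death, root_access)  # 35, ~54, 170
```

The results reproduce the totals computed by hand: 35 for TearDrop, 54 for Ping of Death, and 170 for the remote root access attack.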
Because the password file has been modified, every password must be changed and the operating system reinstalled. From the above attack cases, we find that our cost model gives proper cost results as long as the network intrusion detection system gives correct detection results. Combining the model with a real-time network intrusion detection system is the future work of this research, and its most important part is the interface and data transfer between the different components.

5. Conclusion and Future Work

5.1 Conclusion

The research objectives of this paper were the quantitative and qualitative analysis of security risks in a distributed network environment, the creation of a cost model, and the determination of the cost-benefit tradeoff of a network intrusion detection system. To fulfill these objectives, we investigated risk analysis methodologies and tools, including the identification of assets, threats, and vulnerabilities, as well as ALE calculation, risk assessment, management control, and cost-benefit analysis. We then introduced the cooperative principle through the cooperative-principle-based network intrusion detection system Hummer. Furthermore, we investigated several researchers' work on QoS applications in computer security and on cost-sensitive models for network intrusion detection systems. Based on this investigation, we proposed our own model, which combines a cost-benefit tradeoff with the distributed network intrusion detection system Hummer. Finally, we evaluated real attack cases and, through our model, successfully computed the total costs of the attacks.

5.2 Future Work

This model has not yet undergone enough training to be used in commercial applications. Our ultimate goal is to develop a practical model that can be used in a real-time, on-line network intrusion detection system. The real-time employment of this model will be part of future enhancements to the Hummer system. In addition to cost modeling and intrusion detection functions, an automatic response function is very important to a network intrusion detection system. With the cost model and an automatic response system, a network intrusion detection system can both detect an attack and decide whether it is worth stopping. If it is, the system can automatically employ countermeasures.

6. References

[1] R. P. Campbell et al., "A Modular Approach to Computer Security Risk Management," AFIPS Conference Proceedings, AFIPS Press, 1979.
[2] D. Denning, Information Warfare and Security, Addison-Wesley, 1999.
[3] D. Frincke et al., "Principles of Cooperative Intrusion Detection for Network-Based Computer Sites," University of Idaho, 1998.

[4] L. LaPadula, "Rule-Set Modeling of a Trusted Computer System," in Information Security: An Integrated Collection of Essays, pp. 178-241, IEEE Computer Society Press, 1995.
[5] P. Fites et al., Information Systems Security, Van Nostrand Reinhold, New York, 1993.
[6] R. Summers, Secure Computing, McGraw-Hill, 1997.
[7] C. Pfleeger, Security in Computing, Prentice-Hall, 1997.
[8] D. Parker, Fighting Computer Crime, John Wiley & Sons, 1998.
[9] P. Proctor, The Practical Intrusion Detection Handbook, Prentice-Hall, 2001.
[10] G. White et al., "Cooperative Security Managers: A Peer-Based Intrusion Detection System," IEEE Network, Jan./Feb. 1996.
[11] T. Dunigan et al., "Intrusion Detection and Intrusion Prevention on a Large Network: A Case Study," http://www.usenix.org/publications, 1999.
[12] K. Richards et al., "Network Based Intrusion Detection: A Review of Technologies," Computers & Security, 18, pp. 671-682, 1999.
[13] D. Frincke et al., "Hummer: A Cooperative, Collaborative Intrusion Detection System," University of Idaho.
[14] J. Eloff et al., "A Comparative Framework for Risk Analysis Methods," Computers & Security, 12, pp. 597-603, 1993.
[15] L. Labuschagne et al., "The Use of Real-Time Risk Analysis to Enable Dynamic Activation of Countermeasures," Computers & Security, 17, pp. 347-357, 1998.
[16] T. Lunt, "A Survey of Intrusion Detection Techniques," Computers & Security, 12, pp. 405-418, 1993.
[17] M. Denault et al., "Intrusion Detection: Approach and Performance Issues of the SECURENET System," Computers & Security, 13, pp. 495-508, 1994.
[18] F. Cohen, "Simulating Cyber Attacks, Defences, and Consequences," Computers & Security, 18, pp. 479-518, 1999.
[19] F. Cohen et al., "A Cause and Effect Model of Attacks on Information Systems," www.all.net.
[20] S. Stolfo et al., "Cost-Based Modeling for Fraud and Intrusion Detection: Results from the JAM Project," Technical Report, Columbia University.
[21] W. Lee et al., "Toward Cost-Sensitive Modeling for Intrusion Detection and Response," North Carolina State University, 2000.
[22] S. Boran, IT Security Cookbook, http://www.boran.com/security.
[23] I. Gilbert, Guide for Selecting Automated Risk Analysis Tools, NIST, http://www.NIST.gov.
[24] C. Irvine et al., "Toward a Taxonomy and Costing Method for Security Services," Proceedings of the Computer Security Applications Conference, Phoenix, AZ, 1999.
[25] Cost-Benefit Analysis Guide for NIH IT Projects, www.itpolicy.gsa.gov.
[26] C. Aurrecoechea, A. Campbell, et al., "A Survey of QoS Architectures," ACM/Springer-Verlag Multimedia Systems Journal, Special Issue on QoS Architecture, Vol. 6, No. 3, pp. 138-151, May 1998.
[27] C. Irvine et al., "Quality of Security Service," http://cs.nps.navy.mil.
[28] C. Irvine et al., "Toward a Taxonomy and Costing Method for Security Services," Proceedings of the Annual Computer Security Applications Conference, Phoenix, AZ, Dec. 1999.
