
Quantitative Risk Analysis of Computer Networks

A Thesis Submitted to the Faculty in partial fulfillment of the requirements for the degree of Doctor of Philosophy

by Daniel Bilar
Thayer School of Engineering, Dartmouth College
Hanover, New Hampshire, June 2003

Examining Committee: George Cybenko (Chairperson), Robert Morris (Member), Robert Gray (Member), Sue McGrath (Member); Carol L. Folt, Dean of Graduate Studies

© 2003 Trustees of Dartmouth College

Thayer School of Engineering, Dartmouth College
Quantitative Risk Analysis of Computer Networks
Daniel Bilar, Doctor of Philosophy
Committee: George Cybenko, Robert Morris, Robert Gray, Sue McGrath

ABSTRACT

Quantitative Risk Analysis of Computer Networks (QSRA) addresses the problem of risk opacity of software in networks. It allows risk managers to get a detailed and comprehensive snapshot of the constitutive software on the network, assess its risk with the assistance of a vulnerability database, and manage that risk by rank-ordering measures that should be taken to reduce it, subject to cost, functionality and risk constraints. A theoretical methodology is proposed and a prototype implementation has been developed. Six out-of-the-box popular operating systems were studied using the methodology and the prototype. We find that around 75% of discovered vulnerabilities are patchable within two weeks, and around 90% within 40 days after initial discovery. We find a statistically significant time window difference between security-audited and non-security-audited software. Across the operating systems, the majority of faults give rise to availability and full compromise consequences. There is a statistically significant difference between fault types: input validation faults are proportionally over-represented. There is a statistically significant difference between consequence types: full compromise consequences are proportionally over-represented. There is, however, no statistically significant fault or consequence proportion difference between the audited systems. QSRA's risk assessment model calculated that for all audited systems, four to six months after their respective release date, the probabilities are very high (66% to 99%) that an attacker can conduct a full consequence compromise, remotely and locally. Risk management analysis for remote risk probabilities indicates that, given a moderate fault count, QSRA's analytic "highest risk" mitigation strategy consistently outperforms the simpler strategy of choosing software with the highest vulnerability count. "Highest risk" outperforms the undifferentiated "highest count" strategy for at least four out of the six tested operating systems and for four out of five fault consequences.

I would like to thank everybody who helped me. George Cybenko, my adviser and mentor, who showed me that one can routinely have two brilliant ideas a day and still be a decent and kind human being. My tender, kind and loving girlfriend Daniella Hirschfeld, who brought me sandwiches when I was starving writing this thesis. Biiiiig kiss, my sweetheart! NH Senator Gregg, Dean Duncan and George again, for getting Washington, DC to finance my excellent work environment. Susan McGrath, who runs the IRIA research group with great success and a winning presence. Bob Gray, a kind man, for always finding time to field questions while preparing four papers at the same time. Guofei Jiang, whose unassuming demeanor belies his immense work ethic and sparkling intelligence. Robert Morris, as an example that truth can be far stranger than fiction. The excellent teachers I had at Thayer, including but not limited to Prof. Eric Hansen and Prof. Ulf Oesterberg. And last but not least, all those people who follow Wordsworth's dictum, "That best portion of a good man's life; his little, nameless, unremembered acts of kindness and love": I remember.

Contents

1 Introduction
  1.1 Problem statement
  1.2 Feasibility
  1.3 Risk
    1.3.1 Fault Tree Analysis
    1.3.2 Previous work on cybersecurity risk
  1.4 Major findings
    1.4.1 Parameters and Vulnerability Data
    1.4.2 Faults and consequences
    1.4.3 Design and Implementation of the QSRA system
    1.4.4 QSRA empirical results

2 Design of the QSRA system
  2.1 Findings
  2.2 Overview
    2.2.1 Software quality, faults, failures
  2.3 Process sequence
  2.4 Risk Assessment
    2.4.1 Preliminary example
    2.4.2 QSRA risk calculation
  2.5 Risk Management
    2.5.1 Optimization problem setup
  2.6 Database
    2.6.1 ISTS ICAT
    2.6.2 NETWORKS
  2.7 A QSRA example

3 Implementation of QSRA system
  3.1 Findings
  3.2 Overview
  3.3 QSRA Event Flow
    3.3.1 Vulnerability assessment: Software inventorying
    3.3.2 Vulnerability assessment: Software identification
    3.3.3 Risk assessment
    3.3.4 Risk management

4 Analysis
  4.1 Findings
    4.1.1 Parameters and Vulnerability Data
    4.1.2 Faults and consequences
    4.1.3 QSRA empirical results
  4.2 Parameter and vulnerability data sources
    4.2.1 Vulnerability lifecycle
    4.2.2 Hypotheses
    4.2.3 Timing data set findings
    4.2.4 Attacker expertise
  4.3 Approach
    4.3.1 Audited environment
    4.3.2 Data and analytical framework
  4.4 Vulnerability assessment
    4.4.1 Experiments
    4.4.2 Discussion
  4.5 Risk Assessment
    4.5.1 Risk calculations
    4.5.2 Fault and Consequences distributions
  4.6 Risk Management
  4.7 Possible improvements
    4.7.1 Design
    4.7.2 Implementation

5 Future Work
  5.1 Database update
    5.1.1 Vulnerability Assessment
    5.1.2 Risk Assessment
    5.1.3 Risk Management

A Numeric analysis
  A.1 Regression
  A.2 ANOVA

B Databases
  B.1 ISTS ICAT
  B.2 NETWORKS

C Supplemental data
  C.1 Average time window from discovery of the vulnerability to posted patch
  C.2 Software inventorying experimental results
  C.3 Faults and Consequences
    C.3.1 Vulnerability count
    C.3.2 Fault count
  C.4 Risk management scenario results

D Glossary
  D.1 Boxplot

List of Tables

1.1 Virus spread profile 1990-2003 [Kes00][Bek03]
1.2 Software fault densities [Bin97]
2.1 Hypothetic risk of a host
2.2 Fault types
2.3 Consequence types
2.4 Functionality (Class) groups
2.5 Vulnerabilities data sources
2.6 Risk management results for George
2.7 Remote risk limits for George
2.8 Risk assessment: Consequence losses and consequence probabilities for George
2.9 Risk management: Consequence losses and consequence probabilities for George
2.10 Vulnerabilities of Apache 2.0.35 and MySQL 4.0.5
2.11 Vulnerabilities of Apache 1.3.26 and MySQL 3.23.50
3.1 QSRA event sequence
3.2 Sample uname and ver output
4.1 Questionnaire: Vulnerability lifecycle and attacker ability
4.2 Categorical and continuous data sources: decision criteria
4.3 Vulnerability lifecycle events
4.4 ANOVA results of time between discovery and patch [t_desc, t_patch]
4.5 Experts' estimate for time window from discovered vulnerability to posted patch [t_desc, t_patch]
4.6 Overview of questionnaire results
4.7 Vulnerability advisories vs attack in 2002
4.8 Standard installed software
4.9 Experimental factors
4.10 Timing experiment: Software count, database size, OS
4.11 Timing experiment: Connection type, ARP cache
4.12 Timing experiment: Database size, number of inventoried hosts
4.13 Stepwise regression results, two factors: Number of software packages, database size
4.14 Stepwise regression results, two factors: Number of hosts, database size
4.15 OS ANOVA: Software inventorying timing data
4.16 Size of software inventory
4.17 Parameters: Attacker ability
4.18 Parameters: Time window till posted exploit
4.19 OS consequence probability calculation results
4.20 Proportion of OS to consequence
4.21 Proportion of consequences to OS
4.22 Proportion of consequences to fault
4.23 Proportion of faults to consequence
4.24 Proportion of OS to fault
4.25 Proportion of faults to OS
4.26 ANOVA results of proportions of faults, OS and consequences
4.27 Fault type breakdown 2002 [oSN02]
4.28 Consequence magnitudes
4.29 Individual OS remote consequence risk probability scenario results
4.30 Individual OS remote vulnerability count scenario results
4.31 OS family average remote risk probability scenario results
4.32 OS family average remote vulnerability count scenario results
4.33 Relevant libraries used by Apache 1.3.26 and Apache 1.3.27
5.1 Revised vulnerability lifecycle
5.2 New vulnerability consequence types
B.1 System aggregate
B.2 Vulnerabilities aggregate
B.3 Relations aggregate
B.4 NCC aggregate
B.5 Risk Assessment aggregate
B.6 Risk Management aggregate
C.1 Security advisories as of September 30th, 2002
C.2 Average time window (days) of discovered vulnerability to posted patch
C.3 ANOVA group mean rank comparison: OS
C.4 ANOVA results group mean rank comparison: consequences vs fault type
C.5 ANOVA results group mean rank comparison: consequences vs OS
C.6 ANOVA results group mean rank comparison: fault type vs OS
C.7 Vulnerability count by OS, class and access: Debian 3.0 and Mandrake 8.2
C.8 Vulnerability count by OS, class and access: Suse 8.0, W2K SP2, OpenBSD 3.1 and RedHat 7.3
C.9 Fault count of standard OS installations

List of Figures

1.1 Fault tree example of event "No power on the 415V bus" [www03]
2.1 Software view of the network
2.2 QSRA approach
2.3 Fault tree for consequence c
2.4 Escalation exploits
3.1 Inventorying the network
3.2 Sample lsof output
3.3 Sample fport output
3.4 Specifying the software
3.5 Assessing the software risk
3.6 Managing software risk
4.1 Timeline of events in vulnerability lifecycle
4.2 Time between discovery and patch [t_desc, t_patch]
4.3 Histogram of time between discovery and patch [t_desc, t_patch]
4.4 Group mean comparison of [t_desc, t_patch]
4.5 Scatterplot: Software inventorying time for one host
4.6 Scatterplot: Software inventorying time for multiple hosts
4.7 Failure probability of independent components in series
4.8 Fault count data
5.1 Vulnerability database input mask
C.1 Boxplot of [t_desc, t_patch]
C.2 Software inventorying time ANOVA: Boxplots for connection type and ARP size
C.3 Software inventorying time ANOVA: Boxplots for database size and software count
C.4 Software inventorying time ANOVA: Boxplots for database size and software count
C.5 Fault count ANOVA: Boxplot
C.6 Risk probability scenario reduction: Consequences
C.7 Risk probability scenario reduction: Consequences
C.8 Risk probability scenario reduction: Individual OS
C.9 Risk probability scenario reduction: OS families
D.1 Sample boxplot

Chapter 1

Introduction
1.1 Problem statement

At the end of the 20th century, we have witnessed the massive transition from isolated, disconnected computers to networked computer clusters all over the world. At the time of this writing (Spring 2002), there are an estimated 195 million networked hosts world-wide [Tec02]. This global pervasive connectivity has been a boon for consumers, businesses and governments alike, due to the ease, convenience and speed of electronic data exchange. However, the ease of use and relative anonymity that the Internet affords has been leveraged by criminal elements as well. The National Infrastructure Protection Center's (http://www.nipc.gov/) Congressional statement of May 25th, 2000, highlights the risks we face today:

    Ninety percent of respondents detected security breaches over the last 12 months. At least 74% of respondents reported security breaches including theft of proprietary information, financial fraud, system penetration by outsiders, data or network sabotage, or denial of service attacks. Many companies experienced multiple attacks; 19% of respondents reported 10 or more incidents. Information theft and financial fraud caused the most severe financial losses, estimated by the respondents at $68 million and $56 million respectively. The losses from 273 respondents totaled just over $265 million. Notably, this survey does not include harm caused by recent destructive episodes such as the Distributed Denial of Service attacks on e-commerce sites in February, and the ILOVEYOU or "Love Bug" virus earlier this month.

No private, commercial or government agency is completely safe or has been unaffected by the proliferation of this kind of cyber-crime. E-Commerce Times reported that the ILOVEYOU virus affected 45 million hosts and inflicted monetary damages to the tune of an estimated $2.6 billion [Eno00]. The infamous Melissa macro virus caused an estimated $300 million in damage in 1999, and several prominent e-commerce sites were hit by Distributed Denial of Service attacks in the beginning of 2000 [SoT00]. The estimated worldwide damage caused by automated digital attacks was over $30 billion for 2002 [mi202b]. These estimated damage figures have to be taken with a grain of salt, but the trend is clear. Moreover, in just a dozen years' time, the propagation speed and the estimated damages have increased by five and two orders of magnitude, respectively (see Table 1.1).

Table 1.1: Virus spread profile 1990-2003 [Kes00][Bek03]

  Virus                     Year  Type                        Time to prevalence (1)  Estimated damages
  Jerusalem, Cascade, Form  1990  .exe File, Boot Sector      3 years                 $50M
  Concept                   1995  Word Macro                  4 months                $50M
  Melissa                   1999  E-mail enabled, Word Macro  4 days                  $93M-$385M
  Love Bug                  2000  E-mail enabled, .vbs        5 hours                 $700M-$6.7B
  Slammer                   2003  Self-propagating worm       10 minutes              $1B

  (1) Time it took to become the #1 most prevalent computer virus in history

One of the persistent factors is faulty software. Software fault density remains between 0.1 and 55 faults per thousand lines of source code (see Table 1.2). Exacerbating circumstances include increases in code size and in the number of interaction pathways between the various software subsystems. Perhaps the most aggravating fact is the unwillingness and unawareness of designers and implementers alike to use proven techniques to build secure code. The net result has been an increase in the number of potentially serious software faults. In the absence of larger, long-term efforts and initiatives (vendor liability, legal regulations, widespread adoption of safe design practices and responsible user behavior), the risk posed by faulty software has to be managed somehow.

Table 1.2: Software fault densities [Bin97]

  Application                      Faults per thousand lines of source code
  NASA Space Shuttle Avionics      0.1
  Leading-edge software companies  0.2
  Reliability Survey               1.4
  Best, Military system survey     5.0
  Worst, Military system survey    55.0


1.2 Feasibility

There are two main objections that have been raised to the feasibility of cyberrisk analysis and management:

1. Unknown enemies with unknown skills, knowledge, resources, authority, motives, and objectives (SKRAMO) attacking unknown vulnerabilities make risk measurement and management impossible.

2. Dealing with known SKRAMO and known vulnerabilities does little to reduce risk.

The first argument is rather like Lord Kelvin's 1895 remark in front of the Royal Society: "Heavier than air flying machines are impossible." Incomplete information does not mean no information at all. There are studies of software defects in code [MV96][Rea03], theoretical modelling of software defects [MIO90] and attack proportions/damages among operating systems [mi202c][mi202a], as well as extensive historical vulnerability databases [oSN02][Str02]. These sources can be used to hypothesize about future unknown vulnerabilities, attack frequencies, and attack targets. It is correct, however, that there is a conspicuous lack of studies of attackers: their knowledge, resources, and skills. The short answer is that in light of incomplete information, cyber risk, like all risks, cannot be eliminated, but can be managed. Various organizations, diligent ones like the US DoD among them, have been doing this for some time now [Gor03].

The issue is subtler when one tries to evaluate cyberrisk in the context of formulating insurance policies and setting insurance premiums. Insurance companies have to guard against two phenomena when setting premiums: adverse selection, when someone chooses to insure against a particular loss because of private information not available to the insurance companies at the time of contracting (like somebody getting life insurance before going to Liberia), and moral hazard, which denotes the lack of incentives by the insured to take actions that reduce the probability of a loss subsequent to purchasing the insurance [GLS03]. The moral hazard problem can be fruitfully addressed by QSRA: extensive software auditing and subsequent risk analysis can pinpoint problem areas to the insurers, which in turn can offer discounts for companies that take (QSRA-specified) security measures. The short answer is that the QSRA methodology is not the whole picture, but is just part of the premium assessing process.

The second argument has been shown to be false. Known vulnerabilities are the main entry point into most systems, as security groups have maintained and recent incidents have shown [SASS02, /top20][CC03, VU484891]. Mitigating these risks would go a long way towards improving security. In addition, the issue is not known vulnerabilities per se. The issue is one of locating the software on the network that exhibits these vulnerabilities and patching it. Few administrators are diligent about installing the latest patches and upgrades. Almost no one has a good overview of the software running on their network in the first place. An extensive software inventory is a prerequisite that makes dealing with known vulnerabilities possible, and is one of the most useful features of the implemented QSRA system.

1.3 Risk

1.3.1 Fault Tree Analysis

Fault tree analysis was developed by Bell Telephone Laboratories in 1961 to improve the reliability of the Minuteman Launch Control System [II99]. Since then, it has been extensively used for evaluating system safety and concomitant risk in many engineering disciplines (power, nuclear, electric, source code analysis) [oRBC75] [LM00]. Fault tree analysis is used to model failure situations in (typically) multi-component systems. The goal of fault tree analysis is to investigate dependencies between failures of components that lead to the fault tree's top-event. A fault tree is a logical diagram representing the consequences of component failures on the failure of the so-called top-event, the root of the tree. The top-event occurrence depends on a specific combination of basic events, which are combined with two primary types of gates, AND and OR.

An example of a fault tree is given in Figure 1.1. The top-event is 415VBUS, indicating that there is no power on the 415V bus. The sub-events are ISO-A, ISO-D (and its negation NOT(ISO-D)), CON-A, DIESEL, and EXT-GRID-REPAIR. The following Boolean expression illustrates their relationship (writing AND as "·" and OR as "+"):

415VBUS = ((ISO-A · NOT(ISO-D)) + CON-A + DIESEL) · (EXT-GRID-REPAIR + ISO-D)    (1.1)

Figure 1.1: Fault tree example of event "No power on the 415V bus" [www03]

Systems

A system's reliability function r(t) is the probability that the top event does not occur over the time period (0, t). The system's failure function f(t) is equal to 1 - r(t). Systems can be arranged in different configurations. Hillier and Lieberman identify three structures: series, parallel, and k-out-of-n systems [HL90]. A serial system is a system which does not fail only if all components do not fail. A parallel system is a system which does not fail only if at least one component does not fail. A k-out-of-n system fails if more than (n - k) components fail. For serial systems with n independent components, each with reliability r_i(t), the system's reliability r_system(t) and the system's failure f_system(t) are given by Equations 1.2 and 1.3.
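The gate structure of Equation 1.1 can be checked mechanically. The following is a minimal sketch, not taken from the thesis, written in Java (the prototype's implementation language); it evaluates the top-event from assumed truth values of the basic events, with AND as && and OR as ||:

    public class FaultTreeExample {
        // Equation 1.1: top event "no power on the 415V bus".
        static boolean topEvent(boolean isoA, boolean isoD, boolean conA,
                                boolean diesel, boolean extGridRepair) {
            return ((isoA && !isoD) || conA || diesel)
                    && (extGridRepair || isoD);
        }

        public static void main(String[] args) {
            // Hypothetical scenario: only DIESEL and EXT-GRID-REPAIR
            // have occurred; the top event then occurs.
            System.out.println(topEvent(false, false, false, true, true)); // true
        }
    }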

r_{system}(t) = \prod_{i=1}^{n} r_i(t)    (1.2)

f_{system}(t) = 1 - \prod_{i=1}^{n} (1 - f_i(t))    (1.3)

This is the basic risk modelling setup that is going to be used by QSRA's calculations.
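As a numeric illustration of Equations 1.2 and 1.3, here is a minimal sketch with assumed component values (not data from the thesis):

    public class SeriesSystem {
        // Equation 1.2: system reliability is the product of component reliabilities.
        static double reliability(double[] r) {
            double prod = 1.0;
            for (double ri : r) prod *= ri;
            return prod;
        }

        // Equation 1.3: system failure is 1 minus the product of (1 - f_i).
        static double failure(double[] f) {
            double prod = 1.0;
            for (double fi : f) prod *= (1.0 - fi);
            return 1.0 - prod;
        }

        public static void main(String[] args) {
            double[] r = {0.99, 0.95, 0.90};  // hypothetical component reliabilities
            double[] f = {0.01, 0.05, 0.10};  // corresponding failure probabilities
            System.out.println(reliability(r)); // ~0.84645
            System.out.println(failure(f));     // ~0.15355 = 1 - 0.84645
        }
    }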

1.3.2 Previous work on cybersecurity risk

There exist a number of suites that attempt to evaluate and manage cybersecurity risk (Strategic Technology Institute's Cost-Benefit-Analysis Tool C-BRAT [Ins01], SAINT Corporation's Security Administrator's Integrated Network Tool [Cor01], Network Associates' CyberCop Scanner [Ass01]), proposed measures to quantify the risk of vulnerability exposure (Northcutt's severity measure system [Nor99]), and approaches for assessing vulnerabilities in computer networks (Sandia's computer-attack graph generation tool [PS98] [SPEC01]).

The risk calculations in the aforementioned approaches are either opaque (in the case of the commercial products) or very crude (Northcutt's measure), or gloss over how the probability/risk measure will actually be calculated (Sandia). Furthermore, practical, specific, granular risk management is a key aspect of an integrated methodology for mitigating cybersecurity risk, and there has been a paucity of tools and approaches in that respect. Lastly, since today's networks are increasingly ad-hoc, mobile and wireless (with the invariably concomitant increase in risk), no effective methodology can afford to be static: it must constantly adjust itself or be quickly adjustable to reflect the addition, removal and modification of hosts and their constitutive software. Ideally, the assessment, analysis and management recommendations must be real-time or near-real-time (on the order of minutes).

QSRA was designed with six principles in mind:

1. Generality: QSRA is general in its design and can be deployed to assess and manage the risk posed by any IP device on a network.

2. Adaptiveness: QSRA can be run in a semi-automated or manual fashion on a network of interest; mobile, wireless, static, or combinations.

3. Comprehensiveness: QSRA's risk metric calculations take machine configuration, functionality and value, attacker ability, software installed/running, time flow, and compromise dependencies into account.

4. Granularity: The granularity of the data used in the risk analysis step is on the level of the individual program, broken down into different compromise consequences of different severity.

5. Transparency: QSRA is an open methodology, as well as a toolset, that is largely platform independent with exclusive use of COTS components.

6. Solutions: QSRA offers risk management, i.e. rank-ordered practical steps to reduce risk, such as "upgrade this web server on IP 127.0.0.2 to a newer version" or "disable service A on IP 127.0.0.7", subject to cost, compatibility and functionality constraints.

1.4 Major findings

Vulnerability data was collected for six operating systems: Red Hat Linux 7.3, SuSe Linux 8.0, Mandrake Linux 8.2, OpenBSD 3.1, Debian Linux 3.0, and Windows 2000 Professional SP2. The standard, out-of-the-box workstation installation was chosen when installation options were presented. Since the Linux and OpenBSD distributions come with a wealth of applications, MS Office was assumed to be part of the standard out-of-the-box Windows 2000 distribution.


1.4.1 Parameters and Vulnerability Data

Vulnerability lifecycle: Empirical research on 58 vulnerabilities over a three-month period showed that in over 60% of the cases where data were available, exploits were obtainable before and at the time the patch was released. Most exploits were obtainable the day the patch was released, and some preceded the patch by as much as two months.

Time window data finding: For the six audited operating systems, we find that around 75% of vulnerabilities are patchable within two weeks, and around 90% within 40 days after initial discovery. There is no statistically significant difference between locally and remotely exploitable vulnerabilities, nor between vulnerability consequence types. No statistical inference can be drawn regarding fault types. There is no statistically significant difference between open source and closed source software. There is, however, a statistically significant difference between security-audited and non-security-audited software.

Attacker expertise: Automated exploit tools and straightforward exploit descriptions are readily found on the Internet and make acquisition of attacker skills easy. A corollary of this finding is that one cannot count on vulnerabilities and associated exploits to remain secret.

1.4.2 Faults and consequences

Faults were mapped to a vulnerability consequence, a fault type and an operating system. The overwhelming majority of fault occurrences lead to availability and full compromise consequences. 75% of the 3-tuples (OS, consequence and fault type) have a count of 0, and around 25% fall within a count range of one to four. Within availability and full compromise consequences, around 15% have a count of one or two, and around 25% fall within a count range of one to four. Input validation faults are proportionally over-represented. There is a statistically significant difference between consequence types: full compromise consequences are proportionally over-represented. There is no statistically significant fault or consequence proportion difference between the audited hosts.


1.4.3 Design and Implementation of the QSRA system

Java as an implementation language proved its worth in cross-platform development. It cannot be stressed enough how much time is saved by the absence of the tedious memory leaks and pointer manipulations that plague C and C++. Text parsing, however, is not Java's forte, and hence QSRA client development took a disproportionate amount of time.

The decision to design a relational, as opposed to a flat, database was found to have been very fortunate. Having all data relations in third normal form is advisable, since this enables additions and mutations to be accommodated with few if any changes to the existing design. The ists_icat database holds data about software characteristics and its associated vulnerabilities. It is used to provide choices for the software identification, data and parameters for risk assessment, and options for risk management. It has two main functional aggregates: Systems, which holds data on software programs, and Vulnerabilities, which holds data on the software's associated vulnerabilities. Its comprehensiveness is key. A lower estimate for the necessary risk management data, for 40 software class categories with 5 substitute programs each across 3 families (Windows, BSD, Linux), is 600 entries. The lower estimate for risk analysis is far higher, since there are thousands of software packages for each family. The lower bound record size of the Systems aggregate for production quality deployment is estimated to be around 3000 entries, namely for three OS families with fifty functionality groups containing twenty records each. This number is necessary so that the risk management optimizer has a sufficiently large pool from which to assemble an alternative software makeup. In steady-state, and with an input mask, it takes no more than 5 minutes to amass and assess one entry and another 3 minutes to write it to the database, making the time commitment to populate the Systems database around 400 man-hours (3000 entries at 8 minutes each), or at least two months for a single person.

The time commitment to keep the Vulnerabilities aggregate current is continuous and substantial. Researching and scrubbing a single vulnerability takes at least ten minutes, and in many cases may take an hour or more. CERT reported over four thousand new vulnerabilities in 2002 [CC03], so at a rate of eighty vulnerabilities/week, one should schedule a minimum of 12 man-hours a week, although twenty to thirty man-hours a week would be more advisable.

1.4.4 QSRA empirical results

As implemented, software inventorying time grows linearly with the number of audited hosts, the number of installed software packages, and the number of records in the networks database, which records the software makeup of the network. Database throughput to networks becomes a non-negligible factor when the number of records in the database is high. There is no statistically significant time difference between auditing wireless hosts and landline hosts.

QSRA's risk assessment model calculated that across all the audited operating systems, four to six months after their respective release date, the probabilities are very high (66% to 99%) that an attacker can conduct a full consequence compromise, remotely and locally. One or two faults are enough to cause the probabilities to rise to 50% and more. Almost all, if not all, of the faults have to be eliminated to have an appreciable effect on risk probabilities.

Risk management analysis for remote risk probabilities indicates that, given a moderate fault count, QSRA's analytic "highest risk" mitigation strategy consistently outperforms the simpler strategy of choosing software with the highest vulnerability count. "Highest risk" outperforms the undifferentiated "highest count" strategy for at least four out of the six tested operating systems and for four out of five fault consequences. The most compelling effects are found on the Windows system, probably due to the comprehensiveness of Windows-style patches. The effects on Linux-family systems are less pronounced. Both strategies have minimal effect on risk probability reduction across all audited hosts when the vulnerability count is high.

Chapter 2

Design of the QSRA system


2.1 Findings

The ists_icat database holds data about software characteristics and its associated vulnerabilities. It is used to provide choices for the software identification, data and parameters for risk assessment, and options for risk management. It has two main functional aggregates: Systems, which holds data on software programs, and Vulnerabilities, which holds data on the software's associated vulnerabilities. Its comprehensiveness is key. A lower estimate for the necessary risk management data, for 40 software class categories with 5 substitute programs each across 3 families (Windows, BSD, Linux), is 600 entries. The lower estimate for risk analysis is far higher, since there are thousands of software packages for each family. The lower bound record size of the Systems aggregate for production quality deployment is estimated to be around 3000 entries, namely for three OS families with fifty functionality groups containing twenty records each. This number is necessary so that the risk management optimizer has a sufficiently large pool from which to assemble an alternative software makeup. In steady-state, and with an input mask, it takes no more than 5 minutes to amass and assess one entry and another 3 minutes to write it to the database, making the time commitment to populate the Systems database around 400 man-hours, or at least two months for a single person.

The time commitment to keep the Vulnerabilities aggregate current is continuous and substantial. Researching and scrubbing a single vulnerability takes at least ten minutes, and in many cases may take an hour or more. CERT reported over four thousand new vulnerabilities in 2002 [CC03], so at a rate of eighty vulnerabilities/week, one should schedule a minimum of 12 man-hours a week, although twenty to thirty man-hours a week would be more advisable.

2.2 Overview

Conceptually, for QSRA the network is not a collection of hardware. From QSRA's point of view, the network is a collection of IP addresses with associated software. Figure 2.1 is an illustration of this view.

Figure 2.1: Software view of the network

The software on each IP-enabled device can be classified into three groups.

Services: Services are running programs that have one or more sockets bound to listening ports. They react to connection requests from other devices. An example would be an httpd daemon listening on port 80, the common web service port.

Applications: The main difference between services and applications is that the latter are not listening on any port. Rather, the term encompasses installed software on the device. This software may or may not be actively executing. An example would be MS Office.

Operating system: The operating system is the most important software on a device, since it provides the operating environment for other software.

2.2.1 Software quality, faults, failures

Software code quality has not markedly improved. Although it is impossible, both on theoretical and practical grounds, to build completely secure systems, commercial off-the-shelf (COTS) software has been exhibiting the same source code fault density for the past twenty years. Software fault density remains between 0.1 and 55 faults per thousand lines of source code (see Table 1.2). Exacerbating circumstances include increases in code size (Windows 2000 has an estimated 35 million lines of source code [Whe02]) and in the number of interaction pathways between the various software subsystems. Perhaps the most aggravating fact is the unwillingness of designers and implementers alike to use proven techniques to build secure code [HL02, pp. 19-57]. The net result has been an increase in the number of potentially serious software faults.

A fault is a defect in a program that, given certain conditions, causes a failure [MIO90, p. 6]. A fault may cause many failures; likewise, a failure may be caused by more than one fault. Some failures can be exploited by attackers. Faults, as well as failures, may be classified according to type. Drawing from the taxonomy used by ICAT, SecurityFocus, Landwehr and Bishop [oSN02] [Sec02] [LBMC94] [Bis00], the following fault types can be distinguished (a code sketch of this taxonomy follows the list):

1. Input validation error: A fault is characterized as an input validation error if the input being received by the software is not properly checked, thereby enabling a vulnerability to be exploited by a certain input sequence. An example of this would be a buffer overflow. A different example would be a boundary condition error, the canonical case being the Y2K problem, where the 2-digit year assumption was exceeded. Another example would be insufficient validation of HTTP requests that may allow execution of code.

2. Access validation error: A fault is characterized as an access validation error if software is vulnerable because the access control mechanism is faulty. The problem lies not with the configuration of the access control mechanism but with the mechanism itself. An example would be an error that enabled authentication to an FTP service using a non-existent username/password.

3. Exceptional condition handling error: A fault is characterized as an exceptional condition handling error if software becomes vulnerable due to an exceptional condition that has arisen. The handling (or mishandling) of the exception by the software enables a vulnerability. An example of this would be an error in router software that allows remote attackers to cause a denial of service via a special BOOTP packet that is forwarded to broadcast MAC addresses.

4. Environmental error: A fault is characterized as an environmental error if unanticipated synergies cause the software to be vulnerable. This may be due, for example, to an unexpected interaction between an application and the operating system, or between two applications on the same host. An example of this would be an interaction between encryption software and an encrypted file system, which results in the creation of cleartext temporary files that cannot be wiped or deleted due to the strong permissions of the file system.

5. Configuration error: A fault is characterized as a configuration error if user-controllable settings cause the software to become vulnerable. This fault is caused by controllable configuration, not inherent design errors. An example of this error would be unchanged, well-known default user accounts on remotely accessible software.

6. Race condition: A fault is characterized as a race condition if the non-atomicity of a security check causes the existence of a vulnerability. An example would be software checking the validity of an operation against a security model; between the time of the security check and the actual operation, changes occur that invalidate the security model. Attackers can then perform illegal operations, like writing to the password file.

7. Design error: A fault is characterized as a design error if there exist no errors in the software implementation or configuration; rather, the initial design is faulty. An example of this error would be a default root password that cannot be changed.

The assumption is made that some fault types are harder to exploit than others; taking advantage of a race condition requires on the whole more expertise and finesse than exploiting an input validation error, for instance. Faults cause failures, which will henceforth be called consequences. Consequences may be one of five different types (see Table 2.3). Each of these consequences entails a potential loss, which is dependent on the IP-enabled device on which the software resides. The determinants are, intuitively, the data and/or functionality of the IP-enabled device. An availability breach on a public Web server may be much more serious than an identical breach on an internal Web server, for instance. Similarly, a confidentiality breach on a database hosting customer data may spell ruin, whereas the impact of the same confidentiality breach on a database hosting archival information may be minimal. Presently, there are no known ways to automate the generation of the loss functions; there has to be a risk manager who can gauge the value of each IP-enabled device for the network's information/business/mission purpose and set the appropriate magnitudes for the consequence losses.
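The two taxonomies map naturally onto enumerations. The following is a hypothetical sketch in Java, the thesis' implementation language; the names are illustrative and are not the QSRA prototype's actual types:

    // Hypothetical sketch: fault and consequence taxonomies as Java enums.
    enum FaultType {
        INPUT_VALIDATION_ERROR,        // includes buffer overflow, boundary condition
        ACCESS_VALIDATION_ERROR,
        EXCEPTIONAL_CONDITION_HANDLING_ERROR,
        ENVIRONMENTAL_ERROR,
        CONFIGURATION_ERROR,
        RACE_CONDITION,
        DESIGN_ERROR
    }

    enum Consequence {
        AVAILABILITY,    // software or data unavailable for legitimate use
        CONFIDENTIALITY, // unintended read privilege to data
        INTEGRITY,       // unintended write privilege to data
        PROCESS,         // user-level access over software and data
        FULL             // full access over all software and data on a device
    }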

2.3 Process sequence

The software information of the network is gathered and processed in three steps, as shown in Figure 2.2:

Figure 2.2: QSRA approach

1. Vulnerability Assessment: Conduct an exhaustive and detailed inventory and identification of software on a network of interest (operating systems, services, and applications)

2. Risk Assessment: Quantify the risks associated with those services by matching them with known security vulnerabilities, using a comprehensive back-end vulnerability database

3. Risk Management: Optimize the risk by changing the software makeup of the network, subject to cost, compatibility and functionality constraints

The QSRA software consists of a QSRA server running the QSRA application, and QSRA clients. The clients are installed on the IP-enabled hosts on the network. The clients' responsibility is to perform a thorough inventory of services and applications found on the residing IP address. The inventorying step follows a pull model: the client programs relay software inventory information back to the application when contacted by the QSRA application. After gathering the software information, the application assists the user, with help from the vulnerability database, in identifying the software found on the hosts. Once this identification is complete, the software inventorying/vulnerability assessment stage is complete. This data stream carries detailed and specific information about the network's software makeup, and should be adequately protected against eavesdroppers. For specialized IP-enabled devices such as routers, where a client installation is not practical or feasible, the entire software inventorying and identification for the device has to be done manually.

Next, the QSRA application queries the vulnerability database against the identified software to gather vulnerability information (faults and associated consequences, and lifecycle data). With the help of a risk manager, who has to specify the magnitude of the five consequence losses on each host, the QSRA application calculates a risk profile. This constitutes the risk assessment step.

The final step, risk management, requires input from the risk manager as well. There are four sets of constraints: OS compatibility, functionality, risk limits, and cost limits. The application's optimization module takes care of the first two constraints; the risk manager has to set the last two. Risk limits are set by the risk manager for each IP-enabled device and reflect the upper acceptable bounds for the consequence magnitudes. Cost limits are the upper bound monetary costs associated with software changes. This includes installation, configuration, maintenance, upgrade, training, and acquisition costs, expressed in US$. Within these constraints, the QSRA server assembles a software makeup substitute and presents it to the risk manager with the expected risk reduction report (as shown stylized in Figure 2.2). A sketch of this three-step pipeline follows.
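To make the three-step sequence concrete, the following Java sketch shows one way the server-side pipeline could be organized. All names and signatures here are hypothetical illustrations; the thesis does not specify these interfaces:

    import java.util.List;
    import java.util.Map;

    // Hypothetical sketch of the three-step QSRA pipeline; names and
    // signatures are illustrative, not the prototype's actual API.
    interface QsraPipeline {
        // Step 1: pull software inventories from the clients on each host.
        Map<String, List<String>> inventory(List<String> hostAddresses);

        // Step 2: match identified software against the vulnerability
        // database and compute per-host consequence risk in US$.
        Map<String, Double> assessRisk(Map<String, List<String>> inventory);

        // Step 3: propose a substitute software makeup subject to cost,
        // compatibility and functionality constraints.
        Map<String, List<String>> manageRisk(Map<String, Double> risk,
                                             double costLimitUsd);
    }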


2.4 Risk Assessment

2.4.1 Preliminary example

Risk is commonly defined as the product of probability and severity of adverse effects, and the most common approach to quantify risk is a single figure: its expected value [Hai98, p. 29]. Mathematically speaking, given a random variable X with probability function p_X(x) and loss function l_X(x), the expected risk value in the discrete case is equal to

E(X) = \sum_{x} p_X(x) \, l_X(x)    (2.1)

It is apparent that this is a generic probability-weighted averaging formula. Its semantic specialization into an expected value of risk occurs through the loss function. The unit of the expected risk value is the unit used by the loss function, and could be downtime, cost, credibility, etc.

As a preliminary example, the simplified risk of attack consequences on a host that is running one application is shown in Table 2.1. The assumption is that an attacker can exploit three vulnerabilities, with the following consequences for the host: Full compromise (equivalent to root access), Availability (commonly known as Denial of Service), and Confidentiality (as an example, reading a password file). Let X be a discrete variable denoting the individual fault consequence. Let the probability function be the probability that the vulnerability is targeted by the hacker. Let the loss function be defined as general damage, in US$, to the function the host serves.

Table 2.1: Hypothetic risk of a host

  X                p_X(x)  l_X(x)  Risk_X(x)
  Full             0.1     $100    $10
  Availability     0.8     $10     $8
  Confidentiality  0.1     $50     $5

With the probabilities and expected values shown in Table 2.1, the expected value of risk E(X) of a hacker attack is $10 + $8 + $5 = $23.
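The Table 2.1 example can be reproduced directly from Equation 2.1. A minimal sketch using the table's values:

    public class ExpectedRisk {
        public static void main(String[] args) {
            // Values from Table 2.1: Full, Availability, Confidentiality.
            double[] p = {0.1, 0.8, 0.1};        // targeting probabilities
            double[] loss = {100.0, 10.0, 50.0}; // losses in US$
            double expected = 0.0;
            for (int i = 0; i < p.length; i++) {
                expected += p[i] * loss[i];      // per-consequence risk: $10, $8, $5
            }
            System.out.println("$" + expected);  // $23.0
        }
    }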

2.4.2 QSRA risk calculation

Since the unit of analysis is software, this step gauges the risk of the deployed software on the network. The first section, Basic Risk, describes the software risk without taking interactions into account; basic remote and local consequence risk is calculated this way. The second section, Extended Risk, takes interactions between programs into account, refining the remote attack risk by including interactions between services and applications.

Table 2.2: Fault types

  Type                               Description
  Input validation error             improper input sequence validation
  Buffer overflow error              expected buffer length exceeded
  Boundary condition error           assumed boundary exceeded
  Access validation error            faulty access control mechanism
  Exceptional condition mishandling  incorrect exception handling
  Environmental error                discrepancy between test and installation environment
  Configuration error                unsafe configuration by user
  Race condition                     non-atomicity of security check
  Design error                       faulty basic design

Table 2.3: Consequence types

  Type             Description
  Availability     Some software or data is unavailable for legitimate use
  Confidentiality  Unintended read privilege to data
  Integrity        Unintended write privilege to data
  Process          User access over software and data
  Full             Full access over all software and data on a device

Basic Risk

Basic risk calculates the basic remote and local risk, without taking interactions between programs into account. The risk equations presented in this section focus on services (remotely accessible software), but can be applied without loss of generality to applications (locally accessible software). Let C be the set of consequences. Let V be the access venue (remotely exploitable or locally exploitable); V = {v_r, v_l}. Let loss_{(i,c)} be the consequence loss for a specific IP-enabled device i. Let p_{v,c}(t) be the probability of consequence c ∈ C, v ∈ V. Then, we have

Risk_{(v,i,c)}(t) = p_{v,c}(t) \cdot loss_{(i,c)}, \quad c \in C, \; i \in I    (2.2)

The consequence losses loss_{(i,c)} are domain-dependent on the data and the functionality of the IP-enabled device, and have to be set by the risk manager. The probability of consequence c is the combined failure probability of the services residing on the device that have faults that potentially entail consequence c. This probability is time dependent and assumed to be monotonically increasing.

For remotely exploitable risk (access venue v = v_r), let K_c be the set of services with faults that entail consequence c. Let p_{v_r,c}(t), c ∈ C, be the aggregate probability of the set of services K_c leading to consequence c. Let p_{c,k}(t) be the exploit probability of consequence c of a specific service k. Then, we have

p_{v_r,c}(t) = 1 - \prod_{k \in K_c} (1 - p_{c,k}(t)), \quad c \in C    (2.3)

Figure 2.3: Fault tree for consequence c

Equation 2.3 represents the failure of independent components in series [Ros93, pp. 418, 434]. The components are the software in Kc and 1 (1 pc,k (t)) is the combined failure probability of the

services residing on the device that have faults that potentially entail consequence c. This is intuitively reasonable, since the faulty services constitute independent pathways toward consequences. The more pathways you have, the more probable that this consequence will materialize. Figure 2.3 illustrates a consequence fault tree with two service pathways toward consequence c. The same pathway argument applies to multiple faults in a single service as well. Service k may have many faults which may lead to consequence c. For instance, an input validation fault and a race condition fault both could lead to a full (root) compromise. Let Fc,k be the set of faults in service k that lead to consequence c. Let qf (t) be the exploit probability of fault f , f Fc,k , at time t . Then, we have

p_{c,k}(t) = 1 − ∏_{f ∈ F_{c,k}} (1 − q_f(t))    (2.4)

q_f(t) denotes the probability that an attacker will be able to exploit fault type f at time t. Let t_desc be the time when the vulnerability was discovered. Let p_automated(t, f) be the probability that an automated tool to exploit this fault has been developed by time t, measured from t_desc. The



function is assumed to increase monotonically from time t_desc until t_posted, the time the exploit tools become publicly available, and then to stabilize. Let F_{c,k} be the set of faults in service k that lead to consequence c. Let p_manual(f), f ∈ F_{c,k}, be the proportion of attackers that can exploit this fault without an automated tool; these are skilled attackers who do not need automated tools. Let p_tool(f), f ∈ F_{c,k}, be the additional proportion of attackers that can exploit this fault with an automated tool. As time goes by, it becomes more likely that this set of attackers learns how to exploit this vulnerability. Then we have

q_f(t) = p_automated(t, f) · p_tool(f) + p_manual(f)    (2.6)
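To make the chain from fault-level to consequence-level probabilities concrete, here is a minimal sketch of Equations 2.3, 2.4 and 2.6 for a toy device; the parameter values and the linear ramp used for p_automated are illustrative assumptions, not QSRA's calibrated inputs.

```python
# Minimal sketch of Equations 2.3, 2.4 and 2.6. All parameter values and
# the linear ramp used for p_automated are invented for illustration.
from math import prod

def q_f(t, p_automated, p_tool, p_manual):
    # Eq. 2.6: exploit probability of a single fault at time t
    return p_automated(t) * p_tool + p_manual

def p_service(t, faults):
    # Eq. 2.4: service k fails if any of its faults is exploited
    return 1.0 - prod(1.0 - q_f(t, *fault) for fault in faults)

def p_consequence(t, services):
    # Eq. 2.3: consequence c materializes if any faulty service fails
    return 1.0 - prod(1.0 - p_service(t, faults) for faults in services)

def ramp(t, t_desc=0.0, t_posted=30.0):
    # assumed shape: increases from t_desc to t_posted, then stabilizes
    return min(max((t - t_desc) / (t_posted - t_desc), 0.0), 1.0)

# two services, one fault each: (p_automated, p_tool, p_manual)
services = [[(ramp, 0.30, 0.10)], [(ramp, 0.25, 0.05)]]
print(round(p_consequence(60, services), 3))  # 0.58 once both tools are public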

The basic remote and local risk over all consequences can be calculated this way.

Extended Risk: Services and Applications

So far, remote and local consequence risk have been calculated without taking interactions between programs into account. Interactions between programs enable escalation exploits. Escalation exploits are a manifestation of negative synergies between vulnerabilities: these are attacks that use vulnerabilities in one software program (usually a service) as a stepping stone to exploit vulnerabilities in another software program (usually an application). Figure 2.4 illustrates the pathways of two types of escalation exploits. Inclusion of escalation exploits affects the consequence probability p_{v_r,full} by potentially increasing the risk of a remote full consequence attack.

Remote2Local-User2Root Also known as R2L-U2R, a pathway in which a service is remotely exploited that has a fault entailing a process and/or integrity privilege compromise. This privilege is then used by an attacker to access an application a with a fault that can entail a full consequence compromise.

Remote2Local-Remote2Root Also known as R2L-R2R, a pathway in which a service is remotely exploited that has a fault entailing a process and/or integrity privilege compromise. This compromise enables a change in the state of some service, creating a configuration error fault within that service that enables an attacker to compromise a second service k_2, leading to a full consequence breach.

Let Escalation be the set of the two escalation events, Remote2Local-User2Root and Remote2Local-Remote2Root. Let PI be the set of consequences that affect non-root levels of access control; PI = Process ∪ Integrity. Let s_{v_r,full}(t) be the remote full consequence probability resulting from services with configuration faults. Let t_{v_r,full}(t) be the remote full consequence probability excluding services with configuration faults. We have



Figure 2.4: Escalation exploits

p_Escalation(t) = p_{v_r,PI}(t) · (p_{v_l,full}(t) + s_{v_r,full}(t) − p_{v_l,full}(t) · s_{v_r,full}(t))    (2.7)

Revising the full consequence probability p_{v_r,full} to include these escalation exploit pathways is straightforward. Combined with the non-escalation attack event that leads to full compromise, we have

p_{v_r,full}(t) = t_{v_r,full}(t) + p_Escalation(t) − t_{v_r,full}(t) · p_Escalation(t)    (2.8)
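Both equations combine events by the usual inclusion-exclusion rule for independent events; a small sketch of Equations 2.7 and 2.8 with placeholder probabilities follows.

```python
# Sketch of Equations 2.7 and 2.8 with placeholder probabilities; p_or is
# the inclusion-exclusion combination for two independent events.
def p_or(a, b):
    return a + b - a * b

def p_escalation(p_vr_PI, p_vl_full, s_vr_full):
    # Eq. 2.7: remote process/integrity foothold, then full compromise
    # either via a local application or via an induced configuration fault
    return p_vr_PI * p_or(p_vl_full, s_vr_full)

def p_vr_full(t_vr_full, p_esc):
    # Eq. 2.8: direct remote full compromise OR an escalation pathway
    return p_or(t_vr_full, p_esc)

esc = p_escalation(p_vr_PI=0.4, p_vl_full=0.5, s_vr_full=0.2)
print(round(p_vr_full(0.3, esc), 3))  # 0.468 for these placeholder inputs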

Aggregation

Each consequence risk is measured in $US. The aggregate risk of the IP-enabled device i is the summation of Equation 2.2 on page 17 over all consequences c ∈ C, C being the set of consequences in Table 2.3 on page 17. The device risk is then

Risk_{(i)} = Σ_{c ∈ C} Risk_{(i,c)}    (2.9)
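A compact sketch of this aggregation (Equations 2.2 and 2.9) for a single access venue; the probabilities and losses below are invented placeholders (losses are set by the risk manager).

```python
# Sketch of Equations 2.2 and 2.9 for one access venue; probabilities and
# losses are invented placeholders.
p = {"availability": 0.9, "integrity": 0.5, "confidentiality": 0.3,
     "process": 0.6, "full": 0.7}                      # p_{v,c}(t)
loss = {"availability": 10_000, "integrity": 10_000,
        "confidentiality": 0, "process": 30_000, "full": 50_000}

risk = {c: p[c] * loss[c] for c in p}   # Eq. 2.2: Risk_(v,i,c)(t)
print(sum(risk.values()))               # Eq. 2.9: device risk, 67000 ($US)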

2.5 Risk Management

Once the risk metrics have been calculated for the network's constitutive IP-enabled devices, the next step is to implement measures to reduce the network's risk profile. This is the risk management step. Subject to cost, functionality, and risk limit constraints, the risk profile optimization procedure consists of replacing, patching and/or removing software. The QSRA application, together with its optimization module, works with the vulnerability database to assemble an alternative software makeup. The conceived optimization problem is an instance of a Set Partitioning Problem (with base constraints).



Table 2.4: Functionality (Class) groups

access control, admin tool, application, application server, authentication, basic OS tools, boot server, bridge (hardware), bridge (software), bug tracking, communication, compression, database, dhcp, encryption, file, file sharing, firewall, firmware, IDS, internet client, ISDN, library, log, mail, mail client, mail server, name, NAT, network access (Ethernet), network access (non-Ethernet), network storage, news, non-TCP server, OS kernel, packet filter, packet sniffer, printer spooler, programming language, remote access, router (hardware), router (software), sandbox, security module, server daemon, superdaemon, switch (hardware), switch (software), task scheduler, text editor, time, unspecified, web, web proxy, wireless access point, ...

See Eso for an introduction [Eso99]. The problem statement is formulated as an Integer Linear Program (ILP) [BHM77, Ch. 9]. The ILP for one IP-enabled device a is described, without loss of generality. There are four general sets of constraints: OS compatibility, functionality, risk limits and cost limits. OS compatibility ensures that the QSRA application's solver module only chooses compatible software. Functionality ensures that the software set chosen by the solution retains pre-optimization functionality: if host a had a web server, a database server and an ftp server before optimization, it has to offer this functionality afterward, too. Table 2.4 lists the functionality ("class" in QSRA vernacular) groups in the prototype QSRA implementation. This list is not complete and can be amended; it is merely a listing of the functionality of the inventoried software in section 4.3. Risk limits are set by the risk manager and reflect the upper acceptable bounds for the risk consequences. Cost limits are the costs associated with software: monetarily, they include installation, configuration, maintenance, upgrade, training, and acquisition costs, expressed in $US; timewise, they include downtime, installation, configuration, training and/or maintenance costs, expressed in hours. They are set by the risk manager and specify an upper bound. The risk manager should work in conjunction with a senior network administrator and a senior executive in order to accurately assess the risk and cost dimensions.



2.5.1 Optimization problem setup

Let os be an operating system. Let x_os be an indicator vector of the software compatible with os. Let F_os be the functionality indicator matrix of the software in x_os. Let f_rhs be the functionality required for a. Let V_os be the matrix of risk probabilities p_{v,c}(t) (see Equation 2.2 on page 17) for the software in x_os. Let e be the exploit loss vector for IP-enabled device a, and let E = diag(e). Let v_rhs be the risk limit vector, set by the risk manager. Let R_os be the general cost matrix of the software in x_os. Let r_rhs be the cost limit vector, set by the risk manager. Mathematically, the optimization problem becomes

min_{x_os}  e V_os x_os    (2.10)

subject to

F_os x_os ≥ f_rhs    (2.11)
V_os E x_os ≤ v_rhs    (2.12)
R_os x_os ≤ r_rhs    (2.13)

such that x_os is binary.
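The following toy instance illustrates the structure of (2.10)-(2.13) by brute-force enumeration over binary x_os; all matrices are fabricated for the example, and a production system would hand the problem to an ILP solver such as lp_solve instead.

```python
# Brute-force illustration of the single-device ILP (2.10)-(2.13).
# All matrices are fabricated; this is a sketch, not QSRA's solver.
from itertools import product
import numpy as np

F = np.array([[1, 1, 0],
              [0, 0, 1]])                 # functionality classes x packages
V = np.array([[0.2, 0.6, 0.1]])           # consequence probabilities p_{v,c}(t)
e = np.array([50_000.0])                  # loss per consequence; E = diag(e)
R = np.array([[3_000, 2_000, 4_000]])     # monetary migration cost per package
f_rhs = np.array([1, 1])                  # required functionality (2.11)
v_rhs = np.array([20_000.0])              # risk limit in $US (2.12)
r_rhs = np.array([8_000.0])               # cost limit in $US (2.13)

best, best_risk = None, float("inf")
for bits in product([0, 1], repeat=3):    # enumerate every binary x_os
    x = np.array(bits)
    risk = e * (V @ x)                    # $US risk per consequence
    feasible = ((F @ x >= f_rhs).all() and (risk <= v_rhs).all()
                and (R @ x <= r_rhs).all())
    if feasible and risk.sum() < best_risk:   # objective (2.10)
        best, best_risk = x, float(risk.sum())
print(best, best_risk)                    # -> [1 0 1] 15000.0
```

Brute force is exponential in the number of candidate packages, which is why the prototype delegates to a mixed integer linear program solver.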

Networks of k IP-enabled devices are not completely independent subsystems. These k devices have coupling constraints in common, namely the risk and cost limits. Thus, the ILP formulation can be extended to k devices by using a straightforward large-scale optimization decomposition pattern, the primal block angular structure [BHM77, pp. 506-507]. For a general network with k devices, each with indicator vector x^(i) for OS-compatible software, the optimization problem becomes (leaving out the os subscript for clarity)

min_x  e^(1) V^(1) x^(1) + e^(2) V^(2) x^(2) + … + e^(k) V^(k) x^(k)    (2.14)

subject to

F^(i) x^(i) ≥ f_rhs^(i),  1 ≤ i ≤ k    (2.15)
V^(1) E^(1) x^(1) + … + V^(k) E^(k) x^(k) ≤ v_rhs    (2.16)
R^(1) x^(1) + … + R^(k) x^(k) ≤ r_rhs    (2.17)

such that x^(i) is binary, 1 ≤ i ≤ k.

The feasible solutions are returned as indicator vectors x^(i), 1 ≤ i ≤ k. If no feasible solution can be found, the risk and cost limit constraints (2.12), (2.13), (2.16) and (2.17) may have to be relaxed.

2.6 Database

QSRA's knowledge repository consists of two relational databases: ists_icat and networks. Both databases were designed to be modularly expandable, easily mutable, and free of the data redundancies found in other databases. Hence, all relations were designed to be, at a minimum, in third normal form. For more information on relational database design and normalization, see Rolland and Gillenson [Rol98, pp. 72-85] [Gil87].

2.6.1 ISTS ICAT

ists_icat holds data about software and its associated vulnerabilities. It is used to provide data and parameters for software identification, risk assessment and risk management. The database contains 40 tables, which can be separated into four functional aggregates:

System This set of 17 tables holds data on software programs, known as Systems. The inventoried service and application executables are mapped to these Systems, down to the patch level.

Vulnerabilities This set of 13 tables holds data on vulnerabilities.

Relations This set of 5 tables links the Systems to the Vulnerabilities.

Other These are helper tables that store unit information, date stamps, etc.

The timeliness and accuracy of the information in the ists_icat database (which holds detailed information about the software and its associated vulnerabilities) is vital to QSRA's risk model calculations. CERT reported over four thousand new vulnerabilities in 2002 [CC03]. In QSRA's current incarnation, ists_icat has to be populated manually, with the help of an input mask. The reasons are twofold. First, the vulnerability database uses data gathered from the manufacturer and up to ten other sources. Table 2.5 shows the non-manufacturer sources and what information they can provide. As can be expected, the data across sources is presented in varying formats, and the issue of faulty and contradictory data needs to be addressed. Second, some information extraction does not lend itself to automated parsing. For instance, sometimes it is necessary to read an exploit description in order to figure out what fault causes what vulnerability consequence, when the vulnerability was first discovered, and so on.

The lower bound on the record size of the Systems aggregate for production-quality deployment is estimated to be around 3000 entries: three OS families (Windows, BSD, Linux) with fifty functionality groups containing twenty records each. This number is necessary so that the risk management optimizer has a sufficiently large pool from which to assemble an alternative software makeup. In a steady state, with an input mask, it takes no more than 5 minutes to amass and assess one entry and another 3 minutes to write it to the database, making the time commitment to populate the Systems aggregate around 400 man-hours, or at least two months for a single person. The time commitment to keep the Vulnerabilities aggregate current is continuous and substantial. Researching and scrubbing a vulnerability takes at least ten minutes, and in many cases may take an hour or more. CERT reported over four thousand new vulnerabilities in 2002 [CC03], so at a rate of eighty vulnerabilities/week, one should schedule a minimum of 12 man-hours a week, although twenty to thirty man-hours a week would be more advisable.
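The staffing arithmetic can be replayed directly from the rates stated above:

```python
# Replaying the staffing arithmetic with the rates stated above.
entries = 3 * 50 * 20              # 3 OS families x 50 classes x 20 records
minutes_per_entry = 5 + 3          # assess (5 min) + write to database (3 min)
print(entries * minutes_per_entry / 60)   # 400.0 man-hours to seed Systems

weekly_vulns = 80                  # CERT: ~4000 new vulnerabilities in 2002
print(weekly_vulns * 10 / 60)      # ~13.3 man-hours/week at 10 min each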

2.6.2 NETWORKS

networks holds data about the network of interest scrutinized by QSRA. This includes the identification of inventoried executables (services and applications) with their corresponding programs, the numeric results of risk assessment, and the options generated by risk management. Three sets of tables may be discerned:

NCC This set of 7 tables holds entries pertaining to a network makeup: its constitutive IP-enabled devices, its inventoried programs, and the mapping of those programs to services and applications.

Table 2.5: Vulnerabilities data sources

Name | URL | Resource
SecuriTeam | http://www.securiteam.com | (e), (l)
Packet Storm | http://packetstormsecurity.nl/ | (e), (l)
New Order | http://neworder.box.sk/ | (e), (l)
Security Tracker | http://www.securitytracker.com/ | (e), (l)
Network Security News | http://www.security-update.com | (t), (a), (c), (e), (l)
SecurityFocus | http://online.securityfocus.com | (t), (a), (c), (l)
CVE | http://cve.mitre.org | portal for (t), (a), (c), (e), (l)
ISS X-Force Database | http://www.iss.net/security_center | (t), (a), (c)
ICAT Metabase | http://icat.nist.gov | (t), (a), (c)
MARC | http://www.theaimsgroup.com | (t), (a), (c), (l)
@stake | http://www.atstake.com/research | (t), (a), (c), (e), (l)

Legend: (t)ype, (a)ccessibility, (c)onsequences, (e)xploits, (l)ifecycle

Risk Assessment This set of 5 tables holds entries pertaining to the calculated risk measures of a network: the individual software risk, the risk of the IP-enabled device hosting that software, and the aggregate network risk.

Risk Management This set of 8 tables holds the entries pertaining to risk management scenarios and the alternative software makeups on the network.

This database holds extremely detailed information about the software setup on the network; in effect, it is a gold mine for any attacker. As such, access to this database should be restricted to the QSRA application and senior trusted personnel. As a technical security measure, the database contents as well as the communication stream between the QSRA application and networks should be encrypted. A detailed makeup of the databases can be found in appendix B on page 92.

2.7 A QSRA example

Assume we have a host George that functions as a web and database server for the PUSH company. The PUSH company's risk manager, Susan, knows that an immaculate, omnipresent George is vital for PUSH's public image. The integrity of the data residing on it is important as well, but not its confidentiality, since the information is public anyway. Of course, Susan would not want to see the server processes taken over (and users perhaps redirected to a rival company) or, worse yet, have George taken over completely and used as a zombie in a DDoS attack against PUSH or another company (and be subjected to potential liability suits). Susan thus assesses the loss consequences of George for the company. QSRA calculates the risk (attack) probabilities after conducting the software inventorying and identification. Software inventorying shows that George is a Debian Linux 3.0 host with the service Apache 1.3.26 running,

Table 2.6: Risk management results for George

Functionality | Software | Replacement software | Migration cost | Remote risk
database server | MySQL 3.23.50 | MySQL 4.0.5 | $3,000 | $5,700
web server | Apache 1.3.26 | Apache 2.0.35 | $2,000 | $15,200

Table 2.7: Remote risk limits for George

Consequence Type | Remote risk limit | Risk for original software | Risk for new software
Availability | $3,000 | $9,900 | $2,700
Integrity | $3,000 | $9,500 | $0
Confidentiality | $500 | $0 | $0
Process | $8,000 | $27,300 | $5,700
Full | $14,000 | $49,500 | $12,500

tied to a back end running the MySQL 3.23.50 database server. The respective vulnerabilities of these services are shown in Table 2.11. Risk assessment reveals that its attack probabilities are quite high. Table 2.8 shows the data from Susan and the QSRA process so far. The total remote and local attack consequence risk is $96,200 and $61,000, respectively. Susan wants to keep George's functionality. After consulting with management and senior network administration, she sets the acceptable risk limits for George (see Table 2.7). She is not that concerned with insiders (personnel hires trustworthy people), but is concerned with outside attackers. She is willing to spend roughly half of her calculated remote attack risk to change her risk profile, around $50,000. With the risk and cost limits set, Susan invokes the risk management process. The optimizer chooses two new programs, Apache 2.0.35 and MySQL 4.0.5, to substitute for their respective (more vulnerable) predecessors. The vulnerabilities of these new services are shown in Table 2.10. Their migration costs and remote risk are shown in Table 2.6. The risk profile is calculated for these services, as shown in Table 2.9. The total remote and local attack consequence risk is now $20,900 and $0, respectively. This represents a 78% remote risk reduction. All of Susan's requirements and PUSH's organizational needs have thus been met: the costs of migrating to the new software are $5,000, which meets the cost limit of $50,000, and the consequence risk limits have not been exceeded, as shown in Table 2.7. Since functionality is preserved, PUSH's managers are content with the decision as well. If no feasible alternative software makeup had been found, either the migration costs or the risk limits would have had to be relaxed, and the risk management process re-invoked.
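The totals quoted in this example can be recomputed directly from the per-consequence figures in Tables 2.6-2.9 (order: availability, integrity, confidentiality, process, full):

```python
# Recomputing the example's totals from the per-consequence figures.
old_remote = [9_900, 9_500, 0, 27_300, 49_500]
new_remote = [2_700, 0, 0, 5_700, 12_500]
old_total, new_total = sum(old_remote), sum(new_remote)
print(old_total, new_total)                # 96200 20900
print(round(100 * (old_total - new_total) / old_total))  # 78 (% reduction)
print(3_000 + 2_000 <= 50_000)             # migration cost within the limit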



Table 2.8: Risk assessment: Consequence losses and consequence probabilities

Type | Consequence loss for George | Remote attack probability | Remote risk | Local attack probability | Local risk
Availability | $10,000 | 0.99 | $9,900 | 0.95 | $9,500
Integrity | $10,000 | 0.95 | $9,500 | 0.40 | $4,000
Confidentiality | $0 | 0.87 | $0 | 0.40 | $0
Process | $30,000 | 0.91 | $27,300 | 0 | $0
Full | $50,000 | 0.99 | $49,500 | 0.95 | $47,500

Table 2.9: Risk management: Consequence losses and consequence probabilities

Type | Consequence loss for George | Remote attack probability | Remote risk | Local attack probability | Local risk
Availability | $10,000 | 0.27 | $2,700 | 0 | $0
Integrity | $10,000 | 0 | $0 | 0 | $0
Confidentiality | $0 | 0 | $0 | 0 | $0
Process | $30,000 | 0.19 | $5,700 | 0 | $0
Full | $50,000 | 0.25 | $12,500 | 0 | $0

Table 2.10: Vulnerabilities of Apache 2.0.35 and MySQL 4.0.5

Type | Apache 2.0.35 remote | Apache 2.0.35 local | MySQL 4.0.5 remote | MySQL 4.0.5 local
Availability | 3 | - | - | -
Integrity | - | - | - | -
Confidentiality | - | - | - | -
Process | - | - | 2 | -
Full | 1 | - | - | -

Table 2.11: Vulnerabilities of Apache 1.3.26 and MySQL 3.23.50

Type | Apache 1.3.26 remote | Apache 1.3.26 local | MySQL 3.23.50 remote | MySQL 3.23.50 local
Availability | 1 | 1 | 2 | -
Integrity | 1 | - | - | -
Confidentiality | 1 | - | - | -
Process | 2 | 3 | 1 | -
Full | 1 | - | - | -

Chapter 3

Implementation of QSRA system


3.1 Findings

Java as an implementation language proved its worth in cross-platform development. It cannot be stressed enough how much time is saved by the absence of the tedious memory leaks and pointer manipulation that plague C and C++. Text parsing, however, is not Java's forte, and hence QSRA client development took a disproportionate amount of time. The decision to design a relational, as opposed to a flat, database turned out to be very fortunate. Having all data relations in third normal form is advisable, since this enables additions and mutations to be accommodated with few if any changes to the existing design.

3.2 Overview

As a principle, the QSRA application, clients and SQL database were implemented using royalty-free software, with the simplest tools available. The implementation was to be portable and small. As such, the following products were used to implement the QSRA application and client:

Language Sun's Java 1.3.1, ensuring portability and ease of implementation [Mic01b]

Database MySQL AB's MySQL 3.23, the world's most widely used open-source database [AB00]

Tools Victor Abell's lsof 4.47, a software-to-port mapper for Unix, to identify service executables [Abe00]; Foundstone's fport 1.33, a software-to-port mapper for Windows NT/2000/XP, to identify service executables [Fou00]; Michel Berkelaar's lp_solve 2.0, a Java implementation of a mixed integer linear program solver [Ber96]



The current QSRA and database implementation was meant to set up the framework for the methodology, a proof-of-concept software engineering implementation. It took more than eighteen months of developing, testing, redesigning and debugging. The application contains around 6000 lines of source code (200 kB of data), and the client applications around 900 lines (30 kB of data). The whole archived file with the GUI and other helper classes totals 980 kB of data. The database went through a few redesigns before and during implementation. Implementing the schemata was straightforward; thanks to the initial design decision to put all the relations in third normal form, additions and mutations in the schema were easier to make. Around 400 system and 200 vulnerability records were inserted into the prototype ists_icat database.

Not implemented QSRA was first implemented without escalation attacks (negative synergy effects between services and applications) in mind. The necessity to introduce escalation attacks became apparent in the course of the research as more attention was given to vulnerability exploits and their interactions with installed applications. The decision was made to incorporate escalation attacks into the methodology, but to defer implementation in the prototype. A work-around numerical framework, which realized the updated methodology, was designed and implemented using Microsoft Excel.

The prototype QSRA client implementation returns data on services and operating systems only; applications have to be inventoried manually. Well-behaved applications follow recommended installation procedures. In the case of Windows machines, proper installation means that metadata about the installed software is recorded in the Windows Registry under the key HKEY_LOCAL_MACHINE\SOFTWARE. In the case of Linux and BSD machines, there are similar database-like structures (such as packages.rpm for Linux or _pkg-name_ for BSD) that get updated when standard installation procedures are followed properly. As a concomitant of the client not auditing the applications, the prototype QSRA application does not take escalation attacks into account in its risk analysis calculations. The lack of this feature in the prototype did not limit the experimental approach; it just added manual labor to an otherwise automated risk calculation process. Implementation and testing of inventorying applications on a client should take four weeks. Tying it into the QSRA application and testing this feature is more time-consuming and is estimated to take at least two months.

For risk management, the prototype QSRA application implementation does not take time cost constraints into consideration when performing the risk management optimization. The reason for this was the belated realization that time can simply be converted into another $US metric. This avoided the added complication of having to weight two metrics in the formulation of the objective

function.

QSRA software source A prototype of the QSRA implementation, the schemata for ists_icat and networks, and the data for ists_icat can be downloaded at http://actcomm.thayer.dartmouth.edu/~bil/qsra/. The application inventorying module, refinement of the risk calculation formulas, and documentation will be released publicly during Summer 2003.

Table 3.1: QSRA event sequence

Vulnerability Assessment (1) QSRA connects to the assessor installed on each client via TCP ports 4447/4448. (2) The clients (IP-enabled devices) send raw lsof (Unix) or fport (Windows 2000/XP) data about open ports and the programs bound to them. (3) QSRA parses the lsof and fport data and displays the list of services (called here Communication Processes, CPs) for every IP-enabled device inventoried.

Vulnerability Assignment (1) QSRA requests a selection of software (called here Systems or SVPs) from ists_icat. (2) ists_icat returns a list of programs. (3) QSRA displays the list of programs and waits for the user to map CPs to SVPs (e.g., map httpd to Apache 1.3.20). (4) The mapping data is sent to networks. (5) networks stores a Scenario: the CPs and SVPs, as well as the CP-to-SVP, SVP-to-IP and IP-to-network mappings.

Risk Assessment (1) QSRA requests the numeric SVP vulnerability information for the current network scenario. (2) ists_icat returns the vulnerability information. (3) QSRA calculates the IndivRisk, IPRisk and NetworkRisk of the scenario, using data from ists_icat for the probabilities and input from the risk manager for the losses. (4) QSRA then populates the risk tables of networks with a baseline scenario.

Risk Management (1) The risk manager sets the resource and the risk consequence limits. (2) Optimize invokes the optimization process, during which ists_icat is queried numerous times for information. (3) ists_icat is queried against the optimization solution and returns the software makeup for the new scenario. (4) networks returns risk data of a baseline scenario to be used for comparison with other scenarios. (5) QSRA calculates the risk of the alternative scenario, and displays and contrasts the risk measures. (6) QSRA stores the scenario back to networks.



Figure 3.1: Inventorying the network

3.3 QSRA Event Flow

The stylized event flow in Table 3.1 demonstrates the implemented QSRA methodology. The sequence illustrates the detailed steps of inventorying, identifying, risk assessing and risk managing services (remotely accessible running software).

3.3.1 Vulnerability assessment: Software inventorying

QSRA connects, via TCP ports 4447/4448, to the assessor service installed on the client. The small-footprint services listening on port 4448 are Java wrappers and post-processors of two applications: lsof (1), used with Unix-compatible clients, and fport (2), used on Windows NT/2000 clients. Both applications can be downloaded [Abe00][Fou00]. The size of these Java wrappers is roughly 900 lines. The services listening on port 4447 are Java wrappers around two built-in applications, uname (Unix) and ver (Windows). The size of these wrappers is roughly 300 lines. QSRA parses this raw data and displays the list of services (Communication Processes or CPs in QSRA parlance) for every IP-enabled device inventoried. The output of this processing can be seen in the left half of Figure 3.1. The raw outputs of lsof, fport, ver and uname are shown in Figures 3.2 and 3.3, and Table 3.2 on the next page. In addition, the first set of database entries is made in the networks database. A Scenario is created. A Scenario is an actual or potential software network makeup. A network makeup consists of the mapping of CPs to IP-enabled devices, and of IP-enabled devices to a network. The tables that are affected are dNetwork, dScenario, dCalendar, dIP, dOS, dProcess, rScenarioScen2Network, rNetwork2Calendar, rScenarioNetwork2IP, rScenarioIP2OS and rScenarioIP2Process. See appendix B.2 for more details on these tables.
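As an illustration of the kind of post-processing these wrappers perform, the following sketch turns fport-style lines (in the excerpt format shown in the event-sequence overview of Table 3.1) into CP records; the field layout of real fport output may differ, so treat the parser as an assumption rather than a transcription of the Java wrappers.

```python
# Hypothetical post-processor in the spirit of the Java wrappers: turn
# fport-style lines into Communication Process (CP) records.
# Real fport output may include more columns; the layout is an assumption.
def parse_fport(raw: str):
    cps = []
    for line in raw.strip().splitlines()[1:]:   # skip the header row
        process, port, proto, path = line.split()
        cps.append({"name": process, "port": int(port),
                    "protocol": proto.lower(), "path": path})
    return cps

raw = """Process Port Proto Path
svchost 135 TCP svchost.exe
persfw 44334 TCP persfw.exe"""
print(parse_fport(raw))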

Table 3.2: Sample uname and ver output

uname | Linux Dan00 2.2.16-22 #1 Tue Aug 22 16:49:06 EDT 2000 i686 unknown
ver | Microsoft Windows 2000 [Version 5.00.2195]

Figure 3.2: Sample lsof output

Figure 3.3: Sample fport output



Figure 3.4: Specifying the software

3.3.2 Vulnerability assessment: Software identification

The QSRA application displays the inventoried communication processes, as shown at the top of Figure 3.4. At this point, the CPs are mere executable names, like httpd or xinetd, gathered from the IP-enabled devices. They are not yet identified and mapped to any particular program (SysVerPatch or SVP, in QSRA parlance). A SysVerPatch is a program that has been identified down to its patch level. QSRA hence requests a selection of SVPs from ists_icat (1). The database returns a selection of SVPs, with optional port and common-name filters (see appendix B.1 on page 92 for more details) to narrow down the selection, from ists_icat (2). Not every CP has to be mapped to an SVP; CP entries can be deleted by the user if they are not of interest. The user browses through the selection and assigns an SVP to a CP. Each time the button "Mapping xxx to yyy" in the bottom half of Figure 3.4 is clicked, the user's choice is sent to the database networks (3). The updated data tables are dSVP, rScenarioProcess2SVP and rScenarioIP2SVP. See appendix B.2 for more details.



Figure 3.5: Assessing the software risk

3.3.3 Risk assessment

After the software makeup of the network's constitutive IP-enabled devices is completed, risk measures can be calculated for the identified software, the IP addresses where the software resides, and a logical network which contains these IP addresses. Identification of a network is done via a numeric ScenarioID (1). The QSRA application requests the parameters used for these calculations from both ists_icat and networks. Via networks, the QSRA application iterates through the IP listing of the network, gathers the list of software, and proceeds to query ists_icat about their vulnerabilities (2). The results of the risk assessment are written to the networks tables (3) dIndivRisk, dIPRisk and dNetworkRisk and displayed, as shown in Figure 3.5.



Figure 3.6: Managing software risk

3.3.4 Risk management

The risk manager has to set both resource and risk constraints for each IP-enabled device. Intuitively, software migration exacts some resource costs: monetarily, there are acquisition, installation, configuration, training, maintenance and/or upgrade costs, expressed in $US; timewise, there are downtime, installation, configuration, training and/or maintenance costs, expressed in hours. The time costs (hours) can be converted to $US for simplicity. The risk constraints, expressed in $US, denote the maximal costs acceptable in each risk category (1). QSRA uses the constraint data to set up the optimization problem. Optimize subsequently invokes the optimization module, during which the ists_icat database is queried numerous times for alternative programs and their vulnerability information (2). After the optimization module finds a feasible solution (if no feasible solution is found, it just returns the original software makeup), the QSRA application generates an alternative scenario by querying ists_icat against the solution (3). The risk manager then selects a baseline scenario from networks (usually the original) to compare to alternative scenarios for this particular network. Previously stored alternative scenarios may be used as well (4). The QSRA risk server then proceeds to display and contrast the risk profiles. For each IP address in the network, the relative risk improvement over the baseline is displayed (5). Finally, if a generated alternative scenario is deemed acceptable, it can be stored back into networks for future reference (6). A one-page graphical representation of the entire QSRA event sequence is presented in Table 3.1 on page 31.

Chapter 4

Analysis
4.1 Findings

4.1.1 Parameters and Vulnerability Data

Vulnerability lifecycle 58 randomly sampled vulnerabilities, posted over a six-month period, were investigated. These vulnerabilities and their associated timeline events showed that in over 60% of cases where data was available, exploits were obtainable before or at the time the patch was released. Most exploits were obtainable the day the patch was released, and some preceded the patch by as much as two months.

Time window data findings Looking at the time window from discovered vulnerability to posted patch, we find that around 75% of vulnerabilities are patchable within two weeks, and around 90% within 40 days after initial discovery. In terms of the time window, there is no statistically significant difference between locally and remotely exploitable vulnerabilities, nor between vulnerability consequence types. No statistical inference can be drawn regarding fault types. There is no statistically significant difference between open-source and closed-source software. There is, however, a statistically significant difference between security-audited and non-security-audited software.

Attacker expertise Automated exploit tools and straightforward exploit descriptions are readily found on the Internet and make acquisition of attacker skills easy. A corollary of this finding is that one cannot count on

vulnerabilities and associated exploits remaining secret.


4.1.2 Faults and consequences

Extensive vulnerability data were collected for six popular operating systems: Red Hat Linux 7.3, SuSe Linux 8.0, Mandrake Linux 8.2, OpenBSD 3.1, Debian Linux 3.0, and Windows 2000 Professional SP2. The standard, out-of-the-box workstation installation was chosen when installation options were presented. Since the Linux and OpenBSD distributions come with a wealth of applications, MS Office was assumed to be part of the standard out-of-the-box Windows 2000 distribution. Faults were mapped to a vulnerability consequence, a fault type and an operating system. The overwhelming majority of faults lead to availability and full compromise consequences. 75% of the 3-tuples (OS, consequence and fault type) have a count of 0, and 25% fall within a count range of one to four. Within availability and full compromise consequences, around 15% have a count of 2 or less, and around 25% fall within a count range of one to four. Input validation faults are proportionally over-represented. There is a statistically significant difference between consequence types: full compromise consequences are proportionally over-represented. There is no statistically significant fault or consequence proportion difference between the audited hosts.

4.1.3 QSRA empirical results

As implemented, software inventorying time grows linearly with the number of audited hosts and the number of installed software packages. Database throughput to networks becomes a non-negligible factor when there are 1000+ records in the database. There was no statistically significant inventorying time difference found between wireless hosts (bandwidth of 10 Mb/s) and landline hosts (bandwidth of 100 Mb/s). QSRA's risk assessment model calculated that across all the original audited operating systems, four to six months after their respective release dates, the probabilities are very high (66% to 99%) that an attacker can conduct a full compromise, remotely or locally. One or two faults are enough to cause the probabilities to rise to high levels. Almost all, if not all, of the faults have to be eliminated to have an appreciable effect on the risk probabilities. Risk management analysis for remote risk probabilities indicates that, given a moderate fault count, QSRA's highest-risk analytic risk mitigation strategy consistently outperforms the simpler strategy of choosing software with the highest vulnerability count. Highest risk outperforms the undifferentiated



highest count strategy for at least four out of the six tested operating systems and for four out of five fault consequences. The most compelling effects are found on the Windows system, probably due to the comprehensiveness of Windows-style patches. The effects on Linux-family systems are less pronounced. Both strategies have minimal effect on risk probability reduction across all audited hosts when the vulnerability count is high.

4.2 Parameter and vulnerability data sources

A dozen non-manufacturer sources were tapped to obtain the necessary parameters and vulnerability data. Table 2.5 lists them and the specific resources each one provided. Experts' opinions were also solicited to help gauge vulnerability lifecycle parameters. A questionnaire was sent to the maintainers of security-related, moderated discussion groups in order to elicit responses from the security community as a whole (see http://lists.insecure.org/isn/2002/Jan/0144.html for an example). The response was nonexistent. The questionnaire was then mailed more aggressively, directly to about two dozen discoverers of various vulnerabilities whose e-mail addresses were pulled from the sources in Table 2.5. Six partially completed questionnaires were returned. The questionnaire is shown in Table 4.1. A systematic, consistent approach proved necessary to balance incomplete and (sometimes contradictory) information. Categorical data was the easiest to gather with reasonable accuracy, whereas the continuous data proved much harder to obtain. Table 4.2 gives an overview of which sources were used, toward which end, and how conflicting data claims were resolved.

4.2.1 Vulnerability lifecycle

Figure 4.1: Timeline of events in vulnerability lifecycle

Faults have a lifecycle; they are discovered, published, exploited and eventually fixed. Over time, knowledge about the fault deepens and diffuses across the community. It is assumed that there are four salient points in time in the lifecycle of a fault, or vulnerability, that are relevant in the context of this thesis:

t_desc: Theoretical description of the vulnerability (e.g. the discovery of the vulnerability, not widely known except to vendors, elite hackers or security experts)



Table 4.1: Questionnaire: Vulnerability lifecycle and attacker ability

Overview This survey would like to gather data on two questions. The first one is concerned with the time distribution between events in the lifecycle of a vulnerability. The second one is concerned with the ability, as a percentage of the general hacker population, to launch a successful exploit at each of these points in time in the lifecycle of a vulnerability.

Question 1) I have identified 4 events of interest in the lifecycle of a vulnerability:
a) Theoretical description of the vulnerability (e.g. the discovery of the vulnerability, not widely known but to either vendor or elite hacker or security experts)
b) Proof of concept of the vulnerability (e.g. an exploit has been written, but is not widely available because it is not widely posted, or the vulnerability's exploit is an old technique (like cross-scripting, etc.))
c) Popularization of the vulnerability (e.g. the exploit is posted and as such widely available)
d) Countermeasure of the vulnerability (e.g. patch/method is posted and widely available)
A possible timeline of events (other sequences are possible/probable):

Let t_a be the time period between t_desc and t_PoC, t_b the period between t_PoC and t_posted, and t_c the period between t_posted and t_patch. Can you give an estimate for the times t_a, t_b and t_c?

Question 2) At each point in time of these events, a particular skill level is required to take advantage of the vulnerability. Only very skilled hackers can take advantage of a buffer overflow condition at time t_desc, for instance. What percentage of the general hacker population has the skills to exploit a vulnerability at each time point t_desc, t_PoC, t_posted and t_patch?



Table 4.2: Categorical and continuous data sources: decision criteria

Type | Datum | Sources (in order) | Comments
Continuous | Discovery of Vulnerability | SecurityFocus, Manufacturer, ISS | High-profile vulnerabilities have detailed discovery dates, published even in general news sources. Otherwise, across the sources, the majority discovery date was chosen.
Continuous | Proof of Concept of Vulnerability | email, @stake, PacketStorm, NewOrder, SecuriTeam, MARC mailings | Proof of Concept code can be inferred explicitly by comprehensive advisories or by email verification, and implicitly through a thorough theoretical description of an exploit and the threads on discussion groups.
Continuous | Popularization of Vulnerability | SecurityFocus, ISS, @stake, PacketStorm, NewOrder, SecuriTeam | If available, the exploit's file or code comment timestamp was used. If the exploit writer's email is known, he/she was contacted to get a timestamp. Otherwise, across the sources, the majority posted date was chosen.
Continuous | Patch of Vulnerability | patch timestamp, SecurityFocus, ISS, SecurityTracker | If available, the patch code's timestamp was used. Otherwise, across the sources, the majority posted date was chosen.
Continuous | Fraction of attackers who can code an attack from a theoretical description | Survey results | No other study on attacker abilities could be found.
Continuous | Fraction of attackers who launch an attack once code is available | Survey results | Same as above.
Continuous | Time between Discovery of Vulnerability and Posted Exploit | Survey results, calculation from vulnerability lifecycle parameters | The survey results were used in conjunction with empirical research.
Categorical | Identifier type of Vulnerability | CVE, Manufacturer | CVE IDs were preferable. When no CVE number had been assigned yet, the manufacturer's ID was used.
Categorical | Description of Vulnerability | ISS | ISS's comprehensive database IDs are a pithy description of the vulnerability.
Categorical | Fault Type of Vulnerability | ICAT, SecurityFocus, ISS | Across the sources, the majority type was chosen.
Categorical | Consequence type of Vulnerability | ICAT, SecurityFocus, ISS, MARC | The most serious consequence was chosen when multiple consequences were possible. When the consequences were unclear, they were inferred from vulnerability discussions.
Categorical | Access type of Vulnerability | SecurityFocus, ISS | One of three options: local, remote or both.



t_PoC: Proof of concept of the vulnerability (e.g. an exploit has been written, but is not widely available because it is not widely posted, or the vulnerability's exploit is an old technique¹)

t_posted: Popularization of the vulnerability (e.g. the vulnerability information and possibly exploits are posted and are widely available)

t_patch: Countermeasure of the vulnerability (e.g. patch/method is posted and widely available)

Figure 4.1 on page 39 shows a typical vulnerability lifecycle. Other sequences are possible and do occur. Arbaugh, in his case study of three vulnerabilities, suggests that publication and automation of the exploits usually succeed, not precede, the patch release. However, empirical research on 58 vulnerabilities and their associated timeline events (see Table 4.3) showed the following: in 14 cases, patch events preceded availability of automated exploitation; in 14 cases the events coincided in time; in 9 cases availability of automated exploitation tools preceded the patch release; and in 21 cases, no time data was found. Hence in over 60% of the cases where data was available, the suggested order in Figure 4.1 applied. Table 4.3 also shows the paucity of the data available on the time events t_PoC and t_posted, the proof of concept and the popularization of exploits: the times for these events could be discerned for only 5 out of 58 vulnerabilities. Data on t_desc and t_patch, the initial discovery and posted patch events, were more accessible. Hence the analysis will focus on the time window between discovery and posted patches.

Time window from discovery to posted patch Figure 4.2 shows the empirical CDF of the timing data set and a closeup of the region of interest. Figure 4.3 renders the same data in histogram format, with a closeup of a region of interest. The timing data group averages (OS, consequence, fault and access types) can be found in Figure 4.4. Group averages are noted in boldface to the left of the data points, and counts in slightly smaller font to the right of the data points. Negative numbers reflect the cases when a discovered vulnerability was already fixed by a patch that was released earlier for another vulnerability. Plotting the timing data against a normal probability curve (Figure 4.2(b)) reveals the timing data distribution to be non-normal. Hence, a non-parametric ANOVA test, Kruskal-Wallis, was used to analyze both the timing and the fault count data [Dev95, pp. 652-653]. One-way ANOVA (Kruskal-Wallis) was performed on the timing dataset for four factors. The results are shown in Table 4.4. Details on the groupwise comparison can be found in Table C.3.
¹ Old techniques include cross-scripting, illegal IP packets, and the like.

Table 4.3: Vulnerability lifecycle events. For each of the 58 sampled vulnerability IDs (CVE and CAN identifiers), the table lists the dates of a) t_desc, b) t_PoC, c) t_posted and d) t_patch, where known, and the derived intervals t_{desc,PoC}, t_{PoC,posted} and t_{posted,patch} in days.



Table 4.4: ANOVA results of time between discovery and patch [t_desc, t_patch]

Factor | p-value | Detail
OS | 0.002 | Table C.3
Consequence | 0.52 | not significant
Fault | 0.44 | not significant
Access | 0.09 | not significant

Boxplots of the timing dataset and the χ² test results are shown in Figure C.1. For an interpretation of boxplots, see section D.1 on page 124. The descriptive representation in Figure 4.2(c) shows that around 75% of discovered vulnerabilities are patchable within two weeks and around 90% within 40 days after initial discovery. The data exhibits a tight interquartile spread, largely invariant to the grouping criteria, with the exception of a few outliers, ranging anywhere from 60 to 700 days.
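For reference, the kind of test used here can be reproduced with SciPy's kruskal; the day counts below are invented stand-ins, not the thesis timing data set.

```python
# Kruskal-Wallis one-way ANOVA on ranks, via SciPy. The day counts are
# invented stand-ins; the real analysis grouped the [t_desc, t_patch]
# windows by OS, consequence, fault and access type.
from scipy.stats import kruskal

remote = [0, 1, 3, 7, 14, 21, 40, 120]   # days from discovery to patch
local = [0, 2, 5, 9, 30, 60, 90, 700]
stat, p = kruskal(remote, local)
print(f"H = {stat:.2f}, p = {p:.3f}")    # p > 0.05 here: not significant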

4.2.2 Hypotheses

Five claims were investigated.

1. Accessibility: One could expect remote vulnerabilities to be handled more expeditiously than local ones, since less restrictive access increases the attacker pool, thereby putting the network more at risk.

2. Fault type complexity: Certain fault types, due to their nature (reproducibility, programming effort), are more difficult to fix than others. One could expect to find that the time window for vulnerabilities caused by a design, exceptional condition, or race condition error is greater than for those caused by the other faults.

3. OS open-source: There could be a marked time difference between the closed-source OS (Windows) and the open-source OSs (Mandrake, Red Hat, SuSe, Debian, OpenBSD). One could expect the time differential to be smaller for open source.

4. OS security-audited: There could be a marked time differential between the security-audited OS (OpenBSD) and the non-security-audited OSs (Windows, Mandrake, Red Hat, SuSe, Debian). One ought to expect the vulnerability time window for security-audited OSs to be smaller than for the other OSs.

5. Consequence severity: Intuitively, the more serious consequences ought to be dealt with more expeditiously than the less serious ones. One could expect to find that the time window for vulnerabilities leading to full consequence compromises is smaller than the window for the remaining ones.



Figure 4.2: Time between discovery and patch [t_desc, t_patch]. (a) CDF; (b) normal probability plot; (c) CDF, region of interest; (d) normal probability plot, region of interest.

Figure 4.3: Histogram of time between discovery and patch [t_desc, t_patch]. (a) Full data set; (b) region of interest.

Figure 4.4: Group mean comparison of [t_desc, t_patch], with 95% confidence intervals. (a) Access type; (b) fault type; (c) operating systems; (d) consequence types.

Accessibility Figure 4.4(a) shows the two access type group averages and spread. Although the group averages reveal a marked differential between local and remote vulnerabilities (around 25 days), the subsequent χ² test's p-value is around 0.1, which is not significant. This suggests that the 25-day spread is due to the presence of outliers, which is corroborated by the boxplot. The access type hypothesis hence cannot be supported with this data.

Fault type complexity Figure 4.4(b) shows the nine fault type group averages and spread. Although the starkest differential between the fault types amounts to 46 days (between environmental error and exceptional condition errors), the subsequent χ² test's p-value of 0.44 suggests that the difference is due to outliers. The exceptional condition boxplot does indeed show an extreme outlier that skewed the data. Although the test unequivocally rejects the hypothesis with a p-value of 0.44, the sample sizes in the cases of exceptional condition (11 samples), access validation (9 samples), race condition (8 samples), environmental error (3 samples), and configuration errors (2 samples) are too small (n < 20) to reliably draw conclusions. The fault type hypothesis thus cannot be answered reliably with this data.

OS open-source, OS security-audited Figure 4.4(c) shows the six operating system group averages and spread. The first interesting result is the OpenBSD data. The 20 samples that make up this group are positioned exclusively at 0 or 1 days, with an average of 0.1 days. Compared to the runner-up (open-source SuSe, 7 days) or Windows (closed-source, 43 days), this may be telling testimony to the dedication and receptiveness to bug reports on the part of the OpenBSD developers. The subjunctive of the preceding sentence is necessary, since the Windows, Mandrake and SuSe boxplots show some extreme outliers. The χ² test's p-value of 0.002 did suggest significant differences between OSs; however, a glance at the groupwise comparison table C.3 on page 106 shows a statistically significant difference between OpenBSD and Mandrake, but not between any other two OSs. Hence, the source hypothesis cannot be supported by the data. The data does seem to support the security-audited hypothesis; however, one should keep in mind that some Mandrake outliers are on the order of a few hundred days, and that one would have expected to also find significant differences between the other non-security-audited distributions, since much of the code base is shared. In keeping with the general findings of the timing data set distribution, every outlier above 70 days was eliminated and the tests re-run. The p-value dropped to 0.01, which is still significant. With these reservations in mind, the data supports the security-audited hypothesis.

Consequences Figure 4.4(d) shows the five consequence group averages and spread. The most marked differential between group averages amounts to 120 days, and is to be found between



confidentiality and full compromise consequences. The χ² test's high p-value of 0.52 suggests, however, that this differential is due to outliers. The integrity compromise consequence boxplot shows this to be the case, with a prominent 700+ day outlier. Although the test unequivocally rejects the hypothesis at a p-value of 0.52, the sample sizes in the case of integrity and process consequences are too small to draw reliable conclusions. We find a six-day differential between the large-sample consequences, namely full compromise (112 samples, 23 days) and availability compromise (48 samples, 17 days). After eliminating the outliers and rerunning the tests, the difference becomes negligible. Thus the consequence hypothesis cannot be supported by the data.

Survey data vs. empirical results: [t_desc, t_patch] The estimates collected in the survey on vulnerability lifecycles (see Table 4.6 on the following page) may now be contrasted with the empirical results of the timing data set. Two caveats: first, the survey results are based on a total of six respondents, which is insufficient for a statistically significant comparison. Secondly, the survey asked for three time estimates (the time between advisory and proof of concept code, the time between proof of concept code and posted code, and the time between posted code and patch) and assumed that the advisory, proof of concept code, posted code and patch events occurred in this order in the lifecycle of a vulnerability. The three time estimates were added for a point estimate of the time window from posted advisory to posted patch [t_desc, t_patch]. The survey yields the following estimates:

Table 4.5: Experts' estimate for time window from discovered vulnerability to posted patch [t_desc, t_patch]

Time window assumption | Time window size
low | 6 days
median | 11 days
high | 127 days

From the empirical results in Figure 4.2, the low estimate seems too optimistic; only about 65% of the timing set data points are under six days. The high estimate, however, is accurate; around 95% fall within 127 days. The median estimate is so-so; empirically, roughly 75% of vulnerabilities have a patch posted within 11 days. On the whole, the experts got it right enough. However, the overly optimistic lower bound estimate may prove to be a genuine cause for concern, since it bespeaks a critical misperception in the minds of security professionals, both academic and corporate.

4.2.3 Timing data set findings

- 75% of discovered vulnerabilities are patchable within 10 days, and 90% within 40 days after the initial discovery date.
- There is no statistically significant difference between locally and remotely exploitable vulnerabilities.
- There is no statistically significant difference between vulnerability consequence types.
- No statistical inference can be drawn regarding fault types.
- There is no statistically significant difference between open-source and closed-source software.
- There is a statistically significant difference between security-audited and non-security-audited software: OpenBSD's time window is substantially smaller in one pair-wise comparison.

Table 4.6: Overview of questionnaire results

    Parameter                                                               low      median   high
    Time between discovery of vulnerability and proof of concept code      1 hour   1 day    7 days
    Time between proof of concept code and posted code                     2 days   4 days   30 days
    Time between posted code and patch                                     4 days   7 days   90 days
    % of general hacker population capable of exploiting at t_desc         5        10       30
    % of general hacker population capable of exploiting at t_PoC          10       20       30
    % of general hacker population capable of exploiting at t_posted       15       25       40
    % of general hacker population capable of exploiting at t_patch        30       40       70

4.2.4 Attacker expertise

Attacker expertise numbers are hard to obtain. There are studies on hacker motivation for Open Source networks [LWBD02] and on psychological and sociological hacker taxonomies [Rog00], but none on sheer hacker ability. The questionnaire sought to shed some light on this. "Hacker" was defined to be a malicious, computer-savvy attacker. The questionnaire results on hacker ability among the hacker population yield median results ranging from 10% to 40%, depending on the point in time in the vulnerability lifecycle. These numbers are disconcertingly high, but they are based on a very small survey. The results are somewhat corroborated by anecdotal stories of 14-year-old scripters and the relative ease of finding reconnaissance tools, exploit code and detailed descriptions of attacks on the general Internet. Technical ability to conduct attacks is generally easily acquired by motivated individuals.


4.3 Approach

4.3.1 Audited environment

An isolated network of six Gateway laptops, plus a QSRA application host and a database server, was set up. A different popular out-of-the-box OS was installed on every laptop: Red Hat Linux 7.3 (Kernel 2.4.18-3), SuSe Linux 8.0 (Kernel 2.4.10), Mandrake Linux 8.2 (Kernel 2.4.18-6mdk), OpenBSD 3.1 (Kernel OpenBSD 3.1 GEN), Debian Linux 3.0 (Kernel 2.4.18-bf2.4), and Windows 2000 Professional SP2 (5.00.2195) [Red02] [SuS02] [Man02] [Ope02] [Deb02] [Mic01a]. The standard workstation installation was chosen when installation options were presented. Since the Linux and OpenBSD distributions come with a wealth of applications, MS Office was assumed to be part of the standard out-of-the-box Windows 2000 distribution. The database system MySQL 3.23 was installed on a Pentium II 300MHz running Red Hat Linux 7.1. The QSRA application itself ran on a Pentium III 1.6GHz running Windows XP (2002). Each of these IP-enabled devices (laptops, database server and QSRA application host) had its own IP address.

4.3.2 Data and analytical framework

Vulnerability data for the audited hosts was gathered from the advisories on the OS manufacturers' web sites and cross-checked with the sources in Table 2.5. A total of 129 unique vulnerabilities were collected. The total number of security advisories is 171, but due to the shared code base of the Linux distributions, some vulnerabilities apply to more than one operating system. The average rate of advisories per day after release for the Linux family is around 0.26 (0.17 without the Debian outlier), which puts it in midrange between W2k (0.26) and OpenBSD (0.11). It is noted that the advisory rates from Windows to Linux to OpenBSD corroborate the conventional wisdom that security-audited code produces fewer vulnerability advisories than general open-source code, which in turn outperforms the closed-source W2k code. However, it does not necessarily follow from the number of advisories that Windows is more insecure than Linux. Other factors beyond the intrinsic vulnerability occurrence in software may play a role as well: the installed user base, the popularity of operating systems among certain institutions, and the fame factor for discovering vulnerabilities are all potentially significant. Mi2g, in an end-of-year study in 2002, reports [mi202a] that most of the known vulnerabilities affected Windows (44%, with an estimated 80%-95% market share), yet its share of overt attacks was larger (55%). The same is true for Linux (19% vs 30%; the market share for Linux is unknown, even as a rough estimate). Interestingly enough, Apple is an anomaly: with 3% of the market share in installed user base and 2% of vulnerability advisories, it suffered a mere 31 overt digital attacks, or just 0.05%.

Table 4.7: Vulnerability advisories vs attacks in 2002

    Operating system    % of vulnerability advisories    % of attacks
    Windows             44                               54
    Linux               19                               30
    BSD                 9                                6
    MacOS               2                                0.05

Table 4.8: Standard installed software

    Operating System      # of active services    # of installed applications
    Red Hat Linux 7.3     5                       516
    Mandrake Linux 8.2    4                       371
    OpenBSD 3.1           12                      41
    SuSe Linux 8.0        3                       55
    Debian Linux 3.0      14                      301
    Windows 2000 SP2      12                      45

For the purpose of this analysis, a flat database structure, thesisRA, was implemented, into which the vulnerability data was inserted. thesisRA contains a small subset of the ists_icat schema; it focuses on the information needed to perform the risk calculations and further analysis. Its fields include fault type, consequence type, access type, and the four lifecycle event dates. It took around 40-50 hours to gather, investigate and cross-check the vulnerability data, which amounts to an average of 20 minutes per vulnerability. After the data was stored, it was transferred to an Excel spreadsheet, along with the parameter results and the risk calculation formulae described in the QSRA design chapter.

4.4 Vulnerability assessment

The known vulnerabilities of the audited hosts have already been gathered from the manufacturers' and advisory sites. The primary goal of vulnerability assessment, namely the inventorying and identification of software in order to assess its known vulnerabilities, has hence already been achieved. The operating system and the services were inventoried by the QSRA client on the hosts, and the applications were inventoried manually from the Linux and BSD package managers and the Windows Registry. Table 4.8 lists the numbers of software packages found on the audited platforms. Linux distributions in general come preloaded with hundreds of software packages, whereas OpenBSD and Windows have a count that is an order of magnitude smaller. OpenBSD's secure-operating-system philosophy precludes it from having too many software packages pre-installed, and Windows has a lot of functionality already bundled in its monolithic operating-system design.

Table 4.9: Experimental factors

- Host OS (categorical): Is inventorying time significantly influenced by the host OS? Causes for this may be different TCP/IP protocol implementations, or a subtle communication advantage of W2K hosts with Windows XP.
- Connection type (categorical): Is inventorying time significantly influenced by the type of connection (wireless or landline)? A cause for this may be the lesser service quality of wireless connections.
- ARP cache (categorical): Is inventorying time significantly influenced by the IP address already being in the ARP cache of the QSRA application host? ARP requests may be relatively expensive, and may be compounded by spotty wireless service.
- Database size (continuous): How does inventorying time scale with the number of database records in the database server?
- Number of software packages (continuous): How does inventorying time scale with the number of packages on the hosts?
- Number of inventoried hosts (continuous): How does inventorying time scale with the number of hosts inventoried by the QSRA application?

4.4.1 Experiments

Vulnerability assessment consists of two parts, automated software inventorying and assisted identification. These experiments measure the automated software inventorying time. The trials consisted of sets of ten repetitions each, on one to four hosts. Six factors (see Table 4.9), three categorical and three continuous, were varied in order to investigate their effect on the software inventorying time: OS, connection type, ARP cache size, database size, number of inventoried software packages, and number of inventoried hosts. Tables 4.10-4.12 show the timing results of the experiments.
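The trial structure itself is simple to express. A minimal sketch of such a repeat-and-time harness, where inventory_hosts() is a hypothetical stand-in for the QSRA inventorying call:

    import time

    def run_trials(inventory_hosts, hosts, repetitions=10):
        # One trial set: ten repetitions of the automated inventory run,
        # returning the wall-clock time of each repetition in seconds
        times = []
        for _ in range(repetitions):
            start = time.perf_counter()
            inventory_hosts(hosts)
            times.append(time.perf_counter() - start)
        return times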


Table 4.10: Timing experiment: database size, OS (software inventorying time in seconds; ten repetitions per condition)

                    56 packages      558 packages       1058 packages
    Database size   W2k     MDK      W2k      MDK       W2k      MDK
    3000 records    15.7    17.9     163.9    177.9     336.5    337.0
                    16.9    17.2     166.2    170.0     336.0    340.1
                    17.3    17.5     165.9    169.4     334.6    342.5
                    17.0    17.5     165.4    170.2     337.1    341.2
                    17.1    17.0     165.8    170.1     334.3    341.4
                    17.0    17.9     166.6    169.6     332.9    342.3
                    17.4    16.9     166.6    170.1     334.8    338.9
                    17.9    17.4     165.0    168.6     335.0    344.4
                    18.0    18.1     165.0    169.1     333.9    341.7
                    17.5    17.7     166.0    169.7     333.3    340.6
    2000 records    13.1    13.1     128.2    131.1     277.7    264.5
                    12.9    12.8     128.5    136.0     273.9    267.2
                    12.6    12.6     127.2    142.5     273.6    267.4
                    12.6    12.8     124.7    128.4     252.1    267.0
                    12.7    12.5     124.6    134.4     273.4    264.4
                    12.4    12.5     124.1    128.1     271.3    261.3
                    12.7    12.4     124.2    136.0     251.7    268.9
                    12.5    12.6     124.6    129.5     254.3    256.4
                    12.5    12.6     124.5    134.3     255.1    264.2
                    12.5    12.4     124.1    128.5     252.9    266.6
    1000 records    9.4     8.8      91.8     88.9      216.7    192.7
                    9.0     8.8      96.8     89.8      191.6    192.5
                    9.0     8.8      99.7     88.3      208.1    192.2
                    8.9     8.7      112.0    88.6      211.7    193.6
                    9.2     8.8      104.9    88.1      210.9    192.2
                    8.9     8.6      90.7     89.3      211.3    192.2
                    9.0     8.7      99.0     88.7      210.3    192.3
                    9.0     9.3      96.9     89.0      210.1    192.7
                    8.9     8.8      92.5     88.7      213.4    191.7
                    9.0     8.7      102.2    89.7      212.0    193.2
    0 records       3.6     2.6      38.7     38.1      120.7    84.3
                    3.4     2.6      54.5     36.5      94.3     82.1
                    3.5     2.9      65.7     36.2      98.1     82.6
                    3.3     3.0      46.6     35.9      106.7    80.8
                    3.4     2.9      54.6     35.6      95.9     80.2
                    3.3     3.0      44.6     37.1      97.5     82.7
                    3.2     3.5      44.2     35.5      96.6     81.8
                    3.2     2.9      42.1     36.1      103.6    82.6
                    3.3     3.7      38.9     35.8      97.0     81.2
                    3.2     3.0      48.5     35.6      90.7     80.8


Table 4.11: Timing experiment: connection type, ARP cache (time in seconds)

                           ARP flushed                                       ARP full
    landline (100 Mb/s)    3.4 3.46 3.5 3.3 3.8 3.9 3.9 3.9 3.6 3.8 3.7     3.5 3.7 3.4 3.7 4.0 3.7 3.8 3.8 3.5 3.6 3.7
    wireless (10 Mb/s)     3.6 3.5 3.6 3.4 3.5 3.5 3.5 3.8 3.5 3.7 3.6      3.8 3.6 3.5 3.5 3.5 3.4 3.5 3.6 3.5 3.8 4.0

Table 4.12: Timing experiment: database size, number of inventoried hosts (time in seconds)

    Number of            Database size
    inventoried hosts    0 records   1000 records   2000 records   3000 records
    1 (W2k)              3.6         9.4            13.2           15.7
                         3.4         9.0            12.5           16.9
                         3.5         9.0            12.4           17.3
                         3.3         8.9            14.7           17.0
                         3.4         9.2            12.5           17.0
                         3.2         8.9            12.3           17.0
                         3.2         9.0            12.3           17.4
                         3.2         9.0            12.2           17.9
                         3.3         8.9            12.3           18.0
                         3.2         9.0            12.2           17.5
    4                    16.0        40.8           52.0           68.7
                         13.3        40.1           53.3           69.5
                         14.7        39.9           52.9           73.2
                         12.9        39.2           52.0           68.2
                         13.0        39.8           52.8           65.5
                         15.1        39.3           53.3           66.6
                         13.0        39.8           52.3           65.3
                         12.8        39.4           52.4           65.7
                         12.9        39.6           52.3           65.5
                         14.8        40.8           52.7           69.7


[Figure 4.5: Scatterplot: software inventorying time for one host, plotted against the number of software packages and the number of database entries. (a) Windows 2000, R² = 0.8931; (b) Mandrake 8.2, R² = 0.8756.]


[Figure 4.6: Scatterplot: software inventorying time for multiple hosts, plotted against the number of hosts and the number of database entries (R² = 0.87802).]

Table 4.13: Stepwise regression results, two factors: number of software packages, database size

    OS    Factor(s) used    Value   C.I. (α = 0.05)    Root Mean Square Error   R²
    W2K   Database size     0.04    [0.033, 0.045]     34.61                    0.89
          Software count    0.22    [0.198, 0.234]
          Software count    0.22    [0.188, 0.245]     56.27                    0.72
          Database size     0.04    [0.217, 0.5715]    95.56                    0.18
    MDK   Database size     0.044   [0.038, 0.052]     37.84                    0.88
          Software count    0.21    [0.19, 0.228]
          Software count    0.21    [0.177, 0.241]     63                       0.65
          Database size     0.044   [0.0273, 0.062]    94.15                    0.22

Table 4.14: Stepwise regression results, two factors: number of hosts, database size

    Factor(s) used     Value   C.I. (α = 0.05)   Root Mean Square Error   R²
    Database size      0.01    [0.009, 0.013]    7.812                    0.88
    Number of hosts    10.99   [9.66, 12.32]
    Number of hosts    10.99   [8.5, 13.49]      14.66                    0.56
    Database size      0.01    [0.006, 0.015]    18.42                    0.31


4.4.2 Discussion

Hypothesis

Two claims were investigated:

1. Linearity with respect to software package count and number of hosts: Software inventorying time was postulated to grow linearly with software count and number of hosts. This is important for later scaling expansions.

2. Independence with respect to ARP cache, connection type and operating system: Software inventorying time should be relatively independent of connection type (landline, wireless) and operating system. This is important if the inventorying takes place on heterogeneous, dynamic networks.

Linearity

Software package count: The scatterplots for one host in Figure 4.5 were produced by the data in Table 4.10. They seem to indicate, by visual inspection, that the relationship between the timing data and the two dependent factors, database size and software count, is linear. Running the multiple regression for both W2K (Windows 2000 host) and MDK (Mandrake 8.2 host) produces the following results: the coefficient of determination, R², measures 0.89 and 0.88, respectively. This value means that roughly 90% of the observed data variation can be explained by a linear model that includes the two factors. This is a good fit. With just one factor, number of software packages, the model's explanatory power drops to roughly 70%. With just the other factor, database size, it drops to roughly 20%. Table 4.13 shows the regression results. The predictive linear model of software inventorying time for one host is thus O(n + m):

Y_onehost ≈ a + 0.22n + 0.04m    (4.1)

where

    Y_onehost = assessment time of one host, in seconds
    a         = roughly -67, a constant term
    m         = number of database records
    n         = number of software packages

Number of hosts: Software inventorying time also scales linearly with the number of hosts. The scatterplots for multiple hosts in Figure 4.6 were produced by the data in Table 4.12. They, too, seem to indicate, by visual inspection, that the relationship between the timing data and the two dependent factors, number of hosts and database size, is linear. Running the multiple regression produces the following results: the coefficient of determination R² is 0.88. As in the one-host case, this value means that 88% of the observed data variation can be explained by a linear model that includes the two factors. This is a good fit. With just one factor, number of hosts, the model's explanatory power drops to roughly 56%. With just the other factor, database size, it drops to 31%. Table 4.14 shows the regression results. The predictive linear model of vulnerability assessment time for more than one host, with a low software package count on each host, is also O(n + m):

Y_multiplehosts ≈ a + 11n + 0.01m    (4.2)

where

    Y_multiplehosts = cumulative assessment time of multiple hosts, in seconds
    a               = roughly 17, a constant term
    m               = number of database records
    n               = number of hosts, each with a low software package count
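Both fitted models are cheap to evaluate. A minimal sketch using the rounded coefficients from Tables 4.13 and 4.14; the function names are hypothetical and this is not part of the QSRA code base:

    def predict_one_host(n_packages, m_records, a=-67.0):
        # Eq. 4.1: inventorying time in seconds for a single host
        return a + 0.22 * n_packages + 0.04 * m_records

    def predict_multiple_hosts(n_hosts, m_records, a=17.0):
        # Eq. 4.2: cumulative time in seconds, low package count per host
        return a + 11.0 * n_hosts + 0.01 * m_records

    # Rough check against Table 4.10: 558 packages, 3000 records, one W2k host
    print(round(predict_one_host(558, 3000), 1))   # ~175.8 s; measured 163.9-177.9 s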

The key finding of the regression is that it supports the hypothesis that vulnerability assessment time scales linearly with all three continuous factors: number of hosts, database size and software count.

Independence

The factor analysis of the connection type and ARP cache data in Table 4.11 yields the following: There is no statistically significant time difference between auditing wireless hosts (bandwidth of 10 Mb/s) and landline hosts (bandwidth of 100 Mb/s). The probability p is 40% that the variations in the timing data are due to random fluctuations. There is also no statistically significant time difference between the QSRA application server having the ARP cache filled or flushed. The probability p is 58% that the variations in the timing data are due to random fluctuations.

Table 4.15: OS ANOVA: software inventorying timing data (each cell shows the faster host and the F value; * marks a statistically significant difference at α = 0.05)

    Software count   high records   normal records   low records   zero records
    high             W2k 63*        W2k 0.751        MDK 64*       MDK 45*
    normal           W2k 30*        W2k 22*          MDK 19*       MDK 19*
    low              W2k 1.9        W2k 0.044        MDK 8.8*      MDK 7.8*

Table 4.16: Size of software inventory

    Software count    Windows 2000 (size in kB)    Mandrake 8.2 (size in kB)
    56                14                           14
    500               107                          105
    1000              208                          205

Figure C.2 on page 105 shows the boxplot for the connection experiment data. This experiment was done with a high software count on the host and a large database size. The environment was thus set up to maximize the potential for differences by making auditing time as large as the parameters would allow. The wireless data is skewed to the bottom, as can be seen by the median line. This is probably due to the relative fickleness of a wireless connection. In addition, the landline connection is even a bit slower than its wireless counterpart; another indicator that the fluctuations are random. Figure C.2 also shows the boxplot for the ARP experiment data. The title of the plot indicates a low software count on the host and a small database size. Similarly to the connection rationale, this was done to reduce software inventorying time so as to emphasize potential lags in ARP requests, so they would not be drowned out by the processing on the QSRA server and the database lookups/writing. As can be seen by visual inspection, the means are virtually identical for the small testbed network.

Operating system

The factor analysis of the data in Table 4.10 yields the following: There seems to be a statistically significant time difference in favor of Windows 2000 hosts when the hosts have a high software count and the database contains a high to average number of records. This effect is most pronounced when the hosts have a high to normal software count. There seems to be a statistically significant time difference in favor of Mandrake 8.2 hosts when the database contains few or zero records. Again, the effect is most pronounced when the hosts have a high to normal software count. Table 4.15 illustrates this; it denotes the F value and the faster host, with statistically significant differences (α = 0.05) marked with an asterisk. Figures C.3-C.4 show the boxplots for the OS experiment data. The timing differentials are too great and too consistent to be due to random fluctuations. Four factors were investigated to seek an explanation for the inventoried W2K data being written more quickly to the database than the inventoried MDK data when the number of records in the database was high, and conversely, substantially slower when the database record number was low.

Database parameters: Analysis began by looking at server-side caching parameters and queued query execution. There was nothing that could possibly apply to one set of data and not the other. It was concluded that database-side parameters could not be the explanation.

TCP parameters: A possible explanation would have been that the two TCP/IP stack implementations for W2k and MDK use different parameters for flow and congestion control windows, as well as connection timeouts that affect connection setup and transmission time [Tan96, pp. 527-542]. However, the connection bandwidth is too high (11 Mb/s), the RTT of packets too fast (< 1 ms), and the subsequent time to transmit 200 K of data is negligible. After measuring the pure transmission time (before processing) at virtually 0 ms, the conclusion was reached that TCP parameters cannot play any decisive role.

Software count simulator: Since there were not 1000+ services and applications installed on each of the test hosts, two simple programs were used, one for Linux and one for the Windows hosts, to mimic installed services/applications. The program was adapted to both hosts and the experiments repeated. There remained a statistically significant time differential. The software count simulator was not the source of the time difference.

Assessor differences: MDK and W2K use different types of assessors, which transmit slightly different amounts of data back to the QSRA application. This can be seen in Table 4.16: the sizes of the assessor objects vary fairly linearly as a function of the number of software packages, and minimally between the host-specific assessors. The data that gets written out, however, is in an assessor-independent, standardized format. This could not explain the differences either.

Although this is an interesting effect which could be investigated further, it is rather tangential to the analysis of QSRA. Thus, this effect is noted without a satisfying explanation.

Synopsis

The following conclusions are drawn:

- Software inventorying, as implemented in the QSRA software, scales linearly with the number of hosts and with the number of software packages on these hosts. In addition, the QSRA application's interaction with the database is an important time factor.


- There is no statistically significant time difference between auditing wireless hosts (bandwidth of 10 Mb/s) and landline hosts (bandwidth of 100 Mb/s).
- There is no statistically significant time difference between the QSRA application server having the ARP cache filled or flushed.
- There seems to be a statistically significant time difference in favor of Windows 2000 hosts when the hosts have a high software count and the database contains a high to average number of records. There also seems to be a statistically significant time difference in favor of Mandrake 8.2 hosts when the database contains few or zero records. There is no satisfying explanation for this to date.

4.5 Risk Assessment

4.5.1 Risk calculations

The vulnerability data was imported into an Excel spreadsheet and then processed with Matlab for further analysis [Mat00]. Attack probabilities were calculated for the audited hosts, using the calculation parameters in Tables 4.17 and 4.18. The estimates for the time window and the attacker abilities were based on the survey and adjusted for the individual fault types. The consequence probabilities of the standard OS installations according to the QSRA model are shown in Table 4.19. These are the probabilities that a motivated attacker can use some fault to cause the given consequence. It is noted that four to six months after their respective release dates, the probabilities are very high across all the audited operating systems (66% to 99%) that an attacker can conduct a full consequence compromise, remotely and locally, if the audited systems are left unpatched. Systems with independent components in series experience a rapid growth in cumulative failure probability given the failure of just a few components. Figure 4.7 illustrates this for a ten-component system, varying the independent failure probability of a single component. With a component failure probability of just 0.2, three faults suffice to raise the system's failure probability to 50%. Greater component failure probabilities push the system's failure probability quickly into the 70%-90% range. QSRA's exploit probabilities are in the 10%-70% range, depending on the fault type (see Table 4.17), which means that one or two faults are enough to cause the consequence probabilities to rise to 50% and more. Hence, almost all if not all of the faults have to be eliminated to have an appreciable effect on the consequence risk probabilities.
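The series-system arithmetic behind this observation is compact. A minimal sketch of the relationship plotted in Figure 4.7:

    def series_failure(p, k):
        # Failure probability of k independent components in series, each
        # failing independently with probability p
        return 1.0 - (1.0 - p) ** k

    print(round(series_failure(0.2, 3), 3))   # 0.488: three faults at p = 0.2
                                              # already push the system near 50%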


[Figure 4.7: Failure probability of independent components in series. Panels (a)-(d) plot the system failure probability against the number of components (0-10) for single-component failure probabilities of 0.8, 0.6, 0.4 and 0.2, respectively.]

Table 4.17: Parameters: Attacker ability (proportion of general attacker population)

    Fault          with automated tool    without automated tool
    Access         .4                     .2
    Bound          .3                     .1
    Buffer         .3                     .1
    Config         .4                     .2
    Design         .4                     .2
    Env            .4                     .2
    Exceptional    .4                     .2
    Input          .7                     .3
    Race           .3                     .1

Table 4.18: Parameters: Time window from discovered vulnerability to posted exploit

    Fault          Days
    Access         20
    Bound          27
    Buffer         27
    Config         15
    Design         15
    Env            20
    Exceptional    27
    Input          15
    Race           27

4.5.2 Fault and consequence distributions

The breakdown of faults and consequences by OS can be found in Table C.9. From this, distributions of the proportions over operating systems, faults and consequences were calculated (see Tables 4.20-4.25) and submitted to analysis. Figure 4.8 shows the empirical CDF plot of the 3-tuple (OS, consequence, fault type) counts from the data in Table C.9 on page 118. Faults were mapped to a vulnerability consequence, a fault type and an operating system. The overwhelming majority of fault occurrences lead to availability and full compromise consequences. 75% of the 3-tuples (OS, consequence, fault type) have a count of 0, and around 25% fall within a count range of one to four. Within availability and full compromise consequences, around 15% have a count of one or two, and around 25% fall within a count range of one to four. The fault count dataset is non-normally distributed, as can be seen from the histogram and normal probability plot in Figure 4.8. Hence, a nonparametric one-way ANOVA (Kruskal-Wallis) was performed over the fault count dataset for three factors. The appropriate boxplots and the χ² test results are shown in Figure C.5. Details on the group-wise comparison of the fault count data can be found in Tables C.4 to C.6 on page 113, with the significant mean differences highlighted in bold face.
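A minimal sketch of such a test, assuming SciPy is available; the group values below are illustrative placeholders, not the fault count data:

    from scipy.stats import kruskal

    # Fault counts grouped by fault type (toy values)
    input_validation = [4, 3, 2, 1, 3, 0]
    boundary         = [1, 0, 0, 2, 0, 1]
    race             = [0, 0, 1, 0, 0, 0]

    H, p = kruskal(input_validation, boundary, race)
    print(H, p)   # a small p value indicates fault type is a significant factor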


Table 4.19: OS consequence probability calculation results

    Operating System   Consequence       Remote consequence probability   Local consequence probability
    Debian 3.0         Full              0.99                             0.91
                       Process           0.91                             -
                       Integrity         0.95                             0.40
                       Confidentiality   0.87                             0.40
                       Availability      0.86                             0.58
    Mandrake 8.2       Full              0.99                             0.98
                       Process           -                                0.40
                       Integrity         -                                0.30
                       Confidentiality   0.64                             -
                       Availability      0.96                             0.58
    OpenBSD 3.1        Full              0.89                             0.66
                       Process           0.51                             0.70
                       Integrity         -                                -
                       Confidentiality   -                                0.70
                       Availability      0.90                             -
    RHL 7.3            Full              0.99                             0.96
                       Process           0.40                             0.40
                       Integrity         -                                -
                       Confidentiality   0.64                             -
                       Availability      0.96                             -
    SuSe 8.0           Full              0.98                             0.98
                       Process           -                                -
                       Integrity         -                                -
                       Confidentiality   -                                -
                       Availability      0.86                             0.30
    W2k SP2            Full              0.99                             0.93
                       Process           0.89                             0.64
                       Integrity         0.40                             0.64
                       Confidentiality   0.92                             -
                       Availability      0.99                             0.40

Table 4.20: Proportion of OS to consequence

    Consequence       Debian 3.0   Mandrake 8.2   OpenBSD 3.1   RHL 7.3   SuSe 8.0   W2k SP2
    Availability      0.20         0.18           0.09          0.18      0.13       0.22
    Confidentiality   0.25         0.17           0.08          0.17      -          0.33
    Full              0.14         0.27           0.09          0.19      0.16       0.16
    Integrity         0.50         0.13           -             -         -          0.38
    Process           0.25         0.13           0.25          -         -          0.38

Table 4.21: Proportion of consequences to OS

    Consequence       Debian 3.0   Mandrake 8.2   OpenBSD 3.1   RHL 7.3   SuSe 8.0   W2k SP2
    Availability      0.29         0.22           0.31          0.27      0.29       0.29
    Confidentiality   0.10         0.05           0.08          0.07      -          0.11
    Full              0.42         0.68           0.62          0.60      0.71       0.43
    Integrity         0.13         0.03           0.06          0.03      -          0.09
    Process           0.07         -              -             -         -          0.09

Table 4.22: Proportion of consequences to fault

    Fault     Availability   Confidentiality   Full   Integrity   Process
    Acc       0.29           -                 0.43   0.29        -
    Bound     0.33           -                 0.67   -           -
    Buffer    0.28           -                 0.72   -           -
    Config    0.67           -                 0.33   -           -
    Design    0.14           0.28              0.36   0.11        0.11
    Env       0.50           -                 0.50   -           -
    Except    0.82           0.09              0.09   -           -
    Input     0.09           0.05              0.68   0.09        0.09
    Race      -              -                 0.89   -           0.11

Table 4.23: Proportion of faults to consequence

    Fault     Availability   Confidentiality   Full   Integrity   Process
    Acc       0.04           -                 0.03   0.25        0.13
    Bound     0.36           -                 0.34   -           -
    Buffer    0.18           -                 0.22   -           -
    Config    0.04           -                 0.14   -           -
    Design    0.11           0.83              0.01   0.50        0.50
    Env       0.02           -                 0.01   -           -
    Except    0.20           0.08              0.16   -           -
    Input     0.04           0.08              0.09   0.25        0.25
    Race      -              -                 -      -           0.13

Table 4.24: Proportion of OS to fault (fault order per column: Acc, Bound, Buffer, Config, Design, Env, Except, Input, Race; empty cells were dropped in the source layout)

    Debian 3.0:                   0.29  0.08  0.17  0.19  0.18  0.45  0.11
    Mandrake 8.2 / OpenBSD 3.1:   0.25  0.10  0.28  0.14  0.33  0.25  0.50  0.09  0.14  0.05  0.33  0.22
    RHL 7.3:                      0.14  0.19  0.21  0.33  0.14  0.18  0.18  0.22
    SuSe 8.0:                     0.23  0.10  0.03  0.50  0.09  0.14  0.11
    W2k SP2:                      0.57  0.15  0.10  0.33  0.39  0.45  0.05

Table 4.25: Proportion of faults to OS (fault order per column: Acc, Bound, Buffer, Config, Design, Env, Except, Input, Race; empty cells were dropped in the source layout)

    Debian 3.0:                            0.06  0.13  0.16  0.23  0.06  0.32  0.03
    Mandrake 8.2 / OpenBSD 3.1 / RHL 7.3:  0.00  0.00  0.03  0.32  0.38  0.30  0.22  0.31  0.20  0.03  0.03  0.24  0.17  0.08  0.03  0.07  0.08  0.08  0.13  0.08  0.15  0.07
    SuSe 8.0:                              0.00  0.52  0.14  0.05  0.05  0.05  0.14  0.05
    W2k SP2:                               0.11  0.20  0.09  0.03  0.40  0.14  0.03

[Figure 4.8: Fault count data. (a) Empirical CDF of the fault counts; (b) histogram of the fault counts; (c) normal probability plot of the fault count data against the normal line.]

Table 4.26: ANOVA results of proportions of faults, OS and consequences

    Factor                    p-value   Detail
    Faults vs OS              0         Table C.6
    Faults vs consequences    0.134     not significant
    OS vs faults              0.547     not significant
    OS vs consequences        0.484     not significant
    Consequences vs OS        0         Table C.5
    Consequences vs faults    0.004     Table C.4

Hypotheses

Four claims were investigated:

1. Fault type effects: Certain fault types could be proportionally more prevalent across operating systems. Similarly, certain fault types may be significantly more responsible for fault consequences. Input validation error, boundary condition and buffer overflow errors are posited to be proportionally over-represented across OS and fault consequences.

2. Consequence type: The fault count leading to certain consequences may be proportionally more prevalent than others, since the severity of the consequence may determine the attention given to such faults by testers, hackers, and developers. One could expect full compromise consequences to have a higher proportional fault count than the other consequences.

3. OS open-source: There could be a marked proportional fault count difference between the closed-source (Windows) and open-source OSs (Mandrake, Red Hat, SuSe, Debian, OpenBSD). The differential could go either way.

4. OS security-audited: There could be a marked proportional fault count difference between the security-audited (OpenBSD) and non-security-audited OSs (Windows, Mandrake, Red Hat, SuSe, Debian). One could expect the fault count proportions for the security-audited OS to be different than for the other OSs.

Faults

The top two boxplots in Figure C.5 show the spread of the nine fault types, controlled for OS and consequences. After controlling for OS, we find that the χ² test's p value of near zero strongly indicates that fault type is a significant factor. Looking at the mean rank comparison in Table C.6, we find that it is the boundary condition fault type which differs significantly from the configuration, access and environmental errors. The empirical fault data from Table 4.27 ranks boundary condition dead last in fault count, however. After some investigation, this contradiction is believed to be due to differing classification criteria for vulnerabilities caused by input validation errors and its subtypes, boundary condition and buffer overflow errors.

Table 4.27: Fault type breakdown 2002 [oSN02]

    Fault type                     2002 - local   proportion   2002 - remote   proportion
    Input Validation Error         87             0.27         517             0.41
    (Boundary Condition Error)     3              0.01         17              0.01
    (Buffer Overflow)              59             0.18         199             0.16
    Access Validation Error        27             0.08         76              0.06
    Exceptional Condition Error    7              0.02         98              0.08
    Environment Error              4              0.01         4               0
    Configuration Error            18             0.06         48              0.04
    Race Condition                 16             0.05         6               0
    Design Error                   101            0.31         282             0.23
    Other                          0              0            1               0

The differences between those three categories are subtle, and at times not clear-cut. When the boundary and buffer errors are subsumed under input validation, the contradiction disappears. When controlled for consequences, the χ² test's p value of 0.134 rejects the claim that certain fault types contribute more to consequences. However, after subsuming the subtypes under input validation, it is found that input validation errors represent a significantly higher proportion than the other errors, controlled for both OS and consequences. The data, after some rearrangement, does support the claim that input validation errors are significantly over-represented in vulnerabilities across operating systems and consequences.

Consequences

The bottom two boxplots in Figure C.5 show the spread of the fault count for the five consequence types, controlled for OS and fault type. After controlling for OS, we find that the χ² test's p value of near zero strongly indicates that consequence type is a significant fault count factor. Table C.5 shows that full consequence significantly outranks the integrity, confidentiality and process consequences. Controlling now for fault type, the χ² test's p value of 0.00382 again suggests that consequence type is a significant fault count factor. Table C.4 ranks full consequence over confidentiality consequence, but not over integrity and process consequence (albeit very narrowly; the lower bounds for their respective rank differences are -1 and zero). This is sufficiently close to conclude that the data supports the claim that a significantly higher proportion of faults result in full consequence compromises, controlling for both OS and fault types.

OS open-source, security-audited

The middle two boxplots in Figure C.5 show the spread of the fault proportions of the six operating systems, controlled for fault and consequence type. The results are surprising, because they go against popular computer lore. Both χ² tests' p values are so high (0.547 and 0.484, respectively) that they reject both the security-audited and the open-source hypothesis. The data


does not support the contention that operating system, after controlling for fault type and consequences, is a significant fault proportion factor.

Fault data set findings

Faults were mapped to a vulnerability consequence, a fault type and an operating system.

- The overwhelming majority of fault occurrences lead to availability and full compromise consequences. 75% of the 3-tuples (OS, consequence, fault type) have a count of 0, and around 25% fall within a count range of one to four. Within availability and full compromise consequences, around 15% have a count of one or two, and around 25% fall within a count range of one to four.
- Input validation faults are proportionally over-represented.
- There is a statistically significant difference between consequence types: full compromise consequences are proportionally over-represented.
- There is no statistically significant fault or consequence proportion difference between operating systems. The proportions are roughly the same.


4.6 Risk Management

Remotely accessible software constitutes the primary entry point for security compromises by outside attackers. Hence, the goal of this analysis section is to evaluate competing strategies to mitigate this risk. Software, as noted earlier, exhibits varying counts of vulnerabilities of differing consequences. Two vulnerability metrics can be used to decide which software should be worked on: highest count and highest risk. Highest count consists of a fast patching approach, where the goal is to patch the software exhibiting the highest vulnerability count, regardless of the consequences. Highest risk, on the other hand, is a slower approach where a preceding analysis (one of the goals of QSRA) pinpoints the software with the highest risk exposure. There are of course any number of strategies to compare to highest risk, but the salient difference between highest risk and highest count, namely the rank ordering of the vulnerabilities, is emphasized in this juxtaposition. Risk profiles were calculated in Excel for the audited operating systems (Red Hat Linux 7.3, SuSe Linux 8.0, Mandrake Linux 8.2, OpenBSD 3.1, Debian Linux 3.0, and Windows 2000 Professional SP2) and aggregated into three operating system families: Windows, Linux, and BSD. For both the families and the individual operating systems, two scenarios were contrasted with their original risk profile: highest count, in which the top three buggiest software packages, as measured by highest vulnerability count, were patched; and highest risk, in which the top three buggiest software packages, as measured by QSRA's analytic highest risk metric, were patched. Since the implementation of QSRA at this point in time did not take escalation attacks into account, and the vulnerability database did not contain the requisite estimated 3000 software packages to effectively run the optimization algorithm, the optimization was done manually. "Patched", then, in this context means that the vulnerabilities in the software packages were neutralized by deducting the faults that caused them. The risk calculation parameters used can be found in Tables 4.17 and 4.18. The consequence loss magnitudes are contingent on the data residing on, and the functionality of, the machine, and as such have to be set individually by the risk manager for the constitutive hosts under scrutiny. In this analysis, full consequence losses were presumed to be more severe than confidentiality, integrity and availability losses (see Table 4.28). The scenario results can be found in Tables 4.29-4.32.
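The difference between the two strategies reduces to the sort key used to rank the software. A minimal sketch on a toy inventory, using a magnitude-weighted vulnerability sum as a simplified stand-in for QSRA's analytic risk metric; the package names and counts are illustrative, not the audited data:

    MAGNITUDE = {"full": 5, "process": 4, "integrity": 3,
                 "confidentiality": 2, "availability": 1}   # Table 4.28

    # vulnerabilities per software package, by consequence (toy data)
    inventory = {
        "pkgA": ["availability"] * 6,
        "pkgB": ["full", "full"],
        "pkgC": ["confidentiality"] * 3,
        "pkgD": ["full", "integrity", "process"],
    }

    by_count = sorted(inventory, key=lambda s: len(inventory[s]), reverse=True)
    by_risk  = sorted(inventory,
                      key=lambda s: sum(MAGNITUDE[c] for c in inventory[s]),
                      reverse=True)

    print(by_count[:3])   # ['pkgA', 'pkgC', 'pkgD']: raw count favors pkgA
    print(by_risk[:3])    # ['pkgD', 'pkgB', 'pkgA']: severe consequences rise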


Table 4.28: Consequence magnitudes

    Consequence        Magnitude (severity)
    Full               5
    Process            4
    Integrity          3
    Confidentiality    2
    Availability       1

Table 4.29: Individual OS remote consequence risk probability scenario results

    Operating      Vulnerability      Original   highest   highest   % change in probability
    System         Consequence                   count     risk      highest count   highest risk
    Debian 3.0     Full               0.99       0.99      0.99      0               0
                   Process            0.91       0.70      0.70      -23             -23
                   Integrity          0.95       0.82      0.40      -13             -58
                   Confidentiality    0.87       0.78      0.87      -10             0
                   Availability       0.86       0.58      0.86      -32             0
    Mandrake 8.2   Full               0.99       0.99      0.98      0               -2
                   Process            -          -         -
                   Integrity          -          -         -
                   Confidentiality    0.64       0.64      0.64      0               0
                   Availability       0.96       0.95      0.95      -2              -2
    OpenBSD 3.1    Full               0.89       0.67      0.53      -24             -40
                   Process            0.51       0.51      0.51      0               0
                   Integrity          -          -         -
                   Confidentiality    -          -         -
                   Availability       0.90       0.79      0.79      -12             -12
    RHL 7.3        Full               0.99       0.99      0.99      0               -1
                   Process            0.40       0.40      0.40      0               0
                   Integrity          -          -         -
                   Confidentiality    0.64       0.64      0.64      0               0
                   Availability       0.96       0.93      0.93      -4              -4
    SuSe 8.0       Full               0.98       0.90      0.51      -8              -48
                   Process            -          -         -
                   Integrity          -          -         -
                   Confidentiality    -          -         -
                   Availability       0.86       0.71      0.86      -18             0
    W2k SP2        Full               0.99       0.98      0.93      -2              -7
                   Process            0.89       0.70      0.70      -22             -22
                   Integrity          0.40       0.40      0.00      0               -100
                   Confidentiality    0.92       0.92      0.40      0               -57
                   Availability       0.99       0.91      0.94      -8              -6


Table 4.30: Individual OS remote vulnerability count scenario results

    Operating      Vulnerability      Original   highest   highest   % change in vulnerability count
    System         Consequence                   count     risk      highest count   highest risk
    Debian 3.0     Full               9          7         7         -22             -22
                   Process            2          1         1         -50             -50
                   Integrity          3          2         1         -33             -67
                   Confidentiality    4          3         4         -25             0
                   Availability       5          2         5         -60             0
    Mandrake 8.2   Full               16         12        10        -25             -38
                   Process            0          0         0
                   Integrity          0          0         0
                   Confidentiality    2          2         2         0               0
                   Availability       6          5         5         -17             -17
    OpenBSD 3.1    Full               5          2         1         -60             -80
                   Process            2          2         2         0               0
                   Integrity          0          0         0
                   Confidentiality    0          0         0
                   Availability       6          4         4         -33             -33
    RHL 7.3        Full               12         9         8         -25             -33
                   Process            1          1         1         0               0
                   Integrity          0          0         0
                   Confidentiality    2          2         2         0               0
                   Availability       8          6         6         -25             -25
    SuSe 8.0       Full               8          4         2         -50             -75
                   Process            0          0         0
                   Integrity          0          0         0
                   Confidentiality    0          0         0
                   Availability       5          3         5         -40             0
    W2k SP2        Full               14         6         5         -57             -64
                   Process            3          1         1         -67             -67
                   Integrity          1          1         0         0               -100
                   Confidentiality    5          5         1         0               -80
                   Availability       9          4         5         -56             -44


Table 4.31: OS family average remote risk probability scenario results

    OS family   Vulnerability      Original   highest   highest   % change in probability
                Consequence                   count     risk      highest count   highest risk
    Linux       Full               0.99       0.97      0.87      -2              -12
                Process            0.33       0.28      0.28      -16             -16
                Integrity          0.24       0.21      0.10      -13             -58
                Confidentiality    0.54       0.52      0.54      -4              0
                Availability       0.91       0.79      0.90      -13             -1
    W2k         Full               0.99       0.98      0.93      -2              -7
                Process            0.89       0.70      0.70      -22             -22
                Integrity          0.40       0.40      0.00      0               -100
                Confidentiality    0.92       0.92      0.40      0               -57
                Availability       0.99       0.91      0.94      -8              -6
    OBSD        Full               0.89       0.67      0.53      -24             -40
                Process            0.51       0.51      0.51      0               0
                Integrity          0.00       0.00      0.00
                Confidentiality    0.00       0.00      0.00
                Availability       0.90       0.79      0.79      -12             -12

Table 4.32: OS family average remote vulnerability count scenario results

    OS family   Vulnerability      Original   highest   highest   % change in vulnerability count
                Consequence                   count     risk      highest count   highest risk
    Linux       Full               45         32        27        -29             -40
                Process            3          2         2         -33             -33
                Integrity          3          2         1         -33             -67
                Confidentiality    8          7         8         -13             0
                Availability       24         16        21        -33             -13
    W2k         Full               14         6         5         -57             -64
                Process            3          1         1         -67             -67
                Integrity          1          1         0         0               -100
                Confidentiality    5          5         1         0               -80
                Availability       9          4         5         -56             -44
    OBSD        Full               5          2         1         -60             -80
                Process            2          2         2         0               0
                Integrity          0          0         0
                Confidentiality    0          0         0
                Availability       6          4         4         -33             -33

Hypothesis

The claim under investigation is one of analysis primacy. Two vulnerability metrics can be used to decide which software should be worked on: highest count and highest risk. Highest count consists of a fast patching approach, where the goal is to patch the software exhibiting the highest vulnerability count, regardless of the consequences. Highest risk, on the other hand, is a slower approach where a preceding analysis pinpoints the software with the highest risk exposure. Risk management claims that, given a limited set of resources (in this case, the patching of only three programs), risk analysis with subsequent patch recommendations is a more effective risk mitigation strategy than just highest count patching.

Consequence breakdown

For full consequences, highest risk performs as well as or better than highest count across all operating systems. Two operating systems exhibit little or no improvement under either risk mitigation strategy. The reason for this is the large number of full consequence vulnerabilities in the Debian and Mandrake standard installations (nine and sixteen, respectively), which remained large even after the three most vulnerable software packages were patched. For process consequences, both strategies yielded similar risk probability improvements. The small number of process consequence vulnerabilities across the OSs (three in Windows being the highest count) is the likely cause. The same reasoning holds for the similar integrity consequence results. For confidentiality consequences, the data are mixed: highest count outperforms highest risk for Debian, the reverse is true for Windows, and the other OSs exhibit no change in the probabilities. The reasons are threefold: the small number of confidentiality vulnerabilities (other OSs), the relatively low confidentiality consequence loss magnitude (Debian), and the high number of vulnerabilities ameliorated by one patch (Windows). For availability consequences, with a relative consequence loss magnitude ranking of 1, highest risk performs as well as or better than highest count in four of six cases. The exceptions are SuSe and Debian. Both exhibit the smallest availability consequence counts in the standard installation (five in both cases) across the OSs, so the relative effect of two or three patched vulnerabilities is disproportionately large. The data supports the contention that QSRA's analytical highest risk management strategy performs as well as or better than highest count in roughly 80% of consequence cases. It consistently outperforms highest count for the higher potential loss consequences of full, process and integrity vulnerabilities. Both strategies are ineffective when the vulnerability count is so high that even a substantial vulnerability count elimination has no noticeable effect on the risk exposure probability. This can be seen, for instance, for the full consequence in W2K, where both strategies reduce the vulnerability count by roughly 60%, yet the risk probability is reduced by a negligible two to seven percent.


OS breakdown

Across the individual OSs, highest risk generally performs as well as or better than highest count in the majority of consequence cases. Conclusions are most robust for the Windows system, due to the relatively large number of vulnerabilities across all consequences. The effect is pronounced: every consequence risk probability except the availability consequence improved more with QSRA's analytical risk strategy. However, closer inspection reveals the sobering fact that although the vulnerability count decreased by roughly 60%, the risk probability decreased by one order of magnitude less. The picture for the various Linux flavors is more nuanced, but can still be firmly positioned in the highest risk camp, the notable exception being SuSe. The reason seems to be a relatively small vulnerability count that is distributed between just two consequences, availability and full, which lie at the two opposite ends of the consequence magnitude spectrum. OpenBSD, on the other hand, seems to confirm the hypothesis as well, although, again, due to its relatively small vulnerability count, the conclusion is more tentative. The power of QSRA's ranked risk analysis and management is also highlighted by the fact that for five out of the six operating systems, the remaining vulnerability count is higher for the highest count than for the highest risk strategy. This counterintuitive result becomes less confusing when one recalls that the goal of this analysis was to reduce the risk of remotely accessible vulnerabilities: whereas highest count indiscriminately patched remotely and locally accessible software, QSRA focussed on the remotely accessible software vulnerabilities. Generally speaking, the data supports the contention that QSRA's analytical highest risk management strategy performs as well as or better than highest count in at least four out of the six tested operating systems. The tested Windows system has shown the most pronounced difference, probably due to the comprehensiveness of Windows-style patches. Both strategies have minimal effect on risk probability reduction across all OSs when the vulnerability count is high.

Risk management findings

Recapitulating, the following conclusions are drawn:

- One or two faults are enough to cause the risk probabilities to rise to 50% or more. The risk probabilities are hence not appreciably reduced by half-hearted measures: fixes should cover almost all, if not all, of the faults.
- In the presence of a moderate fault count, QSRA's highest risk analytic risk mitigation strategy consistently outperforms the undifferentiated highest count strategy across four out of five consequences, assuming the given consequence magnitude ranking. Highest risk also outperforms the undifferentiated highest count strategy in at least four out of the six tested operating systems.


- The most compelling effects are found on the Windows system, probably due to the comprehensiveness of Windows-style patches. The effects on the Linux-family systems are less pronounced. Due to the paucity of vulnerabilities in the tested BSD system, no conclusion can be drawn.
- Both strategies have minimal effect on risk probability reduction across all audited hosts when the vulnerability count is high.

4.7 Possible improvements

The current implementation of QSRA can naturally be improved. As mentioned in the Implementation chapter, the application inventorying module and the refinement of the risk calculation formulas in the QSRA application are the first steps toward synchronizing the prototype implementation with the full design. Further improvements needed before a production-quality implementation of QSRA can be released are described below.

4.7.1 Design

Parameter uncertainty

QSRA's risk analysis calculations are based jointly on published information about software vulnerabilities, which is reasonably accurate, and on parameters that were inferred from empirical research and solicited expert opinion. While this approach may be sufficient for a proof-of-concept design, for practical deployment the issue of parameter uncertainty, both epistemic and stochastic, needs to be addressed. Epistemic uncertainty refers to the lack of knowledge about the risk parameters that characterize the QSRA model, whereas stochastic uncertainty refers to the variability of the parameters [Vos00, pp. 149-193]. In the present incarnation, these parameters consist of the time event parameters that describe vulnerability lifecycles (especially [t_desc, t_posted]), the proportion parameters that denote an attacker's manual and automated ability to take advantage of a vulnerability, and the loss functions the risk manager has to set for the various vulnerability consequences. The parameter variability can be more accurately modelled by replacing the point probabilities with continuous/discrete distribution functions. In addition, to reflect and mitigate the epistemic uncertainties, Bayesian inference methods can be employed [Hec95]. Bayes' theorem is stated below in Equation 4.3.


f(θ|X) = π(θ)l(X|θ) / ∫ π(θ)l(X|θ) dθ    (4.3)

where X is the observed data, π(θ) is the prior distribution, l(X|θ) is the likelihood function, and f(θ|X) is the posterior distribution. This approach will be demonstrated with the time window [t_desc, t_patch], which denotes the time window in days from discovered vulnerability to posted patch. Prior to any observed data X, the initial prior distribution of θ is assumed to be uniform over [0, 120] (following the experts' estimation bracket in Table 4.5):

    π(θ) = 0.008   for 0 ≤ θ ≤ 120
           0       otherwise

The empirical vulnerability data for [t_desc, t_patch] (see Figure 4.2 on page 45) yields the following l(X) distribution:

    l(X) = 0.65    for 0 ≤ X ≤ 6
           0.07    for 6 < X ≤ 11
           0.23    for 11 < X ≤ 120
           0.05    otherwise

We can now calculate the posterior distribution f(θ|X) using the observed data. For 0 ≤ θ ≤ 6, the prior mass is (6 - 0)/(120 - 0) = 0.05 and the likelihood l(X|θ) is 0.65; hence the posterior mass is

    (0.05 × 0.65) / (0.05 × 0.65 + 0.048 × 0.07 + 0.908 × 0.23 + 0 × 0.05) = 0.133.

The posterior distribution (and possibly the new prior distribution) over the whole range of θ is thus:

    f(θ|X) = 0.133   for 0 ≤ θ ≤ 6
             0.014   for 6 < θ ≤ 11
             0.852   for 11 < θ ≤ 120
             0       otherwise

The use of distribution functions combined with Bayesian inference can fruitfully tackle the issue of stochastic and epistemic parameter uncertainties. As a result, the calculated risk metric will yield a value range rather than the current risk value point. Changes in the QSRA application have to be made, both in the GUI and in the database schemes.
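The update is easy to verify numerically. A minimal sketch of the discrete form of Equation 4.3, using the bin masses exactly as given in the worked example above:

    # Prior mass and empirical likelihood per bin: [0,6], (6,11], (11,120] days
    prior      = [0.05, 0.048, 0.908]
    likelihood = [0.65, 0.07, 0.23]

    unnorm    = [p * l for p, l in zip(prior, likelihood)]
    evidence  = sum(unnorm)
    posterior = [round(u / evidence, 3) for u in unnorm]
    print(posterior)   # [0.133, 0.014, 0.853] (0.852 in the text; rounding)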

The design, implementation and testing are estimated to take eight to twelve weeks.

Encryption, authentication and non-repudiation

Currently, communication streams are in cleartext. For a production version, all communication streams between QSRA servers, clients and databases have to be encrypted and authenticated. Encryption has to be built into the QSRA application and QSRA client software. Sun offers the Java Cryptography Extension [Mic01b, /products/jce/], which uses SSL to encrypt the TCP data streams. Encryption is especially important for wireless networks, since the built-in WEP protection schemes are rarely used and weak [BGW01]. Authentication can be implemented via the Kerberos network authentication protocol. A free implementation of this protocol is available from the Massachusetts Institute of Technology [oT03]. After vulnerability assessment has been performed, software inventory data is date/time stamped and MD5 hashed. Some changes need to be made to the QSRA application, clients and the database. The encryption segment should take no more than five weeks to implement and test, since the JCE should do the encryption transparently. The communication between the database server and the QSRA client has to be investigated before implementation (http://www.securityfocus.com/infocus/1667 is a starting point) and will take three weeks longer.
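The MD5 hashing of the inventory data is straightforward. A minimal sketch of the hashing step (the function name is hypothetical), which the push model sketched in Chapter 5 could reuse for baseline comparisons:

    import hashlib

    def inventory_hash(packages):
        # Canonicalize (sort) before hashing so that package order does not
        # change the digest; only genuine inventory changes do
        blob = "\n".join(sorted(packages)).encode("utf-8")
        return hashlib.md5(blob).hexdigest()

    baseline = inventory_hash(["openssh-3.4", "apache-1.3.26"])
    current  = inventory_hash(["apache-1.3.26", "openssh-3.4"])
    print(current == baseline)   # True: same inventory, different order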

4.7.2 Implementation

Software inventory comprehensiveness

In its present version as a prototype, QSRA can automatically inventory services and operating systems. An obvious, if not straightforward, extension would be application inventorying. For this to be feasible, installed software should follow recommended installation procedures. In the case of Windows machines, proper installation means that meta-data about the installed software is recorded in the Windows Registry under the key HKEY_LOCAL_MACHINE\SOFTWARE. In the case of Linux and BSD machines, there are similar database-like structures (such as packages.rpm for Linux or _pkg-name_ for BSD) that get updated when standard installation procedures are followed properly. QSRA's software inventorying can then parse these databases to draw up the list of installed software. However, not all software installations follow protocol, which makes it harder to detect purposefully installed, potentially vulnerable software (malicious, purposefully hidden software like a trojan is another matter altogether). To address this issue, software inventorying needs to include a sweep function that peruses the file system (akin to a virus scanner), flags executables of varied stripes and returns their location and filename for further investigation. As an example for Linux (following http://www.imsb.au.dk/~mok/linux/doc/RedHat-CD-5.html), the following piped commands find all files in the directory $DIR that are executables, as well as Perl scripts:

    find $DIR -type f | file -f - | grep -v RPM | egrep -i 'executable|perl'

This is a supplemental addition to the application auditing implementation of the QSRA client. With some research, the design, implementation and testing cycle should take two to three weeks.

Software identification automation

After the software inventorying has taken place, the executables have to be identified as specific software (see Figure 3.4 on page 34). Currently, this tedious process is done in a manual or assisted way: QSRA can query the vulnerability database for software whose attributes match items in the software inventory list. In the current version, there are two such attributes: executable name and active port, if applicable. As an example, if an executable httpd is found listening on port 80, QSRA will suggest that this is most likely an Apache web server. However, even given that fact, it cannot glean from this information the version and patch level of the software, which is necessary to calculate an accurate risk profile. In the worst case, the inventory datum will find no match in the vulnerability database and no assistance for correct identification is offered. To enhance the database's quality of assistance and perhaps fully automate the software identification process, the vulnerability database needs to be populated with more attributes about the software. These would primarily include the file size of the executable, but may also include ancillary identifying filenames and the names of libraries used. As an example, consider an Apache web server running on a Windows machine. Current software inventorying yields an executable named Apache.exe listening on port 80, which is not enough to identify the version. As an extension, QSRA's software inventorying could also return information about the support libraries used by Apache.exe. In Table 4.33, the file size of APACHECORE.DLL is sufficient to correctly identify version 1.3.27 of the web server, if that information were stored in the vulnerabilities database.

Table 4.33: Relevant libraries used by Apache 1.3.26 and Apache 1.3.27

    Module Name        File Size   Path
    APACHE.EXE         20480       e:\apache26\
    APACHECORE.DLL     352256      e:\apache26\
    APACHE.EXE         20480       e:\apache27\
    APACHECORE.DLL     356352      e:\apache27\

Suitable tools for Windows and Linux (akin to fport for services) need to be found and incorporated into the QSRA clients. Some changes have to be made to the ists_icat database schema, as well as to the QSRA GUI and matching logic. This is all estimated to take, from design to testing, twelve to sixteen weeks.
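A minimal sketch of such a lookup, with the two Table 4.33 library entries as stand-ins for vulnerability-database records; the dictionary and function are hypothetical, not the ists_icat schema:

    # (library name, file size in bytes) -> software version
    KNOWN_LIBRARIES = {
        ("APACHECORE.DLL", 352256): "Apache 1.3.26",
        ("APACHECORE.DLL", 356352): "Apache 1.3.27",
    }

    def identify(filename, size):
        return KNOWN_LIBRARIES.get((filename.upper(), size), "unknown")

    print(identify("ApacheCore.dll", 356352))   # Apache 1.3.27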

Chapter 5

Future Work
This chapter sketches design changes and additions that could usefully be incorporated into QSRA.

5.1 Database update

The timeliness and accuracy of the information in the ists_icat database (which holds detailed information about the software and its associated vulnerabilities) is vital to QSRA's risk model calculations. CERT reported over four thousand new vulnerabilities in 2002 [CC03]. In QSRA's current incarnation, ists_icat has to be manually populated. This has been calculated to take at least 12 man-hours/week, and up to 70 man-hours/week. The reasons are twofold: The vulnerability database uses data from the manufacturers and a dozen other sources (see Table 2.5 on page 25) with mutable, non-standardized formats. This data is sometimes faulty, often contradictory. In addition, some information extraction does not lend itself to automated parsing. For instance, sometimes it is necessary to read an exploit description to figure out what fault causes what vulnerability consequence, when the vulnerability was first discovered, and so on. The rate of eighty vulnerabilities/week, as well as the tedium of transcription and its inevitable errors, warrants some form of automation. For automation to be feasible, changes have to be made both at the data source end and at the QSRA end. At the data source end, information has to be moved from being buried in descriptive text into parsable, marked fields, analogous to other data such as vulnerability consequence, fault type, etc. Specifically, vulnerability lifecycle information and affected programs/libraries currently have to be painstakingly extracted from the text. Sometimes, vulnerability lifecycle information is not listed at all and has to be inferred from exploit source code, mailing list threads and other discussion fora. Secondly, it would be desirable that access be given to the raw data, as in the case of ICAT, whose database content is downloadable in several output formats [oSN02].
Figure 5.1: Vulnerability database input mask

Alternatively, if raw data cannot be shared, a documented external query interface would be helpful. After (more or less) complete data from the divergent data sources have been assembled, two additional steps are necessary on the QSRA end. First, source-specific filters have to be written to translate the data fields into an ists_icat-standardized input format. Secondly, contradictory information has to be reconciled. This can be achieved by a weighted majority rule process, in which a majority count determines which value for a specific field is treated as authoritative. The scrubbed data can subsequently populate the database or be processed with an input mask to check for omissions and plausibility (see Figure 5.1). All this requires contact and coordination with the maintainers of these vulnerability websites. In the best case, if they make their databases accessible and put the relevant information into parsable fields, design, implementation and testing could be done in two months. A semantic markup language like XML would be ideal to present the relevant information in a parsable fashion. What is sorely missing on these sites, though, is vulnerability lifecycle information. This information is the hardest to find and extract.
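As a sketch of the weighted majority rule reconciliation step just described, consider the following; the source weights are hypothetical placeholders, not values proposed in this thesis.

    # Illustrative weighted majority vote over contradictory field values.
    from collections import defaultdict

    WEIGHTS = {"vendor": 3.0, "icat": 2.0, "mailing-list": 1.0}  # hypothetical

    def reconcile(claims):
        """claims: list of (source, value) pairs for one database field.
        Returns the value backed by the largest total source weight."""
        tally = defaultdict(float)
        for source, value in claims:
            tally[value] += WEIGHTS.get(source, 1.0)
        return max(tally, key=tally.get)

    # reconcile([("vendor", "2002-06-17"), ("icat", "2002-06-17"),
    #            ("mailing-list", "2002-05-30")]) yields "2002-06-17"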

5.1.1 Vulnerability Assessment

Push model

Once an attack has been identified as taking place, the time window to prevent damage to systems and networks is small, on the order of minutes, and likely to get smaller still. Hence, one of the ideas behind QSRA is preemption: to get a feeling for the technical weaknesses of the software on a network before an attack takes place. A natural corollary is rapidly reacting to changes in the network and its constitutive software. Addition of software, hosts and other IP-enabled devices may severely affect the network's risk exposure and should be incorporated into the risk profile as soon as possible. In the current version, software inventorying follows a pull model: the QSRA server contacts the hosts' software inventory clients during the vulnerability assessment stage to gather information on the software. To address the problem of changing software make-up on already identified IP-enabled devices, the software inventory client will be made more autonomous by additionally implementing a push model: the client autonomously compares an MD5 hash of the previous inventory with the current software inventory at regular time intervals (on the order of minutes). Upon finding a discrepancy - meaning software has been added, deleted, or altered - the new data is pushed to the QSRA server, which subsequently recalculates the risk profile.
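A minimal sketch of this client-side baseline check follows; inventory_lines() and push_to_server() are hypothetical stand-ins for QSRA client internals.

    # Illustrative push-model loop: hash the inventory, push on discrepancy.
    import hashlib
    import time

    def digest(lines):
        """MD5 over a canonically ordered software inventory."""
        md5 = hashlib.md5()
        for line in sorted(lines):
            md5.update(line.encode("utf-8"))
        return md5.hexdigest()

    def watch(inventory_lines, push_to_server, interval=300):
        baseline = digest(inventory_lines())
        while True:
            time.sleep(interval)            # on the order of minutes
            current = digest(inventory_lines())
            if current != baseline:         # software added, deleted or altered
                push_to_server(inventory_lines())
                baseline = current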

It is not as straightforward to identify newly added IP-enabled devices on the network. A drastic measure would be a network-wide ping sweep: the QSRA server initiates a ping sweep at periodic intervals to smoke out newly connected devices. However, this technique clogs bandwidth to an unacceptable degree and can trigger false positives on intrusion detection systems deployed on the network. A more feasible solution is to decentralize the problem by enhancing the QSRA vulnerability assessment client. After the QSRA server computes its initial risk profile, it transmits part of the network makeup, namely a list of IP addresses of risk-profiled devices, back to the clients. This list constitutes a baseline. The newly enhanced vulnerability assessment client incorporates a traffic packet sniffing module (such as tcpdump [Lab03]), which monitors just the IP addresses of the traffic passing by and compares them to the baseline. If a discrepancy is found, the client contacts the QSRA server. Detection of changed software makeup on the client requires rewriting the client and parts of the QSRA application. This is a substantial change and will take up to four months. Detection of newly added IP addresses requires an additional four weeks on top of that.

Covert channel detection
Currently, in design and implementation, QSRA's software inventorying client only catalogues port-bound (TCP, UDP) remotely accessible software. It is possible for software not bound to a TCP or UDP port to listen to ICMP, NCP/IPX, AppleTalk, ATM/AAL, and other traffic. Communication streams over unusual protocols or IP packet specifications may constitute covert channels. In the best case, these channels relay legitimate network-responsive software traffic. In the worst case, they may indicate Trojan infiltration (for the use of ICMP for Trojan control, see Loki [Nor99, pp. 246-247]). In order to attempt to flush out these potential listeners on covert channels, both the server and client QSRA software, as well as the overall QSRA procedure, need substantial revisions and enhancements. Procedurally, following the first iteration of the QSRA process (vulnerability assessment, risk assessment and risk management), another iteration, this time focused on non-TCP/UDP software, is necessary. On the QSRA application server side, specially crafted non-UDP/TCP traffic (ICMP packets, IPX packets) is generated and sent to every member of the inventoried list. On the QSRA client side, modules are deployed that listen and respond to such traffic. The idea is to induce software residing on IP-enabled devices listening on these covert channels to process the traffic before the client-side modules do. The absence of a response from the client betrays the presence of covert listeners on the network to the QSRA application server. Graham and Beyah propose techniques to detect invisible trojans and sniffers which may be adaptable for non-IP traffic [Gra00][BHC02]. Alternatively, detecting covert listeners could also be done by an enhanced intrusion detection system. An estimate of the effort needed to design and implement this feature cannot be given at present, since more research has to be done on the applicability and adaptability of previously implemented detection schemes.
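One way the server-side probe and client-side responder could interact is sketched below, assuming the scapy packet library (not part of the QSRA prototype) and raw-socket privileges; a cooperating client module is expected to echo a magic payload, so silence flags a host for closer inspection.

    # Illustrative ICMP probe for the covert-listener check described above.
    from scapy.all import ICMP, IP, Raw, sr1  # assumes scapy is installed

    MAGIC = b"QSRA-PROBE"

    def probe(host, timeout=2):
        """True if the host's client module echoed the magic payload."""
        packet = IP(dst=host) / ICMP() / Raw(load=MAGIC)
        reply = sr1(packet, timeout=timeout, verbose=0)
        return reply is not None and MAGIC in bytes(reply)

    def suspects(hosts):
        """Hosts that stayed silent are flagged for further investigation."""
        return [h for h in hosts if not probe(h)]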

5.1.2 Risk Assessment

Lifecycle of vulnerability addition

The lifetime of a vulnerability was optimistically assumed to be finite, in the sense that once a patch had been issued, the particular vulnerability would no longer pose a threat. It was also assumed that the fix has only positive effects. Events such as the Slammer worm in January 2003 [AfIDA03] and Microsoft Exchange's vulnerability [CC03, VU149424] have shown these assumptions to be false. The Slammer worm exploited a vulnerability for which a patch had been available for over six months [CC03, VU484891].

Table 5.1: Revised vulnerability lifecycle

Theoretical description of vulnerability: Discovery of the vulnerability; not widely known except to the vendor, elite hackers or security experts.
Proof of concept of vulnerability: An exploit has been written, but is not widely available because it is not widely posted, or the vulnerability's exploit is an old technique (like cross-site scripting).
Popularization of vulnerability: The vulnerability is officially acknowledged; exploits are posted and as such widely available.
Countermeasure of vulnerability: Patch and/or containment method is posted and widely available.
Neutralization of vulnerability: Patch is actually applied/installed.
Resilience of vulnerability: Patch either does not work, and/or the patch introduces new vulnerabilities.

Microsoft Exchange 2000's patch was effective only after the third release; the first patch exhibited severe errors, the second one contained outdated files [Sec02, bid/2832/solution/]. Accordingly, the vulnerability lifecycle has been revised with two new events (see Table 5.1). Neither event has an effect on risk assessment, since both are immaterial to the risk profile calculation of the inventoried software. However, risk management, i.e. the risk profile calculation of putative networks with differing constitutive software, may be affected. Resilience to patches introduces a probabilistic element of risk increase into the risk management process. The possibility that solutions may actually increase the risk exposure has to be taken into account. A possible approach is outlined here. First, through vulnerability trend analysis of existing data, the resilience probabilities for patches have to be empirically established. These can be controlled for variables of varying granularity, from low-level software specificity (e.g. the MS Office line) to manufacturer specificity (e.g. Microsoft) to development specificity (closed vs. open source). Further analysis will determine the most useful differentiating criteria. The goal is to obtain empirical parameters that can be used for a new constraint (5.5) in the risk management optimization problem. The formulation below is for one IP-enabled device.

\begin{align}
\min_{x} \quad & e^{\top} V x && (5.1)\\
\text{subject to} \quad & F x \ge f_{rhs} && (5.2)\\
 & V x \le v_{rhs} && (5.3)\\
 & R x \le r_{rhs} && (5.4)\\
 & P x \le l_{rhs} && (5.5)
\end{align}

such that $x$ is binary. As before, $x$ is the indicator vector over a software set, normally pre-filtered by OS compatibility. The square $|x| \times |x|$ matrix $P$ is the vulnerability patch resilience matrix; each entry $P_{ij}$ denotes the probability $p_{ij}$ that substitution of software $i$ for software $j$ will fail. $l_{rhs}$ holds the failure probability limits, which can be set by the risk manager. The data for the trend analysis have yet to be researched, and the right granularity is an open question. This preliminary research is estimated to take three to four weeks. The incorporation into the QSRA application and the changes in the ists_icat risk schema should take no more than four to six weeks.
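A toy instance of this binary program is sketched below with the PuLP modelling library (an assumption made purely for illustration; the prototype itself used lp_solve [Ber96]). The instance collapses the matrices to a single device choosing one web server, and all numbers are invented.

    # Illustrative binary program in the spirit of (5.1)-(5.5); data invented.
    import pulp

    software = ["apache", "iis", "thttpd"]
    risk  = {"apache": 0.4, "iis": 0.7, "thttpd": 0.2}   # stands in for e^T V
    resil = {"apache": 0.1, "iis": 0.3, "thttpd": 0.05}  # a row of P
    l_rhs = 0.2                                          # failure probability limit

    prob = pulp.LpProblem("qsra_risk", pulp.LpMinimize)
    x = pulp.LpVariable.dicts("x", software, cat="Binary")

    prob += pulp.lpSum(risk[s] * x[s] for s in software)            # objective (5.1)
    prob += pulp.lpSum(x[s] for s in software) == 1                 # functionality: one web server
    prob += pulp.lpSum(resil[s] * x[s] for s in software) <= l_rhs  # resilience (5.5)

    prob.solve(pulp.PULP_CBC_CMD(msg=False))
    print({s: int(x[s].value()) for s in software})   # expect thttpd chosen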

Escalation exploits

In its current version, QSRA's model incorporates the concept of negative synergy between vulnerabilities with a basic escalation exploit framework: Two single-host attack sequences are taken into account, Remote2Local-User2Root and Remote2Local-Remote2Root. However, more complex attack sequences involving multiple dependencies between multiple hosts are not modelled (see Ammann's example of two hosts with four vulnerabilities [AWK02, p. 221]). The next step in QSRA's design is to broaden the escalation attack modelling. This entails generalizing both within the consequence dependencies on one host and across multiple hosts. To capture the salient characteristics of general escalation attacks, four new vulnerability consequence types are defined in Table 5.2. The exploit probability definition also needs to be revised. In the current design, $q_f(t)$ denotes the probability that a motivated attacker will be able to exploit fault type $f$ at time $t$ (see Eqn. 2.6 on page 19). This measure needs to be made more granular in order to incorporate the attack complexity necessary to exploit the vulnerability. A few characteristics of an attack complexity measure are outlined below.

Table 5.2: New vulnerability consequence types

Protection consequence: Compromising protection mechanisms. This includes but is not limited to disabling and bypassing antivirus software, firewalls, and intrusion detection systems.
Audit consequence: Compromising audit mechanisms. This includes but is not limited to bypassing/disabling OS event logs, firewall logs and router logs.
Security policy consequence: Compromising security policies. This includes but is not limited to bypassing and disabling OS-enforced read, write, access, modify and delete policies.
Authentication consequence: Compromising authentication procedures. This includes but is not limited to bypassing and disabling authentication services (such as Kerberos, NTLM (NT LAN Manager), SSL (TLS), and Distributed Password Authentication (DPA)).

In general, an attack complexity measure should incorporate the number of general attack sequence steps (such as writing a file, modification of access controls, etc.), the number of hosts involved (some attacks involve up to three hosts), the initiation source (attacker or victim), and attacker expertise. The quantified attack complexity can then be incorporated into the exploit probability $q_f(t, C)$, where $C$ is the attack sequence complexity (Eqn. 5.6).

\[ q_f(t, C) = p_{automated}(t, f, C) \cdot prop_{tool}(f, C) + prop_{manual}(f, C) \tag{5.6} \]

While attack complexity should incorporate some elements of attack graphs and templates (such as sequences in time and pre-conditions [SPEC01][PS98][AWK02]), it should strive to strike a balance between succinctness and detail. Excessive detail may not necessarily contribute significantly to the modelling of the risk probability metric. A succinct attack complexity metric would be a fruitful research topic. Attack complexity is difficult to distill into a numeric measure and will be an important component of future research.
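By way of illustration only, such a measure might combine the characteristics listed above in a simple additive score; the weights below are hypothetical and are not a metric proposed in this thesis.

    # Purely illustrative attack-complexity score C.
    def complexity(steps, hosts, victim_initiated, expertise):
        """steps: attack-sequence steps; hosts: hosts involved (1-3);
        victim_initiated: True if the victim must act; expertise: 1 (novice)
        to 5 (expert) required of the attacker."""
        score = steps + 2 * (hosts - 1) + expertise
        if victim_initiated:
            score += 3   # the attacker must also lure the victim
        return score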

5.1.3 Risk Management

Autonomous risk management

Some vulnerabilities are so egregious and obvious (like identification of a known Trojan), and the reaction time window to contain putative damage to the network so small (Slammer infected at least 75,000 hosts, more than 90% of vulnerable hosts, in ten minutes [AfIDA03, outreach/papers/2003/sapphire/sapphire.html]), that autonomous risk management, i.e. automatic measures on the inventoried clients' side, may make sense. The process can be outlined as follows. Upon positive identification of malicious software on a host, the QSRA client software activates a defense module whose goal is to immediately reduce the risk exposure and contain the spread of malware. The defense module may block a vulnerable port (akin to a software firewall) or suspend/stop the offending software's execution thread, thereby effectively slowing or containing it (a minimal sketch of such a blocking action closes this section). The client software also contacts the QSRA server, which may induce other QSRA clients' defense modules to take protective measures, such as blocking an offending port. Small-footprint firewalls that are adaptable on the fly and incorporable into the QSRA client have to be identified and assessed. Control changes in the QSRA application and on the QSRA clients have to be made, which are estimated to take eight to twelve weeks from design to testing.

Topology modification

So far, QSRA's risk management options are limited to replacing and/or removing software on the IP-enabled devices, subject to cost, functionality, and risk limit constraints. There may be another way: Instead of replacing software components, we may rearrange them; that is, changes in the network topology may result in a decrease of overall risk. As an example, in a static network, a firewall may not decrease risk inside the perimeter as effectively as it would if it were placed outside the perimeter. Hence, in addition to software addition, replacement and deletion, topology changes on the network may yield tangible risk benefits (e.g. the proper positioning and deployment of firewalls and routers in the perimeter). This presupposes information on the network topology. Network mapping algorithms (Chun [CMSW97], Siamwalla [SSK98]) and topology discovery and visualization tools (Otter and Skitter [AfIDA03]) could be used. Revisions to the risk measure will be necessary that take topological information, i.e. the relative locations of the IP-enabled devices, into account. This is all subject to further research. Since this direction has barely been researched by the author, its required effort and even its feasibility in the QSRA context are still open questions. No estimate can be given at this time.
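As a minimal sketch of the port-blocking action mentioned under autonomous risk management above (assuming a Linux host with iptables; the actual defense module would use whatever host firewall is available):

    # Illustrative defense-module action: drop inbound traffic to a port.
    import subprocess

    def block_port(port, protocol="tcp"):
        """Append an iptables rule dropping inbound traffic to the port."""
        subprocess.run(
            ["iptables", "-A", "INPUT",
             "-p", protocol, "--dport", str(port), "-j", "DROP"],
            check=True)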

Appendix A

Numeric analysis
A.1 Regression

The general additive multiple regression model is as follows: with $Y$, the dependent variable, being predicted by more than one predictor variable $x_1, x_2, \dots, x_k$, we have [Dev95, pp. 550-562]:

\[ Y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_k x_k + \epsilon \tag{A.1} \]

where $\epsilon$ denotes a random disturbance distributed $N(0, \sigma^2)$.

The objective is to build a linear additive first-order model that reasonably approximates the relationship between $Y$ and one or more predictor variables.
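A tiny worked example of fitting the model by ordinary least squares, on made-up data (using numpy rather than the Matlab routines employed in the thesis):

    # Fit Y = b0 + b1*x1 + b2*x2 by least squares; data invented.
    import numpy as np

    x1 = np.array([1.0, 2.0, 3.0, 4.0])
    x2 = np.array([0.0, 1.0, 0.0, 1.0])
    y  = np.array([2.1, 4.0, 5.9, 8.2])

    X = np.column_stack([np.ones_like(x1), x1, x2])   # design matrix
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)      # [b0, b1, b2]
    print(beta)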

A.2 ANOVA

Devore describes the standard one-factor ANOVA model [Dev95, p. 651]: For $I$ populations, samples of size $J_i$ are collected. We have

\[ X_{ij} = \mu_i + \epsilon_{ij}, \qquad i = 1, \dots, I, \quad j = 1, \dots, J_i \tag{A.2} \]

where the $\epsilon_{ij}$ are random disturbances distributed $N(0, \sigma^2)$.

The objective is to find out whether data from several groups share a common mean, and whether the population factor disturbs the additivity property of the model. This is normally done with an F test statistic [Dev95, pp. 396-400]. The Kruskal-Wallis rank sum test relaxes the assumption that the $\epsilon_{ij}$ be normally distributed and requires only that the $\epsilon_{ij}$ have the same continuous distribution.
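For reference, the Kruskal-Wallis test as used throughout Appendix C can be reproduced on made-up samples with scipy (standing in for the Matlab routines used in the thesis):

    # Kruskal-Wallis rank sum test on three invented groups.
    from scipy.stats import kruskal

    group_a = [3, 5, 8, 13, 21]
    group_b = [2, 4, 4, 6, 9]
    group_c = [7, 11, 14, 20, 30]

    statistic, pvalue = kruskal(group_a, group_b, group_c)
    print(statistic, pvalue)   # a small p-value indicates differing group medians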

Appendix B

Databases
The table naming convention is as follows: table names starting with a d hold data. This means that they contain text strings, port numbers, loss values, and so on, but no foreign keys. Names starting with an r hold relationships: they contain exclusively foreign keys, i.e. indices into data tables. Names starting with dr hold both keys and data, and are used when the distinction between data and foreign key is not that sharp.
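The convention can be made concrete with two data tables and one relationship table from Table B.1; the sketch below uses sqlite3 purely for illustration (the prototype used MySQL [AB00]).

    # Illustrative d/r naming convention: data tables and a key-only relation.
    import sqlite3

    db = sqlite3.connect(":memory:")
    db.executescript("""
        CREATE TABLE dSys   (SysID INTEGER PRIMARY KEY, Name TEXT);
        CREATE TABLE dClass (ClassID INTEGER PRIMARY KEY, Name TEXT);
        CREATE TABLE rSys2Class (SysID INTEGER, ClassID INTEGER);
    """)
    db.execute("INSERT INTO dSys VALUES (1, 'Apache')")
    db.execute("INSERT INTO dClass VALUES (1, 'web')")
    db.execute("INSERT INTO rSys2Class VALUES (1, 1)")

    print(db.execute("""
        SELECT dSys.Name, dClass.Name FROM rSys2Class
        JOIN dSys   ON dSys.SysID     = rSys2Class.SysID
        JOIN dClass ON dClass.ClassID = rSys2Class.ClassID
    """).fetchone())   # ('Apache', 'web')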

B.1 ISTS_ICAT

The purpose of the ists_icat database is to provide data and options for the program assignment, risk assessment and risk management. There are roughly three functional aggregates of tables:

System Those tables describing software program data
Vulnerabilities Those tables describing vulnerabilities
Relations Those tables linking the two previous ones

System This set of tables describes and classifies software programs (known as Systems) and classifies the communication processes gathered from the network. The difference is that a communication process is regarded as an unidentified System.

Table B.1: System aggregate

dClass: Functionality of systems. Around fifty categories (e.g. OS, web, router, application server, news, mail, name, file, time, dhcp, database, printer, log, authentication, IDS). Fields are ClassID (int) and Name (string).
dSys: Names of systems (like MS Word, Apache, Lotus Notes). Hundreds of entries. Fields are SysID (int) and Name (string).
dSysVer: Names of versions of systems (like Apache 1.3, Windows 95). Hundreds of entries. Fields are SysVerID (int) and Name (string).
dSysVerPatch: Names of patches of system versions (like Linux 2.4.1, NT 4.00.1381). Dozens of entries likely. Fields are SysVerPatchID (int) and Name (string).
dLibrary: Holds library entries (like Kernell32.DLL, libcrypt.so.1); dozens to hundreds of entries likely. Fields are LibraryID, name (string) and version (string).
dCommonPort: Common port numbers. Hundreds of entries. Fields are CommonPortID (int), Port (int), and Protocol (string).
rCommonPort2SysVer: Maps ports to systems (like 913 to SideCar, 21 to wuftpd and FTPEasy). Fields are CommonPort2SysVerID, SysVerID, CommonPortID. Uses tables dCommonPort and dSysVer.
dCommonName: Common names of communication processes (e.g. httpd, java for a web server). Fields are CommonNamesID (int) and name (string).
rCommonNames2SysVer: Maps common names to system names (like httpd to Apache or java to Tomcat). Fields are CommonNamesID, SysVerID. Uses tables dCommonNames and dSysVer.
rSys2Class: Maps systems to a class (like Apache to web server, or eftp to file). Fields are SysID and ClassID. Uses tables dClass and dSys.
rSysVerOrder: Specifies the order of versions within a system (like Windows NT, 2000, XP), since they cannot be presumed to follow lexicographical order. Fields are SysID, SysVerID, Ordering (real number, so we can sort and insert at will). Uses tables dSys and dSysVer.
rSysVerPatchOrder: Specifies the order of patches within a version, since they cannot be presumed to follow lexicographical order. Fields are SysVerID, SysVerPatchID, Ordering (real number). Uses tables dSysVerPatch and dSysVer.
dGetInfo: Method for how to get info (i.e. httpd -v, or go to Control Panel-System, or type ver). Fields are GetInfoID (int) and method (long string).
rSys2GetInfo: Maps systems to the way to get info (like Windows to Control Panel-System or Apache to httpd -v). Fields are GetInfoID and SysID. Uses tables dGetInfo and dSys.
rSysVer2GetInfo: Maps system versions to the way to get info. This may be the same as for Systems, but cannot be presumed (like Windows 95 to typing ver at a DOS prompt, and Windows 2000 to Control Panel-System). Fields are GetInfoID and SysVerID. Uses tables dGetInfo and dSysVer.
dSetupCost: Numeric cost of installing and configuring a System (like Windows NT 4.00.1381). Fields are SetupCostID (int) and cost (real number, US$).
rSVP2SetupCost: Maps a SystemVerPatch to setup costs. Fields are SVP2SetupCostID, SystemVerPatchID and SetupCostID. Uses tables dSetupCost and dSystemVerPatch.

Vulnerabilities This set of tables holds metadata on vulnerabilities (classification, naming, effects, tests).

Table B.2: Vulnerabilities aggregate

dVulnAlias: Holds aliases for vulnerabilities (like buffer overflow in IIS, root exploit in Tomcat) that are in common use. Fields are VulnAliasID (int) and name (string).
dVuln: Holds the internal unique name for one specific vulnerability, following the QSRA naming scheme. This is the main table for vulnerabilities. These would be CVE or CERT entries, but those have not captured everything, so this table should be more comprehensive. Fields are VulnID (int) and internal name (string).
rVulnAlias2Vuln: Maps aliases (names) to specific vulnerabilities. A vulnerability may be known under many different names; similarly, a name may be used for many different vulnerabilities. Fields are VulnAlias2VulnID, VulnID, VulnAliasID. Uses tables dVulnAlias and dVuln.
dVulnTaxonomy: Holds the taxonomy of consequences of the vulnerabilities. Fields are VulnTaxonomyID (int) and name (string).
rVuln2VulnTaxonomy: Maps every vulnerability to a category in the taxonomy. Fields are Vuln2VulnTaxonomyID, VulnID, VulnTaxonomyID. Uses tables dVuln and dVulnTaxonomy.
rVuln2CVE: Maps vulnerabilities to CVE entries. Fields are Vuln2CVEID, VulnID and CVEID. Uses tables dCVE and dVuln.
rVuln2Cert: Maps vulnerabilities to CERT entries. Fields are Vuln2CertID, VulnID and CertID. Uses tables dCert and dVuln.
dCVE: Holds CVE entries. Fields are CVEID and CVEVulnID (string).
dCert: Holds CERT entries. Fields are CertID and CertVulnID (string).
dFix: Holds instructions on how to fix a vulnerability (like install service pack 6). Fields are FixID (int) and instructions (long string).
dEffect: Holds descriptions of any apparent effects of a vulnerability on normal working (like Screen flashes or Bootup is slowed). Fields are EffectID and description (long string).
dPreTest: Holds instructions on how to test whether a vulnerability is present (like Run patchwrk.exe and see if you get a warning, or Check if you are using version 1 of WinSock32.dll). Fields are PreTestID (int) and instructions (long string).
dPostTest: Holds instructions on how to test whether a vulnerability has been plugged (like Run patchwrk.exe and see if you are clean). Fields are PostTestID (int) and instructions (long string).

Relations This set of tables maps the vulnerabilities to systems (down to the patch level). It constitutes the most important section of ists_icat.

Table B.3: Relations aggregate

rReference: Maps a system at the patch level to a vulnerability. Fields are ReferenceID, SysVerPatchID and VulnID. Uses tables dSysVerPatch and dVuln.
rRef2Fix: Maps a fix for a vulnerability to a particular system's patch level. Fields are Ref2FixID, ReferenceID, FixID. Uses tables rReference and dFix.
rRef2Effect: Maps an effect of a vulnerability to a particular system's patch level. Fields are Ref2EffectID, ReferenceID, EffectID. Uses tables rReference and dEffect.
rRef2PreTest: Maps a test for a vulnerability's presence to a particular system's patch level. Fields are Ref2PreTestID, ReferenceID, PreTestID. Uses tables rReference and dPreTest.
rRef2PostTest: Maps a test for a vulnerability's absence to a particular system's patch level. Fields are Ref2PostTestID, ReferenceID, PostTestID. Uses tables rReference and dPostTest.

B.2 NETWORKS

The purpose of the networks database is to hold the results of risk assessment and alternative software makeup scenarios on the network of interest. Three sets of tables may be discerned:

NCC The tables describing the inventoried and mapped software on the network
Risk Assessment The tables that hold the results of the risk assessment
Risk Management The tables that hold risk management scenario results

NCC The data we get back from the vulnerability assessment of a network is stored in this set of tables. The main objective is to have a structure for the mapping of software to IP addresses, and of IP addresses to a network. Software inventory data are stored in dProcess. When processes have been identified, their mapping is established and dSVP can be populated.

Table B.4: NCC aggregate

dNetworks: Holds the name of a network that has been inventoried. Fields are NetworkID (int) and name (string).
dIP: Holds IP addresses. Fields are IPID (int) and address (string).
dSVP: Holds system IDs down to the patch level. Fields are SVPID (int) and SysVerPatchID (int, from the ists_icat database). The entries come from the results of matching communication processes to SystemVerPatches in rProcess2SVP.
dCalendar: Holds date and time, down to the second. Fields are CalendarID (int) and time (datetime).
dProcess: Holds communication process data. Fields are ProcessID (int), CP NAME (string), PRTCL (string), SRC PORT (string), DST PORT (string), SRC ADDR (string), DST ADDR (string), PID (string), CONNECT STATE (string).
dOS: Holds OS information. Fields are OSID (int) and SysVersionPatchID (int), since an OS is just viewed as a system.
rNetwork2Calendar: Maps networks to a date. Fields are Network2CalendarID, NetworkID, CalendarID. Uses tables dCalendar and dNetworks.

Risk Assessment Each System Version Patch has a risk measure associated with it. That risk is dependent on the IP address where that System Version Patch resides (IP dependence) and on the System Version Patch's vulnerability makeup (SVP dependence). IP risk is calculated from its constitutive SVP risks. Network risk is an aggregate of IP risks.

Table B.5: Risk Assessment aggregate

dLossValuation: Holds dollar loss values. Fields are LossValuationID (int) and value (real number).
dProcessRisk: Holds the aggregate risk of a SystemVersionPatch (for the consequence taxonomy in dVulnTaxonomy). Fields are SystemRiskID (int), IP2SVPID (int) and risk (real number).
dIPRisk: Holds the aggregate risk for an IP address (over all the software residing on that address). IP risk is calculated akin to the failure of independent components in series. Fields are IPRiskID, IPID and risk (real number).
dNetworkRisk: Holds the aggregate risk of the network. Fields are NetworkRiskID, NetworkID and risk (real number). Risk is a summation of the risk in dIP.
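The series-failure aggregation described for dIPRisk amounts to the following, assuming independent per-software risks on one address:

    # IP risk as failure of independent components in series.
    def ip_risk(svp_risks):
        """1 - product(1 - r) over the software residing on the address."""
        safe = 1.0
        for r in svp_risks:
            safe *= 1.0 - r
        return 1.0 - safe

    # e.g. ip_risk([0.2, 0.1]) == 0.28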

Risk Management The principle is that every change to the original network configuration results in a new network scenario. Scenarios indicate which relations between network and IP addresses, and between IP addresses and SVPs, characterize the new network.

Table B.6: Risk Management aggregate

dScenario: Holds names for scenarios (like Switch to NT). Fields are ScenarioID and name.
rScenarioScen2Network: Maps networks to scenarios. Fields are ScenarioScen2NetworkID, ScenarioID and NetworkID. Uses tables dScenario and dNetworks.
rScenarioNetwork2IP: Maps IPs to networks for a scenario. Fields are ScenarioNetwork2IPID, NetworkID, IPID and ScenarioID. Uses tables dNetworks, dIP, dScenario.
rScenarioIP2SVP: Maps SVPs to IPs for a scenario. Fields are ScenarioIP2SVPID, SVPID, IPID, ScenarioID. Uses tables dSVP, dIP, dScenario.
rScenarioIP2OS: Maps OSs to IPs for a scenario. Fields are ScenarioIP2OSID, OSID, IPID, ScenarioID. Uses tables dOS, dIP, dScenario.
rScenarioIP2Process: Maps communication processes to IP addresses in a scenario. Fields are IP2ProcessID, IPID, ProcessID, ScenarioID. Uses tables dIP, dScenario and dProcess.
rScenarioProcess2SVP: Maps communication processes to systems down to the patch level in a scenario. Fields are Process2SVPID, ProcessID, SVPID, ScenarioID. Uses tables dProcess, dScenario and dSVP.
rScenarioIP2LossValuation: Maps loss values to IP addresses. Fields are IP2LossValuationID, IPID, LossValuationID, TaxonomyID, ScenarioID. Uses tables dIP, dLossValuation, dVulnTaxonomy (from db ists_icat) and dScenario.

Appendix C

Supplemental data
C.1 Average time window from discovery of the vulnerability to posted patch
Vulnerability data was collected for six operating systems: Red Hat Linux 7.3, SuSE Linux 8.0, Mandrake Linux 8.2, OpenBSD 3.1, Debian Linux 3.0, and Windows 2000 Professional SP2. The standard, out-of-the-box workstation installation was chosen when installation options were presented. Since the Linux and OpenBSD distributions come with a wealth of applications, MS Office was assumed to be part of the standard out-of-the-box Windows 2000 distribution. Disclosed vulnerabilities were collected from the release date to September 30th, 2002 (see Table C.1). The average time window data for the audited platforms can be found in Table C.2. Windows 2000 stands in stark contrast to the Linux and OpenBSD distributions: six-month lags are not uncommon for W2K, contrasted with nearly zero days for OpenBSD and two weeks to two months for most Linux systems. Unsurprisingly, design errors take the longest to fix. ANOVA boxplots and group-wise comparison results are shown in Figure C.1 and Table C.3. A single number indicates just one data point. Negative numbers reflect cases in which a discovered vulnerability was already fixed by a patch that had been released earlier for another vulnerability.

Table C.1: Security advisories as of September 30th, 2002

OS                  Release date        # days since release   # of security advisories   Advisories/day
Mandrake 8.2        March 18th, 2002    196                    37                         0.19
SuSE Linux 8.0      April 22nd, 2002    161                    22                         0.14
Debian Linux 3.0    July 30th, 2002     62                     32                         0.52
RedHat Linux 7.3    May 6th, 2002       147                    30                         0.20
OpenBSD 3.1         May 19th, 2002      134                    15                         0.11
W2k SP2             May 16th, 2002      137                    35                         0.26

Table C.2: Average time window (days) from discovery of a vulnerability to posted patch. (For each operating system and fault type, the [Min, Max] window in days, broken down by the five vulnerability consequences: availability, confidentiality, integrity, process and full.)

Figure C.1: Boxplot of [t_desc, t_patch]. Kruskal-Wallis ANOVA panels: (a) Operating Systems, χ² = 19, p = 0.00217; (b) Consequences, χ² = 3.2, p = 0.521; (c) Fault, χ² = 7.9, p = 0.442; (d) Access, χ² = 2.8, p = 0.093.

Figure C.2: Software inventorying time ANOVA, boxplots for connection type and ARP size. Panels: (a) Connection type (high database repletion, high software count), wireless vs. landline, F = 0.75, p = 0.398; (b) ARP cache (low database repletion, low software count), ARP flushed vs. ARP filled, F = 0.3, p = 0.586.

Table C.3: ANOVA group mean rank comparison: OS. (For each pair of audited operating systems, the lower bound, estimate and upper bound of the group mean rank difference.)


C.2 Software inventorying experimental results

The trials consisted of sets of ten repetitions each, on one to four hosts. Six factors (see Table 4.9), three categorical and three continuous, were varied in order to investigate their effect on software inventorying time: OS, connection type, ARP cache size, database size, number of inventoried software packages and number of inventoried hosts. Tables 4.10-4.12 show the timing results of the experiments. The graphs are graphical representations of the ANOVA analysis. High, normal and low database repletion mean a high, average and low number of records in the database, respectively. MDK stands for Mandrake 8.2, and W2K for Windows 2000 SP2.

Figure C.3: Software inventorying time ANOVA, boxplots for database size and software count (W2K vs. MDK). Panels: high software count with high database repletion (F = 63, p = 0) and normal repletion (F = 0.1, p = 0.751); normal software count with high repletion (F = 30, p = 0) and normal repletion (F = 22, p = 0); low software count with high repletion (F = 1.9, p = 0.184) and normal repletion (F = 0.044, p = 0.837).

Figure C.4: Software inventorying time ANOVA, boxplots for database size and software count (W2K vs. MDK). Panels: high software count with low database repletion (F = 64, p = 0) and zero repletion (F = 45, p = 0); normal software count with low repletion (F = 22, p = 0) and zero repletion (F = 19, p = 0); low software count with low repletion (F = 8.8, p = 0.00826) and zero repletion (F = 7.8, p = 0.0119).

C.3 Faults and Consequences

Extensive vulnerability data were collected for six popular operating systems: Red Hat Linux 7.3 (RHL), SuSE Linux 8.0 (SuSe), Mandrake Linux 8.2 (MDK), OpenBSD 3.1 (OBSD), Debian Linux 3.0 (Deb), and Windows 2000 Professional SP2 (W2K). Disclosed vulnerabilities were collected from the release date to September 30th, 2002 (see Table C.1). Faults were mapped to a vulnerability consequence, a fault type and an operating system. The top two boxplots in Figure C.5 show the spread of the nine fault types, controlled for OS and consequences. The middle two boxplots in Figure C.5 show the spread of the fault proportions of the six operating systems, controlled for fault and consequence type. The bottom two boxplots in Figure C.5 show the spread of the fault count for the five consequence types, controlled for OS and consequences. The Kruskal-Wallis mean rank comparisons can be found in Tables C.4-C.6, with significant differences highlighted.

Figure C.5: Fault count ANOVA (Kruskal-Wallis) boxplots. Panels: Faults vs. OS, χ² = 31, p = 0; Faults vs. consequences, χ² = 12, p = 0.134; OS vs. faults, χ² = 4, p = 0.547; OS vs. consequences, χ² = 4.5, p = 0.484; Consequences vs. OS, χ² = 22, p = 0; Consequences vs. faults, χ² = 15, p = 0.00382.
APPENDIX C. SUPPLEMENTAL DATA

112

Table C.4: ANOVA results group mean rank comparison: consequences vs fault type Group wise comparison Dierence Consequence Consequence Lowerbound Estimate U pperbound Avail Condent 2 15 31 Avail Full 19 2 14 Avail Integrity 3 13 29 Avail Process 2 14 30 Condent Full 33 17 1 Condent Integrity 18 2 15 Condent Process 17 1 15 Full Integrity 1 15 32 Full Process 0 16 32 Integrity Process 15 1 17

Table C.5: ANOVA results group mean rank comparison: consequences vs OS Group wise comparison Dierence Consequence Consequence Lowerbound Estimate U pperbound Avail Condent 3 11 24 Avail Full 19 5 9 Avail Integrity 1 13 27 Avail Process 1 13 27 Condent Full 30 16 2 Condent Integrity 11 3 16 Condent Process 11 2 16 Full Integrity 5 18 32 Full Process 4 18 32 Integrity Process 14 0 14

Table C.6: ANOVA results, group mean rank comparison: fault type vs. OS. (For each pair of fault types, the lower bound, estimate and upper bound of the group mean rank difference.)

C.3.1 Vulnerability count

The breakdown of vulnerabilities by OS, software class, and access type for the standard installations can be found in Tables C.7 and C.8, respectively. The count discrepancy between those tables and Tables 4.30 and 4.32 stems from the fact that there is an n:m relationship between software and class (i.e. a particular piece of software may map to many software classes, and vice versa).

Table C.7: Vulnerability count by OS, class and access: Debian 3.0 and Mandrake 8.2. (Counts per software class and access type, local and remote, broken down by the five vulnerability consequences.)

Table C.8: Vulnerability count by OS, class and access: SuSE 8.0, Windows 2000 SP2, OpenBSD 3.1 and RedHat 7.3. (Same layout as Table C.7.)

C.3.2 Fault count

Table C.9 breaks down the fault occurrences by fault type, consequence and OS. Availability and full-compromise consequences account for the bulk of the data. It is instructive to see that design faults make up 40% of total faults for Windows 2000, but only roughly 20% across the Linux distributions. No design faults have been found in OpenBSD to date, which is telling testimony to OpenBSD's design and security review process.

Table C.9: Fault count of standard OS installations. (For each operating system, counts and percentages of the nine fault types, broken down by the five vulnerability consequences.)

C.4 Risk management scenario results

Risk profiles were calculated in Excel for the audited operating systems (Red Hat Linux 7.3, SuSE Linux 8.0, Mandrake Linux 8.2, OpenBSD 3.1, Debian Linux 3.0, and Windows 2000 Professional SP2) and aggregated into three operating system families: Windows, Linux, and BSD. For both the families and the individual operating systems, two scenarios were contrasted with the original risk profile: highest count, in which the top three buggiest software packages, as measured by highest vulnerability count, were patched; and highest risk, in which the top three riskiest software packages, as measured by QSRA's analytic highest risk metric, were patched. Since the implementation of QSRA at this point in time did not take escalation attacks into account, and the vulnerability database did not contain the requisite estimated 3000 software packages to effectively run the optimization algorithm, the optimization was done manually. "Patched", in this context, means that the vulnerabilities in the software packages were neutralized by deducting the faults that caused them. The comparative risk probability reduction barplots can be found in Figures C.6, C.7, C.8 and C.9; they are broken down by vulnerability consequence, individual OS and OS family, respectively. Scenario highest count is a fast patching approach, where the goal is to patch the software exhibiting the highest vulnerability count regardless of the consequences. Scenario highest risk, on the other hand, is a slower approach where a preceding analysis (one of the goals of QSRA) pinpoints the software with the highest risk exposure. NA in the plots indicates absence of data, not zero.

Figure C.6: Risk probability scenario reduction: consequences. Panels: (a) Confidentiality (individual OS); (b) Confidentiality (OS family); (c) Availability (individual OS); (d) Availability (OS family). Bars show the percentage reduction in risk probability under the highest count and highest risk scenarios.

Figure C.7: Risk probability scenario reduction: consequences. Panels: (a) Full (individual OS); (b) Full (OS family); (c) Process (individual OS); (d) Process (OS family); (e) Integrity (individual OS); (f) Integrity (OS family).

Figure C.8: Risk probability scenario reduction: individual OS. Panels: (a) Debian; (b) Mandrake; (c) SuSE; (d) RedHat; (e) Windows 2000; (f) OpenBSD.

Figure C.9: Risk probability scenario reduction: OS families. Panels: (a) Linux; (b) OpenBSD; (c) W2K.

Appendix D

Glossary
D.1 Boxplot
Figure D.1: Sample boxplot


Figure D.1 shows an example of a boxplot. The interpretation, taken from the Matlab Help documentation, is as follows: The lower and upper lines of the box are the 25th and 75th percentiles of the data. The distance between them is also referred to as the IQR, the interquartile range. The line in the middle of the box is the median. If it is not centered, the data is skewed. The whiskers are lines extending above and below the box. They show the extent of the rest of the data, excluding the outliers. Assuming no outliers, the maximum of the sample is the top of the upper whisker, and the minimum of the sample is the bottom of the lower whisker. By default, an outlier is a value that is more than 1.5 times the interquartile range away from the top or bottom of the box. A plus sign at the top of the plot indicates an outlier in the data. The notches in the box are a graphic confidence interval about the median of a sample.

Bibliography
[AB00] [Abe00] MySQL AB. MySQL 3.23.43,. http://www.mysql.com/downloads/index.html, June 2000. Victor Abell. lsof 4.47. ftp://ftp.cerias.purdue.edu/pub/tools/unix/sysutils/

lsof/, August 2000. [AfIDA03] Cooperative Association for Internet Data Analysis. CAIDA. http://www.caida.org/, 2003. [Ass01] Network Associates. Cybercop Scanner. http://www.pgp.com/products/

cybercop-scanner/default.asp, April 2001. [AWK02] Paul Ammann, Duminda Wijesekera, and Saket Kaushik. Scalable, graph-based network vulnerability analysis. In Proceedings of the 9th ACM conference on Computer and communications security, pages 217224. ACM Press, 2002. [Bek03] [Ber96] Scott Bekker. Group Estimates Slammer Damage at One Billion Dollars. ENT news, 2003. Michel Berkelaar. lp solve 2.0 for Java. http://siesta.cs.wustl.edu/~javagrp/source/ lp/, June 1996. [BGW01] Nikita Borisov, Ian Goldberg, and David Wagner. Intercepting Mobile Communications: The Insecurity of 802.11. http://www.isaac.cs.berkeley.edu/isaac/wep-faq.html, 2001. [BHC02] R.A. Beyah, M.C. Holloway, and J.A. Copeland. Invisible trojan: an architecture, implementation and detection method. In The 2002 45th Midwest Symposium on Circuits and Systems, volume 3, pages 500504. MWSCAS, 2002. [BHM77] Stephen Bradley, Arnoldo Hax, and Thomas Magnanti. Applied Mathematical Programming. Addison-Wesley, Reading, MA, 1977.

126

BIBLIOGRAPHY [Bin97]

127

Robert V Binder. Can a manufacturing quality model work for software? IEEE Software, 14(5):101102,105, Sept/Oct 1997.

[Bis00]

Matt Bishop. How Attackers Break Programs, and How to Write Programs More Securely. System Administration, Networking and Security (SANS), SANS 2001, Baltimore, MD, May 2000.

[CC03]

Carnegie Mellon Universitys CERT Coordination Center. CERT. http://www.cert.org/, 2003.

[CMSW97] Brent N. Chun, Alan M. Mainwaring, Saul Schleimer, and Daniel S. Wilkerson. System area network mapping. In Proceedings of the ninth annual ACM symposium on Parallel algorithms and architectures, pages 116126. ACM Press, 1997. [Cor01] SAINT Corporation. Saint Vulnerability Assessment Tool. http://www.wwdsi.com/saint/, June 2001. [Deb02] Debian. Debian 2.2. ftp://ftp-linux.cc.gatech.edu/pub/linux/distributions/

debian-cd/2.2_rev%6/i386/, June 2002. [Dev95] Jay L Devore. Probability and Statistics for Engineering and the Sciences. Duxbury Press, 1995. [Eno00] Lori Enos. Lloyds of London to oer hacker insurance. E-Commerce Times, July 10th, 2000. [Eso99] Marta Eso. Parallel Branch and Cut for Set Partitioning. PhD, School of Operations Research and Industrial Engineering, College of Engineering, Cornell University, Ithaca, NY 14853, 1999. [Fou00] Foundstone. fport 1.33. http://www.foundstone.com/resources/termsofuse.php?

filename=FPortNG.zip%, May 2000. [Gil87] Mark L. Gillenson. The duality of database structures and design techniques. Communications of the ACM, 30(12):10561065, 1987. [GLS03] Lawrence A. Gordon, Martin P. Loeb, and Tashfeen Sohail. A framework for using insurance for cyber-risk management. Communications of the ACM, 46(3):8185, 2003. [Gor03] Lawrence Gordon. Authors replique to Donn B. Parker, Los Altos, CA. Communications of the ACM, Forum Section, May 12th 2003. Vol. 46, No. 5.

[Gra00] Robert Graham. Sniffing (network wiretap, sniffer) FAQ. http://www.robertgraham.com/pubs/sniffing-faq.html, 2000.

[Hai98] Yacov Haimes. Risk Modeling, Assessment, and Management. Wiley Series in Systems Engineering. John Wiley & Sons, New York, August 1998.

[Hec95] David Heckerman. A tutorial on learning with Bayesian networks. Technical Report MSR-TR-95-06, Microsoft Research, 1995.

[HL90] Frederick Hillier and Gerald Lieberman. Introduction to Operations Research. McGraw-Hill, fifth edition, 1990.

[HL02] Michael Howard and David LeBlanc. Writing Secure Code. Microsoft Press, One Microsoft Way, Redmond, WA 98052-6399, 2002.

[II99] Clifton A. Ericson II. Fault tree analysis - a history. In Proceedings of the 17th International Systems Safety Conference, 1999.

[Ins01] Strategic Technology Institute. C-BRAT CIP. http://www.sti-inc.com/shtml/cbrat.shtml, 2001.

[Kes00] Gary Kessler. Love Sick. Information Security Magazine, June 2000. Contains the ICSA virus costs/spread 1990-2000 table.

[Lab03] Lawrence Berkeley National Laboratory. tcpdump 3.7.2. http://www.tcpdump.org/, 2003.

[LBMC94] Carl E. Landwehr, Alan R. Bull, John P. McDermott, and William S. Choi. A taxonomy of computer program security flaws. ACM Computing Surveys (CSUR), 26(3):211-254, 1994.

[LM00] P. Liggesmeyer and O. Maeckel. Quantifying the reliability of embedded systems by automated analysis. In Proceedings of the International Conference on Dependable Systems and Networks, pages 89-94, 2000.

[LWBD02] Karim R. Lakhani, Bob Wolf, Jeff Bates, and Chris DiBona. The Boston Consulting Group Hacker Survey. Technical report, The Boston Consulting Group, July 2002.

[Man02] Mandrake. Mandrake Linux 8.2. ftp://uml.ists.dartmouth.edu/mirrors/ftp.mandrakesoft.com/pub/Mandrake/mandrake/8.2/i586/, June 2002.

[Mat00] The MathWorks. Matlab Student 6.0 R12, November 2000.

[mi202a] mi2g.com. Apple Mac OS and SCO Unix least vulnerable to attack. Technical report, mi2g.com, October 2002.

[mi202b] mi2g.com. Economic Damage from Digital Risk Stabilising. Technical report, mi2g.com, October 2002.

[mi202c] mi2g.com. Windows regains mantle of most vulnerable OS. Technical report, mi2g.com, August 2002.

[Mic01a] Microsoft. Microsoft Windows 2000 SP2, May 2001.

[Mic01b] Sun Microsystems. Java 1.3.1. http://java.sun.com, August 2001.

[MIO90] John D. Musa, Anthony Iannino, and Kazuhira Okumoto. Software Reliability: Measurement, Prediction, Application. McGraw-Hill Software Engineering Series, professional edition, 1990.

[MV96] John Marciniak and Robert Vienneau. Software engineering baselines. Technical report, Data and Analysis Center for Software, Rome Laboratory RL/C3C, Griffiss Business Park, Rome, NY 13441, July 1996. Contract Number F30602-89-C-0082.

[Nor99] Stephen Northcutt. Network Intrusion Detection: An Analyst's Handbook. New Riders, 1999.

[Ope02] OpenBSD. OpenBSD 3.1. ftp://uml.ists.dartmouth.edu/mirrors/openbsd-rsync/3.1/i386/, June 2002.

[oRBC75] Conference on Reliability and Fault Tree Analysis (1974: Berkeley, CA). Introduction to fault tree analysis. In Richard E. Barlow, Jerry B. Fussell, and Nozer D. Singpurwalla, editors, Reliability and Fault Tree Analysis: Theoretical and Applied Aspects of System Reliability and Safety Assessment, pages 7-35. Society for Industrial and Applied Mathematics, 1975.

[oSN02] National Institute of Standards and Technology (NIST). ICAT Metabase. http://icat.nist.gov/, 2002.

[oT03] Massachusetts Institute of Technology. Kerberos: The Network Authentication Protocol. http://web.mit.edu/kerberos/www/, March 2003.

[PS98] Cynthia Phillips and Laura Painton Swiler. A graph-based system for network-vulnerability analysis. In Proceedings of the 1998 Workshop on New Security Paradigms, pages 71-79. ACM Press, 1998.

[Rea03] Reasoning. How Open Source and Commercial Software Compare. Technical report, Reasoning Inc., February 2003.

[Red02] RedHat. RedHat Linux 7.3. ftp://uml.ists.dartmouth.edu/mirrors/redhat-rsync/pub/redhat/linux/7.3/, June 2002.

[Rog00] Marc Rogers. A New Hacker Taxonomy. Catalyst - Computers in Psychology, 2000.

[Rol98] F. D. Rolland. The Essence of Databases. Prentice Hall, Campus 400, Maylands Avenue, Hemel Hempstead, Hertfordshire, HP2 7EZ, Great Britain, 1998.

[Ros93] Sheldon M. Ross. Introduction to Probability Models. Academic Press, San Diego, California 92101-4495, 5th edition, 1993.

[SASS02] System Administration, Networking and Security (SANS). SANS main page. http://www.sans.org/, 2002.

[Sec02] SecurityFocus. SecurityFocus Online Vulnerability Database. http://online.securityfocus.com/, 2002.

[SoT00] Committee on Science (U.S. House of Representatives), Subcommittee on Technology. The Love Bug Virus: Protecting Lovesick Computers From Malicious Attack. http://www.house.gov/science/tech_charter_051000.htm, May 2000.

[SPEC01] L. P. Swiler, C. Phillips, D. Ellis, and S. Chakerian. Computer-attack graph generation tool. In Proceedings of the DARPA Information Survivability Conference and Exposition II (DISCEX '01), volume 2, pages 307-321, Anaheim, CA, USA, 2001. DARPA, IEEE Computer Society.

[SSK98] R. Siamwalla, R. Sharma, and S. Keshav. Discovering Internet Topology. Technical report, Cornell University Computer Science Department, 1998.

[Str02] Internet Security Systems. X-Force database. http://www.iss.net/security_center, 2002.

[SuS02] SuSe. SuSe Linux 8.0. ftp://ftp-linux.cc.gatech.edu/pub/linux/distributions/suse/suse/i386/8.0/, June 2002.

[Tan96] Andrew S. Tanenbaum. Computer Networks. Prentice-Hall, Upper Saddle River, NJ 07458, third edition, 1996.

[Tec02] Telcordia Technologies. Netsizer. http://www.netsizer.com, April 2002.

[Vos00] David Vose. Risk Analysis: A Quantitative Guide. John Wiley & Sons, New York, 2nd edition, December 2000.

[Whe02] David A. Wheeler. Counting Source Lines of Code (SLOC). http://www.dwheeler.com/sloc/, 2002.

[www03] www.riskspectrum.com. Example of methods for safety, reliability and availability analysis - fault tree. http://www.riskspectrum.com/docs/methods_ft.htm, 2003.
