
SCALABLE LOCATION-BASED SECURITY IN WIRELESS NETWORKS

A DISSERTATION SUBMITTED TO THE DEPARTMENT OF COMPUTER SCIENCE AND THE COMMITTEE ON GRADUATE STUDIES OF STANFORD UNIVERSITY IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY

Daniel Braga de Faria
December 2006

© Copyright by Daniel Braga de Faria 2007
All Rights Reserved


I certify that I have read this dissertation and that, in my opinion, it is fully adequate in scope and quality as a dissertation for the degree of Doctor of Philosophy.

(David R. Cheriton)

Principal Advisor

I certify that I have read this dissertation and that, in my opinion, it is fully adequate in scope and quality as a dissertation for the degree of Doctor of Philosophy.

(Mary G. Baker)

I certify that I have read this dissertation and that, in my opinion, it is fully adequate in scope and quality as a dissertation for the degree of Doctor of Philosophy.

(Fouad Tobagi)

Approved for the University Committee on Graduate Studies.


Abstract
This dissertation presents a location-based approach to wireless security. It differs from current solutions in that it uses information about the physical location of clients to leverage physical security measures instead of relying on long-term secrets such as passwords and private keys. Our approach adapts to the wireless scenario an intuitive security model that is effective and already commonplace in wired LANs.

We show that it addresses three problems with current solutions. First is the inability of network administrators to define geographical boundaries for wireless coverage. While access to Ethernet ports can be controlled by locking them inside buildings, wireless links extend connectivity beyond physical boundaries, making networks reachable to users across the street or in nearby buildings, and therefore more vulnerable to attacks. We show that our services allow networks to provide connectivity to clients located within the intended service area (SA). Moreover, we show that malicious devices located outside the SA either need to get physically close to it, running afoul of physical security measures, or are faced with impractical hardware demands.

The second problem is the lack of proper accountability. Without additional mechanisms, wireless networks are unable to accurately locate and securely identify traffic sources because clients are no longer physically connected to network ports. Even with user authentication and cryptographic packet protection, some link-layer services still rely on MAC addresses to identify clients, making networks vulnerable to denial-of-service attacks that are both effective and easy to implement. We show that our services allow wireless devices to be distinguished and located accurately, making misbehaving clients again physically exposed and accountable for their acts.

Finally, most solutions that increase security to acceptable levels incur substantial management costs. Unlike these, our approach is to improve security in a cost-effective manner. Our architecture takes advantage of higher numbers of access points not only to improve accuracy but also to configure itself autonomously, with minimal operator participation. We use extensive measurements in a real setting to show that such automatic calibration provides for accurate services while also allowing networks to scale to large numbers of access points.


Acknowledgements
I would like to start by thanking my advisor, David Cheriton, for mentoring me over the past six years. David has a natural ability to identify key research problems, and he has helped me polish my own ideas since my first group meeting. I look back at my thesis proposal, written three years ago, and the importance of his guidance is immediately clear. I would also like to thank the members of my reading committee, Mary Baker and Fouad Tobagi. Their valuable comments improved this dissertation considerably.

Over these past years I have made many friends, some of whom have been listening to crazy research ideas since my first days at Stanford. Katerina Argyraki and Evan Greenberg have read papers of mine so many times that I was embarrassed to show them any piece of this dissertation. No more work, guys! I have also learned a lot, both personally and technically, from other colleagues from the Distributed Systems Group: Tassos Argyros, Mark Gritter, Vince Laviano, Sam Liang, Dapeng Zhu, and many others.

I have also had the support of many friends outside of my research group. François Conti, Eugene Nudelman, Javier Nicolás Sánchez, Paulo Pinheiro da Silva, and Qixiang Sun have been particularly important to me. Without the help of Christine Fiksdal and Hector Gamez, the sheriffs of Gates, many of the measurements reported in this dissertation would not exist.

My parents, Carlos and Ana Lúcia Faria, together with my brother, Rafael Faria, have made me who I am. I cannot thank you enough. Even from such a long distance, I have also received an impressive amount of support from all my relatives, even though they still do not understand why I moved thousands of kilometers away from them.

Finally, no words can describe how grateful I am for all the support I have received from my wife, Gisele Faria, over all these years. Gi, I dedicate this dissertation to you, although I think this alone is not enough. I thank you with all my heart.


Contents
Abstract
Acknowledgements

1 Introduction
  1.1 Link-Layer Security in Local Area Networks
    1.1.1 Access Control
    1.1.2 Accountability
  1.2 Security in Wired LANs
  1.3 Challenges in Wireless Security
  1.4 Location-Based Approach to Wireless Security
    1.4.1 Access Control
    1.4.2 Accountability
    1.4.3 Scalability
  1.5 Assumptions About the Network
  1.6 Terms and Definitions
  1.7 Original Contributions
  1.8 Outline of Dissertation

2 Related Work
  2.1 Leveraging User Location Information
    2.1.1 Signal Strength-Based Localization Systems
    2.1.2 Making RSSI-Based Systems Robust Against Attacks
    2.1.3 Decreasing Calibration Efforts
    2.1.4 Ultrasound and Infrared Based Systems
    2.1.5 Other Localization Systems
  2.2 Exploring Location-Limited Channels
  2.3 Distance Bounding and Location Verification Mechanisms
  2.4 Detecting Identity-Based Denial-of-Service Attacks
  2.5 Conclusion

3 Automatic System Calibration
  3.1 Objective
  3.2 The Log-Distance Model
  3.3 Challenges
  3.4 The Calibration Process
    3.4.1 Sample Generation
    3.4.2 Filtering
    3.4.3 Model Instance Computation
  3.5 Handling Heterogeneous Environments
  3.6 Evaluation
    3.6.1 Network Testbed
    3.6.2 Data Sets
    3.6.3 Calibration Results
    3.6.4 Measuring Heterogeneity
  3.7 Discussion
    3.7.1 Alternate Path Loss Models
  3.8 Summary and Conclusion

4 Robust Localization
  4.1 Motivation
  4.2 Attack Model
  4.3 Overview
  4.4 Client Requirements
  4.5 Filtering
  4.6 Position Estimation
  4.7 Verification
    4.7.1 Signal Threshold (T)
    4.7.2 Minimum Quorum (Q)
    4.7.3 Confidence Test
    4.7.4 Configuration Tuple
  4.8 Evaluation
    4.8.1 Configuration
    4.8.2 Localization Accuracy
    4.8.3 False Negatives
    4.8.4 Handling False Negatives
    4.8.5 Performance
    4.8.6 Building Penetration Loss
    4.8.7 Minimum Amplification Demands
    4.8.8 False Positives: Omnidirectional Antennas
    4.8.9 False Positives: Directional Antennas
    4.8.10 Localization in Multi-Story Buildings
  4.9 Limitations
  4.10 Summary and Conclusion

5 Range-Based Authentication
  5.1 Motivation
  5.2 Attack Model
  5.3 Overview
  5.4 Principle of Operation
  5.5 Authentication Handshake
  5.6 Using Unicast Nonce Messages
  5.7 Transmission Power Control
  5.8 Handshake Parameters
  5.9 Selective Jamming
  5.10 Minimizing Interference
  5.11 Secure Session Implementation
  5.12 Evaluation
    5.12.1 Channel Quality vs. SNR
    5.12.2 Authentication at Close Range
    5.12.3 Handling False Negatives
    5.12.4 Attacker Strategy
    5.12.5 Attack Resource Demands
    5.12.6 Effective Coverage Reduction
    5.12.7 Session Establishment Overhead
    5.12.8 Selective Jamming
  5.13 Limitations
  5.14 Summary and Conclusion

6 Detecting Identity-Based Attacks
  6.1 Motivation
  6.2 Attack Model
  6.3 Signalprints
    6.3.1 Signalprint Representation
    6.3.2 Signalprint Generation
    6.3.3 Signalprint Properties
  6.4 Matching Signalprints
    6.4.1 Differential Values
    6.4.2 Max-Matches
    6.4.3 Min-Matches
    6.4.4 Matching Rules
  6.5 Attack Detection
    6.5.1 Resource Depletion Attacks
    6.5.2 Masquerading Attacks
  6.6 Evaluation
    6.6.1 Measurements
    6.6.2 Signal Strength Oscillation
    6.6.3 Signalprints and Physical Proximity
    6.6.4 Moving Devices
    6.6.5 Directional and Beamforming Antennas
  6.7 Limitations
  6.8 Summary and Conclusion

7 Conclusion
  7.1 A Perspective
  7.2 Summary
  7.3 Future Work

Bibliography


List of Tables
4.1 Notation used to describe the localization system.
4.2 Directional antenna patterns used during simulation.
5.1 Notation used to describe a range-based authentication handshake.
5.2 Authentication statistics for each location sampled.
5.3 Performance measurements for the main operations in a handshake using 4 distinct machines.
6.1 Matching results. Each row shows a matching rule, the figure (if any) containing the signalprints created from our measurements that satisfy that rule, the number of access points used as sensors, the number of matches produced, and the percentages of matches created by locations within 5, 7, and 10 meters from each other. The first four rules are used to detect packets transmitted by the same device, while the last two detect packets sent by distinct devices.


List of Figures
3.1 Temporal RSSI oscillations relative to the 75th percentile detected during calibration for three pairs of access points (measurements described in Section 3.6).
3.2 Our 45 × 24 m service area in the Gates Building 4A Wing with the placement of access points.
3.3 All the 135 locations sampled to create the survey data set.
3.4 Measurements in the calibration data set and the best fit found for the log-distance model.
3.5 Measurements in the survey data set and the best fit found for the log-distance model.
3.6 Path loss exponent values (α_i) found for each AP.
3.7 Path loss exponent (α) values found for each group of 6 APs.
4.1 Localization system overview.
4.2 RSSI oscillation for a stationary device.
4.3 Pattern difference between client at (25,25) and attacker at (80,25) using α = 3.0. The client's pattern presents a clear peak, as it is located inside the service area. The attacker's pattern resembles a plateau, a result of the increased distance to the SA.
4.4 Localization accuracy using all 12 access points. The first two curves show the error distribution resulting from the survey data set using the calibration and the survey models. The third curve presents the error distribution found through simulation.
4.5 This figure shows the localization error for each location in the survey data set and the locations that produce false negatives with the aggressive configuration.
4.6 For each location in the survey data set, this figure shows the real locations (arrow tails) and the corresponding positions estimated by the system (arrow heads).
4.7 Localization error as a function of the number of APs using the survey data set and the calibration model.
4.8 Localization error as a function of the standard deviation associated with the path loss model.
4.9 Measurements outside the building.
4.10 Predicted minimum amplification level needed to satisfy the threshold condition around the SA.
4.11 False-positive rates around the SA as a function of G_UN using the aggressive configuration.
4.12 False-positive rates for G_UN = 5 dB using the conservative configuration.
4.13 False-positive rates for a directional antenna with a 40° beamwidth and 14 dBi of gain, against the aggressive configuration and with G_UN = 5 dB.
4.14 False-positive rates for the other simulated directional antennas.
5.1 Example of a range configuration that covers most of our service area with 8 access points.
5.2 Detailed description of an authentication handshake.
5.3 Frames received during each round as a function of average SNR.
5.4 Percentage of corrupted frames as a function of SNR level.
5.5 Bit errors per frame (left y-axis) and total number of frames (right y-axis) relative to measured SNR levels.
5.6 Locations sampled during authentication experiments.
5.7 Expected SNR outside the building. The rectangle represents our service area, while access points are shown as triangles. The axes show coordinates in meters, while the curves show the SNR at each location.
5.8 Predicted SNR for an attacker outside the building as a function of his distance to the external wall (d_A).
5.9 Jamming experiments. Each row in the figure represents a measurement round. A full tick represents a frame received successfully, while a half-tick represents a frame with bit errors. Frames not captured by the receiver are not shown.
5.10 Predicted SNR for access point and jammer outside the building (jammer placed at the external wall).
6.1 Overview of the signalprint generation process.
6.2 Variation in RSSI for a stationary client that uses two different power levels. Each graph presents measurements with respect to a single AP. The top and bottom curves correspond respectively to the samples collected with the client transmitting at 15 and 0 dBm.
6.3 Signalprint matching examples. Figure 6.3(a) shows two signalprints and their corresponding sizes. Figures 6.3(b) and 6.3(c) demonstrate how max-matches and min-matches are computed.
6.4 Two matching rules applied to signalprints S1 and S2.
6.5 RSSI oscillation for a stationary device. Each graph was created by choosing one of the locations sampled and one access point within its range. It shows the variation in signal strength for consecutive frame transmissions relative to the median RSSI (shown as 0 dBm) for that location-AP pair.
6.6 Figures 6.6(a) and 6.6(b) show the location pairs producing a minimum of respectively 4 and 5 5-dB max-matches using the 6-AP configuration (APs shown as triangles).
6.7 Figures 6.7(a) and 6.7(b) extend respectively figures 6.6(a) and 6.6(b) with a clause that allows no 8-dB min-matches.
6.8 Location pairs producing a minimum of 3 5-dB max-matches using the 4-AP configuration.

Chapter 1

Introduction
"We're not lost. We're locationally challenged."
(John M. Ford)

1.1 Link-Layer Security in Local Area Networks

Local area networks (LANs) have become indispensable: the installed base for both wired and wireless LAN technologies has reached impressive numbers and still grows at high yearly rates. Employees expect their companies to provide them with bandwidth-rich environments, mobile users expect wireless service to be available everywhere, and every new hardware generation is expected to provide faster and cheaper network devices. Consequently, it is no surprise that the local area network market has soared over the last few years. According to market research, the worldwide market for enterprise switches and routers alone has grown to over $19 billion in 2005 and is expected to reach $25 billion in 2009 [94].

As with any other asset, local area networks need to be secured. In this dissertation we focus on two aspects of network security: access control and accountability. To control access to their resources, networks need mechanisms that allow them to specify who their intended users are. For instance, companies usually want to provide employees with unrestricted network access but limit the connectivity provided to visitors and business partners. Additionally, installations want their users to be accountable for their network traffic. This applies not only to enterprise installations but also to public networks such as those available in cafeterias, libraries, and airport lounges, which can also have their infrastructure used for malicious purposes. The revenue related to network security appliances and software has surpassed $4 billion in 2005 and is expected to grow to $5.7 billion in 2009 [95].

Our work identifies a dichotomy between wired and wireless LANs with respect to these security aspects and proposes a solution that is compatible with current network architectures and consistent with current technology trends. We begin this chapter by showing that in wired networks, proper access control and accountability can be achieved using a simple security model that leverages physical security measures. We then show that inherent characteristics of wireless links break this simple model and that current mechanisms that address this wireless security problem incur considerable management costs. In fact, studies suggest that security and privacy are still the main barriers to the adoption of wireless LANs [92]. Finally, we present an overview of our solution and describe how it improves security in wireless LANs without a substantial increase in total cost of ownership. We show that our proposed services take advantage of large numbers of cheap access points to collect real-time information about the physical location of clients and reduce the security gap between wired and wireless networks in a cost-effective manner.

In sections 1.1.1 and 1.1.2 we describe in more detail what we refer to as access control and accountability. We present the relevant issues related to security in wired and wireless networks in sections 1.2 and 1.3, respectively. The remaining sections describe our approach, its main properties and results, and the original contributions of this dissertation.

1.1.1 Access Control

Access control can be defined as the mechanism used by a system to grant or revoke the right to perform some action. In this dissertation we focus on access control at the link layer, i.e., the procedure used by devices to convince the network that their packets should be forwarded. Consider a client¹ that connects to the network and requests service. The client could be an Ethernet device connecting to a physical port or an IEEE 802.11 card associating with an access point (AP). Before providing any service, the network authenticates the client: it can check the identity of its user, the physical port (or access point) the client is connected to, or even base its decision on finer-grained information regarding the device's physical location. If authentication succeeds, the network authorizes the client to use its resources and provides it with a network configuration which dictates its level of connectivity. For example, this configuration can specify an IP address and proper VLAN settings, which, coupled with access control lists placed in the network, define the exact level of network visibility provided to the client.

¹We use the term client to denote a device connecting to the network and the term user to represent the person operating the device.

Different environments require different access control policies. For example, a visitor with a wireless-enabled laptop entering an enterprise campus can be provided with a basic level of service (e.g. only email and Web access) without ever having to prove his or her identity. Filtering state at the network is responsible for providing proper isolation, therefore minimizing the damage caused by a misbehaving visitor. On the other hand, employees connecting to wired ports within their offices can be provided with more privileged settings, such as being allowed to connect to development machines and internal servers. If needed, a company can even provide specific users with special privileges using rules such as "allow user JohnTheCEO to connect to server srv12 when using physical port 65A and after a successful two-factor authentication". At the other extreme, a cafeteria, which probably houses no sensitive information, may provide all clients connected to its access points with the same level of service.
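To make the rule example above concrete, the sketch below shows one way such a rule could be represented and evaluated. It is only an illustration under stated assumptions: the Request and Rule structures, their field names, and the may_connect helper are hypothetical, not part of any product or of the mechanisms proposed in this dissertation.

    from dataclasses import dataclass

    @dataclass
    class Request:
        user: str            # authenticated user identity, if any
        port: str            # physical port or access point in use
        two_factor_ok: bool  # whether a two-factor authentication succeeded

    @dataclass
    class Rule:
        user: str
        port: str
        needs_two_factor: bool
        allowed_servers: frozenset

    # Hypothetical encoding of the rule quoted above.
    RULES = [Rule("JohnTheCEO", "65A", True, frozenset({"srv12"}))]

    def may_connect(req: Request, server: str) -> bool:
        # A request reaches a server only if some rule matches the user,
        # the physical port, and the authentication strength required.
        return any(r.user == req.user and r.port == req.port
                   and (req.two_factor_ok or not r.needs_two_factor)
                   and server in r.allowed_servers
                   for r in RULES)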

1.1.2 Accountability

A system has proper accountability if the actions of an entity are traceable uniquely to that entity. According to the Merriam-Webster Online Dictionary, accountability is "...an obligation or willingness to accept responsibility or to account for one's actions" [18]. A proper accountability framework allows a network to check whether clients are respecting service rules and to react to situations in which that is not the case.

We consider a local area network to have an acceptable level of accountability if it can robustly identify the device responsible for specific network traffic and establish the physical location of misbehaving clients. For example, an enterprise hit by the latest worm would like to know the exact ports being used by all compromised hosts, so that proper filters can be put in place to prevent a widespread event and so that operating systems can be patched. Likewise, it would like to identify (and possibly disconnect) a guest device that uses the wireless network to launch a denial-of-service (DoS) attack against a major website. As cafeterias today see providing wireless access as an important requirement to attract new customers, they would also like to pinpoint a misbehaving client that uses dishonest methods to consume most of the available bandwidth.


1.2 Security in Wired LANs

Access control in wired networks is a less complicated problem because these networks are able to leverage physical security measures already deployed. Enterprises can lock their Ethernet ports inside buildings protected by keys, badges, and security personnel, physical security measures that are necessary for the protection of all company assets. Wired ports are just a subset of these assets, being reasonably protected without any additional mechanism and without increasing management costs. If a user has the credentials needed to walk into an office and connect her laptop to a wired port, most of the time there is nothing else she needs to do to use the network. In this case, network access control decisions are inherently based on the physical location of clients, and network security is directly related to the level of physical security implemented.

In fact, this base security level seems sufficient for most wired installations. That is, for the purposes of access control at the link layer, most wired networks reach a level of protection considered acceptable by simply leveraging physical security measures. This way they can avoid the additional costs incurred by mechanisms such as PPPoE [80] and IEEE 802.1X [3]. Extra security is left to upper layers: for instance, applications can use protocols such as IPSec [70], SSL/TLS [42], and SSH [118, 117] to create end-to-end secure tunnels and implement finer-grained access control.

In wired networks, accountability can also be easily implemented because clients have to be physically attached to ports, being therefore easily trackable and constantly exposed. Once malicious activity is detected, the network can identify the port being used and find the offending device, which may have its connection terminated or even be physically removed.

1.3 Challenges in Wireless Security

Enterprises see many advantages in deploying wireless LANs (WLANs), a technology that in the past few years has witnessed considerable market penetration. Compared to wired networks, WLANs incur lower installation costs per user given the reduced cabling and manual labor required. Moreover, connectivity is provided to mobile users at no extra cost, improving efficiency and productivity. According to Infonetics Research, the worldwide revenue from wireless LAN equipment reached $2.4 billion in 2005, and it is estimated that respectively 57%, 62%, and 72% of small, medium, and large organizations in North America will have deployed WLAN equipment by 2009 [92, 93].


One major challenge regarding the deployment of wireless networks is dealing with the unpredictable nature of signal propagation. While signal strength decreases with distance, the rate of decay inside a building depends on the construction materials used, floor layouts, the placement of furniture and other obstacles, as well as the number of people and their moving patterns. The resulting reflection, diffraction, and scattering of waves creates environment-dependent oscillations in received signal strength, which challenge not only services such as network planning but also all others that rely on signal strength statistics.

Regarding security, one major disadvantage of wireless networks is that they violate the physical security model that is so effective in wired LANs, therefore requiring additional mechanisms to implement proper access control and accountability. Unlike the wired scenario, there is no inherent access control: wireless links extend connectivity beyond physical boundaries, making networks available in parking lots, across the street, and in nearby buildings, adjacent locations where coverage was not intended. When left unprotected, these wireless links make networks vulnerable to misuse and attacks. It is also more difficult to make clients accountable for their acts, as misbehaving devices can move freely and be 50 meters away from the access points they use. Moreover, malicious users can use directional antennas and amplifiers to produce higher signal strength levels and create many different signal strength patterns to further obfuscate their physical locations.

To make matters worse, current access control solutions are not suitable for all wireless installations: they either provide little security improvement or incur high management overhead. On one hand, there are mechanisms that are easy to deploy but that add little or no protection. For instance, installations may leave their wireless links open, hide network names (SSIDs in 802.11 jargon), or use MAC address lists. As these solutions can be easily circumvented, networks are still at risk of being compromised. On the other hand, there are several mechanisms that provide higher security levels and fine-grained access control capabilities, but at much higher costs. In this category are mechanisms based on passwords or private keys, including authentication protocols proposed by the IEEE 802.11i standard [7] to secure wireless LANs, such as EAP-TLS [20], EAP-TTLS [53], and PEAP [87]. The higher management costs come from adding and removing users, granting and revoking access rights, as well as protecting sensitive information (keys), which is vital for keeping systems uncompromised. These mechanisms also place a lot of responsibility on users, who have to choose proper passwords and keep them safeguarded.


As a result of this trade-off between security and management costs, a large percentage of wireless networks still operate with insecure configurations, and many of them are common victims of network abuse. A world-wide wardriving effort performed in June 2004 detected over 200,000 access points, with more than 60% of them running without cryptographic protection (then using WEP) and over 30% with the default SSID set by the manufacturer [8]. In enterprise environments, insecure configurations are also common. A study performed by RSA and NetSurity in March 2005 revealed that over 30% of enterprise wireless LANs in London, Frankfurt, New York, and San Francisco lacked basic security measures [100, 101, 102, 103]. While these configurations may be complemented by VLANs and firewalls to restrict the services available to wireless clients, networks are still left unprotected and vulnerable to misuse. Of 700 institutions that responded to the 2005 CSI/FBI Computer Crime and Security Survey, over 15% acknowledged that their wireless networks were victims of abuse during the previous year [37]. This number is clearly a lower bound, as these institutions need at least to be aware that such attacks took place and be willing to share that information.

Unfortunately, this situation is doomed to get worse because the number of deployed access points keeps rising at a considerable rate, allowing malicious users to find vulnerable networks with minimal effort. According to the study by RSA, the number of access points detected in London and Frankfurt increased respectively by 62% and 66% between 2004 and 2005 [100, 101]. Market research shows that WLAN unit shipments increased 39% between 2004 and 2005, with access points accounting for 81% of the wireless equipment revenue [93].

Finally, link-layer services may be vulnerable to a large class of denial-of-service attacks based on identity spoofing even if users are authenticated and session keys are generated to protect traffic. This happens because, despite the established session keys, these services identify clients using only their MAC addresses, which can be easily forged. For example, an attacker in an IEEE 802.11 network can disrupt service to well-behaved clients by spoofing their addresses or the addresses of their access points and requesting disconnection. Bellardo and Savage have shown that effective attacks can be easily implemented: a 10-second deauthentication attack can immediately knock a client off the network and possibly incur minute-long outages given the interaction between 802.11 and TCP [29]. These attacks can currently be implemented in 802.11 networks even if the security mechanisms proposed by the 802.11i standard are used, given that control and management frames are still transmitted without cryptographic protection [7]. Another option would be for the attacker to use many different addresses to submit high rates of requests in order to overload a shared authentication server (e.g. a RADIUS server [97]). These attacks are also possible in 802.11i-enabled networks because clients need to contact these servers as part of the authentication process, during which messages are inherently unprotected.

In summary, while wireless LAN technology is compelling for many reasons, it introduces a new security paradigm that creates challenges for access control and accountability. It is clear that security problems will continue to exist unless cheaper yet effective solutions become available.

1.4 Location-Based Approach to Wireless Security

The objective of our research is to answer the following question: is there a way to also leverage physical security measures in wireless LANs so that security increases to an acceptable level without a considerable increase in management costs? This dissertation shows that the answer is yes. Our approach can be summarized by three design guidelines:

1. Leverage information about the physical location of clients. Such information is used to limit network service to those clients located within the intended coverage area and also to make clients again physically exposed and accountable for their acts;

2. Take advantage of increased numbers of access points. When WLAN devices become commodities (IEEE 802.11 hardware provides a great example), larger numbers of APs can be installed without considerably increasing total cost of ownership while improving robustness and the accuracy of location-based services;

3. Minimize human assistance during system configuration and maintenance. This is required to allow networks to scale to large numbers of APs and also to provide management costs that are feasible to most installations.

We show in this dissertation that our approach decreases the gap between wired and wireless security, providing an alternative that is reasonably priced and that cannot be easily circumvented.


1.4.1 Access Control

Our proposed mechanisms improve access control by allowing network administrators to define geographical boundaries for wireless service, minimizing undesired coverage and reducing the likelihood of attacks and network abuse [50]. In order to use the network, clients are forced to move closer to the infrastructure, therefore leveraging any physical security measures deployed. We show that standard devices beyond this intended coverage area are denied service and that extending the range of the network requires impractical amounts of resources such as amplification and antenna gain.

In order to impose such boundaries, a network has to check whether each client is physically located within the intended coverage area. We propose two mechanisms, a robust localization system and range-based authentication, that exploit inherent properties of the wireless channel to achieve this goal.

We first present the design and evaluation of our localization system. Similar to earlier systems, it monitors client transmissions and uses signal strength measurements collected at different vantage points to estimate the position of clients as they use the network. Unlike other systems, however, it was designed to identify (and reject) clients trying to access the network from locations beyond the intended coverage area and to face malicious transmitters that may increase their transmission power levels or use directional antennas to access the network. The fact that it was designed with such adversaries in mind makes our system suitable for access control purposes while still providing accurate location estimates.

We later present the design of a range-based authentication protocol that explores receiver sensitivity constraints to further increase the costs required to extend the intended coverage area. Each access point is configured with a maximum range so that, together, all APs cover the whole service area. In this protocol, a client authenticates successfully by proving to be within range of one of the access points. While the localization system forces clients to act as transmitters, in this mechanism clients seeking connectivity act as receivers, for which amplification is a harder problem.

These two mechanisms allow a network to implement location-based access control (LBAC), in which a client's rights vary according to its physical location. For example, a visitor allowed to attend a seminar inside an enterprise building can be automatically provided with basic network connectivity as soon as his location can be verified. Likewise, service can be terminated as he walks out of the premises.
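As a rough sketch of how such a check could combine these outputs, the function below admits a client only if enough access points hear it above a signal threshold and its estimated position falls inside the service area. This is an illustration only: the threshold and quorum values are placeholders, and Chapter 4 defines the actual verification parameters (signal threshold T, minimum quorum Q, and a confidence test).

    def admit_client(rssi_by_ap, position, service_area,
                     threshold_dbm=-75.0,  # placeholder for signal threshold T
                     quorum=4):            # placeholder for minimum quorum Q
        # Require a minimum number of APs reporting sufficiently strong signals.
        strong_aps = sum(1 for v in rssi_by_ap.values() if v >= threshold_dbm)
        if strong_aps < quorum:
            return False
        # Require the estimated (x, y) position to fall inside the
        # rectangular service area (x_min, y_min, x_max, y_max).
        x, y = position
        x_min, y_min, x_max, y_max = service_area
        return x_min <= x <= x_max and y_min <= y <= y_max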


1.4.2 Accountability

Our architecture improves accountability because the use of a localization system provides real-time information about the physical location of transmitters (clients or otherwise), making them once again physically exposed and therefore accountable for any misbehavior. As shown later, most clients have their positions estimated within 2 meters (6.56 feet) of their real locations, which is comparable to a wired network in which the exact location of the port is known but where clients can use cables tens of meters long. Almost 50% of respondents in the CSI/FBI survey acknowledged being victims of insider abuse of network access, so such capabilities would be useful in many installations. For example, networks could use such precise location information to detect (and remove) rogue access points as soon as they are installed by employees, and to track visitors granted basic connectivity in public spaces who appear responsible for suspicious network traffic.

To further improve accountability, we present a mechanism that allows networks to differentiate between distinct devices even in the face of MAC address spoofing. Our mechanism assigns to each transmitter a signal strength pattern, which we call a signalprint, that functions as a location-based device identifier. We show that distinct devices produce different signalprints (and can therefore be successfully distinguished) unless they are in close proximity to each other. For instance, this capability allows IEEE 802.11 networks to detect identity-based denial-of-service attacks. If a malicious device uses the MAC address of an access point to disconnect an unsuspecting client, the attempt can be easily spotted if the attacker is on a different floor or in the parking lot, as the two signalprints produced are clearly distinguishable. Furthermore, the physical location of the attacker can be established by inputting the signalprint into a localization system like the one we have designed.
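Chapter 6 develops the matching machinery behind this idea; the sketch below only conveys its flavor under simplified assumptions. Here a signalprint is a list of per-AP RSSI values (None where an AP did not hear the frame), and the 5 dB and 8 dB thresholds echo the matching rules evaluated later; the rules defined in Chapter 6 are the authoritative version.

    def max_matches(s1, s2, epsilon_db=5.0):
        # Count APs at which the two signalprints differ by at most epsilon_db.
        # Many close entries suggest frames transmitted from the same location.
        return sum(1 for a, b in zip(s1, s2)
                   if a is not None and b is not None and abs(a - b) <= epsilon_db)

    def min_matches(s1, s2, delta_db=8.0):
        # Count APs at which the two signalprints differ by at least delta_db.
        # Several large gaps suggest two distinct transmitters.
        return sum(1 for a, b in zip(s1, s2)
                   if a is not None and b is not None and abs(a - b) >= delta_db)

    def likely_same_device(s1, s2):
        # Illustrative rule: strong agreement at several APs, no strong disagreement.
        return max_matches(s1, s2) >= 4 and min_matches(s1, s2) == 0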

1.4.3 Scalability

In order to scale to large numbers of access points without incurring high management costs, services need to minimize their dependence on human assistance during reconfiguration. One often finds that when calculating total cost of ownership (TCO) for a new service, capital expenses (CAPEX), mostly composed of equipment costs, are dominated over time by operational expenses (OPEX), which include labor costs resulting from system installation and maintenance [50]. As an example in wireless networks, services that require human-assisted sampling of the wireless coverage (such as the site surveys needed for some localization systems) can substantially increase OPEX, given that access points are usually incrementally deployed and often relocated. In fact, research suggests that centralized wireless architectures can achieve lower TCO mostly due to a considerable decrease in operational expenses [52].

Our system uses the access points already deployed to configure itself without operator participation. We show that all the information regarding signal strength attenuation within the coverage area that is needed by our localization and range-based authentication mechanisms can be autonomously established by the system, with little or no impact on accuracy and quality of service.

1.5 Assumptions About the Network

Our location-based approach to security is suitable for a wireless local area network with the following properties:

- RF data collection: access points are able to measure, on a frame-level basis, wireless transmission properties such as signal strength and noise levels;

- Centralized control: a network device aggregates all the wireless traffic and also manages all the access points;

- Physical security: the level of physical protection surrounding the planned coverage area is consistent with the intended security level and threat model.

The first property is necessary because all our proposed services characterize wireless transmitters using such RF measurements. For instance, signal strength measurements are the input to our localization system and are also used to construct the signalprint assigned to each transmitter. Currently, IEEE 802.11-compatible hardware provides all the necessary RF information, being used in our testbed and measurements. While we focus on the use of 802.11 technology in this dissertation, mostly due to its wider adoption, our mechanisms are equally suitable for other WLAN standards such as HiperLAN/2 [49], and some of them can even be used in shorter-range wireless personal area networks (WPANs) such as Bluetooth/IEEE 802.15.1 [13] and UWB/IEEE 802.15.3 [5]. A detailed study of implementing location-based security using these technologies is left as future work.


We consider networks where access points are simple devices controlled by a centralized wireless appliance (WA). Our ideas closely follow industry standards such as CAPWAP [31, 116], where APs are made as simple as possible and where most of the functionality is placed at the WA, which is computationally more powerful. One of the objectives of this approach is to transform APs into wireless commodities: cheap, unspecialized devices that can be deployed in higher numbers and easily replaced, reducing management overhead without a significant increase in capital expenses. The CAPWAP architecture is on a path toward industry adoption, providing a concrete example of a deployment scenario for our services.

This central point of control is necessary for several reasons. First, it provides a natural place to implement services that require traffic aggregation, such as localization and signalprint calculation. Second, it is required whenever network-wide configuration or synchronization is needed, such as during channel assignment (important for localization and intrusion detection), access point selection (which APs are used for authentication and data sessions), and transmission power control (achieving the desired coverage with minimum interference and leakage).

Finally, a coherent level of physical security is essential to any location-based security mechanism. Location-based access control policies are useful if only targeted clients are allowed within the physical space where service is provided. (Note, however, that they may not necessarily be provided with the same level of service.) For this reason, our mechanisms are suitable for an installation with an attack model consistent with the level of physical security implemented. They can be used in enterprise environments and other installations with sensitive information because the more stringent security requirements are coupled with proper physical security measures. Our services can also be used in public spaces such as cafeterias and other hotspots because the lack of physical security is counterbalanced by a weaker threat model and a willingness to serve a wider user base.
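A minimal sketch of the data flow implied by the first two properties: each AP reports per-frame RF measurements, and the WA aggregates them by claimed transmitter so that localization and signalprint computation see each transmission from multiple vantage points. The record fields below are illustrative and are not taken from CAPWAP or any other standard.

    from dataclasses import dataclass

    @dataclass
    class FrameReport:
        ap_id: str        # reporting access point
        src_mac: str      # transmitter address claimed in the frame
        rssi_dbm: float   # received signal strength measured at the AP
        noise_dbm: float  # noise floor measured at the AP
        timestamp: float  # reception time, in seconds

    def group_by_transmitter(reports):
        # Aggregate AP reports at the WA, keyed by claimed source address.
        by_src = {}
        for r in reports:
            by_src.setdefault(r.src_mac, []).append(r)
        return by_src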

1.6 Terms and Definitions

We use the term service area (SA) to denote the physical space that defines the geographical boundaries for wireless coverage, i.e. where clients need to be physically located to be provided with network connectivity. For example, in an enterprise environment, the service area may be configured as the interior of an office building. While this may be the case in most scenarios, the physical limits of the service area may not coincide with the boundaries for the placement of access points or the enforcement of physical security. For instance, an enterprise seeking additional protection could deploy additional APs acting as sensors in a demilitarized zone (DMZ) around the service area, a buffer zone that is physically secure but where wireless service is denied.

While the service area defines the physical space where wireless service is to be provided, the reference client establishes the minimum hardware requirements, i.e. the capabilities needed by a wireless device to be accepted when located inside the service area. Our services set configuration parameters according to the reference client's capabilities, such as transmission power levels, noise figure estimates, and antenna properties such as directivity and gain. The reference client needs to be set carefully, as there is a trade-off between security and heterogeneity: the more flexible an installation is in terms of the devices it accepts, the weaker the security level achieved. For example, an enterprise that periodically upgrades the hardware used by its employees and changes the definition of the reference client accordingly will always maximize the resources required by attackers.

For our testbed deployment, we use a standard 802.11a/b/g PCMCIA card as the reference client. We assume that its maximum transmission power lies between 15 and 20 dBm, that it is provided with an omnidirectional antenna with 0 dBi of gain, and with a noise figure inferior to 10 dB. These values are compatible with current PCMCIA cards such as the Cisco 350 [4].
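For reference, the testbed parameters above can be gathered into a single configuration record; the structure below is merely an illustrative way to hold them and is not part of the system itself.

    from dataclasses import dataclass

    @dataclass(frozen=True)
    class ReferenceClient:
        min_tx_power_dbm: float = 15.0     # maximum TX power assumed to lie
        max_tx_power_dbm: float = 20.0     # between 15 and 20 dBm
        antenna_gain_dbi: float = 0.0      # omnidirectional antenna
        max_noise_figure_db: float = 10.0  # noise figure assumed below 10 dB

    REFERENCE_CLIENT = ReferenceClient()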

1.7 Original Contributions

The popularity of wireless LANs has exposed a significant security gap between wireless and wired networks, mainly due to how these two technologies leverage physical security measures. Current solutions that address the wireless security problem and that in fact provide satisfactory levels of protection require user credentials, incurring higher management costs. The thesis of this dissertation is that it is possible to leverage physical security measures in wireless LANs using information related to the physical location of clients, therefore improving security in a cost-effective manner. This dissertation makes the following original contributions:

- We show that well-behaved clients within a service area and provided with proper hardware are correctly identified and located accurately, allowing the implementation of location-based access control and improving accountability;

- We show that attacks can be prevented or detected by physical security measures. We demonstrate that in order to extend coverage beyond the intended service area, malicious clients need to either get close to the SA (running afoul of physical security measures) or employ high-gain antennas and amplifiers, making these attempts not only harder to implement but also easier to detect. We also show that identity-based DoS attacks can be detected with high probability unless an attacker is in close proximity to his or her victim;

- We show that our architecture incurs low management costs. The system can configure itself using the access points already deployed, with no operator participation. We show that this automatic calibration achieves satisfactory results despite the challenges associated with signal strength modeling.

The main achievement of our location-based approach is a substantial increase in the costs of mounting attacks beyond the service area, simultaneously providing higher security than basic mechanisms and lower management costs than solutions that rely on user credentials.

1.8 Outline of Dissertation

The remainder of this dissertation describes our system architecture and evaluation. Chapter 2 describes relevant prior work regarding location-based services and security in wireless networks. In Chapter 3 we describe how the system uses the deployed access points to configure itself autonomously. Chapters 4, 5, and 6 present the design of our three proposed mechanisms and their evaluation using measurements in our testbed. Chapter 4 describes our proposed localization system. Chapter 5 presents range-based authentication, while Chapter 6 shows how signalprints can be used to detect identity-based DoS attacks in wireless LANs. Finally, Chapter 7 presents a summary of our results, suggestions for future work, and concludes this dissertation.


Chapter 2

Related Work
This chapter surveys prior attempts to leverage user location information for both general and security purposes. In Section 2.1, we discuss prior work on wireless localization systems. Section 2.2 describes techniques that explore the inherently constrained nature of wireless links for security purposes. In Section 2.3, we present prior work on distance bounding protocols and location verification mechanisms, designed to provide stronger security guarantees than standard localization systems. Finally, Section 2.4 surveys prior work related to detection and mitigation of identity-based denial-of-service attacks. We focus mostly on systems designed for use in wireless LANs, although we also present some relevant contributions in other fields, such as wide-area localization and wireless sensor networks.

2.1 Leveraging User Location Information

This section surveys systems that use measurements from different vantage points to establish the physical location of mobile devices. These systems can be broadly characterized according to the kinds of measurements they use, including estimates of distance (range), angle of arrival, and received signal strength levels. In sections 2.1.1-2.1.3 we focus on systems that employ signal strength measurements to locate mobile devices. Systems based on infrared and ultrasound measurements are discussed in Section 2.1.4. Other relevant localization systems are addressed in Section 2.1.5.


2.1.1 Signal Strength-Based Localization Systems

The main advantage of signal strength-based localization systems is that they can be easily deployed. Estimating the energy level in a wireless channel is necessary to implement techniques such as CSMA, so signal strength statistics can be easily provided by devices on a per-frame basis. For instance, the availability of RSSI (Received Signal Strength Indicator; we use RSSI and signal strength interchangeably) measurements in IEEE 802.11 devices has supported a large volume of research on device localization.

All systems described in this section have at least one significant disadvantage regarding their use as a security building block in large-scale networks. First, they were not evaluated in adversarial settings, where malicious clients try to obfuscate their locations using amplifiers or directional antennas. We examine systems designed for security applications in Section 2.1.2. Second, most systems require a considerable amount of manual calibration before they can be used. One needs to measure signal strength levels at many locations inside the service area to create an RSSI database, which is then used to train the system. We discuss systems designed to reduce the amount of manual calibration in Section 2.1.3.

Pioneer systems such as RADAR [26, 27] and SpotON [64] demonstrated that accurate localization in wireless LANs was possible using only RSSI measurements. Using only 3 access points in a 43.5 m x 22.5 m service area, the best algorithm used by the RADAR system (the empirical method) achieved a median localization error lower than 3 meters (the Euclidean distance between real and estimated locations) [26]. Another contribution of RADAR was to propose the use of path loss models for localization in WLANs, although this approach generated results inferior to those of the empirical method.

At least two localization systems were developed as part of Project Aura using the IEEE 802.11 infrastructure at Carnegie Mellon University [54, 106]. Small et al. showed that signal strength measurements for a stationary device are fairly stable, which agrees with our results [106]. The CMU-PM algorithm designed by the authors, based on pattern-matching techniques, was shown to provide better accuracy than a triangulation-based algorithm (CMU-TMI), although the former incurred a higher training overhead [54].

These earlier systems generated considerable interest in wireless localization and motivated many other researchers to approach the problem using pattern recognition techniques. Instead of modeling signal attenuation directly (which would allow signal strength prediction to be extrapolated to locations that were not sampled), localization is performed by comparing live signal strength measurements to RSSI distributions pre-computed for the finite set of locations visited during system calibration. As the main objective was to improve accuracy, this direction was consistent with the results of earlier work, which favored pattern matching over modeling physical properties directly.

In their design of the Nibble location service, Castro et al. proposed the use of Bayesian networks to locate users at the granularity of rooms [33]. In one of their deployment scenarios, the authors used 10 access points and a set of 12 locations, calibrating the system over the course of a few days. Whenever a location was produced by their algorithm, it was correct 97% of the time, although 15% of the readings did not produce a location estimate.

Roos et al. used machine learning techniques to design a system that achieved an average localization error inferior to 2 meters in a 16 x 40 m office space provisioned with 10 access points [98]. They proposed two localization methods and showed that both performed better than nearest neighbor search (similar to the algorithm used by RADAR). To achieve these results, however, the authors relied heavily on manual sampling: the training data set was created by sampling locations every 2 meters on a grid. The authors also showed that at least one calibration point was needed for every 25 square meters of area for their algorithm to achieve an average error inferior to 2 meters.

Ladd et al. used similar techniques to build a system that provides a median error inferior to 1.5 meters [76]. Their system estimates not only the location but also the orientation of a mobile device. Their measurements were performed inside an office building, within a 32.5 x 17 m area. The authors themselves consider the necessary training one of the disadvantages of their system [76].

The main objectives of the Horus system were to provide better accuracy than that of earlier systems while also decreasing the processing overhead associated with the localization service [119, 120, 121]. Computation requirements were reduced by performing localization at the clients and using clustering techniques, which group locations covered by similar sets of access points [119, 121]. With locations in one of the testbeds covered by an average of 6 APs and with locations sampled every 5 feet to train the system, Horus achieved a median error inferior to 0.5 meters [121]. (The authors also showed that, with the same data set, RADAR [26] and the system proposed by Roos et al. [98] achieved median errors of 2.96 and 0.48 meters, respectively.)
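The pattern-matching approach shared by these systems reduces, in its simplest form, to nearest-neighbor search in signal space. The sketch below illustrates that core idea only; the fingerprint database, AP names, and distance metric are illustrative stand-ins, not the exact algorithms of the systems surveyed above.

    import math

    # Hypothetical calibration database: location -> mean RSSI (dBm) per AP.
    fingerprints = {
        (2.0, 5.0):  {"ap1": -48.0, "ap2": -61.0, "ap3": -70.0},
        (10.0, 5.0): {"ap1": -60.0, "ap2": -52.0, "ap3": -66.0},
        (18.0, 5.0): {"ap1": -72.0, "ap2": -58.0, "ap3": -55.0},
    }

    def locate(sample, db=fingerprints, floor=-95.0):
        """Return the calibration location whose fingerprint is closest
        (Euclidean distance in signal space) to the live RSSI sample.
        APs that did not hear the frame are assigned a floor value."""
        def dist(fp):
            aps = set(fp) | set(sample)
            return math.sqrt(sum((fp.get(ap, floor) - sample.get(ap, floor)) ** 2
                                 for ap in aps))
        return min(db, key=lambda loc: dist(db[loc]))

    print(locate({"ap1": -50.0, "ap2": -60.0, "ap3": -71.0}))  # -> (2.0, 5.0)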

18

CHAPTER 2. RELATED WORK

2.1.2 Making RSSI-Based Systems Robust Against Attacks

Only recently have researchers focused on improving the robustness of RSSI-based localization algorithms against malicious devices. For instance, a client that wishes to disguise its location can use different transmission power levels over time, use amplifiers to generate high RSSI measurements even from distant locations, or use directional antennas to decrease the number of RSSI measurements reported and create different signal strength patterns.

Tao et al. [109] presented techniques to improve the performance of a previously published localization system [76] when locating transmitters with unknown transmission power levels. By modeling the difference between RSSI values reported by pairs of access points, their system was shown to achieve higher accuracy when locating users with different wireless cards or employing different transmission power levels.
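The intuition behind modeling RSSI differences is that an unknown transmission power (or an amplifier) shifts every AP's reading by roughly the same constant, so pairwise differences are largely insensitive to it. A minimal illustration with made-up readings, not the actual model of [109]:

    from itertools import combinations

    def pairwise_diffs(rssi):
        """Map each AP pair to the difference of their RSSI readings (dB).
        An unknown transmitter gain shifts every reading by the same
        constant, so these differences are unaffected by it."""
        return {(a, b): rssi[a] - rssi[b]
                for a, b in combinations(sorted(rssi), 2)}

    normal = {"ap1": -50.0, "ap2": -60.0, "ap3": -70.0}
    boosted = {ap: v + 12.0 for ap, v in normal.items()}  # same spot, +12 dB amplifier
    assert pairwise_diffs(normal) == pairwise_diffs(boosted)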

2.1.3 Decreasing Calibration Efforts

Recently, researchers have focused on decreasing the amount of human effort needed to configure RSSI-based localization systems. Many systems already provide enough accuracy to enable most location-based applications, but reducing the amount of manual configuration is still necessary to decrease overall installation costs and encourage broader deployment.

Krumm and Platt investigated the impact on localization accuracy of reducing both the number of RSSI measurements collected at each location and the number of locations sampled during system calibration [75]. Their algorithm uses an interpolation function to predict RSSI levels at locations that were not sampled during calibration (see the sketch below). Using data collected at 137 locations (one location every 19.5 m²) and a total of 22 APs in a building with an area of 2,680 m², their system provided an RMS error of 3.75 meters. The authors showed that using 50% and 20% of the calibration locations increases the error by respectively 20% (0.74 m) and 42% (1.59 m).

Haeberlen et al. designed a localization system that locates users at the granularity of rooms and requires between 1 and 2 minutes of calibration per cell (an area of approximately 25 m²) [58]. They evaluated their system in a 12,000 m² office building divided into 512 cells and provided with 33 access points (an average of over 14 APs within range of each location). The correct cell is returned by the system for 95% and 90% of the trials when respectively 33 and 17 access points are used.
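To reduce calibration density, interpolation predicts RSSI at points that were never sampled. The sketch below uses simple inverse-distance weighting over hypothetical calibration points; it illustrates the idea but is not the specific interpolation function of [75].

    import math

    # Hypothetical sampled calibration points: (x, y) -> RSSI (dBm) from one AP.
    samples = {(0.0, 0.0): -45.0, (10.0, 0.0): -60.0, (0.0, 10.0): -58.0}

    def idw_rssi(x, y, power=2.0):
        """Inverse-distance-weighted RSSI estimate at an unsampled point."""
        weights, total = 0.0, 0.0
        for (sx, sy), rssi in samples.items():
            d = math.hypot(x - sx, y - sy)
            if d == 0.0:
                return rssi          # exactly on a calibration point
            w = 1.0 / d ** power
            weights += w
            total += w * rssi
        return total / weights

    print(round(idw_rssi(5.0, 5.0), 1))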

Concurrently to our work, Gwon et al. proposed the use of inter-AP measurements to implement a calibration-free localization system [57]. In their scheme, each access point establishes a mapping curve (essentially a path loss model instance) for each neighboring AP, based on periodic RSSI measurements. A mobile client reports RSSI measurements back to a localization server, which establishes the client's position using the mapping curves calculated by the closest AP (the one with the highest reported RSSI). Their algorithm always uses 3 access points for localization and was shown to achieve a median error of 5.4 meters inside an office building with dimensions of 39.8 x 25.5 meters and with 4 APs.

In their design of the LEASE system, Krishnan et al. use special emitters, devices placed strategically throughout the service area, to perform location fingerprinting measurements [74]. All the calibration data is generated using transmissions performed by the emitters and the corresponding RSSI measurements reported by access points. As in the work of Krumm et al., an interpolation function is used to predict RSSI levels at locations that do not contain emitters. They evaluated their system in two buildings with a total area of over 7,000 m² and 9 access points. They estimated the median error to be 4.5 meters when 12 emitters are used and just over 2 meters when 104 emitters are used.

Chen et al. evaluated the use of RFID tags to assist calibration in an RSSI-based localization system [34]. RFID readers placed in corridors were used to detect linear trajectories of mobile users carrying passive RFID tags. Assuming clients move at constant speeds, the system finds the RSSI samples generated by clients at intermediary points while walking from one RFID reader to another, and uses these samples for calibration purposes. Using the same localization algorithm, the authors found that moving from a manual calibration process to an automatic one assisted by RFID tags increased the average localization error slightly, from 2.73 to 2.9 meters.

Lim et al. have recently designed an RSSI-based localization system that also requires no manual configuration [79]. Using inter-AP measurements, their algorithm calculates a signal-distance map, which is then used to estimate the distance from a mobile client to each anchor node. A lateration algorithm is used to calculate a client's geographical position using these distance estimates. With 8 access points distributed across a 26 x 23 m building wing and with RSSI measurements taken at 25 different locations, their system provided a median localization error between 2 and 3 meters. The authors also showed that, at least with their algorithm, localization accuracy changes considerably over time due to fluctuations in signal strength measurements ([79], page 8).

2.1.4 Ultrasound- and Infrared-Based Systems

Systems based on infrared and ultrasound transmissions can provide accurate localization and enable the implementation of many context-aware applications, but are not suitable for our purposes. To use these technologies to secure wireless LANs, current hardware would have to be modified to include infrared or ultrasound capabilities, which would incur non-trivial costs.

The Active Badge system used special badges equipped with infrared transmitters to track people and objects within a building [114]. Badges periodically transmit their unique identifications, which are detected by infrared receivers and relayed to a centralized server. Localization is simpler than in RF-based systems: infrared transmissions do not propagate through walls and other obstacles, transmitters usually have short ranges (6 meters for the hardware used in [114]), and the location of a mobile device can be estimated as the location of the receiver that heard its last transmission.

The system proposed by Harter et al. used RF and ultrasound transmissions to track mobile users equipped with special devices called bats [60, 61]. Periodically, a base station uses an RF transmission to probe each bat device, which responds by transmitting an ultrasonic pulse. Sensors deployed throughout the building, which are reset through the wired network simultaneously with the RF transmission, receive the bat's ultrasonic pulse and report the corresponding times-of-arrival to a centralized server. These are used to estimate the distances between sensors and the bat, which has its location determined through multilateration (sketched below). With 100 ultrasound receivers and two base stations in a building, the system was able to estimate the location of mobile devices to within 9 centimeters of their real locations 95% of the time [61].

Priyantha et al. designed the Cricket system to allow low-cost implementation of in-building location-aware applications while maintaining user privacy [88]. Like the Bat system, Cricket also uses RF and ultrasound transmissions, although in a slightly different way. Periodically, beacon devices spread across a building transmit an RF signal simultaneously with an ultrasonic pulse. Mobile devices equipped with listeners calculate their distances to each beacon using the delay between the RF signal and the ultrasonic pulse and the difference between the speeds of light and sound in air. Using the closest beacon as a location estimate, mobile devices can be localized with a granularity of 4 x 4 feet [88]. The authors later extended this system to calculate the orientation of mobile clients with an accuracy of 3-5° [89].
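Given per-sensor distance estimates like those produced by the Bat system, multilateration reduces to solving the range equations. A self-contained sketch for the two-dimensional, three-anchor case (anchor coordinates and ranges are made up for illustration; real deployments solve an over-determined least-squares system in three dimensions):

    import math

    def multilaterate(anchors, dists):
        """Estimate a 2-D position by linearizing the range equations
        against the first anchor; with three anchors the resulting
        system is exactly determined and solved by Cramer's rule."""
        (x0, y0), d0 = anchors[0], dists[0]
        rows, rhs = [], []
        for (xi, yi), di in zip(anchors[1:], dists[1:]):
            rows.append((2 * (xi - x0), 2 * (yi - y0)))
            rhs.append(d0**2 - di**2 + xi**2 - x0**2 + yi**2 - y0**2)
        (a, b), (c, d) = rows
        det = a * d - b * c
        return ((rhs[0] * d - b * rhs[1]) / det,
                (a * rhs[1] - rhs[0] * c) / det)

    anchors = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
    true = (3.0, 4.0)
    dists = [math.dist(a, true) for a in anchors]
    print(multilaterate(anchors, dists))  # -> approximately (3.0, 4.0)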

2.1.5 Other Localization Systems

Niculescu et al. proposed a system based on angle-of-arrival (AoA) measurements that eliminates calibration but requires special-purpose access points [45]. Their implementation uses APs with revolving directional antennas (horizontal and vertical beamwidth values of approximately 30°, revolving at 33 RPM) that are able to estimate the angle of an incoming transmission with a median error of 22°. Using only AoA measurements and with 7 APs placed across a 56 x 25 m area, their system provided a median error of 2.9 meters with a total of 32 locations sampled. When AoA measurements are coupled with range estimates produced using a path loss model, the median error decreases to 2.1 meters.

Finally, the Global Positioning System (GPS) provides localization services using range estimates based on time-of-flight measurements from at least 4 satellites [48, 55]. Differential GPS (DGPS) is a mechanism proposed to enhance localization accuracy using two receivers located within kilometers of each other. In summary, the largest error sources in the standard GPS system, related to the estimated satellite positions and propagation through the ionosphere, are identical for users in close range and cancel out in relative measurements. The concept behind DGPS inspired the use of differential signal strength values in the design of our signalprint-based mechanism.

2.2 Exploring Location-Limited Channels

Several researchers have proposed the use of location-limited channels (LLCs) for security purposes. An LLC, according to the definition by Balfanz et al. [28], is a communication channel with the property that human operators can precisely control which devices are communicating with each other. While many of these works use an LLC to distribute keys that can be later used for authentication purposes over a communication channel (such as 802.11 links), our work explores the constrained nature of the communication channels themselves to check whether devices are in close proximity to each other. Our approach provides security improvements without requiring extra hardware support (e.g. infrared, ultrasound, or GPS capabilities).

In their pioneer work, Denning and MacDoran proposed location-based authentication as a way to use geodetic localization to improve network security [40]. In their mechanism, a location signature sensor (LSS) uses raw signals transmitted by all GPS satellites within its range to create an unforgeable location signature that is unique to that specific place and time. Mobile clients can acquire these signatures from the LSS and use them for an amount of time on the order of milliseconds to prove their physical location to remote hosts. Their mechanism requires special devices and targets remote authentication, while our protocols target local authentication.

Stajano and Anderson defined a security policy model used to establish secure yet transient associations between two devices [107, 108]. In this model, an imprintable slave (e.g. a new home appliance just taken out of the box) is set to obey the first master (e.g. a remote control) that sends it a secret key over a physically secure channel (e.g. physical contact). After being imprinted, a slave only accepts commands from its current master, which can send the slave device back to the imprintable state with a death command. As masters transmit session keys unencrypted during imprinting (the authors avoided the use of asymmetric cryptography to allow their mechanism to be used in power-constrained devices), the channel used for that purpose needs to be protected against eavesdropping, which justifies their use of physical contact as the communication channel.

Kindberg and Zhang proposed a general model that explores location-limited channels to implement location authentication [71]. They defined receive-constrained and send-constrained channels and proposed several protocols that use them to verify the location of a mobile client. The authors suggest that Bluetooth, infrared, and even IEEE 802.11 can be used as base technologies to implement constrained channels, but they focus on the design of the authentication protocols and provide no empirical data regarding the performance of real communication links. Using their notation, our range-based mechanism can be classified as a protocol that explores a receive-constrained channel.

Balfanz et al. built on the work of Stajano and Anderson and designed a mechanism used to bootstrap trust in ad-hoc networks [28]. In their scheme, two devices first exchange public keys (or secure hashes of these) over a location-limited channel. For example, a user in an airport lounge who points to a specific printer using an infrared link can have reasonable confidence that the public key acquired was actually sent by the device he or she intended to communicate with. With public keys exchanged, the two devices can proceed to authenticate each other over a wider-ranged network (e.g. IEEE 802.11) using standard protocols such as TLS [42] or IKE [68]. Their mechanism allows the use of identity-based authentication in ad-hoc networks without reliance on a wide public key infrastructure. The authors argue that their use of asymmetric cryptography reduces the secrecy requirements on the location-limited channel compared to the work of Stajano and Anderson, allowing them to use audio or infrared for this purpose.

The design of Zero Interaction Authentication, by Corner et al., makes use of a short-range wireless link for communication between a wearable authentication token (e.g. a PDA) and a laptop with an encrypted file system [36]. With traditional encrypted file systems, users need to authenticate periodically to supply a decryption key which unlocks the data on the laptop. Data is vulnerable to theft as long as cached decryption keys are valid, so in these systems there is a trade-off between security and usability. With ZIA, a user authenticates infrequently to the token, which provides transient decryption keys to the laptop much more frequently using a secure wireless channel established using standard Diffie-Hellman techniques. The security of this mechanism depends on the limited range of the wireless link between laptop and token because the laptop maintains unencrypted (and therefore unprotected) data while it can communicate with the token.
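The core of the Balfanz et al. scheme can be summarized as commit-then-verify: a short commitment to the public key travels over the LLC, and the full key, received later over the open wireless link, is accepted only if it matches. A minimal sketch of that pattern, with placeholder key bytes and no real channel code:

    import hashlib

    def commit(public_key: bytes) -> bytes:
        """Commitment sent over the LLC (e.g. infrared): a hash of the key."""
        return hashlib.sha256(public_key).digest()

    def verify(key_from_wireless: bytes, commitment_from_llc: bytes) -> bool:
        """Accept the key received over the open channel only if it matches
        the commitment obtained over the location-limited channel."""
        return hashlib.sha256(key_from_wireless).digest() == commitment_from_llc

    printer_key = b"-----BEGIN PUBLIC KEY----- ..."   # placeholder key bytes
    c = commit(printer_key)                            # delivered via the LLC
    assert verify(printer_key, c)                      # key later sent via 802.11
    assert not verify(b"attacker key", c)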

2.3 Distance Bounding and Location Verification Mechanisms

Distance bounding mechanisms aim to determine or verify the physical distance between two communicating entities while providing strong security guarantees. Location verification protocols usually use distance bounding as a building block to verify the physical location of a mobile host using multiple vantage points. We use the terminology proposed by Sastry et al., in which a verifier (e.g. an AP) checks the correctness of location claims made by a prover (e.g. an 802.11 device) [104].

Given the accuracy targeted by these systems, they usually employ time-of-flight (ToF) measurements using RF signals or ultrasound pulses. The deployment of mechanisms that rely on measurements using RF signals requires not only accurate timing at the verifier, but also strict processing delay guarantees at both devices to produce accurate measurements. If the prover takes 50 nanoseconds to respond, it adds an artificial 15 meters to the range estimate (assuming c = 3 x 10^8 m/s), which may cause the verifier to reject the prover's range or location assertion.

Brands and Chaum proposed a mechanism that establishes an upper bound on the distance between two devices communicating over a wired link, which is used to prevent man-in-the-middle attacks in security protocols [30]. Their proposal consists of a challenge-response protocol in which the verifier times the delay between sending out a challenge bit and receiving back a response.

Waters and Felten designed a protocol that employs round-trip measurements to allow a prover using a wireless network to demonstrate its proximity to a verifier [115]. The authors used this capability to implement a location verification protocol. Hu et al. used similar techniques to prevent wormhole attacks in wireless sensor networks [65]. Čapkun and Hubaux studied the vulnerabilities of multiple positioning techniques and designed verifiable multilateration, an algorithm that uses ToF measurements to verify the position of a wireless device [112, 113]. Similar techniques have also been explored by other researchers [77].

Protocols that use ultrasound pulses for range estimation considerably simplify timing requirements, but the additional hardware capabilities demanded limit their adoption as security building blocks for wireless LANs. Sastry et al. proposed the Echo protocol, which employs time-of-flight measurements involving both ultrasound and RF messages to check whether a client is within a geographical region of interest [104].
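The arithmetic behind RF distance bounding is simple, which is exactly why timing errors dominate its accuracy. A sketch of the bound a verifier computes from a timed challenge-response (the interface is hypothetical; real implementations need nanosecond-scale clocks and hardware-level response paths):

    C = 3.0e8  # assumed propagation speed of the RF signal, in m/s

    def distance_bound_m(rtt_s: float, prover_delay_s: float) -> float:
        """Upper bound on the verifier-prover distance: subtract the
        prover's (claimed) processing delay from the round trip, halve,
        and convert time of flight to distance. Any unaccounted delay
        inflates the bound, hence the strict processing-delay guarantees
        discussed above."""
        return (rtt_s - prover_delay_s) * C / 2.0

    # 600 ns round trip with 100 ns of processing -> prover at most 75 m away.
    print(distance_bound_m(600e-9, 100e-9))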

2.4 Detecting Identity-Based Denial-of-Service Attacks

A malicious wireless device implements identity-based attacks (also called sybil attacks) by changing its identity over time or by using the identities of other devices to impair their ability to communicate. At the link layer, this can be accomplished by simply changing MAC addresses. Most work on wireless intrusion detection, both in industry and in academia, has focused on kinds of attacks that do not require identity spoofing, such as detecting rogue access points and identifying greedy clients that do not respect rules defined by the MAC protocol. We do not address these here.

Bellardo and Savage have shown that effective DoS attacks in 802.11 networks can be mounted with standard hardware [29]. They measured the impact of several identity-based attacks, including ones targeting authentication and association services, and presented practical solutions that can be realized with low overhead and without modifying clients. For example, the authors suggest that access points buffer deauthentication and disassociation requests for brief periods of time (5-10 seconds) before processing them. In this case, conflicting requests would be taken as indications of an attack.

A technique called RF fingerprinting (RFF) has been developed to identify distinct transceivers across multiple wireless systems [47, 110]. The fingerprint for a transmitter is created from several features (such as phase and amplitude) extracted during a period of transient behavior that occurs as the device powers up before a transmission. These turn-on transients tend to be different for each transceiver, allowing even units built in the same factory to be distinguished. RFF systems have been used to detect cloned phones in cellular systems [96], and several researchers have proposed their use in wireless LANs [59, 111]. One disadvantage of RFF is that it requires specialized hardware to measure the needed signal properties with enough precision. Moreover, experimental results have shown that some devices can produce fingerprints that are indistinguishable from each other [47].

Concurrently to our work, Demirbas et al. have proposed the use of RSSI measurements from multiple sensors to detect sybil attacks in wireless sensor networks [39]. As testbed, the authors use up to four Mica2 motes operating as sensors at 433 MHz, with motes always located in close proximity to each other (30 cm to 10 m). Our research demonstrates that reliable attack detection is possible for larger 802.11 installations, where clients can be more than 40 meters from access points.

Mechanisms such as client puzzles have been designed to slow down attack sources, reducing the damage caused by resource depletion attacks [19, 24, 38, 67]. Before any resource is committed to an incoming request, computational puzzles that require CPU- or memory-intensive operations are sent back to clients. Despite being protocol-agnostic, puzzles demand that both clients and servers be modified, increasing deployment overhead when compared to a mechanism that can be implemented solely at the WA.

On a side note, Gruteser et al. have proposed the use of temporary interface identifiers to improve privacy in WLANs: clients change their MAC addresses whenever they associate with an access point, reducing the chances of being tracked [56]. The authors evaluate this mechanism against an attacker that uses signal strength information to identify MAC addresses used by the same client. Our research complements this analysis by showing that with higher numbers of access points, attackers may be able to track clients even after address changes, unless the number of active devices in the network is large enough to create multiple similar signalprints.
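Although Chapter 6 develops our signalprint mechanism in detail, the basic test performed by RSSI-based detectors of this kind can be sketched compactly: two frames claiming different identities but heard at nearly identical strength by every AP probably came from the same transmitter. The threshold and vectors below are illustrative only, not the matching rules derived later in this dissertation.

    def likely_same_transmitter(print_a, print_b, max_diff_db=5.0):
        """Flag two RSSI vectors (AP -> dBm) as probably produced by the
        same physical transmitter if every common AP heard them at nearly
        the same level. The 5 dB threshold is a placeholder."""
        shared = set(print_a) & set(print_b)
        if not shared:
            return False
        return all(abs(print_a[ap] - print_b[ap]) <= max_diff_db
                   for ap in shared)

    frame1 = {"ap1": -55.0, "ap2": -63.0, "ap3": -71.0}   # claims MAC aa:...
    frame2 = {"ap1": -54.0, "ap2": -64.0, "ap3": -70.0}   # claims MAC bb:...
    print(likely_same_transmitter(frame1, frame2))  # True -> possible sybil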

2.5 Conclusion

In this chapter we have identified several issues that need to be addressed before location-based services can be effectively used as security building blocks in large-scale wireless LANs. We showed that major disadvantages of prior work include the requirement of specialized hardware and non-trivial amounts of operator assistance during configuration.

A localization system can be used to monitor activity on a wireless LAN, pinpoint the location of misbehaving clients, and allow location-based access control policies to be implemented. To be readily deployed, such a system needs to leverage the hardware architecture currently used by the network it is trying to protect: requiring additional hardware capabilities increases costs and discourages wide-scale adoption. Moreover, the system needs to configure itself automatically in order to scale to large installations and cope with the dynamics of wireless networks. Finally, it has to be prepared to face malicious devices. We showed that prior work in wireless localization fails to address at least one of these issues. We present a system that addresses them in Chapter 4.

We also showed that security mechanisms that work by controlling or measuring the communication range between entities were either designed with the sole objective of distributing long-term keys or targeted strong security guarantees that are only realizable with specialized hardware. Our proposed range-based authentication mechanism, presented in Chapter 5, explores the inherently constrained nature of wireless links to authenticate clients in close proximity to access points. Our mechanism requires no hardware modification, so it can be readily deployed.

Finally, we showed that effective identity-based DoS attacks can be implemented in wireless LANs and that prior work has not explored the use of location-dependent information (such as RSSI measurements) to identify attack sources. Techniques such as RF fingerprinting, which aim to identify properties of specific devices, are applicable to wireless LANs but require additional hardware support. In Chapter 6 we describe a mechanism that uses information related to the location of mobile devices to provide reliable detection of identity-based attacks while allowing immediate deployment.

Chapter 3

Automatic System Calibration


In this chapter we describe the process of calibration, which is executed autonomously and periodically by our system to collect information regarding signal attenuation within a service area. Calibration results are used by our localization system (Chapter 4) and by our range-based authentication mechanism (Chapter 5).

3.1 Objective

The objective of calibration is to establish proper parameters for a path loss model, a mathematical function able to predict, with reasonable accuracy, signal attenuation within a service area. These model parameters are used by two of our services to estimate RSSI levels given the locations of transmitters and receivers. Our localization system uses them to estimate the physical location of each client based on signal strength measurements reported by the access points, while our range-based authentication protocol uses them to control the size of each cell and to find the minimum power levels required for that purpose. (Throughout this chapter we also use the term model instance to denote a specific set of values for model parameters.)

The main challenge associated with the calibration process is to deal with environment-specific properties without high configuration overhead. Signal attenuation within a service area depends strongly on factors such as construction materials, building layouts, and the number and location of people. This creates a clear trade-off involving accuracy and configuration costs. Services that require accurate prediction often rely on manual site surveys, fine-grained measurements performed by network administrators during system setup. While the models resulting from these measurements are usually quite accurate (they can account even for phenomena that are not well understood), they impose high management costs due to the human assistance required. Furthermore, these are likely to be recurring costs, as access points are usually incrementally deployed and often relocated, requiring new models to be created through additional site surveys.

Our approach is to leverage the increasing number of access points and have our system establish model parameters autonomously, allowing calibration to be performed without operator assistance and therefore with low management costs. Periodically, access points send each other sample frames and report the corresponding RSSI measurements to the WA, which finds the model parameters that best approximate attenuation within the service area. The low prices for WLAN hardware allow service areas to be over-provisioned with APs without a significant increase in total cost of ownership. During calibration, such higher numbers of APs provide more vantage points and therefore produce more accurate model instances. By removing the need for costly site surveys, our approach allows networks to be continuously recalibrated without increasing management costs. This allows the system to react promptly to the installation of additional access points and to dynamic changes in the environment that may affect signal attenuation, such as changes in layout or a sudden increase in the number of people in a building.

In this chapter we show that proper model parameters can be found autonomously by our system, without operator participation. Using measurements in our testbed network, we show that model parameters found during calibration are comparable to values resulting from a finer-grained, manual sampling performed within the same service area. Results presented in the next chapters show that such autonomous configuration has little impact on the accuracy of our services. Consequently, by using the parameters found during calibration, our services can be deployed with acceptable management overhead, which is consistent with our objective to improve security in wireless LANs without a significant increase in management costs.

The rest of this chapter is organized as follows. In Section 3.2 we describe the log-distance model, used in our current implementation. In Section 3.3 we describe the main challenges of modeling signal attenuation and how they are addressed by our system. Section 3.4 describes how our system finds the model parameters that best match a service area, while in Section 3.5 we address calibration in heterogeneous environments. In Section 3.6 we present a detailed description of our testbed, the measurements performed within it, and calibration results. Finally, in Section 3.7 we discuss the use of other path loss models and other alternatives regarding the calibration process.

3.2 The Log-Distance Model

Our current implementation employs the log-distance path loss model [91], which states that received signal strength (or received power, in dBm) at a distance d (in meters) from the transmitter is given by:

\[ Pr(d) = \overline{Pr}(d) + X_\sigma = Pr_0 - 10\,\alpha \log(d) + X_\sigma \tag{3.1} \]

where Pr_0 is the signal strength 1 meter from the transmitter, a function of the transmission power used, the gain of the transmitter antenna (if any), and the expected loss in the first meter. The parameter α is known as the path loss exponent, the first two terms, Pr_0 - 10 α log(d), denote the expected power d meters from the transmitter, and X_σ represents a Gaussian random variable with zero mean and standard deviation σ, in dB [91]. The well-known free-space equation, which states that signal strength (in Watts) decays as a function of the square of the distance, is equivalent to Equation 3.1 with α = 2.

This model accounts for both large-scale path loss and shadowing [91]. The large-scale loss corresponds to the attenuation due to the distance between transmitter and receiver, represented in the equation by the term containing the path loss exponent. Shadowing, represented by the random variable X_σ, accounts for variations in signal strength between locations that are equidistant from the transmitter. Multiple experiments in the literature have demonstrated that shadowing tends to be log-normally distributed, i.e. it can be modeled as a Gaussian variable on a logarithmic scale (dB or dBm) [91]. Equation 3.1 does not account for small-scale fading; we show that a filtering stage can be used during both calibration and localization to reduce the impact of signal strength oscillations caused by this phenomenon.

We selected this model for two main reasons. The first is its simplicity: all it needs is the distance between transmitter and receiver, not requiring details about the environment that would be harder to configure, such as building blueprints and information about the disposition of furniture and other obstacles that affect signal propagation. This is consistent with our autonomy objective: while this model may not be as precise as others, the parameters for a specific SA (α and σ) can be established by the system without operator participation. The second is that this model is well accepted, having been used to approximate path loss inside buildings across a wide frequency range, with plenty of measurements available in the literature. It was first used by Alexander [22], while Seidel et al. report the results of modeling two office buildings at 914 MHz [105]. Medbo et al. [82] performed a similar study at 5.2 GHz with measurements performed within school and office buildings, while Cheung et al. [35] use Equation 3.1 to model attenuation inside a townhouse at both 802.11 frequencies. As shown in Section 3.6, we use this equation to model attenuation in a standard office environment. Other installations that have also been shown to follow this model can be found in [23, 62, 82, 84, 91].
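For concreteness, the deterministic part of Equation 3.1 (the expected power, ignoring the zero-mean shadowing term) is trivial to compute. The sketch below uses parameter values of the magnitude reported later in this chapter; it is an illustration, not part of our implementation.

    import math

    def expected_rssi(d_m, pr0_dbm, alpha):
        """Expected received power (dBm) under the log-distance model
        (Equation 3.1), ignoring the shadowing term X_sigma."""
        return pr0_dbm - 10.0 * alpha * math.log10(d_m)

    # Illustrative values in the range reported in Section 3.6.
    print(round(expected_rssi(30.0, -19.8, 3.65), 1))  # approx -73.7 dBm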

3.3 Challenges

The rate at which signal strength attenuates with distance depends on the characteristics of each environment. It is strongly affected by construction materials; the disposition of walls, furniture, and other obstacles; and the number of people and their movement patterns. All these affect how electromagnetic waves get reflected, diffracted, and scattered within a service area, and hence the resulting signal strength attenuation. In this section we present three issues our system addresses as a result.

Model parameters vary substantially from one installation to another. Measurements in the literature have reported empirical values for α in the range between 1.8 (lightly obstructed environments with corridors) and 5 (multi-floored buildings), while values for σ usually fall between 4 and 12 dB [62, 91]. Values for the path loss exponent within office buildings tend to fall between 3 and 5. For example, Seidel et al. found the best fit for the log-distance model with α = 3.27 when modeling attenuation within a soft-partitioned building with cubicles [105]. Measurements in our testbed network yielded a model instance with α = 4.05 (Section 3.6). Consequently, fixed parameter values cannot be used for all installations: the values for (α, σ) that define an instance of the log-distance model have to be found empirically for each environment. Addressing this issue is the main objective of our calibration mechanism.

Model parameters can vary over time. Signal attenuation can also change over time due to dynamic factors such as the number of people within a service area. As our bodies attenuate wireless signals, an office building full of people will likely generate different channel responses compared to measurements performed during weekends or after hours. For instance, researchers have shown that the number and movement of people affect the magnitude of temporal RSSI variations caused by multipath. Using measurements at 1.1 GHz and varying the distance between transmitter and receiver from 5 to 30 meters, Hashemi et al. demonstrated that the average standard deviation of RSSI variations relative to the median signal strength level increases as the number of people increases, reaching values of respectively 8.7, 16.4, 18.5, and 21.6 dB when 1, 2, 3, and 4 individuals were moving close to the receiver [63]. The same behavior was detected by Moraitis et al. for measurements performed at 60 GHz [85].

We address this problem by making calibration a continuous process, which can be implemented with minimal costs given that it requires no operator assistance. Instead of establishing model parameters once before service is provided, the system periodically calculates new values, and is thus able to account for dynamic changes in the environment without service disruption. While our system can certainly adapt to changes, it is unclear how the number of people within an SA affects model parameters over time. (The results presented above refer to the effects of people on small-scale fading, not on large-scale path loss or shadowing, which are directly modeled by Equation 3.1.) We show in Section 3.4 that the effects of temporal oscillations due to small-scale fading can be reduced by proper filtering. They will only affect model parameters if the base RSSI levels are also affected, i.e., if the values produced by the filtering stage change considerably over time. Only measurements performed within an SA undergoing these situations would allow us to quantify the gain of frequent recalibration.

Multiple model instances can be necessary in heterogeneous SAs. Within a heterogeneous service area (e.g. a multi-story building), the use of a single model instance may produce high standard deviation values and poor prediction accuracy, a consequence of the simplicity of Equation 3.1. In homogeneous environments, with symmetrical layouts and similarly-sized rooms, a single instance can be used with reasonable accuracy. We show later in this chapter that this is the case for our network testbed. In heterogeneous environments, significant improvements can be achieved using multiple model instances that account for the distinct characteristics of each building section. For example, Seidel et al. present the results of modeling signal attenuation within two multi-story office buildings, one with 4 and the other with 5 floors [105]. Using a single instance to cover both buildings (with measurements both within and between floors), the authors found a standard deviation of 16.3 dB. With two models, one for each building, linear regression produced models with 12.8 and 13.3 dB of standard deviation. Finally, by modeling three wings within two floors separately, the authors significantly improved signal strength prediction, with σ values found between 4.1 and 8.3 dB.

Our system follows this approach and uses multiple model instances to account for heterogeneity. By default, different model instances are used for different floors in a multi-story building. Additionally, multiple instances can be used for different sections of the same floor, as we discuss in Section 3.5.

3.4 The Calibration Process

The calibration process comprises three steps, which are described in the following sections.

3.4.1 Sample Generation

During the first step, the WA collects signal strength measurements relative to each pair of access points. Each AP periodically scans other channels, measures received signal strength for transmissions performed by other access points operating in that frequency, and reports the RSSI levels detected back to the WA. The WA coordinates this process, which continues until samples have been collected for each pair of APs in each direction. This procedure can be executed while the network is operating, without significantly affecting active clients.
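The coordination logic is straightforward. The sketch below shows the round-robin structure of one calibration round, with stub AP objects standing in for the real agents; the method names and the simulated reports are hypothetical, not the actual application running at our APs.

    import random

    class AP:
        """Minimal stand-in for an access point agent (hypothetical API)."""
        def __init__(self, ap_id, channel):
            self.id, self.channel = ap_id, channel
        def tune(self, channel):
            pass                         # a real agent would retune its radio
        def send_samples(self, n):
            pass                         # a real agent would transmit n frames
        def collect_rssi_reports(self):
            # Placeholder: real per-frame reports come from the driver.
            return [round(random.gauss(-60, 5), 1) for _ in range(200)]

    def calibration_round(aps, frames_per_pair=200):
        """One WA-coordinated round: each AP transmits in turn while the
        others listen on its channel and report per-frame RSSI."""
        log = {}
        for tx in aps:
            for rx in aps:
                if rx is not tx:
                    rx.tune(tx.channel)
            tx.send_samples(frames_per_pair)
            for rx in aps:
                if rx is not tx:
                    log[(tx.id, rx.id)] = rx.collect_rssi_reports()
        return log

    samples = calibration_round([AP(i, 1 + (i % 3) * 5) for i in range(3)])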

3.4.2 Filtering

The objective of the filtering step is to reduce the impact of temporal signal strength oscillations on the model parameters produced. The procedure just described generates a set of signal strength samples for each pair of access points in each direction. During filtering, a more reliable RSSI value is extracted from this distribution. Based on the samples transmitted from AP_i to AP_j, a filtered signal strength level is calculated, which we denote by Pr_ij.

[Figure 3.1: three panels plotting RSSI variation (dB) against sample index (0-200), one per AP pair.]

Figure 3.1: Temporal RSSI oscillations relative to the 75th percentile detected during calibration for three pairs of access points (measurements described in Section 3.6).

These temporal RSSI variations, mostly caused by multipath, affect even stationary clients. They are worsened by the movement of obstacles or people near receivers and transmitters, with oscillations as high as 30 dB. Filtering helps find the base RSSI level over which these temporal oscillations are superimposed, lowering the standard deviation values associated with instances of Equation 3.1 and consequently improving signal strength prediction and localization accuracy.

Our current implementation of the filtering stage returns a high percentile of the signal strength distribution for each pair of access points. The numbers presented in this dissertation were generated using the 75th percentile. For example, Figure 3.1 shows the calibration measurements relative to three pairs of access points. The base RSSI level in each graph (0 dB, shown as a dotted line) is the 75th percentile of the distribution for that AP pair. While multipath can cause both constructive and destructive variations in signal strength, our measurements suggest that strong oscillations (> 15 dB) are mostly destructive. In this scenario, higher percentiles of the distribution provided more reliable estimates for Pr_ij than the median or average RSSI levels.

Reducing the impact of RSSI oscillations is equally important when establishing the physical location of a transmitter, so a similar filtering mechanism is implemented by our localization system. In Chapter 4 we present additional measurements performed in our network testbed that further demonstrate the effectiveness of filtering.
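A nearest-rank percentile is all the filtering stage needs computationally. A sketch of the per-pair computation (the sample values are illustrative):

    def filtered_rssi(samples, percentile=75):
        """Base RSSI level for one (transmitter, receiver) pair: a high
        percentile of the per-frame measurements, which discounts the
        mostly destructive multipath fades discussed above."""
        ordered = sorted(samples)
        idx = min(len(ordered) - 1,
                  int(round(percentile / 100.0 * (len(ordered) - 1))))
        return ordered[idx]

    samples = [-62.0, -60.5, -84.0, -61.0, -59.5, -75.0, -60.0, -61.5]
    print(filtered_rssi(samples))  # -> -60.5 (75th percentile)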

3.4.3 Model Instance Computation

Given that the WA knows not only the location of all access points but also the transmission power used by each of them, it uses the filtered RSSI levels to create a tuple of the form (d_ij, Pr_ij) for transmissions from AP_i to AP_j, where d_ij denotes the physical distance between the two APs. Tuples like these are generated for all pairs of access points within range of each other. Linear regression is then used to find the model parameters (α, σ) that best match the RSSI samples collected within a service area, a standard procedure [105, 91]. As shown in Equation 3.1, the log-distance model is a linear function relating signal strength on a logarithmic scale (such as dBm) to the logarithm of the distance between transmitter and receiver. The slope of the best linear fit and the sample standard deviation s are used as estimators for α and σ, respectively.
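With the intercept fixed at Pr_0 (as in the evaluation of Section 3.6, which assumes free-space attenuation for the first meter), the fit reduces to regression through the origin on the transformed variables. A sketch with made-up tuples; the dissertation does not spell out the exact regression code, so this is one consistent realization:

    import math

    def fit_log_distance(tuples, pr0_dbm):
        """Least-squares fit of Equation 3.1 to (distance_m, rssi_dbm)
        tuples with the intercept fixed at Pr0; returns (alpha, sigma),
        where sigma is the sample standard deviation of the residuals."""
        xs = [10.0 * math.log10(d) for d, _ in tuples]   # regressor
        ys = [pr0_dbm - pr for _, pr in tuples]          # loss beyond Pr0
        alpha = sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)
        resid = [y - alpha * x for x, y in zip(xs, ys)]
        sigma = math.sqrt(sum(r * r for r in resid) / (len(resid) - 1))
        return alpha, sigma

    data = [(6.0, -48.0), (12.0, -59.0), (24.0, -72.0), (40.0, -79.0)]
    print(fit_log_distance(data, pr0_dbm=-19.8))  # alpha near 3.7 here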

3.5 Handling Heterogeneous Environments

The number of model instances to use within a service area will vary according to how homogeneous the environment is with respect to signal attenuation. For example, multiple instances may be necessary to provide reasonable accuracy in an installation where layouts or construction materials vary substantially from one floor to another.

This question exposes a trade-off between specificity and bias. On one hand, the system could compute a different model instance for each access point based on the measurements it reports relative to its neighbors. The value for α, for example, would be specific to that access point and take into account attenuation caused by its immediate environment, but would also have a higher bias given the smaller number of samples. On the other hand, aggregating all RSSI measurements into a single model instance minimizes the impact of measurement errors, but may reduce the quality of signal strength prediction in some sections of a heterogeneous building.

An approach to this problem that seems promising is to calculate model instances in an iterative manner, as sketched below. First, a different value is calculated for each access point using all the measurements in which it acts as receiver or transmitter. The result is a set of instances (α_i, σ_i), one per AP. Then, the α values produced by APs on the same floor are compared. If the difference between the highest and the lowest values of α is smaller than a predefined maximum value α_maxDiff, the environment is considered homogeneous enough to use a single model instance. (Our measurements suggest that α_maxDiff = 1.0 is a good starting point.) If that is not the case, APs can be partitioned into groups according to their physical locations and α values such that the requirement above is satisfied. Further research is needed to evaluate different partitioning schemes and their impact on localization accuracy. As we show in Section 3.6, our testbed is not heterogeneous enough (and probably not large enough) to justify the use of multiple model instances.

The main advantage of this approach is that it allows the system to minimize the number of model instances used (and therefore reduce measurement bias) while also allowing heterogeneous sections to be detected. As in our testbed, a single model can be used with confidence if per-AP α values do not differ considerably.
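The homogeneity test at the heart of this procedure is a one-line comparison. A sketch using per-AP exponents similar to those reported in Figure 3.6 (the AP numbering here is arbitrary):

    def homogeneous(per_ap_alpha, max_diff=1.0):
        """Decide whether one model instance suffices for a floor by
        comparing the spread of per-AP path loss exponents against
        alpha_maxDiff (1.0 is the starting point suggested above)."""
        values = list(per_ap_alpha.values())
        return max(values) - min(values) < max_diff

    alphas = {1: 3.92, 2: 3.62, 3: 4.03, 4: 3.25, 5: 3.97, 6: 3.37}
    print(homogeneous(alphas))  # True: spread is 0.78, below 1.0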

[Figure 3.2: floor plan with X coordinate 0-40 m, Y coordinate 0-20 m, showing the positions of APs 1-12.]

Figure 3.2: Our 45 x 24 m service area in the Gates Building 4A Wing with the placement of access points.

3.6 Evaluation

In this section we present the results of the calibration process in our network testbed. In Section 3.6.1 we describe in detail our deployment scenario and the hardware used, while Section 3.6.2 describes the two data sets used to evaluate the calibration process. In Section 3.6.3 we present the model parameters calculated autonomously by the APs and compare them to values produced by a site survey we conducted within the same SA. Some of the measurements presented in this section will also be used in Chapters 4 and 5 to evaluate our localization system and authentication mechanism.

3.6.1 Network Testbed

We have deployed and evaluated our system in a standard office environment. As service area, we used the A-Wing on the 4th floor of the Computer Science Building at Stanford University. As shown in Figure 3.2, the SA measures approximately 45 x 24 meters (147 x 78 ft), with a mix of offices (most of them measuring 3 x 6 m), large labs (at least 8 x 4.5 m), and long corridors.

The network infrastructure consists of 12 IEEE 802.11b/g access points connected to a dedicated server that executes the role of the WA. The access points, whose locations are also shown in Figure 3.2, are mounted at the ceiling (height of 2.5 m), with an approximate density of one AP per 90 m² (955 ft²). They are off-the-shelf Linksys WRT54G units (hardware versions 3.0 and 4.0) running OpenWrt WhiteRussian rc3 [17], a Linux distribution. We developed an application that runs at the APs, monitoring transmissions over all wireless channels used in the building and reporting signal strength measurements back to the server.

3.6.2 Data Sets

The first data set consists of calibration sample frames transmitted by the access points (we call it the calibration data set). We followed the procedure described in Section 3.4 and used our access points to establish the model that best approximates signal attenuation in our service area. At each calibration round, one AP was used as a transmitter while the others acted as sensors and reported an RSSI estimate for each received frame. As calibration traffic we used ping packets sent at a rate of 10 packets per second. As not all access points are in range of each other, a total of 128 tuples (distance, RSSI) were generated, with distances between 6 and 40 meters. Our APs transmit at 18 dBm using standard dipole antennas (2.2 dBi of gain).

A second data set was created through a site survey: we manually sampled a total of 135 locations within our service area, as shown in Figure 3.3 (we call this the survey data set). These measurements were not used to configure any of our services, being used only for evaluation. At each location, a laptop provided with a Cisco 340 PCMCIA card (transmission power of 15 dBm) transmitted ping packets at rates between 10 and 20 packets per second for approximately one minute, as our access points had to hop through the three non-overlapping 802.11 channels at 2.4 GHz. The client annotated each packet transmitted with an ID for its current location, while each access point in range tagged it with the RSSI level detected during reception and forwarded it to the localization server, which logged all measurement traffic. At all locations, the laptop used for measurements had the same orientation: it was held by the user in front of his body and at waist level, with the user facing North as indicated in Figure 3.3.

Figure 3.3: All the 135 locations sampled to create the survey data set.

There are two limitations of our data sets. First, all of our measurements were collected within a single floor. As a result, we are not able to evaluate signal attenuation between floors and the corresponding impact on our services. For example, we are unable to compare different approaches for client localization in multi-story buildings and the effectiveness of attacks mounted from floors adjacent to a service area, as we discuss in Chapter 4. Second, our measurements do not allow us to evaluate changes in signal attenuation over time. We are therefore unable to measure how the accuracy of our services changes with dynamic changes in the environment, such as an increase in the number of people within an SA or a change in their movement patterns.

3.6.3 Calibration Results

We first look at the model parameters produced autonomously by the access points. The results agree with previously published values in that attenuation is much stronger than predicted by the free-space model (α = 2.0). Using all 128 samples as input to linear regression, the best fit for Equation 3.1 was found to be α = 3.65, σ = 6.59 dB, as shown in Figure 3.4. (We assume free-space attenuation for the first meter, using Pr_0 = -19.8 dBm during regression.)

One interesting result evident in Figure 3.4 is that received signal strength cannot be assumed to be symmetric. That is, for two identical stationary devices (two APs) using the same output power, the RSSI levels produced in each direction may differ considerably.
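The text above states the resulting Pr_0 values directly; the sketch below reproduces them under the stated free-space assumption for the first meter at 2.4 GHz (the exact channel frequency within the 2.4 GHz band barely changes the result).

    import math

    def pr0_dbm(tx_power_dbm, antenna_gain_dbi, freq_hz=2.4e9):
        """Pr0 under the free-space assumption for the first meter:
        transmit power plus antenna gain minus free-space path loss
        at 1 m, i.e. 20 log10(4 pi f / c)."""
        c = 3.0e8
        fspl_1m = 20.0 * math.log10(4.0 * math.pi * freq_hz / c)
        return tx_power_dbm + antenna_gain_dbi - fspl_1m

    print(round(pr0_dbm(18.0, 2.2), 1))  # -> -19.8 (the APs, used above)
    print(round(pr0_dbm(15.0, 0.0), 1))  # -> -25.0 (the PCMCIA card, used below)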

[Figure 3.4: scatter plot of signal strength (dBm) against distance (m, log scale, 1-50), with the fitted line labeled α = 3.65, σ = 6.59 dB.]

Figure 3.4: Measurements in the calibration data set and the best fit found for the log-distance model.

Multiple access point pairs in our measurements recorded different filtered signal strength values in each direction, with some differences as high as 16 dB. It is unclear how exactly these differences were generated. They are likely a combination of hardware dissimilarities and changes of path loss over time, given that the measurements in the two directions were not executed simultaneously. For example, additional attenuation affecting only one direction could be caused by people moving around or doors being closed.

Figure 3.5 shows that the model parameters found during calibration provide close approximations to the values produced by our site survey. The figure shows the best fit for the log-distance model using the filtered RSSI values from the survey data set. It uses all the access points within range for each of the 135 locations sampled (a total of 1471 points) and assumes Pr_0 = -25 dBm, as our PCMCIA card transmits at 15 dBm. Compared to the calibration data set, the rate of attenuation (α) is slightly higher (4.05 instead of 3.65), while the standard deviation is actually lower (5.64 dB instead of 6.59 dB).

We find these results encouraging for at least two reasons. First, similar values were found despite the hardware disparity between the APs (used as transmitters during calibration) and the PCMCIA card used to create the survey data set; we simply accounted for the differences in transmission power and antenna gain. Second, transmitters and receivers are coplanar in the calibration set (all APs are mounted at the ceiling), which was not the case for the survey data set, where transmitters were carried approximately at waist level. We accounted for the difference by using distances in three-dimensional space in both cases.

[Figure 3.5: scatter plot of signal strength (dBm) against distance (m, log scale, 1-50), with the fitted line labeled α = 4.05, σ = 5.64 dB.]

Figure 3.5: Measurements in the survey data set and the best fit found for the log-distance model.

Our measurements also agree with previously published results in that deviations from the mean can be closely approximated by a log-normal distribution (a normal distribution in dBm) [91]. In our survey data set, respectively 35.3%, 65.7%, 95.5%, and 100% of RSSI measurements are within 0.5, 1, 2, and 3 standard deviations of the value predicted by the path loss model. The respective percentages for a normal distribution are 38.3%, 68%, 95%, and 99.7%.

The effectiveness of filtering becomes clear when we use the average and median RSSI levels (instead of the 75th percentiles) to calculate model parameters. For our survey data set, for example, using respectively the average and median signal strength levels at each location yields path loss models with standard deviations of 6.73 and 5.99 dB, higher than the 5.64 dB found with the 75th percentile (Figure 3.5). The lower standard deviation produced with filtering enabled yields better signal strength prediction and improves the accuracy of our services.

In the next chapters we show that the small difference between the model parameters produced by the two data sets has little impact on the accuracy of our services.

3.6.4 Measuring Heterogeneity

Our service area is not heterogeneous enough to justify the use of multiple path loss model instances. Figure 3.6 shows the value of α calculated for each AP, following the approach described in Section 3.5. As shown in the figure, no two access points generate path loss exponent values with a difference larger than 0.8. As expected, values are higher for access points located inside offices than for those in corridors, which have some other APs almost in line-of-sight.

Figure 3.6: Path loss exponent values (α_i) found for each AP (ranging from 3.25 to 4.03).

Figure 3.7 shows the values found when access points are grouped into two groups of six. The difference between the two values is only 0.09. This corresponds to a difference of only 1.3 dB in terms of signal strength prediction for a mobile device located 30 meters from an access point.
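This last figure follows directly from Equation 3.1, since the prediction gap between two model instances that differ only in α grows as 10 Δα log(d):

\[ \Delta Pr(30\,\mathrm{m}) = 10 \times 0.09 \times \log_{10}(30) \approx 10 \times 0.09 \times 1.477 \approx 1.3\ \mathrm{dB}. \]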

3.7 Discussion

There are many alternatives regarding the calibration process that may improve the results achieved by our services in other installations. In this section we discuss some of the trade-offs involved, but leave the complete evaluation of these alternatives as future work.

[Figure 3.7: floor plan with the two AP groups labeled 3.75 and 3.66.]

Figure 3.7: Path loss exponent (α) values found for each group of 6 APs.

3.7.1 Alternate Path Loss Models

The choice of path loss model is a function of a trade-off between configuration costs and accuracy. While in this dissertation we focus on the use of the log-distance model, mainly due to its simplicity, other path loss models can also be used in conjunction with our services. Researchers have shown that standard deviation values can be decreased by using more detailed models that take into account construction materials, the disposition of walls, and other environmental properties relevant to signal propagation that are not accounted for by Equation 3.1. This accuracy improvement, however, comes at the cost of increased management overhead.

Several researchers have demonstrated that signal strength prediction improves if the attenuation caused by some obstacles, such as external walls and soft partitions, is modeled explicitly [86, 69, 105, 91]. In this path loss model, which we call the partition model, the expected signal strength level d meters from the transmitter is given by:

\[ Pr(d) = Pr_0 - 10\,\alpha \log(d) - \sum_{i=1}^{N} k_i F_i + X_\sigma \tag{3.2} \]

where F_i is the attenuation factor (in dB) associated with the ith of the N types of obstacles modeled, and k_i is the number of such obstacles between transmitter and receiver. Motley et al. proposed an instance of Equation 3.2 that accounted for the attenuation caused by floors in multi-story buildings, i.e., they explicitly modeled a single class of obstacles (floors), for which an average attenuation factor (F) was found empirically [86]. The same authors later extended their model to account for both floors and walls [69]. In [105], Seidel et al. present a study of attenuation at 914 MHz and compare the performance of three models: the log-distance model, an instance of the partition model that accounts only for floor attenuation, and a second instance that explicitly models the attenuation caused by soft partitions and concrete walls for measurements within a single floor. Using the third model, the authors were able to further reduce the standard deviation to 4.1 dB.

While it has the advantage of reducing standard deviation and improving prediction accuracy, the main disadvantage of the partition model is that it increases the costs of system calibration. First, it requires detailed blueprints of the service area, used to establish the type and number of obstacles between transmitters and receivers. Second, attenuation factors have to be found empirically for each installation, which requires operator assistance and increases overall costs.
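Prediction under the partition model adds one bookkeeping step to the log-distance computation: counting the obstacles of each type along the transmitter-receiver path. A sketch with illustrative attenuation factors (placeholders, not the values measured in [105]):

    import math

    def partition_model_rssi(d_m, pr0_dbm, alpha, obstacles, factors):
        """Expected RSSI (dBm) under Equation 3.2: log-distance
        attenuation plus a fixed per-obstacle loss for each obstacle
        type crossed, ignoring the shadowing term."""
        wall_loss = sum(k * factors[kind] for kind, k in obstacles.items())
        return pr0_dbm - 10.0 * alpha * math.log10(d_m) - wall_loss

    factors = {"soft_partition": 1.4, "concrete_wall": 2.4}   # dB, illustrative
    path = {"soft_partition": 3, "concrete_wall": 1}          # obstacles crossed
    print(round(partition_model_rssi(20.0, -20.0, 3.5, path, factors), 1))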

Pr(d) = Pr_0 − 20·log(d) − β·d + X_σ        (3.3)

where β is found empirically and expressed in dB/meter. This model was first proposed by Devasirvatham et al., who used it to model path loss within two office buildings at 850 MHz and 4 GHz [41]. Medbo et al. compared the results of using both Equations 3.1 and 3.3 to model attenuation at 5.2 GHz within an office building and a school building [82]. Their results are inconclusive regarding the relative power of these two models, as the best fit for the


school was found using the log-distance model, while the best fit for the office building was an instance of Equation 3.3.
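To make the comparison concrete, the sketch below evaluates the three mean-value prediction equations side by side. It is illustrative only: the shadowing term X_σ is omitted, and the parameter values in the example are assumptions rather than measured figures.

import math

def log_distance(pr0, alpha, d):
    # Equation 3.1, mean value: Pr(d) = Pr_0 - 10*alpha*log10(d)
    return pr0 - 10 * alpha * math.log10(d)

def partition_model(pr0, alpha, d, obstacles):
    # Equation 3.2: additionally subtract F_i dB for each of the k_i
    # obstacles of type i in the path; obstacles = [(k_i, F_i), ...]
    return log_distance(pr0, alpha, d) - sum(k * f for k, f in obstacles)

def exponential_loss(pr0, beta, d):
    # Equation 3.3: free-space exponent plus beta dB of loss per meter
    return pr0 - 20 * math.log10(d) - beta * d

# Illustrative comparison at d = 30 m (all parameter values assumed):
print(log_distance(-20.0, 3.65, 30.0))
print(partition_model(-20.0, 3.0, 30.0, [(2, 4.8)]))  # two walls, 4.8 dB each
print(exponential_loss(-20.0, 0.5, 30.0))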

3.8 Summary and Conclusion

This chapter described the mechanism used by our system to establish a path loss model instance that approximates signal strength attenuation with distance inside a service area. As discussed, the rate at which a signal is attenuated within an environment strongly depends on construction materials, building layout, and the disposition of furniture and other obstacles. For this reason, the model parameters that best fit a specific environment have to be found empirically. First, we showed how our system can determine these parameters autonomously, using measurements performed by the access points already deployed and without operator assistance. Then we showed that the parameters resulting from such automatic calibration are comparable to the ones produced by a finer-grained manual sampling within the same building. In the next chapters we show that such automatic setup has little impact on accuracy and quality of service within the SA. As the system does not require operator assistance for calibration, reconfiguration costs are kept to a minimum and our location-based services can improve security in a cost-effective manner.

Chapter 4

Robust Localization

In this chapter we present the design and evaluation of a localization system that uses signal strength measurements collected by access points to locate wireless devices within the service area. We show that it is robust against attacks and that it allows the implementation of location-based access control policies. We also show that it scales to large numbers of access points with minimal configuration overhead.

4.1 Motivation

A properly-designed localization system can be used to improve both accountability and access control in a wireless LAN. To improve accountability, a localization system needs to be accurate. If the network can track all active clients within a service area to within 2 or 3 meters of their real locations, it has enough information to react effectively to any kind of client misbehavior. To allow the implementation of location-based access control, however, an RSSI-based system needs to be robust against attacks directed at it. It needs to assume that malicious devices will try to access the network from locations beyond the boundaries of the service area, and it must be prepared to face clients that use dissimilar transmission power levels and resort to amplifiers and directional antennas to obfuscate their real locations.

We address a variant of the standard localization problem: given a set of signal strength measurements (a signal pattern), an algorithm should check whether it successfully matches a location inside a service area of interest. The SA defines the geographical limits for localization and consequently the solution search space. As the algorithm is expected to reject patterns created by transmitters located outside the SA, localization needs to be performed


on the network side. The output of the algorithm is a location inside the SA (a two- or three-dimensional position) tagged with a label, either accepted or rejected.

We present the design and evaluation of an RSSI-based localization system that can be used for security applications while also allowing networks to scale to large numbers of access points. We have tested our system in our network testbed, having performed measurements both inside and outside the building for evaluation. We show that it provides:

Autonomous configuration. All the information the system needs regarding signal attenuation within a service area is found during system calibration. This makes the system self-configurable: an operator inputs the dimensions of the environment and the location of the access points but is not required to perform site surveys. This also allows our system to take advantage of access points that are incrementally deployed without incurring additional configuration costs.

Adequate accuracy and low false-negative rates. Clients physically located inside the SA and provided with hardware compatible with our reference client are located accurately and rejected with low probability. For instance, we show that in our testbed, with 12 access points covering a 45 m x 24 m area, clients are located with a median error of 1.88 meters. Moreover, even with an aggressive configuration, the rate of false negatives in our testbed is below 3%.

Low false-positive rates. Malicious devices with omni-directional antennas located outside the SA (our building) are able to mount undetected attacks only when provided with the necessary gain and positioned within 20 meters of external walls. As the system requires additional power from transmitters outside the SA, intruders using portable devices (e.g. PCMCIA cards) can only mount successful attacks close to the building (<10 meters). At large distances (>30-40 m), the RSSI patterns produced by external transmitters differ significantly from what is seen within the SA, so attacks are detected with high probability even if intruders are provided with unbounded transmission power. We also show that it is hard for intruders to find the exact transmission power level to use, increasing the chances of attack detection and further reducing the practical attack range to around 20 meters. Finally, we show that strong guarantees against attackers with directional antennas can also be achieved, although at the expense of additional APs located outside the SA.

The remainder of this chapter is organized as follows. Section 4.2 describes our attack model. Section 4.3 provides an overview of our system, while Sections 4.4-4.7 describe it


in more detail. We evaluate our system in our testbed in Section 4.8 and discuss its limitations in Section 4.9. Finally, our conclusions are presented in Section 4.10.

4.2 Attack Model

We assume that unwanted users are kept outside the service area. In enterprises, this is achieved through standard physical security measures. In cafeterias and other public hotspots, all clients within range can be treated as legitimate clients, allowing our system to be used despite the weaker physical security level. In both scenarios, attackers are limited to the area around the service area, from where they try to fool the localization system.

We assume that attackers can employ a wide range of off-the-shelf wireless equipment. The extent to which an attacker is able to extend service beyond the boundaries of a SA is a function of his hardware resources. We show that the larger the distance between him and the service area, the higher the demands on output power. We consider a wide range of attackers, from ones that are limited to standard PCMCIA cards to others that are provided with amplifiers and directional antennas.

We assume that the wireless infrastructure (including access points and server) has not been compromised by attackers. First, we assume that the legitimate access points already deployed have not been tampered with and that they report RSSI levels based on sound measurements. Second, we assume that no rogue access points feed measurements to the localization server. Both of these goals can be achieved through proper network partitioning and standard physical security measures. For example, access points can be mounted at the ceiling or in other locations with limited access. Consequently, all signal strength measurements reported can be trusted by the localization server.

4.3 Overview

An overview of our localization system is presented in Figure 4.1. As shown, the state needed by the system to estimate signal attenuation within the SA is calculated during system calibration. The system also uses as inputs the definition of the service area and the location of all access points.

Figure 4.1: Localization system overview.

At runtime, the location of each client is established in a three-stage process:

1. Filtering. Signal strength oscillates even for stationary devices due to multipath and other phenomena. (In our measurements, oscillations were as high as 30 dB.) Similarly to what is performed during calibration, the filtering stage presented in Section 4.5 extracts more reliable RSSI statistics to be used during position estimation.

2. Position estimation. In the second stage, the system finds the location inside the service area and the transmission power level that best match the pattern presented by a transmitter (described in Section 4.6). The position returned is the one with the smallest error value, a function of the difference between the measured RSSI levels and the values predicted by the path loss model.

3. Verification. In order to be accepted, the solution found in stage 2 has to satisfy a set of conditions, presented in Section 4.7. For instance, a location estimate is rejected if its error value is too high, if it is not supported by a minimum number of access points, or if RSSI values are too low. These conditions are used to improve accuracy, enhance the system's ability to differentiate between indoor and outdoor transmitters, and also to impose extra resource demands on devices located outside the service area, therefore increasing the costs of compromise attempts.


As a result, a position estimate is calculated periodically for each device and labeled with either accepted or rejected. We say that a client is successfully localized if he produces a location estimate labeled with accepted.

4.4 Client Requirements

The objective of our algorithm is to locate accurately and accept all clients that satisfy the list of requirements shown below; i.e., a client that follows these guidelines provides the system with enough information about its position and should not fail localization. In the implementation of location-based access control, these constitute the requirements for being granted network service:

1. Physically located within the service area;

2. Provisioned with omni-directional antennas;

3. Capable of transmitting above a pre-determined power level.

The first requirement is derived directly from our objective to improve security by leveraging the physical security measures implemented around the SA. A device will be served only if an acceptable location is found for it within the service area. The other two are hardware requirements specified according to our current reference client: an 802.11 PCMCIA card. The objective of these requirements is to increase the number of access points within a client's range, improving localization accuracy and the performance of the verification phase. As shown in Section 4.7, they improve the algorithm's ability to differentiate between RSSI patterns generated inside and outside the service area, decreasing the rates of false positives and false negatives.

While off-the-shelf transmitters (including 802.11 PCMCIA cards) are not perfectly omnidirectional, they still create patterns centered about their locations, with power transmitted in all directions. This allows our system to disregard the orientation of clients and calculate expected signal strength based solely on their distances to access points. As we show in Section 4.8, any directivity effects induced by PCMCIA cards can be successfully modeled together with other noise in the measurements.


Term                      Meaning
(x_i, y_i)                coordinates of the ith access point.
(x*_T, y*_T, Pr*_0)       real transmitter position.
(x_T, y_T, Pr_0)          estimated transmitter position.
PL_ij                     path loss (in dBm) between access points i and j.
Pr_0^min, Pr_0^max        minimum and maximum Pr_0 values used during search.
Pr_i                      filtered signal strength (in dBm) relative to the ith access point.
T                         signal strength threshold (in dBm).
L                         confidence level.
Q                         minimum access point quorum.
α, σ                      path loss coefficient and model standard deviation.

Table 4.1: Notation used to describe the localization system.

4.5 Filtering

Filtering, the first processing stage, is responsible for generating reliable signal strength statistics for each (client, AP) pair given the measurements reported in the recent past (on the order of seconds). Each frame transmitted by a client and successfully received by an access point within range is labeled with the RSSI level determined during frame reception. For a given client, the input to filtering consists of a set of RSSI measurements relative to each AP. Using the notation shown in Table 4.1, the output is the set of filtered signal strength values {Pr_1, Pr_2, ..., Pr_n}, with one value for each access point. (We consider a single client for simplicity, avoiding the use of multiple indexes or superscripts.)

Our current definition of the filtering stage simply returns a high percentile of the RSSI distribution for each (client, AP) pair. All the results presented in this chapter were collected using the 75th percentile. I.e., for the ith AP, the filtered RSSI value Pr_i is the 75th percentile of the distribution that arises from the samples Pr_i^1, Pr_i^2, ..., Pr_i^N reported for the client.

The main objective of the filtering stage is to minimize the effects of small-scale fading on localization. Small-scale fading, which is not directly modeled by Equation 3.1, creates RSSI oscillation over time even for stationary devices. Filtering helps lower the deviation associated with Equation 3.1, improving RSSI prediction and localization accuracy. Similarly to the behavior detected during calibration, when both transmitters and receivers were mounted at the ceiling, signal strength fluctuations can be as strong as 30 dB. For example, Figure 4.2 presents examples of RSSI variation relative to the 75th percentile (dotted line). Each graph was created with the samples collected at a single location relative


Figure 4.2: RSSI oscillation for a stationary device. Each of the four panels plots RSSI variation (dBm) against sample index for a single location and access point.


to a single access point. The top figure depicts the common case in our measurements: over 95% of the RSSI samples in the graph are within 5 dB of the reference level. The second figure shows a similar scenario, but with three large fluctuations (>30 dB). The bottom two figures present scenarios with more severe oscillation. In the third, most RSSI samples are more than 10 dB from the reference level, while in the last graph there are periods where the detected RSSI drops more than 30 dB. As shown, by using the 75th percentile of the RSSI distribution, our algorithm becomes more robust against such oscillations, from the mild to the severe.

Filtering also removes from the position estimation stage the RSSI values reported by access points that have failed to collect enough samples for the client. This situation is usually created by clients at the border of a cell, producing low signal strength values and high packet loss rates.
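A minimal sketch of this stage follows, assuming per-AP sample lists as input. The percentile and the minimum-sample cutoff (min_samples, a name of ours) are the only tunables; this is illustrative rather than the exact implementation.

import math

def percentile(samples, p):
    # Nearest-rank percentile of a list of RSSI samples (in dBm).
    ordered = sorted(samples)
    rank = max(0, math.ceil(p / 100.0 * len(ordered)) - 1)
    return ordered[rank]

def filter_rssi(samples_by_ap, min_samples=10, p=75):
    # Drop APs that collected too few samples; reduce each remaining
    # sample set to a high percentile to damp small-scale fading.
    return {ap: percentile(s, p)
            for ap, s in samples_by_ap.items()
            if len(s) >= min_samples}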

4.6 Position Estimation

The system uses the values produced by the filtering stage to estimate, for each wireless client, a position (x_T, y_T) and a power level at the reference distance (Pr_0). The input consists of a set of tuples of the form (x_i, y_i, Pr_i), i.e. each access point's location and filtered RSSI for the client. Throughout this chapter we will refer to the tuple (x_T, y_T, Pr_0) simply as the client's location, even though we also estimate its power level at 1 meter (Pr_0). Our system actually returns three-dimensional positions, but as we evaluate it on a single floor, we omit the height for simplicity (z = 1.0 m always, approximately waist level). Localization in multi-floored buildings is discussed in Section 4.8.10.

For a given RSSI input, each position inside the service area is associated with an error value, a measure of how much the reported RSSI values deviate from the path loss model found during calibration. The estimated location of a wireless transmitter returned by the system is the value (x_T, y_T, Pr_0) found to provide the smallest error value. The search space S is defined as follows: the coordinates (x_T, y_T) are constrained by the geographical boundaries of the service area, while the transmission power returned (Pr_0) lies within a pre-determined interval [Pr_0^min, Pr_0^max], set according to the reference client. Therefore, the algorithm returns the solution s = (x_T, y_T, Pr_0) ∈ S such that error(s) ≤ error(s′) for all s′ ∈ S, with (x_T, y_T) ∈ SA and Pr_0 ∈ [Pr_0^min, Pr_0^max].


The solution s can be found by standard minimization methods. Our current implementation performs an iterative conjugate gradient minimization based on the Polak-Ribiere algorithm, commonly available in scientific libraries. Another possibility would be to divide each dimension of the search space into discrete intervals and perform an exhaustive search. For example, the SA can be partitioned into 0.5 m x 0.5 m cells and the Pr_0 interval divided accordingly, the solution s being the combination with the smallest error value. Our approach using the iterative gradient method provides better localization accuracy and lower computational overhead.

Error Function

We use an error function that compares predicted RSSI levels with the ones reported by the APs. The error value of a tentative solution s′ = (x_T, y_T, Pr_0) is given by:

error(s′) = Σ_i (Pr_i − Pr(d_i))² = Σ_i (Pr_i − [Pr_0 − 10·α·log(d_i)])²        (4.1)

where d_i is the Euclidean distance between (x_T, y_T) and the ith access point and Pr(d_i) is the expected signal strength given by the path loss model. The error is therefore the sum of the squares of the differences between estimated and filtered signal strength levels relative to all access points.

Algorithm 1 describes how to calculate the error associated with a candidate solution s′ and its number of degrees of freedom (described in detail in Section 4.7.2). As shown, the only APs that do not contribute error terms are those that were predicted to be out of range and that indeed did not report RSSI measurements. Each access point has a sensitivity level (denoted by sens), the minimum RSSI level for the radio to experience low frame loss for a given modulation. An AP is considered within range of s′ if the predicted RSSI is higher than its sensitivity. As the system assumes clients use omni-directional antennas, if an access point within range fails to report RSSI measurements, a reading (x_i, y_i, Pr_i = sens) is automatically generated. For example, if the predicted signal strength at one AP (with sensitivity of -90 dBm) relative to the current solution s′ is -70 dBm and no value is reported, a (−90 − (−70))² = 400 term is added to the total error value.


Algorithm 1 Calculating the error (error(s′)) and the number of degrees of freedom (ν(s′)) for a candidate solution s′.
 1: error(s′) ← 0
 2: ν(s′) ← 0
 3: for all i ∈ AP do
 4:   if (Pr(d_i) ≥ sens || defined(Pr_i)) then
 5:     if (defined(Pr_i)) then
 6:       diff_i ← abs(Pr_i − Pr(d_i))
 7:     else
 8:       diff_i ← abs(sens − Pr(d_i))
 9:     end if
10:     error(s′) ← error(s′) + (diff_i)²
11:     if (diff_i < k·σ) then
12:       ν(s′) ← ν(s′) + 1
13:     end if
14:   end if
15: end for
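For readers who prefer executable form, the following is a direct Python transcription of Algorithm 1, with None standing in for an unreported RSSI; the default values for sens, k, and σ are those used in the text.

def error_and_dof(predicted, reported, sens=-90.0, k=1.8, sigma=6.59):
    # predicted: model RSSI per AP for candidate s'; reported: filtered
    # RSSI per AP, or None if the AP heard nothing. Returns (error, nu).
    error, nu = 0.0, 0
    for pr_model, pr_meas in zip(predicted, reported):
        if pr_model >= sens or pr_meas is not None:
            diff = abs((pr_meas if pr_meas is not None else sens) - pr_model)
            error += diff ** 2
            if diff < k * sigma:
                nu += 1
    return error, nu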

It is important to differentiate between two errors associated with our localization system. The first is what we refer to as the localization error, the distance in meters or feet between a device's real location and the position estimated by our system. The second is the error value associated with a solution as given by Equation 4.1, a function of the difference in dBm between predicted and measured signal strength levels. It is possible for a solution to produce a small localization error and still be rejected by the system during the verification phase due to a high error value. (We have actually seen such cases in our measurements.) We use the terms localization error and error value to avoid confusion.
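The surrounding search can likewise be reproduced with an off-the-shelf optimizer; SciPy's 'CG' method implements a Polak-Ribière nonlinear conjugate gradient. The sketch below is a simplification: it uses the plain sum-of-squares of Equation 4.1 without Algorithm 1's handling of silent APs, and, since 'CG' is unconstrained, it leaves the clamping of the result to the SA and to [Pr_0^min, Pr_0^max] to the caller.

import numpy as np
from scipy.optimize import minimize

ALPHA = 3.65  # calibration value from Chapter 3

def make_error(aps, rssi):
    # aps: (n, 2) array of AP coordinates; rssi: length-n filtered RSSI (dBm).
    def error(s):
        x, y, pr0 = s
        d = np.maximum(np.hypot(aps[:, 0] - x, aps[:, 1] - y), 1.0)
        predicted = pr0 - 10 * ALPHA * np.log10(d)      # Equation 3.1
        return float(np.sum((rssi - predicted) ** 2))   # Equation 4.1
    return error

def estimate(aps, rssi, start=(20.0, 10.0, -20.0)):
    # 'CG' is SciPy's Polak-Ribiere conjugate gradient; the starting
    # point here is arbitrary and should lie inside the SA.
    result = minimize(make_error(aps, rssi), np.asarray(start), method='CG')
    return result.x  # (x_T, y_T, Pr_0)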

4.7 Verification

During verification, the system decides whether or not to accept the location estimate found during the second stage. The mechanism just described always finds the best solution s ∈ S, but provides no confidence regarding the value found. The input to verification consists of the filtered RSSI levels and the solution s found in the position estimation phase.

The assumption behind our verification algorithm is that the pattern generated by a client inside the service area can be distinguished from the pattern created by an external device. While it may be error-prone to try to distinguish a client inside a building from


another right outside the external wall, the difference between the two patterns increases as the distance between them increases. The patterns produced by a client within the SA and an intruder 30 meters from the external wall are considerably different. This happens because the signal attenuates faster in close proximity to the transmitter. Take for instance the path loss model given by Equation 3.1. The expected signal attenuation (in decibels) over the first 10 meters from the transmitter is comparable to the attenuation caused by the next 90 meters, since both spans cover one decade of distance. Therefore, clients within close range of access points (<10 m) produce patterns with a clear peak, followed by sharp signal strength degradation (>30 dB). Patterns generated by devices located outside the SA are flatter because of the increased distance to the access points. The conditions imposed by our verification phase, together with the increased number of APs, allow these two patterns to be distinguished with high probability. In our current specification, a position estimate is accepted if it satisfies the three conditions discussed next.

4.7.1 Signal Threshold (T)

Condition 1: Every transmitter must be heard by at least one access point above a signal level threshold T. I.e., the algorithm proceeds only if ∃ i | Pr_i ≥ T.

The first objective of this condition is to impose additional resource demands (amplification) on attackers. A client that is inside the service area is in close range of multiple access points, and thus in a better position to generate high signal strength values. A malicious transmitter outside the service area needs to boost its transmission power in order to cope with the higher attenuation caused by the extra separation and by external walls, trees, and other obstacles in its path. Therefore, the objective is to set T as high as possible while still allowing the targeted clients inside the SA to be accepted.

Second, the higher the threshold value, the easier it is for the system to differentiate between transmitters inside and outside the service area. Figure 4.3 shows a 50 m x 50 m service area and two signal patterns (more distinguishable in Figure 4.3(b)). The first pattern (the surface with a clear peak) represents the expected signal strength for a client located in the middle of the SA, at (25, 25), assuming a log-distance model with α = 3.0. The other pattern shows the expected RSSI for an attacker located outside the


SA at (80, 25), 30 meters from the external wall. The larger distance to the APs makes the attacker's pattern resemble a plateau, without the sharp drop detected for patterns created inside the SA.

Setting a low threshold increases the likelihood of false positives for two main reasons. First, it decreases the amplification demands on intruders. In Figure 4.3(a), the threshold is set too low, and the attacker is able to satisfy the threshold condition by using a transmission power 10 dB higher than the one used by the client. Second, the figure shows that the pattern created by an external transmitter is not easily distinguishable from the ones created inside the building, as most APs report similar RSSI values for both devices. Only the APs in close range of the legitimate client report RSSI measurements that are considerably different for the two devices. However, in this situation the error value associated with the solution might not be high enough to prevent acceptance if many access points are used. In this case, the attacker would be able to fool the system with a manageable amount of amplification.

When the threshold is set higher, the amplification demands on attackers are higher and the two patterns can be more easily distinguished. In Figure 4.3(b), T is set 20 dB higher than in Figure 4.3(a), so the external device needs a total of 30 dB of amplification, a considerable amount. Moreover, most access points now report RSSI values for the attacker that are much higher than expected from a device within the SA, resulting in a position estimate with a higher error value and therefore a higher chance of rejection. The higher the number of access points, the higher the system can set T, thus generating location estimates with higher degrees of confidence.

Obviously, the real signal patterns faced by the system are much noisier than shown in Figure 4.3, which plots only the expected RSSI levels (Pr(d)). As shown in Section 4.7.3, patterns are rejected through a statistical test, which takes into consideration noise in the RSSI samples and the error associated with the path loss model.

The threshold T is set to the highest possible value according to the path loss model found during calibration and the distribution of access points. For instance, if no location inside the SA is more than d_max meters from an access point, the threshold can be set to:

T = floor(Pr_0 − 10·α·log(d_max)) − ε        (4.2)

i.e., the expected signal strength at distance d_max decreased by an additional ε dB for safety.
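In code, with Pr_0 taken as the reference client's received power at 1 meter (the -25 dBm below is our assumption; it reproduces the -60 and -65 dBm thresholds reported in Section 4.8.1 for d_max = 9 m):

import math

def threshold(pr0, alpha, d_max, epsilon):
    # Equation 4.2: weakest expected in-SA RSSI, minus a safety margin.
    return math.floor(pr0 - 10 * alpha * math.log10(d_max)) - epsilon

print(threshold(-25.0, 3.65, 9.0, 0))  # -60
print(threshold(-25.0, 3.65, 9.0, 5))  # -65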


Figure 4.3: Pattern difference between a client at (25,25) and an attacker at (80,25), using α = 3.0. (a) Attacker gain: 10 dB. (b) Attacker gain: 30 dB. The client's pattern presents a clear peak, as it is located inside the service area. The attacker's pattern resembles a plateau, a result of the increased distance to the SA.


If all clients are assumed to be able to transmit at the maximum value used during position estimation (Pr_0^max), this value can be used in the computation above.

In environments where the planned ranges for access points differ considerably, a different threshold value can be used for each AP. In this case, the threshold T_i for a given AP is found by replacing d_max in Equation 4.2 with R_i, the targeted range for that AP. Extensive measurements in our environment, presented in Section 4.8, demonstrate that path loss in close range to access points tends to be lower than predicted by the model, allowing the use of small values for ε (0-5 dB). This trend can also be seen in other measurements presented in the literature, including the office buildings modeled by Seidel et al. [105]. It occurs because there are fewer obstacles (e.g. walls, doors) in close range, while the path loss model parameters are calculated using measurements across the whole floor, with transmitter-receiver separations of up to 40 meters.

4.7.2 Minimum Quorum (Q)

Each location estimate is associated with a number of degrees of freedom, denoted by ν. The minimum quorum condition simply specifies a lower limit for ν:

Condition 2: A location estimate s must be supported by at least Q access points: s is rejected if ν(s) < Q.

The minimum quorum condition further improves our system's ability to reject signal patterns generated by devices outside the service area. As seen in Figure 4.3, the threshold condition forces an attacker outside the SA to boost his output power, which creates RSSI values inside the building that are higher than expected. The more APs participate in localization, the higher the error values for patterns created by an intruder, increasing the confidence with which they are rejected by the system. It is much easier for clients inside the service area to satisfy the quorum requirement because they are closer to the APs and are expected to use omni-directional antennas.

As shown in Algorithm 1, only access points that report signal strength values within an acceptable range of the level predicted by the model cause the number of degrees of freedom to be incremented. Specifically, ν is incremented for an access point AP_i if diff_i < kσ, where σ is the standard deviation found during calibration and k is a configuration parameter (our current implementation uses k = 1.8). As suggested by Equation 3.1 and confirmed


by our experiments, the differences between predicted and measured signal strength can be approximated by zero-mean Gaussian variables. Therefore, using k = 1.8 means that, for clients inside the SA, over 92% of such differences lie within the accepted bounds. This rule for calculating ν helps penalize patterns that strongly disagree with the model, which is common for transmitters located outside the SA or using directional antennas. As we describe in the next section, solutions are rejected if their error values are higher than a threshold that is itself a function of ν. The smaller the value of ν associated with a candidate solution, the lower the allowed maximum error and the higher the chances of rejection. Note that the solution error is always incremented, even when the value of ν is not updated.

Specific values for Q are also found autonomously by the system using the path loss model, the locations and sensitivity levels of the access points, and the minimum power level expected from clients (as of Section 4.4). The minimum number of APs within range across all locations within the service area can be used as an upper bound for Q. (One may decrease this number slightly in order to decrease the probability of false negatives.) In heterogeneous environments, a different value of Q can be used for each location within the service area.
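The 92% figure follows directly from the Gaussian assumption:

from scipy.stats import norm

k = 1.8
# P(|diff| < k*sigma) when diff is zero-mean Gaussian with deviation sigma:
print(2 * norm.cdf(k) - 1)  # about 0.928, i.e. over 92%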

4.7.3 Confidence Test

The system rejects solutions with high error values by modeling the distribution of the error function (Equation 4.1) and performing a statistical test. Let E denote the cumulative distribution function of our error function, and let E_p be the critical value such that Prob(e > E_p) = 1 − p for a random error sample e. A solution s can be rejected with confidence level L if error(s) > E_L. For example, we can reject a solution with 95% confidence if error(s) > E_.95.

The error function defined in Section 4.6 can be approximated using the chi-square distribution. It is known that a variable following this distribution, denoted by χ², arises as the sum of the squares of independent standard normal variables [66]. If X_1, X_2, ..., X_ν are independent standard Gaussian variables, Σ_{i=1}^{ν} X_i² follows a chi-square distribution with ν degrees of freedom (d.f.), denoted by χ²_ν. As deviations from the model can be modeled as normal variables and our error function is the sum of the squares of such deviations, critical values from the chi-square distribution can be used as upper bounds in the confidence test.


Using χ²_ν(L) to denote the critical value of the chi-square distribution for a confidence level L and ν degrees of freedom, the confidence test can be written as follows:

Condition 3 (confidence test): A location estimate cannot have too high an error value. A solution s with ν degrees of freedom is rejected if error(s)/σ² > χ²_ν(L).

Note that the error of the estimated solution is divided by the variance (σ²), as the chi-square distribution assumes standard Gaussian variables (σ = 1).
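In executable form the test is a single percent-point-function lookup (a sketch; error, ν, and σ as defined above):

from scipy.stats import chi2

def confidence_test(error, nu, sigma, L=0.9):
    # Condition 3: accept only if the normalized error does not exceed
    # the chi-square critical value for confidence L and nu d.f.
    return (error / sigma ** 2) <= chi2.ppf(L, nu)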

4.7.4 Configuration Tuple

Our algorithm is defined by the tuple (T, Q, L), the exact values being found autonomously by the system but according to how aggressive a network administrator wants to be. There is a clear trade-off between false positives and false negatives, i.e. between security and management costs. A more aggressive configuration (e.g. higher values of T and Q) incurs higher costs for an attacker outside the service area, decreasing false-positive rates and improving security. However, more aggressive settings are also bound to cause higher rates of false negatives, i.e. users inside the SA who fail localization. A more conservative configuration decreases the costs associated with false negatives but provides smaller security improvements.

4.8 Evaluation

4.8.1 Configuration

We evaluated our system in our IEEE 802.11 network testbed, described in detail in Section 3.6.1. We use the values found during calibration for the configuration parameters that approximate signal strength attenuation within our SA: α = 3.65 and σ = 6.59 dB. Unless noted otherwise, these values are always used to establish the location of a transmitter.

We use two configuration tuples to evaluate the performance of our system, one aggressive and one more conservative. Given that all locations in our testbed are within 9 meters of an access point, the model found during calibration, and the transmission power level used by our PCMCIA card (15 dBm), Equation 4.2 yields thresholds of -60 and -65 dBm when using ε = 0 and ε = 5, respectively. As for the confidence level, we use 80% and 90%


Figure 4.4: Localization accuracy using all 12 access points. The first two curves show the error distribution resulting from the survey data set using the calibration and the survey models. The third curve presents the error distribution found through simulation.

respectively for the aggressive and conservative settings. For both configurations we set the minimum quorum requirement to 8 access points. Therefore, the aggressive and conservative configurations are respectively (T = -60, Q = 8, L = 0.8) and (T = -65, Q = 8, L = 0.9). As these are verification parameters, the accuracy of the system (the distance between real and estimated locations) is the same for both configurations. The different parameters only impact the rates of false positives and false negatives, i.e. whether or not a position estimate is accepted.

4.8.2 Localization Accuracy

Our first accuracy estimate was found by running the samples collected for each of the 135 locations in the test data set through our algorithm, using the model found during calibration (α = 3.65) and all 12 access points. As shown by the thicker solid line in Figure 4.4, the results are encouraging: the median localization error across all locations is 1.88 meters (6.2 ft), with a 90th percentile of 3.81 m. (The error is defined as the Euclidean distance between real and estimated positions.) The localization error is less than 3 meters (respectively 5 meters) for over 78% (96%) of the distribution, with no location having an error higher than 7.5 meters.

The second curve shown in Figure 4.4 demonstrates that the accuracy of our system would not improve substantially if we were to use the path loss model found for the survey


data set instead of the values found autonomously during calibration. With the survey model, and still using all 12 access points, the median and the 90th percentile of the error distribution are respectively 2.03 and 3.70 meters, similar to the performance of the autonomous configuration. We attribute the negligible difference to the fact that our algorithm estimates both physical coordinates and transmission power. So, the fact that the model used for localization (α = 3.65) predicts slightly higher RSSI levels compared to the survey data set (α = 4.05) gets compensated by the system, which channels some of the error into the estimated power and thus reduces the impact on localization accuracy.

We confirmed through simulation the performance of our system on the test data set. We simulated an environment with the exact same dimensions and placement of APs as our service area, again using all 12 access points. For each location on a grid with a 2-meter stride we generated 500 signal strength realizations and used them as input to the system. Each realization is an RSSI tuple according to Equation 3.1: for a given location, relative to each access point, we add to the predicted RSSI a random variation (in dB) drawn from a Gaussian distribution with zero mean and standard deviation of σ dB (this step is sketched in code following this passage). The samples are generated according to the survey model while localization still uses the calibration model. While simulation should not replace real deployment in the evaluation of a localization system, our results suggest that it produces reasonable accuracy estimates, allowing one to gauge performance with different AP densities and placement strategies.

Aggregating all realizations, simulating our SA yields a median localization error of 1.95 meters, with a 90th percentile of 4.4 m, as shown by the third curve in Figure 4.4. The graph shows that, for most of the error distribution, simulation provides a close lower bound on accuracy. However, the curves also show that simulation does not approximate well the tail of the distribution: while the maximum localization error generated by our data set was below 8 meters, simulation manages to generate realizations with errors of over 20 meters. We attribute this disagreement to the fact that simulation assumes the Gaussian samples to be independently distributed, generating realizations that may be unlikely to happen in practice. In real deployments, RSSI levels seem to present more correlation than assumed by Equation 3.1.

Our measurements demonstrate that localization error tends to be higher at the boundaries of the service area, where fewer access points participate in localization. Figure 4.5 shows the error at each sampled location using the calibration model and all 12 APs. Note that 4 of the 5 locations with error higher than 5 meters are close to the external wall.
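A sketch of the realization step described above, with AP coordinates and model parameters as inputs; the default parameter values are the survey model's, and the seeding is ours:

import numpy as np

rng = np.random.default_rng(0)

def realization(x, y, pr0, aps, alpha=4.05, sigma=5.64):
    # One simulated RSSI tuple per Equation 3.1: the predicted mean
    # level plus zero-mean Gaussian shadowing with deviation sigma (dB).
    d = np.maximum(np.hypot(aps[:, 0] - x, aps[:, 1] - y), 1.0)
    return pr0 - 10 * alpha * np.log10(d) + rng.normal(0.0, sigma, len(aps))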


Figure 4.5: This figure shows the localization error for each location in the survey data set (in the bands [0, 2) m, [2, 5) m, and [5, 8) m) and the locations that produce false negatives with the aggressive configuration.

Figure 4.6: For each location in the survey data set, this figure shows the real locations (arrow tails) and the corresponding positions estimated by the system (arrow heads).


Figure 4.7: Localization error as a function of the number of APs (12, 10, 8, and 6), using the survey data set and the calibration model.

Interestingly, the remaining location is the one with the highest error (7.48 m): a small cable room, filled with wires, computers, switches, routers, and other metallic structures that most likely perturbed signal propagation. This was the only location with an error higher than 6 meters. Also, note that the error tends to increase inside offices as the transmitter moves away from the center of the building. Of the 30 offices in which two locations were sampled, in 21 of them the error is higher closer to the external wall.

Most of the time, the location estimated by the system is within the same room as, or a room adjacent to, the real location. Figure 4.6 shows the position estimated by the system for each location in the survey data set (calibration model, 12 APs). Of the 100 locations sampled inside rooms (offices and labs), for 64 of them the estimated location is within the correct room, and for another 25 it is in an adjacent room. For 7 locations, the estimated position lies in a corridor, while for the remaining 4 locations it lies in a nonadjacent room.

As expected, localization accuracy improves when more access points are deployed within a service area. Figure 4.7 shows the localization error distributions resulting from the survey data set when we use 12, 10, 8, and 6 APs. With 10 APs (all but numbers 9 and 12), the median localization error and the 90th percentile increase respectively to 2.05 and 4.13 meters. With 8 APs (numbers 1-8), these values are respectively 2.44 and 4.84 meters. Finally, using only 6 access points (APs 1, 2, 4, 6, 7, 8), the values increase further, to 2.83 and 5.65 meters. We believe this is a graceful degradation in accuracy: going from 12 to


Figure 4.8: Localization error as a function of the standard deviation associated with the path loss model (simulation with 12 APs; σ = 5, 5.64, 7, and 10 dB).

6 APs, the increase in the median error and in the 90th percentile is smaller than 1 and 2 meters, respectively.

Finally, localization accuracy depends strongly on how well the path loss model calculated for a service area matches signal strength attenuation in the environment. The higher the standard deviation found, the less accurate the model is, and consequently the higher the localization errors. Figure 4.8 shows the error distributions found through simulation when the standard deviation is varied between 5 and 10 dB. We again generate RSSI samples according to the survey model (α = 4.05), use the calibration model for localization with all 12 APs, and simulate locations every 2 meters within our SA, with 500 RSSI realizations per location. When σ = 5.64 dB, the value found for our survey data set, the median error and the 90th percentile are respectively 1.95 and 4.4 meters. When the standard deviation is reduced to 5 dB, these numbers drop to 1.75 and 3.90 meters, respectively. When we increase the value of σ to 7 dB, performance decreases: the median error and the 90th percentile increase to 2.38 and 5.48 meters, respectively. With σ = 10 dB, these numbers increase further, to 3.31 and 8.11 meters. For higher values of σ, there is a weaker correlation between signal strength and the distance between transmitters and receivers, which negatively affects localization accuracy.


4.8.3 False Negatives

Using the aggressive configuration, (T = -60, Q = 8, L = 0.8), only three locations in our survey data set produced false negatives (labeled rejected), a rate of under 3%. These locations are shown circled in Figure 4.5. The leftmost location failed the threshold condition, while the other two (in corridors) failed the confidence test. As we expected, no location failed the quorum condition, because the PCMCIA card transmits omni-directionally, always reaching the minimum number of APs. The fact that our algorithm also estimates the transmission power used by a device helps reduce the rate of false negatives due to the confidence test. If most RSSI samples are higher (respectively lower) than predicted by the model, the system reacts by returning a higher (lower) power level, minimizing the solution error and therefore the chance of rejection.

There were no false negatives for the conservative configuration, (T = -65, Q = 8, L = 0.9). The location that failed the threshold condition in the aggressive configuration is now accepted; in fact, it produced a small localization error, only 1.80 meters. Similarly, the locations that failed the confidence test are also accepted with the conservative configuration because the maximum error value allowed is now higher (only the last 10%, the tail of the distribution, is rejected). These numbers show that a network administrator can successfully reduce the probability of false negatives, although doing so also increases the probability of attacks, as we describe later in this chapter.

The main challenge in estimating false-negative rates through simulation lies with the threshold condition. Take for instance a client located 9 meters from the closest access point (the maximum distance to an AP in our testbed). For this distance, the path loss model yields a predicted RSSI around -60 dBm, and this is exactly how the threshold value is calculated. However, a simulation step adds to this value a random sample from a zero-mean Gaussian distribution, which means that the highest RSSI for the client will be below -60 dBm during approximately half of the simulation rounds. Consequently, approximately half of the signal strength realizations will fail the threshold condition, producing false-negative rates that are overly pessimistic. That is very different from what happens in practice: looking at Figure 3.5, the majority of points with distances smaller than 9 meters produce filtered RSSI values above -60 dBm. For separations greater than 10 meters, the model seems more accurate at predicting a base RSSI level.

Turning off the threshold condition during verification, simulation produces low false-negative rates across the whole service area, which agrees with our survey data set. With


the aggressive configuration, no location generated false negatives in more than 5% of the signal strength realizations. We again simulated every location on a 2-meter grid with 500 realizations per location, generated RSSI samples according to the survey model, and ran the localization algorithm using the calibrated model.

4.8.4 Handling False Negatives

Network administrators can address false negatives caused by the threshold condition by simply installing additional access points. These cases happen at locations inside the service area where signal strength attenuation with respect to the closest access points is higher than predicted by the path loss model instance. For example, the single location in our measurements that failed the threshold condition did not fail the confidence test or the quorum condition. This is an example of a false negative that could be eliminated by adding another access point.

False negatives caused by the quorum condition and the confidence test cannot be eliminated as easily because they are the result of errors and deviations associated with multiple access points. However, some of these false negatives may not need to be addressed. This depends on the locations at which the false negatives are generated and whether users are expected to visit them on a regular basis. For example, a false negative at a location inside an office or a conference room has a higher impact than one caused in a corridor. For this reason, the other two false negatives in our testbed, both located in a corridor, may not be of concern.

4.8.5 Performance

Performance measurements show that our localization server has modest hardware requirements. We timed localization overhead during our simulations on a laptop running Linux with kernel version 2.6.11 and equipped with a 1.86 GHz Intel Pentium M processor and 1 GB of RAM. Given an RSSI tuple as input, the system takes an average of 1.4 milliseconds to establish a location within our service area. This would allow each of 200 clients to be pinpointed every 5 seconds while demanding less than 5% CPU utilization. While this overhead would increase for larger service areas and higher numbers of APs, a centralized server should still be able to handle a sizable network with modern processors.


4.8.6 Building Penetration Loss

In order to quantify the amplification needed by attackers located outside the service area and their probability of success at fooling the localization system, we need a path loss model that accounts for the attenuation suffered by a signal as it travels outside and into the building. With such a model we can calculate the probability of false positives as a function of the distance between the attacker and the SA.

While not as well-studied as the indoor wireless channel, several papers suggest that a model similar to Equation 3.1 can be used to predict signal penetration into buildings. In this case, the signal strength inside the SA relative to an external transmitter is given by:

Pr(d) = Pr_0 − 10·α_out·log(d) − WAF + X_σ        (4.3)

The first difference is the wall attenuation factor (WAF), which accounts for the penetration loss caused by external walls. Estimates for WAF vary according to frequency and construction materials; Rappaport reports average losses due to concrete walls ranging from 8 to 15 dB [91]. The second is a different path loss exponent: we use α_out to avoid confusion with the value used inside the SA. Measurements available in the literature suggest that values for α_out lie within [2, α], i.e., higher than free space and lower than the models found inside the respective buildings. For instance, Durgin et al. found values for α_out and WAF within the intervals (2.7, 3.6) and (7, 21) dB, respectively, when measuring propagation into three homes at 5.85 GHz, with distances varying between 30 and 210 meters [46].

We performed measurements outside our building to find the proper parameters for Equation 4.3. As the transmitter we used an 802.11 access point located 9 meters from the building's external wall. We sampled a total of 40 locations over 4 straight lines going out of the building at a 90 degree angle and spaced by approximately 2 meters. The measurements were performed in front of large windows (6 ft x 7 ft each), with separations varying between 2 and 22 meters (11 to 31 meters to the access point).

Our findings agree with previously published results and indicate that the modified log-distance model given by Equation 4.3 closely approximates signal strength levels outside the building. In each of the 40 locations, between 30 and 75 RSSI samples were collected, with the average signal strength values shown in Figure 4.9. The line shows the model instance found through linear regression: WAF = 4.8 dB and α_out = 3.32, with a standard deviation


Figure 4.9: Measurements outside the building: average signal strength (dBm) versus distance to the access point (m), with the wall position marked and the fitted model α_out = 3.32, WAF = 4.8 dB.

of 3.1 dB. Despite the lower attenuation rate compared to the measurements inside the building (3.32 vs. 4.05), the signal still degrades much faster than predicted by the free-space model (α = 2.0). As with our indoor measurements, the deviations from the mean can be closely approximated by Gaussian random variables. Overall, 70% of the RSSI samples are within 1 standard deviation of the mean, 95% within 2 deviations, and 100% within 3 standard deviations.

While other path loss models have been proposed to model propagation into buildings (mostly to reduce σ), our system relies on the fact that large-scale path loss is a function of the logarithm of the distance, which is common to all such models. First, this means that attackers need more amplification as they increase their distance to the service area. Second, and most importantly, it means that RSSI patterns tend to be flatter beyond 30 or 40 meters from the SA, i.e. without the sharp decay seen for clients in close range of APs. The bigger this difference, the larger the error value of the estimated position inside the SA and the higher the chances of failing the confidence test.

4.8.7 Minimum Amplification Demands

The objective of an intruder located outside the building is to generate an RSSI pattern that is accepted by the system without failing several times in succession, which could trigger alarms and reveal his efforts. He has two main ways of improving his chances. His first alternative is to get closer to the service area, making his RSSI pattern more


similar to those produced within the SA. In this case the chances of attack detection are higher, but he is able to use portable devices with the same capabilities as the reference client. His second alternative is to use more sophisticated equipment (such as amplifiers and directional antennas), which allows him to increase his distance to the SA. These attempts can still be detected, as such devices are harder to conceal than common PCMCIA cards.

In this section we quantify the amount of amplification needed outside the SA to satisfy the threshold condition. This is the first requirement for a successful attack, as failing it means automatic rejection during the verification phase. For the estimates presented in this section we assume that targeted clients are able to transmit at 20 dBm (100 mW), a level commonly available in PCMCIA cards. For our network setup, this means that the threshold T can be set as high as -55 dBm.

The additional amplification needed by an attacker relative to the reference client increases with distance and rapidly reaches considerable levels. For instance, assume an attacker located outside the SA in the direction of access point 11 (Figure 3.2), which is 2.7 meters from the building's external wall. If we assume that propagation into the building follows the model (α_out = 3.3, WAF = 5) found in our outdoor measurements and that T = -55 dBm, Equation 4.3 states that the attacker needs a minimum of 11, 20, and 24 dB of amplification relative to the reference client when located respectively 15, 30, and 40 meters from the wall. Figure 4.10 shows the amplification needed at multiple locations around our building; a location labeled 15 means that an attacker at that location needs to transmit at a level 15 dB higher than the reference client, i.e. at 35 dBm.

These numbers demonstrate that attackers with low amplification levels (e.g. using PCMCIA cards) have to be located in close proximity to the external walls (<10 m) to have any chance of acceptance. The PCMCIA card with the highest power available in the market transmits at 24.8 dBm (300 mW), 4.8 dB of amplification compared to our reference client. An intruder who opts for a portable transmitter has a limited range around the building, making physical security measures more effective. Moreover, these estimates show that amplification demands are quite high for an attacker located more than 20 meters from the building. At that distance, attackers need at least 15-20 dB of amplification to satisfy the threshold condition. As this kind of gain is possible only through amplifiers and external antennas, attackers risk exposure and are more vulnerable to cameras and other physical security measures. Note as well that such high output


Figure 4.10: Predicted minimum amplification level (in dB, relative to the reference client) needed to satisfy the threshold condition around the SA; values range from 0 next to the building to around 30 dB at 40 meters out.

power levels are higher than allowed by the FCC in the United States: at 2.4 GHz, omnidirectional transmitters cannot transmit at levels higher than 30 dBm, or 10 dB above our reference client.
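The gain figures above follow from Equation 4.3. In the sketch below, the reference client's power at 1 meter is assumed to be -20 dBm (consistent with the 20 dBm card behind T = -55 dBm); with that assumption the computation reproduces the 11, 20, and 24 dB figures quoted for access point 11.

import math

def required_gain(wall_dist, ap_setback=2.7, pr0_ref=-20.0,
                  alpha_out=3.3, waf=5.0, t=-55.0):
    # Minimum gain (dB) over the reference client so that the expected
    # RSSI at the AP (Equation 4.3, mean value) still reaches threshold T.
    d = wall_dist + ap_setback
    expected = pr0_ref - 10 * alpha_out * math.log10(d) - waf
    return t - expected

for dist in (15, 30, 40):
    print(dist, round(required_gain(dist)))  # 11, 20, 24 dB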

4.8.8 False Positives: Omnidirectional Antennas

In this section we assume that attackers have unbounded amplification capabilities, being always able to increase their output power so as to satisfy the threshold condition. We assume they employ omni-directional antennas and amplifiers; directional antennas are evaluated in Section 4.8.9. We show that even with unbounded power, undetected attacks can be mounted only close to the service area, within 20-30 meters.

Our analysis is based on F+, the percentage of simulation rounds in which the corresponding signal strength realization was accepted by the system. For each location, simulation proceeds as follows. During a simulation round, a random signal strength value is generated for each access point within range according to Equation 4.3, using α_out = 3.3, WAF = 5 dB, and σ = 5 dB, and input to the algorithm. We ran 5000 different rounds (realizations) for each location and calculated F+ as the percentage of rounds with estimated locations labeled accepted. The simulated intruder uses an antenna with a beamwidth of θ degrees; omni-directional antennas are simulated using θ = 360 degrees. Finally, we use both the aggressive and conservative configurations, as described in Section 4.8.1.


Unless noted otherwise, the power used by the intruder at a given location is the minimum necessary to make the expected signal strength at the closest access point within range equal to the threshold T, i.e., the minimum to satisfy the threshold condition. We denote by G_UN the additional (unnecessary) gain in dB used by the intruder, with the same value being used for all locations in each simulated scenario. When G_UN = 0 dB, the intruder uses the minimum power level. An intruder is likely to over-amplify his signal (G_UN > 0) for several reasons. First, most transmitters offer a limited set of power levels, which may force them to overshoot by a couple of dB (e.g. the Cisco 350 has 6 levels between 0 and 20 dBm [4]). Second, these are approximate figures: cards usually have an error of around 2 dBm, as sometimes reported in product brochures (e.g. SMC2536W-AG [6]). Finally, and most importantly, the intruder may not be able to precisely gauge the path loss between him and the closest access point, for instance if the AP is used exclusively as a sensor (no transmissions). In this case, using additional power is the only way to guarantee that the threshold condition is satisfied. If APs are not used only as sensors, the intruder could measure the RSSI from the AP and take the corresponding path loss as an estimate for the loss in the reverse direction. However, our calibration measurements suggest that this procedure is error-prone, given that even for two stationary devices the difference in path loss between the two directions can be as high as 16 dB.

As the intruder gets farther from the service area, values of F+ decrease rapidly. For example, Figure 4.11(a) shows the simulation results for the aggressive configuration when using G_UN = 5 dB. Note that at close distances (< 10 meters), F+ values are as high as 99, i.e. 99% of the simulation rounds generated false positives. This was expected, as the pattern generated at these locations is similar to the pattern seen at the border of the SA. However, at more than 30 meters from the building, less than 10% of the simulation rounds yielded false positives. At these distances, the patterns created by the attacker are flatter, being rejected with high probability. Such low rates generate multiple alarms, allowing attack detection. Our results also demonstrate that unnecessary amplification is harmful: the higher the output power used by the attacker, the smaller his chances of acceptance. When amplification is too high, the pattern produced differs considerably from what is expected from standard clients inside the building. For example, Figure 4.11(b) shows that the probability of false positives decreases substantially when the attacker uses G_UN = 10


[Figure 4.11: False-positive rates around the SA as a function of G_UN using the aggressive configuration. Panels: (a) θ = 360°, G_UN = 5 dB; (b) θ = 360°, G_UN = 10 dB; (c) θ = 360°, G_UN = 0 dB.]


[Figure 4.12: False-positive rates for G_UN = 5 dB using the conservative configuration.]

dB, i.e. 5 dB higher than in Figure 4.11(a). As shown, the probability of false positives is negligible at more than 15 meters from the SA. This happens because the system expects measured RSSI levels to be within k standard deviations of what is predicted by the path loss model (Section 4.7.2). As the system is looking for positions inside the SA, unnecessarily high power causes some APs to report RSSI levels more than kσ dB from the predicted value, which generates solutions with fewer degrees of freedom and lower error thresholds. As Figure 4.11(c) shows, if the intruder were able to find the minimum power level required, his chances would improve substantially.
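The pruning step responsible for this effect can be pictured as follows; the value of k and the exact residual handling are simplified relative to Section 4.7.2, so this is a sketch rather than the actual implementation.

    def aps_within_window(measured, predicted, k=2.0, sigma_db=5.0):
        # Keep only APs whose residual falls inside the k-sigma window;
        # readings pushed outside it by over-amplification are dropped,
        # leaving the solver fewer degrees of freedom.
        return {ap: rssi for ap, rssi in measured.items()
                if abs(rssi - predicted[ap]) <= k * sigma_db}

    # With k = 2 and sigma = 5 dB, ap1 (12 dB residual) is discarded.
    kept = aps_within_window({"ap1": -50.0, "ap2": -90.0},
                             {"ap1": -62.0, "ap2": -88.0})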


Our simulation results also demonstrate the usual trade-off between security and management costs, represented by the relationship between false negatives and false positives. When using a more conservative configuration so as to reduce the number of clients rejected within the service area, one automatically increases the probability of successful attacks outside the SA. Figure 4.12 shows that F+ values increase considerably when simulation is performed with the conservative configuration (compare to Figure 4.11(a)). Even with G_UN = 5 dB, false-positive rates are high even when the intruder is located more than 30 meters from the building.

Our system can be seen as an intrusion detection system, and like any other, each installation has to choose an acceptable rate of false negatives while configuring the system. Even our aggressive configuration is not too aggressive, as it produces few false negatives. Setting even higher values for the threshold or minimum quorum, however, would likely create enough rejected spots to force the network administrator to reconsider his decision. Another important factor to consider is where false negatives occur. For instance, an administrator could decide to use a higher threshold value in a setting where offices and meeting rooms are provided with good coverage at the expense of false negatives in other, less frequently used areas of the service area. We believe that in our setting, undetected attacks are likely only within 15-20 meters from the building. This follows from the fact that in our testbed we can safely use the aggressive configuration, making Figure 4.11(a) the most likely scenario for attackers with unbounded power capabilities and omni-directional antennas.

4.8.9 False Positives: Directional Antennas

In this section we evaluate our system against an attacker that employs directional antennas, devices that provide him with two main advantages. First, directional antennas have an associated gain, which may allow him to satisfy the threshold condition without the need for an amplifier. Second, as these antennas do not distribute power uniformly across all directions (that is where the gain comes from), they generate signal strength patterns that are considerably different from the one expected from an omni-directional transmitter (like our reference client). We will show that this has a negative impact on the rates of false positives. The main disadvantage of these antennas is their size: with high-gain (> 18 dBi) antennas commonly being larger than 20 inches in diameter and heavier than 4 pounds, they make attacks considerably harder to implement and easier to detect [9, 15].

For our simulation we use the directional antenna model proposed by Ramanathan [90]. We adopted this model to simplify our analysis, given that each real directional antenna distributes power tridimensionally in a different way, producing a distinct antenna pattern. For example, the 802.11-compatible HG2409Y, HG2418D, and HG2424G antennas produced by HyperLink Technologies create considerably different patterns, with gains ranging from 9 to 24 dBi [9, 11, 15]. Like any real antenna, the antenna pattern proposed by Ramanathan is characterized by a main lobe, the direction of maximum gain, and side lobes, less-favored directions that provide lower gains (less power is sent in these directions than would be by a perfectly omni-directional antenna). The main differences from a real antenna pattern are that in Ramanathan's model there are no nulls (directions with negligible radiation) and that power is distributed uniformly outside of the main lobe, i.e. it has a single side lobe with uniform gain.
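This pattern can be expressed in a few lines; the gain values in the example come from Table 4.2 below, while the angle bookkeeping is our own.

    def antenna_gain_dbi(bearing_deg, orientation_deg, beamwidth_deg,
                         main_gain_dbi, side_gain_dbi):
        # Ramanathan's pattern [90]: uniform gain inside the main lobe,
        # a single uniform side lobe (no nulls) everywhere else.
        off = abs((bearing_deg - orientation_deg + 180.0) % 360.0 - 180.0)
        return main_gain_dbi if off <= beamwidth_deg / 2.0 else side_gain_dbi

    # Antenna (2) from Table 4.2: 40-degree beamwidth, 14 dBi main lobe.
    print(antenna_gain_dbi(10.0, 0.0, 40.0, 14.0, -7.6))   # main lobe: 14.0
    print(antenna_gain_dbi(90.0, 0.0, 40.0, 14.0, -7.6))   # side lobe: -7.6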


Antenna    Beamwidth (θ)    Main lobe gain    Side lobe gain
(1)        60°              10 dBi            -7.4 dBi
(2)        40°              14 dBi            -7.6 dBi
(3)        20°              20 dBi            -6.5 dBi
(4)        10°              26 dBi            -4.0 dBi

Table 4.2: Directional antenna patterns used during simulation.

We simulated 4 directional antennas proposed by Ramanathan [90], with beamwidth values ranging from 10° to 60° and maximum gains between 10 and 26 dBi (shown in Table 4.2). As with real antennas, higher gains are achieved by making antennas more directional (lower values of θ). The simulation procedure is similar to the one described for omni-directional antennas. The main difference is that for each location, multiple orientations were simulated, each with 5000 simulation rounds. For each location and beamwidth value, we tested all orientations in increments of 10° and we report the combination with the highest value of F+, i.e. the best combination for the attacker.

Figure 4.13 shows that false-positive rates increase when an attacker uses a directional antenna. The figure shows the values of F+ for the 14-dBi antenna against the aggressive configuration and with G_UN = 5 dB. With the same placement of access points, an attacker with the 14-dBi antenna can fool the system with high probability even when located 30 meters from the building. Whereas in Figure 4.11(a) (omni-directional antenna) low values of F+ are seen across all locations more than 30 meters from the service area, in Figure 4.13 many locations at that range had over 80% of the simulation rounds accepted. These higher false-positive rates are possible because an attacker with a directional antenna is able to induce considerable signal strength degradation inside the service area within a short distance, a behavior similar to the one expected from standard clients within the building. Take for instance the location in Figure 4.13 labeled with 86 and with its main lobe represented by two line segments. With the orientation shown, the attacker is able to cover half of the access points with the main lobe, with the other 6 being covered by the side lobe. With the 14-dBi antenna, there is a difference of over 21 dB between the gains of the main and side lobes. Consequently, the RSSI levels detected by the 6 APs covered by the main lobe are much higher than the values reported by the other access points, a pattern that is similar to the one created by a standard device located on the west side of the building. As a result, for most of the simulation rounds the system is able to find a


[Figure 4.13: False-positive rates for a directional antenna with θ = 40° and 14 dBi of gain, against the aggressive configuration and with G_UN = 5 dB. One location, labeled 86, is shown with two line segments delimiting the attacker's main lobe.]

solution in that area with an acceptable error value. Figure 4.14 shows that this approach is also successful with the other directional antennas simulated.

While such attacks are possible, they are quite hard to implement successfully without raising suspicion. First, the attacker needs to amplify his signal to meet the threshold condition, the same problem faced in the omni-directional scenario. Additionally, he has to point the antenna in the right direction, which he may be unable to find if the APs are configured solely as sensors. If the attacker covers more access points with the main lobe, he will increase the number of APs reporting high RSSI levels and consequently increase the error associated with the estimated location, which decreases his chances of success. If in the process of finding the right orientation the attacker is rejected a couple of times, the system has considerable evidence that an attack is taking place. These numbers do demonstrate, however, that additional mechanisms or resources are needed to provide the same level of security against attackers with directional antennas and the required amplification capabilities. In this dissertation we present two approaches that allow an installation to improve security against directional antennas with a manageable increase in overall deployment costs.

The first approach consists of deploying additional access points around the service area in a demilitarized zone (DMZ). These additional sensors are deployed with the objective of identifying transmissions originating at locations beyond the service area. They could


[Figure 4.14: False-positive rates for the other simulated directional antennas. Panels: (a) θ = 60°, 10 dBi; (b) θ = 20°, 20 dBi; (c) θ = 10°, 26 dBi; all against the aggressive configuration and with G_UN = 5 dB.]


even be installed close to external walls and with directional antennas pointing away from the SA so as not to affect localization within the building. As the signal gets substantially attenuated as it travels through walls and external obstacles, in the case of an external attack the RSSI levels detected by sensors in the DMZ should be much higher than those reported by sensors placed within the service area. With enough external sensors, a simple additional condition added to the verification phase would be sufficient to detect these attacks. In the case of IEEE 802.11 networks, the falling prices of access points allow such a security improvement to be realized at low extra cost.

The second alternative is to prevent attackers from using amplifiers to obtain the gain they need when located outside the service area. For example, Figure 4.14(a) shows that an attacker may successfully fool the localization system with a 10-dBi antenna coupled with an amplifier. In the next chapter we present in detail a mechanism that reduces the effectiveness of amplifiers by exploiting characteristics of wireless receivers. Consequently, attackers outside the SA have to rely mostly on their antennas to provide the amount of gain they need. Security is further improved because high-gain antennas are fairly large devices, making attack attempts not only harder to implement but also easier to detect.
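Returning to the first approach, the additional condition in the verification phase could be as simple as the comparison below; margin_db is an assumed tunable that an installation would derive from the attenuation measured through its external walls.

    def external_attack_suspected(dmz_rssi, inside_rssi, margin_db=10.0):
        # Flag the round if some DMZ sensor hears the transmitter much
        # louder than every sensor inside the service area.
        if not dmz_rssi or not inside_rssi:
            return False
        return max(dmz_rssi) >= max(inside_rssi) + margin_db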

4.8.10 Localization in Multi-Story Buildings

Despite our use of a single-floor building for evaluation, our system can be used in installations with multiple floors. This scenario affects the localization phase only, having no significant effect on the conditions imposed by the verification phase. Even though our algorithm already performs a tridimensional search, we believe our system would achieve better accuracy in multi-floor buildings if the search were performed on a per-floor basis. Propagation through floors causes considerable attenuation (10-15 dB [91]), so the system should have no problem detecting the floor on which a mobile device is located given the RSSI patterns produced on each one. The system can search all floors, starting with the one containing the access point that reported the highest RSSI level, and return the location found with the smallest error across all floors.

One alternative would be to perform a full tridimensional search, simultaneously using all access points, across all floors. One problem with this approach is that our system would have to take into account signal attenuation across floors, directly modeling the loss caused by each one. For example, it could use the partition model proposed by Seidel et al. [105], discussed in Section 3.7. However, there is empirical evidence in the literature that suggests


an increase in standard deviation (loss of prediction accuracy) when one increases the area covered by a single model instance. Using a single model for a multi-floor building could suffer from this very problem, as heterogeneity between floors would not be taken into account because a single value of σ would be used for the whole building.
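The per-floor strategy suggested above can be outlined as follows; search_floor stands for the existing single-floor solver and is assumed to return a (location, error) pair.

    def locate_multistory(floors, rssi_by_ap, search_floor):
        # Start at the floor holding the AP with the strongest reading, run
        # the single-floor search on every floor, and keep the solution with
        # the smallest fitting error.
        strongest_ap = max(rssi_by_ap, key=rssi_by_ap.get)
        ordered = sorted(floors, key=lambda f: strongest_ap not in f.ap_ids)
        solutions = [search_floor(f, rssi_by_ap) for f in ordered]
        return min(solutions, key=lambda sol: sol[1])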

4.9 Limitations

Our system will provide lower accuracy and weaker security guarantees in environments that produce path loss model instances with high standard deviation values. For instance, our simulation results demonstrated the impact of standard deviation values on accuracy: when σ increases from 5.64 to 10 dB, the median error and the 90th percentile of the error distribution in our testbed increase from 1.95 and 4.4 meters to 3.31 and 8.11 meters, respectively. Higher standard deviation values also cause higher rates of false positives. We showed in Section 4.8.8 that with the conservative configuration, false-positive rates are high even when the attacker is located more than 30 meters from the SA. A similar increase in false positives would result from a path loss model with a higher value of σ, as the maximum error allowed for a solution would be higher for a given confidence level. It is unclear, however, what physical properties an SA would need to produce such inaccurate path loss model instances. For example, the literature presents strong support for the log-distance model, as high standard deviation values are usually a result of modeling attenuation across large areas or in multi-floor buildings with a single model instance [91, 105]. Moreover, our system uses several mechanisms to improve signal strength prediction. First, we showed in Chapter 3 that proper filtering reduces standard deviation values compared to the use of average RSSI levels, which is used in many earlier works. Second, our system can use multiple model instances precisely to account for distinct properties of different floors or even of different sections of the same floor. As discussed in Section 7.3, a thorough evaluation of signal attenuation modeling in different service areas is left as future work.

Even though our mechanism increases amplification demands at all locations outside a service area, successful attacks with modest hardware requirements are still possible near the boundaries of the SA. For example, our results demonstrate that an attacker located within 15 meters of the SA may not succeed using a standard PCMCIA card, but may generate a valid location using a portable setup including an amplifier and an external antenna. To maximize the security guarantees provided by our mechanism, installations


need to protect between 20 and 30 meters around their service areas, for example by deploying fenced demilitarized zones or security cameras. If that is not the case, networks may be vulnerable to attacks from parking lots and other areas within close range of the deployed access points. Whether this extra level of protection is necessary or not will depend on the attack model adopted by each installation, which includes the range of devices they want to protect against. We believe this is not a characteristic of our system per se, but of any mechanism that exploits the relationship between signal attenuation and distance, because of the inherent variations and errors associated with signal strength measurements. A similar scenario occurs when service areas with different administrative domains occupy adjacent floors in a multi-story building. For instance, an attacker on the second floor of a building may be able to access the wireless network of an enterprise located on the first floor if provided with enough amplification to account for the floor attenuation. We cannot quantify the exact amount of amplification needed using our measurements, but it will vary from one installation to another and is likely to be between 10 and 20 dB [91].

4.10 Summary and Conclusion

In this chapter we presented the design and evaluation of a localization system based on signal strength measurements and showed how it can be used to improve security in wireless LANs in a cost-effective manner. We showed that the system improves accountability, as it is able to accurately monitor the position of all devices located within the service area. Using measurements collected at 135 locations in an IEEE 802.11 installation with 12 access points, we demonstrated that clients inside the building are located with a median error of 1.88 meters and a 90th percentile of 3.81 m. As the system estimates not only location but also the transmission power used by devices, it is prepared to handle clients using dissimilar wireless cards or power levels. We also demonstrated that our system can be used to implement location-based access control, improving security by leveraging physical security measures and imposing considerable hardware demands on devices located outside the SA. We showed that clients inside the SA are rejected with low probability: even with an aggressive configuration, only 3 locations out of the 135 sampled were rejected. With a conservative configuration, there were no false negatives. Moreover, we used measurements performed outside our building


to show that hardware demands for attackers rapidly reach impractical levels. First, we showed that power-limited devices (e.g. PCMCIA cards) have to be located within 10 meters of the SA to have any chance of acceptance, making physical security measures more effective. Second, we showed that beyond 30-40 meters from our building, attackers need over 20 dB of amplification. At these distances, devices with omni-directional antennas are rejected with high probability even if provided with unbounded amplification, because the patterns produced differ considerably from what is predicted by the path loss model. We showed that it is hard for attackers to find the exact power level to use and that unnecessary amplification is harmful, which further reduces the practical range of attacks. Third, we demonstrated that false-positive rates increase when attackers use high-gain directional antennas and described alternatives to improve the performance of the system in these scenarios. Finally, we showed that our system is able to configure itself with minimal operator assistance, therefore improving security without a corresponding increase in management costs. All the results presented in this chapter were obtained using the path loss model found autonomously by the system during calibration. Overall, our system takes advantage of the increasing number of access points to provide a service that is both robust against attacks and scalable, making it suitable for security applications even in environments where APs are incrementally deployed.

Chapter 5

Range-Based Authentication
In this chapter we present a protocol that checks whether a wireless receiver is within a specified range of an access point. Compared to our localization system, we show that this mechanism makes it even harder for attackers to access the network when located outside the service area.

5.1 Motivation

One inherent characteristic of using a localization system for access control purposes is that clients seeking authentication act as transmitters, for which amplification is an easier problem. Our results in the previous chapter showed that an attacker needs more than 15 dB and 20 dB of amplification to authenticate successfully when located more than 20 and 30 meters from our building, respectively. When acting as a transmitter, an attacker that far from the service area can obtain the gain necessary to generate a false positive through a combination of a compact directional antenna and an amplifier. This setup allows him to better disguise an attack, decreasing the effectiveness of surveillance cameras and other physical security measures. In this chapter we present a mechanism that increases the hardware demands on an attacker and improves the chances of attack detection by forcing clients to act as receivers. As amplifiers are not as useful in this scenario, an attacker has to gather most of the amplification needed to mount an attack by using antennas with gain. The problem with this approach is that high-gain antennas (those providing over 15 dBi of gain) are large devices, much harder to conceal than standard PCMCIA cards and some low-gain antennas.


For example, 802.11-compatible 2.4 GHz antennas providing between 18 and 24 dBi of gain usually measure between 15 and 40 inches (largest dimension) and weigh between 3 and 8 pounds [9, 10, 14, 15]. Consequently, attackers are physically exposed and such attempts can be more easily detected.

We describe the design and evaluation of a range-based authentication protocol that imposes no modification to off-the-shelf wireless equipment and relies solely on well-accepted properties of the wireless channel. Each access point within the service area is configured with a maximum range, a value chosen so that each location in the SA is covered by at least one AP. During an authentication handshake, an AP transmits a stream of unpredictable random numbers. A client authenticates itself by sending back a cryptographic proof that it correctly received the random sequence, which in turn proves it to be within the targeted range of that access point. We show that our protocol can control the range of each AP through transmission power control and CRC omission, while using a standard Diffie-Hellman key exchange to generate session keys to protect wireless traffic.

We present measurements in our network testbed that demonstrate the effectiveness of range-based authentication. Using indoor measurements, we show that clients within the targeted coverage area and provisioned with off-the-shelf hardware experience high-quality channels and are able to complete the handshake successfully, with a low probability of two consecutive failures. Outside the building, attackers need extra antenna gain to cope with the higher signal attenuation and the corresponding increase in bit error rates. We show that such demands increase rapidly with distance and that for an intruder to authenticate successfully from a safe location (> 30 meters from the SA) he or she needs at least 20 dBi of antenna gain. Finally, we show that management overhead is minimal because there are no additional costs for supporting new users and because the system leverages the results of system calibration. As a result, our protocol considerably increases the hardware demands for successful attacks and improves the effectiveness of physical security measures without increasing management costs.

This chapter is organized as follows. In sections 5.2 and 5.3 we present our attack model and an overview of our protocol. Section 5.4 presents the properties of signal propagation that our protocol relies on, while sections 5.5-5.11 present the protocol in detail. In Section 5.12 we demonstrate the effectiveness of range-based authentication and show that it is built upon sound properties. In Section 5.13 we present the limitations of our mechanism, while Section 5.14 concludes this chapter.


Figure 5.1: Example of a range configuration that covers most of our service area with 8 access points.

5.2 Attack Model

The attack model used to evaluate our range-based authentication protocol is similar to the one used in the previous chapter for our localization system. First, we assume that unwanted users are not allowed inside the service area. Second, we assume that attackers are not limited to using devices similar in resources to our reference client. We again evaluate the performance of our protocol against a wide range of attackers, including those provided with high-gain directional antennas.

5.3 Overview

Given the set of access points to be used for authentication purposes, the system finds the range to be used by each one. Each access point AP_i is associated with a maximum range R_i so that all APs used for authentication cover the whole service area, with a possibly different range value being used for each access point. For example, Figure 5.1 shows one configuration that covers our service area using 8 of the 12 IEEE 802.11 access points deployed. A client authenticates successfully if it can prove that it is located at a distance d_i ≤ R_i from some access point AP_i in the SA.

A client seeking authentication needs to successfully complete an authentication round. During a round, the WA (or server) selects one access point to act as transmitter and broadcasts several messages containing a sequence of random numbers (nonces), which are


created fresh and are unpredictable to clients. To be granted access to the network, a client only needs to prove to the server that it correctly received all the nonces by sending a secure hash of the data. Each round is executed over a single access point, aiming to cover exactly the range of that AP. A transmission power control algorithm, described in Section 5.7, is used to map the desired range to the minimum amount of output power needed by the access point. All the state needed to approximate signal attenuation inside the service area is established during system calibration. As the distance between client and AP increases, the quality of the wireless channel (represented by the signal-to-noise ratio, or SNR) decreases due to the increased signal attenuation caused by the environment. Clients within the targeted range experience high SNR levels and low bit error rates, successfully receiving all the nonces and completing the handshake. The lower the transmission power, the lower the SNR outside the building, and therefore the lower the chances of unauthorized access.

Three other techniques are employed to limit authentication range. First, nonce messages are transmitted by access points without checksums or CRCs, therefore providing no additional information about the payload and preventing attacks based on local exhaustive search. Second, nonce messages are not retransmitted, in order to prevent an undesired increase in authentication range. Finally, a server can optionally use other access points to transmit bogus packets during a handshake in order to decrease the chances of authentication for clients located outside the service area. This mechanism, called selective jamming, is described in Section 5.9.

Merged with the nonce and proof messages is a standard Diffie-Hellman (DH) key exchange that provides key material used to establish a secure session for each client. Several session keys are generated: one is used by the client to securely transmit its nonce reception proof back to the server, while two other keys are used to protect (authenticate and encrypt) the packets sent during that session in order to prevent eavesdropping or traffic injection by unauthorized devices. These keys are established by our protocol, but are then passed to a lower-level mechanism that is actually responsible for protecting wireless traffic (such as TKIP and CCMP as defined by the current IEEE 802.11 security standard [7]). For completeness, secure and well-accepted mechanisms to encrypt and authenticate traffic are discussed briefly in Section 5.11.

A client does not need to complete another authentication round until an active session is about to expire. For example, if an installation uses hour-long sessions, a client could


start its reauthentication process 5 minutes before the expiration of its session in order to prevent periods of disconnection.

5.4 Principle of Operation

Our proposed protocol relies solely on the following two properties of the wireless channel:

P1: Signal strength decreases with distance. Average signal strength (in Watts) decreases as a power-law function of distance, the exact rates depending strongly on the characteristics of each environment. The experiments presented in Chapter 3 showed that in practice, signal attenuates much faster than predicted by the free-space model, which predicts degradation as a function of the square of the distance between transmitter and receiver.

P2: Communication quality is a function of perceived SNR, with performance degrading sharply in the vicinity of a threshold SNR level. For a given modulation, performance degrades substantially over a narrow range of SNR, and one can usually identify a threshold level below which the channel becomes unreliable. Therefore, a channel providing a negligible bit error rate (BER) by operating close to this threshold level can become unusable with a drop in signal strength of 5-10 dB. This rapid performance degradation is the main reason for having multi-rate wireless systems and has been demonstrated through simulation in the literature [25, 43, 44] and confirmed by our measurements. For instance, when employing an 802.11a link at 54 Mbps, we show that packet corruption stays below 1.5% in measurements with 30 dB or more of SNR, while increasing to over 90% when facing SNR levels below 20 dB. Note that these performance numbers already include the benefits of the error correction techniques defined in the IEEE 802.11a standard [2].

While threshold values will vary from one data rate to another, the same behavior is expected for all data rates. The 802.11a standard, for example, provides 8 different transmission modes, employing 4 different modulation algorithms (BPSK, QPSK, 16-QAM, and 64-QAM) and coding schemes to provide data rates between 6 and 54 Mbps [2]. Higher data rates are achieved with more complex modulations, which code more bits per

OFDM symbol and demand higher SNR levels. For instance, simulation results presented by Awoniyi et al. show that similar error rates are achieved by transmitting at 6 Mbps with an SNR of 10 dB and at 54 Mbps with a much higher SNR of 30 dB [25].

Term         Meaning
I            number of iterations
N_i          random nonce sent during the i-th iteration
S            size of each nonce (bytes)
ID           round identifier
T            minimum delay between rounds
t            maximum challenge response time
H(M)         one-way hash of message M
E_k(M)       encryption of message M with key k
(a||b)       concatenation of a and b
MAC_C        client's MAC address
MAC_AP       AP's MAC address
g, p, x, y   Diffie-Hellman parameters
g^x          short for g^x mod p (Diffie-Hellman)

Table 5.1: Notation used to describe a range-based authentication handshake.

5.5 Authentication Handshake

For the description that follows, we employ the notation shown in Table 5.1. The protocol handshake makes use of three well-known building blocks: a secure one-way hash function H expected to provide preimage resistance ([83], page 235), a block cipher E, used by clients and access points to encrypt reception proofs, and a Diffie-Hellman key exchange, used to establish fresh session keys with forward secrecy [83]. Access points can use either broadcast or unicast frames to transmit nonce messages during an authentication round, each method having its own advantages and disadvantages depending on the underlying network used. We assume nonces are broadcast when we describe the authentication handshake and discuss the impact of using unicast frames in Section 5.6.

Access Point Selection


The client initiates a protocol round by sending a broadcast request message, which contains no vital information and is used for the sole purpose of notifying the server about its intent. (The protocol handshake is shown in Figure 5.2.)


Client -> AP/WA:   request
AP/WA -> Client:   ID, I, 1, N_1
AP/WA -> Client:   ID, I, 2, N_2
                   ...
AP/WA -> Client:   ID, I, I, N_I, g^y
Client -> AP/WA:   MAC_C, ID, g^x, E_k(H(N_1||...||N_I||MAC_C||g^x||g^y))
AP/WA -> Client:   MAC_AP, ID, E_k(H(N_1||...||N_I||MAC_AP||g^y||g^x))

Figure 5.2: Detailed description of an authentication handshake.

The best AP from the standpoint of a given client is the one that provides it with the highest signal strength level. The WA has two main ways to discover the best pairings. The first alternative is to look at link-level associations. For example, most 802.11 clients already choose the AP they are able to detect with the highest RSSI. The second choice is for the WA to estimate the RSSI level at a client relative to each access point based on measurements performed in the reverse direction. For a packet transmitted by the client, the access point reporting the highest RSSI is most likely the one providing the best quality in the reverse channel, assuming all APs use the same power level.

If the WA is using broadcast packets to send nonce messages, it can control the rate at which authentication rounds occur. For example, it can enforce a minimum delay of T seconds between rounds, either globally or on a per-AP basis. Multiple devices within range of the access point can authenticate during the same round, so well-behaved clients are not strongly affected by this procedure. On the other hand, attackers are prevented from increasing the frequency of rounds so as to increase their chances of authentication when located beyond the targeted range. With an access point selected, the server schedules the next round and chooses the number of iterations (I) and the total amount of random data to send. This data is divided among the I iterations, with S random bytes sent during each one. A round identifier (ID) is created and used to identify the messages pertaining to the same handshake.


Nonce Messages
At the scheduled time, the server proceeds with the handshake by performing the first iteration (second message in Figure 5.2). It sends a message containing the round identifier ID, the total number of iterations I, as well as the current iteration number and the first random nonce (N_1). When sending this frame, the AP skips the computation of link-layer checksums or CRCs, in order not to reveal extra information about the payload. The server continues with the remaining iterations, changing only the iteration counter and the nonce at each step. Even though an authentication round may have been triggered by a single client's request, all clients that are able to correctly receive the broadcast messages can proceed with the handshake.

The server prevents link-layer retransmissions when sending nonce messages in order not to improve the authentication chances of devices beyond the targeted range. In 802.11 networks, this is automatically achieved when nonce messages are sent through broadcast packets, given that these are neither acknowledged nor retransmitted ([1], page 83). The access point also has to minimize the probability of frame collisions while transmitting nonce messages. Many collisions could render authentication impractical even for clients in close range due to the hidden-station problem. In an 802.11 network, the access point can reduce the chance of collisions using a contention-free period, part of the Point Coordination Function (PCF). Note that the RTS/CTS mechanism cannot be used to reserve the channel for broadcast frames ([1], p.83).

When sending the nonce message pertaining to the last iteration, the WA sends its Diffie-Hellman public key (g^y mod p). A fresh random secret y is chosen for this round and discarded after it takes place. This same exponent is used by the server with all the clients authenticating during the current round. This concept of reusing DH exponents was introduced by the JFK protocol [21]; it increases performance while still generating independent session keys, a result of the distinct x values used by different clients. After receiving the first message, clients wait for all the nonces to be transmitted. Note that a client that joins the network (or that channel) in the middle of a round can immediately detect the number of iterations lost by examining the iteration counter. It may then wait for the next round, explicitly demonstrate its intent to authenticate by sending its own request message, or simply go back to normal operation (in the case of a client with a session that is still active). Detecting the loss of iterations is important for a client that scans multiple channels looking for connectivity or in the case where authentication and


session data are transmitted over different physical channels. In both cases, clients are able to reduce their scanning delay.

Authentication and Session Establishment


Clients turn off CRC computation when receiving nonce frames (a simple driver modification), and have to decide whether or not to complete the handshake after all nonces are received. As the AP does not compute checksums, a client can never be sure whether all nonces were received correctly. However, a client can guess with a high probability of success by looking at the SNR levels detected when receiving the nonce frames. As shown in Section 5.12, if the SNR level perceived by the client is above a value SNR* (determined experimentally for each data rate), bit error rates are negligible. In this scenario, a client has high assurance that all bits were received correctly, and can thus proceed with the handshake. Accordingly, well-behaved clients abort rounds with messages received at low SNR levels.

If it decides to complete the handshake, the client chooses a fresh random number x and computes the corresponding Diffie-Hellman shared secret (g^xy mod p). It then computes a proof that it received the random data by hashing the concatenation of all transmitted nonces, its MAC address, and both DH public keys. The resulting hash is encrypted using the block cipher with a transient key k generated from the DH shared secret. The encrypted proof is then sent to the server in a unicast message, along with MAC_C, ID, and g^x. The proof generated by the client is encrypted in order not to reveal additional information to other clients regarding the random numbers. Let M denote the concatenation of all the nonces. A client in close range receives M without bit errors, being able to complete the handshake. An adversary outside the SA may be able to receive partial information. For example, it may gather a value M' which differs from M in b bits. If the hash over M||MAC_C||g^x||g^y were to be transmitted unencrypted, an attacker could perform a local search starting from M'||MAC_C||g^x||g^y and stop when a hash value match was found. In the scenario where clients are allowed to share a single round, an attacker could use this hash value to forge a response, simply replacing the MAC address and the DH public key with his own values. Note that given that the hash function is preimage resistant, an attacker is not able to efficiently compute M given the transmitted hash value.
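The client-side computation is summarized below. The dissertation encrypts the hash with the block cipher E under the transient key k; to keep the snippet runnable with the standard library alone, a keyed HMAC stands in for that encryption, which is an assumption rather than the actual construction.

    import hashlib, hmac, os

    def client_proof(nonces, mac_c, gx, gy, k):
        # H(N1 || ... || NI || MAC_C || g^x || g^y), then keyed under the
        # transient key k derived from the Diffie-Hellman shared secret.
        digest = hashlib.sha1(b"".join(nonces) + mac_c + gx + gy).digest()
        return hmac.new(k, digest, hashlib.sha256).digest()

    nonces = [os.urandom(2000) for _ in range(20)]      # I = 20, S = 2000
    proof = client_proof(nonces, b"\x00\x11\x22\x33\x44\x55",
                         b"gx-public-value", b"gy-public-value",
                         k=os.urandom(16))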


When an incorrect proof is received, the WA generates an alarm. When too many alarms are detected for the same authentication round or within a short period of time, the system may decide, for example, to contact the network administrator. The operator could even receive an estimated location for the incident, given that the WA is provisioned with a localization server (at least in our architecture). Note that well-behaved clients monitor their received signal strength during a round and are not expected to generate a high number of alarms.

Multiple clients may respond to an authentication round, and the server processes each message separately. First, the server verifies whether ID identifies the current round and whether the client has responded within the allowed time interval (t). The server then computes the transient key k, decrypts the proof sent by the client, and compares it to the value it computes itself. If the value received is correct, the server sends the client a message acknowledging a successful round. The WA also uses this last message to send the client its own proof of knowledge of the nonces, the duration of the secure session, and other configuration information (IP address, default gateway, etc.). After the round is complete, each client shares with the WA a set of independent session keys, used to (temporarily) secure the communications between them. Section 5.11 details how the session keys and the transient key k can be generated from the DH shared secret and how a secure channel may be implemented.
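The server-side checks reduce to a few lines; expected_proof is the value the server computes for itself with the same client_proof construction sketched above.

    import hmac

    def verify_response(round_id, current_id, elapsed_s, t_max_s,
                        received_proof, expected_proof):
        # Reject stale rounds and late responses, then compare proofs in
        # constant time; a mismatch raises an alarm in the real system.
        if round_id != current_id or elapsed_s > t_max_s:
            return False
        return hmac.compare_digest(received_proof, expected_proof)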

5.6 Using Unicast Nonce Messages

If nonce messages are transmitted as unicast frames, a different round is executed for each device seeking authentication. In this case, an 802.11 access point can reduce the probability of frame collisions by using the RTS/CTS mechanism, part of the Distributed Coordination Function (DCF). This has two advantages. First, the RTS/CTS mechanism provides better protection against collisions caused by hidden nodes than a contention-free period. In dense 802.11 deployments, this reason alone may be sufficient to justify the use of unicast nonce messages, especially if not enough channels are available to provide proper isolation between cells. Second, all 802.11-compatible devices have to implement all DCF functions, while the PCF mechanism is optional [1]. When using unicast frames in an 802.11 deployment, access points have to directly prevent link-layer retransmissions. For example, this can be achieved by setting the maximum number of retransmissions to zero ([1], page 78). The access point can change this value for the duration of the authentication round and revert to the original value after all nonce


messages have been transmitted. One disadvantage of using unicast nonce messages is that the server cannot control the frequency of authentication requests as easily as in the broadcast case. As soon as the server is willing to start a new round (T seconds after the previous one), an attacker can jump ahead of well-behaved clients and request authentication, having the next round directed at him. By doing that continuously, he could starve other clients, preventing them from being properly authenticated. Therefore, this mechanism should be turned off when unicast messages are used, unless the network is able to detect when many authentication requests are being generated by the same device, even if it changes its MAC address over time. We describe one such mechanism in Chapter 6.

5.7 Transmission Power Control

The objective of power control is to use the minimum transmission power level that provides clients within the targeted range with an SNR level high enough to successfully complete the handshake. For a round executed through AP_i, the power control algorithm needs two inputs: SNR* and the range R_i. SNR* represents the minimum SNR level that provides clients with a high probability of receiving all random nonces without bit errors, with a different value for each data rate. From the definition of the service area and the placement of access points, the system defines the range for each AP so as to maximize coverage. The signal-to-noise ratio experienced by a wireless receiver is the difference in power between the intended signal and the noise in the communication channel. The expected SNR d meters from the transmitter can be predicted through the following equation:

SNR(d) = P_r(d) - N = P_t + G_r - P_loss(d) - N        (5.1)

where SNR is expressed in dB, P_t represents the power used by the AP (in dBm), G_r the receiver antenna gain (in dBi), P_loss(d) the path loss d meters from the AP, and N the sum of both noise and interference (also in dBm) [91]. In Equation 5.1, we assume access points (the transmitters) have antennas with zero gain.


From Equation 5.1, we can estimate the power needed to provide clients within R meters with a minimum signal-to-noise ratio of SNR* dB:

P_t = SNR* - G_r + P_loss(R) + N + Δ        (5.2)

where Δ is an increment (in dB) used as a safety margin and derived from how aggressive the network security policy is. In close proximity, signal attenuation tends to be lower than predicted by path loss models, so using Δ = 0 dB should work for most installations (supported by our measurements). Given that our current reference client is a standard PCMCIA card, we can use G_r = 0 dBi in Equation 5.2. The path loss estimate for the targeted range is found using the model computed during calibration, while the noise level (N) can be measured by access points during operation, as wireless cards usually provide a noise estimate for each packet received. The minimum noise perceived by a receiver is a function of the bandwidth used (20 MHz for an 802.11a channel), the environment temperature, and the noise figure; the exact equation can be found in [91]. For instance, taking the ambient temperature to be 290 Kelvin (approx. 63 °F), the minimum noise evaluates to N = -96 dBm, which will be used in this work as the noise level when no interferer is in range.

Off-the-shelf IEEE 802.11 access points provide a set of discrete values for transmission power, preventing the WA from using the exact value found using Eqn. 5.2. In this case, the WA uses the lowest power level that still satisfies the SNR requirement. For instance, one 802.11b-compatible access point provides 7 distinct power levels: 2, 5, 8, 11, 15, 17, and 20 dBm [12]. For example, suppose clients within 15 meters (R = 15 m) are to be provided with an SNR above 30 dB (SNR* = 30 dB for the 802.11a 54 Mbps mode). If n = 3.0 and N = -96 dBm, an AP would need to use a transmission power of at least 16 dBm, so 17 dBm would be used with the given AP.
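Equation 5.2 plus this quantization step can be written as follows. The discrete levels are those of the 802.11b access point cited above; the 82 dB used for P_loss(15 m) is the value implied by the worked numbers (n = 3.0 with a free-space intercept at 1 meter is an assumption on our part).

    def min_tx_power_dbm(snr_star_db, p_loss_db, gr_dbi=0.0, noise_dbm=-96.0,
                         delta_db=0.0, levels=(2, 5, 8, 11, 15, 17, 20)):
        # Equation 5.2, then round up to the AP's next discrete power level.
        p_needed = snr_star_db - gr_dbi + p_loss_db + noise_dbm + delta_db
        usable = [p for p in levels if p >= p_needed]
        return min(usable) if usable else None   # None: range not coverable

    # Worked example: SNR* = 30 dB, P_loss(15 m) ~ 82 dB -> at least
    # 16 dBm is needed, so the 17 dBm level is selected.
    print(min_tx_power_dbm(30.0, 82.0))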

5.8 Handshake Parameters

During an authentication round, the server should send enough random data to make it impractical for outsiders to receive all packets correctly (due to bit errors) while allowing clients in close range to authenticate successfully with high probability. There is a trade-off between the amount of random data and authentication delay, given the increase in


the number of iterations. We found a good compromise using 40 Kbytes of random data transmitted over 20 iterations (I = 20, S = 2000 bytes), as we show in Section 5.12. The server also chooses one of the available modes to transmit the nonce messages. In the IEEE 802.11a standard, for example, a transmission mode defines the combination of modulation and coding scheme to be used, with the proper information placed in the PLCP header for decoding purposes [2]. The receiver has no choice but to demodulate the nonce message according to the mode selected by the server. Note that the selection of modulation exposes a trade-off between performance and power, as modulations providing higher data rates require higher SNR levels for a target bit-error rate, consequently increasing transmission power demands. For our evaluation we adopt the 54 Mbps mode, which can be used with off-the-shelf access points to cover the short ranges (< 10 meters) of our deployment.

The server imposes a maximum delay of t seconds for a client to complete a handshake after the last nonce message is transmitted. The chosen value should allow clients to perform a Diffie-Hellman exponentiation, calculate the transient key k and the corresponding proof, and respond to the challenge. A value between 1 and 5 seconds should suffice for most installations, allowing enough clients to respond while still limiting the time attackers have to deal with bit errors. For instance, assuming clients take 300 milliseconds to generate the response message (supported by measurements in Section 5.12), 2000-byte response frames, a 6 Mbps data rate for both data and ACK frames, and no collisions, theoretically 1 second is enough time for 200 clients to authenticate (already accounting for inter-frame spacing and propagation delay). Packet collisions and concurrent traffic would lower this estimate, while clients that are able to use faster data rates would increase this number.

Finally, the system can control the frequency of authentication rounds by controlling the value of T (the minimum delay between rounds). This lower bound makes attackers unable to increase the rate at which new rounds occur, which would increase the probability of system compromise within a given amount of time. The system can use values of T on the order of tens of seconds (e.g. 30 seconds) without severely affecting well-behaved clients. Clients joining the network will wait an average of T/2 for a round to start, but active clients can reauthenticate before their current sessions expire. This mechanism is akin to how system evolution is controlled in the LOCKSS system [81].
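The 200-client figure can be sanity-checked with simple arithmetic; the per-frame overhead below (ACK, inter-frame spacing, propagation) is a rounded assumption.

    def clients_per_window(t_s=1.0, compute_s=0.300, frame_bytes=2000,
                           rate_mbps=6.0, per_frame_overhead_s=0.0006):
        # Clients compute their responses in parallel, so after the first
        # ~300 ms the channel itself becomes the bottleneck.
        airtime_s = frame_bytes * 8 / (rate_mbps * 1e6) + per_frame_overhead_s
        return int((t_s - compute_s) / airtime_s)

    print(clients_per_window())   # roughly 200 with these assumptions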


5.9 Selective Jamming

Selective jamming is an optional mechanism that can be used by the network to further improve security guarantees compared to using a standalone handshake. The objective of this mechanism is to make it harder for devices beyond the intended authentication range to receive all nonce messages successfully. If directed selectively towards devices outside the service area, we show that this jamming mechanism improves security substantially. When selective jamming is enabled, the WA schedules bogus transmissions to take place over other access points (called jammers) while an authentication handshake is taking place. Jammers transmit simultaneously with the access point performing the authentication handshake. They start their transmissions immediately before the handshake starts (the WA can send them a command either wirelessly or over the wired infrastructure) and continue until all the nonce messages have been sent. In terms of packet contents, our measurements (presented in Section 5.12.8) demonstrate that 802.11 frames with random payloads achieve the desired result.

With proper placement of jammers, this mechanism works because devices located outside the service area detect these bogus transmissions with signal strength levels higher than those seen for nonce messages. This capture effect, also called the near-far problem, is inherent to multiple-access wireless systems: a stronger transmitter captures the receiver, making the detection of weaker signals unfeasible [91]. Even when such a secondary transmission does not capture the receiver, it still increases the noise level, thus decreasing SNR and communication quality. Cellular systems, for example, avoid similar problems by performing transmission power control and proper channel allocation, and by using sectorized antennas. During an authentication handshake, the receiver used by an attacker may be captured by a stronger, bogus transmission performed by a jammer, reducing authentication chances.

The main challenge regarding this mechanism is that devices located within range of the access point performing the authentication handshake must not be affected by the jammers. Such collateral damage could render authentication impossible, as our measurements demonstrate. To achieve such isolation, jammers have to be properly placed and may have to use directional antennas. For example, in the case of a service area defined by the boundaries of an office building, jammers could be placed outside, farther from the clients targeted by the network and closer to external areas. In this case, jammers provided with directional antennas could even be placed close to the external walls, with their main lobes


pointed away from the SA. For example, small 8-dBi directional antennas with horizontal beamwidths larger than 70° could be used in these scenarios [16].

5.10 Minimizing Interference

As in any other wireless deployment, it is important to minimize interference in the network. With range-based authentication, it is especially important for the WA to minimize interference while nonce messages are being transmitted as part of an authentication round, given the low power levels used and the lack of frame retransmission. This is done in two steps.

First, the WA is responsible for assigning channels to all active access points (those serving active clients as well as those handling authentication traffic) in order to minimize co-channel interference. In networks with enough channels, standard channel allocation techniques seem sufficient to prevent harmful interference. For instance, 802.11 networks operating in both the 2.4 GHz and 5 GHz bands have at least 11 non-overlapping channels, enough to provide an office building with dense coverage without severe interference levels, given that each AP covers around 40-50 meters. Additionally, the WA could reserve one channel solely for authentication purposes, therefore isolating authentication traffic from active data sessions. This would allow for negligible interference, as the WA controls authentication rounds across all the access points.

When fewer channels are available, minimizing interference is a harder task. If the underlying network provides the WA with a reliable channel reservation mechanism, the WA can temporarily prevent communication in all cells operating on the same channel as the AP performing the handshake. It is unclear whether this can be achieved in 802.11 networks given the mechanisms provided by the standard. If this is not possible, one option would be to decrease the density of access points, therefore increasing the authentication range used by each AP and decreasing the security guarantees provided by our mechanism.

The second step is to minimize the interference between APs and clients contending for the same channel during the transmission of nonce messages. As already discussed, in IEEE 802.11 networks this can be achieved using contention-free periods (when using broadcast frames) or the RTS/CTS mechanism (when using unicast frames).


5.11 Secure Session Implementation

Three building blocks are used during the protocol handshake: a cryptographic hash function, a block cipher, and a Diffie-Hellman key exchange. One could currently achieve adequate security by using SHA-1 as the hash function, AES as the block cipher (with 128-bit keys), and a 1536-bit group for the DH exchange [72, 78], upgrading these sizes as necessary. For our measurements, described in detail in Section 5.12.7, we use one of the DH groups defined for the IKE protocol [72]. To generate the session keys from the DH secret (also a standard operation) we currently use one of the pseudo-random functions defined by IKEv2 [68].
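As an illustration of this derivation step, the sketch below implements the prf+ key expansion defined by IKEv2 [68] with HMAC-SHA1. The seed value and the way the output is split into keys are placeholders; in the actual protocol these inputs come from the handshake transcript.

    import hashlib
    import hmac

    def prf(key: bytes, data: bytes) -> bytes:
        # prf instantiated as HMAC-SHA1
        return hmac.new(key, data, hashlib.sha1).digest()

    def prf_plus(key: bytes, seed: bytes, length: int) -> bytes:
        # IKEv2 prf+: T1 = prf(K, S | 0x01), Ti = prf(K, T(i-1) | S | i)
        out, t, i = b"", b"", 1
        while len(out) < length:
            t = prf(key, t + seed + bytes([i]))
            out += t
            i += 1
        return out[:length]

    dh_secret = b"\x00" * 192         # placeholder for the DH shared secret
    seed = b"handshake-transcript"    # placeholder seed (nonces, public values, ...)
    material = prf_plus(dh_secret, seed, 60)   # 480 bits of key material
    k, enc_key, auth_key = material[:16], material[16:32], material[32:48]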

5.12 Evaluation

In this section we present measurements that demonstrate the effectiveness of our mechanism. We first show that bit error rates increase as SNR levels decrease, a property that allows our protocol to control the range of each access point. We then demonstrate that clients within the intended range authenticate successfully with high probability, while attackers outside the SA need additional antenna gain to complete an authentication handshake.

5.12.1 Channel Quality vs. SNR

We show that signal quality is a function of SNR using measurements performed with off-the-shelf IEEE 802.11a hardware running at 54 Mbps. We show that channel quality, represented by the occurrence of bit errors, degrades rapidly as the SNR decreases. Specifically, we show that for a given 802.11 mode, when SNR values are above a threshold SNR*, clients consistently experience negligible bit error rates. Similarly, our results demonstrate that whenever the SNR goes 10 dB below SNR*, receivers consistently experience unbearable BER levels, making communication almost impractical. This narrow SNR gap helps our protocol define boundaries for wireless coverage.

For these measurements we used two laptop computers equipped with 802.11a cards with Atheros chipsets AR5000 and AR5002X, the first used as transmitter and the second as receiver. Across all measurement rounds, the receiver was placed in 2 different locations (inside an office and a conference room), while the location of the transmitter was varied for each of the 73 rounds. All locations were inside the Gates 4A wing, with both


transmitter and receiver kept stationary during the experiments. Each of the 73 measurement rounds consisted of 500 2000-byte raw 802.11 frames containing random payloads, transmitted with an inter-frame delay of 100 milliseconds. For each round, the payload was created by seeding the MT19937 pseudo-random number generator (as implemented by the GSL library) with a different value. To avoid synchronization issues, the same payload was used for all frames transmitted during the same round. All rounds were executed over the same channel, with the default transmission power, using the 54 Mbps mode, and with the receiver in promiscuous mode and ignoring CRC checks.

Using a modified Linux device driver, we logged for every received frame the data rate used, the sequence number, the detected SNR level, and the frame payload. We then applied a consistency check, removing from the dataset all frames labeled with wrong data rates (a total of 619 frames). We discarded them because we could not be sure the card used the right mode during decoding, which would affect our error statistics.

[Figure 5.3: Frames received during each round as a function of average SNR.]

The probability that a frame is received (even if we disregard bit errors) is a function of the perceived SNR level: the higher the SNR, the better the chances of reception. Figure 5.3 shows the number of frames received (including those with bit errors) during each round, as a function of the average SNR during the round. Note that for rounds with high average SNR levels (> 25 dB), virtually all frames were received. However, all rounds with average SNR levels under 17 dB produced fewer than 150 of the 500 frames transmitted, i.e., over 70% of frames were not even received. Two rounds with low average SNR produced no frames and are therefore not shown in the figure.


[Figure 5.4: Percentage of corrupted frames as a function of SNR level.]

Figure 5.4 shows that a 10 dB drop in SNR transforms a perfect channel into one with almost 100% packet corruption. The figure plots the percentage of corrupted frames (we labeled all frames with at least one bit error as corrupted) as a function of the measured SNR level. Instead of calculating per-round SNR averages, for this graph we aggregated all frames received with the same SNR, across all 73 rounds. As we expected, the receiver experiences negligible frame corruption when SNR ≥ 30 dB. As the SNR decreases below this threshold, frame corruption increases rapidly, with corruption rates close to 100% when SNR < 20 dB. While these two threshold values (30 and 20 dB) are specific to the 54 Mbps mode, this behavior is characteristic of all other data rates [25, 44].

Finally, Figure 5.5 shows that the number of bit errors per frame increases rapidly as the SNR decreases from around 30 to 15 dB. The figure presents the number of bit errors per frame as a function of SNR. Each point represents a frame, with the corresponding number of bit errors given by the left y-axis. The right y-axis shows the total number of frames in the dataset for each SNR level (represented by the dotted curve). While high SNR levels create consistently reliable channels, up to 5000 bit errors per frame are seen in the range 15-20 dB. Due to these error rates, in this range a receiver has negligible chances of successfully completing a handshake, as seen in the next section.
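The sharpness of this transition matches what a simple independence model predicts. Assuming independent bit errors (a simplification; real errors arrive in bursts), the probability that a 2000-byte frame is corrupted rises steeply with the bit error rate:

    def frame_corruption(ber: float, frame_bytes: int = 2000) -> float:
        # Probability of at least one bit error, assuming independent errors.
        return 1.0 - (1.0 - ber) ** (frame_bytes * 8)

    for ber in (1e-7, 1e-6, 1e-5, 1e-4):
        print(f"BER={ber:.0e}: {100 * frame_corruption(ber):5.1f}% frames corrupted")
    # BER=1e-07:   0.2%   BER=1e-06:   1.6%   BER=1e-05:  14.8%   BER=1e-04:  79.8%

A change of three orders of magnitude in BER, which the measurements above associate with roughly 10 dB of SNR, takes the channel from essentially error-free to mostly corrupted.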


[Figure 5.5: Bit errors per frame (left y-axis) and total number of frames (right y-axis) relative to measured SNR levels.]

Put together, these measurements demonstrate our first result:

R1: For the IEEE 802.11a 54 Mbps mode, SNR* = 30 dB, with performance degrading substantially 10 dB below this level.

5.12.2 Authentication at Close Range

In this section we evaluate the chances of successful authentication for a client within range of an access point. In this case, a failed handshake can be seen as a false negative.

Theory
Using the 54 Mbps mode during authentication, what is the probability that a receiver completes a handshake with I iterations for different values of SNR? Let p_err denote the probability of receiving an error frame. Using the results presented in Figure 5.4, p_err equals 0.0142, 0.0576, and 0.1247 at SNR levels of 30, 28, and 25 dB, respectively. Assuming uniformly distributed frame losses, the probability of successfully completing an authentication round can be estimated by:

p_auth = (1 - p_err)^I.    (5.3)

With 20 iterations, according to Equation 5.3 a client completes the handshake with probabilities 0.75, 0.31, and 0.07 when the SNR equals 30, 28, and 25 dB, respectively.

At close range, clients face a small probability of successive authentication failures. For I = 20 and using the values just found for p_auth, a client provided with SNR* = 30 dB consecutively fails two and three authentication rounds with probabilities 0.062 and 0.015, respectively. At SNR* = 28 dB, these numbers increase to 0.48 and 0.34, which shows that despite a decrease in performance, clients close to the maximum range may still be able to authenticate when facing mild RSSI oscillations. These results demonstrate that for the 54 Mbps mode, the WA can be configured to use I = 20 and SNR* = 30 dB, i.e., to configure the APs to use enough power to provide clients with an average SNR of 30 dB at the desired range.

Practice
We performed additional measurements to show that with the transmission power control of Section 5.7, clients within the targeted range do achieve SNR levels above SNR* and authenticate successfully. As shown in Figure 5.6, we placed the transmitter (shown as a triangle) inside an office and moved the receiver across 21 other locations, always with the same orientation (facing the upper wall in the figure).

[Figure 5.6: Locations sampled during authentication experiments.]

At each location, 1000 2000-byte frames were transmitted at 54 Mbps in order to emulate 50 consecutive handshakes (I = 20, S = 2000 bytes). Given the path loss model found for our building during calibration (α = 3.65) and the power

used by our transmitter (15 dBm), Equation 5.2 yields a range of 8.5 meters (R = 8.5 m) for the 54 Mbps mode (SNR* = 30 dB). The same calculation predicts that clients within 11.7 meters of the AP are provided with SNR levels above 25 dB.

#    SNR        Rounds          #    SNR        Rounds
1    50.4 dB    42 (84%)        12   30.6 dB    49 (98%)
2    48.4 dB    37 (74%)        13   27.1 dB    16 (32%)
3    39.4 dB    44 (88%)        14   37.9 dB    45 (90%)
4    39.6 dB    38 (76%)        15   39.3 dB    47 (94%)
5    48.2 dB    44 (88%)        16   29.0 dB    47 (94%)
6    43.0 dB    42 (84%)        17   26.7 dB    18 (36%)
7    38.8 dB    45 (90%)        18   26.8 dB    18 (36%)
8    41.7 dB    42 (84%)        19   17.9 dB     0 ( 0%)
9    42.3 dB    47 (94%)        20   21.3 dB     0 ( 0%)
10   34.6 dB    37 (74%)        21   16.4 dB     0 ( 0%)
11   30.9 dB    37 (74%)

Table 5.2: Authentication statistics for each location sampled.

The results of our measurements, shown in Table 5.2, agree with our theoretical analysis. The table presents, for each location shown in Figure 5.6, the average SNR level and the percentage of rounds completed successfully. First, the results confirm that all clients located within the authentication range experience SNR levels above 30 dB. In fact, 9 of the 13 locations within range perceived average SNR levels above 38 dB, an 8 dB safety margin which would allow for further signal oscillation. Second, all locations in range were able to complete at least 74% (37 out of 50) of the authentication handshakes, which agrees with the value found in our theoretical analysis (p_auth = 0.75).

We find these results promising for a couple of reasons. First, our transmitter used a standard PCMCIA card with an internal antenna, which produces a more directional pattern than the external antennas found in off-the-shelf access points. Second, in over-provisioned LANs clients should be in range of multiple APs, so even a sudden drop in SNR relative to one AP would not prevent a client from authenticating successfully.

R2: Using the transmission power control algorithm of Section 5.7, clients in close range authenticate successfully, with a low probability of two consecutive failures.


5.12.3 Handling False Negatives

A network operator has several choices to improve quality of service at locations within the service area that generate high rates of false negatives. For example, locations close to the boundaries of the SA tend to be within range of fewer access points and are therefore expected to generate more false negatives than locations covered by more APs. In some situations (for example, a user unable to authenticate when working in his office), forcing the user to get closer to the access point to improve his SNR is not a satisfactory alternative.

The first option for the network administrator is to manually increase the transmission power of the closest access point. Depending on where the AP is located and its immediate surroundings, attenuation may indeed be higher than predicted by the path loss model (which is calculated over the whole SA). An increase in power will always extend coverage beyond the SA's boundaries, but whether that decreases overall security depends on the range of the given access point and how that value compares to the ranges used by the other APs. (We assume an attacker will always use the AP providing him with the highest SNR level.)

To avoid increasing the range of one AP significantly (we showed that the higher the range value, the lower the amount of amplification needed by an attacker outside the building), an administrator can install an additional access point close to (or at) the problematic location. In the office scenario, an AP could be installed inside the office, with an even shorter range (< 5 meters) and therefore minimal transmission power. The use of such a short link considerably reduces the probability of false negatives without necessarily making it easier for attackers to access the network.

Another alternative is for the administrator to change the wireless device used by the user. Independently of the hardware used to set the reference client, there is always the chance that a user is provided with a device of inferior quality. For example, some PCMCIA cards are known to have badly designed antennas which severely affect their performance.

5.12.4 Attacker Strategy

We now evaluate the resources needed by an attacker to successfully complete an authentication round when located outside the service area, i.e., his chances of bypassing the system and generating a false positive.


Increasing antenna gain is the most cost-effective way for an attacker to improve SNR outside the building. Any receiver is associated with a noise figure of F dB, a function of the noise created in circuitry in the receiver stages [32, 91]. Amplifiers at the receiver always amplify both signal and noise, so the output SNR is always F dB lower than the level detected at the antenna. While amplifiers can be used to reduce the noise figure associated with the overall system, current receivers already provide noise figures of 6-8 dB, leaving little room for improvement.

Antennas are passive devices, so an increase in gain is possible only by making them more directional, which is achieved using smaller angles of radiation (or beamwidth) and larger apertures (related to physical area) [32]. For specific antenna designs, one can find approximations for beamwidth and gain as a function of the antenna's physical dimensions. For instance, at 2.4 GHz a parabolic dish antenna with a diameter of 1 meter (40 inches) can provide up to 26 dBi of gain with an 8.7° beamwidth [32]. As real examples, the 802.11-compatible HyperLink HG2424G parabolic antenna provides 24 dBi of gain with an 8° beamwidth while measuring 40 inches and weighing 8 lbs [15]. The 19-dBi HyperLink HG2419G has a 12° horizontal beamwidth while measuring over 23 inches and weighing over 5 lbs [10].

An attacker outside the building needs to increase his perceived SNR in order to authenticate successfully without generating a suspicious number of alarms. Nonce messages do not contain checksums, so an attacker (as any other receiver) has to rely on the detected SNR level when deciding whether to complete a handshake. For example, if provided with an SNR of 20 dB, an attacker could use an antenna with 10 dBi of gain to achieve 30 dB and properly receive nonce messages sent at 54 Mbps. As high-gain antennas are both larger and more expensive, the best strategy for an attacker is to find the smallest antenna (lowest gain) he can get away with. This would allow him to bypass security without being easily spotted or generating a risky number of alarms, as an alarm is raised for each incorrect response.
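The dish figures above are consistent with the standard aperture approximation G = η(πD/λ)^2, with aperture efficiency η typically around 0.5-0.6. The sketch below uses this textbook formula, not a measurement from our testbed, to show why high gain implies a physically large antenna:

    import math

    def dish_gain_dbi(diameter_m: float, freq_hz: float, eta: float = 0.6) -> float:
        # Aperture approximation: G = eta * (pi * D / lambda)^2
        lam = 3e8 / freq_hz
        return 10 * math.log10(eta * (math.pi * diameter_m / lam) ** 2)

    print(round(dish_gain_dbi(1.0, 2.4e9), 1))   # ~25.8 dBi for a 1 m dish at 2.4 GHz
    print(round(dish_gain_dbi(0.25, 2.4e9), 1))  # ~13.7 dBi for a 25 cm dish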

Let SNR_Att denote the SNR level targeted by the intruder in his effort to complete a handshake executed using the 54 Mbps mode. We know that if he minimizes his gain, then SNR_Att ≤ SNR* = 30 dB. The higher the SNR level, the lower the number of alarms generated by the attacker. Let A denote the number of alarms generated and p_auth denote the probability of successfully completing an authentication round (as in Section 5.12.2). Assuming an attacker stops after a successful round, the number of alarms follows a geometric distribution [99], with n alarms happening with probability

p_auth (1 - p_auth)^n,    (5.4)

and the expected number of alarms given by

E[A] = (1/p_auth) - 1.    (5.5)

With 20 iterations (I = 20), E[A] equals 0.33, 2.27, and 13.35 for 30 dB, 28 dB, and 25 dB of SNR, respectively. Given that an attacker is unlikely to want to generate 13 successive alarms, we can conservatively assume that SNR_Att > 25 dB. Consequently, for an authentication round executed using the 54 Mbps mode, an intelligent attacker would strive to achieve 25 dB < SNR_Att ≤ 30 dB.
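Plugging the per-round success probabilities of Section 5.12.2 into Equation 5.5 (values are approximate, since the text rounds p_auth):

    # E[A] = 1/p_auth - 1 for a geometrically distributed number of failed rounds
    for snr_db, p_auth in [(30, 0.751), (28, 0.306), (25, 0.070)]:
        print(f"SNR = {snr_db} dB: E[A] = {1 / p_auth - 1:.2f}")
    # SNR = 30 dB: E[A] = 0.33
    # SNR = 28 dB: E[A] = 2.27
    # SNR = 25 dB: E[A] = 13.29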

5.12.5 Attack Resource Demands

Given our building and assuming the placement of access points shown in Figure 5.1, how much gain would an attacker need outside the building? The area shown measures approximately 48 m × 24 m, and the 8 access points provide a density of roughly one AP per 144 m² (1550 s.f.). As shown in the figure, in this configuration 4 access points use an 8.5-meter range (R = 8.5 m) while the other 4 use R = 6 meters in order to cover the floor. Given the path loss model found inside our building (α = 3.65), our power control algorithm yields the values 10 dBm (R = 6 m) and 15 dBm (R = 8.5 m). The results shown in this section assume the use of the 802.11a 54 Mbps mode, with SNR* = 30 dB.

Figures 5.7 and 5.8 plot the expected SNR values outside our building given the signal attenuation rate found in our outdoor measurements (α_out = 3.32), the access point placement shown in Figure 5.1, and the power levels just calculated (10 and 15 dBm). Figure 5.7 shows the maximum SNR available at each location, taking all 8 access points into account and assuming no antenna gain. Figure 5.8 presents the same data from a different perspective, with SNR outside the SA shown as a function of the distance between attacker and external wall (d_A). Two curves are shown, presenting the SNR levels relative to the access points using authentication ranges of 8.5 and 6 meters. It is interesting to note that, despite being closer to the wall, the AP configured with a 6-meter range provides lower SNR levels outside the building, given the lower power used and the logarithmic relationship between SNR and distance.

[Figure 5.7: Expected SNR outside the building. The rectangle represents our service area, while access points are shown as triangles. The axes show coordinates in meters, while the curves show the SNR at each location.]

An attacker provided with at most 10 dBi of antenna gain (which can be realized without resorting to large directional antennas) has to be physically located at most 15 meters (50 feet) from the service area in order to have any chance of authenticating successfully. Figure 5.8 shows that beyond 15 meters from the service area, the predicted SNR level (without amplification) is below 15 dB. From that location, an attacker with 10 dBi of antenna gain is able to increase the predicted SNR to 25 dB. To increase the total SNR to 30 dB and avoid an unusual number of alarms, the attacker has to be less than 10 meters from the wall, where the unamplified SNR level is close to 20 dB. Getting that close to the SA may pose high risks of exposure, especially if physical security measures such as cameras or fences are deployed. Even in a cafeteria, getting that close to the infrastructure would probably put the intruder within sight, increasing the chances of attack detection.

What if an attacker employs high-gain, directional antennas? The extra gain provides him with approximately 20 meters of additional separation from the SA. Beyond that, the amount of amplification needed is impractical. With 20 dBi of gain, the perceived SNR



needs to be above 5 dB for a total SNR of 25 dB. Outside our building, the average SNR is above 5 dB only within approximately 35 meters of the external wall. To achieve a total SNR of 30 dB with the same amount of gain, an attacker needs to be less than 25 meters from the wall.

[Figure 5.8: Predicted SNR for an attacker outside the building as a function of his distance to the external wall (d_A).]

These results demonstrate that independently of the approach preferred by an attacker, our protocol considerably increases wireless security compared to other solutions that incur low management overhead. An attacker who opts for a device that is easier to conceal is forced to be physically close to the service area and therefore more exposed to physical security measures. If he attempts to authenticate from a safer location farther from the infrastructure, he is required to use high-gain antennas. As these devices are large (20-40 inches), heavy (5-12 lbs), and quite directional (beamwidth < 12°), such attempts are not only much harder to implement but also much easier to detect through standard physical security measures (e.g. cameras).

R3: Attackers with low amplification have to be located physically close to the service area (< 15 m), while impractical amounts of antenna gain are needed beyond 30-35 meters from the SA.

5.12.6 Effective Coverage Reduction

Looking at the results from the previous section from a different angle, it becomes clear that our mechanism significantly reduces unwanted network coverage. For example, assume


an 802.11b access point inside our building, located 10 meters from the external wall and with an output power of 15 dBm (the same power level used by 4 of the APs in our SA). Let us assume that signal strength attenuation through the building follows the same path loss model (α_out = 3.32 and WAF = 5 dB) and that the AP is left open, i.e., servicing any client and using any data rate. For the 1 Mbps data rate, most 802.11b cards have a sensitivity level of approximately -94 dBm. Given the model and the transmission power used by the AP, the predicted RSSI goes below -94 dBm only when a client is located over 80 meters from the AP (> 70 meters from the building). That is, even an attacker with no antenna gain can successfully use an open access point from such large distances. With range-based authentication, the practical network range is severely reduced, with large antennas already needed beyond 30-35 meters from the building.
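As a sanity check of the 80-meter figure, the sketch below evaluates the log-distance model; the 40 dB reference path loss at 1 meter is an assumed free-space value at 2.4 GHz, not a calibrated constant from our testbed:

    import math

    P_TX, PL0, ALPHA, WAF, SENS = 15.0, 40.0, 3.32, 5.0, -94.0

    def rssi_dbm(d_m: float) -> float:
        # Log-distance model with a single wall attenuation factor
        return P_TX - PL0 - 10 * ALPHA * math.log10(d_m) - WAF

    # Distance at which the predicted RSSI reaches the -94 dBm sensitivity:
    d = 10 ** ((P_TX - PL0 - WAF - SENS) / (10 * ALPHA))
    print(round(d))  # ~85 m, consistent with the "over 80 meters" estimate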

5.12.7 Session Establishment Overhead

The time elapsed between the first nonce message and the last message in the handshake is a function of the following terms:

- trans: transmission overhead, incurred by all the frames transmitted during the handshake;
- DH_snd: time needed to generate DH private and public keys (choose a random x, compute g^x mod p);
- DH_rcv: time needed to generate the DH shared secret (from the peer's public value g^y mod p and x, compute g^xy mod p);
- k: time used for key generation (computation of the transient key k and session keys from the shared secret);
- proof: time spent on proof computation (hashing plus encryption).

We used four Linux machines with distinct processors to estimate the overhead incurred by each of these terms. Details about the hardware used and the results of our measurements are presented in Table 5.3. The two machines provided with slower processors (C1, C2) were used to estimate lower bounds for client performance. The two faster machines (S1, S2) were used to estimate the performance of the server. All cryptographic operations were measured using OpenSSL v.0.9.7 under Linux.


Machine                     DH_snd           DH_rcv           k            proof
C1: Pentium III 733 MHz     19.6 (19.6) ms   24.6 (25.0) ms   75 (76) μs   653 (665) μs
C2: Pentium III 933 MHz     17.3 (20.0) ms   22.0 (25.0) ms   74 (76) μs   540 (624) μs
S1: Pentium 4 1.7 GHz       16.0 (16.0) ms   20.1 (20.4) ms   39 (44) μs   261 (276) μs
S2: Pentium 4 3.2 GHz       8.0 (8.1) ms     10.0 (10.1) ms   19 (19) μs   128 (138) μs

Table 5.3: Performance measurements for the main operations in a handshake using 4 distinct machines (medians, with 90th percentiles in parentheses).

Transmission overhead (trans) is on the order of tens of milliseconds. Each 802.11 transmission has to wait up to 50 μs to minimize the probability of collisions and include a preamble (16 μs) and a signal field (4 μs), the latter always transmitted at 6 Mbps [2]. A 2350-byte frame sent at 24 Mbps takes approximately 855 μs to transmit, already counting the fixed 70 μs of overhead. (APs and clients should rarely need to use data rates below 24 Mbps, given the short range between them.) With 20 iterations, the total overhead would stay below 20 ms. This is a reasonable estimate for the scenario where access points use broadcast packets for nonce messages, as they reserve the channel before transmission. In the unicast case in an 802.11 network, additional overhead would be incurred by the RTS/CTS mechanism.

As shown in Table 5.3, Diffie-Hellman operations (DH_snd and DH_rcv) are orders of magnitude more expensive than the other computations included in the handshake. We evaluated each operation 200 times, with the medians and 90th percentiles shown in the table (90th percentiles in parentheses). While it takes the slowest client (C1) less than 700 microseconds to hash 40 Kbytes of nonces and DH public exponents, almost 20 milliseconds are spent on a single DH exponentiation. Even at the fastest server, calculating the DH public key and shared secret consumes respectively 8 and 10 milliseconds.

We generated 128-bit session keys using the 2048-bit Diffie-Hellman group used by IKE [72] and 256-bit exponents (the minimum recommended size [72]). From the DH shared secret, we estimated k by generating 480 bits of key material using the PRF prf+ (also defined by IKE [68]) with HMAC-SHA1 as the underlying function. At the fastest client and server, k takes an average of 74 and 19 microseconds, respectively. From these 480 bits, we get k (128 bits) and the encryption and authentication keys to be used during the established session (128 bits each). Finally, we used SHA-1 and AES in ECB mode to calculate the nonce reception proof. Again at the fastest client and server, proof takes
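A quick check of the airtime figure using the overheads just cited:

    FRAME_BYTES, RATE_BPS = 2350, 24e6
    OVERHEAD_US = 50 + 16 + 4          # contention wait + preamble + signal field

    airtime_us = FRAME_BYTES * 8 / RATE_BPS * 1e6 + OVERHEAD_US
    print(round(airtime_us))              # ~853 us per nonce frame
    print(round(20 * airtime_us / 1000))  # ~17 ms for all 20 iterations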


an average of 540 and 128 microseconds, respectively.

Under low loads, clients rarely take more than 150 milliseconds to finish an authentication handshake. From the client's perspective, we can estimate the average duration of a handshake by adding trans to the DH-related overheads at both client and server. Following this rationale, it would take the slowest client (C1) an average of 100 milliseconds to complete the handshake with the slowest server (S1). This value decreases to less than 80 ms when pairing the fastest client with the fastest server (C2 and S2). These figures show that 150 milliseconds can safely be used as an estimate for authentication delay.

Our results suggest that a server can process tens of requests per second in software. The WA authorizes a client after a successful verification of the nonce proof, the cost of this operation being dominated by DH_rcv. Based on our measurements, S1 (respectively S2) would be able to process a maximum of 62 (125) such requests per second. These numbers should be more than enough for most environments: even if they seem low, these servers provide adequate performance for three main reasons. First, most installations would probably not have 20 new clients within a confined area (recall the SNR requirements) requesting authentication simultaneously. By randomizing the duration of established sessions, the WA can also distribute the costs associated with session renewals over time, thus preventing active users from creating flash crowds. Second, a client has to respond within t seconds after the last nonce message; all other requests are dropped by the server. Finally, another round does not take place before T seconds elapse. Thus, accepted requests can be buffered and the corresponding handshakes completed before the next authentication round starts.

R4: A handshake is expected to be completed by most clients with an average delay below 150 milliseconds. Servers should be able to handle upwards of 50 requests per second in software.

5.12.8 Selective Jamming

We now present measurements that demonstrate the effectiveness of the selective jamming mechanism. First, we show how communication quality degrades when multiple transmitters are within range of a receiver (the near-far problem). To be consistent with our deployment scenario, for these measurements we use three 802.11 devices. Then, we show the effect jammers would have if they were installed outside the building in our testbed.


For these measurements we want to reproduce the scenario in which a receiver is within range of two transmitters that cannot hear each other. These conditions are similar to the ones perceived by an attacker outside the service area who can detect simultaneous transmissions performed by an access point and a jammer. We used three laptops, all provisioned with Atheros-based 802.11a cards. One laptop acts as the receiver, while the other two act as transmitters, one transmitting data frames at 54 Mbps (the intended transmitter) and the other at 6 Mbps (the jammer). The intended transmitter sends sequences of 500 2000-byte frames with a fixed payload (one every 100 milliseconds) while the jammer transmits continuously.

The first measurement had the objective of testing our logging capabilities. We placed all three laptops in close range of each other and verified that our modified device driver and our application could indeed log all frames transmitted by both cards. A total of 16,076 frames were logged by the receiver, 500 of them sent by the transmitter and the remainder transmitted by the jammer (logged with proper sequence numbers). Only 3 frames out of all those transmitted had bit errors, 2 of which were sent by the intended transmitter at 54 Mbps.

In order to make both transmitters act simultaneously and thus simulate a jamming scenario, we had to force them to bypass the CSMA/CA algorithm implemented by the 802.11 standard. Not doing so would decrease the jamming effects, as the cards would constantly back off to avoid collisions. Unable to configure this behavior through the device driver, we physically placed the three laptops in such a way that the two transmitters could barely hear each other (the hidden-station scenario). The receiver was placed at the intersection of two corridors, each housing one transmitter. The transmitter and the jammer could be heard by the receiver with approximately 33 dB and 36 dB of SNR, respectively.

With this configuration and in the absence of the jammer, the receiver has a high-quality communication channel with the intended transmitter. As expected given the high SNR (average of 33 dB), in this scenario only 2 of the 500 frames transmitted were received in error (0.4% frame loss). However, when the transmitter and the jammer are active simultaneously, the error rates perceived by the receiver are two orders of magnitude higher.

We performed 5 measurement runs in the hidden-station scenario, with the results illustrated in Figure 5.9. Each of the 5 rows in the figure presents the statistics for the 500 frames sent by the intended transmitter during that measurement run. A full vertical tick indicates that the frame with that sequence number was received correctly, with no bit errors.


A half-tick represents an error frame, i.e., a frame actually received but containing flipped bits in its payload. The absence of a tick represents a frame not captured by the receiver.

[Figure 5.9: Jamming experiments. Each row in the figure represents a measurement round. A full tick represents a frame received successfully, while a half-tick represents a frame with bit errors. Frames not captured by the receiver are not shown.]

For the first run (bottom of the figure), 351 frames were received, 76.4% of them in error. For the other runs, the numbers were respectively 329/79.3%, 340/75.9%, 285/74.7%, and 383/54.8%. Note that these percentages do not include frames not captured by the receiver, so frame loss rates are even higher.

These results show that jamming substantially degrades communication quality, increasing a negligible frame loss rate to at least 54%. They show that with two simultaneous transmissions, either the receiver is not able to detect the beginning of the intended transmission (in which case the frame is not received) or it receives corrupted frames given the extra noise caused by the jammer. With all laptops stationary and kept at the exact same locations, frame error rates increased from 0.4% (the scenario without the jammer) to between 54% and 79% with the introduction of the jammer.

Despite the low average communication quality, we can identify several long runs of undamaged frames, three of which are labeled in the figure. These long runs were caused by the fact that our jammer was a normal 802.11a device, still susceptible to back-offs and other interruptions, a fact we could verify by analyzing our log files. While the intended transmitter waits 100 ms between frame transmissions, the jammer transmits continuously. As a result, when both devices are transmitting simultaneously, the log file alternates one frame from the transmitter with several from the jammer. This behavior, however, was not seen in the three sequences marked in Figure 5.9. For example, sequence 1 consists of 18 correct frames, which were intermingled with only one frame sent by the jammer (verified


by sequence numbers). The same trend was seen for the other two sequences: sequence 2 (25 frames) has two long frame sequences without jammer frames (11 and 13 frames), while the frames in sequence 3 (11 frames) were divided into two bursts, also without interference. Such long runs would not occur were our jammer devices completely bypassing the carrier sense mechanism.

Given their effectiveness, jammers can be placed outside the service area in order to reduce the chances of attack. Figure 5.10 shows the expected SNR outside the building as a function of the distance between the attacker and the external wall, assuming α_out = 3.32. The solid line shows the SNR for an access point (inside the building) located 8.5 meters from the wall and using a transmission power of 15 dBm. The dotted line shows the SNR for a jammer located outside the building, mounted at the external wall, and also transmitting at 15 dBm.

[Figure 5.10: Predicted SNR for access point and jammer outside the building (jammer placed at the external wall).]

As shown in the figure, outside the building the SNR relative to the jammer is considerably higher than the value provided by the access point. Consequently, an attacker outside the SA is likely to have his receiver locked onto the jammer, which severely reduces his chances of detecting all nonce messages successfully. Moreover, the scenario depicted in Figure 5.10 is likely optimistic for the attacker, for two main reasons. First, as the jammer is placed outside, its signals are not as strongly attenuated as the signals sent by an access point inside the building. Therefore, the attenuation rate (α) relative to the jammer is likely to be lower than the value used in the figure, which would make the predicted SNR levels


for the jammer even higher. Second, jammers use directional antennas in order to avoid interfering with clients inside the SA. Given that even antennas with beamwidth values larger than 60° provide some gain (> 8 dBi is typical), SNR values for the jammer will be that much higher in the directions covered by the main lobe of the jammer's antenna.

To avoid the interference caused by the jammers, an attacker needs to use a more directional antenna (higher gain, smaller beamwidth) to capture the AP's transmissions and block out the jammer. Once again, doing so makes him more visible and improves the chances of attack detection. The more jammers are deployed, the lower the probability of the attacker finding a suitable location and orientation to implement a successful attack.

R5: Selective jamming substantially degrades the quality of an otherwise perfect channel and can be used to decrease the chances of authentication outside the service area.

5.13 Limitations

Similarly to the scenario described for our localization system, attackers located in close proximity to a service area may be able to authenticate successfully with attainable amounts of amplification. Range-based authentication does make attacks harder to mount and easier to detect, as it forces most of the required amplification to come from directional antennas. However, networks may be vulnerable to attacks if an intruder equipped with a 10 dBi external antenna is allowed to position himself within 10-15 meters of the SA without raising suspicion. Possible attack locations again include nearby parking lots and adjacent floors in multi-story buildings.

Our mechanism may also be vulnerable to collusion attacks in which a device is placed either within or in close proximity to the service area in order to help another device located farther away complete an authentication round successfully. For example, a visitor escorted into the SA could leave behind a hidden PDA to act as an authentication proxy. When the visitor is located outside the service area (therefore outside the range of all APs), he can command the PDA to capture nonce messages and forward them to him over a secret channel (e.g. a non-802.11 frequency). The visitor could then use this information to complete the round, while the proxy would not have performed any transmissions in the monitored frequencies, therefore not exposing itself to intrusion detection systems. Whether or not this attack can be implemented in a given installation is a function of the level of physical security deployed, such an attack being possible even in wired networks.


In fact, if an attacker is able to walk into the service area, there are probably more effective attacks he can mount. In the wired setting, the PDA could be connected to a public Ethernet port in a conference room to act as a rogue access point, again using a hidden frequency to bridge traffic to and from its master. One difference between the wireless and wired scenarios is that in the former the proxy can be placed at any location within range of at least one AP, while in a wired network a successful attack depends on the availability of a physical port. Additionally, in the wired case the proxy would be actively forwarding packets into the network, therefore providing more information about its physical location.

5.14 Summary and Conclusion

In this chapter we have described range-based authentication, a mechanism that forces clients seeking authentication to act as receivers. We showed that this approach considerably increases the cost for a device located outside the service area to mount a successful attack, therefore reducing the probability of unauthorized access to a wireless LAN. Our protocol exploits inherent properties of wireless channels to control the range of each access point inside a SA and reduce unwanted coverage.

Using measurements in our network testbed, we demonstrated that our protocol introduces minimal management overhead and that clients with off-the-shelf hardware physically located in the service area authenticate successfully with high probability. We showed that our transmission power control algorithm uses the results of system calibration to find the minimum power level needed to cover each access point's range, with little or no feedback from network operators. We also showed that clients within range of an AP have a low probability (< 0.07) of failing two consecutive handshakes. As clients can establish new secure sessions before their current ones expire, they can be provided with continuous connectivity even in the face of sporadic authentication failures.

We also showed that attackers located outside the service area are more physically exposed: to successfully complete an authentication round, they have to either be located close to the external wall or be provided with large directional antennas. With the measurements performed outside our building, we showed that an attacker provided with less than 10 dBi of antenna gain has to be physically located within 10-15 meters (30-50 ft) of our building. We also showed that in order to authenticate successfully when located more than 30-35 meters (100-115 ft) from the building, an attacker is expected to need more


than 20 dBi of gain, a considerable amount. For example, 802.11b-compatible antennas providing such gain levels usually measure over 20 inches. In both scenarios, our protocol makes physical security measures more effective.


Chapter 6

Detecting Identity-Based Attacks


In this chapter we present the design and evaluation of a mechanism that creates reliable device identifiers, called signalprints, using signal strength measurements. We show that this mechanism allows a wireless network to detect a class of denial-of-service attacks based on identity (MAC address) spoofing.

6.1 Motivation

Even if a malicious device is not allowed to use the network, it can still deny service to legitimate clients. With location-based security, a device located outside the SA may not be located accurately or authenticate successfully, being therefore rejected by the network. However, legitimate clients within its range are vulnerable to denial-of-service (DoS) attacks. Wireless LANs are yet another scenario for DoS attacks, though with the added complication that the wireless medium makes it easier to inject attack traffic.

Several DoS attacks in wireless LANs are possible because these networks lack reliable client identifiers before upper-layer authentication mechanisms are invoked and user credentials are securely established. After a client authenticates successfully and session keys are used to encrypt and authenticate packets sent over wireless links, the network can securely verify whether the source MAC address in a packet is correct. Without this mechanism, however, wireless installations have to rely solely on MAC addresses for client identification: two devices in a network using the same address are treated as a single client, even if they generate conflicting or inconsistent requests.


As MAC addresses can be easily changed through device drivers, simple yet effective identity-based attacks can be implemented with off-the-shelf equipment against multiple link-layer services. IEEE 802.11 networks, for instance, have been shown to be vulnerable to a class of attacks we refer to as masquerading attacks, in which a malicious device targets a specific client by spoofing its MAC address or the address of its current access point. Bellardo and Savage have demonstrated that a 10-second deauthentication attack can immediately knock a client off the network and possibly incur minute-long outages given the interaction between 802.11 and TCP [29]. With such tools, a malicious user could render a WiFi hotspot unusable by targeting all active clients, or simply maximize the throughput achieved by his own laptop by periodically deauthenticating devices using the same access point. These attacks can currently be implemented even if networks deploy recent security standards such as IEEE 802.11i [7].

Another class of identity-based attacks targets resource depletion: an attacker can generate high rates of requests with random MAC values in order to consume shared resources. For example, authentication protocols such as TLS (popular with 802.11i/802.1X) demand milliseconds of processing time, making servers vulnerable to attacks that consume on the order of 200 Kbps of attack bandwidth [38]. As another example, the attack could target a DHCP server in a publicly available part of the network and consume all IP addresses reserved for visitors. A PDA device left behind inside a corporation could act as a wireless grenade, going off at a programmed time and flooding the authentication server with random requests, possibly affecting clients well beyond its communication range.

In this chapter we show that reliable client identifiers, which we call signalprints, can be created using signal strength information reported by access points and used to detect misbehaving devices [51]. As a packet of interest (e.g. a deauthentication request) is transmitted over the wireless link, it is sensed by access points within range, which report signal strength measurements (a.k.a. RSSI levels) to a centralized server. The request is then tagged with a signalprint, a tuple constructed by aggregating all measurements reported. Transmitters at different locations produce distinct signalprints because signal strength decays with distance, allowing the system to robustly distinguish clients located geographically apart. We present measurements performed within an office building with an IEEE 802.11 network that demonstrate that signalprints can be used to detect masquerading and resource-depletion attacks with high probability.


In Section 6.2 we present our attack model. Section 6.3 describes how signalprints are created and the properties that allow them to be used as location-based client identifiers. In Section 6.4 we describe how signalprints are matched, while Section 6.5 shows how matching rules can be created to detect identity-based DoS attacks. We evaluate our mechanism in Section 6.6 and present its limitations in Section 6.7.

6.2 Attack Model

We assume that malicious clients are provided with standard wireless transmitters. First, we assume they employ omni-directional antennas, much like most portable wireless devices. (The use of directional antennas is discussed in Section 6.6.5.) Second, we assume they are able to modify the contents of each outgoing packet. This allows them, among other things, to change source and destination MAC addresses, a capability needed to implement the attacks we are interested in. For example, Bellardo and Savage have described a mechanism that can be used to accomplish this [29]. Finally, we assume that they are provided with multiple transmission power levels and that they can change this setting on a per-packet basis. In this dissertation we restrict ourselves to 802.11 networks, but the ideas presented can be equally applied to other wireless LAN technologies.

In terms of physical location, we assume attackers can move freely around the area covered by the wireless network. Note that in practice this is only possible in environments with little or no physical security, such as cafeterias and other hotspots. The probability of mounting successful attacks would be lower in environments with tighter security measures, such as enterprise installations. We focus on the two classes of attacks already mentioned: masquerading and resource depletion attacks.

6.3 Signalprints

6.3.1 Signalprint Representation

Conceptually, a signalprint is the signal strength characterization of a packet transmission [51]. Each signalprint is represented as a vector of signal strength measurements, with one entry for each access point acting as a sensor. Values in signalprints always appear in the same order, i.e., position i always contains the signal strength level (in dBm) reported


by the ith AP. We use the notation S[i] to refer to the ith entry in a signalprint. If an access point does not report an RSSI level for a given packet, a default value equal to its sensitivity is used. (The sensitivity of a receiver with respect to a given data rate is defined as the minimum signal strength level needed to achieve a target packet error rate.)

The size of a signalprint is the number of non-default elements it contains, i.e., the number of entries created from actual RSSI measurements. For instance, Figure 6.3(a) shows two signalprints, S1 and S2, both with 7 entries (the number of APs in the network) but with sizes 5 and 6, respectively. (In this case, default values of -95 dBm were used.) Signalprint S1 was created using RSSI levels reported by APs 1, 3, 4, 5, and 7, while S2 has values from APs 1, 2, 4, 5, 6, and 7. As an alternative notation, S1 can also be written as S1: (-50, *, -80, -73, -88, *, -60), where default values are omitted.
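As a concrete illustration, a minimal Python sketch of this representation. The helper names and the dictionary-based report format are our own; only the default-value and size semantics come from the text.

    DEFAULT_RSSI = -95   # receiver sensitivity used for missing reports
    NUM_APS = 7

    def make_signalprint(reports):
        """Fixed-order vector S: reports maps AP index (1-based) to RSSI in dBm."""
        return [reports.get(i, DEFAULT_RSSI) for i in range(1, NUM_APS + 1)]

    def size(s):
        """Number of non-default entries, i.e., actual RSSI measurements."""
        return sum(1 for v in s if v != DEFAULT_RSSI)

    S1 = make_signalprint({1: -50, 3: -80, 4: -73, 5: -88, 7: -60})
    print(S1, size(S1))   # [-50, -95, -80, -73, -88, -95, -60] 5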

6.3.2 Signalprint Generation

Figure 6.1 illustrates how signalprints are created for wireless transmissions. A client (Client1) is shown transmitting an authentication request through its current access point (solid line). Before forwarding the packet to the WA, the AP tags it with the RSSI level measured during reception. (Signal strength estimates are commonly made available by IEEE 802.11 device drivers for each packet received.) The other two APs shown in the figure are also configured as sensors and tuned to the same channel. As Client1 is also within their ranges, they send similar reports to the WA with their own RSSI measurements. As shown at the top of the figure, the WA aggregates all reports and creates the following signalprint for Client1: S_C1: (-73, -51, -67). A signalprint for a second client, Client2, is also shown at the WA. The signalprints produced by the two clients are quite different; for example, the clients could be located in different offices within a building.

The WA can identify identity-based attacks by comparing the signalprints produced by multiple packets. For example, if Client1 submits a high rate of requests trying to clog the authentication server, the WA can detect it given that many of Client1's transmissions produce similar signalprints. Likewise, the WA can detect if Client2 mounts a DoS attack against Client1 by sending 802.11 deauthentication requests with Client1's MAC address, as the signalprints produced by the two devices are different.

We assume that a subset of the deployed access points report RSSI measurements to the WA for all transmissions they can detect. In the case of a WIDS that relies on a separate wireless infrastructure, some APs are already permanently configured as sensors. In a


[Figure 6.1: Overview of the signalprint generation process.]


CAPWAP network, the WA is responsible for selecting the APs used for signalprint processing. In an over-provisioned installation, the WA can select the access points that are not actively serving clients.

Signalprint-based attack detection should be implemented as a reactive mechanism whenever the number of sensors is not sufficient to cover all active channels. For instance, 802.11 deployments have at least 15 non-overlapping channels available across the 2.4 and 5 GHz frequency bands. The objective is to maximize the size of the signalprints produced: the more measurements are received for a transmission, the more accurate is the information gathered about the location of the corresponding device. For that to be possible, sensors need to be listening simultaneously on the proper channel. In large networks, where more channels are required to serve active clients, dividing the sensors across all channels to be monitored would produce short, inaccurate signalprints. For this reason, a two-step process should be implemented in these situations. First, the WA searches for abnormal behavior using both active and sensor APs, which are scattered across all channels. When abnormal behavior is detected (such as a surge in the number of 802.1X authentication requests or a high number of association events related to a single client), the WA sets enough sensors to the proper channel to create signalprints for the relevant packets.

6.3.3 Signalprint Properties

Three properties of signalprints enable their use as reliable client identifiers.

Signalprints are hard to spoof. Signal attenuation is a function of the distance between clients and access points, with a strong dependence on environmental factors such as construction materials and obstacles such as furniture [62, 91]. Consequently, transmitters have little or no control over signal attenuation within the environment, and are unable to considerably change the signalprints they produce. We show that the use of differential signalprints makes the system robust against devices that employ multiple transmission power levels, further decreasing their control over the signalprints generated.

Signalprints are strongly correlated with the physical location of clients, with similar signalprints found mostly in close proximity. In our measurements, performed within a 45 m × 24 m office environment with a total of 12 802.11 access points, devices need to be as close as 5 meters in order to generate similar signalprints with high probability, even when only 6 APs are used. This allows the detection of masquerading


attempts when attacker and victim are not in close proximity. If an attacker aims to DoS a specific client and avoid detection, he is forced to move closer to the infrastructure, thus risking exposure. This property has also been demonstrated by WLAN localization systems that employ an offline training phase in which signal strength patterns (essentially signalprints) are created for a set of selected locations (usually called a signal map, or radio map). These systems have consistently achieved average localization errors below 3 meters, mapping areas as large as 19,000 s.f. with numbers of access points varying between 4 and 20 [26, 76, 98, 120].

Packet bursts transmitted by a stationary device generate similar signalprints with high probability. Our measurements show that while RSSI levels for a stationary device do oscillate over time due to multiple factors, over 90% of variations are within 5 dB of the median RSSI level. This correlation between consecutive samples has also been reported by other researchers [120]. Consequently, an attacker who mounts a resource depletion attack using random MAC addresses can be easily spotted. While not all signalprints may match each other, the network would still be able to detect that a single transmitter is responsible for a high rate of requests.

Signalprints allow a centrally controlled WLAN to reliably single out clients. Instead of identifying them based on MAC addresses or other data they provide, signalprints allow the system to recognize clients based on what they look like in terms of signal strength levels.

6.4 Matching Signalprints

In this section we demonstrate how matching rules are specified to detect identity-based attacks. In Section 6.4.1 we describe the use of differential signal strength values during matching. In Sections 6.4.2 and 6.4.3 we describe how values within signalprints are compared using max-matches and min-matches. In Section 6.4.4 we describe how matching rules are specified in terms of these operations.

6.4.1 Differential Values

Values within a signalprint can be written as absolute values (e.g. RSSI levels in dBm) or as relative values (e.g. with respect to its highest or lowest value). We use the term differential signal strength to refer to the difference between the value at a given position and the maximum value found in a signalprint. Signalprints are written with either absolute or differential values: for example, a signalprint S : (-50, -62, -76) written using differential signal strength becomes S : (0, -12, -26). Figure 6.3(b) shows S1 and S2 written with both absolute and differential values (the latter shown respectively above and below S1 and S2). When matching two signalprints, both need to be written in the same manner.

The use of differential values increases the robustness of signalprint operations against devices (possibly malicious) that vary their transmission power levels between frames. It is a trick borrowed from differential GPS, where a second, stationary receiver is used to remove timing errors that occur in both paths, between a satellite and each one of the receivers. In our case, this error or unknown quantity is the power level used by a transmitter. In theory (according to the free-space equation and the log-distance model [91]), if a transmitter increases its output power by δ dB, a receiver within range is expected to detect an equal increase in RSSI if they are both stationary. I.e., while RSSI oscillation due to multipath is still expected, the large-scale path loss between two stationary devices is stable. Using differential values, a stationary transmitter will generate similar signalprints whether it uses 0, 10, or 20 dBm of output power. With absolute values, however, considerable changes in transmission power could cause the system to attribute multiple packets from a single client to distinct devices.

We performed a simple experiment using our testbed to demonstrate this property. A laptop with an 802.11 PCMCIA card was placed in the corridor in front of access point number 8 (depicted in Figure 3.2). It transmitted two packet bursts, one with 15 dBm of output power and the other with 0 dBm. Each burst took less than 10 seconds and consisted of 10,000 ping packets transmitted at 11 Mbps. Between 1600 and 3000 RSSI measurements were reported by each AP because the sensors were monitoring all 3 non-overlapping channels at 2.4 GHz. As expected, the median RSSI levels produced by the two sequences differ by approximately 15 dB. Figure 6.2 shows the first 1500 measurements reported by three of the access points. Figure 6.2(a) corresponds to AP 12, located approximately 6.8 meters from the client. The top curve shows the RSSI levels reported for the 15 dBm transmissions (total of 1686 samples), while the bottom curve shows the samples for the transmissions with 0 dBm of power (2206 samples). The median RSSI levels are -61 and -76 dBm, a difference of 15 dB. Figures 6.2(b) and 6.2(c) present the measurements for access points 3 and 7; the differences between the median RSSI levels are now respectively 17 and 16 dB.
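A minimal sketch of this conversion (Python; None marks a default value at positions where an AP reported no measurement, and the function name is ours, not from any implementation):

def to_differential(signalprint):
    """Rewrite a signalprint relative to its maximum RSSI value.

    signalprint is a list with one entry per sensor AP: an RSSI level
    in dBm, or None (the default value) if that AP reported nothing.
    Assumes at least one non-default value is present.
    """
    peak = max(v for v in signalprint if v is not None)
    return [v - peak if v is not None else None for v in signalprint]

# The example from the text: S : (-50, -62, -76) becomes (0, -12, -26).
print(to_differential([-50, -62, -76]))  # [0, -12, -26]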


Figure 6.2: Variation in RSSI for a stationary client that uses two different power levels. Each graph presents measurements (signal strength in dBm vs. sample index) with respect to a single AP; the top and bottom curves correspond respectively to the samples collected with the client transmitting at 15 and 0 dBm. (a) AP 12, 6.8 meters from location sampled: 1686 samples with median -61 dBm and 2206 samples with median -76 dBm. (b) AP 3, 9.8 meters from location sampled: 2010 samples with median -67 dBm and 1869 samples with median -84 dBm. (c) AP 7, 10.7 meters from location sampled: 2135 samples with median -70 dBm and 1705 samples with median -86 dBm.


6.4.2 Max-Matches

Matches are found by comparing values at the same position in two different signalprints. A max-match of δ dB is found whenever values differ by at most δ dB. I.e., a 10-dB max-match is found at position i if abs(S1[i] - S2[i]) ≤ 10 and both S1[i] and S2[i] are non-default values. The total number of δ-dB max-matches found by comparing signalprints S1 and S2 is denoted by maxMatches(S1, S2, δ).

We decided to remove default values from match computations because they can arise from two distinct scenarios. On one hand, a client can be simply outside the range of an access point, in which case its packets are not detected and RSSI measurements are simply not reported. On the other hand, many events may cause an AP to fail to receive packets independently of signal quality. For instance, two packets sent on the same channel but on different cells may overlap in time, in which case both packets might be incorrectly decoded and dropped by the AP.

In this dissertation matches are always computed using differential signal strength values. Figure 6.3(b) shows that 3 10-dB max-matches are found when comparing S1 and S2, i.e. maxMatches(S1, S2, 10) = 3. Signalprints are shown with both original and differential signal strength values, with matches found at positions 1, 4, and 7. Note that position 5 does not yield a 10-dB max-match when using differential values: the difference equals 21 dB instead of the 8 dB seen when absolute values are used.

Max-matches are especially useful when looking for signalprints produced by the same device. High numbers of max-matches with low values of δ (e.g. 5 dB) are likely to occur for a pair of signalprints sent by the same device because RSSI values produced by a stationary client tend to oscillate within 5 dB from the median value, as shown in Section 6.6.
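A direct transcription of this definition might look as follows (Python; the signalprint values below are made up for illustration):

def max_matches(s1, s2, delta):
    """Count delta-dB max-matches: positions where both signalprints
    carry non-default values that differ by at most delta dB."""
    return sum(1 for a, b in zip(s1, s2)
               if a is not None and b is not None and abs(a - b) <= delta)

# Hypothetical differential signalprints; None is a default value.
s1 = [0, -14, None, -3, -21, -26]
s2 = [0, -2, -8, -5, -1, -30]
print(max_matches(s1, s2, 10))  # 3 (positions 0, 3, and 5)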

6.4.3 Min-Matches

Analogous to a max-match, a min-match of δ dB is found whenever values differ by at least δ dB. A 10-dB min-match is found at position i if abs(S1[i] - S2[i]) ≥ 10 and both S1[i] and S2[i] are non-default values. The total number of δ-dB min-matches found when comparing signalprints S1 and S2 is denoted by minMatches(S1, S2, δ). As shown in Figure 6.3(c), a single 20-dB min-match is found when comparing S1 and S2, at position 4.

Min-matches allow the system to identify, with high probability, when two packets are sent by distinct devices. While small variations in received signal strength occur even for a stationary client, rarely does it change by more than 10 or 15 dB. Consequently, the system can classify two packets as coming from different devices with high confidence if large differences are seen when comparing their signalprints.
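The mirror-image operation differs only in the comparison direction (same illustrative signalprints as in the previous sketch):

def min_matches(s1, s2, delta):
    """Count delta-dB min-matches: positions where both signalprints
    carry non-default values that differ by at least delta dB."""
    return sum(1 for a, b in zip(s1, s2)
               if a is not None and b is not None and abs(a - b) >= delta)

s1 = [0, -14, None, -3, -21, -26]
s2 = [0, -2, -8, -5, -1, -30]
print(min_matches(s1, s2, 20))  # 1 (only position 4 differs by 20 dB or more)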


Figure 6.3: Signalprint matching examples. (a) Signalprint size. (b) 10-dB max-matches. (c) 20-dB min-matches. Figure 6.3(a) shows two signalprints and their corresponding sizes; figures 6.3(b) and 6.3(c) demonstrate how max-matches and min-matches are computed.


Figure 6.4: Two matching rules applied to signalprints S1 and S2 .


6.4.4 Matching Rules

We say that a pair of signalprints match if they satisfy a specified matching rule, a boolean expression involving numbers of max-matches and min-matches, and possibly signalprint properties such as size. Figure 6.4 shows two matching rules applied to signalprints S1 and S2 (the same as in Figure 6.3). The first rule, maxMatches(S1, S2, 5) ≥ 4, requires the two signalprints to have RSSI values within 5 dB of each other in at least 4 positions. As S1 and S2 produce only 3 such max-matches, the rule is not satisfied. The second rule in the figure requires at least one position to differ by 10 dB or more, and is satisfied by this pair of signalprints.

When specifying matching rules, it is important to account for both signal strength oscillation and lack of feedback from access points. Constant RSSI oscillation makes it unlikely that even signalprints produced by the same stationary device have the exact same RSSI values in multiple positions. Consequently, we usually write max-match clauses with values of δ of at least 5 dB. The lack of feedback from some APs prevents matches in all signalprint positions. As with intrusion detection systems, matching rules are specified with the objective of minimizing false positives, i.e., we want a match to be a strong indication that an attack is taking place. The reason is cost: a match raises an alarm that is likely to be handled by the network administrator. Rules can be made more precise (fewer false positives) by increasing the minimum number of matches and changing the value of δ.
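Expressed in code, a matching rule is simply a boolean predicate over these counts. A sketch under the same conventions as the earlier snippets (max_matches and min_matches are repeated so the block stands alone):

def max_matches(s1, s2, d):
    return sum(1 for a, b in zip(s1, s2)
               if a is not None and b is not None and abs(a - b) <= d)

def min_matches(s1, s2, d):
    return sum(1 for a, b in zip(s1, s2)
               if a is not None and b is not None and abs(a - b) >= d)

# The two rules from Figure 6.4, as predicates over a signalprint pair.
def rule_1(s1, s2):  # maxMatches(S1, S2, 5) >= 4
    return max_matches(s1, s2, 5) >= 4

def rule_2(s1, s2):  # minMatches(S1, S2, 10) >= 1
    return min_matches(s1, s2, 10) >= 1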


6.5 Attack Detection

Three attack properties are important to our analysis: R denotes the rate in packets per second (pps) required for a given DoS attack to be effective, S denotes the speed of the device, while A denotes the number of antennas under the control of the attacker. In this section we assume that devices are stationary (S = 0) and provided with a single omnidirectional antenna (A = 1). In Section 6.6.4 we address the effects of moving devices, while in Section 6.6.5 we assume attackers with directional antennas. Finally, we discuss attacks with multiple antennas in Section 6.7.

6.5.1 Resource Depletion Attacks

In this scenario, an attacker sends high rates of request messages using random MAC values in order to emulate a high number of clients and consume scarce resources in the network. For example, an attacker can send enough DHCP requests in a hotspot to consume all available IP addresses, flood access points with association requests in the hope of exceeding allowed limits, or send high rates of authentication requests to slow down or even disable a shared authentication server. As an example, Dean et al. [38] have shown that effective low-bandwidth DoS attacks can be mounted against TLS, one of the preferred authentication methods to be used in 802.11i/802.1X [3, 7]. TLS requires cryptographic operations (e.g. RSA and Diffie-Hellman) that when executed in software demand tens of milliseconds even on a dedicated processor. In situations where a dedicated server is not available (some lightweight AP architectures perform authentication at the WA), the overhead imposed by each request could be much higher. A server could therefore be overloaded by a device that generates 100 requests per second, which can be injected into the network while demanding far less than 1 Mbps of attack bandwidth.

In this case, the input to the signalprint matching process is a set of packets (e.g. authentication requests) with distinct MAC addresses and their corresponding signalprints. Effective DoS attacks in this category require high packet rates (R ≫ 1 pps), so many signalprints should be available for processing. By comparing pairs of signalprints, the system can identify subsets generated by the same device.


Matching rules should require multiple max-matches with small values of δ because we are looking for signalprints that were generated by the same device, therefore with similar RSSI values in multiple positions. Using 6 APs as sensors, the first rule we evaluate in Section 6.6 for this purpose is maxMatches(S1, S2, 5) ≥ 4. We can decrease the probability of false positives by increasing the required number of max-matches or decreasing the value of δ. The second rule we evaluate, maxMatches(S1, S2, 5) ≥ 5, tends to be satisfied by signalprints generated at locations that are physically closer to each other.

In order to further decrease the probability of false positives, these rules can be extended with min-match clauses. For instance, consider two signalprints that satisfy the second matching rule above by having similar RSSI levels in 5 positions. Now consider the single position that did not produce a max-match. If one of the signalprints has a default value at that position, the likelihood of these signalprints being from the same device does not change much. However, if values are defined in both signalprints and differ by 8 or 10 dB, this likelihood decreases substantially because the signalprints look considerably different relative to the corresponding access point. We evaluate another two matching rules that extend the rules given in the previous paragraph with a clause that does not allow differences higher than 8 dB. These new rules are maxMatches(S1, S2, 5) ≥ 4 ∧ minMatches(S1, S2, 8) = 0 and maxMatches(S1, S2, 5) ≥ 5 ∧ minMatches(S1, S2, 8) = 0.
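One plausible way to apply such a rule to a burst of requests is sketched below (Python; the pairwise strategy, the alarm threshold, and the function names are our own illustration, not the dissertation's implementation):

from itertools import combinations

def max_matches(s1, s2, d):  # as in the Section 6.4 sketches
    return sum(1 for a, b in zip(s1, s2)
               if a is not None and b is not None and abs(a - b) <= d)

def min_matches(s1, s2, d):
    return sum(1 for a, b in zip(s1, s2)
               if a is not None and b is not None and abs(a - b) >= d)

def same_device(s1, s2):
    """maxMatches(S1, S2, 5) >= 4 AND minMatches(S1, S2, 8) = 0."""
    return max_matches(s1, s2, 5) >= 4 and min_matches(s1, s2, 8) == 0

def detect_depletion(request_signalprints, min_matching_pairs=10):
    """Flag an attack when many requests claiming distinct MAC addresses
    produce pairwise-matching signalprints. The threshold of 10 matching
    pairs is an arbitrary placeholder; note the O(n^2) pairwise cost."""
    hits = sum(1 for s1, s2 in combinations(request_signalprints, 2)
               if same_device(s1, s2))
    return hits >= min_matching_pairs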

6.5.2 Masquerading Attacks

In masquerading attacks, an attacker targets a specific client by cloning its MAC address or the address of its access point. For instance, Bellardo et al. have shown that deauthentication and disassociation attacks can be easily mounted in 802.11 networks and are very effective [29]. Before a client can send packets over the wireless link, it needs to authenticate and associate itself with an AP. In a deauthentication attack, deauthentication requests are sent by an attacker with the MAC address of the victim. The access point, after granting the attacker's request, removes the victim from the authenticated state and drops all its packets until association is reestablished. Bellardo et al. discuss other equally effective masquerading attacks that exploit the association service and the power saving mechanism [29].

In normal situations, 802.11 devices are not expected to generate high rates of authentication or association messages. However, there are situations in which well-behaved clients switch between access points with a frequency that is abnormally high. For example, in their study of a large-scale 802.11 network, Kotz et al. showed that clients sometimes are overly aggressive when selecting the best access point, which causes them to reassociate more often than necessary [73].


In these cases, multiple APs are within the client's range with comparable RSSI levels, which may cause it to change APs with small variations in signal strength. So the WA can detect an unusual traffic pattern, but is an attack really happening? Signalprints can be used to detect attacks with high probability, providing a level of assurance that cannot be achieved by only looking at packet contents.

The input now consists of two sets of packets that represent conflicting requests (e.g. authentication vs. deauthentication messages), all transmitted with the same MAC address. An attack is detected by comparing pairs of signalprints, one from each set. Given that continuous attacks are needed to severely affect a victim's throughput, large input sets are also expected in this case. For example, to keep a victim off the network, Bellardo et al. used up to 10 deauthentication frames per second in their experiments [29]. To detect these attacks, matching rules should require min-matches with large values of δ, because we are looking for considerable differences in RSSI that would indicate two (or more) distinct transmitters. The first rule we evaluate for this purpose requires at least 1 10-dB min-match using 6 sensors: minMatches(S1, S2, 10) ≥ 1. In the case of masquerading attacks, rules can be made more precise by either increasing the number of min-matches or increasing the value of δ. For this reason, we evaluate another two rules: minMatches(S1, S2, 10) ≥ 2 and minMatches(S1, S2, 15) ≥ 1.
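The corresponding check over conflicting request sets could look like this (again a sketch; the inputs and names are illustrative):

def min_matches(s1, s2, d):  # as in the Section 6.4 sketch
    return sum(1 for a, b in zip(s1, s2)
               if a is not None and b is not None and abs(a - b) >= d)

def detect_masquerading(set_a, set_b, delta=10, needed=1):
    """Compare signalprints of conflicting requests carrying the same MAC
    address (e.g. authentication vs. deauthentication frames); a cross-pair
    with `needed` min-matches of `delta` dB suggests two distinct
    transmitters. Defaults mirror minMatches(S1, S2, 10) >= 1."""
    return any(min_matches(sa, sb, delta) >= needed
               for sa in set_a for sb in set_b)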

6.6 Evaluation

In this section we show that signalprints are strongly correlated with the physical locations within an environment, which allows them to be used as robust, location-dependent client identifiers.

6.6.1 Measurements

To evaluate our system, we use the survey data set, presented in Section 3.6.2. In Section 6.6.2, we use all RSSI measurements reported to quantify the signal strength oscillation created by stationary clients. Moreover, a signalprint was created for each of the 135 locations sampled, using the median RSSI level relative to each access point in range. These are used in Section 6.6.3 to demonstrate the effectiveness of our proposed matching rules.
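A sketch of that per-location construction (Python; the report format and AP names are assumptions for illustration):

from statistics import median

def build_signalprint(rssi_reports, sensor_aps):
    """Create one signalprint for a location from raw RSSI reports.

    rssi_reports maps an AP identifier to the list of RSSI samples (dBm)
    that AP reported for the location; APs out of range are simply absent.
    Positions follow the fixed sensor_aps ordering; None marks a default
    value. Names are illustrative, not from the dissertation's code.
    """
    return [median(rssi_reports[ap]) if ap in rssi_reports else None
            for ap in sensor_aps]

# Hypothetical location heard by 3 of 4 sensors:
reports = {"ap3": [-61, -63, -60], "ap5": [-75, -74], "ap8": [-52, -55, -53]}
print(build_signalprint(reports, ["ap3", "ap5", "ap6", "ap8"]))
# [-61, -74.5, None, -53]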


6.6.2 Signal Strength Oscillation

We first demonstrate that while stationary, a wireless client tends to create similar signalprints, despite the inherently unpredictable nature of wireless propagation. This is an important property when we consider the performance of our system against DoS attacks with high rates of requests. For example, even a small fraction of matching signalprints would allow the network to detect a malicious device that sends over 100 authentication requests per second.

Our measurements suggest that most signal strength oscillations are small, within 5 dB from the median RSSI level. Each graph in Figure 6.5 was created by choosing one of our sampled locations and one of the access points within its range. Each graph shows the variation in signal strength for successive transmissions: the difference between the detected RSSI and the median value for the corresponding location-AP pair (the median, or base level, is shown as 0 dB). The top three graphs in Figure 6.5 are examples of the behavior detected for most locations: the majority of RSSI oscillations are within 5 dB from the median. Aggregating all the measurements in our dataset (all locations with respect to all access points), we have that over 71%, 90%, and 93% of RSSI oscillations are respectively within 2, 5, and 10 dB from the median RSSI levels.

However, the tail of the distribution is long, and some strong, mostly destructive oscillations do occur. The bottom two graphs in Figure 6.5 are examples of this case. In one of them, the RSSI level is somewhat stable, with a couple of strong oscillations (>25 dB). In the other example, periods with strong RSSI degradation seem to happen with a certain frequency, with a difference in signal strength between the two levels as high as 30 dB. We do not have a definitive explanation for these signal strength variations, given that the transmitter was kept stationary during the measurements and used a single transmission power level.

While strong RSSI oscillations do occur in our measurements, the path loss between a stationary client and each AP is stable most of the time. In theory, multipath propagation and other phenomena generate small-scale fading, with possibly strong RSSI variations over time caused by people walking by, doors being closed, and other changes in the environment that affect any of the multiple paths taken by transmissions between two devices. In practice, however, strong RSSI oscillations do not seem to happen often. Perhaps these results are due to techniques developed to decrease the effects of small-scale fading in wireless systems, such as antenna diversity, implemented in most 802.11 devices. Measurements in other installations, under different circumstances, would be able to provide more information regarding this phenomenon.
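The aggregate statistics above amount to the following computation over each location-AP sample series (sketch; the sample values are invented):

from statistics import median

def oscillation_profile(samples, bounds=(2, 5, 10)):
    """Fraction of RSSI samples within each bound (dB) of the median,
    mirroring the 71%/90%/93% aggregate figures reported in the text."""
    base = median(samples)
    n = len(samples)
    return {b: sum(abs(s - base) <= b for s in samples) / n for b in bounds}

print(oscillation_profile([-61, -62, -60, -61, -66, -59, -61, -75]))
# {2: 0.75, 5: 0.875, 10: 0.875}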


Figure 6.5: RSSI oscillation for a stationary device (five graphs of signal strength variation in dB vs. sample index). Each graph was created by choosing one of the locations sampled and one access point within its range. It shows the variation in signal strength for consecutive frame transmissions relative to the median RSSI (shown as 0 dB) for that location-AP pair.



6.6.3 Signalprints and Physical Proximity

In this section we explore the relationship between signalprints and physical proximity between transmitters in order to detect identity-based attacks. Despite having 12 access points deployed in our testbed, all the results presented in this section use two AP configurations: one with 6 access points (numbers 3, 5, 6, 8, 10, and 11 in Figure 3.2) and the other with 4 APs (numbers 2, 4, 6, and 8). The configuration with 4 access points is used to evaluate the loss in accuracy when using fewer sensors. In this section, each figure shows the APs being used for each matching rule as triangles and omits the others.

Detecting Packets From a Single Device


As discussed in Section 6.5.1, rules that identify packets generated by the same device are useful to detect high-rate DoS attacks. By requiring multiple max-matches with small δ values, we show that matching signalprints are found mostly in close proximity.

Locations that produce signalprints with similar values in multiple positions tend to be physically close. Using the 6-AP configuration, Figure 6.6(a) shows all location pairs that satisfy the matching rule maxMatches(S1, S2, 5) ≥ 4, connected by a line segment. Even though many matches are produced, most of them involve locations that are close to each other. Overall, 430 matches were found (4.8% of all pairs), with 51%, 74%, and 91% of them found respectively for locations within 5, 7, and 10 meters from each other. However, there are still many matches found for locations more than 15 meters from each other. (All matching results presented in this section are summarized in Table 6.1.)

Matching results are improved by increasing the number of max-matches required. Still using 6 sensors, Figure 6.6(b) shows the locations whose signalprints satisfy the rule maxMatches(S1, S2, 5) ≥ 5. Compared to Figure 6.6(a), there is a significant reduction in the number of long-distance matches. A total of 150 matches were found (1.7% of all pairs), with respectively 64%, 88%, and 98% of them found for locations within 5, 7, and 10 meters from each other. In this case, there are no matches for locations more than 15 meters apart.

The matching rules just described can be made even more precise (fewer false positives and long-distance matches) if we extend them with min-match clauses.


Figure 6.6: Location pairs producing a minimum of respectively (a) 4 and (b) 5 5-dB max-matches using the 6-AP configuration (APs shown as triangles; each plot maps the service area in X and Y coordinates in meters, with matching location pairs connected by line segments). (a) maxMatches(S1, S2, 5) ≥ 4. (b) maxMatches(S1, S2, 5) ≥ 5.


Figure 6.7: Figures 6.7(a) and 6.7(b) extend respectively figures 6.6(a) and 6.6(b) with a clause that allows no 8-dB min-matches. (a) maxMatches(S1, S2, 5) ≥ 4 ∧ minMatches(S1, S2, 8) = 0. (b) maxMatches(S1, S2, 5) ≥ 5 ∧ minMatches(S1, S2, 8) = 0.


Figure 6.8: Location pairs producing a minimum of 3 5-dB max-matches and no 8-dB min-matches using the 4-AP configuration: maxMatches(S1, S2, 5) ≥ 3 ∧ minMatches(S1, S2, 8) = 0.

Figures 6.7(a) and 6.7(b) extend respectively figures 6.6(a) and 6.6(b) with a clause that discards location pairs with 8-dB min-matches (minMatches(S1, S2, 8) = 0). For instance, the rule in Figure 6.7(a) still requires a minimum of 4 max-matches of 5 dB, but now rejects all the location pairs for which any difference larger than 8 dB is found. As shown, the min-match clause reduces the number of matches from 430 to 160 (1.8% of all pairs), with respectively 71%, 92%, and 99% of them found for locations within 5, 7, and 10 meters from each other. A similar improvement is seen in Figure 6.7(b), which reduced the number of matches from 150 to 97 (1.1% of all location pairs). For both rules in Figure 6.7, there is a single match for locations more than 10 meters from each other.

Finally, Figure 6.8 shows that performance degrades if we decrease the number of access points used to 4, but that results are still satisfactory due to the use of min-matches. With 4 APs, this matching rule requires at least 3 5-dB max-matches and no 8-dB min-matches. It produces 317 matches, with respectively 62%, 86%, and 99% of them found for locations within 5, 7, and 10 meters from each other. Note that these numbers are better than the ones related to Figure 6.6(a) even though there are two fewer access points and the matching rule requires fewer max-matches.

These results show that resource depletion attacks can be detected with high probability, as matching signalprints are found mostly for locations that are near each other. Therefore, a large number of matching requests means they are being transmitted from a specific location or area, which could be found by coupling our mechanism with a localization system. Some signalprints produced at the same location may not match due to RSSI oscillations, but this does not prevent the WA from detecting high-rate DoS attacks.


Matching Rule                            Figure   APs   Matches         5 m      7 m      10 m
maxMatches(5) ≥ 4                        6.6(a)   6     430 (4.8%)      50.9%    74.4%    91.2%
maxMatches(5) ≥ 5                        6.6(b)   6     150 (1.7%)      64.0%    88.0%    98.0%
maxMatches(5) ≥ 4 ∧ minMatches(8) = 0    6.7(a)   6     160 (1.8%)      71.3%    91.9%    99.4%
maxMatches(5) ≥ 5 ∧ minMatches(8) = 0    6.7(b)   6     97 (1.1%)       72.2%    90.7%    99.0%
maxMatches(5) ≥ 3 ∧ minMatches(8) = 0    6.8      4     317 (3.5%)      62.2%    86.4%    99.1%
minMatches(10) ≥ 1                       (none)   6     8643 (95.6%)    4.6%     10.1%    21.9%
minMatches(10) ≥ 2                       (none)   6     7768 (85.9%)    2.6%     6.9%     17.9%
minMatches(15) ≥ 1                       (none)   6     7896 (87.3%)    2.0%     6.1%     16.8%

Table 6.1: Matching results. Each row shows a matching rule, the figure (if any) containing the signalprints created from our measurements that satisfy that rule, the number of access points used as sensors, the number of matches produced, and the percentages of matches created by locations within 5, 7, and 10 meters from each other. The first five rules are used to detect packets transmitted by the same device, while the last three detect packets sent by distinct devices.


Detecting Packets From Distinct Devices


In this section we evaluate matching rules specified to decrease the probability of false positives when looking for masquerading attacks. We want signalprints to match only if there is a high probability that they were indeed produced by distinct devices. In this case, detecting large RSSI differences is more important than finding similar values, so min-matches play a more important role in these situations.

Most location pairs in our dataset generate signalprints that satisfy the matching rule minMatches(S1, S2, 10) ≥ 1, i.e., values in at least one position differ by 10 dB or more. As shown in Table 6.1, over 95% of all location pairs satisfy this rule. Even a large number of locations that are physically close can be distinguished, with over 400 matches produced for locations less than 5 meters from each other. Overall, these results show that masquerading attacks can be detected with high probability, as at least one access point can tell the two locations apart. Most of the signalprint pairs that did not satisfy this rule were related to locations that were physically close: respectively 55% and 81% of them were within 5 and 7 meters from each other.

One way to decrease the probability of false positives is to increase the minimum number of 10-dB min-matches to 2. As Table 6.1 shows, over 85% of all location pairs still produce a match. In this case, a match is an even stronger indication that an attack is taking place, as signalprints differ substantially relative to at least two access points. However, a smaller percentage of the location pairs that did not satisfy this rule were in close proximity: only 52% of them are within 7 meters from each other. In fact, over 31% of these pairs have distances larger than 10 meters. The reason for this is lack of feedback. A min-match occurs only if both signalprints have non-default values in a position. If one location is outside the range of two APs and the other outside the range of a third, there are just 3 APs left to produce matches. Consequently, even locations that are far apart may not create considerable RSSI differences in multiple positions.

Another way to make such rules more precise is to increase the value of δ, i.e. require larger differences in signal strength. For instance, over 87% of all location pairs differ by 15 dB or more relative to at least one access point, i.e., they satisfy the rule minMatches(S1, S2, 15) ≥ 1 (Table 6.1). Of the location pairs that did not satisfy this rule, over 62% were less than 7 meters apart. These results demonstrate that with enough sensors, distinct devices can be distinguished with high probability. The last two rules also demonstrate that one can collect more evidence of an attack without considerably decreasing the chances of detection.

6.6.4 Moving Devices

We do not expect legitimate clients on the move to generate false alarms because they send requests at rates much lower than required by most attacks. For example, consider an 802.11 client that associates with an access point and after some time moves to a different location and requests disassociation. Despite the fact that the two signalprints generated can be quite different, an alarm should not be raised in this situation. An effective disassociation attack requires higher rates of deauthentication requests to keep a client off the network, so only a larger number of matching signalprints detected during a short period of time (e.g. tens of seconds) should generate an alarm.


Unless an attacker moves towards the victim, changing his location does not increase the chances of a successful masquerading attack. What matters is not how the signalprints he produces compare to each other (for this matter they could all be different) but how similar they are to the one produced by the victim. Attacks are detected as long as there are considerable RSSI differences, which only cease to exist if the attacker moves close to his victim.

Whether an attacker can disguise a resource depletion attack by changing his location over time depends on his speed and the required packet rate. Let us assume that an attacker moves at pedestrian speeds and consider an attack requiring R > 10 pps (such as the attack against TLS). In this case, attacks are still detected with high probability. If he transmits at a uniform rate, which has to be close to R pps, he continuously provides the system with information about his location. Packets transmitted close in time generate similar signalprints, allowing the system to track his location if a localization system is available. To avoid being tracked, an attacker needs to alternate periods of packet transmissions and radio silence. During such transmission bursts, however, he needs to send packets at rates higher than R pps in order to compensate for the periods of silence. This attack would also be detected because signalprints generated during each burst should match each other with high probability. However, tracking the attacker becomes more challenging because these bursts produce location estimates that are further apart.

6.6.5 Directional and Beamforming Antennas

A single directional or beamforming antenna would be more helpful to an attacker implementing a resource depletion attack than a masquerading attack. In a masquerading attack, it is still hard for an attacker to clone the exact signalprint produced by his intended victim from a large distance. In close range, an omni-directional transmitter would also be effective while being easier to conceal. During resource depletion attacks, changing the transmission beam allows an attacker to change his signalprint, which decreases the number of matching requests. The probability of detection depends on the number of distinct patterns a transmitter is able to create and the packet rate required by the attack. If an attacker is only able to produce a small number of antenna patterns and an attack requires high packet rates (tens or hundreds of packets per second), some of the signalprints produced are still associated with a large number of requests, allowing detection with high probability.


6.7 Limitations

Due to the use of RSSI levels to characterize wireless clients, one inherent limitation of our mechanism is that it may be unable to distinguish two devices located physically close to each other. Masquerading attempts can be detected if there is a noticeable difference in RSSI with respect to at least one access point. As shown in Section 6.6.3, this happens even for some locations in close range, possibly due to obstacles that affect one location more than the other. In some situations (such as multiple clients in a conference room), the system may not have compelling evidence that packets are coming from different devices, making masquerading attacks possible. The level of physical security in an installation dictates whether these attacks can be mounted: compared to a cafeteria, it is harder for an attacker in an enterprise building to get close enough to his victim to mount an undetected masquerading attack.

Our mechanism may also be unable to detect DoS attacks composed of few packets. The more packets are involved in an attack, the more signalprints are available for processing and the higher the probability of detection. A single-packet deauthentication attack in an 802.11 network may go unnoticed (for example, if APs are sensing other channels) or may not provide enough confidence to raise an alarm. In most situations, however, attacks require high packet rates to be effective, increasing the chances of detection.

An attacker may be able to avoid detection if provided with multiple antennas (A > 1). Suppose that an attacker configures his antennas so that each sensor can only listen to transmissions from a single antenna (e.g. using directional antennas with narrow beamwidth values). To successfully mount a resource depletion attack, the attacker can simultaneously transmit a different packet through each antenna. As a single sensor detects each transmission, the signalprints produced are too short to satisfy the rules presented in Section 6.5.1. To mount a masquerading attack, the attacker simultaneously transmits the same packet using all antennas. By choosing the proper transmission power level for each of them, he is able to compose an arbitrary signalprint with A values. In both scenarios, attacks would be detected if some of the packets (even a small fraction) were detected by multiple access points.


6.8 Summary and Conclusion

In this chapter we showed that reliable client identifiers, which we call signalprints, can be created using signal strength measurements reported by access points acting as sensors. We showed that while malicious clients can lie about their MAC addresses, the signalprints they produce are strongly correlated with their physical location. We demonstrated that by tagging packets with their signalprints and crafting proper matching rules, a wireless network is able to detect a large class of effective denial-of-service attacks based on MAC address spoofing. We presented several examples of attacks that can be easily mounted in IEEE 802.11 networks and that can be detected by our proposed mechanism with high probability.

Measurements in our network testbed demonstrate that multiple packets transmitted by a stationary device produce similar signalprints with high probability. In our test dataset, most RSSI variations for a stationary client with respect to a single access point are small, within 5 dB from the median signal strength level. This allows the network to detect resource depletion attacks, in which a malicious device transmits high rates of packets (e.g. DHCP or authentication requests) containing random, forged MAC addresses. We presented matching rules able to detect that a large percentage of these packets were indeed generated by a single device, despite the different MAC addresses.

We also showed that similar signalprints are mostly found in close proximity. First, using 6 of our deployed access points, we showed that locations that produce signalprints with multiple similar RSSI values tend to be within 5 meters from each other. Then we showed that large RSSI differences provide strong evidence that packets were generated by distinct devices. Consequently, an attacker needs to be physically close to his intended victim in order to mount undetected masquerading attacks.

Overall, we showed that signalprints are tags that allow a wireless network to identify mobile devices according to their physical location, improving security in a cost-effective manner. Although signalprints can be defeated, such as by the use of multiple synchronized directional antennas, these situations present a challenge for an intruder and increase the likelihood of detection by physical security measures. Thus, like the use of fingerprints to identify humans, the mechanism is not infallible but a significant improvement over simply believing the identity that the individual claims.

Chapter 7

Conclusion
This chapter provides an overall perspective on our work, summarizes its main contributions, and outlines directions for future work.

7.1 A Perspective

Our objective in this dissertation has been to apply to wireless local area networks the same approach that people have been using for centuries to protect general assets: leveraging physical security to the best possible extent. People have used this mechanism to secure almost everything, from general goods at home to highly-sensitive equipment at military bases. We have also been successful at extending this approach to controlling access to our wired LANs. From homes to enterprise buildings, access to Ethernet ports is mainly controlled through a combination of gates, fences, locks, cameras, badges, and security personnel. In this work we have shown that by exploring inherent properties of wireless communications, we can use location-based services to better leverage physical security measures in wireless LANs, decreasing the security gap between wireless and wired networks.

With our approach, similar security guidelines can be used to protect both wired and wireless LANs. In both cases, a base security level can be achieved with low management costs by exploring client location information and taking advantage of physical security measures. Regarding security at the link layer (think for example of network access control), current usage of wired networks suggests that this level of protection is sufficient for most installations. I.e., these measures alone raise security to a level that is sufficient to prevent most practical attacks.


Finer-grained access control capabilities can then be left to be implemented at upper layers, on a per-application basis, for instance using identity-based mechanisms such as IPSec [70], TLS [42], and SSH [117, 118]. The link layer can focus on controlling access to network resources (e.g. bandwidth) and take advantage of a much simpler security model.

7.2 Summary

In this dissertation we have addressed the problem of improving security in wireless LANs (more specifically, access control and accountability) without resorting to mechanisms that incur substantial management costs. Nowadays, wireless networks are forced to rely on mechanisms based on user credentials (e.g. passwords, private keys) to achieve satisfactory security levels. Statistics suggest that the costs associated with creating, revoking, and securely managing these credentials are too high for a large percentage of wireless installations, which are ultimately forced to use insecure yet easily deployable solutions. Consequently, even a sizable fraction of enterprise networks operate with insecure configurations, vulnerable to attacks ranging from casual war-driving to exposure of sensitive information. We have shown that networks can use location-based security policies to implement a cost-effective access control and accountability framework while taking advantage of the increasing number of access points to deploy self-configurable services, with little or no dependence on operator assistance.

We showed that a properly-designed localization system can be used to improve accountability and to implement location-based access control policies. Using measurements in a real deployment, we showed that our system is able to accurately track the location of wireless clients using signal strength statistics reported by access points. This improves accountability because clients being serviced by the network can now be closely monitored and are constantly exposed to physical security measures. We also showed that our system can be used as a building block to implement cost-effective access control. Conditions imposed by the system can be easily satisfied by the targeted clients while demanding considerable levels of amplification from attackers located beyond the boundaries of the service area. This increases the costs required to mount successful attacks, making wireless LANs harder to compromise. This system was described in detail in Chapter 4.

We also showed that we can further increase the resources required to access a network beyond the service area by forcing clients to act as receivers during authentication.


In Chapter 5, we presented the design and evaluation of range-based authentication, a mechanism that takes advantage of inherent characteristics of wireless channels to force clients to be located in close proximity to access points. While users with standard wireless devices located within the SA succeed with high probability, we showed that attackers located more than 30 meters from the service area need large directional antennas to cope with the higher signal attenuation. Acting as receivers, attackers cannot rely on amplifiers to acquire all the necessary gain, being forced to use high-gain antennas or get closer to the network infrastructure. In both of these cases, the attacker is physically exposed and within range of physical security measures.

In Chapter 6, we showed that networks can also use signal strength information to create reliable client identifiers called signalprints and use them to detect a large class of denial-of-service attacks. Several link-layer services rely on MAC addresses to identify clients, so even without permission to access a specific network, malicious devices can easily deny service to legitimate clients. In fact, even authenticated clients in IEEE 802.11 networks are currently vulnerable to several such identity-based DoS attacks. However, while malicious devices can lie about their MAC addresses (as they control the contents of their packets), we have shown that the signalprints they produce are strongly correlated with their physical locations. We showed that this property allows identity-based attacks to be detected with high probability. An intruder using random MAC addresses to generate high rates of requests (e.g. towards an authentication server) produces many similar signalprints and provides enough evidence for the attempt to be detected. Similarly, we demonstrated that an attacker needs to be within close range (5-10 meters) of its intended victim for them to generate indistinguishable signalprints.

Finally, we used our testbed network to demonstrate that our proposed services can be deployed in a self-configurable manner, therefore having little impact on overall management costs. We showed that during system calibration, APs communicate amongst themselves to establish models that coarsely approximate signal attenuation within the service area. This creates all the environment-specific state needed by our localization and range-based authentication mechanisms, so measurements can be performed autonomously by the system and repeated periodically to cope with the dynamics of wireless networks. We described system calibration in Chapter 3.

In summary, our approach follows a direction of past success in computing, namely throwing more resources at the problem (access points, in this case).


As with other large-scale systems, the costs associated with dense wireless LANs have been shown to be dominated by operational rather than capital expenses. With the sharply falling prices of commodity hardware (IEEE 802.11 devices provide a timely example), large numbers of access points can be deployed without severely affecting total cost of ownership. Moreover, we have shown that location-based security mechanisms are extremely suitable to such scenarios. First, the more APs are deployed, the higher the number of vantage points, which improves both calibration and localization performance. Second, increasing access point density decreases the average length of wireless links, improving security guarantees and reducing network coverage beyond the service area.

The main limitations of our architecture are a consequence of the use of signal strength measurements and the required modeling of signal attenuation in indoor environments. On one hand, the use of signal strength measurements makes our services immediately deployable and reduces overall costs because the required infrastructure is composed of cheap, off-the-shelf hardware components. On the other hand, the somewhat unpredictable nature of signal propagation makes it challenging for RSSI-based services to be highly accurate, which has important security implications. For instance, our proposed signalprint-based mechanism may be unable to distinguish devices in close proximity to each other, while our localization system may fail to reject a mobile device located right outside a building. As we can only coarsely model signal attenuation with distance, services need to be conservative when rejecting clients to prevent impractical rates of false negatives. As a result, there is an area around each SA where successful attacks are possible with attainable hardware requirements. We showed that unless physical security measures are in place, attackers may successfully bypass our localization system and range-based mechanism with practical amounts of amplification when located in close proximity to a service area.

7.3 Future Work

There are many avenues for extending the mechanisms presented in this dissertation. In this section we present three directions that we consider promising.

First, it seems possible to explore characteristics of specific wireless standards and their hardware implementations to further improve security guarantees. The techniques presented in this dissertation explore general properties of wireless channels (signal attenuation with distance, signal quality as a function of signal-to-noise ratio, etc.), being applicable to most wireless systems.


However, we believe it is possible to have optimizations that explore the characteristics of a specific system to reduce the space of practical attacks. For instance, the IEEE 802.11n standard currently being developed will rely heavily on multiple-input multiple-output (MIMO) techniques to improve actual data rates to over 100 Mbps. Let us assume that the upcoming 802.11n standard defines a mechanism to implement a MIMO technique called spatial multiplexing, in which distinct data streams can be transmitted simultaneously and at the same frequency using multiple spatial channels. A network targeting only 802.11n-capable clients can set its reference client accordingly and implement a variant of our range-based authentication protocol in which distinct nonce messages are sent simultaneously over distinct spatial channels. The first consequence of this change is that single-output¹ clients (such as current 802.11a/b/g devices) may not be able to recover all transmitted nonces, therefore not being able to authenticate successfully. Moreover, an attacker that aims to extend the service area now has to amplify multiple distinct signals coming through multiple antennas. Whether this poses a more challenging scenario for the attacker is an interesting research question.

Another interesting research direction would be to use the real-time measurements collected about clients not only for security purposes but also to improve performance. Wireless networks today are still deployed in a fairly static manner: relatively few access points are installed and clients are responsible for selecting which one to use. As users are not usually uniformly distributed across the service area (for instance, in enterprise buildings they tend to be clustered inside offices and conference rooms), it is common for some access points to operate under heavy loads while others are almost idle. In a centrally-controlled WLAN, load-balancing decisions can be made by the WA dynamically, based on the measurements constantly reported by the access points. With such real-time information, the WA can select which APs to activate and which clients should be served by each AP in order to optimize user experience. Some current products already claim to provide such RF management functions, but it is an interesting research question how dynamic a network can be while avoiding unnecessary oscillation (e.g. high rates of hand-offs) and still allowing networks to scale to large numbers of APs.

¹ In MIMO systems, multiple input means that a device is able to transmit multiple signals using multiple antennas, while multiple output means that a device is able to receive through multiple antennas.


Finally, it would be of great value to better understand how location-based services are affected over longer periods of time by dynamic changes in the environment. For example, our results demonstrate that signal strength is fairly stable for stationary clients over several minutes. However, our results are specific to our building, its construction materials, as well as the number and the movement of people inside our service area during our measurements. Real deployments in environments with different characteristics and usage patterns, monitored over the course of several weeks or months, would be extremely useful to support the use of location-based services. For instance, these studies would be able to demonstrate how localization accuracy varies over time and how frequently calibration needs to be performed to allow an installation to respond quickly to events that affect signal attenuation within a service area.

Our work demonstrates the advantages of exploring inherent characteristics of wireless channels to improve security, and we hope our efforts will encourage a wider adoption of location-based security policies.

Bibliography
[1] LAN MAN Standards Committee of the IEEE Computer Society. Wireless LAN Medium Access Control (MAC) and Physical Layer (PHY) Specifications. Technical Report 1999 Edition, IEEE Std 802.11, 1999.
[2] LAN MAN Standards Committee of the IEEE Computer Society. Wireless LAN Medium Access Control (MAC) and Physical Layer (PHY) Specifications - Amendment 1: High-speed Physical Layer in the 5 GHz band. Technical Report 1999 Edition, IEEE Std 802.11a, 1999.
[3] LAN MAN Standards Committee of the IEEE Computer Society. Standard for Port based Network Access Control. Technical Report Draft P802.1X/D11, IEEE Computer Society, March 2001.
[4] Cisco Aironet 350 Series Client Adapters. Product Data Sheet. Cisco Systems, Inc., 2003.
[5] LAN MAN Standards Committee of the IEEE Computer Society. Wireless Medium Access Control (MAC) and Physical Layer (PHY) Specifications for High Rate Wireless Personal Area Networks (WPANs). Technical Report 2003 Edition, IEEE Std 802.15.3, September 2003.
[6] EliteConnect(TM) Universal 2.4GHz/5GHz High Power Wireless Cardbus Adapter. Product Data Sheet. SMC Networks, Inc., 2004.
[7] LAN MAN Standards Committee of the IEEE Computer Society. Wireless LAN Medium Access Control (MAC) and Physical Layer (PHY) Specifications - Amendment 6: Medium Access Control (MAC) Security Enhancements. Technical Report 2004 Edition, IEEE Std 802.11i, July 2004.


[8] The Official Worldwide Wardrive. http://www.worldwidewardrive.org/, June 2004, accessed in July 2006.

[9] 2.4 GHz 17.5 dBi High Performance Parabolic Dish Wireless LAN Antenna, Model HG2418D. Product Data Sheet. HyperLink Technologies, Inc., 2005.
[10] 2.4 GHz 19 dBi Directional Reflector Grid Wireless LAN Antenna, Model HG2419G. Product Data Sheet. HyperLink Technologies, Inc., 2005.
[11] 2.4 GHz 9 dBi Radome Enclosed Wireless LAN Yagi Antenna, Model HG2409Y. Product Data Sheet. HyperLink Technologies, Inc., 2005.
[12] Cisco Aironet 1240AG Series 802.11A/B/G Access Point. Product Data Sheet. Cisco Systems, Inc., 2005.
[13] LAN MAN Standards Committee of the IEEE Computer Society. Wireless Medium Access Control (MAC) and Physical Layer (PHY) Specifications for Wireless Personal Area Networks (WPANs). Technical Report 2005 Edition, IEEE Std 802.15.1, June 2005.
[14] 2.4 GHz 18 dBi Wireless LAN Heavy Duty Panel Antenna, Model HG2418P. Product Data Sheet. HyperLink Technologies, Inc., 2006.
[15] 2.4 GHz 24 dBi High Performance Reflector Grid Wireless LAN Antenna, Model HG2424G. Product Data Sheet. HyperLink Technologies, Inc., 2006.
[16] 2.4 GHz 8 dBi Wireless LAN Round Patch Antenna, Model HG2408P. Product Data Sheet. HyperLink Technologies, Inc., 2006.
[17] OpenWrt Wireless Freedom. http://www.openwrt.org, July 2006, accessed in July 2006.
[18] Merriam-Webster Online Dictionary. http://www.m-w.com, accessed in November 2006.
[19] Martín Abadi, Michael Burrows, and Ted Wobber. Moderately Hard, Memory-Bound Functions. In Proceedings of the Network and Distributed System Security Symposium (NDSS), San Diego, USA, February 2003.


[20] B. Aboba and D. Simon. PPP EAP TLS Authentication Protocol, RFC 2716, IETF. October 1999.
[21] William Aiello, Steven M. Bellovin, Matt Blaze, Ran Canetti, John Ioannidis, Angelos D. Keromytis, and Omer Reingold. Just Fast Keying: Key Agreement in a Hostile Internet. ACM Transactions on Information and System Security (TISSEC), 7(2):242-273, May 2004.
[22] S. E. Alexander. Radio Propagation within Buildings at 900 MHz. Electronics Letters, 18(21):913-914, October 1982.
[23] Jørgen Bach Andersen, Theodore S. Rappaport, and Susumu Yoshida. Propagation Measurements and Models for Wireless Communications Channels. IEEE Communications Magazine, 33(1):42-49, January 1995.
[24] Tuomas Aura, Pekka Nikander, and Jussipekka Leiwo. DoS-resistant Authentication with Client Puzzles. In Proceedings of the Cambridge Security Protocols Workshop, LNCS, Cambridge, UK, April 2000.
[25] Olufunmilola Awoniyi and Fouad A. Tobagi. Effect of Fading on the Performance of VoIP in IEEE 802.11a WLANs. In Proc. of IEEE International Conference on Communications, June 2004.
[26] Paramvir Bahl and Venkata N. Padmanabhan. RADAR: An In-Building RF-Based User Location and Tracking System. In Proc. of the 19th Annual Joint Conference of the IEEE Computer and Communications Societies (INFOCOM'00), Tel-Aviv, Israel, March 2000.
[27] Paramvir Bahl, Venkata N. Padmanabhan, and A. Balachandran. Enhancements to the RADAR User Location and Tracking System. Technical Report MSR-TR-00-12, Microsoft Research, February 2000.
[28] Dirk Balfanz, D. K. Smetters, Paul Stewart, and H. Chi Wong. Talking to Strangers: Authentication in Ad-Hoc Wireless Networks. In Proc. of Network and Distributed System Security Symposium (NDSS), February 2002.


[29] John Bellardo and Stefan Savage. 802.11 Denial-of-Service Attacks: Real Vulnerabilities and Practical Solutions. In Proceedings of the USENIX Security Symposium, Washington, DC, USA, August 2003.
[30] Stefan Brands and David Chaum. Distance-Bounding Protocols. In EUROCRYPT '93: Workshop on the Theory and Application of Cryptographic Techniques on Advances in Cryptology, pages 344-359, Secaucus, NJ, USA, 1994. Springer-Verlag.
[31] P. Calhoun, M. Montemurro, and D. Stanley. CAPWAP Protocol Specification. IETF Internet Draft, draft-ietf-capwap-protocol-specification-01, May 2006.
[32] Joseph J. Carr. Practical Antenna Handbook. McGraw-Hill, 3rd edition, 1998.
[33] Paul Castro, Patrick Chiu, Ted Kremenek, and Richard Muntz. A Probabilistic Room Location Service for Wireless Networked Environments. In Proc. of the International Conference on Ubiquitous Computing (UbiComp'01), pages 18-34, Atlanta, GA, September 2001.
[34] Yi-Chao Chen, Ji-Rung Chiang, Hao-hua Chu, Polly Huang, and Arvin Wen Tsui. Sensor-Assisted Wi-Fi Indoor Location System for Adapting to Environmental Dynamics. In Proc. of the ACM/IEEE International Symposium on Modeling, Analysis, and Simulation of Wireless and Mobile Systems (MSWiM'05), pages 118-125, October 2005.
[35] David Cheung and Cliff Prettie. A Path Loss Comparison Between the 5 GHz UNII Band (802.11a) and the 2.4 GHz ISM Band (802.11b). Technical report, Intel Corporation, January 2002.
[36] Mark D. Corner and Brian D. Noble. Zero-Interaction Authentication. In Proc. of the ACM International Conference on Mobile Computing and Networking (MobiCom'02), pages 1-11, Atlanta, GA, September 2002.
[37] CSI. Tenth Annual CSI/FBI Computer Crime and Security Survey. Technical report, Computer Security Institute, June 2005.
[38] D. Dean and A. Stubblefield. Using Client Puzzles to Protect TLS. In Proceedings of the Tenth USENIX Security Symposium, Washington, DC, USA, August 2001.


[39] Murat Demirbas and Youngwhan Song. An RSSI-based Scheme for Sybil Attack Detection in Wireless Sensor Networks. In Proc. of the International Workshop on Advanced Experimental Activities on Wireless Networks and Systems, June 2006.
[40] Dorothy E. Denning and Peter F. MacDoran. Location-Based Authentication: Grounding Cyberspace for Better Security. Computer Fraud & Security Bulletin, 1996(2):12-16, February 1996.
[41] D. M. J. Devasirvatham, C. Banerjee, M. J. Krain, and D. A. Rappaport. Multi-Frequency Radiowave Propagation Measurements in the Portable Radio Environment. In Proc. of the IEEE International Conference on Communications (ICC), volume 4, pages 1334-1340, April 1990.
[42] T. Dierks and C. Allen. The TLS Protocol - Version 1.0, RFC 2246, IETF. January 1999.
[43] A. Doufexi, S. Armour, M. Butler, A. Nix, and D. Bull. A Study of the Performance of HIPERLAN/2 and IEEE 802.11a Physical Layers. In Proc. of the IEEE Vehicular Technology Conference (VTC'01), volume 1, pages 668-672, May 2001.
[44] A. Doufexi, S. Armour, B. Lee, A. Nix, and D. Bull. An Evaluation of the Performance of IEEE 802.11a and 802.11g Wireless Local Area Networks in a Corporate Office Environment. In Proc. of the IEEE International Conference on Communications (ICC), pages 1196-1200, May 2003.
[45] Dragoș Niculescu and Badri Nath. VOR Base Stations for Indoor 802.11 Positioning. In Proc. of the ACM International Conference on Mobile Computing and Networking (MobiCom'04), pages 58-69, Philadelphia, PA, August 2004.
[46] Greg Durgin, Theodore S. Rappaport, and Hao Xu. Measurements and Models for Radio Path Loss and Penetration Loss in and Around Homes and Trees at 5.85 GHz. IEEE Transactions on Communications, 46(11):1484-1496, November 1998.
[47] K. J. Ellis and Nur Serinken. Characteristics of Radio Transmitter Fingerprints. Radio Science, 36:585-598, 2001.
[48] Per Enge and Pratap Misra. Special Issue on Global Positioning System. Proceedings of the IEEE, 87(1):3-15, January 1999.

[49] ETSI. Broadband Radio Access Networks (BRAN); HIPERLAN Type 2; System Overview. Technical Report DTR/BRAN-00230002, February 2000.
[50] Daniel B. Faria and David R. Cheriton. No Long-term Secrets: Location-based Security in Overprovisioned Wireless LANs. In Proc. of the Third ACM Workshop on Hot Topics in Networks (HotNets-III), November 2004.
[51] Daniel B. Faria and David R. Cheriton. Detecting Identity-Based Attacks in Wireless Networks Using Signalprints. In Proc. of the Fifth ACM Workshop on Wireless Security (WiSe'06), pages 43-52, September 2006.
[52] Farpoint Group. WLAN Total Cost of Ownership: Comparing Centralized and Distributed Architectures, White Paper FPG-2004-013.1. January 2004.
[53] Paul Funk and Simon Blake-Wilson. EAP Tunneled TLS Authentication Protocol Version 1 (EAP-TTLSv1), Internet Draft, IETF. draft-funk-eap-ttls-v1-01.txt, March 2006.
[54] Dan Garlan, Daniel P. Siewiorek, Asim Smailagic, and Peter Steenkiste. Project Aura: Toward Distraction-Free Pervasive Computing. IEEE Pervasive Computing, 1(2):22-31, April 2002.
[55] Ivan A. Getting. The Global Positioning System. IEEE Spectrum, 30(12):36-47, December 1993.
[56] Marco Gruteser and Dirk Grunwald. Enhancing Location Privacy in Wireless LAN Through Disposable Interface Identifiers: A Quantitative Analysis. Mobile Networks and Applications, 10(3):315-325, June 2005.
[57] Youngjune Gwon and Ravi Jain. Error Characteristics and Calibration-free Techniques for Wireless LAN-based Location Estimation. In Proc. of the International Workshop on Mobility Management, pages 2-9, Philadelphia, PA, October 2004.
[58] Andreas Haeberlen, Eliot Flannery, Andrew M. Ladd, Algis Rudys, Dan S. Wallach, and Lydia Kavraki. Practical Robust Localization over Large-Scale 802.11 Wireless Networks. In Proc. of the ACM International Conference on Mobile Computing and Networking (MobiCom'04), Philadelphia, PA, September 2004.


[59] Jeyanthi Hall, Michel Barbeau, and Evangelos Kranakis. Enhancing Intrusion Detection in Wireless Networks Using Radio Frequency Fingerprinting. In Proc. of the IASTED Conference on Communications, Internet and Information Technology, November 2004.
[60] Andy Harter and Andy Hopper. A New Location Technique for the Active Office. IEEE Personal Communications, 4(5):42-47, October 1997.
[61] Andy Harter, Andy Hopper, Pete Steggles, Andy Ward, and Paul Webster. The Anatomy of a Context-Aware Application. In Proc. of the ACM/IEEE International Conference on Mobile Computing and Networking (MobiCom'99), pages 59-68, Seattle, WA, August 1999.
[62] Homayoun Hashemi. The Indoor Radio Propagation Channel. Proceedings of the IEEE, 81(7):943-968, July 1993.
[63] Homayoun Hashemi, Michael McGuire, Thomas Vlasschaert, and David Tholl. Measurements and Modeling of Temporal Variations of the Indoor Radio Propagation Channel. IEEE Transactions on Vehicular Technology, 43(3):733-737, August 1994.
[64] Jeffrey Hightower, Roy Want, and Gaetano Borriello. SpotON: An Indoor 3D Location Sensing Technology Based on RF Signal Strength. Technical Report UW CSE 2000-02-02, University of Washington, February 2000.
[65] Yih-Chun Hu, Adrian Perrig, and David B. Johnson. Packet Leashes: A Defense against Wormhole Attacks in Wireless Ad Hoc Networks. In Proc. of the Annual Joint Conference of the IEEE Computer and Communications Societies (INFOCOM'03), pages 1976-1986, San Francisco, CA, March 2003.
[66] Norman L. Johnson, Samuel Kotz, and N. Balakrishnan. Continuous Univariate Distributions, volume 2. John Wiley & Sons, Inc., 1994.
[67] Ari Juels and John Brainard. Client Puzzles: A Cryptographic Defense Against Connection Depletion Attacks. In Proceedings of the Network and Distributed System Security Symposium (NDSS), pages 151-165, San Diego, USA, February 1999.
[68] Charlie Kaufman. Internet Key Exchange (IKEv2) Protocol, Internet Draft, IETF. August 2004.


[69] J. M. P. Keenan and A. J. Motley. Radio Coverage in Buildings. British Telecom Technology Journal, 8(1):19-24, January 1990.
[70] S. Kent and R. Atkinson. Security Architecture for the Internet Protocol, RFC 2401, IETF. November 1998.
[71] Tim Kindberg, Kan Zhang, and Narendar Shankar. Context Authentication Using Constrained Channels. In WMCSA '02: Proc. of the 4th IEEE Workshop on Mobile Computing Systems and Applications, page 14, 2002.
[72] T. Kivinen and M. Kojo. More Modular Exponential (MODP) Diffie-Hellman Groups for Internet Key Exchange (IKE), RFC 3526, IETF. May 2003.
[73] D. Kotz and K. Essien. Analysis of a Campus-wide Wireless Network. In Proc. of the ACM International Conference on Mobile Computing and Networking (MobiCom'02), pages 107-118, Atlanta, GA, September 2002.
[74] P. Krishnan, A. S. Krishnakumar, Wen-Hua Ju, Colin Mallows, and Sachin Ganu. A System for LEASE: Location Estimation Assisted by Stationary Emitters for Indoor RF Wireless Networks. In Proc. of the Annual Joint Conference of the IEEE Computer and Communications Societies (INFOCOM'04), pages 1001-1011, Hong Kong, March 2004.
[75] John Krumm and John Platt. Minimizing Calibration Effort for an Indoor 802.11 Device Location Measurement System. Technical Report MSR-TR-2003-82, Microsoft Research, November 2003.
[76] Andrew M. Ladd, Kostas E. Bekris, Algis Rudys, Guillaume Marceau, Lydia E. Kavraki, and Dan S. Wallach. Robotics-Based Location Sensing Using Wireless Ethernet. In Proc. of the ACM International Conference on Mobile Computing and Networking (MobiCom'02), Atlanta, GA, USA, September 2002.
[77] Loukas Lazos, Radha Poovendran, and Srdjan Čapkun. ROPE: Robust Position Estimation in Wireless Sensor Networks. In Proc. of the International Symposium on Information Processing in Sensor Networks (IPSN'05), pages 324-331, Los Angeles, CA, April 2005.


[78] Arjen K. Lenstra and Eric R. Verheul. Selecting Cryptographic Key Sizes. Journal of Cryptology, 14(4):255-293, September 2001.
[79] Hyuk Lim, Lu-Chuan Kung, Jennifer C. Hou, and Haiyun Luo. Zero-Configuration, Robust Indoor Localization: Theory and Experimentation. In Proc. of the Annual Joint Conference of the IEEE Computer and Communications Societies (INFOCOM'06), Barcelona, Spain, April 2006.
[80] L. Mamakos, K. Lidl, J. Evarts, D. Carrel, D. Simone, and R. Wheeler. A Method for Transmitting PPP Over Ethernet (PPPoE), RFC 2516, IETF. February 1999.
[81] Petros Maniatis, David S. H. Rosenthal, Mema Roussopoulos, Mary Baker, TJ Giuli, and Yanto Muliadi. Preserving Peer Replicas by Rate-Limited Sampled Voting. In Proc. of the Nineteenth ACM Symposium on Operating Systems Principles (SOSP), pages 44-59, October 2003.
[82] Jonas Medbo and Jan-Erik Berg. Simple and Accurate Path Loss Modeling at 5 GHz in Indoor Environments with Corridors. In Proc. of the IEEE Vehicular Technology Conference (VTC), volume 1, pages 30-36, September 2000.
[83] Alfred J. Menezes, Paul C. van Oorschot, and Scott A. Vanstone. Handbook of Applied Cryptography. CRC Press, October 1996.
[84] D. Molkdar. Review on Radio Propagation into and within Buildings. IEE Proceedings-H, 138(1):61-73, February 1991.
[85] N. Moraitis and P. Constantinou. Indoor Channel Measurements and Characterization at 60 GHz for Wireless Local Area Network Applications. IEEE Transactions on Antennas and Propagation, 52(12):3180-3189, December 2004.
[86] A. J. Motley and J. M. P. Keenan. Personal Communication Radio Coverage in Buildings at 900 MHz and 1700 MHz. Electronics Letters, 24(12):763-764, June 1988.
[87] Ashwin Palekar, Dan Simon, Glen Zorn, Joe Salowey, Hao Zhou, and S. Josefsson. Protected EAP Protocol (PEAP) Version 2, Internet Draft, IETF. draft-josefsson-pppext-eap-tls-eap-07.txt, October 2003.


[88] Nissanka Bodhi Priyantha, Anit Chakraborty, and Hari Balakrishnan. The Cricket Location-Support System. In Proc. of the ACM International Conference on Mobile Computing and Networking (MobiCom'00), pages 32-43, Boston, MA, August 2000.
[89] Nissanka Bodhi Priyantha, Allen K. L. Miu, Hari Balakrishnan, and Seth Teller. The Cricket Compass for Context-Aware Mobile Applications. In Proc. of the ACM International Conference on Mobile Computing and Networking (MobiCom'01), pages 1-14, Rome, Italy, July 2001.
[90] Ram Ramanathan. On the Performance of Ad Hoc Networks with Beamforming Antennas. In Proc. of the ACM International Symposium on Mobile Ad Hoc Networking and Computing (MobiHoc'01), pages 95-105, Long Beach, CA, October 2001.
[91] Theodore S. Rappaport. Wireless Communications - Principles and Practice. Prentice Hall PTR, 2nd edition, January 2002.
[92] Infonetics Research. User Plans for Wireless LANs: North America 2005. Press Release, http://www.infonetics.com/resources/purple.shtml?upna05.wl.nr.shtml, October 2005, accessed in November 2006.
[93] Infonetics Research. Wireless LAN Equipment Report. Press Release, http://www.infonetics.com/resources/purple.shtml?ms06.wl.4q.nr.shtml, March 2006, accessed in June 2006.
[94] Infonetics Research. Enterprise Routers and L2-L7 LAN Switches Report. Press Release, http://www.infonetics.com/resources/purple.shtml?ms06.rtr.sw.4q.nr.shtml, February 2006, accessed in November 2006.
[95] Infonetics Research. Network Security Appliances and Software Report. Press Release, http://www.infonetics.com/resources/purple.shtml?ms06.sec.4q.nr.shtml, March 2006, accessed in November 2006.
[96] Michael J. Riezenman. Cellular Security: Better, But Foes Still Lurk. IEEE Spectrum, 37(6):39-42, June 2000.


[97] C. Rigney, S. Willens, A. Rubens, and W. Simpson. Remote Authentication Dial In User Service (RADIUS), RFC 2865, IETF. June 2000.
[98] Teemu Roos, Petri Myllymäki, Henry Tirri, Pauli Misikangas, and Juha Sievänen. A Probabilistic Approach to WLAN User Location Estimation. International Journal of Wireless Information Networks, 9(3):155-164, July 2002.
[99] Sheldon Ross. A First Course in Probability. Macmillan, 2nd edition, 1984.
[100] RSA Security Inc. The Wireless Security Survey of Frankfurt, White Paper WSFR05-WP-0305. March 2005.
[101] RSA Security Inc. The Wireless Security Survey of London, White Paper WSLN05-WP-0305. March 2005.
[102] RSA Security Inc. The Wireless Security Survey of New York, White Paper WSNY05-WP-0305. March 2005.
[103] RSA Security Inc. The Wireless Security Survey of San Francisco, White Paper WSSF05-WP-0305. March 2005.
[104] Naveen Sastry, Umesh Shankar, and David Wagner. Secure Verification of Location Claims. In Proc. of the Second ACM Workshop on Wireless Security (WiSe'03), pages 1-10, September 2003.
[105] S. Y. Seidel and T. S. Rappaport. 914 MHz Path Loss Prediction Models for Indoor Wireless Communications in Multifloored Buildings. IEEE Transactions on Antennas and Propagation, 40(2):207-217, February 1992.
[106] Jason Small, Asim Smailagic, and Daniel P. Siewiorek. Determining User Location For Context Aware Computing Through the Use of a Wireless LAN Infrastructure. Technical report, Project Aura, Carnegie Mellon University, 2000.
[107] Frank Stajano and Ross Anderson. The Resurrecting Duckling: Security Issues for Ad-hoc Wireless Networks. In Proc. of the 7th Security Protocols Workshop, LNCS, volume 1796, pages 172-194, 1999.
[108] Frank Stajano and Ross Anderson. The Resurrecting Duckling: Security Issues for Ubiquitous Computing. IEEE Computer (Security & Privacy Supplement), 35(4):22-26, April 2002.


[109] Ping Tao, Algis Rudys, Andrew Ladd, and Dan S. Wallach. Wireless LAN Location-Sensing for Security Applications. In Proc. of the Second ACM Workshop on Wireless Security (WiSe'03), pages 11-20, September 2003.
[110] O. Ureten and Nur Serinken. Detection of Radio Transmitter Turn-On Transients. Electronics Letters, 35(23):1996-1997, November 1999.
[111] O. Ureten and Nur Serinken. Bayesian Detection of Wi-Fi Transmitter RF Fingerprints. Electronics Letters, 41(6):373-374, March 2005.
[112] Srdjan Čapkun and Jean-Pierre Hubaux. Secure Positioning of Wireless Devices with Application to Sensor Networks. In Proc. of the Annual Joint Conference of the IEEE Computer and Communications Societies (INFOCOM'05), pages 1917-1928, Miami, FL, March 2005.
[113] Srdjan Čapkun and Jean-Pierre Hubaux. Secure Positioning in Wireless Networks. IEEE Journal on Selected Areas in Communications, 24(2):221-232, February 2006.
[114] Roy Want, Andy Hopper, Veronica Falcão, and Jonathan Gibbons. The Active Badge Location System. ACM Transactions on Information Systems, 10(1):91-102, January 1992.
[115] Brent R. Waters and Edward W. Felten. Secure, Private Proofs of Location. Technical Report TR-667-03, Princeton University, Computer Science Dept., January 2003.
[116] L. Yang, P. Zerfos, and E. Sadot. Architecture Taxonomy for Control and Provisioning of Wireless Access Points (CAPWAP). RFC 4118, IETF, June 2005.
[117] T. Ylonen and C. Lonvick. The Secure Shell (SSH) Authentication Protocol, RFC 4252, IETF. January 2006.
[118] T. Ylonen and C. Lonvick. The Secure Shell (SSH) Protocol Architecture, RFC 4251, IETF. January 2006.
[119] Moustafa Youssef and Ashok Agrawala. WLAN Location Determination via Clustering and Probability Distributions. In Proc. of the IEEE International Conference on Pervasive Computing and Communications (PerCom'03), pages 143-150, Dallas/Fort Worth, TX, March 2003.


[120] Moustafa Youssef and Ashok Agrawala. Handling Samples Correlation in the Horus System. In Proc. of the Annual Joint Conference of the IEEE Computer and Communications Societies (INFOCOM'04), pages 1023-1031, Hong Kong, March 2004.
[121] Moustafa Youssef and Ashok Agrawala. The Horus WLAN Location Determination System. In Proc. of the Third International Conference on Mobile Systems, Applications, and Services (MobiSys'05), pages 205-218, Seattle, WA, June 2005.
