
An Examination of Keystroke Dynamics

For
Continuous User Authentication
by
Eesa Alsolami
Bachelor of Science (Computer Science), KAU, Saudi Arabia 2002
Master of Information Technology (QUT) 2008
Thesis submitted in accordance with the regulations for
Degree of Doctor of Philosophy
Information Security Institute
Science and Engineering Faculty
Queensland University of Technology
August 2012
Keywords
Continuous biometric authentication, continuous authentication system, user-independent
threshold, keystroke dynamics, user typing behavior, feature selection.
Abstract
Most current computer systems authorise the user at the start of a session and do
not detect whether the current user is still the initial authorised user, a substitute
user, or an intruder pretending to be a valid user. Therefore, a system that
continuously and unobtrusively checks the identity of the user throughout the
session is necessary. Such a system is called a continuous authentication system
(CAS).
Researchers have applied several approaches to CAS, and most of these tech-
niques are based on biometrics. These continuous biometric authentication systems
(CBAS) rely on user traits and characteristics. One of the main types of
biometric is keystroke dynamics, which has been widely tried and accepted for pro-
viding continuous user authentication. Keystroke dynamics is appealing for many
reasons. First, it is less obtrusive, since users will be typing on the computer
keyboard anyway. Second, it does not require extra hardware. Finally, keystroke
dynamics remains available after the authentication step at the start of the com-
puter session.
Currently, there is insufficient research in the field of CBAS with keystroke
dynamics. To date, most of the existing schemes ignore the continuous authentication
scenarios, which might affect their practicality in different real-world applications.
Also, the contemporary CBAS with keystroke dynamics approaches use character
sequences as features that are representative of user typing behavior, but their
feature selection criteria do not guarantee features with strong statistical significance,
which may lead to less accurate statistical user-representation. Furthermore, their
selected features do not inherently incorporate user typing behavior. Finally, the
existing CBAS that are based on keystroke dynamics are typically dependent on
pre-defined user-typing models for continuous authentication. This dependency
restricts the systems to authenticating only known users whose typing samples are
modelled.
This research addresses the previous limitations associated with the existing
CBAS schemes by developing a generic model to better identify and understand
the characteristics and requirements of each type of CBAS and continuous authen-
tication scenario. Also, the research proposes four statistical-based feature selec-
tion techniques that select features with the highest statistical significance and
encompass different user typing behaviors, representing user typing patterns
effectively. Finally, the research proposes a user-independent threshold approach
that is able to authenticate a user accurately without needing any pre-defined user
typing model a priori. We also enhance the technique to detect an impostor or
intruder who may take over during the computer session.
Contents
Keywords i
Abstract iii
Table of Contents v
List of Figures ix
List of Tables xi
List of Abbreviations xiii
Declaration xv
Previously Published Material xvii
1 Introduction 1
1.1 Motivation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
1.2 Research Objectives . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
1.3 Research Questions . . . . . . . . . . . . . . . . . . . . . . . . . . . 4
1.4 Research Outcomes . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
1.5 Research Significance . . . . . . . . . . . . . . . . . . . . . . . . . . 7
1.6 Thesis Outline . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
2 Background 11
2.1 User Authentication in Computer Security . . . . . . . . . . . . . . 11
2.1.1 Biometric Authentication . . . . . . . . . . . . . . . . . . . . 13
2.2 Typist Authentication . . . . . . . . . . . . . . . . . . . . . . . . . 14
2.2.1 Static Typist Authentication . . . . . . . . . . . . . . . . . 16
2.2.2 Continuous Typist Authentication . . . . . . . . . . . . . . . 19
2.3 Machine Learning in Typist Authentication . . . . . . . . . . . . . . 26
2.3.1 Supervised Learning . . . . . . . . . . . . . . . . . . . . . . 26
2.3.2 Unsupervised Learning . . . . . . . . . . . . . . . . . . . . . 27
2.4 Anomaly Detection Techniques . . . . . . . . . . . . . . . . . . . . 28
2.5 Related Techniques . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
2.5.1 Feature Selection . . . . . . . . . . . . . . . . . . . . . . . . 31
2.5.2 Application of Threshold Analysis . . . . . . . . . . . . . . . 34
2.5.3 Time Series Analysis . . . . . . . . . . . . . . . . . . . . . . 35
2.6 Research Challenges . . . . . . . . . . . . . . . . . . . . . . . . . . 37
2.7 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38
3 Model for Continuous Biometric Authentication 41
3.1 Continuous Biometric Authentication System (CBAS) . . . . . . . . 42
3.2 CBAS Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43
3.2.1 Subjects . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45
3.2.2 Sensor . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46
3.2.3 Feature extraction . . . . . . . . . . . . . . . . . . . . . . . 47
3.2.4 Detector . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48
3.2.5 Biometric database . . . . . . . . . . . . . . . . . . . . . . . 50
3.2.6 Response unit . . . . . . . . . . . . . . . . . . . . . . . . . . 50
3.3 Continuous Authentication Scenarios . . . . . . . . . . . . . . . . . 50
3.4 Existing CBAS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55
3.4.1 Class 1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55
3.4.2 Class 2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56
3.4.3 Limitations in the current CBAS . . . . . . . . . . . . . . . 58
3.5 A new class for CBAS . . . . . . . . . . . . . . . . . . . . . . . . . 59
3.6 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 60
4 Dataset Analysis 63
4.1 Predefined or free text? . . . . . . . . . . . . . . . . . . . . . . . . . 64
4.2 Experimental Setting . . . . . . . . . . . . . . . . . . . . . . . . . . 65
4.3 Dataset Description . . . . . . . . . . . . . . . . . . . . . . . . . . 67
4.4 Data Preprocessing . . . . . . . . . . . . . . . . . . . . . . . . . . . 67
4.5 Preliminary Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . 70
4.6 Experimental Framework . . . . . . . . . . . . . . . . . . . . . . . . 71
4.7 Evaluation Methodology . . . . . . . . . . . . . . . . . . . . . . . . 76
4.8 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 77
5 User-Representative Feature Selection for Keystroke Dynamics 79
5.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 80
5.2 Proposed feature selection techniques . . . . . . . . . . . . . . . . . 82
5.2.1 Most frequently typed n-graph selection . . . . . . . . . . . 82
5.2.2 Quickly-typed n-graph selection . . . . . . . . . . . . . . . . 83
5.2.3 Time-stability typed n-graph selection . . . . . . . . . . . . 83
5.2.4 Time-variant typed n-graph selection . . . . . . . . . . . . . 84
5.3 Evaluation Methodology . . . . . . . . . . . . . . . . . . . . . . . . 84
5.3.1 Selecting candidate features . . . . . . . . . . . . . . . . . . 84
5.3.2 Evaluate candidate features (obtained by feature selection
techniques) . . . . . . . . . . . . . . . . . . . . . . . . . . . 86
5.3.2.1 K-means algorithm . . . . . . . . . . . . . . . . . . 86
5.3.2.2 Assigning users . . . . . . . . . . . . . . . . . . . . 88
5.3.2.3 Cluster evaluation criterion . . . . . . . . . . . . . 89
5.4 Experiments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 89
5.4.1 Experimental settings . . . . . . . . . . . . . . . . . . . . . . 89
5.4.2 Experimental results . . . . . . . . . . . . . . . . . . . . . . 90
5.4.3 Comparison with existing feature selection techniques . . . . 93
5.5 Comparing fixed and dynamic features on different data sizes . . . . 96
5.6 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 99
6 User-independent Threshold 101
6.1 Motivation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 102
6.2 A user-independent threshold system . . . . . . . . . . . . . . . . . 102
6.3 Designing and evaluating user-independent threshold . . . . . . . . 103
6.4 Approach . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 105
6.5 Experiment . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 107
6.5.1 Data set . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 108
6.5.2 Experimental method . . . . . . . . . . . . . . . . . . . . . . 108
6.6 Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 111
6.6.1 Distance measures . . . . . . . . . . . . . . . . . . . . . . . 114
6.6.2 Various number of keystrokes (data size) . . . . . . . . . . . 114
6.6.3 Feature type . . . . . . . . . . . . . . . . . . . . . . . . . . . 116
6.6.4 Feature amount . . . . . . . . . . . . . . . . . . . . . . . . . 117
6.7 Comparing to user-dependent threshold . . . . . . . . . . . . . . . 119
6.7.1 Experimental Methodology . . . . . . . . . . . . . . . . . . . 120
6.7.2 Empirical Analysis . . . . . . . . . . . . . . . . . . . . . . . 120
6.8 Discussion and limitations . . . . . . . . . . . . . . . . . . . . . . . 122
6.9 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 124
7 Typist Authentication based on user-independent threshold 127
7.1 Motivation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 128
7.2 Typist Authentication System . . . . . . . . . . . . . . . . . . . . . 129
7.3 Change detection . . . . . . . . . . . . . . . . . . . . . . . . . . . . 131
7.4 Time Series Analysis and Attack Detection . . . . . . . . . . . . . . 132
7.4.1 Sliding window(non-overlapping) . . . . . . . . . . . . . . . 133
7.4.2 Sliding Window (overlapping) . . . . . . . . . . . . . . . . . 133
7.5 Experiment . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 135
7.5.1 Data set . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 135
7.5.2 Experimental method . . . . . . . . . . . . . . . . . . . . . . 135
7.5.3 Experimental results . . . . . . . . . . . . . . . . . . . . . . 136
7.6 Discussions and Limitations . . . . . . . . . . . . . . . . . . . . . . 138
7.7 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 139
8 Conclusion and Future Directions 141
8.1 Summary of Contributions . . . . . . . . . . . . . . . . . . . . . . . 142
8.2 Future Directions . . . . . . . . . . . . . . . . . . . . . . . . . . . . 143
8.2.1 Application of the Proposed Technique to Different Datasets 143
8.2.2 Application of the Proposed Technique to Different Biomet-
ric Sources . . . . . . . . . . . . . . . . . . . . . . . . . . . . 144
8.2.3 Improvements to the Proposed User-Independent Thresh-
old . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 144
8.2.4 Detection of Impostor in Real time . . . . . . . . . . . . . . 144
8.3 Concluding Remarks . . . . . . . . . . . . . . . . . . . . . . . . . . 145
A Characteristics of users typing data 147
Bibliography 151
List of Figures
2.1 Transforming a keystroke pattern into a timing vector when a user
inputs the string AB: the duration and interval times are measured
in milliseconds . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
3.1 Continuous Biometric Authentication System Model . . . . . . . . . 44
3.2 Characteristics of CBAS . . . . . . . . . . . . . . . . . . . . . . . . 51
5.1 Evaluation methodology for feature selection techniques . . . . . . . 85
5.2 Comparison between proposed feature selection techniques based on
# of selected features cumulatively . . . . . . . . . . . . . . . . . . 91
5.3 Comparison between different statistics and distances for the most
frequent 2-graphs technique based on individual group of features . 92
5.4 Comparison between different statistics for the most frequent 2-
graphs technique based on # of selected features cumulatively . . . 93
5.5 Comparison between the most and least frequent 2-graphs based on
individual group of features . . . . . . . . . . . . . . . . . . . . . . 96
5.6 Comparison between the most and least frequent 2-graphs based on
# of selected features cumulatively . . . . . . . . . . . . . . . . . . 97
5.7 Comparing fixed and dynamic features over different data sizes . . . 98
6.1 Equal error rate . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 104
6.2 Inconsistency of threshold among different groups of users . . . . . . 112
6.3 Varying the data size for group 1 of users . . . . . . . . . . . . . . . 113
6.4 Consistency of the threshold for different sizes of data . . . . . . . . 116
6.5 Consistency of threshold among different groups of users . . . . . . 118
7.1 Overview of the proposed typist authentication system . . . . . . . 131
7.2 Sliding window (not overlapping) . . . . . . . . . . . . . . . . . . . 133
7.3 Sliding window (overlapping) . . . . . . . . . . . . . . . . . . . . . 134
7.4 Comparing the accuracy of detection between two different autho-
rised users . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 138
List of Tables
2.1 The accuracy of static typist authentication techniques . . . . . . . 17
2.2 The accuracy of continuous typist authentication techniques . . . . 20
3.1 Requirements of different scenarios of CBA . . . . . . . . . . . . . . 54
3.2 The differences between the first and second classes and the new class 61
4.1 Characteristics of users typing data . . . . . . . . . . . . . . . . . . 69
4.2 Most frequent feature in the dataset . . . . . . . . . . . . . . . . . . 72
4.3 Avg time of most frequent characteristic for dierent users . . . . . 73
4.4 Average time of ER for all of the user samples . . . . . . . . . . . . 74
5.1 Comparison of Italian words and most frequent 2-graphs . . . . . . 94
5.2 Comparison between common 2-graphs and most frequent 2-graphs 94
6.1 Comparing accuracy for different measurements . . . . . . . . . . . 114
6.2 Comparing EER for different sizes of data . . . . . . . . . . . . . . 115
6.3 Comparing accuracy for different feature types . . . . . . . . . . . . 117
6.4 Comparing accuracy for different amounts of feature set . . . . . . . 117
6.5 Comparative analysis of our approach (user-independent) and cur-
rent schemes (user-dependent) . . . . . . . . . . . . . . . . . . . . 121
6.6 Comparing accuracy between user-dependent and user-independent . 122
7.1 Accuracy of detection when varying the size of impostor data in
one window . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 137
A.1 Characteristics of users typing data . . . . . . . . . . . . . . . . . . 148
A.2 Data distribution for 10 users . . . . . . . . . . . . . . . . . . . . . 150
List of Abbreviations
CAS Continuous Authentication System
CBAS Continuous Biometric Authentication System
CA Continuous Authentication
1 Sample 700 to 900 characters
FAR False Acceptance Rate
FRR False Rejection Rate
FP False Positive
FN False Negative
IDS Intrusion Detection System
CUSUM Cumulative Sum
GLR Generalized Likelihood Ratio
GP Dataset Gunetti-Picardi Dataset
AVG Average
STD Standard deviation
EER Equal Error Rate
ROC Receiver Operating Characteristics
TSW Testing Window
AUW Authenticated Window
SNO Sliding Window Not Overlapping
SO Sliding Window Overlapping
Declaration
The work contained in this thesis has not been previously submitted to meet
requirements for an award at this or any other higher education institution. To
the best of my knowledge and belief, the thesis contains no material previously
published or written by another person except where due reference is made.
Signed: . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Date: . . . . . . . . . . . . . . . . . . . . .
Previously Published Material
The following papers have been published or presented, and contain material based
on the content of this thesis.
Al solami, Eesa, Boyd, Colin, Clark, Andrew and Khandoker, Asadul Is-
lam. Continuous biometric authentication: Can it be more practical? In:
12th IEEE International Conference on High Performance Computing and
Communications, 1-3 September 2010, Melbourne.
Al solami, Eesa, Boyd, Colin, Clark, Andrew and Ahmed, Irfan. User-repre-
sentative feature selection for keystroke dynamics. In Sabrina De Capitani
di Vimercati and Pierangela Samarati, editors, International Conference on
Network and System Security, Università degli Studi di Milano, Milan, 2011.
Al solami, Eesa, Boyd, Colin, Ahmed, Irfan, Nayak, Richi, and Marring-
ton, Andrew. User-independent threshold for continuous user authentica-
tion based on keystroke dynamics. The Seventh International Conference
on Internet Monitoring and Protection, 27 May - 1 June 2012, Stuttgart,
Germany.
Acknowledgments
Praise and thanks be to Allah for his help in accomplishing this work.
I would like to express my deep and sincere gratitude and appreciation to my
principal supervisor Prof. Colin Boyd, for his continued support and guidance
throughout my PhD. I could not have imagined having a better adviser for my
PhD study. Thank you very much Colin.
I sincerely thank my co-adviser, Assoc. Prof. Richi Nayak, for her encourage-
ment, insightful comments and support. Indeed, Richi was a valuable addition to
my supervision team who provided a great deal of good ideas and suggestions on
my research.
Lastly I wish to thank my entire extended family for their understanding, sup-
port and guidance. I am heartily thankful to my parents, for their encouragement,
guidance and support. Also, and most importantly, I wish to thank my wife and
my daughters, Shahad, Raghad and Yara. Without them I could not have com-
pleted this work.
Chapter 1
Introduction
A breach of information security can affect not only a single user's work but also the
economic development of companies, and even the national security of a country.
This research focuses on unauthorised access attacks to a computer, the second
greatest source of financial loss according to the 2006 CSI/FBI Computer Crime
and Security Survey [31]. Attacks on computer systems can be undertaken at the
network, system and user levels [62].
Most information security research undertaken in recent years is concerned
with system and network-level attacks. However, there is a lack of research on
attacks at the user level. User level attacks include the impostor or intruder who
takes over from the valid user either at the start of a computer session or during
the session. Depending on the risks in a particular environment, a single, initial
authentication might be insufficient to guarantee security. It may also be necessary
to perform continuous authentication to prevent user substitution after the initial
authentication step. The impact of an intruder taking over during a session is the
same as any kind of false representation at the beginning of a session. Most current
computer systems authorise the user at the start of a session and do not detect
whether the current user is still the initial authorised user, a substitute user, or
an intruder pretending to be a valid user. Therefore, a system that continuously
checks the identity of the user throughout the session is necessary. Such a system
is called a continuous authentication system.
The majority of existing continuous authentication systems are built around
biometrics. These continuous biometric authentication systems (CBAS) rely on
user traits and characteristics. There are two major forms of biometrics:
those based on physiological attributes and those based on behavioural character-
istics. The physiological type includes biometrics based on stable body traits, such
as fingerprint, face, iris and hand, which are considered to be more robust and secure.
However, they are also considered to be more intrusive and expensive and require
regular equipment replacement [86]. On the other hand, behavioural biometrics
include learned movements such as handwritten signatures, keyboard dynamics
(typing), mouse movements, gait and speech. Collecting these biometrics is less
obtrusive, and they do not require extra hardware.
Recently, keystroke dynamics has gained popularity as one of the main sources
of behavioural biometrics for providing continuous user authentication. Keystroke
dynamics is appealing for many reasons [32]:
- It is less obtrusive, since users will be typing on the computer keyboard
anyway.
- It does not require extra hardware.
- Keystroke dynamics exist and are available after the authentication step at
the start of the computer session.
Analysing typing data has proved to be very useful in distinguishing between users
and can be used for biometric authentication. Various types of analysis have been
carried out on users' typing data, to find features that are representative of user
typing behaviour and to detect an impostor or intruder who may take over a
valid user's session.
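To make the raw material of such analysis concrete, typing data is usually reduced to timings of short character sequences (n-graphs). The following is a minimal, hypothetical sketch (the event format and function name are our own, not from the thesis) of extracting 2-graph latencies from per-keystroke timestamps:

```python
# Illustrative sketch: extracting 2-graph (digraph) latencies from a stream
# of key-press timestamps. The (key, time_ms) event layout is an assumption.

def digraph_latencies(events):
    """events: list of (key, press_time_ms) tuples in typing order.
    Returns a dict mapping each 2-graph to the list of latencies
    (time between its two consecutive key presses)."""
    latencies = {}
    for (k1, t1), (k2, t2) in zip(events, events[1:]):
        latencies.setdefault(k1 + k2, []).append(t2 - t1)
    return latencies

# Hypothetical key events for the text "theth":
sample = [("t", 0), ("h", 95), ("e", 180), ("t", 400), ("h", 510)]
print(digraph_latencies(sample))
# {'th': [95, 110], 'he': [85], 'et': [220]}
```

Statistics computed over such latency lists (averages, deviations, frequencies) are the raw features used throughout the later chapters.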
This research extends previous research on improving continuous user authenti-
cation systems (that are based on keystroke dynamics) by developing a new flexible
technique that authenticates users, and automates this technique to continuously
authenticate users over the course of a computer session without the need for any
pre-defined user typing model a priori. The technique also introduces new features
that represent the user typing behaviour effectively. The motivation for this re-
search is provided in Section 1.1, the research objectives and questions are stated
in Sections 1.2 and 1.3, the outcomes achieved are identified in Section 1.4, and
the organisation of this thesis is described in Section 1.6.
1.1 Motivation
This thesis focuses on developing automatic analysis techniques for continuous user
authentication systems (that are based on keystroke dynamics) with the goal of
detecting an impostor or intruder that may take over a valid user session. The main
motivation of this research is that we need a flexible system that can authenticate
users without depending on a pre-defined typing model of a user. This
research is motivated by:
- the relative absence of research in CBAS utilising biometric sources in
general, and keystroke dynamics specifically;
- the absence of a suitable model that considers a continuous authentication
scenario, identifying and understanding the characteristics and requirements
of each type of CBAS and continuous authentication scenario;
- the need for new feature selection techniques that represent user typing be-
haviour, guaranteeing that frequently typed features are selected and in-
herently reflect user typing behaviour; and
- the lack of an automated CBAS based on keystroke dynamics that is low
in computational resource requirements and thus suitable for real-time
detection.
1.2 Research Objectives
According to the previous section, the objectives that need to be addressed in this
thesis are:
1. To develop a generic model for identifying and understanding the character-
istics and requirements of each type of CBAS and continuous authentication
scenarios. (Chapter 3)
2. To identify optimum features that represent user typing behaviour, which
guarantee that frequently typed features are selected and inherently reflect
user typing behaviour. (Chapter 5)
3. To discover whether a pre-defined typing model of a user is necessary for
successful authentication. (Chapter 6)
4. To minimise the delay for an automatic CBAS to detect intruders. (Chapter
7)
1.3 Research Questions
The main research questions addressed in this thesis are:
1. "What is a suitable model for identifying and understanding the characteris-
tics and requirements of each type of CBAS and continuous authentication
scenarios?"
In this thesis we develop a generic model for most continuous authentication
scenarios and CBAS. The model is developed based on the detection capabilities
of both continuous authentication scenarios and CBAS, to better identify and
understand the characteristics and requirements of each type of scenario and
system. This model pursues two goals: the first is to describe the charac-
teristics and attributes of existing CBAS, and the second is to describe the
requirements of different continuous authentication scenarios.
From the model that we have developed, we found that most existing CBAS
typically depend on pre-defined user-typing models for authentication. How-
ever, in some scenarios it is impractical or impossible to obtain a pre-defined
typing model of all users in advance (before detection time). Therefore, the
following question will be addressed in this thesis:
2. "Can users be authenticated without depending on a pre-defined typing model
of a user? If so, how?"
In this thesis we develop a novel continuous authentication mechanism that
is based on keystroke dynamics and is not dependent on a pre-defined typing
model of a user. It is able to automatically detect an impostor or intruder
in real time. The accuracy of a CBAS is measured in terms of its detection
accuracy rate. The aim is to maximise the detection accuracy, that is, the
percentage of impostors or intruders masquerading as genuine users who are
detected, and to minimise the false alarm rate, that is, the percentage of
genuine users who are identified as impostors or intruders. In order to detect
or distinguish between a genuine user and an impostor in one computer
session, we incorporate distance measure techniques in our approach.
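The two accuracy measures just described can be stated compactly. A hedged sketch (the function names and the 0/1 decision encoding are illustrative assumptions, not the thesis notation):

```python
# Illustrative definitions of the two accuracy measures. Decisions are
# encoded as 1 = "flagged as impostor", 0 = "accepted as genuine".

def detection_rate(impostor_decisions):
    """Fraction of impostor samples that were correctly flagged."""
    return sum(impostor_decisions) / len(impostor_decisions)

def false_alarm_rate(genuine_decisions):
    """Fraction of genuine-user samples that were wrongly flagged."""
    return sum(genuine_decisions) / len(genuine_decisions)

# Hypothetical decisions over four impostor and four genuine samples:
print(detection_rate([1, 1, 0, 1]))    # 0.75
print(false_alarm_rate([0, 0, 1, 0]))  # 0.25
```

A good system pushes the first number toward 1 and the second toward 0; the trade-off between them is controlled by the decision threshold.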
Three further sub-questions arise from the previous question:
1. What are the optimum features that are representative of user typing behav-
ior? To address this sub-question, we propose four statistical-based feature
selection techniques. The first technique selects the most frequently occur-
ring features. The other three consider different user typing behaviors by
selecting: n-graphs that are typed quickly; n-graphs that are typed with
consistent time; and n-graphs that have large time variance among users.
2. How accurately can a user-independent threshold determine whether two
typing samples in a user session belong to the same user? To address this
sub-question, we have examined four different variables that influence the
accuracy of the threshold and directly affect the user samples used for
authentication: distance type, number of keystrokes, feature type and
amount of features.
3. Can we automatically detect an impostor who takes over from a valid user
during a computer session, and how much typing data does the system need
to detect the impostor? To answer this sub-question, we first need to
answer questions 1 and 2, and use the answers to propose the automated
system. For automated detection, a sliding window mechanism is used and
the optimum size of the window is determined.
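The user-independent threshold idea of sub-question 2 can be sketched in a few lines. This is a minimal illustration only: the mean-absolute-difference distance, the threshold value and the sample layout are placeholder assumptions, not the measures and parameters evaluated in the thesis:

```python
# Illustrative sketch: deciding whether two typing samples belong to the
# same user by comparing a distance over shared n-graphs against a single
# threshold that is fixed across all users (user-independent).

def sample_distance(a, b):
    """a, b: dicts mapping n-graph -> average typing time (ms).
    Returns the mean absolute difference over shared n-graphs."""
    shared = a.keys() & b.keys()
    if not shared:
        return float("inf")  # nothing to compare
    return sum(abs(a[g] - b[g]) for g in shared) / len(shared)

THRESHOLD_MS = 25.0  # hypothetical value, identical for every user

s1 = {"th": 92.0, "he": 101.0, "in": 110.0}
s2 = {"th": 96.0, "he": 108.0, "an": 130.0}
same_user = sample_distance(s1, s2) <= THRESHOLD_MS
print(same_user)  # distance = (4 + 7) / 2 = 5.5 -> True
```

The key property is that neither sample needs to come from a previously enrolled, modelled user; the comparison is between two arbitrary samples.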
1.4 Research Outcomes
By addressing the research objectives and research questions, this thesis makes a
number of contributions and achievements including:
1. A generic model is proposed for most continuous authentication scenarios and
CBAS. The model of CBAS is based on their detection capabilities,
to better identify and understand the characteristics and requirements of
each type of scenario and system. This model pursues two goals: the first
is to describe the characteristics and attributes of existing CBAS, and the
second is to describe the requirements of different scenarios of CBAS. The
research results were published in:
Al solami, Eesa, Boyd, Colin, Clark, Andrew and Khandoker, Asadul
Islam. Continuous biometric authentication: Can it be more prac-
tical? In: 12th IEEE International Conference on High Performance
Computing and Communications, 1-3 September 2010, Melbourne.
2. We propose four statistical-based feature selection techniques that are repre-
sentative of user typing behavior. The selected features have high statistical
significance for user-representation and also inherently reflect user typing
behavior. The first technique simply selects the most frequently typed n-graphs,
taking a certain number of highly occurring n-graphs. The other three
encompass users' different typing behaviors:
(a) The quickly-typed n-graph selection technique obtains n-graphs that
are typed quickly. The technique computes the average typing time of
each n-graph, representing its usual typing time, and then selects the
n-graphs with the least typing time.
(b) The time-stability typed n-graph selection technique selects the n-
graphs that are typed with consistent time. The technique computes
the standard deviation of each n-graph, representing the variance from
its average typing time, and then selects the n-graphs with the least
variance.
(c) The time-variant typed n-graph selection technique selects the n-
graphs that are typed with noticeably different times. The technique
computes the standard deviation of each n-graph among all users,
representing the variance from their average typing times, and then
selects the n-graphs with the largest variance.
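The four selection criteria above can be sketched compactly. The data layouts (an n-graph mapped to a list of observed times, or to one average time per user) and the function names are our own illustrative assumptions:

```python
# Illustrative sketches of the four feature selection criteria.
from statistics import mean, stdev

def most_frequent(graphs, k):
    """graphs: n-graph -> list of observed typing times (ms).
    Select the k n-graphs with the most observations."""
    return sorted(graphs, key=lambda g: len(graphs[g]), reverse=True)[:k]

def quickly_typed(graphs, k):
    """Select the k n-graphs with the smallest average typing time."""
    return sorted(graphs, key=lambda g: mean(graphs[g]))[:k]

def time_stable(graphs, k):
    """Select the k n-graphs with the smallest standard deviation."""
    return sorted(graphs, key=lambda g: stdev(graphs[g]))[:k]

def time_variant(per_user_means, k):
    """per_user_means: n-graph -> list of per-user average times.
    Select the k n-graphs whose times vary most across users."""
    return sorted(per_user_means, key=lambda g: stdev(per_user_means[g]),
                  reverse=True)[:k]

graphs = {"th": [90, 95, 92, 91], "he": [130, 170], "in": [110, 112, 111]}
print(most_frequent(graphs, 1))   # ['th']  (4 observations)
print(quickly_typed(graphs, 1))   # ['th']  (mean 92 ms)
print(time_stable(graphs, 1))     # ['in']  (stdev 1 ms)

per_user = {"th": [92, 140, 95], "he": [150, 152, 149]}
print(time_variant(per_user, 1))  # ['th']  (large cross-user spread)
```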
The research results were published in:
Al solami, Eesa, Boyd, Colin, Clark, Andrew and Ahmed, Irfan. User-
representative feature selection for keystroke dynamics. In Sabrina De
Capitani di Vimercati and Pierangela Samarati, editors, International
Conference on Network and System Security, Università degli Studi di
Milano, Milan, 2011.
3. A user-independent threshold approach is proposed that can distinguish a user
accurately without needing any pre-defined user typing model a priori. The
threshold can be fixed across a whole set of users in order to authenticate
users without requiring a pre-defined typing model for each user. The research
results were published in:
results were published in:
Al solami, Eesa, Boyd, Colin, Ahmed, Irfan, Nayak, Richi, and Mar-
rington, Andrew. User-independent threshold for continuous user au-
thentication based on keystroke dynamics. The Seventh International
Conference on Internet Monitoring and Protection, 27 May - 1 June
2012, Stuttgart, Germany.
4. The design of an automatic system that is capable of authenticating users
based on the user-independent threshold.
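A minimal sketch of such an automatic system, using the non-overlapping sliding-window mechanism mentioned under the research questions. The window size, the per-window statistic and the threshold value here are placeholder assumptions, not the parameters determined in the thesis:

```python
# Illustrative sketch: non-overlapping sliding-window impostor detection.
# The first window is treated as the authenticated reference; each later
# window is compared against it and flagged when the deviation exceeds
# the (user-independent) threshold.

def non_overlapping_windows(keystrokes, size):
    """Split the keystroke list into consecutive windows of `size`."""
    return [keystrokes[i:i + size] for i in range(0, len(keystrokes), size)]

def mean_time(window):
    """Mean n-graph time over one window of (n-graph, time_ms) pairs."""
    return sum(t for _, t in window) / len(window)

def detect_impostor(keystrokes, size, threshold):
    """Return the index of the first window whose mean time deviates from
    the reference window by more than `threshold` ms, or None."""
    wins = non_overlapping_windows(keystrokes, size)
    reference = mean_time(wins[0])
    for i, w in enumerate(wins[1:], start=1):
        if abs(mean_time(w) - reference) > threshold:
            return i
    return None

# Hypothetical session: digraph times jump when an impostor takes over.
session = [("th", 90), ("he", 95), ("in", 92), ("er", 94),
           ("th", 160), ("he", 170), ("in", 165), ("er", 158)]
print(detect_impostor(session, size=4, threshold=30.0))  # 1
```

The window size controls the detection delay: smaller windows react faster but give the distance measure fewer keystrokes to work with, which is the trade-off Chapter 7 investigates.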
1.5 Research Significance
This research advances knowledge in the area of CBAS by linking continuous
authentication scenarios with the relevant continuous biometric authentication
schemes. It identifies the characteristics and requirements of each type of CBAS
and continuous authentication scenario, and helps in choosing the right accuracy
measurements for the relevant scenario or situation.
Furthermore, the research establishes a novel approach based on a user-independent
threshold that does not need user-typing models to be built. The new approach
allows new practical systems to be built in a systematic way that can be used for
user authentication and impostor detection during the entire session, without the
need for any pre-defined user typing model a priori. The new system is ap-
plicable in cases where it is impractical or impossible to obtain the pre-defined
typing model of all users in advance (before detection time), for example in an
open-setting scenario or in an unrestricted environment such as a public location
where any user can use the system. For instance, consider a computer that has
a guest account in a public location. In this instance, any user can interact with
the system, and naturally no pre-defined typing model for the user would be available
prior to the commencement of the session.
Additionally, the implications of this method extend beyond typist authentication:
it is generic and might be applied to any biometric source such as mouse
movements, gait or speech. Another important implication is that, unlike
existing schemes, our method can compare two unknown user samples and decide
whether they are from the same user or not. This might help in forensic
investigation applications where there are two different typing samples and the
investigator wants to decide whether they are related to one user or two
different users.
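The same-user decision described above can be sketched in a few lines. This is an illustrative simplification, not the thesis implementation: function names, the feature representation (mean digraph latencies), the distance measure, and the threshold value are all assumptions chosen for the example.

```python
# Sketch of the user-independent threshold idea: two typing samples are
# reduced to mean digraph latencies, compared with a simple distance measure,
# and judged "same user" when the distance falls below a threshold that is
# fixed across all users (no per-user model needed).

def digraph_means(events):
    """events: list of (digraph, latency_ms); returns digraph -> mean latency."""
    sums, counts = {}, {}
    for dg, t in events:
        sums[dg] = sums.get(dg, 0.0) + t
        counts[dg] = counts.get(dg, 0) + 1
    return {dg: sums[dg] / counts[dg] for dg in sums}

def sample_distance(a, b):
    """Mean absolute difference over digraphs shared by both samples."""
    shared = set(a) & set(b)
    if not shared:
        return float("inf")
    return sum(abs(a[dg] - b[dg]) for dg in shared) / len(shared)

def same_user(sample1, sample2, threshold_ms=30.0):  # threshold is illustrative
    return sample_distance(digraph_means(sample1),
                           digraph_means(sample2)) <= threshold_ms
```

The key point is that `threshold_ms` is a single global value, so the comparison works for two samples from users the system has never seen before.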
1.6 Thesis Outline
The rest of this thesis is organised as follows:
Chapter 2: Background This chapter gives an overview of biometrics, and
introduces the concepts that will be used to describe typist authentication in
subsequent chapters. Also, this chapter surveys related work, covering existing
techniques for continuous user authentication that are based on keystroke dynamics.
Chapter 3: A proposed model for CBAS using biometrics This chapter
proposes a generic model for most continuous authentication scenarios and CBAS.
This model has two goals: the first is to describe the characteristics and
attributes of existing CBAS; the second is to describe the requirements of
different scenarios of CBAS. Also, we identify the main issues and limitations
of existing CBAS, observing that all of the issues are related to the training
data. Finally, we consider a new application for CBAS that does not require any
training data, either from intruders or from valid users, in order to make
CBAS more practical.
Chapter 4: Dataset Analysis This chapter describes and analyses in depth
the dataset that we used in our research. The analysis includes the data
pre-processing that prepared the data for further analysis. Also, this chapter
provides preliminary experiments showing that the dataset is reliable and can
be used for our research. Furthermore, this chapter provides an overview of the
experimental and evaluation methodology used in this thesis.
Chapter 5: User-Representative Feature Selection for Keystroke Dy-
namics This chapter explores which typing patterns can be used on a continuous
basis for user authentication. The chapter proposes four statistical feature
selection techniques that mitigate limitations of existing approaches. The
first technique selects the most frequently occurring features. The other three
consider different user typing behaviours by selecting: n-graphs that are typed
quickly; n-graphs that are typed with consistent time; and n-graphs that have
large time variance among users. We further substantiate our results by
comparing the proposed techniques with three existing approaches (popular
Italian words, common n-graphs, and least frequent n-graphs). Finally, the
chapter analyses and compares fixed and dynamic features.
Chapter 6: User-independent threshold for continuous user authenti-
cation based on keystroke dynamics This chapter proposes a user-independent
threshold approach that can distinguish a user accurately without the need for
any predefined user typing model a-priori. The chapter examines four variables
that directly affect the user samples used for authentication, in order to see
their influence on the accuracy of the user-independent threshold: distance
type, number of keystrokes, feature type and number of features.
Chapter 7: Typist authentication system based on user-independent
threshold This chapter presents a system design showing that the user-independent
threshold can work in a practical way. In particular, the chapter has two aims:
first, to identify the minimum data needed from an impostor before they can be
detected; second, to identify the point where the impostor takes over from the
genuine user in a computer session.
Chapter 8: Conclusion and Future Work Conclusions and directions for
future research are presented in this chapter.
Chapter 2
Background
The goal of this thesis, as described in Chapter 1, is to design and develop
techniques for detecting an impostor who may take over from the authenticated
user during a computer session, using keystroke dynamics. This chapter provides
an overview of the authentication concept and different types of authentication
methods, focusing on typist authentication. It also gives an overview of
current anomaly detection techniques that can be applied to our research
problem, with emphasis on the techniques used in this thesis.
This chapter is organised as follows. Section 2.1 provides an overview of
authentication methods. Section 2.2 discusses in detail the current schemes for
typist authentication, including static and continuous typist authentication.
Section 2.3 provides an overview of current anomaly detection techniques.
Section 2.4 presents previous research related to the work described in
chapters 5 to 7. Section 2.5 discusses the research challenges associated with
the analysis of keystroke dynamics for continuous user authentication. Finally,
the chapter is summarised in section 2.6.
2.1 User Authentication in Computer Security
Authentication is the process of checking the identity of someone or something.
User authentication is a means of identifying the user and verifying that the
user is allowed to access restricted environments or services. Security research
has determined that, for a positive identification, it is preferable that
elements from at least two, and preferably all three, factors be verified [17].
The three factors (classes), with some elements of each factor, are:
- The object factors: something the user has (e.g., ID card, security token,
smart card, phone, or cell phone).
- The knowledge factors: something the user knows (e.g., a password, pass
phrase, personal identification number (PIN), or digital signature).
- The inherent factors: something the user is or does (e.g., fingerprint,
retinal pattern, DNA sequence (there are assorted definitions of what is
sufficient), signature, face, voice, unique bio-electric signals, or other
biometric identifier).
Any authentication system includes several fundamental elements that need to be
in place [95]:
- the initiators of activity on a target system, normally a user or a group of
users that need to be authenticated;
- distinctive traits or attributes that distinguish a particular user or group
from others, such as knowledge of a secret password;
- a proprietor, or an administrator working on the proprietor's behalf, who is
responsible for the system being used and relies on automatic authentication
to differentiate authorised users from other users;
- an authentication mechanism to verify the user or group of users by the
distinguishing characteristic, such as object factors, knowledge factors and
inherent factors;
- some privilege granted when authentication of the user succeeds, using an
access control mechanism, with the same mechanism denying the privilege if
authentication of the user fails.
As mentioned at the start of this section, the third class of positive
identification factors is the inherent factors. Since biometrics is based on
inherent factors and our research is focused on biometric authentication, we
limit our discussion to biometric authentication in the next sub-section.
2.1.1 Biometric Authentication
Biometrics is the automatic recognition of a person using distinguishing traits
[101]. Biometrics can be physical or behavioural. A physical biometric measures
a static physical trait that does not change, or changes only slightly, over
time; it is related to the shape of the human body, as in fingerprints, face
recognition, hand and palm geometry, and iris recognition. A behavioural
biometric measures how a person behaves or acts, as in speaker and voice
recognition, signature verification, keystroke dynamics, and mouse dynamics.
The advantage of physical biometrics is its high accuracy compared to
behavioural biometrics. On the other hand, physical biometrics requires
dedicated devices to be deployed, which leads to limitations such as high
implementation cost. Behavioural biometrics is a popular method for the
continuous authentication of a person, but it yields lower accuracy because
behaviour is unstable and can change over time.
Identification and verification are the goals of biometric techniques:
biometric identification involves determining who a person is, while biometric
verification involves determining whether a person is who they claim to be
[43]. Physical biometric authentication is the most foolproof form of personal
identification and the hardest to forge or spoof, compared to traditional
authentication methods such as user name and password.
The operation of biometric identification or authentication technologies has
four stages [41]:
- Capture: used in the registration phase and also in the identification or
verification phases. The system captures a physical or behavioural sample.
- Extraction: distinctive patterns or features are extracted from the sample by
selecting the optimum features that represent the user effectively, and a
profile is created for each user.
- Comparison: the profile is then compared with a new sample in the testing
phase.
- Match/Non-match: the system then decides whether the features in the stored
profile match the features of the new sample in the testing phase.
Jain et al. [1] presented seven essential properties of a biometric measure:
- Universality: everyone should possess the measure.
- Uniqueness: the measure distinguishes each user from all others, meaning that
no two people should have the same value.
- Permanence: the measure should be consistent over time. However, behavioural
biometrics change slightly with time as the user learns and improves their
skill at accomplishing tasks.
- Collectability: the data collection process should be quantitatively
measurable.
- Performance: the system should be accurate. Identification accuracy for most
biometric sources is lower than verification accuracy [103].
- Acceptability: most people should be willing to accept the measure. However,
some people might object to the measure for ethical or privacy reasons.
- Circumvention: the measure should not be easily fooled. However, once
knowledge of the measure is available, fabrication may be very easy; therefore,
it is important to keep the collected biometric data secure.
In the next section, we describe typist authentication in detail as one of the
behavioural biometric types.
2.2 Typist Authentication
Most of the aforementioned properties of a biometric measure are represented in
keystroke dynamics. Jain et al. [1] rated typing behaviour as having low
permanence and performance, and medium collectability, acceptability and
circumvention. We think all seven properties are represented in keystroke
dynamics, that is, in using a user's typing behaviour for authentication.
Universality: every user can type on a computer keyboard, with the exception of
some disabled users. Collectability: the keyboard is able to collect and
extract the user's data, even though each keyboard has different specifications
that may affect the quality of the typing data. Uniqueness: each user types
differently from other users, meaning that no two people have the same typing
behaviour. Permanence: a user's typing behaviour is normally consistent over
time, and several studies [44, 32, 21, 28, 48] conclude that the keystroke
rhythm often has characteristics or features that represent consistent patterns
of user typing behaviour over time; however, some users' typing behaviour may
change slightly with time as their typing skills change. Acceptability: the
keyboard is not an intrusive instrument, so it is acceptable to most users.
Performance: the accuracy of keystroke dynamics in representing a user's typing
behaviour is normally high. Circumvention: it is difficult to copy another
user's typing style. In this thesis we consider and evaluate some of these
properties of typing behaviour, including universality, uniqueness, permanence
and performance.
Typing, as a behavioural biometric for authentication, has been used for
several years. By analysing users' typing patterns, several studies [44, 32,
21, 28, 48] conclude that the keystroke rhythm often has characteristics or
features that represent consistent patterns of user typing behaviour; therefore
it can be used for user authentication. In chapter 5, we present extensive
analysis of finding the features that can represent users' typing patterns
effectively, so that they can be used for the user authentication discussed
later in chapters 6 and 7.
The input to a typist authentication or keystroke dynamics system is a stream
of key events and the time at which each one occurs. Each event is either a
key-press or a key-release. Most typist authentication techniques make use of
the time between pairs of events, typically the digraph time or the keystroke
duration.
- The digraph time is the time interval between the first and the last of n
subsequent key-presses. It is sometimes called keystroke latency or
inter-keystroke interval.
- The keystroke duration is the time between the key-press and key-release for
a single key. This is sometimes known as the key-down time, dwell time or hold
time [30].
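The two timing features above can be extracted from a raw event stream with a short routine. This is a minimal sketch under assumed data shapes (each event is a `(time_ms, key, kind)` tuple); real keyloggers deliver events in system-specific formats.

```python
# Extract digraph latencies and keystroke (dwell) durations from a stream of
# key events. Each event is (time_ms, key, "press"|"release").

def timing_features(events):
    presses = []        # (time, key) in press order
    press_time = {}     # key -> time of its pending press
    dwell_times = {}    # key -> list of hold times (press-to-release)
    for t, key, kind in events:
        if kind == "press":
            presses.append((t, key))
            press_time[key] = t
        elif kind == "release" and key in press_time:
            dwell_times.setdefault(key, []).append(t - press_time.pop(key))
    # digraph latency: time between two consecutive key-presses
    digraph_latency = {}
    for (t1, k1), (t2, k2) in zip(presses, presses[1:]):
        digraph_latency.setdefault(k1 + k2, []).append(t2 - t1)
    return digraph_latency, dwell_times
```

Note that the digraph latency here is measured press-to-press, which is one of several conventions (press-to-release and release-to-press variants also appear in the literature).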
There are two main types of keystroke analysis: static and continuous (or
dynamic) analysis. Static keystroke analysis means that the analysis is
performed on the same predefined text for all the individuals under
observation. Most of the literature on keystroke analysis falls within this
category [100, 57, 46, 11, 12, 69, 10]. The intended application of static
analysis is at login time, in combination with other traditional authentication
methods.
Continuous analysis involves continuous monitoring of keystroke typing and is
intended to be executed during the entire session, after the initial
authentication step. Keystroke analysis performed after the initial
authentication step deals with the typing rhythms of whatever is entered by the
users, which means the system must deal with free text. In the next two
sub-sections we describe in more detail the existing schemes of both types of
keystroke dynamics.
2.2.1 Static Typist Authentication
Static authentication involves authenticating users through stable methods like
user name and password. Behavioural static authentication is a static
authentication method based on how the user acts and behaves with the
authentication system; for example, how a user name and password are typed.
This method is used as an additional authentication method and to overcome some
limitations of traditional authentication methods. Keystroke dynamics and mouse
dynamics are the main examples of behavioural static authentication.
Keystroke dynamics analyses the typing patterns of users. Using keystroke
dynamics as an authentication method is derived from handwriting recognition,
which analyses handwriting movements. Table 2.1 summarises the techniques
discussed in this section. These techniques are evaluated by two measurements:
FRR, the rate at which the system incorrectly rejects an access attempt by an
authorised user, and FAR, the rate at which the system incorrectly accepts an
access attempt by an unauthorised user.
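The two error rates just defined reduce to simple ratios over decision counts. A minimal sketch (function and argument names are illustrative):

```python
# FRR and FAR computed from raw accept/reject decisions:
#   FRR = rejected genuine attempts / all genuine attempts
#   FAR = accepted impostor attempts / all impostor attempts

def far_frr(genuine_decisions, impostor_decisions):
    """Each list holds booleans: True means the system accepted the attempt."""
    frr = genuine_decisions.count(False) / len(genuine_decisions)
    far = impostor_decisions.count(True) / len(impostor_decisions)
    return far, frr
```

For example, rejecting 1 of 10 genuine attempts and accepting 1 of 20 impostor attempts gives FRR = 10% and FAR = 5%.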
In 1980, Gaines et al. [29] were the first to use keystroke dynamics as an
authentication method. They conducted an experiment with six users. Each
participant was asked to retype two samples, with a gap of four months between
the collection of the two samples. Each sample contained three paragraphs of
varying lengths. They used specific digraphs occurring in the paragraphs as
features, analysing and collecting the keystroke latency timings. The five most
frequent digraphs that appeared as distinguishing features were in, io, no, on,
and ul. They then compared latencies between the two sessions to see whether
the average and mean values were the same in both sessions. The limitation of
this experiment was that the data sample was too small to yield reliable
results. Also, no automated classification algorithm was used to distinguish
the participants, but the results were claimed to be very encouraging.
Umphress and Williams [100] asked 17 participants to type two samples: one
typing sample of about 1400 characters used for training, and a second sample
of about 300 characters used for testing. They represented the features by
Reference | FAR (%) | FRR (%) | Sample content | Method
Gaines et al. [29] | 0.00 | 0.00 | 6000 characters | Manual
Umphress and Williams [100] | 12.00 | 6.00 | 1400 characters for training and 300 characters for testing | Statistical
Leggett and Williams [57] | 5.5 | 5 | 1000 words | Manual
Joyce and Gupta [46] | 16.36 | 0.25 | User name, password and last names; eight times for training and five times for testing | Statistical
Bleha et al. [11] | 8.1 | 2.80 | Name and fixed phrase | Bayes
Brown and Rogers [12] | 21.2 | 12.0 | First and last names | Neural network
Obaidat and Sadoun [69, 70] | 0.00 | 0.00 | User names, 225 times | Neural network
Furnell et al. [28] | 26 | 15 | 4400 characters | Statistical
Bergadano et al. [10] | 0.00 | 0.14 | 683 characters | Nearest Neighbour
Sang et al. [88] | 0.2 | 0.1 | Alphabetic and numeric passwords | SVMs
Table 2.1: The accuracy of static typist authentication techniques
grouping the characters into words and then calculating the times of the first
six digraphs of each word. The classifier was based on statistical techniques,
with the condition that each digraph must fall within 0.5 standard deviations
of its mean to be considered valid. Their system obtained a FRR of 12% and a
FAR of 6%.
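The validity test used here is easy to state in code. The sketch below illustrates the within-k-standard-deviations check (k = 0.5 in Umphress and Williams); the profile layout and function names are assumptions, not the original implementation.

```python
# Statistical validity test: a test digraph time is "valid" when it falls
# within k standard deviations of the profile mean for that digraph.

def valid_digraph(time_ms, mean_ms, std_ms, k=0.5):
    return abs(time_ms - mean_ms) <= k * std_ms

def fraction_valid(test_times, profile):
    """profile: digraph -> (mean, std); test_times: list of (digraph, time).
    Returns the fraction of test digraphs that pass the validity test."""
    checks = [valid_digraph(t, *profile[dg])
              for dg, t in test_times if dg in profile]
    return sum(checks) / len(checks) if checks else 0.0
```

A decision rule then accepts the user when `fraction_valid` exceeds some threshold, as in the 60%-of-comparisons condition used by Leggett and Williams below.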
Later, an experiment similar to that of Gaines et al. [29] was conducted by
Leggett and Williams [57], who invited 17 programmers to type approximately
1000 words, with the condition that a user was accepted if more than 60% of the
comparisons were valid. The results showed a FAR of 5.5% and a FRR of 5%.
Joyce and Gupta [46] recorded keystrokes during the log-in process as users
typed their user name, a password, and their last name. 33 users participated
in this experiment; each typed the strings eight times to build a historical
profile and five times for testing. The classifier was based on a statistical
approach requiring that a digraph fall within 1.5 standard deviations of its
reference mean to belong to a valid user. The results showed a FAR of 16.36%
and a FRR of 0.25%.
Bleha et al. [11] used the same approach proposed by Joyce and Gupta [46],
using digraph latencies as a feature to distinguish between samples from
legitimate users and intruders. The experiment involved 14 participants as
valid users and 25 as impostors who created their profiles. The classification
method was a Bayes classifier over the digraph times. Results show a FAR of
8.1% and a FRR of 2.8%.
Brown and Rogers [12] were the first to use keystroke duration as a feature to
distinguish between the samples of authenticated users and impostors. The
experiment divided the participants into two groups (21 in the first group and
25 in the second), who were asked to type their first and last names. A neural
network was applied to classify the data, and the results show a 0.0% false
negative rate, with a 12.0% FRR in the first group and a 21.2% FAR in the
second group.
Furnell et al. [28] used digraph latencies as the representative feature.
Thirty users were invited to type the same text of 2200 characters twice to
build their profiles. For intruder profiles, the users were asked to type two
different texts of 574 and 389 characters. Digraph latencies were computed by
statistical analysis, and the results show that, within the first 40 keystrokes
of the testing sample, the FRR was 15% and the FAR was 26%.
Obaidat and Sadoun [69, 70] used keystroke duration and latency together as
features to distinguish between the samples of authenticated users and
impostors. The experiment invited 15 users to type their names 225 times each
day over a period of eight weeks to build their profiles. A neural network was
used to classify the user samples. The results showed that both FAR and FRR
were zero.
Bergadano et al. [10] used a single text of 683 characters with 154
participants, and considered typing errors and the intrinsic variability of
typing as features that can distinguish users. They used the degree of disorder
in trigraph latencies as a dissimilarity metric, with a statistical
classification method computing the average differences between the units in
the array. The results show a FAR of 4.0% and a FRR of 0.01%. The method in
this experiment is suitable for authenticating users at log-in, but it is not
applicable to continuous authentication because it requires predefined data.
Sang et al. [88] conducted the same experiment as Obaidat and Sadoun [69, 70]
(duration and latency together) but with a different technique, using a support
vector machine (SVM) to classify ten user profiles. The results demonstrated
that this technique is the best for classifying user profile data, with more
accurate results of 0.02% FAR and 0.1% FRR.
All of the previous techniques show that static typist authentication has had
great success and can be used to distinguish different users effectively. They
show that static typist authentication has various features that can be used to
represent a user's typing behaviour, and that these features can be used for
user authentication. In the next section, we examine whether continuous typist
authentication has features that can likewise be used effectively for user
authentication.
2.2.2 Continuous Typist Authentication
Continuous typist authentication using dynamic or free text applies when users
are free to type whatever they want and keystroke analysis is performed on the
available information. This is much closer to a real-world situation than using
static text. The literature on keystroke analysis of free text is rather
limited. This section describes most continuous typist authentication
techniques; they are summarised in Table 2.2.
Reference | FAR (%) | FRR (%) | Accuracy (%) | Sample content | Method
Monrose and Rubin [64] | - | - | 23 | A few predefined and free sentences | Euclidean distance and weighted probability
Dowland et al. [22] | - | - | 50 | Normal activity on computers running Windows NT | Statistical method
Dowland et al. [21] | - | - | 60 | Normal activity on specific applications such as Word | Statistical method
Bergadano et al. [9] | 0 | 5.36 | - | Two different texts, each 300 characters long | Distance measure
Nisenson et al. [68] | 1.13 | 5.25 | - | Task responses; each user typed 1866 to 2551 keystrokes | LZ78
Gunetti and Picardi [32] | 3.17 | 0.03 | - | Artificial emails; 15 samples per user, each of 700 to 900 keystrokes | Nearest Neighbour
Hu et al. [40] | 3.17 | 0.03 | - | 19 users, each providing 5 free-text samples | k-nearest neighbour
Bertacchini et al. [8] | - | - | - | 62 different users typed 66 Spanish-language samples | k-medoids
Hempstalk et al. [38] | - | - | - | Real-world emails; datasets of 150 and 607 email samples | Gaussian density estimation
Janakiraman and Sim [44] | - | - | - | 22 users' daily email activity, from 30,000 to 2 million keystrokes | Based on a common list of fixed strings
Table 2.2: The accuracy of continuous typist authentication techniques
Monrose and Rubin [64] conducted an experiment on 31 users, collecting typing
samples over about 7 weeks. Users ran the experiment from their own computers
at their convenience. They had to type predefined sentences from a list of
available phrases and/or to enter a few sentences that were not predefined and
completely free. It is not specified how many characters a user had to type to
build a profile. The features representing user behaviour were the mean latency
and standard deviation of digraphs, together with the mean duration and
standard deviation of keystrokes. To filter the user profile, each latency and
duration was compared with its respective mean, and any values greater than one
standard deviation above the mean (that is, outliers) were removed from the
profile. Testing samples were manipulated in the same way by removing outliers,
and were thus turned into testing profiles to be compared with the reference
profiles using three different distance measures:
- Euclidean distance;
- Euclidean distance with calculation of the mean and standard deviation of the
latency and duration of digraphs; and
- Euclidean distance with weights added to digraphs (a weighted probability
measure).
The best case achieved about 23% correct classification (that is, using the
weighted probability measure).
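The outlier filtering and Euclidean comparison just described can be sketched as follows. This is an illustration, not the Monrose and Rubin implementation: the data layout and the exact cut-off (mean plus one standard deviation) are assumptions based on the description above.

```python
# Profile filtering and Euclidean comparison: timing values more than one
# standard deviation above the mean are dropped as outliers, and the filtered
# profiles are compared with Euclidean distance over their shared features.

import math

def filter_outliers(samples):
    """samples: feature -> list of times. Drop values above mean + std,
    then collapse each feature to its (filtered) mean."""
    filtered = {}
    for f, times in samples.items():
        mean = sum(times) / len(times)
        std = math.sqrt(sum((t - mean) ** 2 for t in times) / len(times))
        kept = [t for t in times if t <= mean + std]
        filtered[f] = sum(kept) / len(kept) if kept else mean
    return filtered

def euclidean(p, q):
    """Euclidean distance over features present in both profiles."""
    shared = set(p) & set(q)
    return math.sqrt(sum((p[f] - q[f]) ** 2 for f in shared))
```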
Dowland et al. [22] applied different data mining algorithms and statistical
techniques to samples from four users in order to distinguish between
authenticated users and impostors. The users were observed for some weeks
during their normal activity on computers running Windows NT; that is, there
was no constraint on the users, who were free to use the computer in any way.
User profiles were given features using the mean and standard deviation of
digraph latency, and only digraphs typed less frequently by all the users in
all samples were considered. To filter the user profile, two thresholds were
applied: any times less than 40 ms or greater than 750 ms were discarded. The
results demonstrated a 50% correct classification rate. The same experiment was
refined by Dowland et al. [21] to include application information for
PowerPoint, Internet Explorer, Word, and Messenger. That experiment collected
data from eight users over three months, and the results demonstrated a FRR of
40%.
Bergadano et al. [9] calculated a new measure: the time between the depression
of the first key and the depression of the second key for each pair of
consecutive characters. Forty users were invited to build historical profiles
by typing two different samples of text, each containing 300 characters; in
total the participants provided 137 samples. Ninety new users were invited to
build testing files by typing the second sample only. To classify an unknown
sample, the mean distance was computed between the unknown sample and each
sample of a user's profile, and also between the unknown sample and each user's
profile as a whole. The authors applied a supervised learning scheme to improve
the false negative rate, computing the mean and standard deviation between
every sample in a user's profile and every sample in a different user's
profile. Results demonstrated that the FRR was reduced to 5.36% while the FAR
was zero.
A longer experiment by Dowland and Furnell [23] collected about 3.5 million
keystrokes from 35 users over three months. The sample content collected from
users was based on global logging, which includes all possible typing
behaviour.
Nisenson et al. [68] collected free-text samples from five users acting as
normal users and 30 users acting as attackers. The sample content was either an
open answer to a question, copy-typing, or a block of free typing. Time
differentials were calculated from the typing data and used as user features.
Each normal user typed between 1866 and 2551 keystrokes. Attackers were asked
to answer two open-ended questions and were required to type the specific
sentence "To be or not to be. That is the question."; they were also allowed to
type free text of between 597 and 660 keystrokes. These features were then used
to train an LZ78-based classifier algorithm. The system attained a FRR of 5.25%
and a FAR of 1.13%.
Gunetti and Picardi [32] used free-text samples in an experiment with 205
participants, applying the same technique that Bergadano et al. [9] proposed
for static typist authentication, discussed in the previous section. They
created profiles for each user based on their typing characteristics in free
text. They performed a series of experiments using the degree of disorder to
measure the distance between a test sample and reference samples from every
other user in their database. The samples are transformed into lists of
n-graphs, sorted by their average times. To classify a new sample, it is
compared with each existing sample in terms of both relative and absolute
timing. Only n-graphs that appear in both the reference and the unknown sample
are used for classification. The Gunetti and Picardi study [32] achieves very
high accuracy when there are many registered users.
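The degree-of-disorder distance used in this line of work can be sketched briefly. The normalisation constant follows the usual definition (maximum disorder of an array of length V); the data layout and function names here are illustrative.

```python
# "Degree of disorder" distance: the n-graphs shared by two samples are
# ordered by typing time in each sample, and the distance is the total
# displacement between the two orderings, normalised by the maximum possible
# disorder of an array of that length (V*V/2 if V is even, (V*V-1)/2 if odd).

def disorder_distance(sample_a, sample_b):
    """sample: n-graph -> average typing time. Returns a value in [0, 1]."""
    shared = sorted(set(sample_a) & set(sample_b))
    v = len(shared)
    if v < 2:
        return 0.0
    rank_a = {g: i for i, g in enumerate(sorted(shared, key=sample_a.get))}
    rank_b = {g: i for i, g in enumerate(sorted(shared, key=sample_b.get))}
    disorder = sum(abs(rank_a[g] - rank_b[g]) for g in shared)
    max_disorder = v * v / 2 if v % 2 == 0 else (v * v - 1) / 2
    return disorder / max_disorder
```

Because only the relative ordering of n-graph times matters, the measure is insensitive to a user typing uniformly faster or slower across sessions, which is part of why it works on free text.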
Many researchers have used clustering algorithms to authenticate users. Hu et
al. [40] applied a technique similar to that proposed by Gunetti and Picardi
[32]. 19 users participated in the experiment, each providing five typing
samples; another 17 users provided 27 typing samples which were used as
impostor data. Typing environment conditions were not controlled in this data
collection. They proposed a k-nearest neighbour classification algorithm in
which an input needs to be authenticated only against the limited set of user
profiles within a cluster. The main difference between the algorithm of Hu et
al. [40] and the method of Gunetti and Picardi (the GP method) is that the
authentication process of the proposed algorithm works within a cluster, while
the GP method needs to go through the entire database. Also, for a user profile
X, the k-nearest neighbour classification algorithm uses only its
representative profile in the authentication process, while the GP method needs
to compare with every sample of each user profile. They used the clustering
algorithm to form a cluster for each user. First, each user provides several
training samples from which the user's profile is built; a representative user
profile is constructed by averaging all feature vectors from all the training
samples provided. Second, the k-nearest neighbour method is applied to cluster
the representative profiles based on the distance measure. Finally, the
authentication algorithm for new text is executed only on the user's
corresponding cluster. The success of the proposed algorithm depends upon the
threshold value, which in turn depends on the users registered in the system.
Moreover, because the proposed algorithm classifies and authenticates only
users who are already registered, the system is less effective when new users
interact with it. The experiment shows that the proposed k-nearest neighbour
classification algorithm can achieve the same level of FAR and FRR performance
as the GP approach, while improving authentication efficiency by up to 66.7%
compared to the GP method.
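The representative-profile step described above can be sketched in a few lines. This illustrates the averaging and nearest-representative idea only, not the Hu et al. clustering pipeline; the data layout and the Euclidean distance are assumptions for the example.

```python
# Build one representative feature vector per user by averaging that user's
# training vectors, then attribute a new sample to the user whose
# representative vector is nearest (Euclidean distance).

def representative(vectors):
    """Component-wise mean of a list of equal-length feature vectors."""
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(len(vectors[0]))]

def nearest_user(profiles, sample):
    """profiles: user -> list of training vectors; sample: feature vector."""
    reps = {u: representative(vs) for u, vs in profiles.items()}

    def dist(r):
        return sum((a - b) ** 2 for a, b in zip(r, sample)) ** 0.5

    return min(reps, key=lambda u: dist(reps[u]))
```

Comparing against one representative per user, rather than against every stored sample, is exactly the source of the efficiency gain the chapter attributes to this approach.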
Bertacchini et al. [8], ran their experiment on one dataset. This dataset
contains keystroke data of 66 labeled users based on Spanish language. The dataset
contains a total of 66 samples, one sample per typing session, representing 62
dierent users. They also used clustering to classify users by making a cluster
for each user. The number of clusters is based on the number of the users in the
24 Chapter 2. Background
dataset. The proposed algorithm is Partitioning Around Medoids (PAM), an
implementation of k-medoids first proposed by Kaufman and Rousseeuw
[49]. It is a partitioning technique that clusters a data set of n objects into k
clusters, where k is known a priori. It has the advantage over k-means that it does not need
a definition of the mean, as it works on dissimilarities; for this reason, any arbitrary
distance measure can be used. It is also more robust to noise and outliers than
k-means because it minimises a sum of dissimilarities instead of a sum
of squared Euclidean distances. The proposed approach worked successfully on
the registered users in the system but failed when new users were added to the
system.
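The k-medoids idea can be illustrated with a deliberately small sketch. Real PAM improves an initial medoid set by iterative swaps; the version below simply searches all candidate medoid sets exhaustively, which is feasible only for tiny datasets but shows the two properties stressed above: any dissimilarity function can be plugged in, and medoids are actual data points rather than means.

```python
from itertools import combinations

def total_cost(points, medoids, dist):
    """Sum of dissimilarities from each point to its nearest medoid."""
    return sum(min(dist(p, m) for m in medoids) for p in points)

def k_medoids(points, k, dist):
    """Exhaustive PAM-style clustering for small data sets: pick the k
    points that minimise the total dissimilarity, then assign each
    point to its nearest medoid."""
    best = min(combinations(points, k),
               key=lambda ms: total_cost(points, ms, dist))
    clusters = {m: [] for m in best}
    for p in points:
        nearest = min(best, key=lambda m: dist(p, m))
        clusters[nearest].append(p)
    return clusters
```

Because `dist` is a parameter, a Manhattan distance, an edit distance, or any keystroke-specific dissimilarity can be used without changing the algorithm.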
Hempstalk et al. [38] collected typing input from real-world emails. They
collected about 3000 emails over three months and then processed them into two
final datasets, containing 150 email samples and 607 email samples respectively.
Then, they created profiles only for valid users based on their typing characteristics
in free text. They performed a series of experiments using Gaussian
density estimation techniques by applying and extending an existing classification
algorithm to the one-class classification problem that describes only the valid user's
typing data. Hempstalk et al. applied a density estimator algorithm in order to generate
a representative density for the valid user's data in the training phase, and then
combined the predictions of the representative density of the valid user and the
class probability model for predicting the new test cases.
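One-class density estimation of this kind can be sketched as follows. This is a simplified illustration, not Hempstalk et al.'s actual method: it fits an independent Gaussian per timing feature on valid-user data only and accepts a new sample when its log-density exceeds a threshold; the function names and threshold are our own assumptions.

```python
import math

def fit_gaussians(samples):
    """Fit one Gaussian per timing feature, using only the valid
    user's training data (one class: no impostor data needed)."""
    params = []
    for col in zip(*samples):
        mu = sum(col) / len(col)
        var = sum((x - mu) ** 2 for x in col) / len(col) or 1e-9
        params.append((mu, var))
    return params

def log_density(vector, params):
    """Log-likelihood of a sample under the per-feature Gaussians."""
    ll = 0.0
    for x, (mu, var) in zip(vector, params):
        ll += -0.5 * math.log(2 * math.pi * var) - (x - mu) ** 2 / (2 * var)
    return ll

def is_valid_user(vector, params, threshold):
    return log_density(vector, params) >= threshold
</test```

Samples that resemble the training data score a high log-density and are accepted; typing from another user tends to fall far from the fitted distribution and is rejected.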
Janakiraman and Sim [44] replicated the work of Gunetti and Picardi [32].
They conducted the same experiment again, but they did not avoid using the usual
digraph and trigraph latencies directly as features. Twenty-two users were invited over two
weeks to conduct an experiment. Some of the users were skilled typists and
could type without looking at the keyboard. Other users were unskilled typists, but
were still familiar with the keyboard as they had used it for many years. The users
came from different backgrounds, including Chinese, Indian or European origin,
and all were fluent in English. Keystrokes were logged as users went about their
daily activities of using email, surfing the web, creating documents, and so
on. The data collected from individual users ranged from 30,000 keystrokes to 2 million
keystrokes. In total, 9.5 million keystrokes were recorded for all users. However,
they did not report their findings in their paper.
One of the main limitations of the Gunetti and Picardi approach [32] is its high
verification error rate, which causes a scalability issue. Gunetti and Picardi proposed
a classical n-graph-based keystroke verification method (GP method), which can
achieve a low False Acceptance Rate (FAR). However, the GP method suffers
from a high False Rejection Rate (FRR) and a severe scalability issue. Thus, GP
is not a feasible solution for some applications, such as cloud computing applications,
where scalability is a big issue. Xi et al. [102] developed a keystroke dynamics scheme for user verification to overcome
GP's shortcomings. To reduce the high FRR, they designed a new correlation measure
using n-graph equivalent features (nGdv) that enables more accurate recognition
of genuine users. Moreover, correlation-based hierarchical clustering was proposed
to address the scalability issue. The experimental results show that nGdv-C
can produce a much lower FRR while achieving almost the same level of FAR as
the GP method.
All of the previous techniques show that continuous typist authentication
has had similar success to static typist authentication and can be used
to distinguish users effectively. They show that we can obtain features from
the typing data that represent the user's typing behaviour, and these
features can be used for successful user authentication. However, the features
extracted from the typing data in continuous typist authentication do not guarantee
strong statistical significance, nor do they inherently incorporate
user typing behaviour. Furthermore, one of the main limitations of the previous
continuous typist authentication techniques is that they require the users' data to be
available in advance. In principle, the requirement of collecting users' data
in advance restricts the systems to authenticating only known users whose typing
samples are modelled. In some cases, it is impractical or impossible to obtain a
pre-defined typing model of all users in advance (before detection time). It should
be possible to distinguish users without a pre-defined typing model, which
would make the system more practical.
In the next section, we present some related pioneering works in both supervised
and unsupervised typist authentication. Supervised and unsupervised methods
will help to link the existing continuous typist authentication schemes with the
relevant setting environment or scenario.
2.3 Machine Learning in Typist Authentication
There are two main types of settings (or scenarios) for continuous typist authentication.
A continuous authentication scenario might be conducted either in an
open-setting environment or a closed-setting environment [79]. Two main
machine learning methods are used to represent these two setting environments.
The first is supervised learning (closed-setting environment), where the profiles of authenticated
users and possible impostors are available in advance, such as in a computer-based exam
scenario. This type of setting might be considered a restricted environment, in
that the environment should be under access control to stop any user not registered
in the system. The second is unsupervised
learning (open-setting environment), where the profile of the impostor is not available,
such as in an online banking scenario. However, an open-setting environment might also
arise when the profiles of both the impostor and the valid users are not available,
such as in a computer-based TOEFL exam scenario.
2.3.1 Supervised Learning
Supervised approaches can only detect previously known typing models and are
unable to detect unknown typing models. Pre-defined typing models of both the valid
user and possible impostors are required to construct models in order to assign
observations to one of the two classes. In this case, the identity of the user who
initiates the session is known, as well as the identities of all possible impostors or
colluders. The characteristics of supervised learning using typing behaviour are
summarised below.
- This approach requires the continuous authentication scenario to be in a
closed-setting and restricted environment. The environment should prevent
any user not registered in the system from gaining access.
- The identity of an authorised user (who initiates a session) is known, and this
user has a pre-defined typing model registered in the database in advance.
- The unauthorised user, such as an imposter or intruder, would try to claim the
identity of an authorised user throughout the session and, it is assumed, will
have a pre-defined typing model registered in the database in advance.
- The unauthorised user in this approach might be an adversary or a colluder.
The adversarial user may be deliberately acting maliciously towards the
authorised user. This may happen when the authenticated user is harmed by
a malicious person or forgets to log off at the end of the session. In this
case, the malicious person may conduct actions or events on behalf of
the authorised user. Alternatively, the colluding user may be invited by the
valid user to complete an action on their behalf, for example in a TOEFL
exam. The victim in this case would be the system operator or the owner of
the application.
- Labelled normal data from the authorised users and anomalous data from
possible imposters should be used in order to build the detection model. This
approach is similar to a multi-class classifier that learns to differentiate
between all classes in the training data. This classifier is then used to predict
the class of an unseen instance by matching it to the closest known class.
There are many existing supervised learning techniques for building models of normal
behaviour when pre-defined typing models are available for all users. Neural
networks (NNs) [19, 84, 12], decision trees (DTs) [80], support vector machines
(SVMs) [88], nearest neighbour [32], and supervised statistical models [59, 10] are
well-known supervised learning techniques.
In general, supervised measures produce more accurate results than unsupervised
approaches because the pre-defined typing models have labeled data (i.e.,
examples of both normal and anomalous behaviours). However, supervised
techniques are only able to detect known pre-defined typing models of users and
cannot detect an unknown user. In the next section, we discuss studies that have
used unsupervised techniques to detect impostors who do not have a pre-defined
typing model.
2.3.2 Unsupervised Learning
In contrast to supervised learning approaches, unsupervised learning methods do
not require a pre-defined typing model of the user. In this case, we assume that
no pre-defined typing model is available for any user, authorised or not, at the
beginning of the session. We do, however, assume that the user who initiates the
session is authorised to do so. The challenge here is to build a typing model of
this authorised user while, at the same time, trying to decide whether or not the
session has been taken over by an imposter. A summary of the characteristics of
unsupervised learning using typing behaviour follows.
- This approach requires the continuous authentication scenario to be in an
open setting and a non-restricted environment. The environment should
be in a public location so that any user can use the computer system.
- While the identity of an (authorised) user who initiates a session may be
known, no pre-defined typing model for this user is available prior to the
commencement of the session.
- Similarly, the pre-defined typing model for the unauthorised user is not available
and cannot be collected before the prediction or testing time.
- The unauthorised user in this approach would be an adversary, and the
victim in this case would be the end user.
- There are no labels for either normal or anomalous data to be used in order
to build the detection model in this approach.
In this approach, the system determines whether the typing data in the testing
phase is related to one user or two users by trying to identify any significant change
within the typing data. Many unsupervised learning methods have been applied
to anomaly detection: clustering [105, 90, 65], time series analysis [82, 4] and
threshold analysis [42, 78] are well-known unsupervised learning techniques.
In this thesis, we propose a novel unsupervised impostor detection approach,
which uses a combination of the threshold analysis technique and the time series
analysis technique to detect an impostor who may take over from the authorised user
during the computer session (discussed in chapters 6 and 7). In the next section,
we discuss related work using anomaly detection techniques that need to identify
data patterns related to an impostor.
2.4 Anomaly Detection Techniques
Anomaly detection is the detection of patterns in given data that do not conform to
recognised normal behaviour [14]. Two main terms are used for these non-conforming
patterns in the context of anomaly detection: anomalies and
outliers. Anomalies usually occur when testing pattern data does not match
historical or normal pattern data. Anomalies occur for different reasons, such as
malicious activity or the breakdown of a system. Anomaly detection faces several
challenges:
- Distinguishing the normal behaviour region from the anomalous behaviour
region is very difficult.
- An intruder may adapt his activity so that it appears to be normal behaviour
when in reality it is anomalous behaviour.
- Distinguishing between noisy data and anomalous data is a major issue.
There are several applications of anomaly detection:
- Intrusion Detection: refers to the detection of malicious activity in a computer
system [76]. The challenge of anomaly detection in this domain is the
huge amount of data that the anomaly detection techniques need to deal with.
Denning [20] classifies intrusion detection systems into host-based and network-based
intrusion detection systems. Host-based intrusion detection systems
monitor all or parts of the dynamic behaviour and the state of a computer
system. The dynamic behaviour in this domain can be profiled at different
levels, such as the program level or the user level. The techniques in this domain
have to model sequences of data or calculate the similarity between
sequences. The main source of anomalies in this domain is an
outside attacker who wants to obtain unauthorised access to the network to
steal information or disrupt the network. A CBAS based on keystroke
dynamics can be thought of as a kind of intrusion detection application.
An intrusion detection system monitors a series of events with the aim of
detecting unauthorised activities. In the case of a CBAS with keystroke
dynamics, the unauthorised activity is a person acting as an impostor by
taking over the authenticated session of another (valid) user.
- Fraud Detection: refers to detecting criminal activities in business organisations
such as banks, credit card companies, the stock market, etc. Fawcett
and Provost [24] were the first to introduce the term "activity
monitoring" for fraud detection in this domain. They build a profile for each
customer and monitor the profiles to detect any anomalies. One of the
applications in this domain is credit cards, and the challenge in this
application is detecting unauthorised credit card usage, which requires online
detection of fraud as soon as the fraudulent transaction takes place.
Now, we will give a brief definition of, and the assumptions behind, some anomaly
detection techniques:
- Classification based anomaly detection techniques: these are used to learn a model
from training data and then classify a test instance into one of the classes
using the learned model. Different anomaly detection techniques
use different classification methods to build classifiers, for example
neural network based, Bayesian network based, support vector machine
based, and rule based techniques. These techniques assume that a classifier
which can discriminate between normal and anomalous classes
can be learned in the given feature space. These techniques can be
classified into two different categories [14]: multi-class classification, which
assumes the training data belongs to multiple normal classes; and one-class
classification, which assumes the training data belongs to only one
class. The advantage of these techniques, especially the multi-class techniques,
is that they can use effective algorithms to discriminate between
instances belonging to different classes. Furthermore, the process of
comparison between a historical profile and a testing profile is very fast for
these techniques. However, these techniques have some limitations, such as
depending on the availability of labels for the different normal classes.
- Nearest neighbour based anomaly detection techniques: these require a distance
between two data instances. For continuous attributes, as in our research,
the Euclidean distance is a well-known measure. The assumption of these
techniques is that normal data instances lie in dense neighbourhoods,
while anomalies do not lie close to their neighbours. For example,
user typing is based on the premise that a user often types in a similar
fashion (that is, the distance between samples from the same user is small) and
different users often type differently (that is, the distance between different
users' samples is large). Otey et al. [72] proposed a distance measure for
data that includes categorical and continuous attributes separately. One of
the advantages is that no assumptions are made regarding the distribution
of the data, which is by nature unsupervised. However, these techniques
have some limitations: techniques in this domain sometimes
miss anomalies, especially if the data has normal instances that do not have
enough close neighbours or anomalies that have enough close
neighbours [14].
- Clustering based anomaly detection techniques: clustering is used to group
similar data instances into clusters [97]. Clustering techniques are
categorised into three groups depending on three assumptions. The first
assumption: normal data instances belong to a cluster while anomalies do
not belong to any cluster. The drawback of techniques that depend
on this assumption is that they focus on finding clusters
instead of finding anomalous data. The second assumption: normal data
instances are close to the center of a cluster and anomalous data is far from
the center of a cluster. These techniques have been used in different areas,
such as intrusion detection [83] and sequence data [13]. The third assumption:
normal data instances belong to dense and large clusters while anomalies
belong to small or sparse clusters. One of these techniques, proposed by He et al. [37],
is called cluster-based local outlier factor (CBLOF). The technique
takes into account both the size of the cluster and the distance of
the data to the center of the cluster.
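The nearest neighbour premise above — samples from the same user lie close together, while another user's samples lie far away — can be sketched as a simple distance test. This is an illustrative fragment, not a published detector; the sample vectors, radius, and function name are our own assumptions.

```python
import math

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def is_anomaly(sample, normal_samples, radius):
    """Flag a typing sample as anomalous when even its nearest
    neighbour among the normal samples is farther than `radius`."""
    nearest = min(euclidean(sample, s) for s in normal_samples)
    return nearest > radius
```

A sample typed by the same user lands inside the dense neighbourhood and is accepted; an impostor's sample, lying far from every normal instance, is flagged.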
Our research needs an anomaly detection technique in order to detect unauthorised
activity in which a person acts as an imposter by taking over the authenticated session of
another (valid) user. The characteristics of our problem are similar to those of anomaly
detection based on clustering. In our case, we need to check whether
the typing data in a single session can be divided into two different clusters or one
cluster. The time series analysis technique can be one application of anomaly
detection based on clustering. In section 2.5.3, we give more details about the
time series analysis technique.
2.5 Related Techniques
The previous sections have established the background for the thesis as a whole.
In this section, we review the research related to the work described in chapters 5
to 7.
2.5.1 Feature Selection
Feature selection has been an active research area in different domains, including
the pattern recognition, statistics, and data mining communities [56]. The main idea
of feature selection is to choose a subset of input variables by removing features
with little or no predictive information. In order to select the features in the typing
data, it is important to understand the different features of the data. The main
aim is to know what the best attributes are that can be extracted from the user
typing data. The raw keystroke dynamics data, such as key events and timestamps,
cannot be used directly by an anomaly detector. Instead, sets of timing features
are extracted from the raw keystroke dynamics data which can help to differentiate
between users.
Figure 2.1 illustrates how a string AB can be represented as a vector of four
times. This information can be used to extract other keystroke characteristics such
as the average, standard deviation, maximum and minimum of the times from the
extracted time vectors. The timing features of keystroke dynamics
include:
- Duration, which is defined as the time between the press of a key and its
release.
- Interval, which covers the time between two key presses, the time between
the release of one key and the press of the next key, and the time between the
releases of two successive keys, as well as the rate of typing, that is, the average number
of characters per minute or second. All of these features have been
extracted by different researchers [98, 12, 69].
Figure 2.1: Transforming a keystroke pattern into a timing vector when a user
inputs a string AB: the duration and interval times are measured in milliseconds
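The duration and interval features listed above can be computed directly from raw key events. The sketch below assumes each event is a (key, press time, release time) triple in milliseconds; this representation and the function name are illustrative choices, not a fixed format from the literature.

```python
def timing_features(events):
    """Extract duration and interval features from raw key events.
    Each event is (key, press_time_ms, release_time_ms), in typing order."""
    # duration: time a key is held down
    durations = [rel - prs for _, prs, rel in events]
    # interval: time from one key press to the next key press
    press_press = [events[i + 1][1] - events[i][1]
                   for i in range(len(events) - 1)]
    # latency: time from one key's release to the next key's press
    release_press = [events[i + 1][1] - events[i][2]
                     for i in range(len(events) - 1)]
    return durations, press_press, release_press
```

From these per-key times, aggregates such as the mean, standard deviation, maximum, and minimum can then be computed to build a profile.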
Continuous authentication approaches based on keystroke dynamics use sequences
of characters that users type during a session as distinguishing features. Since users
can type characters in any sequence during a session, continuous authentication
approaches require the selection of multiple features that are representative of user
typing behavior. The n-graph is a popular feature among existing continuous
authentication schemes. It is the time interval between the first and the last
of n subsequent key-presses. Existing approaches use n-graph (feature) selection
techniques to obtain user-representative features.
Dowland et al. [21] collected the typing samples of five users by monitoring
their regular computer activities, without imposing any particular constraints
on them, such as asking users to type a predefined set of words. They selected
the features (2-graphs only) that occurred the least number of times across the
collected typing samples. They used keystroke latency, which is the elapsed time
between the release of the first key and the press of the second key. They built
user profiles by computing the mean and standard deviation of 2-graph latencies.
They achieved correct acceptance rates in the range of 60%.
Unlike Dowland et al., Gunetti et al. [32] avoided using the 2-graph and 3-graph
latencies directly as features. Instead, they used latencies to determine
the relative ordering of different 2-graphs and 3-graphs. They extracted the 2-graphs and
3-graphs that are common between two samples and found the difference between
them. For this, they devised a distance metric to measure the distance between the
two orderings of 2-graphs and 3-graphs in two samples. In order to identify
the user of an unknown sample, they compared it with all the samples of the users
by computing the distance between them. The user whose sample has the least distance
is deemed to be the user of the unknown sample. They reported 95% accuracy.
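Gunetti and Picardi's relative-ordering idea can be sketched as follows. The fragment ranks the n-graphs shared by two samples by their latency in each sample and sums how far each n-graph moves between the two rankings, normalised by the maximum possible disorder. This is a simplified reading of their "degree of disorder" measure, not their exact algorithm; each sample is assumed, for illustration, to be a mapping from an n-graph to its latency.

```python
def disorder_distance(sample_a, sample_b):
    """Distance between two typing samples based on the relative
    ordering of their shared n-graphs, in the spirit of Gunetti and
    Picardi: 0.0 means identical orderings, 1.0 means fully reversed."""
    shared = sorted(set(sample_a) & set(sample_b))
    if len(shared) < 2:
        return 0.0
    order_a = sorted(shared, key=lambda g: sample_a[g])
    order_b = sorted(shared, key=lambda g: sample_b[g])
    disorder = sum(abs(order_a.index(g) - order_b.index(g)) for g in shared)
    n = len(shared)
    max_disorder = n * n / 2 if n % 2 == 0 else (n * n - 1) / 2
    return disorder / max_disorder
```

Because only the relative ordering matters, a user who types uniformly faster or slower on a given day still produces a small distance against their own earlier samples.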
Janakiraman and Sim [44] selected popular English words such as "the", "or", "to" and
"you" as features. They showed that many fixed strings qualify as good candidates
and identified the user as soon as they typed any of the fixed strings. They
showed that these words can be used to discriminate users effectively.
While the previously selected features exploit the occurrences of n-graphs, their
selection criteria do not guarantee features with strong statistical significance,
which apparently causes less accurate statistical user representation. Furthermore,
their selected features do not inherently incorporate user typing behavior. We
believe that the existing feature selections do not represent the user typing behaviour
effectively.
Our approach of selecting features based on statistical techniques is similar to
that of Manning et al. [60]. In their study, Manning et al. [60] suggested different
statistical techniques for text mining to find the importance of a term in a document,
while in our approach we use statistical techniques to find the features
that are user representative.
Feature selection is very important for providing continuous user authentication.
In chapter 5, we will focus our analysis on this component in
order to find the optimum features that represent the user typing behaviour. We
propose different feature selection techniques that can represent users' typing patterns
effectively. In particular, the chapter addresses the first sub-question from
Research Question 2 (outlined in chapter 1):
What are the optimum features that are representative of user typing behavior?
2.5.2 Application of Threshold Analysis
A threshold is, in general, the minimum or maximum value (established for a
feature, characteristic, or parameter) which provides a benchmark for comparison
[73]. Typist authentication is an example of a major threshold application.
It involves separating the typing samples into clusters corresponding to users. This
separation or segmentation is based on identifying and defining common features
that represent the user typing behaviour.
The threshold approach has been used widely in keystroke dynamics for representing
the user typing behaviour. Joyce and Gupta [47] were among the first researchers
to use the threshold approach for static typist authentication. At authentication
time, the user provides a signature sample which is compared with the reference
signature for that user to determine the magnitude of the difference between the two
profiles. Positive authentication is declared when this difference is within a threshold
variability of the reference signature. Eight training signatures are used to
decide the threshold for an acceptable difference vector between a given signature
sample and the reference signature sample. The results show that the threshold
approach is an effective method to distinguish a single user from other users. However,
the threshold is user dependent, so each user has a different threshold.
Clustering is one of the threshold methods that has been used by some researchers
[40, 8] for typist authentication in order to distinguish the typing data of two different
users. As explained in section 2.2.2, Hu et al. and Bertacchini et al. [40, 8] used the
k-means algorithm to cluster users' typing data into different clusters. Each user
is represented by a different cluster. Then, the threshold for each user is calculated
based on the distance from the centroid to the furthest point of that
cluster. In the authentication phase, the authentication of an unknown sample against
the assumed user succeeds if the distance between the unknown sample and
the centroid is less than the predefined threshold of that user. We can clearly see that all
the previous attempts at threshold analysis for typist authentication are dependent
on users. Each user has a different threshold, which in a sense is a
user-dependent threshold.
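The user-dependent threshold scheme just described can be sketched in a few lines. This is an illustrative Python fragment under our own naming and with Euclidean distance assumed: the threshold is simply the largest distance from a user's cluster centroid to any of that user's training samples.

```python
import math

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def build_user_threshold(training_samples):
    """User-dependent threshold: the centroid of a user's cluster and
    the maximum distance from that centroid to any training sample."""
    n = len(training_samples)
    centroid = [sum(col) / n for col in zip(*training_samples)]
    threshold = max(euclidean(s, centroid) for s in training_samples)
    return centroid, threshold

def authenticate(sample, centroid, threshold):
    """Accept the sample only if it falls inside the user's cluster."""
    return euclidean(sample, centroid) <= threshold
```

Because the threshold is derived from one user's own samples, every user ends up with a different value — exactly the user dependence the text points out.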
A user-independent threshold is a single threshold value, based on some parameters,
characteristics and features, which can be suitable for most subjects.
Coetzee and Botha [16] applied a global, or user-independent, threshold
for fingerprint recognition. Eighty fingerprints from low-quality images were used to
build the global or independent threshold. After deriving the threshold in the
training phase, three approaches to classification were investigated: a correlation
classifier and two feature-based classification schemes.
Ahmad and Choi [2] used a global threshold for edge detection in images. To
remove edges due to noise, they adopted a global threshold based on the variance of
the image; their method uses global information and filtering methods to extract edge
information. Also, Matas et al. [61] used a global, or independent, threshold
approach for face recognition, in which the global threshold is built from a set of
training face images.
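To make the contrast with the user-dependent case concrete, one way a single user-independent threshold could be derived from training data is sketched below. The particular rule — the midpoint between the average within-user distance and the average between-user distance — is an illustrative assumption of ours, not the construction used by any of the cited works.

```python
import math

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def global_threshold(users):
    """One shared threshold for all users: the midpoint between the
    average within-user distance and the average between-user distance.
    `users` maps a user id to that user's list of sample vectors."""
    within, between = [], []
    ids = list(users)
    for uid in ids:
        samples = users[uid]
        for i in range(len(samples)):
            for j in range(i + 1, len(samples)):
                within.append(euclidean(samples[i], samples[j]))
    for i in range(len(ids)):
        for j in range(i + 1, len(ids)):
            for a in users[ids[i]]:
                for b in users[ids[j]]:
                    between.append(euclidean(a, b))
    return (sum(within) / len(within) + sum(between) / len(between)) / 2

def same_user(sample_a, sample_b, threshold):
    """Two samples are attributed to one user if their distance is small."""
    return euclidean(sample_a, sample_b) <= threshold
```

Once built, the same value is applied to every pair of samples, so no per-user registration or per-user tuning is needed at detection time.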
Threshold analysis is very important for our thesis in order to find the user-independent
threshold that can be suitable for the whole set of users. In chapter 6,
we will focus our analysis on this component by proposing a user-independent
threshold technique that can authenticate a user and also detect an impostor
without the need for any predefined user typing model a priori. In particular, the
chapter addresses the second sub-question from Research Question 2 (outlined in
chapter 1):
How well can a user-independent threshold work out whether two typing samples
in a user session belong to the same user?
2.5.3 Time Series Analysis
A time series is a sequence of data points, or a collection of observations of well-defined
data items, obtained through repeated measurements over time [74]. Time
series analysis includes methods for analysing time series data in order to extract
important statistics and other characteristics of the data. Time series data have
a natural temporal ordering. This makes time series analysis different from
other common data analysis problems, in which there is no natural ordering of the
observations. There are two main aims of time series analysis [63]: (a) identifying
the nature of the event represented by the sequence of observations, and (b) forecasting
(predicting future values of the time series variable). Both of these aims
require that the pattern of the observed time series data is identified and more or less
formally described.
A continuous typist authentication system can be thought of as a kind of intrusion
detection system (IDS) [32]. An IDS monitors a series of events, with
the aim of detecting unauthorised activities. In the case of the continuous typist
authentication system, the unauthorised activity is a person acting as an imposter
by taking over the authenticated session of another (valid) user. In order to detect
that impostor, we need to apply a change detection approach. Change detection
approaches try to identify changes in the probability distribution of a time series. In
general, the problem is concerned with (1) detecting whether or not a change has
occurred, or whether several changes might have occurred, and (2) identifying the
times of any such changes.
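As a hedged illustration of points (1) and (2), the fragment below scans a series of per-window typing statistics for the split that maximises the difference in means between the two sides. This naive scan is our own illustrative choice, not a method from the cited literature; a change is reported when the best gap is large, and the best split index locates when it occurred.

```python
def detect_change(series, min_size=3):
    """Naive change-point scan: for each candidate split, compare the
    means of the two sides and return the split with the largest gap.
    Returns (split_index, mean_gap); split_index is None for short series."""
    best_idx, best_gap = None, 0.0
    for i in range(min_size, len(series) - min_size + 1):
        left, right = series[:i], series[i:]
        gap = abs(sum(left) / len(left) - sum(right) / len(right))
        if gap > best_gap:
            best_idx, best_gap = i, gap
    return best_idx, best_gap
```

In a typist authentication setting, `series` could hold one summary statistic per window of keystrokes; a pronounced jump in the statistic suggests the session changed hands at the returned index.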
Two approaches are used to identify unauthorised or malicious activities: anomaly
based and signature based. Signature-based approaches employ signatures of
known users to detect an unknown user. A database of known user typing models
is compared with the current user's typing in order to detect an unauthorised
typist who may take over from the valid user after the start of the session. However,
signature-based intrusion detection systems have limitations in detecting a new user
whose signature or typing model is not in the signature database. Most of the
current schemes in continuous typist authentication follow the signature-based
approach (discussed in section 2.2.2). In contrast to signature-based detection systems, an
anomaly-based system is a statistical approach which does not require a priori
knowledge of the anomaly. An anomaly-based system builds a model of normal
typing behaviour (registration phase) and any deviation from that behaviour is
considered anomalous.
Window mechanisms have been used widely and effectively for time series analysis
and for detecting changes in data in several domains, such as network traffic
[4]. In this thesis, we propose two window mechanisms, a non-overlapping sliding
window and an overlapping sliding window, in order to do time series
analysis effectively. The significance of the proposed sliding window techniques is two-fold:
firstly, automatic detection of the impostor, and secondly, faster detection of the
impostor. The proposed sliding window techniques address
the limitations of a fixed window and its inability to detect the impostor
quickly.
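The difference between the two window mechanisms comes down to the step between consecutive windows. The following minimal sketch (the function name and parameters are illustrative) generates both kinds from one keystroke stream.

```python
def windows(keystrokes, size, step):
    """Generate analysis windows over a keystroke stream.
    step == size gives non-overlapping windows; step < size gives
    overlapping (sliding) windows, which revisit recent keystrokes
    and can therefore react to a change sooner."""
    return [keystrokes[i:i + size]
            for i in range(0, len(keystrokes) - size + 1, step)]
```

With overlapping windows, a take-over that happens mid-window appears near the start of a subsequent window, so fewer new keystrokes are needed before the change becomes visible.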
Time series analysis is necessary for our thesis in order to detect the impostor
automatically. While the threshold is very efficient in distinguishing different users'
samples, it remains to be shown how it can be used in a practical system where the
change point between different users is unknown a priori. Therefore, we will apply
time series analysis to detect the impostor automatically. In particular, chapter 7
addresses the third sub-question from Research Question 2 (outlined in Chapter
1):
Can we automatically detect the impostor who takes over from the valid user
during the computer session, and how much typing data does the system need to
detect the imposter?
2.6 Research Challenges
Analysing users' typing data and using it as a biometric for authentication
has been an active area of research for many years (see Section 2.2 for details).
Traditional approaches to analysing users' typing data with the aim of continuous
typist authentication ignore the application scenarios. Most of the approaches
depend on pre-defined user-typing models for authentication. This dependency
restricts such systems to authenticating only known users whose typing samples
have been modelled. In some scenarios, it is impractical or impossible to obtain
the pre-defined typing models of all users in advance (before detection time).
Adding to this challenge is the dynamic nature of users' typing data, which changes
frequently and may therefore be a less accurate representation of user typing
behaviour. This is due to factors including:
- User typing behaviour changes over time [48], and current schemes are
typically dependent on pre-defined user-typing models in the database.
This leads to the situation where the user's typing behaviour or typing
style might change between the registration phase and the authentication
or testing phase.
- The valid user's physical mood is not the same at all times, and could well
be different between the registration phase and the authentication phase.
- User typing behaviour varies from one context to another, so every context
needs a new historical profile for comparison in that context. For example,
typing speed in an online exam may not be the same as the typing speed
when typing an email.
- Comparison between the new typing model and the historical typing model
in the database takes time and can delay detection.
We have identified several key research challenges and requirements that need to
be addressed regarding efficient continuous typist authentication analysis:
- The proposed technique should pay attention to the requirements of the
application scenarios of interest. Each application scenario has different
requirements and each continuous typist authentication technique has different
characteristics and attributes. Therefore, we need to match the suitable
continuous typist authentication technique to the suitable application scenario.
- The proposed technique should find character sequences as features that are
representative of user typing behaviour. The proposed technique should
guarantee selecting frequently-typed features that inherently reflect user typing
behaviour and can effectively represent the user.
- The proposed technique should not be dependent on a pre-defined typing
model of a user. The proposed technique should authenticate and distinguish
a user accurately without the need for any pre-defined user typing model a
priori.
- The proposed technique should be able to automatically analyse the user's
typing data. The proposed technique should help in automatically detecting
the impostor during a computer session in real time or near real time.
2.7 Summary
In this chapter, past and current research into the analysis of users' typing data,
or keystroke dynamics, with an emphasis on continuous typist authentication schemes,
is presented. The chapter also presents a discussion of current anomaly detection
techniques. Two types of data analysis techniques have been widely used
in keystroke dynamics, namely data mining and statistical analysis. Research in
applying these techniques to keystroke dynamics has been reviewed. In addition,
the challenges of the current research are also presented. In the next chapter, we
present our first contribution by proposing a generic model which will be the primary
object of this thesis. The model can help in identifying and understanding
the characteristics and requirements of each type of continuous typist authentication
and continuous authentication scenario.
Chapter 3
Model for Continuous Biometric
Authentication
Chapter 2 provided an overview of current research into continuous authentication
based on biometrics, with an emphasis on keystroke dynamics. It also identified
the key research challenges and requirements regarding analysis of user typing
data. In this chapter, we propose a new model that identifies and explains the
characteristics and requirements of each type of continuous authentication system
and continuous authentication scenario. The chapter presents the primary objective
of this thesis, to describe the characteristics and attributes of existing continuous
authentication systems, and the second objective, to describe the requirements of
different continuous authentication scenarios. In particular, the chapter addresses
Research Question 1 (outlined in Chapter 1):
"What is the suitable model for identifying and understanding the characteristics
and requirements of each type of CAS and continuous authentication scenario?"
The chapter is organized as follows. The next section presents the background
on CBAS and the motivation for the chapter. In Section 3.2, we present the
continuous biometric authentication system (CBAS) model in order to describe the
characteristics and attributes of existing CBAS, and to describe the requirements
of different CBAS scenarios. Following this, in Section 3.3, we give some examples
of continuous authentication scenarios with different requirements and we show
how these scenarios differ. In Section 3.4, we relate the existing schemes to
the proposed model. In Section 3.5, we describe a new class of CBAS that does
not require training data. The final section, Section 3.6, summarizes the chapter.
3.1 Continuous Biometric Authentication System
(CBAS)
The majority of existing CBAS are built around biometrics supplied by user
traits and characteristics [55]. There has been an ongoing pursuit of improving
CBAS. Recently, efforts have been made to improve CBAS by either embedding
intrusion detection into the CBAS itself [58] or by adding a new biometric source
tailored to the detection system [80]. However, these attempts do not consider
the different limitations that might affect the practicability of existing CBAS in
real-world applications and continuous authentication scenarios. To our knowledge,
there is no CBAS deployed in real-world applications; it seems reasonable to assume
that this is because existing systems lack practicality. There are a number of issues
associated with existing schemes which may prevent CBAS from being applied in
real-world applications. These limitations include:
- the requirement for training data to be available in advance;
- the need for a large amount of training data;
- variability of the behavioural biometric between the training and testing data;
and
- variability of the behavioural biometric of the user from one context to another,
for example, typing an email versus typing in a computer-based exam.
There are several scenarios and situations that require the application of a continuous
user authentication approach, but these scenarios cover a variety of possible
situations and have different characteristics and requirements. Current CBAS
may not consider these differences in requirements and characteristics. Thus, a
scheme might be applied to the wrong domain or scenario, because previous
CBAS might not have chosen accurate measurements for the relevant scenario
or situation and may not have considered the type of setting, environment or situation.
Additionally, the time taken to detect the intruder might be too long.
For example, choosing the right biometric source for the relevant application
or system may not be considered by existing CBAS. In other words, one
CBAS might rely on keystroke dynamics as the input for providing continuous
authentication, while most actions of a user are based on mouse dynamics.
As another example, selecting a suitable algorithm for the relevant setting,
environment or situation might not be considered, such as where a CBAS uses a
multi-class classification algorithm in an open-setting environment. Multi-class
classification requires data to be available from both valid users and impostors
before the profile of normal biometric data is generated, but it is impossible to
collect data in advance from all possible impostors in an open-setting environment.
In an open-setting environment, the profile of the impostor is not available; the
profiles of valid users may or may not be available. In a closed-setting environment,
the profiles of all users, including a possible impostor, are available. This could
be one of the main problems affecting existing schemes in meeting the European
standards for acceptable commercial biometrics; that is, a false acceptance rate
(FAR) of less than 0.001% and a false rejection rate (FRR) of less than 1% is necessary
[77].
A generic model is proposed in this thesis that attempts to define most continuous
authentication (CA) scenarios and CBAS. The model is based on the detection
capabilities of CA scenarios and CBAS, to better identify and understand the
characteristics and requirements of each type of scenario and system. This model
pursues two goals: the first is to describe the characteristics and attributes of
existing CBAS, and the second is to describe the requirements of the different
scenarios that a CBAS can be represented with. Also, we identify the main issues
and limitations of existing CBAS, observing that all of the issues are related to
training data. Finally, we consider a new application for CBAS that does not
require any training data, either from intruders or from valid users, in order to
make CBAS more practical.
3.2 CBAS Model
To date, CBAS have only been described by the techniques and algorithms that
are used to detect an impostor and to decide whether or not to raise an alarm.
Therefore, there is a need to build a generic model to ensure the identification of
common characteristics and attributes of CBAS. Traditional authentication
mechanisms, such as user name and password, verify the user only at the start of
a session and do not detect throughout the session whether or not the current
user is still the initial authorised user. The CBAS is designed to overcome this
limitation by continuously checking the identity of the user throughout the session,
based on physical traits or behavioural characteristics of the user.

[Figure 3.1: Continuous Biometric Authentication System Model — user, sensor,
feature extraction, detector, biometric database, and response unit (passive or
active response to the administrator), together with the type of setting
(environment) and whether the user has an existing profile.]

A CBAS can be thought of as a kind of intrusion detection system (IDS). An IDS
monitors a series of events with the aim of detecting unauthorised activities. In
the case of a CBAS, the unauthorised activity is a person acting as an impostor
by taking over the authenticated session of another (valid) user.
CBAS attacks involve impostors or intruders that take over from the valid
user during the session. IDS attacks, however, include both external intrusion
attacks caused by outsiders and misuse attacks caused by insiders. Most
IDS operate at the system or network level, and very little research by the
intrusion detection community has focused on monitoring user-level events;
notable exceptions include the recent work by Killourhy and Maxion [52]. In this
chapter, we use the more specific term, CBAS, to describe a scheme which aims
to detect impostors through the continuous monitoring of a user session.
There are six basic components that describe a typical CBAS (see Figure 3.1):
1. Subjects - initiators of activity on a target system, normally users.
2. Sensor - a device that collects biometric data from the user, either physically
or behaviourally, and translates it into a signal that can be read by an
observer or an instrument, such as a keyboard or camera.
3. Feature extraction - a process of selecting optimum features that model and
represent the user's actions.
4. Detector - compares profiles by analysing the biometric data and then performs
error measurements that may detect the intruder. To determine
the accuracy of the detector, there are two popular CBAS measurements:
false acceptance rate (FAR) and false rejection rate (FRR).
5. Biometric database - stores the biometric data and user actions as
profiles; this might happen during the registration phase.
The CBAS will use the database for comparison with the live data in the
verification phase.
6. Response unit - takes an appropriate response upon detecting an intruder
or impostor. The CBAS has two popular types of response: either a passive
response or an active response.
An additional aspect of the CBAS model is the type of setting (or scenario). A
continuous authentication scenario might be conducted either in an open-setting
environment or a closed-setting environment. We defined the types of setting
environment in the previous chapter, Section 2.3. Each of the six basic CBAS
components is described in detail below.
3.2.1 Subjects
Subjects are initiators of activity on a target system, normally users, either
authorised or unauthorised [20]. An authorised user is allowed to access a system
by providing some form of identity and credentials, and is allowed to interact
with objects of the system during the session. The authorised user can be known
to the system, where the biometric data of that user is registered in the system
as a historical profile. On the other hand, the authorised user can be unknown
to the system, where no biometric data of that user is registered in the system
in advance. The unauthorised user is the second type of user, who does not
have distinctive characteristics or recognition factors and would try to claim the
identity of an authenticated user. The unauthorised user can be an adversary
acting maliciously towards the valid user, or a colluder invited by the valid user
to complete an action on the user's behalf. In the first case, the victim who
suffers from the attack is the end user, but in the second case the victim is
typically the system operator or the owner of the application.
3.2.2 Sensor
The sensor is a device that collects biometric data from the user, either physically
or behaviourally, and translates it into a signal that can be read by an observer
or an instrument, such as a keyboard or camera. The system may have several
sensors. Sensors in this model transform raw biometric data into a form suitable
for further analysis by the detector. The location of the sensor module for
collecting data, which is the process of acquiring and preparing biometric data,
can be centralised or distributed. The data can be collected from many different
sources in a distributed fashion, or it can be collected from a single point using a
centralised approach. The aim of data collection is to obtain biometric data to
keep on record, and to make further analysis of that data possible. The quality
and nature of the raw data are significantly affected when the sensors used during
registration and authentication differ [87]. The sensors are based on one or more
physical traits or behavioural characteristics [55] for uniquely recognising humans.
The physical type includes biometrics based on stable body traits, such as
fingerprint, face, iris, and hand [71]. The behavioural type includes learned
movements such as handwritten signature, keyboard dynamics (typing), mouse
movements, gait, and speech [103].
The feature set is sensitive to several factors, including [87]:
1. Change in the sensor used for acquiring the raw data between registration and
verification time, for example, using an optical sensor during registration
and a solid-state capacitive sensor during verification in a fingerprint system.
2. Variations in the environment; very cold weather might affect the typing
speed of the user, while dry weather may result in faint fingerprints.
3. Improper user interaction, such as incorrect facial pose during image
acquisition, or drooping eyelids in an iris system.
4. Temporary alterations to the biometric trait or characteristic itself (e.g.,
cuts or scars on fingerprints, or injury to some fingers affecting typing speed
on a keyboard).
Behavioural biometric data can be obtained directly from the user via the
keyboard and mouse [32, 80], or by indirect profiling of the operating system
and its applications, such as call-stack data operations [25]. Each of the physical
and behavioural sources has its own advantages and disadvantages. The
disadvantages of the physical type are that it is considered an intrusive
interruption method and that the cost of equipment for implementing physical
biometric systems is very high. Other disadvantages of physical biometric
methods are that they suffer from practical attacks [99] and require regular
replacement [86]. Nevertheless, a physical biometric refers to any automatically
measurable, robust, and distinctive physical characteristic or personal trait that
can be used to identify an individual or verify a claimed identity; it is the
automatic recognition of a person using distinguishing traits [101].
In contrast, behavioural biometrics can be made totally unobtrusive, to the
extent that the user is not even aware that they are being authenticated, and they
avoid the use of any additional equipment [91]. However, they are considered to
be less accurate than physiological biometrics. Each of these biometric sources is
suitable for specific applications or scenarios.
3.2.3 Feature extraction
Feature extraction involves reducing the amount of biometric data required to
describe a large data set accurately. Analysis with a huge number of variables
requires a large amount of memory and computational power for a classification
algorithm. Feature extraction is the method of choosing a subset of appropriate
features from the biometric data for building robust learning models that can
model or represent the user.
Since each biometric source has different characteristics and attributes, a
suitable feature selection technique must be chosen for the relevant biometric
source. In this component, features that are user-representative (or reflect a
model of a user) are selected based on the type of biometric source. For each
biometric source, representative features are generated and recorded as the input
of the next component (the detector).
3.2.4 Detector
The detector performs error detection that may lead to intruder detection, based
on the biometric data gathered by the CBAS sensor. The detector software might
be implemented on the client computer, either at the system level (with no
knowledge of the application) or within the application itself by adding some
functionality to that application. Alternatively, the detection algorithm might run
on a server. The CBAS detector is generally the most complex component of the
CBAS. The detector operates in two modes [3]: registration mode and
identification/verification mode. The operation of each mode consists of three
stages. In the first stage, a data capturing process is conducted by a sensor
module, which captures all biometric data and converts the raw data into a more
organised and meaningful form. The data then passes to the data processing
stage, where feature extraction is conducted. These features, accumulated over a
pre-defined session period, are processed by a number of algorithms to produce,
for example, a Mouse Dynamics Signature (MDS), Fingerprint Signature (FPS),
or Keystroke Dynamics Signature (KDS) for the user being monitored. Finally,
in registration mode, the generated signature passes directly to a database and is
stored as a reference signature for the enrolled user. In verification mode, after
the signature is calculated during the data processing stage, the detector compares
the live signature captured by the sensor module to the reference signature of the
legitimate user.
There are two major families of detection algorithms in the surveyed systems:
those based on the historical profiles of all users, and those based only on the
valid user's profile. The first method is a type of machine learning algorithm
that requires biometric data about all users, both valid users and possible
impostors, to build a model for prediction. Different algorithms follow this
method, such as nearest neighbour, linear classification and Euclidean distance,
as proposed by Gunetti et al. [32]. This type of classification algorithm is
suitable for a closed-setting, restricted environment; the environment should also
be under access control to stop any user not registered in the system. The system
will perform well when there are many registered users and fail completely when
there is only one.
The second family is a type of machine learning algorithm that requires
biometric data about only a single target class in order to build a model that can
be used for prediction. This type of classification algorithm tries to distinguish
one class of objects from all other possible objects by learning from a training set
containing only objects of that class. The method is different from, and more
difficult than, the first classification method, which tries to distinguish between
two or more classes with a training set containing objects from all of the classes.
This class of algorithm is suitable for the open-setting environment, where any
user can use the system.
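The contrast between the two families can be sketched with toy detectors over a single timing feature; the user names, sample values and tolerance below are illustrative assumptions, not data from the surveyed systems.

```python
# Multi-class family: needs labelled samples from every user (closed setting).
def nearest_mean_classifier(profiles, sample):
    """profiles: {user: [samples]}. Predict the user whose mean is closest."""
    means = {u: sum(v) / len(v) for u, v in profiles.items()}
    return min(means, key=lambda u: abs(means[u] - sample))

# One-class family: needs samples from the valid user only (open setting).
def one_class_detector(valid_samples, sample, tolerance=20):
    """Accept the sample if it lies within `tolerance` of the valid user's mean."""
    mean = sum(valid_samples) / len(valid_samples)
    return abs(sample - mean) <= tolerance

profiles = {"alice": [100, 105, 98], "bob": [150, 160, 155]}
print(nearest_mean_classifier(profiles, 103))      # 'alice'
print(one_class_detector(profiles["alice"], 103))  # True: consistent with Alice
print(one_class_detector(profiles["alice"], 158))  # False: likely an impostor
```

The multi-class detector fails if an unregistered user appears, whereas the one-class detector never needs impostor data, which is why only the latter fits the open setting.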
The previous algorithms are used for decision-making in order to detect the
attacker in either online or offline mode. Online mode can be characterised as
real-time (or near real-time) detection, and offline mode as retrospective (or
archival) detection. Online detection is necessary for detecting the intruder in
real time or near real time, when the system recognises any change in the
biometric data of the authenticated user. The surveyed systems in this group can
also be run in retrospective (or archival) mode. Offline mode is used when it is
unnecessary to detect the intruder in real time.
The location of the data processing can be centralised within a particular
location and/or group, or distributed across multiple computers. Data processing
is a term used to describe the process of transforming the raw biometric data into
suitable data, and summarising and analysing the biometric data.
The accuracy of the system is determined in this component. Accuracy is
determined by comparing the measurement against the true or accepted value.
The false acceptance rate (FAR) measures the likelihood that the biometric
security system will incorrectly accept an access attempt by an unauthorised
user. A system's FAR is typically stated as the ratio of the number of false
acceptances to the number of identification attempts. The false rejection rate
(FRR) is the second measure: the likelihood that a biometric security system
will incorrectly reject an access attempt by an authorised user. A system's FRR
is typically stated as the ratio of the number of false rejections to the number of
identification attempts.
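As a minimal sketch, the two rates can be computed from labelled authentication attempts, using the common convention of dividing false acceptances by the number of impostor attempts and false rejections by the number of genuine attempts; the attempt counts below are invented for illustration.

```python
def far_frr(attempts):
    """attempts: list of (is_authorised, was_accepted) pairs.
    FAR = false acceptances / impostor attempts;
    FRR = false rejections / genuine attempts."""
    impostor = [a for a in attempts if not a[0]]
    genuine = [a for a in attempts if a[0]]
    far = sum(1 for _, accepted in impostor if accepted) / len(impostor)
    frr = sum(1 for _, accepted in genuine if not accepted) / len(genuine)
    return far, frr

# 100 genuine attempts (5 wrongly rejected), 1000 impostor attempts (1 wrongly accepted)
attempts = ([(True, True)] * 95 + [(True, False)] * 5
            + [(False, False)] * 999 + [(False, True)] * 1)
far, frr = far_frr(attempts)
print(f"FAR = {far:.3%}, FRR = {frr:.1%}")  # FAR = 0.100%, FRR = 5.0%
```

This hypothetical system would fail the European commercial standard cited above (FAR below 0.001% and FRR below 1%) on both counts.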
3.2.5 Biometric database
The biometric database is a repository containing the profiles of users as historical
data collected during the registration phase. The profiles can contain the trait
information of the users or the characteristics of their behaviour. This storage is
for registered users who have training data, and the CBAS will use the database
for comparison with the live data in the verification phase. The location of the
biometric database can be on the client or the server, depending on the
requirements of the system.
3.2.6 Response unit
The response unit produces an appropriate response upon detecting an intruder
or impostor. The CBAS has two main types of response: either a passive response
or an active response. A passive system will typically generate an alert to notify
an administrator of the detected activity. An active system, on the other hand,
performs some response beyond generating an alert; such systems minimise the
damage of the intruder, for example by terminating network connections or ending
the session.
The preceding description of the CBAS model components helps to identify
the most common characteristics and attributes of CBAS. Thus, we define the
common characteristics of CBAS, which can help to classify CBAS into different
classes and to differentiate between continuous authentication scenarios. Figure
3.2 gives an overview of the common characteristics of CBAS; these characteristics
are explained in Sections 3.3 and 3.4.
3.3 Continuous Authentication Scenarios
There are many scenarios and situations that require a continuous user
authentication approach. Here we consider some sample scenarios. These
scenarios cover a variety of possible situations and have different characteristics.
After describing the scenarios, we discuss their similarities and differences based
on the common characteristics described above in the generic model.
[Figure 3.2: Characteristics of CBAS]

Scenario 1 Consider a computer-based exam in a controlled environment. Such
exams are traditionally conducted in closed-setting environments and
the intruder is likely to come from inside. This form of vulnerability is
known as an insider threat in the computer security environment [79].
The location of data collection in this scenario might be considered
centralised, as it occurs in a closed-setting environment, and the
location of data processing would be centralised on a computer server.
The system would verify the student at the start of the exam, but the teacher
or instructor cannot be sure whether the exam has been completed
only by the valid student. Threats include a substitute student who
completes the exam on behalf of the valid student who is already au-
thenticated at the start of the exam. The system needs to continually
check the identity of the student until the end of the exam.
Scenario 2 In an online banking system, transactions are typically made in an
open-setting environment and the intruder is likely to come from
outside. This form of vulnerability is known as the outsider threat. The
location of data collection in this scenario might be distributed, as it
occurs in an open-setting environment, and the location of data
processing would be centralised on a computer server. The bank
administration normally secures the communication channels between
the user and the bank's server in order to avoid threats to the network
traffic. However, threats can occur at the user level when the system
authenticates the user at the start of the session and the user later
leaves without logging off from the session. An intruder can then take
over and conduct transactions on behalf of the valid user who is already
authenticated. The system accepts the whole transaction as performed
by the valid user.
Scenario 3 Computer-based TOEFL exam scenario. For a student taking the
TOEFL exam, it is difficult to obtain his/her typing data in advance
of the real exam. The location of data collection in this scenario
might be considered distributed, as it occurs in an open-setting
environment, and the location of data processing would be centralised,
such as on a computer server. This scenario typically takes place in
an open-setting environment, but the system is unable to collect
training biometric data for both the valid students and possible
impostors in advance to compare with live biometric data. The threats
here are similar to those in Scenario 1: the teacher or instructor cannot
be sure whether the exam has been completed only by the valid student.
Threats include a substitute student who completes the exam on behalf
of the valid student who is already authenticated at the start of the
exam. The main difference between this scenario and Scenario 1 is that
in Scenario 1 we can obtain the pre-defined typing model from the
students, as the exam can be conducted in a closed-setting environment.
In this scenario, it is difficult to obtain the pre-defined typing model
in advance from all of the students who may take the TOEFL exam.
The characteristics of the above three scenarios differ in several ways.
1. The system has training data for the valid user and the impostor in the
closed-setting environment of Scenario 1. The system will try to compare
the live data, or new classes, with the historical profiles, or known classes,
by matching the closest profile or class. In Scenario 2, it is impossible in
practical terms to collect data from all impostors, which means the data is
not labelled for the impostor. In Scenario 3, the data is not labelled for
either the valid user or the impostor.
2. The systems in Scenarios 1 and 3 do not need to detect the impostor
during the session but can detect the impostor later, when the exam is
finished. In Scenario 2, it is necessary to detect the intruder throughout the
session, in actual time, before the intruder can complete a malicious action.
3. In Scenarios 1 and 3, the authorised user wants to involve a non-authorised
user and invites them to complete the session; this is called a colluder. The
victim in that case might be the instructor or teacher of the course (the
system operator). In Scenario 2, however, the authorised user wants to
prevent anyone from completing the session on their behalf, and the type of
non-authorised user in that case would be the adversary.
4. The biometric source type is not the same in all scenarios. Usually the data
type in Scenarios 1 and 3 is keystroke dynamics. In Scenario 2, most user
actions are based on mouse dynamics.
Table 3.1 describes different characteristics of the above scenarios. Most of the CBAS characteristics described in figure 3.2 can describe the CA scenarios. (An exception is accuracy, as it is always involved in measuring CBAS schemes.) The detection principle is one characteristic that might effectively differentiate the CA scenarios.

Scenario                    | Computer based exam   | Online Banking (public Internet Cafe) | TOEFL exam
Training data               | valid user & impostor | valid user                            | None
Authorised user             | Known                 | Known                                 | Known
Impostor                    | Colluder              | Adversary                             | Colluder
Victim type                 | Operator              | End user                              | Operator
Biometric source type       | Keystroke             | Mouse                                 | Keystroke
Detection time              | Non real-time         | Real-time                             | Non real-time
Location of data collection | Centralised           | Distributed                           | Distributed
Location of data processing | Centralised           | Centralised                           | Centralised

Table 3.1: Requirements of different scenarios of CBA.

The feasibility of detecting the impostor depends on the training data, for which there are three possibilities: 1) training on biometric data from both valid users and possible impostors, which fits the closed-setting environment; this resembles a multi-class classification algorithm. 2) Training on biometric data from the valid users only, which fits the open-setting environment; this resembles a one-class classification algorithm based on the historical profile of the valid user. 3) It is not feasible to use training biometric data from either valid users or possible impostors, which also fits the open-setting environment; this resembles a one-class classification algorithm that is not based on any historical profile.
3.4 Existing CBAS
The existing CBAS schemes can be described according to the training data they require from users. The first class of CBAS requires training data for both intruders and valid users; the second class requires training data only for valid users. The characteristics of the two classes of CBAS are described in detail below.
3.4.1 Class 1
In this class, the identity of the user who initiates the session is known, as well as the identities of all possible impostors or colluders. The characteristics of Class 1 are summarised below.
1. This class requires the CA scenario to be in a closed setting and restricted
environment. The environment should prevent any user not registered in the
system from gaining access.
2. The identity of an authorised user "who initiates the session" is known and
this user has training data registered in the database in advance.
3. The unauthorised user such as an impostor or intruder would try to claim
the identity of the authorised user throughout the session and, it is assumed,
will have training data registered in the database in advance.
4. The unauthorised user in this class might be an adversary or a colluder. An adversarial user may deliberately act maliciously towards the valid user. This may happen when the authenticated user is harmed by a malicious person, or forgets to log off at the end of the session, in which case the malicious person may conduct actions or events on behalf of the valid user. Alternatively, a colluding user may be invited by the valid user to complete an action on that user's behalf. The victim in this case would be the system operator or the owner of the application. We note that collusion is not always forbidden; however, below we provide an example where it would be desirable to detect collusion.
5. Labelled normal data from the valid users and anomalous data from possible impostors should be used to build the detection model. This approach is similar to a multi-class classifier that learns to differentiate between all classes in the training data. The classifier is then used to predict the class of an unseen instance by matching it to the closest known class.
This class could be suitable for scenarios held in restricted environments where the biometric data of both valid users and possible impostors is available. All of these characteristics of Class 1 are applicable to the computer-based exam scenario (Scenario 1) described above in section 3.3.
There are a number of examples of existing CBAS schemes that fall under Class 1. Gunetti et al. [32] created profiles for each user based on their typing characteristics in free text. They performed a series of experiments using the degree of disorder to measure the distance between the profiles of known users, to determine how well such a measure performs when assigning unidentified users to known profiles.
Pusara et al. [80] applied a similar approach using mouse movements to compare a test sample to every reference sample in the database. In that case, the learning algorithm used the training data for all users to determine decision boundaries that discriminate between these users by matching the closest user.
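To make the Class 1 decision rule concrete, the following sketch trains a minimal nearest-centroid multi-class classifier on every enrolled user and assigns an unseen sample to the closest known user, in the spirit of the schemes above. All digraph latencies here are hypothetical and do not come from the GP dataset; the centroid-plus-Euclidean-distance rule is one simple instance of multi-class classification, not the exact algorithm of [32] or [80].

```python
# Nearest-centroid multi-class classification over digraph latencies (ms).
# All timing values below are hypothetical, for illustration only.

def centroid(samples):
    """Mean feature vector of a user's training samples."""
    n = len(samples)
    return [sum(s[i] for s in samples) / n for i in range(len(samples[0]))]

def classify(profiles, test_vector):
    """Return the enrolled user whose centroid is closest (Euclidean)."""
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5
    return min(profiles, key=lambda user: dist(profiles[user], test_vector))

# Training data: latency vectors for the digraphs ("er", "re", "in"), per user.
training = {
    "alice": [[110, 140, 95], [115, 135, 100]],
    "bob":   [[180, 210, 160], [175, 205, 165]],
}
profiles = {user: centroid(samples) for user, samples in training.items()}

print(classify(profiles, [112, 138, 97]))   # closest to alice's centroid
print(classify(profiles, [178, 208, 162]))  # closest to bob's centroid
```

Note that this rule always names some enrolled user, which is exactly why Class 1 only works in a closed setting: an impostor with no profile would simply be matched to whichever registered user they most resemble.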
3.4.2 Class 2
The main difference between Class 1 and Class 2 is that in Class 2 we do not assume that a profile is available for the impostor. That is, the identity of the authorised user who initiates the session is known (as is their corresponding profile, which is stored in the database), but no profile is available for the impostor. The characteristics of this class can be summarised as follows.
1. This class requires the CA scenario to be in the open setting in a non-
restricted environment. The environment could be in a public location where
any user can use the computer system.
2. The user who is authorised to use the system at the start of the session could
be a known user (as in Class 1).
3. The authorised user who is already authenticated at the start of the session
should have training data registered in the database in advance (as in Class
1).
4. The training data for the unauthorised user, such as an impostor or intruder, who would try to claim the identity of the authenticated user during the session, is not available, and it is not possible to collect such data in advance of the prediction time.
5. The unauthorised user in this class would be an adversary user and the victim
in this case would be the end user.
6. Labelled normal data from the valid users should be used to build the detection model. This approach is similar to a one-class classifier that learns from a single class in the training data and then predicts the class of an unseen instance by deciding whether or not it is related or similar enough to the training class. Since systems in Class 2 do not require labels for the anomaly class, they are more widely applicable than multi-class classification techniques. The approach used in such techniques is to build a model of the class corresponding to normal behaviour and use that model to identify anomalies in the test data.
All of these characteristics of Class 2 are present in the online banking scenario described above as Scenario 2, where only the valid user's training data is available or collected before prediction time. Therefore, this class could be suitable for scenarios held in non-restricted environments, such as public locations, where the biometric data of possible impostors is not available or cannot be collected in advance.
There are a number of examples of existing CBAS schemes that fall under
Class 2. Hempstalk et al. [38] created profiles only for valid users, based on their typing characteristics in free text. They performed a series of experiments using Gaussian density estimation, applying and extending an existing classification algorithm to the one-class classification problem, which describes only the valid user's biometric data. They applied a density estimator to generate a representative density for the valid user's data in the training phase, and then combined the predictions of that representative density with a class probability model when predicting new test cases.
Azzini et al. [6] applied a similar approach using multimodal biometrics, including face recognition and fingerprint, to compare unidentified data against only the valid users' data in the database. At prediction time, the learning algorithm used training data from the valid user alone to reach a decision by comparing the unidentified data with the valid user's class.
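A minimal sketch of the Class 2 setting follows, loosely in the spirit of the density-estimation approach described above but not a reimplementation of it: only the valid user's digraph latencies (hypothetical values) are modelled, here with one independent Gaussian per feature, and a new sample is accepted or rejected against a log-likelihood threshold chosen for illustration.

```python
# One-class acceptance test: model only the valid user's digraph latencies (ms)
# with an independent Gaussian per feature, then accept a new sample if its
# log-likelihood exceeds a threshold. Values and threshold are hypothetical.
import math

def fit_gaussians(samples):
    """Per-feature (mean, std) from the valid user's training vectors."""
    n, d = len(samples), len(samples[0])
    means = [sum(s[i] for s in samples) / n for i in range(d)]
    stds = [max((sum((s[i] - means[i]) ** 2 for s in samples) / n) ** 0.5, 1.0)
            for i in range(d)]  # floor std at 1 ms to avoid degenerate fits
    return means, stds

def log_likelihood(model, x):
    means, stds = model
    return sum(-math.log(s * math.sqrt(2 * math.pi)) - (v - m) ** 2 / (2 * s ** 2)
               for v, m, s in zip(x, means, stds))

def accept(model, x, threshold=-20.0):
    """True if the sample is plausible enough to come from the valid user."""
    return log_likelihood(model, x) >= threshold

valid_user = [[110, 140, 95], [115, 135, 100], [112, 138, 98]]
model = fit_gaussians(valid_user)

print(accept(model, [113, 137, 97]))   # near the profile -> accepted
print(accept(model, [190, 220, 170]))  # far from the profile -> rejected
```

Unlike the Class 1 sketch, nothing here names an impostor: the model can only say "this does not look like the valid user", which is what makes the approach viable in an open setting.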
3.4.3 Limitations in the current CBAS
Most previous CBAS schemes are mainly concerned with the accuracy of the system in reducing false alarms. However, these schemes do not consider issues that might affect their practicality in different real-world applications and continuous authentication scenarios. All previous schemes require training data from both intruders and valid users (Class 1), or from valid users alone (Class 2). Limitations related to the training data which prevent CBAS from being applied practically in real-world applications include the following.
• The existing schemes require a historical profile, either from legitimate users or from possible attackers, to be available or collected before prediction time. In some cases it is impossible to obtain the user's biometric data in advance of detection time, such as in a computer-based TOEFL exam.

• Some schemes require training samples from the user in the registration phase in order to build a profile of that user and apply it in the testing or comparison phase. This is likely to present a severe inconvenience for some users.

• User behavioural biometric data varies between the registration and testing profiles. This variability could be due to the following: a) the valid user's physical mood is not the same at all times and could well differ between the registration and testing phases, which could affect the user's typing speed, for example; as a result, the false rejection rate can increase. b) The typing speed of the valid user changes over time, which can affect the stability of keystroke dynamics systems.

• User behaviour varies from one context to another, so every context needs a new historical profile for comparison in that context. For example, typing speed in an online exam may not be the same as typing speed when writing an email.

• Comparison between a new profile and the historical profiles in the database takes time and can delay detection.
3.5 A new class for CBAS
In this section, we describe a new application for CBAS which, in a sense, is the most difficult case. In this application, we assume that no profile information is available for any user, authorised or not, at the beginning of the session. We do, however, assume that the user who initiates the session is authorised to do so. The challenge here is to build a profile of this authorised user while, at the same time, trying to decide whether or not the session has been taken over by an impostor. A summary of the characteristics of the new application for CBAS follows.
1. This class requires the CA scenario to be in the open setting and in a non-
restricted environment. The environment should be in a public location so
that any user can use the computer system (as in Class 2).
2. While the identity of the (authorised) user who initiates the session may be
known, no prole for this user is available prior to the commencement of the
session.
3. Similarly, the training data for the unauthorised user is not available or
cannot be collected before the prediction time.
4. The unauthorised user in this class would be an adversary user, and the victim in this case of attack would be the end user (as in Class 2).
5. There are no labels for either normal or anomalous data with which to build the detection model in this class. This approach is similar to change point detection [50], which learns from the data on the fly. It considers the probability distributions from which data in the past and present intervals are generated, and regards the target time point as a change point if the two distributions are significantly different. Other methods may also be applicable to the new application.
All of these characteristics of the new application are present in the computer-based TOEFL exam scenario described above as Scenario 3. In this class, the system determines whether the biometric data in the testing phase is related to one user or two by trying to identify any significant change within the biometric data. There are three main challenges associated with this class.
1. How much biometric data must be available before it is possible to identify a significant change in the biometric data related to the impostor or intruder?
2. How much time does the system need to detect the impostor?
3. How can the system determine the start and end of an impostor's or intruder's activities?
While there are likely to be a number of existing techniques that are potentially useful for the new class of CBAS, we note that most of the characteristics of this class are similar to the change point detection problem, and here we focus the discussion on assessing the applicability of change point detection techniques to the new class of CBAS.

Change point detection is the problem of discovering time points at which properties of time-series data change [50]. It has been applied to a broad range of real-world problems such as fraud detection in cellular systems [66], intrusion detection in computer networks [4], irregular-motion detection in vision systems [51], signal segmentation in data streams [7], and fault detection in engineering systems [27]. There is a clear need to develop schemes for the new class of CBAS based on change point methods. Various approaches to change point detection have been investigated within this statistical framework, including threshold analysis [61, 40, 47], CUSUM (cumulative sum) [4] and GLR (generalised likelihood ratio) [34, 33].
New schemes associated with change point algorithms may overcome the current limitations of CBAS identified in Section 3.4.3. Specifically, change point analysis can be applied without the need for training data. Schemes based on change point detection algorithms detect the attacker from the data itself and, therefore, have the potential to be faster. In chapter 7 we will perform a practical evaluation of the applicability of change point detection techniques for CBAS, using the results from chapters 5 and 6.
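As an illustration of how a CUSUM-style change point test could operate on a keystroke stream without any pre-collected training data, the sketch below flags the point at which a sustained upward shift in digraph latency appears. The latencies, the slack parameter k and the threshold h are all hypothetical, and this is a deliberately simplified one-sided CUSUM, not the scheme evaluated in chapter 7.

```python
# One-sided CUSUM over a stream of digraph latencies (ms): flag a change point
# when the cumulative deviation above a running baseline exceeds a threshold.
# Baseline estimation, slack (k) and threshold (h) values are illustrative only.

def cusum_change_point(stream, k=10.0, h=60.0):
    """Return the index at which a sustained upward mean shift is flagged,
    or None. The baseline is estimated from the first few observations,
    mimicking a system with no pre-collected training data."""
    warmup = 5
    if len(stream) <= warmup:
        return None
    baseline = sum(stream[:warmup]) / warmup
    s = 0.0
    for i, x in enumerate(stream[warmup:], start=warmup):
        s = max(0.0, s + (x - baseline - k))  # accumulate only sustained excess
        if s > h:
            return i
    return None

# First 10 values: one typist; last 6: a noticeably slower typist.
latencies = [100, 105, 98, 102, 99, 101, 97, 103, 100, 104,
             180, 175, 185, 190, 178, 182]
print(cusum_change_point(latencies))  # 10: where the slower typist takes over
```

The appeal for the new class is visible in the structure of the code: nothing is learned before the session starts, and the first of the three challenges above corresponds directly to the choice of the warm-up length and threshold.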
Table 3.2 summarises some of the differences between the first and second classes and the new class.

Characteristics   | Class 1                                    | Class 2                  | New class
Type of environment | Closed                                   | Open                     | Open
Training data     | Authorised / unauthorised user             | Authorised user          | None
Unauthorised user | Adversary / colluder                       | Adversary                | Adversary / colluder
Victim type       | System operator / owner of the application | End user                 | System operator / End user
Algorithm type    | Multi-class classification                 | One-class classification | Change point detection (potentially)

Table 3.2: The differences between the first and second classes and the new class.

3.6 Conclusion
In this chapter we analysed existing continuous biometric authentication schemes and described sample continuous authentication scenarios. We identified the common characteristics and attributes of the generic model of CBAS. To date there is no CBAS deployed in real-world applications, probably because the existing systems lack practicality. We observed that the main limitations preventing CBAS from being applicable in real-world applications relate to the training data: the requirement for training data to be available in advance; the many training samples required; the variability of the behavioural biometric between the training and testing phases at comparison time; and the variability of a user's behavioural biometric from one context to another. Finally, the chapter introduced a new class of CBAS, associated (potentially) with change point detection algorithms, that does not require training data for either intruders or valid users and can therefore overcome the identified limitations of existing CBAS.
In chapter 7 we will consider this new class, which does not depend on training data, using the results of chapters 5 and 6. The new class can overcome the identified limitations associated with existing CBAS; the new scheme is capable of distinguishing a user accurately without the need for any predefined user typing model a priori.
Chapter 4
Dataset Analysis
To gain confidence in a system's ability to detect impostors based on typing data, or to evaluate the performance of such a system, it is highly desirable to obtain real data. The use of real typing data allows us to analyse the usability of the proposed techniques with typing data from real-world users as it actually occurred on their computers. However, collecting a typing dataset is not an easy task, as it involves considerable effort to set up a process that collects the data reliably and accurately from different users. It is also time-consuming to collect data from different users over different times; for example, Gunetti and Picardi collected their own real data (the "GP dataset") over six months. To our knowledge, the only publicly available dataset of user typing with free text is the one collected by Gunetti and Picardi [32]. This dataset is popular with researchers in the area of keystroke dynamics and has been used for evaluation by some of them [38, 94]. Furthermore, some researchers evaluate their work on a dataset collected by someone else, which is common in this area. For example, Shimson et al. [93] evaluated their proposed technique on the dataset collected in [38].
It is natural to ask whether experimental results can be compared across different datasets. Evaluating proposed keystroke dynamics techniques on two or more different datasets is not common, because the conditions or environments under which keystroke dynamics are collected from users normally differ. For example, hardware specifications such as the type of timer clock and the type of keyboard differ, which might have some impact on the timing information of keystroke dynamics. Killourhy and
Maxion [53] showed that the detector accuracy of keystroke dynamics depends substantially on the type of dataset used in the evaluation. They gave an example of the performance of two different techniques applied to two different datasets: a neural network's false alarm rate changed from 1.0% to 85.9% from one evaluation dataset to another, and on the same two datasets a k-nearest-neighbour false alarm rate changed from 19.5% to 46.8% [15, 52]. Therefore, evaluating a scheme using a single dataset may be more reasonable than evaluating on two or more datasets [53]. For that reason, most evaluations in keystroke dynamics are based on one dataset [32, 93, 94, 64, 22, 21, 68, 40].
However, applying the proposed technique to only one dataset does not mean the technique cannot be applied to other datasets; it can be applied to two different datasets, but different results may be obtained because the environmental conditions of the two datasets may differ. To validate the techniques presented in chapters 5, 6 and 7, we used the GP dataset. The use of a real keystroke dynamics dataset allows us to analyse the usability of the proposed methods in real-world scenarios of detecting an impostor during a computer session.
4.1 Predefined or free text?
Most research in the field of keystroke dynamics analysis has collected user typing samples as predefined text: the user is given a predefined text and types that text in both the training phase and the testing phase. This collection approach requires the user to type long fixed texts, both in the training phase for building the user's profile and during the authentication phase. Researchers have struggled to analyse relatively short typing samples, which do not provide enough timing information to distinguish between users. With the current state of the art, keystroke dynamics analysis cannot be performed on such very short texts or samples; timing analysis on such texts does not provide enough information to discriminate accurately between valid users and impostors. If relatively long typing texts are accepted, however, keystroke analysis can become a valid method of establishing personal identity [32].

Consequently, predefined or fixed texts are not suitable for continuous authentication analysis. The usable alternative is the typing rhythm that users exhibit during their normal interaction with a computer. In other words, we have to be able to deal with the keystroke dynamics of free text, without putting
any restrictions on users while they type, in either the training or authentication phase. Keystroke analysis performed after the authentication phase should deal with the typing rhythms or keystroke dynamics of whatever the users have typed [32], which is, in a sense, free text. Free text applies when users are free to type whatever they like, representing normal interaction with the computer; keystroke analysis can then be performed on the available information. This means the analysis of free text is dynamic by definition, since the length of the text is not fixed.
4.2 Experimental Setting
Many issues affect dataset collection, including text type, keyboard specifications, skilled or unskilled typists, different timer clocks and environmental conditions. These factors can vary from one dataset to another, which can affect comparisons between two different datasets. Gunetti and Picardi explain in their paper [32] how the experiment to collect user data was set up; all of the information in this section is taken directly from section 4 of [32]. All typing samples were collected through a simple HTML form comprising two fields. In the first field, users had to enter their log-in name. The second field was a text area where volunteers were free to enter whatever they wanted to fill the form. When the form was complete, a submit button was used to send the sample to the collection server. A client-side JavaScript was written to gather the timing information. The sampling data was composed of the time (in milliseconds) at which a key was depressed, together with the ASCII value of the key. The timer had a clock tick of 10 milliseconds (ms).
Forty volunteers provided 15 typing samples each, as described below. In the experiments, these people acted as legal users of a hypothetical system. All participants were native speakers of Italian and were asked to provide samples written in Italian. Gunetti and Picardi found the volunteers in their department, among colleagues and students in their last year of study. Although their typing skills varied, all of them were well used to typing on normal computer keyboards. None of the volunteers were hired or in any way paid for their assistance.
The samples were collected on the basis of the availability and willingness of volunteers over a period of about six months. All volunteers were instructed to provide no more than one sample per day. A few individuals were able to provide their samples on a very regular basis, each working day or every two or three days, and so completed their task in about one month. However, the majority of participants provided their typing samples on a very irregular basis: some samples were provided frequently, while others came every two or three weeks. For some participants, one or two months passed between one typing sample and the next. Samples were provided at any working hour, at the convenience of the participants.
For obvious reasons, they had no control over the way volunteers performed the experiment. Every working day an email was sent to each volunteer to remind them about the experiment, but people were, of course, free to ignore such reminders. Most volunteers used their office computers or their notebook computers to produce the samples. In some cases, the experiment was done by volunteers connecting from home or from computers located in another town. Thus each user may have provided some of their samples using different keyboards. However, they have no way of knowing which samples were provided on which keyboards, and they have not explicitly tested their method using different keyboards. Samples were provided in both Windows and Unix environments, using both Explorer and Netscape browsers.
Volunteers were asked to enter the samples in the most natural way, more or less as if they were writing an email to someone. They were completely free to choose what to write; the only limitations were not to type the same word or phrase repeatedly in order to complete the form, and not to enter the same text in two different samples. They suggested that volunteers write about different subjects in each sample: their job, movies, holidays, recipes, family and so on, anything they liked. They were, of course, free to make typos and to correct them or not. Corrections could be made using the backspace key or the mouse, as preferred. People were free to pause for whatever reason, and for as long as they wanted, when producing a sample. No sample provided by the volunteers was rejected. On average, the gathered samples had a length varying between 700 and 900 characters.
4.3 Dataset Description
Not all users gave permission for their samples to be released to a third party; therefore, the provided dataset contains data for only 21 users, each with 15 typing samples. Each sample contains between 700 and 900 characters and holds all the data recorded by a user in one session.
The keystroke raw data has the form:
60870
65
61040
32
62910
80
63130
85
63570
66
63680
66
...
...
...
Each large number is the absolute time in milliseconds at which the corresponding key was depressed; the key is reported after the large number as a decimal ASCII code. In the example above, key "A" (65 decimal) was depressed at time 60870. The space (32 decimal) was depressed at time 61040, that is, 61040 - 60870 = 170 milliseconds after "A", and so on.
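The decoding just described can be sketched as follows, using exactly the raw (time, ASCII) pairs shown in the sample above:

```python
# Turn the raw (timestamp_ms, ascii_code) pairs into characters and
# digraph (2-graph) latencies: the elapsed time between consecutive key
# depressions. The values below are the ones shown in the sample above.
raw = [(60870, 65), (61040, 32), (62910, 80), (63130, 85),
       (63570, 66), (63680, 66)]

chars = [chr(code) for _, code in raw]
print("".join(chars))  # "A PUBB"

digraphs = [(chars[i] + chars[i + 1], raw[i + 1][0] - raw[i][0])
            for i in range(len(raw) - 1)]
print(digraphs[0])  # ('A ', 170): the space was depressed 170 ms after "A"
```

The same sliding step extends naturally to 3-graphs and 4-graphs by summing consecutive latencies.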
4.4 Data Preprocessing
Data preprocessing not only helps prepare the data for further analysis but also helps improve the quality of the data [35]. Two main preprocessing techniques are discussed in this section: a data transformation method and a data cleaning technique. Data transformation is used to consolidate data into forms appropriate for data mining [45]; it involves combining the original data into a summarised form. Here, the transformation consists of converting the numbers in the keystroke raw data into Italian characters with their duration times. The keystroke raw data contains decimal ASCII codes, as explained in the previous section. These ASCII codes were transformed into the real Italian characters, based on the ISO/IEC 8859-16 standard Italian character set. For example, the keystroke raw data for user 1, sample 1, transformed into Italian characters, begins as follows:
" HO PROVATO A FARE IL .........."
However, a strange character was found for some users, namely users 2, 3, 4, 5, 7, 14 and 15. This character is zero in ASCII, and when we tried to transform it into an Italian character it came up as a null value. The character is thus not interpretable and, if removed from a data sample, it disrupts the timing sequence of the remaining characters. It was therefore decided to remove the data of these users and use only the remaining data in this study, which now contains 14 users with 15 samples each. Table 4.1 shows some characteristics of the users' typing data: the total characters typed by each user over all 15 samples, broken down into lowercase, uppercase, digits, notations and symbols, backspace, enter and null values. Appendix A presents the characteristics of the typing data for the users removed from our study (users 2, 3, 4, 5, 7, 14 and 15), whose data included null values.
After that, we calculated the time for sequences of two characters, three characters and so on for all of the users' data. The output of this step includes the times of 2-graphs, 3-graphs, 4-graphs and so on for all user samples. We then grouped the times into categories in order to see the time distribution for all users. Appendix B presents the time distribution in milliseconds for 10 different users from the dataset. From the table in Appendix B, we can clearly see a low percentage of extreme time values that are not representative of normal user typing behaviour.
Removing noisy data with extreme values is another preprocessing method. These extreme values could have several causes, such as the user being interrupted or eating while typing, whereas our aim is to measure the delay between characters while the user is actually typing. To remove the effect of the outliers present in the collected data, an outlier detection technique should be used.
Typing data characteristics | Users 1, 6, 8, 9, 10, 11, 12, 13, 16, 17, 18, 19, 20, 21 (one value per user)
Total characters            | 12263, 14498, 13950, 13527, 14390, 14283, 15065, 13698, 10763, 15734, 15673, 13761, 13072, 12400
Lowercase                   | (per-user values not recoverable from this extraction)
Uppercase                   | (per-user values not recoverable from this extraction)
Digits                      | (per-user values not recoverable from this extraction)
Notations and symbols (#, ...) | (per-user values not recoverable from this extraction)
Backspace                   | (per-user values not recoverable from this extraction)
Enter                       | (per-user values not recoverable from this extraction)
Null                        | 0 for every user

Table 4.1: Characteristics of users' typing data.

Several studies have used different methods for removing outliers in keystroke
dynamics data. Joyce and Gupta [46] discarded outliers greater than three standard deviations from the mean. Umphress and Williams [100] used a different method, requiring each digraph to fall within 0.5 standard deviations of its mean to be considered valid. Dowland et al. [22] filtered their dataset by discarding any times less than 40 ms or greater than 750 ms. Later, Yu and Cho [106] used a new approach to deal with outliers in a keystroke dynamics dataset: if the standard deviation of a given digraph was greater than its mean, the upper and lower 10% of its values were removed from the dataset. Dowland and Furnell [23] used a method identical to Dowland et al. [22], except that the low-pass filter was reduced from 40 ms to 10 ms to prevent some valuable digraphs from being removed. To filter our dataset, we used the same approach as Hempstalk [38], discarding any times greater than 500 ms in order to keep the lower times, which carry useful information about the user.
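The 500 ms cut-off can be sketched as a one-line filter over (digraph, latency) pairs; the sample latencies below are hypothetical:

```python
# Low-pass outlier filter used in this thesis: discard any digraph latency
# greater than 500 ms, keeping the small times since they carry information
# about the user. The sample latencies are hypothetical.
CUTOFF_MS = 500

def filter_latencies(digraphs, cutoff=CUTOFF_MS):
    """Keep (digraph, latency) pairs whose latency does not exceed the cutoff."""
    return [(g, t) for g, t in digraphs if t <= cutoff]

sample = [("er", 120), ("re", 95), ("in", 2600), ("te", 480), ("ch", 5100)]
print(filter_latencies(sample))
# [('er', 120), ('re', 95), ('te', 480)] -- the 2600 ms and 5100 ms pauses go
```

Unlike the two-sided filter of Dowland et al. [22], no lower bound is applied, so fast, habitual digraphs are never discarded.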
4.5 Preliminary Analysis
We conducted a preliminary experimental analysis in order to gain confidence that
the dataset can be used for further analysis. The dataset was analysed in depth
to see how well users can be distinguished based on some of its characteristics.
First, using a statistical significance approach, we examined all of the user samples
in the dataset and identified the features, or characteristics, typed most frequently
by all of the users. Table 4.2 shows an example list of the 30 most frequent
features, together with the average time each feature was typed by all the users in
the dataset. The average time of a feature represents the middle value of its times:
it was calculated by combining the recorded times for that feature and computing
their mean. Second, we repeated the same analysis, but this time we calculated
the average time of each feature for each user separately. Table 4.3 shows an
example of the 30 most frequent features for different users. This helps to assess
the efficacy of these features in distinguishing users. From the table, we can clearly
see that the times of these features differ from user to user. This indicates that
users can be distinguished based on some features in the dataset.
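The frequency ranking and per-user averages described above can be sketched as follows (illustrative Python; the thesis used SQL and MATLAB, and the data layout, a mapping from user to a list of (digraph, time) pairs, is our assumption):

```python
# Sketch of the preliminary analysis: rank digraphs by frequency across
# all users, then compute one digraph's average typing time per user.
from collections import Counter

def most_frequent(samples, top_n=30):
    """Return the top_n digraphs by total occurrence across all users."""
    counts = Counter(dg for events in samples.values() for dg, _ in events)
    return [dg for dg, _ in counts.most_common(top_n)]

def avg_time_per_user(samples, digraph):
    """Average typing time of one digraph, computed per user."""
    result = {}
    for user, events in samples.items():
        times = [t for dg, t in events if dg == digraph]
        if times:
            result[user] = sum(times) / len(times)
    return result

samples = {
    "user1": [("ER", 150), ("ER", 170), ("RE", 120)],
    "user2": [("ER", 210), ("RE", 140), ("ON", 230)],
}
top = most_frequent(samples, top_n=2)
er_avgs = avg_time_per_user(samples, "ER")  # differs between the users
```

The per-user averages play the role of the rows of Table 4.3; differing values across users are what make a feature useful for distinguishing them.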
Finally, we repeated the same analysis again, but this time we calculated the
average time of each feature for each user sample separately. Table 4.4 shows an
example of the average time of the digraph ER over all of the user samples. From
the table, we can clearly see that the average time of ER varies considerably over
samples for some users, such as users 1 and 8. However, some users, such as
user 18, type similarly and consistently over all samples.
The overall observation from the previous analysis is that the most-frequent-feature
approach can be used to distinguish users. In chapter 5, we carry out further
extensive experiments and analysis of this approach. In particular, we evaluate
different numbers of these features in order to find the optimum number, which
helps to show the effectiveness of the approach for distinguishing users. We also
compare this approach with our other proposed approaches and with some of the
existing approaches.
4.6 Experimental Framework
The experimental methodology adopted by this thesis in chapters 5, 6 and 7 is
motivated by the requirements for extensive experimentation. In this experimental
framework, we describe the simulation procedures used to create features that
represent the user typing behaviour, evaluated using a clustering performance
measure. We also describe the experimental procedure used to obtain the
user-independent threshold, and the factors that affect the accuracy of that
threshold. Then, we describe the simulation procedure that applies the
user-independent threshold in order to separate impostor data from authenticated
user data in real time or near real time using window mechanisms, along with a
description of the accuracy measurements used to conduct all of the experiments.
First, the experimental methodology to select different features has two main
phases:
1. Selecting candidate features: This phase takes as input users' typing data,
recorded as ASCII characters and associated key-press times in text files.
These text files are exported to MySQL Server. We then preprocess the
users' typing data by running sets of SQL queries that generate different
tables, one per feature selection technique. Each table represents one
feature selection technique and describes all of the user samples. These
tables are then used in the next phase for evaluation.
Feature (2-graph)  Frequency  Avg time (ms)
ER 3984 181
RE 3551 136
ON 3166 239
TO 3102 230
CO 2845 166
NO 2614 193
RI 2609 213
DI 2587 159
IN 2499 210
EN 2435 190
AN 2435 255
CH 2385 195
AR 2356 228
TE 2341 143
TA 2310 288
RA 2216 175
NT 2202 251
TI 2179 240
HE 2081 176
IO 2034 142
LA 2017 199
AT 1985 275
ST 1971 252
UN 1887 149
AL 1880 228
ES 1862 289
PE 1831 149
OR 1813 264
IA 1813 156
Table 4.2: Most frequent features in the dataset
Table 4.3: Average time of the most frequent characteristics for different users
Table 4.4: Average time of ER for all of the user samples
2. Evaluating candidate features: The tables obtained by the feature selection
techniques are evaluated to find user-representative features. We use the
k-means clustering algorithm in MATLAB, which is described in chapter 5,
section 5.3.2.1. K-means is used to find out whether the selected features
reflect a normal typing pattern of a user. The notion is that typing data
representing a single user should be grouped into one cluster when
user-representative features are used, since clustering naturally groups data
that share similar behaviour (based on features). Because the clustering
result is ambiguous, in that some clusters contain samples from different
users, it is necessary to rely on criteria able to extract relevant information
for distinguishing users and assigning users to clusters. The algorithm to
calculate these criteria was implemented in MATLAB and is explained in
more detail in chapter 5, section 5.3.2.2; the decision to assign each user to
a different cluster is based on the frequency of the user's samples in the
cluster. After that, a cluster evaluation algorithm is used to measure the
performance of the feature selection techniques: it divides the number of
correct user samples in the cluster by the total number of user samples in
the dataset. The output of this phase is the optimum feature set (table)
that represents the user typing behaviour effectively. That table of
optimum features is then exported to the next phase to obtain the
user-independent threshold.
Second, the experimental methodology to obtain the user-independent threshold
was implemented in MATLAB. The experiment is designed to find a threshold
value suitable for the whole set of users, distinguishing them without requiring
any prior training data for a user. We analyse the similarities between two
samples from the same user's data and the differences between two samples from
different users' data. We applied distance measures to distinguish the samples of
two different users. The distance measure is computed between pairwise samples
from the same user's data, which describes the normal distance between pairwise
samples for the genuine user. The distance measure is also computed between
two samples from different users' data, which describes the distance between
genuine users and impostors. After that, we compare all of the resulting distance
values to different threshold values in order to find the cross-point, or equal error
rate (EER), which in our experimental work is the user-independent threshold
value. Then, the consistency of the threshold value is assessed among different
groups of users by calculating the variance, or the standard deviation, of the
threshold value among all groups of users. The experimental methodology is
explained in more detail in chapter 6, section 6.5.2.
The methodology comprises three steps: (1) dividing the data sets into
different subsets; (2) selecting values of interest for each of the factors; and (3)
repeatedly running an evaluation while systematically varying the factors among
the selected values. The experiment observes the effects of factors, or parameters,
including distance type, number of keystrokes, feature set and feature amount, on
the consistency of the user-independent threshold among different groups of
users. The output of this phase is the user-independent threshold value suitable
for a whole set of users. This value is then used in the next phase to detect the
impostor from the authenticated user.
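The EER cross-point described above can be sketched as follows (an illustrative Python stand-in for the MATLAB experiments; the distance lists are made-up toy values, not thesis data):

```python
# Sketch of deriving a user-independent threshold as the EER cross-point.
# Genuine distances come from same-user sample pairs; impostor distances
# come from different-user pairs.

def error_rates(genuine, impostor, threshold):
    # FRR: genuine pairs whose distance exceeds the threshold (rejected).
    frr = sum(d > threshold for d in genuine) / len(genuine)
    # FAR: impostor pairs whose distance falls under the threshold (accepted).
    far = sum(d <= threshold for d in impostor) / len(impostor)
    return far, frr

def eer_threshold(genuine, impostor):
    """Scan candidate thresholds; return the one where FAR and FRR are closest."""
    candidates = sorted(set(genuine) | set(impostor))
    return min(candidates,
               key=lambda t: abs(error_rates(genuine, impostor, t)[0]
                                 - error_rates(genuine, impostor, t)[1]))

genuine = [0.10, 0.15, 0.20, 0.25]   # distances between same-user samples
impostor = [0.40, 0.50, 0.60, 0.70]  # distances between different users
threshold = eer_threshold(genuine, impostor)
```

The consistency check then amounts to computing this threshold per group of users and taking the standard deviation of the resulting values.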
Third, the experimental methodology to detect impostor data among the
authenticated user's data was implemented in MATLAB. The experiment is
designed to find how effective the user-independent threshold is in a new system
aimed at detecting the impostor while minimising the delay of detection. The
experimental method itself comprises three steps: (1) initiate the first window for
an authenticated user, containing a certain number of samples; (2) vary the
typing data from both the authenticated user and the impostor in the second
window; (3) repeatedly run an evaluation while systematically changing the users'
data in the second window. The distance measures are calculated in these steps
by comparing the first window for an authenticated user to the second, or testing,
window. If the distance value between the two windows does not exceed the
predefined threshold, the typing data belongs only to the authenticated user and
the session has not been taken over by an impostor. The experimental
methodology is explained in more detail in chapter 7, section 7.5.2. In the next
section we present the evaluation methodology used to validate the proposed
techniques.
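The window comparison in step (3) can be sketched as follows; the windows are represented here as digraph-to-average-time mappings and the distance function, a mean absolute difference over shared digraphs, is chosen purely for illustration and is not the thesis's measure:

```python
# Sketch of the window mechanism: compare a reference window of the
# authenticated user's samples against each incoming test window; flag a
# takeover when the distance exceeds the user-independent threshold.

def distance(window_a, window_b):
    """Mean absolute difference over the digraphs both windows share."""
    shared = set(window_a) & set(window_b)
    if not shared:
        return float("inf")
    return sum(abs(window_a[dg] - window_b[dg]) for dg in shared) / len(shared)

def is_impostor(reference, test_window, threshold):
    return distance(reference, test_window) > threshold

reference = {"ER": 180, "RE": 140, "ON": 240}  # authenticated user's window
same_user = {"ER": 185, "RE": 138, "ON": 244}
intruder = {"ER": 320, "RE": 260, "ON": 110}
alarm_genuine = is_impostor(reference, same_user, threshold=20)
alarm_intruder = is_impostor(reference, intruder, threshold=20)
```

The detection delay corresponds to how many keystrokes must accumulate in the test window before the comparison is made.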
4.7 Evaluation Methodology
To test the effectiveness and assess the validity of the proposed techniques for
the analysis of continuous user authentication using keystroke dynamics, we need
to identify some performance measurements. The performance of our system is
determined by computing the detection accuracy in terms of error rates.
Detection accuracy is the ability of the technique to correctly identify the impostor
or intruder.
The aim of the system is to maximise the detection accuracy: that is, the
percentage of impostors or intruders masquerading as genuine users that are
detected. The aim is also to minimise the false rejection rate (FRR): that is, the
percentage of genuine users that are identified as impostors or intruders. The
evaluation of the proposed methods comprises three phases: generate a labelled
dataset that includes authenticated users and impostors, run the proposed
technique against the dataset, and analyse the result using the two measurements
FAR and FRR.
There is an advantage in using a labelled dataset: the information about which
data belongs to the authenticated user and which to the impostor is available
before the technique is run. In chapter 5, the detection accuracy is measured by
dividing the number of correct user samples in the right cluster by the total
number of user samples in the dataset. In chapter 6, the detection accuracy is
assessed by investigating the trade-off between false positives (FP) and false
negatives (FN) over different threshold values. This trade-off can be summarised
by the equal error rate (EER), where the false positive and false negative rates
are the same. Also in chapter 6, we investigate the consistency of the threshold
value among different groups of users by calculating the standard deviation of
the threshold value across the groups. A higher standard deviation of the
threshold value among different groups of users indicates that the
user-independent threshold is inconsistent. A lower standard deviation means
that the threshold value is consistent and therefore suitable for the whole set of
users. In chapter 7, the FAR and FRR measurements are applied to validate the
technique, with the aim of detecting the impostor in real time or near real time.
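Computing FAR and FRR from such a labelled dataset can be sketched as follows (illustrative Python; each trial is a (true label, predicted label) pair and the labels themselves are our own naming):

```python
# Sketch of computing FAR and FRR from a labelled dataset: each trial
# records the true identity class and the class the system predicted.

def far_frr(trials):
    impostors = [(t, p) for t, p in trials if t == "impostor"]
    genuines = [(t, p) for t, p in trials if t == "genuine"]
    # FAR: impostors wrongly accepted as genuine users.
    far = sum(p == "genuine" for _, p in impostors) / len(impostors)
    # FRR: genuine users wrongly flagged as impostors.
    frr = sum(p == "impostor" for _, p in genuines) / len(genuines)
    return far, frr

trials = [
    ("genuine", "genuine"), ("genuine", "impostor"),
    ("impostor", "impostor"), ("impostor", "impostor"),
]
far, frr = far_frr(trials)
```

Sweeping the decision threshold and recomputing these two rates traces out the trade-off whose crossing point is the EER.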
4.8 Summary
This chapter described the dataset that we use in this thesis and the experimental
framework for conducting all of the proposed techniques in chapters 5, 6 and 7.
We explained the reasons behind choosing this dataset to evaluate our work and
described the experimental setting used to collect the data. In addition, we gave
an overview of the dataset. Furthermore, the chapter described in detail the
preprocessing phase, including the transformation method and the data cleaning
method. The chapter also described the experimental framework, including the
step-by-step analysis used to conduct all of the experiments in this thesis. Finally,
the chapter proposed the evaluation methodology used to validate the proposed
techniques.
After preprocessing, the dataset is ready for the first experiment: evaluating
different features in order to find user-representative features. In the next
chapter, we propose different feature selection techniques that can represent
users' typing patterns effectively. These techniques consider the highest
statistical significance and also encompass users' different typing behaviours.
Chapter 5
User-Representative Feature
Selection for Keystroke Dynamics
Chapter 3 proposed a model that describes the whole process of continuous
authentication. One of the main components of that model is the feature
extraction component, which is necessary for providing continuous user
authentication. In this chapter, we focus our analysis on this component in order
to find the optimum features that represent the user typing behaviour. We
propose different feature selection techniques that can represent a user's typing
patterns effectively. In particular, the chapter addresses the first sub-question of
Research Question 2 (outlined in chapter 1):
What are the optimum features that are representative of user typing behavior?
The chapter is organised as follows. The next section provides a brief
introduction to the feature selection techniques and our methodology for
evaluating them. Following this, in Section 5.2, we present the proposed feature
selection techniques. In Section 5.3, we describe our evaluation methodology for
the feature selection techniques in more detail. In Section 5.4, we discuss the
experimental results and compare the proposed feature selection techniques with
existing ones. In Section 5.5, we compare fixed features and dynamic features
over different data sizes. Section 5.6 summarises the chapter.
5.1 Introduction
In section 2.5.1 we showed that several studies [44, 32, 21, 28, 48] conclude
that keystroke rhythm often has characteristics, or features, that represent
consistent patterns of user typing behavior; therefore, it can be used for user
authentication. User authentication based on keystroke dynamics is less obtrusive
than many other biometrics and does not require any special hardware. Schemes
can be classified as either static or continuous [65]. Static approaches analyse
typing behavior on a fixed, predefined set of characters (such as a password) for
authentication. They are more robust than simple password matching, but they
do not detect changes to the initial authorized user later in the session.
Continuous authentication approaches, by contrast, monitor and verify the user
throughout the computer session.
Continuous authentication approaches use sequences of characters that users
type during a session as distinguishing features. The n-graph is a popular feature
among existing continuous authentication schemes: it is the time interval between
the first and the last of n subsequent key-presses. Existing approaches use n-graph
(feature) selection techniques to obtain user-representative features that include:
popular-word selection (variable-length n-graphs that are popular in a
language; for instance, "or" or "of") [44];
common n-graph selection (n-graphs often typed by all the users of a system) [32];
least-frequent n-graph selection (n-graphs least frequently typed by all users of
a system) [21].
While these choices exploit the occurrences of n-graphs, their selection criteria
do not guarantee features with strong statistical significance. Furthermore, their
selected features do not inherently incorporate user typing behavior.
This chapter proposes four statistical-based feature selection techniques that
overcome some limitations of the existing ones. The first simply selects the most
frequently typed n-graphs, which we expect to have the highest statistical
significance. The other three encompass users' different typing behaviors.
1. The most frequently typed n-graph selection technique selects a certain
number of highly occurring n-graphs.
2. The quickly-typed n-graph selection technique obtains n-graphs that are
typed quickly. It computes the average time of each n-graph, representing
its usual typing time, and then selects the n-graphs with the least typing
time. This behaviour indicates that quickly typed n-graphs are more
familiar to the user, so we think they could represent consistent patterns of
user typing behavior.
3. The time-stability typed n-graph selection technique selects the n-graphs
that are typed with consistent time. It computes the standard deviation of
each n-graph's times, representing the variance from its average typing
time, and then selects the n-graphs with the least variance. We think
n-graphs typed consistently over time may represent the normal patterns of
user typing behaviour.
4. The time-variant typed n-graph selection technique selects the n-graphs
that are typed with noticeably different times. It computes the standard
deviation of each n-graph's average typing time among all users and then
selects the n-graphs with the largest variance. We think n-graphs typed
variably or inconsistently across different users may distinguish users
effectively.
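The four orderings can be sketched together as follows (illustrative Python with toy numbers; the sketch is simplified so that both the stability and the variance orderings operate on per-user average times, whereas the thesis computes stability per user sample):

```python
# Sketch of the four proposed orderings over 2-graph statistics.
# counts: total frequency per 2-graph; per_user_avgs: one average typing
# time per user for each 2-graph. All numbers are illustrative only.
from statistics import mean, stdev

counts = {"ER": 3984, "RE": 3551, "ON": 3166}
per_user_avgs = {"ER": [150, 210, 180], "RE": [120, 140, 130],
                 "ON": [230, 250, 300]}

# 1. Most frequently typed: highest occurrence first.
most_frequent = sorted(counts, key=counts.get, reverse=True)
# 2. Quickly typed: smallest overall average time first.
quick = sorted(per_user_avgs, key=lambda g: mean(per_user_avgs[g]))
# 3. Time-stability: smallest variation first (consistent typing).
stable = sorted(per_user_avgs, key=lambda g: stdev(per_user_avgs[g]))
# 4. Time-variant: largest variation across users first (distinguishing).
variant = sorted(per_user_avgs, key=lambda g: stdev(per_user_avgs[g]),
                 reverse=True)
```

Each ordering yields the same set of n-graphs in a different rank order, which is exactly how the techniques are compared in the experiments.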
To evaluate the proposed techniques, we analysed whether the selected features
are user representative (that is, whether they reflect a normal typing pattern of a
user). For this purpose, we use 2-graphs (sequences of two characters) as features
throughout this thesis, because they are the basic element of n subsequent
key-presses and occur more frequently than general n-graphs. They are also used
by Gunetti et al. [32], Monrose et al. [64] and Dowland et al. [22, 21] in their
continuous user authentication schemes. Moreover, we use the k-means clustering
algorithm to find out whether the 2-graphs selected by our techniques are user
representative. The notion is that typing data representing a single user should
be grouped into one cluster when user-representative features are used, since
clustering naturally groups data that share similar behavior (based on features).
We use the GP dataset [32], already described in chapter 4, for the experiments.
Our experimental results show that, among the proposed techniques, the
most-frequent 2-graph selection technique most effectively finds
user-representative features: using its selected features, each user's samples group
into a unique cluster. Furthermore, to substantiate our results, we compare this
technique with the three existing feature selection techniques discussed in
chapter 2, section 2.5.1: popular Italian words [44], common n-graphs [32], and
least-frequent n-graphs [21]. We found that the most-frequent 2-graph selection
technique performs better after selecting a certain number of 2-graphs.
5.2 Proposed feature selection techniques
We propose four statistical-based feature selection techniques. Their details are
given below.
5.2.1 Most frequently typed n-graph selection
In general, the sample size of a statistical sample is the number of occurrences,
or frequencies, in the population. A larger sample size leads to increased precision
in the estimates of the various properties of that population. Furthermore,
repeated measurements and replications of independent samples are often required
to reach the desired precision. Thus, the frequency f of a 2-graph in a dataset d
can be used for dataset-specific weighting, denoted f(d, 2-graph). This is a
measure of the 2-graph's significance within d.
A dataset that contains 2-graphs with higher frequencies should receive a
higher grade. A plausible grading technique is to compute a grade that is the sum,
over the dataset, of the matched grades between each 2-graph and the dataset.
Towards this end, a weight was assigned to each 2-graph in the dataset, depending
on the number of occurrences of that 2-graph in the dataset. A score between the
2-graph and the dataset d is computed based on the weight of the 2-graph in d.
The simplest approach is to set the weight equal to the number of occurrences of
the 2-graph in dataset d. This weighting scheme is referred to as the 2-graph
frequency.
Each 2-graph may occur in d more than once in each sample s, so the average
(AVG), the standard deviation (STD), and the combination of the two
(AVG+STD) can be calculated and tested independently in order to see the
impact of these statistics on the process of distinguishing users. The AVG of the
different times of the same 2-graph is the mean time for that 2-graph. For
example, if the user types "ER" 100 times in one sample, we calculate the average
of all 100 times in order to see whether the AVG represents the user's typing
behaviour in that sample. Similarly, the STD of the different times of the same
2-graph measures the variability, or diversity, of the times for that 2-graph: if
the user types "ER" 100 times in one sample, we calculate the standard deviation
of all 100 times in order to see whether the STD represents the user's typing
behaviour in that sample. We also considered the AVG and STD of each 2-graph
together, as two features representing the user's typing behaviour in one sample,
in order to see their combined impact on user representation.
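Computing the AVG, STD and combined AVG+STD values for one 2-graph within one sample can be sketched as follows (illustrative Python; whether the thesis used the population or the sample standard deviation is not stated, so the population form is assumed here):

```python
# Sketch of building AVG, STD and AVG+STD feature values for the
# repeated times of one 2-graph within one sample. pstdev is the
# population standard deviation (an assumption, see lead-in).
from statistics import mean, pstdev

def graph_features(times):
    """Return (AVG, STD, AVG+STD) for the times of one 2-graph."""
    avg = mean(times)
    std = pstdev(times)
    return avg, std, avg + std

er_times = [160, 180, 200, 180]  # "ER" typed four times in one sample
avg, std, combined = graph_features(er_times)
```

Each of the three values can then be used on its own, or together, as the per-sample feature value for that 2-graph.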
5.2.2 Quickly-typed n-graph selection
The aim of this technique is to identify and extract the 2-graphs that are typed
by users very quickly. In general, an average is a measure of the mean value of the
2-graphs; it is a single value meant to typify a list of values. Hence, the AVG
function is used in this technique to represent the behavior of the user's typing of
2-graphs, since the user frequently types the same 2-graph more than once in one
sample. The average of each 2-graph was calculated over all the user samples,
AVG 2-graph(U_i, S_j), where U_i indicates the user number and S_j indicates
the sample number, to determine the 2-graphs with the smallest average.
Therefore, the 2-graph list was reordered based on the shortest AVG time: the
2-graphs at the top of the list are those the user typed most quickly.
5.2.3 Time-stability typed n-graph selection
The aim of this technique is to identify and extract the 2-graphs typed by U_i
consistently most of the time. Generally, the standard deviation (STD) is used to
measure variability or diversity in statistics and probability theory. A low STD
indicates that the data points tend to be very close to the mean, whereas a high
STD indicates that the data are spread out over a large range of values.
STD was used here to test the typing stability of each 2-graph for every
(U_i, S_j), giving STD 2-graph(U_i, S_j). Then the AVG of these values was
calculated for each 2-graph and the 2-graph list was reordered based on the
smallest AVG. Being placed at the top of the 2-graph list means that these
2-graphs have more stable user typing than the other 2-graphs.
5.2.4 Time-variant typed n-graph selection
The aim of this technique is to identify and extract the 2-graphs that are typed
very differently by different users. The AVG function was used to represent user
behavior, based on the AVG of each 2-graph per user u, and the STD was used to
measure the variances and differences in order to determine the 2-graphs with the
highest variability. These 2-graphs indicate that the users typing them are very
dissimilar. First, for each 2-graph in the set d, the AVG time of the 2-graph was
calculated per user, giving AVG 2-graph(U_i); then the STD of these averages
was calculated for each 2-graph across the users U_i in order to see the typing
variance between users for that 2-graph. The 2-graph list was then reordered
based on the largest STD in order to identify the top 2-graphs that users typed
differently. Thus, if the STD of a 2-graph is very low, the users type it similarly;
if the STD is very high, the users type it differently and can be easily
distinguished. Therefore, the 2-graph list was reordered based on the highest
STD: the 2-graphs at the top of the list are those the users typed most
differently.
5.3 Evaluation Methodology
The methodology is about selecting and evaluating different features from the
dataset that might represent the users' typing patterns. Figure 5.1 shows two
main phases: the feature selection phase and the evaluation phase. The first
phase is the process of selecting candidate features for users' typing patterns
based on different feature selection techniques. The second phase is the process
of evaluating the candidate features using the k-means clustering algorithm, with
the aim of assigning each user's data to a unique cluster to determine which
cluster belongs to which user. The two phases are described below in detail.
5.3.1 Selecting candidate features
The aim of this phase is to select the candidate features that can represent the
users' typing patterns. The phase has three sub-phases: pre-processing the dataset,
[Figure 5.1 is a flowchart with two phases: selecting candidate features
(keystroke raw data, pre-processing, feature selection techniques, candidate
features, then extracting the pre-processed data based on the candidate features)
and evaluating the candidate features obtained by the feature selection
techniques (apply k-means to form clusters; for each user, count the samples in
each cluster and assign the user to the cluster with the strictly highest count,
falling back to the next maximum count on ties; compute accuracy from the
user's samples in the assigned cluster over the user's total samples in the
dataset).]
Figure 5.1: Evaluation methodology for feature selection techniques
applying the feature selection techniques, and then extracting the preprocessed
dataset restricted to the candidate features. The pre-processing sub-phase was
explained in Section 4.4. The other two sub-phases are described next.
1. Feature selection techniques: This sub-phase selects and extracts features
that could represent the users' typing patterns, based on the different
techniques. These techniques can be based on users' typing behavior, such
as which n-graphs a user types quickly, or on statistical measures, such as
the least frequent or most frequent n-graphs. The output of the feature
selection techniques is a set of candidate features that could represent the
users' typing patterns.
2. Extract the preprocessed data based on the candidate features: This
sub-phase goes to the preprocessed data and extracts the relevant dataset,
comprising the users and samples, based on the candidate features
identified in the previous sub-phase. The relevant dataset is then used and
tested in the following evaluation phase.
5.3.2 Evaluate candidate features (obtained by feature selection techniques)
This section evaluates the candidate features selected by the feature selection
techniques. We use the k-means clustering algorithm for the evaluation. The
notion is that, with user-representative candidate features, the samples of each
user should be grouped into a unique cluster, since clustering naturally groups
data that share similar behavior. To do this, we divide the evaluation phase into
three steps: 1) cluster formation; 2) assigning users to clusters; and 3) computing
the accuracy. The three steps are described below in detail.
5.3.2.1 K-means algorithm
K-means is a clustering method that has been studied in the data mining,
statistics and machine learning literatures. K-means is an example of
unsupervised learning (as described in chapter 2) and has been used for feature
selection [18][54]. K-means naturally groups samples that share similar
behaviours, which is useful for partitioning the user samples into subsets whose
in-class members are similar in the identified features and whose cross-class
members are dissimilar in the corresponding sense. The k-means algorithm has
been used successfully for evaluating and testing different features of keystroke
dynamics [39, 94]. K-means is the most well-known clustering algorithm among
the unsupervised learning algorithms that solve the standard clustering problem.
Thus, k-means was used to evaluate the feature selection techniques: it was
applied to the dataset in order to divide the users' samples into different clusters.
The procedure groups a given user's samples into a certain number of clusters,
fixed a priori based on the total number of users. The main idea is to define k
centroids, one for each cluster. These centroids should be placed carefully,
because a different location causes a different result. Given a set of observations
S_i, where each observation in our case is a user sample, k-means clustering aims
to partition the S observations into C_n sets so as to minimise the within-cluster
sum of squares. The experiments are conducted using MATLAB's built-in
k-means function, which takes two parameters as input: C_n, the number of
clusters or partitions to be made, and S_i, the dataset or matrix, which in our
case contains the user samples. The default distance used by k-means is the
Euclidean distance, calculated between the object (the user sample) and the
cluster mean. User samples are assigned to the clusters they are most similar to,
based on the Euclidean distance between them. The algorithm then computes the
new mean for each cluster. This process iterates until it converges, based on the
square-error function [36].
E = \sum_{i=1}^{n} \sum_{s \in c_i} |s - m_i|^2
where E is the sum of the squared errors over all of the user samples in the
database, s is a user sample and m_i is the mean of cluster c_i. That is, for each
user sample in each cluster, the Euclidean distance from the user sample to its
cluster centre is squared, and the distances are summed. The procedure of
iteratively reassigning user samples to clusters is known as iterative relocation.
Several distance types can be used with k-means but, in this chapter, the
evaluation is based on the two most popular distances: Euclidean distance [65][10]
and city-block distance [92][89].
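The iterative relocation described above can be sketched as a minimal k-means in Python (a toy stand-in for MATLAB's built-in function, not the actual experiment code; the fixed initial centroids and squared-Euclidean/city-block distance helpers are our own simplifications):

```python
# Minimal k-means sketch: assign each sample to its nearest centroid,
# recompute centroids as cluster means, and repeat (iterative relocation).

def sq_euclidean(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def cityblock(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))

def kmeans(samples, centroids, dist, iters=20):
    """Lloyd's algorithm with fixed initial centroids; returns labels."""
    for _ in range(iters):
        labels = [min(range(len(centroids)),
                      key=lambda c: dist(s, centroids[c]))
                  for s in samples]
        for c in range(len(centroids)):
            members = [s for s, l in zip(samples, labels) if l == c]
            if members:
                centroids[c] = [sum(col) / len(members)
                                for col in zip(*members)]
    return labels

# Two users' samples as 2-graph timing vectors (toy numbers).
samples = [[150, 120], [155, 118], [240, 260], [245, 255]]
labels = kmeans(samples, centroids=[[150, 120], [240, 260]],
                dist=sq_euclidean)
```

Swapping `sq_euclidean` for `cityblock` mirrors switching the distance option in MATLAB's k-means.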
88 Chapter 5. User-Representative Feature Selection for Keystroke Dynamics
5.3.2.2 Assigning users
Due to the ambiguity of the clustering result in that some clusters contain samples
from dierent users, it is necessary to rely on criteria that are able to extract rele-
vant information about distinguishing users. The output of the k-means technique
needs to design criteria that is able to assign each user to a dierent cluster. The
decision of assigning each user to dierent cluster is based on the frequency of the
samples to the cluster. For example, if cluster 1 has been assigned by user 1 with
ve samples and by user 2 with ten samples, then the decision will be to assign
user 2 to cluster 1. However, if the frequency of the samples are similar for two
users and assigned to one cluster, then the cluster will not be assigned to any one
of those users. They will be assigned based on the second most frequent samples
of them.
To illustrate further, the following is the complete set of steps to assign a user to
a cluster:
1. The input of the algorithm is the output of the k-means algorithm, which
includes the user number, the sample number and the cluster number.
2. Count(U_i : C_n) = for each user, count the samples in each cluster, where i
is the user index and n is the cluster index. For example, if the samples of user 1
are assigned to clusters 1, 4 and 6, then the number of samples assigned to each
of these clusters is counted.
3. Compute max(count(U_i : C_n)) for U_i. For each user, identify the maximum
count of samples among the clusters to which that user's samples were assigned.
For example, if two samples of user 1 were assigned to cluster 4, nine samples to
cluster 1 and two samples to cluster 6, then the maximum sample count for user 1
is 9, with cluster 1.
4. For each user, if max(count(U_i : C_n)) is not equalled or exceeded by any
other user in the same cluster, go to step 6.
5. Otherwise, compute the next max(count(U_i : C_n)), then go to step 4.
6. Assign U_i to C_n.
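A compact way to realise these steps is a greedy pass over (user, cluster) sample counts in descending order, so each user claims the free cluster in which it has the most samples, and a user who loses a cluster falls through to its next most frequent one. This is a sketch under our own naming (integer user and cluster ids assumed), not the thesis implementation; exact ties are broken arbitrarily here.

```python
from collections import Counter

def assign_users(labels):
    """labels: one (user, cluster) pair per sample, as produced by clustering.
    Returns {user: cluster}: each user gets the unclaimed cluster holding most
    of its samples; on a conflict the loser falls back to its next-best count."""
    counts = Counter(labels)                       # (user, cluster) -> frequency
    ranked = sorted(counts.items(), key=lambda kv: -kv[1])
    assignment, taken = {}, set()
    for (user, cluster), _freq in ranked:
        if user not in assignment and cluster not in taken:
            assignment[user] = cluster
            taken.add(cluster)
    return assignment
```

For the example in the text (user 1 with five samples in cluster 1 and nine in cluster 2, user 2 with ten samples in cluster 1), user 2 takes cluster 1 and user 1 falls back to cluster 2.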
5.3.2.3 Cluster evaluation criterion
Accuracy rate was used to measure and evaluate the performance of this
classification system:

Accuracy = (M_in / P_i) * 100

where:
P_i = number of samples of U_i in the dataset
M_in = number of samples of U_i in C_n

Accuracy is calculated by first identifying the total number of a user's samples in
the dataset and then the number of that user's samples in the assigned cluster.
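This criterion can be computed directly from the two counts (an illustrative helper; the name is ours):

```python
def cluster_accuracy(samples_of_user_in_cluster, samples_of_user_in_dataset):
    """Accuracy = (M_in / P_i) * 100 for one user U_i and its cluster C_n."""
    return samples_of_user_in_cluster / samples_of_user_in_dataset * 100
```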
5.4 Experiments
In this section, we describe the experimental settings, show the results of comparing
the proposed feature selection techniques, and compare them against existing
feature selection techniques. The following sub-sections describe all of these points
in detail. We used the same GP dataset described in chapter 4.
5.4.1 Experimental settings
Only the 60 most frequent 2-graphs in the dataset were considered, since 100%
accuracy was reached with this number of features. Each proposed feature selection
technique was then applied to this list of 60 2-graphs; every technique therefore
produces the same list of 60 2-graphs, but in a new order. For conciseness, the
reordered list of 60 2-graphs was divided into 6 groups of 10, in order to see the
impact of evaluating each group individually and cumulatively, adding one group
at a time. The first group contains the first 10 2-graphs from the reordered list
produced by the feature selection technique, and the last group contains the final 10.
Results are presented separately for each group of 10 2-graphs, and also cumulatively
by adding successive groups of 10 2-graphs.
One issue is that, in some samples, users do not type some of the 2-graphs in the
60 2-graphs list, and clustering techniques cannot analyse data with missing values.
To solve this, a missing 2-graph was replaced by the global time of that 2-graph,
calculated as the AVG time of that 2-graph over all the user samples in the dataset.
The global value was chosen to replace missing 2-graph values because the global
time does not affect the size of the cluster.
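The imputation step can be sketched as follows: compute the global AVG duration of each 2-graph over all samples, then fill any gaps with it. This is a minimal sketch; the data layout and names are our assumptions, not the thesis implementation.

```python
def impute_missing(samples, features):
    """samples: list of dicts mapping a 2-graph to its duration in that sample.
    Returns fixed-length vectors over `features`, with missing 2-graphs
    replaced by the 2-graph's global average duration."""
    global_avg = {}
    for f in features:
        observed = [s[f] for s in samples if f in s]
        global_avg[f] = sum(observed) / len(observed)
    return [[s.get(f, global_avg[f]) for f in features] for s in samples]
```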
5.4.2 Experimental results
The results present the accuracy of the selected features in two ways: based on
individual groups of features, or based on cumulatively adding groups of features
one by one. The purpose of presenting the accuracy of individual groups of features
is to see how the sequence of the features is affected by the proposed feature
selection technique; the purpose of presenting the accuracy when adding selected
features cumulatively is to see the impact of the proposed feature selection with
different numbers of features. In all the figures, the horizontal axis gives the number
of selected 2-graph features, and the vertical axis gives the system accuracy: the
percentage of user samples that were correctly classified to the right user.

Figure 5.2 shows the comparison between all of the proposed feature selection
techniques when adding the number of selected features cumulatively. The
comparative analysis demonstrates that the most frequently typed n-graph selection
technique has the highest accuracy percentage, which shows that it represents users
effectively because of its high statistical significance. Modelling user behaviour with
the time-stability typed n-graph selection technique is also promising for representing
user typing, due to the consistent times of the selected n-graphs, which reflect normal
user typing behaviour. However, for a smaller number of 2-graphs, the most frequent
2-graphs technique obtained a much better accuracy percentage than the other
proposed feature selection techniques. For example, when selecting the first 10
features, the most frequent 2-graphs technique attained about 78% accuracy, whereas
the other proposed techniques obtained at most 60% accuracy. For all the proposed
feature selection techniques, when the durations of a 2-graph occurred more than
once in a user's sample, the AVG function was computed. Figures 5.3 and 5.4 exhibit
the different statistics measured, including AVG, STD and AVG+STD, for computing
repeated occurrences of 2-graphs in a user's sample, and compare them.

Figure 5.2: Comparison between proposed feature selection techniques based on
# of selected features cumulatively (accuracy % vs. number of selected features,
10-60; series: most frequently typed, time-stability typed, quickly typed,
time-variant typed).
Figure 5.3 shows the comparison between two types of k-means distance: Euclidean
distance and city-block distance. The comparison is based on the most frequently
typed n-graph selection technique applied to individual groups of features; the AVG
function was computed when a 2-graph occurred more than once in a user's sample.
The results show that city-block distance performs similarly to Euclidean distance.
This indicates that the distance type used by k-means may not affect the system
accuracy as much as other factors do, for example the order or sequence of the
features in the list: when selecting the 1st to 10th most frequent 2-graphs from the
most frequent 2-graphs list, the result attained about 78% accuracy, whereas when
selecting the 51st to 60th, the result dropped to about 50% accuracy.

Figure 5.3 also compares, for the case of city-block distance, the different statistical
measures (AVG, STD and AVG+STD) used when a 2-graph occurred more than once
in a user's sample. The results show that system accuracy based on the AVG function
is better than STD and AVG+STD over all the individual groups of features.
Furthermore, STD alone is not a good measure for representing the user's behaviour
on a repeated 2-graph, as its accuracy does not exceed 30% over all the individual
groups of features; the standard deviation (or variance) of the times of 2-graphs
that occurred more than once in a user's sample does not model user behaviour well.
Adding STD to the AVG function also degrades the result of the AVG function alone:
the figure clearly shows that system accuracy drops after adding STD to AVG. This
means that adding more statistical measures to represent repeated 2-graphs does
not necessarily increase system accuracy.

Figure 5.3: Comparison between different statistics and distances for the most
frequent 2-graphs technique based on individual groups of features (accuracy % for
feature ranges 1-10 to 51-60; series: Euclidean distance using AVG, city-block
using AVG, AVG+STD, STD).
Figure 5.4 shows a comparison similar to that of figure 5.3, using the different
statistical measures AVG, STD and AVG+STD when a 2-graph occurred more than
once in a user's sample, but based on adding the number of selected features
cumulatively. The results show that the accuracy based on AVG is quite similar to
AVG+STD after selecting 20 2-graph features cumulatively, because the increased
number of 2-graphs dominates the system accuracy. However, 2-graphs calculated
with AVG+STD require more computation than 2-graphs calculated with AVG only.
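The three statistics compared in figures 5.3 and 5.4 can be expressed as a small helper that collapses the duration list of a repeated 2-graph into one feature value. This is our own sketch; only the mode names mirror the text.

```python
from statistics import mean, stdev

def collapse_durations(durations, mode="AVG"):
    """Collapse the durations of a 2-graph repeated within one sample."""
    if len(durations) == 1:
        return float(durations[0])
    if mode == "AVG":
        return mean(durations)
    if mode == "STD":
        return stdev(durations)
    if mode == "AVG+STD":
        return mean(durations) + stdev(durations)
    raise ValueError("unknown mode: " + mode)
```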
Figure 5.4: Comparison between different statistics for the most frequent 2-graphs
technique based on # of selected features cumulatively (accuracy % vs. number of
selected features, 10-60; series: AVG, AVG+STD, STD).
5.4.3 Comparison with existing feature selection techniques
This sub-section presents the comparison of the most frequently typed n-graph
selection technique with three existing feature selection techniques. The existing
feature selection techniques include the popular Italian words, common n-graphs,
and least frequent n-graphs. Dowland et al. [21] collected the typing samples of
five users by monitoring their regular computer activities, without imposing any
particular constraints on them, such as asking users to type a predefined set of
words. They selected the features (2-graphs only) that occurred the least number
of times across the collected typing samples. They used keystroke latency, the
elapsed time between the release of the first key and the press of the second key,
and built user profiles by computing the mean and standard deviation of 2-graph
latencies. They achieved correct acceptance rates of around 60%.
Unlike Dowland et al., Gunetti et al. [32] avoided using the 2-graph and 3-graph
latencies directly as features. Instead, they used latencies to determine the relative
ordering of different 2-graphs/3-graphs. They extracted the 2-graphs and 3-graphs
that are common between two samples and found the difference between them.
For this, they devised a distance metric to measure the distance between the two
orderings of 2-graphs and 3-graphs in two samples. To identify the user of an
unknown sample, they compared it with all the samples of all users by computing
the distance between them; the user whose sample has the least distance is deemed
to be the user of the unknown sample. They reported 95% accuracy.
Rajkumar and Sim [44] selected popular English words, such as "the", "or", "to"
and "you", as features. They showed that many fixed strings qualify as good
candidates and identified the user as soon as he typed any of the fixed strings,
demonstrating that these words can be used to discriminate users effectively.

Most frequent    Classifier        Most frequent   Classifier        Improvement
Italian words    accuracy (%) (B)  2-graphs        accuracy (%) (A)  (A - B)
non              40.00             co              34.28             -5.72
di               29.52             to              33.80              4.28
che              28.09             re              29.04              0.95
la               28.57             er              28.57              0
il               23.33             on              23.33              0
all combined     60.48             all combined    64.76              4.28

Table 5.1: Comparison of Italian words and most frequent 2-graphs

Feature type              Number of 2-graphs   Classifier accuracy (%)
Common 2-graphs           18                   94.81
Most frequent 2-graphs    18                   90.95
Most frequent 2-graphs    20                   95.23
Most frequent 2-graphs    30                   97.14
Most frequent 2-graphs    40                   98.57
Most frequent 2-graphs    50                   100

Table 5.2: Comparison between common 2-graphs and most frequent 2-graphs
In place of the most frequent English words, the most frequent Italian words (non,
di, che, la and il) are analysed and tested, as this dataset is based on Italian text.
Table 5.1 compares the classification accuracy of the five most frequent Italian
words with that of the five most frequent 2-graphs. The classification accuracy is
quite similar when the analysis is done individually per word or 2-graph. However,
when combining all five most frequent Italian words and all five most frequent
2-graphs, the five most frequent 2-graphs achieve better classification accuracy
than the five most frequent Italian words.
Table 5.2 compares the classification accuracy of the common 2-graphs feature
and the most frequent 2-graphs feature. In this dataset, the count of common
2-graphs was 18, so for a comparable result the 18 most frequent 2-graphs were
compared with the 18 common 2-graphs. The common 2-graphs result is slightly
better in classification accuracy; increasing the number of most frequent 2-graphs
to 20 gives a result similar to the common 2-graphs feature. Moreover, the common
2-graphs feature does not reach 100% classification accuracy in this dataset,
whereas the most frequent 2-graphs feature does when 50 feature values are taken
into account. Furthermore, the common 2-graphs feature depends heavily on all
users' typing; for example, if only five 2-graphs exist in all the user samples and
are typed by everyone, the classification accuracy drops to 63.3%. Therefore, this
approach needs a large amount of data from every user sample in order to increase
the chance of obtaining more common or shared 2-graphs across all users.
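The two competing feature sets can be extracted as follows: the most frequent 2-graphs come from corpus-wide counts, while the common 2-graphs are those present in every sample. This is a sketch with our own helper names, not the code used in the experiments.

```python
from collections import Counter

def two_graphs(text):
    """All adjacent character pairs in a typing sample."""
    return [text[i:i + 2] for i in range(len(text) - 1)]

def most_frequent_2graphs(samples, k):
    counts = Counter()
    for s in samples:
        counts.update(two_graphs(s))
    return [g for g, _ in counts.most_common(k)]

def common_2graphs(samples):
    """2-graphs typed by everyone, i.e. present in every sample."""
    return set.intersection(*(set(two_graphs(s)) for s in samples))
```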
Figure 5.5 compares the accuracy of the highest-frequency and lowest-frequency
2-graphs when applied to individual groups of features. For this particular
comparison, the 100 most frequent 2-graphs were extracted from the dataset and
split into two halves of 50, each divided into five groups of ten. The upper line in
the figure plots the first half (the 50 most frequent 2-graphs), from the group of
the ten most frequent down to the group ranked 41-50; the lower line plots the
second half (the 50 least frequent of the list), ordered from its least frequent group
upward. Thus, each group position in figure 5.5 shows the result of a group of ten
highly frequent 2-graphs (upper line) and a group of ten rarely typed 2-graphs
(lower line). It is observed that the most frequent 2-graphs (upper line) achieve
better system accuracy than the least frequent 2-graphs (lower line) across all
groups.

Figure 5.5: Comparison between the most and least frequent 2-graphs based on
individual groups of features (accuracy for feature ranges 1-10 to 41-50; series:
most frequent, least frequent).

Figure 5.6 exhibits the classification accuracy of the most frequent and least
frequent 2-graphs when the groups of 2-graphs are added cumulatively. The graph
shows that selecting fewer than 20 of the most frequent 2-graphs performs better
than selecting 50 of the least frequent 2-graphs.
The previous results for all of the techniques were based on the whole sample. As
explained in chapter 4, the sample size varies between 700 and 900 characters. In
the next section, we evaluate the optimum feature with different sample sizes (600,
400, 200 and 100 characters from each user sample), in order to see the influence
of the data size on the features that represent user typing behaviour. We also
compare the fixed (global) features identified from the whole dataset with dynamic
features identified from the data available at training time.
Figure 5.6: Comparison between the most and least frequent 2-graphs based on
# of selected features cumulatively (accuracy % vs. number of selected features,
10-50; series: most frequent, least frequent).

5.5 Comparing fixed and dynamic features on different data sizes

Fixed (global) features are identified from the whole dataset. For example, when
proposing the most frequently typed 2-graphs selection technique, we searched the
whole dataset, extracted the most frequent 2-graphs, and then used these globally
identified 2-graphs to represent each user sample. Dynamic features, in contrast,
are identified from the current data size: if the technique is applied to 600
characters of data from each user, then the features are selected from those 600
characters, not from the whole dataset. In this section, we compare the two
approaches to selecting features over different data sizes, to see which better
represents user typing behaviour. In particular, we run an experiment to meet two
aims:
1. To determine whether global fixed features are more applicable than features
calculated dynamically from the data at hand.
2. To determine the minimum sample size of data needed to distinguish between
users correctly.
To meet these aims, we use the same methodology proposed in section 5.3, but
with different sample sizes (800, 600, 400, 200, .. characters from each user
sample). This allows us to determine, for each data size, whether the global fixed
features approach is better than calculating the features dynamically from the
available data. The experiment in this section also establishes the minimum data
size needed to distinguish between users effectively. The output of this phase
identifies the better feature selection approach (dynamic or global) and the
minimum data size of a user sample. From the results in section 5.4.3, we found
that the most frequently typed 2-graphs selection technique performs better than
the other techniques, so we use this technique to meet the aims above.
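The contrast between the two approaches can be sketched as a single function, where a `size` of `None` yields fixed (global) features and an integer yields dynamic features from the truncated training data. This is illustrative only; the function name and data layout are our assumptions.

```python
from collections import Counter

def select_most_frequent(samples, k, size=None):
    """Fixed features: size=None, count 2-graphs over the whole dataset.
    Dynamic features: count only the first `size` characters of each sample."""
    data = samples if size is None else [s[:size] for s in samples]
    counts = Counter()
    for s in data:
        counts.update(s[i:i + 2] for i in range(len(s) - 1))
    return [g for g, _ in counts.most_common(k)]
```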
Figure 5.7: Comparing fixed and dynamic features over different data sizes
(accuracy % vs. data size in characters, 100-700 and the full sample; series: fixed
features, dynamic features).
Figure 5.7 shows the comparison between fixed features and dynamic features over
different data sizes. The accuracy trends for fixed and dynamic features are clearly
similar, although dynamic features are slightly better than fixed features. The
trend also holds across data sizes: accuracy increases at the same rate for both
dynamic and fixed features as the data size grows, so we conclude that larger data
sizes give more accurate results. For example, with a data size of 100 characters,
the accuracy was about 31.43% for dynamic features and 23.81% for fixed features;
accuracy clearly increases as more characters are added to the sample, and with
600 characters we attained almost 100% accuracy for both dynamic and fixed
features. Since dynamic features require more computation and memory storage,
we will use fixed (global) features to represent user typing behaviour for
distinguishing users. The analysis in chapters 6 and 7 will therefore be based on
fixed features.
5.6 Summary
This chapter proposed four statistical-based feature selection techniques, with the
aim of finding out whether the 2-graphs selected by our techniques are
user-representative. The first simply selects the most frequently typed n-graphs;
the other three consider different user typing behaviours by selecting n-graphs that
are typed quickly, n-graphs that are typed with consistent time, and n-graphs that
have large time variance among users. We used 2-graphs as features in our
experiments and found that the most frequent 2-graphs are the easiest to model
and represent user typing behaviour best, because of their high statistical
significance. We further substantiated our results by comparison with three
contemporary feature selection techniques (popular Italian words, common
n-graphs, and least frequent n-graphs), and found that our technique performed
best after selecting a certain number of 2-graphs.

Among the other three proposed techniques, selecting n-graphs that are typed
with consistent time showed promising results: after selecting a certain number of
features, it achieved significantly higher accuracy than the remaining techniques
and approached the accuracy of the most frequently typed n-graphs.

In the next chapter, we will use the most frequently typed n-graphs feature
selection technique, which represents user typing behaviour effectively. In
chapter 6, we will propose a new method that authenticates a user accurately
without needing any predefined user typing model a priori. This method will be
based on the optimum feature (most frequently typed n-graphs) proposed in this
chapter.
Chapter 6
User-independent Threshold
In Chapter 3, we proposed a model that describes the whole process of continuous
authentication. One of the main components of that model is the detector
component, which is necessary for detecting an impostor during the entire session.
In this chapter, we focus our analysis on this component by developing a new
approach using an unsupervised method. The approach proposes a
user-independent threshold technique that can authenticate a user and detect an
impostor without needing any predefined user typing model a priori. In particular,
the chapter addresses the second sub-question of Research Question 2 (outlined in
Chapter 1):
How well can a user-independent threshold work out whether two typing samples
in a user session belong to the same user?
The chapter is organised as follows. The next section provides an introduction
and discusses the motivation behind the work. In section 6.2, we explain and
discuss the user-independent threshold system. In section 6.3, we give an overview
of how we design and evaluate the user-independent threshold. In section 6.4, we
describe in detail the procedure to obtain the user-independent threshold. In
section 6.5, we describe the experimental methodology, and in section 6.6, we
discuss the experimental results in detail. In section 6.7, we compare the results of
the proposed user-independent threshold approach with some existing schemes
that are based on user-dependent thresholds. In section 6.8, we discuss the
user-independent threshold approach and its limitations further. Section 6.9
summarises the chapter.
6.1 Motivation
Continuous user authentication systems based on keystroke dynamics typically
depend on predefined user-typing models for authentication. To overcome this
dependency, this chapter proposes a user-independent threshold that does not
need user-typing models to be built. It is based on the premise that a user often
types in a similar fashion (the distance between samples from the same user is
small), while different users often type differently (the distance between different
users' samples is large). Thus a threshold line can be drawn between the distances
of typing samples of the same and of different users, in order to work out whether
two typing samples in a user session belong to the same user. However, factors
such as the size of the typing samples influence the accuracy of the threshold; this
chapter identifies these factors and presents a detailed analysis of them. We
obtain a reasonable user-independent threshold when using a certain size of typing
sample and a certain number of the most frequently occurring character pairs.
Moreover, we compare our results with two existing approaches and find that our
approach performs comparably once a sufficient number of keystrokes has been
obtained. An advantage of our proposed authentication scheme is that, unlike the
existing schemes, it can take two unknown user samples and decide whether or not
they are from the same user.
6.2 A user-independent threshold system
The main challenge is to develop an authentication system that is not
user-dependent and does not require a predefined model of a user's normal typing
behaviour. The end goal is a system that can authenticate all users, whether valid
users or possible impostors, without having a predefined typing model in advance
from any user, valid or otherwise. The proposed user-independent threshold
approach selects features that are user-representative, that is, features that reflect
the normal typing pattern of a user. Our previous work successfully identified
features that are representative of user typing behaviour and distinguish a single
user from other users. In this chapter, based on the representative features, we
take two samples and decide whether they are from the same user or from
different users by finding the similarities and differences between them. In
particular, this chapter analyses and identifies the degree of similarity between
two samples
from the same user's data and the degree of difference between two samples from
different users' data. We establish a boundary, or line, that maximises the
similarity between two samples from the same user's data and minimises the
difference between two samples from different users' data.
6.3 Designing and evaluating user-independent
threshold
The user-independent threshold boundary can be defined in different ways
depending on the aim of the system. If the system focuses on user convenience or
usability, to avoid denying access to valid users, then the independent threshold
can be set where the false positive rate equals zero. If the system focuses on
security, to detect intruders or impostors, then the independent threshold can be
set where the false negative rate equals zero. In this research, we choose the
threshold to achieve a specific operating point, the equal error rate (EER), which
is a specified relation between the false rejection of true claims and the false
acceptance of intruder or impostor claims. The operating point is determined
from the receiver operating characteristic (ROC) curve, which plots the
relationship between these two error rates as a function of the decision threshold.
We chose the user-independent threshold as the equal error rate (EER) point of
the system, inspired by Nanni and Lumini [67], who developed several biometric
systems based on palm prints, fingerprints, facial recognition and ear biometrics.
Figure 6.1 illustrates the EER in terms of the frequencies of false positives (FP)
and false negatives (FN).
The cross-point of the false positive and false negative curves classifies the typing
data into two categories. In the first, some property measured from the typing
falls below the cross-point, signalling that the typing data is normal and was
generated by only one user. In the second, the property equals or exceeds the
cross-point, signalling that the typing data is abnormal and was generated by two
different users. This cross-point is the user-independent threshold that can be
used to distinguish every user from all other users, by measuring the distance
between the various features of two samples and deciding whether they are from
the same user or different users.
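Given the two distance populations, the cross-point can be located by sweeping candidate thresholds until the false positive and false negative rates meet. The following is a sketch of that idea under our own naming, not the thesis implementation.

```python
def eer_threshold(same_user_dists, diff_user_dists):
    """Return the threshold where the FP rate (different users judged same)
    and FN rate (same user judged different) are closest, i.e. the EER."""
    best_t, best_gap = None, float("inf")
    for t in sorted(set(same_user_dists) | set(diff_user_dists)):
        fn = sum(d > t for d in same_user_dists) / len(same_user_dists)
        fp = sum(d <= t for d in diff_user_dists) / len(diff_user_dists)
        if abs(fp - fn) < best_gap:
            best_t, best_gap = t, abs(fp - fn)
    return best_t
```

With well-separated populations the returned threshold sits between the largest same-user distance and the smallest different-user distance, where both error rates are zero.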
Figure 6.1: Equal error rate (alarm rate % plotted against threshold values; the
false positive (FP) and false negative (FN) curves cross at the equal error rate
(EER)).

For evaluating the proposed cross-point, or threshold value, we analyse the
similarities between two samples from the same user's data and the differences
between two samples from different users' data. For this, we apply distance
measures to distinguish the samples of two different users, based on our hypothesis
that a change of author between two samples transforms the keystroke dynamics
parameters in such a way that their statistical properties no longer remain
constant, resulting in distance changes. Therefore, the distance between samples
of the same user is small and the distance between samples of different users is
large. Note that distance measures are the most common characteristic used in
existing continuous user authentication schemes [32, 40].
For investigating and evaluating the user-independent threshold, we test different
threshold values in order to find the cross-point value. These threshold values are
generated from different distance measurement values. We also analyse the
consistency of the cross-point by dividing the dataset into different sub-datasets
and calculating the variance of the cross-point value among them, in order to be
confident that the user-independent threshold is suitable for the whole set of
users. For this research, we have identified and examined four different variables
that are directly related to the user samples used for authentication: distance
type, number of keystrokes, feature type and number of features. The importance
of these factors is discussed in section 6.5.2. All of these factors could influence
the consistency of the threshold among different users, and we seek the optimum
factor values that stabilise the cross-point across users. A high variance of the
threshold value among different groups of users indicates that the
user-independent threshold is sensitive to different users' typing data and that the
factor values are not optimal for that threshold. A low variance indicates that the
threshold value is not sensitive to any user's typing data and that the factor
values are the optimum values for obtaining the user-independent threshold.
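The consistency check described above amounts to computing the cross-point per sub-group of users and examining its spread. A minimal sketch (the function name is ours):

```python
from statistics import mean, pvariance

def threshold_consistency(group_thresholds):
    """group_thresholds: one EER cross-point per user sub-group.
    Returns (mean, variance); a low variance suggests the threshold
    is user-independent."""
    return mean(group_thresholds), pvariance(group_thresholds)
```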
6.4 Approach
The aim of this approach is to distinguish a user accurately without needing any
predefined user typing model a priori. This can be achieved with a
user-independent threshold that is fixed for a whole set of users. The threshold is
derived to balance two distance measures for authentication: first, the distance
between two samples from the same user's data; second, the distance between two
samples from different users' data. These two distance values are combined to
obtain one threshold value suitable for the whole set of users.
The threshold distinguishes whether typing data samples are from the same user
or from different users. In this research, we defined the user-independent
threshold as the cross-point between the false positives and false negatives (the
equal error rate), where the trade-off is between security and user convenience.
However, the user-independent threshold can be defined in different ways
depending on the aim of the system: it can be defined where the false positive
rate equals zero, when the system is more concerned with user convenience, or
where the false negative rate equals zero, when the system is more concerned with
security. The research also examined four different variables (distance type,
number of keystrokes, feature type and number of features) in order to find the
optimum factor values that help establish the existence of the user-independent
threshold and stabilise it among different users.
The procedure to obtain the user-independent threshold is as follows:
1. Divide the users into different groups: let U_k (1 ≤ k ≤ K) be a group of users, u_i (1 ≤ i ≤ I) be a user, and u_i^j (1 ≤ j ≤ J) be a sample or session of user u_i. We divide the u_i into the different U_k randomly so that each U_k contains different u_i.
2. Apply and vary four different factor values:
   - Apply different distance metrics, such as Euclidean distance, between samples of the same user and of different users. For each U_k and u_i, apply the distance metric between the samples u_i^j (distance between same-user samples). For each U_k and u_i, apply the distance metric between each u_i^j and all other users' samples in the same U_k (distance between different-user samples).
   - Extract different feature types that are representative of user typing behaviour: if u_i^j = (f_{j,1}, f_{j,2}, ..., f_{j,n}) and u_i^{j-1} = (f_{j-1,1}, f_{j-1,2}, ..., f_{j-1,n}), where f_j and f_{j-1} are the same features in u_i^j and u_i^{j-1}, then the distance from f_j to f_{j-1}, or from f_{j-1} to f_j, is calculated according to the chosen distance metric: Euclidean distance, City block distance, Euclidean distance using a Fibonacci series, or cosine similarity.
   - Select different numbers of features in order to find the optimum number that helps to establish the existence of the user-independent threshold and to stabilise it among different users. For each u_i^j, we vary the number of features f_j by selecting different numbers of f_j.
   - Select different sizes of user data by choosing different numbers of keystrokes from each user. For each u_i, we vary the length of u_i^j by combining different samples of the same u_i.
3. Normalise all the distance values for all u_i in all U_k so that the different distance measurements can be compared easily. To compare the accuracy of the threshold among different distance measures fairly, we need to normalise the distance values, since each distance measure produces values on a different scale. For each distance measure, we calculate all pairwise sample distances for the same user and all pairwise sample distances between each user and the other users. We find the maximum distance value and then divide each pairwise sample distance by this maximum. In this way, all distances are mapped to values between 0 and 1; the larger the distance, the closer the value is to 1.
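As a minimal sketch of this normalisation step (the distance values are illustrative placeholders, not thesis data):

```python
def normalise(distances):
    """Map raw pairwise distances to [0, 1] by dividing by the maximum.

    `distances` holds the raw pairwise distance values for one distance
    measure (same-user and different-user pairs combined).
    """
    max_d = max(distances)
    return [d / max_d for d in distances]

# Example: raw distances between sample pairs (illustrative values)
print(normalise([1.0, 2.0, 4.0]))  # -> [0.25, 0.5, 1.0]
```

After this mapping, thresholds from different distance measures all live on the same [0, 1] scale, which is what makes the later cross-measure comparison of thresholds fair.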
4. Test different threshold values (distance values) in order to find the cross-point of FP and FN for each U_k: in each U_k, we apply the factor values chosen in step 2, and the resulting distance values O_i are compared to different threshold values t_i. In chapter 5, we established that a user types similarly to himself and differently from other users, so the O_i between samples of the same user's typing data will be small and the O_i between one user's data and other users' data will be large. Therefore, we try different threshold values in order to find the optimum threshold value: the maximum O_i between samples of the same user's typing data cannot exceed t_i when the false positive rate equals zero, and the minimum O_i between one user's data and other users' data cannot fall below t_i when the false negative rate equals zero:

   O_i ≤ t_i : similar user
   O_i > t_i : different user

In this work, we choose t_i to achieve a specified operating point, the equal error rate (EER), where the false positive and false negative rates are equal and the trade-off is between security and user convenience.
5. Calculate the variance of t_i across the different U_k: this step validates the threshold value among the subsets of the data, since the threshold value could be influenced by some users' data in a particular U_k. To be confident about t_i, we evaluate the consistency of t_i across the different U_k, using the standard deviation of t_i among all U_k as the measure of its variance.
6. If the t_i value among all U_k is inconsistent, go back to step 2; otherwise, choose t_i as the user-independent threshold.
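The final consistency check of the procedure (the variance of the per-group thresholds t_i) can be sketched as follows; the `tolerance` cut-off on the standard deviation is a hypothetical value for illustration, not one from the thesis:

```python
import statistics

def threshold_consistency(group_thresholds, tolerance=0.05):
    """Measure the spread of per-group EER thresholds t_i and decide
    whether a user-independent threshold can be accepted.

    `group_thresholds` holds the cross-point threshold found for each
    group U_k; `tolerance` is an illustrative cut-off, not a thesis value.
    """
    spread = statistics.pstdev(group_thresholds)
    consistent = spread <= tolerance
    return spread, consistent

# Hypothetical per-group thresholds on the normalised [0, 1] scale
spread, ok = threshold_consistency([0.82, 0.84, 0.83])
print(round(spread, 4), ok)
```

If `consistent` is false, the procedure loops back to step 2 and tries different factor values before accepting any t_i as user-independent.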
6.5 Experiment
The experiment is designed so that we can find a threshold value suitable for the whole set of users, distinguishing them without requiring any prior training data for a user. The experiment also observes the effects of the four parameters, namely distance type, number of keystrokes, feature type and feature amount, on the consistency of the user-independent threshold among different groups of users. In this section, we describe the dataset and experimental method, and present the empirical results.
6.5.1 Data set
For our evaluation data, we used the GP dataset [32], which is explained in more detail in chapter 4. We divided the users into different groups in order to see how the user-independent threshold varies with, and is sensitive to, different users' typing. In particular, we divided the users into three groups. The first group had five users with 15 samples each, giving a total of 75 samples for the group. The second group had the same total number as the first group, while the third group had four users and a total of 60 samples. We computed each classification instance over a sample by merging five samples together and then decreased the classification instance to four, three, two and one sample. As a result, classification was carried out for all the different sizes in order to see how it was influenced by the data size. This gave us different groups of users with different total numbers of samples, and evaluating these groups showed how sensitive the approach is to the number of samples in each group. This adds credibility to the existence and stability of the user-independent threshold value among different groups of users with different numbers of samples.
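The merging of samples into classification instances described above can be sketched as follows (a minimal illustration with placeholder sample data; each sample stands in for a list of keystroke timing features):

```python
def build_instances(samples, instance_size):
    """Merge consecutive samples into classification instances.

    With 15 samples per user and instance_size = 5, each user yields
    3 instances; lowering instance_size to 1 yields 15 instances.
    """
    instances = []
    for start in range(0, len(samples) - instance_size + 1, instance_size):
        merged = []
        for s in samples[start:start + instance_size]:
            merged.extend(s)  # concatenate the keystroke data
        instances.append(merged)
    return instances

# 15 single-feature placeholder samples, merged 5 at a time -> 3 instances
samples = [[i] for i in range(15)]
print(len(build_instances(samples, 5)))   # 3
print(len(build_instances(samples, 1)))   # 15
```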
6.5.2 Experimental method
The method comprises three steps: (1) dividing the data set into different subsets; (2) selecting values of interest for each of the four factors; and (3) repeatedly running an evaluation while systematically varying the factors among the selected values.
Selecting factor values. The four factors of interest in this study (distance type, number of keystrokes, feature type and feature amount) can take many different values (e.g., the number of keystrokes can range from 10 characters to more than 1000 characters). For this study, we needed to choose a subset of values to test.
1. Distance type: We used four distance metrics in our evaluation to measure the distance between user features and to decide whether two samples are from the same user or from different users. First, Euclidean distance has been used by several researchers to measure the distance between two typing data samples [64, 65, 10]. It is the "ordinary" distance between two points; it does not rely on the distribution mean or the variance-covariance, and it quantifies the dissimilarity between the two points. Second, Euclidean distance using a Fibonacci series calculates one aggregate global score for each sample [85]. The aim of the Fibonacci numbers is to separate the samples effectively, increasing intra-cluster similarity and decreasing inter-cluster similarity. Third, the City block distance (also referred to as Manhattan distance), first described by [5], calculates the distance between a test vector and the mean of the training vectors. It has been used for measuring the distances between users' typing data in order to find the similarities and dissimilarities between them [92, 89]. Finally, we implemented cosine similarity, the most popular measure for finding similarities in text documents [104]. It captures a scale-invariant notion of similarity and has been used by several researchers for measuring the similarities and dissimilarities between two samples of keystroke dynamics [70, 26].
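Three of these measures can be sketched directly as follows (an illustrative implementation, not the thesis code; the Fibonacci-weighted variant is omitted because its aggregation scheme is specific to [85]):

```python
import math

def euclidean(a, b):
    """Ordinary straight-line distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def city_block(a, b):
    """Manhattan distance: sum of absolute per-feature differences."""
    return sum(abs(x - y) for x, y in zip(a, b))

def cosine_distance(a, b):
    """1 - cosine similarity, so that larger values mean less similar."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm

# Two hypothetical feature vectors of digraph latencies (milliseconds)
u, v = [120.0, 95.0, 140.0], [118.0, 99.0, 150.0]
print(euclidean(u, v), city_block(u, v), cosine_distance(u, v))
```

Note how Euclidean and City block distances grow with the magnitude of the differences, while the cosine measure depends only on the angle between the vectors; this distinction matters for the results in section 6.6.2.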
2. Feature type: Since users can type characters in any order, it is imperative to find character sequences that are representative of user typing behaviour. In the previous chapter we evaluated different feature types (feature selection techniques) in order to find user-representative features. So, in this work, we evaluated some of the optimum techniques to see their influence on the user-independent threshold.
3. Feature amount: This is the total number of features that represent the user sample. We evaluated different values for each feature type by choosing different numbers of features, such as 50 and 100.
4. Number of keystrokes (data size): We tested the user-independent threshold with different amounts of data. Prior research analysed keystroke dynamics with different numbers of keystrokes of user typing, and most of those results supported the assumption that all of a user's typing represents that user's typing behaviour. This chapter analyses the user-independent threshold with different amounts of data: five samples, four samples, three samples, two samples and one sample. Each sample contains between 700 and 900 keystrokes.
Evaluation procedure. Having chosen values for these four factors, we needed an evaluation procedure that could be run for all the different combinations. The chosen procedure had four inputs: the distance type, feature type, feature amount and data size. The false positives and false negatives were used to generate an ROC curve for the detector. From the ROC curve, the equal error rate was calculated (i.e., the false-alarm and miss rate when the detector has been tuned so that both are equal). It is a common overall measure of detector performance in keystroke-dynamics research [75]. The variance of the user-independent threshold was calculated for the different data sizes. The following steps summarise the evaluation procedure:
1. The distance measure is applied between pairwise samples from the same user's data. For each user, the distance values are calculated for all combinations of two samples. These values form the genuine user scores, describing the normal distance between pairwise samples from the same user's data.
2. The distance measure is applied between two samples from different users' data. The distance values are calculated for all pairwise combinations of samples between different users' data. These values form the impostor (intruder) scores, describing the distance between pairwise samples from genuine users and impostors.
3. The genuine user and impostor scores are used to generate a receiver operating characteristic (ROC) curve for classification or authentication. From the ROC curve, the equal error rate is calculated in order to determine the threshold value.
4. The consistency of the threshold value is calculated among the different groups of users. Specifically, in each group of users, we calculated the EER and identified the threshold value at that cross-point as the optimum threshold value for the users in this group. We followed the same procedure for all groups of users, then took all the optimum threshold values and calculated the variance (standard deviation) of the threshold among all groups of users.
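Steps 1-3 can be sketched as follows, assuming a generic `distance` function and per-user sample lists (both hypothetical); the EER threshold is approximated by scanning candidate values over the normalised [0, 1] range:

```python
from itertools import combinations

def genuine_impostor_scores(users, distance):
    """users: dict mapping user id -> list of feature vectors.
    Genuine scores are distances between sample pairs of the same user;
    impostor scores are distances between samples of different users."""
    genuine, impostor = [], []
    for samples in users.values():
        genuine += [distance(a, b) for a, b in combinations(samples, 2)]
    for (ua, sa), (ub, sb) in combinations(users.items(), 2):
        impostor += [distance(a, b) for a in sa for b in sb]
    return genuine, impostor

def equal_error_rate_threshold(genuine, impostor, steps=1000):
    """Scan thresholds t; FP = genuine pairs with distance above t
    (valid user rejected), FN = impostor pairs with distance at or
    below t (intruder accepted). Return the t where FP and FN meet."""
    best_t, best_gap = 0.0, float("inf")
    for i in range(steps + 1):
        t = i / steps
        fp = sum(g > t for g in genuine) / len(genuine)
        fn = sum(m <= t for m in impostor) / len(impostor)
        if abs(fp - fn) < best_gap:
            best_t, best_gap = t, abs(fp - fn)
    return best_t

# Hypothetical two-user example with 1-D "feature vectors"
users = {"alice": [[0.1], [0.2]], "bob": [[0.8], [0.9]]}
dist = lambda a, b: abs(a[0] - b[0])
g, m = genuine_impostor_scores(users, dist)
print(equal_error_rate_threshold(g, m))
```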
6.6 Results
We explored and present the results of our experiment in two ways. First, we show the result figures for different groups of users with the same values of the four factors, in order to depict how sensitive the user-independent threshold is to different users' typing data. Figure 6.2 is an example of this way of presenting the results: it shows the results for the three groups of users with the same factor values, namely Euclidean distance, the 100 most frequent character pairs and a data size of 1 sample. Presented in this way, the results let us check whether the threshold is consistent across all of the groups of users. It is clearly seen from figure 6.2 that the threshold is inconsistent across the three groups of users, because these factor values do not stabilise the cross-point (the threshold) among the different groups. The figure shows a high variance of the threshold value among the three groups, which indicates that the user-independent threshold is sensitive to different users' typing data and that these factor values are not optimum for the threshold. Therefore, we should try to find the optimum factor values that stabilise the threshold among the different groups of users. To do that, we examined different factor values in each individual group of users, as in the second way of showing the results.
Second, we show the result figures for each group of users with different factor values, in order to find the optimum factor values suitable for all groups of users. Figure 6.3 is an example of this second way of showing the results: it shows the results for group 1 with different factor values. In this particular group of users, we vary the data size factor from 1 sample to 5 samples in order to find the optimum sample size, i.e. the one with the lowest equal error rate. This gives credibility regarding the effectiveness of the factor values. We can clearly see from figure 6.3 that the equal error rate decreases with larger sample sizes.
(a) Group 1 (b) Group 2 (c) Group 3
Figure 6.2: Inconsistency of threshold among different groups of users

(a) 1 sample (b) 2 samples (c) 3 samples (d) 4 samples (e) 5 samples
Figure 6.3: Varying the data size for group 1 of users
6.6.1 Distance measures
Table 6.1 shows the results of the comparison between the different distance measurements. In particular, we show the relationship between distance type and the ability of the user-independent threshold to authenticate users. We calculated the EER (equal error rate) for each distance among all users. Furthermore, we computed the consistency of the threshold across different groups of users by calculating the standard deviation (STD). Both EER and STD are used to evaluate the system. The EER is quite similar for all of the distances except Euclidean distance using a Fibonacci series. However, the STD of the user-independent threshold across different users is not similar: the lower the STD, the more consistent the threshold across different groups of users. We can clearly see from the table that the lowest variance (STD) is for the Euclidean and cosine distances, which means that their discrimination ability is stable across different users. There is a difference in results between Euclidean distance using a Fibonacci series and the other three algorithms, owing to the detection ability of the algorithm: Euclidean distance using a Fibonacci series does not consider the importance of every feature in the sample, as it only considers the mean of all features, whereas Euclidean distance, City block distance and cosine similarity measure the distance between two samples feature by feature. This helps to account for the importance of each feature separately.
Distance type                               EER      STD
Euclidean distance                          18.66%   0.058%
Euclidean distance using Fibonacci series   42.66%   0.915%
City block distance                         16%      1.069%
Cosine distance                             17.3%    0.070%
Table 6.1: Comparing accuracy for different distance measurements
6.6.2 Various numbers of keystrokes (data size)
We varied the data size in order to find the optimum data size that can stabilise the user-independent threshold across different users. We computed each classification instance over a sample by merging five samples together, and then decreased the classification instance from four samples down to three samples until reaching one sample. As a result, classification was carried out for all the different sizes in order to see how the threshold was influenced by the data size for user authentication.
Table 6.2 shows the average EER comparing cosine and Euclidean distance over different data sizes. In particular, we show the relation between EER and data size in order to see the influence of a larger data size on the EER. As might be expected, we can clearly see from the table that the EER decreases with a larger sample size for both distances. Euclidean distance reduced the EER to almost zero per cent with a sample size of 5, whereas cosine distance achieved better EER results for lower sample sizes, such as 1 and 2. This is due to how cosine distance and Euclidean distance operate, and to the nature of the typing data. With lower sample sizes, such as 1 and 2, the typing data has a lot of diversity and variety, and cosine distance normally works better when the data is not homogeneous, since it considers the angle between the two vectors rather than only their magnitudes [96, 81]. With higher sample sizes, such as 4 and 5, the typing data becomes more homogeneous, and Euclidean distance normally works better when the data is nearly identical or more homogeneous, as it measures the actual distance between the two vectors [96, 81].
Therefore, the misclassification of a user's samples is influenced by the sample size, or the number of features, of the user's typing. Taking a large sample size from users reduces the variability of each user's typing, which helps to decrease the overlap, and hence the misclassification, between different users' samples, as the EER clearly shows in the case of a sample size of 5. On the other hand, taking a lower sample size from users increases the variability of each user's typing, which increases the overlap and misclassification between different users' samples, as the EER shows in the case of a sample size of 1.
EER                  1 sample   2 samples   3 samples   4 samples   5 samples
Cosine distance      17.3%      11.66%      9%          7.33%       8.6%
Euclidean distance   18.66%     16.66%      7.33%       3.33%       0.66%
Table 6.2: Comparing EER for different sizes of data
Figure 6.4 shows the STD results comparing cosine and Euclidean distance over different data sizes. In particular, we show the relation between the STD (the variance of the user-independent threshold across different groups of users) and the data size, in order to find the influence of a larger data size on the consistency of the threshold. We can clearly see from the graph that the variance of the threshold decreases with larger sample sizes when using Euclidean distance. This means that the threshold is more consistent with larger sample sizes in the case of Euclidean distance.
Figure 6.4: Consistency of the threshold for different sizes of data (standard deviation of the threshold plotted against data size, for cosine and Euclidean distance)
6.6.3 Feature type
Since users can type characters in any sequence during a session, continuous authentication approaches require selecting multiple features that are representative of user typing behaviour. We show the relation between feature type and the ability of the user-independent threshold to authenticate users. In the previous chapter, we identified different feature types, representative of user typing behaviour, which distinguish a single user from many users. In this work, we chose the best two feature selection techniques: the most frequently typed character pairs, and the character pairs typed with the most consistent time. We evaluated them on the user-independent threshold in order to see the influence of feature type on the consistency of the threshold. We can clearly see from table 6.3 that the feature
Feature type                               EER   STD
50 most frequent character pairs           0%    0.09245
50 most consistent-time character pairs    2%    0.09245
Table 6.3: Comparing accuracy for different feature types

Feature amount                      EER     STD
100 most frequent character pairs   0.66%   0.00372484
50 most frequent character pairs    0%      0.09245
Table 6.4: Comparing accuracy for different amounts of the feature set
type of most consistent-time character pairs and the feature type of most frequent character pairs have a similar impact on the consistency of the user-independent threshold.
6.6.4 Feature amount
The feature amount is the number of features that represent the user's typing in each user sample. The previous experiments in sections 6.6.1 and 6.6.2 were based on roughly the 100 most frequent character pairs. We chose this particular number of features because it gives the best results in terms of consistency of the threshold across different groups of users. Table 6.4 shows an example of a comparison between different amounts of features, where we reduced the number of most frequent character pairs. We can clearly see from table 6.4 that the amount of features is critical to obtaining a consistent user-independent threshold; a certain number of features is needed to get a consistent user-independent threshold across different groups of users.
After examining the different factor values that could influence the accuracy of the threshold, we found optimum factor values that stabilise the threshold value among different groups of users. Figure 6.5 shows that the threshold is consistent across all three groups of users with the optimum factor values: Euclidean distance, the 100 most frequent character pairs and a data size of 5 samples for all three groups.
(a) Group 1 (b) Group 2 (c) Group 3
Figure 6.5: Consistency of threshold among different groups of users (false positive and false negative alarm rates plotted against threshold values; in each panel the optimum threshold lies where the two curves cross)
6.7 Comparing to user-dependent threshold
In this comparison, we assumed the closed-setting deployment scenario, discriminating among N users. In this scenario the environment should prevent any user not registered in the system from gaining access. There are contemporary schemes of continuous user authentication that can be applied in this kind of scenario [32, ?] which are based on a user-dependent model. We classified these user-dependent schemes into two classes, as explained in chapter 3.
The first class of continuous user authentication schemes uses supervised learning, as explained in section 3.4.1. This type of learning problem is known as multi-class classification. These schemes authenticate only the users who are already registered in the system, which makes the system less effective when new users interact with it. In particular, it will not detect unauthorised users who were previously unknown.
The second class of continuous user authentication schemes also uses supervised learning algorithms, as explained in section 3.4.2, and demands that typing data be available and collected for valid users [40, 8] before a profile of normal behaviour is generated. This type of learning problem is known as one-class classification. Again, these schemes depend on predefined user-typing models, and this dependency restricts the systems to authenticating only known users whose typing samples are modelled.
We implemented the previous two classification approaches, which are based on a user-dependent model, and compared them to our new classification approach, which is based on a user-independent model. The first approach is based on a multi-class classification algorithm using k-means. We calculated the mean value (centroid) for each user and then, in the testing phase, measured the distance between the unknown user sample and each centroid in the system's database in order to find the smallest distance. The accuracy of the system was then calculated from the misclassifications, i.e. when a sample is classified to the wrong user. The second approach is based on one-class classification using k-means. We again calculated the mean value (centroid) for each user, but in the testing phase we did not use all the user models to make the decision; we used only the model of the user who wants to authenticate. In the training phase, we calculated the distance between the centroid and each sample of the user and identified the furthest (maximum) distance as that user's threshold. Then, in the testing phase, the classification of an unknown sample to the assumed user succeeds if the distance between the unknown sample and the centroid is less than the predefined threshold of that user.
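The two user-dependent baselines can be sketched as follows (an illustrative reconstruction, assuming a generic `distance` function and per-user training samples; with one cluster per user, k-means reduces to the per-user centroid):

```python
def centroid(samples):
    """Mean vector of a user's training samples."""
    n = len(samples)
    return [sum(col) / n for col in zip(*samples)]

def multi_class_classify(sample, centroids, distance):
    """Multi-class baseline: assign the sample to the user whose
    centroid is nearest."""
    return min(centroids, key=lambda user: distance(sample, centroids[user]))

def one_class_accept(sample, user_samples, distance):
    """One-class baseline: accept if the sample is no farther from the
    user's centroid than the farthest training sample (the per-user
    threshold)."""
    c = centroid(user_samples)
    threshold = max(distance(s, c) for s in user_samples)
    return distance(sample, c) <= threshold

# Hypothetical 2-D training samples for two users
dist = lambda a, b: sum(abs(x - y) for x, y in zip(a, b))
train = {"alice": [[1.0, 2.0], [1.2, 2.2]], "bob": [[9.0, 9.0], [8.8, 9.2]]}
cents = {u: centroid(s) for u, s in train.items()}
print(multi_class_classify([1.1, 2.1], cents, dist))
```

Both baselines need the pre-built per-user models (`cents` or `train`); our user-independent approach replaces them with a single fixed threshold applied to the distance between any two samples.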
We summarise the similarities and differences between our proposed approach (user-independent threshold) and a user-dependent approach based on the k-means clustering process, drawing on the results of our experiment, in table 6.5.
6.7.1 Experimental Methodology
For the experiments in this section we used the Euclidean distance measure for all three classification approaches, and the same dataset [32] used to obtain the user-independent threshold. However, the method here differs from the previous experiment: we do not divide the users into different groups. Instead, we deal with all user samples and perform the comparison for all possible combinations, classifying between typing samples of the same and of different users.
6.7.2 Empirical Analysis
The results obtained with the Euclidean distance measure for all three approaches are shown in table 6.6. The accuracy of the system over all 14 users was computed by counting the misclassified user samples over all the classified user samples.
Table 6.6 shows the average classification results comparing the current user-dependent schemes with our proposed user-independent approach. As might be expected, the accuracy of the user-dependent approaches is better than that of our proposed approach, especially for one-class classification. However, our approach performs comparably to the existing approaches once a sufficient number of keystrokes is obtained, as in the case of 5 samples. Also, with 6 samples we obtained 100% accuracy, although our dataset has a limited size, which may make this result less statistically significant. Furthermore, our proposed approach has an advantage over the current schemes: it can classify two unknown user samples and decide whether they come from one user or from two different users. The current schemes have no mechanism for deciding about two unknown user samples that belong to a user who has not interacted with the system before; for example, in an open-setting scenario in a non-restricted environment where typing data for trusted users and possible impostors is not available, or where it is not possible to collect such data in advance, such as a
Criterion: Input data
- Proposed approach (user-independent threshold): one known user sample and an unknown user sample, or two unknown user samples.
- Multi-class classification (k-means clustering algorithm): an unknown user sample and the centroids of the clusters for the known users.
- One-class classification (k-means clustering algorithm): an unknown user sample and the centroid of the cluster for the known user.

Criterion: Detection mechanism
- Proposed approach: based on one fixed threshold value.
- Multi-class classification: uses the distance measure and the mean values (centroids) to find the smallest distance.
- One-class classification: uses the distance measure and the mean value (centroid) of one known user.

Criterion: Final outcome
- Proposed approach: knowing whether the two samples belong to one user or to two different users.
- Multi-class classification: knowing whether the unknown user sample belongs to the assumed user or not.
- One-class classification: knowing whether the unknown user sample belongs to the assumed user or not.

Table 6.5: Comparative analysis of our approach (user-independent) and current schemes (user-dependent)
Data size (each sample   User-independent   User-dependent using        User-dependent using
contains 700-900         threshold          multi-class classification  one-class classification
characters)                                 (k-means)                   (k-means)
1 sample                 78%                61.9%                       100%
2 samples                83%                78.5%                       100%
3 samples                92.3%              100%                        100%
4 samples                96.6%              100%                        100%
5 samples                99%                100%                        100%
Table 6.6: Comparing accuracy between the user-dependent and user-independent approaches
public library or an Internet cafe. Our scheme also works more simply and much faster, since the current schemes, being based on multi-class classification, require computing distances against their pre-built user models, whereas our scheme only computes the distance between two samples and compares it with the fixed threshold for classification.
6.8 Discussion and limitations
We proposed a new unsupervised process for providing continuous user authentication with keystroke dynamics. The new process aims to obtain the user-independent threshold. We initiated the current work to check whether a user-independent threshold can be found and applied to the whole set of users in the dataset. This work has shown that it is possible to obtain the threshold, but it requires a sufficient number of keystrokes from users, an appropriate number of features, and distance measures that are based on the magnitude of the difference between vectors, such as Euclidean distance. This research has shown how the simple comparison of a statistic or absolute value with a threshold can distinguish users, including those who are new to the system.
We described how we obtained the user-independent threshold using two tests: between two samples from the same user's data, and between two samples from different users' data. In theory, since a user often types in a similar way and different users often type differently, it is possible to find a single threshold which separates the two classes at a given operating cross-point error rate. This user-independent threshold at the EER has been found experimentally, deriving an alternative way to distinguish two different users' typing data.
The optimum threshold values become larger when we reduce the number of keystrokes, or the size, of the samples. Increasing the number of keystrokes decreases the variability of the user's typing behaviour, which reduces the chance of a valid user's instances overlapping with, or being misclassified as, those belonging to an intruder. This corresponds to the false positive or false alarm rate. Taking a smaller data size from users increases the variability of each user's typing behaviour, which increases the chance of an intruder's instances overlapping with, or being misclassified as, those belonging to the valid user. This corresponds to the false negative rate.
No matter how large the data set used to extract the authentication model based
on the user-independent threshold, this data has a limited scope of the
authentication space and is not necessarily representative of the entire
authentication space. In addition, the authentication space is very dynamic, as
different environmental factors, including different languages and different
keyboards, can influence the accuracy of the system.
In this work, we only evaluated four factors, namely distance type, number of
keystrokes, feature set and feature amount, that are directly related to the user
samples and could influence the consistency of the threshold among different
users. However, there are other factors related to the environment that could
influence the user-independent threshold, such as the language and keyboard type.
In our work, the language is Italian, and we obtained the user-independent
threshold for this language. However, we do not know the influence of other
languages on the user-independent threshold. Evaluating other languages could
establish whether the language used has an impact on the user-independent
threshold. Furthermore, different types of keyboards might influence the
user-independent threshold, and it would be interesting to see whether a user
typing on different keyboards has an impact on the independent threshold.
We do not assume that the user-independent threshold value we obtained would be
applicable to any other dataset. However, we conjecture that the process of
obtaining the user-independent threshold can be applied to any other dataset to
obtain its own threshold value. That is, the user-independent threshold value
could differ from one dataset to another because of factors related to the
conditions of the dataset, such as different keyboards, different time clocks and
different languages. The impact of these environmental conditions could change
the parameter values of the user-independent threshold; therefore, the
user-independent threshold value can change.
The promising capability of the proposed approach, which authenticates users
without requiring a pre-defined typing model from any user, motivated our
investigations into improving the technique further to suit an on-line
authentication system. Further investigation is required in order to overcome
some limitations, which are listed below.
We used the EER as the user-independent threshold in this experiment; this
EER point was found experimentally to provide an alternative way of
distinguishing two different users' typing data. Our experiments show that,
for our data set, the user-independent threshold is sufficient to differentiate
all users.
Currently, the parameters used in the proposed approach are set manually
based on the analysis of the dataset. A further extension is to devise a
methodology for automatically adapting these parameters.
A static user-independent threshold value is used to authenticate unknown
samples, as it is built from a single dataset. The threshold value needs to be
evaluated on other datasets.
A new mechanism is required for improving the authentication technique by
reducing the data size.
Currently, authentication is based on a data size that is fixed and symmetric
for both authenticated-user and unknown-user samples. We need to determine the
optimum data size for both of them.
6.9 Summary
In this chapter, we presented a new approach to continuous user authentication
using behavioural biometrics provided by keystroke dynamics. Our approach
focused on obtaining a user-independent threshold that can distinguish users
accurately without needing any pre-defined typing samples. We evaluated the
user-independent threshold with four different factors, namely distance type,
number of keystrokes, feature set and feature amount, in order to find the
optimum factor values that are suitable for the whole set of users. We also
evaluated the user-independent threshold among three different groups of users
to show how sensitive it is to different users' typing data. This increased our
confidence in the effectiveness of the optimum factor values.
Using a GP dataset for keystroke dynamics, we showed experimentally that a
user-independent threshold can be obtained if a sufficient number of keystrokes
and features are used, together with Euclidean distance. In the next chapter we
focus on overcoming the limitations discussed in the previous section by
designing a system that uses the proposed approach in order to develop an
automated system. The system will automate the authentication model so that it
adapts to users' changing typing data, and illustrates a sequential analysis
technique for user authentication. Sliding-window and fixed-window approaches
will be proposed to design the system.
Chapter 7
Typist Authentication based on
user-independent threshold
Detecting an intruder or impostor who takes over from the genuine user during a
computer session presents several challenges, including the high volume of user
data and the difficulty of identifying the change point of the impostor's data
during the computer session. As we have previously noted, an effective way to
work out whether two typing samples belong to the same user is to deploy a
user-independent threshold. In the previous chapter, we proposed the idea of a
user-independent threshold that does not require building user typing models
a priori. The threshold can be seen as a line drawn between the distances of
typing samples of the same and of different users, used to work out whether two
typing samples in a user session belong to the same user. While the threshold is
very efficient in distinguishing different users' samples, it remains to be shown
how it can be used in a practical system where the change point between different
users is unknown a priori. A concrete system design can be used to answer
questions such as: how much typing data does the system need from the impostor
in order to detect him? In particular, this chapter addresses the third
sub-question from Research Question 2 (outlined in chapter 1):
Can we automatically detect the impostor who takes over from the valid user
during the computer session, and how much typing data does the system need to
detect the impostor?
In this chapter we address these issues and propose an adaptive authentication
system that automatically detects the impostor. The main contributions of this
chapter include:
A method for automatically differentiating between two different users' typing
data, based on comparing the distances between samples against the
user-independent threshold value.
A method for automatically distinguishing different users' typing data without
the need for any pre-defined user typing model a priori.
A method for minimising the detection delay time of an impostor or intruder.
The rest of the chapter is structured as follows. Section 7.1 provides the
motivation behind this work. Section 7.2 provides a detailed description of the
proposed system. Section 7.3 gives an overview of change point detection.
Section 7.4 discusses the automated detection techniques using a sliding window
(non-overlapping) and a sliding window (overlapping). The experimental
methodology and the results of the experiment are discussed in section 7.5.
Further discussion of the experimental results and limitations of the proposed
techniques is given in Section 7.6. Finally, section 7.7 summarises the chapter.
7.1 Motivation
A user-independent threshold was obtained with the requirement of a large data
size from users, as discussed in the previous chapter. The requirement for the
user-independent threshold to have a large amount of typing data is applicable
to one of the continuous authentication scenarios that we identified in
chapter 3: for example, a computer-based TOEFL exam scenario, which typically
requires a large amount of typing data from students. In this scenario, we
assume that no profile information is available for any user, authorised or not,
at the beginning of the session. We do, however, assume that the user who
initiates the session is authorised to do so by user name and password. The
system would verify the student at the start of the exam, but the teacher or
instructor cannot be sure whether the exam has been completed only by the valid
student. Threats include a substitute student completing the exam on behalf of
the valid student who was authenticated at the start of the exam. The challenge
here is to build a profile of this authorised user while, at the same time,
trying to decide whether or not the session has been taken over by an impostor.
7.2 Typist Authentication System
The proposed continuous authentication system can be thought of as a kind of
intrusion detection system (IDS). An IDS monitors a series of events with the
aim of detecting unauthorised activities. In the case of our continuous
authentication system, the unauthorised activity is a person acting as an
impostor by taking over the authenticated session of another (valid) user.
Based on the generic model that we proposed in chapter 3 and the computer-based
TOEFL exam scenario, there are five basic components that describe the system,
as shown in Figure 7.1.
1. Students: Students can be either authorised or unauthorised [20]. An
authorised student is allowed to access a system by providing some form of
identity and credentials, and is allowed to type on the system during the
session. We assume that no pre-defined typing model is available for the
authorised student at the beginning of the session. Likewise, we assume that no
pre-defined typing model is available to the system in advance for the
unauthorised student. The unauthorised student can be a colluder invited by the
valid student to complete an action on the student's behalf.
2. Keyboard: a device that collects behavioural biometric data from the user
and translates it into a signal that can be read by the system. The aim of data
collection is to keep that data on record and to make further analysis of it.
The sensor is based on one or more behavioural characteristics [55] for
uniquely recognising humans. All of this data is exported to the next component
(feature extraction).
3. Feature Extraction: Since users can type characters in any sequence during
a session, multiple features that are representative of user typing behaviour
must be selected. The sequences of characters that users type during a session
can be used as distinguishing features. An example of one of these features,
proposed in chapter 5, is the n-graph: the time interval between the first and
the last of n subsequent key presses. In this component, we select features
(sequences of characters) that are representative of the user, i.e. that
reflect a user's normal typing pattern. For each user typing during the
computer session, representative features are generated and recorded as input
for the next component (the detector).
4. Detector: In this component, we analyse the users' typing data and compare
the data in order to perform the error measurements that may detect the
intruder. The detection algorithm takes the user-independent threshold as an
input. The challenge of the algorithm is to build a profile of the authorised
user while, at the same time, trying to decide whether or not the session has
been taken over by an intruder. The accuracy of the correctness of a single
measurement is set in this component; accuracy is determined by comparing the
measurement against the true or accepted value. The false acceptance rate
(FAR) measures the accuracy of the CBAS: it is the likelihood that the
biometric security system will incorrectly accept an access attempt by an
unauthorised user. A system's FAR is typically stated as the ratio of the
number of false acceptances to the number of identification attempts. The
false rejection rate (FRR) is a measure of the likelihood that a biometric
security system will incorrectly reject an access attempt by an authorised
user. A system's FRR is stated as the ratio of the number of false rejections
to the number of valid identification attempts.
5. Response unit: takes an appropriate response upon detecting the intruder or
impostor. The system has two main types of response: passive and active. A
passive system will typically generate an alert to notify an administrator of
the detected activity. As a passive response in the TOEFL exam, for example,
the teacher or instructor can be notified that the user's typing behaviour has
changed significantly, which indicates that an intruder might have taken over
from the authenticated student to complete the exam on his behalf. An active
system, on the other hand, performs some response other than generating an
alert. Such systems minimise the damage caused by the intruder, for example by
terminating network connections or ending the session. In the TOEFL exam
scenario, the system could shut down, or the exam application stop, when the
system recognises significant changes in the user's typing behaviour.
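The FAR and FRR definitions in the detector component reduce to simple ratios over the detector's decisions. A minimal sketch (the function name and the decision log are hypothetical illustrations):

```python
def far_frr(decisions):
    """Compute (FAR, FRR) from a log of (is_genuine, accepted) pairs:
    FAR = impostor attempts accepted / impostor attempts,
    FRR = genuine attempts rejected / genuine attempts."""
    impostor = [accepted for genuine, accepted in decisions if not genuine]
    valid = [accepted for genuine, accepted in decisions if genuine]
    far = sum(impostor) / len(impostor)
    frr = sum(not accepted for accepted in valid) / len(valid)
    return far, frr

# Hypothetical decision log: (is_genuine, accepted).
log = [(True, True), (True, True), (True, False), (True, True),
       (False, False), (False, True), (False, False), (False, False)]
print(far_frr(log))  # (0.25, 0.25): 1 of 4 impostors accepted, 1 of 4 genuine rejected
```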
Figure 7.1: Overview of the proposed typist authentication system
7.3 Change detection
Most detector algorithms to date have dealt with discrete typing data, with
labelled or implied beginning and ending points. In such tasks, the problem of
determining where the user changed during the computer session is merely one of
silence detection. A change-detection-based approach is proposed in this chapter
to automatically identify the impostor during the computer session.
Change detection is based on the premise that a user often types in a similar
fashion (that is, the distance between samples from the same user is small) and
different users often type differently (that is, the distance between different
users' samples is large). A change in the data between two different samples
usually transforms the keystroke dynamics parameters in such a way that their
statistical properties no longer remain constant, resulting in distance changes.
In particular, a change point occurs where the distance value between two
consecutive typing samples changes significantly and exceeds the pre-defined
threshold. In order to detect any significant change that indicates impostor
activity, we recorded the data in terms of individual character sequences as a
series of time-based events. Window mechanisms have been used effectively for
time series analysis in several domains, such as network traffic [4]. In this
chapter, we propose two window mechanisms, a sliding window (non-overlapping)
and a sliding window (overlapping), in order to perform the time series analysis
effectively.
The significance of the proposed technique is two-fold: first, automatic
detection of the impostor and, second, minimisation of the detection delay time.
The validity of the proposed technique is investigated by experimentation
showing how quickly the system is able to detect the impostor. The analysis of
the proposed technique demonstrates its high detection accuracy, low false alarm
rate and fast detection time.
7.4 Time Series Analysis and Attack Detection
The user typing data, recorded in terms of individual character sequences, is a
series of time-based events. The character sequence (n-graph) is a popular
feature among existing schemes: it is the time interval between the first and
the last of n subsequent key presses. In our work, we use 2-graphs (sequences
of two characters) as features because they are the basic element of n
subsequent key presses and occur more frequently than general n-graphs. The
2-graph times are represented as a time series in the window mechanism. If an
impostor takes over from the genuine user who authenticated at the start of the
computer session, it is expected that the distance measures between the samples
will no longer remain constant, resulting in distance changes.
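Extracting 2-graph times from a key-press stream can be sketched as follows; the event format and timestamps are hypothetical illustrations of the idea, not the thesis's actual logging format:

```python
def two_graph_times(events):
    """Given (char, press_time_ms) key-press events in order, return a dict
    mapping each 2-graph (pair of consecutive characters) to the list of
    time intervals between the first and second key press of the pair."""
    graphs = {}
    for (c1, t1), (c2, t2) in zip(events, events[1:]):
        graphs.setdefault(c1 + c2, []).append(t2 - t1)
    return graphs

# Hypothetical event stream: the user types "the t".
events = [("t", 0), ("h", 110), ("e", 205), (" ", 330), ("t", 450)]
print(two_graph_times(events))
# {'th': [110], 'he': [95], 'e ': [125], ' t': [120]}
```

Each recurring 2-graph accumulates a list of durations, and those per-pair duration lists are what the window mechanism treats as a time series.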
We propose two window mechanisms, a sliding window without overlapping (SNO)
and a sliding window with overlapping (SO), in order to analyse the time series.
In the non-overlapping sliding window technique, the testing window (TSW) shifts
window by window, and the window in our case has a size of five samples. In the
overlapping sliding window technique, the testing window (TSW) is partitioned at
different points and the window is then shifted sample by sample with a fixed
window size (five samples).
In order to detect the impostor, fixed and sliding windows over the time series
can be used as a criterion to capture the typing data stream. An abnormal change
in the typing data can be detected by comparing two adjacent non-overlapping
windows of the time series: the first window, authenticated at the start of the
computer session, called the authenticated window (AUW), and the test window
(TSW). If the distance between the AUW and the TSW exceeds the threshold, we
decide that an abrupt change has happened, which means that an impostor took
over from the authenticated user at that time point. The size of the windows can
be fixed or variable. A variable window means that the sizes of the AUW and TSW
are variable; a fixed window means that the sizes of the AUW and TSW are fixed
and the same for each comparison. In our case, the threshold was obtained with
a fixed window size for both the AUW and the TSW. The details of the fixed
window and the sliding window are described in the following:
Figure 7.2: Sliding window (not overlapping)
AUW and TSW. The details of xed window and sliding window are described in
the following:
7.4.1 Sliding window (non-overlapping)
Fixed-length window analysis can delay impostor detection: it analyses only the
parameters within a window and compares the values to the window that was
authenticated at the start of the session. The procedure to detect the impostor
is as follows (see Figure 7.2):
1. The AUW and TSW should be the same size. We authenticate the AUW as coming
from the authenticated user who starts the computer session. Then we divide the
rest of the typing data into different TSWs with the same size as the AUW.
2. Compare the distance between the AUW and the first TSW. If the distance does
not exceed the threshold, move to the second TSW and compare it to the AUW.
Repeat this step until the distance exceeds the threshold.
3. When the distance between the AUW and a TSW exceeds the threshold, we have
detected the impostor. If the distance does not exceed the threshold for any of
the AUW-TSW comparisons, then the typing data belongs only to the authenticated
user and the session has not been taken over by an impostor.
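The SNO procedure can be sketched as a loop that advances the test window in whole-window steps. The data, the `mean_gap` distance and the window size below are hypothetical stand-ins for the thesis's merged samples and Euclidean distance:

```python
from statistics import mean

def detect_sno(samples, window, threshold, distance):
    """Sliding window, non-overlapping (SNO): the first `window` samples form
    the authenticated window (AUW); the rest is cut into consecutive TSWs of
    the same size, each compared to the AUW in turn. Returns the index of the
    first sample of the TSW that trips the threshold, or None."""
    auw = samples[:window]
    for start in range(window, len(samples) - window + 1, window):
        tsw = samples[start:start + window]
        if distance(auw, tsw) > threshold:
            return start  # impostor detected in this window
    return None  # session stayed with the authenticated user

def mean_gap(a, b):
    """Toy window distance: gap between the windows' mean feature values."""
    return abs(mean(a) - mean(b))

# Hypothetical per-sample feature values; the impostor takes over at index 4.
samples = [10, 11, 9, 10, 30, 31, 29, 30]
print(detect_sno(samples, 4, 5.0, mean_gap))  # -> 4
```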
7.4.2 Sliding Window (overlapping)
Figure 7.3: Sliding window (overlapping)
The analysis of time-varying parameters is usually performed over parameter
values covered by a window of infinite length, considering all previous values,
or by analysing values within a window of finite but fixed length. It is well
known that a finite window size allows more effective analysis of such
parameters than a window of unbounded length. In addition, the use of a
fixed-length window for change detection in time-varying parameters might lead
to incorrect analysis, by delaying the detection of a change point or even not
detecting a change point in the first place. We aim to overcome this problem by
using a sliding window mechanism in which the window length is adjusted in
response to a change in the parameter value. The procedure to detect and
identify the change in the typing data that indicates impostor activity in the
session is as follows (see Figure 7.3):
1. The AUW and TSW should be the same size. We authenticate the AUW as coming
from the authenticated user who starts the computer session. Then we divide the
TSW into different partition points.
2. Compare the distance between the AUW and the TSW. If the distance does not
exceed the threshold, shift the start of the TSW from the first partition point
to the second, so that the new TSW starts from the second shift point. Repeat
this step until the distance exceeds the threshold.
3. When the distance between the AUW and the TSW exceeds the threshold, we have
detected the impostor. If the distance does not exceed the threshold for any of
the AUW-TSW comparisons, the typing data belongs only to the authenticated user
and the session has not been taken over by an impostor.
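The SO procedure differs from SNO only in that the test window advances sample by sample rather than window by window, so a contaminated window is examined sooner. A sketch under the same hypothetical data and toy distance assumptions as before:

```python
from statistics import mean

def detect_so(samples, window, threshold, distance):
    """Sliding window, overlapping (SO): like SNO, but the TSW advances one
    sample at a time, so a window partly filled with impostor data can trip
    the threshold before a whole-window boundary is reached."""
    auw = samples[:window]
    for start in range(window, len(samples) - window + 1):  # step of 1
        tsw = samples[start:start + window]
        if distance(auw, tsw) > threshold:
            return start
    return None

def mean_gap(a, b):
    """Toy window distance: gap between the windows' mean feature values."""
    return abs(mean(a) - mean(b))

# Hypothetical per-sample feature values; the impostor takes over at index 6.
samples = [10, 11, 9, 10, 10, 11, 30, 31, 29, 30]
# Returns the start of the first TSW holding enough impostor data to trip
# the threshold, which can precede the actual takeover index.
print(detect_so(samples, 4, 5.0, mean_gap))
```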
7.5 Experiment
The experiment is designed to find how effective the user-independent threshold
is in designing a new CAS based on keystroke dynamics, with the aim of detecting
the impostor automatically. The experiment will also discover how quickly we can
detect the impostor.
In this section, we lay out the dataset and the experimental method, and we
present the empirical results.
7.5.1 Data set
In the previous chapter, we obtained the user-independent threshold by merging
five samples into one sample and using it for distinguishing between users. In
this experiment, we therefore used the same merging of five samples into one
sample for all users. The total data per user then contains three samples (each
being five samples merged together). For our evaluation data, we used all 14
users' data, setting each user's typing data in turn as the authenticated data,
with the other 13 users' data acting as impostors.
7.5.2 Experimental method
The method comprises three steps: (1) initiate the first window for an
authenticated user with a size of five samples; (2) vary the typing data from
both the authenticated user and the impostor in the second window; (3)
repeatedly run the evaluation while systematically changing the users' data in
the second window. The following steps summarise the evaluation procedure:
1. We fixed the user data in the first window by choosing the first five samples
from each of users 1 to 14 in turn to represent an authenticated user.
2. The second window has the same size as the first (authenticated) window. It
requires five samples, and these samples are mixed from the authenticated user
and an impostor in order to see the impact of the amount of data from the
impostor.
3. In the second window, we start from the beginning of the window by adding one
sample from the authenticated user (who is already authenticated in the first
window, AUW) and four samples from the impostor. For example, if the typing data
in the AUW is from user 1, then the typing data in the second window (TSW) has
one sample from user 1 that was not used in the first window (AUW), and the
remaining four samples can be from other users (e.g. users 6 to 14). We then
increase the data from the authenticated user to two, three, four and five
samples, until the whole window comes from the authenticated user.
4. The distance measures were calculated in the previous steps by comparing the
authenticated window (AUW) to the second, or testing, window (TSW). If the
distance between the AUW and the TSW does not exceed the pre-defined threshold,
the typing data belongs only to the authenticated user and the session has not
been taken over by an impostor.
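The window-mixing in step 3 can be sketched as follows; the sample labels, function name and window size are hypothetical placeholders for the merged samples described above:

```python
def build_test_windows(auth_samples, impostor_samples, window=5):
    """For each split k = 1..window, build a TSW with k authenticated-user
    samples followed by (window - k) impostor samples, mirroring step 3."""
    tsws = []
    for k in range(1, window + 1):
        tsw = auth_samples[:k] + impostor_samples[:window - k]
        tsws.append((k, tsw))
    return tsws

# Hypothetical merged samples, labelled by owner for clarity.
auth = ["u1_s6", "u1_s7", "u1_s8", "u1_s9", "u1_s10"]
imp  = ["u6_s1", "u6_s2", "u6_s3", "u6_s4", "u6_s5"]
for k, tsw in build_test_windows(auth, imp):
    print(k, tsw)
# k=1 yields 1 authenticated + 4 impostor samples; k=5 is all authenticated.
```

Each of the five mixed TSWs is then compared against the AUW, which is what yields the per-impostor-size accuracy columns of Table 7.1.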
7.5.3 Experimental results
We tested our detection system while varying the point of change in the typing
data across different users' typing data. In the test, we set the
user-independent threshold to detect whether an impostor exists. Once the
distance exceeds the threshold, there is a significant change in the typing
data, meaning that at this time point we consider the computer session to have
been taken over by an impostor.
On the basis of the above experiment, it can be inferred that impostor detection
can happen early if the impostor's typing data is significantly different from
the authenticated user's. However, some users acting as impostors may type
similarly to the authenticated user, which makes it hard to distinguish them
based on a small amount of data.
In typical identification systems the detection accuracy can be increased by
changing the threshold, at the cost of an expected increase in false positives.
However, in this experiment we are doing something different. In the previous
chapter we showed that the generic threshold obtained results in no false
positives. In this chapter we are not changing the threshold; instead, we are
examining the likelihood of detecting a value above the threshold earlier than
is guaranteed.
The accuracy of impostor detection can be seen in Table 7.1. For example, the
first row of the table shows the accuracy where the authenticated data (AUW) is
from user 1 (samples 1 to 5) and the impostor's data is from other users, for
different sizes of impostor data. In our experiment the testing window (TSW)
size is five samples, and we analyse that window when it includes data from both
the authenticated user and the impostor. Inside the testing window, we vary the
amount of data from the impostor (1 to 5 samples) in order to see how quickly we
can detect the change in the typing data.
The table clearly shows that the accuracy rate increases with larger sample
sizes from the impostor. However, we can still detect the impostor even with a
small amount of data from him. The reason is that some users type very
differently from others, which affects the distance between the two compared
windows.
                                     Impostor data size
Data of authenticated user (user 1)  1 sample  2 samples  3 samples  4 samples  5 samples
Accuracy (samples 1-5)                  2.56%     20.51%     79.49%     97.44%       100%
Accuracy (samples 6-10)                 5.13%     28.21%     74.35%     94.87%       100%

Table 7.1: Accuracy of detection when varying the size of the impostor's data in
one window
Table 7.1 also shows, in the second row, the accuracy of detecting the impostor
for different sizes of impostor data and different data from the authenticated
user (user 1: samples 6-10). The only difference from the previous experiment is
that we changed the authenticated user's data in the testing window. The table
clearly shows that the accuracy is similar to the previous experiment, which
indicates that the choice of authenticated user data has no impact on detecting
the impostor.
Figure 7.4 shows the comparative accuracy of detecting the impostor for two
different authenticated users. For example, the figure shows the accuracy when
the authenticated data (AUW) is from user 6, for different sizes of impostor
data. The figure clearly shows that we can detect the impostor very quickly.
This is because the typing behaviour of this particular user (user 6) is
significantly different from that of other users, which makes the distance
between user 6 and other users very high, exceeding the pre-defined threshold
even with little data from the impostor (1 sample) and four samples from user 6
in the testing window (TSW). Therefore, in some cases it is possible to detect
the impostor very quickly, even when most of the typing data in the testing
window (TSW) comes from the authenticated user and little comes from the
impostor.
Figure 7.4: Comparing the accuracy of detection between two different authorised
users
The figure also shows an example of the accuracy when the authenticated data
(AUW) is from user 1, for different sizes of impostor data. The figure clearly
shows that we cannot detect the impostor quickly in this case. This is because
the typing behaviour of this particular user (user 1) is similar to the
impostors' typing behaviour, which makes the distance between user 1's typing
data and the impostors' typing data less than the pre-defined threshold, even
with a large amount of data from the impostor (4 samples) and 1 sample from
user 1 in the testing window (TSW).
7.6 Discussions and Limitations
Both the SNO and SO techniques are able to detect the impostor after a certain
number of characters. However, the SO technique detects the impostor much faster
than the SNO technique. For example, if the impostor takes over in the middle of
one of the TSWs, the SNO technique may not be able to detect the impostor in the
current TSW until the analysis reaches the next TSW, where all of the typing
data in that window relates to the impostor. The SO technique shifts the
analysis period by period with a fixed window size; in our case the period is
the user's sample, which is about 700-900 characters. This helps to detect the
impostor much more quickly, as the sliding window slides point by point and the
TSW length adjusts in response to a change in the parameter value more quickly
than in the SNO technique.
The period length of typing data can be small or large. A sliding window using a
small period length would incur a high computing cost for the system. On the
other hand, sliding the window using too large a period length results in a
longer detection time. Therefore, the use of the SNO technique, in which the
window slides with a large period length, might lead to incorrect analysis,
either by delaying the detection or even failing to detect the impostor in the
first place.
The typing behaviour of some users is significantly different from that of
others, which can affect the detection time. This can be seen clearly in
figure 7.4, where the system is able to detect the impostor accurately with only
one sample from him. This is because the typing data of the impostor and the
authenticated user shows more diversity and variation, so the distance between
the two sets of data is very high and easily exceeds the threshold. However,
some users type similarly to the impostor, which increases the delay of impostor
detection. This is because the typing data of the impostor and the authenticated
user is nearly similar, or somewhat homogeneous; as the distance between the two
sets of data is small, it is difficult to exceed the threshold.
One limitation of this work is that the size of the TSW is fixed, based on the
analysis used to obtain the user-independent threshold in the previous chapter.
This may prevent fast detection of the impostor, or even prevent detecting the
impostor in the first place. Varying the size of the TSW could enhance the
system's ability to detect an impostor, thus minimising the delay in the TSW
length adjusting to a change in the parameter value.
7.7 Summary
This chapter detailed the design and evaluation of two novel sliding window techniques for analysing typing data in order to detect an impostor who may take over from the genuine user during the computer session, while minimising the detection delay. The proposed SO technique performed better than the proposed SNO technique in terms of fast detection. The user-independent threshold was used in practice to detect the impostor automatically.
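The difference between the overlapping and non-overlapping window strategies can be sketched as below. The window size, step size, and `sliding_windows` helper are illustrative, not the exact SO/SNO implementation:

```python
def sliding_windows(keystrokes, size, step):
    """Yield successive windows of `size` keystrokes, advancing by `step`.
    step < size gives overlapping (SO-style) windows; step == size gives
    non-overlapping (SNO-style) windows."""
    for start in range(0, len(keystrokes) - size + 1, step):
        yield keystrokes[start:start + size]

stream = list(range(10))  # stand-in for a stream of keystroke events

so_windows  = list(sliding_windows(stream, size=4, step=1))  # overlapping: 7 windows
sno_windows = list(sliding_windows(stream, size=4, step=4))  # non-overlapping: 2 windows

# SO re-evaluates after every keystroke, so an impostor is tested sooner;
# SNO waits for an entirely new window, delaying detection.
print(len(so_windows), len(sno_windows))  # 7 2
```

Because the overlapping strategy evaluates a window after each new event rather than after each full window, it tests the impostor hypothesis far more often over the same keystroke stream, which is why it detects faster in the experiments above.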
Chapter 8
Conclusion and Future Directions
Detecting the impostor or intruder who takes over a valid user's session based on keystroke dynamics, in real time or near real time, is a challenging research problem. This thesis focuses on developing automatic analysis techniques for continuous user authentication systems based on keystroke dynamics, with the goal of detecting the impostor or intruder who takes over the valid user's session. The main motivation for this research has been the need for a flexible system that can authenticate users without depending on a pre-defined typing model of a user. Other motivations include the fact that current schemes ignore the application scenarios, and the need for new feature selection techniques that represent user typing behaviour, guaranteeing the selection of frequently-typed features that inherently reflect that behaviour. Added to these motivations has been the lack of an automated continuous authentication system based on keystroke dynamics that is low in computational resource requirements and thus suitable for real-time detection.
This thesis extends previous research on improving continuous user authen-
tication systems (that are based on keystroke dynamics). The research sought
to:
- Better identify and understand the characteristics and requirements of each type of continuous authentication scenario and system.
- Find new features that represent user typing behaviour, guaranteeing the selection of frequently-typed features that inherently reflect user typing behaviour.
- Discover whether a pre-defined typing model of a user is necessary for successful authentication.
- Develop a new flexible technique that authenticates users, and automate this technique to continuously authenticate users.
8.1 Summary of Contributions
This research has resulted in a number of significant contributions in each of these directions, as follows:
1. Model for continuous biometric authentication: A generic model was proposed covering most continuous authentication (CA) scenarios and CBAS. The model classifies CBAS according to their detection capabilities, to better identify and understand the characteristics and requirements of each type of scenario and system. This model pursues two goals: the first is to describe the characteristics and attributes of existing CBAS, and the second is to describe the requirements of the different scenarios for CBAS.
2. User-representative feature selection for keystroke dynamics: We proposed four statistical feature selection techniques for keystroke dynamics. The first simply selects the most frequently typed n-graphs, while the other three consider different user typing behaviours by selecting: n-graphs that are typed quickly; n-graphs that are typed with consistent timing; and n-graphs that have large time variance among users. We used 2-graphs as features in our experiments and found that the most-frequent 2-graphs can represent users' typing patterns effectively because of their high statistical significance. We further substantiated our results by comparison with three contemporary feature selection techniques (i.e., popular Italian words, common n-graphs, and least frequent n-graphs), and found that our technique performed better after selecting a certain number of 2-graphs.
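The most-frequent 2-graph selection can be sketched as follows. The sample text and helper name are illustrative; the thesis operates on timing features of these 2-graphs rather than on raw text alone:

```python
from collections import Counter

def most_frequent_digraphs(text, k):
    """Return the k most frequently typed 2-graphs (adjacent character
    pairs) in a typing sample."""
    counts = Counter(text[i:i + 2] for i in range(len(text) - 1))
    return [g for g, _ in counts.most_common(k)]

sample = "the theme of the thesis"
print(most_frequent_digraphs(sample, 3))
```

The selected 2-graphs then index the timing features (latencies) used for matching; selecting by frequency ensures each feature has enough occurrences in a session to be statistically meaningful.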
3. User-independent threshold for continuous user authentication based on keystroke dynamics: We presented a new approach to continuous user authentication using behavioural biometrics provided by keystroke dynamics. Our approach focused on obtaining a user-independent threshold that can distinguish a user accurately without needing any pre-defined typing samples. We evaluated the user-independent threshold with four different factors, namely distance type, number of keystrokes, feature set and number of features, in order to find the optimum factor values suitable for the whole set of users. We also evaluated the user-independent threshold among three different groups of users to show how sensitive it is to different users' typing data; this increased our confidence in the effectiveness of the optimum factor values. Using a well-known keystroke dynamics dataset, we showed experimentally that a user-independent threshold can be obtained if a sufficient number of keystrokes and a sufficient number of features are used, together with the Euclidean distance.
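The distance-type factor can be illustrated by comparing two common metrics on the same feature vectors. The vector values are made up; this is a sketch, not the thesis's evaluation code:

```python
import math

def euclidean(a, b):
    """Euclidean (L2) distance between equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def manhattan(a, b):
    """Manhattan (L1) distance between equal-length feature vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))

# Hypothetical mean 2-graph latencies (ms) for two typing samples.
u = [110.0, 140.0, 95.0]
v = [150.0, 170.0, 125.0]

print(round(euclidean(u, v), 2))
print(manhattan(u, v))
```

Because the two metrics scale differently with per-feature deviations, a single threshold value only makes sense once the distance type is fixed, which is why distance type was one of the factors evaluated.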
4. Automatic impostor detection based on the user-independent threshold: We detailed the design and evaluation of two novel sliding window techniques for analysing the typing data in order to detect, in real time, an impostor who may take over from the genuine user during the computer session. The proposed sliding (overlapping) window technique resulted in better performance than the proposed sliding (non-overlapping) window technique in terms of fast detection. The user-independent threshold was demonstrated in practice to detect the impostor in near real time.
8.2 Future Directions
This thesis has presented a framework for the analysis of continuous authentication
system based on keystroke dynamics. The work described in the thesis points to
various areas of future research. These future areas of research are presented below.
8.2.1 Application of the Proposed Technique to Different Datasets
In this thesis, we used the GP keystroke dataset [32], based on the Italian language, for our experiments and obtained a reasonable user-independent threshold. The techniques provided in this thesis can be applied to other keystroke dynamics datasets in different languages. We obtained the user-independent threshold based on the Italian language; however, we do not know how other languages influence the user-independent threshold. Evaluation of other languages could be attempted to obtain the user-independent threshold. It will be interesting to analyse the effectiveness of the proposed techniques, and this will help answer questions about how well the user-independent threshold generalises to other datasets.
8.2.2 Application of the Proposed Technique to Different Biometric Sources
We used the keystroke dataset [32] as the biometric source for the experiments on continuous user authentication in this thesis. The techniques provided in this thesis can be applied to other biometric sources such as mouse dynamics and face recognition. We found that a pre-defined typing model of a user is not necessary for successful authentication; however, we do not know whether a pre-defined user model is necessary for successful authentication with other biometric sources. Evaluation of other biometric sources could be attempted to obtain the user-independent threshold. It will be interesting to analyse the effectiveness of the proposed techniques, and this will help answer questions about how well the user-independent threshold generalises to other biometric sources.
8.2.3 Improvements to the Proposed User-Independent Threshold
The promising capability of the proposed approach, which authenticates users without requiring a pre-defined typing model from any user, motivated our investigations into improving the technique further. Further investigation is required to devise a methodology for automatically adapting the user-independent threshold parameters, such as the sliding window size. Also, using a dynamic threshold could improve the accuracy of the proposed user-independent threshold approach. Furthermore, a new mechanism is required to improve the authentication technique by reducing the data size.
8.2.4 Detection of Impostors in Real Time
The proposed techniques have been used in an offline mode to analyse users' typing data. We believe they can be further extended to analyse users' typing data in real time. In this regard, the performance of the proposed technique with respect to detection delays needs to be investigated. In addition, the potential of using typing data to supplement other perimeter defence techniques needs to be explored. This direction can help answer how well observed user typing behaviour can assist other security solutions, for example as an aid to intrusion detection systems.
8.3 Concluding Remarks
This research has highlighted the challenges of analysing user typing data to distinguish between users so that it can be used for biometric authentication. New methods were proposed for improving the analysis of user typing data in order to detect an impostor during the computer session. The analysis techniques provided in this thesis successfully select features that are representative of user typing behaviour and distinguish a user accurately without needing any pre-defined user typing model a priori.
Appendix A
Characteristics of users' typing data
Table A.1: Characteristics of users' typing data

Characteristic                   User 2   User 3   User 4   User 5   User 7   User 14   User 15
Total characters                  12903    20165    14950    21873    12335     16898     15267
Lower case                         8978    12776    10067    15801     9427      5581     10443
Upper case                          819      804      208      305       95      5801       181
Digits                               20       13       20       48        4        22         2
Notations and symbols (#, ...)      306      554        5      818      496       648       539
Backspace                           270     1668     1589      936      210      1291       601
Enter                               168      274      177      264      168       111       198
Null                                427     1657      322      550      117       993      1216
Distribution Time
The following table presents the distribution of typing delays in milliseconds for 10 different users from the dataset. From the table, it can be seen that the typing delay differs for most users. For example, the average typing delay for a sequence of two characters differs significantly between some users, such as user 3, and others, including users 2, 4, 5, 7 and 8. Furthermore, most of the typing delays for sequences of two characters fall in the 101-200 ms category. However, this is not the case for some users, such as user 6, for whom most two-character delays fall in the 201-300 ms category.
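The kind of distribution shown in Table A.2 can be reproduced by bucketing 2-graph delays into 100 ms bins. The delay values and helper below are illustrative:

```python
from collections import Counter

def delay_distribution(delays_ms, bin_width=100):
    """Count 2-graph typing delays per bin, e.g. the 101-200 ms category."""
    dist = Counter()
    for d in delays_ms:
        lo = ((d - 1) // bin_width) * bin_width + 1  # lower edge: 1, 101, 201, ...
        dist[f"{lo}-{lo + bin_width - 1} ms"] += 1
    return dict(dist)

# Made-up two-character delays (ms) for one user.
delays = [90, 150, 180, 210, 120, 95, 160]
print(delay_distribution(delays))
```

For this sample, most delays land in the 101-200 ms bin, matching the pattern the table reports for the majority of users.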
Table A.2: Data distribution for 10 users
Bibliography
[1] A. Jain, R. Bolle, and S. Pankanti. Biometrics: Personal identification in a networked society. Kluwer Academic Publishers, Norwell, MA, 1999.
[2] M.B. Ahmad and T.S. Choi. Local threshold and boolean function based edge detection. Consumer Electronics, IEEE Transactions on, 45(3):674–679, 1999.
[3] A.A.E. Ahmed and I. Traore. Anomaly intrusion detection based on biometrics. In Information Assurance Workshop, 2005. IAW '05. Proceedings from the Sixth Annual IEEE SMC, pages 452–453, 2005.
[4] E. Ahmed, A. Clark, and G. Mohay. A novel sliding window based change detection algorithm for asymmetric traffic. In Network and Parallel Computing, 2008. NPC 2008. IFIP International Conference on, pages 168–175, 2008.
[5] L.C.F. Araujo, L.H.R. Sucupira Jr, M.G. Lizarraga, L.L. Ling, and J.B.T. Yabu-Uti. User authentication through typing biometrics features. Signal Processing, IEEE Transactions on, 53(2):851–855, 2005.
[6] A. Azzini and S. Marrara. Impostor Users Discovery Using a Multimodal Biometric Continuous Authentication Fuzzy System. Lecture Notes in Computer Science, 5178:371–378, 2008.
[7] M. Basseville and I.V. Nikiforov. Detection of abrupt changes: theory and application. Prentice Hall, 1993.
[8] Carlos E. Benitez, Maximiliano Bertacchini, and Pablo I. Fierens. User clustering based on keystroke dynamics. In CACIC 2010, October 2010.
[9] F. Bergadano. Identity verification through dynamic keystroke analysis. Intelligent Data Analysis, 7(5):469–496, 2003.
[10] F. Bergadano, D. Gunetti, and C. Picardi. User authentication through keystroke dynamics. ACM Transactions on Information and System Security (TISSEC), 5(4):367–397, 2002.
[11] S. Bleha, C. Slivinsky, and B. Hussien. Computer-access security systems using keystroke dynamics. IEEE Transactions on Pattern Analysis and Machine Intelligence, 12(12):1217–1222, 1990.
[12] M. Brown and S.J. Rogers. User identification via keystroke characteristics of typed names using neural networks. International Journal of Man-Machine Studies, 39(6):999–1014, 1993.
[13] S. Budalakoti, A. Srivastava, R. Akella, and E. Turkov. Anomaly detection in large sets of high-dimensional symbol sequences. NASA Ames Research Center, Tech. Rep. NASA TM-2006-214553, 2006.
[14] V. Chandola, A. Banerjee, and V. Kumar. Anomaly detection: A survey. ACM Computing Surveys (CSUR), 41(3):15, 2009.
[15] S. Cho, C. Han, D.H. Han, and H.I. Kim. Web-based keystroke dynamics identity verification using neural network. Journal of Organizational Computing and Electronic Commerce, 10(4):295–307, 2000.
[16] L. Coetzee and E.C. Botha. Fingerprint recognition in low quality images. Pattern Recognition, 26(10):1441–1460, 1993.
[17] E. Council. Federal financial institutions examination council. Stat, 2160:2222, 1994.
[18] M. Dash, K. Choi, P. Scheuermann, and H. Liu. Feature selection for clustering - a filter solution. In Data Mining, 2002. ICDM 2002. Proceedings. 2002 IEEE International Conference on, pages 115–122. IEEE, 2002.
[19] H. Debar, M. Becker, and D. Siboni. A neural network component for an intrusion detection system. In Research in Security and Privacy, 1992. Proceedings., 1992 IEEE Computer Society Symposium on, pages 240–250. IEEE, 1992.
[20] D.E. Denning. An intrusion-detection model. IEEE Transactions on Software Engineering, pages 222–232, 1987.
[21] P. Dowland, S. Furnell, and M. Papadaki. Keystroke analysis as a method of advanced user authentication and response. Security in the Information Society: Visions and Perspectives, page 215, 2002.
[22] P. Dowland, H. Singh, and S. Furnell. A preliminary investigation of user authentication using continuous keystroke analysis. In Proceedings of the IFIP 8th Annual Working Conference on Information Security Management & Small Systems Security, Las Vegas, pages 27–28, 2001.
[23] P.S. Dowland and S.M. Furnell. A long-term trial of keystroke profiling using digraph, trigraph and keyword latencies. In Security and Protection in Information Processing Systems: IFIP 18th World Computer Congress: TC11 19th International Information Security Conference, 22-27 August 2004, Toulouse, France, page 275. Kluwer Academic Pub, 2004.
[24] T. Fawcett and F. Provost. Activity monitoring: Noticing interesting changes in behavior. In Proceedings of the Fifth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 53–62. ACM New York, NY, USA, 1999.
[25] H.H. Feng, O.M. Kolesnikov, P. Fogla, W. Lee, and W. Gong. Anomaly Detection Using Call Stack Information. In 2003 Symposium on Security and Privacy: Proceedings: 11-14 May, 2003, Berkeley, California, page 62. IEEE, 2003.
[26] E. Flior and K. Kowalski. Continuous biometric user authentication in online examinations. In Information Technology: New Generations (ITNG), 2010 Seventh International Conference on, pages 488–492. IEEE, 2010.
[27] R. Fujimaki, T. Yairi, and K. Machida. An approach to spacecraft anomaly detection problem using kernel feature space. In Proceedings of the Eleventh ACM SIGKDD International Conference on Knowledge Discovery in Data Mining, page 410. ACM, 2005.
[28] S.M. Furnell, J.P. Morrissey, P.W. Sanders, and C.T. Stockel. Applications of keystroke analysis for improved login security and continuous user authentication. In Information Systems Security, pages 283–294. Chapman & Hall, Ltd., 1996.
[29] R.S. Gaines, W. Lisowski, S.J. Press, and N. Shapiro. Authentication by keystroke timing: Some preliminary results. Rand, 1980.
[30] D.R. Gentner, J.T. Grudin, S. Larochelle, D.A. Norman, and D.E. Rumelhart. A glossary of terms including a classification of typing errors. Cognitive Aspects of Skilled Typewriting, pages 39–43, 1983.
[31] L.A. Gordon, M.P. Loeb, W. Lucyshyn, and R. Richardson. CSI/FBI Computer crime and security survey 2006. Computer Security Institute publications, 2006.
[32] D. Gunetti and C. Picardi. Keystroke analysis of free text. ACM Transactions on Information and System Security (TISSEC), 8(3):312–347, 2005.
[33] F. Gustafsson. The marginalized likelihood ratio test for detecting abrupt changes. IEEE Transactions on Automatic Control, 41(1):66–78, 1996.
[34] F. Gustafsson. Adaptive filtering and change detection. John Wiley & Sons Inc, 2000.
[35] J. Han, M. Kamber, and J. Pei. Data mining: concepts and techniques. Morgan Kaufmann, 2011.
[36] D.J. Hand, H. Mannila, and P. Smyth. Principles of data mining. The MIT Press, 2001.
[37] Z. He, X. Xu, and S. Deng. Discovering cluster-based local outliers. Pattern Recognition Letters, 24(9-10):1641–1650, 2003.
[38] K. Hempstalk. Continuous Typist Verification using Machine Learning. PhD thesis, The University of Waikato, 2009.
[39] S. Hocquet, J. Ramel, and H. Cardot. User Classification for Keystroke Dynamics Authentication. Lecture Notes in Computer Science, 4642:531, 2007.
[40] J. Hu, D. Gingrich, and A. Sentosa. A k-nearest neighbor approach for user authentication through biometric keystroke dynamics. In Proceedings of the IEEE International Conference on Communications, pages 1556–1560, 2008.
[41] S. Huopio. Biometric identification. Eye, 3:1, 1988.
[42] K. Ilgun, R.A. Kemmerer, and P.A. Porras. State transition analysis: A rule-based intrusion detection approach. Software Engineering, IEEE Transactions on, 21(3):181–199, 1995.
[43] A.K. Jain and A. Ross. An overview of biometrics.
[44] R. Janakiraman and T. Sim. Keystroke Dynamics in a General Setting. Lecture Notes in Computer Science, 4642:584, 2007.
[45] H. Jiawei and M. Kamber. Data mining: concepts and techniques. San Francisco, CA, itd: Morgan Kaufmann, 5, 2001.
[46] R. Joyce and G.K. Gupta. User authorization based on keystroke latencies. Dept. of Computer Science, James Cook University of North Queensland, 1989.
[47] Rick Joyce and Gopal K. Gupta. Identity authentication based on keystroke latencies. Commun. ACM, 33(2):168–176, 1990.
[48] P. Kang, S. Hwang, and S. Cho. Continual retraining of keystroke dynamics based authenticator. Lecture Notes in Computer Science, 4642:1203, 2007.
[49] L. Kaufman, P.J. Rousseeuw, et al. Finding groups in data: an introduction to cluster analysis, volume 39. Wiley Online Library, 1990.
[50] Y. Kawahara and M. Sugiyama. Change-point detection in time-series data by direct density-ratio estimation. In Proceedings of 2009 SIAM International Conference on Data Mining (SDM2009), pages 389–400, 2009.
[51] Y. Ke, R. Sukthankar, and M. Hebert. Event detection in crowded videos. In IEEE International Conference on Computer Vision, volume 23, pages 38–41. Citeseer, 2007.
[52] K.S. Killourhy and R.A. Maxion. Comparing Anomaly-Detection Algorithms for Keystroke Dynamics. In IEEE/IFIP International Conference on Dependable Systems & Networks, 2009. DSN '09, pages 125–134, 2009.
[53] K.S. Killourhy and R.A. Maxion. Should security researchers experiment more and draw more inferences? Technical report, Carnegie Mellon University, Pittsburgh, PA, Dept. of Computer Science, 2011.
[54] Y.S. Kim, W.N. Street, and F. Menczer. Feature selection in unsupervised learning via evolutionary search. In Proceedings of the Sixth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 365–369. ACM, 2000.
[55] G. Kwang, R.H. Yap, T. Sim, and R. Ramnath. An Usability Study of Continuous Biometrics Authentication. In Proceedings of the Third International Conference on Advances in Biometrics, pages 828–837. Springer-Verlag, 2009.
[56] L. Ladha and T. Deepa. Feature selection methods and algorithms. International Journal, 3, 2011.
[57] J. Leggett and G. Williams. Verifying identity via keyboard characteristics. Int. J. Man-Machine Studies, 28(1):67–76, 1988.
[58] J. Liu, F.R. Yu, C.H. Lung, and H. Tang. A Framework of Combining Intrusion Detection and Continuous Authentication in Mobile Ad Hoc Networks. In IEEE International Conference on Communications, 2008. ICC '08, pages 1515–1519, 2008.
[59] T.F. Lunt. Automated audit trail analysis and intrusion detection: A survey. In Proceedings of the 11th National Computer Security Conference. Citeseer, 1988.
[60] C.D. Manning, P. Raghavan, and H. Schutze. Introduction to information retrieval, volume 1. Cambridge University Press, Cambridge, 2008.
[61] J. Matas, M. Hamouz, K. Jonsson, J. Kittler, Y. Li, C. Kotropoulos, A. Tefas, I. Pitas, T. Tan, H. Yan, et al. Comparison of face verification results on the XM2VTS database. In Pattern Recognition, 2000. Proceedings. 15th International Conference on, volume 4, pages 858–863. IEEE, 2000.
[62] J. McHugh. Intrusion and intrusion detection. International Journal of Information Security, 1(1):14–35, 2001.
[63] R.A.F. Mini, A.A.F. Loureiro, and B. Nath. The distinctive design characteristic of a wireless sensor network: the energy map. Computer Communications, 27(10):935–945, 2004.
[64] F. Monrose and A. Rubin. Authentication via keystroke dynamics. In Proceedings of the 4th ACM Conference on Computer and Communications Security, pages 48–56. ACM New York, NY, USA, 1997.
[65] F. Monrose and A.D. Rubin. Keystroke dynamics as a biometric for authentication. Future Generation Computer Systems, 16(4):351–359, 2000.
[66] U. Murad and G. Pinkas. Unsupervised Profiling for Identifying Superimposed Fraud. In Proceedings of the Third European Conference on Principles of Data Mining and Knowledge Discovery, page 261. Springer-Verlag, 1999.
[67] L. Nanni and A. Lumini. A supervised method to discriminate between impostors and genuine in biometry. Expert Systems With Applications, 36(7):10401–10407, 2009.
[68] M. Nisenson, I. Yariv, R. El-Yaniv, and R. Meir. Towards behaviometric security systems: Learning to identify a typist. Knowledge Discovery in Databases: PKDD 2003, pages 363–374, 2003.
[69] M.S. Obaidat and B. Sadoun. Simulation evaluation study of neural network techniques to computer user identification. Information Sciences, 102(1):239–258, 1997.
[70] M.S. Obaidat and B. Sadoun. Verification of computer users using keystroke dynamics. IEEE Transactions on Systems, Man, and Cybernetics, Part B, 27(2):261–269, 1997.
[71] L. O'Gorman. Comparing passwords, tokens, and biometrics for user authentication. Proceedings of the IEEE, 91(12):2021–2040, 2003.
[72] M.E. Otey, A. Ghoting, and S. Parthasarathy. Fast distributed outlier detection in mixed-attribute data sets. Data Mining and Knowledge Discovery, 12(2):203–228, 2006.
[73] N. Otsu. A threshold selection method from gray-level histograms. Automatica, 11:285–296, 1975.
[74] V.P. Paun. Fractal surface analysis of Zircaloy-4 SEM micrographs using the time-series method. Central European Journal of Physics, 7(2):264–269, 2009.
[75] A. Peacock, X. Ke, and M. Wilkerson. Typing patterns: A key to user identification. IEEE Security and Privacy, pages 40–47, 2004.
[76] C. Phua, D. Alahakoon, and V. Lee. Minority report in fraud detection: classification of skewed data. ACM SIGKDD Explorations Newsletter, 6(1):50–59, 2004.
[77] D. Polemi. Biometric techniques: review and evaluation of biometric techniques for identification and authentication, including an appraisal of the areas where they are most applicable. Institute of Communication and Computer Systems, National Technical University of Athens, pages 3–5, 1997.
[78] P.A. Porras and A. Valdes. Live traffic analysis of TCP/IP gateways. In Networks and Distributed Systems Security Symposium, 1998.
[79] M. Pusara. An Examination of User Behavior for Re-authentication. PhD thesis, Center for Education and Research in Information Assurance and Security, Purdue University (August 2007).
[80] M. Pusara and C.E. Brodley. User re-authentication via mouse movements. In Proceedings of the 2004 ACM Workshop on Visualization and Data Mining for Computer Security, pages 1–8. ACM New York, NY, USA, 2004.
[81] G. Qian, S. Sural, Y. Gu, and S. Pramanik. Similarity between Euclidean and cosine angle distance for nearest neighbor queries. In Proceedings of the 2004 ACM Symposium on Applied Computing, pages 1232–1237. ACM, 2004.
[82] W.U. Qing-tao and S. Zhi-qing. Detecting DDoS attacks against web server using time series analysis. Wuhan University Journal of Natural Sciences, 11(1):175–180, 2006.
[83] M. Ramadas, S. Ostermann, and B. Tjaden. Detecting anomalous network traffic with self-organizing maps. Lecture Notes in Computer Science, pages 36–54, 2003.
[84] A. Rapaka, A. Novokhodko, and D. Wunsch. Intrusion detection using radial basis function network on sequences of system calls. In Neural Networks, 2003. Proceedings of the International Joint Conference on, volume 3, pages 1820–1825. IEEE, 2003.
[85] R. Rawat, R. Nayak, Y. Li, and S. Alsaleh. Aggregate distance based clustering using Fibonacci series-FIBCLUS. Web Technologies and Applications, pages 29–40, 2011.
[86] K. Revett, F. Gorunescu, M. Gorunescu, M. Ene, S. Magalhaes, and H. Santos. A machine learning approach to keystroke dynamics based user authentication. International Journal of Electronic Security and Digital Forensics, 1(1):55–70, 2007.
[87] A. Ross and A. Jain. Biometric sensor interoperability: A case study in fingerprints. Lecture Notes in Computer Science, pages 134–145, 2004.
[88] Y. Sang, H. Shen, and P. Fan. Novel Impostors Detection in Keystroke Dynamics by Support Vector Machine. In Proc. of the 5th International Conference on Parallel and Distributed Computing, Applications and Technologies (PDCAT 2004). Springer.
[89] T. Scheidat, C. Vielhauer, and J. Dittman. Handwriting verification - comparison of a multi-algorithmic and a multi-semantic approach. Image and Vision Computing, 27(3):269–278, 2009.
[90] K. Sequeira and M. Zaki. ADMIT: anomaly-based data mining for intrusions. In Proceedings of the Eighth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 386–395. ACM, 2002.
[91] S.J. Shepherd. Continuous authentication by analysis of keyboard typing characteristics. In Security and Detection, 1995., European Convention on, pages 111–114, 1995.
[92] Y. Shi and L. Cao. User identity verification based on recognition of typing style. Computer Engineering, 6, 2005.
[93] T. Shimshon, R. Moskovitch, L. Rokach, and Y. Elovici. Clustering digraphs for continuously verifying users according to their typing patterns. In Electrical and Electronics Engineers in Israel (IEEEI), 2010 IEEE 26th Convention of, pages 000445–000449. IEEE.
[94] T. Shimshon, R. Moskovitch, L. Rokach, and Y. Elovici. Continuous verification using keystroke dynamics. In Computational Intelligence and Security (CIS), 2010 International Conference on, pages 411–415. IEEE, 2010.
[95] R.E. Smith. Authentication: from passwords to public keys. Addison-Wesley, 2002.
[96] A. Strehl, J. Ghosh, and R. Mooney. Impact of similarity measures on web-page clustering. In Workshop on Artificial Intelligence for Web Search (AAAI 2000), pages 58–64, 2000.
[97] P.N. Tan, M. Steinbach, V. Kumar, et al. Introduction to data mining. Pearson Addison Wesley, Boston, 2006.
[98] P.S. Teh, A.B.J. Teoh, T.S. Ong, and H.F. Neo. Statistical Fusion Approach on Keystroke Dynamics. In Proceedings of the 2007 Third International IEEE Conference on Signal-Image Technologies and Internet-Based Systems - Volume 00, pages 918–923. IEEE Computer Society, 2007.
[99] U. Uludag and A.K. Jain. Attacks on biometric systems: a case study in fingerprints. In Proc. SPIE-EI, pages 622–633. Citeseer, 2004.
[100] D. Umphress and G. Williams. Identity verification through keyboard characteristics. International Journal of Man-Machine Studies, 23(3):263–273, 1985.
[101] J.D. Woodward Jr, C. Horn, J. Gatune, and A. Thomas. Biometrics: A look at facial recognition. Technical report, DTIC Document, 2003.
[102] Kai Xi, Yan Tang, and Jiankun Hu. Correlation keystroke verification scheme for user access control in cloud computing environment. Comput. J., 54(10):1632–1644, 2011.
[103] R.V. Yampolskiy and V. Govindaraju. Taxonomy of Behavioural Biometrics. Behavioral Biometrics for Human Identification: Intelligent Applications, page 1, 2009.
[104] N. Ye. The handbook of data mining. Lawrence Erlbaum, 2003.
[105] N. Ye and X. Li. A scalable clustering technique for intrusion signature recognition. In Proceedings of 2001 IEEE Workshop on Information Assurance and Security, pages 1–4. Citeseer, 2001.
[106] E. Yu and S. Cho. Novelty detection approach for keystroke dynamics identity verification. Intelligent Data Engineering and Automated Learning, pages 1016–1023, 2003.