Signal Processing
journal homepage: www.elsevier.com/locate/sigpro
Review
Article info
Article history: Received 22 March 2016; received in revised form 17 May 2016; accepted 5 July 2016; available online 14 July 2016.

Abstract
Localization has attracted considerable research effort in the last decade due to the explosion of location-based services (LBS). In particular, wireless fingerprinting localization has received much attention due to its simplicity and compatibility with existing hardware. In this work, we take a closer look at the underlying aspects of wireless fingerprinting localization. First, we review the various methods to create a radiomap. In particular, we look at the traditional fingerprinting method, which is based purely on measurements, the parametric pathloss regression model, and the non-parametric Gaussian Process (GP) regression model. Then, based on these three methods and measurements from a real-world deployment, we examine the aspects that affect the performance of fingerprinting localization, such as the density of access points (APs) and the impact of an outdated signature map. By the end of the paper, readers should have a better understanding of what to expect from fingerprinting localization in a real-world deployment.
© Published by Elsevier B.V.
Keywords:
Fingerprinting localization
Location-based service (LBS)
Received signal strength indicator (RSSI)
Pathloss model
Gaussian Process
Non-parametric model
Machine learning
Contents
1. Introduction
2. Related work
3. Problem statement
3.1. Offline phase
3.2. Online phase
4. Background
4.1. Traditional fingerprinting
4.2. Parametric model: pathloss model
4.3. Non-parametric model: Gaussian process
5. Experiments
5.1. Benchmark experiment
5.2. k-NN for traditional fingerprinting
5.3. Combining RSSI measurements from multiple MAC addresses
5.4. Density of training database
5.5. Density of access points
5.6. Density of RSSI signature map
5.7. Impact from outdated signature map
6. Conclusion
References
1. Introduction
* Corresponding author.
E-mail addresses: simon.yiu@nokia.com (S. Yiu), marzieh.dashti@nokia.com (M. Dashti), holger.claussen@nokia.com (H. Claussen), fernando.perez-cruz@nokia.com (F. Perez-Cruz).
http://dx.doi.org/10.1016/j.sigpro.2016.07.005
0165-1684/© Published by Elsevier B.V.
Recent applications in location based services (LBS) have stimulated extensive research on wireless localization [1–4]. Among all the localization technologies, wireless fingerprinting has been proven to be an effective technique due to its simplicity and deployment practicability [5–12]. Wireless fingerprinting localization avoids hardware deployment cost and effort by relying on existing network infrastructure such as WiFi (e.g. IEEE 802.11 [13]) or cellular (e.g. long term evolution (LTE) [14,15]). Fingerprinting localization works in two phases: an offline training phase and an online localization phase. During the training phase, radio frequency (RF) measurements (also known as signatures or fingerprints) at known locations are collected in a database. The fingerprint database is also referred to as the radiomap [16] and we use these terms interchangeably in this work. During the online phase, users determine their location by comparing the real-time RF measurement with the entries in the database. The majority of previous research on fingerprinting localization utilizes received signal strength as the RF measurement due to its availability at both the transmitter and receiver sides. For example, there are billions of WiFi access points (APs) and their coverage is almost universal. At any location in dense urban areas, we can measure the received signal strength indicator (RSSI) from several tens (or hundreds) of them, which can be acquired easily by any Android device together with a position reference signal (PRS) from cellular communication standards.
To paraphrase Churchill (House of Commons, November 11, 1947), "Many forms of [localization] have been tried, and will be tried in this world of sin and woe. No one pretends that [RSSI] is perfect or all wise. Indeed, it has been said that [RSSI] is the worst form of [localization] except for all those other forms that have been tried from time to time." RSSI-based localization has many limitations, such as the received power's heavy dependency on the environment, the chipset, the antenna, and the orientation of the device [17]. But RSSI does not have a showstopper, as time of arrival or angle of arrival seem to have due to the need for stringent synchronization or multiple antennas and the strong effect of multipath biases, which significantly impedes the use of those technologies [18]. RSSI can be assisted with accelerometers, gyroscopes, magnetometers, barometers or Bluetooth beacons to become more accurate [19]. It can also make use of blueprints or maps and nonlinear tracking to remove outliers [16]. However, in order to assist RSSI localization, we first need a baseline probabilistic RSSI-only localization algorithm.
Although fingerprinting localization is one of the most exploited techniques in localization, there remain many unsolved research problems. For example, previous research in the literature has reported localization errors ranging from 3 m to 10 m for fingerprinting localization using WiFi RSSI [5–12], and errors in iOS or Android wide area localization are even larger. However, it is unclear what contributes to the performance discrepancy. In this paper, we attempt to provide a comprehensive review of fingerprinting localization and answer these open questions.
We consider three methods of generating the radiomap in this work. We begin by reviewing the simplest form of fingerprinting localization, which relies only on measurements to create the radiomap. This method is referred to as traditional fingerprinting. In this method, users are localized by comparing the real-time measurement with the entries in the radiomap using a k-nearest neighbor (k-NN) algorithm. Next, we consider the scenario where a parametric pathloss regression method is used to help generate the radiomap. Finally, we consider the non-parametric GP [20] regression method to help generate the radiomap. The parametric and non-parametric regression methods are useful in large areas where it is not practical to do an exhaustive measurement campaign due to time and labor constraints. We would like to find out how the three aforementioned methods compare in terms of localization performance in a real-world deployment.
In our experiments, which are based on a real-world deployment, we look at several elements which may potentially affect the
base is used for training and online testing, an unrealistic setting in which the independent and identically distributed (i.i.d.) assumption holds. The performance degrades as the density of the available database decreases. The performance also degrades significantly if a different database (e.g., taken at a different time or day, with a different device, or by a different person) is used for testing, because in this case the i.i.d. assumption does not hold.
- The GP and pathloss models are more robust than traditional fingerprinting localization when only partial measurements at a subset of locations are available. Their performance remains relatively stable as the number of training points decreases, until a point at which there is not enough information and the performance degrades rapidly.
- When multiple temporal measurements are available at a location, we could not conclude whether using the mean or the maximum value of the measurements provides the best performance. There are arguments for using either, but we found no empirical evidence favoring one over the other.
- The localization performance decreases with a decreasing number of APs.
- When an AP transmits using multiple MAC addresses, i.e., there are different RSSI entries corresponding to those MAC addresses, it is beneficial to consolidate these RSSI values when the training dataset is small. For a full training dataset, it is better to treat the entries as if they were from different APs.
- For the GP and pathloss regression models, the radiomap can be generated offline a priori. We do not see a degradation in performance until the density of the generated radiomap drops below one fingerprint signature per 3 m², i.e., a fingerprint signature is generated every 3 m² or larger.
- Among the three methods of generating the radiomap, GP is the most robust to a mismatch between the database and the test track, i.e., when online testing is performed at a different time or date or with a different device. Traditional fingerprinting, on the other hand, performs the worst.
The rest of the paper is organized as follows. In Section 2, related work in the literature is presented. The network model under
consideration is presented in Section 3. Background material on
nearest neighbor, the parametric pathloss regression model, and
the non-parametric Gaussian process model are introduced in
Section 4. Experimental results with different parameters are
studied and discussed in Section 5 followed by some conclusions
in Section 6.
2. Related work
Localization techniques can be generally classified into two main categories: infrastructure-free and infrastructure-based approaches. Infrastructure-free approaches focus on leveraging existing infrastructure such as WiFi [21–26], FM, TV [27–29], Global System for Mobile communications (GSM) [30,31], geo-magnetic [32], and sound signals [33] to enable localization. On the other hand, infrastructure-based approaches rely on deploying dedicated RF infrastructure such as RFID [34], infrared [35], ultrasound [36], Bluetooth and/or visible light [37] for localization purposes.
In this work, we consider location fingerprinting (LF), which is an infrastructure-free approach without the requirement of deploying expensive hardware. LF relies on existing RF infrastructure and determines the location of a user equipment (UE) by comparing the UE's real-time RSSI readings against the pre-recorded entries in the radiomap database. The radiomap is constructed in an offline training phase and it contains the RSSI readings from detectable access points (APs) at multiple known locations (reference points (RPs) or calibration points). LF requires an up-to-date radiomap of the area of interest to provide the accuracy that meets the requirements of commercial LBS.
A common practice to construct a radiomap is to manually collect fingerprints at multiple known locations in the entire building. Manual calibration is a labor-intensive, tedious, and time-consuming task, especially when measuring large areas. To reduce the human effort, self-guided robots equipped with inertial measurement unit (IMU) sensors can roam around and explore the space of interest to collect training data [38]. Using robots is currently not an economical approach at a global scale. Radiomaps can also be constructed automatically using crowd-sourcing and machine learning methods [39–42]. Crowd-sourcing relies on volunteers willing to participate in data collection [43]. This approach makes use of random traces of measurements (RF measurements and inertial sensor measurements) collected by volunteers carrying smartphones as they walk around the localization area during their daily routines. Obtaining a large enough number of traces to cover the whole building takes a long period of time (e.g., a week). The crowd-sourcing approach is computationally complex and time consuming, and obtaining high accuracy is challenging. Recently, a new method has been proposed in which an AP acts as a fixed site-survey collector [44]. Every AP scans the RSSI values from other APs. Then the GP method is used to learn the power distribution of the AP's signal over the whole area of interest. This method requires prior knowledge of the positions of all APs, which is often not available. The number of distributed APs should also be sufficient, which might be a limiting factor.
The time and effort required to build the RF signature map during the offline phase have prompted research in simultaneous localization and mapping (SLAM) [45,46]. However, although the effort to build the RF signature map is eliminated, the performance is generally not good enough for most practical indoor applications. In this paper, the effort of building the RF signature map is reduced by modeling the received signal strength as a GP. Previously, GP was used for WiFi position estimation in an industrial environment [47]. It was integrated with the laser-localizer on the hot metal carrier (HMC). The mean function and covariance kernel used are different from the ones assumed in this work. In [48], GP was used to predict WiFi and GSM signal strength for location estimation purposes. A different mean function was assumed and the hyperparameters were estimated by using the conjugate gradient descent method. Bekkali et al. [49] applied GP to an indoor environment. It is unclear what algorithm was used to train the hyperparameters. Also, the mean function was not utilized. In [50], the WiFi-SLAM problem is addressed by using GP latent variable models. An adaptive particle filter was used as the localizer in [51] and the gradient descent algorithm was used for hyperparameter estimation. Similar to [50], a mean function was not employed. In
3. Problem statement
We consider a two-dimensional area where a localization service is of interest.¹ It is assumed that wireless service is provided to the area with a homogeneous wireless technology. In this work, it is assumed that the entire area has WiFi coverage. In particular, it is assumed that the area is served by a sufficient number of APs. There is enough redundancy (several tens of APs) to be able to compensate for the errors and deviations mentioned in the introduction. It should be noted that not all the APs serve the whole area; this condition accounts for dead zones in which APs are not heard, as well as scalability issues when covering a large area. We consider a downlink localization technology based on the downlink signal transmitted from the APs. An AP advertises its service availability by broadcasting its MAC address. It should be noted that some modern APs have the capability to broadcast more than one MAC address in the same or different wireless channels. At the receivers, e.g., mobile phones or tablets, the power of the received RF signal from all APs is measured as RSSI. APs that are far enough away to result in an RF signal below the receive antenna's sensitivity level will not be detected by the receiver.
3.1. Offline phase
It is assumed that the area $\mathcal{A} \subset \mathbb{R}^2$ is discretized into a set of $L$ known locations $\mathcal{X} = \{x_l \mid l = 1, \dots, L\}$, where $x_l$ represents the 2-dimensional (2-D) Cartesian coordinate of location $l$. A radiomap (RM) consisting of RSSI values is collected a priori at these locations offline. Commonly, RSSI is scanned for a certain period of time to record multiple temporal samples from every AP to tolerate some degree of noise. It is assumed that $T$ temporal samples from all $N$ unique MAC addresses are collected for all $L$ locations. The RSSI values are collected in a three-dimensional matrix $D$ with dimension $L \times N \times T$. The RSSI sample collected at location $x_l$ from MAC address $n$ at time index $t$ is denoted as $\rho_n^t(x_l)$, $t = 1, \dots, T$, $l = 1, \dots, L$, and $n = 1, \dots, N$. For locations where certain APs cannot be heard, $\rho_n^t(x_l)$ can be replaced by a constant such as $-110$ dBm, i.e., the device sensitivity level.
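The offline database described above can be sketched as a nested list indexed by location, MAC address, and scan. This is a minimal illustration, not the authors' implementation: the function names, the scan tuple layout `(location index, scan index, MAC, RSSI)`, and the example MAC strings are all assumptions, and the fill value follows the "-110" constant mentioned in the text. The time axis can then be collapsed, for instance with the mean or the maximum.

```python
from statistics import mean

MISSING_DBM = -110.0  # assumed fill value for MAC addresses that are not heard


def build_radiomap(scans, num_locations, mac_addresses, num_scans):
    """Return D[l][n][t]: RSSI at location l, MAC index n, scan index t.

    `scans` is a hypothetical list of (l, t, mac, rssi) tuples.
    Entries that never appear in `scans` keep the fill value.
    """
    radiomap = [[[MISSING_DBM] * num_scans for _ in mac_addresses]
                for _ in range(num_locations)]
    mac_index = {mac: n for n, mac in enumerate(mac_addresses)}
    for l, t, mac, rssi in scans:
        radiomap[l][mac_index[mac]][t] = rssi
    return radiomap


def reduce_over_time(radiomap, reducer=mean):
    """Collapse the time axis, giving one RSSI value per (location, MAC)."""
    return [[reducer(samples) for samples in location] for location in radiomap]


scans = [
    (0, 0, "aa:aa", -50.0), (0, 1, "aa:aa", -54.0),
    (0, 0, "bb:bb", -80.0),
    (1, 1, "aa:aa", -70.0),
]
D = build_radiomap(scans, num_locations=2,
                   mac_addresses=["aa:aa", "bb:bb"], num_scans=2)
mean_map = reduce_over_time(D, mean)
max_map = reduce_over_time(D, max)
```

Note that averaging a heard sample with the fill value pulls the result toward the sensitivity level, whereas the maximum ignores missed scans; which behavior is preferable depends on the deployment.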
3.2. Online phase
During the online phase, a receiver at an unknown location $x$ listens to all the APs in the area and collects the RSSI measurements $r_n^p(x)$ in a two-dimensional database $R$ with dimension
¹ We consider a multistory building as disconnected 2-D spaces and we do not attempt 3D localization on a given floor.
4. Background
The radiomap database can be obtained or generated by different methods. The three methods that are used to generate the fingerprint database are introduced in this section.
Fig. 1. Histogram of RSSI values for an AP 2 m away (x-axis: RSSI (dBm); y-axis: count). The RSSI value is an 8-bit number in which 255 represents 0 dBm and 160 represents -95 dBm.
from the k-NN ($k > 1$) algorithm generally does not coincide with any of the $L$ known locations due to the averaging.

$$x^* = \arg\min_{x_l} \sum_{n=1}^{N} \left(\rho_n(x_l) - r_n(x)\right)^2. \tag{1}$$

$$\min_{x_i, x_j,\; i \neq j} \|x_i - x_j\|_2. \tag{2}$$

For $T > 1$ and $P > 1$, there is more than one temporal measurement for both the database and the online measurement for a given location. In this case, either the mean or the max RSSI values can be used to replace $\rho_n(x_l)$ and $r_n(x)$ in (1):

$$r_n(x) = \operatorname{mean}_p\, r_n^p(x), \qquad \rho_n(x_l) = \operatorname{mean}_t\, \rho_n^t(x_l), \tag{3}$$

$$r_n(x) = \max_p\, r_n^p(x), \qquad \rho_n(x_l) = \max_t\, \rho_n^t(x_l). \tag{4}$$

The rationale behind using the mean value is the assumption that the deviation in RSSI is governed by thermal noise, and hence averaging should provide a more accurate estimate. However, the measured RSSI is in many cases affected by fading and interference, and the change in RSSI might be considerable. For example, if we are physically close to the WiFi APs, we observe drops in the RSSI measurements in the range of 20 dB to 50 dB when the beacon packets collide. In Fig. 1, we show the histogram of the RSSI values for an AP that is 2 m away from the reading device. The mean value would typically be around 200 (-55 dBm), which suggests that the reading device is significantly further away than it actually is.

$$\rho_n^*(z) = C + \alpha \log_{10}\left(\|z - z_n^{AP}\|\right) \tag{5}$$

$$\rho_n^*(z) = C + \alpha \log_{10}\left(\|z - z_n^{AP}\|\right) + \beta \|z - z_n^{AP}\| \tag{6}$$
Eqs. (5) and (6) are referred to as the hyperbolic model and the mixture model in this work, respectively. In the above equations, $z - z_n^{AP}$ represents the relative location of AP $n$ to $z$ in three-dimensional Cartesian coordinates. $\alpha$ is the pathloss exponent (for simplicity, the factor 10 is absorbed into $\alpha$), whereas $C$ is the pathloss at a reference point 1 unit away from $z_n^{AP}$. The only difference between (5) and (6) is that a linear multiplicative term $\beta \|z - z_n^{AP}\|$ is introduced in (6). The term is useful to model linear pathloss due to walls (and other obstacles) commonly found in office space.
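As a rough illustration of the two pathloss models, the sketch below evaluates a log-distance (hyperbolic) prediction and a mixture prediction that adds a term linear in distance. The function names, the symbols `c`, `alpha`, `beta`, and the numeric values are assumptions chosen for the example, not values from the paper; a negative `alpha` makes the predicted RSSI fall with distance.

```python
import math


def distance(z, z_ap):
    """Euclidean distance between a location and an AP position."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(z, z_ap)))


def hyperbolic_rssi(z, z_ap, c, alpha):
    """Log-distance prediction: C + alpha * log10(distance)."""
    return c + alpha * math.log10(distance(z, z_ap))


def mixture_rssi(z, z_ap, c, alpha, beta):
    """Hyperbolic prediction plus a term that is linear in the distance."""
    return hyperbolic_rssi(z, z_ap, c, alpha) + beta * distance(z, z_ap)


ap = (0.0, 0.0, 3.0)  # hypothetical AP position in 3-D
near = hyperbolic_rssi((1.0, 0.0, 3.0), ap, c=-40.0, alpha=-20.0)
far = hyperbolic_rssi((10.0, 0.0, 3.0), ap, c=-40.0, alpha=-20.0)
far_walls = mixture_rssi((10.0, 0.0, 3.0), ap, c=-40.0, alpha=-20.0, beta=-0.5)
```

At 1 unit from the AP the logarithm vanishes and the prediction equals `c`, which matches the role of $C$ as the value at the reference distance. The functions are undefined when the location coincides with the AP position, so a practical version would clip the distance to a small positive floor.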
In order to apply (5) and (6) to make RSSI estimation, one needs to first learn the parameters $C$, $\alpha$, $\beta$, and $z_n^{AP}$. For convenience, all non-zero temporal RSSI measurements at the $L$ training locations are collected in a column vector $y_n$, $y_n = \{\rho_n^t(z_l) \mid \rho_n^t(z_l) \neq 0, \forall l, t\}$, and the corresponding training locations $z_l$ are collected in $v_n$. Applying $v_n$ to (5) and (6), we obtain the estimated RSSI vector $y_n^* = \rho_n^*(v_n)$. It should be noted that the same location $z_l$ can appear in $v_n$ more than once if there are multiple non-zero temporal measurements for that location.
Algorithm 1. Optimization algorithm to learn parameters of the pathloss model.
1: Initialize $t = 1$, $z_n^{AP}$, $\hat{z}_n^{AP}(t) = z_n^{AP}$;
2: Obtain solution for $C$, $\alpha$, and $\beta$ using $y_n$ and $v_n$ assuming $\hat{z}_n^{AP}(t)$ is the location of AP $n$;
3: Compute $y_n^*$ using $\hat{z}_n^{AP}(t)$ and the optimized parameters $C$, $\alpha$, and $\beta$;
4: Compute the standard deviation of the error: $\sigma(t) = \mathrm{std}(y_n^* - y_n)$;
5: Set $t = 2$;
6: while $t <$ MaxGeneration do
7:   $\hat{z}_n^{AP}(t) = \hat{z}_n^{AP}(t-1) + \mathrm{rand}(3, 1)$;
8:   Obtain new solution for $C$, $\alpha$, and $\beta$ using $y_n$ and $v_n$ assuming $\hat{z}_n^{AP}(t)$ is the location of AP $n$;
9:   Compute $y_n^*$ using $\hat{z}_n^{AP}(t)$ and the optimized parameters $C$, $\alpha$, and $\beta$;
10:  Compute the standard deviation of the error: $\sigma(t) = \mathrm{std}(y_n^* - y_n)$;
11:  if the new AP location is rejected then
12:    $\sigma(t) = \sigma(t-1)$;
13:    $\hat{z}_n^{AP}(t) = \hat{z}_n^{AP}(t-1)$;
14:  end if
15:  $t = t + 1$;
16: end while
17: $C_{opt} = C$, $\alpha_{opt} = \alpha$, $\beta_{opt} = \beta$, $z_{n,opt}^{AP} = \sum_t \hat{z}_n^{AP}(t) / \text{MaxGeneration}$
Based on the set of training data $\mathcal{D}_n = \{y_n, v_n\}$, we use the optimization described in Algorithm 1 to learn the parameters for each AP. The rationale of the algorithm is based on the Metropolis-Hastings sampling algorithm [52]. For a given location of the APs, the values of $C$, $\alpha$, and $\beta$ can be computed by least squares, and we update the position of the APs using a random walk. If the new position of the AP provides a lower error, we accept it; if it provides a larger error, we accept it with a probability that is proportional to the ratio of the errors. The Metropolis-Hastings algorithm allows exploring the potential locations of the APs and avoids getting trapped in local minima. In the first step, the time index is initialized to $t = 1$. A random AP location $z_n^{AP}$ is generated and we set $\hat{z}_n^{AP}(t) = z_n^{AP}$. With $\hat{z}_n^{AP}(t)$, the parameters are obtained in step 2, the estimate $y_n^*$ is computed in step 3, and the standard deviation of the error vector $y_n^* - y_n$ is logged in step 4. The time step is incremented to 2 in step 5 and the algorithm enters a while loop in step 6. Within the while loop, a new AP location is generated according to $\hat{z}_n^{AP}(t) = \hat{z}_n^{AP}(t-1) + \mathrm{rand}(3, 1)$ in step 7. Then in steps 8 to 10, the parameters, the estimate, and the error are recomputed for the new location. The process is repeated for a pre-defined number of iterations MaxGeneration. Finally, the optimized parameters are given by $C_{opt} = C$, $\alpha_{opt} = \alpha$, $\beta_{opt} = \beta$, $z_{n,opt}^{AP} = \sum_t \hat{z}_n^{AP}(t) / \text{MaxGeneration}$.
Once the optimized parameters are obtained, (5) and (6) can be used to estimate the received RSSI from AP $n$ at any arbitrary location $z$. The algorithm can be used to generate a radiomap of any resolution when only measurements from a few training locations are available.
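The parameter-learning loop can be sketched as a random walk over candidate AP positions with a least-squares refit at each step. The version below is a deliberately simplified variant and not the authors' algorithm: it fits only the hyperbolic model (two parameters, via closed-form linear regression), uses zero-mean Gaussian steps, accepts a candidate only when the error decreases (greedy acceptance, instead of the probabilistic acceptance described above), and returns the best position rather than an average over the chain. All function names, the step size, and the synthetic data are assumptions.

```python
import math
import random


def distances_to(locations, z_ap, floor=1e-3):
    """Distances from each training location to a candidate AP position."""
    return [max(math.dist(z, z_ap), floor) for z in locations]


def fit_hyperbolic(dists, rssi):
    """Least-squares fit of rssi ~ C + alpha * log10(d).

    Returns (C, alpha, standard deviation of the residuals).
    """
    xs = [math.log10(d) for d in dists]
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(rssi) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, rssi))
    alpha = sxy / sxx if sxx > 0 else 0.0
    c = y_bar - alpha * x_bar
    residuals = [c + alpha * x - y for x, y in zip(xs, rssi)]
    std = math.sqrt(sum(r * r for r in residuals) / n)
    return c, alpha, std


def learn_ap_location(locations, rssi, start, iterations=2000, step=1.0, seed=0):
    """Random-walk search over AP positions with greedy acceptance."""
    rng = random.Random(seed)
    best_ap = tuple(start)
    best_c, best_alpha, best_std = fit_hyperbolic(
        distances_to(locations, best_ap), rssi)
    for _ in range(iterations):
        candidate = tuple(v + rng.gauss(0.0, step) for v in best_ap)
        c, alpha, std = fit_hyperbolic(distances_to(locations, candidate), rssi)
        if std < best_std:  # keep the candidate only if the error drops
            best_ap, best_c, best_alpha, best_std = candidate, c, alpha, std
    return best_ap, best_c, best_alpha, best_std


# Synthetic, noise-free training data generated from a hypothetical AP.
true_ap = (4.0, 6.0, 3.0)
locations = [(float(x), float(y), 0.0)
             for x in range(0, 10, 3) for y in range(0, 10, 3)]
rssi = [-40.0 - 20.0 * math.log10(math.dist(z, true_ap)) for z in locations]

start = (0.0, 0.0, 0.0)
_, _, start_std = fit_hyperbolic(distances_to(locations, start), rssi)
ap, c, alpha, final_std = learn_ap_location(locations, rssi, start)
```

Greedy acceptance can stall in a local minimum, which is the situation the probabilistic acceptance rule in the text is meant to avoid; extending the sketch to the mixture model would require a three-parameter least-squares solve.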
Suppose (5) and (6) are used to generate a radiomap consisting of RSSI predictions at locations $z_q$, $q = 1, \dots, Q$. The following localization algorithm can be used to estimate the location of a user with real-time RSSI measurement $r_n(z)$,

$$\hat{z} = \arg\min_{z_q} \sum_{n=1}^{N} \left(\rho_n^*(z_q) - r_n(z)\right)^2. \tag{7}$$
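The matching rule in (7) is a nearest-neighbor search in RSSI space. A minimal sketch follows; with `k = 1` it returns the radiomap location with the smallest squared RSSI distance, and with `k > 1` it returns the unweighted average of the `k` best locations, as in k-NN for traditional fingerprinting. The data layout, names, and example values are assumptions, and unweighted averaging is only one possible choice.

```python
def squared_rssi_distance(fingerprint, measurement):
    """Sum over MAC addresses of the squared RSSI difference."""
    return sum((f - r) ** 2 for f, r in zip(fingerprint, measurement))


def knn_locate(radiomap, measurement, k=1):
    """Average the locations of the k radiomap entries closest in RSSI space.

    `radiomap` is a list of (location, fingerprint) pairs.
    """
    ranked = sorted(radiomap,
                    key=lambda entry: squared_rssi_distance(entry[1], measurement))
    nearest = [location for location, _ in ranked[:k]]
    dims = len(nearest[0])
    return tuple(sum(loc[d] for loc in nearest) / len(nearest) for d in range(dims))


radiomap = [
    ((0.0, 0.0), [-40.0, -70.0]),
    ((4.0, 0.0), [-55.0, -55.0]),
    ((8.0, 0.0), [-70.0, -40.0]),
]
measurement = [-52.0, -58.0]
one_nn = knn_locate(radiomap, measurement, k=1)
two_nn = knn_locate(radiomap, measurement, k=2)
```

Here the 1-NN estimate is a radiomap location, while the 2-NN estimate falls between two radiomap locations, which illustrates why a k-NN estimate with `k > 1` generally does not coincide with a surveyed point.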
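$$y_n \sim \mathcal{N}\left(\mathbf{0},\; \begin{bmatrix} c(w_n^1, w_n^1) & \cdots & c(w_n^1, w_n^{M_n}) \\ \vdots & \ddots & \vdots \\ c(w_n^{M_n}, w_n^1) & \cdots & c(w_n^{M_n}, w_n^{M_n}) \end{bmatrix} + \sigma_{\epsilon,n}^2 I_{M_n \times M_n}\right). \tag{8}$$

(9)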
was used in this work. $\theta_n = [\sigma_n^2, l_n, \sigma_{\epsilon,n}^2]$ is referred to as the hyperparameters of the GP. They are called hyperparameters because they loosely define the structure of the non-parametric model. Training of the hyperparameters is explained later in this section.
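To make the role of the three hyperparameters concrete, the sketch below builds the covariance matrix of the training measurements, i.e., a kernel matrix plus a noise term on the diagonal. It assumes a squared-exponential kernel, which is a common choice that matches an output variance, a length scale, and a noise variance; the kernel actually used in this work may differ. Function names and numeric values are illustrative assumptions.

```python
import math


def se_kernel(a, b, output_var, length_scale):
    """Squared-exponential kernel (an assumed choice) between two locations."""
    sq_dist = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return output_var * math.exp(-sq_dist / (2.0 * length_scale ** 2))


def training_covariance(locations, output_var, length_scale, noise_var):
    """Kernel matrix over the training locations plus noise on the diagonal."""
    m = len(locations)
    return [
        [se_kernel(locations[i], locations[j], output_var, length_scale)
         + (noise_var if i == j else 0.0)
         for j in range(m)]
        for i in range(m)
    ]


locations = [(0.0, 0.0), (3.0, 4.0), (30.0, 40.0)]
K = training_covariance(locations, output_var=25.0, length_scale=5.0, noise_var=4.0)
```

With a length scale of 5, the two locations 5 units apart remain clearly correlated, while the location 50 units away is effectively uncorrelated with the first.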
Our goal is to use the non-parametric model to predict RSSI measurements at any arbitrary location $x_*$. It should be noted that the marginal distribution of $\rho_n^t(x)$ over the training locations $w_n$ and the test location $x_*$ is a joint multivariate Gaussian distribution. Therefore, by using the rules of conditional probability for Gaussian random variables and assuming knowledge of the trained hyperparameters $\theta_n$, it follows that the estimated RSSI $\rho_n^t(x_*)$ at $x_*$ given prior measurements $y_n$ taken at locations $w_n$ is normally distributed as follows:
should be taken to ensure that the solution is not from bad local maxima [54]. It should be noted that the training data used for training the hyperparameters do not have to be the same as the training data $\mathcal{D}_n$ used for making the RSSI prediction in (10)–(12). As mentioned before, the hyperparameters only loosely define the structure of the RSSI measurement through the prior model. The length scale $l_n$ determines the length of the wiggles in the function. In general, measurements that are taken $l_n$ units away can be considered as uncorrelated. The output variance $\sigma_n^2$ determines the average distance of the random function away from its mean. Since measurements from all APs are available at all locations, it is possible to train the hyperparameters by using all available measurements, i.e., $[y_1; y_2; \dots; y_N]$ and $[w_1; w_2; \dots; w_N]$. In this case, the trained hyperparameters can be used for all APs. During the prediction phase, the training data $\mathcal{D}_n$ from each AP can be used in (10)–(12) to predict the mean and the variance of the RSSI measurement from that AP. Suppose Eqs. (10)–(12) are used to generate the mean and variance of the predicted RSSI at locations $x_q$, $q = 1, \dots, Q$, i.e., a radiomap consisting of mean and variance at $Q$ locations is obtained. The best location estimate given the received RSSI $r_n(x)$ at unknown location $x$ is given by the maximum likelihood decision rule
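$$\hat{x} = \arg\max_{x_q} \prod_{n=1}^{N} p_n\left(r_n(x) \mid \mu_n(x_q), \sigma_n^2(x_q)\right), \tag{13}$$

where

$$p_n\left(r_n(x) \mid \mu_n(x_q), \sigma_n^2(x_q)\right) = \frac{1}{\sqrt{2\pi\sigma_n^2(x_q)}} \exp\left(-\frac{\left(r_n(x) - \mu_n(x_q)\right)^2}{2\sigma_n^2(x_q)}\right). \tag{14}$$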
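The maximum likelihood rule in (13) and (14) can be sketched by summing Gaussian log-densities over the APs, which is equivalent to maximizing the product and avoids numerical underflow when many APs are involved. The radiomap layout (one predicted mean and variance per AP and candidate location), the names, and the values below are assumptions made for illustration.

```python
import math


def gaussian_log_pdf(value, mean, variance):
    """Logarithm of the Gaussian density used in the likelihood."""
    return (-0.5 * math.log(2.0 * math.pi * variance)
            - (value - mean) ** 2 / (2.0 * variance))


def ml_locate(radiomap, measurement):
    """Return the candidate location with the highest total log-likelihood.

    `radiomap` is a list of (location, means, variances) entries, with one
    predicted mean and variance per AP.
    """
    def log_likelihood(entry):
        _, means, variances = entry
        return sum(gaussian_log_pdf(r, m, v)
                   for r, m, v in zip(measurement, means, variances))

    return max(radiomap, key=log_likelihood)[0]


radiomap = [
    ((0.0, 0.0), [-50.0, -70.0], [4.0, 4.0]),
    ((5.0, 0.0), [-60.0, -60.0], [4.0, 100.0]),
]
estimate = ml_locate(radiomap, [-52.0, -66.0])
```

Unlike the squared-distance rule, this rule weights each AP by its predicted variance, so an AP whose prediction is uncertain at a given location contributes less to the decision there.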
5. Experiments
In this section, we present some experiments based on the three different methods to generate the radiomap introduced in Section 4. The test area is a 2500 m² enterprise building, with 27 office cubicles, 16 meeting rooms, and corridors. A radiomap consisting of measurements from $L = 235$ locations is collected. The measurement locations are shown in Fig. 2. To obtain the measurements, the data collector walks through the whole building for less than an hour and collects the data using a Google Nexus tablet and a data collection application (app) developed for this purpose. Each measurement location is geotagged manually by the data collector tapping his/her location on the building map displayed on the tablet's screen. At each measurement location, $T = 5$ consecutive scans were made. It should be noted that a MAC address may appear in one scan but not in another. The database references $N = 174$ different MAC addresses and is referred to as Track 0. As mentioned earlier, some of these MAC addresses belong to the same WiFi router. We discuss how to aggregate these MAC addresses in Section 5.3. To simulate the online phase, 3 different track databases are considered. Tracks 1, 2, and 3 consist of 171, 106, and 29 locations, respectively. $P = 5$ measurements were taken at each location, and all track measurements were taken on different dates. The measurement locations of the radiomap and all three tracks are shown in Fig. 2.

Fig. 2. Measurement locations of the radiomap and three test tracks (axes: x [m], y [m]; legend: Radiomap, Track 1, Track 2, Track 3, 5% of Radiomap). (For interpretation of the references to color in this figure caption, the reader is referred to the web version of this paper.)

Performance metrics: The performance metric used in this work is the localization error, defined as

$$e = \|x - \hat{x}\|_2. \tag{15}$$

We also consider the root-mean-square (RMS) error. The RMS error for an error vector of length $M$ is defined as

$$e_{rms} = \sqrt{\frac{1}{M} \sum_{m=1}^{M} e_m^2}. \tag{16}$$

Fig. 3. Localization error of fingerprinting based on various radiomaps (empirical CDF). Full measurement database. $N = 174$ MAC addresses. Track 1.

Table 1
RMS localization error in meters for all tracks and radiomaps generated by different methods. Full measurement database. $N = 174$ MAC addresses.

          GP      Hyperbolic  Mixture  Traditional: 1-NN
Track 1   6.0957  8.2512      7.9879   7.7251
Track 2   5.4712  6.5876      6.5004   6.1220
Track 3   5.1695  7.0417      6.4434   7.3700
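The metrics in (15) and (16), together with the empirical CDF used in the error plots, can be computed as in the following sketch. The function names and the example coordinates are assumptions made for illustration.

```python
import math


def localization_error(true_xy, estimated_xy):
    """Euclidean distance between the true and estimated positions, as in (15)."""
    return math.dist(true_xy, estimated_xy)


def rms_error(errors):
    """Root-mean-square of a list of localization errors, as in (16)."""
    return math.sqrt(sum(e * e for e in errors) / len(errors))


def empirical_cdf(errors):
    """Sorted (error, fraction of samples with error <= this one) pairs."""
    ordered = sorted(errors)
    n = len(ordered)
    return [(e, (i + 1) / n) for i, e in enumerate(ordered)]


truth = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0)]
estimates = [(3.0, 4.0), (1.0, 1.0), (2.0, 7.0)]
errors = [localization_error(t, e) for t, e in zip(truth, estimates)]
rms = rms_error(errors)
cdf = empirical_cdf(errors)
```

Because the RMS squares each error, a few large outliers can dominate it, which is one reason to inspect the full CDF alongside the single RMS figure.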
Fig. 4. Localization error with 1-NN and 4-NN (empirical CDF). Full measurement database. $N = 174$ MAC addresses. Track 1. 174 APs.
from the same physical AP (and same 2.4 GHz channel) with different MAC addresses. Therefore, each aggregated AP can potentially have more than 5 RSSI measurements due to the combined RSSI values from multiple MAC addresses transmitted from the same AP. This is essentially a method to obtain more temporal measurements without spending time on additional scans. After aggregating the MAC addresses, the length of the training vector $y_j$, $j = 1, \dots, J$, will typically be longer than that of $y_n$, $n = 1, \dots, N$. In general, the localization error is similar whether the MAC addresses are combined or not if full measurements are available for all radiomaps. Table 2 shows the RMS localization error of Tracks 1 and 3 with and without MAC address aggregation for various radiomaps, assuming full measurements are available. As seen in the table, no conclusion can be drawn on whether there is a benefit to aggregating the MAC addresses from the same AP when full measurements are available. However, as we will see in the next subsection, when only partial measurements are available to construct the radiomap, consolidating the MAC addresses improves the localization performance significantly.
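The aggregation step can be sketched as pooling the RSSI samples of all MAC addresses that map to the same physical AP. The MAC-to-AP mapping, the AP labels, and the sample values below are hypothetical; how the mapping is obtained in practice is not covered by this sketch.

```python
from collections import defaultdict


def aggregate_by_ap(samples_by_mac, mac_to_ap):
    """Pool the RSSI samples of MAC addresses that share a physical AP.

    MAC addresses without a known AP are kept under their own key.
    """
    pooled = defaultdict(list)
    for mac, samples in samples_by_mac.items():
        pooled[mac_to_ap.get(mac, mac)].extend(samples)
    return dict(pooled)


samples_by_mac = {
    "00:11:22:33:44:50": [-61.0, -63.0],
    "00:11:22:33:44:51": [-60.0, -62.0, -64.0],
    "66:77:88:99:aa:b0": [-75.0],
}
mac_to_ap = {
    "00:11:22:33:44:50": "AP-1",
    "00:11:22:33:44:51": "AP-1",
    "66:77:88:99:aa:b0": "AP-2",
}
pooled = aggregate_by_ap(samples_by_mac, mac_to_ap)
```

After pooling, the first AP has five samples from a location where each of its MAC addresses alone had at most three, which is the "more temporal measurements without additional scans" effect described above.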
The major drawback of traditional fingerprinting is the requirement of acquiring (and maintaining) the radiomap database. This is a labor-intensive and time-consuming effort. On the other hand, when only a subset of the area is fingerprinted to reduce this effort, it is expected that the localization error increases. This is investigated in Fig. 5, where only 5% (12 locations) of the full database (Track 0) is assumed to be available. The 12 locations, represented by the filled red circles in Fig. 2, were selected heuristically such that the mutual minimum distance among all
Table 2
RMS localization error in meters of Track 1 and Track 3 with and without MAC address aggregation. Full measurement database.

                               GP      Hyperbolic  Mixture  Traditional
Track 1  Without aggregation   6.0957  8.2512      7.9879   7.7251
Track 1  With aggregation      6.5748  7.8301      7.4186   8.0721
Track 3  Without aggregation   5.1695  7.0417      6.4434   6.3570
Track 3  With aggregation      5.5863  6.2945      6.0344   7.2831
[Figure: Empirical CDF. Vertical axis: CDF (0 to 1). Legend: GP; Hyperbolic; Mixture: =40; Traditional: 4NN.]
Table 3
RMS localization error in meters of Track 1 and radiomaps generated by different methods, assuming a given percentage of the full database was available. J = 50 aggregated APs.

Database available   GP       Hyperbolic   Mixture   Traditional: 4-NN
100%                 6.5748   7.8301       7.4186    7.1746
75%                  6.2907   8.4694       7.2697    7.1133
56%                  6.2958   8.5210       7.3984    7.7858
42%                  6.4423   8.4591       7.5715    7.6563
32%                  6.3315   9.3177       7.2636    8.1945
24%                  6.4227   8.1986       7.4166    8.3877
18%                  6.3167   9.2503       7.7663    8.9986
13%                  7.1246   11.9746      8.0909    11.0387
10%                  7.0740   13.2857      11.1986   11.3684
8%                   6.7417   14.8389      8.5420    12.2231
5%                   8.8566   19.0852      14.1696   12.6178
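As an illustration of how an entry of this kind could be computed, the following is a minimal sketch of nearest-neighbor fingerprinting followed by an RMS error calculation. It assumes that the position estimate is the unweighted mean of the positions of the k fingerprints closest to the query in RSSI space, and that the RMS error is the root of the mean squared Euclidean distance between estimated and true positions. The function names and the toy radiomap are hypothetical.

```python
import math


def knn_locate(radiomap, query_rssi, k=4):
    """Estimate a position from the k closest fingerprints in RSSI space.

    radiomap:   list of ((x, y), rssi_vector) pairs (hypothetical layout).
    query_rssi: RSSI vector measured at the unknown position.
    Returns the unweighted mean (x, y) of the k nearest fingerprints.
    """
    ranked = sorted(radiomap, key=lambda fp: math.dist(fp[1], query_rssi))
    nearest = ranked[:k]
    x = sum(fp[0][0] for fp in nearest) / len(nearest)
    y = sum(fp[0][1] for fp in nearest) / len(nearest)
    return (x, y)


def rms_error(estimates, truth):
    """Root of the mean squared Euclidean distance between two point lists."""
    squared = [math.dist(e, t) ** 2 for e, t in zip(estimates, truth)]
    return math.sqrt(sum(squared) / len(squared))


# Toy radiomap: five fingerprints, two APs, RSSI in dBm (made-up values).
radiomap = [
    ((0.0, 0.0), (-40.0, -70.0)),
    ((10.0, 0.0), (-50.0, -60.0)),
    ((0.0, 10.0), (-60.0, -50.0)),
    ((10.0, 10.0), (-70.0, -40.0)),
    ((50.0, 50.0), (-95.0, -95.0)),
]
estimate = knn_locate(radiomap, (-55.0, -55.0), k=4)
error = rms_error([estimate], [(5.0, 8.0)])
```

With the toy data, the four nearest fingerprints are the corners of the 10 m square, so the estimate is its center; the distant fifth fingerprint is excluded.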
Table 4
RMS localization error in meters for all tracks and radiomaps generated by GP. 5% of
database was assumed to be available. With and without MAC addresses
aggregation.
With aggregation
Without aggregation
Track 1
Track 2
Track 3
8.8566
10.6867
7.9087
11.595
9.6042
11.4876
[Figure: horizontal axis: AP (10 to 50); vertical axis ticks from 0 to 4.5.]
Table 5
RMS localization error in meters of all tracks with 17 APs and 50 APs. Full measurement database.

                   GP       Hyperbolic   Mixture   Traditional: 4-NN
Track 1   17 APs   7.6964   8.7235       8.1284    7.4932
Track 1   50 APs   6.5748   7.8301       7.4186    7.1746
Track 2   17 APs   7.2827   6.9676       6.8025    8.3214
Track 2   50 APs   5.4305   6.4508       6.0703    5.4049
Track 3   17 APs   5.9161   9.1388       8.4710    7.3815
Track 3   50 APs   5.5863   6.2945       6.0344    6.0169
Table 6
RMS localization error in meters of Track 3 and radiomaps generated by different methods at different resolutions. Full measurement database. N = 174 MAC addresses.

Resolution   GP        Hyperbolic   Mixture   Traditional: 4-NN
1            5.1695    7.0417       6.4434    6.357
2            5.1327    7.1535       6.5285    6.357
4            5.0923    7.6834       6.2146    6.357
8            6.7262    7.6474       7.3532    6.357
16           7.9307    7.0466       7.0466    6.357
32           20.504    22.1593      21.1399   6.357
6. Conclusion
This paper provides a comprehensive review of RSSI localization. RSSI localization is an important topic, as it might become the
de facto form of localization. This matters all the more because most of
the literature concentrates on how to augment RSSI localization
with other sensors or tracking, while the RSSI mapping decisions are
not actually explained. On the other hand, if RSSI is to be useful, it
has to work on its own as well, because there are going to be many
devices that can only work with the simplest measurements and
without further assistance.
In this paper, we have shown how RSSI localization works at its
simplest. We have also concentrated on the small decisions that
are typically neglected: what can be done with repeated
Table 7
RMS localization error in meters of Track 0 and Track 1 and radiomaps generated by different methods based on Track 0. J = 50 APs.

Radiomap         Track            GP       Hyperbolic   Mixture   Traditional
Track 0          Track 0          4.0106   6.3514       5.5994    0
Track 0          Track 1          6.5748   7.8301       7.4186    8.0721
56% of Track 0   44% of Track 0   5.7115   7.6805       6.7104    8.3187
measurements, how sparse the radiomap can be, which interpolation techniques make more sense, and even how to treat the sensitivity of the device. Finally, we have examined how the different interpolation
techniques compare with each other and why we believe GP presents
the best trade-off between accuracy and ease of fingerprinting. To
make RSSI localization deployable in practice, it will have to rely
on a sparse set of measurements.
References
[1] C. Papamanthou, R.P. Preparata, R. Tamassia, Algorithms for location estimation based on rssi sampling, in: Proceedings of Algorithmic Aspects of Wireless
Sensor Networks, Fourth International Workshop, ALGOSENSORS 2008, Reykjavik, Iceland, July 2008, pp. 7286.
[2] Z. Farid, R. Nordin, M. Ismail, Recent advances in wireless indoor localization
techniques and system, J. Comput. Netw. Commun. 2013 (2013) 112.
[3] H. Liu, H. Darabi, P. Banerjee, J. Liu, Survey of wireless indoor positioning
techniques and systems, IEEE Trans. Syst., Man, Cybern. Part C: Appl. Rev. 2007
(2007) 10671080.
[4] A.R. Kulaib, R.M. Shubair, M.A. Al-Qutayri, J.W.P. Ng, An overview of localization techniques for wireless sensor networks, in: Proceedings of 2011 International Conference on Innovations in Information Technology (IIT), Abu
Dhabi, April 2011, pp. 167172.
[5] P. Bahl, V.N. Padmanabhan, Radar: an in-building rf-based user location and
tracking system, in: Proceedings of Nineteenth Annual Joint Conference of the
IEEE Computer and Communications Societies, INFOCOM, vol. 2, 2000, pp.
775784.
[6] Martin Klepal, Stphane Beauregard, et al., A novel backtracking particle lter
for pattern matching indoor localization, in: Proceedings of the First ACM
International Workshop on Mobile Entity Localization and Tracking in GPSless Environments, ACM, San Francisco, USA, 2008, pp. 7984.
[7] A.M. Ladd, K.E. Bekris, A. Rudys, L.E. Kavraki, D.S. Wallach, Robotics-based
location sensing using wireless ethernet, Wirel. Netw. 11 (12) (2005)
189204.
[8] M.A. Youssef, A. Agrawala, A. Udaya Shankar, Wlan location determination via
clustering and probability distributions, in: Proceedings of the First IEEE International Conference on Pervasive Computing and Communications (PerCom), 2003, pp. 143150.
[9] Veljo Otsason, Alex Varshavsky, Anthony LaMarca, Eyal De Lara, Accurate gsm
indoor localization, in: Ubiquitous Computing (UbiComp), Springer, Tokyo,
Japan, 2005, pp. 141158.
[10] Alex Varshavsky, Eyal de Lara, Jeffrey Hightower, Anthony LaMarca,
Veljo Otsason, Gsm indoor localization, Pervasive Mob. Comput. 3 (6) (2007)
698720.
[11] Andrew M. Ladd, Kostas E. Bekris, Algis P. Rudys, Dan S. Wallach, Lydia
E. Kavraki, On the feasibility of using wireless ethernet for indoor localization,
IEEE Trans. Robot. Autom. 20 (3) (2004) 555559.
[12] Chen Feng, Wain Sy Anthea Au, Shahrokh Valaee, Zhenhui Tan, Receivedsignal-strength-based indoor positioning using compressive sensing, IEEE
Trans. Mob. Comput. 11 (12) (2012) 19831993.
[13] A. Petrick, B. O'Hara, IEEE 802.11 Handbook: A Designer's Companion, Standards Information Network IEEE Press, New York, NY, USA, 2005.
[14] S. Sesia, I. Touk, M. Baker, LTEThe UMTS Long Term Evolution: From Theory
to Practice, 2nd edition, Wiley, West Sussex, UK, 2011.
[15] E. Dahlman, S. Parkvall, J. Skold, 4G: LTE/LTE-Advanced for Mobile Broadband,
2nd edition, Academic Press, Waltham, MA, USA, 2014.
[16] M. Dashti, S. Yiu, S. Youse, F. Perez-Cruz, H. Claussen, RSSI localization with
Gaussian processes and tracking, in: Proceedings of the IEEE Global Telecommunications Conference (GLOBECOM), San Diego, CA, December 2015.
[17] E. Martin, O. Vinyals, G. Friedland, R. Bajcsy, Precise indoor localization using
smart phones, in: Proceedings of the 18th ACM International Conference on
Multimedia (MM'10), New York, NY, October 2010, pp. 787790.
[18] J.-R. Jiang, C.-M. Lin, F.-Y. Lin, S.-T. Huang, ALRD: AoA localization with RSSI
differences of directional antennas for wireless sensor networks, Int. J. Distrib.
Sens. Netw. 2013 (2013) 111.
[19] Henri Nurminen, Anssi Ristimaki, Simo Ali-Loytty, Robert Pich, Particle lter
and smoother for indoor localization, in: Proceedings of International Conference on Indoor Positioning and Indoor Navigation (IPIN), 2013, pp. 110.
[20] F. Perez-Cruz, S.V. Vaerenbergh, J.J. Murillo-Fuentes, M. Lazaro-Gredilla,
I. Santamaria, Gaussian processes for nonlinear signal processing, IEEE Signal
Process. Mag. 30 (June (4)) (2013) 4050.
[21] A. Haeberlen, E. Flannery, A.M. Ladd, A. Rudys, D.S. Wallach, L.E. Kavraki,
Practical robust localization over large-scale 802.11 wireless networks, in:
Proceedings of MobiCom, 2004.
[22] S. Sen, B. Radunovic, R. R. Choudhury, T. Minka, Precise indoor localization
using phy layer information, in: Proceedings of HotNets, 2011.
[23] S. Sen, B. Radunovic, R.R. Choudhury, T. Minka, You are facing the Mona Lisa:
spot localization using phy layer information, in: Proceedings of the 10th International Conference on Mobile Systems, Applications, and Services, MobiSys '12 2012.
[24] H. Wang, S. Sen, A. Elgohary, M. Farid, M. Youssef, R.R. Choudhury, No need to
war-drive: unsupervised indoor localization, in: Proceedings of the 10th International Conference on Mobile Systems, Applications, and Services,