
Indoor Geolocation

Anushka Vazirani
10/02/2017

Introduction

The use of the Global Positioning System (GPS) has increased drastically in the past decade. People use it
not only to navigate from place to place but also for tracking, both indoors and outdoors. Indoor
geolocation can have a widespread impact on most businesses. As technology advances, high-tech devices are
being designed to make everyday operations easier; however, these tools can be too expensive to purchase in
ample supply. With few devices available, locating them quickly and accurately within a building can greatly
improve the efficiency of business operations. Indoor geolocation of devices is notoriously ineffective, which
can reduce the efficiency of businesses that rely on devices requiring location detection. Alongside GPS, the
use of Wi-Fi has grown rapidly. Since it is relatively hard to improve the GPS functionality of a device,
Wi-Fi signals could be used in lieu of GPS to locate devices in indoor settings. A device emits wireless
signals that are picked up, from some distance away, by a receiver called an access point. The larger the
building, the more access points are needed to catch signals and determine the location of a device
accurately. Using the signal strengths from various access points within a building and linear regression, a
prediction model can be created to pinpoint the location of a device within the building. However, the
accuracy of this prediction is inherently worse for weaker signal strengths, because the access points cannot
detect signals from locations that are too far away.

Data Description

Signal strengths at five different access points are collected for 254 devices at known locations in a building.
Using the location of the access points and the devices, the distance between each device and each access
point is calculated. The mapping of the access points and devices is shown below.
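The distance calculation described above can be sketched as follows. This is a minimal illustration, not the code used in the original analysis: the coordinates, the dictionary names, and the helper `distances_to_access_points` are all hypothetical, since the report does not list the actual positions.

```python
import math

# Hypothetical coordinates for illustration only; the report does not list them.
access_points = {"AP1": (0.0, 0.0), "AP2": (200.0, 0.0), "AP5": (100.0, 75.0)}
devices = {"dev_a": (30.0, 40.0), "dev_b": (100.0, 75.0)}


def distances_to_access_points(device_xy, access_points):
    """Return {access point name: Euclidean distance} for one device."""
    return {
        name: math.dist(device_xy, ap_xy)
        for name, ap_xy in access_points.items()
    }


# One row of distances per device, one entry per access point.
distance_table = {
    name: distances_to_access_points(xy, access_points)
    for name, xy in devices.items()
}

for name, row in distance_table.items():
    print(name, {ap: round(d, 1) for ap, d in row.items()})
```

With the real data, the table would have 254 rows and five distances per row.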
Intuitively, signal strengths received at the access points are stronger for devices that are closer to the access
points. We examine the relationship between distance of the device from an access point and the signal
strength to gain more insight into this relationship.
As visible in the above plots, the devices closest to the access points are emitting the strongest signals, while
the ones farthest away are emitting the weakest ones. Next we observe the quantitative relationship between
the two, aiming to see if there is a linear relationship present, or a transformation that creates a linear
relationship.
The plot above shows the relationship between distance and signal strength for only one access point, but
all five access points showed a similar trend: a log transformation of the signal strengths produces a more
linear relationship with distance. Signal strength and distance are generally known to follow a log
relationship, and the transformation's validity holds with this data.
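As a small sketch, the transformation can be written as a single function. The natural logarithm is an assumption here (the report does not state the base), and the name `transform_signal` is illustrative.

```python
import math

OFFSET = 100  # added before taking the log so that 100 + SS stays positive


def transform_signal(signal_strength):
    """Return log(100 + SS). The natural log is an assumption of this sketch."""
    shifted = OFFSET + signal_strength
    if shifted <= 0:
        raise ValueError("signal strength must be greater than -100")
    return math.log(shifted)


# Stronger signals (closer to 0) map to larger transformed values.
for ss in (-92, -70, -40):
    print(ss, round(transform_signal(ss), 3))
```

Because the offset keeps the ordering of the raw values, a stronger signal still corresponds to a larger transformed value.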
In addition to validating the logarithmic relationship between distance and signal strength, it is important to
confirm that the strongest signal is actually received at the closest access point. To determine this,
the closest access point for each device was found using the previously calculated distances, and the access
point with the strongest signal was found from the given signal strengths. Comparing the two
columns, 220 out of 254 devices (about 87%) had their strongest signal at the
closest access point. This check was made to ensure that the signals and the wireless receivers at the access
points behave according to our assumptions.
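The comparison of the two columns can be sketched like this. The two example rows are made up for illustration, and `closest_matches_strongest` is a hypothetical helper name; only the 220-of-254 figure comes from the report.

```python
def closest_matches_strongest(distances, signals):
    """True if the nearest access point also has the strongest signal.

    Both lists are ordered by access point (index 0 is access point 1).
    A stronger signal is one closer to 0, so the strongest is the maximum.
    """
    closest = min(range(len(distances)), key=distances.__getitem__)
    strongest = max(range(len(signals)), key=signals.__getitem__)
    return closest == strongest


# Hypothetical rows: (distances to the five access points, signal strengths).
rows = [
    ([10.0, 120.0, 150.0, 90.0, 60.0], [-45, -80, -88, -75, -66]),  # agree
    ([70.0, 65.0, 140.0, 150.0, 80.0], [-60, -68, -85, -90, -72]),  # disagree
]

agreement = sum(closest_matches_strongest(d, s) for d, s in rows) / len(rows)
print(f"agreement on the toy rows: {agreement:.0%}")

# The proportion reported in the text.
print(f"reported: {220 / 254:.1%}")
```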

[Figure 1: scatter plot of device and access point locations; x axis from 0 to 200, y axis from 0 to 150]

Figure 1: Map of access points and device locations - devices are represented as blue dots and access points
as red squares. The access points are strategically placed on each corner and in the middle of the building.
NOTE: The layout of devices suggests that this building has many walls in between the access points, which
could interfere with signal reception.

[Figure 2: five panels, titled "Signal Strength from Access Point 1" through "Signal Strength from Access
Point 5"; each panel has an x axis from 0 to 200 and a y axis from 0 to 150]

Figure 2: Heat map of signal strengths from each access point - the spectrum of signal strengths goes from
blue (strongest signal strength) to green (weakest signal strength).

[Figure 3: two panels plotted against "Distance from Access Point 1" (0 to 200); the first panel shows
"Signal Strength", the second shows "log(100 + Signal Strength)"]

Figure 3: Relationship between distance from access point and signal strength - without transforming the
data, signal strength seems to have a logarithmic relationship with distance. Performing a log transformation
on the signal strengths shows a more linear relationship with distance. The farther away the device from
the access point, the weaker the signal strength. NOTE: A strong signal strength is one closer to 0. 100
was added to the signal strength before performing the log transformation to preserve the direction of the
relationship in both graphs for easier visual comparison.

Using the log-linear relationship between the signal strength and distance, we can try to create a model to
predict the location of a device given the signal strengths at each access point.

Analysis

Two separate models are created to pinpoint the location of the device. One model predicts the x-coordinate
given the log-transformed signal strengths, and the other model similarly predicts the y-coordinate. The two
predictions will be combined to predict the x-y location of the device.
Since we have data that already has the location of the device, the data is split up into a train and test set.
Each device was randomly numbered from 1 to 254 and 50 of the devices were removed to be the test set for
the final model.
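A minimal sketch of such a split is shown below. The seed and the function name `split_devices` are assumptions added for reproducibility; the report only states that 50 of the 254 randomly numbered devices were held out.

```python
import random


def split_devices(n_devices=254, n_test=50, seed=0):
    """Randomly hold out n_test device ids; the rest form the training set.

    The seed is an assumption of this sketch, used only to make the
    split repeatable.
    """
    rng = random.Random(seed)
    ids = list(range(1, n_devices + 1))
    test_ids = set(rng.sample(ids, n_test))
    train_ids = [i for i in ids if i not in test_ids]
    return train_ids, sorted(test_ids)


train_ids, test_ids = split_devices()
print(len(train_ids), "training devices,", len(test_ids), "test devices")
```

Holding the test devices out before any model fitting keeps the later error estimate independent of the training data.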
Initially, the models were fit without the devices that had signal strengths of -90 or lower. Because -92 is
the weakest possible signal strength, it seemed reasonable to assume that such readings would not help
determine a location, since nothing weaker can be detected. However, the models that included those devices
had higher r² values. The models are as follows:

$$x = 0.775 - 36.145\log(100 + SS_1) + 42.398\log(100 + SS_2) + 33.581\log(100 + SS_3) - 16.079\log(100 + SS_4) + 15.962\log(100 + SS_5)$$

$$y = 104.246 - 20.151\log(100 + SS_1) - 33.749\log(100 + SS_2) + 29.596\log(100 + SS_3) + 14.798\log(100 + SS_4) - 1.136\log(100 + SS_5)$$

where $SS_1$ refers to the signal strength from access point one, and so on through $SS_5$.
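A minimal sketch of how the two fitted equations could be applied to a new set of readings. It assumes the natural logarithm and the coefficient signs shown in the code; `predict_location` is an illustrative name, not something from the original analysis.

```python
import math

# Intercept followed by the coefficients for access points 1 to 5.
# The signs and the natural log base are assumptions of this sketch.
X_COEF = (0.775, -36.145, 42.398, 33.581, -16.079, 15.962)
Y_COEF = (104.246, -20.151, -33.749, 29.596, 14.798, -1.136)


def predict_location(signals):
    """Predict (x, y) from the five signal strengths, ordered by access point."""
    if len(signals) != 5:
        raise ValueError("expected one signal strength per access point (5)")
    features = [math.log(100 + ss) for ss in signals]
    x = X_COEF[0] + sum(c * f for c, f in zip(X_COEF[1:], features))
    y = Y_COEF[0] + sum(c * f for c, f in zip(Y_COEF[1:], features))
    return x, y


# Hypothetical readings, strongest at access point 1.
x_hat, y_hat = predict_location([-50, -70, -85, -80, -65])
print(f"predicted location: ({x_hat:.1f}, {y_hat:.1f})")
```

Each coordinate is predicted independently, so nothing in the model forces the pair to fall inside the building.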
Although the r² values are high, the residual standard errors are also fairly high, which is not ideal. As
devices get farther from the access points, the signals become too weak for an accurate location prediction
and the errors become large.
We can conclude that the models fit fairly well, but the residual standard errors are surprisingly high given
the r² values. Running the models on the test set will help evaluate their effectiveness and validity.
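The two fit statistics can be computed as in the sketch below, using the usual definitions r² = 1 - RSS/TSS and residual standard error = sqrt(RSS / (n - p - 1)), where p is the number of predictors. The function names and the toy numbers are illustrative, not taken from the report.

```python
import math


def r_squared(actual, predicted):
    """Share of the variance in `actual` explained by the predictions."""
    mean = sum(actual) / len(actual)
    rss = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    tss = sum((a - mean) ** 2 for a in actual)
    return 1 - rss / tss


def residual_standard_error(actual, predicted, n_predictors):
    """Typical residual size, in the same units as the response."""
    rss = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    dof = len(actual) - n_predictors - 1
    return math.sqrt(rss / dof)


# Toy values for illustration only.
actual = [1.0, 2.0, 3.0, 4.0]
predicted = [1.0, 2.0, 3.0, 5.0]
print(r_squared(actual, predicted))
print(residual_standard_error(actual, predicted, n_predictors=1))
```

One possible reading of the mismatch: r² is unitless and relative to the total spread of the coordinates, while the residual standard error is in coordinate units, so a response that spans a wide range can show a high r² and still have large absolute errors.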
Figure 4 shows the difference between the real and predicted locations on the test data. The dotted
lines connect each real location to its prediction. The large residual standard error
given by the model is apparent in the graph, as the lines are fairly long compared with what
the high r² would lead us to expect.
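The length of each dotted line is the Euclidean distance between a real location and its prediction. A small sketch, with made-up coordinates and a hypothetical helper `prediction_errors`:

```python
import math


def prediction_errors(actual, predicted):
    """Distance between each real (x, y) location and its prediction."""
    return [math.dist(a, p) for a, p in zip(actual, predicted)]


# Hypothetical test-set locations and predictions, for illustration only.
actual = [(20.0, 30.0), (150.0, 100.0), (190.0, 10.0)]
predicted = [(23.0, 34.0), (150.0, 100.0), (160.0, 50.0)]

errors = prediction_errors(actual, predicted)
mean_error = sum(errors) / len(errors)
worst_error = max(errors)
print(f"mean error: {mean_error:.1f}, worst error: {worst_error:.1f}")
```

Summarizing the errors this way puts the x and y models on a single scale, which makes it easier to compare devices near the center with devices near the edges.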

Conclusion

Predicting the x- and y-coordinates of a device's location from log-transformed signal strengths is a
generally accepted method. However, once devices are far away from the access points,
the signal strengths are too weak to assess accurately. The model created here is useful in a smaller section
contained within all of the access points. Unfortunately, the model becomes worse at predicting locations
that are outside the edges created by the access points. The issue lies in determining which side of the
access point the device is on, because devices anywhere on a circle around an access point produce the same
signal strength, regardless of direction. If the model were restricted to locations inside the access points, or the
access points were positioned outside all the detectable locations, with the addition of one or two more
central access points, the accuracy in predicting a device's location could be improved. Otherwise, a more
complicated model for location detection based on signal strengths would be needed, or detection would have to be
restricted to an optimal radius around the central access point. It is unreasonable to apply those restrictions
to the model, because the premise is that you do not know how far away the devices are. However, predictions
that fall near the edges and are based on weak signal strengths should be treated with caution,
because the devices could actually be at more distant locations in the building.

[Figure 4: scatter plot of real and predicted device locations; x axis from 0 to 200, y axis from 0 to 150]

Figure 4: Overlay of predictions on real device and access point locations - the real device locations are
represented by the outlined dots, the predictions by the solid black dots, and the access points by the solid
red squares. The real locations are connected by dotted lines to their respective predictions. The distances
between the observed and predicted locations are larger than we would expect from a model with such
high r² values. The predictions on the outer edges, or those far from many of the access points, seem
to be the ones with larger errors.
