Network-side Positioning of Cellular-band Devices with Minimal Effort
Ayon Chakraborty, Luis Ortiz, and Samir R. Das
IEEE INFOCOM 2015
What is Network-Side Positioning?
Cell phone located at <X, Y>
Serving Cell Tower
Neighboring Cell Towers
RSS: Received Signal Strength
[Figure: RSS1, RSS2, RSS3 (one per tower) → Estimate Location → $(\hat{X}, \hat{Y})$]
Network Providers are Constrained
• Unlike OTT Apps … typically no direct access to such sensors
• ONLY utilize cellular signal strength information
Sensors Galore!
Fingerprinting in Cellular Networks
[Figure: a phone at successive locations (X1, Y1), (X2, Y2), (X3, Y3), ... measures RSSA, RSSB, RSSC from Towers A, B, C; each measurement adds one row to the fingerprint database]
Fingerprint Database
Location      Feature Vector
X1, Y1        <RSSA RSSB RSSC>1
X2, Y2        <RSSA RSSB RSSC>2
X3, Y3        <RSSA RSSB RSSC>3
X4, Y4        <RSSA RSSB RSSC>4
X5, Y5        <RSSA RSSB RSSC>5
…             …
XN, YN        <RSSA RSSB RSSC>N
Fingerprinting-based Localization Techniques
• Many deterministic / statistical techniques
• Median accuracy ≈ 100 –
[Figure: fingerprint database with entries (X1, Y1) to (XN, YN), each with feature vector <RSSA RSSB RSSC>, used to estimate the location $(\hat{X}, \hat{Y})$ of test data <RSSA RSSB RSSC> at unknown location (?, ?)]
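One of the simplest deterministic techniques is a nearest-neighbor lookup in the fingerprint database. The sketch below is illustrative only: it assumes a hypothetical in-memory database of (location, RSS vector) pairs and Euclidean distance in signal space. The function name `knn_locate` and all sample values are made up for the example.

```python
import math

def knn_locate(database, test_rss, k=3):
    """Average the locations of the k fingerprints closest to test_rss.

    database: list of ((x, y), (rss_a, rss_b, rss_c)) pairs (hypothetical layout).
    test_rss: RSS vector of the device to locate, same tower order.
    """
    # Distance is measured in signal space, not in physical space.
    ranked = sorted(database, key=lambda entry: math.dist(entry[1], test_rss))
    nearest = ranked[:k]
    x_hat = sum(loc[0] for loc, _ in nearest) / len(nearest)
    y_hat = sum(loc[1] for loc, _ in nearest) / len(nearest)
    return x_hat, y_hat

# Made-up fingerprints: location in meters, RSS in dBm from Towers A, B, C.
fingerprints = [
    ((0.0, 0.0), (-60.0, -80.0, -90.0)),
    ((15.0, 0.0), (-62.0, -78.0, -91.0)),
    ((300.0, 300.0), (-95.0, -70.0, -60.0)),
]
estimate = knn_locate(fingerprints, (-61.0, -79.0, -90.0), k=2)
```

With k=2 the estimate is the average of the two fingerprints nearest in signal space, so its quality depends directly on how densely the database covers the area.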
Accuracy Depends on Fingerprint Density
[Figure: fingerprint database with entries (X1, Y1) to (XN, YN), each with feature vector <RSSA RSSB RSSC>, used to estimate the location $(\hat{X}, \hat{Y})$ of test data <RSSA RSSB RSSC> at unknown location (?, ?)]
Cost ∝ (# of Locations)
Our Work:
Minimizing Labeled Data Requirement
• Minimize labeled data
• Provide good accuracy with less cost
Fingerprint Database
Location        Feature Vector
X1, Y1          <RSSA RSSB RSSC>1     (labeled)
____ , ____     <RSSA RSSB RSSC>2     (unlabeled)
X3, Y3          <RSSA RSSB RSSC>3     (labeled)
____ , ____     <RSSA RSSB RSSC>4     (unlabeled)
____ , ____     <RSSA RSSB RSSC>5     (unlabeled)
…               …
XN, YN          <RSSA RSSB RSSC>N     (labeled)
Test Data: ?, ?  <RSSA RSSB RSSC>  →  Estimate Location  →  $(\hat{X}, \hat{Y})$
Mostly Unlabeled Data, Few Labeled Data
[Figure: Towers A, B, C; legend: labeled data, unlabeled data]
Semi-supervised Setting
Semi-supervised Clustering
• Unsupervised clustering
• Labeled data anchors clusters
[Figure: Tower A, Tower B]
Location Estimation
[Figure: Towers A, B, C; five clusters with locations L1 to L5 and probabilities P1 to P5; input <RSSA RSSB RSSC>]
$L^{*}_{UE} = \sum_{k=1}^{5} P_k L_k$
P2: Prob. that phone belongs to cluster 2
L2: Physical location of cluster 2
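The slide combines the probability that the phone belongs to each cluster with that cluster's physical location, giving a probability-weighted average of the cluster locations. A minimal sketch of that step follows, assuming the probabilities are already available. The function name and the numeric values are hypothetical.

```python
def estimate_location(cluster_probs, cluster_locations):
    """Return the sum over clusters k of P_k * L_k as an (x, y) pair."""
    if len(cluster_probs) != len(cluster_locations):
        raise ValueError("need one probability per cluster")
    x_hat = sum(p * loc[0] for p, loc in zip(cluster_probs, cluster_locations))
    y_hat = sum(p * loc[1] for p, loc in zip(cluster_probs, cluster_locations))
    return x_hat, y_hat

# Hypothetical values for five clusters; the probabilities sum to 1.
probs = [0.5, 0.25, 0.25, 0.0, 0.0]
locations = [(0.0, 0.0), (40.0, 0.0), (0.0, 80.0), (40.0, 80.0), (100.0, 100.0)]
x_hat, y_hat = estimate_location(probs, locations)
```

Because the estimate is an average, it can fall between clusters rather than on any one of them.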
Semi-supervised Modeling Approach
• Marginal Gaussian PDF of signal S (received signal strengths from towers), given location L (hidden variable): $f_{S|L}$
• Mixture of independent Gaussians (GMM) over all possible locations
• Learning problem: learn $f_{S|L} = \prod_{K} f_{S_K|L}$
A Simple Example
• Many unlabeled signal strength samples
• Few samples with label E or W
• Given test signal $S_{Test}$, estimate the probabilities $p(W|S_{Test})$ and $p(E|S_{Test})$
[Figure: probability vs. signal strength for regions W and E, showing the Gaussian distribution at W, the Gaussian distribution at E, and the combined Gaussian distribution actually seen]
A Simple Example
Estimated location = p(W|RSS) * midpoint of W + p(E|RSS) * midpoint of E.
[Figure: probability vs. signal strength for regions W and E, with test signals RSS1, RSS2, RSS3 and the midpoints of W and E marked]
p(W|RSS1) = 0.1     p(E|RSS1) = 0.9
p(W|RSS2) = 0.65    p(E|RSS2) = 0.35
p(W|RSS3) = 0.95    p(E|RSS3) = 0.05
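The two-region example can be worked through with Bayes' rule: each region contributes prior × Gaussian likelihood, and the posteriors weight the region midpoints. The sketch below is an illustration under assumed parameters. The priors, means, standard deviations, and midpoints are hypothetical and not taken from the slides.

```python
import math

def gaussian_pdf(s, mean, std):
    """Density of a 1-D Gaussian at signal strength s."""
    z = (s - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))

def posteriors(s, regions):
    """Bayes' rule: p(region | s) is proportional to prior * p(s | region).

    regions: dict name -> (prior, mean, std); all values are hypothetical.
    """
    joint = {name: prior * gaussian_pdf(s, mean, std)
             for name, (prior, mean, std) in regions.items()}
    total = sum(joint.values())
    return {name: value / total for name, value in joint.items()}

def estimated_location(post, midpoints):
    """p(W|s) * midpoint of W + p(E|s) * midpoint of E, in one dimension."""
    return sum(post[name] * midpoints[name] for name in post)

# Assumed parameters: equal priors, RSS in dBm, midpoints in meters.
regions = {"W": (0.5, -70.0, 5.0), "E": (0.5, -90.0, 5.0)}
midpoints = {"W": 0.0, "E": 100.0}

post = posteriors(-80.0, regions)   # signal halfway between the two means
location = estimated_location(post, midpoints)
```

Plugging the slide's RSS2 posteriors (0.65 for W, 0.35 for E) into `estimated_location` with these assumed midpoints places the phone 35 m from the midpoint of W, toward E.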
Our Experimental Setup
• University campus: partitioned into uniform grid (15 m × 15 m)
• Each grid cell = one location
• ≈ 3K grid cells
[Figure: campus map, approx. 2.5 km × 2.5 km]
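Treating each grid cell as one location means every position is snapped to a cell index, and a cell is reported by its center. A small sketch of that mapping follows, assuming positions are given in meters relative to a hypothetical origin at one corner of the campus. The helper names are illustrative.

```python
CELL_SIZE_M = 15.0  # grid cell edge length from the slide

def to_cell(x_m, y_m, cell_size=CELL_SIZE_M):
    """Map a position in meters (origin at an assumed campus corner) to a (col, row) cell."""
    return int(x_m // cell_size), int(y_m // cell_size)

def cell_center(col, row, cell_size=CELL_SIZE_M):
    """Return the center of a grid cell in meters; used as that cell's location."""
    return (col + 0.5) * cell_size, (row + 0.5) * cell_size

cell = to_cell(31.0, 14.0)
center = cell_center(*cell)
```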
Data Collected
• 35K samples at outdoor locations
• T-Mobile’s GSM network on Nexus 4/5 phones (our technique is not specific to GSM, though)
• 10K samples kept aside for testing
Algorithm Overview
• Step 1: Initialize each location with a Gaussian (mean, variance) and a prior probability for the location
• Step 2: Run Expectation Maximization (EM)
– Handles partially labeled training data
– EM converges to a local maximum of the likelihood (local MLE)
– Yields ‘learned model’
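The two steps can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes diagonal Gaussians (one independent Gaussian per tower), an initialization supplied by the caller, and a variance floor `min_var`. All function names and sample values are hypothetical. Labeled samples keep their known location in the E-step, while unlabeled samples receive posterior probabilities over all locations.

```python
import math

def log_gauss(x, mean, var):
    """Log density of independent (diagonal) Gaussians, one per tower."""
    return sum(-0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
               for xi, m, v in zip(x, mean, var))

def responsibilities(x, label, priors, means, variances):
    """E-step for one sample: one-hot if labeled, posterior over locations if not."""
    k = len(priors)
    if label is not None:
        return [1.0 if j == label else 0.0 for j in range(k)]
    logs = [math.log(priors[j]) + log_gauss(x, means[j], variances[j])
            for j in range(k)]
    top = max(logs)
    weights = [math.exp(v - top) for v in logs]  # subtract max for stability
    total = sum(weights)
    return [w / total for w in weights]

def run_em(samples, priors, means, variances, iterations=50, min_var=1e-3):
    """samples: list of (rss_vector, label); label is a location index or None."""
    k, dim = len(priors), len(means[0])
    for _ in range(iterations):
        resp = [responsibilities(x, y, priors, means, variances)
                for x, y in samples]
        for j in range(k):  # M-step: weighted maximum-likelihood updates
            weight = sum(r[j] for r in resp)
            if weight == 0.0:
                continue  # nothing assigned to this location; keep old parameters
            priors[j] = weight / len(samples)
            means[j] = [sum(r[j] * x[d] for r, (x, _) in zip(resp, samples)) / weight
                        for d in range(dim)]
            variances[j] = [max(min_var,
                                sum(r[j] * (x[d] - means[j][d]) ** 2
                                    for r, (x, _) in zip(resp, samples)) / weight)
                            for d in range(dim)]
    return priors, means, variances

# Made-up 1-D data: two locations, one labeled sample each, the rest unlabeled.
samples = [([-70.0], 0), ([-90.0], 1)]
samples += [([v], None) for v in (-72.0, -68.0, -71.0, -69.0,
                                  -88.0, -92.0, -91.0, -89.0)]
priors, means, variances = run_em(
    samples, [0.5, 0.5], [[-70.0], [-90.0]], [[25.0], [25.0]])
```

In this toy run the two labeled samples fix which component corresponds to which location, and the unlabeled samples refine each component's mean and variance.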
WINGS Lab 20
10% of training samples have labels
Median Accuracy ≈ 70 m
[Figure: comparison of Mixture Model, K-Nearest Neighbors, and Gaussian Naïve Bayes]
1% of training samples have labels
Median Accuracy ≈ 90 m
[Figure: comparison of Mixture Model, K-Nearest Neighbors, and Gaussian Naïve Bayes]
Good news!
Unlabeled measurements easy to obtain
[Figure labels: 2%, 20%]
Observation 1:
More Unlabeled Data Reduces Error
Of course, up to a theoretical limit (Bayes risk)
Observation 2:
Higher Errors are Spatially Clustered
Observation 3:
Accuracy Improves with Cell Tower Density
Potential for good performance in high density deployments (e.g., small cells /
Almost 40m!
# of features = cell towers heard
Summary of Our Contributions
• Minimize labeled training data requirement – a cost center for operators
• 1% labeled data achieves ≈ 90m median accuracy
• Additional unlabeled data improves accuracy, up to a theoretical limit (Bayes risk)
• Possible to extend to a completely unsupervised setting (see paper)
Acknowledgement
Huawei Technologies, New Jersey, USA
Thank You
Cell Towers
• The region (University campus) is partitioned into a uniform grid (15 m × 15 m) and each grid cell represents a candidate location (~3K grid cells).
• These are our locations.
[Figure: campus map with cell towers marked, approx. 2.5 km across]