(1)

Network-side Positioning of Cellular-band Devices with Minimal Effort

Ayon Chakraborty, Luis Ortiz, and Samir R. Das

IEEE INFOCOM 2015

(2)

What is Network-Side Positioning?

[Figure: a cell phone located at <X, Y> reports RSS₁, RSS₂, RSS₃ for its serving cell tower and neighboring cell towers; the network uses these measurements to estimate the location (X̂, Ŷ).]

RSS: Received Signal Strength

(3)

Network Providers are Constrained

• Unlike OTT Apps … typically no direct access to such sensors

• ONLY utilize cellular signal strength information

Sensors Galore!

(4)

Fingerprinting in Cellular Networks

[Figure: a phone at (X₁, Y₁) measures RSS_A, RSS_B, and RSS_C from Towers A, B, and C.]

Fingerprint Database

Location      Feature Vector
X₁, Y₁        <RSS_A, RSS_B, RSS_C>₁

(5)

[Figure: the phone at (X₂, Y₂) measures RSS_A, RSS_B, and RSS_C.]

Fingerprint Database

Location      Feature Vector
X₁, Y₁        <RSS_A, RSS_B, RSS_C>₁
X₂, Y₂        <RSS_A, RSS_B, RSS_C>₂

(6)

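[Figure: the phone at (X₃, Y₃) measures RSS_A, RSS_B, and RSS_C from Towers A, B, and C.]

Fingerprint Database

Location      Feature Vector
X₁, Y₁        <RSS_A, RSS_B, RSS_C>₁
X₂, Y₂        <RSS_A, RSS_B, RSS_C>₂
X₃, Y₃        <RSS_A, RSS_B, RSS_C>₃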

(7)

[Figure: Towers A and B; measurements continue at further locations.]

Fingerprint Database

Location      Feature Vector
X₁, Y₁        <RSS_A, RSS_B, RSS_C>₁
X₂, Y₂        <RSS_A, RSS_B, RSS_C>₂
X₃, Y₃        <RSS_A, RSS_B, RSS_C>₃
X₄, Y₄        <RSS_A, RSS_B, RSS_C>₄
X₅, Y₅        <RSS_A, RSS_B, RSS_C>₅
…             …
X_N, Y_N      <RSS_A, RSS_B, RSS_C>_N

(8)

[Figure: Towers A, B, and C alongside the fingerprint database (rows 1, 2, …, 5, …).]

Fingerprint Database

(9)

Fingerprinting-based Localization Techniques

• Many deterministic / statistical techniques

• Median accuracy ≈ 100 –

[Figure: test data <RSS_A, RSS_B, RSS_C> with unknown location (?, ?) is matched against the fingerprint database to estimate the location (X̂, Ŷ).]
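As one illustration of a deterministic technique, a nearest-neighbor lookup against a fingerprint database might look like the sketch below. This is not the method proposed in the talk; the database entries, the RSS values (in dBm), and the function name `knn_locate` are all hypothetical.

```python
import math

# Hypothetical fingerprint database:
# (x, y) location in meters -> <RSS_A, RSS_B, RSS_C> in dBm.
FINGERPRINTS = [
    ((0.0, 0.0), (-60.0, -80.0, -90.0)),
    ((15.0, 0.0), (-65.0, -75.0, -88.0)),
    ((0.0, 15.0), (-70.0, -85.0, -70.0)),
    ((15.0, 15.0), (-75.0, -72.0, -68.0)),
]


def knn_locate(rss, database, k=1):
    """Average the locations of the k fingerprints closest to `rss` in signal space."""
    ranked = sorted(database, key=lambda entry: math.dist(rss, entry[1]))
    nearest = ranked[:k]
    x = sum(loc[0] for loc, _ in nearest) / len(nearest)
    y = sum(loc[1] for loc, _ in nearest) / len(nearest)
    return (x, y)


# Test data with unknown location (?, ?).
estimate = knn_locate((-64.0, -76.0, -89.0), FINGERPRINTS, k=1)
```

With k = 1 the estimate is simply the location of the closest stored fingerprint, which is why accuracy in such schemes is tied to how densely the database covers the area.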

(10)

Accuracy Depends on Fingerprint Density

[Figure: test data with unknown location (?, ?) is matched against the fingerprint database to estimate the location (X̂, Ŷ).]

Cost ∝ (# of Locations)

(11)

Our Work: Minimizing Labeled Data Requirement

• Minimize labeled data

• Provide good accuracy with less cost

[Figure: fingerprint database in which only some rows (e.g., X₃, Y₃) keep their location labels, while others (rows 2, 4, 5) have blank locations (____ , ____). Labeled data and unlabeled data together are used to estimate the location (X̂, Ŷ) of test data with unknown location (?, ?).]

(12)

Mostly Unlabeled Data, Few Labeled Data

[Figure: measurements around Towers A, B, and C.]

Labeled data + Unlabeled data = Semi-supervised Setting

(13)

Semi-supervised Clustering

• Unsupervised clustering

• Labeled data anchors clusters

[Figure: Tower A, Tower B]

(14)

Location Estimation

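[Figure: Towers A, B, and C; a phone (UE) reporting <RSS_A, RSS_B, RSS_C>; five clusters with physical locations L₁ … L₅ and probabilities P₁ … P₅.]

$L_{UE} = \sum_{k=1}^{5} P_k \cdot L_k$

P_k: Prob. that phone belongs to cluster k

L_k: Physical location of cluster k (e.g., L₂ is the physical location of cluster 2)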
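The estimate on this slide combines the physical cluster locations L_k, each weighted by the probability P_k that the phone belongs to that cluster. A minimal sketch of that weighted sum follows; the function name `estimate_location`, the probabilities, and the coordinates are hypothetical values chosen for illustration.

```python
def estimate_location(probs, cluster_locations):
    """Return sum_k P_k * L_k for 2-D cluster locations."""
    if abs(sum(probs) - 1.0) > 1e-9:
        raise ValueError("cluster probabilities should sum to 1")
    x = sum(p * loc[0] for p, loc in zip(probs, cluster_locations))
    y = sum(p * loc[1] for p, loc in zip(probs, cluster_locations))
    return (x, y)


# Hypothetical values for five clusters (coordinates in meters).
P = [0.5, 0.25, 0.25, 0.0, 0.0]
L = [(0.0, 0.0), (20.0, 0.0), (0.0, 40.0), (100.0, 100.0), (50.0, 50.0)]
location = estimate_location(P, L)
```

Clusters with zero probability drop out of the sum, so the estimate is pulled only toward the clusters the phone plausibly belongs to.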

(15)

Semi-supervised Modeling Approach

• Marginal Gaussian PDF of signal S (received signal strengths from towers), given location L (hidden variable): $f_{S|L}$

• Mixture of independent Gaussians (GMM) over all possible locations

• Learning problem: learn $f_{S|L} = \prod_{K} f_{S_K|L}$
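To make the modeling assumption concrete, the sketch below evaluates a per-location likelihood as a product of independent per-tower Gaussians (computed in log space) and turns it into a posterior over locations with Bayes' rule. The model parameters, location names, and function names are hypothetical; the talk does not specify an implementation.

```python
import math


def log_gaussian(x, mean, var):
    """Log density of a 1-D Gaussian."""
    return -0.5 * (math.log(2.0 * math.pi * var) + (x - mean) ** 2 / var)


def log_likelihood(signal, means, variances):
    """log f_{S|L}: sum of per-tower log densities (towers independent given L)."""
    return sum(log_gaussian(s, m, v) for s, m, v in zip(signal, means, variances))


def posterior(signal, model):
    """model: {location: (prior, means, variances)} -> {location: P(L | S)}."""
    logs = {
        loc: math.log(prior) + log_likelihood(signal, means, variances)
        for loc, (prior, means, variances) in model.items()
    }
    top = max(logs.values())  # subtract the max for numerical stability
    weights = {loc: math.exp(v - top) for loc, v in logs.items()}
    total = sum(weights.values())
    return {loc: w / total for loc, w in weights.items()}


# Hypothetical two-location, two-tower model (RSS in dBm, variance in dB^2).
MODEL = {
    "cell_1": (0.5, (-60.0, -80.0), (16.0, 16.0)),
    "cell_2": (0.5, (-80.0, -60.0), (16.0, 16.0)),
}
post = posterior((-62.0, -78.0), MODEL)
```

Working in log space avoids underflow when many towers are heard, since the product of many small densities quickly becomes too small to represent directly.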

(16)

A Simple Example

• Many unlabeled signal strength samples

• Few samples with label E or W

• Given test signal S_Test, estimate the probabilities p(W | S_Test) and p(E | S_Test)

[Figure: probability vs. signal strength for regions W and E, showing the Gaussian distribution at W, the Gaussian distribution at E, and the combined Gaussian distribution actually seen.]

(17)

A Simple Example

Estimated location = p(W| RSS )*midpoint of W + p(E| RSS )*midpoint of E.

[Figure: probability vs. signal strength for regions W and E, with the midpoint of W, the midpoint of E, and three test signals RSS1, RSS2, RSS3 marked.]

RSS1: p(W|RSS1) = 0.1, p(E|RSS1) = 0.9

RSS2: p(W|RSS2) = 0.65, p(E|RSS2) = 0.35

RSS3: p(W|RSS3) = 0.95, p(E|RSS3) = 0.05
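As a worked example of the formula on this slide, suppose the midpoints of W and E lie at 0 m and 100 m along a single axis. These midpoints are assumed values for illustration only; the probabilities are the ones given on the slide.

```latex
\begin{align*}
\hat{x}(\mathrm{RSS1}) &= 0.1 \cdot 0 + 0.9 \cdot 100 = 90\ \text{m} \\
\hat{x}(\mathrm{RSS2}) &= 0.65 \cdot 0 + 0.35 \cdot 100 = 35\ \text{m} \\
\hat{x}(\mathrm{RSS3}) &= 0.95 \cdot 0 + 0.05 \cdot 100 = 5\ \text{m}
\end{align*}
```

A confident posterior (RSS1, RSS3) places the estimate near one midpoint, while an ambiguous one (RSS2) places it between the two regions.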

(18)

Our Experimental Setup

• University campus: partitioned into a uniform grid (15 m × 15 m)

• Each grid cell = one location

• ≈ 3K grid cells

[Figure: campus map; scale label: 2.5 km (approx.)]
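The grid partitioning can be sketched as a mapping from a position in meters to a grid cell index. The 15 m cell size comes from the slide; the grid origin, the function names, and the sample coordinates are assumptions for illustration.

```python
CELL_SIZE_M = 15.0  # grid cell edge length, from the slide


def grid_cell(x_m, y_m, cell_size=CELL_SIZE_M):
    """Map a position in meters (relative to a hypothetical grid origin) to (column, row)."""
    return (int(x_m // cell_size), int(y_m // cell_size))


def cell_center(col, row, cell_size=CELL_SIZE_M):
    """Return the center of a grid cell in meters; usable as the cell's location."""
    return ((col + 0.5) * cell_size, (row + 0.5) * cell_size)


cell = grid_cell(100.0, 32.0)
center = cell_center(*cell)
```

Treating each cell as one location turns positioning into a choice among a finite set of candidate locations, at the cost of a quantization error of up to half a cell diagonal.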

(19)

Data Collected

• 35K samples at outdoor locations

• T-Mobile’s GSM network on Nexus 4/5 phones (our technique is not specific to GSM, though)

• 10K samples kept aside for testing

(20)

Algorithm Overview

• Step 1: Initialize each location with a Gaussian (mean, variance) and a prior probability for the location

• Step 2: Run Expectation Maximization (EM)

– Handles partially labeled training data

– EM converges to a local MLE

– Yields ‘learned model’
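The two steps can be sketched for a simplified one-dimensional case with two locations, W and E. This is a minimal illustration under stated assumptions, not the authors' implementation: the initialization from labeled samples, the variance floor `min_var`, the fixed iteration count, and the synthetic data are all hypothetical choices.

```python
import math
import random


def gaussian_pdf(x, mean, var):
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def semi_supervised_em(samples, locations, iterations=50, min_var=1e-3):
    """samples: list of (signal, label), label is None when unlabeled.
    Returns {location: (prior, mean, variance)}."""
    signals = [s for s, _ in samples]
    n = len(signals)
    overall_mean = sum(signals) / n
    overall_var = max(sum((s - overall_mean) ** 2 for s in signals) / n, min_var)

    # Step 1: initialize each location (assumption: mean from its labeled samples).
    model = {}
    for loc in locations:
        labeled = [s for s, lab in samples if lab == loc]
        mean = sum(labeled) / len(labeled) if labeled else overall_mean
        model[loc] = (1.0 / len(locations), mean, overall_var)

    # Step 2: EM iterations.
    for _ in range(iterations):
        # E-step: labeled samples keep their label; unlabeled ones get soft weights.
        resp = []
        for s, lab in samples:
            if lab is not None:
                r = {loc: 1.0 if loc == lab else 0.0 for loc in locations}
            else:
                w = {loc: p * gaussian_pdf(s, m, v) for loc, (p, m, v) in model.items()}
                total = sum(w.values())
                if total > 0.0:
                    r = {loc: w[loc] / total for loc in locations}
                else:
                    r = {loc: 1.0 / len(locations) for loc in locations}
            resp.append(r)
        # M-step: re-estimate prior, mean, and variance per location.
        for loc in locations:
            weight = sum(r[loc] for r in resp)
            if weight == 0.0:
                continue
            mean = sum(r[loc] * s for r, s in zip(resp, signals)) / weight
            var = sum(r[loc] * (s - mean) ** 2 for r, s in zip(resp, signals)) / weight
            model[loc] = (weight / n, mean, max(var, min_var))
    return model


# Synthetic data: many unlabeled samples, three labeled samples per location.
random.seed(0)
samples = [(random.gauss(-90.0, 4.0), None) for _ in range(200)]
samples += [(random.gauss(-70.0, 4.0), None) for _ in range(200)]
samples += [(random.gauss(-90.0, 4.0), "W") for _ in range(3)]
samples += [(random.gauss(-70.0, 4.0), "E") for _ in range(3)]
model = semi_supervised_em(samples, ["W", "E"])
```

The labeled samples serve two purposes here: they seed the initial means, and their fixed responsibilities tie each learned Gaussian to a named location throughout the iterations.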

WINGS Lab

(21)

10% labeled training samples

Median Accuracy ≈ 70 m

[Figure legend: Mixture Model, K-Nearest Neighbors, Gaussian Naïve Bayes]

(22)

1% labeled training samples

Median Accuracy ≈ 90 m

[Figure legend: Mixture Model, K-Nearest Neighbors, Gaussian Naïve Bayes]
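A median accuracy figure of this kind can be computed as the median of per-sample localization errors. The sketch below assumes the error is the Euclidean distance between estimated and true locations, which the slides do not state explicitly; the sample coordinates are hypothetical.

```python
import math
import statistics


def median_error(estimates, truths):
    """Median Euclidean distance (in meters) between estimated and true locations."""
    errors = [math.dist(e, t) for e, t in zip(estimates, truths)]
    return statistics.median(errors)


# Hypothetical estimated and true locations (meters).
est = [(0.0, 0.0), (30.0, 40.0), (10.0, 0.0)]
true = [(3.0, 4.0), (0.0, 0.0), (10.0, 0.0)]
result = median_error(est, true)
```

The median is less sensitive than the mean to a few very large errors, such as the 50 m outlier in this toy data.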

(23)

Observation 1: More Unlabeled Data Reduces Error

Good news! Unlabeled measurements are easy to obtain.

Of course, up to a theoretical limit (Bayes risk).

[Figure: plot with curves labeled 2% and 20%]

(24)

Observation 2: Higher Errors are Spatially Clustered

(25)

Observation 3: Accuracy Improves with Cell Tower Density

Potential for good performance in high density deployments (e.g., small cells /

Almost 40m!

# of features = cell towers heard

(26)

Summary of Our Contributions

• Minimize labeled training data requirement – a cost center for operators

• 1% labeled data achieves ≈ 90m median accuracy

• Additional unlabeled data improves accuracy, up to a theoretical limit (Bayes risk)

• Possible to extend to completely unsupervised setting (see paper)

(27)

Acknowledgement

Huawei Technologies, New Jersey, USA

Thank You

(28)

(29)

Cell Towers

• The region (university campus) is partitioned into a uniform grid (15 m × 15 m), and each grid cell represents a candidate location (~3K grid cells).

• Remove cell towers backup

• These are our locations.

[Figure: campus map with a cell tower marked; scale label: 2.5 km (approx.)]