Automatic Wi-Fi Fingerprint System based on Unsupervised Learning

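Fingerprinting algorithms based on supervised and semi-supervised learning, such as SVM (Support Vector Machine) and PCA (Principal Component Analysis), require RSSI measurements in every indoor space in order to generate a radio map. The radio map generated by the proposed algorithm is applied to real-time user positioning in the positioning phase. In a comparison with a radio map based on real measurements, the proposed fingerprint system was confirmed to have high stability and accuracy.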

Introduction

Background and Necessity for Research

Furthermore, in non-line-of-sight (NLOS) cases, where radio reflection and refraction can occur, position errors increase significantly because it is difficult to measure the radio arrival time accurately. In the positioning phase, the indoor positions of the users are estimated based on the radio map created in the training phase. On the other hand, fingerprinting does not need an analysis of the indoor radio environment in the training phase, because it uses a radio map that stores the RSSI as it is.

Objectives and Contents for Research

Therefore, as the building size and Wi-Fi density increase, the size and generation time of the radio map for positioning also increase rapidly. This process is repeated, and the generator is gradually improved into a well-trained generator that can produce a radio map similar to the real measured one. The modified GAN-based radio map on the new floors is generated through the learned generator by combining the 2-D map, the coordinates of the APs, and the various kernel data analyzed from the real radio map measured on the reference floor.

Wi-Fi Positioning and Unsupervised Learning

Wi-Fi Positioning

Wi-Fi Signal and Fingerprint
Fingerprint Techniques

However, the accuracy of the radio map is high, due to the sufficient collection of signals at the reference points. The dispersion values of each AP show the characteristics of each AP, and the RSSIs of the radio map can be reduced according to the number of principal components [40, 41]. According to the definition of the log function, it can be expressed by the equation given by.

Fig. 2.1 The conventional attenuation of RSSI over distance
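
The equation behind this curve is not reproduced in this text. As an illustration only, the sketch below uses the widely used log-distance path-loss model, RSSI(d) = RSSI(d0) - 10 n log10(d / d0). The reference RSSI of -40 dBm at 1 m and the path-loss exponent n = 3 are assumed values chosen for the example, not parameters taken from this work.

```python
import math


def rssi_at_distance(d_m, rssi_d0=-40.0, d0_m=1.0, path_loss_exponent=3.0):
    """Log-distance model: RSSI(d) = RSSI(d0) - 10 * n * log10(d / d0).

    rssi_d0, d0_m and path_loss_exponent are assumed example values.
    """
    if d_m <= 0 or d0_m <= 0:
        raise ValueError("distances must be positive")
    return rssi_d0 - 10.0 * path_loss_exponent * math.log10(d_m / d0_m)


def distance_from_rssi(rssi, rssi_d0=-40.0, d0_m=1.0, path_loss_exponent=3.0):
    """Invert the model to estimate distance (in metres) from a measured RSSI."""
    return d0_m * 10.0 ** ((rssi_d0 - rssi) / (10.0 * path_loss_exponent))


# Print the modelled RSSI at a few distances to show the logarithmic decay.
for d in (1, 3, 10, 30):
    print(f"{d:>3} m -> {rssi_at_distance(d):6.1f} dBm")
```

In NLOS conditions the measured RSSI can deviate strongly from such a curve, which is one reason fingerprinting stores measured values instead of relying on a propagation model.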

Unsupervised Learning

Neural Network
Autoencoder
Generative Adversarial Network

The x-axis is the SSID of the AP and the y-axis is the difference value of the IG of MDLP with respect to the IGs of CAIM and CACC. The magnitude of the y-axis is very good, but these results have a large impact on the ability to remove APs. As shown in Fig. 2.10, neural network-based supervised and semi-supervised learning algorithms interpret features through hidden layer correlations between input and output data.

Labeling refers to the process of inducing the desired data output by allowing the user to compare the expected output data directly with the trained output data. Therefore, learning performance is determined by label quality, training data size, and training volume. Moreover, because the distribution of RSSIs can be highly distorted and ambiguous, depending on the internal structures and radio environment, there is limited room to improve the label quality in order to increase the accuracy of the learning algorithm.

Here, training data is labeled data, because the label, or the result of the classifier, is entered along with the data. Using training data consisting of input data and one datum with the value of 1, the autoencoder encodes the signal feature vector, in the form of a deterministic function, according to each feature. Back-propagation, which is a core algorithm of supervised learning-based neural networks, requires ground truth.

However, GAN has recently been proposed to improve the prediction performance through mutual learning of the training model.

Fig. 2.10 The network structure of supervised/semi-supervised learning

Proposed Fingerprint System

Unsupervised Dual Radio Mapping Algorithm

First, in the learning phase, the proposed algorithm measures the SSID and RSSI at given reference points on the reference floor of the building, where the users' locations can be known. As the amount of iterative learning for the same AP gradually increases, so does the accuracy of the distribution curve. Even if the positions of the reference points change, the accuracy of RSSI prediction between a receiver and a transmitter increases.

Therefore, the radio map is predicted and generated by the proposed algorithm on other floors of the same structure. The input radio map measured on the reference floor is updated by comparing the weight of the autoencoder with the generated radio map, and the autoencoder consists of encoders and decoders.
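
To make the encoder/decoder split concrete, the following is a minimal sketch of a generic linear autoencoder trained on RSSI vectors. It is not the modified autoencoder of this work: the layer sizes, learning rate, normalisation, and sample values are all assumptions made up for illustration.

```python
import random


def train_linear_autoencoder(samples, code_size=2, epochs=500, lr=0.05, seed=0):
    """Train x -> z = enc.x -> x_hat = dec.z by per-sample gradient descent
    on the squared reconstruction error. Returns (encode, decode, loss_before, loss_after)."""
    rng = random.Random(seed)
    dim = len(samples[0])
    enc = [[rng.uniform(-0.5, 0.5) for _ in range(dim)] for _ in range(code_size)]
    dec = [[rng.uniform(-0.5, 0.5) for _ in range(code_size)] for _ in range(dim)]

    def encode(x):
        return [sum(w * xi for w, xi in zip(row, x)) for row in enc]

    def decode(z):
        return [sum(w * zi for w, zi in zip(row, z)) for row in dec]

    def loss():
        total = 0.0
        for x in samples:
            x_hat = decode(encode(x))
            total += sum((a - b) ** 2 for a, b in zip(x_hat, x))
        return total / len(samples)

    loss_before = loss()
    for _ in range(epochs):
        for x in samples:
            z = encode(x)
            err = [a - b for a, b in zip(decode(z), x)]
            # Gradient w.r.t. the code, computed before the decoder is updated.
            grad_z = [sum(dec[i][j] * err[i] for i in range(dim)) for j in range(code_size)]
            for i in range(dim):
                for j in range(code_size):
                    dec[i][j] -= lr * err[i] * z[j]
            for j in range(code_size):
                for i in range(dim):
                    enc[j][i] -= lr * grad_z[j] * x[i]
    return encode, decode, loss_before, loss()


# Hypothetical RSSI readings (dBm) from four APs at six reference points.
RSSI_SAMPLES_DBM = [
    [-45.0, -60.0, -75.0, -88.0],
    [-50.0, -58.0, -72.0, -85.0],
    [-58.0, -52.0, -68.0, -80.0],
    [-66.0, -47.0, -62.0, -74.0],
    [-73.0, -55.0, -55.0, -67.0],
    [-80.0, -63.0, -48.0, -60.0],
]


def normalise(rssi_dbm):
    """Map roughly -100..-40 dBm onto 0..1 (assumed range for this example)."""
    return (rssi_dbm + 100.0) / 60.0


samples = [[normalise(v) for v in row] for row in RSSI_SAMPLES_DBM]
encode, decode, initial_loss, final_loss = train_linear_autoencoder(samples)
print(f"reconstruction error: {initial_loss:.4f} -> {final_loss:.4f}")
```

The encoder compresses each four-AP reading into a two-value code and the decoder reconstructs the reading from that code. No labels are involved, which is what makes the training unsupervised.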

The generated radio map for each AP is predicted taking into account the coordinates of the installed APs on the new floors, assuming the same structure as the measured reference floor and the learned indoor structures. The modified GAN-based radio map generation algorithm can be applied to indoor environments where the reflection and refraction characteristics of the radio are the same, even though the installed coordinates of the APs and indoor structures are different. The discriminator learns the data from the fake radio map, which is the combined 2-D map and the coordinates of the APs.

The learned generator combines different learned kernel types and predicts RSSI distributions on different floors using the 2-D map and the coordinates of the APs.

Fig. 3.2 The type of input/output data of modified autoencoder

MDLP-based Radio Map Feedback Algorithm

In this process, overfitting occurs when an algorithm learns 100% of the detail and noise from the training data, such that it negatively affects the performance of the model. Often the noises or random fluctuations of the training data are picked up and learned as concepts by the model. Through this operation, the coordinates of the reference points with the highest similarity are determined to be the user locations.

This method improves positioning performance by filtering out unnecessary AP signals and automatically reducing the dimension of the radio map in areas where the number of users is large and the density of Wi-Fi APs is high. Wi-Fi signals measured in real time are compared with the RSSIs of the radio map to estimate the position. They are also listed in the form of the proposed radio map according to their measured locations.
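
The comparison between a real-time scan and the radio map can be sketched as a nearest-neighbour lookup. The radio map contents, the SSIDs, the -100 dBm placeholder for unseen APs, and the use of Euclidean distance as the (inverse) similarity measure are all assumptions for illustration. The similarity measure actually used by the proposed system is not specified in this text.

```python
import math

# Hypothetical radio map: reference point (x, y) in metres -> {SSID: RSSI in dBm}.
RADIO_MAP = {
    (0.0, 0.0): {"ap-1": -42.0, "ap-2": -71.0, "ap-3": -80.0},
    (3.0, 0.0): {"ap-1": -50.0, "ap-2": -63.0, "ap-3": -77.0},
    (6.0, 0.0): {"ap-1": -61.0, "ap-2": -52.0, "ap-3": -70.0},
}

# Assumed placeholder RSSI for an AP that is missing from a scan or fingerprint.
MISSING_RSSI = -100.0


def rssi_distance(scan, fingerprint):
    """Euclidean distance in RSSI space over the union of observed SSIDs."""
    ssids = set(scan) | set(fingerprint)
    return math.sqrt(
        sum(
            (scan.get(s, MISSING_RSSI) - fingerprint.get(s, MISSING_RSSI)) ** 2
            for s in ssids
        )
    )


def locate(scan, radio_map=RADIO_MAP):
    """Return the reference point whose fingerprint is closest to the scan."""
    return min(radio_map, key=lambda rp: rssi_distance(scan, radio_map[rp]))


live_scan = {"ap-1": -52.0, "ap-2": -64.0, "ap-3": -76.0}
print(locate(live_scan))
```

Every AP kept in the radio map adds one term to each distance computation, so removing APs that do not help separate reference points reduces both the map size and the positioning cost.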

Input data consists of a series of red rectangles, which are divided into different subsets via the discretization performed by the MDLP operations. If the number of disjoint subsets is less than the number of reference points on the corresponding floor, the AP can be judged to be one whose measured RSSIs cannot distinguish between the reference points. APs with similar signal characteristics have a very small influence on positioning, but they increase the amount of computation and the size of the radio map.

The data characteristics of the quantified AP numerically represent the similarity between the different AP signals, so that only those with the best classification remain among the signals of similar distribution.
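
As a hedged illustration of the discretization step, the sketch below applies the standard Fayyad-Irani MDLP criterion to the RSSIs of a single AP, using reference-point IDs as class labels. The sample readings and the `is_discriminative` rule (at least as many intervals as reference points) are assumptions that follow one reading of the description above. This is not the RMF algorithm itself.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def mdlp_cuts(values, labels):
    """Recursively find the cut points accepted by the Fayyad-Irani MDLP criterion."""
    pairs = sorted(zip(values, labels))
    xs = [v for v, _ in pairs]
    ys = [l for _, l in pairs]
    n = len(ys)
    if n < 2:
        return []
    base = entropy(ys)
    best = None  # (information gain, split index)
    for i in range(1, n):
        if xs[i] == xs[i - 1]:
            continue  # a cut must separate two distinct values
        left, right = ys[:i], ys[i:]
        cond = (len(left) * entropy(left) + len(right) * entropy(right)) / n
        gain = base - cond
        if best is None or gain > best[0]:
            best = (gain, i)
    if best is None:
        return []
    gain, i = best
    left, right = ys[:i], ys[i:]
    k, k1, k2 = len(set(ys)), len(set(left)), len(set(right))
    delta = math.log2(3 ** k - 2) - (k * base - k1 * entropy(left) - k2 * entropy(right))
    if gain <= (math.log2(n - 1) + delta) / n:
        return []  # the cut does not pay for its description length
    cut = (xs[i - 1] + xs[i]) / 2
    return mdlp_cuts(xs[:i], left) + [cut] + mdlp_cuts(xs[i:], right)


def is_discriminative(values, labels):
    """Assumed rule: keep an AP if MDLP yields at least one interval per reference point."""
    return len(mdlp_cuts(values, labels)) + 1 >= len(set(labels))


# Hypothetical readings of two APs at reference points "A" and "B" (six scans each).
rp_labels = ["A"] * 6 + ["B"] * 6
good_ap = [-40, -41, -42, -43, -41, -40, -70, -71, -72, -69, -70, -71]
bad_ap = [-60, -62, -61, -63, -60, -62, -61, -60, -63, -62, -61, -60]

print(mdlp_cuts(good_ap, rp_labels), is_discriminative(good_ap, rp_labels))
print(mdlp_cuts(bad_ap, rp_labels), is_discriminative(bad_ap, rp_labels))
```

In this example the first AP separates the two reference points with a single cut, while the second AP's readings overlap so heavily that no cut is accepted and it would be a candidate for removal from the radio map.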

Fig. 3.10 The proposed flowchart of the positioning phase

Experiment and Result

Experimental Environment and Configuration

The measurement items were SSIDs and RSSIs, and the generated radio map was stored in MySQL, which enables the management of the radio map in real time. Because the modified GAN required the 2-D maps and the coordinates of the APs as input, this information had to be obtained. In a corridor of the reference floor, 8 of the 139 APs measured in the walking survey were public APs.

One setting of the 2-D map for learning assumed that a line in the image was a structure that completely blocked Wi-Fi signals. Generating a radio map used in the conventional fingerprint system required a simple arrangement of the measured RSSIs according to the reference points. Therefore, the structure of the radio map was reconfigured to fit the proposed UDRM, designed with the modified autoencoder and GAN.

Unlike the conventional radio map, which is applied for fingerprinting, the radio map applied to the proposed system learns not only the propagation features but also the indoor structures of the building. The real radio map of the reference floor and the input data used in autoencoder learning and applied to similar areas and floors remained the same. The generated RSSIs were then input to the coordinates of the APs, and the modified autoencoder-based radio map was finally completed.

The coordinates on the left are the location coordinates obtained by converting those of the real APs into the reference points.

Fig. 4.2 The structure of experimental floor

Results of Unsupervised Dual Radio Mapping Algorithm

Based on the AP coordinates, the real RSSI and the radio map generated by the modified autoencoder are almost the same. One pixel on the x and y axes corresponds to a 3 m interval on the 2-D map plane. Panel (a) shows the real radio map measured by walking. Fig. 4.8 (a) and (b) show the results of the measured radio map and the predicted radio map based on the modified autoencoder on the 4th floor.

Fig. 4.9 (a) and (b) show the measured radio map and the predicted radio map based on the modified GAN on the 5th floor. The predicted UDRM radio map was compared with the real measured radio map as follows. The total error rate of the radio map generated for each floor by the modified autoencoder is within 10%, and the modified GAN on the 5th floor is almost the same as the measured radio map: about 1.

The algorithm was applied to generate the radio map for all floors, while the directly measured radio map was obtained by the walking survey. The side effect of randomly updating the radio map is prevented in advance. The concordance rates between the radio map generated by the proposed UDRM and the walking survey are relatively accurate, within 10.

Taking the measured radio map based on the walking survey as 100%, the concordance rates of the radio maps generated by the modified autoencoder and the modified GAN were 90.4% and 92.7%, corresponding to prediction errors of 9.6% and 7.3%, respectively. In the positioning stage, when the MDLP-based RMF algorithm was applied to the radio maps of the modified autoencoder and GAN, the update rates of the radio map were 52.9% and 60.5%, respectively. Moreover, the positioning accuracy of the proposed fingerprint system was 88.59%, compared with the accuracy of 89.66% obtained with the measured radio map based on the walking survey, showing almost no loss in accuracy.
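
The relationship between the reported figures can be checked with a few lines of arithmetic. The sketch assumes only that the prediction error is the complement of the concordance rate, as the numbers above imply. The dictionary keys are labels chosen for this example.

```python
def prediction_error(concordance_pct):
    """Prediction error as the complement of the concordance rate (percent)."""
    return round(100.0 - concordance_pct, 1)


# Concordance rates reported in the text, with the walking survey taken as 100%.
reported_concordance = {"modified autoencoder": 90.4, "modified GAN": 92.7}

for name, rate in reported_concordance.items():
    print(f"{name}: concordance {rate}% -> error {prediction_error(rate)}%")

# Positioning accuracy with the measured map vs. the proposed system (percent).
accuracy_gap = round(89.66 - 88.59, 2)
print(f"accuracy gap: {accuracy_gap} percentage points")
```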

Fig. 4.6 The result of radio map by modified autoencoder (2nd floor) intervals of 3 m to the reference points

Results of MDLP-based Radio Map Feedback Algorithm