To evaluate the algorithm, we classify records and identify anomalous accesses using the AWID dataset [36]. AWID is a large dataset of captured network connections with class labels, and researchers use it to evaluate studies on wireless network intrusion.
AWID comes in several versions, comprising 8 main data packages divided into full sets and reduced sets.
Table 3.1: The AWID dataset [36]
Table 3.2: Classes of the AWID dataset [36]
Table 3.3: Proportion of records and classes in the dataset
In this thesis, the set … is used as the training set and the test set.
Each record in the dataset has 155 attributes:
Table 3.4: Attributes of a record
No. Description Type
1 Interface id Unsigned integer, 4 bytes
2 WTAP_ENCAP Signed integer, 2 bytes
3 Time shift for this packet Time offset
4 Epoch Time Time offset
5 Time delta from previous captured frame Time offset
6 Time delta from previous displayed frame Time offset
7 Time since reference or first frame Time offset
8 Frame length on the wire Unsigned integer, 4 bytes
9 Frame length stored into the capture file Unsigned integer, 4 bytes
10 Frame is marked Boolean
11 Frame is ignored Boolean
12 Header revision Unsigned integer, 1 byte
13 Header pad Unsigned integer, 1 byte
14 Header length Unsigned integer, 2 bytes
15 TSFT Boolean
16 Flags Boolean
17 Rate Boolean
18 Channel Boolean
19 FHSS Boolean
20 dBm Antenna Signal Boolean
21 dBm Antenna Noise Boolean
22 Lock Quality Boolean
23 TX Attenuation Boolean
24 dB TX Attenuation Boolean
25 dBm TX Power Boolean
26 Antenna Boolean
27 dB Antenna Signal Boolean
28 dB Antenna Noise Boolean
29 RX flags Boolean
30 Channel+ Boolean
31 MCS information Boolean
32 A-MPDU Status Boolean
33 VHT information Boolean
34 Reserved Unsigned integer, 4 bytes
35 Radiotap NS next Boolean
36 Vendor NS next Boolean
37 Ext Boolean
38 MAC timestamp Unsigned integer, 8 bytes
39 CFP Boolean
40 Preamble Boolean
41 WEP Boolean
42 Fragmentation Boolean
43 FCS at end Boolean
44 Data Pad Boolean
45 Bad FCS Boolean
46 Short GI Boolean
47 Data rate (Mb/s) Floating point (single-precision)
48 Channel frequency Unsigned integer, 4 bytes
49 Turbo Boolean
50 Complementary Code Keying (CCK) Boolean
51 Orthogonal Frequency-Division Multiplexing (OFDM) Boolean
52 2 GHz spectrum Boolean
53 5 GHz spectrum Boolean
54 Passive Boolean
55 Dynamic CCK-OFDM Boolean
56 Gaussian Frequency Shift Keying (GFSK) Boolean
57 GSM (900MHz) Boolean
58 Static Turbo Boolean
59 Half Rate Channel (10MHz Channel Width) Boolean
60 Quarter Rate Channel (5MHz Channel Width) Boolean
61 SSI Signal Signed integer, 4 bytes
62 Antenna Unsigned integer, 4 bytes
63 Bad PLCP Boolean
64 Type/Subtype Unsigned integer, 2 bytes
65 Version Unsigned integer, 1 byte
66 Type Unsigned integer, 1 byte
67 Subtype Unsigned integer, 1 byte
68 DS status Unsigned integer, 1 byte
69 More Fragments Boolean
70 Retry Boolean
71 PWR MGT Boolean
72 More Data Boolean
73 Protected flag Boolean
74 Order flag Boolean
75 Duration Unsigned integer, 2 bytes
76 Receiver address Ethernet or other MAC address
77 Destination address Ethernet or other MAC address
78 Transmitter address Ethernet or other MAC address
79 Source address Ethernet or other MAC address
80 BSS Id Ethernet or other MAC address
81 Fragment number Unsigned integer, 2 bytes
82 Sequence number Unsigned integer, 2 bytes
83 Block Ack Request Type Unsigned integer, 1 byte
84 BAR Ack Policy Boolean
85 Multi-TID Boolean
86 Compressed Bitmap Boolean
87 TID for which a BlockAck frame is requested Unsigned integer, 2 bytes
88 Block Ack Bitmap Sequence of bytes
89 Good Boolean
90 ESS capabilities Boolean
91 IBSS status Boolean
92 CFP participation capabilities Unsigned integer, 2 bytes
93 Privacy Boolean
94 Short Preamble Boolean
95 PBCC Boolean
96 Channel Agility Boolean
97 Spectrum Management Boolean
98 Short Slot Time Boolean
99 Automatic Power Save Delivery Boolean
100 Radio Measurement Boolean
101 DSSS-OFDM Boolean
102 Delayed Block Ack Boolean
103 Immediate Block Ack Boolean
104 Listen Interval Unsigned integer, 2 bytes
105 Current AP Ethernet or other MAC address
106 Status code Unsigned integer, 2 bytes
107 Timestamp Unsigned integer, 8 bytes
108 Beacon Interval Unsigned integer, 4 bytes
109 Association ID Unsigned integer, 2 bytes
110 Reason code Unsigned integer, 2 bytes
111 Authentication Algorithm Unsigned integer, 2 bytes
112 Authentication SEQ Unsigned integer, 2 bytes
113 Category code Unsigned integer, 2 bytes
114 HT Action Unsigned integer, 1 byte
115 Supported Channel Width Unsigned integer, 1 byte
116 GAS Query Response fragment Frame number
117 Starting Sequence Number Unsigned integer, 2 bytes
118 Tagged parameters Label
119 SSID Character string
120 Current Channel Unsigned integer, 1 byte
121 DTIM count Unsigned integer, 1 byte
122 DTIM period Unsigned integer, 1 byte
123 Multicast Boolean
124 Bitmap Offset Unsigned integer, 1 byte
125 Environment Unsigned integer, 1 byte
126 RSN Version Unsigned integer, 2 bytes
127 Group Cipher Suite type Unsigned integer, 1 byte
128 Pairwise Cipher Suite Count Unsigned integer, 2 bytes
129 Auth Key Management (AKM) Suite Count Unsigned integer, 2 bytes
130 Auth Key Management (AKM) type Unsigned integer, 1 byte
131 RSN Pre-Auth capabilities Boolean
132 RSN No Pairwise capabilities Boolean
133 RSN PTKSA Replay Counter capabilities Unsigned integer, 2 bytes
134 RSN GTKSA Replay Counter capabilities Unsigned integer, 2 bytes
135 Management Frame Protection Required Boolean
136 Management Frame Protection Capable Boolean
137 PeerKey Enabled Boolean
138 Transmit Power Signed integer, 1 byte
139 Link Margin Signed integer, 1 byte
140 Initialization Vector Unsigned integer, 3 bytes
141 Key Index Unsigned integer, 1 byte
142 WEP ICV Unsigned integer, 4 bytes
143 TKIP Ext. Initialization Vector Character string
144 CCMP Ext. Initialization Vector Character string
145 TID Unsigned integer, 2 bytes
146 Priority Unsigned integer, 2 bytes
147 EOSP Boolean
148 Ack Policy Unsigned integer, 2 bytes
149 Payload Type Boolean
150 #N/A #N/A
151 QoS bit 4 Boolean
152 TXOP Duration Requested Unsigned integer, 2 bytes
153 Buffer State Indicated Boolean
154 data.len
155 class Character string
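The type column of Table 3.4 suggests how each raw field could be decoded before classification. The following is a minimal sketch under stated assumptions, not the preprocessing code used in this thesis: the function name `convert_field` and the keyword matching on the type description are illustrative, and `?` is assumed to mark a missing value in the raw data.

```python
from typing import Union

Value = Union[bool, int, float, str, None]


def convert_field(raw: str, type_desc: str) -> Value:
    """Decode one raw CSV field using its type description from Table 3.4.

    Hypothetical helper: the keyword matching below is an assumption
    about how the type descriptions could be interpreted.
    """
    raw = raw.strip()
    if raw == "?":                       # assumed marker for a missing value
        return None
    kind = type_desc.lower()
    if kind.startswith("boolean"):
        return raw == "1"
    if "integer" in kind:
        # Some integer fields are written in hexadecimal, e.g. "0x28".
        if raw.lower().startswith("0x"):
            return int(raw, 16)
        return int(raw)
    if kind.startswith("floating") or kind.startswith("time"):
        return float(raw)
    # MAC addresses, character strings, byte sequences, and labels
    # are kept as text.
    return raw
```

For example, `convert_field("0x28", "Unsigned integer, 2 bytes")` yields the integer 40, while a `?` in any column yields `None` regardless of the declared type.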
An example of a record in the raw dataset:
0,?,0.000000000,1393668189.035614000,0.000000000,0.000000000,0.000000000,1552,1552,0,0,0,0,26,1,1,1,1,0,1,0,0,0,0,0,1,0,0,1,0,0,0,0,0x00000000,0,0,0,8987920748,0,0,0,0,1,0,0,0,54,2437,0,0,1,1,0,0,0,0,0,0,0,0,25,1,0,0x28,0,2,8,0x02,0,0,0,0,1,0,44,c0:18:85:94:b6:55,c0:18:85:94:b6:55,28:c6:8e:86:d3:d6,00:13:33:87:62:6d,28:c6:8e:86:d3:d6,0,2313,?,?,?,?,?,?,1,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,0xf6ddd9,0,0x0e5162ff,?,?,1,1,0,0x0000,0,?,?,?,0,1488,normal
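A raw record like the one above can be split into its comma-separated fields, with the last field taken as the class label. The sketch below is illustrative only: the helper names `parse_record` and `is_anomalous`, the shortened sample line, and the treatment of `?` as a missing value are assumptions, not part of the AWID tooling. It also derives the binary normal/anomalous decision used in the evaluation by treating every class other than `normal` as anomalous.

```python
from typing import List, Optional, Tuple


def parse_record(
    line: str, expected_fields: Optional[int] = None
) -> Tuple[List[Optional[str]], str]:
    """Split one raw record into (features, class label).

    Hypothetical helper: "?" is assumed to mean "value not present".
    """
    fields = [f.strip() for f in line.strip().split(",")]
    if expected_fields is not None and len(fields) != expected_fields:
        raise ValueError(
            f"expected {expected_fields} fields, got {len(fields)}"
        )
    label = fields[-1]                   # the last attribute is the class
    features = [None if f == "?" else f for f in fields[:-1]]
    return features, label


def is_anomalous(label: str) -> bool:
    """Binary decision: anything that is not 'normal' counts as anomalous."""
    return label != "normal"


# Shortened, made-up sample for illustration; a real record has 155 fields.
sample = "0,?,0.000000000,1552,0x28,c0:18:85:94:b6:55,?,1488,normal"
features, label = parse_record(sample)
missing = sum(value is None for value in features)
print(len(features), missing, label, is_anomalous(label))
```

When reading the real files, passing `expected_fields=155` makes the function reject lines that were truncated or wrapped during extraction.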