An Introduction to AI Deep Learning with Examples
陳瑞樂
2017/11/28
• Deep Learning and Machine Learning
• CNN
• RNN
• Opportunity
Machine Learning
≈ Looking for a (kernel) Function
• Image Recognition: f(image) = "cat"
• Speech Recognition: f(audio) = "How are you"
• Playing Go: f(board position) = "5-5" (next move)
• Dialogue System: f("Hello", what the user said) = "Hi" (system response)
Framework
Image Recognition: f(image) = "cat"
Model: a set of functions f1, f2, ……
For example, f1 maps one image to "cat" and another to "dog", while f2 maps the same images to "money" and "snake".
Framework
Image Recognition: f(image) = "cat"
Model: a set of functions f1, f2, ……
Training Data: pairs of function input (an image) and function output, i.e. a tag/label such as "monkey", "cat" or "dog". This is Supervised Learning.
Goodness of function f: measured on the Training Data, one function can be better than another.

Framework
Step 1: the Model, a set of functions f1, f2, ……
Step 2: the goodness of function f, evaluated on the Training Data ("monkey", "cat", "dog").
Step 3: pick the "best" function f*.
Steps 1 to 3 are Training; Testing means using f* on a new input, e.g. f*(image) = "cat".
Three Steps for Deep Learning
Step 1: define a set of functions
Step 2: goodness of function
Step 3: pick the best function
Deep Learning is so simple ……
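The same three steps map directly onto Keras (the toolkit used later in these slides). The following minimal sketch is only an illustration: the layer sizes, input dimension and random placeholder data are assumptions, not values from the slides.

# A minimal Keras sketch of the three steps (all sizes and data are placeholders).
import numpy as np
from keras.models import Sequential
from keras.layers import Dense

# Step 1: define a set of functions (the model architecture).
model = Sequential()
model.add(Dense(64, activation='relu', input_dim=100))
model.add(Dense(10, activation='softmax'))

# Step 2: define the goodness of a function (the loss to be minimized).
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])

# Step 3: pick the best function by fitting the model to labeled training data.
x_train = np.random.rand(32, 100)                       # placeholder inputs
y_train = np.eye(10)[np.random.randint(0, 10, 32)]      # placeholder one-hot labels
model.fit(x_train, y_train, epochs=2, batch_size=8)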
Variants of Neural Networks
Convolutional Neural Network (CNN): widely used in image processing
Recurrent Neural Network (RNN)

Convolutional Neural Network (CNN)
Source: https://www.youtube.com/watch?v=FrKWiRv254g&t=938s
Why CNN for Image?
• When processing an image, the first layer of a fully connected network would be very large: a 100 x 100 x 3 image already has 30,000 input dimensions, so with 1,000 neurons in the first hidden layer (followed by more layers and a softmax output) that single layer alone needs about 3 x 10^7 weights.
• Can the fully connected network be simplified by considering the properties of image recognition?
Why CNN for Image?
• Some patterns are much smaller than the whole image.
A neuron does not have to see the whole image to discover the pattern; a "beak" detector, for instance, only needs to look at a small region.
Connecting to a small region requires fewer parameters.
Why CNN for Image?
• The same patterns appear in different regions.
An "upper-left beak" detector and a "middle beak" detector do almost the same thing, so they can use the same set of parameters.
Why CNN for Image?
• Subsampling the pixels will not change the object: a subsampled bird is still a bird.
We can subsample the pixels to make the image smaller, so the network needs fewer parameters to process it.
The whole CNN
image → Convolution → Max Pooling → Convolution → Max Pooling (this pair can repeat many times) → Flatten → Fully Connected Feedforward network → "cat", "dog", ……
The whole CNN
• Property 1: some patterns are much smaller than the whole image → Convolution
• Property 2: the same patterns appear in different regions → Convolution
• Property 3: subsampling the pixels will not change the object → Max Pooling
The Convolution and Max Pooling pair can repeat many times before Flatten.
CNN – Convolution
6 x 6 image:
1 0 0 0 0 1
0 1 0 0 1 0
0 0 1 1 0 0
1 0 0 0 1 0
0 1 0 0 1 0
0 0 1 0 1 0

Filter 1 (3 x 3 matrix):    Filter 2 (3 x 3 matrix):    ……
 1 -1 -1                    -1  1 -1
-1  1 -1                    -1  1 -1
-1 -1  1                    -1  1 -1

Those are the network parameters to be learned.
Each filter detects a small pattern (3 x 3). (Property 1)
CNN – Convolution (stride = 1)
Placing Filter 1 on the top-left 3 x 3 patch of the 6 x 6 image gives 3; shifting the filter one pixel to the right (stride = 1) gives -1.
CNN – Convolution (if stride = 2)
With stride = 2 the filter jumps two pixels at a time, so the first two outputs are 3 and -3. We set stride = 1 below.
CNN – Convolution
Sliding Filter 1 over the whole 6 x 6 image with stride = 1 gives a 4 x 4 output:
 3 -1 -3 -1
-3  1  0 -3
-3 -3  0  1
 3 -2 -2 -1
The value 3 appears in both the top-left and the bottom-left corner because the same pattern occurs in different regions of the image. (Property 2)
CNN – Convolution
Doing the same process for Filter 2 (stride = 1) gives another 4 x 4 output:
-1 -1 -1 -1
-1 -1 -2  1
-1 -1 -2  1
-1  0 -4  3
Repeating this for every filter, the stacked 4 x 4 outputs form the Feature Map: a 4 x 4 image with one channel per filter.
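To make the sliding-filter computation concrete, here is a small NumPy sketch that reproduces the 4 x 4 output of Filter 1 above; the convolve helper is written out by hand for illustration, it is not a library function.

import numpy as np

# The 6 x 6 image and Filter 1 from the slides.
image = np.array([[1,0,0,0,0,1],
                  [0,1,0,0,1,0],
                  [0,0,1,1,0,0],
                  [1,0,0,0,1,0],
                  [0,1,0,0,1,0],
                  [0,0,1,0,1,0]])
filter1 = np.array([[ 1,-1,-1],
                    [-1, 1,-1],
                    [-1,-1, 1]])

def convolve(img, flt, stride=1):
    """Slide the filter over the image; at each position take the element-wise product and sum."""
    k = flt.shape[0]
    out_size = (img.shape[0] - k) // stride + 1
    out = np.zeros((out_size, out_size), dtype=int)
    for i in range(out_size):
        for j in range(out_size):
            patch = img[i*stride:i*stride+k, j*stride:j*stride+k]
            out[i, j] = np.sum(patch * flt)
    return out

print(convolve(image, filter1, stride=1))
# [[ 3 -1 -3 -1]
#  [-3  1  0 -3]
#  [-3 -3  0  1]
#  [ 3 -2 -2 -1]]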
CNN – Colorful image
A colorful image has three channels (R, G, B), so the input is three 6 x 6 matrices stacked together. Each filter then becomes a 3 x 3 x 3 cube: Filter 1 and Filter 2 each have one 3 x 3 slice per channel, and the convolution multiplies and sums over the whole cube at every position.
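As a small illustration of how a filter "becomes a cube" for a 3-channel image, the sketch below applies one 3 x 3 x 3 filter to one position of a random 6 x 6 x 3 image; all values here are placeholders.

import numpy as np

rng = np.random.RandomState(0)
image_rgb = rng.randint(0, 2, size=(6, 6, 3))     # 6 x 6 image with 3 colour channels
filter_rgb = rng.randint(-1, 2, size=(3, 3, 3))   # one filter: a 3 x 3 slice per channel

# One output value per position: multiply element-wise over the whole cube and sum,
# so the three channels are combined into a single number.
top_left = np.sum(image_rgb[0:3, 0:3, :] * filter_rgb)
print(top_left)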
Convolution v.s. Fully Connected
Flatten the 6 x 6 image into a 36-dimensional input vector x1, x2, …, x36. A fully connected neuron would connect to all 36 inputs; the convolution output 3 (Filter 1 placed at the top-left) connects to only 9 of them (pixels 1, 2, 3, 7, 8, 9, 13, 14 and 15), so it is not fully connected: fewer parameters!
The next convolution output, -1, connects to pixels 2, 3, 4, 8, 9, 10, 14, 15 and 16 and reuses exactly the same 9 weights of Filter 1 (shared weights), so the whole 4 x 4 output of Filter 1 needs only 9 parameters in total: even fewer parameters!
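The counting behind "fewer parameters" and "even fewer parameters" in this 6 x 6 example can be spelled out as follows (bias terms ignored for simplicity).

image_pixels = 6 * 6          # 36 inputs once the image is flattened
fc_per_neuron = image_pixels  # a fully connected neuron needs 36 weights
conv_per_output = 3 * 3       # a convolution output looks at a 3 x 3 patch: 9 weights
conv_whole_map = 3 * 3        # all 16 outputs of Filter 1 share those same 9 weights
print(fc_per_neuron, conv_per_output, conv_whole_map)   # 36 9 9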
CNN – Max Pooling
Start from the two 4 x 4 outputs produced by Filter 1 and Filter 2 in the convolution step.
Group each 4 x 4 output into 2 x 2 blocks and keep only the maximum of each block:
Filter 1:   3  0        Filter 2:  -1  1
            3  1                    0  3
Each filter becomes one channel of a new, smaller 2 x 2 image (Conv followed by Max Pooling).
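A small NumPy sketch of the 2 x 2 max-pooling step applied to Filter 1's output; the max_pool_2x2 helper is only for illustration.

import numpy as np

# The 4 x 4 output of Filter 1 from the convolution step.
feature_map1 = np.array([[ 3, -1, -3, -1],
                         [-3,  1,  0, -3],
                         [-3, -3,  0,  1],
                         [ 3, -2, -2, -1]])

def max_pool_2x2(fmap):
    """Keep the maximum of each non-overlapping 2 x 2 block."""
    h, w = fmap.shape
    return np.array([[fmap[i:i+2, j:j+2].max() for j in range(0, w, 2)]
                     for i in range(0, h, 2)])

print(max_pool_2x2(feature_map1))
# [[3 0]
#  [3 1]]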
The whole CNN
Each Convolution plus Max Pooling block turns its input into a new image that is smaller than the original and whose number of channels equals the number of filters (here a 2 x 2 image with two channels). This block can repeat many times.
The whole CNN
After the last Convolution and Max Pooling block, the resulting new image is flattened and fed into the Fully Connected Feedforward network, which produces the final outputs such as "cat" and "dog".
Flatten
The final 2 x 2 x 2 output is flattened into a vector of its 8 values (3, 0, 3, 1 from Filter 1 and -1, 1, 0, 3 from Filter 2), and this vector is the input to the Fully Connected Feedforward network.
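Putting the pieces together, here is a minimal Keras sketch of the Convolution → Max Pooling → Flatten → Fully Connected pipeline described above; the input size, filter counts and number of output classes are placeholders, not values taken from the slides.

from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, Flatten, Dense

model = Sequential()
model.add(Conv2D(25, (3, 3), activation='relu', input_shape=(28, 28, 1)))  # Convolution
model.add(MaxPooling2D((2, 2)))                                            # Max Pooling
model.add(Conv2D(50, (3, 3), activation='relu'))                           # can repeat many times
model.add(MaxPooling2D((2, 2)))
model.add(Flatten())                                                       # Flatten
model.add(Dense(100, activation='relu'))                                   # Fully Connected Feedforward
model.add(Dense(2, activation='softmax'))                                  # e.g. "cat" / "dog"
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
model.summary()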
More Application: Playing Go
The input is the 19 x 19 board encoded as a 19 x 19 vector (black: 1, white: -1, none: 0), or equivalently a 19 x 19 matrix treated as an image. The network outputs the next move, one of the 19 x 19 positions.
A fully-connected feedforward network can be used, but a CNN performs much better.
More Application: Playing Go
Training data: records of previous plays, e.g. Black: 5-5 (五之五), White: tengen (天元), Black: 5-5 (五之5), ……
For each recorded move the CNN takes the board before the move as input, and the target is the position actually played: target "天元" = 1 and all other positions = 0, or target "五之5" = 1 and all other positions = 0.
Why CNN for playing Go?
• Some patterns are much smaller than the whole image.
• The same patterns appear in different regions.
AlphaGo uses 5 x 5 filters for its first layer.
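A sketch of how a board position could be encoded and fed to a CNN with 5 x 5 first-layer filters as mentioned above; the filter count and the rest of the network are assumptions for illustration only.

import numpy as np
from keras.models import Sequential
from keras.layers import Conv2D, Flatten, Dense

# Encode the 19 x 19 board: black = 1, white = -1, none = 0.
board = np.zeros((19, 19, 1))
board[3, 3, 0] = 1       # a black stone
board[9, 9, 0] = -1      # a white stone

model = Sequential()
model.add(Conv2D(32, (5, 5), padding='same', activation='relu', input_shape=(19, 19, 1)))
model.add(Flatten())
model.add(Dense(19 * 19, activation='softmax'))   # one probability per possible next move
model.compile(loss='categorical_crossentropy', optimizer='adam')
print(model.predict(board.reshape(1, 19, 19, 1)).shape)   # (1, 361)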
Variants of Neural Networks
Convolutional Neural Network (CNN)
Recurrent Neural Network (RNN): a neural network with memory
Recurrent Neural Network: the LSTM model, well suited to time-series forecasting.
Long Short-Term Memory (LSTM)
RNN Tutorial
• Air pollution forecasting
  – Prepare the source data
  – Prepare the base data
  – Build a multivariate LSTM forecasting model
  – References:
    • Multivariate Time Series Forecasting with LSTMs in Keras
    • 基於Keras的LSTM多變數時間序列預測 (an LSTM multivariate time-series forecasting tutorial in Keras)
Preparing the source data
Air Quality dataset
• We use the Beijing air quality dataset.
  – It was recorded at the US Embassy in Beijing over five years and reports weather and pollution values hourly.
  – The data include the date, the PM2.5 concentration, and weather information such as dew point, temperature, pressure, wind direction, wind speed and hours of precipitation.
• The full feature list of the raw data:
  • No: row number
  • year: year
  • month: month
  • day: day
  • hour: hour
  • pm2.5: PM2.5 concentration
  • DEWP: dew point temperature
  • TEMP: temperature
  • PRES: pressure
  • cbwd: combined wind direction
  • Iws: cumulative wind speed
  • Is: cumulative hours of snow
  • Ir: cumulative hours of rain
Preparing the base data
The raw data cannot be used as-is; we must process it first. Below are the first few rows of the raw dataset.
1. Consolidate the scattered date-time fields into a single datetime that we can use as the index.
2. There are also a few scattered "NA" values in the dataset; for now we mark them with 0.
Create 7 subplots showing five years of data for each variable.
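A pandas sketch of these preparation steps, assuming the raw file has been saved as raw.csv with the column names listed earlier (the file name itself is an assumption).

import pandas as pd
import matplotlib.pyplot as plt

# 1. Consolidate year/month/day/hour into a single datetime and use it as the index.
df = pd.read_csv('raw.csv')
df.index = pd.to_datetime(df[['year', 'month', 'day', 'hour']])
df = df.drop(['No', 'year', 'month', 'day', 'hour'], axis=1)

# 2. Mark the scattered "NA" values in the pm2.5 column with 0 for now.
df['pm2.5'] = df['pm2.5'].fillna(0)

# 3. Seven subplots, one per numeric variable, over the five years of data.
df[['pm2.5', 'DEWP', 'TEMP', 'PRES', 'Iws', 'Is', 'Ir']].plot(subplots=True, figsize=(10, 12))
plt.show()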
LSTM data preparation
• We frame the supervised learning problem as:
  – predicting the pollution at the current hour (t) from the pollution level and weather conditions of the previous hour.
• Further exploration and applications:
  – predict the next hour's pollution from the previous day's weather and pollution;
  – predict the next hour's pollution from the previous day's weather and pollution plus the "expected" weather conditions for the next hour.
1. Normalize the data.
2. Use the next record's pollution value as the prediction target (label) of the current record.
Plotting the RMSE (root-mean-square error) during training can help make the problem clearer; a sketch of these steps follows below.
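A Keras sketch of the framing above: normalize, use the next hour's pollution as the label, fit a small LSTM and report the RMSE. The random placeholder array stands in for the prepared dataset, and the layer sizes and training settings are assumptions, not the author's exact model.

import numpy as np
from sklearn.preprocessing import MinMaxScaler
from sklearn.metrics import mean_squared_error
from keras.models import Sequential
from keras.layers import LSTM, Dense

values = np.random.rand(2000, 8).astype('float32')   # placeholder for the prepared dataset
                                                     # (column 0 = pollution, others = weather)

# 1. Normalize all features to [0, 1].
scaled = MinMaxScaler().fit_transform(values)

# 2. Input = all features at hour t-1, label = pollution at hour t.
X, y = scaled[:-1], scaled[1:, 0]
X = X.reshape((X.shape[0], 1, X.shape[1]))           # LSTM expects (samples, timesteps, features)

# Split into training and test sets by time.
n_train = int(len(X) * 0.8)
train_X, train_y = X[:n_train], y[:n_train]
test_X, test_y = X[n_train:], y[n_train:]

# 3. A small LSTM followed by a single output value.
model = Sequential()
model.add(LSTM(50, input_shape=(train_X.shape[1], train_X.shape[2])))
model.add(Dense(1))
model.compile(loss='mae', optimizer='adam')
model.fit(train_X, train_y, epochs=10, batch_size=72,
          validation_data=(test_X, test_y), verbose=2)

# 4. RMSE on the test set (still on the normalized scale here).
pred = model.predict(test_X)[:, 0]
rmse = np.sqrt(mean_squared_error(test_y, pred))
print('Test RMSE: %.3f' % rmse)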
Opportunity
• The learning barrier is high: the theory is deep @@
  – neural networks, machine learning, classification / prediction / recognition, ….
• Finding problems @@
  – What problems can AI tools (TensorFlow + Keras) be brought in to solve?
• How to obtain the data source for the problem (key point!!)
  – Air pollution forecasting: you need air pollution measurements; where do you get them?
  – Face recognition: you need face photos with tags; how do you produce them?
My Target
Predict the stock index one day ahead: RMSE = 100 (train set: 2013/01/08~2014/8/3, test set: 2014/8/27~2017/7/23)
Predict the stock index two days ahead: RMSE = 110 (train set: 2013/01/08~2014/8/3, test set: 2014/8/27~2017/7/23)
Predict the stock index three days ahead: RMSE = 117 (train set: 2013/01/08~2014/8/3, test set: 2014/8/27~2017/7/23)