Estimating Probability Density Functions with EM
Sources:
- Forsyth & Ponce, Chap. 7
- Stanford Vision & Modeling
Probability Density Estimation
• Parametric Representations
• Non-Parametric Representations
• Mixture Models
Non-Parametric Estimation Methods
• No assumptions are made about the distribution
• The estimate depends entirely on the DATA
• A simple way to do this: the histogram
Histograms
• Computationally expensive, but very widely used
• Can be applied to densities of arbitrary shape
Histograms
Problems:
• Higher-dimensional spaces:
- the number of bins grows exponentially
- the amount of training data needed grows exponentially
- Curse of Dimensionality
• Bin size? Too few bins: too coarse
Too many bins: too fine
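The bin-size trade-off and the exponential growth in bins can be illustrated with a minimal sketch. The helper names `histogram_density` and `total_bins` and the toy samples are assumptions made for illustration, not part of the original slides.

```python
import numpy as np

def histogram_density(samples, num_bins, value_range):
    """1-D histogram density estimate: count per bin / (N * bin width)."""
    samples = np.asarray(samples, dtype=float)
    counts, edges = np.histogram(samples, bins=num_bins, range=value_range)
    widths = np.diff(edges)
    density = counts / (samples.size * widths)
    return edges, density

def total_bins(bins_per_axis, num_dims):
    """Number of bins of a regular grid: exponential in the dimension."""
    return bins_per_axis ** num_dims

# Toy data (hypothetical), all inside the range [0, 1]
samples = [0.05, 0.1, 0.15, 0.4, 0.45, 0.8]
_, coarse = histogram_density(samples, 2, (0.0, 1.0))   # few bins: coarse
_, fine = histogram_density(samples, 20, (0.0, 1.0))    # many bins: mostly empty
```

With only six samples, the 20-bin estimate leaves most bins empty, while the 2-bin estimate hides all detail. `total_bins(10, 3)` already requires 1000 bins for a 3-dimensional grid.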
Approach in principle:
• x is drawn from an 'unknown' p(x)
• The probability that x lies in a region R (with volume V) is:
$$P = \int_R p(x')\,dx' \approx p(x)\,V$$
• If K of the N samples fall inside R:
$$P = \frac{K}{N}$$
$$\Rightarrow\ p(x) \approx \frac{K}{NV}$$
• Fix V and determine K: kernel-based methods
• Fix K and determine V: K-nearest-neighbor
Kernel-Based Methods:
$$p(x) \approx \frac{K}{NV}$$
Parzen Window (a hypercube with side length h, so V = h^d):
$$H(u) = \begin{cases} 1 & |u_j| < 1/2,\quad j = 1, \dots, d \\ 0 & \text{otherwise} \end{cases}$$
$$K = \sum_{n=1}^{N} H\!\left(\frac{x - x_n}{h}\right)$$
$$p(x) = \frac{1}{N h^d} \sum_{n=1}^{N} H\!\left(\frac{x - x_n}{h}\right)$$
Gaussian Window:
$$p(x) = \frac{1}{N} \sum_{n=1}^{N} \frac{1}{(2\pi h^2)^{d/2}} \exp\!\left(-\frac{\|x - x_n\|^2}{2h^2}\right)$$
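The two kernel estimators can be sketched in a few lines of NumPy. This is a minimal sketch, assuming the data is stored as an array of shape (N, d) and that the Parzen window is evaluated at (x - x_n)/h; the function names are hypothetical.

```python
import numpy as np

def parzen_density(x, data, h):
    """Hypercube Parzen window: p(x) = K / (N h^d)."""
    data = np.asarray(data, dtype=float).reshape(len(data), -1)  # (N, d)
    n, d = data.shape
    u = (np.asarray(x, dtype=float).reshape(1, -1) - data) / h
    inside = np.all(np.abs(u) < 0.5, axis=1)  # H(u) = 1 inside the unit cube
    k = np.count_nonzero(inside)              # K = number of samples in the cube
    return k / (n * h ** d)

def gaussian_density(x, data, h):
    """Gaussian window: average of isotropic Gaussians centred on the samples."""
    data = np.asarray(data, dtype=float).reshape(len(data), -1)  # (N, d)
    n, d = data.shape
    sq_dist = np.sum((np.asarray(x, dtype=float).reshape(1, -1) - data) ** 2, axis=1)
    norm = (2.0 * np.pi * h ** 2) ** (d / 2.0)
    return float(np.mean(np.exp(-sq_dist / (2.0 * h ** 2)) / norm))

# Toy 1-D data (hypothetical values)
data = [[0.0], [0.2], [1.0], [3.0]]
p_cube = parzen_density(0.0, data, h=1.0)
p_gauss = gaussian_density(0.0, data, h=1.0)
```

The hypercube estimate is piecewise constant, whereas the Gaussian window gives a smooth estimate; in both cases h plays the role of the bin size.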
K-nearest-neighbor:
$$p(x) \approx \frac{K}{NV}$$
K is fixed, and V is the volume of the smallest region around x that contains K samples.
Bayesian classification (K_k of the K neighbours and N_k of the N samples belong to class C_k):
$$p(x \mid C_k) = \frac{K_k}{N_k V}, \qquad p(x) = \frac{K}{NV}, \qquad p(C_k) = \frac{N_k}{N}$$
$$\Rightarrow\ p(C_k \mid x) = \frac{K_k}{K}$$
"the k-nearest-neighbour classification rule"
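The posterior K_k / K translates almost directly into code. The following is a minimal sketch, assuming Euclidean distance and integer class labels starting at 0; `knn_posteriors` and `knn_classify` are hypothetical names.

```python
import numpy as np

def knn_posteriors(x, data, labels, k, num_classes):
    """Estimate p(C_k | x) = K_k / K from the K nearest training samples."""
    data = np.asarray(data, dtype=float).reshape(len(data), -1)  # (N, d)
    labels = np.asarray(labels)
    dist = np.linalg.norm(data - np.asarray(x, dtype=float).reshape(1, -1), axis=1)
    nearest = np.argsort(dist)[:k]                                # indices of the K neighbours
    counts = np.bincount(labels[nearest], minlength=num_classes)  # K_k per class
    return counts / k

def knn_classify(x, data, labels, k, num_classes):
    """Pick the class with the largest estimated posterior."""
    return int(np.argmax(knn_posteriors(x, data, labels, k, num_classes)))

# Toy 1-D training set (hypothetical values)
train_x = [[0.0], [0.1], [0.2], [1.0], [1.1]]
train_y = [0, 0, 0, 1, 1]
post = knn_posteriors(0.9, train_x, train_y, k=3, num_classes=2)
predicted = knn_classify(0.9, train_x, train_y, k=3, num_classes=2)
```

Note that the volume V cancels in the posterior, so the classifier only needs the neighbour counts.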
Probability Density Estimation
• Parametric Representations
• Non-Parametric Representations
• Mixture Models
Mixture Models:
Gaussians:
- Easy
- Low memory
- Fast
- Good properties
Non-Parametric:
- General
- Memory intensive
- Slow
Mixture Models
Mixture of Gaussians:
[Figure: density p(x) plotted over x]
• A sum of individual Gaussians
• Advantage: can approximate densities of arbitrary shape
Generative Model:
[Figure: generative model with variable z]
$$p(x) = \sum_{j=1}^{M} p(x \mid j)\,P(j)$$
$$p(x \mid j) = \frac{1}{(2\pi\sigma_j^2)^{d/2}} \exp\!\left(-\frac{\|x - \mu_j\|^2}{2\sigma_j^2}\right)$$
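To make the mixture density and its generative reading concrete, here is a minimal sketch, assuming isotropic components with scalar sigma_j. The function names and the two-component toy parameters are illustrative assumptions.

```python
import numpy as np

def component_density(x, mu, sigma):
    """Isotropic Gaussian p(x | j) with mean mu and standard deviation sigma."""
    mu = np.asarray(mu, dtype=float).reshape(1, -1)
    d = mu.shape[1]
    x = np.asarray(x, dtype=float).reshape(-1, d)          # (N, d)
    sq_dist = np.sum((x - mu) ** 2, axis=1)
    norm = (2.0 * np.pi * sigma ** 2) ** (d / 2.0)
    return np.exp(-sq_dist / (2.0 * sigma ** 2)) / norm

def mixture_density(x, priors, mus, sigmas):
    """p(x) = sum_j p(x | j) P(j)."""
    return sum(p_j * component_density(x, mu_j, s_j)
               for p_j, mu_j, s_j in zip(priors, mus, sigmas))

def sample_mixture(rng, num_samples, priors, mus, sigmas):
    """Generative view: draw a component j ~ P(j), then x ~ p(x | j)."""
    mus = np.asarray(mus, dtype=float).reshape(len(priors), -1)
    sigmas = np.asarray(sigmas, dtype=float)
    j = rng.choice(len(priors), size=num_samples, p=priors)
    noise = rng.standard_normal((num_samples, mus.shape[1]))
    return mus[j] + sigmas[j, None] * noise, j

# Hypothetical two-component mixture in one dimension
priors, mus, sigmas = [0.5, 0.5], [[0.0], [10.0]], [1.0, 1.0]
p = mixture_density([[0.0], [10.0]], priors, mus, sigmas)
samples, comp = sample_mixture(np.random.default_rng(0), 50, priors, mus, sigmas)
```

The component index drawn in `sample_mixture` is never observed in practice, which is what makes fitting the mixture harder than fitting a single Gaussian.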
Mixture of Gaussians:
Maximum Likelihood:
$$E = -\ln L = -\sum_{n=1}^{N} \ln p(x_n)$$
[Figure: E plotted as a function of \mu_j]
$$\frac{\partial E}{\partial \mu_j} = 0 \ \Rightarrow\ \mu_j = \frac{\sum_{n=1}^{N} P(j \mid x_n)\,x_n}{\sum_{n=1}^{N} P(j \mid x_n)}$$
with the posterior
$$P(j \mid x_n) = \frac{p(x_n \mid j)\,P(j)}{\sum_{k=1}^{M} p(x_n \mid k)\,P(k)}$$
and the component density
$$p(x_n \mid j) = \frac{1}{(2\pi\sigma_j^2)^{d/2}} \exp\!\left(-\frac{\|x_n - \mu_j\|^2}{2\sigma_j^2}\right)$$
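The error function, the posterior, and the weighted-mean condition can be evaluated numerically as follows. This is a minimal sketch under the same isotropic-Gaussian assumption; the function names and the toy values are hypothetical.

```python
import numpy as np

def component_density(x, mu, sigma):
    """Isotropic Gaussian p(x_n | j)."""
    mu = np.asarray(mu, dtype=float).reshape(1, -1)
    d = mu.shape[1]
    x = np.asarray(x, dtype=float).reshape(-1, d)
    sq_dist = np.sum((x - mu) ** 2, axis=1)
    return np.exp(-sq_dist / (2.0 * sigma ** 2)) / (2.0 * np.pi * sigma ** 2) ** (d / 2.0)

def joint(x, priors, mus, sigmas):
    """Matrix of p(x_n | j) P(j), shape (N, M)."""
    return np.stack([p_j * component_density(x, mu_j, s_j)
                     for p_j, mu_j, s_j in zip(priors, mus, sigmas)], axis=1)

def posteriors(x, priors, mus, sigmas):
    """P(j | x_n): each row of the joint matrix normalised to sum to 1."""
    pj = joint(x, priors, mus, sigmas)
    return pj / pj.sum(axis=1, keepdims=True)

def negative_log_likelihood(x, priors, mus, sigmas):
    """E = -sum_n ln p(x_n)."""
    return float(-np.sum(np.log(joint(x, priors, mus, sigmas).sum(axis=1))))

def weighted_means(x, post):
    """mu_j = sum_n P(j | x_n) x_n / sum_n P(j | x_n)."""
    x = np.asarray(x, dtype=float).reshape(post.shape[0], -1)
    return (post.T @ x) / post.sum(axis=0)[:, None]

# Hypothetical toy problem, symmetric around 0
x = [[-1.0], [0.0], [1.0]]
post = posteriors(x, [0.5, 0.5], [[-1.0], [1.0]], [1.0, 1.0])
E = negative_log_likelihood(x, [0.5, 0.5], [[-1.0], [1.0]], [1.0, 1.0])
new_mus = weighted_means(x, post)
```

Note that `weighted_means` needs the posteriors, and the posteriors need the means: the condition is an implicit equation in the means rather than an explicit solution.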
Mixture of Gaussians:
Maximum Likelihood:
$$E = -\ln L = -\sum_{n=1}^{N} \ln p(x_n)$$
$$\frac{\partial E}{\partial \mu_j} = 0$$
[Figure: E plotted as a function of \mu_j]
No closed-form solution!
Gradient Descent:
$$\frac{\partial E}{\partial \mu_j} = f(\mu_1, \dots, \mu_M, \sigma_1, \dots, \sigma_M, \alpha_1, \dots, \alpha_M)$$
Mixture of Gaussians:
Optimization by gradient descent:
• Complex gradient function
(highly nonlinear, coupled equations)
• Optimizing one Gaussian depends on all the other components of the mixture.
Mixture of Gaussians:
-> With a different strategy:
• Observed data: x
• Hidden variable: y, taking the value 1 or 2
[Figure: observed data points on the x axis under p(x), with the hidden variable y shown for each point]
A popular example: the Chicken and Egg Problem:
[Figure: p(x) over x, with each data point labeled y = 1 or y = 2]
• Suppose we know the labels y: maximum likelihood for Gaussian #1 and maximum likelihood for Gaussian #2
• Suppose we know P(y=1|x) and P(y=2|x)
• But the labels y are exactly what we do not know at all!
Clustering:
• Guess the labels y: is the guess correct?
• K-means clustering / Basic Isodata
Clustering:
Procedure: Basic Isodata
1. Choose some initial values for the means $\mu_1, \dots, \mu_M$
Loop: 2. Classify the n samples by assigning them to the class of the closest mean.
3. Recompute the means as the average of the samples in their class.
4. If any mean changed value, go to Loop; otherwise, stop.
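The four steps of the procedure can be sketched as below. This is a minimal sketch, assuming Euclidean distance and data of shape (N, d). Keeping the old mean for an empty class and capping the number of iterations are assumptions that the procedure itself does not specify.

```python
import numpy as np

def basic_isodata(data, init_means, max_iter=100):
    """K-means style clustering following steps 1-4 of Basic Isodata."""
    data = np.asarray(data, dtype=float)          # (N, d)
    means = np.array(init_means, dtype=float)     # step 1: initial means, (M, d)
    assign = np.zeros(len(data), dtype=int)
    for _ in range(max_iter):
        # Step 2: assign each sample to the class of the closest mean
        dist = np.linalg.norm(data[:, None, :] - means[None, :, :], axis=2)
        assign = np.argmin(dist, axis=1)
        # Step 3: recompute each mean as the average of its samples
        new_means = means.copy()
        for j in range(len(means)):
            members = data[assign == j]
            if len(members) > 0:                  # assumption: empty class keeps its mean
                new_means[j] = members.mean(axis=0)
        # Step 4: stop once no mean changed
        if np.allclose(new_means, means):
            break
        means = new_means
    return means, assign

# Toy 1-D data with two well-separated groups (hypothetical values)
points = [[0.0], [0.2], [0.4], [10.0], [10.2], [10.4]]
final_means, labels = basic_isodata(points, init_means=[[0.0], [1.0]])
```

Each sample is assigned to exactly one class, so overlapping groups are split by a hard boundary halfway between the means.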
Isodata: Initialization
[Figure: initial means \mu_1 and \mu_2]
Isodata: Convergence
[Figure: converged means \mu_1 and \mu_2]
Isodata: Some problems
[Figure: guessed eggs / computed chicken; p(x) over x with the maximum likelihood fits for Gaussian #1 and Gaussian #2]
A good Gaussian approximation
[Figure: p(x) over x]
• But not optimal!
• Problem: highly overlapping Gaussians
Expectation Maximization (EM)
• EM is a general formulation for "Chicken+Egg" problems (mixtures of Gaussians, mixtures of experts, neural nets,
HMMs, Bayes nets, …)
• Isodata is a specific example of EM
• General EM for mixtures of Gaussians is called soft clustering
• It can converge to the maximum likelihood estimate
Remember these formulas?
$$\mu_j = \frac{\sum_{n=1}^{N} P(j \mid x_n)\,x_n}{\sum_{n=1}^{N} P(j \mid x_n)}$$
$$P(j \mid x_n) = \frac{p(x_n \mid j)\,P(j)}{\sum_{k=1}^{M} p(x_n \mid k)\,P(k)}$$
$$p(x_n \mid j) = \frac{1}{(2\pi\sigma_j^2)^{d/2}} \exp\!\left(-\frac{\|x_n - \mu_j\|^2}{2\sigma_j^2}\right)$$
Soft Chicken and Egg Problem:
[Figure: data points on the x axis under p(x), each annotated with its posterior P(1|x): 0.1, 0.3, 0.7, 0.1, 0.01, 0.0001, 0.99, 0.99, 0.99, 0.5, 0.001, 0.00001]
• Suppose we know the posteriors: each mean is the weighted mean of the data
$$\mu_j = \frac{\sum_{n=1}^{N} P(j \mid x_n)\,x_n}{\sum_{n=1}^{N} P(j \mid x_n)}$$
• Step 2: recompute the posteriors
Steps of the EM procedure:
Procedure: EM
1. Choose some initial values for the means $\mu_1, \dots, \mu_M$
E-Step: 2. Compute the posteriors $P(j \mid x_n)$ for each class and each sample.
M-Step: 3. Re-compute the means as the weighted average of their class:
$$\mu_j = \frac{\sum_{n=1}^{N} P(j \mid x_n)\,x_n}{\sum_{n=1}^{N} P(j \mid x_n)}$$
4. If any mean changed value, go to the E-Step; otherwise, stop.
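The procedure above updates only the means, so a runnable sketch has to fix the remaining quantities. The version below assumes a shared, fixed sigma and fixed priors (equal by default), and replaces the "any mean changed" test by a small tolerance; these choices and the function names are assumptions.

```python
import numpy as np

def posteriors(data, means, sigma, priors):
    """E-step: P(j | x_n) for isotropic Gaussians with a shared sigma."""
    sq_dist = np.sum((data[:, None, :] - means[None, :, :]) ** 2, axis=2)  # (N, M)
    # The Gaussian normalisation constant cancels because sigma is shared
    joint = priors * np.exp(-sq_dist / (2.0 * sigma ** 2))
    return joint / joint.sum(axis=1, keepdims=True)

def em_means(data, init_means, sigma=1.0, priors=None, tol=1e-9, max_iter=200):
    """EM for the means only, following steps 1-4 of the procedure."""
    data = np.asarray(data, dtype=float)            # (N, d)
    means = np.array(init_means, dtype=float)       # step 1
    if priors is None:
        priors = np.full(len(means), 1.0 / len(means))
    for _ in range(max_iter):
        post = posteriors(data, means, sigma, priors)              # step 2 (E-step)
        new_means = (post.T @ data) / post.sum(axis=0)[:, None]    # step 3 (M-step)
        converged = np.max(np.abs(new_means - means)) < tol        # step 4
        means = new_means
        if converged:
            break
    return means, post

# Toy 1-D data with two well-separated groups (hypothetical values)
points = [[0.0], [0.2], [0.4], [10.0], [10.2], [10.4]]
soft_means, soft_post = em_means(points, init_means=[[0.0], [1.0]])
```

Compared with Basic Isodata, every sample contributes to every mean, weighted by its posterior, which is why this variant is called soft clustering.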
EM and Gaussian mixtures
$$\theta^{(i)} = \arg\max_{\theta} Q(\theta, \theta^{(i-1)})$$
Mean update:
$$\mu_j^{(i)} = \frac{\sum_{n=1}^{N} p(j \mid x_n, \theta^{(i-1)})\,x_n}{\sum_{n=1}^{N} p(j \mid x_n, \theta^{(i-1)})}$$
Covariance update:
$$\Sigma_j^{(i)} = \frac{\sum_{n=1}^{N} p(j \mid x_n, \theta^{(i-1)})\,(x_n - \mu_j^{(i)})(x_n - \mu_j^{(i)})^T}{\sum_{n=1}^{N} p(j \mid x_n, \theta^{(i-1)})}$$
Mixing weight update:
$$\alpha_j^{(i)} = \frac{1}{N} \sum_{n=1}^{N} p(j \mid x_n, \theta^{(i-1)})$$
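One full EM iteration with these three updates can be sketched as follows. This is a minimal sketch, assuming full covariance matrices and data of shape (N, d). The function names, the synthetic two-cluster data, and the fixed count of 20 iterations are illustrative assumptions, and the code takes no precautions against singular covariances.

```python
import numpy as np

def gaussian_pdf(data, mu, cov):
    """Multivariate Gaussian density evaluated at each row of data."""
    d = data.shape[1]
    diff = data - mu
    maha = np.einsum('ni,ij,nj->n', diff, np.linalg.inv(cov), diff)
    norm = np.sqrt((2.0 * np.pi) ** d * np.linalg.det(cov))
    return np.exp(-0.5 * maha) / norm

def em_step(data, alphas, mus, covs):
    """One EM iteration: posteriors under the old parameters, then the updates."""
    n, d = data.shape
    m = len(alphas)
    # E-step: p(j | x_n, theta^(i-1)), shape (N, M)
    joint = np.stack([alphas[j] * gaussian_pdf(data, mus[j], covs[j])
                      for j in range(m)], axis=1)
    post = joint / joint.sum(axis=1, keepdims=True)
    totals = post.sum(axis=0)
    # M-step: means, covariances (using the new means), mixing weights
    new_mus = (post.T @ data) / totals[:, None]
    new_covs = np.empty((m, d, d))
    for j in range(m):
        diff = data - new_mus[j]
        new_covs[j] = (post[:, j, None] * diff).T @ diff / totals[j]
    new_alphas = totals / n
    return new_alphas, new_mus, new_covs

# Synthetic 2-D data from two clusters (hypothetical parameters)
rng = np.random.default_rng(0)
data = np.vstack([rng.normal([0.0, 0.0], 1.0, (200, 2)),
                  rng.normal([8.0, 8.0], 1.0, (200, 2))])
alphas = np.array([0.5, 0.5])
mus = np.array([[1.0, 1.0], [7.0, 7.0]])
covs = np.array([np.eye(2), np.eye(2)])
for _ in range(20):
    alphas, mus, covs = em_step(data, alphas, mus, covs)
```

The result depends on the initial parameters; with poorly chosen starting values EM may settle in a different local optimum.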
EM examples:
[Figure: training samples]
[Figure: training samples and the end result of EM]
[Figure: color segmentation]