Lecture: Submodular functions and discrete optimization (EL 260)

• Combinatorial optimization in ML
• Submodular functions
• Maximizing monotone submodular functions
  - Greedy method
  - (1 - 1/e) approximation

TA session (HW 3): Monday, 18:00 hours
Reference: Francis Bach (monograph)
So far, we have seen convex optimization problems in ML:
classify '+' and '-' points by finding a separating hyperplane,
i.e. solve for the best vector w that minimizes the loss L(w)
and maximizes the size of the margin:

    w* = arg min_w L(w)

(Figure: '+' and '-' points separated by a hyperplane.)
Feature selection:
• Given random variables Y, X_1, ..., X_n
• Predict Y from a subset X_A = {X_{i_1}, ..., X_{i_k}}

(Figure: candidate feature subsets feeding a model, with features such as recent travel, fever, cough.)

We wish to select the 'k' most informative features:

    A* = arg max_A IG(X_A; Y)   s.t. |A| ≤ k
Information gain:

    IG(X_A; Y) = H(Y) - H(Y | X_A)

where H(Y) is the uncertainty before knowing X_A and H(Y | X_A) is the
uncertainty after knowing X_A.

This is a combinatorial problem!
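To make the combinatorial nature concrete, here is a brute-force sketch in Python that estimates IG(X_A; Y) = H(Y) - H(Y | X_A) from samples and enumerates all subsets of size ≤ k. The toy data and function names are my own, not from the lecture.

```python
import itertools
from collections import Counter
from math import log2

def entropy(rows):
    """Empirical entropy of a list of (hashable) outcomes."""
    n = len(rows)
    return -sum(c / n * log2(c / n) for c in Counter(rows).values())

def info_gain(data, A, y_idx):
    """IG(X_A; Y) = H(Y) - H(Y | X_A), estimated from samples."""
    y = [row[y_idx] for row in data]
    xa = [tuple(row[i] for i in A) for row in data]
    joint = list(zip(xa, y))
    # H(Y | X_A) = H(X_A, Y) - H(X_A)
    return entropy(y) - (entropy(joint) - entropy(xa))

# toy samples: feature columns X_0, X_1, X_2 and label Y (last column)
data = [(0, 0, 1, 0), (0, 1, 1, 1), (1, 0, 0, 1), (1, 1, 0, 0),
        (0, 0, 0, 0), (1, 1, 1, 1)]
k = 2
# exhaustive enumeration over all feature subsets of size <= k
best = max((frozenset(A) for r in range(1, k + 1)
            for A in itertools.combinations(range(3), r)),
           key=lambda A: info_gain(data, sorted(A), 3))
print(sorted(best), info_gain(data, sorted(best), 3))
```

The enumeration visits every subset of size at most k, which is exactly the combinatorial blow-up the greedy method will later avoid.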
Sensor placement (set cover problem):
How to place k sensors out of V candidate positions to maximize the
coverage? Nodes predict/measure values within some radius of coverage.

(Figure: candidate sensor locations as dots; each placed sensor covers a disc around it.)
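A minimal sketch of this problem in Python, with an exhaustive search over all C(|V|, k) placements. The grid, coverage radius, and names are illustrative assumptions, not from the lecture.

```python
import itertools

# candidate sensor positions on a 5x5 grid; each sensor covers points
# within Chebyshev distance 1 (its 3x3 neighbourhood)
points = [(x, y) for x in range(5) for y in range(5)]   # area to cover
candidates = points                                     # V: candidate positions
k = 3

def coverage(placement):
    """Number of points covered by at least one sensor."""
    return sum(any(max(abs(px - sx), abs(py - sy)) <= 1
                   for (sx, sy) in placement)
               for (px, py) in points)

# exhaustive search: C(|V|, k) placements -- combinatorial blow-up
best = max(itertools.combinations(candidates, k), key=coverage)
print(best, coverage(best))
```

Even on this toy grid the search examines C(25, 3) = 2300 placements; for realistic V and k, exhaustive search is hopeless, which motivates the greedy method below.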
Factoring distributions:
Given random variables X_1, ..., X_n, partition V into sets A and B = V \ A
that are as independent as possible:

    A* = arg min_A I(X_A; X_B)   s.t. 0 < |A| < n,  B = V \ A

where the mutual information is

    I(X_A; X_B) = H(X_B) - H(X_B | X_A)

(Figure: the variables X_1, ..., X_6 split into two groups A and B.)

Again, combinatorial!
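As a sketch, the partition search can be done by brute force on a small joint distribution. Here the pmf is constructed (my own toy example) so that X_0 is independent of (X_1, X_2), making A = {0} the best split.

```python
import itertools
from math import log2

# joint pmf over (X0, X1, X2): X0 independent of (X1, X2),
# while X1 and X2 are correlated
p_x0 = {0: 0.5, 1: 0.5}
p_x12 = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.4}
pmf = {(a, b, c): p_x0[a] * p_x12[(b, c)]
       for a in (0, 1) for (b, c) in p_x12}

def entropy_of(indices):
    """H(X_A): entropy of the marginal over the given variable indices."""
    marg = {}
    for x, p in pmf.items():
        key = tuple(x[i] for i in indices)
        marg[key] = marg.get(key, 0.0) + p
    return -sum(p * log2(p) for p in marg.values() if p > 0)

def mutual_info(A, B):
    # I(X_A; X_B) = H(X_A) + H(X_B) - H(X_A, X_B)
    return entropy_of(A) + entropy_of(B) - entropy_of(A + B)

V = (0, 1, 2)
# enumerate every nontrivial partition A, B = V \ A
best = min((A for r in range(1, 3) for A in itertools.combinations(V, r)),
           key=lambda A: mutual_info(A, tuple(i for i in V if i not in A)))
print(best)  # the split closest to independence
```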
Set functions:

    f : 2^V → R

→ Takes as input a set; inputs are subsets of the ground set V = {1, 2, ..., N}
→ 2^V is the power set (the set of all subsets)

Minimization (or maximization) of a set function:

    min_{A ⊆ V} f(A) = min_{A ∈ 2^V} f(A)   s.t. constraints on the subsets A
Reformulation as a Boolean function:

    min f(w),  w ∈ {0,1}^N,  with f(1_A) = f(A) for all A ⊆ V

(Figure: the hypercube {0,1}^3 with vertices (1,1,1) ~ {1,2,3}, (1,1,0) ~ {1,2},
(0,1,0) ~ {2}, (0,0,0) ~ {}.)

Example (modular): f(1_A) = Σ_{i=1}^N w_i (1_A)_i = Σ_{i ∈ A} w_i for fixed
weights w_i.

(P)   maximize f(w)   s.t. w ∈ {0,1}^N,  ||w||_0 ≤ k

To solve (P) optimally we have to exhaustively enumerate all k-sparse vectors.

Convex relaxation:

(P_c) maximize f(w)   s.t. w ∈ [0,1]^N (box constraint, 0 ≤ w_i ≤ 1),
                           ||w||_1 ≤ k (the ℓ_1-ball is the best convex
                           approximation of the ℓ_0-norm)
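The correspondence f(1_A) = f(A) between hypercube vertices and subsets can be sketched directly. The coverage function below is my own toy choice of f.

```python
from itertools import product

# a toy coverage function on ground set {0, 1, 2}
covers = {0: {"a", "b"}, 1: {"b", "c"}, 2: {"d"}}

def f_set(A):
    """f(A): number of elements covered by the sets indexed by A."""
    return len(set().union(*(covers[i] for i in A))) if A else 0

def f_bool(w):
    """f(w) = f(A) where A = {i : w_i = 1} -- the Boolean reformulation."""
    A = [i for i, wi in enumerate(w) if wi]
    return f_set(A)

# every vertex of the hypercube {0,1}^N corresponds to a subset
for w in product((0, 1), repeat=3):
    A = tuple(i for i, wi in enumerate(w) if wi)
    print(w, "->", A, "f =", f_bool(w))
```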
Key property: "diminishing returns"

(Figure: A ⊆ B. Example: a cash-back of Rs 50 means much more to a bank
account holding Rs 100 than to an account holding Rs 400,000.)
Submodular functions: A set function f is said to be submodular if and only if

    f(B ∪ {i}) - f(B) ≤ f(A ∪ {i}) - f(A)   for all A ⊆ B ⊆ V and i ∉ B

Equivalent definition:

    f(A) + f(B) ≥ f(A ∩ B) + f(A ∪ B)   for all A, B ⊆ V

• Equality leads to modular functions
• Normalization: f(∅) = 0
Proof (second definition ⟹ first): let A' = A ∪ {i} and B' = B. Then

    f(A ∪ {i}) + f(B) = f(A') + f(B')
                      ≥ f(A' ∩ B') + f(A' ∪ B')
                      = f((A ∪ {i}) ∩ B) + f(A ∪ {i} ∪ B)
                      = f(A) + f(B ∪ {i})

using A ⊆ B and i ∉ B in the last step; rearranging gives the
diminishing-returns inequality. □

Note: f is supermodular if and only if -f is submodular.
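The equivalent definition can be checked numerically. The sketch below verifies f(A) + f(B) ≥ f(A ∩ B) + f(A ∪ B) for every pair of subsets, using a toy coverage function of my own choosing (coverage functions are submodular).

```python
from itertools import combinations

covers = {0: {"a", "b"}, 1: {"b", "c"}, 2: {"c", "d"}, 3: {"d"}}
V = list(covers)

def f(A):
    """Coverage function: number of elements covered by the sets in A."""
    return len(set().union(*(covers[i] for i in A))) if A else 0

subsets = [frozenset(s) for r in range(len(V) + 1)
           for s in combinations(V, r)]

# check f(A) + f(B) >= f(A & B) + f(A | B) for every pair A, B
ok = all(f(A) + f(B) >= f(A & B) + f(A | B)
         for A in subsets for B in subsets)
print("coverage is submodular:", ok)
```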
→ Lovász extension of submodular functions (a convex extension used for
minimization).

Minimization of (differences of) submodular functions:
applications include active learning, feature clustering, structure learning,
MAP inference in Markov random fields, selection, ranking.

Analogy with differences of convex functions (DC programming):

    min_{x ∈ X} f(x) - g(x),   where f(x) and g(x) are convex.

(Figure: the curves f(x) and g(x), with g linearized around the current
iterate x_0.)

SCP: iteratively solve min_{x ∈ X} f(x) - g(x_0) - ∇g(x_0)ᵀ(x - x_0),
i.e. linearize g around the current iterate x_0.
Examples of submodular functions:
e.g. flows, set cover, differential entropy.

Example: Given p random variables X_1, ..., X_p, define f(A) as the joint
entropy of the variables (X_k)_{k ∈ A}. Claim: f(A) is submodular.

If A ⊆ B and k ∉ B:

    f(A ∪ {k}) - f(A) = H(X_A, X_k) - H(X_A)
                      = H(X_k | X_A)
                      ≥ H(X_k | X_B)       (conditioning reduces entropy)
                      = f(B ∪ {k}) - f(B)  □
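The entropy example can also be verified numerically: the sketch below checks the submodular inequality for the joint entropy of three binary variables under an arbitrary joint pmf (the pmf values are my own toy choice).

```python
import itertools
from math import log2

# a small joint pmf over three binary variables (X1, X2, X3)
pmf = {x: p for x, p in zip(itertools.product((0, 1), repeat=3),
                            (0.20, 0.05, 0.10, 0.15, 0.05, 0.20, 0.15, 0.10))}

def H(A):
    """Joint entropy f(A) = H(X_A) of the variables indexed by A."""
    marg = {}
    for x, p in pmf.items():
        key = tuple(x[i] for i in sorted(A))
        marg[key] = marg.get(key, 0.0) + p
    return -sum(p * log2(p) for p in marg.values() if p > 0)

V = range(3)
subsets = [frozenset(s) for r in range(4) for s in itertools.combinations(V, r)]
# submodularity: H(A) + H(B) >= H(A & B) + H(A | B) (up to float tolerance)
ok = all(H(A) + H(B) >= H(A & B) + H(A | B) - 1e-12
         for A in subsets for B in subsets)
print("joint entropy is submodular:", ok)
```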
Maximizing submodular functions:

    maximize f(A)   s.t. |A| ≤ k,  A ⊆ V

Nemhauser et al. (1978): if f is submodular, monotone increasing
(f(A ∪ {i}) ≥ f(A)), and normalized (f(∅) = 0), then the greedy algorithm

    A ← ∅
    for t = 1, 2, ..., k:
        i* ← arg max_{i ∉ A} [f(A ∪ {i}) - f(A)]
        A ← A ∪ {i*}
    return A
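The greedy method above can be sketched in a few lines of Python; the coverage function used to exercise it is my own toy example (coverage is monotone and submodular, so the guarantee applies).

```python
def greedy_max(f, V, k):
    """Greedy maximization of a monotone submodular f under |A| <= k."""
    A = set()
    for _ in range(k):
        # pick the element with the largest marginal gain f(A + {i}) - f(A)
        i = max((j for j in V if j not in A),
                key=lambda j: f(A | {j}) - f(A))
        A.add(i)
    return A

# example: coverage function (submodular and monotone)
covers = {0: {"a", "b"}, 1: {"b", "c"}, 2: {"c", "d", "e"}, 3: {"e"}}
f = lambda A: len(set().union(*(covers[i] for i in A))) if A else 0
print(greedy_max(f, covers.keys(), 2))
```

Each step costs one pass over the ground set, so the whole run is O(k |V|) evaluations of f, versus C(|V|, k) for exhaustive search.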
The above greedy method satisfies:

    f(A) ≥ (1 - 1/e) f(A_opt),   where f(A_opt) = max_{A ⊆ V, |A| ≤ k} f(A)
                                 (exhaustive search)

Although this bound is not that tight, greedy results are close to
exhaustive search in practice (whenever that is verifiable).
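The (1 - 1/e) guarantee can be checked on a small random instance where exhaustive search is still feasible. The random coverage construction below is an illustrative assumption, not from the lecture.

```python
import itertools
import math
import random

random.seed(0)
# random coverage instance: 8 candidate sets over a universe of 15 items
universe = range(15)
covers = {i: set(random.sample(universe, 4)) for i in range(8)}
f = lambda A: len(set().union(*(covers[i] for i in A))) if A else 0
k = 3

# greedy
A, V = set(), set(covers)
for _ in range(k):
    A.add(max(V - A, key=lambda i: f(A | {i}) - f(A)))

# exhaustive search over all C(8, 3) subsets
opt = max(f(set(S)) for S in itertools.combinations(V, k))
print(f(A), opt, f(A) >= (1 - 1 / math.e) * opt)
```

On instances like this the greedy value typically matches or nearly matches the exhaustive optimum, well above the worst-case bound.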
Claim: pick any A ⊆ V such that |A| < k. Then

    max_{i ∈ V} [f(A ∪ {i}) - f(A)] ≥ (1/k) (f(A_opt) - f(A))
Proof: let A_opt \ A = {i_1, ..., i_p}, so that p ≤ k. Then

    f(A_opt) ≤ f(A_opt ∪ A)                                        (monotonicity)
             = f(A) + Σ_{j=1}^p [f(A ∪ {i_1, ..., i_j}) - f(A ∪ {i_1, ..., i_{j-1}})]
             ≤ f(A) + Σ_{j=1}^p [f(A ∪ {i_j}) - f(A)]              (submodularity)
             ≤ f(A) + k max_{i ∉ A} [f(A ∪ {i}) - f(A)]            (p ≤ k)

Rearranging gives the claim. □
Approximation guarantee:

Let A^i be the solution of the greedy method at step i. From the previous
claim,

    f(A^i) - f(A^{i-1}) ≥ (1/k) (f(A_opt) - f(A^{i-1}))
    ⟺ f(A_opt) - f(A^i) ≤ (1 - 1/k) (f(A_opt) - f(A^{i-1}))

Combining over every iteration 1 ≤ i ≤ k:

    f(A_opt) - f(A^k) ≤ (1 - 1/k)^k [f(A_opt) - f(∅)]
                      = (1 - 1/k)^k f(A_opt)          (since f(∅) = 0)
    ⟹ f(A^k) ≥ f(A_opt) - (1 - 1/k)^k f(A_opt)

Using the fact that 1 - x ≤ e^{-x}, so (1 - 1/k)^k ≤ e^{-1}:

    f(A^k) ≥ (1 - 1/e) f(A_opt)  □
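The last step rests on (1 - 1/k)^k ≤ 1/e for every k ≥ 1; a quick numeric illustration:

```python
from math import e

# (1 - 1/k)^k increases towards 1/e from below as k grows
for k in (1, 2, 5, 10, 100):
    print(k, (1 - 1 / k) ** k, "<=", 1 / e)
```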