Greedy Technique
Greedy Technique - Definition
The greedy method is a general algorithm design paradigm, built on the following elements:
configurations: different choices, collections, or values to find
objective function: a score assigned to configurations, which we want to either maximize or minimize
It works best when applied to problems with the
Greedy-choice property
Optimal substructure
Greedy Choice Property
Make whatever choice seems best at the moment and then solve the sub-problems arising after the choice is made.
The choice made by a greedy algorithm may depend on the choices made so far, but it cannot depend on any future choices; the algorithm progresses one greedy choice after another, iteratively reducing each given problem to a smaller one.
A greedy algorithm makes its decisions early and never reconsiders them. It may not produce an optimal solution for some problems.
Optimal Substructure
A problem exhibits optimal substructure if an optimal solution to the problem contains within it optimal solutions to subproblems.
This property is used to determine the usefulness of dynamic programming and greedy algorithms in a problem.
Greedy Technique - Making Change
Problem: A dollar amount to reach and a collection of coin amounts to use to get there.
Objective function: Minimize number of coins returned.
Greedy solution: Always return the largest coin you can.
Example 1: Coins are valued $.32, $.08, $.01
Has the greedy-choice property, since no amount over $.32 can be made with a minimum number of coins by omitting a $.32 coin (similarly for amounts over $.08, but under $.32).
Example 2: Coins are valued $.30, $.20, $.05, $.01
Does not have the greedy-choice property, since $.40 is best made with two $.20’s, but the greedy solution picks three coins ($.30 + $.05 + $.05).
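A minimal Python sketch of the greedy rule (make_change is my name, not from the slides; amounts are in cents to avoid floating point). Running it on both coin systems shows the difference:

def make_change(amount, coins):
    # greedy rule: always take the largest coin that still fits
    used = []
    for coin in sorted(coins, reverse=True):
        while amount >= coin:
            amount -= coin
            used.append(coin)
    return used

print(make_change(40, [32, 8, 1]))      # [32, 8]: optimal, 2 coins
print(make_change(40, [30, 20, 5, 1]))  # [30, 5, 5]: 3 coins, but [20, 20] needs only 2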
Task Scheduling or Activity Selection
Given: a set T of n tasks, each having:
A start time, si
A finish time, fi (where si < fi)
Goal: Perform all the tasks using a minimum number of “machines.”
[Figure: tasks scheduled on Machines 1–3 along a timeline from 1 to 9.]
Task Scheduling or Activity Selection
Brute force:
Try all subsets of activities.
Choose the largest subset which is feasible.
The running time for listing all subsets of n activities would be Θ(2^n)
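For comparison, a brute-force sketch in Python (names and helper are mine): it enumerates all 2^n subsets and keeps the largest pairwise-compatible one, using the six tasks from the table that follows.

from itertools import combinations

# the six tasks (si, fi) from the table below
tasks = [(1, 4), (3, 5), (3, 8), (5, 9), (0, 6), (5, 7)]

def feasible(subset):
    # compatible if, sorted by start time, each task starts no earlier than the previous finish
    subset = sorted(subset)
    return all(subset[i][1] <= subset[i + 1][0] for i in range(len(subset) - 1))

best = max((c for r in range(len(tasks) + 1)
            for c in combinations(tasks, r) if feasible(c)), key=len)
print(best)   # a largest mutually compatible subset, e.g. ((1, 4), (5, 9))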
Task Scheduling or Activity Selection
i :   1   2   3   4   5   6
si:   1   3   3   5   0   5
fi:   4   5   8   9   6   7

[Figure: the six tasks drawn as intervals on a timeline from 0 to 10.]
Task Scheduling or Activity Selection
Sorted by finish times:

i :   1   2   5   6   3   4
si:   1   3   0   5   3   5
fi:   4   5   6   7   8   9

[Figure: the same tasks, redrawn in increasing order of finish time.]
Task Scheduling or Activity Selection
Algorithm Greedy_Activity(s[1…n], f[1…n])
  // assumes activities are sorted by finish time: f[1] ≤ f[2] ≤ … ≤ f[n]
  A ← {1}
  j ← 1
  for i ← 2 to n do
    if s[i] ≥ f[j]
      A ← A + {i}
      j ← i
  return A
Excluding the sorting time, this algorithm’s running time is T(n) = Θ(n).
Including the sorting time, the total running time is T(n) = Θ(n log n), since the sort dominates the linear scan.
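The same greedy scan as runnable Python (a sketch; names are mine). This version sorts by finish time itself, so its total running time is Θ(n log n):

def greedy_activity(s, f):
    # sort activity indices by finish time, then scan once
    order = sorted(range(len(s)), key=lambda i: f[i])
    A = [order[0]]              # always take the first-finishing activity
    j = order[0]                # last selected activity
    for i in order[1:]:
        if s[i] >= f[j]:        # compatible with the last selected activity
            A.append(i)
            j = i
    return A

# the six tasks from the table above (0-based indices, printed as 1-based ids)
s = [1, 3, 3, 5, 0, 5]
f = [4, 5, 8, 9, 6, 7]
print([i + 1 for i in greedy_activity(s, f)])   # [1, 6]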
Proving Optimality
Proof of the greedy-choice property
Purpose: show that activity 1 (the greedy choice) is in some optimal solution.
Let S = {1, 2, …, n} be the set of activities, ordered by finish time; this means activity 1 has the earliest finish time.
Suppose A ⊆ S is an optimal solution, and let the activities in A be ordered by finish time. Suppose the first activity in A is k.
If k = 1, then A begins with the greedy choice and we are done (to be precise, there is nothing to prove here).
If k ≠ 1, there is another optimal solution B that begins with the greedy choice, activity 1:
let B = (A − {k}) ∪ {1}. Because f1 ≤ fk, the activities in B are disjoint, and since B has the same number of activities as A, i.e., |A| = |B|, B is also optimal.
Proving Optimality
Proof of optimal substructure
Purpose: show that an optimal solution to the problem contains within it optimal solutions to subproblems.
Once the greedy choice is made, the problem reduces to finding an optimal solution for the remaining subproblem. If A is an optimal solution to the original problem S, then A′ = A − {1} is an optimal solution to the activity-selection problem S′ = {i ∈ S : si ≥ f1}.
Why? Because if we could find a solution B′ to S′ with more activities than A′, adding activity 1 to B′ would yield a solution B to S with more activities than A, thereby contradicting the optimality of A.
The Fractional Knapsack Problem
Given: A set S of n items, with each item i having
bi - a positive benefit
wi - a positive weight
Goal: Choose items with maximum total benefit but with weight at most W.
If we are allowed to take fractional amounts, then this is the fractional knapsack problem.
In this case, we let xi denote the amount we take of item i
Objective: maximize ∑i∈S bi (xi / wi)
Constraint: ∑i∈S xi ≤ W
Example
Given: A set S of n items, with each item i having
bi - a positive benefit
wi - a positive weight
Goal: Choose items with maximum total benefit but with weight at most W.
Items:    1      2      3      4      5
Weight:   4 ml   8 ml   2 ml   6 ml   1 ml
Benefit:  $12    $32    $40    $30    $50

“Knapsack” capacity: 10 ml

Solution:
• 1 ml of item 5
• 2 ml of item 3
• 6 ml of item 4
• 1 ml of item 2
The Fractional Knapsack Algorithm
Algorithm fractionalKnapsack(S, W)
  Input: set S of items, each with benefit bi and weight wi; maximum total weight W
  Output: amount xi of each item i, maximizing total benefit with total weight at most W
  for each item i in S
    xi ← 0
    vi ← bi / wi   // value index
  w ← 0   // total weight
  while w < W
    remove from S the item i with the highest vi
    xi ← min{wi, W − w}
    w ← w + min{wi, W − w}
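A Python sketch of the same algorithm (fractional_knapsack is my name), run on the 5-item example above:

def fractional_knapsack(benefits, weights, W):
    # visit items in decreasing order of value index v_i = b_i / w_i
    items = sorted(range(len(weights)),
                   key=lambda i: benefits[i] / weights[i], reverse=True)
    x = [0.0] * len(weights)    # amount taken of each item
    w = 0.0                     # total weight taken so far
    for i in items:
        if w >= W:
            break
        x[i] = min(weights[i], W - w)   # take as much of item i as fits
        w += x[i]
    return x

benefits = [12, 32, 40, 30, 50]   # items 1..5 from the example
weights  = [4, 8, 2, 6, 1]        # in ml
x = fractional_knapsack(benefits, weights, 10)
print(x)   # 1 ml of item 5, 2 ml of item 3, 6 ml of item 4, 1 ml of item 2
print(sum(b * xi / wi for b, wi, xi in zip(benefits, weights, x)))   # 124.0 dollars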
Proving Optimality
Assume the items are indexed so that v1/w1 ≥ v2/w2 ≥ … ≥ vn/wn.
Let x be the greedy solution vector. It has the form x = (1, …, 1, xk, 0, …, 0): items 1 through k−1 are taken whole (xi = 1), item k may be fractional (xk < 1), and items k+1 through n are not taken (xi = 0). Here xi = 1 means “select the whole item,” xi = 0 means “do not select it at all,” and xi denotes the fraction of item i taken.
Let y be any feasible solution vector. We want to show that

∑i=1..n (xi − yi) vi ≥ 0

Split the sum at item k:

∑i=1..n (xi − yi) vi = ∑i=1..k−1 (xi − yi)(vi/wi)wi + (xk − yk)(vk/wk)wk + ∑i=k+1..n (xi − yi)(vi/wi)wi

For i < k: xi = 1 ≥ yi, so xi − yi ≥ 0, and vi/wi ≥ vk/wk; hence (xi − yi)(vi/wi)wi ≥ (xi − yi)(vk/wk)wi.
For i > k: xi = 0 ≤ yi, so xi − yi ≤ 0, and vi/wi ≤ vk/wk; hence (xi − yi)(vi/wi)wi ≥ (xi − yi)(vk/wk)wi.

Therefore

∑i=1..n (xi − yi) vi ≥ (vk/wk) ∑i=1..n (xi − yi) wi ≥ 0,

since the greedy solution fills the knapsack (∑i xi wi = W) while any feasible y satisfies ∑i yi wi ≤ W. Hence no feasible solution y has greater benefit than the greedy solution x.
Huffman Code
Huffman code is a technique for compressing data.
Huffman's greedy algorithm looks at the frequency of each character and represents each character as a binary string in an optimal way.
Suppose we have a file of 100,000 characters that we want to compress. The characters occur with the following frequencies.
Consider the problem of designing a "binary character code" in which each character is represented by a unique binary string (its codeword).
Character:   a        b        c        d        e       f
Frequency:   45,000   13,000   12,000   16,000   9,000   5,000
Fixed-Length Code
With a fixed-length code, we need only 3 bits to represent the six characters.
The total number of characters is 45,000 + 13,000 + 12,000 + 16,000 + 9,000 + 5,000 = 100,000.
Since each character is assigned a 3-bit codeword, encoding the file takes 3 × 100,000 = 300,000 bits.
Character:   a        b        c        d        e       f
Frequency:   45,000   13,000   12,000   16,000   9,000   5,000
Codeword:    000      001      010      011      100     101
Variable-Length Code
A variable-length code gives frequent characters short codewords and infrequent characters long codewords (a codeword is a sequence of bits), using a prefix code.
In a prefix code, no codeword is a prefix of any other codeword. Prefix codes are desirable because they simplify both encoding (compression) and decoding.
The variable-length code below requires only 224,000 bits.
Character:   a        b        c        d        e       f
Frequency:   45,000   13,000   12,000   16,000   9,000   5,000
Codeword:    0        100      101      111      1101    1100
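Because no codeword is a prefix of another, a single left-to-right scan decodes unambiguously. A small Python sketch using the codewords from the table above (decode is my name):

# the prefix code from the table above
code = {'a': '0', 'b': '100', 'c': '101', 'd': '111', 'e': '1101', 'f': '1100'}
decode_table = {v: k for k, v in code.items()}

def decode(bits):
    out, buf = [], ''
    for bit in bits:
        buf += bit
        if buf in decode_table:   # prefix property: the first match is the only match
            out.append(decode_table[buf])
            buf = ''
    return ''.join(out)

encoded = ''.join(code[ch] for ch in 'face')
print(encoded)           # 110001011101
print(decode(encoded))   # face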
Huffman Code – Binary Tree
[Figure: the binary tree for the variable-length code above. Each left edge is labeled 0 and each right edge 1; the characters a–f sit at the leaves, so each codeword is read off the edge labels along the root-to-leaf path.]
Constructing Huffman Code
[Figure: Huffman tree construction. Repeatedly merge the two nodes of least frequency: f(5) + e(9) = 14; c(12) + b(13) = 25; 14 + d(16) = 30; 25 + 30 = 55; 55 + a(45) = 100.]
Huffman Code – Binary Tree
Given a tree T corresponding to the prefix code, we can compute the number of bits required to encode a file.
Let f(c) be the frequency of character c and let dT(c) denote the depth of c's leaf in T. Note that dT(c) is also the length of c's codeword. The number of bits to encode the file is
B(T) = ∑ f(c) dT(c)
     = 45×1 + 13×3 + 12×3 + 16×3 + 9×4 + 5×4 = 224 (frequencies in thousands)
     = 224 × 1,000 = 224,000 bits
Huffman Code – Algorithm

Algorithm Huffman(C, n)
  Q ← BuildHeap(C)
  for i ← 1 to n − 1 do
    z ← Allocate-Node()
    z.left ← Extract_Min(Q)
    z.right ← Extract_Min(Q)
    z.freq ← z.left.freq + z.right.freq
    Insert(Q, z)
  return Extract_Min(Q)

BuildHeap runs in O(n) time. The for loop executes n − 1 times, and each heap operation requires O(log n) time, so the loop contributes O(n log n). Huffman's algorithm therefore runs in O(n log n) time.
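The same procedure as runnable Python, using heapq as the min-priority queue (a sketch; the tie-breaking counter and the codewords helper are my own additions):

import heapq

def huffman(freq):
    # freq: dict mapping character -> frequency
    # heap entries are (frequency, tiebreaker, tree); a tree is a char or a (left, right) pair
    heap = [(f, i, ch) for i, (ch, f) in enumerate(sorted(freq.items()))]
    heapq.heapify(heap)                       # BuildHeap: O(n)
    count = len(heap)
    for _ in range(len(freq) - 1):            # n - 1 merges
        f1, _, left = heapq.heappop(heap)     # Extract_Min: O(log n)
        f2, _, right = heapq.heappop(heap)
        heapq.heappush(heap, (f1 + f2, count, (left, right)))
        count += 1
    return heap[0][2]                         # root of the Huffman tree

def codewords(tree, prefix=''):
    if isinstance(tree, str):                 # leaf: emit the accumulated bits
        return {tree: prefix or '0'}
    left, right = tree
    return {**codewords(left, prefix + '0'), **codewords(right, prefix + '1')}

freq = {'a': 45, 'b': 13, 'c': 12, 'd': 16, 'e': 9, 'f': 5}
codes = codewords(huffman(freq))
print(codes)   # an optimal prefix code; ties may be broken differently than the table above
print(sum(freq[ch] * len(codes[ch]) for ch in freq))   # B(T) = 224 (x 1,000 bits)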
Optimal Substructure
Let T be a tree for the code in which x and y are sibling leaves, and let T* be the tree obtained by replacing x, y, and their parent with a single leaf c of frequency f(c) = f(x) + f(y).

[Figure: tree T with sibling leaves x and y; tree T* with the merged leaf c in their place.]

Since dT(x) = dT(y) = dT*(c) + 1,
f(x)dT(x) + f(y)dT(y) = (f(x) + f(y))(dT*(c) + 1) = f(c)dT*(c) + f(x) + f(y)
and therefore
B(T) = B(T*) + f(x)dT(x) + f(y)dT(y) − f(c)dT*(c)
     = B(T*) + f(x) + f(y)
Greedy Choice Property
If x and y are the two characters with the least frequencies, then there exists an optimal tree (representing a prefix code) in which x and y are sibling leaves at the deepest level.
Let T be an optimal tree, and let b and c be sibling leaves at maximum depth in T, with f(x) ≤ f(b) and f(y) ≤ f(c). Swap x with b to obtain T*, then swap y with c to obtain T**.

[Figure: tree T with b and c at maximum depth and x, y higher up; T* after swapping x and b; T** after also swapping y and c.]

Swapping x and b does not increase the cost:
∑ f(c)dT(c) − ∑ f(c)dT*(c)   // before swap − after swap
= f(x)dT(x) + f(b)dT(b) − f(x)dT*(x) − f(b)dT*(b)
= f(x)dT(x) + f(b)dT(b) − f(x)dT(b) − f(b)dT(x)
= (f(b) − f(x))(dT(b) − dT(x)) ≥ 0,
since f(x) ≤ f(b) and dT(x) ≤ dT(b). The same argument applies to swapping y and c, so T** is optimal as well.
Single Source Shortest Path (Dijkstra’s Algorithm)
Given a vertex called the source in a weighted connected graph, find shortest paths to all its other vertices. (This is not the TSP.)
The distance of a vertex v from a vertex s is the length of a shortest path between s and v
Dijkstra’s algorithm computes the distances of all the vertices from a given start vertex s
Assumptions:
the graph is connected
the edges are undirected
the edge weights are nonnegative
Single Source Shortest Path (Dijkstra’s Algorithm)
Grow a “cloud” of vertices, beginning with s and eventually covering all the vertices
Store with each vertex v a label d(v) representing the distance of v from s in the sub-graph consisting of the cloud and its adjacent vertices
At each step
Add to the cloud the vertex u outside the cloud with the smallest distance label, d(u)
Update the labels of the vertices adjacent to u
Edge Relaxation
Consider an edge e = (u,z) such that
u is the vertex most recently added to the cloud
z is not in the cloud
The relaxation of edge e updates distance d(z) as follows:
d(z) ← min{d(z),d(u) + weight(e)}
[Figure: relaxation of edge e = (u,z) with weight(e) = 10. Before: d(u) = 50, d(z) = 75. After: d(z) = min{75, 50 + 10} = 60.]
Example
[Figure: six snapshots of Dijkstra's algorithm on a sample graph with vertices A–F. The source starts with label 0 and all other labels at ∞; at each step the vertex with the smallest label joins the cloud, and the labels of its neighbors are relaxed.]
Pseudocode

Algorithm Dijkstra(G, w, s)
  for each vertex v in V[G]   // initialization
    d[v] ← infinity
    previous[v] ← undefined
  d[s] ← 0
  S ← empty set
  Q ← V[G]   // build Q holding all vertices
  while Q is not an empty set
    u ← Extract_Min(Q)
    S ← S + {u}
    for each edge (u,v) outgoing from u
      if d[u] + w(u,v) < d[v]   // relax edge (u,v)
        d[v] ← d[u] + w(u,v)
        previous[v] ← u
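The same algorithm as runnable Python (a sketch, not the slides' code): heapq serves as the priority queue, stale entries are skipped instead of maintaining an explicit Q of all vertices, and the sample graph is made up for illustration.

import heapq

def dijkstra(graph, s):
    # graph: dict mapping vertex -> list of (neighbor, weight) pairs
    d = {v: float('inf') for v in graph}
    previous = {v: None for v in graph}
    d[s] = 0
    pq = [(0, s)]                    # (distance label, vertex)
    cloud = set()                    # finalized vertices (the set S)
    while pq:
        du, u = heapq.heappop(pq)    # Extract_Min
        if u in cloud:
            continue                 # stale entry: u was already finalized
        cloud.add(u)
        for v, weight in graph[u]:
            if du + weight < d[v]:   # relax edge (u, v)
                d[v] = du + weight
                previous[v] = u
                heapq.heappush(pq, (d[v], v))
    return d, previous

graph = {'A': [('B', 8), ('C', 2), ('D', 4)],
         'B': [('A', 8), ('C', 7)],
         'C': [('A', 2), ('B', 7), ('D', 1)],
         'D': [('A', 4), ('C', 1)]}
print(dijkstra(graph, 'A')[0])   # {'A': 0, 'B': 8, 'C': 2, 'D': 3}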
Analysis
The time efficiency of Dijkstra's algorithm depends on the data structures used to implement the priority queue and to represent the input graph.
If we store the graph as an ordinary linked list or array and represent Q the same way, Extract_Min(Q) is a linear search through all vertices in Q, and the running time is O(V^2).
If we store the graph as adjacency lists and use a binary heap as the priority queue (to implement Extract_Min), the algorithm requires O((E + V) log V) time; recall that ∑ deg(v) = 2E.
Since the graph is connected, E ≥ V − 1, so the running time can also be expressed as O(E log V).
Why Dijkstra’s Algorithm Works
Dijkstra’s algorithm is based on the greedy method. It adds vertices by increasing distance.

[Figure: snapshot of the example graph with the cloud grown around the source; D is inside the cloud and F just outside it.]
Suppose it didn’t find all shortest distances. Let F be the first wrong vertex the algorithm processed.
When the previous node, D, on the true shortest path was considered, its distance was correct.
But the edge (D,F) was relaxed at that time! Thus, as long as d(F) ≥ d(D), F’s distance cannot be wrong. That is, there is no wrong vertex.
Why It Doesn’t Work for Negative-Weight Edges
If a node with a negative incident edge were to be added late to the cloud, it could mess up distances for vertices already in the cloud.

[Figure: counterexample graph containing an edge of weight −8 into C.]

C’s true distance is 1, but it is already in the cloud with a larger label, and the greedy method never reconsiders it.
Acknowledgement
http://www.personal.kent.edu/~rmuhamma/Algorithms/algorithm.html
http://ww3.algorithmdesign.net/
http://en.wikipedia.org/wiki/