
CSL 356 - Greedy Algorithms: Huffman Coding

Academic year: 2024


(1)

Ragesh Jaiswal
CSE, IIT Delhi

CSL 356: Analysis and Design of Algorithms

(2)

Greedy Algorithms: Huffman Coding

(3)

Greedy Algorithms: Huffman Coding

Problem: Given an alphabet containing n symbols a1,…,an and the frequency of occurrence of each symbol (t(a1),t(a2),…,t(an)), find a binary tree T with n leaves (each leaf labeled with a distinct symbol) that minimizes:

OT = d(a1)*t(a1) + d(a2)*t(a2) + … + d(an)*t(an)

Here d(ai) is the depth of the leaf labeled with symbol ai.

What are the properties of the optimal tree T?

1. Claim: T is a complete binary tree.

Complete binary tree: Every non-leaf node has exactly two children.

2. Claim: Consider the two symbols x, y with least frequency. Then, barring ties in frequency, x and y have maximum depth in any optimal T, and there is always an optimal T where x and y are siblings.

(4)

Greedy Algorithms: Huffman Coding

Let ω be a new symbol not present in ∑. Consider the following (smaller) problem:

∑’ = (∑ - {x,y}) U {ω}

For all z in ∑’\{ω}, t’(z) = t(z)

t’(ω) = t(x) + t(y)

Find the optimal binary tree for the new alphabet ∑’ and the new frequencies given by t’.

Let T’ be the optimal binary tree for the above problem.

Consider the leaf v labeled with ω in T’. Consider a new tree T where v has two children that are leaves and are labeled with x and y.

Claim: T is the optimal tree for the original problem. (The two costs differ by a constant: OT = OT’ + t(x) + t(y), because x and y sit one level below v.)

(5)

Greedy Algorithms: Huffman Coding

Huffman(∑)
- Let v1,…,vn be nodes, each node denoting a symbol
- S = {v1,…,vn}
- While (|S|>1)
    - Pick two nodes x and y with the least values of t(x) and t(y)
    - Create a new node z and set t(z) = t(x) + t(y)
    - Set x as the left child of z and y as the right child
    - Remove x and y from S and add z to S
- When |S|=1, return the only node in S as the root node of the Huffman Tree

Running time?
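The Huffman procedure can be sketched in Python as follows. This is a minimal illustration, not the lecture's own code: the function name `huffman_codes`, the use of a binary heap for S, and the representation of trees as nested pairs are all assumptions. It also assumes a non-empty frequency table whose symbols are not themselves tuples. With a heap, each of the n-1 iterations costs O(log n), which suggests an O(n log n) running time.

```python
import heapq
from itertools import count


def huffman_codes(freq):
    """Return a dict symbol -> bit string for the frequency table `freq`."""
    if len(freq) == 1:
        # A single symbol still needs one bit per occurrence.
        return {sym: "0" for sym in freq}
    tiebreak = count()  # unique counter so the heap never compares trees
    # Heap entry: (frequency, tiebreak, tree). A tree is either a symbol
    # (leaf) or a (left, right) pair (internal node).
    heap = [(f, next(tiebreak), sym) for sym, f in freq.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        fx, _, x = heapq.heappop(heap)  # least frequency
        fy, _, y = heapq.heappop(heap)  # second least frequency
        heapq.heappush(heap, (fx + fy, next(tiebreak), (x, y)))
    _, _, root = heap[0]

    codes = {}

    def walk(node, prefix):
        if isinstance(node, tuple):
            walk(node[0], prefix + "0")  # left child
            walk(node[1], prefix + "1")  # right child
        else:
            codes[node] = prefix

    walk(root, "")
    return codes


codes = huffman_codes({"A": 30, "C": 20, "T": 10, "G": 40})
print({sym: len(bits) for sym, bits in sorted(codes.items())})
# {'A': 2, 'C': 3, 'G': 1, 'T': 3}
```

The exact bit patterns depend on how ties and left/right children are chosen, so only the code lengths are shown.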

(6)

Greedy Algorithms: Huffman Coding

A DNA sequence has four characters A, C, T, G and these characters appear with frequency 30%, 20%, 10%, and 40% respectively.

We have to encode a sequence of length 1 million (10^6) in bits.

If we use two bits for each character, then the size of the encoding will be 2 million bits.

Huffman coding:

f(A) = 10, f(C) = 110, f(T) = 111, f(G) = 0

We will need 1.9 million bits, since the average code length is 0.3*2 + 0.2*3 + 0.1*3 + 0.4*1 = 1.9 bits per character.
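The arithmetic for the DNA example can be checked with a short script. The helper names `is_prefix_free` and `encoded_bits`, and the particular 2-bit code `fixed`, are hypothetical and chosen only for illustration. Frequencies are kept as integer percentages to avoid floating-point rounding.

```python
def is_prefix_free(code):
    """True if no codeword is a prefix of a different codeword."""
    words = list(code.values())
    return not any(a != b and b.startswith(a) for a in words for b in words)


def encoded_bits(freq_percent, code, length):
    """Total bits needed for `length` characters with the given frequencies."""
    bits_per_100_chars = sum(freq_percent[s] * len(code[s]) for s in code)
    return bits_per_100_chars * length // 100


freq_percent = {"A": 30, "C": 20, "T": 10, "G": 40}
huffman = {"A": "10", "C": "110", "T": "111", "G": "0"}  # code from the slide
fixed = {"A": "00", "C": "01", "T": "10", "G": "11"}     # assumed 2-bit code

print(is_prefix_free(huffman))                     # True
print(encoded_bits(freq_percent, fixed, 10**6))    # 2000000
print(encoded_bits(freq_percent, huffman, 10**6))  # 1900000
```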

(7)

Greedy Algorithms

For some problems, even though the greedy strategy does not give an optimal solution, it might give a solution that is provably close to the optimal solution.

(8)

Greedy Approximation: Examples

Let S be a set containing n elements. A set of subsets {S1,…,Sm} of S is called a covering set if each element in S is present in at least one of the subsets S1,…,Sm.

Problem (Set Cover): Given a set S containing n elements and m subsets S1,…,Sm of S. Find a covering set of S of minimum cardinality.

Example:

S = {a, b, c, d, e, f}

S1={a, b}, S2={a,c}, S3={a,c}, S4={d,e,f}, S5={e, f}

{S1,S2,S3,S4} is a covering set.

{S1,S2,S4} is a covering set of minimum size.
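For an instance this small, the minimum covering set can be confirmed by exhaustive search. The sketch below tries all combinations of subsets in increasing size; the function name `min_cover_bruteforce` is an assumption, and the approach takes exponential time in m, so it is only a sanity check for tiny examples.

```python
from itertools import combinations


def min_cover_bruteforce(universe, subsets):
    """Return a smallest tuple of subset names covering `universe`, or None."""
    names = sorted(subsets)
    for size in range(1, len(names) + 1):
        for combo in combinations(names, size):
            covered = set().union(*(subsets[name] for name in combo))
            if covered >= universe:
                return combo
    return None  # the subsets do not cover the universe


S = set("abcdef")
subsets = {
    "S1": {"a", "b"},
    "S2": {"a", "c"},
    "S3": {"a", "c"},
    "S4": {"d", "e", "f"},
    "S5": {"e", "f"},
}
print(min_cover_bruteforce(S, subsets))  # ('S1', 'S2', 'S4')
```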


Application: There are n villages and the government is trying to decide in which villages to open schools so that it has to open the minimum number of schools. The constraint is that no child should have to walk more than 3 miles to get to a school.


Greedy strategy: Give preference to the subset that covers the largest number of uncovered elements.

GreedySetCover(S, S1, …, Sm)
- T = {}; R = S
- While R is not empty
    - Pick a subset Si that covers the maximum number of elements in R
    - T = T U {Si}; R = R - Si
- Return T

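Counterexample:

S={a,b,c,d,e,f,g,h}, S1={a,b,c,d,e}, S2={a,b,c,f}, S3={d,e,g,h}

Greedy picks S1 first (it covers 5 elements) and then still needs both S3 and S2, giving 3 subsets, while {S2,S3} is a covering set of size 2.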
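The greedy strategy and the counterexample can be reproduced with a short sketch. The function name `greedy_set_cover` and the dictionary representation of the subsets are assumptions made for illustration; ties are broken in favor of the subset listed first.

```python
def greedy_set_cover(universe, subsets):
    """Greedy set cover: repeatedly pick the subset covering most of R."""
    remaining = set(universe)  # R in the pseudocode
    chosen = []                # T in the pseudocode
    while remaining:
        best = max(subsets, key=lambda name: len(subsets[name] & remaining))
        if not subsets[best] & remaining:
            raise ValueError("the subsets do not cover the universe")
        chosen.append(best)
        remaining -= subsets[best]
    return chosen


S = set("abcdefgh")
subsets = {
    "S1": set("abcde"),
    "S2": set("abcf"),
    "S3": set("degh"),
}
print(greedy_set_cover(S, subsets))   # ['S1', 'S3', 'S2']
print(subsets["S2"] | subsets["S3"] == S)  # True: two subsets suffice
```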

(12)

Greedy Approximation: Examples

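Claim: Let k be the optimal cardinality of the covering set. Then the greedy algorithm outputs a covering set with cardinality at most k*ln(n).

Proof: Let N_t be the number of uncovered elements after t iterations of the loop, so N_0 = n.

Claim: N_t ≤ (1 – 1/k)*N_(t-1)

The N_(t-1) uncovered elements are covered by the k subsets of an optimal solution, so some subset covers at least N_(t-1)/k of them, and the greedy choice covers at least as many.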

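Claim: N_(k*ln(n)) < 1

Applying the previous claim repeatedly gives N_t ≤ n*(1 – 1/k)^t. Using the fact that (1 – x) ≤ e^(-x), with equality only for x = 0, we get N_(k*ln(n)) < n*e^(-ln(n)) = 1. Since the number of uncovered elements is an integer, no element remains uncovered after k*ln(n) iterations.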


(14)

Greedy Approximation: Examples

Problem (Minimum Makespan): You have m identical machines and n jobs. For each job i, you are given its duration d(i), the time required by any machine to perform this job. Assign these n jobs to the m machines such that the maximum finishing time is minimized.

Example:

[Figure: example jobs with durations 10, 40, 5, 30, 60, 35]


Greedy Strategy: Assign the next job to a machine with least load

[Figure: the greedy strategy applied step by step to the example jobs with durations 10, 40, 5, 30, 60, 35]

Is this the optimal solution?

(22)

Greedy Approximation: Examples

Let OPT be the optimal value.

Let G denote the maximum finishing time of a machine as per the greedy assignment.

Claim: G ≤ 2 * OPT

Claim: OPT ≥ (d(1) + … + d(n))/m

Claim: For any job t, OPT ≥ d(t)

Let the jth machine finish last. Let i be the last job assigned to machine j. Let s be the start time of job i on machine j.

Claim: s ≤ (d(1) + … + d(n))/m

GreedyMakespan
- While some job is not yet assigned
    - Assign the next job to a machine with least load
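GreedyMakespan can be sketched with a min-heap of machine loads. The function name `greedy_makespan` is hypothetical, and the choice of m = 3 machines for the example durations is an assumption, since the number of machines in the lecture's example is not recoverable from the text. The script also prints the two lower bounds on OPT used in the analysis.

```python
import heapq


def greedy_makespan(durations, m):
    """Assign jobs in the given order, each to a machine with least load.

    Returns (makespan, assignment), where assignment[j] lists the
    durations placed on machine j.
    """
    heap = [(0, j) for j in range(m)]  # (current load, machine index)
    heapq.heapify(heap)
    assignment = [[] for _ in range(m)]
    for d in durations:
        load, j = heapq.heappop(heap)  # machine with least load
        assignment[j].append(d)
        heapq.heappush(heap, (load + d, j))
    return max(load for load, _ in heap), assignment


jobs = [10, 40, 5, 30, 60, 35]
m = 3  # assumed number of machines
G, assignment = greedy_makespan(jobs, m)
print(G, assignment)          # 70 [[10, 60], [40], [5, 30, 35]]
print(sum(jobs) / m, max(jobs))  # 60.0 60  (both are lower bounds on OPT)
```

Under this assumption the greedy schedule is not optimal: the assignment {60}, {40, 10, 5}, {35, 30} finishes at time 65.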

(26)

Greedy Approximation: Examples

Let OPT be the optimal value.

Let G denote the maximum finishing time of a machine as per the greedy assignment.

Claim: G ≤ 2 * OPT

Proof:

Claim 1: OPT ≥ (d(1) + … + d(n))/m, since some machine must carry at least the average load.

Claim 2: For any job t, OPT ≥ d(t), since job t runs entirely on one machine.

Let the jth machine finish last. Let i be the last job assigned to machine j. Let s be the start time of job i on machine j.

Claim 3: s ≤ (d(1) + … + d(n))/m, since machine j had the least load when job i was assigned, so every machine had load at least s at that time.

So, G = s + d(i)

This implies G ≤ (d(1) + … + d(n))/m + d(i) (from Claim 3)

This implies G ≤ OPT + d(i) (from Claim 1)

This implies G ≤ OPT + OPT = 2 * OPT (from Claim 2)

(27)

End

Problems to think about:

1. Consider the following algorithm for the minimum makespan problem:

- Sort the jobs in decreasing order of duration. Let L be the sorted list of jobs.
- While some job is not yet assigned
    - Assign the next job in L to a machine with least load

Let G be the maximum finishing time as per the greedy algorithm above and let OPT be the maximum finishing time as per the optimal schedule. Show that G ≤ (4/3)*OPT.
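The sorted variant is easy to experiment with before attempting the proof. The sketch below is only an illustration under assumptions: the function name `sorted_greedy_makespan` is hypothetical, and the sample durations and m = 3 are chosen for the demonstration. Running it does not establish the 4/3 bound.

```python
import heapq


def sorted_greedy_makespan(durations, m):
    """Greedy makespan after sorting jobs in decreasing order of duration."""
    loads = [0] * m  # a list of equal values is already a valid min-heap
    for d in sorted(durations, reverse=True):
        least = heapq.heappop(loads)       # machine with least load
        heapq.heappush(loads, least + d)
    return max(loads)


jobs = [10, 40, 5, 30, 60, 35]
print(sorted_greedy_makespan(jobs, 3))  # 65
```

On this input the sorted order gives 65, compared with 70 when the same jobs are assigned in their original order on 3 machines.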
