Huffman Coding
Huffman coding (Gonzalez et al. 2008, A.Wong Lecture 16b 2011)
•Popular lossless data compression technique
•Removes coding redundancy (in the context of image processing, redundant image data)
•Often used in compression standards:
▫JPEG
▫MPEG-1, 2, 4
•Uses variable-length coding
▫Encodes each source symbol using a code table built from the estimated occurrence probability of that symbol, so more probable symbols get shorter codes
▫Resulting data size depends on underlying image characteristics
Steps
1.) Determine the histogram of the image
2.) Construct the Huffman tree
3.) Encode the image using codes generated from the Huffman tree
1.) Histogram
http://courses.cs.washington.edu/courses/cse373/02wi/slides/ImageADT/sld012.htm
7 3 3 3
3 3 3 3
1 1 9 9
5 6 9 9
[Figure: histogram of the image; x-axis: Intensity, y-axis: Frequency]
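The histogram step can be sketched in a few lines of Python. This is a minimal illustration, assuming the 4 x 4 image above is stored as a list of rows; the names `image` and `histogram` are chosen here for illustration and do not come from the lecture.

```python
from collections import Counter

# The 4 x 4 example image from the slide, one list per row.
image = [
    [7, 3, 3, 3],
    [3, 3, 3, 3],
    [1, 1, 9, 9],
    [5, 6, 9, 9],
]

# Count how often each intensity occurs across all pixels.
histogram = Counter(pixel for row in image for pixel in row)

# Print one "intensity frequency" pair per line, ordered by intensity.
for intensity in sorted(histogram):
    print(intensity, histogram[intensity])
```

The counts sum to 16, the number of pixels in the image.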
2) Huffman tree
Intensity Frequency
1 2
3 7
5 1
6 1
7 1
9 4
Sorted from lowest to highest frequency:
Intensity Frequency
5 1
6 1
7 1
1 2
9 4
3 7
[Figure: histogram of the image; x-axis: Intensity, y-axis: Frequency]
Merge 1: intensity 5 (frequency 1) + intensity 6 (frequency 1) → parent node with frequency 2
•Lower frequency becomes left child node
•Higher frequency becomes right child node
•Sum of the two children nodes becomes the parent node
Merge 2: intensity 7 (frequency 1) + node {5, 6} (frequency 2) → parent node with frequency 3
Merge 3: intensity 1 (frequency 2) + node {7, 5, 6} (frequency 3) → parent node with frequency 5
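Merge 4: intensity 9 (frequency 4) + node {1, 7, 5, 6} (frequency 5) → parent node with frequency 9
Merge 5: intensity 3 (frequency 7) + node {9, 1, 7, 5, 6} (frequency 9) → root node with frequency 16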
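The tree construction can be sketched with a priority queue. This is one possible implementation, not the lecture's own code: `build_huffman_tree` is a hypothetical helper, and the tie-breaking rule (a merged node wins a frequency tie against a leaf) is an assumption chosen so that the sketch reproduces the merge order shown on the slides. Other tie-breaking rules give different but equally short codes.

```python
import heapq

# Histogram from step 1: intensity -> frequency.
histogram = {1: 2, 3: 7, 5: 1, 6: 1, 7: 1, 9: 4}

def build_huffman_tree(freqs):
    """Return (root, merges). A leaf is an intensity; an internal node is
    a (left, right) tuple with the lower-frequency child on the left."""
    # Heap entries are (frequency, tie_breaker, node).
    # Assumption: leaves are tie-broken by their position in the sorted
    # table, and merged nodes get negative tie-breakers so that a merged
    # node is taken before a leaf of equal frequency.
    leaves = sorted(freqs.items(), key=lambda item: (item[1], item[0]))
    heap = [(freq, order, sym) for order, (sym, freq) in enumerate(leaves)]
    heapq.heapify(heap)
    merges = []
    next_tie = -1
    while len(heap) > 1:
        f_left, _, left = heapq.heappop(heap)    # lowest frequency -> left child
        f_right, _, right = heapq.heappop(heap)  # next lowest -> right child
        parent_freq = f_left + f_right           # parent is the sum of its children
        merges.append((f_left, f_right, parent_freq))
        heapq.heappush(heap, (parent_freq, next_tie, (left, right)))
        next_tie -= 1
    return heap[0][2], merges

root, merges = build_huffman_tree(histogram)
for f_left, f_right, parent_freq in merges:
    print(f_left, "+", f_right, "->", parent_freq)
```

With the example histogram, the parent frequencies come out as 2, 3, 5, 9 and finally 16 at the root, matching the number of pixels.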
3) Encoding
•Label each left branch 1
•Label each right branch 0
•Follow the path from the root to the leaf of interest, appending a 1 or 0 for each branch traversed
Intensity Encoding
3 1
9 01
1 001
5 00001
6 00000
7 0001
[Figure: Huffman tree with each left branch labelled 1 and each right branch labelled 0]
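The labelling rule can be sketched as a recursive walk over the tree. The tree is hard-coded here as nested (left, right) tuples, which is a representation chosen for this illustration; `assign_codes` is a hypothetical helper name.

```python
# Huffman tree from step 2 as nested (left, right) tuples.
# A leaf is an intensity value; the lower-frequency child is on the left.
tree = (3, (9, (1, (7, (5, 6)))))

def assign_codes(node, prefix=""):
    """Walk from the root, appending 1 for a left branch and 0 for a right."""
    if not isinstance(node, tuple):      # reached a leaf (an intensity)
        return {node: prefix}
    left, right = node
    codes = assign_codes(left, prefix + "1")
    codes.update(assign_codes(right, prefix + "0"))
    return codes

codes = assign_codes(tree)

# Print the table with the shortest codes first.
for intensity, code in sorted(codes.items(), key=lambda kv: len(kv[1])):
    print(intensity, code)
```

The resulting dictionary matches the Intensity/Encoding table above.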
What’s awesome here?
[Figure: histogram of the image; x-axis: Intensity, y-axis: Frequency]
Intensity Encoding
1 001
3 1
5 00001
6 00000
7 0001
9 01
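One way to quantify the pattern in the table above (the most frequent intensities receive the shortest codes) is to count the bits needed for the whole image. The sketch below uses the histogram and code table from this example. The fixed-length baseline of 3 bits per pixel is an assumption made here for comparison, being the smallest width that can distinguish six intensities; it is not a figure from the lecture.

```python
import math

# Histogram and Huffman codes from the example above.
histogram = {1: 2, 3: 7, 5: 1, 6: 1, 7: 1, 9: 4}
codes = {1: "001", 3: "1", 5: "00001", 6: "00000", 7: "0001", 9: "01"}

total_pixels = sum(histogram.values())

# Each intensity contributes (frequency x code length) bits.
huffman_bits = sum(histogram[s] * len(codes[s]) for s in histogram)

# Assumed baseline: a fixed-length code just wide enough for 6 symbols.
fixed_bits_per_pixel = math.ceil(math.log2(len(histogram)))
fixed_bits = total_pixels * fixed_bits_per_pixel

print("Huffman bits:", huffman_bits)
print("Fixed-length bits:", fixed_bits)
print("Average Huffman bits per pixel:", huffman_bits / total_pixels)
```

For this image the Huffman code needs 35 bits (about 2.19 bits per pixel) against 48 bits for the assumed 3-bit fixed-length code.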
Real Example
•Which values are going to have the shortest code?
Q8.8 in textbook
How many unique Huffman codes are there for a three-symbol source?
Construct them.
Let’s assume the three symbols are A, B and C, where the probabilities of A, B and C are in order from lowest to highest.
What would the Huffman tree look like?
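[Figure: Huffman tree with example weights A = 1, B = 3, C = 5. A and B merge into a node of weight 4; that node (left branch, label 1) and C (right branch, label 0) merge into the root of weight 9]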
A 11
B 10
C 0
What would happen if the probability of C was less than the sum of A and B?
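[Figure: Huffman tree with example weights A = 1, B = 3, C = 3. A and B merge into a node of weight 4; C (left branch, label 1) and that node (right branch, label 0) merge into the root of weight 7]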
A 01
B 00
C 1
•Therefore there are two unique codes for a three-symbol source
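•Notice the two sets of code words are complements of each other ({11, 10, 0} and {00, 01, 1}); for each individual symbol, only the leading bit changes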
First tree:
A 11
B 10
C 0
Second tree:
A 01
B 00
C 1
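The relationship between the two code tables can be checked with a short sketch. The helper names `complement` and `is_prefix_free` are hypothetical and introduced only for this illustration. The check compares the two sets of code words and confirms that both codes are prefix-free, which is what makes them uniquely decodable.

```python
# The two three-symbol codes from the slides.
code_1 = {"A": "11", "B": "10", "C": "0"}
code_2 = {"A": "01", "B": "00", "C": "1"}

def complement(word):
    """Flip every bit in a code word."""
    return "".join("1" if bit == "0" else "0" for bit in word)

def is_prefix_free(code):
    """True if no code word is a prefix of a different code word."""
    words = list(code.values())
    return not any(a != b and b.startswith(a) for a in words for b in words)

words_1 = set(code_1.values())
words_2 = set(code_2.values())

# Complementing every word of the first code yields the words of the second.
print({complement(w) for w in words_1} == words_2)

# Both codes can be decoded unambiguously.
print(is_prefix_free(code_1), is_prefix_free(code_2))
```

Both codes also use the same code lengths (one 1-bit word and two 2-bit words), so they compress equally well for the same symbol probabilities.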
Q8.10 in textbook
Using the Huffman code in Fig. 8.8, decode the encoded string:
0101000001010111110100
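Decoding a Huffman bit string works by reading bits left to right and emitting a symbol as soon as the bits read so far match a code word. The code table of Fig. 8.8 is not reproduced in these slides, so the sketch below uses the intensity code from the earlier example instead; the same procedure would apply to the textbook's table. `decode` is a hypothetical helper name.

```python
# Intensity code table from the earlier example (not the Fig. 8.8 code).
codes = {3: "1", 9: "01", 1: "001", 7: "0001", 5: "00001", 6: "00000"}

def decode(bits, codes):
    """Decode a bit string by matching code words from left to right."""
    lookup = {word: symbol for symbol, word in codes.items()}
    symbols, current = [], ""
    for bit in bits:
        current += bit
        if current in lookup:       # prefix-free code: first match is the symbol
            symbols.append(lookup[current])
            current = ""
    if current:
        raise ValueError("bit string ends in the middle of a code word")
    return symbols

# Round trip: encode the example image row by row, then decode it again.
pixels = [7, 3, 3, 3, 3, 3, 3, 3, 1, 1, 9, 9, 5, 6, 9, 9]
encoded = "".join(codes[p] for p in pixels)
print(encoded)
print(decode(encoded, codes))
```

Because no code word is a prefix of another, the decoder never needs to look ahead or backtrack.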