Binary Trees
Kyunghan Lee
Networked Computing Lab (NXC Lab)
Department of Electrical and Computer Engineering
Outline
□ In this talk, we will look at the binary tree data structure:
§ Definition
§ Properties
§ A few applications
• Ropes (strings)
• Expression trees
Definition
□ The arbitrary number of children in general trees is often unnecessary—many real-life trees are restricted to two branches
§ Expression trees using binary operators
§ An ancestral tree of an individual, parents, grandparents, etc.
□ There are also issues with general trees:
§ There is no natural order between a node and its children
Definition
□ A binary tree is a restriction where each node has exactly two children:
§ Each child is either empty or another binary tree
§ This restriction allows us to label the children as left and right subtrees
□ At this point, recall that lg(n) = Θ(log_b(n)) for any base b > 1
Definition
□ We will also refer to the two sub-trees as
§ The left-hand sub-tree, and
§ The right-hand sub-tree
Definition
□ Sample variations on binary trees with five nodes:
Q. How to count the number of all possible variations with n nodes?
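One way to approach the counting question, sketched under the assumption that only the shape of the tree matters (the nodes carry no labels): a tree with n nodes consists of a root, a left sub-tree with k nodes and a right sub-tree with n – 1 – k nodes, so the count satisfies C(0) = 1 and C(n) = Σ C(k)·C(n – 1 – k) for k = 0, …, n – 1. These are the Catalan numbers. The function name `count_binary_trees` is hypothetical.

```cpp
#include <cstdint>
#include <vector>

// Number of distinct binary-tree shapes with n unlabelled nodes.
// Hypothetical helper; the 64-bit result overflows for n larger than about 35.
std::uint64_t count_binary_trees(unsigned n) {
    std::vector<std::uint64_t> count(n + 1, 0);
    count[0] = 1;  // the empty tree
    for (unsigned m = 1; m <= n; ++m) {
        // k nodes go to the left sub-tree, m - 1 - k to the right
        for (unsigned k = 0; k < m; ++k) {
            count[m] += count[k] * count[m - 1 - k];
        }
    }
    return count[n];
}
```

For five nodes, `count_binary_trees(5)` gives 42 shapes.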
Definition
□ A full node is a node where both the left and right sub-trees are non-empty trees
Definition
□ A full binary tree is where each node is either:
§ A full node, or
§ A leaf node
□ These have applications in
§ Expression trees
§ Huffman encoding
Size
□ The recursive size function runs in Θ(n) time and Θ(h) memory
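template <typename Type>
int Binary_node<Type>::size() const {
    if ( left == nullptr ) {
        // No left sub-tree: count this node plus the right sub-tree, if any
        return ( right == nullptr ) ? 1 : 1 + right->size();
    } else {
        // Left sub-tree exists: add the right sub-tree only if it exists
        return ( right == nullptr ) ? 1 + left->size() :
                                      1 + left->size() + right->size();
    }
}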
Height
□ The recursive height function also runs in Θ(n) time and Θ(h) memory
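template <typename Type>
int Binary_node<Type>::height() const {
    if ( left == nullptr ) {
        // A leaf has height 0; otherwise one more than the right sub-tree
        return ( right == nullptr ) ? 0 : 1 + right->height();
    } else {
        // std::max requires #include <algorithm>
        return ( right == nullptr ) ? 1 + left->height() :
                                      1 + std::max( left->height(), right->height() );
    }
}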
Run Times
□ Recall that with linked lists and arrays, some operations would run in Θ(n) time
□ The run times of operations on binary trees, as we will see, depend on the height of the tree
□ We will see that:
§ The worst-case height is clearly Θ(n)
§ Under average conditions, the height is Θ(√n)
§ The best-case height is Θ(ln(n))
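The two extremes can be seen by building the trees directly. The sketch below is illustrative only: `Node`, `make_chain` and `make_balanced` are hypothetical names, not the lecture's `Binary_node` class, and an empty tree is assigned height –1 by assumption so that a single node has height 0.

```cpp
#include <algorithm>
#include <memory>
#include <utility>

// Minimal node type for illustration; it stores no element.
struct Node {
    std::unique_ptr<Node> left;
    std::unique_ptr<Node> right;
};

// Height of a tree; the empty tree is given height -1 (assumption).
int height(const Node *node) {
    if (node == nullptr) return -1;
    return 1 + std::max(height(node->left.get()), height(node->right.get()));
}

// Worst case: every node has only a right child (a linked list in disguise).
std::unique_ptr<Node> make_chain(int n) {
    std::unique_ptr<Node> root;
    for (int i = 0; i < n; ++i) {
        auto node = std::make_unique<Node>();
        node->right = std::move(root);
        root = std::move(node);
    }
    return root;
}

// Best case: split the remaining n - 1 nodes as evenly as possible.
std::unique_ptr<Node> make_balanced(int n) {
    if (n <= 0) return nullptr;
    auto node = std::make_unique<Node>();
    node->left = make_balanced((n - 1) / 2);
    node->right = make_balanced(n - 1 - (n - 1) / 2);
    return node;
}
```

With 15 nodes, the chain has height 14 (that is, n – 1) while the balanced tree has height 3 (that is, ⌊lg(n)⌋).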
Run Times
□ If we can achieve and maintain a height of Θ(lg(n)), many operations on the binary tree can run in Θ(lg(n)) time
□ Logarithmic time is not significantly worse than constant time:
lg(1000) ≈ 10                     kB
lg(1 000 000) ≈ 20                MB
lg(1 000 000 000) ≈ 30            GB
lg(1 000 000 000 000) ≈ 40        TB
lg(1000^n) ≈ 10n
http://xkcd.com/394/
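The approximations above can be checked by counting how many times a value can be halved (integer division) before it reaches zero, which is ⌊lg(n)⌋ + 1. This is a small sketch; the function name `halving_steps` is hypothetical.

```cpp
// Number of integer halvings needed to reduce n to 0, i.e. floor(lg(n)) + 1
// for n > 0. Hypothetical helper for illustration.
int halving_steps(unsigned long long n) {
    int steps = 0;
    while (n > 0) {
        n /= 2;
        ++steps;
    }
    return steps;
}
```

A trillion items need only 40 halvings, so an operation that discards half the remaining data at each step stays cheap even for very large n.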
Application: Ropes
□ In 1995, Boehm et al. introduced the idea of a rope, or a heavyweight string
Application: Ropes
□ Alpha-numeric data is stored using a string of characters
§ A character (or char) is a numeric value from 0 to 255 where certain numbers represent certain letters
§ ASCII code: http://www.asciitable.com/
□ For example,
Character  Decimal  Binary (base 2)
‘A’        65       01000001
‘B’        66       01000010
‘a’        97       01100001
‘b’        98       01100010
‘ ’        32       00100000
Application: Ropes
□ A C-style string is an array of characters followed by the character with a numeric value of 0
□ One problem with using arrays is the run time required to concatenate two strings
Application: Ropes
□ Concatenating two strings requires the operations of:
§ Allocating more memory, and
§ Copying both strings, which takes Θ(n + m) time for strings of lengths n and m
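The two steps can be sketched for C-style strings as follows. The function name `concatenate` is hypothetical, and the sketch assumes the caller releases the result with `delete[]`.

```cpp
#include <cstddef>
#include <cstring>

// Returns a newly allocated C-style string holding a followed by b.
// The caller owns the result and must release it with delete[].
char *concatenate(const char *a, const char *b) {
    std::size_t n = std::strlen(a);
    std::size_t m = std::strlen(b);
    char *result = new char[n + m + 1];  // allocate more memory
    std::memcpy(result, a, n);           // copy the n characters of a
    std::memcpy(result + n, b, m + 1);   // copy the m characters of b and its '\0'
    return result;
}
```

Every character of both strings is copied once, which is where the Θ(n + m) run time comes from.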
Application: Ropes
□ The rope data structure:
§ Stores strings in the leaves,
§ Internal nodes (full) represent the concatenation of the two strings, and
§ Represents the string with the right sub-tree concatenated onto the end of the left
□ The previous concatenation may now occur in Θ(1) time
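A minimal sketch of this idea, not the structure from the Boehm et al. paper: leaves hold strings, and concatenation only allocates one internal node pointing at the two existing ropes. The names `Rope`, `make_leaf`, `concatenate` and `flatten` are hypothetical, and sharing sub-trees through `std::shared_ptr` is a design assumption made for this example.

```cpp
#include <memory>
#include <string>
#include <utility>

struct Rope {
    std::string text;                   // used only by leaf nodes
    std::shared_ptr<const Rope> left;   // both set for internal (full) nodes
    std::shared_ptr<const Rope> right;
};
using RopePtr = std::shared_ptr<const Rope>;

// A leaf stores a piece of the string.
RopePtr make_leaf(std::string s) {
    auto node = std::make_shared<Rope>();
    node->text = std::move(s);
    return node;
}

// An internal node represents left followed by right.
// No characters are copied, so this is a constant-time operation.
RopePtr concatenate(RopePtr a, RopePtr b) {
    auto node = std::make_shared<Rope>();
    node->left = std::move(a);
    node->right = std::move(b);
    return node;
}

// Appends the represented string to out, visiting leaves left to right.
void append_to(const RopePtr &rope, std::string &out) {
    if (!rope) return;
    if (!rope->left) {          // leaf
        out += rope->text;
        return;
    }
    append_to(rope->left, out);
    append_to(rope->right, out);
}

// Recovers the ordinary string; this takes time linear in its length.
std::string flatten(const RopePtr &rope) {
    std::string out;
    append_to(rope, out);
    return out;
}
```

The cost has not disappeared: it moves from concatenation to operations such as `flatten`, which must walk the whole tree.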
Application: Ropes
□ The string
□ may be represented using the rope
Application: Expression Trees
□ Any basic mathematical expression containing binary operators may be represented using a binary tree
□ For example, 3(4a + b + c) + d/5 + (6 – e)
Figure 4.12 Worst-case binary tree
4.2.1 Implementation
Because a binary tree node has at most two children, we can keep direct links to them. The declaration of tree nodes is similar in structure to that for doubly linked lists, in that a node is a structure consisting of the element information plus two pointers ( left and right ) to other nodes (see Fig. 4.13).
We could draw the binary trees using the rectangular boxes that are customary for linked lists, but trees are generally drawn as circles connected by lines, because they are actually graphs. We also do not explicitly draw nullptr links when referring to trees, because every binary tree with N nodes would require N + 1 nullptr links.
Binary trees have many important uses not associated with searching. One of the principal uses of binary trees is in the area of compiler design, which we will now explore.
4.2.2 An Example: Expression Trees
Figure 4.14 shows an example of an expression tree. The leaves of an expression tree are operands, such as constants or variable names, and the other nodes contain operators.
This particular tree happens to be binary, because all the operators are binary, and although this is the simplest case, it is possible for nodes to have more than two children. It is also possible for a node to have only one child, as is the case with the unary minus operator.
We can evaluate an expression tree, T, by applying the operator at the root to the values obtained by recursively evaluating the left and right subtrees.
struct BinaryNode {
    Object element;      // The data in the node
    BinaryNode *left;    // Left child
    BinaryNode *right;   // Right child
};
Figure 4.13 Binary tree node class (pseudocode)
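Evaluating an expression tree means applying the operator at the root to the values of its two sub-trees, each computed the same way. The sketch below assumes only the four binary operators and numeric leaves (no variables); `ExprNode`, `leaf`, `node` and `evaluate` are hypothetical names and are not the class in Figure 4.13.

```cpp
#include <memory>
#include <stdexcept>
#include <utility>

struct ExprNode {
    char op = 0;          // '+', '-', '*' or '/' for internal nodes; 0 for a leaf
    double value = 0.0;   // used only when op == 0
    std::unique_ptr<ExprNode> left;
    std::unique_ptr<ExprNode> right;
};

// Builds a leaf holding a constant operand.
std::unique_ptr<ExprNode> leaf(double v) {
    auto n = std::make_unique<ExprNode>();
    n->value = v;
    return n;
}

// Builds an internal node holding a binary operator.
std::unique_ptr<ExprNode> node(char op, std::unique_ptr<ExprNode> l,
                               std::unique_ptr<ExprNode> r) {
    auto n = std::make_unique<ExprNode>();
    n->op = op;
    n->left = std::move(l);
    n->right = std::move(r);
    return n;
}

// Recursively evaluates both sub-trees, then applies the operator at the root.
double evaluate(const ExprNode &t) {
    if (t.op == 0) return t.value;
    double l = evaluate(*t.left);
    double r = evaluate(*t.right);
    switch (t.op) {
        case '+': return l + r;
        case '-': return l - r;
        case '*': return l * r;
        case '/': return l / r;
    }
    throw std::invalid_argument("unknown operator");
}
```

For example, the tree for 3 * (4 + 2) + 10 / 5 has `+` at the root and evaluates to 20.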
Application: Expression Trees
□ Observations:
§ Internal nodes store operators
§ Leaf nodes store literals or variables
§ No node has just one sub-tree
§ The order is not relevant for
• Addition and multiplication (commutative)
§ Order is relevant for
• Subtraction and division (non-commutative)
Application: Expression Trees
□ A post-order depth-first traversal converts such a tree to the reverse-Polish format
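A post-order traversal visits the left sub-tree, then the right sub-tree, and only then the node itself, so each operator is emitted after both of its operands. The following is a minimal sketch; `Node`, `make` and `to_rpn` are hypothetical names, and tokens are stored as strings for simplicity.

```cpp
#include <memory>
#include <string>
#include <utility>

struct Node {
    std::string token;   // an operator, a literal or a variable name
    std::unique_ptr<Node> left;
    std::unique_ptr<Node> right;
};

// Builds a leaf (no children) or an internal node (two children).
std::unique_ptr<Node> make(std::string token,
                           std::unique_ptr<Node> l = nullptr,
                           std::unique_ptr<Node> r = nullptr) {
    auto n = std::make_unique<Node>();
    n->token = std::move(token);
    n->left = std::move(l);
    n->right = std::move(r);
    return n;
}

// Post-order: left sub-tree, right sub-tree, then the node itself.
void post_order(const Node *n, std::string &out) {
    if (n == nullptr) return;
    post_order(n->left.get(), out);
    post_order(n->right.get(), out);
    if (!out.empty()) out += ' ';
    out += n->token;
}

// Returns the expression in reverse-Polish format, tokens separated by spaces.
std::string to_rpn(const Node &root) {
    std::string out;
    post_order(&root, out);
    return out;
}
```

The tree for (a + b) * (c - d) is converted to `a b + c d - *`.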
Application: Expression Trees
□ Humans think in in-order
□ Computers think in post-order:
§ Both operands must be loaded into registers
§ The operation is then called on those registers
□ Most programming languages use in-order notation (C, C++, Python, Java, C#, etc.)
§ Necessary to translate in-order into post-order
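To illustrate why post-order suits a machine, the sketch below evaluates a reverse-Polish expression with a stack standing in for the registers: operands are pushed as they arrive, and each operator pops its two operands and pushes the result. It assumes space-separated tokens, numeric operands and the four binary operators; the function name `evaluate_rpn` is hypothetical.

```cpp
#include <sstream>
#include <stack>
#include <stdexcept>
#include <string>

// Evaluates a space-separated reverse-Polish expression such as "3 4 + 2 *".
double evaluate_rpn(const std::string &expr) {
    std::istringstream in(expr);
    std::stack<double> operands;
    std::string token;
    while (in >> token) {
        if (token == "+" || token == "-" || token == "*" || token == "/") {
            if (operands.size() < 2) {
                throw std::invalid_argument("too few operands");
            }
            // Both operands are already available when the operator arrives
            double rhs = operands.top(); operands.pop();
            double lhs = operands.top(); operands.pop();
            if (token == "+")      operands.push(lhs + rhs);
            else if (token == "-") operands.push(lhs - rhs);
            else if (token == "*") operands.push(lhs * rhs);
            else                   operands.push(lhs / rhs);
        } else {
            operands.push(std::stod(token));  // a numeric literal
        }
    }
    if (operands.size() != 1) {
        throw std::invalid_argument("malformed expression");
    }
    return operands.top();
}
```

No parentheses or precedence rules are needed: the order of the tokens alone determines the result.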
Summary
□ In this talk, we introduced binary trees
§ Each node has two distinct and identifiable sub-trees
§ Either sub-tree may optionally be empty
§ The sub-trees are ordered relative to each other
□ We looked at:
§ Properties
§ Applications
Perfect Binary Trees
Kyunghan Lee
Networked Computing Lab (NXC Lab)
Department of Electrical and Computer Engineering Seoul National University
https://nxc.snu.ac.kr
Outline
□ Introducing perfect binary trees
§ Definitions and examples
§ Number of nodes: 2^(h + 1) – 1
§ Logarithmic height
§ Number of leaf nodes: 2^h
§ Applications
Definition
□ Standard definition:
§ A perfect binary tree of height h is a binary tree where
• All leaf nodes have the same depth h
• All non-leaf nodes are full
Definition
□ Recursive definition:
§ A binary tree of height h = 0 is perfect
§ A binary tree of height h (> 0) is perfect if both sub-trees are perfect binary trees of height h – 1
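The recursive definition translates almost directly into a check. In this sketch the empty tree is treated as perfect with height –1, an assumption made so that a single node (two empty sub-trees) comes out as perfect with height 0. `Node`, `perfect_height`, `is_perfect` and `make_perfect` are hypothetical names.

```cpp
#include <memory>

struct Node {
    std::unique_ptr<Node> left;
    std::unique_ptr<Node> right;
};

constexpr int NOT_PERFECT = -2;

// Returns the height of the tree if it is perfect, NOT_PERFECT otherwise.
// The empty tree is treated as perfect with height -1 (assumption).
int perfect_height(const Node *n) {
    if (n == nullptr) return -1;
    int l = perfect_height(n->left.get());
    int r = perfect_height(n->right.get());
    // Both sub-trees must be perfect and of the same height h - 1
    if (l == NOT_PERFECT || r == NOT_PERFECT || l != r) return NOT_PERFECT;
    return l + 1;
}

bool is_perfect(const Node *n) {
    return perfect_height(n) != NOT_PERFECT;
}

// Builds a perfect binary tree of height h, following the recursive definition.
std::unique_ptr<Node> make_perfect(int h) {
    if (h < 0) return nullptr;
    auto n = std::make_unique<Node>();
    n->left = make_perfect(h - 1);
    n->right = make_perfect(h - 1);
    return n;
}
```

A node with only one child fails the check, because its two sub-trees have different heights.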
Examples
□ Perfect binary trees of height h = 0, 1, 2, 3 and 4
Theorems
□ Four theorems that describe the properties of perfect binary trees:
§ A perfect tree of height h has 2^(h + 1) – 1 nodes
§ The height is Θ(ln(n))
§ There are 2^h leaf nodes
§ The average depth of a node is Θ(ln(n))
□ These theorems determine the optimal run-time properties of operations on binary trees
2^(h + 1) – 1 Nodes
□ Theorem
A perfect binary tree of height h has 2^(h + 1) – 1 nodes
Proof:
We will use mathematical induction:
1. Show that it is true for h = 0
2. Assume it is true for an arbitrary h
3. Show that the truth for h implies the truth for h + 1
2^(h + 1) – 1 Nodes
□ The base case:
§ When h = 0 we have a single node n = 1
§ The formula is correct: 2^(0 + 1) – 1 = 1
2^(h + 1) – 1 Nodes
□ The inductive step:
§ Assume that if the height of the tree is h, the number of nodes is n = 2^(h + 1) – 1
2^(h + 1) – 1 Nodes
□ We must show that a tree of height h + 1 has n = 2^((h + 1) + 1) – 1 = 2^(h + 2) – 1 nodes
2^(h + 1) – 1 Nodes
□ Using the recursive definition, both sub-trees are perfect trees of height h
§ By assumption, each sub-tree has 2^(h + 1) – 1 nodes
§ Therefore, the total number of nodes is
(2^(h + 1) – 1) + 1 + (2^(h + 1) – 1) = 2^(h + 2) – 1
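The inductive step is also a recurrence that can be run: a perfect tree of height h is a root plus two perfect trees of height h – 1. The sketch below compares that recurrence with the closed form; both function names are hypothetical, and the 64-bit arithmetic assumes h ≤ 62.

```cpp
#include <cstdint>

// Node count from the recursive definition: left sub-tree + root + right sub-tree.
std::uint64_t nodes_by_recurrence(unsigned h) {
    if (h == 0) return 1;  // base case: a single node
    std::uint64_t sub_tree = nodes_by_recurrence(h - 1);
    return sub_tree + 1 + sub_tree;
}

// Node count from the theorem: 2^(h + 1) - 1 (valid here for h <= 62).
std::uint64_t nodes_by_formula(unsigned h) {
    return (std::uint64_t{1} << (h + 1)) - 1;
}
```

The two functions agree for every height tried, which is a sanity check on the algebra rather than a replacement for the proof.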
Logarithmic Height
□ Theorem
A perfect binary tree with n nodes has height lg(n + 1) – 1
Proof
Solving n = 2^(h + 1) – 1 for h:
n + 1 = 2^(h + 1)
lg(n + 1) = h + 1
h = lg(n + 1) – 1
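The formula h = lg(n + 1) – 1 can be evaluated with integer arithmetic by counting how many times n + 1 can be halved before reaching 1. This sketch assumes n is the size of a perfect binary tree, so that n + 1 is a power of two; the function name `height_from_node_count` is hypothetical.

```cpp
#include <cstdint>

// Height of a perfect binary tree with n nodes: lg(n + 1) - 1.
// Assumes n + 1 is a power of two (and n + 1 does not overflow);
// for other n the result is floor(lg(n + 1)) - 1.
int height_from_node_count(std::uint64_t n) {
    int h = -1;
    for (std::uint64_t m = n + 1; m > 1; m /= 2) {
        ++h;
    }
    return h;
}
```

For example, 31 nodes give height 4 and 1023 nodes give height 9.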
Logarithmic Height
□ Lemma
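lg(n + 1) – 1 = Θ(ln(n))
Proof
\[
\lim_{n \to \infty} \frac{\lg(n + 1) - 1}{\ln(n)}
= \lim_{n \to \infty} \frac{\dfrac{1}{(n + 1)\ln(2)}}{\dfrac{1}{n}}
= \lim_{n \to \infty} \frac{n}{(n + 1)\ln(2)}
= \lim_{n \to \infty} \frac{1}{\ln(2)}
= \frac{1}{\ln(2)}
\]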
2^h Leaf Nodes
□ Theorem
A perfect binary tree with height h has 2^h leaf nodes
Proof (by induction):
When h = 0, there is 2^0 = 1 leaf node.
Assume that a perfect binary tree of height h has 2^h leaf nodes.
Then observe that both sub-trees of a perfect binary tree of height h + 1 have 2^h leaf nodes, giving 2 · 2^h = 2^(h + 1) leaf nodes in total.
□ Consequence: over half of all nodes are leaf nodes: 2^h / (2^(h + 1) – 1) > 1/2
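The leaf count and its consequence can be checked on an actual tree by traversal. This is a minimal sketch; `Node`, `make_perfect`, `count_nodes` and `count_leaves` are hypothetical names.

```cpp
#include <memory>

struct Node {
    std::unique_ptr<Node> left;
    std::unique_ptr<Node> right;
};

// Builds a perfect binary tree of height h (an empty tree for h < 0).
std::unique_ptr<Node> make_perfect(int h) {
    if (h < 0) return nullptr;
    auto n = std::make_unique<Node>();
    n->left = make_perfect(h - 1);
    n->right = make_perfect(h - 1);
    return n;
}

// Total number of nodes in the tree.
int count_nodes(const Node *n) {
    if (n == nullptr) return 0;
    return 1 + count_nodes(n->left.get()) + count_nodes(n->right.get());
}

// Number of nodes with no children.
int count_leaves(const Node *n) {
    if (n == nullptr) return 0;
    if (!n->left && !n->right) return 1;
    return count_leaves(n->left.get()) + count_leaves(n->right.get());
}
```

For height 4 the traversal finds 16 leaves among 31 nodes, so slightly more than half of the nodes are leaves.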
The Average Depth of a Node
□ The average depth of a node in a perfect binary tree is
\[
\frac{\sum_{k=0}^{h} k\,2^{k}}{2^{h+1} - 1}
= \frac{h\,2^{h+1} - 2^{h+1} + 2}{2^{h+1} - 1}
= \frac{h(2^{h+1} - 1) - (2^{h+1} - 1) + 1 + h}{2^{h+1} - 1}
= h - 1 + \frac{h + 1}{2^{h+1} - 1}
\approx h - 1 = \Theta(\ln(n))
\]
§ The numerator is the sum of the depths; the denominator is the number of nodes
Depth:  0  1  2  3  4   5
Count:  1  2  4  8  16  32
Applications
□ Perfect binary trees are considered to be the ideal case
§ The height and average depth are both Θ(ln(n))
□ We will attempt to find trees which are as close as possible to perfect binary trees for efficient operations
Summary
□ We have defined perfect binary trees and discussed:
§ The number of nodes: n = 2^(h + 1) – 1
§ The height: lg(n + 1) – 1
§ The number of leaves: 2^h
§ Over half the nodes are leaves
§ The average depth is Θ(ln(n))
§ It is an ideal case