Trees#
Tree data structure has a root, branches, and leaves. The difference between a tree in nature and a tree in computer science is that a tree data structure has its root at the top and its leaves on the bottom.
Trees are used in many areas of computer science, including operating systems, graphics, database systems, and computer networking. Tree data structures have many things in common with their botanical cousins.
Example
Taxonomic classification tree from biology.
Computer file system for storing data. In a file system, directories, or folders, are structured as a tree.
A web page written in HTML is a tree.
Prperties of trees#
First property of trees is that trees are hierarchical, trees are structured in layers with the more general things near the top and the more specific things near the bottom.
Second property of trees is that all of the children of one node are independent of the children of another node.
Third property is that each leaf node is unique.
If each node in the tree has a maximum of two children, we say that the tree is a binary tree.
Tree Traversals#
Visitation of the nodes of a tree is called a “traversal”. The three traversals we will look at are called preorder, inorder, and postorder.
Pre-Order: visit the root node first, then recursively do preorder traversal of the left subtree, followed by recursive preorder traversal of the right subtree.
In-Order: recursively do an inorder traversal on the left subtree, visit the root node, and finally do a recursive inorder traversal of the right subtree.
Post-Order: recursively do a postorder traversal of the left subtree and the right subtree followed by a visit to the root node.
class BinaryTree:
def __init__(self,obj):
self.key = obj
self.left_child = None
self.right_child = None
def insert_left(self,newNode):
if self.left_child == None:
self.left_child = BinaryTree(newNode)
else:
t = BinaryTree(newNode)
t.left_child = self.left_child
self.left_child = t
def insert_right(self,newNode):
if self.right_child == None:
self.right_child = BinaryTree(newNode)
else:
t = BinaryTree(newNode)
t.right_child = self.right_child
self.right_child = t
def get_right_child(self):
return self.right_child
def get_left_child(self):
return self.left_child
def set_root_val(self,obj):
self.key = obj
def get_root_val(self):
return self.key
def pre_order_traversal(self):
traversal = ""
traversal += f"{self.key} "
if self.left_child:
traversal += self.left_child.pre_order_traversal()
if self.right_child:
traversal += self.right_child.pre_order_traversal()
return traversal
def post_order_traversal(self):
traversal = ""
if self.left_child:
traversal += self.left_child.pre_order_traversal()
if self.right_child:
traversal += self.right_child.pre_order_traversal()
traversal += f"{self.key} "
return traversal
def in_order_traversal(self):
traversal = ""
if self.left_child:
traversal += self.left_child.pre_order_traversal()
traversal += f"{self.key} "
if self.right_child:
traversal += self.right_child.pre_order_traversal()
return traversal
t = BinaryTree("a")
t.insert_left("b")
t.insert_right("c")
print(t.get_root_val())
print(t.get_left_child().get_root_val())
print(t.get_right_child().get_root_val())
lc = t.get_left_child()
lc.insert_left("d")
lc.insert_right("e")
print(lc.get_root_val())
print(lc.get_left_child().get_root_val())
print(lc.get_right_child().get_root_val())
rc = t.get_right_child()
rc.insert_left("f")
rc.insert_right("g")
print(rc.get_root_val())
print(rc.get_left_child().get_root_val())
print(rc.get_right_child().get_root_val())
print(t.in_order_traversal())
print(t.pre_order_traversal())
print(t.post_order_traversal())
a
b
c
b
d
e
c
f
g
b d e a c f g
a b d e c f g
b d e c f g a
Practice Problems#
Write a Python function to create a parse tree for mathemetical expressions like ((3 * 2) + (12 / 4))
.#
def build_parse_tree(exp):
pass
Binary Search Tree#
Binary search trees is another way to map from a key to a value. In this case we are not interested in the exact placement of items in the tree, but we are interested in using the binary tree structure to provide for efficient searching.
Binary Search Tree is a node-based binary tree data structure which has the following properties:
The left subtree of a node contains only nodes with keys lesser than the node’s key.
The right subtree of a node contains only nodes with keys greater than the node’s key.
The left and right subtree each must also be a binary search tree.
Binary Search Tree Abstract Data Type#
The Binary Search Tree abstract data type is defined by the following structure and operations.
Binary Search Tree Operations#
BinarySearchTree()
Create a new, empty binary search tree.put(key, val)
Add a new key-value pair to the binary search tree. If the key is already in the binary search tree then replace the old value with the new value.get(key)
Given a key, return the value stored in the binary search tree or None otherwise.remove(key)
Remove the key-value pair from the binary search tree.size()
Return the number of key-value pairs stored in the binary search tree.
bst = BinarySearchTree()
creates a binary search tree that starts out empty.
class TreeNode:
def __init__(self, key, value, parent):
self.key = key
self.value = value
self.left = None
self.right = None
self.parent = parent
def get_key(self):
return self.key
def get_value(self):
return self.value
class BinarySearchTree:
def __init__(self):
self.root = None
self.num_nodes = 0
def is_empty(self):
return self.root is None
def put(self, key, value):
self.num_nodes += 1
self.root = self._put(self.root, key, value)
def _put(self, node, key, value, parent = None):
if node is None:
node = TreeNode(key, value, parent)
else:
if key < node.key:
node.left = self._put(node.left, key, value, node)
elif key > node.key:
node.right = self._put(node.right, key, value, node)
else:
node.value = value
return node
def get(self, key):
return self._get(self.root, key)
def _get(self, node, key):
if node is None:
raise Exception(f"Tree node with key {key} does not exist")
else:
if key < node.key:
node = self._get(node.left, key)
elif key > node.key:
node = self._get(node.right, key)
return node
def remove(self, key):
self.num_nodes -= 1
node = self.get(key)
if node.right is None and node.left is None:
self._reassign_nodes(node, None)
elif node.left is None:
self._reassign_nodes(node, node.right)
elif node.right is None:
self._reassign_nodes(node, node.left)
else:
lowest_node = self._get_lowest_node(node.right)
lowest_node.left = node.left
lowest_node.right = node.right
node.left.parent = lowest_node
if node.right:
node.right.parent = lowest_node
self._reassign_nodes(node, lowest_node)
def _reassign_nodes(self, node, new_children):
if new_children:
new_children.parent = node.parent
if node.parent:
if node.parent.right == node:
node.parent.right = new_children
else:
node.parent.left = new_children
else:
self.root = new_children
def _get_lowest_node(self, node):
if node.left:
lowest_node = self._get_lowest_node(node.left)
else:
lowest_node = node
self._reassign_nodes(node, node.right)
return lowest_node
bst = BinarySearchTree()
keys = [5, 4, 1, 2, 3, 9, 8, 7, 6, 13, 10, 14, 11]
values = ["a", "b", "c", "d", "e", "f", "g", "h", "i", "j", "k", "l", "m"]
for i in range(13):
bst.put(keys[i], values[i])
print(bst.get(4).value)
print(bst.get(1).value)
print(bst.get(7).value)
bst.put(3, "r")
print(bst.get(3).value)
# bst.get(4)
b
c
h
r
Search Tree Analysis#
A perfectly balanced tree has the same number of nodes in the left subtree as the right subtree. In a balanced binary tree, the worst-case performance of put, get is O(logn)
, where n is the number of nodes in the tree. So log n
gives us the height of the tree, and represents the maximum number of comparisons that put will need to do as it searches for the proper place to insert a new node.
Performance of the binary search tree can degrade to O(n)
for operations like get and put when the tree becomes unbalanced.
AVL Tree#
AVL tree and is named for its inventors: G.M. Adelson-Velskii and E.M. Landis. AVL Tree is a special kind of binary search tree that automatically makes sure that the tree remains balanced at all times. An AVL tree implements the Map abstract data type just like a regular binary search tree, the only difference is in how the tree performs.
To implement our AVL tree we need to keep track of a balance factor for each node in the tree, balance factor for each node in the tree is calculated by looking at the heights of the left and right subtrees. More formally, we define the balance factor for a node as the difference between the height of the left subtree and the height of the right subtree.
balanceFactor = height(leftSubTree) − height(rightSubTree)
Using the definition for balance factor given above we say that a subtree is left-heavy if the balance factor is greater than zero. If the balance factor is less than zero then the subtree is right heavy. If the balance factor is zero then the tree is perfectly in balance. A tree is considered to be in balanced if the balance factor is -1, 0, or 1. Once the balance factor of a node in a tree is outside this range we will need to have a procedure to bring the tree back into balance.