Blog Cover

Binary Search Trees: What are they and how they work

Author profile image
Aitor Alonso

Jul 28, 2024

9 min read

Binary Search Trees (BSTs) are a fundamental data structure in computer science. They provide efficient methods for storing, searching, and managing data. In this article, we'll explore what BSTs are, how they work, and how to implement them.

What is a Binary Search Tree?

A Binary Search Tree is a binary tree where each node has at most two children, referred to as the left child and the right child. For each node:

  • The left subtree contains only nodes with values less than the node's value.
  • The right subtree contains only nodes with values greater than the node's value.

This property makes BSTs useful for operations like searching, insertion, and deletion.

Okay, now that the basics are covered, let's see how to code and work with a BST! I'll use Python for this example, just because it's very popular and looks like pseudo-code, but the concepts are the same for other programming languages.

Code definition

To create a BST data structure, we just need to model the nodes of the graph. Each node will have a value, and pointers to the left and right children, like so:

class Node:
    def __init__(self, value):
        self.value = value
        self.left = None
        self.right = None

So to create a tree, we just create the root node with a value:

tree = Node(10) # 10 is the root node

And that's it! Now, let's see how to implement the basic operations of a BST. We don't like anemic models, so we'll implement the insertion, search, and deletion operations as methods of the Node class. Let's go!

Basic Operations

As I like to learn by example, let's take a look at how to implement a BST in Python.

Insertion

To insert a value into a BST, we start at the root and compare the value to be inserted with the current node's value. If the value is less, we move to the left child; if greater, we move to the right child. We repeat this process until we find an empty spot where the new value can be inserted.

def insert(self, root, key):
    # If the root is None, create a new node with the key and return it
    if root is None:
        return Node(key)
    else:
        # If the key is already present in the tree, return the root
        if root.value == key:
            return root
        # If the key is greater than the root's value, insert it in the right subtree
        if root.value < key:
            root.right = self.insert(root.right, key)
        # If the key is less than the root's value, insert it in the left subtree
        else:
            root.left = self.insert(root.left, key)
    # Return the root after insertion
    return root

Let's see this in action!

tree = Node(10)
tree.insert(tree, 5)
tree.insert(tree, 15)
tree.insert(tree, 3)
tree.insert(tree, 7)
tree.insert(tree, 12)
tree.insert(tree, 18)
The BST after inserting some nodes

Search

To search for a value in a BST, we start at the root and compare the value with the current node's value. If the value matches, we return the node. If the value is less, we search the left subtree; if greater, we search the right subtree. We repeat that algorithm until we find the node or the node or we end in a leaf node.

def search(self, root, key):
    # Base case: return the node if it's found or None if it's not
    if root is None or root.value == key:
        return root
    # If the key is greater than the root's value, search in the right subtree
    if root.value < key:
        return self.search(root.right, key)
    # If the key is less than the root's value, search in the left subtree
    return self.search(root.left, key)

This will return the node we're looking for, or None if the node is not found.

node = tree.search(tree, 12)
print(node and node.value) # 12
Searching for value 12

Deletion

Deleting a node from a BST involves three main cases:

  1. The node to be deleted is a leaf node.
  2. The node to be deleted has one child.
  3. The node to be deleted has two children.

If the node to be deleted is a leaf node, we just need to remove the node from the tree. However, if the node has one or two children, we need to handle the deletion differently. For a node with one child, we replace the node with its child. For a node with two children, we find the inorder successor (the smallest node in the right subtree), replace the node's value with the inorder successor's value, and then delete the inorder successor.

The inorder successor is the smallest node in the right subtree of the node to be deleted (i.e., the leftmost node in the right subtree, or the following node if we order them from lowest to highest).

def delete(self, root, key):
    # Base case: if the root is None, return None
    if root is None:
        return root
    # If the key to be deleted is smaller than the root's key,
    # then it lies in the left subtree
    if key < root.value:
        root.left = self.delete(root.left, key)
    # If the key to be deleted is greater than the root's key,
    # then it lies in the right subtree
    elif key > root.value:
        root.right = self.delete(root.right, key)
    # If the key is the same as the root's key, then this is the node
    # to be deleted
    else:
        # Node with only one child or no child
        if root.left is None:
            return root.right
        elif root.right is None:
            return root.left
        # Node with two children: Get the inorder successor
        # (smallest in the right subtree)
        temp = self.minValueNode(root.right)
        # Copy the inorder successor's content to this node
        root.value = temp.value
        # Delete the inorder successor
        root.right = self.delete(root.right, temp.value)
    return root

def minValueNode(self, node):
    # Loop to find the leftmost leaf
    current = node
    while current.left is not None:
        current = current.left
    return current

Let's see this in action, this time, with a bigger tree. This is how our BST initially looks like:

BST prior to deleting

Now, let's delete the node with value 5.

tree = tree.delete(tree, 5)

The resulting tree is a follows:

BST after deleting node "5"

Caveats

The efficiency of a BST depends on its balance. If the tree is unbalanced, the search, insertion, and deletion operations can take O(n) time, where n is the number of nodes in the tree. Deletions can affect the balance of the tree, but also insert orders. THe resulting tree will differ depending on the order of insertions. For example:

tree = Node(10)
tree.insert(tree, 5)
tree.insert(tree, 15)
tree.insert(tree, 3)
tree.insert(tree, 7)
tree.insert(tree, 12)
tree.insert(tree, 18)
tree.insert(tree, 6)
tree.insert(tree, 8)
tree.insert(tree, 2)
tree.insert(tree, 4)

will produce a tree like this:

BST after inserting nodes in an specific order

But if we change the order of insertion like so:

tree = Node(10)
tree.insert(tree, 18)
tree.insert(tree, 15)
tree.insert(tree, 12)
tree.insert(tree, 8)
tree.insert(tree, 7)
tree.insert(tree, 6)
tree.insert(tree, 5)
tree.insert(tree, 4)
tree.insert(tree, 3)
tree.insert(tree, 2)

We end up with a quite different tree:

BST after inserting nodes in a different order

Sometimes, it might be needed to balance the tree when there is a lot of unbalanced nodes, like in the last example. A simple way to improve the balance of a tree is the following:

def balanceBST(self, root):
    # Helper function to store the in-order traversal of the BST
    def storeInOrder(node, inorder_nodes):
        if not node:
            return
        storeInOrder(node.left, inorder_nodes)
        inorder_nodes.append(node.value)
        storeInOrder(node.right, inorder_nodes)

    # Helper function to build a balanced BST from sorted nodes
    def buildBalancedBST(nodes, start, end):
        if start > end:
            return None
        mid = (start + end) // 2
        node = Node(nodes[mid])
        node.left = buildBalancedBST(nodes, start, mid - 1)
        node.right = buildBalancedBST(nodes, mid + 1, end)
        return node

    # Store the in-order traversal of the BST
    inorder_nodes = []
    storeInOrder(root, inorder_nodes)

    # Build and return the balanced BST
    return buildBalancedBST(inorder_nodes, 0, len(inorder_nodes) - 1)

Conclusion

Binary Search Trees are a powerful tool for managing ordered data. They provide efficient algorithms for insertion, search, and deletion, making them a fundamental component in computer science. Understanding BSTs and their operations is crucial for any programmer looking to master data structures and algorithms.

All the code from this article can be found on Github.


I hope my article has helped you, or at least, that you have enjoyed reading it. I do this for fun and I don't need money to keep the blog running. However, if you'd like to show your gratitude, you can pay for my next coffee(s) with a one-time donation of just $1.00. Thank you!

No by AICC-BY 4.0