Understanding Prefix Trees

An overview of the Trie data structure

6 min readJan 8, 2024

About

This post explains the trie data structure and provides a simple implementation. The goal is to provide a quick reference for easy reviewing.

Tries are k-ary search trees. The nodes are mostly used to represent characters while traversing the trie locates words.

Usecase
Design
Implementation
Complexity
Examples

The Usecase

Tries — or prefix trees — are useful for sequential word search. Common features like autocomplete or spell-checking are more efficient when implemented with a prefix tree.

The Design

Overview

The high-level overview of a trie is straightforward.

There’s a root node that represents the head of your trie.
The children of this root node represent all unique starting letters of the words in your trie.
The process continues: the second letters are children of these nodes, and so forth.
The end of each word is explicitly mentioned, say, with a boolean flag in that node.

A trie data structure. The * indicates word endings: Image by Author created using draw.io

Operations

We’ll define the following operations for our trie data structure:

add(word: str) : This operation adds a string to the trie.
search(word: str) : This operation checks if word exists in the trie
starts_with(word: str) : This operation checks if word is the prefix of any string in the trie.
remove(word: str) : This operation removes word from the trie

The Implementation

The TrieNode class

We’ll first define a TrieNode class. Each node in our trie will be an instance of this class.

As discussed in the design section, each node will have the following attributes to construct a simple trie:

a character value: val
a boolean flag to indicate if a word ends on this node: is_end
and a dictionary of children of the form str: TrieNode : children

class TrieNode:
    def __init__(self, val=None):
        self.val = val
        self.children = dict()
        self.is_end = False

Now, let’s define the Trie class and its methods.

The Trie class

We first instantiate the Trie’s root node. It’s an instance of our TrieNode class.

class Trie:

    def __init__(self):
        self.rootNode = TrieNode()

The add method

Next, let’s implement the add method. At a high level, it works like this:

start at the root node.
For every character in word, check if that character already exists in one of the current node’s children.
If it does, set the current node to the child that represents that character.
If it doesn’t, add a child that represents that character.
Set the last node’s is_end attribute to True, since it now represents the word we just added.

def add(self, word: str) -> None:
    """
    insert word into the trie
    """
    curr_node = self.rootNode

    for char in word:
        if char not in curr_node.children:
            curr_node.children[char] = TrieNode(char)

        curr_node = curr_node.children[char]

    curr_node.is_end = True

The search method

This method follows a similar logic for traversing the trie.

Traverse the trie, looking for children that represent the current character of the word we’re searching for.
If at any step, there’s no such child, return False.
Once the last character has been found, check if that node has is_end set to True. Return True only if that’s the case.

def search(self, word: str) -> bool:
    """
    check if word exists in the trie
    """
    curr_node = self.rootNode

    for char in word:
        if char not in curr_node.children:
            return False

        curr_node = curr_node.children[char]

    return curr_node.is_end

The starts_with method

The method is exactly like the search method, except that it doesn’t matter if the last node we land on represents a complete word or not.

def starts_with(self, prefix: str) -> bool:
    """
    check if word is the prefix of some string
    in the trie.
    """
    curr_node = self.rootNode

    for char in prefix:
        if char not in curr_node.children:
            return False

        curr_node = curr_node.children[char]

    return True

The remove method

This is the most complex method to implement so far. While there are a few ways to implement this, the general logic goes like this:

Traverse the trie looking for each character in the word to be removed.
Once the last node is found, set its is_end attribute to False, since it no longer represents the word we’ve just removed.
Next, if this last node has no children, we can simply delete it. Likewise, its parent node may also not be needed.
Traverse up the trie deleting nodes that are no longer needed — that is, they have no children.

def remove(self, word: str) -> None:
    """
    Remove word from the trie.
    """
    curr_node = self.rootNode
    branch = [curr_node]

    # traverse the tree and keep track of the nodes encountered
    for char in word:
        curr_node = curr_node.children[char]
        branch.append(curr_node)

    # set the last node's is_end to False
    curr_node.is_end = False

    # go back up the tree, deleting nodes that no longer have any children
    n = len(branch)
    for i in range(n-2, -1):
        j = i + 1
        if not branch[j].children:
            del branch[i].children[branch[j].val]

    return None

This implementation of remove assumes that word exists in the trie. But handling that edge case is very straightforward, so I won’t be discussing that.

The remove method can also be conveniently implemented with a stack.

The complete implementation

class TrieNode:
    def __init__(self, val=None):
        self.val = val
        self.children = dict()
        self.is_end = False


class Trie:

    def __init__(self):
        self.rootNode = TrieNode()

    def add(self, word: str) -> None:
        """
        insert word into the trie
        """
        curr_node = self.rootNode

        for char in word:
            if char not in curr_node.children:
                curr_node.children[char] = TrieNode(char)

            curr_node = curr_node.children[char]

        curr_node.is_end = True

    def search(self, word: str) -> bool:
        """
        check if word exists in the trie
        """
        curr_node = self.rootNode

        for char in word:
            if char not in curr_node.children:
                return False

            curr_node = curr_node.children[char]

        return curr_node.is_end

    def starts_with(self, word: str) -> bool:
        """
        check if word is the prefix of some string
        in the trie.
        """
        curr_node = self.rootNode

        for char in word:
            if char not in curr_node.children:
                return False

            curr_node = curr_node.children[char]

        return True

    def remove(self, word: str) -> None:
        """
        Remove word from the trie.
        """
        curr_node = self.rootNode
        branch = [curr_node]
    
        # traverse the tree and keep track of the nodes encountered
        for char in word:
            curr_node = curr_node.children[char]
            branch.append(curr_node)
    
        # set the last node's is_end to False
        curr_node.is_end = False
    
        # go back up the tree, deleting nodes that no longer have any children
        n = len(branch)
        for i in range(n-2, -1):
            j = i + 1
            if not branch[j].children:
                del branch[i].children[branch[j].val]
    
        return None

Usage

Using this implementation comes down to instantiating the Trie class and using its methods to add and search for words.

trie = Trie();
trie.add("apple");

trie.search("apple");    # return True
trie.search("app");      # return False
trie.starts_with("app"); # return True

trie.add("app");
trie.search("app");      # return True

Complexity Analysis

Considering an average word length of n , average insertion time is linear as we iterate through each character, creating a node when required.

Searching and prefix searching (startswith) are also linear, since we use the same traversal logic.

Removal is linear as well since we go down the trie once and then go up.

Add: O(n)
Search: O(n)
Prefix-search: O(n)
Remove: O(n)

Examples

These are some questions on Leetcode to give you a good idea of how tries work in practice.

Implement Trie (Prefix Tree) - LeetCode

Can you solve this real interview question? Implement Trie (Prefix Tree) - A trie [https://en.wikipedia.org/wiki/Trie]…

leetcode.com

Design Add and Search Words Data Structure - LeetCode

Can you solve this real interview question? Design Add and Search Words Data Structure - Design a data structure that…

leetcode.com

Word Search II - LeetCode

Can you solve this real interview question? Word Search II - Given an m x n board of characters and a list of strings…

leetcode.com

Understanding Prefix Trees

An overview of the Trie data structure

About

Table of Contents

The Usecase

The Design

Overview

Operations

The Implementation

The TrieNode class

The Trie class

The add method

The search method

The starts_with method

The remove method

The complete implementation

Usage

Complexity Analysis

Examples

Implement Trie (Prefix Tree) - LeetCode

Can you solve this real interview question? Implement Trie (Prefix Tree) - A trie [https://en.wikipedia.org/wiki/Trie]…

Design Add and Search Words Data Structure - LeetCode

Can you solve this real interview question? Design Add and Search Words Data Structure - Design a data structure that…

Word Search II - LeetCode

Can you solve this real interview question? Word Search II - Given an m x n board of characters and a list of strings…

Written by Aniruddha Karajgi

Responses (1)