Intro to Trie (Prefix Tree) – FEAT. Leetcode 208. Implement Trie

A trie, or a prefix tree, refers to an ordered data structure that is usually used to store strings (Wikipedia). The picture on the above is an example where a trie stores [ape,apple,are,art]. Now let’s take a closer look at it and it’s not hard to find that:

  • Each path from the root to leaves forms a string.
  • Each node except for the root node contains a value.
  • All the descendants of a node share a common prefix associated with that node.

The great part about trie is that, by taking more space, it has O(k), k=len(string) time complexities for both inserting and searching, regardless of the number of stored strings.

Note: the idea of implementing this in Java and Python are inspired by @Hua Hua.

Implementations

$ means it’s a word instead of a prefix

There are some differences when implementing it in Java and Python due to the language’s nature (as referred to static language vs dynamic language by @Hua Hua). This post will show both of them, with Java coming first.

Let’s take Leetcode 208. Implement Trie as an example to walk through the implementations for trie.

Implement a trie with insertsearch, and startsWith methods.
Example:
Trie trie = new Trie(); 
trie.insert("apple"); 
trie.search("apple"); // returns true 
trie.search("app"); // returns false 
trie.startsWith("app"); // returns true trie.insert("app"); trie.search("app"); // returns true

Java

Firstly we need to define a node class, which maintains an array children so that children[c-'a'] refers to its subtree for the letter c. Besides, we also need a boolean variable isWord to indicate the string is a word or only a prefix.

// Java
class TrieNode {
    public boolean isWord;
    public TrieNode[] children;
    public TrieNode(){
        children = new TrieNode[26];
        isWord = false;
    }
}

Inserting

We start with a dummy head and for each letter in the word, we would check if it’s in children. If so, we would move into that node and process the next letter; or if not, we can create a node for that letter.

    // Java
    public void insert(String word) {
        TrieNode cur = root;
        for (int i=0;i<word.length();i++){
            int index = word.charAt(i) - 'a';
            if (cur.children[index] == null){
                cur.children[index] = new TrieNode();
            }
            cur = cur.children[index];
        }
        cur.isWord = true;
    }

Searching

Similar to inserting, we start with the first letter of the word and if we find one that is null, we could return false directly. A little difference between search(String word) and startsWith(String prefix) is that the former one needs to check isWord. Putting all together, the Java code is as follows:

class TrieNode {
    public boolean isWord;
    public TrieNode[] children;
    public TrieNode(){
        children = new TrieNode[26];
        isWord = false;
    }
    
}

class Trie {
    private TrieNode root;
    /** Initialize your data structure here. */
    public Trie() {
        root = new TrieNode();
    }
    
    /** Inserts a word into the trie. */
    public void insert(String word) {
        TrieNode cur = root;
        for (int i=0;i<word.length();i++){
            int index = word.charAt(i) - 'a';
            if (cur.children[index] == null){
                cur.children[index] = new TrieNode();
            }
            cur = cur.children[index];
        }
        cur.isWord = true;
    }
    
    /** Returns if the word is in the trie. */
    public boolean search(String word) {
        TrieNode cur = root;
        for (int i=0;i<word.length();i++){
            int index = word.charAt(i) - 'a';
            if (cur.children[index] == null){
                return false;
            }
            cur = cur.children[index];
        }
        return cur.isWord;
    }
    
    /** Returns if there is any word in the trie that starts with the given prefix. */
    public boolean startsWith(String prefix) {
        TrieNode cur = root;
        for (int i=0;i<prefix.length();i++){
            int index = prefix.charAt(i) - 'a';
            if (cur.children[index] == null){
                return false;
            }
            cur = cur.children[index];
        }
        return true;
    }
}

Python

For the dynamic programming language Python, by taking advantage of the dictionary (hashmap), we could have an efficient implementation instead of translating it directly from Java.

When we use Java, we first define a node class and in each node object, we would store its children – potentially 26 nodes – in an array. With Python, we use a dictionary as a node, and its children are stored as key-value pairs where keys are from ['a'-'z'] and values are also dictionaries.

class Trie:

    def __init__(self):
        """
        Initialize your data structure here.
        """
        self.root = {}
        

    def insert(self, word: str) -> None:
        """
        Inserts a word into the trie.
        """
        p = self.root
        for char in word:
            if not p.get(char):
                p[char] = {}
            p = p[char]
        p['$'] = True
            
    def search(self, word: str) -> bool:
        """
        Returns if the word is in the trie.
        """
        p = self.root
        for char in word:
            if p.get(char):
                p = p[char]
            else:
                return False
        return p.get('$')
        

    def startsWith(self, prefix: str) -> bool:
        """
        Returns if there is any word in the trie that starts with the given prefix.
        """
        p = self.root
        for char in prefix:
            if p.get(char):
                p = p[char]
            else:
                return False
        return True

Reference

Subscribe
Notify of
guest
0 Comments
Inline Feedbacks
View all comments