Trie Kotlin

Emanuel Moecklin
4 min readSep 14, 2022

--

Implementing an optimized Trie in Kotlin or a short story on why I love Kotlin. Before reading on, make sure you understand how Tries work (not explained here). E.g.: Wikipedia, Article 1, Article 2 ...

TL;DR go straight to the code: https://github.com/1gravity/kotlin-trie.

Node

A typical node looks like this (Wikipedia):

structure Node
Children Node[Alphabet-Size]
Is-Terminal Boolean
Value Data-Type
end structure

Many implementations don’t have the Value parameter, they only allow to search for keys: fun search(key: String): Boolean.
My implementation stores aValue → the signature of the search function is: fun search(key: String): Value?:

data class Node<Value>(
val children: MutableMap<Char, Node<Value>> = mutableMapOf(),
var value: Value? = null,
)

Note that there’s no Is-Terminal property (needed for deletion) because Is-Terminal is the same as node.value != null.

Insert

The basic implementation according to the pseudo code on Wikipedia:

The code traverses the Trie following the insert key and inserts missing nodes if needed. The last node will be set to value.

To traverse data structures in Kotlin the fold function is great. We can use the root node as initial value and return the child node in the lambda function till we reach the leaf node that becomes fold’s return value:

Now all we need is adding missing nodes:

If a node is missing (node.children[char] == null), we create a new one and insert it.

Thanks to fold, scope functions and the Elvis operator we got this from 10 lines of code down to 5.

Search

The basic implementation according to the pseudo code on Wikipedia:

Search is very similar to insert. Instead of adding nodes when finding a leaf, it returns null (key not found) and instead of inserting the value into the leaf, search returns it’s value (return currentNode.value).

Analogous to insert, we use the fold function to traverse the Trie:

If a leaf is found, its value is returned. If no leaf is found null is returned. From 10 lines of code down to 3.

Delete

The basic implementation converted from The Trie Data Structure in Java:

The delete function is a recursive function calling itself with an increasing index used to address characters in key.

The first statement ( if (index == key.length) is a stop condition: if we reach the last character of the search key, we found the node to be deleted. The function returns:

  • null if the node was a leaf (was because the value was just deleted)
  • the deleted node itself, if it still has children (not a leaf)

The second statement (val child = node...) is a stop condition too. If the next character can’t be found in the Trie → return null.

The rest of the code is the recursive part to traverse the Trie and delete the current node if the deletion further “down” the Trie made it “obsolete” (node.children.isEmpty() && node.value == null).

There’s a lot of redundant code in this implementation, e.g. it checks twice whether the node is obsolete (once in the first stop condition, then again after the delete function calls). Let’s see how this can be improved.

  1. The return value is not needed. All it does is signal whether a node should be removed from its parent (no value & no children). The parent node can check that itself: child.children.isEmpty() && child.value == null. That alone simplifies the stop condition considerably:
    if (index == key.length) node.value = null.
  2. The second stop condition becomes a simple null check (use a scope function): node.children[key[index]]?.run { ... }.
  3. The check to remove the child from its parent and the actual removal becomes a one liner:
    if (children...) node.children.remove(key[index]).

Putting all together we have:

From 17 lines of code down to 11.

Encore: Non-Recursive

Not everyone likes recursive algorithm so here’s an iterative delete implementation. To make this happen, we added a parent property to each Node (the insert function needed to be changed as well):

Note the use of foldRight to traverse the Trie bottom up (from leaf to parent, from search key right to left). The rest should be self-explaining.

Encore 2: Cleanup

A good amount of the delete code is needed to clean out unused nodes. If you don’t care about unnecessary memory allocations, delete can be simplified considerably:

Summary

While not every optimization uses Kotlin specific features, traversing the Trie is simple(r) using the fold function while scope functions replace verbose null condition checks and “side effects”.

As always, thanks for reading and for your feedback.

Happy coding!

PS: how does this show my love for Kotlin? Just check out the gazillion Trie implementations out there…

--

--

No responses yet