Trie Kotlin
Implementing an optimized Trie in Kotlin or a short story on why I love Kotlin. Before reading on, make sure you understand how Tries work (not explained here). E.g.: Wikipedia, Article 1, Article 2 ...
TL;DR go straight to the code: https://github.com/1gravity/kotlin-trie.
Node
A typical node looks like this (Wikipedia):
structure Node
Children Node[Alphabet-Size]
Is-Terminal Boolean
Value Data-Type
end structure
Many implementations don’t have the Value
parameter, they only allow to search for keys: fun search(key: String): Boolean.
My implementation stores aValue
→ the signature of the search function is: fun search(key: String): Value?
:
data class Node<Value>(
val children: MutableMap<Char, Node<Value>> = mutableMapOf(),
var value: Value? = null,
)
Note that there’s no Is-Terminal
property (needed for deletion) because Is-Terminal
is the same as node.value != null
.
Insert
The basic implementation according to the pseudo code on Wikipedia:
The code traverses the Trie following the insert key and inserts missing nodes if needed. The last node will be set to value.
To traverse data structures in Kotlin the fold function is great. We can use the root node as initial value and return the child node in the lambda function till we reach the leaf node that becomes fold’s return value:
Now all we need is adding missing nodes:
If a node is missing (node.children[char] == null
), we create a new one and insert it.
Thanks to fold, scope functions and the Elvis operator we got this from 10 lines of code down to 5.
Search
The basic implementation according to the pseudo code on Wikipedia:
Search is very similar to insert. Instead of adding nodes when finding a leaf, it returns null
(key not found) and instead of inserting the value into the leaf, search returns it’s value (return currentNode.value
).
Analogous to insert, we use the fold function to traverse the Trie:
If a leaf is found, its value is returned. If no leaf is found null
is returned. From 10 lines of code down to 3.
Delete
The basic implementation converted from The Trie Data Structure in Java:
The delete function is a recursive function calling itself with an increasing index used to address characters in key
.
The first statement ( if (index == key.length)
is a stop condition: if we reach the last character of the search key, we found the node to be deleted. The function returns:
null
if the node was a leaf (was because the value was just deleted)- the deleted node itself, if it still has children (not a leaf)
The second statement (val child = node...
) is a stop condition too. If the next character can’t be found in the Trie → return null
.
The rest of the code is the recursive part to traverse the Trie and delete the current node if the deletion further “down” the Trie made it “obsolete” (node.children.isEmpty() && node.value == null
).
There’s a lot of redundant code in this implementation, e.g. it checks twice whether the node is obsolete (once in the first stop condition, then again after the delete function calls). Let’s see how this can be improved.
- The return value is not needed. All it does is signal whether a node should be removed from its parent (no value & no children). The parent node can check that itself:
child.children.isEmpty() && child.value == null
. That alone simplifies the stop condition considerably:if (index == key.length) node.value = null
. - The second stop condition becomes a simple null check (use a scope function):
node.children[key[index]]?.run { ... }
. - The check to remove the child from its parent and the actual removal becomes a one liner:
if (children...) node.children.remove(key[index])
.
Putting all together we have:
From 17 lines of code down to 11.
Encore: Non-Recursive
Not everyone likes recursive algorithm so here’s an iterative delete implementation. To make this happen, we added a parent
property to each Node
(the insert function needed to be changed as well):
Note the use of foldRight
to traverse the Trie bottom up (from leaf to parent, from search key right to left). The rest should be self-explaining.
Encore 2: Cleanup
A good amount of the delete code is needed to clean out unused nodes. If you don’t care about unnecessary memory allocations, delete can be simplified considerably:
Summary
While not every optimization uses Kotlin specific features, traversing the Trie is simple(r) using the fold function while scope functions replace verbose null condition checks and “side effects”.
As always, thanks for reading and for your feedback.
Happy coding!
PS: how does this show my love for Kotlin? Just check out the gazillion Trie implementations out there…