Serialize a binary search tree compactly with preorder traversal and rebuild it using BST value bounds.
Problem Restatement
We need to design two methods:
| Method | Meaning |
|---|---|
serialize(root) | Convert a BST into a string |
deserialize(data) | Convert that string back into the original BST |
The tree is guaranteed to be a binary search tree.
There is no required format. Any format is valid as long as deserializing the serialized string reconstructs the same BST.
The encoded string should be as compact as possible. The number of nodes is in [0, 10^4], node values are in [0, 10^4], and the input tree is guaranteed to be a BST.
Input and Output
| Item | Meaning |
|---|---|
Input to serialize | Root of a BST |
Output from serialize | String encoding |
Input to deserialize | Serialized string |
Output from deserialize | Reconstructed BST root |
| Empty tree | Encoded as an empty string |
The node class is usually:
class TreeNode:
def __init__(self, val=0, left=None, right=None):
self.val = val
self.left = left
self.right = rightExamples
Example 1:
root = [2, 1, 3]A compact preorder serialization is:
2 1 3The reconstructed tree is:
2
/ \
1 3Example 2:
root = []The serialized string is:
and deserialization returns:
NoneFirst Thought: Serialize Like a Normal Binary Tree
For a general binary tree, we often serialize null markers:
2 1 # # 3 # #This works for any binary tree because the null markers preserve exact shape.
But this problem gives us a BST.
The BST property gives extra information:
| Subtree | Value rule |
|---|---|
| Left subtree | Values are smaller than root |
| Right subtree | Values are larger than root |
Because of this, we can avoid null markers and make the string more compact.
Key Insight
Preorder traversal of a BST is enough to reconstruct the tree.
Preorder order is:
root -> left subtree -> right subtreeFor example:
5
/ \
3 7
/ \ \
2 4 8Preorder serialization is:
5 3 2 4 7 8During deserialization, we read values from left to right.
For each recursive position, we know the valid range of values.
For the root, valid values are:
(-infinity, infinity)For the left child of 5, valid values are:
(-infinity, 5)For the right child of 5, valid values are:
(5, infinity)If the next value does not fit the current range, that subtree is empty.
This removes the need for null markers.
Algorithm
Serialization
Use preorder DFS.
For each node:
- Append
node.val. - Serialize the left subtree.
- Serialize the right subtree.
Join values with spaces.
Deserialization
Split the string into integers.
Use an index pointing to the next unread value.
Define:
build(low, high)This builds a subtree whose values must satisfy:
low < value < highInside build:
- If there are no values left, return
None. - Look at the next value without consuming it.
- If it is outside
(low, high), returnNone. - Otherwise consume it and create a node.
- Build its left subtree with bounds
(low, value). - Build its right subtree with bounds
(value, high). - Return the node.
Correctness
Serialization writes nodes in preorder: root first, then the entire left subtree, then the entire right subtree.
During deserialization, build(low, high) reconstructs exactly the subtree whose values belong in that range.
If the next preorder value lies outside the current range, then it cannot belong to this subtree, so the subtree is empty.
If the next value lies inside the range, it must be the root of the current subtree, because preorder always visits the subtree root before its descendants.
After creating that root, all values for its left subtree must be smaller than the root value, so the left recursive call uses (low, root.val). All values for its right subtree must be larger than the root value, so the right recursive call uses (root.val, high).
Because the input tree is a BST, these bounds place every serialized value in the same position it had in the original tree.
Therefore, deserialize(serialize(root)) reconstructs the original BST.
Complexity
| Method | Time | Space |
|---|---|---|
serialize | O(n) | O(n) |
deserialize | O(n) | O(n) |
n is the number of nodes.
The extra space includes the output string, token list, and recursion stack.
Implementation
class Codec:
def serialize(self, root: 'Optional[TreeNode]') -> str:
values = []
def preorder(node: 'Optional[TreeNode]') -> None:
if not node:
return
values.append(str(node.val))
preorder(node.left)
preorder(node.right)
preorder(root)
return " ".join(values)
def deserialize(self, data: str) -> 'Optional[TreeNode]':
if not data:
return None
values = list(map(int, data.split()))
index = 0
def build(low: int, high: int) -> 'Optional[TreeNode]':
nonlocal index
if index == len(values):
return None
value = values[index]
if value <= low or value >= high:
return None
index += 1
node = TreeNode(value)
node.left = build(low, value)
node.right = build(value, high)
return node
return build(float("-inf"), float("inf"))Code Explanation
The serializer performs preorder traversal:
values.append(str(node.val))
preorder(node.left)
preorder(node.right)No null marker is written.
This is compact because the BST property provides enough structure during decoding.
The deserializer starts by parsing the values:
values = list(map(int, data.split()))The variable index points to the next preorder value.
The helper:
build(low, high)tries to build one subtree.
Before consuming a value, it checks whether the value belongs to the current subtree:
if value <= low or value >= high:
return NoneIf it fits, the value becomes the subtree root:
node = TreeNode(value)Then the left and right subtrees are built with narrower BST bounds:
node.left = build(low, value)
node.right = build(value, high)The final call uses infinite bounds because the root can be any valid BST value.
Testing
class TreeNode:
def __init__(self, val=0, left=None, right=None):
self.val = val
self.left = left
self.right = right
def same_tree(a, b):
if a is None or b is None:
return a is b
return (
a.val == b.val
and same_tree(a.left, b.left)
and same_tree(a.right, b.right)
)
def run_tests():
codec = Codec()
root = TreeNode(2, TreeNode(1), TreeNode(3))
assert same_tree(root, codec.deserialize(codec.serialize(root)))
root = TreeNode(
5,
TreeNode(3, TreeNode(2), TreeNode(4)),
TreeNode(7, None, TreeNode(8)),
)
assert same_tree(root, codec.deserialize(codec.serialize(root)))
assert codec.serialize(None) == ""
assert codec.deserialize("") is None
root = TreeNode(1, None, TreeNode(2, None, TreeNode(3)))
assert same_tree(root, codec.deserialize(codec.serialize(root)))
root = TreeNode(3, TreeNode(2, TreeNode(1)), None)
assert same_tree(root, codec.deserialize(codec.serialize(root)))
print("all tests passed")
run_tests()Test meaning:
| Test | Why |
|---|---|
[2,1,3] | Checks basic BST reconstruction |
| Larger BST | Checks both left and right subtrees |
| Empty tree | Checks empty string behavior |
| Right-skewed tree | Checks increasing chain |
| Left-skewed tree | Checks decreasing chain |