A preorder DFS codec for converting a binary tree to a string and reconstructing the same tree from that string.
Problem Restatement
We need to design a codec for a binary tree.
The codec has two operations:
serialize(root)
deserialize(data)serialize(root) converts a binary tree into a string.
deserialize(data) converts that string back into the original binary tree.
There is no required string format. The only requirement is that a tree serialized by our method can be deserialized back into the same tree structure. The source statement also clarifies that we do not have to follow LeetCode’s own binary tree input format.
Input and Output
| Operation | Input | Output |
|---|---|---|
serialize | Root of a binary tree | String representation |
deserialize | String representation | Root of the reconstructed binary tree |
Node values satisfy:
-1000 <= Node.val <= 1000The number of nodes is in the range:
0 <= nodes <= 10^4Examples
For this tree:
1
/ \
2 3
/ \
4 5LeetCode displays it as:
[1, 2, 3, null, null, 4, 5]A valid codec could serialize it as:
"1,2,#,#,3,4,#,#,5,#,#"Here, # means a null child.
For an empty tree:
root = NoneA valid serialization is:
"#"and deserializing "#" gives back None.
First Thought: Level Order With Nulls
One possible solution uses breadth-first search.
We scan the tree level by level and write both real nodes and null markers.
For example:
[1, 2, 3, null, null, 4, 5]could become:
"1,2,3,#,#,4,5,#,#,#,#"This works.
But preorder DFS with null markers is usually simpler to implement because the recursive structure of deserialization exactly mirrors the recursive structure of serialization.
Key Insight
A binary tree can be reconstructed uniquely from preorder traversal if we also record null children.
Preorder means:
node, left subtree, right subtreeWithout null markers, this is ambiguous.
For example, the preorder values:
1, 2could mean:
1
/
2or:
1
\
2With null markers, the shape becomes explicit.
Left-child case:
"1,2,#,#,#"Right-child case:
"1,#,2,#,#"So every missing child must be written into the serialized string.
Algorithm
For serialization, use preorder DFS.
If the node is None, append:
"#"Otherwise:
- Append the node value.
- Serialize the left child.
- Serialize the right child.
Then join the tokens with commas.
For deserialization, read tokens from left to right.
If the next token is "#", return None.
Otherwise:
- Create a node from the token.
- Recursively build its left child.
- Recursively build its right child.
- Return the node.
Correctness
Serialization writes each node before its left and right subtrees. It also writes a null marker for every missing child.
Therefore the serialized stream contains both the node values and the full tree shape.
During deserialization, the first token represents the root of the current subtree.
If the token is "#", the current subtree is empty, so returning None is correct.
If the token is a value, the algorithm creates that root node. The next tokens in the stream describe the entire left subtree, because serialization wrote the left subtree immediately after the root. After that recursive call consumes exactly the left subtree tokens, the following tokens describe the right subtree.
So each recursive deserialization step reverses exactly one recursive serialization step.
By induction on the tree structure, deserialize(serialize(root)) reconstructs a tree with the same values and the same shape as root.
Complexity
Let n be the number of real nodes.
| Operation | Time | Space | Why |
|---|---|---|---|
serialize | O(n) | O(n) | Visits every node and null child marker |
deserialize | O(n) | O(n) | Reads every token once |
| Recursion stack | O(h) | O(h) | h is the height of the tree |
The output string itself has O(n) tokens because a binary tree with n nodes has n + 1 null children.
Implementation
# Definition for a binary tree node.
# class TreeNode:
# def __init__(self, x):
# self.val = x
# self.left = None
# self.right = None
class Codec:
def serialize(self, root):
values = []
def dfs(node):
if node is None:
values.append("#")
return
values.append(str(node.val))
dfs(node.left)
dfs(node.right)
dfs(root)
return ",".join(values)
def deserialize(self, data):
values = iter(data.split(","))
def dfs():
value = next(values)
if value == "#":
return None
node = TreeNode(int(value))
node.left = dfs()
node.right = dfs()
return node
return dfs()Code Explanation
Serialization stores tokens in a list:
values = []When a null child appears, we write:
values.append("#")When a real node appears, we write its value:
values.append(str(node.val))Then we serialize the left subtree before the right subtree:
dfs(node.left)
dfs(node.right)Finally, we join tokens into one string:
return ",".join(values)For deserialization, we split the string and create an iterator:
values = iter(data.split(","))Each recursive call consumes exactly the tokens needed for one subtree.
If the token is "#":
if value == "#":
return Nonethe subtree is empty.
Otherwise, we create a node:
node = TreeNode(int(value))Then rebuild its left and right children in the same order used during serialization:
node.left = dfs()
node.right = dfs()Testing
class TreeNode:
def __init__(self, x):
self.val = x
self.left = None
self.right = None
def same_tree(a, b):
if a is None and b is None:
return True
if a is None or b is None:
return False
return (
a.val == b.val
and same_tree(a.left, b.left)
and same_tree(a.right, b.right)
)
def test_codec():
codec = Codec()
assert codec.deserialize(codec.serialize(None)) is None
root = TreeNode(1)
root.left = TreeNode(2)
root.right = TreeNode(3)
root.right.left = TreeNode(4)
root.right.right = TreeNode(5)
rebuilt = codec.deserialize(codec.serialize(root))
assert same_tree(root, rebuilt)
root = TreeNode(-1)
root.right = TreeNode(1000)
rebuilt = codec.deserialize(codec.serialize(root))
assert same_tree(root, rebuilt)
print("all tests passed")
test_codec()Test meaning:
| Test | Why |
|---|---|
| Empty tree | Confirms null root support |
| Standard tree | Confirms both left and right children are preserved |
| Negative and large values | Confirms integer parsing works |
| Right-only child | Confirms null markers preserve shape |