Skip to content

LeetCode 297: Serialize and Deserialize Binary Tree

A preorder DFS codec for converting a binary tree to a string and reconstructing the same tree from that string.

Problem Restatement

We need to design a codec for a binary tree.

The codec has two operations:

serialize(root)
deserialize(data)

serialize(root) converts a binary tree into a string.

deserialize(data) converts that string back into the original binary tree.

There is no required string format. The only requirement is that a tree serialized by our method can be deserialized back into the same tree structure. The source statement also clarifies that we do not have to follow LeetCode’s own binary tree input format.

Input and Output

OperationInputOutput
serializeRoot of a binary treeString representation
deserializeString representationRoot of the reconstructed binary tree

Node values satisfy:

-1000 <= Node.val <= 1000

The number of nodes is in the range:

0 <= nodes <= 10^4

Examples

For this tree:

    1
   / \
  2   3
     / \
    4   5

LeetCode displays it as:

[1, 2, 3, null, null, 4, 5]

A valid codec could serialize it as:

"1,2,#,#,3,4,#,#,5,#,#"

Here, # means a null child.

For an empty tree:

root = None

A valid serialization is:

"#"

and deserializing "#" gives back None.

First Thought: Level Order With Nulls

One possible solution uses breadth-first search.

We scan the tree level by level and write both real nodes and null markers.

For example:

[1, 2, 3, null, null, 4, 5]

could become:

"1,2,3,#,#,4,5,#,#,#,#"

This works.

But preorder DFS with null markers is usually simpler to implement because the recursive structure of deserialization exactly mirrors the recursive structure of serialization.

Key Insight

A binary tree can be reconstructed uniquely from preorder traversal if we also record null children.

Preorder means:

node, left subtree, right subtree

Without null markers, this is ambiguous.

For example, the preorder values:

1, 2

could mean:

  1
 /
2

or:

1
 \
  2

With null markers, the shape becomes explicit.

Left-child case:

"1,2,#,#,#"

Right-child case:

"1,#,2,#,#"

So every missing child must be written into the serialized string.

Algorithm

For serialization, use preorder DFS.

If the node is None, append:

"#"

Otherwise:

  1. Append the node value.
  2. Serialize the left child.
  3. Serialize the right child.

Then join the tokens with commas.

For deserialization, read tokens from left to right.

If the next token is "#", return None.

Otherwise:

  1. Create a node from the token.
  2. Recursively build its left child.
  3. Recursively build its right child.
  4. Return the node.

Correctness

Serialization writes each node before its left and right subtrees. It also writes a null marker for every missing child.

Therefore the serialized stream contains both the node values and the full tree shape.

During deserialization, the first token represents the root of the current subtree.

If the token is "#", the current subtree is empty, so returning None is correct.

If the token is a value, the algorithm creates that root node. The next tokens in the stream describe the entire left subtree, because serialization wrote the left subtree immediately after the root. After that recursive call consumes exactly the left subtree tokens, the following tokens describe the right subtree.

So each recursive deserialization step reverses exactly one recursive serialization step.

By induction on the tree structure, deserialize(serialize(root)) reconstructs a tree with the same values and the same shape as root.

Complexity

Let n be the number of real nodes.

OperationTimeSpaceWhy
serializeO(n)O(n)Visits every node and null child marker
deserializeO(n)O(n)Reads every token once
Recursion stackO(h)O(h)h is the height of the tree

The output string itself has O(n) tokens because a binary tree with n nodes has n + 1 null children.

Implementation

# Definition for a binary tree node.
# class TreeNode:
#     def __init__(self, x):
#         self.val = x
#         self.left = None
#         self.right = None

class Codec:
    def serialize(self, root):
        values = []

        def dfs(node):
            if node is None:
                values.append("#")
                return

            values.append(str(node.val))
            dfs(node.left)
            dfs(node.right)

        dfs(root)

        return ",".join(values)

    def deserialize(self, data):
        values = iter(data.split(","))

        def dfs():
            value = next(values)

            if value == "#":
                return None

            node = TreeNode(int(value))
            node.left = dfs()
            node.right = dfs()

            return node

        return dfs()

Code Explanation

Serialization stores tokens in a list:

values = []

When a null child appears, we write:

values.append("#")

When a real node appears, we write its value:

values.append(str(node.val))

Then we serialize the left subtree before the right subtree:

dfs(node.left)
dfs(node.right)

Finally, we join tokens into one string:

return ",".join(values)

For deserialization, we split the string and create an iterator:

values = iter(data.split(","))

Each recursive call consumes exactly the tokens needed for one subtree.

If the token is "#":

if value == "#":
    return None

the subtree is empty.

Otherwise, we create a node:

node = TreeNode(int(value))

Then rebuild its left and right children in the same order used during serialization:

node.left = dfs()
node.right = dfs()

Testing

class TreeNode:
    def __init__(self, x):
        self.val = x
        self.left = None
        self.right = None

def same_tree(a, b):
    if a is None and b is None:
        return True

    if a is None or b is None:
        return False

    return (
        a.val == b.val
        and same_tree(a.left, b.left)
        and same_tree(a.right, b.right)
    )

def test_codec():
    codec = Codec()

    assert codec.deserialize(codec.serialize(None)) is None

    root = TreeNode(1)
    root.left = TreeNode(2)
    root.right = TreeNode(3)
    root.right.left = TreeNode(4)
    root.right.right = TreeNode(5)

    rebuilt = codec.deserialize(codec.serialize(root))
    assert same_tree(root, rebuilt)

    root = TreeNode(-1)
    root.right = TreeNode(1000)

    rebuilt = codec.deserialize(codec.serialize(root))
    assert same_tree(root, rebuilt)

    print("all tests passed")

test_codec()

Test meaning:

TestWhy
Empty treeConfirms null root support
Standard treeConfirms both left and right children are preserved
Negative and large valuesConfirms integer parsing works
Right-only childConfirms null markers preserve shape