# LeetCode 297: Serialize and Deserialize Binary Tree

## Problem Restatement

We need to design a codec for a binary tree.

The codec has two operations:

```python
serialize(root)
deserialize(data)
```

`serialize(root)` converts a binary tree into a string.

`deserialize(data)` converts that string back into the original binary tree.

There is no required string format. The only requirement is that a tree serialized by our method can be deserialized back into the same tree structure. The source statement also clarifies that we do not have to follow LeetCode's own binary tree input format.

## Input and Output

| Operation | Input | Output |
|---|---|---|
| `serialize` | Root of a binary tree | String representation |
| `deserialize` | String representation | Root of the reconstructed binary tree |

Node values satisfy:

```python
-1000 <= Node.val <= 1000
```

The number of nodes is in the range:

```python
0 <= nodes <= 10^4
```

## Examples

For this tree:

```python
    1
   / \
  2   3
     / \
    4   5
```

LeetCode displays it as:

```python
[1, 2, 3, null, null, 4, 5]
```

A valid codec could serialize it as:

```python
"1,2,#,#,3,4,#,#,5,#,#"
```

Here, `#` means a null child.

For an empty tree:

```python
root = None
```

A valid serialization is:

```python
"#"
```

and deserializing `"#"` gives back `None`.

## First Thought: Level Order With Nulls

One possible solution uses breadth-first search.

We scan the tree level by level and write both real nodes and null markers.

For example:

```python
[1, 2, 3, null, null, 4, 5]
```

could become:

```python
"1,2,3,#,#,4,5,#,#,#,#"
```

This works.

But preorder DFS with null markers is usually simpler to implement because the recursive structure of deserialization exactly mirrors the recursive structure of serialization.

## Key Insight

A binary tree can be reconstructed uniquely from preorder traversal if we also record null children.

Preorder means:

```python
node, left subtree, right subtree
```

Without null markers, this is ambiguous.

For example, the preorder values:

```python
1, 2
```

could mean:

```python
  1
 /
2
```

or:

```python
1
 \
  2
```

With null markers, the shape becomes explicit.

Left-child case:

```python
"1,2,#,#,#"
```

Right-child case:

```python
"1,#,2,#,#"
```

So every missing child must be written into the serialized string.

## Algorithm

For serialization, use preorder DFS.

If the node is `None`, append:

```python
"#"
```

Otherwise:

1. Append the node value.
2. Serialize the left child.
3. Serialize the right child.

Then join the tokens with commas.

For deserialization, read tokens from left to right.

If the next token is `"#"`, return `None`.

Otherwise:

1. Create a node from the token.
2. Recursively build its left child.
3. Recursively build its right child.
4. Return the node.

## Correctness

Serialization writes each node before its left and right subtrees. It also writes a null marker for every missing child.

Therefore the serialized stream contains both the node values and the full tree shape.

During deserialization, the first token represents the root of the current subtree.

If the token is `"#"`, the current subtree is empty, so returning `None` is correct.

If the token is a value, the algorithm creates that root node. The next tokens in the stream describe the entire left subtree, because serialization wrote the left subtree immediately after the root. After that recursive call consumes exactly the left subtree tokens, the following tokens describe the right subtree.

So each recursive deserialization step reverses exactly one recursive serialization step.

By induction on the tree structure, `deserialize(serialize(root))` reconstructs a tree with the same values and the same shape as `root`.

## Complexity

Let `n` be the number of real nodes.

| Operation | Time | Space | Why |
|---|---:|---:|---|
| `serialize` | `O(n)` | `O(n)` | Visits every node and null child marker |
| `deserialize` | `O(n)` | `O(n)` | Reads every token once |
| Recursion stack | `O(h)` | `O(h)` | `h` is the height of the tree |

The output string itself has `O(n)` tokens because a binary tree with `n` nodes has `n + 1` null children.

## Implementation

```python
# Definition for a binary tree node.
# class TreeNode:
#     def __init__(self, x):
#         self.val = x
#         self.left = None
#         self.right = None

class Codec:
    def serialize(self, root):
        values = []

        def dfs(node):
            if node is None:
                values.append("#")
                return

            values.append(str(node.val))
            dfs(node.left)
            dfs(node.right)

        dfs(root)

        return ",".join(values)

    def deserialize(self, data):
        values = iter(data.split(","))

        def dfs():
            value = next(values)

            if value == "#":
                return None

            node = TreeNode(int(value))
            node.left = dfs()
            node.right = dfs()

            return node

        return dfs()
```

## Code Explanation

Serialization stores tokens in a list:

```python
values = []
```

When a null child appears, we write:

```python
values.append("#")
```

When a real node appears, we write its value:

```python
values.append(str(node.val))
```

Then we serialize the left subtree before the right subtree:

```python
dfs(node.left)
dfs(node.right)
```

Finally, we join tokens into one string:

```python
return ",".join(values)
```

For deserialization, we split the string and create an iterator:

```python
values = iter(data.split(","))
```

Each recursive call consumes exactly the tokens needed for one subtree.

If the token is `"#"`:

```python
if value == "#":
    return None
```

the subtree is empty.

Otherwise, we create a node:

```python
node = TreeNode(int(value))
```

Then rebuild its left and right children in the same order used during serialization:

```python
node.left = dfs()
node.right = dfs()
```

## Testing

```python
class TreeNode:
    def __init__(self, x):
        self.val = x
        self.left = None
        self.right = None

def same_tree(a, b):
    if a is None and b is None:
        return True

    if a is None or b is None:
        return False

    return (
        a.val == b.val
        and same_tree(a.left, b.left)
        and same_tree(a.right, b.right)
    )

def test_codec():
    codec = Codec()

    assert codec.deserialize(codec.serialize(None)) is None

    root = TreeNode(1)
    root.left = TreeNode(2)
    root.right = TreeNode(3)
    root.right.left = TreeNode(4)
    root.right.right = TreeNode(5)

    rebuilt = codec.deserialize(codec.serialize(root))
    assert same_tree(root, rebuilt)

    root = TreeNode(-1)
    root.right = TreeNode(1000)

    rebuilt = codec.deserialize(codec.serialize(root))
    assert same_tree(root, rebuilt)

    print("all tests passed")

test_codec()
```

Test meaning:

| Test | Why |
|---|---|
| Empty tree | Confirms null root support |
| Standard tree | Confirms both left and right children are preserved |
| Negative and large values | Confirms integer parsing works |
| Right-only child | Confirms null markers preserve shape |

