A design solution for a small Excel-like spreadsheet that supports set, get, and dynamic sum formulas.
Problem Restatement
We need to design a simplified Excel spreadsheet.
The spreadsheet has:
| Item | Meaning |
|---|---|
height | Number of rows |
width | Last column letter, from A to Z |
| Initial value | Every cell starts as 0 |
We need to support three operations:
| Operation | Meaning |
|---|---|
set(row, column, val) | Set a cell to a fixed integer value |
get(row, column) | Return the current value of a cell |
sum(row, column, numbers) | Set a cell to a formula that sums other cells or ranges |
The formula created by sum remains active until that same cell is overwritten by another set or sum.
There are no circular sum references.
Input and Output
The class shape is:
class Excel:
def __init__(self, height: int, width: str):
...
def set(self, row: int, column: str, val: int) -> None:
...
def get(self, row: int, column: str) -> int:
...
def sum(self, row: int, column: str, numbers: list[str]) -> int:
...The numbers list contains strings in two possible forms:
| Form | Meaning |
|---|---|
"A1" | A single cell |
"A1:B2" | A rectangular range from A1 to B2 |
For example:
"A1:B2"means:
A1, B1, A2, B2Example
Input operations:
excel = Excel(3, "C")
excel.set(1, "A", 2)
excel.sum(3, "C", ["A1", "A1:B2"])
excel.set(2, "B", 2)
excel.get(3, "C")The spreadsheet starts as:
| A | B | C | |
|---|---|---|---|
| 1 | 0 | 0 | 0 |
| 2 | 0 | 0 | 0 |
| 3 | 0 | 0 | 0 |
After:
excel.set(1, "A", 2)we have:
| A | B | C | |
|---|---|---|---|
| 1 | 2 | 0 | 0 |
| 2 | 0 | 0 | 0 |
| 3 | 0 | 0 | 0 |
Now:
excel.sum(3, "C", ["A1", "A1:B2"])means:
C3 = A1 + A1 + B1 + A2 + B2Since A1 = 2, the sum is:
2 + 2 + 0 + 0 + 0 = 4So sum returns 4.
Then:
excel.set(2, "B", 2)changes B2 from 0 to 2.
Since C3 depends on B2, C3 should now become:
A1 + A1 + B1 + A2 + B2 = 2 + 2 + 0 + 0 + 2 = 6So:
excel.get(3, "C")returns:
6First Thought: Store Only Values
The simplest idea is to keep a 2D matrix of cell values.
set writes a number.
get reads a number.
But sum is harder. If we store only the sum result, then later changes to referenced cells will not update the formula cell.
For example, after setting:
C3 = A1 + B2if B2 changes, C3 must change too.
So a formula cell cannot store only a fixed value. It must remember which cells it depends on.
Key Insight
Each cell can be one of two kinds:
| Cell kind | Stored data |
|---|---|
| Fixed value cell | An integer value |
| Formula cell | A map of referenced cells to counts |
The count matters because formulas can mention the same cell more than once.
For example:
["A1", "A1:B2"]contains A1 twice:
A1
A1, B1, A2, B2So the formula should count A1 two times.
We can store a formula as:
{
(1, 0): 2,
(1, 1): 1,
(2, 0): 1,
(2, 1): 1,
}Here columns are stored as zero-based indexes, so A = 0, B = 1.
Then get can compute the current formula value recursively.
Because the problem guarantees no cycles, recursive evaluation is safe.
Algorithm
Use two structures:
values
formulasvalues[r][c] stores the fixed value of the cell.
formulas[r][c] stores a dictionary for formula cells.
If formulas[r][c] is empty, the cell is a fixed value cell.
For set(row, column, val):
- Convert
(row, column)to zero-based indexes. - Store
valinvalues. - Clear any formula at that cell.
For get(row, column):
- Convert
(row, column)to zero-based indexes. - If the cell has no formula, return its fixed value.
- Otherwise, recursively evaluate all referenced cells.
- Multiply each referenced value by its count.
- Return the total.
For sum(row, column, numbers):
- Parse every item in
numbers. - Convert each single cell or range into cell coordinates.
- Count how many times each referenced cell appears.
- Store that count map as the cell formula.
- Return
get(row, column).
Implementation
from collections import defaultdict
class Excel:
def __init__(self, height: int, width: str):
self.rows = height
self.cols = ord(width) - ord("A") + 1
self.values = [[0] * self.cols for _ in range(self.rows)]
self.formulas = [[None] * self.cols for _ in range(self.rows)]
def set(self, row: int, column: str, val: int) -> None:
r, c = self._cell_to_index(row, column)
self.values[r][c] = val
self.formulas[r][c] = None
def get(self, row: int, column: str) -> int:
r, c = self._cell_to_index(row, column)
return self._evaluate(r, c)
def sum(self, row: int, column: str, numbers: list[str]) -> int:
r, c = self._cell_to_index(row, column)
formula = defaultdict(int)
for item in numbers:
for cell_r, cell_c in self._expand(item):
formula[(cell_r, cell_c)] += 1
self.formulas[r][c] = dict(formula)
return self._evaluate(r, c)
def _evaluate(self, r: int, c: int) -> int:
formula = self.formulas[r][c]
if formula is None:
return self.values[r][c]
total = 0
for (cell_r, cell_c), count in formula.items():
total += self._evaluate(cell_r, cell_c) * count
return total
def _cell_to_index(self, row: int, column: str) -> tuple[int, int]:
return row - 1, ord(column) - ord("A")
def _parse_cell(self, cell: str) -> tuple[int, int]:
column = cell[0]
row = int(cell[1:])
return self._cell_to_index(row, column)
def _expand(self, item: str) -> list[tuple[int, int]]:
if ":" not in item:
return [self._parse_cell(item)]
start, end = item.split(":")
r1, c1 = self._parse_cell(start)
r2, c2 = self._parse_cell(end)
cells = []
for r in range(r1, r2 + 1):
for c in range(c1, c2 + 1):
cells.append((r, c))
return cellsCode Explanation
The constructor creates two matrices.
self.values = [[0] * self.cols for _ in range(self.rows)]
self.formulas = [[None] * self.cols for _ in range(self.rows)]values stores normal integer values.
formulas stores dependency maps. A None value means the cell is not a formula.
The set method overwrites the cell:
self.values[r][c] = val
self.formulas[r][c] = NoneClearing the formula is required because a direct value assignment replaces the old formula.
The sum method builds a dependency map:
formula[(cell_r, cell_c)] += 1This handles duplicates and overlapping ranges.
The _evaluate method decides how to read a cell.
If the cell is fixed:
return self.values[r][c]If the cell is a formula, it recursively computes the value of every referenced cell:
total += self._evaluate(cell_r, cell_c) * countThis gives dynamic behavior. When a referenced cell changes, the next get automatically sees the new value.
Correctness
A cell with no formula stores its value directly in values. The set method assigns this value and clears any old formula, so get returns exactly the most recent fixed value for that cell.
A formula cell stores all cells referenced by its numbers list. The parser expands each single cell and each rectangular range into coordinates. If a cell appears multiple times because of duplicates or overlapping ranges, the count map records that multiplicity.
When _evaluate is called on a formula cell, it sums the current value of every referenced cell multiplied by its count. This matches the definition of the sum formula.
If a referenced cell changes later, the formula cell does not need to be manually updated. The next evaluation recursively reads the referenced cell’s current value. Since the problem guarantees no circular references, this recursion terminates.
Therefore, set, get, and sum all satisfy the required spreadsheet behavior.
Complexity
Let:
R = height
C = width
M = number of cells expanded from the numbers list| Operation | Time | Space |
|---|---|---|
| Constructor | O(R * C) | O(R * C) |
set | O(1) | O(1) |
sum | O(M + E) | O(M) |
get | O(E) | O(D) |
Here:
| Symbol | Meaning |
|---|---|
M | Number of referenced cells after expanding ranges |
E | Number of formula dependency edges visited during recursive evaluation |
D | Maximum depth of formula dependencies |
In the worst case, a formula can depend on many formula cells, so recursive evaluation can visit many dependencies.
Alternative: Propagate Changes
Another design is to maintain reverse dependencies.
When A1 changes, we immediately update all cells that depend on A1, then all cells that depend on those cells.
That makes get faster, because values are already updated.
But it makes set and sum more complex, because we must carefully maintain the dependency graph.
The recursive evaluation approach is simpler and works well for this problem’s small spreadsheet size.
Testing
def run_tests():
excel = Excel(3, "C")
excel.set(1, "A", 2)
assert excel.sum(3, "C", ["A1", "A1:B2"]) == 4
excel.set(2, "B", 2)
assert excel.get(3, "C") == 6
excel.set(1, "A", 5)
assert excel.get(3, "C") == 12
excel.set(3, "C", 10)
assert excel.get(3, "C") == 10
excel.sum(2, "A", ["A1", "B2"])
assert excel.get(2, "A") == 7
print("all tests passed")
run_tests()Test meaning:
| Test | Why |
|---|---|
sum(3, "C", ["A1", "A1:B2"]) | Handles duplicate A1 |
Updating B2 | Formula result changes dynamically |
Updating A1 | Duplicate dependency is counted twice |
Setting C3 directly | Old formula is removed |
| Formula referencing multiple cells | Normal formula behavior |