A clear explanation of simplifying algebraic expressions by parsing, substituting variables, and combining polynomial terms.
Problem Restatement
We are given an algebraic expression as a string.
The expression may contain:
integers
variables
+
-
*
parenthesesWe are also given some variable values through:
evalvars
evalintsFor example:
evalvars = ["e"]
evalints = [1]means:
e = 1We need to simplify the expression after substituting known variables.
The result should be returned as a list of string terms.
The official format requires variable factors inside each term to be sorted lexicographically, terms sorted by descending degree, and ties sorted lexicographically by variables. Zero coefficient terms are omitted. Constants are included only if non-zero.
Input and Output
| Item | Meaning |
|---|---|
| Input | expression, a valid algebraic expression |
| Input | evalvars, variables with known values |
| Input | evalints, integer values for those variables |
| Output | Simplified expression as a list of string terms |
| Operators | +, -, * |
| Order | Parentheses first, then multiplication, then addition and subtraction |
Example function shape:
def basicCalculatorIV(
expression: str,
evalvars: list[str],
evalints: list[int],
) -> list[str]:
...Examples
Example 1:
expression = "e + 8 - a + 5"
evalvars = ["e"]
evalints = [1]Substitute e = 1:
1 + 8 - a + 5Simplify:
14 - aOutput:
["-1*a", "14"]Example 2:
expression = "e - 8 + temperature - pressure"
evalvars = ["e", "temperature"]
evalints = [1, 12]Substitute:
1 - 8 + 12 - pressureSimplify:
5 - pressureOutput:
["-1*pressure", "5"]Example 3:
expression = "(e + 8) * (e - 8)"
evalvars = []
evalints = []No variables are substituted.
Expand:
e * e - 64Output:
["1*e*e", "-64"]First Thought: Evaluate Like a Normal Calculator
For a normal calculator, each subexpression becomes one number.
Here, a subexpression may become a polynomial.
For example:
a * b + 2 * a * bsimplifies to:
3*a*bSo instead of storing just an integer value, each expression result must store many terms.
Key Insight
Represent each polynomial as a dictionary:
term_variables -> coefficientUse a tuple of sorted variable names as the key.
Examples:
| Term | Key | Coefficient |
|---|---|---|
5 | () | 5 |
-1*a | ("a",) | -1 |
3*a*b | ("a", "b") | 3 |
2*a*a*b | ("a", "a", "b") | 2 |
This representation makes combining like terms easy.
For example:
{
("a", "b"): 1,
("a", "b"): 2,
}becomes:
{
("a", "b"): 3,
}Polynomial Operations
We need three operations.
Addition
Add coefficients for matching term keys.
("a",) -> 2
("a",) -> 3becomes:
("a",) -> 5Subtraction
Same as addition, but subtract the second polynomial’s coefficients.
("a",) -> 2
minus
("a",) -> 3becomes:
("a",) -> -1Multiplication
Multiply every term from the first polynomial with every term from the second polynomial.
Variables are combined and sorted.
Example:
(2*a*b) * (3*a*c)Coefficient:
2 * 3 = 6Variables:
a, b + a, c -> a, a, b, cResult:
6*a*a*b*cParsing Strategy
The expression has precedence rules:
- parentheses
- multiplication
- addition and subtraction
A clean way is recursive descent parsing.
We define three parser levels:
| Function | Parses |
|---|---|
parse_expr | Addition and subtraction |
parse_term | Multiplication |
parse_factor | Number, variable, or parenthesized expression |
This matches normal arithmetic grammar.
Tokenization
The expression may include spaces.
We first convert it into tokens.
For example:
"e + 8 - a + 5"becomes:
["e", "+", "8", "-", "a", "+", "5"]Parentheses are also tokens.
Algorithm
- Build a dictionary from
evalvarstoevalints. - Tokenize the expression.
- Parse the expression recursively:
parse_factorhandles integers, variables, and parentheses.parse_termhandles multiplication.parse_exprhandles addition and subtraction.
- Get the final polynomial dictionary.
- Remove zero coefficient terms.
- Sort terms by:
- descending degree
- lexicographic variable tuple
- Format each term as a string.
- Return the formatted list.
Correctness
Each integer token is converted into a constant polynomial. Each known variable is substituted into a constant polynomial. Each unknown variable is converted into a one-variable polynomial.
The parser follows the expression grammar: parse_factor evaluates atomic expressions and parentheses, parse_term combines factors using multiplication, and parse_expr combines terms using addition and subtraction. Therefore, all operators are evaluated with the required precedence.
The polynomial operations preserve algebraic meaning. Addition and subtraction combine coefficients of matching variable tuples. Multiplication distributes every term on the left over every term on the right, multiplying coefficients and combining variables.
Thus the parsed polynomial is algebraically equivalent to the original expression after substitutions.
The final formatting removes zero coefficient terms and sorts remaining terms by the required degree and lexicographic order. Therefore, the returned list is exactly the required simplified representation.
Complexity
Let T be the number of tokens, and let M be the number of distinct polynomial terms produced during expansion.
| Metric | Value | Why |
|---|---|---|
| Time | O(T * M^2 * V log V) | Multiplication may combine many term pairs, and variable tuples are sorted |
| Space | O(M * V) | Polynomial dictionaries store terms and variable tuples |
V is the maximum number of variables inside one term.
The exact bound depends on how much multiplication expands the expression.
Implementation
from collections import defaultdict
class Solution:
def basicCalculatorIV(
self,
expression: str,
evalvars: list[str],
evalints: list[int],
) -> list[str]:
values = dict(zip(evalvars, evalints))
tokens = self.tokenize(expression)
index = 0
def add(poly1, poly2):
result = defaultdict(int)
for key, coeff in poly1.items():
result[key] += coeff
for key, coeff in poly2.items():
result[key] += coeff
return result
def sub(poly1, poly2):
result = defaultdict(int)
for key, coeff in poly1.items():
result[key] += coeff
for key, coeff in poly2.items():
result[key] -= coeff
return result
def mul(poly1, poly2):
result = defaultdict(int)
for key1, coeff1 in poly1.items():
for key2, coeff2 in poly2.items():
key = tuple(sorted(key1 + key2))
result[key] += coeff1 * coeff2
return result
def constant(value):
return {(): value}
def variable(name):
return {(name,): 1}
def parse_expr():
nonlocal index
result = parse_term()
while index < len(tokens) and tokens[index] in {"+", "-"}:
op = tokens[index]
index += 1
right = parse_term()
if op == "+":
result = add(result, right)
else:
result = sub(result, right)
return result
def parse_term():
nonlocal index
result = parse_factor()
while index < len(tokens) and tokens[index] == "*":
index += 1
right = parse_factor()
result = mul(result, right)
return result
def parse_factor():
nonlocal index
token = tokens[index]
index += 1
if token == "(":
result = parse_expr()
index += 1
return result
if token.isdigit():
return constant(int(token))
if token in values:
return constant(values[token])
return variable(token)
polynomial = parse_expr()
terms = [
(vars_, coeff)
for vars_, coeff in polynomial.items()
if coeff != 0
]
terms.sort(key=lambda item: (-len(item[0]), item[0]))
answer = []
for vars_, coeff in terms:
if not vars_:
answer.append(str(coeff))
else:
answer.append(str(coeff) + "*" + "*".join(vars_))
return answer
def tokenize(self, expression: str) -> list[str]:
tokens = []
i = 0
while i < len(expression):
ch = expression[i]
if ch == " ":
i += 1
continue
if ch in "+-*()":
tokens.append(ch)
i += 1
continue
j = i
while j < len(expression) and expression[j].isalnum():
j += 1
tokens.append(expression[i:j])
i = j
return tokensCode Explanation
We first build the substitution map:
values = dict(zip(evalvars, evalints))Then we tokenize the expression:
tokens = self.tokenize(expression)A polynomial is stored as a dictionary. The empty tuple () represents the constant term.
def constant(value):
return {(): value}An unknown variable becomes:
def variable(name):
return {(name,): 1}Addition and subtraction combine coefficients with the same key.
Multiplication loops over every term pair:
for key1, coeff1 in poly1.items():
for key2, coeff2 in poly2.items():The new variable tuple is:
key = tuple(sorted(key1 + key2))Sorting ensures b*a and a*b become the same term key.
The parser has three precedence levels.
parse_expr handles + and -.
parse_term handles *.
parse_factor handles numbers, variables, and parenthesized expressions.
When parse_factor sees "(", it parses a full expression inside the parentheses:
result = parse_expr()
index += 1
return resultThe index += 1 skips the closing ")".
Finally, we remove zero terms:
if coeff != 0Then sort by required output order:
terms.sort(key=lambda item: (-len(item[0]), item[0]))A higher degree appears first. If two terms have the same degree, their variable tuples decide lexicographic order.
Testing
def run_tests():
s = Solution()
assert s.basicCalculatorIV(
"e + 8 - a + 5",
["e"],
[1],
) == ["-1*a", "14"]
assert s.basicCalculatorIV(
"e - 8 + temperature - pressure",
["e", "temperature"],
[1, 12],
) == ["-1*pressure", "5"]
assert s.basicCalculatorIV(
"(e + 8) * (e - 8)",
[],
[],
) == ["1*e*e", "-64"]
assert s.basicCalculatorIV(
"a * b * c + b * a * c * 4",
[],
[],
) == ["5*a*b*c"]
assert s.basicCalculatorIV(
"a - a",
[],
[],
) == []
assert s.basicCalculatorIV(
"2 * 3 + 4",
[],
[],
) == ["10"]
print("all tests passed")
run_tests()Test meaning:
| Test | Why |
|---|---|
| Substitution and free variable | Confirms known variables become constants |
| Multi-letter variables | Confirms variable names are parsed correctly |
| Parentheses and multiplication | Confirms expansion |
| Combining like terms | Confirms variable tuples are normalized |
| Zero coefficient | Confirms zero terms are omitted |
| Pure arithmetic | Confirms constants are evaluated |