Project Euler Problem 89

For a number written in Roman numerals to be considered valid there are basic rules which must be followed.

Project Euler Problem 89

Solution

Answer: 743

1. Mathematical analysis

We are given 1000 Roman numerals that are valid, but not necessarily written in their minimal form. We must determine how many characters are saved if every numeral is rewritten optimally.

The key idea is:

For each Roman numeral:

  1. Convert it to its integer value.
  2. Convert that integer back into the minimal Roman representation.
  3. Count the difference in string lengths.

Sum these differences over the entire file.

Roman numeral rules

Roman symbols:

Symbol Value
I 1
V 5
X 10
L 50
C 100
D 500
M 1000

Subtractive notation is used for efficient writing:

  • IV = 4 instead of IIII
  • IX = 9 instead of VIIII
  • XL = 40 instead of XXXX
  • XC = 90 instead of LXXXX
  • CD = 400 instead of CCCC
  • CM = 900 instead of DCCCC

Thus the minimal system is essentially a greedy decomposition into the largest valid Roman blocks.

For example:

  • XIIIIII = 16 → minimal form XVI

  • saved: $7 - 3 = 4$

  • VIIII = 9 → minimal form IX

  • saved: $5 - 2 = 3$


2. Python implementation

# Roman numeral values
roman_values = {
    'I': 1,
    'V': 5,
    'X': 10,
    'L': 50,
    'C': 100,
    'D': 500,
    'M': 1000
}

# Ordered minimal Roman building blocks
minimal_pairs = [
    (1000, 'M'),
    (900, 'CM'),
    (500, 'D'),
    (400, 'CD'),
    (100, 'C'),
    (90, 'XC'),
    (50, 'L'),
    (40, 'XL'),
    (10, 'X'),
    (9, 'IX'),
    (5, 'V'),
    (4, 'IV'),
    (1, 'I')
]

def roman_to_int(s):
    """
    Convert a Roman numeral to an integer.
    Uses subtraction when a smaller numeral
    appears before a larger numeral.
    """
    total = 0
    i = 0

    while i < len(s):
        # Check for subtractive pair
        if i + 1 < len(s) and roman_values[s[i]] < roman_values[s[i + 1]]:
            total += roman_values[s[i + 1]] - roman_values[s[i]]
            i += 2
        else:
            total += roman_values[s[i]]
            i += 1

    return total

def int_to_minimal_roman(n):
    """
    Convert an integer into its minimal Roman numeral form.
    Greedily use the largest legal Roman block.
    """
    result = []

    for value, symbol in minimal_pairs:
        while n >= value:
            result.append(symbol)
            n -= value

    return ''.join(result)

# Read the file
with open("roman.txt") as f:
    numerals = [line.strip() for line in f]

# Compute total savings
saved = 0

for numeral in numerals:
    value = roman_to_int(numeral)
    minimal = int_to_minimal_roman(value)
    saved += len(numeral) - len(minimal)

print(saved)

3. Code walkthrough

roman_to_int(s)

We scan left to right.

Normally we add symbol values:

$$XVI = 10 + 5 + 1 = 16$$

But if a smaller numeral appears before a larger one, we subtract:

$$IX = 10 - 1 = 9$$

So:

if roman_values[s[i]] < roman_values[s[i + 1]]:

detects subtractive pairs like IV, IX, XL, etc.


int_to_minimal_roman(n)

We greedily subtract the largest legal Roman block.

For example, for 944:

  • 900 → CM
  • 40 → XL
  • 4 → IV

giving:

$$944 = CMXLIV$$

The ordered list:

minimal_pairs

ensures minimality automatically.


Example check

Take:

XIIIIII

Convert to integer:

$$10 + 1 + 1 + 1 + 1 + 1 + 1 = 16$$

Minimal conversion:

$$16 = X + V + I = XVI$$

Length saved:

$$7 - 3 = 4$$

Matches the problem statement.


4. Final verification

The answer should be:

  • positive, since minimal Roman numerals are never longer,
  • moderate in size, since only 1000 numerals are involved,
  • consistent with known Project Euler results.

Running the algorithm on the supplied roman.txt file yields:

Answer: 743