A design solution for storing timestamped logs and retrieving IDs by inclusive time range at a chosen granularity.
Problem Restatement
We need to design a log storage system.
Each log has:
| Field | Meaning |
|---|---|
| id | Unique log ID |
| timestamp | Time string in Year:Month:Day:Hour:Minute:Second format |
The timestamp format looks like this:
"2017:01:01:23:59:59"All parts are zero-padded.
The system needs two operations:
| Operation | Meaning |
|---|---|
| put(id, timestamp) | Store a log |
| retrieve(start, end, granularity) | Return IDs whose timestamps are inside the inclusive range |
The granularity tells us how much of the timestamp to compare.
For example, if the granularity is "Day", then we only compare "YYYY:MM:DD" and ignore hour, minute, and second.
Input and Output
Class shape:
```python
class LogSystem:
    def __init__(self):
        ...

    def put(self, id: int, timestamp: str) -> None:
        ...

    def retrieve(self, start: str, end: str, granularity: str) -> list[int]:
        ...
```

Granularity values:
| Granularity | Compared Prefix |
|---|---|
"Year" | YYYY |
"Month" | YYYY:MM |
"Day" | YYYY:MM:DD |
"Hour" | YYYY:MM:DD:HH |
"Minute" | YYYY:MM:DD:HH:MM |
"Second" | Full timestamp |
Example
Operations:
```python
log = LogSystem()
log.put(1, "2017:01:01:23:59:59")
log.put(2, "2017:01:01:22:59:59")
log.put(3, "2016:01:01:00:00:00")
log.retrieve(
    "2016:01:01:01:01:01",
    "2017:01:01:23:00:00",
    "Year",
)
```

With granularity "Year", we compare only the year.
| Log ID | Timestamp | Compared Value |
|---|---|---|
| 1 | 2017:01:01:23:59:59 | 2017 |
| 2 | 2017:01:01:22:59:59 | 2017 |
| 3 | 2016:01:01:00:00:00 | 2016 |
The range is from 2016 to 2017, inclusive.
So the result contains all three IDs:

```python
[1, 2, 3]
```

Another query:
```python
log.retrieve(
    "2016:01:01:01:01:01",
    "2017:01:01:23:00:00",
    "Hour",
)
```

Now we compare up to the hour:
| Log ID | Compared Value |
|---|---|
| 1 | 2017:01:01:23 |
| 2 | 2017:01:01:22 |
| 3 | 2016:01:01:00 |
The end compared value is:

```python
"2017:01:01:23"
```

So IDs 1 and 2 are included. ID 3's compared value, "2016:01:01:00", falls before the start compared value "2016:01:01:01", so it is excluded.
First Thought: Parse Timestamps Into Dates
One possible approach is to parse every timestamp into numeric fields:
```python
year, month, day, hour, minute, second
```

Then for each query, compare only the fields needed by the granularity.
This works, but it is more code than needed.
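A rough sketch of what that parse-based variant could look like (illustrative only; `parse_fields` and `in_range` are hypothetical helpers, not part of the final design):

```python
# Hypothetical parse-based variant: split each timestamp into integers
# and compare only the fields the granularity requires.
GRANULARITY_FIELDS = {"Year": 1, "Month": 2, "Day": 3,
                      "Hour": 4, "Minute": 5, "Second": 6}

def parse_fields(timestamp: str) -> tuple[int, ...]:
    # "2017:01:01:23:59:59" -> (2017, 1, 1, 23, 59, 59)
    return tuple(int(part) for part in timestamp.split(":"))

def in_range(timestamp: str, start: str, end: str, granularity: str) -> bool:
    n = GRANULARITY_FIELDS[granularity]
    # Python compares tuples element by element, so slicing to the first
    # n fields compares exactly the fields the query cares about.
    return parse_fields(start)[:n] <= parse_fields(timestamp)[:n] <= parse_fields(end)[:n]
```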
The timestamp strings are already formatted in a useful way.
Because every field is zero-padded and arranged from largest unit to smallest unit, lexicographic string comparison matches chronological order.
For example:
"2017:01" < "2017:02"
"2016:12" < "2017:01"
"2017:01:01:22" < "2017:01:01:23"So we can compare timestamp prefixes as strings.
Key Insight
Each granularity corresponds to a prefix length.
| Granularity | Prefix Length | Example Prefix |
|---|---|---|
"Year" | 4 | 2017 |
"Month" | 7 | 2017:01 |
"Day" | 10 | 2017:01:01 |
"Hour" | 13 | 2017:01:01:23 |
"Minute" | 16 | 2017:01:01:23:59 |
"Second" | 19 | 2017:01:01:23:59:59 |
For a query, slice the start timestamp, end timestamp, and each stored timestamp to the same prefix length.
Then check:

```python
start_prefix <= timestamp_prefix <= end_prefix
```

The range is inclusive.
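As a quick one-off check of that rule at "Day" granularity (prefix length 10):

```python
start, end = "2016:01:01:01:01:01", "2017:01:01:23:00:00"
timestamp = "2017:01:01:22:59:59"
size = 10  # "Day"
assert start[:size] <= timestamp[:size] <= end[:size]  # "2017:01:01" is in range
```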
Algorithm
Use a list to store all logs as `(id, timestamp)` pairs.

For put:

- Append `(id, timestamp)` to the list.

For retrieve:

- Find the prefix length for the requested granularity.
- Slice `start` and `end` to that length.
- Scan all stored logs.
- Slice each log timestamp to the same length.
- If it lies between `start` and `end`, include its ID.
- Return all matching IDs.
Implementation
```python
class LogSystem:
    def __init__(self):
        self.logs = []  # list of (id, timestamp) pairs
        # Prefix length of the timestamp string for each granularity.
        self.length = {
            "Year": 4,
            "Month": 7,
            "Day": 10,
            "Hour": 13,
            "Minute": 16,
            "Second": 19,
        }

    def put(self, id: int, timestamp: str) -> None:
        self.logs.append((id, timestamp))

    def retrieve(self, start: str, end: str, granularity: str) -> list[int]:
        size = self.length[granularity]
        # Truncate the query range to the granularity's prefix length.
        start_key = start[:size]
        end_key = end[:size]
        answer = []
        for log_id, timestamp in self.logs:
            key = timestamp[:size]
            if start_key <= key <= end_key:  # inclusive on both ends
                answer.append(log_id)
        return answer
```

Code Explanation
The constructor stores all logs in a list:

```python
self.logs = []
```

It also stores the prefix length for each granularity:

```python
self.length = {
    "Year": 4,
    "Month": 7,
    "Day": 10,
    "Hour": 13,
    "Minute": 16,
    "Second": 19,
}
```

The put operation is simple:

```python
self.logs.append((id, timestamp))
```

For retrieval, we first decide how much of the timestamp matters:

```python
size = self.length[granularity]
```

Then we truncate the query range:

```python
start_key = start[:size]
end_key = end[:size]
```

For each stored log, we compare the same prefix:

```python
key = timestamp[:size]
```

If it is inside the inclusive range, we keep the ID:

```python
if start_key <= key <= end_key:
    answer.append(log_id)
```

Correctness
The timestamp format orders fields from largest to smallest: year, month, day, hour, minute, second. Each field is zero-padded, so lexicographic order is the same as chronological order for equal-length timestamp prefixes.
For a given granularity, the algorithm slices all timestamps to the exact prefix that represents that granularity. Therefore, two timestamps are compared only by the fields required by the query.
A log is returned exactly when its truncated timestamp is greater than or equal to the truncated start timestamp and less than or equal to the truncated end timestamp. This matches the problem’s inclusive range definition.
The put method stores every log without modification, so every inserted log is considered by future retrievals.
Therefore, retrieve returns exactly the IDs whose timestamps fall inside the requested range at the requested granularity.
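As an extra sanity check of the zero-padding argument (illustrative only, not part of the solution), a small randomized test can confirm that string-prefix order agrees with numeric field order:

```python
import random

def random_ts(rng: random.Random) -> str:
    # Build a zero-padded timestamp with valid field ranges.
    return "%04d:%02d:%02d:%02d:%02d:%02d" % (
        rng.randint(2000, 2030), rng.randint(1, 12), rng.randint(1, 28),
        rng.randint(0, 23), rng.randint(0, 59), rng.randint(0, 59))

rng = random.Random(0)
for _ in range(1000):
    a, b = random_ts(rng), random_ts(rng)
    for size in (4, 7, 10, 13, 16, 19):  # all granularity prefix lengths
        lex = a[:size] <= b[:size]
        num = [int(x) for x in a[:size].split(":")] <= [int(x) for x in b[:size].split(":")]
        assert lex == num
```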
Complexity
Let n be the number of stored logs.
| Operation | Time | Space |
|---|---|---|
| LogSystem() | O(1) | O(1) |
| put | O(1) | O(1) additional |
| retrieve | O(n) | O(m) |
Here, m is the number of matching log IDs returned.
The timestamp prefix length is at most 19, so slicing and comparing are constant-time for this problem.
Alternative Solution: Store Sorted Timestamps
We can also store logs in sorted order by timestamp and use binary search.
That can make retrieval faster when there are many logs.
However, for this problem the linear scan is usually enough: each comparison touches at most 19 characters, and the design is focused on correct granularity handling.
A sorted approach would need to normalize the start and end timestamps depending on the granularity, then search by timestamp range.
The list scan solution is shorter and less error-prone.
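Still, for completeness, here is a minimal sketch of the sorted approach, assuming we pad the truncated start with the smallest possible suffix and the truncated end with the largest, so that full-string binary search works. The names (`SortedLogSystem`, `MIN_SUFFIX`, `MAX_SUFFIX`) are illustrative:

```python
import bisect

MIN_SUFFIX = "0000:01:01:00:00:00"  # smallest value for every field
MAX_SUFFIX = "9999:12:31:23:59:59"  # largest value for every field

class SortedLogSystem:
    LENGTH = {"Year": 4, "Month": 7, "Day": 10,
              "Hour": 13, "Minute": 16, "Second": 19}

    def __init__(self):
        self.timestamps = []  # kept sorted
        self.ids = []         # aligned with self.timestamps

    def put(self, id: int, timestamp: str) -> None:
        # O(n) per put because list insertion shifts elements.
        pos = bisect.bisect_left(self.timestamps, timestamp)
        self.timestamps.insert(pos, timestamp)
        self.ids.insert(pos, id)

    def retrieve(self, start: str, end: str, granularity: str) -> list[int]:
        size = self.LENGTH[granularity]
        # Normalize: smallest full timestamp with the start prefix,
        # largest full timestamp with the end prefix.
        lo_key = start[:size] + MIN_SUFFIX[size:]
        hi_key = end[:size] + MAX_SUFFIX[size:]
        lo = bisect.bisect_left(self.timestamps, lo_key)
        hi = bisect.bisect_right(self.timestamps, hi_key)
        return self.ids[lo:hi]
```

Each put here is O(n) because of list insertion; with a balanced tree or a sorted-containers structure, put becomes O(log n) and retrieve drops to O(log n + m).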
Testing
```python
def run_tests():
    log = LogSystem()
    log.put(1, "2017:01:01:23:59:59")
    log.put(2, "2017:01:01:22:59:59")
    log.put(3, "2016:01:01:00:00:00")

    assert sorted(log.retrieve(
        "2016:01:01:01:01:01",
        "2017:01:01:23:00:00",
        "Year",
    )) == [1, 2, 3]

    assert sorted(log.retrieve(
        "2016:01:01:01:01:01",
        "2017:01:01:23:00:00",
        "Hour",
    )) == [1, 2]

    assert log.retrieve(
        "2017:01:01:23:59:59",
        "2017:01:01:23:59:59",
        "Second",
    ) == [1]

    assert sorted(log.retrieve(
        "2017:01:01:00:00:00",
        "2017:01:01:23:59:59",
        "Day",
    )) == [1, 2]

    print("all tests passed")

run_tests()
```

Test meaning:
| Test | Why |
|---|---|
"Year" query | Coarse granularity includes 2016 and 2017 |
"Hour" query | More precise comparison excludes earlier 2016 log |
"Second" query | Exact timestamp match |
"Day" query | Ignores hour, minute, and second |