A clear explanation of designing a simple URL encoder and decoder using a hash map and generated keys.
Problem Restatement
We need to design a class that can shorten a long URL and later recover the original URL.
The class has two methods:
encode(longUrl)and:
decode(shortUrl)encode receives a valid long URL and returns a tiny URL.
decode receives a tiny URL generated by the same object and returns the original long URL.
The problem does not restrict the algorithm. It only requires that a URL encoded by the object can be decoded back to the original URL. The statement also guarantees that decode is called only with a short URL created by the same object.
Input and Output
| Method | Input | Output |
|---|---|---|
encode | A long URL string | A short URL string |
decode | A short URL string | The original long URL |
Constraints:
| Item | Constraint |
|---|---|
| URL length | 1 <= url.length <= 10^4 |
| URL validity | The input URL is guaranteed to be valid |
Examples
Suppose the input URL is:
url = "https://leetcode.com/problems/design-tinyurl"We call:
codec = Codec()
tiny = codec.encode(url)The tiny URL can be any valid string chosen by our implementation, for example:
"https://tinyurl.com/1"Then:
codec.decode(tiny)must return:
"https://leetcode.com/problems/design-tinyurl"The exact short URL does not matter. The round trip matters:
decode(encode(url)) == urlFirst Thought: Return the Original URL
Since there is no restriction on the algorithm, one trivial idea is:
encode(longUrl) returns longUrl
decode(shortUrl) returns shortUrlThis technically preserves the URL, but it defeats the purpose of a URL shortener.
A better solution should return a shorter string and store enough information to recover the original URL.
Key Insight
A short URL does not need to contain the whole original URL.
It only needs to contain a key.
We can store this mapping:
key -> longUrlThen the short URL is:
base_url + keyFor example:
"https://tinyurl.com/" + "1"When decoding, we extract the key from the short URL and look up the original URL in the hash map.
A simple counter gives us unique keys:
1, 2, 3, 4, ...This avoids collision handling and keeps the implementation small.
Algorithm
Maintain:
self.code_to_url
self.next_id
self.baseself.code_to_url maps generated codes to original URLs.
self.next_id gives each new URL a unique code.
self.base is the prefix used for short URLs.
For encode(longUrl):
- Convert
self.next_idto a string code. - Store
code -> longUrl. - Increment
self.next_id. - Return
self.base + code.
For decode(shortUrl):
- Remove the base prefix from
shortUrl. - Use the remaining code to look up the long URL.
- Return the stored URL.
Correctness
Every call to encode assigns the current counter value as the code. The counter increases after each assignment, so no two encoded URLs receive the same code.
The algorithm stores the exact original URL under that code in code_to_url.
The short URL returned by encode is the base prefix followed by the code. When decode receives this short URL, it removes the same base prefix and recovers the same code.
Since code_to_url[code] stores the original longUrl, decode returns exactly the URL that was passed to encode.
Therefore, for every encoded URL:
decode(encode(longUrl)) == longUrlComplexity
Let L be the length of the URL.
| Operation | Time | Space |
|---|---|---|
encode | O(L) | O(L) |
decode | O(1) average lookup, plus string slicing | O(1) extra |
The stored map uses total space proportional to the total length of all encoded long URLs.
Implementation
class Codec:
def __init__(self):
self.base = "http://tinyurl.com/"
self.next_id = 1
self.code_to_url = {}
def encode(self, longUrl: str) -> str:
code = str(self.next_id)
self.next_id += 1
self.code_to_url[code] = longUrl
return self.base + code
def decode(self, shortUrl: str) -> str:
code = shortUrl[len(self.base):]
return self.code_to_url[code]Code Explanation
The constructor initializes the service:
self.base = "http://tinyurl.com/"
self.next_id = 1
self.code_to_url = {}The base string is only a prefix. It does not need to be a real working domain for this LeetCode problem.
In encode, we generate a code from the counter:
code = str(self.next_id)
self.next_id += 1Then we store the mapping:
self.code_to_url[code] = longUrlFinally, we return the tiny URL:
return self.base + codeIn decode, we recover the code by slicing off the base prefix:
code = shortUrl[len(self.base):]Then we return the original URL:
return self.code_to_url[code]Optional Improvement: Reuse the Same Short URL
The simple counter solution creates a new short URL each time, even if the same long URL is encoded again.
We can add another map:
longUrl -> codeThen repeated calls with the same URL return the same short URL.
class Codec:
def __init__(self):
self.base = "http://tinyurl.com/"
self.next_id = 1
self.code_to_url = {}
self.url_to_code = {}
def encode(self, longUrl: str) -> str:
if longUrl in self.url_to_code:
return self.base + self.url_to_code[longUrl]
code = str(self.next_id)
self.next_id += 1
self.url_to_code[longUrl] = code
self.code_to_url[code] = longUrl
return self.base + code
def decode(self, shortUrl: str) -> str:
code = shortUrl[len(self.base):]
return self.code_to_url[code]This is not required by the problem, but it is closer to how a real URL shortener would usually behave.
Testing
def run_tests():
codec = Codec()
url1 = "https://leetcode.com/problems/design-tinyurl"
short1 = codec.encode(url1)
assert codec.decode(short1) == url1
url2 = "https://example.com/a/very/long/path?x=1&y=2"
short2 = codec.encode(url2)
assert codec.decode(short2) == url2
assert short1 != short2
url3 = "https://a.com"
short3 = codec.encode(url3)
assert codec.decode(short3) == url3
print("all tests passed")
run_tests()| Test | Why |
|---|---|
| Standard URL | Checks basic round trip |
| URL with path and query | Confirms the whole string is preserved |
| Different URLs | Confirms different generated codes |
| Short valid URL | Checks small input |