A clear explanation of finding the minimum number of swaps needed to transform one anagram string into another using BFS.
Problem Restatement
We are given two strings s1 and s2.
The two strings are anagrams. That means they contain the same letters with the same frequencies.
One operation lets us swap any two characters in s1.
We need to return the smallest number of swaps needed to turn s1 into s2.
This smallest number is called k.
For example:
s1 = "abc"
s2 = "bca"One valid sequence is:
"abc" -> "bac" -> "bca"So the answer is:
2Input and Output
| Item | Meaning |
|---|---|
| Input | s1, the starting string |
| Input | s2, the target string |
| Output | The minimum number of swaps needed |
| Constraint | s1 and s2 are anagrams |
| Constraint | 1 <= s1.length <= 20 |
| Constraint | Letters come only from a through f |
Function shape:
class Solution:
def kSimilarity(self, s1: str, s2: str) -> int:
...Examples
Example 1:
s1 = "ab"
s2 = "ba"Swap the two characters:
"ab" -> "ba"The answer is:
1Example 2:
s1 = "abc"
s2 = "bca"One shortest sequence is:
"abc" -> "bac" -> "bca"The answer is:
2Example 3:
s1 = "aabc"
s2 = "abca"One shortest sequence is:
"aabc" -> "abac" -> "abca"The answer is:
2First Thought: Try All Swaps
A direct way is to treat every string arrangement as a state.
From one state, we can swap any two positions to create another state.
Then the problem becomes:
Find the shortest path from s1 to s2.
Each swap costs 1.
Since every move has the same cost, breadth-first search is a natural fit.
The problem is that trying every possible swap creates too many states.
For a string of length n, there are up to:
n * (n - 1) / 2possible swaps from each state.
We need to generate fewer useful states.
Key Insight
At any current string, find the first index where it differs from s2.
Call that index i.
For example:
current = "abc"
target = "bca"At index 0:
current[0] = "a"
target[0] = "b"So index 0 is wrong.
To make progress, we should swap some later character into index 0.
Specifically, we want a later index j where:
current[j] == target[i]Then after swapping i and j, index i becomes correct.
This pruning is important.
Instead of trying all swaps, we only try swaps that fix the first wrong position.
Because an optimal solution must eventually put the correct character at that first wrong position, BFS can safely explore only these useful moves.
Algorithm
Use BFS over string states.
Start from s1.
Each BFS level represents one more swap.
For each current string:
- If it equals
s2, return the current number of swaps. - Find the first index
iwherecurrent[i] != s2[i]. - Try every later index
jsuch that:current[j] == s2[i]current[j] != s2[j]
- Swap
iandjto create a new string. - Add the new string to the queue if it has not been visited.
The condition:
current[j] != s2[j]avoids breaking a position that is already correct. It is a safe and useful pruning step.
Correctness
Each string state is a node in a graph.
There is an edge between two states if one can be changed into the other by one swap.
The answer is the length of the shortest path from s1 to s2.
BFS explores states by distance from s1. It checks all states reachable in 0 swaps, then all states reachable in 1 swap, then all states reachable in 2 swaps, and so on.
Therefore, the first time BFS reaches s2, the number of levels used is the minimum number of swaps.
The pruning step does not remove all optimal paths.
For any current state, let i be the first mismatched index. In any valid solution, index i must eventually receive s2[i]. Since all earlier indices are already correct, we can choose a shortest solution that fixes index i now by swapping it with some later position containing s2[i]. This produces a state that is at least as close in the same shortest-search graph.
So the generated neighbors still include enough states to reach an optimal answer.
Complexity
| Metric | Value | Why |
|---|---|---|
| Time | Exponential in n | BFS may still explore many string states |
| Space | Exponential in n | The queue and visited set store generated states |
The constraints are small: s1.length <= 20, and the alphabet is limited to six letters.
The pruning makes BFS practical for this problem.
Implementation
from collections import deque
class Solution:
def kSimilarity(self, s1: str, s2: str) -> int:
if s1 == s2:
return 0
queue = deque([s1])
visited = {s1}
swaps = 0
while queue:
for _ in range(len(queue)):
current = queue.popleft()
if current == s2:
return swaps
i = 0
while current[i] == s2[i]:
i += 1
current_list = list(current)
for j in range(i + 1, len(current)):
if current[j] == s2[i] and current[j] != s2[j]:
next_list = current_list[:]
next_list[i], next_list[j] = next_list[j], next_list[i]
next_state = "".join(next_list)
if next_state not in visited:
visited.add(next_state)
queue.append(next_state)
swaps += 1
return -1Code Explanation
We start with the simplest case:
if s1 == s2:
return 0If the strings are already equal, no swaps are needed.
The BFS queue starts with s1:
queue = deque([s1])
visited = {s1}
swaps = 0The variable swaps is the BFS level. It tells us how many swaps were used to reach the states currently in the queue.
For each string state, we first check whether it is the target:
if current == s2:
return swapsThen we find the first wrong index:
i = 0
while current[i] == s2[i]:
i += 1This is the index we try to fix next.
Then we search for a later character that belongs at index i:
for j in range(i + 1, len(current)):
if current[j] == s2[i] and current[j] != s2[j]:The first condition means swapping j into i fixes position i.
The second condition avoids taking a character from a position that is already correct.
For every valid swap, we build the next string:
next_list = current_list[:]
next_list[i], next_list[j] = next_list[j], next_list[i]
next_state = "".join(next_list)If we have not seen it before, we add it to BFS:
if next_state not in visited:
visited.add(next_state)
queue.append(next_state)After finishing the current BFS level, we increase the swap count:
swaps += 1Testing
def test_k_similarity():
s = Solution()
assert s.kSimilarity("ab", "ba") == 1
assert s.kSimilarity("abc", "bca") == 2
assert s.kSimilarity("aabc", "abca") == 2
assert s.kSimilarity("abc", "abc") == 0
assert s.kSimilarity("abccaacceecdeea", "bcaacceeccdeaae") == 9
print("all tests passed")
test_k_similarity()Test meaning:
| Test | Why |
|---|---|
"ab" to "ba" | Minimum non-trivial swap |
"abc" to "bca" | Needs two swaps |
"aabc" to "abca" | Repeated letters |
| Equal strings | Zero swaps |
| Larger case | Checks deeper BFS search |