Skip to content

LeetCode 854: K-Similar Strings

A clear explanation of finding the minimum number of swaps needed to transform one anagram string into another using BFS.

Problem Restatement

We are given two strings s1 and s2.

The two strings are anagrams. That means they contain the same letters with the same frequencies.

One operation lets us swap any two characters in s1.

We need to return the smallest number of swaps needed to turn s1 into s2.

This smallest number is called k.

For example:

s1 = "abc"
s2 = "bca"

One valid sequence is:

"abc" -> "bac" -> "bca"

So the answer is:

2

Input and Output

ItemMeaning
Inputs1, the starting string
Inputs2, the target string
OutputThe minimum number of swaps needed
Constraints1 and s2 are anagrams
Constraint1 <= s1.length <= 20
ConstraintLetters come only from a through f

Function shape:

class Solution:
    def kSimilarity(self, s1: str, s2: str) -> int:
        ...

Examples

Example 1:

s1 = "ab"
s2 = "ba"

Swap the two characters:

"ab" -> "ba"

The answer is:

1

Example 2:

s1 = "abc"
s2 = "bca"

One shortest sequence is:

"abc" -> "bac" -> "bca"

The answer is:

2

Example 3:

s1 = "aabc"
s2 = "abca"

One shortest sequence is:

"aabc" -> "abac" -> "abca"

The answer is:

2

First Thought: Try All Swaps

A direct way is to treat every string arrangement as a state.

From one state, we can swap any two positions to create another state.

Then the problem becomes:

Find the shortest path from s1 to s2.

Each swap costs 1.

Since every move has the same cost, breadth-first search is a natural fit.

The problem is that trying every possible swap creates too many states.

For a string of length n, there are up to:

n * (n - 1) / 2

possible swaps from each state.

We need to generate fewer useful states.

Key Insight

At any current string, find the first index where it differs from s2.

Call that index i.

For example:

current = "abc"
target  = "bca"

At index 0:

current[0] = "a"
target[0]  = "b"

So index 0 is wrong.

To make progress, we should swap some later character into index 0.

Specifically, we want a later index j where:

current[j] == target[i]

Then after swapping i and j, index i becomes correct.

This pruning is important.

Instead of trying all swaps, we only try swaps that fix the first wrong position.

Because an optimal solution must eventually put the correct character at that first wrong position, BFS can safely explore only these useful moves.

Algorithm

Use BFS over string states.

Start from s1.

Each BFS level represents one more swap.

For each current string:

  1. If it equals s2, return the current number of swaps.
  2. Find the first index i where current[i] != s2[i].
  3. Try every later index j such that:
    1. current[j] == s2[i]
    2. current[j] != s2[j]
  4. Swap i and j to create a new string.
  5. Add the new string to the queue if it has not been visited.

The condition:

current[j] != s2[j]

avoids breaking a position that is already correct. It is a safe and useful pruning step.

Correctness

Each string state is a node in a graph.

There is an edge between two states if one can be changed into the other by one swap.

The answer is the length of the shortest path from s1 to s2.

BFS explores states by distance from s1. It checks all states reachable in 0 swaps, then all states reachable in 1 swap, then all states reachable in 2 swaps, and so on.

Therefore, the first time BFS reaches s2, the number of levels used is the minimum number of swaps.

The pruning step does not remove all optimal paths.

For any current state, let i be the first mismatched index. In any valid solution, index i must eventually receive s2[i]. Since all earlier indices are already correct, we can choose a shortest solution that fixes index i now by swapping it with some later position containing s2[i]. This produces a state that is at least as close in the same shortest-search graph.

So the generated neighbors still include enough states to reach an optimal answer.

Complexity

MetricValueWhy
TimeExponential in nBFS may still explore many string states
SpaceExponential in nThe queue and visited set store generated states

The constraints are small: s1.length <= 20, and the alphabet is limited to six letters.

The pruning makes BFS practical for this problem.

Implementation

from collections import deque

class Solution:
    def kSimilarity(self, s1: str, s2: str) -> int:
        if s1 == s2:
            return 0

        queue = deque([s1])
        visited = {s1}
        swaps = 0

        while queue:
            for _ in range(len(queue)):
                current = queue.popleft()

                if current == s2:
                    return swaps

                i = 0
                while current[i] == s2[i]:
                    i += 1

                current_list = list(current)

                for j in range(i + 1, len(current)):
                    if current[j] == s2[i] and current[j] != s2[j]:
                        next_list = current_list[:]
                        next_list[i], next_list[j] = next_list[j], next_list[i]
                        next_state = "".join(next_list)

                        if next_state not in visited:
                            visited.add(next_state)
                            queue.append(next_state)

            swaps += 1

        return -1

Code Explanation

We start with the simplest case:

if s1 == s2:
    return 0

If the strings are already equal, no swaps are needed.

The BFS queue starts with s1:

queue = deque([s1])
visited = {s1}
swaps = 0

The variable swaps is the BFS level. It tells us how many swaps were used to reach the states currently in the queue.

For each string state, we first check whether it is the target:

if current == s2:
    return swaps

Then we find the first wrong index:

i = 0
while current[i] == s2[i]:
    i += 1

This is the index we try to fix next.

Then we search for a later character that belongs at index i:

for j in range(i + 1, len(current)):
    if current[j] == s2[i] and current[j] != s2[j]:

The first condition means swapping j into i fixes position i.

The second condition avoids taking a character from a position that is already correct.

For every valid swap, we build the next string:

next_list = current_list[:]
next_list[i], next_list[j] = next_list[j], next_list[i]
next_state = "".join(next_list)

If we have not seen it before, we add it to BFS:

if next_state not in visited:
    visited.add(next_state)
    queue.append(next_state)

After finishing the current BFS level, we increase the swap count:

swaps += 1

Testing

def test_k_similarity():
    s = Solution()

    assert s.kSimilarity("ab", "ba") == 1
    assert s.kSimilarity("abc", "bca") == 2
    assert s.kSimilarity("aabc", "abca") == 2
    assert s.kSimilarity("abc", "abc") == 0
    assert s.kSimilarity("abccaacceecdeea", "bcaacceeccdeaae") == 9

    print("all tests passed")

test_k_similarity()

Test meaning:

TestWhy
"ab" to "ba"Minimum non-trivial swap
"abc" to "bca"Needs two swaps
"aabc" to "abca"Repeated letters
Equal stringsZero swaps
Larger caseChecks deeper BFS search