Anagrams are strings or sequences that contain the same elements with the same multiplicities, possibly in different order.
Anagrams are strings or sequences that contain the same elements with the same multiplicities, possibly in different order. For example, "listen" and "silent" are anagrams because each letter appears the same number of times in both strings.
The usual algorithmic task is to compare frequency tables. Sorting also works, but counting is often faster when the alphabet is small or bounded.
Problem
Given two strings a and b, decide whether they are anagrams.
a = "listen"
b = "silent"Expected result:
trueFor strings of different lengths, the answer is immediately false.
Solution: Frequency Table
For ASCII lowercase letters, use a fixed-size array of length 26.
func IsAnagramLowerASCII(a, b string) bool {
if len(a) != len(b) {
return false
}
var count [26]int
for i := 0; i < len(a); i++ {
count[a[i]-'a']++
count[b[i]-'a']--
}
for i := 0; i < 26; i++ {
if count[i] != 0 {
return false
}
}
return true
}This assumes every byte is between 'a' and 'z'.
Invariant
After processing the first i bytes:
count[c] = occurrences of c in a[0:i] minus occurrences of c in b[0:i]If both strings are anagrams, every final difference is zero. If any count is nonzero, at least one character appears a different number of times.
General Byte Strings
For arbitrary byte strings, use an array of length 256.
func IsAnagramBytes(a, b string) bool {
if len(a) != len(b) {
return false
}
var count [256]int
for i := 0; i < len(a); i++ {
count[a[i]]++
count[b[i]]--
}
for i := 0; i < 256; i++ {
if count[i] != 0 {
return false
}
}
return true
}This treats strings as byte sequences. It is correct for binary data and byte-oriented formats.
Unicode Code Points
For Unicode code points, range over runes and use a map.
func IsAnagramRunes(a, b string) bool {
count := make(map[rune]int)
lenA := 0
for _, r := range a {
count[r]++
lenA++
}
lenB := 0
for _, r := range b {
count[r]--
lenB++
}
if lenA != lenB {
return false
}
for _, v := range count {
if v != 0 {
return false
}
}
return true
}This compares Unicode code points, not grapheme clusters or normalized text. For human-facing text, normalize first when needed.
Sorting Method
Another method is to sort both strings and compare the sorted results.
func IsAnagramBySorting(a, b string) bool {
if len(a) != len(b) {
return false
}
x := []byte(a)
y := []byte(b)
sort.Slice(x, func(i, j int) bool { return x[i] < x[j] })
sort.Slice(y, func(i, j int) bool { return y[i] < y[j] })
return string(x) == string(y)
}This requires:
import "sort"Sorting is simple and general for byte strings, but slower than counting for bounded alphabets.
Complexity:
Time: O(n log n)
Space: O(n)Counting complexity:
Time: O(n)
Space: O(1) for fixed alphabets, O(k) for mapswhere k is the number of distinct symbols.
Grouping Anagrams
To group words that are anagrams of each other, compute a canonical key for each word.
For lowercase ASCII, the frequency vector is a good key.
func GroupAnagrams(words []string) [][]string {
groups := make(map[[26]int][]string)
for _, w := range words {
var key [26]int
for i := 0; i < len(w); i++ {
key[w[i]-'a']++
}
groups[key] = append(groups[key], w)
}
out := make([][]string, 0, len(groups))
for _, g := range groups {
out = append(out, g)
}
return out
}The array [26]int can be used as a map key because arrays are comparable in Go.
Find Anagrams in a String
A common sliding window problem asks for all positions where a substring of s is an anagram of p.
func FindAnagrams(s, p string) []int {
if len(p) > len(s) {
return nil
}
var need [26]int
var have [26]int
for i := 0; i < len(p); i++ {
need[p[i]-'a']++
have[s[i]-'a']++
}
var out []int
if have == need {
out = append(out, 0)
}
for r := len(p); r < len(s); r++ {
have[s[r]-'a']++
have[s[r-len(p)]-'a']--
if have == need {
out = append(out, r-len(p)+1)
}
}
return out
}Invariant:
Before comparison, have contains frequencies for the current window of length len(p).Each step adds the new rightmost character and removes the old leftmost character.
Design Choices
| Case | Method | Time | Space |
|---|---|---|---|
| Lowercase ASCII | [26]int counts | O(n) | O(1) |
| Byte strings | [256]int counts | O(n) | O(1) |
| Unicode code points | map[rune]int | O(n) | O(k) |
| Simple general byte method | sorting | O(n log n) | O(n) |
| Repeated grouping | canonical frequency key | O(total length) | O(groups) |
Common Pitfalls
Using len(s) for Unicode character count gives byte count, not rune count.
For lowercase-only code, failing to validate input can panic or corrupt counts when characters fall outside 'a' to 'z'.
Sorting strings directly is impossible in Go because strings are immutable. Convert to []byte or []rune.
For human language, visually identical strings may have different Unicode representations. Normalize before comparison when that matters.
Using a map key built by concatenating counts without separators can collide. For example, counts [1, 11] and [11, 1] both become ambiguous if encoded carelessly.
Takeaway
Anagram detection is equality of frequency tables. Use fixed arrays for small alphabets, maps for larger symbol sets, and sorting when simplicity matters more than linear time. For substring anagram search, combine a frequency table with a fixed-size sliding window.