Partition the array around a pivot, then sort partitions concurrently using parallel recursion.
Parallel quicksort follows the same partition based divide and conquer structure as quicksort, but executes recursive calls concurrently. The key idea is that once the partition step finishes, the left and right subarrays become independent and can be processed in parallel.
Compared to parallel merge sort, this method avoids the cost of merging but introduces challenges in load balancing and pivot quality.
Problem
Given an array of size , sort it in nondecreasing order using parallel execution.
Algorithm
Partition the array, then sort both sides concurrently.
parallel_quicksort(A, lo, hi):
if lo >= hi:
return
p = partition(A, lo, hi)
spawn task:
parallel_quicksort(A, lo, p - 1)
spawn task:
parallel_quicksort(A, p + 1, hi)
wait for both tasksThe partition step rearranges elements such that:
Partition
Partition remains sequential in most implementations, which can limit scalability.
partition(A, lo, hi):
pivot = A[hi]
i = lo
for j from lo to hi - 1:
if A[j] <= pivot:
swap A[i], A[j]
i += 1
swap A[i], A[hi]
return iComplexity
| measure | value |
|---|---|
| work | average |
| span | average |
| worst case work |
Parallelism depends heavily on balanced partitions.
Cost Model
| condition | behavior |
|---|---|
| balanced partitions | high parallelism |
| skewed partitions | low parallelism |
| poor pivot choice | degrades to sequential |
Expected parallel time with processors:
Correctness
Each partition step places the pivot in its final sorted position. Recursive calls sort independent subarrays. Since no cross dependencies exist after partitioning, parallel execution preserves correctness.
Practical Considerations
- Pivot selection is critical. Randomized or median of three improves balance.
- Small subarrays should switch to sequential sort to reduce overhead.
- In place partitioning gives good cache locality.
- Work stealing schedulers help mitigate imbalance.
When to Use
Use parallel quicksort when:
- in place sorting is required
- memory usage must remain low
- average case performance is sufficient
- input distribution is not adversarial
For guaranteed worst case bounds or stability, parallel merge sort is more predictable.
Implementation (Go, simplified)
func ParallelQuicksort(a []int, lo, hi int) {
if lo >= hi {
return
}
p := partition(a, lo, hi)
done := make(chan struct{}, 2)
go func() {
ParallelQuicksort(a, lo, p-1)
done <- struct{}{}
}()
go func() {
ParallelQuicksort(a, p+1, hi)
done <- struct{}{}
}()
<-done
<-done
}
func partition(a []int, lo, hi int) int {
pivot := a[hi]
i := lo
for j := lo; j < hi; j++ {
if a[j] <= pivot {
a[i], a[j] = a[j], a[i]
i++
}
}
a[i], a[hi] = a[hi], a[i]
return i
}