Parallel Quicksort

Parallel quicksort follows the same partition based divide and conquer structure as quicksort, but executes recursive calls concurrently. The key idea is that once the partition step finishes, the left and right subarrays become independent and can be processed in parallel.

Compared to parallel merge sort, this method avoids the cost of merging but introduces challenges in load balancing and pivot quality.

Problem

Given an array $A$ of size $n$ , sort it in nondecreasing order using parallel execution.

Algorithm

Partition the array, then sort both sides concurrently.

parallel_quicksort(A, lo, hi):
    if lo >= hi:
        return

    p = partition(A, lo, hi)

    spawn task:
        parallel_quicksort(A, lo, p - 1)

    spawn task:
        parallel_quicksort(A, p + 1, hi)

    wait for both tasks

The partition step rearranges elements such that:

A[lo..p-1] \le A[p] \le A[p+1..hi]

Partition

Partition remains sequential in most implementations, which can limit scalability.

partition(A, lo, hi):
    pivot = A[hi]
    i = lo

    for j from lo to hi - 1:
        if A[j] <= pivot:
            swap A[i], A[j]
            i += 1

    swap A[i], A[hi]
    return i

Complexity

measure	value
work $T_1(n)$	$O(n \log n)$ average
span $T_\infty(n)$	$O(\log n)$ average
worst case work	$O(n^2)$

Parallelism depends heavily on balanced partitions.

Cost Model

condition	behavior
balanced partitions	high parallelism
skewed partitions	low parallelism
poor pivot choice	degrades to sequential

Expected parallel time with $p$ processors:

O\left(\frac{n \log n}{p} + \log n\right)

Correctness

Each partition step places the pivot in its final sorted position. Recursive calls sort independent subarrays. Since no cross dependencies exist after partitioning, parallel execution preserves correctness.

Practical Considerations

Pivot selection is critical. Randomized or median of three improves balance.
Small subarrays should switch to sequential sort to reduce overhead.
In place partitioning gives good cache locality.
Work stealing schedulers help mitigate imbalance.

When to Use

Use parallel quicksort when:

in place sorting is required
memory usage must remain low
average case performance is sufficient
input distribution is not adversarial

For guaranteed worst case bounds or stability, parallel merge sort is more predictable.

Implementation (Go, simplified)

func ParallelQuicksort(a []int, lo, hi int) {
	if lo >= hi {
		return
	}

	p := partition(a, lo, hi)

	done := make(chan struct{}, 2)

	go func() {
		ParallelQuicksort(a, lo, p-1)
		done <- struct{}{}
	}()

	go func() {
		ParallelQuicksort(a, p+1, hi)
		done <- struct{}{}
	}()

	<-done
	<-done
}

func partition(a []int, lo, hi int) int {
	pivot := a[hi]
	i := lo

	for j := lo; j < hi; j++ {
		if a[j] <= pivot {
			a[i], a[j] = a[j], a[i]
			i++
		}
	}

	a[i], a[hi] = a[hi], a[i]
	return i
}