Sorting; my quicksort vs std sort vs pdqsort, my sort wins, why?

I have tested next sorting function:

  1. my one (just quick sort copy/past from the Internet)
  2. std::slice::sort_unstable
  3. pdqsort::sort

I sorted vector of &str with 1_00_000_000 items.

it takes on my PC

  1. 52 seconds
  2. 92 seconds
  3. 98 seconds

2 and 3 functions is about the same by mine wins almost in 2x times.

I use release mode and measure time with std::time::Instant::now()/now.elapsed().

Why library function for sorting too slow?
Does it make sense to use library functions to sort items if my function is so fast?

a bonus question I written one more program in golang where I am sorting items too and it takes 56 seconds like my custom quick sort function. It faster then sorting library written in rust.
golang uses pdqsort as my thrid function.
I measure time there with help
start := time.Now()

the code you can find below

fn main() {
    let mut hash = std::collections::HashSet::new();

    let now = std::time::Instant::now();
    for i in 0..100_000_000 {
    println!("{:?} insert to hashmap", now.elapsed());

    let now = std::time::Instant::now();
    let mut v: Vec<&str> = hash.iter().map(|s| s.as_str()).collect();
    println!("{:?} collect keys to vector", now.elapsed());

    // let now = std::time::Instant::now();
    // let i = quick_sort(&mut v);
    // println!("{} {:?} custom sorting vector", i, now.elapsed());

    // let now = std::time::Instant::now();
    // pdqsort::sort(&mut v);
    // println!("{:?} pdqsort sorting vector", now.elapsed());

    let now = std::time::Instant::now();
    println!("{:?} std sort sorting vector", now.elapsed());

pub fn quick_sort(arr: &mut [&str]) -> usize {
    let len = arr.len();
    _quick_sort(arr, 0, (len - 1) as isize)

fn _quick_sort(arr: &mut [&str], low: isize, high: isize) -> usize {
    let mut i = 0;
    if low < high {
        let p = partition(arr, low, high);
        _quick_sort(arr, low, p.0 - 1);
        _quick_sort(arr, p.0 + 1, high);
        i += p.1;


fn partition(arr: &mut [&str], low: isize, high: isize) -> (isize, usize) {
    let pivot = high as usize;
    let mut store_index = low - 1;
    let mut last_index = high;
    let mut i = 0;

    loop {
        i += 1;
        store_index += 1;
        while arr[store_index as usize] < arr[pivot] {
            store_index += 1;
        last_index -= 1;
        while last_index >= 0 && arr[last_index as usize] > arr[pivot] {
            last_index -= 1;
        if store_index >= last_index {
        } else {
            arr.swap(store_index as usize, last_index as usize);
    arr.swap(store_index as usize, pivot);
    (store_index, i)

package main

import (

func main() {
	basket := make(map[string]string)

	start1 := time.Now()
	for i := 0; i < 100000000; i++ {
		basket[strconv.Itoa(i)] = strconv.Itoa(i)

	fmt.Println(time.Since(start1), "insert to hashmap")

	keys := make([]string, 0, len(basket))

	start2 := time.Now()
	for k := range basket {
		keys = append(keys, k)
	fmt.Println(time.Since(start2), "collect keys to vector")

	start3 := time.Now()
	fmt.Println(time.Since(start3), "sorting vector")

Are you compiling the Rust code with optimizations enabled (cargo run --release)?

exactly I know this trap))

I'm not sure how HashSet behaves but it seems to me you might end up sorting a Vector that is already sorted. If it were me I would fill the vector with strings created from a random number sequence.

Are you aware that you're including the construction of v in your measurement? Not sure how much time that takes, but I guess that's not really wanted. Same for the println, but I guess this doesn't weigh in heavily here.

Do you mean filling input data?
If so I just measure sorting

    let now = std::time::Instant::now();
    let i = quick_sort(&mut v);
    println!("{} {:?} custom sorting vector", i, now.elapsed());

Ahh you're creating now several times, which threw me off :slight_smile:

I would suggest writing a small program to generate a list of random strings, and then compare the performance of sorting that same list of strings if you're actually trying to observe differences in behavior between the algorithms. Otherwise you could easily end up comparing one algorithm's best case performance to another algorithm's worst, which is a sort of skewed comparison.

1 Like

If you mean rust vs go benchmark then may be you are right, I check it.
But I think hashset means usually O(1) it means to access item you need to calc hash. And hash means undefined order.

Yes, I did it before but generating random strings takes a lot of time comparing to hashset.
Ok, I will try it. For example I will generate it to file and then use it all the time.
I will back in about 15 minutes with result... if generating does not takes too much)

One of many tricky parts of optimization is accidentally targeting your specific machine. Here's what I got on mine (an AMD 3970X, which per-thread is slow by today's standards):

1) 67.820884889s custom sorting vector
2) 43.069694185s std sort sorting vector
3) 47.000875651s pdqsort sorting vector

No need to generate it to file. You can generate pseudo/random number sequences from a seed value. If the seed is always the same the same sequence will be generated every time.

1 Like

Another tricky part is not accounting for worst-case:

1) I killed it after 10 min
2) 352.981881ms std sort sorting vector
3) 318.982989ms pdqsort sorting vector

I presorted the vector for this last run; this is the achilles heel of the basic quicksort algorithm.


I tried it in the playground (reducing the number of elements to keep the playground happy) and std's sort is significantly faster there: Rust Playground

307.995471ms custom sorting vector
4.727579ms std sort sorting vector
1 Like

I also can't replicate your results locally. Are you sure the code that generated those numbers is exactly what you posted, with the relevant parts (un)commented? The way you're measuring it, it would be easy to accidentally take the wrong now if you weren't careful about uncommenting stuff.

There are also extraneous things that can make misleading results like this, like say Windows spontaneously decides to index your entire hard drive in the middle of a benchmark. Can you run all 3 again to confirm that it wasn't a fluke?

1 Like

just because you sorts already sorted vector in second case

1 Like

That's tricky for comparing across languages, which is why I suggested a file. But otherwise yes, you're absolutely correct

@semicoleon @ZiCog
I wrote the code which generates random strings and write them to file.
Then I read the file in other program and sort lines.
For rust the result almost the same but golang takes more time - 68 seconds instead of 53.

1 Like

Oh, my mistake.

here the code you wanted to post
std sort faster