Generics, Arrays and Cloning


#1

I wrote a generic merge sort function, and through fighting with the compiler, arrived at the below solution. I had to use copious amount of clone() to get what I wanted. Is there a solution for such a program that involves less friction? Thanks!

use std::cmp::{Eq,Ord};
use std::clone::Clone;

pub fn merge_sort<T: Eq + Ord + Clone>(a: &mut[T]) {
	if a.len() <= 1 { return }

	let mid = a.len() / 2;

	merge_sort(&mut a[..mid]);
	merge_sort(&mut a[mid..]);

	let mut temp: Vec<T> = Vec::new();
	temp.push_all(a);
	
	merge_lists(&a[..mid], &a[mid..], &mut temp);
	for i in 0..temp.len(){
		a[i] = temp[i].clone();
	}

}

fn merge_lists<T: Eq + Ord + Clone>(a: &[T], b: &[T],c: &mut Vec<T>) {
	let mut a_index = 0;
	let mut b_index = 0;
	for i in 0..c.len() {
		if a_index == a.len() {
			c[i] = b[b_index].clone();
			b_index += 1;
		}
		else if b_index == b.len() {
			c[i] = a[a_index].clone();
			a_index += 1;
		}
		else {
			if a[a_index] < b[b_index] {
				c[i] = a[a_index].clone();
				a_index += 1;
			}
			else {
				c[i] = b[b_index].clone();
				b_index += 1;
			}
		}
	}
}

#2
let mut temp: Vec<T> = Vec::new();
temp.push_all(a);

I guess you’re doing this just to ensure the buffer is initialized, since everything in temp gets overwritten by merge_lists. But since merge_lists writes to the elements of temp in order, you can instead pass an empty vector and have it push those elements. To avoid reallocations, use reserve first.

merge_lists(&a[..mid], &a[mid..], &mut temp);

Since you’re passing a reference, a is expected to still be a valid list of Ts at the end of the call, so merge_lists is forced to clone the elements the first two parameters. Unfortunately, I’m not sure there’s a way to fix this in safe code without incurring some other overhead… This is one thing &uniq [T] (or whatever it’s called) from the recent RFC would be useful for, although for this use case, a similar type implemented in a library using unsafe code would also be sufficient.

If you care about safety more than performance, you can change merge_sort to take a as a Vec, then use split_off to split it into two Vecs at the midpoint. However, the allocator isn’t smart enough to be able to reuse the memory of each part of the original Vec, so half of it will be memcpy’d into a new buffer.


#3

@comex already gave a good answer to your questions, but I just thought I’d let you know that I have a repository of sorting algorithms, including merge sort:

I also used lots of clone(); I didn’t see a good way around that other than some ugly unsafe code, and I was focusing more on readability.


#4

@comex thanks for the tip. This is what I ended up doing:

let mut temp: Vec<T> = Vec::with_capacity(a.len()); 

Not sure about what you mean about reserve though. Is this some language feature? Because I can’t seem to find it.

@wackywendell haha nice you beat me to the sorting punch! You’re code looks way more Rust-like. Using match is still not my first go-to, even though it seems quite powerful, and my code ends up looking a lot like C++ because of it. Our sorting algorithms should race! :smile:


#5

The reserve method on Vec, which adds capacity to a vector. I forgot about constructing with with_capacity; that’s fine too of course.


#6

Haha, well, I try and 1) always do for loops over iterators and not range() (unless I really want the numbers and not just use them as indices), and 2) prefer match over if … else. And I was going for readability.

Well, that repo has benchmarks built in, with cargo bench, so feel free to use that! I’d be curious… I’m not sure my use of iterators and cloning will be quite as fast as it could; your more “C-like” algorithm might have an advantage. But one never knows until one benches!


#7

I think you should be able to use std::mem::swap in merge_lists rather than clone (as you don’t need to leave the contents of a and b intact). All you really need then is to take an auxiliary slice (for storage) as input to a merge_sort_helper function, and you can swap into it, and then swap out of it in your for loop. You could either take that spare slice as an argument, or clone it up once, or use the Default trait instead to populate the slice (which is probably better for any T with a non-trivial Clone implementation).

Caveat: I implemented none of this, but have written a merge sort in Rust once upon a time. :slight_smile: