Implementing Ord with external content

I am stuck on an interesting problem. Any hints on how to proceed would be much appreciated.
I am using BinaryHeap<T>, where T can be arbitrary (potentially large) structs.
The problem is that the collections implementation copies and stores them, which is inefficient.
I want to instead insert into the heap items from index:&[usize], where the index values refer to positions in some given data:Vec<T>.

I understand that in order for BinaryHeap<usize> to work like this, I must implement Ord, so that the comparisons it carries out on the index items indirectly compare the values in data instead. However, I keep getting compiler complaints about not being allowed to refer to external content (data Vec). It suggest using closures to capture it but when I attempted that, it did not help either.

I can get round this by implementing my own BinaryHeap but that seems a duplication of effort and an overkill. It seems that the problem is the global nature of Ord implementation, which makes it of limited use.

This isn't possible in general, unless the item in the heap borrows the vector (e.g. is &T) or the vector is in a static. Generally what you're looking for is a "raw entry" API like hashbrown provides for it's hashmap which allows you to inject the dependency by providing some impl FnMut(&T, &T) -> Ordering rather than using the intrinsic Ord.

There's an open issue about adding a raw entry API to other collections and extending HashMap's to the full dependency injection, but it's a lot of added API surface for an esoteric use case. The original motivation for the HashMap raw entry API (performance by avoiding redundant rehashing) doesn't really apply anymore, and removing it has been discussed due to the complexity.

Another discussed potential extension is to add a comparator object to the BinaryHeap/BTree/etc. which is in charge of doing the comparisons and can hold and consult its own state for that. That could probably work for this use case but also runs into a number of issues when trying to actually use it.

If the goal is to minimize the amount of memory copied when rebalancing the heap, using BinaryHeap<Box<T>> (or Arc, etc) may be sufficient. If you're just sticking a downstream-defined T in, don't worry about it any; they can box their type before handing it to you.

3 Likes

Thanks for that. The 'dependency injection' FnMut(&T, &T) -> Ordering is a powerful technique and I used it myself before for my general binary search implementation in crate indxvec. Indeed, I may have to reimplement BinaryHeap like that too, so there does seem to be not so 'esoteric' need.

I am trying to save copying T items into the heap in the first place, not just the rebalancing swaps.
I am not sure if I follow your last paragraph. You think it can be achieved by using BinaryHeap<&T> or BinaryHeap<Box<T>>, or Arc, etc? If so, which and how? I can implement Ord for, say, &T without complaints? If so, that may be the solution I need.

If you're putting them into Vec<T>, you're moving them onto the heap. BinaryHeap is just a Vec<T> with some ordering rules.

Just because you've run into a need more than once doesn't indicate it's not esoteric. If you've run into it before, you're significantly more likely to run into it again, because you're doing things that run into that need. But doing that thing with your constraints can still be an esoteric use case compared to everyone else not running into it.

To be extremely curt, worrying about the cost of copying structs around is an esoteric pursuit, especially if it can't be addressed just by boxing the large struct and passing that box around instead. Golfing allocations is a somewhat common pastime in the Rust space, but the broad use of "box everything" GC languages is proof enough that this is a specialist concern.

Any of them will work; it's a matter of what ownership pattern you want.

For any T: Ord, it's already the case that &T: Ord, Box<T>: Ord, and Arc<T>: Ord; no further implementation required. If those impls weren't available, you could make a struct Wrapper(Box<T>) and implement traits for that.

4 Likes

You can technically make something that works via shared ownership but it's awkward as all hell to use, so I wouldn't really recommend it.

1 Like

Yes, that is how my data comes but I do not want another copy of it.
The rest sounds promising, thank you.
That does seem to be the route of least resistance.

It might be instead that you just want the BinaryHeap<T>: From<Vec<T>> impl, which takes ownership of the Vec and sorts it into heap order directly. Then you don't have any sort of indirection (and likelihood of ending up with self-borrowing) to worry about.

1 Like

Nice try but unfortunately, no. The reason I am using BinaryHeap in the first place is that I need some operations on it, such as inserting additional items, which will incur all those copying and swapping costs.

Here is some code to be more specific:

    /// Max heap of k smallest items
    fn smallest_k_heap(self, k: usize) -> BinaryHeap<T> 
    where 
        T: Ord + Sized + Copy
    {
        let mut heap = BinaryHeap::from(self[0..k].to_vec());
        for item in &self[k..] {
        let mut root = heap.peek_mut().unwrap();
        if item < &root {
            *root = *item;
        }
    }
    heap
}

The point of this exercise is that it is more efficient than doing the full sort to find the k smallest items. In the worst case: Ord(k) + Ord((n-k)*log(k))

If T is large enough to care about the performance cost of moves (or copies), it shouldn't be Copy.

1 Like

Exactly, that is why I am trying to get away from BinaryHeap<T>

I have now changed the code in the minimalistic way to create local BinaryHeap<&T>, working with references only. To remove problems with ownership lifetime of the result, I clone only the final items. This seems to be a satisfactory solution:

    /// Vec of k smallest items
    fn smallest_k(self, k: usize) -> Vec<T>
    where
        T: Ord + Clone,
    {
        let mut heap = BinaryHeap::from_iter(&self[0..k]);
        for item in &self[k..] {
            let mut root = heap.peek_mut().unwrap();
            if item < *root {
                *root = item;
            }
        }
        heap.iter().map(|&x| x.clone()).collect()
    }

(Written before you completely changed your comment and code)

You can't consume self, which apparently owns the T, and then return references to the now-dropped T. The lifetime 'a coming "out of nowhere" is a big clue / red flag.

Try:

fn smallest_k_heap(&self, k: usize) -> BinaryHeap<&T> 
// aka
fn smallest_k_heap<'a>(&'a self, k: usize) -> BinaryHeap<&'a T> 

(though I'm still guessing given the lack of known types).


If your T is as generic as it seems, "let consumers of the API worry about if they need to box their T for movement performance concern or not" is still probably the canonical answer.

2 Likes

This is now significantly more expensive. Consider T=String: now you're cloning k heap allocated strings to avoid sorting the n-k strings. Since you're taking self by value, you don't even want a T: Clone bound ideally. You probably want something along the lines of

    fn smallest_k(self, k: usize) -> Vec<T>
    where T: Ord,
    {
        assert!(k > 0);
        let mut iter = self.into_iter();
        let mut heap: BinaryHeap<T> = iter.by_ref().take(k).collect();
        for item in iter {
            let mut root = heap.peek_mut().unwrap();
            if item < *root {
                *root = item;
            }
        }
        heap.into()
    }

Yes, this will move each T out of self in turn. Unless you write a specific partial heapify, though, this is as good as you're going to get.

Rust's design is fairly predicated on the idea that moving values around is cheap. Avoiding doing this along with the binary heap allocation is an interesting exercise but extremely unsafe and prone to leaking on unwind.

I got nerd sniped

Very roughly, the solution could look like

fn smallest_k<T: Ord>(mut v: Vec<T>, k: usize) -> Vec<T> {
    assert!(k <= v.len());
    if k == 0 { v.clear(); return v; }
    let n = v.len();
    v[..k].sort_unstable();
    unsafe {
        v.set_len(k);
        for i in 0..n - k {
            let mut v = guard_on_unwind(&mut v, |v| {
                v.spare_capacity_mut()[i + 1..n - k]
                    .slice_assume_init_drop();
            });
            let (head, spare) = v.split_at_spare_mut();
            let next = guard(&mut spare[i], MaybeUninit::assume_init_drop);
            if head[k - 1] < next.assume_init_ref() {
                continue;
            }
            let p = head.partition_point(next.assume_init_ref());
            let next = ScopeGuard::into_inner(next).assume_init();
            v.insert(p, next);
            v.truncate(k); // equiv. v.pop(), but in place, since that matters
        }
    }
    v
}

This is COMPLETELY UNTESTED and gives NO PROMISE OF BEING SOUND. It just exists to illustrate the theoretical way to do this without an extra allocation and the significant amount of unsafe required in order to manage the concurrent drain and modification of the vector.

Bonus: I hit post and immediately spotted a place where an unwind can happen and leak the n-k elements. I wrote this on my phone and have zero confidence in its correctness.

And I'd not be surprised if this optimizes/performs worse than the simple and safe version above.


If you prefer the cloning version, then return the references directly instead, e.g.

    fn smallest_k(&self, k: usize) -> Vec<&T>
    where T: Ord,
    {
        assert!(k > 0);
        let mut iter = self.iter();
        let mut heap: BinaryHeap<&T> = iter.by_ref().take(k).collect();
        for item in iter {
            let mut root = heap.peek_mut().unwrap();
            if item < *root {
                *root = item;
            }
        }
        heap.into()
    }

and the caller can do the .map(Clone::clone).

2 Likes