How can I `impl` trait on a type param?

How can I impl std::cmd::Ord on a type parameter?

I have this struct,

#[derive(Debug)]
pub struct OrderBook<K> {
    pub bids: BTreeMap<K, f32>,
    pub asks: BTreeMap<K, f32>,
}

And K needs to impl std::cmd::Ord. I want to specify the size of K at compile time for optimization.

I think I found the answer myself.

#[derive(Debug)]
pub struct OrderBook<K: std::cmp::Ord> {
    pub bids: BTreeMap<K, f32>,
    pub asks: BTreeMap<K, f32>,
}

Putting a K: Ord bound on the struct itself has no performance benefits. It only has some disadvantages in that you'll have to write K: Ord a bit more often when working with a generic OrderBook<K>, even for methods or trait implementations that don't “really” need it.

I donʼt really understand what youʼre talking about w.r.t. “the size of K”.

5 Likes

I assume using K of a type with a known size instead of String means more allocations on the stack rather than the heap.

Is that the same reason why the definition of struct BTreeMap<K, V> doesn't add any bounds on K? And why, for example, Pin<P> doesn't have a Deref bound for P?

Moreover, does that mean it's generally a bad idea to add bounds for structs or enums and only use them on methods and functions?


Do these bounds (on the struct) play a role when using dyn?

Yes.

Not necessarily particularly bad, but still.. it's unnecessary and it can have some (usually small) disadvantages while on the other end having typically no advantages at all. One bigger disadvantage would be fn new methods that are supposed to be const fn, but const fns do (AFAIK still) not really support traits bounds.

The bounds you do need on a type are usually only of two kinds

  • bounds you need to define the fields, e. g. if they're using some associated type from a trait implementation of a type argument
  • bounds you need to implement Drop, e. g. you might want to call certain methods in Drop. That's e. g. why the (unstable) allocator arguments on types like Box have a bound on the struct definition (of Box) itself.

I'm not aware of any particular effect regarding trait objects, but I'm also not sure what connection of struct definition and trait objects you have in mind to begin with.


As another reason against putting trait bounds on structs: There's precedent in Haskell, the language that invented the system that Rusts trait system was based on. The ability to put (the equivalent of) trait bounds in (the equivalent of) struct definitions was even removed from the language (i. e. the feature is deprecated and it's disabled by default) because it was considered a misfeature. Actually, the trait bound was not even really put on the type itself but instead the constructor; the reasons why it's a misfeature are still the same as why you dont want the bounds im Rust when you don't need them. It has no advantages, only disadvantages. Note that in Haskell the two legitimate use-cases in Rust don't apply:

  • associated types can always be named, even if the type doesn't implement the trait. If there's no trait implementation, a type like (the equivalent of) <T as Trait>::Assoc simply doesn't reduce to any known type and is only constructable as a bottom value (i. e. a value that throws an exception when evaluated; note that Haskell uses lazy evaluation, so unevaluated "thunks" are possible for any type)
  • Haskell doesn't have destructors

so that trait bounds on a type are literally never useful at all in Haskell

4 Likes

I don't follow.

First, BTreeMap already stores both keys and values on the heap. There's no way you can force the tree to live on the stack, it just isn't implemented in such a way.

Second, if you mean that you can avoid an additional layer of indirection by using something like a SmallString which stores the buffer inline (as opposed to a separate allocation), then that is true. But what does this have to do with trait bounds?

1 Like

Yes, it doesn't have anything to do with a trait bound, but not sure if that's what was meant.

I think what you maybe meant was a type with "known (overall) memory usage". But even then, some types might store their data on the heap (e.g. by using Box). For example:

struct A {
    n: Box<[u8; 4096]>,
}

This type A has known size (i.e. it fulfills the Sized trait) and also known overall memory usage. It will still store 4096 bytes on the heap.

Note that String also has a known size (you can use it for fields in a struct, for example), opposed to str, which does not fulfill the Sized trait. But &str does have a known size and filfills the Sized trait (you can also use it for a field in a struct).

String has a known size, as the additional heap allocations don't change the "size" of the String itself (which is basically a pointer with capacity and length information, which just consume a few bytes (24 bytes on the platform used by Rust's playground), independent of the length of the actual string that's stored).

Consider the following example:

fn main() {
    let mut s = String::from("Hello");
    for _ in 0..30 {
        s.push_str(", hello");
    }
    s.push_str(" World!");
    println!("String has size: {}", std::mem::size_of::<String>());
    println!("s has size: {}", std::mem::size_of_val(&s));
    println!("&s has size: {}", std::mem::size_of_val(&&s));
    println!("{}", s);
}

(Playground)

Output:

String has size: 24
s has size: 24
&s has size: 8
Hello, hello, hello, hello, hello, hello, hello, hello, hello, hello, hello, hello, hello, hello, hello, hello, hello, hello, hello, hello, hello, hello, hello, hello, hello, hello, hello, hello, hello, hello, hello World!

The "Hello, hello, … World!" string is longer than 24 characters, yet the String (s) has a "size" of 24 only. And that size won't change, i.e. it is known at compile time and String is Sized.

(Edit: Renamed S to A in first example to avoid confusion with second example, and extended second example to show size of &s in addition to size of s.)

1 Like

I see. Thanks for the detailed help. Is there such a thing as a heapless ordered list?

Maybe something like heapless::BinaryHeap would work for you? I'm sure there are other "heapless" crates as well.

1 Like

The structs in this crate appear to follow insertion order, I'm trying to order by keys, storing key-value pairs.

They are definitely ordered. Otherwise there would be no need for the Ord trait bound. But I didn't catch that it stores keys only, not values too. Sorry about that. I've never actually used BinaryHeap, only heapless::Vec.

It doesn't look like the heapless crate has what you need. I'm sure there are other crates, though.

1 Like

Looks like heapless::FnvIndexMap does what I need. But I can't tell if it's actually faster than BTreeMap. I'm gonna post another thread for this question. Thanks much for the help.

I created this performance test and it seems to show that heapless::FnvIndexMap can't perform updates as fast as std::collections::BTreeMap.

running 2 tests
test btree_map ... bench:      24,106 ns/iter (+/- 949)
test index_map ... bench:      93,411 ns/iter (+/- 2,936)

test result: ok. 0 passed; 0 failed; 0 ignored; 2 measured

Would be curious to see what others think, whether this test is setup right, and if we can truly conclude that BTreeMap is faster than the stack-allocated alternative.

This benchmark is certainly flawed. To start with, you are generating 2 pseudorandom numbers in the hot loop for FnvIndexMap, while only 1 PRNG call is involved in the equivalent loop for BTreeMap. Since generating pseudorandom numbers takes a nontrivial amount of computation, it can be a source of significant bias.

You are also including the instantiation of the PRNG itself and the derivation of the key sequence in the loop, and the key/value types of the two benchmarks differ. This is also weird.

You are also violating the following requirement stated in the documentation of IndexMap:

Note that the capacity of the IndexMap must be a power of 2.

You are using a capacity of 6, which is not a power of 2. This presumably either makes the IndexMap misbehave and return incorrect results, or it causes spurious collisions, slowing insertion down.

Furthermore, your key space is exceptionally small. If you have only 4 keys, then you might as well use match directly or index into an array. With anything remotely approaching realistic, such as 16 different keys, they come out on par (or IndexMap starts winning out by a small margin) on my computer.

Here is the updated benchmark with the following results:

running 2 tests
test btree_map ... bench:     141,199 ns/iter (+/- 12,270)
test index_map ... bench:     121,889 ns/iter (+/- 7,392)

test result: ok. 0 passed; 0 failed; 0 ignored; 2 measured
1 Like

Ah! You are right. Initially I had these setup the same, but the tests kept crashing or just not running, and in my attempt to figure out why, I tried changing the inputs and I later forgot to put it back. Sorry about the sloppy mistake, and thanks for helping me out with the code review!

Thanks for the updated version, it looks like you've also fixed the issue with PRNG? I'll admit I don't completely understand the mistake yet. Will look closer at a later time.

Thanks again.

The issue with the PRNG is that it can be slow, and therefore if you include a different number of calls in the two benchmarks, then the one with the more PRNG calls will appear slower. So you end up benchmarking the wrong thing. I.e, what you measure is not "insertion into BTreeMap vs insertion into IndexMap"; instead, it is "insertion into BTreeMap plus a PRNG call vs insertion into IndexMap plus two PRNG calls".