What are the pros and cons of vec! literals vs. &[T]?

Ever since I started using Rust, whenever I needed a hard-coded list of objects, I just wrote a vec! literal. Example:

fn tag_in(obj: OsmObj, key: &str, values: Vec<&str>) -> bool {
	obj.tags.get(key).map(|v| {
		for w in values.iter() { if *v == *w { return true; } }
		false
	}).unwrap_or(false)
}

if tag_in(some_obj, "highway", vec!["primary", "primary_link"]) { ... }

Now I've become brave enough to use a slice (am I right to call it that?):

fn tag_in(obj: OsmObj, key: &str, values: &[&str]) -> bool {
	// same code
}

if tag_in(some_obj, "highway", &["primary", "primary_link"]) { ... }

Apart from saving some keystrokes, what are the pros and cons of these usages?


Vectors:

  • allocate space for their elements on the heap
  • are resizable
  • are not Copy even if their elements are

Arrays:

  • have an immutable length once created
  • are Copy if their elements are Copy
  • store their elements inline, so they can live on the stack or the heap in their entirety

Other than that they're fairly similar, since they're both linear data structures.
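
A minimal sketch of the differences listed above, using only the standard library. The function names are made up for illustration:

```rust
/// Arrays of `Copy` elements are themselves `Copy`: assigning one
/// duplicates it and the original stays usable.
fn copy_array() -> ([u8; 3], [u8; 3]) {
    let a = [1u8, 2, 3];
    let b = a; // bitwise copy of all three elements
    (a, b) // `a` is still valid here
}

/// A `Vec` is resizable but not `Copy`: duplicating it needs an explicit
/// `clone`, which performs a second heap allocation.
fn grow_vec() -> (Vec<u8>, Vec<u8>) {
    let original = vec![1u8, 2, 3];
    let mut grown = original.clone();
    grown.push(4); // an array's length could not change like this
    (original, grown)
}

fn main() {
    let (a, b) = copy_array();
    assert_eq!(a, b);

    let (original, grown) = grow_vec();
    assert_eq!(original.len(), 3);
    assert_eq!(grown.len(), 4);
}
```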


Are allocations a serious performance penalty?

I guess hard-coded arrays carry less of a penalty than hard-coded Vecs, don't they?

It depends. In short, it only really matters if you do it a billion times per second.


If you only need a read-only, baked-in list of items, then there's no benefit in using a Vec. If you aren't planning to change the number of elements, then the Vec is not any more capable than a simple slice, so there's no point in dynamically allocating memory.

Dynamic allocation has not only a performance implication, but also a consequence regarding usability: it's not const, so you can't (easily) put a non-empty Vec in a static or const item. Thus, it's better to default to a slice or an array if you don't need to add or remove elements later.
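
To make the const/static point concrete, here is a small sketch. The item names and the "minor" values are invented for the example:

```rust
// A hard-coded list as a `const` slice: no allocation, no setup code.
const MAJOR_ROADS: &[&str] = &["primary", "primary_link"];

// The same works for a `static` item.
static MINOR_ROADS: &[&str] = &["residential", "service"];

// A non-empty `Vec` cannot be built this way, because `vec!` allocates
// at run time:
// const BAD: Vec<&str> = vec!["primary", "primary_link"]; // does not compile

fn is_major(value: &str) -> bool {
    MAJOR_ROADS.contains(&value)
}

fn is_minor(value: &str) -> bool {
    MINOR_ROADS.contains(&value)
}

fn main() {
    assert!(is_major("primary_link"));
    assert!(is_minor("service"));
    assert!(!is_major("service"));
}
```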


They're moderately expensive, so a few here or there generally speaking won't be too troublesome, but allocating in a hot code path can really trash your performance.

Not necessarily. Vecs allocate on the heap, but once the allocation is done, it doesn't really matter¹ for performance whether the data lives on the stack or the heap. Also keep in mind that copying or moving an array on the stack copies all of its elements. In contrast, moving a Vec copies only its pointer, length, and capacity; the heap buffer stays where it is. So depending on the use case (e.g. moving lots of stack-allocated arrays around), keeping your data on the stack can hurt performance rather than help it.

¹ We're blissfully skipping over the harsh realities of CPU caching for now :)
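
A rough way to see the difference in move cost, assuming the default allocator. The 4096-byte size is arbitrary, and whether the optimizer elides an array copy in a given program is a separate question:

```rust
use std::mem::size_of;

/// Moving a `Vec` transfers only its (pointer, capacity, length) header;
/// the heap buffer stays at the same address.
fn buffer_stays_put() -> bool {
    let v = vec![0u8; 4096];
    let before = v.as_ptr();
    let moved = v; // moves three machine words, not 4096 bytes
    before == moved.as_ptr()
}

fn main() {
    // The `Vec` header is three words no matter how many elements it holds.
    assert_eq!(size_of::<Vec<u8>>(), 3 * size_of::<usize>());
    // An array is exactly as large as its elements, so a move copies them all.
    assert_eq!(size_of::<[u8; 4096]>(), 4096);
    assert!(buffer_stays_put());
}
```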


It is better for a function to accept a slice reference. This way you can pass both slices and Vecs, because you can get a slice reference out of a Vec. This will even happen implicitly as needed thanks to Deref coercion.

fn tag_in(key: &str, values: &[&str]) -> bool {
    todo!()
}

fn main() {
    tag_in("highway", &["primary", "primary_link"]);
    tag_in("highway", &vec!["primary", "primary_link"]);
}
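
For a version that actually runs, here is a sketch with a made-up stand-in for OsmObj (the real type isn't shown in this thread, so the HashMap of tags and the road helper are assumptions). It also borrows the object instead of taking it by value:

```rust
use std::collections::HashMap;

// Hypothetical stand-in for the `OsmObj` type from the question.
struct OsmObj {
    tags: HashMap<String, String>,
}

/// Returns true if `obj` has the tag `key` and its value is one of `values`.
fn tag_in(obj: &OsmObj, key: &str, values: &[&str]) -> bool {
    obj.tags
        .get(key)
        .map_or(false, |v| values.contains(&v.as_str()))
}

// Hypothetical helper that builds an object with a single `highway` tag.
fn road(kind: &str) -> OsmObj {
    let mut tags = HashMap::new();
    tags.insert("highway".to_string(), kind.to_string());
    OsmObj { tags }
}

fn main() {
    let obj = road("primary");
    // An array reference and a `Vec` reference both coerce to `&[&str]`.
    assert!(tag_in(&obj, "highway", &["primary", "primary_link"]));
    assert!(tag_in(&obj, "highway", &vec!["primary", "primary_link"]));
}
```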

Thanks, this is a very useful trick and actually solves an issue that I encountered today:

[ // outer array
    ("highway", &["primary", "primary_link"]),
    ("highway", &["primary", "primary_link", "secondary", "secondary_link"])
].iter().map(|&(key, values)| tag_in(key, values))

won't work, because all members of the outer array must have the same type, and &[&str; 2] and &[&str; 4] are different types. With &vec! it will compile, and the signature is still the same:

[ // outer array
    ("highway", &vec!["primary", "primary_link"]),
    ("highway", &vec!["primary", "primary_link", "secondary", "secondary_link"])
].iter().map(|&(key, values)| tag_in(key, values))

A syntactically nicer and non-allocating version would be

<[(_, &[_])]>::iter(&[
    ("highway", &["primary", "primary_link"]),
    ("highway", &["primary", "primary_link", "secondary", "secondary_link"])
])
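
Another way to get the same effect, as far as I can tell, is to put the slice type in an annotation so each inner array reference is coerced to a slice. A small sketch (count_values is just a made-up consumer):

```rust
/// Counts all values across the rules.
fn count_values(rules: &[(&str, &[&str])]) -> usize {
    rules.iter().map(|(_, values)| values.len()).sum()
}

fn rules_len() -> usize {
    // The annotation gives every inner `&[&str; N]` the expected type
    // `&[&str]`, so arrays of different lengths can sit side by side.
    let rules: [(&str, &[&str]); 2] = [
        ("highway", &["primary", "primary_link"]),
        ("highway", &["primary", "primary_link", "secondary", "secondary_link"]),
    ];
    count_values(&rules)
}

fn main() {
    assert_eq!(rules_len(), 6);
}
```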

Note that the temporary value behind a slice will usually be allocated on the stack, so if you have a lot of objects with a large direct memory footprint, using vec! might avoid exhausting the stack.


Box<[T; N]> and Box<[T]> also exist (though you'll be relying on the optimizer to do emplacement if the stack use actually matters).
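
One way to end up with a Box<[T]> without relying on the optimizer is to go through a Vec first. A sketch (heap_buffer is a made-up name):

```rust
use std::mem::size_of;

/// Builds a fixed-length, heap-allocated buffer. Going through `Vec`
/// means no large array is ever constructed on the stack.
fn heap_buffer(len: usize) -> Box<[u8]> {
    vec![0u8; len].into_boxed_slice()
}

fn main() {
    let buf = heap_buffer(1 << 20);
    assert_eq!(buf.len(), 1 << 20);
    // A boxed slice is a (pointer, length) pair: one word less than a `Vec`.
    assert_eq!(size_of::<Box<[u8]>>(), 2 * size_of::<usize>());
}
```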

No, any individual allocation is pretty cheap, especially if you replace the default system allocator with a better one using #[global_allocator] in the binary.

What really matters is pervasive or frequent allocations. Reallocating a buffer every time you read another 3 bytes from a file is a serious performance penalty. Allocating a buffer once to reuse across lots of reads is negligible cost. Allocating a vec![r, g, b] for every pixel in an image is a serious performance penalty. Allocating one big Vec<[u8; 3]> for a whole image is entirely reasonable. Etc.
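
The pixel example might look like this in code. Both functions are illustrative only, and the second one is there to show the pattern to avoid:

```rust
/// One allocation for the whole image: each pixel is an inline `[u8; 3]`.
fn solid_image(width: usize, height: usize, rgb: [u8; 3]) -> Vec<[u8; 3]> {
    vec![rgb; width * height]
}

/// The pattern warned about above: a separate heap allocation per pixel.
fn solid_image_slow(width: usize, height: usize, rgb: [u8; 3]) -> Vec<Vec<u8>> {
    (0..width * height).map(|_| rgb.to_vec()).collect()
}

fn main() {
    let fast = solid_image(4, 2, [255, 0, 0]);
    let slow = solid_image_slow(4, 2, [255, 0, 0]);
    assert_eq!(fast.len(), 8);
    assert_eq!(slow.len(), 8); // same data, but 8 extra allocations
}
```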


The usual trick for coercing an array reference to a slice reference is to index it with a full range, e.g.:

[
    ("highway", &["primary", "primary_link"][..]),
    ("highway", &["primary", "primary_link", "secondary", "secondary_link"][..])
].iter().map(|&(key, values)| tag_in(key, values))

It's definitely not very discoverable to need to write &___[..], but it's functional and fairly obvious what it does, even if not why it's necessary.

You could also wrap it into a slice! macro if that's preferable.

Even better, there's an as_slice() method nowadays.
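
Here is a sketch of the three options side by side. The slice! macro and the longest helper are made up for the example, not something from std:

```rust
// Hypothetical helper macro: builds an array and immediately slices it,
// so the result has type `&[T]`.
macro_rules! slice {
    ($($item:expr),* $(,)?) => { &[$($item),*][..] };
}

/// Length of the longest value list among the rules.
fn longest(rules: &[(&str, &[&str])]) -> usize {
    rules.iter().map(|(_, values)| values.len()).max().unwrap_or(0)
}

fn with_index_and_as_slice() -> usize {
    let two = ["primary", "primary_link"];
    let four = ["primary", "primary_link", "secondary", "secondary_link"];
    let rules = [
        ("highway", &two[..]),        // full-range indexing
        ("highway", four.as_slice()), // same result via `as_slice()`
    ];
    longest(&rules)
}

fn with_macro() -> usize {
    longest(&[
        ("highway", slice!["primary", "primary_link"]),
        ("highway", slice!["primary", "primary_link", "secondary", "secondary_link"]),
    ])
}

fn main() {
    assert_eq!(with_index_and_as_slice(), 4);
    assert_eq!(with_macro(), 4);
}
```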


Vec is good when you need owned, allocated storage. It helps, for example, when you have a sequence of elements that you want to convert to bytes or map in some other way, because the result needs an owner. If you only hold a &[T] that borrows from a temporary, the compiler will complain that the borrowed value does not live long enough, so you have to bind it to a variable first. If you have hard-coded values, there is no need for a Vec, but there is not much harm either. Vec is pretty efficient, and you can allocate the whole buffer up front if the number of elements is known in advance.
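
As one possible reading of the "convert to bytes" case, here is a sketch that sizes the output Vec once with Vec::with_capacity. The u32 element type and little-endian encoding are arbitrary choices for the example:

```rust
/// Converts each value to little-endian bytes. The output size is known
/// up front, so the buffer is allocated once and never regrown.
fn to_le_bytes(values: &[u32]) -> Vec<u8> {
    let mut out = Vec::with_capacity(values.len() * 4);
    for v in values {
        out.extend_from_slice(&v.to_le_bytes());
    }
    out
}

fn main() {
    // The input can be a hard-coded slice; only the output needs a `Vec`.
    let bytes = to_le_bytes(&[1, 0x0102_0304]);
    assert_eq!(bytes, vec![1, 0, 0, 0, 4, 3, 2, 1]);
}
```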

