# Correct way of passing nested vectors as arguments

Assume that I have a pre tokenized list of sentences in a `Vec<Vec<String>>` or `Vec<Vec<&str>>`; Upon which I need to compute a bunch of different metrics.

``````fn main() {
let sent_words = vec![
vec!["This", "is", "sentence1."],
vec!["This", "is", "not", "sentence3"],
];
let m1: f32 = metric1(&sent_words);
let m2: f32 = metric2(&sent_words);
}
``````

Some constraints/invariants.

1. The metrics are all read only. They never modify the vector, inner vector or the string.
2. I need a fast way of computing the number of words in each sentence, so no flattening/streaming.

My question is considering these constraints what is the right way to define the metrics?

``````fn metric1<T>(t: &[Vec<T>]) -> f32
where
T: AsRef<str>
{
return t.len() as f32 / 2.0;
}

``````

The above definition works, but I don't understand what is `&[Vec<T>]` is even supposed to mean? Have the inner Vectors moved inside the function?

Can I somehow take `&[&[T]]` as an argument? Because it does not seem to work.
This Stackoverflow answer suggests using AsRef and it works.

``````fn metric2<T, U>(t: &[U]) -> f32
where
T: AsRef<str>,
U: AsRef<[T]>,
{
return t.len() as f32 / 2.0;
}
``````

How does my metric1 differ from metric2?
When would I prefer using either?
Is there something better than both ?

To answer both of these questions, letâ€™s talk about generic function and trait bounds.

Starting with the function signature

``````fn metric1<T>(t: &[Vec<T>]) -> f32
where
T: AsRef<str>
``````

A generic function like

``````fn foo<T>(t: &T) {}
``````

is like a scheme (or C++ folks would say template) for a whole lot of concrete functions with concrete function signatures: Thereâ€™s `foo<u32>` and `foo<String>` and `foo<Vec<Vec<String>>>`, etcâ€¦ i.e. weâ€™re defining an infinite number of functions, one for each possible type that could be substituted for `T`. (By the way, in Rust syntax, when calling these this explicitly, one would have to write `foo::<u32>` instead of `foo<u32>`.)

Substituting `T` a concrete type when theyâ€™re called is the whole idea behind generic functions; the generic function itself is a template with some wholes to be filled out in order to create a real actual concrete function as a result.

The resulting function signatures of these functions then is easy to determine: substitute `T` accordingly to find out that `foo<u32>` has the signature `fn(&u32)`, and `foo<String>` the signature `fn(&String)`; in fact, the compiler can even confirm this

``````fn foo<T>(t: &T) {}

fn demonstration() {
let _: fn(&u32) = foo::<u32>;
let _: fn(&String) = foo::<String>;
// compiles successfully, so the types are correctly annotated
}
``````

Now enter trait bounds. A generic function with trait bounds no longer allows you to substitute any type you want when calling it; instead it constrains the possible types to those that have the properties required in the `where` clause. Letâ€™s just do some compilation test again

``````fn metric1<T>(t: &[Vec<T>]) -> f32
where
T: AsRef<str>
{
todo!()
}

fn demonstration() {
let _ = metric1::<String>;
let _ = metric1::<&str>;
let _ = metric1::<Box<str>>;
let _ = metric1::<&&&str>;
// ^^^^^^^^^^^^^^ these all work
let _ = metric1::<u32>; // <- this doesnâ€™t compile
}
``````

The last line will produce a compilation error

``````error[E0277]: the trait bound `u32: AsRef<str>` is not satisfied
--> src/lib.rs:13:13
|
13 |     let _ = metric1::<u32>;
|             ^^^^^^^^^^^^^^ the trait `AsRef<str>` is not implemented for `u32`
|
note: required by a bound in `metric1`
--> src/lib.rs:3:8
|
1  | fn metric1<T>(t: &[Vec<T>]) -> f32
|    ------- required by a bound in this
2  | where
3  |     T: AsRef<str>
|        ^^^^^^^^^^ required by this bound in `metric1`
``````

which is due to the fact that for `T` being `u32`, the constraint `T: AsRef<str>` from the where clause is not met. The error message explains: `the trait bound `u32: AsRef<str>` is not satisfied`, and the `u32: AsRef<str>` comes from substituting `u32` for `T` in `T: AsRef<str>`.

Notably, the other cases all compiler successfully, so apparently the trait bounds `String: AsRef<str>`, `&str: AsRef<str>`, `Box<str>: AsRef<str>`, and `&&&str: AsRef<str>` are all satisfied. The signatures are also as expected, just substitute `T`. This code would compile if it wasnâ€™t for some more advanced details regarding lifetimes that I wonâ€™t go into here:

``````fn demonstration() {
let _: fn(&[Vec<String>]) -> f32 = metric1::<String>;
let _: fn(&[Vec<&str>]) -> f32 = metric1::<&str>;
let _: fn(&[Vec<Box<str>>]) -> f32 = metric1::<Box<str>>;
let _: fn(&[Vec<&&&str>]) -> f32 = metric1::<&&&str>;
}
``````

So what does `&[Vec<T>]` mean? It means `&[Vec<String>]` or `&[Vec<&str>]` or any other type in place of `T` as long as the trait bound from the `where` clause it satisfied.

What does `&[Vec<String>]` mean though? Youâ€™re asking â€śHave the inner Vectors moved inside the function?â€ť; the answer is, no. The whole type starts with `&`, so itâ€™s a reference type and nothing is moved. It references a slice of `Vec<String>`s. This reference `&[Vec<String>]` can, for example, then be created from (implicitly coercing) a `&Vec<Vec<String>>` when you call `metric1`.

How does `metric2` differ from `metric1`? Well, looking at its signature

``````fn metric2<T, U>(t: &[U]) -> f32
where
T: AsRef<str>,
U: AsRef<[T]>,
``````

itâ€™s more generic. Two degrees of freedom instead of one, of you will; or, actuallyâ€¦ the way in which it actually is â€śmore genericâ€ť is that the `metric1` signature is a special case of `metric2`. You can let the compiler check this by simply calling `metric2` from `metric1`â€™s function body,

``````fn metric2<T, U>(t: &[U]) -> f32
where
T: AsRef<str>,
U: AsRef<[T]>,
{
todo!()
}

fn metric1<T>(t: &[Vec<T>]) -> f32
where
T: AsRef<str>
{
metric2(t)
}
``````

or you could figure this out yourself by comparing and matching the signatures:

• in order to make
``````(t: &[U]) -> f32
``````
and
``````(t: &[Vec<T>]) -> f32
``````
match, we need `U = Vec<T>`
• the parameters â€ś`T`â€ť of both function could just be made to match, even though in general, the name of such parameters doesnâ€™t matter; but Iâ€™ll spare us the intermediate step of renaming one of them and figuring out theyâ€™re going to have to be the same anyways.
• the `metric2` function has an additional constraint that needs to be eliminated: `U: AsRef<T>`. Weâ€™ve determined we want `U` to be `Vec<T>`, so next up weâ€™ll need to check whether or not `Vec<T>: AsRef<[T]>` is a thing. Turns out it is a thing. Hereâ€™s the implementation. And hereâ€™s how the compiler can verify for us that itâ€™s a thing.

Now, with the above exploration done, answering the second question â€śHow does my metric1 differ from metric2?â€ť

Weâ€™ve seen that metric2 is more generic than metric1. You can use metric2 as a replacement wherever you can use metric1. From a callerâ€™s perspective (assuming `metric1` and `metric2` implement the same metric), if you have `metric2`, then also having `metric1` is superfluous / useless.

Itâ€™s also actually more generic, since `metric2` supports function signatures that `metric1` canâ€™t support. The example you gave is `&[&[T]`, so something like `&[&[String]]` would work for `metric2`, and it does not work for `metric1` (you can ask the compiler, why; the answer doesnâ€™t even involve traits, the type of the wrong shape, so it wonâ€™t match no matter what we substitute in for `T`; in fact, the error message highlights that the choice of what goes in for `T` doesnâ€™t matter by simply printing an â€ś`_`â€ť in its place, essentially saying â€śthese types arenâ€™t the same no matter what goes into the â€ś_â€ť placeâ€ť).

Weâ€™ve only focused on the function signature so far yet, and how to call instantiate and perhaps call the function; but thereâ€™s also the implementationâ€¦ the function body!

Now thereâ€™s an opposite effect here: A more generic function signature makes calling the function easier, but it makes implementing it harder. While you can always call `metric2` in place of `metric1`, you cannot use every function body thatâ€™s valid for `metric1` and use it in `metric2` instead; in fact itâ€™s the opposite: every implementation of `metric2` is also a valid `metric1` implementation.

When youâ€™re working with the items of type `U` in `metric2`, youâ€™ll always need to call `.as_ref()` on them to get a `&[T]`, whereas in `metric1`, you had `Vec<T>`s (behind a reference) and itâ€™s possible to turn a `&Vec<T>` into `&[T]` more implicitly.

``````fn metric1<T>(t: &[Vec<T>]) -> f32
where
T: AsRef<str>
{
let x: &[T] = &t[0]; // this works

return t.len() as f32 / 2.0;
}

fn metric2<T, U>(t: &[U]) -> f32
where
T: AsRef<str>,
U: AsRef<[T]>,
{
// let x: &[T] = &t[0]; // this doesnâ€™t work
let x: &[T] = t[0].as_ref(); // but this works

return t.len() as f32 / 2.0;
}
``````

When the additional capabilities that the `metric2` signature gives you on the caller side are useful, then thatâ€™s a good reason for preferring that approach. If implementing the metric with the `metric2`-style signature is problematic (though I donâ€™t see a practical reason why it ever would be) that that could be a strong reason in favor of the `metric1` signature. If the caller benefited from the generality of `metric2`, it would then have to work around the limitation.

Thereâ€™s no real downside to the more generic function. Itâ€™s just a efficient if `metric1` and `metric2` have the same function body, and even if the latter has a few extra `as_ref` calls, those are typically exactly as cheap as the implicit coercions that `Vec<T>` offers. There can be downsides to generic functions in that

• if there are generic parameters that are not part of any argument type, or possibly even worse neither part of an argument type nor of a return type, then the caller might need to explicitly speficy that generic parameter, or else thereâ€™d be a compilation error
• the way that generic functions are implemented (via â€śmonomorphizationâ€ť) can in some cases result in a larger binary size; if that trade-off is undesired, less generic functions can be a performance benefit; this is arguably a rather advanced topic though, often involving things like â€śtrait objectsâ€ť, and sometimes the performance benefit can be achieved without making the function less generic for the caller, by implementing a generic shim around a less generic implementation

â€śBetterâ€ť is subjective. More general is certainly possible. E.g. you could start involving `IntoIterator` trait bounds. At some point it might become useless though; as mentioned above, more generic version are only useful if you have any use[1] for some of the new signatures they support, such as the `&[&[T]]` example you mention: You had some type you wanted to also support and only this motivates trying to make the function more generic accordingly.

1. or, in case the function is part of some public external API: if some (potential) user has any use â†©ď¸Ž

4 Likes

Thanks for the incredibly detailed solution I definitely understood the more generic vs less generic argument. But I am still a bit unclear on the `AsRef`

I was reading this blog On dealing with owning and borrowing in public interfaces

This is what the author says

`Q: AsRef<T>` is a `&T` on steroids in terms of software design. It gives more power to the client as theyâ€™ll be able to pass more types in your functions. However, you now introduce a polymorphic API. The semantics are the same: you plan to perform only read-only computations or accept not to use a client-provided move-in value if you need to own something. Keep in mind an important hidden property here: because you accept values to be moved in while you just have a read-only contract on them, you also accept clients values to be dropped by corollary.

I am not understanding what the author means by the bolded part. It seems like a bad thing. For instance if I try to make a metric even more generic I could do

``````fn metric3<T, U, V>(t: V) -> f32
where
T: AsRef<str>,
U: AsRef<[T]>,
V: AsRef<[U]>
{
return t.as_ref().len() as f32 / 2.0;
}
``````

So when someone calls `metric3` with `Vec<Vec<String>>` what is actually going on ? Am I accepting a move even though it is read only?

EDIT Can I somehow turn `metric3` into something that will accept either a Vec or a slice but still only borrows but does not move?

Yes, they're moving `Vec` into the function. But they may opt to not doing it - `V: AsRef<[U]>` implies `&V: AsRef<[U]>`, so if they need the `Vec` afterwards, they will just borrow it themselves:

``````fn main() {
let v = vec![vec![""]];

// works
metric3(&v);
metric3(&v);

// would fail with second line uncommented
metric3(v);
// metric3(v);
}
``````
2 Likes

Ah, I see, so it's a matter of implicit vs explicit. I feel like in a public API it might be better to make it explicitly clear that we don't want to move because we are only doing read only operations

The user of `metric3` (or anyone else) can coerce a `Vec` to a slice, so anything that accepts a slice accepts a `Vec` in that sense. If you mean, without the user typing anything else, no -- if it accepts a `Vec` in that sense, it's moving.

If you wanted to accept `&[_]` or `&Vec<_>` (to any depth of `&`) but to not accept `Vec`, you could use

``````fn metric3<T, U, V>(t: &V) -> f32
//                 new ^
where
// ...
V: ?Sized + AsRef<[U]>
``````

This gets rid of the "oops, accidentally gave away my `Vec`" hazard, though it also introduces an "I don't care if I give away my `Vec` but now I have to type more to pass `&vec` anyway" possibility.

2 Likes

Well, as for me, the `AsRef` bound is explicit enough, since it explicitly allows the user to use the reference and not move the value.

1 Like

This topic was automatically closed 90 days after the last reply. We invite you to open a new topic if you have further questions or comments.