Generic Types with Lifetimes

This is a follow-up to an older post about correct usage of a specific API in the aerospike crate. It was suggested by @vitalyd that the API could be improved using generic type parameters. I am the maintainer of the aerospike crate and I’ve been trying to do exactly that.

The existing function I would like to make generic is this:

fn put(&self, policy: &WritePolicy, key: &Key, bins: &[&Bin]) -> Result<()>

I would like to turn this into the following, as suggested:

fn put<I: IntoIterator<Item=A>, A: AsRef<Bin>>(&self, policy: &WritePolicy, key: &Key, bins: I) -> Result<()>

I’ve pulled out the relevant pieces of the code into this gist (rev 1). The full source code is available in the aerospike-client-rust repo.

I’ve taken a first stab at making the required changes in rev 2 of the gist.

Now I’m stuck trying to define the generic types for a WriteCommand struct, which needs to hold the bins iterator (the result of calling bins.into_iter()):

  pub struct WriteCommand<'a, I: Iterator<Item=A>, A: AsRef<Bin<'a>>> {
    bins: I,
}

I am getting the following error:

  --> bins.rs:20:25
   |
20 | pub struct WriteCommand<'a, I: Iterator<Item=A>, A: AsRef<Bin<'a>>> {
   |                         ^^ unused type parameter
   |
   = help: consider removing `'a` or using a marker such as `std::marker::PhantomData`

I don’t understand this error: I am using the 'a lifetime, after all. And removing the 'a type parameter leads to the expected error that 'a is now undeclared. How do I resolve this?

In general, you should practically never have trait bounds on type parameters of data structures. Only type parameters of functions or impl blocks should have trait bounds. Trait bounds encode constraints on behavior, and behavior is defined by functions and impl blocks.

The code compiles without headaches if you simply delete the trait bounds on WriteCommand. https://play.rust-lang.org/?gist=0a90217ac76746b38d10c73215f6940f

I believe the reason the code couldn’t compile as written was that just from I::Item: AsRef<Bin<'a>> there is not enough information to deduce variance of WriteCommand<'a, I, A> with respect to the lifetime 'a i.e. is it legal to cast a WriteCommand with a long lifetime to a WriteCommand with a shorter lifetime or vice versa. That nomicon page explains how this can play out.

3 Likes

+1. You’ll typically want to specify a bound for a generic type in the struct definition if you actually need to use something from that constraint in the definition - classic example is declaring a field whose type is an associated type of the constrained generic parameter.

The downside to specifying constraints in impl blocks is compiler messages can get fuzzier because the methods/fn’s defined in the impl block will appear to not exist if your calling code’s types don’t match the impl block constraints.

If you did want to keep the constraints in the struct definition, you could add a PhantomData field that uses the lifetime; in this particular example, it could probably be PhantomData<Bin<'a>>.

In general, you should practically never have trait bounds on type parameters of data structures.

I guess this kind of makes sense to me now. Though I have to say that defining the WriteCommand struct with a generic, unbounded type bins member still feels very wrong. After all, there are very tight constraints on the bins type, so shouldn’t those be declared on the struct definition and not just the impl block? It’s not relevant in this particular case, but wouldn’t this also potentially cause problems with the API documentation, where the docs for the struct do not specify the expected member type?

In the end, though, I had to give up on the Iterator-based solution for a different reason. In the real code, the execute function needs to iterate over the bins twice: Once to determine the overall size of the bins and to allocate a sufficiently-sized buffer; then once again to actually serialize the bins into the buffer. I don’t think that’s possible with an Iterator. So instead I’m back to using a Bin slice and am using just AsRef to allow both owned and referenced Bins:

fn put<A: AsRef<Bin>>(&self, policy: &WritePolicy, key: &Key, bins: &[A]) -> Result<()>

(Another suggestion from the original post.)

You can switch from Iterator to IntoIterator and iterate twice:

impl<'a, I: IntoIterator<Item = A> + Copy, A: AsRef<Bin<'a>>> WriteCommand<I> {
    pub fn new(bins: I) -> Self {
        WriteCommand { bins: bins }
    }

    pub fn execute(&mut self) -> Result<(), String> {
        for bin in self.bins.into_iter() {
            println!("{:?}", bin.as_ref());
        }
        for bin in self.bins.into_iter() {
            println!("{:?}", bin.as_ref());
        }

        Ok(())
    }
}

I added the Copy constraint to avoid attempts to move out of borrowed content. This essentially means that I will be a reference.

1 Like

Thanks for the suggestion! That works.

But adding the Copy constraint limits the usefulness of being generic over the Iterator trait quite significantly. I.e. &Vec<_> and &[_] implement Copy but many (most?) other implementations of IntoIterator don't implement Copy. For example:

let bins = HashMap::new();
bins.insert("a", 1);
bins.insert("b", 2);
bins.insert("c", 3);
let bins = bins.iter().map(|(k, v)| as_bin!(k, v));

client.put(&wpolicy, &key, &bins).unwrap();

This causes a compile time error because std::iter::Map does not implement Copy:

error[E0277]: the trait bound `std::iter::Map<std::collections::hash_map::Iter<'_, &str, {integer}>, [closure@tests/src/kv.rs:42:32: 42:54]>: std::marker::Copy` is not satisfied
  --> tests/src/kv.rs:44:12
   |
44 |     client.put(&wpolicy, &key, bins).unwrap();
   |            ^^^ the trait `std::marker::Copy` is not implemented for `std::iter::Map<std::collections::hash_map::Iter<'_, &str, {integer}>, [closure@tests/src/kv.rs:42:32: 42:54]>`

Is there some other way to avoid the move out of borrowed content without using Copy?

So the IntoIterator is needed if you want to iterate multiple times (which you do, based on what you said upthread). IntoIterator::into_iter() takes self, which means it consumes itself. In order to allow getting a fresh Iterator from into_iter(), we need to make sure that the IntoIterator isn’t actually consumed in the process. This is what I was going for via the added Copy bound. You’re right, however, that this will exclude certain argument types.

You could instead require Clone and iterate over the cloned value, and that should allow for more types. I don’t know if it’ll handle all the cases that you’d like to enable, though. But if a slice works and satisfies the vast majority of expected usecases, then I’d just go for that (as you’ve already done).

I’d say that constraining the constructor method is just as good. It’s like all the other invariants of the private fields that you’re maintaining: not written in the struct definition, but always holds.

Regarding just this portion, is this defining a constraint to some set of types I, all of which can support a method into_iter and all of which contain some type from the set of types A, which are some kind of reference to a Bin?

It sort of makes sense up to AsRef, although the docs on that confused me upon reading the original thread. It seems what you’re trying to achieve is making the method generic upon the container, either an array slice or a vector in particular. And you’re trying to make the contained type generic over different types of references. Does this mean, e.g., Rc vs &, or something else?

Do you mind briefly explaining where the lifetimes and AsRef come into play?

Yes, that works and it does enable the use of a map iterator as per my previous example. For now though, I have decided to stick with the simpler slice-based API, which -- as you said -- works and covers most expected use-cases. Still, it was instructive to figure out how to make the iterator-based API work and it has given me some more ideas for other improvements I can do to make my API more flexible for it's users. Thanks!

Sure, AsRef just gives the caller of the API the flexibility to pass in the Bin values as references or as owned values, e.g.

let bin = as_bin!("foo", "bar");
let bins = [&bin];
client.put(..., &bins);

vs.

let bin = as_bin!("foo", "bar");
let bins = [bin];
client.put(..., &bins);

The extra lifetime 'a comes into play because the Bin struct itself holds a reference to some value. So that reference needs to have a specified lifetime.

My original question was why the type definition <'a, I: Iterator<Item=A>, A: AsRef<Bin<'a>>> was resulting in an "unused type parameter" error, referring to the lifetime 'a. I'm still not entirely sure about the answer to that question...

Hope this helps?

1 Like