Parameterizing over data structures (HKT?)

Hi folks.

I have a problem that I'd like to be able to solve "well", but I'm open to the possibility that it isn't yet time. The rough idea is that I would like to abstract over the representation of some data that I have, and write an implementation that only requires the ability to access the data certain ways. I'm running in to pain points with references, and what seems like it might want HKT over lifetimes or something.

Imagine I had a &'a HashMap<K, Vec<(T, Vec<(V,i32)>)>>. This lets me map keys K into a thing that I can iterate over and get pairs &'a (T, Vec<(V,i32)>) or more abstractly pairs (&'a T, X) where X is a thing I can iterate over and get pairs (&'a V, i32) out of. The ability to produce things that iterate over references to Ts and Vs is the only feature I need from the type, and I would like to be able to use different concrete implementations; some use much less memory / are faster for specific settings; e.g. V = (), or "I know there is only one (T, Vec<(V,i32)>) element".

This sound a bit complicated, but it is actually what I need. A simpler version is just: I would like to define a trait with an associated type that says "a reference to me is basically like a &'a Vec<T>, in that it can be iterated over to get &'a T references. I'd then be able to write a method that hands back references to this associated type, and other people would be able to iterate over its T references.

I've read up a bit on IntoIterator which seems great to implement for any &'a Vec<T> replacement, and I'm comfortable doing that. Implementing it for &'a MyStorage<T> or whatever makes lots of sense. What I'm less clear on is how (or whether) I can write the constraint for the associated type that a reference to it should implement IntoIterator.

What I might like to write is

trait AbstractStorage<K, T> {
    type Storage where &'a Storage : IntoIterator<Item = &'a T>;
    fn lookup<'a,'b>(&'a self, key: &'b K) -> &'a Self::Storage;
}

The other option (which seems workable, but painful) is to not use associated types and expose the instance of the Storage type to the methods that would otherwise be generic in implementors of AbstractStorage. It seems a bit horrible, especially since I'll eventually need a second for the &'a Vec<(V,i32)> replacement, but ... I'm also open to other suggestions! I like the references and would rather not have to clone the T and Vs, though (they could be Strings or otherwise have owned memory).

Thanks for any tips. Will trade answers for high-throughput data-parallel frameworks. :smile:

Thinking out loud, could something like this solve the problem?

trait AbstractStorage<K, T> {
    type Storage; // no constraints, because nothing to say yet
    // instead, put the constraint on the method invocation
    fn lookup<'a,'b>(&'a self, key: &'b K) -> &'a Self::Storage 
        where &'a Self::Storage : IntoIterator<Item=&'a T>;
}

It seems a bit weird because we don't really say anything at all about Storage, but maybe we can't actually say anything until 'a gets bound, at which point we pull a weird and complicated constraint out of nowhere.

We do have higher-kinded trait bounds. Does this work? (It compiles in Playpen but I haven't tried your use-case yet.)

trait AbstractStorage<K, T> {
    type Storage: for<'a> IntoIterator<Item = &'a T>;
    // Elide the lifetimes here to see what happens
    fn lookup(&self, key: &K) -> &Self::Storage;
}

I have something that looks like what you wanted, but I didn't need HKT to do it: Rust Playground

I'll try this out, thank you. I'm a bit weirded out by having the 'a in the trait, but I suppose if it is implemented for all 'a maybe it isn't a problem? I guess Rust will find the right trait when I have a &AbstractStorage, based on the lifetime of the ref?

The universal quantification of the lifetime parameter just says "for any conceivable lifetime, this type implements IntoIterator". There are only multiple traits in the same way that Option defines infinite data types.

1 Like

So I'm a bit confused about this actually. As the 'a is part of the trait definition, it seems hard to write a generic method which borrows the storage and uses it, e.g.

fn use_storage<'a, K, T:'a, AS: AbstractStorage<'a, K, T>+'a>(store: AS, keys: &[K]) {
    for key in keys {
        store.lookup(key);
    }
}

In playground this errors out with "store doesn't live long enough", which makes sense because the borrow only exists in the loop, but the lookup method needs a borrow with lifetime 'a.

I did get a bit closer to a solution to the original problem with some code that looks like

pub trait TempTrace<K, T, V> 
    where K: Ord, V: Ord, T: LeastUpperBound {

    /// A type for which &'a VIterator: IntoIterator<Item=(&'a V, i32)>
    type VIterator;    

    /// A type for which &'a TIterator: IntoIterator<Item=(&'a T, &'a VIterator)>
    type TIterator;    

    /// Enumerates times and differences at those times.
    fn trace<'a>(&'a self, key: &K) -> &'a Self::TIterator 
        where T: 'a, Self::TIterator: 'a, &'a Self::TIterator: IntoIterator<Item=(&'a T, &'a Self::VIterator)>,
              V: 'a, Self::VIterator: 'a, &'a Self::VIterator: IntoIterator<Item=(&'a V, i32)>;

    // ...
}

at which point I realized I really have no idea if it is even right for &'a TIterator to produce items containing a &'a VIterator. My attempts to instantiate a thing that that I could return a reference to all either (i) are slices of the wrong type, or (ii) end up having lifetimes in their type definitions, so I can't name them as associated types.

If I just wanted to enumerate items &'a V then I'd probably take VIterator = [V] or something, so that &'a VIterator implements IntoIterator<Item=&'a V> which is great. But that isn't what I need; I have more like a [(V,i32)] that I want to IntoIterator by pairing the &'a V with the i32, no reference. But I don't get to pick the IntoIterator implementation for &'a [(V,i32)]. As far as I can tell I can't easily re-wrap the &'a [(V,i32)] without introducing a lifetime parameter into the wrapper, at which point I can't name it as an associated type anymore, and I can't easily re-wrap the [(V,i32)] because it only ever exists as a slice. Maybe DST magic that I don't understand would work here, but it all seems very complicated. ><

If you haven't already, I highly suggest checking out eclectic - Rust

which IMO is the state of the art in generic collection stuff.

Thanks. Just checked it out. I don't think it addresses the problem here (and the patterns don't give me any particularly good ideas for next steps). But, I do like the intent.

I have this tricky example here that shows why lifetime parameters in the trait have their limitations.

This trait is called Reborrow and is intended to capture the commonality of &T and &mut T: they can be temporarily reborrowed.

The lifetime issue here is the same as I have experienced with Iterator-returning traits that I use.

Playpen Gist for trait Reborrow

What happens is that using a bound like fn next<'a>(&self, &'a G) where G: Reborrow<'a> will be fine, but what we really want is the for<'a> kind of construction.

However, as in the example, the for<'a> construction is too eager! We really want for all 'a where 'f: 'a, and I can't find any way to express that.

As you can see from the error, rustc is upfront with us and tells us that the for<'a> clause requires that even &'static Vec<T> implements Reborrow<'static>, which is impossbile because of the T here not living that long.

What do compiler hackers think about the notion of a for<'a> bound that's limited to the lifetime bounds the type actually passes, or even for<'a> where 'b: 'a? This is alluded to in the previous post.

Is it nonsense? Or could rust have it? From my understanding it would make lifetime-parameterised traits much more powerful.

cc @arielb1 @eddyb

I'm currently using a 'static constraint that would probably be removed by this as well (as you know, but maybe others don't ;)). It is a constraint that I've currently written as

pub trait Traceable where for<'a> &'a Self: TraceRef<'a, ...> {
    ...
}

which I would love to indicate that for each &'a Self we might have, it implements TraceRef<'a ...>, rather than that Self needs to be 'static, which is what it seems like Rust concludes.

At the moment I'm ok with the 'static bounds, but ... always good to generalize and stuff.

The way Rust generics work is that we take the generic bounds, and use them to resolve type variables into specific types... but it's impossible to actually resolve a for<> construct with a where bound in a way which removes the bound.

Note that the issue with the Reborrow trait in particular could probably be resolved in a much more elegant way:

trait Reborrow<T> {
    type Output<'a>; // https://github.com/rust-lang/rfcs/issues/324
    fn reborrow<'a>(&'a mut self) -> Self::Output<'a>;
}
1 Like

I knew that, but you're right, if we're asking for new features why not ask for the one we want -- HKT.

Isn't it still nonsense though that for<'c> Reborrow<'c, Vec<T>>, requires that T: 'static ? I think that a "restricted" variant of the for bound would make sense.

I think that kind of new bound would be all that's needed, because in the T: 'static case, a subscope reborrow still compiles — a borrow of &'a mut self in the Reborrow<'a> trait, and yet, this borrow must be restricted to just the body of the loop, so we know the 'a can not have been inferred to a lifetime known from the function signature alone (?).

The problem is that the "bound" you're proposing isn't really a bound in the same sense as a bound on a function definition; it becomes part of the type. For example, someone could write let x: &for<'a> where 'b: 'a Foo<'a, 'b> = &1;.

It might be feasible, but I'm not sure I understand all the implications; at the very least, figuring out whether two types are equivalent is very complicated.

I wonder how the G: for<'c> Reborrow<'c, Vec<T>> is inferred in that code that compiles.

It seems to be it simply uses G: Reborrow<'static, Vec<T>>. Yet then, the .reborrow() call would require a 'static borrow for its method call, so there is some borrow shrinking going on?

I've used this kind of interface.

It doesn't get past these two points:

  • You need to use the trait generically, i.e. accept a reference that you can call the iterator method on
  • The iterator elements are non-'static. (Need to be 'static due to how for<'a> bounds work).

I only ever get it to work with the explicit lifetime parameterization (see look_at_ref), and its problem is that then you can only call the method with exactly that 'a lifetime. You can't delegate to another function, or borrow the iterator for just a subscope.