Beginner question; cloning changes types?

In the code below (playground), I want to return an iterator for a vector contained in a hashmap as a value for a given key.

It feels inefficient to clone the vector before obtaining the iterator but, without the cloning, I get a compile error, which I do not understand

Is there a way to avoid cloning, keep the function signature, and still work around the compile error?

// code from playground
use std::collections::{HashMap};
use std::hash::Hash;

fn get_iterator<'a, K, V>(map: &HashMap<&'a K, Vec<&'a V>>, key: &K) -> Option<impl Iterator<Item=&'a V>>
    where V: 'a,
          K: Eq + Hash {
    match map.get(&key) {
        None => { None }
        Some(vec) => {
            let it = vec.clone().into_iter();
            Some(it)
        }
    }
}

confused thinking

I guess that I assume that an iterator over a vector of references &T will dereference the elements to T before adding its own reference again to get &T but that this does not happen resulting in &&T.
However, it appears that cloning somewhat implements this process while I would have guessed that cloning retains the type?

dispensable attempt

I tried to change the type of the hashmap but this gave me eventually problems elsewhere in my code where I create these HashMaps since I create the stored vecs incrementally requiring mutable vecs and these then require HashMap<&'a K, &'a mut Vec<&'a V>> but then I do not know how to fix the following get_iterator function for that change again.

use std::collections::{HashMap};
use std::hash::Hash;

fn get_iterator<'a, K, V>(map: &HashMap<&'a K, &'a Vec<&'a V>>, key: &K) -> Option<impl Iterator<Item=&'a V>>
    where V: 'a,
          K: Eq + PartialEq + Hash,
          V: Eq + PartialEq + Hash {
    match map.get(&key) {
        None => { panic!("unknown key") }
        Some(vec) => {
            let it = vec.into_iter().map(|x|*x);
            Some(it)
        }
    }
}

Using map to dereference one layer is probably the best way, would this work for you?

fn get_iterator<'a, K, V>(map: &'a HashMap<&'a K, Vec<&'a V>>, key: &'a K) -> Option<impl Iterator<Item=&'a V> + 'a>
    where V: 'a,
          K: Eq + Hash {
    match map.get(&key) {
        None => { None }
        Some(ref vec) => {
            Some(vec.iter().map(|x| *x))
        }
    }
}

You can write it like this:

fn get_iterator<'a, 'q, K: Eq + Hash, V>(
    map: &'q HashMap<&'a K, Vec<&'a V>>,
    key: &'q K,
) -> Option<impl Iterator<Item = &'a V> + 'q> {
    Some(map.get(&key)?.iter().copied())
}
3 Likes

The variable vec has type &Vec<&'a V>. When you call .clone(), you get a Vec<&'a V>.

The Clone::clone method's receiver type is &Self, so the default thing for .clone() to do is go from &T to T. Any time you add a .clone() call and the type doesn't change, that's because method lookup saw you had some T: Clone and automatically took a reference so you're calling Clone::clone() on an &T.

In this case, cloning is not a good way to solve this problem, because it spends time allocating and copying into a new Vec that is not needed. vec.iter().copied() is good because it iterates over the original vector instead.

4 Likes

Here's a demo of the effect of .clone() on types:

use std::any::type_name_of_val;

fn main() {
    let x = &&&"hello world";
    dbg!(type_name_of_val(x));
    let x = x.clone();
    dbg!(type_name_of_val(x));
    let x = x.clone();
    dbg!(type_name_of_val(x));
    let x = x.clone();
    dbg!(type_name_of_val(x));
    let x = x.clone();
    dbg!(type_name_of_val(x));
}
[src/main.rs:5:5] type_name_of_val(x) = "&&&str"
[src/main.rs:7:5] type_name_of_val(x) = "&&str"
[src/main.rs:9:5] type_name_of_val(x) = "&str"
[src/main.rs:11:5] type_name_of_val(x) = "str"
[src/main.rs:13:5] type_name_of_val(x) = "str"

Once we run out of references, auto-ref kicks in and the type doesn't change any further because each method call is adding a reference that gets immediately removed. (And the compiler warnings tell us that that call does nothing.)

(Note that the final steps print no &s at all because type_name_of_val takes a reference and prints the name of the type of the referent; it's not the case that x is ever of type str.)

1 Like

Thanks!
This automatic coercion with multiple references was new to me.
It is also described here.
I also tested it for my self with

#[test]
fn foo() {
    let x1: &&str = &&"asd";
    let x2: &str = &&"asd";
    let x3: &str = *&&"asd";
    println!("{x1} {x2} {x3}"); // printing "asd asd asd"
}

It would be great if you could check my understanding of your solution :face_with_peeking_eye:

The two solutions seem to be very closely related to each

  • The use of copied() is pretty equal to .map(|x|*x) according to the doc
  • Lifetime annotation + 'q in return type: does this specify the lifetime of the iterator itself?
  • Usage of two lifetimes in general: in which context could it be helpful to use the solution with two lifetimes 'q and 'a (I guess it gives some more freedom for the usage)?
    • I suppose that having recursive types as &'q HashMap<&'a K, Vec<&'a V>> generally allows to use different lifetimes having inner lifetimes (here 'a) being longer than outer lifetimes (here 'q) just for the case that the wrapper (here the hashmap) is disposed before the content (here the keys and values in the hashmap) is disposed.
    • But why would key: &'q K make more sense than key: &'a K? Sure, it is also at the outermost level but it should be a key in the hashmap and the keys there have lifetime 'a... On the other hand, the iterator depends on the key and, for the iterator, the key only has to live until the death of the iterator... so that also makes sense (unless the iterator does not depend on the key after its construction anymore but with borrowing this may be the case?).

Sort of. It requires the returned iterator to meet a 'q bound, which means it is valid for at least 'q. So any lifetimes in the type of the returned iterator have to be at least as long as 'q.

It's also a workaround in this case for the fact that 'q wouldn't otherwise show up in the return type (the return type doesn't "capture" 'q), which is a type of API violation: other code might assume the returned iterator could live as long as 'a, given the signature without + 'q.

Inside traits, return position impl Trait automatically captures all lifetimes and you don't need the workaround.[1] In a future edition, non-trait RPIT such as this example probably will as well.

It's a workaround because sometimes the 'q bound is undesirable, like if it forces some other generic to meet a 'q bound when it doesn't need to. I don't think that distinction is terribly important in this example, though.

It's useful if you need to retain longer lifetimes, without borrowing the map longer than is necessary. For example, it's required for this to work:

fn example<'a, K: Eq + Hash, V>(
    map: &mut HashMap<&'a K, Vec<&'a V>>,
    key: &K,
) {
    let iter = get_iterator(&*map, key).unwrap();
    let value = iter.last().unwrap();
    map.get_mut(key).unwrap().push(value);
}

If the lifetimes were required to be the same, the map would still be borrowed after you were done with the iterator, and you wouldn't be able to push the copied value back into a map entry.

The lifetime on the key doesn't need to matter at all, it should just be used to do the lookup in the function body and then it is no longer needed (there's no need for the iterator to capture it, say).

This change removes the need to make the lifetime on the key and on the map the same.

 fn get_iterator<'a, 'q, K: Eq + Hash, V>(
-     map: &'q HashMap<&K, Vec<&'a V>>,
+     map: &'q HashMap<&'a K, Vec<&'a V>>,
-    key: &'q K,
+    key: &K,
) -> Option<impl Iterator<Item = &'a V> + 'q> {
-    Some(map.get(&key)?.iter().copied())
+    Some(map.get(key)?.iter().copied())
}

The reasons are pretty obscure, and I wouldn't expect a newcomer to figure it out.

You pass a &Q to HashMap::<HmK, HmV>::get where HmK: Borrow<Q>. In the example, HmK = &'a K upon entry to the function. Now, &'a K: Borrow<&'a K>, but only when the lifetimes are the same due to how generics work with the blanket T: Borrow<T> implementation. To pass &key, it seems at first, you would need Q = &'a K, so key: &'a K.

But inside the function, due to covariance, the &'q HashMap<&'a K, _> can be coerced to a &'q HashMap<&'q K, _> (without coercing the lifetimes in the Vec<_>). Then you could pass a &&'q K to get instead (Q = &'q K). But to do so, you need key: &'q K.

Why couldn't the compiler just coerce the map to &'q HashMap<&'shorter K, _>? That would be a reference &'q _ that outlives the referent (HashMap<&'shorter K, _>), which is unsound. Implicitly, it has to be the case that 'shorter: 'q.

So then why couldn't the compiler just coerce the map to &'shorter HashMap<&'shorter K, _>? Because the return type annotation (+ 'q) says that the borrow of the contents of the map which you are returning has to be at least 'q.

A natural follow-up question is, why did changing get(&key) to get(key) fix things? In this case you utilize a different impl Borrow<T> for &T implementation. Instead of Borrow<&'x K> for &'x K you're now using impl Borrow<K> for &K, so Q = K and the lifetime of key (or &K) doesn't matter.


  1. Instead you need workarounds to avoid capturing lifetimes when you don't want to. ↩ī¸Ž

1 Like

This topic was automatically closed 90 days after the last reply. We invite you to open a new topic if you have further questions or comments.