Understanding differences in references

I'm a complete newbie in Rust and I'm trying to understand different kinds of references.
I have an a: HashSet<String> and I want to iterate over its elements. I can write this code:

for s in a.iter() {
    ...
}

This works, but I don't understand how. Are my String members copied or moved?
Ok, I tried to iterate with a reference:

for &s in a.iter() {
   ...
}

And the compiler gave me a hint: 'to prevent move, use ref s or ref mut s'. So the third option is to write this:

for ref s in a.iter() {
   ...
}

And now I don't understand what is going on with references and iterators in this language.
It would be great if someone can tell me.

The for loop matches an irrefutable pattern against the Some(_) results of the iterator. Your first and last examples behave the same in practice.

In both cases s is a reference to the String, and it can be used wherever a string slice is expected (the slicing business is due to auto-deref).

The iter for a HashSet returns references to T's in the form of Option<&'a T>; the for loop is sugar over calling next() on the iter. As @llogiq said, for String you get references that auto-deref to string slices. To answer your question, no copies (String isn't Copy anyway) or moves occur.

Also, the Iterators section of the book expands on this.
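To make the "sugar over next()" point concrete, here is a rough sketch of what the loop amounts to when written by hand. The function name total_len is made up for illustration, and the real desugaring has a few more details; the point is only that each s is a borrow and the set stays usable afterwards.

```rust
use std::collections::HashSet;

// Hypothetical helper: sums the lengths of all strings while only borrowing the set.
fn total_len(a: &HashSet<String>) -> usize {
    let mut total = 0;
    let mut it = a.iter();
    // Roughly what `for s in a.iter() { ... }` does under the hood.
    while let Some(s) = it.next() {
        let s: &String = s; // a borrow; nothing is copied or moved
        total += s.len();
    }
    total
}

fn main() {
    let mut a = HashSet::new();
    a.insert("hello".to_string());
    a.insert("hi".to_string());
    println!("total length: {}", total_len(&a));
    // `a` is still usable here because the loop only borrowed it.
    println!("elements: {}", a.len());
}
```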

There's quite a number of things happening here. Let's break it down.

First, for clarity, the actual types in your examples:

  • In the first loop, s is a &String. (a reference is taken)
  • In the second loop, s is a String. (and it tries to move the string, causing a compile failure)
  • In the third loop, s is a &&String. (yes, really)
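If you want the compiler to confirm those types, one option is to add explicit annotations inside each loop, as in the sketch below. The function loop_types is a made-up name, and the middle loop uses cloned() as one possible way to get owned Strings, since the original second loop does not compile.

```rust
use std::collections::HashSet;

// Hypothetical helper: runs the three loop shapes and sums string lengths in each.
fn loop_types(a: &HashSet<String>) -> (usize, usize, usize) {
    let (mut n1, mut n2, mut n3) = (0, 0, 0);

    for s in a.iter() {
        let r: &String = s; // first loop: s is &String
        n1 += r.len();
    }

    // `for &s in a.iter()` is rejected because it would move each String
    // out of the set. Cloning explicitly is one way to get owned values.
    for s in a.iter().cloned() {
        let owned: String = s;
        n2 += owned.len();
    }

    for ref s in a.iter() {
        let rr: &&String = s; // third loop: s is &&String
        n3 += rr.len(); // method calls auto-deref through both references
    }

    (n1, n2, n3)
}

fn main() {
    let mut a = HashSet::new();
    a.insert("hello".to_string());
    println!("{:?}", loop_types(&a));
}
```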

The "Item" type of an iterator

This is what you are seeing in your first example:

for s in a.iter() { ... }

The type of s in this case is whatever the Item type is of the IntoIterator impl of your iterable. However...

  • By "your iterable", I am referring to the type a.iter() returns, which is... this... thing (std::collections::hash_set::Iter).
  • That... thing doesn't actually have an explicit IntoIterator impl because it is actually itself an Iterator. (all Iterators implicitly implement IntoIterator with a matching item type)
  • By cross-referencing the type parameters used in the signature of HashSet::iter() with that Item type, one can eventually deduce the Item type.

Easy as pie, right?

...thankfully, you don't usually ever have to look any of the above stuff up. Most iter methods are kind enough to document the item type. Here, the documentation for HashSet::iter tells us that the item type is &'a T (i.e. &String).
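Another way to check an item type without reading docs is to pass the iterator to a function that demands a specific Item. This is only a sketch; count_refs is a made-up helper, not something from the standard library.

```rust
use std::collections::HashSet;

// Hypothetical helper: only accepts iterators whose Item is &String.
fn count_refs<'a, I>(iter: I) -> usize
where
    I: Iterator<Item = &'a String>,
{
    iter.count()
}

fn main() {
    let mut a = HashSet::new();
    a.insert("x".to_string());
    // Compiles, so the Item type of a.iter() really is &String.
    println!("{}", count_refs(a.iter()));
    // count_refs(a.into_iter()) would be rejected: that Item type is String.
}
```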

Pattern matching

Almost all of the rest of your questions are really about pattern matching. So let's get rid of the iterator, and focus on a let statement, which shares exactly the same syntax (except that we can now easily put various types of things on the right hand side to experiment!)

Here's various values and their types:

let one = 1i32;
let x = one;   // type is i32
let x = &one;  // type is &i32
let x = &&one; // type is &&i32
let x = (one,one);    // type is (i32, i32)
let x = (&one, &one); // type is (&i32, &i32)
let x = &(one, one);  // type is &(i32, i32)

However, x doesn't have to be an identifier here; it is more generally a pattern, which lets you pick apart the value on the right hand side as you create variables:

let one = 1i32;
let x = one;     // x is i32
let &x = &one;   // x is i32
let &&x = &&one; // x is i32
let (x, y) = (one, one);       // x and y are i32
let (&x, &y) = (&one, &one);   // x and y are i32
let &(x, y) = &(one, one);     // x and y are i32

The theme here is that syntax used in a pattern effectively "cancels out" parts of the type. So adding a & to a pattern actually removes a & from the type you create!

If we use a non-copy type like String, then we cannot create more String variables from it without explicitly cloning it, so many of the above examples will stop compiling (like your second loop). We need to create &String type variables instead:

let hello = "hello".to_string();
let x = &hello;
let &x = &&hello;
let (x, y) = (&hello, &hello);

ref patterns

Short version: The pattern ref x is like the opposite of &x. It automatically borrows the thing it matches against.

Long version: Why does this exist? Consider this.

let hello = "hello".to_string();
let world = "world".to_string();
let tuple = (hello, world);
let hard_to_match = &tuple;  // <-- type &(String, String)

How can we match against hard_to_match to create two &String variables? None of these work:

let (x, y) = hard_to_match;   // type error; (_,_) vs &(_,_)
let &(x, y) = hard_to_match;  // error: move out of borrowed content

The problem is that we cannot reach the innards of the type without dereferencing it, but dereferencing it causes a move. To get around this, we need to reborrow its innards with ref patterns:

let &(ref x, ref y) = hard_to_match;  // x is &String, y is &String
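For anyone reading this on a newer compiler: since Rust 1.26, patterns get "default binding modes", so the plain let (x, y) = hard_to_match; line listed above as a type error is accepted and binds both fields by reference. The sketch below shows both spellings side by side; the function names are made up for illustration.

```rust
// Explicit `ref` bindings, as in the post above.
fn lengths(pair: &(String, String)) -> (usize, usize) {
    let &(ref x, ref y) = pair;
    let (x, y): (&String, &String) = (x, y); // both are borrows
    (x.len(), y.len())
}

// On Rust 1.26 and later, a tuple pattern matched against a reference
// binds its fields by reference automatically.
fn lengths_ergonomic(pair: &(String, String)) -> (usize, usize) {
    let (x, y) = pair;
    let (x, y): (&String, &String) = (x, y); // same types as above
    (x.len(), y.len())
}

fn main() {
    let tuple = ("hello".to_string(), "world!".to_string());
    println!("{:?}", lengths(&tuple));
    println!("{:?}", lengths_ergonomic(&tuple));
}
```

The &(x, y) form without ref still fails for String, because it asks for the fields by value.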

The compiler's suggestion was silly

Your use case is not at all the kind of place where you need to use ref. You see, when the compiler suggested using ref, it was actually suggesting that you add a ref to the &s pattern: (to prevent the move)

for &(ref s) in a.iter() { ... }

However, this would be identical to:

for s in a.iter() { ... }

So... yeah. :stuck_out_tongue:

Deref Coercion

Even though your third example creates a &&String, it generally isn't a huge deal because Rust freely coerces between a variety of borrowed forms of types:

let hello = "hello".to_string();
let x: &String = &&hello;  // Coerces &&String into &String
let x: &str = &hello;      // Coerces &String into &str

This isn't to say &&String and &String are identical; but most code that compiles with a &String will probably also compile with a &&String.
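The same coercions kick in at function call sites, which is where you usually notice them. A small sketch, with shout being a made-up function that takes &str:

```rust
// Hypothetical function that only accepts a string slice.
fn shout(s: &str) -> String {
    s.to_uppercase()
}

fn main() {
    let hello = "hello".to_string();
    let r: &String = &hello;
    let rr: &&String = &r;
    // Both calls compile: &String and &&String are coerced to &str.
    println!("{}", shout(r));
    println!("{}", shout(rr));
}
```

Where it tends to stop working is in generic code, since coercion only applies when the expected type is already known.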

When in doubt about the type of something...

#![feature(core_intrinsics)]

fn print_type_of<T>(_: T) {
    println!("{}", unsafe { std::intrinsics::type_name::<T>() });
}

fn main() {
    let mut set = std::collections::HashSet::new();
    set.insert(String::new());
    
    for s in set.iter() {
        print_type_of(s); // prints &std::string::String
    }

    for ref s in set.iter() {
        print_type_of(s); // prints &&std::string::String
    }
}
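
On a stable compiler, std::any::type_name can be used for the same trick without the feature gate. A sketch (type_name_of is a made-up helper; the exact text returned is not guaranteed and may differ between compiler versions, so treat it as a debugging aid only):

```rust
use std::collections::HashSet;

// Hypothetical helper: returns the compiler's name for the argument's type.
fn type_name_of<T>(_: T) -> &'static str {
    std::any::type_name::<T>()
}

fn main() {
    let mut set = HashSet::new();
    set.insert(String::new());

    for s in set.iter() {
        println!("{}", type_name_of(s)); // a single-reference String type
    }

    for ref s in set.iter() {
        println!("{}", type_name_of(s)); // a double-reference String type
    }
}
```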

My favorite way:

let () = s;
error[E0308]: mismatched types
 --> <anon>:5:9
  |
5 |     let () = s;
  |         ^^ expected reference, found ()
  |
  = note: expected type `&&std::string::String`
  = note:    found type `()`

:wink:

Thanks a lot, it was very, very helpful for me to understand all this!

But why the hell do a.iter() and a.into_iter() return iterators with different element types? HashSet::iter() has Item = &T and HashSet::into_iter() from IntoIterator has Item = T. Why??

into_iter takes self by value, i.e. a.into_iter() is consuming a. That means:

  1. The iterator cannot hand out references, because there is no backing store anymore. References would be dangling.
  2. OTOH it can hand out the elements by value.

For iter it's the other way round: it takes self by reference (&self), which means:

  1. The iterator can hand out references, because the a is still there.
  2. But it cannot hand out by value, because a is only borrowed. Returning by value usually means moving, which would destroy the original value.

In other words, the element type mirrors the self type of iter/into_iter.
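
A small sketch of that difference, assuming a made-up function borrowed_then_owned that takes the set by value so it is allowed to consume it at the end:

```rust
use std::collections::HashSet;

// Hypothetical helper: first borrows the elements, then consumes the set.
fn borrowed_then_owned(a: HashSet<String>) -> (usize, Vec<String>) {
    // iter() takes &self, so it can only hand out references.
    let borrowed: Vec<&String> = a.iter().collect();
    let n = borrowed.len();

    // into_iter() takes self, so it hands out the Strings themselves.
    // After this line `a` is gone and cannot be used again.
    let mut owned: Vec<String> = a.into_iter().collect();
    owned.sort(); // HashSet order is unspecified, so sort for stable output
    (n, owned)
}

fn main() {
    let mut a = HashSet::new();
    a.insert("b".to_string());
    a.insert("a".to_string());
    println!("{:?}", borrowed_then_owned(a));
}
```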

To clarify, when I said that Iterators implement IntoIterator with a matching Item type, I was referring to the return type of iter() (not HashSet itself):

// These are all identical:
// ('a' will be taken by reference,
//  and the item type will be &String)
for s in a.iter() { ... }
for s in a.iter().into_iter() { ... }
for s in a.iter().into_iter().into_iter() { ... }

// So are all of these:
// ('a' will be destroyed, the item type will be String,
//  and the Strings are moved without copying)
for s in a { ... }
for s in a.into_iter() { ... }
for s in a.into_iter().into_iter() { ... }

Put another way:

  • An iterator is something that implements Iterator.
  • An iterable is something that implements IntoIterator.
    • Their into_iter consumes the iterable and produces an iterator.
    • These are the things you can use in for loops.
  • All iterators are iterables (their into_iter is a no-op)
  • ...but not all iterables are iterators.
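
As an illustration of an iterable that is not an iterator, here is a sketch with a made-up Names type. It has no next() method of its own, but it can still be used in a for loop because it implements IntoIterator.

```rust
// Hypothetical type: an iterable that is not itself an iterator.
struct Names {
    items: Vec<String>,
}

impl IntoIterator for Names {
    type Item = String;
    type IntoIter = std::vec::IntoIter<String>;

    // Consumes the iterable and produces an iterator over its elements.
    fn into_iter(self) -> Self::IntoIter {
        self.items.into_iter()
    }
}

// The for loop calls into_iter() on `names` behind the scenes.
fn join_all(names: Names) -> String {
    let mut out = String::new();
    for s in names {
        out.push_str(&s);
    }
    out
}

fn main() {
    let names = Names {
        items: vec!["a".to_string(), "b".to_string()],
    };
    println!("{}", join_all(names));
}
```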

Can't you also use "for s in &a" to iterate over the HashSet without consuming it?

Yup and I imagine that's probably the preferred way to write it. (I was trying to circle around that topic since I had already written so much :stuck_out_tongue:)

For the OP or any others not familiar with that syntax, the important thing to understand there is that &HashSet has an explicit implementation of IntoIterator (which is defined to call iter):

impl<'a, T, S> IntoIterator for &'a HashSet<T, S>
where T: Eq + Hash, S: BuildHasher

So for x in &a won't necessarily work for all iterables; just those that provide it (and most collections in the standard library do, so please do take advantage of it!).
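
A quick sketch of that form in use. The function longest is a made-up example; since its parameter is already a &HashSet, the loop can iterate over it directly, and the caller keeps ownership of the set.

```rust
use std::collections::HashSet;

// Hypothetical helper: length of the longest string in the set.
fn longest(a: &HashSet<String>) -> usize {
    let mut best = 0;
    // `a` is a &HashSet<String>, so this uses the IntoIterator impl for the
    // reference type and yields &String, just like a.iter().
    for s in a {
        if s.len() > best {
            best = s.len();
        }
    }
    best
}

fn main() {
    let mut a = HashSet::new();
    a.insert("hello".to_string());
    a.insert("hi".to_string());

    for s in &a {
        println!("{}", s);
    }
    // `a` was only borrowed above, so it can still be used here.
    println!("longest: {}", longest(&a));
}
```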

Your replies in this thread have been very informative, so :thumbsup:!

I'm a bit lost here. Why does the last line move out of borrowed content?

The let statement is using a destructuring pattern, and is destructuring a reference which basically "undoes" the reference aspect and tries to take the value.

Why do we get the error: "move out of borrowed content"?

let &(x, y) = hard_to_match;

is approximately the same as:

let temp = hard_to_match;
let x = temp.0;
let y = temp.1;

I.e. the content of the tuple is assigned by value to x and y.

On the other hand, this

let &(ref x, ref y) = hard_to_match;

would be roughly equivalent to:

let temp = hard_to_match;
let x = &temp.0;
let y = &temp.1;

Here the content is again borrowed which is ok.
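
The field-access version above can be turned into something that compiles, with the rejected line left as a comment. This is just a sketch and borrow_fields is a made-up name.

```rust
// Hypothetical helper: borrows both fields of a borrowed tuple.
fn borrow_fields(hard_to_match: &(String, String)) -> (&String, &String) {
    let temp = hard_to_match;
    // let x = temp.0; // rejected: this would move a String out of a borrow
    let x = &temp.0; // borrowing the field again is fine
    let y = &temp.1;
    (x, y)
}

fn main() {
    let tuple = ("hello".to_string(), "world".to_string());
    let (x, y) = borrow_fields(&tuple);
    println!("{} {}", x, y);
    // The tuple still owns both Strings.
    println!("{}", tuple.0.len());
}
```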
