Tricky Cow Ownership

Like any good farmer, Rust is my programming language of choice. But these darn Cows keep wandering into the field over and mixing with the neighbour's herd, and next thing you know, you can't tell who owns what. Will you help out an honest farmer?

The following compiles.

use std::borrow::Cow;
use std::collections::HashMap;

struct Foo<'a> {
    item: Cow<'a, str>,
}

fn experiment<'a>(foos: &'a [Foo<'a>]) -> Vec<&'a str> {
    let mut strings: Vec<&'a str> = vec![];

    for f in foos {
        let thing: &'a str = &f.item;
        strings.push(thing);
    }

    strings
}

We're iterating through a borrowed structure whose elements contain a Cow. Regardless of the internal ownership status of the innards of the Cow (either a &str it got from somewhere else, or an owned String it had to clone at some point), we're able to copy all those references into a new container (Vec here) and give it back. No ownership has transferred; it's just like we're "stripping off" the Cowness.

Alright, let's try a HashMap instead:

fn experiment_2<'a>(foos: &'a [Foo<'a>]) -> HashMap<&'a str, &'a str> {
    let mut strings: HashMap<&'a str, &'a str> = HashMap::new();

    for f in foos {
        let thing: &'a str = &f.item;
        strings.insert(thing, thing);
    }

    strings
}

This compiles too. So far so good. One little tweak and we can break it:

fn experiment_3<'a>(foos: &'a mut [Foo<'a>]) -> HashMap<&'a str, &'a str> {
    let mut strings: HashMap<&'a str, &'a str> = HashMap::new();

    for i in 0..foos.len() {
        match foos.get_mut(i) {
            Some(f) => {
                let thing: &'a str = &f.item;
                strings.insert(thing, thing);
            }
            None => {}
        }
    }

    strings
}

Calling get_mut here gives us the dreaded:

*foos was mutably borrowed here in the previous iteration of the loop

I've read elsewhere about why this is, and mostly understand. Rust wants to know that I'm not mutably borrowing the same index more than once. But actually, why should Rust care? What if I want to borrow the same index more than once? In the code above that won't happen, but in my real-world scenario that's precisely where I find myself.

Furthermore, getting rid of the Cow but leaving the other code the same gets rid of the error:

struct Bar<'a> {
    item: &'a str,
}

fn experiment_4<'a>(bars: &'a mut [Bar<'a>]) -> HashMap<&'a str, &'a str> {
    let mut strings: HashMap<&'a str, &'a str> = HashMap::new();

    for i in 0..bars.len() {
        match bars.get_mut(i) {
            Some(f) => {
                let thing: &'a str = &f.item;
                strings.insert(thing, thing);
            }
            None => {}
        }
    }

    strings
}

This compiles just fine. Why? What about Cow-nature makes Rust mad in the penultimate case, but not the final? Yes, using get instead of get_mut avoids the error, but in my real-world case I need get_mut. Any idea what's going on?

Thank you kindly.

Experiment 3 fails to compile because if thing points into an owned cow, any further mutable access to foos might result in overwriting the cow at that index with some other cow, invalidating the thing pointer. This is why simultaneous overlapping mutable access to foos is not allowed.

In experiment 4, it is guaranteed that thing is not invalidated by changes to bars, so this problem is avoided. Even if you overwrite the item field, whatever the reference points to was not invalidated by that.

Note that if you had used two different lifetimes for the argument, the problem becomes more clear. E.g. if experiment 3 takes &'short mut [Foo<'long>], then the type of thing would be forced to be &'short str, whereas in experiment 4, it is able to have the type &'long str.
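To make the two-lifetime point concrete, here is a minimal sketch that works on single elements rather than slices. The helper names peek_foo and peek_bar and the lifetime names 'short and 'long are made up for illustration; they are not from the original post.

```rust
use std::borrow::Cow;

struct Foo<'a> {
    item: Cow<'a, str>,
}

struct Bar<'a> {
    item: &'a str,
}

// Hypothetical helper: going through the Cow, the result can only be tied to
// the borrow of `foo` itself ('short), because the str might live inside an
// owned String that is stored in `foo`.
fn peek_foo<'short, 'long>(foo: &'short mut Foo<'long>) -> &'short str {
    &foo.item
}

// Hypothetical helper: the plain reference is Copy, so it can be copied out
// with its full 'long lifetime, independent of how long `bar` is borrowed.
fn peek_bar<'short, 'long>(bar: &'short mut Bar<'long>) -> &'long str {
    bar.item
}
```

With peek_bar, the returned &str stays usable even after bar is modified again. With peek_foo, the compiler keeps foo borrowed for as long as the returned &str is alive, which is the same restriction that bites in the loop in experiment 3.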

Of course, in experiment 3 it would compile if you match on the cow and only push it in the case where the cow is borrowed. E.g.:

use std::collections::HashMap;
use std::borrow::Cow;

struct Foo<'a> {
    item: Cow<'a, str>,
}

fn experiment_3<'a>(foos: &'a mut [Foo<'a>]) -> HashMap<&'a str, &'a str> {
    let mut strings: HashMap<&'a str, &'a str> = HashMap::new();

    for i in 0..foos.len() {
        match foos.get_mut(i) {
            Some(f) => {
                match &f.item {
                    Cow::Owned(_) => { /* do nothing */ },
                    Cow::Borrowed(thing) => {
                        strings.insert(thing, thing);
                    }
                }
            }
            None => {}
        }
    }

    strings
}

This works because in the case when the cow is borrowed, overwriting the cow in a later iteration cannot invalidate thing.
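The same idea can be pulled out into a small helper, sketched here under the assumption that you only care about the borrowed case. The name borrowed_str is hypothetical. The interesting part is the signature: the result carries the Cow's own lifetime parameter, not the lifetime of the reference to the Cow.

```rust
use std::borrow::Cow;

// Hypothetical helper: returns the long-lived &str only when the Cow is
// borrowed. In the owned case there is no 'long str to hand out, so we
// return None.
fn borrowed_str<'long>(cow: &Cow<'long, str>) -> Option<&'long str> {
    match cow {
        Cow::Borrowed(s) => Some(*s),
        Cow::Owned(_) => None,
    }
}
```

Because the returned reference does not depend on the borrow of the Cow, it stays valid even after the Cow itself is overwritten or dropped.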

I think this was the source of my misunderstanding. Since I saw a &str coming out of the Cow-to-str deref, I would assume that that would be the same immutable string, forever. Especially since the original foos was also of 'a, and got its strs from somewhere else.

Oh wait a minute, maybe that's exactly what you're saying. In the case where the original Cow had an internal owned String, and I took a reference to it, there's no guarantee that further iterations wouldn't overwrite/replace that internal String again. There'd be no true owner anymore of the String my reference points to.

Exactly. This is also what I tried to illustrate with my example. It is exactly the case where the cow has an internal owned string that causes trouble.

Not exactly related, but keep in mind that you almost never want &'a [Something<'a>] or &'a mut [Something<'a>] (the same lifetime in both places); use different lifetimes instead. Sharing one lifetime makes your API harder to use. @jonhoo's latest video goes into detail on why.

Thanks everyone! I'm going to ask a follow-up in a different thread to clarify my understanding of Cows.

Experiment 3 also works if you don't use indexing, and instead use iterators (note that this version also gives the slice elements their own lifetime, Foo<'_>):

fn experiment_3<'a>(foos: &'a mut [Foo<'_>]) -> HashMap<&'a str, &'a str> {
    let mut strings: HashMap<&'a str, &'a str> = HashMap::new();

    for f in foos {
        let thing: &'a str = &f.item;
        strings.insert(thing, thing);
    }

    strings
}

In my real case, I'm not iterating through foos. I'm iterating through another collection and calling get_mut on foos, which is actually a HashMap.
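For that kind of situation, one option that might fit is to split the work into two passes: do all the get_mut mutations first, then downgrade to a shared borrow and collect the &strs. This is only a sketch under assumptions: the function name mutate_then_collect, the u32 key type, and the "append a !" mutation are all made up, since the real code isn't shown. It only helps if the mutations don't depend on the collected references.

```rust
use std::borrow::Cow;
use std::collections::HashMap;

struct Foo<'a> {
    item: Cow<'a, str>,
}

// Hypothetical two-pass version: `keys` stands in for the "other collection",
// and it may contain the same key more than once.
fn mutate_then_collect<'a, 'long>(
    foos: &'a mut HashMap<u32, Foo<'long>>,
    keys: &[u32],
) -> Vec<&'a str> {
    // Pass 1: every mutable borrow ends before the next iteration starts.
    for k in keys {
        if let Some(f) = foos.get_mut(k) {
            // Placeholder mutation; to_mut() clones into an owned String if needed.
            f.item.to_mut().push('!');
        }
    }

    // Pass 2: downgrade to a shared borrow for the rest of 'a. Shared borrows
    // may overlap, so looking up the same key repeatedly is fine.
    let foos: &'a HashMap<u32, Foo<'long>> = foos;
    let mut strings: Vec<&'a str> = Vec::new();
    for k in keys {
        if let Some(f) = foos.get(k) {
            let thing: &'a str = &f.item;
            strings.push(thing);
        }
    }

    strings
}
```

Once the map is only borrowed immutably, nothing can overwrite a Cow while the collected references are alive, so the owned case is no longer a problem.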
