Another lifetime problem: Cow and slices

I have another lifetime problem that I'm trying to solve in Rust. I have a structure with two fields:

  1. A collection of Cows.
  2. Another field that holds slices referencing the data from the first collection (for simplicity, let's say it references the first Cow).


This structure isn't strictly self-referential, because the second field doesn't reference the structure itself but rather the data it refers to. Given that, I don't see why this should be impossible from a memory-layout perspective. However, I'm unsure whether it can be achieved without unsafe code. Any help?

It is self-referential.

If struct A has field x, then borrowing &a.x borrows a. That's the rule. Otherwise, nothing ever would work without stable addresses (practically, heap allocation). The type system doesn't know and doesn't care about heap allocation (if it did, then again, nothing would ever work, because you would suddenly need to care whether or not every reference points to the heap).

You should consider this to realize why your current code can't possibly compile.
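
As a minimal sketch of the rule stated above (this is not the linked example; the struct `A`, its field `x`, and the helper `first_byte` are hypothetical names chosen for illustration), borrowing a field keeps the whole value borrowed, whether or not the field's contents live on the heap:

```rust
// Hypothetical struct with an inline (non-heap) field.
struct A {
    x: [u8; 16],
}

// Returns a reference into the field; the result borrows all of `a`.
fn first_byte(a: &A) -> &u8 {
    &a.x[0]
}

fn main() {
    let a = A { x: [7; 16] };
    let r = &a.x; // `a` is now borrowed for as long as `r` is in use

    // Uncommenting the next line is rejected by the compiler:
    // "cannot move out of `a` because it is borrowed".
    // let moved = a;

    println!("{}", r[0]);
    println!("{}", first_byte(&a));

    // After the last use of `r`, the borrow has ended and the move is allowed.
    let moved = a;
    println!("{}", moved.x[0]);
}
```

A struct that stored `r` next to `a.x` would need that borrow to last as long as the struct itself, which is what makes it self-referential.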


Hello, thanks for the answer. Your example with [u8; 16] is very different. It's a truly self-referential structure because it refers to itself, so it's not possible to move Struct1 in your example. In the original problem the structure does not refer to itself; rather, there are multiple immutable references to some memory outside of it.

Also, am I right that the original problem can be solved with unsafe?

Note that Cow has an owning variant. So if you have a Cow::Owned, you do in fact have full ownership of the data you are trying to reference, making your struct self-referential. If you used &'a str instead of Cow as the element type of your data vector (which would in fact make your slice vector elements refer to data not owned by your type), your code would compile fine.
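
A minimal sketch of that &'a str variant, assuming the original struct is called Struct1 with fields data and slices (the constructor name `new` is an assumption made for illustration):

```rust
pub struct Struct1<'a> {
    pub data: Vec<&'a str>,
    pub slices: Vec<&'a str>,
}

impl<'a> Struct1<'a> {
    // Panics if `data` is empty, because it indexes the first element.
    pub fn new(data: Vec<&'a str>) -> Self {
        // `data[0]` is a `&'a str`, which is `Copy`. Copying it out yields a
        // reference to memory owned elsewhere, not a borrow of `data` itself,
        // so `data` can still be moved into the struct afterwards.
        let slices = vec![data[0]];
        Self { data, slices }
    }
}

fn main() {
    let owner = String::from("hello world");
    let s = Struct1::new(vec![&owner[..5], &owner[6..]]);
    println!("{:?} {:?}", s.data, s.slices);
}
```

Here the struct owns no string data at all; both vectors point into `owner`, which must outlive the struct.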


It's not, it's exactly the same as far as the borrow checker is concerned. That's the whole point I was trying to make.

There's no difference between Cow<InlineType> and Cow<HeapAllocated> as far as the borrowing and lifetime structure of the relevant methods are concerned, therefore if it is not correct with an inline type like an array, then it can't be allowed with a heap-allocating type like Vec, either.


For a concrete example of why Vec elements being on the heap doesn't "fix" your struct being self-referential, consider the following method:

impl<'a> Struct1<'a> {
    fn bye(&mut self) {
        self.data = vec![];
    }
}

It compiles, but if slices contained references to the contents of data, they would dangle after calling bye.

The limitations of self-referential structs go beyond "things dangle after a move", even if there were a language distinction between heap and stack pointers (which there isn't).


In your original problem you restricted the slices to the Borrowed variant. If this is the case you do not have a self referential struct and you can do the following:

        let slices = data.iter().filter_map(|elem| {
            if let Cow::Borrowed(slice) = elem {
                Some(*slice)
            } else {
                None
            }
        }).collect();
        Self { data, slices }

This is possible because we can extract a &'a str from Borrowed but not from Owned.
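
For context, here is a self-contained sketch of where that fragment could live. The struct layout and the constructor name `new` are assumptions; the thread only shows the body:

```rust
use std::borrow::Cow;

pub struct Struct1<'a> {
    pub data: Vec<Cow<'a, str>>,
    pub slices: Vec<&'a str>,
}

impl<'a> Struct1<'a> {
    pub fn new(data: Vec<Cow<'a, str>>) -> Self {
        let slices = data
            .iter()
            .filter_map(|elem| match elem {
                // A borrowed variant already holds a `&'a str` that points
                // outside of `data`, so it can simply be copied out.
                Cow::Borrowed(slice) => Some(*slice),
                // An owned variant would require borrowing from `data` itself.
                Cow::Owned(_) => None,
            })
            .collect();
        Self { data, slices }
    }
}

fn main() {
    let s = Struct1::new(vec![Cow::Borrowed("abc"), Cow::Owned("def".to_string())]);
    println!("{:?}", s.slices);
}
```

Only the borrowed element ends up in slices; the owned one is skipped rather than referenced.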


All of that makes sense to me. Thanks for these eloquent examples.

Is it possible to somehow consume the data field to hide it and expose only slices? The slices field is the only thing that makes sense to expose, so it would be safe in that case. Is this possible without unsafe?

Maybe you can just pass the vector of Cows by reference and not consume it at all? Here is an example:

use std::borrow::Cow;

pub struct Struct1<'a> {
    slices: Vec<&'a str>,
}

impl<'a> Struct1<'a> {
    pub fn extract(data: &'a [Cow<'_, str>]) -> Self {
        let slices = vec![data[0].as_ref()];
        Self { slices }
    }
}

fn main() {
    let data: Vec<Cow<'_, str>> = vec![Cow::Borrowed("abc"), Cow::Owned("def".to_string())];
    let s1 = Struct1::extract(&data);
}


If you need to consume the vector (take ownership of the data), I would work with owned types instead (i.e. store slices as Vec<String>).


A different solution to the problem of collecting slices of strings, which involves neither copying nor lifetimes, is to use a reference-counted string type that permits slicing, like arcstr::Substr. Then you don’t need to worry about lifetime annotations allowing you to borrow the original strings, because their ownership is shared.
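
To illustrate the idea of shared ownership without pulling in a crate, here is a rough standard-library stand-in. It is not the arcstr API: `SharedSlice` and its methods are hypothetical names, and it simply pairs an `Arc<str>` with a byte range:

```rust
use std::ops::Range;
use std::sync::Arc;

// Hypothetical stand-in for a sliceable reference-counted string.
#[derive(Clone)]
pub struct SharedSlice {
    owner: Arc<str>,
    range: Range<usize>,
}

impl SharedSlice {
    // Panics if `range` is out of bounds or not on a char boundary.
    pub fn new(owner: &Arc<str>, range: Range<usize>) -> Self {
        // Index once up front so an invalid range fails at construction.
        let _ = &owner[range.clone()];
        Self { owner: Arc::clone(owner), range }
    }

    pub fn as_str(&self) -> &str {
        &self.owner[self.range.clone()]
    }
}

fn main() {
    let text: Arc<str> = Arc::from("hello world");
    let word = SharedSlice::new(&text, 6..11);
    // Dropping the original handle is fine: `word` keeps the string alive.
    drop(text);
    println!("{}", word.as_str());
}
```

No lifetime parameter appears anywhere, because each slice co-owns the string it points into.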


Friends, there actually is a solution.

The solution is to introduce a callback and move all the post-processing code into it. In most programming languages, you would put the post-processing code right after the extraction call. In Rust, you can instead pass it as a callback that the extraction function invokes. This satisfies the borrow checker while remaining safe.
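
A simplified sketch of the callback pattern follows. It is not the exact code from the thread: the function name `with_extracted` and the struct layout are assumptions, and the owned vector stays a local variable instead of a struct field:

```rust
use std::borrow::Cow;

pub struct Struct1<'a> {
    pub slices: Vec<&'a str>,
}

// Takes ownership of `data`, builds the slices, and hands them to `f`.
// The callback sees a lifetime local to this call, so its result `R`
// cannot hold on to any of the slices.
pub fn with_extracted<R>(
    data: Vec<Cow<'_, str>>,
    f: impl FnOnce(&Struct1<'_>) -> R,
) -> R {
    let s1 = Struct1 {
        // `&**c` goes from `&Cow<str>` to `&str` for both variants.
        slices: data.iter().map(|c| &**c).collect(),
    };
    f(&s1)
    // `s1` and then `data` are dropped here, after the callback has returned.
}

fn main() {
    let data = vec![Cow::Borrowed("abc"), Cow::Owned("def".to_string())];
    // The post-processing code lives inside the callback.
    let total_len = with_extracted(data, |s1| {
        s1.slices.iter().map(|s| s.len()).sum::<usize>()
    });
    println!("{total_len}");
}
```

Because the slices never leave the call, owned and borrowed Cows can both be referenced safely.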

Thanks for the hints; they were useful.

Yeah, that's a good solution in this concrete setup.

Note however that it still doesn't really "solve" the original problem; you were only lucky because the lifetime is covariant so it can be shortened. For instance, you can't get a reference of lifetime 'data inside the callback; it can only accept a local lifetime. Accordingly, this still does not compile.

(BTW, you don't need the explicit for binder or the mem::swap().)


If you're willing to use callbacks to avoid popping the stack, you can also easily avoid self-referential structs. As neat as it may be that you can safely construct self-referential structs, their advantages are pretty scant. Avoiding them, even with callbacks, is a more general approach that avoids problems around destructors, not being able to move or otherwise exclusively use the owned resource again, etc.


And now Struct1 is copyable! @quinedot, that's brilliant! Why did you make extract a stand-alone (non-member) function? Was that intentional?

I didn't put a ton of thought into it; as far as I recall, it was just a combo of simplest to write and call. Adapt to what works best for your use case.

Here's a look at a few alternatives. More are possible; there are many ways to do it.


Thank you :purple_heart:
