[SOLVED] Lifetime issue using Cow::into_owned()


#1

I’m trying to convert a byte slice (nul-terminated UTF-8) into a proper String. I couldn’t figure out how to do this with CString without using unsafe (which shouldn’t be necessary), so I wrote my own:

let device_name_bytes: [libc::c_uchar; VK_MAX_PHYSICAL_DEVICE_NAME_SIZE] = properties.deviceName;
let device_name_len = device_name_bytes.iter().position(|&c| c == 0).unwrap_or(device_name_bytes.len());
let device_name = String::from_utf8_lossy(&device_name_bytes[0..device_name_len]).into_owned();

This works flawlessly, but it won’t let me decouple the lifetime of the source (properties.deviceName) from the result (device_name). Cow::into_owned() implicitly copies the underlying slice, but passes on the lifetime of the source (if I got this right). I don’t want to create another unnecessary clone.

Note:
device_name shall be part of the return value of the surrounding function.
properties.deviceName is a field of a temporary structure which shall not survive the surrounding function.


#2

I’m not sure what you mean here. Cow::into_owned returns an owned value (in this case a String) that does not have any lifetime parameters.

I don’t want to create another unnecessary clone.

Your code currently copies the bytes twice: in the assignment on line 1, and in into_owned on line 3. You could eliminate the first copy by calling .iter() directly on properties.deviceName, or by making a reference to it instead of a copy:

    let device_name_bytes: &[u8] = &properties.deviceName[..];
    let device_name_len = device_name_bytes.iter().position(|&c| c == 0).unwrap_or(device_name_bytes.len());
    let device_name = String::from_utf8_lossy(&device_name_bytes[0..device_name_len]).into_owned();
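
To make the ownership point concrete, here is a minimal sketch. The buffer size NAME_SIZE and the helper names (name_from_bytes, name_from_cstr, device_name) are made up for illustration and only stand in for the Vulkan types above. The second helper uses CStr::from_bytes_until_nul, which needs Rust 1.69 or later.

```rust
use std::ffi::CStr;

// Hypothetical stand-in for the fixed-size name buffer; the size is illustrative.
const NAME_SIZE: usize = 16;

/// Same approach as in the post: find the first nul (or take the whole
/// buffer), decode lossily, and copy into an owned String.
fn name_from_bytes(bytes: &[u8]) -> String {
    let len = bytes.iter().position(|&c| c == 0).unwrap_or(bytes.len());
    String::from_utf8_lossy(&bytes[..len]).into_owned()
}

/// Alternative without manual scanning and without unsafe.
/// Returns None if the buffer contains no nul byte at all.
fn name_from_cstr(bytes: &[u8]) -> Option<String> {
    let c_str = CStr::from_bytes_until_nul(bytes).ok()?;
    Some(c_str.to_string_lossy().into_owned())
}

fn device_name() -> String {
    // Temporary buffer that does not survive this function.
    let mut buffer = [0u8; NAME_SIZE];
    buffer[..4].copy_from_slice(b"GPU0");
    // Only a reference is passed; the returned String owns its own bytes.
    name_from_bytes(&buffer)
} // `buffer` is dropped here, the String lives on in the caller.

fn main() {
    let name = device_name();
    println!("{}", name);
    println!("{:?}", name_from_cstr(b"abc\0def"));
}
```

The returned String carries no lifetime parameter, so nothing ties it to the buffer it was decoded from.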

#3

Whoops! That wasn’t my intention. Thanks for pointing that out.

Ok, it seems my knowledge about lifetimes is even worse than I hoped :blush:
I thought there would be some kind of lifetime inference.

I reworked my code into a complete example that shows the problem I would like to solve:

pub struct MyStruct<'a> {
	pub result: &'a str,
}

impl<'a> MyStruct<'a> {
	
	pub fn new() -> Self {
		
		let survivor = String::new();

		MyStruct {
			result: &survivor,
		}
	}

}

fn main() {
}

results in:

<anon>:12:13: 12:21 error: `survivor` does not live long enough
<anon>:12 			result: &survivor,
          			         ^~~~~~~~
<anon>:7:23: 14:3 note: reference must be valid for the lifetime 'a as defined on the block at 7:22...
<anon>: 7 	pub fn new() -> Self {

My thinking is that I somehow have to tell the compiler that survivor shall have a lifetime that equals that of the return value. But I don’t know how to annotate this.


#4

From the point of view of new, the lifetime 'a is an input “parameter”, not something the function can influence. For each possible value of 'a, the function promises to return a MyStruct<'a>. Since it doesn’t have any actual parameters to borrow from, you can only put a 'static borrow there, for example a string literal:

pub fn new() -> Self {
    MyStruct { result: "hello" }
}

The following function takes an input that is valid for 'b, so it can also return a struct that is valid for 'b (I’ve changed 'a to 'b to make it clearer that 'a from impl<'a> is not relevant here):

fn new<'b>(input: &'b str) -> MyStruct<'b> {
    MyStruct { result: input }
}
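
Putting both variants together, a minimal sketch could look like this. The free function default_struct and the variable names in main are made up for illustration; the struct is the one from the question.

```rust
pub struct MyStruct<'a> {
    pub result: &'a str,
}

impl<'a> MyStruct<'a> {
    // The caller supplies the borrow, so the caller decides what 'a is.
    pub fn new(input: &'a str) -> Self {
        MyStruct { result: input }
    }
}

// No input to borrow from, so only a 'static borrow fits here.
pub fn default_struct() -> MyStruct<'static> {
    MyStruct { result: "hello" }
}

fn main() {
    // `owner` lives in main, so a struct borrowing from it is valid
    // for as long as `owner` stays alive and unmodified.
    let owner = String::from("computed at runtime");
    let borrowed = MyStruct::new(&owner);
    println!("{}", borrowed.result);
    println!("{}", default_struct().result);
}
```

In both cases the string data is owned by something that outlives the struct: the caller's variable in the first case, the program binary in the second.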

#5

For me it always helps to remember the following point when I have problems with lifetimes:
Lifetimes can only restrict what you can do; they never let you do things that you could not do without them.

survivor lives on the stack of the function new(). The lifetime of survivor ends at the end of the function. You cannot automagically extend that lifetime at will.

The lifetime annotation of MyStruct<'a> only says the following:
The data referenced by result must live at least as long as the MyStruct instance itself.

In a language like C++, where references/pointers have no associated lifetimes, the code would compile, but crash (or worse) at runtime. Rust lifetimes prevent such mistakes, but they only tell you so, they don’t solve the problem for you.

If you want result to be coupled to MyStruct, just use a String instead of &str:

pub struct MyStruct {
    pub result: String,
}
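
With that change, the new() from the question compiles once survivor is moved into the struct instead of borrowed. A minimal sketch (the string content is just a placeholder):

```rust
pub struct MyStruct {
    pub result: String,
}

impl MyStruct {
    pub fn new() -> Self {
        let survivor = String::from("built inside new");
        // `survivor` is moved into the struct, so it leaves the function
        // together with the return value instead of being dropped.
        MyStruct { result: survivor }
    }
}

fn main() {
    let s = MyStruct::new();
    println!("{}", s.result);
}
```

No lifetime parameter is needed any more, because the struct no longer borrows anything.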

#6

Thanks for the explanation.

So any heap-allocated resource I want to include in my result has to be passed as an argument?

My result contains a nested struct containing multiple Strings, whose content will be computed in new. Do I have to pass a mutable argument for each of those Strings, even if they shall be immutable for the rest of their lifetime? So I would have to change the method signature as soon as I change some of the (possibly nested) result struct?


#7

Why do you want them to be heap allocated?
String also uses heap allocation (an empty String allocates nothing until you put something into it) but hides it from you.

If you explicitly want a heap allocated string but owned by MyStruct then use Box<String>.

Keep in mind that String::new() does not heap-allocate the String itself but returns it by value. In Rust, “objects” are often passed by value if ownership is needed. This is actually not expensive, because they are moved, not copied, by default.
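
A small sketch of what moving means in practice. The helper move_and_compare is made up for illustration: it records the address of the heap buffer before and after a move, and the two addresses are the same because only the String's small header (pointer, length, capacity) is copied.

```rust
/// Returns the heap buffer address before and after moving a String.
fn move_and_compare() -> (*const u8, *const u8) {
    let original = String::from("hello");
    let before = original.as_ptr();
    let moved = original; // a move, not a clone: the heap data stays put
    let after = moved.as_ptr();
    (before, after)
}

fn main() {
    let (before, after) = move_and_compare();
    println!("same buffer: {}", before == after);
    // An empty String has not allocated any buffer yet.
    println!("capacity of String::new(): {}", String::new().capacity());
}
```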


#8

Someone has to own that data. If new computes a string which hasn’t existed yet (as opposed to just taking a piece of a provided string), either MyStruct has to own it or new has to take a mutable reference to a container where it can put that string.
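
Both options, sketched under the assumption of a small nested result. All type and field names here (Details, Owned, Borrowed, vendor, model) are hypothetical.

```rust
// Option 1: the result owns everything, including the nested Strings.
pub struct Details {
    pub vendor: String,
    pub model: String,
}

pub struct Owned {
    pub details: Details,
}

impl Owned {
    pub fn new() -> Self {
        // All strings are computed here and moved into the result,
        // so the signature does not change when fields are added.
        Owned {
            details: Details {
                vendor: String::from("vendor-1"),
                model: String::from("model-2"),
            },
        }
    }
}

// Option 2: the caller owns a container; `new` fills it and the
// result only borrows from it.
pub struct Borrowed<'a> {
    pub result: &'a str,
}

impl<'a> Borrowed<'a> {
    pub fn new(storage: &'a mut String) -> Self {
        storage.push_str("computed");
        Borrowed { result: storage.as_str() }
    }
}

fn main() {
    let owned = Owned::new();
    println!("{} {}", owned.details.vendor, owned.details.model);

    let mut storage = String::new();
    let borrowed = Borrowed::new(&mut storage);
    println!("{}", borrowed.result);
}
```

With option 2, storage stays borrowed for as long as the Borrowed value is in use, which is usually more restrictive than simply letting the result own its strings.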


#9

Ah, I thought I could somehow teach Rust to place survivor in the heap by using lifetimes. I used to use String only for strings that can be changed later.

I know that someone has to own the data, but I thought that I could choose the result as the owner.

That’s a big difference to what I’m used to in other languages.

I think most of my misunderstanding can be boiled down to the difference in how &str and String are handled by Rust regarding lifetime and memory.

Using String as the type within the result struct was the solution I should have used. Thank you both for your help - I’m one more problem closer to enlightenment :wink:


#10

In earlier versions of Rust, references (&) were called “borrowed pointers”. I like that name because it makes it very clear that they never assume ownership.