Can I fix this "borrowed value does not live long enough" error with lifetime parameters?


#1

I defined the following structs and methods:

struct DataVar<'a> {
    name: &'a str,
    val: &'a [f64],
}

impl<'a> DataVar<'a> {
    fn new(name: &'a str, val: &'a [f64]) -> Self {
        Self { name, val }
    }
}

struct DataSet<'a> {
    name: &'a str,
    datavars: &'a [&'a DataVar<'a>],
}

impl<'a> DataSet<'a> {
    pub fn new(name: &'a str, datavars: &'a [&DataVar]) -> Self {
        Self { name, datavars }
    }
}

struct DataGroup<'a> {
    name: &'a str,
    datasets: &'a [&'a DataSet<'a>],
}

impl<'a> DataGroup<'a> {
    fn new(name: &'a str, datasets: &'a [&DataSet]) -> Self {
        Self { name, datasets }
    }
}

Now the following code works:

let x = DataVar {
    name: "X",
    val: &[1., 2., 3.],
};
let dataset = DataSet {
    name: "dataset",
    datavars: &[&x, &x, &x],
};
let dataplot = DataGroup {
    name: "datagroup",
    datasets: &[&dataset, &dataset],
};

while the following does not work (I get the error “borrowed value does not live long enough” when defining dataplot):

let x = DataVar::new("X", &[1., 2., 3.]);
let dataset = DataSet::new("dataset", &[&x, &x, &x]);
let dataplot = DataGroup::new("datagroup", &[&dataset, &dataset]);

Can this error be fixed with lifetime parameters, or am I forced to define intermediate variables like let y = &[&x, &x, &x]? I would like all references to live as long as x, but I cannot find out how to specify that with lifetime parameters.


#2

The error message I see is

error[E0716]: temporary value dropped while borrowed
  --> src/main.rs:36:40
   |
36 | let dataset = DataSet::new("dataset", &[&x, &x, &x]);
   |                                        ^^^^^^^^^^^^ - temporary value is freed at the end of this statement
   |                                        |
   |                                        creates a temporary which is freed while still in use
37 | let dataplot = DataGroup::new("datagroup", &[&dataset, &dataset]);
   |                                              -------- borrow later used here
   |
   = note: consider using a `let` binding to create a longer lived value

This can’t be fixed by adding lifetimes. Fixing it would require changing the compiler so that these temporaries are not dropped at the end of the statement. (To be honest, I’ve often felt that the compiler drops temporaries too early in certain cases, and the precise circumstances in which it does so are difficult to understand.)
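
For reference, the struct-literal version in #1 compiles because of temporary lifetime extension: a borrow written directly inside the initializer of a let (including inside a braced struct expression) is kept alive until the end of the enclosing block, while a borrow passed as a function argument is not. Below is a minimal sketch of the three cases, reusing the types from #1; total_points is a hypothetical helper added only so the example has something to print.

```rust
struct DataVar<'a> {
    name: &'a str,
    val: &'a [f64],
}

struct DataSet<'a> {
    name: &'a str,
    datavars: &'a [&'a DataVar<'a>],
}

impl<'a> DataSet<'a> {
    fn new(name: &'a str, datavars: &'a [&'a DataVar<'a>]) -> Self {
        Self { name, datavars }
    }
}

// Hypothetical helper: sums the lengths of all referenced series.
fn total_points(ds: &DataSet) -> usize {
    ds.datavars.iter().map(|v| v.val.len()).sum()
}

fn main() {
    // `&[1., 2., 3.]` is a constant expression, so it is promoted to a
    // 'static value and is fine even as a function argument.
    let x = DataVar { name: "X", val: &[1., 2., 3.] };

    // (1) Struct literal as the `let` initializer: the array temporary is
    //     extended to the end of this block.
    let literal = DataSet { name: "literal", datavars: &[&x, &x, &x] };

    // (2) Same array as a call argument: the temporary is dropped at the end
    //     of the statement, so using `call` afterwards is error E0716.
    // let call = DataSet::new("call", &[&x, &x, &x]);

    // (3) Explicit binding: the array now lives as long as `vars`.
    let vars = [&x, &x, &x];
    let call = DataSet::new("call", &vars);

    println!("{} ({}): {} points", literal.name, x.name, total_points(&literal));
    println!("{} ({}): {} points", call.name, x.name, total_points(&call));
}
```

Case (2) is left commented out because it is exactly the line the compiler rejects in the error message above.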


#3

Lifetimes don’t do anything, and can’t change program behavior/generated code. They only describe what the program is doing anyway.

And in this case the program is dropping a temporary value early.

It wouldn’t be a problem if you used owned values in the structs, since then your code could keep the values for as long as it needs, instead of being attached to temporary borrows from the calling environment.


#4

You can fix this by binding the arrays to intermediate variables.

let x = DataVar::new("X", &[1., 2., 3.]);
let dataset = &[&x, &x, &x];
let dataset = DataSet::new("dataset", dataset);
let dataplot = &[&dataset, &dataset];
let dataplot = DataGroup::new("datagroup", dataplot);

#5

Yes, I understand that I can avoid this error by using either owned values or intermediate variables, but I hoped there was a better solution. I am implementing a plotting library (which I hope to publish on crates.io one day), and I think those two solutions have the following shortcomings:

  1. Owned values imply allocations. Since I am using those structs just to organize data before passing them to the lib, and the passed data could be in general extensive, I would like to avoid copying all values to the heap.

  2. Using intermediate variables means that all users of the plotting lib would have to write 5 lines of code for each plot instead of 3… however, if you confirm that there is no better solution to this error (including implementing the structs and/or constructors differently), then I will need to follow this path.
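
One possible way to keep the call site at three lines without owned collections is to hide the intermediate binding inside a macro. This is only a sketch: with_slice! is a hypothetical macro invented here, not something from an existing crate. It relies on the fact that a macro_rules macro used as a statement can expand to several let statements in the caller's block, and that the hygienic storage binding stays alive there even though the caller cannot name it.

```rust
struct DataVar<'a> {
    name: &'a str,
    val: &'a [f64],
}

impl<'a> DataVar<'a> {
    fn new(name: &'a str, val: &'a [f64]) -> Self {
        Self { name, val }
    }
}

struct DataSet<'a> {
    name: &'a str,
    datavars: &'a [&'a DataVar<'a>],
}

impl<'a> DataSet<'a> {
    fn new(name: &'a str, datavars: &'a [&'a DataVar<'a>]) -> Self {
        Self { name, datavars }
    }
}

struct DataGroup<'a> {
    name: &'a str,
    datasets: &'a [&'a DataSet<'a>],
}

impl<'a> DataGroup<'a> {
    fn new(name: &'a str, datasets: &'a [&'a DataSet<'a>]) -> Self {
        Self { name, datasets }
    }
}

// Hypothetical helper macro: binds the array to a hidden local in the
// caller's block, then passes a borrow of it to the given constructor.
macro_rules! with_slice {
    ($var:ident = $ctor:path, $name:expr, [$($item:expr),* $(,)?]) => {
        let storage = [$($item),*];
        let $var = $ctor($name, &storage);
    };
}

fn main() {
    let x = DataVar::new("X", &[1., 2., 3.]);
    with_slice!(dataset = DataSet::new, "dataset", [&x, &x, &x]);
    with_slice!(dataplot = DataGroup::new, "datagroup", [&dataset, &dataset]);

    println!(
        "{} -> {} -> {} ({} values)",
        dataplot.name,
        dataplot.datasets[0].name,
        dataplot.datasets[0].datavars[0].name,
        dataplot.datasets[0].datavars[0].val.len()
    );
}
```

The trade-off is that the macro syntax is no longer plain Rust, so users have to learn it; whether that is better than two extra let lines is a judgment call.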

I agree with @ExpHP: I also feel that the compiler often drops temporaries too early (the code above is just one case I hit recently, which I reported because I hoped there was a better solution). Maybe references created as a function argument could be assigned to the parent’s scope? (after all, they could be considered as created and passed there). Or otherwise maybe there could be a way to instruct the compiler to extend the life of a reference to the parent’s scope, something like a 'super lifetime parameter, or a super() method, which would allow for instance to write:

let dataset = DataSet::new("dataset", &'super [&x, &x, &x]);

or:

let dataset = DataSet::new("dataset", &[&x, &x, &x].super());

If there is any agreement on this, should I open a discussion on Rust Internals?

Thanks


#6

The lifetime of temporaries (and the ensuing “too early drop”) is a known issue - e.g. see this comment and the linked document there.

AIUI, the core issue seems to be around the place where the temp would end up being dropped, and that being somewhat invisible/non-explicit in code; this can have ramifications for unsafe code, for example. Given the fix is to insert an explicit let binding, I suspect this hasn’t been considered too big of a problem - it’s a bit of an ergonomic hit in some cases, but there’s an argument to be made that being explicit with such things is desirable.
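
To make the drop point visible, here is a small experiment, not taken from the linked document. Noisy is a hypothetical type whose Drop impl appends its label to a shared log, so the order of events can be inspected afterwards.

```rust
use std::cell::RefCell;

// Hypothetical type: records its label in `log` when it is dropped.
struct Noisy<'a> {
    label: &'static str,
    log: &'a RefCell<Vec<&'static str>>,
}

impl Drop for Noisy<'_> {
    fn drop(&mut self) {
        self.log.borrow_mut().push(self.label);
    }
}

fn label_len(n: &Noisy) -> usize {
    n.label.len()
}

// Returns the sequence of events in the order they happened.
fn drop_order() -> Vec<&'static str> {
    let log = RefCell::new(Vec::new());
    {
        let bound = Noisy { label: "bound", log: &log };

        // The `Noisy` created inside the call is a temporary: it is dropped
        // as soon as this statement ends.
        let _len = label_len(&Noisy { label: "temporary", log: &log });
        log.borrow_mut().push("next statement");

        let _ = label_len(&bound);
        log.borrow_mut().push("last statement");
        // `bound` is dropped here, at the end of the block.
    }
    log.into_inner()
}

fn main() {
    // Prints: temporary, next statement, last statement, bound
    println!("{}", drop_order().join(", "));
}
```

With a let binding the drop happens at a closing brace the reader can see; with an extended temporary there would be no name in the source marking what is dropped there, which seems to be the concern described above.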


#7

Owned values are not on the heap. Ownership and allocation are separate concepts. Owned values can exist on the stack and be temporary too.

References can’t exist without a corresponding owned value. They’re not a way to avoid allocations, but to cheaply share access to the existing allocations.

It makes sense to borrow [f64] and strings, but the rest are types from your library, so users will have to create them specifically for your library anyway.
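
As an illustration of "owned but not on the heap", here is a sketch in which DataSet owns its DataVars in a fixed-size array. The const generic parameter N and the Copy derive are assumptions made for this example, not part of the original structs. The raw f64 data is still only borrowed, so nothing is copied except the small DataVar handles.

```rust
#[derive(Clone, Copy, Debug)]
struct DataVar<'a> {
    name: &'a str,
    val: &'a [f64],
}

// Owns its DataVars by value in an array: no Vec, no heap allocation.
#[derive(Debug)]
struct DataSet<'a, const N: usize> {
    name: &'a str,
    datavars: [DataVar<'a>; N],
}

impl<'a, const N: usize> DataSet<'a, N> {
    fn new(name: &'a str, datavars: [DataVar<'a>; N]) -> Self {
        Self { name, datavars }
    }
}

fn main() {
    // The samples stay where they are; DataVar only borrows them.
    let samples = [1.0, 2.0, 3.0];
    let x = DataVar { name: "X", val: &samples };

    // The array is moved into the struct, so there is no temporary borrow
    // that could be dropped too early.
    let dataset = DataSet::new("dataset", [x, x, x]);

    println!(
        "{} holds {} vars named {} in {} bytes on the stack",
        dataset.name,
        dataset.datavars.len(),
        dataset.datavars[0].name,
        std::mem::size_of_val(&dataset)
    );
}
```

The cost of this variant is that N becomes part of the type, so datasets with different numbers of variables are different types.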


#8

Interesting reading! So now that we have NLL, I’m really looking forward to “Better Temporary Lifetimes”, as this is one of the issues I hit most often, and I think it is an actual obstacle for new programmers approaching Rust.


#9

Thanks for clarifying. If I understand correctly, you suggest changing my code as follows:

#[derive(Debug, Default)]
struct DataVar<'a> {
    name: &'a str,
    val: &'a [f64],
}

impl<'a> DataVar<'a> {
    fn new(name: &'a str, val: &'a [f64]) -> Self {
        Self { name, val }
    }
}

#[derive(Debug, Default)]
struct DataSet<'a> {
    name: &'a str,
    datavars: Vec<&'a DataVar<'a>>,
}

impl<'a> DataSet<'a> {
    pub fn new(name: &'a str, datavars: Vec<&'a DataVar>) -> Self {
        Self { name, datavars }
    }
}

#[derive(Debug, Default)]
struct DataPlot<'a> {
    name: &'a str,
    datasets: Vec<&'a DataSet<'a>>,
}

impl<'a> DataPlot<'a> {
    fn new(name: &'a str, datasets: Vec<&'a DataSet>) -> Self {
        Self { name, datasets }
    }
}

Now the following works:

let x = DataVar::new("X", &[1., 2., 3.]);
let dataset = DataSet::new("dataset", vec![&x, &x, &x]);
let dataplot = DataPlot::new("dataplot", vec![&dataset, &dataset]);

Can you please confirm?

Thanks


#10

Not quite. Vec<&Foo> usually still requires the caller to have another Vec or array of Foo somewhere to borrow from.

Also, on modern architectures references are relatively expensive: they may reduce cache locality, indirect access is costly when the CPU can’t predict/speculate it, and code that goes through them doesn’t get autovectorized.

Don’t use a Vec of references to such tiny objects unless you have to use polymorphism. You can even make DataVar a Copy type, because this type itself is nothing more than a couple of references. Use the faster, more efficient Vec<DataVar> instead.

If you expect users to use one dataset multiple times, then Vec<&DataSet> might be OK (since cloning a DataSet would duplicate the heap data of the Vec inside it). But if users would typically use each dataset once, then Vec<DataSet> is fine too, and it saves a layer of indirection.
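
Put together, the suggestion could look roughly like the sketch below. The derives and the choice of Vec<&DataSet> for DataPlot are one reading of the advice above, not the only possible layout.

```rust
// Small handle made of two references, so it is cheap to copy.
#[derive(Clone, Copy, Debug)]
struct DataVar<'a> {
    name: &'a str,
    val: &'a [f64],
}

impl<'a> DataVar<'a> {
    fn new(name: &'a str, val: &'a [f64]) -> Self {
        Self { name, val }
    }
}

// Owns its DataVars by value: no extra array for the caller to keep alive.
#[derive(Clone, Debug)]
struct DataSet<'a> {
    name: &'a str,
    datavars: Vec<DataVar<'a>>,
}

impl<'a> DataSet<'a> {
    fn new(name: &'a str, datavars: Vec<DataVar<'a>>) -> Self {
        Self { name, datavars }
    }
}

// Borrows whole datasets so one dataset can appear in several plots.
#[derive(Debug)]
struct DataPlot<'a> {
    name: &'a str,
    datasets: Vec<&'a DataSet<'a>>,
}

impl<'a> DataPlot<'a> {
    fn new(name: &'a str, datasets: Vec<&'a DataSet<'a>>) -> Self {
        Self { name, datasets }
    }
}

fn main() {
    let x = DataVar::new("X", &[1., 2., 3.]);
    // `x` is Copy, so it can be listed several times and reused afterwards.
    let dataset = DataSet::new("dataset", vec![x, x, x]);
    let dataplot = DataPlot::new("dataplot", vec![&dataset, &dataset]);

    println!(
        "{}: {} datasets, first is {} with {} vars, {} has {} values",
        dataplot.name,
        dataplot.datasets.len(),
        dataplot.datasets[0].name,
        dataplot.datasets[0].datavars.len(),
        x.name,
        x.val.len()
    );
}
```

The Vecs are passed by value, so no borrowed temporary is involved; the only borrows are of the named variables x's data and dataset.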


#11

This is the data model that I envision for the plotting library I’m implementing:

  • Users first define several DataVars, each one referencing a series of data.
  • Then they define one or more DataSets, which are collections of DataVars, with the requirement that all DataVars in a DataSet have the same length.
  • Finally they define a DataPlot, which contains at least one DataSet, or more DataSets for instance when not all DataVars have the same length.
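
The same-length requirement on a DataSet could be checked at construction time. The sketch below assumes an owned Vec<DataVar> inside DataSet; try_new and LengthMismatch are hypothetical names chosen for the example.

```rust
#[derive(Clone, Copy, Debug)]
struct DataVar<'a> {
    name: &'a str,
    val: &'a [f64],
}

#[derive(Debug)]
struct DataSet<'a> {
    name: &'a str,
    datavars: Vec<DataVar<'a>>,
}

// Hypothetical error type: reports the first length that disagrees.
#[derive(Debug, PartialEq)]
struct LengthMismatch {
    expected: usize,
    found: usize,
}

impl<'a> DataSet<'a> {
    // Hypothetical constructor: rejects DataVars of differing lengths.
    fn try_new(name: &'a str, datavars: Vec<DataVar<'a>>) -> Result<Self, LengthMismatch> {
        if let Some(first) = datavars.first() {
            let expected = first.val.len();
            if let Some(bad) = datavars.iter().find(|v| v.val.len() != expected) {
                return Err(LengthMismatch { expected, found: bad.val.len() });
            }
        }
        Ok(Self { name, datavars })
    }
}

fn main() {
    let time = DataVar { name: "t", val: &[0., 1., 2.] };
    let pos = DataVar { name: "x", val: &[0., 0.5, 2.] };
    let short = DataVar { name: "y", val: &[1., 2.] };

    match DataSet::try_new("ok", vec![time, pos]) {
        Ok(ds) => println!("{}: {} vars, first is {}", ds.name, ds.datavars.len(), ds.datavars[0].name),
        Err(e) => println!("rejected: {e:?}"),
    }
    match DataSet::try_new("bad", vec![time, short]) {
        Ok(ds) => println!("{}: {} vars", ds.name, ds.datavars.len()),
        Err(e) => println!("rejected: expected {}, found {}", e.expected, e.found),
    }
}
```

DataVars of different lengths would then go into separate DataSets of the same DataPlot, as described in the last bullet.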

To create several plots, a user has to define a DataPlot for each plot, and in doing so I expect they would typically reuse several DataVars (for instance the time series), and maybe also some DataSets.

I used references to avoid asking the user to clone DataVars when reusing them. Now that you know the use case scenario, do you still think it is acceptable/advisable to use Vec<DataVar> and Vec<DataSet>? In this case, do you think I should make both DataVar and DataSet Copy types, or only DataVar, or neither one and ask the user to clone them when reusing?

Thanks


#12

DataVar can be Copy, because it only contains shared references. I’d make it a Copy type, because it’s small and Copy is convenient.

DataSet can’t be Copy because of the Vec in it. Cloning a DataSet will clone the Vec’s content. It could be made cheaper to clone by storing the data as Arc<Vec<>>.

If DataSet is supposed to be reusable (same instance in multiple places) then you could take it by reference.
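
A rough sketch of the Arc<Vec<>> variant, assuming DataSet::new still takes a plain Vec and wraps it internally: cloning the DataSet then only increments a reference count instead of copying the list of DataVars.

```rust
use std::sync::Arc;

#[derive(Clone, Copy, Debug)]
struct DataVar<'a> {
    name: &'a str,
    val: &'a [f64],
}

// Clone is cheap: it copies `name` and bumps the Arc's reference count.
#[derive(Clone, Debug)]
struct DataSet<'a> {
    name: &'a str,
    datavars: Arc<Vec<DataVar<'a>>>,
}

impl<'a> DataSet<'a> {
    fn new(name: &'a str, datavars: Vec<DataVar<'a>>) -> Self {
        Self { name, datavars: Arc::new(datavars) }
    }
}

fn main() {
    let x = DataVar { name: "X", val: &[1., 2., 3.] };
    let dataset = DataSet::new("dataset", vec![x, x, x]);

    // The clone shares the same Vec allocation with the original.
    let reused = dataset.clone();

    println!("{} / {}", dataset.name, reused.name);
    println!("first var: {} ({} values)", reused.datavars[0].name, reused.datavars[0].val.len());
    println!("same allocation: {}", Arc::ptr_eq(&dataset.datavars, &reused.datavars)); // true
    println!("owners: {}", Arc::strong_count(&dataset.datavars)); // 2
}
```

Taking &DataSet avoids even the reference count, at the price of the caller having to keep the DataSet alive in a named variable.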