How can I move things with dependent lifetimes?


#1

I am sorry if the title sounds confusing; I don't know how to phrase it better.

We are trying to use the csv crate to read CSV files. Due to our framework's design, we have to expose the functions init() and tick(), where tick() should read and handle one value from the csv iterator. However, I do not yet understand how to deal with lifetimes in this case. Here is a simplified example:

extern crate csv;
extern crate rustc_serialize;

use csv::DecodedRecords;
use std::fs;

fn main() {
    let mut handle = CSVHandle { reader: None, iterator: None};
    handle.init();
    handle.tick();
}

struct CSVHandle<'a> {
    pub reader: Option<Box<csv::Reader<fs::File>>>,
    pub iterator: Option<Box<csv::DecodedRecords<'a, fs::File, Point>>>
}

impl<'a> CSVHandle<'a> {
    fn init(&mut self) {
        self.reader = Some(Box::new(csv::Reader::from_file("./src/Sample.csv").unwrap()));
    }

    fn tick(&mut self) {
        //read one element from iterator and do stuff
    }
}

#[derive(RustcDecodable, Clone, Debug)]
pub struct Point {
    pub x: i32,
    pub y: i32,
    pub z: i32
}

If I understand lifetimes correctly (please correct me if I do not), init() should create a Reader on the heap (since it is in a Box) and move the Box into self. The Box and the Reader should have the same lifetime, and the compiler knows it can drop them when it drops the CSVHandle, because we transferred ownership to it. What is their lifetime now, actually? A newly created one? Since I could assign a new value to reader, which should cause the old one to be dropped, I presume it cannot be the lifetime of the CSVHandle.

Now we need an iterator over the values, which I can get with:
pub fn decode<'a, D: Decodable>(&'a mut self) -> DecodedRecords<'a, R, D>
I guess the definition makes sense: the DecodedRecords must not outlive the Reader it belongs to. But how can I save or move ownership of a DecodedRecords?

If I change init() to this:

        let mut r = Box::new(csv::Reader::from_file("./src/Sample.csv").unwrap());
        let mut i = Box::new(r.decode::<Point>());

the compiler does not complain, because r and i live in the same scope and all is well.
But we have to save i, and I cannot just write self.iterator = Some(i); because r would be dropped when it goes out of scope at the end of init() ("error: *r does not live long enough").

So I tried to save both:

        let mut r = Box::new(csv::Reader::from_file("./src/Sample.csv").unwrap());
        self.reader = Some(r);
        let mut i = Box::new(r.decode::<Point>());
        self.iterator = Some(i);

but this way I would use a moved value. Saving the reader after decoding is not possible either: decode borrows the reader, so I cannot move it in the same scope afterwards.

So I tried playing with scopes:

        let mut r = Box::new(csv::Reader::from_file("./src/Sample.csv").unwrap());
        {
            let i = Box::new(r.decode::<Point>());
            self.iterator = Some(i);
        }
        self.reader = Some(r);

This way I can let decode borrow the reader and still save the reader afterwards, but self.iterator = Some(i); is not allowed because the compiler does not know that r will actually live long enough. It thinks that r will be dropped at the end of init(), and thus reports an error.

TL;DR: I have two things, A and B, where B needs A for as long as B is alive, and I want to move ownership of both into a struct. How can I do that?


#2

You can’t.

That said, there's a gazillion Stack Overflow questions about storing a value and a reference to it in the same struct. The canonical answer is this one.


#3

I am in the process of rewriting the csv crate (almost done) and this particular problem will be solved by offering two versions of the iterator: one that borrows the reader like today and one that moves ownership of the reader into the iterator. Using the latter would make this problem go away.
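
To illustrate why an iterator that owns its reader makes the problem go away, here is a minimal sketch using only the standard library. It is an analogy, not the csv crate's API: BufRead::lines() takes the reader by value, so the resulting Lines iterator can be stored in a struct that has no lifetime parameter. The CsvHandle type, the in-memory Cursor input, and the hand-written comma parsing are assumptions made for the example.

```rust
use std::io::{BufRead, BufReader, Cursor, Lines};

/// Stand-in for the `Point` record from the question.
#[derive(Debug, Clone, PartialEq)]
pub struct Point {
    pub x: i32,
    pub y: i32,
    pub z: i32,
}

/// Hypothetical handle. It has no lifetime parameter because the iterator
/// owns the reader instead of borrowing it.
pub struct CsvHandle {
    lines: Option<Lines<BufReader<Cursor<String>>>>,
}

impl CsvHandle {
    pub fn new() -> Self {
        CsvHandle { lines: None }
    }

    /// Assumption: the data comes from an in-memory string rather than a file.
    pub fn init(&mut self, data: &str) {
        let reader = BufReader::new(Cursor::new(data.to_string()));
        // `lines()` consumes `reader`, so reader and iterator now move as one value.
        self.lines = Some(reader.lines());
    }

    /// Reads one record. Returns `None` before `init()`, at the end of the
    /// input, or when a line cannot be parsed as three integers.
    pub fn tick(&mut self) -> Option<Point> {
        let line = self.lines.as_mut()?.next()?.ok()?;
        let mut fields = line.split(',').map(|f| f.trim().parse::<i32>());
        let x = fields.next()?.ok()?;
        let y = fields.next()?.ok()?;
        let z = fields.next()?.ok()?;
        Some(Point { x, y, z })
    }
}
```

Because nothing in CsvHandle refers to a sibling field, the whole struct can be moved freely after init() has been called.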


#4

Is B needing A to be alive an implicit reference? Or does this happen because the iterator maintains a reference to the reader somewhere in its implementation?

@BurntSushi a 15-minute response time, and I did not even contact you. Wow! Can you estimate when it will be finished?


#5

Iterators generally maintain a reference to the underlying source data; the short of it is that you can’t move the source because the address would change, and invalidate the reference held in the iterator.
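
A small sketch of that rule, using a Vec and its borrowing iterator as stand-ins for the reader and DecodedRecords (the function name first_item is made up for the example). The commented-out line is the one the compiler would reject while the iterator is still in use.

```rust
/// Shows that a value cannot be moved while an iterator still borrows it.
pub fn first_item() -> Option<i32> {
    let source = vec![1, 2, 3];
    let mut iter = source.iter(); // `iter` holds a reference into `source`

    // Uncommenting the next line fails with error E0505
    // ("cannot move out of `source` because it is borrowed"),
    // because `iter` is used again below:
    // let moved = source;

    let first = iter.next().copied();

    // `iter` is not used after this point, so the borrow has ended
    // and moving the source is allowed again.
    let moved = source;
    assert_eq!(moved.len(), 3);
    first
}
```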


#6

I try not to estimate my free time. Sorry. I can tell you what’s left to do though:

  1. Write (more) documentation.
  2. Write examples.
  3. Move to serde 1.0.

#7

Is moving the source first an option? I do not necessarily want to move two dependent structs; I just want them to end up in the struct somehow.


#8

Try the rental crate.

When you write struct CSVHandle<'a>, the struct has to be valid for any lifetime the user of this struct wants. This is certainly not true in your case, because the lifetime has to be tied to the reader field somehow. The rental crate has a macro that allows you to do something like (the struct doesn’t have any parameters!):

struct CSVHandle {
    pub reader: Box<csv::Reader<fs::File>>,
    pub iterator: csv::DecodedRecords<'reader, fs::File, Point>
}

It works by forbidding direct access to the fields and letting you access the struct only inside a closure. That way, you can't take the iterator field out and separate it from reader.


#9

You can only move the source if there are no outstanding references (that's a general Rust requirement). However, if you moved the source into your struct and then tried to store a reference to that value in the same struct as a field, you'd get another compiler error about lifetimes. The problem is similar: moving the parent struct, which owns the value (source), would invalidate the reference it stores to that value. As @oli_obk mentioned, this is a fairly well-known issue in Rust: self-referencing structs and sibling references don't play well with the language as-is. That's why there are external crates (e.g. rental, owning_ref) that try to facilitate this, and why some people end up using Rc as a "mediator" of ownership (i.e. it becomes a pseudo-owner, so to speak, and other components take refcounted borrows).
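
A rough sketch of the Rc "mediator" idea, under the assumption that you control the iterator type. This is not possible with the csv crate's DecodedRecords as it stands, so Source, SharedIter, and Handle below are all hypothetical types. The iterator keeps a refcounted handle to the source instead of a plain reference, which means the containing struct needs no lifetime parameter and can be moved.

```rust
use std::cell::RefCell;
use std::rc::Rc;

/// Hypothetical source that hands out its values one at a time.
pub struct Source {
    data: Vec<i32>,
    pos: usize,
}

impl Source {
    fn next_value(&mut self) -> Option<i32> {
        let value = self.data.get(self.pos).copied();
        if value.is_some() {
            self.pos += 1;
        }
        value
    }
}

/// Iterator that shares ownership of the source instead of borrowing it.
pub struct SharedIter {
    source: Rc<RefCell<Source>>,
}

impl Iterator for SharedIter {
    type Item = i32;

    fn next(&mut self) -> Option<i32> {
        // The mutable borrow is checked at runtime and lasts only for this call.
        self.source.borrow_mut().next_value()
    }
}

/// Holds both the source and its iterator, with no lifetime parameter.
pub struct Handle {
    pub source: Rc<RefCell<Source>>,
    pub iterator: SharedIter,
}

impl Handle {
    pub fn new(data: Vec<i32>) -> Self {
        let source = Rc::new(RefCell::new(Source { data, pos: 0 }));
        let iterator = SharedIter { source: Rc::clone(&source) };
        Handle { source, iterator }
    }

    pub fn tick(&mut self) -> Option<i32> {
        self.iterator.next()
    }
}
```

The trade-off is that borrow checking moves from compile time to runtime: calling borrow_mut() while another borrow of the same RefCell is active panics instead of failing to compile.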