Dealing with local lifetimes while deserializing

I wrote some code to parse CSV files, and now I'd like to add an abstraction so that the CSV-reading code isn't duplicated all over my crate. However, this is causing issues:

use csv::{ReaderBuilder, StringRecord};
use serde::Deserialize;

fn process<'a>(read: &mut Chunk<'a>) {
    read_csv::<Raw>(read, HEADER)
}

pub fn read_csv<'a, 'de, T>(chunk: &mut Chunk<'a>, header: &[&str])
where
    T: Deserialize<'de>,
{
    let mut reader = ReaderBuilder::new().from_reader(chunk.0);
    let header = StringRecord::from(header);
    let mut buf = StringRecord::new();
    loop {
        let _ = reader.read_record(&mut buf).unwrap();
        let _ = buf.deserialize::<T>(Some(&header));
    }
}

#[derive(Debug, Deserialize)]
struct Raw<'a> {
    name: &'a str,
}

pub struct Chunk<'a>(&'a [u8]);

const HEADER: &[&str] = &["foo", "bar"];

(Playground.)

Which results in these errors:

error[E0502]: cannot borrow `buf` as mutable because it is also borrowed as immutable
  --> src/lib.rs:16:36
   |
8  | pub fn read_csv<'a, 'de, T>(chunk: &mut Chunk<'a>, header: &[&str])
   |                     --- lifetime `'de` defined here
...
16 |         let _ = reader.read_record(&mut buf).unwrap();
   |                                    ^^^^^^^^ mutable borrow occurs here
17 |         let _ = buf.deserialize::<T>(Some(&header));
   |                 -----------------------------------
   |                 |
   |                 immutable borrow occurs here
   |                 argument requires that `buf` is borrowed for `'de`

error[E0597]: `buf` does not live long enough
  --> src/lib.rs:17:17
   |
8  | pub fn read_csv<'a, 'de, T>(chunk: &mut Chunk<'a>, header: &[&str])
   |                     --- lifetime `'de` defined here
...
17 |         let _ = buf.deserialize::<T>(Some(&header));
   |                 ^^^--------------------------------
   |                 |
   |                 borrowed value does not live long enough
   |                 argument requires that `buf` is borrowed for `'de`
18 |     }
19 | }
   | - `buf` dropped here while still borrowed

error[E0597]: `header` does not live long enough
  --> src/lib.rs:17:43
   |
8  | pub fn read_csv<'a, 'de, T>(chunk: &mut Chunk<'a>, header: &[&str])
   |                     --- lifetime `'de` defined here
...
17 |         let _ = buf.deserialize::<T>(Some(&header));
   |                 --------------------------^^^^^^^--
   |                 |                         |
   |                 |                         borrowed value does not live long enough
   |                 argument requires that `header` is borrowed for `'de`
18 |     }
19 | }
   | - `header` dropped here while still borrowed

I find this mightily confusing because (a) it doesn't seem like the mutable borrow and the immutable borrow of buf have overlapping liveness, (b) the deserialized Raw value doesn't outlive buf, and (c) it seems like buf and header have the same lifetime.

How can I convince rustc that this code is okay? I already tried to change T to be for<'de> Deserialize<'de>, but that gets me into trouble at the callsite (that is, in process()).

You cannot use the type Raw in this context, because the data read from the reader is destroyed once you return from read_csv, at which point you can no longer hold a &str reference to it.

Either take the chunk as a slice:

pub fn read_csv<'de, T>(chunk: &'de [u8], header: &[&str])
where
    T: Deserialize<'de>,
{
    // ... body as before, reading from `chunk` ...
}

Or use an owned type:

#[derive(Debug, Deserialize)]
struct Raw {
    name: String,
}

Ah, actually, I misunderstood. You can't do this, because you want T to be Raw<'a> for the specific lifetime 'a inside your loop, and every choice of lifetime yields a different Raw type.

You're more or less asking for a GAT (generic associated type), except on a generic parameter rather than an associated type.

Your options are still the two in my post above.
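For reference, the GAT-shaped formulation alluded to above can be sketched with only the standard library. Everything here is a hypothetical stand-in: FromRow plays the role of serde's Deserialize<'de>, RowFamily is an invented "family" trait whose associated type picks a fresh lifetime per row, and read_rows splits lines naively instead of parsing real CSV. Whether the same shape can be wired up to the csv and serde crates is an assumption that this sketch does not verify. It needs generic associated types, which are stable since Rust 1.65.

```rust
use std::fmt::Debug;

/// Hypothetical stand-in for `Deserialize<'de>`: builds a value that
/// borrows from a single row.
trait FromRow<'r>: Sized {
    fn from_row(row: &'r str) -> Self;
}

/// Hypothetical "family" trait: one row type per lifetime `'r`.
trait RowFamily {
    type Row<'r>: FromRow<'r> + Debug;
}

#[derive(Debug)]
struct Raw<'r> {
    name: &'r str,
}

impl<'r> FromRow<'r> for Raw<'r> {
    fn from_row(row: &'r str) -> Self {
        // Naive: take the first comma-separated field, no quoting rules.
        Raw { name: row.split(',').next().unwrap_or("") }
    }
}

/// Marker type that names the whole `Raw<'_>` family.
struct RawFamily;

impl RowFamily for RawFamily {
    type Row<'r> = Raw<'r>;
}

/// Generic over the family, not over one fixed `Raw<'a>`, so each
/// iteration can borrow `buf` for a short, local lifetime.
fn read_rows<F: RowFamily>(input: &str) -> Vec<String> {
    let mut buf = String::new();
    let mut out = Vec::new();
    for line in input.lines() {
        buf.clear();
        buf.push_str(line);
        // The row borrows `buf` only until the end of this iteration.
        let row: F::Row<'_> = FromRow::from_row(buf.as_str());
        out.push(format!("{:?}", row));
    }
    out
}
```

The caller writes read_rows::<RawFamily>(...) instead of read_csv::<Raw>(...), which is the price of this pattern: each row type needs a small marker type alongside it.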

Oops, I oversimplified the example. In my actual code, the input type is more or less slice-like. Updated the post to demonstrate that this doesn't (by itself) solve the problem. I've tried different ways of bounding the lifetime of the chunk, but haven't figured out a solution yet.

(And I don't want to allocate a String for every row for performance reasons.)

You still have the same problem. To deserialize into a Raw<'a>, you must borrow buf for the lifetime 'a. But 'a is a generic parameter, so it outlives the entire function body, whereas buf is overwritten in each iteration of the loop. That means references into buf are not valid for the full lifetime 'a.

To do this with T being generic, the string slice in Raw must point into the chunk, and not into buf.
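A minimal sketch of that idea, using only the standard library: the rows borrow straight from the chunk's bytes, so the generic lifetime 'a is the chunk's own lifetime and no per-iteration buffer is involved. FromRow is a hypothetical stand-in for Deserialize<'de>, read_chunk is an invented name, and the line splitting is naive (no quoting or escaping). It says nothing about whether the csv crate offers a way to deserialize directly from the input like this.

```rust
/// Hypothetical stand-in for `Deserialize<'de>`.
trait FromRow<'de>: Sized {
    fn from_row(row: &'de str) -> Self;
}

pub struct Chunk<'a>(pub &'a [u8]);

struct Raw<'a> {
    name: &'a str,
}

impl<'a> FromRow<'a> for Raw<'a> {
    fn from_row(row: &'a str) -> Self {
        // Naive: first comma-separated field, no quoting rules.
        Raw { name: row.split(',').next().unwrap_or("") }
    }
}

/// `T` may be generic here because every row slice lives as long as the
/// chunk itself (`'a`), not as long as a buffer local to this function.
fn read_chunk<'a, T: FromRow<'a>>(chunk: &Chunk<'a>) -> Vec<T> {
    let text: &'a str = std::str::from_utf8(chunk.0).expect("chunk is valid UTF-8");
    text.lines().map(T::from_row).collect()
}
```

Because the returned values borrow from the chunk, they can even be collected and handed back to the caller, which is impossible when they point into buf.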

Note that the issue goes away if T is not generic, because then it is possible to refer to Raw<'a> for the right lifetime 'a.


I want to use this with multiple implementations of Raw, so T has to be generic. Do you think this variant can be fixed?

use csv::{ReaderBuilder, StringRecord};
use serde::Deserialize;

fn process<'a>(read: &mut Chunk<'a>) {
    read_csv::<Raw>(read, HEADER)
}

pub fn read_csv<'a, 'de, T>(chunk: &mut Chunk<'a>, header: &[&str])
where
    T: Deserialize<'de>,
{
    let mut reader = ReaderBuilder::new().from_reader(chunk.0);
    let header = StringRecord::from(header);
    let mut buf = StringRecord::new();
    loop {
        let _ = reader.read_record(&mut buf).unwrap();
        read_row::<T>(&buf, &header)
    }
}

pub fn read_row<'a, 'de: 'a, T>(buf: &'a StringRecord, header: &'a StringRecord)
where
    T: Deserialize<'de>,
{
    let _ = buf.deserialize::<T>(Some(&header));
}

#[derive(Debug, Deserialize)]
struct Raw<'a> {
    name: &'a str,
}

pub struct Chunk<'a>(&'a [u8]);

const HEADER: &[&str] = &["foo", "bar"];

I don't see any obvious way. Perhaps a macro?
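A rough sketch of what the macro route could look like, again with only the standard library. The macro stamps out one non-generic reader per row type, so each generated function can name Raw<'_> with a lifetime local to the loop. All names (define_reader, read_raw, read_pair, Pair, from_row) are invented for illustration, the parsing is naive line and comma splitting, and with the real csv and serde crates the body would have to call buf.deserialize instead, which is untested here.

```rust
pub struct Raw<'a> {
    pub name: &'a str,
}

impl<'a> Raw<'a> {
    fn from_row(row: &'a str) -> Self {
        Raw { name: row.split(',').next().unwrap_or("") }
    }
}

pub struct Pair<'a> {
    pub key: &'a str,
    pub value: &'a str,
}

impl<'a> Pair<'a> {
    fn from_row(row: &'a str) -> Self {
        // Fall back to an empty value when the row has no comma.
        let (key, value) = row.split_once(',').unwrap_or((row, ""));
        Pair { key, value }
    }
}

/// Generates a non-generic reader for one concrete row type.
macro_rules! define_reader {
    ($fn_name:ident, $row:ident) => {
        pub fn $fn_name(input: &str, mut visit: impl FnMut($row<'_>)) {
            let mut buf = String::new();
            for line in input.lines() {
                buf.clear();
                buf.push_str(line);
                // The row type is concrete here, so its lifetime can be
                // the short borrow of `buf` within this iteration.
                let row: $row<'_> = $row::from_row(buf.as_str());
                visit(row);
            }
        }
    };
}

define_reader!(read_raw, Raw);
define_reader!(read_pair, Pair);
```

The shared loop lives in one place (the macro), while the cost is one define_reader! line per row type.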