Serde zero allocation and generic lifetimes

Hi there, I'm looking for some help on transitioning over a Serde implementation to use zero allocation. I'm new to Rust and have been researching as much as possible but any advice in that regard would be welcome as well!

I have a small CSV parser where I read in a file, do some basic deserialization with Serde and then write back to a buffer to return some data. This is built using a combination of the CSV crate + Serde.

I have a basic working implementation right now that seems to use up extra memory. One of the files is around 80MB in size and by the end of the process uses around 300MB. I might see even larger files in the future so was interested in a zero allocation technique to try and make things a little more efficient.

I've tried following the CSV crate's documentation on this technique but have hit a snag when it comes to generics. One piece to this puzzle is that the CSV parser is built to handle different types of CSV files, so depending on the type of file that is uploaded, I pass a different type of struct to deserialize with. The main problem I'm running into is incorporating a lifetime in the appropriate fashion.

I've added a scaled down example of my code. I think I've got the lifetime implemented correctly on the RecordA struct as well as the read_and_write function signature. Right now I'm seeing two compiler complaints.

  1. Line 37: cannot borrow raw_record as mutable because it is also borrowed as immutable

  2. Line 38: raw_record does not live long enough

I think what's happening is that the raw_record.deserialize call is consuming the raw_record variable and it's going out of scope at that point, but I'm not sure how to work around this. I've tried dereferencing and using .copy(), etc... on raw_record but these don't seem to be correct. I've spun my wheels on this for quite a while and haven't been able to figure it out.

Lifetimes are still pretty new to me also and I get the concepts behind them but I'm not even sure if I've gotten everything implemented correctly to use them with Serde. In particular, I'm having some trouble understanding the differences between serde::Deserialize and serde::DeserializeOwned trait bounds. (I'm new and can only add two links to my post but this part of the Serde docs is what I'm referring to: serde.rs/lifetimes.html#trait-bounds). Previously, I was using DeserializeOwned to handle the deserialization, but when introducing zero allocation I believe I need to switch to Deserialize so that I can just use references to the original read_buffer. Is this assumption correct?

Anyways, thanks for reading. Any advice would be appreciated! :+1:


I think you're being overly restrictive with your lifetimes.

First, you don't need any connection between lifetimes on bytes, transformed and 'de:

pub fn read_and_write<'de, W: Write + ?Sized, T>(
    bytes: &[u8],
    transformed: &mut W,
) -> Result<(), Box<dyn std::error::Error>> 
where
  T: Deserialize<'de> + Serialize

But here comes another problem: the lifetime 'de is now "external" to this function (the caller chooses it), but you'd actually need it to be "internal", because you're deserializing from raw_record, which is a local variable.

The best way to say it is to use a higher-rank trait bound (the for<'a> syntax), which basically says "this type is deserializable using any lifetime":

pub fn read_and_write<W: Write + ?Sized, T>(
    bytes: &[u8],
    transformed: &mut W,
) -> Result<(), Box<dyn std::error::Error>> 
where
  T: for<'de> Deserialize<'de> + Serialize

(note that read_and_write now doesn't contain any explicit lifetime params).
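To make the bound concrete, here is a minimal sketch that uses only the standard library. The `Parse<'de>` trait, `OwnedRecord` and `parse_lines` are hypothetical stand-ins (for serde's `Deserialize<'de>`, a record struct, and the function above), not code from the original post. Serde's `DeserializeOwned` is defined as exactly this kind of bound, `for<'de> Deserialize<'de>`.

```rust
/// Hypothetical stand-in for serde's `Deserialize<'de>`: builds a value from
/// input that lives for `'de`.
trait Parse<'de>: Sized {
    fn parse(input: &'de str) -> Option<Self>;
}

/// Owns its data, so it can be parsed from input of any lifetime.
#[derive(Debug, PartialEq)]
struct OwnedRecord {
    name: String,
    amount: u32,
}

impl<'de> Parse<'de> for OwnedRecord {
    fn parse(input: &'de str) -> Option<Self> {
        let (name, amount) = input.split_once(',')?;
        let amount = amount.trim().parse::<u32>().ok()?;
        Some(OwnedRecord { name: name.to_string(), amount })
    }
}

/// `for<'de>` means "for every lifetime", including the lifetime of the
/// local `raw_record` below, which the caller has no way to name.
fn parse_lines<T>(text: &str) -> Vec<T>
where
    T: for<'de> Parse<'de>,
{
    let mut parsed = Vec::new();
    for line in text.lines() {
        // Local buffer, playing the role of `raw_record`.
        let raw_record = line.trim().to_string();
        if let Some(value) = T::parse(raw_record.as_str()) {
            parsed.push(value);
        }
    }
    parsed
}
```

Calling `parse_lines::<OwnedRecord>("ann,3\nbob,4\n")` compiles because `OwnedRecord` implements `Parse<'de>` for every `'de`. A struct with `&'a str` fields implements the trait for one lifetime only, so it would not satisfy this bound.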

Unfortunately, that doesn't work, because what you'd need is actually:

  for<'de> T<'de>: Deserialize<'de> + Serialize

(because RecordA<'a> implements Deserialize<'a>). This syntax is not currently supported by Rust, though. You can work around it by making your function more specific: the Playground version works using only RecordA. I'm not sure how to make this function generic so that it supports both RecordA and RecordB :frowning:.

Ah I see, well thanks! Your suggestions definitely help to simplify some things so that's awesome :grinning:. Reading up on the higher-rank trait bounds, thanks for the link! That's too bad about doing this generically haha. I actually did have a previous implementation where I ran specific functions for each struct and moved to the generic approach in an attempt to reduce some code duplication. In any case, thanks for your reply! Much appreciated
