Guidance on Custom Lifetimes and Lifetime Function Parameters

I've been wrestling with a code issue for a while now and could use some assistance. I keep encountering an error either on the variable d or the data part of my code.

fn process_messages(
    processors: Processors<'static>,
    data: &'static str,
    tx_to_send: Sender<String>,
) {
    let processors = processors.clone();
    let parsed_data = parse_ws_packet(data);
    for d in parsed_data {
        let d = parse_each_packet(d);
        for processor in &processors {
            tokio::spawn({
                let d: Packet<'static> = d.clone();
                let tx_to_send = tx_to_send.clone();
                let processor = *processor;
                async move {
                    processor(&d, tx_to_send).await;
                }
            });
        }
    }
}

I keep getting this error on either d or data:

error[E0597]: `d` does not live long enough
   --> src/quote/session.rs:364:31
    |
360 |                 let d: Packet<'static> = d.clone();
    |                     - binding `d` declared here
361 |                 let tx_to_send = tx_to_send.clone();
362 |                 let processor = *processor;
    |                     --------- lifetime `'1` appears in the type of `processor`
363 |                 async move {
364 |                     processor(&d, tx_to_send).await;
    |                     ----------^^-------------
    |                     |         |
    |                     |         borrowed value does not live long enough
    |                     argument requires that `d` is borrowed for `'1`
365 |                 }
    |                 - `d` dropped here while still borrowed

I have tried .clone(), .as_str(), and Arc, all to no avail.
You can see the whole context on GitHub.

To fix the data issue I had to use let txt: &'static str = Box::leak(text.into_boxed_str());. Is this a safe and acceptable way to handle it? I'm eager to learn and appreciate any help or insights you can provide.

Any help would be much appreciated

Your code has:

pub type MessageProcessor<'a> = fn(&'a Packet<'a>, mpsc::Sender<String>) -> BoxFuture<'a, ()>;

This says it can process messages given a reference with lifetime 'a. Then in the signature of process_messages you specify that lifetime is 'static. No value that is going to be deallocated can ever be borrowed for 'static.

Generally when writing a signature for a function you want it to accept an independent lifetime for each call, not the same one for all calls to it:

pub type MessageProcessor = for<'a> fn(&'a Packet<'a>, mpsc::Sender<String>) -> BoxFuture<'a, ()>;

Notice the type declaration now doesn't have a lifetime, and Processors won't need one either. This is a good sign — most named types in a Rust program generally won't have lifetimes, when lifetimes are being used well.
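
To make the difference concrete, here is a minimal std-only sketch. The one-field Packet, the synchronous usize return (instead of BoxFuture), and the names method_len and run_all are assumptions made for illustration, not the code from the repository. The point is only that a for<'a> function pointer accepts a borrow of a short-lived local:

```rust
// Simplified stand-in for the thread's `Packet<'a>` (assumption: one field only).
#[derive(Debug, Clone)]
pub struct Packet<'a> {
    pub m: &'a str,
}

// Higher-ranked function pointer: every call may pick its own `'a`.
// Synchronous and returning `usize` only to keep the sketch free of external crates.
pub type MessageProcessor = for<'a> fn(&'a Packet<'a>) -> usize;

// Hypothetical processor: reports the length of the packet's `m` field.
pub fn method_len<'a>(packet: &'a Packet<'a>) -> usize {
    packet.m.len()
}

pub fn run_all(processors: &[MessageProcessor], text: String) -> Vec<usize> {
    // `text` is dropped at the end of this function, so this borrow is
    // definitely not `'static`, yet every processor accepts it.
    let packet = Packet { m: text.as_str() };
    processors.iter().map(|p| p(&packet)).collect()
}
```

With let p: MessageProcessor = method_len;, calling run_all(&[p], String::from("quote")) yields a one-element vector containing 5. If the alias carried a lifetime parameter that callers pinned to 'static, the same borrow of text would be rejected.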

Thanks so much :blush:. It fixed it!

I was just wondering if there was a better way to do this line?

let txt: &'static str = Box::leak(text.into_boxed_str());

String::leak stabilized with the current Rust version (1.72).

It's preferable not to leak if there's an alternative, though.[1]


  1. I haven't dug into your code enough to have an opinion.
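
For reference, a small sketch of the two spellings side by side. The function names are made up for this example. Both produce a &'static str, and in both cases the allocation is never freed:

```rust
// Older spelling: convert to `Box<str>` first, then leak the box.
pub fn leak_with_box(text: String) -> &'static str {
    Box::leak(text.into_boxed_str())
}

// `String::leak` (stable since Rust 1.72) does the same in one call.
// It returns `&mut str`, which coerces to `&str` here.
pub fn leak_with_string(text: String) -> &'static str {
    text.leak()
}
```

Either way the memory lives until the process exits, so calling this once per incoming message grows memory use without bound.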

I'll try and better explain what I am trying to fix.

I have found that most of my functions which accept &str have to be annotated with &'static str. Is this the best way to do it, or does it negate the whole purpose of str?

For example:

fn process_messages(processors: &Processors, data: &'static str, tx_to_send: &Sender<String>) {
// -- inner code --
}

This is why I had to use

let txt: &'static str = text.leak();

to coerce a String into a &'static str.

Any insight would be appreciated.

Generally you should not need to leak. If something is asking for a &'static str, that could mean that:

  • you are using &strs (of any lifetime) where you should use String instead — that's likely the case if you are dealing with threads,
  • you used 'static where the function should have a lifetime parameter instead, or
  • the function you're calling really wants something like a compile-time constant.

It's not possible to say what the right choice is without context; it depends on exactly where the requirement is arising. But in your case, process_messages probably should not require 'static input, since the thing it is doing next is parsing that data:

fn process_messages(processors: &Processors, data: &str, tx_to_send: &Sender<String>) {
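
A minimal sketch of why plain &str is enough when the data is only parsed and used during the call. The '|' delimiter, the reduced signature, and handle_text are assumptions for illustration; the real parse_ws_packet is surely different:

```rust
// Hypothetical stand-in for the thread's parser: splits on '|' without copying.
fn parse_ws_packet(data: &str) -> Vec<&str> {
    data.split('|').filter(|s| !s.is_empty()).collect()
}

// No `'static` here: the elided lifetime is chosen fresh for every call.
pub fn process_messages(data: &str) -> Vec<usize> {
    parse_ws_packet(data).into_iter().map(str::len).collect()
}

pub fn handle_text(text: String) -> Vec<usize> {
    // A plain borrow of a local `String` is enough; nothing is leaked.
    process_messages(&text)
}
```

Here handle_text(String::from("ab|cde||f")) returns the lengths 2, 3 and 1. The 'static requirement only comes back once the borrowed pieces have to outlive the call, for example when they are moved into a spawned task.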

It's probably a sign you're trying to use borrow-holding structs in a context that doesn't allow them, like a non-scoped thread. Let me just ping-pong around your code and see what I find...

pub struct Session<'a> {
    // ...
    processors: Vec<MessageProcessor<'a>>,
}
type Processors<'a> = Vec<MessageProcessor<'a>>;
pub type MessageProcessor<'a> = fn(&'a Packet<'a>, mpsc::Sender<String>) -> BoxFuture<'a, ()>;

Like @kpreid said, get rid of the lifetime on MessageProcessor, then you can get rid of it on Processors and Session, that will help.


fn process_messages(
    processors: Processors<'static>,
    data: &'static str,
    tx_to_send: Sender<String>,
) {
    let processors = processors.clone();
    let parsed_data = parse_ws_packet(data);
    for d in parsed_data {
        let d = parse_each_packet(d);
        for processor in &processors {
            tokio::spawn({

The spawning is presumably what requires 'static here. And let's see...

pub fn parse_each_packet(packet: &str) -> Packet {

First, turn on #![deny(elided_lifetimes_in_paths)] so you're not hiding the fact you're borrowing stuff.

pub fn parse_each_packet(packet: &str) -> Packet<'_> {

So you're borrowing from somewhere in this thread (&str, Packet<'_>) and then trying to send those borrows elsewhere.
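
For illustration, a small sketch of what the lint asks for. The one-field Packet is a stand-in (an assumption, not the real type), and the lint is applied with an outer attribute on a single function rather than crate-wide so the snippet can be pasted anywhere:

```rust
// Simplified stand-in for the thread's `Packet<'a>` (assumption: one field only).
pub struct Packet<'a> {
    pub m: &'a str,
}

// With the lint denied, writing `-> Packet` without `<'_>` is rejected,
// so the signature has to show that the result borrows from `packet`.
#[deny(elided_lifetimes_in_paths)]
pub fn parse_each_packet(packet: &str) -> Packet<'_> {
    Packet { m: packet.trim() }
}
```

The '_ changes nothing about how the code behaves. It just makes the borrow visible at the call site, which is where the trouble with spawning starts.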

You should make an owned (no lifetime) version of Packet<'_> and send that instead. Or maybe refactor so that you spawn something that captures data as a String, and parse on the other side.
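
A rough sketch of the second option, with std::thread::spawn swapped in for tokio::spawn so it needs no external crates (both require 'static captures). The one-field Packet, the trimming parser, and spawn_and_parse are assumptions for illustration:

```rust
use std::thread;

// Simplified stand-in for the thread's `Packet<'a>` (assumption: one field only).
pub struct Packet<'a> {
    pub m: &'a str,
}

// Hypothetical zero-copy parser: the result borrows from `packet`.
pub fn parse_each_packet(packet: &str) -> Packet<'_> {
    Packet { m: packet.trim() }
}

pub fn spawn_and_parse(data: &str) -> thread::JoinHandle<usize> {
    // Copy the text once so the spawned closure owns everything it needs.
    let owned: String = data.to_owned();
    thread::spawn(move || {
        // The borrow now starts and ends inside the thread that owns `owned`.
        let packet = parse_each_packet(&owned);
        packet.m.len()
    })
}
```

Joining the handle from spawn_and_parse(" ping ") gives 4. The borrowed Packet<'_> never crosses the thread boundary, so no 'static annotation and no leak is needed.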


processor(&d, tx_to_send).await;
// `d` not used again

Is this all you ever do with a MessageProcessor? If so, maybe you should be taking an owned Packet.

Thank you for all your help.

I refactored my code to take &str instead of String, as I heard this was the better option since I wasn't mutating or otherwise manipulating the string... but this may have been a mistake. I can always change it back to take String.

I have pushed my changes to GitHub, so you can see the removed lifetimes on Session.

I have turned that lint on and can see my code light up with many errors, thanks for the advice.

This is the enum variant of Packet that leads to it having a lifetime:

pub struct WSPacket<'a> {
    pub m: &'a str,
    pub p: ArrayData<'a>,
}

Should I change it from &'a strs to Strings, and will it significantly impact performance?

Well... it will have an impact. But so does leaking everything.[1] The impact is probably perfectly acceptable. I'd try it and see.

Zero-copy parsing can be great when you only need the parsed form for as long as the source data can be borrowed, but if you need it longer than that (e.g. with your current approach), it generally doesn't work.
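
A minimal sketch of what "try it and see" could look like. Vec<&str> stands in for ArrayData<'a>, and OwnedWSPacket, the comma-splitting, and analyse_on_thread are hypothetical names invented for this example. std::thread::spawn is used in place of tokio::spawn to keep it std-only:

```rust
use std::thread;

// Borrowed form, as in the thread. `Vec<&str>` stands in for `ArrayData<'a>`.
pub struct WSPacket<'a> {
    pub m: &'a str,
    pub p: Vec<&'a str>,
}

// Hypothetical owned counterpart: no lifetime, so it can cross threads freely.
#[derive(Debug, Clone, PartialEq)]
pub struct OwnedWSPacket {
    pub m: String,
    pub p: Vec<String>,
}

impl From<&WSPacket<'_>> for OwnedWSPacket {
    fn from(packet: &WSPacket<'_>) -> Self {
        OwnedWSPacket {
            m: packet.m.to_owned(),
            p: packet.p.iter().map(|s| s.to_string()).collect(),
        }
    }
}

pub fn analyse_on_thread(data: &str) -> usize {
    // Zero-copy parse while `data` is still borrowed...
    let borrowed = WSPacket { m: data, p: data.split(',').collect() };
    // ...then pay for the copies once, just before leaving this thread.
    let owned = OwnedWSPacket::from(&borrowed);
    thread::spawn(move || owned.p.iter().map(String::len).sum::<usize>())
        .join()
        .expect("worker thread panicked")
}
```

For example, analyse_on_thread("a,bc") returns 3. The cost is one allocation per field, and unlike leaking it is freed when the worker is done with the packet.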

(Sometimes it can work decently in combination with leaking if you have some scenario like "load this unchanging resource at the start, parse it once, and run the rest of my program based on the parsed form". Packet<'_> doesn't sound like that to me but I could be wrong.)


  1. You're still allocating in order to leak, just in larger chunks which you never free. And... you're leaking. Memory use will just continue to grow as you process things.

I designed Packet to be passed to every processor after it has been parsed, as they will mostly just analyse the packet and not really manipulate it.

I'll probably change data to String.

If I want to process the incoming data only once and pass it as a Packet into each processor, each of which runs on its own spawned thread, should I borrow the Packet or pass it as owned by cloning?

For a threaded pipeline, ideally, you neither borrow nor clone the packet but move it to the thread, then move it on to the next stage. Channels are handy for this.
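
As a rough sketch of that shape, here is a two-stage pipeline where each value is moved, never cloned. It uses std::sync::mpsc and std::thread instead of tokio so that it is self-contained; OwnedPacket, the trimming "parser", and run_pipeline are assumptions for illustration:

```rust
use std::sync::mpsc;
use std::thread;

// Hypothetical owned packet: no borrowed fields, so it can be sent anywhere.
#[derive(Debug, PartialEq)]
pub struct OwnedPacket {
    pub m: String,
}

pub fn run_pipeline(inputs: Vec<String>) -> Vec<OwnedPacket> {
    let (raw_tx, raw_rx) = mpsc::channel::<String>();
    let (parsed_tx, parsed_rx) = mpsc::channel::<OwnedPacket>();

    // Stage 1: each String is moved in, each OwnedPacket is moved out.
    let parser = thread::spawn(move || {
        for raw in raw_rx {
            let packet = OwnedPacket { m: raw.trim().to_owned() };
            if parsed_tx.send(packet).is_err() {
                break; // the next stage hung up
            }
        }
    });

    for raw in inputs {
        raw_tx.send(raw).expect("parser stage hung up");
    }
    drop(raw_tx); // closing the channel lets stage 1 finish

    // Stage 2 (here simply the caller): receive until stage 1 is done.
    let out: Vec<OwnedPacket> = parsed_rx.iter().collect();
    parser.join().expect("parser thread panicked");
    out
}
```

Each packet has exactly one owner at any time, so there is nothing to borrow and nothing to clone. With tokio::sync::mpsc the structure is the same, with async send and recv in place of the blocking calls.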

Whether or not you clone it, it will have to be an owned type — no &strs in it, only Strings. You can only combine borrowed data and threads if the threads are scoped — the borrowed data outlives the threads.
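
For completeness, a sketch of the scoped case using std::thread::scope (stable since Rust 1.63). The one-field Packet and analyse_scoped are made up for this example. Tasks started with tokio::spawn are not scoped, so this applies to OS threads only:

```rust
use std::thread;

// Simplified stand-in for the thread's `Packet<'a>` (assumption: one field only).
pub struct Packet<'a> {
    pub m: &'a str,
}

pub fn analyse_scoped(data: &str, workers: usize) -> Vec<usize> {
    let packet = Packet { m: data.trim() };
    // Every thread spawned on `s` is joined before `scope` returns,
    // so the workers are allowed to borrow `packet`.
    thread::scope(|s| {
        let mut handles = Vec::new();
        for i in 0..workers {
            let packet = &packet;
            handles.push(s.spawn(move || packet.m.len() + i));
        }
        handles
            .into_iter()
            .map(|h| h.join().expect("worker panicked"))
            .collect::<Vec<usize>>()
    })
}
```

Here analyse_scoped(" ping ", 3) returns 4, 5 and 6. The borrowed Packet<'_> is shared by reference across all workers, which is exactly what an unscoped spawn forbids.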

Sorry for all the questions, but what channel would you recommend?

You are already using tokio::sync::mpsc and I see no reason to suggest something else.

How would it work in practise? As mpsc is single consumer and I would like to spawn multiple processes.

If you want to process things in parallel, then never mind the channels. If you are concerned about the cost of cloning the data for each task spawned, then store the packet struct in an Arc. (But don't assume that that will be significantly faster, either. If performance matters, write benchmarks and test many versions.)
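
A minimal sketch of the Arc variant, again with std::thread standing in for tokio::spawn. OwnedPacket, the Processor alias, method_len, and fan_out are assumptions for illustration, and whether this beats cloning is something only a benchmark can tell:

```rust
use std::sync::Arc;
use std::thread;

// Hypothetical owned packet: no borrowed fields.
#[derive(Debug)]
pub struct OwnedPacket {
    pub m: String,
}

// Processors only read the packet, so a shared reference is enough.
pub type Processor = fn(&OwnedPacket) -> usize;

// Hypothetical processor: reports the length of the packet's `m` field.
pub fn method_len(packet: &OwnedPacket) -> usize {
    packet.m.len()
}

pub fn fan_out(packet: OwnedPacket, processors: &[Processor]) -> Vec<usize> {
    let shared = Arc::new(packet);
    let handles: Vec<_> = processors
        .iter()
        .map(|&processor| {
            // Cloning the Arc bumps a reference count; the packet is not copied.
            let packet = Arc::clone(&shared);
            thread::spawn(move || processor(&packet))
        })
        .collect();
    handles
        .into_iter()
        .map(|h| h.join().expect("processor panicked"))
        .collect()
}
```

The packet is parsed once, and it is freed when the last worker drops its Arc.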
