Generic parameter bound by `Deserialize<'de>`. Is this pattern possible in Rust?

I am refactoring a simple Rust project. Some structures in the project implement a run method, which all share the same behaviour:

  1. receive a WebSocket message.
  2. deserialize the message to a Rust structure via serde_json::from_str (the structure implements Deserialize<'de>, not DeserializeOwned, for performance reasons).
  3. pass the deserialized structure to a callback function.

Here is a simplified example to demonstrate the implementation currently:

use serde::Deserialize;

fn message() -> String {
    r#"{"field1":"value1"}"#.to_string()
}

#[derive(Deserialize, Debug)]
pub struct Info<'a> {
    pub field1: &'a str,
}

fn run<Callback>(mut callback: Callback)
where
    Callback: FnMut(Info),
{
    loop {
        let content = message();
        match serde_json::from_str::<Info>(&content) {
            Ok(res) => callback(res),
            Err(e) => eprintln!("Error: {}", e),
        }
    }
}

fn main() {
    run(|info: Info| {
        println!("info: {:?}", info);
    });
}

This works. When I run the example, it prints: info: Info { field1: "value1" }

Then I tried to make the run function more generic. My goal is to let serde_json::from_str deserialize the message into any structure, not just Info, so I introduced a generic parameter Res, and the run function became:

fn run<'de, Res: Deserialize<'de>, Callback>(mut callback: Callback)
where
    Callback: FnMut(Res),
{
    loop {
        let content = message();
        match serde_json::from_str::<Res>(&content) {
            Ok(res) => callback(res),
            Err(e) => eprintln!("Error: {}", e),
        }
    }
}

But when I build the example, the compiler complains:

error[E0597]: `content` does not live long enough
  --> examples/demo2.rs:18:43
   |
12 | fn run<'de, Res: Deserialize<'de>, Callback>(mut callback: Callback)
   |        --- lifetime `'de` defined here
...
18 |         match serde_json::from_str::<Res>(&content) {
   |               ----------------------------^^^^^^^^-
   |               |                           |
   |               |                           borrowed value does not live long enough
   |               argument requires that `content` is borrowed for `'de`
...
22 |     }
   |     - `content` dropped here while still borrowed

Apparently there is no lifetime issue at runtime, but because Res is bound to the lifetime 'de declared on the run function, Rust complains. (My understanding is that I have told the compiler that Res must live at least as long as the scope of the function run, when in fact it does not.)

Is there any way to work around this issue, or another pattern that achieves the same goal? Thanks for your help!

Here is the real project: yufuquant/rust-bybit: Rust API connector for Bybit's WebSockets APIs. (github.com)


You should not use references in structs. A type like Info<'a> is often a design mistake. Use struct Info {…} instead (not restricted to a temporary lifetime) with an owning string type like Box<str> or String.

This is because borrowed data isn't merely "by reference". It's a special feature that exists to give only temporary permission to view data that has already been stored somewhere else. In your case you get new data that has only been put into a temporary variable and is NOT held in the Info<'this_data_is_not_here> struct, so the struct is forever bound to the scope of the variable it borrows from. References are the total opposite of storing data: their entire purpose is to never own anything.

The second problem is that this reference is strictly read-only and can't change the data it points to. When the JSON string requires unescaping (like "\n" to an actual newline), it is literally impossible to deserialize it in a way that can be expressed as such a reference.

References in structs are very rarely the right choice, and you should avoid them.
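
As a rough sketch of the owned-type design described above: the block below uses a hand-written `Parse<'de>` trait as a stand-in for serde's `Deserialize<'de>` so that it builds without external crates. The `Parse` trait, the toy `field1=<value>` message format, and the `rounds` parameter (which replaces the endless loop) are assumptions made for illustration and are not part of the original project. With serde itself, the corresponding bound would be `serde::de::DeserializeOwned`.

```rust
// Stand-in for serde's `Deserialize<'de>` (an assumption of this sketch).
trait Parse<'de>: Sized {
    fn parse(input: &'de str) -> Result<Self, String>;
}

// Owned variant of the struct: no lifetime parameter, the data is copied out.
#[derive(Debug, PartialEq)]
struct Info {
    field1: String,
}

// An owned type can be parsed from input of any lifetime.
impl<'de> Parse<'de> for Info {
    fn parse(input: &'de str) -> Result<Self, String> {
        // Toy wire format (hypothetical): "field1=<value>".
        input
            .strip_prefix("field1=")
            .map(|value| Info { field1: value.to_owned() })
            .ok_or_else(|| format!("unexpected message: {}", input))
    }
}

// Stand-in for the WebSocket receive step.
fn message() -> String {
    "field1=value1".to_string()
}

// `for<'de> Parse<'de>` plays the role of serde's `DeserializeOwned`.
fn run<Res, Callback>(rounds: usize, mut callback: Callback)
where
    Res: for<'de> Parse<'de>,
    Callback: FnMut(Res),
{
    for _ in 0..rounds {
        let content = message(); // local buffer, dropped every iteration
        match Res::parse(&content) {
            // `res` owns its data, so it does not borrow from `content`.
            Ok(res) => callback(res),
            Err(e) => eprintln!("Error: {}", e),
        }
    }
}

fn main() {
    run(1, |info: Info| println!("info: {:?}", info));
}
```

The cost is one allocation per string field per message; in exchange, `Res` is an ordinary type parameter and the callback bound needs no lifetime at all.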

If you must make it compile (and tolerate the spam of lifetime annotations, the paralyzing restrictions they cause, and the inability to deserialize some data because of the wrong string type), then change the Callback bound to for<'a> FnMut(Info<'a>). This is because your Info<'a> type is inseparable from the scope of the content variable, and using a temporary loan instead of a normal owning type requires you to uphold its restrictions precisely in every single use of that data, forever.


Talking about the errors / trait bounds / what is possible, and ignoring design considerations...

A lifetime parameter on a function means, "I can work with any lifetime that the caller chooses." All you know is that it lasts at least as long as your function body -- callers can't name any shorter lifetimes than that -- and any such lifetime is longer than you can borrow a local variable for.
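
One way to see what "the caller chooses the lifetime" implies is a variant where the caller also supplies the buffer. The sketch below is an illustration only: `Parse<'de>` is a hand-written stand-in for serde's `Deserialize<'de>`, and `run_once` and the `field1=<value>` format are hypothetical names introduced here. Because `content` is borrowed from the caller for `'de`, the function-level lifetime parameter is satisfiable, which it is not when the buffer is a local inside the function.

```rust
// Stand-in for serde's `Deserialize<'de>` (an assumption of this sketch).
trait Parse<'de>: Sized {
    fn parse(input: &'de str) -> Option<Self>;
}

#[derive(Debug, PartialEq)]
struct Info<'a> {
    field1: &'a str,
}

// A borrowing type only implements the trait for its own lifetime.
impl<'de> Parse<'de> for Info<'de> {
    fn parse(input: &'de str) -> Option<Self> {
        // Toy wire format (hypothetical): "field1=<value>".
        input.strip_prefix("field1=").map(|value| Info { field1: value })
    }
}

// `'de` is picked by the caller, so the caller must also provide the data
// that lives for `'de`; a local inside this function could never qualify.
fn run_once<'de, Res, Callback>(content: &'de str, mut callback: Callback)
where
    Res: Parse<'de>,
    Callback: FnMut(Res),
{
    if let Some(res) = Res::parse(content) {
        callback(res);
    }
}

fn main() {
    // The buffer is owned by the caller and outlives the call.
    let content = String::from("field1=value1");
    run_once(&content, |info: Info<'_>| println!("info: {:?}", info));
}
```

This only moves the receive loop to the caller, so it does not solve the original problem; it just shows which borrows a function-level lifetime can describe.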

For lifetimes shorter than the function body, you need higher-ranked trait bounds (HRTBs): for<'any> [bound...]. HRTBs mean something more like "the generic type that the caller supplies can work with any lifetime (that I, the function writer, choose)". Then you can work with lifetimes shorter than your function body (e.g. that borrow locals). As it so happens, you were already using one; these are all the same thing with varying amounts of syntactic sugar:

fn run<Callback>(mut callback: Callback) where Callback: FnMut(Info),
fn run<Callback>(mut callback: Callback) where Callback: FnMut(Info<'_>),
fn run<Callback>(mut callback: Callback)
where
    Callback: for<'any> FnMut(Info<'any>),

And that's why your original compiled. (Incidentally, I recommend always writing Info<'_> instead of just Info, et cetera, to make implicit HRTB and implicit borrows more obvious in the code.)
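
To make the equivalence concrete, here is a small standard-library-only sketch. The function names `run_elided`, `run_underscore`, and `run_explicit` are hypothetical, and each one builds an `Info` directly from a local `String` instead of deserializing. All three accept the same closure and all three can hand it a borrow of a local, because every one of the bounds is higher-ranked.

```rust
#[derive(Debug)]
struct Info<'a> {
    field1: &'a str,
}

// Form 1: the lifetime is elided entirely.
fn run_elided<Callback>(mut callback: Callback)
where
    Callback: FnMut(Info),
{
    let content = String::from("value1"); // local buffer
    callback(Info { field1: &content });
}

// Form 2: the elision is made visible with `'_`.
fn run_underscore<Callback>(mut callback: Callback)
where
    Callback: FnMut(Info<'_>),
{
    let content = String::from("value1");
    callback(Info { field1: &content });
}

// Form 3: the higher-ranked bound is written out in full.
fn run_explicit<Callback>(mut callback: Callback)
where
    Callback: for<'any> FnMut(Info<'any>),
{
    let content = String::from("value1");
    callback(Info { field1: &content });
}

fn main() {
    run_elided(|info| println!("elided: {:?}", info));
    run_underscore(|info| println!("underscore: {:?}", info));
    run_explicit(|info| println!("explicit: {:?}", info));
}
```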


After reading that, you might be tempted to try:

fn run<Res: for<'de> Deserialize<'de>, Callback>(mut callback: Callback)
where
    Callback: FnMut(Res),

The function will compile with such a bound, but you won't be able to actually use it the way you intend. (Also, in case you didn't know, DeserializeOwned is a convenient wrapper around that exact bound.)

There are two related problems, with the same root cause: Types that vary by lifetime are still distinct types. Info is not a type; it is a type constructor. What is a type is Info<'a> where 'a has taken on some specific, concrete lifetime. Another thing to understand is that generic type parameters must represent a single type, and not a type constructor like Info without a concrete lifetime.

The first problem is what the error message is about: There is no Info<'concrete> where Info<'concrete>: for<'any> Deserialize<'any>. Instead, Info<'concrete> implements Deserialize<'concrete>.

The second problem is that you need your closure to be generic over the input lifetime; you want a for<'any> FnMut(Info<'any>) like you had before. But you're trying to replace that with FnMut(Res) where Res is a type parameter. This won't work because, again, Res must represent a single type, not a type constructor.


Is there any way forward? Yes -- you can emulate generic type constructors by using a trait with a GAT, or with a GAT-emulating lifetime-taking trait.

However, the abstraction is somewhat complicated and it tends to completely wreck inference.

You can sometimes recover inference by replacing enough generic implementations with concrete ones. If you do so, the boilerplate grows enough you might want a macro you call on each type. (At this point there's an argument you're not gaining much with the generic framework over just macroing your original repetitive code.)
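
A minimal sketch of the GAT-based emulation might look like the following. It is not the code from the linked solution: `Parse<'de>` is a hand-written stand-in for serde's `Deserialize<'de>`, and `Family`, `InfoFamily`, `field_len`, the `field1=<value>` format, and the `messages` slice are all hypothetical names introduced for illustration. The callback is a named function rather than a closure, and it returns a value so the result can be inspected.

```rust
// Stand-in for serde's `Deserialize<'de>` (an assumption of this sketch).
trait Parse<'de>: Sized {
    fn parse(input: &'de str) -> Option<Self>;
}

#[derive(Debug, PartialEq)]
struct Info<'a> {
    field1: &'a str,
}

impl<'de> Parse<'de> for Info<'de> {
    fn parse(input: &'de str) -> Option<Self> {
        // Toy wire format (hypothetical): "field1=<value>".
        input.strip_prefix("field1=").map(|value| Info { field1: value })
    }
}

// Emulated type constructor: maps any lifetime to a concrete parseable type.
trait Family {
    type Out<'de>: Parse<'de>;
}

// Marker type standing for "Info without a lifetime chosen yet".
struct InfoFamily;

impl Family for InfoFamily {
    type Out<'de> = Info<'de>;
}

// `F` is a single type (the marker); the lifetime is picked inside the loop.
fn run<F, Callback>(messages: &[&str], mut callback: Callback) -> Vec<usize>
where
    F: Family,
    Callback: for<'de> FnMut(F::Out<'de>) -> usize,
{
    let mut results = Vec::new();
    for message in messages {
        let content = message.to_string(); // local buffer, as in the original loop
        // Bind the parsed value to a local so it is dropped before `content`.
        let parsed = <F::Out<'_> as Parse<'_>>::parse(&content);
        match parsed {
            Some(res) => results.push(callback(res)),
            None => eprintln!("Error: could not parse {:?}", content),
        }
    }
    results
}

// Named callback that works for every lifetime of `Info`.
fn field_len(info: Info<'_>) -> usize {
    info.field1.len()
}

fn main() {
    let lens = run::<InfoFamily, _>(&["field1=value1", "garbage"], field_len);
    println!("{:?}", lens);
}
```

Note that the caller has to name the marker type explicitly with a turbofish, which is one visible form of the inference cost mentioned above.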

Discussion of a similar, more general use case.


It works very well. The duplicated business logic was extracted to a generic function eventually. Thank you very much!

I've just faced a problem very similar to the OP's and was stuck for days trying to figure out a solution other than the "morally correct" one of dropping the lifetime parameter from the struct, as I wanted zero-copy deserialization. @quinedot's approach worked flawlessly for me too, although it introduces significant complexity.

It'd be great if Rust had more ergonomic support for this pattern of using type constructors with closures. I don't think there's anything morally wrong, from a design standpoint, with what the OP and I wanted to do.

Also, to add to what @kornel stated about zero-copy deserialization being unable to cope with escapes, I think it is worth pointing out that a Cow<'a, str> can be used with deserializers such as serde_json to deal with that owned-borrowed data duality. This is another reference from the creator of serde stating that Cow<'a, str> is the way to go.
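
To illustrate the duality without pulling in serde, here is a small standard-library sketch. The `unescape` function is hypothetical and handles only a couple of backslash escapes; it is not how serde_json is implemented. It borrows the input when nothing needs rewriting and allocates only when an escape is present. With serde itself, a `Cow<'a, str>` field generally also needs the `#[serde(borrow)]` attribute to borrow from the input rather than always producing the owned variant.

```rust
use std::borrow::Cow;

// Returns a borrowed slice when the input contains no escapes,
// and an owned, rewritten string otherwise.
fn unescape(raw: &str) -> Cow<'_, str> {
    if !raw.contains('\\') {
        return Cow::Borrowed(raw); // zero-copy path
    }
    let mut out = String::with_capacity(raw.len());
    let mut chars = raw.chars();
    while let Some(c) = chars.next() {
        if c == '\\' {
            match chars.next() {
                Some('n') => out.push('\n'),
                Some('t') => out.push('\t'),
                Some(other) => out.push(other), // e.g. `\\` or `\"`
                None => out.push('\\'),         // trailing backslash kept as-is
            }
        } else {
            out.push(c);
        }
    }
    Cow::Owned(out) // allocation only when it was unavoidable
}

fn main() {
    for raw in ["value1", r"line1\nline2"] {
        match unescape(raw) {
            Cow::Borrowed(s) => println!("borrowed: {:?}", s),
            Cow::Owned(s) => println!("owned: {:?}", s),
        }
    }
}
```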