Is this lifetime bound possible to express?

I've been staring at this code for hours, unable to make it work...

use std::fmt::Debug;
use serde::Deserialize;
use serde_json::Deserializer;

#[derive(Debug, Deserialize)]
struct Foo<'input> {
    data: &'input str,
}

// What lifetime bounds do you have to put on Deserialize
// here to make this code compile?
fn parse_test_string<T>() -> bool
where
    T: Deserialize
{
    let test_string = r#"{"data":"test"}"#.to_owned();
    let mut de = Deserializer::from_str(&test_string);
    let success: bool = T::deserialize(&mut de).is_ok();
    success
}

fn main() {
    parse_test_string::<Foo>();
}

If I hard-code the type, it works fine. But I just can't figure out how to make it generic over any T. Neither a caller-chosen lifetime (<'de, T: Deserialize<'de>>) nor a higher-ranked bound (T: for<'de> Deserialize<'de>) works, which makes sense. I guess I would want some sort of existential lifetime specifier?

Is it possible to type this function, or not? Do you know of any workarounds, or are there any nightly features that could help?

This post is incorrect. The previous wrong answer is kept below for reference.

There must be some lifetime 'x such that:

  1. The value of type Foo<'x> does not outlive 'x (it is dropped[1] before 'x ends).
  2. The value given to the deserializer is borrowed for 'x (it must not be dropped before 'x ends).

However,

  • The value given to the deserializer is test_string, a local variable in parse_test_string.
  • The value of type Foo is returned from parse_test_string and outlives the local variable.

Therefore, the requirements are contradictory. There are no trait bounds that can make parse_test_string compile, because if it compiled, it would contain a use-after-free bug.


In order to fix this, you must do one of these things:

  1. Change Foo to own its data (data: String) and then use the bound T: for<'de> Deserialize<'de> (also known as T: serde::de::DeserializeOwned). This is the option that gives the most flexibility in how Foo may be used after it is deserialized.

    struct Foo {
        data: String,
    }
    fn parse_test_string<T: DeserializeOwned>() -> bool {...
    
  2. Change parse_test_string() to accept a &'de str argument. This is the option if your application wishes to do zero-copy deserialization and accept input at run-time.

    fn parse_test_string<'de, T>(test_string: &'de str) -> bool
    where
        T: Deserialize<'de>
    {
    
  3. Parse a 'static string. This is appropriate if the input string is in fact supposed to be hard-coded in parse_test_string (i.e. the function is for use by tests).

    fn parse_test_string<T>() -> bool
    where
        T: Deserialize<'static>
    {
        let test_string: &'static str = r#"{"data":"test"}"#;
        let mut de = Deserializer::from_str(test_string);
    

  [1] or at least no longer used
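
For the more common case where the parsed value does escape the function, options 2 and 3 can be sketched without external crates. This is only an illustration: `Parse<'de>` is a made-up stand-in for serde's `Deserialize<'de>`, the `data=<value>` format is invented, and `parse_borrowed` is a hypothetical name.

```rust
/// Hypothetical stand-in for serde's `Deserialize<'de>` (an assumption, not
/// serde's API): parse `Self` from a string that is borrowed for `'de`.
trait Parse<'de>: Sized {
    fn parse(input: &'de str) -> Option<Self>;
}

#[derive(Debug, PartialEq)]
struct Foo<'input> {
    data: &'input str,
}

impl<'de> Parse<'de> for Foo<'de> {
    fn parse(input: &'de str) -> Option<Self> {
        // Toy format: `data=<value>`; the value is borrowed, not copied.
        input.strip_prefix("data=").map(|data| Foo { data })
    }
}

// Option 2: the caller owns the input, so the caller picks `'de` and the
// parsed value may be returned.
fn parse_borrowed<'de, T: Parse<'de>>(input: &'de str) -> Option<T> {
    T::parse(input)
}

// Option 3: the input is a `'static` literal, so `'static` can be named
// directly in the bound.
fn parse_test_string<T: Parse<'static>>() -> bool {
    let test_string: &'static str = "data=test";
    T::parse(test_string).is_some()
}

fn main() {
    let owned = String::from("data=hello");
    let foo: Option<Foo> = parse_borrowed(&owned);
    assert_eq!(foo, Some(Foo { data: "hello" }));
    assert!(parse_test_string::<Foo>());
}
```

In both variants the lifetime is one the caller can name, which is why an ordinary `T: Parse<'de>` bound is enough.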

"The value of type Foo is returned from parse_test_string and outlives the local variable."

It's not! :smile: The function returns bool, the Foo is dropped.

You can emulate "generic type constructors" for a set of known types with a trait.

trait DeserMaybeBorrowed {
    type Ty<'de>: Deserialize<'de>;
}

struct FooRep;
impl DeserMaybeBorrowed for FooRep {
    type Ty<'de> = Foo<'de>;
}

fn parse_test_string<T>() -> bool
where
    T: DeserMaybeBorrowed
{
    let test_string = r#"{"data":"test"}"#.to_owned();
    let mut de = Deserializer::from_str(&test_string);
    let success: bool = <T::Ty<'_>>::deserialize(&mut de).is_ok();
    success
}
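
For anyone who wants to try the pattern without pulling in serde, here is a std-only sketch of the same idea. `Parse<'de>` is a hypothetical stand-in for `Deserialize<'de>`, `ParseMaybeBorrowed` mirrors `DeserMaybeBorrowed`, and the `data=<value>` format is invented for the example.

```rust
/// Hypothetical stand-in for serde's `Deserialize<'de>` (an assumption, not
/// serde's API): parse `Self` from a string that is borrowed for `'de`.
trait Parse<'de>: Sized {
    fn parse(input: &'de str) -> Option<Self>;
}

#[derive(Debug)]
struct Foo<'input> {
    data: &'input str,
}

impl<'de> Parse<'de> for Foo<'de> {
    fn parse(input: &'de str) -> Option<Self> {
        // Toy format: `data=<value>`; the value is borrowed, not copied.
        input.strip_prefix("data=").map(|data| Foo { data })
    }
}

/// Emulated type constructor: maps a lifetime to a concrete parseable type.
trait ParseMaybeBorrowed {
    type Ty<'de>: Parse<'de>;
}

/// Marker type that represents `Foo<'_>` for every possible lifetime.
struct FooRep;

impl ParseMaybeBorrowed for FooRep {
    type Ty<'de> = Foo<'de>;
}

fn parse_test_string<T: ParseMaybeBorrowed>() -> bool {
    let test_string = "data=test".to_owned();
    // `'_` is the lifetime of the local borrow. It is chosen inside the
    // body, so the caller never has to name it.
    let success = <T::Ty<'_>>::parse(test_string.as_str()).is_some();
    success
}

fn main() {
    assert_eq!(Foo::parse("data=x").map(|foo| foo.data), Some("x"));
    assert!(parse_test_string::<FooRep>());
}
```

The caller passes the marker `FooRep` rather than `Foo` itself, and the function instantiates `Foo<'_>` with whatever local lifetime it needs.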

Sorry for the incorrect information! I saw the beginning and assumed too much.

No worries :smiley: I'm sure the problem of returning deserialized data that references a temporary is much more common than this strange situation I find myself in.

That's a really interesting workaround. Only working with a known set of types is a pretty harsh limitation though, unfortunately.
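
One way the limitation might be softened, sketched here with std only: a single blanket impl on an adapter can cover every type that parses for all lifetimes, so only borrowing types need a hand-written marker. `Parse<'de>` is a hypothetical stand-in for `Deserialize<'de>`, and `Owned<T>` and `Count` are made-up names; with serde the bound would presumably be `T: DeserializeOwned`, which I have not tested here.

```rust
use std::marker::PhantomData;

/// Hypothetical stand-in for serde's `Deserialize<'de>` (an assumption, not
/// serde's API).
trait Parse<'de>: Sized {
    fn parse(input: &'de str) -> Option<Self>;
}

/// Emulated type constructor: maps a lifetime to a concrete parseable type.
trait ParseMaybeBorrowed {
    type Ty<'de>: Parse<'de>;
}

/// Hypothetical adapter: one blanket impl covers every type that parses for
/// all lifetimes (the analogue of `DeserializeOwned`).
struct Owned<T>(PhantomData<T>);

impl<T: for<'de> Parse<'de>> ParseMaybeBorrowed for Owned<T> {
    // The type ignores the lifetime because it borrows nothing.
    type Ty<'de> = T;
}

/// An owning type: it keeps no reference into the input.
#[derive(Debug, PartialEq)]
struct Count(u32);

impl<'de> Parse<'de> for Count {
    fn parse(input: &'de str) -> Option<Self> {
        // Toy format: `data=<unsigned integer>`.
        let digits = input.strip_prefix("data=")?;
        digits.parse::<u32>().ok().map(Count)
    }
}

fn parse_test_string<T: ParseMaybeBorrowed>() -> bool {
    let test_string = "data=42".to_owned();
    let success = <T::Ty<'_>>::parse(test_string.as_str()).is_some();
    success
}

fn main() {
    assert_eq!(Count::parse("data=7"), Some(Count(7)));
    // No per-type marker is needed for `Count`.
    assert!(parse_test_string::<Owned<Count>>());
}
```

Borrowing types such as `Foo<'input>` still need their own marker, because `Owned<Foo<'a>>` only parses for the one lifetime `'a`.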

That doesn't help with deserializing from owned strings, and helpers aren't needed with &'static str. Something like shorten happens automatically in that case.

Yes, my bad. I forgot to bring to_owned back. The original example just works with the 'static string and T: Deserialize<'de>.


It did prompt me to think up another workaround:

fn parse_test_string<'de, T>(buf: &'de mut String) -> bool
where
    T: Deserialize<'de>,
{
    *buf = r#"{"data":"test"}"#.to_owned();
    let mut de = Deserializer::from_str(buf);
    let success: bool = T::deserialize(&mut de).is_ok();
    success
}

fn main() {
    parse_test_string::<Foo>(&mut String::new());
}
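
The reason this works is that the caller now owns the storage, so 'de is an ordinary caller-chosen lifetime. A std-only sketch of the same shape, where `Parse<'de>` is a hypothetical stand-in for `Deserialize<'de>` and the `data=<value>` format is invented:

```rust
/// Hypothetical stand-in for serde's `Deserialize<'de>` (an assumption, not
/// serde's API): parse `Self` from a string that is borrowed for `'de`.
trait Parse<'de>: Sized {
    fn parse(input: &'de str) -> Option<Self>;
}

#[derive(Debug)]
struct Foo<'input> {
    data: &'input str,
}

impl<'de> Parse<'de> for Foo<'de> {
    fn parse(input: &'de str) -> Option<Self> {
        // Toy format: `data=<value>`; the value is borrowed, not copied.
        input.strip_prefix("data=").map(|data| Foo { data })
    }
}

// The buffer lives in the caller's frame, so a plain `T: Parse<'de>` bound
// is enough.
fn parse_test_string<'de, T: Parse<'de>>(buf: &'de mut String) -> bool {
    *buf = "data=test".to_owned();
    // Reborrow the freshly written buffer immutably for the rest of `'de`.
    let input: &'de str = buf.as_str();
    T::parse(input).is_some()
}

fn main() {
    assert_eq!(Foo::parse("data=x").map(|foo| foo.data), Some("x"));

    let mut buf = String::new();
    assert!(parse_test_string::<Foo>(&mut buf));
    // Only a bool came back, so the borrow ended with the call and the
    // buffer can be inspected again.
    assert_eq!(buf, "data=test");
}
```

The cost is that every caller has to supply a scratch `String`, even though it never looks at the contents.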

...which leads to an obvious unsafe workaround: Rust Playground

It would be easy to trigger undefined behavior when modifying this function, so in general I recommend avoiding workarounds like this. But if you just want to quickly check whether your structures are being parsed correctly, it might do.

It's not sound.


Ok, I surrender. In my defense, you violated my SAFETY comment but yes: the whole function must be marked unsafe and the inner safety comment must become a doc comment which includes an article describing how Deserializers and T::deserialize with side effects harm the welfare of the population.

At this point, I'm tired of wrestling with the compiler and just wish we had a never lifetime ('!) to handle such patterns.

fn parse_test_string<T>() -> bool
where
    T: Deserialize<'!>

'! would magically ensure that references in T don't outlive the function body and everyone would be happy :hugs:

The problem isn't about an "existential" lifetime; it's that T would have to be a higher-kinded type (a generic type constructor), which doesn't exist in Rust. The closest we can get is to emulate one with a GAT, as demonstrated in @quinedot's earlier reply.

Hypothetically, HKTs in Rust could look like this:

fn parse_test_string<T<'a>>() -> bool
where
    T<'a>: Deserialize<'a>;

But we don't have T<'a>; we only have something like T::Of<'a> (T::Ty<'de> in the GAT example).

FYI similar issue: Polonius fails to infer lifetimes of borrows · Issue #134554 · rust-lang/rust · GitHub

HKTs would definitely help, but as far as I know they're not planned to be added to the language. It seems to me that this particular use case could be solved with a never lifetime. The never lifetime even exists in the compiler, but it's still not expressible in user syntax.