Fun (and sadness) with serde untagged enums

So here's an attempt to provide "any" and "all" deserialization options within serde's public APIs:

use serde::Deserialize;

#[derive(Deserialize, Debug)]
struct Foo { foo: u64, baz: u64, }
#[derive(Deserialize, Debug)]
struct Bar { bar: u64, baz: u64, }

// equivalent to `||`
#[derive(Deserialize, Debug)]
#[serde(untagged)]
enum Any {
    Foo(Foo),
    Bar(Bar),
}

// sadly not equivalent to `&&`
#[derive(Deserialize, Debug)]
struct All {
    #[serde(flatten)]
    foo: Foo,
    #[serde(flatten)]
    bar: Bar,
}

fn main() {
    // Foo is preferred over Bar, ok.
    let _ = dbg!(serde_json::from_str::<Any>(r#"{"foo": 10, "bar": 20, "baz": 30}"#));
    // Foo fails, so we get Bar, ok.
    let _ = dbg!(serde_json::from_str::<Any>(r#"{"bar": 20, "baz": 30}"#));
    // Foo consumes baz, so Bar can't have it. As such this fails. Ideally we want this to work.
    let _ = dbg!(serde_json::from_str::<All>(r#"{"foo": 10, "bar": 20, "baz": 30}"#));
    // This is not a solution. Even if it worked, it wouldn't provide the correct semantics.
    let _ = dbg!(serde_json::from_str::<All>(r#"{"foo": 10, "bar": 20, "baz": 30, "baz": 40}"#));
}


The Any pattern works great! Sadly there doesn't seem to be any way to provide the All pattern, at least not within the public API. Ah well, it was worth a shot.

Anyway, there's not much point to this other than to play with some weird stuff and see what happens. Thoughts?

Side note: despite what the manual says, you can have a deny_unknown_fields struct as the trailing flattened struct in another struct. :p (Or at least you should be able to. Haven't actually tested it, just inferring.)

Edit: but you need to wrap it in an untagged enum, like so:

use serde::Deserialize;

#[derive(Deserialize, Debug)]
struct Foo { foo: u64, }

#[derive(Deserialize, Debug)]
#[serde(deny_unknown_fields)]
struct Bar { bar: u64, }

// lmao
#[derive(Deserialize, Debug)]
#[serde(untagged)]
enum Fake<T> {
    T(T),
}

#[derive(Deserialize, Debug)]
struct Baz {
    foo: Foo,
    #[serde(flatten)]
    bar: Fake<Bar>,
}

fn main() {
    let _ = dbg!(serde_json::from_str::<Baz>(r#"{"foo": {"foo": 10}, "bar": 20}"#));
    let _ = dbg!(serde_json::from_str::<Baz>(r#"{"foo": {"foo": 10}, "bar": 20, "baz": 30}"#));
}

This behavior works when you deserialize two maps instead of two structs, i.e.,

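use serde::Deserialize;
use std::collections::HashMap;

#[derive(Deserialize, Debug)]
struct A {
    // Both maps see every key, because map entries are not consumed.
    #[serde(flatten)]
    a: HashMap<String, String>,
    #[serde(flatten)]
    b: HashMap<String, String>,
}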

serde distinguishes between deserializing a struct, which has a fixed set of keys, and deserializing a map, which has an arbitrary number of arbitrary keys. In the latter case, the items are not removed from the internal Content type. So, provided that Foo and Bar can be deserialized from a map (which the impls generated by serde_derive can), you can write a Deserializer that wraps a second Deserializer and forwards all calls except deserialize_struct, which it redirects to deserialize_map.
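To make the struct-versus-map difference concrete, here is a toy model of the buffer that flatten works from. This is only a sketch, not serde's real code: Buffer, take_struct, peek_map and buffer_from are made-up names, and values are plain u64 instead of serde's internal Content type. It assumes the struct path claims every entry whose key matches one of its fields, while the map path only clones entries.

```rust
/// Hypothetical stand-in for the buffer behind `#[serde(flatten)]`:
/// each slot stays `Some((key, value))` until a struct field claims it.
type Buffer = Vec<Option<(String, u64)>>;

/// Builds a buffer from literal entries (helper for this sketch only).
fn buffer_from(entries: &[(&str, u64)]) -> Buffer {
    entries
        .iter()
        .map(|(key, value)| Some((key.to_string(), *value)))
        .collect()
}

/// Models flattening into a struct: every entry whose key is listed in
/// `fields` is taken out of the buffer, so later fields cannot see it.
fn take_struct(buffer: &mut Buffer, fields: &[&str]) -> Result<Vec<(String, u64)>, String> {
    let mut taken = Vec::new();
    for slot in buffer.iter_mut() {
        let wanted = slot
            .as_ref()
            .map_or(false, |(key, _)| fields.contains(&key.as_str()));
        if wanted {
            // `take` leaves `None` behind, which is the "consumed" state.
            taken.extend(slot.take());
        }
    }
    for field in fields {
        if !taken.iter().any(|(key, _)| key.as_str() == *field) {
            return Err(format!("missing field `{field}`"));
        }
    }
    Ok(taken)
}

/// Models flattening into a map: remaining entries are cloned and
/// nothing is removed, so any number of maps can read the same keys.
fn peek_map(buffer: &Buffer) -> Vec<(String, u64)> {
    buffer.iter().flatten().cloned().collect()
}
```

In this model, calling take_struct for Foo's fields ("foo", "baz") and then for Bar's fields ("bar", "baz") on the same buffer fails the second time, because "baz" is already gone. Calling peek_map twice returns all entries both times, which is why routing a struct through the map path lets both fields see "baz".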

One approach that works, though perhaps not the nicest one, is this. It passes the first three of your tests and fails on the last one because of the duplicate key.

serde_with::with_prefix!(prefix_none "");

// sadly not equivalent to `&&`
#[derive(Deserialize, Debug)]
struct All {
    #[serde(flatten, with="prefix_none")]
    foo: Foo,
    #[serde(flatten, with="prefix_none")]
    bar: Bar,
}

Huh, interesting!

Altho then again, this is still not a proper &&... because we kinda forgot to consider cases like the following:

use serde::Deserialize;

#[derive(Deserialize, Debug)]
struct Foo ( u64, u64, );
#[derive(Deserialize, Debug)]
struct Bar ( u64, u64, );

// sadly not equivalent to `&&`
#[derive(Deserialize, Debug)]
struct All {
    #[serde(flatten)]
    foo: Foo,
    #[serde(flatten)]
    bar: Bar,
}

fn main() {
    // doesn't work at all!
    let _ = dbg!(serde_json::from_str::<All>(r#"[1, 2]"#));
}

Even with the prefix_none hack, it still wouldn't work. :‌(

One other thing is support for non-string keys. Not all formats support those but serde in general does. Uhh unfortunately it's not very easy to test with those... we do think it's important to support them tho.

Edit: So we guess this isn't possible at all within serde's public API. (We tried.)

It's still nice that serde_derive provides easy support for || tho! Maybe in the future serde can gain && support as well.

Ohh interesting, this works:

use serde::Deserialize;

#[derive(Deserialize, Debug)]
struct Foo { foo: u64, baz: u64, }
#[derive(Deserialize, Debug)]
struct Bar { bar: u64, baz: u64, }

// equivalent to `||`
#[derive(Deserialize, Debug)]
#[serde(untagged)]
enum Any {
    Foo(Foo),
    Bar(Bar),
}

// lmao
#[derive(Deserialize, Debug)]
#[serde(untagged)]
enum Fake<T> {
    T(T),
}

// sadly not equivalent to `&&`
#[derive(Deserialize, Debug)]
struct All {
    #[serde(flatten)]
    foo: Fake<Foo>,
    #[serde(flatten)]
    bar: Fake<Bar>,
}

fn main() {
    // Foo is preferred over Bar, ok.
    let _ = dbg!(serde_json::from_str::<Any>(r#"{"foo": 10, "bar": 20, "baz": 30}"#));
    // Foo fails, so we get Bar, ok.
    let _ = dbg!(serde_json::from_str::<Any>(r#"{"bar": 20, "baz": 30}"#));
    // The enum makes a copy of the contents, so this works.
    let _ = dbg!(serde_json::from_str::<All>(r#"{"foo": 10, "bar": 20, "baz": 30}"#));
    // This obviously fails. :)
    let _ = dbg!(serde_json::from_str::<All>(r#"{"foo": 10, "bar": 20, "baz": 30, "baz": 40}"#));
}

Kinda silly way of doing it, and probably not very efficient. But it's neat!

Still can't figure out how to make something like this work:

struct All {
    foo: String,
    bar: Url,
}

Our best idea would be to use #[serde(transparent)] but unfortunately it doesn't work with #[serde(flatten)].