Is there a cleaner way to write this match statement?

I should note that this is my second day of Rust, and I got through the first day via sheer bloody-mindedness. I finally came up with a working, albeit ugly, match statement that is the core of a module that parses semi-structured JSON blobs.

I should also mention that this is a port from Python to Rust... That's how I learn. Anyway...

I have a match statement which is really just a bunch of if statements in disguise:

fn flatten_tasks(task: &Value, task_id: String) {
    let task_id = get_task_id(task, task_id);

    match task {
        Value::Object(obj)
            if obj.contains_key("value")
                && obj["value"].is_array()
                && obj["value"][0].is_string() =>
        {
            add_list_of_values(task, task_id);
        }
        Value::Object(obj) if obj.contains_key("value") && obj["value"].is_array() => {
            nested_tasks(task, task_id);
        }
        Value::Object(obj) if obj.contains_key("tool_label") && obj.contains_key("width") => {
            add_box_values(task, task_id);
        }
        Value::Object(obj) if obj.contains_key("tool_label") && obj.contains_key("x1") => {
            add_length_values(task, task_id);
        }
        Value::Object(obj) if obj.contains_key("tool_label") && obj.contains_key("x") => {
            add_point_values(task, task_id);
        }
        Value::Object(obj) if obj.contains_key("tool_label") && obj.contains_key("details") => {
            add_values_from_workflow(task, task_id);
        }
        Value::Object(obj) if obj.contains_key("select_label") => add_selected_value(task, task_id),
        Value::Object(obj) if obj.contains_key("task_label") => add_text_value(task, task_id),
        _ => panic!("Unknown field type in: {:?}", task),
    }
}

It looks much cleaner in Python due to the lack of if guards on the arms. Is it possible to do something similar in Rust?

def flatten_annotation(anno, row, workflow_strings, task_id=""):
    """Flatten one annotation recursively."""
    task_id = anno.get("task", task_id)

    match anno:
        case {"value": [str(), *__], **___}:
            list_annotation(anno, row, task_id)
        case {"value": list(), **__}:
            subtask_annotation(anno, row, workflow_strings, task_id)
        case {"select_label": _, **__}:
            select_label_annotation(anno, row, task_id)
        case {"task_label": _, **__}:
            task_label_annotation(anno, row, task_id)
        case {"tool_label": _, "width": __, **___}:
            box_annotation(anno, row, task_id)
        case {"tool_label": _, "x1": __, **___}:
            length_annotation(anno, row, task_id)
        case {"tool_label": _, "x": __, **___}:
            point_annotation(anno, row, task_id)
        case {"tool_label": _, "details": __, **___}:
            workflow_annotation(anno, row, workflow_strings, task_id)
        case _:
            print(f"Annotation type not found: {anno}")

Edit: Sorry for the crappy question: (I'll add more as I uncrapify this)

  • This is serde_json.
  • I only know what some of the data is; there are extra fields that I don't control.
  • The structure of the JSON is only known at the leaves.

Not sure how your Value type is defined. I'd say just use if-else statements:

if let Value::Object(obj) = task {
    if obj.contains_key("value") && obj["value"].is_array() && obj["value"][0].is_string() {
        add_list_of_values(task, task_id);
    } else if obj.contains_key("value") && obj["value"].is_array() {
        nested_tasks(task, task_id);
    } // Bunch of other if-else
} else {
    panic!("Unknown field type in: {:?}", task);
}
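
If the repeated lookups of "value" are what bothers you, one option is to call get once and let the patterns describe the shape. The following is only a sketch of that idea: it uses a tiny stand-in Value enum so it compiles without serde_json (serde_json's Map::get also returns an Option<&Value>, so the same patterns should carry over), it only covers the "value" key, and value_handler is a made-up helper that returns the handler's name instead of calling it.

```rust
use std::collections::BTreeMap;

// Stand-in for serde_json::Value, trimmed to what this sketch needs.
#[derive(Debug)]
enum Value {
    Number(f64),
    String(String),
    Array(Vec<Value>),
    Object(BTreeMap<String, Value>),
}

// Hypothetical helper: names the handler that the "value" key would select.
fn value_handler(task: &Value) -> Option<&'static str> {
    let obj = match task {
        Value::Object(obj) => obj,
        _ => return None,
    };
    // One lookup; the patterns then describe the shape directly.
    match obj.get("value") {
        Some(Value::Array(items)) => match items.as_slice() {
            // First element is a string: a plain list of values.
            [Value::String(_), ..] => Some("add_list_of_values"),
            // Any other array (including an empty one): nested tasks.
            _ => Some("nested_tasks"),
        },
        // Missing or not an array: the other key checks would go here.
        _ => None,
    }
}
```

The slice pattern [Value::String(_), ..] does the job of the is_array() and [0].is_string() checks in one place, and an empty array falls through to the nested case just like in the original guards.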

I'm guessing this is serde_json. You don't usually want to work with a raw Value if you already know the shape of the data. I'd recommend deserializing as soon as you can. Then you can get rid of the panic!() too.

Rough outline:

use serde::Deserialize;

#[derive(Deserialize)]
enum Task {
    List(String),
    Subtask(Vec<Task>),
    ToolLabel(String),
    // ...
}

fn flatten_tasks(task: &Task) {
    match task {
        Task::List(abc) => add_list_of_values(abc),
        // etc.
        _ => {}
    }
}

// `data` is assumed to be the raw JSON text (a &str).
let task: Task = serde_json::from_str(data).unwrap();
flatten_tasks(&task);

Thank you for showing me how to do this in an if block. I tried and failed earlier.

It works and it has the added benefit of answering my next question, "Why so many match statements?"

I can definitely do this if I can elide over the data that doesn't pertain to this program and is constantly changing.

So I know what the leaves of the json (tree) are but I'm not in control of what they contain. A leaf will look similar to:

{
  "task": "T11",
  "select_label": "my_choice",
  "junk1": "these fields will blink into and out of existence",
  "junk2": "these fields will blink into and out of existence"
}

You could also use the Value::pointer() method with some pattern matching.

if let Some(Value::String(s)) = task.pointer("/value/0") {
  // do something with the string at `value[0]`
}

This really shines for nested values when your task object's shape isn't known at compile time. Although, if you know what task might look like at compile time then I'd just #[derive(Deserialize)] to deserialize the JSON into a strongly-typed value.
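
To make the pointer syntax concrete, here is a toy resolver in the spirit of JSON Pointer. It is not serde_json's implementation: it runs on a stand-in Value enum, resolve is a made-up name, and it skips the ~0 and ~1 escapes that the RFC defines. What it does show is that each /-separated token steps one level down, and that a non-empty pointer without a leading / resolves to nothing, which as far as I know is also how Value::pointer behaves.

```rust
use std::collections::BTreeMap;

// Stand-in for serde_json::Value, trimmed to what this sketch needs.
#[derive(Debug, PartialEq)]
enum Value {
    String(String),
    Array(Vec<Value>),
    Object(BTreeMap<String, Value>),
}

// Hypothetical toy resolver: walks one token at a time from the root.
fn resolve<'a>(root: &'a Value, pointer: &str) -> Option<&'a Value> {
    // The empty pointer refers to the whole document.
    if pointer.is_empty() {
        return Some(root);
    }
    // Any other pointer must start with '/'.
    let rest = pointer.strip_prefix('/')?;
    let mut current = root;
    for token in rest.split('/') {
        current = match current {
            // Objects are indexed by key.
            Value::Object(map) => map.get(token)?,
            // Arrays are indexed by a decimal position.
            Value::Array(items) => items.get(token.parse::<usize>().ok()?)?,
            // Scalars have nothing below them.
            Value::String(_) => return None,
        };
    }
    Some(current)
}
```

With a task shaped like {"value": ["a"]}, resolve(&task, "/value/0") finds the string, while "value/0" and "/value/1" both come back as None.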


This is very helpful, and it might clean up the code further.

Ya know, I've worked with JSON for a few years and I've even seen this notation before, but I never realized it was part of an RFC: JSON Pointer (RFC 6901).

Hi mods, not sure why @telesphore's thread suddenly got deleted by a... automated spam filter? But you can move this there after y'all figure that out. Edit: Thanks!


By default, a serde impl generated from a struct will already simply skip over unrecognized fields, doing nothing more than validating them as valid JSON (or not even that, if you have already parsed to Value).

You can even do this explicitly for a field that you merely want to check exists, by using serde::de::IgnoredAny. E.g. this will work similar to the....... uh...... part of the example stuff that disappeared from the thread in the middle of me typing this. >_>

use serde::de::IgnoredAny;
use serde::Deserialize;

#[derive(Deserialize)]
struct Task {
    task: String,
    task_id: String,
    #[serde(flatten)]
    kind: TaskKind,
}

#[derive(Deserialize)]
#[serde(untagged)]
enum TaskKind {
    BoxValues { tool_label: IgnoredAny, width: IgnoredAny },
    LengthValues { tool_label: IgnoredAny, x1: IgnoredAny },
    PointValues { tool_label: IgnoredAny, x: IgnoredAny },
    WorkflowValues { tool_label: IgnoredAny, details: IgnoredAny },
    SelectedValue { select_label: IgnoredAny },
    TextValue { task_label: IgnoredAny },
}

Of course, this type I've written here isn't too useful on its own (since it just tosses away all of the extraneous fields). Part of the idea here is that you can make multiple passes if a single Deserializable type doesn't meet all of your needs. E.g. you can parse the whole thing to Value first, and then do serde_json::from_value to get one of these for your match, but continue passing the whole Value to other functions.
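
The two-pass idea can be sketched without serde at all: a first pass reduces the object to a small enum, and a second pass matches on that enum while still reading from the full object. In the sketch below, classify stands in for what serde_json::from_value into the untagged enum would do; classify, describe, KINDS and the string-only Object alias are all made up for illustration. Kinds are tried top to bottom, the same way serde tries untagged variants in declaration order, so the order of the table matters.

```rust
use std::collections::BTreeMap;

// Stand-in for a parsed JSON object; every field is a string to keep this short.
type Object = BTreeMap<String, String>;

#[derive(Debug, Clone, Copy, PartialEq)]
enum TaskKind {
    BoxValues,
    LengthValues,
    PointValues,
    WorkflowValues,
    SelectedValue,
    TextValue,
}

// Keys that must all be present for each kind, tried top to bottom.
const KINDS: &[(&[&str], TaskKind)] = &[
    (&["tool_label", "width"], TaskKind::BoxValues),
    (&["tool_label", "x1"], TaskKind::LengthValues),
    (&["tool_label", "x"], TaskKind::PointValues),
    (&["tool_label", "details"], TaskKind::WorkflowValues),
    (&["select_label"], TaskKind::SelectedValue),
    (&["task_label"], TaskKind::TextValue),
];

// First pass: reduce the object to a kind, ignoring any extra fields.
fn classify(obj: &Object) -> Option<TaskKind> {
    KINDS
        .iter()
        .find(|(keys, _)| keys.iter().all(|k| obj.contains_key(*k)))
        .map(|(_, kind)| *kind)
}

// Second pass: match on the kind, but read from the whole object.
fn describe(obj: &Object) -> Option<String> {
    let kind = classify(obj)?;
    let text = match kind {
        TaskKind::SelectedValue => format!("selected {}", obj["select_label"]),
        TaskKind::TextValue => format!("text {}", obj["task_label"]),
        other => format!("tool {} ({:?})", obj["tool_label"], other),
    };
    Some(text)
}
```

Fields like junk1 and junk2 never show up in the table, so they can come and go without affecting which kind is picked.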
