More efficient way to check for Url query parameter

I'm writing a web application with axum, and came up with the following to get a next_url query parameter from the current url (this is executed in the context of a handler, with the HeaderMap extractor), returning a default value of "/" in case REFERER header is unavailable, or the query key doesn't exist:

fn get_next_url(headers: HeaderMap) -> String {
    let next_url = if let Some(referer_url) = headers
        .get("REFERER")
        .map(|x| x.to_str().ok())
        .unwrap_or_else(|| None)
        .map(|x| Url::from_str(x).ok())
        .unwrap_or_else(|| None)
    {
        let query_map: HashMap<String, String> =
            referer_url.query_pairs().into_owned().collect();

        match query_map.get("next_url") {
            Some(next_url) => next_url.clone(),
            None => "/".to_string(),
        }
    } else {
        "/".to_string()
    };
    next_url
}

This seems.. awfully verbose. I was hoping to do one long chain of maps and unwrap_or_elses, but got stuck on the temporary HashMap that had to be collected into, hence the match block.

Is there a better way of doing this?

Thanks!

How about (untested):

fn get_next_url(headers: HeaderMap) -> String {
    headers
        .get("REFERER")
        .and_then(|x| x.to_str().ok())
        .and_then(|x| Url::from_str(x).ok())
        .and_then(|referer_url| {
            referer_url
                .query_pairs()
                .find_map(|(k, v)| (k == "next_url").then(|| v.into_owned()))
        })
        .unwrap_or_else(|| "/".to_string())
}

This does have a subtle difference from your version: If the referer has multiple next_url params, your version returns the last one, while mine returns the first.

I came up with pretty much the same thing. The takeaways I would highlight are:

option.map(|x| ..).unwrap_or_else(|| None)
// Is
option.and_then(|x| ..)

And Iterator::find_map (or Iterator::find).
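
To make the first takeaway concrete, here is a minimal std-only sketch. The function names are made up for illustration, and parsing a number stands in for the header-to-Url steps from the thread; the point is only that both forms give the same result.

```rust
/// Verbose form: `map` builds an `Option<Option<u32>>`, then
/// `unwrap_or_else(|| None)` strips the outer layer again.
fn parse_verbose(input: Option<&str>) -> Option<u32> {
    input.map(|s| s.parse::<u32>().ok()).unwrap_or_else(|| None)
}

/// Concise form: `and_then` never builds the nested `Option`.
fn parse_concise(input: Option<&str>) -> Option<u32> {
    input.and_then(|s| s.parse::<u32>().ok())
}
```

Both functions return None when the input is missing or does not parse, and Some(n) otherwise.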

If you care about that difference (first versus last next_url), instead of find_map you could chain together filter, last, and map.

(There are better ways if this was a double-ended iterator, but it appears that it is not.)
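
A rough sketch of the two variants, assuming the query pairs have already been turned into owned (String, String) tuples (which is roughly what query_pairs().into_owned() yields). The helper names first_value and last_value are hypothetical, and a plain iterator stands in for the url crate so the example only needs std.

```rust
/// Returns the value of the first pair whose key equals `key`.
fn first_value(
    mut pairs: impl Iterator<Item = (String, String)>,
    key: &str,
) -> Option<String> {
    pairs.find_map(|(k, v)| (k.as_str() == key).then(|| v))
}

/// Returns the value of the last pair whose key equals `key`.
/// `last` has to walk the whole iterator, which is unavoidable
/// when the iterator is not double-ended.
fn last_value(
    pairs: impl Iterator<Item = (String, String)>,
    key: &str,
) -> Option<String> {
    pairs
        .filter(|(k, _)| k.as_str() == key)
        .last()
        .map(|(_, v)| v)
}
```

With pairs equivalent to ?next_url=/a&x=1&next_url=/b, the first helper gives "/a" and the second gives "/b", which matches the HashMap-based version where later keys overwrite earlier ones.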

@quinedot @jwodder Thanks! and_then seems like something I should get familiar with.

.unwrap_or_else(|| None) is also spelled flatten(). And any time you have map().flatten() you can use and_then instead. In fact, on iterators the same operation is called flat_map.
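
A small std-only sketch of those spellings, with made-up function names. Option::flatten removes one layer of nesting, and Iterator::flat_map accepts a closure returning an Option because Option implements IntoIterator.

```rust
/// `map(..).flatten()` on an `Option`: same result as `and_then`.
fn parse_flatten(input: Option<&str>) -> Option<u32> {
    input.map(|s| s.parse::<u32>().ok()).flatten()
}

/// The iterator counterpart: `flat_map` keeps every `Some` value
/// and silently drops every `None`.
fn parse_all(inputs: &[&str]) -> Vec<u32> {
    inputs
        .iter()
        .flat_map(|s| s.parse::<u32>().ok())
        .collect()
}
```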

Right so, IIRC, if your closure / function returns some Option, then and_then is preferable so you don't add another layer of Option.

Is there ever a situation where you would want to use map on a closure that returns an Option, i.e., another layer of Option makes sense?

Is there ever a situation

Liar's paradox notwithstanding, saying something is always/never true is never correct. Even in math, at best something is true up to axioms. Now some things are "rarely" true/false (for some notion of "rarely"). So to literally answer your question: yes, such situations exist.

Any time you might not have a value that itself might not exist, you have a case for Option<Option<T>>. There may be other ways to model that, depending on one's taste. For example, the Clippy lint option_option exists to dissuade one from nesting Options; however, whatever you replace it with (e.g., a separate enum) serves the same purpose that Option<Option<T>> would. It's not too different from asking when one should distinguish among null, an empty map/string, and the lack of a key-value pair altogether in JSON. There are times when one wants to distinguish among those three possibilities even if one normally wants to treat them equivalently.
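
As a std-only illustration of those three possibilities: HashMap::get on a map whose values are Option<String> already hands back a nested Option. The Lookup enum and classify function below are hypothetical names, showing the kind of separate enum one might swap in.

```rust
use std::collections::HashMap;

/// A dedicated enum carrying the same three states as
/// `Option<Option<String>>`.
#[derive(Debug, PartialEq)]
enum Lookup {
    /// The key is not in the map at all.
    Missing,
    /// The key is present, but its value is `None` (think JSON null).
    Null,
    /// The key is present with a real value.
    Value(String),
}

fn classify(map: &HashMap<String, Option<String>>, key: &str) -> Lookup {
    // `get` returns `Option<&Option<String>>`: the outer layer says
    // whether the key exists, the inner one whether it has a value.
    match map.get(key) {
        None => Lookup::Missing,
        Some(None) => Lookup::Null,
        Some(Some(v)) => Lookup::Value(v.clone()),
    }
}
```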

The other case, arguably more common than intentionally having an Option<Option<T>>, is needing to satisfy a trait's definition. You may have to wrap a generic thing in an Option; but if the "thing" you have is already an Option, then you're forced to generate a nested Option.
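
One familiar instance of this is the Iterator trait: next always returns Option<Self::Item>, so when the items themselves are Options the nesting comes for free. The generic helper next_item below is just a made-up wrapper to make the return type visible.

```rust
/// `Iterator::next` wraps whatever the item type is in an `Option`.
/// If `I::Item` is itself `Option<T>`, the result is `Option<Option<T>>`:
/// the outer `None` means "iterator exhausted", while `Some(None)`
/// means "there was an item, and that item was `None`".
fn next_item<I: Iterator>(iter: &mut I) -> Option<I::Item> {
    iter.next()
}
```

Calling it repeatedly on vec![Some(1), None].into_iter() yields Some(Some(1)), then Some(None), then None.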

If that's too much mumbo jumbo, then here is an explicit example relating it back to JSON. Let's say I have the following JSON schema:

{
  "x": null | <string>,
  "y": null | <bool>
}

I require x and y to exist. Why would I require them while still allowing for null? If one sends a JSON map without one of the keys, then there's a chance they didn't even know that that is something I care about; but had they known, they would have sent that information. Unfortunately while I would like that data, there are times one may simply not have that info. By requiring one to explicitly send null instead of omitting the key altogether, they at least "prove" to me that they know about this information I seek. They'll send null when they don't have the data.

I require the keys to exist, so I need some way to detect whether I received that key (e.g., Option<T>). If, after parsing the map, I have None, then I know they didn't send the key and I'll error. If they do send the key but send the value as null, then I need a way to model that too (e.g., Option<T>). Combining both of those, we now see my local variable is Option<Option<T>>. Or in code:

use core::fmt::{self, Formatter};
use serde::de::{Deserialize, Deserializer, Error, MapAccess, Visitor};
struct Foo {
    x: Option<String>,
    y: Option<bool>,
}
impl<'de> Deserialize<'de> for Foo {
    fn deserialize<D>(deserializer: D) -> Result<Self, D::Error>
    where
        D: Deserializer<'de>,
    {
        struct FooVisitor;
        impl<'d> Visitor<'d> for FooVisitor {
            type Value = Foo;
            fn expecting(&self, formatter: &mut Formatter<'_>) -> fmt::Result {
                formatter.write_str("Foo")
            }
            fn visit_map<A>(self, mut map: A) -> Result<Self::Value, A::Error>
            where
                A: MapAccess<'d>,
            {
                enum Field {
                    X,
                    Y,
                }
                impl<'e> Deserialize<'e> for Field {
                    fn deserialize<D>(deserializer: D) -> Result<Self, D::Error>
                    where
                        D: Deserializer<'e>,
                    {
                        struct FieldVisitor;
                        impl Visitor<'_> for FieldVisitor {
                            type Value = Field;
                            fn expecting(&self, formatter: &mut Formatter<'_>) -> fmt::Result {
                                write!(formatter, "'{X}' or '{Y}'")
                            }
                            fn visit_str<E>(self, v: &str) -> Result<Self::Value, E>
                            where
                                E: Error,
                            {
                                match v {
                                    X => Ok(Field::X),
                                    Y => Ok(Field::Y),
                                    _ => Err(E::unknown_field(v, FIELDS)),
                                }
                            }
                        }
                        deserializer.deserialize_identifier(FieldVisitor)
                    }
                }
                let mut x = None;
                let mut y = None;
                while let Some(key) = map.next_key()? {
                    match key {
                        Field::X => {
                            if x.is_some() {
                                return Err(Error::duplicate_field(X));
                            }
                            x = map.next_value().map(Some)?;
                        }
                        Field::Y => {
                            if y.is_some() {
                                return Err(Error::duplicate_field(Y));
                            }
                            y = map.next_value().map(Some)?;
                        }
                    }
                }
                x.ok_or_else(|| Error::missing_field(X)).and_then(|x| {
                    y.ok_or_else(|| Error::missing_field(Y))
                        .map(|y| Foo { x, y })
                })
            }
        }
        const X: &str = "x";
        const Y: &str = "y";
        const FIELDS: &[&str; 2] = &[X, Y];
        deserializer.deserialize_struct("Foo", FIELDS, FooVisitor)
    }
}
fn main() {
    // This is bad since `x` is not sent.
    let bad_json = "{\"y\":null}";
    assert!(serde_json::from_str::<Foo>(bad_json).is_err());
    // This is good since both `x` and `y` are sent even if the
    // values are `null`.
    let good_json = "{\"x\":null,\"y\":null}";
    assert!(
        serde_json::from_str::<Foo>(good_json)
            .map_or(false, |foo| foo.x.is_none() && foo.y.is_none())
    );
}

Of course there are literally an infinite number of examples, including ones that justify the need for 100 nested Options; admittedly there is probably a better/more efficient way. The probability of you needing/wanting such a thing decreases the more nested Options you have, though; but like almost anything, it won't ever truly reach 0.

I probably need to start another thread, but this is a real dilemma for me. The Rust way to specify that there is no data is Option::None.

But for web, it can be:

  • ?var=
  • ?

So I use Some("") for the first case and None for the second. The distinction can matter in some cases, but if it doesn't, why do I need an extra unwrap? I simply use "" for both a missing variable and an empty value.
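
For what it's worth, here is a rough std-only sketch of both approaches. The function names are hypothetical and the parsing is deliberately naive (no percent-decoding, and the leading ? is assumed to be stripped already); it only shows that the lossy version can be layered on top of the precise one.

```rust
/// Looks up `key` in a raw query string such as "var=&other=1".
/// Returns `None` when the key is absent and `Some("")` when the
/// key is present with an empty value.
fn query_value<'a>(query: &'a str, key: &str) -> Option<&'a str> {
    query
        .split('&')
        .filter(|pair| !pair.is_empty())
        .find_map(|pair| {
            // A bare "var" with no '=' is treated like "var=".
            let (k, v) = pair.split_once('=').unwrap_or((pair, ""));
            (k == key).then_some(v)
        })
}

/// Lossy variant: a missing key and an empty value both become "".
fn query_value_or_empty<'a>(query: &'a str, key: &str) -> &'a str {
    query_value(query, key).unwrap_or_default()
}
```

Keeping the Option-returning function around means the distinction is still available later, while callers that don't care pay no extra unwrap.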

Indeed, this should be a separate thread, since the answer depends on your specific needs and desires. In the worst case, you may find out down the road that you do need to distinguish between a missing key-value pair and a key whose value is "", even if only some of the time, and then you will have to refactor your code so that it no longer maps both situations to the same thing. Conversely, if you don't abstract enough, you may find that you always treat multiple situations exactly the same, in which case it may behoove you to refactor your code to perform the lossy conversion immediately for the sake of simplicity.