Option.max and min


#1

During some Advent of Code I had to find the max and the min in a list of integers, I could find them in two steps, but why not in a single one?

So I stepped in a common problem, set the initial value for the first comparison, and Option seems to be the right type for this, but it works only on the max side:

  • None.max(Some(1)) -> Some(1)
  • None.min(Some(1)) -> None

I supposed None to be the “unwanted” value, both for max and min, and that any Some value should win.

I ended up implementing my minimal enum with an Unset and a min and max impls, but the question remains.

Can someone explain why min is like it is and so different from max?

https://play.rust-lang.org/?gist=13fa2179a70dbb0bce94dcd72b72c985&version=stable

Thanks!


#2

Your code there doesn’t compare the integer values, it compares Option::None and Option::Some. Since Option is an enum and None is smaller than Some you get this result.

To get min/max values of a list you’d use someting like

[2, 5, 4, 1, 999, 8].iter().min()
[2, 5, 4, 1, 999, 8].iter().max()

#3

Yes, the two iters was the “I could find them in two steps” solution :slight_smile:

The problem IMHO is that None is neither bigger nor smaller of any Some value, it is an out of band value.

Are we doing some kind of fallback and consider the size of the type? In that sense None is smaller than Some, but IMHO it is a bit unexpected.

Do someone has a better explanation?

PS

I’ve followed the docs, but https://doc.rust-lang.org/std/option/enum.Option.html#impl-Ord max points to https://doc.rust-lang.org/src/core/cmp.rs.html#460-463 and I’m lost.


#4

In the itertools crate there’s a function to find min and max in a single pass on the items:

https://docs.rs/itertools/0.7.4/itertools/trait.Itertools.html#method.minmax

(But I think it’s badly named (I’d like it to be named min_max()) and I think its return type is over-engineered (I’d like it to return just a Option((T, T)) )).

If you don’t want to use itertools then you can find the min and max in a single pass with code like:

fn main() {
    let items = [4, 1, 7, 3, 10];

    let (min, max) = items
                     .iter()
                     .fold((items[0], items[0]), |acc, &x| (acc.0.min(x), acc.1.max(x)));
    println!("{}, {}", min, max);
}

#5

I’d like to play (in part to avoid the access by index for the first element, in part just for fun), so I did this:

    enum Opt {
        Unset,
        Val(i32)
    }
    impl Opt {
        fn max(self, other: Self) -> Opt {
            match (&self, &other) {
                (&Opt::Unset, _) => other,
                (&Opt::Val(a), &Opt::Val(b)) => Opt::Val(a.max(b)),
                _ => self
            }
        }
        fn min(self, other: Self) -> Opt {
            match (&self, &other) {
                (&Opt::Unset, _) => other,
                (&Opt::Val(a), &Opt::Val(b)) => Opt::Val(a.min(b)),
                _ => self
            }
        }
    }

to be used as:

        let (max, min) = line.split("\t").map(|i| Opt::Val(i.parse::<i32>().expect("some num"))).fold(
            (Opt::Unset, Opt::Unset),
            |(a, b), i| (
                a.max(i),
                b.min(i)   //  `Option` does not work here, so I had to write a lot 
                //if let Some(a) = a { Some(if a > i { a } else { i })} else { Some(i) },
                //if let Some(b) = b { Some(if b < i { b } else { i })} else { Some(i) }
            )
        );

To workaround the fact that None.min does not work, but the real question here is about understanding why None.min is designed this way (you cannot escape from None), perhaps there are some use cases and I’d like to understand a bit more about this.


#6

Oh, so that’s where you got your initial None. You could sidestep this whole issue if you initialised your values with std::i32::MIN and std::i32::MAX like this:

let (max, min) = line.split("\t")
    .map(|i| i.parse::<i32>().expect("some num"))
    .fold((std::i32::MIN, std::i32::MAX), |(a, b), i| {
        (
            a.max(i),
            b.min(i),
        )
    });

(You’d just have a problem with empty lists)

But back to your question about Option and Ordering: That’s just how it is with enums. The functionality was influenced by Haskell but the name escapes me. I thought it was something like “value types”.
In a nutshell: The value of a declaration in an enum depends on the order of appearance (like C enums if you don’t define a value).

#[derive(Debug, Ord, PartialOrd, Eq, PartialEq)]
enum E {
    A,
    B,
    C,
}

fn main() {
    let list = [E::A, E::B, E::C];

    println!("min: {:?}", list.iter().min()); // => E::A
    println!("max: {:?}", list.iter().max()); // => E::C
}

#7

I often wish that Option<T:Ord> was only PartialOrd, not Ord, so that Some and None are incomparable. The fact that min and max work so differently being one of the biggest reasons.


#8

Which would always require T: Clone, and that is too narrow.


#9

Right, but the minmax in Itertools is kind of useless… So what’s a better solution?


#10

Why is it useless?

I guess the over-engineering of it is just keeping with Rust’s general style :wink:. Itertools has committed to one lowering of that bar: Minmax uses the PartialOrd trait, not Ord (and the rationale is: it’s up to your program to decide if it makes sense to use it with floats or not).


#11

This idea of having Some(_) and None incomparable seems to work out quite well if you think of the comparison operators. Some(1) <op> None would be false for all operators <, <=, >, >=.


#12

In my opinion the only purpose a min_max() should be to scan an iterable only once to find both its min and max. Everything else makes its API less usable, less intuitive, less composable, harder to remember, more baroque. I’m probably not going to use that minmax function.

Thankfully in Rust and its std lib that’s often false :slight_smile:

What’s the behaviour of minmax when the call to a partial_cmp returns None?

(By the way, Itertools is great and I use it almost daily, this is meant just as a comment on a small part).


#13

That is exactly what the function does. In the case of Clone types, you can call into_option which returns what you expected.

It might not meet your definition of a pretty API, but to call it useless is disingenuous.


#14

Nobody’s mentioned the reason yet why comparison of options works this way.

Plain and simply: Generic code.

Having Option<T> form a total order when T forms a total order is nice because then when you have a struct that contains an Option<T> member, you can still #[derive(Ord)] and thus be able to sort vectors of that type and use algorithms like binary search, and make BTreeMaps of it, etc…


Also note: Ord::{min, max} is an extremely recent addition. The fact that it means that Option::<T>::{min, max} now exists could be considered, I guess, the fallout of this addition. Nobody in their right mind would ever call .min() or .max() on an Option.


#15

In general, a.min(b) and a.max(b) have the same value only if a equals b. As None is not equal to Some(1), it is correct that None.min(Some(1)) and None.max(Some(1)) are not equal.
The expression None < Some(x) is true for any x, and so the value of None.max(Some(x)) is Some(x) for any x, and the value of None.min(Some(x)) is None for any x.


#16

Ord::min is just a new name for std::cmp::min. But the specific method isn’t even the problem.

For example, Iterator has min & max too. How do you find the largest non-none option in an iterator? Iterator::max! How do you fine the smallest non-none option in an iterator? Did you think Iterator::min? Because it’s not. (I’ve seen this kind of thing come up numerous times on IRC. The fact that the solutions aren’t symmetric is just bad. It’s all the same reasons that NaN means that float shouldn’t be Ord, which thankfully Rust gets right.)

What it does is clear. Why it does it is the problem. Is there any obvious reason why None is smaller, rather than larger, than Some?


#17

No… but nonetheless, a decision had to be made for the sake of generic code.

In my honest view, I do not always consider None as the “absence” of a value. Sometimes I do; and other times, I consider it to be just another value (where Option is used to extend a type with a single additional value).


This problem doesn’t just affect Option. Many types that have normally have no business at all being compared are given PartialOrd and Ord implementations simply for the sake of supporting algorithms. A crate like num-complex has to make a tough decision:

  • if Complex does not derive Ord, you won’t be able to do cool things like construct a BTreeMap whose keys are gaussian integers.
  • If Complex does derive Ord, everybody’s code will be full of heinous bugs where they accidentally used the < operator on complex numbers.

All things considered, I wish Rust did not conflate the ordered comparisons we care about (i.e. the places where we actually want to use < and max) together with the ordered comparisons that exist because they can (i.e. the kind that are used by BTreeMap and algorithms). I imagine an alternate universe with three traits:

// * can use #[derive(Lex)]
// * used in bounds for BTreeMap
// * there might also be a Vec::lex_sort
/// Trait for lexical comparisons.
trait Lex {
    fn lex_cmp(&self, other: &Self) -> Ordering;
}

// * cannot be derived (it now serves a role like Add and Sub)
/// Trait for the `<`, `>`, and etc. operators.
trait PartialOrd {
    fn partial_cmp(&self, other: &Self) -> Option<Ordering>;
}

// * cannot be derived
// * used by Vec::sort, Iterator::min, Iterator::max
trait Ord: PartialOrd + Eq {
    fn cmp(&self, other: &Self) -> Ordering { ... }
    fn min(&self, other: &Self) -> &Self { ... }
    fn max(&self, other: &Self) -> &Self { ... }
}

#18

And on one final note, allow me to share the state of things in C++:

C++ still does not have tagged unions but it does have an Option type. Without tagged unions, one might ask, how do people work with them? Easy! By having a separate type that represents None, and using overloading.

rust

let mut opt;
opt = None;
opt = Some(3);
let x = opt.unwrap_or(4);

c++

// The following "just works" because optional's operator= has overloads
//  for arguments of nullopt_t and T.
std::optional<int> opt;
opt = nullopt;
opt = 3;
int x = opt ? *opt : 4;

Like rust, C++ has a heavy focus on using generic code for algorithms. So comparisons of Options have the same apparently broken semantics as Rust, to ensure that they form a total order.

rust

let a = Some(0);
let b = None;
assert!(b < a);

c++

std::optional<int> a = 0;
std::optional<int> b = nullopt;
if (!(b < a))
    throw std::runtime_error("something's seriously wrong, yo");

Alright. Okay. Seems reasonable so far.

But the C++ committee didn’t stop there. For reasons I can hardly even begin to imagine (though I’m sure they must have had a specific use case in mind, like generic lambdas or somesuch), they decided that the comparison operators also needed to work seamlessly in the same manner as assignment.

c++

std::optional<int> a = 0;
std::optional<int> b = nullopt;
if (!(nullopt < a && b < 0))
    throw std::runtime_error("something's seriously wrong, yo");

That’s right. Not only does C++ let you compare Options with comparison operators (with the same seemingly broken semantics as rust), but it also lets you compare an Option<T> with a T! With those very same semantics!!

This was, of course, the very first mistake I made when I recently gave C++17 a try.

…so if you ask me, I say we count our blessings!


#19

Thank you all for your answers,

to sum up:

  • Option<T> implements Ord, this allows you to have a total order if T is Ord, and then to use Option<T> as a key (or as field inside a key) in a binary tree or similar things.
  • Recently (from version 1.21 of Rust), Ord::max and Ord::min have been defined, and therefore also Option has inherited them.
  • Following what is a little bit the default of enum sorting, None is less than Some(x) for any x, and therefore None.min(x) is always None.

In order to understand the importance of total order we can think of floats, nan is the evil for every sorting algorithm, because both nan < x and nan > x are false, so array.sort() has implementation-dependent behavior, in Python for example the sorting seems to make sense until the first nan, but only by chance, an equivalent algorithm, which orders from the bottom of the vector or which, instead of using an if a < b, would use an if b > a could have a different result.

>>> a
[3,2,4, nan, 1,3,4]
>>> sorted(a)
[1,2,3,4, nan, 3,4]

Perhaps the fact that Option implements Ord is the thing that most misled me, but I have not a strong opinion about having it PartialOrd, it seems to be a trade-off on both sides:

  • On one side you can find the maximum not None item from a list, but you cannot find the minimum till you have not popped out all the Nones.
  • On the other side, you shouldn’t have had the list at all, or, if you tried to make a magical Opt enum that allows you to extract both the min and the max from the not Unset values, it could (would?) have been implementation dependent (not sure but seems a lot).

Beside that, I second the @ExpHP LexOrd idea, in that alternate universe, even floats may be sorted in some algorithm independent way.


#20

In a recent revision of IEEE 754 (2008 iirc), floats gained a total order. Given that I’m only 27, hopefully I will see a compliant implementation become popular before I die :stuck_out_tongue: