Writing Ord::cmp for struct

Hello,

I implemented Ord::cmp below, and I feel there should be an easier way. Is there? The reason I'm not deriving Ord is because the docs say sorting is lexicographic, which would mean 10 comes before 9, but maybe I'm misunderstanding the docs? Thanks!

#[derive(Eq, PartialEq)]
pub(crate) struct Variant {
    chrom: String,
    pos: usize,
    ref_allele: String,
    alt_allele: String
}

impl PartialOrd<Self> for Variant {
    fn partial_cmp(&self, other: &Self) -> Option<Ordering> { Some(self.cmp(other)) }
}

impl Ord for Variant {
    fn cmp(&self, other: &Self) -> Ordering {
        match self.chrom.cmp(&other.chrom) {
            Ordering::Less => { Ordering::Less }
            Ordering::Greater => { Ordering::Greater}
            Ordering::Equal => {
                match self.pos.cmp(&other.pos) {
                    Ordering::Less => { Ordering::Less }
                    Ordering::Greater => { Ordering::Greater }
                    Ordering::Equal => {
                        match self.ref_allele.cmp(&other.ref_allele) {
                            Ordering::Less => { Ordering::Less }
                            Ordering::Greater => { Ordering::Greater}
                            Ordering::Equal => {
                                self.alt_allele.cmp(&other.alt_allele)
                            }
                        }
                    }
                }
            }
        }
    }
}

Best, Oliver

9 < 10 for derive(Ord). "Lexicographic" here doesn't mean elements are converted to strings, it means struct elements are compared as a sequence of symbols, in your case 4 symbols, the first one being the most significant, the second one second-most significant, etc.

In other words, the derived Ord is exactly what you implemented.

5 Likes

You can use the cargo expand to see the code the #[derive(...)] generate.

Apart from deriving as highlighted by other answers, Ordering::then and Ordering::then_with would greatly simplify your code.

1 Like

Thanks, all!

While the derive is what you want in this case, for the future you can solve the nesting with something like this:

impl Ord for Variant {
    fn cmp(&self, other: &Self) -> Ordering {
        use Ordering::*;

        if let e @ (Less | Greater) = self.chrom.cmp(&other.chrom) {
            return e;
        }

        if let e @ (Less | Greater) = self.pos.cmp(&other.pos) {
            return e;
        }

        if let e @ (Less | Greater) = self.ref_allele.cmp(&other.ref_allele) {
            return e;
        }

        if let e @ (Less | Greater) = self.alt_allele.cmp(&other.alt_allele) {
            return e;
        }
    
        Equal
    }
}

https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=38dd2ee1d6fd70590e7837f68dd131f4

2 Likes

Actually, here's a shorter way by piggy-backing on the Ord impl for tuples, for when you want something other than what the derive does:

impl Ord for Variant {
    fn cmp(&self, other: &Self) -> Ordering {
        fn tuple_view(x: &Variant) -> impl Ord + '_ {
            (&x.pos, &x.alt_allele, &x.chrom)
        }
    
        Ord::cmp(&tuple_view(self), &tuple_view(other))
    }
}

https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=544e36caa557481ff86dc5a4e55931b8

2 Likes

It's perhaps best to understand that for strings, characters from A-Z are sorted as less than any character from a-z. So "alice", "Bert", "Clare" will after sorting be in order "Bert", "Clare", "alice".

You can also use a chain of Ordering::then_with.

impl Ord for Variant {
    fn cmp(&self, other: &Self) -> Ordering {
        self.chrom
            .cmp(&other.chrom)
            .then_with(|| self.pos.cmp(&other.pos))
            .then_with(|| self.ref_allele.cmp(&other.ref_allele))
            .then_with(|| self.alt_allele.cmp(&other.ref_allele))
    }
}

or

impl Ord for Variant {
    fn cmp(&self, other: &Self) -> Ordering {
        self.chrom.cmp(&other.chrom).then_with(|| {
            self.pos.cmp(&other.pos).then_with(|| {
                self.ref_allele
                    .cmp(&other.ref_allele)
                    .then_with(|| self.alt_allele.cmp(&other.ref_allele))
            })
        })
    }
}

(I don’t know if the second kind of nesting performs better; the first is more readable with rustfmt)

Rust's built-in ordering for strings is not really useful except for it's utility in a data structure like a BTreeMap. Sorting of strings for human consumption is significantly more complex (it will involve some type of unicode normalization, and will be locale-dependant).

1 Like

I think that's slightly over-stating it. Say you are presenting a drop-down list, and the choices are sorted byte-wise. It's unlikely anyone will notice minor details in the sort order ( and in English, with typical capitalisation, it probably won't make any difference ). Most of the time I would say it's "good enough" for human consumption, at least for English.

This topic was automatically closed 90 days after the last reply. We invite you to open a new topic if you have further questions or comments.