Proper way to use sort_by_key for struct field?

I have a simple struct:

#[derive(Debug, Clone)]
pub struct LabeledStrings {
    label: String,
    strings: Vec<String>,
}

impl LabeledStrings {
    pub fn new(label: String, strings: Vec<String>) -> LabeledStrings {
        LabeledStrings { label, strings }
    }

    pub fn get_label(self) -> String {
	self.label
    }

    pub fn get_vals(self) -> Vec<String> {
	self.strings
    }
}

I have a Vector of these things, and I want to sort it by the labels:

   let mut  keys: Vec<LabeledStrings> =  ...
    keys.sort_by_key(|ls| ls.clone().get_label());

When I tried it without the .clone() the compiler complained that I was moving ls. I am missing something in how to use methods like map, filter, sort_by_key when the elements are structures, and thus the parameters are references to structures. to get rid of the need to clone, do I need to make the methods take &self instead of self?
Thanks,
Joel

As written, these functions are consuming the value of self (moving ownership from the caller), so they can only be called once.

sort_by_key is designed to only allow using a reference to the item, not moving out of the item.

You'll probably want your accessor functions to only use references:

    pub fn get_label(&self) -> &str {
        &self.label
    }

    pub fn get_vals(&self) -> &[String] {
        &self.strings
    }

Chapter 4 of the rust book is a good introduction/refresher.

2 Likes

I wondered if I should make the parameters references. That I can do. I don't think I can make the returns references. Making the returns references is somewhat more complex, as I need to end up with copies. But clearly, sort_by_key doesn't need a copy.
Thanks,
Joel

Making the parameter a reference did remove the need to clone the argument in sort_by_key. Making teh result a reference works... But I have to clone in the closure for sort_by_key because the compiler otherwise complains about the &String lifetime. (Still learning to think properly about lifetimes.)
Joel

Yes, this is a classic problem. sort_by_key doesn't for for keys that aren't already references because the closure wants -> T not ∀'a, -> T<'a>.

You'll just have to use sort_by instead.

Well… the first question to ask is: why they heck do you even need these methods at all?

Looks like some kind of OOP-language cargo-culting. In Rust the protection unit is not data structure, but module. Which means that you would see such accessors extremely rarely: they are both problematic and rarely useful.

This being said, sometimes you have to have them, because they could be imposed on you by some third-party API, thus discussion about how to do that “properly” is still useful. Just keep in mind that it's not something you have to use often, rather it's something that may use in some rare and exotic situations.

1 Like

These accessors are for structures that are being referenced between modules. I very much prefer to hide the implementation details of individual modules from other modules. Which is, as far as I can tell the default when i declare a struct to be pub in module A, and then use / reference it in module B. I probably separate modules more than many people would choose.
Yours,
Joel

Thanks. sort_by does what I need without the extra clone.
Joel

See also slice::sort_by_key has more restrictions than slice::sort_by ¡ Issue #34162 ¡ rust-lang/rust ¡ GitHub and How to set lifetime of slice::sort_by_key to make it support reference value? - #6 by Bruecki

1 Like

But for a type like LabeledStrings which seems to be a plain bag of data given meaningful names, exposing accessors to each of its fields doesn't feel like it actually hides the implementation details, so there's not much reason to avoid marking its fields pub. (Unless the type places additional invariants on its fields (invariants not already enforced by the fields' types).)

1 Like

There is a fair argument that LabeledStrings could have exposed its structure. I am also tempted to hide more of the structure.

In Rust, borrowing vs owning data cannot be hidden as an implementation detail.

Borrowing will be always visible in all types containing any borrowed references anywhere. There are also APIs that effectively forbid returning new data instead of borrowing (e.g. the Index trait).

This often means that memory management and ability mutate data is quite visible in public APIs and restricts what can they do in their private implementation.

Additionally, &mut means exclusive access in Rust, and &mut self means exclusive lock on all of self, which forbids having any other references to any data inside self anywhere else while making a &mut self call. This makes getters and setters much less flexible than direct field access. When fields are used directly, the borrow checker can see they don't overlap and allows using more than one field at the same time. When they're hidden behind getters, the borrow checker will take that abstraction with full seriousness and will not let it depend on which fields the getter/setter actually uses, and will restrict access to all fields when calling any single getter/setter.

Cost of hiding implementation details in Rust is quite high. In Rust it's done as the last resort after it's proven there's no way to avoid it. It's not done first just in case.

3 Likes

Given how often this comes up, I wonder if std should include a sort_by_key_ref() function. It wouldn’t be any special algorithm, just a wrapper for sort_by(), but it would be convenient for the common case of “sort by a non-Copy field of the elements”.

(Of course, it would not work for any return type that is not a single & reference, so there are still cases where you would need to resort to sort_by().)

1 Like

Thanks. For LabeledStrings, I see why using accessors would be better. (I only need borrow, I create new ones if I want to mutate them.) I am currently using RefCell to get around similar problems in other structures. I understand that is probably not a good idea. I think I can see how to get rid of some of them. But I have to work it through a step at a time.
Again, thanks for writing more explanation.
Yours,
Joel