More attention to linear-ness of types?

frankmcsherry · April 8, 2015, 12:08pm

Hi. I think this might be just idle thoughts, as interfaces have probably calcified as part of the beta shipping, but I thought I'd report on some experiences I've had recently butting my head against some of the library interfaces (mostly collections). These all fall under "my type doesn't implement Copy, and I don't want to have to clone it so much", which ends up being an issue for me in "big data" land.

Old news: The HashMap method remove(&Key) returns a Value, and drops the key that matched. "No problem", you say, because obviously I have a reference to a key. Annoyingly, I wanted to send that key along, and now I have to clone it. It could have returned (Key, Value) and I could drop the key or not as needed. This ends up being a pain in "word count" like examples where the keys are strings, and I get shown references to keys that need to be shipped.

Gankro mentioned that this was known, but that there was not a lot of motivation behind fixing it (perhaps just me, and I'm not particularly passionate).
The hashmap::Entry method or_insert_with(FnOnce()->V) would be super useful if it showed me what the key was. Unfortunately, I had to move it when I called entry(Key), and now I can't see it any more (without a clone). As best as I can tell, none of the Entry methods (or methods on their variants) expose a reference to the key.
There were several positive examples (interactions with Gankro et al) where a few Vec methods that would otherwise "drop" their backing memory were able to retain it (I think the Drain iterator came out of this).
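
To make the two HashMap points concrete, here is a minimal sketch of what "give the key back" and "show me the key" can look like. It assumes a toolchain newer than the one discussed in this thread, where std provides `HashMap::remove_entry` and `Entry::key`; the helper names `take_count` and `insert_len` are hypothetical and exist only for illustration.

```rust
use std::collections::hash_map::Entry;
use std::collections::HashMap;

/// Hypothetical helper: removes `word` and returns the owned key together
/// with its count, so the String can be shipped onward without a clone.
fn take_count(counts: &mut HashMap<String, u64>, word: &str) -> Option<(String, u64)> {
    // `remove_entry` returns `Option<(K, V)>` instead of dropping the key.
    counts.remove_entry(word)
}

/// Hypothetical helper: inserts a value computed from the key itself,
/// even though the key was moved into `entry(..)`.
fn insert_len(map: &mut HashMap<String, usize>, key: String) -> usize {
    match map.entry(key) {
        // Key already present: return the stored value.
        Entry::Occupied(e) => *e.get(),
        Entry::Vacant(e) => {
            // `key()` lends out a reference to the moved-in key.
            let len = e.key().len();
            *e.insert(len)
        }
    }
}
```

Calling `take_count(&mut counts, "hello")` on a map that holds `"hello"` yields the original `String` key plus its count, and the caller decides whether to drop the key or send it along.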

I'm sure I've bumped into other examples (sorry if this fixates on HashMap) where libraries drop or don't share data they got from me, and I want it back!

Move semantics and single ownership are great, but "with great power comes great responsibility". It seems like there are a lot of interfaces that could reveal more (references) and drop less.

Is this a well-understood design principle in languages with linear types? It seems like one might want to make a pass over pretty much all of the interfaces with an eye towards this principle, always exposing a maximally sharing minimally dropping interface, and then perhaps wrapping that in convenience methods (for example, the argument that people expect hashmap.remove(&key) to return just a value works for me, but it is a bummer to not have the more powerful interface somewhere).

I'm happy to take notes on things that I run into, especially if there is a story for how these notes might be useful (e.g. adding such interfaces would be backwards compatible). It is pretty easy to mock up examples where each of these pain points results in substantial overhead, but it also feels like there should be a good principle behind the interface designs that says "even if we don't know why you would want this back, we can at least give it to you and let you drop it".

huon · April 8, 2015, 12:25pm

Possibly-relevant previous discussion: Maps, Sets, and the value of Keys - libs - Rust Internals

frankmcsherry · April 8, 2015, 12:37pm

Yes, sorry. I should have pulled out the links I've gotten before. Gankro pointed me at this before, and I think we are on the same page in terms of "something better could be done", with the next question of "is it worth it?" less clear (or maybe very clear by now).

I do have a bunch of cases where it is worth it for me (mostly in terms of saved computation and allocation due to clones). I could scribble these down but perhaps that ship has sailed for a few years (or we should make a crate of clone-friendly collections, if that is the only problem space).

I suppose I'm trying to think of some good next steps that would be productive as I start to hit my head against this sort of thing more often.

frankmcsherry · April 23, 2015, 7:40am

Adding another example, Vec's retain:

fn retain<F>(&mut self, f: F)  where F: FnMut(&T) -> bool

just drops everything evaluating to false. It would be useful if those elements actually went somewhere, at which point I could drop them if that is actually what I want. Usually I do want to do something with them, or I wouldn't have put them in the Vec in the first place. T.T

For example, either letting me steal ownership:

fn retain<F>(&mut self, f: F)  where F: FnMut(T) -> Option<T>

or an action for dropped elements (which can just be |x| {} for the use above, wrapped up with a bow by the library).

fn retain<F, A>(&mut self, f: F, a: A)  where F: FnMut(&T) -> bool, A: FnMut(T)

These seem strictly more general, and concretely useful (for me). Is there room for negotiation on whether they can exist somewhere behind the usually exposed Vec methods? Should I file an "issue" about the lack, or write up an RFC, or what is the right way to get people thinking?

Alternatively, I suspect I can just write these and they are probably (?) simple enough that LLVM will elide the bounds checks and other stuff that retain is probably being smart about. On the other hand, I don't know what those things are, it is silly for everyone to re-write them, and it just seems wrong to do things badly in the first place.
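
For reference, a minimal sketch of hand-writing the second variant (a predicate plus an action for rejected elements) in safe code. The name `retain_or_else` is hypothetical, not a std method, and this is not how the library's `retain` is implemented: it swaps kept elements to the front and then drains the tail, so the Vec keeps its backing memory, but rejected elements may reach the action in a different order than they appeared.

```rust
/// Hypothetical free function: keeps elements for which `keep` returns true
/// and hands every other element, by value, to `on_remove`.
fn retain_or_else<T, F, A>(v: &mut Vec<T>, mut keep: F, mut on_remove: A)
where
    F: FnMut(&T) -> bool,
    A: FnMut(T),
{
    // Number of kept elements so far; they occupy v[..kept] in original order.
    let mut kept = 0;
    for i in 0..v.len() {
        if keep(&v[i]) {
            v.swap(kept, i);
            kept += 1;
        }
    }
    // Everything past `kept` was rejected; move each element out to the caller.
    // `drain` leaves the allocation in place.
    for x in v.drain(kept..) {
        on_remove(x);
    }
}
```

Passing `|_| {}` as `on_remove` recovers the behaviour of plain `retain`, while passing something like `|x| rejected.push(x)` lets the caller keep ownership of what was filtered out.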
