Drain/retain for HashSet?

droundy · July 8, 2020, 5:50pm

I am wondering whether there is a nicely efficient way to remove and collect a subset of elements from a HashSet according to a predicate. My current best guess is to drain the entire set, pipe that through partition and add back in the subset that I want to retain. This seems terribly inefficient for an operation that seems like it could be done in place. Does anyone have a better suggestion?
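For reference, the drain-and-partition approach described above might look roughly like the following. This is only a sketch: the helper name `drain_matching` is made up for illustration, and both halves go through a `Vec` because `Iterator::partition` requires the two output collections to have the same type.

```rust
use std::collections::HashSet;
use std::hash::Hash;

/// Hypothetical helper: removes every element for which `pred` returns
/// true and returns the removed elements.
fn drain_matching<T, F>(set: &mut HashSet<T>, pred: F) -> Vec<T>
where
    T: Eq + Hash,
    F: FnMut(&T) -> bool,
{
    // Empty the whole set (its allocation is kept), splitting the
    // elements into "removed" and "kept" as they come out.
    let (removed, kept): (Vec<T>, Vec<T>) = set.drain().partition(pred);
    // Re-insert the elements we wanted to retain; each one is hashed again.
    set.extend(kept);
    removed
}
```

Every retained element is moved out and rehashed on the way back in, which is the inefficiency the question is about.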

mbrubeck · July 8, 2020, 5:57pm

There is an open issue for adding these methods to ~~HashSet~~ BTreeMap and BTreeSet, and a tracking issue for some of the implementation work involved.

mbrubeck · July 8, 2020, 7:44pm

Sorry, I got confused between HashSet and BTreeSet. Currently we have the following methods:

HashSet::retain is stable, but doesn't provide a good way to collect the removed elements.
HashSet::drain ~~and BTreeSet::drain~~ is stable, but does not take a predicate.
BTreeSet::drain_filter is unstable.

What you really want is HashSet::drain_filter, which does not exist yet. This is issue #59618.
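In the meantime, one workaround on stable is to clone the rejected elements from inside the `retain` closure. A minimal sketch, assuming `T: Clone` is acceptable; the helper name `retain_collect_removed` is hypothetical, not a std method:

```rust
use std::collections::HashSet;
use std::hash::Hash;

/// Hypothetical helper: removes the elements for which `should_remove`
/// returns true, in place, and returns clones of them.
fn retain_collect_removed<T, F>(set: &mut HashSet<T>, mut should_remove: F) -> Vec<T>
where
    T: Eq + Hash + Clone,
    F: FnMut(&T) -> bool,
{
    let mut removed = Vec::new();
    set.retain(|x| {
        if should_remove(x) {
            // `retain` only hands out `&T`, so the element cannot be
            // moved out; clone it before it is dropped.
            removed.push(x.clone());
            false // drop from the set
        } else {
            true // keep in the set
        }
    });
    removed
}
```

This avoids rehashing the retained elements, at the cost of one clone per removed element.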

ssomers · July 8, 2020, 10:22pm

Sorry, @mbrubeck, seems you were still a little confused. There is no BTreeSet::drain, because it's not very useful without range, and you can already do such a full drain on virtually any container with:

std::mem::take(&mut container).into_iter()
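Spelled out for a BTreeSet, that one-liner could be wrapped like this (the function name `drain_all` is just for illustration):

```rust
use std::collections::BTreeSet;

/// Hypothetical helper: empties `set` and returns its former contents.
fn drain_all<T: Ord>(set: &mut BTreeSet<T>) -> Vec<T> {
    // `mem::take` swaps in `BTreeSet::default()` (an empty set) and hands
    // back the old set by value, which can then be consumed.
    std::mem::take(set).into_iter().collect()
}
```

This works for any container that implements both `Default` and `IntoIterator`. Unlike `HashSet::drain`, the original allocation is not kept for reuse.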

droundy · July 8, 2020, 11:13pm

Thanks, that's exactly what I was looking for. It turns out that the hashbrown HashMap does have drain_filter which is a good second best for me at the moment. I love that the hashbrown crate seems to double as a way to access unstable features of the standard library.

mbrubeck · August 27, 2020, 5:49pm

A warning for anyone using the drain_filter method in the hashbrown crate: It was accidentally implemented with opposite semantics from the ones in std. Instead of removing items that match the predicate, it removes items that do not match.

The next major release of hashbrown will fix this by changing the behavior of drain_filter to match the standard library. Unfortunately, this is a silent breaking change (i.e., it changes behavior but does not trigger any compile-time errors or warnings). Existing crates that use these methods will become incorrect if they upgrade to hashbrown > 0.8 without also changing their own code.

https://github.com/rust-lang/hashbrown/issues/186
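To make the difference concrete, here is a std-only sketch of the two behaviors. Neither function is hashbrown's actual code; `remove_matching` and `remove_non_matching` are stand-ins written for illustration, built on `HashMap::drain`.

```rust
use std::collections::HashMap;
use std::hash::Hash;

/// Stand-in for the std-style semantics: removes and returns the entries
/// for which `pred` returns true.
fn remove_matching<K, V, F>(map: &mut HashMap<K, V>, mut pred: F) -> Vec<(K, V)>
where
    K: Eq + Hash,
    F: FnMut(&K, &V) -> bool,
{
    let (removed, kept): (Vec<(K, V)>, Vec<(K, V)>) =
        map.drain().partition(|(k, v)| pred(k, v));
    map.extend(kept);
    removed
}

/// Stand-in for the inverted semantics described above: removes and
/// returns the entries for which `pred` returns false.
fn remove_non_matching<K, V, F>(map: &mut HashMap<K, V>, mut pred: F) -> Vec<(K, V)>
where
    K: Eq + Hash,
    F: FnMut(&K, &V) -> bool,
{
    remove_matching(map, |k, v| !pred(k, v))
}
```

The two differ only by a negated predicate and have identical signatures, so the compiler cannot flag a call site that was written against the other behavior.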

TomP · August 27, 2020, 8:15pm

How did this get through testing? Weren't any of the std::collections::HashMap::drain() tests applied to hashbrown::HashMap::drain_filter()?

mbrubeck · August 27, 2020, 8:20pm

drain doesn't take a predicate, so none of its tests could catch this.

It might have helped to have shared tests between HashMap::drain_filter and BTreeMap::drain_filter, but the former was added to hashbrown before the latter was implemented in libstd, so this wasn't an option at the time.

My fix for the hashbrown issue added test cases adapted from std::collections.

mbrubeck · September 3, 2020, 8:27pm

hashbrown 0.9.0 is now released, and all users of its drain_filter methods will need to update their code when upgrading to this version.

