More flexible `Vec::retain`

I have found myself in a situation where I need to:

  • retain some elements of a Vec in place;
  • keep the retained elements in the same order as in the original one (i.e. no swap_remove-based solutions);
  • mutate elements before deciding whether to retain them or not;
  • stop early if some condition is met, in that case I don't care anymore about the remaining items (they can simply be dropped along with the original Vec);
  • mutably access the elements that have already been retained while deciding whether to retain the next ones.

Some solutions I considered:

  • Vec::retain_mut, which satisfies the first 3 conditions, but not the last 2
  • vec.into_iter().filter_map(...).collect::<Option<Vec<_>>() (where "keep" is mapped to Some(Some(item)), "remove" to None and "early stop" to Some(None)), but it doesn't satisfy the 5th requirement;
  • iterating over Vec's elements and building a separate Vec with the retained elements - this satisfies most requirements but is not in place.

Is there a way to do this while satisfying all the above requirements (possibly involving some third-party crate)?

2 Likes

You didn't say that you wanted a linear-time algorithm, in which case the tongue-in-cheek answer is that the quadratic solution is trivial.

A linear-time solution would involve counting the number of items retained and/or discarded so far, and appropriately moving a "hole" pointer that you swap the next to-be-discarded element with. I'll try to write it up soon.

1 Like

My bad, I was kind of assuming a linear algorithm since there's already one (see e.g. retain, what I'm missing are the ergonomics to safely use it with more flexibility)

I see, yeah that's also an option, though a bit less efficient (it will do more moves than necessary).


Also, I now realize I wasn't completly clear with the 4th point:

If I stop early I want to be able to detect this case after the retain ends.


Anyway, thank you, you gave me some great point. I would say the issue is almost solved (except for the additional moves)

Define "necessary". How do you otherwise do it in-place in linear time?

The (quadratic) solution above already allows you to do that.

In an unsafe implementation you would be able to only write the items to their correct location (that is, at most one move is required per retained element), leaving the old slot uninitialized (of course care needs to be taken for ensuring these don't end up in the final vec and/or are dropped). With swaps you would get a similar algorithm except instead of having uninitialized slots you would have to fill them with the removed elements, which causes additional moves that are not really necessary. The complexity remains the same, but it's slower by a constant factor.

Ah my bad, I now see I can distinguish this by whether the value returned is the length of the vec or a value smaller than that.

retain_mut gets &mut Element and is documented to visit things in order and takes FnMut, so you can use a stateful closure.

But this closure can't access the elements already retained, right? They are provided for duration of the call only.

Hmm, true. I suppose this is the "I want to iterate over before: &mut[T], here: &mut T, after: &mut [T]" problem in a different form, and I don't know any trivial way to do that, let alone while retaining.

Some implementations of retain_mut could actually give those slices, but it would preclude an implementation that tries to do data movement in runs (rather than element-at-a-time) so std probably wouldn't want to constrain itself by that API.

1 Like

This topic was automatically closed 90 days after the last reply. We invite you to open a new topic if you have further questions or comments.