One of the small but tricky pieces of code I occasionally have to write is a version of Python's groupby function.
This function takes a sequence and splits it into runs based on a particular key function. In the simplest case, it just splits the sequence into runs of equal elements: [1, 1, 1, 2, 3, 3, 2] -> [[1, 1, 1], [2], [3, 3], [2]].
I always have a hard time implementing it manually, because it's annoyingly easy to forget to deal with the last group, and then you have to duplicate the code for finishing a group (see, for example, this code from rust-analyzer where I've just macro-ed away the duplication).
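To make the problem concrete, here's a rough sketch of the naive loop. The name `group_by_naive` is made up for illustration, and this is not the rust-analyzer code, just the general shape of the issue: the "finish the group" step shows up once inside the loop and once more after it.

```rust
// Hypothetical helper, for illustration only.
pub fn group_by_naive<T, K: Eq>(
    xs: impl IntoIterator<Item = T>,
    key_fn: impl Fn(&T) -> K,
) -> Vec<(K, Vec<T>)> {
    let mut res = Vec::new();
    // The group currently being accumulated, if any.
    let mut current: Option<(K, Vec<T>)> = None;
    for x in xs {
        let key = key_fn(&x);
        if let Some((k, group)) = &mut current {
            if *k == key {
                // Same key as the running group: keep extending it.
                group.push(x);
                continue;
            }
        }
        // Key changed (or this is the first element): finish the old group...
        if let Some(done) = current.take() {
            res.push(done);
        }
        // ...and start a new one.
        current = Some((key, vec![x]));
    }
    // Easy to forget: the last group is still pending after the loop,
    // so the "finish the group" step has to be repeated here.
    if let Some(done) = current.take() {
        res.push(done);
    }
    res
}
```

Dropping the final `if let` still compiles, but silently loses the last run, which is exactly the bug that is easy to miss.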
I wonder if folks here could suggest nice solutions!
Here's the signature and some tests:
pub fn group_by<I, F, K, T>(xs: I, key_fn: F) -> Vec<(K, Vec<T>)>
where
    I: IntoIterator<Item = T>,
    F: Fn(&T) -> K,
    K: Eq,
{
    todo!("make the tests pass")
}

#[test]
fn tests() {
    assert_eq!(
        group_by(0..5, |&x| x % 3 == 0),
        vec![
            (true, vec![0]),
            (false, vec![1, 2]),
            (true, vec![3]),
            (false, vec![4])
        ],
    );
    assert_eq!(
        group_by(0..5, |&x| x % 3 == 1),
        vec![
            (false, vec![0]),
            (true, vec![1]),
            (false, vec![2, 3]),
            (true, vec![4])
        ],
    );
    assert_eq!(
        group_by(0..5, |&x| x % 3 == 0),
        vec![
            (true, vec![0]),
            (false, vec![1, 2]),
            (true, vec![3]),
            (false, vec![4])
        ],
    );
    assert_eq!(
        group_by(0..5, |&x| x),
        vec![
            (0, vec![0]),
            (1, vec![1]),
            (2, vec![2]),
            (3, vec![3]),
            (4, vec![4])
        ],
    );
    assert_eq!(group_by(0..5, |_| ()), vec![((), vec![0, 1, 2, 3, 4])]);
    assert_eq!(group_by(0..0, |_| ()), vec![]);
}
Note that this deliberately doesn't try to produce an iterator, as that requires some extra legwork.
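Since the result is a plain `Vec`, one possible way to sidestep the duplicated finishing step is to treat the last pushed group as the "current" one and extend it in place via `Vec::last_mut`. This is only a sketch of one approach, not necessarily the nicest one; the name `group_by_last_mut` is hypothetical and chosen so it doesn't clash with the `group_by` stub.

```rust
// Hypothetical name; same signature as the `group_by` stub in the challenge.
pub fn group_by_last_mut<I, F, K, T>(xs: I, key_fn: F) -> Vec<(K, Vec<T>)>
where
    I: IntoIterator<Item = T>,
    F: Fn(&T) -> K,
    K: Eq,
{
    let mut res: Vec<(K, Vec<T>)> = Vec::new();
    for x in xs {
        let key = key_fn(&x);
        match res.last_mut() {
            // Same key as the most recent group: extend that group in place.
            Some((k, group)) if *k == key => group.push(x),
            // Empty result or a different key: open a new group.
            _ => res.push((key, vec![x])),
        }
    }
    res
}
```

Every group lives in `res` from the moment it is opened, so there is nothing left to flush after the loop. The trade-off is that this trick leans on having the whole result in memory, so it doesn't carry over directly to an iterator-returning version.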
I've also made a git-clonable repo for convenience: GitHub - matklad/group-by-challenge