Split a slice into subslices with the same element

I have a sorted slice like this

let s = &[1, 1, 2, 3, 3, 3, 3, 4, 4, 4];

How to split it into the following subslices

[1, 1], [2], [3, 3, 3, 3], [4, 4, 4]

each of which all have the same element? An elegant implementation using iterators is desired. Thanks!

Mind sharing your own attempts at solving this and the difficulties that stood in the way?

I used a Vec of tuples to store the element and its counts. The code is like this:

fn main() {
    let slice = &[1, 1, 2, 3, 3, 3, 3, 4, 4, 4];
    let mut vec: Vec<(usize, usize)> = vec![];
    
    let mut elem = slice[0];
    let mut start = 1;
    let mut end = 0;
    for (idx, &val) in slice.iter().enumerate() {
        if val == elem {
            end += 1;
        } else {
            vec.push((elem, end - start + 1));
            elem = val;
            start = idx;
            end = idx;
       }
    }
    vec.push((elem, end - start + 1));
    println!("{:?}", vec);
}

But I am wondering whether there is a more elegant solution.

1 Like

It seems batching from the itertools crate can achieve the same while avoiding the need for a vector, but ultimately the solution will depend on what exactly you want to do with the resulting slices.

2 Likes

I got curious and decided to give this a try. Here is the algorithm I implemented:

  1. Save the first element (if any) and its index.
  2. Check if the next element is different from the saved element.
    If different:
    a. Take a slice from the index of the saved to the current index and push it to the results.
    b. Save the current element and its index.
    c. Go to 2.
  3. Continue to the next element (if any) and go to 2.
  4. When there are no more elements: Take a slice from the beginning of the last sequence to the end and push it to the results.

It ended up very similar to what you already did, but still slightly different. I'm sure it can be adapted as an iterator, as well. It's actually very similar to an iterator I made the other day. Anyway, you can test run it here: Rust Playground

1 Like

This is my real problem: to find the values that appear most often in a set of data. I finally solved it like this:

fn mode(v: Vec<f64>) -> Vec<f64> {
    let mut vec = v.clone();
    vec.sort_by(|a, b| a.partial_cmp(b).unwrap());

    let mut modes: Vec<f64> = vec![];
    let mut max_count = 0 as usize;
    let mut count = 0 as usize;
    let mut pivot = 0 as f64;
    for &val in vec.iter() {
        if val == pivot {
            count += 1;
        } else {
            if count >= max_count {
                if count > max_count {
                    max_count = count;
                    modes.clear();
                }
                modes.push(pivot);
            }
            pivot = val;
            count = 1;
        }
    }
    if count >= max_count {
        if count > max_count {
            modes.clear();
        }
        modes.push(pivot);
    }
    modes
}

fn main() {
    let v: Vec<f64> = vec![1.2, 4.5, 4.5, 2.3, 3.4, 4.5, 5.6, 2.3];
    assert_eq!(mode(v), vec![4.5]);
}

You can run the code here: Rust Playground. I would like to figure out a more efficient solution. Splitting the slice into subslices is a candidate approach.