Mutating duplicate elements of an iterator

Is there an iterator adaptor that can mutate each element of an iterator according to the predicate based on another element of that iterator? For my current problem, I need to increase the value of any duplicate usize element until there are no more duplicates in the vector. This is my solution currently:

let mut v = vec![0, 1, 1, 2, 3, 13, 14, 15, 16];
// Input will not always be sorted like above
v.sort_unstable();

let mut last: Option<usize> = None;

v.into_iter()
    .map(|mut y| {
        if let Some(x) = last {
            if y == x {
                y += 1;
            }
        }
        last = Some(y);
        y
    })
    .collect::<Vec<usize>>();

assert_eq!(result, vec![0, 1, 2, 3, 4, 13, 14, 15, 16]);

What I would like to know is:

  1. Is there is an adapter that can do the same without the external last variable? There are quite a few ways to deal with pairs in iterators such, peek, windows, chunks, and even reduce and fold, although none of them, AFAIK allow for mutating a single element.
  2. Is there an alternative to more efficiently achieve the same? for (0..100).into_iter(), with an additional 0 inserted at the front, Almost the entire vector will be increased by 1. Am I right in assuming that dedup also has the same caveat, in that each element following the first found duplicate will be mutated (the account for shifting)? In that case, I guess this is the only approach, but I would like to hear from someone more familiar with the topic.

I'd write something like this, personally:

    let mut v = vec![0i32, 1, 1, 2, 3, 13, 14, 15, 16];
    // Input will not always be sorted like above
    v.sort_unstable();

    let mut min = i32::MIN;

    for x in &mut v {
        *x = min.max(*x);
        min = *x + 1;
    }

    assert_eq!(v, vec![0, 1, 2, 3, 4, 13, 14, 15, 16]);
3 Likes

if you read the document for slice::windows(), it mentioned why there's no windows_mut(), but gives an example utilizing Cell for slice of Copy types. your code can be written as:

use std::cell::Cell;

let slice = &mut array[..];
let slice_of_cells: &[Cell<usize>] = Cell::from_mut(slice).as_slice_of_cells();
for window in slice_of_cells.windows(2) {
    if window[1].get() == window[0].get() {
        window[1].set(window[0].get() + 1);
    }
}

this example is just a worst case input for the problem, if your problem statement is accurate. it's not possible to be "more efficient" than the inherent complexity of a problem.

but all in all, I think your problem should be better solved with good old index based loop.

let mut last = array[0];
for i in 1..array.len() {
    if array[i] == last {
        array[i] = last + 1;
    }
    last = array[i];
}
// or, even simpler:
for i in 1..array.len() {
    if array[i] == array[i-1] {
        array[i] = array[i-1] + 1;
    }
}

this can be written using iterators, but I don't think it's more "readable" than the index based for loop:

let start = v[0] - 1; // can start with value less than array[0] since we use `max()`
v.into_iter().scan(start, |last, x| {
    *last = x.max(*last + 1);
    *last    
}).collect()
2 Likes

The mutation in .map() seems weird. If anything can be made more efficient about your code, it's updating in-place instead of allocating: Playground

fn ensure_unique(v: &mut [u64]) {
    v.sort_unstable();

    let Some((&mut mut last, rest)) = v.split_first_mut() else { return };

    for item in rest {
        if *item == last {
            *item += 1;
        }
        last = *item;
    }
}

As others mentioned, this isn't possible to avoid. Perhaps, however, there is a higher-level design change you could make. It seems to me like you are trying to generate unique identifiers, in which case you could do better. What's the specific thing you need?

1 Like

I did a quick benchmark with the methods everyone's shared: Rust Playground

                      original   109.593µs
        original minus collect   450.463µs
             make iter of &mut   430.042µs
                 keep last + 1   132.804µs
    keep last + 1 plus collect   91.223µs
                          cell   422.882µs
          manually index pairs   419.511µs
               split_first_mut   441.302µs

For some reason, using collect is very fast. The last + 1 method from @2e71828 is fast which I expected, and combining it with collect makes it even faster. The original method was second place, and all the others were basically the same.

Edit: here's the fastest one

let mut min = usize::MIN;
v.into_iter().map(|mut x| {
    x = min.max(x);
    min = x + 1;
    x
}).collect()

I dunno if this is right but it seems like the last + 1 method is the only one that's correct. The others only work if there's a maximum of 2 identical elements. If there's more than 2 of an element, you get out something that doesn't have any adjacent identical elements, but also isn't sorted anymore.

2 Likes

Yeah, that's a good point – although the OP specifically says duplicate elements and not an arbitrarily long run of identical elements (and the example only contains 2 consecutive identical items, too).

The plot thickens: Rust Playground

        last + 1 for          random 12.88µs
        last + 1 for            many 12.22µs
        last + 1 for             few 12.71µs
        last + 1 for             all 12.701µs
        last + 1 for            none 12.631µs

    last + 1 collect          random 23.291µs
    last + 1 collect            many 8.19µs
    last + 1 collect             few 9.22µs
    last + 1 collect             all 8.32µs
    last + 1 collect            none 6.68µs

   last + 1 for_each          random 12.311µs
   last + 1 for_each            many 12.321µs
   last + 1 for_each             few 12.3µs
   last + 1 for_each             all 12.69µs
   last + 1 for_each            none 12.75µs

  last + 1 map count          random 30.65µs
  last + 1 map count            many 9.011µs
  last + 1 map count             few 8.09µs
  last + 1 map count             all 8.36µs
  last + 1 map count            none 6.26µs

    last + 1 map for          random 12.68µs
    last + 1 map for            many 12.56µs
    last + 1 map for             few 12.611µs
    last + 1 map for             all 19.121µs
    last + 1 map for            none 12.65µs

First, about random: that test has about half needing to be incremented. The other ones are more predictable. So I'd guess this is branch prediction or some other CPU optimization that may or may not work depending on the input. The slow version is best if you need consistency.

Now, the weird part is that map is somehow the thing that makes it fast. They're all the same, except:

  • for uses a for-loop over &mut usize (slow)
  • collect uses map over usize and then collects (fast)
  • for_each uses for_each over &mut usize (slow)
  • map count uses map over &mut usize and then count to consume the iterator. (fast)
  • map for is the same as the previous but uses a for loop to consume the iterator. (slow)

Other consuming iterator methods like last and collect::<()> are also fast. I also tried other iterator methods that map lazily or greedily, and it seems to be a toss-up whether they're slow or fast. Manually iterating with loop or while is slow. This is likely a missed optimization by rust.

Unrelated to all this, if you have a lot of elements, you could speed this up with multiple threads. I think this would have the biggest advantage if there are few elements that need to be incremented. If you are also allowed to decrement elements, you could find chunks where the last element is greater than the first element + length of chunk, and then replace the entire chunk with first..first + len. I think this could even have less modifications than the current method.

it's probably related with data dependency analysis. but more importantly, for random data set, branch prediction makes a big difference. if you replace the branches with max() (which should be lowered into cmov instruction on modern AMD64 hardware), the result is vastly different.

the same code, but branches are replaced with max(), I get a result like:

                    original   101.893µs
        original minus collect   192.246µs
             make iter of &mut   138.594µs
                 keep last + 1   131.214µs
    keep last + 1 plus collect   84.692µs
                          cell   43.871µs
          manually index pairs   87.062µs
               split_first_mut   130.484µs
1 Like

Yes, there may be more than one duplicate item. Thanks for pointing that out. I was too focused on what approach I should take to solving the problem, that I wasn't thinking enough about the data. This did actually occur to me as I was going back through my code, so thank you for pointing out that solution. Thanks to everyone for the insight into performance, too.

Wow, cell is fast as hell, and even worse for the random case.

use std::cell::Cell;
let cell_slice = Cell::from_mut(v.as_mut_slice()).as_slice_of_cells();

cell_slice.windows(2).map(|window| {
    let [xc, yc] = window else { unreachable!() };
    let x = xc.get();
    let y = yc.get();
    let y = y.max(x + 1);
    yc.set(y);
}).count();
v
        last + 1 for          random 12.29µs
        last + 1 for            many 12.57µs
        last + 1 for             few 12.57µs
        last + 1 for             all 13.211µs
        last + 1 for            none 12.671µs

  last + 1 map count          random 21.301µs
  last + 1 map count            many 8.85µs
  last + 1 map count             few 5.23µs
  last + 1 map count             all 8.31µs
  last + 1 map count            none 6.37µs

      cell map count          random 32.741µs
      cell map count            many 4.19µs
      cell map count             few 4.17µs
      cell map count             all 5.65µs
      cell map count            none 5.03µs

I also tried the other methods and cell is never slow/consistent.

1 Like

This topic was automatically closed 90 days after the last reply. We invite you to open a new topic if you have further questions or comments.