Unexpected behavior: filter iterator and interior mutability

Hello,

I have a question regarding unexpected behavior occurring when combining a filter iterator with interior mutability via Cell.

It's about this code snippet in footnote 1 of this article: Mutating through a filter | Barry's C++ Blog

(Footnote 1 is Rust, the rest of the article is about C++)

It shows that the filter iterator leads to unexpected outcomes when changing the filtered values via interior mutability while iterating over them.

My question is: is this expected/well-defined behavior with the filter iterator (i.e. a bug in the program, but expected), or is this a bug in the standard library? If it's expected, is this documented anywhere? I couldn't find anything about it. I actually would have expected that Rust would catch it with a compiler error, but it builds just fine.

Also: is it at least "only" a "normal" bug to do this or does this trigger some kind of undefined behavior (like in C++)? (I wonder because I don't know if unsafe code is maybe somewhere involved deeper down in the standard library)

Best regards,
planetary-jua

I just have some brief comments.

On the second question, this cannot cause undefined behavior in Rust because there is no unsafe code. This is the primary advantage of Rust.

On the first question, I'm not sure if this should be mentioned in the documentation, although it couldn't hurt to have an example showing that interior mutation during iteration is not prevented. The results are actually what I would expect, however.

For reference, here's the code from the article that’s being discussed:

fn main(){
    let v: Vec<_> = (1..=6).map(Cell::new).collect();
    let evens = v.iter().filter(|x| x.get() % 2 == 0);
    for (i, j) in evens.clone().zip(evens.skip(1)) {
        println!("({}, {})", i.get(), j.get());
        j.set(j.get() + 1);
    }
}

By using Cell, you're opting out of compile-time checking of exclusivity of mutable access. This doesn't mean that undefined behavior is possible, because Cell provides its own particular set of safe operations, but it means that you no longer have protections against unintended interleaving of operations (at any level coarser than “one element of the Vec”). &mut’s exclusiveness is, in part, about preventing exactly this kind of bug, whether the bug would cause UB or not.

This is among the reasons why you should, whenever reasonable, stick to plain non-interior-mutable data structures and &mut-based mutations of them. Interior mutability makes the program more complex to understand, because there are more possibilities for the outcome to depend on the order in which operations are executed.

Of course, an even higher level design principle here is “don't mutate things that you're also reading, unless you have a plan for how that's going to work out correctly”. If main() above gives a wrong answer, then that's because it's an incorrect algorithm; it happens that parts of the algorithm here are made up of standard library functions that do high-level things like “pair up the elements of two iterators”, so it can't be read in a straight-line fashion.

6 Likes

Hey everyone,

thanks for he replies!

While there's no unsafe in the code snippet, my worry here was about there maybe being unsafe code involved deeper in the standard library. Like, if you look into the internal implemetation of e.g. Vec, there's a lot of unsafe in there, encapsulated away. And now I was wondering whether this combination of filter and Cell could be something that breaks some assumptions and could lead to UB in the standard library.

Yeah, it's certainly a constructed example to show the edge case, not something that should come up in actual code. I guess I expected that something, like some unsatisfied trait bound or so, would prevent it at compile time, but I'm still pretty new to Rust :slightly_smiling_face:

Best regards
planetary-jua

That would only be possible if there were safety bugs in Vec or Cell. An abstraction that uses unsafe code is required to provide an API that cannot be used (or misused) to cause UB.

This is why it is very important that all libraries using unsafe code are thoroughly audited and tested (including with Miri). And that unsafe code is only used in libraries when necessary.

It's all about the specifics. You have to ask how the compiler would know to prohibit the use of Cell here without disallowing interior mutability entirely.

1 Like

If that were true, then we would say that one of those parts of the standard library is unsound. (Unsound code is code that can be called by safe code in a way that causes UB.) Soundness is taken very seriously — it is the foundation on which all of Rust’s guarantees are built. And it is composable:

  • If Vec’s unsafe code is sound, then we know that it doesn't matter whether the Vec contains Cells.
  • If Cell’s unsafe code is sound, then we know that it doesn't matter whether the Cell is in a Vec.

(This isn't quite as elegant as it sounds, because it turns out there are edge cases where we have to pick between “it's OK for unsafe code to do A” and “it's OK for unsafe code to do B”, because allowing both together would let them interact in a way that is unsound. But the approach Rust takes to such questions is to pick one of those choices and stick to it, not to let both exist and leave a trap for users.)

6 Likes

Thanks for the explanations!

That is an interesting question. I have checked the docs and it seems that iterator adapters are described in terms of what items they yield, but not how exactly they manipulate inner iterators or callbacks.

There is a section about laziness (core::iter - Rust) of iterators but it does not look like specification to me. It does not convince me that, for example, the iterator returned by filter() cannot do some kind of pre-fetching (which could change the output of code snippet mentioned in OP).

Or, as a another concrete example, if the Rust team decided to change the implementation of Iterator::skip() so that the skipping would happen during the construction (like { iter.advance_by(n); iter }) -- would that be a breaking change? (Now that I think of it, the answer seems to be "yes" because it would do different things to iterators shorter than n, but it is not that obvious.)

So it seems to me these things are unspecified, so if you really want to combine iterator adapters and interior mutability (or side effects), better write your own adapters (or use a better specified crate).

1 Like

I agree that iterator adaptors are often underdocumented (even underspecified sometimes). I think your code would make a great test case for libcore, to ensure the semantics don't change.

This topic was automatically closed 90 days after the last reply. We invite you to open a new topic if you have further questions or comments.