Confusion with borrowing and ownership

This compiles

fn replace_string2(s: &mut String, from: char, to: char){
    for c in s.chars(){
        if c == ',' {
            s.push(';');
            break;
        }
    }
}

But this doesn't

fn replace_string2(s: &mut String, from: char, to: char){
    for c in s.chars(){
        if c == ',' {
            s.push(';');
        }
    }
}

also this doesn't:

fn replace_string2(s: &mut String, from: char, to: char){
    for c in s.chars(){
        if c == ',' {
            s.push(';');
        }else{
           s.push(':');
        }
    }
}

why?

Generally this won't work, and the first case is a special case that the borrow checker is clever enough to accept.

The problem is that you can't iterate over a string (or any standard Rust collection) and modify it at the same time. This is called the iterator invalidation problem, and Rust prevents it by forbidding such cases entirely.

In the first code the borrow checker sees the break and recognizes that the iterator is never used again once the string has been modified, so iteration and modification are not mixed: it's read-only access followed by a single write at the end. In the other cases the loop keeps iterating after a modification, which is not allowed.
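The same "read first, then write once" shape can be written without a loop. This is only a sketch of the idea; the function name `push_if_comma` is made up for illustration and is not from the original question:

```rust
/// Appends a single ';' if the string contains at least one ','.
/// (Hypothetical helper, equivalent in effect to the first snippet.)
fn push_if_comma(s: &mut String) {
    // Shared borrow of `s`: it lasts only for this expression.
    let has_comma = s.chars().any(|c| c == ',');

    // The shared borrow has ended, so the exclusive borrow for `push` is accepted.
    if has_comma {
        s.push(';');
    }
}
```

With "a,b" this leaves "a,b;", and a string without a comma is left unchanged.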

The borrow checking rules forbid this: iteration borrows the string in shared (read-only) mode, while push needs an exclusive borrow in order to mutate, and it can't get one as long as the for loop still uses the shared loan.

Additionally, Rust strings are UTF-8. In this encoding a char can take a varying number of bytes (1 to 4), and Rust guarantees that you can't break UTF-8 validity, so the ability to modify strings in place is very limited. By contrast, you can modify a Vec<u8> in place with iter_mut().
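For comparison, here is a minimal sketch of the in-place byte version. The name `replace_byte` is an assumption for illustration; it works on raw bytes, so it is only appropriate when you don't need the data to stay valid UTF-8 text:

```rust
/// Replaces every occurrence of `from` with `to`, in place.
/// (Hypothetical helper operating on bytes, not on chars.)
fn replace_byte(bytes: &mut Vec<u8>, from: u8, to: u8) {
    // `iter_mut` hands out one exclusive reference per element,
    // so each byte can be overwritten while iterating.
    for b in bytes.iter_mut() {
        if *b == from {
            *b = to;
        }
    }
}
```

This is allowed because every element has the same size and overwriting one byte never moves or reallocates the buffer.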

You'll need to make a copy of the string, e.g. this builds a new string with every , replaced by ;:

s.chars().map(|c| if c == ',' { ';' } else { c }).collect::<String>()
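Put back into the signature from the question, that could look roughly like this. It is a sketch that assumes `from` and `to` are meant to be the character to search for and its replacement, which the original snippets never actually use:

```rust
/// Replaces every `from` char with `to` by building a new string.
fn replace_string2(s: &mut String, from: char, to: char) {
    // The shared borrow taken by `chars()` ends once `collect` has finished...
    let replaced: String = s
        .chars()
        .map(|c| if c == from { to } else { c })
        .collect();

    // ...so overwriting the original afterwards is accepted.
    *s = replaced;
}
```

Calling `replace_string2(&mut s, ',', ';')` on "a,b,c" leaves "a;b;c" in `s`.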

If you really want to push at the end, push to a temporary string, and then concatenate the strings:

let mut tmp = String::new();
for c in s.chars() {
    tmp.push(';');
}
s.push_str(&tmp);
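Applied to the third snippet from the question, the temporary-string approach might look like the following. The function name `append_markers` is invented for this sketch:

```rust
/// For each char in `s`, appends ';' if it was a comma and ':' otherwise.
/// (Hypothetical helper mirroring the third snippet's intent.)
fn append_markers(s: &mut String) {
    let mut tmp = String::new();

    // Only reads `s` here; all writes go to `tmp`.
    for c in s.chars() {
        if c == ',' {
            tmp.push(';');
        } else {
            tmp.push(':');
        }
    }

    // Iteration is over, so `s` can now be borrowed exclusively.
    s.push_str(&tmp);
}
```

Unlike pushing inside the loop, this also has a well-defined end: the loop visits only the original characters, not the ones being appended.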

@amolkhatri Your code is a good example that demonstrates that the borrow-checker operates not on the original source code, but on a control-flow-graph instead.

The compiler will pre-process the code by first desugaring the for, of course[1]

fn replace_string2(s: &mut String, from: char, to: char) {
    let mut iterator = s.chars();
    while let Some(c) = iterator.next() {
        if c == ',' {
            s.push(';');
            break;
        }
    }
}

Which can also be expressed as a loop with if let

fn replace_string2(s: &mut String, from: char, to: char) {
    let mut iterator = s.chars();
    loop {
        if let Some(c) = iterator.next() {
            if c == ',' {
                s.push(';');
                break; // (A)
            }
            // nothing here ^^
        } else {
            // just breaking… ^^
            break; // (B)
        }
        // nothing here either ^^
    }
    // about to return
}

After this desugaring, the code undergoes further transformations through two intermediate representations (called “HIR” and “MIR”). Eventually every abstract control-flow construct is gone, and we are left with the equivalent of writing the code with some form of goto, as illustrated in the following code block (pseudo-syntax):

fn replace_string2(s: &mut String, from: char, to: char) {
'START:
    let mut iterator = s.chars();
    GOTO 'LOOP_START

'LOOP_START:
    if let Some(c) = iterator.next() {
        GOTO 'THEN
    } else {
        GOTO 'BREAK_B
    }

'THEN:
    if c == ',' {
        GOTO 'THEN_THEN
    } else {
        GOTO 'END_OF_THEN
    }

'THEN_THEN:
    s.push(';');
    GOTO 'OUT_OF_THE_LOOP // (A)

'END_OF_THEN:
    // nothing here ^^
    GOTO 'END_OF_LOOP_BODY

'BREAK_B:
    // just breaking… ^^
    GOTO 'OUT_OF_THE_LOOP // (B)

'END_OF_LOOP_BODY:
    // nothing here either ^^
    GOTO 'LOOP_START

'OUT_OF_THE_LOOP:
    // about to return
}

[Figure: ChatGPT-generated graphviz visualization of the control-flow graph above]

Now, the borrow-checker will operate on this control-flow graph and, at least intuitively speaking (as I have no idea of its actual inner workings), will need to mark the area where the iterator (which contains a borrow of s) is considered live. This area must start where iterator is defined, and must remain live on every path that leads to (or leads back to) any usage of iterator, i.e. the node doing the .next() call. Following the graph, the area in question thus covers 'START (from the definition of iterator onwards), 'LOOP_START, 'THEN, 'END_OF_THEN and 'END_OF_LOOP_BODY, but not 'THEN_THEN, 'BREAK_B or 'OUT_OF_THE_LOOP.

As you can see, there is no reason why iterator would need to be live at the point where s.push(';') is called, so it simply isn’t considered live anymore, and the borrow-checker can accept a new mutable borrow of s to start at (and also immediately end) at that point.

If you did the same kind of analysis with either of the other code examples, there would still be a path from the s.push() node in the graph back to the iterator.next() call, so iterator would still need to have been kept alive and there’s a borrow-checking conflict.
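The liveness argument can also be seen without any loop. The following is a small sketch (the function name `first_char_then_push` is made up) in which the iterator's last use comes before the push:

```rust
/// Reads the first char, then pushes ';'.
/// (Hypothetical helper to illustrate where the iterator stops being live.)
fn first_char_then_push(s: &mut String) -> Option<char> {
    let mut iterator = s.chars();  // shared borrow of `s` starts here
    let first = iterator.next();   // last use of `iterator`

    // `iterator` is not used on any path from here on, so it is no longer live
    // and the exclusive borrow for `push` is accepted.
    s.push(';');

    // Uncommenting the next line would keep `iterator` live across the push,
    // and the compiler would reject the function with error E0502.
    // iterator.next();

    first
}
```

The loop versions that fail are the same situation: the jump back to the .next() call plays the role of the commented-out line.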


  1. all desugarings and translations presented here are not to be taken as universally applicable, as they might not correctly cover some corner cases if generalized, but they might make the code more clear ↩︎


Thanks, @kornel and @steffahn, for providing such an insightful explanation. It was greatly appreciated.