Will the borrow checker ever be able to handle this simple loop with a mutable HashMap logic?

The problem is well-known one and I have a working solution, but, to comply with the borrow checker, I have to perform a get from the map at every loop to search for the very same element over and over again.
I really wonder if there is a way to avoid it.
E.g. (Rust Playground) :

// WHAT WORKS
fn do_some_loops_OK(maps: &mut HashMap<String, HashMap<String, String>>) {
    for _ in 0..10 {
        let (k, v) = use_maps(maps);

        // HERE I AM FORCED TO GET THE SAME VALUE FROM THE MAP AT EVERY LOOP ITERATION
        let inner = maps.entry("key".to_owned()).or_default();
        inner.insert(k, v);
    }
}

// WHAT I WOULD LIKE INSTEAD
fn do_some_loops_KO(maps: &mut HashMap<String, HashMap<String, String>>) {
    
    // HERE I SEARCH IN THE MAPS ONLY ONCE
    let inner = maps.entry("key".to_owned()).or_default();

    for _ in 0..10 {
        let (k, v) = use_maps(maps); // the borrow checker complains here
        inner.insert(k, v);
    }
}

fn use_maps(maps: &HashMap<String, HashMap<String, String>>) -> (String, String) {
    // use the maps here
    ("some".to_owned(), "thing".to_owned())
}

The problem here is mainly about performance, and, in my case, it is a meaningful issue.
Some questions:

  • Do you see any possible way of fixing do_some_loops_KO? I don't think this can be resolved with a RefCell.
  • Do you see any theoretical problem with the logic in do_some_loops_KO? If no, would the compiler be able to handle this use case one day?
1 Like

Yes, HashMap::insert could cause the map to grow and reallocate, invalidating any pointers you previously had into it, such as inner. This is a real problem, exactly the kind of thing the borrow check is designed to catch.

6 Likes

@cole-miller, generally speaking, I agree with you; but, due to the fact that we never change the external map, but only the internal one, I guess this is still safe. This is why I think that an "extremely" smart borrow checker could be able to detect it. I am wrong?

Technically, I guess it could be done by introducing a myriad of new reference types with fewer capabilities than &mut, e.g. &overwrite_parts_but_not_reallocate T.

However, changing the language so radically just for the use case of allow an un-idiomatic piece of code seems entirely disproportionate.

Nothing here guarantess that you don't access inner in use_maps. "Extremely smart" here would imply some sort of whole-program analysis...

But you can remove your map before, and insert after the loop: Rust Playground

3 Likes

@KillTheMule, unfortunately, in my real use case, the use_maps function requires the inner map to be present along with the updated values of the previous loops. So I cannot apply the remove-and-reinsert approach.

Ok, in that case, I think you rightfully have to look up inner in every loop iteration. It might have reallocated in the previous iteration, right? I don't think any smartness of the borrow checker could avoid that here.

If use_maps is immutable, I don't see a problem. The outer hashmap is not mutated. The inner one is mutated only from one place.

This looks more like a case of two-phase borrows.

2 Likes

@kornel that's super interesting. Is it something I can somehow use in my scenario?

Could you change the signature of use_maps to take maps and inner separately? That would let you take inner out of maps and temporarily borrow it as immutable each loop. Like this

No, that's an internal detail of the compiler. See also Polonius.

2 Likes

@bradleyharden, that is an intriguing approach but I cannot use it in my case

@kornel I am probably over-simplifying it, but, could I say that it is always safe to:

  1. Create a mutable reference to var X
  2. Create as many immutable references to X as you prefer
  3. Use the immutable references
  4. Drop all immutable references to X
  5. Use the mutable reference created at point 1
  6. (optional) loop from point 2 to point 5 safely

E.g.:

    let mut x = "val".to_owned();
    let mut_ref = &mut x;
    {
        // shouldn't this be always safe?!?
        let immut_ref = &x;
        println!("{}", immut_ref);
    }
    mut_ref.insert_str(0, "");

I mean, from my point of view, it doesn't care if there is a mutable borrow when you create an immutable one, what really cares is that, when you USE the mutable borrow, there are no immutable ones alive.

It think it does though. As I understand it, exclusive references can't alias, ever, for any reason. The code here directly violates that. However, what you want could be implemented by reborrowing the exclusive reference, i.e.

    let mut x = "val".to_owned();
    let mut_ref = &mut x;
    {
        // shouldn't this be always safe?!?
        let immut_ref = &*mut_ref;
        println!("{}", immut_ref);
    }
    mut_ref.insert_str(0, "");

I think the distinction is that reborrowing invalidates the exclusive reference for the lifetime of the shared reference. But in your version, there's no way to perform that invalidation directly. I think it would take a more-complex, whole-function analysis to implicitly invalidate the exclusive reference.

1 Like

@bradleyharden I know that exclusive references can't alias; what I believe is that this restriction could be loosened if the mut ref is used only after dropping all immutable refs.
The fact that you can safely reborrow confirms that there are no issues, but reborrow does not solve this problem:

fn do_some_loops_KO(maps: &mut HashMap<String, HashMap<String, String>>) {
    
    let inner = maps.entry("key".to_owned()).or_default();

    for _ in 0..10 {
        let (k, v) = use_maps(maps); // borrow checker error here
        inner.insert(k, v);
    }
}

My point was only to say that I think the analysis becomes much more complicated if you have to implicitly invalidate the exclusive reference. Reborrowing seems like a much more straightforward analysis to me. But, as you say, it doesn't help in this case. Maybe Polonius will help? I'm not sure.

This code compiles:

    let mut x = "val".to_owned();
    let mut_ref = &mut x;
    {
        // shouldn't this be always safe?!?
        let immut_ref = &*mut_ref;
        println!("{}", immut_ref);
    }
    mut_ref.insert_str(0, "");

Note that it's slightly different that it uses &*mut_ref instead of &x. This way instead of breaking aliasing by giving mut_ref exclusivity and breaking exclusivity by reusing x, it takes advantage of mut_ref's exclusivity to reborrow to ensure safe use within the nested scope. It's fine to reborrow an exclusive reference as shared, but it's not fine to make other independent shared references while an exclusive one exists.

Imagine this:

rayon::scope(|scope| {
    let mut x = "val".to_owned();
    let mut_ref = &mut x;
    scope.spawn(move || *mut_ref = String::new());
    {
        // This is clearly unsafe now!
        let immut_ref = &x;
        println!("{}", immut_ref);
    }
});

vs this:

rayon::scope(|scope| {
    let mut x = "val".to_owned();
    let mut_ref = &mut x;
    scope.spawn(move || *mut_ref = String::new());
    {
        // error: mut_ref has been moved to a thread
        let immut_ref = &*mut_ref;
        println!("{}", immut_ref);
    }
});
1 Like

This can only be valid if the mutable reference is not used for something else. For example this should not compile:

use std::cell::RefCell;
let mut x = RefCell::new(Some(Box::new(1)));
let mut_ref = &mut x;
let use_mut_ref: &Box<i32> = mut_ref.get_mut().as_ref().unwrap(); // points to the contents of the Option
{
    let immut_ref = &*mut_ref;
    *immut_ref.borrow_mut() = None;
}
println!("{}", use_mut_ref); // there's no Box anymore, but use_mut_ref is still pointing to it

@SkiFire13 your example breaks my assumption, in fact, when you USE the mutable mut_ref (let immut_ref = &*mut_ref;) the immutable ref use_mut_ref is still living.
My idea is that it is always safe (in the same thread) to create and use immutable refs as long as they are all dropped before the mutable ref is used.

Then I can't see how this would solve your original problem. Consider this code:

fn do_some_loops_KO(maps: &mut HashMap<String, HashMap<String, String>>) {
    
    let inner = maps.entry("key".to_owned()).or_default();

    for _ in 0..10 {
        let (k, v) = use_maps(maps);
        inner.insert(k, v);
    }
}

inner borrows maps (just like use_mut_ref borrows mut_ref) , then maps is reborrowed (just like immut_ref reborrows mut_ref) and then inner is used (just like use_mut_ref is used), so following your reasoning this should also not compile because maps (the mutable reference) is used before all the immutable references (inner) are dropped.