Mutating values while iterating a map, based on keys

Let's say we have these:

struct Val {
    trim_target: String,
    trim_key: String,
}

let mut map: HashMap<String, Val>

I want to iterate through the map and trim Val.trim_target if trim_key is found as a key in map.
I know I can use clone to bypass this but I'm curious if this can be done with only #[derive(Eq, PartialEq, Hash)].
How can I convince the compiler, or is this logic possibly dangerous?

Here's my attempt to do so:

struct Val {
    trim_target: String,
    trim_key: String,
}

fn values_mut(mut map: HashMap<String, Val>) {
    for x in map.values_mut() {
        if map.contains_key(&x.trim_key) {
            x.trim_target = x.trim_target.trim().to_string();
        }
    }
}

On first look, I believe that this is a case that is neither dangerous, nor is there any good way of convincing the compiler that it’s safe. This is because by-reference-access in Rust is a fairly coarse model, and the &mut self access that values_mut requires is the same kind of access as the &mut self that methods like insert or remove (or as an even more extreme/clear example, drain) require, even though the former does not alter the map structure and doesn't touch the keys, which makes it non-dangerous in principle to do things like contains_key concurrently.

Instead of directly convincing the compiler that this is safe, we can discuss workarounds that change the code as little as possible. One possible approach that comes to mind: With HashMaps, it’s sometimes possible to do more complex kinds of interleaved mutable/immutable access patterns (efficiently!) by using the datatype provided by the indexmap crate.

use indexmap::IndexMap;

struct Val {
    trim_target: String,
    trim_key: String,
}

fn values_mut(mut map: IndexMap<String, Val>) {
    for i in 0..map.len() {
        let x = &map[i];
        if map.contains_key(&x.trim_key) {
            let x = &mut map[i];
            x.trim_target = x.trim_target.trim().to_string();
        }
    }
}

Rust Playground

3 Likes

Another approach is to store RefCells inside the HashMap so that they can be modified without needing to take a mutable reference to the map:

use std::cell::RefCell;
use std::collections::HashMap;

struct Val {
    trim_target: String,
    trim_key: String,
}

fn values_mut(map: HashMap<String, RefCell<Val>>) {
    for x in map.values() {
        let mut x = x.borrow_mut();
        if map.contains_key(&x.trim_key) {
            x.trim_target = x.trim_target.trim().to_string();
        }
    }
}

Rust Playground (rust-lang.org)

2 Likes

Why it doesn't work: The thing about &mut is that it means unique access, which it can bestow on something that borrows from it. Intuitively speaking, as soon as you create an iterator with map.values_mut(), the iterator inherits a unique live access to map, and the iterator bestows its unique live access to each element (x) in this case, which becomes the entity with unique live access within the loop. To put it differently:

  1. You start with map (unique, live).
  2. An iterator inherits the unique access: map (unique, not-live) → iterator (unique, live)
  3. An emitted element gets the unique access: map (unique, not-live) → iterator (unique, not-live) → x (unique, live).

To ensure uniqueness, the compiler can't let you borrow again from map (& or &mut) until it becomes live again, which happens once its borrowers are destroyed. Hence, you can't call map.contains_key in the loop. This property is used by rust for both safety guarantees and potential optimizations.

A cheap way around it: Rust supports mutation without a unique (i.e. &mut) borrow to avoid this liveness requirement. For that you need one of the cell-based data structures that supports mutability through a shared (i.e. &) reference at the expense of a more restricted API. Since you're trying to modify a String, which implements Default, you can use Cell<String>, which has no additional overhead:

struct Val {
    trim_target: Cell<String>,  // <--
    trim_key: String,
}

/// Passed in a `&mut HashMap` since our mutation strategy
/// is an implementation detail but `&HashMap` compiles as well.
fn values_mut(map: &mut HashMap<String, Val>) {
    for x in map.values() {
        if map.contains_key(&x.trim_key) {
            // Technically, take() returns an owned `String`.
            // So you could make this more efficient by implementing an
            // in-place `trim()` to avoid an additional heap allocation.
            x.trim_target.set(x.trim_target.take().trim().to_string());
        }
    }
}

Playground

For strategies to trim a String in-place for further optimization, see this thread.

Edit: You can also use Cell similarly to @2e71828's suggestion above with HashMap<String, Cell<Val>>, but then you'd have to make Val implement Default and you'd be take()ing and set()ing the entire Val value at once as opposed to just the field you'd like to mutate.

This topic was automatically closed 90 days after the last reply. We invite you to open a new topic if you have further questions or comments.