How to avoid write almost the same code due to borrow checking

Fancyflame · February 18, 2022, 3:12pm

I met a situation where I don't know how to avoid using unsafe and not write almost the same code twice.

Here we have an enum named Tree:

use std::collections::HashMap;
enum Tree {
    KV(HashMap<String, Tree>),
    Value(String),
}

Now we want to define a function which could find Tree by path splitted with ".".

impl Tree {
    fn find_element(&self, path: &str) -> Option<&Tree> { todo!() }
    fn find_element_mut(&mut self, path: &str) -> Option<&mut Tree> { todo!() }
}

For example:

let ele: &Tree = tree.find_element("find.element.by.path").unwrap();

I wrote find_element as follow:

impl Tree {
    fn find_element(&self, path: &str) -> Option<&Tree> {
        let mut prv = self;
        assert!(path.is_ascii());
        for name in path.split(".") {
            match prv {
                Tree::KV(map) => match map.get(name) {
                    Some(tree) => prv = tree,
                    None => return None,
                },
                Tree::Value(s) => return None,
            }
        }
        Some(prv)
    }
}

Now here comes the problem: How to write find_element_mut?

I have two solutions. The first one is to copy the code and edit it, so we got this:

impl Tree {
    fn find_element(&self, path: &str) -> Option<&Tree> {
        let mut prv = self;
        assert!(path.is_ascii());
        for name in path.split(".") {
            match prv {
                Tree::KV(map) => match map.get(name) {
                    Some(tree) => prv = tree,
                    None => return None,
                },
                Tree::Value(s) => return None,
            }
        }
        Some(prv)
    }

    fn find_element_mut(&mut self, path: &str) -> Option<&mut Tree> {
        let mut prv = self;
        assert!(path.is_ascii());
        for name in path.split(".") {
            match prv {
                Tree::KV(map) => match map.get_mut(name) {
                    Some(tree) => prv = tree,
                    None => return None,
                },
                Tree::Value(s) => return None,
            }
        }
        Some(prv)
    }
}

The second solution is to use unsafe to avoid writing duplicate code:

impl Tree {
    unsafe fn _find_element(slf: *const Tree, path: &str) -> Option<*const Tree> {
        let mut prv = &*slf;
        assert!(path.is_ascii());
        for name in path.split(".") {
            match prv {
                Tree::KV(map) => match map.get(name) {
                    Some(tree) => prv = tree,
                    None => return None,
                },
                Tree::Value(s) => return None,
            }
        }
        Some(prv)
    }

    fn find_element(&self, path: &str) -> Option<&Tree> {
        unsafe { Self::_find_element(self, path).map(|e| &*e) }
    }

    fn find_element_mut(&mut self, path: &str) -> Option<&mut Tree> {
        unsafe { Self::_find_element(self, path).map(|e| &mut *(e as *mut Tree)) }
    }
}

The above two solutions both seem to be not satisfying. Does anybody have any good ideas?

steffahn · February 18, 2022, 3:22pm

The unsafe option you list also appears unsound to me (i.e. using it causes UB). You’re converting prv: &Tree into *const Tree, then into &mut Tree.

So the only reasonable option (of the two you list) is to use code duplication. Another alternative (not necessary all that satisfying either though) would be to use a macro, e.g.

macro_rules! find_element_implementation {
    ($find_element:ident $get:ident $($mut:ident)?) => {
        fn $find_element(&$($mut)? self, path: &str) -> Option<&$($mut)? Tree> {
            let mut prv = self;
            assert!(path.is_ascii());
            for name in path.split(".") {
                match prv {
                    Tree::KV(map) => match map.$get(name) {
                        Some(tree) => prv = tree,
                        None => return None,
                    },
                    Tree::Value(s) => return None,
                }
            }
            Some(prv)
        }
    }
}

impl Tree {
    find_element_implementation!(find_element get);
    find_element_implementation!(find_element_mut get_mut mut);
}

Fancyflame · February 18, 2022, 3:28pm

Sorry I don't get this... Why are the conversions UB?
I think it may be better if Rust could have a syntax to solve this problem?

steffahn · February 18, 2022, 3:29pm

Transmutes - The Rustonomicon

Transmuting an & to &mut is UB.

Transmuting an & to &mut is always UB.

No you can't do it.

No you're not special.

(this rule applies to any form of converting a &T into &mut T (with the same target (and I’m not 100% sure about ZSTs)), not just literal calls to mem::transmute)

If you want more convincing “arguments”, try running an invocation of the unsafely (and AFAICT unsoundly) implemented find_element_mut method through miri (this might possibly need MIRIFLAGS="-Zmiri-track-raw-pointers")

Fancyflame · February 18, 2022, 4:00pm

I read the link you gave but I still don't know why converting &T to &mut T is UB. Will the compiler generate different assembly code for find_element and find_element_mut? (I'm not familiar with assembly)

Michael-F-Bryan · February 18, 2022, 4:37pm

It is a fundamental assumption in Rust that you can't mutate the value behind a &T reference without some form of "interior mutability" to manage synchronisation and aliasing (e.g. a Mutex). Think of it like me asking "what is the result of 0/0?" in mathematics or a C programmer passing 3 arguments to a function that expects 4.

The Rust language provides the guarantee that data will be shared XOR mutable, which allows programmers and compilers to create arbitrarily complex programs with that assumption in mind. If your code uses unsafe to violate that guarantee then anything building on your code will be fundamentally broken. This is the same for all languages, although Rust puts a much greater emphasis on soundness/UB than most.

Undefined behaviour is not about how code gets optimised or what the hardware does.

Fancyflame · February 18, 2022, 5:25pm

I guess you may not understand me. In the above solution that using unsafe, I guaranteed the uniqueness of mutable reference manually.
In other words, is this function also UB?

fn foo<T>(t:&mut T)->&mut T{
    unsafe{
        let r=&*t;
        &mut *(r as *const T as *mut T)
    }
}

steffahn · February 18, 2022, 5:53pm

Yes

fn foo<T>(t:&mut T)->&mut T{
    unsafe{
        let r=&*t;
        &mut *(r as *const T as *mut T)
    }
}

fn main() {
    foo(&mut 0);
}

   Compiling playground v0.0.1 (/playground)
    Finished dev [unoptimized + debuginfo] target(s) in 1.19s
     Running `/playground/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/bin/cargo-miri target/miri/x86_64-unknown-linux-gnu/debug/playground`
error: Undefined Behavior: trying to reborrow for Unique at alloc993, but parent tag <untagged> does not have an appropriate item in the borrow stack
 --> src/main.rs:4:9
  |
4 |         &mut *(r as *const T as *mut T)
  |         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ trying to reborrow for Unique at alloc993, but parent tag <untagged> does not have an appropriate item in the borrow stack
  |
  = help: this indicates a potential bug in the program: it performed an invalid operation, but the rules it violated are still experimental
  = help: see https://github.com/rust-lang/unsafe-code-guidelines/blob/master/wip/stacked-borrows.md for further information
          
  = note: inside `foo::<i32>` at src/main.rs:4:9
note: inside `main` at src/main.rs:9:5
 --> src/main.rs:9:5
  |
9 |     foo(&mut 0);
  |     ^^^^^^^^^^^
  = note: inside `<fn() as std::ops::FnOnce<()>>::call_once - shim(fn())` at /playground/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/ops/function.rs:227:5
  = note: inside `std::sys_common::backtrace::__rust_begin_short_backtrace::<fn(), ()>` at /playground/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/sys_common/backtrace.rs:122:18
  = note: inside closure at /playground/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/rt.rs:145:18
  = note: inside `std::ops::function::impls::<impl std::ops::FnOnce<()> for &dyn std::ops::Fn() -> i32 + std::marker::Sync + std::panic::RefUnwindSafe>::call_once` at /playground/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/ops/function.rs:259:13
  = note: inside `std::panicking::r#try::do_call::<&dyn std::ops::Fn() -> i32 + std::marker::Sync + std::panic::RefUnwindSafe, i32>` at /playground/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/panicking.rs:492:40
  = note: inside `std::panicking::r#try::<i32, &dyn std::ops::Fn() -> i32 + std::marker::Sync + std::panic::RefUnwindSafe>` at /playground/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/panicking.rs:456:19
  = note: inside `std::panic::catch_unwind::<&dyn std::ops::Fn() -> i32 + std::marker::Sync + std::panic::RefUnwindSafe, i32>` at /playground/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/panic.rs:137:14
  = note: inside closure at /playground/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/rt.rs:128:48
  = note: inside `std::panicking::r#try::do_call::<[closure@std::rt::lang_start_internal::{closure#2}], isize>` at /playground/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/panicking.rs:492:40
  = note: inside `std::panicking::r#try::<isize, [closure@std::rt::lang_start_internal::{closure#2}]>` at /playground/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/panicking.rs:456:19
  = note: inside `std::panic::catch_unwind::<[closure@std::rt::lang_start_internal::{closure#2}], isize>` at /playground/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/panic.rs:137:14
  = note: inside `std::rt::lang_start_internal` at /playground/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/rt.rs:128:20
  = note: inside `std::rt::lang_start::<()>` at /playground/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/rt.rs:144:17

error: aborting due to previous error

17cupsofcoffee · February 18, 2022, 5:57pm

For context, the error message shown above was generated by Miri, a Rust interpreter which can detect a lot (but not all) cases of UB.

You can run Miri on the playground via the Tools dropdown:

Michael-F-Bryan · February 20, 2022, 4:38am

It doesn't matter that you use &mut to ensure uniqueness at the top level because Rust requires each individual operation to be sound, regardless of the wider context. In this case you have a line that uses pointer casting to turn a &T into a &mut T, which is UB.

louy2 · February 20, 2022, 5:57am

Please understand that the issue at hand here is soundness, not completeness. Soundness only requires that if a program is wrong, it doesn't compile, but correct programs may also not compile. Programming in this sense is like a conversation with the compiler, who can only be convinced by a certain logic, regardless if you can prove your program to be correct by some different logic.

Despite being a very intuitive ask, generalizing over the multiplicity of a type (value, &mut, and &) is currently an open research question. Please take recommendation of current workarounds as pragmatic suggestions that can solve your problem right now, not as our complacency with them. You are welcome to track that progress at the issue rfcs#414. (It's a problem hard enough that the issue was created way back then but still without a draft RFC.)

system · May 21, 2022, 5:57am

This topic was automatically closed 90 days after the last reply. We invite you to open a new topic if you have further questions or comments.

Topic		Replies	Views
Cannot borrow `*tree` as mutable more than once at a time	2	175	February 24, 2024
How to resolve borrow mutable more than once help	2	443	May 11, 2023
Best practice to keep abstractions and avoid "cannot borrow as mutable more than once" help	6	305	December 13, 2023
How to efficiently write and check unsafe code? help	4	252	August 23, 2023
Mark a function to be used carefully help	3	334	July 29, 2021

How to avoid write almost the same code due to borrow checking

Related Topics