Optimisation of RefCell borrow operations

Per the Rust book:

"The RefCell<T> keeps track of how many Ref<T> and RefMut<T> smart pointers are currently active. Every time we call borrow, the RefCell<T> increases its count of how many immutable borrows are active. When a Ref<T> value goes out of scope, the count of immutable borrows goes down by one. Just like the compile-time borrowing rules, RefCell<T> lets us have many immutable borrows or one mutable borrow at any point in time."
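To make the quoted rules concrete, here is a small sketch showing the dynamic check in action (using try_borrow, the non-panicking variant, so the example can observe the failure rather than abort):

```rust
use std::cell::RefCell;

fn main() {
    let cell = RefCell::new(5);

    // One mutable borrow is active...
    let m = cell.borrow_mut();
    // ...so the runtime check rejects any further borrow.
    assert!(cell.try_borrow().is_err());
    drop(m);

    // Once the RefMut is dropped, the count returns to zero
    // and immutable borrows succeed again.
    assert!(cell.try_borrow().is_ok());
}
```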

So in principle, there is a runtime cost to calling borrow() or borrow_mut() on a RefCell. However, it seems to me that in many cases it might be possible to do a global analysis of a program, and establish at compile time that the dynamic check is not needed.

For example, if (globally) each borrow doesn't do anything for the lifetime of the borrow that could cause a conflicting borrow, the dynamic check can be omitted.
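As an illustration of the kind of code such an analysis might target (this is my own hypothetical example, not something any compiler currently recognises): every borrow below is dropped before the next one begins, so no conflict is possible and, in principle, the runtime check is provably redundant.

```rust
use std::cell::RefCell;

// Each borrow is a temporary that is dropped at the end of its
// statement or expression, so no two borrows ever overlap. A
// whole-program analysis could, in principle, prove the dynamic
// check can never fail here.
fn bump(c: &RefCell<i32>) -> i32 {
    *c.borrow_mut() += 1; // RefMut dropped at end of this statement
    *c.borrow()           // Ref dropped once the value is copied out
}

fn main() {
    let c = RefCell::new(0);
    assert_eq!(bump(&c), 1);
    assert_eq!(bump(&c), 2);
}
```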

Some questions:
(a) Am I thinking straight here, does this make sense?
(b) Does the Rust compiler actually do any analysis like this?
(c) If so, is there a way to easily check if the analysis has established that a borrow panic cannot occur? [ I think this would be useful for the programmer to know! ]

(a) One big problem with this is that the compiler would need to give the RefCell type special treatment. While that's certainly possible in principle, this approach scales horribly and has significant complexity costs in the compiler itself.

(b) To my knowledge, the compiler/language team actively avoids global analysis. And that makes sense to me, since recompilation of arbitrary crates could become much more expensive.

(c) N/A


I guess it’s unlikely that the checks are removed in most practical cases. However, I think it is a valid optimization to remove the modification of the counts: those are non-atomic integers, so if you only do a few things inside the scope of the Ref<T>, I wouldn’t be surprised if LLVM can optimize away the increment and decrement of the counter. Even when the checks aren’t optimized away, the panic on failure will probably ensure that the success case is the hot path (e.g. for branch prediction), so the overhead should be minimal.

The way to check LLVM optimization is to look at the generated assembly.


For example

use std::cell::RefCell;

pub fn foo(x: &RefCell<i32>) {
    *x.borrow_mut() += 42;
}

compiles (on x86-64) to:

	pushq	%rax
	cmpq	$0, (%rdi)
	jne	.LBB1_1
	addl	$42, 8(%rdi)
	movq	$0, (%rdi)
	popq	%rax

.LBB1_1:
	leaq	.L__unnamed_1(%rip), %rdi
	leaq	.L__unnamed_2(%rip), %rcx
	leaq	.L__unnamed_3(%rip), %r8
	movq	%rsp, %rdx
	movl	$16, %esi
	callq	*core::result::unwrap_failed@GOTPCREL(%rip)

The generated code doesn’t do any modification of the counter, but there is a compare instruction checking that it’s zero.

Edit: OTOH, the movq $0, (%rdi) seems redundant…


In principle, it is possible to perform the global analysis manually and remove the counter itself with unsafe. For example, consider this file which uses module boundaries to make the global analysis very easy, as only a single rather small file needs to be manually checked to ensure the correctness of the removed checks.
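The linked file isn’t reproduced here, but a minimal sketch of the general idea might look like the following (the module name and API are illustrative, not taken from that file): an UnsafeCell wrapper whose safety contract replaces the counter, with the module boundary keeping the set of call sites small enough to audit by hand.

```rust
mod unchecked_cell {
    use std::cell::UnsafeCell;

    // Like RefCell, but with no borrow counter: the "global analysis"
    // is done by the human auditing every call site of `get_mut`.
    pub struct UncheckedCell<T> {
        inner: UnsafeCell<T>,
    }

    impl<T> UncheckedCell<T> {
        pub fn new(value: T) -> Self {
            Self { inner: UnsafeCell::new(value) }
        }

        /// Safety contract (enforced by manual review, not by the
        /// compiler): the caller must not hold any other reference
        /// obtained from this cell while the returned one is alive.
        #[allow(clippy::mut_from_ref)]
        pub unsafe fn get_mut(&self) -> &mut T {
            &mut *self.inner.get()
        }
    }
}

fn main() {
    let c = unchecked_cell::UncheckedCell::new(1);
    // Manually verified: no other borrow of `c` is live here,
    // so no runtime check is needed.
    unsafe { *c.get_mut() += 41 };
    assert_eq!(unsafe { *c.get_mut() }, 42);
}
```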


If, throughout the program, the counter is never incremented, then it never needs to be checked either.
But yes, I guess this is an easier optimisation, since it's local.

Thanks everyone for the replies; my question was mostly out of curiosity. It seems to me that "one day" (maybe long after we are all dead!) a Rust compiler might exist that could do this kind of analysis. In the meantime, I doubt the checks cost very much.
