If you use enough Rc<RefCell<T>>, does rust become a garbage collected language?

As I understand it, Rf<RefCell<T>> is a wrapper that makes the object survive until all references to it are dropped, just like many garbage collected languages do for all objects. I really like rust's semantics, but sometimes dealing with all the manual memory management stuff can get frustrating, especially when my code isn't performance sensitive at all. If I created a language that was exactly like rust except that you don't specify any ownership information, and it was "compiled" to rust by wrapping everything in Rc<RefCell<>>` until the borrow checker was happy, would that work?

1 Like

That's essentially what Swift does with automatic reference counting instead of a sweeping garbage collector.

1 Like

This is basically gluon.

1 Like

It looks like gluon is a functional language implemented in rust, which isn't really what I was asking. I was wondering whether adding implicit Rc wrappers to structs would be sufficient to turn it into a garbage collected language, not about using rust to create a new language from scratch.

The lack of cycle collection means that you still need to think about ownership when using Rc and Arc in any complicated cases. There are some external crates with cycle collectors or other "real" garbage collectors.

7 Likes

I understand, but I was just pointing out that there are already somewhat Rust-like, statically-typed GC languages.

No. That would be insufficient and yet radically different from a traditional GC language for a number of reasons:

  • Reference counting can't detect cycles in itself. Merely adding Rc blindly all over the place would likely cause the code to leak.
  • Idiomatic RAII code that previously relied on the borrow checker would cease to work. The prime example is lock guards. Currently, locking a Mutex returns a guard object that derefs to the protected object and has an appropriate lifetime, so the borrow checker errors if you try to use it for too long. With an implicit Rc around MutexGuard, this would still be technically true, but it would now be a lot easier to accidentally keep a copy of the guard around, resulting in mysterious live-locks.
  • Rc<RefCell<_>> also moves exclusivity checking from compile-time to runtime, which means you would likely run into several panics related to multiple mutable borrows. This wouldn't remove the annoyance of borrow checking, it would merely make it harder to debug.
  • There would be many spurious heap allocations and redundant reference counting operations, as Rc is not a language built-in, and the optimizer doesn't magically know how to eliminate it.
  • Relatedly, doing this would impede thread-safety or incur an unacceptable runtime cost. Rc<RefCell> is not thread-safe, so if you wrapped everything in it, you won't be able to write threaded or async code at all. On the other hand, if you used the thread-safe equivalent, Arc<Mutex>, that would mean locking every single variable, even trivially thread-safe locals, for no reason.
  • Pattern matching would also become disproportionately more difficult, as you can't directly match the wrapped value of an Rc by value or by mutable reference.

TL;DR: if you find Rust hard due to the borrow checker, tacking a Rc<RefCell> onto every value won't help with that, and would likely result in code that is way inferior to idiomatically written Rust code.

12 Likes

Thanks for the very thorough response! You made a lot of good points, it's probably easiest to follow if I go through them one by one in the same format.

Yep. Most code doesn't ever create reference cycles though, so I think it's an acceptable sacrifice to require the programmer to break cycles manually when these cases come up, like Swift does.

If I accidentally leaked the guard by making a cycle, then yeah. Otherwise, the single instance of the Rc<RefCell<MutexGuard<_>>> should free normally when it leaves scope, so I'm pretty sure this wouldn't happen. I could be wrong though.

If I unwrapped the Rc<RefCell<_>> each time I needed the value and dropped the handle as soon as I finished with it, I'm pretty sure this wouldn't happen. Horrific runtime cost, but I don't think it would panic.

Yep, tons. It would be the absolute worst. For code that doesn't need to run fast, that shouldn't matter though.

While locking every single variable is absolutely awful, code that doesn't need to run fast would probably still be okay if it did that. There's also the question of how much code there really is that needs to be multithreaded, but doesn't need to run fast.

You got me there. Pattern matching is one of the things that I like the most about rust's syntax, and it would be such a shame to lose it. I think nesting match statements would be able to get around it and make it work, but that's such an ugly thing to need to do.

I actually love the borrow checker and don't fight with it that much at all, I just sometimes wish there was a language that was identical to rust but didn't make me think about where data is owned, so I could prototype faster. I think it's also an interesting question as to whether that hypothetical language has a relatively simple mapping to valid rust code, because it felt like it might, and knowing why or why not would help me understand the language a little better. If it is possible to make it work at all, there's probably a fancy way to optimize out a lot of the unnecessary heap allocations, and that may even result in it generating idiomatic rust code if the input code is written in a way that makes it possible.

It could easily happen in recursive code, when you need to keep the mutable borrow around until after the recursive call will have returned. Normally, this would pass static borrowck by means of automatic (or explicit) reborrowing, but RefCell would still panic.

1 Like

I don't really understand what you mean by that, could you give an example?

This:

fn recursive(value: &mut T, recurse: bool) {
    if (recurse) {
        recursive(&mut *value, false);
    }
}

could become this (Playground):

fn recursive(value: Rc<RefCell<T>>, recurse: bool) {
    // if for some reason you need this mutable borrow while recursing:
    let guard = value.borrow_mut();
    if (recurse) {
        // then the same borrow_mut() would panic in the recursive call below:
        recursive(Rc::clone(&value), false);
    }
}
1 Like

Couldn't that be fixed by never holding a Ref or a RefMut in a variable, and only ever creating them inline as needed? So code like this:

let val = MyStruct::new();
println!("{}", val.x);
val.x = 1;
println!("{}", val.y);
Val.y = 2;

Would turn into this:

let val = Rc::new(RefCell::new(MyStruct::new()));
println!("{}", val.borrow().x);
val.borrow_mut().x = 1;
println!("{}", val.borrow().y);
val.borrow_mut().y = 2;

That would mean the line let guard = value.borrow_mut(); would never need to happen.

See the comment. If you need the mutable borrow while performing the recursion, then no, you can't fix it. Perhaps most of the time you can get away with it.

But again, it's easy to forget because it's not statically checked. And if you are trying to come up with elaborate ways of avoiding beartraps like this, why not just spend the same amount of effort on doing it the idiomatic way, with static borrow checking? I don't see how the RefCell way would be easier, and I feel like it's stretching the language for no really good reason.

1 Like

The example breaks because the guard survives until the scope ends. If the guard is dropped and re-acquired, it runs fine. I added a main function and imports so it can be tested.

Before:

use std::cell::RefCell;
use std::rc::Rc;

fn recursive<T>(value: Rc<RefCell<T>>, recurse: bool) {
    let guard = value.borrow_mut();
    // use guard
    if (recurse) {
        recursive(Rc::clone(&value), false);
    }
    // use guard
}

fn main() {
    recursive(Rc::new(RefCell::new(String::new)), true);
}

After:

use std::cell::RefCell;
use std::rc::Rc;

fn recursive<T>(value: Rc<RefCell<T>>, recurse: bool) {
    let guard = value.borrow_mut();
    // use guard
    std::mem::drop(guard);
    if (recurse) {
        recursive(Rc::clone(&value), false);
    }
    let guard = value.borrow_mut();
    // use guard
}

fn main() {
    recursive(Rc::new(RefCell::new(String::new)), true);
}

I would never write code like this by hand, you're absolutely right that it's awful. This is purely a question about whether a relatively simple compiler could turn code with no ownership information into valid rust. The reason it would be easier is the same reason that people find dynamically typed languages easier; less syntax takes less time to type, so you have something working faster than if you used a language that required you to be more explicit.

Yes, that's exactly what I was referring to by "see the comment", i.e. the one that says "if for some reason you need this mutable borrow while recursing".

A naive replacement wouldn't work. For example, & can be used across threads, but Rc can't. Shared references are Copy, but Rc/Arc aren't. Wrapping in RefCell makes types incompatible with FFI, etc.

You could keep digging, and change the language semantics and refine the implementation until it handles all the cases. However, you'll end up with a huge performance overhead. A naive replacement will give you performance of Python's GIL. An optimized implementation may end up comparable to Swift's ARC, but that's still a significant overhead.

Rust's unique value depends on having both safety and performance. If you mess with the performance (and low-level memory control, FFI, etc.) then Rust loses its key advantage. This raises question why bother if there's Swift, Kotlin, TypeScript, and many others that may be easier to use.

7 Likes

I had this same thought a number of years ago and made a prototype language based on the concept: GitHub - willcrichton/lia: A high-level language for Rust

To me, the interesting question is whether you could automatically lift the stdlib into this language.

3 Likes

This topic was automatically closed 90 days after the last reply. We invite you to open a new topic if you have further questions or comments.