Has a language-level "weak reference" ever been explored for Rust?


#1

I see that the reference-counting Rc library includes weak references, but what I’m wondering is whether a weak reference was ever considered for the language itself. I’m thinking specifically about the difficulty of creating self-referential structures using what other languages would consider “normal” approaches (I’m a long-time C/C++ programmer).

I’m curious to know what kinds of things were discussed and why they weren’t adopted. I’ve wondered whether, instead of all-or-nothing safe or unsafe code, the language could have just an unsafe/unchecked type (i.e. the back reference). Kind of in the same way that root permissions on most OSs are no longer a giant all-or-nothing hammer and are now decomposed into smaller permissions.
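For context, here is a minimal sketch of the library-level approach using std's Rc and Weak. The Parent and Child types and the helper functions are made up for illustration; the child holds a Weak back reference, so the parent/child cycle doesn't keep either side alive.

```rust
use std::cell::RefCell;
use std::rc::{Rc, Weak};

// Hypothetical types, for illustration only.
struct Parent {
    name: String,
    children: RefCell<Vec<Rc<Child>>>,
}

struct Child {
    // Weak back reference: does not keep the parent alive.
    parent: Weak<Parent>,
}

// Returns the parent's name, or None if the parent has already been dropped.
fn parent_name(child: &Child) -> Option<String> {
    child.parent.upgrade().map(|p| p.name.clone())
}

// Builds a parent with one child that points back at it.
fn build() -> (Rc<Parent>, Rc<Child>) {
    let parent = Rc::new(Parent {
        name: "root".to_string(),
        children: RefCell::new(Vec::new()),
    });
    let child = Rc::new(Child {
        parent: Rc::downgrade(&parent),
    });
    parent.children.borrow_mut().push(Rc::clone(&child));
    (parent, child)
}
```

The cost is that every access through the back reference goes through upgrade() and has to handle the None case, which is the part that feels heavy coming from C/C++.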


#2

Before 1.0, Rust used to have the @ symbol for garbage-collected pointers, which actually used reference counting. So it’s been somewhat explored, and then removed from the language. But the self-referential struct problem is still unsolved.

The current approach is the rental crate. I’ve seen some proposals to add “immovable” types to the language that would allow self-referential structs to be safe.

Plain pointers in Rust are an unsafe type, so *mut Type is essentially this unchecked back-reference type. You can place them in structs and copy them in safe code. You only need unsafe{} to dereference them (but of course it’s up to you to ensure the memory is still allocated and the struct hasn’t been memmove'd elsewhere).
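A minimal sketch of what that looks like, with a made-up Node type and helper names. Storing the raw pointer is safe code; only the dereference needs unsafe, and keeping the parent alive and unmoved is entirely the caller's responsibility.

```rust
// Hypothetical type, for illustration only.
struct Node {
    value: i32,
    // Unchecked back reference; null for the root.
    parent: *const Node,
}

/// Reads the parent's value through the raw pointer.
///
/// Safety: if `node.parent` is non-null, the caller must guarantee the parent
/// is still allocated and has not been moved since the pointer was taken.
unsafe fn parent_value(node: &Node) -> Option<i32> {
    if node.parent.is_null() {
        None
    } else {
        // The only operation here that requires `unsafe`.
        Some(unsafe { (*node.parent).value })
    }
}

fn example() -> Option<i32> {
    // Boxed so the parent has a stable heap address.
    let root = Box::new(Node { value: 10, parent: std::ptr::null() });
    // Creating and storing the raw pointer is safe code.
    let child = Node { value: 20, parent: &*root };
    // `root` is still alive and unmoved here, so the dereference is valid.
    unsafe { parent_value(&child) }
}
```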


#3

Thanks for the great answer!

I find the keyword “unsafe” really gets under my skin - when it just means “unchecked.” It’s no worse than normal C/C++.


#4

This has specific meaning in Rust. Dereferencing a raw pointer, which is “unchecked” for validity by the compiler, can lead to memory unsafety, hence “unsafe”.

C/C++ are entirely unsafe (in Rust terms) languages to begin with, so there’s no distinction to be made there. Rust has (memory) safe and unsafe portions of the language, and the distinction is important.


#5

Yeah I get it. Admittedly, I’m quibbling over semantics but using “unsafe” as a keyword feels overstated and implies dire consequences when all it does is direct the compiler to stop being so “helpful”. Lol.


#6

The thing is, there is no use splitting this up further: any one of the things you can only do in unsafe blocks can lead to reading or writing invalid memory locations, or to data races.


#7

Maybe a better analogy would be how C++ made C-style casting safer by identifying specific use cases and introducing several kinds of much more limited casts to address them (const_cast, static_cast, dynamic_cast.) It’s in that sense that I humbly suggest perhaps “unsafe” doesn’t have to necessarily be all or nothing.


#8

@ibkev You might be able to address this by making a module with a more specific scary name than unsafe, then bundle all your specific functionality in that module. This is a normal way to use modules and I’m pretty sure it’s greppable the way unsafe is unless you break discipline by importing the module unqualified. It doesn’t give you block syntax, but if you really wanted you could expose your unsafe functions as fns-taking-closures to get it back.

Example: if your program has a type that you want to split into two parts which have an overlapping view of some data, you’ll need to use unsafe. But you can wrap that like this:

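mod unsafe_abc {
  use super::{ABC, AB, BC}; // your own types, defined elsewhere

  // Bodies elided: they would hold the actual unsafe code.
  pub fn split<'q>(abc: &'q mut ABC) -> (&'q mut AB, &'q mut BC) { todo!() }
  pub fn split2<F, X>(abc: &mut ABC, callback: F) -> X where F: Fn(&mut AB, &mut BC) -> X { todo!() }
}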

(You could even just name it unsafe_abc_split to prevent people from accidentally importing it unqualified.)
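Here is a filled-in sketch of the same pattern. To keep it sound, it assumes the two parts do not overlap: it splits a slice into two disjoint halves, much like std's split_at_mut. The module and function names are made up for illustration.

```rust
// Hypothetical module: every use of `unsafe` for this feature lives here.
mod unsafe_split {
    use std::slice;

    /// Splits `s` into two non-overlapping mutable halves at `mid`.
    pub fn split<T>(s: &mut [T], mid: usize) -> (&mut [T], &mut [T]) {
        let len = s.len();
        assert!(mid <= len, "mid out of bounds");
        let ptr = s.as_mut_ptr();
        // Sound because [0, mid) and [mid, len) are disjoint and both lie
        // inside the original borrow.
        unsafe {
            (
                slice::from_raw_parts_mut(ptr, mid),
                slice::from_raw_parts_mut(ptr.add(mid), len - mid),
            )
        }
    }

    /// Closure-taking variant: gives callers something close to block syntax.
    pub fn with_split<T, X, F>(s: &mut [T], mid: usize, callback: F) -> X
    where
        F: FnOnce(&mut [T], &mut [T]) -> X,
    {
        let (left, right) = split(s, mid);
        callback(left, right)
    }
}

fn example() -> [i32; 4] {
    let mut data = [1, 2, 3, 4];
    // Qualified call, so it stays easy to grep for.
    unsafe_split::with_split(&mut data[..], 2, |left, right| {
        left[0] += right[0];
    });
    data
}
```

Callers never write unsafe themselves, and grepping for unsafe_split finds every use.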

Good luck!


#9

I think it’s not just unsafe. It’s undefined behavior to create multiple &mut references to overlapping data. You’d probably need to use pointers for that, and your calling code will still require unsafe.
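A rough sketch of what the pointer version could look like. All names here are made up, and this is only one possible shape. Each view holds a raw pointer instead of a &mut, so the two views can overlap on field b, but every access is unsafe and callers must make sure the underlying struct outlives both views.

```rust
// Hypothetical types, for illustration only.
struct Abc {
    a: i32,
    b: i32,
    c: i32,
}

// Views over (a, b) and (b, c). They overlap on `b`, so they hold raw
// pointers rather than `&mut` references.
struct AbView {
    ptr: *mut Abc,
}

struct BcView {
    ptr: *mut Abc,
}

impl AbView {
    /// Safety: the `Abc` must be alive and not accessed elsewhere during the call.
    unsafe fn add_to_b(&self, n: i32) {
        unsafe { (*self.ptr).b += n }
    }

    /// Safety: same as `add_to_b`.
    unsafe fn a_plus_b(&self) -> i32 {
        unsafe { (*self.ptr).a + (*self.ptr).b }
    }
}

impl BcView {
    /// Safety: same as `AbView::add_to_b`.
    unsafe fn b_plus_c(&self) -> i32 {
        unsafe { (*self.ptr).b + (*self.ptr).c }
    }
}

// Creating the views is safe; using them is not.
fn split(abc: &mut Abc) -> (AbView, BcView) {
    let ptr: *mut Abc = abc;
    (AbView { ptr }, BcView { ptr })
}

fn example() -> (i32, i32) {
    let mut abc = Abc { a: 1, b: 2, c: 3 };
    let (ab, bc) = split(&mut abc);
    // The calling code still needs `unsafe`.
    unsafe {
        ab.add_to_b(40);
        (ab.a_plus_b(), bc.b_plus_c())
    }
}
```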