Hey everyone, I'm currently working on a guide to global data in Rust, including `let`, `const`, `include_str`, `include_bytes`, `lazy_static`, `phf`, `include`, and `static mut`. I would love to hear your feedback on how the guide can be improved and augmented!
I would consider adding thread_local! and once_cell to the guide. Personally, I find them very useful and they seem to be popular options for others as well.
I also find `arc-swap` useful. It allows me to load a new version of the object without locking.
I could see dependency injection in your list too.
There is a pretty good answer to singletons on StackOverflow which might be helpful as well.
Thanks for the suggestion! I added a `once_cell` example. How would I use `thread_local!` for global data? By making thread-local copies of an immutable configuration, maybe?
@ronlobo thanks for pointing me to that Stack Overflow thread!
I think DI is a great idea, but I'm a little hesitant to add it in here as I feel like DI is a deep topic in and of itself. Maybe I could take a swing at writing a DI guide, if one doesn't already exist.
> How would I use `thread_local!` for global data? By making thread-local copies of an immutable configuration, maybe?
It might be outside the scope of the article since the data will be scoped to the current thread and not strictly global, but I feel `thread_local!` is like The Fourth Beatle that goes by unnoticed even when it should be considered.
- Some programs are simply single-threaded. You can have mutable global data and avoid the cost of synchronization since you don't need it.
- Some multi-threaded programs don't need shared mutable state between threads, but still want the advantages of thread-local data, which can be great for certain APIs or just for the initialization guarantees (only one instance can exist per thread). I've seen globals that require synchronization used in these cases just because the authors didn't know about `thread_local!`.
- Types don't need to be `Sync`.
- You can use a `RefCell` for interior mutability.
- Values get dropped when the thread exits (with some caveats).

And probably more.
Granted, `once_cell` can cover these cases as well, but since `thread_local!` is in the standard library I thought it might be worth a mention.
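For anyone wondering what the points above look like in practice, here is a minimal sketch using only the standard library. The `LOG` buffer and the `log` helper are hypothetical names made up for illustration. Each thread gets its own `RefCell<Vec<String>>`, so no `Sync` bound or lock is needed.

```rust
use std::cell::RefCell;
use std::thread;

thread_local! {
    // Each thread gets its own lazily initialized buffer (hypothetical example data).
    static LOG: RefCell<Vec<String>> = RefCell::new(Vec::new());
}

/// Appends a message to the current thread's buffer and returns the new length.
fn log(msg: &str) -> usize {
    LOG.with(|buf| {
        let mut buf = buf.borrow_mut();
        buf.push(msg.to_owned());
        buf.len()
    })
}

fn main() {
    log("main: first");
    let on_main = log("main: second");

    // A spawned thread starts with its own fresh, empty buffer.
    let on_worker = thread::spawn(|| log("worker: first")).join().unwrap();

    println!("main has {on_main} entries, worker has {on_worker}");
}
```

The trade-off is the one mentioned above: the data is per-thread rather than truly global, so a value written on one thread is never visible on another.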
Yeah, I think I want to keep it out of the scope of the article, but I would like to write something about it because I know that `thread_local!` exists but I have never really known what to use it for.
Careful, `static mut` is a construct that still exists for backwards compatibility; it is expected to be deprecated eventually, given how easy it is to write unsound code and trigger UB when using it.
So I highly advise against suggesting its usage. It can always be replaced with a plain `static VAR` using safe shared-mutable wrappers, such as `Mutex`/`RwLock` (in conjunction with `lazy_static!` or `once_cell`, or some `const fn` constructor such as `::parking_lot::const_mutex`) in a multi-threaded context, and `Cell`/`RefCell` (in conjunction with `thread_local!`) otherwise.
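As a rough illustration of the multi-threaded case, here is a sketch that swaps in the standard library's `Mutex` rather than `parking_lot`, assuming a toolchain where `Mutex::new` is a `const fn` (Rust 1.63 or later). `COUNTER` and `increment` are hypothetical names.

```rust
use std::sync::Mutex;
use std::thread;

// A plain (non-`mut`) static wrapped in a safe shared-mutable type.
// Assumes Rust 1.63+, where `std::sync::Mutex::new` is a `const fn`.
static COUNTER: Mutex<u64> = Mutex::new(0);

/// Bumps the global counter and returns its new value.
fn increment() -> u64 {
    let mut guard = COUNTER.lock().unwrap();
    *guard += 1;
    *guard
}

fn main() {
    // Several threads mutate the global without any `unsafe`.
    let handles: Vec<_> = (0..4).map(|_| thread::spawn(increment)).collect();
    for handle in handles {
        handle.join().unwrap();
    }
    println!("counter = {}", *COUNTER.lock().unwrap());
}
```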
- Regarding FFI, instead of `*mut Thing`, you use a `*const Mutex<Thing>`, for instance, and `(*ptr).lock()` it on usage in the general case. In practice, when `Thing` is just an integer, you could simply cast your `*const Cell<integer>` to a `*mut integer` and operate on the latter, provided you never upgrade that pointer to `&'_ integer` or `&'_ mut integer` (but staying with a `*const Cell<integer>` that you upgrade to `&'_ Cell<integer>` to use `.set()` and `.get()` is less error-prone).
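A sketch of both pointer shapes described above, under some assumptions: `Thing`, `thing_hit`, and `counter_bump` are hypothetical names, and the foreign side is assumed to treat the pointers as opaque handles that it only passes back.

```rust
use std::cell::Cell;
use std::sync::Mutex;

/// Hypothetical state shared with foreign code.
pub struct Thing {
    pub hits: u32,
}

/// General case: the foreign side holds a `*const Mutex<Thing>` and every access locks.
///
/// # Safety
/// `ptr` must be non-null and point to a live `Mutex<Thing>`.
// The mutex is opaque to C, which only hands the pointer back.
#[allow(improper_ctypes_definitions)]
pub unsafe extern "C" fn thing_hit(ptr: *const Mutex<Thing>) -> u32 {
    let mutex: &Mutex<Thing> = unsafe { &*ptr };
    // Recover from poisoning instead of panicking inside an `extern "C"` function.
    let mut guard = mutex.lock().unwrap_or_else(|e| e.into_inner());
    guard.hits += 1;
    guard.hits
}

/// Integer case: upgrade the pointer to `&Cell<i32>` and use `.get()` / `.set()`.
///
/// # Safety
/// `ptr` must be non-null, point to a live `Cell<i32>`, and only be used from one thread.
pub unsafe extern "C" fn counter_bump(ptr: *const Cell<i32>) -> i32 {
    let cell: &Cell<i32> = unsafe { &*ptr };
    cell.set(cell.get() + 1);
    cell.get()
}

fn main() {
    let thing = Mutex::new(Thing { hits: 0 });
    let counter = Cell::new(41);

    // References coerce to the raw pointers a foreign caller would hold.
    let hits = unsafe { thing_hit(&thing) };
    let count = unsafe { counter_bump(&counter) };
    println!("hits = {hits}, counter = {count}");
}
```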
Hey @Yandros, thanks for this! You're right that including mutable static items in the guide was a mistake. I removed the examples and replaced them with a dire warning. I also added a section on immutable statics with a `parking_lot` example to give people an idea of how they can implement mutable global data more safely. I feel bad about including mutable statics because I know that some programmers will think of them as an easy way to "just get things done" without necessarily understanding all the requirements to use them safely.
Great! I love how it is phrased in the repo right now
Hey, that's a great idea for a guide! Here's my addition.
I have a pretty specific use-case: a library that is `LD_PRELOAD`-ed into a game, hooking some functions. Since I know that pretty much all of the functions I care about are going to be called from a single main game thread, I came up with this global variable scheme.
- There's a `MainThreadMarker` struct, non-`Send`/`Sync`. If you have one of these you're on the main game thread. Hooked functions construct the marker and pass it down:

      use std::marker::PhantomData;

      /// This marker serves as a static guarantee of being on the main game thread. Functions that
      /// should only be called from the main game thread should accept an argument of this type.
      #[derive(Clone, Copy)]
      pub struct MainThreadMarker {
          // Mark as !Send and !Sync.
          _marker: PhantomData<*const ()>,
      }

      impl MainThreadMarker {
          /// Creates a new `MainThreadMarker`.
          ///
          /// # Safety
          /// This should only be called from the main game thread.
          #[inline]
          pub unsafe fn new() -> Self {
              Self {
                  _marker: PhantomData,
              }
          }
      }

      // `abort_on_panic` is a helper that is not shown in this post.
      #[no_mangle]
      pub unsafe extern "C" fn Host_Shutdown() {
          abort_on_panic(move || {
              let marker = MainThreadMarker::new();
              some_rust_function(marker);
          });
      }

- All global state is stored in `MainThreadCell` or `MainThreadRefCell`s, which provide safe access if you have a marker:

      use std::cell::Cell;

      /// Cell accessible only from the main thread.
      pub struct MainThreadCell<T>(Cell<T>);

      // Safety: all methods are guarded with MainThreadMarker.
      unsafe impl<T> Send for MainThreadCell<T> {}
      unsafe impl<T> Sync for MainThreadCell<T> {}

      impl<T> MainThreadCell<T> {
          /// Creates a new `MainThreadCell` containing the given value.
          pub const fn new(value: T) -> Self {
              Self(Cell::new(value))
          }

          /// Sets the contained value.
          pub fn set(&self, _marker: MainThreadMarker, val: T) {
              self.0.set(val);
          }
      }

      impl<T: Copy> MainThreadCell<T> {
          /// Returns a copy of the contained value.
          pub fn get(&self, _marker: MainThreadMarker) -> T {
              self.0.get()
          }
      }

      static GLOBAL_MUTABLE_VALUE: MainThreadCell<i32> = MainThreadCell::new(16);

      fn some_rust_function(marker: MainThreadMarker) {
          let value = GLOBAL_MUTABLE_VALUE.get(marker);
          GLOBAL_MUTABLE_VALUE.set(marker, 32);
      }

This way there's no runtime overhead (present with `thread_local!` or mutexes) and it still ensures the global data doesn't have multiple exclusive references (with the `RefCell` variant).
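The `MainThreadRefCell` variant is mentioned but not shown. One way it might look, purely as a sketch and not the poster's actual code: a `RefCell` wrapper whose `borrow`/`borrow_mut` require a marker. The `NAMES` static and `register` function are hypothetical, and the marker type is repeated here so the example compiles on its own.

```rust
use std::cell::{Ref, RefCell, RefMut};
use std::marker::PhantomData;

#[derive(Clone, Copy)]
pub struct MainThreadMarker {
    // Raw pointers are neither Send nor Sync, so the marker isn't either.
    _marker: PhantomData<*const ()>,
}

impl MainThreadMarker {
    /// # Safety
    /// Must only be called from the main thread.
    pub unsafe fn new() -> Self {
        Self { _marker: PhantomData }
    }
}

/// Sketch of a `RefCell` that can only be borrowed with a marker in hand.
pub struct MainThreadRefCell<T>(RefCell<T>);

// Safety: every accessor requires a `MainThreadMarker`, which never leaves the main thread.
unsafe impl<T> Sync for MainThreadRefCell<T> {}

impl<T> MainThreadRefCell<T> {
    pub const fn new(value: T) -> Self {
        Self(RefCell::new(value))
    }

    /// Shared borrow; panics if the value is currently mutably borrowed.
    pub fn borrow(&self, _marker: MainThreadMarker) -> Ref<'_, T> {
        self.0.borrow()
    }

    /// Exclusive borrow; panics if the value is currently borrowed.
    pub fn borrow_mut(&self, _marker: MainThreadMarker) -> RefMut<'_, T> {
        self.0.borrow_mut()
    }
}

// Hypothetical global state.
static NAMES: MainThreadRefCell<Vec<String>> = MainThreadRefCell::new(Vec::new());

/// Adds a name to the global list and returns the new length.
fn register(marker: MainThreadMarker, name: &str) -> usize {
    let mut names = NAMES.borrow_mut(marker);
    names.push(name.to_owned());
    names.len()
}

fn main() {
    // Safety: this sketch treats the thread running `main` as the main thread.
    let marker = unsafe { MainThreadMarker::new() };
    register(marker, "player");
    println!("{} registered", NAMES.borrow(marker).len());
}
```

Note that the `RefCell` still does its usual dynamic borrow check, so overlapping `borrow_mut` calls panic rather than alias.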
Hey @YaLTeR, that's an interesting technique. Thanks for sharing! Is there a way to encapsulate the unsafe functionality in a crate so that others can make use of the technique without needing to write any `unsafe`? Right now the guide is only safe code, and I think it might be better to keep it that way.
There might be a way to remove `unsafe` from `MainThreadMarker::new()` by storing the thread ID on the first call and then verifying it's the same on subsequent calls, but I haven't explored this yet.
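That idea could be sketched roughly like this, as an unexplored assumption rather than a tested design. It uses `std::sync::OnceLock` (Rust 1.70+), the `try_new` name and the `MAIN_THREAD` static are hypothetical, and it treats whichever thread asks first as the main thread.

```rust
use std::marker::PhantomData;
use std::sync::OnceLock;
use std::thread::{self, ThreadId};

// ID of the first thread that asked for a marker (hypothetical static).
static MAIN_THREAD: OnceLock<ThreadId> = OnceLock::new();

#[derive(Clone, Copy)]
pub struct MainThreadMarker {
    // Mark as !Send and !Sync.
    _marker: PhantomData<*const ()>,
}

impl MainThreadMarker {
    /// Returns a marker only on the thread that made the very first call.
    pub fn try_new() -> Option<Self> {
        let current = thread::current().id();
        // The first caller records its ID; later callers are compared against it.
        let main = *MAIN_THREAD.get_or_init(|| current);
        (main == current).then_some(Self { _marker: PhantomData })
    }
}

fn main() {
    // The first call succeeds and pins the "main" thread.
    assert!(MainThreadMarker::try_new().is_some());

    // Any other thread is refused a marker.
    let from_worker = thread::spawn(|| MainThreadMarker::try_new().is_some())
        .join()
        .unwrap();
    assert!(!from_worker);
}
```

This trades the `unsafe` constructor for a small runtime check on every call, so it gives up part of the "no runtime overhead" property of the original scheme.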