Hi! I'm trying to create a global immutable config in Rust. In C++, I'd typically write something like:
const Config config = load_config();
But in Rust, the compiler tells me I can't use a non-const function to initialize static variables.
I found some solutions such as LazyLock and OnceLock, but they check whether the variable is initialized every time I read it.
In my case, I'm initializing at the start of the program before any new thread is created, so these runtime checks feel unnecessary. What is the best practice in this case? Thanks in advance.
This check boils down to reading and comparing an atomic integer on every major platform. If you need access to your config in your hot loop, I personally would simply copy it out of the lock to get rid of the check. Otherwise (which would be 99.9999% of the time), I would dereference the lock without a second thought about any runtime overhead.
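A minimal sketch of hoisting the check out of a hot loop. The `Config` type, its `scale` field, `load_config`, and `sum_scaled` are hypothetical names used only for illustration. Here a reference is taken out of the `OnceLock` once rather than copying the value, which has the same effect on the loop body.

```rust
use std::sync::OnceLock;

// Hypothetical config type, for illustration only.
struct Config {
    scale: u64,
}

static CONFIG: OnceLock<Config> = OnceLock::new();

// Stand-in for the real loader (assumed; it might parse a file or env vars).
fn load_config() -> Config {
    Config { scale: 3 }
}

// Returns the global config, initializing it on first use.
fn config() -> &'static Config {
    CONFIG.get_or_init(load_config)
}

fn sum_scaled(values: &[u64]) -> u64 {
    // The initialization check happens once here, outside the loop.
    let cfg = config();
    let mut total = 0;
    for v in values {
        // Inside the loop this is a plain field read through a reference.
        total += v * cfg.scale;
    }
    total
}

fn main() {
    println!("{}", sum_scaled(&[1, 2, 3]));
}
```

Whether this makes a measurable difference depends on the workload; it is worth benchmarking before restructuring code around it.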
Don't use a global variable for this. Instead, pass Config as a parameter to whatever function needs it. This also makes it possible to test your functions with different configs.
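A minimal sketch of the parameter-passing approach. The `Config` fields and the `render` function are hypothetical and exist only to show the shape of the pattern.

```rust
// Hypothetical config type, for illustration only.
struct Config {
    greeting: String,
    repeat: usize,
}

// The function takes the config it needs instead of reaching for a global.
fn render(config: &Config, name: &str) -> String {
    let line = format!("{}, {}!", config.greeting, name);
    vec![line; config.repeat].join("\n")
}

fn main() {
    // Built once at startup, then passed down by reference.
    let config = Config {
        greeting: "Hello".to_string(),
        repeat: 2,
    };
    println!("{}", render(&config, "world"));
}
```

Because `render` only sees what it is handed, a test can construct any `Config` it likes without touching process-wide state, and no initialization check is involved at all.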
first of all, using globals for convenience should be avoided.
second, the runtime check you mentioned is probably the last thing you need to worry about. for one, a program is unlikely to check its configuration in any hot path; what's more, it's just a single atomic load and a conditional branch, which, in the case of immutable data, should always be well-predicted on modern CPUs and costs practically nothing.
unlike C++, rust does NOT support running code before the main function, so it is impossible to guarantee at the type level in safe rust that a global variable is initialized at runtime. if you insist on eliminating the runtime "overhead", it's only possible using unsafe.
you need something like SyncWrapper<UnsafeCell<MaybeUninit<Config>>>. SyncWrapper<UnsafeCell<...>> can be SyncUnsafeCell<...> if you are on nightly. the trade-off is, you end up replacing each runtime check OnceLock::get().unwrap() with an unsafe operation MaybeUninit::assume_init_ref(). the "overhead" is removed, and so is the safety guarantee of the type system. if you made a mistake somewhere, you would get UB instead of a panic.
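A rough sketch of what that unchecked version could look like on stable Rust. `SyncWrapper` here is a hand-written, hypothetical newtype (not a standard library type), and `Config`, `init_config`, and `config` are illustrative names. The safety contract lives entirely in comments, which is exactly the weakness described above.

```rust
use std::cell::UnsafeCell;
use std::mem::MaybeUninit;

// Hypothetical config type, for illustration only.
struct Config {
    verbose: bool,
}

// Hand-written stand-in for nightly's SyncUnsafeCell.
struct SyncWrapper<T>(UnsafeCell<T>);

// SAFETY: only valid under the protocol documented on the functions below.
// Nothing in the type system enforces that protocol.
unsafe impl<T: Sync> Sync for SyncWrapper<T> {}

static CONFIG: SyncWrapper<MaybeUninit<Config>> =
    SyncWrapper(UnsafeCell::new(MaybeUninit::uninit()));

/// # Safety
/// Must be called before any call to `config`, while no other thread exists
/// and no reference returned by `config` is alive.
unsafe fn init_config(value: Config) {
    unsafe { CONFIG.0.get().write(MaybeUninit::new(value)) };
}

/// # Safety
/// `init_config` must have completed before this is called.
unsafe fn config() -> &'static Config {
    let slot: &'static MaybeUninit<Config> = unsafe { &*CONFIG.0.get() };
    // No runtime check: if the slot was never written, this is UB, not a panic.
    unsafe { slot.assume_init_ref() }
}

fn main() {
    // SAFETY: first statement of main, no threads have been spawned yet.
    unsafe { init_config(Config { verbose: true }) };
    // SAFETY: init_config has completed above.
    let cfg = unsafe { config() };
    println!("verbose = {}", cfg.verbose);
}
```

Every call site of `config` now needs its own `unsafe` block and its own justification, which is the cost being traded for the removed atomic load.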
There are various tricks, but ultimately it all boils down to the question “why is that car sold without seat belt alarm stoppers”… if you want to die, then there are other car manufacturers, and if you want to see your program crash, then there are other languages…
at the end of day, C++ global constructors are just an array of callbacks in some special linker sections that the C/C++ runtime startup code automatically calls. sure, you can put a callback there in rust, but it's just emulating foreign language features, it's not "native" to rust. also, the link_section attribute in rust is unsafe nevertheless.
If it's really immutable and it's really const, just use a static variable and initialize it before doing anything else in main. Don't use the static variable anywhere except the initializer, create a function which returns an immutable reference to the static and only access it this way.
For some reason, all the posters on this forum will only say "don't do this" instead of trying to answer basic questions.
I wonder if there’s some way to force unsafe to be used somewhere in the program to acknowledge the requirement. If unsafe has to be used on each access, that’s not great, and conversely, requiring unsafe on only the initializer is insufficient (purely-safe code could simply never call the initializer and still experience UB).
It's workable if you don't have too many accesses and mostly pass the config around by reference (one of the suggestions above).
That's not great, but it's also necessary. Because in C++ the handling of global variables (especially global variables with state that's initialized at runtime) is a common PITA: every module assumes it will be the first to be initialized, then these modules conflict, then you add a language extension to handle that, then you find out that now your standard library is not fully usable, then you add some more kludges…
All of this can be recreated in Rust (using ctor, e.g.), if that's something that is imposed on you by someone… but why do that of your own volition if nobody forces you?
I simultaneously agree with “we should not shame people for asking the “wrong” questions”
while also thinking “we should guide them away from an approach which will cause them a lot of pain”.
I guess I (and others) tend to forget the part where I explain why their intended route is achievable but a hassle.
Isn't that already well-trodden ground in the C++ community? Style guides and blog posts, maybe not whole books, but close to it, all essentially say that you have to do what Rust enforces: either initialize a static with a constexpr expression or put it in a function (which would do what OnceLock is doing).
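For comparison, a minimal Rust sketch of the "put it in a function" option, roughly analogous to a C++ function-local static. `Config`, `load_config`, and the `LOADS` counter are hypothetical; the counter is there only to show that the initializer runs once. `LazyLock` requires Rust 1.80 or later.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::LazyLock;

// Hypothetical config type, for illustration only.
struct Config {
    name: String,
}

// Counts how many times the loader actually runs (demonstration only).
static LOADS: AtomicUsize = AtomicUsize::new(0);

// Stand-in for the real loader (assumed).
fn load_config() -> Config {
    LOADS.fetch_add(1, Ordering::Relaxed);
    Config {
        name: "demo".to_string(),
    }
}

fn config() -> &'static Config {
    // Function-local static: initialized on first call, reused afterwards.
    static CONFIG: LazyLock<Config> = LazyLock::new(load_config);
    &*CONFIG
}

fn main() {
    println!("{}", config().name);
    println!("{}", config().name);
    println!("loader ran {} time(s)", LOADS.load(Ordering::Relaxed));
}
```

Keeping the static inside the function means the accessor is the only way to reach it, so no caller can observe it uninitialized.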
I could have expected that such discussion would be needed for someone with JavaScript background, or, maybe, Java background, but C++?
C++ programmers know all too well why they shouldn't use statics! They just use these for a “quick starter example”, which invariably becomes a “production-released pile of hacks” that you then need to support… Rust stops that train the moment it tries to leave the station, that's all…
Static const variables can be used soundly. Certainly claiming this feature exists explicitly to make programs unsound is inflammatory hyperbole.
I find it somewhat ironic that you have linked to a blog post describing a situation in which a package maintainer was relentlessly bullied for having written not so good code, to the point of having to quit maintaining this package due to mental stress, then proceed to engage in bullying of the same type.
Not if they are mutable. And if they are immutable you couldn't initialize them.
C++ does that, too, except for globals that are initialized before main (and thus are assumed to be safe to initialize without locks).
Function that provides “safe” access to a mutable static without any checks? What else can they be used for?
It's classic “export of unsafety from safe function”.
Why is it ironic? As I wrote: that's something both sad yet necessary. Because safety of Rust programs depends on people not trying to play these games… and the only known way to convince some people to stop is to exorcise them…
It seems in this case the OP has already found the "most correct" solution (OnceCell and the like) and is asking how to implement this common pattern from other languages without the additional runtime overhead of checking the cell's initialization status.
There are two solutions to a problem, A and B, and A is flawed and B fixes those flaws. Maybe a reasonable approach is to present solution B and ignore A, to guide users to the best solution. Maybe another reasonable approach is to educate the user: present both solutions and then explain why A is flawed and how B fixes those flaws, lest the user discover solution A themselves in the wild and say "screw it, I'm going to use solution A instead of B", unaware of the flaws inherent to solution A.
In this specific case, the OP is already aware of solution B. I think it is clearly no longer tenable to ignore the existence of solution A, or to tell the user that they shouldn't even have this problem to begin with, as several commenters have done here. It's almost like many of these commenters didn't even read the entire post, they just see "how do I do global variables" and immediately feel a visceral hatred towards the asker. (Sorry for the rant, you are not the one making such comments. But perhaps one of the few self aware individuals in this thread who genuinely care about pedagogy in rust).
To be fair, I believe the code could be sound in a binary, though not in a library. That is, you could define the static and a wrapper around it in main.rs or something, and in a SAFETY comment, cite the fact that you run the initializer before anything else in fn main.
Of course, this is of limited utility. And should not actually be used anyway. I’m just playing devil’s advocate.
It's not sound by the very definition of soundness. If you need to write a SAFETY declaration then it's not a sound interface. It can be correct in certain situations, sure, and, in some even rarer cases, necessary… but that's not a reason to make such things easy.
Why do you think set_var and remove_var were made unsafe in Rust 2024? They were causing crashes. In programs where people believed them to be “safe”.
…huh? The purpose of a SAFETY comment is to explain why something is sound. I didn’t mean a Safety section of the function documentation.
For clarity, I meant a SAFETY comment within the wrapper around the static, above the unsafe block accessing the static, explaining why the safety invariants are necessarily upheld. If the safety invariants cannot possibly be violated by safe code, the code is sound. And unsafe code may not be allowed to make assumptions of arbitrary user code, but it IS allowed to make assumptions about “what are the contents of the fully-written-out function directly below me?”. Maybe add a comment to fn main so that someone doesn’t accidentally break the code, but I’d consider it perfectly sound to rely on concrete code in the same module.
it's not that it can't be done, but it's not the best way to solve the problem, especially for someone who came to rust from C++ not very long ago. it's better to just use OnceLock or LazyLock and be happy with it.
In general you can't guarantee this within Rust.[1] For example, you don't know what linked code does (including libc). Maybe someone's pulling in a DBUS library with a static initializer or such.
That has been discussed at length around issues like the set_var (Unix environment) unsoundness. That said, your scenario is a bit different since you control the global resource.
But even then, there's no way to make "initialize global in main, assume init elsewhere with no checks" sound without a static atomic or similar in main to ensure you avoid initializing more than once. (main is safely callable from everywhere in the crate.) And if there may be signal handlers or panic handlers, etc, that access the global, you better make sure they're impossible to invoke before initialization too.
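A rough sketch of the "static atomic" guard mentioned here, under the assumption that reads stay unchecked and only initialization is guarded. All names (`Config`, `Slot`, `INIT_STARTED`, `try_init_config`, `config`) are hypothetical. This does not address the signal-handler or panic-handler cases; those remain the caller's responsibility.

```rust
use std::cell::UnsafeCell;
use std::mem::MaybeUninit;
use std::sync::atomic::{AtomicBool, Ordering};

// Hypothetical config type, for illustration only.
struct Config {
    workers: usize,
}

struct Slot(UnsafeCell<MaybeUninit<Config>>);

// SAFETY (assumed protocol): written at most once, guarded by INIT_STARTED,
// and only read after that write has completed.
unsafe impl Sync for Slot {}

static SLOT: Slot = Slot(UnsafeCell::new(MaybeUninit::uninit()));
static INIT_STARTED: AtomicBool = AtomicBool::new(false);

// Safe to call from anywhere: returns false and writes nothing if
// initialization was already attempted.
fn try_init_config(value: Config) -> bool {
    if INIT_STARTED.swap(true, Ordering::AcqRel) {
        return false;
    }
    // SAFETY: the swap above lets exactly one caller ever reach this write.
    unsafe { SLOT.0.get().write(MaybeUninit::new(value)) };
    true
}

/// # Safety
/// A call to `try_init_config` must have returned `true` before this is called.
unsafe fn config() -> &'static Config {
    let slot: &'static MaybeUninit<Config> = unsafe { &*SLOT.0.get() };
    unsafe { slot.assume_init_ref() }
}

fn main() {
    assert!(try_init_config(Config { workers: 4 }));
    // A second attempt, for example from a re-entrant call to main, is refused.
    assert!(!try_init_config(Config { workers: 8 }));
    // SAFETY: initialization succeeded above.
    println!("workers = {}", unsafe { config() }.workers);
}
```

The atomic is only touched during initialization, so reads stay check-free, but the read side is still `unsafe` and still relies on the caller having initialized first.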
Suggesting unsafe when it reasonably isn't needed, especially to a newcomer, is against the culture
Once you're in unsafe territory, the answers are almost always non-basic or non-sound (the latter of which is extremely against the culture)
Also some of the answers are suggesting alternatives like "pass a parameter instead". While that is "don't do this" in terms of having a global, it is idiomatic and "how you achieve this" in terms of avoiding checks every time you read the config... and requires no unsafe.