How do Rustaceans handle configuration values?


#1

I’m writing a server program that needs to read values from a config file and store them. In other languages, I typically set up a struct to hold those values, instantiate it globally, set the values during the initialization phase of the program and then, if the language supports it, make the instance immutable.

From what I understand, this type of approach is not recommended in Rust. I’m left wondering how Rustaceans handle configuration variables. One alternative that comes to mind is passing a reference to a non-global instance to every function/method that needs that information. While this would be about as fast as a global, it feels very un-ergonomic to me.

Any suggestions or recommendations are appreciated. Ideally, I would like to avoid explicitly passing or referencing it, and to avoid pulling in crates for something that I feel should be quite trivial.


#2

Some languages make it look like a trivial problem, but it isn’t a trivial problem :slight_smile: And in Rust, if you don’t want to use external crates you need to reinvent wheels, so using crates is usually the preferred way. The low-level solution is to use a crate like lazy_static. The higher-level solution is to use a crate like config-rs.


#3

I’d just make a global using the lazy_static! crate.
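For anyone who wants the same shape without a crate, here is a minimal sketch of a write-once global. It swaps lazy_static for the standard library's `std::sync::OnceLock` (stable since Rust 1.70), so it is not the lazy_static API itself. The `Config` fields and the `init_config` / `config` helper names are made up for illustration.

```rust
use std::sync::OnceLock;

/// Hypothetical configuration fields, for illustration only.
#[derive(Debug)]
struct Config {
    listen_addr: String,
    max_connections: usize,
}

// The global slot: empty at startup, written at most once.
static CONFIG: OnceLock<Config> = OnceLock::new();

/// Store the configuration. Returns the rejected value if one was already set.
fn init_config(cfg: Config) -> Result<(), Config> {
    CONFIG.set(cfg)
}

/// Read-only access from anywhere; panics if `init_config` was never called.
fn config() -> &'static Config {
    CONFIG.get().expect("config not initialized")
}

fn main() {
    // Initialization phase: fill the slot exactly once.
    init_config(Config {
        listen_addr: "127.0.0.1:8080".to_string(),
        max_connections: 64,
    })
    .expect("config was already initialized");

    // Everywhere else: read through the shared reference.
    println!(
        "listening on {} (max {} connections)",
        config().listen_addr,
        config().max_connections
    );
}
```

Because `config()` only hands out a shared reference, the value is effectively immutable after initialization, which is roughly the "set once, then freeze" pattern from the original question.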


#4

To chime in on this: configuration is one of the textbook cases of a “cross-cutting concern”; it’s a little thing, but it is needed everywhere. “Everywhere” is always hard, no matter how tiny it is.
Other examples of cross-cutting concerns are logging and database access, for which it is commonly accepted to use high-powered libraries, and people will actively warn you away from rolling your own.

As for the reticence to pull in crates: this is very common for people coming from other languages, and very understandable. My advice is to stop worrying about it in Rust.

In most other languages, pulling in a library is hard and comes with a lot of overhead. Which version to use, how to upgrade it, how to link it to the build system, how to distribute it, etc…
And then some new release comes out and I have to rewrite half my code because some core API changed…
Lots of hassle! So people habitually avoid dependencies for “little” things.
The other side of the coin is JavaScript’s npm ecosystem, where there are single-purpose dependencies for literal one-liners like Leftpad (leftpad! for crying out loud!). There are real advantages to tiny lego-block modules and it Just Works™ because the package manager is so good.

Rust has Cargo, and a community committed to Semantic Versioning (i.e. “not-breaking APIs”, even with tooling).
In that respect, “taking on a dependency” is far more lightweight than in most conventional languages, and more like in the JavaScript/npm ecosystem (but even better!).

Cargo solves almost all of the hassles of distribution and building, and the community focus on not-breaking means upgrades are virtually always a breeze.

Cargo has learned its lessons from Java’s Maven, JavaScript’s npm, Python’s PyPI, Ruby’s Gems/Bundler and Perl’s CPAN, and even the various package managers of Linux distributions. It takes the best properties of all of them.
Dependency management in Rust is probably the least painful/most pleasant in the world right now.
(Barring the open question of how to deal with system dependencies, which is unsolved in any language, and more the domain of Linux distributions, mobile app stores, etc.)

(Edit: added lots of links of where I get these wild ideas from :wink: )


#5

The most scalable way of doing configuration from my experience is to make sure that if something needs a bit of state (not necessarily your entire configuration file!) to alter how it behaves then you pass that in via the constructor. I’ve written my fair share of decently sized applications and this turns out to be a lot easier than you’d think.

Not using globals also helps make your application significantly easier to reason about and more testable! So yeah, give it a try and see how you go :slight_smile: Also don’t be afraid of adding in a clone() or two if you find it’s not reasonable to use references. 99.9% of the time it’s not going to make a difference to your application’s performance, and the other 0.1% of the time you’ll probably know what you’re doing anyway.
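A minimal sketch of what that can look like, with made-up names (`Config`, `Greeter`, `Server` and their fields are all hypothetical): each component receives only the piece of state it needs through its constructor, and a `clone()` is used where holding a reference would be awkward.

```rust
use std::time::Duration;

/// Hypothetical application-wide configuration.
#[derive(Debug, Clone)]
struct Config {
    listen_addr: String,
    request_timeout: Duration,
    greeting: String,
}

/// Only needs the greeting, so that is all it receives.
struct Greeter {
    greeting: String,
}

impl Greeter {
    fn new(greeting: String) -> Self {
        Greeter { greeting }
    }

    fn greet(&self, name: &str) -> String {
        format!("{}, {}!", self.greeting, name)
    }
}

/// Takes what it needs from the config once, at construction time.
struct Server {
    listen_addr: String,
    request_timeout: Duration,
    greeter: Greeter,
}

impl Server {
    fn new(config: &Config) -> Self {
        Server {
            // Cloning a few small strings at startup is cheap.
            listen_addr: config.listen_addr.clone(),
            request_timeout: config.request_timeout,
            greeter: Greeter::new(config.greeting.clone()),
        }
    }

    fn describe(&self) -> String {
        format!(
            "{} (timeout {}s): {}",
            self.listen_addr,
            self.request_timeout.as_secs(),
            self.greeter.greet("world")
        )
    }
}

fn main() {
    let config = Config {
        listen_addr: "127.0.0.1:8080".to_string(),
        request_timeout: Duration::from_secs(30),
        greeting: "Hello".to_string(),
    };
    let server = Server::new(&config);
    println!("{}", server.describe());
}
```

A test can build a `Greeter` directly from a string without any config file or global setup, which is the testability benefit in practice.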

You may also want to check out serde. Of all the libraries and languages I’ve used, serde is by far the nicest way of serialising your stuff between various formats (json, xml, toml, yaml, etc).


#6

Thanks for the input, guys. I really appreciate it!


#7

Seconding serde. It works like magic.


#8

Another nice approach is to use dotenv. This limits your configuration to simple NAME=value pairs that can come from a .env file or from the system environment. While this might seem like a bad limitation, I’ve found that forcing yourself to avoid complex multi-leveled configuration trees is often a good thing (but obviously, that depends on the application).
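To make the NAME=value idea concrete, here is a rough std-only sketch. It is not the dotenv crate's parser or API: the `parse_pairs` and `lookup` functions are hypothetical, quoting and escaping are ignored, and letting the real environment take precedence over the file is a choice made for this sketch.

```rust
use std::collections::HashMap;

/// Parse `NAME=value` lines, skipping blank lines and `#` comments.
/// Simplified on purpose: no quoting, escaping, or `export` prefixes.
fn parse_pairs(text: &str) -> HashMap<String, String> {
    let mut pairs = HashMap::new();
    for line in text.lines() {
        let line = line.trim();
        if line.is_empty() || line.starts_with('#') {
            continue;
        }
        // Split at the first '=' only, so values may contain '='.
        if let Some((name, value)) = line.split_once('=') {
            pairs.insert(name.trim().to_string(), value.trim().to_string());
        }
    }
    pairs
}

/// Check the real environment first, then fall back to the file's pairs.
fn lookup(name: &str, file_pairs: &HashMap<String, String>) -> Option<String> {
    std::env::var(name)
        .ok()
        .or_else(|| file_pairs.get(name).cloned())
}

fn main() {
    // In a real program this text would be read from a `.env` file.
    let file_pairs = parse_pairs("# sample settings\nAPP_PORT=8080\nAPP_NAME = demo\n");
    match lookup("APP_PORT", &file_pairs) {
        Some(port) => println!("APP_PORT is {}", port),
        None => println!("APP_PORT is not set"),
    }
}
```

Every value arrives as a string, so the flat format pushes parsing and validation into one obvious place at startup.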


#9

And FWIW, using environment variables for configuration is #3 in 12 factor apps … Oh, the dotenv crate documentation already makes this reference. So yeah, this is good advice.


#10

My question regarding envvars is whether there are any mechanisms for ensuring that values don’t get changed surreptitiously.

It helps that typically only the owner of the process and root can view or change them. But it seems like a bad idea to put configuration information somewhere it could be changed without any coordination or verification by the processes using that information. Any input on this concern?


#11

AFAIK, the environment is supposed to be treated as static key-value pairs. That doesn’t mean it will be true in every case. But when you take a platform like Python, for example, you have read-only access to the environment with COW semantics. This ensures, among other things, that the environment is left untouched when the process forks.

Without getting into any detail, sure it is possible that the environment may change from underneath you, but the same could be said of internal application state being changed by ptrace.


#12

A great question! An approach which uses no external crates would be to do the following:

  • define a struct with the fields that you need.
  • define a new constructor for it. The routine should grab the needed values out of the environment and bind them to the new struct instance. The constructor could return a Result or panic if needed.
  • wrap the new instance in an Arc (immutable), and pass that around your program.
  • encapsulation is your friend here. Passing the Arc around is cheap, easy, and avoids having to use globals. This can help you a lot when it comes to unit testing.
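
The steps above can be sketched roughly as follows. The variable names `APP_LISTEN_ADDR` and `APP_WORKERS`, the struct fields, the default of 2 workers, and the `from_lookup` helper are all assumptions made for illustration; the helper exists only so that tests can supply values without touching the real environment.

```rust
use std::sync::Arc;
use std::thread;

/// Hypothetical configuration fields, for illustration only.
#[derive(Debug)]
struct Config {
    listen_addr: String,
    workers: usize,
}

impl Config {
    /// Build the config from the process environment.
    fn new() -> Result<Config, String> {
        Config::from_lookup(|name| std::env::var(name).ok())
    }

    /// Build the config from any name-to-value lookup.
    fn from_lookup<F>(lookup: F) -> Result<Config, String>
    where
        F: Fn(&str) -> Option<String>,
    {
        let listen_addr = lookup("APP_LISTEN_ADDR")
            .ok_or_else(|| "APP_LISTEN_ADDR is not set".to_string())?;
        let workers = match lookup("APP_WORKERS") {
            Some(raw) => raw
                .parse::<usize>()
                .map_err(|e| format!("APP_WORKERS: {}", e))?,
            None => 2, // assumed default for this sketch
        };
        Ok(Config { listen_addr, workers })
    }
}

/// Components borrow the config; they never need a global.
fn worker_banner(id: usize, config: &Config) -> String {
    format!("worker {} of {} serving {}", id, config.workers, config.listen_addr)
}

fn main() {
    let config = match Config::new() {
        Ok(config) => Arc::new(config),
        Err(err) => {
            eprintln!("configuration error: {}", err);
            return;
        }
    };

    // Each thread gets its own cheap handle to the same read-only value.
    let handles: Vec<_> = (0..config.workers)
        .map(|id| {
            let config = Arc::clone(&config);
            thread::spawn(move || worker_banner(id, &config))
        })
        .collect();

    for handle in handles {
        println!("{}", handle.join().expect("worker panicked"));
    }
}
```

Cloning an `Arc` only bumps a reference count, and since `Arc<Config>` hands out shared references, nothing downstream can mutate the values after construction.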