Global state and env::set_current_dir

For the first time I have been implementing threads with Rust, and I have struck some global state that I did not expect. As displayed in the little programme below, the "working directory" seems to be global across threads. This does not seem too unreasonable, but it has raised the question for me: what is and is not global state? For example, local information: can different threads have different locals? Where can I find documentation that clarifies what is and is not global state?

use std::env;
use std::thread;
use std::time::Duration;

fn main() {
    let mut hv: Vec<thread::JoinHandle<()>> = Vec::new();
    // Each thread sets the working directory, sleeps, then prints what it sees.
    hv.push(thread::spawn(move || {
        env::set_current_dir("/tmp").unwrap();
        thread::sleep(Duration::from_millis(100));
        println!("current dir (/tmp): {:?}", env::current_dir());
    }));
    hv.push(thread::spawn(move || {
        env::set_current_dir("/dev").unwrap();
        thread::sleep(Duration::from_millis(90));
        println!("current dir (/dev): {:?}", env::current_dir());
    }));
    hv.push(thread::spawn(move || {
        env::set_current_dir("/usr").unwrap();
        thread::sleep(Duration::from_millis(80));
        println!("current dir (/usr): {:?}", env::current_dir());
    }));
    hv.push(thread::spawn(move || {
        env::set_current_dir("/home").unwrap();
        thread::sleep(Duration::from_millis(170));
        println!("current dir (/home): {:?}", env::current_dir());
    }));
    // Wait for every thread to finish before exiting.
    for h in hv {
        h.join().unwrap();
    }
}

Results:

current dir (/usr): Ok("/dev")
current dir (/dev): Ok("/dev")
current dir (/tmp): Ok("/dev")
current dir (/home): Ok("/dev")

The documentation for std::env states right at the top that it is for "Inspection and manipulation of the process's environment", which makes sense, as the "environment" is an OS-level concept that applies to an entire process and not just to a single thread in that process.
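
To make the process-wide versus per-thread distinction concrete, here is a minimal sketch. The `COUNTER` thread-local and the `observe` helper are names made up for this illustration, and it assumes the system temp directory exists and is accessible. A spawned thread changes both a thread-local value and the working directory; only the directory change should be visible from the calling thread.

```rust
use std::cell::Cell;
use std::env;
use std::thread;

thread_local! {
    // Hypothetical per-thread counter: every thread gets its own copy.
    static COUNTER: Cell<u32> = Cell::new(0);
}

/// Bumps the thread-local counter and changes the working directory on a
/// spawned thread, then reports what the calling thread observes.
/// Returns (calling thread's counter, whether the cwd change was visible).
fn observe() -> (u32, bool) {
    let target = env::temp_dir()
        .canonicalize()
        .expect("temp dir should exist");
    let expected = target.clone();

    thread::spawn(move || {
        // Only this worker thread's copy of COUNTER is modified.
        COUNTER.with(|c| c.set(c.get() + 1));
        // The working directory belongs to the whole process.
        env::set_current_dir(&target).expect("could not change directory");
    })
    .join()
    .expect("worker thread panicked");

    let counter_here = COUNTER.with(|c| c.get());
    let cwd_here = env::current_dir()
        .expect("no current directory")
        .canonicalize()
        .expect("could not canonicalize current directory");
    (counter_here, cwd_here == expected)
}

fn main() {
    let (counter, cwd_changed) = observe();
    println!("counter seen by calling thread: {}", counter);
    println!("cwd change visible in calling thread: {}", cwd_changed);
}
```

State declared with `thread_local!` is per-thread by construction, whereas anything reached through std::env goes to the operating system's per-process state.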


To follow up on what @FenrirWolf said, once you know that a function is likely going to involve some OS-level functions, you can look up some OS-specific details. For example, under Windows the function to get the current working directory is called GetCurrentDirectory and it's explicitly documented to apply to the whole process:

Each process has a single current directory

and

The current directory state written by the SetCurrentDirectory function is stored as a global variable in each process, therefore multithreaded applications cannot reliably use this value without possible data corruption from other threads that may also be reading or setting this value.

Source: GetCurrentDirectory function (winbase.h) - Win32 apps | Microsoft Learn

This should be better documented in set_current_dir.

It sounds like SetCurrentDirectory is not thread-safe on Windows. In case of an error, Rust's stdlib reads last_os_error, so that's another race condition. It's probably too late to mark this function as unsafe.

Any suggestions for how the documentation should be worded? For example:

set_current_dir sets the current directory globally for the entire process. It should be avoided in multi-threaded programs, as it will affect filesystem operations on all threads.
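
One way to follow that advice, sketched here rather than taken from the standard library docs, is to give each thread its own base directory and build paths from it instead of calling set_current_dir at all. The `resolve` helper and the `example.txt` file name are hypothetical.

```rust
use std::path::{Path, PathBuf};
use std::thread;

/// Resolves `name` against an explicit base directory instead of relying on
/// the process-wide current directory.
fn resolve(base: &Path, name: &str) -> PathBuf {
    base.join(name)
}

fn main() {
    let handles: Vec<_> = vec!["/tmp", "/dev", "/usr"]
        .into_iter()
        .map(|dir| {
            thread::spawn(move || {
                // Each thread keeps its own base; no shared state is touched.
                let path = resolve(Path::new(dir), "example.txt");
                println!("{}", path.display());
                path
            })
        })
        .collect();

    for h in handles {
        h.join().unwrap();
    }
}
```

For child processes, std::process::Command::current_dir sets the working directory of the child only, which also avoids changing it for the parent.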

At least on Windows, last_os_error is thread-safe, if the docs for GetLastError are accurate. On Unix, errno should be thread-local.


Yes, I'd happily take either a PR with improvements or a bug report with a list of ideas for improvements.

As I have pointed out before, SetCurrentDirectory itself is thread safe. You can call it multiple times from multiple threads without any risk of corrupting the current directory. The risk is entirely in user code which might assume that the current directory won't suddenly change mid-execution.
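
If user code really does need to change the directory temporarily, one possible approach is to serialize those sections behind a lock. This is only a sketch: `CWD_LOCK` and `with_current_dir` are hypothetical names, and the lock only helps if every piece of code that reads or sets the current directory goes through the helper.

```rust
use std::env;
use std::io;
use std::path::Path;
use std::sync::Mutex;

// Hypothetical process-wide lock guarding the current directory.
static CWD_LOCK: Mutex<()> = Mutex::new(());

/// Runs `f` with the current directory set to `dir`, then restores the
/// previous directory. Other callers of this helper wait until it finishes.
fn with_current_dir<T>(dir: &Path, f: impl FnOnce() -> T) -> io::Result<T> {
    // A poisoned lock only means an earlier closure panicked; keep going.
    let _guard = CWD_LOCK.lock().unwrap_or_else(|e| e.into_inner());
    let previous = env::current_dir()?;
    env::set_current_dir(dir)?;
    let result = f();
    env::set_current_dir(previous)?;
    Ok(result)
}

fn main() -> io::Result<()> {
    let tmp = env::temp_dir();
    // The closure itself returns an io::Result, hence the double `?`.
    let seen = with_current_dir(&tmp, || env::current_dir())??;
    println!("inside helper: {}", seen.display());
    println!("after helper: {}", env::current_dir()?.display());
    Ok(())
}
```

Note that if `f` panics, this sketch does not restore the previous directory, and code that bypasses the helper can still change the directory underneath it.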

What is "SetCurrentDirectory"?

From MSDN

The Win32 API is interesting like this.

I ran into this while writing a lot of C/C++ code over the last few years. As retep said, it's thread-safe; it's just that your results aren't. This is in contrast to the process models on other operating systems, where manipulating the environment acts sort of like container/process->threads->userland magic tiering; Windows treats threads almost like a process (somewhat arbitrarily). The model works for some things and has drawbacks for others.