Global state and env::set_current_dir


#1

For the first time I have been implementing threads with Rust, and I have run into some global state that I did not expect. As the little programme below shows, the “working directory” seems to be global across threads. This does not seem too unreasonable, but it has raised a question for me: what is and is not global state? For example, locale information: can different threads have different locales? Where can I find documentation that clarifies what is and is not global state?

use std::env;
use std::thread;
use std::time::Duration;

fn main() {
    let mut hv: Vec<thread::JoinHandle<()>> = Vec::new();
    // Each thread sets the working directory, sleeps, then reads it back.
    // The label in parentheses is the directory that thread asked for.
    hv.push(thread::spawn(move || {
        env::set_current_dir("/tmp").unwrap();
        thread::sleep(Duration::from_millis(100));
        println!("current dir (/tmp): {:?}", env::current_dir());
    }));
    hv.push(thread::spawn(move || {
        env::set_current_dir("/dev").unwrap();
        thread::sleep(Duration::from_millis(90));
        println!("current dir (/dev): {:?}", env::current_dir());
    }));
    hv.push(thread::spawn(move || {
        env::set_current_dir("/usr").unwrap();
        thread::sleep(Duration::from_millis(80));
        println!("current dir (/usr): {:?}", env::current_dir());
    }));
    hv.push(thread::spawn(move || {
        env::set_current_dir("/home").unwrap();
        thread::sleep(Duration::from_millis(170));
        println!("current dir (/home): {:?}", env::current_dir());
    }));
    // Wait for every thread to finish before exiting.
    for h in hv {
        h.join().unwrap();
    }
}

Results:

current dir (/usr): Ok("/dev")
current dir (/dev): Ok("/dev")
current dir (/tmp): Ok("/dev")
current dir (/home): Ok("/dev")


#2

The documentation for std::env states right at the top that it is for “Inspection and manipulation of the process’s environment.” That makes sense, as the “environment” is an OS-level concept that applies to an entire process and not just to a single thread in that process.
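By way of contrast with process-wide state, here is a minimal sketch of state that is per-thread, using the standard `thread_local!` macro. The names `LOCAL` and `bump_and_get` are made up for illustration and do not come from the thread above.

```rust
use std::cell::Cell;
use std::thread;

thread_local! {
    // Hypothetical per-thread value: every thread gets its own copy,
    // initialised to 0 the first time that thread touches it.
    static LOCAL: Cell<u32> = Cell::new(0);
}

// Adds `by` to the calling thread's copy and returns the new value.
fn bump_and_get(by: u32) -> u32 {
    LOCAL.with(|c| {
        c.set(c.get() + by);
        c.get()
    })
}

fn main() {
    // Each spawned thread modifies only its own copy of LOCAL.
    let a = thread::spawn(|| bump_and_get(1)).join().unwrap();
    let b = thread::spawn(|| bump_and_get(10)).join().unwrap();
    // The main thread's copy was never modified by the other threads.
    let main_value = bump_and_get(0);
    println!("thread a: {a}, thread b: {b}, main: {main_value}");
}
```

Unlike the working directory, a change made through `LOCAL` in one thread is not visible in any other thread.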


#3

To follow up on what @FenrirWolf said: once you know that a function is likely to call into the OS, you can look up the OS-specific details. For example, on Windows the function that gets the current working directory is GetCurrentDirectory, and it is explicitly documented as applying to the whole process:

Each process has a single current directory

and

The current directory state written by the SetCurrentDirectory function is stored as a global variable in each process, therefore multithreaded applications cannot reliably use this value without possible data corruption from other threads that may also be reading or setting this value.

Source: https://docs.microsoft.com/en-us/windows/desktop/api/winbase/nf-winbase-getcurrentdirectory


#4

This should be better documented in set_current_dir.

It sounds like SetCurrentDirectory is not thread-safe on Windows. In case of error, Rust’s stdlib reads last_os_error, so that’s another potential race condition. It’s probably too late to mark this function as unsafe, unfortunately.

Any suggestions for how the documentation should be worded? For example:

set_current_dir sets the current directory globally for the entire process. It should be avoided in multi-threaded programs, as it will affect filesystem operations on all threads.
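One way to follow that advice, sketched under the assumption that each thread only needs a base directory to resolve file names against: pass the base path into the thread and join onto it, rather than changing the process-wide current directory. The helper `resolve` and the file name `example.txt` are hypothetical.

```rust
use std::path::{Path, PathBuf};
use std::thread;

// Hypothetical helper: resolve `name` against a caller-supplied base
// directory instead of the process-wide current directory.
fn resolve(base: &Path, name: &str) -> PathBuf {
    base.join(name)
}

fn main() {
    let bases = ["/tmp", "/dev", "/usr", "/home"];
    let handles: Vec<thread::JoinHandle<PathBuf>> = bases
        .iter()
        .map(|&base| {
            // Each thread owns its base path; nothing global is modified.
            thread::spawn(move || resolve(Path::new(base), "example.txt"))
        })
        .collect();
    for h in handles {
        println!("{}", h.join().unwrap().display());
    }
}
```

Because no thread calls set_current_dir, the paths each thread computes do not depend on what the other threads are doing.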


#5

At least on Windows, last_os_error is thread-safe, if the docs for GetLastError are accurate. On Unix, errno should be thread-local.
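For reference, a small sketch of how a failure from set_current_dir reaches the caller: it comes back as a std::io::Error value on the calling thread, which can then be inspected. The wrapper `try_chdir` and the path used are hypothetical, and the path is only assumed not to exist.

```rust
use std::env;
use std::io;

// Hypothetical wrapper that returns the io::Error to the caller
// instead of unwrapping it.
fn try_chdir(path: &str) -> io::Result<()> {
    env::set_current_dir(path)
}

fn main() {
    // Assumption: this path does not exist on the machine running the example.
    match try_chdir("/this/path/should/not/exist") {
        Ok(()) => println!("changed directory"),
        Err(e) => println!(
            "failed: kind = {:?}, raw OS code = {:?}",
            e.kind(),
            e.raw_os_error()
        ),
    }
}
```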


#6

Yes, I’d happily take either a PR with improvements or a bug report with a list of ideas for improvements.