Race condition in file creation?


#1

I’ve discovered a situation where after the creation of a file it sometimes doesn’t exist on disk. This occurs randomly, but can be reproduced easily. The relevant code is:

fn mktemp() -> PathBuf {
    TempDir::new("test").unwrap()
        .into_path()
}

#[test]
fn create_single()
{
    let path = mktemp();
    println!("TMPDIR is {:?}", path);

    env::set_current_dir(path).unwrap();
    let empty = Path::new("file.txt");
    {
        File::create(&empty).unwrap().flush().unwrap();
    }
    assert!(empty.exists());
}

#[test]
fn create_multi()
{
    let path = mktemp();
    println!("TMPDIR is {:?}", path);

    env::set_current_dir(path).unwrap();
    let empty = Path::new("file.txt");
    let zzzz = Path::new("zzzz.txt");

    {
        File::create(&empty).unwrap()
            .flush().unwrap();

        let mut fd = File::create(&zzzz).unwrap();
        fd.write_all("zzzz".as_bytes()).unwrap();
        fd.flush().unwrap();
    }

    assert!(empty.exists());
    assert!(zzzz.exists());
}

This will fail on assert!(zzzz.exists()) about 1 in 4 times. Some notes from experimentation:

  • This occurs on both OS X and Linux, so is not an artifact of a filesystem.
  • It’s not related TempDir; a hand-rolled version creating random dirs under /tmp/ behaves the same (see repo below).
  • The issue only occurs when the first test is present. Removing it seems to remove the issue (or make it unreproducible).
  • Tested on stable and nightly.

I have created a repo with the code to reproduce this: https://bitbucket.org/tarkasteve/rust-file-race-test/src

I find it hard to believe that this is a Rust issue, so I must be doing something wrong, but I can’t see what. Any help would be appreciated.

Cheers,
Steve


#2

The way tests works is that each #[test] function will run in parallel and in randomized order. This means that tests needs to be written in such way that they don’t depend on each other and don’t collide.


#3

I think the problem has to do with using an environment variable with parallel tests.
If you remove env::set_current_dir and instead join the filename to the path the test runs fine.

fn mktemp() -> PathBuf {
    TempDir::new("test").unwrap()
        .into_path()
}

#[test]
fn create_single()
{
    let path = mktemp();
    println!("TMPDIR is {:?}", path);

   let empty = path.join("file.txt");
    {
        File::create(&empty).unwrap().flush().unwrap();
    }
    assert!(empty.exists());
}

#[test]
fn create_multi()
{
    let path = mktemp();
    println!("TMPDIR is {:?}", path);

    let empty = path.join("file.txt");
    let zzzz = path.join("zzzz.txt");

    {
        File::create(&empty).unwrap()
            .flush().unwrap();

        let mut fd = File::create(&zzzz).unwrap();
        fd.write_all("zzzz".as_bytes()).unwrap();
        fd.flush().unwrap();
    }

    assert!(empty.exists());
    assert!(zzzz.exists());
}

#4

OK, that (sort of) makes sense if the tests are run in threads. I might raise a report against the book though, as it’s not clear that this is the case.

Thanks for your help.

Edit: I decided to add it myself, PR here: https://github.com/rust-lang/rust/pull/37766


#5

Why isn’t env::set_current_dir marked as unsafe?


#6

How can you produce memory unsafety from set_current_dir?


#7

It could result in dlopen loading the wrong shared object. But I assumed that unsafe applies to other safety issues as well, not just memory safety. For example, data races on non-pointers cannot endanger memory safety, yet Rust still avoids them. Same for aliased access to primitive types.


#8

Calling dlopen requires unsafe code.

I’m not sure what you mean by non-pointers, but there can absolutely be memory safety issues allowing data races on non-pointers - for example, concurrent access to an enum causing loads of undefined data.


#9

Note that unsafe does also not mark the sites where potentially weird things can be introduced, but the place where the unsafty can take effect. Example: you can acquire raw pointers in safe code, as long as you don’t dereference it. That needs unsafe.


#10

Sure, but there is still an expectation that dlopen behaves as intended, independent of the rest of the program.

Right, bad example on my part. But there are still many cases which do not endanger memory safety, and are restricted (with good reason—memory safety really isn’t the raison d’être for any programming language.


#11

Why would there be that expectation? chdir is just as much of a legitimate libc function as dlopen is.


#12

There is an emerging concept of library-safe code; a slightly different take on this matter (link should go to the Process Attributes section).

In short, there is a fundamental difference between dlopen and chdir: If you are careful, it is possible to use dlopen and still interoperate with other libraries used by the process. (dlopen is not really modular in the strictest sense, but you can avoid issues if you open only DSOs with carefully constructed ABIs.) chdir is defined to change a process-global attribute, which just does not compose at all with what other parts of the process might be doing at the same time.