Metadata FileType vs DirEntry FileType

Why does the Debug output for FileType differ between Metadata and DirEntry? A small crate with the main() shown below gives me this output. Note the mode numbers.

"./Cargo.toml"
Metadata filetype FileType(FileType { mode: 33188 })
Direntry filetype FileType(FileType { mode: 32768 })
"./target"
Metadata filetype FileType(FileType { mode: 16877 })
Direntry filetype FileType(FileType { mode: 16384 })
"./src"
Metadata filetype FileType(FileType { mode: 16877 })
Direntry filetype FileType(FileType { mode: 16384 })
"./Cargo.lock"
Metadata filetype FileType(FileType { mode: 33188 })
Direntry filetype FileType(FileType { mode: 32768 })

use std::fs;

fn main() {
    if let Ok(entries) = fs::read_dir(".") {
        for entry in entries {
            if let Ok(entry) = entry {
                println!("{:?}", entry.path());
                if let Ok(metadata) = entry.metadata() {
                    println!("Metadata filetype {:?}", metadata.file_type());
                } else {
                    println!("Couldn't get metadata for {:?}", entry.path());
                }
                if let Ok(file_type) = entry.file_type() {
                    println!("Direntry filetype {:?}", file_type);
                } else {
                    println!("Couldn't get file type for {:?}", entry.path());
                }
            }
        }
    }
}

To begin with, this may be platform-specific behaviour -- certainly the methods themselves are platform-specific. So it may help to let us know what platform you're on if you want a more precise answer.

The rest of this reply is based on Linux and is just observation; I didn't check the Rust stdlib code.

On Linux, the dirent C struct contains a field (d_type) which specifies the type of the file. There is enough information in it to know whether the entry is a directory, a normal file, or a symlink, for example. The explicit purpose of the field is to avoid a stat or lstat syscall. The stat C struct, in contrast, contains a field (st_mode) which holds not just this information but also other information, such as the setuid/setgid/sticky bits and the rwx permission bits for the user, group, and other categories.
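To make the relationship between the two fields concrete, here is a minimal sketch of how a d_type value can be widened into the file-type bits of an st_mode. The DT_* values and the 12-bit shift mirror the DTTOIF macro in glibc's <dirent.h>. The function name dttoif is a hypothetical name used only for illustration, and whether the Rust stdlib converts the value this way is an assumption.

```rust
// d_type values as defined in glibc's <dirent.h> (see man 3 readdir).
const DT_DIR: u8 = 4;
const DT_REG: u8 = 8;
const DT_LNK: u8 = 10;

/// Hypothetical helper mirroring glibc's DTTOIF macro: shift the d_type
/// into the position that the file-type bits occupy in st_mode.
fn dttoif(d_type: u8) -> u32 {
    u32::from(d_type) << 12
}

fn main() {
    for (name, d_type) in [("DT_DIR", DT_DIR), ("DT_REG", DT_REG), ("DT_LNK", DT_LNK)] {
        println!("{name}: d_type {d_type} -> mode {}", dttoif(d_type));
    }
}
```

DT_DIR and DT_REG come out as 16384 and 32768, the same numbers as the Direntry modes in the output above.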

If you consider each pair of numbers in your output in hexadecimal (or binary), you will see that if a bit is set in the Direntry mode, it is also set in the Metadata mode. But the Metadata mode has more bits set -- within the low three hexadecimal digits (12 bits), which are all 0s in the Direntry mode. This is pretty much what I would expect the difference between a dirent.d_type and a stat.st_mode value to look like.
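The comparison can be checked mechanically. The sketch below splits each Metadata mode from the question into its file-type bits and the remaining bits, using the S_IFMT mask from man 2 stat. The constants are copied here by hand rather than taken from the libc crate, and split_mode and kind_name are hypothetical helper names.

```rust
// File-type mask and values from man 2 stat / inode(7) on Linux.
const S_IFMT: u32 = 0o170000;
const S_IFDIR: u32 = 0o040000;
const S_IFREG: u32 = 0o100000;

/// Split a raw mode into (file-type bits, everything else).
fn split_mode(mode: u32) -> (u32, u32) {
    (mode & S_IFMT, mode & !S_IFMT)
}

/// Name the file-type bits; only the two types seen in the output are covered.
fn kind_name(type_bits: u32) -> &'static str {
    match type_bits {
        S_IFDIR => "directory",
        S_IFREG => "regular file",
        _ => "other",
    }
}

fn main() {
    // (Metadata mode, Direntry mode) pairs copied from the output above.
    for (meta, dirent) in [(33188u32, 32768u32), (16877, 16384)] {
        let (kind, rest) = split_mode(meta);
        // The Direntry mode is exactly the file-type part of the Metadata mode.
        assert_eq!(kind, dirent);
        println!("{meta} = {meta:#o}: {} ({kind:#o}) + permissions {rest:#o}", kind_name(kind));
    }
}
```

The leftover bits are 0o644 for the two files and 0o755 for the two directories, which are ordinary permission values.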

Final note, I'm surprised the mode is even part of the Debug output -- seems like an internal detail.

References:

  • man 3 readdir
  • man 2 stat

OS is Linux. Following my local documentation I find
std::fs::Metadata::file_type returns std::fs::FileType
std::fs::DirEntry::file_type returns Result<std::fs::FileType>

And I assumed it is something OS-specific, as you say. How would I get from reading the documentation to the answer? Is the Debug output or the internal stuff documented somewhere?

And thank you quinedot, reading the manpages I see the bits you mention, but I still have no clue how to get that info from the Rust documentation.

In general, you cannot depend on the output of Debug to be stable, e.g. it's not intended to be something you could rely on for serialization.

As for FileType, the inner workings are intentionally private. You can see this in the documentation where the declaration looks like

pub struct FileType(_);

The _ indicates that the contents are private. And if you click the "src" link you can see that the inside of the tuple is not marked as pub.

You can also see that Debug is derived. Even though the output is not stable, we can infer some things from the output that you pasted. Namely, on Linux, fs_imp::FileType looks to be a struct with a single field mode. But you're correct that this isn't documented. Why not? Well, it's an internal implementation detail that isn't guaranteed to stay the same, just like the Debug output. The only thing that is documented for FileType is the API that you can rely on.

So in summary, the derived Debug trait is letting some internal details about the private portions of structs leak through, but you cannot rely on that output, and you cannot rely on the internal details to stay the same. They're not documented because you can't rely on them, and "shouldn't care" as a consumer of the API.
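As a rough sketch of what relying on the documented API looks like, the variant below of the program from the question asks each FileType the public questions (is_dir, is_file, is_symlink) instead of printing its Debug output. The helper name kind is hypothetical.

```rust
use std::fs;
use std::io;

/// Describe a FileType using only its documented methods.
fn kind(ft: fs::FileType) -> &'static str {
    if ft.is_dir() {
        "directory"
    } else if ft.is_file() {
        "regular file"
    } else if ft.is_symlink() {
        "symlink"
    } else {
        "other"
    }
}

fn main() -> io::Result<()> {
    for entry in fs::read_dir(".")? {
        let entry = entry?;
        // Both values are obtained without following symlinks.
        let from_dirent = kind(entry.file_type()?);
        let from_metadata = kind(entry.metadata()?.file_type());
        println!("{:?}: {from_dirent} / {from_metadata}", entry.path());
    }
    Ok(())
}
```

Seen through these methods, the two FileType values should describe each entry the same way, even though their private mode fields print differently.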

One more thing to note is that these non-guarantees not only give the library maintainers flexibility in general, but also allow abstractions across platforms. If we dig into the platform-specific source code, we can find the definition for the internal FileType on Unix platforms, and a different definition for Windows. If you ran your code on Windows, your debug output would be different from that on Linux (another reason you cannot rely on it). And I have no idea if the values in the output would be the same between the metadata and directory entry versions or not.
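If the raw mode is what you actually want, a documented and explicitly platform-specific route is the Unix extension trait std::os::unix::fs::MetadataExt. The sketch below is gated with cfg(unix) so that it still compiles elsewhere; raw_mode is a hypothetical helper name.

```rust
use std::fs;
use std::io;

/// Raw st_mode via the documented Unix-only extension trait.
#[cfg(unix)]
fn raw_mode(meta: &fs::Metadata) -> u32 {
    use std::os::unix::fs::MetadataExt;
    meta.mode()
}

fn main() -> io::Result<()> {
    let meta = fs::metadata(".")?;
    // Portable question, answered on every platform.
    println!("is_dir: {}", meta.file_type().is_dir());
    // Platform-specific detail, only compiled on Unix-like targets.
    #[cfg(unix)]
    println!("raw mode: {:#o}", raw_mode(&meta));
    Ok(())
}
```

This keeps the portable part of the program on FileType's public methods and confines the Unix-only detail to one clearly marked place.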
