Is "Bad File Descriptor" a possible error in RUST when all you are doing is opening and writing to a file?

I have an LD_PRELOAD program I am writing.
I do use unsafe code to intercept libc execv, and family of API to propagate itself across exec* and spawns libc calls of other processes to track file system access across a very large build

I just verified that the execv intercept propagation works just fine across a very large 3 hour build.
I then added some simple code to open tracker.file in append mode in RUST where I will eventually be writing tracking data. For starters it only writes process start up data in the library constructor annd everything works fine again across the 3 hour build.

As a last step i addd some tracking data writes from within the intercepted exec* family of calls to the already open tracker.file.
And now I am seeing tracebacks.

<0000929> Partitioner: Partitioner: Exception: Invalid component metadata files
<0000929> Partitioner:      manageability/edmlib/comp-mdata.pl:
<0000929> Partitioner:          - thread '<unnamed>' panicked at 'called `Result::unwrap()` on an `Err`
 value: Os { code: 9, kind: Other, message: "Bad file descriptor" }', src/tracker.rs:255:118
<0000929> Partitioner: stack backtrace:
<0000929> Partitioner:    0: rust_begin_unwind
<0000929> Partitioner:              at /rustc/7e6d6e5f535321c2223f044caba16f97b825009c/library/std/src/
panicking.rs:475
<0000929> Partitioner:    1: core::panicking::panic_fmt
<0000929> Partitioner:              at /rustc/7e6d6e5f535321c2223f044caba16f97b825009c/library/core/src
/panicking.rs:85
<0000929> Partitioner:    2: core::option::expect_none_failed
<0000929> Partitioner:              at /rustc/7e6d6e5f535321c2223f044caba16f97b825009c/library/core/src
/option.rs:1219
<0000929> Partitioner:    3: core::result::Result<T,E>::unwrap
<0000929> Partitioner:              at /rustc/7e6d6e5f535321c2223f044caba16f97b825009c/library/core/src
/result.rs:1005
<0000929> Partitioner:    4: wisktrack::tracker::Tracker::report
<0000929> Partitioner:              at /ws/sarvi-sjc/wisktrack/src/tracker.rs:255
<0000929> Partitioner:    5: wisktrack::tracker::Tracker::reportexecv
<0000929> Partitioner:              at /ws/sarvi-sjc/wisktrack/src/tracker.rs:373
<0000929> Partitioner:    6: wisktrack::my_execv
<0000929> Partitioner:              at /ws/sarvi-sjc/wisktrack/src/lib.rs:162
<0000929> Partitioner:    7: wisktrack::execv::{{closure}}
<0000929> Partitioner:              at /ws/sarvi-sjc/redhook/src/ld_preload.rs:67
........

Individual simple test cases of execv itnerceptions works fine.
But in larger multi threaded multi process builds only the execv intercept stack traces with "Bad File Descriptor" errors.

Question 1. If I all I am doing is open a file in append mode, holding the file object as a global lazy_static and writing to it possibly from multiple threads, are there possibilities of "Bad File Descriptor" expected. As in RUST, can I expect to handle when file descriptors are destoyed?

In my case I just happened to be calling the tracker.file writes from unsafe C extern code that overrides libc API.

Question 2: Does rust offer any tools to help debug such cases, like tracking when file objects are destroyed. In this case, it looks like my file object is fine but the file descriptor it is holding is deleted. How is that possible. In my case I dot have any code that formall deletes/closes the file object. It is held as a lazy_ static, so the only way it get deleted is at the end of the program.

Question 3: In this case, considering this happens only when writinng to tracker.file from within the exec* family of functions I suspect what is happening is that process/program that could be any program not writte in RUST is forking and execing. What is the impact of forking on open RUST file objects and their file descriptors. the real exec hasnt happenned, yet, he traceback before it reaches the real exec

Here is a bit of additional informatation.
On trying to debug further, I noticed of one of the make commandlines failing and I tried to narrow it down.
I added some debug statements to print the file object and notice something wierd happeing.

  1. I open the file object to write tracking data in library constructor and write some data and also print {:?} the file object and is fine
    File: File { fd: 3, path: "/nobackup/sarvi/xrwisktest/binos/wisktrack.file", read: false, write: true }

  2. By the time it gets to my execvp intercept function the file descriptor seems messed up and becomes "pipe:[602639795]". doesnt make any sense to me. And if did a write to the file in the next line, there is a crash.

Constructor: 89554
File : File { fd: 3, path: "/nobackup/sarvi/xrwisktest/binos/wisktrack.file", read: false, write: true }
Constructor Done: 89554, Cmd line: ["18:21"]
execvp(objcopy, objcopy -g /tmp/strip-native-debug-symbolskeytool3625506531087648868/keytool)
Tracker: File { fd: 3, path: "pipe:[602639795]", read: false, write: true }
  1. Very similar code path that ends up at my execve() intercept is fine and the fd is still the same as at creation time as shown below.
Constructor: 90812
File 1: File { fd: 3, path: "/nobackup/sarvi/xrwisktest/binos/wisktrack.file", read: false, write: true }
File 2: File { fd: 3, path: "/nobackup/sarvi/xrwisktest/binos/wisktrack.file", read: false, write: true }
Constructor Done: 90812, ["/bin/bash","./t.sh"]
execve(/auto/binos-tools/java/jdk-14.0.1/bin/jlink, /auto/binos-tools/java/jdk-14.0.1/bin/jlink -p /auto/binos-tools/java/jdk-14.0.1/bin/../jmods:./tools/tdlc/mods:./tools/tdlc/bin:./linkfarm/host/usr/binos/mods --add-modules tdlc --compress=2 --launcher tdlc.jli=tdlc/com.cisco.tdlc.TDLC --output ./tools/tdlc/bin/jlink --no-header-files --no-man-pages --strip-debug)
Tracker: File { fd: 3, path: "/nobackup/sarvi/xrwisktest/binos/wisktrack.file", read: false, write: true }

strace shows that both do a readlink
in successull execve case you see

readlink("/proc/self/fd/3", "/nobackup/sarvi/xrwisktest/binos"..., 256) = 47

in the other execvp you see

readlink("/proc/self/fd/3", "pipe:[602673444]", 256) = 16

Its the same exact compiled rust libwisktrack.so that is beinng LD_PRELOADED.
But something seems to have changed fd:3 before it gets to execvp()

Problem solved.
Root caused to some programs using dup2 use specific file descriptors fd=3 in this case for some reason.
This was essentially overwriting the automatically picked/opened fd when opening tracking file.

Also these programs were essentially then closing these duped hard coded fd making them bad file descriptors as part of the fork and exec cleanup.

Solution was for my LD_PRELOAD program to open and dup2 the fs to some FD high up in the 800s

This topic was automatically closed 90 days after the last reply. We invite you to open a new topic if you have further questions or comments.