Need help implementing custom write_fmt and buffered write_fmt

I am using Rust to write LD_PRELOAD libraries.
Rust code misbehaves in the library constructor, so I can't use println!() or even create fs::File objects.

But writing using direct syscalls seems to work fine.

So I am trying to write some custom printf/write_fmt code, callable from the constructor, that uses direct syscalls for plain and formatted writes:

  1. Basically create a custom File object that can be created from a file descriptor
  2. Be able to implement write and write_fmt() methods so I can use write_fmt(format_args!("{} ", 10))
  3. Be able to create custom buffered writing with thread-local buffers

I think I might be able to cobble together 1 and 3 above.
(2) above is stumping me, and I am not finding good examples I can model mine on.

The closest I have to an example is the following, but there is much in there that I don't understand.
https://doc.rust-lang.org/src/core/fmt/mod.rs.html#1061-1087

Can someone point me to some relevant tutorials or sample code that could guide me in that venture?

(I have no experience with these kinds of issues, so my answer might be inaccurate :sweat_smile:)

If I understand correctly, you want some kind of structure that will write to stdio using syscalls:

pub struct SyscallIo {
    // ...
}

and you want to be able to use write/write_fmt with it?
Then it seems like what you want is to implement the Write trait:

impl std::io::Write for SyscallIo {
    fn write(&mut self, buf: &[u8]) -> std::io::Result<usize> {
        todo!()
    }

    fn flush(&mut self) -> std::io::Result<()> {
        todo!()
    }
}

Then you can do something like

let mut syscall_io: SyscallIo = ...;
syscall_io.write_fmt(format_args!("{} ", 10));

Let me clarify.
format_args!() returns an Arguments object.

I already have
pub fn write(&self, buf: &[u8]) -> Result<usize, &'static str>
And am implementing my own thread local buffered writing within that.

I want examples for how I can implement
fn write_fmt(&mut self, args: Arguments<'_>) -> Result<usize, &'static str>

That is, the insides of such a function: how I can translate the Arguments above to a string and write it to my file.

I could use format!(), but that would add an extra layer of malloc since it returns a String. I am implementing my own buffering, so I would like to know how I can walk the Arguments, format them, and write them.

To clarify: I am trying to avoid using std::io. I will be using this code in LD_PRELOAD library constructor/destructor functions, and they don't seem to like much std code. In fact, the reason I am reimplementing File with syscalls is that I can't use std::fs::File.

Ah, I see...
I don't think it is possible to implement write_fmt by yourself, because Arguments implements very few public methods... At best, if you are on nightly, you can use as_str, but it is only useful if there are no formatting arguments, so... :confused:

I feel I must reiterate that implementing std::io::Write is still probably your best choice :sweat_smile: The default implementation of write_fmt is really quite sensible: it does not allocate on the ok path, and does no IO aside from the write function you supplied.
However, it might allocate in case of error, so I get that you would not want to use that...

The other solutions I see are:

  • Use core::fmt::Write. This is guaranteed to make no allocation/IO behind your back. However it only works on &str, gives no way to flush, and returns what amounts to Result<(), ()>
  • You could create your own Arguments structure, and find a way to build it from format strings. This would probably be incredibly complicated though.
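One possible way around the Result<(), ()> limitation of fmt::Write is a small adapter that remembers the real error and the byte count while the standard fmt::write function drives the formatting. What follows is only a minimal sketch: RawSink, CountingAdapter, write_fmt_counted and MemSink are hypothetical names invented for illustration, with RawSink standing in for a syscall-backed write. std::fmt is used here; it re-exports the same items as core::fmt.

```rust
use std::fmt;

/// Hypothetical stand-in for a syscall-backed writer.
pub trait RawSink {
    fn write(&mut self, buf: &[u8]) -> Result<usize, &'static str>;
}

/// Bridges `fmt::Write` to a `RawSink`, keeping the byte count and the first real error.
struct CountingAdapter<'a, S: RawSink> {
    sink: &'a mut S,
    written: usize,
    error: Option<&'static str>,
}

impl<S: RawSink> fmt::Write for CountingAdapter<'_, S> {
    fn write_str(&mut self, s: &str) -> fmt::Result {
        let mut rest = s.as_bytes();
        // Loop because the sink may accept only part of the slice per call.
        while !rest.is_empty() {
            match self.sink.write(rest) {
                Ok(n) if n > 0 => {
                    self.written += n;
                    rest = &rest[n..];
                }
                Ok(_) => {
                    self.error = Some("sink accepted no bytes");
                    return Err(fmt::Error);
                }
                Err(e) => {
                    self.error = Some(e);
                    return Err(fmt::Error);
                }
            }
        }
        Ok(())
    }
}

/// Formats `args` straight into `sink` without building a `String`.
pub fn write_fmt_counted<S: RawSink>(
    sink: &mut S,
    args: fmt::Arguments<'_>,
) -> Result<usize, &'static str> {
    let mut adapter = CountingAdapter { sink, written: 0, error: None };
    match fmt::write(&mut adapter, args) {
        Ok(()) => Ok(adapter.written),
        Err(_) => Err(adapter.error.unwrap_or("formatting failed")),
    }
}

/// In-memory sink used only to exercise the sketch; it accepts at most 4 bytes per call.
pub struct MemSink {
    pub data: [u8; 64],
    pub len: usize,
}

impl RawSink for MemSink {
    fn write(&mut self, buf: &[u8]) -> Result<usize, &'static str> {
        let room = self.data.len() - self.len;
        let n = buf.len().min(4).min(room);
        if n == 0 && !buf.is_empty() {
            return Err("buffer full");
        }
        self.data[self.len..self.len + n].copy_from_slice(&buf[..n]);
        self.len += n;
        Ok(n)
    }
}
```

With this shape, a call such as write_fmt_counted(&mut sink, format_args!("{} ", 10)) returns the number of bytes written or the sink's own error string. The formatting machinery itself still only sees fmt::Error.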

As mentioned in this previous thread, you can use the write! macro to format into your own buffer, without extra allocations.

If you want to use it directly on your own File-like objects, you can implement core::fmt::Write for them. You don't need to implement write_fmt yourself. You can rely on the default implementation.
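As an illustration of relying on the default write_fmt, here is a minimal sketch of a fixed-capacity buffer that only implements write_str and is then used with write!. StackBuf and format_pid_line are hypothetical names, and the capacity of 64 and the values formatted are arbitrary choices for the example.

```rust
use std::fmt::{self, Write};

/// Fixed-capacity byte buffer; it never allocates.
pub struct StackBuf<const N: usize> {
    data: [u8; N],
    len: usize,
}

impl<const N: usize> StackBuf<N> {
    pub const fn new() -> Self {
        StackBuf { data: [0; N], len: 0 }
    }

    pub fn as_bytes(&self) -> &[u8] {
        &self.data[..self.len]
    }
}

impl<const N: usize> Write for StackBuf<N> {
    fn write_str(&mut self, s: &str) -> fmt::Result {
        let bytes = s.as_bytes();
        let end = self.len.checked_add(bytes.len()).ok_or(fmt::Error)?;
        if end > N {
            // Not enough room left: report failure instead of truncating silently.
            return Err(fmt::Error);
        }
        self.data[self.len..end].copy_from_slice(bytes);
        self.len = end;
        Ok(())
    }
    // `write_fmt` is inherited from the trait's default implementation.
}

/// Formats one line into a buffer on the stack.
pub fn format_pid_line(pid: u32) -> StackBuf<64> {
    let mut buf = StackBuf::<64>::new();
    // `write!` expands to `buf.write_fmt(format_args!(...))`.
    write!(buf, "PID: {}, FD: {}\n", pid, 850).expect("line fits in 64 bytes");
    buf
}
```

The formatted bytes can then be handed to whatever raw write function is available. Note that write_str may be called several times for a single write!, once per formatted piece, so a failed write! can leave a partial line in the buffer.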

I guess write!() works.
I also got the following working, as you suggest.
Pardon the bad error handling; I need to refine that.

impl fmt::Write for File {
    fn write_str(&mut self, s: &str) -> fmt::Result {
        let raw_s = s.as_bytes();
        match self.write_all(raw_s) {
            Ok(_) => Ok(()),
            Err(_) => Err(fmt::Error),
        }
        // Ok(())
    }

    fn write_fmt(&mut self, args: fmt::Arguments) -> Result<(), fmt::Error> {
        fmt::write(self, args)?;
        // w.as_str().ok_or(fmt::Error)
        Ok(())
    }
}

The &mut self bugs me, and I have this question:
Does this mean locking?
My current implementation has doubled build times when using my LD_PRELOAD library.
My File implementation, which uses direct syscalls, takes plain &self without mut:
pub fn write(&self, buf: &[u8]) -> Result<usize, &'static str>

I plan on having thread_local! buffers to buffer data, before doing the write.
I notice that core::fmt::Write and write!() both take &mut self.

Would that mean locking if 2 threads were calling write_fmt or write! at the same time?

&mut is an exclusive reference. If the File is in a global static (or thread-local) variable, then yes, you would need locking to get an exclusive reference to it.

If you implement the trait for shared references (&File) then you will be able to call it without an exclusive reference:

impl fmt::Write for &File {
    fn write_str(&mut self, s: &str) -> fmt::Result {
        let raw_s = s.as_bytes();
        match self.write_all(raw_s) {
            Ok(_) => Ok(()),
            Err(_) => Err(fmt::Error),
        }
    }

    // don't need to implement `write_fmt`
}

This is what the standard library does for &std::fs::File, which is why the playground link in this comment works without locking.
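To make the shared-reference approach concrete, here is a minimal sketch that writes to a plain static from several threads without a lock. SharedSink, SINK and log_from_threads are hypothetical names; the sink only counts bytes through an atomic, standing in for a File whose write takes &self and issues a syscall.

```rust
use std::fmt::{self, Write};
use std::sync::atomic::{AtomicUsize, Ordering};

/// Hypothetical sink that can be driven through `&self`.
pub struct SharedSink {
    bytes_seen: AtomicUsize,
}

impl SharedSink {
    pub const fn new() -> Self {
        SharedSink { bytes_seen: AtomicUsize::new(0) }
    }

    /// Same shape as the `File::write(&self, ...)` in this thread.
    pub fn write(&self, buf: &[u8]) -> Result<usize, &'static str> {
        self.bytes_seen.fetch_add(buf.len(), Ordering::Relaxed);
        Ok(buf.len())
    }

    pub fn bytes_seen(&self) -> usize {
        self.bytes_seen.load(Ordering::Relaxed)
    }
}

// The trait is implemented for the reference type, so `self` below is `&mut &SharedSink`.
impl Write for &SharedSink {
    fn write_str(&mut self, s: &str) -> fmt::Result {
        self.write(s.as_bytes()).map(|_| ()).map_err(|_| fmt::Error)
    }
}

pub static SINK: SharedSink = SharedSink::new();

/// Spawns four threads that all format into the same static, then reports the byte total.
pub fn log_from_threads() -> usize {
    let handles: Vec<_> = (0..4)
        .map(|i| {
            std::thread::spawn(move || {
                // `&SINK` is a fresh temporary `&SharedSink`; no exclusive access is needed.
                write!(&SINK, "thread {}\n", i).unwrap();
            })
        })
        .collect();
    for h in handles {
        h.join().unwrap();
    }
    SINK.bytes_seen()
}
```

One caveat with this design: a single write! may call write_str several times, so with a real file descriptor the output of concurrent threads can interleave in the middle of a line unless each thread buffers a whole line first.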

I implemented impl fmt::Write for &File like you suggested and tried a test case with a lazy static file.
It doesn't seem to help with the mutable reference issue.

In my case, I will eventually have a global lazy_static! { } of
pub struct Tracker {
    file: File,
}
which I tested in the test case below.
All this is because I want to

  1. avoid std::io when writing to the tracker and tracer files
  2. be able to write to these files from possibly multiple threads without locks.

#![no_std]

use core::fmt;
use core::sync::atomic::Ordering;
use std::ffi::{CStr, CString};
use std::{io,cmp};
use std::process;
use libc;
use backtrace::Backtrace;

use crate::common::{PUUID, UUID, WISKFD};

const READ_LIMIT: usize = libc::ssize_t::MAX as usize;

const fn max_iov() -> usize {
    libc::UIO_MAXIOV as usize
}

pub struct File {
    fd: i32,
    filename: String,
}

impl File {
    pub fn open(filename: &str, flags: i32, mode: i32, relocfd: bool, specificfd: i32) -> io::Result<File> {
        // eprintln!("PID: {}, internal_open FLAGS: {}, File: {}",
        //           process::id(), flags, filename);
        cevent!(Level::INFO, "open(filename={}, flags={}, mode={}, relocfd={}, specificfd={})",
                filename, flags, mode, relocfd, specificfd);
        let fd = if specificfd >= 0 {
            let eflags = unsafe { libc::syscall(libc::SYS_fcntl, specificfd, libc::F_GETFD) } as libc::c_int;
            if eflags >= 0 {
                let mut buffer: Vec<u8> = vec![0; 1024];
                let linkpath = CString::new(format!("/proc/self/fd/{}", specificfd)).unwrap();
                let retsize = unsafe { libc::syscall(libc::SYS_readlink, linkpath.as_ptr(), buffer.as_mut_ptr(), buffer.len()) as i32 };
                wiskassert!(retsize > 0, "Inherited file descriptor {} does not map to a file. Expected {}", specificfd, filename);
                // readlink does not NUL-terminate its output, and this Vec was not allocated by
                // CString, so build the String from the bytes actually written instead of from_raw.
                buffer.truncate(retsize as usize);
                let fname = String::from_utf8(buffer).unwrap();
                let f = File {
                    fd: specificfd,
                    filename: fname,
                };
                return Ok(f)
            }
            specificfd
        } else {
            WISKFD.fetch_add(1, Ordering::Relaxed) as i32
        };
        let filename = CString::new(filename).unwrap();
        let tempfd = unsafe {
            libc::syscall(libc::SYS_open, filename.as_ptr(), flags, mode)
                        //   S_IRUSR|S_IWUSR|S_IRGRP|S_IWGRP)
        } as i32;
        if tempfd < 0 {
            return Err(io::Error::last_os_error());
        }
        cevent!(Level::INFO, "Opened FD: {}, relocating to {}", tempfd, fd);
        let retfd = if relocfd {
            let retfd = unsafe { libc::syscall(libc::SYS_dup3, tempfd, fd, flags & libc::O_CLOEXEC) } as i32;
            if retfd < 0 {
                errorexit!("Cannot dup3 fd {} to {}, flags: {}\n{}",
                           tempfd, fd, flags & libc::O_CLOEXEC, io::Error::last_os_error());
            }
            unsafe { libc::syscall(libc::SYS_close, tempfd) };
            cevent!(Level::INFO, "File Descriptor(Relocated): {} -> {}, File: {}\n",
                   tempfd, fd, filename.to_string_lossy());
            fd
        } else {
            cevent!(Level::INFO, "File Descriptor(Original): {}, File: {}\n",
                   tempfd, filename.to_string_lossy());
            tempfd
        };
        let eflags = unsafe { libc::syscall(libc::SYS_fcntl, retfd, libc::F_GETFD) } as libc::c_int;
        cevent!(Level::INFO, "internal_open FD: {}, EFLAGS: {}", retfd, eflags);
        if eflags < 0 {
            errorexit!("Error Creating/Duping FD: {} returned eflags: {}, File: {}", retfd, eflags, filename.to_string_lossy());
        }
        if (eflags & libc::O_CLOEXEC) != 0 {
            errorexit!("Error O_CLOEXEC FD: {} returned eflags: {}, File: {}", retfd, eflags, filename.to_string_lossy());
        }
        let f = File {
            fd: retfd,
            filename: filename.into_string().unwrap(),
        };
        Ok(f)
    }

    pub fn as_raw_fd(&self) -> i32 {
        self.fd
    }

    // pub fn sync_all(&self) -> io::Result<()> {
    //     self.fsync()
    // }

    // pub fn sync_data(&self) -> io::Result<()> {
    //     self.datasync()
    // }

    // pub fn set_len(&self, size: u64) -> io::Result<()> {
    //     self.truncate(size)
    // }

    pub fn read(&self, buf: &mut [u8]) -> Result<usize, &'static str> {
        let ret = unsafe {
            libc::syscall(
                libc::SYS_read,
                self.fd,
                buf.as_mut_ptr() as *mut libc::c_void,
                cmp::min(buf.len(), READ_LIMIT))
        };
        if ret < 0 {
            return Err("Error read from FD");
        }
        Ok(ret as usize)
    }

    pub fn read_vectored(&self, bufs: &mut [io::IoSliceMut<'_>]) -> Result<usize, &'static str> {
        let ret = unsafe {
            libc::syscall(
                libc::SYS_readv,
                self.fd,
                bufs.as_ptr() as *const libc::iovec,
                cmp::min(bufs.len(), max_iov()) as libc::c_int,
            )
        };
        if ret < 0 {
            return Err("Error read_vectored from FD");
        }
        Ok(ret as usize)
    }
    
    pub fn write(&self, buf: &[u8]) -> Result<usize, &'static str> {
        let ret = unsafe { libc::syscall(
            libc::SYS_write,
            self.fd,
            buf.as_ptr() as usize,
            buf.len()) };
        if ret < 0 {
            return Err("Error Writing to FD");
        }
        Ok(ret as usize)
    }    

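    pub fn write_all(&self, mut buf: &[u8]) -> Result<usize, &'static str> {
        // A single write(2) may accept only part of the buffer, so loop until it is all written.
        let total = buf.len();
        while !buf.is_empty() {
            let x = unsafe { libc::syscall(libc::SYS_write, self.fd, buf.as_ptr() as usize, buf.len()) };
            if x <= 0 {
                return Err("Error Writing to File");
            }
            buf = &buf[x as usize..];
        }
        Ok(total)
    }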

}

impl fmt::Write for File {
    fn write_str(&mut self, s: &str) -> fmt::Result {
        let raw_s = s.as_bytes();
        match self.write_all(raw_s) {
            Ok(_) => Ok(()),
            Err(_) => Err(fmt::Error),
        }
        // Ok(())
    }

    fn write_fmt(&mut self, args: fmt::Arguments) -> Result<(), fmt::Error> {
        fmt::write(self, args)?;
        // w.as_str().ok_or(fmt::Error)
        Ok(())
    }
}
impl fmt::Write for &File {
    fn write_str(&mut self, s: &str) -> fmt::Result {
        let raw_s = s.as_bytes();
        match self.write_all(raw_s) {
            Ok(_) => Ok(()),
            Err(_) => Err(fmt::Error),
        }
    }

    // fn write_fmt(&mut self, args: fmt::Arguments) -> Result<(), fmt::Error> {
    //     fmt::write(self, args)?;
    //     // w.as_str().ok_or(fmt::Error)
    //     Ok(())
    // }
}

#[cfg(test)]
mod report_tests {
    use std::io;
    use libc::{O_CREAT,O_WRONLY,O_TRUNC,O_APPEND,O_LARGEFILE,S_IRUSR,S_IWUSR,S_IRGRP,S_IWGRP};
    use std::os::unix::io::{FromRawFd};
    use std::fs;
    use std::fmt::Write;
    use super::*;
    use crate::common::{WISKTRACEFD, WISKFD};

    lazy_static! {
        pub static ref FILE: File = File::open("/tmp/testdataglobal",
                                         (O_CREAT|O_WRONLY|O_TRUNC|O_LARGEFILE) as i32,
                                         (S_IRUSR|S_IWUSR|S_IRGRP|S_IWGRP) as i32,
                                         true, 850).unwrap();

    }

    #[test]
    fn report_test_006() -> io::Result<()> {
        assert_eq!(FILE.as_raw_fd(), 850);
        FILE.write_fmt(format_args!("Hello World: {}\n", 1)).unwrap();
        assert_eq!(fs::read_to_string("/tmp/testdataglobal").unwrap(), "Hello World: 1\n");
        Ok(())
    }
}

I am still getting the following error about mutable reference

error[E0596]: cannot borrow data in a dereference of `fs::report_tests::FILE` as mutable
   --> src/fs.rs:289:9
    |
289 |         FILE.write_fmt(format_args!("Hello World: {}\n", 1)).unwrap();
    |         ^^^^ cannot borrow as mutable
    |
    = help: trait `DerefMut` is required to modify through a dereference, but it is not implemented for `fs::report_tests::FILE`

You can do this to get an &File and then call its methods:

(&*FILE).write_fmt(format_args!("Hello World: {}\n", 1)).unwrap();

You don't need to call write_fmt or format_args! directly. You can use the write! macro as a shorter way to do the same thing:

write!(&*FILE, "Hello World: {}\n", 1);

Thanks. That worked.
But I am confused about what is actually going on here.

lazy_static! {
        pub static ref FILE: File = File::open("/tmp/testdataglobal",
                                         (O_CREAT|O_WRONLY|O_TRUNC|O_LARGEFILE) as i32,
                                         (S_IRUSR|S_IWUSR|S_IRGRP|S_IWGRP) as i32,
                                         true, 850).unwrap();

    }

As I understand it, FILE here is a ref of File, essentially &File. Is it not?
So doing (&*FILE) goes from a ref of File to File and then back to a ref of File, i.e. &File?

Or am I getting it wrong?

Also, impl fmt::Write for &File and impl fmt::Write for File
both have
fn write_fmt(&mut self, args: fmt::Arguments)
which I presume means write_fmt takes a mutable reference to self/File?

Is the File object somehow being copied/duplicated here?

Basically how is the following OK
write!(&*FILE, "Hello World: {}\n", 1);
but not the following.
write!(FILE, "Hello World: {}\n", 1);

FILE is defined, I presume, as immutable (pub static ref FILE: File) in the lazy_static?

What is &*FILE doing to the argument being passed to write!() that makes it a mutable reference that can be passed to write_fmt(&mut self, ....)?

Are there parts of the Rust documentation you would recommend I reread to better understand what is going on here?

lazy_static generates a new wrapper type that implements Deref<Target=File>, so it has the same capabilities as &File but is not actually &File.

&*FILE dereferences this wrapper, and takes a new reference to the result, basically converting it to &File.

No, there's just one File value, but you create a new reference to it each time you write &*FILE. If you don't bind this reference to a variable, then it is just a temporary reference that goes away at the end of the statement.
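The effect of &* on a Deref wrapper can be sketched without lazy_static at all. The example below uses a hand-written Wrapper type as a simplified, hypothetical stand-in for the hidden type the macro generates; Inner, takes_ref and demo are likewise invented for illustration.

```rust
use std::ops::Deref;

/// Plays the role of `File` in this thread.
pub struct Inner {
    pub fd: i32,
}

/// Simplified stand-in for the wrapper type generated by `lazy_static!`.
pub struct Wrapper {
    inner: Inner,
}

impl Deref for Wrapper {
    type Target = Inner;

    fn deref(&self) -> &Inner {
        &self.inner
    }
}

pub static FILE: Wrapper = Wrapper { inner: Inner { fd: 850 } };

/// Accepts only a real `&Inner`, not the wrapper.
pub fn takes_ref(f: &Inner) -> i32 {
    f.fd
}

pub fn demo() -> (i32, bool) {
    // `*FILE` goes through `Deref::deref`, and `&` re-borrows the result as `&Inner`.
    let r: &Inner = &*FILE;
    // A second `&*FILE` is a new reference to the very same value; nothing is copied.
    let again: &Inner = &*FILE;
    (takes_ref(r), std::ptr::eq(r, again))
}
```

Here demo() returns (850, true): both references point at the single Inner stored in the static.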

That explains the need for &*. Thanks.

But I guess my confusion is about where the mutability of FILE changes.

I presume the following declares FILE as immutable? That is closer to &File, though not exactly, as you explain above. But FILE is still immutable.

lazy_static! {
        pub static ref FILE: File = File::open("/tmp/testdataglobal",
                                         (O_CREAT|O_WRONLY|O_TRUNC|O_LARGEFILE) as i32,
                                         (S_IRUSR|S_IWUSR|S_IRGRP|S_IWGRP) as i32,
                                         true, 850).unwrap();

    }

But
fn write_fmt(&mut self, args: fmt::Arguments) takes a mutable reference.

So where does write!(&*FILE, "something"); change the mutability of the reference that write_fmt needs? Shouldn't (&*FILE) be compatible with
fn write_fmt(&self, args: fmt::Arguments)
instead of what it actually is,
fn write_fmt(&mut self, args: fmt::Arguments)

Also I am reading this in lazy_static source

# Implementation details

The `Deref` implementation uses a hidden static variable that is guarded by a atomic check on each access. On stable Rust, the macro may need to allocate each static on the heap.

Does it mean that every time we access the lazy static reference, in this case FILE, there is some sort of locking happening that will cause performance issues in a multi-threaded or even single-threaded environment?

You're right, the File value is immutable here. Since we are implementing a trait for the type &File, the type of self in these methods is &mut &File (a mutable reference to an immutable reference to a File). You can rewrite the method signature as:

fn write_str(self: &mut &File, s: &str) -> fmt::Result

When you call this method like this:

(&*FILE).write_str("hello")?;

you are actually creating a temporary &File reference and then passing a mutable reference to it. You could write the same code like this:

{
    let mut tmp: &File = &*FILE;
    Write::write_str(&mut tmp, "hello")?;
}

Note that write_str cannot mutate FILE, but it could mutate the temporary variable tmp to make it point to some other File. We know that your implementation of write_str does not actually do this, but even if it did, it wouldn't matter because the temporary reference goes out of scope at the end of the statement so we can never use it again.
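The point that a &mut &T receiver can repoint the temporary reference but cannot touch the value behind it can be checked with a small experiment. This is only a sketch: the Retarget trait and the Named type are hypothetical and exist purely to mirror the receiver shape of write_str.

```rust
/// Hypothetical trait whose method takes `&mut self`, like `fmt::Write::write_str`.
pub trait Retarget {
    fn retarget(&mut self);
}

pub struct Named {
    pub name: &'static str,
}

pub static FIRST: Named = Named { name: "first" };
pub static SECOND: Named = Named { name: "second" };

// `Self` is `&'static Named`, so `self` has type `&mut &'static Named`.
impl Retarget for &'static Named {
    fn retarget(&mut self) {
        // This changes which `Named` the caller's reference points to.
        // It cannot change the `Named` value itself, which stays immutable.
        *self = &SECOND;
    }
}

pub fn demo() -> (&'static str, &'static str) {
    let mut tmp: &'static Named = &FIRST;
    Retarget::retarget(&mut tmp);
    // `tmp` was redirected, while `FIRST` is exactly as it was.
    (tmp.name, FIRST.name)
}
```

Calling demo() yields ("second", "first"): only the local reference changed.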

Makes perfect sense, if you think of self as &File instead of File in this case.

Any comment on this?

I am reading that lazy_statics have to have the Sync trait, might do locking, and might be costly in a multi-threaded or even a single-threaded environment.
I am seeing build times more than double with my current implementation, which uses lazy_static, though I am not sure whether that is the cause. My original C implementation did not have this much of a performance impact for what I am doing.

Considering the above, I am thinking of using a thread_local Cursor buffer to buffer the writes, like you suggest in a different thread, and using a global static file descriptor ID to dump the cursor buffer directly into the write syscall for that FD, avoiding lazy_static.
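A rough sketch of that plan, under several assumptions: a fixed-size array is used instead of a Cursor, the names TRACE_FD, LineBuf, raw_write, trace, trace_flush and demo_roundtrip are all hypothetical, and raw_write borrows the descriptor through std::fs::File purely so the sketch runs without external crates. In the preload library, raw_write would presumably be replaced by the direct SYS_write wrapper shown earlier in this thread. Whether thread_local! itself is safe to touch inside a library constructor is not verified here.

```rust
use std::cell::RefCell;
use std::fmt::{self, Write as _};
use std::fs::File;
use std::mem::ManuallyDrop;
use std::os::unix::io::{AsRawFd, FromRawFd};
use std::sync::atomic::{AtomicI32, Ordering};

/// Global descriptor shared by all threads; -1 means "not opened yet".
pub static TRACE_FD: AtomicI32 = AtomicI32::new(-1);

const CAP: usize = 4096;

/// Stand-in for a direct write syscall on `fd` (assumption: swap in the raw syscall wrapper).
fn raw_write(fd: i32, buf: &[u8]) -> Result<usize, &'static str> {
    if fd < 0 {
        return Err("trace fd not set");
    }
    // ManuallyDrop stops this temporary File from closing a descriptor it does not own.
    let mut f = ManuallyDrop::new(unsafe { File::from_raw_fd(fd) });
    std::io::Write::write(&mut *f, buf).map_err(|_| "Error Writing to FD")
}

struct LineBuf {
    data: [u8; CAP],
    len: usize,
}

impl LineBuf {
    /// Pushes the buffered bytes to TRACE_FD, looping over partial writes.
    fn flush(&mut self) -> fmt::Result {
        let fd = TRACE_FD.load(Ordering::Relaxed);
        let mut pending = &self.data[..self.len];
        let mut result = Ok(());
        while !pending.is_empty() {
            match raw_write(fd, pending) {
                Ok(n) if n > 0 => pending = &pending[n..],
                _ => {
                    result = Err(fmt::Error);
                    break;
                }
            }
        }
        // On error the unwritten remainder is dropped rather than retried.
        self.len = 0;
        result
    }
}

impl fmt::Write for LineBuf {
    fn write_str(&mut self, s: &str) -> fmt::Result {
        let mut rest = s.as_bytes();
        while !rest.is_empty() {
            if self.len == CAP {
                self.flush()?;
            }
            let n = rest.len().min(CAP - self.len);
            self.data[self.len..self.len + n].copy_from_slice(&rest[..n]);
            self.len += n;
            rest = &rest[n..];
        }
        Ok(())
    }
}

thread_local! {
    // One buffer per thread, so formatting needs no lock.
    static BUF: RefCell<LineBuf> = RefCell::new(LineBuf { data: [0; CAP], len: 0 });
}

/// Formats into the calling thread's buffer.
pub fn trace(args: fmt::Arguments<'_>) -> fmt::Result {
    BUF.with(|b| b.borrow_mut().write_fmt(args))
}

/// Writes the calling thread's buffered bytes to the shared descriptor.
pub fn trace_flush() -> fmt::Result {
    BUF.with(|b| b.borrow_mut().flush())
}

/// Round trip through a temporary file, used only to exercise the sketch.
pub fn demo_roundtrip() -> String {
    let path = std::env::temp_dir().join("trace_sketch.txt");
    let file = File::create(&path).expect("create temp file");
    TRACE_FD.store(file.as_raw_fd(), Ordering::Relaxed);
    trace(format_args!("pid {}: opened fd {}\n", 7, 850)).unwrap();
    trace_flush().unwrap();
    std::fs::read_to_string(&path).expect("read back")
}
```

Each thread has to call trace_flush() itself before it exits, since nothing in this sketch flushes a buffer automatically. Flushing whole lines at a time also keeps output from different threads from interleaving mid-line.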