How do I perform interprocedural analysis on Rust programs?

Hello everyone,

I am currently working on scanning Rust projects for unsoundness issues and have made some progress, but I have now hit a difficulty. Consider the following simple code example (for more detailed information, you can refer to this issue):

pub struct HeaderView {
    pub inner: *mut htslib::bcf_hdr_t,
}

impl HeaderView {
    pub fn new(inner: *mut htslib::bcf_hdr_t) -> Self {
        HeaderView { inner }
    }

    #[inline]
    fn inner(&self) -> htslib::bcf_hdr_t {
        unsafe { *self.inner }
    }

    /// Get vector of sample names defined in the header.
    pub fn samples(&self) -> Vec<&[u8]> {
        let names =
            unsafe { slice::from_raw_parts(self.inner().samples, self.sample_count() as usize) };
        names
            .iter()
            .map(|name| unsafe { ffi::CStr::from_ptr(*name).to_bytes() })
            .collect()
    }
}

In this case, the samples function is unsound, and the root cause is the private inner function: samples passes the result of self.inner() to slice::from_raw_parts, and inner() dereferences the raw pointer self.inner without any validity check. Dereferencing a raw pointer is only sound if the pointer is valid, but users can build a HeaderView around a null pointer entirely in safe code, either through the pub new function or by writing to the pub inner field of HeaderView. Calling samples on such a value then dereferences the invalid pointer, which is undefined behavior (UB).
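For comparison, here is a minimal sketch of how a wrapper like this could be made sound. It is not the actual rust-htslib fix: RawHeader and its n_samples field are hypothetical stand-ins for htslib::bcf_hdr_t, and making the constructor unsafe with a private NonNull field is just one possible design.

```rust
use std::ptr::NonNull;

/// Hypothetical stand-in for `htslib::bcf_hdr_t` (the real struct has more fields).
pub struct RawHeader {
    pub n_samples: i32,
}

pub struct HeaderView {
    // Private, so safe code cannot replace the pointer after construction.
    inner: NonNull<RawHeader>,
}

impl HeaderView {
    /// Wraps a raw header pointer, returning `None` if it is null.
    ///
    /// # Safety
    /// If non-null, `inner` must point to a valid `RawHeader` that stays
    /// alive and unmodified for as long as the returned view is used.
    pub unsafe fn new(inner: *mut RawHeader) -> Option<Self> {
        NonNull::new(inner).map(|inner| HeaderView { inner })
    }

    /// Safe to call: validity is guaranteed by the contract of `new`.
    pub fn sample_count(&self) -> i32 {
        // SAFETY: `new` is unsafe, so its caller promised the pointer is valid.
        unsafe { self.inner.as_ref().n_samples }
    }
}
```

With this shape, the proof obligation moves to the single unsafe constructor, so a scanner only has to check that every safe pub method relies on nothing beyond that documented contract.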

Currently, I am only able to scan pub functions, such as samples, as I typically consider them to be user-accessible (though this is not entirely accurate, as modules can also be private). While I can detect such pub functions, I discovered this particular issue only by coincidence, meaning I am unable to identify similar unsound functions like inner() systematically.

One idea I have is to construct an Interprocedural Control Flow Graph (ICFG) for Rust to perform interprocedural analysis. However, I am not sure what tools can achieve this. At the LLVM level, unsafe information might be lost, but I could not find existing tools for interprocedural analysis at the MIR level either.
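To make the idea concrete, here is a rough sketch of the propagation step only, assuming the call graph and the "contains an unsafe block" flags have already been extracted somehow. FnInfo and unsafe_reachable are hypothetical names, not part of any existing tool, and the sketch ignores trait objects, generics, and closures, which a real ICFG would have to resolve.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Hypothetical per-function summary; a real tool would fill this in
/// from the compiler's intermediate representations.
pub struct FnInfo {
    pub is_pub: bool,
    pub has_unsafe_block: bool,
    pub callees: Vec<&'static str>,
}

/// For each `pub` function, collect every function containing an `unsafe`
/// block that is reachable through calls (the function itself included).
pub fn unsafe_reachable(
    graph: &BTreeMap<&'static str, FnInfo>,
) -> BTreeMap<&'static str, BTreeSet<&'static str>> {
    let mut result = BTreeMap::new();
    for (&name, info) in graph {
        if !info.is_pub {
            continue;
        }
        // Depth-first walk over the call graph starting at `name`.
        let mut seen = BTreeSet::new();
        let mut hits = BTreeSet::new();
        let mut stack = vec![name];
        while let Some(f) = stack.pop() {
            if !seen.insert(f) {
                continue; // already visited, also guards against recursion
            }
            // Callees missing from the graph (e.g. external crates) are skipped.
            if let Some(fi) = graph.get(f) {
                if fi.has_unsafe_block {
                    hits.insert(f);
                }
                stack.extend(fi.callees.iter().copied());
            }
        }
        if !hits.is_empty() {
            result.insert(name, hits);
        }
    }
    result
}
```

For the HeaderView example, a graph in which the pub function samples calls the private function inner would report inner among the unsafe functions reachable from samples, which is the link I currently only find by coincidence.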

Does anyone have any suggestions or solutions for this problem? Any advice would be greatly appreciated! Thank you!

MIR removes all unsafety information, since it is no longer needed at that stage. The HIR is closer to a fully expanded AST, and it does retain that information.

This might be sending you in the right direction, but I haven't personally used any of the compiler infrastructure myself. I don't know what lies ahead.

Anyway, good luck! It sounds like an awesome project.