Libloading segfault

I am running into a segfault when using libloading, and I am trying to understand why.

I have a trait defined in a separate crate which looks like this:

use std::path::PathBuf;

/// trait to find repositories in the job system
pub trait RepoFinderService {
    /// Find repository paths
    fn find_repo(&self) -> Vec<PathBuf>;
}

And a simple test implementation defined as a dylib crate, which looks like this:

use pes_interface::RepoFinderService;
use std::path::PathBuf;

#[no_mangle]
pub fn new_finder_service() -> Box<dyn RepoFinderService> {
    Box::new(DevRepoFinder::new())
}

pub struct DevRepoFinder;

impl DevRepoFinder {
    fn new() -> DevRepoFinder {
        DevRepoFinder
    }
}

impl RepoFinderService for DevRepoFinder {
    fn find_repo(&self) -> Vec<PathBuf> {
        vec![
            PathBuf::from("/Users/jgerber/src/rust/pes/test_fixtures/repo_test"),
            PathBuf::from("/home/jgerber/src/rust/pes/test_fixtures/repo_test"),
        ]
    }
}

And I can use it as a one shot affair if I define a plugin manager like so:

pub struct PluginMgr {}

impl PluginMgr {
    /// retrieve an instance of the Plugin Manager
    pub fn new() -> Result<Self, PesError> {
        Ok(Self {})
    }

    /// this works!
    pub fn repos(&self) -> Vec<std::path::PathBuf> {
        let dso_path = std::env::var(REPO_FINDER_VARNAME)
            .unwrap_or_else(|_| {
                let mut path = std::env::current_exe().expect("cannot get current executable from env");
                path.pop();
                path.push("../lib");
                path.push("librepo_finder.so");
                
                path.into_os_string().into_string().expect("cannot convert path to string")
            });

        let lib = unsafe { libloading::Library::new(dso_path.as_str()).expect("unable to load lib") };
        
        let new_service: libloading::Symbol<extern "Rust" fn() -> Box<dyn RepoFinderService>> =
            unsafe { lib.get(b"new_finder_service").expect("unable to get service") };
        let service = new_service();
        service.find_repo()
    }
}

But what I really want to do is only load the service once... So I try something like this:

pub struct PluginMgr {
    repo_finder: Box<dyn RepoFinderService>,
}

impl PluginMgr {
    pub fn new() -> Result<Self, PesError> {
        let repo_finder = Self::new_repo_finder_service()?;
        Ok(Self { repo_finder })
    }

    fn new_repo_finder_service() -> Result<Box<dyn RepoFinderService>, PesError> {
        let mut path = std::env::current_exe().expect("cannot get current executable from env");
        path.pop();
        path.push("../lib");
        path.push("librepo_finder.so");
        let lib = unsafe { libloading::Library::new(path)? };
        let new_service: libloading::Symbol<extern "Rust" fn() -> Box<dyn RepoFinderService>> =
            unsafe { lib.get(b"new_finder_service")? };
        Ok(new_service())
    }

    pub fn repos(&self) -> Vec<std::path::PathBuf> {
        self.repo_finder.find_repo()
    }
}

This second implementation segfaults when run. Not quite sure why...

The Rust ABI isn't stable, so I don't think you can do this. Rust reserves the right to change how Box<dyn RepoFinderService> is represented in memory, and if that happens your code will break. Maybe you could try using abi_stable or making your own #[repr(C)] shared types?

That should not be an issue if you compile the dylib and the consumer using the same version of Rust though, right? I understand that the ABI isn't stable and the memory layout is subject to change over time. But everything is compiled with the same version of Rust (and both in release mode).

In the second implementation, the Library is dropped at the end of new_repo_finder_service(), which implicitly invokes dlclose(). That causes a SIGSEGV when you later try to invoke a function from the dylib.

Some time ago I also had issues with libloading (a segfault). Unfortunately I do not recall the solution, but there seems to be a bug report, which might contain a workaround:

Fortunately, the fix was obvious. And I thought that I had already tried this before posting, but holding a pointer to the Library instead of the service in the PluginMgr fixed it.

Thanks everyone. I really thought I tried that before posting...
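
For readers who would rather keep a long-lived service and the library together in one struct, here is a minimal sketch of the drop-order constraint that then applies. It is not the code from the fix above: FakeLibrary and FakeService are hypothetical stand-ins (so the example runs without libloading or a real .so), and the only fact it relies on is that Rust drops struct fields in declaration order.

```rust
use std::cell::RefCell;
use std::rc::Rc;

type Log = Rc<RefCell<Vec<&'static str>>>;

/// Hypothetical stand-in for `libloading::Library`: dropping it models dlclose().
struct FakeLibrary {
    log: Log,
}

impl Drop for FakeLibrary {
    fn drop(&mut self) {
        self.log.borrow_mut().push("library unloaded");
    }
}

/// Hypothetical stand-in for the boxed service whose code lives in the library.
struct FakeService {
    log: Log,
}

impl Drop for FakeService {
    fn drop(&mut self) {
        self.log.borrow_mut().push("service dropped");
    }
}

/// Fields are dropped in declaration order, so the service must come first:
/// its destructor has to run while the library is still loaded.
#[allow(dead_code)]
struct PluginMgr {
    repo_finder: FakeService,
    lib: FakeLibrary,
}

/// Builds a manager, drops it, and returns the order in which things were dropped.
fn drop_order() -> Vec<&'static str> {
    let log: Log = Rc::new(RefCell::new(Vec::new()));
    let mgr = PluginMgr {
        repo_finder: FakeService { log: Rc::clone(&log) },
        lib: FakeLibrary { log: Rc::clone(&log) },
    };
    drop(mgr);
    let events = log.borrow().clone();
    events
}
```

With a real libloading::Library in the lib field, swapping the two fields would unload the library before the service's destructor runs, which is the same kind of dangling call as the original segfault.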

@jlgerber sadly Rust is not well equipped to express, in an ergonomic fashion, the true properties of a dynamically loaded library with an arbitrary API. So despite crates like libloading hiding how unsafe these operations are (and thus doing so in an unsound fashion), one should be very careful when dealing with these things.

The abi_stable crate framework is the best effort out there to avoid many pitfalls.

  • One such pitfall, for instance, would be that the loadee and the "loader" (by that I mean the host binary having loaded the shared library) may use different allocators :scream:. If this happens, the loader will be freeing, with its own allocator, the PathBufs that the loadee allocated with another one. In all honesty, the only other sane option is to write C-based FFI with manual free and alloc/new functions for each and every type (in your case you only have the new functions), or at least to have the remote new embed a virtual destructor (I believe this is one of the things that abi_stable does, for instance): Rust ownership should not cross FFI boundaries, unless the destruction is based on virtual methods.

    Another workaround is to offer a callback-based API such as with_new(|value| value.to_owned()), in which the allocation logic is provided by the (loader) caller rather than by the loadee, so that the non-virtual ownership never crosses the FFI boundary.

    • In practice, especially on Unix, the System allocator is very likely to be shared by the loader / host and the loadee, so you might get away with it.
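
The "manual free and alloc/new functions" option mentioned above can be sketched roughly as follows. This is only an illustration under assumptions: the function names repo_finder_new, repo_finder_count and repo_finder_free are hypothetical, the struct is a simplified stand-in for the DevRepoFinder from the question, and in a real dylib each function would also need to be exported under an unmangled name.

```rust
use std::path::PathBuf;

/// Simplified stand-in for the plugin's finder; opaque to the host.
pub struct DevRepoFinder {
    repos: Vec<PathBuf>,
}

/// Allocates the finder with the plugin's allocator and hands out a raw pointer.
/// (In a real dylib this would also be exported with an unmangled name.)
pub extern "C" fn repo_finder_new() -> *mut DevRepoFinder {
    let finder = DevRepoFinder {
        repos: vec![PathBuf::from(
            "/home/jgerber/src/rust/pes/test_fixtures/repo_test",
        )],
    };
    Box::into_raw(Box::new(finder))
}

/// Returns how many repositories the finder knows about (0 for a null pointer).
///
/// # Safety
/// `finder` must be null or a live pointer obtained from `repo_finder_new`.
pub unsafe extern "C" fn repo_finder_count(finder: *const DevRepoFinder) -> usize {
    if finder.is_null() {
        return 0;
    }
    unsafe { (*finder).repos.len() }
}

/// Frees the finder with the same allocator that created it: the plugin's.
///
/// # Safety
/// `finder` must be null or a pointer from `repo_finder_new` not yet freed.
pub unsafe extern "C" fn repo_finder_free(finder: *mut DevRepoFinder) {
    if !finder.is_null() {
        drop(unsafe { Box::from_raw(finder) });
    }
}
```

The host only ever holds the raw pointer and passes it back to the plugin, so allocation and deallocation both happen on the plugin's side of the boundary.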

Once you reach that point of at least handling this issue with allocations, you also need to worry about:

  • layout: extern "Rust" fn, dyn Trait (such as dyn Fn…) and even PathBuf don't have a well-defined layout, so it would be allowed to change between compiler invocations (provided incremental compilation is handled correctly). That being said, in practice, this would be very complicated for the compiler to pull off unless the layout is stable within each compiler version.

    • So, in practice, you may be able to get away with compiling everything with the same version of rustc.
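
For comparison, a layout that does not depend on the compiler version can be spelled out by hand with #[repr(C)] types and extern "C" function pointers, including the "virtual destructor" idea from the allocator point above. The following is a rough sketch, not what abi_stable generates; every name in it (RepoFinderVTable, RepoFinderHandle, new_finder_handle, and so on) is hypothetical, and the DESTROYED counter exists only so the example can be checked.

```rust
use std::ffi::c_void;
use std::sync::atomic::{AtomicUsize, Ordering};

/// Counts destructor calls; only here to make the sketch observable.
static DESTROYED: AtomicUsize = AtomicUsize::new(0);

/// Hand-written vtable with a fixed C layout.
#[repr(C)]
pub struct RepoFinderVTable {
    pub repo_count: extern "C" fn(*const c_void) -> usize,
    /// The "virtual destructor": frees `data` on the plugin's side.
    pub destroy: extern "C" fn(*mut c_void),
}

/// Replacement for `Box<dyn RepoFinderService>` with a fixed C layout.
#[repr(C)]
pub struct RepoFinderHandle {
    data: *mut c_void,
    vtable: *const RepoFinderVTable,
}

struct DevRepoFinder {
    repos: Vec<String>,
}

extern "C" fn dev_repo_count(data: *const c_void) -> usize {
    let finder = unsafe { &*(data as *const DevRepoFinder) };
    finder.repos.len()
}

extern "C" fn dev_destroy(data: *mut c_void) {
    drop(unsafe { Box::from_raw(data as *mut DevRepoFinder) });
    DESTROYED.fetch_add(1, Ordering::SeqCst);
}

static DEV_VTABLE: RepoFinderVTable = RepoFinderVTable {
    repo_count: dev_repo_count,
    destroy: dev_destroy,
};

/// What the plugin would export instead of `new_finder_service`.
pub extern "C" fn new_finder_handle() -> RepoFinderHandle {
    let finder = Box::new(DevRepoFinder {
        repos: vec!["repo_a".to_string(), "repo_b".to_string()],
    });
    RepoFinderHandle {
        data: Box::into_raw(finder) as *mut c_void,
        vtable: &DEV_VTABLE,
    }
}

impl RepoFinderHandle {
    pub fn repo_count(&self) -> usize {
        unsafe { ((*self.vtable).repo_count)(self.data) }
    }
}

impl Drop for RepoFinderHandle {
    fn drop(&mut self) {
        // Destruction goes through the vtable, so the plugin frees its own memory.
        unsafe { ((*self.vtable).destroy)(self.data) }
    }
}
```

Note that this only addresses layout and allocation: the vtable pointer still points into the loaded library, so the handle must not outlive it.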

Finally, and this is where your segfault came from: any signature that somehow suggests the existence of a &'static reference in the loadee code is a lie from the point of view of the host, unless:

  • the loaded library is never unloaded, e.g., when using ::libloading, the Library is never dropped (for instance, because you manage to obtain a &'static Library instance).

Otherwise the 'static lifetimes of the loadee (let's call them 'loadee_static) will not be 'static: once the Library is dropped, all these references will dangle.

Note that the worst, or sneakiest, offenders out there are the fn "pointers", since it may not be obvious that they're actually &'static references. If we were very rigorous, these would be named fn static references (that is, extern "ABI" fn… is actually a &'static Code<"abi">, where Code<const ABI: &str> would be an extern type).

This gets even worse in that any dyn Trait that came from the loadee is actually carrying a &'loadee_static (&'loadee_static Code<"…"> /* some method */, &'loadee_static …). This is where your segfault comes from: the loader / host interprets these things as having a 'static lifetime, so the borrow checker skips town, but when the Library is dropped, the loadee is unloaded, these 'loadee_static lifetimes end, and all these pointers dangle. When you then try to call .find_repo() there is a double dereference of dangling pointers happening (the vtable pointer, then the method pointer inside it) :grimacing:.
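
To make the contrast concrete: libloading's Symbol<'lib, T> carries the lifetime of the Library it came from, whereas Box<dyn RepoFinderService> is implicitly Box<dyn RepoFinderService + 'static>, so nothing ties it to the library. A small sketch of what tying a value to the library looks like, using hypothetical stand-in types (FakeLibrary, LoadedService) rather than the real crate:

```rust
use std::marker::PhantomData;

/// Hypothetical stand-in for `libloading::Library`.
pub struct FakeLibrary;

/// A value that is only valid while the library it came from is loaded.
/// The `'lib` parameter plays the role of 'loadee_static.
pub struct LoadedService<'lib> {
    name: &'static str,
    _lib: PhantomData<&'lib FakeLibrary>,
}

impl FakeLibrary {
    /// The returned service borrows `self`, so it cannot outlive the library.
    pub fn new_service(&self) -> LoadedService<'_> {
        LoadedService {
            name: "dev",
            _lib: PhantomData,
        }
    }
}

impl LoadedService<'_> {
    pub fn name(&self) -> &str {
        self.name
    }
}

pub fn use_service() -> String {
    let lib = FakeLibrary;
    let service = lib.new_service();
    // Inserting `drop(lib);` here would be rejected by the borrow checker,
    // because `service` still borrows `lib` on the next line.
    service.name().to_string()
}
```

The price of this design is that such a service cannot be stored next to its library in the same struct without self-referential tricks, which is one reason the never-unload approach described next is attractive.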

The best solution here is to give up on the idea, horrendous for Rust, of unloading a library that yields 'loadee_static references. That way, if the Library is never dropped, we have 'loadee_static == 'static and all is fine :slightly_smiling_face:

  • In practice, this means that any Library instance should be living within a specific static …: ::once_cell::sync::Lazy<Library>, never to be overridden (and thus unloaded).

    This is equivalent to only working with &'static Library references, by the way.
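
A minimal sketch of that pattern, with two swaps made so that it runs without external crates: it uses the standard library's std::sync::OnceLock instead of once_cell::sync::Lazy, and a hypothetical FakeLibrary stand-in instead of libloading::Library. The accessor name repo_finder_lib is also an assumption.

```rust
use std::sync::OnceLock;

/// Hypothetical stand-in for `libloading::Library`.
pub struct FakeLibrary {
    path: String,
}

impl FakeLibrary {
    /// Stand-in for loading the shared object; a real loader would call
    /// `libloading::Library::new` here.
    fn open(path: &str) -> FakeLibrary {
        FakeLibrary {
            path: path.to_string(),
        }
    }

    pub fn path(&self) -> &str {
        &self.path
    }
}

/// The library lives in a static, and statics are never dropped,
/// so it is loaded at most once and never unloaded.
static REPO_FINDER_LIB: OnceLock<FakeLibrary> = OnceLock::new();

/// Every caller gets the same `&'static` reference to the library.
pub fn repo_finder_lib() -> &'static FakeLibrary {
    REPO_FINDER_LIB.get_or_init(|| FakeLibrary::open("librepo_finder.so"))
}
```

Anything obtained through repo_finder_lib() can then honestly be treated as 'static, since the library it points into stays loaded for the rest of the process.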
