LD_PRELOAD works as dylib and doesn't as cdylib. Which is the right choice?

@Michael-F-Bryan recommended in this group that I use cdylib as the crate type for a Rust LD_PRELOAD shared library I am working on. I was using dylib before that.
My reading of other threads in this group also backs up his recommendation.

Interestingly enough, dylib was working for me. I then switched to cdylib and the LD_PRELOAD intercept portion stopped working, though the .so still gets executed, as shown below.

Here is the example Rust shared library code I used for testing. It intercepts readlink to print some debug messages and then continues to the original readlink.

extern crate core;
extern crate libc;
#[macro_use]
extern crate ctor; 


use libc::{c_void,c_char,c_int,size_t,ssize_t};

use std::sync::atomic;

#[cfg(any(target_os = "macos", target_os = "ios"))]
pub mod dyld_insert_libraries;

/* Some Rust library functionality (e.g., jemalloc) initializes
 * lazily, after the hooking library has inserted itself into the call
 * path. If the initialization uses any hooked functions, this will lead
 * to an infinite loop. Work around this by running some initialization
 * code in a static constructor, and bypassing all hooks until it has
 * completed. */

static INIT_STATE: atomic::AtomicBool = atomic::AtomicBool::new(false);

pub fn initialized() -> bool {
    INIT_STATE.load(atomic::Ordering::SeqCst)
}

#[ctor]
fn initialize() {
    Box::new(0u8);
    INIT_STATE.store(true, atomic::Ordering::SeqCst);
    println!("Constructor");
}


#[link(name = "dl")]
extern "C" {
    fn dlsym(handle: *const c_void, symbol: *const c_char) -> *const c_void;
}

const RTLD_NEXT: *const c_void = -1isize as *const c_void;

pub unsafe fn dlsym_next(symbol: &'static str) -> *const u8 {
    let ptr = dlsym(RTLD_NEXT, symbol.as_ptr() as *const c_char);
    if ptr.is_null() {
        panic!("redhook: Unable to find underlying function for {}", symbol);
    }
    ptr as *const u8
}


#[allow(non_camel_case_types)]
pub struct readlink {__private_field: ()}
#[allow(non_upper_case_globals)]
static readlink: readlink = readlink {__private_field: ()};

impl readlink {
    fn get(&self) -> unsafe extern fn (path: *const c_char, buf: *mut c_char, bufsiz: size_t) -> ssize_t  {
        use ::std::sync::Once;

        static mut REAL: *const u8 = 0 as *const u8;
        static mut ONCE: Once = Once::new();

        unsafe {
            ONCE.call_once(|| {
                REAL = dlsym_next(concat!("readlink", "\0"));
            });
            ::std::mem::transmute(REAL)
        }
    }

    #[no_mangle]
    pub unsafe extern "C" fn readlink(path: *const c_char, buf: *mut c_char, bufsiz: size_t) -> ssize_t {
        println!("readlink");
        if initialized() {
            println!("initialized");
            ::std::panic::catch_unwind(|| my_readlink ( path, buf, bufsiz )).ok()
        } else {
            println!("not initialized");
            None
        }.unwrap_or_else(|| readlink.get() ( path, buf, bufsiz ))
    }
}

pub unsafe fn my_readlink(path: *const c_char, buf: *mut c_char, bufsiz: size_t) -> ssize_t {
    println!("my_readlink");
    readlink.get()(path, buf, bufsiz)
}
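
The dispatch logic in the exported wrapper above can also be read in isolation. Below is a minimal, standard-library-only sketch of the same shape: skip the hook until initialization has finished, and fall back to the original if the hook is skipped or panics. The names READY, original, hook and dispatch are hypothetical stand-ins for this sketch, not part of redhook or of the code above.

```rust
use std::panic;
use std::sync::atomic::{AtomicBool, Ordering};

// Hypothetical stand-in for the INIT_STATE flag set by the constructor.
static READY: AtomicBool = AtomicBool::new(false);

// Hypothetical stand-in for the real function found via dlsym(RTLD_NEXT, ...).
fn original(x: i32) -> i32 {
    x
}

// Hypothetical stand-in for the user hook; it panics on negative input so
// that the fallback path can be exercised.
fn hook(x: i32) -> i32 {
    if x < 0 {
        panic!("hook failed");
    }
    x + 100
}

// Try the hook only once initialization has completed; if the hook was
// skipped or panicked, call the original instead.
fn dispatch(x: i32) -> i32 {
    let attempt = if READY.load(Ordering::SeqCst) {
        panic::catch_unwind(|| hook(x)).ok()
    } else {
        None
    };
    attempt.unwrap_or_else(|| original(x))
}
```

Before READY is set, dispatch goes straight to original; afterwards it goes through hook, and a panicking hook is caught so that control never unwinds into the C caller.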

My Cargo.toml looks like this

[package]
name = "readlink"
version = "0.1.0"
authors = ["Saravanan Shanmugham <sarvi@cisco.com>"]

[lib]
name = "readlink"
crate_type = ["dylib"]

[dependencies]
libc = "0.2"
ctor = "0.1.15"

And this works. I see the constructor executing and my_readlink, which is my intercept function, being called. ls -al /tmp/link works and shows the symlink as expected, so the original readlink was executed as well. So all is well here.

bash-4.4$ LD_PRELOAD=target/debug/libreadlink.so ls -al /tmp/link 
Constructor
readlink
initialized
my_readlink
lrwxrwxrwx 1 sarvi eng 9 Aug 31 11:11 /tmp/link -> /tmp/file
bash-4.4$ 

I then changed
crate_type = ["dylib"]
to
crate_type = ["cdylib"]
And I see this: only the constructor of libreadlink.so gets executed, and none of the interception happens.

bash-4.4$ LD_PRELOAD=target/debug/libreadlink.so ls -al /tmp/link 
Constructor
lrwxrwxrwx 1 sarvi eng 9 Aug 31 11:11 /tmp/link -> /tmp/file
bash-4.4$ 

Is cdylib the right choice? And why does my interception not work with the recommended cdylib when it works as a dylib?
Considering dylib is supposed to be smaller and is working, can I continue with dylib? Or should I be worried about other pitfalls of dylibs exposing C externs for calls from other C programs?

I wonder if this is because you wrote it as an associated function, readlink::readlink. Even though that's marked extern and unmangled, the compiler might not be getting the visibility right for cdylib. Maybe try as a top-level function?

I did create a case about Rust C_Variadic functions not working as associated functions, so it makes sense that regular functions and associated functions behave differently.
Maybe I should report this as a bug against the Rust compiler as well.

The example I use above is a simplified, macro-expanded version of code from the redhook LD_PRELOAD library.
redhook is built as a dylib and its tests are evidently passing.
Things have been working fine as dylib for me as well. Also, cdylib is larger.

Without the associated-function implementation, the macro version of this code in redhook is more difficult to implement, and I haven't found a solution, since macro variable expansion cannot produce identifiers such as
struct$macrovariablename
get$macrovariablename
my_$macrovariablename
$macrovariablename

At least I haven't figured out how to do this yet in Rust macros.
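
For illustration, one way to sidestep identifier concatenation in plain macro_rules! is to pass every generated name explicitly. This is only a sketch under that assumption, not redhook's actual macro: demo_hook, my_readlink and get_readlink are hypothetical names, and the generated functions are ordinary Rust functions rather than exported C symbols.

```rust
// Every generated name is supplied by the caller, so the macro never has to
// glue identifiers together.
macro_rules! demo_hook {
    (fn $real:ident => $hook:ident, $getter:ident ( $($v:ident : $t:ty),* ) -> $r:ty $body:block) => {
        // The user-supplied body becomes an ordinary top-level function.
        pub fn $hook($($v: $t),*) -> $r {
            $body
        }

        // A second generated item, named by the caller rather than derived
        // from $real; it only reports which function is being hooked.
        pub fn $getter() -> &'static str {
            stringify!($real)
        }
    };
}

demo_hook! {
    fn readlink => my_readlink, get_readlink(len: usize) -> usize {
        len + 1
    }
}
```

The cost is a noisier invocation, since the caller spells out each name that a concatenating macro would have derived from readlink.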

So I am wondering if I really do need to move to cdylib as recommended, since dylib seems to be working fine so far.

You might like the paste crate for synthesizing identifiers.

Thanks for the pointer on paste. Will try it out. Looks like what I need.

BTW, when I moved away from associated functions, the intercept part works fine now.
I filed an issue against the Rust compiler with this example, referring to this thread.

The size difference between cdylib and dylib is 4.8M vs 3.1M.

Considering the exporting part works in dylib, do I really need to go with cdylib? What do I get that I need with the extra 1.6M of size that I don't get with dylib?

cdylib is more aggressive about hiding symbols, letting the linker prune unreachable code.

You can use nm --defined-only --dynamic libfoo.so to see the difference -- and your readlink is missing from the broken case.

dylib is not guaranteed to work for what you are doing; the fact that it currently does is you being lucky with the current implementation of dylib. That means that what works right now using dylib could break or stop working when updating Rust. Definitely not something you'd want in production. Thus,

do use cdylib, don't use dylib.

  • In practice, dylib is more of a vestigial feature than anything else, as of now.

That being said, I have done a minimal repro:

pub
enum Bar {}

impl Bar {
    #[no_mangle] pub
    fn foo () {}
    #[export_name = "baz"] pub
    fn baz () {}
}

#[no_mangle] pub
fn bar () {}

And indeed neither foo nor baz get exported: nm ... | grep -E 'foo|bar|baz' only outputs bar.

So, if there is a bug report to file, it is about that behavior (mentioning dylib, again, is just an unrelated distraction).


In practice, the solution to your issue is thus to export #[no_mangle] or #[export_name] functions that are not part of an impl block.
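
As a rough sketch of that shape: the exported symbol is a top-level function, and any impl block holds only ordinary Rust helpers. The names demo_double and Bar::double are hypothetical and chosen so they cannot collide with a real libc function. The attribute is written as #[no_mangle] to match this thread; on the 2024 edition it is spelled #[unsafe(no_mangle)].

```rust
use std::os::raw::c_int;

pub enum Bar {}

impl Bar {
    // Ordinary associated function: callable from Rust, but not relied on
    // here as an exported C symbol.
    pub fn double(x: c_int) -> c_int {
        x * 2
    }
}

// Top-level item with a fixed, unmangled symbol name; it forwards to the
// associated function so the logic can stay inside the impl block.
#[no_mangle]
pub extern "C" fn demo_double(x: c_int) -> c_int {
    Bar::double(x)
}
```

Whether the symbol actually ends up in the dynamic symbol table can then be checked with the nm invocation mentioned earlier in the thread.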

While true, in this instance I'd recommend using #[export_name = "my_identifier"] rather than #[no_mangle]-ing a [<my_ identifier>].

I don't think there's anything you can put in export_name except a literal string, so that doesn't work if these are being macro-generated.

Oh, I missed the fact that this was with macro-expanded code, my bad.

In that case #[export_name] becomes a tad more cumbersome, since it requires the same quirk that macro-generated docstrings require:

macro_rules! export_name {(
    #[export_name = $name:expr]
    $item:item
) => (
    #[export_name = $name]
    $item
)}
export_name! {
    #[export_name = concat!("f", "oo")]
    pub
    fn whatever () {}
}

At this point there is indeed no real winner between the two options; using paste is not worse than the above.

I ran into a problem with paste and can't figure out what is happening.

Here is a macro definition

#[macro_export]
macro_rules! testhook {

    (unsafe fn $real_fn:ident ( $($v:ident : $t:ty),* ) -> $r:ty => $hook_fn:ident $body:block) => {
        paste! {
            pub unsafe fn $hook_fn ( $($v : $t),* ) -> $r {
                    // event!(Level::INFO, "{}()", stringify!($real_fn));
                    $body
            }               
        }
    };
}

And its invocation

testhook! {
    unsafe fn readlink(path: *const c_char, buf: *mut c_char, bufsiz: size_t) -> ssize_t => my_readlink {
        println!("readlink({})", CStr::from_ptr(path).to_string_lossy());
        0
    }
}

The above compiles.
But if I uncomment the following line in the macro definition, compilation fails.
event!(Level::INFO, "{}()", stringify!($real_fn));

as follows

+ cargo build
   Compiling redhook_ex_varprintspy v0.0.1 (/ws/sarvi-sjc/redhook/examples/varprintspy)
error[E0423]: expected value, found built-in attribute `path`
  --> src/lib.rs:42:1
   |
42 | / testhook! {
43 | |     unsafe fn readlink(path: *const c_char, buf: *mut c_char, bufsiz: size_t) -> ssize_t => my_readlink {
44 | |         println!("readlink({})", CStr::from_ptr(path).to_string_lossy());
45 | |         0
46 | |     }
47 | | }
   | |_^ not a value
   |
   = note: this error originates in a macro (in Nightly builds, run with -Z macro-backtrace for more info)

warning: unused import: `core::cell::Cell`

Furthermore, if I replace the stringify!($real_fn) with "something", then it compiles.

BTW, without using paste!{} all this compiles just fine.

It looks like paste doesn't like something about the content within it relating to stringify!($real_fn).

Can't figure out what. Any idea what is wrong here?

Here is a more compact, self-contained example.
This fails to compile:

extern crate paste;
extern crate libc;

use paste::paste;
use std::ffi::CStr;
use libc::{c_void,c_char,c_int,size_t,ssize_t};

#[macro_export]
macro_rules! testhook {

    (unsafe fn $real_fn:ident ( $($v:ident : $t:ty),* ) -> $r:ty => $hook_fn:ident $body:block) => {
        paste! {
            pub unsafe fn $hook_fn ( $($v : $t),* ) -> $r {
                    println!("{}()", stringify!($real_fn));
                    $body
            }               
        }
    };
}

testhook! {
    unsafe fn readlink(path: *const c_char, buf: *mut c_char, bufsiz: size_t) -> ssize_t => my_readlink {
        println!("readlink({})", CStr::from_ptr(path).to_string_lossy());
        0
    }
}

This succeeds without stringify

extern crate paste;
extern crate libc;

use paste::paste;
use std::ffi::CStr;
use libc::{c_void,c_char,c_int,size_t,ssize_t};

#[macro_export]
macro_rules! testhook {

    (unsafe fn $real_fn:ident ( $($v:ident : $t:ty),* ) -> $r:ty => $hook_fn:ident $body:block) => {
        paste! {
            pub unsafe fn $hook_fn ( $($v : $t),* ) -> $r {
                    println!("{}()", "something");
                    $body
            }               
        }
    };
}

testhook! {
    unsafe fn readlink(path: *const c_char, buf: *mut c_char, bufsiz: size_t) -> ssize_t => my_readlink {
        println!("readlink({})", CStr::from_ptr(path).to_string_lossy());
        0
    }
}

And the following compiles without paste:

extern crate paste;
extern crate libc;

use paste::paste;
use std::ffi::CStr;
use libc::{c_void,c_char,c_int,size_t,ssize_t};

#[macro_export]
macro_rules! testhook {

    (unsafe fn $real_fn:ident ( $($v:ident : $t:ty),* ) -> $r:ty => $hook_fn:ident $body:block) => {
            pub unsafe fn $hook_fn ( $($v : $t),* ) -> $r {
                    println!("{}()", stringify!($real_fn));
                    $body
            }
    };
}

testhook! {
    unsafe fn readlink(path: *const c_char, buf: *mut c_char, bufsiz: size_t) -> ssize_t => my_readlink {
        println!("readlink({})", CStr::from_ptr(path).to_string_lossy());
        0
    }
}

The problem seems to be that paste!{} doesn't like the stringify!($real_fn) in the paste!{} context.

This seems worthy of a bug report to paste.

Thought so. Created one

I generally find debugging macro invocations very painful. Is there a good option to show how the macro expands and where, after macro expansion, the error is coming from?

You might find cargo-expand helpful for exactly this.