Using a raw pointer returned from a DLL function (conversion from C)

I've been trying to rewrite a piece of C code that uses a function from a DLL file, but I'm really struggling with it.

For reference, this is the original code in C (only showing the relevant bits)

struct MyItem {
    u32 id;
    i32 count;
};

void DoSomething(MyItem* items) {
    for(size_t i = 0; items[i].id; i++)
        printf("id: %u, count: %d\n", items[i].id, items[i].count);
}

using GetItemsFn = MyItem*(__cdecl*)();
GetItemsFn getItems = nullptr;
getItems = (decltype(getItems))GetProcAddress(lib, "get_items");

DoSomething(getItems());

So we have a get_items() func in the DLL that returns a MyItem* pointer which actually points to zero or more elements, so the code loops over it like an array. I'm not good at C, and this seems confusing to me. Anyway.

How can I handle this in Rust?
This is what I have so far:

use std::ffi::{c_int, c_uint};
use libloading::{Library, Symbol};

#[repr(C)]
struct MyItem {
    id: c_uint,
    count: c_int,
}

fn main() {
    let dll_path = libloading::library_filename("/some-dll");
    let lib: Library;
    let get_items_fn: Symbol<fn() -> *const MyItem>;

    unsafe {
        lib = Library::new(dll_path).unwrap();
        get_items_fn = lib.get(b"get_items").unwrap();
    }

    let items: *const MyItem = get_items_fn();

    // ... and then what?
    // How do I pass items into do_something() ?

}

fn do_something(items: &[MyItem]) {
    for item in items.iter() {
        if item.id == 0 {
            break;
        }
        println!("ITEM (id:{}, count: {})", item.id, item.count);
    }
}

One suggestion was to use let slice = unsafe { std::slice::from_raw_parts(items, 3) }; but I don't know the length as it can be anything.

Any help would be appreciated!

The C struct (u32 id; i32 count;) and the Rust struct (id: c_uint, count: c_int) are not equivalent. I presume the Rust version should be using the u32 and i32 datatypes.

get_items_fn should be marked unsafe and the call to it should be in an unsafe block.

It looks like the end-of-list is marked by a zero id...

You will need to compare items to null before using the pointer. The length can be determined by scanning items for a zero id. After that from_raw_parts becomes useful.
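A minimal sketch of those three steps (null check, scan for the zero id, then from_raw_parts). The helper name items_as_slice and the choice to return an Option are my own assumptions, not anything from the DLL or from libloading, and the lifetime 'a is picked by the caller, so it is only as trustworthy as the caller's knowledge of who owns the buffer.

```rust
use std::slice;

#[repr(C)]
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct MyItem {
    pub id: u32,
    pub count: i32,
}

/// Builds a slice over a zero-id-terminated array of `MyItem`.
/// Returns `None` for a null pointer. The terminator is not part of the slice.
///
/// # Safety
/// `items` must be null, or point to a readable, aligned array that ends with
/// an element whose `id` is 0. The data must stay alive and unmodified for `'a`.
pub unsafe fn items_as_slice<'a>(items: *const MyItem) -> Option<&'a [MyItem]> {
    if items.is_null() {
        return None;
    }
    let mut len = 0usize;
    loop {
        // Read only the `id` field of the element at index `len`.
        let id = unsafe { (*items.add(len)).id };
        if id == 0 {
            break;
        }
        len += 1;
    }
    Some(unsafe { slice::from_raw_parts(items, len) })
}
```

With this, the call site would look roughly like `if let Some(items) = unsafe { items_as_slice(ptr) } { do_something(items) }`.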

Pedantic: that's C++, not C. C doesn't have anything like using, decltype or nullptr.

The original code uses fixed-width integer types. u32 and i32 aren't standard types (those would be uint32_t and int32_t), but these typedefs are quite common in C++ code. In any case, translating them as c_uint and c_int would be incorrect. c_uint corresponds to unsigned int, a platform-dependent type whose size the standard leaves largely unspecified. Similarly, c_int should be used only when the original used int.
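For illustration, here is how the struct could look with fixed-width types, assuming that u32 and i32 in the original really are typedefs for uint32_t and int32_t (the snippet doesn't show their definitions). The const assertion is just a cheap compile-time sanity check on the size.

```rust
use std::mem::size_of;

/// Mirrors the original `struct MyItem { u32 id; i32 count; }`,
/// assuming `u32`/`i32` there are 32-bit fixed-width typedefs.
#[repr(C)]
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct MyItem {
    pub id: u32,
    pub count: i32,
}

// Fails to compile if the struct is not exactly two 4-byte fields.
const _: () = assert!(size_of::<MyItem>() == 8);
```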

The getItems setup is a bit odd. Why initialize the pointer to nullptr and introduce the possibility of bugs, when you can assign the value directly? Also, why use decltype when the type is explicitly known and, most importantly, determined by the name of the linked DLL function? Calling a function through a pointer of the wrong type is UB, so you should be paranoid about the types of the linked functions. So I would write that part as

auto getItems = reinterpret_cast<GetItemsFn>(GetProcAddress(lib, "get_items"));

I can only assume the original code is more complex and has a good reason to have this weird way of doing things, while the example is oversimplified.

Ouch. So it works basically like C's null-terminated strings. A very bad and error-prone idea. I can only assume you don't control the source of the library, but if you do, strongly consider changing that type.

Important question: the returned type is a pointer. Who owns the pointed to data? Is it an owned pointer that you must manually free later, or is it a non-owned pointer? If so, are you allowed to write to the pointed data? And what is its lifetime bound to? Those are important questions if you want to properly wrap that code in Rust.

Of course, if we're talking about one-off code rather than a proper safe wrapper for some library, it may be fine to use raw pointers a few times, rather than trying to build a proper safe abstraction.

You have two logically independent operations with different preconditions inside a single unsafe block. That's a good way to introduce soundness errors. You should split the block into two parts, which also lets you write the code more neatly, without separating the declaration of a variable from its definition. Like this:

// SAFETY: provide explanation on the safety of linking a library. Most importantly, this likely
// requires talking about its multithreading safety, and ensuring that you don't violate it (e.g.
// linking a dynamic library concurrently may not be safe, look up your OS and library specifics).
let lib = unsafe { Library::new(dll_path).unwrap() };
// SAFETY: ensure that the type of the linked function is properly specified, including its
// ABI (the original is __cdecl, hence extern "C"). Wrong function signature is UB.
let get_items_fn = unsafe { lib.get::<unsafe extern "C" fn() -> *const MyItem>(b"get_items").unwrap() };

Indeed, you don't know the length. You can compute it the same way as the original code does: walk the array until you find item.id == 0. Like this:

/// # Safety
/// `items` must point to live, valid data. The function also does an unchecked walk over
/// the pointed-to buffer, which may read out of bounds if the buffer was improperly
/// initialized or the zero-id terminator was overwritten at some point.
unsafe fn get_items_len(items: *const MyItem) -> usize {
    for len in 0.. {
        // Do you have any upper bound on the size of the buffer? If you do, it would be
        // nice to put a guard here, to reduce the possibility of out-of-bounds access
        // if the buffer is malformed.
        // Read only the `id` field, so `MyItem` does not need to be `Copy`.
        if (*items.add(len)).id == 0 { return len; }
    }
    unreachable!();
}
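If you do have an upper bound, the guard mentioned in the comment could look something like this. It's only a sketch: the name get_items_len_bounded and the max_len parameter are made up for illustration, and a sensible value for max_len is something only you can know from the DLL's documentation or behavior.

```rust
#[repr(C)]
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct MyItem {
    pub id: u32,
    pub count: i32,
}

/// Counts the items before the zero-id terminator, giving up after
/// inspecting `max_len` elements. Returns `None` if no terminator was seen.
///
/// # Safety
/// `items` must be non-null, aligned, and readable up to and including the
/// terminator, or for `max_len` elements, whichever comes first.
pub unsafe fn get_items_len_bounded(items: *const MyItem, max_len: usize) -> Option<usize> {
    for len in 0..max_len {
        // Read only the `id` field of the element at index `len`.
        let id = unsafe { (*items.add(len)).id };
        if id == 0 {
            return Some(len);
        }
    }
    // No terminator within the bound: treat the buffer as malformed.
    None
}
```

Returning None instead of panicking lets the caller decide whether a missing terminator is fatal.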

Now that you have the length, you can pass it to slice::from_raw_parts and proceed as usual. That said, &[MyItem] has an important limitation: it can be freely copied and shared between threads, and the pointed-to data must be immutable for as long as the slice exists. Are you sure that accesses to *MyItem are thread-safe and reentrant? You should ensure that other library functions which you're going to call aren't going to do concurrent writes to that buffer, or worse, free or reallocate it from under you.
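If you can't answer those questions for a closed-source DLL, one conservative option is to copy the items into an owned Vec right after the call and never hold a reference into the library's buffer. This is a sketch under that assumption; copy_items is a hypothetical helper, and it does not help if the library writes to the buffer while the copy is running, nor does it free the buffer if you turn out to own it.

```rust
use std::ptr;

#[repr(C)]
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct MyItem {
    pub id: u32,
    pub count: i32,
}

/// Copies a zero-id-terminated array into an owned `Vec`, without ever
/// creating a Rust reference to the foreign memory. Null yields an empty `Vec`.
///
/// # Safety
/// `items` must be null, or point to a readable, aligned array that ends with
/// an element whose `id` is 0, and nothing may write to it during the copy.
pub unsafe fn copy_items(items: *const MyItem) -> Vec<MyItem> {
    let mut out = Vec::new();
    if items.is_null() {
        return out;
    }
    let mut cursor = items;
    loop {
        // Copy one element out of the foreign buffer.
        let item = unsafe { ptr::read(cursor) };
        if item.id == 0 {
            break;
        }
        out.push(item);
        cursor = unsafe { cursor.add(1) };
    }
    out
}
```

The resulting Vec can then be passed along as `do_something(&owned)`, and it stays valid no matter what the library does to its own buffer afterwards.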

Thank you for this detailed response, it's really helpful.

You are correct, I don't have control over the original C++ code nor the (closed-source) DLL file. I'm just trying to revive someone's abandoned project.

unsafe fn get_items_len(items: *const MyItem) -> usize {
    for len in 0.. {
        if (*items.add(len)).id == 0 { return len; }
    }
    unreachable!();
}

This is exactly what I was looking for! I didn't know you could use add() in this case.

Pedantic: C23 adds typeof and nullptr.

What restriction of pointer::add did you think wasn't satisfied? The C code walks through with pointer increments, so it should follow that Rust can do the same with the (raw) pointer.

Pedantic: C23 is still a draft, not an accepted standard. Never mind support in production compilers. So as of today, C doesn't have nullptr.

I think they just weren't aware of its existence. Which is fair: the ptr module is huge and grows bigger over time. It's quite hard to find functionality unless you already have a good idea of what you are searching for.
