Having a hard time using Rust/C bindings

I recently used bindgen to generate Rust bindings for this C library: GitHub - yangao07/abPOA: abPOA: an SIMD-based C library for fast partial order alignment using adaptive ba

I am now trying to re-implement this example (abPOA/example.c at main · yangao07/abPOA · GitHub) in Rust, to make sure that my bindings are working correctly. My Rust implementation of the same example is the following: rs-abPOA/main.rs at main · HopedWall/rs-abPOA · GitHub.

I am having a hard time re-implementing lines 95-103 from the C file, which should correspond to lines 61-72 in Rust. I can't seem to use malloc correctly in Rust; more precisely, I am not able to assign new values to the allocated memory. Does anybody know how this should be done?

Do you need to use malloc? It'd be easier to push to a Vec instead.

let bseqs: *mut *mut u8 = malloc((mem::size_of::<u8>() * n_seqs as usize) as u64) as *mut *mut u8;

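This is incorrect: you're allocating n_seqs * size_of::<u8>() bytes (1 byte per element), but then expecting to store elements of type *mut u8, which are 4 or 8 bytes each depending on the platform. The size should be computed with mem::size_of::<*mut u8>().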

Unlike C, Rust doesn't support [] on pointers. You have to use *(ptr.add(i)) instead of ptr[i].
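
To illustrate both points, here is a minimal sketch, assuming the goal is a table of n sequence pointers. The helper name fill_pointer_table is made up for this example and is not part of abPOA or the generated bindings. It lets a Vec<*mut u8> pick the right element size and writes a slot through the raw pointer with add:

```rust
use std::mem;
use std::ptr;

/// Builds a table of `n` null pointers, then overwrites slot `i`
/// through a raw pointer, the way C code would write `table[i] = p`.
/// (Hypothetical helper, for illustration only.)
fn fill_pointer_table(n: usize, i: usize, p: *mut u8) -> Vec<*mut u8> {
    // A Vec<*mut u8> reserves n * size_of::<*mut u8>() bytes,
    // not n * size_of::<u8>() bytes.
    let mut table: Vec<*mut u8> = vec![ptr::null_mut(); n];
    assert!(i < n, "index out of bounds");
    let raw: *mut *mut u8 = table.as_mut_ptr();
    unsafe {
        // Equivalent of `raw[i] = p` in C.
        *raw.add(i) = p;
    }
    table
}

fn main() {
    // One pointer is larger than one byte, hence the original sizing bug.
    assert!(mem::size_of::<*mut u8>() > mem::size_of::<u8>());

    let mut byte = 42u8;
    let table = fill_pointer_table(3, 1, &mut byte);
    assert!(table[0].is_null());
    assert_eq!(unsafe { *table[1] }, 42);
}
```

If the C side really has to own the memory, the same add-based indexing applies to a pointer returned by malloc, as long as the byte count is computed from size_of::<*mut u8>().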


Thanks for your answer @kornel! I used some Vecs and I was able to define "simple" pointers such as:

let mut seq_lens_val : Vec<c_int> = vec![0;seqs.len()];
let seq_lens: *mut c_int  = seq_lens_val.as_mut_ptr();

However, in order to use some of the functions in the library, I now need to define a *mut *mut u8, how could I do that? Here's my attempt:

bseqs is of type *mut *mut u8, but it does not seem to be working because I get this error: (signal: 11, SIGSEGV: invalid memory reference)

Any chance you are using those pointers after the original vectors have gone out of scope?


That was exactly the problem! Everything inside the for loop was going out of scope.
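
For readers hitting the same SIGSEGV, this is a rough sketch of the pattern, not the code from the repository. The function name build_seq_table is hypothetical. The owning Vec<Vec<u8>> is returned together with the pointer table, so the buffers outlive every pointer into them:

```rust
/// Returns the owned buffers together with a table of pointers into them.
/// The pointers are valid only while the returned `Vec<Vec<u8>>` is alive
/// and its inner vectors are not resized.
/// (Hypothetical helper, for illustration only.)
fn build_seq_table(seqs: &[&str]) -> (Vec<Vec<u8>>, Vec<*mut u8>) {
    // Own the bytes first...
    let mut owned: Vec<Vec<u8>> = seqs.iter().map(|s| s.as_bytes().to_vec()).collect();
    // ...then take pointers into the owned buffers. Moving `owned` out of
    // this function does not move the heap buffers the pointers refer to.
    let ptrs: Vec<*mut u8> = owned.iter_mut().map(|v| v.as_mut_ptr()).collect();
    (owned, ptrs)
}

fn main() {
    let (owned, mut ptrs) = build_seq_table(&["ACGT", "TTA"]);

    // This is the `*mut *mut u8` a C function would receive.
    let bseqs: *mut *mut u8 = ptrs.as_mut_ptr();

    // `owned` and `ptrs` are both still in scope, so these reads are valid.
    let second_seq: *mut u8 = unsafe { *bseqs.add(1) };
    let first_byte = unsafe { *second_seq };
    assert_eq!(first_byte, b'T');
    assert_eq!(owned.len(), 2);
}
```

Building the inner vectors inside a loop body and keeping only their pointers would instead drop each buffer at the end of its iteration, which matches the failure described above.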

I managed to fix this by replacing the for loop with some clever iterator usage. Just for future reference, I'll leave the updated code here:

Thanks a lot @H2CO3 !

Why don't you just use safe code with references or indices instead of unsafe and raw pointers? The whole point of Rust is that you can't run into these kinds of memory management errors if you use the safe abstractions that the language and the standard library provide.


That's because I'm working with auto-generated bindings for a C library (I used bindgen). In this library many functions require raw pointers as parameters, here's an example:

int abpoa_msa(abpoa_t *ab, abpoa_para_t *abpt, int n_seqs, char **seq_names, int *seq_lens, uint8_t **seqs, FILE *out_fp, uint8_t ***cons_seq, int ***cons_cov, int **cons_l, int *cons_n, uint8_t ***msa_seq, int *msa_l);

The auto-generated Rust binding looks like this:

extern "C" {
    pub fn abpoa_msa(
        ab: *mut abpoa_t,
        abpt: *mut abpoa_para_t,
        n_seqs: ::std::os::raw::c_int,
        seq_names: *mut *mut ::std::os::raw::c_char,
        seq_lens: *mut ::std::os::raw::c_int,
        seqs: *mut *mut u8,
        out_fp: *mut FILE,
        cons_seq: *mut *mut *mut u8,
        cons_cov: *mut *mut *mut ::std::os::raw::c_int,
        cons_l: *mut *mut ::std::os::raw::c_int,
        cons_n: *mut ::std::os::raw::c_int,
        msa_seq: *mut *mut *mut u8,
        msa_l: *mut ::std::os::raw::c_int,
    ) -> ::std::os::raw::c_int;
}

So I need to pass raw pointers to this function. Maybe I'm missing something?

I get that, but this doesn't mean you have to use raw pointers everywhere. You could get temporary raw pointers at exactly the call sites of the C FFI functions, and otherwise keep using idiomatic Rust data structures.

For example, collecting a full array of raw pointers and storing them in a vector seems like an anti-pattern. If the C library requires a pointer-to-pointer, wrap it in a safe-to-use function that is short and obviously correct, in that it only hands out short-lived pointers.
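
A sketch of what such a wrapper could look like. The function count_bases is a stand-in written in Rust so the example runs without linking anything; it is not an abPOA function, and count_bases_safe is likewise a made-up name. The raw pointers exist only for the duration of the call:

```rust
use std::os::raw::c_int;

/// Stand-in for a C function with the shape
/// `int f(int n, int *lens, uint8_t **seqs, uint8_t base)`.
/// Hypothetical: defined in Rust only so this sketch is runnable.
unsafe extern "C" fn count_bases(
    n: c_int,
    lens: *mut c_int,
    seqs: *mut *mut u8,
    base: u8,
) -> c_int {
    let mut count = 0;
    unsafe {
        for i in 0..n as usize {
            let seq = *seqs.add(i);
            for j in 0..*lens.add(i) as usize {
                if *seq.add(j) == base {
                    count += 1;
                }
            }
        }
    }
    count
}

/// Safe wrapper: callers pass ordinary Rust data, and the pointer
/// tables are temporaries that are dropped right after the call.
fn count_bases_safe(seqs: &mut [Vec<u8>], base: u8) -> usize {
    let mut lens: Vec<c_int> = seqs.iter().map(|s| s.len() as c_int).collect();
    let mut ptrs: Vec<*mut u8> = seqs.iter_mut().map(|s| s.as_mut_ptr()).collect();
    // `seqs`, `lens` and `ptrs` all outlive the call, so every pointer
    // handed to the callee is valid while it runs.
    let n = unsafe {
        count_bases(ptrs.len() as c_int, lens.as_mut_ptr(), ptrs.as_mut_ptr(), base)
    };
    n as usize
}

fn main() {
    let mut seqs = vec![b"ACGA".to_vec(), b"TTA".to_vec()];
    assert_eq!(count_bases_safe(&mut seqs, b'A'), 3);
}
```

The rest of the program can then work with Vec<Vec<u8>> and never see a raw pointer.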


Oh I understand, I should definitely do that! Thanks for the advice @H2CO3!

Unfortunately I am running into more issues with the abpoa_msa function specified in one of my previous messages.
In the original C implementation this function takes as inputs some (uninitialized) pointers such as:

uint8_t **msa_seq;
int **cons_cov;

It then allocates the required memory, and makes the pointers point to the newly allocated memory.

I've tried to replicate that in Rust by doing this:

let msa_seq: *mut *mut *mut u8 = ptr::null_mut();
let cons_cov: *mut *mut *mut c_int = ptr::null_mut();

I've used ptr::null_mut() because Rust explicitly requires pointers to be initialized. However, the function seems unable to modify these pointers, and trying to dereference them after the function is called results in a signal 11: SIGSEGV. Perhaps these pointers should be initialized in another way?

Also the function itself seems to be working just fine (its main output is printed on screen), the problem is that these pointers still point to 0x0000000000000000, which should be the same as ptr::null_mut().

If a function accepts a ***u8 because it needs to initialize a **u8, then it obviously wants to write through the additional level of indirection. But you are passing a null pointer, dereferencing of which is undefined behavior. You'd want to make a (possibly uninitialized or null-initialized) pointer of type **u8 and pass its address (which itself will be a valid pointer) to the function.


You probably want to do something like this:

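use std::os::raw::c_int;

extern "C" {
  // Placeholder C function that stores newly allocated buffers
  // through its two out-parameters.
  fn initialize_things(msa_seq: *mut *mut u8, cons_cov: *mut *mut c_int);
}

fn main() {
  let mut msa_seq: *mut u8 = std::ptr::null_mut();
  let mut cons_cov: *mut c_int = std::ptr::null_mut();

  unsafe {
    // Pass the addresses of the local pointers so the callee can overwrite them.
    initialize_things(&mut msa_seq, &mut cons_cov);
  }
}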

Probably more like this

  let mut msa_seq: *mut *mut u8 = std::ptr::null_mut();
  let mut cons_cov: *mut *mut c_int = std::ptr::null_mut();

  abpoa_msa( ......, &mut cons_cov, ... &mut msa_seq, ...)

based on the information provided earlier in this thread.
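
For completeness, a runnable sketch of the same out-parameter pattern with one level of indirection, including how the result is read back. fill_lengths is a stand-in written in Rust and is not part of abPOA. The sketch frees the buffer from Rust only because the stand-in allocated it from Rust; memory allocated by a real C library would have to be released by the C side, for example with free or whatever cleanup function the library provides:

```rust
use std::os::raw::c_int;
use std::ptr;
use std::slice;

/// Stand-in for a C function that allocates an array and returns it
/// through out-parameters, like `void f(int **out, int *n_out)`.
/// Hypothetical: defined in Rust only so this sketch is runnable.
unsafe extern "C" fn fill_lengths(out: *mut *mut c_int, n_out: *mut c_int) {
    let boxed: Box<[c_int]> = vec![4, 3, 5].into_boxed_slice();
    unsafe {
        *n_out = boxed.len() as c_int;
        // Writing through `out` is why the caller must pass a valid
        // address rather than a null pointer.
        *out = Box::into_raw(boxed) as *mut c_int;
    }
}

fn read_lengths() -> Vec<c_int> {
    // One level of indirection less than the parameter type...
    let mut lens: *mut c_int = ptr::null_mut();
    let mut n: c_int = 0;
    unsafe {
        // ...and the address of the local is what gets passed.
        fill_lengths(&mut lens, &mut n);
        assert!(!lens.is_null());
        let result = slice::from_raw_parts(lens, n as usize).to_vec();
        // Release the buffer the same way the stand-in allocated it.
        drop(Box::from_raw(ptr::slice_from_raw_parts_mut(lens, n as usize)));
        result
    }
}

fn main() {
    assert_eq!(read_lengths(), vec![4, 3, 5]);
}
```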


Thanks a lot @H2CO3, @Michael-F-Bryan and @godmar, now everything works!

