C FFI - avoid final string copy


#1

I am working with the C FFI, and my use case is to provide an interface similar to this:

int decode_base64(const unsigned char *src, size_t len, unsigned char *dst, size_t *dlen);

I’ve chosen {de,en}code_base64 as a simple example to showcase my question.

extern crate libc;
extern crate rustc_serialize;
use rustc_serialize::base64::{CharacterSet, Config, Newline, FromBase64, ToBase64};
use libc::size_t;
use std::slice;
use std::str;

static B64CONFIG: Config = Config {
  char_set: CharacterSet::UrlSafe,
  newline: Newline::LF,
  pad: false,
  line_length: None,
};

#[no_mangle]
pub extern fn decode_base64(subj_p: *const u8, subj_len: size_t, result_p: *mut u8, result_len_p: *mut size_t) {
  let subj_slice = unsafe {
    assert!(!subj_p.is_null());
    slice::from_raw_parts(subj_p, subj_len as usize)
  };
  let subj = str::from_utf8(subj_slice).unwrap();

  let result_slice = unsafe {
    assert!(!result_p.is_null());
    slice::from_raw_parts_mut(result_p, 99999 as usize)
  };

  let result_len_slice = unsafe {
    assert!(!result_len_p.is_null());
    slice::from_raw_parts_mut(result_len_p, 1 as usize)
  };

  let result = subj.from_base64().unwrap();
  let result_len = result.len();
  result_len_slice[0] = result_len;
  result_slice[..result_len].copy_from_slice(&result);
}

Specifically, is there a way to avoid result_slice[..result_len].copy_from_slice(&result);? It feels wasteful to allocate result only to immediately copy it somewhere. If possible, I’d like to write subj.from_base64().unwrap() directly into the result_slice.

Also, being a beginner at Rust, feel free to critique my coding style or to point out places that can be improved upon. Thanks!


#2

Unfortunately, rustc_serialize doesn’t provide the API you’re looking for, and I don’t know of any crates that do. IMO, the best way to do this would be to write a Read adapter that decodes a base64 stream.
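Short of a full Read adapter, the same idea can be sketched as a function that decodes straight into a caller-supplied slice, so no intermediate Vec is allocated. This is only a hand-rolled illustration, not part of rustc_serialize: the names DecodeError, sextet and decode_into are made up here, and it assumes the unpadded URL-safe alphabet from your B64CONFIG. It does not reject non-canonical trailing bits.

```rust
/// Errors the sketch reports instead of panicking (hypothetical type).
#[derive(Debug, PartialEq)]
pub enum DecodeError {
    InvalidByte(usize), // index of a byte outside the URL-safe alphabet
    InvalidLength,      // a lone trailing character cannot encode a byte
    OutputTooSmall,     // `dst` cannot hold the decoded bytes
}

/// Map one URL-safe base64 character to its 6-bit value.
fn sextet(c: u8) -> Option<u32> {
    match c {
        b'A'..=b'Z' => Some((c - b'A') as u32),
        b'a'..=b'z' => Some((c - b'a') as u32 + 26),
        b'0'..=b'9' => Some((c - b'0') as u32 + 52),
        b'-' => Some(62),
        b'_' => Some(63),
        _ => None,
    }
}

/// Decode unpadded URL-safe base64 from `src` directly into `dst`.
/// Returns the number of bytes written.
pub fn decode_into(src: &[u8], dst: &mut [u8]) -> Result<usize, DecodeError> {
    if src.len() % 4 == 1 {
        return Err(DecodeError::InvalidLength);
    }
    // Bytes produced by a short final group: 2 chars -> 1 byte, 3 chars -> 2 bytes.
    let tail = match src.len() % 4 {
        2 => 1,
        3 => 2,
        _ => 0,
    };
    let needed = src.len() / 4 * 3 + tail;
    if dst.len() < needed {
        return Err(DecodeError::OutputTooSmall);
    }
    let mut written = 0;
    for (chunk_idx, chunk) in src.chunks(4).enumerate() {
        let mut acc: u32 = 0;
        for (i, &c) in chunk.iter().enumerate() {
            let v = sextet(c).ok_or(DecodeError::InvalidByte(chunk_idx * 4 + i))?;
            acc = (acc << 6) | v;
        }
        // Left-align a short final chunk so byte extraction is uniform.
        acc <<= 6 * (4 - chunk.len()) as u32;
        for i in 0..chunk.len() - 1 {
            dst[written] = (acc >> (16 - 8 * i)) as u8;
            written += 1;
        }
    }
    Ok(written)
}
```

On the FFI side, `dst` would be the slice built from the C buffer, so the decoded bytes land there without a second copy.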

As for your code, FYI, panicking across an FFI boundary is undefined behavior. You should consider returning an error instead.

Edit: Also, slice::from_raw_parts_mut(result_p, 99999 as usize) is dangerous. Slices should only ever include valid memory. Additionally, slice::from_raw_parts_mut(result_len_p, 1 as usize) is unnecessary; you can just call result_len_p.as_mut().
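To make those points concrete, here is a rough sketch of a wrapper that returns a status code instead of panicking, uses as_mut() for the length pointer, and only builds a slice as large as the caller says the buffer is. Everything here is an assumption for illustration: the status constants, the name decode_checked, the convention that *dst_len carries the buffer capacity on entry, and the decode() placeholder, which stands in for the real from_base64() call.

```rust
use std::slice;

// Hypothetical status codes; the C header would need matching definitions.
pub const OK: i32 = 0;
pub const ERR_NULL: i32 = -1;
pub const ERR_INVALID_INPUT: i32 = -2;
pub const ERR_BUFFER_TOO_SMALL: i32 = -3;

/// Placeholder for the real decoder (e.g. `from_base64()`). It only rejects
/// non-ASCII input so the sketch has a failure path to report.
fn decode(input: &[u8]) -> Result<Vec<u8>, ()> {
    if input.is_ascii() {
        Ok(input.to_vec())
    } else {
        Err(())
    }
}

/// Assumed convention: on entry `*dst_len` is the capacity of `dst`;
/// on success it is overwritten with the number of bytes written.
/// Safety: `src` must point to `src_len` readable bytes and `dst` to
/// `*dst_len` writable bytes. Add `#[no_mangle]` when exporting to C.
pub unsafe extern "C" fn decode_checked(
    src: *const u8,
    src_len: usize,
    dst: *mut u8,
    dst_len: *mut usize,
) -> i32 {
    if src.is_null() || dst.is_null() {
        return ERR_NULL;
    }
    // `as_mut` turns the raw pointer into Option<&mut usize>.
    let dst_len = match unsafe { dst_len.as_mut() } {
        Some(len) => len,
        None => return ERR_NULL,
    };
    let input = unsafe { slice::from_raw_parts(src, src_len) };
    let decoded = match decode(input) {
        Ok(bytes) => bytes,
        Err(()) => return ERR_INVALID_INPUT,
    };
    if decoded.len() > *dst_len {
        return ERR_BUFFER_TOO_SMALL;
    }
    // The slice covers exactly the capacity the caller promised.
    let out = unsafe { slice::from_raw_parts_mut(dst, *dst_len) };
    out[..decoded.len()].copy_from_slice(&decoded);
    *dst_len = decoded.len();
    OK
}
```

Passing the capacity in means the Rust side never has to guess how much memory behind `dst` is valid.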


#3

Thank you. I’ll modify the code to replace unwrap() as you suggested.

The 99999 is hard-coded because the C side allocates a u8 array of that exact size (once, and then reuses it on subsequent calls). Why do you consider this dangerous?

Lastly, using .as_mut() results in this error:

  let result_len_slice = result_len_p.as_mut().unwrap();
  result_len_slice[0] = result_len;



error: cannot index a value of type `&mut usize`
*result_len_slice[0] = result_len;

#4

assert! can also panic.
Note that by default assert! runs in both debug and release builds.
If you would like to assert only in debug builds, there is debug_assert! (though you should not assert in those functions anyway).
If you do want to panic inside that code, you can still catch it with catch_unwind.
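For example, a minimal sketch of the catch_unwind approach might look like the following. The names risky_decode and decode_no_unwind and the return codes are made up for illustration, and this only works when the crate is built with the default panic=unwind strategy.

```rust
use std::panic;
use std::slice;

/// Hypothetical inner function that may panic, e.g. via unwrap() or assert!.
fn risky_decode(input: &[u8]) -> usize {
    assert!(!input.is_empty(), "empty input");
    input.len()
}

/// Returns 0 on success, -1 for null pointers, and -2 if the inner code
/// panicked, so the unwind never reaches the C caller.
/// `out_len` is only written on success.
/// Safety: `src` must point to `len` readable bytes.
pub unsafe extern "C" fn decode_no_unwind(
    src: *const u8,
    len: usize,
    out_len: *mut usize,
) -> i32 {
    if src.is_null() || out_len.is_null() {
        return -1;
    }
    // Any panic raised inside the closure is caught and turned into Err.
    let result = panic::catch_unwind(|| {
        let input = unsafe { slice::from_raw_parts(src, len) };
        risky_decode(input)
    });
    match result {
        Ok(n) => {
            unsafe { *out_len = n };
            0
        }
        Err(_) => -2,
    }
}
```

The panic message is still printed to stderr by the default panic hook; catch_unwind only stops the unwinding.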


#5

Sorry, I didn’t realize this was hard coded. Your C code allocates a 100KB array?

as_mut() returns a reference, not a slice. You can write to it like so:

let result_len = unsafe {
    // No trailing semicolon: the block must evaluate to the &mut usize.
    result_len_p.as_mut().unwrap() // Don't actually unwrap.
};
// ...
*result_len = result.len();