How to safely transfer a vector of strings between Rust and C++

Hi,

I've seen some examples already, but I'm quite confused about how to do this. (I've been looking into cxx, but I couldn't find anything I understand.) I imagine this is not easy to do, so if you feel it is not worth your time, please ignore my post.

What I have is a "simple" situation where I build my C++ code using a build script:

fn main() {
    cc::Build::new()
        .file("src/test.cpp")
        .compile("test");
}

In my main.rs I create a vector of strings that I would like to pass as input to a function in test.cpp, process there, and get another vector of strings back in Rust. All of this will be done in a loop, so I imagine freeing memory is required.

main.rs

extern {
    fn test_strvec(sv: Vec<String>)->Vec<String>;
}


fn main(){

  let v = vec!["aa".to_string(), "bb".to_string(), "cc".to_string()];

  let b = unsafe { test_strvec(v) };

  println!("{:?}", b);

 
}

in test.cpp

#include <cstdio>
#include <string>
#include <vector>

using namespace std;

vector<string> test_strvec(vector<string> sv) {

   // print incoming vector

   // make a new vector of strings
    string a = "aaa";
    string b = "bbb";
    vector<string> vec;
    vec.push_back(a);
    vec.push_back(b);
    // ????????????? how to return vec back to Rust?
    //
}

Let's say I want to add two new random strings to a new vector and send it back to Rust. I apologize for the sparse, incomplete, and incorrect example, but it just reflects my ignorance of the subject!

Moreover, if passing strings is not smart, a 2D vector of char/u8 is also an option, but given that I have no clue how to do either, I cannot say what to do... :(

Thank you

A

A Rust Vec is not a C++ vector.

A Rust String is not a C++ string.

The two languages may be using different allocators, so in general, you cannot mix collection contents, and if you send ownership across the FFI boundary, you need a way to send it back for dropping (or leak memory).

So, it's going to be a lot more involved than your sketch, even with a supporting library like cxx.

There's a cxx example of turning a (reference to a) C++ vector<string> into a Rust Vec<String>. Note how everything is copied over to new allocations.
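
To make the ownership point concrete, here is a minimal sketch in Rust (not taken from cxx). The names make_greeting, free_greeting, and greet are hypothetical, and the "foreign" side is played by Rust functions with a C ABI so that the example runs on its own; a real library would also export them, for example with #[no_mangle]. The idea is that whoever allocates a string must also be the one to free it, and the other side only copies out of it.

```rust
use std::ffi::{c_char, CStr, CString};

/// Hypothetical producer: allocates a NUL-terminated string and hands
/// ownership to the caller, who must return it via `free_greeting`.
unsafe extern "C" fn make_greeting(name: *const c_char) -> *mut c_char {
    // SAFETY: the caller promises `name` points to a valid NUL-terminated string.
    let name = unsafe { CStr::from_ptr(name) }.to_string_lossy();
    let greeting = format!("hello, {}", name);
    // `into_raw` gives up ownership without freeing the allocation.
    CString::new(greeting).expect("no interior NUL").into_raw()
}

/// Hypothetical counterpart: takes ownership back so that the allocator
/// that created the string is also the one that frees it.
unsafe extern "C" fn free_greeting(ptr: *mut c_char) {
    if !ptr.is_null() {
        // SAFETY: `ptr` came from `CString::into_raw` and is not used afterwards.
        drop(unsafe { CString::from_raw(ptr) });
    }
}

/// Consumer side: copy the data into a new allocation, then give the
/// original pointer back instead of freeing it directly.
fn greet(name: &str) -> String {
    let c_name = CString::new(name).expect("name must not contain NUL");
    unsafe {
        let raw = make_greeting(c_name.as_ptr());
        let copy = CStr::from_ptr(raw).to_string_lossy().into_owned();
        free_greeting(raw);
        copy
    }
}

fn main() {
    println!("{}", greet("rust"));
}
```

Note that the consumer never frees the pointer itself; it copies the bytes and passes the pointer back to the side that allocated it.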


Isn't cxx's rust::Vec<T> specifically designed for this use case?


Hi,

After a day of trying to figure out how to use cxx, I finally gave up and tried another approach. From my practical perspective, a vec of strings is basically a single vec in which the strings are separated by a special i8 value (10, i.e. a newline). So I decided to pass a single vector to the C++ lib, and since I expect the same kind of output back, I simply reused the strategy. Here is what I've done.
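
The packing idea described above could be sketched on the Rust side roughly as follows. The helper names pack and unpack are hypothetical and are not part of the code in this post; the sketch assumes that no string contains a newline itself, otherwise the boundaries are lost.

```rust
/// Join the strings into one contiguous byte buffer, separated by b'\n' (10).
/// Assumption: none of the strings contains a newline.
fn pack(strings: &[&str]) -> Vec<u8> {
    strings.join("\n").into_bytes()
}

/// Split a newline-separated buffer back into owned strings.
/// An empty buffer is treated as "no strings" (so a single empty
/// string cannot be represented with this simple scheme).
fn unpack(buf: &[u8]) -> Vec<String> {
    if buf.is_empty() {
        return Vec::new();
    }
    buf.split(|&b| b == b'\n')
        .map(|part| String::from_utf8_lossy(part).into_owned())
        .collect()
}

fn main() {
    let packed = pack(&["aa", "bb", "cc"]);
    // `packed.as_ptr()` and `packed.len()` are what would cross the FFI boundary.
    println!("{:?}", packed);
    println!("{:?}", unpack(&packed));
}
```

A length-prefixed layout would avoid the restriction on newlines, at the cost of slightly more bookkeeping on both sides.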

#Cargo.toml

[package]
name = "cvec"
version = "0.1.0"
edition = "2021"
build = "src/builder.rs"


[dependencies]
libc = "0.2"

[build-dependencies]
cc = "1.0"

Then the build script:

# src/builder.rs

extern crate cc;

fn main() {

    cc::Build::new()
        .file("src/test.cpp")
        .cpp_link_stdlib("stdc++")
        .compile("libtest.a");

}

# src/main.rs
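extern crate libc;

use libc::{c_int, c_char};


extern "C" {
    // `full` is only read; `empty` is written by the C++ side, so it must be a *mut pointer
    fn sum(full: *const c_char, empty: *mut c_char, len: c_int) -> c_int;
}

fn main() {
    let mut i = 0;
    while i < 10 {
        // c_char is i8 on some targets and u8 on others, so cast to c_char rather than i8
        let vec_full  = vec![b'a' as c_char, b'b' as c_char, b'c' as c_char, b'd' as c_char]; // i know i can do this smarter
        let mut vec_empty = vec![b'x' as c_char; 4];

        // both buffers must hold at least `len` elements
        let output = unsafe { sum(vec_full.as_ptr(), vec_empty.as_mut_ptr(), vec_full.len() as c_int) };
        println!("vec_full => {:?}  vec_empty => {:?} : size {}", vec_full, vec_empty, output);
        i += 1;
    }
}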


#  src/test.cpp

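#include "test.h"

extern "C" {

  // Copies `len` bytes from `full` into the caller-allocated buffer `empty`
  // and returns the sum of the indices 0..len-1 (just a dummy result).
  int sum (const char *full, char *empty, int len) {
    int sum = 0;
    for (int i = 0; i < len; ++i) {
      sum += i;
      empty[i] = full[i];
    }
    return sum;
  }

}

And finally the header:

# src/test.h

#include <vector>

extern "C" {
  int sum(const char *full, char *empty, int len);
}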

So this works, and I have seen no memory leaks, but my question now is: how safe, portable, and well designed is this solution? Any comments are more than welcome!

A

PS
The downside now is that I need to write every return vector into memory allocated by Rust. So if a more elaborate computation gives me a complex data structure, I need to extract and reformat the data in order to write it into the final return ("empty") vector :(

This is generally considered the best practice: expose a (pointer, length) pair to the C/C++ code for both inputs and outputs, and let the caller allocate the vector as necessary. The memory slice is effectively the lingua franca of all contiguous sequences in low-level languages.
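
As an illustration of that pattern, here is a minimal sketch in which the caller allocates the output and the callee only fills it. The function to_upper stands in for the C++ side and is written in Rust with a C ABI so the example is self-contained; both it and the wrapper to_upper_safe are hypothetical names, not part of the code posted above. Passing the output capacity explicitly lets the callee refuse to write past the end of the buffer.

```rust
/// Stand-in for the foreign function (hypothetical): reads `len` bytes from
/// `input`, writes their ASCII-uppercased form into `output`, and returns the
/// number of bytes written, or -1 if `out_cap` is too small.
unsafe extern "C" fn to_upper(
    input: *const u8,
    len: usize,
    output: *mut u8,
    out_cap: usize,
) -> isize {
    if out_cap < len {
        return -1;
    }
    // SAFETY: the caller guarantees both (pointer, length) pairs are valid
    // and that the two buffers do not overlap.
    let src = unsafe { std::slice::from_raw_parts(input, len) };
    let dst = unsafe { std::slice::from_raw_parts_mut(output, len) };
    for (d, s) in dst.iter_mut().zip(src) {
        *d = s.to_ascii_uppercase();
    }
    len as isize
}

/// Safe wrapper: the caller allocates the output buffer, so Rust's allocator
/// owns (and later frees) all memory involved.
fn to_upper_safe(input: &[u8]) -> Option<Vec<u8>> {
    let mut output = vec![0u8; input.len()];
    let written = unsafe {
        to_upper(input.as_ptr(), input.len(), output.as_mut_ptr(), output.len())
    };
    if written < 0 {
        return None;
    }
    output.truncate(written as usize);
    Some(output)
}

fn main() {
    let result = to_upper_safe(b"aa\nbb\ncc").expect("buffer large enough");
    println!("{}", String::from_utf8_lossy(&result));
}
```

Keeping the unsafe call inside one small wrapper means the rest of the program only ever sees slices and vectors.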
