[SOLVED] FFI with align - Segfault


#1

Hi guys,

I am writing a wrapper using FFI, and I get a segmentation fault when the code on the client side or in the library changes slightly (and sometimes depending on the build mode [release|debug]). I have had this problem for a week now and have searched the Internet repeatedly, without success so far. I hope I am posting my question in the right place. If not, please tell me where to post it. Thanks :slight_smile:

Context

I’m writing a raytracing program in Rust. I have written several such programs in the past, but I think Rust’s memory and thread safety is a neat feature for this type of application. In raytracing, we compute a lot of intersection points between a ray and a 3D model (triangle soup). Good performance for intersection testing is crucial in raytracing. This is why I started writing a wrapper for Intel’s Embree library (a C-style library).

Note that I’m not the first to attempt an Embree wrapper for Rust. To my knowledge, two other wrappers exist on GitHub:

However, when I tested these two libraries, I got a segmentation fault. For example, the Twinklebear wrapper provides an example “triangle_geometry” that works in release mode but not in debug mode (I get an error similar to the one in my wrapper).

This is why I started my own Embree wrapper:

Issue

Depending on some code changes (inside the library or the client code), the library segfaults when calling the triangle intersection procedure (rtcIntersect in Embree), which corresponds to the function “Scene.intersect” in scene.rs:

pub fn intersect(&self, mut ray: Ray) -> Option<Intersection> {
    unsafe { rtcIntersect(self.ptr, &mut ray) };
    Intersection::from_ray(&self, ray)
}

Sometimes I get the segmentation fault only in release mode or only in debug mode, sometimes in both.

My attempts

I have tried several ways to find the problem, without success so far. Here is what I have tried:

  • Disabled multithreading (which should not be an issue anyway, as Embree supports multithreaded calls for intersection testing)
  • Checked that no error occurred while building the acceleration data structure (using the Embree error-checking function right after rtcCommit)
  • Checked that the Embree device and scene have not been dropped before I do the intersection test

What I am suspecting

I suspect that this is somehow a compilation-related issue, as different compilation settings sometimes lead to different behavior. For example:

  • Call rearrangement
  • Memory layout rearrangement

What I am seeking for

Has anybody encountered this behavior before?
Is there a way to link against the debug version of the library with Rust? (That would make it easier to find the problem by inspecting memory.)

Configuration tested

Arch Linux with rustup. Tested on nightly and stable. Embree version 2.17.3-2 (from pacman).


#2

I’d guess your Ray struct doesn’t have the same layout as what Embree expects. What’s the definition/layout of that type in Embree?


#3

Thanks for your question, you may have pointed me in the right direction!
So the Embree Ray struct is the following:

struct RTCORE_ALIGN(16)  RTCRay
{
  /* ray data */
public:
  float org[3];      //!< Ray origin
  float align0;
  
  float dir[3];      //!< Ray direction
  float align1;
  
  float tnear;       //!< Start of ray segment
  float tfar;        //!< End of ray segment (set to hit distance)

  float time;        //!< Time of this ray for motion blur
  unsigned mask;        //!< Used to mask out objects during traversal
  
  /* hit data */
public:
  float Ng[3];       //!< Unnormalized geometry normal
  float align2;
  
  float u;           //!< Barycentric u coordinate of hit
  float v;           //!< Barycentric v coordinate of hit

  unsigned geomID;        //!< geometry ID
  unsigned primID;        //!< primitive ID
  unsigned instID;        //!< instance ID
};

My original Rust code:

/// Structure for embree to represent a ray
#[repr(C)]
pub struct Ray {
    /// Ray origin
    pub org: [f32; 3usize],
    /// Memory align
    align0: f32,

    /// Ray direction
    pub dir: [f32; 3usize],
    /// Memory align
    align1: f32,

    /// Start of ray segment
    pub tnear: f32,
    /// End of ray segment
    pub tfar: f32,

    /// Time of this ray for motion blur
    pub time: f32,
    /// Used to mask objects during ray traversal
    pub mask: u32,

    /// Unnormalized geometric normal
    pub n_g: [f32; 3usize],
    /// Memory align
    align2: f32,

    /// Barycentric u coordinate at hit
    pub u: f32,
    /// Barycentric v coordinate at hit
    pub v: f32,

    /// Geometry ID
    pub geom_id: u32,
    /// Primitive ID
    pub prim_id: u32,
    /// Instance ID
    pub inst_id: u32,

    /// Padding
    _padding: [u32; 3usize],
}

I believe the problem comes from the alignment. So I added these tests:

#[test]
fn align_Ray() {
    assert_eq!(::std::mem::align_of::<Ray>(), 16usize);
}

#[test]
fn memsize_Ray() {
    assert_eq!(::std::mem::size_of::<Ray>(), 96usize, concat!(
               "Size of: ", stringify!(Ray)));
}

and I found that the alignment of my Rust structure is only 4 :frowning:
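
An alignment of 4 is what a plain #[repr(C)] struct gives here: its alignment is the largest alignment among its fields, and f32 and u32 are 4-byte aligned on common targets. Extra padding fields change the size but not the alignment. A minimal sketch, using a hypothetical stand-in struct (Plain is not part of the wrapper):

```rust
use std::mem::{align_of, size_of};

// Hypothetical stand-in with the same kinds of fields as Ray:
// only f32 and u32, so no field asks for more than 4-byte alignment.
#[allow(dead_code)]
#[repr(C)]
struct Plain {
    org: [f32; 3],
    align0: f32,
    mask: u32,
}

/// Returns (size, alignment) of `Plain` in bytes.
fn plain_layout() -> (usize, usize) {
    (size_of::<Plain>(), align_of::<Plain>())
}

fn main() {
    // An array has the alignment of its element type.
    assert_eq!(align_of::<[f32; 3]>(), align_of::<f32>());
    // The struct takes the maximum field alignment: 4, not 16.
    assert_eq!(plain_layout(), (20, 4));
    println!("Plain: size and alignment = {:?}", plain_layout());
}
```

So a test like align_Ray above fails no matter how many trailing padding words are added.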

So what is the correct workaround for this problem? I have tried several approaches:

  • Using _alignment: [u128; 0] with the u128 type from the extprim crate: this only aligns to 8. Do I need nightly to get the true u128 type?
  • Using repr-align with #[repr(C, align = "16")], but I get an error saying that the attribute is experimental.

#4

repr(align(x)) is the only real way to do it, AFAIK. I think it should become stable soon, but you’ll need to use nightly to get it now. So if you add #[repr(C, align(16))] to Ray, you’ll get 16 byte alignment. Try that to at least see if that fixes the segfaults. Don’t forget to remove the _padding field.
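
A sketch of what that could look like, assuming the field list from the Rust struct posted above and a compiler where repr(align) is available. The layout helpers (ray_layout, field_offsets) are hypothetical names added only for checking; the expected offsets follow from the C definition quoted earlier, where every member is 4 bytes wide:

```rust
use std::mem::{align_of, size_of};

/// Mirror of the RTCRay definition quoted above, without the manual
/// `_padding` field: align(16) rounds the size up from 84 to 96 bytes.
#[allow(dead_code)]
#[repr(C, align(16))]
#[derive(Clone, Copy, Debug, Default)]
pub struct Ray {
    pub org: [f32; 3],
    align0: f32,
    pub dir: [f32; 3],
    align1: f32,
    pub tnear: f32,
    pub tfar: f32,
    pub time: f32,
    pub mask: u32,
    pub n_g: [f32; 3],
    align2: f32,
    pub u: f32,
    pub v: f32,
    pub geom_id: u32,
    pub prim_id: u32,
    pub inst_id: u32,
}

/// Returns (size, alignment) of `Ray` in bytes.
fn ray_layout() -> (usize, usize) {
    (size_of::<Ray>(), align_of::<Ray>())
}

/// Byte offsets of dir, tnear, n_g, u and inst_id from the start of a Ray.
fn field_offsets() -> [usize; 5] {
    let ray = Ray::default();
    let base = &ray as *const Ray as usize;
    [
        &ray.dir as *const _ as usize - base,
        &ray.tnear as *const _ as usize - base,
        &ray.n_g as *const _ as usize - base,
        &ray.u as *const _ as usize - base,
        &ray.inst_id as *const _ as usize - base,
    ]
}

fn main() {
    assert_eq!(ray_layout(), (96, 16));
    assert_eq!(field_offsets(), [16, 32, 48, 64, 80]);
    // A Ray on the stack now starts at a 16-byte boundary.
    let ray = Ray::default();
    assert_eq!(&ray as *const Ray as usize % 16, 0);
    println!("Ray layout (size, align) = {:?}", ray_layout());
}
```

The same size and alignment assertions can be kept as unit tests in the wrapper so that a layout mismatch shows up at test time rather than as a segfault.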


#5

If everything goes well, repr(align(x)) will be stabilized on Thursday :slight_smile:


#6

Thanks guys, using repr(align(16)) solved the segmentation fault. I am marking this post as solved and editing the title a bit :slight_smile: