Raw pointer contains no data when running in release

A function generated by bindgen from a C API gives me a raw pointer to data and a return status. In a debug build the data is there and the return status is fine. In a release build there is no data (for example, std::ptr::slice_from_raw_parts() yields a slice of length zero), yet the return status is still fine.

I know that this is not much info, but before I elaborate, I thought I'd ask whether anybody has an idea what to look for.

This runs in a loop. One runtime difference between debug and release is that debug runs at about 10 fps and release at about 30 fps. But the original C program has no problem receiving the data at 30 fps, so this is probably not the cause.

If optimizations change the result like that, your C library or your FFI glue code may contain Undefined Behavior, and there's an actual safety bug in there.

  • Add more assertions and consistency checks in your Rust and C code (make sure all pointers are checked for NULL before they're used, lengths are never negative or too large)

  • Try running the program under Valgrind.

  • Try running with Address Sanitizer.
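
As an illustration of the first point, here is a minimal sketch of a consistency check on a (pointer, length) pair coming back over FFI. The helper name checked_bytes and the max_len parameter are made up for this example and are not part of any real API; the caller still has to guarantee that the pointer really refers to that many readable bytes.

```rust
use std::slice;

/// Hypothetical helper (not part of any library): turns a (pointer, length)
/// pair received over FFI into a slice, refusing anything that looks wrong.
///
/// # Safety
/// If `ptr` is non-null, it must point to at least `len` readable bytes that
/// stay valid and unmodified for the lifetime `'a`.
pub unsafe fn checked_bytes<'a>(ptr: *const u8, len: u32, max_len: usize) -> Option<&'a [u8]> {
    // Reject lengths that do not fit in usize on this target.
    let len = usize::try_from(len).ok()?;
    // Reject NULL pointers, empty buffers and implausibly large lengths.
    if ptr.is_null() || len == 0 || len > max_len {
        return None;
    }
    // Only now is it reasonable to build the slice.
    Some(unsafe { slice::from_raw_parts(ptr, len) })
}
```

Returning None instead of an empty slice makes the "no data" case impossible to overlook at the call site.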

Thanks!
So I now check all pointers for NULL and the data lengths. Only the one data length I mentioned above becomes zero in release.

Next I checked with Address Sanitizer. It gives a few memory leaks on the order of a few tens of bytes, happening in the C code, which is a (closed source) precompiled shared library, so I cannot check it. Anyway, I assume these tiny memory leaks have nothing to do with the problem?

But now comes the FUN FACT. To run Address Sanitizer, I have to use Rust nightly, and with nightly it runs in release without problems! What on earth could that mean? :smile:

If there actually is undefined behaviour (UB) in your code, then this is to be expected. UB means that the compiler makes no guarantees about what the compiled program does, so the result can easily change from one compiler version to the next.

This is why UB is so hard to handle, because its results are very unpredictable.

In my opinion this is the single best feature of Rust compared to C/C++. As long as you stay in safe land, you do not have to worry about UB.

Maybe try an older nightly version and see if it fails again? For example, one from the day before the release of the stable version in which it failed?

There's a slim chance that you've run into a miscompilation bug that has been fixed. You can bisect nightlies to find the exact change that made your program run as expected, and see whether that has been a relevant change: Installation - cargo-bisect-rustc

But also it's still quite likely that your code has UB. When the behavior is sensitive to optimizations, it's very likely that the code relies on something that it's not allowed to rely on. Whether it happens to work or not is a lottery, since small differences in optimizations can hide or expose the bugs (e.g. without inlining functions the optimizer may not see enough code to notice it's doing something forbidden, and inlining decisions vary based on heuristics).

I agree that it is probably UB, but after staring at this stuff for days, I'm running out of ideas. So please let me elaborate.

I'm testing out a depth/RGB camera (here is the API on GitHub in C).

All the types/constants/functions prefixed with sc or Sc come directly from the bindgen I created from the API and are black boxes since the API libraries are closed source. There are two relevant functions with the following signatures

pub unsafe fn scGetFrameReady(device: ScDeviceHandle, waitTime: u16, pFrameReady: *mut ScFrameReady) -> ScStatus

pub unsafe fn scGetFrame(device: ScDeviceHandle, frameType: ScFrameType, pScFrame: *mut ScFrame) -> ScStatus

ScDeviceHandle: *mut ::std::os::raw::c_void is a pointer to the device, which I wrap into a struct

pub struct Device {
    pub handle: ScDeviceHandle,
}

and check that it is not NULL before returning it. It's used in many other functions, for example to start up or shut down the camera. So I don't think this is the issue.

ScFrameReady is the following struct containing a buffer storing the signal for the frame availability.

pub struct ScFrameReady {
    pub _bitfield_align_1: [u8; 0],
    pub _bitfield_1: __BindgenBitfieldUnit<[u8; 4]>,
}

ScFrameType = ::std::os::raw::c_uint is an integer deciding the type of frame to be retrieved (depth, rgb,...).

ScFrame is the following struct, containing a pointer pFrameData to the data

pub struct ScFrame {
    pub frameIndex: u32,
    pub frameType: u32,
    pub pixelFormat: u32,
    pub pFrameData: *mut u8,
    pub dataLen: u32,
    /* … */
}

So I create the following two wrapper functions

pub fn read_next_frame(device: &Device, max_wait_time_ms: u16, frame_ready: &mut ScFrameReady) {
    unsafe {
        scGetFrameReady(device.handle, max_wait_time_ms, frame_ready);
    }
}

pub fn get_frame(
    device: &Device,
    frame_ready: &ScFrameReady,
    frame_type: &FrameType,
    frame: &mut ScFrame,
) {
    let mut ft: Option<ScFrameType> = None;
    match frame_type {
        FrameType::Depth => {
            if frame_ready.depth() == 1 {
                ft = Some(ScFrameType_SC_DEPTH_FRAME);
            }
        }
        FrameType::IR => {
            if frame_ready.ir() == 1 {
                ft = Some(ScFrameType_SC_IR_FRAME);
            }
        }
        FrameType::RGB => {
            if frame_ready.color() == 1 {
                ft = Some(ScFrameType_SC_COLOR_FRAME);
            }
        }
        FrameType::TransformedRGB => {
            if frame_ready.transformedColor() == 1 {
                ft = Some(ScFrameType_SC_TRANSFORM_COLOR_IMG_TO_DEPTH_SENSOR_FRAME);
            }
        }
    }
    if let Some(ft) = ft {
        unsafe {
            let status = scGetFrame(device.handle, ft, frame);
            if status != ScStatus_SC_OK {
                panic!("get_frame failed with status {}", status);
            }
        }
        if frame.pFrameData.is_null() {
            panic!("frame pointer is NULL!");
        }
    }
}

and then use these functions in the main loop of the program, for example, to receive a depth frame:

let frame_ready = &mut FrameReady::default();
let frame = &mut Frame::default();

loop {
    read_next_frame(&device, 1200, frame_ready);
    get_frame(&device, frame_ready, &FrameType::Depth, frame);
    // do something with the frame data...
}

This works fine for all FrameTypes except for FrameType::TransformedRGB in release mode, where frame.dataLen is zero and there is no data in frame.pFrameData (that's the only weird thing happening).
Nothing else is out of the ordinary: frame_ready.transformedColor() == 1 is true, saying that "yes, there is a frame ready for that type", the return status of scGetFrame() is ok too, and the frame.pFrameData pointer is not NULL.

Sorry for the lengthy description, but is there anything wrong with my wrapping code that could create UB?

Here is a similar problem but much more compact. Maybe somebody can give a hint what's going on here?

I try to print out the firmware version, the C code looks like this

const int BufLen = 64;
char fw[BufLen] = { 0 };
scGetFirmwareVersion(g_DeviceHandle, fw, BufLen);
cout << "fw  ==  " << fw << endl;

So in Rust I do this

let firmware = &mut 0;
status = scGetFirmwareVersion(device.handle, firmware, 64); // status is ok

println!("{:?}",
    CStr::from_ptr(
        slice_from_raw_parts(firmware, 64)
        .as_ref()
        .unwrap()
        .as_ptr()
    )
);

The signature of the bindgen generated function is

pub unsafe fn scGetFirmwareVersion(device: ScDeviceHandle, pFirmwareVersion: *mut ::std::os::raw::c_char, length: i32) -> ScStatus

In debug the code prints the correct string
"NYX650_R_20241203_B26"

But in release I get random stuff like
"NYX6\\\xe8?:\xc3\x7f"
"NYX6\\\xe8\x9f\x04\x86\x7f"
"NYX6\\\xe8\xbf\xe6~\x7f"

Why?

pFirmwareVersion is most likely expected to point at a [c_char; length], that is, a buffer of length elements, not at some random place. Here &mut 0 points at a single integer, so the library writes past the end of it. This code is UB and could very well corrupt some unrelated memory, or simply crash when trying to write to memory that was mapped as read-only.

Thanks! But what does that mean? How should the code be modified if the function expects a pFirmwareVersion: *mut ::std::os::raw::c_char?

Well, if they are expecting the fixed-size buffer, you can simply create it yourself:

let firmware = [0; 64];
status = scGetFirmwareVersion(device.handle, firmware.as_mut_ptr(), 64);

println!("{:?}", CStr::from_bytes_until_nul(&firmware));

(untested)

Thanks a lot! This works!
I only had to make firmware mutable and to add

CStr::from_bytes_until_nul(transmute(&firmware as &[i8]))

because firmware is [i8; 64].

I will try to see if this technique helps me with the bigger problem above too...

Going back to the main problem. As mentioned above, the struct Frame contains the frame data as a raw pointer

pub struct Frame {
    pub pFrameData: *mut u8,
    pub dataLen: u32,
    /* … */
}

and the function

pub unsafe fn getFrame(device: DeviceHandle, frameType: FrameType, pFrame: *mut Frame) -> ScStatus

expects a raw pointer to a Frame.

So I create a frame

let mut frame = Frame::default();

(Side question: When creating the bindings I use the .derive_default(true) in the bindgen::Builder. Could this be an issue?)

Since there is no .as_mut_ptr() for Frame I then call the function like this

getFrame(handle, frame_type, &mut frame);

even though getFrame expects a *mut Frame. I guess this is the problem? How should it be done instead?

That is the correct thing to do assuming the scGetFrame function does what you expect.
There may be some mistake about what the API does, or the UB may originate somewhere else entirely.
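
To show why passing &mut frame is fine, here is a small sketch with a mock in place of the real FFI call. DemoFrame and mock_get_frame are invented for this example and only stand in for the bindgen types; the point is that a &mut T coerces to *mut T at the call site, so it is equivalent to creating the raw pointer explicitly.

```rust
/// Hypothetical stand-in for a bindgen-generated struct such as ScFrame.
#[derive(Default)]
#[repr(C)]
pub struct DemoFrame {
    pub index: u32,
    pub len: u32,
}

/// Hypothetical stand-in for an extern "C" function taking `*mut DemoFrame`.
///
/// # Safety
/// `frame` must be NULL or point to a valid, writable `DemoFrame`.
pub unsafe fn mock_get_frame(frame: *mut DemoFrame) -> i32 {
    if frame.is_null() {
        return -1;
    }
    unsafe {
        (*frame).index += 1;
        (*frame).len = 16;
    }
    0
}

/// Passes a mutable reference; it coerces to a raw pointer automatically.
pub fn call_via_reference() -> (i32, DemoFrame) {
    let mut frame = DemoFrame::default();
    let status = unsafe { mock_get_frame(&mut frame) };
    (status, frame)
}

/// Creates the raw pointer explicitly; the behaviour is the same.
pub fn call_via_explicit_pointer() -> (i32, DemoFrame) {
    let mut frame = DemoFrame::default();
    let ptr: *mut DemoFrame = &mut frame;
    let status = unsafe { mock_get_frame(ptr) };
    (status, frame)
}
```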

I notice you don't seem to check whether scGetFrameReady succeeded or not - that would be my starting point.
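
A rough sketch of what such a check could look like, with a mock in place of scGetFrameReady. The Status type, the STATUS_OK value of 0, DeviceError and mock_get_frame_ready are all assumptions made for this example; the real constants come from the bindgen output.

```rust
/// Assumption: the status type is a plain integer and 0 means success.
pub type Status = i32;
pub const STATUS_OK: Status = 0;

/// Hypothetical error type recording which call failed and with what status.
#[derive(Debug, PartialEq, Eq)]
pub struct DeviceError {
    pub call: &'static str,
    pub status: Status,
}

/// Converts a C-style status code into a Result.
pub fn check(call: &'static str, status: Status) -> Result<(), DeviceError> {
    if status == STATUS_OK {
        Ok(())
    } else {
        Err(DeviceError { call, status })
    }
}

/// Hypothetical stand-in for the FFI call; it can simulate a timeout.
///
/// # Safety
/// `ready` must point to a valid, writable `u32`.
pub unsafe fn mock_get_frame_ready(simulate_timeout: bool, ready: *mut u32) -> Status {
    if simulate_timeout {
        return -1;
    }
    unsafe { *ready = 1 };
    STATUS_OK
}

/// Wrapper that propagates a failed status instead of silently dropping it.
pub fn read_next_frame(simulate_timeout: bool) -> Result<u32, DeviceError> {
    let mut ready = 0u32;
    let status = unsafe { mock_get_frame_ready(simulate_timeout, &mut ready) };
    check("mock_get_frame_ready", status)?;
    Ok(ready)
}
```

With a Result return type the main loop cannot go on to fetch a frame after a failed readiness query without handling the error first.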

I notice you don't seem to check whether scGetFrameReady succeeded or not - that would be my starting point.

Oh yes, sorry I did not mention it but I also added the status check to scGetFrameReady() and there is no issue. Thanks for spotting that.

This is the thing, the status of both functions getFrameReady() and getFrame() is ok, and the pointer to the new data frame.pFrameData is not null.
Only for one specific frame type the length of the data is just zero. Even though I query if that data exists with frame_ready.transformedColor() which returns 1 (yes).

I even ran the program with Valgrind, but it checks god knows what for 10 minutes at the initialization of the camera and I get a command error after it tries to hook onto the camera stream, probably due to a timeout or something. So I never get to the point where I can receive frames. :confused:

Does the documentation for the function say that it allocates space for new data and puts a pointer to it in pFrameData, or does it populate the existing buffer up to the given dataLen? Either option is possible, but if it's the second, your derived Default implementation would be providing the function with an empty buffer, which scGetFrame dutifully fills with 0 elements and returns successfully.
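
To make the second possibility concrete, here is a sketch with a mock function that has "caller provides the buffer" semantics. MockFrame and mock_fill are invented for this example; whether the real scGetFrame behaves like this is exactly the open question, so treat it purely as an illustration of the failure mode.

```rust
/// Hypothetical stand-in for a struct like ScFrame.
#[repr(C)]
pub struct MockFrame {
    pub data: *mut u8,
    pub len: u32,
}

/// Hypothetical C-style function that fills the caller's buffer with at most
/// `len` bytes and returns how many bytes it wrote.
///
/// # Safety
/// `frame` must be valid, and `data` must point to `len` writable bytes
/// (or `len` must be 0).
pub unsafe fn mock_fill(frame: *mut MockFrame) -> u32 {
    let frame = unsafe { &mut *frame };
    for i in 0..frame.len as usize {
        unsafe { *frame.data.add(i) = 0xAB };
    }
    frame.len
}

/// A defaulted frame (NULL pointer, length 0) "succeeds" with zero bytes.
pub fn fill_with_default() -> u32 {
    let mut frame = MockFrame { data: std::ptr::null_mut(), len: 0 };
    unsafe { mock_fill(&mut frame) }
}

/// A frame that points at a pre-sized buffer actually receives data.
pub fn fill_with_buffer(capacity: usize) -> Vec<u8> {
    let mut buffer = vec![0u8; capacity];
    let mut frame = MockFrame {
        data: buffer.as_mut_ptr(),
        len: u32::try_from(capacity).expect("capacity fits in u32"),
    };
    let written = unsafe { mock_fill(&mut frame) } as usize;
    buffer.truncate(written);
    buffer
}
```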

Why this would be different in debug/release, though, I don't know. There could be some combination of factors at play.

The original C documentation says for the relevant fields in ScFrame

uint8_t*   pFrameData;  //!< A buffer containing the frame’s image data.
uint32_t   dataLen;     //!< The length of pFrame, in bytes.

and for the scGetFrame() function

/**
 * @brief        Returns the image data for the current frame from the device specified by <code>device</code>.
 *               Before invoking this API, invoke scGetFrameReady() to capture one image frame from the device.
 * @param[in]    device       The handle of the device to capture an image frame from.
 * @param[in]    frameType    The image frame type.
 * @param[out]   pScFrame     Pointer to a buffer in which to store the returned image data.
 * @return       ::SC_OK      If the function succeeded, or one of the error values defined by ::ScStatus.
 */
SCEPTER_C_API_EXPORT ScStatus scGetFrame(ScDeviceHandle device, ScFrameType frameType, ScFrame* pScFrame);

Also I reuse the same ScFrame variable to retrieve different types of frame, like depth, IR, color and it always fills the data correctly to which pFrameData is pointing and sets the appropriate dataLen. Only for this one frame type it does not...

Let me state once more (this is a long thread) that this is only happening in RELEASE mode under STABLE and BETA. Everything works fine under NIGHTLY.
And it was the same problem before the last rustc update in January.

I just published the crate vzense-rust for which this problem occurs and mention it in the README with a link to this discussion.

In case anybody has a NYX650/660 camera, please try it out and see if you get the same problem. :pray:

Thanks for all your help so far!

You should try to avoid transmute. Sometimes it is the correct tool, but most of the time there are better solutions that do not completely sidestep the type and borrow checker. AFAIK, the transmute here is even UB because the layout of fat pointers is not guaranteed. You should use a pointer cast or the bytemuck crate.
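
A minimal sketch of the pointer-cast variant, assuming the buffer is a slice of c_char as in the firmware example. The helper name c_buf_to_string is made up for this illustration.

```rust
use std::ffi::{c_char, CStr};

/// Hypothetical helper: reads a NUL-terminated string out of a C char buffer
/// without transmute. Returns None if the buffer contains no NUL byte.
pub fn c_buf_to_string(buf: &[c_char]) -> Option<String> {
    // SAFETY: c_char is either i8 or u8. Both have the same size and
    // alignment as u8, and every bit pattern is a valid u8, so viewing the
    // same memory as bytes is sound. Pointer and length come from `buf`.
    let bytes: &[u8] =
        unsafe { std::slice::from_raw_parts(buf.as_ptr().cast::<u8>(), buf.len()) };
    // Stops at the first NUL byte and fails if there is none.
    let cstr = CStr::from_bytes_until_nul(bytes).ok()?;
    Some(cstr.to_string_lossy().into_owned())
}
```

Casting the element pointer and rebuilding the slice with from_raw_parts keeps the length explicit, instead of relying on how a slice reference is laid out.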

My other advice would be to minimize the unsafe wrapper size. Your init function is one big unsafe blob, which makes it hard to audit. For my similar projects I like to have three layers.

  1. bindgen generated sys layer.
  2. minimal safe wrapper layer
  3. The device layer with a rusty user interface

Something like this

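fn get_firmware_version(device: sys::HANDLE, buffer: &mut [u8]) -> PsReturnStatus {
    // The C side takes an i32 length; panic rather than silently truncate.
    let len = buffer.len().try_into().unwrap();
    // c_char is i8 or u8 depending on the target, so cast the pointer instead of transmuting.
    let ptr: *mut std::os::raw::c_char = buffer.as_mut_ptr().cast();
    unsafe {
        scGetFirmwareVersion(device, ptr, len)
    }
}
pub struct Device(sys::HANDLE);

impl Device {
    fn get_firmware_version(&self) -> Result<String, LibraryError> {
        let mut buffer = [0u8; 64];
        match wrapper::get_firmware_version(self.0, &mut buffer) {
            sys::OK => {
                // from_bytes_until_nul fails only if there is no NUL byte in the buffer.
                let version = CStr::from_bytes_until_nul(&buffer)
                    .expect("firmware version is not NUL-terminated");
                Ok(version.to_string_lossy().into_owned())
            }
            error_code => Err(LibraryError::from_code(error_code)),
        }
    }
}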