Best practice for wrapping an unsafe function that fills dynamically-sized, uninitialized memory?


#1

This is a design problem that I have come across, and I’ve come up with several solutions, but I’m curious what other Rust users have done, or their opinion on what is a best practice.

A common practice in C/C++ interfaces is to take pointer (or reference) parameters through which the function writes its output. For example, in OpenGL, there is a function with the prototype:

void glCreateBuffers(GLsizei n, GLuint *buffers)

Each GLuint becomes a non-zero buffer handle. The simplest way to make a Rust interface around this would be

fn create_buffers(buffers: &mut [GLuint])

This provides a significant improvement in type safety and ergonomics, but doesn’t necessarily prevent a user from making the following two (logical) errors:

// other functions that use the buffer handles end up using GLuint type as well
fn use_buffers(buffers: &[GLuint]) { ... }

let mut buffers = ...;
create_buffers(&mut buffers);
// overwriting valid buffer handles
create_buffers(&mut buffers);
use_buffers(&buffers);

let mut buffers = ...;
// use_buffers doesn't statically require non-zero handles,
// thus it doesn't require create_buffers to be called before use
use_buffers(&buffers);

To convey that we are constructing NonZero handles, one might make a BufferHandle type:

struct BufferHandle(NonZero<GLuint>);

and use this type in place of the raw GLuint, like so:

fn create_buffers(buffers: &mut [BufferHandle])

Unfortunately, this would require a user to have a slice of non-zero buffer handles to start with. Using Option (and taking advantage of the Option<NonZero<T>> space optimization), we can at least enable users to pass a slice full of Nones.

fn create_buffers(buffers: &mut [Option<BufferHandle>])
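
The space optimization mentioned above can be checked directly. The following is only a sketch: it assumes GLuint is a u32 and uses the standard library's NonZeroU32 for the handle. The helper names option_is_free and all_created are made up for illustration.

```rust
use std::mem::size_of;
use std::num::NonZeroU32;

type GLuint = u32; // assumption: matches the usual OpenGL typedef

/// Newtype around a non-zero buffer name.
#[repr(transparent)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct BufferHandle(NonZeroU32);

impl BufferHandle {
    /// Returns None for the reserved name 0.
    pub fn new(id: GLuint) -> Option<Self> {
        NonZeroU32::new(id).map(BufferHandle)
    }
}

/// Hypothetical helper: true if wrapping the handle in Option costs no space,
/// because None is stored as the otherwise-forbidden value 0.
pub fn option_is_free() -> bool {
    size_of::<Option<BufferHandle>>() == size_of::<GLuint>()
}

/// Hypothetical helper: turns a slice of optional handles into plain handles,
/// or None if any slot is still empty. Note that this allocates.
pub fn all_created(slots: &[Option<BufferHandle>]) -> Option<Vec<BufferHandle>> {
    slots.iter().copied().collect()
}
```

The second helper shows the kind of extra step a caller needs before a &[BufferHandle] is available.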

However, it becomes unnecessarily difficult to use &[Option<BufferHandle>], and a user will want to do a transformation to &[BufferHandle] somehow. Another option would be to make the function return a Vec:

fn create_buffers(count: usize) -> Vec<BufferHandle>

This is much cleaner, but also forces the user to create a new allocation each time the function is used. Passing in a mutable vector can solve this issue:

fn create_buffers(count: usize, buffers: &mut Vec<BufferHandle>)

In this case, Vec::reserve() would be used, and the new handles would be appended to the end of buffers. This is likely the most flexible and safe option from a type perspective, but possibly also the most complex or unfriendly option.
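
A minimal sketch of this append-style wrapper, under stated assumptions: fake_gl_create_buffers is a hypothetical stand-in for the real glCreateBuffers binding, GLuint is taken to be u32, and ZeroHandle is an invented error type. The idea is to reserve space, let the C-style function fill the spare capacity, and only extend the length once every value has been checked.

```rust
use std::num::NonZeroU32;

type GLuint = u32; // assumption: matches the usual OpenGL typedef

/// `repr(transparent)` keeps the layout identical to a plain GLuint.
#[repr(transparent)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct BufferHandle(NonZeroU32);

impl BufferHandle {
    pub fn get(self) -> GLuint {
        self.0.get()
    }
}

/// Hypothetical stand-in for glCreateBuffers: writes `n` non-zero names to `out`.
unsafe fn fake_gl_create_buffers(n: usize, out: *mut GLuint) {
    for i in 0..n {
        // SAFETY: the caller guarantees `out` is valid for `n` writes.
        unsafe { out.add(i).write(100 + i as GLuint) };
    }
}

/// Invented error type: the driver handed back the reserved name 0.
#[derive(Debug, PartialEq, Eq)]
pub struct ZeroHandle;

pub fn create_buffers(count: usize, buffers: &mut Vec<BufferHandle>) -> Result<(), ZeroHandle> {
    buffers.reserve(count);
    let old_len = buffers.len();
    // SAFETY: after `reserve`, the allocation has room for `count` more
    // elements past `old_len`, and BufferHandle has the layout of GLuint.
    unsafe {
        let out = buffers.as_mut_ptr().add(old_len) as *mut GLuint;
        fake_gl_create_buffers(count, out);
        // Check the non-zero invariant before exposing the new elements.
        for i in 0..count {
            if out.add(i).read() == 0 {
                return Err(ZeroHandle);
            }
        }
        buffers.set_len(old_len + count);
    }
    Ok(())
}
```

If the check fails, the length is never changed, so the vector still contains only valid handles.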

Perhaps it’s a bad idea to try to use NonZero here? I have noticed that the standard library’s file descriptor abstraction doesn’t. I also think that, in general, it’s much more profitable to use the type system to prevent memory errors than to describe application logic. Nonetheless, I am very interested in doing so if possible!


#2

One idea worth considering is making use of session types to statically enforce correct BufferHandle usage. The gist is you’d have something like this:

enum Uninit {}
enum Init {}

struct BufferHandle<T> {
    _ptr: GLuint, // or however you want to define the ptr to the raw allocation done by gl
    _marker: std::marker::PhantomData<T>
}

impl BufferHandle<Uninit> {
    // define functions that can be called only when the buffer handle is uninitialized

    fn init(self) -> BufferHandle<Init> {
        // alloc gl buffer
        let ptr = ...; // allocate the memory
        BufferHandle { _ptr: ptr, _marker: std::marker::PhantomData }
    }
}

impl BufferHandle<Init> {
    // define functions that can be called only when the buffer is initialized
}

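// Note: rustc rejects `impl Drop for BufferHandle<Init>` (E0366: Drop impls cannot be
// specialized), so Drop has to cover every state and skip uninitialized handles.
impl<T> Drop for BufferHandle<T> {
    fn drop(&mut self) {
        // release the GL resources only if this handle was actually initialized
    }
}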

impl<T> BufferHandle<T> {
    // define functions that can be called irrespective of the buffer state
}

// use_buffers can only be called on initialized BufferHandles
fn use_buffers(buffers: &[BufferHandle<Init>]) { ... }

I’ve not fully fleshed this out in my mind and it may be overkill, but wanted to throw this option out there as well if you really want to go all out with type system facilities.


#3

What you want is basically support for alloca in Rust. With some help from the compiler you could have the same API as with returning a Vec, but without having to perform any allocation.
There has been some work on this, but nothing that will come soon unfortunately.
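
Until something like alloca exists, one partial workaround is possible when the count is known at compile time: return a fixed-size array, which lives on the stack. This is only a sketch of that idea, not what was proposed above. fake_gl_create_buffers is a hypothetical stand-in for the real binding, and GLuint is assumed to be u32.

```rust
use std::mem::MaybeUninit;
use std::num::NonZeroU32;

type GLuint = u32; // assumption: matches the usual OpenGL typedef

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct BufferHandle(NonZeroU32);

impl BufferHandle {
    pub fn get(self) -> GLuint {
        self.0.get()
    }
}

/// Hypothetical stand-in for glCreateBuffers: writes `n` non-zero names to `out`.
unsafe fn fake_gl_create_buffers(n: usize, out: *mut GLuint) {
    for i in 0..n {
        // SAFETY: the caller guarantees `out` is valid for `n` writes.
        unsafe { out.add(i).write(100 + i as GLuint) };
    }
}

/// Creates N handles without a heap allocation. Returns None if any name is 0.
pub fn create_buffers<const N: usize>() -> Option<[BufferHandle; N]> {
    // Uninitialized stack storage for the raw names.
    let mut raw = [MaybeUninit::<GLuint>::uninit(); N];
    // SAFETY: `raw` has room for exactly N values, and MaybeUninit<GLuint>
    // has the same layout as GLuint.
    unsafe { fake_gl_create_buffers(N, raw.as_mut_ptr() as *mut GLuint) };
    // SAFETY: the stand-in above wrote all N slots.
    let ids = raw.map(|slot| unsafe { slot.assume_init() });
    if ids.contains(&0) {
        return None;
    }
    Some(ids.map(|id| BufferHandle(NonZeroU32::new(id).expect("checked above"))))
}
```

A call such as create_buffers::<4>() then yields an array of four handles. This does not help when the count is only known at run time, which is the case alloca would cover.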


#4

Isn’t that more similar to a stack slice than to a Vec?


#5

Wow thank you so much, I’ve never heard of session types! Quite fantastic! I think this is precisely what I’m interested in!


#6

Yeah, it’s a pretty nifty design technique made possible by Rust allowing “conditional” impl blocks for a struct. I hesitate to call it specialization because that refers to a specific (different) feature.

One thing I omitted in haste in my sketch above is a trait that you’d likely want so that you have:

trait State {} // marker

// Uninit and Init tag enums as before
impl State for Uninit {}
impl State for Init {}

struct BufferState<S: State> { ... }

This is to prevent accidental nonsensical BufferState instantiations.
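
Putting the pieces of this thread together, a compiling version might look roughly like the sketch below. The struct is called BufferHandle here, as in the earlier post. fake_gl_create_buffer is a hypothetical stand-in for the real GL call, GLuint is assumed to be u32, and the new and id methods are invented for illustration.

```rust
use std::marker::PhantomData;
use std::sync::atomic::{AtomicU32, Ordering};

type GLuint = u32; // assumption: matches the usual OpenGL typedef

/// Marker trait so that only the two tag types can be used as a state.
pub trait State {}
pub enum Uninit {}
pub enum Init {}
impl State for Uninit {}
impl State for Init {}

pub struct BufferHandle<S: State> {
    id: GLuint,
    _state: PhantomData<S>,
}

/// Hypothetical stand-in for the GL call: hands out fresh non-zero names.
fn fake_gl_create_buffer() -> GLuint {
    static NEXT: AtomicU32 = AtomicU32::new(1);
    NEXT.fetch_add(1, Ordering::Relaxed)
}

impl BufferHandle<Uninit> {
    pub fn new() -> Self {
        BufferHandle { id: 0, _state: PhantomData }
    }

    /// Consumes the uninitialized handle and returns an initialized one.
    pub fn init(self) -> BufferHandle<Init> {
        BufferHandle { id: fake_gl_create_buffer(), _state: PhantomData }
    }
}

impl BufferHandle<Init> {
    /// Only available once the handle has been initialized.
    pub fn id(&self) -> GLuint {
        self.id
    }
}

/// Accepts initialized handles only; returns how many it was given.
pub fn use_buffers(buffers: &[BufferHandle<Init>]) -> usize {
    buffers.len()
}
```

With these definitions, passing a BufferHandle<Uninit> to use_buffers is a compile error, and BufferHandle<String> is rejected because String does not implement State.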

Anyway, this may be overkill as mentioned. But, you probably could use it to enforce the proper usage/protocol in dealing with the buffers. If these types are internal to your crate (i.e. aren’t going to tie you to compatibility issues), it’s worth experimenting with.


#7

I actually did something similar just now, except that my trait name (HandleState) is a bit worse and I neglected to use a BufferState<S: State> type. This is what I ended up with. The only other issue I had was that I found it necessary to create a slice wrapper type that implements Deref in order to implement the init method, but I think that’s expected. Anyway, thanks again. As you say, this might be overkill, but I think it’s really interesting what’s possible in Rust!


#8

Nice. A couple of quick suggestions:

  1. I used an (empty) enum for Init and Uninit intentionally, rather than a unit struct. The enum Init {} is an “uninhabited” type: you cannot create an instance of it, so it is purely a type-level concept. A unit struct still has a value (a single one, but a value nonetheless).
  2. You’ll want to encode HandleState purely in the BufferState type, and use std::marker::PhantomData internally (PhantomData is a zero-sized type used purely for conveying certain type system properties). It looks like you’re giving it an actual field in your struct.
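
Both points can be illustrated with a small sketch. The names are the ones used in this thread, while from_raw, id and marker_is_zero_sized are invented for the example and GLuint is assumed to be u32. The PhantomData field takes no space, so the handle stays the size of the raw GLuint. A real field of type Init would instead make the struct impossible to construct, since Init has no values.

```rust
use std::marker::PhantomData;
use std::mem::size_of;

type GLuint = u32; // assumption: matches the usual OpenGL typedef

/// Uninhabited tag type: no value of this type can ever exist.
pub enum Init {}

pub struct BufferHandle<S> {
    id: GLuint,
    // Carries the state in the type only; stores nothing at run time.
    _state: PhantomData<S>,
}

impl<S> BufferHandle<S> {
    /// Hypothetical constructor for the example.
    pub fn from_raw(id: GLuint) -> Self {
        BufferHandle { id, _state: PhantomData }
    }

    pub fn id(&self) -> GLuint {
        self.id
    }
}

/// Hypothetical helper: checks that the marker adds no size to the handle.
pub fn marker_is_zero_sized() -> bool {
    size_of::<PhantomData<Init>>() == 0
        && size_of::<BufferHandle<Init>>() == size_of::<GLuint>()
}
```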

#9

Thanks for the suggestions, I hadn’t considered those points (and I agree with your method)!


#10

Hi tomaka, it’s really nice to receive a reply from you! (I use glutin constantly, by the way.) I think alloca would be the best non-overkill option (if it were an option), and as you mention in vulkano/TROUBLES.md, it would also be great to have placement new. That could be used even without alloca, using SmallVec, right? It would suit this particular use case well, but perhaps not the general case of an uninitialized buffer, which would need let b = box alloca();