Increase Performance of Heap Allocated Objects

Summary

For quick context, I'm working on an interpreter for a dynamically typed programming language for fun. I'm trying to refactor how I represent heap-allocated values to improve speed and memory use.

Here's a somewhat simplified set of structs to show the general problem I'm trying to tackle.

// stack allocated values
enum Value {
  Bool(bool),
  // one variant is a pointer to a heap object
  Obj(Managed<Obj>),
  // ...
}

// heap allocated values
enum Obj {
  String(String),
  Fun(Fun),
  // ...
}

// example obj
struct Fun {
  name: Managed<Obj>,
  // ...
}

// effectively just a raw pointer to types that impl Manage
struct Managed<T: 'static + Manage + ?Sized> {
  ptr: NonNull<Allocation<T>>,
}

// allocation with header data for gc
pub struct Allocation<T: 'static + Manage + ?Sized> {
  header: Header, // not important
  data: T,
}

trait Manage {
   // .. some methods
}

// simplified process to allocate a managed value
fn main() {
  let string = "example".to_string();
  let alloc = Box::new(Allocation::new(string));
  // leak the box so the GC owns the allocation; otherwise the pointer
  // would dangle when `alloc` drops at the end of scope
  let ptr = NonNull::from(Box::leak(alloc));
  let managed = Managed { ptr };
  let value = Value::Obj(managed);
}

The requirements I'm aiming for are below. They all boil down to squeezing more performance out of my language.

  1. Managed needs to hold a thin pointer, so I need a concrete type. This is for another optimization I'd like to do later to make Value smaller.
  2. Be able to access the enum data directly without matching in some cases. Sometimes I'll know an Obj is actually an Obj::String; other times I'll only know it's an Obj or a Value.
  3. Minimal memory usage.
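For context on requirement 1, here's a standalone sketch (not from the repo) of why thin vs. fat pointers matter for the size of Value:

```rust
use std::fmt::Debug;
use std::mem::size_of;
use std::ptr::NonNull;

fn main() {
    // a thin pointer is one word
    assert_eq!(size_of::<*const u8>(), size_of::<usize>());
    // a pointer to a trait object (or slice) is fat: two words
    assert_eq!(size_of::<*const dyn Debug>(), 2 * size_of::<usize>());
    // NonNull enables the niche optimization, so Option<NonNull<T>>
    // is still a single word
    assert_eq!(size_of::<Option<NonNull<u8>>>(), size_of::<usize>());
}
```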

The possible implementations I've thought about are:

Enums:

Pros:

  • More idiomatic
  • Easiest to implement

Cons:

  • Size: the variants are not very uniform in size, so I'll pay a lot of padding for things like Strings that are relatively small
  • I'll need to match every time, even if I know that, for example, Fun's name is always an Obj::String
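To make the padding point concrete, here's a toy example (not the actual Obj) showing that an enum is as large as its biggest variant, plus the tag:

```rust
use std::mem::size_of;

// hypothetical stand-in for a heap-object enum with uneven variants
enum Toy {
    Small(u8),
    Big([u64; 8]),
}

fn main() {
    // every Toy value pays for the largest variant, even Toy::Small
    assert!(size_of::<Toy>() >= size_of::<[u64; 8]>());
    println!("size_of::<Toy>() = {}", size_of::<Toy>());
}
```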

Unions:

Pros:

  • Can directly cast to variant I want if it is known

Cons:

  • More unsafe code. There's already a lot in this project, but still.
  • Same size issue as enums
  • Several things with unions seem to be unstable
  • I still have to cast in all the places I want to use the underlying data of the union
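For reference, a minimal sketch of what the tagged-union route could look like (hypothetical types; non-Copy fields have to be wrapped in ManuallyDrop, and every read is unsafe):

```rust
use std::mem::ManuallyDrop;

enum ObjKind {
    String,
    Int,
}

union ObjData {
    string: ManuallyDrop<String>,
    int: i64,
}

struct Obj {
    kind: ObjKind,
    data: ObjData,
}

fn main() {
    let obj = Obj { kind: ObjKind::Int, data: ObjData { int: 7 } };
    // the tag must be checked manually before touching the matching field
    if let ObjKind::Int = obj.kind {
        assert_eq!(unsafe { obj.data.int }, 7);
    }
}
```

Dropping String-holding values also becomes a manual job here, which is part of the "more unsafe code" cost.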

Pointer Casts?

This is the one I'm least sure I can actually do. While obviously unsafe, it seems there should be some way to do pointer casting. Maybe something in the vein of

struct Obj {
  kind: ObjKindEnum,
}

struct ObjString {
  obj: Obj,
  // ..
}

struct ObjFun {
  obj: Obj,
  // ..
}

let mut string = ObjString::new();
let obj = &mut string as *mut ObjString as *mut Obj; // ??

unsafe {
  match (*obj).kind {
    ObjKindEnum::String => {
      let string_ptr = obj as *mut ObjString;
    }
    // ..
  }
}
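Here's a compiling version of that idea, under the assumption that every concrete type is #[repr(C)] with the Obj header as its first field, so a pointer to the whole struct is also a valid pointer to the header at offset 0 (all names below are illustrative):

```rust
#[derive(Clone, Copy, PartialEq)]
enum ObjKindEnum {
    String,
    Fun,
}

// #[repr(C)] guarantees field order, so the header sits at offset 0
#[repr(C)]
struct Obj {
    kind: ObjKindEnum,
}

#[repr(C)]
struct ObjString {
    obj: Obj,
    len: usize, // stand-in for the real payload
}

fn main() {
    let mut string = ObjString {
        obj: Obj { kind: ObjKindEnum::String },
        len: 7,
    };
    // upcast: a pointer to the whole struct is a pointer to its header
    let obj = &mut string as *mut ObjString as *mut Obj;

    unsafe {
        // downcast only after checking the tag
        if (*obj).kind == ObjKindEnum::String {
            let string_ptr = obj as *mut ObjString;
            assert_eq!((*string_ptr).len, 7);
        }
    }
}
```

Without #[repr(C)], the compiler is free to reorder fields, and the cast is undefined behavior, which is the clox/Lua trick's main footgun in Rust.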

Pros:

  • May be able to have my cake and eat it too

Cons:

  • Hilariously unsafe
  • The debugging experience would be horrible if there's a bug

I'd greatly appreciate any suggestions on how I might tackle this. Repo for anyone interested: GitHub - Laythe-lang/Laythe: A gradually typed language originally based on the Crafting Interpreters series

Is that Obj::String variant a std::string::String? This might indicate you are double-boxing your strings. String is a fat pointer, and the contents grow on the heap. Of course, if you are using a different String implementation, the details will vary.

Have you profiled heap allocations on a representative code sample? Do you know for certain this is a bottleneck? Have you tried jemalloc instead of system malloc?

A couple days ago I wrote up this code, which lets you have a heap-allocated, dynamically sized array behind a thin pointer (equivalent to Box<[T]>). I did so by storing the array length at the beginning of the heap allocation. It requires a little care to respect alignment.

Here's the tl;dr version:

use std::alloc::{alloc, dealloc, Layout};
use std::marker::PhantomData;
use std::mem::{align_of, needs_drop, size_of};
use std::ptr::{self, drop_in_place};
use std::slice;

/// Heap-allocated array which stores the length in the allocation.
///
/// Equivalent to `Box<[T]>` but with the size of a thin pointer,
/// not a fat pointer.
pub struct Array<T> {
    alloc: *mut u8,
    p: PhantomData<T>,
}

impl<T> Array<T> {
    pub fn from_vec(vec: Vec<T>) -> Self {
        unsafe {
            let len = vec.len();

            // compute memory layout
            let len_memsize = len_memsize::<T>();
            let layout = layout::<T>(len);

            // allocate
            let alloc: *mut u8 = alloc(layout);

            // initialize
            ptr::write(alloc as *mut usize, len);
            for (i, elem) in vec.into_iter().enumerate() {
                let offset = len_memsize + size_of::<T>() * i;
                ptr::write(alloc.add(offset) as *mut T, elem);
            }

            // construct
            Array {
                alloc,
                p: PhantomData,
            }
        }
    }

    pub fn as_slice(&self) -> &[T] {
        unsafe {
            // read length
            let len_memsize = len_memsize::<T>();
            let len = ptr::read(self.alloc as *const usize);

            // construct fat pointer
            let array_start = self.alloc.add(len_memsize) as *const T;
            let slice = slice::from_raw_parts(array_start, len);

            slice
        }
    }

    pub fn as_slice_mut(&mut self) -> &mut [T] {
        unsafe {
            // read length
            let len_memsize = len_memsize::<T>();
            let len = ptr::read(self.alloc as *const usize);

            // construct fat pointer
            let array_start = self.alloc.add(len_memsize) as *mut T;
            let slice = slice::from_raw_parts_mut(array_start, len);

            slice
        }
    }
}

fn len_memsize<T>() -> usize {
    usize::max(size_of::<usize>(), align_of::<T>())
}

fn layout<T>(len: usize) -> Layout {
    let len_memsize = len_memsize::<T>();
    let arr_memsize = size_of::<T>() * len;

    let alloc_size = len_memsize + arr_memsize;
    let alloc_align = usize::max(align_of::<usize>(), align_of::<T>());

    let layout = Layout::from_size_align(alloc_size, alloc_align).unwrap();

    layout
}

impl<T> Drop for Array<T> {
    fn drop(&mut self) {
        unsafe {
            // drop elements
            if needs_drop::<T>() {
                for elem in self.as_slice_mut() {
                    drop_in_place(elem);
                }
            }

            // free
            let len = ptr::read(self.alloc as *const usize);
            let layout = layout::<T>(len);

            dealloc(self.alloc, layout);
        }
    }
}

I think that you could create something similar to store your trait objects behind a thin pointer, while avoiding the "not very uniform in size" problem.

From this, you can cast them into specific types with a similar API to std::any. Furthermore, for your "I'll know Obj is actually an Obj::String" cases, placing core::hint::unreachable_unchecked on your failure branch will make the check disappear from machine code (although I would suggest that's a premature / overly risky optimization).
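To illustrate the unreachable_unchecked pattern with a toy enum (not the real Obj) — note this is instant undefined behavior if the assumption is ever violated:

```rust
enum Obj {
    Str(String),
    Int(i64),
}

/// Caller must guarantee `obj` is the Str variant; the match arm for
/// other variants compiles away, but a wrong call is undefined behavior.
unsafe fn as_str_unchecked(obj: &Obj) -> &str {
    match obj {
        Obj::Str(s) => s,
        _ => std::hint::unreachable_unchecked(),
    }
}

fn main() {
    let o = Obj::Str("hi".to_string());
    assert_eq!(unsafe { as_str_unchecked(&o) }, "hi");
}
```

Marking the function itself unsafe keeps the contract visible at every call site, which helps contain the risk.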

In case you're interested, thincollections implements support for such thin-pointer arrays.

Also, if two words are small enough and the arrays/strings don't need to be resized, then Box<str> and Box<[T]> would work.
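For completeness, a small check of those sizes (two words for the boxed forms vs. three for the growable ones):

```rust
use std::mem::size_of;

fn main() {
    let s: Box<str> = "hello".into();
    let a: Box<[u32]> = vec![1, 2, 3].into_boxed_slice();

    // fat pointers: pointer + length
    assert_eq!(size_of::<Box<str>>(), 2 * size_of::<usize>());
    assert_eq!(size_of::<Box<[u32]>>(), 2 * size_of::<usize>());
    // growable forms carry an extra capacity word
    assert_eq!(size_of::<String>(), 3 * size_of::<usize>());
    assert_eq!(size_of::<Vec<u32>>(), 3 * size_of::<usize>());

    assert_eq!(&*s, "hello");
    assert_eq!(&*a, &[1u32, 2, 3][..]);
}
```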

