First post, using unsafe. Please be kind.
I'm trying to read in C-style structures from binary dumps that were created in C++. They're all POD types, so no worries there. I'm basing my code on this Stack Overflow post, and with a little modification it works, but I ran into a literal stack overflow because the structures are too large (over 100 KB, so not a surprise). So I'm modifying the function to return a Box<T> instead of just T, and I'm running into having to use alloc and dealloc, which doesn't seem like it should be necessary, but I have a feeling I'm missing something. The code is below:
use std::alloc::{alloc, dealloc, handle_alloc_error, Layout};
use std::fs::{self, File};
use std::io::{BufReader, Read};
use std::mem;
use std::path::Path;
use std::slice;

const BIG_STRUCT_SIZE: usize = 140000;

#[derive(Clone, Copy, Debug)]
#[repr(C)]
#[repr(packed(1))]
struct My_Big_C_Struct {
    // Not implemented the sub-structures yet!
    bulk: [u8; BIG_STRUCT_SIZE - mem::size_of::<u32>()],
    crc: u32, // Every struct has a crc at the end
}

// Reads in the specified struct from the specified path. If there is not an exact match on size, returns an error
fn read_struct<T, P: AsRef<Path>>(path: P) -> std::io::Result<Box<T>> {
    let path = path.as_ref();
    let struct_size = ::std::mem::size_of::<T>();
    let num_bytes = fs::metadata(path)?.len() as usize;
    if struct_size != num_bytes {
        return Err(std::io::Error::new(
            std::io::ErrorKind::Other,
            format!(
                "Length of file did not match structure specified. File length: {}, Expected: {}",
                num_bytes, struct_size
            ),
        ));
    }
    let mut reader = BufReader::new(File::open(path)?);
    unsafe {
        let ptr_layout = Layout::new::<T>();
        let raw_ptr = alloc(ptr_layout);
        // alloc returns a null pointer on failure, so check before using it
        if raw_ptr.is_null() {
            handle_alloc_error(ptr_layout);
        }
        let buffer = slice::from_raw_parts_mut(raw_ptr, num_bytes);
        match reader.read_exact(buffer) {
            Ok(_) => Ok(Box::from_raw(raw_ptr as *mut T)),
            Err(e) => {
                // Free the allocation ourselves, since no Box owns it yet
                dealloc(raw_ptr, ptr_layout);
                Err(e)
            }
        }
    }
}
Is there a way to do this with Box where I can just say "make an empty one for the type" and just go? I'm thinking I'm just missing something obvious, because I don't want to have to use alloc and dealloc at all if I can help it.
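For reference, here is a rough sketch of what I imagine "make an empty one for the type" could look like, assuming Rust 1.82 or newer, where Box::new_uninit and assume_init on Box<MaybeUninit<T>> are stable. The names read_boxed and SmallDump are made up for illustration, and the function is marked unsafe because nothing here checks that T really is a POD type. It may well not be the recommended way.

```rust
use std::io::{self, Read};
use std::mem::{self, MaybeUninit};
use std::{ptr, slice};

/// Reads exactly `size_of::<T>()` bytes from `reader` into a heap-allocated `T`.
///
/// # Safety
/// Assumption of this sketch: the caller guarantees that every bit pattern is
/// a valid `T` (integers and arrays of them; no `bool`, enums, or references).
unsafe fn read_boxed<T, R: Read>(mut reader: R) -> io::Result<Box<T>> {
    // Heap allocation with T's layout; the big value never lives on the stack.
    let mut boxed: Box<MaybeUninit<T>> = Box::new_uninit();
    let len = mem::size_of::<T>();
    let raw = boxed.as_mut_ptr() as *mut u8;
    unsafe {
        // Zero the allocation so the byte slice never covers uninitialized memory.
        ptr::write_bytes(raw, 0, len);
        let bytes = slice::from_raw_parts_mut(raw, len);
        // On error, `?` returns early and dropping `boxed` frees the allocation,
        // so no manual dealloc is needed.
        reader.read_exact(bytes)?;
        Ok(boxed.assume_init())
    }
}

// Hypothetical small stand-in for the real 140000-byte struct.
#[derive(Clone, Copy)]
#[repr(C, packed(1))]
struct SmallDump {
    bulk: [u8; 12],
    crc: u32,
}
```

With something like this, the file-length check from my function would stay as it is, and the call would become unsafe { read_boxed::<My_Big_C_Struct, _>(File::open(path)?) }.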
I'm also concerned that, in the comments on the answers, Shepmaster says that "it improperly uses unsafe Rust. The proposed function can introduce memory unsafety in safe Rust code". So I'm wondering whether the approach itself is wrong. I don't want to read from the binary dumps field by field; I want the structure definition itself to specify the sizes and layout, as then there's only one opportunity to get it right or wrong.
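My understanding of that comment, which may be incomplete, is that a safe function generic over any T lets safe code build invalid values (for example a bool from the byte 2). One possible way around it is sketched below: an unsafe marker trait restricts T, so the only unsafe promise sits next to each struct definition. PlainData, from_bytes, and Header are names I made up; as far as I know the bytemuck and zerocopy crates offer traits in this spirit, but I haven't checked whether they fit my case.

```rust
use std::mem;
use std::ptr;

/// Hypothetical marker trait (a name invented for this sketch).
///
/// # Safety
/// Implementors promise that every possible bit pattern is a valid value.
unsafe trait PlainData: Copy {}

unsafe impl PlainData for u8 {}
unsafe impl PlainData for u32 {}
unsafe impl<T: PlainData, const N: usize> PlainData for [T; N] {}

/// Safe to call: the `PlainData` bound rules out types such as `bool` or `&T`.
fn from_bytes<T: PlainData>(bytes: &[u8]) -> Option<T> {
    if bytes.len() != mem::size_of::<T>() {
        return None;
    }
    // SAFETY: the length was checked above, `read_unaligned` has no alignment
    // requirement, and `PlainData` promises any bit pattern is a valid `T`.
    Some(unsafe { ptr::read_unaligned(bytes.as_ptr() as *const T) })
}

// Hypothetical small struct; the layout attributes mirror the real one.
#[derive(Clone, Copy)]
#[repr(C, packed(1))]
struct Header {
    id: u32,
    flags: [u8; 4],
}

// The single `unsafe` promise per struct, made right beside its definition.
unsafe impl PlainData for Header {}
```

This version returns T by value, so it only suits small types as written; for the big structs the same trait bound would presumably go on a function that returns Box<T>.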
Really, all I'm trying to do is get some big binary structures (I can't alter that, they'll be binary) into Rust from disk so I can do math on them. I think this is the way, but I'm open to other suggestions too.