A little while ago I was including some static data in an executable and wanted to do this in a type-safe way. I came across various discussions, and this exellent blog post and decided a crate was probably in order. I also briefly spoke to the author (Jack Wrenn) about an earlier iteration of this crate before publishing.
I'm pretty sure this simple crate is sound and in good shape, but I did want to open it up for review to make sure. Please feel free to comment here or open up issues on the repo if you spot anything or have any API comments.
You can include data in any custom type you like. The best way of doing this is if your custom type satisfies the requirements for bytemuck::Pod, in which case you can simply use include_data.
#[repr(C)]
#[derive(Copy, Clone)]
struct Foo {
integer: u16,
pair: (u8, u8),
}
// Safety: the requirements for `Pod` have been manually checked.
unsafe impl bytemuck::Zeroable for Foo {}
unsafe impl bytemuck::Pod for Foo {}
static FOO_DATA: Foo = include_data!("../tests/test_data/file_exactly_4_bytes_long");
This implementation is unsound, as the representation for tuples is the default representation, meaning that is it largely left unspecified and there are no guarantees of no padding or even that the fields are stored in order of declaration. This could be fixed by replacing (u8,u8) with [u8; 2].
This also does not specify what should happen if the specified file does not happen to contain exactly four bytes. Is it UB, silently ignored or a compile-time error? This should be documented extensively as these are the most important questions I would ask before using this library.
Edit: I just saw it is documented on the macros themselves. Still this also belongs in the main README / crate docs page.
This looks wrong. Are you sure it shouldn't be include_slice!(u8, $file)? In addition, I really don't see that need for these extra methods. Are they really necessary? In my opinion they are redundant in favor of the, also more readable, include_slice. I also find the name confusing and would expect a macro like include_u8 to give me a single value of type u8, not a slice.
You should probably say something about byte order in the documentation— If I'm compiling on a little-endian machine for a big-endian target (or vice versa), which endianness should be stored inside the data file?
Thanks, the redrafted README and docs say this much more explicitly. Basically the data is copied in byte-for-byte, the interpretation of the bitpattern happens entirely at runtime. The only thing "happening" at compile time is a copy into a properly aligned memory location. Thus, interpretation of multi-byte sequences is according to the endianness (and everything else) of the target machine.
the markup auto-titling of links makes it look like the links are both the same, but the first one leads to the macro and the second one to the trait.
Aliases are provided for include_slice for primitive number types, using them is a matter of personal preference. For example:
The code below still uses include_u32 instead of include_u32s (shouldn't cargo test have prevented this kind of typo, if it is a doctest?). In addition, I am still not really satisfied with the naming (and have never seen (primitive) types pluralized this way), but this is likely just personal preference and/or bikeshedding.
Good catch on the link, thanks (in any case, a redditor pointed out a different trait I should probably be using for the bound), will be fixed later today.
For the alias naming, I think the github readme is up-to-date, I just haven't pushed a new crates.io release yet.