Can I conveniently compile bytes into a Rust program with a specific alignment?


#1

The short version: can I specify the alignment of an include_bytes!("...") invocation?

My use case here involves a recent crate I published, regex-automata. One of the key features of that crate is the ability to serialize DFAs to raw bytes and then cheaply deserialize them. In particular, deserialization is designed to be a constant time operation that does no heap allocation or mutation. In other words, a deserialized DFA must be able to operate on the raw bytes, which means that we need to think about both alignment and endianness. Endianness is easy to solve, but alignment is proving a bit trickier.

For the purposes of this discussion, think of a DFA in memory as something like this:

struct DFA<'a> {
    start: u64,
    max_match: u64,
    transitions: &'a [u32],
}

(The real representation is a bit more complicated, but I think the above is fine to illustrate the issue.)

Namely, a DFA is made up of a constant number of fixed size fields and a transition table. At deserialization time, we can do a number of operations proportional to the number of fixed size fields, but we must not do a number of operations proportional to the number of transitions. Remember, deserialization should be cheap and zero-copy. In this case, our transition table is a &[u32], which requires pointing at a memory address that is aligned to 4 bytes.

The question is, if I have a &[u8], which is what DenseDFA::from_bytes accepts for deserialization, then how can I ensure that it is properly aligned to 4 bytes? This is required for the part of deserialization that casts a &[u8] to a &[u32]. Usually, there are a fair number of tricks one can do to specify alignment in some way, but in my case, I’d really like to be able to build a DFA from bytes compiled into the executable. The simplest way I know how to do that is with something like this:

lazy_static! {
  pub static ref DFA: ::regex_automata::DenseDFA<&'static [u32], u32> = {
    unsafe {
      ::regex_automata::DenseDFA::from_bytes(
        include_bytes!("dfa.littleendian.dfa")
      )
    }
  };
}

The problem here is that include_bytes!("...") sensibly only guarantees an alignment of 1. Therefore, this code only works if the data coincidentally happens to be aligned to 4 bytes. Which sometimes happens. But not always. (DenseDFA::from_bytes checks for alignment, so when it fails, you get a panic instead of UB.)
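
To make the failure mode concrete, here is a minimal sketch of the kind of checked cast described above. The function name try_cast_u32s is hypothetical and this is not the actual code inside DenseDFA::from_bytes; it only illustrates why a byte slice with alignment 1 cannot be safely reinterpreted as a &[u32] without a check.

```rust
use std::mem::{align_of, size_of};
use std::slice;

/// Hypothetical helper: reinterpret `bytes` as `&[u32]` without copying,
/// or return `None` if the pointer or the length does not permit it.
fn try_cast_u32s(bytes: &[u8]) -> Option<&[u32]> {
    // The start address must be a multiple of the alignment of u32.
    let aligned = bytes.as_ptr() as usize % align_of::<u32>() == 0;
    // The length must cover a whole number of u32 values.
    let whole = bytes.len() % size_of::<u32>() == 0;
    if !(aligned && whole) {
        return None;
    }
    // SAFETY: the pointer is aligned for u32, the new length covers exactly
    // the same memory as `bytes`, and every bit pattern is a valid u32.
    Some(unsafe {
        slice::from_raw_parts(
            bytes.as_ptr() as *const u32,
            bytes.len() / size_of::<u32>(),
        )
    })
}
```

Both checks are constant time, so a cast like this stays within the "no work proportional to the number of transitions" budget.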

Is there a way to fix this such that include_bytes!(...) can be made to work? I’m afraid that this might not be possible, but it seems like it could be with an attribute?

Another idea that would work, I think, is to abandon the use of include_bytes! and convert the raw bytes of my DFA to a &'static [u64] that is explicitly written as a Rust source file instead of included as a separate file at compile time. That would guarantee correct alignment. I’d just need to cast it to a &[u8] to feed it to DenseDFA::from_bytes, which is fine. However, I’d prefer not to do that, since I’ve found things like that to increase compile times, and they also generally use more space on disk.
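
Roughly, that alternative could look like the sketch below. The static DFA_WORDS, its contents, and the helper dfa_bytes are all made up for illustration; in practice the array would be generated from the serialized DFA.

```rust
use std::mem::size_of_val;
use std::slice;

// Hypothetical generated data. A static of type [u64; N] is aligned to
// align_of::<u64>(), which is at least 4 on common targets.
static DFA_WORDS: [u64; 2] = [0x0000_0002_0000_0001, 0x0000_0004_0000_0003];

/// View the u64 array as bytes without copying. The byte order seen
/// through this view depends on the endianness of the target.
fn dfa_bytes() -> &'static [u8] {
    // SAFETY: u8 has alignment 1, u64 has no padding bytes, and the length
    // is exactly the size of the array in bytes.
    unsafe {
        slice::from_raw_parts(
            DFA_WORDS.as_ptr() as *const u8,
            size_of_val(&DFA_WORDS),
        )
    }
}
```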

Another idea is if we inserted padding bytes at the beginning of the serialized DFA, and then instructed the deserializer to ignore them. But I think that in order to do this, you need to know the address that the byte array is stored at, so there’s no way to insert those padding bytes ahead of time. I can’t think of any other tricks along this line, but maybe I’m missing something obvious!

Are there other approaches that would be just as or close to convenient as include_bytes!?


#2

I didn’t notice anything along the lines of this in your listed attempts. I’m not sure if this is a 100% guarantee, but it may serve as a starting point (this is simple enough to wrap with macro_rules!):

(edit: incorporated @cuviper’s slick suggestion)

// This struct is generic in Bytes to admit unsizing coercions.
#[repr(C)] // guarantee 'bytes' comes after '_align'
struct AlignedTo<Align, Bytes: ?Sized> {
    _align: [Align; 0],
    bytes: Bytes,
}

// dummy static used to create aligned data
static ALIGNED: &'static AlignedTo<f32, [u8]> = &AlignedTo {
    _align: [],
    bytes: *include_bytes!("data.dat"),
};

static ALIGNED_BYTES: &'static [u8] = &ALIGNED.bytes;
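
For what it's worth, a macro_rules! wrapper along those lines might look like the sketch below. The macro name aligned_bytes! is made up, and the example passes a byte string literal so that it is self-contained; the intent is that include_bytes!("...") could be passed in the same position, since it also produces a reference to a fixed-size byte array.

```rust
// Generic in Bytes so that &AlignedTo<A, [u8; N]> can unsize to
// &AlignedTo<A, [u8]>.
#[repr(C)] // guarantee 'bytes' comes after '_align'
struct AlignedTo<Align, Bytes: ?Sized> {
    _align: [Align; 0],
    bytes: Bytes,
}

// Hypothetical macro: yields a &'static [u8] whose start is aligned for
// the type $align. $bytes must be a constant expression of type
// &[u8; N], such as b"..." or include_bytes!("...").
macro_rules! aligned_bytes {
    ($align:ty, $bytes:expr) => {{
        static ALIGNED: &'static AlignedTo<$align, [u8]> = &AlignedTo {
            _align: [],
            bytes: *$bytes,
        };
        &ALIGNED.bytes
    }};
}

// Example use, with a literal standing in for a data file.
fn table() -> &'static [u8] {
    aligned_bytes!(u32, b"\x01\x00\x00\x00\x02\x00\x00\x00")
}
```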

#3

Oh, that seems to work! Awesome. I had tried some incantations like that, notably,

    union Aligned {
        _align: u32,
        bytes: [u8],
    }

since I’ve used union before to force alignment for things like [u8; 16]. But it seems like unions don’t support dynamically sized types.

The trick I missed was just using a normal struct with a DST. Clever.


#4

FWIW, I think you could use [Align; 0] to get the alignment without the dummy value.


#5

cc @hsivonen I seem to recall you were asking about this a while back.


#6

Does anyone know why this doesn’t compile (playground)

#[repr(C)]
struct Aligned {
    _align: [u32; 0],
    bytes: [u8],
}

static ALIGNED: &'static Aligned = &Aligned {
    _align: [],
    bytes: *(&b"abc"[..]),
};

but this does? (playground)

#[repr(C)]
struct Aligned<B: ?Sized> {
    _align: [u32; 0],
    bytes: B,
}

static ALIGNED: &'static Aligned<[u8]> = &Aligned {
    _align: [],
    bytes: *b"abc",
};

This seems pretty subtle. In particular, notice that I wrote *(&b"abc"[..]) in the first example, only because *b"abc" doesn’t pass the type checker, even though it does in the second example.


#7

As I interpret it, the trouble is that the value on the right-hand side of bytes: in your struct literal must be a value expression (rvalue), but in your first example it is of type [u8]. Currently in Rust, only place expressions (lvalues) can have unsized types, not value expressions.

That is why I added the generic type argument, to allow an Aligned<[u8; SIZE]> to be constructed first (which is then unsized behind a reference to become &Aligned<[u8]>). I don’t know if there is another way.
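
Spelled out in two steps, the idea is something like this sketch (the function name demo is just for illustration): the struct literal only ever builds a sized value, and the unsized type appears only behind a reference.

```rust
#[repr(C)]
struct Aligned<B: ?Sized> {
    _align: [u32; 0],
    bytes: B,
}

/// Returns the length of the unsized view and whether its bytes start on
/// a u32 boundary.
fn demo() -> (usize, bool) {
    // Step 1: a sized value. The field expression has type [u8; 3].
    let sized: Aligned<[u8; 3]> = Aligned { _align: [], bytes: *b"abc" };
    // Step 2: an unsizing coercion behind a reference. No value of type
    // Aligned<[u8]> is ever constructed directly.
    let unsized_ref: &Aligned<[u8]> = &sized;
    let aligned = unsized_ref.bytes.as_ptr() as usize % std::mem::align_of::<u32>() == 0;
    (unsized_ref.bytes.len(), aligned)
}
```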


#8

Wow. TIL. Thank you!


#9

If using a concrete alignment, why not just use #[repr(align(N))] on a newtype?


#10

I used repr(align()) on a wrapper struct. Like this:

#[repr(align(64))] // Align to cache lines
pub struct Utf8Data {
    pub table: [u8; 384],
}

#11

Ah neat, that’s a good idea too. I had thought about that, but somehow forgot that I could actually name the concrete array type since I know how big the file I’m including is. That does seem simpler!


#12

But is this guaranteed to align the inner field? In particular, if I wanted to store a [u8; 383], it would need one byte of padding, and I don’t think anything specifies whether it goes before or after the first field.


#13

Yeah, technically you’d also need to use repr(C).
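
Something like the following sketch, which combines the two attributes. The static DATA and its zeroed contents are placeholders; with a real file the initializer would be *include_bytes!("...") and the array length would have to match the file size. With repr(C), the single field sits at offset 0 and the padding goes at the end.

```rust
#[repr(C, align(64))] // C layout: 'table' at offset 0; align to cache lines
pub struct Utf8Data {
    pub table: [u8; 383],
}

// Placeholder contents standing in for an included file.
static DATA: Utf8Data = Utf8Data { table: [0; 383] };

/// Offset of the 'table' field from the start of the struct.
fn table_offset() -> usize {
    DATA.table.as_ptr() as usize - &DATA as *const Utf8Data as usize
}
```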