[New crate!] Peapod: ultra-compact storage for enums

Hi fellow Rustaceans,

I just published a new crate called peapod that uses proc-macro magic to pack enums together really tightly!

Shameless plug: if you think this is cool feel free to star it on GitHub :slight_smile:

Here's the README:

Peapod

Peapod is a data structure for storing enums super-compactly, like peas in a
pod. It works with any enum that implements the Phenotype trait, which
captures the behaviour of each variant.

Contents

  1. Motivation
  2. Usage
  3. Technical
  4. How Peapod works
  5. When not to use Peapod

Motivation

We only have so much memory to work with. Especially in space-constrained
systems, we want to be particularly efficient. Peapod provides a way of
storing enums that can dramatically reduce space usage. You can read more
in-depth about the motivation in the technical section.

tl;dr: Peapod provides ultra-compact storage for enums!

Usage

You can basically just use Peapod like a normal Vec. Some functionality is
impossible, though, such as iterating over a Peapod without consuming it.

To make an enum suitable for Peapod storage, stick a #[derive(Phenotype)]
on it.

use peapod::{Phenotype, Peapod};

fn main() {
    // The Peapod representation is a lot smaller!
    // These numbers are in bytes
    assert_eq!(ILovePeas::PEAPOD_SIZE.unwrap(), 9);
    assert_eq!(std::mem::size_of::<ILovePeas>(), 16);

    let mut pp = Peapod::new();
    pp.push(ILovePeas::SnowPea);
    pp.push(ILovePeas::Edamame(0x9EA90D));
    pp.push(ILovePeas::GeneticPea {
        wrinkled: true,
        yellow: true,
    });

    for pea in pp {
        // do something with pea!
    }
}

#[derive(Phenotype)] // <- this is where the magic happens
enum ILovePeas {
    Edamame(usize),
    SnowPea,
    GeneticPea { wrinkled: bool, yellow: bool },
}

Technical

enums (also known as tagged unions) are represented in memory by a tag
(integer) and a union. The tag specifies how the bits of the union are
interpreted. For example, a tag of 0 might mean "read the union as
Result::Ok(_)", while a tag of 1 would mean "read the union as
Result::Err(_)".

For alignment reasons, the compiler has to lay out enums so that the tag
takes up more space than it needs to. If there are only two variants, we only
need one bit to keep track of which variant a value is. Take this pretty
drastic example:

enum Two {
    First(usize),
    Second(usize)
}
// mem::size_of::<Two> == 16

Since the size of each variant's data is 8 bytes, and the size of the enum is
16 bytes, 8 bytes are being used for the tag, and 63 of those 64 bits are
wasted! We can do better.
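This overhead is easy to check with std::mem::size_of. A small sketch (sizes assume a typical 64-bit target; `TinyTwo` is an extra illustrative type, not from the crate):

```rust
use std::mem::size_of;

enum Two {
    First(usize),
    Second(usize),
}

// With one-byte payloads, the whole enum has alignment 1, so the
// tag only costs a single byte.
enum TinyTwo {
    First(u8),
    Second(u8),
}

fn main() {
    assert_eq!(size_of::<Two>(), 16); // 8 bytes payload + 8 bytes for a 1-bit tag
    assert_eq!(size_of::<TinyTwo>(), 2); // 1 byte payload + 1 byte tag
}
```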

Peapod works by "cleaving" an enum into tag and union. Tags are stored
together in a bitvec type so that no space is wasted due to alignment. All the
data from the enums (in union form) is also stored together.

This drawing illustrates the previous example:

Scale: 1 - == 1 byte

Standard:
+--------+--------+
|  tag   |  data  |
+--------+--------+
        ^ Only this byte is actually needed to store the tag

Standard array:
+--------+--------+--------+--------+--------+--------+
|  tag   |  data  |  tag   |  data  |  tag   |  data  | . . .
+--------+--------+--------+--------+--------+--------+

Peapod:
+-+--------+
| |  data  |
+-+--------+
 ^ tag

Peapod array:
+-+   +--------+--------+--------+
| | + |  data  |  data  |  data  | . . .
+-+   +--------+--------+--------+
 ^ many tags can be packed into one byte; this byte could hold 5 more tags
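The tag packing in the drawing can be sketched with plain bit arithmetic on a u8 (a hand-rolled illustration; the crate actually uses the bitvec crate for this):

```rust
fn main() {
    // Three 1-bit tags, one per value in the drawing above.
    let tags = [true, false, true];

    // Pack them into a single byte, least significant bit first.
    let mut packed: u8 = 0;
    for (i, &tag) in tags.iter().enumerate() {
        packed |= (tag as u8) << i;
    }
    assert_eq!(packed, 0b101);

    // Read back the tag at index 2.
    assert_eq!((packed >> 2) & 1, 1);
}
```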

How does it do it?

Preface: compiler people I beg your forgiveness.

The magic is in the Phenotype trait, which has two very important methods:
cleave and reknit.

type Value;
fn cleave(self) -> (usize, Self::Value);
fn reknit(tag: usize, value: Self::Value) -> Self;

The type Value is some type that can hold all the data from each enum
variant. It should be a union.

cleave takes a concrete instance of an enum and splits it into a tag (this
tag is internal to Phenotype, unrelated to the compiler's) and a
Self::Value. reknit does the opposite: it takes a tag and a Self::Value
and reconstitutes them into an enum variant.
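To make the idea concrete, here is a hand-written sketch of what cleave and reknit do for a tiny enum (the names and layout are illustrative, not the actual generated code):

```rust
enum Pea {
    Snap(u32),
    Snow,
}

// A union big enough to hold the data of any variant.
// (u32 and () are both Copy, so no ManuallyDrop is needed here.)
union PeaData {
    snap: u32,
    snow: (),
}

// Split an enum value into (tag, data).
fn cleave(pea: Pea) -> (usize, PeaData) {
    match pea {
        Pea::Snap(x) => (0, PeaData { snap: x }),
        Pea::Snow => (1, PeaData { snow: () }),
    }
}

// Rebuild the enum from (tag, data). Reading a union field is unsafe:
// the tag must match the field that was last written.
fn reknit(tag: usize, value: PeaData) -> Pea {
    match tag {
        0 => Pea::Snap(unsafe { value.snap }),
        1 => Pea::Snow,
        _ => unreachable!(),
    }
}

fn main() {
    let (tag, value) = cleave(Pea::Snap(7));
    assert!(matches!(reknit(tag, value), Pea::Snap(7)));
}
```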

The implementation all happens with the wizardry that is proc-macros.
#[derive(Phenotype)] is the workhorse of this project.

#[derive(Phenotype)] takes a look at your enum and first generates some
"auxiliary" types like so:

enum ThreeTypes<T> {
    NamedFields {
        one: T,
        two: usize
    },
    Tuple(usize, usize),
    Empty
}

// Represents the `NamedFields` variant
struct __PhenotypeInternalThreeTypesNamedFieldsData<T> {
    one: T,
    two: usize,
}

// Represents the `Tuple` variant
struct __PhenotypeInternalThreeTypesTupleData(usize, usize);

#[allow(non_snake_case)]
union __PhenotypeInternalThreeTypesData<T> {
    NamedFields: ManuallyDrop<__PhenotypeInternalThreeTypesNamedFieldsData<T>>,
    Tuple: ManuallyDrop<__PhenotypeInternalThreeTypesTupleData>,
    Empty: (),
}

Then, it generates the cleave method. The generated code for this example
looks like:

fn cleave(self) -> (usize, Self::Value) {
    match &*ManuallyDrop::new(self) {
        ThreeTypes::Empty => (2usize, __PhenotypeInternalThreeTypesData { Empty: () }),
        ThreeTypes::Tuple(_0, _1) => (
            1usize,
            __PhenotypeInternalThreeTypesData {
                Tuple: ManuallyDrop::new(__PhenotypeInternalThreeTypesTupleData(
                    unsafe { ::core::ptr::read(_0) },
                    unsafe { ::core::ptr::read(_1) },
                )),
            },
        ),
        ThreeTypes::NamedFields { one, two } => (
            0usize,
            __PhenotypeInternalThreeTypesData {
                NamedFields: ManuallyDrop::new(__PhenotypeInternalThreeTypesNamedFieldsData::<
                    T,
                > {
                    one: unsafe { ::core::ptr::read(one) },
                    two: unsafe { ::core::ptr::read(two) },
                }),
            },
        ),
    }
}

All we're doing is matching on the enum variant and reading out each field
into the correct auxiliary struct.

reknit does the opposite. Based on the tag, it reads the union and
reconstructs an enum variant from the data contained in the auxiliary struct.

fn reknit(tag: usize, value: Self::Value) -> ThreeTypes<T> {
    match tag {
        2usize => ThreeTypes::Empty,
        1usize => {
            let data =
                ManuallyDrop::<__PhenotypeInternalThreeTypesTupleData>::into_inner(unsafe {
                    value.Tuple
                });
            ThreeTypes::Tuple(data.0, data.1)
        }
        0usize => {
            let data =
                ManuallyDrop::<__PhenotypeInternalThreeTypesNamedFieldsData<T>>::into_inner(
                    unsafe { value.NamedFields },
                );
            ThreeTypes::NamedFields {
                one: data.one,
                two: data.two,
            }
        }
        _ => unreachable!(),
    }
}

When not to use Peapod

  • Sometimes enums are niche-optimized, meaning the compiler has found a
    clever way to elide the tag. The canonical example is Option<NonNull<T>>:
    since a NonNull<T> cannot be null, the compiler can use the null pointer
    to represent the None variant. This works because the None variant doesn't
    actually contain a NonNull<T>. In summary, any valid pointer bit pattern
    represents a Some variant, and the null pointer represents the None
    variant, so there is no need to store a tag.
  • Sometimes Peapod won't produce a smaller representation. You can check
    this using the provided IS_MORE_COMPACT constant.
  • You don't have an allocator. I'm working on a fixed-size Peapod but it
    seems like it's going to be difficult as long as const generics are
    incomplete.
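The niche optimization in the first point is easy to observe (a small check using std::mem::size_of; the second assertion assumes a target where usize has no niche, which is the case on mainstream platforms):

```rust
use std::mem::size_of;
use std::ptr::NonNull;

fn main() {
    // NonNull<T> can never be null, so the compiler reuses the null
    // bit pattern to encode None: no separate tag is needed.
    assert_eq!(size_of::<Option<NonNull<u8>>>(), size_of::<NonNull<u8>>());

    // By contrast, Option<usize> has no niche and needs a real tag.
    assert_eq!(size_of::<Option<usize>>(), 2 * size_of::<usize>());
}
```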

When I try to use the Phenotype derive macro, I get an import error:

error[E0433]: failed to resolve: use of undeclared crate or module `phenotype_internal`
 --> src/main.rs:3:10
  |
3 | #[derive(Phenotype)]
  |          ^^^^^^^^^ use of undeclared crate or module `phenotype_internal`
  |
  = note: this error originates in the derive macro `Phenotype` (in Nightly builds, run with -Z macro-backtrace for more info)

It seems like the phenotype-internal crate has not been published to crates.io, so other crates have no way to import it; is this an oversight?

Hmm, it's on crates.io here. I'm getting the same error though. I'll post an update when I make progress. If you have any ideas let me know :slight_smile:

And thank you so much for the report!

EDIT: Now fixed, found the source of the error in the proc-macro. Should've read the error message more closely :person_facepalming:

A few more bugs I found while playing around with this:

  • The Phenotype derive macro triggers the deny-by-default ambiguous_associated_items lint if the enum has a variant named Value or PEAPOD_SIZE:

    /*
    [dependencies]
    peapod = "=0.1.5"
    */
    
    use peapod::Phenotype;
    
    #[derive(Phenotype)]
    enum Test {
        Value
    }
    
    error: ambiguous associated item
      --> src/main.rs:8:10
       |
    8  | #[derive(Phenotype)]
       |          ^^^^^^^^^
       |
       = note: `#[deny(ambiguous_associated_items)]` on by default
       = warning: this was previously accepted by the compiler but is being phased out; it will become a hard error in a future release!
       = note: for more information, see issue #57644 <https://github.com/rust-lang/rust/issues/57644>
    note: `Value` could refer to the variant defined here
      --> src/main.rs:10:5
       |
    10 |     Value
       |     ^^^^^
    note: `Value` could also refer to the associated type defined here
      --> /home/lm978/.cargo/registry/src/github.com-1ecc6299db9ec823/phenotype-internal-0.1.0/src/lib.rs:31:5
       |
    31 |     type Value;
       |     ^^^^^^^^^^^
       = note: this error originates in the derive macro `Phenotype` (in Nightly builds, run with -Z macro-backtrace for more info)
    

    It seems like the fix would be to explicitly refer to <Self as Phenotype>::Value and <Self as Phenotype>::PEAPOD_SIZE in the proc macro output.

  • If an enum only has a single variant, putting any value into a Peapod results in a double panic at runtime:

    /*
    [dependencies]
    peapod = "=0.1.5"
    */
    
    use peapod::{peapod, Phenotype};
    
    #[derive(Phenotype)]
    enum Holder {
        Variant,
    }
    
    fn main() {
        let _pp = peapod![Holder::Variant];    
    }
    
    thread 'main' panicked at 'cannot store 64 bits from a 0-bit region', /home/lm978/.cargo/registry/src/github.com-1ecc6299db9ec823/bitvec-1.0.1/src/field.rs:526:5
    note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
    thread 'main' panicked at 'cannot load 64 bits from a 0-bit region', /home/lm978/.cargo/registry/src/github.com-1ecc6299db9ec823/bitvec-1.0.1/src/field.rs:526:5
    stack backtrace:
       0:     0x55b940403a3d - std::backtrace_rs::backtrace::libunwind::trace::h8217d0a8f3fd2f41
                                   at /rustc/4b91a6ea7258a947e59c6522cd5898e7c0a6a88f/library/std/src/../../backtrace/src/backtrace/libunwind.rs:93:5
       1:     0x55b940403a3d - std::backtrace_rs::backtrace::trace_unsynchronized::h308103876b3af410
                                   at /rustc/4b91a6ea7258a947e59c6522cd5898e7c0a6a88f/library/std/src/../../backtrace/src/backtrace/mod.rs:66:5
       2:     0x55b940403a3d - std::sys_common::backtrace::_print_fmt::hc208018c6153605e
                                   at /rustc/4b91a6ea7258a947e59c6522cd5898e7c0a6a88f/library/std/src/sys_common/backtrace.rs:66:5
       3:     0x55b940403a3d - <std::sys_common::backtrace::_print::DisplayBacktrace as core::fmt::Display>::fmt::hf89a7ed694dfb585
                                   at /rustc/4b91a6ea7258a947e59c6522cd5898e7c0a6a88f/library/std/src/sys_common/backtrace.rs:45:22
       4:     0x55b94041ddbc - core::fmt::write::h21038c1382fe4264
                                   at /rustc/4b91a6ea7258a947e59c6522cd5898e7c0a6a88f/library/core/src/fmt/mod.rs:1197:17
       5:     0x55b940401ce1 - std::io::Write::write_fmt::h7dbb1c9a3c254aef
                                   at /rustc/4b91a6ea7258a947e59c6522cd5898e7c0a6a88f/library/std/src/io/mod.rs:1672:15
       6:     0x55b9404050c5 - std::sys_common::backtrace::_print::h4e8889719c9ddeb8
                                   at /rustc/4b91a6ea7258a947e59c6522cd5898e7c0a6a88f/library/std/src/sys_common/backtrace.rs:48:5
       7:     0x55b9404050c5 - std::sys_common::backtrace::print::h1506fe2cb3022667
                                   at /rustc/4b91a6ea7258a947e59c6522cd5898e7c0a6a88f/library/std/src/sys_common/backtrace.rs:35:9
       8:     0x55b9404050c5 - std::panicking::default_hook::{{closure}}::hd9d7ce2a8a782440
                                   at /rustc/4b91a6ea7258a947e59c6522cd5898e7c0a6a88f/library/std/src/panicking.rs:295:22
       9:     0x55b940404de6 - std::panicking::default_hook::h5b16ec25444b1b5d
                                   at /rustc/4b91a6ea7258a947e59c6522cd5898e7c0a6a88f/library/std/src/panicking.rs:314:9
      10:     0x55b940405656 - std::panicking::rust_panic_with_hook::hb0138cb6e6fea3e4
                                   at /rustc/4b91a6ea7258a947e59c6522cd5898e7c0a6a88f/library/std/src/panicking.rs:698:17
      11:     0x55b940405547 - std::panicking::begin_panic_handler::{{closure}}::h4cb67095557cd1aa
                                   at /rustc/4b91a6ea7258a947e59c6522cd5898e7c0a6a88f/library/std/src/panicking.rs:588:13
      12:     0x55b940403ef4 - std::sys_common::backtrace::__rust_end_short_backtrace::h2bfcac279dcdc911
                                   at /rustc/4b91a6ea7258a947e59c6522cd5898e7c0a6a88f/library/std/src/sys_common/backtrace.rs:138:18
      13:     0x55b940405279 - rust_begin_unwind
                                   at /rustc/4b91a6ea7258a947e59c6522cd5898e7c0a6a88f/library/std/src/panicking.rs:584:5
      14:     0x55b9403de5e3 - core::panicking::panic_fmt::h1de71520faaa17d3
                                   at /rustc/4b91a6ea7258a947e59c6522cd5898e7c0a6a88f/library/core/src/panicking.rs:142:14
      15:     0x55b9403e4e2b - bitvec::field::check::h0a41a79a9621d4d1
                                   at /home/lm978/.cargo/registry/src/github.com-1ecc6299db9ec823/bitvec-1.0.1/src/field.rs:526:2
      16:     0x55b9403ec8ad - <bitvec::slice::BitSlice<T> as bitvec::field::BitField>::load_le::h68dc3f98d4f69a57
                                   at /home/lm978/.cargo/registry/src/github.com-1ecc6299db9ec823/bitvec-1.0.1/src/field.rs:121:3
      17:     0x55b9403ec583 - bitvec::field::BitField::load::h164de7ba2ce57e07
                                   at /home/lm978/.cargo/registry/src/github.com-1ecc6299db9ec823/bitvec-1.0.1/src/field.rs:49:4
      18:     0x55b9403dfdff - peapod::peapod::Peapod<T>::get_tag::hb33f43000f58bc9a
                                   at /home/lm978/.cargo/registry/src/github.com-1ecc6299db9ec823/peapod-0.1.5/src/peapod.rs:76:9
      19:     0x55b9403dfa6c - peapod::peapod::Peapod<T>::pop::h8598d099f9d51852
                                   at /home/lm978/.cargo/registry/src/github.com-1ecc6299db9ec823/peapod-0.1.5/src/peapod.rs:118:26
      20:     0x55b9403e3d58 - <peapod::peapod::Peapod<T> as core::ops::drop::Drop>::drop::h9c9500ac03e9faea
                                   at /home/lm978/.cargo/registry/src/github.com-1ecc6299db9ec823/peapod-0.1.5/src/peapod.rs:193:15
      21:     0x55b9403e3b63 - core::ptr::drop_in_place<peapod::peapod::Peapod<wat::Holder>>::hb7dd24610791f155
                                   at /rustc/4b91a6ea7258a947e59c6522cd5898e7c0a6a88f/library/core/src/ptr/mod.rs:487:1
      22:     0x55b9403ebfd5 - wat::main::ha31f3a69410ca4e0
                                   at /home/lm978/wat/src/main.rs:14:15
      23:     0x55b9403e39eb - core::ops::function::FnOnce::call_once::haf57eb7d16cd513f
                                   at /rustc/4b91a6ea7258a947e59c6522cd5898e7c0a6a88f/library/core/src/ops/function.rs:248:5
      24:     0x55b9403df78e - std::sys_common::backtrace::__rust_begin_short_backtrace::ha5da18da33762ddc
                                   at /rustc/4b91a6ea7258a947e59c6522cd5898e7c0a6a88f/library/std/src/sys_common/backtrace.rs:122:18
      25:     0x55b9403e06a1 - std::rt::lang_start::{{closure}}::h661c473f48db4c52
                                   at /rustc/4b91a6ea7258a947e59c6522cd5898e7c0a6a88f/library/std/src/rt.rs:145:18
      26:     0x55b9403ff8ee - core::ops::function::impls::<impl core::ops::function::FnOnce<A> for &F>::call_once::h4937aaa125c8d4b2
                                   at /rustc/4b91a6ea7258a947e59c6522cd5898e7c0a6a88f/library/core/src/ops/function.rs:280:13
      27:     0x55b9403ff8ee - std::panicking::try::do_call::h6f5c70e8b0a34f92
                                   at /rustc/4b91a6ea7258a947e59c6522cd5898e7c0a6a88f/library/std/src/panicking.rs:492:40
      28:     0x55b9403ff8ee - std::panicking::try::h68766ba264ecf2e2
                                   at /rustc/4b91a6ea7258a947e59c6522cd5898e7c0a6a88f/library/std/src/panicking.rs:456:19
      29:     0x55b9403ff8ee - std::panic::catch_unwind::hc36033d2f9cc04af
                                   at /rustc/4b91a6ea7258a947e59c6522cd5898e7c0a6a88f/library/std/src/panic.rs:137:14
      30:     0x55b9403ff8ee - std::rt::lang_start_internal::{{closure}}::h78c037f4a1a28ded
                                   at /rustc/4b91a6ea7258a947e59c6522cd5898e7c0a6a88f/library/std/src/rt.rs:128:48
      31:     0x55b9403ff8ee - std::panicking::try::do_call::he6e1fffda4c750ee
                                   at /rustc/4b91a6ea7258a947e59c6522cd5898e7c0a6a88f/library/std/src/panicking.rs:492:40
      32:     0x55b9403ff8ee - std::panicking::try::h48a77ddbb2f4c87a
                                   at /rustc/4b91a6ea7258a947e59c6522cd5898e7c0a6a88f/library/std/src/panicking.rs:456:19
      33:     0x55b9403ff8ee - std::panic::catch_unwind::hfa809b06a550a9e7
                                   at /rustc/4b91a6ea7258a947e59c6522cd5898e7c0a6a88f/library/std/src/panic.rs:137:14
      34:     0x55b9403ff8ee - std::rt::lang_start_internal::h4db69ed48eaca005
                                   at /rustc/4b91a6ea7258a947e59c6522cd5898e7c0a6a88f/library/std/src/rt.rs:128:20
      35:     0x55b9403e0670 - std::rt::lang_start::hb0380bec32c62397
                                   at /rustc/4b91a6ea7258a947e59c6522cd5898e7c0a6a88f/library/std/src/rt.rs:144:17
      36:     0x55b9403ec0ac - main
      37:     0x7f4bcd31bd90 - __libc_start_call_main
                                   at ./csu/../sysdeps/nptl/libc_start_call_main.h:58:16
      38:     0x7f4bcd31be40 - __libc_start_main_impl
                                   at ./csu/../csu/libc-start.c:392:3
      39:     0x55b9403de7d5 - _start
      40:                0x0 - <unknown>
    thread panicked while panicking. aborting.
    Aborted (core dumped)
    
  • If an IntoIter is dropped before it has yielded all of its values, it will leak the remaining values (illustrated here using Miri's leak checker):

    /*
    [dependencies]
    peapod = "=0.1.5"
    
    cargo +nightly miri run
    */
    
    use peapod::{peapod, Phenotype};
    
    #[derive(Phenotype)]
    enum MyOption<T> {
        None,
        Some(T),
    }
    
    fn main() {
        let pp = peapod![MyOption::Some("Hello, world!".to_owned())];
        drop(pp.into_iter());
    }
    
    The following memory was leaked: alloc1915 (Rust heap, size: 13, align: 1) {
        48 65 6c 6c 6f 2c 20 77 6f 72 6c 64 21          │ Hello, world!
    }
    
    error: the evaluated program leaked memory
    
    note: pass `-Zmiri-ignore-leaks` to disable this check
    

Once again, thank you so much!

The solution for the first part is indeed just to specify <Self as Phenotype>.

For the second part, I changed the macro slightly so that it sets T::BITS to 1 for one-variant enums. This avoids having to check whether T::BITS == 0 (a definite maintenance hazard) every time it's used, and is very cheap.

The last part, where IntoIter doesn't drop its remaining elements, was an oversight on my part, and is now fixed.

I've batched all the fixes together in version 0.1.6.

I really appreciate the support :white_heart:!!


I notice that the Phenotype::reknit() method can cause UB if called incorrectly (as noted in its documentation). Shouldn't it be declared as an unsafe function?

Also, are users allowed to implement Phenotype manually? The documentation discourages it, but one of the derive macro's error messages suggests that it is allowed:

    // Abort if there are const generics - works funky with the way we deal with generics
    if ast.generics.const_params().next().is_some() {
        abort!(
            ty_generics,
            "const generics are not supported for `#[derive(Phenotype)]`";
            note = "it may be possible to implement `Phenotype` by hand"
        )
    }

If so, then there are use-after-free soundness concerns with IntoIter. If not, then Phenotype should either be marked as an unsafe trait with the condition that users not implement it manually, or include #[doc(hidden)] items that only the proc macro knows about.


I don't think that disclaimer is needed. At least on a cursory reading of your README, it doesn't seem like you're doing anything fundamentally unsound or problematic. The code must be carefully written and properly documented, and the API may require some adjustments, but the core logic seems well-defined and sound, so all of that is in the bugfix category.

Why? I can see why you can't iterate by &mut, since that would allow changing the enum variant, which is impossible given your encoding (or is it? it seems like with a fixed-width tag encoding that may be possible). But why can't I iterate by &? Is it because there is no enum variant actually located anywhere in memory? I believe it should be possible for the iterator to provide a "scratch space", where an enum variant is constructed on each iteration step, and a & to the scratch space is returned. Once the iterator advances or is dropped, the scratch space can be safely erased, without dropping the contents.

Although, interior mutability may cause unsolvable problems. However, the macro could derive an extra Freeze-like unsafe trait which would be implemented only for types without interior mutability. It could also be unsafely implemented by the user for specific types which don't have transitive interior mutability.

If your goal is super compact storage, shouldn't you make the variant structs #[repr(packed)]? It seems like you're wasting space for alignment, while you don't provide pointers to in-storage data of variants anyway.

In fact, it seems like even combining the structs into a union __PhenotypeInternalThreeTypesData is wasted space, since the variants can be quite different in size. The difference can easily be larger than the enum discriminant, which can be constrained to a smaller numeric type using #[repr(u8)]. If you use a variable-width encoding of variants, you can store them without any wasted space at all.

Fixed-width representation has its own benefits, but it looks like your API is centered on iteration anyway.


That (a “lending iterator”) isn't possible given the definition of the Rust Iterator trait, because the Item type of the iterator can't be a borrow of the iterator.


Ah, yes. Indeed, that's a canonical case of needing GATs. Well, looks like they will be stabilized in the near future, so that can be solved on stable in a couple of releases.

Thank you for your thoughts!

I mainly did not use #[repr(packed)] because the section in the Rustonomicon heavily discouraged it. I never take a reference to a field, but the Rustonomicon doesn't guarantee that just reading a field is fine either. I want this to be able to run on architectures that might be associated with more space-constrained systems.

For simple cases like directly loading or storing a packed field, the compiler might be able to paper over alignment issues with shifts and masks. (Italics added by me)

But if loading from a packed struct is always fine, I think the speed tradeoff is probably worth it ™️.

I'm not sure how to do this. How would you know how much space to allocate ahead of time? Also, how would each variant be represented in the type system (tag not included)?

Good point. I'm also leaning towards marking Phenotype as unsafe as extreme care must be taken to implement it properly.

Despite the "unsafety", I still think you should be able to implement Phenotype manually, since there are use cases the derive macro doesn't cover. I will make the library's stance more clear and add more warnings!

Could you elaborate on this? Do you mean there would be use-after-free concerns if the trait is implemented incorrectly (i.e. by a human)?

Actually, looking at it again, it seems like a double-free occurs even with derived Phenotype impls since your commit a113ff7:

/*
[dependencies]
peapod = "=0.1.6"
*/

use peapod::{peapod, Phenotype};

#[derive(Phenotype)]
enum Holder<T> {
    Value(T),
}

fn main() {
    let pp = peapod![Holder::Value("Hello, world!".to_owned())];
    let _iter = pp.into_iter();
}
free(): double free detected in tcache 2
Aborted (core dumped)

The problem is that <IntoIter as Drop>::drop() goes through all the Ts and drops them; then, <Peapod as Drop>::drop() goes through all the Ts and drops them again. This can also manifest as a use-after-free if the value is taken out first:

/*
[dependencies]
peapod = "=0.1.6"
*/

use peapod::{peapod, Phenotype};

#[derive(Phenotype)]
enum Holder<T> {
    Value(T),
}

fn main() {
    let pp = peapod![Holder::Value("Hello, world!".to_owned())];
    let mut iter = pp.into_iter();
    let s = match iter.next().unwrap() { Holder::Value(s) => s };
    drop(iter);
    println!("{s}");
}
�W�d�4
free(): double free detected in tcache 2
Aborted (core dumped)

The IntoIter must be careful to drop only the values it hasn't given out. In the old implementation (before 0.1.6), it would drop Values that it had already copied into returned Ts, so if a user-specified Value type didn't use ManuallyDrop, it would result in a UAF.


Note that the Rustonomicon isn't the normative reference for Rust's unsafe behaviour (although from a practical standpoint it's pretty close).

Using #[repr(packed)] is rightly discouraged, because it's hard to use correctly, can hurt performance, and isn't actually needed most of the time. But your use case is already using some serious unsafe code, so dealing with #[repr(packed)] is within the scope, and you are already working under the assumption that you are so space-constrained that complex unsafe and speed penalties are a worthwhile tradeoff.

Reading and writing fields of a #[repr(packed)] is fine, the compiler will emit the proper access code for you. A long time ago there was an issue that copying could cause UB due to taking references in the Clone::clone, but I believe this is now fixed. However, you need to make sure that you never create a &/&mut to a packed field, since such fields are generally unaligned, but references in Rust must always point to aligned memory.

Producing a reference to packed fields should become a hard error in the future, but in the meantime it's best to guard against it by only accessing packed fields through unaligned pointers. Again, that was impossible to do soundly in the past, since there was no way to create a pointer without going through a reference. However, now that core::ptr::{addr_of!, addr_of_mut!} are stable, this is quite easy to do. You need to create pointers to packed fields via those macros, and read/write data using read_unaligned/write_unaligned methods. Just make sure not to call any methods on the fields themselves.
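A minimal sketch of that pattern (the struct is made up for illustration, not from peapod):

```rust
use std::ptr::addr_of;

#[repr(packed)]
struct Packed {
    a: u8,
    b: u32, // stored at offset 1: unaligned
}

fn main() {
    assert_eq!(std::mem::size_of::<Packed>(), 5); // no padding

    let p = Packed { a: 1, b: 0xDEAD_BEEF };

    // Never write `&p.b`: references must be aligned. Instead, take a
    // raw pointer with addr_of! and read through it with read_unaligned.
    let b = unsafe { addr_of!(p.b).read_unaligned() };
    assert_eq!(b, 0xDEAD_BEEF);
}
```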

You would represent your buffer of data as Vec<u8> instead of Vec<union> (or some other backing storage instead of Vec, e.g. ArrayVec<u8>, or &mut MaybeUninit<u8> for arbitrary user-provided storage). At the moment reading/writing the data is simple: you just pop/push a new union to the Vec. With variable-length representation, you would have to do actual (de)serialization of data into the Vec.

I don't understand the question "how much space to allocate". If you're using a growable buffer, e.g. Vec, you can ignore that question and just let it grow as large as needed. As a performance optimization you may use Vec::with_capacity to preallocate a buffer based on some size estimate of all your data. The estimate can be done in various ways based on your use case and desired tradeoff between performance, accuracy and memory overhead. The simplest estimate would be N * mem::size_of::<LargestVariant>(), or you can iterate the buffer of enums and compute the exact total size. Of course, that is valid under the assumption that all instances of enum are known beforehand, otherwise reallocation or wasted memory are a given.

The variants would be represented by the same structs as you're doing now, but without combining them into a union for storage. When adding a new variant to the Peapod, you just add a new tag to the bitvec of tags, and write the data struct into the data buffer by using core::ptr::copy_nonoverlapping. When reading variants, you would keep the position in the tag buffer and the data buffer. First read the tag to know the next variant, then read the value of the proper packed variant struct from the data buffer (this will likely require dealing with MaybeUninit<VariantStruct>). Advance the positions in the tag and data buffer, and convert the packed struct into an enum just as now.
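A simplified sketch of that variable-width scheme, using to_le_bytes instead of raw pointer copies for clarity (the enum and names are made up for illustration):

```rust
use std::convert::TryInto;

enum Pea {
    Snap(u32),
    Snow,
}

fn main() {
    let peas = vec![Pea::Snap(7), Pea::Snow, Pea::Snap(9)];

    // Serialize: one tag per value, but only variants that carry data
    // contribute bytes to the data buffer.
    let mut tags: Vec<u8> = Vec::new();
    let mut data: Vec<u8> = Vec::new();
    for pea in peas {
        match pea {
            Pea::Snap(x) => {
                tags.push(0);
                data.extend_from_slice(&x.to_le_bytes());
            }
            Pea::Snow => tags.push(1),
        }
    }
    // Snow takes zero data bytes; only the two Snap payloads are stored.
    assert_eq!(data.len(), 8);

    // Deserialize: the tag tells us how many bytes the next value uses.
    let mut pos = 0;
    let mut out = Vec::new();
    for &tag in &tags {
        out.push(match tag {
            0 => {
                let bytes: [u8; 4] = data[pos..pos + 4].try_into().unwrap();
                pos += 4;
                Pea::Snap(u32::from_le_bytes(bytes))
            }
            _ => Pea::Snow,
        });
    }
    assert!(matches!(out[0], Pea::Snap(7)));
    assert!(matches!(out[1], Pea::Snow));
    assert!(matches!(out[2], Pea::Snap(9)));
}
```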


I see what you mean. I'll give implementing it a try.

I'm not exactly sure how to actually hand out the different variant structs since a function only has one return type.

I think instead of handing out a union from Phenotype::cleave, I could return a [u8; (some big enough number to hold every variant)]. Then I would also have a function (proc-macro generated) that matches a tag to the correct number of bytes to read. So I would know how many bytes to serialize, and then when deserializing, I could look up the correct number of bytes to deserialize using the tag.

I definitely might be overthinking this though :sweat_smile:

You could still return a union; you just wouldn't be storing it long term.

But then how would I retrieve the data from the union? I think there would have to be some proc-macro generated code as the union variants would change across each enum we're using Phenotype with.

In regular code, getting my hands on a data struct seems difficult because they vary across every enum. I can't return one, and I also wouldn't know the names of the union variants. If the data leaves proc-macro land in some common form though, I think this could work.

Ahh yeah I see what you mean, returning a byte array with a max size based on the largest variant like you said (and the actual size of the variant in the array) might be the simplest answer.

I guess a trait for the union to do that conversion might be cleaner actually

Yeah I think so too. Would probably also have to be derived. I have so many good ideas to work on now :yum:

Version 0.1.8 published with a bunch of your suggestions: Phenotype made unsafe, auxiliary structs #[repr(packed)], IntoIter soundness fixed.

I'll stop using this forum as a changelog now :wink:

Just wanted to give another MASSIVE thank you to everyone for their advice and support! You're what makes this community amazing <3

