The smallbox crate implements a small-box optimization: it stores small items on the stack and falls back to the heap, like a normal Box, for large items. Since 0.7, smallbox is available on stable Rust (1.28.0+).
Example
use smallbox::SmallBox;
use smallbox::space::S4;
let small: SmallBox<_, S4> = SmallBox::new([0; 2]);
let large: SmallBox<_, S4> = SmallBox::new([0; 32]);
assert_eq!(small.len(), 2);
assert_eq!(large.len(), 32);
assert_eq!(*small, [0; 2]);
assert_eq!(*large, [0; 32]);
assert!(!small.is_heap());
assert!(large.is_heap());
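Whether a value ends up on the stack or the heap in the example above comes down to whether it fits in the chosen space. Here is a minimal sketch of that check using only the standard library. `Space4` and `fits_inline` are hypothetical names for illustration, not part of smallbox, and the sketch assumes that S4 means four machine words of inline space and that both size and alignment have to fit. It illustrates the idea and is not the crate's actual code.

```rust
use std::mem::{align_of, size_of};

// Hypothetical stand-in for smallbox::space::S4 (assumed: four machine words).
#[allow(dead_code)]
struct Space4([usize; 4]);

/// Returns true if a `T` could be stored inline in space `S`,
/// i.e. it is no larger and no more strictly aligned than `S`.
fn fits_inline<T, S>() -> bool {
    size_of::<T>() <= size_of::<S>() && align_of::<T>() <= align_of::<S>()
}

fn main() {
    // [i32; 2] is 8 bytes, which fits in four machine words.
    assert!(fits_inline::<[i32; 2], Space4>());
    // [i32; 32] is 128 bytes, which does not, so it would fall back to the heap.
    assert!(!fits_inline::<[i32; 32], Space4>());
}
```

Under those assumptions this matches the example: the 2-element array stays on the stack and the 32-element array goes to the heap.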
Unsized type
#[macro_use]
extern crate smallbox;
use smallbox::SmallBox;
use smallbox::space::*;
let array: SmallBox<[usize], S2> = smallbox!([0usize, 1]);
assert_eq!(array.len(), 2);
assert_eq!(*array, [0, 1]);
Benchmark vs standard Box
running 6 tests
test box_large_item ... bench: 104 ns/iter (+/- 14)
test box_small_item ... bench: 49 ns/iter (+/- 5)
test smallbox_large_item_large_space ... bench: 52 ns/iter (+/- 6)
test smallbox_large_item_small_space ... bench: 106 ns/iter (+/- 25)
test smallbox_small_item_large_space ... bench: 18 ns/iter (+/- 1)
test smallbox_small_item_small_space ... bench: 2 ns/iter (+/- 0)
test result: ok. 0 passed; 0 failed; 0 ignored; 6 measured; 0 filtered out
It might be good to add which stable version of Rust you are targeting in the announcement (for example, is it now available on stable because of a recently stabilized feature?).
Would it be possible to perform some synthetic benchmarks on how expensive copying each of those boxes is? Things that live on the stack tend to get copied around as fns get called and return, which can substantially affect performance depending on the use case.
The compiler usually optimizes return values and let rebindings, so the move and its overhead may not show up the way we would expect. Still, it is a good idea to test it to learn the worst case.
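A rough way to probe that worst case, sketched with the standard library only: force a value through a non-inlined function many times and compare a large inline payload with a boxed one. `Inline`, `pass_through` and `time_moves` are hypothetical names, and `Inline` only stands in for a SmallBox with a large inline space; it is not the crate's type. The timings depend on the machine and optimization level, so treat this as a starting point rather than a proper benchmark.

```rust
use std::hint::black_box;
use std::mem::size_of;
use std::time::{Duration, Instant};

// Hypothetical payload: 32 machine words stored inline, like a large stack space.
struct Inline([usize; 32]);

// Not inlined, and black_box hides the value from the optimizer,
// so the argument and the return value really have to be passed around.
#[inline(never)]
fn pass_through<T>(value: T) -> T {
    black_box(value)
}

// Moves `value` in and out of `pass_through` `iters` times and reports the elapsed time.
fn time_moves<T>(mut value: T, iters: u32) -> (T, Duration) {
    let start = Instant::now();
    for _ in 0..iters {
        value = pass_through(value);
    }
    (value, start.elapsed())
}

fn main() {
    let iters = 1_000_000;
    let (_inline, t_inline) = time_moves(Inline([7; 32]), iters);
    let (_boxed, t_boxed) = time_moves(Box::new(Inline([7; 32])), iters);

    // Moving the inline value copies all of its bytes; moving the Box copies one pointer.
    println!("inline: {} bytes per move, {:?} total", size_of::<Inline>(), t_inline);
    println!("boxed:  {} bytes per move, {:?} total", size_of::<Box<Inline>>(), t_boxed);
}
```

The size difference is the part that is certain: a move of the inline value copies 32 words, while a move of the Box copies a single pointer. How much of that survives optimization in real code is what the timing is meant to show.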