My example was modified from a real crate; the endianness-specific layout is there for pointer tagging, so just ignore it.
It seems that NonZero* is all we can use for niche optimization. 0xFF can never occur in UTF-8 encoded bytes. For the dynamically sized stringlet, VarStringlet, we can just limit the max length to 0xFE; for the fixed-size stringlet, FixedStringlet, I think we have to waste 1 byte and wait for the generic_const_exprs feature to stabilize if you want to keep the current semantics of SIZE. Here's a PoC.
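To illustrate the 0xFE limit, here is a minimal sketch of a length byte that can never be 0xFF. The `Len` type and its methods are hypothetical names for this illustration, not part of the crate or the PoC. The length is stored bitwise-inverted inside a NonZeroU8, so the forbidden 0xFF maps onto NonZeroU8's forbidden 0. The size check reflects what current rustc does.

```rust
use std::mem::size_of;
use std::num::NonZeroU8;

/// Hypothetical length byte: holds 0..=0xFE, never 0xFF.
/// Stored inverted, so a length of 0xFF would become the forbidden value 0.
#[repr(transparent)]
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct Len(NonZeroU8);

impl Len {
    /// Largest representable length.
    const MAX: u8 = 0xFE;

    /// Returns None for 0xFF, because !0xFF == 0 is rejected by NonZeroU8.
    fn new(len: u8) -> Option<Self> {
        NonZeroU8::new(!len).map(Len)
    }

    /// Undoes the inversion to recover the real length.
    fn get(self) -> u8 {
        !self.0.get()
    }
}

/// Checks the property relied on above: no byte of a valid &str is 0xFF.
fn utf8_never_contains_ff(s: &str) -> bool {
    s.bytes().all(|b| b != 0xFF)
}

fn main() {
    // The niche lets Option reuse the forbidden bit pattern instead of adding a tag.
    assert_eq!(size_of::<Option<Len>>(), 1);
    assert_eq!(Len::new(Len::MAX).map(Len::get), Some(0xFE));
    assert!(Len::new(0xFF).is_none());
    assert!(utf8_never_contains_ff("héllo ✓"));
}
```

The inversion trick only works for a single excluded value; excluding a different value would need a different mapping onto zero.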
It's hard to get an enum niche-optimized: the compiler is not clever enough, so we have to introduce a pivot struct and a union type as in my example. IMO, why not just use VarStringlet? An enum's size is determined by its largest variant anyway.
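A quick sketch of the size point, using hypothetical stand-in types (`Small`, `Large`, `AnySize`) rather than the real stringlet kinds: an enum over differently sized payloads is at least as big as its largest variant, so the small variant gains nothing from being wrapped.

```rust
use std::mem::size_of;

// Hypothetical stand-ins for stringlets of two capacities.
#[allow(dead_code)]
struct Small([u8; 4]);
#[allow(dead_code)]
struct Large([u8; 16]);

// An enum that can hold either one.
#[allow(dead_code)]
enum AnySize {
    Small(Small),
    Large(Large),
}

fn main() {
    // Every value of AnySize occupies room for the largest variant.
    assert!(size_of::<AnySize>() >= size_of::<Large>());
    // So storing a Small in the enum costs far more than a bare Small.
    assert!(size_of::<AnySize>() > size_of::<Small>());
    println!(
        "Small = {}, Large = {}, AnySize = {}",
        size_of::<Small>(),
        size_of::<Large>(),
        size_of::<AnySize>()
    );
}
```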
Rust does have a type NotAllOnes<T> for niche optimization (in core::num::niche_types) which fits our needs, but it's unstable and not part of the public API for now.
I’ve been mulling over the enum (or union) and fail to see the big advantage it would offer. It would allow taking multiple stringlets of arbitrary kinds as parameters, without needing separate generics <Kind, Kind2…>. But that’s only a cosmetic advantage, at the cost of run-time enum matching.
The only extra functionality this would offer is mixing them in a collection. To enable this, I would need four times as many implementations: besides StringletBase<Kind> op StringletBase<Kind2> also StringletBase<Kind> op Stringlet, … Might as well convert each to the kind that best fits all data.
Instead I’m removing the where-bound on all stringlet consumers, leaving it only on the constructors. Then I can rename StringletBase -> Stringlet, giving a fairly easy way to pass around arbitrary kinds.
@tczajka, @cuviper, @hantong Still struggling with the niche. The idea of an enum of valid UTF-8 bytes, which covers fewer values than u8, is lovely. But I’m scared for bigger SlimStringlets, where, as per my existing hack, the last byte can overflow the enum. I want to be sure to prevent the compiler from meddling with that last byte, so as to make this safe. I would need something like the following. As I want to stay on stable Rust, this ability looks far off:
#[repr(C)]
(
[Utf8Byte; { if SIZE > 1 { SIZE - 1 } else { SIZE }}],
[u8; { if SIZE > 1 { 1 } else { 0 }}]
)
I guess I’ll strike a balance by shortening SlimStringlet a little. Then my last byte will never reach 0xff. I’d read that having more than one niche value can be useful for nested types. But is there any notable benefit in excluding more than two or three values?
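For what it's worth, here is a small sketch of how nesting uses up niche values, as observed on current rustc (these layouts are not all formal language guarantees). Each Option layer consumes one spare bit pattern: bool has 254 of them, while NonZeroU8 has exactly one. The helper name `nested_sizes` is just for this illustration.

```rust
use std::mem::size_of;
use std::num::NonZeroU8;

/// Sizes of (Option<Option<bool>>, Option<NonZeroU8>, Option<Option<NonZeroU8>>).
fn nested_sizes() -> (usize, usize, usize) {
    (
        size_of::<Option<Option<bool>>>(),
        size_of::<Option<NonZeroU8>>(),
        size_of::<Option<Option<NonZeroU8>>>(),
    )
}

fn main() {
    let (nested_bool, one_layer, two_layers) = nested_sizes();
    // bool has plenty of spare values, so two Option layers still fit in 1 byte.
    assert_eq!(nested_bool, 1);
    // NonZeroU8's single niche (0) pays for the first Option layer.
    assert_eq!(one_layer, 1);
    // The second layer finds no niche left and needs a separate tag byte.
    assert_eq!(two_layers, 2);
}
```

So each excluded value pays for roughly one level of Option-like wrapping; a handful of niche values already covers the nesting depths that show up in practice.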