Is there a way to avoid serde bloat when using newtype wrappers for UUIDs?

I have a macro that looks like this

macro_rules! make_id {
    ($name:ident) => {
        #[derive(
            Clone, Copy, Debug, Deserialize, Eq, Hash, PartialEq, Serialize, PartialOrd, Ord,
        )]
        pub struct $name(Uuid);

        // Some more ceremony and From impls
    };
}

Then I use the macro like so:

make_id!(ApiType1Id);
make_id!(ApiType2Id);
// And so on ad nauseam

The IDs are then used in structs

#[derive(Serialize, Deserialize)]
struct ApiType1 {
    id: ApiType1Id,
}

My question is: is it possible to avoid all the bloat (derives, additional impls) in some way that still holds these properties:

  • Type system can't confuse the ID types
  • Serialize/Deserialize to/from UUID-strings
  • Bloat is minimized

I assume your goal is to have less generated Rust code? Untested, but maybe you can define something like this:

pub struct WrappedId<Tag>(Uuid, PhantomData<Tag>);
// impls for WrappedId

// In `make_id!`
struct ApiType1IdTag;
type ApiType1Id = WrappedId<ApiType1IdTag>;

Implementing it all (hopefully correctly) yields a binary size increase:

artifact   size in bytes
before     1,042,511
after      1,056,747

#[derive(Deserialize, Serialize)]
pub struct WrappedId<Tag>(
    Uuid,
    PhantomData<Tag>,
);

impl<Tag> Clone for WrappedId<Tag> {
    fn clone(&self) -> Self {
        Self(self.0.clone(), PhantomData)
    }
}

impl<Tag> Copy for WrappedId<Tag> {}

impl<Tag> PartialEq for WrappedId<Tag> {
    fn eq(&self, other: &Self) -> bool {
        self.0.eq(&other.0)
    }
}

impl<Tag> Hash for WrappedId<Tag> {
    fn hash<H: Hasher>(&self, state: &mut H) {
        self.0.hash(state);
    }
}

impl<Tag> PartialOrd for WrappedId<Tag> {
    fn partial_cmp(&self, other: &Self) -> Option<Ordering> {
        self.0.partial_cmp(&other.0)
    }
}

impl<Tag> Ord for WrappedId<Tag> {
    fn cmp(&self, other: &Self) -> Ordering {
        self.0.cmp(&other.0)
    }
}

impl<Tag> Eq for WrappedId<Tag> {}

impl<Tag> Debug for WrappedId<Tag> {
    fn fmt(&self, f: &mut Formatter<'_>) -> fmt::Result {
        write!(f, "WrappedId({})", self.0)
    }
}

impl<Tag> WrappedId<Tag> {
    fn new(uuid: Uuid) -> WrappedId<Tag> {
        Self(uuid, PhantomData)
    }
}

impl<Tag> fmt::Display for WrappedId<Tag> {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        std::fmt::Display::fmt(&self.0, f)
    }
}

impl<Tag> From<Uuid> for WrappedId<Tag> {
    fn from(id: Uuid) -> Self {
        WrappedId::new(id)
    }
}

impl<Tag> FromStr for WrappedId<Tag> {
    type Err = uuid::Error;

    fn from_str(value: &str) -> Result<WrappedId<Tag>, uuid::Error> {
        Uuid::parse_str(value).map(Into::into)
    }
}

impl<Tag> From<WrappedId<Tag>> for Uuid {
    fn from(n: WrappedId<Tag>) -> Uuid {
        n.0
    }
}

impl<Tag> Deref for WrappedId<Tag> {
    type Target = Uuid;

    fn deref(&self) -> &Self::Target {
        &self.0
    }
}

macro_rules! make_id {
    ($name:ident, $tag:ident) => {
        pub struct $tag;
        pub type $name = WrappedId<$tag>;
    };
}

// Lots and lots of make_id! after here
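Here is a dependency-free sketch of the tag approach, standing in `u128` for `Uuid` so it compiles without the uuid or serde crates (the `UserId`/`OrderId` names are just illustrative):

```rust
use std::marker::PhantomData;

// Stand-in for Uuid so the sketch needs no external crates.
type RawId = u128;

pub struct WrappedId<Tag>(RawId, PhantomData<Tag>);

impl<Tag> WrappedId<Tag> {
    pub fn new(raw: RawId) -> Self {
        Self(raw, PhantomData)
    }
    pub fn raw(&self) -> RawId {
        self.0
    }
}

macro_rules! make_id {
    ($name:ident, $tag:ident) => {
        pub struct $tag;
        pub type $name = WrappedId<$tag>;
    };
}

make_id!(UserId, UserIdTag);
make_id!(OrderId, OrderIdTag);

// Only accepts a UserId; passing an OrderId is a type error.
fn take_user(id: UserId) -> RawId {
    id.raw()
}

fn main() {
    let user = UserId::new(1);
    let _order = OrderId::new(1);
    // take_user(_order); // compile error: expected WrappedId<UserIdTag>
    println!("{}", take_user(user));
}
```

Because the tag types differ, `UserId` and `OrderId` are distinct types even though all the code behind them is a single generic `WrappedId`.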

Anyone have a better idea? :hugs:

What are you comparing the sizes of? (Just checking)

I don't know how to decrease binary size from a source code perspective in this case. It feels very much in the domain of llvm and the compiler to make the right optimisations.

In general, I'd say avoid overusing generics to reduce code size, but in this case you should end up with basically the same monomorphised code from either approach.

It might be that with --release or other optimization flags passed to cargo, you can get the binary size down. I assume this would help either version of the source code.

You might have some luck with cargo-wizard configuring things for small binary size, in case you weren't aware of it

Hello, thank you for your reply.

I am comparing the size of the output wasm binary that is built using wasm-pack in release mode.

I believe I am using most of the recommended settings for reducing binary size (lto, stripping, wasm-opt, etc.) but I will have a look at cargo-wizard anyway.
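For reference, the settings I mean are roughly the usual size-oriented release profile (the exact values in our project may differ):

```toml
[profile.release]
opt-level = "z"     # optimize for size rather than speed
lto = true          # whole-program link-time optimization
codegen-units = 1   # better optimization at the cost of build time
strip = true        # strip symbols from the binary
panic = "abort"     # drop unwinding machinery
```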

It would have been amazing if Rust had a newtype keyword (as a complement to type X =) that made a completely new type from the type system's POV but didn't need any additional codegen.

The newtype abstraction encompasses much more than just a tiny wrapper around another type.

Digging around, I found this old RFC that wanted to delegate the implementation of traits to the inner type, which would be closer to what you want: RFC: Delegation by elahn · Pull Request #2393 · rust-lang/rfcs · GitHub

Unfortunately the RFC was closed because it hit too many edge cases.

twiggy is a decent tool for analyzing wasm binary size [1]. You may find some valuable information with it, if you haven't already.

For your specific question, reducing monomorphization bloat with a newtype is a unique challenge because Rust doesn't have any way to automatically "pierce the veil" [2] and doing so would undermine the type system (ignoring piecewise delegation proposals, which is opt-in and would probably be nice).

There may be some things you can do as a workaround by changing it up. The best you can do is macro away the boilerplate.

Using #[derive] on your newtypes is the root cause [3]. If you can remove them all and hand-implement the traits needed on the owning type (ApiType1 et al.) then you should clearly see reduced bloat from newtype monomorphization (as there is none). This may require making the inner Uuid public or providing a getter method on the owner type to make it usable by callers. (I also understand that doing this reduces the utility of a newtype in the first place. But I do not specifically know how you are using them.)

Here's an example that hand-impls the serde traits on the owner, since that's what you've shown in the sample code: Rust Playground

Another option I played with for a bit was dynamic dispatch for the derived traits. It can technically work, but you give up type system rejection of bugs for runtime checks.

The last thing to note: are you aware of these resources? They might have some additional wisdom to borrow:


  1. Similar to cargo-bloat. ↩︎

  2. This is a common expression in legalese for LLCs in the US. I felt it was apropos. ↩︎

  3. For some "root". One could also say that using a newtype itself is the root cause. ↩︎


I wonder if you can get a meaningful size reduction by using something like #[serde(from = "Uuid", into = "Uuid")] or #[serde(transparent)] to force the derived newtype implementations to be trivial.


Using #[serde(transparent)] saved 1,228 bytes

Thanks! Twiggy is great, but I stare at it and then I go... "Oh well, there's nothing I can improve here." Like, how do I reasonably remove this:

66520 ┊     4.97% ┊ models::thing::_::<impl serde::de::Deserialize for models::thing::Thing>::deserialize

Yes, it's huge, but everything in it is essential :sob:


Then can you really call it "bloat?"

I've been watching this thread with some interest, and one of the things in the back of my mind as I read it is the question "does any of this matter?"

You've identified - I think correctly - that adding a type plus associated implementations of Serde's traits does increase the size of your binary, and you've made it pretty clear that your goal is to minimize that increase, but one thing that's missing for me is a rationale for why a 1.5% increase in total code size (from this comment) or a 0.1% increase in code size (from this comment) is consequential for you. Do you have a size budget you're trying to meet, for example?

Monomorphizing generic code often involves generating multiple variations on fundamentally-similar routines, varying based on the sizes, layouts, &c of the specific types involved in each variant. Serde makes heavy use of generics in order to support user-defined types with a useful level of type safety, and some level of code size increase necessarily comes with that. Attempting to minimize it at all costs is, perhaps, not the goal I would choose here. Balancing code size vs your other constraints, not least including your own time and debugging attention, might be a more useful perspective. On the other hand, if there is some constraint you're trying to meet, it might help us suggest more systemic approaches if we knew what it was. Micro-optimizing individual types and the associated code generation is probably going to buy you less headroom against those constraints than taking care with the overall design of your program will.
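Sketching that arithmetic from the byte counts quoted earlier in the thread (the percentages are approximate):

```rust
fn main() {
    // Byte counts from the earlier before/after measurement.
    let before = 1_042_511u64;
    let after = 1_056_747u64;
    let delta = after - before;
    let pct_increase = delta as f64 / before as f64 * 100.0;
    println!("WrappedId version adds {delta} bytes (+{pct_increase:.2}%)");

    // Savings reported for #[serde(transparent)].
    let saved = 1_228u64;
    let pct_saved = saved as f64 / after as f64 * 100.0;
    println!("#[serde(transparent)] saved {saved} bytes ({pct_saved:.2}%)");
}
```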


To answer, if it isn't clear, those (small) size changes are absolutely uninteresting to me.

I am delivering a wasm binary to > 20,000 people weekly, and the wasm binary changes about once a week (bugfixes, features). I want the binary to be as small as possible, and obviously HTTP transfer with gzip does a tremendous job reducing the size, but it's still kind of an uneven match: our 60kLoC Rust library (which we want to keep expanding... a lot!) is about the same size as 400,000 LoC of TypeScript. (Both including transitive dependencies)

I would love to migrate everything I have to miniserde (or whatever dream world introwospection would have brought - flying cars?), but it's missing a lot of features I need so I can't do that. Also everything else is built on serde.

I love Rust, it is truly a marvellous programming language, it is helping us deliver a cross-platform library for web, iOS, and desktop, but the binary size (particularly on web) is not good. I don't even want to think of the binary size when this project starts to reach > 300kLoC.


There's a chance that the serde impls are already trivial, and LLVM can do identical code folding on it to "erase" the bloat!

I played around with wasm32-unknown-unknown, wasm32-wasip1, and wasm32-wasip2 targets and saw some of it:

.set wasm_serde::ApiType2Id::new, wasm_serde::ApiType1Id::new
.set wasm_serde::ApiType3Id::new, wasm_serde::ApiType1Id::new
.set wasm_serde::ApiType4Id::new, wasm_serde::ApiType1Id::new

// ...

But #[derive(Deserialize)] has some ineligibilities, because each newtype gets its own special Visitor::expecting(). And that only happens because the derive emits strings that contain the newtype name:

.L__unnamed_1:
        .ascii  "tuple struct ApiType1Id"

.L__unnamed_2:
        .ascii  "tuple struct ApiType2Id"

// ...

If that can be avoided [1], then LLVM can do the bulk of the work. I didn't see any other serde code for the newtypes, possibly because it's folded away.

You might not actually have problems caused by the monomorphization, then!


  1. It doesn't look like it can be avoided! serde/serde_derive/src/de.rs at 46e9ecfcdd5216929ebcf29e76adc072412c5380 · serde-rs/serde ↩︎


Obviously TS and Rust are not comparable 1:1, because a web browser is "the standard library", but I want to deliver business value here. I'm hoping for the experimental wasm code splitting things to get a champion, and keeping my eyes open for any other binary size reducing techniques I can find.

I find this interesting! How are you counting LoC? Perhaps it's double-counting some stuff on the TypeScript dependencies side for CommonJS builds, but that hardly covers that level of disparity.


I used tokei to count the dependencies:

TS repo:

Type         Lines
TSX          241,195
TypeScript   73,207

EDIT: Originally wrote 331,331 LoC for TSX but then I realized I had double-counted our tooling code, so I have adjusted by removing ~90kLoC of tooling code

node_modules (obviously most of this never ends up in the bundle):

Type         Lines
JSX          18,266
JavaScript   5,285,552

Rust repo:

Type         Lines
Rust         54,563

EDIT2: Obviously the Rust code contains some tooling as well, but let's ignore that for now; about 5kLoC

Bundle sizes:

Bundle    Gzipped size
vendors   677kB
main      631kB
wasm      289kB

Yeah, 5Mloc in node_modules is the real JS experience!

Looks like it's around ~2 bytes per TS line, and (handwaving some alloc, etc) ~4-5 bytes per Rust line.
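A back-of-the-envelope check, assuming the main bundle corresponds to the TS code and the wasm bundle to the Rust code (the vendors bundle muddies this mapping somewhat):

```rust
fn main() {
    // Line counts and gzipped bundle sizes from the tables above.
    let ts_lines = 241_195 + 73_207; // TSX + TypeScript
    let rust_lines = 54_563;
    let main_bundle = 631_000_f64; // main bundle, gzipped bytes
    let wasm_bundle = 289_000_f64; // wasm bundle, gzipped bytes

    println!("TS:   {:.1} bytes/line", main_bundle / ts_lines as f64);
    println!("Rust: {:.1} bytes/line", wasm_bundle / rust_lines as f64);
}
```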

This actually seems a little low for both; perhaps you're using the full counts rather than just code, but even so it's a lot closer to what I'd expect: minified JS is pretty dense.


Yeah, I included comments and empty lines for both TS and Rust, so you probably need to discount 5-10% for comments, but you already figured that out.

If I could wish for anything from santa this year it would be stable and supported Rust wasm code splitting :heart: