String/&str optimization question

Say I have a WASM application that handles lots of class names.
Many of the names have repeated characters.
Think bootstrap: btn, btn-primary, btn-secondary, etc.

I can imagine two ways to handle these.

  1. Optimize for speed. Make everything a &'static str and never concatenate any values, since concatenation would require String.

  2. Optimize for size: break the classes apart into sections like "btn", "primary", "secondary", "-" and type each as a &'static str. Then construct the full class name as a String at runtime. For example,

const BTN: &'static str = "btn";
const PRIMARY: &'static str = "primary";
const DASH: &'static str = "-";

// Preallocate the exact capacity, then build "btn-primary" piece by piece.
let mut btn_primary = String::with_capacity(BTN.len() + DASH.len() + PRIMARY.len());
btn_primary.push_str(BTN);
btn_primary.push_str(DASH);
btn_primary.push_str(PRIMARY);

Note: this is avoiding using format! because it tends to bloat WASM apps.

I'm curious about everyone's opinions. Would the size benefit of using option 2 ever possibly be good enough to justify the loss of speed? Is the loss of speed really that significant anyway? Like, imagine I want to optimize size over all else because I assume Rust is already much faster than JS, but I want to minimize bundle size as much as possible.

I think you'd really have to profile doing it both ways to know if it was worth it. In general I would think the size savings are going to be fairly small in practice, though. The code to recombine the strings is going to take up space too, and class strings usually aren't THAT long.

If you're primarily worried about download size, looking into setting up compression on your server might be more reliable at reducing page sizes.

4 Likes

Reminder that you can save a bunch of typing here with concat:

let btn_primary = [BTN, DASH, PRIMARY].concat();

(That does the correct capacity calculation and such as part of it.)
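For instance, here's a self-contained sketch (reusing the const names from the earlier post) showing that concat produces the same string as the manual push_str version:

```rust
const BTN: &'static str = "btn";
const PRIMARY: &'static str = "primary";
const DASH: &'static str = "-";

fn main() {
    // concat computes the total length up front, much like the manual
    // with_capacity + push_str approach.
    let via_concat = [BTN, DASH, PRIMARY].concat();

    let mut via_push = String::with_capacity(BTN.len() + DASH.len() + PRIMARY.len());
    via_push.push_str(BTN);
    via_push.push_str(DASH);
    via_push.push_str(PRIMARY);

    assert_eq!(via_concat, "btn-primary");
    assert_eq!(via_concat, via_push);
}
```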

3 Likes

So I decided to go ahead and test this. Check out this repo for the implementation details.

Basically, I took all the form classes from bootstrap. In one cargo workspace (no_concat in the repo), I define each entire class as its own &'static str. In the second (concat), I define only the subsections of the classes (anything separated by a dash), each as its own &'static str. I compiled the code in release mode using the settings most optimized for size. Here are the results:

  1. no_concat: 1312 bytes
  2. concat: 19807 bytes

That's more than 100 times difference in the size. The answer is clear: whether you're optimizing for speed or for size, it seems like not concatenating the strings results in a smaller binary.

Any thoughts? Did I get anything wrong or make any huge mistakes? Let me know.

Inspecting the disassembled wat, it looks like quite a bit of the extra size in concat is coming from additional parts of std that get used when you call concat(). If you were interested in how far you could push the concat version, you might be able to improve things by rolling your own concat. You'd probably need to use some unsafe to get anywhere close, though.

I'm using wasm2wat to disassemble the wasm files to wat.
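A minimal hand-rolled version along those lines might look like the sketch below (no unsafe; whether it actually pulls in less of std than concat() would need to be verified against the disassembled wat):

```rust
/// A minimal stand-in for <[&str]>::concat: sum the lengths, then copy.
/// This is a sketch, not a measured improvement -- check the binary size
/// both ways before relying on it.
fn concat_parts(parts: &[&str]) -> String {
    let total: usize = parts.iter().map(|p| p.len()).sum();
    let mut out = String::with_capacity(total);
    for part in parts {
        out.push_str(part);
    }
    out
}

fn main() {
    assert_eq!(concat_parts(&["btn", "-", "primary"]), "btn-primary");
}
```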

1 Like

Thanks for the tip! I'll try that.

My calculator says it is about 15 times, not more than 100!

2 Likes

You could also try one of the const implementations of concat like in https://crates.io/crates/const_format
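For a rough idea of what such crates do under the hood, here's a stable-Rust sketch that concatenates at compile time, so the joined string lands in the data section with no runtime concatenation code at all. The helper name and the hard-coded output length are my own; crates like const_format compute the length for you via macros:

```rust
/// Concatenate string parts into a fixed-size byte buffer at compile time.
/// N must equal the total length of the parts; getting it wrong is a
/// compile-time error, not a runtime one.
const fn concat_into<const N: usize>(parts: &[&str]) -> [u8; N] {
    let mut out = [0u8; N];
    let mut i = 0;
    let mut p = 0;
    while p < parts.len() {
        let bytes = parts[p].as_bytes();
        let mut j = 0;
        while j < bytes.len() {
            out[i] = bytes[j];
            i += 1;
            j += 1;
        }
        p += 1;
    }
    out
}

// "btn" + "-" + "primary" is 11 bytes.
const BTN_PRIMARY_BYTES: [u8; 11] = concat_into(&["btn", "-", "primary"]);
const BTN_PRIMARY: &str = match core::str::from_utf8(&BTN_PRIMARY_BYTES) {
    Ok(s) => s,
    Err(_) => panic!("invalid UTF-8"),
};

fn main() {
    assert_eq!(BTN_PRIMARY, "btn-primary");
}
```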

I think, as semicoleon mentioned, this is a misleading benchmark, because rust_string_concat_size_test/lib.rs at main · toadslop/rust_string_concat_size_test · GitHub uses only stuff from core. Indeed, you can confirm in play that it builds fine with #![no_std]: https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=9d3263fb1783f739c3e527324f4503e7

(Not to mention that the sum is unused, and thus LLVM might even compile it out entirely. Const propagation could plausibly fold it down to one unused usize, and not have any of that code at all either.)

So I think what it's showing is "it's smaller if you don't need to bring in a wasm allocator". That's certainly true, and good to know, but also means that if you're doing more than just this and thus need that allocator somewhere else in the program anyway, then you're measuring a cost that you've already paid elsewhere.

So it's hard to say whether it's telling you something useful to you or not.

2 Likes

I did check that the constant strings for each case were actually compiled into the final wasm, so at the very least the sum is indicating to LLVM that the consts are used.

There was a typo in my original calculation; I missed a digit.

Are you sure this is the hot part of your code? Maybe you should profile it before you start optimising.

Hi Mng12345, thanks, that's a good point. I'm actually not concerned with whether it's a hot part of my code; I was approaching the idea purely out of curiosity. I was thinking about how many repeated substrings there are in a large app and wondered whether eliminating them this way could have any positive impact. I was surprised by the results so far: I expected at worst that the improvement would be too small to matter, and instead found that it actually bloated the code, at least in this naive test case.

Thanks for your feedback, scottmcm. I admit I have almost no experience with profiling (this is actually my first attempt at something like this). I appreciate the suggestions.

I updated the code, added no_std, and switched to wee_alloc to check the result. I did get a significant improvement on the concat version: binary size dropped to around 13429 bytes, roughly 6 kB smaller than the previous version.