If I have a fn that will run over and over again, comparing an input &str against an array of strings, which of the following options is more efficient, performance-wise? Does using let inside the fn body allocate or copy each time?
Also, what about putting const HEADERS inside the fn body? Does that make a difference?
I'm not actually experiencing perf issues, I'm just trying to learn the implications of each.
With optimization enabled (cargo run --release) they will almost certainly be identical.
Does using let inside the fn body allocate or copy each time?
Without optimization, both versions would definitely allocate the 2-element array for headers, on the stack, each time it is run. However, all that means is copying the 2 pointers and lengths for the string literals, not their text.
With optimization, the compiler will almost certainly transform the first code to be identical to the second. Beyond that, the array and the iteration will also go away, so that the result is just two comparison routines and no loop.
You can see the results for yourself by using the "ASM" option in the menu in the Rust Playground, or https://rust.godbolt.org/. When I did this, I saw that the compiler actually de-duplicated the code: declaring that the two versions of is_header are two names for the same code.
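For anyone who wants to reproduce this, here is a minimal sketch of the two variants. The original snippets are not shown in this thread, so the names is_header_let and is_header_const and the header strings are placeholders. The functions are pub so that the compiler still emits code for them when nothing calls them.

```rust
// Variant 1: the array is a local `let` binding inside the function.
pub fn is_header_let(line: &str) -> bool {
    let headers = ["Name", "Date"]; // placeholder header strings
    headers.iter().any(|h| line.ends_with(*h))
}

// Variant 2: the array is a module-level `const`.
const HEADERS: [&str; 2] = ["Name", "Date"]; // same placeholder strings

pub fn is_header_const(line: &str) -> bool {
    HEADERS.iter().any(|h| line.ends_with(*h))
}
```

Pasting both into the Playground or godbolt with optimization enabled (for example -C opt-level=3 on godbolt) lets you compare the generated assembly side by side.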
Also, what about putting const HEADERS inside the fn body? Does that make a difference?
No, that just changes where the const is visible. Scope, variable names, etc. make no difference to how the program is compiled.
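As a small illustration (the header strings are placeholders, not the ones from the question), a const declared inside the function body looks like this. The only difference from a module-level const is that code outside is_header cannot name HEADERS.

```rust
pub fn is_header(line: &str) -> bool {
    // Nameable only inside this function; still evaluated at compile time.
    const HEADERS: [&str; 2] = ["Name", "Date"]; // placeholder header strings
    HEADERS.iter().any(|h| line.ends_with(*h))
}
```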
Those two are effectively identical. Read here about the difference between const and static.
But I also wouldn't expect any real difference from static in this case. All you're potentially changing is how the pointers to the str data are stored.
wow, this paragraph from your link just made things click so much better for me (I still have only a basic understanding, but I get this):
To put it simply, constants are inlined wherever they're used, making using them identical to simply replacing the name of the const with its value. Static variables, on the other hand, point to a single location in memory, which all accesses share. This means that, unlike with constants, they can't have destructors, and act as a single value across the entire codebase.
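A rough sketch of that distinction, using placeholder header strings and hypothetical function names: both functions return the same answers, and the observable difference is only that the static has one fixed address that every access shares.

```rust
// A `const` has no fixed address; each use behaves as if the value were written in place.
const HEADERS_CONST: [&str; 2] = ["Name", "Date"];

// A `static` is a single value at a single memory location for the whole program.
static HEADERS_STATIC: [&str; 2] = ["Name", "Date"];

pub fn is_header_const(line: &str) -> bool {
    HEADERS_CONST.iter().any(|h| line.ends_with(*h))
}

pub fn is_header_static(line: &str) -> bool {
    HEADERS_STATIC.iter().any(|h| line.ends_with(*h))
}

/// Address of the static array; every call returns the same pointer.
pub fn static_addr() -> *const [&'static str; 2] {
    &HEADERS_STATIC
}
```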
If all "header" strings are known at compile time, you might be able to squeeze more performance out of it using a non-cryptographic hash function. is_header can hash the input once, and then it's a simple integer comparison against all known header hashes with a match[1].
I wrote a proc macro for this, but never published it. [2] The "hash function" in this case just returns the first 8 bytes of the string, padded with zeroes if necessary. A hasher like FxHash might be more suitable for you. Maybe it can give you some ideas, anyway.
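A minimal sketch of the idea, not the unpublished proc macro itself: key8 is a hypothetical helper that packs the first 8 bytes of a string into a u64 (zero-padded), and the header strings are placeholders. Different strings can share their first 8 bytes, so each match arm confirms with a full comparison. This only covers whole-string equality, not ends_with().

```rust
/// Packs the first 8 bytes of `s` into a u64, padding with zeroes if `s` is shorter.
const fn key8(s: &str) -> u64 {
    let bytes = s.as_bytes();
    let mut buf = [0u8; 8];
    let mut i = 0;
    while i < bytes.len() && i < 8 {
        buf[i] = bytes[i];
        i += 1;
    }
    u64::from_le_bytes(buf)
}

// Keys for the known headers, computed at compile time (placeholder strings).
const KEY_CONTENT_TYPE: u64 = key8("Content-Type");
const KEY_HOST: u64 = key8("Host");

pub fn is_header(line: &str) -> bool {
    // Compute the key for the input once, then compare integers.
    match key8(line) {
        // The key covers only 8 bytes, so verify the full string on a hit.
        KEY_CONTENT_TYPE => line == "Content-Type",
        KEY_HOST => line == "Host",
        _ => false,
    }
}
```

If two known headers produced the same key, they would have to share one match arm. A real hasher such as FxHash would make that less likely but would still need the final equality check, or a collision-free guarantee for the known set.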
Would you be willing to show me an example of what you mean, maybe using my example as a template? I assume this would only work for whole strings, and not partial checks like "ends_with()"?
That sounds really cool but a bit out of my wheelhouse just yet.
EDIT: I tried this, and I don't see any performance benefit over this: str_vec_of_headers.contains(&line)
The bad news is that hashing does not exactly match what your original code was doing (substring comparisons with ends_with()). If you want to do equality matches, and the list of string comparisons is larger than just 2, you can begin to see some benefit from comparing hashes instead of strings.
It's a modest improvement, but it is quite sensitive to wordlist changes. And the way I'm computing the hashes at compile time (and include!ing) is not ideal.
Bonus: If you do enough digging, you can find some degenerate cases where the string comparisons are really bad. This run had a lot of long strings with the same prefixes (e.g. only the last 5-10 characters were different in a few dozen long strings):
Thanks. I cloned your repo and will play around with it. Much appreciated, I think I could get away with the equivalent of == rather than .ends_with(). The reason I did it was because the files I'm working with prepend a BOM, but I can probably find a workaround for that.
And if not, this is still cool insight I'll tuck away for the future.
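One possible workaround for the BOM, sketched with placeholder header strings and a hypothetical strip_bom helper: remove a leading U+FEFF first, then compare with == instead of ends_with().

```rust
const HEADERS: [&str; 2] = ["Name", "Date"]; // placeholder header strings

/// Removes one leading U+FEFF (byte order mark) if present.
pub fn strip_bom(line: &str) -> &str {
    line.strip_prefix('\u{feff}').unwrap_or(line)
}

pub fn is_header(line: &str) -> bool {
    let line = strip_bom(line);
    // Whole-string equality, so "My Name" no longer matches "Name".
    HEADERS.iter().any(|h| *h == line)
}
```

This assumes the file has already been decoded as UTF-8, so the BOM shows up as the single character U+FEFF at the start of the first line.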
Well, these are 30-year-old optimizations, at least. Everything in your code (the relevant parts that you are asking about, anyway) is known at compile time: you have string literals in an array literal. There's nothing to "figure out", it's basically trivial.
There are much more interesting and complex optimizations that compilers can perform. The bottom line is that you basically NEVER need to worry about low-level decisions like this.
you basically NEVER need to worry about low-level decisions like this
That's awesome.
There's nothing to "figure out", it's basically trivial.
I probably used the wrong phrasing. I should have said, "that's super convenient that the compiler..." rather than "that's amazing that the compiler..." I'm loving this zero cost abstraction business. It's certainly not the case in some other languages I've used.