Hello folks,
TL;DR I'm experiencing a lot of small unexpected allocations and I would like to hear advice how to pinpoint where the allocations happen.
Description of the project
I've been implementing a Javascript lexer based on this lexer tutorial in C. The key factor is performance. In C it's quite easy to control all memory allocations. However, Rust could be able to generate smaller representation of state machine that could be better cached in CPU. I've managed to come up with implementation that is quite fast and native to Rust, but I would like to take it to the next level.
Benchmarking
For performance tests I've been using a lexer_bench component which consists from calling the lexer with fairly long javascript file. The JS file contains 1.5M LOC. NB: without rust bencher
For profiling I'm using Xcode's Instruments tools, specifically Allocation tool. In the results I see a suspiciously high number of 16 bytes allocations.
After investigating this a bit further I've found out that above 90% of those allocation occurs from the same place. First I tried to figure what is the origin of allocations call. To my surprise I've got a different call origin for inlined/non-inlined function calls with completely different stack traces. (2 images below)
Observations
The first one suggests parse_identifier
as origin of allocations. However, looking at parse identifier the only allocations are String and maybe some small number for the integer iterator. Stack trace shows that it's not a string but allocation is direct.
The second one points to call with String::from
. I have only 2 calls to String::from
one in parse_identifier
and one in parse_dash
. This again point to the parse_identifier
function. However, what is worrying me is the high number of dynamic allocations.
I've added some debugging code to parse_identifier
to track number of calls and how many strings are allocated. The function is called 232931 times and in 172355 calls it allocates a string with average size 5 characters. Number of strings of length one is 3322 and length two is 6432
Thanks for any advice and sorry for the long post