Memory optimization of a JavaScript lexer

Hello folks,

TL;DR: I'm seeing a lot of small, unexpected allocations and would like advice on how to pinpoint where they happen.

Description of the project

I've been implementing a JavaScript lexer in Rust, based on this lexer tutorial written in C. The key factor is performance. In C it's quite easy to control all memory allocations; however, Rust might be able to generate a smaller representation of the state machine that caches better in the CPU. I've managed to come up with an implementation that is quite fast and idiomatic Rust, but I would like to take it to the next level.

Benchmarking

For performance tests I've been using a lexer_bench component, which consists of calling the lexer on a fairly long JavaScript file. The JS file contains 1.5M LOC. NB: this is without the Rust bencher.
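Roughly, the harness is a wall-clock timing loop along these lines (a minimal sketch; the file path and the `lex` stand-in are hypothetical, since the real lexer's signature isn't shown here):

    use std::time::Instant;

    // Stand-in for the real lexer; the actual signature is assumed.
    fn lex(source: &str) -> Vec<&str> {
        source.split_whitespace().collect()
    }

    fn main() -> std::io::Result<()> {
        // Hypothetical path to the ~1.5M LOC benchmark input.
        let source = std::fs::read_to_string("fixtures/large.js")?;
        let start = Instant::now();
        let tokens = lex(&source);
        println!("lexed {} tokens in {:?}", tokens.len(), start.elapsed());
        Ok(())
    }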

For profiling I'm using Xcode's Instruments, specifically the Allocations instrument. In the results I see a suspiciously high number of 16-byte allocations.

After investigating this a bit further, I found that over 90% of those allocations occur in the same place. First I tried to figure out where the allocation call originates. To my surprise, I got different call origins for inlined and non-inlined function calls, with completely different stack traces. (2 images below)

Observations

The first one suggests parse_identifier as the origin of the allocations. However, looking at parse_identifier, the only allocations are the String and maybe a small one for the integer iterator. The stack trace shows that it's not a String; the allocation is direct.

The second one points to a call to String::from. I have only two calls to String::from: one in parse_identifier and one in parse_dash. This again points to the parse_identifier function. What worries me, however, is the high number of dynamic allocations.

I've added some debugging code to parse_identifier to track the number of calls and how many strings are allocated. The function is called 232931 times, and in 172355 of those calls it allocates a string with an average size of 5 characters. The number of strings of length one is 3322, and of length two is 6432.
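The debugging code itself isn't shown here, but counters along these lines are enough to collect such numbers (a sketch; the parse_identifier body below is a hypothetical stand-in, not my actual implementation):

    use std::sync::atomic::{AtomicUsize, Ordering};

    // Hypothetical counters; the real parse_identifier is not shown in this post.
    static CALLS: AtomicUsize = AtomicUsize::new(0);
    static ALLOCS: AtomicUsize = AtomicUsize::new(0);
    static TOTAL_LEN: AtomicUsize = AtomicUsize::new(0);

    fn parse_identifier(input: &str) -> String {
        CALLS.fetch_add(1, Ordering::Relaxed);
        // Stand-in body: collect the leading identifier characters.
        let ident: String = input
            .chars()
            .take_while(|c| c.is_ascii_alphanumeric() || *c == '_')
            .collect();
        if !ident.is_empty() {
            ALLOCS.fetch_add(1, Ordering::Relaxed);
            TOTAL_LEN.fetch_add(ident.len(), Ordering::Relaxed);
        }
        ident
    }

    fn main() {
        for word in ["foo", "bar2", "foo"] {
            parse_identifier(word);
        }
        let allocs = ALLOCS.load(Ordering::Relaxed);
        println!(
            "calls={} allocs={} avg_len={:.1}",
            CALLS.load(Ordering::Relaxed),
            allocs,
            TOTAL_LEN.load(Ordering::Relaxed) as f64 / allocs.max(1) as f64
        );
    }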

Thanks for any advice and sorry for the long post

On macOS all allocations have 16-byte alignment, so it's entirely normal that small allocations are rounded up to the nearest 16-byte boundary.

It seems very likely that it is exactly that String::from call.

Look for string interning crates. Instead of a String you can use an identifier from a pool that reuses allocations.
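For example, a minimal hand-rolled sketch of the idea (crates like string-interner or lasso are tuned versions of this):

    use std::collections::HashMap;

    #[derive(Default)]
    struct Interner {
        map: HashMap<String, u32>,
        strings: Vec<String>,
    }

    impl Interner {
        /// Returns a stable integer symbol; allocates only the first time
        /// a given string is seen. (This naive version allocates twice per
        /// unique string; real interning crates avoid that.)
        fn intern(&mut self, s: &str) -> u32 {
            if let Some(&id) = self.map.get(s) {
                return id;
            }
            let id = self.strings.len() as u32;
            self.strings.push(s.to_owned());
            self.map.insert(s.to_owned(), id);
            id
        }

        fn resolve(&self, id: u32) -> &str {
            &self.strings[id as usize]
        }
    }

    fn main() {
        let mut pool = Interner::default();
        let a = pool.intern("foo");
        let b = pool.intern("foo"); // second call allocates nothing new
        assert_eq!(a, b);
        assert_eq!(pool.resolve(a), "foo");
    }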


Thank you @kornel, you were totally right. The number of allocations went down to 18000. I feel very dumb now for asking this question.

Btw, while I was fixing the String::from call I also improved string unescaping, and I may have found a performance problem in std.
The replace function starts with an empty String; however, it could've used String::with_capacity, since the length of the str is known, right? (A sketch of that variant follows the snippet below.)

Replace function
    #[must_use = "this returns the replaced string as a new allocation, \
                  without modifying the original"]
    #[stable(feature = "rust1", since = "1.0.0")]
    #[inline]
    pub fn replace<'a, P: Pattern<'a>>(&'a self, from: P, to: &str) -> String {
        let mut result = String::new();
        let mut last_end = 0;
        for (start, part) in self.match_indices(from) {
            result.push_str(unsafe { self.get_unchecked(last_end..start) });
            result.push_str(to);
            last_end = start + part.len();
        }
        result.push_str(unsafe { self.get_unchecked(last_end..self.len()) });
        result
    }
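Something along these lines is what I had in mind (a sketch written as a free function over &str for simplicity; the real method is generic over Pattern and uses unchecked slicing internally):

    // Preallocating variant: reserving haystack.len() up front avoids
    // regrowth whenever the output is at least as long as the input.
    fn replace_prealloc(haystack: &str, from: &str, to: &str) -> String {
        let mut result = String::with_capacity(haystack.len());
        let mut last_end = 0;
        for (start, part) in haystack.match_indices(from) {
            result.push_str(&haystack[last_end..start]);
            result.push_str(to);
            last_end = start + part.len();
        }
        result.push_str(&haystack[last_end..]);
        result
    }

    fn main() {
        assert_eq!(replace_prealloc("foo bar foo", "foo", "baz"), "baz bar baz");
    }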

No worries. Your question was good.

As for replace, it could probably be optimized. It depends on assumptions about what the replacement is. This implementation is ideal for "aaaaaaa".replace("a", "") :slight_smile:
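For instance (the exact capacity is an implementation detail, but this illustrates the point):

    fn main() {
        let s = "a".repeat(4096);
        // Deleting every match: the start-empty strategy never needs to
        // allocate, while with_capacity(s.len()) would reserve 4096 bytes
        // that are never used.
        let out = s.replace('a', "");
        println!("len = {}, capacity = {}", out.len(), out.capacity());
        // Typically prints: len = 0, capacity = 0
    }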