I have been working with the opentelemetry-rust crate, and while using it I found that one particular function in the crate appears to leak memory (calling it results in high RSS memory usage): Memory Consumption Issue with opentelemetry::KeyValue · Issue #793 · open-telemetry/opentelemetry-rust · GitHub
To find the source of the issue, I used valgrind's memcheck and massif tools, and both showed me a trace of function calls that may be responsible.
Now, what should my next step be?
The trace is a chain of 15 function calls, each of which internally calls many other functions. How do I find the exact source, and what kind of changes should I make?
The vast majority of memory leaks in Rust come when you've got a long-lived object (e.g. a static variable) that hangs onto data longer than it needs to, typically by appending the data to a Vec without bounds. One example of this is a logger that doesn't flush messages to their destination frequently enough.
What I would do is check each of the functions in that chain of calls for some sort of push() operation that makes it look like an object is being stored somewhere. From there, it's a case of reading the code and figuring out why that storage would grow without bound. Once you understand what is going on, the solution usually becomes quite obvious.
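To make the pattern concrete, here is a minimal sketch of what to look for. It is not taken from opentelemetry-rust; the statics UNBOUNDED and BOUNDED, the functions record_unbounded and record_bounded, and the MAX_BUFFERED limit are all made up for illustration. The first function is the leaky shape (push with no drain), the second is one possible fix (cap the buffer and drop the oldest entry).

```rust
use std::collections::VecDeque;
use std::sync::Mutex;

// Hypothetical long-lived buffers, standing in for whatever a library keeps around.
static UNBOUNDED: Mutex<Vec<String>> = Mutex::new(Vec::new());
static BOUNDED: Mutex<VecDeque<String>> = Mutex::new(VecDeque::new());

// Arbitrary cap chosen for the example.
const MAX_BUFFERED: usize = 1024;

/// The leaky shape: every call pushes, and nothing ever drains the buffer.
fn record_unbounded(msg: &str) -> usize {
    let mut buf = UNBOUNDED.lock().unwrap();
    buf.push(msg.to_owned());
    buf.len()
}

/// One possible fix: once the buffer is full, drop the oldest entry first.
fn record_bounded(msg: &str) -> usize {
    let mut buf = BOUNDED.lock().unwrap();
    if buf.len() == MAX_BUFFERED {
        buf.pop_front();
    }
    buf.push_back(msg.to_owned());
    buf.len()
}

fn main() {
    let mut unbounded_len = 0;
    let mut bounded_len = 0;
    for i in 0..10_000 {
        let msg = format!("event {i}");
        unbounded_len = record_unbounded(&msg);
        bounded_len = record_bounded(&msg);
    }
    println!("unbounded: {unbounded_len} entries, bounded: {bounded_len} entries");
}
```

In a real code base the "push" may be hidden behind an insert into a map, a channel send with no receiver keeping up, or a cache without eviction, so it is worth searching for those as well.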
One gotcha I've run into is that the "RSS" reported by Linux is actually the max RSS (peak value), so by definition it never goes down, even after the application actually releases the memory.
To get more accurate data, I use the cap crate, a wrapper around the Rust memory allocator that tracks how much is currently allocated, so you can see immediately when Rust releases memory.
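As a rough illustration of the idea, here is a std-only sketch of a counting wrapper around the system allocator. It is not the cap crate's API; the names CountingAlloc, ALLOCATED and allocated_bytes are made up for this example, and a real crate handles more cases (limits, peak tracking) than this does.

```rust
use std::alloc::{GlobalAlloc, Layout, System};
use std::hint::black_box;
use std::sync::atomic::{AtomicUsize, Ordering};

/// Hypothetical wrapper that counts live heap bytes requested by Rust code.
struct CountingAlloc;

static ALLOCATED: AtomicUsize = AtomicUsize::new(0);

unsafe impl GlobalAlloc for CountingAlloc {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        let ptr = unsafe { System.alloc(layout) };
        if !ptr.is_null() {
            ALLOCATED.fetch_add(layout.size(), Ordering::Relaxed);
        }
        ptr
    }

    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
        unsafe { System.dealloc(ptr, layout) };
        ALLOCATED.fetch_sub(layout.size(), Ordering::Relaxed);
    }
    // alloc_zeroed and realloc fall back to the default implementations,
    // which go through alloc/dealloc above, so the counter stays consistent.
}

#[global_allocator]
static GLOBAL: CountingAlloc = CountingAlloc;

/// Bytes currently allocated through the global allocator.
fn allocated_bytes() -> usize {
    ALLOCATED.load(Ordering::Relaxed)
}

fn main() {
    let before = allocated_bytes();
    // black_box keeps the optimizer from eliding the allocation.
    let data = black_box(vec![0u8; 1 << 20]);
    let during = allocated_bytes();
    drop(data);
    let after = allocated_bytes();
    println!("before: {before} B, with 1 MiB Vec: {during} B, after drop: {after} B");
}
```

Unlike RSS, this counter drops as soon as the memory is freed, regardless of whether the underlying allocator returns pages to the kernel. If the counter keeps growing while the suspect function is called in a loop, the memory really is being retained by Rust code.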
No, the RSS is not the max RSS. It is the current RSS. It decreases if pages get swapped out, and also if the allocator releases memory pages. Memory allocators generally try to keep some pages mapped even after all objects in them have been deallocated, to speed up further allocations. Jemalloc, I believe, periodically tells the kernel that it is fine to discard unused pages using MADV_DONTNEED. This will only reduce the RSS once the kernel actually needs to discard them though, I think. As for glibc's memory allocator, I'm not quite sure how its deallocation heuristics work.
I mean that calls like getrusage leave reporting of the current value unimplemented, so ru_maxrss is often used instead.
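For anyone who wants to check both numbers for their own process, here is a small Linux-only sketch that reads them from /proc/self/status, where VmRSS is the current resident set size and VmHWM is the peak ("high water mark"). The helper name status_kb is made up for this example.

```rust
use std::fs;

/// Extracts the numeric value (in kB) of a "<key>:   <n> kB" line
/// from the text of /proc/<pid>/status.
fn status_kb(status: &str, key: &str) -> Option<u64> {
    status.lines().find_map(|line| {
        let rest = line.strip_prefix(key)?.strip_prefix(':')?;
        rest.split_whitespace().next()?.parse().ok()
    })
}

fn main() {
    // /proc/self/status only exists on Linux; elsewhere this prints an error.
    match fs::read_to_string("/proc/self/status") {
        Ok(status) => {
            println!("current RSS (VmRSS): {:?} kB", status_kb(&status, "VmRSS"));
            println!("peak RSS (VmHWM): {:?} kB", status_kb(&status, "VmHWM"));
        }
        Err(e) => eprintln!("could not read /proc/self/status: {e}"),
    }
}
```

Logging both values periodically shows whether a monitoring tool is reporting the current or the peak figure: VmHWM never decreases, while VmRSS can.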
Actually, I have been using a product named Datadog, which apparently shows current memory usage. I observed all the applications running on my Linux machine, and only my Rust app is growing in memory, because it is using the opentelemetry-rust SDK.
As soon as I comment out that function call into the SDK, the memory usage stabilizes.
But thanks, @kornel & @bjorn3, for clarifying!