Tracking memory usage

Hello,

I'm looking for a way to inspect memory at runtime. In my program I'm storing a lot of data in Vec, HashMap, and BTreeMap:

struct Table {
    storage: HashMap<Key, SomeComplexStruct>,
}
struct Db {
    tables: BTreeMap<AnotherKey, Table>,
}

and I'd like to be able to track the memory usage for these tables, and be able to query it at runtime. The closest thing I've found to do this kind of stuff is the size-of crate. It provides a derive macro, which computes/guesses the memory allocated for a given struct. It fits my use case but it's impractical:

  • it relies on a SizeOf trait, which cannot be implemented on types from other crates, due to the orphan rule
  • implementing the SizeOf trait is sometimes not trivial. You have to make assumptions and approximations, especially if your type uses foreign types that don't implement the SizeOf trait
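To make the second point concrete, here is a minimal hand-rolled sketch of what implementing such a trait involves. The names `HeapSize` and `heap_bytes` are hypothetical and are not the size-of crate's API. The HashMap estimate ignores the table's control bytes and load factor, so it is an approximation of exactly the kind described above.

```rust
use std::collections::HashMap;
use std::mem::size_of;

/// Hypothetical trait: bytes owned on the heap, excluding `size_of::<Self>()`.
trait HeapSize {
    fn heap_bytes(&self) -> usize;
}

impl HeapSize for u64 {
    // Plain integers own no heap memory.
    fn heap_bytes(&self) -> usize {
        0
    }
}

impl HeapSize for String {
    // A String owns one buffer of `capacity` bytes.
    fn heap_bytes(&self) -> usize {
        self.capacity()
    }
}

impl<T: HeapSize> HeapSize for Vec<T> {
    // The Vec's own buffer plus whatever each element owns.
    fn heap_bytes(&self) -> usize {
        let own = self.capacity() * size_of::<T>();
        let nested: usize = self.iter().map(|item| item.heap_bytes()).sum();
        own + nested
    }
}

impl<K: HeapSize, V: HeapSize> HeapSize for HashMap<K, V> {
    // Approximation (assumption): one (K, V) slot per unit of capacity,
    // ignoring control bytes and the table's load factor.
    fn heap_bytes(&self) -> usize {
        let own = self.capacity() * size_of::<(K, V)>();
        let nested: usize = self
            .iter()
            .map(|(k, v)| k.heap_bytes() + v.heap_bytes())
            .sum();
        own + nested
    }
}

fn main() {
    let mut table: HashMap<u64, Vec<String>> = HashMap::new();
    table.insert(1, vec![String::from("hello"), String::from("world")]);
    println!("approximate heap bytes: {}", table.heap_bytes());
}
```

Every foreign type stored in the map needs its own impl like these, which is where the guesswork comes in.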

Tools like dhat-rs are interesting but cannot be used in production. I also looked at the new Allocator trait, thinking there might be a way to wrap the system allocator, but from the trait API I don't see how I could track information such as "amount of memory allocated for a specific Table" or "amount of memory allocated for a specific instance of SomeComplexStruct" (from the example above).

I was going to suggest using collections with a custom Allocator, but then I remembered that std::collections::HashMap doesn't support custom allocators yet. See:

You can use hashbrown::HashMap for now, until that support is (hopefully soon) merged into std.

Using the Allocator would only track the shallow size of the hash map’s contents anyways. Anything that (Key or) SomeComplexStruct might own behind any pointer indirection wouldn’t be counted. Whether or not this is the desired behavior may depend on the use-case.

For illustration: a HashMap<K, Vec<SomethingElse>, _, CustomAllocator> would not use the CustomAllocator for the space the SomethingElse structs occupy if the Vec still uses the default allocator. Hence a CustomAllocator that keeps track of memory usage cannot account for nested data such as SomethingElse.
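The same split between shallow and nested memory can be seen on stable Rust by counting buffer sizes by hand. This is only a sketch: `buffer_bytes` is a hypothetical helper, and a Vec<Vec<u8>> stands in for the map of Vecs discussed above.

```rust
use std::mem::size_of;

/// Bytes in a Vec's own buffer, i.e. what an allocator attached to
/// this particular Vec would be asked for.
fn buffer_bytes<T>(v: &Vec<T>) -> usize {
    v.capacity() * size_of::<T>()
}

fn main() {
    // Four inner buffers of at least 1 KiB each.
    let outer: Vec<Vec<u8>> = (0..4).map(|_| Vec::with_capacity(1024)).collect();

    // The outer buffer only stores the inner Vec handles.
    let shallow = buffer_bytes(&outer);

    // The bulk of the data lives in the inner buffers, each of which
    // is a separate allocation made by the inner Vec's allocator.
    let nested: usize = outer.iter().map(|inner| buffer_bytes(inner)).sum();

    println!("outer buffer: {shallow} bytes, inner buffers: {nested} bytes");
}
```

An allocator that only the outer collection uses would see the first number and none of the second.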

I believe that directly implies that any solution that doesn't account for FFI pointers is bound to be inaccurate, even for "simple" things like files or C strings.

That's...kind of a bummer.

In theory, yes, but in practice, I doubt mis-counting memory allocated by C would make much of a difference to what you are allocating.

Something I've found quite useful for coarse-grained tracking of memory usage is to override the global allocator with something that increments/decrements a bunch of AtomicUsize counters stored in a static variable. From there, you might choose to dump the counters every time through a loop or at the end of execution or in response to a signal or whatever.
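A minimal sketch of that idea using only std might look like the following. The names `CountingAlloc`, `TOTAL_ALLOCATED`, `LIVE`, `total_allocated` and `live_bytes` are made up for this example, and a real tracker would probably want more counters (number of allocations, reallocations, peak usage).

```rust
use std::alloc::{GlobalAlloc, Layout, System};
use std::hint::black_box;
use std::sync::atomic::{AtomicUsize, Ordering};

/// Total bytes ever requested (only grows).
static TOTAL_ALLOCATED: AtomicUsize = AtomicUsize::new(0);
/// Bytes allocated and not yet freed.
static LIVE: AtomicUsize = AtomicUsize::new(0);

/// Hypothetical wrapper that forwards to the system allocator and counts bytes.
struct CountingAlloc;

unsafe impl GlobalAlloc for CountingAlloc {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        let ptr = unsafe { System.alloc(layout) };
        if !ptr.is_null() {
            TOTAL_ALLOCATED.fetch_add(layout.size(), Ordering::Relaxed);
            LIVE.fetch_add(layout.size(), Ordering::Relaxed);
        }
        ptr
    }

    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
        unsafe { System.dealloc(ptr, layout) };
        LIVE.fetch_sub(layout.size(), Ordering::Relaxed);
    }
    // `realloc` and `alloc_zeroed` keep their default implementations,
    // which go through `alloc`/`dealloc` above, so the counters stay consistent.
}

#[global_allocator]
static GLOBAL: CountingAlloc = CountingAlloc;

fn total_allocated() -> usize {
    TOTAL_ALLOCATED.load(Ordering::Relaxed)
}

fn live_bytes() -> usize {
    LIVE.load(Ordering::Relaxed)
}

fn main() {
    let before = total_allocated();
    // `black_box` keeps the optimizer from removing the unused allocation.
    let buf: Vec<u8> = black_box(Vec::with_capacity(4096));
    println!("allocated since start of main: {}", total_allocated() - before);
    println!("live bytes right now: {}", live_bytes());
    drop(buf);
}
```

The counting code must not allocate itself, which is why it sticks to atomics and avoids anything like formatting or locking inside `alloc` and `dealloc`.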

I'll normally wire this up manually, but the stats_alloc crate looks like it gives you a lot of the tools you'll need.

extern crate stats_alloc;

use stats_alloc::{Region, StatsAlloc, INSTRUMENTED_SYSTEM};
use std::alloc::System;

#[global_allocator]
static GLOBAL: &StatsAlloc<System> = &INSTRUMENTED_SYSTEM;

fn main() {
    let reg = Region::new(&GLOBAL);
    let x: Vec<u8> = Vec::with_capacity(1_024);
    println!("Stats at 1: {:#?}", reg.change());
    // Used here to ensure that the value is not
    // dropped before we check the statistics
    ::std::mem::size_of_val(&x);
}

Yes, my use case is to count the total memory allocated including for Key and SomeComplexStruct.

This looks like a good start. My program is multithreaded, so I'd need a slightly more complex tracker which keeps a count per thread. I assume I can call thread::current().name() from within the GlobalAlloc trait methods, right?

Wrong. That makes the process abort for some reason :(

I'm guessing it allocates, which calls back into your allocator and overflows the stack.

Instead of using the thread name, try to use the thread ID. It's just a number, so hopefully accessing it won't incur any allocations.

I'd look into making your own copy of their allocator which uses a thread-local variable for storing the Counters struct. As a nice side-effect, it means you can also drop the AtomicUsize for counting and use a plain usize.
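As a rough sketch of the thread-local variant (not a copy of the stats_alloc code): `PerThreadAlloc`, `THREAD_ALLOCATED`, `allocated_on_this_thread` and `worker_allocation` are hypothetical names. It assumes that a const-initialized `Cell<usize>` thread-local can be touched from inside the allocator without allocating, and it only counts bytes requested, because memory can be freed on a different thread than the one that allocated it.

```rust
use std::alloc::{GlobalAlloc, Layout, System};
use std::cell::Cell;
use std::hint::black_box;
use std::thread;

thread_local! {
    // Const-initialized and without a destructor, so accessing it from
    // inside the allocator should not need any lazy setup (assumption).
    static THREAD_ALLOCATED: Cell<usize> = const { Cell::new(0) };
}

/// Hypothetical allocator that counts requested bytes per thread.
struct PerThreadAlloc;

unsafe impl GlobalAlloc for PerThreadAlloc {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        let ptr = unsafe { System.alloc(layout) };
        if !ptr.is_null() {
            // `try_with` returns an error instead of panicking if the
            // thread-local is unavailable during thread teardown.
            let _ = THREAD_ALLOCATED
                .try_with(|c| c.set(c.get().wrapping_add(layout.size())));
        }
        ptr
    }

    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
        unsafe { System.dealloc(ptr, layout) };
    }
}

#[global_allocator]
static GLOBAL: PerThreadAlloc = PerThreadAlloc;

/// Bytes requested so far by the calling thread.
fn allocated_on_this_thread() -> usize {
    THREAD_ALLOCATED.with(|c| c.get())
}

/// Spawns a worker that allocates `n` bytes and reports what its own counter saw.
fn worker_allocation(n: usize) -> usize {
    thread::spawn(move || {
        let before = allocated_on_this_thread();
        let buf: Vec<u8> = black_box(Vec::with_capacity(n));
        let seen = allocated_on_this_thread() - before;
        drop(buf);
        seen
    })
    .join()
    .expect("worker thread panicked")
}

fn main() {
    let main_before = allocated_on_this_thread();
    let seen = worker_allocation(1024 * 1024);
    let main_delta = allocated_on_this_thread() - main_before;
    println!("worker saw {seen} bytes, main thread saw {main_delta} bytes");
}
```

Each thread can only read its own counter here. Collecting the numbers in one place would need something extra, such as having each thread report its total before it exits.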

It's also worth asking whether you actually care about which thread an allocation is made on, or whether a single global set of counters is enough.

The cap crate has global memory usage stats. It can't tell you about specific structs, but if you're worried that they are big enough to affect the whole process or machine, I think tracking global usage could be enough to act on.
