Rust programs using more memory than expected?


#1

Hi
A few days ago I finished a small version of my networking application using MIO. When I started testing, I found that a very basic echo server uses around 2.4 MB of RAM just for program initialization.
I’ve opened an issue here https://github.com/carllerche/mio/issues/427 but it seems that Rust programs themselves use more memory than expected.
For example, this very basic two-thread program uses 480 KB of memory, which is really a lot for what it does.

fn main() {
    std::thread::spawn(||{
        std::thread::sleep(std::time::Duration::new(60, 0));
    });

    std::thread::sleep(std::time::Duration::new(60, 0));
}

I’m on Mac OS X El Capitan with Rust 1.9.

How can I find out why Rust programs use this much memory just for initialization?

P.S. The interesting part is that the memory does not grow during execution, thanks to Rust’s awesome type system, but I’m worried about why I need to allocate that much memory for a simple program.

Thanks


#2

My guess would be that each thread reserves a large stack. What results do you get if you explicitly request a small stack size for the second thread?
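
For reference, a minimal sketch of requesting an explicit stack size with `std::thread::Builder`. The helper name `spawn_with_stack` and the 64 KiB value are made up for illustration, not a recommendation; the platform may round the size up to its own minimum, and a stack that is too small for the work will overflow at runtime.

```rust
use std::io;
use std::thread;
use std::time::Duration;

// Hypothetical helper: run `work` on a new thread with an explicit stack size.
fn spawn_with_stack<F, T>(stack_bytes: usize, work: F) -> io::Result<thread::JoinHandle<T>>
where
    F: FnOnce() -> T + Send + 'static,
    T: Send + 'static,
{
    thread::Builder::new()
        .name("small-stack".to_string())
        .stack_size(stack_bytes) // requested size in bytes
        .spawn(work)
}

fn main() {
    // 64 KiB is an arbitrary illustrative value (assumption).
    let handle = spawn_with_stack(64 * 1024, || {
        thread::sleep(Duration::from_millis(100));
        21 * 2
    })
    .expect("failed to spawn thread");

    let result = handle.join().expect("thread panicked");
    println!("child returned {}", result);
}
```

Comparing memory usage with and without the `stack_size` call should show whether the stack reservation is what the measurement is picking up.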


#3

Yes, it worked! The program now uses less memory. But how do I choose the right stack size?
How do I calculate the number?


#4

The last time I checked, I failed to find a simple way to configure the main thread’s stack size, so I just spawned a child thread with the necessary stack size (I needed huge stacks though, not small ones).

Do you really need to tweak the stack size though? My understanding is that most OSes use demand paging anyway. It would also be useful if you specified exactly how you measure memory consumption.


#5

Looks like the default stack size is 2 MB: https://github.com/rust-lang/rust/blob/6cc49e51de7ea9b0cc4aff437975544233c57107/src/libstd/sys/common/util.rs#L25


#6

OK, that explains a lot (2 MB default stack size).

I’m building a static library implementing a specific TCP protocol on top of MIO, which should be integrated through a C interface into mobile and desktop apps and server-side services. So memory usage is critical for me. I like Rust’s memory management model, which is why I started writing the library in Rust.
But it seems MIO is eating a lot of memory. I’m thinking of switching to C libuv.


#7

"But it seems MIO is eating a lot of memory."

Even if threads are created with small stacks? This needs some investigation to verify and fix. I’ve subscribed to the mio issue, it sounds interesting :slight_smile:

FYI, there are Rust bindings for libuv: https://github.com/sorear/libuv-rs


#8

With a thread stack size limit I’m now getting 8.7 MB of memory usage (previously 9.3 MB), which is not a big difference compared to libuv’s 380 KB memory usage :slight_smile:
I will check out https://github.com/sorear/libuv-rs and come back with results.

Thanks


#9

In general, stack space is why co-operative multi-tasking (like goroutines) is preferred for this kind of IO-bound networking task. I suggest you create a fixed-size thread pool and use an event-based architecture with non-blocking IO. The idea is that the event queue represents pending actions, and reading from it is the only blocking call in the program, so threads block only when the event queue is empty. Whenever you do IO, you save a continuation so that the IO completion event contains the information needed to continue the task later, then take the next event from the queue to process. This kind of multi-tasking runs optimally with one thread per core (or 2 per core with hyperthreading), so on an i7 with 8 virtual cores you would need a maximum of 8 threads, and therefore 16 MB for 8 full-size 2 MB stacks. Using any more than this wastes memory, because the CPU cannot really do more than 8 things at one time. You could of course do this and also reduce the stack size, but you risk a runtime stack overflow if you make the stacks too small.
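
A rough sketch of the shape described above, using only the standard library. The assumptions are loud ones: a "continuation" is modelled as a boxed `FnOnce` closure, the names `start_pool` and `run_events` are made up for illustration, and no real IO is performed. In a real program the events would come from the kernel (epoll/kqueue, for example through MIO) rather than from a loop in `run_events`.

```rust
use std::sync::mpsc::{channel, Sender};
use std::sync::{Arc, Mutex};
use std::thread;

// Assumption for this sketch: a continuation is just a boxed closure.
type Continuation = Box<dyn FnOnce() + Send + 'static>;

// Hypothetical helper: start `workers` threads that all block on one event queue.
fn start_pool(workers: usize) -> (Sender<Continuation>, Vec<thread::JoinHandle<()>>) {
    let (tx, rx) = channel::<Continuation>();
    let rx = Arc::new(Mutex::new(rx));
    let handles = (0..workers)
        .map(|_| {
            let rx = Arc::clone(&rx);
            thread::spawn(move || loop {
                // Taking the next event is the only blocking call in a worker.
                // The lock guard is dropped at the end of this statement,
                // so the task below runs without holding the queue lock.
                let next = rx.lock().unwrap().recv();
                match next {
                    Ok(task) => task(),
                    Err(_) => break, // all senders dropped: shut down
                }
            })
        })
        .collect();
    (tx, handles)
}

// Hypothetical driver: enqueue `events` continuations and count how many ran.
fn run_events(workers: usize, events: usize) -> usize {
    let (tx, handles) = start_pool(workers);
    let (done_tx, done_rx) = channel();
    for id in 0..events {
        let done_tx = done_tx.clone();
        // Each continuation carries what it needs to resume (here just `id`).
        tx.send(Box::new(move || {
            done_tx.send(id).unwrap();
        }))
        .unwrap();
    }
    drop(tx); // closing the queue lets the workers exit
    drop(done_tx);
    for h in handles {
        h.join().unwrap();
    }
    done_rx.iter().count()
}

fn main() {
    // 4 workers is arbitrary; the suggestion above is one per (virtual) core.
    let handled = run_events(4, 10);
    println!("handled {} events", handled);
}
```

The number of threads, and therefore the stack memory, stays fixed however many events are pending, since pending work lives in the queue rather than on per-task stacks.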

One question, what would be the best way to get a continuation in Rust to pass to the IO request?


#10

The only Rust implementation I know of that provides non-blocking IO for network operations and channels is MIO.

I’m already using MIO https://github.com/carllerche/mio for non-blocking async IO, but by default it runs in only one thread, so now I’m trying to run multiple event loops to get multicore performance.


#11

Above you said MIO was using a lot of memory, and this is resolved by reducing stack space. This would suggest MIO is using threads internally. As one of the points of using nonblocking IO is to increase concurrency, this seems to be a problem. The kernel can provide asynchronous requests without requiring lots of threads (the kernel uses interrupts) so the problem would appear to be MIO itself. You need an async library that passes the continuation to the kernel, rather than using lots of user space threads for the concurrency.


#12

MIO doesn’t have a multithreaded event loop (as stated in the README). It only uses threads for timers.

[andrew@Serval src] pwd
/home/andrew/clones/mio/src
[andrew@Serval src] grep -nrHIF thread | grep -v '//' | grep -v 'test/'
timer.rs:3:use std::{cmp, error, fmt, u64, usize, iter, thread};
timer.rs:51:    wakeup_thread: thread::JoinHandle<()>,
timer.rs:318:                    inner.wakeup_thread.thread().unpark();
timer.rs:360:        let thread_handle = spawn_wakeup_thread(
timer.rs:369:            wakeup_thread: thread_handle,
timer.rs:403:fn spawn_wakeup_thread(state: WakeupState, set_readiness: SetReadiness, start: Instant, tick_ms: u64) -> thread::JoinHandle<()> {
timer.rs:404:    thread::spawn(move || {
timer.rs:414:            trace!("wakeup thread: sleep_until_tick={:?}; now_tick={:?}", sleep_until_tick, now_tick);
timer.rs:419:                thread::park_timeout(Duration::from_millis(sleep_duration));
timer.rs:425:                    trace!("setting readiness from wakeup thread");


#13

So are you saying MIO wasn’t responsible for the large memory usage in the above use-case, as the program using the library appears to be single threaded? The reduction in memory usage when stack size is reduced clearly points to a large number of threads running.

Why is MIO using timers? Is it polling something? With async IO you simply need to register the continuation function with the kernel and return to the main event loop (or, without continuations, return to the event loop and wait for events, including IO completion events).


#14

You might consider that MIO has timers available for use, but that they aren’t necessary for using MIO.

I don’t know though. I’d encourage you to confirm for yourself.


#15

Are you using the MIO timers in your program, and if so what for?