Hi
A few days ago, when I finished a small version of my networking application using MIO, I started testing and found that just for program initialization a very basic echo server uses around 2.4 MB of RAM.
I've opened an issue here https://github.com/carllerche/mio/issues/427 but it seems that Rust programs themselves use more memory than expected.
For example, this very basic 2-thread program uses 480 KB of memory, which is really a lot for a program this small.
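Something along these lines (a sketch of the kind of program I mean, not the exact snippet):

```rust
use std::thread;
use std::time::Duration;

fn main() {
    // Two threads doing essentially nothing, just to measure baseline memory usage.
    let a = thread::spawn(|| thread::sleep(Duration::from_secs(10)));
    let b = thread::spawn(|| thread::sleep(Duration::from_secs(10)));

    a.join().unwrap();
    b.join().unwrap();
}
```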
How can I get information about why Rust programs use this much memory just for initialization?
P.S. The interesting part is that the memory doesn't grow during execution, thanks to Rust's awesome type system, but I'm worried about why I need to allocate that much memory for such a simple program.
The last time I checked, I couldn't find a simple way to configure the main thread's stack size, so I just spawned a child thread with the necessary stack size (I needed huge stacks though, not small ones).
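Something like this (just a sketch; the 32 MiB figure is only an example, pick whatever your workload needs):

```rust
use std::thread;

fn main() {
    // The main thread's stack size can't easily be changed, so spawn a child
    // thread with an explicit stack size and do all the real work there.
    let handle = thread::Builder::new()
        .stack_size(32 * 1024 * 1024) // 32 MiB, purely illustrative
        .spawn(|| {
            // ... run the actual application here ...
        })
        .expect("failed to spawn worker thread");

    handle.join().expect("worker thread panicked");
}
```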
Do you really need to tweak the stack size though? My understanding is that most OSes use demand paging anyway, so reserved stack space isn't necessarily resident in memory. Also, it would be useful if you specified how exactly you measure memory consumption.
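For example, on Linux one way to see the difference between reserved address space and memory actually in use is to compare VmSize and VmRSS (a quick sketch, assuming /proc is available):

```rust
use std::fs;

fn main() {
    // VmSize is the virtual address space (including reserved thread stacks);
    // VmRSS is what is actually resident in RAM.
    let status = fs::read_to_string("/proc/self/status").expect("not on Linux?");
    for line in status.lines() {
        if line.starts_with("VmSize") || line.starts_with("VmRSS") {
            println!("{}", line);
        }
    }
}
```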
I'm building a static library implementing a specific TCP protocol on top of MIO, which should be integrated via a C interface into mobile apps, desktop apps, and server-side services. So memory usage is critical for me. I liked Rust's memory management model, which is why I started writing the library in Rust.
But it seems MIO is eating a lot of memory. I'm thinking of switching to C's libuv.
Even if threads are created with small stacks? This needs some investigation to verify and fix. I've subscribed to the mio issue; it sounds interesting.
Using a thread stack size limit, I'm now getting 8.7 MB of memory (previously 9.3 MB), which is not a big difference compared to libuv's 380 KB memory usage.
I'll check this out https://github.com/sorear/libuv-rs and come back with results.
In general, stack space is why co-operative multi-tasking (like goroutines) is preferred for this kind of IO-bound networking task. I suggest you create a fixed-size thread pool and use an event-based architecture with nonblocking IO. The idea is that the event queue represents pending actions and is the only blocking call in the program, so threads block only when reading from an empty event queue. Whenever you do IO, you want to save a continuation so that the IO completion event contains the necessary information to continue the task later, then take the next event from the queue to process.

This kind of multi-tasking runs optimally with one thread per core (or 2 per core with hyperthreading), so on an i7 with 8 virtual cores you would need at most 8 threads, and therefore 16 MB for 8 full-size stacks. Using any more than this wastes memory, because the CPU cannot really do more than 8 things at one time. You could of course also reduce the stack size, but you risk a runtime stack overflow if you make the stacks too small.
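A rough sketch of that shape in Rust (assuming continuations are just boxed closures and an mpsc channel plays the role of the event queue; the pool size of 8 is only illustrative):

```rust
use std::sync::{mpsc, Arc, Mutex};
use std::thread;

// A continuation carries everything needed to resume a task once its IO completes.
type Continuation = Box<dyn FnOnce() + Send + 'static>;

fn main() {
    // The channel acts as the event queue; receiving from it is the only place
    // a worker thread blocks.
    let (event_queue, events) = mpsc::channel::<Continuation>();
    let events = Arc::new(Mutex::new(events));

    // Fixed-size pool: roughly one worker per core.
    let workers: Vec<_> = (0..8)
        .map(|_| {
            let events = Arc::clone(&events);
            thread::spawn(move || loop {
                // Take the next pending event; exit once all senders are gone.
                let job = match events.lock().unwrap().recv() {
                    Ok(job) => job,
                    Err(_) => break,
                };
                job();
            })
        })
        .collect();

    // An IO completion handler would push a continuation like this, capturing
    // whatever state the task needs in order to carry on.
    let connection_id = 42;
    event_queue
        .send(Box::new(move || {
            println!("IO completed for connection {}, resuming task", connection_id);
        }))
        .unwrap();

    drop(event_queue); // close the queue so the workers shut down
    for w in workers {
        w.join().unwrap();
    }
}
```

In a real server the continuations would be enqueued by the IO readiness/completion mechanism rather than sent by hand, and you'd probably use a proper work queue instead of a mutex around an mpsc receiver, but the blocking structure is the same.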
One question, what would be the best way to get a continuation in Rust to pass to the IO request?
As far as Rust implementations go, I only know of MIO, which provides non-blocking IO for network operations and channels.
I'm already using MIO (GitHub - tokio-rs/mio: Metal I/O library for Rust) for non-blocking async IO, but by default it runs in only one thread, so now I'm trying to run multiple event loops to get multicore performance.
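Roughly what I have in mind is one poll loop per thread, each owning its own listener. This is only a sketch against the newer mio Poll API (0.7+); the thread count and port numbers are arbitrary, and older mio versions expose a different event-loop API:

```rust
use std::thread;

use mio::net::TcpListener;
use mio::{Events, Interest, Poll, Token};

const LISTENER: Token = Token(0);

fn main() {
    // One event loop per core: each thread owns its own Poll instance.
    let handles: Vec<_> = (0..4)
        .map(|i| {
            thread::spawn(move || -> std::io::Result<()> {
                let mut poll = Poll::new()?;
                let mut events = Events::with_capacity(256);

                // Each loop binds its own listener here for simplicity; sharing one
                // listener across loops (e.g. via SO_REUSEPORT) is another option.
                let addr = format!("127.0.0.1:{}", 9000 + i).parse().unwrap();
                let mut listener = TcpListener::bind(addr)?;
                poll.registry()
                    .register(&mut listener, LISTENER, Interest::READABLE)?;

                loop {
                    poll.poll(&mut events, None)?;
                    for event in events.iter() {
                        if event.token() == LISTENER {
                            // Accept and register new connections here.
                            let _ = listener.accept();
                        }
                    }
                }
            })
        })
        .collect();

    for h in handles {
        let _ = h.join();
    }
}
```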
Above you said MIO was using a lot of memory, and this is resolved by reducing stack space. This would suggest MIO is using threads internally. As one of the points of using nonblocking IO is to increase concurrency, this seems to be a problem. The kernel can provide asynchronous requests without requiring lots of threads (the kernel uses interrupts) so the problem would appear to be MIO itself. You need an async library that passes the continuation to the kernel, rather than using lots of user space threads for the concurrency.
So are you saying MIO wasn't responsible for the large memory usage in the above use-case, as the program using the library appears to be single threaded? The reduction in memory usage when stack size is reduced clearly points to a large number of threads running.
Why is MIO using timers? Is it polling something? With async IO you simply need to register the continuation function with the kernel and return to the main event loop (or, without continuations, return to the event loop and wait for events, including IO completion events).