[Concurrency] Logging and Data Aggregation with Tokio

I have a server that takes requests and logs them to a file. Also the data from the requests should be aggregated and regularly persisted to a database. I am especially concerned with performance. As a runtime I am using tokio.

Logging

For each request I want to have one line in a log file. I am thinking of try_clone-ing the tokio file for each task and then writing to it in the task. Is this even correct? Is this performant?

I have the feeling that the correct solution would be to have an mpsc channel and a dedicated log writer task. What would you recommend?

Aggregate Data

For the most part I want to count things, so my plan is to have a data structure holding all the accumulated numbers and a background task that dumps the data into a database at regular intervals. What would be the best way to organize the data structure?

  • Having everything in lock-free constructs: I am not sure if this is possible for my data. And when I read the data to dump it in the database I probably need to lock either way?
  • Having a mutex or rwlock guarding the data structures?
  • Having an mpsc channel?

What would be the best solution here?
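For plain counters, the lock-free option could look roughly like the sketch below. The `Counters` and `Snapshot` types and their field names are made up for illustration. Each `swap` is atomic on its own, but the two swaps in `take` are not one atomic step, so a request recorded in between can have its count in one snapshot and its bytes in the next. Nothing is lost, and for simple totals that is often acceptable.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

/// Hypothetical set of counters shared by all request handlers
/// (for example behind an `Arc`). Field names are placeholders.
#[derive(Default)]
struct Counters {
    requests: AtomicU64,
    bytes: AtomicU64,
}

/// Plain copy of the counter values, taken when it is time to persist them.
#[derive(Debug, PartialEq, Eq)]
struct Snapshot {
    requests: u64,
    bytes: u64,
}

impl Counters {
    /// Called once per request; never blocks and takes no lock.
    fn record(&self, bytes: u64) {
        self.requests.fetch_add(1, Ordering::Relaxed);
        self.bytes.fetch_add(bytes, Ordering::Relaxed);
    }

    /// Reads each counter and resets it to zero.
    /// The background task would write the returned values to the database.
    fn take(&self) -> Snapshot {
        Snapshot {
            requests: self.requests.swap(0, Ordering::Relaxed),
            bytes: self.bytes.swap(0, Ordering::Relaxed),
        }
    }
}
```

If the aggregated data is more than a handful of integers (say, a map keyed by path), a `Mutex` around the map that the background task swaps out for an empty one is a simpler alternative to reach for.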

As you can see I am not very comfortable with the whole concurrency topic. I am not sure what the performance cost of different primitives is. At the moment I am leaning in the direction of using mpsc channels for both issues and just avoiding the whole synchronization. Doing the synchronization badly will definitely lead to low performance, I guess?

Writing to try_clone-d handles from separate tasks will result in interleaved lines in your log, and it may cause issues when you drop the file, though I'm not sure about that.


I recommend the following solution:

  1. Use std::thread::spawn to spawn a thread for writing the file.
  2. Use a crossbeam unbounded channel to send lines to the thread.
  3. Receive them in a loop and write them.
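The three steps above can be sketched as follows. To stay within the standard library, this version uses `std::sync::mpsc::channel` (which is unbounded) in place of a crossbeam channel. The function name `spawn_log_writer` and the flush-after-each-batch policy are assumptions made for the example, not part of the recommendation.

```rust
use std::fs::OpenOptions;
use std::io::{self, BufWriter, Write};
use std::path::PathBuf;
use std::sync::mpsc::{self, Sender};
use std::thread::{self, JoinHandle};

/// Spawns a dedicated OS thread that owns the log file.
/// Returns the sending half of the channel and the thread's join handle.
fn spawn_log_writer(
    path: PathBuf,
) -> io::Result<(Sender<String>, JoinHandle<io::Result<()>>)> {
    let file = OpenOptions::new().create(true).append(true).open(path)?;
    // `mpsc::channel` is unbounded, so `send` never waits for the writer.
    let (tx, rx) = mpsc::channel::<String>();

    let handle = thread::spawn(move || -> io::Result<()> {
        let mut out = BufWriter::new(file);
        // `recv` blocks until a line arrives and fails once every Sender is dropped.
        while let Ok(line) = rx.recv() {
            writeln!(out, "{}", line)?;
            // Drain lines that are already queued, then flush once per batch.
            for queued in rx.try_iter() {
                writeln!(out, "{}", queued)?;
            }
            out.flush()?;
        }
        Ok(())
    });

    Ok((tx, handle))
}
```

Each request handler keeps its own clone of the `Sender` and calls `tx.send(line)`. Since only one thread writes to the file, lines cannot interleave. Dropping all senders ends the loop, so joining the handle at shutdown ensures everything queued has been written.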

Thanks!

Is there an advantage to using a thread instead of a Tokio task?

Is there an advantage to using crossbeam instead of tokio::mpsc? I'd like to keep my dependencies to a minimum.

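Yes. No OS provides a true async file API that Tokio can build on, so Tokio runs file operations as blocking calls on a separate thread pool. That makes it more expensive to keep file IO in Tokio than to do it on your own thread. You could also use std::sync::mpsc rather than crossbeam. You can't use a Tokio mpsc channel because the receiver is in non-async code.

The channel should be unbounded because sending on an unbounded channel never waits, so it does not matter whether the sender is in sync or async code. Receiving does wait, so the type of channel has to match where the receiver is.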
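A small standard-library sketch of the difference on the sending side: an unbounded channel accepts every message immediately, while a bounded one reports that it is full (or, with the plain `send`, blocks until there is room). The helper names here are invented for the illustration.

```rust
use std::sync::mpsc::{self, TrySendError};

/// Sends `n` messages on an unbounded channel while nothing is receiving.
/// Returns how many sends succeeded; none of them has to wait.
fn unbounded_sends(n: usize) -> usize {
    let (tx, rx) = mpsc::channel::<usize>();
    let sent = (0..n).filter(|&i| tx.send(i).is_ok()).count();
    drop(tx);
    // Everything that was sent is still queued for the receiver.
    assert_eq!(rx.iter().count(), sent);
    sent
}

/// With a bounded channel of capacity 1, the second message does not fit.
/// `try_send` reports that; a plain `send` would block the caller instead.
fn bounded_rejects_when_full() -> bool {
    let (tx, _rx) = mpsc::sync_channel::<usize>(1);
    tx.try_send(1).unwrap();
    matches!(tx.try_send(2), Err(TrySendError::Full(2)))
}
```

The trade-off is that an unbounded queue can grow without limit if the writer falls behind, so memory use depends on the writer keeping up.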


Check out https://docs.rs/tokio/0.2.22/tokio/sync/mpsc/index.html#communicating-between-sync-and-async-code
