How to deal with variable-latency sync functions in an async context?

Hi there!

I'm currently building a web server with axum, in good ol' async fashion.

For data persistence, I want to use the embedded database redb, which only has synchronous methods.

Calling a method to read data could take anywhere from a microsecond to several milliseconds, depending on whether the data is in memory or on disk. Here is the problem I have:

  • If I call redb's methods synchronously from an async context, a slow call may block the executor thread entirely, which is not viable;
  • If I use e.g. tokio's spawn_blocking function to move the call onto a dedicated worker thread, handing the task over adds some latency, especially if a new thread has to be created for it.

So I'm basically stuck with either losing a lot of performance whenever redb takes a long time (case 1), or losing a little performance on every call, including the many that would have returned quickly (case 2).

What solution would there be to have minimum latency when redb returns data quickly without blocking the thread when it takes longer, knowing that it's not possible to know ahead of time how long the method will take?

Thanks in advance for your help!

What overhead have you measured the spawn_blocking approach to add?

I haven't done any measurements (micro-benchmarking microsecond-scale delays is harder than I'm comfortable with); I'm mostly speaking from a theoretical perspective. Moving a task to a dedicated thread (and, more importantly, spawning a new thread) has a cost, probably not a very high one, but still.
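For a rough first number, a round trip to an already-running thread can be timed with nothing but the standard library. The sketch below is an assumption-laden stand-in, not a measurement of tokio's spawn_blocking itself: `average_round_trip` is a hypothetical helper, the worker does trivial work instead of calling redb, and the caller blocks on the reply rather than awaiting it.

```rust
use std::sync::mpsc;
use std::thread;
use std::time::{Duration, Instant};

/// Hypothetical helper: average time to send a request to a worker thread
/// and receive its reply, over `iterations` round trips.
fn average_round_trip(iterations: u32) -> Duration {
    let (req_tx, req_rx) = mpsc::channel::<(u64, mpsc::Sender<u64>)>();

    // Long-lived worker: stands in for a thread that would own the database.
    let worker = thread::spawn(move || {
        for (value, reply_tx) in req_rx {
            // Trivial "work", so the measurement is dominated by the handoff.
            let _ = reply_tx.send(value + 1);
        }
    });

    let start = Instant::now();
    for i in 0..iterations {
        let (reply_tx, reply_rx) = mpsc::channel();
        req_tx
            .send((u64::from(i), reply_tx))
            .expect("worker is alive");
        let answer = reply_rx.recv().expect("worker replied");
        assert_eq!(answer, u64::from(i) + 1);
    }
    let elapsed = start.elapsed();

    drop(req_tx); // closing the channel ends the worker's loop
    worker.join().expect("worker panicked");

    // Guard against division by zero when `iterations` is 0.
    elapsed / iterations.max(1)
}
```

Calling `average_round_trip(100_000)` in a release build and printing the result gives an order-of-magnitude figure to compare against a fast in-memory read. Results will vary with the machine and the load, so treat it as a sanity check rather than a benchmark.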

If it touches the disk, you should certainly run it on a separate blocking thread. That said, if your server will not see much load, calling it directly is fine. A request has to wait for its own data either way, but with a synchronous call on the executor thread you make unrelated requests wait too. Under low load that may be acceptable, and you can most probably just do it like that.

Instead of spawning, you could have a single dedicated thread and use message passing. Or maybe there are non-blocking operations (if you try to read, they return a 'Would block' error instead of waiting). With those, you can write your own async wrappers; the only missing piece is knowing when to wake. In the worst case you can just schedule a timer.
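The single-thread, message-passing idea could look roughly like the sketch below. It uses only the standard library, and everything in it is illustrative: `Request`, `DbHandle` and the `HashMap` store are hypothetical stand-ins for whatever would wrap redb, and the callers block on the reply for simplicity.

```rust
use std::collections::HashMap;
use std::sync::mpsc;
use std::thread;

/// Requests the worker understands; each carries a channel for its reply.
enum Request {
    Get { key: String, reply: mpsc::Sender<Option<String>> },
    Insert { key: String, value: String, reply: mpsc::Sender<()> },
}

/// Cloneable handle; every clone talks to the same worker thread.
#[derive(Clone)]
struct DbHandle {
    tx: mpsc::Sender<Request>,
}

impl DbHandle {
    /// Spawns the single worker that owns the store
    /// (a HashMap standing in for the real database).
    fn spawn() -> Self {
        let (tx, rx) = mpsc::channel::<Request>();
        thread::spawn(move || {
            let mut store: HashMap<String, String> = HashMap::new();
            // The loop ends once every DbHandle has been dropped.
            for request in rx {
                match request {
                    Request::Get { key, reply } => {
                        let _ = reply.send(store.get(&key).cloned());
                    }
                    Request::Insert { key, value, reply } => {
                        store.insert(key, value);
                        let _ = reply.send(());
                    }
                }
            }
        });
        DbHandle { tx }
    }

    /// Sends an insert request and waits until the worker has applied it.
    fn insert(&self, key: &str, value: &str) {
        let (reply, rx) = mpsc::channel();
        self.tx
            .send(Request::Insert { key: key.to_owned(), value: value.to_owned(), reply })
            .expect("worker is alive");
        rx.recv().expect("worker replied");
    }

    /// Sends a get request and waits for the worker's answer.
    fn get(&self, key: &str) -> Option<String> {
        let (reply, rx) = mpsc::channel();
        self.tx
            .send(Request::Get { key: key.to_owned(), reply })
            .expect("worker is alive");
        rx.recv().expect("worker replied")
    }
}
```

In an async handler the blocking `recv` would defeat the purpose, so the reply channel would presumably be replaced with an awaitable one such as `tokio::sync::oneshot`, letting the handler await the answer while the worker thread does the blocking call.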


The thing is, I may need to do several operations at once on the DB without knowing ahead of time if it will block or be resolved instantly.

Message passing may be a good idea. If it's mainly small messages, it should be fast enough; IIRC std's crossbeam-inspired channels are now very fast.

Seems like there's no "perfect" solution unfortunately, and message passing requires quite a lot of boilerplate, but hey it's better than nothing. Thanks for your help :slight_smile:

Can't you set some option to return an error instead of blocking? Nearly every system has one, even something as C-flavored as NetX Duo. Linux sockets have a non-blocking option, for example.

No, unfortunately redb only has fully blocking methods, with no way to check beforehand whether a call will block for a long time or not.

Extending Ddystopia's answer: I believe the actor pattern is a good general solution for bridging sync and async code. One thing worth noting is that tokio's channels support both blocking and async reads and writes, so for message passing you can just rely on tokio's channels.
