I'm unable to make out the reason people use diesel when it's synchronous. E.g. if my actix-web server has 4 threads in its runtime, diesel can block all of them with just 4 simultaneous requests.
Or there is some other magic going on that I haven't yet understood.
First of all, Diesel far pre-dates the existence of async in Rust.
Second, I sense a default (and wrong) assumption that async is somehow unconditionally "superior" or "faster". That isn't the case. You can perfectly well write high-performing applications in synchronous code. As a trivial solution, you can offload DB connections to a separate background thread and communicate with it via a queue – you don't have to spin up a thread for every request just to have it blocked. (Also, you'd better design your queries so that they don't take more than a few milliseconds anyway. If they take up a significant portion of your code's running time, you have bigger problems.)
I know and understand that async has its own performance trade-offs, but what I'm particularly looking at are backend web applications that need to interact with a database over the network. A network request-response (in this case with the database) inherently adds latency on top of the (real) time taken to process the query. So the database can end up sitting idle between those round-trips, because I wasn't able to send a 5th request before one of the four was processed.
I don't see any way to implement this other than pushing queries to a queue and popping them in another thread, say a query-processing thread. And this particular implementation would make query execution even more synchronous, hence the database ending up idle between round-trips, when in reality there is a lot of work waiting to be done.
On a side note, thanks for the quick response.
I don't understand what you mean by this. The point is exactly that pushing and popping queries is a lot quicker than waiting for the DB. With an appropriate concurrent queue data structure, pushing and popping elements is trivial and probably doesn't cost more than a couple of atomic loads and stores. The DB thread can signal back (e.g. via a condition variable or a semaphore, which is equally cheap) when it's done with a particular query, and in the meantime your thread can go and do other, more interesting things.
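The queue-plus-signal pattern described above can be sketched with standard-library channels alone; a channel `recv` plays the role of the condition variable/semaphore. The `Request` type and the simulated worker below are placeholders for illustration, not Diesel API:

```rust
use std::sync::mpsc;
use std::thread;

// A request carries the "query" plus a channel to signal the result back.
struct Request {
    query: String,
    reply: mpsc::Sender<String>,
}

// Spawn a worker that would own the (blocking) DB connection and drains the queue.
// Here the "execution" is simulated; a real worker would run a Diesel query.
fn spawn_db_worker() -> mpsc::Sender<Request> {
    let (tx, rx) = mpsc::channel::<Request>();
    thread::spawn(move || {
        for req in rx {
            let result = format!("result of `{}`", req.query);
            // Signaling back is as cheap as the condvar/semaphore mentioned above.
            let _ = req.reply.send(result);
        }
    });
    tx
}

fn run_query(tx: &mpsc::Sender<Request>, query: &str) -> String {
    let (reply_tx, reply_rx) = mpsc::channel();
    tx.send(Request { query: query.to_string(), reply: reply_tx }).unwrap();
    // Blocks only this caller; other threads can keep pushing work meanwhile.
    reply_rx.recv().unwrap()
}

fn main() {
    let tx = spawn_db_worker();
    assert_eq!(run_query(&tx, "SELECT 1"), "result of `SELECT 1`");
    println!("ok");
}
```

Pushing onto the queue is cheap; only the caller that actually needs the result blocks on `recv`, while the worker thread serializes access to the single connection.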
You can use diesel-async for async queries in diesel.
Thanks, though I know about the existence of async ORMs and libraries. What I'm trying to do is understand why synchronous database ORMs like diesel exist and are widely used in backend applications, other than for the quoted point.
It also helps to recognize that not everything needs to be async. Which was also hinted at by the post you quoted.
For instance, maybe you are writing a simple CLI tool that just needs to interact with a remote database for one-off admin purposes. There is little incentive to use a complex multi-threaded runtime for cases like this.
It's the same reason that crates like ureq exist:
Ureq uses blocking I/O rather than Rust's newer asynchronous (async) I/O. Async I/O allows serving many concurrent requests without high costs in memory and OS threads. But it comes at a cost in complexity. Async programs need to pull in a runtime (usually async-std or tokio). They also need async variants of any method that might block, and of any method that might call another method that might block. That means async programs usually have a lot of dependencies - which adds to compile times, and increases risk.
The costs of async are worth paying, if you're writing an HTTP server that must serve many many clients with minimal overhead. However, for HTTP clients, we believe that the cost is usually not worth paying. The low-cost alternative to async I/O is blocking I/O, which has a different price: it requires an OS thread per concurrent request. However, that price is usually not high: most HTTP clients make requests sequentially, or with low concurrency.
That's why ureq uses blocking I/O and plans to stay that way. Other HTTP clients offer both an async API and a blocking API, but we want to offer a blocking API without pulling in all the dependencies required by an async API.
The most important question to ask here is: why do you believe async database connections are that important for your application design? Do you feel that this could be a performance problem? If so, have you performed benchmarks, or at least looked for benchmarks, that support that hypothesis? The reasons presented in your question are assumptions at best. As a strong counterpoint: crates.io is doing fine using diesel, so you would want to have compelling evidence before claiming something different.
The general answer to this kind of question consists of several points that need to be considered:
- First and foremost, as a crate author it's important that you can implement an API for your users that you believe is helpful for them. That's not the case (yet?) for an async database library, due to language limitations around cancellation safety. Note that this is something that affects other crates as well, even if they don't talk about it much. These are fundamental issues that likely can only be solved at the language level.
- Is there a considerable advantage to the other approach (i.e. using async)? So far I have not seen any compelling reasoning that this is the case, neither in terms of performance nor in terms of usability. For the former, you can just use a fixed thread pool to execute the "blocking" queries on, as you are limited to a low number of connections anyway. (Check out deadpool-diesel, for example, if you're looking for a ready-to-go solution.) For the latter point: async solves some issues with the sync approach (queries need to use 'static data), but introduces other issues (cancellation safety…)
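The fixed-thread-pool idea from the second point can be sketched with the standard library alone. This is a minimal sketch, not deadpool-diesel's actual API: the "query execution" is simulated, and in a real application each worker would own one pooled Diesel connection, reflecting that the connection count, not the thread count, is the real limit:

```rust
use std::sync::{mpsc, Arc, Mutex};
use std::thread;

// A tiny fixed pool: `workers` threads share one queue of "queries".
fn run_on_pool(workers: usize, queries: Vec<String>) -> Vec<String> {
    let (job_tx, job_rx) = mpsc::channel::<String>();
    // std's Receiver isn't clonable, so the workers share it behind a Mutex.
    let job_rx = Arc::new(Mutex::new(job_rx));
    let (res_tx, res_rx) = mpsc::channel::<String>();

    let handles: Vec<_> = (0..workers)
        .map(|_| {
            let job_rx = Arc::clone(&job_rx);
            let res_tx = res_tx.clone();
            thread::spawn(move || loop {
                // Lock only long enough to pop one job.
                let job = match job_rx.lock().unwrap().recv() {
                    Ok(job) => job,
                    Err(_) => break, // queue closed and drained, worker exits
                };
                // Simulated "blocking query execution".
                let _ = res_tx.send(format!("done: {job}"));
            })
        })
        .collect();

    for q in queries {
        job_tx.send(q).unwrap();
    }
    drop(job_tx); // close the queue so workers terminate
    drop(res_tx); // keep only the workers' clones alive

    let mut results: Vec<String> = res_rx.iter().collect();
    for h in handles {
        h.join().unwrap();
    }
    results.sort(); // completion order is nondeterministic
    results
}

fn main() {
    let queries = (1..=8).map(|i| format!("q{i}")).collect();
    let results = run_on_pool(4, queries);
    assert_eq!(results.len(), 8);
    println!("ok");
}
```

Eight queued "queries" are executed by four workers; callers can keep enqueueing work without spawning a thread per request, which is exactly the shape a diesel connection pool gives you.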
Hopefully that answers the question.