Blog post: Asynchronous Rust [complaints & suggestions]


#1

I tried to use futures. Some of it was nice, and some of it wasn’t so nice. I wrote a blog post about the bits I liked & didn’t like:

https://pro.theta.eu.org/2017/08/04/async-rust.html


#2

I can definitely relate to @eeeeeta here. My experience with tokio was almost identical and incredibly frustrating, and I’m sure I’m not the only one.

I think the big problem is that there are way too many layers to tokio, leading to a really complicated design, which often means you end up spending hours on docs.rs clicking your way through the Matryoshka-doll-like layers of the tokio/futures stack. I’m not sure how you can fix this and still maintain flexibility, though… Perhaps the monadic/combinator approach that futures takes isn’t the best one? Or it could just be a documentation issue. Tokio and futures are so young that we don’t yet have enough resources for teaching people how to use them, and there aren’t any best practices people can use as guidelines.

I’m going to give tokio and futures the benefit of the doubt here. Maybe they will be easier to work with in a couple of months’ time, when people have more experience using them?

If anyone can link to larger applications using tokio and futures I’d love to peruse their source code and see how they’ve done things!


#3

Links (to pro.theta.eu.org) are giving 404s

impl Trait will make it into a release, hopefully sooner rather than later.

However, the current approach makes more sense than the post gives it credit for:

The code you write in the blog post is hideous. You have changed code from:

  • step 1
  • step 2
  • step 3

into a mess.

You seem to miss the point that poll() is an implementation detail. The basic usage of a future is chaining async code into steps using .and_then() and .map(), as opposed to CPS (continuation-passing style), which leads to hard-to-read nesting.

Reference guides aren’t user guides. I know there is a lack of guidance on usage; hopefully it will improve over time.

https://hyper.rs/guides/server/hello-world/
Notice how the struct HelloWorld has nothing in it, so a clone has no cost. To extend the code, the only part you would typically be concerned with is the contents of call. Your blog seems to dive into library implementation rather than usage.
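To make the “a clone will have no cost” point concrete, here is a minimal sketch (the struct name mirrors the hyper guide; everything else is illustrative):

```rust
// A zero-sized service struct, like the hyper hello-world example:
// cloning it compiles to nothing, because there is nothing to copy.
#[derive(Clone)]
struct HelloWorld;

fn size_of_hello() -> usize {
    std::mem::size_of::<HelloWorld>()
}

fn main() {
    let svc = HelloWorld;
    let _per_connection = svc.clone(); // free: the type occupies zero bytes
    println!("size = {}", size_of_hello());
}
```

A zero-sized type takes no space at runtime, so the per-connection clone the framework performs is genuinely free here.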


#4

Links (to pro.theta.eu.org) are giving 404s

Oh gosh, yep - thanks for pointing that out! Looks like the markdown parser jekyll uses went full potato (or I don’t know how markdown links work). Should be fixed.

Okay, fair point; no one would like to (or should…) write enum-based futures code at the moment. My eventual hope is to write a macro that would expand into that sort of code, or something like that. The point is (and I probably put this across badly) that the underlying representation of async code should be enum-based, not wrapper-struct-based. Obviously the former is really hard to do without a macro, which is why it hasn’t been done.
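For anyone unsure what “enum-based representation” means here, a rough, synchronous sketch (not the actual futures machinery, and the variant names are made up): each step of the async computation becomes a variant, and advancing is a match.

```rust
// Hand-rolled state machine standing in for an "enum-based future":
// each variant is a suspension point; advance() is a poll-like step.
enum State {
    Start(u32),
    Doubled(u32),
    Done(u32),
}

fn advance(state: State) -> State {
    match state {
        State::Start(n) => State::Doubled(n * 2), // first "async" step
        State::Doubled(n) => State::Done(n + 1),  // second "async" step
        State::Done(n) => State::Done(n),         // terminal: stays done
    }
}

fn run(mut state: State) -> u32 {
    loop {
        state = advance(state);
        if let State::Done(n) = &state {
            return *n;
        }
    }
}

fn main() {
    println!("{}", run(State::Start(3))); // 3 -> 6 -> 7
}
```

Writing this by hand for real code is painful, which is exactly why it wants a macro (or, eventually, language support) to generate it.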

Yeah, I know, I read the guide. My point was, however, that inevitably you want to store some form of state in your web handler, in which case you’re somewhat stuck - you’ll end up cloning it every time you handle a new request!


#5

Wouldn’t that be at most just a pointer (Arc?) which points to a pool of database connections? I don’t think you’d want to copy around your entire application state, at most it’d just be a pointer to your state which has some sort of synchronisation primitive (RwLock, channels, etc) internally.
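A small sketch of that point, with a hypothetical AppState standing in for the connection pool: the per-request clone is literally one pointer.

```rust
use std::sync::{Arc, RwLock};

// Hypothetical application state; a real one might hold a DB pool.
struct AppState {
    connections: Vec<String>,
}

fn arc_is_pointer_sized() -> bool {
    // Cloning the Arc copies one pointer and bumps a refcount;
    // the AppState itself is never copied.
    std::mem::size_of::<Arc<RwLock<AppState>>>() == std::mem::size_of::<usize>()
}

fn main() {
    let state = Arc::new(RwLock::new(AppState { connections: vec![] }));
    let handle = Arc::clone(&state); // per-request clone: just a pointer
    handle.write().unwrap().connections.push("conn-1".to_string());
    assert_eq!(state.read().unwrap().connections.len(), 1);
    println!("pointer-sized: {}", arc_is_pointer_sized());
}
```

Both handles see the same state through the RwLock, which is the synchronisation primitive doing the real work here.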


#6

It should be an Arc<Mutex<T>>, yeah, or an Arc<SomeDatabaseConnection>. Still (slightly?) inefficient, as said in the blog post:

It therefore makes no sense to clone our Service, which presumably just holds a bunch of Arc<Mutex>s, for every request - we’re just needlessly incrementing & decrementing a refcount (and possibly incurring some synchronisation costs from the atomic operations).
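The “needlessly incrementing & decrementing a refcount” can be observed directly with Arc::strong_count (a Vec stands in for the real state here):

```rust
use std::sync::Arc;

// Each per-request clone just bumps the atomic count and drops it again.
fn count_during_request(state: &Arc<Vec<u8>>) -> (usize, usize) {
    let before = Arc::strong_count(state);
    let per_request = Arc::clone(state); // increment (one atomic op)
    let during = Arc::strong_count(state);
    drop(per_request);                   // decrement (another atomic op)
    (before, during)
}

fn main() {
    let state = Arc::new(vec![1, 2, 3]);
    let (before, during) = count_during_request(&state);
    println!("{} -> {}", before, during);
}
```

Whether those two atomic operations per request matter in practice is exactly what the rest of this thread argues about.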


#7

If your service is single-threaded, you can use Rc<Cell<T>>, which has very little overhead; if it’s multi-threaded, you would need a mutex to protect your state anyway.

There is a slight overhead in the reference counting itself, compared to just knowing when it’s safe to free the service state. However, if you really wanted, you could place the service state further up the stack and pass in a borrow to the service, or even store it in a global variable, and access it that way, which avoids the reference counting overhead.
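A quick sketch of the single-threaded option (the “hit counter” state is hypothetical): Rc’s refcount is a plain integer bump, and Cell gives interior mutability with no locking at all.

```rust
use std::cell::Cell;
use std::rc::Rc;

// Single-threaded service state shared across requests, no atomics, no mutex.
fn serve_two_requests() -> u64 {
    let hits = Rc::new(Cell::new(0u64)); // shared service state
    for _ in 0..2 {
        let per_request = Rc::clone(&hits); // non-atomic refcount bump
        per_request.set(per_request.get() + 1);
    }
    hits.get()
}

fn main() {
    println!("hits = {}", serve_two_requests());
}
```

Rc and Cell are !Send, so the compiler itself enforces that this cheaper option stays single-threaded.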


#8

I know that Rust actually makes you detest and avoid allocations wherever possible, which is probably why a lot of Rust libraries and projects are quite performant (the compiler is not quite competitive with C or C++ compilers yet, I’d say, and things like bounds checking need to be manually circumvented, …).

But in the context of all the things that usually happen on a web request (multiple DB queries, Redis queries, maybe external API calls, …), the cost of one allocation or two atomic operations is really completely irrelevant, imo.

Things might be different if you are mostly serving static files or something from an in-memory cache, but that does not seem like a major use case to me. And even then, in the context of network traffic, it’s hard to imagine that it would make a difference.


#9

If you find small examples where this is true, it’s probably a good idea to put them in Bugzilla.

In general Rust is not good enough in this, and I am not seeing work/RFCs done to improve this situation. All the design/implementation work is going elsewhere, I think.


#10

In general Rust is not good enough in this, and I am not seeing work/RFCs done to improve this situation. All the design/implementation work is going elsewhere, I think.

That’s fine with me, though. I think ergonomics improvements are much more important right now to increase adoption.


#11

Only about the Service/NewService stuff, re performance:

So Service::call returns a Future, which can’t capture references from the &self argument and for all practical purposes has to be 'static. Since the future needs to capture any arguments used after any IO event or other asynchronous operation, that future needs to take a copy of the connection state anyway. So you’re reference counting your state almost definitely no matter what.

Incrementing/decrementing the refcount is extremely cheap (especially when it’s non-atomic). The expensive part is losing cache locality and performing the allocation by boxing up that data in an Rc or Arc. But you should only have to do that once, when your application starts, and then just fiddle the reference count every time you handle a connection or process a request.

In other words, every new clone of your application state is cheap once you’re already refcounting it.
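A sketch of that: allocate once at startup, and every per-request clone points at the same heap block (Arc::ptr_eq confirms no new allocation happens).

```rust
use std::sync::Arc;

// The allocation happens once; clones only touch the refcount.
fn clones_share_allocation() -> bool {
    let startup_state = Arc::new(String::from("app state")); // one allocation
    let request_a = Arc::clone(&startup_state); // no allocation, refcount only
    let request_b = Arc::clone(&startup_state); // same again
    Arc::ptr_eq(&request_a, &request_b)         // both point at the same block
}

fn main() {
    println!("{}", clones_share_allocation());
}
```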


In an ideal world, we could prove that your application state outlives the event loop, and you could just pass a plain reference through the whole thing. But the lifetime wrangling quickly becomes unmanageable, and it’s not at all clear that the performance benefits are worth the cost. There are lower-hanging fruit than that.


#12

My experience with Tokio was a lot like yours. I spent hours reading the Tokio docs and tutorials, trying to figure out how to do anything at all as a client instead of just an echo server, and in the end I just cloned a socket into two threads, and it worked on the first try without a single hard-to-understand API.

At the moment Tokio is Fancy/10 on the design, but Junk/10 on the explanation end.


#13

I have to say I like this plan a lot. Coming from the Node ecosystem, I don’t see the single-threadedness as a bad thing. People complain that you have to spin up a Node instance for each thread and then communicate through some other channel like Redis or the DB for cooperation. However, at least in the web-services space, most Node users have found that this is the best way to design anyway, as at some point you’re going to want to scale across servers. So Node apps are just built as one logical core = one server, and you can scale across as many servers as you want super easily. In my experience it works very well in practice.


#14

To give Tokio a little breather from the beating: I found @frankmcsherry’s (and his co-authors’) COST paper, http://www.frankmcsherry.org/graph/scalability/cost/2015/01/15/COST.html, an interesting and somewhat amusing read. There’s also a choice quote by Paul Barham in it that is appropriate in some circles: “You can have a second computer once you’ve shown you know how to use the first one.”

There are many good reasons to split processing across processes and machines. But it’s nice not to need more than necessary, by utilizing a given machine well. Shared-memory concurrency is difficult to get right, and to keep right in the face of code change. This is where Rust can really change the game (or has a great opportunity to) by making it safer without sacrificing (significant) performance.


#15

Wanting to split across extra machines has a lot less to do with any single machine being overworked, and a lot more to do with the fact that sometimes hardware just dies, and when that happens a cluster of 1 is now 100% dead.


#16

It’s the best because it’s the only one available. People using Go or Erlang somehow did not come to similar conclusions.
I am, for example, running data-heavy services, and the cost of serializing that data to share it between processes is just too high.


#17

Right, this is one of the “good reasons” I mentioned - I didn’t elaborate on them, but there are certainly several; hardware fault tolerance, network partition tolerance, geographic proximity zones, data center outage mitigation, load balancing, and so on.

My point was that you don’t want to be frivolous - once you’ve achieved your high-availability objectives, extra machines for load servicing are unneeded overhead (operationally and financially).


#18

I believe you mean computationally heavy. I agree there is a use case for the other, but what most people don’t realize is that people don’t do it that way just because it is the only option. It is the only option on purpose for a lot of reasons. I don’t believe anybody is talking about replacing Futures or Rust’s current threading model. Just creating a new super easy to use abstraction that will cover 95% of use cases at a reasonable level of performance. Node.js didn’t rise for being the fastest. It rose for it being super easy to write applications that were fast enough. I think Rust can take advantage of the same thing. For example, I could easily see applications being written like this:

A) Some computationally expensive task is written using traditional threads in a crate (let’s say image processing).
B) Rocket and Diesel get rewritten to take advantage of the current multithreaded futures, as their authors have a deep understanding of Rust and concurrent programming.
C) This new proposal allows a programmer new to Rust to come in and quickly write a REST API for image processing using these new, easy-to-understand single-threaded promises, where Rocket then spins up an instance per core just as Node.js does.

This makes it super easy for them to write and reason about the application while not wasting any resources.
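The instance-per-core model from C) can be sketched with the standard library alone: spawn N single-threaded workers that own their state and fan requests out over channels. This is a hedged sketch, not how Rocket actually works; N is fixed at 4 for determinism, where a real app would query the core count.

```rust
use std::sync::mpsc;
use std::thread;

// Share-nothing worker pool: each worker owns its state, no locks.
fn run_workers(requests: Vec<u32>) -> u32 {
    let n_workers = 4; // stand-in for the number of cores
    let mut senders = Vec::new();
    let (result_tx, result_rx) = mpsc::channel();

    for _ in 0..n_workers {
        let (tx, rx) = mpsc::channel::<u32>();
        senders.push(tx);
        let result_tx = result_tx.clone();
        thread::spawn(move || {
            for req in rx {
                // Hypothetical "processing" standing in for real work.
                result_tx.send(req * 2).unwrap();
            }
        });
    }
    drop(result_tx); // workers hold the remaining clones

    for (i, req) in requests.iter().enumerate() {
        senders[i % n_workers].send(*req).unwrap(); // round-robin dispatch
    }
    drop(senders); // closing the channels lets the workers finish

    result_rx.iter().sum() // collect all results
}

fn main() {
    println!("{}", run_workers(vec![1, 2, 3, 4]));
}
```

Because nothing is shared, there is no synchronisation inside a worker, which is the property the Node-style model trades parallel data access for.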


#19

No, I meant that I need to preload a lot of data and then reload it occasionally. Doing that several times over, once per process, would be a terrible option (and because of the occasional need to reload, load-then-fork wouldn’t solve the problem). In the case of Node, its limitations are a reason people abandon it completely when their needs go beyond its scope and choose Erlang/Go/Java instead.

It is the only option on purpose for a lot of reasons.

Many of them simply come down to “using parallel programming in C/C++ is hard even for experienced developers, so we can’t do that even if we wanted to, and we spent the last decade working around it”. When you have a language that does not have such a limitation, nobody throws it away because “it’s such a great idea to go back to a single thread”, as you say. There are use cases where it may be a good idea (like complex interaction with the OS that may fail or block the whole process), but I do not believe that the fact that we have many single-threaded applications is proof that it is the common case.

To put it another way: there are many benefits you can get from having proper concurrency and parallelism within a single process, and there is no reason not to reach for them. One process per core comes with many limitations. It has been the only available model for some time, so people designed solutions around it, but getting rid of it opens a lot of new doors. If you really want to, there is nothing stopping you from programming the old way at the thread level (share nothing) and still getting some of the benefits, but I don’t really see the point of limiting myself to doing it only this way.

I don’t believe anybody is talking about replacing Futures or Rusts current threading model.

Good, that was my understanding of your post.

Just creating a new super easy to use abstraction that will cover 95% of use cases at a reasonable level of performance.

I appreciate the need for that, but… Rust is not a language that aims to achieve that. This can - and should - be achieved at the application or framework level, but not always in libraries. Had the language had green threads of some sort, async would be trivial for 95% of cases as you expect, but it doesn’t, and all the other options are more complicated.
Also keep in mind that writing basic internet protocols is complicated anyway; once they are implemented, using them is easy. A lot of the complaints in the blog post also come down to poor documentation that doesn’t explain why something is done the way it is (like the lack of an explanation for having a service per connection, which I believe comes down to the fact that the service holds connection state). This certainly can and will be improved.

For example I could easily see applications being written like such

This is already done. If you are merely using (as opposed to extending or integrating with) a framework like Rocket, all of that already happens automatically. There are many Rust frameworks that manage thread pools, give you all the benefits without you needing to know how they work, and provide easy-to-use, easy-to-understand abstractions.

Node.js didn’t rise for being the fastest. It rose for it being super easy to write applications that were fast enough.

I am not negating the need to be easy to use, though if that is your top priority, Rust is not the best choice. It is a secondary goal at best here.


#20

Well, I guess we’ll agree to disagree. I think your vision for Rust is too small. With some layers of zero-cost abstraction it can become very easy to use and more than just the systems/networking language that Go and Erlang were designed to be: one where people can easily dip their toes in to write web apps and simple CLI tools, and then peel away the layers as they need to/learn to do more complex things. It will take a while and should be done slowly. I would compare it very much to ES2015, in that the most important features were simply sugar to make the language easier. One can still edit the prototype of a class, for example.

That’s why @eeeeeta’s suggestions really struck me as a good way to go.