Asynchronous HTTP client

I have started to implement an asynchronous HTTP client, available on crates.io under the name async-http-client.

This is a simple, quite incomplete implementation of HTTP/1.1. For instance, it does not support chunked transfer encoding (or gzip or deflate, for that matter), nor does it handle deprecated HTTP/1.0 features like header folding or a response without Content-Length that is terminated by closing the connection. It doesn't do TLS either, though I don't know how difficult that would be to add.

The distinctive features of async-http-client are:

  • asynchronous (using tokio-core)
  • available on crates.io
  • lightweight and simple, with few dependencies
  • standalone HTTP response parser written with nom

Wishlist:

  • support Transfer-Encoding: chunked
  • maybe provide a Stream interface, so the caller can process a chunked response as it arrives rather than waiting for the complete response
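For readers unfamiliar with the first wishlist item, here is a minimal sketch of what decoding a `Transfer-Encoding: chunked` body involves. It is not code from async-http-client: `decode_chunked` is a hypothetical helper, it assumes the whole body is already in memory, and it ignores chunk extensions and trailers.

```rust
/// Decode a complete chunked body held in memory.
/// Returns `None` if the input is malformed or truncated.
/// Hypothetical helper for illustration; not part of async-http-client.
fn decode_chunked(mut input: &[u8]) -> Option<Vec<u8>> {
    let mut body = Vec::new();
    loop {
        // Each chunk starts with a hexadecimal size line ending in CRLF.
        let line_end = input.windows(2).position(|w| w == b"\r\n")?;
        let line = std::str::from_utf8(&input[..line_end]).ok()?;
        // Anything after ';' is a chunk extension, which this sketch ignores.
        let size_str = line.split(';').next()?.trim();
        let size = usize::from_str_radix(size_str, 16).ok()?;
        input = &input[line_end + 2..];
        if size == 0 {
            // A zero-sized chunk ends the body; trailers are not parsed here.
            return Some(body);
        }
        // The chunk data must be followed by its own CRLF.
        let end = size.checked_add(2)?;
        if input.len() < end || &input[size..end] != b"\r\n" {
            return None;
        }
        body.extend_from_slice(&input[..size]);
        input = &input[end..];
    }
}
```

A Stream interface would presumably yield each chunk as soon as its data is available instead of collecting everything into one `Vec<u8>` as this sketch does.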

I'm using this crate at work, hope this can be useful to somebody else!

I didn't see any crypto crates in your dependencies. Does it support HTTPS?

He said no TLS, which implies no HTTPS, I guess.

Oops! I missed that right in the original post. Thanks

It's a bit sad to see Rust starting down the path of implementing a bunch of I/O-strategy-specific network libraries. In Python-land we're just going through the process of realizing that this was a mistake, and starting to migrate to I/O-agnostic libraries + I/O-system-specific wrappers. At least, it was a mistake for us -- maybe it will be a mistake for you too?

The obvious advantage is that e.g. these days if you want to make a new async HTTP library in Python, you can start with import h11 and you automatically handle things like chunked encoding and the "obsolete line folding rule", instead of every client/server sync/async/etc. library having to reimplement this stuff from scratch. (Disclaimer: I wrote h11.) The less obvious advantage is that putting an API boundary around a "sans-I/O" library forces the library designer to really focus on figuring out how to abstract the protocol's semantics at a high level, and this seems to lead to better designs in general. Like, as far as I can tell, the only HTTP/1.1 state machine diagram on the entire internet is the one in the h11 docs, which are generated directly from the h11 source code. Extracting that thing from the RFCs took me ages, but it sure helps give me more confidence that h11 actually implements HTTP correctly (including all the weird corner cases).

Anyway, just a thought...

Normally I would just have used the httparse crate for parsing HTTP, but having to allocate memory for a maximum number of headers on the stack turned me off. And I took this as an opportunity to learn nom and make a parser that is hopefully easier to understand/extend.

I agree with your point of avoiding too many different I/O-strategy-specific libraries. The way it is now seems fine, as there are mainly 2 paths: 1) synchronous, blocking I/O provided by the standard library 2) asynchronous, non-blocking I/O expressed with Futures, using Tokio for the implementation.

I don't see it as sad that there are alternative libraries that let you use different I/O strategies; on the contrary, it is good to have a choice. Of course this crate is largely reinventing the wheel, but at the same time:

  • The stable release of Hyper is synchronous, and the asynchronous version is still in development. It is also quite heavyweight for an HTTP client.
  • there's an asynchronous wrapper for libcurl, but its documentation says that it does not scale well on Windows. Being C-based also means potential security issues and a more difficult build (again, especially on Windows).

Oh, of course there should be libraries that allow you to use different I/O strategies! The point though is that it's worth trying to factor out the protocol implementation into a pure library that does no I/O, so that the libraries that use different I/O strategies can share code.

A better example of this might be hyper-h2 (no relation to the rust hyper library, AFAIK), which is a no-I/O HTTP/2 library. It's used by synchronous clients, callback-based servers, future-based servers, etc.
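To make the "no I/O" idea concrete in Rust terms, here is a rough sketch of what such an interface could look like. Everything in it is hypothetical: the `Connection`, `Event`, `receive_data` and `next_event` names are only loosely modeled on the shape of h11's interface, and the parser handles just a status line plus Content-Length, so it is nowhere near a complete HTTP/1.1 implementation. The point is only that the type never touches a socket: the caller feeds it bytes from whatever I/O strategy it uses and pulls events out.

```rust
#[derive(Debug, PartialEq)]
enum Event {
    /// The buffered bytes are not enough to make progress.
    NeedData,
    /// The response head has been parsed.
    Head { status: u16, content_length: usize },
    /// A piece of the response body.
    Body(Vec<u8>),
    /// The whole response has been received.
    Done,
}

enum State {
    Head,
    Body { remaining: usize },
    Done,
}

/// Hypothetical sans-I/O response parser: it only ever sees byte slices.
struct Connection {
    buf: Vec<u8>,
    state: State,
}

impl Connection {
    fn new() -> Self {
        Connection { buf: Vec::new(), state: State::Head }
    }

    /// Hand over bytes read by the caller (blocking socket, Tokio, a test...).
    fn receive_data(&mut self, data: &[u8]) {
        self.buf.extend_from_slice(data);
    }

    /// Advance the state machine by at most one event.
    fn next_event(&mut self) -> Result<Event, &'static str> {
        match self.state {
            State::Head => {
                // Wait until the blank line that ends the head has arrived.
                let end = match self.buf.windows(4).position(|w| w == b"\r\n\r\n") {
                    Some(i) => i,
                    None => return Ok(Event::NeedData),
                };
                let head: Vec<u8> = self.buf.drain(..end + 4).collect();
                let text = std::str::from_utf8(&head).map_err(|_| "head is not UTF-8")?;
                let mut lines = text.split("\r\n");
                // Status line: "HTTP/1.1 200 OK" -> 200.
                let status: u16 = lines
                    .next()
                    .and_then(|l| l.split(' ').nth(1))
                    .and_then(|s| s.parse::<u16>().ok())
                    .ok_or("bad status line")?;
                // Only Content-Length is understood in this sketch.
                let mut content_length = 0usize;
                for line in lines {
                    if let Some((name, value)) = line.split_once(':') {
                        if name.eq_ignore_ascii_case("content-length") {
                            content_length = value
                                .trim()
                                .parse::<usize>()
                                .map_err(|_| "bad Content-Length")?;
                        }
                    }
                }
                self.state = if content_length == 0 {
                    State::Done
                } else {
                    State::Body { remaining: content_length }
                };
                Ok(Event::Head { status, content_length })
            }
            State::Body { remaining } => {
                if self.buf.is_empty() {
                    return Ok(Event::NeedData);
                }
                // Emit whatever body bytes are buffered, up to the declared length.
                let n = remaining.min(self.buf.len());
                let chunk: Vec<u8> = self.buf.drain(..n).collect();
                self.state = if remaining == n {
                    State::Done
                } else {
                    State::Body { remaining: remaining - n }
                };
                Ok(Event::Body(chunk))
            }
            State::Done => Ok(Event::Done),
        }
    }
}
```

A synchronous client would call `receive_data` after each blocking `read`, and a futures-based client would call it whenever the socket becomes readable; both would share the same `next_event` logic.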