[Released!] Multipart 0.5.0: new apis; integration with Iron, tiny_http, hyper::Client

New Documentation Link!


Announcement/discussion on /r/rust

Big changes spanning several weeks' work! Before I make a full release, I'd like comments on the changes as well as for people to try them out and give feedback.

I have published this PR as 0.5.0-alpha.1 to make it easier to try out.

The rendered documentation for this PR (including all optional features) is available here.

New Integrations

  • Hyper's new Client API (wraps RequestBuilder)
  • Iron (!!!), including a convenient middleware which saves all POST fields and files to a data structure which is accessible via the request object (with sane limits on file sizes)
  • tiny_http's server-side
  • Nickel, blocked on them upgrading to Hyper 0.7 so we don't have to build two different versions

Changes to Client APIs

  • New lazy-writing Multipart object which implements Read (part of wrapping the RequestBuilder API)
  • Eager-writing Multipart object now returns errors immediately
  • chain_result! macro is exported with a syntax for binding the wrapped value in the previous result
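
To make the lazy-writing idea concrete, here is a minimal std-only sketch of a request body that implements Read and only pulls from its parts when the consumer asks for bytes. The names `LazyBody`, `add_text`, `add_stream` and `finish` are hypothetical and chosen for illustration; they are not the crate's actual API.

```rust
use std::io::{self, Cursor, Read};

/// Hypothetical lazy multipart body: parts are queued up front, but
/// nothing is pulled from them until the consumer calls `read`.
pub struct LazyBody {
    boundary: String,
    // Boxed readers so that fields can be added dynamically.
    parts: Vec<Box<dyn Read>>,
    // Index of the part currently being drained.
    current: usize,
}

impl LazyBody {
    pub fn new(boundary: &str) -> Self {
        LazyBody { boundary: boundary.to_string(), parts: Vec::new(), current: 0 }
    }

    // Queue the delimiter line and headers that open a form-data part.
    fn push_header(&mut self, name: &str) {
        let header = format!(
            "--{}\r\nContent-Disposition: form-data; name=\"{}\"\r\n\r\n",
            self.boundary, name
        );
        self.parts.push(Box::new(Cursor::new(header.into_bytes())));
    }

    /// Queue a text field.
    pub fn add_text(self, name: &str, value: &str) -> Self {
        self.add_stream(name, Cursor::new(value.as_bytes().to_vec()))
    }

    /// Queue an arbitrary reader (e.g. an open file); it is read on demand.
    pub fn add_stream<R: Read + 'static>(mut self, name: &str, stream: R) -> Self {
        self.push_header(name);
        self.parts.push(Box::new(stream));
        self.parts.push(Box::new(Cursor::new(b"\r\n".to_vec())));
        self
    }

    /// Queue the closing delimiter.
    pub fn finish(mut self) -> Self {
        let end = format!("--{}--\r\n", self.boundary);
        self.parts.push(Box::new(Cursor::new(end.into_bytes())));
        self
    }
}

impl Read for LazyBody {
    fn read(&mut self, buf: &mut [u8]) -> io::Result<usize> {
        if buf.is_empty() {
            return Ok(0);
        }
        // Drain parts in order; move on once a part reports end of input.
        while self.current < self.parts.len() {
            let n = self.parts[self.current].read(buf)?;
            if n > 0 {
                return Ok(n);
            }
            self.current += 1;
        }
        Ok(0)
    }
}
```

Because the body is just a Read, it can be handed to anything that accepts a reader as the request body, which is presumably what makes wrapping a builder-style client API practical.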

Changes to Server APIs

  • Removed the assumption in HttpRequest that the request object impls Read for the body
  • Implemented HttpRequest for &mut hyper::server::Request so that it doesn't have to be consumed if the user doesn't want it to be
  • Added tempdir, which is used for file saving by default, plus a bunch of new methods for controlling where and how files are saved

Beginner Opportunities

Mentored projects available here.

Deferred Ideas

  • A version of multipart::client::lazy::Multipart which uses static chaining a la std::io::Chain. I figured it wouldn't be very useful as it cannot be constructed dynamically (each new field would change the type). I might change my tune if there's a lot of want for statically dispatched requests with fixed fields.
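
The trade-off can be seen with std::io::Chain alone. In this sketch (the function names and the toy "a=1;" payloads are illustrative assumptions, not anything from the crate), each statically chained field nests another Chain in the type, while the boxed version keeps a single type however many fields are added.

```rust
use std::io::{Chain, Cursor, Read};

// Static chaining: every extra field wraps the previous type in another
// `Chain`, so the number of fields is baked into the type.
type OneField = Cursor<&'static [u8]>;
type TwoFields = Chain<OneField, Cursor<&'static [u8]>>;
type ThreeFields = Chain<TwoFields, Cursor<&'static [u8]>>;

/// Statically dispatched body with exactly three fixed fields.
pub fn static_body() -> ThreeFields {
    Cursor::new(&b"a=1;"[..])
        .chain(Cursor::new(&b"b=2;"[..]))
        .chain(Cursor::new(&b"c=3;"[..]))
}

/// Dynamically built body: one type for any number of fields, at the
/// cost of a heap allocation and a virtual call per added field.
pub fn dynamic_body(fields: &[&'static [u8]]) -> Box<dyn Read> {
    let mut body: Box<dyn Read> = Box::new(std::io::empty());
    for &field in fields {
        body = Box::new(body.chain(Cursor::new(field)));
    }
    body
}
```

Note that `static_body` cannot be written as a loop over a runtime list of fields, because each iteration would need a different type for the accumulator.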

Hi, I am contributing to the rotor-http client and server implementation. Multipart form handling is out of scope for this project but users of this crate of course will want to use multipart encoding.

What specifications does the multipart crate implement? The tight coupling of multipart with the mime crate is not optimal for users who want to parse media types with their own code and do not need mime. Also, hyper integration is not of interest to rotor's users.

What I would really like to see is a library for multipart handling that just does the parsing, has very few dependencies, avoids allocations, and is really fast, like httparse for HTTP/1.x. But this is the task of another crate (multipart could depend on it).

Note: The docs link on GitHub does not work; it redirects to crates.io.

Many of the dependencies of multipart are controllable via features. Hyper is only a default feature as a convenience for the largest potential userbase; you can turn it off (default-features = false, features = ["client", "server"]) and still use multipart with your own implementations of its traits.

Can you list some of the reasons for using a different MIME implementation? It's not "tightly" coupled with mime so much as, "uses it for MIME parsing so I don't have to screw with it". It can fairly easily be replaced with generics and trait implementations for other types as long as they support equivalent APIs. mime_guess can probably be adapted as well.

Allocations are difficult to avoid as reading to some intermediate buffer is necessary to search for the boundary string--unless you want to read byte-by-byte, and then you lose the optimizations afforded by memchr and memcpy, and also potentially pay the cost of a syscall/virtual method call each iteration. The buffer can be stack-allocated, but then you lose control over its size and your stack usage blows up. Reallocations are few and far between; the string buffer containing the field name is reused and resized when necessary, but the reading buffer remains constant-sized.

If the underlying HTTP implementation exposes a buffer with the right APIs--a BufRead impl isn't sufficient as the boundary parser needs to be able to request more bytes without emptying the buffer--then I suppose the buffer allocation can be elided entirely. However, none of Hyper, Iron, or tiny_http provide access to their buffers, so I had to use my own.
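
As a rough illustration of the kind of buffer API described above, here is a std-only sketch of a constant-sized buffer that can be topped up without discarding unread bytes, plus a boundary search over it. `BoundaryBuf`, `fill_more`, `find` and `skip_to_boundary` are hypothetical names, not the crate's internals; the search is a naive window scan where a real implementation would likely use memchr, and the sketch assumes the capacity is at least the boundary length.

```rust
use std::io::{self, Read};

/// Hypothetical constant-sized buffer that can request more bytes from
/// the source while keeping whatever has not been consumed yet.
pub struct BoundaryBuf<R> {
    source: R,
    buf: Vec<u8>, // allocated once, never resized
    start: usize, // first unread byte
    end: usize,   // one past the last valid byte
}

impl<R: Read> BoundaryBuf<R> {
    pub fn with_capacity(source: R, cap: usize) -> Self {
        BoundaryBuf { source, buf: vec![0; cap], start: 0, end: 0 }
    }

    /// Bytes that have been read from the source but not yet consumed.
    pub fn available(&self) -> &[u8] {
        &self.buf[self.start..self.end]
    }

    /// Read more from the source without dropping unread bytes.
    /// Returns 0 at end of input or when the buffer is already full.
    pub fn fill_more(&mut self) -> io::Result<usize> {
        if self.start > 0 {
            // Slide the unread tail to the front to make room.
            self.buf.copy_within(self.start..self.end, 0);
            self.end -= self.start;
            self.start = 0;
        }
        let n = self.source.read(&mut self.buf[self.end..])?;
        self.end += n;
        Ok(n)
    }

    /// Mark `n` bytes as consumed (clamped to what is available).
    pub fn consume(&mut self, n: usize) {
        self.start = (self.start + n).min(self.end);
    }
}

/// Naive substring search; returns the offset of the first match.
pub fn find(haystack: &[u8], needle: &[u8]) -> Option<usize> {
    if needle.is_empty() || haystack.len() < needle.len() {
        return None;
    }
    haystack.windows(needle.len()).position(|w| w == needle)
}

/// Discard input up to and including the boundary.
/// Returns Ok(false) if the source ends before the boundary is seen.
pub fn skip_to_boundary<R: Read>(b: &mut BoundaryBuf<R>, boundary: &[u8]) -> io::Result<bool> {
    loop {
        if let Some(pos) = find(b.available(), boundary) {
            b.consume(pos + boundary.len());
            return Ok(true);
        }
        // Keep the last len-1 bytes: they may be the first half of a
        // boundary that is split across two reads.
        let keep = boundary.len().saturating_sub(1);
        let avail = b.available().len();
        if avail > keep {
            b.consume(avail - keep);
        }
        if b.fill_more()? == 0 {
            return Ok(false);
        }
    }
}
```

A plain BufRead cannot express `fill_more`, since its `fill_buf` only reads again once the buffer has been fully consumed; that is the gap the paragraph above refers to.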

I considered starting work on my own integration with rotor-http but I didn't think it was far enough along yet to want multipart/form-data support. I'd be willing to collaborate, however.

multipart's new client::lazy API should work pretty well already with rotor-http's client. For convenience, I could add a method to lazy::Multipart which writes the content-length (if applicable) and body to the request.

Shameless release announcement bump.

New documentation link