Route to DOM with wasm32-unknown-unknown

Like many rustaceans, I am really keen to improve the wasm story, so that my whole stack can be in rust. Specifically, I want to generate the necessary glue code to convert the ECMAScript DOM API into a rust API.

My strategy is:

  1. Write an IDL parser
  2. Write some codegen for the necessary javascript and extern rust
  3. Run these on the DOM and HTML5 IDL fragments to generate bindings
  4. Stick the result in cargo if possible
  5. \o/

So far, I've converted the IDL production rules into a PEG grammar. Next I need to create a data structure to hold the fragments, and then I'm on to the codegen.
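
To give a flavour, here's a minimal sketch of the peg-crate approach - a toy rule, not the real WebIDL grammar, and the crate version and macro syntax are assumed (check the peg docs for the current form):

// Cargo.toml: peg = "0.8" (assumed version)
peg::parser! {
    grammar webidl() for str {
        // optional whitespace
        rule ws() = quiet!{[' ' | '\t' | '\r' | '\n']*}

        // simple identifiers: letters, digits, underscores
        rule ident() -> &'input str
            = $(['a'..='z' | 'A'..='Z' | '_']['a'..='z' | 'A'..='Z' | '0'..='9' | '_']*)

        // e.g. "attribute DOMString nodeName;"
        rule attribute() -> (&'input str, &'input str)
            = ws() "attribute" ws() ty:ident() ws() name:ident() ws() ";"
              { (ty, name) }

        // e.g. "interface Node { ... };"
        pub rule interface() -> (&'input str, Vec<(&'input str, &'input str)>)
            = ws() "interface" ws() name:ident() ws() "{" attrs:attribute()* ws() "}" ws() ";" ws()
              { (name, attrs) }
    }
}

fn main() {
    let src = "interface Node { attribute DOMString nodeName; };";
    // Ok(("Node", [("DOMString", "nodeName")]))
    println!("{:?}", webidl::interface(src));
}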

It's quite a big job, so I'd probably benefit from some collaboration if anyone is interested. Also, please let me know if this is a stupid idea! Some of this functionality is relevant to browser code as well (e.g. Servo), so is there any prior work here?

8 Likes

I need help!

I'm currently creating my data structures for storing the parsed IDL. Here is a simplified version of my problem:

I have the following structures:

use std::collections::HashSet;

#[derive(...)]
enum MyType {
    Single(SingleType),
    Union(HashSet<SingleType>)
}

#[derive(Hash, ...)]
enum SingleType {
    Bool,
    Int,
    Float,
    // ...
    Sequence(Box<MyType>) // PROBLEM! MyType can contain a HashSet, which does not implement Hash
}

The problem is that a sequence can be of a union type, but Hash is not implemented for HashSet, so deriving Hash for SingleType will not compile.

The solutions I can think of off the top of my head are:

  1. Implement Hash manually (I would need to preserve the order of elements in the HashSet; in this case all my structs could implement Ord, although this doesn't really make semantic sense).
  2. Use some other data structure, like a Vec, and enforce uniqueness myself. This would complicate the types though.

Can anyone help? Here's a previous thread about implementing Hash for HashSet.

I think that thread of mine is the current best answer - use a BTreeSet and it will work out.
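
For completeness, a minimal sketch of what that looks like with the simplified types above: BTreeSet implements Hash and Ord whenever its element type does, so the derives go through.

use std::collections::BTreeSet;

#[derive(PartialEq, Eq, PartialOrd, Ord, Hash, Debug, Clone)]
enum MyType {
    Single(SingleType),
    // BTreeSet<SingleType> is Hash + Ord because SingleType is, so the derive compiles
    Union(BTreeSet<SingleType>),
}

#[derive(PartialEq, Eq, PartialOrd, Ord, Hash, Debug, Clone)]
enum SingleType {
    Bool,
    Int,
    Float,
    // Box<MyType> forwards Hash/Ord to MyType, so the recursion is fine too
    Sequence(Box<MyType>),
}

fn main() {
    let mut union = BTreeSet::new();
    union.insert(SingleType::Bool);
    union.insert(SingleType::Int);
    let ty = MyType::Union(union);
    println!("{:?}", ty);
}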

2 Likes

Webassembly runs in a browser, and it is faster than JavaScript. The combination of those two features is the only reason why anyone should consider using it for anything at all.

While you can write some gnarly glue code to get DOM to Webassembly, the result will generally be slightly slower code due to poor performance in the transitions between JavaScript and Webassembly.

Simply writing your DOM interaction code in JavaScript gives you faster runtime performance and full garbage collection. A piece of middleware like the one you suggest isn't actually going to solve any practical problems, yet it is potentially a great source of bugs.

It seems that an IDL parser already exists (webidl-rs) - I wish I'd seen it before I wrote one :S.

I want to be able to write my whole web-app in rust - if it were faster as well that would be a bonus, but I'm not too unhappy if it isn't. It means I can write code in a properly typed language, without having to worry about all the pitfalls of javascript, and I can share code between client and server (again, without having to use javascript).

There are many other reasons why writing client-side code in rust is superior to javascript. Here are a few:

  1. RAII means memory leaks are much harder to introduce. I would expect the DOM to be owned by the browser, handing out references to code, similar to Rc (see the sketch after this list). If you avoid storing these references they will just get deallocated as they go out of scope, and the browser will be able to manage the DOM efficiently.
  2. All the language warts in javascript. One null is bad enough, but js has two! (null and undefined).
  3. In javascript you can do very fancy things with closures. Unfortunately they can be difficult to implement efficiently. Here's an example (admittedly fixed).
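
A minimal sketch of the ownership model I have in mind for point 1. NodeHandle is a made-up type standing in for whatever handle the generated bindings would expose, and Drop stands in for telling the browser the reference is gone:

use std::rc::Rc;

// Hypothetical handle to a browser-owned DOM node.
struct NodeHandle {
    id: u32,
}

impl Drop for NodeHandle {
    fn drop(&mut self) {
        // In real bindings this would tell the browser to release the reference.
        println!("releasing DOM node {}", self.id);
    }
}

fn main() {
    let node = Rc::new(NodeHandle { id: 1 });
    {
        // Hand the reference around freely; the refcount tracks it.
        let extra = Rc::clone(&node);
        println!("refcount = {}", Rc::strong_count(&extra));
    } // `extra` goes out of scope here - nothing to remember to free
    println!("refcount = {}", Rc::strong_count(&node));
} // last Rc dropped: Drop runs and the browser could reclaim the node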

This is more opinion than total fact, so I'd like to hear arguments against (and for) what I'm saying.

PS: Generating bindings like this makes it easy to transition to some later direct API - you just change the codegen (assuming the API matches the IDL description somehow).

  1. You can't have DOM references in Webassembly, so you have to string something together with JavaScript holding them for you - rich opportunities for leaking. As for leaking in JavaScript, it practically doesn't happen; you'd have to be pretty deliberate about storing old references.
  2. Actually 3: you forgot a property explicitly defined as undefined, which is completely different from undefined.
  3. Yaaay, closures are great! Your reference hasn't got anything to do with closures; that is just a browser bug in the new scoping feature.

In general, I don't get why anyone would choose Rust's pseudo-garbage-collection over real garbage collection if speed is not important. If you don't like JavaScript, lots of other languages will both compile to JavaScript and run server code. Also, compiling to Webassembly instead of JavaScript is just another step of indirection when writing code that deals with the DOM.

@derekdreery, have you found stdweb yet? I think it's exactly what you're after. It just recently gained support for wasm32-unknown-unknown (before it required emscripten). It's using a slightly different approach - the bindings are written manually with the help of an inline js! macro, so many APIs will be missing at the moment, but it's rather trivial to add the ones you need.

Or perhaps you could try and integrate your work with IDL into stdweb, so we can write new bindings more effectively. That would be awesome.
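
To give an idea of what that looks like, here is a minimal sketch using stdweb's js! macro (crate version and exact API surface assumed - check the stdweb docs for the current form):

#[macro_use]
extern crate stdweb;

fn main() {
    stdweb::initialize();

    let message = "Hello from Rust!";
    // Inline JavaScript; @{} splices Rust values into the JS snippet.
    js! {
        console.log(@{message});
    }

    stdweb::event_loop();
}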

Regarding @NohatCoder's statement on performance:

Webassembly runs in a browser, and it is faster than JavaScript. The combination of those two features is the only reason why anyone should consider using it for anything at all.

I really don't think that performance is the only reason to use WebAssembly. Yes, it is true that accessing the DOM right now involves going through a JS FFI, but that is not always going to be the case. There is currently a proposal for GC support in WASM, which would enable direct access to web APIs. Once this is implemented, I imagine stdweb could transition to using the new direct interface under the hood, whilst keeping the same API for Rust developers.

Anyway, I personally agree with @derekdreery that the increased developer productivity from using a language such as Rust, with a strong and flexible type system (e.g. ADTs, traits, specialisation, etc.), can be worthwhile in itself, and that the ability to write both the client and server in the same language (like Node, except both sides would be Rust) is equally enticing.

Perhaps if I were writing a large app for a major company which had to be deployed and working in a month, I would stick to existing frameworks and technologies. But I see no reason to discourage innovation in this area, because I think the future of the Web will be one where JavaScript is just one option out of many (Kotlin, Rust, OCaml, Haskell, etc.).

4 Likes

I don't think that one is going to take off; there is no way patching a non-GC language to become GC isn't going to end in a complete clusterfuck. I think Webassembly should have been a no-exceptions garbage collected semi-high-level language in the first place, but there is no realistic way of getting there without starting from scratch.

I guess this is the main point of contention. As I see it, developer productivity is an area where Rust is weak: you pay in bureaucracy to get a good combination of speed and security. If you don't need the speed, Rust gives you nothing you couldn't get from a GC language.

The situation is a bit different today; originally the wasm team thought that you'd need GC support and then DOM bindings, but there is a newer proposal which doesn't require GC, and it is more likely to be the way we get DOM access.

2 Likes

It is hard to respond to this kind of blanket statement without more specific info, but if your point of reference is Javascript, there are certainly a lot more developer-productivity differences between Rust and JS than GC vs non-GC.

Javascript is pretty much the modern embodiment of the "just keep running, no matter how stupid the program behaviour gets" philosophy that was popular in the era of Perl and Bash. It has zero compile-time error checking, and near zero run-time error checking. It will gladly let the developers completely mess up the execution environment, and then present it as a feature (yay polyfills!). Its type system is so weak that crazy operations such as [] + {} are not only considered to make sense, but also to produce results of a type which neither of the original operands had. Even something as basic as the equality operator fails to pass the simplest mathematical tests (transitivity, anyone?), and when you finally find an operation which works and produces sensible results, chances are that the most popular web browsers will get it wrong anyhow as soon as you start testing it more broadly.

In contrast, Rust was built around a strong type system designed for maximal error checking, where you can detect a lot of problems at compile time, and a bunch more at run time. This does mean that you spend more time having a discussion with your compiler until it is convinced that the program looks correct, but it also means that when you have reached this stage, you have already proven a whole bunch of useful properties about your code's quality "for free". It's a different way to program, where the idea is not to get something running quickly and then spend months debugging it, but to get something running a bit more slowly and then enjoy the fact that it works a lot better right from the start.
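
As a concrete illustration, a toy example: where JavaScript silently hands you undefined, Rust makes the missing value part of the type, and the compiler refuses to let you ignore it.

// A lookup that may fail returns Option<T> instead of null/undefined.
fn find_user(id: u32) -> Option<String> {
    if id == 1 {
        Some("alice".to_string())
    } else {
        None
    }
}

fn main() {
    // Using the value without handling the None case is a compile error,
    // not a runtime "cannot read property of undefined".
    match find_user(42) {
        Some(name) => println!("found {}", name),
        None => println!("no such user"),
    }
}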

If I'm not wrong, what the OP is looking for is not just the "safety without GC" aspect of Rust, but also the more general package of a modern, strongly and statically typed language, with the productivity advantages that come with it. It could be a different language with similar properties compiling to WASM; it just happens that Rust has already gone pretty far down the WASM road.

4 Likes

I didn't specify a language because I didn't want to get into a discussion about otherwise unrelated languages, and the choice depends on preferences. But something that is at least safety-equivalent to Rust could be C#. From a language-design perspective the interesting question is: is there anything in the strict Rust data-pass-around rules that makes you more productive than you would be with no such restrictions and a garbage collector?

Rust vs C# is indeed a different story, where I think the two languages stand on a more equal footing from a productivity standpoint.

Between these two, from the point of view of productivity, I would choose Rust when I care about...

  • A better multi-threading story, which may become important once wasm gains threading support.
  • A better memory safety story in general with respect to more single-threaded shared mutability gotchas such as iterator invalidation.
  • Code with better locality properties, where your program doesn't become a pointer spaghetti monster where modifying data in one place can potentially change the behaviour of another piece of code that lies kilometers away.
  • Less magical error handling, where it is clearer which functions can error out and why.
  • A conceptually simpler and more carefully designed type system, with less subtly different ways of doing the same things all over the place, and more genuinely useful API design elements like traits and move semantics.
  • Easily getting my software to run on something that isn't Windows (though that may change with .Net Core; Mono was really not a good substitute for the real thing).

...and I would choose C# when I care about...

  • The large existing body of .Net code, if it can easily be used in a WASM setting.
  • An async/await story which is ready for production use today :slight_smile:
  • Inheritance-based OO, which from professional experience tends to be an incredible source of messy code, but remains nonetheless an industry-standard technique which many devs love.
  • In general, maximal "black magic" powers whenever I want my library to do crazy and wonderful things like introspecting everything at runtime, generating code on the fly and having configurable hooks on every single part of the API (properties, delegates...).

In general, I tend to find C# more flexible and Rust much easier to reason about, so it depends which of these two properties you care most about.

As an aside, the design of C# tends to be very tightly tied to .Net, and I am a bit curious about how usable it will remain in a more restricted environment like wasm. Previous experiments like Mono were certainly not very encouraging. But people are trying anyhow, so it seems that we will soon find out how well the two match :slight_smile:

1 Like

Yes, having unique ownership as a guarantee, rather than something you have to prove for every type, has been very productive for me.

I see the strict rules as a productivity benefit.

Of course C# compiled to WASM will also need a bunch of tacked-on JavaScript interaction to do anything with the DOM, so it isn't like that story is a lot better. What bothers me the most is all the indirection: instead of JavaScript talking to the DOM, you get Rust compiled to WASM, talking to JavaScript, talking to the DOM. Every step of the way you lose performance, increase the risk of bugs, and make it harder for the programmer to reason about what actually goes on.

Please just stop writing web frameworks. We have had them for years in JavaScript, and they always make things more complicated; I can't imagine that injecting a layer of WASM is going to make things better.

My experimentation has kinda stalled on this, because it seems the stdweb project is already doing what I'm interested in. I might finish the IDL parser so I can compare it to the servo one.

I'm very happy with this thread btw - it's thrown up loads of interesting stuff to read! Thanks all!

3 Likes

Have you seen webinden? I believe it has a similar goal, although it seems progress has stalled in the past month.

@derekdreery if you're still interested in this, I've also been working on a project called wasm-bindgen, which is intended to be a bundler-friendly solution to integrating Rust and JS, and I was hoping to eventually parse webidl and generate *-sys crates that can be used natively in Rust.

The other goal of wasm-bindgen is that it's intended to be forward-compatible with the host bindings proposal so as soon as that's available in browsers/rustc you'll be able to flip a switch and instantly have even faster DOM performance than JS does today (as wasm -> DOM methods will be faster in the various JS engines).
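
For a flavour of the programming model, here's roughly what a wasm-bindgen import/export pair looks like (the attribute syntax is still evolving, so treat this as a sketch rather than the final API):

use wasm_bindgen::prelude::*;

// Import a JavaScript function into Rust.
#[wasm_bindgen]
extern "C" {
    fn alert(s: &str);
}

// Export a Rust function so JavaScript (or a bundler) can call it.
#[wasm_bindgen]
pub fn greet(name: &str) {
    alert(&format!("Hello, {}!", name));
}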

Anyway, if you're curious, let me know!

5 Likes

Cool, I'd like to collaborate on this - happy to have work passed to me. If you want to use peg to parse the IDL, I can post my code (if I haven't done so already).

Some of my favorite links:

A very good tutorial

Implementing a light-weight Malloc for WASM

My project for stripping down the wasm file size (work in progress, planning to use wee_alloc):
https://github.com/frehberg/rust-wasm-strip

If you want to see how to execute Wasm files in a non-JavaScript environment, see:
https://github.com/pepyakin/wasmi/blob/master/src/tests/wasm.rs

Oh great! I don't personally have much experience with webidl, but my thinking is that the first iteration would be to pick a simple webidl file and try to generate a sys crate from it (using wasm-bindgen attributes).

From there I'd imagine it would be a solid base to start expanding and translating more and more idl files!