Safe Native Code - Midori


#1

An interesting and very long “experience report” with a safe language for system programming:

http://joeduffyblog.com/2015/12/19/safe-native-code/

I think the most interesting parts for future Rust improvement are:

- Bounds check elimination

- Overflow checking elimination

- Error model

Our model ended up being a hybrid of two things:
Fail-fast for programming bugs.
Typed exceptions for dynamically recoverable errors.

A nice accident of our model was that we could have compiled it with either return codes or exceptions. Thanks to this, we actually did the experiment, to see what the impact was to our system’s size and speed. The exceptions-based system ended up being roughly 7% smaller and 4% faster on some key benchmarks.

- Contracts

In the early days, we experimented with contract analyzers like MSR’s Clousot, to prove contracts. For compile-time reasons, however, we had to abandon this approach. It turns out compilers are already very good at doing simple constraint solving and propagation. So eventually we just modeled contracts as facts that the compiler knew about, and let it insert the checks wherever necessary.


EDIT: pcwalton says some interesting things on Hacker News:
https://news.ycombinator.com/item?id=10766436

bounds check elimination ends up nightmarish quickly. I think I prefer an approach that fixes the problem at its source, by just encouraging programmers to stop using for loops (which are bad for non-performance-related reasons as well). Note that all of the examples given use for loops; if they were rewritten using iterators, there wouldn’t be a problem. C# has iterators, so the machinery is there; they just need to make it more painful to use a for loop than the iterator alternative. :slight_smile:

My opinions:

  • The future system/fast Rust code in the wild will probably use some for/while loops, so adding some bounds check elimination will improve performance (but I agree that currently there are more urgent things to implement/improve in Rust, because it’s still young).
  • A smart language design (that takes a hint from Ada) should help the compiler eliminate several array bound tests without too much work for the compiler.
  • Iterators too have significant disadvantages. In non-release mode the iterators-heavy Rust code is very slow (sometimes 10-20 times slower than release code), and this is a significant problem if you want quick compilation times and “acceptable” performance. Iterators bloat the binary and ask LLVM back-end to remove tons of abstractions. So there are many cases where a simple run-of-the-mill abstraction-light for/while loop is better. I love iterators, and I use them quite often in non-performance-heavy code, but a system language needs to offer very good for loops on arrays.

#2

This isn’t something to add: LLVM is an industrial strength optimiser and removing bounds checks is something it can already do. E.g. this has no bounds checks:

#![crate_type = "lib"]
#[inline(never)]
fn print(x: u32) {
    println!("{}", x);
}
pub fn foo(x: &[u32]) {
    for i in 0..x.len() {
        print(x[i])
    }
}

Don’t miss opt-level = 1: it improves performance without the full compilation times of --release, especially with codegen-units set.


#3

I see, very good. In the following days/weeks I’ll try to find if there are common&basic array access patterns where LLVM is not doing well enough, and I’ll report them here.
(And there is also a language design improvement that can help LLVM on this, but I don’t know how much Rust designers want to do such thing).

Right. So far I have used mostly the level 0 and 3 (or 2 with Cargo --release), I have to try the level 1 too. But I think most of my argument still stands: iterators generate ton of stuff that later LLVM has to shovel away. Compilation of for loops is faster, requires less RAM, requires a simpler compiler, requires less tradeoffs between high run-time speed and high powered compilation. (And still I love to use lot of iterators in most of the small Rust programs I’ve written so far).

Thank you for the answer.


#4

NB. others have looked into this sort of stuff before, e.g.:

This is not a relevant point: every Rust compiler will need to support iterator-based for loops. And, in fact, for _ in a..b uses exactly the same machinery as for _ in v.iter(), i.e. they’re equally complicated. Also, for loops are built from 3 simple things:

  • function calls
  • loop
  • match

They can be trivially implemented as desugaring into that. Writing for pattern in iterator { body } is the same as writing:

let mut iter = IntoIterator::into_iter(iterator);
loop {
    match iter.next() {
        Some(pattern) => body,
        None => break,
    }
}

I think the first step to investigating this sort of thing would be getting some hard numbers about the compilation statistics you’re saying.


#5

Any thoughts on Midori’s support for asynchrony i.e. async/await? See:

http://joeduffyblog.com/2015/11/19/asynchronous-everything/