New Blog Post: Compile-time Safety is for Everybody!


#1

https://www.tockos.org/blog/2017/apsys-paper/

We (the Tock developers) just submitted the camera-ready version for a short paper on building kernels in Rust. It’s a sort of follow-up to our “Ownership is Theft” paper from a couple years ago. The blog post has a bit more informal overview of what the challenges were building Tock in Rust early in on, what lead to the original paper, and how interior mutability got us out of trouble in the end.


#2

From the blog post:

It reported on challenges we ran into early on building Tock in Rust and, maybe as a result, many took it as a warning against using Rust for kernel development. It may also have been the result of the incendiary title. No regrets!<

Sometimes it’s good to have some regrets in life :slight_smile:

but also imperitive for preserving type-safety

Typo.

// Rust
enum NumOrPointer {
  Num(u32),
  Pointer(&mut u32)
}

// Equivalent C
union NumOrPointer {
  uint32_t Num;
  uint32_t *Pointer;
};

Nope, the equivalence is more like:

// Rust #1
enum NumOrPointer {
    Num(u32),
    Pointer(&mut u32)
}

// Equivalent C #1
enum NumOrPointerTag { Num, Pointer };
union NumOrPointer {
    enum NumOrPointerTag tag;
    uint32_t Num;
    uint32_t *Pointer;
};

And:

// Rust #2
#![feature(untagged_unions)]
#[repr(C)]
union NumOrPointer<'a> {
    num: u32,
    pointer: &'a mut u32,
}

// Equivalent C #2
union NumOrPointer {
    uint32_t Num;
    uint32_t *Pointer;
};

#3

Yeah, in the first equivalence it’s impossible to discriminate between the variants. But why, in your version, is the enum part of the union? I think it should all be wrapped in a struct, consisting of the enum and the union, like this:

enum kind { num, ptr };

struct numorptr {
  enum kind kind;
  union {
    int num;
    int *ptr;
  };
};

C APIs have a lot of this annoying boilerplate, and unfortunately not all of them are as macro-happy as Apple is, so it got really tedious. Edit: Except it was even worse because they didn’t have C11 anonymous unions, I think.


#4

In Tock we use these [interior mutability types] ubiquitously

That’s a quote from the blog. I’m curious on hearing thoughts about this. Specifically, these bypass static borrow checks (by design). That cancels out one of the main strengths of Rust. I understand that you’ll get panics (or aborts) instead of corruption (as in C), which is certainly better, but it seems that having to resort to these “ubiquitously” is a sign of some problem (in Rust or Tock or …).

Any thoughts on that from Tock developers or the community as a whole?


#5

You’re right of course :slight_smile: If I write C code and I don’t compile it, it usually contains silly mistakes because my C is rusty and I didn’t think enough about that C code.


#6

Only RefCell and Mutex bypass static borrow checks, and we actually use neither of those in Tock (though we have similar types that we do use very sparingly). We
use mostly use Cell which I think of as basically encapsulating specific cases where shared mutability is perfectly safe. I don’t think there is a “cost” to using Cell from static checking perspective.

I’m not sure I would characterize it as a “problem” so much as a mismatch in software architecture. It’s perfectly possible to build a whole Rust ecosystem that uses unique (mutable) references sparingly. The fact that the existing ecosystem doesn’t is probably evidence that in most cases the architecture we used for Tock isn’t worth it. We’re dealing with resource constraints that are many orders of magnitude more restrictive than most systems, so I think it’s understandable if the trade-offs are just different for us vs. the vast majority of Rust code.

I think this is fundamental issues (i.e. theoretical limits) with memory management in a type-safe context. A footnote (number 4) in the blog post notes that it may be possible to allow shared mutability more liberally if enum-like types (in the PL literature, they are called “existential types” for some reason) were removed from the language, but that’s actually an incredibly useful feature.

On the Tock side, in many systems, you would avoid this problem by using an event loop or message passing or some other way of avoiding the circular references all-together. We would have run into similar issues in any other category of type-safe language: we would have had to work around Haskell’s purity (probably by using IO-like Monads ubiquitously), Java/C#'s dynamic typing (I’m not sure how).

Basically, our previous paper (Ownership is Theft) argued that it would make sense to change the language to support cases where shared mutability was fine. We no longer believe it would be worth it after understanding how intertwined it is with enum types (and we never even convinced ourselves that our proposed changes were actually feasible, it was just a work-in-progress paper).


#7

Ah, that wasn’t clear to me from the blog (maybe I missed it). Yeah, Cell is totally fine because there’s no ownership involved - it’s just moving values in and out. The impression I got from reading is you use RefCell as well (Mutex isn’t really a concern from the static check perspective for me).


#8

That’s a good point, the post doesn’t really go into that much.

We use two types that are similar (in semantics, but basically skip a computation instead of panicing) to RefCell (called MapCell and TakeCell). We try to use them extremely sparingly because of exactly the reason you’re bringing up: they don’t let us reason about interactions at compile time. So far the only remaining cases are basically for storing statically allocated buffers that need to be passed around. Normally, I think you would just use an Option for these, so the dynamic borrow checking seems kind of OK.


#9

Makes sense. I’ve not read the paper, so perhaps this is explained there, but I would encourage you to mention these bits in the blog. I think people seeing “ubiquitous use of interior mutability types” may get the same worried impression as me :slight_smile:.