Unsafe Rust - how do I inform the compiler of side effects?

I'm dabbling a bit in unsafe Rust with raw pointers and encountered a situation where the compiler doesn't understand that a function I wrote changes global memory through raw pointers. Instead, the compiler assumes the usual Rust semantics of non-aliased pointers and creates code accordingly, which results in an infinite loop.

How do I tell the compiler that a function or unsafe block has side effects on memory via raw pointers?

E.g., in gcc I would say:

asm volatile("" ::: "memory");

How do I say this in Rust?

Can you give more details on what the compiler assumes is unchanged?

Sure. Playground

In the loop:

while !l.is_empty() {
   l.pop_back()
}

where:

    fn is_empty(&mut self) -> bool {
        self.begin() == self.end()
    }

and

    fn pop_back(&mut self) -> *mut ListElem
    {
        unsafe {
            let back = self.back();
            (*back).remove();
            back
        }
    }

the compiler doesn't understand that the unsafe block in remove changes what's returned by .begin() and .end() in is_empty(). As such, it's emitting an infinite loop.
It's probably correct as far as Rust semantics is concerned (unless this is a compiler bug.)

1 Like

There's probably UB out there. I don't see it in this code, but it's hard to tell without the code for back(), begin(), end(), and remove(). You can run it in Miri, or give more context.

Also note that is_empty() should probably take shared &self.

Possibly. It's a straight port from C.
valgrind accepts it, though. The full code is in the playground.

Didn't see it, sorry.

valgrind does not detect all sorts of UB, and it's better for C. Miri finds most of Rust's UBs, and it does detect (at least) one here.

3 Likes

You can run Miri in the playground under Tools, and it does find UB in your example.

It is of course difficult to argue with UB, but the UB Miri shows is possibly unrelated or not directly related (it's at least inside a different function). The infinite loop that's generated for the while loop is

0x000055555556b350 <+464>:  mov    %rcx,0x8(%rax)
0x000055555556b354 <+468>:  mov    %rax,(%rcx)
0x000055555556b357 <+471>:  cmp    %rbp,0x8(%rbx)
0x000055555556b35b <+475>:  jne    0x55555556b350 <_ZN4test5bench13ns_iter_inner17h544258e6745407f9E+464>

That said, should this work? Should Rust assume that raw pointers/unsafe blocks write to memory?

And also:

error: Undefined Behavior: not granting access to tag <3346> because incompatible item is protected: [Unique for <3341> (call 1033)]
   --> src/main.rs:13:5
    |
13  | /     fn insert_before(&mut self, elem: *mut ListElem) {
14  | |         unsafe {
15  | |             (*elem).prev = self.prev;
16  | |             (*elem).next = self;
...   |
19  | |         }
20  | |     }
    | |_____^ not granting access to tag <3346> because incompatible item is protected: [Unique for <3341> (call 1033)]
    |
    = help: this indicates a potential bug in the program: it performed an invalid operation, but the rules it violated are still experimental
    = help: see https://github.com/rust-lang/unsafe-code-guidelines/blob/master/wip/stacked-borrows.md for further information

what does that mean?

UB should never, ever exist. The argument "it does not cause the problem" should not be said, and also many times incorrect.

I haven't dug into the code yet, but one thing to say is: never translate C code directly to Rust code. C has different safety requirements from Rust. The most obvious example is unique mutable borrows, but it's not the only one.

4 Likes

Yes, if they're allowed to.

1 Like

The problem in your code is that begin takes &self, so the *mut ListElem pointer it returns is not allowed to mutate the item. Changing it to &mut self fixes the Miri complaints.

5 Likes

Would you mind sharing a playground? When I make these changes, I still get the same Miri complaint: changed playground

Yes. But Rust also assumes that &mut references are truly unique. Any time you take &mut borrow, you're asserting that this borrow can be used as exclusive borrow. You can turn &mut to raw pointer, but any time you use &mut again, compiler is assuming that raw pointers pointing into the same memory as &mut won't be used again. And from the quick skim of your code, looks like you're interleaving &mut and *mut accesses.

I think to understand those errors from MIRI, you should familiarize yourself with "stacked borrows memory model", try googling it. Eg. this talk.


My quick idea on how to get rid of UB here would be to:

  • mark every method as taking &self, not &mut self (as you can't guarantee uniqueness for the self),
  • change struct definition to contain UnsafeCell<*mut ListElem> instead of *mut ListElem. This is a way to tell Rust that a field don't go through usual Rust exclusivity rules and arbitrary pointers may point into that memory.

(This is just a quick guess though, not sure if it'll work and if UnsafeCell is truly necessary here)

4 Likes

I'm strongly against "quick changes" to unsafe code to see if it does not cause UB. Even passing Miri is not a guarantee. At most, it assures you it does not contain UB in this test. Unsafe code should be deeply analyzed, or better, not written at all (not by non-experts, at least).

3 Likes

Ah, sorry, I ran the wrong command

From skimming through the source code, I've noticed three problems (there may be more, but even fixing them will require somewhat deep changes, so I didn't bother):

  1. As was said by others here, you derive mutable references from shared references, which is an immediate UB. To fix that, either make all borrows mutable, which will make the list significantly less conformable to use, or use UnsafeCell, which will require more unsafe code, and thus more analysis (and perhaps more UB).
  2. ListElement is Unpin while it should not be (moving it invalidates the pointers). You need to add a PhantomPinned, which creates cascaded error you should fix with more unsafe code (get_unchecked_mut()). Which in turn means more analysis etc. etc..
  3. In main(), you convert a *mut ListElem into a *mut IListElement, which is just wrong, because it's not the real type. Casting pointers to different type then dereferencing them is allowed in either of the following scenarios:
    1. This was the original type, for example: &mut ilist_element as *mut IListElement as *mut ListElem as *mut IListElement, or
    2. The data in the first type is valid for the second type (including alignment, padding, size, and everything). Neither of these applies here.

My recommendation to you is: don't try to implement a doubly linked list in Rust. Not even as an exercise. It may be a good one in C, not in Rust, unless you want exercise to unsafe code. And if you decided to do, don't translate a C implementation. Rust is not C. Looking at the implementation of std::collections::LinkedList may be useful.

12 Likes

In case you haven't seen it, here's a deep dive into these challenges:
Learn Rust With Entirely Too Many Linked Lists

4 Likes

It does not cover unsafe doubly linked lists (yet): An Ok Unsafe Deque - Learning Rust With Entirely Too Many Linked Lists (rust-unofficial.github.io).