Mapping of Undefined Behavior to Source Code in Rust

I know that some undefined behaviors (UBs) are listed in "Behavior considered undefined" in The Rust Reference. But I don't know exactly which unsafe Rust code produces which UB; I mean a mapping of each behavior to a concrete scenario. For example,

fn x(r: &i32) {
    unsafe {
        *(r as *const _ as *mut _) += 1;
    }
}

For the above code, Clippy suggests changing it to use an interior mutability type:

use std::cell::UnsafeCell;

fn x(r: &UnsafeCell<i32>) {
    unsafe {
        *r.get() += 1;
    }
}
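As a point of comparison, here is a minimal sketch of the two sound ways to mutate through a shared reference. The function names `bump_unsafe_cell` and `bump_cell` are made up for illustration; `Cell` is the safe wrapper that the standard library builds on top of `UnsafeCell`.

```rust
use std::cell::{Cell, UnsafeCell};

// Hypothetical helper: mutation through `&UnsafeCell<i32>` is allowed,
// but the aliasing proof is still the programmer's job.
fn bump_unsafe_cell(r: &UnsafeCell<i32>) {
    // SAFETY: no reference to the inner value exists while we write
    // through the raw pointer returned by `get()`.
    unsafe {
        *r.get() += 1;
    }
}

// Hypothetical helper: the same effect with no `unsafe` at all.
fn bump_cell(r: &Cell<i32>) {
    r.set(r.get() + 1);
}

fn main() {
    let a = UnsafeCell::new(41);
    bump_unsafe_cell(&a);
    assert_eq!(a.into_inner(), 42);

    let b = Cell::new(41);
    bump_cell(&b);
    assert_eq!(b.get(), 42);
}
```

By contrast, writing through a pointer derived from a plain `&i32`, as in the first snippet, falls under "mutating immutable data" in the Reference's list.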

Can anyone help with mapping UBs to exact code scenarios?

You can put code in a code fence like so:

```rust
Your code here
```

I've updated my question. Sorry, I am new to this community, so I didn't know. Next time I will.

No problem


Rust does not currently have an exhaustive list of all sources of UB. One thing you can do to find UB in your code is to run it under Miri.

Miri should find the biggest sources of UB.

There is currently an effort to document all sources of UB in the Rust Unsafe Code Guidelines Working Group; the work is being done here
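To make the Miri suggestion concrete, here is a small sketch of a program one might run under it. It assumes a nightly toolchain with the Miri component installed; the commands in the comments are the usual invocation, and the exact diagnostic Miri prints is deliberately not shown. The helper name `read_at` is hypothetical.

```rust
// Assumed setup (nightly toolchain):
//   rustup +nightly component add miri
//   cargo +nightly miri run

/// Hypothetical helper: reads `v[i]` through a raw pointer.
///
/// # Safety
///
/// `i` must be less than `v.len()`.
unsafe fn read_at(v: &[i32], i: usize) -> i32 {
    // SAFETY: the caller guarantees `i < v.len()`.
    unsafe { *v.as_ptr().add(i) }
}

fn main() {
    let v = vec![1, 2, 3];

    // In bounds: sound, and Miri should accept it.
    let second = unsafe { read_at(&v, 1) };
    assert_eq!(second, 2);

    // Out of bounds: uncommenting the next line makes the program UB.
    // A normal run may appear to work; Miri is expected to report an error.
    // let _oob = unsafe { read_at(&v, 3) };
}
```

Note that Miri only checks the code paths that actually execute, so it complements a list of UB sources rather than replacing one.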


UB is a compiler-writer concept that facilitates optimization. It arises in the gray areas where the compiler does not guarantee to generate code corresponding to what the author wrote. In other words, for certain code situations the compiler claims to produce correct code (and if it has no bugs, does so). But other code situations are declared "out of bounds", and for those the compiler makes no guarantee. Double free, use after free, simultaneously accessing something as both mutable and immutable, permitting race conditions between threads, etc., are all cases where the compiler makes no guarantee of correctness.

Safe Rust precludes all of these problems because the compiler proves to itself that the hazards do not occur, or declares an error and aborts compilation. unsafe Rust permits the program author to assume part of the proof obligation, but does not make the above programming errors permissible. Where such errors occur, the compiler optimization passes are free to scramble the program unrecognizably. That is the price of aggressive optimization, such as is found in Rust as well as other languages that use the LLVM backend.


I am aware of that; I was just pointing out some efforts to document sources of UB (such as the Unsafe Working Group) and ways to find UB in your code (such as Miri). Even if these are still not exhaustive lists of UB, if there could even be such a list, they are better than no information about UB. In no way did I state that the example code in the original question was correct; I didn't even mention it in my post.

Nor was I trying to imply that you were incorrect. The problem with attempting to define UB is that it is a negative definition; it is all those areas where the compiler is free to "mis-optimize" (in some sense) the program as written by the author. The list of such mis-optimizations will continue to grow as compiler optimization techniques improve. Thus I consider it relatively pointless to attempt to delimit the sources and consequences of UB.

In summary, write correct programs or expect that someone will suffer the consequences. Unlike most other languages, Rust makes it difficult to write incorrect programs, provided that the program author does not attempt to use unsafe as a "Get out of jail free" card (which, for non-US readers, is a reference to the 1930s-era game Monopoly).


Can you explain this sentence further? I was thinking that the compiler permits it and turns it into a runtime error or panic. But as I am a newbie, I have no idea of the actual scenario.
Thanks for your time.

unsafe is not intended as a way of disabling the rules that the compiler enforces - it is a way of telling the compiler that you are going to enforce them manually. Even if you write unsafe code, the compiler is still expecting you to follow those rules, and if you break them, there is no guarantee that the compiler will output what you're expecting.
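A short sketch may help with the "runtime error or panic" expectation. Checked operations have defined outcomes (`None` or a panic); the unchecked variant has none, so the check has to come from the programmer. The function names here are invented for this illustration.

```rust
/// Checked access: an out-of-range index is a defined result (`None`).
fn checked(slice: &[i32], i: usize) -> Option<i32> {
    slice.get(i).copied()
}

/// Unchecked access behind a manual check: the rule is still enforced,
/// just by the programmer instead of by the library.
fn manually_checked(slice: &[i32], i: usize) -> Option<i32> {
    if i < slice.len() {
        // SAFETY: `i < slice.len()` was verified on the line above.
        Some(unsafe { *slice.get_unchecked(i) })
    } else {
        None
    }
}

fn main() {
    let data = [10, 20, 30];
    assert_eq!(checked(&data, 1), Some(20));
    assert_eq!(checked(&data, 7), None);
    assert_eq!(manually_checked(&data, 2), Some(30));
    assert_eq!(manually_checked(&data, 7), None);
    // Indexing with a runtime index of 7 would panic: also defined behavior.
    // Calling `get_unchecked(7)` directly would be UB, not a panic.
}
```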


You can perfectly well write (code presenting) Undefined Behavior in Rust.

So, the idea is to have a subset of the language with which, ideally, (code presenting) Undefined Behavior cannot be written: this is what non-unsafe Rust is for.

This, by the way, has usually required (using) a memory managed language (dynamically checking, for instance, each pointer dereference). The novelty of Rust is bringing properties / invariants such as the validity of a pointer to the type level (e.g., &T references). This way, checking these invariants "boils down" to type checking.

Sadly, not all correct programs can be statically proven so by Rust (or any other language). In other words, there exist valid programs that cannot be written without unsafe.

And that's precisely the point of the unsafe keyword: it grants the programmer more freedom / expressibility in order for them to be able to write down such valid programs. But since these programs can no longer be proven (to be) correct by Rust, it is up to the programmer to "prove" its correctness.

Example

You cannot write

fn get_fast<T> (slice: &'_ [T], index: usize) -> &'_ T
{
    unsafe {
        // safety??
        slice.get_unchecked(index)
    }
}

because then, for instance, get_fast(&[], 0) would be Undefined Behavior from within non-unsafe Rust code!!

The correct implementation of get_fast would be

/// # Safety
///
/// `slice.len()` must be greater than `index`
unsafe // Since the function cannot take all inputs "safely", it must be marked `unsafe`
fn get_fast<T> (slice: &'_ [T], index: usize) -> &'_ T
{
    slice.get_unchecked(index)
}

If someone then wants to use your function (potentially triggering UB with it), they must do so from within an unsafe block, meaning it is up to them to prove the call is right. For instance,

Correct unsafe

// Since this function is valid for all inputs, it can be marked non-`unsafe`
fn get<T> (array: &'_ [T; 256], index: u8) -> &'_ T
{
    // since for all `index: u8`, `0 <= index < 256`, this indexing operation cannot be invalid.
    // I thus wish to express an unchecked indexing operation.
    // Such operation does not exist within non-unsafe Rust (how could it possibly exist?),
    // I must thus use `unsafe`:
    unsafe {
        // safety (proof obligation):
        // u8 is unsigned, thus index >= 0
        // for all `x: u8`, `x <= core::u8::MAX`, and `core::u8::MAX = 255 < 256`
        // thus `0 <= index < 256`; where `256 <= array.len()`
        get_fast(array, usize::from(index))
    }
}
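Putting the two functions together gives a self-contained sketch that can be compiled and run. The `main` body and the contents of `table` are added purely for illustration and are not part of the example above.

```rust
/// # Safety
///
/// `slice.len()` must be greater than `index`
unsafe fn get_fast<T>(slice: &[T], index: usize) -> &T {
    // SAFETY: the caller guarantees `index < slice.len()`.
    unsafe { slice.get_unchecked(index) }
}

// Valid for all inputs, so it does not need to be marked `unsafe`.
fn get<T>(array: &[T; 256], index: u8) -> &T {
    // SAFETY: `index <= u8::MAX = 255 < 256 = array.len()`.
    unsafe { get_fast(array, usize::from(index)) }
}

fn main() {
    // Illustrative data: table[i] == i * 2.
    let mut table = [0u32; 256];
    for (i, slot) in table.iter_mut().enumerate() {
        *slot = i as u32 * 2;
    }

    // Every possible `u8` index is in bounds, so no call to `get` can be UB.
    assert_eq!(*get(&table, 0), 0);
    assert_eq!(*get(&table, 255), 510);
}
```

The `unsafe` keyword appears only where a proof is owed: on `get_fast`, whose contract the caller must uphold, and inside `get`, where the `u8` type discharges that contract.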

Obligatory watching when talking about Undefined Behavior:

