Marker: unsafe or not unsafe

Consider the following situation.

Person A designs a marker trait for an iterator that always returns slices of the same length:

/// Iterates over slices of the same length
trait SameSliceIter<'a, T: 'a>: Iterator<Item = &'a [T]> {}

Person B uses the guarantee provided by this trait to write some unsafe code (possibly unnecessarily unsafe, but nevertheless sound if the guarantee holds):

use std::ops::Add;

fn sum_last<'a, T>(mut iter: impl SameSliceIter<'a, T>) -> Option<T>
where
    T: Copy + Add<Output = T> + 'a,
{
    let first_slice = iter.next()?;
    // Index of the last element; `None` if the first slice is empty.
    let last_index = first_slice.len().checked_sub(1)?;
    // Relies on every later slice having the same length as the first.
    let sum = unsafe {
        iter.fold(*first_slice.get_unchecked(last_index), |sum, t| {
            sum + *t.get_unchecked(last_index)
        })
    };
    Some(sum)
}

Person B makes an erroneous but safe implementation of this trait:

struct Yolo<'a, T>(&'a [T]);

impl<'a, T> Iterator for Yolo<'a, T> {
    type Item = &'a [T];

    fn next(&mut self) -> Option<Self::Item> {
        use rand::{thread_rng, Rng};
        let mut rng = thread_rng();
        let len = rng.gen_range(0..(self.0.len()));
        Some(&self.0[0..len])
    }
}

impl<'a, T> SameSliceIter<'a, T> for Yolo<'a, T> {}

As a result, combining the erroneous safe code with the non-erroneous unsafe code results in undefined behavior:

fn main() {
    let vec = vec![0; 1 << 20];
    let yolo = Yolo(&vec);
    dbg!(sum_last(yolo));
}

Output:

unsafe precondition(s) violated: slice::get_unchecked requires that the index is within the slice

Now the question is: what went wrong?
I would guess that in this particular situation either the trait SameSliceIter should have been made unsafe, or Person B should not have trusted all trait implementations to be correct. In which case it follows: unsafe code should not trust trait invariants unless the trait is unsafe. Going further, which invariants should unsafe code trust? E.g. should one use (time::Date.day() - 1) as an unchecked index into [&str; 31]?

This raises a bigger problem.
Unsafe code is subject to greater scrutiny than safe code and requires significantly more effort to write. If unsafe code relies on an invariant guaranteed by safe code, it shifts the responsibility for correctness onto that safe code, which does not necessarily meet the quality standards that unsafe code ideally should. This makes the safety of unsafe code depend not only on the correctness of the code itself, but also on that of every piece of safe code it relies upon.

In this case sum_last is incorrect. The trait is not unsafe, so implementers could violate the requirements without that being allowed to result in UB. Either change sum_last or make the trait unsafe.
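
A minimal sketch of the "change sum_last" option, assuming the trait stays safe. The names sum_last_checked and Liar are made up for illustration, and using ChunksExact as the honest implementation is my own choice. With checked indexing, a wrong implementation can at worst produce None, never UB:

```rust
use std::ops::Add;

/// Iterates over slices of the same length.
/// Safe trait: a correctness promise only, unsafe code must not rely on it.
trait SameSliceIter<'a, T: 'a>: Iterator<Item = &'a [T]> {}

/// Honest implementation: `chunks_exact` yields slices of exactly
/// `chunk_size` elements and omits any shorter remainder.
impl<'a, T: 'a> SameSliceIter<'a, T> for std::slice::ChunksExact<'a, T> {}

/// Hypothetical buggy implementation: every slice is one element shorter
/// than the previous one.
struct Liar<'a, T>(&'a [T]);

impl<'a, T> Iterator for Liar<'a, T> {
    type Item = &'a [T];

    fn next(&mut self) -> Option<Self::Item> {
        if self.0.is_empty() {
            return None;
        }
        let current = self.0;
        self.0 = &current[..current.len() - 1];
        Some(current)
    }
}

impl<'a, T: 'a> SameSliceIter<'a, T> for Liar<'a, T> {}

/// Sums the element at the first slice's last index across all slices,
/// using checked indexing only. Returns `None` if the iterator is empty,
/// the first slice is empty, or a later slice is too short.
fn sum_last_checked<'a, T>(mut iter: impl SameSliceIter<'a, T>) -> Option<T>
where
    T: Copy + Add<Output = T> + 'a,
{
    let first_slice = iter.next()?;
    let last_index = first_slice.len().checked_sub(1)?;
    let mut sum = *first_slice.get(last_index)?;
    for slice in iter {
        sum = sum + *slice.get(last_index)?;
    }
    Some(sum)
}
```

The price is one bounds check per slice; whether that matters is a question for a profiler, not for the soundness argument.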

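Yes. You can see this distinction in the standard library’s iterator traits: ExactSizeIterator is a safe trait, so unsafe code must not rely on its len() being correct, whereas the unstable TrustedLen is an unsafe trait whose implementors do guarantee a correct size_hint().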

In which case it follows: unsafe code should not trust trait invariants, unless the trait is unsafe. Going further, which invariants should unsafe code trust?

Unsafe code may only rely on properties that the unsafe trait’s safety documentation says implementors must have.
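
As a sketch of what that looks like for the trait in this thread (TrustedSameSliceIter is a hypothetical name, and the wording of the safety section is my assumption about what the author would want to promise): the contract lives in the trait's `# Safety` docs, every implementation has to say `unsafe impl`, and the unsafe block cites the contract.

```rust
use std::ops::Add;

/// Iterates over slices of the same length.
///
/// # Safety
///
/// Every slice returned by `next` must have the same length as the first
/// slice returned. Unsafe code is allowed to rely on this.
unsafe trait TrustedSameSliceIter<'a, T: 'a>: Iterator<Item = &'a [T]> {}

// SAFETY: `ChunksExact` only yields slices of exactly `chunk_size` elements;
// a shorter remainder is never yielded.
unsafe impl<'a, T: 'a> TrustedSameSliceIter<'a, T> for std::slice::ChunksExact<'a, T> {}

fn sum_last<'a, T>(mut iter: impl TrustedSameSliceIter<'a, T>) -> Option<T>
where
    T: Copy + Add<Output = T> + 'a,
{
    let first_slice = iter.next()?;
    let last_index = first_slice.len().checked_sub(1)?;
    let mut sum = first_slice[last_index];
    for slice in iter {
        // SAFETY: by the trait's safety contract, `slice.len()` equals
        // `first_slice.len()`, so `last_index` is in bounds.
        sum = sum + unsafe { *slice.get_unchecked(last_index) };
    }
    Some(sum)
}
```

An implementation like Yolo would now have to write `unsafe impl`, which puts the blame for any resulting UB on that impl rather than on sum_last.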

(But if the unsafe code is using a specific implementation of the trait, then it can use that knowledge too, just like any other non-arbitrary function call. Nothing wrong with, for example, unsafe code relying on <std::slice::Iter<T> as ExactSizeIterator>::len() to return a correct answer, because that’s a specific function from the standard library, not a caller-provided implementation.)
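
A small sketch of that difference; both function names are made up for illustration. The first function names the concrete std type, so it may trust len(). The second accepts any caller-provided ExactSizeIterator and therefore has to keep the bounds check.

```rust
/// Returns the last element of `slice`, using the length reported by
/// `std::slice::Iter` as the basis for an unchecked index.
fn last_via_iter_len(slice: &[u32]) -> Option<u32> {
    let iter: std::slice::Iter<'_, u32> = slice.iter();
    let last_index = iter.len().checked_sub(1)?;
    // SAFETY: `iter` is specifically `std::slice::Iter` over `slice` and has
    // not been advanced, so `iter.len() == slice.len()` and `last_index` is
    // in bounds.
    Some(unsafe { *slice.get_unchecked(last_index) })
}

/// Generic version: `I` is chosen by the caller and `ExactSizeIterator` is
/// a safe trait, so `len()` may be wrong or unrelated to `slice`.
/// Checked indexing is required here.
fn last_via_generic_len<I: ExactSizeIterator>(iter: I, slice: &[u32]) -> Option<u32> {
    let last_index = iter.len().checked_sub(1)?;
    slice.get(last_index).copied()
}
```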

What about unsafe code that relies upon an invariant provided by a non-arbitrary call from a random crate? I can probably trust std as much as I can trust rustc. But I can't say the same about the other crates.

Let's assume that the crate in question contains only safe code. One can write unsafe code that relies upon an invariant provided by non-arbitrary code in that crate. If the safe code in the crate turns out to violate the invariant, the result may be undefined behavior.

Rust is designed so that mistakes in safe code cannot violate soundness, while mistakes in unsafe code can. If unsafe code relies upon invariants in safe code, then that safe code must not contain mistakes, or otherwise soundness may be violated. I.e. the safe code effectively becomes unsafe.

Is it correct that writing unsafe code is not only a matter of formal reasoning, but also of a heuristic assessment of "trustworthiness" of the code relied upon?

I guess unsafe code in this case can rely on explicitly guaranteed things, and nothing more. That is, one must not trust the current behavior itself, if it is not documented to always be this way.

The problem with relying on safe traits is not that you're relying on safe code per se. It's perfectly fine to call some specific function from a dependency implemented without unsafe, and then to rely in unsafe code on the postconditions promised by that dependency. As a downstream user of your library, there's no way for me to use that to trigger unsoundness. Your API is safe. (Assuming no bugs in the dep, but that's a separate issue.) However, with a safe trait, downstream users of your library can write safe code that ends up triggering UB. So the API is not safe. Whether the user can misuse the API is the deciding factor here.
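
To make this concrete with the day-of-month question from the first post, here is a rough sketch. The `dep` module is a hypothetical stand-in for a safe-only dependency (it is not the real `time` crate), and `ordinal` is an invented function. Downstream users can only hand `ordinal` a value that `dep` itself constructed, so no safe caller code can break the index:

```rust
/// Hypothetical stand-in for a dependency written entirely in safe code.
mod dep {
    pub struct Date {
        day: u8, // private: only `new` can construct a `Date`
    }

    impl Date {
        /// Returns `None` unless `day` is in `1..=31`.
        pub fn new(day: u8) -> Option<Date> {
            if (1..=31).contains(&day) {
                Some(Date { day })
            } else {
                None
            }
        }

        /// Documented postcondition: the result is in `1..=31`.
        pub fn day(&self) -> u8 {
            self.day
        }
    }
}

static ORDINALS: [&str; 31] = [
    "1st", "2nd", "3rd", "4th", "5th", "6th", "7th", "8th", "9th", "10th",
    "11th", "12th", "13th", "14th", "15th", "16th", "17th", "18th", "19th", "20th",
    "21st", "22nd", "23rd", "24th", "25th", "26th", "27th", "28th", "29th", "30th",
    "31st",
];

/// Safe public API: callers cannot supply their own `day()` implementation.
pub fn ordinal(date: &dep::Date) -> &'static str {
    let index = usize::from(date.day() - 1);
    // SAFETY: relies on the documented postcondition of `dep::Date::day`
    // (result in `1..=31`), so `index` is in `0..=30`. A bug in `dep`
    // would make this unsound, but no downstream safe code can.
    unsafe { *ORDINALS.get_unchecked(index) }
}
```

If `ordinal` instead took `impl HasDay` for some safe trait `HasDay`, any downstream crate could return 0 or 200 from its own implementation, and the same unsafe block would be unsound.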

Imagine you're writing code that really, really needs to have some properties that are not soundness. For example, maybe you're writing a control system for a piece of expensive equipment that will destroy itself if not managed correctly. In that case, you would want to be very careful about the correctness of the code that your code relies on. But, you might still choose to use carefully chosen, well-regarded libraries over writing everything from scratch.

I think that unsafe code relying on libraries is in a similar position: preserving soundness is extremely important — it’s the one thing the code really must do — but that doesn't mean that the only option, or even the best option, is to write everything from scratch.

All else being equal, unsafe {} blocks should keep the set of things they rely on small. But it would be foolish to say “use no library functions; inline all the logic” because that leads to unsafe blocks that are large and unnecessarily hard to review. We create functions, modules, and libraries in order to divide problems into chunks that are small enough to understand and review, to increase the chance that the program will be correct. This doesn’t change just because some of the code includes unsafe {} blocks.

That's a nice way to think about marker traits.

You're right. My original question has been answered. I now have a different question, which is no longer about traits but about any API in general. I've created a topic on IRLO.

I am not trying to say that libraries should not be used or that unsafe code is useless. Rather, I am trying to find out whether it is possible or practical to formally prove that some unsafe code is sound while using libraries at the same time.
