Maximum slice length - Is this sound?

I assume the following function is sound, but I'm not sure, and I don't know how to prove it:

pub fn foo<T>(x: &[T]) -> &[T] {
    unsafe {
        std::slice::from_raw_parts(x.as_ptr(), x.len())
    }
}

The safety requirement of std::slice::from_raw_parts that worries me is:

  • The total size len * mem::size_of::<T>() of the slice must be no larger than isize::MAX. See the safety documentation of pointer::offset.

If T is u8 and x.len() is isize::MAX + 1, then this would be UB, right?

The linked documentation of pointer::offset says that "for instance, Vec and Box ensure they never allocate more than isize::MAX bytes", but does this limit hold for slices in general too? I hope it does. If it does, where is that documented? If not, what can I do?
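
To make the precondition in question concrete, here is a minimal sketch of the size check as a standalone function. The name fits_in_isize is hypothetical and not part of the standard library; it only restates the quoted requirement in code.

```rust
use std::mem::size_of;

/// Hypothetical helper (not part of std): returns true if a slice of `len`
/// elements of `T` would satisfy the size precondition quoted above, i.e.
/// `len * size_of::<T>() <= isize::MAX`.
pub fn fits_in_isize<T>(len: usize) -> bool {
    match len.checked_mul(size_of::<T>()) {
        // The product fits in usize; compare it against isize::MAX.
        Some(bytes) => bytes <= isize::MAX as usize,
        // The multiplication itself overflowed usize, so it is too large.
        None => false,
    }
}
```

Under this check, fits_in_isize::<u8>(isize::MAX as usize + 1) is false, which is exactly the case the question is about. Note that the bound is on bytes, so a zero-sized element type passes for any length.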

Your function is sound, but I don't know where it is documented.

Good that it's sound :sweat_smile: (though I guess it's also unlikely an allocation is that big). Anyway, I still would like to know where it's documented, so I can add proper SAFETY comments. Maybe someone else knows.

For now, I just wrote this in my code:

// SAFETY:
//  *  […]
//  *  The element count (multiplied by `size_of::<u8>()`)
//     is not expected to be larger than `isize::MAX`. TODO

Which is a bit unsatisfying.

But the total size in bytes (len() * size_of::<T>()) is never > isize::MAX. It would already be unsound if you managed to get hold of such a slice from any source. The standard library never produces one; it's simply not allowed. (And if you obtained such a slice from a third-party source, then that code is incorrect.) Thus, your own code can't possibly be unsound merely due to performing such an identity conversion on a slice.

On a 16-bit CPU, isize::MAX is 32767, so not that big. :wink:

It's good if it's not allowed, but where is that documented? What worries me a bit is this:

For instance, Vec and Box ensure they never allocate more than isize::MAX bytes, so vec.as_ptr().add(vec.len()) is always safe.

This explicitly refers to Vec and Box (and gives them as examples), but this isn't exhaustive. Moreover, the sentence before reads:

The compiler and standard library generally tries to ensure allocations never reach a size where an offset is a concern.

What does "generally tries to ensure" mean? :fearful: Or rather: what does "try" mean? :see_no_evil:

The isize type is a signed integer type with the same number of bits as the platform's pointer type. The theoretical upper bound on object and array size is the maximum isize value. This ensures that isize can be used to calculate differences between pointers into an object or array and can address every byte within an object along with one byte past the end.

From the Rust reference
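
As an illustration of the pointer-difference argument in that passage, here is a small sketch that measures the byte distance between the start of a slice and one past its end. The helper name byte_span is hypothetical; the point is only that the result is an isize, which is why the byte size must not exceed isize::MAX.

```rust
/// Hypothetical helper: distance in bytes from the start of `x` to one past
/// its end, computed as a signed pointer difference.
pub fn byte_span<T>(x: &[T]) -> isize {
    let range = x.as_ptr_range();
    // SAFETY: both pointers are derived from the same slice (its start and
    // its one-past-the-end address), and the distance between them in bytes
    // is the slice's size, which must fit in `isize` for any valid slice.
    unsafe { range.end.cast::<u8>().offset_from(range.start.cast::<u8>()) }
}
```

For a slice of four u32 values this returns 16, and for a slice of zero-sized values it returns 0 regardless of the length.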

Since PR 95295, it is impossible to soundly construct a Layout with a (padded) length greater than isize::MAX. Therefore, GlobalAlloc::alloc() cannot be used with such a long length. (GlobalAlloc::realloc() still doesn't have that requirement, but I think that's currently considered a documentation bug, since it uses a Layout under the hood.) Also, the compiler enforces that no named type is larger than isize::MAX after monomorphization, so that prevents any overly large locals from being declared. Thus, the only way to soundly receive a block of memory longer than isize::MAX is to directly use virtual-memory functions like mmap() (or FFI functions that indirectly call such functions).
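
The Layout restriction described above can be observed from safe code. This is a rough sketch, assuming a toolchain recent enough to include that PR; the two function names are made up for illustration.

```rust
use std::alloc::Layout;

/// Hypothetical helper: can a `Layout` of `size` bytes with alignment 1 be built?
pub fn layout_is_constructible(size: usize) -> bool {
    Layout::from_size_align(size, 1).is_ok()
}

/// Hypothetical helper: can a `Layout` for an array of `len` bytes be built?
pub fn byte_array_layout_is_constructible(len: usize) -> bool {
    Layout::array::<u8>(len).is_ok()
}
```

Both helpers should accept isize::MAX as usize and reject isize::MAX as usize + 1, so a safe caller has no Layout to hand to the allocator for a larger block.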

In my case (or the given toy example), it doesn't matter if there can be an allocation that big. It can always exist due to FFI, like you said. And I don't think an unsafe block becomes unsound just because it's making such an allocation, does it?

I think the question is not whether it's guaranteed that no such allocation exists, or whether there can be raw pointers to one.

The question is rather: Is it ruled out that an ordinary shared/exclusive Rust reference to such a value exists (or can be crafted by safe code or sound unsafe code)?

You're looking for rust-lang/rust issue #18726, "array types (fixed-size arrays on 32-bit, HashMap, Vec) can be large enough that indexing is unsound".

This has been ruled out since 1.0.

No, I don't think so. I take a slice, not an array.

It's not a documentation bug; it was explicitly punted on. There's a follow-up ACP though.

Even if FFI arranged a larger block of memory, it couldn't be formed into a &[T] without violating the same safety requirements that you originally linked.

That's what I suspect too, but who guarantees that std::slice::from_raw_parts is the only way to create a slice? There could be other functions like it, possibly added in the future (e.g. if it becomes possible to manipulate wide pointers directly). My point is: is there an explicit guarantee that this will never be possible?

I think you have a point here. The standard library does, in fact, respect the invariant that all &DST and &mut DST references have a size of at most isize::MAX at runtime, but we don't actually document that invariant formally. Perhaps doing so would be a good idea.

Actually, there does seem to be a brief reference to this invariant in the Rust Reference:

Note that dynamically sized types (such as slices and strings) point to their entire range, so it is important that the length metadata is never too large. In particular, the dynamic size of a Rust value (as determined by size_of_val) must never exceed isize::MAX.

So if the standard library were to internally create a DST larger than isize::MAX, it would be unsound to expose it to the user by this rule, since it covers existence, rather than just creation. (I suppose the Rust Reference is non-normative, but that's neither here nor there; there are all sorts of language rules that simply aren't documented in the standard library.)
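
With that passage in hand, the SAFETY comment for the function from the original question could be written along these lines. This is only a sketch of one possible wording, and it leans on the (possibly non-normative) Reference text quoted above.

```rust
pub fn foo<T>(x: &[T]) -> &[T] {
    // SAFETY:
    //  * `x.as_ptr()` and `x.len()` come from an existing, valid `&[T]`, so the
    //    pointer is non-null, aligned, and valid for reads of `x.len()`
    //    elements for the lifetime of the returned reference.
    //  * The returned slice shares the lifetime of the borrow `x`, so the
    //    memory is not mutated (except inside an `UnsafeCell`) while it lives.
    //  * Per the Rust Reference, the dynamic size of a value (as determined
    //    by `size_of_val`) must never exceed `isize::MAX`, so
    //    `x.len() * size_of::<T>() <= isize::MAX` already holds for `x`.
    unsafe { std::slice::from_raw_parts(x.as_ptr(), x.len()) }
}
```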

There's a chain of history here I didn't fully read, but anyway:

The reasoning behind all slices being limited to isize::MAX bytes is discussed here.

So that also means that std::mem::size_of_val::<T>() is guaranteed to never return an integer (of type usize) which is greater than isize::MAX, right? If so, maybe that's worth noting there.

Correct, and rustc tells LLVM as much: see rust-lang/rust pull request #105446, "Add 0..=isize::MAX range metadata to size loads from vtables" by erikdesjardins.
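
As a small illustration that the bound is on size_of_val rather than on len(), here is a sketch with a slice of zero-sized elements whose length exceeds isize::MAX while its size stays 0. The function names are hypothetical and exist only for this example.

```rust
use std::mem::size_of_val;
use std::ptr::NonNull;

/// Hypothetical example: a slice of `usize::MAX` zero-sized elements.
pub fn huge_unit_slice() -> &'static [()] {
    // SAFETY: `()` is zero-sized, so the total size is 0 bytes regardless of
    // the length; a dangling pointer is non-null and aligned, which is all
    // that a zero-sized read requires.
    unsafe { std::slice::from_raw_parts(NonNull::<()>::dangling().as_ptr(), usize::MAX) }
}

/// Hypothetical check of the invariant discussed above.
pub fn size_within_bound<T: ?Sized>(val: &T) -> bool {
    size_of_val(val) <= isize::MAX as usize
}
```

Here huge_unit_slice().len() is usize::MAX, yet size_within_bound still returns true for it, as it should for every reference that exists in a sound program.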

I don't know how normative any of this is (so documenting it may not be possible without an RFC or FCP), but that seems to be the current intention, yes.