For the 2024 edition, should some functions return Option?

Rust 2024 will be released soon, but I think it's worth changing the return type of some functions and designing a Rust replacement for glibc (GNU libc, used for syscalls). E.g., slice::len returns usize, which is limited to 2^n-1 (an 8-bit unsigned integer, for example, tops out at 255). If it returned Option instead, using Some(0) to mean the slice has 1 element and None to mean it has no elements, the limit could be raised to 2^n. However, syscall interfaces like glibc and musl are designed according to the POSIX standard, so returning Option there would require Rust to design its own standard to replace POSIX. Rust already allows returning Result from the main function, i.e.,

fn main() -> Result<(), Box<dyn Error>>

But the POSIX standard requires an integer, which is incompatible and wastes some bytes, since the returned integer costs Rust programs extra bytes.

...and cause a lot of confusion (and almost guaranteed silent logic bugs) for users. Given that usize is at least 32-bit, 2^n is high enough that this single number is unlikely to matter.

6 Likes

What confusion?

Editions aren't carte blanche to make whatever changes one wishes.

Editions do not split the ecosystem

The most important rule for editions is that crates in one edition can interoperate seamlessly with crates compiled in other editions. This ensures that the decision to migrate to a newer edition is a “private one” that the crate can make without affecting others, apart from the fact that it affects the version of rustc that is required, akin to making use of any new feature.

The requirement for crate interoperability implies some limits on the kinds of changes that we can make in an edition. In general, changes that occur in an edition tend to be “skin deep”. All Rust code, regardless of edition, is ultimately compiled to the same internal representation within the compiler.

16-bit.[1]


  1. 1, 2, 3... ↩︎

8 Likes

If [T]::len returns Some(42), how many elements are in this [T]? Intuitively, it would be 42. By your description, it seems it would be 43.

4 Likes

Indexing is similar: a[0] refers to the 1st element.

There are OS-specific methods in std::os. If the changes were put into such specific methods, I think that would be acceptable and wouldn't break backward compatibility.

Slice lengths always fit in usize, so there is no value in extending the range of [T]::len() beyond what can be represented in usize.

8 Likes

Seems like an unnecessary complication to me. If you have a slice, it has a length, so there is no reason for .len() to return an Option. That length is a usize, which is enough to represent up to the size of your memory in bytes minus 1. There is no point in trying to get that extra 1; you can't make an array or slice anywhere near that big.

3 Likes

How about the main function returning Result? The POSIX standard requires an integer, which wastes some bytes for Rust programs.

You mean like:

use std::process::ExitCode;

fn main() -> ExitCode {
  ExitCode::from(2)
}

More generally, the things that can be returned from main are those that implement std::process::Termination, which allows them to be converted to a std::process::ExitCode, which I would expect to be implemented as a POSIX-compliant integer on POSIX-compliant systems. So main already ultimately outputs an integer.
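As a rough illustration of that conversion, the sketch below calls Termination::report directly on a few values that main is allowed to return. The helper name exit_code_of is made up for this example and is not part of std.

```rust
use std::process::{ExitCode, Termination};

// Hypothetical helper (not part of std): converts any value that `main`
// could return into the ExitCode that would be handed to the OS.
fn exit_code_of<T: Termination>(value: T) -> ExitCode {
    value.report()
}

fn main() {
    // `()` and `Ok(())` both report success.
    assert_eq!(exit_code_of(()), ExitCode::SUCCESS);
    assert_eq!(exit_code_of(Ok::<(), String>(())), ExitCode::SUCCESS);

    // An `Err` is printed to stderr via its Debug impl and reports failure.
    let failed: Result<(), String> = Err("something went wrong".to_string());
    assert_eq!(exit_code_of(failed), ExitCode::FAILURE);
}
```

Whatever type main returns, the value that leaves the process is an ExitCode, so the Result itself never reaches the operating system.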

3 Likes

The len method returns the length. It doesn't return the index of the last element.
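To make the distinction concrete, here is a minimal sketch. The last_index helper is hypothetical (it is not a std method); it returns roughly what the proposal would have len return, which shows that this information can already be derived from len when it is wanted.

```rust
// Hypothetical helper (not in std): index of the last element, if any.
fn last_index<T>(slice: &[T]) -> Option<usize> {
    // checked_sub yields None for an empty slice instead of underflowing.
    slice.len().checked_sub(1)
}

fn main() {
    let a = [10, 20, 30];
    assert_eq!(a.len(), 3); // the length counts elements
    assert_eq!(last_index(&a), Some(2)); // the last index is one less
    assert_eq!(a[2], 30);

    let empty: [i32; 0] = [];
    assert_eq!(empty.len(), 0);
    assert_eq!(last_index(&empty), None);
}
```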

3 Likes

usize is, in fact, already too big for every real use-case of slices.

As described in https://doc.rust-lang.org/nightly/std/ptr/index.html#allocated-object, a slice can be at most isize::MAX bytes, so for example [i32]::len can be at most isize::MAX/4, aka usize::MAX/8.

And thus usize is already wasting at least one bit, so if anything we'd want something smaller, not phrasing it in a way that allows more values.

(Only slices of ZSTs can have len == usize::MAX, but I've never seen a single place where that's actually useful. It'd be nicer if slices of ZSTs also couldn't be longer than isize::MAX.)
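The arithmetic can be checked with a short sketch. The function max_slice_len is a hypothetical helper written for this post, not a std API; it only applies the isize::MAX-bytes rule described above and special-cases zero-sized types.

```rust
use std::mem::size_of;

// Hypothetical helper (not in std): upper bound on the length of a [T],
// derived from the rule that a slice spans at most isize::MAX bytes.
fn max_slice_len<T>() -> usize {
    match size_of::<T>() {
        // Zero-sized types occupy no bytes, so only usize itself limits len.
        0 => usize::MAX,
        size => isize::MAX as usize / size,
    }
}

fn main() {
    // For 4-byte elements the bound is isize::MAX / 4, i.e. usize::MAX / 8.
    assert_eq!(max_slice_len::<i32>(), usize::MAX / 8);
    // For single bytes the bound is isize::MAX, half of what usize can hold.
    assert_eq!(max_slice_len::<u8>(), isize::MAX as usize);
    // Slices of ZSTs are the only ones that could reach usize::MAX.
    assert_eq!(max_slice_len::<()>(), usize::MAX);
}
```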

9 Likes

It wastes some bytes, since it converts the Result into an integer, which uses more bytes. If Rust could replace the POSIX standard, it would save the bytes used by that integer.

Does it?

The example:

use std::process::ExitCode;

fn main() -> ExitCode {
  ExitCode::from(2)
}

compiles to the following assembly:

example::main::he300f0667a1659b3:
        mov     al, 2
        ret

Which is what C would do as well.

3 Likes

Your proposal is incoherent.

POSIX is a set of operating system APIs; Result and Option are data structures in Rust. They are different concepts, and one is not a replacement for the other.

Rust can't save the 1 byte allocated for a process exit code in Linux because that's the operating system's job not Rust's.

2 Likes

It's easy to make the maximum value of a type larger by simply using a different type with more bits. Option<T> adds one byte to T (although due to alignment it's probably often size_of::<T>() bytes in practice). If you want to increase the maximum value from 2^n-1 (which, as others have pointed out, you likely don't want to do), adding a whole byte just to use a single bit doesn't make much sense.
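A small sketch of that trade-off follows, under the assumption that the goal is only to represent the value usize::MAX + 1. The helper names option_overhead and one_past_usize_max are invented for this example. The exact size of Option<usize> is a layout detail of the compiler, so the code prints it rather than asserting a specific number.

```rust
use std::mem::size_of;

// Hypothetical helper: extra bytes Option<usize> needs compared to usize.
// Every bit pattern of usize is a valid value, so None needs space of its own.
fn option_overhead() -> usize {
    size_of::<Option<usize>>() - size_of::<usize>()
}

// Hypothetical helper: the value just past usize::MAX, held in a wider type.
fn one_past_usize_max() -> u128 {
    usize::MAX as u128 + 1
}

fn main() {
    println!("usize:         {} bytes", size_of::<usize>());
    println!("Option<usize>: {} bytes", size_of::<Option<usize>>());
    println!("overhead:      {} bytes", option_overhead());

    // A wider integer reaches 2^n directly, with ordinary arithmetic.
    assert_eq!(one_past_usize_max() - 1, usize::MAX as u128);
}
```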

…w-what?!

Just for completeness, I'd note that the same happens if you use Result<(), ()> instead of ExitCode; Err(()) compiles to mov al, 1, but is otherwise the same as your ExitCode example.