Can't convert usize to u64?

We don't want to make it easy to accidentally write code that compiles on 64-bit targets but not on 32-bit targets.

.into() catches wrong conversions at compile time. as causes bugs at run time.

For example if I have foo as bar, and bar type is too small, the as will still proceed and cut bits off without a warning, which is likely to cause a bug at run time. OTOH foo.into() will not compile if bar type is too small.
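
To make the difference concrete, here is a minimal sketch using fixed-width types (the function names are made up for illustration). `as` always compiles and silently drops the high bits, `From`/`Into` exist only for widening conversions, and `TryFrom` is the checked option for narrowing:

```rust
use std::convert::TryFrom;

/// `as` never fails: it keeps only the low 8 bits of the input.
fn truncating_cast(x: u32) -> u8 {
    x as u8
}

/// `From`/`Into` are only implemented for widening integer conversions,
/// so this cannot lose data.
fn widening(x: u8) -> u32 {
    u32::from(x)
}

/// There is no `Into<u8>` for `u32` (that would not compile);
/// `TryFrom` reports out-of-range values at run time instead.
fn checked_narrowing(x: u32) -> Option<u8> {
    u8::try_from(x).ok()
}
```

With these definitions, `truncating_cast(300)` yields 44 (300 modulo 256), while `checked_narrowing(300)` yields `None`.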

I don't see how it's a problem for 32-bit targets. When usize is u32 it can be safely and losslessly converted to u64.

For example if I have foo as bar, and bar type is too small, the as will still proceed and cut bits off without a warning, which is likely to cause a bug at run time. OTOH foo.into() will not compile if bar type is too small.

Personally this doesn't seem like a very significant pitfall to me.

I don't see how it's a problem for 32-bit targets. When usize is u32 it can be safely and losslessly converted to u64.

But it can't be losslessly round-tripped. If I'm on a 32-bit target, and I do:

// on a 32-bit target, assuming both Into impls existed
let foo = std::usize::MAX;
let mut bar: u64 = foo.into(); // just zero-extended (usize is unsigned)
bar += 10; // it's a u64 so this is fine
let baz: usize = bar.into(); // uh-oh, the value no longer fits in 32 bits

At least to me it doesn't make sense to provide Into<u64> for usize but not Into<usize> for u64, which would be needed if you want calls to into() to be statically guaranteed as lossless.
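
The round trip can be reproduced on any machine by using `u32` as a stand-in for a 32-bit `usize`. This is only a sketch: the `Usize32` alias and the function name are assumptions for illustration, and the narrowing step uses `as` because no `Into` impl exists for it.

```rust
/// Stand-in for `usize` on a 32-bit target (an assumption for illustration).
type Usize32 = u32;

/// Widen to u64, add `delta`, then narrow back the way `as` would.
fn round_trip_after_add(start: Usize32, delta: u64) -> Usize32 {
    // Lossless widening, then arithmetic in u64.
    let wide = u64::from(start) + delta;
    // Narrowing keeps only the low 32 bits, so large values wrap.
    wide as Usize32
}
```

Here `round_trip_after_add(u32::MAX, 10)` comes back as 9 rather than the value that was computed in `u64`.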

However, you may be interested in turning clippy warnings on for this so that you will have to explicitly ignore cases where the behavior is intended:

I don't know, but I would guess this is for future-proofing the language. If Rust is around for as long as C has been around, then it will probably see CPUs where a usize is 128 bits, in which case the conversion from usize to u64 won't be trivial.

But how? Let's see:

  • If foo.into() converted usize to u64:

    • In 64-bit usize — works correctly.
    • In 128-bit usize — does not work correctly, but catches the invalid conversion at compile time.
  • If I have to use foo as u64:

    • In 64-bit usize — works correctly.
    • In 128-bit usize — does not work correctly, but the compiler does not catch the invalid conversion. Produces buggy code without warning.

So a 128-bit usize would break this code no matter what. The only difference is that .into() can catch the bug, and as can't.

The difference is between future-proofing the language (or standard library) and future-proofing your code. IMO a trait implementation in std should either be present for all platforms or for none. If a type is platform-specific and there is any valid representation of it for which the trait could not be implemented, then the trait should not be implemented at all.

The Into::into docs state: "this trait must not fail", so Into<u64> for usize is not implemented, so that in the future we can have a usize with more than 64 bits.

For further reference, your complaint is exactly this:
https://github.com/rust-lang/rust/issues/30495

And that referenced two other PRs:
https://github.com/rust-lang/rust/pull/29220
https://github.com/rust-lang/rust/pull/28921

I don't think there's consensus that this would actually be a good thing. Suppose I'm dealing with numbers far smaller than u32::MAX, but I've been convinced that I need to use into() because it's "safer." Now my library may not compile on all platforms. In general, having traits like Into implemented inconsistently across platforms seems like it could be a bigger pitfall than as conversions.
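
As a rough sketch of what a platform-dependent impl would look like, here is a hypothetical trait (`LosslessToU64` is invented for this example and is not part of std) that is implemented for usize only where usize is at most 64 bits wide. Any caller of `len_as_u64` would stop compiling on a target with a wider usize, even if the values involved are tiny:

```rust
/// Hypothetical stand-in for a lossless conversion trait; not part of std.
trait LosslessToU64 {
    fn to_u64_lossless(self) -> u64;
}

// Always valid: u32 fits in u64 on every target.
impl LosslessToU64 for u32 {
    fn to_u64_lossless(self) -> u64 {
        u64::from(self)
    }
}

// Only provided where usize is at most 64 bits wide.
#[cfg(any(
    target_pointer_width = "16",
    target_pointer_width = "32",
    target_pointer_width = "64"
))]
impl LosslessToU64 for usize {
    fn to_u64_lossless(self) -> u64 {
        self as u64 // lossless under the cfg condition above
    }
}

/// Compiles only on targets where the usize impl above exists.
fn len_as_u64(data: &[u8]) -> u64 {
    data.len().to_u64_lossless()
}
```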

Haha, looks like it's not the first time I ran into this problem :smiley:

Heh, I didn't even make the connection that it was your bug.

Is this really a concern? 16 Exabytes is a lot of memory; it doesn't seem likely that we'd ever (at least, while Rust is still relevant) want pointers this big, except maybe if we wanted a single address-space across multiple nodes (even then, only a few supercomputers have exceeded 1PB so far; ORNL's new Summit has "more than 10 petabytes of memory" which still leaves a factor of 1000 before exhausting the 64-bit address space).

I don't really disagree, but I'm sure this was said about 4GB (32-bit max) a couple decades ago.

Do keep in mind that pointers live in a virtual address space, which current operating systems use for mapping disk data as well as RAM. On the disk front, we are getting close to the exabyte scale in large data centers.

RISC-V defines the RV128I 128-bit base integer instruction set, in which usize = u128. This is the size of a virtual address, so it could be encountered on a system with much less than 16 EB of memory by anyone with an emulated or real RV128I implementation.

I think a better question is: why isn't there a u64: TryFrom<usize>?
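
For anyone reading this on a newer toolchain: as far as I know, TryFrom has since been stabilized with impls between usize and the fixed-width integers in both directions. A minimal sketch of how that looks (the helper names are made up for illustration):

```rust
use std::convert::TryFrom;

/// Checked usize -> u64; would return None only if usize were wider than 64 bits.
fn usize_to_u64(n: usize) -> Option<u64> {
    u64::try_from(n).ok()
}

/// Checked u64 -> usize; returns None when the value does not fit,
/// for example a large value on a 32-bit target.
fn u64_to_usize(n: u64) -> Option<usize> {
    usize::try_from(n).ok()
}
```

This compiles the same way on every target, and the platform difference only shows up as a run-time `None`.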

Per [1212.0703] The Cost of Address Translation,

While 64 bit addresses are sufficient to address any memory that can ever be constructed according to known physics, there are other practical reasons to consider longer addresses.

There is something happening in nightly with primitive conversions.