Why numerical computation casting hell?


#1

One thing I am finding really annoying in Rust is the need to manually cast different number types to the largest|proper type in order to do regular math.

Why doesn’t the compiler default to automatically infer the necessary casting in order to do math?

Here is some C++ code in a program I’m translating to Rust.

void printprms(uint Kn, uint64 Ki)
{
  // Extract and print the primes in each segment:
  for (uint k = 0; k < Kn; ++k)       // for Kn residues groups|bytes
    for (uint r = 0; r < 8; ++r)      // for each residue|bit in a byte
      if (!(seg[k] & (1 << r)))       // if it's '0' it's a prime
        cout << " " << mod*(Ki+k) + residues[r+1];
  cout << "\n"; 
}

Here’s what I have to do in Rust to get it to compile.

fn printprms(Kn: usize, Ki: u64) {
  // Extract and print the primes in each segment:
  for k in 0..Kn {                    // for Kn residues groups|bytes
    for r in 0..8 {                   // for each residue|bit in a byte
      if (seg[k] as usize) & (1 << r) == 0 {  // if it's '0' it's a prime
        print!("{} ",(Ki+k as u64)*(rmod as u64) + residues[r+1] as u64);
      }
    }
  }
  println!(""); 
}

In both instances seg[] is a byte array (u8 elements in Rust), and residues is an array of uint|usize.

As someone coming mostly from Ruby, which was designed to make programmers "happy", this seems very unnecessary. Why not have the compiler do all this grunge work, and throw warning messages if necessary to let you know what it assumed you wanted it to do? If C++ (et al) can figure this out, why not let the Rust compiler do it too?

This was using Rust 1.6|7.


#2

This is me projecting somewhat, but Rust isn’t here to make it easy to write code; it’s here to make it easy to write correct and maintainable code.

Implicit numerical casting is super convenient, but also a terrible idea because it makes it really hard to tell what’s actually going on. The problem is that implicit casting is invisible, so you can’t tell from just looking at the code where it’s happening, and that can be important. Instead, you have to load the casting rules into your head and manually apply them to the code.

Rust doesn’t permit this, and forces you to spell it out, at which point reading the code is trivial. I don’t have to have any rules in my head, I just look and there are the conversions, right there.

Which also allows me to look at your code and wonder why on earth you’re casting to usize at all. I mean, (1 << r) should always be in range for u8, so why usize? Is there something weird going on? I thought seg was supposed to be a byte array, too.

And, wait a minute, why is residues an array of usize? Shouldn’t it be u64 as well? Should the residues be tied to the size of a pointer? That seems really wrong.

As a final point, there are already plenty of languages that let you bang out code loose and fast. For one, I’m super glad that Rust is explicitly designed against that trend.


#3

Here is some history regarding the use of u8 and u64 as vector indices. If the linked issue was accepted as an RFC then you could declare r as u8 without having to write residues[r as usize +1].

I don’t run into problems with casting very often, but I can see why it is annoying for your use case.


#4

Ruby doesn’t have as many fixnums as Rust. And Rust tries to avoid the “type soup” of your average C program.

On the other hand I agree that casts are dangerous, so forcing the Rust programmer to insert too many casts in the code could decrease the safety of the Rust code compared to D language code.

But I’d like array/vector/slice indexing to be allowed with more than just usize. Indexing with u8/u16/u32 should be allowed.

If your rmod is of type u64, seg is an array of u8, and residues is an array of u64 then you can write this Rust code:

fn print_primes(kn: usize, ki: u64) {
    // Extract and print the primes in each segment:
    for k in 0 .. kn { // For kn residues groups|bytes.
        for r in 0 .. 8 { // For each residue|bit in a byte.
            if seg[k] & (1 << r) == 0 { // If it's '0' it's a prime.
                print!(" {}", rmod * (ki + k as u64) + residues[r + 1]);
            }
        }
    }
    println!("");
}

Recent versions of Rust allow you to perform a conversion from smaller to larger integral types without using hard casts. In general try to avoid hard casts in your Rust code.

For example, if your residues is an array of u32 you can write:

print!(" {}", rmod * (ki + k as u64) + u64::from(residues[r + 1]));
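
To make that line concrete, here is a minimal self-contained sketch. It assumes the residues are stored as u32 and rmod as u64, and it passes the segment in as a slice instead of using a global; `segment_primes`, `RESIDUES`, and `RMOD` are names chosen for illustration, not taken from the original program.

```rust
// Assumed for illustration: the thread's residue values, stored as u32.
const RESIDUES: [u32; 9] = [1, 7, 11, 13, 17, 19, 23, 29, 31];
const RMOD: u64 = 30; // prime generator mod value

/// Hypothetical helper: collects the primes marked by zero bits in `seg`.
fn segment_primes(seg: &[u8], kn: usize, ki: u64) -> Vec<u64> {
    let mut primes = Vec::new();
    for k in 0..kn {
        for r in 0..8 {
            // The mask stays in u8, so no cast is needed for the bit test.
            if seg[k] & (1u8 << r) == 0 {
                // u64::from is a lossless widening of the u32 residue;
                // `k as u64` is still a hard cast because k is a usize.
                primes.push(RMOD * (ki + k as u64) + u64::from(RESIDUES[r + 1]));
            }
        }
    }
    primes
}

fn main() {
    // Bits 0 and 2 are zero, so residues 7 and 13 are reported.
    let seg = [0b1111_1010u8];
    println!("{:?}", segment_primes(&seg, 1, 0));
}
```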

The rustc compiler also has two handy warnings, disabled by default, that tell you when you’re adding useless casts. I always keep them on:

-W trivial-casts
-W trivial-numeric-casts
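
The same two lints can also be switched on from inside the source with an attribute. A small sketch, with a hypothetical function `scaled` that contains one deliberately redundant cast:

```rust
// Enable both lints for this function only (they are allowed by default).
#[warn(trivial_casts, trivial_numeric_casts)]
fn scaled(ki: u64, k: usize) -> u64 {
    let rmod: u64 = 30;
    // `rmod as u64` casts a u64 to u64, so the lint should report it.
    // `k as u64` actually changes the type and is not reported.
    (rmod as u64) * (ki + k as u64)
}

fn main() {
    println!("{}", scaled(2, 3));
}
```

The code still compiles; the lint only produces a warning pointing at the cast that does nothing.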

#5

In the code rmod and residues are global constant variables (as in the C++ code).

// Global parameters
static residues: [usize; 9] = [1, 7, 11, 13, 17, 19, 23, 29, 31];
const rmod: usize = 30;   // prime generator mod value
...
...

As I said, all I initially want(ed) to do was translate the working C++ version into a working Rust version. Making it idiomatically and optimally Rust can come at a later time.

The coding example shown was what was sufficient to get the Rust compiler to compile it.
If there are other alternative coding approaches I’d be happy to see them.

The larger issue I’m raising, since Rust is still young and malleable, is I think its developers should be (more) concerned how its philosophical creed affects the humans who may want to use it. The language can be both rigorous AND user friendly. These qualities need not be in conflict.

Take these examples:

let mut a = 0u8;
let mut b = 0usize;
let mut c = 0u64;

// case 1:
c = a + b;

// case 2:
b = a + c;

// case 3:
a = b + c;

For case 1 it should be obvious that no explicit casting should be necessary as the result takes two smaller sized values and puts their sum into a variable that can completely contain it.

For case 2 probably a warning|error is applicable to note c could overflow b. I can see where an explicit casting as b = a + c as usize would be necessary, but I don’t see why a needs to be explicitly cast, for the reason stated in case 1.

For case 3, where both variables exceed the size of the containing variable, the same issue arises. However, this code should be valid, as the intent (to me) is to only use the lower 8 bits of the sum of the variables.

Maybe it would be better coding practice to do case 2|3 explicitly as:

b = (a + c) as usize and a = (b + c) as u8, with implicit temporary operational casting, or
b = (a + c) & 0xffffffff and a = (b + c) & 0xff to show explicitly the numerical intent.

Another way is to create compiler flags (say a fit flag) to do explicit implied numerical casting.
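
For comparison, here is one way the three cases can be spelled out in current Rust. This is only a sketch: the function names are hypothetical, and `TryFrom` was stabilized well after the Rust version mentioned at the top of the thread.

```rust
use std::convert::TryFrom;

// Case 1: widen both operands, then add in u64.
// usize has no `From` conversion into u64, so `as` is used for `b`.
fn case1(a: u8, b: usize) -> u64 {
    u64::from(a) + b as u64
}

// Case 2: add in u64, then narrow with a checked conversion.
// Returns None if the sum does not fit in usize on this platform.
fn case2(a: u8, c: u64) -> Option<usize> {
    usize::try_from(u64::from(a) + c).ok()
}

// Case 3: add in u64, then deliberately keep only the low 8 bits.
// `as u8` truncates, which is the stated intent here.
fn case3(b: usize, c: u64) -> u8 {
    (b as u64 + c) as u8
}

fn main() {
    println!("{}", case1(200, 100));
    println!("{:?}", case2(200, 1_000));
    println!("{}", case3(100, 1_000));
}
```

Each conversion states whether it is lossless (`from`), checked (`try_from`), or truncating (`as`), which is the distinction the implicit version would hide.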

The ultimate point I’m trying to make, as a friendly suggestion to the developers, is to make this type of thing as easy and flexible as possible to users (people like me). I suspect people with a lot less interest AND persistence to learn this language will NOT persevere through these language quirks. Again, the language should be subservient to the human programmer, NOT the other way around.

That is one of the major reasons people love Ruby. I can do so much in Ruby without having to think (worry and be frustrated) about micro implementation issues such as these. Once I know how to do what I want, I can then concern myself with speed, concurrency, scaling, etc.

I completely understand Rust is meant to be an at-the-hardware level system programming language like C, and has different design goals and concerns than Ruby. I get that, which is why I’m taking the time to learn it, primarily based on its concurrency and memory use model. But people use and stick with Ruby on Rails because, even with all the issues it has, you can quickly get a lot of work done with it.

I also agree with the point that vector indexes should be able to take indices other than usize. If I have a u8 vector it shouldn’t be restricted to only usize number of elements, instead of u8 or u64 number of elements as well. The type of the content should not restrict the length of the vector.

However, the divinity is in the detail. I would urge you to always think about how, ultimately, to make the language as easy to use, and think in, as possible for real people (not language designers). You’ve got a smart compiler, just make it smarter! You don’t want someone to fork your great ideas and create Rusteasy! :smile:


#6

Since you mention C: quite a few features of Rust are designed the way they are because of experience with C. Implicit casting between integer types is a nuisance in C, as you should be able to see after taking this quiz. Enjoy!

Well, usize is pointer sized, so you’ll be hard pressed to create a vector with more elements than usize::MAX :slight_smile:

There are two principles that I think make the most sense: have only one, unbounded, integer type ala Python (and maybe Ruby?), or have bounded types with mostly explicit casting ala Rust. I enjoy the former, but I know that I’m paying for it in runtime cost. I accept the latter, because I can be sure I understand the rules.


#7

Hello JZakiya,

Having used Python and C++ for many years, I understand your expectation of automatic casts in places where they are safe.

However, the Rust language is designed to minimize the possibility of writing code that does something you are not expecting. (Which is close to Python’s principle of least astonishment.)

If you want a numeric type that allows these automatic casts, you are free to write it or use a crate that provides it (I didn’t find one)!

If converting C++ to Rust were easy, it would be very difficult to make Rust a safe language. C++ allows way too many unsafe things.

For me, writing usize, u8, u64 makes it clear that these are machine-level types, with their limitations. For numerical calculations a type unbounded in size would be a better fit. At the cost of slower computations (when the size is unknown) and higher memory usage, you could get a type that represents the natural numbers.

For indexing into a vector (or anything linear in memory), Rust has chosen the convention of using usize.
It is the type for pointer arithmetic, and all indexing operations end up as pointer arithmetic.

Easy, fast and safe… pick two. (No free lunch theorem.)


#8

What are you referring to here - new syntax? some new type system feature?


#9

Some people are planning on changing this decision.

I meant:

fn main() {
    let x1: u8 = 10;
    let x2: u16 = 10;
    let y1: u32 = x1.into();
    let y2 = u32::from(x2);
}

But currently usize/isize are not allowed to be converted like that.


#10

I would love for someone to create ‘Rusteasy’!

I of course would not use it over Rust, I would probably find Rust superior due to its numerical casting rules :wink:

However, I would love to see people trying new things with Rust’s ideals.


#11

There is for example dyon, a scripting language with Rust-like syntax and concepts.


#12

I feel your pain.

Rust is super annoying with usize indexing and needing risky casts seemingly everywhere if you don’t stick to using usize exclusively for your entire program.

In recent versions .into() has been added to numeric types, so at least in some cases you can use foo.into() instead of foo as usize. It doesn’t save much typing, but it at least checks that the conversion is lossless.


#13

I think that kind of conversion doesn’t work with usize.


#14

I was surprised at the verboseness of number casting when I started learning Rust; Swift also has the same issue, today, but there’s been discussion on the mailing lists of adding support for implicit widening conversions.

It would be interesting to see how that evolves and what issues they run into as it could inform Rust’s direction in this area.


#15

The current Rust situation is not terrible, because in some cases you can now avoid the dangerousness of hard casts:

fn main() {
    let x1: u8 = 10;
    let x2: u16 = 10;
    let y1: u32 = x1.into();
    let y2 = u32::from(x2);
}

Still, indexing arrays/slices/vecs is a bit too heavy, partly because you can’t use into/from with usize.
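
A note for readers on newer compilers: the standard library does provide lossless `From` conversions into usize from u8 and u16 (though not from u32 or u64), which covers small index types. A minimal sketch, where `residue_at` is a hypothetical helper and the table reuses the thread's residue values:

```rust
// Assumed for illustration: the residue table from earlier in the thread.
const RESIDUES: [u64; 9] = [1, 7, 11, 13, 17, 19, 23, 29, 31];

/// Hypothetical helper: looks up the residue for bit `r` of a segment byte.
fn residue_at(r: u8) -> u64 {
    // usize::from(u8) is lossless on every platform, so no `as` is needed.
    RESIDUES[usize::from(r) + 1]
}

fn main() {
    for r in 0u8..8 {
        print!(" {}", residue_at(r));
    }
    println!();
}
```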

A second improvement that I’d like in Rust is to add Value Range Analysis, and allow into/from for normally unaccepted conversions that are statically known to be in-range:

fn main() {
    let x: u64 = 1_000;
    let y = u32::from(x);
}
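
As written, `u32::from(x)` with a u64 argument does not compile, since the conversion can lose data. In the absence of value range analysis, the closest standard-library option in current Rust is a runtime-checked conversion with `TryFrom`. A sketch, with `narrow` as a hypothetical helper name:

```rust
use std::convert::TryFrom;

/// Hypothetical helper: narrows a u64 to u32, or returns None if it does not fit.
fn narrow(x: u64) -> Option<u32> {
    u32::try_from(x).ok()
}

fn main() {
    let x: u64 = 1_000;
    println!("{:?}", narrow(x));
    println!("{:?}", narrow(u64::MAX));
}
```

The check happens at run time instead of compile time, so it is not a substitute for the proposed analysis, but it avoids the silent truncation of `x as u32`.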

#16

I’d just like to chime in that unsigned vs signed can get really annoying after a while too. Here is some example code from my game, simplified down.

http://is.gd/Enp63v

fn go_left_or_right() -> isize {                                               
    // Decide based on player input                                            
    if true {                                                                  
        -1                                                                     
    } else {                                                                   
        1                                                                      
    }                                                                          
}                                                                              
                                                                               
fn main() {                                                                    
    let mut pixels = [false; 10];                                              
    let mut position : usize = 5;                                              
    for _ in 1..10 {                                                           
        pixels[position] = false;                                              
        position = ((position as isize + go_left_or_right() + pixels.len() as isize)
                    % pixels.len() as isize) as usize;                 
        pixels[position] = true;                                               
        // Draw pixels!                                                        
        println!("Position is: {}", position);                                 
                                                                               
    }                                                                          
}

This is just really irritating. Neither isize nor usize just works for my position variable, as I need to go into negatives only for that short segment where I’m adding the offset, but that just infects everything with pain.
(The + pixels.len() is because Rust has the same behaviour as C, where -1 % 5 = -1, not 4.)

It’s almost to the point where I’m just going to declare two copies of all my global constants, like screen::WIDTH as signed and unsigned versions, just to save some casts.
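
One way to shrink the wrap-around line, assuming a newer compiler than the one used in this thread, is `rem_euclid`, which always returns a non-negative remainder and so removes the need for the extra `+ pixels.len()`. A sketch with a hypothetical helper `wrap_step`:

```rust
/// Hypothetical helper: moves `position` by `delta` and wraps into 0..len.
/// Assumes `position` and `len` fit in an isize and `len` is non-zero.
fn wrap_step(position: usize, delta: isize, len: usize) -> usize {
    // rem_euclid yields a result in 0..len even when the sum is negative.
    (position as isize + delta).rem_euclid(len as isize) as usize
}

fn main() {
    let mut pixels = [false; 10];
    let mut position: usize = 0;
    for delta in [-1, 1, 1] {
        pixels[position] = false;
        position = wrap_step(position, delta, pixels.len());
        pixels[position] = true;
        println!("Position is: {}", position);
    }
}
```

The casts are still there, but they are confined to one small function instead of spreading through the game loop.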


#17

I would argue that this function should be dealing with enums, rather than 1/-1, though…



#18

How about:

fn go_left_or_right(position: usize) -> usize {
    // Decide based on player input
    if true {
        position-1
    } else {
        position+1
    }
}

modulo modulo, etc.
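
One caveat with `position-1` on a usize: at position 0 the subtraction overflows, which panics in debug builds. A sketch that combines this suggestion with the enum idea from the previous post and stays in unsigned arithmetic throughout; `Direction` and `move_wrapped` are hypothetical names:

```rust
#[derive(Clone, Copy)]
enum Direction {
    Left,
    Right,
}

/// Hypothetical helper: steps one cell and wraps into 0..len without casts.
/// Assumes `position < len` and `len` is non-zero.
fn move_wrapped(position: usize, dir: Direction, len: usize) -> usize {
    match dir {
        // Adding `len - 1` is the same as subtracting 1 modulo `len`,
        // and it never goes below zero.
        Direction::Left => (position + len - 1) % len,
        Direction::Right => (position + 1) % len,
    }
}

fn main() {
    let len = 10;
    println!("{}", move_wrapped(0, Direction::Left, len));
    println!("{}", move_wrapped(9, Direction::Right, len));
}
```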


#19

I am actually just converting from the (C-like) enum to the +/- 1 in the go_left_or_right().

Let’s take a look at how that would look:

https://play.rust-lang.org/?gist=0f0445875754f2d89ffe&version=stable

I don’t feel like the enum is super helpful, as these are just returning -1, 0, 1 based on whether the left arrow key and the right arrow key are held (respectively up and down, and the pair of shoulder buttons on the GBA). So it’s not semantically going to be used for only one purpose. For example, I adjusted the palette based on the shoulder buttons to adjust an index, just like this example.


#20

struct Foo {
    seg: Vec<u8>,
    residues: [u64; 9],
    rmod: u64
}

impl Foo {
    fn printprms(&self, Kn: usize, Ki: u64) {
        for (segk, k) in self.seg.iter().take(Kn).zip(0u64..) {
            for (r, residue) in self.residues.iter().skip(1).enumerate() {
                if segk & (1<<r) == 0 {
                    print!(" {}", (Ki + k)*self.rmod + residue);
                }
            }
        }
        println!(""); 
    }
}