Whether to use type isize or i32, fsize or f64, usize or u32

This is a request for clarification. As I've been working through the Rust book and other learning resources, sometimes I see, for instance, isize used and sometimes I see i32, or something else more specific. I like the idea of using isize so I've been using it quite a bit, but sometimes it throws an error when I don't expect it. In my mind I'm wondering why we even bother with i32 or f64 or u32 when the more generic isize, fsize, and usize are available. Can someone clarify how all this fits together? Thanks.

1 Like

i32, f64, etc. have a fixed size, but isize and usize can be different sizes depending on the target machine and OS

Because a program which doesn't ever talk to the outside world is not very interesting, ultimately.

fsize doesn't exist. You use isize or usize when you don't deal with externally defined APIs or file formats.

Most of the time f64 is good, but sometimes you need to do a lot of calculations and then f64 is just slow. The same goes for integer sizes.

But 90% of the time (maybe even 99%) you use i8/u8/etc. simply because some externally defined APIs use these types.

1 Like

Ok, so let's see if I've got this straight. isize and usize are fine unless my program is talking to the outside world, so my choice of, say, i32 would be because the data I'm pulling in is of i32 type. Is that what you're saying?

90% of the time. You may also need it to make your program a bit faster, but that's much less often.

Most of the time something externally-imposed determines your choice.

E.g. Unicode defines char as something between 0 and 1114111 (0x10FFFF).

That's too much to fit into i16 or u16, thus char uses 4 bytes (which means you can convert it to u32 or, usually, into usize, but not into i16 or u16).
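
A minimal sketch of that conversion (not from the original post):

```rust
fn main() {
    let c = '🦀';

    // Every char fits in a u32 (but not in a u16), so this conversion is lossless.
    let code = u32::from(c);
    println!("U+{:X}", code); // U+1F980

    // The reverse direction is fallible: not every u32 is a valid char.
    assert_eq!(char::from_u32(code), Some('🦀'));
    assert_eq!(char::from_u32(0xD800), None); // the surrogate range is rejected
}
```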

It depends on what you're doing and the hardware you're using.

  • You have to use usize to index arrays (see the sketch after this list).
  • If you know overflow cannot occur, 32-bit integers are usually preferable to 64-bit integers. They generate slightly smaller code on x86 and take up half as much memory.
  • It's easier to reason about signed arithmetic. Unsigned arithmetic has some weird pitfalls.
  • Unsigned types are usually better when you need to work with the underlying bits (e.g. if you're working with a bitstream, [u64] is probably your best bet).
  • Floating point types (f64 and f32) are for representing real numbers, typically not integers, but you can exactly represent integers up to 53 bits in an f64. If you are writing algorithms with floating point numbers you have to make a choice about speed vs accuracy.
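
A minimal sketch of the indexing and bit-work points above (illustrative only, not from the original post):

```rust
fn main() {
    let data = vec![10, 20, 30, 40];

    // Slice/Vec indexing takes usize; other integer types must be converted first.
    let i: i32 = 2;
    let value = data[i as usize];
    println!("value at {i}: {value}");

    // Unsigned types are the natural fit for bit-level work.
    let word: u64 = 0b1011_0001;
    println!("{} bits set", word.count_ones());
}
```
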
4 Likes

So, am I right that using isize and usize as much as possible will make my code more portable to other OSes? (Using Linux right now.) I figure that the less specific my code is (as in i32 being more specific than isize), the easier it will be to port to, say, Windows. Is that a good way to think about it?

Well, if you have a huge amount of data then you might want to use the smaller sizes to save on memory and/or disk storage space, assuming your data's number ranges fit the type. So i8, u8, i16, u16 can be your friends.

Even if memory space is not an issue, I suspect more cache-friendly code can be produced if more data fits in the caches. So smaller types might be appropriate for performance-critical code.
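
As a rough illustration of the memory point (hypothetical struct names, just a sketch):

```rust
use std::mem::size_of;

// Hypothetical record types: same fields, different integer widths.
#[allow(dead_code)]
struct Wide {
    id: i64,
    reading: i64,
}

#[allow(dead_code)]
struct Narrow {
    id: u16,
    reading: i16,
}

fn main() {
    // 16 bytes vs 4 bytes per record; across millions of records
    // (or across cache lines) the difference adds up.
    println!("Wide:   {} bytes", size_of::<Wide>());
    println!("Narrow: {} bytes", size_of::<Narrow>());
}
```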

Given that isize and usize can change their width depending on platform, it seems to me that for maximum predictability when changing platforms one should use specific sizes rather than usize.

My tendency is to use the largest size possible unless there is a specific reason to use something smaller, as above. I'm keen on using unsigned types for things I really don't want to be negative.

I would say it's a good way to think about it for a novice.

The reasons to use i8 or u16 do exist (and you may read about them in answers here) but 90% of the time isize/usize are “good enough”.

So go with isize/usize for now and return to that question after a year or two.

All these things about caches, memory consumption and so on may be important; even such exotic things as f16 (present in modern CPUs but not yet in Rust) can be useful.

But… later. Learn to write robust, stable code first… then you may start to think about squeezing efficiency from it.

Thanks, guys. That has really helped me get a better idea of how to use number types in Rust. Appreciate your responses.

I have to say no, this is not right.

The Rust Reference states:

usize and isize have a size big enough to contain every address on the target platform. For example, on a 32 bit target, this is 4 bytes and on a 64 bit target, this is 8 bytes.

Potentially it could be 16 bits, or not even a multiple of 8 bits. Rust compiles for the Arduino, which has a very small memory space; I have no idea what an isize/usize is there.
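
One way to check on any given target (a minimal sketch):

```rust
use std::mem::size_of;

fn main() {
    // Typically prints 8 on 64-bit targets, 4 on 32-bit targets,
    // and 2 on 16-bit targets such as AVR.
    println!("usize: {} bytes", size_of::<usize>());
    println!("isize: {} bytes", size_of::<isize>());
}
```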

For me then, to avoid surprises I would not use isize/usize. Heck, you cannot even count the human population using usize on some platforms/operating systems.

1 Like

The cost of usize is that you don't know where the overflow behaviour is. So you should only use it for things where its domain is exactly right: things representing memory. Notably, that means slice indexes and fenceposts.

If you need values that aren't natural numbers, then you probably want floating-point numbers, and pick between f32 and f64 depending on how much relative error you can tolerate.

Otherwise, use the unsigned number that's the best fit for what you're trying to express, considering the set of values you need to express. u32 is a reasonable default -- though if you're storing tens of thousands of something then it's worth considering a smaller type, as wider local variables are often free.

(Personally, I think that signed integers are mostly useless in computing -- that unsigned integers or floating-point numbers are superior in the majority of cases.)

6 Likes

How so? Many things I deal with every day can be negative and mostly don't need to be floats.

I'm old school, still thinking of floats as a luxury; processors often don't have float hardware.

One old boss of mine said to a new member of our team who was using floats in his code, "If you think you need to use floating point arithmetic to solve the problem then you don't understand the problem". Mind you, our target had no float support and our language had good fixed-point arithmetic support.

Sure, fixed-point is fine too. Though those are translation-invariant instead of scale-invariant, and it's not at all obvious to me that there are more translation-invariant problems than scale-invariant ones. Certainly most things we deal with IRL care more about relative error than absolute errors. There are certainly some things where fixed-point is better, though, since scaling doesn't really exist.

1 Like

There is a corollary to my old boss's statement about floats:

If you really do need floating point to solve the problem then you will have problems you won't understand.

Basically: What Every Computer Scientist Should Know About Floating-Point Arithmetic

I'd like to second scottmcm and zicog here: if it's an arbitrary choice and all other things are equal, you should use a fixed size integer type, not isize/usize.

All Rust code is "portable" in the way C code is "portable" — it's capable of being ported. However, isize/usize aren't target independent, so the behavior of code is less consistent across targets/platforms if it uses them instead of the fixed size types, which behave the exact same on all targets.

If you don't constrain the type of an integer literal, it's an i32. (For floating point, f64.) There's logical reasoning behind this choice: all else being equivalent, these are fairly reasonable defaults to choose.
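
A tiny sketch of those defaults (the helper functions are hypothetical, only there to pin the types):

```rust
fn takes_i32(_: i32) {}
fn takes_f64(_: f64) {}

fn main() {
    // With no other constraint, an integer literal defaults to i32
    // and a floating-point literal defaults to f64.
    let n = 1;
    let x = 1.0;
    takes_i32(n);
    takes_f64(x);
}
```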

The first guideline is to use the same type the APIs you're feeding into use. Mostly, that means that if you're doing indexing, use usize, but you might also be using a library that deals in e.g. u16, and if you're interacting with that API, you should probably store and handle those as u16 on your side as well.
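
A hedged sketch of that guideline, using a made-up API that deals in u16 (e.g. a port number):

```rust
// Hypothetical library function that expects a u16.
fn connect(port: u16) {
    println!("connecting on port {port}");
}

fn main() {
    // Store the value as u16 on your side too, converting at the boundary
    // where the wider value comes in. (TryFrom is in the 2021-edition prelude.)
    let configured: u32 = 8080;
    let port = u16::try_from(configured).expect("port out of range");
    connect(port);
}
```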

If that's not the case and you're using floating point, just default to f64. You still have to deal with floating point inaccuracies, but they're less than with f32. If you want f32 instead, you probably already know that or are using a library using f32 (e.g. computer graphics).
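
A tiny illustration of that accuracy difference (just a sketch):

```rust
fn main() {
    // Printing 0.1 with extra digits shows roughly where each type's precision
    // runs out: about 7 significant decimal digits for f32, about 15-16 for f64.
    let x32: f32 = 0.1;
    let x64: f64 = 0.1;
    println!("f32: {:.20}", x32);
    println!("f64: {:.20}", x64);
}
```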

If you want integers, 32 bit is a fine default. 2 billion is still a lot, and the important thing is that you'll get the same behavior on whatever targets you run on, rather than, say, using isize and seeing overflow only on one older system that happens to be 32-bit while none of the newer or development machines ever see it occurring.

If overflow happens in debug mode, you'll get a panic (crash) pointing out exactly where the overflow occurred, so you can determine if the overflow is an error (fix the error or handle the overflow) or just an expected case (in which case you can refactor the type to use 64-bit integers). In release mode, you'll get silent wrapping overflow by default (but you can change back to panicking overflow by configuration).
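
If you'd rather make the overflow handling explicit instead of relying on the debug/release default, the standard library has dedicated methods (a minimal sketch):

```rust
fn main() {
    let big: i32 = i32::MAX;

    // checked_add returns None instead of panicking or wrapping.
    assert_eq!(big.checked_add(1), None);

    // wrapping_add opts into wrapping behaviour explicitly.
    assert_eq!(big.wrapping_add(1), i32::MIN);

    // saturating_add clamps at the type's bounds.
    assert_eq!(big.saturating_add(1), i32::MAX);
}
```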

Math on i32 is essentially[1] never going to be slower than math on isize and might instead be faster, since the CPU doesn't have to process as much data. This is most apparent to the optimizer rather than the physical CPU, notably when the optimizer is able to vectorize your code to use SIMD and process multiple data chunks in a single iteration. The smaller your data, the more single-core parallelism you can get (e.g. i32x8 versus f64x4).


I agree that generally you'll want u32 instead of i32, because most integer measures are counts or offsets or such with a contextually known sign, rather than one value which could logically be either positive or negative. (This doesn't change that i32 is a better default if no context is given other than "it's an integer.") The obvious large category which excepts that is physical measure, but that generally is a rational number rather than an integer, and in today's compute landscape that means f64 except in specialty cases where fixed point (integer of partial unit) is preferred instead.

Currency is a fun example: you'd perhaps first assume that should be i64 cents, but in commerce the sign is known from context (prices are positive, discounts are negative, the final order total should probably only ever be positive), and a bank account being allowed to overdraft into debt is a special case. Using signed numbers for "real world" things is always convenient and perhaps even the correct default, but you should always consider what a negative value means; if a negative value has unclear or no meaning, prevent it with an unsigned value.
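
As a small illustration of the currency point (hypothetical field names, unsigned cents with the sign implied by context):

```rust
// Prices and discounts each have a contextually known sign,
// so unsigned cents are enough here.
struct LineItem {
    price_cents: u64,
    discount_cents: u64,
}

fn order_total(items: &[LineItem]) -> u64 {
    items
        .iter()
        .map(|item| item.price_cents.saturating_sub(item.discount_cents))
        .sum()
}

fn main() {
    let items = [
        LineItem { price_cents: 1999, discount_cents: 200 },
        LineItem { price_cents: 499, discount_cents: 0 },
    ];
    println!("total: {} cents", order_total(&items));
}
```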

If measures vary wildly in scale, then scale-invariance is indeed typically more important. Where scale is relatively constant, though, scale-invariance obviously doesn't matter much, and translation-invariance is extremely beneficial. I love Contraption Maker as an example here. More things with semifixed known scale should imho be using translation-invariant rather than scale-invariant real numbers, but the hardware support for IEEE floats is hard to pass up, and scale-invariance absolutely ends up offering progressive degradation as opposed to fixed-scale needing to handle overflow and such. Scale independence is essentially built-in significant figures, and for the purpose of measurement rather than creation/simulation, that's hugely beneficial. (I'm just steeped in the latter.)


  1. Some exceptions: 16-bit targets with a 16-bit isize, and perhaps some extremely minor (maximum order of one cycle per branch, closer to one cycle per function call) hits on some 64-bit targets where the ABI passes 32-bit integers zero-extended in 64-bit registers and does not provide any 32-bit operations, thus requiring masking 32-bit integers before function calls. ↩︎

7 Likes

Or you'll learn how to use the tool so you do understand the problems. Just like borrowing in Rust -- one often has problems that initially they don't understand, but as they learn more they know how to handle the complex cases and avoid the troublesome ones.

For example,

To start, let me answer your question directly. For JPL's highest accuracy calculations, which are for interplanetary navigation, we use 3.141592653589793. Let's look at this a little more closely to understand why we don't use more decimal places. I think we can even see that there are no physically realistic calculations scientists ever perform for which it is necessary to include nearly as many [360] decimal points as you asked about.

~ How Many Decimals of Pi Do We Really Need? - Edu News | NASA/JPL Edu

Why would they state that number specifically, as opposed to more or fewer digits? I've got a pretty good guess:

[src/main.rs:2] std::f64::consts::PI = 3.141592653589793
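
That output line looks like what dbg! prints, e.g. from something like this (a guess, not the poster's actual file):

```rust
fn main() {
    dbg!(std::f64::consts::PI);
}
```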

1 Like

I agree. That takes time and experience. As I said, my boss's pronouncement was made to a very junior, "greenhorn" member of the team. Kind of makes more sense in that context.

Disagree. You see, if a neophyte does not understand how to keep the borrow checker happy, he cannot produce code that even compiles. Rust saves him from himself. Contrast that with the weird things that can happen with naive use of floats, without any compiler errors or warnings.

I would agree, if you had said "Just like borrowing in languages like C and C++". Where years of experience with the debugger teaches one about lifetimes.

I don't buy this. Plenty of sensors I deal with produce negative outputs. Using signed integers saves having to keep track of the sign by other means. Also, how can I test a value for being between a lower and a higher threshold without signed arithmetic?

But, but, I have never used a sensor that did not present an integer output. It may well be in some odd scale like 16ths of a degree C. Sticking such things in floats is already claiming more accuracy than is possible.

I think the reality is that most programmers, most of the time, including me, are lazy and just use floats because it saves us worrying about overflows and fractions. We only worry about it when we get surprised by odd results.

2 billion, or 4 billion is a ridiculously small number :slight_smile:

I'm unsure what you're proposing instead, though. You mentioned fixed-point earlier, but it has many of the same problems.

A 16.16 fixed-point has 1 / 3 * 3 = 0.99998 != 1, for example. It happens to get lucky with the classic float example of 0.1 + 0.2 = 0.3, but that's not because it solved the problem: it has 0.1 + 0.3 = 0.40001 != 0.4.
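
A minimal 16.16 fixed-point sketch reproducing those numbers (assuming round-to-nearest conversion from f64; not from the original post):

```rust
const FRAC_BITS: u32 = 16;
const ONE: i64 = 1 << FRAC_BITS; // 1.0 in 16.16 fixed point

fn from_f64(x: f64) -> i64 {
    (x * ONE as f64).round() as i64
}

fn to_f64(x: i64) -> f64 {
    x as f64 / ONE as f64
}

fn mul(a: i64, b: i64) -> i64 {
    (a * b) >> FRAC_BITS
}

fn div(a: i64, b: i64) -> i64 {
    (a << FRAC_BITS) / b
}

fn main() {
    // 1 / 3 * 3 loses the last fractional bit: ~0.99998, not 1.
    println!("{}", to_f64(mul(div(ONE, 3 * ONE), 3 * ONE)));

    // 0.1 + 0.2 happens to land on the same representation as 0.3...
    println!("{}", from_f64(0.1) + from_f64(0.2) == from_f64(0.3)); // true

    // ...but 0.1 + 0.3 is ~0.40001, not the representation of 0.4.
    println!("{}", to_f64(from_f64(0.1) + from_f64(0.3)));
    println!("{}", from_f64(0.1) + from_f64(0.3) == from_f64(0.4)); // false
}
```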

1 Like