Using scientific notation in Rust

Today I discovered that Rust won't let me use scientific notation on integers. I started with the line let mut num1 = 1*10^9 + 1; expecting the compiler to assign 1,000,000,000,001 to the num1 variable. Instead, it assigned the value 1 and ignored the rest of the line. So, I googled it and found that you have to use "e-notation" when doing exponents. Okay, fine, I changed my line to let mut num1 = 1*10e9 + 1; and that returned lots of error messages. After I did some more checking, I found out that in order to use scientific notation you have to use floating values. The line that finally worked is let mut num1: f64 = 10e9 + 1.0;

My question is twofold. First, I'm sure there is some good rationale as to why Rust doesn't recognize the caret ^ notation for working with exponents, but it doesn't make a lot of sense to me. Could someone explain why that is so? Thanks.

Second, I also find it odd that Rust won't let me use scientific notation on integers. I'd appreciate an explanation for that, too.

Thanks for your help! :>)

The caret in Rust is the bitwise XOR operator, as it is in most C-family languages. Arithmetic operators have a higher precedence than bitwise ones, so that statement will calculate 10 XOR 10, which is equal to zero.
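
To make the precedence point concrete, here is a minimal sketch. The constant names are made up for illustration; the first constant is the expression from the question, the second is one way to write what was probably intended.

```rust
// `*` and `+` bind tighter than `^`, so Rust groups this as
// (1 * 10) ^ (9 + 1), i.e. 10 XOR 10.
const XOR_RESULT: i32 = 1 * 10 ^ 9 + 1;

// Ten to the ninth power, plus one, using the integer `pow` method.
const POW_RESULT: i32 = 1 * 10_i32.pow(9) + 1;
```

Printing the two constants should give 0 and 1000000001 respectively.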


Also, the number you have written is 10¹² + 1, not 10⁹ + 1. In e-notation, that's 1e12_f64 + 1.0. You wrote 10e9, which is 10 × 10⁹ = 10¹⁰.
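
A small sketch of how the e-notation literals mentioned here evaluate. The constant names are hypothetical; the literals are the ones from this thread.

```rust
// The digits before the `e` are multiplied by the power of ten,
// so 10e9 means 10 × 10^9, the same value as 1e10.
const TEN_E_NINE: f64 = 10e9;
const ONE_E_TEN: f64 = 1e10;

// One trillion plus one. Both operands must be floats, hence `1.0`.
// The result is small enough that f64 still represents it exactly.
const TRILLION_PLUS_ONE: f64 = 1e12 + 1.0;
```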

That's actually a surprisingly deep question.

I guess since you have googled extensively, you already know that more than half of the top-20 languages don't recognize the caret for exponentiation, and the same goes for using scientific notation with integers.

Yet that doesn't convince you.

I guess the question, then, becomes: why has Rust picked the approach used by computers and engineering calculators since before I was born, and not something that would be familiar to workers in some other industry?

My guess would be that Rust is not trying to create a revolution and doesn't try to do something strange and unusual. It picks syntax familiar from the field of programming languages, not from the field of math or chemistry.

Does it make any sense to you?

Rust has many innovations for the IT field (actually, they are mostly things that are 20+ years old), but as far as syntax is concerned it tries to follow the most popular curly-bracket languages: C, C++, Java, JavaScript, PHP…

JavaScript, of course, allows you to use e-notation with array and string indexes, but that's not because it allows e-notation on integers; it's because it doesn't have integers at all!

That choice would probably have been a bad fit for Rust.

Does this make any sense to you?

Because scientific notation is all about relative error, but integers are about absolute error. Generally, if you want scientific notation, I'd say you frequently want floating-point.

If you want 1 GiB, you can use "binary scientific notation": 1 << 30. If you want 1 MB, remember you can use separators: 1_000_000.
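
For reference, both spellings as a minimal sketch (the constant names are just for illustration):

```rust
// "Binary scientific notation": shifting 1 left by 30 bits gives 2^30.
const ONE_GIB: u64 = 1 << 30;

// Underscores are ignored by the compiler and only group the digits.
const ONE_MB: u64 = 1_000_000;
```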

I wonder how many non-esoteric programming languages use the caret for power. I know about Visual Basic (and some other versions of BASIC, I guess). Anything else? Not even Fortran supports it; it uses ** for power. I guess you can count TeX, too (although it's not a programming language, it's often used for documents, so people in the IT industry know it). Anything else?

I think Mathematica uses the caret.

Oh, right. MATLAB, Mathematica and R (although R supports both ^ and **).

Still, I guess doing what most popular languages are doing and not what these math-oriented ones are doing was a good choice for Rust.

{integer}::pow exists.
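
A short sketch of what that looks like in practice. The names below are hypothetical; `pow` and `checked_pow` are the standard integer methods. The plain `pow` overflows for large exponents, so the checked variant may be the safer choice when the exponent is not a known constant.

```rust
// 10^9 + 1 as an integer, with no floats involved.
const BILLION_PLUS_ONE: u64 = 10_u64.pow(9) + 1;

// Returns None instead of overflowing when 10^exp does not fit in an i32.
fn ten_to_the(exp: u32) -> Option<i32> {
    10_i32.checked_pow(exp)
}
```

With these definitions, `ten_to_the(9)` is `Some(1000000000)`, while `ten_to_the(10)` is `None` because 10¹⁰ exceeds `i32::MAX`.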

Not that many apart from the math-oriented ones indeed: Awk, Dylan, Eiffel, Lua, Mathematica, Matlab, Pliant, Yorick, Maple, Haskell according to

http://rigaux.org/language-study/syntax-across-languages/Mthmt.html#MthmtPwr

Too bad this page is so old, but I doubt this changes much.

A bit verbose, but works well:

const TWO_MILLION: i32 = 2 * 10i32.pow(6);

fn main() {
    println!("{TWO_MILLION}");
}

(Playground)

Output:

2000000

I tried to make a macro out of it that doesn't require the type to be specified repeatedly for the base 10 (as in 10i32, which feels annoying), but didn't get it to work. What would such a macro look like that doesn't require the i32 hint?

Here is a macro: playground.

const TWO_MILLION: i32 = scientific!(2 * 10 ^ 6);

I tried to modify it a bit, so it also works with non-literals and has a shorter notation:

macro_rules! sci {
    ($mant:expr, $exp:expr $(,)?) => {
        {
            let base = 0 * $mant + 10;
            if false {
                base
            } else {
                $mant * base.pow($exp)
            }
        }
    };
}

const TWO_MILLION: i32 = sci!(2, 6);

fn main() {
    println!("{TWO_MILLION}");
    println!("{}", sci!(5i32, 3));
    let sixthousand: i32 = sci!(6, 3);
    println!("{sixthousand}");
}

(Playground)

Output:

2000000
5000
6000

Some questions:

  • Is using the "0 * $mant +" trick idiomatic, or can it be achieved in other ways?
  • Any way to make the line println!("{}", sci!(5i32, 3)) work without the type hint?

Make it a generic function rather than a macro and then the literal will default to i32.

Interesting idea, but trait methods cannot be const fns :(

trait Sci {
    fn sci(mant: Self, exp: u32) -> Self;
}

macro_rules! impl_sci {
    ($type:ty) => {
        impl Sci for $type {
            fn sci(mant: Self, exp: u32) -> Self {
                mant * Self::pow(10, exp)
            }
        }
    };
}

impl_sci!(i8);
impl_sci!(u8);
impl_sci!(i16);
impl_sci!(u16);
impl_sci!(i32);
impl_sci!(u32);
impl_sci!(i64);
impl_sci!(u64);
impl_sci!(i128);
impl_sci!(u128);
impl_sci!(isize);
impl_sci!(usize);

fn sci<T: Sci>(mant: T, exp: u32) -> T {
    T::sci(mant, exp)
}

//const TWO_MILLION: i32 = sci(2, 6);

fn main() {
    //println!("{TWO_MILLION}");
    println!("{}", sci(5, 3));
    let sixthousand: i32 = sci(6, 3);
    println!("{sixthousand}");
}

(Playground)

How do I make this work with: const TWO_MILLION: i32 = sci(2, 6);
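
One possible workaround, as a sketch only: give up the generic trait and write a plain const fn for a single concrete type. The name `sci_i32` is made up here, and this does not solve the generic case; it just shows that the const context itself is not the obstacle, since the integer `pow` method can be called in const contexts.

```rust
// Non-generic helper: mantissa times ten to the given power, i32 only.
const fn sci_i32(mant: i32, exp: u32) -> i32 {
    mant * 10_i32.pow(exp)
}

// Evaluated at compile time; an overflow here would be a compile error.
const TWO_MILLION: i32 = sci_i32(2, 6);
```

The same function can still be called at run time, for example `sci_i32(5, 3)`.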

I modified my macro to default to i32 if the context is unknown:

macro_rules! scientific {
    ($a:literal * $base:literal ^ $exp:literal) => {{
        let mut result = $a;
        let mut exp = $exp;
        while exp != 0 {
            result *= $base;
            exp -= 1;
        }
        result
    }};
}

const TWO_MILLION: i64 = scientific!(2 * 10 ^ 12);

fn main() {
    println!("{}", scientific!(2 * 10 ^ 6)); // i32
}

Thank you, everyone, for your answers. That really helped. Keeping Rust consistent with most of the major programming languages by limiting the caret symbol to bitwise operations was likely a good call. @khimru, you explained that very well.

I see what you are saying about "relative error" and "absolute error", but I'm not convinced that limiting scientific notation to floats only was a good call. Why not both? Of course, the decision was made by wiser heads than mine and I can work around it by adding a decimal point to my integers, but it seems clumsy and unnecessary. Scientific notation is also simply good shorthand for very large numbers. When used that way it's not about error, just utility.

I agree it would be nice. Something like this could even probably be added in a backwards compatible way, for instance: 1.99e9_u32.

I don't think the "relative error" counter-argument is convincing. In science this might be sometimes taken to mean something like (1.99 ± 0.01) × 10⁹, but even for f64 it's not what it means.

1.123e2_u32 would be disallowed, but 1.99e9_u32 would work.
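
To illustrate the rule, here is a runtime sketch of the check such a literal would need. This is not an existing Rust feature or library function: `parse_sci_int` is a hypothetical helper, it works on strings rather than literals, it ignores type suffixes, and it targets u64 for simplicity. The idea is that the value is an integer exactly when the exponent is large enough to absorb all fractional digits.

```rust
/// Hypothetical helper: parses text such as "1.99e9" into a u64.
/// Returns None if the value is not a whole number or does not fit.
fn parse_sci_int(text: &str) -> Option<u64> {
    let (mantissa, exponent) = text.split_once('e')?;
    let exponent: u32 = exponent.parse().ok()?;

    // Split the mantissa at the decimal point, if there is one.
    let (whole, frac) = mantissa.split_once('.').unwrap_or((mantissa, ""));
    // Trailing zeros in the fraction do not change the value.
    let frac = frac.trim_end_matches('0');

    // More fractional digits than the exponent can shift away
    // means the result would not be an integer.
    if frac.len() > exponent as usize {
        return None;
    }

    // Treat "1.99" as the digits 199 and reduce the exponent accordingly.
    let digits: u64 = format!("{whole}{frac}").parse().ok()?;
    let scale = 10_u64.checked_pow(exponent - frac.len() as u32)?;
    digits.checked_mul(scale)
}
```

Under these assumptions, "1.99e9" is accepted and "1.123e2" is rejected, matching the two examples above.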

Hmmm..... Come to think of it though, using scientific notation for integers is going to be very limited. As soon as you get past the single digits you have to use a decimal point, which forces one into using a float. For instance, one could use 9*10e9 to represent 9 billion, but if you want to do 99 billion you need to use a decimal point: 9.9*10e9. So, I guess not allowing integers to use scientific notation does make sense. Have I answered my own question?

You could write 99 billion either as 99e9_u64 or 9.9e10_u64. I don't see why you would want to limit the notation to single digits or disallow the decimal point.