Get fractionals out of f64 as integer

I'm looking for an algorithm that extracts the fractional part of a given f64/f32 as an integer.

It should work something like this:

get_frac(2.32_f64); // 32
get_frac(0.023_f64); // 23
get_frac(0.00402_f64); // 402

The easiest way I found is to call to_string, find the '.', and parse the slice after it, but it feels off to have to parse/unwrap when the input is already a float.
I looked into what Display does: it uses grisu3, which seems to be a pretty nice algorithm, but it is also quite sophisticated and aimed at building a string out of a float, while I only need an integer, which I'd hope is simpler.

I imagine there would be something like f64::get_parts() that returns the integer and fraction parts as integers, but I can't seem to find it, and all algorithmic approaches with multiplications/divisions in a loop run into rounding error, where 0.023_f64 ends up being something like 0.022999999997 and then 22999999997 instead of 23.
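To make the rounding issue concrete, here is a minimal sketch that prints the value an f64 actually stores next to the short form that Display chooses. The helper name stored_value is made up for illustration; it only uses format! with an explicit precision.

```rust
/// Hypothetical helper: show 20 decimal places of the value actually stored in the f64.
fn stored_value(x: f64) -> String {
    format!("{:.20}", x)
}

fn main() {
    for x in [0.023_f64, 2.32, 0.00402].iter().copied() {
        // `{}` prints the shortest decimal that round-trips back to the same f64;
        // the stored binary value is generally not exactly that decimal.
        println!("{} is stored as {}", x, stored_value(x));
    }
}
```

Any loop that multiplies by ten works on the stored value, not on the short decimal, which is why the digits drift.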

Any ideas or leads on an algorithm that may work for this use case?
(the use case is https://unicode.org/reports/tr35/tr35-numbers.html#Plural_Operand_Meanings )

Where did you get the float from?

From my library's users, via a public API. So all I can control is that the input is f64/f32.

Ultimately what you want is somewhat ambiguous due to the nature of floats, but how about repeatedly multiplying by ten until the difference between the rounded value and the actual value is less than, say, 0.00001?



I attempted that and it breaks when the rounding gives 0.0009999997.

Give this a try:

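fn get_frac(mut f: f64) -> u64 {
    // Tolerance for treating f as "close enough" to an integer.
    let eps = 1e-4;
    // Shift one decimal digit at a time until f is nearly an integer.
    while (f.round() - f).abs() > eps {
        f *= 10.0;
    }
    f.round() as u64
}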


Whoa! That seems close! I had to add a .fract() because otherwise it returned 232 for 2.32, but then it worked until I tried a longer fractional part:
https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=31c9ff248502c755b30eca1b6a0cad4d

The last assert returns 0 for me. Not sure if it's reproducible in the playground.

Ah, I didn't realize you don't want the integer part.

As for the last one, I really recommend thinking about whether you can use some other approach. That said, this might work:

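fn get_frac(f: f64) -> u64 {
    let eps = 1e-4;
    // Keep only the fractional part; the sign is irrelevant here.
    let mut f = f.abs().fract();
    if f == 0.0 {
        return 0;
    }

    // If f already sits within eps of an integer (e.g. 0.00001), scale it up
    // first so the loop below does not stop immediately.
    while (f.round() - f).abs() <= eps {
        f *= 10.0;
    }

    // Then shift digits until f is within eps of an integer again.
    while (f.round() - f).abs() > eps {
        f *= 10.0;
    }

    f.round() as u64
}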


But be aware that all of this is rather heuristic, and there will always be edge cases where it behaves oddly.
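One such edge case, as a sketch: with eps = 1e-4, a fraction whose digits contain a long enough run of zeros falls under the tolerance and gets cut short. The function is the same get_frac as above; the inputs are made-up examples.

```rust
fn get_frac(f: f64) -> u64 {
    let eps = 1e-4;
    let mut f = f.abs().fract();
    if f == 0.0 {
        return 0;
    }
    while (f.round() - f).abs() <= eps {
        f *= 10.0;
    }
    while (f.round() - f).abs() > eps {
        f *= 10.0;
    }
    f.round() as u64
}

fn main() {
    // Works as intended for short fractions.
    println!("{}", get_frac(2.32));
    println!("{}", get_frac(0.023));
    // After one multiplication f is about 1.00001, which is within eps of 1,
    // so the loop stops early and the trailing "00001" is lost.
    println!("{}", get_frac(0.100001));
}
```

Shrinking eps moves the cutoff but also makes the loop more sensitive to the float's own rounding error, so the trade-off never fully goes away.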


Parsing the string output is probably not too bad here. Float printing functions are complicated and do quite a bit of magic to pretend that 0.1 exists.

It doesn't have to allocate. Instead of to_string you could write! to some buffer (e.g. arrayvec)


@kornel - I'd like to avoid adding dependencies. I really wish heap vec was in std :)

@alice - thank you so much! I was able to adapt it and it seems to work quite well.
Here's the PR - https://github.com/zbraniecki/pluralrules/pull/31/files
Would you have a moment to verify that it looks sane?

The performance on the benchmark is:

without: parse_float_operands time: [1.1371 us 1.1428 us 1.1501 us]
with: parse_float_operands time: [101.18 ns 101.38 ns 101.62 ns]

So, this PR makes the operation infallible, and gives a nice 10x perf win!

Thank you!

For the record, you can write to mutable slices, which can be on the stack too.

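use std::io::Write;

fn main() {
    // Fixed-size buffer on the stack; no heap allocation.
    let mut array = [0u8; 10];
    let mut slice = &mut array[..];

    // Writing advances `slice` past the bytes written; it fails if the output does not fit.
    write!(slice, "{}", 5.01).unwrap();
    let remaining_len = slice.len();
    let written = array.len() - remaining_len;

    let digits = &array[0..written];
    println!("{}", std::str::from_utf8(digits).unwrap());
}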

This seems like a pretty bad trade to me. You've replaced a correct algorithm in the standard library with a buggy one that you now have to maintain.

I note that the table there shows 1.0 and 1.00 as being different (screenshot below). If that's the case, it feels like f64 is fundamentally the wrong datatype for the input. Why not accept a &str and parse it?
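To illustrate the point about 1.0 versus 1.00: once both are parsed as f64 they are the same value, so the number of visible fraction digits can only be recovered from the original text. A rough sketch, with a made-up helper name, that handles only plain decimal strings and does not validate the integer part:

```rust
/// Hypothetical helper: returns (v, f) for a plain decimal string, where
/// v = number of visible fraction digits and f = those digits as an integer
/// (both with trailing zeros kept, following the TR35 operand meanings).
fn visible_fraction(s: &str) -> Option<(usize, u64)> {
    let s = s.strip_prefix('-').unwrap_or(s);
    let frac = match s.split_once('.') {
        Some((_, frac)) => frac,
        None => return Some((0, 0)), // no decimal point: no fraction digits
    };
    if frac.is_empty() || !frac.bytes().all(|b| b.is_ascii_digit()) {
        return None;
    }
    // Returns None if the digits overflow u64.
    let value = frac.parse::<u64>().ok()?;
    Some((frac.len(), value))
}

fn main() {
    // As floats the two inputs are indistinguishable...
    assert_eq!("1.0".parse::<f64>(), "1.00".parse::<f64>());
    // ...but the strings carry different numbers of visible fraction digits.
    println!("{:?}", visible_fraction("1.0"));
    println!("{:?}", visible_fraction("1.00"));
}
```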


I do accept &str as well, using the FromStr trait. I also accept integers and floats, in which case the precision is derived from the fraction portion of the number and from options like minimum_fraction_digits. (See ECMA402 https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/NumberFormat for what inspired this code).

Thank you for all the feedback. I realize that the issue is hairy and the solution from @alice is potentially unsafe.
I tried @kornel's approach with writing to an array. Here's the result:

        impl From<$ty> for PluralOperands {
            fn from(input: $ty) -> Self {
                let abs = input.abs();
                // Stack buffer for the Display output. Note: write! fails (and the
                // unwrap panics) if the formatted value needs more than 10 bytes.
                let mut array = [0u8; 10];
                let mut slice = &mut array[..];

                write!(slice, "{}", abs).unwrap();
                let remaining_len = slice.len();
                let written = array.len() - remaining_len;

                let digits = &array[0..written];

                // Everything after the '.' is the fraction; no '.' means no fraction digits.
                let (len, fraction) = if let Some(pos) = digits.iter().position(|&b| b == b'.') {
                    let s = std::str::from_utf8(&digits[pos + 1..]).unwrap();
                    (
                        digits.len() - pos - 1,
                        usize::from_str(s).unwrap()
                    )
                } else {
                    (0, 0)
                };

                PluralOperands {
                    n: abs as f64,
                    i: abs as usize,
                    v: len,
                    w: len,
                    f: fraction,
                    t: fraction,
                }
            }
        }

Here's the perf:

master: parse_float_operands time: [1.1371 us 1.1428 us 1.1501 us]
@alice's solution: parse_float_operands time: [101.18 ns 101.38 ns 101.62 ns]
@kornel's solution: parse_float_operands time: [660.16 ns 661.66 ns 663.28 ns]

I'm not sure I want to merge that PR just because of the perf, but I wish there was a performant solution that didn't require 3 unwraps :)
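For what it's worth, the unwraps can be avoided by returning an Option and reading the ASCII digits directly, without going through from_utf8 and from_str. This is only a sketch, not the code from the PR: the helper name fraction_parts and the 32-byte buffer size are assumptions, and it has not been benchmarked.

```rust
use std::io::Write;

/// Hypothetical helper: returns the number of fraction digits Display prints
/// for `x`, and those digits as an integer. Returns None instead of panicking
/// when the input is not finite or something does not fit.
fn fraction_parts(x: f64) -> Option<(usize, u64)> {
    if !x.is_finite() {
        return None;
    }
    // Assumed buffer size; longer output makes write! fail and we return None.
    let mut array = [0u8; 32];
    let mut slice = &mut array[..];
    write!(slice, "{}", x.abs()).ok()?;
    let remaining_len = slice.len();
    let written = array.len() - remaining_len;
    let digits = &array[..written];

    let pos = match digits.iter().position(|&b| b == b'.') {
        Some(pos) => pos,
        None => return Some((0, 0)), // e.g. 3.0 prints as "3"
    };
    let frac = &digits[pos + 1..];
    // Accumulate the ASCII digits directly; checked arithmetic guards against overflow.
    let mut value: u64 = 0;
    for &b in frac {
        if !b.is_ascii_digit() {
            return None;
        }
        value = value.checked_mul(10)?.checked_add(u64::from(b - b'0'))?;
    }
    Some((frac.len(), value))
}

fn main() {
    println!("{:?}", fraction_parts(2.32));
    println!("{:?}", fraction_parts(0.00402));
}
```

The caller then decides what to do with None (for example, fall back to treating the value as having no fraction digits) rather than the conversion panicking.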


To add on to this, I actually looked into this problem a while ago for a proc macro. My conclusion was that parsing the string manually was in fact the best option.