Are there any floating types with precision beyond that of `f64`?

I've been trying to find ways to increase precision of calculations beyond that of the f64 type.

Apart from several crates that provide arbitrary-precision types, there don't seem to be any that build on Rust's standard primitives. Am I missing something?

Rug seems to be able to do the job, but I can't install it on Windows without jumping through dozens of hoops: it depends on Linux-oriented libraries that can only be installed through a bash-like shell, and the crate then has to be compiled with cargo running inside that same shell emulator.

Are there any f128 or f256 types or any viable alternatives?

Arbitrary precision is not what I'm looking for: the type must have a fixed size.

Rust does not have any built-in floating-point types besides f32 and f64 at the moment.

I know - are there any external libraries that provide these types, though?

Not that I know of, but since you listed rug, which provides arbitrary precision floats, I'd like to point out rustc_apfloat as a pure-Rust alternative to that.

Searching crates.io for f128 turned up this crate, which is said to be in "maintenance mode" but may suit your purposes anyway:

https://crates.io/crates/f128

Do you need more mantissa or exponent bits? Out of curiosity, can you tell us a little more about the problem itself?

I'm not far enough into the trenches yet to answer the first question, unfortunately.

What I want is to be able to perform calculations with higher precision, as needed, in ML (machine learning) algorithms. Most of the libraries I've found online begin to struggle once they have to deal with a significant number of digits after the decimal point, hence the need for an f128 or f256 type.

It looks like it's simply a binding to another GCC math library. There's nothing in pure Rust like it, is there?

My bad - I thought rug provided fixed size primitives.

That's the only relevant search result for f128 and there are no results at all for f256, so you might be out of luck. What's motivating you to look for a Rust-only solution?

P.S. If you're looking for more precision (as opposed to range), then you want extra bits in the mantissa, not the exponent. GCC's libquadmath is documented here; I couldn't tell at first how it divides the bits, but I see it's 112 bits of mantissa (!).
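
For a rough sense of what those mantissa bits buy in decimal digits, here is a small sketch. The function names are made up for illustration, and it assumes the usual IEEE 754 convention that the 112 stored bits come with one implicit leading bit:

```rust
/// Approximate number of decimal digits carried by a binary significand
/// with `bits` bits of precision (implicit leading bit included).
fn decimal_digits(bits: u32) -> f64 {
    f64::from(bits) * std::f64::consts::LOG10_2
}

/// Precision of f64, read from the standard library constant (53 bits).
fn f64_digits() -> f64 {
    decimal_digits(f64::MANTISSA_DIGITS)
}

/// Precision of a quad-precision float, assuming 112 stored mantissa bits
/// plus one implicit bit.
fn quad_digits() -> f64 {
    decimal_digits(112 + 1)
}
```

Under those assumptions `f64_digits()` comes out just under 16 and `quad_digits()` at roughly 34, so quad precision a little more than doubles the number of decimal digits.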

I've found that dealing with FFIs can be quite a pain, and having to jump between the documentation of a specific crate and that of the original library only adds to it. Given that Rust is still maturing, perhaps it was a bit optimistic to assume there would be a plug-and-play solution in the form of an f128 or f256 type alongside the built-in primitives, but I thought I'd ask nevertheless.

You'd recommend checking out libquadmath then, do I understand you correctly?

If the bindings are well-written you shouldn't have to know anything about the original library or do any FFI. The f128 crate exposes a struct of the same name that implements all the usual arithmetic operations, so it shouldn't give you any trouble. That said, I can't speak to this crate's quality; caveat emptor.

I am not an ML expert, but I know some linear algebra and I've done work in adjacent fields, and I strongly suspect you're doing something wrong here.

f64 has just shy of 16 decimal digits of precision. That's precise enough to measure the radius of Earth in nanometers. There's not a machine learning appliance in the world that can give meaningful results to the 16th decimal place (if you even had inputs with that level of accuracy to give it).
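
If you want to check that claim yourself, a quick sketch along these lines should do. The helper names are hypothetical, and 6.371e6 m is an assumed round figure for Earth's mean radius:

```rust
/// Gap between `x` and the next representable f64 above it.
/// Assumes `x` is positive and finite (and below f64::MAX).
fn spacing_above(x: f64) -> f64 {
    // For positive floats, adding 1 to the bit pattern gives the next value up.
    let next = f64::from_bits(x.to_bits() + 1);
    next - x
}

/// Spacing of f64 values near Earth's mean radius, in nanometres.
fn earth_radius_spacing_nm() -> f64 {
    let radius_m = 6.371e6; // assumed approximate mean radius, in metres
    spacing_above(radius_m) * 1e9
}
```

At that magnitude neighbouring f64 values are 2^-30 m apart, which is a little under one nanometre.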

If you're losing precision to the low bits of f64, it strongly suggests that either the data is not well normalized, or the mathematical model is not well suited to the data. A classic error would be something like using GPS coordinates to describe the location of nearby things rather than bearing and distance. Other examples would be working in time when frequency is more appropriate, or in linear space instead of log space (or vice versa). Obviously, I have no idea what kind of numbers you're actually dealing with.
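
To make the log-space point concrete, here is a toy sketch with made-up numbers (not your data, and the function names are just for illustration). It multiplies many small probabilities directly, then does the same thing as a sum of logarithms:

```rust
/// Joint probability computed by multiplying directly in linear space.
fn product(probs: &[f64]) -> f64 {
    probs.iter().product()
}

/// The same quantity kept in log space:
/// ln(p1 * p2 * ... * pn) = ln(p1) + ln(p2) + ... + ln(pn).
fn log_product(probs: &[f64]) -> f64 {
    probs.iter().map(|p| p.ln()).sum()
}
```

With a thousand probabilities of 1e-5 each, `product` underflows to exactly 0.0, while `log_product` returns a perfectly ordinary finite number. A wider float would only push the underflow further out; changing the representation removes it.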

The trend in machine learning is actually toward less precise floats, with work being done in f32, f16 and occasionally even smaller and more exotic formats.

I wondered about that as well. I know very little about ML, but I have never seen it said that it requires high-precision floating point. Quite the opposite, in fact, as pointed out above.

One of the first things one learns about ML is that useful models require massive performance. Hence the use of hardware GPUs and NN accelerators. Using a software extended precision floating point library would kill performance stone dead. My instinct, and what little experience I have of ML so far, tells me it would be unusable.

Arm introduced some vector instructions for bfloat16 last year, aimed at high-performance NN applications.

bfloat16 is not f16: it has the same exponent range as f32 but drops 16 bits of precision from the mantissa. So it can represent the same range of numbers as f32, but with much less accuracy. Data-intensive applications like ML are commonly bottlenecked by memory bandwidth, so halving the size of the data can effectively double performance in many cases.
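
As a minimal sketch of that layout, the conversion below keeps only the top 16 bits of an f32. The function names are hypothetical, and plain truncation is a simplification on my part: real conversions typically round to nearest and treat NaN specially.

```rust
/// Convert an f32 to bfloat16 bits by keeping the top 16 bits:
/// 1 sign bit, 8 exponent bits and 7 mantissa bits.
/// This truncates instead of rounding and does not special-case NaN.
fn f32_to_bf16_bits(x: f32) -> u16 {
    (x.to_bits() >> 16) as u16
}

/// Widen bfloat16 bits back to f32 by padding the low 16 bits with zeros.
fn bf16_bits_to_f32(bits: u16) -> f32 {
    f32::from_bits(u32::from(bits) << 16)
}
```

Round-tripping `std::f32::consts::PI` through these two functions gives 3.140625, so only two or three decimal digits survive, while a value as large as `f32::MAX` still comes back finite because the exponent field is untouched.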

To add to that, "I don't want arbitrary-precision, only f128 because I'm running out of f64 bits" is a very strange thing to see. Usually, when the largest reasonably sized fixed-width representation is insufficient for solving a problem, reaching for a slightly larger but still fixed-size type only delays the issues a little, and does not solve them properly.

I've heard from a colleague that quad precision can be needed when doing high-precision numerical integration for orbital calculations. I haven't confirmed this myself, though, and the colleague in question was far from an expert in computational methods; I think they were just using standard fourth-order Runge-Kutta.

Sure enough, there's probably a whole range of problems for which X bits is insufficient precision but 2X bits are enough. However, the phenomenon I'm referring to is that these are the exceptions, rather than the rule. If you start "running out" of bits of precision, that probably indicates a conceptual or design problem, or the fact that arbitrary precision (e.g. growing with time as the algorithm progresses) may be necessary.

Agreed, finding a solid, numerically stable algorithm is an essential step that can easily be skipped.
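
One well-known example of such an algorithm is Kahan (compensated) summation. It isn't something anyone above proposed, just an illustration of how much a better algorithm can recover without a wider type. A sketch with made-up function names:

```rust
/// Plain left-to-right summation.
fn naive_sum(values: &[f64]) -> f64 {
    values.iter().sum()
}

/// Kahan summation: carries a running correction term that captures
/// the low-order bits lost in each addition.
fn kahan_sum(values: &[f64]) -> f64 {
    let mut sum = 0.0_f64;
    let mut compensation = 0.0_f64;
    for &x in values {
        let y = x - compensation; // apply the correction from the last step
        let t = sum + y; // low-order bits of y may be lost here
        compensation = (t - sum) - y; // recover what was lost (negated)
        sum = t;
    }
    sum
}
```

If you sum 1.0 followed by a million copies of 1e-16, `naive_sum` returns exactly 1.0 because every small term is rounded away, whereas `kahan_sum` lands very close to 1.0000000001, all in plain f64.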

Double f64 is fairly easy to port from Julia: https://github.com/JuliaMath/DoubleFloats.jl
Using the FMA instruction gives a really nice boost to performance.
Highly recommended 🙂
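
To give an idea of what such a port involves, here is a rough sketch of the core building blocks. It is not taken from DoubleFloats.jl: the `DoubleF64` type and its methods are hypothetical names, only addition and multiplication are shown, and the simpler "sloppy" variants are used, so treat it as a starting point and not a vetted implementation.

```rust
/// Hypothetical double-double value: the unevaluated sum `hi + lo`,
/// where `lo` holds the bits that do not fit into `hi`.
#[derive(Clone, Copy, Debug, PartialEq)]
struct DoubleF64 {
    hi: f64,
    lo: f64,
}

/// Knuth's TwoSum: returns (s, e) with s = fl(a + b) and a + b = s + e exactly.
fn two_sum(a: f64, b: f64) -> (f64, f64) {
    let s = a + b;
    let bb = s - a;
    let e = (a - (s - bb)) + (b - bb);
    (s, e)
}

/// FMA-based TwoProd: returns (p, e) with p = fl(a * b) and a * b = p + e,
/// barring overflow and underflow. `mul_add` is a fused multiply-add.
fn two_prod(a: f64, b: f64) -> (f64, f64) {
    let p = a * b;
    let e = a.mul_add(b, -p);
    (p, e)
}

impl DoubleF64 {
    fn from_f64(x: f64) -> Self {
        Self { hi: x, lo: 0.0 }
    }

    /// Simple ("sloppy") double-double addition.
    fn add(self, other: Self) -> Self {
        let (s, e) = two_sum(self.hi, other.hi);
        let e = e + self.lo + other.lo;
        let (hi, lo) = two_sum(s, e); // renormalise so that hi carries the leading bits
        Self { hi, lo }
    }

    /// Double-double multiplication; the lo * lo term is dropped as negligible.
    fn mul(self, other: Self) -> Self {
        let (p, e) = two_prod(self.hi, other.hi);
        let e = e + (self.hi * other.lo + self.lo * other.hi);
        let (hi, lo) = two_sum(p, e);
        Self { hi, lo }
    }
}
```

With this, `DoubleF64::from_f64(1.0).add(DoubleF64::from_f64(1e-20))` keeps the tiny term in `lo` instead of rounding it away, and subtracting 1.0 again (by adding -1.0) gives back 1e-20, where plain f64 would give 0.0. The type has a fixed size of 16 bytes and uses only f64 operations, which seems to be what was asked for.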
