teju is a Rust implementation of the (novel) Tejú Jaguá algorithm for converting binary floats into a decimal representation. It exposes an interface that is drop-in compatible with the very popular ryu crate used for the same purpose.
Like ryu, it is substantially faster than going through the standard library formatting facilities.
Compared to ryu, it should be even faster (anywhere from slightly faster to 5 or 6 times faster, depending on the inputs), and it also exposes more flexible formatting options (e.g. always use scientific notation, or never use it, as in std).
Microbenchmarks comparing teju, ryu and std are included in the readme but please take those with a grain of salt [1]. They can be run with cargo bench.
Teju is feature complete and includes extensive tests, but it's still a work in progress (more formatting options, cleaner code, more profiling+tuning the low-level implementation, etc).
[1] For example, both teju and ryu use lookup tables, so a tight benchmark loop where these are always hot in cache may not be representative of your workload.
Yep, format will do exactly that (e.g. formatting 1e-3 as "0.001" but 1e30 as "1e30"), while format_dec/format_exp will force one way or the other. This is like ryu but unlike Rust's stdlib, which always formats as decimal with Display.
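For comparison, here is a minimal sketch of the two fixed modes std offers. It uses only the standard library (no teju or ryu calls), and the helper names std_decimal and std_scientific are made up for illustration.

```rust
/// `Display` never switches to an exponent: it always prints
/// positional decimal notation, however large or small the value.
fn std_decimal(x: f64) -> String {
    format!("{}", x)
}

/// `LowerExp` (the `{:e}` specifier) always prints scientific notation.
fn std_scientific(x: f64) -> String {
    format!("{:e}", x)
}
```

With these, std_decimal(1e-3) gives "0.001" and std_scientific(1e30) gives "1e30", but std_decimal(1e30) prints a 1 followed by 30 zeros. std has no single call that picks between the two automatically, which is the gap the mixed mode described above fills.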
I started something similar by porting the Schubfach algorithm from a C++ prototype, more to solve some inconsistencies with the rounding of floating-point numbers when displaying them at a given precision than for a performance gain. Sadly, there's been a bit of a hiatus since I wrote the validation code, and now I'm wondering whether I should rewrite a big part of the unsafe code as safe code, if possible.
How is the code size? That seemed to be a big downside of ryu (compared to std) last I looked. This matters not just for embedded, but also for instruction cache pressure. And if formatting floats is not all you are doing, that might be more important than how fast the algorithm is in a microbenchmark.
I made this exact caveat in the opening post, ahaha. Both the Tejú and Ryu algos rely on precomputed modular inverse tables, and both my implementation and the ryu crate also use a LUT for printing the actual digits. So that's a few kilobytes of footprint just in terms of lookup tables.
I haven't done any precise comparisons in terms of code size though.
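To give a feel for what a digit-printing LUT costs, here is a generic sketch of the common two-digits-at-a-time technique. It is not code taken from teju or ryu; the names DIGIT_PAIRS, build_pairs and write_u32 are hypothetical, and the table is built at compile time rather than typed out by hand.

```rust
/// Builds the 200-byte table "00", "01", ..., "99" at compile time.
const fn build_pairs() -> [u8; 200] {
    let mut table = [0u8; 200];
    let mut i = 0;
    while i < 100 {
        table[2 * i] = b'0' + (i / 10) as u8;
        table[2 * i + 1] = b'0' + (i % 10) as u8;
        i += 1;
    }
    table
}

/// Hypothetical lookup table: entry `k` occupies bytes 2k and 2k+1.
static DIGIT_PAIRS: [u8; 200] = build_pairs();

/// Writes `n` in decimal into the tail of `out` and returns the used part.
/// Ten bytes are enough for any u32 (u32::MAX has ten digits).
fn write_u32(mut n: u32, out: &mut [u8; 10]) -> &str {
    let mut pos = out.len();
    // Peel off two digits per division instead of one.
    while n >= 100 {
        let pair = (n % 100) as usize;
        n /= 100;
        pos -= 2;
        out[pos..pos + 2].copy_from_slice(&DIGIT_PAIRS[2 * pair..2 * pair + 2]);
    }
    if n >= 10 {
        // Two leading digits remain.
        let pair = n as usize;
        pos -= 2;
        out[pos..pos + 2].copy_from_slice(&DIGIT_PAIRS[2 * pair..2 * pair + 2]);
    } else {
        // A single leading digit remains (this also covers n == 0).
        pos -= 1;
        out[pos] = b'0' + n as u8;
    }
    std::str::from_utf8(&out[pos..]).expect("ASCII digits are valid UTF-8")
}
```

The table in this sketch is only 200 bytes, so in a design like this the digit LUT is the small part of the footprint; the precomputed tables used by the conversion itself would account for most of the rest.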