First release of `root1d`

Hi all,

I'm happy to announce the first release of root1d, a fully generic library to find roots of functions TT (usable with T being f32, f64, rug::Float and rug::Rational out of the box). The state of the art Toms748 algorithm is implemented. Care has been taken to have implementations both for Copy and non-Copy types.

Comments are very welcome.

Enjoy,
Christophe

2 Likes

I'm curious why your function type F: FnMut(T) -> T is FnMut rather than Fn, particularly for the Copy types. The former allows for functions that are not mathematical functions, and I'm not sure why it would be needed for mathematical functions (i.e. where the output depends only on the input).

I also wondered whether your code would be much nicer without all the macros, and just accepting a little duplication. I recently improved some of my own code but eliminating macros in favor of a little duplication, and found that it was far easier to read, improve, and debug.

2 Likes

I feel like you should mention in these docs that this is for continuous functions only, and perhaps (if this is indeed the case) mention that the function values for the given boundary points need to have opposite signs, or whatever the precise requirements are for the algorithms to work.

2 Likes

I'd assume this is following the general wisdom to accept the earliest in the order FnOnce, FnMut, Fn. In this case, since the function is only called by unique reference, FnMut is sufficient. Note that even a Fn can capture a &Cell<T>, &RefCell<T>, &Mutex<T>, or &RwLock<T> to modify its environment, so the function being pure is really a matter of documentation. (Also, what if the user wants to cache their partial results, or log performance output? A Fn would only make this more difficult.)

For a related example, <[T]>::sort_by() accepts a FnMut(&T, &T) -> Ordering for its comparator function, but states:

The comparator function must define a total ordering for the elements in the slice. If the ordering is not total, the order of the elements is unspecified.

A similar rule should really apply to any mathematical function taking a callback: if its documented preconditions are violated, the function is still memory safe, but it might result in a panic or a nonsense value.

1 Like

Thanks for your feedback. As @LegionMammal978 said, I took the more general possibility on the FnOnce, FnMut, Fn scale. Granted, more often than not, the closure will not mutate its environment but I did not want to make it more difficult to do so. This capability was used in some examples to count the number of function evaluations and, more practically, can accomodate functions making use of memoization or, say, coming from ODE solvers using a mutable store.

Is there any particular place you found hard to read because of macros? I am also concerned no to become slower. For example, for newton_quadratic, I thought of parametrizing it with <const K: u8> instead of using a macro but I am not sure whether a loop such as for _ in 0..K {...} would then be unrolled.

1 Like

All type parameters and const generics are monomorphized in the final binary. For instance, if you declared a function newton_quadratic<const K: u8>(), then the compiler would generate two separate functions newton_quadratic<2>() and newton_quadratic<3>(), then optimize them separately. So there's no real performance penalty (except possibly in compile times) in using generics over macros.

I found a lot of it hard to read (on a phone). One example would be the act_on_sign_change macro. I presume it exists because you're interested in types where the simple approach of multiplying the two values and examining the result isn't sufficient, but an actual function returning a descriptive enum would have made both the act_on_sign_change and the code calling it more readable. The trouble is that with the macro you have to read both pieces of code to have a clue what it's doing. I doubt that a generic function would cost you anything here. In terms of performance.

I've done that. Funnily, returning an enumeration of all possible outcomes and acting on them via pattern matching is slightly slower. But returning a Result<X, Error<T>> where X is a smaller enumeration seems even to improve speed slightly (for Copy types)... :thinking:

I tried and there indeed isn't. So I guess the Rust compiler unroll the loop. Is there a place where such best practices are documented?

You do want to be careful about paying attention to tiny changes. Something effects like you describe happen simply by changing where the cache lines end up, and any change to the code will have seemingly random effects.

Or it could be that by using a Result you have the compiler a hint as to the "fast path" that is common.

Is there a place where “optimization wisdom” for the Rust compiler is collected (such as the optimization of Box<smart pointer>, inlining & loop unrolling, recommended patterns,...)¹ or are we just bound to benchmark (I used perf, valgrind and naive timings and have yet to learn criterion). BTW, is criterion the recommended tool to benchmark in Rust?

¹ Here is an idea of what I mean for another language.

Pretty much. I prefer scaling myself, which is more streamlined and simpler in many ways, but criterion is solid.

Regarding a rust optimization resource, I don't have such a thing and would also appreciate it.

1 Like

Thanks to everyone for your comments. I've released a new version.