Simple Runge-Kutta - are all this cloning needed?

Hi,

I'm teaching myself Rust by implementing some fun stuff. This time I'm doing a double pendulum visualisation using nannou and I'm not happy at all with my implementation of Runge-Kutta to solve the differential equations.

My question is: is all that clone()'ing inside next() required?

The function f doesn't modify or does anything funny with the input array, it simply calculates some more values with it and returns another array. Do I really need to do all those clones of ys and a, b, c, and d?

By the way, I can modify f to take &Array instead, so that I could pass a reference without cloning, but then I'd still need to clone due to the calculation done y + h / 2 * a for example...

    use ndarray::Array1;
    pub type Float = f32;
    pub type Array = Array1<Float>;

    /// Runge-Kutta 4th order generator.
    ///
    /// f: function that returns the first order derivatives of the variables.
    /// y: initial value of the variables
    /// h: time-step
    ///
    pub struct RungeKutta {
        pub f: Box<dyn Fn(Array) -> Array>,
        pub y: Array,
        pub h: Float,
    }
    impl Iterator for RungeKutta {
        type Item = Array;
    
        fn next(&mut self) -> Option<Self::Item> {
            let y = self.y.clone();
            let a = (self.f)(y.clone());
            let b = (self.f)(y.clone() + self.h / 2.0 * a.clone());
            let c = (self.f)(y.clone() + self.h / 2.0 * b.clone());
            let d = (self.f)(y.clone() + self.h * c.clone());
            self.y = y + self.h / 6.0 * (a + 2.0 * b + 2.0 * c + d);
            Some(self.y.clone())
        }
    }

If your iterator yields Array then you are going to have to allocate at least once per next. But you can limit it to that by changing the API. What if f is FnMut(&mut Array, &Array)?

That being said ... Iterator is a poor fit for this. An alternative API would do several steps of integration at once and return a multidimensional array (or write to an already allocated one).

Tried an implementation without Iterator, filling up an already allocated array. I like this version better.

However, didn't manage to get rid of all those .clone()s...

/// Runge-Kutta 4th order generator.
///
/// f: function that returns the first order derivatives of the variables.
/// h: time-step
pub struct RungeKutta {
    pub f: Box<dyn Fn(Array) -> Array>,
    pub h: Float,
}
impl RungeKutta {
    /// ys: array with time history of the variables. It should be initialised
    ///     with the required capacity and have its first element set to the initial
    ///     values. The rest of the elements will be filled by the `fill_ys`.
    pub fn fill_ys(&self, ys: &mut Vec<Array>) {
        // TODO: check that ys.len() == 1
        for i in 1..ys.capacity() {
            let y = ys[i - 1].clone();
            let a = (self.f)(y.clone());
            let b = (self.f)(y.clone() + self.h / 2.0 * a.clone());
            let c = (self.f)(y.clone() + self.h / 2.0 * b.clone());
            let d = (self.f)(y.clone() + self.h * c.clone());
            ys.push(y + self.h / 6.0 * (a + 2.0 * b + 2.0 * c + d));
        }
    }
}

pub struct RungeKutta {
    pub f: Box<dyn Fn(&Array) -> Array>,
    pub h: Float,
}
impl RungeKutta {
    /// ys: array with time history of the variables. It should be initialised
    ///     with the required capacity and have its first element set to the initial
    ///     values. The rest of the elements will be filled by the `fill_ys`.
    pub fn fill_ys(&self, ys: &mut Vec<Array>) {
        // TODO: check that ys.len() == 1
        for i in 1..ys.capacity() {
            let y = &ys[i - 1];
            let a = (self.f)(y);
            let b = (self.f)(&(a.clone() * self.h / 2.0 + y));
            let c = (self.f)(&(b.clone() * self.h / 2.0 + y));
            let d = (self.f)(&(c.clone() * self.h + y));
            ys.push(y + self.h / 6.0 * (a + 2.0 * b + 2.0 * c + d));
        }
    }
}

You can save all y.clone() but I think you can't save allocation on cloning a,b,c.

pub struct RungeKutta {
    pub f: Box<dyn Fn(CowArray<Float, Ix1>) -> Array>,
    pub h: Float,
}
impl RungeKutta {
    /// ys: array with time history of the variables. It should be initialised
    ///     with the required capacity and have its first element set to the initial
    ///     values. The rest of the elements will be filled by the `fill_ys`.
    pub fn fill_ys(&self, ys: &mut Vec<Array>) {
        // TODO: check that ys.len() == 1
        for i in 1..ys.capacity() {
            let y = &ys[i - 1];
            let a = (self.f)(y.into());
            let b = (self.f)((self.h / 2.0 * &a + y).into());
            let c = (self.f)((self.h / 2.0 * &b + y).into());
            let d = (self.f)((self.h * &c + y).into());
            ys.push(y + self.h / 6.0 * (a + 2.0 * b + 2.0 * c + d));
        }
    }
}

This one might be better, if you expect self.f reuse the allocation.

1 Like

See also lazyivy

The section on binary operations in the ndarray docs might be helpful here.

While y.clone() is probably reasonable for the left hand side of operations (returning a new owned allocation), you are probably making and then throwing away extra allocations when you .clone() those on the right hand side. There you would probably better served by calling .view() or using a reference.

You could just create a reference to y at the beginning let y_ = &self.y; and then use y_ in the calculations in place of y.clone(). That may give Rust some more freedoms with optimization rather than forcing the new allocation and then calculating on it.

1 Like

Instead of f taking and returning a value, it should borrow its input and output. It should take something like a dst: &mut [f64] and src: &[f64]. This moves the responsibility for allocation somewhere else (in particular you can hoist it out of a loop).