Choosing a type at runtime


#1

Hi everyone. I’ve got an issue I can’t seem to find a solution for online. I’m writing an application that has a configuration file which a user will edit to control what the application outputs to disk.

The program is a heavy number cruncher that currently uses 3D arrays (from ndarray) of type f64, but I want to extend it to also compute arrays of Complex64 from num_complex. Since a complex value is essentially (f64, f64), switching everything to Complex64 would instantly double my memory requirement, so I’d much rather not use a complex type with only its real part populated for the real-only calculations. A generic set of functions is therefore a requirement.

This part has taken me some time, but I now have an MWE of this functionality:

    fn sample_operation<F: ComplexFloat>(w: &Array3<F>) -> F {
        w.into_iter().map(|&el| el.conj() * el).fold(F::zero(), |acc, v| acc + v)
    }

It can be called with both real and complex types:

    let comp = Array3::<Complex64>::from_elem((10,10,10), Complex64::new(1., 2.));
    let real = Array3::<f64>::from_elem((10,10,10), 2.);

    let comp_ans = sample_operation(&comp); //Returns complex of size 16
    let real_ans = sample_operation(&real); //Returns real of size 8

The issue I have now is I want the user to be able to choose the type in the configuration file. More specifically, I have a PotentialType enum that is deserialised from the input, and some of the underlying potentials should emit only reals, and some complex. So what I’d really like is something along the lines of this:

    fn get_potential<F: ComplexFloat>(potential_type: &PotentialType) -> Result<F, Error> {
        match potential_type {
            Real => Ok(12.0),
            Complex => Ok(Complex64::new(12.0, 1.0)),
        }
    }

which obviously gives me "expected type parameter, found f64/Complex64" errors, since the concrete values being returned do not match the generic return type F.

Is there any way I can set, at runtime, the type of a variable from a function like this? Or is there a way I can refactor/rethink this method to allow the generic types to do their job?


#2

One approach is to return a trait object:

    fn get_potential(potential_type: PotentialType) -> Result<Box<dyn ComplexFloat>, Error> {
        match potential_type {
            PotentialType::Real => Ok(Box::new(12.0)),
            PotentialType::Complex => Ok(Box::new(Complex64::new(12.0, 1.0))),
        }
    }

You would then impl<T: ComplexFloat + ?Sized> ComplexFloat for Box<T> (delegating each call to the inner value) so that the box can be used in generic functions that require ComplexFloat-bound type parameters.

The above allocates heap memory though, which may be undesirable.
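
As a rough illustration of the boxing idea, here is a minimal std-only sketch. It does not use the real ComplexFloat trait: a trait whose methods return Self by value generally cannot be turned into a trait object, so the sketch assumes a hypothetical object-safe trait called Potential and a hand-rolled Complex64 stand-in. All names other than PotentialType and get_potential are made up for the example.

```rust
// Stand-in for num_complex::Complex64 (hypothetical, std-only).
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct Complex64 {
    pub re: f64,
    pub im: f64,
}

// Hypothetical object-safe trait: methods take &self and never return Self.
pub trait Potential {
    fn norm_sqr(&self) -> f64;
}

impl Potential for f64 {
    fn norm_sqr(&self) -> f64 {
        self * self
    }
}

impl Potential for Complex64 {
    fn norm_sqr(&self) -> f64 {
        self.re * self.re + self.im * self.im
    }
}

// Delegating impl. The ?Sized bound lets it cover Box<dyn Potential> too.
impl<T: Potential + ?Sized> Potential for Box<T> {
    fn norm_sqr(&self) -> f64 {
        (**self).norm_sqr()
    }
}

pub enum PotentialType {
    Real,
    Complex,
}

// The concrete type is chosen at runtime and hidden behind the box.
pub fn get_potential(potential_type: &PotentialType) -> Box<dyn Potential> {
    match potential_type {
        PotentialType::Real => Box::new(12.0_f64),
        PotentialType::Complex => Box::new(Complex64 { re: 12.0, im: 1.0 }),
    }
}

// A generic function that accepts the boxed value through the delegating impl.
pub fn describe<P: Potential>(p: P) -> f64 {
    p.norm_sqr()
}
```

Calling describe(get_potential(&PotentialType::Complex)) goes through the Box impl and then dynamic dispatch, which is the cost this approach trades for flexibility.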

Another approach might be to define an enum that can hold the various types of ComplexFloat, e.g.:

    enum ComplexFloatValue {
        F32(f32),
        C64(Complex64),
    }

Then, you’d impl ComplexFloat for ComplexFloatValue and can use the enum variants in generic functions without the boxing/trait objects. The implementation would basically switch over the type of self and then delegate the call to the underlying value. Your get_potential then looks like:

    fn get_potential(potential_type: PotentialType) -> Result<ComplexFloatValue, Error> {
        match potential_type {
            PotentialType::Real => Ok(ComplexFloatValue::F32(12.0)),
            PotentialType::Complex => Ok(ComplexFloatValue::C64(Complex64::new(12.0, 1.0))),
        }
    }

It looks like ComplexFloat is a fairly wide trait (i.e. lots of functions), so the delegation is going to be annoying.
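
To make the delegation concrete, here is a minimal sketch of the enum approach. It assumes a deliberately tiny stand-in trait (Value, with only conj and norm_sqr) and a hand-rolled Complex64 struct so that it builds without external crates; neither is the real ComplexFloat or num_complex::Complex64, and total_norm_sqr is a made-up helper.

```rust
// Stand-in for num_complex::Complex64 (hypothetical, std-only).
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct Complex64 {
    pub re: f64,
    pub im: f64,
}

// Hypothetical, much narrower stand-in for the ComplexFloat trait.
pub trait Value: Copy {
    fn conj(self) -> Self;
    fn norm_sqr(self) -> f64;
}

impl Value for f32 {
    fn conj(self) -> Self {
        self
    }
    fn norm_sqr(self) -> f64 {
        f64::from(self) * f64::from(self)
    }
}

impl Value for Complex64 {
    fn conj(self) -> Self {
        Complex64 { re: self.re, im: -self.im }
    }
    fn norm_sqr(self) -> f64 {
        self.re * self.re + self.im * self.im
    }
}

#[derive(Clone, Copy, Debug, PartialEq)]
pub enum ComplexFloatValue {
    F32(f32),
    C64(Complex64),
}

// Every method matches on the variant and forwards to the inner value.
impl Value for ComplexFloatValue {
    fn conj(self) -> Self {
        match self {
            ComplexFloatValue::F32(x) => ComplexFloatValue::F32(x.conj()),
            ComplexFloatValue::C64(z) => ComplexFloatValue::C64(z.conj()),
        }
    }
    fn norm_sqr(self) -> f64 {
        match self {
            ComplexFloatValue::F32(x) => x.norm_sqr(),
            ComplexFloatValue::C64(z) => z.norm_sqr(),
        }
    }
}

// A generic function that accepts the enum like any other implementor.
pub fn total_norm_sqr<V: Value>(values: &[V]) -> f64 {
    values.iter().map(|v| v.conj().norm_sqr()).sum()
}
```

With two methods the forwarding is short, but every additional trait method needs another match of the same shape, which is where the tedium comes from.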


#3

I don’t think that enum trick will work. An enum will always be as big as its largest variant, so even if you were to use the f32 variant, it’d still take up the same amount of memory as a complex number plus the enum tag.
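
The size claim can be checked with std::mem::size_of. The sketch below uses a hand-rolled two-field Complex64 as a stand-in for the num_complex type (an assumption made so that it compiles on its own), and sizes is a made-up helper name.

```rust
use std::mem::size_of;

// Stand-in for num_complex::Complex64 (hypothetical): two f64 fields.
pub struct Complex64 {
    pub re: f64,
    pub im: f64,
}

pub enum ComplexFloatValue {
    F32(f32),
    C64(Complex64),
}

// Returns the sizes in bytes of f32, the complex stand-in, and the enum.
pub fn sizes() -> (usize, usize, usize) {
    (
        size_of::<f32>(),
        size_of::<Complex64>(),
        size_of::<ComplexFloatValue>(),
    )
}
```

The enum has to hold its largest variant plus a discriminant, so the third number is larger than the second whichever variant is stored; an array of the enum would therefore cost more per element than an array of Complex64.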


#4

Right, but I was referring to heap allocations being a (possible) issue with the boxing approach. The enum approach, irrespective of how big it is, allows stack values and static dispatch (via monomorphization).


#5

The easiest way is to parametrize your entire calculation pipeline with F, from the beginning (e.g. reading input files) to the end (e.g. writing the output to file). And then, your main function would be something like:

    fn entire_calculation<F: ComplexFloat>(config: &Config) -> Result<(), Error> {
        // here you know precisely what F is
        // but it can't leak outside through argument or return type
        Ok(())
    }

    fn main() {
        let config = read_config(/* ... */);
        match config.potential_type {
            PotentialType::Real => entire_calculation::<f64>(&config),
            PotentialType::Complex => entire_calculation::<Complex64>(&config),
        }.expect("calculation failed");
    }
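
For anyone who wants to try the pattern without pulling in ndarray or num_complex, here is a self-contained sketch. The Scalar trait, the Complex64 struct, the size field on Config, and the run function are all assumptions made up for the example; only the shape of the dispatch is taken from the code above.

```rust
use std::mem::size_of;

// Stand-in for num_complex::Complex64 (hypothetical, std-only).
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct Complex64 {
    pub re: f64,
    pub im: f64,
}

// Hypothetical minimal replacement for the ComplexFloat bound.
pub trait Scalar: Copy {
    fn from_real(x: f64) -> Self;
    fn norm_sqr(self) -> f64;
}

impl Scalar for f64 {
    fn from_real(x: f64) -> Self {
        x
    }
    fn norm_sqr(self) -> f64 {
        self * self
    }
}

impl Scalar for Complex64 {
    fn from_real(x: f64) -> Self {
        Complex64 { re: x, im: 0.0 }
    }
    fn norm_sqr(self) -> f64 {
        self.re * self.re + self.im * self.im
    }
}

pub enum PotentialType {
    Real,
    Complex,
}

pub struct Config {
    pub potential_type: PotentialType,
    pub size: usize,
}

// Generic over F throughout; F never appears in the arguments or return type.
// Returns the summed squared norm and the number of bytes the grid occupies.
pub fn entire_calculation<F: Scalar>(config: &Config) -> Result<(f64, usize), String> {
    if config.size == 0 {
        return Err("empty grid".to_string());
    }
    let grid: Vec<F> = vec![F::from_real(2.0); config.size];
    let norm: f64 = grid.iter().map(|&v| v.norm_sqr()).sum();
    Ok((norm, grid.len() * size_of::<F>()))
}

// The single place where the runtime choice selects the compile-time type.
pub fn run(config: &Config) -> Result<(f64, usize), String> {
    match config.potential_type {
        PotentialType::Real => entire_calculation::<f64>(config),
        PotentialType::Complex => entire_calculation::<Complex64>(config),
    }
}
```

Both branches return the same non-generic type, which is what lets the match type-check, and the real branch allocates half the memory of the complex one.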

#6

That’s a cool solution, but it wouldn’t scale to larger bodies of code and interactions. I wonder if there’s been any discussion or thoughts on having first-class “proxy”/delegate support? Maybe some form of a procedural macro?


#7

Thanks for the suggestions everyone. I’ll take a shot at some modifications and see what works out.

My application is effectively laid out in a similar fashion to what you’ve suggested, Fylwind, so I’ll see how that goes first.

But I’m not sure I understand what you mean when you say it won’t scale, vitalyd.
I can see that basically every call inside this main entire_calculation function will also need the type parameter. That’s a bit tedious to write, but not impossible. Or do you think there will be performance (or some other) issues with this implementation?


#8

I meant that such an approach might work in this case, but for larger, more complex cases, wrapping the entire pipeline in a single monolithic function is impractical and/or impossible from a code structure/modularization perspective.


#9

Ahh. Very true. A possible solution, just not a nice one.


#10

"wrapping entire pipeline in a single monolithic function is impractical and/or impossible from a code structure/modularization perspective."

Under what circumstance would the pipeline not go into a single function? Your program has to eventually converge in main, so at some point it is going to end up in a single function. The only scenario where this can’t happen is if the pipeline is split over multiple processes, in which case you just have to re-read the configuration file and dispatch to the correct type.


Also, in case you want to parametrize over the entire_calculation it’s possible to use a trait as a kind of generic function:

    trait CalculationFn<Args> {
        type Output;
        // Args is the trait's type parameter, not an associated type
        fn call_calculation<F: ComplexFloat>(self, args: Args) -> Self::Output;
    }

    struct Calculation1;
    impl<'a> CalculationFn<&'a Config> for Calculation1 {
        type Output = Result<(), Error>;
        fn call_calculation<F: ComplexFloat>(self, args: &'a Config) -> Self::Output {
            /* ... */
        }
    }

    struct Calculation2;
    impl<'a> CalculationFn<&'a Config> for Calculation2 {
        type Output = Result<(), Error>;
        fn call_calculation<F: ComplexFloat>(self, args: &'a Config) -> Self::Output {
            /* ... */
        }
    }

    fn dispatch_calculation<'a, C>(config: &'a Config, calculation: C)
    where
        C: CalculationFn<&'a Config, Output = Result<(), Error>>,
    {
        // call through the value so that `self` is supplied
        match config.potential_type {
            PotentialType::Real => calculation.call_calculation::<f64>(config),
            PotentialType::Complex => calculation.call_calculation::<Complex64>(config),
        }
        .expect("calculation failed");
    }

    fn main() {
        let config = read_config(/* ... */);
        match config.which_calculation {
            First => dispatch_calculation(&config, Calculation1),
            Second => dispatch_calculation(&config, Calculation2),
        }
    }

#11

I probably didn’t convey the thrust of my message clearly. What I was really getting at is it’d be nice if there was an ergonomic way to create proxies/delegates/wrappers/etc. For a computation tree like this, yeah, you can root it at a function like yours and arrange for the generic type to not escape through a return type. But that’s quite limited and forces certain structure. I’m more interested in how to make the enum example I mentioned easier/more ergonomic as it doesn’t require any particular code arrangement.


#12

The axis along which this definitely doesn’t scale is multiple config type inputs. Imagine you had an input value in the config file that could be either an f64 or a Complex64:

    match (config.potential_type, config.input.parse::<RealOrComplexValue>()) {
        (Real, RealValue(val))       => entire_calculation::<f64, f64>(val),
        (Complex, RealValue(val))    => entire_calculation::<Complex64, f64>(val),
        (Real, ComplexValue(val))    => entire_calculation::<f64, Complex64>(val),
        (Complex, ComplexValue(val)) => entire_calculation::<Complex64, Complex64>(val),
    }.expect("calculation failed");

You could recover O(n) LOC scaling by defining a series of functions instead of doing it all in one match, but that comes with its own mental overhead.
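
A sketch of what that series of functions could look like: each function resolves one runtime choice into a type parameter and hands over to the next, so two choices with two options each cost 2 + 2 match arms rather than 2 × 2. The Scalar trait, its NAME constant, the Complex64 struct, and the function names dispatch_potential and dispatch_input are assumptions made up for the example, and entire_calculation only reports which types it was instantiated with.

```rust
// Stand-in for num_complex::Complex64 (hypothetical, std-only).
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct Complex64 {
    pub re: f64,
    pub im: f64,
}

// Hypothetical marker trait; NAME exists only to make the choice observable.
pub trait Scalar: Copy {
    const NAME: &'static str;
}

impl Scalar for f64 {
    const NAME: &'static str = "f64";
}

impl Scalar for Complex64 {
    const NAME: &'static str = "Complex64";
}

pub enum PotentialType {
    Real,
    Complex,
}

pub enum RealOrComplexValue {
    RealValue(f64),
    ComplexValue(Complex64),
}

// Placeholder for the real pipeline: reports the types it was built for.
pub fn entire_calculation<P: Scalar, I: Scalar>(_input: I) -> Result<String, String> {
    Ok(format!("potential={}, input={}", P::NAME, I::NAME))
}

// Second stage: P is already fixed, only the input type is still open.
pub fn dispatch_input<P: Scalar>(input: RealOrComplexValue) -> Result<String, String> {
    match input {
        RealOrComplexValue::RealValue(v) => entire_calculation::<P, f64>(v),
        RealOrComplexValue::ComplexValue(v) => entire_calculation::<P, Complex64>(v),
    }
}

// First stage: fixes the potential type, then defers to the next stage.
pub fn dispatch_potential(
    potential_type: PotentialType,
    input: RealOrComplexValue,
) -> Result<String, String> {
    match potential_type {
        PotentialType::Real => dispatch_input::<f64>(input),
        PotentialType::Complex => dispatch_input::<Complex64>(input),
    }
}
```

The number of instantiations the compiler generates is still the full product; only the hand-written code grows linearly, and the reader has to follow the chain of stages to see which types reach the end.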


#13

Another option is to just define an enum Array { Real(Vec<f64>), Complex(Vec<Complex64>) } and deal with everything dynamically. Only bother with static dispatch for parts that are performance sensitive.
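
Roughly, that could look like the sketch below. The Complex64 struct is a std-only stand-in for the num_complex type, and the method and kernel names (norm_sqr_sum, payload_bytes, sum_sq_real, sum_sq_complex) are made up for illustration.

```rust
use std::mem::size_of;

// Stand-in for num_complex::Complex64 (hypothetical, std-only).
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct Complex64 {
    pub re: f64,
    pub im: f64,
}

// One tag for the whole array, so each element stays at its natural size.
#[derive(Debug, PartialEq)]
pub enum Array {
    Real(Vec<f64>),
    Complex(Vec<Complex64>),
}

// Performance-sensitive kernels stay monomorphic over concrete slices.
fn sum_sq_real(data: &[f64]) -> f64 {
    data.iter().map(|x| x * x).sum()
}

fn sum_sq_complex(data: &[Complex64]) -> f64 {
    data.iter().map(|z| z.re * z.re + z.im * z.im).sum()
}

impl Array {
    // Dynamic dispatch happens once per call, not once per element.
    pub fn norm_sqr_sum(&self) -> f64 {
        match self {
            Array::Real(v) => sum_sq_real(v),
            Array::Complex(v) => sum_sq_complex(v),
        }
    }

    // Bytes used by the elements themselves (excluding Vec bookkeeping).
    pub fn payload_bytes(&self) -> usize {
        match self {
            Array::Real(v) => v.len() * size_of::<f64>(),
            Array::Complex(v) => v.len() * size_of::<Complex64>(),
        }
    }
}
```

Because the tag sits on the container rather than on each element, the real variant keeps 8 bytes per element, which avoids the per-element overhead raised in #3.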