Passing a function as parameter: as reference or as value?

What is the difference between defining a function parameter as a reference or as a value (other than that the parameter is moved)? Is there a performance difference?

pub fn calculate(
    data: &[SomeStruct],
    config: &Config,
    f_result_analyzer: &dyn Fn(&ResultBatch) -> i32,
) {...}

In this case I cannot remove the & before dyn; I must pass the function by reference, because otherwise the compiler complains that the size cannot be known at compile time. This makes sense, since only the address of the function is passed anyway (or is it?).

But I can rewrite the definition to

pub fn calculate<F>(
    data: &[SomeStruct],
    config: &Config,
    f_result_analyzer: &F,
)
where
    F: Fn(&ResultBatch) -> i32,
{...}

and also as value (note missing & before F):

pub fn calculate<F>(
    data: &[SomeStruct],
    config: &Config,
    f_result_analyzer: F,
)
where
    F: Fn(&ResultBatch) -> i32,
{...}

So I wonder: Is there any difference between the last two definitions?

If you pass F as a generic parameter, your binary will contain separate machine code for each distinct F. The compiler sees exactly what F is, and can inline it or otherwise optimize specifically for each F. This is the default choice you should use.

dyn Fn is an unsized type, and &dyn Fn is a fat pointer to a value of that type. All calls are dispatched through a virtual table and can be slower, sometimes significantly. On the other hand, your calculate will not be duplicated across the binary (at the cost that the compiler cannot optimize it as much with respect to the function being passed).
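To make the "fat pointer" point concrete, here is a small sketch that prints a few sizes. It only uses std::mem, and the exact numbers depend on the target's pointer width, so no output is shown.

```rust
use std::mem::{size_of, size_of_val};

fn main() {
    // A plain reference is one pointer wide.
    println!("&i32:      {} bytes", size_of::<&i32>());

    // A reference to a trait object carries a data pointer plus a vtable pointer.
    println!("&dyn Fn(): {} bytes", size_of::<&dyn Fn()>());

    // A closure that captures nothing is a zero-sized type, so passing it
    // by value as a generic F moves no data at all.
    let noop = || {};
    println!("closure:   {} bytes", size_of_val(&noop));
}
```

So with &dyn Fn more than "just the address of the function" is passed, while a generic F carries only whatever the closure captured.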

So do not use &dyn Fn unless you know what you're doing. Until then, use calculate<F: Fn()>(f: F) or, almost equivalently, calculate(f: impl Fn()) (just nicer syntax; the only practical difference is that callers cannot name the type explicitly with a turbofish).
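A minimal sketch of the three spellings side by side. ResultBatch here is a made-up stand-in for the type in the question, and the analyze_* names are purely illustrative.

```rust
// Hypothetical stand-in for the ResultBatch type from the question.
struct ResultBatch {
    values: Vec<i32>,
}

// Static dispatch: the compiler generates one copy of this per concrete F.
fn analyze_generic<F>(batch: &ResultBatch, f: F) -> i32
where
    F: Fn(&ResultBatch) -> i32,
{
    f(batch)
}

// The same bound written with `impl Trait` in argument position.
fn analyze_impl(batch: &ResultBatch, f: impl Fn(&ResultBatch) -> i32) -> i32 {
    f(batch)
}

// Dynamic dispatch: a single compiled copy; the call goes through a vtable.
fn analyze_dyn(batch: &ResultBatch, f: &dyn Fn(&ResultBatch) -> i32) -> i32 {
    f(batch)
}

fn main() {
    let batch = ResultBatch { values: vec![1, 2, 3] };
    // Captures nothing, so this closure is Copy and can be passed twice by value.
    let sum = |b: &ResultBatch| b.values.iter().sum::<i32>();

    println!("{}", analyze_generic(&batch, sum)); // prints 6
    println!("{}", analyze_impl(&batch, sum)); // prints 6
    println!("{}", analyze_dyn(&batch, &sum)); // prints 6
}
```

All three produce the same result; the difference is only in how the call is compiled.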

The first requires a reference to a function. The second does not, and is more general: &F implements Fn whenever F does, so the caller can still pass a reference if they want.

Typically, the second is preferable as it is easier to call. However, if calculate were recursive, you would have to use the first, so that f_result_analyzer can be passed down through the recursion by copying the reference (a shared reference is always Copy, while F itself may not be).
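A small sketch of both points, using i32 instead of the types from the question to keep it short. The names apply and apply_n are invented for this example.

```rust
// By value: accepts a closure, a fn item, or a reference to either.
fn apply<F>(x: i32, f: F) -> i32
where
    F: Fn(i32) -> i32,
{
    f(x)
}

// Recursive: taking `&F` lets every level hand the same reference down.
// With `f: F` by value, recursing with `&f` would ask the compiler to
// instantiate the function for F, &F, &&F, ... without end.
fn apply_n<F>(n: u32, x: i32, f: &F) -> i32
where
    F: Fn(i32) -> i32,
{
    if n == 0 {
        x
    } else {
        apply_n(n - 1, f(x), f)
    }
}

fn main() {
    let offset = 10;
    let add = move |v: i32| v + offset;

    println!("{}", apply(1, &add)); // by reference: prints 11
    println!("{}", apply(1, add)); // by value: prints 11
    println!("{}", apply_n(3, 0, &|v: i32| v + 2)); // prints 6
}
```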

Now I have a follow-up question: How can I make the parameter optional?

This works as long as I have Some(fn), but None requires a type annotation which I could not figure out.

pub fn calculate<F>(
    data: &[SomeStruct],
    config: &Config,
    f_result_analyzer: Option<F>,
)
where
    F: Fn(&ResultBatch) -> i32,
{...}

There is no single good way to do this.

If it is a public API, you should probably provide two separate functions (with and without f_result_analyzer), instead.

If you want simple code above all else, use Option<&dyn Fn(&ResultBatch) -> i32> so there is no type parameter to be ambiguous.

Consider whether it is reasonable to pass “function that does nothing” instead of making the function optional.
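For the Option<&dyn Fn(&ResultBatch) -> i32> variant, a rough sketch could look like this. ResultBatch and the simplified signature (no data or config, an i32 return value) are assumptions made only to keep the example self-contained.

```rust
// Hypothetical stand-in for the ResultBatch type from the question.
struct ResultBatch {
    values: Vec<i32>,
}

// The callback is a trait object, so there is no type parameter for the
// compiler to infer and a bare `None` compiles without annotation.
fn calculate(
    batches: &[ResultBatch],
    f_result_analyzer: Option<&dyn Fn(&ResultBatch) -> i32>,
) -> i32 {
    let mut total = 0;
    for batch in batches {
        // Option<&T> is Copy, so it can be inspected on every iteration.
        if let Some(f) = f_result_analyzer {
            total += f(batch);
        }
    }
    total
}

fn main() {
    let batches = vec![
        ResultBatch { values: vec![1, 2] },
        ResultBatch { values: vec![3] },
    ];
    let count = |b: &ResultBatch| b.values.len() as i32;

    println!("{}", calculate(&batches, Some(&count))); // prints 3
    println!("{}", calculate(&batches, None)); // prints 0
}
```

The price is dynamic dispatch on every call of the callback, as discussed earlier in the thread.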

You could...

const DO_NOTHING: Option<fn(&ResultBatch) -> i32> = None;
// ...
calculate(&[], &config, DO_NOTHING);
// This also works but who would want to type it more than once
calculate(&[], &config, None::<fn(&ResultBatch) -> i32>);

Incidentally, you can probably use F: FnMut(&ResultBatch) -> i32, which is more flexible for callers. Or maybe even FnOnce.
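To illustrate why FnMut is more flexible for callers, here is a sketch in which the callback records something in a captured Vec, which a plain Fn bound would reject. ResultBatch and the simplified calculate signature are assumptions for the example, not the original API.

```rust
// Hypothetical stand-in for the ResultBatch type from the question.
struct ResultBatch {
    values: Vec<i32>,
}

// FnMut lets the callback mutate captured state; the parameter must be
// declared `mut` so that it can be called.
fn calculate<F>(batches: &[ResultBatch], mut f_result_analyzer: F) -> i32
where
    F: FnMut(&ResultBatch) -> i32,
{
    let mut total = 0;
    for batch in batches {
        total += f_result_analyzer(batch);
    }
    total
}

fn main() {
    let batches = vec![
        ResultBatch { values: vec![1, 2] },
        ResultBatch { values: vec![3] },
    ];

    // The closure pushes into `seen`, so it is FnMut but not Fn.
    let mut seen = Vec::new();
    let total = calculate(&batches, |b| {
        seen.push(b.values.len());
        b.values.len() as i32
    });

    println!("{} {:?}", total, seen); // prints 3 [2, 1]
}
```

FnOnce only fits if calculate calls the callback at most once, since calling it consumes it.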

I would suggest having two functions, one without a callback and one with. The one without a callback calls the one with a callback and passes None::<fn(&ResultBatch) -> i32>.
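Roughly like this, as a sketch. The name calculate_with, the stand-in ResultBatch, and the i32 return value are made up for the example.

```rust
// Hypothetical stand-in for the ResultBatch type from the question.
struct ResultBatch {
    values: Vec<i32>,
}

// Full version: the callback is optional and generic.
fn calculate_with<F>(batches: &[ResultBatch], f_result_analyzer: Option<F>) -> i32
where
    F: Fn(&ResultBatch) -> i32,
{
    batches
        .iter()
        .map(|b| f_result_analyzer.as_ref().map_or(0, |f| f(b)))
        .sum()
}

// Convenience wrapper: names a concrete type for F once, so callers
// never have to write the annotation themselves.
fn calculate(batches: &[ResultBatch]) -> i32 {
    calculate_with(batches, None::<fn(&ResultBatch) -> i32>)
}

fn main() {
    let batches = vec![
        ResultBatch { values: vec![1, 2] },
        ResultBatch { values: vec![3] },
    ];

    println!("{}", calculate(&batches)); // prints 0
    let count = |b: &ResultBatch| b.values.len() as i32;
    println!("{}", calculate_with(&batches, Some(count))); // prints 3
}
```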
