How to inline in async code?

I have a very long async block with some if/else logic. I would like to refactor this block for clarity without creating overhead. For example, I would like to use #[inline(always)] on a function, and within that function I would like to be able to use .await.

However when I try this, rustc insists I make the function async. So now I have to await it. I suppose that it is not inlined, and creates overhead.

Is this feasible at all?

Is your function going to be called more than 10 million times per second? If not, probably don't worry about it. Function calls are cheap in Rust.

LLVM will try to inline everything that makes sense, even when you don't add annotations. In fact, adding #[inline(always)] can make the overhead larger, because there is also a cost to having more code, bigger functions, and more versions of generic functions: it puts more pressure on the instruction cache and may decrease the hit rate of branch prediction.

Well, there will probably be too much overhead elsewhere to reach 10 million, but it's a function that handles incoming network packets. So the less overhead the better, since it all just adds up.

Also, async functions are far less cheap, if I understand correctly: memory is allocated for the whole stack frame of the function (as a generator), and that then needs to go into the generator one level up, and so on all the way up the call (await) stack. If I'm not mistaken, those add up? I'm not sure, because they might be pinned, so maybe everything only exists once, but it's not as cheap as a normal function call.
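One way to check the memory question is to look at the size of the futures themselves. Calling an async fn builds a state machine value and does not allocate on the heap by itself; a future that is awaited is stored inline inside the future that awaits it, and heap allocation only enters when something boxes the future (for example Box::pin, or an executor's spawn). The sketch below uses only the standard library; inner, outer and future_sizes are hypothetical names made up for illustration.

```rust
use std::future::ready;
use std::mem::size_of_val;

// Hypothetical helper: keeps a 1 KiB buffer alive across an await point,
// so the buffer has to be stored inside the future itself.
async fn inner(seed: u8) -> u32 {
    let buf = [seed; 1024];
    ready(()).await; // stands in for real I/O
    buf.iter().map(|&b| u32::from(b)).sum()
}

// Hypothetical caller: awaiting `inner` embeds its future in this one.
async fn outer(seed: u8) -> u32 {
    inner(seed).await + 1
}

// Returns (size of the inner future, size of the outer future) in bytes.
// Nothing is polled here: creating a future does not run it.
fn future_sizes() -> (usize, usize) {
    let inner_fut = inner(1);
    let outer_fut = outer(1);
    (size_of_val(&inner_fut), size_of_val(&outer_fut))
}
```

Printing future_sizes() should show that the outer future is at least as large as the inner one, since it contains it. So sizes do nest along the await chain, but as one flat value rather than as one allocation per level.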

I'm talking about refactoring a function for clarity. Basically, the refactored code will be called from exactly one location. I don't think inlining could create overhead here. I'm not worried about LLVM failing to inline code that gets called exactly once, even without an annotation, but I don't know how LLVM optimizes awaiting async functions.

Something like:

async fn do_complicated_processing()
{
   if some_condition
   {
      // 150 LOC

      // I would like to move those 150 lines in here:
      //
      some_condition(); // <- doesn't really work because I need to 
                        // make it async and await it if I want to use
                        // .await inside the function
   }

   else if something_else
   {
      // 200 LOC
   }

   else if yet_something_else
   {
      // 350 LOC
      //
      // How big is the generator that needs to be allocated for 350 lines
      // of code with a bunch of variables (some that need to be passed in),
      // with this code awaiting a bunch of futures again?
   }

   else 
   {
      // 80 LOC
   }
}

The long pieces of code obscure the logic of the if/else blocks. In non-async code I would be confident putting this in functions, and probably even without telling the compiler to inline it, it would be done. With async code, it's not so clear what's going to happen.
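For reference, the refactoring being asked about could look roughly like the sketch below: each branch body moves into its own async fn, and the caller awaits it. Everything here is an assumption made for illustration: the Packet enum, the handler names, the use of match in place of the if/else chain, and the tiny block_on executor that is only there so the example runs without an external runtime. Whether the optimizer flattens the helper's state machine into the caller's is not guaranteed by this code.

```rust
use std::future::{ready, Future};
use std::pin::pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

// Hypothetical packet type standing in for the real conditions.
enum Packet {
    Ping(u32),
    Data(Vec<u8>),
    Unknown,
}

// Each branch body becomes a small async fn that is free to use .await.
async fn handle_ping(id: u32) -> String {
    ready(()).await; // stands in for real I/O
    format!("pong {id}")
}

async fn handle_data(bytes: Vec<u8>) -> String {
    ready(()).await; // stands in for real I/O
    format!("{} bytes", bytes.len())
}

// The top-level function now shows only the branching logic.
async fn do_complicated_processing(packet: Packet) -> String {
    match packet {
        Packet::Ping(id) => handle_ping(id).await,
        Packet::Data(bytes) => handle_data(bytes).await,
        Packet::Unknown => String::from("ignored"),
    }
}

// Waker that does nothing; enough for futures that never really suspend.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

// Minimal busy-polling executor, for demonstration only.
fn block_on<F: Future>(fut: F) -> F::Output {
    let mut fut = pin!(fut);
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    loop {
        if let Poll::Ready(out) = fut.as_mut().poll(&mut cx) {
            return out;
        }
        std::thread::yield_now();
    }
}
```

With this in place, block_on(do_complicated_processing(Packet::Ping(7))) drives the whole chain to completion, and the helpers can be read and tested separately from the branching logic.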

I'd imagine that even inline async functions get lowered to separate state machines from their caller, and that this may have the potential to obscure optimizations. Or at least, it will turn one enum discriminant into two.

Really, the only way to know for sure is to benchmark it!
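A rough way to start measuring is sketched below, using only the standard library. It times a "flat" variant against a "split" variant that awaits a helper. All names (flat, helper, split, time_it, block_on) are hypothetical, the workload is a stand-in, and the busy-polling executor adds its own cost to every iteration, so treat the numbers as a first impression only. A real measurement would use the actual executor and packet handler, ideally under a benchmarking harness such as Criterion.

```rust
use std::future::{ready, Future};
use std::hint::black_box;
use std::pin::pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};
use std::time::{Duration, Instant};

// Variant 1: the branch body written directly in the caller.
async fn flat(x: u64) -> u64 {
    ready(()).await; // stands in for real I/O
    x.wrapping_mul(3).wrapping_add(1)
}

// Variant 2: the same body moved into a helper and awaited.
async fn helper(x: u64) -> u64 {
    ready(()).await; // stands in for real I/O
    x.wrapping_mul(3).wrapping_add(1)
}

async fn split(x: u64) -> u64 {
    helper(x).await
}

// Waker that does nothing; enough for futures that never really suspend.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

// Minimal busy-polling executor, for demonstration only.
fn block_on<F: Future>(fut: F) -> F::Output {
    let mut fut = pin!(fut);
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    loop {
        if let Poll::Ready(out) = fut.as_mut().poll(&mut cx) {
            return out;
        }
        std::thread::yield_now();
    }
}

// Runs `make(i)` to completion for every i in 0..iters and returns the
// wrapping sum of the results together with the elapsed wall-clock time.
fn time_it<F, Fut>(iters: u64, make: F) -> (u64, Duration)
where
    F: Fn(u64) -> Fut,
    Fut: Future<Output = u64>,
{
    let start = Instant::now();
    let mut sum = 0u64;
    for i in 0..iters {
        // black_box keeps the compiler from precomputing the inputs.
        sum = sum.wrapping_add(block_on(make(black_box(i))));
    }
    (sum, start.elapsed())
}
```

Calling time_it(1_000_000, flat) and time_it(1_000_000, split) in a release build and comparing the two durations gives a first indication. The sums should match, which also confirms that both variants do the same work.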

If you want to absolutely guarantee zero cost, you can use a macro.
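A minimal sketch of the macro approach, assuming a made-up checksum step: a macro_rules! macro is expanded in place, so an .await inside it belongs to the enclosing async fn and no extra async fn state machine is created for the helper. The names checksum_after_io, process and block_on are hypothetical, and block_on is only there so the example can run without an external runtime.

```rust
use std::future::Future;
use std::pin::pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

// Hypothetical macro: the body is pasted into the caller, so `.await`
// here is an await point of the calling async fn itself.
macro_rules! checksum_after_io {
    ($bytes:expr) => {{
        let bytes: &[u8] = $bytes;
        std::future::ready(()).await; // stands in for real I/O
        bytes.iter().map(|&b| u32::from(b)).sum::<u32>()
    }};
}

// The caller keeps its branching logic readable; the long body lives
// in the macro definition above.
async fn process(bytes: &[u8]) -> u32 {
    if bytes.is_empty() {
        0
    } else {
        checksum_after_io!(bytes)
    }
}

// Waker that does nothing; enough for futures that never really suspend.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

// Minimal busy-polling executor, for demonstration only.
fn block_on<F: Future>(fut: F) -> F::Output {
    let mut fut = pin!(fut);
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    loop {
        if let Poll::Ready(out) = fut.as_mut().poll(&mut cx) {
            return out;
        }
        std::thread::yield_now();
    }
}
```

The trade-off is that a macro body is harder to type-check in isolation and gives less helpful error messages than a function, so it is probably only worth it if a benchmark shows the async helper actually costs something.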

Thanks. I hadn't thought about macros. Sure, I would benchmark first to see whether that's worth it.

I was just wondering if this had been thought about, and if there was already a vision on inlining in async contexts.

Maybe it's too early for that? Maybe I should benchmark and open an issue on rust-lang... It's just that I'm quite busy right now and there's a ton on my todo list.
