Fastest way to perform a file transform

I have a project to speed up a JavaScript 'read, transform, write' algorithm with Rust. I've already taken it from 800 MB per minute to 25 GB per minute, but I think I might get pressured for even more cowbell!

The current transform is effectively:

  • Buffered reader on input file
  • Buffered writer on output file
  • For each line in read buffer
    • Transform it
    • write_all to output buffer
  • Flush output

It is all single-threaded. Each line must appear in the output in the same order as in the input.
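For reference, here is a minimal sketch of the loop described above. It is not the actual code: `transform` is a hypothetical stand-in (it just upper-cases ASCII bytes), `transform_lines` is a name made up for this example, and the function is generic over any `BufRead`/`Write` pair so it can be exercised without touching the disk.

```rust
use std::io::{self, BufRead, Write};

/// Hypothetical stand-in for the real transform, which is not shown in the post.
/// It upper-cases ASCII bytes in place and leaves the newline untouched.
fn transform(line: &mut Vec<u8>) {
    line.make_ascii_uppercase();
}

/// Reads `input` line by line, transforms each line, and writes it to `output`
/// in input order. A single buffer is reused for every line, so there is no
/// per-line allocation once the buffer has grown to fit the longest line.
/// Returns the number of lines processed.
fn transform_lines<R: BufRead, W: Write>(mut input: R, mut output: W) -> io::Result<u64> {
    let mut line: Vec<u8> = Vec::with_capacity(4096);
    let mut count = 0u64;
    loop {
        line.clear();
        // read_until keeps the trailing b'\n' (if present) in `line`.
        if input.read_until(b'\n', &mut line)? == 0 {
            break; // end of input
        }
        transform(&mut line);
        output.write_all(&line)?;
        count += 1;
    }
    output.flush()?;
    Ok(count)
}
```

With real files, `input` would be something like `BufReader::with_capacity(cap, File::open(path)?)` and `output` a `BufWriter::with_capacity(cap, File::create(path)?)`, where `cap` is a buffer size to tune by measurement. Working on bytes with `read_until` rather than `read_line` skips UTF-8 validation, which only makes sense if the transform does not need a `String`.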

So, what are my opportunities for further optimizing that, if any?

Thanks

What is your transform doing? Most optimizations will likely depend on that.

If you replace the transform with a no-op, how much faster does it get? That would be a good way to tell whether the bottleneck is the transform or the I/O.
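One way to run that experiment is to make the transform a parameter and time the same loop twice, once with the real transform and once with a no-op. This is only a sketch under assumptions: `copy_lines_timed` is a hypothetical helper name, and the loop shape is guessed from the description in the question.

```rust
use std::io::{self, BufRead, Write};
use std::time::{Duration, Instant};

/// Runs the read/transform/write loop with a caller-supplied transform and
/// returns the number of input bytes processed and the elapsed wall-clock time.
/// Passing a no-op closure measures the I/O and line-splitting cost alone.
fn copy_lines_timed<R, W, F>(
    mut input: R,
    mut output: W,
    mut transform: F,
) -> io::Result<(u64, Duration)>
where
    R: BufRead,
    W: Write,
    F: FnMut(&mut Vec<u8>),
{
    let start = Instant::now();
    let mut line: Vec<u8> = Vec::new();
    let mut bytes = 0u64;
    loop {
        line.clear();
        let n = input.read_until(b'\n', &mut line)?;
        if n == 0 {
            break; // end of input
        }
        bytes += n as u64;
        transform(&mut line);
        output.write_all(&line)?;
    }
    output.flush()?;
    Ok((bytes, start.elapsed()))
}
```

Call it once with `|_line: &mut Vec<u8>| {}` and once with the real transform on the same input file, then compare the two durations. If the no-op run is almost as slow, the time is going to I/O and line splitting rather than the transform. Repeat each run a few times, because the OS page cache can make the first read of a file much slower than later ones.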
