Success story: new Rustacean beating C perf in first week

juleskers · June 6, 2017, 2:59pm

So, it's been a while, and my colleague has continued fiddling with the original application in between his other tasks.
It's now basically finished, so you can think of this post as the closing update.

tl;dr: Rust now does more for us, and our implementation got even faster!

A big part of the speed increase was discovering the wonderful "needletail" crate (crates.io).
Needletail uses byte-reading, memchr and reused buffers to do (almost) zero-copy iteration. Basically all the things you wonderful people suggested we do upthread.
In practice, it builds a reader over a file, and that reader then calls your user-supplied closure with each record in the file. Your closure gets handed a lifetime-bounded record backed directly by the read buffer!
As a bonus, it transparently reads gzipped files, which was a separate step in our old perl-based system.
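
To make the "reader calls your closure with a borrowed record" pattern concrete, here is a minimal std-only sketch. It is not Needletail's API: the names `Record` and `each_record` are made up for illustration, it assumes plain four-line FASTQ records, and it copies each line into a reused buffer rather than doing true zero-copy reads.

```rust
use std::io::{self, BufRead};

/// A borrowed view of one FASTQ record (hypothetical type for this sketch).
/// Both slices point into buffers owned by the reading loop, so a record
/// cannot outlive the callback invocation it is passed to.
pub struct Record<'a> {
    pub id: &'a [u8],
    pub seq: &'a [u8],
}

/// Strip trailing '\n' / '\r' bytes without copying.
fn trim_newline(mut line: &[u8]) -> &[u8] {
    while let Some((&last, rest)) = line.split_last() {
        if last == b'\n' || last == b'\r' {
            line = rest;
        } else {
            break;
        }
    }
    line
}

/// Call `callback` once per record and return the number of records seen.
/// Assumes every record is exactly four lines: id, sequence, '+', quality.
pub fn each_record<R, F>(mut reader: R, mut callback: F) -> io::Result<usize>
where
    R: BufRead,
    F: for<'a> FnMut(Record<'a>),
{
    // These buffers are allocated once and reused for every record.
    let mut id = Vec::new();
    let mut seq = Vec::new();
    let mut skipped = Vec::new();
    let mut count = 0;
    loop {
        id.clear();
        if reader.read_until(b'\n', &mut id)? == 0 {
            return Ok(count); // end of input
        }
        seq.clear();
        reader.read_until(b'\n', &mut seq)?;
        // The '+' separator and the quality line are read and ignored here.
        for _ in 0..2 {
            skipped.clear();
            reader.read_until(b'\n', &mut skipped)?;
        }
        callback(Record {
            id: trim_newline(&id),
            seq: trim_newline(&seq),
        });
        count += 1;
    }
}
```

A caller would pass any `BufRead` (for example a `BufReader` over a file) plus a closure such as `|rec| { /* inspect rec.seq */ }`. Because the record only borrows the buffers, anything the closure wants to keep has to be copied out explicitly.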

As suggested, we switched to the regex::bytes submodule, to avoid all the UTF-8 validation when reading our (pure-ASCII) FASTQ files.
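
The idea behind matching on bytes can be shown without any regex at all. The sketch below is a std-only stand-in (a fixed-motif search, with the hypothetical name `find_bytes`), not the regex crate itself: it compares raw `&[u8]` slices directly, so the input is never decoded or validated as UTF-8.

```rust
/// Return the offset of the first occurrence of `needle` in `haystack`,
/// comparing raw bytes only (no UTF-8 decoding or validation).
pub fn find_bytes(haystack: &[u8], needle: &[u8]) -> Option<usize> {
    if needle.is_empty() {
        // An empty motif trivially matches at the start;
        // this also avoids calling `windows(0)`, which panics.
        return Some(0);
    }
    haystack
        .windows(needle.len())
        .position(|window| window == needle)
}
```

For example, `find_bytes(b"TTACGTGG", b"ACGT")` yields `Some(2)`, and the search works just as well on input that is not valid UTF-8.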

In addition to Needletail, we've also started using Rust-bio (crates.io) to handle the nucleotide sequences in the file. It has a nice, fast reverse-complement algorithm that even works on raw bytes!
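
For readers unfamiliar with the operation, here is a minimal sketch of a reverse complement over raw bytes. It is a plain std-only illustration, not Rust-bio's implementation, and the function names are chosen for this example; it assumes the A/C/G/T alphabet and passes any other byte (such as `N`) through unchanged.

```rust
/// Complement of a single nucleotide given as an ASCII byte.
/// Bytes outside A/C/G/T (upper or lower case) are returned unchanged.
pub fn complement(base: u8) -> u8 {
    match base {
        b'A' => b'T',
        b'T' => b'A',
        b'C' => b'G',
        b'G' => b'C',
        b'a' => b't',
        b't' => b'a',
        b'c' => b'g',
        b'g' => b'c',
        other => other,
    }
}

/// Reverse complement: walk the sequence back to front and
/// complement each base, without ever converting to a `String`.
pub fn reverse_complement(seq: &[u8]) -> Vec<u8> {
    seq.iter().rev().map(|&base| complement(base)).collect()
}
```

With this sketch, `reverse_complement(b"AAGC")` gives the bytes of `GCTT`.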

To recap the context: we use this in production for our CRISPR-analyzer webservice, which lets you look for CRISPR/Cas9 target sites in your samples, bringing sophisticated biological screens within reach of labs without dedicated bioinformatics staff.
Speeding up this webservice means more customer-friendliness (less waiting!) and fewer resources expended on our side.

Timings, measured with time:

-- original Perl version: --
1) gunzip inputfile:                      real 0m 9.674s
2) extract matching fastQ reads:          real 0m46.133s
3) Map to reference genome with bowtie2:  real 0m30.978s  (constant)
4) Map reads to genes:                    real 2m 3.399s
   TOTAL:                                    * 3m30.184s *

++ RUST version ++
1+2) extract fastQ reads from gzip'd input:   real 0m26.334s
  ALT: same, non-gzip'd input:               (real 0m10.971s)
3) Map to reference genome with bowtie2:      real 0m30.978s (constant)
4) Map reads to genes:                        real 0m19.490s
   TOTAL                                           1m16.802s (63% shorter overall!)
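
The totals and the percentage above can be re-derived from the per-step numbers. This small sketch just redoes that arithmetic, with the step times converted to seconds; the constant and function names are made up for this example.

```rust
/// Per-step wall-clock times in seconds, copied from the listing above.
/// Perl: gunzip, extract reads, bowtie2, map reads to genes (2m3.399s = 123.399s).
pub const PERL_STEPS: [f64; 4] = [9.674, 46.133, 30.978, 123.399];
/// Rust: extract reads from gzip'd input, bowtie2, map reads to genes.
pub const RUST_STEPS: [f64; 3] = [26.334, 30.978, 19.490];

/// Sum of all step times, in seconds.
pub fn total(steps: &[f64]) -> f64 {
    steps.iter().sum()
}

/// How much shorter `after` is than `before`, as a percentage of `before`.
pub fn percent_shorter(before: f64, after: f64) -> f64 {
    (1.0 - after / before) * 100.0
}
```

`total(&PERL_STEPS)` comes to 210.184 s (3m30.184s) and `total(&RUST_STEPS)` to 76.802 s (1m16.802s), so `percent_shorter` lands between 63% and 64%, matching the figure quoted in the listing.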

The updated code is (still) here on GitHub.

For those of you wondering why these timings are longer than the original 3-4 seconds I posted: the 3-second time was for an incomplete port of the original, just the bare minimum of iterating and regex-matching (but of course, the same incomplete feature set for both C and Rust).
These new timings are for the full, does-everything version. Features cost runtime.

Thanks, everyone, for helping us shave an entire two minutes off our webservice!
