BufReader 100x slower than Python — am I doing something wrong?


#1

Hi there! I wrote this little program that uses BufReader wrapped around File to count the lines in a file. (The file is the GeoNames US placename database, which is ~270 MB and has ~2.2 million lines.)

use std::fs::File;
use std::io::{BufRead, BufReader};

const DATA_FILE: &str = "dataset/US.txt";

fn main() {
    // io::Error implements Display, so it can be formatted directly.
    let file = match File::open(DATA_FILE) {
        Err(why) => panic!("couldn't open {}: {}", DATA_FILE, why),
        Ok(file) => file,
    };

    let buffered_file = BufReader::new(file);

    // lines() yields one io::Result<String> per line; only the count is used.
    let mut lines = 0;
    for _line in buffered_file.lines() {
        lines += 1;
    }
    println!("{} lines", lines);
}

It’s really, really slow:

$ time ./target/debug/line-counter
2205986 lines

real    0m52.612s
user    0m52.516s
sys     0m0.058s

As a baseline, I wrote a naïve Python implementation, and that’s 100x faster:

lines = 0

with open('dataset/US.txt') as myfile:
    for line in myfile:
        lines += 1

print "{} lines".format(lines)

$ time python linecounter.py
2205986 lines

real    0m0.432s
user    0m0.387s
sys     0m0.044s

Is there a bug in BufReader, or am I using it wrong here?


#2

I’m pretty sure this has to do with the default optimizations (or lack thereof). Try building with -O if using rustc, or --release if using cargo.

On my machine, the runtime went from 62.2 seconds to 0.9 seconds (!)

Perhaps it’s the overflow checking at work?


#3

As an experiment, I tried a slightly un-rustic approach:

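    // Assumes `file` is declared `mut` and std::io::Read is in scope.
    let mut lines = 0;
    let mut buf: [u8; 4096 * 32] = [0; 4096 * 32];
    loop {
        // Read the next chunk; stop on any error.
        let num_bytes = match file.read(&mut buf) {
            Ok(s) => s,
            Err(_) => break,
        };
        // A read of zero bytes means end of file.
        if num_bytes == 0 { break; }
        // Count newline bytes (0x0A) in the part of the buffer that was filled.
        for i in 0..num_bytes {
            if buf[i] == 0x0A { lines += 1; }
        }
    }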

This brings the runtime down to about 0.5 seconds (so roughly twice as fast, probably because it doesn’t do any of the UTF-8 handling that String requires). But it’s still several times slower than wc -l, which consistently runs in under 0.15 seconds on my machine.

Interesting!


#4

Don’t try to beat wc unless you’re really masochistic: http://git.savannah.gnu.org/cgit/coreutils.git/tree/src/wc.c


#5

Just FYI (performance should be the same) in Rust you can also write:

let lines = buffered_file.lines().fold(0, |sum, _| sum + 1);
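
For comparison, here is a minimal sketch of that fold next to Iterator::count, which should give the same number. The two function names are hypothetical, and neither version inspects the io::Result items that lines() yields, so a read error would be counted like a line.

```rust
use std::io::BufRead;

// Hypothetical helper: the fold from the post above, wrapped in a function.
fn count_with_fold<R: BufRead>(reader: R) -> usize {
    reader.lines().fold(0, |sum, _| sum + 1)
}

// Hypothetical helper: the same count expressed with Iterator::count.
fn count_with_count<R: BufRead>(reader: R) -> usize {
    reader.lines().count()
}
```

Either helper can be called with the BufReader from the original program, for example count_with_count(buffered_file).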

#6

One major factor here is definitely optimizations not being turned on.
However, even with optimizations the Python version is still faster by about a factor of 4 on my system. According to perf, Rust spends about half the time in str::from_utf8() and the other half in main() (presumably in the inlined iteration/counting). So UTF-8 handling is definitely a factor here, and the lines iterator could likely also use some love.
Python apparently pretty much just calls into memchr(), so one could almost argue we are really comparing Rust to C here, though Python does add some overhead.
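
To illustrate skipping str::from_utf8() entirely, here is a rough sketch that scans the BufReader's own buffer for newline bytes using fill_buf and consume. The function name count_newlines is made up for this example, and the plain byte filter only stands in for memchr; it is not what Python does internally. Like wc -l, it counts newline bytes, so a final line without a trailing newline is not counted.

```rust
use std::io::{self, BufRead};

// Hypothetical helper: counts b'\n' bytes without building any String.
fn count_newlines<R: BufRead>(mut reader: R) -> io::Result<usize> {
    let mut count = 0;
    loop {
        // Borrow whatever the reader currently has buffered (refilling if empty).
        let buf = reader.fill_buf()?;
        if buf.is_empty() {
            // An empty buffer after fill_buf means end of input.
            return Ok(count);
        }
        count += buf.iter().filter(|&&byte| byte == b'\n').count();
        // Tell the reader that the whole buffer has been processed.
        let len = buf.len();
        reader.consume(len);
    }
}
```

With the original program this would be called as count_newlines(buffered_file), and the result unwrapped or matched on.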


#7

First of all, yes, doing benchmarks without optimization is meaningless. It’s wrong.

A sidenote is that .lines() allocates a new String per line, so it’s not just UTF-8 handling but that too. Don’t use .lines() unless you need those Strings!
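
One way to avoid the per-line String, sketched under the assumption that only the count matters: call BufRead::read_until with a single Vec<u8> that is cleared and reused for every line. The function name below is hypothetical. Because it works on raw bytes, it also skips UTF-8 validation.

```rust
use std::io::{self, BufRead};

// Hypothetical helper: counts lines while reusing one byte buffer.
fn count_lines_reusing_buffer<R: BufRead>(mut reader: R) -> io::Result<usize> {
    let mut line = Vec::new(); // reused for every line; clear() keeps its capacity
    let mut count = 0;
    loop {
        line.clear();
        // read_until appends bytes up to and including b'\n'; 0 means end of input.
        if reader.read_until(b'\n', &mut line)? == 0 {
            break;
        }
        count += 1;
    }
    Ok(count)
}
```

Like .lines(), this counts a final line that lacks a trailing newline.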


#8

let mut lines = 0;
let mut buf = [0u8; 4096*32];
while let Ok(num_bytes) = file.read(&mut buf) {
    if num_bytes == 0 { break; }
    lines += buf[..num_bytes].iter().filter(|&&byte| byte == b'\n').count();
}

Rustic’d, at least a bit.
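
One caveat with a `while let Ok(..)` loop is that a read error ends the loop exactly like end of file, so a partial count could be reported as if it were complete. Below is a sketch of the same chunked scan wrapped in a function that propagates errors instead. The function name is an assumption made for illustration, and retrying on ErrorKind::Interrupted is a choice made here, not something from the thread.

```rust
use std::io::{self, ErrorKind, Read};

// Hypothetical helper: chunked newline count that reports read errors.
fn count_newlines_checked<R: Read>(mut source: R) -> io::Result<usize> {
    let mut buf = [0u8; 4096 * 32];
    let mut lines = 0;
    loop {
        let num_bytes = match source.read(&mut buf) {
            Ok(0) => break, // end of input
            Ok(n) => n,
            // An interrupted read is safe to retry.
            Err(ref e) if e.kind() == ErrorKind::Interrupted => continue,
            // Any other error is returned to the caller.
            Err(e) => return Err(e),
        };
        lines += buf[..num_bytes].iter().filter(|&&byte| byte == b'\n').count();
    }
    Ok(lines)
}
```

It accepts anything that implements Read, so the File can be passed in directly without a BufReader.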