Hi there! I wrote this little program that uses BufReader wrapped around File to count the lines in a file. (The file is the GeoNames US placename database, which is ~270 MB and has ~2.2 million lines.)

    use std::error::Error;
    use std::fs::File;
    use std::io::{BufRead, BufReader};

    const DATA_FILE: &'static str = "dataset/US.txt";

    fn main() {
        let file = match File::open(DATA_FILE) {
            Err(why) => panic!(
                "couldn't open {}: {}",
                DATA_FILE,
                Error::description(&why)
            ),
            Ok(file) => file,
        };
        let buffered_file = BufReader::new(file);
        let mut lines = 0;
        for line in buffered_file.lines() {
            lines += 1;
        }
        println!("{} lines", lines);
    }

It’s really, really slow:

    $ time ./target/debug/line-counter
    2205986 lines
    real 0m52.612s
    user 0m52.516s
    sys 0m0.058s

As a baseline, I wrote a naïve Python implementation, and that’s over 100x faster:

    lines = 0
    with open('dataset/US.txt') as myfile:
        for line in myfile:
            lines += 1
    print("{} lines".format(lines))


    $ time python linecounter.py
    2205986 lines
    real 0m0.432s
    user 0m0.387s
    sys 0m0.044s

Is there a bug in BufReader, or am I using it wrong here?
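
In case it helps narrow things down, here is a minimal sketch of a variant I could time against the loop above. It assumes, without my having verified it, that part of the cost comes from lines() allocating a fresh String and validating UTF-8 for every line. This version reads raw bytes into one reused buffer instead. The name count_lines is a hypothetical helper for illustration, not something from the program above. Note also that the timing above is from the unoptimized target/debug build, so a build made with cargo build --release may behave very differently.

```rust
use std::io::{self, BufRead};

/// Counts lines by reading raw bytes into a single reused buffer.
/// `count_lines` is a hypothetical helper name used only for this sketch.
pub fn count_lines<R: BufRead>(mut reader: R) -> io::Result<usize> {
    let mut buf = Vec::new();
    let mut lines = 0;
    loop {
        // Reuse the same allocation for every line.
        buf.clear();
        // read_until returns the number of bytes read; 0 means end of input.
        if reader.read_until(b'\n', &mut buf)? == 0 {
            break;
        }
        lines += 1;
    }
    Ok(lines)
}
```

It would be called with the same reader as before, for example count_lines(BufReader::new(file)), and a final line without a trailing newline still counts as a line, as it does with lines().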