I am using the csv crate to read CSV files. So far everything was fine, but I've just encountered a file that has some notes before the header line, notes that I'd like to skip.
I have been fighting with this for a while but have not yet figured out how to skip 3 lines and then use the rest of the file for reading.
let filepath = "Connections.csv";
let mut records: Vec<Connection> = vec![];
let fh = File::open(&filepath)?;
let br = std::io::BufReader::new(&fh);
let lines = br.lines();
let _ = lines.skip(3);
let mut rdr = csv::Reader::from_reader(fh);
for result in rdr.deserialize() {
let record: Connection = result?;
records.push(record);
}
In this example the CSV reader still starts from the beginning of the file (probably as expected; I am only showing it so you can see that I did fight with it). I also tried wrapping the file in a Cursor:
let filepath = "Connections.csv";
let mut records: Vec<Connection> = vec![];
let fh = File::open(&filepath)?;
let cursor = std::io::Cursor::new(&fh);
let lines = cursor.lines();
let _ = lines.skip(3);
let mut rdr = csv::Reader::from_reader(cursor);
for result in rdr.deserialize() {
let record: Connection = result?;
records.push(record);
}
You can use BufRead::skip_until to skip over data until you reach a delimiter. Since that delimiter can only be a single byte, skip until the newline character \n three times (skip_until also consumes the \n terminator itself).
let filepath = "Connections.csv";
let mut records: Vec<Connection> = vec![];
let fh = File::open(filepath)?;
let mut br = std::io::BufReader::new(fh);
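// This is the new part! (skip_until needs `use std::io::BufRead;` in scope.)
// (You can also just write br.skip_until(b'\n')? 3 times instead of a loop)
for _ in 0..3 {
// skip_until returns an io::Result, so propagate any error with `?`
br.skip_until(b'\n')?;
}
// Pass br here, NOT fh, so the CSV reader starts after the skipped lines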
let mut rdr = csv::Reader::from_reader(br);
for result in rdr.deserialize() {
let record: Connection = result?;
records.push(record);
}
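If skip_until is not available on your toolchain (that is an assumption about your setup), roughly the same effect can be had with BufRead::read_until and a throwaway buffer. This is only a sketch; skip_lines is a hypothetical helper name, not part of std or the csv crate:

```rust
use std::io::{self, BufRead};

/// Discards up to `n` lines from `reader`, leaving it positioned at the
/// start of the following line. Stops early if the input ends first.
fn skip_lines<R: BufRead>(reader: &mut R, n: usize) -> io::Result<()> {
    let mut scratch = Vec::new();
    for _ in 0..n {
        scratch.clear();
        // read_until stores everything up to and including b'\n';
        // a return value of 0 means end of input was reached.
        if reader.read_until(b'\n', &mut scratch)? == 0 {
            break;
        }
    }
    Ok(())
}
```

You would call skip_lines(&mut br, 3)? and then pass br to csv::Reader::from_reader as above. The scratch buffer costs a small allocation, which skip_until avoids.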
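Your code example didn’t work because calling .skip() on an iterator doesn’t do anything by itself: it consumes the iterator and returns a new, lazy one that only skips items once it is actually iterated. In fact, you would normally get an unused_must_use warning, which you suppressed with the let _ = lines.skip(3). On top of that, your first snippet builds the CSV reader from fh rather than from the reader you tried to advance.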
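To make the laziness concrete, here is a small std-only sketch that works on in-memory text instead of a file. lines_after is a made-up name for illustration; the point is that the skipping only happens when the value returned by skip is iterated:

```rust
use std::io::{BufRead, Cursor};

/// Returns the lines of `input` that follow the first `n` lines.
fn lines_after(input: &str, n: usize) -> Vec<String> {
    let lines = Cursor::new(input).lines();
    // `skip` takes ownership of `lines` and returns a new lazy adapter;
    // no line is read until that adapter is driven, here by `collect`.
    lines
        .skip(n)
        .map(|line| line.expect("reading from memory should not fail"))
        .collect()
}
```

Note that this yields Strings, not a reader, so it does not help directly with csv::Reader::from_reader, which wants something that implements Read.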
If you have a specific byte offset into the file to skip to (say, you know there’s always 80 bytes of junk before the CSV data begins), use Seek on your File.
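A minimal sketch of the Seek approach, assuming the offset really is fixed. skip_prefix is a hypothetical helper, written generically so it works with a File as well as an in-memory Cursor:

```rust
use std::io::{self, Seek, SeekFrom};

/// Moves `reader` to `offset` bytes from the start and hands it back,
/// ready to be passed on to a parser.
fn skip_prefix<R: Seek>(mut reader: R, offset: u64) -> io::Result<R> {
    // SeekFrom::Start counts from the beginning of the stream.
    reader.seek(SeekFrom::Start(offset))?;
    Ok(reader)
}
```

With the file from the question this would look like let fh = skip_prefix(File::open(filepath)?, 80)?; followed by csv::Reader::from_reader(fh).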
As another note on your code, your final loop of deserializing and pushing into a Vec can be replaced with Iterator::collect:
// ...
let mut rdr = csv::Reader::from_reader(br);
let records = rdr.deserialize::<Connection>()
.collect::<Result<Vec<_>, _>>()?;
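In case collecting into a Result looks unfamiliar: it gathers the Ok values into the Vec and stops at the first Err, returning that error instead. A std-only sketch of the same pattern, with integer parsing standing in for CSV deserialization (parse_all is a made-up name):

```rust
use std::num::ParseIntError;

/// Parses every field, or returns the first parse error encountered.
fn parse_all(fields: &[&str]) -> Result<Vec<u32>, ParseIntError> {
    fields
        .iter()
        .map(|field| field.parse::<u32>())
        // Collecting an iterator of Results into Result<Vec<_>, _>
        // short-circuits on the first Err.
        .collect()
}
```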