Crashing during unwrap far into reading the data

The Rust program linked below compiles without complaint, but crashes while unwrapping a line of text read from a file (line number 125). Run the 'gendat.pl' script to generate some fake data for the 'forums.rs' program to read. The number of lines it processes successfully in the scan_database() function seems to depend on the size of the file that is read into a HashMap before scan_database() is called.

For example, I loaded a history file with 24 million lines, and scan_database() dies while unwrapping the 18th line. I loaded a history containing 4600 lines, and scan_database() dies while unwrapping the 10th line. It worked fine with a history containing 100 lines.

The dependencies are the chrono and regex crates.

I'm hoping it is something simple I'm overlooking, being a relative newbie to Rust. Feel free to critique my use of idioms (or lack thereof).

https://www.cs.nmsu.edu/~mleisher/rust/forums.rs
https://www.cs.nmsu.edu/~mleisher/rust/gendat.txt

The perl script cannot be downloaded. (...is the webserver trying to run it server-side!?)

Thanks for the alert!

Try https://www.cs.nmsu.edu/~mleisher/rust/gendat.txt to get it.

Just to add a data point: the amount of available memory doesn't seem to be an issue. The same error occurs with 8GiB as it does with 64GiB, on both Linux and Mac OS. I haven't tested on Windows yet.

For the curious, the Perl script that currently does this loads 27+ million lines in about 35 seconds while the Rust program, compiled with cargo build --release, loads them in 34 seconds on a Linux box with 64GiB memory. I'm comparing the performance between the two.

I made some minor changes to make it more idiomatic here, itemized in three separate commits:

Let's run it.

$ cargo run
   Compiling forums-help v0.1.0 (C:\Users\diago\Downloads\forums-help)
    Finished dev [unoptimized + debuginfo] target(s) in 1.93s
     Running `target\debug\forums-help.exe`
Loading switch history...done.
Scanning database...Line: 1
Line: 2
Line: 3
Line: 4
Line: 5
Line: 6
Line: 7
thread 'main' panicked at 'called `Result::unwrap()` on an `Err` value: ParseError(TooShort)', src\libcore\result.rs:999:5
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace.
error: process didn't exit successfully: `target\debug\forums-help.exe` (exit code: 101)

ParseError(TooShort)? That doesn't sound like it's coming from unwrapping a line. I add RUST_BACKTRACE=1. This part stands out:

   9: core::result::unwrap_failed<chrono::format::ParseError>
             at /rustc/eae3437dfe991621e8afdc82734f4a172d7ddf9b\src\libcore\macros.rs:18
  10: core::result::Result<chrono::naive::datetime::NaiveDateTime, chrono::format::ParseError>::unwrap<chrono::naive::datetime::NaiveDateTime,chrono::format::ParseError>
             at /rustc/eae3437dfe991621e8afdc82734f4a172d7ddf9b\src\libcore\result.rs:800
  11: forums_help::nm
             at .\src\main.rs:81
  12: forums_help::scan_database
             at .\src\main.rs:130
  13: forums_help::main
             at .\src\main.rs:157

I add a dbg!(start); above the first NaiveDateTime::parse_from_str call:

Loading switch history...done.
Scanning database...Line: 1
Line: 2
Line: 3
Line: 4
Line: 5
Line: 6
Line: 7
[src\main.rs:81] start = "20140103"
thread 'main' panicked at 'called `Result::unwrap()` on an `Err` value: ParseError(TooShort)', src\libcore\result.rs:999:5
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace.

The dbg!() only printed once, so we can see that it breaks the very first time nm is called. And the error is ParseError(TooShort). Quite notably, your date format is "%Y%m%d_%H%M%S", but start only has %Y%m%d.

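So I made the following fix. The regex's first capture group stopped at the underscore and dropped the time part, so it now captures the whole stamp, and the format string drops %S to match: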

diff --git a/src/main.rs b/src/main.rs
index d8482d7..07fdbfb 100644
--- a/src/main.rs
+++ b/src/main.rs
@@ -43,7 +43,7 @@ impl Mac {
 // Load all the MAC addresses with their first and last seen dates.
 //
 fn load_switch_history(prog: &str, emap: &mut HashMap<String, Mac>) {
-    let re = Regex::new(r"^(\d+)_\d+\s+\S+\s+\S+\s+\S+\s+(([0-9A-F]+,?)+)").unwrap();
+    let re = Regex::new(r"^(\d+_\d+)\s+\S+\s+\S+\s+\S+\s+(([0-9A-F]+,?)+)").unwrap();

     let infile = match File::open(SWITCH_HISTORY) {
         Err(_why) => {
@@ -78,10 +78,11 @@ fn load_switch_history(prog: &str, emap: &mut HashMap<String, Mac>) {
 fn nm(start: &str, end: Option<&str>, now: &DateTime<Local>) -> u32 {
     let ey: u32;
     let em: u32;
-    let s = NaiveDateTime::parse_from_str(start, "%Y%m%d_%H%M%S").unwrap();
+    let s = NaiveDateTime::parse_from_str(start, "%Y%m%d_%H%M").unwrap();

     if let Some(end) = end {
-        let edt = NaiveDateTime::parse_from_str(end, "%Y%m%d_%H%M%S").unwrap();
+        let edt = NaiveDateTime::parse_from_str(end, "%Y%m%d_%H%M").unwrap();
         ey = edt.year() as u32;
         em = edt.month() as u32;
     } else {

Seems to work now.
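
On the idiom question: a bare unwrap() reports the error kind but not the input that caused it, which is why the dbg!() was needed above. Below is a minimal sketch of the alternative, returning a Result that carries the offending string. It uses only std so it runs without any crates. The names StampError and year_month are hypothetical and not from forums.rs, and it only checks the shape of the stamp, not calendar validity the way chrono does.

```rust
use std::fmt;

// Hypothetical error type (not from forums.rs) that keeps the offending input.
#[derive(Debug, PartialEq)]
struct StampError {
    input: String,
    reason: &'static str,
}

impl fmt::Display for StampError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "bad timestamp {:?}: {}", self.input, self.reason)
    }
}

// Hypothetical helper: pull (year, month) out of a "YYYYMMDD_HHMM" stamp.
fn year_month(stamp: &str) -> Result<(u32, u32), StampError> {
    let err = |reason: &'static str| StampError {
        input: stamp.to_string(),
        reason,
    };

    // split_once needs Rust 1.52 or later.
    let (date, time) = stamp
        .split_once('_')
        .ok_or_else(|| err("missing time part"))?;

    // Checking for ASCII digits first keeps the byte slicing below safe.
    let all_digits = |s: &str| s.bytes().all(|b| b.is_ascii_digit());
    if date.len() != 8 || time.len() != 4 || !all_digits(date) || !all_digits(time) {
        return Err(err("expected YYYYMMDD_HHMM"));
    }

    let year = date[..4].parse::<u32>().map_err(|_| err("bad year"))?;
    let month = date[4..6].parse::<u32>().map_err(|_| err("bad month"))?;
    Ok((year, month))
}

fn main() {
    // The first stamp has the same shape as the value dbg!() showed.
    for stamp in &["20140103", "20140103_1234"] {
        match year_month(stamp) {
            Ok((y, m)) => println!("{}: year {}, month {}", stamp, y, m),
            Err(e) => eprintln!("{}", e),
        }
    }
}
```

With chrono the same idea would be to map the ParseError into an error that includes start, and let the caller decide whether to skip the line or abort.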

Doh! I learned some. Thanks, Michael!