I wrote a program to process the hard drive statistics published by Backblaze and produce a plot showing a survival analysis for each drive model.
Each quarter, the company publishes a set of CSV files (one per day) that record the status of every disk in their array.
My program loads the data from the local disk (using the csv and walkdir crates), processes it (using Rayon for parallel computing and a CHashMap as the data structure), and should then create and show the graph using criterion_plot.
I'm having a problem with the last part: the plot.
This is the code:
let ref xs: Vec<_> = linspace::<f64>(0.0, days as f64, days + 1).collect();
let mut f = Figure::new();
for (model, data) in models.into_iter() {
    let mean = statistical::mean(data.lifes.as_slice());
    println!("{:30} {:10} {:4}", model,
             data.capacity,
             mean);
    // statistical::standard_deviation(data.lifes.as_slice(),
    //                                 Some(mean)) );
    let mut ys = vec![0; days + 1];
    let total = data.lifes.len();
    let mut alives = total;
    for l in data.lifes.into_iter() {
        ys[l as usize] += 1;
    }
    // at this point ys[i] says how many drives died on the i-th day
    for y in ys.iter_mut() {
        alives -= *y;
        // percentage of drives still alive after day i
        *y = alives * 100 / total;
    }
    f.plot(
        Lines {
            x: xs,
            y: ys
        },
        |lp| lp.set(Label(model))
    );
}
f.draw().ok().and_then(|gnuplot| {
    gnuplot.wait_with_output().ok()
        .and_then(|p| String::from_utf8(p.stderr).ok())
}).expect("ERROR occurred while plotting");
It does not panic and exits normally, but the plot is not shown. Can you suggest why, or offer a solution?
Another strange thing is that the unoptimized build runs in 11 minutes on a subset of the data, while the --release build runs in 6.457 seconds on the same data (measured with time).
That seems like too large a difference to me. Is it something worth reporting?
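
For anyone who wants to check the curve computation separately from the plotting, here is a minimal sketch of it as a standalone function. The name `survival_curve` and the `u32` lifetime type are made up for illustration and are not taken from my program. Unlike the loop above, it skips lifetimes that fall outside the plotted window instead of indexing out of bounds, which is an assumption about how such drives should be treated.

```rust
/// Percentage of drives still alive at the end of each day, for days 0..=days.
/// `lifes[i]` is the day on which drive `i` failed (hypothetical input format).
fn survival_curve(lifes: &[u32], days: usize) -> Vec<usize> {
    let total = lifes.len();
    if total == 0 {
        // No drives: avoid dividing by zero and report an all-zero curve.
        return vec![0; days + 1];
    }

    // deaths[d] counts the drives that died on day d.
    let mut deaths = vec![0usize; days + 1];
    for &l in lifes {
        // Assumption: a lifetime beyond the window means the drive is
        // still alive at the end of the plot, so it is not counted.
        if let Some(slot) = deaths.get_mut(l as usize) {
            *slot += 1;
        }
    }

    // Turn the daily death counts into a running survival percentage.
    let mut alive = total;
    deaths
        .into_iter()
        .map(|d| {
            alive -= d;
            alive * 100 / total
        })
        .collect()
}
```

With four drives failing on days 0, 2, 2 and 5, `survival_curve(&[0, 2, 2, 5], 5)` gives `[75, 75, 25, 25, 25, 0]`, which matches what I would expect the plotted line for one model to look like.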