I have a function that returns a vector of structures containing data from an sql database query via rusqlite. It works. Can I fold the "let maxvalue", "let minvalue" and "for loop" functionality into the "let mut closing_values" step by applying additional methods in some way? Or is there simply a better way to go about do this? Any assistance will be appreciated.
pub fn get_closing_values(start_date: &str) -> Result<Vec<ClosingByDate>, CustomError> {
let db_file_path = "stock_data.db";
let conn = Connection::open(db_file_path)?;
let mut stmt = conn.prepare(
"Select date, fdate, close
From data
Where date >= (:startdate)
Order By date Asc"
)?;
let row_iter = stmt.query_map(&[(":startdate", &start_date)], |row|
{
Ok(ClosingByDate {
date: row.get(0)?,
fdate: row.get(1)?,
close: row.get(2)?,
min: false,
max: false,
})
})?;
let mut closing_values: Vec<ClosingByDate> = row_iter
.filter_map(|item| item.ok())
.collect();
let maxvalue: f64 = closing_values.iter().max_by(|a, b| a.close.total_cmp(&b.close)).unwrap().close;
let minvalue: f64 = closing_values.iter().min_by(|a, b| a.close.total_cmp(&b.close)).unwrap().close;
for element in &mut closing_values {
if element.close == maxvalue {
element.max = true;
} else if element.close == minvalue {
element.min = true;
}
}
Ok(closing_values)
}
I don’t know of any really nice ways to do this. You can collect into multiple collections by using tuples, but "collecting" min and max would require defining special wrapper types.
You could do this several ways, of debatable elegance, but one would be to just obtain the min and max as side effects:
let mut minvalue = f64::INFINITY;
let mut maxvalue = -f64::INFINITY;
let mut closing_values: Vec<ClosingByDate> = row_iter
.filter_map(|item| item.ok())
.inspect(|item| {
minvalue = minvalue.min(item.close);
maxvalue = maxvalue.max(item.close);
))
.collect();
for element in &mut closing_values {
if element.close == maxvalue {
element.max = true;
} else if element.close == minvalue {
element.min = true;
}
}
This does two iterations over the whole dataset instead of four.
You can also record which index is the min and max during the first iteration, then you just need to index instead of the second loop.
let mut minvalue = f64::INFINITY;
let mut maxvalue = -f64::INFINITY;
let mut minindex = 0;
let mut maxindex = 0;
let mut closing_values: Vec<ClosingByDate> = row_iter
.filter_map(|item| item.ok())
.enumerate()
.map(|(index, item)| {
if item.close < minvalue {
minvalue = item.close;
minindex = index;
}
if item.close > maxvalue {
maxvalue = item.close;
maxindex = index;
}
item
)
.collect();
if !closing_values.is_empty() {
closing_values[maxindex].max = true;
closing_values[minindex].min = true;
}
I presume the OP wants to record if there are several occurences of the max. So he could collect the indicies into vectors, and iterate over those vectors instead of rechecking the whole list of closing prices?
Oh, right - forgot that my solution would work differently on that case.
But yea, store a vector maxindices and as you iterate if you find a value > maxvalue overwrite it with a length 1 vector, and if you find a value == maxvalue push to the vector (and similar for the min).
Then at the end iterate the vectors of indices to set .min and .max.
let mut min_value = f64::INFINITY;
let mut min_indices = Vec::with_capacity(5); // guess number of indices we'll need
let mut closing_values = row_iter
.filter_map(|item| item.ok())
.enumerate()
.map(|(index, item)| {
if item.close < min_value {
min_value = item.close;
min_indices.clear(); // re-use the old allocation for the new vector
}
if item.close == min_value {
min_indices.push(index);
}
// same for max
item
})
.collect::<Vec<_>>();
for index in min_indices {
closing_values[index].min = true;
}
// same for max
I would write it in SQL and make it a problem of the database, but you should measure the performance yourself:
SELECT
date,
fdate,
close,
(SELECT MIN(close) FROM data) == close AS min,
(SELECT MAX(close) FROM data) == close AS max
FROM data
WHERE ...
ORDER BY date ASC;
Thank you one and all. Great suggestions. I ended up letting SQL handle determining the min/max values with subqueries. Below is the updated code. Two questions: 1) It seems logical to me that SQLite would be smart enough to run each subquery once since the outer query does not affect the subquery. Is this correct? 2) I assume that the aliases min & max are for readability. Is this correct?
pub fn get_closing_values(start_date: &str) -> Result<Vec<ClosingByDate>, CustomError> {
let db_file_path = "stock_data.db";
let conn = Connection::open(db_file_path)?;
let mut stmt = conn.prepare(
"SELECT date, fdate, close,
(SELECT MIN(close) FROM data WHERE date >= (:startdate)) == close AS min,
(SELECT MAX(close) FROM data WHERE date >= (:startdate)) == close AS max
FROM data
WHERE date >= (:startdate)
ORDER BY date ASC"
)?;
let row_iter = stmt.query_map(&[(":startdate", &start_date)], |row|
{
Ok(ClosingByDate {
date: row.get(0)?,
fdate: row.get(1)?,
close: row.get(2)?,
min: row.get(3)?,
max: row.get(4)?,
})
})?;
let closing_values: Vec<ClosingByDate> = row_iter
.filter_map(|item| item.ok())
.collect();
Ok(closing_values)
}
- I don't know, but you can ask the database with explain query plan. I just realized SQLite also supports window functions, so you can also try this and check if it makes any difference:
SELECT date,
fdate,
close,
close = MIN(close) OVER () AS min,
close = MAX(close) OVER () AS max
FROM data
WHERE date > 'fooo'
ORDER BY date ASC;
- Yes, exactly. It looks nicer when debugging and some ORMs allow you to get the value of a column by name.