I have an iterator whose item is a Vector of enums.
struct SQLiteResultIterator<'s> {
num_columns: i32,
previous_status: i32,
stmt: &'s crate::community_statement::Statement,
}
impl<'s> Iterator for SQLiteResultIterator<'s> {
type Item = Vec<sql::Entity>;
fn next(&mut self) -> Option<Self::Item> {
if self.previous_status != sql::ffi::SQLITE_ROW {
return None;
}
let mut row = Vec::with_capacity(self.num_columns as usize);
for i in 0..self.num_columns {
let entity_value = sql::Entity::new(self.stmt, i);
row.push(entity_value);
}
unsafe {
self.previous_status =
sql::ffi::sqlite3_step(self.stmt.as_ptr());
};
Some(row)
}
}
At every iteration I allocate a new vector, and it seems very wasteful.
Is that true that I will end up allocating a new vector at any iteration? I might be lucky and having the memory allocator being smart enough to don't call the OS, but I am still performing an allocation, correct?
Is there a way to avoid multiple allocations in cases like this one? How?
Yes, this will allocate lots of Vecs. Instead of trusting the system allocator to do the right thing, you may want to look into streaming iterators.
The general idea of these is that you keep the Vec inside of your iterator. On every call to next(), you clear the inner vec, fill it again, and return a reference to it (or a slice spanning its entire content) to the user. This won't allocate on every iteration because Vec's backing storage is kept around when clearing.
A caveat of this design, however, is that such an iterator cannot implement Rust's standard Iterator trait, because that trait requires being able to collect all iterator outputs into a collection, whereas here the user must drop the previous output before it's allowed to call the next() method again.
Thanks for confirming my doubts! Really appreciate!
I thought about this idea of streaming iterators as well, but eventually, I want to have all the result together in memory. So this is not going to cut it.
The other option I am considering is to pass a vector of vector as input and having the function fill the outside vector. It seems like a better idea...
Another option is to yield iterators, then your users can do whatever they want with those. But this may not always be possible, and may be hard to do.
If you always will want all the rows, why make an Iterator at all? Just write a function that returns a Vec, and you can organize the allocation however you care to, including iterating over the rows.
That's one approach, but doesn't use the type system as well as it could. You really want an extensible 2-D matrix, as @Yandros pointed out, so define a type for a row and make a vector of that type.
But how? The row is an ordered list of an arbitrary (definitely non-constant) number of elements. To define a type, I will need to have such number know at compile time.
Apparently I misinterpreted your earlier reply. From what you've just written, it appears that all rows in a given instance are the same length, but that length is not knowable until runtime. The clean way to implement that is to provide runtime functions (or macros) to get and set a slice, using Vec::chunks() and Vec::chunks_mut(), respectively, together with the 1-dimensional index of the slice.
Sorry! My bad! Each query result will have different numbers of columns. But inside one query result, all the row will have the same number of columns.
Yes, I believe we have converged into the best solution together!
Then when you choose what's next() return by replacing each columns.
for i in 0..self.num_columns {
let entity_value = Entity {};
mem::replace(&mut self.row[i], entity_value);
}
And then return a reference to the row Vec.
Actually it does not compile because of the lifetime of Item. But there is a crate that allow you easily to break this wall : streaming_iterator.
You just have to divide your impl : the for loop in advance() and the return in get().