Reviewing:
#[allow(missing_debug_implementations)]
pub struct StatementUse<'a, 'b> {
statement: MaybeCached<'a, Statement>,
column_names: OnceCell<Vec<*const str>>,
// we need to store the query here to ensure noone does
// drop it till the end ot the statement
// We use a boxed queryfragment here just to erase the
// generic type
_query: Box<dyn QueryFragment<Sqlite> + 'b>,
binds_to_free: Vec<(i32, Option<SqliteBindValue<'static>>)>,
}
impl<'a, 'b> StatementUse<'a, 'b> {
pub(super) fn bind<T>(mut statement: MaybeCached<'a, Statement>, query: T) -> QueryResult<Self>
where
T: QueryFragment<Sqlite> + QueryId + 'b,
-
Subjective nit: when using complex lifetimes, especially with
unsafe
code, despite popular belief, theSelf
alias ought to be replaced with what it mentions, so that the signature is fully readable on its own. This is especially meaningful if the associated function is far from theimpl
line:fn bind<T>( mut statement: MaybeCached<'a, Statement>, query: T, ) -> QueryResult<StatementUse<'a, 'b>> where T: QueryFragment<Sqlite> + QueryId + 'b,
Also, since
'b
represents the region of owned-usability of the givenquery
, then maybe you should name it accordingly, as well as'a -> 'stmt
for the maybe-borrow of theStatement
?impl<'stmt, 'query> StatementUse<'stmt, 'query> { fn bind<Query>( mut statement: MaybeCached<'stmt, Statement>, query: Query, ) -> QueryResult< StatementUse<'stmt, 'query> > where Query: 'query + QueryFragment<Sqlite> + QueryId,
and ideally propagate the rename elsewhere.
impl<'a, 'b> StatementUse<'a, 'b> {
pub(super) fn bind<T>(mut statement: MaybeCached<'a, Statement>, query: T) -> QueryResult<Self>
where
T: QueryFragment<Sqlite> + QueryId + 'b,
{
let mut bind_collector = SqliteBindCollector::new();
query.collect_binds(&mut bind_collector, &mut ())?;
let SqliteBindCollector { metadata, binds } = bind_collector;
let mut binds_to_free = Vec::new();
for (idx, (bind, tpe)) in binds.into_iter().zip(metadata).enumerate() {
// Sqlite starts to count by 1
let bind_idx = idx as i32 + 1;
-
Note that
iterator.enumerate()
is the same as(0 ..).zip(iterable)
, so if you want to start by1
, and usei32
s, I highly recommend that you do:for (bind_idx, (bind, tpe)) in (1_i32 ..).zip(binds.into_iter().zip(metadata)) {
// It's safe to call bind here as:
// * The type and value matches
// * We ensure that corresponding buffers lives long enough below
// * The statement is not used yet by `step` or anything else
unsafe { statement.bind(tpe, &bind, bind_idx)? };
-
I cannot judge of the safety of this call, since
Statement::bind
does not spell out its/// Safety
requirements.After looking at this, it does look like the
value
parameter involves two lifetimes (you should try to enable thedeny_elided_lifetimes_in_paths
lint, especially if you are usingunsafe
code), so I'm assuming the lifetimes to live for at least as long as either the statement is dropped, or the bindings are "cleared" by being overwritten with bindings to "null placeholders" (which the drop impl does; for further clarity you could write theDrop
inside this very function).EDIT: After reading this whole snippet as a whole, I realize it's a wrapper offering a non-
unsafe
API for two "orthogonal issues": "bound arguments should not dangle" (which is tackled by theDrop
impl and the vec) and the "column names shan't dangle". From there, I'd recommend that two wrappers be used, one tackling only the first issue, and then a second wrapper, containing it, which adds the "column names" invariant.
// We want to unbind the buffers later to ensure
// that sqlite does not access uninitilized memory
match bind {
SqliteBindValue::BorrowedString(_) | SqliteBindValue::BorrowedBinary(_) => {
binds_to_free.push((bind_idx, None));
}
SqliteBindValue::Binary(b) => {
binds_to_free.push((bind_idx, Some(SqliteBindValue::Binary(b))));
}
SqliteBindValue::String(b) => {
binds_to_free.push((bind_idx, Some(SqliteBindValue::String(b))));
}
_ => (),
}
}
Ok(Self {
statement,
column_names: OnceCell::new(),
binds_to_free,
_query: Box::new(query) as Box<_>,
})
}
pub(in crate::sqlite::connection) fn run(self) -> QueryResult<()> {
self.step().map(|_| ())
}
pub(in crate::sqlite::connection) fn step(self) -> QueryResult<Option<Self>> {
let res = unsafe {
match ffi::sqlite3_step(self.statement.inner_statement.as_ptr()) {
ffi::SQLITE_DONE => Ok(None),
ffi::SQLITE_ROW => Ok(Some(())),
_ => Err(last_error(self.statement.raw_connection())),
}
}?;
Ok(res.map(move |()| self))
}
-
According to the docs of
column_name
, IIUC, this one invalidates the potentially-cachedcolumn_names
since it callssqlite3_step
, and yet offersSelf
nonetheless, thus allowing for:let stmt_use = …; let _ = stmt_use.field_name(1); // calls `column_name` and caches a pointer if let Some(stmt_use) = stmt_use.step()? { // calls `sqlite3_step` if let Some(coll_name) = stmt_use.field_name(1) { // dereferences the cached pointer // According to the warnings written on the raw `column_name` // function, this could be a dangling reference! } }
// The returned string pointer is valid until either the prepared statement is
// destroyed by sqlite3_finalize() or until the statement is automatically
// reprepared by the first call to sqlite3_step() for a particular run or
// until the next call to sqlite3_column_name() or sqlite3_column_name16()
// on the same column.
//
// https://sqlite.org/c3ref/column_name.html
pub(super) unsafe fn column_name(&self, idx: i32) -> *const str {
let name = {
let column_name =
ffi::sqlite3_column_name(self.statement.inner_statement.as_ptr(), idx);
assert!(
!column_name.is_null(),
"The Sqlite documentation states that it only returns a \
null pointer here if we are in a OOM condition."
);
CStr::from_ptr(column_name)
};
name.to_str().expect(
"The Sqlite documentation states that this is UTF8. \
If you see this error message something has gone \
horribliy wrong. Please open an issue at the \
diesel repository.",
) as *const str
}
-
Note: I was initially surprised by having a function yielding a raw pointer and also being an
unsafe fn
on its own, but since.field_name()
is a non-unsafe
function yielding a Rust reference which would be invalidated by this function, theunsafe fn
is warranted (although I'd personally make this function private insidefield_name
to make it clearer).As it stands now, a
/// Safety
annotation regarding its misinteraction with.field_name()
would be in order.
pub(super) fn column_count(&self) -> i32 {
unsafe { ffi::sqlite3_column_count(self.statement.inner_statement.as_ptr()) }
}
pub(super) fn index_for_column_name(&mut self, field_name: &str) -> Option<usize> {
(0..self.column_count())
.find(|idx| self.field_name(*idx) == Some(field_name))
.map(|v| v as usize)
}
pub(super) fn field_name(&mut self, idx: i32) -> Option<&str> {
if let Some(column_names) = self.column_names.get() {
return column_names
.get(idx as usize)
.and_then(|c| unsafe { c.as_ref() });
}
let values = (0..self.column_count())
.map(|idx| unsafe { self.column_name(idx) })
.collect::<Vec<_>>();
let ret = values.get(idx as usize).copied();
let _ = self.column_names.set(values);
ret.and_then(|p| unsafe { p.as_ref() })
}
-
Nit: this uses a
&mut self
receiver as well as a&
-friendlyOnceCell
; which is kind of an oxymoron.-
Either embrace
&mut
here:column_names: Option<Vec<*const str>>,
let column_names = self.column_names.get_or_insert_with(|| { /* define the raw `column_name` here (c.f. previous comment) */ (0 .. self.column_count()) .map(|idx| self.column_name(idx)) .collect() }); column_names .get(idx as usize) .and_then(|c| unsafe { c.as_ref() })
-
or feature
&
access (AFAIK, this won't cause soundness problems with the lifetime, providedcolumn_name
remained private (or else it being anunsafe fn
would indeed be warranted)):- column_names: Option<Vec<*const str>>, + column_names: OnceCell<Vec<*const str>>, … - pub(super) fn field_name(&mut self, idx: i32) -> Option<&str> { + pub(super) fn field_name(&self, idx: i32) -> Option<&str> { - let column_names = self.column_names.get_or_insert_with(|| { + let column_names = self.column_names.get_or_init(|| {
-
pub(super) fn copy_value(&self, idx: i32) -> Option<OwnedSqliteValue> {
OwnedSqliteValue::copy_from_ptr(self.column_value(idx)?)
}
pub(super) fn column_value(&self, idx: i32) -> Option<NonNull<ffi::sqlite3_value>> {
let ptr =
unsafe { ffi::sqlite3_column_value(self.statement.inner_statement.as_ptr(), idx) };
NonNull::new(ptr)
}
}
impl<'a, 'b> Drop for StatementUse<'a, 'b> {
fn drop(&mut self) {
// First reset the statement, otherwise the bind calls
// below will fails
self.statement.reset();
// Reset the binds that may point to memory that will be/needs to be freed
for (idx, _buffer) in std::mem::take(&mut self.binds_to_free) {
unsafe {
// It's always safe to bind null values
self.statement
.bind(SqliteType::Text, &SqliteBindValue::Null, idx)
.expect("Binding nulls shouldn't ever fail");
}
}
}
}
-
I would:
unsafe { // It's always safe to bind null values self.statement .bind(SqliteType::Text, &SqliteBindValue::Null, idx) .expect("Binding nulls shouldn't ever fail"); + drop(buffer); }
to make the semantics involved explicit, so as to be more robust to refactorings.
And, if the
.expect
were to be remotely reachable, which would cause thebinds_to_free
to drop the remaining elements without clearing them from the statement, I'd then suggest to useManuallyDrop
in this snippet:let mut binds_to_free = ManuallyDrop::new(mem::take(&mut self.binds_to_free)); binds_to_free.iter().for_each(|&(idx, _)| unsafe { self.statement.bind… }); // Now that all the stuff has been un-bound, we can finally free the buffers: drop(ManuallyDrop::into_inner(binds_to_free));
And at that point, even more generally, you could be wrapping in a
ManuallyDrop
the.binds_to_free
field altogether: it would avoid the.take()
, as well as show the "subtle drop semantics" intent involved:self.binds_to_free.iter()… // Now that all the stuff has been un-bound, we can finally free the buffers: ManuallyDrop::drop(&mut self.binds_to_free);
Way more unwind-safe this way, I'd say
Other than these things (the most important of those being the potential unsound interaction between .field_name()
and .step()
), everything looks quite good to me —I really like the Box<dyn …>
improvements over the manual free (the one from the previous commit quoted in this thread), for instance