Sqlite: caching prepared statements .... again

I've brought up this topic before and received some good suggestions. But I have never managed to come up with a satisfactory method for achieving something that is both useful and extremely simple in C: caching prepared sqlite statements that are used multiple times, to avoid unnecessary prepares.

In C, I simply use a static hold a pointer to a prepared statement. I initialized the static to NULL and in the function that uses the static (typically I define the static within the function if only that function will use it), I test if the pointer is NULL. If it is, I prepare the statement and store the result in the static. Done.

The application I am speaking about is single-threaded and always will be. In Rust, I'd love to be able to define a static with type &mut Option and initialize it to None. The first time I need the statement, I'd like to prepare it and set the static to Some(... prepared statement ...).

But I can't just use a mutable static even if I were willing to write unsafe code, because trying to initialize a mutable static of type Option to None won't work because None is not a value available at compile-time. lazy_static doesn't work because neither the sqlite or rusqlite crates implement Send for the Statement struct. lazy_static requires that the static types implement Sync. So my thought was to wrap the Option in a Mutex, which provides the Sync trait, but only if the type it enclosed implements Send, so this won't work. I think crate implementors are being conservative, since there are three options for the threading mode used to build sqlite and while the default is safe, I'm guessing they feel they have to code for the worst case since they have no control over the building of sqlite in the users' environment. They are probably right.

My application is single-threaded and always will be. Since there appears to be no way to say to Rust "look, I know you are all about thread safety, but I'm building a single-threaded application and your obsession with thread safety is driving me to distraction. Just back off, I know what I'm doing. Failing that, at Vitaly's suggestion of a month or two ago, I tried thread_local. This got me into lifetime hell. Thoroughly frustrated, I abandoned the effort in favor of other uses of my time. But I decided today to revisit this to try to construct a simple example that would illustrate the issue, hoping perhaps to either get a solution (it's certainly conceivable I'm overlooking something here) or improve the situation so this simple and useful requirement can be implemented in Rust as simply as I think should be possible.

Here's the code:

extern crate sqlite;

use sqlite::{Connection, Statement, State};
use std::cell::RefCell;

thread_local! {
    static DB:Connection = Connection::open(":memory:").unwrap();
    static TEST:RefCell<Statement<'static>> = RefCell::new(DB.with(|db| db.prepare("select    12345").unwrap()));
}

fn main() {
   TEST.with(|test| {
       let stmt = *(test.borrow_mut());
       while let State::Row = stmt.next().unwrap() {
           println!("{}", stmt.read::<i64>(1).unwrap());
        }
    });
}

Here's what happens when I try to build this:

dca@franz:~/Software/example$ cargo build > /tmp/errors
   Compiling example v0.1.0 (file:///home/dca/Software/example)
error[E0495]: cannot infer an appropriate lifetime for autoref due to conflicting requirements
 --> src/main.rs:8:76
  |
8 |     static TEST:RefCell<Statement<'static>> = RefCell::new(DB.with(|db| db.prepare("select 12345").unwrap()));
  |                                                                            ^^^^^^^
  |
note: first, the lifetime cannot outlive the anonymous lifetime #2 defined on the body at 8:68...
 --> src/main.rs:8:68
  |
8 |     static TEST:RefCell<Statement<'static>> = RefCell::new(DB.with(|db| db.prepare("select 12345").unwrap()));
  |                                                                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
note: ...so that reference does not outlive borrowed content
 --> src/main.rs:8:73
  |
8 |     static TEST:RefCell<Statement<'static>> = RefCell::new(DB.with(|db| db.prepare("select 12345").unwrap()));
  |                                                                         ^^
  = note: but, the lifetime must be valid for the static lifetime...
note: ...so that expression is assignable (expected sqlite::Statement<'static>, found sqlite::Statement<'_>)
 --> src/main.rs:8:60
  |
8 |     static TEST:RefCell<Statement<'static>> = RefCell::new(DB.with(|db| db.prepare("select 12345").unwrap()));
  |                                                            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

error: aborting due to previous error

error: Could not compile `example`.

To learn more, run the command again with --verbose.

First of all, disappointingly, E0495 is not in the website's Rust Compiler Error Index. A fuller explanation of this error could perhaps have been helpful here.

Secondly, I have absolutely no clue what the compiler is complaining about. In the first line of the error details, "prepare" is underlined. What reference is outliving what borrowed content? Everything is static, or trying to be, so what's the problem?

1 Like

If you look at the docs for std::thread::LocalKey, you’ll see:

That’s precisely what you were trying to do (escape the closure) and the compiler stopped you.

You can hack this up with static mut and unsafe code. You can wrap sqlite structs into your own type and impl Send for it (ie mark it as Send) and then stick a mutex around it. Or, try to avoid globals even though I know you really want them :slight_smile:

I'm sorry, but I don't understand your comment about my trying to escape the closure. There are two closures in this example. One is the DB.with closure and I assume that's the one you are talking about, given the error. How am I taking the "reference to the contained value", which is the Connection, and sending it out of the closure? I am using it to compute a prepared statement and that is what I am sending out of the closure (as a returned value). The Statement is not the Connection.

DB.with(|db| db.prepare("select 12345").unwrap())

db here is a reference with a "fresh" lifetime (it will not unify with anything - that's how the code prevents leaking any references tied to self to outside the closure). prepare() is defined as:

fn prepare<'l, T: AsRef<str>>(&'l self, statement: T) -> Result<Statement<'l>>

As a result, the returned Statement has a lifetime parameter equal to the borrow of self - note that it has nothing to do with how long the actual self value lives, this is the lifetime of the borrow. So then your code is trying to assign a Statement<'some_anon_lifetime> to a Statement<'static>, which is not valid.

2 Likes

Ok, now we're making progress. I actually thought, though didn't say, that this was tied to sqlite's effort to be sure that a statement didn't outlive its connection. The criterion should NOT be that that the statement doesn't outlive a reference to the connection. References come and go -- it should not outlive the connection! But it's not clear how to say that in Rust, at least to me. rusqlite has the same lifetime relationship between statements and a reference to a connection.

Thanks for the help. I was hoping that I'd really over-looked something simple and there was an easy way to achieve the very simple and standard thing I want to achieve. The difficulty of getting this right in Rust convinces me, yet again, that Rust is the wrong language for applications like this (where gc pauses just aren't a problem) and the cost in productivity brought about by wrestling with problems like this are not worth Rust's seductive appeal of performance+memory-safety. I can get the memory safety much more easily in Go or Haskell and the sacrifice in performance, if there is one, will be paid for by the increase in productivity. Or I can just stick with the ugly-but-working C implementation.

That’s what the code does though. If you were to call prepare on a &'static Connection (which means you have a shared reference to a Connection that lives for the app duration, you’d get a Statement<'static> back. But the way thread_local is designed is you never see the 'static bit. The reason it’s like that is to prevent the closure from leaking out anything tied to the thread local value (as the docs say). If it did allow it, there’s nothing then stopping you from sending the value/reference to another thread (for example), which is no longer a thread local of course. It’s a correctness/safety belt to ensure the API isn’t misused.

If you want to turn off the safety, then use lazy_static with newtypes that mark these sqlite structures as Send (and Sync if you want). You can also keep your current code but transmute to 'static. These are all unsafe but you’ll be in charge of maintaining the safety yourself for these bits. I don’t think it’s reason to give up the whole language.

Also, as mentioned, purely for kicks and giggles: redesign your app to not require globals and see how you feel and how it looks. You might find that it’s actually a better and more maintainable design.

1 Like

vitalyd https://users.rust-lang.org/u/vitalyd
February 13

donallen:

The criterion should NOT be that that the statement doesn’t outlive a
reference to the connection. References come and go – it should not outlive
the connection!

That’s what the code does though. If you were to call prepare on a &'static
Connection (which means you have a shared reference to a Connection that
lives for the app duration, you’d get a Statement<'static> back. But the
way thread_local is designed is you never see the 'static bit. The reason
it’s like that is to prevent the closure from leaking out anything tied to
the thread local value (as the docs say). If it did allow it, there’s
nothing then stopping you from sending the value/reference to another
thread (for example), which is no longer a thread local of course. It’s a
correctness/safety belt to ensure the API isn’t misused.

If you want to turn off the safety, then use lazy_static with newtypes
that mark these sqlite structures as Send (and Sync if you want). You can
also keep your current code but transmute to 'static. These are all unsafe
but you’ll be in charge of maintaining the safety yourself for these bits.
I don’t think it’s reason to give up the whole language.

Also, as mentioned, purely for kicks and giggles: redesign your app to not
require globals and see how you feel and how it looks. You might find that
it’s actually a better and more maintainable design.

What we've been discussing is not implemented as globals in the original
C version. As I said in the original post, the cached prepared statements
are stored mostly in statics that are within the scope of the functions
that use them. They are visible only to the functions that use them. As an
old Scheme programmer, I'm very attentive to scoping issues. This is not
about globals; it's about lifetimes. Again, I think the design of Rust is
all about thread-safety and gc-less-ness and I have an application that
requires neither and in the case of thread-safety, it's very difficult to
say to Rust that "I'm writing single-threaded code. Please don't make me
pay for what I am not using."

Ah, I probably missed it while skimming. Ok, statics to hold your connection and prepared statement at function scope is better but not great :slight_smile:. But I can see the appeal while you stay singlethreaded.

Rust is hyper focused on correctness and wants you to work within its framework to explain to it why something is safe in terms it understands. But you can turn this off and I don’t think it’s difficult (it’s actually simple compared to some other languages, like Java) - you go into the unsafe territory and then sidestep the compiler. You’re one or two transmutes away from getting your code to compile.

I think the problem you are having is because you want to use a static variable instead of a local variable which only gets created once (after the connection) then passed around as a parameter.

When you want to make a static variable you are trying to say that the thing it contains will be valid for the entire program. Considering we create a connection after the program starts and a prepared statement can't outlive it's connection (use-after-free and all that), what you are trying to do is actually unsound and that's why the compiler is stopping you.

You can always sidestep the lifetime system and use unsafe like you would in C, but then it becomes your responsibility to ensure each use of a prepared statement is actually valid and to remember this every time you go near one... The correct way would be to hold the prepared statements in some local Cache<'conn> struct which is created after the connection and contains a bunch of Option prepared statements. You can then statically say that your connection outlives your cached statements.

2 Likes

vitalyd https://users.rust-lang.org/u/vitalyd
February 13

donallen:

As I said in the original post, the cached prepared statements

are stored mostly in statics that are within the scope of the functions

that use them. They are visible only to the functions that use them. As an

old Scheme programmer, I’m very attentive to scoping issues. This is not

about globals; it’s about lifetimes.

Ah, I probably missed it while skimming. Ok, statics to hold your
connection and prepared statement at function scope is better but not great [image:
:slight_smile:].

No, just the prepared statements that will be used many times. Can you
suggest a better way? Programs frequently have state that needs to last the
life of the program -- switches, caches, etc. You can even do this sort of
thing in Haskell without much trouble (IORefs), though I don't know a way
to do it with local scope (I think you need to pass the Haskell equivalent
of a struct around that contains these values, capturing the struct in
callback closures where needed).

But I can see the appeal while you stay singlethreaded.

donallen:

Again, I think the design of Rust is

all about thread-safety and gc-less-ness and I have an application that

requires neither and in the case of thread-safety, it’s very difficult to

say to Rust that “I’m writing single-threaded code. Please don’t make me

pay for what I am not using.”

Rust is hyper focused on correctness and wants you to work within its
framework to explain to it why something is safe in terms it understands.
But you can turn this off and I don’t think it’s difficult (it’s actually
simple compared to some other languages, like Java) - you go into the
unsafe territory and then sidestep the compiler. You’re one or two
transmutes away from getting your code to compile.

Well, I don't think the way to do this is with thread_local, because all it
presents inside the closure is a reference to the static, the reference is
short-lived and the sqlite crates are not saying what they want to say, or
can't say about the relationship between the lifetime of a connection and a
statement on which it depends. I really think this points out a serious
weakness in Rust. Rust makes it extremely difficult to talk about lifetimes
of references, lifetimes of objects and the relationships among them. When
I read the attempts to document this, both by Klabnik and Jim Blandy, the
understanding I get is not what's going on, it's how obscure what's going
on really is. In this case, I think that's why the documentation seems so
bad; memory safety without a gc is hard and maybe the design of Rust's
lifetime system isn't quite right, starting with how hard it is to say that
an arbitrary object, not a reference, is static. My 2 cents.

As for whether or not to give up on the language (for this application),
look at the lengths I have to go to, with your help, to do something so
basic. Why does sqlite (and postgresql and all the other db systems) have
prepared statements with bindable parameters? To avoid recompiling them
every time you use them! This is analogous to why we use compiled languages
instead of doing everything interpreted. And Rust is all about performance.
But I need to twist myself into a pretzel and write some ugly, unsafe code
to make this happen in Rust. That to me is a clear indication that it's the
wrong tool for this job, especially given that I have other tools that let
me do this easily (trivial in C, Nim, Go or Scheme, a bit harder in
Haskell, none as messy as Rust). My old friend Jerry Sussman taught us
years ago that it's extremely important that code be readable.

I'm not giving up on the language, which I've said before clearly has merit
and areas of application. I just don't think it's right for this job.

Michael-F-Bryan https://users.rust-lang.org/u/michael-f-bryan
February 13

I think the problem you are having is because you want to use a static
variable instead of a local variable which only gets created once (after
the connection) then passed around as a parameter.

When you want to make a static variable you are trying to say that the
thing it contains will be valid for the entire program. Considering we
create a connection after the program starts and a prepared statement can’t
outlive it’s connection (use-after-free and all that), what you are trying
to do is actually unsound and that’s why the compiler is stopping you.

You can always sidestep the lifetime system and use unsafe like you
would in C, but then it becomes your responsibility to ensure each use of a
prepared statement is actually valid and to remember this every time you go
near one… The correct way would be to hold the prepared statements in some
local Cache<'conn> struct which is created after the connection and
contains a bunch of Option prepared statements. You can then statically
say that your connection outlives your cached statements.

I've tried this. It's the same approach that would be necessary in Haskell.
The devil is in the details. I also ended up in lifetime hell going this
route, again involving the relationship of a statement's lifetime to the
lifetime of the connection on which it depends. One of the issues is that
this program has a graphical UI (gtk) and callbacks need access to some of
these global-esque variables. The callbacks are closures. How do you insure
that the globals live long enough to be reference-able by the closures? You
have to use cloned Rcs. But those can't be mutable. Now what? RefCells. Get
into the weeds with that one and you end up with a massive battle with the
borrow checker. Why bother when I can do this in go, nim, haskell, scheme
or swift and get absolutely acceptable performance without putting myself
in a mental institution mumbling about how my life is ruined because I
can't please the Rust compiler :slight_smile:

Well you need the connection as well since the statement is tied to it.

An approach I’d suggest is what you said later:

You’d set up your connections early on in main(), prepare statements that you know a priori, and then hand out the statements as needed. These instances would be kept around (and live) for the entirety of the program and passed around as context/state where needed.

You mentioned gtk in the other post - does it require closures that you give it to be 'static? If so, yeah, you’ll need to use Rc and RefCell. This should be workable but you have to align your code structure/flow with easier borrow checking. It’s a bit of a skill (comes with Rust experience) in itself. I wrote a bit about some general strategies here: How can I get myself out of this "mutable more than once" corner I've backed myself into?. Maybe that’ll help frame your approach a bit better.

By the way, have you looked at any of the connection pooling crates? Eg r2d2_sqlite3 - Rust

I think the sqlite API says the correct thing about the relationship. A statement depends on the connection you get it from. That seems pretty uncontroversial.

Lifetimes are not easy, but I think the friction you’re experiencing here is because of trying to shoehorn dynamic data into a 'static one. I’d be happy to try to explain some lifetime issues if you’d like - not sure if I’ll succeed but worth a shot :slight_smile:.

vitalyd https://users.rust-lang.org/u/vitalyd
February 14

donallen:

No, just the prepared statements that will be used many times. Can you

suggest a better way?

Well you need the connection as well since the statement is tied to it.

Yes. What I meant by that saying that you only the static at function scope
for the prepared statement. The connection needs to be visible to the whole
program somehow.

An approach I’d suggest is what you said later:

donallen:

I think you need to pass the Haskell equivalent

of a struct around that contains these values, capturing the struct in

callback closures where needed).

You’d set up your connections early on in main(), prepare statements that
you know a priori, and then hand out the statements as needed. These
instances would be kept around (and live) for the entirety of the program
and passed around as context/state where needed.

You mentioned gtk in the other post - does it require closures that you
give it to be 'static? If so, yeah, you’ll need to use Rc and RefCell. This
should be workable but you have to align your code structure/flow with
easier borrow checking. It’s a bit of a skill (comes with Rust experience)
in itself. I wrote a bit about some general strategies here: How can I
get myself out of this “mutable more than once” corner I’ve backed myself
into?
https://users.rust-lang.org/t/how-can-i-get-myself-out-of-this-mutable-more-than-once-corner-ive-backed-myself-into/15541.
Maybe that’ll help frame your approach a bit better.

By the way, have you looked at any of the connection pooling crates? Eg
r2d2_sqlite3 - Rust

donallen:

and the sqlite crates are not saying what they want to say, or

can’t say about the relationship between the lifetime of a connection and a

statement on which it depends.

I think the sqlite API says the correct thing about the relationship. A
statement depends on the connection you get it from. That seems pretty
uncontroversial.

Well now I'm thoroughly confused. In an earlier message, you said:

"fn prepare<'l, T: AsRef>(&'l self, statement: T) ->
Result<Statement<'l>>

As a result, the returned Statement has a lifetime parameter equal to the
borrow of self - note that it has nothing to do with how long the actual
self value lives, this is the lifetime of the borrow.

"

Your reading of what &'l self means -- that 'l is the lifetime of the
reference to self, which may be different from (less than) the lifetime of
self is exactly what I thought it meant. But your statement above ("I think
the sqlite API says the correct thing ...") made me wonder, because what is
necessary is that the statement not outlive the connection. Constraining
the statement to not outlive a reference to the connection is too
conservative, since the reference might die before the connection does,
which was the cause of the trouble with my thread_local attempt.

So I concocted a little example:

extern crate sqlite;

use sqlite::{Connection, Statement, State};

fn main() {
let db:Connection = Connection::open(":memory:").unwrap();
let mut stmt:Statement;
{
let inner_db = &db;
stmt = inner_db.prepare("select 12345").unwrap();

} //inner_db dies here

while let State::Row = stmt.next().unwrap() {
    println!("{}", stmt.read::<i64>(0).unwrap());
}

} //stmt dies here, just before db

To my utter amazement, this compiles and runs. The definition of prepare
says that the statement it returns will have the same lifetime as the
pointer to the connection it received, yes? Then how does this work?

This is so illustrative of the problem I have with Rust. Given the language
and the attempts to explain it, I simply do not have an accurate mental
model of what this thing is doing, which is essential, of course, if you
are going to use a programming language successfully. I understand the
memory models of C, Scheme, Haskell and every other programming language
I've used since I started doing this in ancient times. Rust consistently
violates the Principle of Least Surprise, at least for me.

Can you please explain to me why this example works?

extern crate sqlite;

use sqlite::{Connection, State, Statement};

fn main() {
    let db: Connection = Connection::open(":memory:").unwrap();
    let mut stmt: Statement;
    {
        let inner_db = &db;
        stmt = inner_db.prepare("select 12345").unwrap();
    } //inner_db dies here

    while let State::Row = stmt.next().unwrap() {
        println!("{}", stmt.read::<i64>(0).unwrap());
    }
} //stmt dies here, just before db

The compiler knows precisely how long db is live for. Recall that the dot operator used for method calls automatically dereferences through as many layers of references as needed to make the call. So you could've written let inner_db = &&&&&&&&&&&&&&&&&&&db and it would still work - the compiler would strip away all the references, get to the Connection value, auto reference it (i.e. get the &self that's needed for the prepare() call), and dispatch to the function. Since db outlives stmt, this is all perfectly fine - the inner_db doesn't really change the situation, all it really is is an indirection.

vitalyd https://users.rust-lang.org/u/vitalyd
February 14

extern crate sqlite;

use sqlite::{Connection, State, Statement};

fn main() {
let db: Connection = Connection::open(":memory:").unwrap();
let mut stmt: Statement;
{
let inner_db = &db;
stmt = inner_db.prepare("select 12345").unwrap();
} //inner_db dies here

while let State::Row = stmt.next().unwrap() {
    println!("{}", stmt.read::<i64>(0).unwrap());
}

} //stmt dies here, just before db

The compiler knows precisely how long db is live for. Recall

Recall? This is like saying "Recall Feynman's formulation of quantum
electrodynamics" :slight_smile:

that the dot operator used for method calls automatically dereferences
through as many layers of references as needed to make the call. So you
could’ve written let inner_db = &&&&&&&&&&&&&&&&&&&db and it would still
work - the compiler would strip away all the references, get to the
Connection value, auto reference it (i.e. get the &self that’s needed for
the prepare() call), and dispatch to the function. Since db outlives stmt,
this is all perfectly fine - the inner_db doesn’t really change the
situation, all it really is is an indirection.

In the immortal words of John McEnroe, "you cannot be SERIOUS!". Someone
else seems equally baffled:

.

So this auto-dereferencing thing wanders through the pointer chain until it
gets to the actual object and then generates an auto reference to that
object with the same lifetime as the object? And we are supposed to keep
this special case in mind (with the myriad other special cases in this
language) when trying to write Rust code, where getting lifetimes right is
difficult enough? And there are competitive languages perfectly appropriate
for writing non-systems applications that make you think of none of this?

Thanks for your time and your help. Again. But let's stop.

So, let me get this straight:

  • You initially assume that the prepared statement must not outlive the reference to the connection, find it unnecessarily restrictive, and complain about it.
  • Then you figure out that the compiler actually does what you consider to be the right thing (ensures that prepared statement does not outlive the underlying connection object), and still find a way to complain about it.

I know you really, really want to prove your point, but sometimes it is good to know when to stop...

2 Likes

HadrienG https://users.rust-lang.org/u/hadrieng
February 14

So, let me get this straight:

  • You initially assume that the prepared statement must not outlive
    the reference to the connection, find it unnecessarily restrictive, and
    complain about it.
  • Then you figure out that the compiler actually does what you
    consider to be the right thing (ensures that prepared statement does not
    outlive the underlying connection object), and still find a way to complain
    about it.

I know you really, really want to prove your point, but sometimes it is
good to know when to stop…

And it would be really good if you read the whole thread before you get
snotty. If you had, you'd realize that what I was trying to do was make one
last ditch attempt to figure out how to use this language in some
reasonable way to do something that's trivial in almost any other and a
requirement for using database systems efficiently.

I have read the thread, but somehow got something differently out of it. What I saw was someone desperately trying to shoehorn an architectural pattern (global state) which is considered archaic and undesirable in any modern coding style, into a problem which didn't really need it, due to what looked like basic unability to fight old bad habits, and then complaining because Rust took extra steps to discourage it. Trying so hard to prove that this is the language's fault that the argumentation ended up self-defeating.

2 Likes

For your information,
rusqlite supports caching prepared statements.
The cache is associated to one connection (and dropped when the connection is closed).
The implementation was based on the official TCL binding.

4 Likes

Part of the confusion may be that the prepared statements in SQLite are a misnamed. They are actually reusable result sets. There is no immutable prepared statement concept in the SQLite API.