Cache prepared statements in statics?

I remain skeptical about Rust as a general-purpose programming language (my skepticism does not extend to its use as a systems programming language, which is how the project bills it). I've said as much in other posts and won't elaborate here. Still, I continue to be intrigued by it, mostly because there are clearly smart people involved in this project, so I keep experimenting with it in my spare time.

I have written a personal financial management system for myself. The main application is written in C, using sqlite3 and gtk3. Let's just look at one issue that was easy to do in C and confounds me with Rust. Gtk callbacks frequently need to access the database. In C, I use statics inside the callback functions, initialized to NULL, to hold prepared queries. The functions test the static for NULL and if true, they prepare the query and store it in the static. If not, they just use it. So these statics allow me to do just-in-time preparation of only the prepared queries actually used, and then cache/memoize them for subsequent use.

In Rust, I've sketched out a bit of code like this and have tried to use lazy_static to (possibly) hold the prepared queries. This application is single-threaded and will remain so. I am using the sqlite crate. Its Statements (the result of prepare) do not implement the Send trait, and Statements must be stored so that they are mutable. Apparently Rust insists that even non-mutable statics be thread-safe, even in a single-threaded application, with no way to turn this off. So I can't define the type of a static as Statement, or Option<Statement>, and maybe use RefCell for interior mutability to avoid declaring the static mutable. So even though I don't need a Mutex, I wrapped the statement in a Mutex in an effort to make the compiler happy, and then tried wrapping that in Arc, as The Book suggests. It still complains:

error[E0277]: the trait bound *mut sqlite3_sys::types::sqlite3_stmt: std::marker::Send is not satisfied in sqlite::Statement<'static>
--> src/book.rs:29:9
|
29 | / lazy_static! {
30 | | static ref stmt:Option<Arc<Mutex<Statement<'static>>>> = None;
31 | | }
| |_________^ *mut sqlite3_sys::types::sqlite3_stmt cannot be sent between threads safely
|
= help: within sqlite::Statement<'static>, the trait std::marker::Send is not implemented for *mut sqlite3_sys::types::sqlite3_stmt
= note: required because it appears within the type (*mut sqlite3_sys::types::sqlite3_stmt, *mut sqlite3_sys::types::sqlite3)
= note: required because it appears within the type sqlite::Statement<'static>
= note: required because of the requirements on the impl of std::marker::Send for std::sync::Mutex<sqlite::Statement<'static>>
= note: required because of the requirements on the impl of std::marker::Sync for std::sync::Arc<std::sync::Mutex<sqlite::Statement<'static>>>
= note: required because it appears within the type std::option::Option<std::sync::Arc<std::sync::Mutex<sqlite::Statement<'static>>>>
= note: required by lazy_static::lazy::Lazy
= note: this error originates in a macro outside of the current crate

So it appears that wrapping an object that doesn't implement Send in a Mutex (which claims to implement Send) in a single-threaded application just isn't good enough. Perhaps I'm just going about this the wrong way. Can someone please educate me on how to do this simple-minded task correctly in Rust?

Thanks.

You're right in observing that the concept of a single-threaded application doesn't exist in Rust. Rust doesn't forbid threads, and it requires that everything that isn't forbidden does work.

In general there's a theme that in C you're happy that there exists a case where your code works. But Rust wants to prove that there can't possibly be any case, now or in future, in which the code doesn't always work.

The second thing is that there are two notions of thread safety: "Sync", which means multiple threads can use an object at the same time, and "Send", which means an object isn't tied to a single specific thread forever (e.g. it doesn't depend on thread-local storage or require access only from the main thread, as some UI frameworks do).

Mutex helps make objects Sync by locking. But it doesn't make them Send, because any thread can lock the Mutex and use the object inside. If Sqlite's statement is not tied to a particular thread, it may be a limitation/bug of the sqlite wrapper you're using, which fails to mark it as Send.
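
Here is a minimal sketch of that distinction, using only the standard library. RawHandle is a made-up stand-in for a wrapper around a C pointer (it is not the sqlite crate's Statement), and needs_send / needs_sync are hypothetical helpers whose calls only compile when the bound holds:

```rust
use std::cell::Cell;
use std::sync::Mutex;

// Made-up stand-in for a wrapper around a C handle. The raw pointer field
// makes it neither Send nor Sync, like the Statement in the error above.
#[allow(dead_code)]
struct RawHandle {
    ptr: *mut u8,
}

// Hypothetical helpers: a call compiles only if T satisfies the bound.
fn needs_send<T: Send>() -> bool {
    true
}

fn needs_sync<T: Sync>() -> bool {
    true
}

fn main() {
    // Cell<i32> is Send but not Sync; the Mutex supplies the missing Sync.
    assert!(needs_send::<Cell<i32>>());
    assert!(needs_sync::<Mutex<Cell<i32>>>());

    // RawHandle is not Send, and a Mutex cannot add that. Uncommenting
    // either line below is rejected with error E0277.
    // needs_send::<Mutex<RawHandle>>();
    // needs_sync::<Mutex<RawHandle>>();
    println!("bounds hold for Mutex<Cell<i32>>");
}
```

So the Mutex only helps when the thing inside is already Send, which is the chain of "required because" notes in the original error.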

It might just be an oversight of the sqlite crate - they would be happy about an issue in that case. Or prepared statements might not be usable from different threads after all in sqlite - you don't really know with C libraries unless it's explicitly documented.

If they can't be made Send, a thread-local RefCell could be used instead of a Mutex.
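
A rough sketch of what that could look like, mirroring the "static initialized to NULL" pattern from the C version. FakeStatement and prepare are made-up stand-ins so the example runs without a database. With the real sqlite crate the Statement also borrows its Connection, and how to satisfy that lifetime is not addressed here:

```rust
use std::cell::{Cell, RefCell};
use std::rc::Rc;

// Made-up stand-in for a prepared statement. The Rc field makes it !Send,
// so it could not live in a lazy_static behind a Mutex either.
struct FakeStatement {
    #[allow(dead_code)]
    sql: Rc<str>,
    uses: u32,
}

thread_local! {
    // Counts how often the (pretend) expensive preparation actually runs.
    static PREPARE_CALLS: Cell<u32> = Cell::new(0);
    // The per-thread cache slot, empty until the callback first runs.
    static BALANCE_STMT: RefCell<Option<FakeStatement>> = RefCell::new(None);
}

// Hypothetical replacement for Connection::prepare.
fn prepare(sql: &str) -> FakeStatement {
    PREPARE_CALLS.with(|n| n.set(n.get() + 1));
    FakeStatement { sql: Rc::from(sql), uses: 0 }
}

fn prepare_calls() -> u32 {
    PREPARE_CALLS.with(|n| n.get())
}

// Stand-in for a callback body: prepares on first use, reuses afterwards.
// Returns how many times the cached statement has been used so far.
fn on_balance_requested() -> u32 {
    BALANCE_STMT.with(|slot| {
        let mut slot = slot.borrow_mut();
        let stmt = slot
            .get_or_insert_with(|| prepare("SELECT balance FROM accounts WHERE id = ?"));
        stmt.uses += 1;
        stmt.uses
    })
}

fn main() {
    on_balance_requested();
    on_balance_requested();
    println!("prepared {} time(s)", prepare_calls());
}
```

No Mutex, Arc, Send or Sync is involved: a thread_local! value is only ever reachable from the thread that owns it, so the compiler has nothing to object to.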

I appreciate both your replies, but what I'm reading is that Rust code must be thread safe. My way or the highway. Well, to my way of thinking, there is still a lot of code that isn't parallel, concurrent or whatever you choose to call it, because it isn't suited to such treatment or doesn't require it. (I would point out to you that I ran the BBN Butterfly Lisp project -- the Butterfly was a 256-processor machine with a fast-FFT Butterfly interconnect -- and also worked at Thinking Machines, so the concept of parallel processing is not foreign to me.) But there are times when it isn't necessary. So imposing all kinds of restrictions on our code on the chance that we might someday think about having a meeting to discuss perhaps making an application multi-threaded makes absolutely no sense to me. This is like buying a Chevy Suburban because you take the family on vacation once/year, when most trips are one person going to the grocery store.

With Haskell's GHC, there is a -threaded flag, which makes all the multi-thread support available. To me, making this optional is sensible, and I think something like this ought to be considered for Rust, for the reasons I've outlined above.

Again, thanks for your help. I will abandon further efforts to port this application to Rust. I don't completely trust my C code, for all the reasons Rust was created, but at this point Rust's rigidity on this point creates too many implementation problems for me.

Well, that's not true. Quite the opposite, in fact, as the fact that Rust lets you clearly mark not-thread-safe things as such, to avoid accidents, is one of the great wins in Rust that I wish existed in other languages at other points in the control-productivity spectrum.

What you've actually seen is that

  1. Global variables must be thread-safe to be safe, which is totally reasonable. Of course there's unsafe if you want C's level of safety.
  2. Mutex, which is a threading library class, correctly marks itself as unusable from multiple threads if whatever's inside it isn't.

Nothing here prevents you from writing single-threaded code.

scottmcm https://users.rust-lang.org/u/scottmcm
January 5

donallen:

what I’m reading is that Rust code must be thread safe

Well, that’s not true. Quite the opposite, in fact, as the fact that Rust
lets you clearly mark not-thread-safe things as such, to avoid accidents,
is one of the great wins in Rust that I wish existed in other languages at
other points in the control-productivity spectrum.

What you’ve actually seen is that

  1. Global variables must be thread-safe to be safe, which is totally
    reasonable.

Yes, you are right, I overstated the case. But my point remains. Global
variables don't need to be thread safe if there's only one thread! So why
make my life difficult if I'm prepared to state that the application is
single-threaded? So I don't see that it is "totally reasonable". Am I
missing something? Is there signal-handling in Rust? Last time I looked,
there wasn't.

    Of course there’s unsafe if you want C’s level of safety.

If I wanted C's level of safety, I'd write it in C. Much simpler.

  2. Mutex, which is a threading library class, correctly marks itself
    as unusable from multiple threads if whatever’s inside it isn’t.

Yes, except what I am trying to do is perfectly safe in a single-threaded
environment. But I'm having to fight with the compiler, study
documentation, send messages like these, to try to do something that should
be simple when there is only one thread. In this case, Rust is extracting a
cost when there is no benefit.

I think the real answer lies in my original post. Rust is not a general
purpose programming language and I am misusing it by even experimenting
with it for this application. There is a mis-match between what Rust was
designed for and what my application requires.

Nothing here prevents you from writing single-threaded code.

Except I still haven't received a suggestion for how to do the caching of
prepared statements in callbacks (that, after all, is the whole point of
prepared statements in sqlite and other rdbs, to amortize the preparation
cost over multiple uses). So at this point I would say that I can't write
the single-threaded code I want. Again, I think this is just the wrong tool
for this job.

Again, thanks to all for the discussion, which I think has helped me find
my answer.

In a single-threaded program you can use thread_local! to have a global variable that doesn't need to be thread-safe. Or you can wrap any type in a struct and do unsafe impl Send for that struct and store it in a static.
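
For the second option, a sketch under stated assumptions might look like the following. RawStatement and AssumeSingleThread are made-up names, and the unsafe impl is a promise from the programmer that the compiler cannot verify, so this trades the compile error for a manual obligation. It also assumes Rust 1.63 or later, where Mutex::new can be used in a static initializer:

```rust
use std::ptr;
use std::sync::Mutex;

// Made-up stand-in for a statement that wraps a raw C pointer (hence !Send).
#[allow(dead_code)]
struct RawStatement {
    handle: *mut u8,
    uses: u32,
}

// Hypothetical wrapper whose only job is to carry the unsafe promise.
struct AssumeSingleThread(RawStatement);

// SAFETY: this is an assumption the compiler cannot check. It is only honest
// if the wrapped value is never actually used from a second thread (or the
// underlying C library documents that such use is fine).
unsafe impl Send for AssumeSingleThread {}

// Mutex<T> is Sync when T is Send, so this static is now accepted.
static CACHED: Mutex<Option<AssumeSingleThread>> = Mutex::new(None);

// Creates the cached value on first use and returns its use count.
fn use_cached() -> u32 {
    let mut slot = CACHED.lock().unwrap();
    let stmt = slot.get_or_insert_with(|| {
        AssumeSingleThread(RawStatement { handle: ptr::null_mut(), uses: 0 })
    });
    stmt.0.uses += 1;
    stmt.0.uses
}

fn main() {
    use_cached();
    println!("used {} time(s)", use_cached());
}
```

The thread_local! route needs no unsafe at all, so it is probably the one to try first.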


The thing you’re missing is that the compiler and the libraries want to ensure your code is safe, not assume anything. In particular, it wants to ensure that if threads are added later that this code continues to be safe. So you have to convey the thread-localness via the types. In C it’s up to you to make sure this invariant holds, and the compiler is silent on any violations.

I think you’re just in the process of learning Rust, not misusing it. It’s not a hack language - getting something to compile and run with reckless abandon is not what it’s optimizing for. You have to appreciate what it’s trying to do (and why) or else you’ll continue to dislike it.

vitalyd https://users.rust-lang.org/u/vitalyd
January 6

donallen:

Global variables don’t need to be thread safe if there’s only one thread! So why
make my life difficult if I’m prepared to state that the application is
single-threaded? So I don’t see that it is “totally reasonable”. Am I
missing something? Is there signal-handling in Rust? Last time I looked,
there wasn’t.

The thing you’re missing is that the compiler and the libraries want to
ensure your code is safe, not assume anything. In particular, it wants
to ensure that if threads are added later that this code continues to be
safe. So you have to convey the thread-localness via the types. In C it’s
up to you to make sure this invariant holds, and the compiler is silent on any
violations.

No, I am not missing that at all. I completely understand it and have all
along. My issue is that this aspect of the design of Rust is making a
common case (single-threaded applications) much more difficult than need be
on the chance that multi-threadedness might pop up in the future. This is
precisely why I brought up the GHC -threaded compiler option. If this were
done in Rust, then if "threads were added later", to use them, I'd have to
enable the -threaded option and at that point, the compiler could begin to
do all the ensuring of thread-safety possible and I'd be grateful for that.
I just think it makes no sense for it to be getting in the way needlessly
on speculation about something that might or might not happen in the
future. Or, slightly different, if the focus of Rust is on multi-threaded
applications, then it's the wrong tool for single-threaded applications if
it imposes an additional cost in writing them for no benefit.

donallen:

I think the real answer lies in my original post. Rust is not a general
purpose programming language and I am misusing it by even experimenting
with it for this application. There is a mis-match between what Rust was
designed for and what my application requires.

I think you’re just in the process of learning Rust, not misusing it. It’s
not a hack language - getting something to compile and run with reckless
abandon is not what it’s optimizing for. You have to appreciate what it’s
trying to do (and why) or else you’ll continue to dislike it.

To learn the language, I've ported three of the utilities in my finance
suite from Haskell to Rust. These are not "hacks". They are substantial
amounts of code, over 1000 lines. Actually, you were very helpful,
assisting me with problems I ran into along the way.

As for appreciating what it is trying to do, I'm quite sure that I do. I
won't reiterate how long I've been doing this and how many programming
languages I've used over the years, but I've written an awful lot of code
in my life and programming languages are a particular interest. But I think
you would agree that no programming language is suitable for every
situation, every problem. I don't think Scheme or Haskell is suitable for
writing router firmware. Rust is. If I were going to write an operating
system, I would certainly consider Rust and not the others. But I think
Rust clearly is not suitable for writing applications like my finance
application, which has no real-time constraints, can tolerate garbage
collection pauses that may never happen on today's memory- and
address-space-rich machines, and which, by its nature, is single-threaded.
Other languages are as good as Rust at ensuring correctness at compile- and
run-time (C is not one of them, but Haskell is) without exacting a price
for things that don't matter in this situation.

But the way to address that is trivial (once you know about it) with the thread_local RefCell, isn’t it?

To be clear, I didn’t mean that your programs are hacks. I meant it’s, essentially, not a prototyping language like a Python might be. Python will run virtually any code you throw at it. C is similar for a compiled language.

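I would agree that no programming language is suitable for every situation.

But I don’t think I agree that Rust is clearly unsuitable for an application like yours, although I understand where you’re coming from. As I mentioned I probably wouldn’t use Rust for some prototype/POC/throwaway code.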

If performance isn’t of serious concern then you’re right there are other options. Then it comes down to what else the languages offer and what you personally find valuable. By definition that’s subjective and universal agreement will be hard to find.

I would, however, wait until you know the language better, and get more experience with it, before reaching a conclusion. And Rust is not a finished frozen product - it’s being evolved at a fairly quick clip and the learning curve is a known sore spot. Hopefully some of that will get easier over time. No language is perfect and all.

vitalyd https://users.rust-lang.org/u/vitalyd
January 6

donallen:

My issue is that this aspect of the design of Rust is making a
common case (single-threaded applications) much more difficult than need be
on the chance that multi-threadedness might pop up in the future.

But the way to address that is trivial (once you know about it) with the
thread_local RefCell, isn’t it?

Ah, looks promising. Looks like thread_local is the -threaded option I
wanted on a more microscopic level. I will try it.

donallen:

To learn the language, I’ve ported three of the utilities in my finance
suite from Haskell to Rust. These are not “hacks”. They are substantial
amounts of code, over 1000 lines. Actually, you were very helpful,
assisting me with problems I ran into along the way.

To be clear, I didn’t mean that your programs are hacks. I meant it’s,
essentially, not a prototyping language like a Python might be. Python will
run virtually any code you throw at it. C is similar for a compiled language.

I understand. That Rust is not for quick hacks is pretty evident :)
Actually, I personally would not use C for that sort of thing either. I
tend to favor tcl, because I've used it for years and if you don't push it
beyond what Ousterhout designed it for, it can be very useful, though the
programs are a bit ugly. But not as ugly as perl code, in my opinion. I've
also written a fair amount of Python, too, but for quick stuff, I still
prefer tcl.

donallen:

But I think you would agree that no programming language is suitable for every
situation, every problem.

I would.

donallen:

But I think Rust clearly is not suitable for writing applications like my finance
application, which has no real-time constraints, can tolerate garbage
collection pauses that may never happen on today’s memory- and
address-space-rich machines, and which, by its nature, is single-threaded.
Other languages are as good as Rust at ensuring correctness at compile- and
run-time (C is not one of them, but Haskell is) without exacting a price
for things that don’t matter in this situation.

But I don’t think I agree with this, although I understand where you’re
coming from. As I mentioned I probably wouldn’t use Rust for some
prototype/POC/throwaway code.

If performance isn’t of serious concern then you’re right there are other
options.

On today's hardware, we frequently obtain adequate performance with
interpreted languages. We don't need a Ferrari to go to the grocery store.
And sometimes the code we write isn't the bottleneck. One of the utilities
I spoke of previously is a report generator that picks over my financial
database and generates Latex to give me the reports I need. Depending on
the machine I use, this program takes about a minute in Rust. It takes
about a minute in Haskell. I'm not arguing that Haskell code is as fast as
Rust, but in this case it is, because most of the time is spent in sqlite
(the programs are processor-limited in both languages and this particular
program runs in two threads, so it's not I/O that's determining the
run-time, it's processor time in sqlite).

But if performance turns out to be an issue (after thinking hard about the
80-20 rule and perhaps doing some prototyping in one of the "hack"
languages), then certainly Rust is a leading candidate (I'd certainly
choose it ahead of C or C++, particularly the latter, which I think is a
uniquely awful programming language).

Then it comes down to what else the languages offer and what you
personally find valuable. By definition that’s subjective and universal
agreement will be hard to find.

Absolutely.

I would, however, wait until you know the language better, and get more
experience with it, before reaching a conclusion. And Rust is not a
finished frozen product - it’s being evolved at a fairly quick clip and the
learning curve is a known sore spot. Hopefully some of that will get easier
over time. No language is perfect and all.

The learning curve is steep even if the language were to hold still. The
moving target makes it steeper, but I understand why this is the case. Rust
is a major innovation, so getting it right is more of a challenge. Fewer of
Newton's giants.


Well, I had a crack at thread_local and while it works, it's a bit ugly. So
my thought was to use this just for the caching of prepared statements that
are local to particular functions, but avoid the use of global statics for
other things this application needs (callbacks need to access the
connection object to prepare statements, there are a few hashtables for
keeping track of various things, etc.). The idea was to create a globals
struct in the main routine, which lives until the application dies, and use
RefCell for mutable globals (like the hash tables), rather than declaring
them in the struct as mut, which then avoids passing a mutable reference to
the rest of the world. The reference would need to be captured by closures
in callbacks and if they are mutable, I have hard-earned knowledge that
this causes trouble in Rust.

Except now I find that because a callback closure captures the immutable
reference to the globals, the compiler is upset because the closure isn't
static (the closure gets passed to gtk when you enable a signal handler). I
have no idea how to indicate that a closure is static and a little googling
suggests that it may not be possible (I found some discussion about this
that was a couple of years old, so the issue may be fixed?).

I'm going to call it a day. This is a major time sink and I do have
alternatives, as we've discussed, which includes just continuing with the C
implementation of the application that we are discussing. It works, it's
fast, valgrind finds no memory leaks, so maybe the best idea is to use this
time to read a few good books :)

Thanks, Vitaly. Your knowledge and helpfulness are a real asset to this
project. If I find myself in a situation that I think calls for Rust and
get into trouble, I'll get in touch.

/Don

The easiest way to achieve this case is Rc<RefCell<HashMap<...>>> and then move a clone of the Rc into the closure. Have you tried that?
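
A minimal sketch of the idea, with the toolkit replaced by a made-up Button type so it runs on its own. Button, connect_clicked and click are hypothetical stand-ins for a GTK widget and its signal registration, not the gtk-rs API. The only property they share with the real thing is that the stored closure must be 'static:

```rust
use std::cell::RefCell;
use std::collections::HashMap;
use std::rc::Rc;

// Made-up stand-in for a widget: like a GTK signal handler, the stored
// closure must be 'static, so it cannot borrow locals of the caller.
struct Button {
    handlers: Vec<Box<dyn Fn() + 'static>>,
}

impl Button {
    fn new() -> Self {
        Button { handlers: Vec::new() }
    }

    fn connect_clicked<F: Fn() + 'static>(&mut self, f: F) {
        self.handlers.push(Box::new(f));
    }

    fn click(&self) {
        for handler in &self.handlers {
            handler();
        }
    }
}

// Registers a handler that mutates shared state, clicks twice, and returns
// the click count as seen from outside the closure.
fn click_twice_and_count() -> u32 {
    let counts: Rc<RefCell<HashMap<String, u32>>> = Rc::new(RefCell::new(HashMap::new()));
    let mut button = Button::new();

    // The closure owns its own Rc handle, so it borrows nothing and is 'static.
    let counts_for_handler = Rc::clone(&counts);
    button.connect_clicked(move || {
        *counts_for_handler
            .borrow_mut()
            .entry("clicks".to_string())
            .or_insert(0) += 1;
    });

    button.click();
    button.click();

    let n = counts.borrow()["clicks"];
    n
}

fn main() {
    println!("clicks recorded: {}", click_twice_and_count());
}
```

The original handle stays usable after the clone is moved into the closure, and the RefCell provides the mutation without any `mut` reference being captured.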

No problem :) . Maybe check back on Rust in some time in case it gets easier.

vitalyd https://users.rust-lang.org/u/vitalyd
January 6

donallen:

Except now I find that because a callback closure captures the immutable
reference to the globals, the compiler is upset because the closure isn’t
static (the closure gets passed to gtk when you enable a signal handler). I
have no idea how to indicate that a closure is static and a little googling
suggests that it may not be possible (I found some discussion about this
that was a couple of years old, so the issue may be fixed?).

The easiest way to achieve this case is Rc<RefCell<HashMap<...>>> and
then move a clone of the Rc into the closure. Have you tried that?

You know, I'm sitting here trying to read my book and you keep interrupting
me with good ideas. Very annoying :)

I knew about Rc and Arc, but didn't think of using them. This is an
illustration of the fact that Rust is a different animal and it doesn't
matter that I've been doing this for 100 years, because I've never seen
anything like Rust and I'm not good at it yet. I had the same experience
with learning Haskell, though I don't think Haskell is as difficult as Rust
if you don't try to understand monads, just learn how to use them (and the
'do' syntax).

Anyway, I hadn't tried your suggestion, but just did and of course it
works. The handling of the globals (especially with regard to capturing
them in callback closures) and caching of prepared queries was a major
obstacle in getting anywhere with this application and I think you've
gotten me over that hump, so maybe I'll set the book aside and stick around
a little longer, to keep you busy.


Glad that helped! (Sorry about the book distraction :) )

The Rc<RefCell<...>> is a common technique for closure-heavy APIs (where closures are “long lived”), so it’s good you’re aware of it now. For example, building servers using tokio virtually requires it (which is a bit annoying but I digress).


I did some more work on this over the weekend and with the weapons you've
equipped me with, the language and its enforcer, the compiler, are no
longer the enemy. The issue is the gtk-rs crate, which isn't quite there
yet, at least with respect to my application. Too many of the items in the
gtk api that I use are still unimplemented. So I have no choice but to put
this on hold, or start filling in the holes in gtk-rs myself, which I have
no desire to do.

Thanks again for all the help.

/Don
