Rust objects: aCId => ACId => ACID?

Using the database definition of ACID, one can argue:

Rust objects are aCId

  • Isolation: two &mut references to the same object cannot exist at once, so each sequence of mutations happens in isolation

  • Consistency: a &mut cannot exist while any shared & reference exists, so all reads are consistent

  • It's not really Atomic: in the middle of a &mut self function something bad might happen, and we return an Err with half-written state

  • Definitely not durable.
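
To make the atomicity point concrete, here is a minimal sketch of a &mut self method that returns an Err after it has already mutated part of the state. The `Accounts` type and its `transfer` method are hypothetical, invented only for illustration.

```rust
use std::collections::HashMap;

/// Hypothetical example type: balances keyed by account name.
#[derive(Debug, Default)]
pub struct Accounts {
    pub balances: HashMap<String, i64>,
}

impl Accounts {
    /// Moves `amount` from `from` to `to`.
    /// Not atomic: the debit is applied before the credit side is validated.
    pub fn transfer(&mut self, from: &str, to: &str, amount: i64) -> Result<(), String> {
        let src = self
            .balances
            .get_mut(from)
            .ok_or_else(|| format!("no such account: {}", from))?;
        *src -= amount; // first write has already happened

        let dst = self
            .balances
            .get_mut(to)
            .ok_or_else(|| format!("no such account: {}", to))?; // early return: the debit is not undone
        *dst += amount;
        Ok(())
    }
}
```

If `to` does not exist, the caller gets an Err but `from` has already been debited. The borrow checker guarantees nobody else observes the object during the call, yet it says nothing about the state left behind on failure.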

Haskell / Erlang / Elixir / Rust using only immutable-rs are ACId

  • Using a coding style that relies purely on persistent data structures with a "pointer swap" at the root guarantees Atomicity: if a function fails, there are no partial updates; if it succeeds, it returns a new good value.

So here, by restricting ourselves to using only immutable-rs, we can 'boost' Rust objects from aCId to ACId.

SQL is ACID (in a properly configured DB); Rust + ??? has ACID objects?

In a properly configured DB, after SQL statements are executed, the updates are durable.

Question: besides the "use an ORM + database" crates, are there any other crates / techniques that promote Rust objects to "ACID"?

Note: restricting to immutable-rs already takes Rust objects from aCId to ACId, without depending on an external DB. Are there crates that take us from ACId to ACID without using an external DB?

You may want to look at Prevayler, "the simplest and fastest way to provide ACID persistence".

https://prevayler.org/

As you mentioned, you can get atomicity by designing your code to use immutable types and pointer swaps. Alongside immutable-rs, the arc_swap crate is a good one to mention here.
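
As a rough illustration of the "immutable value plus pointer swap" idea, here is a sketch that uses only the standard library (a `Mutex<Arc<State>>`) instead of arc_swap, so it makes no claims about that crate's API. The `State` and `Store` types are hypothetical.

```rust
use std::sync::{Arc, Mutex};

/// Hypothetical application state; a real program might hold persistent collections here.
#[derive(Clone, Debug, PartialEq)]
pub struct State {
    pub counter: i64,
}

/// Holds the current root. Readers clone the Arc and keep a stable snapshot.
pub struct Store {
    root: Mutex<Arc<State>>,
}

impl Store {
    pub fn new(initial: State) -> Self {
        Store { root: Mutex::new(Arc::new(initial)) }
    }

    /// Cheap snapshot: bumps a reference count, copies no data.
    pub fn snapshot(&self) -> Arc<State> {
        let guard = self.root.lock().unwrap();
        Arc::clone(&*guard)
    }

    /// Runs `f` against the current state and swaps the root only on success.
    pub fn update<E>(&self, f: impl FnOnce(&State) -> Result<State, E>) -> Result<(), E> {
        let mut root = self.root.lock().unwrap();
        let next = f(&**root)?; // on Err we return here and the old root stays in place
        *root = Arc::new(next); // the "pointer swap"
        Ok(())
    }
}
```

Snapshots taken before an update keep pointing at the old value, so readers never see a half-applied change. Holding the lock while `f` runs serializes writers, which is the simplest choice rather than the fastest one.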

My interpretation of "durability" is that if the application says it has done an operation, I could yank out the power cord and know I won't lose anything... You could probably create a library that does operations within a certain scope and if the operation succeeds it'll be committed to disk, otherwise we'll roll back to a previous snapshot. But at that point we've just implemented a poor man's SQL transaction, haven't we?

Why are you trying to achieve database functionality without using a database? What is wrong with using a database, what are you trying to do?

Not full DB functionality. Just "ACID objects".

Hackability. When is the last time you made a significant patch to MySQL or PostgreSQL? I have never made a patch to either. They're open source but opaque to me.

Rust + immutable.rs already brings us to ACId -- and is very hackable. I am wondering if there is some mmap trick + restrictions on Drop handlers that can do ACId -> ACID -- in an easily hackable / modifiable way.

I don't think you can make correct, full-fledged ACID "easy" to hack on. Pretty much the only part of the complexity you can avoid in a DB is the server part, which is not strictly necessary. But pretty much everything else (self-balancing persistent trees, index consistency, transaction handling, efficient I/O) is necessary and quite hard to get right.

I'm wondering what specific modifications you have in mind. Maybe you could solve your problems by wrapping SQLite in a strongly-typed layer?

I suspect "my goals" != "what you think my goals are". I do NOT want to build a full fledged ACID database. I just want easy / simple durability.

immutable-rs is a very simple way to add immutable data structures to a codebase

I'm looking for a simple way to add (fast) durability. A slow solution would be to dump out everything via serde after every modification; a faster way (though not an obvious one to me) would be to somehow write all the changes to Rust objects to a log.

I think the main disagreement here is that your reading seems to be:

  • ACID objects implies databases

whereas my argument is:

  • databases satisfy ACID; however
  • one can achieve ACID objects without all the complexity of databases

OK, then what is "ACID objects"? Do you have something concrete in mind, e.g. a desired API and corresponding operational semantics?

In databases, ACID describes the behavior of the database with respect to transactions. What's a transaction when you're talking about just... regular objects?

Haskell's AcidState was nearly ported to Rust. It would have given you strong ACID guarantees on arbitrary rust objects.

It's a sequence of fallible operations such that any of those operations failing leaves things as if none of them had been attempted.

At least, that's the 'A' part; you can write similar statements for 'C', 'I', and 'D'.

I admit upfront my notation is a bit sloppy. Let us consider the following problem. Suppose we are writing a tcp server that takes commands and does stuff:

while let Some(cmd) = get_command() {
  self.exec_cmd(cmd);
}

I'm going to assume you agree with the earlier notion: can't have two &mut => Isolation; can't have &mut while holding a shared & => Consistency.

Now let's examine Atomicity. Suppose we want the following requirement: each Cmd either gets executed fully or not at all. I.e. if it cannot execute to completion, it should not leave the system in some weird, inconsistent partial state. If a Cmd fails, it should act as if it never happened.

Normally, this is not easy to do, but if we are using immutable-rs, we can do something like this:

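while let Some(cmd) = get_command() {
  // Finish the read borrow first: a Ref held across the match would make
  // borrow_mut() below panic at runtime.
  let result = calc_update(&old_state.borrow(), cmd);
  match result {
    Ok(new_state) => *old_state.borrow_mut() = new_state, // root swap
    Err(_) => {} // failed Cmd: old_state is left untouched
  }
}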

So the point here is that we use immutable data structures everywhere: if the Cmd goes through, we do a 'root swap', and if it fails, no state is changed.

If you agree with me so far, then we are playing with ACId objects/state in Rust already.

Furthermore, we have achieved ACId without using an external database -- by just using immutable-rs and restricting our programming style a bit.

======================================

Okay now, suppose we want an additional feature. While the program is running, we're going to unplug the machine. Then, when the machine boots back up, it continues running.

To do this, at the end of executing each Cmd we need to somehow serialize everything to disk to make it Durable.

The obvious (but slow) solution would be to just dump all state to disk via serde.
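
A minimal sketch of that "dump everything" approach, using only the standard library and a hand-rolled tab-separated format in place of serde. The `State` alias, the file format, and the function names are all assumptions made for illustration. The write goes to a temporary file that is synced and then renamed over the old snapshot.

```rust
use std::collections::BTreeMap;
use std::fs::{self, File};
use std::io::{self, BufRead, BufReader, Write};
use std::path::Path;

/// Hypothetical state: a map from keys (containing no tabs or newlines) to integers.
pub type State = BTreeMap<String, i64>;

/// Writes the whole state to a temporary file, syncs it, then renames it
/// over `path`, so readers see either the old snapshot or the new one.
pub fn save_snapshot(state: &State, path: &Path) -> io::Result<()> {
    let tmp = path.with_extension("tmp");
    let mut file = File::create(&tmp)?;
    for (key, value) in state {
        writeln!(file, "{}\t{}", key, value)?;
    }
    file.sync_all()?; // ask the OS to flush file contents to the device
    fs::rename(&tmp, path)?; // replaces the old snapshot in one step on POSIX filesystems
    Ok(())
}

/// Reads a snapshot written by `save_snapshot`.
pub fn load_snapshot(path: &Path) -> io::Result<State> {
    let mut state = State::new();
    for line in BufReader::new(File::open(path)?).lines() {
        let line = line?;
        let (key, value) = line
            .split_once('\t')
            .ok_or_else(|| io::Error::new(io::ErrorKind::InvalidData, "missing tab"))?;
        let value = value
            .parse::<i64>()
            .map_err(|e| io::Error::new(io::ErrorKind::InvalidData, e))?;
        state.insert(key.to_string(), value);
    }
    Ok(state)
}
```

The cost is proportional to the size of the whole state on every Cmd, which is why this is the slow option. A stricter version would probably also sync the parent directory after the rename; that step is left out here.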

A better solution (though I don't know of any crate that does this yet) would be to somehow record the diffs that the Cmd caused on the in-memory Rust objects, and write just those diffs to disk as a log. (Basically, the 'WAL' in database terminology.)
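
One way such a log could look, sketched with the standard library only. Note the swap: instead of recording structural diffs of Rust objects, this version logs the commands themselves and replays them through a pure transition function (roughly the approach Prevayler takes). The `Cmd` enum, the text record format, and the `Wal` type are hypothetical.

```rust
use std::fs::{File, OpenOptions};
use std::io::{self, BufRead, BufReader, Write};
use std::path::Path;

/// Hypothetical command set for a state that is a single counter.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Cmd {
    Add(i64),
    Reset,
}

/// Pure transition function: the same log always replays to the same state.
pub fn apply(state: i64, cmd: Cmd) -> i64 {
    match cmd {
        Cmd::Add(n) => state + n,
        Cmd::Reset => 0,
    }
}

pub struct Wal {
    file: File,
}

impl Wal {
    /// Opens (or creates) the log in append mode.
    pub fn open(path: &Path) -> io::Result<Wal> {
        let file = OpenOptions::new().create(true).append(true).open(path)?;
        Ok(Wal { file })
    }

    /// Appends one record and syncs it before returning. Only after this
    /// succeeds should the caller apply the command in memory and acknowledge it.
    pub fn append(&mut self, cmd: Cmd) -> io::Result<()> {
        match cmd {
            Cmd::Add(n) => writeln!(self.file, "add {}", n)?,
            Cmd::Reset => writeln!(self.file, "reset")?,
        }
        self.file.sync_data()
    }
}

/// Rebuilds the state after a restart by replaying every logged command.
pub fn replay(path: &Path) -> io::Result<i64> {
    let mut state = 0;
    for line in BufReader::new(File::open(path)?).lines() {
        let line = line?;
        let cmd = match line.split_once(' ') {
            Some(("add", n)) => Cmd::Add(
                n.parse::<i64>()
                    .map_err(|e| io::Error::new(io::ErrorKind::InvalidData, e))?,
            ),
            None if line == "reset" => Cmd::Reset,
            _ => return Err(io::Error::new(io::ErrorKind::InvalidData, "bad record")),
        };
        state = apply(state, cmd);
    }
    Ok(state)
}
```

This sketch treats a torn final record (a crash in the middle of a write) as an error; a real log would likely add a length or checksum per record and truncate the incomplete tail. It also never compacts the log, which would normally be paired with periodic snapshots.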

I do admit that we are borrowing ideas (like WAL) from database techniques, but in the end, what we get is something that provides 'ACID state/objects' in Rust -- without pulling in all the complexity of a database.

This is interesting. Any insights on why it was not completed / where the partial work is?

Okay, thanks, I now understand your requirements and what "transactional" means in this setting. It makes sense.

However, I still think that when it comes to persisting the state, you would be better off using a real, lightweight database (maybe SQLite, maybe just a key-value store). Doing persistence correctly is hard, or at least it's a lot of work, so you definitely don't want to invent your own file format and algorithms for that.

As you demonstrated, you can already solve the atomicity/consistency/isolation problems pretty easily using tools provided by the standard library. I think what would be valuable to focus your efforts on is mapping objects' state and the diff/replay logic onto a battle-tested database. For example a single table with columns "entity, sequence, diff" is a reasonable starting point.
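
To illustrate the shape of that "entity, sequence, diff" table, here is an in-memory sketch of the rows and the replay step. It does not talk to any database; the `Row` and `Diff` types are hypothetical, and a diff is assumed to be either an absolute set or a relative add on an integer.

```rust
use std::collections::BTreeMap;

/// Assumed diff vocabulary: overwrite the value or adjust it.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Diff {
    Set(i64),
    Add(i64),
}

/// One row of the hypothetical "entity, sequence, diff" table.
#[derive(Debug, Clone, PartialEq)]
pub struct Row {
    pub entity: String,
    pub sequence: u64,
    pub diff: Diff,
}

/// Replays rows in (entity, sequence) order to rebuild each entity's value.
pub fn rebuild(rows: &[Row]) -> BTreeMap<String, i64> {
    let mut sorted: Vec<&Row> = rows.iter().collect();
    sorted.sort_by_key(|r| (r.entity.clone(), r.sequence));

    let mut state = BTreeMap::new();
    for row in sorted {
        let current = state.entry(row.entity.clone()).or_insert(0i64);
        *current = match row.diff {
            Diff::Set(v) => v,
            Diff::Add(n) => *current + n,
        };
    }
    state
}
```

With a real database, the sort would become an ORDER BY on the two key columns, and inserting the rows for one Cmd inside a single transaction would be what provides durability and atomicity.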

I agree with you that, for production use, the right technical decision is to use an existing database.

However, to quote George Bernard Shaw:

The reasonable man adapts himself to the world: the unreasonable one persists in trying to adapt the world to himself. Therefore all progress depends on the unreasonable man.

Given that we already have ACId without using a database, I am curious what crazy crates "unreasonable" Rust programmers have come up with, and how close they can get us to ACID w/o databases.

Going a bit philosophical, I think it is a good thing that compilers are not these undecipherable monoliths -- that we have parsing libraries / JIT libraries that can be used outside of compilers. Similarly, I think there is value in being able to disassemble database techniques into separate components and pick/choose what to use (rather than a binary decision of 'serialize data to MySQL/Postgres? yes/no').

I don't think databases are "undecipherable" or that they are worse than compilers in terms of design. There are reusable components even in the Rust ecosystem which are suitable building blocks for writing databases. For instance:

  • sqlparser is just what it says on the tin.
  • sled is a high-performance persistent BTree.

What I am saying is: if what you want is reliable and correct persistence, then use a database. If you want to study a database or write one yourself from scratch for educational purposes, then of course go ahead and do that. It just doesn't seem like you need to do that, given the requirements you have.

I think part of the confusion was my sloppy usage of the word 'database'. By 'database', I had 'production database -- i.e. MySQL / Postgres' in mind -- and although those code bases are well engineered, I have not made much progress in reading them.

Projects similar to sled are precisely the types of projects I am interested in.

How To Corrupt An SQLite Database File may be an interesting read wrt durability (the easier problem I think) and Atomic Commit In SQLite wrt atomicity (the harder problem).
