Current solutions for key value stores

Dear all,

I'm searching for a key-value database. I don't need network access, but I would like to access the DB from multiple processes.

LMDB

I have stumbled upon LMDB, which seems to be a lightweight solution with a lot of features that I could use, in particular transaction support and concurrent access from multiple processes. See their homepage.

Using LMDB through the "rkv" crate in Rust

However, looking for a Rust binding, I saw a warning in the README of the rkv crate:

Warning

To use rkv in production/release environments at Mozilla, you may do so with the "SafeMode" backend, for example:

[…]

The "SafeMode" backend performs well, with two caveats: the entire database is stored in memory, and write transactions are synchronously written to disk (only on commit).

In the future, it will be advisable to switch to a different backend with better performance guarantees. We're working on either fixing some LMDB crashes, or offering more choices of backend engines (e.g. SQLite).

I tried to research what these crashes are about. I found this, but I'm not sure if it is the (only) problem.

Other people seem to be confused about this warning as well; see the following issue in rkv's bugtracker:

serefarikan commented on Jun 19

I've been looking for a rust binding to LMDB and I thought rkv may be a good candidate given it is under mozilla.

However, the recommendations in the README for production uses are quite conservative (full db in memory and synched transactions) and furthermore, there are references to LMDB crashes to be fixed.

It would be great if the readme provided references to what these crashes are, since that statement made me question my positive view of LMDB's stability.

Given the other projects where LMDB is used, I have a hard time believing this is a problem of LMDB, but I'd rather suspect it's the Rust bindings (rkv). Does anyone know something about this?

The above issue hasn't received a comment in over 6 months, and there haven't been any commits to rkv's repository within a year.

Other Rust bindings for LMDB

There are some other bindings for LMDB, but their last releases are several years old:

Any crate I could reasonably use or that someone can recommend?

jammdb

I stumbled upon jammdb (mentioned in this post), which started as a port of a database that was inspired by LMDB. But I have no idea if it's suitable for production use or whether it is actively maintained. At least the most recent version is less than a year old.

sled

Searching further, I found sled mentioned in this post. sled's repository shows commits from a couple of months ago. But the database is still considered beta, I guess? In the README at the time of writing this post, the author lists under priorities:

  • the 1.0.0 release date is imminent! just putting the final touches on, while performing intensive testing

The README of the latest release said:

  • the 1.0.0 release date is January 19, 2021 (sled's 5th birthday)

There's no version 1.0.0 yet, but maybe it's just a tiny bit of waiting? Anyway, as far as I understand, it doesn't seem to support concurrent openings of the database, as the most recent version (in the repository) of the README says:

sled does not support multiple open instances for the time being

I would assume this also means I cannot open the database from several processes at the same time.

What to use?

Which of these (or other) solutions for a lightweight key-value store would/could you recommend?

If you have enough memory to keep all keys in memory: Bitcask (see Wikipedia) is great.

https://aerospike.com/ is also on my to-study list.

Increasingly, I am fascinated by the simplicity and performance of solutions where (key, u64=loc on disk) can be stored in memory:

  • read = memory lookup, 1 SSD disk read
  • write = write to some block, write the update of the k -> loc map to disk, update the in-memory k -> loc map

There is the downside of a gc / compact phase, but even that seems simple compared to some of the other algorithms.
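To make the (key, loc on disk) idea concrete, here is a minimal sketch using only the standard library. Everything in it is my own assumption for illustration: the names `LogStore`, `create`, `put`, and `get`, and the record layout (an 8-byte little-endian length followed by the value). It is not how Bitcask or any particular crate does it. It also leaves out persisting the index to disk and the gc/compact phase, so overwritten values simply stay in the log as garbage.

```rust
use std::collections::HashMap;
use std::fs::{File, OpenOptions};
use std::io::{self, Read, Seek, SeekFrom, Write};
use std::path::Path;

/// Append-only value log plus an in-memory map from key to file offset.
/// Hypothetical type for illustration; not the API of Bitcask or any crate.
pub struct LogStore {
    file: File,
    /// key -> offset of the record holding the latest value for that key
    index: HashMap<Vec<u8>, u64>,
}

impl LogStore {
    /// Creates a new, empty log at `path`, truncating any existing file.
    /// (Rebuilding the index from an existing log is omitted here.)
    pub fn create(path: &Path) -> io::Result<Self> {
        let file = OpenOptions::new()
            .read(true)
            .write(true)
            .create(true)
            .truncate(true)
            .open(path)?;
        Ok(LogStore { file, index: HashMap::new() })
    }

    /// Write: append a record `[value length as u64 LE][value bytes]`,
    /// then point the in-memory index at the new record.
    pub fn put(&mut self, key: &[u8], value: &[u8]) -> io::Result<()> {
        let offset = self.file.seek(SeekFrom::End(0))?;
        self.file.write_all(&(value.len() as u64).to_le_bytes())?;
        self.file.write_all(value)?;
        self.file.sync_data()?; // flush the record to disk before indexing it
        self.index.insert(key.to_vec(), offset);
        Ok(())
    }

    /// Read: one in-memory lookup, then one positioned read from the file.
    pub fn get(&mut self, key: &[u8]) -> io::Result<Option<Vec<u8>>> {
        let offset = match self.index.get(key) {
            Some(&offset) => offset,
            None => return Ok(None),
        };
        self.file.seek(SeekFrom::Start(offset))?;
        let mut len_buf = [0u8; 8];
        self.file.read_exact(&mut len_buf)?;
        let mut value = vec![0u8; u64::from_le_bytes(len_buf) as usize];
        self.file.read_exact(&mut value)?;
        Ok(Some(value))
    }
}
```

A real store would also write the key into each record (or persist the map separately) so the index can be rebuilt after a restart.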

What about TiKV?

sled doesn't support multiple processes.

Currently I'm using sqlite. It somewhat works, and has proven reliability, but it is a performance bottleneck for me, so I'm curious to see if you find something better.

I think sanakirja fits, but IMO the API and docs are pretty incomprehensible. I got a simple example to work with concurrent access, but even in that simple example I had no idea what half the code I wrote was actually doing. The next thing I was going to try was rusqlite with the bundled feature.

I had considered TiKV too, but it comes with a server and network protocols. That made it seem less lightweight to me.

Also note that LMDB, for example, is relatively small (compressed source code ~170 KiB vs. ~3.9 MiB for TiKV).

After some web research, looking into the LMDB C API, and experimenting a bit with LMDB in C, I think I'll go for LMDB. Not sure about which Rust binding to use though, see further below.

LMDB maps the database into memory, and (if I understand correctly) a corrupted database file on the hard drive can cause crashes. That means it is not possible to open a database ("environment" in LMDB's terminology) without unsafe, as you have to ensure that the database file has not been tampered with, for example.

Nonetheless, the approach doesn't seem to be unreasonable, and it isn't a blocker for me, as long as I know about it.

Consequently, the lmdb-zero crate doesn't offer an (entirely) safe interface. Its documentation for lmdb_zero::EnvBuilder::open states:

Unsafety
It is the caller's responsibility to ensure that the underlying file will be used properly. Since LMDB is built on memory-mapping, these responsibilities run quite deep and vary based on flags.

Generally, lmdb-zero's documentation seems to be quite verbose and several design decisions appear well-reasoned, see its documentation on docs.rs (don't forget to click on "Expand description").

lmdb-zero's repository has the last commit on April 22nd, 2018.

Well, it's always a shame when projects go without commits for too long, but considering they're bindings that isn't as much of a problem. Unless you're looking to have the cutting-edge version of LMDB, lack of activity is fine if there's documentation.

I thought the *-sys crates would provide bindings to the native liblmdb on my system. In that case, I wouldn't worry too much about old crates.

So I wondered if the *-sys crates really use the same LMDB that my operating system installed. They do not. Instead, they ship with old copies of LMDB's code:

The package manager of my operating system (FreeBSD) ships LMDB version 0.9.29.

I'm not even sure if LMDB changed its binary format. Does anyone have information on that?

Also, I would like to know if it's common practice to include a copy of the upstream source in the *-sys crates?

(Edit: Removed my comments on getting errors when updating the database. Looks like my database was in a wrong state.)


After finding the changelog of LMDB, I'm getting a bit nervous, seeing which fixes are missing in the Rust crates (an excerpt):

  • Fix MDB_DUPSORT alignment bug (ITS#8819)
  • Fix delete behavior with DUPSORT DB (ITS#8622)
  • Fix mdb_cursor_get/mdb_cursor_del behavior (ITS#8722)
  • ITS#8756 Fix loose pages in dirty list
  • ITS#9007 Fix loose pages in WRITEMAP
  • ITS#9278 fix robust mutex cleanup for FreeBSD

I haven't looked deeper into these, but the thought of using a database in production where such bugs aren't fixed doesn't let me sleep well…

I am not sure what this implies. Is it possible to share memory between different processes? My guess would be no, but I don't know for sure. I certainly don't have answers (and really I have more questions than answers), but it seems like you need some kind of inter-process communication. I don't know what LMDB is doing "under the hood". Do you?

[ Oh, and my main thought was "why not use a single process" ? ]

It basically means the files are opened by two different programs running on the same machine. This involves using some sort of locking (not sure which mechanism LMDB uses, I think it's file locking or SYSV IPC).

To have different tools operating on the same dataset. Also to allow backups using a separate software/process.


The "Caveats" section in LMDB's documentation reveals some internals.

I would think it's easier to have the tools run under a single process ( if they are to be running at the same time, which I presume is what is wanted ).

I think I remember some recent discussion that this is problematic ( not supported in a uniform way by different operating systems ). My gut feeling is it would not work, or not work well. My gut feeling may of course be completely wrong!

I was also thinking of command-line tools, for example. If there is only a single process, then the command-line tools would need to communicate with the running process. It's possible, but I'm not sure if it always makes things easier. (It depends on the use case, I'd say.)

When network transparency is required, then having a (separate) server process is a necessity anyway.

I think you are right that there are certain issues. LMDB's documentation gives some examples where locking can be difficult:

On BSD systems or others configured with MDB_USE_POSIX_SEM, startup can fail due to semaphores owned by another userid.
[…]
Do not have open an LMDB database twice in the same process at the same time. Not even from a plain open() call - close()ing it breaks flock() advisory locking.
[…]
Do not use LMDB databases on remote filesystems, even between processes on the same host. This breaks flock() on some OSes, possibly memory map sync, and certainly sync between programs on different hosts.
[…]
Opening a database can fail if another process is opening or closing it at exactly the same time.

(Edit: Made second part a separate post, see below.)

As of now, I'm tempted to use the original LMDB C-implementation (from Howard Chu, Symas Corp.) and write my own Rust bindings. I don't want to use crates that include copies of the original source which then aren't updated for years. Hence doing it myself seems most reasonable.

My general belief is that a server process is necessary for "good" (well-behaved) concurrency, but I don't know what you mean by "network transparency" or why it results in that conclusion.

With "process" I meant an OS process, not a task or thread.

With "network transparency" I meant being able to move the server/storage part of a client-server application to a different host.

So what I meant is that when I want to be able to create a client/server application which runs over a network (i.e. where it's irrelevant where the server and the client run because they use TCP/IP to communicate with each other, for example), then I need an architecture where my database is a separate process (on a different machine, possibly).

When I just have a bunch of programs which run on the same host/machine, then I won't need something like TCP/IP, and I could either work with having multiple OS processes accessing the same database on file storage (assuming that writes are synchronized somehow, which is supported by LMDB), or I can have a single OS process which runs several threads. (Of course I could also install a database server locally and communicate with it through TCP/IP, even if it's on the same host/machine, but that makes handling a bit more complex sometimes.)
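As a rough sketch of the "single OS process which runs several threads" variant: the store lives in memory behind a lock and every thread gets a handle to it. The names `SharedStore` and `run_writers` are made up for this example, and a plain `BTreeMap` stands in for a real database; this shows nothing about LMDB itself.

```rust
use std::collections::BTreeMap;
use std::sync::{Arc, RwLock};
use std::thread;

/// A key-value map shared between the threads of one process.
/// Hypothetical type alias for illustration only.
pub type SharedStore = Arc<RwLock<BTreeMap<String, String>>>;

/// Spawns `writers` threads that each insert one key, waits for all of
/// them, and returns the number of entries in the store afterwards.
pub fn run_writers(store: &SharedStore, writers: usize) -> usize {
    let handles: Vec<_> = (0..writers)
        .map(|i| {
            let store = Arc::clone(store);
            thread::spawn(move || {
                // The write lock serializes writers; readers would call `read()`.
                let mut map = store.write().expect("lock poisoned");
                map.insert(format!("key{}", i), format!("value{}", i));
            })
        })
        .collect();
    for handle in handles {
        handle.join().expect("writer thread panicked");
    }
    store.read().expect("lock poisoned").len()
}
```

The synchronization here is handled entirely inside the process by `RwLock`, which is exactly what is not available once separate OS processes share the data.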

This is where I cannot really understand what LMDB is doing. I can imagine that the memory for the memory-mapped file is accessible to multiple processes, but how can the synchronisation be done? That is, telling one process "hold on a moment, the database is being updated"; in other words, inter-process locking. Well, maybe there is a mechanism, but I don't know what it is.

There is SYSV IPC, for example. Another way could be to use flock on a lock file on the file system.
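To give an idea of what locking through the file system can look like, here is a small sketch. Note that it does not use flock: it swaps in a simpler technique, a lock file that is created atomically with `create_new` and deleted on release, because that works with the standard library alone. The `LockFile` type and `try_acquire` are hypothetical names, and this is not the mechanism LMDB uses.

```rust
use std::fs::{self, OpenOptions};
use std::io;
use std::path::{Path, PathBuf};

/// Holds an exclusive lock for as long as the value is alive.
/// Hypothetical helper for illustration; this is not how LMDB locks.
pub struct LockFile {
    path: PathBuf,
}

impl LockFile {
    /// Tries to take the lock by atomically creating `path`.
    /// Returns `Ok(None)` if another holder already created the file.
    pub fn try_acquire(path: &Path) -> io::Result<Option<LockFile>> {
        match OpenOptions::new().write(true).create_new(true).open(path) {
            Ok(_file) => Ok(Some(LockFile { path: path.to_path_buf() })),
            Err(e) if e.kind() == io::ErrorKind::AlreadyExists => Ok(None),
            Err(e) => Err(e),
        }
    }
}

impl Drop for LockFile {
    /// Releases the lock by deleting the lock file.
    fn drop(&mut self) {
        let _ = fs::remove_file(&self.path);
    }
}
```

One weakness of this approach compared to flock: if the process holding the lock crashes, the lock file stays behind and has to be cleaned up by hand, whereas a flock lock goes away when the file descriptor is closed.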

OK, this seems to be specific to Linux. I have not used Unix for over 30 years, and I know even less about Linux.

Not sure how this works under Windows, but I'd assume there are similar mechanisms…

"IPC" (Interprocess Communication) is the keyword in general. Or maybe semaphores, or file-locking.