Something like sqlite, but multithreaded?

For crates.rs I'm using sqlite as the data storage, and it's perfect in many ways:

  • it's file based, so the data can be easily redistributed without any special import procedure
  • there are no daemons, accounts or sockets to configure
  • it's ubiquitous, so there are no extra install steps, and cargo run is all you need to get it working

There is, however, one deal-breaker: it's mostly single-threaded.

It has become the performance bottleneck of my project (almost all time is spent in sqlite, barely using more than 1 core). I've tried the usual speed tuning recommendations (disabling sync, journals, internal threads, shared cache, connection-per-thread, sharding over multiple db files). That has caused real data loss and tons of errors from locked tables, and it's still not nearly fast enough.

My dataset is relatively small: ~5GB, of which ~500MB is hot.

Is there something that has similar zero maintenance and zero install, easy data redistribution like sqlite, but works nicely with multithreaded programs?

If I were to use a real database I'd use postgres, but I'm considering the other extreme: using Serde to load a 500MB HashMap into memory and calling it a db.
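
For reference, the "HashMap as a db" idea could look roughly like the sketch below. It is only an illustration under stated assumptions: `MemDb` and `parallel_hits` are made-up names, the data is assumed to be plain string keys and values, and a tab-separated text file stands in for Serde so that the example needs nothing beyond the standard library.

```rust
use std::collections::HashMap;
use std::fs;
use std::io;
use std::path::Path;
use std::sync::{Arc, RwLock};
use std::thread;

/// Hypothetical in-memory "database": one map shared behind a read-write lock.
/// Cloning a `MemDb` clones the `Arc`, so all clones see the same data.
#[derive(Clone, Default)]
pub struct MemDb {
    inner: Arc<RwLock<HashMap<String, String>>>,
}

impl MemDb {
    /// Readers take a shared lock, so many threads can read at once.
    pub fn get(&self, key: &str) -> Option<String> {
        self.inner.read().unwrap().get(key).cloned()
    }

    /// Writers take an exclusive lock and briefly block readers.
    pub fn insert(&self, key: &str, value: &str) {
        self.inner
            .write()
            .unwrap()
            .insert(key.to_owned(), value.to_owned());
    }

    pub fn len(&self) -> usize {
        self.inner.read().unwrap().len()
    }

    /// Writes one "key<TAB>value" pair per line. This is a stand-in for Serde;
    /// it assumes keys and values contain no tabs or newlines.
    pub fn save(&self, path: &Path) -> io::Result<()> {
        let map = self.inner.read().unwrap();
        let mut out = String::new();
        for (key, value) in map.iter() {
            out.push_str(key);
            out.push('\t');
            out.push_str(value);
            out.push('\n');
        }
        fs::write(path, out)
    }

    /// Loads the whole file into memory; lines without a tab are skipped.
    pub fn load(path: &Path) -> io::Result<Self> {
        let text = fs::read_to_string(path)?;
        let map: HashMap<String, String> = text
            .lines()
            .filter_map(|line| line.split_once('\t'))
            .map(|(key, value)| (key.to_owned(), value.to_owned()))
            .collect();
        Ok(MemDb {
            inner: Arc::new(RwLock::new(map)),
        })
    }
}

/// Looks up `key` from `threads` threads at once and returns how many found it.
pub fn parallel_hits(db: &MemDb, key: &str, threads: usize) -> usize {
    let handles: Vec<_> = (0..threads)
        .map(|_| {
            let db = db.clone();
            let key = key.to_owned();
            thread::spawn(move || db.get(&key).is_some())
        })
        .collect();
    handles
        .into_iter()
        .filter_map(|handle| handle.join().ok())
        .filter(|hit| *hit)
        .count()
}
```

The data file stays a single redistributable file, and reads scale across cores because `RwLock` allows concurrent readers. The trade-offs are that there are no queries beyond key lookup, the whole map is rewritten on every save, and a crash between saves loses whatever was written since the last one.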

If you are fine with a key-value store, rocksdb might be an interesting option to consider. A bunch of Rust apps use it in production.

I've also had good experiences with RocksDB, but it does work at a "lower level" than SQLite which may or may not matter for your use case.

There's also the sled crate which is likewise a lower-level key/value store written in Rust.

rustbreak looks like a way to do roughly this for the whole 5GB, maybe.

Thanks. I'll try them out!

Post back what you find!
