For crates.rs I’m using sqlite as the data storage, and it’s perfect in many ways:
- it’s file based, so the data can be easily redistributed without any special import procedure
- there are no daemons, accounts or sockets to configure
- it’s ubiquitous, so there are no extra install steps, and `cargo run` is all you need to get it working
There is, however, one deal-breaker: it’s mostly single-threaded.
It has become the performance bottleneck of my project (almost all time is spent in sqlite, barely using more than 1 core). I’ve tried the usual speed tuning recommendations (disabling sync, journals, internal threads, shared cache, connection-per-thread, sharding over multiple db files). This has caused real data loss and tons of errors from locked tables, and it’s still not nearly fast enough.
My dataset is relatively small: ~5GB, of which ~500MB is hot.
Is there something with similar zero maintenance, zero install, and easy data redistribution like sqlite, but that works nicely with multithreaded programs?
If I were to use a real database I’d use postgres, but I’m considering the other extreme: using Serde to load a 500MB HashMap into memory and calling it a db.
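
To make the "HashMap as a db" idea concrete, here is a rough, std-only sketch of what the in-memory side might look like: a map behind an `RwLock`, shared between threads via `Arc`, so any number of readers can run in parallel. Everything here is hypothetical (`CrateInfo`, `MemDb`, `parallel_total` and the field names are made up for illustration), and the Serde load/save step is left out entirely since it depends on the chosen format crate.

```rust
use std::collections::HashMap;
use std::sync::{Arc, RwLock};
use std::thread;

// Hypothetical record type; the real fields would come from the actual data.
#[derive(Clone, Debug, PartialEq)]
struct CrateInfo {
    name: String,
    downloads: u64,
}

// In-memory "db": many concurrent readers, one writer at a time.
// Cloning a MemDb only clones the Arc, so all clones share the same map.
#[derive(Clone, Default)]
struct MemDb {
    inner: Arc<RwLock<HashMap<String, CrateInfo>>>,
}

impl MemDb {
    // Takes the write lock briefly to insert or overwrite one record.
    fn insert(&self, info: CrateInfo) {
        self.inner.write().unwrap().insert(info.name.clone(), info);
    }

    // Takes the read lock; readers do not block each other.
    fn get(&self, name: &str) -> Option<CrateInfo> {
        self.inner.read().unwrap().get(name).cloned()
    }
}

// Sums downloads for the given names, splitting the lookups across threads.
// Names that are not in the map are skipped.
fn parallel_total(db: &MemDb, names: &[String], workers: usize) -> u64 {
    let workers = workers.max(1);
    let chunk = ((names.len() + workers - 1) / workers).max(1);
    let handles: Vec<_> = names
        .chunks(chunk)
        .map(|part| {
            let db = db.clone();
            let part = part.to_vec();
            thread::spawn(move || {
                part.iter()
                    .filter_map(|n| db.get(n))
                    .map(|c| c.downloads)
                    .sum::<u64>()
            })
        })
        .collect();
    handles.into_iter().map(|h| h.join().unwrap()).sum()
}

fn main() {
    let db = MemDb::default();
    for (i, name) in ["serde", "rand", "libc"].iter().enumerate() {
        db.insert(CrateInfo {
            name: name.to_string(),
            downloads: (i as u64 + 1) * 100,
        });
    }
    let names: Vec<String> = ["serde", "rand", "libc", "missing"]
        .iter()
        .map(|s| s.to_string())
        .collect();
    println!("total = {}", parallel_total(&db, &names, 4));
}
```

The trade-off is that durability and redistribution become the program's job: the map would have to be serialized to a file on some schedule, and a crash between snapshots loses whatever was written since the last one.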