Offload values—from k/v map—to disk when memory threshold reached?

I just rewrote my data structure:

+ indexmap::IndexMap<String, String>
- indexmap::IndexMap<String, either::Either<String, Vec<u8>>>

Now I'm worried about the value becoming too large. So possibly I should switch to impl std::io::Read and, when the size reaches, say, 20MB, offload to disk.

Once a value lives on disk, it should be retrieved and appended to in chunks.
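To make the idea concrete, here is a minimal std-only sketch of such a value. It assumes one spill file per value, and the names SpillValue, Storage, append, and copy_to are hypothetical, not taken from any crate. Bytes stay in a Vec<u8> until an append would push them past the threshold; from then on they live in the file and are only ever read back in fixed-size chunks. It does no cleanup and no crash recovery, which is exactly the part a real database handles for you.

```rust
use std::fs::{File, OpenOptions};
use std::io::{self, Read, Write};
use std::path::PathBuf;

/// Where the bytes currently live (hypothetical helper type).
enum Storage {
    Memory(Vec<u8>),
    Disk { len: u64 },
}

/// Hypothetical map value that moves to `spill_path` once it would exceed `threshold` bytes.
pub struct SpillValue {
    threshold: usize,
    spill_path: PathBuf,
    storage: Storage,
}

impl SpillValue {
    pub fn new(threshold: usize, spill_path: PathBuf) -> Self {
        SpillValue { threshold, spill_path, storage: Storage::Memory(Vec::new()) }
    }

    pub fn is_spilled(&self) -> bool {
        matches!(self.storage, Storage::Disk { .. })
    }

    pub fn len(&self) -> u64 {
        match &self.storage {
            Storage::Memory(buf) => buf.len() as u64,
            Storage::Disk { len } => *len,
        }
    }

    /// Append `data`, spilling everything to disk if the threshold would be exceeded.
    pub fn append(&mut self, data: &[u8]) -> io::Result<()> {
        match &mut self.storage {
            // Still small enough: keep it in memory.
            Storage::Memory(buf) if buf.len() + data.len() <= self.threshold => {
                buf.extend_from_slice(data);
            }
            // Threshold crossed: write the buffered bytes plus the new data to the file.
            Storage::Memory(buf) => {
                let mut file = File::create(&self.spill_path)?;
                file.write_all(buf)?;
                file.write_all(data)?;
                let len = (buf.len() + data.len()) as u64;
                self.storage = Storage::Disk { len };
            }
            // Already on disk: append directly to the file.
            Storage::Disk { len } => {
                let mut file = OpenOptions::new().append(true).open(&self.spill_path)?;
                file.write_all(data)?;
                *len += data.len() as u64;
            }
        }
        Ok(())
    }

    /// Stream the value into `out`; a spilled value is read `chunk_size` bytes at a time.
    pub fn copy_to<W: Write>(&self, out: &mut W, chunk_size: usize) -> io::Result<u64> {
        match &self.storage {
            Storage::Memory(buf) => {
                out.write_all(buf)?;
                Ok(buf.len() as u64)
            }
            Storage::Disk { .. } => {
                let mut file = File::open(&self.spill_path)?;
                // Guard against a zero-sized buffer, which would read nothing.
                let mut chunk = vec![0u8; chunk_size.max(1)];
                let mut total = 0u64;
                loop {
                    let n = file.read(&mut chunk)?;
                    if n == 0 {
                        break;
                    }
                    out.write_all(&chunk[..n])?;
                    total += n as u64;
                }
                Ok(total)
            }
        }
    }
}
```

With this shape the map would become something like indexmap::IndexMap<String, SpillValue>, with a threshold of 20 * 1024 * 1024 for the 20MB case. Each value needs its own unique spill path, and nothing here deletes the file when the value is dropped.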

How do I do this in Rust? - Happy to use any third-party crate!

EDIT0: I was thinking Apache Arrow + Apache Parquet but maybe that's too much Apache!!

EDIT1: Going by "Memory Management in DuckDB – DuckDB", it looks like the term is "spilling".

Consider using sqlite for your map.

For a pure Rust solution, you could consider the redb K/V database.

Large values are split automatically into OS-size pages (by default 4KB on Linux systems).

Doesn't the system already swap RAM to the disk when you are running out of physical RAM?

Swapping is a defensive mechanism when the system runs out of memory because of bad processes. You shouldn't knowingly exhaust the system's resources.

The system is quite good at swapping (and compressing RAM). It makes the decision based on usage, not on "oh, but this is bigger than 20 MB".
Implementing your own swapping will open not just a can of worms, but a barrel of them.
If you insist on swapping, use a db (as suggested). They have already put a lot of effort into that subject.

That's what I was saying by responding to you.

The conclusion we came to in Android is that right now, SQLite is the correct solution for mappings that need to be stored on disk. Yes, it's boring to pick a non-Rust implementation, but it is what it is.
