I'd like to have a HashMap-like struct that automatically removes the entry after a certain Duration. The timer would reset on each
get(). In other words, any given entry is dropped/removed after a given Duration after the final access.
Are there any existing crates that provide this sort of functionality? I presume it would need to be asynchronous to avoid constantly blocking, which would presumably complicate implementation efforts.
It doesn't necessarily have to be a HashMap. A BTreeMap would also suffice.
I don’t know of any crates. But depending on your needs, you could implement a simple version of that. See the following link for an example of simple one with only insert and get implemented.
That's definitely the sort of thing I'm looking for. The main problem for me, at least, is that I want it to be dropped without requiring another
get(). The purpose is to avoid unbounded memory growth in a cache.
I think it would be possible to use this if I added a garbage collector-style method that runs on an interval from the async runtime, but that seems a bit hacky.
Here is a technique that also deletes other keys than the one accessed: playground. playground with debug print
It uses a
BinaryHeap to keep track of the next thing to delete efficiently. When an item is accessed with
get, the instant in the map is updated, but not in the heap. This means that the instant in the heap may sometimes be lower than the actual instant in the map, but this is ok — when the item in the heap reaches the deadline, we check if it is has really expired, and if not, just add it back into the heap with the new deadline.
Note that the correctness of this relies on the fact that the instant in the heap may be smaller than the one in the map, but it is never larger.
For values that stay in the map, each such item is reshuffled in the heap once the specified duration.
I'll have to dig into how this works a bit tomorrow, but it definitely seems like a step in the right direction. Hopefully it doesn't have too much overhead (I'll measure it to be sure), as my goal is to use it in a high-performance web server.
Pardon me for the question. Maybe I just don't understand your problem well enough. But can't you instead of eagerly removing entries as soon as the timer runs out, only do that when you add a new entry. I.e. only when you want to grow your map, you clean up instead of checking a timer all the time for all the elements?
I think that'd probably have some timing disadvantages if you want the regular requests to be consistent / fast. Doing it with a separate timer could avoid that and keep regular accesses / updates consistent?
Right, didn't think about that. It just seemed like a complication which can be avoided. However, your explanation proves that it might be a necessary complication.
I absolutely could, but as @daboross said, I'd like to have it run in the background garbage collector-style to avoid having to do something on every request.
Basically, I'd prefer it be done when the server isn't already busy (more than likely). That's why I'll measure rough overhead
Your post above made my mind go to "what about if you mark the entries as old and then bulk delete them when you have time" then I realized that it'd mean you'll basically implement a GC. However, if you maybe maintain a queue of data telling your system which entries should be deleted and whenever it is idle it plugs away at handling that queue... But this is probably premature optimization at this stage.
This topic was automatically closed 90 days after the last reply. New replies are no longer allowed.