Remove list element from Mutex<HashMap<Id, LinkedList<Task>>>


#1

Hi,

I have a field that looks like map: Mutex<HashMap<Id, LinkedList<Task>>> and would like to remove a element of the inner list. What is the correct way to do this?
For example I tried something like this

let mut map = self.map.lock().unwrap();
for (_, list) in map.iter_mut() {
     *list = (*list).into_iter().filter(|&x| !x.is_completed()).collect();
}       

but thats not working. The only solution I found is to use split_off, remove the element from the splitted list and append the list again to the original list … but if there are multiple elements to remove this would not be very nice…

Thanks, Biggx


#2

mem::replace and mem::spwap functions can be useful here:

use std::sync::Mutex;
use std::collections::{HashMap, LinkedList};

type Id = String;
struct Task;

impl Task {
    fn is_completed(&self) -> bool {
        true
    }
}

fn main() {
    let map: Mutex<HashMap<Id, LinkedList<Task>>> = Mutex::new(HashMap::new());

    let mut map = map.lock().unwrap();
    for (_, list) in map.iter_mut() {
        let old_list = std::mem::replace(list, LinkedList::new());
        let new_list : LinkedList<_> = old_list.into_iter().filter(|x| x.is_completed()).collect();
        *list = new_list;
    }
}

However I’m wondering why there is no mem::update<T, F: Fn(T) -> T>(value: &mut T, f: F) in stdlib…

Out of curiosity: why do you want a LinkedList here?


#3

Hm, this should work, I didn’t thought about mem::replace

It is a list of task and the list is changing all the time - what would you expect instead of a list?


#4

I would expect a Vec here because it performs better then LinkedList for the overwhelming majority of use cases and in general is regarded as a default container.

Like the docs put it, use linked list when “You are absolutely certain you really, truly, want a doubly linked list.”


#5

Hm…the number of “pending” tasks can increase from 2 to 100000 for a short time and then decrease to 100 and then to 0 …and after a while up to 10000 …
So, I need a container where I can add and remove elements fast (normally at the front or back) - and I do not need direct access to element somewhere “in” the container - I would say, that a LinkedList is the better choice - or is that different in Rust then in other languages.
The code from my question is not really the “used case” - but if it would be - then removing a element in a very hugh list should be fast then deleting a element in a vector - or is Rust using some spooky implementation of a vector I have not heard about yet?

By the way, a question related to my original question:
How do I iterate over my list, do something with the elements and delete some of them during the iteration? Deleting a element in a linked list when I am already at the position of the element in the list should be in O(1) - but how do I implement this fast in Rust?


#6

I have done 0 benchmarks of vec vs. list, but my intuition (ha-ha) is that a vec (or vec_deque for access from both ends) is significantly faster because of better locality. Here is a good discussion of the tradeoffs: http://cglab.ca/~abeinges/blah/too-many-lists/book/#an-obligatory-public-service-announcement.

How do I iterate over my list, do something with the elements and delete some of them during the iteration? Deleting a element in a linked list when I am already at the position of the element in the list should be in O(1) - but how do I implement this fast in Rust?

The list from std can’t do this, but this list can. Vec can do something similar, albeit it will mess up the order: swap_remove. But again I suspect that just collecting elements to a new vector may be faster than filtering list in place because of the cache.


#7

You might also find lock free data structures from crossbeam useful: http://aturon.github.io/crossbeam-doc/crossbeam/sync/chase_lev/index.html


#8

Hmmmmm…okay…I put a comment in my code and as soon as I have enough code to write a meaningful benchmark I will do this and compare the performance of a implementation with a linked list vs. a vector and I will post my results here :wink:


#9

Because if f were to panic instead of returning normally, there’d be no new value of type T to store in value.


#10

Why the docs say: “You are absolutely certain you really, truly, want a doubly linked list.” is because on modern processors(years 2003+) there is almost no case where a LinkedList is appropriate performance wise(unless you work on an embedded device with a very small cache).
It’s cache unfriendly and doesn’t want to help your prefetcher at all, it just confuses him. It just so happens that even a vector is faster at insertions because of the cache. LinkedLists even searching for the position where to insert do a lot of cache misses and this impacts the performance.
Note that a vector is not perfect for this situation, it is bad but most probably still better than a LinkedList. Consider using other data structures that are cache friendly. Does a HashSet suit your needs?(Note that HashMap is slow because it applies a technique to try to protect against DDoS)