Remove list element from Mutex<HashMap<Id, LinkedList<Task>>>

Hi,

I have a field that looks like map: Mutex<HashMap<Id, LinkedList<Task>>> and would like to remove a element of the inner list. What is the correct way to do this?
For example I tried something like this

let mut map = self.map.lock().unwrap();
for (_, list) in map.iter_mut() {
     *list = (*list).into_iter().filter(|&x| !x.is_completed()).collect();
}       

but thats not working. The only solution I found is to use split_off, remove the element from the splitted list and append the list again to the original list ... but if there are multiple elements to remove this would not be very nice..

Thanks, Biggx

mem::replace and mem::spwap functions can be useful here:

use std::sync::Mutex;
use std::collections::{HashMap, LinkedList};

type Id = String;
struct Task;

impl Task {
    fn is_completed(&self) -> bool {
        true
    }
}

fn main() {
    let map: Mutex<HashMap<Id, LinkedList<Task>>> = Mutex::new(HashMap::new());

    let mut map = map.lock().unwrap();
    for (_, list) in map.iter_mut() {
        let old_list = std::mem::replace(list, LinkedList::new());
        let new_list : LinkedList<_> = old_list.into_iter().filter(|x| x.is_completed()).collect();
        *list = new_list;
    }
}

However I'm wondering why there is no mem::update<T, F: Fn(T) -> T>(value: &mut T, f: F) in stdlib...

Out of curiosity: why do you want a LinkedList here?

Hm, this should work, I didn't thought about mem::replace...

It is a list of task and the list is changing all the time - what would you expect instead of a list?

I would expect a Vec here because it performs better then LinkedList for the overwhelming majority of use cases and in general is regarded as a default container.

Like the docs put it, use linked list when "You are absolutely certain you really, truly, want a doubly linked list."

Hm...the number of "pending" tasks can increase from 2 to 100000 for a short time and then decrease to 100 and then to 0 ...and after a while up to 10000 ...
So, I need a container where I can add and remove elements fast (normally at the front or back) - and I do not need direct access to element somewhere "in" the container - I would say, that a LinkedList is the better choice - or is that different in Rust then in other languages.
The code from my question is not really the "used case" - but if it would be - then removing a element in a very hugh list should be fast then deleting a element in a vector - or is Rust using some spooky implementation of a vector I have not heard about yet?

By the way, a question related to my original question:
How do I iterate over my list, do something with the elements and delete some of them during the iteration? Deleting a element in a linked list when I am already at the position of the element in the list should be in O(1) - but how do I implement this fast in Rust?

I have done 0 benchmarks of vec vs. list, but my intuition (ha-ha) is that a vec (or vec_deque for access from both ends) is significantly faster because of better locality. Here is a good discussion of the tradeoffs: Learning Rust With Entirely Too Many Linked Lists.

How do I iterate over my list, do something with the elements and delete some of them during the iteration? Deleting a element in a linked list when I am already at the position of the element in the list should be in O(1) - but how do I implement this fast in Rust?

The list from std can't do this, but this list can. Vec can do something similar, albeit it will mess up the order: swap_remove. But again I suspect that just collecting elements to a new vector may be faster than filtering list in place because of the cache.

You might also find lock free data structures from crossbeam useful: http://aturon.github.io/crossbeam-doc/crossbeam/sync/chase_lev/index.html

Hmmmmm...okay..I put a comment in my code and as soon as I have enough code to write a meaningful benchmark I will do this and compare the performance of a implementation with a linked list vs. a vector and I will post my results here :wink:

1 Like

Because if f were to panic instead of returning normally, there'd be no new value of type T to store in value.

1 Like

Why the docs say: "You are absolutely certain you really, truly, want a doubly linked list." is because on modern processors(years 2003+) there is almost no case where a LinkedList is appropriate performance wise(unless you work on an embedded device with a very small cache).
It's cache unfriendly and doesn't want to help your prefetcher at all, it just confuses him. It just so happens that even a vector is faster at insertions because of the cache. LinkedLists even searching for the position where to insert do a lot of cache misses and this impacts the performance.
Note that a vector is not perfect for this situation, it is bad but most probably still better than a LinkedList. Consider using other data structures that are cache friendly. Does a HashSet suit your needs?(Note that HashMap is slow because it applies a technique to try to protect against DDoS)