Good way to use references in a function with threads

Hi, I was asked to implement a function that compares two lists, which are passed in as '&' references.
Since the two lists might be very large, I decided to use multiple threads inside the function.
The code looks like this:

fn isSub<T: PartialEq + Sync>(first_list: &'static [T], second_list: &'static [T]) -> bool {
    let (tx, rx) = channel();

    spawn(move || {
        // compare the elements and send the result to tx
    });

    // collect all results from rx and summarize them
    // then make sure the thread has finished by calling join, and return
}

Then I was asked to make the lifetime of the two parameters 'static, which is fully understandable. The compiler is concerned that the memory behind the two parameters could be freed before the thread ends.

This is what I know so far:

  1. I can use Box::leak() to get a 'static reference to the lists.
  2. Or I need to change the way the parameters are passed in, for example with Arc.

But both of these solutions require changing the calling code. (The first one is what I prefer for now, but it still causes several changes in the caller.) For example:

let huge: Vec<u32> = ...;
let huge_boxed = huge.into_boxed_slice();
let huge_ref = Box::leak(huge_boxed);

// then pass huge_ref into the function.
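
As a rough, self-contained sketch of how the leaked references could then be used with a spawned thread and a channel: the function name lists_equal and the element-wise equality check are hypothetical stand-ins for isSub, and u32 is assumed as the element type.

```rust
use std::sync::mpsc::channel;
use std::thread;

// Hypothetical comparison: true when both slices are element-wise equal.
// The back half is checked on a spawned thread, the front half on the caller.
fn lists_equal(first: &'static [u32], second: &'static [u32]) -> bool {
    if first.len() != second.len() {
        return false;
    }
    let mid = first.len() / 2;
    let (tx, rx) = channel();

    // 'static shared references are Copy and Send, so the closure can take them.
    let handle = thread::spawn(move || {
        let _ = tx.send(first[mid..] == second[mid..]);
    });

    let front_equal = first[..mid] == second[..mid];
    let back_equal = rx.recv().unwrap_or(false);
    handle.join().expect("worker thread panicked");
    front_equal && back_equal
}

fn main() {
    let a: Vec<u32> = (0..1_000).collect();
    let b = a.clone();
    // Leaking turns each Vec into a &'static [u32]; the memory is never freed.
    let a_ref: &'static [u32] = Box::leak(a.into_boxed_slice());
    let b_ref: &'static [u32] = Box::leak(b.into_boxed_slice());
    println!("{}", lists_equal(a_ref, b_ref));
}
```

The cost of this approach is that both lists stay allocated until the process exits.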

What I want to ask is:

In this case, I believe I know what I am doing and I am willing to accept the consequences. Is there any way to do this without changing the calling code, for example an unsafe way? I tried to use an unsafe block and send a pointer into the thread, but the compiler rejected it with a message saying something like it is not safe to pass a *const T into a thread.

I would like to learn more about the unsafe features of Rust; I think the unsafe chapter in "The Rust Programming Language" may not cover them fully.

Anyway, Rust is great and I'm enjoying coding with it. Thank you in advance.
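
For reference, one way to keep the caller unchanged without any unsafe code is a scoped thread. This is only a sketch and assumes a toolchain that provides std::thread::scope (stable since Rust 1.63), an API that nobody in this thread mentions. The name lists_equal_scoped and the element-wise comparison are hypothetical stand-ins for isSub.

```rust
use std::thread;

// Hypothetical comparison that borrows the slices with ordinary lifetimes.
fn lists_equal_scoped<T: PartialEq + Sync>(first: &[T], second: &[T]) -> bool {
    if first.len() != second.len() {
        return false;
    }
    let mid = first.len() / 2;

    // Every thread spawned inside the scope is joined before `scope` returns,
    // so the closure is allowed to borrow non-'static data.
    thread::scope(|s| {
        let back = s.spawn(|| first[mid..] == second[mid..]);
        let front_equal = first[..mid] == second[..mid];
        let back_equal = back.join().expect("worker thread panicked");
        front_equal && back_equal
    })
}

fn main() {
    let a: Vec<u32> = (0..1_000).collect();
    let b = a.clone();
    // Plain borrows: no Box::leak and no Arc needed at the call site.
    println!("{}", lists_equal_scoped(&a, &b));
}
```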

You can use rayon to easily parallelize this.

https://crates.io/crates/rayon

Alright, a few things to note here:

  • This could all have been avoided with rayon's .par_iter() method on slices and other collections.
  • Putting the work on a single other thread and then immediately joining it takes at least as long as running it on the current thread:
// This is slower than just calling my_very_long_task in a
// single-threaded context:
let handle = thread::spawn(move || { my_very_long_task(my_mutex) });
handle.join();
// On the other hand, this is what spawning is meant for:
let handle = thread::spawn(move || {/**/});
another_very_long_task(my_mutex);
handle.join(); // Here we can be assured that the spawned thread has finished
  • If you want more information on the black magic that is unsafe, take a look at the Rustonomicon.
  • Using Box::leak(another_box) to try to change the lifetime of things is unsafe. It's basically a std::mem::transmute() that changes the lifetime of a reference by turning it into a pointer, and you must also remember to free it later by calling from_raw... it's just a mess.
  • Instead, using an Arc<RwLock<T>> where T: Deref<Target = [Data]> is probably the best way to go, especially since readers don't need an exclusive lock as they would with a Mutex and can just use .read().
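
A minimal sketch of the Arc<RwLock<...>> idea from the last bullet, assuming Vec<u32> as the shared data; the helper name sum_on_two_threads is hypothetical and only illustrates two threads reading at the same time through .read().

```rust
use std::sync::{Arc, RwLock};
use std::thread;

// Hypothetical helper: sums the shared data on a worker thread while the
// caller sums it too. Both sides take only read locks.
fn sum_on_two_threads(data: Arc<RwLock<Vec<u32>>>) -> (u64, u64) {
    let worker_data = Arc::clone(&data);
    let handle = thread::spawn(move || {
        let guard = worker_data.read().expect("lock poisoned");
        let total: u64 = guard.iter().map(|&x| u64::from(x)).sum();
        total
    });

    let local_sum: u64 = data
        .read()
        .expect("lock poisoned")
        .iter()
        .map(|&x| u64::from(x))
        .sum();
    let worker_sum = handle.join().expect("worker thread panicked");
    (local_sum, worker_sum)
}

fn main() {
    let shared = Arc::new(RwLock::new(vec![1u32, 2, 3, 4]));
    let (local, worker) = sum_on_two_threads(Arc::clone(&shared));
    println!("{local} {worker}");
}
```

If the data is never mutated after it is shared, a plain Arc<[u32]> or Arc<Vec<u32>> without any lock would likely be enough.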

On another note, it's great to hear that you're enjoying Rust! It's always good to see the community growing!

Box::leak is perfectly safe.

To illustrate my point:

let a: i32 = 0;
let b = Box::new(&a);
let a_ref: &'static mut &i32 = Box::leak(b); // error[E0597]: `a` does not live long enough

Hi, thank you so much for the reply. I'm really happy that I started this topic; the things I learned here are really dazzling. As a language learner I'll give the black magic a little try, and as an application developer I'll just use the fantastic Rayon.

Hi, thank you very much for introducing Rayon!
I think the reason Box::leak is not safe is that it leaves the data on the heap until the end of the whole program ('static). It may take up a lot of memory if it is never reclaimed.

Well, leaking memory is safe. As in, leaking memory will not change the output of your program and will not incur undefined behavior. We even have a function to do it, std::mem::forget. It may make your program less efficient or less secure, but those are not the same as being memory unsafe.
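
A small sketch of that point: both calls below compile without any unsafe block. The helper name leak_without_unsafe is hypothetical.

```rust
use std::mem;

// Hypothetical helper: leaks memory twice using only safe code.
fn leak_without_unsafe() -> &'static str {
    let buffer = vec![0u8; 1024];
    // The Vec's destructor never runs, so its heap allocation is never freed.
    mem::forget(buffer);

    // Box::leak is a safe function too; the string lives until the process exits.
    let leaked: &'static str = Box::leak(String::from("still valid").into_boxed_str());
    leaked
}

fn main() {
    println!("{}", leak_without_unsafe());
}
```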

Yes, that's correct:

This function is mainly useful for data that lives for the remainder of the program's life. Dropping the returned reference will cause a memory leak. If this is not acceptable, the reference should first be wrapped with the Box::from_raw function producing a Box. This Box can then be dropped which will properly destroy T and release the allocated memory.

But as @RustyYato said, it is safe.
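
To make the quoted documentation concrete, here is a hedged sketch of leaking a boxed slice and later reclaiming it with Box::from_raw. The function name leak_then_reclaim is hypothetical. Note that the leak itself is safe; only the reclaiming step needs unsafe.

```rust
// Hypothetical helper: leak a boxed slice, use it, then free it again.
fn leak_then_reclaim() -> u32 {
    let leaked: &'static mut [u32] = Box::leak(vec![1, 2, 3].into_boxed_slice());
    let total: u32 = leaked.iter().sum();

    // SAFETY (assumption of this sketch): `leaked` came from Box::leak, no other
    // reference to the data exists, and it is not used after this point.
    let reclaimed: Box<[u32]> = unsafe { Box::from_raw(leaked as *mut [u32]) };
    drop(reclaimed); // runs the destructor and releases the allocation
    total
}

fn main() {
    println!("{}", leak_then_reclaim());
}
```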

Rust has a specific definition of what is considered memory unsafe. The nomicon goes over this here.

Yeah, now I see. I may never use unsafe code that I wrote myself on a production server, but I'll try to go through the nomicon articles; they're cool.