Modifying a variable in a thread

lason · November 22, 2022, 7:41am

During the development of an HTTP client, I have to save the ETAG header from one web access and use it in the next. To isolate the problem, I first wrote the following test program:

use std::thread;
use std::time;

#[tokio::main]
async fn main() {
    let mut var = 1;
    fun(&mut var).await;
    println!("var = {}", var);
}

async fn fun(v: &mut i32) {
    thread::sleep(time::Duration::from_millis(1000));
    *v = *v + 10;
}

This program works fine. Then I added the thread:

async fn fun(v: &mut i32) {
    thread::spawn(|| {
        thread::sleep(time::Duration::from_millis(1000));
        *v = *v + 10;
    });
}

This program does not compile. The error is roughly:

argument requires that lifetime must outlive 'static

I tried adding 'static to the function signature, but I got other errors, including one saying that var is borrowed as mutable in fun() and as immutable in println!().

How do I solve this?

Michael-F-Bryan · November 22, 2022, 7:50am

The general problem is that you can't pass borrowed references from one thread to another because it's quite possible for fun() to return and invalidate your &mut i32 before the spawned thread even starts (e.g. because fun() is fast and your OS took some time to start the spawned thread).

The most common solution is to have some sort of shared ownership which will keep the i32 alive as long as anything is using it. This typically means you put the i32 behind a reference-counted pointer (Arc), and we'll also need a synchronisation mechanism (e.g. Mutex) because we want to update it.

use std::{
    sync::{Arc, Mutex},
    thread,
    time::Duration,
};

async fn fun(v: Arc<Mutex<i32>>) {
    thread::spawn(move || {
        thread::sleep(Duration::from_millis(1000));
        let mut v = v.lock().unwrap();
        *v = *v + 10;
    });
}

(playground)
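
For completeness, here is a minimal sketch of how a caller might use this approach. It is not part of the original answer: it assumes fun is made synchronous and returns the JoinHandle so the caller can wait for the worker, and run is a hypothetical helper added only to show the round trip. The sleep is shortened to keep the example quick.

```rust
use std::{
    sync::{Arc, Mutex},
    thread::{self, JoinHandle},
    time::Duration,
};

// Same idea as `fun` above, but synchronous and returning the JoinHandle
// (an assumption for this sketch) so the caller can wait for the worker.
fn fun(v: Arc<Mutex<i32>>) -> JoinHandle<()> {
    thread::spawn(move || {
        thread::sleep(Duration::from_millis(100));
        let mut v = v.lock().unwrap();
        *v += 10;
    })
}

// Hypothetical helper: runs the whole round trip and returns the final value.
fn run() -> i32 {
    let var = Arc::new(Mutex::new(1));
    // Clone the Arc: the thread gets its own handle to the same i32.
    let handle = fun(Arc::clone(&var));
    // Without this join, the caller could read the value before the update.
    handle.join().unwrap();
    let value = *var.lock().unwrap();
    value
}

fn main() {
    println!("var = {}", run()); // prints "var = 11"
}
```

The Arc keeps the i32 alive for as long as either side holds a handle, and the join is what makes the updated value visible before it is read.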

lason · November 22, 2022, 8:08am

Thank you, Michael. But let me just ask a question. Isn't it possible to prevent fun() from returning until the thread is done?

Cerber-Ursi · November 22, 2022, 8:15am

As far as the type system is concerned, no. Even if you add a join(), everything between spawn and join can panic, exiting from the function immediately.
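
As a side note to the point about thread::spawn and join(): the standard library also has scoped threads (std::thread::scope, stable since Rust 1.63), which wait for every spawned thread before the scope returns, including when unwinding from a panic. The sketch below assumes a synchronous fun and a shortened sleep. Inside an async function, the scope would block the executor thread while it waits, so this is only an illustration of the borrowing, not a recommendation for the async case.

```rust
use std::{thread, time::Duration};

// Synchronous variant of `fun` (an assumption for this sketch).
// The spawned thread borrows `v` instead of owning it.
fn fun(v: &mut i32) {
    thread::scope(|s| {
        s.spawn(|| {
            thread::sleep(Duration::from_millis(100));
            *v += 10;
        });
    }); // every thread spawned in the scope has been joined at this point
}

fn main() {
    let mut var = 1;
    fun(&mut var);
    println!("var = {}", var); // prints "var = 11"
}
```

Because the scope cannot end before its threads do, the borrow of v is guaranteed to stay valid, which is why no 'static bound is required here.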

lason · November 26, 2022, 10:42am

I'm not sure if Michael's method works; it didn't solve the problem for me. I approached the problem in two steps:

1. I removed the parameter from fun() and made it a static variable. It solved the lifetime problem.
2. The unsafe access to the variable was still a problem, and I solved it by making the corresponding code lines unsafe. I know it's a compromise, but I won't be accessing this variable from multiple threads, so I can live with it.

use std::thread;
use std::time;

static mut VAR: i32 = 1;

#[tokio::main]
async fn main() {
    fun().await;
    unsafe { println!("var = {}", &VAR); }
}

async fn fun() {
    thread::spawn( || {
        thread::sleep(time::Duration::from_millis(1000));
        unsafe { VAR = VAR + 10; }
    });
}

This program compiles and works without problems. But the value printed from main() is 1. This confirms that fun() spawns the thread and returns, and it takes 1s more for the thread to complete.

Cerber-Ursi · November 26, 2022, 11:23am

You are accessing the variable from multiple threads in exactly the code you've written. The only reason it doesn't bite you right back is that one of the threads sleeps long enough for the other to finish (in fact, it sleeps long enough for the main thread to print the unchanged value and terminate the whole program).

lason · November 26, 2022, 12:57pm

You are right. It is a demo program. In the real app, I won't be accessing a similar variable from multiple threads. But I would be grateful if you could modify this program by removing unsafe and adding Mutex.

EdmundsEcho · November 26, 2022, 3:13pm

May I ask why not let the thread take ownership, and have that thread return ownership of the now-updated value?

IMO the simplest “go with the flow” design is to let the thread take ownership and work your design accordingly. One of the reasons I enjoy working with Rust is that constraints like the one you are dealing with (threads must own values) have nothing to do with some “quirk of Rust” that you need to find a work-around for, but rather with the “physics” of the problem itself. Gravity only works in the down direction… kind of thing, mut borrows must be exclusive…
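
A minimal sketch of that idea, assuming a synchronous fun that returns the thread's JoinHandle (both are assumptions, not code from the thread): the value is moved into the thread, and the thread hands the updated value back as the closure's return value, which the caller collects with join().

```rust
use std::{thread, time::Duration};

// The thread takes ownership of `v` and returns the updated value.
fn fun(v: i32) -> thread::JoinHandle<i32> {
    thread::spawn(move || {
        thread::sleep(Duration::from_millis(100));
        v + 10
    })
}

fn main() {
    let var = 1;
    // join() waits for the thread and yields whatever its closure returned.
    let var = fun(var).join().unwrap();
    println!("var = {}", var); // prints "var = 11"
}
```

No Arc, Mutex, or static is involved, because at any moment exactly one side owns the value.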

lason · November 26, 2022, 7:31pm

I don't see this design. fun() could return the ownership, but I don't see how the thread would do it. This program is a mock-up, and, as Cerber-Ursi points out, the thread will die together with the main thread. You would need to prolong the main's life to implement your concept. Perhaps you can propose a solution.

EdmundsEcho · November 27, 2022, 12:01am

Only to "think out loud", I can see from your initial problem statement a sequence dependency. That can be captured by the caller of fun giving then taking ownership of the ETAG.

async fn fun(v: i32) -> i32 { .. }

In how I'm thinking about it, scaling this up to leverage parallel processing does not change the need to maintain the fundamental relationship implied in your statement. If what I'm saying makes sense, the relevant scope is the one that starts in the body of what you have in your main. This would need to change in order to create a scope with what it needs to operate independently in each thread. To avoid the need to track what goes on with each thread, I would ensure the scope includes all the logic required so that I (the parent that spawns the threads) can "fire and forget". Thus, the scope would include the error handling required to inform a particular user-agent request in the event something went wrong and where in the sequence it went wrong. Otherwise, if "fire and forget" is not feasible, there is more that can be done, but I'll stop here for now.

lason · November 27, 2022, 9:47am

I don't think I follow your thinking. In a real app, ETAG is passed from each web access to the following web access. I will implement the web access in a function, so ETAG has to be passed between function calls. I may (or may not) spawn a new thread for each web access, so ETAG would also be passed between threads. Main has no use for ETAG but might be used as a placeholder between the consecutive web accesses. But I prefer the current concept of using a static variable. It looks simpler.

By the way, this program is already implemented for Windows in C#. It was fabulously easy to do it in C# compared to Rust. But I'm trying to implement a similar design in Rust, and it is not always optimal.

Heliozoa · November 27, 2022, 11:54am

Maybe Lazy in once_cell::sync is what you're looking for? A static VAR: Lazy<Mutex<T>> would allow you to access and mutate VAR from anywhere safely. The description of the problem isn't very clear to me so there might be a better solution, but this should work in any case.

Cerber-Ursi · November 27, 2022, 12:38pm

Mutex::new is const since 1.63, so there's no need for Lazy - Mutex can be put directly in static.
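
A minimal sketch of the earlier static mut program rewritten along these lines, with unsafe removed. It assumes a synchronous fun that returns the JoinHandle so that main can wait for the thread; the sleep is shortened.

```rust
use std::{sync::Mutex, thread, time::Duration};

// Mutex::new is a const fn (Rust 1.63+), so it can initialise a static.
static VAR: Mutex<i32> = Mutex::new(1);

// Returning the JoinHandle is an assumption of this sketch; it lets the
// caller wait until the update has happened.
fn fun() -> thread::JoinHandle<()> {
    thread::spawn(|| {
        thread::sleep(Duration::from_millis(100));
        *VAR.lock().unwrap() += 10;
    })
}

fn main() {
    fun().join().unwrap();
    println!("var = {}", *VAR.lock().unwrap());
}
```

Without the join(), main could still print the old value, exactly as in the unsafe version. The Mutex removes the data race, not the ordering problem.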

EdmundsEcho · November 27, 2022, 4:04pm

I understand how the suggestion might not work for how the design works by definition of the C# codebase… a design that you like, so if only for that reason, valuable in its own right. So, I submit to “I don’t know what I don’t know”.

Perhaps as you proceed, wonder to yourself if relying on a side effect (mutating memory) is the best way to go here. Ownership seems like a great built-in accounting for what you need to do based on the words used to describe the task. Rhetorically speaking at this point, are there multiple threads that need simultaneous access to the same etag to accomplish the task [1]?

[1] Answer without relying on the inner workings of a particular implementation.

lason · November 27, 2022, 5:30pm

No, eTag is accessed by one thread at a time.

Web access must be performed within a worker thread. I consider two options:
A. Create a new thread for each web access and let it die when it is finished.
B. Start a long-running thread and let it perform numerous web accesses.
In case B, there is only one thread that uses eTag. In case A, eTag enters many threads, but only one at a time.

svet · November 27, 2022, 8:32pm

I wouldn’t call myself a Rust pro, but I did deal with something like this recently. From a conceptual point of view, two thoughts come to mind:

1. You talk about how you can avoid the need for multiple threads having shared access to the variable, but that sounds like a dubious design decision in the context of a web server. Presumably you’ll want to have concurrency, in order to be able to handle a number of requests without blocking?
2. I would question the need for this to be implemented as a variable that gets passed around between threads. Often a channel (e.g. via crossbeam) can provide a more elegant (and I would say obvious) way to implement something like what you describe. [You can have a single thread/function that owns the channel “receiver” and manages your counter. Meanwhile you can have an arbitrary number of “senders” that you pass to the threads that are handling the actual requests, and whenever a request comes in, a message is sent over the channel to tell the thread that manages the counter that it needs to be incremented.] This is also handy in terms of separation of concerns. Of course I don’t know much about the broader context of what the counter is used for, so this may not be the right solution in your case.

lason · November 28, 2022, 7:18am

Thank you, svet, for the comments. I'm new to Rust, and all comments are welcome.

This application is not a web server. It is an HTTP client. In its current implementation in C#, it wakes once a minute, gets an internet page, processes the received character string using regex, and presents the result using a UI and some audible signal. The app is almost never active and uses very little CPU.

ETAG is a hash index that the HTTP server calculates for the whole page and sends in the response. You insert this index into the next GET request (in the If-None-Match header). The HTTP server sends back the page only if it has a new ETAG. Otherwise, the server sends the status code 304 Not Modified, thus saving bandwidth.
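
The exchange described above can be sketched without any HTTP library. Everything in this example is hypothetical and exists only for illustration: the Response struct, request_headers, and handle_response are not taken from any real crate, and a real client would get the status and headers from its HTTP library.

```rust
/// Hypothetical, simplified response type; a real HTTP library has its own.
struct Response {
    status: u16,
    etag: Option<String>,
    body: String,
}

/// Headers for the next GET: the stored ETag goes back as `If-None-Match`.
fn request_headers(etag: Option<&str>) -> Vec<(String, String)> {
    match etag {
        Some(tag) => vec![("If-None-Match".to_string(), tag.to_string())],
        None => Vec::new(),
    }
}

/// Returns the page body if the page changed, and updates the stored ETag.
fn handle_response(stored: &mut Option<String>, resp: Response) -> Option<String> {
    if resp.status == 304 {
        // Not Modified: nothing to process, keep the ETag we already have.
        return None;
    }
    *stored = resp.etag;
    Some(resp.body)
}

fn main() {
    let mut stored: Option<String> = None;

    // First access: no ETag yet, so no conditional header is sent.
    println!("{:?}", request_headers(stored.as_deref()));
    let first = Response { status: 200, etag: Some("\"abc\"".into()), body: "page".into() };
    println!("{:?}", handle_response(&mut stored, first));

    // Second access: the saved ETag is sent, and the server answers 304.
    println!("{:?}", request_headers(stored.as_deref()));
    let second = Response { status: 304, etag: None, body: String::new() };
    println!("{:?}", handle_response(&mut stored, second));
}
```

The only state that has to survive from one access to the next is the stored Option<String>, which is why a single owner for it is enough.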

My application has to extract the ETAG from each HTTP response and reuse it in the next HTTP request one minute later. ETAG is never accessed simultaneously by two threads and is not eligible for race conditions.

My current vision of the application is that it will use a background thread for web access and regex processing and the main thread for the presentation. I want to use a channel to send data to the main thread. This is also a headache. I make the channel in the main thread but want to use the TX in the worker thread. The compiler doesn't like it because the worker thread is asynchronous. But this is another issue; I may send a new question about it.

svet · November 28, 2022, 7:52am

Hah, yes, apologies @lason - I had clearly misunderstood what you meant by “handling” a request - and indeed in the very first post you mentioned it’s a client.

Thanks for the detailed explanation, that’s very clear and makes total sense. I’m not sure I can add anything to the preceding discussion - or to the “Fearless Concurrency” chapter of the book.

I agree that a channel is probably not the solution to your problem. Having a couple of threads - as you suggest - where one is the exclusive owner of the eTag variable does feel like the right way to go about it.

Finally, given that you’re already using threads, I wonder if layering async on top of that is actually necessary? In my experience, one or the other would usually suffice for most jobs. Making your function calls be synchronous (within their dedicated threads) would likely simplify the ownership issues you are encountering.
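
A minimal sketch of that arrangement using only the standard library: one worker thread owns the eTag and makes synchronous calls, and a std::sync::mpsc channel carries the results to the main thread. Here get_web is a hypothetical blocking stand-in for the real web access (it fabricates a counter as the eTag), and run is a helper added for the example. The once-a-minute wait is left out.

```rust
use std::{sync::mpsc, thread};

// Hypothetical stand-in for a blocking web access.
// Takes the previous eTag and returns (new_etag, page_body).
fn get_web(etag: Option<&str>) -> (String, String) {
    let n = etag.map_or(0, |t| t.parse::<u32>().unwrap_or(0)) + 1;
    (n.to_string(), format!("page {}", n))
}

// Spawns the worker, lets it perform `rounds` accesses, and collects
// everything it sends to the main thread.
fn run(rounds: usize) -> Vec<String> {
    let (tx, rx) = mpsc::channel::<String>();

    let worker = thread::spawn(move || {
        // The eTag lives in this thread only; no Mutex or static is needed.
        let mut etag: Option<String> = None;
        for _ in 0..rounds {
            let (new_etag, body) = get_web(etag.as_deref());
            etag = Some(new_etag);
            tx.send(body).unwrap();
        }
        // `tx` is dropped here, which ends the receiving loop below.
    });

    // Main thread: receive until the sender is gone.
    let pages: Vec<String> = rx.iter().collect();
    worker.join().unwrap();
    pages
}

fn main() {
    for page in run(3) {
        println!("{}", page);
    }
}
```

Moving tx into the closure is what gives the worker thread ownership of the sending half, while rx stays with the main thread.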

lason · November 28, 2022, 9:53am

"I wonder if layering async on top of that is actually necessary?" That bites me! My code resembles the following:

    let (tx, rx): (Sender<String>, Receiver<String>) = channel();
    thread::spawn(async move || {
        let mess = get_web().await; // fetch the web page
        tx.send(mess).unwrap(); // send it to the main thread
    });

get_web() is intrinsically async and I have to await it. Then the thread must be async, and nothing works. I also tried block_on(mess) without an async thread, but it didn't work. I have plenty of compilation errors in this code. One is a borrowing issue with tx, and another is that the future from get_web() is not a String.

Cerber-Ursi · November 28, 2022, 10:07am

async doesn't do anything unless it is run in an async runtime. Are you sure you need to explicitly create a new thread, rather than tokio::spawn the task?
