Tokio Synchronization for Reconnecting a Multiplexed Connection

I am using a multiplexed connection in a concurrent server (using Tokio). Many tasks use the same TCP connection to send their requests and get their answers.

Now obviously things can fail and the connection might need to be reestablished. My plan is to check the error code when a request fails, reestablish the connection if the error calls for it, and then retry the request exactly once.

Something like this:

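// Pseudocode: `connection` is the shared connection; `Answer`, `Error`
// and `is_io_error` are placeholders.
async fn request() -> Result<Answer, Error> {
  let mut result = connection.send_request().await;
  // Only a transport-level failure justifies a reconnect.
  if matches!(&result, Err(e) if e.is_io_error()) {
    connection.reconnect().await?;
    // Retry exactly once on the fresh connection.
    result = connection.send_request().await;
  }
  result
}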

During the reestablishment no other task can make progress, because the connection is broken.

A simple solution would be to create a new connection after a failure and just swap it in for everybody. But this sounds flawed to me: Many tasks can notice "at the same time" that the connection is broken and start the process of reestablishing the connection.

Some kind of synchronization would be better. Maybe the connection could sit behind an RwLock: each request would pay a small performance penalty to acquire the read lock, and reestablishing the connection would take the write lock, which makes everybody else wait.

But I think I would still have to worry about multiple tasks entering the reconnection path at the same time. They would execute sequentially, but still unnecessarily often, am I right?

Of course I can think of hacky ways to solve it, but my question here is: what would be the correct and robust synchronization primitive to solve this problem? Only one of the tasks should reconnect, but everybody else should wait.
