How to design borrowing in caching component?

I am having difficulties with designing a component when members of a struct get into fight over self references.

use std::collections::{HashMap, VecDeque};
use std::net::TcpStream;
use std::sync::Arc;
use async_std::sync::Mutex;

struct Broker {
    conn: TcpStream,
    metadata_cache: HashMap<String, String>,
    /// Buffer is behind mutex because it is being truncated by sending "worker process" (spawn),
    /// i.e. it is mutably owned by Broker for purpose of `add' and it is mutably owned by send
    /// task for purpose of read and truncate.
    buffer: Arc<Mutex<Buffer>>,
}

struct Buffer (VecDeque<Vec<u8>>);

impl Broker {
    /// Append message to internal buffer
    async fn send(&mut self, topic: String, msg: &String) {
        let meta: &String = self.get_or_request_metadata(&topic).await;
        self.buffer.lock().await.add(msg, topic, meta);

    }

    async fn get_or_request_metadata(&mut self, _topic: &String) -> &String {
        // Request metadata from TcpStream
        unimplemented!()
    }
}

impl Buffer {
    fn add(&mut self, _msg: &String, _topic: String, _meta: &String) {
        unimplemented!()
    }
}

fn main() {} 

error[E0502]: cannot borrow `self.buffer` as immutable because it is also borrowed as mutable
  --> src/main.rs:21:9
   |
20 |         let meta = self.get_or_request_metadata(&topic).await;
   |                    ---- mutable borrow occurs here
21 |         self.buffer.lock().await.add(msg, topic, meta);
   |         ^^^^^^^^^^^ immutable borrow occurs here ---- mutable borrow later used here

What I can not understand, is why &mut self has such long lifespan. Technically I do understand, that I have called get_or_request_metadata(&mut self, ...), but once call is complete, I've got immutable `meta: &String'. Why Rust thinks that there is a risk of mutation after call to mutating function is complete?

And what are idiomatic ways of dealing with situations like this, when you have to mutate self to cache request results inside component? I was considering RefCell but I would like to make sure there is no better way because runtime checks do not inspire confidence.

Because the lifetime of the exclusive reference is passed on to the shared reference, this makes the following function safe.

fn get_shared(mutex: &mut Mutex<i32>) -> &i32 {
    Mutex::get_mut(mutex)
}

Note: Mutex::get_mut does not lock the mutex, so if the lifetime of the exclusive lock to the mutex was not tied to the shared reference that was returned, then you could do this.

let val = get_shared(&mut mutex);
let init = *val;
*mutex.lock().unwrap() = 0; // note: because the mutex was not locked, this won't block.
assert!(*val, init);

and this would be an example of UB that is prevented by the current borrowing rules.

Let's look at the API of .get_or_request_metadata:

async fn get_or_request_metadata (
    self: &'_ mut Broker,
    _topic: &'_ String,
) -> &'_ String;

which has lifetime elision kicking in, thus being equivalent to the following function signature:

async fn get_or_request_metadata<'broker_borrow> (
    self: &'broker_borrow mut Broker,
    _topic: &'_ String,
) -> &'broker_borrow String;

So, if we ignore async here since it doesn't really play a role for the issue at hand,
you have a function that will &mut borrow a Broker during some lifetime that I have decided to call 'broker_borrow, and it returns a reference to some String that is only valid during the lifetime 'broker_borrow (most probably because it borrows something from Broker, but the actual implementation plays no role here).

Since a &'a String is only valid / usable while in the lifetime 'a, this means that for the returned String to be valid, the borrow on the input Broker must still be held.

The fact that Rust prevents this is fine, because if you were to mutate the metadata_cache before using the obtained &String reference, then you could expose yourself to a UAF / dangling pointer.

But in your case you do not access metadata_cache, you are trying to access the other fields. It doesn't matter, API-wise Rust does not know that the obtained String was only borrowing from metadata_cache: a later implementation borrowing from the other fields would not violate such API.

So the problem came from your API, being too restrictive for Rust.

To fix it, you need to tell Rust that you are only borrowing from metadata_cache, so you need an explicit function taking a
&'cache_borrow mut HashMap<String, String> as input and returning a &'cache_borrow String:

impl Broker {
    /// Append message to internal buffer
    async fn send (
        self: &'_ mut Broker,
        topic: String,
        msg: &'_ String,
    )
    {
        async fn get_or_request_metadata<'cache_borrow> (
            cache: &'cache_borrow mut HashMap<String, String>,
            _topic: &'_ String,
        ) -> &'cache_borrow String
        {
            // Request metadata from TcpStream
            unimplemented!()
        }

        let meta: &String =
            get_or_request_metadata(
                &mut self.metadata_cache,
                &topic,
            ).await
        ;
        self.buffer
            .lock()
            .await
            .add(msg, topic, meta)
        ;
    }
}

EDIT: I realise that your cache function also needs access to the TCPStream field when the metadata hasn't been memoized / cached yet. In that case the reasoning still applies, in that you must avoid having access to the buffer field in the function performing the lookup, else accessing .buffer will be part of the API of the function.

So you may need to split your struct Broker as follows:

struct TcpMetadataWithCache {
    conn: TcpStream,
    cache: HashMap<String, String>,
}

impl TcpMetadataWithCache {
    async fn get_or_request_metadata (
        self: &'_ mut TcpMetadataWithCache,
        _topic: &'_ String,
    ) -> &'_ String
    {
        // Request metadata from TcpStream
        unimplemented!()
    }
}

struct Broker {
    cache: TcpMetadataWithCache,
    buffer: Arc<Mutex<Buffer>>,
}

impl Broker {
    /// Append message to internal buffer
    async fn send (
        self: &'_ mut Broker,
        topic: String,
        msg: &'_ String,
    )
    {
        let meta: &String =
            self.cache
                .get_or_request_metadata(&topic)
                .await
        ;
        self.buffer
            .lock()
            .await
            .add(msg, topic, meta)
        ;
    }
}
2 Likes

Thanks @Yandros, it is not the first time when answer to lifetimes was embarrassingly simple :slight_smile:

1 Like

There is nothing to be embarrassed about, since this "all the fields of Self are accessible as part of the API" is often surprising :sweat_smile:; Rust is the only language I know of that requires you to reconsider this kind of API / design choices so that the compiler understands your intent (which in the long term is a good thing, since it also makes it easier for a programmer to see what's going on :slightly_smiling_face:)

1 Like

This topic was automatically closed 90 days after the last reply. New replies are no longer allowed.