Multiple non mutable refs to different parts of a boxed value

Hi, I'm trying to create a way to store data while avoiding data duplication.

To give some context, I am storing struct representations of network packets. Those structs implement the Extract trait, which originally returned a Box<dyn Any> for every element contained in the struct (and in every contained struct, recursively). This caused a lot of duplication, and since some of those packets can be up to a few kB, the memory cost was non-negligible.

What I came up with is the following: each struct that I want to store goes into a Vec<Box<dyn Any + 'static>>, and I then store refs to the internals of the struct in a Vec of &dyn Any (along with metadata, hence the DataContent type). This is "mostly" working, but now I can't call my DataStore multiple times without getting error[E0502]: cannot borrow my_store as immutable because it is also borrowed as mutable

Can I do something to fix this, or is the whole approach flawed? If it's the latter, is there another way to design my code to achieve the same goal without too much overhead (memory- and performance-wise)?

Thanks

use std::any::Any;

#[derive(Debug)]
pub struct DataContent<'a> {
    pub source: String,
    pub data: &'a dyn Any,
}

pub trait Extract {
    fn extract<'a>(&'a self, data_refs: &mut Vec<&'a dyn Any>) -> Result<(), ()>;
}

#[derive(Debug)]
pub struct DataStore<'a> {
    raw_data: Vec<Box<dyn Any + 'static>>,
    data: Vec<DataContent<'a>>,
}

impl<'a> DataStore<'a> {
    pub fn new() -> Self {
        Self {
            raw_data: vec![],
            data: vec![],
        }
    }

    pub fn do_extract<T: Extract + 'static>(&'a mut self, data: T, source: String) {
        self.raw_data.push(Box::new(data));
        let data_ptr = self
            .raw_data
            .last()
            .unwrap()
            .as_ref()
            .downcast_ref::<T>()
            .unwrap();

        let mut new_data = vec![];
        data_ptr.extract(&mut new_data);

        for variable in new_data {
            let knowledge = DataContent {
                source: source.clone(),
                data: variable,
            };

            self.data.push(knowledge);
        }
    }
}

#[derive(Debug)]
pub struct Test {
    a: String,
    b: u8,
    c: u64,
}

impl Extract for Test {
    fn extract<'a>(&'a self, data_refs: &mut Vec<&'a dyn Any>) -> Result<(), ()> {
        data_refs.push(self);
        data_refs.push(&self.a);
        data_refs.push(&self.b);
        data_refs.push(&self.c);

        Ok(())
    }
}

pub fn main() {
    let data_to_extract = vec![
        Test {
            a: "Test1".into(),
            b: 1,
            c: 512,
        },
        Test {
            a: "Test2".into(),
            b: 2,
            c: 512,
        },
    ];

    let mut my_store = DataStore::new();

    for x in data_to_extract {
        my_store.do_extract(x, "Vec of data".into());
    }
}

The correct signature for DataStore::do_extract is

-    pub fn do_extract<T: Extract + 'static>(&'a mut self, data: T, source: String);
+    pub fn do_extract<T: Extract + 'static>(&mut self, data: T, source: String);

&'a mut T<'a> is a typical anti-pattern: it borrows the struct for its whole lifetime, which prevents calling the method more than once. It is a red flag that indicates self-referencing. And indeed, data points into raw_data!
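To make the anti-pattern concrete, here is a minimal sketch that is not taken from the code above: Counter, hit_pinned, hit and demo are made-up names used only for illustration. It contrasts a method taking &'a mut self with one taking a plain &mut self.

```rust
// Hypothetical type for illustration only; it is not part of the original code.
pub struct Counter<'a> {
    pub labels: Vec<&'a str>,
    pub hits: usize,
}

impl<'a> Counter<'a> {
    // Anti-pattern: the borrow of `self` is tied to the struct's own lifetime
    // parameter, so one call keeps `self` mutably borrowed for as long as
    // the value exists.
    #[allow(dead_code)]
    pub fn hit_pinned(&'a mut self) {
        self.hits += 1;
    }

    // A fresh, short borrow on every call.
    pub fn hit(&mut self) {
        self.hits += 1;
    }
}

pub fn demo() -> usize {
    let mut counter = Counter {
        labels: vec!["rx", "tx"],
        hits: 0,
    };
    counter.hit();
    counter.hit(); // fine: the previous borrow has already ended

    // Uncommenting both lines below is rejected by the borrow checker
    // ("cannot borrow `counter` as mutable more than once at a time"),
    // because the first call never releases its borrow:
    // counter.hit_pinned();
    // counter.hit_pinned();

    counter.hits
}
```

Calling demo() runs the two short-borrow calls; the commented-out pair shows the shape of code that gets rejected.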

If you don't need to interleave do_extract and accessing self.data, you can just have a method generate Vec<DataContent<'a>> all at once on demand instead of populating it on the fly.
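A rough sketch of that on-demand variant, under a few assumptions of mine: the store keeps Box<dyn Extract> instead of Box<dyn Any> (so extract can be called again later without knowing the concrete type), DataContent borrows its source as &str instead of owning a String, and contents is a hypothetical method name.

```rust
use std::any::Any;

pub struct DataContent<'a> {
    pub source: &'a str, // assumption: borrowed label instead of an owned String
    pub data: &'a dyn Any,
}

pub trait Extract {
    fn extract<'a>(&'a self, data_refs: &mut Vec<&'a dyn Any>) -> Result<(), ()>;
}

pub struct DataStore {
    // Each value is stored exactly once, next to the label it was added with.
    raw_data: Vec<(String, Box<dyn Extract>)>,
}

impl DataStore {
    pub fn new() -> Self {
        Self { raw_data: Vec::new() }
    }

    // Only takes ownership of the value; no references are stored here.
    pub fn do_extract<T: Extract + 'static>(&mut self, data: T, source: String) {
        self.raw_data.push((source, Box::new(data)));
    }

    // Hypothetical helper: builds the borrowed view on demand. The result
    // borrows `self`, so it must be dropped before the next `do_extract`.
    pub fn contents(&self) -> Vec<DataContent<'_>> {
        let mut out = Vec::new();
        for (source, item) in &self.raw_data {
            let mut refs = Vec::new();
            if item.extract(&mut refs).is_err() {
                continue; // skip values whose extraction failed
            }
            for data in refs {
                out.push(DataContent { source: source.as_str(), data });
            }
        }
        out
    }
}

pub struct Test {
    pub a: String,
    pub b: u8,
    pub c: u64,
}

impl Extract for Test {
    fn extract<'a>(&'a self, data_refs: &mut Vec<&'a dyn Any>) -> Result<(), ()> {
        data_refs.push(self);
        data_refs.push(&self.a);
        data_refs.push(&self.b);
        data_refs.push(&self.c);
        Ok(())
    }
}
```

The trade-off is that every call to contents walks all stored values again, so this only pays off when the view is needed rarely compared to how often values are added.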

If that's not the case, then you probably have to resort to unsafe code. Since raw_data's contents are boxed anyway, I think it's fine as long as you destroy self.data before removing or modifying any existing element of self.raw_data.

Thank you for your reply. I do indeed need to interleave do_extract calls and accesses to self.data. I agree that using unsafe should be fine, because nothing will be removed from raw_data or data for the entire lifetime of the DataStore, but I was hoping to achieve this without unsafe.

You're trying to create a self-referential struct, but unfortunately that's not possible in safe Rust.

Can you give me a hint on how to achieve this with unsafe Rust?

Honestly, self-referential structs are so tricky to get right that even experts got them wrong. I would suggest using a crate like self_cell instead.

If you really, really want to use unsafe, you'll have to replace the reference in DataContent with a raw pointer.
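A hedged sketch of what the raw-pointer version could look like. The names RawContent, get and len are mine, and I changed the Extract signature to spell out dyn Any + 'static so the references convert to *const dyn Any without a lifetime cast. It relies on the invariant discussed earlier in the thread (boxes in raw_data are never removed, replaced or mutated while the store is alive), and I have not checked it under Miri, so treat it as a starting point rather than a vetted implementation.

```rust
use std::any::Any;

pub trait Extract {
    // Assumption: the `'static` object bound is written explicitly here.
    fn extract<'a>(&'a self, data_refs: &mut Vec<&'a (dyn Any + 'static)>) -> Result<(), ()>;
}

struct RawContent {
    source: String,
    // Points into a heap allocation owned by one of the boxes in `raw_data`.
    data: *const (dyn Any + 'static),
}

pub struct DataStore {
    raw_data: Vec<Box<dyn Any>>,
    data: Vec<RawContent>,
}

impl DataStore {
    pub fn new() -> Self {
        Self { raw_data: Vec::new(), data: Vec::new() }
    }

    pub fn do_extract<T: Extract + 'static>(&mut self, data: T, source: String) {
        // Push first, then borrow from the box in its final place, so the
        // box itself is not moved again after the pointers are taken.
        self.raw_data.push(Box::new(data));
        let item: &T = self
            .raw_data
            .last()
            .expect("an element was just pushed")
            .downcast_ref::<T>()
            .expect("the element just pushed is a T");

        let mut refs = Vec::new();
        if item.extract(&mut refs).is_err() {
            return;
        }
        for r in refs {
            let ptr: *const (dyn Any + 'static) = r; // reference-to-pointer coercion
            self.data.push(RawContent { source: source.clone(), data: ptr });
        }
    }

    pub fn len(&self) -> usize {
        self.data.len()
    }

    // Hands out the stored pointer as a reference tied to `&self`.
    pub fn get(&self, index: usize) -> Option<(&str, &dyn Any)> {
        let entry = self.data.get(index)?;
        // SAFETY (assumed invariant): the pointee lives in a Box owned by
        // `self.raw_data`; boxes are never removed or mutated, and a Vec
        // reallocation moves the Box pointers, not the boxed contents.
        let data: &dyn Any = unsafe { &*entry.data };
        Some((entry.source.as_str(), data))
    }
}

pub struct Test {
    pub a: String,
    pub b: u8,
    pub c: u64,
}

impl Extract for Test {
    fn extract<'a>(&'a self, data_refs: &mut Vec<&'a (dyn Any + 'static)>) -> Result<(), ()> {
        data_refs.push(self);
        data_refs.push(&self.a);
        data_refs.push(&self.b);
        data_refs.push(&self.c);
        Ok(())
    }
}
```

Because the raw pointers are private and only exposed through get, the unsafe block stays in one place; any future method that removes or mutates entries of raw_data would break the invariant it depends on.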
