Deserializing structs from bytes -- borrowing issue

I've written a btree reminiscent of the one btrfs uses. Nodes are a fixed size of 4KB. The leaf nodes can hold items of different types. The interface is opaque: insert(key: Key, val: T) and get(key: &Key) -> Option<T>. To support storing more than one kind of item in the same node, the items are serialized and stored as bytes, packed together with headers that record each item's offset within the byte buffer, its type, and its size.

The node itself is backed by an array of u8, and serializes/deserializes items that are stored or retrieved -- you always get out the T you put in, but the underlying data is stored in a byte array. I'm relying on serde and bincode to do serialization and deserialization for now.

I'm trying to write a remove(&mut self, key: Key) -> Option<T> method that will remove the item indicated by the key and return that item. I need to copy the data from the backing array into something T-shaped and return that.

I can't figure out how to get serde to do this. In fact, I don't think it can, directly. What I'm trying to get it to do instead is deserialize from the data in-place, clone the resulting T and return that:

impl<'de, T> IsNode<'de, Key, T> for StaticGenericNode<Item> 
where T: IsItem<'de>
{
    fn remove(&'de mut self, key: &Key) -> Option<T> {
        match self.itemptrs_mut().binary_search_by_key(&key, |kp| &kp.key) {
            Ok(pos) => {
                let itemptr = self.itemptrs()[pos];

                let t: T;
                {
                    let slice = &self.bytes[itemptr.offset as usize .. (itemptr.offset + itemptr.size) as usize];
                    let temp_t: T = bincode::deserialize(slice).expect("couldn't deserialize");
                    t = temp_t.clone();
                }

                // --> this is the line that throws the error
                self.itemptrs_mut().copy_within(pos+1 .. initial_n_items, pos);

                // ...some other stuff here...
                Some(t)
            }
        // ... some other stuff here ...
        }
    }
}

The error I'm getting is:

error[E0502]: cannot borrow `*self` as mutable because it is also borrowed as immutable
   --> src/staticgenericnode.rs:264:17
    |
166 | impl<'de, T> IsNode<'de, Key, T> for StaticGenericNode<Item>
    |      --- lifetime `'de` defined here
...
256 |                     let slice = &self.bytes[itemptr.offset as usize .. (itemptr.offset + itemptr.size) as usize];
    |                                  ---------- immutable borrow occurs here
257 |                     let temp_t: T = bincode::deserialize(slice).expect("couldn't deserialize");
    |                                     --------------------------- argument requires that `self.bytes` is borrowed for `'de`
...
264 |                 self.itemptrs_mut().copy_within(pos+1 .. initial_n_items, pos);
    |                 ^^^^^^^^^^^^^^^^^^^ mutable borrow occurs here

Here's what I don't understand:

  1. I get a reference to the slice in self.bytes that I need to copy
  2. I deserialize temp_t: T which is backed by the slice
  3. I then clone temp_t into t, theoretically to break the borrow and return the data.
  4. The next time I try to reference self, I get the error that self is already borrowed by slice, by way of temp_t

But slice and temp_t should both have been dropped already because of the scoped block in which I create them. What's happening here? Can somebody help me understand this?

When you put a lifetime on &mut self, it creates a paradox that screws everything up: self can be borrowed only once, and never again.

That's because when you have a lifetime on a trait, it means the trait borrows from something external that was created before the object implementing the trait, and that may live on after it (e.g. think of iterators: they borrow from a collection that outlives them).

Then when you mix that "outlives me" trait with the lifetime of self, it creates a paradox: the exclusive loan of self has to be valid for as long as something that existed before it. Since self can't live longer than itself, the best the borrow checker can do is to say that this &'de mut self is borrowed for the entire lifetime of self. And because mut is an exclusive access, you can't borrow it ever again.

Don't put lifetimes on self.
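
To make that concrete, here is a minimal sketch of the same trap in its simplest form, with the lifetime on a struct rather than on a trait. View, pop_tied, pop_free and demo_pop are hypothetical names made up for this illustration and are not from the code above.

```rust
// Hypothetical type for illustration: a view that mutably borrows a buffer for 'a.
struct View<'a> {
    data: &'a mut Vec<u8>,
}

impl<'a> View<'a> {
    // Anti-pattern: the borrow of `self` is forced to last for all of 'a,
    // which is as long as the View itself can exist.
    #[allow(dead_code)]
    fn pop_tied(&'a mut self) -> Option<u8> {
        self.data.pop()
    }

    // Elided lifetime: `self` is borrowed only for the duration of this call.
    fn pop_free(&mut self) -> Option<u8> {
        self.data.pop()
    }
}

fn demo_pop() -> (Option<u8>, Option<u8>) {
    let mut buf = vec![1, 2, 3];
    let mut view = View { data: &mut buf };
    let a = view.pop_free();
    let b = view.pop_free(); // fine: the first borrow already ended
    // With `pop_tied`, a second call is rejected (E0499), because the first
    // call's exclusive borrow of `view` never ends:
    //     view.pop_tied();
    //     view.pop_tied();
    (a, b)
}
```

Calling demo_pop() pops the last two elements of the buffer. Swapping in the two commented pop_tied calls should reproduce a "cannot borrow as mutable more than once" error.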

Huh. Ok. But with Deserialize I am borrowing from something external.

Without 'de on &mut self, I get this error:

error[E0495]: cannot infer an appropriate lifetime for lifetime parameter in function call due to conflicting requirements
   --> src/staticgenericnode.rs:256:34
    |
256 |                     let slice = &self.bytes[itemptr.offset as usize .. (itemptr.offset + itemptr.size) as usize];
    |                                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    |
note: first, the lifetime cannot outlive the anonymous lifetime defined on the method body at 247:15...
   --> src/staticgenericnode.rs:247:15
    |
247 |     fn remove(&mut self, key: &Key) -> Option<T> {
    |               ^^^^^^^^^
note: ...so that reference does not outlive borrowed content
   --> src/staticgenericnode.rs:256:34
    |
256 |                     let slice = &self.bytes[itemptr.offset as usize .. (itemptr.offset + itemptr.size) as usize];
    |                                  ^^^^^^^^^^
note: but, the lifetime must be valid for the lifetime `'de` as defined on the impl at 166:6...
   --> src/staticgenericnode.rs:166:6
    |
166 | impl<'de, T> IsNode<'de, Key, T> for StaticGenericNode<Item>
    |      ^^^
note: ...so that the types are compatible
   --> src/staticgenericnode.rs:257:37
    |
257 |                     let temp_t: T = bincode::deserialize(slice).expect("couldn't deserialize");
    |                                     ^^^^^^^^^^^^^^^^^^^^
    = note: expected `staticgenericnode::_::_serde::Deserialize<'_>`
               found `staticgenericnode::_::_serde::Deserialize<'de>`

How do I get around this problem?

Oh I see. I assume the bytes are Vec<u8>. In that case you're limited by the trait's definition that doesn't give you a way to specify shorter-lived returned values.

If you can change the trait, then something like this:

fn remove<'a>(&'a mut self, key: &Key) -> Option<T> where T: IsItem<'a>

would attach this lifetime to a single function call (self is borrowed just for this call, instead of the borrow being tied to the whole trait). This is the default assumption for most function calls (lifetime elision).

But you might still have a problem due to the fact that mut is an exclusive borrow, and this exclusivity infects everything that it touches via its lifetime, so you won't be able to call any &mut methods while any copy of T returned by remove still exists.

This is because the borrow checker looks only at function types, not their bodies. And the &mut self type allows the function to do anything with its data, including:

self.bytes.clear();

The borrow checker therefore can't know whether previously returned Ts remain valid after further function calls, and the lifetime rules forbid those calls.
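
A small sketch of that signature-only reasoning, using a hypothetical Leaf type that is not part of the code in this thread. The compiler only sees that first_mut returns something tied to the &mut self borrow, so it keeps the node locked until that result is no longer used.

```rust
// Hypothetical node type for illustration only.
struct Leaf {
    bytes: Vec<u8>,
}

impl Leaf {
    // The returned reference carries the lifetime of the `&mut self` borrow,
    // so `self` stays exclusively borrowed for as long as the result is alive.
    fn first_mut(&mut self) -> Option<&mut u8> {
        self.bytes.first_mut()
    }

    fn clear(&mut self) {
        self.bytes.clear();
    }
}

fn demo_signature() -> (u8, usize) {
    let mut node = Leaf { bytes: vec![7, 8, 9] };
    let first = node.first_mut().expect("non-empty");
    // node.clear(); // rejected (E0499) if uncommented: `first` is still used below
    *first += 1;
    let copied = *first; // copy the value out; the borrow is not used after this line
    node.clear(); // accepted: nothing borrowed from `node` is used afterwards
    (copied, node.bytes.len())
}
```

Once the value has been copied out into an owned u8, the later clear() call is accepted. That is the same reason an owning T avoids the problem.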

Do you have to deserialize to a temporary struct that borrows from self.bytes? If you could make T own its data, and use DeserializeOwned, then the lifetime and its restrictions would disappear.

Another option would be to have the self.bytes field borrowed in your struct as &[u8], so that the whole struct would be a temporary view into the longer-lived bytes. Then you could use the bytes' longer lifetime (StaticGenericNode<'bytes, Item>) on the T that you return.
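
Roughly what the "temporary view" option could look like, as a sketch only. NodeView, item and demo_view are invented names, and the real StaticGenericNode would of course carry more than a byte slice.

```rust
// Hypothetical view type: it does not own the bytes, it only looks at them.
struct NodeView<'bytes> {
    bytes: &'bytes [u8],
}

impl<'bytes> NodeView<'bytes> {
    // The result borrows from the underlying buffer ('bytes), not from `self`,
    // so it may outlive this particular view.
    fn item(&self, offset: usize, size: usize) -> Option<&'bytes [u8]> {
        self.bytes.get(offset..offset.checked_add(size)?)
    }
}

fn demo_view(buffer: &[u8]) -> Option<&[u8]> {
    let view = NodeView { bytes: buffer };
    let item = view.item(1, 2);
    drop(view); // the view is gone, but `item` still points into `buffer`
    item
}
```

The trade-off is that a view over &[u8] is read-only. A remove that shifts bytes around would need the buffer as &mut [u8], which brings the exclusivity problem back.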

This is precisely what I'm trying to do with remove. It should pull the data out of the backing self.bytes buffer and return ownership of that data (as a T) to the caller. I'm aiming for a HashMap-like API. get and get_mut should return references to T, where remove copies the bytes out of the backing self.bytes into a new T and returns that. I think this is conceptually sound but I can't yet find the right knobs to turn to make the compiler cooperate.

Is the issue that Deserialize<'de> makes it such that T doesn't own its data?

It doesn't seem like Serde can support a type where it can be either constructed from borrowed data or owned data.

I can also rewrite the IsItem trait to not use Serde at all and instead work through pointer conversions (playground -- this compiles but I haven't tested it yet), but this is a pretty naive implementation that just translates structs to bytes and back, ignores endianness, and doesn't handle indirection. Extending it to handle indirection, containers, or DSTs sounds a lot like reimplementing Serde...

Yes, if a lifetime is attached to something, that's the warning sign that it doesn't own its data.

T: IsItem<'de>

means that T is not self-contained, and depends on some external data borrowed from some other place that can be tracked by following where the 'de came from.

It should be T: IsItem (and you have to change the definition of IsItem to remove the lifetime requirement) to say that T owns all of its data.

Alternatively, where some type/trait requires a lifetime but you don't need one, 'static can be used to say that it's not going to actually borrow anything temporary. So T: IsItem<'static> may also work (if the compiler starts pointing to things and saying they have to live for the 'static lifetime, then you're still borrowing the data rather than copying it).

And then of course you have to change the code to actually copy the data instead of borrowing it. In the case of Serde, that's the serde::de::DeserializeOwned trait, which is shorthand for a type that implements Deserialize<'de> for every lifetime 'de.

Thanks so much for your help. I ended up using the trait I linked instead of serde, and implemented a copy_from_bytes.
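
For anyone landing here later, a rough sketch of what an owning, lifetime-free trait with a copy_from_bytes method might look like. This is not the playground code: the trait shape, the SIZE constant, the Inode item, and the little-endian layout are all assumptions made for illustration.

```rust
use std::convert::TryInto;

// Hypothetical stand-in for the thread's IsItem trait: no lifetime parameter,
// so every implementor owns its data.
trait IsItem: Sized {
    const SIZE: usize;
    fn copy_from_bytes(bytes: &[u8]) -> Option<Self>;
}

#[derive(Debug, PartialEq)]
struct Inode {
    id: u32,
    links: u16,
}

impl IsItem for Inode {
    const SIZE: usize = 6;

    // Fixed little-endian layout, chosen here only for illustration.
    fn copy_from_bytes(bytes: &[u8]) -> Option<Self> {
        let id_raw: [u8; 4] = bytes.get(0..4)?.try_into().ok()?;
        let links_raw: [u8; 2] = bytes.get(4..6)?.try_into().ok()?;
        Some(Inode {
            id: u32::from_le_bytes(id_raw),
            links: u16::from_le_bytes(links_raw),
        })
    }
}

struct Node {
    bytes: Vec<u8>,
}

impl Node {
    // No lifetime on `self` and none on `T`: the shared borrow of `self.bytes`
    // ends as soon as `copy_from_bytes` returns, so the mutation below is allowed.
    fn remove<T: IsItem>(&mut self, offset: usize) -> Option<T> {
        let end = offset.checked_add(T::SIZE)?;
        let item = T::copy_from_bytes(self.bytes.get(offset..end)?)?;
        self.bytes.drain(offset..end);
        Some(item)
    }
}
```

Because the returned T holds no reference into self.bytes, the caller can keep it around and still call further &mut methods on the node.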