Expressing lifetime of C buffers in an FFI binding iterator


#1

Hi All,

I’m working on a Rust binding for rocksdb (yes, I know there are several already, but this is mostly for my own education).

Natively, rocksdb supports keys and values which are plain byte arrays. I represent these with a RawBuf type, which ends up owning the malloc-allocated memory, and RawRef for memory that rocksdb gives temporary access to but not ownership.

In order to make this an ergonomic API, I’m using From<RawBuf> and AsRef<[u8]> traits to allow types to be transparently converted from/to the raw byte variations, so a lookup is done via:

fn get<K, V>(&self, key: K) -> Option<V>
    where K: AsRef<[u8]>, V: From<RawBuf>
{ ... }

which allows users to use their application types that implement those traits (and there are default implementations for String and Vec<u8>).
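For illustration, here is a minimal sketch of how a signature like that can be called. It makes several assumptions that are not part of the real binding: `RawBuf` is a hypothetical wrapper around a `Vec<u8>` (not malloc-allocated memory), and `Db` is a hypothetical in-memory `HashMap` standing in for the rocksdb handle.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for RawBuf: owns a Vec<u8> instead of malloc'd memory.
struct RawBuf(Vec<u8>);

impl From<RawBuf> for Vec<u8> {
    fn from(buf: RawBuf) -> Self {
        buf.0
    }
}

impl From<RawBuf> for String {
    fn from(buf: RawBuf) -> Self {
        // Lossy conversion is an assumption; a real binding might return an error.
        String::from_utf8_lossy(&buf.0).into_owned()
    }
}

/// Hypothetical in-memory store standing in for the rocksdb handle.
struct Db {
    map: HashMap<Vec<u8>, Vec<u8>>,
}

impl Db {
    fn new() -> Self {
        Db { map: HashMap::new() }
    }

    fn put<K: AsRef<[u8]>, V: AsRef<[u8]>>(&mut self, key: K, value: V) {
        self.map.insert(key.as_ref().to_vec(), value.as_ref().to_vec());
    }

    /// Same shape as the lookup above: any byte-like key in, any type
    /// constructible from an owned buffer out.
    fn get<K, V>(&self, key: K) -> Option<V>
    where
        K: AsRef<[u8]>,
        V: From<RawBuf>,
    {
        self.map
            .get(key.as_ref())
            .map(|bytes| V::from(RawBuf(bytes.clone())))
    }
}
```

The caller picks the value type through the annotation, e.g. `let v: Option<String> = db.get("k");`.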

This works fine for get/put/delete operations, but rocksdb also supports iteration, and naturally I want to map this to Rust iterators. Rocksdb iterators maintain a cursor, and you can get a reference to the current key/value for the cursor, but the catch is that those references become invalid as soon as you move the cursor.

This seems like something I ought to be able to encode with lifetimes and ownership, but I haven’t found a combination that works yet.

If I have a function:

fn iter<K>(&self) -> DbIterator<K>
    where K: From<RawRef>

and then a DbIterator implementation for Iterator:

fn next(&mut self) -> Option<K>
    where K: From<RawRef>
{
    unsafe { ffi::rocksdb_iter_next(self.iter); }

    if !self.valid() { return None } // iterator done
    unsafe {
        let mut klen = 0;
        let kptr = ffi::rocksdb_iter_key(self.iter, &mut klen);
        let bufref = RawRef::new(kptr, klen); // basically a &[u8] to a raw pointer

        Some(K::from(bufref))
    }
}

The issue here is that because K::from() takes ownership of the RawRef, it could include parts of that memory in the resulting object, which is fine so long as the object doesn’t outlive the underlying buffer. The buffer is valid until the next call to next(), which I think means it needs to be seen as a borrow of part of the DbIterator's state, so that it must be returned before we can use &mut self again.

But I don’t know how to go about expressing that.

Right now, I’m trying to tie the lifetimes together with DbIterator<'a>, RawRef<'a> and the constraint K: From<RawRef<'a>> + 'a, but I can’t make rustc happy with any of the combinations I’ve tried.

I read the Nomicon, which is a big improvement over the previous documentation for unsafe stuff, but I don’t think it covers this specific case. I played a little with Higher Rank Trait Bounds, but I’m not at all sure they apply here, and they just make things more complex to think about.

Thanks,
J


#2

Unfortunately this kind of iterator isn’t supported by the Rust Iterator trait. See "Returning borrowed values from an iterator", https://www.reddit.com/r/rust/comments/303a09/looking_for_more_information_on_streaming/cpoysog and https://www.reddit.com/r/rust/comments/2t1rxx/more_general_iterator_trait/ for previous discussion.
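To make the limitation concrete: `Iterator::next` returns `Self::Item`, and that type has no way to mention the lifetime of the `&mut self` borrow. Outside the trait, a cursor-style API can tie the two together. The following is only a sketch under stated assumptions: `Cursor`, `key`, `advance` and this `RawRef` layout are hypothetical names, and a `Vec<Vec<u8>>` stands in for the C iterator and its buffers.

```rust
use std::marker::PhantomData;
use std::slice;

/// Hypothetical RawRef: raw pointer plus length, with a marker lifetime that
/// records whose borrow keeps the memory valid.
struct RawRef<'a> {
    ptr: *const u8,
    len: usize,
    _borrow: PhantomData<&'a [u8]>,
}

impl<'a> RawRef<'a> {
    fn as_slice(&self) -> &'a [u8] {
        // Sound only if ptr/len describe memory that stays valid for 'a.
        unsafe { slice::from_raw_parts(self.ptr, self.len) }
    }
}

/// Stand-in cursor; a real binding would hold the rocksdb iterator pointer.
struct Cursor {
    entries: Vec<Vec<u8>>,
    pos: usize,
}

impl Cursor {
    fn new(entries: Vec<Vec<u8>>) -> Self {
        Cursor { entries, pos: 0 }
    }

    /// The returned RawRef borrows `self` for 'a, so it cannot outlive the
    /// cursor or survive a call that takes `&mut self`.
    fn key<'a>(&'a self) -> Option<RawRef<'a>> {
        let k = self.entries.get(self.pos)?;
        Some(RawRef { ptr: k.as_ptr(), len: k.len(), _borrow: PhantomData })
    }

    /// Moving the cursor needs `&mut self`, which the borrow checker refuses
    /// while any RawRef from `key` is still in use.
    fn advance(&mut self) {
        self.pos += 1;
    }
}

/// Usage: read each key, then advance once the borrow is no longer needed.
fn collect_lengths(mut cursor: Cursor) -> Vec<usize> {
    let mut out = Vec::new();
    while let Some(key) = cursor.key() {
        out.push(key.as_slice().len());
        cursor.advance(); // allowed: `key` is not used after this point
    }
    out
}
```

Using `key` after the call to `cursor.advance()` would be rejected at compile time, which is the property the question is after; the cost is that this type cannot implement `Iterator`.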


#3

Thanks very much for the pointers; they look very useful. I’ve come up with something that works, but I think it relies on T being a non-reference value in From<X> for T (where Iterator::Item = T), which is consistent with what you’re saying.
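A minimal sketch of that owned-item shape, not the actual binding code: `RawRef` here is a hypothetical wrapper around a borrowed slice, a `Vec<Vec<u8>>` stands in for the C cursor, and the higher-ranked bound is one assumed way to state that T is fixed independently of the buffer's lifetime, so T has to copy whatever it needs.

```rust
use std::marker::PhantomData;

/// Hypothetical borrowed view of the cursor's current key.
struct RawRef<'a>(&'a [u8]);

impl<'a> From<RawRef<'a>> for Vec<u8> {
    fn from(r: RawRef<'a>) -> Self {
        r.0.to_vec() // copies, so the result does not borrow the buffer
    }
}

impl<'a> From<RawRef<'a>> for String {
    fn from(r: RawRef<'a>) -> Self {
        String::from_utf8_lossy(r.0).into_owned()
    }
}

/// Stand-in iterator; a real binding would wrap the rocksdb iterator pointer.
struct DbIterator<T> {
    entries: Vec<Vec<u8>>,
    pos: usize,
    _item: PhantomData<T>,
}

impl<T> DbIterator<T> {
    fn new(entries: Vec<Vec<u8>>) -> Self {
        DbIterator { entries, pos: 0, _item: PhantomData }
    }
}

impl<T> Iterator for DbIterator<T>
where
    // T must be constructible from a RawRef of *any* lifetime, so T itself
    // cannot name the lifetime of the buffer it was built from.
    T: for<'a> From<RawRef<'a>>,
{
    type Item = T;

    fn next(&mut self) -> Option<T> {
        let raw = self.entries.get(self.pos)?;
        self.pos += 1;
        Some(T::from(RawRef(raw.as_slice())))
    }
}
```

With this, `DbIterator::<String>::new(entries).collect::<Vec<String>>()` works like any other iterator, at the price of one copy per item.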

I’ll go read all those articles now.