Hello all!
I'm having some lifetime issues that I'm not sure how to fix! I'd love some help, but at the very least I hope you enjoy my adventures in designing complex Rust architecture!
Checking bbolt-nub v0.1.0 (/Users/abaxter/projects/better-bbolt/bbolt-nub)
error[E0477]: the type `RefTxCursor<'p, 'tx, T>` does not fulfill the required lifetime
--> bbolt-nub/src/components/cursor.rs:667:1
|
667 | / impl<'p, 'tx: 'p, T: TheTx<'tx, TxPageType = DirectPage<'tx, RefTxBytes...
668 | | CursorSeekRefApi<'tx> for RefTxCursor<'p, 'tx, T>
669 | | where
670 | | for<'b> Self::RefKv<'b>: PartialOrd<[u8]>,
... |
677 | | }
| |_^
error[E0477]: the type `RefTxCursor<'p, 'tx, T>` does not fulfill the required lifetime
--> bbolt-nub/src/components/cursor.rs:668:3
|
668 | CursorSeekRefApi<'tx> for RefTxCursor<'p, 'tx, T>
| ^^^^^^^^^^^^^^^^^^^^^
error[E0477]: the type `RefTxCursor<'_, '_, T>` does not fulfill the required lifetime
--> bbolt-nub/src/components/cursor.rs:668:29
|
668 | CursorSeekRefApi<'tx> for RefTxCursor<'p, 'tx, T>
| ^^^^^^^^^^^^^^^^^^^^^^^
error[E0477]: the type `RefTxCursor<'p, 'tx, T>` does not fulfill the required lifetime
--> bbolt-nub/src/components/cursor.rs:672:6
|
672 | fn seek_ref<'a>(
| ^^^^^^^^
I'm spending my spare time writing a database. One of the main goals is the ability to change low level aspects about how the data is access and organized so I can run experiments. Because of that this code is much much much more generic that any project should reasonably be. It's also fun (for some definition of fun) and I've learned a lot by doing this.
My generic Rube Goldberg machine is below.
Generic page access:
// I need to generically slice/subslice bytes that are either
// * completely loaded in memory (types beginning with Direct)
// * lazily loaded in memory (types beginning with Lazy)
pub trait GetKvRefSlice {
type RefKv<'a>: GetKvRefSlice + 'a
where
Self: 'a;
fn get_ref_slice<'a, R: RangeBounds<usize>>(&'a self, range: R) -> Self::RefKv<'a>;
}
// A single loaded array of memory, usually page_size in length
pub trait TxBytes<'tx>: Deref<Target = [u8]> + AsRef<[u8]> + Clone + Sized + Sync + Send {}
// A generic trait for both directly loaded pages and lazily loaded pages
pub trait TxPageType<'tx>: Page + GetKvTxSlice<'tx> + GetKvRefSlice + Clone + Sync + Send {
type TxPageBytes: TxBytes<'tx>;
}
Generic byte comparisons:
// Since errors may occur during File read calls,
//I created some Try variants for PartialEq, Eq,
//and PartialOrd to be used by the Lazy loading key/value types
pub trait TryPartialEq<Rhs: ?Sized = Self> {
type Error: Error + Send + Sync + 'static;
fn try_eq(&self, other: &Rhs) -> crate::Result<bool, Self::Error>;
// SNIP
}
pub trait TryEq: TryPartialEq<Self> {}
pub trait TryPartialOrd<Rhs: ?Sized = Self>: TryPartialEq<Rhs> {
fn try_partial_cmp(&self, other: &Rhs) -> crate::Result<Option<Ordering>, Self::Error>;
// SNIP
}
pub trait KvTryEq: TryEq + TryPartialEq<[u8]> {}
pub trait KvTryOrd: TryPartialOrd + TryPartialOrd<[u8]> + KvTryEq {}
pub trait KvTryDataType:
KvTryOrd + TryHash + TryGet<u8> + RefIntoTryCopiedIter + RefIntoTryBuf + Sized
{
}
// Direct key/value types has similar requirements
pub trait KvEq: Eq + PartialEq<[u8]> {}
pub trait KvOrd: Ord + PartialOrd<[u8]> + KvEq {}
pub trait KvDataType: KvOrd + Hash + Get<u8> + RefIntoCopiedIter + RefIntoBuf + Sized {}
Most importantly both key/value types must be comparable to [u8]. The library interface will always use [u8] for user inputted data.
This design decision splits any search functionality into 2
// The generic Node interface for both leaf and branch nodes
pub trait HasElements<'tx>: Page + GetKvRefSlice + Sync + Send {
type Element: HasKeyPosLen + Sync;
// SNIP
fn search<'a>(&'a self, v: &[u8]) -> Result<usize, usize>
where
<Self as GetKvRefSlice>::RefKv<'a>: PartialOrd<[u8]>,
{
// SNIP
}
fn try_search<'a>(
&'a self, v: &[u8],
) -> crate::Result<
Result<usize, usize>,
<<Self as GetKvRefSlice>::RefKv<'a> as TryPartialEq<[u8]>>::Error,
>
where
<Self as GetKvRefSlice>::RefKv<'a>: TryPartialOrd<[u8]>,
{
// SNIP
}
// SNIP
}
This way search and try_search functions only exist for the key/value datatypes that support them. Neat!
You've probably noticed the 'tx
lifetime everywhere. I want all resources to be limited to the transaction boundary which is identified by 'tx
.
// When the backend reads a page, the 'tx lifetime
// is explicitly added by the transaction handler. To
// simplify things, backend has no concept of transactions.
pub trait TxReadPageIO<'tx> {
type TxPageType: TxPageType<'tx>;
// SNIP
fn read_node_page(
&'tx self, node_page_id: NodePageId,
) -> crate::Result<NodePage<'tx, Self::TxPageType>, PageError>;
}
pub trait TheTx<'tx>: TxReadPageIO<'tx> {
fn stats(&self) -> &TxStats;
}
pub struct CoreTxHandle<'tx, IO> {
io: RwLockReadGuard<'tx, IO>,
stats: Arc<TxStats>,
tx_id: TxId,
}
Each transaction handle defines the type of memory it will be passing around and whether or not it's direct or lazy loaded.
// The MemMap handler. It passes around a single
// &'tx [u8] that holds the entire page
pub struct RefTxHandle<'tx, IO> {
handle: CoreTxHandle<'tx, IO>,
}
impl<'tx, IO> TxReadPageIO<'tx> for RefTxHandle<'tx, IO>
where
IO: IOPageReader,
IO::Bytes: IntoTxBytes<'tx, RefTxBytes<'tx>>,
{
type TxPageType = DirectPage<'tx, RefTxBytes<'tx>>;
// SNIP
}
impl<'tx, IO> TheTx<'tx> for RefTxHandle<'tx, IO>
where
IO: IOPageReader,
IO::Bytes: IntoTxBytes<'tx, RefTxBytes<'tx>>,
{
// SNIP
}
// The lazy loading handler. It passes around Arc<[u8]>
pub struct LazyTxHandle<'tx, IO> {
handle: CoreTxHandle<'tx, IO>,
}
impl<'tx, IO> TxReadPageIO<'tx> for LazyTxHandle<'tx, IO>
where
IO: IOOverflowPageReader,
IO::Bytes: IntoTxBytes<'tx, SharedTxBytes<'tx>>,
{
type TxPageType = LazyPage<'tx, Self>;
// SNIP
}
impl<'tx, IO> TheTx<'tx> for LazyTxHandle<'tx, IO>
where
IO: IOOverflowPageReader,
IO::Bytes: IntoTxBytes<'tx, SharedTxBytes<'tx>>,
{
// SNIP
}
With that design we can generically access and compare slices in pages, searching within pages using that functionality, and none of that code needs to know how any of the bytes are read in or organized in memory. Very cool!
Where I'm currently stuck is on the Cursor where which is central to walking the database nodes.
// A CoreBucket is the database index we're walking.
// It holds the root node and the transaction.
// A StackEntry is a node page and the element we're
// currently referencing on that node.
pub struct CoreCursor<'p, 'tx: 'p, T: TheTx<'tx>> {
bucket: &'p CoreBucket<'tx, T>,
stack: Vec<StackEntry<'tx, T::TxPageType>>,
location: CursorLocation,
}
impl<'p, 'tx, T: TheTx<'tx>> CoreCursor<'p, 'tx, T> {
// We return keys with this function
pub fn key_value_ref<'a>(
&'a self,
) -> Option<(
<T::TxPageType as GetKvRefSlice>::RefKv<'a>,
<T::TxPageType as GetKvRefSlice>::RefKv<'a>,
)> {
// SNIP
}
// We move around the index with move functions
fn move_to_first_element(&mut self) -> crate::Result<Option<LeafFlag>, CursorError> {
// SNIP
}
}
Searching across the index is where things get interesting. I found I had to explicitly set a different lifetime for<'b>
for RefKv<'b>: PartialOrd<[u8]>
or RefKv<'b>: TryPartialOrd<[u8]>
otherwise the compiler would complain that I could only use a single mutable borrow at a time. Regardless of how I tried to resolved the issue, the borrows would never end. My original function definitions used 'a
instead of 'b
. I'm note exactly sure what I was telling the compiler there, but separating the lifetimes worked.
fn seek<'a>(&'a mut self, v: &[u8]) -> crate::Result<Option<LeafFlag>, CursorError>
where
for<'b> <T::TxPageType as GetKvRefSlice>::RefKv<'b>: PartialOrd<[u8]>,
{
// SNIP
}
fn try_seek<'a>(&'a mut self, v: &[u8]) -> crate::Result<Option<LeafFlag>, CursorError>
where
for<'b> <T::TxPageType as GetKvRefSlice>::RefKv<'b>: TryPartialOrd<[u8]>,
{
// SNIP
}
That brings us to my latest issue. I need to, in a single function both search for a [u8] and return the key/values.
The APIs look like this
pub trait CursorRefApi<'tx> {
type RefKv<'a>: GetKvRefSlice + 'a
where
Self: 'a;
// SNIP
}
pub trait CursorSeekRefApi<'tx>: CursorRefApi<'tx>
where
for<'b> Self::RefKv<'b>: PartialOrd<[u8]>,
{
fn seek_ref<'a>(
&'a mut self, v: &[u8],
) -> crate::Result<Option<(Self::RefKv<'a>, Self::RefKv<'a>)>, CursorError>;
}
Finally we get to the issue at hand. The Cursor compiles right up until I add the search function
// This Cursor mirrors the RefTxHandle above.
// A MemMapCursor will wraps this and will explicitly define the backend for any library user
// No one needs to see the horror show I've created :D
// It's definitely better than the last architecture I've designed
pub struct RefTxCursor<'p, 'tx: 'p, T: TheTx<'tx, TxPageType = DirectPage<'tx, RefTxBytes<'tx>>>> {
filter: LeafFilterCursor<'p, 'tx, T>,
}
// This works as expected
impl<'p, 'tx: 'p, T: TheTx<'tx, TxPageType = DirectPage<'tx, RefTxBytes<'tx>>>> CursorRefApi<'tx>
for RefTxCursor<'p, 'tx, T>
{
type RefKv<'a>
= <T::TxPageType as GetKvRefSlice>::RefKv<'a>
where
Self: 'a;
// SNIP
}
// This errors out
impl<'p, 'tx: 'p, T: TheTx<'tx, TxPageType = DirectPage<'tx, RefTxBytes<'tx>>>>
CursorSeekRefApi<'tx> for RefTxCursor<'p, 'tx, T>
where
for<'b> Self::RefKv<'b>: PartialOrd<[u8]>,
{
fn seek_ref<'a>(
&'a mut self, v: &[u8],
) -> error_stack::Result<Option<(Self::RefKv<'a>, Self::RefKv<'a>)>, CursorError> {
todo!()
}
}
I do not understand which lifetime the compiler is talking about. The error message is completely unhelpful in this regard. My code is available here.
At the very least I hope you enjoyed the tour! I'd love some help if you have any suggestions!