Lifetimes for registers in a virtual machine. Runtime borrow checking

Hello,

I'm trying to store references in an array where the values can borrow each other. The borrow checker (rightly) doesn't like this. I want to be able to check this at runtime. The background below hopefully explains why I want this behavior. I'm aware of RefCell but not sure if that immediately solves the issue.

Background:

I'm creating a small project similar to SQLite for my own learning experience. As part of the SQL execution engine there is a virtual machine. The machine is a register-based virtual machine, and the registers can hold primitive values or handles to open tables (cursors). When trying to model this in Rust I encounter some interesting problems with lifetimes. I think the semantics I want are essentially runtime borrow checking à la RefCell, but I'm struggling to work out the details.

Code:

Here is a simplified set of structs for the objects and their lifetimes

struct DataSource {}
struct Database { data_source: DataSource }
struct Cursor<'a> { data_source: &'a DataSource }
struct RowValueRef<'b> { data_source: &'b DataSource}

The virtual machine needs to store the open cursors for one or many tables and also might store some value references too:

enum Value {Literal(i64), Cursor(Cursor<?>), Row(RowValueRef<?>)}

struct VirtualMachine { registers: [Value; 3], db: Database }

and the machine will have many operation functions like:

impl VirtualMachine {
    pub fn op_open(&mut self, table: &str, reg: usize) {
        self.registers[reg] = Value::Cursor(self.db.open(table));
    }
    pub fn op_read(&mut self, cursor_reg: usize, reg: usize) {
        match self.registers[cursor_reg] {
            Value::Cursor(c) => {
                let v = c.read_value();
                self.registers[reg] = Value::Row(v);
            }
            _ => panic!()
        }
    }
}

Problem

The issue with lifetimes is that the cursors must outlive the rows.

So when reading from self.registers, I can't be certain whether the value it contains is borrowed from another register, or whether that register still contains the value it was borrowed from.

What I want is some type of runtime borrow check: some way the VM can report a lifetime error and abort the virtual program.

  • open table a into r0: r0 = open(a)
  • read row from r0 into r1: r1 = read(r0)
  • open table b into r0: r0 = open(b)
  • at this point if we read from r1 we should get a runtime error since the cursor that row was read from is now closed. (abort the program)

Obviously any program doing this is badly formed, but currently I struggle to even get something like this past the borrow checker.

How can I achieve this? What changes do I need to make to my existing structs to allow this?

Would adding a level of indirection help? A "PossibleRow" that may or may not reference an actual Row?

That's essentially how weak references work with Arc. You may be able to use a Weak (obtained from an Arc or Rc) as a "PossibleRow".

you probably want rows to borrow cursors instead of the underlying data store then.

to borrow the data store at runtime, the state should be put in a RefCell, and it must be shared among all possible borrowers, so typically Rc<RefCell<DataSource>> (or Arc<Mutex<DataSource>> for multithreaded code, since RefCell is not Sync). to express the possibility of a dangling reference, you use weak references instead of strong references.
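A minimal sketch of the shared-ownership-plus-weak-reference idea described above. The `rows` and `index` fields, the `get` method, and the `read_after_close` function are invented for illustration and are not part of the original code; the row here points weakly at its cursor, as suggested earlier in the thread.

```rust
use std::cell::RefCell;
use std::rc::{Rc, Weak};

// Stand-in for the data store; the `rows` field is an assumption for this example.
struct DataSource {
    rows: Vec<i64>,
}

// A cursor co-owns the shared, runtime-checked data source.
struct Cursor {
    data_source: Rc<RefCell<DataSource>>,
}

// A row holds only a weak reference, so it cannot keep its cursor alive.
struct RowValueRef {
    cursor: Weak<Cursor>,
    index: usize,
}

impl RowValueRef {
    // Returns None if the cursor has been dropped or the index is out of range.
    fn get(&self) -> Option<i64> {
        let cursor = self.cursor.upgrade()?;
        let source = cursor.data_source.borrow();
        source.rows.get(self.index).copied()
    }
}

// Reads a row while its cursor is alive, then again after the cursor is dropped.
fn read_after_close() -> (Option<i64>, Option<i64>) {
    let source = Rc::new(RefCell::new(DataSource { rows: vec![10, 20, 30] }));
    let cursor = Rc::new(Cursor { data_source: Rc::clone(&source) });
    let row = RowValueRef { cursor: Rc::downgrade(&cursor), index: 1 };

    let before = row.get();
    drop(cursor); // "closing" the cursor
    let after = row.get();
    (before, after)
}
```

In this sketch the stale read surfaces as `None` instead of a compile error, which the VM could turn into an abort of the virtual program.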

but I would suggest you reconsider your design. some example ideas: ideally, the invariant should be validated when compiling the queries into the VM opcodes; store the references as indices into an external Vec instead of as direct pointers (like ECS, which is popular in games); use something like an epoch counter to check for dangling/stale references; or just implement a tracing validator/collector if that's what you need; etc.
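One way the index-plus-epoch idea could look, as a rough sketch only. All names here (`CursorSlot`, `RowHandle`, `CursorTable`, and their methods) are hypothetical; the point is that a row handle stores a slot index and the epoch it was created in, and resolving it fails once the slot has been reopened.

```rust
// Each cursor slot carries an epoch that is bumped every time the slot is reopened.
struct CursorSlot {
    epoch: u64,
    table: String,
}

// A row handle is plain data: no references, so no Rust lifetimes are involved.
#[derive(Clone, Copy)]
struct RowHandle {
    slot: usize,
    epoch: u64,
    row: usize,
}

struct CursorTable {
    slots: Vec<CursorSlot>,
}

impl CursorTable {
    fn new(n: usize) -> Self {
        let slots = (0..n)
            .map(|_| CursorSlot { epoch: 0, table: String::new() })
            .collect();
        CursorTable { slots }
    }

    // Opening a table in a slot invalidates every handle made from the old cursor.
    fn open(&mut self, slot: usize, table: &str) {
        let s = &mut self.slots[slot];
        s.epoch += 1;
        s.table = table.to_string();
    }

    // Creates a handle stamped with the slot's current epoch.
    fn read(&self, slot: usize, row: usize) -> RowHandle {
        RowHandle { slot, epoch: self.slots[slot].epoch, row }
    }

    // Resolves a handle, or reports a stale reference if the slot was reopened.
    fn resolve(&self, h: RowHandle) -> Result<(&str, usize), String> {
        let s = &self.slots[h.slot];
        if s.epoch == h.epoch {
            Ok((s.table.as_str(), h.row))
        } else {
            Err(format!("stale row handle: slot {} was reopened", h.slot))
        }
    }
}
```

The trade-off in this sketch is that staleness is only detected when a handle is resolved, and the check is one integer comparison.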

some side notes

the term lifetime has a specific semantic definition in Rust, while it is also a general concept in common programming contexts (especially in OOP languages), so it might cause confusion here and there. you don't necessarily need to express the "lifetime" concept using the Rust lifetime mechanism.

for instance, the Rust construct for "runtime borrow checking" is RefCell. it specifically checks the borrow rules defined by Rust; it is not some kind of general runtime sanitizer (or even garbage collector). the related guard type Ref (and RefMut) is different from the general concept of "object references" in other OOP languages (a.k.a. pointers or handles), and is typically used "ephemerally": you are not supposed to use Ref to express object graphs in the OOP paradigm.
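To illustrate what RefCell does and does not check, here is a small sketch. The function name `refcell_rules` is made up for this example; `borrow`, `try_borrow_mut`, and the guard behaviour are standard library API.

```rust
use std::cell::RefCell;

// Shows that RefCell enforces "shared XOR mutable" at runtime, and that the
// guard is meant to be short-lived: once it is dropped, the borrow is released.
fn refcell_rules() -> (bool, bool) {
    let cell = RefCell::new(vec![1, 2, 3]);

    let conflict = {
        let guard = cell.borrow(); // shared borrow is active
        let failed = cell.try_borrow_mut().is_err(); // mutable borrow is refused
        drop(guard); // releasing the guard ends the shared borrow
        failed
    };

    // With no guard alive, a mutable borrow succeeds again.
    let ok_after = cell.try_borrow_mut().is_ok();
    (conflict, ok_after)
}
```

Nothing here tracks whether the value behind the cell is still the "same" object, which is why a weak reference or an epoch check is needed in addition for the stale-row case.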

Thanks both @nerditation and @Coding-Badly. I think both of you independently mention Weak references as the missing part of the puzzle.

The fact that I have raw references in the types here means I didn't allow for runtime checks:

struct Cursor<'a> { data_source: &'a DataSource }
struct RowValueRef<'b> { data_source: &'b DataSource}

I've prototyped a new design with the following:

struct Cursor { db: Weak<RefCell<Database>> }
struct RowValueRef { cursor: Weak<RefCell<Cursor>> }
enum Value { Cursor(Rc<RefCell<Cursor>>), Row(RowValueRef), None }

And that does solve my build problems.
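For reference, a sketch of how the register scenario from the original question might run against these prototype types. The `table` field, the empty `Database`, the `new` constructor, the `op_row_table` operation, and the `String` errors are all assumptions added for illustration; only the three type shapes come from the prototype above.

```rust
use std::cell::RefCell;
use std::rc::{Rc, Weak};

struct Database; // contents elided for the sketch

struct Cursor {
    db: Weak<RefCell<Database>>,
    table: String, // hypothetical: remembers which table was opened
}

struct RowValueRef {
    cursor: Weak<RefCell<Cursor>>,
}

enum Value {
    Cursor(Rc<RefCell<Cursor>>),
    Row(RowValueRef),
    None,
}

struct VirtualMachine {
    registers: [Value; 3],
    db: Rc<RefCell<Database>>,
}

impl VirtualMachine {
    fn new() -> Self {
        VirtualMachine {
            registers: [Value::None, Value::None, Value::None],
            db: Rc::new(RefCell::new(Database)),
        }
    }

    // Overwriting a register drops the cursor it held, if any.
    fn op_open(&mut self, table: &str, reg: usize) {
        let cursor = Cursor { db: Rc::downgrade(&self.db), table: table.to_string() };
        self.registers[reg] = Value::Cursor(Rc::new(RefCell::new(cursor)));
    }

    // Stores a row in `reg` that points weakly at the cursor in `cursor_reg`.
    fn op_read(&mut self, cursor_reg: usize, reg: usize) -> Result<(), String> {
        let row = match &self.registers[cursor_reg] {
            Value::Cursor(c) => RowValueRef { cursor: Rc::downgrade(c) },
            _ => return Err(format!("r{} does not hold a cursor", cursor_reg)),
        };
        self.registers[reg] = Value::Row(row);
        Ok(())
    }

    // Hypothetical op: names the table a row came from, or reports a stale row.
    fn op_row_table(&self, reg: usize) -> Result<String, String> {
        match &self.registers[reg] {
            Value::Row(row) => {
                let cursor_rc = row
                    .cursor
                    .upgrade()
                    .ok_or_else(|| format!("r{}: cursor is closed", reg))?;
                let cursor = cursor_rc.borrow();
                cursor
                    .db
                    .upgrade()
                    .ok_or_else(|| "database is gone".to_string())?;
                Ok(cursor.table.clone())
            }
            _ => Err(format!("r{} does not hold a row", reg)),
        }
    }
}
```

With this sketch, opening table a into r0, reading into r1, and then reopening r0 makes a later `op_row_table(1)` return an error instead of failing to compile.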

I don't think it's a perfect solution, since I was hoping to only change the types defining the virtual machine interface.

Maybe there is a way I can decouple the original database definitions from the virtual machine in such a way where the lifetime shenanigans required for the virtual machine's registers (Rc, RefCell) don't impose changes on my Database and cursor definitions.

I.e. I wonder if it is possible to make my database and cursor structs generic enough to optionally support runtime lifetimes (as required for virtual machine) without removing the ability for code outside the vm to utilize standard rust lifetimes at compile time?
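One possible direction, sketched under assumptions: make the cursor generic over how it holds the data source, so code outside the VM can use a plain `&DataSource` while the VM uses an `Rc<DataSource>`. The `rows` field, `read_value` signature, and `borrowed_and_shared` function are hypothetical. This only covers shared ownership; the weak-reference or epoch check for stale rows would still live in the VM layer.

```rust
use std::ops::Deref;
use std::rc::Rc;

// Stand-in data store; the `rows` field is an assumption for this example.
struct DataSource {
    rows: Vec<i64>,
}

// The cursor accepts any handle that dereferences to a DataSource:
// `&DataSource` (compile-time lifetimes) or `Rc<DataSource>` (runtime ownership).
struct Cursor<S: Deref<Target = DataSource>> {
    data_source: S,
}

impl<S: Deref<Target = DataSource>> Cursor<S> {
    fn read_value(&self, index: usize) -> Option<i64> {
        self.data_source.rows.get(index).copied()
    }
}

// Uses the same Cursor type with a borrowed and with a reference-counted source.
fn borrowed_and_shared() -> (Option<i64>, Option<i64>) {
    // Outside the VM: an ordinary borrow, checked by the compiler.
    let local = DataSource { rows: vec![1, 2] };
    let static_cursor = Cursor { data_source: &local };

    // Inside the VM: shared ownership, with no lifetime parameter in the register type.
    let shared = Rc::new(DataSource { rows: vec![3, 4] });
    let shared_cursor = Cursor { data_source: Rc::clone(&shared) };

    (static_cursor.read_value(0), shared_cursor.read_value(1))
}
```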

Thoughts?
