Data structure for storing u8 buffer w/ undo

  1. Suppose we want to store a buffer (say mini-nano or mini-notepad). We can do something like:
buffer: Vec<Vec<u8>>
  1. This is easy to implement hjkl, insert/delete, etc ... .

  2. Now, we want to add undo trees. This makes things a bit more difficult. Storing "reverse diffs" will be error prone, while doing "snapshots" will be expensive.

  3. What is the recommended data structure for storing buffers w/ undo trees? (And does some Rust crate provide this already.)

Check out rope data structure.

1 Like

At work we develop a CAD system, so we needed a reliable way to deal with this undo/redo problem. What we ended up doing is creating an Operation interface with Do and Undo methods that will update the world.

To make sure these operations are reliable, I've got tests which will execute an operation then undo it, and make sure the new state of the world is identical to how it started.

For your example, it'd probably look something like this:

pub struct World {
  ...
}

pub trait Operation {
  fn do(&self, world: &mut World);
  fn undo(&self, world: &mut World);
}

pub struct InsertText {
  location: ...,
  contents: String,
}

impl Operation for InsertText {
  ...
}

#[test]
fn round_trip_insert_test() {
  let mut world = World::new(...);
  let op = InsertText::new(...);

  let original = world.clone();
  
  op.do(&mut world);
  op.undo(&mut world);

  assert_eq!(world, original);
}
3 Likes

So each edit op is a "ForwardDiff", and each "ForwardDiff" has a corresponding "BackwardDiff", with the requirement that compose(ForwardDiff, BackwardDiff) = identity.

The thing that scares me if that if I get one of these ops wrong, where compose(ForwardDiff, BackwardDiff) != identity .... then the document can "cascade out of control" as ops meant to bring the World to a given state ends up making modifications.

In particular, consider the operation "DeleteToEndOfLine." In order to be able to undo this, we would need to store the text being deleted.

Wellll..... yes! Yes you would! Honestly, no matter how you try to solve this problem, you would need to store that text somewhere.

But if you have operations as specific as DeleteToEndOfLine in your undo history, you're gonna have a bad time. Keep it simple so that you know for sure it can be reversed:

struct Changes {
    changes: Vec<Change>,
}

struct Change {
    pos: Position,
    old: Vec<u8>,
    new: Vec<u8>,
}

impl Changes {
    fn chain(mut self, mut other: Changes) -> Self {
        self.vec.append(&mut other.vec);
        self
    }

    fn inverse(mut self) -> Self {
        self.vec.reverse();
        for change in &mut self.vec {
            std::mem::swap(&mut change.old, &mut change.new);
        }
        self
    }
}

This can easily represent all of the changes you can possibly make to a text file.

You might want to track indices in this vector that correspond to the user's perceived changes, so that the undo key undoes e.g. an entire Find and Replace operation rather than just the last match. (I'd also suggest potentially trying to swap the Vec<u8>s out with something cheaper; say, index ranges into a giant growing Vec<u8> of history string data. Maybe.)

3 Likes

That assumption doesn't need to be true -- one can use persistent data structures to reduce the cost of just keeping those snapshots.

For example, Finger Trees allow logarithmic-cost split and merge while still having access to the originals.

2 Likes

This topic was automatically closed 90 days after the last reply. New replies are no longer allowed.