Design an API where data is accessed through objects?

I couldn't find a good post title, but what I want to do is actually quite simple (I think).

I have one object that represents a ZIP file that is to be written, with multiple files in it. It should allow writing data to those inner files in arbitrary order. Because each file's data has to be contiguous within the ZIP file, my object needs to buffer the data until it's ready to write. (Later I want to add functionality to stream data into one file at a time.)

I could easily implement an API like this:

let mut zip = Archive::new();

zip.add_data("file1", "some data");
zip.add_data("file2", "some data");
zip.add_data("file1", "some data");

But I would prefer to have an API like this (note that the file names are given first):

let zip = Archive::new();

let file1 = zip.add_file("file1");
let file2 = zip.add_file("file2");

file1.add_data("some_data");
file2.add_data("some_data");
file1.add_data("some_data");

I couldn't come up with an implementation that satisfies the borrow checker. Is such an API possible (and idiomatic) in Rust?

You're going to need some kind of interior mutability to make this work. Otherwise each file handle would hold a mutable borrow of the archive, which locks the archive until that handle is dropped. Here's what I came up with:

use std::collections::{BTreeMap, btree_map};
use std::cell::RefCell;
use std::rc::Rc;
use std::ops::DerefMut;

pub struct File<'a> {
    content: Rc<RefCell<String>>,
    
    // This exists to ensure all Files are dropped before a write
    #[allow(dead_code)] parent: &'a Archive
}

impl<'a> File<'a> {
    pub fn edit(&mut self) -> impl DerefMut<Target = String> + '_ {
        // This never fails because Archive never lets multiple copies of the
        // same File exist simultaneously
        self.content.borrow_mut()
    }
}

#[derive(Default)]
pub struct Archive {
    // Note: None of the impl methods are reentrant, and they all release
    // the RefCell lock before returning.  Therefore, taking the lock should
    // never fail.
    files: RefCell<BTreeMap<String, Rc<RefCell<String>>>>
}

impl Archive {
    pub fn insert<'a>(&'a self, name: impl Into<String>) -> Result<File<'a>, &'static str> {
        let name = name.into();
        match self.files.borrow_mut().entry(name) {
            btree_map::Entry::Occupied(_) => Err("File already present"),
            btree_map::Entry::Vacant(e) => {
                let content: Rc<RefCell<String>> = Default::default();
                e.insert(content.clone());
                Ok(File{content, parent: self})
            }
        }
    }
    
    pub fn update<'a>(&'a self, name: impl Into<String>) -> Result<File<'a>, &'static str> {
        let name = name.into();
        match self.files.borrow_mut().entry(name) {
            btree_map::Entry::Vacant(_) => Err("File doesn't exist"),
            btree_map::Entry::Occupied(e) => {
                let content = e.get();
                if Rc::strong_count(content) > 1 {
                    Err("File already open")
                } else {
                    Ok(File { content: content.clone(), parent: self })
                }
            }
        }
    }   
    
    pub fn write(self) {
        for (_name, content) in self.files.into_inner().into_iter() {
            // Because the Files contain a reference to "self", they've all
            // been dropped at this point, and there's no other way to get
            // a copy of this Rc
            assert_eq!(1, Rc::strong_count(&content));
            // Actually serializing `content` is omitted in this sketch
        }
    }
}

#[test]
fn test() -> Result<(), &'static str> {
    let zip: Archive = Default::default();

    {
        let mut file1 = zip.insert("file1")?;
        let mut file2 = zip.insert("file2")?;

        *file1.edit() += "some_data";
        *file2.edit() += "some_data";
        *file1.edit() += "some_data";
    }
    
    zip.write();
    Ok(())
}


Alternatively, you could design an API like this:

let mut zip = Archive::new();

zip.add_data("file1", "some_data");
zip.add_data("file2", "some_data");
zip.add_data("file1", "some_data");
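
For reference, here's a minimal sketch of how that simpler API could buffer the data, assuming a `BTreeMap` from file name to the bytes collected so far. The `contents` accessor is a hypothetical helper added only so the example can check its own result, and actual ZIP serialization is left out entirely.

```rust
use std::collections::BTreeMap;

/// Hypothetical buffering archive: maps each file name to the bytes
/// collected for it so far. Nothing is written until the caller decides to.
#[derive(Default)]
pub struct Archive {
    files: BTreeMap<String, Vec<u8>>,
}

impl Archive {
    pub fn new() -> Self {
        Self::default()
    }

    /// Appends `data` to the named file, creating the file on first use.
    pub fn add_data(&mut self, name: &str, data: impl AsRef<[u8]>) {
        self.files
            .entry(name.to_owned())
            .or_default()
            .extend_from_slice(data.as_ref());
    }

    /// Returns the bytes buffered for `name`, if that file exists.
    /// (Assumed helper, not part of the API discussed above.)
    pub fn contents(&self, name: &str) -> Option<&[u8]> {
        self.files.get(name).map(Vec::as_slice)
    }
}

fn main() {
    let mut zip = Archive::new();

    zip.add_data("file1", "some_data");
    zip.add_data("file2", "some_data");
    zip.add_data("file1", "some_data");

    // Writes to the same name accumulate regardless of the order of calls.
    assert_eq!(zip.contents("file1"), Some(&b"some_datasome_data"[..]));
    assert_eq!(zip.contents("file2"), Some(&b"some_data"[..]));
}
```

Since every call goes through `&mut Archive` and returns nothing that borrows from it, the borrow checker has nothing to complain about. The cost is a map lookup by name on every write.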