Memoize function without cloning

While reading Category Theory for Programmers, I tried to implement the memoize() higher-order function. I did it with a HashMap and some cloning, but I suspect there's a way to avoid cloning by returning references to the memoized results (I'm okay with the type difference of the input and output functions). However I haven't managed to make this work due to conflicting lifetime requirements. Any help would be much appreciated!

pub fn memoize_ref<'a, R: Eq + Hash, S>(f: &'a (dyn Fn(&R) -> S)) -> Box<(dyn FnMut(R) -> &'a S)> {
    let mut map: HashMap<R, S> = HashMap::new();

    Box::new(
        move |x: R| {
            if let Some(y) = map.get(&x) {
                println!("memoized!");
                &y
            } else {
                let y = f(&x);
                map.insert(x, y);
                &y
            }
        }
    )
}

Consider a code:

let mut memo = memoize_ref(f);
// y1 points to an element of a HashMap
let y1 = memo(x1);
// second call to memo may cause reallocation of the HashMap 
// and y1 reference becomes invalid
let y2 = memo(x2);

So we need to make sure that no calls to memo are made, while we keep reference around.

It can be done by introducing a new trait with a following call method:

fn call(&'a mut self) -> &'a S;

In this way, we tell the compiler to keep memoizer locked while we hold the reference.

The code: https://play.rust-lang.org/?version=nightly&mode=debug&edition=2018&gist=c5bc8cd6b038cebb2707cf8a493355d6

Another way to solve this problem is to keep values in an arena, which ensures stable references to its elements. typed_arena isn't available in the playground, so the example uses naive implementation

https://play.rust-lang.org/?version=nightly&mode=debug&edition=2018&gist=9bf39377c6f7f4175ff6a059b1450fa3

Most functional languages use garbage collection, so memory management is not a concern there. In Rust we need to do something explicitly.

3 Likes

This is quite subtle, thanks for highlighting it. The arena seems to do the trick here.

I have a follow-up question regarding this solution: Could we avoid the use of a HashMap outside of the arena (and thus storing separately the references to the computed values) altogether? E.g. by using a HashMap instead of a Vec in the arena?

It can be done, but it still won't allow keeping such HashMap inside a closure, if that's what you want.

Fn* traits don't allow to bind lifetimes of returned references to the lifetime of the closure itself. Compiler will be unable to track validity of the returned references, so it's a no-go.

You can use reference counting (arguably, a form of garbage collection) instead of references to ensure that references to S are available even after HashMap is dropped.

Playground

Why lifetime of references can't be checked statically? Consider this code:

type Memo = Box<dyn FnMut(R) -> &S>;

fn drop_randomly(f: Memo) -> Option<Memo> {
    if random(2) < 1 {
        Some(f)
    } else {
        None
    }
}

let y1 = memo(x1);
let maybe_memo = drop_randomly(memo);
println!("{}", y1); // compiler can't statically check whether y1 still points to allocated memory

I think I get it now, thanks for your time!

This topic was automatically closed 90 days after the last reply. New replies are no longer allowed.