Safe string interning

I bumped into this article. In the final version, the author discusses the use of fake 'static references. I realize you can't get the interner to output a dangling reference, but is that the only constraint required? To me, being unfamiliar with this technique, putting non-'static references where 'static ones are required doesn't feel right.
Miri doesn't complain, by the way.

There are no undefined-behavior requirements involving lifetimes themselves, only the reference and the thing it points to. So if you transmute a &'a T to a &'static T and then use it after T has been deallocated, or borrowed mutably, or after some other soundness property that 'a was guaranteeing has been broken, that's UB, but if you don't do that, it's fine. In this case, matklad can't use a real lifetime 'a, because 'a would ensure the String isn't reallocated by making the strings unmovable and unmodifiable. Instead, that is ensured by simply not modifying the strings until the entire struct is dropped.

If you really want a safe interner anyway, the allocations can be amortized with Rc<str>:
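
A minimal sketch of what such a safe interner might look like, assuming the same two-way layout as the article (a map from string to id and a vector from id to string). The names `Interner`, `intern`, and `lookup` are illustrative assumptions, not taken from the post. Each distinct string is allocated once, and the map and the vector share it through reference counting, so no unsafe code is needed:

```rust
use std::collections::HashMap;
use std::rc::Rc;

/// Hypothetical safe interner: each distinct string lives in one `Rc<str>`
/// allocation shared by the map (string -> id) and the vector (id -> string).
#[derive(Default)]
pub struct Interner {
    map: HashMap<Rc<str>, u32>,
    vec: Vec<Rc<str>>,
}

impl Interner {
    /// Returns the id of `name`, allocating only if it has not been seen before.
    pub fn intern(&mut self, name: &str) -> u32 {
        // `Rc<str>` implements `Borrow<str>`, so a plain `&str` works as a lookup key.
        if let Some(&id) = self.map.get(name) {
            return id;
        }
        let id = self.vec.len() as u32;
        let shared: Rc<str> = Rc::from(name); // the single allocation for this string
        self.map.insert(Rc::clone(&shared), id); // cloning only bumps the refcount
        self.vec.push(shared);
        id
    }

    /// The returned borrow is tied to `&self`, so it cannot outlive the interner.
    pub fn lookup(&self, id: u32) -> &str {
        &self.vec[id as usize]
    }
}
```

The cost compared to the unsafe version is a reference count per string and one allocation per distinct string rather than one per large buffer.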

Note, however, that letting this &'static T escape into external code is unsound, because we don't know how long that code will hold on to the borrow: it could be stored in a thread-local, for example, and then accessed after the T is freed. It's only sound if you're the only one who ever touches this borrow, and even then it would likely be frowned upon unless you restrict its use to a single module where it's clearly documented why all of this is OK.
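
To make that restriction concrete, here is a rough sketch of confining the fake 'static borrows to one module. It is a simplified assumption, not the article's code: it stores one `String` per entry instead of packing strings into large buffers, and the names (`interner`, `storage`, `intern`, `lookup`) are made up for illustration. The private fields hold `&'static str`, but the only public way to get a string back ties the borrow to `&self`:

```rust
mod interner {
    use std::collections::HashMap;

    /// Hypothetical interner. The fake `'static` borrows below never leave this module.
    /// Deliberately not `Clone`: a copy would point into the original's `storage`.
    #[derive(Default)]
    pub struct Interner {
        // These point into the heap buffers owned by `storage`. They are valid
        // only as long as `storage` keeps those buffers alive and unmodified.
        map: HashMap<&'static str, u32>,
        vec: Vec<&'static str>,
        storage: Vec<String>,
    }

    impl Interner {
        pub fn intern(&mut self, name: &str) -> u32 {
            if let Some(&id) = self.map.get(name) {
                return id;
            }
            let owned = String::from(name);
            // SAFETY: the bytes live in `owned`'s heap buffer, which does not move
            // when the `String` value itself is moved into `storage`. The buffer is
            // never mutated or freed before `self` is dropped, and the extended
            // borrow is only handed out through `lookup`, shortened to `&self`.
            let fake: &'static str = unsafe { &*(owned.as_str() as *const str) };
            self.storage.push(owned);
            let id = self.vec.len() as u32;
            self.map.insert(fake, id);
            self.vec.push(fake);
            id
        }

        /// Lifetime elision ties the result to `&self`, not to `'static`.
        pub fn lookup(&self, id: u32) -> &str {
            self.vec[id as usize]
        }
    }
}
```

Because `lookup` returns a borrow bound to the interner, callers cannot stash it in a thread-local or keep it past the interner's drop; changing the return type to `&'static str` would compile just as well and would be exactly the unsound leak described above.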
