Better type systems for more reliable bit packing and twiddling

Some Saturday musings. This person has tried to reduce the memory used by this data structure:

use std::sync::Arc;

enum LongStr {
    Leaf(Vec<u8>),
    Concat(RcStr, RcStr),
}
type RcStr = Arc<LongStr>;

First he used a Box, then he removed the weak pointer from a modified Rc, and later he tried to pack bits:

Another commenter pointed out that it is possible to go all the way down to 16 bytes with some unsafe code. The trick is that on x64, (userland) pointers are limited to a 47 bit address space. This means that two pointers plus an enum discriminant can be packed into 12 bytes, leaving 4 bytes for an atomic reference count. I actually tried implementing this, but it crashed and I didn’t feel like trying to debug why. Such are the perils of writing clever unsafe code, I suppose.
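The arithmetic behind that quote: two 47-bit pointers plus one discriminant bit is 95 bits, which fits in 12 bytes, leaving 4 of a 16-byte node for a 32-bit reference count. Below is a minimal sketch of such a layout (not the commenter's actual code); it assumes userland pointers really fit in 47 bits on x86-64 and ignores the atomic refcount updates a real Arc would need:

// Hypothetical sketch of the 16-byte packing described above; not the
// commenter's code. Assumes userland pointers fit in 47 bits (an x86-64
// detail) and skips the atomicity a real shared refcount would require.
const PTR_BITS: u32 = 47;
const PTR_MASK: u128 = (1u128 << PTR_BITS) - 1;

// Bit layout inside one u128 (16 bytes):
//   bits  0..47  = first child pointer
//   bits 47..94  = second child pointer
//   bit  94      = discriminant (0 = Leaf, 1 = Concat)
//   bits 96..128 = 32-bit reference count
struct PackedConcat(u128);

impl PackedConcat {
    fn new(left: usize, right: usize) -> Self {
        debug_assert!(left as u128 <= PTR_MASK && right as u128 <= PTR_MASK);
        PackedConcat(
            (left as u128)
                | ((right as u128) << PTR_BITS)
                | (1u128 << (2 * PTR_BITS)) // discriminant: Concat
                | (1u128 << 96),            // initial refcount = 1
        )
    }

    fn left(&self) -> usize {
        (self.0 & PTR_MASK) as usize
    }

    fn right(&self) -> usize {
        ((self.0 >> PTR_BITS) & PTR_MASK) as usize
    }

    fn refcount(&self) -> u32 {
        (self.0 >> 96) as u32
    }
}

Even this much already leans on platform details (the 47-bit assumption), which is exactly the kind of thing a better type system could help keep honest.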

For me, a good modern systems language is one that (among other things) lets you use memory efficiently in a sufficiently reliable way. In this problem, that means packing those bits while reducing the probability of crashes and bugs.

This is perhaps impossible to do with 100% reliability, but there are many ways to improve type systems beyond C for these kinds of problems. The paper "Bit Level Types for High Level Reasoning" by Ranjit Jhala and Rupak Majumdar shows an example of what I mean (though I think it's not enough to solve the packing problem presented above):

http://goto.ucsd.edu/~rjhala/papers/bit_level_types_for_high_level_reasoning.html


Making use of extra bits in a pointer, either due to free bits occurring as a result of minimum alignment or physical memory addressing, is generally referred to as "tagged pointers" - searching for that term will reveal many discussions around it. It's definitely not a clear-cut advantage as it makes use of assumptions, either of the allocator or the OS/CPU, that may easily break. But yeah, it's a technique that could be used to reduce footprint.
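For the alignment-based variant, a minimal sketch (assuming the allocator guarantees at least 4-byte alignment, so the low two pointer bits are always free):

// Minimal sketch of an alignment-based tagged pointer; assumes allocations
// are at least 4-byte aligned so the low 2 bits of the address are zero.
use std::marker::PhantomData;

struct TaggedPtr<T> {
    bits: usize,
    _marker: PhantomData<*mut T>,
}

impl<T> TaggedPtr<T> {
    const TAG_MASK: usize = 0b11;

    fn new(ptr: *mut T, tag: usize) -> Self {
        assert!(tag <= Self::TAG_MASK);
        assert_eq!(ptr as usize & Self::TAG_MASK, 0, "pointer not aligned enough");
        TaggedPtr { bits: ptr as usize | tag, _marker: PhantomData }
    }

    fn ptr(&self) -> *mut T {
        (self.bits & !Self::TAG_MASK) as *mut T
    }

    fn tag(&self) -> usize {
        self.bits & Self::TAG_MASK
    }
}

The asserts are what keep it honest: if the alignment assumption ever stops holding, you find out at tagging time rather than via a corrupted pointer later.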

Back to Rust. I suppose one debatable thing is whether Rc/Arc should have support for weak refs by default. Or rather, should there be distinct types for when you need weak refs or you don't. Having a single type is attractive though.

Also, I think it's fair to say that if you're writing something that will be allocating/managing lots of memory, then you'll likely need to write a bunch of your own low level code. It's somewhat unreasonable to expect general purpose allocators and/or libs to be as good as custom code that can exploit usecase specific knowledge.

Packing two 64-bit pointers plus an enum discriminant into 12 bytes goes beyond regular tagged pointers.

as it makes use of assumptions, either of the allocator or the OS/CPU, that may easily break.

A well designed system language should expose as many as possible of such assumptions and give you handy means to manage them to produce sufficiently reliable and sufficiently nice code :slight_smile:

I suppose one debatable thing is whether Rc/Arc should have support for weak refs by default. Or rather, should there be distinct types for when you need weak refs or you don’t.

I don't know much about the Rc/Arc design history, but perhaps you could design them like HashMap, which implements HashSet by using () (the empty tuple) for the values.
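A hypothetical sketch of what I mean (not how std's Arc is actually laid out): make the weak-count field a type parameter, the way HashSet is HashMap with () values, so that choosing () drops the field entirely:

// Hypothetical sketch, not std's real ArcInner: parameterize the weak-count
// field so that () removes it, just as HashSet<T> is HashMap<T, ()>.
use std::sync::atomic::AtomicUsize;

struct ArcInner<T, W = AtomicUsize> {
    strong: AtomicUsize,
    weak: W, // AtomicUsize when weak refs are wanted, () when they're not
    value: T,
}

// With W = (), the weak field is zero-sized and the header shrinks by 8 bytes.
type SlimInner<T> = ArcInner<T, ()>;

fn main() {
    println!("{}", std::mem::size_of::<ArcInner<u64>>()); // 24 on 64-bit targets
    println!("{}", std::mem::size_of::<SlimInner<u64>>()); // 16 on 64-bit targets
}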

Also, I think it’s fair to say that if you’re writing something that will be allocating/managing lots of memory, then you’ll likely need to write a bunch of your own low level code.

Right. The paper I linked shows a type system that enforces correctness in your handling/packing/unpacking of bits. That alone is not enough to solve the memory problems, but it shows an example of a type system that solves a correctness problem that I guess C programmers assume can't be solved.
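Not the paper's system, but a rough flavor of the idea in plain Rust: give each bit field a type that carries its own offset and width, so packing code can't silently overflow a field or write it at the wrong position:

// Rough flavor of bit-level typing, not the paper's actual system:
// each field type carries its offset and width as const generics.
#[derive(Copy, Clone)]
struct BitField<const OFFSET: u32, const WIDTH: u32>(u64);

impl<const OFFSET: u32, const WIDTH: u32> BitField<OFFSET, WIDTH> {
    fn new(value: u64) -> Option<Self> {
        // Reject values that don't fit in WIDTH bits instead of silently truncating.
        if WIDTH < 64 && value >= (1u64 << WIDTH) {
            None
        } else {
            Some(BitField(value))
        }
    }

    fn pack_into(self, word: u64) -> u64 {
        let mask = if WIDTH >= 64 { u64::MAX } else { (1u64 << WIDTH) - 1 };
        (word & !(mask << OFFSET)) | (self.0 << OFFSET)
    }
}

// Example: a 47-bit pointer field at offset 0 and a 1-bit tag at bit 47.
type PtrField = BitField<0, 47>;
type TagField = BitField<47, 1>;

The paper's system goes much further (it reasons about bit-level layouts at a higher level), but even this cheap version already catches the "value too wide for its field" class of mistakes at the point of packing.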

Indeed. But it builds on the same underlying assumptions, and the pros/cons will be the same whether you're tagging one pointer or combining several.

The language or the stdlib/default runtime that comes with it? The blog you linked mostly deals with the library and runtime aspects (i.e. built-in allocator choices).

Me neither. But it's interesting that, as the blog mentions, even if you flat out remove the weak count, the size doesn't change due to jemalloc size buckets. So, although allowing for smaller allocation size is good, you still need the allocator to "make use of it".

I'll admit I've not looked at the paper, but will try to find some time to do that.