What does Rust look like in memory?


#1

Are there any blogs, articles, or web pages that explain how Rust allocates memory? (There are some videos that show what a vector looks like in memory.)
For example, what does a vector of Options that all hold None look like in memory?
“The Reference” didn’t help me much…

My issue is that in C++ for example I can reason about many things.
I can arrange a structure and know with certainty whether it fits into a single cache line on a particular architecture, how cache-friendly a block of memory will be when I traverse it in a particular pattern, and how well the memory accesses will play with the cache prefetcher…

That’s my only gripe with Rust currently. I don’t know with certainty what operations it does in the background and how it arranges memory.

What does Some(Rc<T>) look like in memory, or Some(10i32)?
I know Option<Box<T>> is optimized down to a plain pointer. Does that mean Some(SomeOtherType) is not optimized at compile time?
Is reading the source code the only Option<T> currently?

Do you know where I could find such material?


#2

So, structs will look exactly like they look in C++.

Tuples and arrays are just anonymous structs.
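A quick way to check these claims on your own machine is to print sizes and alignments. This is only a sketch: CLayout and RustLayout are made-up types, and as far as I know only #[repr(C)] actually promises C-style field order and padding, while the default representation leaves the field order up to the compiler.

```rust
use std::mem::{align_of, size_of};

// Hypothetical example type. #[repr(C)] requests C-compatible layout:
// fields in declaration order, padded the way a C compiler would pad them.
#[allow(dead_code)]
#[repr(C)]
struct CLayout {
    a: u8,
    b: u32,
    c: u16,
}

// Same fields with the default representation. The compiler may choose
// the field order, so the exact size is not something to rely on.
#[allow(dead_code)]
struct RustLayout {
    a: u8,
    b: u32,
    c: u16,
}

fn main() {
    println!("CLayout:    size {} align {}", size_of::<CLayout>(), align_of::<CLayout>());
    println!("RustLayout: size {} align {}", size_of::<RustLayout>(), align_of::<RustLayout>());
    // Tuples and arrays can be inspected the same way.
    println!("(u8, u32):  size {}", size_of::<(u8, u32)>());
    println!("[u32; 4]:   size {}", size_of::<[u32; 4]>());
}
```

On common targets where u32 is 4-byte aligned, CLayout comes out at 12 bytes (1 + 3 padding + 4 + 2 + 2 padding), and the array is exactly 4 * 4 bytes.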

Tagged unions are definitely unspecified as to how their layout works.
This is because there are many clever optimizations that can be performed
that we would like to reserve the right to do. For instance Option<...250 more...Option<bool>...>> can in principle be represented as a single byte.

As of today we only do one interesting optimization with enum layout: the
null pointer optimization. If any type contains a non-nullable pointer (&,
Box, Vec, Rc, Arc, …), Option<ThatType> will occupy the same space as
ThatType. Same for any enum isomorphic to Option. None will be represented
by nulling out that pointer.
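The null pointer optimization is easy to observe with std::mem::size_of. A minimal sketch follows; Maybe is a made-up enum with the same shape as Option, added only to check the "isomorphic to Option" part. The equalities for Rc, Arc and Vec reflect what current compilers do rather than something I'd treat as a documented promise.

```rust
use std::mem::size_of;
use std::rc::Rc;
use std::sync::Arc;

// Hypothetical enum with the same shape as Option.
#[allow(dead_code)]
enum Maybe<T> {
    Nothing,
    Just(T),
}

fn main() {
    // Each Option is the same size as the pointer-carrying type it wraps.
    assert_eq!(size_of::<Option<&u8>>(), size_of::<&u8>());
    assert_eq!(size_of::<Option<Box<u8>>>(), size_of::<Box<u8>>());
    assert_eq!(size_of::<Option<Rc<u8>>>(), size_of::<Rc<u8>>());
    assert_eq!(size_of::<Option<Arc<u8>>>(), size_of::<Arc<u8>>());
    assert_eq!(size_of::<Option<Vec<u8>>>(), size_of::<Vec<u8>>());
    // The same holds for a user-defined enum shaped like Option.
    assert_eq!(size_of::<Maybe<Box<u8>>>(), size_of::<Box<u8>>());
    println!("Option<Box<u8>> is {} bytes", size_of::<Option<Box<u8>>>());
}
```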

I cover this in a bit more detail in
http://cglab.ca/~abeinges/blah/turpl/_book/repr-rust.html which I hope to
land in master in the coming week.

edit: I should note that all enumy optimizations are pure compiler magic. You can’t read e.g. Option’s source to find this out (although we do have NonZero, which Box, Vec, etc. use internally to declare that a value will never be null).


#3

By the way, my optional crate has an OptionBool type, as well as a generic Optioned type with implementations for integer and float types (that uses MAX, MIN, or NAN, respectively for the None value).

Of course, the OptionBool type will be pretty useless once the compiler optimizes this stuff away.
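To illustrate the sentinel idea in general terms, here is a rough sketch. SentinelU32 is a made-up type and is NOT the API of the optional crate; it just reserves u32::MAX to mean "no value", so the wrapper stays the same size as a plain u32.

```rust
use std::mem::size_of;

// Hypothetical type for illustration only; not taken from any crate.
// The value u32::MAX is reserved to represent "None".
#[derive(Clone, Copy, Debug, PartialEq)]
struct SentinelU32(u32);

impl SentinelU32 {
    // The "no value" state.
    fn none() -> SentinelU32 {
        SentinelU32(u32::MAX)
    }

    // Wraps a value; the sentinel itself cannot be stored.
    fn some(v: u32) -> SentinelU32 {
        assert!(v != u32::MAX, "u32::MAX is reserved as the None marker");
        SentinelU32(v)
    }

    // Converts back to a regular Option for ergonomic use.
    fn get(self) -> Option<u32> {
        if self.0 == u32::MAX { None } else { Some(self.0) }
    }
}

fn main() {
    println!("SentinelU32: {} bytes", size_of::<SentinelU32>());
    println!("Option<u32>: {} bytes", size_of::<Option<u32>>());
    println!("{:?} {:?}", SentinelU32::some(7).get(), SentinelU32::none().get());
}
```

The trade-off is that one value of the underlying type becomes unrepresentable, which is why a real implementation has to decide what to do when someone tries to store the sentinel.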


#4

Just to add one point to Gankro’s post:

Currently, for any X which doesn’t allocate, Option<X> will take up one more byte of memory than X itself, and Option<X> will be stored directly in memory where X would have been stored (no allocations).

As for Option<Rc<X>>, it will take up the exact same size as Rc<X>. Rust’s null pointer optimizations let Option::None act as a null pointer when Option is storing any type with a guaranteed non-null pointer. Because of this, for any X which allocates and guarantees a non-null pointer (like Box, Rc or Arc), Option<X> will take up the same space as X itself.

You can test all of this out using std::mem::size_of() on playpen: https://play.rust-lang.org/?gist=de81f32647abc7fe4953&version=stable.


#5

It will actually use up more space than a byte in practice. It will
generally try to fill up any space that would otherwise just be padding.
Apparently this makes the optimizer happier.
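You can see this effect by measuring how much Option adds on top of each type. A small sketch follows; option_overhead is a helper name I made up. On the compilers I know of, the overhead for plain integers matches the alignment of the integer, but that is observed behavior, not a guarantee.

```rust
use std::mem::{align_of, size_of};

// Extra bytes Option<T> needs on top of T itself.
fn option_overhead<T>() -> usize {
    size_of::<Option<T>>() - size_of::<T>()
}

fn main() {
    println!("u8:      overhead {} (align {})", option_overhead::<u8>(), align_of::<u8>());
    println!("u16:     overhead {} (align {})", option_overhead::<u16>(), align_of::<u16>());
    println!("u32:     overhead {} (align {})", option_overhead::<u32>(), align_of::<u32>());
    println!("u64:     overhead {} (align {})", option_overhead::<u64>(), align_of::<u64>());
    // Pointer types with a null niche need no extra space at all.
    println!("Box<u8>: overhead {}", option_overhead::<Box<u8>>());
}
```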


#6

To expand on that… Interesting, why does Option<T> take up one more byte? Is it shadow memory for runtime safety checks (i.e. memory sanitizing), or just to make the optimizer happier, as Gankro says?

Option<T> is not the only “mystery” to me. impl SomeStruct, impl TraitWithTypes for SomeStruct - what will be loaded into memory if I use an object of type SomeStruct? If I add many functions through impl SomeStruct, will my instruction cache get filled with things I don’t need when I load the object, can this be directly compared to a C++ struct/class?

Btw, good point on size_of(), I will play around with it… I’m now thinking of also doing some bitwise operations to see what it does.


#7

Look into the source of Option<T> – you will find that it’s an

enum Option<T> {
    Some(T),
    None,
}

(reproduced from memory, so please look up the [src] in the docs – they are just a click away).

Apart from the null pointer optimization mentioned above, an enum is stored as a type tag + space for the variants (this is always the space of the largest variant), so it needs sizeof(T) + sizeof(u8) + padding – the first for the Some variant, the second for the type tag and the last to make the optimizer happy.
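To make the "tag + payload + padding" picture concrete, here is a hand-written equivalent for u32. This is just a sketch: ManualOptionU32 is a made-up type, and #[repr(C)] is used so that the layout is predictable. The compiler's actual layout for Option<u32> is unspecified and may differ.

```rust
use std::mem::size_of;

// Hypothetical hand-rolled tagged value, laid out like a C struct.
#[repr(C)]
struct ManualOptionU32 {
    tag: u8,    // 0 = None, 1 = Some
    value: u32, // only meaningful when tag == 1
}

impl ManualOptionU32 {
    fn none() -> Self {
        ManualOptionU32 { tag: 0, value: 0 }
    }

    fn some(value: u32) -> Self {
        ManualOptionU32 { tag: 1, value }
    }

    // Reads the tag first and only then looks at the payload.
    fn get(&self) -> Option<u32> {
        if self.tag == 1 { Some(self.value) } else { None }
    }
}

fn main() {
    // 1 byte of tag + 3 bytes of padding + 4 bytes of payload on
    // targets where u32 is 4-byte aligned.
    println!("ManualOptionU32: {} bytes", size_of::<ManualOptionU32>());
    println!("Option<u32>:     {} bytes", size_of::<Option<u32>>());
    println!("{:?} {:?}", ManualOptionU32::some(5).get(), ManualOptionU32::none().get());
}
```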


#8

An impl does not add anything to SomeStruct's in-memory representation*. An impl has the exact same runtime overhead as its constituent function definitions: a trait impl with no function definitions in it, such as an impl of Send, is pure compiler metadata with no runtime representation at all. The two programs below compile down to the same asm; only the names are changed.
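The size side of this claim can also be checked directly. A minimal sketch, where Plain, WithImpls and Describe are names invented for the example: two structs with identical fields are compared, one of which has inherent methods and a trait impl.

```rust
use std::mem::size_of;

// No impls at all.
#[allow(dead_code)]
struct Plain {
    data: usize,
}

// Same field, plus inherent methods and a trait impl below.
struct WithImpls {
    data: usize,
}

// Hypothetical trait, only here to have something to implement.
trait Describe {
    fn describe(&self) -> String;
}

impl WithImpls {
    fn get(&self) -> usize {
        self.data
    }

    fn doubled(&self) -> usize {
        self.data * 2
    }
}

impl Describe for WithImpls {
    fn describe(&self) -> String {
        format!("WithImpls({})", self.data)
    }
}

fn main() {
    let w = WithImpls { data: 3 };
    println!("{} {} {}", w.get(), w.doubled(), w.describe());
    // Methods live in the code section, not in each value.
    assert_eq!(size_of::<Plain>(), size_of::<WithImpls>());
    // With dynamic dispatch the vtable pointer travels in the reference
    // (a "fat" pointer), not inside the struct.
    assert_eq!(size_of::<&dyn Describe>(), 2 * size_of::<usize>());
}
```

Functions that are never called are also candidates for removal by the compiler and linker, so adding methods should not by itself put pressure on the instruction cache.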

With a trait

pub struct SomeStruct {
    data: usize,
}
impl From<SomeStruct> for usize {
    fn from(a: SomeStruct) -> usize {
        a.data
    }
}
fn main() {
    let data: usize = From::from(SomeStruct{data: 3});
    println!("{}", data);
}

With a bare function

pub struct SomeStruct {
    data: usize,
}
fn from(a: SomeStruct) -> usize {
    a.data
}
fn main() {
    let data: usize = from(SomeStruct{data: 3});
    println!("{}", data);
}

The difference between the generated assembler files

Generated using rustc --emit asm and diff.

2,3c2,3
< 	.file	"test_bare_fn.0.rs"
< 	.section	.text._ZN4from20h21c98f259508b997iaaE,"ax",@progbits
---
> 	.file	"test_trait_fn.0.rs"
> 	.section	".text._ZN28usize.From$LT$SomeStruct$GT$4from20h0cc7013da25f51e9jaaE","ax",@progbits
5,6c5,6
< 	.type	_ZN4from20h21c98f259508b997iaaE,@function
< _ZN4from20h21c98f259508b997iaaE:
---
> 	.type	_ZN28usize.From$LT$SomeStruct$GT$4from20h0cc7013da25f51e9jaaE,@function
> _ZN28usize.From$LT$SomeStruct$GT$4from20h0cc7013da25f51e9jaaE:
23c23
< 	.size	_ZN4from20h21c98f259508b997iaaE, .Ltmp1-_ZN4from20h21c98f259508b997iaaE
---
> 	.size	_ZN28usize.From$LT$SomeStruct$GT$4from20h0cc7013da25f51e9jaaE, .Ltmp1-_ZN28usize.From$LT$SomeStruct$GT$4from20h0cc7013da25f51e9jaaE
26c26
< 	.section	.text._ZN4main20hc363dbfca27efb45raaE,"ax",@progbits
---
> 	.section	.text._ZN4main20h5cc8e894ab3a62d8waaE,"ax",@progbits
28,29c28,29
< 	.type	_ZN4main20hc363dbfca27efb45raaE,@function
< _ZN4main20hc363dbfca27efb45raaE:
---
> 	.type	_ZN4main20h5cc8e894ab3a62d8waaE,@function
> _ZN4main20h5cc8e894ab3a62d8waaE:
41,42c41,42
< 	movq	const894(%rip), %rdi
< 	callq	_ZN4from20h21c98f259508b997iaaE
---
> 	movq	const903(%rip), %rdi
> 	callq	_ZN28usize.From$LT$SomeStruct$GT$4from20h0cc7013da25f51e9jaaE
46c46
< 	movq	_ZN4main15__STATIC_FMTSTR20h184a1ba70fe95744KaaE(%rip), %rax
---
> 	movq	_ZN4main15__STATIC_FMTSTR20h58daeaf08cc7ff5ePaaE(%rip), %rax
48c48
< 	movq	_ZN4main15__STATIC_FMTSTR20h184a1ba70fe95744KaaE+8(%rip), %rax
---
> 	movq	_ZN4main15__STATIC_FMTSTR20h58daeaf08cc7ff5ePaaE+8(%rip), %rax
60c60
< 	callq	_ZN3fmt24ArgumentV1$LT$$u27$a$GT$3new20h4867565462692555336E
---
> 	callq	_ZN3fmt24ArgumentV1$LT$$u27$a$GT$3new21h10416708861696835008E
73c73
< 	.size	_ZN4main20hc363dbfca27efb45raaE, .Ltmp3-_ZN4main20hc363dbfca27efb45raaE
---
> 	.size	_ZN4main20h5cc8e894ab3a62d8waaE, .Ltmp3-_ZN4main20h5cc8e894ab3a62d8waaE
86c86
< 	movq	const1000(%rip), %rcx
---
> 	movq	const1009(%rip), %rcx
88c88
< 	movq	const1000+8(%rip), %rcx
---
> 	movq	const1009+8(%rip), %rcx
99c99
< 	.section	".text._ZN3fmt24ArgumentV1$LT$$u27$a$GT$3new20h4867565462692555336E","ax",@progbits
---
> 	.section	".text._ZN3fmt24ArgumentV1$LT$$u27$a$GT$3new21h10416708861696835008E","ax",@progbits
101,102c101,102
< 	.type	_ZN3fmt24ArgumentV1$LT$$u27$a$GT$3new20h4867565462692555336E,@function
< _ZN3fmt24ArgumentV1$LT$$u27$a$GT$3new20h4867565462692555336E:
---
> 	.type	_ZN3fmt24ArgumentV1$LT$$u27$a$GT$3new21h10416708861696835008E,@function
> _ZN3fmt24ArgumentV1$LT$$u27$a$GT$3new21h10416708861696835008E:
124c124
< 	.size	_ZN3fmt24ArgumentV1$LT$$u27$a$GT$3new20h4867565462692555336E, .Ltmp6-_ZN3fmt24ArgumentV1$LT$$u27$a$GT$3new20h4867565462692555336E
---
> 	.size	_ZN3fmt24ArgumentV1$LT$$u27$a$GT$3new21h10416708861696835008E, .Ltmp6-_ZN3fmt24ArgumentV1$LT$$u27$a$GT$3new21h10416708861696835008E
143c143
< 	leaq	_ZN4main20hc363dbfca27efb45raaE(%rip), %rax
---
> 	leaq	_ZN4main20h5cc8e894ab3a62d8waaE(%rip), %rax
157,158c157,158
< 	.type	const894,@object
< 	.section	.rodata.const894,"aM",@progbits,8
---
> 	.type	const903,@object
> 	.section	.rodata.const903,"aM",@progbits,8
160c160
< const894:
---
> const903:
162c162
< 	.size	const894, 8
---
> 	.size	const903, 8
164,165c164,165
< 	.type	const1000,@object
< 	.section	.rodata.const1000,"aM",@progbits,16
---
> 	.type	const1009,@object
> 	.section	.rodata.const1009,"aM",@progbits,16
167c167
< const1000:
---
> const1009:
169c169
< 	.size	const1000, 16
---
> 	.size	const1009, 16
171,178c171,178
< 	.type	str1001,@object
< 	.section	.rodata.str1001,"a",@progbits
< str1001:
< 	.size	str1001, 0
< 
< 	.type	str1002,@object
< 	.section	.rodata.str1002,"a",@progbits
< str1002:
---
> 	.type	str1010,@object
> 	.section	.rodata.str1010,"a",@progbits
> str1010:
> 	.size	str1010, 0
> 
> 	.type	str1011,@object
> 	.section	.rodata.str1011,"a",@progbits
> str1011:
180c180
< 	.size	str1002, 1
---
> 	.size	str1011, 1
182,183c182,183
< 	.type	ref1003,@object
< 	.section	.data.rel.ro.local.ref1003,"aw",@progbits
---
> 	.type	ref1012,@object
> 	.section	.data.rel.ro.local.ref1012,"aw",@progbits
185,186c185,186
< ref1003:
< 	.quad	str1001
---
> ref1012:
> 	.quad	str1010
188c188
< 	.quad	str1002
---
> 	.quad	str1011
190c190
< 	.size	ref1003, 32
---
> 	.size	ref1012, 32
192,193c192,193
< 	.type	_ZN4main15__STATIC_FMTSTR20h184a1ba70fe95744KaaE,@object
< 	.section	.data.rel.ro.local._ZN4main15__STATIC_FMTSTR20h184a1ba70fe95744KaaE,"aw",@progbits
---
> 	.type	_ZN4main15__STATIC_FMTSTR20h58daeaf08cc7ff5ePaaE,@object
> 	.section	.data.rel.ro.local._ZN4main15__STATIC_FMTSTR20h58daeaf08cc7ff5ePaaE,"aw",@progbits
195,196c195,196
< _ZN4main15__STATIC_FMTSTR20h184a1ba70fe95744KaaE:
< 	.quad	ref1003
---
> _ZN4main15__STATIC_FMTSTR20h58daeaf08cc7ff5ePaaE:
> 	.quad	ref1012
198c198
< 	.size	_ZN4main15__STATIC_FMTSTR20h184a1ba70fe95744KaaE, 16
---
> 	.size	_ZN4main15__STATIC_FMTSTR20h58daeaf08cc7ff5ePaaE, 16
200,201c200,201
< 	.type	const1013,@object
< 	.section	.data.rel.ro.const1013,"aw",@progbits
---
> 	.type	const1022,@object
> 	.section	.data.rel.ro.const1022,"aw",@progbits
203c203
< const1013:
---
> const1022:
205c205
< 	.size	const1013, 8
---
> 	.size	const1022, 8

* The built-in Drop trait magically adds a hidden “drop flag” to anything that implements it. The core team is planning to move these flags into a bit field on the stack. This is the only exception.


#9

I guess the language has to stabilize more before anybody can say “this is how it works in the background” (the language is still changing, and some claims can’t be made because we don’t know for sure that they will hold in the future).
I’ve come to the conclusion that, on this specific point (understanding how the language works), the way the C++ standards work (trying very hard to keep backwards compatibility) is actually good.
While people do complain about C++ fighting so hard for backwards compatibility, it brings the advantage of a very well-defined language that people can understand and, of course, of not breaking “older” software.

I would compare this “drop flag” with vtables in C++. The standard defines very well when your structure will contain the vtable pointer and where it will be placed in memory. This makes these more complex topics easier to share, and over time more people come to understand them.
Also, you don’t need to read assembly code to understand what the language does (which is hard to do); it’s very well defined in the standard (and there are a lot of people who have already read it and can explain it to you in simpler terms).
In Rust’s case, it’s hard to follow the issues to keep up to date with this kind of change. The only people who understand how the language works are the people who define the language (and the changes to it), which is not OK.

Somehow, it’s disturbing to me that I write in a language I don’t understand. And it is a bigger issue for me particularly because it’s hard to get the necessary info to understand the language.

Now, “coming back to C++”… Hopefully, once Rust has been in the wild for longer, we will have all this info. (It doesn’t need to take as long as it did for C++: I think we have so much information about C++ mainly because the internet has evolved so much, not just because of the language’s age, although the two do go together. What I mean is that if C++ had only been in the wild for the last 10 years, as well defined in a standard as it is now, people would probably have documented it just as well as they have today.) Still, for a native language, I think this info should already have been laid out.

In the end, I am still left unsatisfied with my understanding of the language and with the information available out there for improving it.
I remain hopeful that, as these “concepts/behaviors” are stabilized, the people who define and implement the language will document them better.


#10

One inherent issue here: describing semantics that are not guaranteed means that people will come to rely on those semantics, which makes fixing them to be the semantics we want more difficult.

Rust is not yet at the level of maturity where there will be a full spec. That’s going to take time. So there’s just always going to be these under-specified bits for now.


#11

For the record, the C++ standard doesn’t define where vtable pointers are stored or how virtual dispatch works in general. Most compilers do it pretty similarly, but I’ve heard of one that, IIRC, sticks the vtable pointer as if it were a field declared in the class at the location of the first virtual method.


#12

Yeah, it seems I made a claim without being properly informed on the subject.
I had also always assumed that the vpointer is placed at the beginning of the object, which, if the Wikipedia article is to be believed, is not true…

Well, anyway, what’s important to know is that the vpointer is added to the object (and when), not so much where it is placed inside the object. That information can be used to work out why it is not a good idea to call virtual functions from within constructors, to know that the size will change, that you also have to be careful when taking pointers into the object…


#13

I don’t think C++ even specifies the presence of a vtable pointer, and I know padding isn’t specified, so you can’t really know much about the size. All it specifies is that the address of every member is bigger than the one before it.