So, how is accessing data on the stack faster than accessing data in the heap?

From what I understand, I can see that putting data on the stack would be faster/more efficient than putting data in the heap, but I don't see how accessing data on the stack would be faster.

In the book it says:

Accessing data in the heap is slower than accessing data on the stack because you have to follow a pointer to get there.

To take an example: I create two structs, StructOne and StructTwo, in that order. In a later function I want to access the static data from StructOne (which should be on the stack). StructTwo, however, has been created later, so it would sit on top of StructOne on the stack. Doesn't accessing StructOne involve following a pointer here, since the top of the stack is StructTwo? How does it reach StructOne?
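As a minimal sketch of the scenario I have in mind (the field names here are made up for illustration):

```rust
// Hypothetical sketch of my example: struct and field names are invented.
struct StructOne {
    field1: i32,
    field2: i32,
}

struct StructTwo {
    field1: i32,
    field2: i32,
}

fn main() {
    // Both live in this function's stack frame, StructTwo "above" StructOne.
    let one = StructOne { field1: 1, field2: 2 };
    let two = StructTwo { field1: 3, field2: 4 };
    // Question: does reading `one.field1` here involve following a pointer?
    println!("{} {}", one.field1, two.field2); // prints "1 4"
}
```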

... a processor can do its job better if it works on data that’s close to other data (as it is on the stack) rather than farther away (as it can be on the heap).

With that said, I don't follow this statement about the data being 'closer'.

Can anyone clarify further? I'm not seeing it.

Ownership: Stack and Heap

I guess I'll have to dig more into some specifics of cache/register memory since I'm not familiar with that. Thanks for the replies!

Let's see what the stack looks like in your example:

StructOne.Field1
StructOne.Field2
StructTwo.Field1
StructTwo.Field2
---------- <- stack-pointer (sp)

(I assumed each struct contains two fields).
To access any field of the struct, you have to do something like mov rdi, [sp-offset], where offset is a constant (the exact assembly would vary; I am cutting a lot of corners to keep the explanation simple).
Now suppose you have a pointer to something on the heap. Suppose the pointer itself is stored on the stack, at sp - 8. So, in order to access the value in the heap, you need the following two instructions:

mov rdi, [sp-8]
mov rax, [rdi]

This is what is meant by following a pointer: you need two loads instead of one.
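The same two situations, as a minimal Rust sketch (the type and variable names are mine):

```rust
struct Point {
    x: i64,
    y: i64,
}

fn main() {
    // Stack: `p.x` is one load at a constant offset from the stack pointer.
    let p = Point { x: 1, y: 2 };

    // Heap: `b` itself (the pointer) lives on the stack, so `b.x` takes
    // two loads -- one for the pointer, one for the field it points to.
    let b = Box::new(Point { x: 3, y: 4 });

    println!("{} {}", p.x, b.x); // prints "1 3"
}
```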


Every local variable in a given function is in the same contiguous stack area; that's the scenario @RedDocMD demonstrates. You're also probably going to hit it a lot, so it's likely to be in a cache... or parts of it in a register even.

In contrast, if you have two Strings, say, their (heap-allocated) memory could be far away not only from the stack, but from each other, and those memory areas may not have been accessed recently when the function body executes (so they're less likely to be in cache).

Now, you did say...

"In a *later function* I want to access the static data from StructOne" (emphasis added)

If you can access StructOne in a later function, you either

  • Passed it in by value, in which case you moved it and it's local
  • Passed in a reference to it (or pointer etc)

In the latter case, you will still be following a pointer. That said, it's still probably pretty "close" (in the same general region of stack) to the current function (unless you pass some reference deeper and deeper into the call stack, or have some large stack variables), so probably it's still more likely to be in cache.
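A small sketch of those two cases (function and field names are mine, not from the original example):

```rust
#[derive(Clone, Copy)]
struct StructOne {
    field1: i32,
}

// By value: the struct is moved/copied into this function's own frame,
// so the access is a plain load from the local stack area.
fn by_value(s: StructOne) -> i32 {
    s.field1
}

// By reference: the caller's stack slot is reached through a pointer,
// but it is still stack memory and very likely still in cache.
fn by_ref(s: &StructOne) -> i32 {
    s.field1
}

fn main() {
    let one = StructOne { field1: 7 };
    assert_eq!(by_value(one), 7);
    assert_eq!(by_ref(&one), 7);
}
```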

Getting more into spitballing territory, there are also other factors my intuition says probably matter too: say a function you're passing a reference to a stack variable into gets inlined; the pointer could be optimized away.


It's not usually going to make a big difference.

The fact that you have to "follow a pointer" often doesn't matter, because that pointer only has to be loaded once, and then you already have it in a register. To access data on the stack you also have to "follow a pointer" -- the stack pointer.

What matters much more is whether the data is in the memory cache or not, whether the access is to consecutive locations or randomly jumping around, etc. Whether it's on the stack or on the heap is the last thing I would worry about in this context.
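As a toy illustration of access pattern (the input here is far too small to measure anything; it only shows that the sequential and "jumping around" versions do the same work, while on large inputs only the sequential one is cache-friendly):

```rust
fn main() {
    let data: Vec<u64> = (0..1_000u64).collect();

    // Sequential access: consecutive cache lines, prefetcher-friendly.
    let sequential: u64 = data.iter().sum();

    // Jumping around via an index permutation (stride 7 is coprime to
    // 1000, so every element is still visited exactly once).
    let scattered: u64 = (0..data.len())
        .map(|i| data[(i * 7) % data.len()])
        .sum();

    // Same result either way; on large inputs only the speed differs.
    assert_eq!(sequential, scattered);
}
```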


The sentiment of this sentence is generally true, even though taken literally it is not quite accurate.

There are cases in Rust where a type T is faster than Box<T>, but it's not specifically about the stack, but rather indirection and optimization:

  • Using i32 instead of Box<i32> allows the integer to be stored in a register, which is the fastest option.

  • If you have struct Foo { bar: Bar } versus struct Foo { bar: Box<Bar> }, then accessing foo.bar through a &Foo is faster in the first case.

  • If you have a local variable that contains a struct, let baz = Baz { quz: 1, quux: 2 }, then the optimizer can tear the struct apart, as if each field was a separate variable, and then put them in registers or push them on the stack separately as needed. The optimizer can't make as many assumptions about Box<Baz> (it might still "cache" some field reads in registers, but it will have to lay out the struct in memory first).
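The last point, sketched out (using the Baz struct from above; the field types are my assumption):

```rust
struct Baz {
    quz: u32,
    quux: u32,
}

fn main() {
    // Local struct: the optimizer is free to keep `quz` and `quux` in
    // registers, as if they were two independent variables.
    let baz = Baz { quz: 1, quux: 2 };
    let local_sum = baz.quz + baz.quux;

    // Boxed struct: `Baz` has to be laid out in heap memory first, and
    // its fields are reached through the pointer.
    let boxed = Box::new(Baz { quz: 1, quux: 2 });
    let boxed_sum = boxed.quz + boxed.quux;

    assert_eq!(local_sum, boxed_sum); // same value, different codegen
}
```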


That's the key part of your misunderstanding. StructTwo may be initialized later, but it's not created later.

No.

All the structures on the stack are created simultaneously, when your function is called. The compiler decides at compile time where to put these structures in the function frame. It doesn't matter whether you have 1 structure or 100; they are all allocated in one operation. And since you don't need to calculate offsets between them at runtime, you can access them all easily. Yes, during compilation the compiler may do a lot of work to decide where and how to place these structures, but at runtime you do nothing.

Now, structures on the heap are placed there independently. Their addresses are not related to each other, and each would have its address stored somewhere at runtime.

Basically: when you put structures on the stack, you move all these calculations from runtime to compile time. That's why access is faster.

Of course, if your structure is on the stack and you pass a reference to it into another function, that's no faster than accessing the heap. Except if the function is inlined, of course; then there is no reference, because the function doesn't actually exist anymore, and the associated runtime work is not needed.
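The inlining case, sketched (the function name is mine; note that #[inline] is only a hint, not a guarantee):

```rust
struct StructOne {
    field1: i32,
}

// A hint (not a guarantee) that this call may be inlined; once inlined,
// the reference parameter can be optimized away entirely and `field1`
// read straight from the caller's frame or a register.
#[inline]
fn read_field(s: &StructOne) -> i32 {
    s.field1
}

fn main() {
    let one = StructOne { field1: 42 };
    assert_eq!(read_field(&one), 42);
}
```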
