I don't understand, how is this "behind the scenes"? There's only a single question needed to determine if something is on the heap or stack: "Did you put it on the heap yourself (or call code that put it on the heap)?".
I don't think any method of recovering data from that would be very efficient, whether compiler-written or human-written. I think it's just a big blob of data; the compiler remembers where each string starts and how long it is, and that information is used wherever the string is accessed.
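A rough sketch of that "blob plus offsets" idea, with made-up data and a hypothetical `msg` helper (an illustration of the scheme, not what rustc literally emits):

```rust
// One big blob of string data; the "compiler" just remembers
// where each string starts and how long it is.
const BLOB: &str = "long message for err1but message for err2 can be longer";

// (offset, length) pairs recorded at build time.
const OFFSETS: [(usize, usize); 2] = [(0, 21), (21, 34)];

// Slice the blob wherever a string is accessed; no per-string
// allocation and no null terminators needed.
fn msg(i: usize) -> &'static str {
    let (start, len) = OFFSETS[i];
    &BLOB[start..start + len]
}

fn main() {
    println!("{}", msg(1)); // prints "but message for err2 can be longer"
}
```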
A variable created with let within a function can be on the stack or on the heap. The difference is that when Rust places it on the stack and you try to use it after it has been dropped, you get a compiler error. The compiler will first suggest that you define a borrow lifetime, only for you to eventually find out that you need to copy (or move) the data if you want to get it out of the function.
In pointer-oriented languages a pointer is not deallocated when you return it from a function. It stays valid for as long as the data it points to is valid.
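A minimal sketch of the situation described above, with hypothetical function names: Rust rejects returning a reference to a local at compile time, while moving the owned value out works fine:

```rust
// This does NOT compile: `s` is dropped when the function returns,
// so the returned reference would dangle.
// fn make_ref() -> &String {
//     let s = String::from("hello");
//     &s
// }

// The fix: move the owned value out of the function instead.
fn make_owned() -> String {
    let s = String::from("hello");
    s // ownership moves to the caller; no dangling reference
}

fn main() {
    let v = make_owned();
    assert_eq!(v, "hello");
}
```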
Why do you say so? Drepper advocated that more than 10 years ago. In fact I'm pretty sure what glibc does is still more efficient than what the Rust compiler does (because it avoids the penalty of relocations at program startup).
Any language which doesn't have null-terminated strings (ISO Pascal or Ada, for example) would do this. The compiler is not human; such things just happen naturally.
I think you are significantly overestimating what the Rust compiler can do compared to what humans can do with C or C++.
True, Rust can do certain micro-optimizations which allow us to use things like Option<&Foo> without thinking about the inefficiency of an additional bool. But they are pretty limited, relatively obvious, and mostly compensate for high-level features of Rust (the lack of NULL in safe Rust, for example).
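That niche optimization can be observed directly with `std::mem::size_of` (`Foo` here is just a placeholder type):

```rust
use std::mem::size_of;

#[allow(dead_code)]
struct Foo {
    x: u64,
}

fn main() {
    // &Foo can never be null, so None is encoded as the null pointer:
    // Option<&Foo> costs no extra space compared to &Foo.
    assert_eq!(size_of::<Option<&Foo>>(), size_of::<&Foo>());

    // With no niche available, the discriminant needs extra space.
    assert!(size_of::<Option<u64>>() > size_of::<u64>());

    println!("Option<&Foo>: {} bytes", size_of::<Option<&Foo>>());
}
```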
It's not even close to what determined speed-oriented developers may do with C or C++.
No, it's always on the stack. The value may manage and control heap-allocated data, but the value itself still lives on the stack. For example, while Vec stores its elements on the heap, it keeps the pointer to that data, its length, and its capacity on the stack.
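A small check of that claim (the three-usize layout of Vec is how it works in practice on current rustc, though the exact field order is not formally guaranteed):

```rust
use std::mem::size_of;

fn main() {
    let small: Vec<i32> = Vec::new();
    let big: Vec<i32> = vec![0; 1_000_000];

    // The Vec value itself (pointer, length, capacity) lives on the
    // stack and has the same size no matter how much heap data it owns.
    assert_eq!(size_of::<Vec<i32>>(), 3 * size_of::<usize>());

    assert!(small.is_empty());
    assert_eq!(big.len(), 1_000_000);
}
```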
Strangely enough, that is exactly how MS-DOS stores its environment variables: one null-terminated string, followed by another, and another... the whole thing terminated by an extra null at the end.
I happen to know this because I was the one who once forgot about that extra null at the end and was subsequently chided for introducing a bug that randomly consumed up to a whole segment of memory (64K).
The original incarnations of CADSTAR were for MS-DOS and stored all the strings that appeared in the user interface like that, along with an index for quick look-up.
It's amazing what one will do to save memory when there is not much of it. Which is still the case in billions of small micro-controllers.
For application-maintenance reasons I would normally create the messages like this:
enum { ERR1, ERR2, ERR3 }; /* message indices, assumed defined somewhere */
static const char *msgs[] = {
    [ERR1] = "long message for err1",
    [ERR2] = "but message for err2 can be longer",
    [ERR3] = "or message for err3 can be short",
};
I would never create the messages like
static const char msgs[] = "long message for err1but message for err2 can be longeror message for err3 can be short";
That is certainly the most memory-optimized solution.
But how can you show the message for ERR2 now?
Still, the discussion you mentioned shows that the former method is not optimal.
Now, the macros that the document introduces require that the strings be predefined at compile time:
The programmer only has to add the strings, appropriately marked, to a data file which is used in the compilation.
And that, I think, is the reason Rust insists on immutability wherever possible.
The work you linked to shows that they spent a lot of effort optimizing the memory management of three error messages, which is only a micro-optimization. It will only make a real difference when you have to manage thousands of those messages.
I think I can agree with @VorfeedCanal. For example, this comment in the article: I’ve also been told, by people with white in their hair, with an air of misty-eyed revelation, that once you “get” Rust’s memory model of the stack and the heap, that things just all fit together wonderfully.
So few developers are actually trained on actual memory models in the OS that it is entirely foreign to them. Rust does not have a memory model of stack and heap; those are just the OS mechanisms as they have been for decades. I find basic OS concepts hard to explain to developers who came up since the days of Java, Ruby, Python, etc. They have had so much hidden from them for their whole careers. For example, I just did a large data implementation using Rust and FIFO pipes on Linux. None of my "less white haired" fellow engineers had a clue that FIFO pipes existed.
To those of us coming from C/C++, all of it makes complete sense, and it is exciting to see the control you get back. For me Rust feels like an overdue correction for the past 20 years. That being said, Rust takes some "real" learning if you don't understand the OS architecture.
When I define a local variable inside a function it exists on the stack. It only exists for the lifetime of the function because the stack is unwound when the function returns.
When I want data to outlive the function that creates it then I have to do something else, like use malloc() in C, new in C++, or Box in Rust.
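A minimal Rust sketch of that pattern, with a hypothetical function name:

```rust
// A local integer lives on the stack and dies with the function,
// but Box::new moves the value to the heap so it can outlive it.
fn make_on_heap() -> Box<i32> {
    let x = 42;     // on the stack
    Box::new(x)     // moved to the heap; the Box (a pointer) is returned
}

fn main() {
    let b = make_on_heap();
    assert_eq!(*b, 42); // the data survived the function that created it
}
```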
The whole notion of stack and heap is fundamental to the way these languages work. This is true in all manner of compiled languages: C, C++, Pascal, PL/M, Rust, many more.
Actual memory models of the OS are not relevant. We know this to be true because we can use these languages without any OS underneath. In fact we write operating systems in such languages.
In short, notions of stack and heap are not peculiar to Rust, they are ancient as the hills and present in many compiled languages. Neither are they anything to do with operating systems.
Speaking as a guy with white hair and misty eyes I totally agree with that. Even if not for the reasons given.
...unless you consider JVM languages "compiled", of course, since in the JVM every object is on the heap, and only references (Rc<RefCell<_>>, in Rust terms) live on the stack.
Personally I cannot accept anything that requires a huge runtime/virtual machine in order to interpret and execute its intermediate representation as a "compiled language". If I can't simply dump the binary into memory and point the processor's program counter at it, then it is not compiled.
Let's call it "half compiled".
Of course one could devise a properly compiled language that keeps all its local variables, even things like simple integers, on the heap. That would be kind of silly though.
While true, this can be considered an implementation detail. A Java implementation could just as well allocate its ints on the heap and store a pointer in the variable, as it does for other types of objects, and it would make no functional difference (other than performance), because there is no mechanism to take a reference to that int variable, or to reference variables, which are also stored on the stack.
Of course one can compile any traditionally "compiled" language into anything one likes. For example compile C code to RISC V and run it on a RISC V emulator on an x86 machine. Compile C to asm.js and run it as Javascript in your browser.
I don't think that being able to compile Rust to WASM detracts from the original intent of the language.
There is such a mechanism: it's called a "closure". Of course, the ability to take such a reference would blow away the whole illusion of everything living on the heap; that's why an attempt to capture a non-final int is a compile-time error (but how many Java users know that?). Go would silently move the int to the heap if you tried to capture it, so there is no such difference there.
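For comparison, Rust makes this choice explicit rather than silent: a closure must take ownership with `move` if the captured variable has to outlive the function. A small sketch with hypothetical names:

```rust
fn make_counter() -> impl FnMut() -> i32 {
    let mut n = 0; // would normally die with the function...
    // ...but `move` transfers ownership of `n` into the closure,
    // so it survives after make_counter returns; no GC or silent
    // escape-to-heap analysis is involved.
    move || {
        n += 1;
        n
    }
}

fn main() {
    let mut next = make_counter();
    assert_eq!(next(), 1);
    assert_eq!(next(), 2);
}
```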
What about C# in AOT mode for iOS? Is it compiled language or half-compiled one?
Not at all. Go is one such language. Well, it puts its variables on the stack when it can, but that's considered an optimization: you can take a reference to any of them and they will survive after the function exits.
To understand why people will always demand GC in Rust and will never get it, you really need to watch this lecture by Tony Hoare. At about the 15th minute he explains how customers who bought and used computers with Algol compilers were absolutely happy to pay for range checking, and thus the ability to turn range checking off was never added. And then at about the 18th minute he tells how an attempt to sell a Fortran-to-Algol transpiler was a disaster: most transpiled programs crashed with a subscript error immediately, but the customers weren't interested in fixes for them; they just wanted to run them!
And the same thing happens today with Rust and GC: people who demand GC don't want to make sure their programs are correct, they just want to run them! The article we are discussing here says the same thing in different words: in practice, people just want to be able to write a tree-like type without having to play chess against the compiler.