I learnt that Rust is able to keep literals on the stack because they have a fixed size that is known at compile time. For example, an i8 is 8 bits, with a maximum value of 2^7-1. How about string literals? Is there a maximum number of characters that a string literal is allowed to have before Rust says it's larger than what a stack frame can accommodate?
I could be very wrong, but I think string literals are implicitly &'static str. So they're the equivalent of writing out:
static MY_STRING: &str = "This is my string!";
Which means the string literal is compiled into the data section of your application. So it has a fixed address, but it's not on the stack.
Ah, the Rust book explains it better than me.
In the case of a string literal, we know the contents at compile time, so the text is hardcoded directly into the final executable. This is why string literals are fast and efficient.
Recall that we talked about string literals being stored inside the binary. Now that we know about slices, we can properly understand string literals:
let s = "Hello, world!";
The type of s here is &str: it’s a slice pointing to that specific point of the binary. This is also why string literals are immutable; &str is an immutable reference.
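To make the quoted explanation concrete, here is a minimal sketch (the helper names assert_static and handle_size are made up for illustration). It assumes the usual representation of &str as a pointer plus a length, so the part that lives on the stack is the same size however long the literal's text is:

```rust
use std::mem::size_of_val;

// Only compiles if the argument lives for the whole program ('static),
// which is the case for every string literal.
fn assert_static(s: &'static str) -> &'static str {
    s
}

// Size in bytes of the reference itself (pointer + length), not of the text.
fn handle_size(s: &str) -> usize {
    size_of_val(&s)
}

fn main() {
    let short = assert_static("hi");
    let long = assert_static("a considerably longer string literal than the first one");

    // The text stored in the binary differs in length...
    println!("text lengths: {} and {}", short.len(), long.len());
    // ...but the value kept on the stack is the same size for both.
    println!("handle sizes: {} and {}", handle_size(short), handle_size(long));
}
```

On a 64-bit target I would expect both handle sizes to be 16 bytes (two machine words).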
Got it. String literals are actually pointers, which are fixed-size and hence can be kept on the stack. Thanks.
In fact, a string literal does have a maximum size, which is usize::MAX =)
I found this stack overflow interesting
https://stackoverflow.com/questions/31722881/is-an-entire-static-program-loaded-into-memory-when-launched
In short: the executable is not loaded into memory until that part (in 4k chunks) of the executable is needed (generates a page fault when it is needed), unless a prefetch service is running which loads it and required files (based on previous executions).
(Irrelevant nitpick: it's actually isize::MAX as usize, because of LLVM GEP restrictions. See https://doc.rust-lang.org/std/primitive.pointer.html#method.offset. But of course I suspect rustc will fail to compile the code well before you hit that big of a string literal.)
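A rough sketch of that limit, for anyone who wants to see the numbers. The function names are hypothetical, and the Layout check assumes a reasonably recent toolchain, where std::alloc::Layout refuses sizes above isize::MAX:

```rust
use std::alloc::Layout;

// Upper bound, in bytes, on the size of any single Rust object.
fn max_object_bytes() -> usize {
    isize::MAX as usize
}

// Asks the standard library whether a byte array of length n is even
// describable as one allocation (nothing is actually allocated here).
fn fits_in_one_allocation(n: usize) -> bool {
    Layout::array::<u8>(n).is_ok()
}

fn main() {
    println!("usize::MAX          = {}", usize::MAX);
    println!("isize::MAX as usize = {}", max_object_bytes());

    // isize::MAX is half of usize::MAX, rounded down.
    assert_eq!(max_object_bytes(), usize::MAX / 2);

    // One byte past the limit is already rejected.
    println!("at the limit: {}", fits_in_one_allocation(max_object_bytes()));
    println!("one past it:  {}", fits_in_one_allocation(max_object_bytes() + 1));
}
```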
If I understand this correctly:
A string literal pointer is made up of address, length and capacity information.
This limit is imposed by the biggest number that the length or capacity field is able to represent.
Please correct me if I am wrong
A pointer to a string literal (&str) is only an address and a length, and there's never any capacity involved.
The data of the String struct is an address, a length and a capacity.
String has those three, but string literals don’t have a capacity. &str can’t modify the backing memory, and so doesn’t need it.
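A quick way to check this yourself is to compare the sizes of the two types. This is only a sketch: the helper words is made up, and the exact layout of String is not something the language guarantees, but on current toolchains it comes out as three machine words against two for &str:

```rust
use std::mem::size_of;

// Size of T measured in machine words (usize-sized units).
fn words<T>() -> usize {
    size_of::<T>() / size_of::<usize>()
}

fn main() {
    // &str: address + length.
    println!("&str   is {} words", words::<&str>());
    // String: address + length + capacity.
    println!("String is {} words", words::<String>());

    // Only the owned String has spare room it can grow into.
    let literal: &str = "hello";
    let mut owned = String::with_capacity(32);
    owned.push_str(literal);
    println!("len = {}, capacity = {}", owned.len(), owned.capacity());
}
```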
Oh yeah that’s right, that means the limit is there because of the largest number that length can represent
On both ARM64 and x86_64, it's impossible to actually overflow that number with real allocation sizes, because the CPU actually only supports a 48-bit address space. The whole upper two bytes, including the sign bit, are always zero.
On 32-bit general purpose platforms, it's a bit more complicated. In a lot of operating systems, the upper half of the address space is reserved by the OS, which also happens to be the "negative numbers" in an isize representation. So, it's still impossible to overflow that number before running out of address space.
So you have to be running on 32-bit, bare metal, and you might have to use page table trickery to avoid slamming into the PCI hole, in order to allocate a string that takes up half of your address space.
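For the 64-bit case, the arithmetic looks roughly like this. It is a sketch that takes the 48-bit figure above at face value, and the constant and function names are made up for illustration:

```rust
// Size of a 48-bit address space in bytes, as assumed in the post above.
const ADDRESS_SPACE_48_BIT: u64 = 1 << 48;

// The isize::MAX limit on a 64-bit target, written with fixed-width
// types so the sketch also compiles on 32-bit machines.
const OBJECT_LIMIT_64_BIT: u64 = i64::MAX as u64;

// How many whole 48-bit address spaces fit under the limit.
fn headroom_factor() -> u64 {
    OBJECT_LIMIT_64_BIT / ADDRESS_SPACE_48_BIT
}

fn main() {
    println!("48-bit address space: {} bytes", ADDRESS_SPACE_48_BIT);
    println!("isize::MAX (64-bit):  {} bytes", OBJECT_LIMIT_64_BIT);
    println!("headroom factor:      {}", headroom_factor());
}
```

So even an allocation covering the entire 48-bit address space would be thousands of times smaller than the limit.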
Yes, the documentation for the function I linked mentions as much:
Most platforms fundamentally can't even construct such an allocation. For instance, no known 64-bit platform can ever serve a request for 2^63 bytes due to page-table limitations or splitting the address space.
You replied to a two year old thread.
Also, string literals (&str) are not Strings.
The largest possible string element index is actually isize::MAX, which is the largest permitted array index. All memory addresses are in the range 0..=usize::MAX, so an address larger than usize::MAX is not possible.