What does 'capacity overflow' error mean when working with Strings?

I was working on a problem and needed to repeat a string multiple times. After a bit of searching, I came across the documentation for str::repeat(), and there's a section that warns about panicking upon overflow.

// this will panic at runtime
let huge = "0123456789abcdef".repeat(usize::MAX);

Curious about this overflow error, I decided to experiment a bit. I figured, "Of course it will panic, it's 16 symbols times usize::MAX." But then, to my surprise, just "0".repeat(usize::MAX) also didn't work, throwing the same "capacity overflow" error. I thought maybe using exactly usize::MAX wasn't the best approach. Perhaps it needed some additional "free space." So I tried "0".repeat(usize::MAX - 1_000_000), but that failed too.

Then I thought, maybe if each symbol in the string is 4 bytes, I should divide usize::MAX by 4 to avoid capacity overflow, and that would be the maximum String capacity. When I tried "0".repeat(usize::MAX/4), the program died with a "memory allocation failed (SIGABRT)" error, which I totally get: the operating system won't let me allocate petabytes of memory.

But I decided to push it further, and surprisingly, "0".repeat(usize::MAX/2) didn't panic with "capacity overflow" as I expected, but failed with the same "memory allocation failed" error.

The capacity overflow error message points to library/alloc/src/raw_vec.rs:570:5, but at that position, there's a drop() function that frees the memory, not allocates it. I tried reading around the code in raw_vec.rs, but I couldn't grasp what was happening. I'm a beginner, and this is a bit overwhelming for me at this point.

I'm completely confused now. The limit should be somewhere between usize::MAX/2 and usize::MAX - 1_000_000, which doesn't make sense to me yet. Can someone help me understand what the capacity overflow error means? What is the exact capacity I cannot overflow, and how can I find out?

Here's the link to the Rust Playground I've been tinkering with.

As you have already found out, String uses Vec to carry its payload, and Vec, as documented, panics when the new capacity exceeds isize::MAX bytes.

That happens to be precisely usize::MAX/2, which is what your experiments revealed.
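To make the arithmetic concrete, here is a minimal sketch of the check. The helper names max_string_bytes and repeat_len_within_limit are made up for illustration and are not part of the standard library; the sketch only mirrors the documented rule (at most isize::MAX bytes), not the actual code in raw_vec.rs.

```rust
/// Largest byte count a Vec<u8>, and therefore a String, may ask for.
fn max_string_bytes() -> usize {
    isize::MAX as usize
}

/// Hypothetical helper: the number of bytes `s.repeat(n)` would need,
/// or None if that number overflows usize or exceeds isize::MAX bytes
/// (the cases reported as "capacity overflow").
fn repeat_len_within_limit(s: &str, n: usize) -> Option<usize> {
    s.len()
        .checked_mul(n)
        .filter(|&bytes| bytes <= max_string_bytes())
}

fn main() {
    // The limit is exactly half of usize::MAX (rounded down).
    assert_eq!(max_string_bytes(), usize::MAX / 2);

    // "0" is a single byte in UTF-8, so the byte count equals n.
    assert_eq!(repeat_len_within_limit("0", usize::MAX), None);
    assert_eq!(repeat_len_within_limit("0", usize::MAX - 1_000_000), None);

    // These pass the capacity check; only the allocator can refuse them.
    assert!(repeat_len_within_limit("0", usize::MAX / 2).is_some());
    assert!(repeat_len_within_limit("0", usize::MAX / 4).is_some());

    // 16 bytes times usize::MAX does not even fit in a usize.
    assert_eq!(repeat_len_within_limit("0123456789abcdef", usize::MAX), None);

    println!("limit = {} bytes", max_string_bytes());
}
```

Note that the limit is counted in bytes, not in characters: "0" takes one byte in UTF-8, so there is no need to divide by 4 here.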

Why isize::MAX and not usize::MAX? That's mostly to ensure safety: it's easy and fast for the CPU to check whether a value is smaller than isize::MAX, and the results of calculations that go over the limit usually still fit in a usize, so the check catches them. Most “normal manipulations” (double the capacity, increase it by 100, etc.) don't need any special dances either (remember the Java story?).

And since the Vec<u8> behind a String may hold at most isize::MAX bytes… you get the results you are observing.
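If you want to probe the limit without crashing the program, String::try_reserve returns an error instead of panicking or aborting. The following is only a sketch; can_reserve is a hypothetical wrapper written for this post, and it deliberately does not try to tell "capacity overflow" apart from a failed allocation, since it only looks at whether the call succeeded.

```rust
/// Hypothetical wrapper: true if the String could reserve `additional`
/// more bytes, false if the request was rejected for any reason.
fn can_reserve(s: &mut String, additional: usize) -> bool {
    match s.try_reserve(additional) {
        Ok(()) => true,
        Err(err) => {
            // The Display text of the error describes why it failed.
            println!("reserving {additional} bytes failed: {err}");
            false
        }
    }
}

fn main() {
    let mut s = String::new();

    // More than isize::MAX bytes: rejected by the capacity check.
    assert!(!can_reserve(&mut s, usize::MAX));

    // A small request succeeds and is reflected in the capacity.
    assert!(can_reserve(&mut s, 1024));
    assert!(s.capacity() >= 1024);
}
```

Unlike repeat(), this lets you try different sizes in a loop and see where requests stop being accepted on your machine.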
