Why does using too many `&` cause a stack overflow?

Hello everyone, today while conducting an experiment, I came across an interesting issue.
Here's the code snippet:

fn main() {
    let a = 1007;
    let b = &&& /* ...10,000s of &s elided... */ &&&a;
    println!("{:p}", b);
}

When I run it with `cargo run`, this error is thrown:

thread 'rustc' has overflowed its stack
error: could not compile `tuple`

Caused by:
  process didn't exit successfully: `rustc --crate-name tuple --edition=2021 src\main.rs --error-format=json --json=diagnostic-rendered-ansi,artifacts,future-incompat --diagnostic-width=126 --crate-type bin --emit=dep-info,link -C embed-bitcode=no -C debuginfo=2 -C metadata=1e83d7b6b441f689 --out-dir E:\code\Rust\rust_by_practice\ch6\tuple\target\debug\deps -C incremental=E:\code\Rust\rust_by_practice\ch6\tuple\target\debug\incremental -L dependency=E:\code\Rust\rust_by_practice\ch6\tuple\target\debug\deps` (exit code: 0xc00000fd, STATUS_STACK_OVERFLOW)

Obviously, it is not practical to write such code in a production environment. However, I would like to know the fundamental reasons behind the occurrence of the overflow. I would greatly appreciate it if you could provide me with an explanation.
My operating system is Windows 10 Pro 64-bit, 16.0 GB RAM
I'm using VSCode as my editor.

3 Likes

Every one of your ~10,000 references is stored on the stack. Godbolt.
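As a side note on the runtime picture: each layer of reference is its own pointer-sized local in the stack frame. A minimal sketch of my own (not from the original post) illustrating this:

```rust
fn main() {
    let a: i32 = 1007;
    let b: &i32 = &a;   // pointer to `a`, one slot in this stack frame
    let c: &&i32 = &b;  // pointer to `b`, another slot
    let d: &&&i32 = &c; // pointer to `c`, yet another
    // every level is one pointer-sized value, so 10,000 `&`s would mean
    // 10,000 such slots (if the code ever got past the compiler)
    assert_eq!(std::mem::size_of_val(&d), std::mem::size_of::<usize>());
    println!("{:p} -> {:p} -> {:p}", d, c, b);
}
```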

3 Likes

I think the OP's case was a stack overflow at compile time, though :slight_smile:

I.e. rustc crashes. This is typically considered a bug, at least in principle, though of course whether it’s worth fixing could be a trade-off.

11 Likes

Oh yes, thread 'rustc' has overflowed its stack, not thread 'main' :person_facepalming:

Here's an easy reproduction:

macro_rules! refs {
    ([$t:tt $($ts:tt)*][$($amps:tt)*] $e:expr) => {
        refs!([$($ts)*][$($amps)*$($amps)*] $e)
    };
    ([][$($amps:tt)*] $e:expr) => {
        $($amps)*$e
    };
}

fn f() {
    refs!([##### ##### ###][&] 0); // 13 `#` tokens: 2^13 = 8192 nested `&`s
}
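For reference, each `#` token doubles the accumulated `&`s, so `[##### ##### ###]` (13 tokens), starting from one `&`, expands to 2^13 = 8192 references. A small variant of my own with only 3 tokens, which compiles fine, shows the doubling:

```rust
macro_rules! refs {
    // consume one token, double the accumulated `&`s
    ([$t:tt $($ts:tt)*][$($amps:tt)*] $e:expr) => {
        refs!([$($ts)*][$($amps)*$($amps)*] $e)
    };
    // no tokens left: emit all the `&`s in front of the expression
    ([][$($amps:tt)*] $e:expr) => {
        $($amps)* $e
    };
}

fn main() {
    // 3 `#` tokens, starting from one `&`: 2^3 = 8 ampersands
    let x = refs!([###][&] 0);
    // peel the 8 layers back off
    assert_eq!(********x, 0);
}
```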

This issue may or may not be related to existing ones, e.g. this one, where a comment suspects the parser. But I couldn't find an exact report of deeply nested references, so feel free to report this one, I guess.

1 Like

Well, just parsing this monstrosity recursively is likely to exhaust the stack. These kinds of edge cases can probably be mitigated simply by counting the levels of recursion and stopping reasonably early (e.g. at 128 or 256).
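A sketch of that mitigation (hypothetical code, not rustc's actual parser): a recursive-descent step that tracks its own depth and reports an error before the stack runs out:

```rust
// Hypothetical depth-limited parsing step, not rustc's real code.
const MAX_DEPTH: usize = 128;

fn parse_refs(input: &str, depth: usize) -> Result<usize, String> {
    if depth > MAX_DEPTH {
        return Err(format!("too many nested `&` (limit {MAX_DEPTH})"));
    }
    match input.strip_prefix('&') {
        // each `&` adds one level of recursion (and one stack frame)
        Some(rest) => parse_refs(rest, depth + 1),
        // no more `&`s: report how many levels were parsed
        None => Ok(depth),
    }
}

fn main() {
    assert_eq!(parse_refs("&&&x", 0), Ok(3));
    // 10,000 ampersands hit the limit long before the stack does
    assert!(parse_refs(&"&".repeat(10_000), 0).is_err());
}
```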

2 Likes

To check whether this is early in compilation, I tested

macro_rules! refs {
    ([$t:tt $($ts:tt)*][$($amps:tt)*] $e:expr) => {
        refs!([$($ts)*][$($amps)*$($amps)*] $e)
    };
    ([][$($amps:tt)*] $e:expr) => {
        ignore!($($amps)*$e)
    };
}

macro_rules! ignore {
    ($e:expr) => {};
}

fn f() {
    refs!([##### ##### ####][&] 0); // 14 `#` tokens: 2^14 = 16384 nested `&`s
}

and for me that only blew up with one more of the `#`s than the previous example, so that's around 16384 ampersands… but nonetheless it did result in an overflow, too, even though the `ignore` macro will discard the whole expression after parsing it.

2 Likes

Can't rustc simply remove the limit on its stack size? On Linux there is no hard limit on the stack by default; a process can set its own soft limit with setrlimit. rustc could just raise it to the hard limit (unlimited by default).

Would eating all your RAM and slowing your machine to a crawl, or even hanging it, be preferable to panicking?

Yeah panicking isn't ideal and there's got to be a better way, but I don't know if we should be optimizing the compilation experience for obviously stupid code.

13 Likes

Well, that wouldn't happen with this example; you'd need a lot more ampersands to exhaust, say, 1 GiB of RAM. And if that happened, it would be unrelated to the stack in particular -- if you have a huge program, you may exhaust memory building your AST regardless of whether the parser puts things on the stack or on the heap.

If you want to protect against exhausting physical memory, the correct solution would be to put a limit on total memory used (also possible via setrlimit), not just specifically on the stack.

6 Likes

Well, not remove the limit entirely, but it does have tricks to request more stack in places where it can sometimes recurse deeply.

For example, the other day I opened the PR "Add an `ensure_sufficient_stack` to `LateContextAndPass::visit_expr`" (rust-lang/rust#112673) to fix one place that was right at the edge of blowing up in the test suite.
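As far as I know, rustc's `ensure_sufficient_stack` grows the stack on demand via the `stacker` crate. A simpler related trick, sketched below with only the standard library, is to run the recursion-heavy work on a thread whose stack size you choose explicitly:

```rust
use std::thread;

fn recurse(n: u64) -> u64 {
    // keep a real array in the frame so the recursion needs actual stack space
    let local = std::hint::black_box([n; 16]);
    if n == 0 { local[0] } else { recurse(n - 1) }
}

fn main() {
    // Run the deep recursion on a thread with a 64 MiB stack instead of the
    // platform default: the same basic idea of giving recursion-heavy work
    // more stack than usual.
    let handle = thread::Builder::new()
        .stack_size(64 * 1024 * 1024)
        .spawn(|| recurse(100_000))
        .unwrap();
    assert_eq!(handle.join().unwrap(), 0);
}
```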

Interesting. Does this solve the issue?

But then: why not just remove the limit entirely? The operating system has a built-in mechanism to get as much stack memory as necessary when needed, so what's the point of duplicating the effort by introducing a small limit and then trying to relax it manually when necessary?

As far as I can tell the only reason for the default soft limit (8MB on Linux) is to make debugging infinite recursion bugs easier. Is there any other reason?

That only works on the main thread on some operating systems, while thread stacks are a fixed size. But even for main the current RLIMIT_STACK might not be unlimited, and even if it is, growth could be blocked by any other virtual memory mapping below the stack.

2 Likes

Sure. When I said "remove the limit" I didn't mean there would be no limits whatsoever, just removing the soft limit (making it equal to the hard limit, which is RLIM_INFINITY by default), i.e. removing the additional self-imposed limit on top of what the OS supports and allows.

I looked a bit into the memory mapping and was surprised that Linux's mmap may indeed leave only a small gap for stack growth -- if you have a small soft limit when the process is launched and you're unlucky. That's because it calculates the required gap based on the current soft limit, not the hard limit. So if you want to guarantee large stack-growth space, you want to do setrlimit (or ulimit) first and then exec, so as to lay out the process's memory with the stack limit you want -- if you do that, mmap will make sure to leave plenty of virtual-memory space for the stack.
