Stack overflow / SIGSEGV on copying u32 arrays

Hi all,

I recently started trying out Rust and I'm just trying small things that come to mind. The current snippet I'm trying out is a merge sort, I was checking (empirically) whether the asymptotic number of operations matches expectations - so I'm trying out big arrays. With half a million u32 elements, the code runs fine, but with a million u32, it runs into an error:

   Compiling RustExamples v0.1.0 (/data/git/personal/RustExamples)
    Finished `dev` profile [unoptimized + debuginfo] target(s) in 8.06s
     Running `target/debug/mergesort`

thread 'main' has overflowed its stack
fatal runtime error: stack overflow
[1]    8422 IOT instruction (core dumped)  cargo run --bin mergesort

I looked up how to understand a stack overflow error, I read I should use gdb:

GNU gdb (GDB) 16.3
...
(gdb) run
Starting program: /data/git/personal/RustExamples/target/debug/mergesort 
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/usr/lib/libthread_db.so.1".

Program received signal SIGSEGV, Segmentation fault.
0x000055555555bee1 in core::mem::swap<[u32; 1000000]> (x=<error reading variable: Cannot access memory at address 0x7fffff48b9d8>, y=<error reading variable: Cannot access memory at address 0x7fffff48b9e0>)
    at /rustc/05f9846f893b09a1be1fc8560e33fc3c815cfecb/library/core/src/mem/mod.rs:730
warning: 730    /rustc/05f9846f893b09a1be1fc8560e33fc3c815cfecb/library/core/src/mem/mod.rs: No such file or directory

The snippet in question is:

use rand::RngCore;
use rand::SeedableRng;
use rand::rngs::SmallRng;
use std::cmp;
use std::mem;

fn main() {

    let mut rng = SmallRng::seed_from_u64(42);

    const N: usize = 1000000;
    const M: u32 = 100000;

    let length = N;

    let mut array: [u32; N] = [0; N];
    let mut aux: [u32; N] = [0; N];

    for i in 0..N {
        array[i] = rng.next_u32() % M;
    }

    //println!("before sort: {:?}", array);

    let mut op: usize = 0;
    let mut step: usize = 1;

    while step < length {
        //println!("array={:?}", array);
        let mut index: usize = 0;
        while index < length {
            let mut i: usize = index;
            let n: usize = cmp::min(index + step, length);
            let mut j: usize = n;
            let m: usize = cmp::min(n + step, length);

            //println!("step={step}, index={index}");
            //println!("i={i}, n={n}");
            //println!("j={j}, m={m}");

            while i < n || j < m {
                if i < n && (j >= m || array[i] <= array[j]) {
                    aux[index] = array[i];
                    i += 1;
                } else if j < m && (i >= n || array[j] <= array[i]) {
                    aux[index] = array[j];
                    j += 1;
                }
                //println!("aux={:?}", aux);
                index += 1;
                op += 1;
            }
        }
        step *= 2;
        //println!("aux={:?}", aux);
        mem::swap(&mut array, &mut aux);
    }

    for i in 1..N {
        assert!(array[i - 1] <= array[i], "not sorted at index {i}: {:?}", array);
    }

    //println!("after sort: {:?}", array);
    println!("operations: {op}, length: {N}");
}

For mem::swap(&mut array, &mut aux), I was thinking to just swap the 2 arrays as pointers, instead of copying with array = aux... The latter also runs into an error, one that I also don't understand:

(gdb) run
Starting program: /data/git/personal/RustExamples/target/debug/mergesort 
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/usr/lib/libthread_db.so.1".

Program received signal SIGSEGV, Segmentation fault.
0x000055555555c0e1 in mergesort::main () at src/bin/mergesort.rs:7
7       fn main() {

So I have 2 questions:

  1. What is my code doing wrong here? A million 32 bit elements shouldn't be a big deal, from my experience with C++/Java?
  2. How do I get further information in such cases? I tried RUSTFLAGS=-g cargo run --bin mergesort, gdb still doesn't tell me what went wrong.

Sorry if similar questions have been asked before and thanks in advance.

Writing let mut array: [u32; N] = [0; N]; allocates your huge array on the stack, and together with your other operations, this runs out of available stack space — stacks are generally only a few megabytes (depending on platform).

You should use Vec<u32> instead of [u32; N], allocating the storage on the heap. At these large sizes, there are no significant advantages to using arrays instead of vectors. (If you really want a statically declared size, you can convert the Vec, after creation, to Box<[u32; N]>.)

3 Likes

Ah, right, thank you!