Library tests pass, but sometimes segfault afterwards?


#1

Hi, I’m new to Rust (though not to systems programming; I’m still just a college student). I’ve been writing a simple matrix class in unsafe Rust, and I’ve been running into weird errors with cargo test. About 1/4 of the time, cargo test runs successfully and passes all the tests I’ve written. The other 3/4 of the time (approximately), it segfaults at some point during the tests, or after all of them have run. When I run the compiled test program in gdb, the segfault always occurs in the execution of clone() from within start_thread(). I’m pretty experienced with C and the assembly it generates, but I don’t know what the problem is here. Is this an issue with my code? I’m not doing any multithreading myself, so I assume cargo test is running the tests in parallel, and that sometimes these threads hit some sort of memory error? I could really use some help debugging this.

I’m running rustc 1.5.0-nightly 2015-10-16 and cargo 0.6.0-nightly 2015-10-17. My library code is here on GitHub: https://github.com/peterdelevoryas/linear-rs. The repository contains lib.rs, which holds all of my library code and the test module, and two log files: gdb_log, with the output of two runs under gdb, and test_log, with the output of a few runs of cargo test. I also noticed gdb says it is using my system’s libthread_db library, so perhaps this is specific to my computer? I’m running x86_64 Arch Linux 4.1.6-1 on a late 2008 Aluminum MacBook (I believe it’s called the MacBook 5-1).


#2

I haven’t tried running the code, but this section looks suspicious:

        pub fn eye(&mut self) {
            let size = self.rows * self.cols * mem::size_of::<T>();
            unsafe {
                // Zero the data array
                ptr::write_bytes::<T>(self.data, 0, size)
            };

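From the documentation, ptr::write_bytes takes a count of Ts rather than a size in bytes. Since size is already multiplied by mem::size_of::<T>(), I think the call zeroes size_of::<T>() times as many elements as intended and writes past the end of your allocation.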
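To illustrate the difference, here is a minimal sketch that passes an element count to ptr::write_bytes. It is not the code from the repository: zero_elements is a hypothetical helper, and it assumes data points to at least rows * cols elements and that all-zero bytes form a valid T (true for the usual numeric types).

```rust
use std::ptr;

/// Hypothetical helper: zero `rows * cols` elements starting at `data`.
///
/// # Safety
/// `data` must be valid for writes of `rows * cols` elements of `T`,
/// and an all-zero byte pattern must be a valid value of `T`.
pub unsafe fn zero_elements<T>(data: *mut T, rows: usize, cols: usize) {
    // The third argument is a count of `T` values, not a byte count,
    // so there is no multiplication by mem::size_of::<T>() here.
    let count = rows * cols;
    unsafe {
        ptr::write_bytes(data, 0, count);
    }
}
```

With this version, a buffer that is larger than rows * cols keeps its trailing elements untouched, which is an easy property to check in a test.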


#3

Omg you’re right, thank you so much for taking the time to look at my code!!! I was actually checking the memory calls to see if they took sizes or lengths but I dismissed write_bytes because I assumed it would write size bytes, not length * elem-size. Again, thank you so much for the help!!


#4

I don’t want to sound harsh, but the whole point of Rust is that unsafe code should be used only if strictly necessary.

If in a particular situation you suspect that safe code can’t be as fast as unsafe code, you should benchmark it first.


#5

That’s not harsh at all, you’re completely correct, this is not really what unsafe Rust was made for. And of course, I’m not experienced with Rust, and I made a simple mistake in the unsafe code I was writing, and then I didn’t know what was wrong with the code. Maybe I’ll change the raw ptr to an array slice!
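For reference, a rough sketch of what a safe version might look like, using a Vec instead of a raw pointer. The Matrix type, its fields, and the zeros and get methods are all made up for illustration, and it assumes eye is meant to overwrite the matrix with the identity, so treat it as a starting point rather than a port of the real code.

```rust
/// Hypothetical safe matrix; names are illustrative only.
pub struct Matrix {
    rows: usize,
    cols: usize,
    data: Vec<f64>, // row-major storage, length rows * cols
}

impl Matrix {
    /// Create a rows x cols matrix filled with zeros.
    pub fn zeros(rows: usize, cols: usize) -> Matrix {
        Matrix { rows, cols, data: vec![0.0; rows * cols] }
    }

    /// Overwrite the matrix with the identity pattern.
    pub fn eye(&mut self) {
        // Zero every element; the slice knows its own length,
        // so there is no count or size to get wrong.
        for x in self.data.iter_mut() {
            *x = 0.0;
        }
        // Set the main diagonal; indexing is bounds-checked.
        let n = self.rows.min(self.cols);
        for i in 0..n {
            self.data[i * self.cols + i] = 1.0;
        }
    }

    /// Read the element at row r, column c.
    pub fn get(&self, r: usize, c: usize) -> f64 {
        self.data[r * self.cols + c]
    }
}
```

An out-of-range index here panics with a message instead of silently corrupting memory, which would have turned the intermittent segfault into a reproducible test failure.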


#6

I can’t reproduce the segfault, so I can’t be sure it helps in this case, but valgrind is often useful for debugging this sort of memory corruption. For example, it can print a stack trace wherever code writes outside the bounds of its allocation, which would likely have pointed to the culprit straight away here. It’s a useful extra tool to have around when writing unsafe Rust code.


#7

I’ve only used Valgrind once before with a random piece of C code I was writing, so I didn’t realize you can use Valgrind with Rust! (Because I didn’t really understand what Valgrind was until 2 seconds ago!) Thanks!