[HowTo] Sanitize your Rust code!


#1

Morning Rustaceans,

I’ve got some good news for you today: sanitizer support (x86_64 Linux only) landed a few hours ago and you can start using it right now. Yeah, that’s right; no need to wait for a new nightly.

First things first, here’s the documentation, but the TL;DR is:

# sanitize your application
$ RUSTFLAGS="-Z sanitizer=leak" cargo run --target x86_64-unknown-linux-gnu [--example foo]

# sanitize your library (through its unit tests)
$ RUSTFLAGS="-Z sanitizer=leak" cargo test --target x86_64-unknown-linux-gnu

You can use address, leak, memory or thread as the sanitizer argument.

Next, here’s how to use this today, before it reaches a nightly.

One of the goodies from the new CI infrastructure is that we get build artifacts for every merged PR. There’s no nice, easy way to use these artifacts right now (I expect that rustup would eventually gain support for these per PR artifacts) but you can use them with some manual work and the help of rustup. Here are the commands to get a rustc binary that includes sanitizer support:

# The sha comes from here https://github.com/rust-lang/rust/pull/39677
# (some manual crawling was required)
$ curl -LO https://s3.amazonaws.com/rust-lang-ci/rustc-builds/fd2f8a4536cb9b45abd72b8ff977ad48618602b3/rust-nightly-x86_64-unknown-linux-gnu.tar.gz

$ tar xzf rust-nightly-x86_64-unknown-linux-gnu.tar.gz

$ mv \
    rust-nightly-x86_64-unknown-linux-gnu/rust-std-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-gnu \
    rust-nightly-x86_64-unknown-linux-gnu/rustc/lib/rustlib

$ rustup toolchain link san rust-nightly-x86_64-unknown-linux-gnu/rustc

$ rustup default san

# verify: you should get the same output
$ rustc -V
rustc 1.17.0-nightly (fd2f8a453 2017-02-09)

Now you have a toolchain with sanitizer support. At any time you can revert to an official channel by typing e.g. rustup default nightly.

Finally, I want to thank tmiasko for doing the initial work in this area. We wouldn’t have sanitizer support today without their hard work. Thanks tmiasko!

Happy sanitizing!

P.S. Please report sanitizer bugs (e.g. false positives) to the rust-lang/rust repo and report gotchas and workarounds to the japaric/rust-san repo.


#2

@japaric,
That’s for UnsafeCell and pointers, right?
Safe Rust doesn’t have leaks, right? 🙂


#3

Nice! It would be cool if travis-cargo would get updated to run all these by default!

BTW for those who don’t know, the sanitizers are incompatible with each other (you cannot instrument a binary with more than one sanitizer at a time):

  • sanitizer=memory catches all uses of uninitialized memory, dangling references to stack frames… and is about 20x faster than valgrind!
  • sanitizer=address catches accessing memory out of bounds (in arrays and otherwise), use after free, … (and is infinitely faster than valgrind!)
  • sanitizer=leak catches memory leaks (like Rc cycles!)
  • sanitizer=thread catches data-races! Libraries like crossbeam should probably start using this right away!

Also, address and memory between them (run as separate builds, since they can’t be combined) cover almost the same ground as a full valgrind memcheck run, at a tiny fraction of the cost of a valgrind run.


#4

Safe rust totally has memory leaks. From Rc cycles to simple std::mem::forget.
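For anyone who wants to see it, here’s a minimal sketch of both kinds of safe leak. The `Node` type and the `leak_cycle` function are made up for illustration; they aren’t taken from any linked repo.

```rust
use std::cell::RefCell;
use std::rc::{Rc, Weak};

// Hypothetical node type: each node may hold a strong reference to another.
struct Node {
    next: RefCell<Option<Rc<Node>>>,
}

/// Builds a two-node Rc cycle in safe code and returns a weak handle
/// to the first node so the caller can observe that it was never freed.
fn leak_cycle() -> Weak<Node> {
    let a = Rc::new(Node { next: RefCell::new(None) });
    let b = Rc::new(Node { next: RefCell::new(Some(Rc::clone(&a))) });
    // Close the cycle: a -> b -> a.
    *a.next.borrow_mut() = Some(Rc::clone(&b));
    Rc::downgrade(&a)
    // `a` and `b` go out of scope here, but each node is still kept
    // alive by the other, so neither strong count ever reaches zero.
}

fn main() {
    let weak = leak_cycle();
    println!("strong count after scope exit: {}", weak.strong_count());

    // The even simpler variant: skip the destructor entirely.
    std::mem::forget(Box::new([0u8; 64]));
}
```

No `unsafe` anywhere, yet neither the two nodes nor the forgotten box is ever deallocated.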


#5

Yeah, this is mainly for unsafe code. But as @SimonSapin said, it is safe to leak memory. There’s an example in the linked repo that shows a safe memory leak and that LeakSanitizer catches.


#6

Right now, I’m thinking about how best to incorporate this into my existing testing methodology and I have a few questions:

  1. What effect does this have on the process exit code if run under cargo test? (I want to be able to fail a CI job based on sanitizer output)

  2. Has the greater LLVM community produced any kind of whitelisting solution to ensure that false positives don’t cripple the utility of sanitizers?

  3. Given that I’ll also want to run what can be run with cargo test via the test.sh I currently have handling things like running clippy (I develop on stable) and grepping the TODO/FIXME report out of rustfmt’s CheckStyle output, am I correct in assuming this would be the correct way to ensure LeakSanitizer functions reliably under cargo test?

    [profile.test]
    opt-level = 1

That said, I’ll also need to do some benchmarking to decide whether that bogs down bare cargo test enough to justify extending the “Use sed to temporarily monkey-patch opt-level="z" into Cargo.toml” trick I already use in release.sh.


#7

What effect does this have on the process exit code if run under cargo test? (I want to be able to fail a CI job based on sanitizer output)

All violations will change the exit code of cargo test to a non-zero value. However, the test runner knows nothing about sanitizers so if a unit test violates some rule then the test runner won’t fail that test (the test will pass). Also, sanitizers handle violations differently:

  • LeakSanitizer will report all the violations after cargo test ends.
  • ThreadSanitizer will report each violation as it happens.
  • AddressSanitizer and MemorySanitizer will report the first violation and abort the process.
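To make the “test passes but the exit code is non-zero” case concrete, here’s a rough sketch of a unit test that leaks. `leak_buffer` and `passes_but_leaks` are hypothetical names, and the comments describe what I’d expect from LeakSanitizer rather than captured output.

```rust
/// Allocates a buffer and deliberately never frees it (safe, but a leak).
fn leak_buffer(len: usize) -> usize {
    let buf = vec![0u8; len];
    let n = buf.len();
    std::mem::forget(buf); // the heap allocation is never released
    n
}

// The assertion holds, so the test harness marks this test as passed.
// Under `-Z sanitizer=leak`, the leak would be expected to show up in a
// LeakSanitizer report once the harness finishes, not as a failed test.
#[test]
fn passes_but_leaks() {
    assert_eq!(leak_buffer(1024), 1024);
}

fn main() {
    println!("leaked {} bytes", leak_buffer(1024));
}
```

So for a CI job, gate on the exit code of cargo test rather than on the per-test pass/fail lines.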

Has the greater LLVM community produced any kind of whitelisting solution to ensure that false positives don’t cripple the utility of sanitizers?

Each sanitizer has some way to whitelist functions (e.g. using an attribute that’s recognized by clang), but that hasn’t been implemented in rustc. Such a mechanism would need an RFC to discuss the design, as it is user-facing (it requires modifying the user’s source code).

to ensure LeakSanitizer functions reliably under cargo test

LeakSanitizer seems to work reliably when used with cargo test; it’s with cargo run that it’s not reliable.


#8

In clang it is possible to configure this behavior. Does customizing the behavior of the sanitizers follow clang conventions? (same environment variables, same suppression files, etc. can I just follow clang docs for this?)


#9

rustc links the same sanitizer runtime as clang, so if the runtime supports configuration via env vars and files then that should also work when sanitizing Rust programs. I have tested that env vars work with ASan; I haven’t tried suppression files.
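If you drive your test runs from a Rust harness, passing runtime options could look roughly like the sketch below. `sanitized_test_command` is a hypothetical helper, and `abort_on_error=1` is just one option from the clang ASan docs that I’m assuming the bundled runtime honours; the command is only built and printed here, not executed.

```rust
use std::process::Command;

/// Hypothetical helper: builds (but does not run) a `cargo test` invocation
/// with AddressSanitizer enabled and one runtime option set via ASAN_OPTIONS.
fn sanitized_test_command() -> Command {
    let mut cmd = Command::new("cargo");
    cmd.args(&["test", "--target", "x86_64-unknown-linux-gnu"])
        // Compile-time switch, as in the first post.
        .env("RUSTFLAGS", "-Z sanitizer=address")
        // Runtime switch read by the sanitizer runtime itself (assumed to be
        // supported by the compiler-rt version that rustc links).
        .env("ASAN_OPTIONS", "abort_on_error=1");
    cmd
}

fn main() {
    let cmd = sanitized_test_command();
    // Show the environment overrides that would be passed to the child.
    for (key, value) in cmd.get_envs() {
        println!("{:?}={:?}", key, value);
    }
}
```

Calling `.status()` on the returned command would actually run it and give you the exit code to gate on.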


#10

If my crate uses derive-new as a dependency, the sanitizer fails:

Compiling derive-new v0.3.0
error: Only executables and rlibs can be compiled with -Z sanitizer


#11

RUSTFLAGS="-Z sanitizer=leak" cargo test --target x86_64-unknown-linux-gnu

Works for me. You always have to pass --target x86_64-unknown-linux-gnu. (A bit silly, I know. It’s a Cargo “feature”). That’s in the README:

Be sure to always pass --target x86_64-unknown-linux-gnu to Cargo or you’ll end up sanitizing the build scripts that Cargo runs.


#12

Wow, nice feature!

How about making this the default for unsafe blocks? Just to make sure, your unsafe block does not really do unsafe stuff such as accessing random memory but only memory it has a reference to?


#13

Have you run the sanitizers on Rustc itself?


#14

While existing sanitizers might be able to detect some unsafe violations, Niko was contemplating more targeted analyzers specific to Rust’s unsafe rules. (whenever those rules get solidified…)
http://smallcultfollowing.com/babysteps/blog/2017/01/22/assigning-blame-to-unsafe-code/

But I don’t see ever running this by default. People often dip into unsafe to improve performance by avoiding checks that the compiler can’t determine statically, so they wouldn’t want to be hamstrung by sanitizers and runtime checking.


#15

How about making this the default for unsafe blocks?

Seems very unlikely to ever occur.

You certainly don’t want sanitizers installed in your release binary. And to enable sanitizers by default in dev mode, they would have to meet a very high quality bar. First, the sanitizers would need to have zero false positives, otherwise pretty much anyone who wants to gate their CI on running a test suite would have to opt out of them. Then, they would have to at least work on all tier 1 platforms, as no stable, opt-out, platform-specific feature exists in Rust and I doubt sanitizers will become the first feature that breaks the status quo.

Also, there are other complications with your suggestion: you can only use one sanitizer at a time and there are four of them, so it’s not clear which one should be enabled by default; this feature would have to be stabilized first; sanitizers work at function granularity, not block granularity; etc.

@leonardo

Have you run the sanitizers on Rustc itself?

Not yet.


#16

UPDATE: Sanitizer support has reached the nightlies. If you do rustup update nightly right now, you’ll be able to use the sanitizers.


#17

Very awesome, thanks japaric!

Apologies if this is a dumb question - I have very, very little background in LLVM/rustc internals and I’ve only taken a very quick glance at your code - but once this is more stable, how much work would be involved in enabling support for other architectures which already have AddressSanitiser support working in clang?


#18

@japaric the memory sanitizer does not work on OSX I think.


#19

@ajd

how much work would be involved in enabling support for other architectures which already have AddressSanitiser support working in clang?

Not that hard, I expect. The required changes are mostly build-system related, as the rustc - LLVM interface is already done. First, teach the Rust build system how to cross compile the sanitizer runtimes (e.g. rustc_asan) from x86_64 Linux to e.g. AArch64 Linux. The sanitizer runtimes have their own build system (CMake), so this may just be a matter of passing some extra flag to the cmake invocation. Then, actually build the runtimes (https://github.com/rust-lang/rust/blob/dc0bb3f2839c13ab42feacd423f728fbfd2f2f7a/src/libstd/Cargo.toml#L26) for AArch64. Finally, test that the thing actually works; if not, we may have to backport some compiler-rt patches as our compiler-rt is old-ish.

@gnzlbg

the memory sanitizer does not work on OSX I think.

MemorySanitizer only supports Linux, it seems.


#20

I tried running the address sanitizer against Diesel. It appears to hate zero-sized types (specifically when they’re the last field of a struct, it seems). The first try was this build, which claimed there was a stack buffer overflow here (Insert is zero sized). I added a dummy byte to confirm my suspicions, and then got this failure, which claimed this was a stack buffer overflow (the last field of Bound is a PhantomData).
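For reference, the struct shape I mean looks roughly like this. This `Bound` is a simplified, hypothetical stand-in and not Diesel’s actual definition, and I’m not claiming this snippet alone reproduces the report; it only shows a zero-sized last field.

```rust
use std::marker::PhantomData;
use std::mem::size_of;

// Hypothetical stand-in: a real field followed by a zero-sized marker
// as the last field. NOT Diesel's actual `Bound` definition.
struct Bound<T> {
    value: u32,
    _marker: PhantomData<T>,
}

fn main() {
    let b: Bound<String> = Bound { value: 7, _marker: PhantomData };
    println!("value = {}", b.value);
    // The marker occupies no bytes, so the struct is only as big as `value`.
    println!("size_of PhantomData<String> = {}", size_of::<PhantomData<String>>());
    println!("size_of Bound<String> = {}", size_of::<Bound<String>>());
}
```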