Rust trained my sense for spotting UB in C

I had an interesting experience this week at work. My senior colleague and I were working on some code for a project with a Friday deadline. After a change in the code, we suddenly started getting garbage values on the serial output from the microcontroller.

While this puzzled my colleague (who is much more experienced than I am in embedded development and development in general), I immediately realized what was going on. We were returning a pointer to a local variable and then calling another function before using that pointer:

char *function_A(char *parameters) {
    char result[100] = {0x00};
    // populate result with response
    char *return_val = result;
    return return_val;
}

char *function_B(char *parameters) {
    char result[100] = {0x00};
    // populate result with response
    char *return_val = result;
    return return_val;
}

char *calling_function(char *parameters) {
    char *a_result = function_A("params");
    char *b_result = function_B("params");
    char result_buffer[200] = {0x00};
    sprintf(result_buffer, "%s%s", a_result, b_result);
    char *return_val = result_buffer;
    return return_val;
}

I immediately realized we had a dangling pointer into function A's stack memory, which is overwritten by function B's stack frame before we read it in our calling function. The only explanation I can think of for spotting it so quickly is my struggles with rustc.
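For comparison, here is a minimal sketch of how the same shape might look in Rust. The function names mirror the C snippet, and the bodies are placeholders I made up for illustration. Returning a `&str` that points at a local buffer would be rejected at compile time, so the natural way to write it is to hand an owned `String` to the caller:

```rust
// Hypothetical Rust counterparts of function_A / function_B from the C snippet.
// Returning a reference to a local buffer would not compile (rustc reports
// E0515, "cannot return reference to local variable"), so each function
// returns an owned String and the caller takes ownership of it.
fn function_a(parameters: &str) -> String {
    // Placeholder for "populate result with response".
    format!("A({parameters})")
}

fn function_b(parameters: &str) -> String {
    // Placeholder for "populate result with response".
    format!("B({parameters})")
}

fn calling_function(parameters: &str) -> String {
    // Both results are owned values, so the second call cannot clobber the first.
    let a_result = function_a(parameters);
    let b_result = function_b(parameters);
    format!("{a_result}{b_result}")
}
```

With these placeholder bodies, calling_function("params") returns "A(params)B(params)". In C, the closest equivalent would probably be to have the caller pass in the buffers to fill.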


I share the sentiment, and have a very practical anecdote to share along the same lines.

One day at work, a coworker asked for help on a strange bug he was experiencing with a library in C++. The lib would return a list of values, but depending on the number of values it would be either correct, or a long string of garbage values.

Suspecting a memory error, I ran the code under valgrind (bless valgrind's devs, every day) and sure enough, a memory error was there.

Using my Rust-sharpened borrow-checker sense, I could narrow it down to the list returned being a reference-like object (think span) to a temporary that wouldn't live long enough. The fix was along the lines of:

// Before, with memory errors
auto list = some_query().get_results().get_list(); 

// After, fixed
auto results = some_query().get_results();
auto list_view = results.get_list();

The thing is, that particular lib is available in both C++ and Rust (native in both cases, no bindings; they are two implementations of the same concept, if you want).

Another day, I was implementing a module in Rust using the Rust version of that library for testing purposes. Accidentally, I made the same mistake as my colleague did in C++:

let list = some_query().results().list();

I was greeted with a "temporary value dropped while borrowed" error from the compiler, along with its notes: "temporary is freed at the end of this statement", "borrow later used here", and "consider using a let binding to create a longer lived value".
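To make the situation concrete, here is a minimal sketch of the pattern. The types Query, Results and ListView are stand-ins I invented for illustration, not the real library's API. The point is that list() returns a view borrowing from Results, so Results has to be bound to a variable that outlives the view:

```rust
// Hypothetical stand-ins for the library's types; all names are illustrative.
struct Query;

struct Results {
    values: Vec<u32>,
}

// A reference-like view (think span) that borrows from `Results`.
struct ListView<'a> {
    items: &'a [u32],
}

impl Query {
    fn results(&self) -> Results {
        Results { values: vec![1, 2, 3] }
    }
}

impl Results {
    fn list(&self) -> ListView<'_> {
        ListView { items: &self.values }
    }
}

fn some_query() -> Query {
    Query
}

fn sum_of_list() -> u32 {
    // Rejected by rustc (E0716): the temporary `Results` is dropped at the
    // end of the statement while `list` still borrows from it.
    // let list = some_query().results().list();

    // Accepted: `results` is bound to a name, so it outlives the view.
    let results = some_query().results();
    let list = results.list();
    list.items.iter().sum()
}
```

The lifetime parameter on ListView is what lets the compiler tie the view to the Results it came from; the C++ equivalent carries no such information in its type.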

This made me realize a few things very clearly:

  1. It is terrifyingly easy to introduce lifetime-related errors in C++ code, and "modern C++" will definitely not save you from them (despite what is regularly claimed on the topic).
  2. These errors are hard to spot in C++. It took two engineers an hour and a half to understand and fix the problem. And that is when such mistakes are detected during development (if my colleague hadn't had the reflex to test on various workloads, we might not have detected the problem at all).
  3. By contrast, Rust makes detecting and fixing these errors trivial. The compiler even suggests how to fix the error. It should take a single developer no more than 10 seconds to apply that fix in such a simple situation.
  4. Rust trains developers to spot these kinds of errors, because they are reified by the compiler. After seeing lifetime-related problems many times in Rust, along with the fixes offered by the compiler, I tend to detect them more easily when reviewing C++ code, because I'm always thinking about the lifetime of objects at the back of my mind. My colleague, a very experienced C++ developer but less versed in Rust, has more trouble detecting this kind of issue.
  5. Even once we know about this kind of issue, it is very easy to write it by mistake. I don't know if the lib is to blame, but the combination of generally long method chains whose intermediate results we are not interested in, and maybe their naming (would list_ref() instead of just list() make things easier?), means we tend to make that mistake again and again. As a matter of fact, after having fixed it for my colleague and made it myself in Rust, I wrote it a second time in C++... Thankfully it was detected quickly.

Nowadays, when I write something non-trivial using that lib, I tend to write it in Rust first, just to check my lifetimes, and then port these parts to C++ as needed :sweat_smile:.


Ha yeah, that's one way to go about it...

I found Clang's linter to be quite close to being my "rustc" when writing C++. Maybe MSVC's as well, because I know there's an open-source Microsoft implementation of the C++ Core Guidelines. However, they're not going to catch everything, and you're going to ignore some warnings sometimes.

By Clang's linter, do you mean clang-tidy? We're using it, although we could use it more. However, for the case I described, I believe the linter would have been helpless: the library does not return an explicit reference to a temporary, but rather a type containing a reference (think struct A<'a> in Rust), except that in C++ the compiler has no information about the lifetime of that reference, so no warning can be produced.

To be clear, I'm not spending half my time writing code in Rust and half rewriting it in C++ :stuck_out_tongue: . I'm just checking for dangerous usage patterns for this specific library. Although, thinking about a problem with Rust's constraints first often leads me to a better C++ solution.

Not quite the same, but I had a similar experience.

A company gave us a couple of thousand lines of code written in C# as an example of how to interact with the protocol coming out of their device over a serial link. Lots of horrible custom packet parsing, fiddling with bits and bytes.

In the process of transcribing it to C++ to use with our system I found some bugs in their code. Not unsafe or UB but issues anyway.

Then we decided to go with Rust, so I transcribed it to Rust. The Rust compiler turned up a few more issues!

The end result is that our Rust code has been up and running, talking to these devices, for a year now. Meanwhile that original C# hangs after a few minutes of running. I never went back to fix it.


I believe so, but I'll have to check. It's the one which Qt Creator uses for inline errors and warnings.