Testing that code does not compile

One advantage of using a strongly typed language like Rust is that some programmer errors are immediately detected at compile time and lead to a compilation failure.

For compiler-enforced properties like borrow checking, we trust the rustc team to make sure that this actually happens via their magical compiler test harness. But for library-defined properties, and especially (but not only) when unsafe code is involved, the library author is responsible for ensuring that the library's compile-time API preconditions are actually checked at compile time.

So, how does one go about actually testing this, beyond occasionally checking by hand that some code does not compile?

I know that rustdoc has a compile_fail attribute that can be applied to documentation examples. Does it plug into a more general-purpose mechanism that one can use internally, without writing down every "should-not-compile" example in the documentation?

There is compiletest_rs as a crate, but I can barely recommend it, as it has frequently been broken by changes in rustc in the past and is perpetually nightly-only.

compile_fail in rustdoc has an additional feature: one can list which error code is expected (so far this is checked on nightly only; on stable it is ignored). That should make the test more robust against false negatives, i.e. code that fails to compile for an unrelated reason.
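For illustration, here is a minimal sketch of what such a doc test could look like. The `Token` type and the `my_crate` crate name are made up for this example; `E0382` is the code rustc uses for a use of a moved value.

```rust
/// A value that can be consumed at most once (hypothetical example type).
///
/// Calling `consume` twice must not compile. The `E0382` suffix asks rustdoc
/// to also check the error code, which per the above only happens on nightly.
///
/// ```compile_fail,E0382
/// use my_crate::Token; // `my_crate` is a placeholder crate name
/// let token = Token::new(7);
/// let _first = token.consume();
/// let _second = token.consume(); // use of moved value
/// ```
pub struct Token(u32);

impl Token {
    /// Creates a token carrying an arbitrary id.
    pub fn new(id: u32) -> Self {
        Token(id)
    }

    /// Consumes the token by value, so it cannot be used again afterwards.
    pub fn consume(self) -> u32 {
        self.0
    }
}
```

Running `cargo test` compiles the example as a doc test and reports a failure if it unexpectedly compiles.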

I highly recommend using ::trybuild for this.

It is especially important, for instance, to check against both unsafe libraries with "too much variance" (cf. my example within the big PhantomData post) and procedural macros that generate unsafe code.

trybuild looks great and is easy to use. If compiletest was gen 1 from rustc, then trybuild is gen 2, like the ui tests in Rust itself. (Thanks for the tip!)

That said, I'm reluctant to port to trybuild. Won't this still be fragile from one Rust version to the next, since it checks exact rustc output? Already today my tests produce slightly different diagnostics on stable and nightly, so they cannot pass trybuild on both simultaneously.

For that reason, just testing whether the code compiles or not is unfortunately more robust, in the sense that it doesn't need updates from version to version.

Trybuild does try to ignore some compiler output variability, but yes, testing against error messages is fundamentally fragile, and this is something that concerns me as well.

My thoughts exactly :smiley:

I agree. I have tested looking at the json output of a failing cargo check, and this is a snippet of what it looks like:

In this snippet we can notice the presence of the level, code and spans keys. By not necessarily [1] basing the test on the exact output but on the (error_level, error_code, set(spans.map(extract_coordinates))) tuple from the JSON output, ::trybuild could be used in a (slightly) more resilient manner for compile-fail tests other than those that check custom procedural macro errors, at the cost of a potentially more complex setup.

Thoughts?

  • Let's cc @dtolnay to see what they have to say about this.

[1] It could still be opt-in / opt-out for things like procedural macro errors.
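To make the idea concrete, here is a small sketch of the comparison itself. It assumes the JSON has already been parsed into plain structs (for example with serde_json, not shown); the `Diagnostic`, `Span` and `Fingerprint` types are hypothetical, with field names chosen to mirror the keys in the compiler's JSON diagnostics. This is not how trybuild works today.

```rust
use std::collections::BTreeSet;

/// (line_start, column_start, line_end, column_end) of one span.
type Coordinates = (u32, u32, u32, u32);

#[derive(Debug, Clone)]
struct Span {
    line_start: u32,
    column_start: u32,
    line_end: u32,
    column_end: u32,
}

/// One compiler message, already parsed from the JSON output.
#[derive(Debug, Clone)]
#[allow(dead_code)] // `message` is deliberately ignored by the comparison
struct Diagnostic {
    level: String,        // e.g. "error" or "warning"
    code: Option<String>, // e.g. Some("E0382"); None when there is no code
    message: String,      // human-readable text, prone to change between versions
    spans: Vec<Span>,
}

/// The part of a diagnostic that the test compares.
#[derive(Debug, PartialEq, Eq)]
struct Fingerprint {
    level: String,
    code: Option<String>,
    spans: BTreeSet<Coordinates>,
}

/// Reduces a diagnostic to (level, code, set of span coordinates).
fn fingerprint(d: &Diagnostic) -> Fingerprint {
    Fingerprint {
        level: d.level.clone(),
        code: d.code.clone(),
        spans: d
            .spans
            .iter()
            .map(|s| (s.line_start, s.column_start, s.line_end, s.column_end))
            .collect(),
    }
}
```

Two diagnostics whose wording differs but whose level, code and spans agree then compare equal, which is the resilience being argued for here.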

I strongly believe the current approach in trybuild is the right one for my use cases – actual compiler output as the user would see it, but heavily normalized to cut unimportant differences over time. Someone is free to fork it if they want to build on the JSON output instead.

So far I have found the churn from compiler diagnostics changes to be quite low. For example serde_json's ui tests have not had to change a single time since I switched them to trybuild 5 months ago. But sometimes it catches bugs in nightly (rust-lang/rust#65001), which is great.

In my projects I use rustversion to run the ui tests only on nightly (usually) or only on stable.

#[rustversion::attr(not(nightly), ignore)]
#[test]
fn ui() {
    let t = trybuild::TestCases::new();
    t.compile_fail("tests/ui/*.rs");
}

That's fine; I don't think I have exactly the same use case. Mine is about ensuring that certain code does not borrow check and, in some cases, that certain methods are ruled out by marker type parameters on a type. It seems similar to what @HadrienG said: checking the safety boundaries of the API.
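As a sketch of the second kind of property, here is a hypothetical `Handle` type whose `write` method only exists in the `Open` state. All names, including the `my_crate` placeholder, are invented for this example; the doc test asserts that calling `write` on a closed handle is rejected (`E0599` is the "no method found" error).

```rust
use std::marker::PhantomData;

/// Marker types used only as type parameters.
pub struct Open;
pub struct Closed;

/// A handle whose available methods depend on the `State` marker.
///
/// ```compile_fail,E0599
/// use my_crate::{Handle, Open}; // `my_crate` is a placeholder crate name
/// let closed = Handle::<Open>::new().close();
/// let mut closed = closed;
/// closed.write(b"late"); // no method `write` on `Handle<Closed>`
/// ```
pub struct Handle<State> {
    written: usize,
    _state: PhantomData<State>,
}

impl Handle<Open> {
    pub fn new() -> Self {
        Handle { written: 0, _state: PhantomData }
    }

    /// Records `data` and returns the total number of bytes written so far.
    pub fn write(&mut self, data: &[u8]) -> usize {
        self.written += data.len();
        self.written
    }

    /// Consumes the open handle and returns it in the `Closed` state.
    pub fn close(self) -> Handle<Closed> {
        Handle { written: self.written, _state: PhantomData }
    }
}

impl Handle<Closed> {
    /// Only read access remains once the handle is closed.
    pub fn bytes_written(&self) -> usize {
        self.written
    }
}
```

The same snippet could equally live in a separate file picked up by a compile-fail test runner instead of the documentation.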
