C++ Core Guidelines

File this under related work. A group led by Bjarne Stroustrup is trying to describe what modern and safe C++ should look like in the C++ Core Guidelines. They include plans for tools that can verify adherence to these rules. Rust is also mentioned briefly in the lifetimes paper.

There's a lot in this, and I haven't read it all yet, let alone really digested it. But I thought many Rustaceans would find it interesting too.

10 Likes

Here is a Stroustrup presentation of the work at CppCon:

Note how many questions at the end are about GSL support for concurrency :smiley:

4 Likes

Herb Sutter goes into further detail on how this works (with a demo of a Visual Studio prototype implementation) here:

There is a PDF with more details: https://github.com/isocpp/CppCoreGuidelines/blob/master/docs/Lifetimes%20I%20and%20II%20-%20v0.9.1.pdf

Two noteworthy points:

  • Instead of a universal "unsafe" escape clause, there is the ability to disable a specific rule violation. This ensures you aren't unintentionally breaking additional rules.

  • He later makes the point that the C++ Core Guidelines' choice of lifetime defaults is better than Rust's, and therefore requires far fewer annotations. He does mention that Rust's defaults are generally good, but claims the new C++ rules are even better. Perhaps someone could compare his examples to what Rust does today?

I hope this is a start of an arms race where both languages will adopt good features from each other.

6 Likes

Elision is more aggressive in the prototype he shows, but I think that's also because the tools are different than in Rust.

In Rust, the elision rules are a feature of the language itself, which means that, for backward compatibility, it would be difficult to make the rules more restrictive in, say, Rust 1.4, because that would break Rust 1.(x<4) code. On the contrary, if Rust 1.4 ships with more elision, it should not be a problem (existing code will still work).
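As a minimal sketch of what elision means on the Rust side (my own example, not one from the talks): the two signatures below are equivalent; the first relies on the elision rules, the second writes the lifetime out explicitly.

```rust
// Elided: the compiler infers that the returned reference borrows from `v`.
fn first(v: &[i32]) -> &i32 {
    &v[0]
}

// What the elision rules expand to, written out explicitly:
fn first_explicit<'a>(v: &'a [i32]) -> &'a i32 {
    &v[0]
}

fn main() {
    let v = vec![10, 20, 30];
    assert_eq!(*first(&v), 10);
    assert_eq!(*first_explicit(&v), 10);
    println!("ok");
}
```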

The tool for C++ is not in the compiler but is an additional tool which IIRC basically gives warnings. I don't remember if it was Stroustrup or Sutter, but one of them said that a future version could treat more code as an error. So in their case, they can have very aggressive elision at the beginning and remove some of it if they see it doesn't work.

What I mean is: "currently" (well, the tool is not out yet) there seems to be more elision in C++ (GSL-C++?) than in Rust, but I wouldn't be that surprised to see more convergence after a few iterations of both languages.

The two questions on my mind after watching these presentations, though, are:

  1. How is this going to work with existing C++ libraries?
  2. Their solution to avoid dangling pointers seems less "invasive" than Rust's (no need for "there must be no other aliasing if you want a `&mut`"), but how will it work for concurrency?

If I understand their lifetime rules correctly, that any non-const operation invalidates pointers, then I think this is actually quite strict. Maybe too strict! For instance, does operator[] invalidate pointers? Heck, how about begin() and end() - can these iterators coexist if one call invalidates the other?

Such broad invalidation would break a lot of <algorithm> if you can't grab a pair of iterators, so I feel I must be missing something. But without inter-function analysis or additional annotations, there's no difference to the caller between, say, insert(), which needs to invalidate, and begin()/end(), which need not.

(He did cover the paired lifetimes of input iterators to insert(), but that's not what I'm talking about, because those are const, so acquiring them simultaneously is fine, e.g. cbegin()/cend().)
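For comparison, here's roughly how that distinction plays out in today's Rust (my sketch): any number of shared iterators can coexist, like cbegin()/cend(), but a mutating call while one is alive is rejected at compile time.

```rust
fn main() {
    let v = vec![1, 2, 3];
    // Two shared iterators over `v` coexist happily, the equivalent of
    // grabbing cbegin()/cend() at the same time:
    let pairs: Vec<i32> = v.iter().zip(v.iter().skip(1)).map(|(a, b)| a + b).collect();
    assert_eq!(pairs, vec![3, 5]);

    let it = v.iter();
    // If `v` were mutable, uncommenting the next line would fail to compile:
    // v.push(4); // error[E0502]: cannot borrow `v` as mutable
    assert_eq!(it.count(), 3);
    println!("ok");
}
```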

I also posted this concern on /r/cpp, but I probably missed the conversation by now. Hopefully you smart folks here can either set me straight or agree this is problematic. :smile:

Anyway, I think it's interesting how close these proposed C++ pointer invalidations are to Rust borrowing, solving some of the same problems, but resulting in opposite error reporting.

```rust
let mut v = vec![1, 2, 3];
let x = &mut v[0];
v.push(4); // error: cannot borrow `v` as mutable more than once at a time
*x = 42;
```

```cpp
vector<int> v { 1, 2, 3 };
int *x = &v[0];
v.push_back(4);
*x = 42; // ERROR, invalidated by push_back
```

i.e. Rust complains if you try to invalidate a reference; C++ Core complains if you try to use the invalidated pointer. Both recognize the problem, but assign the blame differently.

If I understand their lifetime rules correctly, that any non-const operation invalidates pointers, then I think this is actually quite strict. Maybe too strict!

My understanding is that the rule is similar to what Rust had back in 2012, a.k.a. "pure". Compare the Baby Steps blog post "Imagine never hearing the phrase 'aliasable, mutable' again".

In that system, begin() and end() are pure, and insert() is not. Rust used explicit annotation, but I think it can be inferred. As I understand it, this does not need interprocedural analysis.

My understanding from the talk was that any non-const method on the vector would be considered to invalidate pointers into the vector by default, but if different behavior is needed, annotations should be applied to the method. Herb didn't give any more details about the annotations, though.

Overall it seems that Herb presented a much simplified version of the rules compared to the PDF on GitHub. E.g. the PDF distinguishes "unique_owner" and "shared_owner" kinds of pointers, but he never mentioned that in the talk. Also, the scenarios he demonstrated were relatively simple, and AFAIK there was no follow-up talk with more technical explanations. I suppose this is still very much a work in progress. He mentioned Rust a couple of times, but in his description Rust requires more verbose lifetime annotations. I'm not sure that's the case; he never said which version of Rust he was referring to, though...

I was thinking about what might be a more complicated scenario to test that logic, maybe something like sorting a vector of smart pointers. The sorting operation has to obtain non-const iterators, and it changes the vector itself, but it doesn't change what the smart pointers point to, so if you keep a pointer to one of those sub-objects, it shouldn't be invalidated.
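That scenario is easy to check empirically on the Rust side (my sketch): sorting a `Vec<Box<i32>>` shuffles the owning pointers, but the heap allocations they point to stay where they are. Note that safe Rust won't let a `&i32` borrow into the vector live across the sort, so the sketch observes the address through a raw pointer instead.

```rust
fn main() {
    let mut v: Vec<Box<i32>> = vec![Box::new(3), Box::new(1), Box::new(2)];
    // Address of the heap allocation holding the value 3. A safe `&i32`
    // borrow couldn't live across `v.sort()`, hence the raw pointer.
    let before = &*v[0] as *const i32;
    v.sort(); // sorts the boxes; Box<i32> is Ord via its pointee
    // The value 3 is now last, but it still lives at the same address:
    let after = &**v.last().unwrap() as *const i32;
    assert_eq!(before, after);
    assert_eq!(v.iter().map(|b| **b).collect::<Vec<_>>(), vec![1, 2, 3]);
    println!("pointee survived the sort in place");
}
```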

Also, it wasn't very clear to me what happens by default when a function takes several pointers as inputs and returns a pointer. The lifetime of that pointer is assumed to be tied to the lifetime of one of the input pointers, but which one?

Update: the annotated version of vector::operator[] is shown in the paper; the annotation expresses that it doesn't affect the lifetimes of the pointers:

```cpp
T& operator[](size_t n) [[lifetime(const)]]
{
    return data[n];
}
```
1 Like

It seems that [[lifetime(const)]] corresponds to the pure of old Rust, and the rule is roughly similar. Therefore I think they will have the same problems Rust had.

1 Like

The way I understood it, it defaults to the intersection of the two (the return's lifetime is valid only while all the inputs' lifetimes are). In Rust I think it would be equivalent to defaulting to:

```rust
fn foo<'a>(x: &'a T, y: &'a T) -> &'a T
```

There was also at least one ad hoc case: if a function takes references to two strings, and one of those references is const, then the return's lifetime is the lifetime of the non-const string (because the const string is assumed to be just a pattern to search for in the other string).
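To compare with what Rust does today (my sketch, not an example from the paper): the "intersection" default corresponds to giving both inputs the same lifetime, while the search case corresponds to tying the result only to the haystack, so the needle's borrow can end earlier.

```rust
// Default-style signature: one lifetime for both inputs; the result is
// valid only while both borrows are (the "intersection").
fn shorter_of<'a>(x: &'a str, y: &'a str) -> &'a str {
    if x.len() <= y.len() { x } else { y }
}

// The ad hoc "search" case: the result borrows only from the haystack.
fn find_in<'h>(haystack: &'h str, needle: &str) -> &'h str {
    let start = haystack.find(needle).unwrap_or(0);
    &haystack[start..start + needle.len()]
}

fn main() {
    assert_eq!(shorter_of("ab", "abcd"), "ab");
    let hay = String::from("hello world");
    let found = {
        let needle = String::from("world");
        find_in(&hay, &needle) // ok: the result is tied only to `hay`
    };
    assert_eq!(found, "world");
    println!("ok");
}
```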

1 Like

I have to say, that glimpse into Rust of yore makes me appreciate more where we ended up! :smile:

Thank you, I missed that! Nice to know my concern was both valid and already addressed. I should re-read the whole thing more carefully. That's section 10, "Lifetime-const", in the lifetimes PDF, for others who are curious.

Sutter explicitly said in his talk that he designed this lifetime system without looking at other languages, to help ensure a fresh and unbiased perspective. That's fine as a starting point, but now I hope they'll take a deep comparative look elsewhere to see what lessons they have missed.

My first impression from the first 30 minutes: these guys just came out of a cryo chamber and tried to reinvent Rust as a C++ library plus a style checker. It would have worked if C++ weren't a language of unsound defaults.

3 Likes

What bothered me is how quickly they dismissed the idea that a new language could have any success. Rust improves on C++ in more ways than just lifetimes and ownership.
Just look at the things Rust shipped without.

3 Likes

@hoodie

That's because we have gigatons of code in C++. They're right that we can't just drop it. And they're right that there have been multiple attempts to overthrow C++. They just don't admit that some of those attempts were actually successful:

  • Java occupied heavyweight enterprise server-side code, and even some games (Minecraft :))
  • C# occupied parts of Windows (and also games, hello Unity)
  • JavaScript assaults small servers via Node.js
  • Go assaults Node.js and many other niches; some folks are even discussing games written in Go (which could be another success)

Because, let's be honest, C++ sucks at ergonomics and clean code. That's why it was supplanted in many areas where being 3-5 times slower matters less than debugging 200-line template substitution errors.

The place where C++ still shines is performance-critical code: games, browser engines, OS core features, etc. Even compilers are now mostly written in other languages (usually bootstrapped).

I will spare your time by not repeating all of C++'s flaws for the 100500th time :smile:

4 Likes

I'd like to nominate your post to be quote of the week.

Nominated: TWiR quote of the week - #155 by kstep

You're welcome :smile:

The problem is, you can't make the gigatons of existing code any better by imposing guidelines on new code. The old code will eventually have to be rewritten, refactored, or replaced. And that raises the question: will guideline-following C++ still be better than code written in a genuinely more modern language?

1 Like

Unfortunately, the problem is twofold.
C++ interfaces only with C++, so to reuse existing libs we'll need a certain amount of wrappers.

Though I don't think C++ can become that much better, simply because of its multiple inherent flaws. Stroustrup & co. provide a smooth transition, but no way to isolate newly written code from legacy flaws.

Each evolutionary addition increases complexity. My humble opinion is that Rust doesn't have some sky-high complexity; it just has complexity in details not familiar to C++ programmers.

Not to boast, just for context: I've been coding in C++ in production for about 7 years, and had student experience with it for about 5 years before that. And to me Rust is far simpler than C++, just because it's much, much more consistent with itself and isn't composed of a bunch of square wheels. At least for now.

1 Like

It's a bit offtopic, but I have been wondering recently whether there is a tool (like SWIG?) which can

  • take a header file for a C++ class and generate a Rust struct/impl definition and FFI bindings.
  • take a Rust struct with traits and generate a C++ class definition for it.

I guess such a tool would make a huge impact on Rust adoption.

3 Likes

That would be wonderful, except C++ class layouts are not standardized and are thus completely implementation-defined. Classes without virtual methods can be wrapped purely by code generation.
But what to do with vtables? What about multiple inheritance? What about virtual inheritance? For example, member-function pointer sizes on MSVC vary depending on the inheritance type.
The only solution I can think of is to:
a) generate compilable C++ which wraps C++ method calls as plain functions (i.e. free functions taking an explicit `this` pointer to the class body);
b) wrap that C interface with Rust externs.
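A minimal sketch of what step (a)+(b) might look like from the Rust side. Everything here is hypothetical: `Widget` and the `widget_*` functions stand in for a generated C shim around some C++ class, and the Rust side binds to them through an opaque type. The externs are declared but never called, since no C++ library is actually linked in this sketch.

```rust
use std::os::raw::c_int;

// Opaque handle: the real layout exists only on the C++ side, so Rust
// only ever manipulates pointers to it.
#[repr(C)]
pub struct Widget {
    _private: [u8; 0],
}

// The generated C shim (hypothetical) would declare something like:
//   extern "C" Widget* widget_new(void);
//   extern "C" int     widget_get_id(Widget* self);
//   extern "C" void    widget_free(Widget* self);
#[allow(dead_code)]
extern "C" {
    fn widget_new() -> *mut Widget;
    fn widget_get_id(this: *mut Widget) -> c_int;
    fn widget_free(this: *mut Widget);
}

fn main() {
    // We never call the externs here; this just shows that the binding
    // shape compiles and that the handle is a thin pointer.
    println!("handle is {} bytes", std::mem::size_of::<*mut Widget>());
}
```

Tools in this space exist today (e.g. rust-bindgen for the C-header direction), but as noted above, anything involving vtables, inheritance, or templates needs the intermediate C shim.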

By the way, we still have templates, and I have no idea what to do with them.