Blog: Why not to use Rust


#1

Wherein we look at reasons that keep people from embracing Rust.


#2

The C++ designers will try hard to eat Rust features/capabilities; they have done this many times with other languages.


#3

I’m not convinced C++ could easily become a superset of Rust, because beating it at its own game would entail 1/breaking lots of backwards compatibility (e.g. arbitrary memory pointers) and 2/focusing on usability before obscure performance features, two things which the C++ standards committee has notoriously never done.

Hopefully the future will prove me wrong, because I use C++ at work.


#4

Preserving backwards compatibility would be really easy for the C++ designers, with at least two simple options:

  1. Make borrowck a separate application, sort of like dialyzer.
  2. Make all borrowck errors warnings by default, which would admittedly be sort-of hellish because most programs would be flooded with errors.

#5

IMO that vastly underestimates the kind of changes that would be needed to “drop borrowck into C++”. Borrowck is not an independent analysis, it is deeply tied into how the language works, and C++ works significantly differently.

For borrowck to exist, you need:

  • The ability to annotate lifetimes
  • These annotations to be enforced, locally. Borrowck is not a global analysis
  • The ability to understand annotations in a template context. Again, borrowck is not global, but you need something like C++ Concepts for templates to make sense like this
  • The removal of copy/move constructors, being replaced by move by default. C++ isn’t an affine typesystem, but something like borrowck needs that to exist.
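
To make the last bullet concrete, here is a minimal Rust sketch of what "move by default" looks like in practice. The `Buffer` type and the `consume` function are made up for illustration; the point is only that passing by value gives up ownership and that copies must be requested explicitly.

```rust
// Hypothetical type for illustration; it owns a heap allocation.
#[derive(Debug, Clone, PartialEq)]
struct Buffer {
    data: Vec<u8>,
}

// Taking `Buffer` by value moves it: the caller gives up ownership.
fn consume(buf: Buffer) -> usize {
    buf.data.len()
}

fn main() {
    let a = Buffer { data: vec![1, 2, 3] };

    // A copy has to be requested explicitly; nothing copies implicitly.
    let b = a.clone();

    // `a` is moved into `consume` here.
    let len = consume(a);
    assert_eq!(len, 3);

    // Using `a` again would be rejected at compile time:
    // println!("{:?}", a); // error[E0382]: borrow of moved value: `a`

    // The explicit clone is still usable.
    assert_eq!(b.data, vec![1, 2, 3]);
}
```

C++ has no equivalent of the commented-out error: a moved-from object stays accessible, which is one reason a borrowck-style analysis does not map onto it directly.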

At this point you effectively have a different language. It will be completely incompatible with existing codebases and libraries.

The ISOC++ core guidelines are a more conservative drop-in approximation of this. They don’t completely guarantee safety, but they’re a step.


#6

Usability in C++ is a strange beast. Wrote some C++ today, and I’m always fascinated by the strange mix of high and low level.

On one hand, you can throw std::strings around willy-nilly, doing almost anything with them without a care in the world for allocations and leaks. It’s nice just to do string = char + string + *char but I don’t want to know, even if that was easy, how many unnecessary copies I’ve introduced.

On the other hand, the tiniest mistake can lead to a compiler error monstrosity that takes the uninitiated minutes to decipher, or segfaults that are hard to find.


#7

I think that, unlike Rust strings, C++ strings have a short-string optimization; this should help a bit.


#8

I lack the historical perspective that it would take to be sure, but I strongly get the impression that as of today, the ISO C++ standardization committee is much more focused on the needs of a small number of language gurus than on those of the vast majority of C++ users.

Take a look at the C++17 feature set, for example. For this standard revision, the C++ committee started with plenty of important usability improvements on the table, such as modules (aka fixing the broken include system), concepts (aka fixing the broken template system), or the Concurrency TS and Coroutines (aka fixing the broken futures and enabling standard and pleasant asynchronous library interfaces). These all had the potential to greatly improve the everyday experience of using C++.

Instead, they chose to focus on things like the Parallel STL (aka duplicating the thousands of data-parallel libraries out there so that gurus can do with one less dependency), elliptic integrals and Bessel functions (aka duplicating the thousands of advanced math libraries out there so that the gurus can drop another dependency), allowing static asserts to be undocumented (because the gurus can’t be bothered to write documentation), or more constexpr stuff (aka engraving some old and popular compiler optimizations in stone so that the gurus can guarantee that they occur).

I hope the gurus are happy.


#9

I think adding good Concepts, Modules, etc. to C++ is hard; that’s the most important reason such features keep slipping past C++ standard dates.


#10

This is true; however, one should not procrastinate on replacing broken windows by thinking that putting pretty curtains behind them will be enough.


#11

… Meanwhile, the HPC compilers are probably still stuck on C++03 :P


#12

It’s not easy, but they will try hard, for many many years to come, and they succeed on some things, because C++ is like a big blob of mud that’s rolling. And if you look at the conferences you will see things like:

https://github.com/boostcon/cppnow_presentations_2017/blob/master/05-20-2017_saturday/type_safe_programming__jonathan_muller__cppnow_05-20-2017.pdf

Type safety, as shown here, is just a small piece of Rust. Other pieces are visible in other talks and libraries.


#13

For me, that blog post translates to “the only reason to not use Rust is if something/someone won’t let you.”


Regarding the C++ discussion, when I started programming the only viable OSS version control system was CVS. It was horrible, but better than nothing. Then Subversion was created and it was like a breath of fresh air, because it did the same thing well. Then alternatives exploded and among them git emerged as this amazing, amazing game-changer because it changed the whole approach to version control, enabling amazing things.

To me, Rust is that git-like game-changer of systems programming languages because it changes the whole approach, enabling amazing things.

I am curious to see if there is another wave of things better than git/Rust in my lifetime.


#14

There are a few things that I think are reasons not to use Rust (not that there is anything better right now, but I think there can be something better). I think lifetimes are overly complex, and there are too many passing conventions. You could get away with a RAII-style mechanism that uses pass by reference when calling a function and move semantics when returning things, resulting in a much simpler language with the same memory safety. I think the restrictions on write references are a problem: holding multiple write references to a collection (for example, to swap items) is safe in many cases that Rust does not allow. Finally, I think the type system is a bit ad hoc, and Rust does not strike the best balance between keeping the type system simple and symmetrical and choosing the language semantics to match (I think the work on Chalk shows how the type system does not fit nicely as a consistent logic).
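
For readers unfamiliar with the swap complaint, here is a small sketch of what is rejected and what the standard library offers instead. The helper `swap_ends` is hypothetical; `slice::swap`, `split_at_mut` and `std::mem::swap` are the real standard library calls.

```rust
use std::mem;

/// Hypothetical helper: swaps the first and last element while holding
/// two live `&mut` references into the same slice.
fn swap_ends(items: &mut [i32]) {
    if items.len() < 2 {
        return;
    }
    let last = items.len() - 1;
    // `split_at_mut` hands out two non-overlapping mutable halves.
    let (head, tail) = items.split_at_mut(last);
    mem::swap(&mut head[0], &mut tail[0]);
}

fn main() {
    let mut items = vec![10, 20, 30, 40];

    // Two simultaneous `&mut` borrows through indexing are rejected:
    // mem::swap(&mut items[0], &mut items[3]); // error[E0499]

    // `slice::swap` performs the exchange behind a single `&mut self` borrow.
    items.swap(0, 3);
    assert_eq!(items, vec![40, 20, 30, 10]);

    // The split-based helper swaps them back.
    swap_ends(&mut items);
    assert_eq!(items, vec![10, 20, 30, 40]);
}
```

So the operation is expressible, but only through helpers that prove disjointness on the caller's behalf, which is the friction the post describes.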

For me, at the moment Rust has the right idea with type-safety, but rejects too many safe programs, resulting in the style I want to write in getting rejected. It’s easier for me to write in C++ right now.


#15

Let me try to comment on some of these points:

While I hate the syntax with passion and really hope we can come up with something easier to understand someday, I think the concept of lifetimes is sound and needed.

Recall that one design goal of Rust is to make it hard for people to break a function’s interface when only changing its implementation. For this, you really need a way to express, in a function’s signature, that given two inputs, a function returns a reference to one of them (or to something else, like a global variable). That is part of the interface contract for callers of the function: it limits what kind of objects can be used as input. That’s the same reason why there is no (full) type inference in function signatures.

The problem here is that there are good use cases both for moving input into a function, and for returning a reference, which would be hard to emulate if both of these options were disallowed.

Moving input into a function means that the function takes ownership of the input. That is appropriate when inserting data in containers, and whenever a piece of data is sent to someone else (another thread, an output peripheral…). When you are inserting a piece of data into a container, you usually don’t want to make an unsynchronized copy of it, but rather to move it there and keep it there, so any copies should be explicit. Similarly, when sending a piece of data to the outside world, you’re usually done with it, and don’t want to use it again, so moving is the proper semantic as well.
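
A short sketch of both situations, using only the standard library. The function name `send_to_worker` and the variable names are made up for illustration.

```rust
use std::sync::mpsc;
use std::thread;

/// Hypothetical helper: hands a vector over to another thread and
/// returns how many items that thread received.
fn send_to_worker(items: Vec<String>) -> usize {
    let (sender, receiver) = mpsc::channel::<Vec<String>>();

    let worker = thread::spawn(move || {
        let received = receiver.recv().expect("sender was dropped");
        received.len()
    });

    // `send` takes the vector by value: ownership moves to the receiver.
    sender.send(items).expect("receiver was dropped");
    worker.join().expect("worker panicked")
}

fn main() {
    let mut inbox: Vec<String> = Vec::new();
    let message = String::from("hello");

    // `push` takes its argument by value: the string is moved into the
    // vector, not copied. A second copy would need an explicit `.clone()`.
    inbox.push(message);
    // `message` can no longer be used from here on.

    // The whole vector is moved again when it is sent away.
    assert_eq!(send_to_worker(inbox), 1);
}
```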

Returning a reference is useful whenever you’re returning a different view of the input data. Consider, for example, iterators which extract subsets of a string according to various criteria (word by word, in blocks of N code points…). Creating a new string for each of these subsets is extremely inefficient and totally unneeded: just return a view of the original string’s data. But to implement such a view, you need pointers/references. A similar use case is container lookup functions: when searching for something in a container, you don’t want to make an unnecessary copy of it when returning the result, but if your functions are required to return by move, you will need to either do that, or worse, move the content out.
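
Both use cases can be sketched with the standard library. The functions `longest_word` and `lookup` are hypothetical wrappers; `split_whitespace` and `HashMap::get` are the real calls that return views rather than copies.

```rust
use std::collections::HashMap;

/// Returns the longest whitespace-separated word as a view into `text`.
/// No new string is allocated.
fn longest_word(text: &str) -> Option<&str> {
    text.split_whitespace().max_by_key(|word| word.len())
}

/// Looks up a value and returns a reference to it, leaving it in the map.
fn lookup<'a>(sizes: &'a HashMap<String, usize>, key: &str) -> Option<&'a usize> {
    sizes.get(key)
}

fn main() {
    let text = String::from("the borrow checker");
    assert_eq!(longest_word(&text), Some("checker"));

    let mut sizes = HashMap::new();
    sizes.insert(String::from("checker"), 7usize);

    // The value stays inside the map; the caller only gets a reference.
    assert_eq!(lookup(&sizes, "checker"), Some(&7));
    assert_eq!(lookup(&sizes, "missing"), None);
    assert_eq!(sizes.len(), 1);
}
```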

Indeed, Rust’s borrow checking definitely needs improvements in its handling of composite objects, from arrays to collections. I think the current rules were chosen because they are easy to implement while achieving the stated goal of forbidding memory unsafety (e.g. by modifying a collection as someone holds an iterator to it), but I would expect borrow gurus like @nikomatsakis to be open to suggestions on how to handle this kind of legitimate “partial borrow” use case better.
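
As a rough illustration of where the line currently sits: disjoint fields of a struct can be mutably borrowed side by side, while two elements of a collection cannot be borrowed through plain indexing. The `Player` type and the `add_second_to_first` helper are invented for this sketch.

```rust
// Hypothetical type for illustration.
struct Player {
    name: String,
    score: u32,
}

/// Adds the second element to the first while holding two `&mut`
/// references, obtained from `iter_mut` instead of indexing.
fn add_second_to_first(values: &mut [i32]) {
    let mut elements = values.iter_mut();
    if let (Some(first), Some(second)) = (elements.next(), elements.next()) {
        *first += *second;
    }
}

fn main() {
    let mut player = Player { name: String::from("ann"), score: 0 };

    // Accepted: the checker sees that the two fields are disjoint.
    let name = &mut player.name;
    let score = &mut player.score;
    name.push('!');
    *score += 1;
    assert_eq!(player.name, "ann!");
    assert_eq!(player.score, 1);

    let mut values = vec![1, 2, 3];
    // Rejected, even though the indices obviously differ:
    // let first = &mut values[0];
    // let second = &mut values[1]; // error[E0499]
    // *first += *second;

    // Accepted: the iterator hands out each element at most once.
    add_second_to_first(&mut values);
    assert_eq!(values, vec![3, 2, 3]);
}
```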


#16

Obviously it’s hard to cover all the details in a short post, but I believe there are reasonable solutions to those problems you mention.

For example with returning objects, any object local to the function should be returned using move semantics, anything passed in with move semantics should be returned by move, anything passed by reference should be returned by reference.

When passing objects into a function, I don’t want the function definition to be different, so the caller should specify move or reference, the called function signature should not change.

Regarding containers, there are internal storage containers, where you always have to copy data in (like an array). If we disallow references to be stored in containers, then I think the above rules are safe, and you don’t need lifetimes. So far this seems straightforward.

Now it gets a bit difficult, and this is just an early idea without too much thought as to how to proceed:

So the only time I think you need lifetimes is when storing a reference in a container. If the lifetime of a reference is the stack frame of the function the referenced object is defined in, then we can automatically assign the lifetime to any collection as the lowest numbered reference lifetime stored in the collection. We would then limit code to those situations where this can be determined statically. If you cannot determine it statically, then you cannot prove the use of the collection safe, and you need to copy or have garbage collection.


#17

That pretty much is the case today already.

[quote=“keean, post:16, topic:11388”]
When passing objects into a function, I don’t want the function definition to be different, so the caller should specify move or reference, the called function signature should not change.
[/quote]
Caller choosing move vs reference can be accomplished using generic functions.
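
One way to read that, sketched with the standard `AsRef` trait. The function `shout` is hypothetical; the same single signature accepts an owned `String`, a `&String` or a `&str`, so the caller decides whether to move or to lend.

```rust
/// Hypothetical function: works on anything that can be viewed as `&str`.
fn shout<S: AsRef<str>>(text: S) -> String {
    text.as_ref().to_uppercase()
}

fn main() {
    let owned = String::from("move");
    let kept = String::from("borrow");

    // The caller lends `kept`, so it is still usable afterwards.
    assert_eq!(shout(&kept), "BORROW");
    assert_eq!(kept, "borrow");

    // The caller moves `owned`, so it is gone afterwards.
    assert_eq!(shout(owned), "MOVE");

    // String literals work as well.
    assert_eq!(shout("literal"), "LITERAL");
}
```

The cost is that the function body can only rely on what both modes have in common, here a read-only view of the text.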

There are cases where you, as the function author, want to dictate the ownership semantics of arguments. Sure, you could take that flexibility away but you’d be losing legitimately useful functionality.

I might be misunderstanding you, but forbidding containers from storing references is very limiting and would create performance hazards.

How about values storing references to other values? It’s not just containers. The notion of a lifetime is always there as long as references are allowed. Other languages have this too, it’s just not part of the type system (and so compiler doesn’t point out mistakes) or there’s a GC. But even in GC’d languages you often need to consider lifetimes - not for memory safety, but for leak safety.
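
A minimal sketch of a plain value (not a container) that stores references. The `Excerpt` type is invented for illustration; the lifetime parameter is what lets the compiler tie the value to the data it points into.

```rust
/// Hypothetical value that keeps references into some source text.
struct Excerpt<'a> {
    source: &'a str,
    first_line: &'a str,
}

impl<'a> Excerpt<'a> {
    fn new(source: &'a str) -> Self {
        // `lines()` yields views into `source`, not copies.
        let first_line = source.lines().next().unwrap_or("");
        Excerpt { source, first_line }
    }
}

fn main() {
    let text = String::from("title\nbody");
    let excerpt = Excerpt::new(&text);

    // Dropping `text` while `excerpt` is still in use would be rejected:
    // drop(text); // error[E0505]: cannot move out of `text` because it is borrowed

    assert_eq!(excerpt.first_line, "title");
    assert_eq!(excerpt.source.len(), 10);
}
```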

Now, the current borrow checker isn’t perfect and does reject some cases that you “know” to be safe. The issue is those cases aren’t currently expressible in the language, and so you need workarounds. But any perceived limitations of the existing borrow checker shouldn’t be viewed as evidence of lifetimes being unnecessary.


#18

All this is enforced today by the borrow checker. And in many of these cases, you can do this without any explicit lifetime annotations, thanks to lifetime elision. These annotations become necessary, however, as soon as the lifetime of something is ambiguous when looking only at the function’s signature.
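
As a small illustration of elision, the first two functions below have the same signature as far as the compiler is concerned; only the spelling differs. All names here are made up for the sketch.

```rust
/// Elided form: with a single reference input, the output borrows from it.
fn first_word(text: &str) -> &str {
    text.split_whitespace().next().unwrap_or("")
}

/// The same contract written out explicitly.
fn first_word_explicit<'a>(text: &'a str) -> &'a str {
    text.split_whitespace().next().unwrap_or("")
}

// Hypothetical type for illustration.
struct Config {
    name: String,
}

impl Config {
    /// With `&self`, an elided output lifetime is tied to `self`,
    /// even though there is a second reference parameter.
    fn name_or(&self, _fallback: &str) -> &str {
        &self.name
    }
}

fn main() {
    assert_eq!(first_word("hello world"), "hello");
    assert_eq!(first_word_explicit("hello world"), "hello");

    let config = Config { name: String::from("demo") };
    assert_eq!(config.name_or("unused"), "demo");
}
```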

Consider this function, as a trivial example:

fn return_ref(x: &T, y: &U) -> &V {
    /* ... */
}

What is the lifetime of the &V that is being returned? Is it that of x? That of y? The intersection of both of these lifetimes? The lifetime of something else, like a global variable? Without explicit lifetime annotations, the interface contract is ambiguous, and interface ambiguity is the enemy of maintainable code as it makes the validity of caller code implementation-dependent.
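
To show how annotations settle the question, here are three possible contracts for such a function, using `&str` in place of the generic `T`, `U` and `V` above. The function names and the `DEFAULT` constant are assumptions made for this sketch.

```rust
static DEFAULT: &str = "default";

/// Contract 1: the result borrows from `x` only.
fn from_first<'a>(x: &'a str, _y: &str) -> &'a str {
    x
}

/// Contract 2: the result may borrow from either input.
fn longer<'a>(x: &'a str, y: &'a str) -> &'a str {
    if x.len() >= y.len() { x } else { y }
}

/// Contract 3: the result borrows from neither input.
fn fallback(_x: &str, _y: &str) -> &'static str {
    DEFAULT
}

fn main() {
    let x = String::from("long-lived");
    let kept;
    {
        let y = String::from("short");
        // Accepted: the signature promises the result does not depend on `y`.
        kept = from_first(&x, &y);
        // `kept = longer(&x, &y);` would be rejected, because under that
        // contract the result might point into `y`, which dies here.
    }
    assert_eq!(kept, "long-lived");
    assert_eq!(longer("ab", "c"), "ab");
    assert_eq!(fallback("a", "b"), "default");
}
```

What the caller is allowed to do depends only on the signature, so swapping the body of `from_first` for another one with the same contract cannot break this call site.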

That would require function implementations to be built to work with both values and references. But these have very different interface contracts, and taking the intersection of all of these contracts would be limiting for the function implementation:

  • A value allows you to move, something which you cannot do with either &mut or & references.
  • An &mut reference allows you to modify the target data, an & reference doesn’t.
  • An & reference can be freely duplicated (as that only results in read-only aliasing), something which has very different semantics and may be impossible when working with values and &mut.

It is thus fair of a function to require to solely be called with value, & or &mut parameters, as long as it needs the specific features of one of these parameter passing modes.
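
The three passing modes side by side, as a sketch with invented function names. Each function asks for exactly the capability it needs.

```rust
/// By value: the function owns the vector and may hand it back.
fn into_sorted(mut values: Vec<i32>) -> Vec<i32> {
    values.sort();
    values
}

/// By shared reference: read-only access.
fn total(values: &[i32]) -> i32 {
    values.iter().sum()
}

/// By mutable reference: in-place modification, exclusive access.
fn double_all(values: &mut [i32]) {
    for value in values.iter_mut() {
        *value *= 2;
    }
}

fn main() {
    let mut data = vec![3, 1, 2];

    // Shared references can be duplicated freely.
    let a = &data;
    let b = &data;
    assert_eq!(total(a) + total(b), 12);

    // A mutable reference lets the callee change the data in place.
    double_all(&mut data);
    assert_eq!(data, vec![6, 2, 4]);

    // Passing by value moves `data`; only the returned vector is usable.
    let sorted = into_sorted(data);
    assert_eq!(sorted, vec![2, 4, 6]);
}
```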

So, you would forbid people from having containers of Box for dynamic dispatch purposes, for example? As that is one trivial use case for storing references inside of a container.
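
A sketch of that use case, assuming the post means boxed trait objects (written `Box<dyn Trait>` in current Rust syntax). The `Shape` trait and the two shapes are made up for illustration.

```rust
// Hypothetical trait and implementors for illustration.
trait Shape {
    fn area(&self) -> f64;
}

struct Square(f64);
struct Rect(f64, f64);

impl Shape for Square {
    fn area(&self) -> f64 {
        self.0 * self.0
    }
}

impl Shape for Rect {
    fn area(&self) -> f64 {
        self.0 * self.1
    }
}

/// Sums the areas; each call to `area` is dispatched dynamically.
fn total_area(shapes: &[Box<dyn Shape>]) -> f64 {
    shapes.iter().map(|shape| shape.area()).sum()
}

fn main() {
    // One container holding values of different concrete types.
    let shapes: Vec<Box<dyn Shape>> = vec![
        Box::new(Square(2.0)),
        Box::new(Rect(1.0, 3.0)),
    ];
    assert_eq!(total_area(&shapes), 7.0);
}
```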


#19

I think it may seem like Rust’s ownership model and the various ways to pass things around carry overwhelming (and seemingly unnecessary) complexity, but I think that’s just part of the learning curve. Once beyond that curve, they’re legitimately useful features that allow conveying desirable semantics to the compiler, which then helps in enforcing them (and can optimize codegen using them). That’s not to say the implementation of this system is without quirks, as mentioned, but I think the idea is sound. The implementation will likely improve over time too.


#20

And of course Herb Sutter’s talk on “Leak Freedom by default” is a call to adopt some common idioms beyond RAII for leak-free C++ code:
https://github.com/CppCon/CppCon2016/blob/master/Presentations/Lifetime%20Safety%20By%20Default%20-%20Making%20Code%20Leak-Free%20by%20Construction/Lifetime%20Safety%20By%20Default%20-%20Making%20Code%20Leak-Free%20by%20Construction%20-%20Herb%20Sutter%20-%20CppCon%202016.pdf

They truly try hard, but I don’t expect quantum jumps by the C++ community.