In search for a C++ style `const` qualifier

Since no one has shared any actual use cases for such a thing, I'm going to pull in an actual example of when mutation behind a shared reference breaks code logically.

The standard library has an immutability requirement for keys of a HashMap.

It is also a logic error for a key to be modified in such a way that the key’s hash, as determined by the Hash trait, or its equality, as determined by the Eq trait, changes while it is in the map. This is normally only possible through Cell, RefCell, global state, I/O, or unsafe code.
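To make the quoted rule concrete, here is a minimal sketch of a key changing its hash while it sits in the map. `CellKey`, `hash_of` and `mutate_key_in_map` are made-up names for illustration, not anything from the standard library. What the map does on later lookups is unspecified, so the sketch only shows that the hash really changes behind a shared reference.

```rust
use std::cell::Cell;
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

/// Hypothetical key type whose hash and equality depend on a `Cell`.
#[derive(PartialEq, Eq)]
struct CellKey {
    id: Cell<u32>,
}

impl Hash for CellKey {
    fn hash<H: Hasher>(&self, state: &mut H) {
        self.id.get().hash(state);
    }
}

/// Hashes a value with a fresh `DefaultHasher` so two results are comparable.
fn hash_of<T: Hash>(value: &T) -> u64 {
    let mut hasher = DefaultHasher::new();
    value.hash(&mut hasher);
    hasher.finish()
}

/// Inserts a key, mutates it through a shared reference, and returns
/// (hash before, hash after, number of entries in the map).
fn mutate_key_in_map() -> (u64, u64, usize) {
    let mut map: HashMap<CellKey, &str> = HashMap::new();
    map.insert(CellKey { id: Cell::new(1) }, "one");

    // `keys()` only hands out `&CellKey`, yet the key can still change.
    let key = map.keys().next().expect("map has one entry");
    let before = hash_of(key);
    key.id.set(2);
    let after = hash_of(key);

    // The entry is still stored where the old hash put it, so the result
    // of any later lookup is unspecified.
    (before, after, map.len())
}
```

This is the pattern that clippy::mutable_key_type is meant to flag.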

The main reason mutation isn't completely forbidden is that there needs to be some way to enable mutability so that we can make types like HashMap<Arc<str>, _>. And Rust doesn't have a way of distinguishing interior mutability that affects Hash/Eq from interior mutability that doesn't affect Hash/Eq, so all interior mutability is allowed.
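As a small sketch of the harmless case: `Arc<str>` keeps a reference count that changes through `&Arc<str>`, but its `Hash` and `Eq` only look at the string contents. The function name `arc_key_lookup` is made up for the example.

```rust
use std::collections::HashMap;
use std::sync::Arc;

/// Returns (strong count after an extra clone, value found under "alpha").
fn arc_key_lookup() -> (usize, Option<i32>) {
    let key: Arc<str> = Arc::from("alpha");
    let mut map: HashMap<Arc<str>, i32> = HashMap::new();
    map.insert(Arc::clone(&key), 1);

    // Cloning bumps the shared reference count through a shared reference...
    let _extra = Arc::clone(&key);
    let count = Arc::strong_count(&key);

    // ...but the hash and equality of the stored key are untouched,
    // so the lookup still finds the entry.
    (count, map.get("alpha").copied())
}
```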

Luckily, this requirement is not only in documentation, it's also in Clippy: clippy::mutable_key_type. However, there's no way to extend the lint to operate on custom collections or functions. So for things like these, the only thing you can do is put it in the documentation.

2 Likes

Thanks for clarifying that, I wasn't understanding the OP clearly. Sorry about that, @thomas001.

Yes, this is one way to solve this problem on the language level. What a const keyword would give you is that, instead of &CFrozenHashMap, you could write & const CHashMap, and the language would ensure that only methods of CHashMap marked as const are callable. So you could view const as a way to implicitly define a frozen view type within the language. Now that I am writing this, I would not be surprised if there is a crate somewhere that does exactly this :slight_smile:
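A rough sketch of what such a hand-written frozen view could look like today. `CHashMap` and `CFrozenHashMap` here are toy stand-ins written for this example (a `Mutex` around a `HashMap`), not any real crate, and the view only restricts code outside the defining module, since the field is merely private.

```rust
use std::collections::HashMap;
use std::sync::Mutex;

/// Hypothetical stand-in for a concurrent map that mutates through `&self`.
struct CHashMap {
    inner: Mutex<HashMap<String, i32>>,
}

impl CHashMap {
    fn new() -> Self {
        CHashMap { inner: Mutex::new(HashMap::new()) }
    }

    /// Mutating method: callable through a shared reference.
    fn insert(&self, key: &str, value: i32) {
        self.inner.lock().unwrap().insert(key.to_string(), value);
    }

    /// Read-only method: roughly what a `const` method would be in C++.
    fn get(&self, key: &str) -> Option<i32> {
        self.inner.lock().unwrap().get(key).copied()
    }

    /// Hands out a view that exposes only the read-only methods.
    fn frozen(&self) -> CFrozenHashMap<'_> {
        CFrozenHashMap { map: self }
    }
}

/// Hypothetical frozen view: it forwards `get` and nothing else.
struct CFrozenHashMap<'a> {
    map: &'a CHashMap,
}

impl CFrozenHashMap<'_> {
    fn get(&self, key: &str) -> Option<i32> {
        self.map.get(key)
    }
}

fn read_only(view: &CFrozenHashMap<'_>) -> Option<i32> {
    // view.insert("a", 2); // would not compile: the view has no `insert`
    view.get("a")
}
```

The cost is that every read-only method has to be forwarded by hand, which is the boilerplate a const keyword would remove.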

Thank you for the great example! :slight_smile:

This is not a language war, most people here are pretty convinced that Rust is a great language to work with. My company seems quite committed to start using rust mid-term. Still, Rust is a developing language, C++ is a developing language (finally again). Both can learn from one another. So IMO it's worth discussing whether this feature, that C++ programmers value, could have a place in Rust or not.

That would not be enough. Rust doesn't have syntax that is only a suggestion, which is what const is in C++, so immutability would have to be enforced. The compiler would need to disallow interior mutability on const refs. This is much more difficult.

And for what you described to be generally useful, someone would have to manually add a const keyword to every applicable parameter in the std library, which could be the majority of existing parameters. Therefore some other way of specifying constness (or non constness) would need to be designed. I'm having trouble coming up with a way to add this to the language that is usable but doesn't badly break existing usage.

What @the8472 said is probably the most relevant:

I don't see an rfc for that when scanning the list.

1 Like

You might be interested in this, I think it has some relevance here:
https://graydon2.dreamwidth.org/312681.html

"The typical Rust program has either no refcells, or a very small number, just as it has either no unsafe code or a very small amount; so it's easier to reason about them as a special case..."

What I would say is that interior mutability tends not to be used very much in Rust programs, so managing it typically is not a problem. I am not sure how to explain this further or more precisely, it is something you will discover "by experience" I think.

4 Likes

In some metaphorical sense it is, just not in the usual fashion where every side tells the other side they are idiots.

Summary so far:

  1. C++'s const and Rust's non-mut are different, yet, critically, the important design choice is identical: const/mutable in C++ map exactly to non-mut/UnsafeCell (the syntax is different but the actual behavior is exactly the same).
  2. But in practice C++ designs don't usually use interior mutability; they use something different and use const as a marker.
  3. Rust designs don't have anything to do that separation, except documentation, which gives us a completely different question: do we need some different modifier (let's call it great) which does something different, but would be used in the same fashion as const is used in C++?
  4. And this form of the question sends us straight to IRLO, because the request “why don't we have something that does we-have-no-idea-what but that we may use exactly like const in C++” is really a very different question from “why is there no C++ style const in Rust?” which started the whole discussion.

But that example looks like a counter-example to me. Think again about this:

You don't want to be able to freeze types completely. You want to disallow some interior mutability while simultaneously keeping the door open for some other kinds of interior mutability.

This is not the ability to “freeze” something. You want to keep the door half-open and allow some “unimportant” changes while disallowing some “important” changes… which is exactly what the const qualifier does in C++ and which IMNSHO would be entirely unsuitable for Rust!

But that's the key insight: const in C++ has so many warts, gotchas, exceptions and craziness because it tries to solve the impossible task of separating “unimportant” changes from “important” ones!

And then it fails. Utterly and completely: people violate C++ const rules, compilers violate C++ const rules, there are crazy gotchas like std::launder (try to explain what that crazy thing even does and why it is needed!) and so on… and yet C++ pretends that const does the impossible and uses it as if it may actually do the impossible.

Rust… doesn't believe in the design based on something that's actually impossible.

Take the main division that Rust has: unsafe. Why does it exist? There is an impossible question: separation of safe code from unsafe code. That's impossible, so Rust gives you two choices: accept false positives or false negatives. Either the compiler is too lenient and accepts code that's incorrect (the unsafe realm) or it's too strict and rejects code that's fine (normal, “safe”, Rust).

Similarly with C++ const designs: separating code which makes benign changes to your data structures (that's what C++ const is trying to do and which is, of course, impossible to verify) from dangerous ones is impossible, thus Rust embraces two obvious answers: immutable shared references and interior mutability. Again: false positives and false negatives for the exact same reason.

And I would argue that something that is used like C++ const would just be impossible to add to Rust because it relies on getting correct answer to the impossible question.

If you tried, you would get another layer of two answers which would IMNSHO help no one. This hypothetical great modifier wouldn't act like C++ const, no. It would just be another two answers: one with false positives and one with false negatives.

P.S. I get the desire to get the “simple” answers of C++. But you have to understand that they were rejected for a reason: if something is impossible then it's better to accept that it is impossible and embrace it. The C++ “solution” of adding another layer on top of the many existing layers of exceptions every three years is madness. And, again, we even know for sure this will never actually give us the answer that C++ users naïvely think they actually have.

5 Likes

Thanks, your post is insightful as usual and helps to think clearly about this.

I agree. But what is missing in Rust is a clear indication that a function modifies (or does not modify) its &T argument with interior mutability. This is useful information to the reader.

So perhaps a small improvement would simply be to have a clear doc indication of when and how interior mutability is used. Similar to panics, which are not part of the function signature but should be listed in a Panics section, there could be an Interior Mutability section by convention. It could be placed at the parameter, function or type level, to avoid repetition.

This wouldn't address the OP issue at all, it's just a small improvement to the current situation where there is no convention for explicitly describing it and it may come as a surprise to the user.

This would be nice if it happened. Another thing that is closely related, and is also similarly squirrelly to define formally, is “if you Clone this type, you get another completely equivalent handle to the same resource”. I often see library documentation completely not mentioning this, yet it can be quite important for good usage of the sort of API where it appears.

(And in the absence of interior-mutable application data, this also shows up as a possibly-important performance characteristic, though not a semantic one: “it is O(1) to clone this regardless of how big its contents are”; things like Arc<str>.)

However, I'm not sure “interior mutability” is a good way to describe it for documentation purposes, though we have no better standard term. For example, both a &std::sync::mpsc::Sender<u8> and a &std::process::ChildStdin allow you to transmit bytes to some destination, but the former mutates some in-process memory and the latter does not. So, there are two different properties:

  1. Implementation property: This type makes use of some exemption from the “no mutation through &” rule. (That is, it contains or refers to an UnsafeCell somewhere.)
  2. API property: This function has a side effect even though it takes no &mut parameters (or, this type has such methods).

As people understanding how a Rust program works, we talk about (1) because it's well-defined and explains the mystery “wait I thought & was inviolably immutable”. As people writing a program to use some library, we actually care about (2), and so the documentation should describe things in terms of (2).

8 Likes

Great thoughts! A "Side Effects" section is more appropriate and has wider applicability.
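Something like this is what such a section could look like in practice. It is only a sketch of the convention being discussed: the `# Side Effects` heading is not an established rustdoc standard, and `send_byte` and `roundtrip` are made-up helpers around the real `std::sync::mpsc` API.

```rust
use std::sync::mpsc::{channel, Sender};

/// Sends one byte to whoever holds the receiving end.
/// Returns `false` if the receiver has been dropped.
///
/// # Side Effects
///
/// Enqueues `byte` on the channel even though `tx` is only a shared
/// reference.
fn send_byte(tx: &Sender<u8>, byte: u8) -> bool {
    tx.send(byte).is_ok()
}

/// Sends two bytes through `&Sender` and collects what arrived.
fn roundtrip() -> Vec<u8> {
    let (tx, rx) = channel();
    send_byte(&tx, 1);
    send_byte(&tx, 2);
    drop(tx); // close the channel so the iterator below terminates
    rx.iter().collect()
}
```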

2 Likes

There's also the minor detail that it's not possible to actually forbid interior mutability.[1] Freeze doesn't detect interior mutability behind indirection, and even if it was tenable to do so, you don't need ownership or pointers to still effectively "contain" interior mutability.[2]


  1. without forbidding doing anything I suppose ↩︎

  2. This is the "global state" from the docs; "I/O" could be even more opaque, e.g. utilize a file. ↩︎
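A small sketch of the "global state" case, with made-up names (`SALT`, `GlobalKey`, `hash_of`): the key type is zero-sized, so there is no field, pointer or UnsafeCell inside it for any type-level check to find, and yet its hash can change at any time.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};
use std::sync::atomic::{AtomicU64, Ordering};

/// Hypothetical global that the key's hash secretly depends on.
static SALT: AtomicU64 = AtomicU64::new(0);

/// A zero-sized key: no fields, no pointers, no `UnsafeCell` inside.
#[derive(PartialEq, Eq)]
struct GlobalKey;

impl Hash for GlobalKey {
    fn hash<H: Hasher>(&self, state: &mut H) {
        SALT.load(Ordering::Relaxed).hash(state);
    }
}

/// Hashes a value with a fresh `DefaultHasher` so two results are comparable.
fn hash_of<T: Hash>(value: &T) -> u64 {
    let mut hasher = DefaultHasher::new();
    value.hash(&mut hasher);
    hasher.finish()
}

/// Returns true if the key's hash changed after touching only the global.
fn hash_changes_with_global() -> bool {
    let key = GlobalKey;
    let before = hash_of(&key);
    SALT.fetch_add(1, Ordering::Relaxed);
    hash_of(&key) != before
}
```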

3 Likes

No, it's not useful! That's precisely the issue. Think back to that textbook case with HashMap: you want to allow HashMap<Arc<str>, _>, which changes its interior in “unimportant” ways, but don't want to allow HashMap<Mutex<i32>, _>, which changes its interior in an “important” way.

And then you invent a bazillion rules which [try to] distinguish “important” from “unimportant”, but they all fail because what you are interested in is a semantic property, not a syntactic one… and semantics just can't be derived from code!

Documentation may work since documentation doesn't have to separate the wheat from the chaff: it describes, roughly, what you are allowed to do and what you are not allowed to do in terms of the data structure in question, but that can't be extended to all possible data structures and all possible types.

Yeah. Same issue: deep clone vs shallow clone with no way to formally and universally define how they actually differ.

You may forbid it the C++ way (almost identical to the unsafe Rust way): declare that this and that thing is not allowed without giving any means of checking whether the program does this or that.

But that's something people [try to] avoid in “normal”, “safe” Rust. At least on the language level (we couldn't put information about everything in types, at some point you have to agree that something should actually go into a documentation, exclusively).

3 Likes

It is the semantic side effect that I would list in the side effects doc. I wouldn't list changes to the Arc ref count, but I would list changes to a shared mutable container made by a method for modifying the container -- the idea there is to have a prominent way to point out that although you're passing &Container there is a mutation going on. But maybe it's unnecessary, since the method doc would describe the mutation anyway.

1 Like

But would that be good enough? There is always an escape hatch when the answer is wrong. C++ has const_cast, no idea what Rust would have. IMO this does not have to be perfect: if a user tries really hard to violate "const guarantees" they probably can, it should just be obvious. As many others and you pointed out, "const" is a semantic property, a marker, an API hint that might propagate from methods to receivers to help the coder uphold the API contract. But in the end, it's semantics and cannot be formally verified. I am okay with that. Maybe one could come up with a set of rules that overestimates const correctness and would be formally correct, who knows.

While I agree that documentation always helps, I've seen too many cases where documentation is not read, was misunderstood or was changed after code has been written without giving much thought about breaking the API contract. Hence, I take every bit of compiler support I can get.

I think @khimru has made good arguments why it is not good enough in the context of Rust.

In addition there are the backward compatibility issues. What would greatly bother me is that the vast majority of existing &T parameter types do not allow "semantic interior mutability", they are semantically immutable as @geebee22 mentioned. So how would you add &const T in a useful way, yet not make a huge mess of things?

A detailed design is likely to uncover other usability issues and conflicts with the current language. For example, Rust already has a const keyword that corresponds to the C++ constexpr. This is just a naming issue, but naming is really important.
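For anyone coming from C++, a tiny sketch of what Rust's existing const means (compile-time evaluation, much like constexpr, and nothing to do with mutability). The names `square`, `SIXTEEN` and `BUFFER` are made up for the example.

```rust
/// Can be evaluated by the compiler when called in a const context.
const fn square(x: u32) -> u32 {
    x * x
}

/// Computed at compile time.
const SIXTEEN: u32 = square(4);

/// Array lengths must be compile-time constants, so this only works
/// because `SIXTEEN` is one.
static BUFFER: [u8; SIXTEEN as usize] = [0; SIXTEEN as usize];
```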

2 Likes

None exists to my knowledge. It's often discussed in the context of being generic over async and const keywords (also "function coloring") but effects are more general than that.

This blog post lists a bunch of effect flavors that would be relevant to this discussion.
It doesn't treat "const" as a singular feature and instead breaks it down into further aspects such as parametricity, no-panic, no-divergence, capability-safety, no-allocation.

2 Likes

I tried to identify why I found it hard to understand what OP is looking for, and reassembled the context of this thread in my own words, as I understand it. This is in no way meant to be offensive. Please tell me if I misinterpreted your intention even slightly.

OP is looking for a way to state that a value can't be mutated through a given reference, like the const keyword in C++. This is what normal non-mut references are supposed to do, but OP thinks that is too weak due to the existence of interior mutability. C++ also has the mutable keyword, which is basically the same as interior mutability feature-wise, but OP thinks we can generally trust C++ programmers not to do weird things with it.

My conclusion is, why don't you trust Rust programmers, too?

For example, std::collections::HashMap::iter_mut() returns an iterator over (&K, &mut V) tuples. You can mutate each value during iteration but not the keys, as modifying a key may change its hash, which would break the hash table's invariant.
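A short sketch of that signature in use; `double_values` is just a made-up helper name.

```rust
use std::collections::HashMap;

/// Doubles every value in place; the keys stay read-only.
fn double_values(map: &mut HashMap<String, i32>) {
    for (key, value) in map.iter_mut() {
        // key.push('!'); // would not compile: `key` is only `&String`
        let _: &String = key;
        *value *= 2;
    }
}
```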

3 Likes

If by "weird things" you mean shared mutability, then the answer is: in C++ you don't need interior mutability to do that, whereas in Rust you do.

That's because in C++ T& can be shared whereas in Rust the equivalent &mut T can't be shared.

As a result, it is standard in Rust to implement shared mutability by using shared non-mut references, including in the standard library.
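A minimal sketch of that pattern, using `Cell` from the standard library (the function names are made up): two shared references to the same value are alive at once, and both are used to mutate it.

```rust
use std::cell::Cell;

/// Increments the counter through a shared reference.
fn bump(counter: &Cell<u32>) {
    counter.set(counter.get() + 1);
}

fn shared_bumps() -> u32 {
    let counter = Cell::new(0);
    let a = &counter;
    let b = &counter; // a second live shared reference to the same value
    bump(a);
    bump(b);
    bump(a);
    counter.get()
}
```

The same thing with `&mut u32` would be rejected, because `a` and `b` could not both exist.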

"Weird things" means mutability in general, since that's what OP wanted to prevent. OP was looking for a counterpart of C++'s const qualifier, which implies they found it good enough at preventing mutations for their use cases. My suggestion is that if C++ const is enough for you, Rust &T should be enough too. Surely both can be abused with malicious code but that's not a concern here.

1 Like

Absolutely not. Not even remotely close.

OP is still at the stage of “attempting to write C++ in Rust”. Rust very much doesn't like it. All these attempts to write C++ in Rust, Java in Rust or Python in Rust just lead to insane levels of frustration.

Why? Because they give radically different answers to what to do about shared mutability question.

The shared mutability question is, in some sense, the central issue of programming. It was said by Tony Hoare (who else?) half a century ago: “references are like jumps, leading wildly from one part of a data structure to another… their introduction into high-level languages has been a step backward from which we may never recover”. But the real danger is that references to mutable state introduce lots of confusion.

But of course shared mutable state is something that's very much needed: this forum would be very dull if we wouldn't be able to change anything or to share our thoughts.

And every language struggles to provide the best answer, be it const in C/C++ or ConcurrentModificationException in Java… it's all about [attempt to] find out when shared mutable state changes.

And the answer to that question in Rust is radically different from all other languages.

Most mainstream languages divide functions into two classes:

  1. The ones that change the state (setters).
  2. The ones that don't change the state (getters).

And then this separation propagates from the bottom to the top and you have to always remember whether you are supposed to modify state or not.

Many C++ programmers don't even know the mutable keyword exists! Because it's so rarely used, and when it is used, it's mostly for things like caching, where the outside world doesn't even have to know there is some kind of mutable cache inside.
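For comparison, the Rust counterpart of that kind of hidden cache might look roughly like this. `Text` and its fields are invented for the example; the point is only that `word_count` takes `&self` and callers can't tell a cache is being filled.

```rust
use std::cell::Cell;

/// Hypothetical type with a lazily computed, cached word count.
struct Text {
    content: String,
    // Plays the role of a `mutable` member in a C++ class.
    word_count_cache: Cell<Option<usize>>,
}

impl Text {
    fn new(content: &str) -> Self {
        Text {
            content: content.to_string(),
            word_count_cache: Cell::new(None),
        }
    }

    /// Looks like a getter from the outside; fills the cache on first use.
    fn word_count(&self) -> usize {
        if let Some(cached) = self.word_count_cache.get() {
            return cached;
        }
        let counted = self.content.split_whitespace().count();
        self.word_count_cache.set(Some(counted));
        counted
    }
}
```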

Sometimes, in rare cases, you have to know about these (e.g. glibc caches the PID of the process because it changes so infrequently, and if you call the raw fork syscall then you would be in trouble, but 99.99% of programmers don't even know or care about things like that), but most of the time you live in the world of getters and setters.

Rust, on the other hand, makes handling shared mutability the responsibility of the data structure. If your data structure supports it then it's the responsibility of that data structure to handle all these issues. There is no separation between mutable and immutable operations for such a data structure.

In some cases you have to care (e.g. if you use such data structure as key in hash map), but most of the time… nope.

Boats recently wrote an interesting article about that difference in philosophy, but it may be highlighted by one simple example: the mutex. A mutex in Rust is intrinsically tied to the data that it protects and you just can't easily write incorrect code. You don't have to care about whether you need to take the mutex or not, because, most of the time, you simply don't have a choice! In C++ a mutex just exists in a vacuum and you have to know whether you have to use it or not!
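A minimal sketch of what "tied to the data" means with `std::sync::Mutex` (the function name `parallel_sum` is made up): the integer lives inside the mutex, so the only way to reach it is through the guard returned by `lock()`.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

/// Adds 1 + 2 + ... + workers, one addition per thread.
fn parallel_sum(workers: u32) -> u32 {
    let total = Arc::new(Mutex::new(0u32));

    let handles: Vec<_> = (1..=workers)
        .map(|n| {
            let total = Arc::clone(&total);
            thread::spawn(move || {
                // There is no way to touch the integer without locking.
                *total.lock().unwrap() += n;
            })
        })
        .collect();

    for handle in handles {
        handle.join().unwrap();
    }

    // Copy the value out while the guard is alive, then return it.
    let result = *total.lock().unwrap();
    result
}
```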

2 Likes

You make it sound like this is a bad thing. It's not only remembering that choice when writing a function, it's also part of the interface. So your callers know whether you will modify state or not. (up to mutable...const_cast and so on)

Again, you make it sound like this is a good thing? I know you probably think this is going in circles, maybe it's just a long time of C++, but I really like my API to tell me whether the operation is going to mutate the visible state or not. I am not sure why it's waved away by just saying "it's usually not needed". Sure, you can write complex code without const correctness, most languages do that. Yet, having it is an added tool that makes the API cleaner and safer to use, communicates assumptions and prevents errors like accidental mutations. Once there is a large code base with a lot of different people working on it, every additional check is worth it. I've seen enough bugs where some state was changed that shouldn't have been (Yes, I am looking at you Go...).

Thanks for the article, I'll give it a read. For the example of the mutex, I am not sure I'd agree here. There are C++ synchronization libraries that work like Rust's mutex, it's just called a monitor (Herb Sutter showed one in some talk, but there are plenty of good and not so good implementations out there). I'd consider this an API choice, not a language difference.