Soft question: underscore in names

zeroexcuses · October 17, 2021, 1:16am

Is it just me, or is

Err:Hey_This_Bad_Thing_Happened much easier to read than Err:HeyThisBadThingHappened ?

I find myself agreeing with most of the Rust style guide, but this camel case on types / enums has been a real struggle. I am increasingly finding myself adding _ into type names and enum variants.

jhpratt · October 17, 2021, 1:31am

You are, of course, free to name things as you want, but doing so is not idiomatic Rust.
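For anyone who wants to try the underscore style anyway, here is a minimal sketch. It assumes the built-in `non_camel_case_types` lint is the one that flags such names; the enum `Parse_Error` and the function `describe` are hypothetical and exist only for illustration.

```rust
// Hypothetical error type using the underscore style from the question.
// Without the attribute, rustc warns via the `non_camel_case_types` lint.
#[allow(non_camel_case_types)]
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Parse_Error {
    Unexpected_End_Of_Input,
    Invalid_Digit,
}

// Returns a human-readable message for each variant.
pub fn describe(e: Parse_Error) -> &'static str {
    match e {
        Parse_Error::Unexpected_End_Of_Input => "unexpected end of input",
        Parse_Error::Invalid_Digit => "invalid digit",
    }
}
```

The attribute only silences the warning locally; other people reading the code will still expect `ParseError::InvalidDigit`.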

crumplecup · October 17, 2021, 1:33am

Even if the style choice is arbitrary, I like having the name styling convention recommended to me by clippy, my personal code consultant. Adhering to a uniform standard offers a number of benefits beyond ergonomics and personal preference, including making code easier to share and work on collaboratively.

Fredrik · October 17, 2021, 9:06am

I find the incoherent mess of naming conventions slightly disturbing. And this is one of extremely few things in common Rust style that I find disturbing at all. I've eventually had enough, and now I start every crate with #![allow(non_upper_case_globals)]. I would love to have an option #![forbid(upper_case_globals)] to help maintain consistency. I'm tempted to do the same for enum variants. If you choose to violate the naming conventions, you have my sympathy.
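As a rough sketch of what that lint override looks like in practice (the constant `max_retries` and the function `should_retry` are made-up names, and the attribute is applied per item here rather than crate-wide):

```rust
// Item-level form of the lint override; the crate-wide form mentioned above
// is `#![allow(non_upper_case_globals)]` at the top of main.rs or lib.rs.
#[allow(non_upper_case_globals)]
const max_retries: u32 = 3;

// The body reads the same whether `max_retries` is a local `let`
// or a shared `const`.
pub fn should_retry(attempt: u32) -> bool {
    attempt < max_retries
}
```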

The distinctions that are expressed through the choice of naming convention don't even make sense:

When I refer to some data, I really don't care whether this data was defined as let n: u8 = 42 or const n: u8 = 42. (For those new to Rust, keep in mind that both let and const are constant, unlike let mut.) I want it to be named the same in either case. When I move the definition out of the function body so it can be shared with another function, this requires changing from let to const, which for no meaningful reason requires me to rename all references to this data to scream at you in uppercase.

If someone wants to have identifiers highlighted based on details about their definitions, they should use syntax highlighting based on the definition information provided by Rust Analyzer. This is better because it's automatic and configurable, rather than having an old book of rules decide which kinds of definitions are to be highlighted in which way and manually write the highlighting into the code, and it can use color instead of screaming uppercase letters.

Enum variants are actually functions, which can be called just like any function and can be passed as callbacks, but somehow enum variants have a different naming convention from functions defined in other ways. The naming convention leaks an implementation detail about the function. Same goes for structs by the way.

It seems like the naming convention is trying to distinguish type context from value context, which is the job of syntax highlighting, not of the person writing the code. And just like now-obsolete regex-based syntax highlighting it's failing to take into consideration that the same name can be used in both a type context and a value context. To achieve this goal, which shouldn't be achieved anyway, because syntax highlighting does it better, it would be more meaningful to make the identifiers case-insensitive and have the style guide mandate code like this: let x: My_Struct = my_struct(42);

Now as to why types are without underscores, my humble guess is that traditionally (i.e. before Rust) there wasn't as much type inference, and code was cluttered with so many references to types it was difficult to read the part of the code that actually does something, causing people to run away to dynamically typed languages, or have style guides mandate that type names should be more concise than the part of the code that actually does something.

Having names with underscores and names without underscores in the same language causes the search and replace feature to behave inconsistently. If I want to search for “chocolate cake factory” I have to make a (possibly case-insensitive) search for chocolate_cake_factory and at the same time a case-sensitive search for the regex [Cc]hocolateCakeFactory. If I want to select all matches and edit them all at the same time, well, let's just give up and edit them one at a time. Same goes for mixing uppercase and lowercase identifiers in the same language.

troplin · October 17, 2021, 9:31am

I agree that the name itself is easier to read, but I find that it makes the code as a whole harder to read.
Underscores look too much like spaces and my brain "parses" the name as multiple tokens instead of a single one. Which is probably exactly what makes it easier to read in isolation.

ZiCog · October 17, 2021, 10:38am

It's just you

To my sensibilities if one is going to use underscores then the uppercase is redundant and jarring. hey_this_is_a_bad_thing_happened is prettier.

I like Rust's casing convention. Especially as it distinguishes between variables, type names and constants. That is usefully helpful. I don't want to have to rely on some IDE or syntax highlighter to point out those differences.

Most importantly, even if I do have some quibbles with Rust's formatting/style as checked by clippy, I feel it would be better if we all just went along with it rather than each going their own way. Which would have some positive results:

We end up with a global pool of software all written to the same formatting/style. Which in the long run removes confusion and eases the reading of everyone else's code for all of us.

It's one less thing to have to think about when we write our code.

It removes the endless bickering about formatting that goes on in project teams.

Nobody has to waste time putting together yet another company/project style guide.

Michael-F-Bryan · October 17, 2021, 10:56am

Naming conventions are one of those topics that people start religious wars over, and starting your comment with such strong wording means the @moderators will need to follow this thread closely to prevent it from devolving into arguments over naming.

Whether you agree with the chosen conventions or not, Rust's naming conventions are consistent and have been codified in RFC 430.

Different "kinds" of names are only valid in different contexts (types can't be used as values; module paths, which often contain ::, can't be used as types; static/const are typically globals; etc.) and we use naming conventions so you can see at a glance what they are, with things that are generally interchangeable being named similarly (crates and module paths, functions and locals, type parameters and types and traits, etc.).

Item	Convention
Crates	`snake_case` (but prefer single word)
Modules	`snake_case`
Types	`UpperCamelCase`
Traits	`UpperCamelCase`
Enum variants	`UpperCamelCase`
Functions	`snake_case`
Methods	`snake_case`
General constructors	`new` or `with_more_details`
Conversion constructors	`from_some_other_type`
Local variables	`snake_case`
Static variables	`SCREAMING_SNAKE_CASE`
Constant variables	`SCREAMING_SNAKE_CASE`
Type parameters	concise `UpperCamelCase`, usually single uppercase letter: `T`
Lifetimes	short, lowercase: `'a`

H2CO3 · October 17, 2021, 11:02am

That is plainly incorrect; let is not the same as const. The former means immutability, the latter means compile-time evaluation.
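One way to see the difference, as a small sketch with hypothetical names (`LEN`, `zeros`, `doubled_len`): a `const` can appear where the compiler needs a constant expression, such as an array length, while an immutable `let` may hold a value that only exists at run time.

```rust
// A `const` must be computable at compile time, so it can size an array.
const LEN: usize = 4;

pub fn zeros() -> [u8; LEN] {
    [0; LEN]
}

// An immutable `let` may hold a value known only at run time.
pub fn doubled_len(input: &str) -> usize {
    let n = input.len(); // immutable, but not a constant expression
    // let bad: [u8; n] = [0; n]; // would not compile: `n` is not a constant
    n * 2
}
```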

Otherwise, naming different things differently is not "inconsistency", it is the right thing to do. CamelCase vs snake_case, for example, helps visually distinguish between type-level things (types and traits) and value-level things (variables, functions, etc.).

Heliozoa · October 17, 2021, 11:02am

You may also get a compiler error, and in this way you may be forced to care about something being const or not:

const x: usize = 0;
fn main() {
    let x = 0;
}

error[E0005]: refutable pattern in local binding: `_` not covered
 --> src\main.rs:3:9
  |
1 | const x: usize = 0;
  | ------------------- constant defined here
2 | fn main() {
3 |     let x = 0;
  |         ^
  |         |
  |         interpreted as a constant pattern, not a new variable
  |         help: introduce a variable instead: `x_var`
  |
  = note: the matched value is of type `usize`

It also makes it more clear what's going on when matching with constants

    let num = 0;
    match num {
        x => {}
        y => {}
    }

This isn't quite true.

use std::cell::Cell;

const CONST: Cell<usize> = Cell::new(0);

fn main() {
    let not_const = Cell::new(0);
    not_const.set(1);
    CONST.set(1);
    println!("{}", not_const.get());
    println!("{}", CONST.get());
}

will print

1
0

const is actually constant, let just disallows reassignment and mutable borrowing.

Fredrik · October 17, 2021, 11:10am

In my experience the Rust community has the unusual strength that it's able to have meaningful discussions instead of religious wars, and doesn't need to declare certain topics a taboo.

Fredrik · October 17, 2021, 11:15am

No

Which is beside the point. The example was let n: u8 = 42;, which is compile-time evaluated, exactly as const n: u8 = 42.

Which is incorrect, as explained in the post you replied to if you care to read the entire post.

trentj · October 17, 2021, 11:16am

That may be true, but naming is one of those things everybody has an opinion on, none is really definitively better than the others, and we're all poisoned by our prior exposure to other languages with various conventions. Is it really worth arguing about? Or is it more important to just say, "Okay, we're all going to be slightly uncomfortable with something about this, but it's better that we all do the same thing because the differences between styles are less important than the fact that there is a style"?

Naming is one of those things that has been done to death, over and over and over again for at least the last 50 years. Is there anything really novel to say about it at this point?

steffahn · October 17, 2021, 12:13pm

It would help if you tried your best to follow the route of meaningful discussion yourself, too. Especially with a controversial opinion it's important to always explain yourself and not to use unnecessarily concise and bold/impolite wording or to make too emotional/subjective points, in order to keep the discussion civil.

The already cited beginning of your post, going even a bit further than what was cited above

is just a bunch of subjective description of what you feel and describing at length how you're "violating" the conventions, a long time before getting started on providing the very first actual reason or explanation for your standpoint.

This, as well as e. g. your short answers in

are, in my view, hardly advancing or supporting any "meaningful discussion".

(I mean, come on, at least provide a short reference/quote to what part of your post you're referring to when saying "as explained in the post ..." — that post isn't exactly short and, as I explained above, contains a significant amount of explanations of personal opinions and actions, so it's easy to miss one of your points. It's also easy to just misinterpret or misunderstand one of your points, so there's really absolutely no need to immediately jump to the accusation that someone didn't "read the entire post".)

zeroexcuses · October 17, 2021, 12:30pm

This is my fault for not pointing the conversation in the right way.

What motivated this is that

I can comfortably read Hey_This_Is_Bad_Thing_Happened at font size N.

For HeyThisBadThingHappened I need N * 1.25

Furthermore, with monitors going widescreen, I have lots of unused horizontal space but am short on vertical space.

Concretely, I can currently fit around 70 lines of code with Hey_This, vs only 57 lines of code with HeyThis.

erelde · October 17, 2021, 12:43pm

About font size. I have "perfect" vision, and for some years now I have consistently zoomed in on every website (URLO and IRLO are 150% zoomed) and application, and I set the font size in my editor to 14 or 16px.

Reducing the amount of code I can see at a glance is rarely (never?) an issue for me. But not pretending to be a "manly manly man" superhuman, and in particular avoiding neck pain from straining to focus the eyes, is a huge comfort boost.

In short and in my opinion, if you need to zoom : zoom.

trentj · October 17, 2021, 12:49pm

I mean, that's plausibly just a bad name. HeyThisBadThingHappened is too abstract to critique, but for example...

ErrorOpeningMyCrateConfigFile is a bad name, but mostly because it contains information that is redundant with whatever namespace it's in. mycrate::Error::OpenConfig contains the same information, but can be shortened when the full path is not necessary and the ::s may be highlighted in a different color than the names, making the "words" even more distinct.
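A sketch of that layout, under the assumption that the crate is modelled as an inline module: `mycrate`, `Error::OpenConfig` come from the example above, while `ParseConfig` and `load_config` are invented here purely for illustration.

```rust
// Hypothetical crate, modelled as an inline module for the sake of the sketch.
pub mod mycrate {
    #[derive(Debug, PartialEq)]
    pub enum Error {
        OpenConfig,
        ParseConfig,
    }

    // Stand-in for real file handling: fails when the path is empty.
    pub fn load_config(path: &str) -> Result<String, Error> {
        if path.is_empty() {
            return Err(Error::OpenConfig);
        }
        Ok(format!("loaded {}", path))
    }
}
```

Callers can write `mycrate::Error::OpenConfig` in full, or bring `mycrate::Error` into scope and shorten it to `Error::OpenConfig` where the context is obvious.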

Personally I don't love :: for namespace resolution, but meh.

Another thing to consider is that when writing "library types" you are ascribing meaning to a name that is more than just the words in it. Take RefCell for example: this is more than just a Cell for Refs; it's its own kind of abstraction and there's no additional layer of meaning to be divined from separating the words. When I read RefCell I don't think "ref... cell", it's a whole word for the fairly novel concept of "refcell" (which is only kind of related to the concepts of references and cells).

TomP · October 17, 2021, 2:42pm

I've been around what is now called the Internet since ARPAnet (before it was renamed DARPAnet). As Phil Karlton reputedly said on many occasions about two decades ago,

This whole discussion is reminiscent of the BigEndian / LittleEndian war that was raging in the 1970s and 1980s, most significantly between IBM and DEC / Intel / Xerox, back when IEEE 802 (Local Area Networks) was formed in 1980. For background on that war, I suggest reading Danny Cohen's On Holy Wars and a Plea for Peace [mono-spaced non-paywalled version].

That war gave us network protocols that are BigEndian, except for embedded MAC addresses that are LittleEndian; a situation that largely persists today 40+ years later. (Note, for architectural reasons related in part to carry propagation in multi-precision computation, modern computer architectures have pretty much settled on LittleEndian.)

The problem of naming things has a similar tortuous path; each language, both human and computer, has its own focus, which induces naming biases. As a language ages, the rationale for those biases may shift from one of current concerns to one of tradition and history, but the biases remain.

Personally, I feel that Graydon Hoare and his co-contributors to the genesis of Rust did us all a favor by establishing a fairly-consistent naming style, which for me facilitates reading other people's Rust code no matter their language. The only real downside of which I'm aware is that the upper-case/lower-case distinctions don't project well into those non-Roman languages that do not have multiple letter cases. Thus a modified Rust naming convention is needed for names in those languages.

My advice: Get over it and use Clippy and Rust's naming styles unless there's a non-ego-based reason not to do so. In doing so you help others read your code, making it more likely that your contributions will help the Rust ecosystem.

droundy · October 17, 2021, 3:06pm

There is still the issue that they don't behave the same if you take a reference. Not a common issue, perhaps, but it can definitely bite you if you think you have a variable and you don't.

steffahn · October 17, 2021, 3:19pm

I know the original post doesn’t even question the upper-case vs lower-case distinction in Rust, but since the point of lower-case snake_case vs upper-case CamelCase supposedly distinguishing between types and values came up, I’d like to quickly throw in my 2 cents, coming from Haskell.

Notably, regarding enum variants, which have been mentioned above, too: not only enum variants, but also struct names can be used as

constants, if it’s a unit-struct or a unit variant
functions, if it’s a tuple-style struct or a tuple-style variant

and these things of course are not part of the “type world”, so the ‘types vs values’ ≙ ‘CamelCase vs snake_case’ claim is not really correct.
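The two uses listed above can be sketched as follows; `Meters`, `Marker`, `wrap_all` and `marker` are hypothetical names chosen for the example.

```rust
#[derive(Debug, PartialEq)]
pub struct Meters(pub f64); // tuple-style struct

#[derive(Debug, PartialEq)]
pub struct Marker; // unit struct

// `Meters` and the enum variant `Some` are passed as plain function values.
pub fn wrap_all(xs: Vec<f64>) -> Vec<Option<Meters>> {
    xs.into_iter().map(Meters).map(Some).collect()
}

// The name of a unit struct is also a value of that type.
pub fn marker() -> Marker {
    Marker
}
```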

In these roles, they do differ from the typical casing style for either constants or functions. But arguably more importantly than their constant-like or function-like usage – and by the way, even in that usage, people may appreciate the additional hint that you have an enum-variant constructor or a struct constructor there, not an arbitrarily complex constant or an arbitrarily complex function – anyways… other than this usage, these “constructors” can also be used in pattern matching. And in pattern matching in particular, the case distinctions help tremendously. Following the actual language specification, a pattern foo could be

matching against a constant “foo”, or
matching against a unit-(struct/enum-variant) constructor “foo”, or
introducing a fresh variable foo

depending on whether any constant or unit constructor named foo is in scope or not. Naming conventions coming to the rescue, you should never actually have any constant or unit constructor named “foo”, since that’s lower-case, so the potential for huge confusion is eliminated.
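A minimal sketch of how the convention keeps patterns readable, with the hypothetical constant `LIMIT` and function `classify`: the upper-case name can only be an existing constant, and the lower-case name can only be a new binding.

```rust
const LIMIT: u32 = 10;

// `LIMIT` is upper-case, so it reads as (and is) a constant pattern;
// `other` is lower-case, so it reads as (and is) a fresh binding.
pub fn classify(n: u32) -> &'static str {
    match n {
        LIMIT => "at the limit",
        other if other > LIMIT => "over the limit",
        _ => "under the limit",
    }
}
```

If the constant were named `limit` instead, the first arm would look exactly like a catch-all binding, and a reader would have to go and check what is in scope.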

Now for the Haskell context, in Haskell a similar convention of lower-case functions / variable vs upper-case types and value constructors is present, but there it’s actually enforced; with the effect that the rules for interpreting patterns can be simplified: if it’s lower-case it introduces a variable, if it’s upper-case it’s referencing some existing constructor. (Constants, in particular constants in patterns, are not a thing in Haskell.)

FYI, the enforced bit is only whether the first character of the name is upper-case or lower-case; furthermore, the convention is to use ‘camelCase’ for lowercase as well as ‘CamelCase’ for upper-case identifiers; no ‘snake_case’ used in Haskell conventionally.

On a related note, Haskell also uses an enforced case distinction for types: All types have to start upper-case, while generic type variables have to start lower-case. This has the nice benefit that you don’t have to explicitly introduce your generics in Haskell; a function like fn wrap_in_some<T>(x: T) -> Option<T> {…} could be just written fn wrap_in_some(x: t) -> Option<t> {…} if Rust were to follow that convention. (The explicit <T> listing becomes optional/redundant.) Note that there’s an RFC to allow the same kind of thing in Rust at least / only for lifetime arguments; which is only possible because lifetime arguments don’t look like anything else, so that an identifier like “'a” can never be referencing something that’s already defined elsewhere, at least nothing already defined on the top level. (I haven’t checked the RFC to see how methods in impl blocks are treated; both impl blocks and functions contained in them can introduce lifetime arguments.)

zeroexcuses · October 17, 2021, 7:11pm

If sticking with convention and growing existing ecosystems is the highest calling of the programmer's life, then no new languages would ever get created.
