Naming convention suggests name that is read incorrectly

I was looking at the Rust API naming guidelines, and noticed this sentence:

In UpperCamelCase , acronyms and contractions of compound words count as one word: use Uuid rather than UUID

Specifically, the UUID suggestion seems incorrect. In almost every other context I've seen, it's written as UUID (the remainder being uuid for snake_case types and fields). In particular, in the Java Runtime library (which does use the CamelCase convention for class names, similar to Rust), the type is java.util.UUID. Additionally, it is cased as UUID in [RFC 4122], the title of ISO 9834-8, and other relevant standards that define the format, as well as in documents describing it (https://en.wikipedia.org/wiki/Universally_unique_identifier).

Further, the spelling Uuid reads differently from UUID, at least to me. At an immediate glance, UUID reads as a Universally Unique Identifier, and Uuid reads as some other form of Unique Identifier (without actually thinking about the name).
I'm wondering if anyone else is confused in the same way, and, if so, should the naming guidelines explicitly exempt this acronym from the rule, given the relative frequency of the former casing (and the fact that the official casing seems to point to UUID)?

1 Like

It's not just “UUID” that is affected; almost all acronyms/initialisms suffer the same way. For example, the proper name is “TCP” but the Rust standard library spells it Tcp. The proper name is “XML” but Rust crates use Xml.

I believe this is a deliberate choice that sacrifices some fidelity (and aesthetics, in my opinion) for the ability to distinguish globals (const and static items, which use all-caps) from other sorts of identifiers. It also simplifies the algorithm for mapping between lower_snake_case and UpperCamelCase names. It’s a bit quirky but you get used to it.

Anyways, since Rust uses all-lowercase identifiers for functions and local variables, you also need to get used to other incorrect forms like uuid and tcp.
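As a rough sketch of how the three casing classes look side by side (everything here apart from the `std::net` types is a made-up name for illustration):

```rust
use std::net::{Ipv4Addr, SocketAddrV4};

// SCREAMING_SNAKE_CASE: constants and statics.
const DEFAULT_TCP_PORT: u16 = 8080; // hypothetical value
static PROTOCOL_NAME: &str = "TCP"; // string data can keep the proper spelling

// UpperCamelCase: types. The acronym counts as a single word.
struct TcpEndpoint {
    addr: SocketAddrV4,
}

// snake_case: functions and local variables.
fn default_tcp_endpoint() -> TcpEndpoint {
    let tcp_addr = SocketAddrV4::new(Ipv4Addr::LOCALHOST, DEFAULT_TCP_PORT);
    TcpEndpoint { addr: tcp_addr }
}
```

If the type were spelled `TCP`, the casing alone would no longer tell you whether it was a type or a constant.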

(At least we don't use both formats within a single identifier, like some languages. :slight_smile: )

12 Likes

This also seems like an issue, though I'd submit that the meaning is still conveyed. My problem is that Uuid conveys a different meaning from UUID (which means that some code using Uuid where I expect a UUID looks incorrect at a glance). The correctness is not so much of a problem as the confusion. Just given the number of languages I program in (and only in a few of them do I not work with UUIDs), I doubt I'd ever get used to it being written as Uuid :sweat_smile:. In any case, it's more of an annoyance than anything; if I need to, I can always define the type with #[allow(clippy::upper_case_acronyms)] (which is the reason I actually took a look at the style guidelines in the first place).
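For reference, a minimal sketch of that opt-out, assuming a hypothetical newtype over `u128` (the field layout and the `NIL` constant are illustrative choices, not taken from any particular crate):

```rust
// Hypothetical newtype. The attribute silences clippy's
// `upper_case_acronyms` lint for this one item only.
#[allow(clippy::upper_case_acronyms)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct UUID(u128);

impl UUID {
    /// The all-zero ("nil") UUID.
    const NIL: UUID = UUID(0);
}
```

The attribute is scoped to the item it annotates, so the rest of the crate still follows the usual convention.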

It's just different. We get used to it.

There is some logic to it: as noted above, it makes it easy to distinguish consts/statics from structs from method/variable names.

Personally I am very happy it is defined as it is. I have been struggling with acronyms in CamelCase for ages. On the one hand "MyNGPGenerator" looks clunky and is hard to read. On the other hand "MyNgpGenerator" is not true to being the acronym I want. What to do?

Rust says, "don't waste time thinking about it, just do it this way". Phew, I can just do it that way and move on. One less thing to worry about.

Even the likes of Microsoft never got their bearings on this. For example "XMLHttpRequest". Wtf? It's CamelCase, but one acronym is all caps, the other is not.

13 Likes

It’s easier to get used to this if you’re comparing to snake_case. You’re supposed to write Uuid instead of UUID for the same reason why you write uuid instead of u_u_i_d in snake_case. Or in other languages that have lowercase camelCase identifiers, you would use uuid instead of uUID.
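The correspondence can be sketched with a small converter. This is only an illustration of the idea (the function name `to_upper_camel` is made up), not how any tool actually implements it:

```rust
/// Converts a snake_case identifier to UpperCamelCase by capitalizing the
/// first letter of each underscore-separated word.
fn to_upper_camel(snake: &str) -> String {
    snake
        .split('_')
        .filter(|word| !word.is_empty())
        .map(|word| {
            let mut chars = word.chars();
            match chars.next() {
                // Uppercase the first character, keep the rest unchanged.
                Some(first) => first.to_uppercase().chain(chars).collect::<String>(),
                None => String::new(),
            }
        })
        .collect()
}
```

With this rule, `to_upper_camel("uuid")` gives `Uuid`, and the only way to get `UUID` out is to feed in `u_u_i_d`.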

2 Likes

FWIW, Uuid matches the .Net guidelines, where the type is called Guid, not GUID.

Specifically, from Capitalization Conventions - Framework Design Guidelines | Microsoft Docs

The PascalCasing convention, used for all identifiers except parameter names, capitalizes the first character of each word (including acronyms over two letters in length), as shown in the following examples:

PropertyDescriptor HtmlTag

6 Likes

The purpose of capitalizing only the first letter is to reduce ambiguity. Is UUIDHTP a UUI for DHTP, or is it a UUID for HTP? Or perhaps it is a UU for an IDHTP? There's no such confusion for UuiDhtp.

It does not. Uuid must be a UUID, because it would be written UUid or UuId if it were intended to mean something else.
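A small sketch of why that works: if every uppercase letter starts a new word, the word boundaries can be recovered mechanically. The helper name `split_words` is hypothetical:

```rust
/// Splits an UpperCamelCase identifier into words, starting a new word at
/// every uppercase letter.
fn split_words(ident: &str) -> Vec<String> {
    let mut words: Vec<String> = Vec::new();
    for ch in ident.chars() {
        // Open a new word on an uppercase letter (or at the very start).
        if ch.is_uppercase() || words.is_empty() {
            words.push(String::new());
        }
        if let Some(last) = words.last_mut() {
            last.push(ch);
        }
    }
    words
}
```

Under this rule `UuidHtp` splits into `Uuid` and `Htp`, `UuiDhtp` into `Uui` and `Dhtp`, and `UUid` into `U` and `Uid`, while `UUIDHTP` falls apart into seven one-letter words and the intended grouping is lost.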

12 Likes

So now you are even spelling "WTF" as "Wtf", I see what you did there. :stuck_out_tongue:

11 Likes

Well spotted.

It's starting to come naturally :slight_smile:

3 Likes

The ship has sailed of course, but I might have at least considered some extra clause to the definition of UpperCamelCase, allowing for XML_HTTP_Request (using underscores as acronym separator). But then that doesn't help UUID where one would also have to accept ambiguity between structs/traits and constants. On balance it's probably best the way it is.

Ha! Some might recall this long-passed weekend bikeshed that Graydon unceremoniously squashed during the alpha years of Rust. :slight_smile:

12 Likes

I like the naming convention.

If you have something that has more than one consecutive acronym, something like: HttpTcpXml is far, far better than HTTPTCPXML.

7 Likes

I know how we name things is the mother of all bikesheds, but... Mixing all caps with underscores with camel case kinda hurts my eyes. I'd much rather have XmlHttpRequest even if it goes against how we normally do acronyms in English.

Also, the only place I've seen that sort of naming convention is old C codebases so you've got some not-so-nice connotations there.

2 Likes

Hear, hear.

2 Likes

For example "XMLHttpRequest". Wtf?

I read once that this was because the Microsoft naming standard in use said to use all-caps for abbreviations of 3 or fewer letters, but mixed-case for longer abbreviations(!)

I agree that Rust's strategy of "don't require me to make a decision" is a good one.

1 Like

Interesting. Random arbitrary style rule over any kind of meaning/thinking.

More deeply.

Of course it will be an HTTP request. We are in a browser, its only way out at the time was using HTTP. So that is redundant. Which leaves us with:

XMLRequest()

But wait. This thing is more general than transporting only HTTP. So that is silly. Which leaves us with:

Request()

When all this nonsense is finally realised we finally end up with:

Fetch()

This naming business is the hardest, as yet unsolved, problem in Computer Science.

I would actually add that this naming convention leads to names that could be ambiguous (particularly when writing enums based on the names of assembly opcodes that exclusively use 3-letter acronyms).
For example, in my arch-ops library for lc-binutils, the enumerator for the BCC instruction is Bcc for Branch Carry Clear, which can also be read as Bcc for Branch condition code (which is the most annoying behind Uuid). It can also produce acronyms that are also words (which can be fun to read at a glance). Rep(Immediate(0x60)) is a particularly fun one (this is REP #$60 or REset Processor status flags, not repeat #$60). I suppose this is just a problem of the assembly language, uncovered by this convention. Alas, I cannot fix the assembly instructions (without breaking compatibility with every 65816 assembly file ever).
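One way to soften this, sketched here with made-up types rather than the actual arch-ops definitions, is to spell the expansion out in a doc comment on each variant and keep the all-caps mnemonic as string data:

```rust
/// Hypothetical operand type for an 8-bit immediate value.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Immediate(u8);

/// Hypothetical instruction enum; the operand types are assumptions.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Instruction {
    /// `BCC`: Branch if Carry Clear (not "branch on condition code").
    Bcc(i8),
    /// `REP`: REset Processor status flags (not "repeat").
    Rep(Immediate),
}

impl Instruction {
    /// The mnemonic as it appears in assembly source.
    fn mnemonic(self) -> &'static str {
        match self {
            Instruction::Bcc(_) => "BCC",
            Instruction::Rep(_) => "REP",
        }
    }
}
```

The variant names still follow the convention, but the doc comment shows up on hover and in rustdoc, which helps with the at-a-glance misreading.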

Wouldn't that be ReP if it is a type and re_p if a function in the Rust style? Whereas Repeat would be Rep and rep respectively.

The instruction is the acronym REP, so by Rust convention, it's Rep.

Hmm, that's the name of the instruction, but it doesn't really seem to be a proper acronym since it has two letters from the first word.

1 Like