Crates.io statistics for hyphen vs. underscore package names

This isn't another post asking whether someone should use hyphens or underscores (or neither) in their package names, or whether one of them subjectively feels more common than the other in people's experience. (Previous thread 0, previous thread 1, previous thread 2)

Instead, I'm wondering whether anyone has done the work to quantitatively measure the prevalences or frequencies of hyphens vs. underscores vs. neither in crates.io's package names.

nematsakis on GitHub said it well:

... I think collecting data on current usage is a key next step. This is a classic bike shed topic that can lead to endless discussions of what individuals prefer, but data from the community will go far to ground those discussions. For example, I just looked at my own project's crate and found 21% of dependencies use lower_snake_case, 16% use kebab-case, and 63% avoid hyphens or underscores. What does this look like across the open-source tree writ large?

Has anyone done the statistics on the crates.io index?

Bonus points if the prevalences are weighted by popularity.
Bonus points if changes in prevalence over time are also checked.
Bonus points if the packages' crate names are also checked.
Bonus points if names without hyphens or underscores are checked for concatenated words.

I can't find any evidence on the web that this has been done, but I'd like to know if anyone else has seen anything quantitative. It would probably be a lot of analytical labor, so if anyone has done it, my hat goes off to them.

  • 76142 with hyphen
  • 31345 with underscore
  • 831 have both

The proportions are the same when weighted by ranking.
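
For anyone who wants to reproduce counts like these, here is a minimal sketch of the classification step. It assumes you already have the package names as strings (for example, from a crates.io database dump or the index); how you load them is up to you. The `Separator` enum and the `classify` and `tally` functions are hypothetical names for illustration, and the buckets here are mutually exclusive, which may or may not match how the numbers above were counted.

```rust
/// Which separators a package name contains (hypothetical helper type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Separator {
    Hyphen,
    Underscore,
    Both,
    Neither,
}

/// Classify one package name by the separators it contains.
fn classify(name: &str) -> Separator {
    match (name.contains('-'), name.contains('_')) {
        (true, true) => Separator::Both,
        (true, false) => Separator::Hyphen,
        (false, true) => Separator::Underscore,
        (false, false) => Separator::Neither,
    }
}

/// Tally names into `[hyphen, underscore, both, neither]`.
/// Assumption: each name lands in exactly one bucket.
fn tally<'a>(names: impl IntoIterator<Item = &'a str>) -> [usize; 4] {
    let mut counts = [0usize; 4];
    for name in names {
        let slot = match classify(name) {
            Separator::Hyphen => 0,
            Separator::Underscore => 1,
            Separator::Both => 2,
            Separator::Neither => 3,
        };
        counts[slot] += 1;
    }
    counts
}
```

Calling `tally` on an iterator of names gives the four counts in one pass; weighting by popularity would mean adding a per-crate weight instead of `1`.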


It might be interesting to see the counts with and without the -sys crates and cargo- crates.

Without cargo[_-] and [_-]sys it's 71672 hyphens vs 30970 underscores.
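
A rough sketch of that kind of exclusion, in case someone wants to try it on their own list of names. It assumes `cargo[_-]` means a name prefix and `[_-]sys` means a name suffix, which is my reading of the patterns and not necessarily how the figures above were filtered. The function names are hypothetical, and a name containing both separators is counted in both tallies here.

```rust
/// True if the name looks like a cargo subcommand (`cargo-*`, `cargo_*`)
/// or a native bindings crate (`*-sys`, `*_sys`).
/// Assumption: the patterns are anchored at the start and end of the name.
fn is_cargo_or_sys(name: &str) -> bool {
    let cargo_prefix = name.starts_with("cargo-") || name.starts_with("cargo_");
    let sys_suffix = name.ends_with("-sys") || name.ends_with("_sys");
    cargo_prefix || sys_suffix
}

/// Count `(names with a hyphen, names with an underscore)` after dropping
/// the conventional `cargo` and `sys` names.
fn hyphen_vs_underscore<'a>(names: impl IntoIterator<Item = &'a str>) -> (usize, usize) {
    let mut hyphens = 0;
    let mut underscores = 0;
    for name in names {
        if is_cargo_or_sys(name) {
            continue; // skip names whose separator is dictated by convention
        }
        if name.contains('-') {
            hyphens += 1;
        }
        if name.contains('_') {
            underscores += 1;
        }
    }
    (hyphens, underscores)
}
```

The point of filtering is that these names follow an established convention, so they say less about what authors choose when the choice is free.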

If I exclude low-ranking crates, and count distinct number of owners, it's 7523 users publishing underscores, and 14249 users publishing hyphens.


How many users use both? I'm pretty sure I have done both over time, for example.
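
One way to get at that, sketched under the assumption that the data is available as `(owner, package name)` pairs: collect the owners of hyphenated names and the owners of underscored names into two sets, then intersect them. The `owners_by_style` function is a hypothetical name, and this sketch does not do any filtering of low-ranking crates.

```rust
use std::collections::HashSet;

/// Returns `(owners with a hyphenated name, owners with an underscored name,
/// owners with at least one of each)`.
/// Assumption: `pairs` holds one `(owner, package_name)` entry per ownership.
fn owners_by_style(pairs: &[(&str, &str)]) -> (usize, usize, usize) {
    let mut hyphen_owners: HashSet<&str> = HashSet::new();
    let mut underscore_owners: HashSet<&str> = HashSet::new();

    for &(owner, name) in pairs {
        if name.contains('-') {
            hyphen_owners.insert(owner);
        }
        if name.contains('_') {
            underscore_owners.insert(owner);
        }
    }

    // Owners present in both sets have published both styles.
    let both = hyphen_owners.intersection(&underscore_owners).count();
    (hyphen_owners.len(), underscore_owners.len(), both)
}
```

Using sets means an owner with many hyphenated crates still counts once, which matches the "distinct number of owners" idea.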