Cargo problems - namespacing


#1

Why is Cargo organised as if all contributors belong to a single organization? Other languages have namespaced packaging, but not Rust? Why?

It leads to name clashes. Let’s say “Some Joe” creates some mediocre implementation of Base64 and publishes it as the base64 crate. Let’s say I want to contribute a better implementation. So what do I name my crate? I am forced to do some name mangling or to name it something completely unrelated like avocado. Of course then the base64 crate gets 15k downloads and avocado gets close to 0, and I’m left wondering why I even bother contributing.

As the number of crates grows to 100 or 1000 times what it is today, what’s the endgame? The namespace saturation will be immense.
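To make the clash concrete, here is a minimal sketch in Rust. It is not how crates.io is implemented; `FlatRegistry` and `NamespacedRegistry` are hypothetical types that only model the two naming rules being compared: one global name per crate versus one name per (owner, crate) pair.

```rust
use std::collections::{HashMap, HashSet};

/// Hypothetical flat registry: one global name per crate, first come first served.
#[derive(Default)]
struct FlatRegistry {
    /// Maps crate name -> owner who claimed it first.
    owners: HashMap<String, String>,
}

impl FlatRegistry {
    /// Registers `name` for `owner`; fails if anyone already holds the name.
    fn register(&mut self, owner: &str, name: &str) -> Result<(), String> {
        if let Some(existing) = self.owners.get(name) {
            return Err(format!("`{}` is already taken by {}", name, existing));
        }
        self.owners.insert(name.to_string(), owner.to_string());
        Ok(())
    }
}

/// Hypothetical namespaced registry: a crate is identified by (owner, name).
#[derive(Default)]
struct NamespacedRegistry {
    crates: HashSet<(String, String)>,
}

impl NamespacedRegistry {
    /// Registers `owner/name`; fails only if that exact pair already exists.
    fn register(&mut self, owner: &str, name: &str) -> Result<(), String> {
        let key = (owner.to_string(), name.to_string());
        if !self.crates.insert(key) {
            return Err(format!("`{}/{}` already exists", owner, name));
        }
        Ok(())
    }
}
```

Under the flat rule, the second person to ask for base64 is refused and has to fall back to something like avocado. Under the namespaced rule, somejoe/base64 and you/base64 can coexist.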


#2

#3

If we had namespacing, so that you’d have somejoe/base64 and you/base64, how would people know which crate was more complete, better maintained, etc?

There are other languages that don’t have namespaces in their central package distribution site, such as Ruby/rubygems.org and Node/npm, and they work out fine.


#4

How do people know which one of hypothetical base64, base64-rs, rust-base64, haiku64 is better maintained?


#5

How do people know which one of hypothetical base64, base64-rs, rust-base64, haiku64 is better maintained?

Exactly, that’s my point. Namespacing doesn’t help with this problem.


#6

…but that isn’t the problem described in the original post.

The point is that the first crate to get named something like base64 gets a significant “mindshare” advantage over every other competitor based on the name alone. If you have somejoe/base64 and you/base64, there’s no immediate reason to prefer one over the other. One doesn’t look “more official” than the other. You have to actually look at the release history, documentation, usage numbers, etc. to get a sense for which might be the better option as opposed to “I think I’ll take the one with a sensible name over the one named after some weird Asian foodstuff or a street in Novgorod.”

As another point, unrelated to the above, the current system also leads to speculative name reservation. Look at retep998: he’s reserved something like four hundred and twenty crate names just so his Win32 bindings don’t end up fragmented. And that’s not even a complete list of all possible Windows system libraries!

Finally, as an aside, I thoroughly reject “they work out fine” as a justification. That something functions does not mean it is perfect and cannot be improved. That is the sort of reasoning that would leave us stuck with C forever: it works out fine, we don’t need a better language, we just need to hire more careful programmers!


#7

I don’t use libraries because of mindshare; I use libraries because they meet my needs.

Aside from libraries that are in the rust-lang organization on github, there is no “official”.

The OP used prior art as justification:

I was pointing out that there is significant prior art that made the other choice as well.


#8

This isn’t a particularly constructive reply and I daresay is somewhat obtuse. The impression I got from reading @carols10cents is that, as the history of other package managers suggests, users will pick the package with the most relevant name 9 times out of 10 before moving on to their next task. Finding a package to solve a problem when you don’t know which package that will be is a chore. Your counter-argument implies, to me, that you value users doing the investigation and picking “the best one” based on an informed decision. This is admirable, but bear in mind that you’re bringing different values and axiomatic assumptions into this discussion. Never mind whether this is realistic or achievable or just an idea you have about how things should be. Don’t assume everyone agrees with you or shares the same basic values.

And for the record: My opinion on this whole thing (namespacing) is hey! would be nice to have, but really what’s proposed here is bikeshedding. If anything, we could be looking at much more sophisticated things like what Joe Armstrong proposed in the past (http://erlang.org/pipermail/erlang-questions/2011-May/058768.html). But that ship has sailed (or, at least, would require an enormous effort that would fall into “breaking changes” land).

EDIT: taking my own advice and toning down my rhetoric.


#9

I did not intend to imply that, actually-- I encourage all users to do due diligence on all their dependencies regardless of what they’re named. My point was that whether we have namespacing or not, users will need to do research, so trying to justify namespaces by saying we shouldn’t have crates named ‘base64’ that aren’t the best base64 crate isn’t convincing to me.


#10

@carols10cents okay, so then would allowing slashes or some other form of nominal namespacing plus metadata serve that goal of finding the best solution to a greater extent? Is this a ux problem on crates.io?


#11

I think projects like Awesome Rust that curate crates and aggregate quality crates for different purposes are the best solution rather than namespacing.


#12

RRQLDR: Names matter, namespacing gives us more, better names, which means better discoverability, and minimises a source for potential non-technical bias in choosing a crate.

So you are always perfectly logical and never make a judgement based on anything other than pure, cold logic? If so, you are a better person than I and, I suspect, the overwhelming majority of humans. I know for a fact I’m influenced by names and branding, even when I’m aware of it and trying not to be influenced. The human mind is a treacherous thing, but I suppose a robot wouldn’t understand that.

(Pre-emptive note: on the off chance calling carols10cents a robot doesn’t make it clear, I’m being facetious. And anyway, some of my favourite people are robots; like CGP Grey.)

I’ve seen people use rustc_serialize over serde purely based on the name: either because the former looks more official (it’s even got “serialize” in the name!) or because the latter doesn’t say what it does and doesn’t even look relevant. Denying that names have power is silly, no matter how inconvenient it may be.

As another aside (why, oh why, can’t text be a directed graph?): the reason I said “a street in Novgorod” was that I was going to use an example of a new XML parser library I saw recently… except its name was so completely arbitrary, I honestly can’t remember what it was called. It might as well have been named with random syllables, or a 16-digit hexadecimal number.

This is the sort of thing I seriously worry about for crates.io: a future of arbitrarily-named and impossible-to-remember crates. Perhaps it would be best to explicitly say that I (and I think it’s not unreasonable to say there are others who agree with me) view the “creative package naming” that seems to frequently happen with non-namespaced package indexes as a bug, not a feature. A name should, as much as possible, say what something is and/or does. Not having namespaces appears to create (or at least contribute to) this problem; having namespaces appears to ameliorate it. Thus, namespaces have merit over no-namespaces.

I interpreted that as justification for asking the question in the first place (the very next word after your quote was “Why?”), not as justification for the proposal. What’s more, the OP went on to provide an actual justifying concern: name collisions. That said, having pointed it out, I can see how it could be read in an accusatory tone, so fair enough. @RokLenarcic: carols10cents is right; the fact that other languages have namespaced packaging is not, in itself, a reason for Rust to have it.

And that is something I believe to be a problem.

My position is that the name should not add to the user’s problems. If they see base64 and denniston, which are they more likely to pick? Ideally, they’d consider both equally, but people are lazy (conservation of effort is a perfectly reasonable thing!), and if someone’s in a hurry, are they even going to realise that denniston is an option? If they don’t check, how will they know that base64 is largely functional but minimal, whilst denniston is newer and more performant?

If they’re both called */base64, then at least it’s clear that a reasonable choice exists. They might pick one arbitrarily over the other, but at least they can’t claim “oh, I didn’t even know there was another one!” As I’ve mentioned above, I’ve seen people react with surprise to the existence of serde. If you’re looking for a serialisation library, why on earth would you even look at something called “serde”?
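As a rough illustration of the `*/base64` point, here is a small Rust sketch. It assumes a hypothetical `owner/name` identifier format, which crates.io does not have; `split_qualified` and `owners_of` are made-up helpers, not part of any real tool. The idea is only that a shared short name lets one lookup surface every candidate.

```rust
/// Splits a hypothetical `owner/name` identifier into its two parts.
/// Returns `None` for anything that is not exactly two non-empty segments.
fn split_qualified(id: &str) -> Option<(&str, &str)> {
    let (owner, name) = id.split_once('/')?;
    if owner.is_empty() || name.is_empty() || name.contains('/') {
        return None;
    }
    Some((owner, name))
}

/// Returns every owner that publishes a crate with the given short name,
/// in the order the identifiers appear in `index`.
fn owners_of<'a>(index: &[&'a str], short_name: &str) -> Vec<&'a str> {
    index
        .iter()
        .copied()
        .filter_map(|id| split_qualified(id))
        .filter(|(_, name)| *name == short_name)
        .map(|(owner, _)| owner)
        .collect()
}
```

With an index containing somejoe/base64 and you/base64, asking for base64 returns both owners, so neither implementation is hidden behind an unrelated name.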

(That’s actually a bit unfair on “serde”; as far as I know, the name is a contraction of “serialisation, deserialisation”; kinda like “codec”. But again, cute names are dangerous. :slight_smile: )

If I assumed that, why on earth would I be here trying to convince people that my perspective has more merit than the alternative? I’d just sit back, safe in the knowledge that all discussions that don’t agree with my views are just sarcastic joking about how silly anyone who theoretically disagrees is (not that anyone would).

No one’s going to die if namespacing isn’t introduced, true enough.

It’s not quite on the level of “what colour should the background of crates.io be?”, though. I mean, it looks like a pool table. It’s not unpleasant, but why? I thought Rust was going for flat and sanitised, not “classy English pub”. Shouldn’t it conjure images of a warehouse, or even a dock?

As someone who likes organising things into neat collections, that post is horrifying.

I think they’re certainly good to have, but not a solution in and of themselves. The “Data Structures” section doesn’t list the various alternative data structures that were spun out of collect-rs. I forget all the names at the moment; they might be in another section, but that’s the one I’d expect them in.

(Also, why is “serde” in “Data Structures”? I expected to find it under “Encoding”. Why isn’t the maintainer’s brain arranged the same way mine is? 'Tis a shocking oversight, I tell you, shocking!)

Namespacing means better names, which means there’s an additional mechanism for users to find relevant crates. More discovery is a good thing (or at least, I’d hope so).

Actually, given the recent creation of stdx, it might be nice to have some sort of more formal aggregation mechanism built into crates.io. But that’s starting to sound like Steam Curators, and I’m not sure we want to go down that particular garden path…

(It stands for “Really Rather Quite Long, Didn’t Read”.)


#13

I totally agree. Take, for example, the Iron repo, which has crates called session, persistent, cookie, router, etc. The names are so vague that you can’t tell they have anything to do with Iron’s middleware layer. This also means that none of the other web frameworks can use those names. At least with conduit they are all prefixed, like conduit-router and conduit-static. If they were namespaced as iron-persistent and iron-router it would add so much context. I feel like it’s a hipster thing to give your libraries cool names, but that doesn’t help someone who’s reading a code base for the first time, as they will need to look up everything on GitHub just to see what it does.


#14

gasp YOU’VE DISCOVERED MY SECRET BEEP BOOP


#15

If we put your other arguments aside for a second, and really do consider that names matter, aesthetically “base64” is a lot more pleasing than “somejoe/base64”

You seem to want to foist less aesthetically pleasing, harder to remember names on everyone, lest someone who claims a good name get an unfair advantage. While that may be a valid point in and of itself, from a purely qualitative perspective, I think you’re arguing for aesthetically uglier, harder to remember names, by removing the ability to have shorter, more aesthetically pleasing ones.

Really I think what you’re after is a sort of “fairness” to the naming scheme, not high-quality names. If the latter is what you’re actually after, I think mandatory namespaces are detrimental.


#16

True in this case, but what is the next base64 library going to be called? While I find aesthetically pleasing names, well, pleasing, I’m much more interested in: a) describing what the crate does and b) how easy the crate name is to remember. Take “serde” for example, an aesthetically pleasing name (at least to me), but a) it doesn’t actually tell me what the library is, and b) I keep forgetting the name of it.

I think a. is nice since the name of a crate will pop up in lots of places (e.g. code, dependency lists, etc) where you don’t necessarily have an easy way to immediately tell what the crate does. I’ll admit that I have an atrocious memory so b. may not be as important to other people as it is to me.

While namespacing doesn’t necessarily mean easier to remember names (at all), at least it’s a lot easier to choose a name that describes what a crate does.


The other question is: what happens to unmaintained crates? If crates are namespaced then it’s clear to everyone who is responsible for the maintenance, and that by using a crate you are relying on its authors to keep it maintained.
What I really would like to avoid is having lots of the commonly named crates, e.g. base64, http or whatever, be unmaintained crates that the Rust community has long since decided shouldn’t be used. How do new developers know not to use them? If they still look like they work, and have a decent enough API, then people will almost certainly keep using them blindly. At the very least the community should be able to flag these as “unmaintained” and for that to show on the crates.io page.

I believe Perl(?) handles this by allowing unmaintained packages to be taken over by the community.

The flip side to that question is: would developers be put off uploading crates (into a global namespace) if they worry they might not have time to maintain them properly? Is this a good or bad thing? Does having an existing imperfect crate, e.g. base64, deter people from writing a better one, since it will never be the base64 crate?

All this makes me wonder if we should have both namespaced and global package names, where:

  • Namespaced packages are personal packages that people are free to use and it’d be down to the authors to maintain (or not).

  • Global packages are maintained by the authors but may be taken over by members of the community if the package goes unmaintained for a long time. (How do we choose who gets the global names? First come first served? A community decision?)

The downside of this approach is that it creates a two-tiered system; developers may assume personal packages are inferior/less maintained and global packages are somehow endorsed and maintained by the community and thus perfect.
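A minimal Rust sketch of that two-tier idea, to show what the rules might look like as data. Everything here is hypothetical: the `PackageName` and `Package` types, and especially the `UNMAINTAINED_AFTER_DAYS` threshold, which is an arbitrary stand-in for “a long time”. It also leaves open the question of who decides on a takeover.

```rust
/// The two kinds of package name in the hybrid proposal.
#[derive(Debug, Clone, PartialEq)]
enum PackageName {
    /// Personal package such as `somejoe/base64`; only its owner maintains it.
    Namespaced { owner: String, name: String },
    /// Global package such as `base64`; may change hands if abandoned.
    Global { name: String },
}

struct Package {
    name: PackageName,
    maintainer: String,
    days_since_last_release: u32,
}

/// Assumed threshold (about two years); the proposal does not specify one.
const UNMAINTAINED_AFTER_DAYS: u32 = 730;

impl Package {
    /// A package counts as unmaintained once it has gone too long without a release.
    fn is_unmaintained(&self) -> bool {
        self.days_since_last_release > UNMAINTAINED_AFTER_DAYS
    }

    /// Only global packages that have gone unmaintained are open to takeover.
    fn can_be_taken_over(&self) -> bool {
        matches!(self.name, PackageName::Global { .. }) && self.is_unmaintained()
    }

    /// Hands the package to a new maintainer if the rules allow it.
    fn take_over(&mut self, new_maintainer: &str) -> Result<(), String> {
        if !self.can_be_taken_over() {
            return Err(format!("{:?} is not open to takeover", self.name));
        }
        self.maintainer = new_maintainer.to_string();
        Ok(())
    }
}
```

The same `is_unmaintained` check could drive an “unmaintained” flag on a package page, regardless of whether takeovers are allowed.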

I’d also be interested to hear how much this has been a problem for Python. I’m not familiar enough with Python packaging to really have an informed opinion.


(Definitions of “community” may vary)


#17

We had namespaced packages in the Ruby community. It was an utter catastrophe. I am very, very glad that Cargo did not make the same mistake. Coming from the Ruby community, the people who wrote Cargo were well aware of the failure of namespaced packages, which is, I presume, why they decided against namespacing. A flat namespace has been proven to work well in dozens of different package managers. With lots of examples of success on one side and a clear example of failure on the other, why choose the failing side?

Here’s what happened in the Ruby community: Ruby’s package manager, Rubygems, originally had one source for distributing packages: Rubyforge. Rubyforge was a site kind of like Sourceforge that had SVN repositories, issue trackers, project pages and so on. When you wanted to release a new project, you went to Rubyforge and filled in a five page petition and asked for a project name. A couple of days later you were usually granted that name by the administrators and you could publish a gem. It was a slow and annoying process.

Along came GitHub, which was in the beginning very heavily tied to the Ruby community (being written in Rails and all). GitHub decided that publishing gems was too cumbersome, and so they added a feature to GitHub which meant that every project that had a valid gemspec would automatically be published as a Rubygem via their servers. The gems were namespaced with the GitHub username, just as GitHub’s repos are. So jnicklas/rails would become jnicklas-rails. It was super convenient and everyone loved it.

It was only after a while that the problems became apparent. Whenever someone found a bug or a problem with a library they were using, it was super simple to just fork the repo, push the change and point to jnicklas-rails instead of rails. This made people lazy. Whenever they had a problem they would just fork, instead of trying to push their changes upstream.

Libraries became incredibly fragmented and whenever you’d find a new library you wanted to use, you would have to go into GitHub’s network graph and try to gather all patches you needed, construct your own hacked together version of the library, push it and run your own fork. It was gross.

It also made it much harder to remember which version was the canonical one. It caused real problems when maintainership changed hands and users didn’t know that they now had to switch to a new version.

The whole thing was a disaster.

Thankfully GitHub’s gem host is no more. It was shut down years ago, replaced by Gemcutter, which became rubygems.org, which works pretty much exactly like crates.io does, and which has been a great success. No one in the Ruby community laments that GitHub’s gem host and its namespacing is gone. No one. Everyone is happy with rubygems.org. I haven’t once heard any complaint about it. If a name is taken, choose another one. It’s that simple. Two projects with the same name under different namespaces are confusing anyway: is it a different project or just a fork? It’s not obvious.


#18

What about Java? That doesn’t seem to have been a massive failure.

I don’t think anyone is saying that having a flat namespace will be a catastrophe and won’t succeed, but that doesn’t mean it’s the perfect solution. While what happened to Ruby sounds really interesting (and thanks for bringing it up), I don’t think it proves that namespaces are doomed to failure.


I do wonder whether this had more to do with it being so integrated with GitHub than with namespacing as such. It sounds to me more like moving away from GitHub, and thus making it not quite as easy to fork and use, is what helped here.
Like I mentioned before, I wonder to what extent having a flat namespace discourages people from forking/writing a new version of a library, and whether this is a good or bad thing. The Ruby story shows that making it too easy is probably a bad idea, but making it too hard will probably also be a bad idea. Either way, I don’t think it’s clear that the Ruby problems can all be pinned on namespaces.

While it’s always nice when a package changes owner once the original author can no longer maintain it, this doesn’t always happen. How to deal with unmaintained packages is a problem whether you have namespaces or not.
(There are also a variety of solutions for changing owners, e.g. symlinks, GitHub-style ‘organization’ namespaces, etc.)


True, though the way flat namespace package managers seem to deal with this is that everyone chooses equally undescriptive and unobvious names.

Maybe Cargo should allow you to upload patched versions of others’ crates with a special namespaced patch version? I’m not sure how much I like that idea.


#19

I disagree with the whole “Ruby” post. Namespaced packaging is working fine for Java and Scala. The Ruby argument is a bad argument. The post says that the Ruby namespaced packaging failed because it was namespaced packaging, but that’s absolutely false. It failed for the other reasons the post itself details. It failed because the solution completely removed the process of publishing a version. The massive screwup is that they made any fork a new dependency automatically. The massive screwup is that every commit is a new version. In Java I work on a GitHub repo, then I publish to a central repository. If someone forks my library, they generally don’t publish the fork of the library to the central repo. They either create a pull request to patch the main repo, or they publish to their local Maven repository, or they use the fork source or packaged jar directly.

It’s the typical Ruby “make-work” and “if it’s convenient it’s good” spirit ruining things again.


#20

Please try to be more charitable to other language communities in the future.