The Rust module system is too confusing


#1

@withoutboats wrote: https://withoutboats.github.io/blog/rust/2017/01/04/the-rust-module-system-is-too-confusing.html


#2

DISCLAIMER: I had a C++ background, where “modules” are textual includes, and the paths are always explicit. This background definitely had an impact on how I started learning modules (e.g. anything implicit at the filesystem level was extremely surprising to me at the beginning and completely unexpected: “how does it know that it has to import the mod.rs file inside the foo directory after refactoring foo.rs automatically?”).

I basically agree with the main point behind the blog post:

The Rust module system is undesirably difficult to understand

but I completely disagree on the observations about why.

The first observation is:

There is simply a lot of syntax

The syntax is mod, extern crate, use/pub use, and use patterns like *, super, self … It is some syntax, but it is not a lot of syntax.

Rust requires users to build an explicit module graph

This is a really really good thing, but if I had to name the single biggest difficulty I had while learning modules, that would be that the module graph wasn’t explicit enough. As a newcomer, depending on where you add a file (and the name of the file, and whether it is in a new sub-directory or not, …), you are adding new modules/edges/branches to the graph without writing any keywords. This felt like magic! And understanding how this worked was the toughest part of understanding the module system. Then paths sometimes started from the root of the module, sometimes from the local module (using super, self, …). This was also baffling at the beginning. If you combine the implicit graph structure in the file hierarchy, with using the wrong paths, and unhelpful compiler errors, learning modules was a frustrating experience.

It would have helped a lot if the only way to use paths would be from the root module, and if cargo would conspire with rustc on use errors to print to the screen the whole module hierarchy from the root (including extern crates).

The rest of the blog post assumes that the solution to making modules more beginner friendly is to make them more implicit. I don’t know if that would have made my learning experience better or worse (one can only learn something for the first time once), but in a nutshell my main problems were:

  • the module system is too implicit,
  • the implicitness is in the file hierarchy,
  • two ways of naming paths and when/where to use each were confusing at first (from the root and local paths, thinking in terms of super, self, … was extra complexity),
  • the book forced me to extract the relevant information from the text.
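
To make the third bullet concrete, here is a minimal sketch of the two path styles side by side. The module and function names (`shapes`, `square`, `unit`, `area`, `describe`) are made up for illustration, and the `crate::` prefix is the spelling later Rust releases use for a path from the crate root, so treat it as an assumption relative to the compiler discussed in this thread.

```rust
// Hypothetical module tree, all names invented for illustration:
//
//   crate root
//   └── shapes
//       ├── unit()
//       ├── describe()
//       └── square
//           └── area()

mod shapes {
    // Lives directly in `shapes`.
    pub fn unit() -> u32 {
        1
    }

    pub mod square {
        pub fn area(side: u32) -> u32 {
            // Relative path: `super` is the parent module, `shapes`.
            let scale = super::unit();
            // Path from the crate root: reaches the same function.
            let same_scale = crate::shapes::unit();
            debug_assert_eq!(scale, same_scale);
            side * side * scale
        }
    }

    pub fn describe() -> u32 {
        // Relative path: `self` is the current module, `shapes`.
        self::square::area(2)
    }
}

fn main() {
    // From the crate root, these spellings all end up in `square::area`.
    println!("{}", shapes::square::area(2));
    println!("{}", crate::shapes::square::area(2));
    println!("{}", shapes::describe());
}
```

The point of the sketch is only that `super::unit()` and `crate::shapes::unit()` name the same item; which spelling reads better depends on how far the caller is from the callee in the module tree.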

The things proposed by the @withoutboats blog post (e.g. implicit mod) don’t necessarily make any of these issues worse though, and do reduce unnecessary syntactic noise (so I kind of like some of the things being proposed), but I don’t think they would make learning modules either easier or harder. The implicitness in the file hierarchy remains, and the different types of path remain, and all these features interact in multiple ways and the user must know how all of these interactions work in order to get started.

@steveklabnik I think the main issue I had while learning modules in the old book (haven’t checked the new version), is that:

  • the examples must be long and complex by necessity because modules solve a complex problem, the different features interact with each other, … I don’t think much can be done about this, but
  • I had no mental model about how modules (as a whole) were supposed to work in Rust before going into the examples, so I had to, on each example, try to extract from the text and the features that were introduced a mental model of how modules were supposed to work, and then have it a bit debunked in the next example where I learned something new that didn’t fit my previous mental model
  • to learn all the features I had to extract them from the text, and then re-read the whole section again with some context to try to understand modules. Basically building the whole picture of all module features and how they interact with each other required a lot of effort. This is the hardest part.

I really think that a small “bear with me” section at the beginning of the module part of the book with the whole modules picture “in a nutshell” would have helped me significantly avoid the hardest part of the process. Something like:

  • a 1 sentence explanation of each feature/keyword (in a table/row format)
  • a 1 sentence explanation of how the modules are structured (a graph, explicitly using mod, implicitly using a file hierarchy with folders, mod.rs),
  • a 1 sentence explanation of the two types of paths (from the root, from the module, with super and self) and when they are used.

These bare-minimum out-of-context explanations would have given me the whole picture about modules at the beginning and how it is supposed to work before going into the examples (which must be long and complex by necessity). The examples can then be used to solidify that model, ideally requiring me to read the whole thing only once, because everything is repeated twice (once at the beginning briefly, and then in a longer explanation in one of the examples). Maybe a bullet list of links “If you come from C/C++ Rust modules differ in…”, “If you come from Python, Ruby, …”, “…”, … with brief explanations for programmers coming from mainstream languages would be very helpful as well.
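
As a rough illustration of what such an “in a nutshell” overview could show, the sketch below puts a small module graph in a single file and notes in comments where each module body would live if it were split into files. The names `network`, `client`, and `connect` are placeholders chosen for this example, not taken from the book.

```rust
// `mod network { ... }` declares a module inline. Writing `mod network;`
// instead tells the compiler to load the body from `network.rs` or
// `network/mod.rs` next to the declaring file.
mod network {
    // A nested module. Split into files, this body would typically live in
    // `network/client.rs`, declared by `pub mod client;` in `network/mod.rs`.
    pub mod client {
        // `pub` makes the item visible outside the module that defines it.
        pub fn connect() -> &'static str {
            "connected"
        }
    }

    // `pub use` re-exports a name so callers can write `network::connect`.
    pub use self::client::connect;
}

// `use` brings a name into this module's scope under a shorter path.
use network::client;

fn main() {
    println!("{}", client::connect());
    println!("{}", network::connect());
}
```

Both calls in `main` reach the same function; the re-export only adds a second, shorter route to it.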


#3

I mentioned this in the twitter thread, but I too feel that this is a problem to be solved with documentation and diagnostics, not features. The problem is that the union of “confusing” and “easy to understand” features doing the same thing might end up being more confusing than either individual feature. Part of the issues with Rust’s module system are that there are many ways to do a thing, and adding more ways will confuse it further.

I think the main confusion is with how paths work. This can be taught. The file hierarchy stuff is confusing too, but it similarly could probably be taught better.


#4

I like the idea of making extern crate implicit when the crate is a dependency in Cargo.toml. Having to define the dependency in two places feels like unnecessary boilerplate.


#5

Coming from the point of view that the filesystem naming and structure is potentially redundant with an explicit module structure, I would choose to make implicitness explicit. Namely, leave everything as it is right now. Don’t deprecate the current mechanisms. Instead add one new piece of syntax to optionally add modules implicitly based on the file structure. This would make it easy to use a more java-esque approach, but not lose the flexibility if you need to do something non-standard.

One thing that I’ve found annoying is fast prototyping, where I have a bunch of files/modules that I want to incrementally add to my project, and having to create the files and also update a lib/mod file seems unnecessarily burdensome. The above approach would make this experience painless.

I have mixed feelings about doing away with extern crate directives, as a single Cargo.toml can generate a library as well as multiple binaries that might not all rely on the full set of dependencies specified in the toml. Though maybe it would be useful to be able to express that detail in the toml as well, if needed, so that extern crate becomes truly unnecessary.


#6

In my view, there can be important gains in improving both the documentation and the compiler diagnostics.

Let me provide a concrete example, with me pretending to be a newbie and “thinking aloud”:

use std::path::Path;

fn main() {
    let p = Path::new("");
}

Super boring code, but let’s imagine that I copied it from a tutorial (that is, I wrote it without deeply understanding everything). I see a use std::path::Path at the top, and then I call this Path thing later, this looks reasonable. Since I come from another programming language, I’ve started to synthesize a mental model about how to use use.

Now I try to write a test, using the test module boilerplate that cargo auto generates:

use std::path::Path;

fn main() {
    let p = Path::new("");
}


#[cfg(test)]
mod tests {
    #[test]
    fn it_works() {
        let p = Path::new("");
    }
}

My it_works() function seems to visually have the same form as my main() function, yet this code will not compile:

error[E0433]: failed to resolve. Use of undeclared type or module `Path`
  --> a.rs:14:17
   |
14 |         let p = Path::new("");
   |                 ^^^^^^^^^ Use of undeclared type or module `Path`
 

People familiar with this issue will immediately see the problem, but the compiler generated error message is not helpful in this case. I believe it can be improved, perhaps by noticing the use std::path::Path; at the top of the file and suggesting that the test module on line xyz needs to have the same use statement.
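
For reference, the failing test compiles once the name is brought into the `tests` module's own scope. A minimal sketch of two common ways to do that follows; the helper `make_path` is invented here so the example has something to assert on.

```rust
use std::path::Path;

// Hypothetical helper, added only so there is something to call and check.
fn make_path() -> &'static Path {
    Path::new("")
}

fn main() {
    let p = make_path();
    println!("{}", p.display());
}

#[cfg(test)]
mod tests {
    // Option 1: repeat the import inside the module that needs it.
    use std::path::Path;
    // Option 2: pull in everything visible in the parent module.
    use super::*;

    #[test]
    fn it_works() {
        let p = Path::new("");
        assert_eq!(p, make_path());
    }
}
```

Either `use` line on its own is enough to make `Path` resolve inside `tests`; both are shown together only for comparison.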

If I open the rust book and scroll to the section titled “Importing Modules with use”, here’s what the first part reads (paraphrased, since I won’t bother to reproduce the whole section in this forum post)

Rust has a use keyword, which allows us to import names into our local scope. The two use lines import each module into the local scope, so we can refer to the functions by a much shorter name.

Well, apart from some confusion about what “local scope” means, it appears that I’ve done just what the docs suggest, but my code still fails. Here is where I believe the book misses an opportunity to explain how each file is part of an implicit module, and use statements that appear to be global (at the top of the file) are in fact local to this implicitly created module.

One of the very first things in the chapter (under the Basic Terminology section) is this sentence:

Each crate has an implicit root module that contains the code for that crate.

But as a newbie, this single sentence is too terse for me to fully grok its implications and its relevance to my particular problem.

Going one step further, I notice how a use can be used to use a thing by a shorter name, so maybe in the interest of experimentation, I try this:

fn main() {
    let p = std::path::Path::new("");
}

Cool! It works! Let’s try to apply it to my test module:

error[E0433]: failed to resolve. Use of undeclared type or module `std`

Now I’m still a really confused newbie.
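
One spelling that sidesteps this error is a leading `::`, which makes the path start from outside the current module rather than being looked up relative to it. The sketch below uses only the standard library; the module name `inner` and the function `empty_path_len` are invented for illustration. In later editions of Rust the unprefixed `std::path::Path` also resolves inside nested modules, so the prefix mainly matters under the path rules discussed in this thread.

```rust
mod inner {
    pub fn empty_path_len() -> usize {
        // The leading `::` stops the lookup from being relative to `inner`.
        let p = ::std::path::Path::new("");
        p.as_os_str().len()
    }
}

fn main() {
    // At the crate root the unprefixed form resolves as well.
    let p = std::path::Path::new("");
    println!("{} {}", p.as_os_str().len(), inner::empty_path_len());
}
```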


I know that withoutboats’ blog post didn’t directly mention this case, but I hope I’ve given a compelling example where some hopefully minor changes could have a proportionally larger positive impact on new users.


#7

I absolutely agree with the article. “‘use’ and local paths start from a different point” resonates the most with me. The fact that

std::path::Path::new("");

and

use std::path::Path; Path::new("");

are sometimes the same thing, and sometimes different things was baffling to me. It bites when you split your first program into modules for the first time, so it contributes to Rust’s learning curve.

I also used to be confused about why I sometimes have to write use foo; and sometimes mod foo;. Still, I’d just prefer that use foo worked for built-in modules, my own modules, and crates automagically. If I say I want foo, I’d expect Rust to go and find the module or the crate that matches it.

The extern crate isn’t that confusing, but it feels like duplication of work when using Cargo.

So my wishes are:

  • Make mod and crate optional (left for backwards compatibility or complex cases). Let use automagically discover crates and modules, so that use foo acts as mod foo; or extern crate foo; when appropriate.
  • Never show "Use of undeclared type or module std". At minimum, the message could be changed to explain the quirk. Maybe it could be made to “just” work by adding use std to prelude, or making module look up fall back to lookup from the root?

#8

To be clear (and to echo the thoughts of some others), I would very much prefer that more effort be expended to improve docs and tools, before trying to expend effort to redesign the module system.

It may well be that all of the confusion could be solved by better docs and tools. We don’t know yet. Let’s try to find out!


#9

Have you read the new docs? What do you think of them?


#10

Sorry Steve, I know you must have linked these new docs 1000 times already, but can you do it again?


#11

@eminence: http://rust-lang.github.io/book/ch07-00-modules.html


#12

Thanks! At first glance, these look great, it covers this case exactly. I’ll give the entire chapter a more in-depth read this weekend.


#13

Just a crazy idea, but I personally would love to be able to reference local / unpublished crates like so:

extern crate "../path/to/crate" as my_crate;

or something more rust-like as:

extern crate crate as my_crate;

Should be simple to parse out normal published crates from quoted crates with alternate locations, local or otherwise (e.g. git:// or http:// could be used as well)...

Is there a way to propose such additions to the language?  Many thanks!

#14

@fungl164 You can express this in Cargo.toml like my_crate = { path = "/path/to/my_crate" }. It’s not something you can write in Rust sources, because rustc doesn’t know how to build other crates, just the current crate it’s working on.


#15

Hi, sorry it took me a while, but thank you @cuviper. I realized you could do this in the Cargo.toml, but after wrestling with no_std, eh_personality compilation problems using Cargo I resorted to using just plain rustc, which I finally was able to make work.

There was certainly a learning experience, which I don’t regret in any way, but it was very confusing at first as to why rustc would compile without complaints and cargo would not compile at all…

That said, I think there is some real argument re: confusion with modules at least from the beginner’s perspective, especially when dealing with local unpublished crates…

It would be quite interesting to see the possibility of crate and module shadowing so you wouldn’t have to explicitly mark each use within a rust source file. Of course, the compiler would warn against unused imports.

Also the quoted vs unquoted extern crate strategy could be used as a bridge to differentiate legacy extern crate declarations vs localized declarations. The model would follow along the lines of ES6 and Python for example where you declare exactly what you pull out and/or re-export from which file. This way modules would not be necessarily stuck on the file structure/convention used in the current incarnation (although it would not preclude the continued use of such an arrangement).

These are all just suggestions from a beginner. Thank you all for your wonderful contributions. Best!