My experiences with learning the basics of Rust's module system

These are notes (really an article) about my experience with learning the basics of the Rust module system. It might be useful to the people who are involved with the Rust documentation, or maybe to other learners if they have run into similar problems. (Or maybe it's too long.) Writing this helped me build a model of how this part of Rust works if nothing else. It is a bit stream-of-consciousness, in the sense that I describe the back-and-forth thought-process of trying to learn about this thing. This also involves a pretty haphazard approach to finding and using learning resources.

My background: As far as languages go, Java and Haskell (in that order w.r.t. experience). I have used other languages, but not enough to warrant mention compared to those two. I am still a relative beginner, i.e., no larger projects under my belt.

Problem

I have two rust files in a directory, main.rs and a.rs. I want to use stuff from a.rs in main.rs.

// main.rs
fn main() {
}

// a.rs
pub fn f() {
}

A self-inflicted limitation was that I wanted to learn the bare minimum in order to fulfill this goal. Not in the sense that I just wanted a ready-made, one-off solution. But I didn't want to learn about something like building and deploying a Rust library to Cargo in order to use stuff from different files if I didn't have to. Side note: I wonder how common this prioritizing/laziness is among Rust beginners?

Learning about the Rust module system

I already knew that Rust had modules (mod) and imports (use). So the first bet was to blindly - based on previous experiences with unrelated languages - try some variation of use a; in main.rs and (presumably) declaring the public module of a with mod a; or pub mod a;. rustc apparently didn't understand any of that when I just ran rustc on main.rs.

Start of the Rust Book

The first stab at actually learning how this works was the initial part of the Rust book. Naturally not 2.2 Hello, World!, though, since that only involves one file. The next part is 2.3 Hello, Cargo!, which is apparently a build system. The opening paragraph says:

Cargo is a tool that Rustaceans use to help manage their Rust projects. Cargo is currently in a pre-1.0 state, and so it is still a work in progress. However, it is already good enough to use for many Rust projects, and so it is assumed that Rust projects will use Cargo from the beginning.

Use it from the beginning. I assume that means public/shared Rust projects though, as opposed to using it from the beginning of any Rust project for the library author's sake since it is so convenient. My projects are in any case not going to be published to www.crates.io. Using such a tool seems a bit too much for my needs, so I don't want to go through all the things that I associate with such tools: configuration files, manifests, where to publish code, how to get third-party code, and so on. Those might be wrong/misinformed associations, but that was where I was coming from nonetheless. I have skimmed this part a few times before and I didn't want to learn about things like TOML and what have you. It turned out later that "Hello, Cargo!" indeed just used the "Hello, world!" example but in the context of a Cargo project. So this part didn't really introduce importing one's own code.

Next I asked man rustc for help. It turns out that I could first tell rustc to create a library out of a.rs, and then compile main.rs by telling rustc to search for libraries in the current directory:

$ rustc --crate-type lib a.rs
$ rustc -L . main.rs

I also need to add two lines to main.rs:

extern crate a;
use a::*;

fn main() {
    f();
}

That wasn't too bad. It was kind of involved for such a simple task, but not hard to understand. It's probably not how you're advised to do it for something so simple as importing a function from another .rs file, but it works for now. But there were some limitations with this approach -- I don't recall which ones in particular -- so I soon wanted to solve the problem in another way.

I later found out how to import module a into main.rs from someone on IRC. Declaring mod a; inside main.rs apparently is conceptually like copying the a.rs file as its own namespace into main.rs. Since it is its own namespace, I also have to use the (public) things that I want from the a module, after having imported it by the declaration mod a;. This information was pointed out on IRC by a URL to part 10.5 of Rust By Example. Now I had solved my original goal, but it seemed a little involved to "copy paste" each module into the file that I needed to use some of it in, and then use it. I wanted to be able to just use the things that I needed in each file in the directory, like how I import std code.
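
A minimal single-file sketch of that "copy a.rs in as its own namespace" picture. The inline mod a { ... } block here only stands in for what mod a; would load from a.rs, and the return value of f is an assumption added so there is something to print:

```rust
// Inline module standing in for a separate a.rs file.
// Writing `mod a;` instead asks rustc to load the module body
// from a.rs (or a/mod.rs) next to this file.
mod a {
    // `pub` makes f reachable from outside module `a`.
    pub fn f() -> &'static str {
        "hello from a"
    }
}

// `mod a` only makes the module exist; `use` brings its names into scope.
use a::f;

fn main() {
    println!("{}", f());
    // The fully qualified path also works, with or without the `use`.
    println!("{}", a::f());
}
```

So the mod declaration and the use declaration do two different jobs: the first attaches the module to the tree, the second shortens the path to one of its items.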

The actual Rust modules chapter

The next thing I looked at was 5.26 Crates and Modules. Well, shouldn't this have been the first place that I looked? I don't quite recall why I didn't. Maybe it was a mix of wanting to see if I could find the minimal information in the introductory parts (chapters?) of the Book, and that this part didn't seem to build up to the information that I needed when I had skimmed it before. The opening paragraph:

When a project starts getting large, it’s considered good software engineering practice to split it up into a bunch of smaller pieces, and then fit them together. It’s also important to have a well-defined interface, so that some of your functionality is private, and some is public. To facilitate these kinds of things, Rust has a module system.

I don't care about good software engineering practice right now. Maybe later when I have an actually useful project, or if I don't happen to give up after barely having learned the basics. But this looks like the place to learn how to import code, so it seemed relevant to what I need to learn.

I had already seen examples of nested modules, and that made sense:

mod english {
    mod greetings {
    }

    mod farewells {
    }
}

mod japanese {
    mod greetings {
    }

    mod farewells {
    }
}

This way you can have english::greetings and japanese::greetings without name conflicts.
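
To make the "no name conflicts" point concrete, here is a small sketch with one function in each greetings module. The hello functions and the pub modifiers are additions for illustration; they are needed here so that main can reach into the nested modules:

```rust
mod english {
    pub mod greetings {
        pub fn hello() -> String {
            "Hello!".to_string()
        }
    }
}

mod japanese {
    pub mod greetings {
        pub fn hello() -> String {
            "こんにちは".to_string()
        }
    }
}

fn main() {
    // Two functions with the same name, told apart by their module paths.
    println!("{}", english::greetings::hello());
    println!("{}", japanese::greetings::hello());
}
```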

For some reason, Rust's way of moving things out of a single file into several files was not that clear. I was used to something like "mod a;" being a declaration meaning "module a is the rest of the (public) stuff in this file". And in the case of Rust, the mod name { ... } syntax seemed like just a way to have several different modules in the same file if that was needed. So you'd use mod a; if a was the only module in the file, since that syntax would be the most convenient; if you needed several modules in the same file, you could use the mod name { ... } syntax to delimit them. But apparently mod a; really means to conceptually move all the relevant (public?) stuff from either a file named a.rs or from a file named a/mod.rs. So mod a; doesn't mean "this file is module a" at all. It means "move the stuff from this totally separate file into this file", which feels very different from my initial guess.

I assumed that this business of putting mod-declarations into separate files was an optional way of structuring the code to present a different external interface (crate) to the outside world. And I was partly right it seems:

[...] Our internal organization doesn’t define our external interface.

But apparently I was wrong about this being optional. Since it said that:

We can instead declare our module like this:

mod english;

If we do that, Rust will expect to find either a english.rs file, or a english/mod.rs file with the contents of our module.

My first instinct was that I could just skip the mod.rs-files altogether and just let Cargo, or rustc, or whatever search for and find the modules greetings and farewells. After all, when given a path like english::greetings::hello(), it should try and find either english/greetings.rs or english/greetings/mod.rs, and I have english/greetings.rs. Of course there was a missing link: the english module. Like the chapter said:

[...] expect to find either a english.rs file, or a english/mod.rs file [...]

And I had just omitted english/mod.rs, and there was no english.rs. So I guess whatever underlying tool would go from the root module (lib.rs?), try and find english.rs or english/mod.rs, find neither, and give up the rest of the search. I assumed that the directory english would serve as a sort-of by-default module which just "re-exported" all the public items of the child-modules contained in the english module.

pub mod

This part introduces pub mod by making every used module pub. Does this mean that every module must be public in order to be used? That can't be true - then what would be the point of non-public modules? All used modules are made public here presumably because they are used by main.rs in the same project (which is apparently a different binary crate, as opposed to the library crate that we've made). Then if we look at the library crate as one big file, like it started out as at the beginning of the chapter, then it makes sense that we would need to make any of our internal modules public in order for any external code to access them; a module consisting of pub mod a { ... } mod b { ... } would be an analogue to pub fn a() { ... } fn b() { ... }. This can be observed by importing a function from lib.rs, instead of from the greetings module:

// lib.rs
mod english;
pub mod japanese;

pub fn g() -> String {
    english::greetings::hello()
}

(Note no pub modifier on mod english;.)

And then use g() in another crate:

// main.rs
extern crate phrases;
use phrases::g;

fn main() {
    g();
}
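
The same observation can be squeezed into one file as a rough sketch. Here the module phrases is only a stand-in for the library crate (an assumption made so the example fits in a single file), and the body of hello is made up:

```rust
// `phrases` plays the role of the library crate.
mod phrases {
    // No `pub`: only code inside `phrases` can name this module.
    mod english {
        pub mod greetings {
            pub fn hello() -> String {
                "Hello!".to_string()
            }
        }
    }

    // Public wrapper: outside code reaches the private module only through g.
    pub fn g() -> String {
        english::greetings::hello()
    }
}

fn main() {
    println!("{}", phrases::g());
    // Uncommenting the next line should fail to compile,
    // because `english` is private to `phrases`:
    // phrases::english::greetings::hello();
}
```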

At this point, my mental model of mod is that mod declarations are used to recursively -- starting from the root module -- bring code into the parent module. So starting at lib.rs (in the case of a library crate), mod declarations in lib.rs are used to search for the relevant modules. In turn, these modules start searching for relevant modules starting from their position in the "module hierarchy". In the end, you have effectively included the whole module hierarchy in the lib module -- equivalently to having them "inlined" in the lib.rs file itself like in the beginning of the 5.26 chapter -- based on mod declarations and pub modifiers throughout the hierarchy. The same goes for the modules beneath lib.rs in the hierarchy. So a mod a; declaration in a file is not a way to declare that this file is a module a, but rather a way to include code from the module a into this file.

use: This seems straightforward enough. The distinction between absolute paths for use and relative paths for non-use was perhaps a bit different, but it seemed simple enough if I remained vaguely aware of it.
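
A small sketch of paths that state their starting point explicitly with self:: and super::. The module and function names are hypothetical, and the absolute-versus-relative rule described above is the one from the edition of the book I was reading; as far as I know, these two prefixes behave the same way regardless:

```rust
mod outer {
    pub fn name() -> &'static str {
        "outer"
    }

    pub mod inner {
        // `super::` starts the path at the parent module (`outer`).
        use super::name;

        pub fn label() -> String {
            format!("{}::inner", name())
        }
    }

    pub fn label_via_relative_path() -> String {
        // A non-`use` path, resolved relative to the current module.
        inner::label()
    }

    pub fn label_via_self() -> String {
        // `self::` spells out "start at the current module".
        self::inner::label()
    }
}

fn main() {
    println!("{}", outer::label_via_relative_path());
    println!("{}", outer::label_via_self());
}
```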

Beyond basic necessities: pub use

I knew what I needed at this point. But the last part of the chapter seemed compelling, since it seems to reveal how Rust's approach might help decouple internal representation and organization from the interface it offers up to the outside world.

pub use self::greetings::hello;

At this point I'm quite confused. pub use? My mental model of use was that it imported things into the current file, and the current file only. Then would pub use a; mean to import a into the current module, and to import it into the modules that import this current module? But this sounds more like exporting than importing. And indeed:

You can also use it inside of your crate to re-export a function inside another module.

So this is re-exporting an item to another module. This doesn't make sense to me if I think of a pub use declaration in a module a being visible in module b by virtue of b importing a's stuff using use a::*;. But it might make more sense if thought of as being caused by including modules by using mod declarations. So if module b has a mod a; declaration, we "inline" module a in b. Naturally things like functions which are public in a then become usable in b if they are imported with use. What about public imports (pub uses)? Like functions, the imports in a don't become automatically visible in b just by virtue of being public. We have to explicitly use them in b. So we need to use a use. This is not that strange if we think of the use in a as being a namespace declaration that of course can be used in a but, due to the pub modifier, also can be imported by other modules. Then we can relate this to public functions in this way: public functions declared in a module are visible and usable in that module, and can also be used from other modules if they use them. "Public imports" (pub use) declared in a module are like namespace declarations that are visible and usable in the module they are declared in, but can also be imported by other modules if they use them. It still seems a bit strange, but I can sort of see how it is similar to how other Rust items work with and without the pub modifier.
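
A minimal sketch of the re-export idea, using the book's module names in a single file. Keeping greetings private is my own choice here, made to show that the caller never needs to know the module exists:

```rust
mod english {
    // Private submodule: code outside `english` cannot name it.
    mod greetings {
        pub fn hello() -> String {
            "Hello!".to_string()
        }
    }

    // Re-export: `hello` becomes a public name of `english` itself.
    pub use self::greetings::hello;
}

fn main() {
    // The caller goes through `english` directly and never mentions `greetings`.
    println!("{}", english::hello());
}
```

Seen this way, pub use is less about importing and more about deciding which paths the outside world gets to use, independent of where the item actually lives.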

use before mod?

The book notes:

Also, note that we pub used before we declared our mods. Rust requires that use declarations go first.

But didn't I do the opposite when I first used a module declaration to include code into another module/file? This compiles:

// a.rs
pub fn f() {
}
// main.rs
mod a;
use a::*;

fn main() {
    f();
}

Even though I first import, and then use what I need. But the book says that use declarations must go first? Swapping the order of the mod a; and use a::*; declarations also works (main.rs compiles).
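
The swapped order can be checked in a single file as well. This is only a sketch with an inline module in place of a.rs, and the return value of f is an assumption added so there is something to observe:

```rust
// `use` appears before the module it refers to.
use a::f;

mod a {
    pub fn f() -> u32 {
        42
    }
}

fn main() {
    println!("{}", f());
}
```

With a current compiler, this builds with the two declarations in either order.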

Conclusion

I didn't like build systems/package managers, thinking of them as road bumps on the way to actually doing what I want since I didn't need or want to use third-party code yet. But Rust's Cargo actually feels pretty simple to get started with. So I can understand why it is brought up so early in the Rust Book (right after rustc hello_world.rs; poor rustc is apparently not seen again). Rust's modules and crates turned out to be harder to understand. It felt strange how it was easier to include and use third-party code from Cargo than it was to use code from different files within my own project.

I think writing this document has helped me understand how these things work. But, again, this has to a degree been a stream of consciousness more than something that tries to present how things really work. The point is in detailing how I got to a certain understanding, with the caveat that this understanding might be flawed. For my own part, I might come back to this when I have forgotten some part of how these things work. And maybe it could even be instructional to other learners, or teachers.


All I have time for for now:

This is a bug in the text from when this did matter. I'll send a PR right now. remove incorrect statement from TRPL: crates and modules by steveklabnik · Pull Request #27343 · rust-lang/rust · GitHub


I know this is some serious thread necromancy, but I wanted to say "hey, thanks for writing this." I didn't before, and reading it is still helpful, a year later.


Complete beginner here: count me in.

I'll chime in, too, because this topic really cost me more time than I wanted it to.
I use Cargo and just wanted to get the topic over with as quickly as I could.

In my case I wish I had found this snippet of code to get me started:

// Cargo.toml
[package]
name = "testlib"
version = "0.1.0"
authors = ["Journeyman <me@domain.com>"]

[dependencies]

// src/lib.rs
pub mod hello_mod {
    pub fn print_hello() {
        println!("Hello, world!!!");
    }
}

// src/main.rs
extern crate testlib;
use testlib::hello_mod;
fn main() {
    hello_mod::print_hello();
}

And after re-reading the Cargo documentation, it made sense:
http://doc.crates.io/manifest.html#configuring-a-target
