Just another module question

I have read https://doc.rust-lang.org/book/ch07-05-separating-modules-into-different-files.html, but I still don't understand how to create a module that is defined by multiple files in a subdirectory. I feel like if I could just get a simple example to work, I'd be off to the races. Here is what I tried with just one file in a subdirectory. Can you spot where I went wrong?

src/main.rs

mod sub;

fn main() {
    sub::my_a();
}

src/sub.rs

pub mod a;

src/sub/a.rs

pub fn my_a() {
    println!("in my_a");
}

When I enter cargo run the error is "cannot find function my_a in module sub".

Modules are hierarchical and every mod statement creates a new submodule within the current one, so my_a is in the module sub::a, not sub. There are two ways to fix this:

  • You can refer to the my_a function directly in main:
fn main() {
    sub::a::my_a();
}
  • Alternatively, you can leave main as it is and re-export a's public items from src/sub.rs:
pub mod a;
pub use a::my_a;
// -- or --
pub use a::*;
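
For anyone who wants to try both fixes without creating files, here is a minimal single-file sketch. It assumes inline modules as stand-ins for src/sub.rs and src/sub/a.rs, and my_a returns its message instead of printing it so the result is easy to check; neither detail is from the original post.

```rust
// Inline modules stand in for src/sub.rs and src/sub/a.rs (an assumption
// made so the example fits in one file).
mod sub {
    pub mod a {
        // Returns the message instead of printing it, to make it checkable.
        pub fn my_a() -> &'static str {
            "in my_a"
        }
    }

    // Fix 2: re-export so callers can also write sub::my_a().
    pub use a::my_a;
}

fn main() {
    // Fix 1: name the function by its full path.
    println!("{}", sub::a::my_a());
    // Fix 2: go through the re-export in `sub`.
    println!("{}", sub::my_a());
}
```

Both calls reach the same function; the re-export only adds a second path to it.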

I didn't realize that each file inside the module directory actually creates a new module. That was the missing piece.

To clarify, to the best of my knowledge it is not possible to split one module across multiple files; you can only create hierarchical submodules. You can't create a module foo that is defined by the concatenation of foo1.rs and foo2.rs, for example.

You can if you use the include! macro to read in the files instead of a mod declaration, but that's not generally recommended.

Is it fair to say that you can logically do that if you consider the submodules to be just different parts of the module implementation that are never directly used by anything else? For example:

src/main.rs

mod sub;

fn main() {
    sub::my_a();
    sub::my_b();
}

src/sub.rs

mod a;
mod b;
pub use a::*;
pub use b::*;

src/sub/a.rs

pub fn my_a() {
    println!("in my_a");
}

src/sub/b.rs

// QUESTION: Is this the best way for this module to get access to something in the "a" module?
use super::a::my_a;

pub fn my_b() {
    println!("in my_b");
    my_a();
}

Yes, that's a reasonably common pattern, though each submodule is prevented from accessing the private items of its sibling modules: a private item is visible only within the module that defines it and that module's descendants.
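
To make the sibling-visibility point concrete, here is a small single-file sketch. The inline modules, the helper and shared functions, and the use of pub(super) are my own illustration, not something from the code above; pub(super) is one way to widen visibility to the parent module so that siblings can reach an item without making it fully public.

```rust
mod sub {
    mod a {
        // Private: visible only inside `a` and any modules nested in it.
        fn helper() -> u32 {
            1
        }

        // Visible in the parent `sub`, and therefore to siblings such as `b`.
        pub(super) fn shared() -> u32 {
            helper() + 1
        }
    }

    mod b {
        pub fn my_b() -> u32 {
            // super::a::helper() would not compile here: `helper` is private to `a`.
            super::a::shared() + 10
        }
    }

    pub use b::my_b;
}

fn main() {
    println!("{}", sub::my_b());
}
```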

The alternative here is use crate::sub::a::my_a;. As far as I know, neither is preferred over the other.

Also, if the sub module is re-exporting my_a, you can write use super::my_a instead.
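
The three spellings can be compared side by side in one file. This is only a sketch: inline modules replace the separate files, the via_* aliases are hypothetical names added so all three imports can coexist, and the functions return values instead of printing.

```rust
mod sub {
    mod a {
        pub fn my_a() -> &'static str {
            "in my_a"
        }
    }

    mod b {
        // Three paths that all name the same function:
        use super::a::my_a as via_super; // relative to the parent module
        use crate::sub::a::my_a as via_crate; // absolute, from the crate root
        use super::my_a as via_reexport; // through the re-export in `sub`

        pub fn my_b() -> Vec<&'static str> {
            vec![via_super(), via_crate(), via_reexport()]
        }
    }

    pub use a::my_a;
    pub use b::my_b;
}

fn main() {
    println!("{:?}", sub::my_b());
    println!("{}", sub::my_a());
}
```

Note that the crate:: form assumes sub is declared at the crate root, as it is in src/main.rs above.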


It seems like src/sub/b.rs should be able to access things in src/sub/a.rs with the following:

mod a;

Can you explain why that doesn't work?

That declares a new module crate::sub::b::a. Because it has no body, the compiler will look for its source code in these locations:

  • src/sub/b/a.rs
  • src/sub/b/a/mod.rs

In general, a mod statement always declares a new module and should appear only once for each source file. Everywhere else, you should use use statements to refer to items inside the module.
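
One way to see this for yourself is to ask each module for its own path. The sketch below assumes inline modules in place of the files, and the where_am_i function is a hypothetical helper; module_path!() is a standard macro that expands to the path of the module it is written in (prefixed with the crate name, which will vary).

```rust
mod sub {
    pub mod a {
        pub fn where_am_i() -> &'static str {
            module_path!()
        }
    }

    pub mod b {
        // Stand-in for writing `mod a;` inside src/sub/b.rs:
        // this is a brand-new module, unrelated to sub::a.
        pub mod a {
            pub fn where_am_i() -> &'static str {
                module_path!()
            }
        }
    }
}

fn main() {
    println!("{}", sub::a::where_am_i());
    println!("{}", sub::b::a::where_am_i());
}
```

The two functions report different paths, one ending in sub::a and the other in sub::b::a, even though their bodies are identical.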

Well, that's kind of "cheating" or at least it's not using a core language feature related to modules.

I thought mod name; basically only imports a module and that you have to include a body to actually define a new module. I understand though that it isn't looking for the module in the location I expected (just the current directory).

Which part seems like cheating?

Using raw string manipulation (the include!() macro) in order to circumvent the module system. It's basically playing C preprocessor with the Rust compiler. Literally including files is the principal billion-dollar problem that proper module systems try to solve.


use statements are the ones that import modules, and mod statements define them.

Even if you use something like the #[path] attribute on mod name; to force the compiler to read the file from where you expect, the module you get will still be logically distinct from any other that was read from the same file. This produces some confusing and undesirable effects.

In particular, any types defined in the file that was read twice will have two separate, incompatible, definitions. Something that's expecting a sub::a::Foo object won't accept a sub::b::a::Foo object and vice versa. As far as the compiler's concerned, they're two completely different types that just happen to have the same layout and similar names.
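
Here is a rough single-file illustration of that effect. Since a real #[path] setup needs several files, this sketch assumes a macro (the hypothetical file_contents!) as a stand-in for "the same source text loaded into two modules", and compares the resulting types with std::any::TypeId.

```rust
use std::any::TypeId;

// Hypothetical stand-in for the contents of one source file that gets
// loaded by two different mod declarations.
macro_rules! file_contents {
    () => {
        #[derive(Debug, Default)]
        pub struct Foo {
            pub value: u32,
        }
    };
}

mod sub {
    pub mod a {
        file_contents!();
    }
    pub mod b {
        pub mod a {
            file_contents!();
        }
    }
}

// Compares the identities of the two textually identical types.
fn same_type() -> bool {
    TypeId::of::<sub::a::Foo>() == TypeId::of::<sub::b::a::Foo>()
}

fn main() {
    println!("same type? {}", same_type());
    // The next line would be rejected with a mismatched-types error:
    // let x: sub::a::Foo = sub::b::a::Foo::default();
}
```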


Admittedly I'm new to Rust, but this doesn't match my reading.
It seems to me that a mod does one of two things. It can "define" a module or it can make one available in the current source file (I'd call this importing).
And it seems to me that a use just creates a shorter way to refer to a value in another module than using a fully qualified name.
It seems that use never appears without a corresponding mod and it's the mod that actually does the import.

This may be primarily a terminology issue. The statement mod name; is essentially equivalent to

mod name {
    include!("name.rs");
}

(where name.rs is the location found by the compiler's search).

Though the text of the module body may be loaded from a different disk file, it's still a new module definition: every one of these statements will create a new instance in the module tree and the compiler will treat each of them as if they're completely unrelated to each other.

You are correct that use "just creates a shorter way to refer to a value in another module," but that's more commonly referred to as "importing a value from another namespace/module". You can't import something that hasn't been defined, so you can always trace a use statement to some originating mod statement. What you can't do is replace that use with another copy of the mod statement; that means something entirely different, and is almost never what you want.
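
As a quick check of the "use only adds a name" half of this, the sketch below reaches one type through three paths and confirms they are the same type. The inline modules, the Foo type, and the ShortName alias are assumptions made up for the example.

```rust
use std::any::TypeId;

mod sub {
    pub mod a {
        pub struct Foo;
    }

    // `use` adds a second name for the existing item;
    // no new module or type is created.
    pub use a::Foo;
}

// Another alias for the same item, under a different local name.
use sub::a::Foo as ShortName;

// True only if all three paths resolve to one and the same type.
fn same_item() -> bool {
    TypeId::of::<sub::a::Foo>() == TypeId::of::<sub::Foo>()
        && TypeId::of::<sub::Foo>() == TypeId::of::<ShortName>()
}

fn main() {
    println!("same item? {}", same_item());
}
```

Contrast this with declaring the module a second time, which produces a distinct type rather than another name for the existing one.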


Another way to think about it is this: Rust modules are static singleton objects, and exactly one is created for each mod statement in the source code. The file-loading behavior of mod statements is secondary to this, a programmer convenience that doesn't alter their basic operation.


Thanks so much for clarifying this!

Wow, thanks. That might be the most useful thing I ever read about Rust modules. Such a succinct summary. Typically I have been totally confused setting up modules for each new project, having forgotten how it worked since last time, and had to study the docs all over again.

It surely is.

For example, that is where I disagree. I'm in the camp that thinks of 'mod' as actually doing the import (at least when the body is in another file) and 'use' as simply creating a shorter alias.

After all, this is what one would expect of 'import' from JavaScript modules, or Python (or C++ nowadays, I believe).

Also your example there is showing mod doing the import very vividly, by include!ing the bytes of another file explicitly.

How does this affect the generated executable? If I have three source files that use mod name; to pull in the definition of a module and that module defines a single function, is the code to implement that function repeated three times in the resulting executable?

You can disagree but that doesn't change the fact that this is not true. The compiler simply doesn't work like that.
