Rust File Organization, imports, etc

I am finding that how one organizes & uses modules/functions from other files, in a multi-dir project, is pretty complex and (so far) not too intuitive!

I have a project like this:

src
--one/bin/main.rs
--two/bin/main.rs
util.rs

I want to use functions from util.rs in the two mains. I did get it to work, by doing this:

util.rs:

use std::fs;
pub fn load() -> String { ...
}

Then doing this in one/bin/main.rs:

#[path = "../../util.rs"]
mod util;
use util::load;

But I thought it would be better to declare the module in util.rs, and not have to repeat that mod definition everywhere it's used. I thought I could do this:

pub mod util {
    use std::fs;
    pub fn load() -> String { ...
    }
}

then use this to "import":

#[path = "../../util.rs"]
use util::load;

This doesn't work.

Questions:

  • Why doesn't that work?
  • How should I be doing this - what's idiomatic?

Simply because that is not how module declaration is implemented. Modules in a Cargo project can only be declared in the crate root, i.e. in your case (where you don't have a library crate but instead only two binary crates) in your one/bin/main.rs and two/bin/main.rs.

What I'd do is keep one library crate (its root defaults to src/lib.rs) in addition to your two binaries. Then you can define your module structure there:

lib.rs

pub mod util; // must be public so that the binary crates can access it

and import it in your binaries like this:

one/bin/main.rs / two/bin/main.rs

use your_package_name::util::load;
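
To make the shape of that concrete, here is a minimal single-file sketch. It is only a stand-in: the inline `your_package_name` module plays the role of the library crate, and the body of `load` is a made-up placeholder, since the real one is elided in the question.

```rust
// Stand-in for the library crate. In a real package this would be src/lib.rs
// containing `pub mod util;`, with the function itself living in src/util.rs.
mod your_package_name {
    pub mod util {
        // Hypothetical body: the original `load` is not shown in the thread.
        pub fn load() -> String {
            String::from("contents loaded by util::load")
        }
    }
}

// This is the line each binary's main.rs would start with.
use your_package_name::util::load;

fn main() {
    println!("{}", load());
}
```

In the real layout nothing needs a `#[path]` attribute any more, because each binary reaches the shared code through the library crate. Note that Cargo only auto-discovers binaries at src/main.rs and under src/bin/, so binaries kept at other paths would need `[[bin]]` entries with an explicit `path` in Cargo.toml.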

You can find the relevant chapter of the Rust book about modules here:

https://doc.rust-lang.org/stable/book/ch07-02-defining-modules-to-control-scope-and-privacy.html#defining-modules-to-control-scope-and-privacy


Just a small note:

#[path=_] in Rust is not a preprocessor directive, as similar-looking syntax would be in C, C++, and the like. It's the syntax for an attribute, which is some extra information attached to the thing that comes next. So when you see

#[path = "blah.rs"]
mod plim;

That's not two constructs, it's one construct: a mod item with an attached attribute that overrides the path of the file the module's source is read from.

(There's also the #![ ... ] syntax, used for attributes attached to the thing they are inside of. It is most often used for attributes on the crate root, which has no explicit item of its own for a #[ ... ] attribute to attach to.)
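
A small sketch of the two attribute forms, using `derive` and `allow` rather than `path` so that it fits in one file. The names `Point`, `helpers`, `never_called` and `origin` are invented for illustration.

```rust
// Outer attribute (`#[...]`): attached to the item that follows it.
#[derive(Debug, PartialEq)]
pub struct Point {
    pub x: i32,
    pub y: i32,
}

mod helpers {
    // Inner attribute (`#![...]`): attached to the item it sits inside,
    // here the `helpers` module.
    #![allow(dead_code)]

    // Never called, but the inner attribute above silences the warning.
    fn never_called() {}

    pub fn origin() -> super::Point {
        super::Point { x: 0, y: 0 }
    }
}

fn main() {
    // `{:?}` works only because of the `derive(Debug)` attribute on `Point`.
    println!("{:?}", helpers::origin());
}
```

The same reading applies to `#[path = "..."] mod util;`: the attribute belongs to the `mod` item, which is why it cannot be attached to a `use` to the same effect.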


That's not quite correct. Modules can be declared in the crate root and also in other modules.

The key part is that a module is always declared by its containing module or crate; without such a mod declaration, the source code for the module is not even read by the compiler. Modules cannot declare themselves into existence.


Right, I was trying to use the notation from the book which differentiates between modules (declared in the crate root) and submodules (declared in other modules).


I do have to say that I find this unintuitive and sadly limiting.

Perhaps that is due to bias coming from languages with more inherent support for modularity. But, it would be nice and would seem to be quite sensible that if I have a file, somewhere, util.rs. I could declare mod util in it, and use that construct to group those functions.

But, OTOH, if the idea is that one should regard the source file itself as a grouping mechanism, I guess it's a little more palatable.

Hmm, I never found Rust's module system unintuitive, to be honest. But I don't think I had used a language with a more lax approach to modularity before I started using Rust. My experience is the opposite of yours: I went from stricter module systems bound to the file tree structure (like Rust, Python, Java) to languages with more flexibility (like C#), and I find languages that are less strict about how modules/namespaces work rather confusing. Why do I have to explicitly define the namespace in every file when I structure my source files exactly like I structure my namespaces anyway? I do want my file tree to resemble my module structure and don't need any flexibility beyond that (with the exception of re-exports, which come in handy sometimes).


Being unable to write

mod util {
   fn utilFn()...
}

anywhere I want is unintuitive to me, being used to languages where something analogous is possible (in any file, etc.).

What I'd like to know is if this is truly a "limitation" in the sense that they did it to make the language impl easier, or if such restrictions are actually purposeful and have an upside I don't see yet.


You can write that most anywhere (outside of impl blocks and such), but it doesn’t behave how you’re expecting: mod util defines a new submodule called util inside the current module; if it has a block attached like you’ve written, that is used as the module contents instead of going to the filesystem to find a corresponding .rs file.
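
A quick sketch of what that means in practice. Writing `mod util { ... }` in two places does not reopen one shared module; it creates two unrelated modules at different paths. The module `other` and the function bodies are made up for the example.

```rust
// Defines crate::util, with its contents given inline.
mod util {
    pub fn util_fn() -> &'static str {
        "crate::util"
    }
}

mod other {
    // This does NOT add to the `util` above. It defines a second,
    // separate module at crate::other::util.
    pub mod util {
        pub fn util_fn() -> &'static str {
            "crate::other::util"
        }
    }
}

fn main() {
    println!("{}", util::util_fn());
    println!("{}", other::util::util_fn());
}
```

So a util.rs that wraps its own contents in `pub mod util { ... }` ends up, once a parent declares it with `mod util;`, exposing them one level deeper at `util::util::load`.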


Maybe this example will help explain the module hierarchy:

// src/main.rs
// This is a crate root

// crate::mod1
pub mod mod1 {
    // crate::mod1::mod2
    pub mod mod2 {
        // crate::mod1::mod2::mod3
        pub mod mod3; // Lives in another file
    }
}

fn main() {
    crate::mod1::mod2::mod3::f();
    crate::mod1::mod2::mod3::mod4::g();
}

// src/mod1/mod2/mod3.rs
// This is crate::mod1::mod2::mod3

// crate::mod1::mod2::mod3::f
pub fn f() {
    println!("f");
}

// crate::mod1::mod2::mod3::mod4
pub mod mod4 {
    // crate::mod1::mod2::mod3::mod4::g
    pub fn g() {
        println!("g");
    }
}

Since the compiler knows the relationship between directories and modules, it knows where to look for mod3.rs. The compiler never assumes that an .rs file is part of the current build unless it is the crate root or is referenced by mod name;.


Of course, design decisions are subjective and can be argued about endlessly, but perhaps it will be consoling to hear about some of the advantages of Rust's system:

  • The build system does not need to supply the compiler with an accurate list of all source files; the files are discovered by the compiler as directed by the code in them (except for the crate root file, which must be passed to the compiler and is the file that decides all of what happens after that).
  • Reading the contents of a file tells you all the items within that module or crate; a declaration in one file can only add items to its own namespace, never to another file's.

To play devil's advocate a bit, the build systems for Java, Scala, Kotlin, etc. don't need to be told about source files either: they're just files under src dirs with the appropriate extension!

I do understand how the module system works (now).

I took a look at this project to get a feel for how things are done in a 'complex,' well-architected (onion) system:

I find having to have a separate file to enumerate modules at each level strange and clunky. It's a bit surprising to me given Rust's generally high level of elegance. But it's not a big deal and I'll get used to it.

I still love Rust. :slight_smile:

That's not quite what's happening.

src/domain/mod.rs is not a separate file to enumerate modules, but rather the entire contents of the domain module (mod domain; in main.rs), which is allowed to contain any kind of top-level item (structs, fns, traits, etc). It just so happens in this case that the only such items are modules.

An extra bit of confusion that may pop up is that this project is using the old file naming convention. In most new projects, this file would be src/domain.rs instead (with the submodule sources remaining in src/domain/*).
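
To illustrate that a module file such as src/domain/mod.rs (or src/domain.rs) is an ordinary module and not just an index, here is a single-file sketch. The inline `domain` module stands in for that file, and every name inside it (`user`, `order`, `MAX_ORDERS`, `describe`) is invented.

```rust
// Stand-in for src/domain/mod.rs (or src/domain.rs in the newer convention).
mod domain {
    // Submodules: in a real project these would be `pub mod user;` and
    // `pub mod order;`, with bodies in src/domain/user.rs and src/domain/order.rs.
    pub mod user {
        pub struct User {
            pub name: String,
        }
    }

    pub mod order {
        pub struct Order {
            pub id: u32,
        }
    }

    // The same file may also hold any other top-level items.
    pub const MAX_ORDERS: usize = 10;

    pub fn describe(user: &user::User, order: &order::Order) -> String {
        format!("{} placed order #{}", user.name, order.id)
    }
}

fn main() {
    let user = domain::user::User { name: String::from("Ada") };
    let order = domain::order::Order { id: 7 };
    println!("{}", domain::describe(&user, &order));
    println!("limit: {}", domain::MAX_ORDERS);
}
```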


Most perhaps, but definitely not all. I still use the "old" scheme, because I like to have the entire contents of a module inside a single directory, as opposed to partly inside and partly outside. Just seems so... untidy.


I initially make the module a single file, somemod.rs. If the complexity of the module grows and separation is needed, I convert it to a directory with somemod/mod.rs and the rest of the module files. Seems pretty simple and intuitive to me.


I wish the compiler supported a 3rd option by default: submod/submod.rs. It'd have the advantage of keeping the module's sources together without littering my editor with mod.rs tabs.


The problem is that having three ways instead of two would be even more confusing. I would welcome your solution if the somemod/mod.rs variant were banned, but I only have a bunch of small projects, and the rest of the esteemed Rust ecosystem's participants might politely disagree about the usefulness of such a radical change, even if it were cargo fix-able. It would fit perfectly in Rust 2.0 (or something), where all the quirks of Rust 1.0 would be fixed and a whole bunch of new ones added.


It's good for the file system layout to be consistent, as that makes onboarding new developers easier: they know where to look for things. I think declaring random namespaces anywhere would make for a codebase that is incredibly difficult to navigate, and I would not want to work on a project where namespaces were unrelated to the file system layout.

Cargo workspaces provide a decent, simple layout: each binary is a separate crate, and so is each library. Your libraries are pulled in by Cargo as dependencies, so you import your own library with a use, like any other crate.

I think that tends to encourage good design of having a flatter set of more independent units of code. That facilitates the "high cohesion / low coupling" ideal of good software engineering.

The library scope should be deliberately limited and not just one giant ball of "utils" in a single hierarchy. Limiting the scope of modules also aids in their discoverability and maintenance. And by packaging them, you can limit the "public" surface size to help avoid dependency on implementation details, facilitating easier refactoring.


FWIW, I think Rust is too lax regarding namespaces: I should not be able to use foo::* and suddenly have a cluttered namespace with symbols coming from who-knows-where. I much prefer Go's approach, where I can only import a module and must always refer to things inside that module as module.Func, never just a bare Func.
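
For what it's worth, that style is available in Rust as a convention even though the compiler does not enforce it: import the module rather than its contents, and qualify each call. A small sketch with made-up names (`shapes`, `geometry`, `area`):

```rust
mod shapes {
    pub mod geometry {
        pub fn area(width: u32, height: u32) -> u32 {
            width * height
        }
    }
}

// Import the module itself, not `shapes::geometry::*`.
use shapes::geometry;

fn main() {
    // Every use site shows where `area` comes from.
    let a = geometry::area(2, 3);
    println!("{a}");
}
```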


I'm not sure if this relates, but I was tired of creating multiple tiny projects, so I just declared a 'bin' for each mini project so that each one maps to its own main function.

# Cargo.toml
# declarations to be able to use multiple main files in a single project
[[bin]]
name = "ticket_example"
path = "src/concert_tickets/ticket_example.rs"

[[bin]]
name = "stock_app"
path = "src/stock_trading_backend/stock_app.rs"

With that, I can create a folder structure like so:

Then import like this:

In your mod.rs file it would be a public export like this:

I don't know if that was what you were looking for, but I think of it like React, where you have a components directory with subfolders:

--components/
    ---component1/
    ---component2/
    ---component3/
    // etc...
    index.tsx <----- has a single import for the whole directory
    instead of:
    import { Card } from '../component1/Card'
    you can just do:
    import { Card, Dialog, etc } from '../components/index.tsx'
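
The closest Rust counterpart to that index.tsx pattern is probably a re-export with `pub use`. Here is a rough single-file sketch: the inline `components` module stands in for a components/mod.rs file, and `card`, `dialog`, `Card` and `Dialog` are hypothetical names borrowed from the React example.

```rust
// Stand-in for src/components/mod.rs.
mod components {
    // Private submodules; in a real project these would be `mod card;` and
    // `mod dialog;`, backed by card.rs and dialog.rs in the same directory.
    mod card {
        pub struct Card {
            pub title: String,
        }
    }

    mod dialog {
        pub struct Dialog {
            pub message: String,
        }
    }

    // Re-export, so callers never need to know about the submodules.
    pub use self::card::Card;
    pub use self::dialog::Dialog;
}

// One import path for the whole directory.
use components::{Card, Dialog};

fn main() {
    let card = Card { title: String::from("Hello") };
    let dialog = Dialog { message: String::from("Are you sure?") };
    println!("{} / {}", card.title, dialog.message);
}
```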

But maybe I'm crazy; I don't know. It's just how I understood it. This is something like the seventh time I've relearned Rust since 2019. I used to use the #[path = ...] pattern but was told that it was frowned upon. I haven't used Rust on a job, though I have done web development with JS.

You can also go the workspace route:

or, like somebody else mentioned, put a lib.rs in the src directory and then call it from main.rs:

I hope all this helps.
