Discussion on the several ways to split large code

GilShoshan94 · November 21, 2023, 1:19pm

Hi all,

As the title says, I am trying to find ways to organize my ever-growing code.

I try to split my code into different modules, usually one per file, but I also try to keep related concepts grouped together in those modules. But when I have a module "def" in a file "def.rs" with a lot of types, and I need to implement a lot of traits on them, I end up with a lot of impl blocks, which makes the code harder to navigate.

So what are my options to make this codebase easier to navigate?

A) The first obvious way is to split the code into modules. This also lets you control scope and privacy. But you need to bring everything you use into scope (e.g. use super::*;)
(and the linter usually won't be happy with super::* ...)
A1) The module can be inline, within curly brackets. This is useful for tests and, I guess, other scenarios.
It would let me collapse the code inside, but the annoyance is having to bring everything I need back into scope. Like this:

#[attribute...]
pub struct Foo {
...
}

mod _impl_Foo {
    use super::*;
    impl Foo {...}
    impl Trait for Foo {...}
    ...
}

...
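For readers who want something that compiles, here is a minimal sketch of option A1. The field `value`, the methods, and the `Display` impl are made up for illustration, and the module is named `impl_foo` rather than `_impl_Foo` only to avoid the snake-case lint. It uses explicit imports in place of `use super::*;`.

```rust
#[derive(Debug, Clone, PartialEq)]
pub struct Foo {
    value: u32, // hypothetical field, not from the original post
}

// Inline module that only holds the impl blocks, so an editor can fold it
// as a single unit.
mod impl_foo {
    // Explicit imports instead of `use super::*;`.
    use super::Foo;
    use std::fmt;

    impl Foo {
        pub fn new(value: u32) -> Self {
            Foo { value }
        }

        pub fn value(&self) -> u32 {
            // Private fields of `Foo` are reachable here because this module
            // is a child of the module that defines `Foo`.
            self.value
        }
    }

    impl fmt::Display for Foo {
        fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
            write!(f, "Foo({})", self.value)
        }
    }
}
```

The impls are not namespaced by the module: callers still write `Foo::new(3)`, not `impl_foo::Foo::new(3)`.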

A2) The module can be in another file, which is obviously a great way to split code. But past a certain granularity it feels like too much. If I have 10 types in my def module, it seems excessive to have 10 submodules, which means 10 files + mod.rs for def itself, + all the use statements repeated in each module.
B) The second way is a bit more obscure. I picked up this pattern while writing proc macros and learning from serde, I think. It uses an unnamed constant: const _: () = {...code...};
In macros it is used to assert trait bounds.
This comes with a big caveat: any type defined inside won't be visible outside. But since free constants are always evaluated at compile time, it seems that writing several impl blocks inside, for a type that is in scope, works, and the big advantage is that you stay in the same scope. Like this:

#[attribute...]
pub struct Foo {
...
}

const _: () = {
    impl Foo {...}
    impl Trait for Foo {...}
    ...
};
...
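A minimal compilable sketch of option B, to make the scoping behaviour concrete. The field, the methods, and the `helper` function are hypothetical. It shows that names from the enclosing module are usable inside the block without re-importing, while items declared inside the block stay local to it.

```rust
use std::fmt;

pub struct Foo {
    value: u32, // hypothetical field, not from the original post
}

const _: () = {
    // `Foo` and `fmt` come from the enclosing module; no `use` is needed.
    impl Foo {
        pub fn new(value: u32) -> Self {
            Foo { value }
        }

        pub fn doubled(&self) -> u32 {
            helper(self.value)
        }
    }

    impl fmt::Display for Foo {
        fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
            write!(f, "Foo({})", self.value)
        }
    }

    // This function is an item of the block: it cannot be named from
    // outside the `const _` body.
    fn helper(x: u32) -> u32 {
        x * 2
    }
};
```

The impls themselves apply everywhere `Foo` is visible; only items such as `helper` are confined to the block.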

Question about the unnamed constant: are there any drawbacks? Any reasons not to do that?

Also, what do you think is the best practice in general, and is there another way besides A1, A2 and B?

H2CO3 · November 21, 2023, 1:21pm

Absolutely do not do that outside macro-generated code. Nobody writes code like that by hand.

If you need to split your impls into namespaces, then use modules as normally.

GilShoshan94 · November 21, 2023, 1:24pm

Below is a toy example with the const _: ()

Below is the same example but with a mod

afetisov · November 21, 2023, 2:10pm

It's up to you to set the module granularity. If you think 10 files for 10 types is too much, no one is stopping you from splitting the types into 5, 3, or 2 files. Use as few or as many as makes sense to make the code readable. Generally you should keep closely related stuff together (e.g. a type and its impls, particularly inherent impls), and unrelated stuff separate. But there are exceptions. Sometimes it makes sense to dump dozens of types in a single file, e.g. when you are defining serializable message types, with lots of pure data and lots of cross-references, but barely any methods on the types. Sometimes it's best to split all types into separate small files, just because there are unlikely to be any cross-references between them, and anyone reading or modifying the code will be only interested in a single type at a time.

For inspiration, you should take a look at Rust's standard library. For example, take a look at the iterator adapter types (std::iter::adapters::*), each of which lives in a separate, often very tiny file. A single huge Iterator trait provides ergonomic use of the adapters for downstream code, without littering use sites with imports of individual adapters.
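The layout described above can be sketched in miniature. Everything here is hypothetical (the `Source` trait, the `Doubled` and `Evens` adapters, the `Counter` type) and is not the standard library's actual code. The inline submodules stand in for what would be separate small files in a real crate.

```rust
/// One trait exposes every adapter as a method, so callers only need
/// `Source` in scope.
pub trait Source {
    fn next_value(&mut self) -> Option<u32>;

    fn doubled(self) -> adapters::doubled::Doubled<Self>
    where
        Self: Sized,
    {
        adapters::doubled::Doubled::new(self)
    }

    fn evens(self) -> adapters::evens::Evens<Self>
    where
        Self: Sized,
    {
        adapters::evens::Evens::new(self)
    }
}

pub mod adapters {
    // Would be `adapters/doubled.rs` in a real crate.
    pub mod doubled {
        use super::super::Source;

        pub struct Doubled<S> {
            inner: S,
        }

        impl<S> Doubled<S> {
            pub fn new(inner: S) -> Self {
                Doubled { inner }
            }
        }

        impl<S: Source> Source for Doubled<S> {
            fn next_value(&mut self) -> Option<u32> {
                self.inner.next_value().map(|v| v * 2)
            }
        }
    }

    // Would be `adapters/evens.rs` in a real crate.
    pub mod evens {
        use super::super::Source;

        pub struct Evens<S> {
            inner: S,
        }

        impl<S> Evens<S> {
            pub fn new(inner: S) -> Self {
                Evens { inner }
            }
        }

        impl<S: Source> Source for Evens<S> {
            fn next_value(&mut self) -> Option<u32> {
                // Skip odd values until an even one (or the end) is found.
                while let Some(v) = self.inner.next_value() {
                    if v % 2 == 0 {
                        return Some(v);
                    }
                }
                None
            }
        }
    }
}

/// Hypothetical base source yielding 0, 1, ..., end - 1.
pub struct Counter {
    next: u32,
    end: u32,
}

impl Counter {
    pub fn up_to(end: u32) -> Self {
        Counter { next: 0, end }
    }
}

impl Source for Counter {
    fn next_value(&mut self) -> Option<u32> {
        if self.next < self.end {
            let v = self.next;
            self.next += 1;
            Some(v)
        } else {
            None
        }
    }
}
```

A call site such as `Counter::up_to(5).evens().doubled()` never names `Evens` or `Doubled`, even though each adapter lives in its own module.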

The repeated use statements are basically a non-issue, and should be last on your list of priorities when making any project-structuring decisions. Any IDE will write the imports for you, and collapse all imports by default, so it doesn't matter much how many of those are in the file. Even when reading code in a simple editor, it's trivial to skip past a large block of properly formatted imports.

The unnamed constant doesn't solve the code structuring problem in any way. All the code is still dumped haphazardly into a single file, with no way to navigate it. It's also quite unidiomatic, so I would never use this approach to declare impls outside of macros.

Again, writing the proper imports is trivial. It shouldn't stop you in any way from properly structuring your codebase, and shouldn't be a reason to use wildcard imports, which poorly affect code readability and IDE usefulness.

An inline module is semantically equivalent to a module living in a separate file. The compiler literally turns all used files into such inline modules inside one huge agglomerated source for your crate. For this reason the only differences between inline and separate modules are in human usability.

Since inline modules are generally unidiomatic and don't help with the wall of text problem, I'd generally avoid using them outside certain special cases (tests, prototyping, simple re-exports etc).

The standard way to structure your code is to split it into separate files, according to semantic considerations (where would it make sense to search for a given functionality?). If you need helper functions/types/traits which you don't want to expose to the rest of the code, make a submodule and turn those into private items. You can use re-exports (pub use module::Stuff;) to provide a succinct API at the higher-level modules.
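As a rough sketch of that advice, assuming a hypothetical `shapes` module with a `Square` type and a private `helpers` submodule (none of these names come from the thread). The submodules are inline here only so the example fits in one block; in a real crate each would be its own file.

```rust
pub mod shapes {
    // Re-export: callers write `shapes::Square` and never see the
    // internal `square` submodule.
    pub use self::square::Square;

    // Would be `shapes/square.rs` in a real crate.
    mod square {
        use super::helpers::squared;

        pub struct Square {
            side: u32,
        }

        impl Square {
            pub fn new(side: u32) -> Self {
                Square { side }
            }

            pub fn area(&self) -> u32 {
                squared(self.side)
            }
        }
    }

    // Would be `shapes/helpers.rs`; not exposed outside `shapes`.
    mod helpers {
        // `pub(super)` makes this visible inside `shapes` only.
        pub(super) fn squared(x: u32) -> u32 {
            x * x
        }
    }
}
```

From outside, `shapes::Square::new(4).area()` works, while `shapes::helpers::squared` and `shapes::square::Square` are both inaccessible.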

GilShoshan94 · November 21, 2023, 2:27pm

Thank you for the feedback.

scottmcm · November 21, 2023, 4:48pm

One place where small inline modules can be handy is in enforcing a bit of extra discipline on what can access the private fields of a type.

Imagine a file like

pub use private::Even;
mod private {
    #[derive(Copy, Clone, Debug)]
    pub struct Even(u32);
    impl Even {
        pub unsafe fn new_unchecked(val: u32) -> Self { … }
        pub fn get(self) -> u32 { self.0 }
    }
}

// extra things you implement here *must* use the
// methods, rather than looking at the field directly
impl Even {
    pub fn new(val: u32) -> Option<Self> { … }
}
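A compilable version of the same idea, with the elided bodies filled in by guesswork. The bodies, the extra derives, and the `half` method are assumptions for illustration, not taken from the post above.

```rust
pub use private::Even;

mod private {
    #[derive(Copy, Clone, Debug, PartialEq, Eq)]
    pub struct Even(u32);

    impl Even {
        /// # Safety
        /// `val` must be even. (Assumed contract for this sketch.)
        pub unsafe fn new_unchecked(val: u32) -> Self {
            Even(val)
        }

        pub fn get(self) -> u32 {
            self.0
        }
    }
}

// Code out here cannot read or write `self.0`: the field is private to
// `private`, so the invariant can only be broken through `new_unchecked`.
impl Even {
    pub fn new(val: u32) -> Option<Self> {
        if val % 2 == 0 {
            // SAFETY: `val` was just checked to be even.
            Some(unsafe { Even::new_unchecked(val) })
        } else {
            None
        }
    }

    // Hypothetical extra method; it has to go through `get()`.
    pub fn half(self) -> u32 {
        self.get() / 2
    }
}
```

Writing `self.0` inside `half` would fail to compile, which is the extra discipline being described: only the few lines inside `private` need auditing to trust the invariant.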

Trame · January 5, 2024, 11:05am

I don't understand why no one has mentioned the include! macro. It is a better alternative to constants, and it is used in the standard library.
Constants are better suited to computations than to declarations: they let you evaluate some code at compile time without touching main. include!, by contrast, just inserts the contents of a file, much like #include in C++.

