Modern Rust Project Layouts

I'm newish to Rust and am confused about which project layout structure to use based on what I've been seeing in newer Rust projects. I've seen basically three setups:

  1. Everything in /crates and using a workspace.
  2. Main app in /src, everything else in /crates.
  3. Main app in /src, everything else as a submodule inside of /src.

Most of the new projects seem to be using #1 now, even for what I would consider to be smaller projects. Is there a reason why one would want to use one of these approaches over the other?

As with everything, it depends™.

How big is your project?
Do you need to break it up into multiple crates?

The way you organize your project is a function of how its subparts interact with each other.

I was quite confused by this for a while too. There are various reasons for doing this, such as improved compilation times (unchanged crates don't need to be recompiled, so editing one part of the project doesn't trigger rebuilding everything), testing crates separately in CI, stricter API control, and many other reasons I probably don't know about. Projects usually start as a single file, then get split into modules, and then into crates if needed. But some people do like to follow the structure of the more popular projects.

Fair. I think it would be helpful if the Rust Book had some recommendations for small, medium, and large projects, or at least a survey explaining the pros and cons of the major approaches. The Rust Book doesn't even mention creating a crates directory, which seems to have sprung up as a convention. I think conventions are very helpful for grokking projects quickly.

I've also seen a lot of projects start with crates/, which leads me to believe there must be some advantage, even if it's just avoiding having to untangle a single file or a single crate at some later date.

There are many different reasons one might use a workspace with multiple crates. However, it also has significant costs:

  • Dead code analysis stops fully working (if crate foo uses bar::func(), then bar::func() must be pub, and if foo stops calling bar::func(), nothing will tell you that you now have unused code).
  • Trait coherence checks may prevent you from writing trait implementations you would like to write.
  • Possibly worse build performance if the split doesn't provide any parallelism benefits (the results of compiling separate crates need to be written to disk and read back, rather than used within a single compilation session).
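
To make the coherence point concrete, here is a minimal sketch that uses the standard library as the "other crate". The `Names` wrapper is a hypothetical type invented for illustration; the same restriction applies when the trait and the type live in two different crates of your own workspace and you try to write the impl in a third.

```rust
use std::fmt;

// Both `Vec` and `fmt::Display` are defined outside this crate, so
// `impl fmt::Display for Vec<String>` is rejected by the orphan rule
// (error E0117).

// Hypothetical local wrapper: because `Names` is defined in this crate,
// implementing the foreign trait for it is allowed.
struct Names(Vec<String>);

impl fmt::Display for Names {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "[{}]", self.0.join(", "))
    }
}

fn main() {
    let names = Names(vec!["foo".to_string(), "bar".to_string()]);
    println!("{names}");
}
```

Inside a single crate this problem does not come up for your own types and traits, since both are local. After a split you may need wrappers like this one, or you may have to move the impl into whichever crate defines the trait or the type.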

You should not use multiple crates unless you need one of the things you can only get that way. Merely organizing your code should be accomplished using modules, which are much cheaper than crates.
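
As a rough sketch of what organizing with modules can look like, the example below keeps two areas of a program in one crate. The module and function names (`parser`, `report`, `parse_numbers`, `total`) are made up for illustration, and the inline `mod` blocks stand in for what would normally be separate files such as src/parser.rs and src/report.rs.

```rust
mod parser {
    // `pub(crate)`: usable anywhere in this crate, but not exported,
    // so the compiler can still tell whether anything calls it.
    pub(crate) fn parse_numbers(input: &str) -> Vec<i64> {
        input
            .split(',')
            .filter_map(|s| s.trim().parse().ok()) // skip entries that are not numbers
            .collect()
    }
}

mod report {
    use super::parser;

    pub(crate) fn total(input: &str) -> i64 {
        parser::parse_numbers(input).iter().sum()
    }
}

fn main() {
    println!("{}", report::total("1, 2, 3"));
}
```

Because nothing here is exported from the crate, if `report::total` stopped calling `parser::parse_numbers`, the compiler would warn that the function is never used. If `parser` were its own crate, the function would have to be `pub` and that warning would go away.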

There's also the special case of proc-macros, which require a separate crate (there are probably other types of crates that require separation too; I read about them but don't remember much). I didn't know about dead code analysis not working for workspaces O.O Is that the only analyzer support that stops working for workspaces?

It depends on the size and purpose of the project. For example, if I develop a small crate, then most likely everything will be in one directory. However, if the crate needs resources in other languages, I may separate them out, and if the project uses many different kinds of resources, I may split it up further. I think you should structure your project in the way that is most convenient for you, rather than the way other people prefer.

That’s one example of what I meant by "you need one of the things you can only get that way".

It's not that anything stops working, it's that there simply isn't any concept of "crate A is only used by crate B and not arbitrary other crates outside the workspace", so once something is pub in crate A, it and the things it uses are assumed not to be dead code. There is interest in improving this but no solution yet.

So there are situations where multiple crates are better for compilation speed, and others where a single crate is better. However, when one needs to TEST the other approach, it is too costly to refactor the whole project just to TRY it out.

It is a really difficult decision to make when one is just starting a project. Of course it depends, but it is hard to reason about the choices while the requirements are not yet clear.

When you are starting a project, it will be small enough that splitting will definitely be slower, and it is usually not feasible to predict what an appropriate split would be in the future. Splitting too early will also make it more difficult to refactor the code into a form that both suits the needs of the project and improves compilation speed if split.

Design in a single crate first. Then, later, if compilation speed is an issue, find natural divisions in your code (which will often already be separate modules) and split there. Splitting modules out into separate crates is not all that hard; move the source files, add dependencies, and rename paths to point to the new crate.

One common case where separate crates are actually recommended from the start is a library with an associated binary. While you absolutely can have binary and library targets in the same package, it is still easier and nicer in a few ways to split them.

Given these limitations, what do you see as valid reasons to split into multiple crates?

  • Procedural macros currently must be defined in a separate crate that can only define proc-macros.
    • Thus, any code that is used by both a procedural macro and the code that uses the macro must be put in yet a third library crate.
  • When you have large amounts of optional code, keeping it in separate library crates can be easier to write correctly than heavy use of Cargo features and #[cfg]. In some cases, it may be hard to define the optional code’s optional dependencies optimally without also putting it in a separate crate.
  • If you’re building an application targeting multiple very different platforms, it may be easier to define multiple binary crates than to make one crate that adapts to all the build and API requirements.
  • If you are publishing libraries:
    • Users often appreciate narrowly-scoped libraries over monolithic ones that have lots of functionality they don’t want.
    • When you have a library that needs a complex integration with other libraries, splitting the code into a crate for the core types and algorithms, and another crate depending on it for the integration, reduces versioning headaches for your users, because the core types can stay the same while migrating to a different version of the integration.
  • When your project is actually large and slow to compile, splitting in the right places can greatly improve the situation.

My overall point here is that you shouldn’t split before you know where the heavy parts of your code are. Don't split for purely organizational purposes; use modules for organization.
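
For contrast with the optional-code point above, here is a small sketch of the feature-gated style within a single crate. The feature name `json` and the `export` function are hypothetical; in a Cargo project the feature would be declared under `[features]` in Cargo.toml. Compiled without that feature, only the second `export` exists.

```rust
// Compiled only when the (hypothetical) "json" feature is enabled.
#[cfg(feature = "json")]
fn export(values: &[u32]) -> String {
    let items: Vec<String> = values.iter().map(|v| v.to_string()).collect();
    format!("[{}]", items.join(","))
}

// Fallback used when the feature is disabled. Every gated item needs a
// counterpart like this, or every call site needs its own #[cfg].
#[cfg(not(feature = "json"))]
fn export(values: &[u32]) -> String {
    let items: Vec<String> = values.iter().map(|v| v.to_string()).collect();
    items.join(" ")
}

fn main() {
    println!("{}", export(&[1, 2, 3]));
}
```

With one optional function this is manageable. With a large amount of optional code, keeping the pairs of gated items and call sites consistent across every feature combination is the part that a separate crate can make easier.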