Soft question: navigating a 40k LOC workspace

I have a workspace with multiple crates (all authored by me) that is 40k LOC.

In recent weeks, I have been running into an issue where, despite using

  • IntelliJ
  • IntelliJ jump to definition
  • IntelliJ jump to usage of definition
  • IntelliJ global search
  • IntelliJ jump to class

I often struggle to jump to the right place in my codebase.

In particular, concrete problems I am running into are:

  1. async really screws up code navigation; there is no easy way for me to say: "find all async tasks that might wake up as a result of this action"; so oftentimes I'll be reading the code, get to the end of an async block -- and then have no idea where to "continue" the logical trace

  2. even with IntelliJ jump-to-usage-of-definition, it is not obvious from the UI which is the right one, so it becomes this silly linear search

  3. I don't have a good "mental stack management", so after a few rounds of (1) and (2), I have way too many buffers open and have also lost track of previous train of thought

I am curious (1) how others are navigating large code bases, and (2) what "design patterns" make code easily searchable / jumpable.

Thanks!

I don't have a good "mental stack management", so after a few rounds …

It sounds to me like the “pattern” you are missing is modularity, or abstraction boundaries. Write modules (not necessarily literally Rust modules, but they can be) such that there is a (relatively) simple explanation of what ideas should cross the boundary of that module, and treat it as a bug in the design when you find yourself having to jump between the code inside the module and the code outside the module to understand the program. Design “narrow” interfaces between modules. A good module should encapsulate a particular solution to a problem (algorithm, data structure), and the only thing the outside world should need to know about its internal structure is "that problem is solved in here", not the details of how it is solved.

Of course, sometimes you have a bug, and you can't nail down exactly what it is without studying the way that module interacts with other code. But that should be a relatively rare case. (Write tests for your modules!)

Structure your program so that complexity is isolated to small areas with good (well-designed, well-tested) boundaries and you will minimize these “mental stack overflows”.
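As a minimal sketch of what a narrow interface can look like in Rust (the `scoreboard` module and every name in it are made up for illustration, not taken from your codebase): the `HashMap` is a private field, so code outside the module can only go through `record` and `top`, and a reader outside never needs to jump inside to follow the program.

```rust
mod scoreboard {
    use std::collections::HashMap;

    /// Hypothetical example type. The field is private, so callers
    /// cannot depend on how the totals are stored.
    pub struct Scoreboard {
        totals: HashMap<String, u32>,
    }

    impl Scoreboard {
        pub fn new() -> Self {
            Self { totals: HashMap::new() }
        }

        /// Add `points` to the running total for `player`.
        pub fn record(&mut self, player: &str, points: u32) {
            *self.totals.entry(player.to_string()).or_insert(0) += points;
        }

        /// Return up to `n` players, highest total first
        /// (ties broken by name so the order is deterministic).
        pub fn top(&self, n: usize) -> Vec<(String, u32)> {
            let mut rows: Vec<(String, u32)> = self
                .totals
                .iter()
                .map(|(name, total)| (name.clone(), *total))
                .collect();
            rows.sort_by(|a, b| b.1.cmp(&a.1).then_with(|| a.0.cmp(&b.0)));
            rows.truncate(n);
            rows
        }
    }
}

fn main() {
    let mut board = scoreboard::Scoreboard::new();
    board.record("ann", 3);
    board.record("bob", 5);
    board.record("ann", 4);
    println!("{:?}", board.top(1));
}
```

If the storage later changes (say, to a sorted structure), only the inside of `scoreboard` changes, and "jump to usage" on `record` or `top` lists every place the outside world touches it.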


even with IntelliJ jump-to-usage-of-definition, it is not obvious from the UI which is the right one

IntelliJ shows you one line of context, right? That one line usually contains variables. Make sure you're not missing opportunities to give those variables more specific names.


async really screws up code navigation; there is no easy way for me to say: "find all async tasks that might wake up as a result of this action"

I think good abstractions will help you not have to worry about long-distance side effects like this, because they will happen for locally understandable reasons even if you don't know exactly what code was responsible for causing them.
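One way this can look, as a rough sketch only: put the background worker behind a handle type, so the only way to wake it is through the handle's methods. To keep the example std-only it uses a thread and `std::sync::mpsc` rather than an async runtime; I'm assuming the same shape carries over to an async task fed by an async channel. `CounterHandle` and `Command` are hypothetical names.

```rust
use std::sync::mpsc::{self, Sender};
use std::thread::{self, JoinHandle};

/// Every message the worker can receive. Private, so only the
/// handle's methods below can construct one.
enum Command {
    Add(u64),
    Stop,
}

/// Hypothetical handle: the only way to reach the worker.
pub struct CounterHandle {
    tx: Sender<Command>,
    worker: JoinHandle<u64>,
}

impl CounterHandle {
    pub fn spawn() -> Self {
        let (tx, rx) = mpsc::channel();
        let worker = thread::spawn(move || {
            let mut total = 0u64;
            // The worker wakes up only when a Command arrives.
            for cmd in rx {
                match cmd {
                    Command::Add(n) => total += n,
                    Command::Stop => break,
                }
            }
            total
        });
        Self { tx, worker }
    }

    /// "Find usages" on this method lists every place that can wake the worker.
    pub fn add(&self, n: u64) {
        self.tx.send(Command::Add(n)).expect("worker has exited");
    }

    /// Ask the worker to finish and return its final total.
    pub fn stop(self) -> u64 {
        self.tx.send(Command::Stop).expect("worker has exited");
        self.worker.join().expect("worker panicked")
    }
}

fn main() {
    let counter = CounterHandle::spawn();
    counter.add(2);
    counter.add(40);
    println!("{}", counter.stop());
}
```

With this layout, the question "what might wake up as a result of this action" becomes "which handle methods does this action call", which the IDE can answer.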

Part of the problem is this: suppose we have a field foo, and a long line like

blahblah.hey.foo.do_thingy(long_expr_for_arg1, long_expr_for_arg2, long_expr_for_arg3)

then rustfmt will break the lines in such a way that .foo is on its own line

Are you able to ask IntelliJ to give more than 1 line of context? I know rust-analyzer + VS Code will show you the entire function signature plus doc-comments when you hover over items.

Yeah, this sounds more like an architecture problem than an IDE issue.

At some point, your project will get so large that it no longer fits into your head or can't be completely read by a single person. A good IDE can help a lot with navigation, but if you are finding you've got a dozen files open at a time when trying to follow the code flow then maybe it's time to refactor things.

One idea is to make sure things frequently accessed together are physically closer. Kinda like what we do with data structures and data locality, except for humans.

Another tactic, like @kpreid suggested, is to rely on abstractions and layering so you can quickly skip over things ("I'm interested in X, but that function manages Y and is independent of X, so let's ignore it").

If async in particular is making things difficult, maybe have a look at how the async code is written and ask why it's hard to follow. One of the main reasons async/await was introduced in the first place is to make the flow of asynchronous code as easy to follow as synchronous code.
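To illustrate what "as easy to follow as synchronous code" means, here is a small sketch in which the async function reads top to bottom like ordinary code. The `load_config`, `connect`, and `start_up` functions are invented placeholders, and `block_on` is a toy single-future executor built on `std::task::Wake` so the example needs no external runtime (it assumes Rust 1.68 or later for `std::pin::pin!`); a real project would use its runtime's own executor.

```rust
use std::future::Future;
use std::pin::pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};
use std::thread::{self, Thread};

/// Waker that unparks the thread running `block_on`.
struct ThreadWaker(Thread);

impl Wake for ThreadWaker {
    fn wake(self: Arc<Self>) {
        self.0.unpark();
    }
}

/// Toy executor: polls one future to completion on the current thread.
fn block_on<F: Future>(fut: F) -> F::Output {
    let mut fut = pin!(fut);
    let waker = Waker::from(Arc::new(ThreadWaker(thread::current())));
    let mut cx = Context::from_waker(&waker);
    loop {
        match fut.as_mut().poll(&mut cx) {
            Poll::Ready(value) => return value,
            Poll::Pending => thread::park(),
        }
    }
}

// Hypothetical steps; real ones would do I/O.
async fn load_config() -> u32 {
    8
}

async fn connect(port_offset: u32) -> u32 {
    8000 + port_offset
}

/// Reads like synchronous code: each step follows the previous one,
/// and "jump to definition" works on every call.
async fn start_up() -> u32 {
    let offset = load_config().await;
    connect(offset).await
}

fn main() {
    println!("{}", block_on(start_up()));
}
```

When the logical trace stays inside `.await` chains like this, there is nothing extra to navigate; the trouble usually starts where control leaves the chain (spawned tasks, channels, shared state), so those are the spots worth looking at first.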

In some situations, it's easier to accept the complexity and try to do things in a way that doesn't require understanding all the fine details. At a previous job, I was tasked with maintaining a mission-critical codebase that had been written by one self-taught programmer over the span of 20 years - to call it a dumpster fire would have been an insult to dumpsters. The book, Working Effectively with Legacy Code, introduced several useful techniques for dealing with those sorts of situations.

So far, I have been going mostly by one struct/enum per file. This almost certainly implies 4+ files open most of the time (not sure if 'dozen' above is literal).

I'm mostly navigating by keyboard, not mouse, so if we have:

crate1/src/car.rs

src/
  foo.rs <-- editing this file; has struct Foo
    type `Car' appears in foo.rs
  bar.rs <-- has struct Bar

If 'Car' appears in foo.rs but 'Bar' does not, it feels that 'car.rs' is closer than 'bar.rs' because I can put cursor over 'Car', hit jump-to-def, whereas navigating to bar.rs requires using the mouse.

Excuse the slightly trivial aside, but the way to go for this kind of 'nonsemantic' nav with the keyboard is an ace-jump-like facility. It's a while since I've used IntelliJ, but it definitely has a plugin of this sort. If you want to try it, search for a plugin with 'ace' or 'easymotion' in the name.