Difficult/Long refactoring, suggestions?

Folks,

I often sketch my code being quite liberal with memory allocation.

Eventually, once I understand the problem, I go back to my code and want to optimize some part of it.

The optimization usually means avoiding clones, so I start to use references, and with them comes the need to specify lifetimes.

The problem I face is that I decide to change a type to use references instead of owning the object, and the next cycle of cargo check returns 15 errors.
Sometimes it is really too much, and I either give up or spend way too much time on it.

If the refactoring is complex, and it usually is, I end up having to keep in mind different branches of intertwined problems that all need to be fixed. And it is not simple.

I believe that this is still much better than plain old C or even C++, but sometimes it really feels complex.

How do you guys cope with this problem? Do you believe it is a hint of bad design?


I just work through the errors from top to bottom. I've had some refactors with 100s of errors after a change (to some key part of the system), and then I usually just wait for VS Code to show all the errors and work through them file by file.

To be honest, I don't consider it a problem at all. Quite the contrary; it's one of the main reasons I'm using Rust for my current project.

To elaborate, I'm doing numerical computing, which is hardly Rust's strength. I don't need Rust for safety (security is not an issue for my code, as I'm the only end user), nor do I always need it for speed (I could have written the main program logic in Python and delegated the heavy number crunching to Rust). I chose to write the main application code in Rust because the errors and warnings it emits let me refactor things and make large-scale changes to the application without fear. It's impossible for me to e.g. forget to update a match to add a new variant.

In Python, I'd have to run the code and wait to get a TypeError or ValueError in some spot I didn't update correctly. Chances are I broke some feature I don't frequently use, and won't know until the next time I need it.
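
To make the point about match concrete, here is a minimal sketch. The enum and function are made up for illustration and are not from the project described above. Because the match has no wildcard arm, adding a variant to the enum turns every such match into a compile error until the new case is handled.

```rust
// Hypothetical enum for illustration; not from the original project.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Boundary {
    Periodic,
    Reflecting,
    // Adding a new variant here (say, `Absorbing`) makes every `match`
    // without a wildcard arm fail to compile until it is handled.
}

/// Folds a coordinate back into the interval from 0 to `len`.
pub fn wrap(kind: Boundary, x: f64, len: f64) -> f64 {
    // No `_ =>` arm on purpose: the compiler checks exhaustiveness.
    match kind {
        Boundary::Periodic => x.rem_euclid(len),
        Boundary::Reflecting => {
            let y = x.rem_euclid(2.0 * len);
            if y < len { y } else { 2.0 * len - y }
        }
    }
}
```

A catch-all `_ =>` arm would silence that error, so leaving it out is what buys the safety net during a refactor.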


That said, if you aren't using an IDE, you should. It will speed up the edit-check cycle considerably.


I can't picture this argument. When I refactor things in Rust, I don't need to keep much in my head, because I can largely address the errors on autopilot. I only need to keep track of those things the compiler can't help me with, like "I need to fix a sign factor there," or "I just added a method with a default body to a trait and need to check all implementors."


I usually concentrate on building up my program as I'm writing it, meaning that I optimize while I'm writing the code. For example, I see that a structure owns a Vec when it could really hold a slice, so I make the change and immediately run cargo check. This makes for a slower workflow, but it leaves less to worry about later, when you're thinking about the program as a whole. Then, when you want to add another feature, it should build on top of what is already there, which makes for a rather modular system. In this case, the largest changes you may encounter are filename (or mod) changes, moved files, etc.
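
As a rough sketch of the Vec-to-slice change mentioned above (the type names here are invented for illustration), the "before" and "after" might look like this. The borrowed version gains a lifetime parameter, which is what then spreads to every type that stores it.

```rust
// Hypothetical types for illustration; the post does not show its own code.

/// First draft: the struct owns its data, so callers must hand over
/// (or clone) a `Vec`.
pub struct OwnedSamples {
    pub values: Vec<f64>,
}

impl OwnedSamples {
    pub fn mean(&self) -> f64 {
        self.values.iter().sum::<f64>() / self.values.len() as f64
    }
}

/// After the change: the struct only borrows the data, which adds the
/// lifetime parameter `'a` to the type and to everything that stores it.
pub struct BorrowedSamples<'a> {
    pub values: &'a [f64],
}

impl<'a> BorrowedSamples<'a> {
    pub fn mean(&self) -> f64 {
        self.values.iter().sum::<f64>() / self.values.len() as f64
    }
}
```

Making this change to one struct at a time and running cargo check right away keeps each batch of errors small.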

But in the case where you do end up making a change that inevitably requires refactoring (one of my previous projects was rather complex type-system-wise, so a minor change meant going through 3 trait impls and a few separate functions), noticing patterns in the changes is useful. For example, if I have many types that all implement a trait in a similar fashion, I cycle through them and make the same change to each. Another way to keep the number of errors from becoming overwhelming is to not make too many changes at once and instead focus on one at a time. In some cases a change that builds on another change leads to the realization that the original change is impossible, and you need to abandon it.

I believe that the best tips I could offer are the following to minimize the amount of long/tedious refactorings you're doing:

  • Keep your mod-system tidy. Make sure that everything is compartmentalized and there's no spaghetti. In fact if there is spaghetti, then fix that first before any optimizations.
  • Don't be afraid to rewrite a small portion (like a module or an impl) to benefit the usability.
  • Use a good IDE, where you can do things like rename a type, or look at references to a function, type or field. I'd recommend IntelliJ IDEA with the Rust plugin, or VS Code if you don't like waiting around for the IDE on startup and when debugging.
  • Piggybacking on the previous point, use an environment where you can click on the error paths, to see what the problem is in the file without having to navigate to it.

Those are my tips on how to make refactoring easy.


IMHO we should work with the type system instead of working against it. The simplest way to cope with this problem would be to change the types one at a time and fix compilation issues after each type-change. In case of function-types, that is changing the function signature, we can change them one at a time and issue a cargo check. As far as logic is concerned, they are largely localized.

If the source base is large, maybe it is a good idea to set up an IDE with a Rust plugin or RLS. Hopefully the IDE ecosystem will soon mature enough to handle these scenarios with a single command.

I usually happen to follow my "3 step refactoring program", oftentimes accidentally.

  1. I start with an idea of a Very Ambitious Refactoring. After I'm about 90% done I realize that I need to do the remaining 90%. That's not counting all those cut corners that need to be fixed at some point. Eventually, I give up and throw away all the changes. This might happen after a couple of days of work or a few very intense sessions. However, in the end I get a much better understanding of all the different interdependencies and the things that are getting in my way.

  2. With the knowledge accumulated in #1, I start making a lot of small changes that prepare for my next attempt at that grand refactoring. A lot of these changes might look very silly or completely useless: a many-file change that just swaps some arguments around or changes names, unnecessary wrappers or adapters, or a lifetime parameter that gets passed through the whole system without any visible need for it. The goal here is to prepare the system for the "big change" without changing much of the semantics. Usually, these changes are much smaller and I can ship them incrementally.

  3. Finally, I try to make the big change again. Because of all the work done in #2, things might just click into place. Sometimes I still discover some new obstacles, in which case I go back to #2.

One failure mode for this scheme is when you reach step #3 with all these adapters, wrappers, glue code, useless parameters and such, and then realize that maybe you don't need to make that big refactoring after all. Or you discover that the remaining 5% is close to impossible to do (which I found to be more common in our Rust codebase, btw; one reason is the contagious nature of ownership).
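
One way the "lifetime parameter without any visible need" from step 2 could look is sketched below. The names are invented for illustration, and this is only one possible shape for such a preparatory change: the struct still owns its data, while PhantomData lets it carry the lifetime that the later change will need.

```rust
use std::marker::PhantomData;

// Hypothetical types; names are illustrative only.

/// Preparatory step: the struct still owns its data, but already carries
/// a lifetime parameter so that signatures across the code base can be
/// updated ahead of the real change.
pub struct Table<'a> {
    rows: Vec<u32>,
    // Zero-sized marker that "uses" 'a; without it the unused lifetime
    // parameter is rejected by the compiler (error E0392).
    _borrow: PhantomData<&'a [u32]>,
}

impl<'a> Table<'a> {
    pub fn new(rows: Vec<u32>) -> Self {
        Table { rows, _borrow: PhantomData }
    }

    pub fn total(&self) -> u32 {
        self.rows.iter().sum()
    }
}

/// Downstream code is already written against `Table<'a>`, so switching
/// `rows` to `&'a [u32]` later should not need to touch this signature.
pub fn report<'a>(table: &Table<'a>) -> String {
    format!("total = {}", table.total())
}
```

Each such commit compiles and behaves as before, which is what makes it possible to ship the preparation incrementally.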


Oof, that three-step program definitely sounds familiar. It reminds me of some of my wasted weekends:

https://exphp.github.io/2018/07/30/that-weekend-i-wasted-on-newtyped-indices.html

I think my refactorings usually fall into one of a few categories:

  • Refactoring to make room for new functionality. This is generally easy.
  • Refactoring for efficiency. This can be a fair bit tougher because it involves nontrivial changes to program logic.
  • Refactoring to encode more information into the type system. These are my "wasted weekends."

Just as an isolated suggestion: Consider going for Arc and Rc before going to lifetimes, especially for data that no longer needs to change (or changes very infrequently), and especially for high-level application code. You can often get away with fairly little outside of changing some function signatures.

My code has a fairly big trait with some 15 or so methods that return various kinds of Box<dyn Trait>. Initially, I wrote all of it to support Self: !'static, resulting in methods like this:

/// Represents configuration for an atomic potential.
pub trait PotentialBuilder<Meta = Element>: Send + Sync {
    // ...

    /// Construct an object that will compute potential and forces.
    fn initialize_diff_fn<'a>(&self, initial_structure: Structure<Meta>)
        -> FailResult<Box<dyn DiffFn<Meta> + 'a>>
    where Self: 'a;

    // ...
}

Finally, at some point I resigned myself to the fact that the few places where I was using references didn't really need them, because these types ultimately represented configuration and that data never needed to change, so I stuck them into Arcs and simplified the trait to require 'static.
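
A stripped-down sketch of that Arc-based shape might look like the following. The types here are hypothetical stand-ins, not the ones from the trait above. The boxed trait object holds an Arc to the shared configuration instead of a reference, so no lifetime appears in the signature.

```rust
use std::sync::Arc;

// Hypothetical stand-ins; these are not the types from the post.
pub struct Config {
    pub scale: f64,
}

pub trait Evaluator: Send + Sync {
    fn eval(&self, x: f64) -> f64;
}

struct Scaled {
    // Shared, immutable configuration: cloning the Arc copies a pointer
    // and bumps a reference count, not the Config itself.
    config: Arc<Config>,
}

impl Evaluator for Scaled {
    fn eval(&self, x: f64) -> f64 {
        self.config.scale * x
    }
}

/// `Box<dyn Evaluator>` defaults to `+ 'static`, so no lifetime
/// parameter or `where Self: 'a` bound is needed here.
pub fn build(config: &Arc<Config>) -> Box<dyn Evaluator> {
    Box::new(Scaled { config: Arc::clone(config) })
}
```

The cost is a reference-count update per clone, which is usually negligible for configuration that is set up once.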


As soon as I get the types right in the refactoring, I go put a bunch of unimplemented!s everywhere. That way I can start changing code in one place at a time and only get errors for that place.
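
A small sketch of this stubbing approach, with a made-up type for illustration: the signatures are settled first, the bodies that have not been ported yet are replaced by unimplemented!, and the crate compiles again while the work proceeds one function at a time.

```rust
// Hypothetical API for illustration.
pub struct Mesh<'a> {
    pub points: &'a [f64],
}

impl<'a> Mesh<'a> {
    /// Already ported to the new borrowed representation.
    pub fn count(&self) -> usize {
        self.points.len()
    }

    /// Not ported yet: the signature type-checks, but calling it panics
    /// at run time.
    pub fn centroid(&self) -> f64 {
        unimplemented!("centroid: port to the slice-based Mesh")
    }
}
```

The stubs still panic if they are reached, so searching for unimplemented! before shipping is part of the routine.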

Thanks guys! Very useful hints and suggestions.

At the moment I have a type that is nothing more than a Vec, embedded in several places in the codebase. I want to change it into a &[T], but a slice necessitates a lifetime, which eventually needs to be threaded through all those places.

I believe it is a quite hard problem in general...

Another approach that I would like to try is to use a trait: have a trait that is implemented by both the Vec and the slice, and get the software to compile against it.
Then I would change the underlying implementation to use the slice instead of the Vec, and finally (maybe?) go back to concrete types.
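
One possible version of the trait idea, as a sketch only, uses the standard AsRef<[T]> trait instead of a custom one, since both Vec<T> and &[T] already implement it. The Series type is invented for illustration.

```rust
// Hypothetical type for illustration. `AsRef<[f64]>` is the standard
// library trait implemented by both `Vec<f64>` and `&[f64]`.
pub struct Series<S> {
    pub data: S,
}

impl<S: AsRef<[f64]>> Series<S> {
    /// Largest value, or `None` if the series is empty.
    pub fn max(&self) -> Option<f64> {
        self.data.as_ref().iter().copied().reduce(f64::max)
    }
}
```

With this shape, Series<Vec<f64>> and Series<&[f64]> can coexist during the migration. The lifetime is hidden inside the type parameter S and does not have to be written out as a separate parameter on every type.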
