Refactoring: Macro or not?

So, I've been doing rust for some time and love (most of it). Though, when refactoring, I always end up fighting the types (especially iterator types) and the borrow checker.

The code below is an example of when I have code repetition that I fail to refactor away. I'm coming from Python and Haskell where I can just wrap everything in functions, but with Rust, it seems not that easy.

So here is the non refactored code with code repetition:

pub fn normalize_pattern_substitution(
    pattern: Pattern,
    substitution: Pattern,
) -> (Pattern, Pattern) {
    let mut index_map = FxHashMap::default();
    let mut current_index = 0;
    let new_pattern: Pattern = pattern
        .into_iter()
        .map(|p| {
            let [a, l, b] = p;
            let na = match index_map.get(&a) {
                Some(_a) => *_a,
                None => {
                    index_map.insert(current_index, a);
                    current_index += 1;
                    current_index - 1
                }
            };
            let nb = match index_map.get(&b) {
                Some(_b) => *_b,
                None => {
                    index_map.insert(current_index, b);
                    current_index += 1;
                    current_index - 1
                }
            };
            [na, l, nb]
        })
        .collect::<Vec<_>>();
    let new_substitution: Pattern = substitution
        .into_iter()
        .map(|p| {
            let [a, l, b] = p;
            let na = match index_map.get(&a) {
                Some(_a) => *_a,
                None => {
                    index_map.insert(current_index, a);
                    current_index += 1;
                    current_index - 1
                }
            };
            let nb = match index_map.get(&b) {
                Some(_b) => *_b,
                None => {
                    index_map.insert(current_index, b);
                    current_index += 1;
                    current_index - 1
                }
            };
            [na, l, nb]
        })
        .collect::<Vec<_>>();
    (new_pattern, new_substitution)
}

My plan was to rewrite this as:

pub fn normalize_pattern_substitution(
    pattern: Pattern,
    substitution: Pattern,
) -> (Pattern, Pattern) {
    macro_rules! get_set {
        ( $x:expr, $map:expr, $ci:expr) => {
            match $map.get(&$x) {
                Some(_x) => *_x,
                None => {
                    let i = $ci.clone();
                    $map.insert($x, i);
                    $ci += 1;
                    i
                }
            }
        };
    }
    macro_rules! norm_pattern {
        ( $pattern:expr, $map:expr, $ci:expr) => {
            $pattern
                .into_iter()
                .map(|p| {
                    let [a, l, b] = p;
                    let na = get_set!(a, $map, $ci);
                    let nb = get_set!(b, $map, $ci);
                    [na, l, nb]
                })
                .collect::<Vec<_>>()
        };
    }
    let mut index_map = FxHashMap::default();
    let mut current_index = 0;
    let new_pattern: Pattern = norm_pattern!(pattern, index_map, current_index);
    let new_substitution: Pattern = norm_pattern!(substitution, index_map, current_index);
    (new_pattern, new_substitution)
}

How would a more experienced rust programmer refactor it? Any links or resources are more than welcome for general best rust practices.
I apologize for this vague and unstructured question, but I hope anyone can help me in my search.

Thanks

Solution summary

The biggest problem with code was not the borrow checker.
My mind was locked in writing the code as

fn foo(a: IncrementalMap) -> IncrementalMap {
        let mut a = a;
        a.get_index_or_increment_insert(5);
        a
 }
...
 pattern.into_iter().map(|p| {
        a = foo(a);
 });

that cause problem with ownership and iterator. Final refactor looks like below. And as pointed out by Michael-F-Bryan my biggest issue was code clutter (that I will address in my code in the future).

Thanks all for the comments.

struct IncrementalMap {
    last_index: usize,
    map: FxHashMap<usize, usize>,
}
impl IncrementalMap {
    /// If key exist in map return, else map key with current index and increment
    fn get_index_or_increment_insert(&mut self, key: usize) -> usize {
        let val = self.map.entry(key).or_insert_with(|| self.last_index);
        self.last_index += 1;
        *val
    }
    fn new() -> IncrementalMap {
        IncrementalMap {
            last_index: 0,
            map: FxHashMap::default(),
        }
    }
}
pub fn normalize_pattern_substitution(
    pattern: Pattern,
    substitution: Pattern,
) -> (Pattern, Pattern) {
    fn norm_pattern(pattern: &Pattern, incremental_map: &mut IncrementalMap) -> Pattern {
        pattern
            .into_iter()
            .map(|p| {
                let [a, l, b] = p;
                let new_a = incremental_map.get_index_or_increment_insert(*a);
                let new_b = incremental_map.get_index_or_increment_insert(*b);
                [new_a, *l, new_b]
            })
            .collect_vec()
    }
    let mut incremental_map = IncrementalMap::new();
    let new_pattern: Pattern = norm_pattern(&pattern, &mut incremental_map);
    let new_substitution: Pattern = norm_pattern(&substitution, &mut incremental_map);
    (new_pattern, new_substitution)
}

I don't recognize anything in the code that would warrant the use of a macro. If you are annoyed by the long match statements, refactor them using a function.

1 Like

I would argue that using a macro here will just cover up the convoluted code, not simplify things.

In particular, you copy/pasted that same "add item, increment current_index, return previous index" dance 3 or 4 times so pull it out into a function or temporary struct which does the insertion and keeps track of a monotonically increasing counter. That way your function's main logic isn't cluttered with the insertion stuff.

struct Foo {
  last_index: usize,
  map: FxHashMap<usize, Pattern>,
}

impl Foo {
  fn get_index_or_insert(&mut self, key: usize, item: Pattern) -> usize {
    self.map.entry(key)
      .or_insert_with(|| {
        self.last_index += 1;
        item
      });
    self.last_index
  }
}

It's also really hard to tell what any of this means. For example, na means precisely nothing to me besides being part of the Australian saying "yeah, na":

Yeahnah. The most common form of Australian slang that is the short form of 'yes, but no'. Commonly used to respond to a mate during conversation, when you are acknowleging what was just said in a conversation, but you have a different view or opinion and are about to express said view or opinion.

What is the function actually trying to do?

2 Likes

Here's a function-and-closures approach.

After refactoring that though, I note that you just insert things into the map with keys that start at 0 and increase by 1 each time. That's just pushing onto a Vec, so here's another version that uses a Vec instead.

(I made some assumptions about the types involved.)

2 Likes

Hi Michael, thank you for your answer! The code is one of many functions part of a pipeline.

The code builds on the concept from:
Finally We May Have a Path to the Fundamental Theory of Physics… and It’s Beautiful—Stephen Wolfram Writings.

Where a graph contains nodes and edges of a particular pattern that are replaced with other nodes.

Before I do my search, I normalize the search and replace pattern (pattern, substitution).
The pattern has the form: [[a,l,b], [a,l,b], [a,l,b]...] -> [[a,l,b],[a,l,b]...]
Where: 0 <= a,b <= 8, l={0,1}.

An example of a pattern I can end up with (if randomly generating a pattern is:)
[[0,0,1],[3,1,0]] -> [[0,1,2]]. <-- This is the problem I'm trying to solve where the left side must contain all of the lowest of numbers, while new numbers on the right side must be larger or equal to the numbers on the left side (excluding the center number denoted l (this is small L)).

As the numbers don't matter, only that the same number get replaced with same.
So my function would take this pattern and turn it into:
[[0,0,1],[2,1,0]] -> [[0,1,3]].

Hope this summary is enough, and yes making thank you for rewriting it as a separate component. Breaks my thinking a bit and I think this is for sure wiser (and makes the code more understandable).

The reason I write macro is the borrow checker complains when sending a mutable object into a function. I usually would write a closure to deal with it, but as said, I'm having a problem satisfying the borrow checker. But I may end up in that situation because I'm not addressing the problem correctly in the first place. Your solution is way more understandable and avoids the problem altogether by going a different path (which is exactly what I was looking for). (Again, super thanks for your answer).

PS.
na means new_a, nb means new_b. And yes, it is a bug, the key-item is in the wrong order.

((Here is how the output looks from a random init state and replacements))

The issue of using functions is the borrow checker (for me). Usually, a closure would do, but functions/closures with iterators in rust are generally a big pain (for me) because of the complex type that comes out. I'm in a similar position as this thread from reddit: https://www.reddit.com/r/rust/comments/4dfvc0/having_trouble_refactoring_to_appease_the_borrow/

This is the reason why I even started using macros (as I can basically just drop in replace my code, without needing to think about ownership, return types etc.). I can feel it is a bad practice, and that was why I wanted to reach out to the community to overcome this hurdle. I think Michael's answer was more what I was looking for. But thank you for taking your time.

Yes, correctly inferred type! I'm sorry I didn't write the explicit definition of Pattern!

After your refactoring I found a bug where I swapped the key and the value :sweat_smile:

Also, you did help me by showing how you wrote it function/closures style. I can see I fail by taking the &mut map with ownership in my attempt to closure and then return to overwrite previously instead of just mutating it in the function itself! So big thanks!

This topic was automatically closed 90 days after the last reply. We invite you to open a new topic if you have further questions or comments.