Regarding refactoring deeply nested for loops

Hey, I have a really large function that I want to refactor for testability and to make more Rusty.
It has a lot of nested loops, and I am not sure of the best way to structure it while breaking it up.
I'd like to use higher order functions, but I also want it to be testable.
Any help appreciated.

Update: I think I've got it? I'm guessing I should just make each step a fn and unit test each small map / flat_map? Is that overkill? Is this the right direction?

Update 2: I'm going to run with this and will update with what I end up with if it works out, but I'm new to this kind of stuff, so any help is still appreciated.


struct Gameplan {
    playbooks: Vec<Playbook>,
}

struct Playbook {
    roles: Vec<String>,
}
// ...

    // Assumes `use std::{fs::File, io::Read, path::PathBuf};` at the top of the file.

    // Every role named by any playbook.
    let roles = gameplan.playbooks.iter()
        .flat_map(|playbook| &playbook.roles)
        .collect::<Vec<_>>();

    // roles/<role>/tasks for each role.
    let tasks_paths = roles.iter()
        .map(|role| {
            let mut tasks_path = PathBuf::from("roles");
            tasks_path.push(role);
            tasks_path.push("tasks");
            tasks_path
        }).collect::<Vec<_>>();

    // One directory iterator per tasks directory.
    let dir_entries = tasks_paths.iter()
        .map(|path| path.read_dir().unwrap())
        .collect::<Vec<_>>();

    // Path of every entry in every tasks directory.
    let entries = dir_entries.into_iter()
        .flat_map(|dir| {
            dir.map(|entry| entry.unwrap().path())
        }).collect::<Vec<_>>();

    // Full contents of each file.
    let contents = entries.iter()
        .map(|path| {
            let mut file = File::open(path).unwrap();
            let mut contents = String::new();
            file.read_to_string(&mut contents).unwrap();
            contents
        }).collect::<Vec<_>>();

    // Only the lines that carry an annotation marker.
    let filt_lines = contents.iter()
        .flat_map(|text| {
            text.lines()
                .filter(|line| line.contains("###"))
        }).collect::<Vec<_>>();
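
For what it's worth, the "make each step a fn and unit test it" idea from the first update might look roughly like the sketch below. The names `tasks_path` and `annotation_lines` are made up for illustration and are not from the real code. Only the pure steps are shown, since those are the ones that can be tested without touching the filesystem.

```rust
use std::path::{Path, PathBuf};

/// Builds `roles/<role>/tasks` for one role name.
/// (Hypothetical helper name, introduced for this sketch.)
fn tasks_path(role: &str) -> PathBuf {
    Path::new("roles").join(role).join("tasks")
}

/// Keeps only the lines of one file's contents that contain `###`.
/// (Hypothetical helper name, introduced for this sketch.)
fn annotation_lines(contents: &str) -> Vec<&str> {
    contents
        .lines()
        .filter(|line| line.contains("###"))
        .collect()
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn builds_tasks_path() {
        assert_eq!(tasks_path("role1"), PathBuf::from("roles/role1/tasks"));
    }

    #[test]
    fn keeps_only_annotated_lines() {
        let contents = "- name: install\n### note: a\n  apt: pkg\n";
        assert_eq!(annotation_lines(contents), vec!["### note: a"]);
    }
}
```

Whether this is overkill probably depends on how much logic each step holds; the path-building and filtering steps are cheap to test this way, while the directory-reading steps need real or temporary files.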

Background: The application is a command-line utility that parses annotations from Ansible roles (the repo is internal only, so bear with me).

As far as this question is concerned, Ansible has a directory layout like this:

roles/
├── role1
│   ├── files
│   │   └── file
│   └── tasks
│       ├── another_one.yml
│       └── main.yml
└── role2
    ├── files
    │   └── file
    └── tasks
        └── main.yml

In this function, I am:

  • iterating over the roles,
  • reading the contents of the files in each role's tasks directory,
  • filtering on lines that begin with ###,
  • running a regex with capture groups over each of those lines,
  • matching on the captures.

So, in the layout above, another_one.yml, main.yml (role1), and main.yml (role2) are the files I want to read in, filter lines from, and run regex captures on.
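
One detail on the filtering step: the list says "begin with ###", while the snippets use `contains("###")`, which also matches a marker in the middle of a line. If "begins with" is what's wanted, a sketch could look like the following. The names `is_annotation` and `lines_starting_with_marker` are hypothetical, and ignoring leading indentation is an assumption on my part.

```rust
/// Returns true when a line starts with `###`, ignoring leading whitespace.
/// Ignoring indentation is an assumption; drop `trim_start` for a strict match.
fn is_annotation(line: &str) -> bool {
    line.trim_start().starts_with("###")
}

/// Collects the annotation lines from one file's contents.
fn lines_starting_with_marker(contents: &str) -> Vec<&str> {
    contents
        .lines()
        .filter(|line| is_annotation(line))
        .collect()
}
```

With this version a line such as `name: a ### trailing` is skipped, whereas `contains("###")` would keep it.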

// For each role in each playbook.
// A playbook is just a file specifying which roles to run on which hosts,
// so this determines which roles to iterate over.
for playbook in playbooks {
  for role in &playbook.roles {

    // build the path to this role's tasks directory
    let mut tasks_path = PathBuf::from("roles");
    tasks_path.push(role);
    tasks_path.push("tasks");

    // build the path to this role's files directory
    let mut role_file_path = PathBuf::from("roles");
    role_file_path.push(role);
    role_file_path.push("files");

    // for each file in ./roles/$role/tasks directory
    for dir_entry in tasks_path.read_dir().unwrap() {
      let entry = dir_entry.unwrap();
      let mut file = File::open(&entry.path()).unwrap();

      // read the contents of the file into a string
      let mut contents = String::new();
      file.read_to_string(&mut contents).unwrap();

      // filter on annotations
      let filt_lines: Vec<&str> = contents.lines()
        .filter(|x| x.contains("###"))
        .collect();
      for line in filt_lines {
        // annotation_regex is a simple regex that captures the parts of the annotation in each filtered line
        for cap in annotation_regex.captures_iter(line) {
          match do_stuff_with_regex_captures() {}
        }
      }
    }
  }
}

In my opinion, you should use iterators and adapters to make the loop more linear and readable.
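
A minimal sketch of what that could look like, and certainly not the only way to write it. `Playbook` mirrors the struct from the question; `task_files`, `annotated_lines`, and the `root` parameter are names I made up (the question hard-codes "roles", and passing the root in makes it easier to test against a temporary directory). I also swapped `unwrap()` for `?` so I/O errors are returned rather than panicking, which is a change from the original.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

// Mirrors the struct shown in the question.
struct Playbook {
    roles: Vec<String>,
}

/// Lists the entries of `<root>/<role>/tasks`.
/// (Hypothetical helper; `root` is an assumed parameter.)
fn task_files(root: &Path, role: &str) -> io::Result<Vec<PathBuf>> {
    fs::read_dir(root.join(role).join("tasks"))?
        .map(|entry| entry.map(|e| e.path()))
        .collect()
}

/// Returns every line containing `###` from the task files of all roles.
fn annotated_lines(root: &Path, playbooks: &[Playbook]) -> io::Result<Vec<String>> {
    // One list of files per role; the first I/O error stops the collection.
    let files = playbooks
        .iter()
        .flat_map(|playbook| &playbook.roles)
        .map(|role| task_files(root, role))
        .collect::<io::Result<Vec<_>>>()?;

    // Read each file fully into memory.
    let contents = files
        .iter()
        .flatten()
        .map(|path| fs::read_to_string(path))
        .collect::<io::Result<Vec<_>>>()?;

    // Keep only the annotated lines, as owned strings.
    Ok(contents
        .iter()
        .flat_map(|text| text.lines())
        .filter(|line| line.contains("###"))
        .map(String::from)
        .collect())
}
```

The regex capture step would then run over the returned lines. Note that `read_dir` does not guarantee any particular ordering, so the order of lines across files may vary between runs.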