Working with Arc to add "recursive" functionality

Hello all. Working on a CLI application to find directories on web applications.

I can get a recursive version working fine by cloning items repeatedly, but I want to make it more efficient, with less overhead, before I start attempting to make it multi-threaded (which I know will require Arc).

Going with the philosophy I have read online:

  • Make it work first, then make it better
  • Make it single-threaded first, then multi-threaded.

Currently, I am trying to use Rc, since it's still single-threaded, and using this as an opportunity to learn Rc and eventually Arc, as I have no previous experience with pointers or parallelism. But I found that even if I Rc::clone(PathBuf) at the beginning of my loop, the code still throws ownership/move errors.

error[E0507]: cannot move out of an `Rc`
  --> src/main.rs:54:25
   |
54 |             for line in this_reader.lines() {
   |                         ^^^^^^^^^^^^-------
   |                         |           |
   |                         |           value moved due to this method call
   |                         move occurs because value has type `std::io::BufReader<std::fs::File>`, which does not implement the `Copy` trait

For more information about this error, try `rustc --explain E0507`.

The code is small enough that I will just copy/paste it here if you would like to take a look. Really open to any criticism, as I want to learn, but also learn to do it right. But at this moment I also really want to understand, with a good example, how to use pointers in Rust. =/

#[derive(Parser, Debug)]
#[command(author="Andrew ", version="0.1.0",
about="directory busting tool", long_about=None)]
struct Args {
    /// required: The wordlist to use for the busting
    #[arg(short, long, required=true, value_parser=clap::value_parser!(PathBuf))]
    wordlist: PathBuf,
    /// required: The target URL to go against
    #[arg(short, long, required=true, value_parser=clap::value_parser!(Uri))]
    target: Uri,
    /// optional: Cookies to use for authenticated requests
    #[arg(short, long, value_parser=clap::value_parser!(Option<String>))]
    cookie: Option<String>,
    /// optional: Perform the scan as a recursive scan
    #[arg(short, long, default_value_t=false)]
    recursive: bool,
}

fn main() {
    let cli = Rc::new(Args::parse());

    let mut found_dirs: Vec<String> = Vec::new();
    let word_list = cli.wordlist
        .to_owned();
    let file = File::open(word_list)
        .expect("Could not read file.");
    let reader = Rc::new(BufReader::new(file));

    loop {
        let this_reader = Rc::clone(&reader);
        let mut count = 0;
        if found_dirs.is_empty() && count == 0 {
            count += 1;
            for line in this_reader.lines() {
                let result = make_request(line.unwrap(), Rc::clone(&cli));
                if result == String::from("NONE") {
                    continue;
                } else {
                    found_dirs.push(result);
                }
            }
        } else if count > 0 && found_dirs.is_empty() {
            break;
        } else {
            let ext = found_dirs.pop().unwrap();
            for line in this_reader.lines() {
                let new_line = format!("{}/{}", ext, line.unwrap());
                make_request(new_line, Rc::clone(&cli));
            }
        }
    }
}
#[tokio::main]
async fn make_request(line: String, args: Rc<Args>) -> String {
    let target = &args.target;
    let cookie = &args.cookie;
    match cookie.is_some() {
        true => {
            let url = format!("{}{}", &target, &line);
            let client = reqwest::Client::new();
            let req_resp = client.get(&url)
                .header(COOKIE, cookie.as_ref().unwrap())
                .send()
                .await
                .unwrap()
                .status()
                .as_u16();
            if req_resp == 200 {
                println!("| {} | {} | {} | => Dir found, adding to list", req_resp, url, &line);
                return line;
            } else {
                println!("| {} | {} | {} |", req_resp, url, &line);
            }
        },
        false => {
            let url = format!("{}{}", &target, &line);
            let client = reqwest::Client::new();
            let req_resp = client.get(&url)
                .send()
                .await
                .unwrap()
                .status()
                .as_u16();
            if req_resp == 200 {
                println!("| {} | {} | {} | => Dir found, adding to list", req_resp, url, &line);
                return line;
            } else {
                println!("| {} | {} | {} |", req_resp, url, &line);
            }
        },
    };
    return String::from("NONE");
}

That's not an Rc<PathBuf>, it's an Rc<BufReader<File>>.

Rc<_> doesn't implement BufRead. The compiler determines[1] that you're trying to call <BufReader<File> as BufRead>::lines, which takes the implementer by value. But you can't actually make that work, because you can't move the BufReader<File> out of the Rc.
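
To see the same rule without any I/O involved, here is a small sketch using Rc<String> instead of your reader. The helper names shared_len and owned_bytes are made up for illustration. Methods that take &self work through the Rc, while a method that takes self by value (String::into_bytes here, standing in for BufRead::lines) needs an owned value first.

```rust
use std::rc::Rc;

/// `String::len` takes `&self`: auto-deref through the Rc yields a `&String`,
/// so calling it on a shared pointer compiles fine.
fn shared_len(word: &Rc<String>) -> usize {
    word.len()
}

/// `String::into_bytes` takes `self` by value, like `BufRead::lines`.
/// Calling `word.into_bytes()` directly is rejected with a move error (E0507),
/// so this clones the inner String to get an owned value it can consume.
fn owned_bytes(word: &Rc<String>) -> Vec<u8> {
    (**word).clone().into_bytes()
}

fn main() {
    let shared = Rc::new(String::from("admin"));
    let other = Rc::clone(&shared);

    assert_eq!(shared_len(&other), 5);
    assert_eq!(owned_bytes(&other), b"admin");

    // Both handles still point at the same, untouched String.
    assert_eq!(Rc::strong_count(&shared), 2);
    println!("{shared}");
}
```

A String can simply be cloned to get an owned value. A BufReader<File> cannot, which is why the same trick is not available for your reader.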


Heads up: At this point we're sort of blindly pushing forward to try and force this to work, but it doesn't necessarily make a lot of sense, which we'll come back to.

Anyway, note how all the methods on that trait take self or &mut self. If you want to meaningfully share ownership of a BufReader<_>, you're going to need some sort of synchronization (in single-threaded code, interior mutability such as RefCell) so that each owner can still get hold of a &mut BufReader<_>.

Once you have that, &mut BufReader<_> also implements BufRead, so you can pass that in without removing anything from your Rc<RefCell<_>> or whatever.

// Not that I would write it like this...
for line in (&mut *this_reader.borrow_mut()).lines()
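
Spelled out as a complete program, that route could look roughly like the sketch below. It assumes an in-memory Cursor in place of your File so that it runs on its own, and SharedReader and read_all_lines are names I made up for the example.

```rust
use std::cell::RefCell;
use std::io::{BufRead, BufReader, Cursor};
use std::rc::Rc;

// Stand-in for Rc<RefCell<BufReader<File>>>; the Cursor is an assumption
// that keeps the sketch self-contained.
type SharedReader = Rc<RefCell<BufReader<Cursor<&'static str>>>>;

/// Reads every remaining line through the shared reader.
fn read_all_lines(shared: &SharedReader) -> Vec<String> {
    let mut lines = Vec::new();
    // borrow_mut() hands out a RefMut guard; `&mut *guard` is a
    // `&mut BufReader<_>`, which implements BufRead, so `lines()` consumes
    // only that borrow and nothing is moved out of the Rc.
    for line in (&mut *shared.borrow_mut()).lines() {
        lines.push(line.expect("could not read line"));
    }
    lines
}

fn main() {
    let reader: SharedReader =
        Rc::new(RefCell::new(BufReader::new(Cursor::new("admin\nlogin\n"))));
    let other_owner = Rc::clone(&reader);

    let first = read_all_lines(&reader);
    assert_eq!(first, vec!["admin", "login"]);

    // The second owner shares the same underlying reader, which is already
    // at end of input, so there is nothing left to read.
    let second = read_all_lines(&other_owner);
    assert!(second.is_empty());
}
```

The second call coming back empty is the catch: sharing the reader does not give each owner its own pass over the file.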

OK, now it's time to take a step back and think this through. Your for loop consumes the BufReader (or &mut BufReader) and reads all the lines through the end of the file. While it's doing this, it needs exclusive access to the BufReader. Once the loop is done, you've read the entire file and the BufReader is useless.[2]

So why do you need shared ownership? The way you're consuming this BufReader, it's a one-shot deal. Just use an owned BufReader and ditch the Rc.
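
A minimal sketch of the owned version, under a few assumptions: candidate_paths is a hypothetical helper, the base URL is a placeholder, and a Cursor stands in for File::open so the example is self-contained. The reader is created, consumed by lines(), and gone, with no Rc anywhere.

```rust
use std::io::{BufRead, BufReader, Cursor, Read};

/// Builds one candidate path per non-empty word in `source`.
/// The BufReader has a single owner and is consumed by `lines()`.
fn candidate_paths<R: Read>(source: R, base: &str) -> Vec<String> {
    let reader = BufReader::new(source);
    reader
        .lines()
        .map(|line| line.expect("could not read line"))
        .filter(|word| !word.is_empty())
        .map(|word| format!("{base}/{word}"))
        .collect()
}

fn main() {
    // With a real wordlist this would be File::open(path) instead of a Cursor.
    let words = Cursor::new("admin\n\nlogin\n");
    let paths = candidate_paths(words, "http://example.test");
    assert_eq!(paths.len(), 2);
    for path in &paths {
        println!("{path}");
    }
}
```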

If you sometimes read part of the file and then stopped in the middle so you could read more later, there might be a use-case. But even then, adding synchronization around reading the file might hurt more than it helps.


  1. via method resolution

  2. Unless you were trying to read the file multiple times from the start within your loop, in which case you could reader.seek(SeekFrom::Start(0)) or such. But that seems unlikely. And why not just read it once and store the contents?
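
Both options from footnote 2 can be sketched as follows. This again assumes a Cursor in place of the File, and read_then_rewind and expand are hypothetical helper names.

```rust
use std::io::{self, BufRead, BufReader, Cursor, Seek, SeekFrom};

/// Option 1: read to the end, rewind the same reader, and read again.
fn read_then_rewind(text: &str) -> io::Result<(Vec<String>, Vec<String>)> {
    let mut reader = BufReader::new(Cursor::new(text));
    // `(&mut reader).lines()` borrows the reader instead of consuming it.
    let first = (&mut reader).lines().collect::<io::Result<Vec<_>>>()?;
    reader.seek(SeekFrom::Start(0))?;
    let second = (&mut reader).lines().collect::<io::Result<Vec<_>>>()?;
    Ok((first, second))
}

/// Option 2: read once, keep the words, and iterate over them by reference
/// as many times as needed.
fn expand(words: &[String], prefixes: &[&str]) -> Vec<String> {
    let mut out = Vec::new();
    for prefix in prefixes {
        for word in words {
            out.push(format!("{prefix}/{word}"));
        }
    }
    out
}

fn main() -> io::Result<()> {
    let (first, second) = read_then_rewind("admin\nlogin\n")?;
    assert_eq!(first, second);

    let paths = expand(&first, &["images", "static"]);
    println!("{paths:?}");
    Ok(())
}
```

For a wordlist that fits in memory, the second option avoids touching the file again at all.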

That makes a lot of sense XD. I guess I was so focused on the types that I didn't even think about the implementation or how it was actually used.

Following your suggestions, I added the entirety of the file reading process to the inside of the loop, so that each iteration creates a new reader to that file. Works now!
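
For anyone finding this later, the shape of that loop is roughly the sketch below. It is not the exact code from the project: scan is a hypothetical function, the exists closure is a stand-in for the real HTTP check, and the wordlist is a small temporary file written just for the example.

```rust
use std::collections::VecDeque;
use std::fs::File;
use std::io::{self, BufRead, BufReader};
use std::path::Path;

/// Walks the wordlist once per pending prefix, opening a fresh reader each
/// pass. `exists` is a hypothetical stand-in for the real request logic.
fn scan(wordlist: &Path, exists: impl Fn(&str) -> bool) -> io::Result<Vec<String>> {
    // The empty prefix represents the root of the target.
    let mut pending = VecDeque::from([String::new()]);
    let mut found = Vec::new();

    while let Some(prefix) = pending.pop_front() {
        // Each pass owns its own BufReader, reads it to the end, and drops it.
        let reader = BufReader::new(File::open(wordlist)?);
        for line in reader.lines() {
            let word = line?;
            let path = if prefix.is_empty() {
                word
            } else {
                format!("{prefix}/{word}")
            };
            if exists(&path) {
                // Queue the hit so it is scanned recursively later.
                pending.push_back(path.clone());
                found.push(path);
            }
        }
    }
    Ok(found)
}

fn main() -> io::Result<()> {
    let wordlist = std::env::temp_dir().join("wordlist_sketch.txt");
    std::fs::write(&wordlist, "admin\nlogin\n")?;

    // Pretend only these two paths exist on the target.
    let found = scan(&wordlist, |p| p == "admin" || p == "admin/login")?;
    assert_eq!(found, vec!["admin", "admin/login"]);
    Ok(())
}
```

A real scan would probably also want a depth limit, since a target that answers 200 for everything would keep the queue growing forever.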

Thank you, again!