i just finished reading the book, and i tried to write a piece of code that do the following (No libraries, it's for learning purposes):
i have a file that contains a list of url paths, and i would like to hit a webserver, that is running using python simple http server.
the code should output the number of paths that are 404 not found, follow redirects, and check if they are available, and output the paths that are available.
This is the code i wrote
use std::env::args;
use std::process;
use std::fs::File;
use std::net::TcpStream;
use std::io::BufReader;
use std::io::prelude::*;
use std::option::Option;
fn main() -> std::io::Result<()> {
let help = || {
println!("Usage: app <addr:port> <dictionary>");
process::exit(1);
};
let addr = args().nth(1).ok_or_else(help).unwrap();
let dict = args().nth(2).ok_or_else(help).unwrap();
let file = File::open(&dict).expect(&format!("File {} does not exists!", dict));
let reader = BufReader::new(file);
let mut count_not_found = 0;
for line in reader.lines() {
let mut line = line?;
if ! line.starts_with("/") {
line = format!("/{}", line);
}
if fetch(line, &addr, &mut "".to_string()).is_none() {
count_not_found += 1;
}
}
println!("\nNot found: {}", count_not_found);
Ok(())
}
fn fetch(path: String, addr: &String, mut prefix: &mut String) -> Option<()> {
let mut stream = TcpStream::connect(&addr).expect(&format!("Could not connect to {}", addr));
let mut buf = String::new();
let headers = [
&format!("HEAD {} HTTP/1.1", &path),
"\r\n",
];
stream.write_all(headers.join("\r\n").as_bytes()).unwrap();
stream.read_to_string(&mut buf).unwrap();
if buf.contains("Bad request syntax") {
println!("'{}' -> is not a valid path!", path);
return None
}
let status = buf.lines().nth(0).unwrap().split_whitespace().nth(1).unwrap();
if status == "404" {
return None
} else if status == "301" || status == "302" || status == "307" {
buf.lines().for_each(|line| {
if line.starts_with("Location:") {
println!("[ {} ] {}", status, path);
prefix.push('-');
fetch(line.split_whitespace().nth(1).unwrap().to_string(), addr, &mut prefix).unwrap();
}
});
} else {
if prefix.is_empty() {
println!("[ {} ] {}", status, path);
} else {
println!("[ {} ] {} {}", status, prefix, path);
}
}
Some(())
}
can you please give me feedback on this code, and how can i make it better?
Here's a few things immediately that jump out at me (I don't usually do network programming so I'm not familiar with certain concepts and/or practices, so this'll be general rust).
You use of the help closure is okay, but using process::exit(1) is not. Try the following:
fn main() -> std::io::Result<(), Box<dyn std::error::Error>> {
let help = "Usage: app <addr:port> <dictionary>";
let addr = args().nth(1).ok_or(help)?;
let dict = args().nth(2).ok_or(help)?;
//...
//...
//...
}
Some people might argue against using Result<(), Box<dyn std::error::Error>>, and will probably provide better error handling crates, but I just use that given that this is an example which doesn't use any libraries on crates.io.
If you're being nitpicky about performance, you might be able to reuse an args() call by storing it, like so:
let mut sys_args = args();
let addr = sys_args.next().ok_or(help)?;
let dict = sys_args.next().ok_or(help)?;
if fetch(line, &addr, &mut "".to_string()).is_none() {
count_not_found += 1;
}
Should become
if fetch(line, &addr, &mut String::new()).is_none {
//...
}
You shouldn't accept &T if T: Deref<U> in most cases. This is to say, in this case, change &String to &str because &str is a reference to the text data directly while the &String is a reference to a structure meant to both hold the equivalent of &strand also allow it to re-allocate and manage it, but only if you've got a &mut String.
Looking at your code, you never use the mut in mut prefix. You should resolve the warnings present during compilation. You'd use it if you had something like this:
fn foo<'a, 'b: 'a>(mut bar: &'a mut String, list: &'b mut Vec<String>) {
//Reassign to `bar`:
bar = &mut list[0];
}
Returning an Option<()> is okay, but since you're already using a Result in main, you could use a Result<(), &'static str>, or a bool instead.
You use let mut stream = TcpStream::connect(&addr).expect(...), but it is already a &String in your original code and a &str in the code I've suggested you modify, so there's no reason to reborrow here (You're putting a &&String into a &str which means the compiler turns it into *&addr instead)
I would suggest using a match expression instead of a series of ifs when you check the status. It amounts to the same thing, but it better expresses the notion that you intend to do something no matter what the value of status is.
The fetch() function returns a bool, which you are checking when you call it from main, but you're not checking the return value when you call fetch recursively. I'm not sure whether that's intentional or not. Also, you're using true to indicate failure. which might not be obvious.
It looks like you're interested in counting how many requests returned a 404. In that case, maybe the simplest return type that expresses this is Result<(), usize>. This comes with the added benefit that you'll get a warning if you call fetch without checking its return value. You can always write let _ = fetch(...) to explicitly ignore its return value.
@adeschamps the recursive fetch is supposed to follow redirects, i guess the current implementation does not increment the number of 404 pages in case the redirect end up in a 404. i should maybe think of a way to do that.