Program to download a file from the web

Hello,

I'm trying again to learn Rust, and made this short program with ureq for practice. In particular I would like to know how to make it more "rustic", but I'm a programming novice, so any and all feedback is welcome, even if it isn't strictly related to the use of Rust. Thanks in advance.

#![warn(clippy::all, clippy::pedantic)]

use clap::Parser;
use std::fs::{self, File};
use std::path::Path;
use std::time::Duration;

#[derive(Parser, Debug)]
#[clap(version, about, long_about = None)]
struct Args {
    #[clap(help = "URL of the file to download")]
    url: String,
    #[clap(short, long, help = "Output filename")]
    output: Option<String>,
}

fn create_output_file(output_filename: &str) -> Result<File, String> {
    let output_file_path = Path::new(output_filename);

    // never overwrite the output file
    if output_file_path.exists() {
        return Err(format!(
            "output file {} already exists, aborting...",
            output_filename
        ));
    }

    match File::create(output_file_path) {
        Ok(file) => Ok(file),
        Err(e) => Err(format!(
            "failed to open file {} for writing; cause: {}",
            output_filename, e
        )),
    }
}

fn main() -> Result<(), String> {
    let args = Args::parse();

    // if the address doesn't specify the protocol, default to https
    let url = if args.url.contains("://") {
        args.url
    } else {
        format!("https://{}", args.url)
    };

    let output_filename = match args.output {
        Some(filename) => filename,
        // use the last component of the URL as a filename
        None => url
            .trim_end_matches('/')
            .split('/')
            .last()
            .unwrap()
            .to_string(),
    };
    let mut output_file = create_output_file(&output_filename)?;

    let agent = ureq::AgentBuilder::new()
        .timeout_read(Duration::from_secs(5))
        .timeout_write(Duration::from_secs(5))
        .build();

    let response = match agent.get(&url).call() {
        Ok(response) => response,
        Err(ureq::Error::Status(code, response)) => {
            fs::remove_file(output_filename).unwrap();
            return Err(format!(
                "server returned code {}: {}",
                code,
                response.status_text()
            ));
        }
        Err(_) => {
            fs::remove_file(output_filename).unwrap();
            return Err(String::from("Unknown error"));
        }
    };

    match std::io::copy(&mut response.into_reader(), &mut output_file) {
        Ok(_) => (),
        Err(e) => {
            fs::remove_file(output_filename).unwrap();
            return Err(format!("Could not write to file: {e}"));
        }
    }

    Ok(())
}

Looks fine.

If you change match args.output to match &args.output, you won't need .to_string() on the second arm. It's a tiny optimization that saves one allocation.

1 Like

I would maybe use a type like Url from the url crate (ureq already uses it internally) instead of String for the file address in the clap arguments. Maybe even try to write a newtype around Url that implements FromStr and enforces that the url must have a hostname.

I would certainly use PathBuf instead of String for the output path. And use unwrap_or_else for the default case. (The Url type has nice methods to get the last part of the url)

I would also get rid of all the file creation handling and just keep it to:

let output = args.output.unwrap_or_else(|| args.url.fragments().last()); // pseudo code
let mut output = BufWriter::new(File::create(&output)?);
if let Err(err) = io::copy(&mut response.into_reader(), &mut output) {
    // Depending on the error, remove the file or not 
}

If the user wants to overwrite a file, I say let them. Or at least don't create the file before you start the http request, that makes your error handling harder than it could be.

Possibly you could try and learn to use anyhow to add context to your errors. That would let you use the ? operator more liberally while keeping the ability to return nice errors.

All in all, only minor things.

2 Likes

File::create(output_file_path).map_err(|e| Err(format!("failed to open file {} for writing; cause: {}", output_filename, e))

1 Like

I did some more reading and rewrote parts of the program with your feedback in mind. Here's what I've got now:

#![warn(clippy::all, clippy::pedantic)]

use anyhow::{anyhow, Context};
use clap::Parser;
use std::fs::{self, File};
use std::io::BufWriter;
use std::path::PathBuf;
use std::time::Duration;
use url::Url;

#[derive(Parser, Debug)]
#[clap(version, about, long_about = None)]
struct Args {
    #[clap(help = "URL of the file to download")]
    url: Url,
    #[clap(short, long, help = "Output filename")]
    output: Option<PathBuf>,
}

fn form_output_filename(url: &Url) -> anyhow::Result<PathBuf> {
    let mut path_segments = url
        .path_segments()
        .with_context(|| format!("invalid URL: {}", url))?;

    // the url crate guarantees the iterator here contains at least one string, so unwrap is fine
    let last_segment = path_segments.next_back().unwrap();

    if last_segment.is_empty() {
        let host = url.host().context("unable to form output filename")?;
        Ok(PathBuf::from(host.to_string()))
    } else {
        Ok(PathBuf::from(last_segment))
    }
}

fn main() -> anyhow::Result<()> {
    let args = Args::parse();
    let agent = ureq::AgentBuilder::new()
        .timeout_read(Duration::from_secs(5))
        .timeout_write(Duration::from_secs(5))
        .build();

    let response = match agent.get(args.url.as_ref()).call() {
        Ok(response) => response,
        Err(ureq::Error::Status(code, response)) => {
            return Err(anyhow!(
                "server returned code {}: {}",
                code,
                response.status_text()
            ));
        }
        Err(_) => {
            return Err(anyhow!("Unknown error"));
        }
    };

    let output_path = match args.output {
        Some(filename) => filename,
        None => form_output_filename(&args.url)?,
    };
    let mut output_file = match File::create(&output_path) {
        Ok(file) => file,
        Err(e) => {
            return Err(anyhow!(
                "failed to open file {} for writing; cause: {}",
                output_path.display(),
                e
            ))
        }
    };

    if let Err(e) = std::io::copy(
        &mut response.into_reader(),
        &mut BufWriter::new(&mut output_file),
    ) {
        fs::remove_file(output_path).unwrap();
        return Err(anyhow!("Could not write to file: {e}"));
    }

    Ok(())
}

2 Likes

Very good :slight_smile:

The only thing I don't like is that you sometimes erase the source error, but I don't think that's really an issue. Here's how I would modify your program to my own tastes. But yours is already very good Rust, and mine could again be improved.

1 Like

This topic was automatically closed 90 days after the last reply. We invite you to open a new topic if you have further questions or comments.