Code-review - download (HTTP GET) to disk - error-handling, scalability

Why am I fighting the compiler? - Wrote this, which works, but I feel like I'm breaking every rule in the book:

#[macro_use]
extern crate lazy_static;

use std::ffi::OsString;
use std::path::PathBuf;
use std::str::FromStr;

use actix_rt;
use actix_web::client::Client;
use async_std::fs::File;
use async_std::prelude::*;
use http::uri::Uri;

// Result<OsString, (SendRequestError|std::core::convert::Infallible|actix_http::error::PayloadError)
#[actix_rt::main]
async fn download(uri: &'static str, target_dir: &'static str) -> Result<OsString, ()> {
    let client = Client::default();
    let uri = Uri::from_str(uri).unwrap();

    // Create request builder and send request
    let response = client
        .get(&uri)
        .header("User-Agent", "Actix-web")
        .send()
        .await; // <- Send http request

    println!("Response: {:?}", response);

    let get_host = || uri.host().unwrap().to_owned();

    let path = match &uri.path_and_query() {
        Some(path_and_query) => {
            let p = path_and_query.path().to_owned();
            if p == "/" {
                get_host()
            } else {
                p
            }
        }
        None => get_host(),
    };

    let output_pathbuf = PathBuf::from_str(target_dir).unwrap().join(path);
    let output_file = output_pathbuf.as_os_str();
    let mut file = File::create(output_file).await.unwrap();
    file.write_all(&response.unwrap().body().await.unwrap())
        .await
        .unwrap();
    // file.write_all(b"Hello, world!").await.unwrap();
    return Ok(output_file.to_owned());
}
#[cfg(test)]
mod tests {
    use std::borrow::Borrow;
    // use std::env::temp_dir;
    use std::fs::create_dir_all;
    use std::path::PathBuf;
    use std::str::FromStr;

    use crate::download;

    fn temp_dir() -> PathBuf {
        return PathBuf::from_str("/tmp").unwrap();
    }

    lazy_static! {
        pub static ref TEMP_DIR: PathBuf = temp_dir().join(module_path!());
        pub static ref TEMP_DIR_O: Option<&'static str> = TEMP_DIR.to_str();
        pub static ref TEMP_DIR_S: &'static str = TEMP_DIR_O.unwrap();
    }

    #[test]
    fn test_download() {
        if !TEMP_DIR.exists() {
            create_dir_all(TEMP_DIR.borrow() as &PathBuf).unwrap();
        }

        //let output_dir =
        download(
            "http://www.rust-lang.org",
            TEMP_DIR_S.borrow() as &'static str,
        )
        .unwrap();
        //assert_eq!(output_dir, "foo")
    }
}

Remarks:

  1. I understand that unwrap() is frowned upon, and should be replaced by error percolation (? sugar) and handled at the topmost level (that makes sense). Replacing the unwraps, and now it can return any of SendRequestError|std::core::convert::Infallible|actix_http::error::PayloadError. Should I Box/dyn a std::Error, create a big failure enum, or something cleaner?
  2. Not so happy with the types. I tried working with AsRef<PathBuf>, Into<Uri>, and variants thereof. Clearly that doesn't work.
  3. How do I get the Result?
  4. This is the first step in a multi-URL downloader, but I am worried about efficiency. Do I setup some sort of executor/reactor system atop async/await concepts?

PS: Originally posted error handling - HTTP download function in Rust (async/await) - Code Review Stack Exchange

Thanks for your review

Should I Box/dyn a std::Error, create a big failure enum, or something cleaner?

This is depend on you, what is the best way depend on how you want handle errors.

How do I get the Result?

Result from where?

Do I setup some sort of executor/reactor system atop async/await concepts?

You setup two executor/reactor systems at the same time.
tokio and async_std. I suppose you can solve the task with just one.

@Dushistov I'd like to retry if the error seemed to be some temporary problem, otherwise panic. Obviously I don't want to actually panic, as this would make the function unusable in a library context. But I would want to percolate the error up immediately, and in a CLI's main would panic (though the new main [can] return Result… so maybe even then I would percolate it up (with ?).

Maybe if I'm downloading 100000 files I don't want it to give me any errors until > 10% have errors, or there are more than 1 connection failed errors.

The Result of the download operation would be useful to have. For example, I might return a result of {using_cached: true, buffer: None, location: Path("somewhere")}`, and later—in multi-file context—a list of such.

So for this problem you would recommend pushing everything onto tokio?

I recommend using a boxed std::error::Error as your error type unless you are writing a library, in which case I recommend an enum.

@alice But isn’t it good practice to always assume what you're writing can be used in a library?

The reason I recommend an enum for libraries is that this allows the users of the library to inspect what actually went wrong, which is more difficult with a boxed error. If you are writing an application, you know whether you need to inspect the error, and more importantly, it is rather easy to update it to use an enum instead if it turns out that you later need to inspect the error.

Basically the difference is whether you are able to modify the function that returns a boxed error. When writing libraries, the user of that library can't do that, but if you're writing an application, you can.

This topic was automatically closed 90 days after the last reply. New replies are no longer allowed.