Ferium, my first Rust project!

Ferium

I've been learning Rust for the past few months and I felt that its time I finally make the program I've been dreaming of for many years, a package manager for Minecraft mods. This is my first Rust project that functions properly. I've just learned Rust library stuff, so the way I've organised my code might be confusing. Any improvements to documentation, code, UX , etc is much appreciated.

For now I've only been able to compile for macOS, I apologise in advance for not releasing Windows or Linux binaries

3 Likes

Nice work @theRookieCoder! It's always nice to see people creating cool stuff in Rust :crab:

Here are some thoughts I had while skimming your code. A lot of them are stylistic points, so feel free to disregard them if you feel differently :slightly_smiling_face:

I also haven't checked whether any of my suggestions compile.


use labrinth::funcs::*;
use octorok::funcs::*;
use readme::README;
use reqwest::Client;
use std::fs::{File, OpenOptions};
use std::io::Write;
use std::process::exit;
use util::{cli::*, json::*, wrappers::*};

Try to avoid glob imports.

I wanted to see how you had implemented command-line argument parsing and it took me a couple seconds to figure out that get_subcommand() probably came from the cli module. More seriously, I had no idea where get_config_file() came from at all and resorted to cloning the repo and searching for it locally because it was too hard looking for it on GitHub :disappointed:

Normally I would either explicitly import the get_subcommand() function (i.e. use util::cli::get_subcommand) or import the module and refer to it as cli::get_subcommand(). That way you aren't polluting your namespace and people can immediately see where things come from.


        SubCommand::Help { implied } => {
            println!("{}", README);
            if implied {
                exit(126);
            }
        }

If you don't have a help subcommand clap will actually generate one for you! That lets you drop the readme module and means you can't have issues where your help text and command-line args are out of sync because you added a new subcommand/flag and forgot to update readme.rs.

From clap's docs:

If you are using subcommands, clap will also auto-generate a help subcommand for you in addition to the traditional flags.

I haven't used clap directly for a while now (I prefer structopt's declarative style) but I believe it'll also do the same with version.


print(&format!("Adding repo {}/{}...", owner, repo_name));

This looks a bit strange. I probably would have made print() either accept any msg: impl Display so you can get rid of that &, or even declared my own macro which does the print!() call and automatically flushes stdout. - that way you can get rid of the leading format! bit and write something like print_now!("Adding repo {}/{}...", owner, repo_name).

You might also want to use the log crate instead of printing directly to the console.


pub async fn request(client: &Client, url: &str, relative: bool) -> Response {
    let url: String = if relative {
        // If provided URL is specified as relative, then prepend the base url
        format!("http://api.github.com{}", url).into()
    } else {
        // Else directly assign the URL
        url.into()
    };

    ...
}

This relative flag kinda bugs me because when I see the function being called I'm always going to need to check the source to see what the true or false refers to.

You could create a relative_request() function which lets you replace something like this...

request(client, &format!("/repos/{}/{}", owner, repo), true)

... with this:

relative_request(client, &format!("/repos/{}/{}", owner, repo))

Alternatively, you could over-engineer things by making the request() function always accept absolute URLs and then defining your own URL builder type/functions.

request(client, urls::github_repo(owner, repo))

pub mod funcs;
pub mod structs;

I think funcs and structs are pretty useless names for a module.

At the top of src/octorok/structs.rs you even say

/*
 * This file contains typed structs of the data structures used by the GitHub API
 */

What about calling it api_responses.rs?

You should also use doc-comments instead of normal comments. Normal comments are completely ignored by the compiler, whereas if you use doc-comments it'll appear in the pretty HTML documentation you get from cargo doc.

//! Data structures used by the GitHub API.

// src/octorok/structs.rs

use std::vec::Vec;

This import is unnecessary because Vec is accessible by default (it's in the prelude).


Another "pattern" I've seen you doing is to define functions that accept strings as &str (which is good!) but then pretty much every time they are used it's with temporary strings that are allocated at the use site with &format!(...).

Looking back at our request() function:

pub async fn request(client: &Client, url: &str, relative: bool) -> Response { ... }

request(client, &format!("/repos/{}/{}", owner, repo), true)

This looks kinda weird and you actually end up doing a bunch of string copies.

For example, you'll have one copy of the string in the calling function (the format!() call), then at the top of request() we call into() on the &str we passed in to turn it into another heap-allocated String, then we pass &url to client.get() which actually does another copy because the IntoUrl implementation for &str calls Url::parse() and Url contains a String with the parsed URL.

If you look at reqwest::Client's get() method you can see it actually accepts any url: impl IntoUrl, so you could avoid all those string copies by using helper functions which directly construct the right Url.


You should also look into proper error handling. It looks like whenever you encounter an error you'll just emit a print then immediately std::process::exit().

/// Get a mod using the `mod_slug`, which can also be the mod ID
pub async fn get_mod(client: &Client, mod_slug: &str) -> Mod {
    match request(client, &format!("/mod/{}", mod_slug), true)
        .await
        .json()
        .await
    {
        Ok(typed) => typed,
        Err(e) => {
            println!("JSON deserialisation failed due to {}", e);
            exit(122);
        }
    }
}

Calling std::process::exit() directly is undesirable for a couple reasons:

  • People won't be able to use this as a library because the moment anything unexpected occurs (e.g. a URL doesn't exist) the entire program will crash, and they've got no way of preventing it
  • Terminating the program immediately is kinda like stopping a computer by pulling out the power plug/battery. Sure it'll stop the program, but destructors won't be called meaning the Client won't have a chance to gracefully close connections and data written to open files won't be flushed to disk (i.e. potential data loss)

You also hard-coded the exit code. At the very least you'll want to move them to constants with doc-comments so I know the difference between an exit code of 122 and 124.

When I say "proper error handling" I mean that your functions should return a Result and use ?. If you introduce a custom error type that'll also let you differentiate between the various types of errors and exit with the correct error code at the very top of the program.

#[tokio::main]
fn main() {
  match run().await {
    Ok(_) => { /* exit successfully */ },
    Err(MyError::ThingDoesntExist(thing)) => {
        println!("Error: \"{}\" doesn't exist", thing);
        std::process::exit(122);
    },
    Err(MyError::ReqwestError(e)) => {
        println!("HTTP request failed: {}", e);
        std::process::exit(124);
    },
    ...
  }
}

async fn run() -> Result<(), MyError> {
  // contents of the previous `main()` function.
}

enum MyError {
  ThingDoesntExist(String),
  ReqwestError(reqwest::Error),
  ...
}

You may also want to look into the anyhow or thiserror crates.


Your get_config() function can actually accept anything implementing std::io::Read, not just a File.
I'd also argue you are doing something more than just "getting" the config, you are parsing it from a string representation.

use std::io::Read;

pub fn parse_config<R: Read>(reader: R) -> Config {
    // Read file contents
    let mut contents = String::new();
    match reader.read_to_string(&mut contents) {
        Ok(_) => (),
        Err(e) => {
            println!("Could not read file due to {}", e);
            exit(120);
        }
    }

    // Try deserialising contents and return it if successful
    match from_str(&contents) {
        Ok(config) => config,
        Err(e) => {
            println!(
                "Config file deserialisation failed due to {}. Check that the formatting is correct!",
                e
            );
            exit(120);
        }
    }
}

But it gets better. Turns out serde_json has a from_reader() function which accepts a reader directly without the temporary string!

use std::io::Read;

pub fn parse_config<R: Read>(reader: R) -> Config {
    match serde_json::from_reader(reader) {
        Ok(config) => config,
        Err(e) => {
            println!(
                "Config file deserialisation failed due to {}. Check that the formatting is correct!",
                e
            );
            exit(120);
        }
    }
}

If we use the error-handling suggestions from earlier we can also get rid of the match statement.

use std::io::Read;

pub fn parse_config<R: Read>(reader: R) -> Result<Config, serde_json::Error> {
    serde_json::from_reader(reader)
}

And now we can see the parse_config() function wasn't needed at all. You could have just done this in main():

let config: Config = serde_json::from_reader(config_file)?;

(assuming main() returns a Result<(), SomeErrorType> where you have defined SomeErrorType to have a From<serde_json::Error> impl)


You shouldn't be using string operations to create paths.

let config_file_path = format!("{}{}", config_file_dir, "config.json");
let config_file_path = Path::new(&config_file_path);

Instead, the Path type has a join() method which is often a lot nicer and side-steps a lot of issues you can have when dealing with filesystem paths (/ vs \, invalid characters, etc.).

let home = tilde("~");
let config_file_path: PathBuf = home.join(".ferium").join("config.json");

Your write_to_config() function does a bunch of set_len() and seek()'s. If it were me I'd just pass in the config file's path and use File::create() to truncate the file to get something close to this:

pub fn save_config(config_file: &Path, config: &Config) -> Result<(), SomeErrorType> {
  let f = File::create(config_file)?;
  serde_json::to_writer_pretty(f, config)?;

  Ok(())
}
13 Likes

Thanks so much for going through my code and making so many suggestions, I really appreciate it! I will definitely implement all of these!

The suggestion about handling errors is very helpful. I really should have done this before, using match and exits everywhere is really messy. (the reason I used exits is because that is the de facto way of ending a program in Python and Dart, I didn't realise it was so destructive!)

I also have one (rather unimportant) problem that is the project itself. The compilation time is almost 6 minutes. But more annoyingly when I make a change, it takes almost 5-10 seconds for the errors and warnings to update. This breaks my continuity and feels quite interruptive. Is this normal in Rust projects? Am I just spoiled by modern IDEs and programming languages? Is there a way to improve this? (making Labrinth separate crate for example)

1 Like

Rust compilation is often slower than we would like, and some things we want to use (e.g. async code and lots of declarative macros) tend to make it even slower. But there are some things you can do to make it faster, and some is actually included in the suggestions by @Michael-F-Bryan above, such as avoiding * imports and using Path::join rather than format! for building paths. Also, in general, smaller functions, files, and crates helps.

Also, cargo check tends to be faster (sometimes a lot faster) than cargo build or cargo test.

That said, I haven't really looked at your code yet, there may or may not be something specific there that makes it slower than it should be.

1 Like

I'm not sure about Dart, but something to note is that Python's sys.exit function doesn't actually call the native exit function the way Rust does, it actually just throws a Python exception (SystemExit) - this actually will run destructors as it bubbles up the call stack.

As this exception is not really intended to be caught, the equivalent in Rust would probably be a panic. However, just as it would probably be preferred in Python to throw a more meaningful exception for each possible failure cause, it is definitely preferred in Rust to return meaningful Result types (as Michael-F-Bryan explained above).

3 Likes

Note that these won't actually improve compile times.

Rust isn't like C's #include where importing something will suddenly add a bunch of extra stuff that the compiler needs to parse and typecheck, unnecessary imports just mean a couple of extra dictionary lookups and a tiny bit more bookkeeping.

Similarly I'd be surprised if the format!() macro had any noticeable difference on compile times.

Instead, the vast majority of slow build times are due to having lots of dependencies with slow build times. As one data point, there's a project which uses tflite (a Rust interface to the TensorFlow Lite C++ library) and if that needs to be recompiled it immediately adds 2-5 minutes onto build times because we end up building half the TensorFlow codebase.

This is quite surprising. Are you using a reasonably recent version of Rust?

Over the last year or so a lot of effort has been put into incremental compilation, and even on that slow project I mentioned earlier I will normally get feedback within about a second of hitting save (I use cargo watch to automatically run cargo check, cargo test, and cargo doc whenever a file changes).

For some numbers, I ran cargo check (just typecheck the crate and its dependencies) on my computer and it took 23 seconds. On the other hand, cargo build took 43 seconds when compiling from scratch.

When you are making incremental changes you can often get feedback within less than a second because cargo check only does type checking and doesn't actually generate a binary.

Hey guys! I'm back from my exams and I've implemented most of @Michael-F-Bryan's suggested changes in this commit. The only suggestions I haven't implemented yet are:

  • Error handling. I'm a bit confused by this so I've implemented all the other changes first. For now I'm unwrapping everything (there are no exit()s don't worry). Because of the lack of error handling I will not be releasing this commit as a release
  • The suggestion on using logging. I'm actually not logging info I'm providing feedback to the user that something is happening. I'm working on adding coloured output to help with the bland look (suggestions for cross platform crates for this is welcome)

I think @matklad put it really well...

You can implement some really top quality error handling in Rust, but because there are so many different ways to go about it you need experience to know which approach is best for a particular application. It also helps when you design good error handling into your application from the start because handling unexpected situations is as integral to your program's control flow as the main business logic (arguably more-so).

In your case, the easiest way to implement error handling is by using the Error type from the anyhow crate. .

To convert your unwrap() code into something which handles errors gracefully instead of panicking you would...

  1. Make your main() function return a Result<(), anyhow::Error> and put an Ok(()) at the bottom... When main() returns an Err it'll print the error message and exit with an unsuccessful error code, and when it returns Ok your program will exit with a successful exit code
  2. Search for any functions that may fail (searching for unwrap() and expect() will get you pretty far)
  3. Make the function return a Result<_, anyhow::Error>
  4. Replace the unwrap() with a ? so your code returns early if it encounters an error
  5. Wrap the function's previous return value with Ok()
  6. Follow the compile errors until everything builds again

I'd recommend starting small and working your way up. Find one function that is called by main() and switch that over to returning Result, then make sure everything builds again before moving to the next function.

Once most of your code is switched over to returning Results we can start improving the error messages. The easiest way to do this is by adding extra "context" to an error. Have a look at the anyhow::Context trait for examples and an explanation of what it'll do to the error message printed when you return from main().

3 Likes

This topic was automatically closed 90 days after the last reply. We invite you to open a new topic if you have further questions or comments.