Please review my first crate :D!

As a group project for one of my university courses, I created a rust crate that has optimization algorithms for scheduling processes/jobs on machines.
This is my first crate/bigger-actual project so Im sure there is a lot to fix.
Currently, the first main feature of my crate is working which is the Schrage algorithm (it's described in the docs).

It would be very nice if you would like to download/install this crate and play with it :relaxed:

There are many bad things there :joy: but the first one I can see already is that even though I added a special HTML header docs on crate io still does not render LaTeX (locally it works) so any advice here would be a huge help :grin:

Here is my crate on crates.io:
https://crates.io/crates/proc-opt

and on Github:

No giant blunders I can see. Some suggestions from a read of the source:

The introductory docs defining what you mean by a job, and what optimizing a job is, are great; my only note is don't spend too long getting to showing some actual usage code.

In practice, you're going to want to sort either a list of objects implementing a Job trait or returning the sorted indices: both can be used to build the other quite easily, so pick whichever is reasonable given the number of times you need to get the times. Without something like this, you currently have no way to get the actual jobs that do things sorted.

You don't need to manually implement PartialEq, you can derive that too.

I'm not sure the JobList type is pulling it's weight: using free functions to sort instead seems like it would simplify a lot of code a little bit.

If possible, you should define a common interface at the module top level (Job, at least). If that's not possible, perhaps these different algorithms should be different crates.

Taking theory into practice, real machines/cores generally aren't filled by a single job, but that seems out of scope.

2 Likes

Hi, yes well when it comes to different Job structs, given that this is an assignment and we weren't sure how much we can implement in time (I'm planning on developing this crate after this semester but for the time being we need to have something semi-working and ready even if its ugly) and given that different algorithms can behave differently and give answers to fundamentally different questions we just wanted to stay safe

impl PartialEq for JobList {
    fn eq(&self, other: &Self) -> bool {
        if self.jobs.len() != other.jobs.len() {
            return false;
        }
        for (i, j) in self.jobs.iter().enumerate() {
            if j != &other.jobs[i] {
                return false;
            }
        }
        true
    }
}

The variable naming is confusing: i and j typically denote integer iteration indices, but here j is a Job. The mixed iteration mode is also confusing: either iterate by indices

for i in 0..self.jobs.len() {
    if self.jobs[i] != other.jobs[i] { return false; }
}

or iterate by value using iterator combinators

for (lhs, rhs) in self.jobs.iter().zip_eq(&other.jobs) {
    if lhs != rhs { return false }
}

Also, how is this function different from self.jobs == other.jobs?

I also agree that JobList type is unlikely to pull its weight. Just work with Vec's and inline the methods.

Once you do so, it seems to make sense for functions schrage and part_time_schrage to take their jobs argument by value. As far as I can tell from a cursory glance, you use it in a single place to create a new shortest_delivery_jobs vector by cloning and sorting. It is generally good practice to move the decision to clone to the consumer of your API, since they have more information on whether they need to clone or may pass some owned data which is no longer needed, avoiding redundant allocations.

This will also simplify support of custom allocators and #![no_std] targets, if that is ever needed.

schrage and part_time_schrage also seem to be largely the same. Consider refactoring them to avoid duplicating so much code. A boolean (or custom enum) parameter is the simplest way to do it, but some better API may also be possible.

ready_to_run.jobs.append(&mut vec![current_job]);

That is quite redundant. Just use

ready_to_run.jobs.push(current_job);

I wonder if Clippy lints against this?

let cooldown_times = ready_to_run.sorted_by_cooldown_time();
let max_cooldown_time = cooldown_times.last().unwrap();

Again, you create an entire new vector just to take a single element out of it, discarding everything else. There is a better way to do it:

let max_cooldown_time = ready_to_run.iter().max_by_key(|job| job.cooldown_time).unwrap();

Considering the following code, maybe you should just sort the ready_to_run vector by cooldown_time and pop the last element? This would also allow to avoid redundant unwrap in the code (since the block executes only if the vector is non-empty, so the max element always exists).

2 Likes