A sequence alignment algorithm for bioinformatics

Hello,
Here is the code:

I have written a Rust implementation of the Needleman-Wunsch global sequence alignment algorithm. It takes two strings and finds an optimal alignment between them by filling in a similarity matrix and then tracing a path back through it.
Example input:
AGTCTTA
CCTGTA
Output:

Seq1: ['A', 'G', 'T', 'C', 'T', 'T', 'A']
Seq2: ['C', 'C', 'T', 'G', '-', 'T', 'A']
Similarity score was: -2
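
For readers who have not seen the algorithm, the matrix fill and traceback described above can be sketched roughly as follows. This is not the code from the post: the function names `align` and `substitution` are hypothetical, and the scoring values (+1 match, -1 mismatch, -2 gap) are an assumption that happens to be consistent with the score shown above. Ties in the traceback are broken in favour of the diagonal, so the printed alignment may differ from the one above while having the same score.

```rust
// Assumed scoring scheme (not taken from the original code).
const MATCH: i32 = 1;
const MISMATCH: i32 = -1;
const GAP: i32 = -2;

/// Hypothetical helper: score for placing two characters in the same column.
fn substitution(a: char, b: char) -> i32 {
    if a == b { MATCH } else { MISMATCH }
}

/// Returns (aligned_a, aligned_b, score) for a global alignment of `a` and `b`.
fn align(a: &str, b: &str) -> (String, String, i32) {
    let a: Vec<char> = a.chars().collect();
    let b: Vec<char> = b.chars().collect();
    let (n, m) = (a.len(), b.len());

    // score[i][j] = best score for aligning a[..i] with b[..j].
    let mut score = vec![vec![0i32; m + 1]; n + 1];
    for i in 1..=n {
        score[i][0] = i as i32 * GAP;
    }
    for j in 1..=m {
        score[0][j] = j as i32 * GAP;
    }
    for i in 1..=n {
        for j in 1..=m {
            let diag = score[i - 1][j - 1] + substitution(a[i - 1], b[j - 1]);
            let up = score[i - 1][j] + GAP; // gap in b
            let left = score[i][j - 1] + GAP; // gap in a
            score[i][j] = diag.max(up).max(left);
        }
    }

    // Trace back from the bottom-right corner, preferring the diagonal on ties.
    let (mut i, mut j) = (n, m);
    let (mut out_a, mut out_b) = (Vec::new(), Vec::new());
    while i > 0 || j > 0 {
        if i > 0 && j > 0
            && score[i][j] == score[i - 1][j - 1] + substitution(a[i - 1], b[j - 1])
        {
            out_a.push(a[i - 1]);
            out_b.push(b[j - 1]);
            i -= 1;
            j -= 1;
        } else if i > 0 && score[i][j] == score[i - 1][j] + GAP {
            out_a.push(a[i - 1]);
            out_b.push('-');
            i -= 1;
        } else {
            out_a.push('-');
            out_b.push(b[j - 1]);
            j -= 1;
        }
    }
    out_a.reverse();
    out_b.reverse();
    (out_a.into_iter().collect(), out_b.into_iter().collect(), score[n][m])
}

fn main() {
    let (s1, s2, score) = align("AGTCTTA", "CCTGTA");
    println!("Seq1: {s1}\nSeq2: {s2}\nSimilarity score was: {score}");
}
```

Returning the aligned sequences and the score, rather than only printing them, is one possible design choice; it keeps the function usable from other code.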

I am new to Rust (and pretty new to programming in general), and this project is also the first time I have used git. I am looking for advice on how to make my code rustier and for ideas on next steps I could take in my learning. If you know about bioinformatics, ideas for a next project or for how to expand this one would be great. I'm thinking about writing a FASTQ parser and using this or another alignment algorithm to go all the way to SAM output.
Any advice is greatly appreciated!

> I am new to Rust (and pretty new to programming in general), and this project is also the first time I have used git. I am looking for advice on how to make my code rustier and for ideas on next steps I could take in my learning.

As long as your code is useful, there is probably no need to worry too much about making it "rustier". Nice to hear that you're learning git; that's already a great step.

Having said that, I think the code looks fine. It looks like typical algorithmic code to me. You might want to run cargo fmt to format your code, since there are some spaces in places where they shouldn't be.

Another tip, where it makes sense, is to move some code out of a long function into a separate function. A comment like // Calculate the area can usually be replaced by a function with a descriptive name.

For example,

use std::f64::consts::PI;

fn main() {
    let width = 5.0;
    let height = 10.0;
    let radius = 7.0;

    // Calculate area of rectangle
    let area = width * height;
    println!("Area of the rectangle: {}", area);

    // Calculate circumference of a circle
    let circumference = 2.0 * PI * radius;
    println!("Circumference of the circle: {}", circumference);
}

can be rewritten to

use std::f64::consts::PI;

fn rectangle_area(width: f64, height: f64) -> f64 {
    width * height
}

fn circle_circumference(radius: f64) -> f64 {
    2.0 * PI * radius
}

fn main() {
    let width = 5.0;
    let height = 10.0;
    let radius = 7.0;

    let area = rectangle_area(width, height);
    println!("Area of the rectangle: {}", area);

    let circumference = circle_circumference(radius);
    println!("Circumference of the circle: {}", circumference);
}

The main benefit is that comments often become outdated and can then cause confusion. A function name is much less likely to go stale: it is part of the code, so the compiler checks every call against the function's signature, and a rename has to be applied at every call site or the code will not compile.

Extracting functions like this generally does not affect performance. In optimized builds the compiler will usually inline small functions like these again.

Oh, and you might want to look into tests too. With tests you can "pin down" your algorithm, because each test states which output Y you should get for input X. Once you have tests, the computer can automatically check whether the algorithm still produces the expected results after you change it.

For example, you could add the following to your src/main.rs:

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn test_nwallign() {
        let read1 = "AGCTG";
        let read2 = "ACTG";
        let result = nwallign(read1, read2);
        assert_eq!(result, 2);
    }
}

Now if you run cargo test, the output is

$ cargo test
    Finished `test` profile [unoptimized + debuginfo] target(s) in 0.05s
     Running unittests src/main.rs (target/debug/deps/nw_alg-1703a3c356a44f3b)

running 1 test
test tests::test_nwallign ... ok

test result: ok. 1 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.00s

If I change the assert_eq!(result, 2) line to assert_eq!(result, 3), then the output becomes

$ cargo test
   Compiling nw-alg v0.1.0 (/Users/rik/git/NW_alg_rust)
    Finished `test` profile [unoptimized + debuginfo] target(s) in 0.15s
     Running unittests src/main.rs (target/debug/deps/nw_alg-1703a3c356a44f3b)

running 1 test
test tests::test_nwallign ... FAILED

failures:

---- tests::test_nwallign stdout ----
Seq1: ['A', 'G', 'C', 'T', 'G']
Seq2: ['A', '-', 'C', 'T', 'G']
Similarity score was: 2
thread 'tests::test_nwallign' panicked at src/main.rs:101:9:
assertion `left == right` failed
  left: 2
 right: 3
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace


failures:
    tests::test_nwallign

test result: FAILED. 0 passed; 1 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.00s

error: test failed, to rerun pass `--bin nw-alg`
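
Besides pinning one known score, tests can also check properties that should hold for any global alignment. The sketch below is only an illustration: `nw_score` is a hypothetical score-only stand-in for the real function, and its scoring values (+1 match, -1 mismatch, -2 gap) are an assumption, so the expected numbers would need adjusting to whatever scheme the real code uses.

```rust
/// Hypothetical stand-in for the function under test: returns only the
/// global alignment score. Scoring values are assumptions.
fn nw_score(a: &str, b: &str) -> i32 {
    const MATCH: i32 = 1;
    const MISMATCH: i32 = -1;
    const GAP: i32 = -2;

    let b: Vec<char> = b.chars().collect();
    // prev[j]: best score for aligning the prefix of `a` seen so far with b[..j].
    let mut prev: Vec<i32> = (0..=b.len() as i32).map(|j| j * GAP).collect();
    for ca in a.chars() {
        let mut curr = Vec::with_capacity(prev.len());
        curr.push(prev[0] + GAP);
        for (j, &cb) in b.iter().enumerate() {
            let diag = prev[j] + if ca == cb { MATCH } else { MISMATCH };
            let up = prev[j + 1] + GAP;
            let left = curr[j] + GAP;
            curr.push(diag.max(up).max(left));
        }
        prev = curr;
    }
    prev[b.len()]
}

fn main() {
    println!("{}", nw_score("AGCTG", "ACTG"));
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn identical_sequences_score_one_match_per_base() {
        assert_eq!(nw_score("ACGT", "ACGT"), 4);
    }

    #[test]
    fn empty_sequence_costs_one_gap_per_base() {
        assert_eq!(nw_score("", "ACG"), -6);
        assert_eq!(nw_score("ACG", ""), -6);
    }

    #[test]
    fn score_does_not_depend_on_argument_order() {
        assert_eq!(
            nw_score("AGTCTTA", "CCTGTA"),
            nw_score("CCTGTA", "AGTCTTA")
        );
    }
}
```

Edge cases such as empty input are worth covering early, because that is where index arithmetic in matrix code tends to go wrong.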