Hi there!
I've written a little tool in Rust. Normally I'd have written it as a bash script, but I wanted to use the opportunity to improve my rudimentary Rust skills. It reads two sorted text files line by line (each line starts with 32 characters that represent the MD5 hash of a file) and outputs three lists:
- of files only contained in list1
- of files only contained in list2
- of files contained in both lists.
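To make the intended result concrete, here is a minimal sketch of the three-way split on in-memory data. It is not the code under review: the names key and split_sorted are hypothetical, and it assumes both lists are sorted by their first 32 characters and contain no duplicate hashes.

```rust
use std::cmp::Ordering;

/// Hypothetical helper: the key of a line is its first 32 bytes (the MD5 hex digest).
/// Lines shorter than that are compared as a whole.
fn key(line: &str) -> &str {
    line.get(..32).unwrap_or(line)
}

/// Splits two sorted lists into (only in list1, only in list2, in both).
/// Assumes both inputs are sorted by key and free of duplicate keys.
fn split_sorted<'a>(
    list1: &[&'a str],
    list2: &[&'a str],
) -> (Vec<&'a str>, Vec<&'a str>, Vec<&'a str>) {
    let (mut only1, mut only2, mut shared) = (Vec::new(), Vec::new(), Vec::new());
    let (mut i, mut j) = (0, 0);
    while i < list1.len() && j < list2.len() {
        match key(list1[i]).cmp(key(list2[j])) {
            // list1 is behind: its current entry cannot appear in list2.
            Ordering::Less => {
                only1.push(list1[i]);
                i += 1;
            }
            // list2 is behind: its current entry cannot appear in list1.
            Ordering::Greater => {
                only2.push(list2[j]);
                j += 1;
            }
            // Same hash on both sides: record it once and advance both.
            Ordering::Equal => {
                shared.push(list1[i]);
                i += 1;
                j += 1;
            }
        }
    }
    // Whatever is left after one list runs out is unique to the other.
    only1.extend_from_slice(&list1[i..]);
    only2.extend_from_slice(&list2[j..]);
    (only1, only2, shared)
}
```

Calling split_sorted with two slices of lines returns the three lists as a tuple; note that the leftovers of either list are flushed once the other one is exhausted.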
I managed to get to a working solution. What I'm looking for now is general feedback on how I could have done better, and in particular how I could adapt my solution to adhere to the DRY principle (Don't Repeat Yourself): the if / else if block that compares line1 with line2 at the top of the for loop is repeated after the inner loop (lines 25-36 equal lines 65-76 in all but indentation).
Here's my code so far:
use std::io::{BufReader, BufRead, Write, self};
use std::fs::{File};

const PATH1: &str = "md5sums1_sorted.csv";
const PATH2: &str = "md5sums2_sorted.csv";

fn main() -> io::Result<()> {
    println!("targeted task: read two md5lists and find out which files are unique to each one, and which are shared\n");

    let mut only1 = File::create("only_in_list1.csv")?;
    let mut only2 = File::create("only_in_list2.csv")?;
    let mut shared = File::create("shared_files.csv")?;

    let file1 = File::open(PATH1)?;
    let file2 = File::open(PATH2)?;
    let reader1 = BufReader::new(file1);
    let mut reader2 = BufReader::new(file2);

    let mut line2 = String::new();
    let mut line2_old: String;
    let mut line2_matched = false;
    reader2.read_line(&mut line2)?;

    for line1o in reader1.lines() {
        let line1 = line1o?;
        println!("loop over file1, comparing lines: {} : {}", line1, line2);
        if line2.len() > 32 && line1[0..32].eq(&line2[0..32]) {
            println!("contained in both files (1): {line1} = {line2}");
            _ = write!(shared, "{line1} = {line2}");
            line2_matched = true;
        } else if line2.len() < 32 || line1[0..32].lt(&line2[0..32]) {
            if line2.len() > 32 {
                println!("{} < {}", &line1[0..32], &line2[0..32]);
            }
            println!("only in file1 (1): {line1}\n");
            _ = writeln!(only1, "{line1}");
        } else {
            println!("{} > {}", &line1[0..32], &line2[0..32]);
            if !line2_matched {
                println!("only in file2 (1): {line2}");
                _ = write!(only2, "{line2}");
            } else {
                line2_matched = false;
            }
            loop {
                line2_old = line2;
                line2 = String::new();
                let read = reader2.read_line(&mut line2)?;
                println!("read new line2 = {line2}");
                if read != 0 && line2.len() > 32 && line2[0..32].lt(&line1[0..32]) {
                    if !line2_old[0..32].eq(&line2[0..32]) {
                        println!("only in file2 (2): {line2}");
                        _ = write!(only2, "{line2}");
                    } else {
                        println!("file2 contained a duplicate: {line2_old} = {line2}")
                    }
                } else {
                    break;
                }
            }
            if line2.len() > 32 && line1[0..32].eq(&line2[0..32]) {
                println!("contained in both files (2): {line1} = {line2}");
                _ = write!(shared, "{line1} = {line2}");
                line2_matched = true;
            } else if line2.len() < 32 || line1[0..32].lt(&line2[0..32]) {
                if line2.len() > 32 {
                    println!("{} < {}", &line1[0..32], &line2[0..32]);
                }
                println!("only in file1 (2): {line1}\n");
                _ = writeln!(only1, "{line1}");
            }
            println!("\n\nend of loop - is anything missing here?\n\n");
        }
    }
    Ok(())
}
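One possible direction for the duplication, sketched under assumptions rather than offered as a finished answer: the repeated comparison could move into a single function that both places call. The names Outcome and compare_and_write are hypothetical. The sketch assumes line2 is either empty (file2 exhausted) or longer than 32 bytes because it still carries its trailing newline from read_line, and it leaves out the debug println! calls.

```rust
use std::cmp::Ordering;
use std::io::{self, Write};

/// Hypothetical result of comparing the current line1 with the current line2.
#[derive(Debug, PartialEq)]
enum Outcome {
    Shared,      // same hash: written to `shared`
    OnlyInFile1, // line1 sorts before line2, or file2 is exhausted: written to `only1`
    File2Behind, // line1 sorts after line2: nothing written, caller must advance file2
}

/// Compares the first 32 bytes of both lines and writes line1 where it belongs.
fn compare_and_write(
    line1: &str,
    line2: &str,
    shared: &mut impl Write,
    only1: &mut impl Write,
) -> io::Result<Outcome> {
    // Assumption: a usable line2 is longer than 32 bytes; anything else means EOF.
    let key2 = if line2.len() > 32 { line2.get(..32) } else { None };
    let key1 = line1.get(..32).unwrap_or(line1);
    match key2.map(|k2| key1.cmp(k2)) {
        Some(Ordering::Equal) => {
            // line2 already ends with a newline, hence write! instead of writeln!.
            write!(shared, "{line1} = {line2}")?;
            Ok(Outcome::Shared)
        }
        None | Some(Ordering::Less) => {
            writeln!(only1, "{line1}")?;
            Ok(Outcome::OnlyInFile1)
        }
        Some(Ordering::Greater) => Ok(Outcome::File2Behind),
    }
}
```

Both copies of the block would then shrink to one call each, with line2_matched set whenever the result is Outcome::Shared. Because the writers are generic over Write, the function can be exercised with Vec<u8> buffers instead of real files, and write errors propagate through ? rather than being discarded.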
p.s.: In the unlikely case that anybody finds the above code useful for any purpose, you're free to copy, modify, or republish it wherever you like; please just reference where you found it and my username.
p.p.s.: My request for general feedback is not meant as an invitation to scorn me for my abhorrently bad error handling; this was never meant to be more than a "quick and dirty" solution. I've only added checks for line2 being at least 32 characters long because the way I'm reading file2 gives me an empty string after I've finished reading it.
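For context on that empty string: read_line returns Ok(0) once the input is exhausted and leaves the buffer untouched, so a fresh String stays empty. A minimal sketch of how the end of file2 could be made explicit instead of being inferred from the line length follows; the helper name next_line is hypothetical, and unlike the code above it strips the trailing newline.

```rust
use std::io::{self, BufRead};

/// Hypothetical helper: returns Ok(None) at end of input instead of an empty string.
/// A trailing "\n" or "\r\n" is removed from the returned line.
fn next_line<R: BufRead>(reader: &mut R) -> io::Result<Option<String>> {
    let mut line = String::new();
    // read_line reports the number of bytes read; 0 means end of input.
    if reader.read_line(&mut line)? == 0 {
        return Ok(None);
    }
    if line.ends_with('\n') {
        line.pop();
        if line.ends_with('\r') {
            line.pop();
        }
    }
    Ok(Some(line))
}
```

With something like this, the current line of file2 could be held as an Option<String>, so "file2 is finished" and "file2 contains a short line" are no longer conflated.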