Counting letters

Would you please comment on this code? Any suggestions are welcome. It all works as I want to, but I would like to learn if it can be done better.

The purpose of the program is to open a file of your choice, and count the number of times a letter of choice occurs in the file, upper or lower case

I translated to english for this forum, but not the variables because I am not sure I can use the same names in English in Rust. So

bestand = file
teller = counter
karakter = character

and I think the other names are the same in Dutch and English.

use std::env;
use std::fs::File;
use std::io::{self, Read};

fn main() -> io::Result<()> {
    println!("\n\n");
    let args: Vec<String> = env::args().collect();

    assert!(
        args.len() == 3 && args[2].len() == 1,
        "Use {} <filename> <letter>\n",
        args[0]
    );

    let letter = args[2].chars().next().unwrap();

    assert!(
        letter.is_ascii_alphabetic(),
        "Use {} <filename> <letter>, no other characters\n",
        args[0]
    );

    let mut bestand = File::open(args[1].as_str())?;
    let mut buffer = [0u8; 1];
    let mut teller = 0;

    while bestand.read_exact(&mut buffer).is_ok() {
        let karakter = buffer[0] as char;
        if karakter.to_ascii_uppercase() == letter.to_ascii_uppercase() {
            teller += 1;
        }
        print!("{}", karakter);
    }

    println!(
        "\n The letter {} occurred {} times in {}, upper or lower case\n",
        letter, teller, args[1]
    );

    Ok(())
}

Thank you!

2 Likes

don't do file IO a single byte a time, do it in large batch, then do your calculation over the in-memory buffer.

also, since you are only dealing with ascii alphabets, you can do the compare in byte directly, no need to iterate in chars.

use iterator filter() and count() is more idiomatic. if you need to do somethingwith the filtered bytes during the process (e.g. to print the character) , you can use inspect().

let target = args[2].as_bytes()[0];
let content = fs::read(args[1].as_str())?;
// without `inspect()`
let total = contents.iter().filter(|byte| byte.eq_ignore_ascii_case(&target)).count();
8 Likes

Yeah, either do everything in u8, and panic on non-ascii input, or deal with unicode and use char. That will allow you to count é and ë, for example.

1 Like
  • Do not assume users will only ever want to count in a file; read from stdin so your program can be used in a pipeline.
  • Use Clap derive macros for parsing and validating CLI arguments; don't do it by hand (and especially do not assume that users give the correct number of arguments).

A shorter, more elegant version..

8 Likes

Very interesting! Thanks for these ideas!

One tiny issue with this solution (which I like) is if I want to count \n I'll get 0.

1 Like

That's an easy fix.

Very elegant. But at this moment also full of things I haven’t learned yet, so something to look forward to :slight_smile:

Specifically what is it you don't understand about it? It can't be "full" of those, because there are only two changes compared to your version:

  • the first one is that parsing the command line arguments is done via Clap's derive macro to avoid boilerplate. Surely you have already used derive macros; if not, here's the relevant chapter in The Book.
  • the second one is reading the file line-by-line, which allows you to use iterators. I'm pretty sure you already have used iterators, because they are the mechanism behind for loops.
2 Likes

It is not so much that I don’t understand it, but a combination of two things:

  1. Some techniques I just didn’t reach yet in the Rust book
  2. A way of thinking I am not used to. When thinking about a solution I notice my mind works along the lines I am used to in C and then translated to Rust. And I know, my way of thinking isn’t dictated by C either, it is just me.

So that is why I love people like you showing other, better, ways to use Rust and to approach a problem. My way of learning is doing, trying and asking & watching masters doing it.

1 Like

It is of course possible to write this in a more prodecural C-style way, which might me more intuitive to you.

    let mut r = BufReader::new(io::stdin());
    let mut buf = String::new();
    let mut count = 0;
    loop {
        match r.read_line(&mut buf)? {
            0 => break,
            _ => {
                count += buf.chars().filter(|&c| c == args.ch).count();
                buf.clear();
            }
        }
    }

Or maybe

    let mut r = BufReader::new(io::stdin());
    let mut buf = String::new();
    let mut count = 0;
    while r.read_line(&mut buf)? > 0 {
        count += buf.chars().filter(|&c| c == args.ch).count();
        buf.clear();
    }

As you can see, there are often many ways to write something. You can mix-and-match functional and procedural styles as you wish. If you run clippy regularly, it'll slowly nudge you towards an idiomatic rust code style.

5 Likes

One minor comment on some of the solutions above: Wrapping Stdin in BufReader is sub-optimal, since Stdin already contains a BufReader. Allocating a second one will waste memory and can actually make things slower instead of faster.

Instead, you can call Stdin::read_line directly, or use Stdin::lock if you want to use the lines() iterator.

(Clippy should add a lint against this...)

7 Likes

One point which nobody else has mentioned (I'm new myself, so may be wrong). Assert! should really only be used to check for things that should never happen in your program, rather than things that ought not happen, such as invalid user input. A panic seems something of an overkill for something so mild.

2 Likes