Working with primitive data type CHAR

I am reading in a binary file, and I want to convert each byte I read into a char. I looked at the std::ascii examples I could find, but it seems they have been deprecated? I want to try to make something like this:

        let mut ascii_string: String;
        if (! byte_pair[0].is_ascii_printable())
        {
            ascii_string.push(byte_pair[0].as_char())
        }

I have found this:

let c1 = Some(char::from_digit(byte_pair[0].into(), 16));

but it returns an Option, which isn't really what I want. It does contain the correct character, but I have no idea how to convert this into a character which can be pushed into a string. It seems an awfully convoluted method for just pushing an ascii code into a string variable. There must be a better way.

I assume this is a continuation of this thread:

What is the encoding of your file? utf-8? utf-16?


From your description I assume you have a list of bytes and you want to collect any ASCII characters it contains in a String.
Typically you would use String::from_utf8_lossy to create a String from bytes.
To collect only the ASCII characters, you can filter the bytes with u8::is_ascii first:

Something like this:

let bytes: &[u8] = b"Hello, world!";
let filtered: Vec<u8> = bytes.iter().copied().filter(|&c| c.is_ascii()).collect();
let ascii_string = String::from_utf8_lossy(&filtered);
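A variation on the same idea, as a rough sketch: since every byte that passes the filter is ASCII, each one can be converted with char::from and collected straight into a String, without the intermediate Vec. The function name ascii_only is made up for this example.

```rust
/// Keeps only the ASCII bytes of `bytes` and collects them into a String.
/// (`ascii_only` is a hypothetical name used for illustration.)
fn ascii_only(bytes: &[u8]) -> String {
    bytes
        .iter()
        .copied()
        .filter(|b| b.is_ascii()) // drop every byte >= 0x80
        .map(char::from)          // u8 -> char is infallible
        .collect()
}

fn main() {
    // Bytes shaped like the data in the question: ASCII digits
    // interleaved with 0xf0.
    let bytes = [b'0', 0xf0, b'1', 0xf0];
    println!("{}", ascii_only(&bytes));
}
```

Note that this drops the non-ASCII bytes entirely, so the result can be shorter than the input. That is probably not what a hexdump-style column needs.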

That is correct, I have a list of bytes, but I'm trying to create a simple "hexdump" program. So I only want to print 16 bytes at a time, not create a string from all the bytes at once. This is why I was leaning toward .push().

That is a good question. I wrote the file in C using the fwrite function with size_t of 2 (16 bits). But what I am trying to do here is simply take the byte_pair[0] which would be u8 and push that char into a string so that I can output a string to the side of the output, similar to hexdump or xxd.

00000000: 30f0 31f0 32f0 33f0 34f0 35f0 36f0 37f0  0.1.2.3.4.5.6.7.
00000010: 38f0 39f0 30f0 31f0 32f0 33f0 34f0 35f0  8.9.0.1.2.3.4.5.

I would like to make this fairly generic, so encoding 8 bits with a flag for 16.

There is no actual text in this particular binary file, but the program should be able to output strings of ASCII if they are in the file.

I don't understand most of your code, but here's a function that turns a chunk of bytes into an ASCII string by replacing all non-ASCII-printable characters with .:

fn asciify(bytes: &[u8]) -> String {
    let mut ret = String::new();
    for &b in bytes {
        ret.push(if b.is_ascii() && !b.is_ascii_control() {
            char::from(b)
        } else {
            '.'
        });
    }
    ret
}


  • is_ascii_printable might have been an old unstable method. Are you looking at the latest docs? You can substitute a combination of is_ascii and is_ascii_control.
  • char::from_digit doesn't seem like what you want. char implements From<u8> (because all u8s are valid Unicode code points), so char::from should work.
  • as_char isn't a standard library function, so it must be coming from some crate you're using. Maybe is_ascii_printable does too.
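To tie this back to the hexdump goal, here is a minimal sketch of how asciify might be combined with slice::chunks to print 16 bytes per row. The format_row helper and the exact column layout are my own assumptions, modelled on the xxd-style output shown earlier in the thread; it is not taken from anyone's actual program.

```rust
/// Replaces every byte that is not printable ASCII with '.'.
fn asciify(bytes: &[u8]) -> String {
    let mut ret = String::new();
    for &b in bytes {
        ret.push(if b.is_ascii() && !b.is_ascii_control() {
            char::from(b)
        } else {
            '.'
        });
    }
    ret
}

/// Formats one row as "offset: hex pairs  ascii".
/// (`format_row` is a hypothetical helper, not a standard function.)
fn format_row(offset: usize, chunk: &[u8]) -> String {
    let mut hex = String::new();
    for (i, b) in chunk.iter().enumerate() {
        // Put a space between each group of two bytes.
        if i > 0 && i % 2 == 0 {
            hex.push(' ');
        }
        hex.push_str(&format!("{:02x}", b));
    }
    // A full 16-byte row is 32 hex digits plus 7 spaces = 39 columns;
    // shorter final rows are padded so the ASCII column stays aligned.
    format!("{:08x}: {:<39}  {}", offset, hex, asciify(chunk))
}

fn main() {
    let data: Vec<u8> = (0..24u8)
        .map(|i| if i % 2 == 0 { b'0' + (i / 2) % 10 } else { 0xf0 })
        .collect();
    // chunks(16) yields slices of at most 16 bytes each.
    for (i, chunk) in data.chunks(16).enumerate() {
        println!("{}", format_row(i * 16, chunk));
    }
}
```

The String is only built per row, so memory use stays small no matter how large the file is.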

This is a very elegant function, and it actually introduced some syntax I didn't know you could use in Rust. 🙂

Yes, nice and clean. Depending on the size of your byte slices, it might be a little nicer if you replace

let mut ret = String::new();

with

let mut ret = String::with_capacity(bytes.len());

so your String doesn't have to keep growing.

But I may be clueless here, because on my system, passing an ISO image of a medical CD through it takes about 13 seconds either way.


An alternative solution is to keep everything in bytes and define a wrapper type with a custom Display implementation that has your desired behavior (stealing part of @trentj's asciify):

use std::fmt;

struct DisplayAsciiWrapper<'a>(&'a [u8]);

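// Extension trait so that any byte slice gets a .display_ascii() method.
trait DisplayAscii {
    fn display_ascii(&self) -> DisplayAsciiWrapper<'_>;
}

impl DisplayAscii for [u8] {
    // Borrows the slice; no bytes are copied.
    fn display_ascii(&self) -> DisplayAsciiWrapper<'_> {
        DisplayAsciiWrapper(self)
    }
}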

impl fmt::Display for DisplayAsciiWrapper<'_> {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        use fmt::Write;
        for b in self.0 {
            if b.is_ascii() && !b.is_ascii_control() {
                f.write_char(*b as char)?
            } else {
                f.write_char('.')?
            }
        }
        Ok(())
    }
}

fn main() {
    let data = [b'A',10,20,b'C'];
    println!("{}",data.display_ascii())
}

