Formatting/padding problem

I'm writing a program to generate a list of records for Unicode characters. One of the components of each record is a regular expression that matches a given character. Is there a way to get the formatting machinery to pad a hexadecimal code point dynamically? E.g., if the code point has fewer than four hex digits, pad it with zeros to make it four digits; if it has more than four, pad it with enough zeros to make the \U regular expression qualifier work properly?

Well, you can certainly create such formatting logic, although I’m not sure I entirely understand your use case. Something like

use std::fmt::Display;

#[derive(Clone, Copy, Debug)]
pub struct Hex4Or8Digits(pub u32);

impl Display for Hex4Or8Digits {
    fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
        if self.0 <= 0xFFFF {
            write!(f, "{:04X}", self.0)
        } else {
            write!(f, "{:08X}", self.0)
        }
    }
}

fn main() {
    println!(
        "{} {} {} {} {}",
        Hex4Or8Digits(0x12),
        Hex4Or8Digits(0x0),
        Hex4Or8Digits(0xF0A3),
        Hex4Or8Digits(0x10023),
        Hex4Or8Digits(0xFFFFFFF),
    );
    // prints “0012 0000 F0A3 00010023 0FFFFFFF”
}

creates the formatting you describe (as far as I understand your question). If you want it to be slightly different, you could adapt the approach.
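If the goal is the full escape rather than just the digits, the same width rule can be wrapped in a small helper. This is only a sketch: the name `fixed_width_escape` is made up for illustration, and it assumes the target regex flavor reads `\u` followed by exactly four hex digits and `\U` followed by exactly eight.

```rust
/// Hypothetical helper: builds a fixed-width escape for one character.
/// Assumes `\uXXXX` takes four hex digits and `\UXXXXXXXX` takes eight.
fn fixed_width_escape(c: char) -> String {
    let cp = c as u32;
    if cp <= 0xFFFF {
        // Code points up to U+FFFF fit in four digits.
        format!("\\u{:04X}", cp)
    } else {
        // Everything above needs the eight-digit form.
        format!("\\U{:08X}", cp)
    }
}

fn main() {
    println!("{}", fixed_width_escape('A'));
    println!("{}", fixed_width_escape('\u{1F44B}'));
}
```

Taking a `char` instead of a `u32` means the input is always a valid Unicode scalar value, so no range check is needed.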

I’m not entirely sure what your purpose is here, though. At least for regular expressions in the sense of the regex crate, you can put {} around the hex digits to support any length, and most characters don’t need to be escaped anyway.
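With the braced form no padding is needed at all. A minimal sketch, assuming the regex crate's `\x{...}` syntax (worth checking against its syntax documentation); `braced_escape` is a hypothetical name:

```rust
/// Hypothetical helper: builds a braced hex escape of any length.
/// `{{` and `}}` are how `format!` emits literal braces.
fn braced_escape(c: char) -> String {
    format!("\\x{{{:X}}}", c as u32)
}

fn main() {
    println!("{}", braced_escape('!'));
    println!("{}", braced_escape('\u{1F30D}'));
}
```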


If the braced form works for you, you may want char::escape_unicode. For example:

"👋🌍!".chars().for_each(|c| print!("{}", c.escape_unicode()));
// Prints: \u{1f44b}\u{1f30d}\u{21}

(N.b. chars are Unicode scalar values.)
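To store the result in a record instead of printing it, the escapes can be collected into a String. A small sketch; the function name `escape_all` is an assumption made for illustration:

```rust
/// Hypothetical helper: escapes every char of `s` as `\u{...}`.
fn escape_all(s: &str) -> String {
    // `escape_unicode` returns an iterator of chars, so `flat_map`
    // concatenates the escapes of all characters into one String.
    s.chars().flat_map(char::escape_unicode).collect()
}

fn main() {
    println!("{}", escape_all("👋🌍!"));
}
```

Note that `escape_unicode` emits lowercase hex digits with no padding.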

This was exactly what I needed, thanks!