`uniwhat`: See what's in that Unicode text on stdin

`uniwhat` is a simple command-line Rust program I wrote so I could see exactly which Unicode characters are in a string. You should be able to install it with `cargo install uniwhat`. Crates URL: https://crates.io/crates/uniwhat

$ echo "✨Hello! ä€" | uniwhat
character   byte  UTF-32  encoded as     glyph    name
        0      0  002728  E2 9C A8         ✨      SPARKLES
        1      3  000048  48               H      LATIN CAPITAL LETTER H
        2      4  000065  65               e      LATIN SMALL LETTER E
        3      5  00006C  6C               l      LATIN SMALL LETTER L
        4      6  00006C  6C               l      LATIN SMALL LETTER L
        5      7  00006F  6F               o      LATIN SMALL LETTER O
        6      8  000021  21               !      EXCLAMATION MARK
        7      9  000020  20                      SPACE
        8     10  0000E4  C3 A4            ä      LATIN SMALL LETTER A WITH DIAERESIS
        9     12  0020AC  E2 82 AC         €      EURO SIGN
       10     15  00000A  0A               \n     LINE FEED (LF)

I used to use `uniname`, but it hasn't been updated in years, so it doesn't support recent Unicode versions.
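
For anyone curious how the character, byte, UTF-32 and "encoded as" columns can be derived, here is a minimal sketch using only the Rust standard library. This is not uniwhat's actual source: `Row` and `rows` are names made up for illustration, and the name column is left out because the standard library has no Unicode name lookup.

```rust
/// One row of the table, minus the name column.
/// (Hypothetical type for illustration, not taken from uniwhat.)
struct Row {
    char_index: usize,  // position counted in characters
    byte_offset: usize, // position counted in UTF-8 bytes
    code_point: u32,    // the Unicode scalar value
    utf8: Vec<u8>,      // the bytes this character is encoded as
    glyph: char,
}

/// Walk a string and collect one `Row` per character.
fn rows(input: &str) -> Vec<Row> {
    input
        .char_indices() // yields (byte offset, char)
        .enumerate()    // adds the character index
        .map(|(char_index, (byte_offset, glyph))| {
            // A char needs at most 4 bytes in UTF-8.
            let mut buf = [0u8; 4];
            let utf8 = glyph.encode_utf8(&mut buf).as_bytes().to_vec();
            Row {
                char_index,
                byte_offset,
                code_point: glyph as u32,
                utf8,
                glyph,
            }
        })
        .collect()
}

fn main() {
    // Fixed input so the sketch runs without stdin.
    let input = "✨Hello! ä€\n";
    println!("character   byte  UTF-32  encoded as     glyph");
    for row in rows(input) {
        let hex: Vec<String> = row.utf8.iter().map(|b| format!("{:02X}", b)).collect();
        // Show control characters such as '\n' in escaped form.
        let glyph = if row.glyph.is_control() {
            row.glyph.escape_default().to_string()
        } else {
            row.glyph.to_string()
        };
        println!(
            "{:>9}  {:>5}  {:06X}  {:<15}{}",
            row.char_index,
            row.byte_offset,
            row.code_point,
            hex.join(" "),
            glyph
        );
    }
}
```

The key point is that `char_indices` reports byte offsets while `enumerate` counts characters, which is why the two columns drift apart after the first multi-byte character.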


Okay that's pretty neat.

Nice! Do you have a link to the source code repo?

Oh wow, I forgot to link it or put it on crates.io. Brain fart!

Github: https://github.com/rory/uniwhat
Sourcehut: ~ebel/uniwhat - sourcehut git
