I'm trying to parse strings from my Facebook data dump for fun. It encodes everything that isn't a Latin character as Unicode escapes (emojis too, but I'm not interested in those):
\u00d1\u0087\u00d0\u00b0\u00d1\u0081 \u00d0\u00b8 \u00d0\u00b4\u00d0\u00b5\u00d0\u00b2\u00d0\u00b5\u00d1\u0082 \u00d0\u00bc\u00d0\u00b8\u00d0\u00bd\u00d1\u0083\u00d1\u0082\u00d0\u00b8
This tool gives me the desired output, which is this text in Cyrillic:
час и девет минути
I'm sorry, but I'm unfamiliar with text encodings and standards. As far as I understand, this isn't valid UTF-8, so Rust's built-in String::from_utf8 results in gibberish like this:
ÐÐµÑÑÑ. But I guess you can build
Is there any way I can turn the escaped string above into valid Cyrillic UTF-8? Those are my only constraints.
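
For reference, here is a minimal sketch of what I think is needed. It assumes each \u00XX escape actually carries one byte of the UTF-8 encoding rather than a real code point (for example, \u00d1\u0087 would be the bytes D1 87, which is "ч" in UTF-8). The function name decode_fb_escapes is my own, it uses only the standard library, and it simply gives up on any escape above 0xFF, so it may not cover everything in a real dump:

```rust
/// Hypothetical helper: treats every `\uXXXX` escape as a single raw byte,
/// collects the bytes, and then reads the result as UTF-8.
/// Returns None for malformed escapes, escapes above 0xFF, or invalid UTF-8.
fn decode_fb_escapes(input: &str) -> Option<String> {
    let mut bytes: Vec<u8> = Vec::with_capacity(input.len());
    let mut rest = input;

    while !rest.is_empty() {
        if let Some(tail) = rest.strip_prefix("\\u") {
            // Expect four hex digits right after the `\u` prefix.
            let hex = tail.get(..4)?;
            let code = u32::from_str_radix(hex, 16).ok()?;
            // Assumption: each escape holds exactly one byte of UTF-8.
            if code > 0xFF {
                return None;
            }
            bytes.push(code as u8);
            rest = &tail[4..];
        } else {
            // Copy one literal character through unchanged.
            let ch = rest.chars().next()?;
            let mut buf = [0u8; 4];
            bytes.extend_from_slice(ch.encode_utf8(&mut buf).as_bytes());
            rest = &rest[ch.len_utf8()..];
        }
    }

    // Re-interpret the collected bytes as UTF-8.
    String::from_utf8(bytes).ok()
}
```

Calling decode_fb_escapes on the escaped string above (passed as a raw string literal, r"...", so the backslashes stay literal) should return Some("час и девет минути").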