How to do Python's str.translate() in Rust?

HadrienG · March 18, 2018, 10:23pm

Yes, I was curious about the bulk copy thing but it seems that it does not make up for the extra overhead at this input size after all.

I already noticed in a previous project of mine that char_indices() can be expensive to use. At the time, I resolved it by building a custom character iterator which only provides indices on demand (and specializing it for ASCII as well, since that was what I knew I would be parsing). Maybe something similar could work here, but again, we're entering the territory of custom abstractions that work directly on the string's bytes.
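For readers wondering what such an "indices on demand" iterator might look like, here is a minimal sketch. It is not the code from that earlier project: the `LazyIndexChars` name and its `offset()` method are hypothetical, and it relies only on the standard `Chars::as_str()` method to recover the byte position when asked, rather than tracking it on every step.

```rust
use std::str::Chars;

/// Hypothetical helper (not from the thread): yields chars like `str::chars()`,
/// and computes the current byte offset only when `offset()` is called.
struct LazyIndexChars<'a> {
    src: &'a str,
    chars: Chars<'a>,
}

impl<'a> LazyIndexChars<'a> {
    fn new(src: &'a str) -> Self {
        LazyIndexChars { src, chars: src.chars() }
    }

    /// Byte offset of the next char to be yielded, derived from how much
    /// of the original string is still unconsumed.
    fn offset(&self) -> usize {
        self.src.len() - self.chars.as_str().len()
    }
}

impl<'a> Iterator for LazyIndexChars<'a> {
    type Item = char;

    fn next(&mut self) -> Option<char> {
        self.chars.next()
    }
}
```

The loop body pays nothing for index bookkeeping; a caller that needs a position (say, to slice out a run of unchanged text) calls `offset()` at that point only. The ASCII specialization mentioned above is not shown.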

@mgeisler Nice coincidence regarding the ligature UTF-8 representation! I wouldn't have expected the stars to align so well, considering that IIRC some ligature encoding already existed before Unicode was released. It's good to know that capacity-tweaking is not needed after all.
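To illustrate why no capacity tweaking is needed, here is a rough sketch, assuming the translation only expands the f-ligatures U+FB00 to U+FB04. The function names and the exact mapping are hypothetical, not the code discussed earlier in the thread. Each of these ligatures takes 3 bytes in UTF-8, and its ASCII expansion is at most 3 bytes, so the output can never be longer than the input.

```rust
/// Hypothetical mapping for illustration: expands a few Latin ligatures
/// to ASCII. Which ligatures to include is an assumption.
fn expand_ligature(c: char) -> Option<&'static str> {
    match c {
        '\u{FB00}' => Some("ff"),
        '\u{FB01}' => Some("fi"),
        '\u{FB02}' => Some("fl"),
        '\u{FB03}' => Some("ffi"),
        '\u{FB04}' => Some("ffl"),
        _ => None,
    }
}

/// Replaces mapped ligatures and copies every other char unchanged.
fn translate(input: &str) -> String {
    // Every mapped ligature is 3 bytes in UTF-8 and expands to at most
    // 3 ASCII bytes, so `input.len()` is an upper bound on the output size.
    let mut out = String::with_capacity(input.len());
    for c in input.chars() {
        match expand_ligature(c) {
            Some(replacement) => out.push_str(replacement),
            None => out.push(c),
        }
    }
    out
}
```

With this mapping, `translate("o\u{FB03}ce")` returns `"office"`, and the buffer allocated up front is always large enough.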

derspiny · March 19, 2018, 8:55am

Isn’t this essentially the NFKD Unicode normalization form? If so, then crates like unicode_normalization are probably going to be much more complete and correct than any reasonably-sized regex.

kyrias · March 20, 2018, 10:17am

However it looks in your font, it is indeed U+FB05 which is the long-s t ligature.

mark · March 20, 2018, 2:01pm

Yes, so I'm not translating that to st, same as the other one.
Thanks.
