Repo: GitHub - BurntSushi/bstr: A string type for Rust that is not required to be valid UTF-8.
Docs: bstr - Rust
For a long time now, I've been mulling over what a real byte string type in Rust would look like. Byte strings turn out to be fairly useful when dealing with files or data that you otherwise expect to be text/ASCII-compatible, but have no guarantee that it actually is. Rust's Vec<u8>/&[u8] is somewhat serviceable for these things as a container, but there are lots of "string" oriented APIs missing from these types. For example, iterating over codepoints/graphemes/words/sentences, changing case, substring search, iterating over lines, splitting, etc. All of these things (and more) are provided by bstr.
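To make the gap concrete, here is a rough sketch of what substring search and line iteration look like when you only have the standard library and a plain &[u8]. This is not bstr's API; find_bytes and byte_lines are hypothetical helper names made up for illustration, and the search is the naive quadratic one rather than anything optimized.

```rust
/// Hypothetical helper: offset of the first occurrence of `needle` in
/// `haystack`, using only std. Naive O(n * m) scan over windows.
fn find_bytes(haystack: &[u8], needle: &[u8]) -> Option<usize> {
    // `windows(0)` panics, so treat the empty needle as matching at 0.
    if needle.is_empty() {
        return Some(0);
    }
    haystack
        .windows(needle.len())
        .position(|window| window == needle)
}

/// Hypothetical helper: split on b'\n' and drop one trailing b'\r' per line.
/// Input that ends in a newline yields a final empty line.
fn byte_lines(data: &[u8]) -> Vec<&[u8]> {
    data.split(|&b| b == b'\n')
        .map(|line| {
            if line.ends_with(b"\r") {
                &line[..line.len() - 1]
            } else {
                line
            }
        })
        .collect()
}
```

Both helpers work fine on input that is not valid UTF-8 (for example b"foo\xFFbar"), which is the whole point, but every program that handles "probably text" bytes ends up rewriting something like them.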
The README and docs contain more details about motivation and future work. There are also tons of examples in the docs, and I tried to add higher level "guidance" where I could to try to nudge folks toward more correct APIs.