I have a bytestring of output from a subprocess that consists of a series of whitespace-separated words, and I want to split it into words for processing. Ideally, I don't want to convert it to a string, as I can't be 100% sure the data is valid UTF-8 (I'm 99% sure, but why take the risk?). It wasn't too hard to write my own iterator for this:
    #[derive(Debug)]
    struct Words<'a> {
        str: &'a [u8],
        start: usize,
        end: usize,
    }

    impl<'a> Iterator for Words<'a> {
        type Item = &'a [u8];

        fn next(&mut self) -> Option<Self::Item> {
            self.start = self.end;
            while self.start < self.str.len() && self.str[self.start].is_ascii_whitespace() {
                self.start += 1;
            }
            if self.start >= self.str.len() {
                return None;
            }
            self.end = self.start;
            while self.end < self.str.len() && !self.str[self.end].is_ascii_whitespace() {
                self.end += 1;
            }
            Some(&self.str[self.start..self.end])
        }
    }
but it feels like this is something I should be able to get from the standard library. I did spend a bit of time searching before writing my own, but I couldn't find anything that looked straightforward.
Is there an easy way of doing this that I should have found? My reasons are two-fold:
- I'm just learning Rust, and I want to learn to do things the right way, which to me means using the features available rather than rewriting things from scratch where possible.
- While I'm reasonably sure my implementation is OK, I'd much rather use an existing, tested answer instead of adding an extra potential source of bugs by writing my own implementation.
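For concreteness, here is a sketch of the behaviour I'm after, pieced together from `slice::split` and `u8::is_ascii_whitespace`. The helper name `ascii_words` is hypothetical, and treating only ASCII whitespace as a separator is an assumption that matches my iterator above. It is not a single ready-made call, which is why I'm asking whether something more direct exists.

```rust
/// Hypothetical helper: yields each maximal run of non-whitespace bytes.
/// Assumes only ASCII whitespace (as defined by `u8::is_ascii_whitespace`)
/// separates words; the input is never decoded as UTF-8.
fn ascii_words<'a>(data: &'a [u8]) -> impl Iterator<Item = &'a [u8]> + 'a {
    data.split(|b| b.is_ascii_whitespace())
        // `split` produces an empty slice between adjacent separators and
        // at the edges of the input, so drop those.
        .filter(|word| !word.is_empty())
}
```

Calling `ascii_words(b"  foo\tbar \n baz  ")` should yield the same three slices (`foo`, `bar`, `baz`) as my `Words` iterator, and invalid UTF-8 bytes pass through untouched.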