Moving forward an iterator over char

#1

Hi,

I have an iterator over the chars of a string. I’d like to skip whitespace, do some char-related computation, skip whitespace again, and so on. I thought skip_while(|c| c.is_whitespace()) would be enough, but it returns a SkipWhile struct, not an Iterator, and I can’t figure out what to do with it.
Could you please help me understand this?

Thank you, Luca-

#2

TakeWhile is an Iterator, but since it only iterates over the whitespace characters, you’d have to exhaust it and then go back to the iterator that you called take_while on. However, take_while doesn’t put the last checked item back into the underlying iterator, so that iterator would be missing one item.

What you actually want is filter, which will only yield items matching the predicate, and drop others.


Edit: Wait, I misread what you wrote, sorry. You can use peeking_take_while from the itertools crate if your iterator is peekable, which avoids the missing-last-item problem.
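
The missing-item problem described above can be reproduced with the standard library alone. The following is a minimal sketch; the helper name take_leading_whitespace is made up for illustration and is not from this thread.

```rust
/// Hypothetical helper: takes the leading whitespace with `take_while`,
/// then returns (what was taken, what the original iterator still holds).
fn take_leading_whitespace(input: &str) -> (String, String) {
    let mut chars = input.chars();
    // `by_ref()` lends the iterator out, so `chars` is still usable afterwards.
    let taken: String = chars
        .by_ref()
        .take_while(|c| c.is_whitespace())
        .collect();
    // `take_while` had to pull one extra item to see the predicate fail,
    // and that item is not put back.
    let rest: String = chars.collect();
    (taken, rest)
}

fn main() {
    let (taken, rest) = take_leading_whitespace("  abc");
    assert_eq!(taken, "  ");
    assert_eq!(rest, "bc"); // the 'a' is gone
}
```

The first non-whitespace char is consumed by the predicate check, which is why a peeking approach is needed when that char matters.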

#3

Hi kyrias, thank you for your reply!

I wrote take_while while I meant skip_while, but I think the reasoning is the same.
Suppose I have a string like
skip some whitespaces
If I’m not wrong, filter would give me every character that is not equal to a whitespace, which is not what I need (it will give me something like skipsomewhitespaces)
I’m trying to understand how to write a fn that skips the whitespaces when I need to, moving forward the iterator.
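
To make the difference concrete, here is a small sketch comparing filter with skip_while on the same kind of input. The two function names are assumptions chosen for this example.

```rust
/// Hypothetical helper: `filter` drops every whitespace char, wherever it is.
fn without_whitespace(input: &str) -> String {
    input.chars().filter(|c| !c.is_whitespace()).collect()
}

/// Hypothetical helper: `skip_while` only drops the leading run of whitespace
/// and then yields everything else unchanged.
fn without_leading_whitespace(input: &str) -> String {
    input.chars().skip_while(|c| c.is_whitespace()).collect()
}

fn main() {
    assert_eq!(
        without_whitespace("skip some whitespaces"),
        "skipsomewhitespaces"
    );
    assert_eq!(
        without_leading_whitespace("   skip some whitespaces"),
        "skip some whitespaces"
    );
}
```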

#4

Just to make sure I understand the question, do you want to do something like this:

let chars = "This is a string.".chars();
let result: String = chars.skip_while(|c| c.is_whitespace())
                  .map(|c| c.to_uppercase().to_string())
                  .collect();
println!("{}", result);

https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=549c2fefba978417b2ee36e742b2413c

#5

I’m just learning Rust and I’m not sure if it is the same, but I’m trying to keep a struct with a string and an iterator over the string chars in it.
In the impl block I’m trying to write a fn that skips whitespace, another that consumes some chars, and so on, just to practice with iterators, but I find it so hard.

#6

So if your input is “This is a string 123.”, what is your expected output?

#7

If you keep the iterator separate, you can skip and then consume some of it, and then repeat as desired. For instance, to grab the first few words:

let mut chars = my_string.chars();
let word1: String = chars.by_ref()
    .skip_while(|c| c.is_whitespace())
    .take_while(|c| !c.is_whitespace()) // note, the terminating `c` will be lost!
    .collect();
let word2: String = chars.by_ref()
    .skip_while(|c| c.is_whitespace())
    .take_while(|c| !c.is_whitespace())
    .collect();
// etc.

But split_whitespace() will be more efficient for this particular example.
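
For comparison, a sketch of the same "first few words" task with split_whitespace(), which yields &str slices borrowed from the input instead of collecting new Strings. The function name first_two_words is hypothetical.

```rust
/// Hypothetical helper: returns the first two whitespace-separated words
/// as slices of `input`, without allocating.
fn first_two_words(input: &str) -> (Option<&str>, Option<&str>) {
    let mut words = input.split_whitespace();
    let word1 = words.next();
    let word2 = words.next();
    (word1, word2)
}

fn main() {
    let (word1, word2) = first_two_words("   some input   some other input");
    assert_eq!(word1, Some("some"));
    assert_eq!(word2, Some("input"));
}
```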

#8

@stevensonmt

if input is let s = "_____some_input_______some_other_input" (sorry, I had to replace spaces with underscores) I’d like to do something like:

let s = "_____some_input_______some_other_input";
let mut foo = Foo::from(s); // `mut` because the methods below advance the iterator
foo.skip_spaces();
assert_eq!(foo.first_word(), "some");
assert_eq!(foo.next_char(), " ");
assert_eq!(foo.first_word(), "input");
foo.skip_spaces();
assert_eq!(foo.first_word(), "some");
#9

Thank you @cuviper for your reply. I need to spend some time on your solution to understand it,
and to understand why "split_whitespace() will be more efficient for this particular example".
Is this because of the .collect on every word?

#10

The collect forces an allocation versus the split returning &str slices. The split also has the potential of performing optimized searches for non-whitespace characters, though I’m not sure it’s that clever yet.

Your Foo could be something like this:

pub struct Foo<'a> {
    iter: std::iter::Peekable<std::str::Chars<'a>>,
}

impl<'a> Foo<'a> {
    pub fn skip_spaces(&mut self) {
        while let Some(c) = self.iter.peek() {
            if c.is_whitespace() {
                self.iter.next(); // consume it
            } else {
                break; // peek() does not advance, so stop at the first non-whitespace char
            }
        }
    }
}
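
A fuller sketch of how this Foo might cover the usage from post #8. The method names first_word and next_char come from that post, but their bodies and return types here are assumptions (next_char returns Option<char> rather than a string), as is the From impl.

```rust
use std::iter::Peekable;
use std::str::Chars;

pub struct Foo<'a> {
    iter: Peekable<Chars<'a>>,
}

// Assumed constructor, so that `Foo::from(s)` works as in post #8.
impl<'a> From<&'a str> for Foo<'a> {
    fn from(s: &'a str) -> Self {
        Foo { iter: s.chars().peekable() }
    }
}

impl<'a> Foo<'a> {
    /// Advances past any run of whitespace.
    pub fn skip_spaces(&mut self) {
        while let Some(c) = self.iter.peek() {
            if c.is_whitespace() {
                self.iter.next(); // consume it
            } else {
                break; // leave the non-whitespace char in place
            }
        }
    }

    /// Collects chars up to, but not including, the next whitespace.
    pub fn first_word(&mut self) -> String {
        let mut word = String::new();
        while let Some(&c) = self.iter.peek() {
            if c.is_whitespace() {
                break;
            }
            word.push(c);
            self.iter.next();
        }
        word
    }

    /// Consumes and returns a single char, if any is left.
    pub fn next_char(&mut self) -> Option<char> {
        self.iter.next()
    }
}

fn main() {
    let mut foo = Foo::from("     some input       some other input");
    foo.skip_spaces();
    assert_eq!(foo.first_word(), "some");
    assert_eq!(foo.next_char(), Some(' '));
    assert_eq!(foo.first_word(), "input");
    foo.skip_spaces();
    assert_eq!(foo.first_word(), "some");
}
```

The break in skip_spaces matters: peek() never advances the iterator, so without it the loop would spin forever on the first non-whitespace char.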
#11

Thank you @cuviper for your help!
I’m partially blown away by the monster declaration of iter.
Anyway reading it piece by piece makes it understandable.

#12

Maybe it looks scary because I didn’t bother to import anything? It could be written like:

use std::iter::Peekable;
use std::str::Chars;

pub struct Foo<'a> {
    iter: Peekable<Chars<'a>>,
}

You could also do without Peekable if you use Chars::as_str() to peek at the remaining string, but I wanted to keep a full Iterator style for the example.
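
As a rough illustration of the Chars::as_str() idea, the sketch below peeks by looking at the remaining string instead of wrapping the iterator in Peekable. The helper names peek_char and skip_spaces are assumptions made for this example.

```rust
use std::str::Chars;

/// Hypothetical helper: looks at the next char without consuming it,
/// by inspecting the not-yet-iterated part of the string.
fn peek_char(chars: &Chars<'_>) -> Option<char> {
    chars.as_str().chars().next()
}

/// Hypothetical helper: advances `chars` past any leading whitespace.
fn skip_spaces(chars: &mut Chars<'_>) {
    while let Some(c) = peek_char(chars) {
        if !c.is_whitespace() {
            break;
        }
        chars.next(); // consume the whitespace char
    }
}

fn main() {
    let mut chars = "   ab c".chars();
    skip_spaces(&mut chars);
    assert_eq!(peek_char(&chars), Some('a')); // still there
    assert_eq!(chars.as_str(), "ab c");
}
```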

#13

Yes, this is much more understandable for a newcomer I think!
After some more experimentation I have found that take_while doesn’t use peek, so all my functional-only tests read one character more than needed. From what I read here, most people use a traditional imperative while loop instead of a functional solution (which would require implementing some kind of peek_while fn).
Ok, so far I am satisfied with what I’ve learned today.
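
One possible shape for such a peek_while fn, sketched with the standard library’s Peekable::next_if, which only consumes an item when the predicate accepts it. The name peek_while and its signature are assumptions; this is not an existing std function.

```rust
use std::iter::Peekable;

/// Hypothetical helper: collects chars while `pred` holds, leaving the
/// first rejected char in the iterator.
fn peek_while<I, P>(iter: &mut Peekable<I>, mut pred: P) -> String
where
    I: Iterator<Item = char>,
    P: FnMut(&char) -> bool,
{
    let mut out = String::new();
    // `next_if` returns None (and consumes nothing) once `pred` fails.
    while let Some(c) = iter.next_if(|c| pred(c)) {
        out.push(c);
    }
    out
}

fn main() {
    let mut it = "  ab cd".chars().peekable();
    assert_eq!(peek_while(&mut it, |c| c.is_whitespace()), "  ");
    assert_eq!(peek_while(&mut it, |c| !c.is_whitespace()), "ab");
    assert_eq!(it.next(), Some(' ')); // the terminating char was not lost
}
```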

#14

Check out itertools for more great adaptors, like peeking_take_while() for this case.
