I need some help with iterators :)


#1

Hey folks, I am translating a code snippet from C++ to Rust, but I have run into some problems with iterators. Here is the C++ code:

string removeComments(string prgm)
{
    int n = prgm.length();
    string res;
 
    // Flags to indicate whether a single-line or multi-line comment
    // has started.
    bool s_cmt = false;
    bool m_cmt = false;
 
 
    // Traverse the given program
    for (int i=0; i<n; i++)
    {
        // If single line comment flag is on, then check for end of it
        if (s_cmt == true && prgm[i] == '\n')
            s_cmt = false;
 
        // If multiple line comment is on, then check for end of it
        else if  (m_cmt == true && prgm[i] == '*' && prgm[i+1] == '/')
            m_cmt = false,  i++;
 
        // If this character is in a comment, ignore it
        else if (s_cmt || m_cmt)
            continue;
 
        // Check for beginning of comments and set the appropriate flags
        else if (prgm[i] == '/' && prgm[i+1] == '/')
            s_cmt = true, i++;
        else if (prgm[i] == '/' && prgm[i+1] == '*')
            m_cmt = true,  i++;
 
        // If current character is a non-comment character, append it to res
        else  res += prgm[i];
    }
    return res;
}

And this is my Rust code:

pub fn clear_comment() {
    let mut single_comment = false;
    let mut multi_comment = false;
    let mut stdin = io::stdin();
    let mut buffer: Vec<u8> = vec![];
    match stdin.read_to_end(&mut buffer) {
        Ok(bytes) => {
            let input = String::from_utf8(buffer).unwrap();
            let chars = input.chars().peekable();
            for char in chars {
                if (single_comment && char == '\n') {
                    single_comment = false;
                } else if (multi_comment && char == '*' && chars.peek().unwrap() == &'/') {
                    multi_comment = false;

                } else if (char == '/' && chars.peek().unwrap() == &'/') {
                    single_comment = true;
                    // chars.next();
                }
            }
        }
        Err(error) => println!("Error:{}", error),
    }
}

I don’t know how to handle prgm[i] and prgm[i+1] in Rust. I have tried peek(), but it seems I did something wrong :frowning:
Any suggestions would be appreciated.


#2

The minimal fix would be to replace for char in chars with while let Some(char) = chars.next(): https://play.rust-lang.org/?gist=089d58d0c3149125998f2fc5ea3cefbd&version=stable
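
For readers who cannot open the playground link, here is a rough sketch of what that change could look like. It is not the code from the gist; the function name remove_comments and the choice to take a &str and return a String are my own assumptions, made so the example stands alone. The reason it works: a for loop takes ownership of the iterator for the whole loop, while while let only borrows it for each next() call, so peek() stays usable inside the body.

```rust
/// Strips `//` and `/* */` comments from `input`, following the same
/// branch order as the C++ loop. (Hypothetical helper, not the gist's code.)
fn remove_comments(input: &str) -> String {
    let mut single_comment = false;
    let mut multi_comment = false;
    let mut res = String::new();
    let mut chars = input.chars().peekable();

    // `while let` borrows `chars` only while `next()` runs, so `peek()`
    // and `next()` can still be called inside the loop body.
    while let Some(c) = chars.next() {
        if single_comment && c == '\n' {
            // End of a single-line comment (the newline is dropped, as in C++).
            single_comment = false;
        } else if multi_comment && c == '*' && chars.peek() == Some(&'/') {
            multi_comment = false;
            chars.next(); // consume the '/' (the `i++` in the C++ code)
        } else if single_comment || multi_comment {
            continue; // inside a comment, ignore this character
        } else if c == '/' && chars.peek() == Some(&'/') {
            single_comment = true;
            chars.next();
        } else if c == '/' && chars.peek() == Some(&'*') {
            multi_comment = true;
            chars.next();
        } else {
            res.push(c);
        }
    }
    res
}
```

Comparing peek() against Some(&'/') instead of calling unwrap() also avoids a panic when the input ends right after a '/' or '*'.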


#3

This is exactly what I want, thanks soooooooo much for your suggestion :slight_smile:


#4

It might not be relevant in this case, but to remove comments from a string, maybe the regex crate or the pom crate could help you do it easily and quickly. The first is really fast, and the second will let you handle complex escape sequences. They are both very efficient.


#5

Thanks for your suggestion, but out of curiosity, why would regex do it faster?


#6

I have thought about using the regex crate to clear comments in C. I spent an hour trying to write a decent regular expression, but I could not get it to work correctly :frowning:

Some people, when confronted with a problem, think
“I know, I’ll use regular expressions.” Now they have two problems.


#7

Here’s a slightly improved example: https://play.rust-lang.org/?gist=88f0d7ec32de5579bff6ee5b01a040f6&version=stable

  1. move io out of clear_comment fn
  2. use a more convenient read_to_string to get utf8 directly
  3. use match with guards instead of an else-if chain
  4. Don’t unwrap the next character (interestingly, the C++ version has no out-of-bounds error, because prgm[n] on a std::string yields the terminating \0)
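
As a rough illustration of the four points above (a sketch, not the contents of the gist): the names read_all_stdin and clear_comments are hypothetical, and the behaviour follows the C++ loop, including dropping the newline that ends a // comment.

```rust
use std::io::{self, Read};

/// Point 1 and 2: IO lives outside the comment-stripping function, and
/// `read_to_string` yields UTF-8 text directly. (Hypothetical helper name.)
fn read_all_stdin() -> io::Result<String> {
    let mut input = String::new();
    io::stdin().read_to_string(&mut input)?;
    Ok(input)
}

/// Points 3 and 4: a `match` with guards, and no `unwrap` on the next char.
fn clear_comments(input: &str) -> String {
    let mut out = String::new();
    let mut single = false;
    let mut multi = false;
    let mut chars = input.chars().peekable();

    while let Some(c) = chars.next() {
        // Copy the peeked char out so the match holds no borrow of `chars`.
        let next = chars.peek().copied();
        match (c, next) {
            // A newline ends a single-line comment.
            ('\n', _) if single => single = false,
            // "*/" ends a multi-line comment; skip the '/'.
            ('*', Some('/')) if multi => {
                multi = false;
                chars.next();
            }
            // Anything else inside a comment is ignored.
            _ if single || multi => {}
            ('/', Some('/')) => {
                single = true;
                chars.next();
            }
            ('/', Some('*')) => {
                multi = true;
                chars.next();
            }
            _ => out.push(c),
        }
    }
    out
}
```

A caller would then do something like read_all_stdin().map(|s| clear_comments(&s)) and decide for itself how to report an IO error.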

Note that the C++ and Rust versions are functionally different: the Rust version works with UTF-8 (and might be slower because of it), while the C++ version works for ASCII only. If you want an exact C++ equivalent, you can use Vec<u8> instead of String.
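
A byte-oriented version along those lines might look like the sketch below. The function name remove_comments_bytes is made up for this example; it indexes into a byte slice the way the C++ code indexes into the string, and uses get(i + 1) in place of the terminating \0 that C++ relies on.

```rust
/// Byte-wise comment stripping that mirrors the C++ loop index by index.
/// (Hypothetical helper; takes any byte slice, e.g. `&buffer` of a Vec<u8>.)
fn remove_comments_bytes(prgm: &[u8]) -> Vec<u8> {
    let mut res = Vec::with_capacity(prgm.len());
    let mut s_cmt = false; // inside a `//` comment
    let mut m_cmt = false; // inside a `/* */` comment
    let mut i = 0;

    while i < prgm.len() {
        let cur = prgm[i];
        // `get` returns None past the end instead of panicking.
        let next = prgm.get(i + 1).copied();

        if s_cmt && cur == b'\n' {
            s_cmt = false;
        } else if m_cmt && cur == b'*' && next == Some(b'/') {
            m_cmt = false;
            i += 1; // skip the '/'
        } else if s_cmt || m_cmt {
            // inside a comment: ignore this byte
        } else if cur == b'/' && next == Some(b'/') {
            s_cmt = true;
            i += 1;
        } else if cur == b'/' && next == Some(b'*') {
            m_cmt = true;
            i += 1;
        } else {
            res.push(cur);
        }
        i += 1;
    }
    res
}
```

With the buffer filled by read_to_end, as in the original post, this could be called as remove_comments_bytes(&buffer) without converting to a String first.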


#8

Wow, that is more elegant than what I did. I want to implement something that strips C comments properly, rather than an exact C++ equivalent.