Parsing whitespace vs their escape sequences

    attribute  =    value
path = C:\path\name

In the example text above, whitespace that was typed with the Enter or Tab keys should be allowed, but escape sequences should not be interpreted. Therefore, the pathname should be C:\path\name.

It seems they are the same though:

fn main() {
    let newlines = b"
\n
\r\n
";
    println!("{:?}", newlines); // [10, 10, 10, 13, 10, 10]
}

I expected the first newline, right after the opening quote, to be a different byte value than the '\n'. But they are both 10, and I don't know how to differentiate them. I was also surprised the first time I realized '\n' is a single byte and not '\' + 'n'.

I found another post where someone says

When you input a string such as "hello\n", it corresponds to something like let input = String::from("hello\\n"). Hence, the \n is literally present.

However, that doesn't seem to be the case in my example. Or maybe I'm just not printing it out as it is actually stored.

You are confusing reading from a stream with string literal syntax.

In a string literal, escape sequences are replaced by their interpretation, so the character sequence \n in a string literal becomes the newline character (a single byte). There is no way to differentiate it based on the contents of the resulting string.

When you are reading from a stream (e.g. stdin), whatever you type into the console is put into the returned string literally. Escape sequences are not interpreted, because this is not the context of a string literal. This is what the quoted explanation refers to.
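To make the difference concrete, here is a minimal sketch comparing a normal byte-string literal with a raw one. The function names are made up for illustration. The raw literal holds the same bytes a file or stdin would deliver if someone typed the characters a, \, n, b.

```rust
/// Normal byte-string literal: the compiler replaces `\n` with a single byte (10).
fn escaped_literal() -> &'static [u8] {
    b"a\nb"
}

/// Raw byte-string literal: the backslash and the `n` stay as two separate bytes,
/// just as they would if the text were read from a stream.
fn raw_literal() -> &'static [u8] {
    br"a\nb"
}

fn main() {
    println!("{:?}", escaped_literal()); // [97, 10, 98]
    println!("{:?}", raw_literal()); // [97, 92, 110, 98]
}
```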

Thanks for the quick response, and I'm glad it's just me getting it confused. I've written all my tests using the b"..." syntax to test "input". How can I change these input strings to more accurately represent real input? Is that what the raw string syntax, r"...", would do?

EDIT: My parse functions accept &[u8]

You can use the as_bytes method on str to turn a string slice into a byte slice.

str also implements AsRef<[u8]> which does the same thing.
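A small sketch of both options, assuming a parser entry point that takes &[u8]. The functions count_bytes and count_bytes_generic are hypothetical stand-ins, not anything from the original code.

```rust
/// Hypothetical stand-in for a parse function that accepts `&[u8]`.
fn count_bytes(input: &[u8]) -> usize {
    input.len()
}

/// Generic variant: accepts anything that can be viewed as a byte slice,
/// such as `&str`, `String`, `&[u8]`, or `Vec<u8>`.
fn count_bytes_generic<T: AsRef<[u8]>>(input: T) -> usize {
    input.as_ref().len()
}

fn main() {
    let text: &str = r"path = C:\path\name";

    // Explicit conversion with `as_bytes`.
    println!("{}", count_bytes(text.as_bytes()));

    // Implicit conversion through the `AsRef<[u8]>` bound.
    println!("{}", count_bytes_generic(text));
    println!("{}", count_bytes_generic(String::from(text)));
}
```

The generic version is convenient in tests because string literals can be passed directly, at the cost of a slightly less obvious signature.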

You can use the as_bytes method

Thanks, I should have been more clear though. I meant, how can I write a string in my tests as if it were passed in from a source file.

I think the answer is to use r"..." (raw string syntax), because

println!("{:?}", r"  attribute=\tvalue
path = C:\path\name");

prints the same result as

println!("{:?}", std::fs::read_to_string("file-with-same-text")?);

In the debug output, the '\t' and '\n' are then prefixed with an additional '\'. The only exception is the starting tab, which is just because I have VS Code set to use spaces instead of tabs, so they actually are spaces in the source.

Oh yeah, if you want to type literal escape sequences into the string and not have them replaced, you can use raw string literals, or escape the backslash:

assert_eq!("\\n", r"\n")
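Since the parse functions here take &[u8], the raw form can also be combined with the byte-string prefix as br"...". A short sketch follows; has_literal_backslash_n is a hypothetical helper used only to show that the backslash survives.

```rust
/// Hypothetical helper: true if `input` contains a backslash byte
/// immediately followed by the byte for `n`.
fn has_literal_backslash_n(input: &[u8]) -> bool {
    input.windows(2).any(|w| w == b"\\n")
}

fn main() {
    // Raw byte string: what a file containing this text would hold.
    let raw = br"path = C:\path\name";
    // Same bytes, written with escaped backslashes instead.
    let escaped = b"path = C:\\path\\name";
    assert_eq!(raw, escaped);

    // The backslash and `n` are kept as two bytes in the raw form.
    assert!(has_literal_backslash_n(raw));
    // A normal literal turns `\n` into one newline byte, so there is no backslash.
    assert!(!has_literal_backslash_n(b"line1\nline2"));
}
```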