Formatting line breaks

I noticed that using \ to break a line without inserting a newline works differently than in C. Is that on purpose? For instance:

fn main() {
    let s: &str="\
one\
  two\
three";
    println!("{}",s);
}

prints out "onetwothree" - I was expecting "one  twothree", with the two leading spaces before two kept.

Is this a bug? If not, how do I get it to not strip the leading spaces of the continued line?

For reference the same in C

#include<stdio.h>

char* s="one\
  two\
three";

int main(int argc, char** argv) {
    printf("%s", s);
}

https://doc.rust-lang.org/reference/tokens.html#string-literals

Line-breaks are allowed in string literals. A line-break is either a newline (U+000A) or a pair of carriage return and newline (U+000D, U+000A). Both byte sequences are normally translated to U+000A, but as a special exception, when an unescaped U+005C character (\) occurs immediately before a line break, then the line break character(s), and all immediately following ` ` (U+0020), \t (U+0009), \n (U+000A) and \r (U+000D) characters are ignored.
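
To see that rule in action, here is a minimal sketch. The constant names CONTINUED and PLAIN are made up for illustration; the only thing being shown is the difference between a line break with and without the preceding backslash.

```rust
// Backslash + line break: the break and the next line's leading
// whitespace are both dropped from the string.
const CONTINUED: &str = "one\
    two";

// No backslash: the line break and the indentation stay in the string.
const PLAIN: &str = "one
    two";

fn main() {
    assert_eq!(CONTINUED, "onetwo");
    assert_eq!(PLAIN, "one\n    two");
    println!("{:?} vs {:?}", CONTINUED, PLAIN);
}
```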


In a hand-wavy sense, Rust treats an EOL backslash as you saying "I want to continue this on the next line, but I also don't want to screw up my indentation."

Try removing the backslashes!


There’s no way to tell it that. However, you could put the space(s) between one and two at the end of the previous line instead:

fn main() {
    let s: &str="\
        one  \
        two\
        three";
    println!("{}",s);
}

Of course, for the concrete example at hand, it’s better written "one  twothree" directly and on a single line anyway. If you shared the actual (practical) use case you’re using the backslash + line break syntax for, maybe we could debate alternatives and help figure out the best (or least-ugly) way to express the string in Rust.


The actual use case is this - representing pacman level 1 as a string literal - and breaking lines just to aid readability...

static LEVEL1MAP: &str = "\
############################\
#............##............#\
#.####.#####.##.#####.####.#\
#P####.#####.##.#####.####P#\
#..........................#\
#.####.##.########.##.####.#\
#......##....##....##......#\
######.##### ## #####.######\
     #.##          ##.#     \
     #.## ###--### ##.#     \
######.## # HHHH # ##.######\
      .   # HHHH #   .      \
######.## # HHHH # ##.######\
     #.## ######## ##.#     \
     #.##    $     ##.#     \
######.## ######## ##.######\
#............##............#\
#.####.#####.##.#####.####.#\
#P..##................##..P#\
###.##.##.########.##.##.###\
#......##....##....##......#\
#.##########.##.##########.#\
#..........................#\
############################";

The first thing I can come up with is using the concat! macro:

static LEVEL1MAP2: &str = concat!(
    "############################",
    "#............##............#",
    "#.####.#####.##.#####.####.#",
    "#P####.#####.##.#####.####P#",
    "#..........................#",
    "#.####.##.########.##.####.#",
    "#......##....##....##......#",
    "######.##### ## #####.######",
    "     #.##          ##.#     ",
    "     #.## ###--### ##.#     ",
    "######.## # HHHH # ##.######",
    "      .   # HHHH #   .      ",
    "######.## # HHHH # ##.######",
    "     #.## ######## ##.#     ",
    "     #.##    $     ##.#     ",
    "######.## ######## ##.######",
    "#............##............#",
    "#.####.#####.##.#####.####.#",
    "#P..##................##..P#",
    "###.##.##.########.##.##.###",
    "#......##....##....##......#",
    "#.##########.##.##########.#",
    "#..........................#",
    "############################",
);

fn main() {
    assert_eq!(LEVEL1MAP1, LEVEL1MAP2);
}

static LEVEL1MAP1: &str = "\
   ############################\
   #............##............#\
   #.####.#####.##.#####.####.#\
   #P####.#####.##.#####.####P#\
   #..........................#\
   #.####.##.########.##.####.#\
   #......##....##....##......#\
   ######.##### ## #####.######\
\x20    #.##          ##.#     \
\x20    #.## ###--### ##.#     \
   ######.## # HHHH # ##.######\
\x20     .   # HHHH #   .      \
   ######.## # HHHH # ##.######\
\x20    #.## ######## ##.#     \
\x20    #.##    $     ##.#     \
   ######.## ######## ##.######\
   #............##............#\
   #.####.#####.##.#####.####.#\
   #P..##................##..P#\
   ###.##.##.########.##.##.###\
   #......##....##....##......#\
   #.##########.##.##########.#\
   #..........................#\
   ############################";

For embedding pre-formatted code as string literals, try indoc.


It's a useful tool in general, but I don’t think it applies here in particular, because the request was to not keep the line breaks in the actual string data.


If it doesn't already have to be exactly &str, then one thing that might be useful for this purpose is using an array of byte strings:

static LEVEL1MAP: [[u8; 28]; 24] = [
    *b"############################",
    *b"#............##............#",
    *b"#.####.#####.##.#####.####.#",
    *b"#P####.#####.##.#####.####P#",
    *b"#..........................#",
    *b"#.####.##.########.##.####.#",
    *b"#......##....##....##......#",
    *b"######.##### ## #####.######",
    *b"     #.##          ##.#     ",
    *b"     #.## ###--### ##.#     ",
    *b"######.## # HHHH # ##.######",
    *b"      .   # HHHH #   .      ",
    *b"######.## # HHHH # ##.######",
    *b"     #.## ######## ##.#     ",
    *b"     #.##    $     ##.#     ",
    *b"######.## ######## ##.######",
    *b"#............##............#",
    *b"#.####.#####.##.#####.####.#",
    *b"#P..##................##..P#",
    *b"###.##.##.########.##.##.###",
    *b"#......##....##....##......#",
    *b"#.##########.##.##########.#",
    *b"#..........................#",
    *b"############################",
];

This is a few more characters, but it gives you several advantages:

  • Access by row and column index
  • Compile-time checking that everything is the expected size (no rows of different length)
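
As a rough sketch of what those two points buy you, here is a tiny stand-in grid. GRID, tile and flat_board are hypothetical names for illustration only, and the 3x5 size is just to keep the example short.

```rust
// A small hypothetical 3x5 grid standing in for the full 24x28 level.
// A row of any other length would be rejected at compile time.
static GRID: [[u8; 5]; 3] = [
    *b"#####",
    *b"#P.$#",
    *b"#####",
];

/// Look up the tile at (row, col), or None if out of bounds.
fn tile(row: usize, col: usize) -> Option<u8> {
    GRID.get(row)?.get(col).copied()
}

/// Flatten into one row-major Vec when a 1-D board is handier.
fn flat_board() -> Vec<u8> {
    GRID.iter().flatten().copied().collect()
}

fn main() {
    assert_eq!(tile(1, 1), Some(b'P'));
    assert_eq!(tile(3, 0), None);
    assert_eq!(flat_board().len(), 15);
}
```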

With multi-line text blocks, I don't like putting leading whitespace at the end of my source lines, as it's hard to get the indentation correct with ragged lines. So instead I do things like this:

    let s = "\
        \x20  one\n\
        \x20  two\n\
        \x20  three\n\
    ";

(Though it'd be nicer if "\ " could be used instead of "\x20".)
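
A quick way to convince yourself that this works is a check like the one below. It is only a sketch, and TEXT is a made-up name. The whitespace skipping stops at the \x20 escape because an escape is not a literal whitespace character.

```rust
const TEXT: &str = "\
    \x20  one\n\
    \x20  two\n\
";

fn main() {
    // Each line keeps three leading spaces: one from \x20, two literal.
    assert_eq!(TEXT, "   one\n   two\n");
    print!("{}", TEXT);
}
```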


Maybe even nicer than an escape for spaces would be something like \& in Haskell, which expands to nothing. Its main use case there is different: since Haskell has variable-length, non-delimited numeric character code escapes, disambiguating e.g. \137 (one character) from \13\&7 (two characters, \13 and 7) is occasionally necessary.

By the way, note that I wasn't suggesting putting leading spaces at the end of the line. The original post was about a string without line breaks, and thus without a real difference between apparently "leading" and apparently "trailing" whitespace.


I would use include_bytes! and put the data in a separate file.


Many thanks - I went with the concat! solution.

Two-dimensional byte arrays could work as well, and are better in a way since all the characters are ASCII. However, they can't be built with concat!, and I prefer to keep the board one-dimensional: it is simpler because the 'squares' are enumerable and hence need only one index, though for display purposes they still need to be translated to x,y...
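
For anyone reading along, the index translation is roughly the following. This is a sketch, not the code from my program: WIDTH, to_xy and to_index are names picked for illustration, assuming the 28-column, row-major layout of the map above.

```rust
/// Board width in tiles, matching the 28-column level above.
const WIDTH: usize = 28;

/// Convert a 1-D square index into (x, y), with y counting rows from the top.
fn to_xy(index: usize) -> (usize, usize) {
    (index % WIDTH, index / WIDTH)
}

/// Convert (x, y) back into the 1-D square index.
fn to_index(x: usize, y: usize) -> usize {
    y * WIDTH + x
}

fn main() {
    assert_eq!(to_xy(0), (0, 0));
    assert_eq!(to_xy(29), (1, 1));
    assert_eq!(to_index(1, 1), 29);
}
```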

Here is a link to the program - pacman. It is a port from an older C version of the game.
