PSA: {BufRead,str}::lines semantics are changing


#1

Good morning rustlers! PR 28034 has now merged, which means that the semantics of the BufRead::lines and str::lines methods are subtly changing. The change will be in the next nightly build and will soon make its way onto the beta channel.

The change in semantics is that a CRLF is now interpreted as a line separator in addition to just a bare newline. For example, this code will change behavior:

let a = "line1\r\nline2\nline3";
println!("{:?}", a.lines().collect::<Vec<_>>());

Currently, on stable and beta Rust, this will print:

["line1\r", "line2", "line3"]

However, once the next nightly is produced, it will print:

["line1", "line2", "line3"]

Notice that the carriage return (\r) is present in the yielded strings today on stable but it is not present on the updated compiler.
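
The snippet above only exercises str::lines, so here is a minimal sketch of the BufRead::lines side. It assumes an in-memory buffer wrapped in std::io::Cursor, and the read_lines helper is a made-up name for illustration, not something from the PR:

```rust
use std::io::{BufRead, Cursor};

/// Hypothetical helper: collect the lines of an in-memory buffer
/// using `BufRead::lines`.
fn read_lines(input: &str) -> Vec<String> {
    Cursor::new(input.as_bytes())
        .lines()
        .map(|line| line.expect("reading from an in-memory buffer should not fail"))
        .collect()
}

fn main() {
    // Same mixed-ending input as the str::lines example.
    let lines = read_lines("line1\r\nline2\nline3");
    println!("{:?}", lines);
}
```

On a compiler that includes the change, this should print ["line1", "line2", "line3"], matching the str::lines output. A carriage return that is not followed by a newline is not treated as a separator and stays in the yielded string.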

More background for this change can be found in RFC 1212, and if you run into any problems with this, please let us know!


#2

Thanks for this. Small improvements like this to the behavior of the standard library are the kind of usability wins that make Rust a delightful, evolving community.


#3

Praise rustacelor, dark lord of rust string handling.


#4

I have to admit, I’m actually pretty upset this sort of thing landed.

This is the worst kind of breaking change: It’s subtle and isn’t easily caught at compile time. I don’t have any skin in the game, as I don’t typically use Windows files anywhere, and I think this is easily enough caught in tests, but it makes me incredibly nervous that we’re still making breaking changes in Rust with just a PSA posted on users.rlo. (And hopefully, this will be in the release notes.)

I’d much rather have compilation fail than have silent failures at runtime. In that respect, it’s like Rust’s stability guarantees aren’t providing me the right things.


#5

Even fixing a bug can break people who were dependent on the buggy behavior. You always have to exercise some kind of judgment, and in this case, it was decided to be fixed.

You’re seeing this u.r-l.o post specifically to draw said extra attention, and it will be prominently mentioned in the release notes for 1.4, for sure.


#6

For reference, I’m reasonably convinced this will fix more programs than it will break: I ran into this myself, even after participating in the damn RFC. The only reason I even understood what was happening is that I had participated (and so I knew to use lines_any).
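
For anyone who wants the same result regardless of which compiler they are on, a rough sketch of doing the split by hand is below. The lines_crlf name is hypothetical, and this is not how lines or lines_any is implemented in the standard library; it just splits on '\n' and drops one trailing '\r' from each piece:

```rust
/// Hypothetical helper: split on '\n' and strip a single trailing '\r'
/// from each piece, so "\r\n" and "\n" are both treated as line endings.
fn lines_crlf(s: &str) -> Vec<&str> {
    s.split_terminator('\n')
        .map(|line| line.strip_suffix('\r').unwrap_or(line))
        .collect()
}

fn main() {
    let a = "line1\r\nline2\nline3";
    println!("{:?}", lines_crlf(a));
}
```

This prints ["line1", "line2", "line3"] for the example input from the first post. One difference to be aware of: because it strips a trailing '\r' from every piece, it also removes a '\r' at the very end of the input even when no '\n' follows it.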


#7

I know it can be hard to follow the firehose, but I want to make clear that there’s been much more communication than that. This issue has been publicly debated for a couple of months: in discuss forums, on the issue tracker, then through an RFC, and then through an announced final comment period. Now that the change has been made, we’ve announced it in users, internals, reddit, and twitter, and it will be made very visible in the release notes.

All that said, such a breaking change to a stable API was not made lightly, and is very unusual. But, as @Gankro points out, this change is probably more likely to fix programs than to break them, and we made the decision with community visibility and buy-in.

Finally, FWIW, there is still time to revert the decision before it ever hits stable, if it ends up causing real problems in practice; that’s part of the reason for having our beta channel.


#8

It has its benefits and its drawbacks. Take C++, for example: there are plenty of bugs in the spec, but because they have to wait until a new standard is passed to fix them, many users already account for and work around the buggy behavior. What Rust is doing, I would say, is acceptable, especially if they do it sparingly.

It’s one of those questions of whether you break backwards compatibility or have a lifetime of having to work around a bug. I’m okay with this change, but it has the ability to set a dangerous precedent for the future.


#9

With every release that a function has been in, it becomes harder to justify changes like this. By Rust 1.6.0, it should be unthinkable to try doing something like this. It’s almost unthinkable now.


#10

From a (formerly) fairly annoyed Windows guy: thank you for landing this.