Hi, I am following some exercises, and one of them is about creating a clone of "cat". The problem is with the tests that validate the output. Each test loads a reference file that was created with the official "cat" command and compares its content with the output produced by the Rust code.
The problem is that the output seems to differ (CRLF problems, I think, mostly) depending on whether you run cargo test in Windows PowerShell, WSL2, pure Linux bash, etc. A test that passes in Ubuntu will fail in WSL/Ubuntu.
The console output is not really helpful for seeing what is different... It looks identical to me. (This is the output of the "spiders" test, which compares the output with the reference file.)
---- spiders stdout ----
thread 'spiders' panicked at 'Unexpected stdout, failed diff original var
├── original: Don't worry, spiders,
│ I keep house
│ casually.
├── diff:
└── var as str: Don't worry, spiders,
I keep house
casually.
command=`"K:\\code\\rust\\learningrust\\catr\\target\\debug\\catr.exe" "tests/inputs/spiders.txt"`
code=0
stdout="Don\'t worry, spiders,\nI keep house\ncasually.\n"
stderr=""
', /rustc/c8dfcfe046a7680554bf4eb612bad840e7631c4b\library\core\src\ops\function.rs:227:5
Running the same test on Linux succeeds.
So, is there a way to get the same test results across multiple OSes and shells?
If you can't see the difference with your eyes, you could try hex-dumping the file. That will reveal exact differences byte-by-byte, without relying on any particular character encoding and/or terminal setting. Also, diff.
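If you would rather do the hex dump from inside the test itself, here is a minimal sketch. The helper names hex_dump and first_difference are made up for illustration; they are not part of any test crate. A CRLF line ending shows up as "0d 0a" and a plain LF as "0a".

```rust
use std::fmt::Write;

/// Render bytes as space-separated lowercase hex.
/// (Hypothetical helper, only meant for debugging a failing comparison.)
fn hex_dump(bytes: &[u8]) -> String {
    let mut out = String::new();
    for (i, b) in bytes.iter().enumerate() {
        if i > 0 {
            out.push(' ');
        }
        // Writing to a String cannot fail, so unwrap is fine here.
        write!(out, "{:02x}", b).unwrap();
    }
    out
}

/// Index of the first byte where the two slices differ, or None if they are equal.
/// If one slice is a prefix of the other, this is the length of the shorter one.
fn first_difference(a: &[u8], b: &[u8]) -> Option<usize> {
    if a == b {
        return None;
    }
    let common = a.iter().zip(b.iter()).position(|(x, y)| x != y);
    Some(common.unwrap_or_else(|| a.len().min(b.len())))
}
```

Printing hex_dump of the expected and actual bytes around the index returned by first_difference should make a stray 0d visible right away.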
In the past I've just used String::replace to normalize the strings to use \n after reading the files. Since my files never had intentional Windows line endings, it did nothing on Linux and fixed the tests on Windows. Not the most elegant solution, but it worked. I'm not familiar with how line endings work on WSL, but I think it's very plausible that the line endings are the problem.
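Roughly what that normalization looks like, as a sketch. The function name normalize_newlines is just something I picked; the only real API here is str::replace. It assumes no file in the test set contains an intentional "\r\n".

```rust
/// Convert Windows line endings (CRLF) to Unix line endings (LF).
/// Input that already uses LF only is returned unchanged.
fn normalize_newlines(s: &str) -> String {
    s.replace("\r\n", "\n")
}
```

In a test you would apply it to both the expected file content and the captured stdout before comparing them.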
It is worse than I thought. A call like let line_bytes = file.read_line(&mut buf)?;
will return a different number of bytes for the same string depending on whether it runs on Linux or Windows. So, in this particular case, it seems the tests can only really be run on the OS where they were written.
Are you sure you don't have a Git configuration that's automatically switching line endings when you check out a file on Windows? Files with the same contents should, well, have the same contents on all platforms. But IIRC core.autocrlf is set to true on Windows by default so maybe the problem is just that you're not comparing equivalent files.
BufRead::read_line is not documented to have platform-dependent behavior, so this is either a case of some program silently switching out line endings from under you as @trentj suggested (more likely) or a bug in read_line.
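To illustrate why the byte counts can differ without read_line itself being platform-dependent: read_line keeps the line terminator in the buffer, so a file whose line endings were converted to CRLF yields one extra byte per line. A small sketch using an in-memory Cursor instead of a real file (line_lengths is a hypothetical helper name):

```rust
use std::io::{BufRead, Cursor};

/// Collect the byte count that read_line reports for each line of `data`.
/// The count includes the terminator ("\n" or "\r\n") when one is present.
fn line_lengths(data: &[u8]) -> std::io::Result<Vec<usize>> {
    let mut reader = Cursor::new(data);
    let mut buf = String::new();
    let mut lengths = Vec::new();
    loop {
        buf.clear();
        let n = reader.read_line(&mut buf)?;
        if n == 0 {
            break; // 0 bytes read means end of input
        }
        lengths.push(n);
    }
    Ok(lengths)
}
```

The same bytes give the same counts on every OS; different counts mean the input bytes themselves are different.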
You almost certainly shouldn't use lines or read_line anyway. Besides lines removing the original line ending (LF or CRLF) and println then putting in a different line ending, the whole line is buffered in memory, and there's no limit to the length of a line. From the documentation: "it is possible for an attacker to continuously send bytes without ever sending a newline or EOF."
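For the plain case (no line numbering or other per-line options), one alternative is to copy the bytes through untouched with std::io::copy, so whatever line endings the input has are preserved. This is only a sketch under that assumption; copy_bytes is a made-up wrapper name, and options that need to look at individual lines would need something more.

```rust
use std::io::{self, Read, Write};

/// Copy everything from `input` to `output` without interpreting line endings.
/// Returns the number of bytes copied.
fn copy_bytes<R: Read, W: Write>(mut input: R, mut output: W) -> io::Result<u64> {
    io::copy(&mut input, &mut output)
}
```

In a real program, input would be an opened File and output would be a locked stdout handle; io::copy streams the data in chunks rather than holding whole lines in memory.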