How to test that a file's content is correct across multiple OSes

Hi, I am following some exercises, and one is about creating a clone of "cat". The problem is with the tests that validate the output. Each test loads a reference file that was created with the official "cat" command, and then compares that file's content with the output produced by the Rust code.

The problem is that the output seems to differ (mostly CRLF problems, I think) depending on whether you run cargo test in Windows PowerShell, WSL2, pure Linux bash, etc. A test that passes in Ubuntu will fail in WSL/Ubuntu.

The console output is not really helpful for seeing what is different... It looks identical to me. (This is the output of the "spiders" test, which compares the output with the reference file.)

---- spiders stdout ----
thread 'spiders' panicked at 'Unexpected stdout, failed diff original var
├── original: Don't worry, spiders,
|   I keep house
|   casually.
├── diff:
└── var as str: Don't worry, spiders,
    I keep house
    casually.

command=`"K:\\code\\rust\\learningrust\\catr\\target\\debug\\catr.exe" "tests/inputs/spiders.txt"`
code=0
stdout="Don\'t worry, spiders,\nI keep house\ncasually.\n"
stderr=""
', /rustc/c8dfcfe046a7680554bf4eb612bad840e7631c4b\library\core\src\ops\function.rs:227:5

Running the same test on linux returns a success.

So, is there a way to get the same test results across multiple OSes and shells?

Could you share your code?

Sure, the repo is here

The tests are under tests/cli.rs
Expected output are under tests/expected


If you can't see the difference with your eyes, you could try hex-dumping the file. That will reveal exact differences byte-by-byte, without relying on any particular character encoding and/or terminal setting. Also, diff.
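If you would rather do this from inside a test, here is a minimal sketch of the same idea in Rust: find the first byte offset where two buffers differ and print both as hex. The helper names `first_difference` and `hex` are made up for illustration, and the sample strings only assume that one side has CRLF and the other LF.

```rust
/// Returns the offset of the first byte at which `a` and `b` differ,
/// or `None` if the two slices are identical.
fn first_difference(a: &[u8], b: &[u8]) -> Option<usize> {
    match a.iter().zip(b.iter()).position(|(x, y)| x != y) {
        Some(i) => Some(i),
        // One slice is a strict prefix of the other.
        None if a.len() != b.len() => Some(a.len().min(b.len())),
        None => None,
    }
}

/// Renders bytes as space-separated two-digit hex values.
fn hex(bytes: &[u8]) -> String {
    bytes
        .iter()
        .map(|b| format!("{:02x}", b))
        .collect::<Vec<_>>()
        .join(" ")
}

fn main() {
    // Hypothetical data: a reference file with CRLF and program output with LF.
    let expected = b"casually.\r\n";
    let actual = b"casually.\n";

    if let Some(i) = first_difference(expected, actual) {
        println!("first difference at byte {}", i);
        println!("expected: {}", hex(expected));
        println!("actual:   {}", hex(actual));
    }
}
```

A stray `0d` in front of `0a` on only one side is the signature of a CRLF/LF mismatch.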


In the past I've just used String::replace to normalize the strings to \n after reading the files. Since my files never had intentional Windows line endings, it did nothing on Linux and fixed the tests on Windows. Not the most elegant solution, but it worked. I'm not familiar with how line endings work on WSL, but I think it's very plausible that the line endings are the problem.
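Roughly what that looks like, as a sketch: `normalize_newlines` is a name I made up, and the strings stand in for the reference file and the program's stdout. This is only appropriate if the files never contain intentional CRLF sequences.

```rust
/// Converts Windows line endings (CRLF) to Unix line endings (LF).
/// Text that already uses LF is returned unchanged.
fn normalize_newlines(s: &str) -> String {
    s.replace("\r\n", "\n")
}

fn main() {
    // Hypothetical: the reference file was checked out with CRLF,
    // while the program under test wrote LF.
    let expected = "Don't worry, spiders,\r\nI keep house\r\ncasually.\r\n";
    let actual = "Don't worry, spiders,\nI keep house\ncasually.\n";

    // Normalize both sides before comparing.
    assert_eq!(normalize_newlines(expected), normalize_newlines(actual));
    println!("outputs match after normalization");
}
```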

It is worse than I thought. A call like
let line_bytes = file.read_line(&mut buf)?;
returns a different number of bytes for the same line depending on whether it runs on Linux or Windows. So, in this particular case, it seems the tests can only really be run on the OS where they were written.

Are you sure you don't have a Git configuration that's automatically switching line endings when you check out a file on Windows? Files with the same contents should, well, have the same contents on all platforms. But IIRC core.autocrlf is set to true on Windows by default so maybe the problem is just that you're not comparing equivalent files.
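You can check the current value with `git config core.autocrlf`. One way to stop Git from touching the test fixtures is a `.gitattributes` file at the repository root. This is a sketch that assumes the fixtures live under tests/inputs and tests/expected, as the paths in this thread suggest; adjust the patterns to your layout.

```
# .gitattributes (sketch)
# Unset the "text" attribute so Git performs no line-ending
# conversion on these files at checkout or commit.
tests/inputs/** -text
tests/expected/** -text
```

Files that were already checked out with converted line endings keep them until they are checked out again, so a fresh clone is the simplest way to confirm the change took effect.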


BufRead::read_line is not documented to have platform-dependent behavior, so this is either a case of some program silently switching out line endings from under you as @trentj suggested (more likely) or a bug in read_line.
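To illustrate: read_line returns the number of bytes it read, terminator included, so a CRLF line is simply one byte longer than the same line with LF. A small sketch using in-memory data (the helper `first_line_len` is made up for this example) gives the same counts on any OS:

```rust
use std::io::{self, BufRead, Cursor};

/// Reads the first line of `data` with `read_line` and returns the
/// number of bytes consumed, including the line terminator.
fn first_line_len(data: &[u8]) -> io::Result<usize> {
    let mut reader = Cursor::new(data);
    let mut buf = String::new();
    reader.read_line(&mut buf)
}

fn main() -> io::Result<()> {
    // Same text, different line endings.
    let lf = first_line_len(b"casually.\n")?;
    let crlf = first_line_len(b"casually.\r\n")?;

    println!("LF line:   {} bytes", lf);
    println!("CRLF line: {} bytes", crlf);

    // The extra byte is the '\r' in the data, not anything the OS adds.
    assert_eq!(crlf, lf + 1);
    Ok(())
}
```

So if the count differs between machines, the bytes in the file differ between machines.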

You almost certainly shouldn't use lines or read_line anyway. Besides lines removing the original line ending (LF or CRLF) and println then putting in a different line ending, the whole line is buffered in memory, and there's no limit to the length of a line. From the documentation: "it is possible for an attacker to continuously send bytes without ever sending a newline or EOF."
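For the plain case (no line numbering or other per-line options), a sketch of the alternative is to copy bytes straight through with std::io::copy, which uses a fixed-size internal buffer and never interprets line endings. The function name `cat` and the sample input are made up for illustration; a real program would pass a File and stdout instead.

```rust
use std::io::{self, Read, Write};

/// Copies everything from `input` to `output` unchanged and returns
/// the number of bytes copied. Line endings are not inspected.
fn cat<R: Read, W: Write>(mut input: R, mut output: W) -> io::Result<u64> {
    io::copy(&mut input, &mut output)
}

fn main() -> io::Result<()> {
    // Hypothetical input with mixed line endings.
    let input: &[u8] = b"I keep house\r\ncasually.\n";
    let mut out = Vec::new();

    let copied = cat(input, &mut out)?;

    // Output is byte-for-byte identical to the input.
    assert_eq!(out.as_slice(), input);
    println!("copied {} bytes", copied);
    Ok(())
}
```

Options that need to see line boundaries can still work on bytes, for example with BufRead::read_until(b'\n', ...), which keeps the terminator in the buffer, although that again buffers a whole line at a time.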

Good tip about the Git settings. After some tests, it seems to be the culprit.

