Pointers on debugging for Windows?

Hi, I have an open source project which I personally only use on Linux, but it includes CI on Windows, because it's good for the project to remain portable, and someone else might be able to use it there. Previously I had CI on macOS, but that's less "cross", and on GitHub Actions it's slower (maybe just less capacity) than Windows.

The project has passed, and continues to pass, CI on the master branch just fine with Tokio 0.2 and Hyper 0.13:

actions/runs/469379644

But upon attempting to upgrade to Tokio 1.x and Hyper 0.14 with minimal other changes, the Windows runs start hanging (until canceled) on a particular set of tests. I enabled stack traces and logging, but got no new clues:

runs/1684055279

Firstly, I haven't had a Windows setup in maybe 15 years. Now I wish I had a working KVM/QEMU image for Windows, but besides lacking a license key, I'm not sure what other software I might need for debugging and what it all might cost?

Continuing on the route of trying to debug in CI: Tokio catches panics in its executor threads, so I'm concerned it's panicking but I'm not seeing the stack trace. I found the suggestion that stack traces only work with the -windows-gnu targets, not -windows-msvc; is that still a thing? I might try an abort-on-panic config next to see if that changes things.
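One way to rule out a swallowed panic is to install a custom panic hook that always reports the panicking thread and a backtrace, regardless of whether something further up later catches the panic. The following is only a minimal sketch using the standard library; the names `install_reporting_hook`, `REPORTS`, `failing_worker` and the thread name `worker-0` are made up for illustration and are not part of the project or of Tokio.

```rust
use std::backtrace::Backtrace;
use std::panic;
use std::sync::Mutex;
use std::thread;

// Reports collected by the hook, kept so that callers can inspect them later.
// (Hypothetical name, for illustration only.)
static REPORTS: Mutex<Vec<String>> = Mutex::new(Vec::new());

/// Installs a panic hook that prints the thread name, the panic message and
/// a backtrace to stderr, and also stores the report in `REPORTS`.
fn install_reporting_hook() {
    panic::set_hook(Box::new(|info| {
        let current = thread::current();
        let name = current.name().unwrap_or("<unnamed>");
        // force_capture() ignores RUST_BACKTRACE, so a trace is always attempted.
        let trace = Backtrace::force_capture();
        let report = format!("thread '{}' {}\n{}", name, info, trace);
        eprintln!("{}", report);
        if let Ok(mut reports) = REPORTS.lock() {
            reports.push(report);
        }
    }));
}

/// Stand-in for a task that panics on a worker thread.
fn failing_worker() {
    panic!("worker failed");
}

fn main() {
    install_reporting_hook();
    let worker = thread::Builder::new()
        .name("worker-0".into())
        .spawn(failing_worker)
        .expect("failed to spawn thread");
    // join() returns Err because the thread panicked; the hook has already
    // reported the panic by the time we get here.
    let outcome = worker.join();
    println!("worker panicked: {}", outcome.is_err());
}
```

If a hang persists and a hook like this never fires, that would point away from a hidden panic and towards a deadlock or a future that never completes.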

Otherwise, is there some common method of getting per-thread stack traces for a process on -windows-msvc? After some timeout period? Any more battle-tested Windows CI (GitHub or otherwise) setups I could crib from?

Anyone with a working Windows environment and debugging knowledge willing to clone body-image, tokio-1.0 branch, run cargo test and let me know what you find?

My goal here is to either fix anything that's no longer(?) portable in my code, or provide a minimized bug report to the right dependency.

Try running the tests with --nocapture (i.e. cargo test -- --nocapture) to stop the test harness from capturing output. Rust's default panic hook should print the panic message unconditionally, even if the panic is later caught (the hook runs before unwinding begins).
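A small standard-library sketch of that claim: the panic hook fires even when the panic is subsequently caught with `catch_unwind`, roughly as an executor might do. The helper names `install_counting_hook`, `run_catching` and `HOOK_CALLS` are invented for this example.

```rust
use std::panic;
use std::sync::atomic::{AtomicUsize, Ordering};

// Counts how many times the panic hook has fired (hypothetical name).
static HOOK_CALLS: AtomicUsize = AtomicUsize::new(0);

/// Wraps the currently installed hook (by default, the one that prints the
/// panic message to stderr) with one that also counts invocations.
fn install_counting_hook() {
    let previous = panic::take_hook();
    panic::set_hook(Box::new(move |info| {
        HOOK_CALLS.fetch_add(1, Ordering::SeqCst);
        previous(info);
    }));
}

/// Runs `f` and catches any panic it raises. Returns true if `f` panicked.
fn run_catching<F: FnOnce() + panic::UnwindSafe>(f: F) -> bool {
    panic::catch_unwind(f).is_err()
}

fn main() {
    install_counting_hook();
    let panicked = run_catching(|| panic!("boom inside a caught task"));
    println!(
        "panicked: {}, hook calls: {}",
        panicked,
        HOOK_CALLS.load(Ordering::SeqCst)
    );
}
```

The panic message still reaches stderr here because the wrapped hook delegates to the previous one before the unwind is caught.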


Thanks for the suggestion, but I was already doing that in the above linked runs. Since then, I've added [profile.test] panic = "abort" and switched to the -windows-gnu target, which unfortunately doesn't give any new clues.

So I presume it's either some sort of deadlock or a future that for some reason never completes. Anyone know a good way of getting stack traces on all currently running threads in a Windows process, ideally via CI (e.g. after some timeout)?
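This does not produce per-thread stack traces, but as a stopgap a hanging test can at least be turned into a visible failure instead of a CI run that has to be canceled. A minimal sketch, assuming the test body can be moved onto a helper thread; `run_with_timeout` is a hypothetical helper, not something from the project or from Tokio.

```rust
use std::sync::mpsc;
use std::thread;
use std::time::Duration;

/// Runs `work` on a helper thread and waits at most `limit` for its result.
/// Returns Some(result) if it finishes in time, or None if the deadline
/// passes or the work panics. On timeout the helper thread is left running,
/// since the standard library cannot kill a thread.
fn run_with_timeout<T, F>(limit: Duration, work: F) -> Option<T>
where
    T: Send + 'static,
    F: FnOnce() -> T + Send + 'static,
{
    let (tx, rx) = mpsc::channel();
    thread::spawn(move || {
        // Ignore a send error: the receiver is gone if we already timed out.
        let _ = tx.send(work());
    });
    rx.recv_timeout(limit).ok()
}

fn main() {
    let fast = run_with_timeout(Duration::from_secs(5), || 2 + 2);
    println!("fast: {:?}", fast);

    // Simulates a stuck test: the work outlives the deadline.
    let stuck = run_with_timeout(Duration::from_millis(100), || {
        thread::sleep(Duration::from_secs(2));
    });
    println!("stuck: {:?}", stuck);
}
```

In a test, a `None` result could be turned into a failure with an `expect`, so the run ends with a named failing test rather than a hang.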

On a whim, I went ahead and re-tried it on macOS. I'm surprised to find that it actually fails there as well, in the same silent-halting way. It's like the gods of proprietary software have manifested to exact their toll on my blissful, free development experience!

Well, this at least broadens the pool of who I can ask for debugging help!

Minimized reproduction (at least as far as I could, testing via CI) and report:

(Note, AFAIK this could just as easily be a tokio 1.0, etc. issue.)

For the project, worked around by replacing use of the particular hyper API (serve_connection) with full Server instances: